Changeset 270158 in webkit


Timestamp:
Nov 21, 2020 9:51:10 PM
Author:
sihui_liu@apple.com
Message:

Implement audio capture for SpeechRecognition on macOS
https://bugs.webkit.org/show_bug.cgi?id=218855
<rdar://problem/71331001>

Reviewed by Youenn Fablet.

Source/WebCore:

Introduce SpeechRecognizer, which performs audio capture and speech recognition operations. On start,
SpeechRecognizer creates a SpeechRecognitionCaptureSource and starts audio capturing. On stop, SpeechRecognizer
clears the source and stops recognizing. SpeechRecognizer can only handle one request at a time, so calling
start on an already started SpeechRecognizer aborts the ongoing request.
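The one-request-at-a-time behavior could be sketched roughly as below. This is a hypothetical, simplified illustration, not WebCore's actual class: the name SpeechRecognizerSketch, the onStop callback, and the use of std::optional are all illustrative choices.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical sketch: a recognizer that serves one client at a time.
// Starting a new request while one is ongoing aborts the ongoing one.
class SpeechRecognizerSketch {
public:
    using ClientIdentifier = uint64_t;

    explicit SpeechRecognizerSketch(std::function<void(ClientIdentifier, bool aborted)> onStop)
        : m_onStop(std::move(onStop)) { }

    void start(ClientIdentifier identifier)
    {
        if (m_currentClient)
            stopInternal(/* aborted */ true); // abort the ongoing request first
        m_currentClient = identifier;
        // A real recognizer would create a SpeechRecognitionCaptureSource
        // and begin audio capture here.
    }

    void stop()
    {
        if (m_currentClient)
            stopInternal(/* aborted */ false); // normal stop
    }

    std::optional<ClientIdentifier> currentClientIdentifier() const { return m_currentClient; }

private:
    void stopInternal(bool aborted)
    {
        auto client = *m_currentClient;
        m_currentClient.reset(); // clear the source and stop recognizing
        m_onStop(client, aborted);
    }

    std::optional<ClientIdentifier> m_currentClient;
    std::function<void(ClientIdentifier, bool)> m_onStop;
};
```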

Tests: fast/speechrecognition/start-recognition-then-stop.html
       fast/speechrecognition/start-second-recognition.html
  • Headers.cmake:
  • Modules/speech/SpeechRecognitionCaptureSource.cpp: Added.

(WebCore::SpeechRecognitionCaptureSource::SpeechRecognitionCaptureSource):

  • Modules/speech/SpeechRecognitionCaptureSource.h: Added.
  • Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp: Added. SpeechRecognitionCaptureSourceImpl provides

the implementation of SpeechRecognitionCaptureSource when ENABLE(MEDIA_STREAM) is true.
(WebCore::nextLogIdentifier):
(WebCore::nullLogger):
(WebCore::SpeechRecognitionCaptureSourceImpl::SpeechRecognitionCaptureSourceImpl):
(WebCore::SpeechRecognitionCaptureSourceImpl::~SpeechRecognitionCaptureSourceImpl):
(WebCore::SpeechRecognitionCaptureSourceImpl::audioSamplesAvailable): Push data to buffer, signal main thread to
pull from buffer and invoke data callback.
(WebCore::SpeechRecognitionCaptureSourceImpl::sourceStarted):
(WebCore::SpeechRecognitionCaptureSourceImpl::sourceStopped):
(WebCore::SpeechRecognitionCaptureSourceImpl::sourceMutedChanged):

  • Modules/speech/SpeechRecognitionCaptureSourceImpl.h: Added.
  • Modules/speech/SpeechRecognizer.cpp: Added.

(WebCore::SpeechRecognizer::SpeechRecognizer):
(WebCore::SpeechRecognizer::reset):
(WebCore::SpeechRecognizer::start):
(WebCore::SpeechRecognizer::startInternal):
(WebCore::SpeechRecognizer::stop):
(WebCore::SpeechRecognizer::stopInternal):

  • Modules/speech/SpeechRecognizer.h: Added.

(WebCore::SpeechRecognizer::currentClientIdentifier const):

  • Sources.txt:
  • SourcesCocoa.txt:
  • WebCore.xcodeproj/project.pbxproj:
  • platform/cocoa/MediaUtilities.cpp: Added.

(WebCore::createAudioFormatDescription):
(WebCore::createAudioSampleBuffer):

  • platform/cocoa/MediaUtilities.h: Added.
  • platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm: Move code for creating CMSampleBufferRef to

MediaUtilities.h/cpp so it can be shared between SpeechRecognition and UserMedia, as the speech recognition backend
will take CMSampleBufferRef as input.
(WebCore::createAudioFormatDescription): Deleted.
(WebCore::createAudioSampleBuffer): Deleted.
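The audioSamplesAvailable pattern noted in the file list above (push data to a buffer on the capture thread, signal the main thread to pull from it and invoke the data callback) can be sketched as a minimal producer/consumer. This is a hypothetical standalone illustration using std::condition_variable, not WebCore's actual buffering code:

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: the capture thread pushes sample chunks into a
// shared buffer and signals the consumer; the "main" thread pulls a chunk
// and invokes the data callback.
class CaptureBufferSketch {
public:
    explicit CaptureBufferSketch(std::function<void(std::vector<float>&&)> dataCallback)
        : m_dataCallback(std::move(dataCallback)) { }

    // Called on the capture thread.
    void audioSamplesAvailable(std::vector<float> samples)
    {
        {
            std::lock_guard<std::mutex> lock(m_lock);
            m_buffer.push_back(std::move(samples));
        }
        m_condition.notify_one(); // signal the consumer to pull
    }

    // Called on the "main" thread; pulls one chunk and invokes the callback.
    void pullAndDispatch()
    {
        std::vector<float> chunk;
        {
            std::unique_lock<std::mutex> lock(m_lock);
            m_condition.wait(lock, [&] { return !m_buffer.empty(); });
            chunk = std::move(m_buffer.front());
            m_buffer.pop_front();
        }
        m_dataCallback(std::move(chunk)); // invoke outside the lock
    }

private:
    std::mutex m_lock;
    std::condition_variable m_condition;
    std::deque<std::vector<float>> m_buffer;
    std::function<void(std::vector<float>&&)> m_dataCallback;
};
```

Invoking the callback outside the lock keeps the capture thread from blocking on consumer work.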

Source/WebKit:

  • UIProcess/SpeechRecognitionPermissionManager.cpp:

(WebKit::SpeechRecognitionPermissionManager::startProcessingRequest): Check and enable mock devices based on
the preference, as SpeechRecognition needs them for testing.

  • UIProcess/SpeechRecognitionServer.cpp:

(WebKit::SpeechRecognitionServer::start):
(WebKit::SpeechRecognitionServer::requestPermissionForRequest):
(WebKit::SpeechRecognitionServer::handleRequest):
(WebKit::SpeechRecognitionServer::stop):
(WebKit::SpeechRecognitionServer::abort):
(WebKit::SpeechRecognitionServer::invalidate):
(WebKit::SpeechRecognitionServer::sendUpdate):
(WebKit::SpeechRecognitionServer::stopRequest): Deleted.
(WebKit::SpeechRecognitionServer::abortRequest): Deleted.

  • UIProcess/SpeechRecognitionServer.h:
  • UIProcess/WebPageProxy.cpp:

(WebKit::WebPageProxy::syncIfMockDevicesEnabledChanged):

  • UIProcess/WebPageProxy.h:
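The SpeechRecognitionServer changes above collapse the pending/ongoing maps into one requests map and route recognizer updates back through it, retiring a request on a terminal update. A hypothetical, simplified sketch of that dispatch logic (names like ServerSketch and UpdateType are illustrative, not WebKit's API):

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>

using ClientIdentifier = uint64_t;
enum class UpdateType { Result, Error, End };

// Hypothetical sketch: one requests map; the recognizer's update callback
// drops the request on Error/End before forwarding the update.
struct ServerSketch {
    std::map<ClientIdentifier, int /* placeholder request data */> requests;
    std::function<void(ClientIdentifier, UpdateType)> sendUpdate;

    void didReceiveRecognizerUpdate(ClientIdentifier identifier, UpdateType type)
    {
        if (!requests.count(identifier))
            return; // request was already stopped or aborted

        if (type == UpdateType::Error || type == UpdateType::End)
            requests.erase(identifier); // terminal updates retire the request

        sendUpdate(identifier, type);
    }
};
```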

LayoutTests:

  • TestExpectations:
  • fast/speechrecognition/start-recognition-in-removed-iframe.html: mark test as async to avoid flakiness.
  • fast/speechrecognition/start-recognition-then-stop-expected.txt: Added.
  • fast/speechrecognition/start-recognition-then-stop.html: Added.
  • fast/speechrecognition/start-second-recognition-expected.txt: Added.
  • fast/speechrecognition/start-second-recognition.html: Added.
  • platform/wk2/TestExpectations:
Location:
trunk
Files:
12 added
16 edited

  • trunk/LayoutTests/ChangeLog

    r270157 r270158  
     12020-11-21  Sihui Liu  <sihui_liu@apple.com>
     2
     3        Implement audio capture for SpeechRecognition on macOS
     4        https://bugs.webkit.org/show_bug.cgi?id=218855
     5        <rdar://problem/71331001>
     6
     7        Reviewed by Youenn Fablet.
     8
     9        * TestExpectations:
     10        * fast/speechrecognition/start-recognition-in-removed-iframe.html: mark test as async to avoid flakiness.
     11        * fast/speechrecognition/start-recognition-then-stop-expected.txt: Added.
     12        * fast/speechrecognition/start-recognition-then-stop.html: Added.
     13        * fast/speechrecognition/start-second-recognition-expected.txt: Added.
     14        * fast/speechrecognition/start-second-recognition.html: Added.
     15        * platform/wk2/TestExpectations:
     16
    1172020-11-21  Chris Dumez  <cdumez@apple.com>
    218
  • trunk/LayoutTests/TestExpectations

    r270045 r270158  
    180180http/tests/in-app-browser-privacy/ [ Skip ]
    181181fast/speechrecognition/permission-error.html [ Skip ]
     182fast/speechrecognition/start-recognition-then-stop.html [ Skip ]
     183fast/speechrecognition/start-second-recognition.html [ Skip ]
    182184
    183185# Only partial support on Cocoa platforms.
  • trunk/LayoutTests/fast/speechrecognition/start-recognition-in-removed-iframe.html

    r269810 r270158  
    66description("Verify that process does not crash when starting recognition in a removed iframe.");
    77
     8if (window.testRunner) {
     9    jsTestIsAsync = true;
     10}
     11
    812function test()
    913{
    1014    iframe = document.getElementsByTagName('iframe')[0];
    11     shouldNotThrow("iframe.contentWindow.startRecognition()"); 
     15    shouldNotThrow("iframe.contentWindow.startRecognition()");
    1216}
    1317
     
    1519{
    1620    shouldNotThrow("iframe.parentNode.removeChild(iframe)");
     21    setTimeout(() => finishJSTest(), 0);
    1722}
    1823
  • trunk/LayoutTests/platform/wk2/TestExpectations

    r269810 r270158  
    805805js/throw-large-string-oom.html [ Pass ]
    806806fast/speechrecognition/permission-error.html [ Pass ]
     807fast/speechrecognition/start-recognition-then-stop.html [ Pass ]
     808fast/speechrecognition/start-second-recognition.html [ Pass ]
    807809fullscreen/full-screen-enter-while-exiting.html [ Pass ]
  • trunk/Source/WebCore/ChangeLog

    r270157 r270158  
     12020-11-21  Sihui Liu  <sihui_liu@apple.com>
     2
     3        Implement audio capture for SpeechRecognition on macOS
     4        https://bugs.webkit.org/show_bug.cgi?id=218855
     5        <rdar://problem/71331001>
     6
     7        Reviewed by Youenn Fablet.
     8
     9        Introduce SpeechRecognizer, which performs audio capture and speech recognition operations. On start,
     10        SpeechRecognizer creates a SpeechRecognitionCaptureSource and starts audio capturing. On stop, SpeechRecognizer
     11        clears the source and stops recognizing. SpeechRecognizer can only handle one request at a time, so calling
     12        start on an already started SpeechRecognizer aborts the ongoing request.
     13
     14        Tests: fast/speechrecognition/start-recognition-then-stop.html
     15               fast/speechrecognition/start-second-recognition.html
     16
     17        * Headers.cmake:
     18        * Modules/speech/SpeechRecognitionCaptureSource.cpp: Added.
     19        (WebCore::SpeechRecognitionCaptureSource::SpeechRecognitionCaptureSource):
     20        * Modules/speech/SpeechRecognitionCaptureSource.h: Added.
     21        * Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp: Added. SpeechRecognitionCaptureSourceImpl provides
     22        the implementation of SpeechRecognitionCaptureSource when ENABLE(MEDIA_STREAM) is true.
     23        (WebCore::nextLogIdentifier):
     24        (WebCore::nullLogger):
     25        (WebCore::SpeechRecognitionCaptureSourceImpl::SpeechRecognitionCaptureSourceImpl):
     26        (WebCore::SpeechRecognitionCaptureSourceImpl::~SpeechRecognitionCaptureSourceImpl):
     27        (WebCore::SpeechRecognitionCaptureSourceImpl::audioSamplesAvailable): Push data to buffer, signal main thread to
     28        pull from buffer and invoke data callback.
     29        (WebCore::SpeechRecognitionCaptureSourceImpl::sourceStarted):
     30        (WebCore::SpeechRecognitionCaptureSourceImpl::sourceStopped):
     31        (WebCore::SpeechRecognitionCaptureSourceImpl::sourceMutedChanged):
     32        * Modules/speech/SpeechRecognitionCaptureSourceImpl.h: Added.
     33        * Modules/speech/SpeechRecognizer.cpp: Added.
     34        (WebCore::SpeechRecognizer::SpeechRecognizer):
     35        (WebCore::SpeechRecognizer::reset):
     36        (WebCore::SpeechRecognizer::start):
     37        (WebCore::SpeechRecognizer::startInternal):
     38        (WebCore::SpeechRecognizer::stop):
     39        (WebCore::SpeechRecognizer::stopInternal):
     40        * Modules/speech/SpeechRecognizer.h: Added.
     41        (WebCore::SpeechRecognizer::currentClientIdentifier const):
     42        * Sources.txt:
     43        * SourcesCocoa.txt:
     44        * WebCore.xcodeproj/project.pbxproj:
     45        * platform/cocoa/MediaUtilities.cpp: Added.
     46        (WebCore::createAudioFormatDescription):
     47        (WebCore::createAudioSampleBuffer):
     48        * platform/cocoa/MediaUtilities.h: Added.
     49        * platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm: Move code for creating CMSampleBufferRef to
     50        MediaUtilities.h/cpp so it can be shared between SpeechRecognition and UserMedia, as the speech recognition backend
     51        will take CMSampleBufferRef as input.
     52        (WebCore::createAudioFormatDescription): Deleted.
     53        (WebCore::createAudioSampleBuffer): Deleted.
     54
    1552020-11-21  Chris Dumez  <cdumez@apple.com>
    256
  • trunk/Source/WebCore/Headers.cmake

    r269984 r270158  
    117117    Modules/plugins/YouTubePluginReplacement.h
    118118
     119    Modules/speech/SpeechRecognitionCaptureSource.h
     120    Modules/speech/SpeechRecognitionCaptureSourceImpl.h
    119121    Modules/speech/SpeechRecognitionConnection.h
    120122    Modules/speech/SpeechRecognitionConnectionClient.h
     
    125127    Modules/speech/SpeechRecognitionResultData.h
    126128    Modules/speech/SpeechRecognitionUpdate.h
     129    Modules/speech/SpeechRecognizer.h
    127130
    128131    Modules/streams/ReadableStreamChunk.h
  • trunk/Source/WebCore/Sources.txt

    r270121 r270158  
    206206Modules/speech/SpeechRecognitionResultList.cpp
    207207Modules/speech/SpeechRecognitionUpdate.cpp
     208Modules/speech/SpeechRecognitionCaptureSource.cpp
     209Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp
     210Modules/speech/SpeechRecognizer.cpp
    208211Modules/speech/DOMWindowSpeechSynthesis.cpp
    209212Modules/speech/SpeechSynthesis.cpp
  • trunk/Source/WebCore/SourcesCocoa.txt

    r270067 r270158  
    239239platform/cocoa/LocalizedStringsCocoa.mm
    240240platform/cocoa/MIMETypeRegistryCocoa.mm
     241platform/cocoa/MediaUtilities.cpp
    241242platform/cocoa/NetworkExtensionContentFilter.mm
    242243platform/cocoa/ParentalControlsContentFilter.mm
  • trunk/Source/WebCore/WebCore.xcodeproj/project.pbxproj

    r270120 r270158  
    121121                0738E5EC2499839000DA101C /* AVOutputDeviceMenuControllerTargetPicker.mm in Sources */ = {isa = PBXBuildFile; fileRef = 0738E5EA249968AD00DA101C /* AVOutputDeviceMenuControllerTargetPicker.mm */; };
    122122                073A15542177A42600EA08F2 /* RemoteVideoSample.h in Headers */ = {isa = PBXBuildFile; fileRef = 073A15532177A39A00EA08F2 /* RemoteVideoSample.h */; settings = {ATTRIBUTES = (Private, ); }; };
    123                 073B87671E4385AC0071C0EC /* AudioSampleBufferList.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87631E43859D0071C0EC /* AudioSampleBufferList.h */; };
    124                 073B87691E4385AC0071C0EC /* AudioSampleDataSource.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87651E43859D0071C0EC /* AudioSampleDataSource.h */; };
     123                073B87671E4385AC0071C0EC /* AudioSampleBufferList.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87631E43859D0071C0EC /* AudioSampleBufferList.h */; settings = {ATTRIBUTES = (Private, ); }; };
     124                073B87691E4385AC0071C0EC /* AudioSampleDataSource.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87651E43859D0071C0EC /* AudioSampleDataSource.h */; settings = {ATTRIBUTES = (Private, ); }; };
    125125                074E82BB18A69F0E007EF54C /* PlatformTimeRanges.h in Headers */ = {isa = PBXBuildFile; fileRef = 074E82B918A69F0E007EF54C /* PlatformTimeRanges.h */; settings = {ATTRIBUTES = (Private, ); }; };
    126126                075033A8252BD36800F70CE3 /* VideoPlaybackQualityMetrics.h in Headers */ = {isa = PBXBuildFile; fileRef = 075033A6252BD36800F70CE3 /* VideoPlaybackQualityMetrics.h */; settings = {ATTRIBUTES = (Private, ); }; };
     
    27962796                939885C408B7E3D100E707C4 /* EventNames.h in Headers */ = {isa = PBXBuildFile; fileRef = 939885C208B7E3D100E707C4 /* EventNames.h */; settings = {ATTRIBUTES = (Private, ); }; };
    27972797                939B02EF0EA2DBC400C54570 /* WidthIterator.h in Headers */ = {isa = PBXBuildFile; fileRef = 939B02ED0EA2DBC400C54570 /* WidthIterator.h */; };
     2798                939C0D272564E47F00B3211B /* SpeechRecognizer.h in Headers */ = {isa = PBXBuildFile; fileRef = 939C0D2125648C3900B3211B /* SpeechRecognizer.h */; settings = {ATTRIBUTES = (Private, ); }; };
     2799                939C0D2B2564E7F300B3211B /* MediaUtilities.h in Headers */ = {isa = PBXBuildFile; fileRef = 939C0D292564E7F200B3211B /* MediaUtilities.h */; settings = {ATTRIBUTES = (Private, ); }; };
    27982800                93A0482825495506000AC462 /* SpeechRecognitionProvider.h in Headers */ = {isa = PBXBuildFile; fileRef = 93A0482625495500000AC462 /* SpeechRecognitionProvider.h */; settings = {ATTRIBUTES = (Private, ); }; };
    27992801                93A0482925495511000AC462 /* SpeechRecognitionResultData.h in Headers */ = {isa = PBXBuildFile; fileRef = 93A0481B254954E4000AC462 /* SpeechRecognitionResultData.h */; settings = {ATTRIBUTES = (Private, ); }; };
     
    28572859                93F1D5C112D5335600832BEC /* JSWebGLLoseContext.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F1D5BF12D5335600832BEC /* JSWebGLLoseContext.h */; };
    28582860                93F2CC932427FB9C005851D8 /* CharacterRange.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F2CC912427FB9A005851D8 /* CharacterRange.h */; settings = {ATTRIBUTES = (Private, ); }; };
     2861                93F6B81F2567A08C00A08488 /* SpeechRecognitionCaptureSourceImpl.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F6B81C25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.h */; settings = {ATTRIBUTES = (Private, ); }; };
     2862                93F6B8222567A65600A08488 /* SpeechRecognitionCaptureSource.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F6B81B25679F6F00A08488 /* SpeechRecognitionCaptureSource.h */; settings = {ATTRIBUTES = (Private, ); }; };
    28592863                93F6F1EE127F70B10055CB06 /* WebGLContextEvent.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F6F1EB127F70B10055CB06 /* WebGLContextEvent.h */; };
    28602864                93F925430F7EF5B8007E37C9 /* RadioButtonGroups.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F925410F7EF5B8007E37C9 /* RadioButtonGroups.h */; settings = {ATTRIBUTES = (Private, ); }; };
     
    1145411458                939B02EC0EA2DBC400C54570 /* WidthIterator.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = WidthIterator.cpp; sourceTree = "<group>"; };
    1145511459                939B02ED0EA2DBC400C54570 /* WidthIterator.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WidthIterator.h; sourceTree = "<group>"; };
     11460                939C0D2125648C3900B3211B /* SpeechRecognizer.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = SpeechRecognizer.h; sourceTree = "<group>"; };
     11461                939C0D2325648C4E00B3211B /* SpeechRecognizer.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; path = SpeechRecognizer.cpp; sourceTree = "<group>"; };
     11462                939C0D282564E7F200B3211B /* MediaUtilities.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = MediaUtilities.cpp; sourceTree = "<group>"; };
     11463                939C0D292564E7F200B3211B /* MediaUtilities.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MediaUtilities.h; sourceTree = "<group>"; };
    1145611464                93A0481B254954E4000AC462 /* SpeechRecognitionResultData.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionResultData.h; sourceTree = "<group>"; };
    1145711465                93A0481D254954E5000AC462 /* SpeechRecognitionConnection.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionConnection.h; sourceTree = "<group>"; };
     
    1154011548                93F1D5BF12D5335600832BEC /* JSWebGLLoseContext.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSWebGLLoseContext.h; sourceTree = "<group>"; };
    1154111549                93F2CC912427FB9A005851D8 /* CharacterRange.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = CharacterRange.h; sourceTree = "<group>"; };
     11550                93F6B81B25679F6F00A08488 /* SpeechRecognitionCaptureSource.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionCaptureSource.h; sourceTree = "<group>"; };
     11551                93F6B81C25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionCaptureSourceImpl.h; sourceTree = "<group>"; };
     11552                93F6B81D25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = SpeechRecognitionCaptureSourceImpl.cpp; sourceTree = "<group>"; };
     11553                93F6B81E25679F7100A08488 /* SpeechRecognitionCaptureSource.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = SpeechRecognitionCaptureSource.cpp; sourceTree = "<group>"; };
    1154211554                93F6F1EA127F70B10055CB06 /* WebGLContextEvent.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = WebGLContextEvent.cpp; sourceTree = "<group>"; };
    1154311555                93F6F1EB127F70B10055CB06 /* WebGLContextEvent.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WebGLContextEvent.h; sourceTree = "<group>"; };
     
    2358823600                        sourceTree = "<group>";
    2358923601                };
     23602                93F6B80E2566EDE100A08488 /* cocoa */ = {
     23603                        isa = PBXGroup;
     23604                        children = (
     23605                        );
     23606                        path = cocoa;
     23607                        sourceTree = "<group>";
     23608                };
    2359023609                946D37271D6CB2250077084F /* parser */ = {
    2359123610                        isa = PBXGroup;
     
    2442424443                                06E81ED60AB5D5E900C87837 /* LocalCurrentGraphicsContext.h */,
    2442524444                                1A4832B21A953BA6008B4DFE /* LocalizedStringsCocoa.mm */,
     24445                                939C0D282564E7F200B3211B /* MediaUtilities.cpp */,
     24446                                939C0D292564E7F200B3211B /* MediaUtilities.h */,
    2442624447                                C53D39331C97892D007F3AE9 /* MIMETypeRegistryCocoa.mm */,
    2442724448                                A19D93491AA11B1E00B46C24 /* NetworkExtensionContentFilter.h */,
     
    2561625637                        isa = PBXGroup;
    2561725638                        children = (
     25639                                93F6B80E2566EDE100A08488 /* cocoa */,
    2561825640                                AA2A5ABA16A485D500975A25 /* DOMWindow+SpeechSynthesis.idl */,
    2561925641                                AA2A5AB816A485D500975A25 /* DOMWindowSpeechSynthesis.cpp */,
     
    2562525647                                934950BC2539434E0099F171 /* SpeechRecognitionAlternative.h */,
    2562625648                                934950BB2539434E0099F171 /* SpeechRecognitionAlternative.idl */,
     25649                                93F6B81E25679F7100A08488 /* SpeechRecognitionCaptureSource.cpp */,
     25650                                93F6B81B25679F6F00A08488 /* SpeechRecognitionCaptureSource.h */,
     25651                                93F6B81D25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.cpp */,
     25652                                93F6B81C25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.h */,
    2562725653                                93A0481D254954E5000AC462 /* SpeechRecognitionConnection.h */,
    2562825654                                93A04824254954E9000AC462 /* SpeechRecognitionConnectionClient.h */,
     
    2564925675                                93D6B76E254B8E1B0058DD3A /* SpeechRecognitionUpdate.cpp */,
    2565025676                                93D6B76D254B8E1B0058DD3A /* SpeechRecognitionUpdate.h */,
     25677                                939C0D2325648C4E00B3211B /* SpeechRecognizer.cpp */,
     25678                                939C0D2125648C3900B3211B /* SpeechRecognizer.h */,
    2565125679                                AA2A5ABD16A485D500975A25 /* SpeechSynthesis.cpp */,
    2565225680                                AA2A5ABE16A485D500975A25 /* SpeechSynthesis.h */,
     
    3353533563                                07C1C0E21BFB600100BD2256 /* MediaTrackSupportedConstraints.h in Headers */,
    3353633564                                07611DC12440E59B00D80704 /* MediaUsageInfo.h in Headers */,
     33565                                939C0D2B2564E7F300B3211B /* MediaUtilities.h in Headers */,
    3353733566                                51E1BAC31BD8064E0055D81F /* MemoryBackingStoreTransaction.h in Headers */,
    3353833567                                BCB16C180979C3BD00467741 /* MemoryCache.h in Headers */,
     
    3442734456                                934950CD253943610099F171 /* SpeechRecognition.h in Headers */,
    3442834457                                934950CE253943650099F171 /* SpeechRecognitionAlternative.h in Headers */,
     34458                                93F6B8222567A65600A08488 /* SpeechRecognitionCaptureSource.h in Headers */,
     34459                                93F6B81F2567A08C00A08488 /* SpeechRecognitionCaptureSourceImpl.h in Headers */,
    3442934460                                93A0482C25495519000AC462 /* SpeechRecognitionConnection.h in Headers */,
    3443034461                                93A0482E2549551E000AC462 /* SpeechRecognitionConnectionClient.h in Headers */,
     
    3444134472                                934950D6253943810099F171 /* SpeechRecognitionResultList.h in Headers */,
    3444234473                                93D6B771254BAB450058DD3A /* SpeechRecognitionUpdate.h in Headers */,
     34474                                939C0D272564E47F00B3211B /* SpeechRecognizer.h in Headers */,
    3444334475                                AA2A5AD416A4861100975A25 /* SpeechSynthesis.h in Headers */,
    3444434476                                C14938072234551A000CD707 /* SpeechSynthesisClient.h in Headers */,
  • trunk/Source/WebCore/platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm

    r268363 r270158  
    3535#include "MediaRecorderPrivateOptions.h"
    3636#include "MediaStreamTrackPrivate.h"
     37#include "MediaUtilities.h"
    3738#include "VideoSampleBufferCompressor.h"
    3839#include "WebAudioBufferList.h"
     
    444445}
    445446
    446 static inline RetainPtr<CMFormatDescriptionRef> createAudioFormatDescription(const AudioStreamDescription& description)
    447 {
    448     auto basicDescription = WTF::get<const AudioStreamBasicDescription*>(description.platformDescription().description);
    449     CMFormatDescriptionRef format = nullptr;
    450     auto error = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, basicDescription, 0, NULL, 0, NULL, NULL, &format);
    451     if (error) {
    452         RELEASE_LOG_ERROR(MediaStream, "MediaRecorderPrivateWriter CMAudioFormatDescriptionCreate failed with %d", error);
    453         return nullptr;
    454     }
    455     return adoptCF(format);
    456 }
    457 
    458 static inline RetainPtr<CMSampleBufferRef> createAudioSampleBuffer(const PlatformAudioData& data, const AudioStreamDescription& description, CMTime time, size_t sampleCount)
    459 {
    460     auto format = createAudioFormatDescription(description);
    461     if (!format)
    462         return nullptr;
    463 
    464     CMSampleBufferRef sampleBuffer = nullptr;
    465     auto error = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, NULL, false, NULL, NULL, format.get(), sampleCount, time, NULL, &sampleBuffer);
    466     if (error) {
    467         RELEASE_LOG_ERROR(MediaStream, "MediaRecorderPrivateWriter createAudioSampleBufferWithPacketDescriptions failed with %d", error);
    468         return nullptr;
    469     }
    470     auto buffer = adoptCF(sampleBuffer);
    471 
    472     error = CMSampleBufferSetDataBufferFromAudioBufferList(buffer.get(), kCFAllocatorDefault, kCFAllocatorDefault, 0, downcast<WebAudioBufferList>(data).list());
    473     if (error) {
    474         RELEASE_LOG_ERROR(MediaStream, "MediaRecorderPrivateWriter CMSampleBufferSetDataBufferFromAudioBufferList failed with %d", error);
    475         return nullptr;
    476     }
    477     return buffer;
    478 }
    479 
    480447void MediaRecorderPrivateWriter::appendAudioSampleBuffer(const PlatformAudioData& data, const AudioStreamDescription& description, const WTF::MediaTime&, size_t sampleCount)
    481448{
  • trunk/Source/WebKit/ChangeLog

    r270156 r270158  
     12020-11-21  Sihui Liu  <sihui_liu@apple.com>
     2
     3        Implement audio capture for SpeechRecognition on macOS
     4        https://bugs.webkit.org/show_bug.cgi?id=218855
     5        <rdar://problem/71331001>
     6
     7        Reviewed by Youenn Fablet.
     8
     9        * UIProcess/SpeechRecognitionPermissionManager.cpp:
     10        (WebKit::SpeechRecognitionPermissionManager::startProcessingRequest): Check and enable mock devices based on
     11        the preference, as SpeechRecognition needs them for testing.
     12        * UIProcess/SpeechRecognitionServer.cpp:
     13        (WebKit::SpeechRecognitionServer::start):
     14        (WebKit::SpeechRecognitionServer::requestPermissionForRequest):
     15        (WebKit::SpeechRecognitionServer::handleRequest):
     16        (WebKit::SpeechRecognitionServer::stop):
     17        (WebKit::SpeechRecognitionServer::abort):
     18        (WebKit::SpeechRecognitionServer::invalidate):
     19        (WebKit::SpeechRecognitionServer::sendUpdate):
     20        (WebKit::SpeechRecognitionServer::stopRequest): Deleted.
     21        (WebKit::SpeechRecognitionServer::abortRequest): Deleted.
     22        * UIProcess/SpeechRecognitionServer.h:
     23        * UIProcess/WebPageProxy.cpp:
     24        (WebKit::WebPageProxy::syncIfMockDevicesEnabledChanged):
     25        * UIProcess/WebPageProxy.h:
     26
    1272020-11-21  Simon Fraser  <simon.fraser@apple.com>
    228
  • trunk/Source/WebKit/UIProcess/SpeechRecognitionPermissionManager.cpp

    r269918 r270158  
    105105
    106106    if (m_page.preferences().mockCaptureDevicesEnabled()) {
     107        m_page.syncIfMockDevicesEnabledChanged();
    107108        m_microphoneCheck = CheckResult::Granted;
    108109        m_speechRecognitionServiceCheck = CheckResult::Granted;
  • trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.cpp

    r269868 r270158  
    4747{
    4848    MESSAGE_CHECK(clientIdentifier);
    49     ASSERT(!m_pendingRequests.contains(clientIdentifier));
    50     ASSERT(!m_ongoingRequests.contains(clientIdentifier));
     49    ASSERT(!m_requests.contains(clientIdentifier));
    5150    auto requestInfo = WebCore::SpeechRecognitionRequestInfo { clientIdentifier, WTFMove(lang), continuous, interimResults, maxAlternatives, WTFMove(origin) };
    52     auto& pendingRequest = m_pendingRequests.add(clientIdentifier, makeUnique<WebCore::SpeechRecognitionRequest>(WTFMove(requestInfo))).iterator->value;
     51    auto& newRequest = m_requests.add(clientIdentifier, makeUnique<WebCore::SpeechRecognitionRequest>(WTFMove(requestInfo))).iterator->value;
    5352
    54     requestPermissionForRequest(*pendingRequest);
     53    requestPermissionForRequest(*newRequest);
    5554}
    5655
     
    6564
    6665        auto identifier = weakRequest->clientIdentifier();
-        auto takenRequest = m_pendingRequests.take(identifier);
         if (decision == SpeechRecognitionPermissionDecision::Deny) {
+            m_requests.remove(identifier);
             auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::NotAllowed, "Permission check failed"_s };
             sendUpdate(identifier, WebCore::SpeechRecognitionUpdateType::Error, error);
…
         }

-        m_ongoingRequests.add(identifier, WTFMove(takenRequest));
-        handleRequest(*m_ongoingRequests.get(identifier));
+        handleRequest(identifier);
     });
+}
+
+void SpeechRecognitionServer::handleRequest(WebCore::SpeechRecognitionConnectionClientIdentifier clientIdentifier)
+{
+    if (!m_recognizer) {
+        m_recognizer = makeUnique<SpeechRecognizer>([this, weakThis = makeWeakPtr(this)](auto& update) {
+            if (!weakThis)
+                return;
+
+            auto clientIdentifier = update.clientIdentifier();
+            if (!m_requests.contains(clientIdentifier))
+                return;
+
+            auto type = update.type();
+            if (type == SpeechRecognitionUpdateType::Error || type == SpeechRecognitionUpdateType::End)
+                m_requests.remove(clientIdentifier);
+
+            sendUpdate(update);
+        });
+    }
+
+    m_recognizer->start(clientIdentifier);
 }

…
 {
     MESSAGE_CHECK(clientIdentifier);
-    if (m_pendingRequests.remove(clientIdentifier)) {
-        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
+    if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier) {
+        m_recognizer->stop();
         return;
     }

-    ASSERT(m_ongoingRequests.contains(clientIdentifier));
-    stopRequest(*m_ongoingRequests.get(clientIdentifier));
+    if (m_requests.remove(clientIdentifier))
+        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
 }

…
 {
     MESSAGE_CHECK(clientIdentifier);
-    if (m_pendingRequests.remove(clientIdentifier)) {
-        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
+    if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier) {
+        m_recognizer->stop(WebCore::SpeechRecognizer::ShouldGenerateFinalResult::No);
         return;
     }

-    ASSERT(m_ongoingRequests.contains(clientIdentifier));
-    auto request = m_ongoingRequests.take(clientIdentifier);
-    abortRequest(*request);
-    auto update = WebCore::SpeechRecognitionUpdate::create(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
-    send(Messages::WebSpeechRecognitionConnection::DidReceiveUpdate(update), m_identifier);
+    if (m_requests.remove(clientIdentifier))
+        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
 }

…
 {
     MESSAGE_CHECK(clientIdentifier);
-    auto request = m_ongoingRequests.take(clientIdentifier);
-    if (request)
-        abortRequest(*request);
-}
-
-void SpeechRecognitionServer::handleRequest(WebCore::SpeechRecognitionRequest& request)
-{
-    // TODO: start capturing audio and recognition.
-}
-
-void SpeechRecognitionServer::stopRequest(WebCore::SpeechRecognitionRequest& request)
-{
-    // TODO: stop capturing audio and finalizing results by recognizing captured audio.
-}
-
-void SpeechRecognitionServer::abortRequest(WebCore::SpeechRecognitionRequest& request)
-{
-    // TODO: stop capturing audio and recognition immediately without generating results.
+    if (m_requests.remove(clientIdentifier)) {
+        if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier)
+            m_recognizer->stop();
+    }
 }

…
     if (type == WebCore::SpeechRecognitionUpdateType::Result)
         update = WebCore::SpeechRecognitionUpdate::createResult(clientIdentifier, *result);
-    send(Messages::WebSpeechRecognitionConnection::DidReceiveUpdate(update), m_identifier);
+    sendUpdate(update);
+}
+
+void SpeechRecognitionServer::sendUpdate(const WebCore::SpeechRecognitionUpdate& update)
+{
+    send(Messages::WebSpeechRecognitionConnection::DidReceiveUpdate(update));
 }

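The control flow above is compact but easy to misread: one lazily created recognizer serves all clients, its delegate callback retires a request from the map on Error/End updates, and starting a second client implicitly aborts the first. The following standalone sketch illustrates that flow; `Recognizer`, `Server`, `Update`, and the plain `int` identifiers are simplified stand-ins for this example, not the actual WebKit/WebCore classes.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <optional>
#include <vector>

// Simplified stand-ins for the WebKit types; all names here are illustrative.
enum class UpdateType { Error, End };

struct Update {
    int clientIdentifier;
    UpdateType type;
};

// One recognizer serves one client at a time: starting a new client
// aborts the ongoing request, as the change log describes.
class Recognizer {
public:
    explicit Recognizer(std::function<void(const Update&)> delegate)
        : m_delegate(std::move(delegate)) { }

    void start(int clientIdentifier)
    {
        if (m_currentClient)
            abort(); // The ongoing request is aborted before the new one starts.
        m_currentClient = clientIdentifier;
    }

    void stop()
    {
        if (!m_currentClient)
            return;
        m_delegate({ *m_currentClient, UpdateType::End });
        m_currentClient.reset();
    }

    void abort()
    {
        if (!m_currentClient)
            return;
        m_delegate({ *m_currentClient, UpdateType::Error });
        m_currentClient.reset();
    }

    std::optional<int> currentClientIdentifier() const { return m_currentClient; }

private:
    std::function<void(const Update&)> m_delegate;
    std::optional<int> m_currentClient;
};

// The server lazily creates its single recognizer; Error/End updates
// retire the request from the map, mirroring handleRequest above.
struct Server {
    void start(int clientIdentifier)
    {
        m_requests[clientIdentifier] = true;
        if (!m_recognizer) {
            m_recognizer = std::make_unique<Recognizer>([this](const Update& update) {
                if (!m_requests.count(update.clientIdentifier))
                    return;
                if (update.type == UpdateType::Error || update.type == UpdateType::End)
                    m_requests.erase(update.clientIdentifier);
                m_sentUpdates.push_back(update); // stand-in for the IPC send
            });
        }
        m_recognizer->start(clientIdentifier);
    }

    void stop(int clientIdentifier)
    {
        // Only the client currently driving the recognizer stops it;
        // others just get an End update and are dropped from the map.
        if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier) {
            m_recognizer->stop();
            return;
        }
        if (m_requests.erase(clientIdentifier))
            m_sentUpdates.push_back({ clientIdentifier, UpdateType::End });
    }

    std::map<int, bool> m_requests;
    std::unique_ptr<Recognizer> m_recognizer;
    std::vector<Update> m_sentUpdates;
};
```

Starting a second client while the first is active produces an Error update for the first client (the behavior exercised by the new fast/speechrecognition/start-second-recognition.html test), then a normal End update when the second client stops.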
  • trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.h

    r269810 r270158  
 #include <WebCore/SpeechRecognitionRequest.h>
 #include <WebCore/SpeechRecognitionResultData.h>
+#include <WebCore/SpeechRecognizer.h>
 #include <wtf/Deque.h>

…
 private:
     void requestPermissionForRequest(WebCore::SpeechRecognitionRequest&);
-    void handleRequest(WebCore::SpeechRecognitionRequest&);
-    void stopRequest(WebCore::SpeechRecognitionRequest&);
-    void abortRequest(WebCore::SpeechRecognitionRequest&);
+    void handleRequest(WebCore::SpeechRecognitionConnectionClientIdentifier);
     void sendUpdate(WebCore::SpeechRecognitionConnectionClientIdentifier, WebCore::SpeechRecognitionUpdateType, Optional<WebCore::SpeechRecognitionError> = WTF::nullopt, Optional<Vector<WebCore::SpeechRecognitionResultData>> = WTF::nullopt);
+    void sendUpdate(const WebCore::SpeechRecognitionUpdate&);

     // IPC::MessageReceiver.
…
     Ref<IPC::Connection> m_connection;
     SpeechRecognitionServerIdentifier m_identifier;
-    HashMap<WebCore::SpeechRecognitionConnectionClientIdentifier, std::unique_ptr<WebCore::SpeechRecognitionRequest>> m_pendingRequests;
-    HashMap<WebCore::SpeechRecognitionConnectionClientIdentifier, std::unique_ptr<WebCore::SpeechRecognitionRequest>> m_ongoingRequests;
+    HashMap<WebCore::SpeechRecognitionConnectionClientIdentifier, std::unique_ptr<WebCore::SpeechRecognitionRequest>> m_requests;
     SpeechRecognitionPermissionChecker m_permissionChecker;
+    std::unique_ptr<WebCore::SpeechRecognizer> m_recognizer;
 };

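The header change also shows the sendUpdate refactor: the parameterized overload now builds the update object and forwards to a new single-argument overload that owns the IPC send, so the recognizer callback, which already holds a complete update, can send it directly. A minimal sketch of that overload pattern, using hypothetical stand-in types (`SpeechUpdate`, `Connection`, `UpdateSender` are not the real WebKit classes):

```cpp
#include <string>
#include <utility>
#include <vector>

// Illustrative stand-ins only.
struct SpeechUpdate {
    int clientIdentifier;
    std::string type;
};

struct Connection {
    std::vector<SpeechUpdate> sent; // records what would go over IPC
    void send(const SpeechUpdate& update) { sent.push_back(update); }
};

struct UpdateSender {
    Connection connection;

    // Convenience overload: build the update from its pieces...
    void sendUpdate(int clientIdentifier, std::string type)
    {
        sendUpdate(SpeechUpdate { clientIdentifier, std::move(type) });
    }

    // ...and one overload that owns the actual send, so callers that
    // already hold a complete update pass it through directly.
    void sendUpdate(const SpeechUpdate& update) { connection.send(update); }
};
```

Funneling every update through one send site is what lets the patch drop the repeated `send(Messages::WebSpeechRecognitionConnection::DidReceiveUpdate(...))` calls scattered through stop/abort above.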
  • trunk/Source/WebKit/UIProcess/WebPageProxy.cpp

    r270136 r270158  
 }

+void WebPageProxy::syncIfMockDevicesEnabledChanged()
+{
+#if ENABLE(MEDIA_STREAM)
+    userMediaPermissionRequestManager().syncWithWebCorePrefs();
+#endif
+}
+
 void WebPageProxy::beginMonitoringCaptureDevices()
 {
  • trunk/Source/WebKit/UIProcess/WebPageProxy.h

    r270136 r270158  
     void requestSpeechRecognitionPermissionByDefaultAction(const WebCore::SecurityOrigin&, CompletionHandler<void(bool)>&&);

+    void syncIfMockDevicesEnabledChanged();
+
 private:
     WebPageProxy(PageClient&, WebProcessProxy&, Ref<API::PageConfiguration>&&);