Speech to Text

The recognizeMicrophone() and recognizeFile() helper methods are recommended for most use-cases. They set up the streams in the appropriate order and enable common options. These two methods are documented below.

The core of the library is the RecognizeStream that performs the actual transcription, and a collection of other Node.js-style streams that manipulate the data in various ways. For less common use-cases, the core components may be used directly with the helper methods serving as optional templates to follow. The full library is documented at http://watson-developer-cloud.github.io/speech-javascript-sdk/master/module-watson-speech_speech-to-text.html

NOTE: The RecognizeStream class lives in the Watson Node SDK. Any option available on this class can be passed into the following methods. These parameters are documented at http://watson-developer-cloud.github.io/node-sdk/master/classes/recognizestream.html

recognizeMicrophone() options:

  • keepMicrophone: if true, preserves the MicrophoneStream for subsequent calls, preventing additional permissions requests in Firefox
  • mediaStream: Optionally pass in an existing media stream rather than prompting the user for microphone access.
  • Other options passed to RecognizeStream
  • Other options passed to SpeakerStream if options.resultsBySpeaker is set to true
  • Other options passed to FormatStream if options.format is not set to false
  • Other options passed to WritableElementStream if options.outputElement is set
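The options above can be combined as in the following minimal sketch. It assumes the library object is passed in as `sdk` (for example a browser global, which is an assumption here), that the helper is reachable as `sdk.SpeechToText.recognizeMicrophone`, and that authentication uses a `token` option. `startMicrophone` is a hypothetical wrapper name, not part of the library.

```javascript
// Hypothetical wrapper; `sdk` is the loaded library object (assumed to expose
// SpeechToText.recognizeMicrophone) and `token` is a pre-fetched auth token.
function startMicrophone(sdk, token, outputElement, mediaStream) {
  const options = {
    token: token,                 // assumption: auth option name
    keepMicrophone: true,         // keep the MicrophoneStream for later calls
    outputElement: outputElement, // handled by WritableElementStream
  };
  if (mediaStream) {
    // Reuse an existing media stream instead of prompting the user again.
    options.mediaStream = mediaStream;
  }
  const stream = sdk.SpeechToText.recognizeMicrophone(options);
  // Attach an error handler; an unhandled 'error' event throws on
  // Node.js-style streams.
  stream.on('error', (err) => console.error('Recognition error:', err));
  return stream;
}
```

Passing the library in as an argument keeps the sketch independent of how the SDK is loaded on the page.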

recognizeMicrophone() requires the getUserMedia API, so browser compatibility is limited (see http://caniuse.com/#search=getusermedia). Also note that Chrome requires https (with a few exceptions for localhost and such); see https://www.chromium.org/Home/chromium-security/prefer-secure-origins-for-powerful-new-features

No more data will be sent after .stop() is called on the returned stream, but additional results may be received for already-sent data.
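Because results can still arrive after .stop(), a caller may want to keep listening until the stream finishes. The sketch below illustrates one way to do that. It assumes the returned stream is in text mode and emits the usual Node.js-style `data` and `end` events. `collectTranscript` is a hypothetical helper name.

```javascript
// Hypothetical helper: gathers results until the stream ends, so results that
// arrive after stop() for already-sent audio are not lost.
function collectTranscript(stream, onComplete) {
  const parts = [];
  stream.on('data', (chunk) => parts.push(String(chunk)));
  // Assumption: the stream emits 'end' once the final results are delivered.
  stream.on('end', () => onComplete(parts.join('')));
  // Return a function suitable for a "stop" button's click handler.
  return function stop() {
    stream.stop(); // stops sending audio; pending results may still arrive
  };
}
```

The returned `stop` function only halts the audio. The transcript is handed to `onComplete` once the stream itself ends.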

recognizeFile() can recognize and optionally attempt to play a URL, File, or Blob (such as from an <input type="file"/> or from an ajax request).

recognizeFile() options:

  • file: a String URL or a Blob or File instance. Note that CORS restrictions apply to URLs.
  • play: (optional, default=false) Attempt to also play the file locally while uploading it for transcription
  • Other options passed to RecognizeStream
  • Other options passed to TimingStream if options.realtime is true, or unset and options.play is true
  • Other options passed to SpeakerStream if options.resultsBySpeaker is set to true
  • Other options passed to FormatStream if options.format is not set to false
  • Other options passed to WritableElementStream if options.outputElement is set
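A minimal sketch of a recognizeFile() call using the options above. It assumes the library object is passed in as `sdk`, that the helper is reachable as `sdk.SpeechToText.recognizeFile`, and that authentication uses a `token` option. `transcribeFile` is a hypothetical wrapper name.

```javascript
// Hypothetical wrapper; `sdk` is assumed to expose SpeechToText.recognizeFile.
function transcribeFile(sdk, token, file, outputElement) {
  return sdk.SpeechToText.recognizeFile({
    token: token,                 // assumption: auth option name
    file: file,                   // String URL, Blob, or File
    play: true,                   // also play locally; realtime is left unset,
                                  // so TimingStream options apply
    outputElement: outputElement, // handled by WritableElementStream
  });
}
```

With a file input, the `file` argument would typically be the first entry of the input's `files` list. With a URL, CORS restrictions apply as noted above.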

play requires that the browser support the format (most browsers support wav and ogg/opus, but not flac). If playback fails, an UNSUPPORTED_FORMAT error is emitted on the RecognizeStream. This error is special in that it does not stop the streaming of results.
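Since a playback failure does not interrupt transcription, an error handler may want to treat it separately from other errors. The sketch below assumes the error carries its identifier in `err.name`, which should be checked against the library source. `isPlaybackFormatError` and `handleFileErrors` are hypothetical names.

```javascript
// Assumption: the error exposes its identifier as `err.name`.
function isPlaybackFormatError(err) {
  return Boolean(err) && err.name === 'UNSUPPORTED_FORMAT';
}

// Hypothetical helper: reports playback failures as warnings and everything
// else as a transcription failure. `notify` is any function taking a message.
function handleFileErrors(stream, notify) {
  stream.on('error', (err) => {
    if (isPlaybackFormatError(err)) {
      // Results keep streaming, so only inform the user.
      notify('Playback is not supported for this format; transcription continues.');
      return;
    }
    notify('Transcription failed: ' + err.message);
  });
}
```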

Playback will automatically stop when .stop() is called on the returned stream.

For Mobile Safari compatibility, a URL must be provided, and recognizeFile() must be called in direct response to a user interaction (so the token must be pre-loaded).
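One way to satisfy the Mobile Safari constraint is to fetch the token ahead of time and call recognizeFile() synchronously inside the click handler. The sketch below assumes the library object is passed in as `sdk` with `sdk.SpeechToText.recognizeFile` available and a `token` option for authentication. `makeTokenCache` and `makePlayHandler` are hypothetical helpers.

```javascript
// Hypothetical holder for a token that is fetched before any user interaction.
function makeTokenCache() {
  let token = null;
  return {
    set(value) { token = value; },
    get() { return token; },
  };
}

// Hypothetical factory: returns a click handler that calls recognizeFile()
// synchronously, with no asynchronous work between the tap and the call.
function makePlayHandler(sdk, tokenCache, audioUrl) {
  return function onClick() {
    const token = tokenCache.get();
    if (!token) {
      return null; // token not loaded yet; do nothing on this tap
    }
    return sdk.SpeechToText.recognizeFile({
      token: token,   // assumption: auth option name
      file: audioUrl, // a URL, as required for Mobile Safari
      play: true,
    });
  };
}
```

The cache would be filled on page load from whatever token endpoint the application uses, and the handler attached with `button.addEventListener('click', makePlayHandler(sdk, cache, url))`.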