+######
 Speech
-======
-
-.. toctree::
-  :maxdepth: 2
-  :hidden:
-
-  client
-  encoding
-  operation
-  result
-  sample
-  alternative
+######

 The `Google Speech`_ API enables developers to convert audio to text.
 The API recognizes over 80 languages and variants, to support your global user
 base.

 .. _Google Speech: https://cloud.google.com/speech/docs/getting-started

-Client
-------

-:class:`~google.cloud.speech.client.Client` objects provide a
+Authentication and Configuration
+--------------------------------
+
+:class:`~google.cloud.speech_v1.SpeechClient` objects provide a
 means to configure your application. Each instance holds
 an authenticated connection to the Cloud Speech Service.

 For an overview of authentication in ``google-cloud-python``, see
 :doc:`/core/auth`.

 Assuming your environment is set up as described in that document,
-create an instance of :class:`~google.cloud.speech.client.Client`.
+create an instance of :class:`~.speech_v1.SpeechClient`.

 .. code-block:: python

     >>> from google.cloud import speech
-    >>> client = speech.Client()
+    >>> client = speech.SpeechClient()


 Asynchronous Recognition
 ------------------------

-The :meth:`~google.cloud.speech.Client.long_running_recognize` sends audio
-data to the Speech API and initiates a Long Running Operation. Using this
-operation, you can periodically poll for recognition results. Use asynchronous
-requests for audio data of any duration up to 80 minutes.
+The :meth:`~.speech_v1.SpeechClient.long_running_recognize` method
+sends audio data to the Speech API and initiates a Long Running Operation.
+
+Using this operation, you can periodically poll for recognition results.
+Use asynchronous requests for audio data of any duration up to 80 minutes.

 See: `Speech Asynchronous Recognize`_
@@ -52,13 +44,16 @@ See: `Speech Asynchronous Recognize`_

     >>> import time
     >>> from google.cloud import speech
-    >>> client = speech.Client()
-    >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-    ...                        encoding=speech.Encoding.LINEAR16,
-    ...                        sample_rate_hertz=44100)
-    >>> operation = sample.long_running_recognize(
-    ...     language_code='en-US',
-    ...     max_alternatives=2,
+    >>> client = speech.SpeechClient()
+    >>> operation = client.long_running_recognize(
+    ...     audio=speech.types.RecognitionAudio(
+    ...         uri='gs://my-bucket/recording.flac',
+    ...     ),
+    ...     config=speech.types.RecognitionConfig(
+    ...         encoding='LINEAR16',
+    ...         language_code='en-US',
+    ...         sample_rate_hertz=44100,
+    ...     ),
     ... )
     >>> retry_count = 100
     >>> while retry_count > 0 and not operation.complete:
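The diff elides the body of the poll loop at the end of this hunk. The general shape of that loop can be sketched self-contained with a hypothetical ``FakeOperation`` stand-in (a real operation object requires a live API call; the ``poll()``/``complete`` names mirror the example's operation interface):

```python
import time

class FakeOperation:
    """Stand-in for a long-running operation; illustration only."""

    def __init__(self, polls_needed):
        self.complete = False
        self._polls = polls_needed

    def poll(self):
        # A real operation would re-fetch its state from the API here.
        self._polls -= 1
        if self._polls <= 0:
            self.complete = True

operation = FakeOperation(polls_needed=3)
retry_count = 100
while retry_count > 0 and not operation.complete:
    retry_count -= 1
    operation.poll()
    time.sleep(0)  # use a real delay (e.g. 5 seconds) against the live API
print(operation.complete)  # → True
```

Against the real API the loop is the same; only the delay and the polling call change.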
@@ -80,7 +75,7 @@ See: `Speech Asynchronous Recognize`_
 Synchronous Recognition
 -----------------------

-The :meth:`~google.cloud.speech.Client.recognize` method converts speech
+The :meth:`~.speech_v1.SpeechClient.recognize` method converts speech
 data to text and returns alternative text transcriptions.

 This example uses ``language_code='en-GB'`` to better recognize a dialect from
@@ -89,12 +84,17 @@ Great Britain.
 .. code-block:: python

     >>> from google.cloud import speech
-    >>> client = speech.Client()
-    >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-    ...                        encoding=speech.Encoding.FLAC,
-    ...                        sample_rate_hertz=44100)
-    >>> results = sample.recognize(
-    ...     language_code='en-GB', max_alternatives=2)
+    >>> client = speech.SpeechClient()
+    >>> results = client.recognize(
+    ...     audio=speech.types.RecognitionAudio(
+    ...         uri='gs://my-bucket/recording.flac',
+    ...     ),
+    ...     config=speech.types.RecognitionConfig(
+    ...         encoding='FLAC',
+    ...         language_code='en-GB',
+    ...         sample_rate_hertz=44100,
+    ...     ),
+    ... )
     >>> for result in results:
     ...     for alternative in result.alternatives:
     ...         print('=' * 20)
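``sample_rate_hertz`` must match the actual audio. As a side note, for WAV input the rate can be read from the file header with the standard-library ``wave`` module; a self-contained sketch (using an in-memory file rather than a real recording):

```python
import io
import wave

# Build a tiny 16-bit mono WAV in memory so the sketch is self-contained.
buf = io.BytesIO()
with wave.open(buf, 'wb') as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)                 # 16-bit samples
    writer.setframerate(44100)
    writer.writeframes(b'\x00\x00' * 441)  # 10 ms of silence

# Read the sample rate back from the header, as you would for a real file.
buf.seek(0)
with wave.open(buf, 'rb') as reader:
    sample_rate_hertz = reader.getframerate()
print(sample_rate_hertz)  # → 44100
```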
@@ -112,14 +112,17 @@ Example of using the profanity filter.
 .. code-block:: python

     >>> from google.cloud import speech
-    >>> client = speech.Client()
-    >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-    ...                        encoding=speech.Encoding.FLAC,
-    ...                        sample_rate_hertz=44100)
-    >>> results = sample.recognize(
-    ...     language_code='en-US',
-    ...     max_alternatives=1,
-    ...     profanity_filter=True,
+    >>> client = speech.SpeechClient()
+    >>> results = client.recognize(
+    ...     audio=speech.types.RecognitionAudio(
+    ...         uri='gs://my-bucket/recording.flac',
+    ...     ),
+    ...     config=speech.types.RecognitionConfig(
+    ...         encoding='FLAC',
+    ...         language_code='en-US',
+    ...         profanity_filter=True,
+    ...         sample_rate_hertz=44100,
+    ...     ),
     ... )
     >>> for result in results:
     ...     for alternative in result.alternatives:
@@ -137,15 +140,20 @@ words to the vocabulary of the recognizer.
 .. code-block:: python

     >>> from google.cloud import speech
-    >>> client = speech.Client()
-    >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-    ...                        encoding=speech.Encoding.FLAC,
-    ...                        sample_rate_hertz=44100)
-    >>> hints = ['hi', 'good afternoon']
-    >>> results = sample.recognize(
-    ...     language_code='en-US',
-    ...     max_alternatives=2,
-    ...     speech_contexts=hints,
+    >>> client = speech.SpeechClient()
+    >>> results = client.recognize(
+    ...     audio=speech.types.RecognitionAudio(
+    ...         uri='gs://my-bucket/recording.flac',
+    ...     ),
+    ...     config=speech.types.RecognitionConfig(
+    ...         encoding='FLAC',
+    ...         language_code='en-US',
+    ...         sample_rate_hertz=44100,
+    ...         speech_contexts=[speech.types.SpeechContext(
+    ...             phrases=['hi', 'good afternoon'],
+    ...         )],
+    ...     ),
     ... )
     >>> for result in results:
     ...     for alternative in result.alternatives:
@@ -160,7 +168,7 @@ words to the vocabulary of the recognizer.
 Streaming Recognition
 ---------------------

-The :meth:`~google.cloud.speech.Client.streaming_recognize` method converts
+The :meth:`~.speech_v1.SpeechClient.streaming_recognize` method converts
 speech data to possible text alternatives on the fly.

 .. note::
@@ -170,18 +178,27 @@ speech data to possible text alternatives on the fly.

 .. code-block:: python

+    >>> import io
     >>> from google.cloud import speech
-    >>> client = speech.Client()
-    >>> with open('./hello.wav', 'rb') as stream:
-    ...     sample = client.sample(stream=stream,
-    ...                            encoding=speech.Encoding.LINEAR16,
-    ...                            sample_rate_hertz=16000)
-    ...     results = sample.streaming_recognize(language_code='en-US')
-    ...     for result in results:
-    ...         for alternative in result.alternatives:
-    ...             print('=' * 20)
-    ...             print('transcript: ' + alternative.transcript)
-    ...             print('confidence: ' + str(alternative.confidence))
+    >>> client = speech.SpeechClient()
+    >>> config = speech.types.RecognitionConfig(
+    ...     encoding='LINEAR16',
+    ...     language_code='en-US',
+    ...     sample_rate_hertz=44100,
+    ... )
+    >>> with io.open('./hello.wav', 'rb') as stream:
+    ...     requests = [speech.types.StreamingRecognizeRequest(
+    ...         audio_content=stream.read(),
+    ...     )]
+    >>> results = client.streaming_recognize(
+    ...     speech.types.StreamingRecognitionConfig(config=config),
+    ...     requests,
+    ... )
+    >>> for result in results:
+    ...     for alternative in result.alternatives:
+    ...         print('=' * 20)
+    ...         print('transcript: ' + alternative.transcript)
+    ...         print('confidence: ' + str(alternative.confidence))
     ====================
     transcript: hello thank you for using Google Cloud platform
     confidence: 0.927983105183
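The example above sends the entire file as a single request. In practice streaming audio is usually split into small chunks, one request per chunk; a minimal self-contained sketch of such a splitter (the ``chunked`` helper and the 3200-byte chunk size are illustrative, not part of the library):

```python
import io

def chunked(stream, chunk_size=3200):
    """Yield successive chunk_size-byte blocks from a binary stream."""
    while True:
        data = stream.read(chunk_size)
        if not data:
            return
        yield data

# Stand-in for real audio bytes; each chunk would become one
# StreamingRecognizeRequest(audio_content=chunk).
audio = io.BytesIO(b'\x00' * 8000)
sizes = [len(part) for part in chunked(audio)]
print(sizes)  # → [3200, 3200, 1600]
```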
@@ -193,20 +210,36 @@ until the client closes the output stream or until the maximum time limit has
 been reached.

 If you only want to recognize a single utterance, you can set
-``single_utterance`` to :data:`True` and only one result will be returned.
+``single_utterance`` to :data:`True` and only one result will be returned.

 See: `Single Utterance`_

 .. code-block:: python

-    >>> with open('./hello_pause_goodbye.wav', 'rb') as stream:
-    ...     sample = client.sample(stream=stream,
-    ...                            encoding=speech.Encoding.LINEAR16,
-    ...                            sample_rate_hertz=16000)
-    ...     results = sample.streaming_recognize(
-    ...         language_code='en-US',
-    ...         single_utterance=True,
-    ...     )
+    >>> import io
+    >>> from google.cloud import speech
+    >>> client = speech.SpeechClient()
+    >>> config = speech.types.RecognitionConfig(
+    ...     encoding='LINEAR16',
+    ...     language_code='en-US',
+    ...     sample_rate_hertz=44100,
+    ... )
+    >>> with io.open('./hello-pause-goodbye.wav', 'rb') as stream:
+    ...     requests = [speech.types.StreamingRecognizeRequest(
+    ...         audio_content=stream.read(),
+    ...     )]
+    >>> results = client.streaming_recognize(
+    ...     speech.types.StreamingRecognitionConfig(
+    ...         config=config,
+    ...         single_utterance=True,
+    ...     ),
+    ...     requests,
+    ... )
+    >>> for result in results:
+    ...     for alternative in result.alternatives:
+    ...         print('=' * 20)
+    ...         print('transcript: ' + alternative.transcript)
+    ...         print('confidence: ' + str(alternative.confidence))
-    ...     for result in results:
-    ...         for alternative in result.alternatives:
-    ...             print('=' * 20)
@@ -221,22 +254,31 @@ If ``interim_results`` is set to :data:`True`, interim results

 .. code-block:: python

+    >>> import io
     >>> from google.cloud import speech
-    >>> client = speech.Client()
-    >>> with open('./hello.wav', 'rb') as stream:
-    ...     sample = client.sample(stream=stream,
-    ...                            encoding=speech.Encoding.LINEAR16,
-    ...                            sample_rate=16000)
-    ...     results = sample.streaming_recognize(
-    ...         interim_results=True,
-    ...         language_code='en-US',
-    ...     )
-    ...     for result in results:
-    ...         for alternative in result.alternatives:
-    ...             print('=' * 20)
-    ...             print('transcript: ' + alternative.transcript)
-    ...             print('confidence: ' + str(alternative.confidence))
-    ...             print('is_final:' + str(result.is_final))
+    >>> client = speech.SpeechClient()
+    >>> config = speech.types.RecognitionConfig(
+    ...     encoding='LINEAR16',
+    ...     language_code='en-US',
+    ...     sample_rate_hertz=44100,
+    ... )
+    >>> with io.open('./hello.wav', 'rb') as stream:
+    ...     requests = [speech.types.StreamingRecognizeRequest(
+    ...         audio_content=stream.read(),
+    ...     )]
+    >>> results = client.streaming_recognize(
+    ...     speech.types.StreamingRecognitionConfig(
+    ...         config=config,
+    ...         interim_results=True,
+    ...     ),
+    ...     requests,
+    ... )
+    >>> for result in results:
+    ...     for alternative in result.alternatives:
+    ...         print('=' * 20)
+    ...         print('transcript: ' + alternative.transcript)
+    ...         print('confidence: ' + str(alternative.confidence))
+    ...         print('is_final:' + str(result.is_final))
     ====================
     'he'
     None
@@ -254,3 +296,13 @@ If ``interim_results`` is set to :data:`True`, interim results
 .. _Single Utterance: https://cloud.google.com/speech/reference/rpc/google.cloud.speech.v1beta1#streamingrecognitionconfig
 .. _sync_recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/syncrecognize
 .. _Speech Asynchronous Recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/asyncrecognize
+
+
+API Reference
+-------------
+
+.. toctree::
+  :maxdepth: 2
+
+  gapic/api
+  gapic/types