Commit 401bf40

lukesneeringer authored and dhermes committed

Speech GAPIC to master (googleapis#3607)

* Vendor the GAPIC for Speech.
* Speech Partial Veneer (googleapis#3483)
* Update to docs based on @dhermes catch.
* Fix incorrect variable.
* Fix the docs.
* Style fixes to unit tests.
* More PR review from me.
1 parent 66a9258 commit 401bf40

35 files changed

Lines changed: 2589 additions & 288 deletions

docs/index.rst

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@
     resource-manager/api
     runtimeconfig/usage
     spanner/usage
-    speech/usage
+    speech/index
     error-reporting/usage
     monitoring/usage
     logging/usage

docs/speech/alternative.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/speech/client.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/speech/encoding.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/speech/gapic/api.rst

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+Speech Client API
+=================
+
+.. automodule:: google.cloud.speech_v1
+    :members:
+    :inherited-members:

docs/speech/gapic/types.rst

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+Speech Client Types
+===================
+
+.. automodule:: google.cloud.speech_v1.types
+    :members:
Lines changed: 140 additions & 88 deletions
@@ -1,49 +1,41 @@
+######
 Speech
-======
-
-.. toctree::
-  :maxdepth: 2
-  :hidden:
-
-  client
-  encoding
-  operation
-  result
-  sample
-  alternative
+######

 The `Google Speech`_ API enables developers to convert audio to text.
 The API recognizes over 80 languages and variants, to support your global user
 base.

 .. _Google Speech: https://cloud.google.com/speech/docs/getting-started

-Client
-------

-:class:`~google.cloud.speech.client.Client` objects provide a
+Authentication and Configuration
+--------------------------------
+
+:class:`~google.cloud.speech_v1.SpeechClient` objects provide a
 means to configure your application. Each instance holds
 an authenticated connection to the Cloud Speech Service.

 For an overview of authentication in ``google-cloud-python``, see
 :doc:`/core/auth`.

 Assuming your environment is set up as described in that document,
-create an instance of :class:`~google.cloud.speech.client.Client`.
+create an instance of :class:`~.speech_v1.SpeechClient`.

 .. code-block:: python

    >>> from google.cloud import speech
-   >>> client = speech.Client()
+   >>> client = speech.SpeechClient()


 Asynchronous Recognition
 ------------------------

-The :meth:`~google.cloud.speech.Client.long_running_recognize` sends audio
-data to the Speech API and initiates a Long Running Operation. Using this
-operation, you can periodically poll for recognition results. Use asynchronous
-requests for audio data of any duration up to 80 minutes.
+The :meth:`~.speech_v1.SpeechClient.long_running_recognize` method
+sends audio data to the Speech API and initiates a Long Running Operation.
+
+Using this operation, you can periodically poll for recognition results.
+Use asynchronous requests for audio data of any duration up to 80 minutes.

 See: `Speech Asynchronous Recognize`_
@@ -52,13 +44,16 @@ See: `Speech Asynchronous Recognize`_

    >>> import time
    >>> from google.cloud import speech
-   >>> client = speech.Client()
-   >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-   ...                        encoding=speech.Encoding.LINEAR16,
-   ...                        sample_rate_hertz=44100)
-   >>> operation = sample.long_running_recognize(
-   ...     language_code='en-US',
-   ...     max_alternatives=2,
+   >>> client = speech.SpeechClient()
+   >>> operation = client.long_running_recognize(
+   ...     audio=speech.types.RecognitionAudio(
+   ...         uri='gs://my-bucket/recording.flac',
+   ...     ),
+   ...     config=speech.types.RecognitionConfig(
+   ...         encoding='LINEAR16',
+   ...         language_code='en-US',
+   ...         sample_rate_hertz=44100,
+   ...     ),
    ... )
    >>> retry_count = 100
    >>> while retry_count > 0 and not operation.complete:
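The polling loop in the new example can be sketched without the client library. `FakeOperation` below is a hypothetical stand-in for the operation handle that `long_running_recognize` returns; only the `complete` attribute and the poll-until-done shape mirror the doctest above, everything else is illustrative:

```python
import time


class FakeOperation:
    """Hypothetical stand-in for a long-running operation handle."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done
        self.complete = False

    def poll(self):
        # Each poll asks the (simulated) service whether recognition finished.
        self._remaining -= 1
        if self._remaining <= 0:
            self.complete = True


def wait_for_operation(operation, retries=100, delay=0.01):
    # Mirror the doctest: sleep, poll, and stop when complete
    # or when the retry budget is exhausted.
    while retries > 0 and not operation.complete:
        retries -= 1
        time.sleep(delay)
        operation.poll()
    return operation.complete


print(wait_for_operation(FakeOperation(polls_until_done=3)))
```

A bounded retry count (rather than `while not operation.complete`) keeps the loop from hanging forever if the operation stalls.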
@@ -80,7 +75,7 @@
 Synchronous Recognition
 -----------------------

-The :meth:`~google.cloud.speech.Client.recognize` method converts speech
+The :meth:`~.speech_v1.SpeechClient.recognize` method converts speech
 data to text and returns alternative text transcriptions.

 This example uses ``language_code='en-GB'`` to better recognize a dialect from
@@ -89,12 +84,17 @@ Great Britain.
 .. code-block:: python

    >>> from google.cloud import speech
-   >>> client = speech.Client()
-   >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-   ...                        encoding=speech.Encoding.FLAC,
-   ...                        sample_rate_hertz=44100)
-   >>> results = sample.recognize(
-   ...     language_code='en-GB', max_alternatives=2)
+   >>> client = speech.SpeechClient()
+   >>> results = client.recognize(
+   ...     audio=speech.types.RecognitionAudio(
+   ...         uri='gs://my-bucket/recording.flac',
+   ...     ),
+   ...     config=speech.types.RecognitionConfig(
+   ...         encoding='FLAC',
+   ...         language_code='en-GB',
+   ...         sample_rate_hertz=44100,
+   ...     ),
+   ... )
    >>> for result in results:
    ...     for alternative in result.alternatives:
    ...         print('=' * 20)
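The nested results/alternatives walk above can be sketched with plain stand-ins. `Alternative` and `Result` are hypothetical namedtuples mirroring the shape of the response messages, not the real protobuf types:

```python
from collections import namedtuple

# Hypothetical stand-ins for the response message types.
Alternative = namedtuple('Alternative', ['transcript', 'confidence'])
Result = namedtuple('Result', ['alternatives'])

results = [
    Result(alternatives=[
        Alternative('hello world', 0.97),
        Alternative('hello word', 0.61),
    ]),
]


def best_transcripts(results):
    # Pick the highest-confidence alternative per result; sorting by
    # confidence makes the intent explicit even when the service
    # already orders alternatives best-first.
    return [max(r.alternatives, key=lambda a: a.confidence).transcript
            for r in results]


print(best_transcripts(results))  # ['hello world']
```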
@@ -112,14 +112,17 @@ Example of using the profanity filter.
 .. code-block:: python

    >>> from google.cloud import speech
-   >>> client = speech.Client()
-   >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-   ...                        encoding=speech.Encoding.FLAC,
-   ...                        sample_rate_hertz=44100)
-   >>> results = sample.recognize(
-   ...     language_code='en-US',
-   ...     max_alternatives=1,
-   ...     profanity_filter=True,
+   >>> client = speech.SpeechClient()
+   >>> results = client.recognize(
+   ...     audio=speech.types.RecognitionAudio(
+   ...         uri='gs://my-bucket/recording.flac',
+   ...     ),
+   ...     config=speech.types.RecognitionConfig(
+   ...         encoding='LINEAR16',
+   ...         language_code='en-US',
+   ...         profanity_filter=True,
+   ...         sample_rate_hertz=44100,
+   ...     ),
    ... )
    >>> for result in results:
    ...     for alternative in result.alternatives:
@@ -137,15 +140,20 @@ words to the vocabulary of the recognizer.
 .. code-block:: python

    >>> from google.cloud import speech
-   >>> client = speech.Client()
-   >>> sample = client.sample(source_uri='gs://my-bucket/recording.flac',
-   ...                        encoding=speech.Encoding.FLAC,
-   ...                        sample_rate_hertz=44100)
-   >>> hints = ['hi', 'good afternoon']
-   >>> results = sample.recognize(
-   ...     language_code='en-US',
-   ...     max_alternatives=2,
-   ...     speech_contexts=hints,
+   >>> client = speech.SpeechClient()
+   >>> results = client.recognize(
+   ...     audio=speech.types.RecognitionAudio(
+   ...         uri='gs://my-bucket/recording.flac',
+   ...     ),
+   ...     config=speech.types.RecognitionConfig(
+   ...         encoding='LINEAR16',
+   ...         language_code='en-US',
+   ...         sample_rate_hertz=44100,
+   ...         speech_contexts=[speech.types.SpeechContext(
+   ...             phrases=['hi', 'good afternoon'],
+   ...         )],
+   ...     ),
    ... )
    >>> for result in results:
    ...     for alternative in result.alternatives:
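The nested audio/config payload above can be mirrored with plain dicts, which is a convenient way to see its shape. `make_recognize_kwargs` is a hypothetical helper, not part of the library; the field names simply follow the example:

```python
def make_recognize_kwargs(uri, phrases, language_code='en-US',
                          sample_rate_hertz=44100, encoding='LINEAR16'):
    """Build the nested audio/config payload as plain dicts.

    Hypothetical sketch: it only shows the shape of the keyword
    arguments passed to recognize(); the real call takes the
    corresponding protobuf message types.
    """
    return {
        'audio': {'uri': uri},
        'config': {
            'encoding': encoding,
            'language_code': language_code,
            'sample_rate_hertz': sample_rate_hertz,
            'speech_contexts': [{'phrases': list(phrases)}],
        },
    }


kwargs = make_recognize_kwargs('gs://my-bucket/recording.flac',
                               ['hi', 'good afternoon'])
print(kwargs['config']['speech_contexts'])
```

Note that phrase hints live inside the `config`, one list of `phrases` per speech context, rather than being a top-level argument as in the old `sample.recognize` surface.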
@@ -160,7 +168,7 @@ words to the vocabulary of the recognizer.
 Streaming Recognition
 ---------------------

-The :meth:`~google.cloud.speech.Client.streaming_recognize` method converts
+The :meth:`~.speech_v1.SpeechClient.streaming_recognize` method converts
 speech data to possible text alternatives on the fly.

 .. note::
@@ -170,18 +178,27 @@ speech data to possible text alternatives on the fly.

 .. code-block:: python

+   >>> import io
    >>> from google.cloud import speech
-   >>> client = speech.Client()
-   >>> with open('./hello.wav', 'rb') as stream:
-   ...     sample = client.sample(stream=stream,
-   ...                            encoding=speech.Encoding.LINEAR16,
-   ...                            sample_rate_hertz=16000)
-   ...     results = sample.streaming_recognize(language_code='en-US')
-   ...     for result in results:
-   ...         for alternative in result.alternatives:
-   ...             print('=' * 20)
-   ...             print('transcript: ' + alternative.transcript)
-   ...             print('confidence: ' + str(alternative.confidence))
+   >>> client = speech.SpeechClient()
+   >>> config = speech.types.RecognitionConfig(
+   ...     encoding='LINEAR16',
+   ...     language_code='en-US',
+   ...     sample_rate_hertz=44100,
+   ... )
+   >>> with io.open('./hello.wav', 'rb') as stream:
+   ...     requests = [speech.types.StreamingRecognizeRequest(
+   ...         audio_content=stream.read(),
+   ...     )]
+   >>> results = client.streaming_recognize(
+   ...     speech.types.StreamingRecognitionConfig(config=config),
+   ...     requests,
+   ... )
+   >>> for result in results:
+   ...     for alternative in result.alternatives:
+   ...         print('=' * 20)
+   ...         print('transcript: ' + alternative.transcript)
+   ...         print('confidence: ' + str(alternative.confidence))
    ====================
    transcript: hello thank you for using Google Cloud platform
    confidence: 0.927983105183
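The example above reads the whole file into a single request; for genuinely streaming input you would yield many small requests instead. A sketch with `io.BytesIO` standing in for the audio source and a plain dict standing in for each `StreamingRecognizeRequest` message:

```python
import io


def request_stream(stream, chunk_size=4096):
    """Yield one request-shaped dict per audio chunk.

    Hypothetical stand-in for a stream of StreamingRecognizeRequest
    messages: each request carries at most chunk_size bytes of audio.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield {'audio_content': chunk}


audio = io.BytesIO(b'\x00' * 10000)  # fake 10,000-byte recording
requests = list(request_stream(audio, chunk_size=4096))
print([len(r['audio_content']) for r in requests])  # [4096, 4096, 1808]
```

Sending chunks lets the service start recognizing before the recording ends, which is the point of the streaming surface.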
@@ -193,20 +210,36 @@ until the client closes the output stream or until the maximum time limit has
 been reached.

 If you only want to recognize a single utterance you can set
 ``single_utterance`` to :data:`True` and only one result will be returned.

 See: `Single Utterance`_

 .. code-block:: python

-   >>> with open('./hello_pause_goodbye.wav', 'rb') as stream:
-   ...     sample = client.sample(stream=stream,
-   ...                            encoding=speech.Encoding.LINEAR16,
-   ...                            sample_rate_hertz=16000)
-   ...     results = sample.streaming_recognize(
-   ...         language_code='en-US',
-   ...         single_utterance=True,
-   ...     )
+   >>> import io
+   >>> from google.cloud import speech
+   >>> client = speech.SpeechClient()
+   >>> config = speech.types.RecognitionConfig(
+   ...     encoding='LINEAR16',
+   ...     language_code='en-US',
+   ...     sample_rate_hertz=44100,
+   ... )
+   >>> with io.open('./hello-pause-goodbye.wav', 'rb') as stream:
+   ...     requests = [speech.types.StreamingRecognizeRequest(
+   ...         audio_content=stream.read(),
+   ...     )]
+   >>> results = client.streaming_recognize(
+   ...     speech.types.StreamingRecognitionConfig(
+   ...         config=config,
+   ...         single_utterance=True,
+   ...     ),
+   ...     requests,
+   ... )
+   >>> for result in results:
+   ...     for alternative in result.alternatives:
+   ...         print('=' * 20)
+   ...         print('transcript: ' + alternative.transcript)
+   ...         print('confidence: ' + str(alternative.confidence))
    ...     for result in results:
    ...         for alternative in result.alternatives:
    ...             print('=' * 20)
@@ -221,22 +254,31 @@ If ``interim_results`` is set to :data:`True`, interim results

 .. code-block:: python

+   >>> import io
    >>> from google.cloud import speech
-   >>> client = speech.Client()
-   >>> with open('./hello.wav', 'rb') as stream:
-   ...     sample = client.sample(stream=stream,
-   ...                            encoding=speech.Encoding.LINEAR16,
-   ...                            sample_rate=16000)
-   ...     results = sample.streaming_recognize(
-   ...         interim_results=True,
-   ...         language_code='en-US',
-   ...     )
-   ...     for result in results:
-   ...         for alternative in result.alternatives:
-   ...             print('=' * 20)
-   ...             print('transcript: ' + alternative.transcript)
-   ...             print('confidence: ' + str(alternative.confidence))
-   ...             print('is_final:' + str(result.is_final))
+   >>> client = speech.SpeechClient()
+   >>> config = speech.types.RecognitionConfig(
+   ...     encoding='LINEAR16',
+   ...     language_code='en-US',
+   ...     sample_rate_hertz=44100,
+   ... )
+   >>> with io.open('./hello.wav', 'rb') as stream:
+   ...     requests = [speech.types.StreamingRecognizeRequest(
+   ...         audio_content=stream.read(),
+   ...     )]
+   >>> results = client.streaming_recognize(
+   ...     speech.types.StreamingRecognitionConfig(
+   ...         config=config,
+   ...         interim_results=True,
+   ...     ),
+   ...     requests,
+   ... )
+   >>> for result in results:
+   ...     for alternative in result.alternatives:
+   ...         print('=' * 20)
+   ...         print('transcript: ' + alternative.transcript)
+   ...         print('confidence: ' + str(alternative.confidence))
+   ...         print('is_final:' + str(result.is_final))
    ====================
    'he'
    None
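With interim results enabled, the same utterance arrives several times before the final version. Filtering on `is_final` can be sketched with a plain stand-in; `Result` below is a hypothetical namedtuple, not the real streaming message type:

```python
from collections import namedtuple

# Hypothetical stand-in for a streaming recognition result.
Result = namedtuple('Result', ['transcript', 'is_final'])

stream = [
    Result('he', False),
    Result('hell', False),
    Result('hello', True),
    Result('thank', False),
    Result('thank you', True),
]


def final_transcripts(results):
    # With interim_results=True, keep only results flagged is_final;
    # interim results are provisional and may be revised.
    return [r.transcript for r in results if r.is_final]


print(final_transcripts(stream))  # ['hello', 'thank you']
```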
@@ -254,3 +296,13 @@ If ``interim_results`` is set to :data:`True`, interim results
 .. _Single Utterance: https://cloud.google.com/speech/reference/rpc/google.cloud.speech.v1beta1#streamingrecognitionconfig
 .. _sync_recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/syncrecognize
 .. _Speech Asynchronous Recognize: https://cloud.google.com/speech/reference/rest/v1beta1/speech/asyncrecognize
+
+
+API Reference
+-------------
+
+.. toctree::
+  :maxdepth: 2
+
+  gapic/api
+  gapic/types

docs/speech/operation.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/speech/result.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.
