@@ -25,51 +25,51 @@ class AudioEncoding(enum.IntEnum):
2525
2626 All encodings support only 1 channel (mono) audio.
2727
28- For best results, the audio source should be captured and transmitted using
29- a lossless encoding (``FLAC`` or ``LINEAR16``). The accuracy of the speech
30- recognition can be reduced if lossy codecs are used to capture or transmit
31- audio, particularly if background noise is present. Lossy codecs include
32- ``MULAW``, ``AMR``, ``AMR_WB``, ``OGG_OPUS``, and ``SPEEX_WITH_HEADER_BYTE``.
33-
34- The ``FLAC`` and ``WAV`` audio file formats include a header that describes the
35- included audio content. You can request recognition for ``WAV`` files that
36- contain either ``LINEAR16`` or ``MULAW`` encoded audio.
37- If you send ``FLAC`` or ``WAV`` audio file format in
38- your request, you do not need to specify an ``AudioEncoding``; the audio
39- encoding format is determined from the file header. If you specify
40- an ``AudioEncoding`` when you send send ``FLAC`` or ``WAV`` audio, the
28+ For best results, the audio source should be captured and transmitted
29+ using a lossless encoding (``FLAC`` or ``LINEAR16``). The accuracy of
30+ the speech recognition can be reduced if lossy codecs are used to
31+ capture or transmit audio, particularly if background noise is present.
32+ Lossy codecs include ``MULAW``, ``AMR``, ``AMR_WB``, ``OGG_OPUS``, and
33+ ``SPEEX_WITH_HEADER_BYTE``.
34+
35+ The ``FLAC`` and ``WAV`` audio file formats include a header that
36+ describes the included audio content. You can request recognition for
37+ ``WAV`` files that contain either ``LINEAR16`` or ``MULAW`` encoded
38+ audio. If you send ``FLAC`` or ``WAV`` audio file format in your
39+ request, you do not need to specify an ``AudioEncoding``; the audio
40+ encoding format is determined from the file header. If you specify an
41+ ``AudioEncoding`` when you send ``FLAC`` or ``WAV`` audio, the
4142 encoding configuration must match the encoding described in the audio
4243 header; otherwise the request returns an
4344 ``google.rpc.Code.INVALID_ARGUMENT`` error code.
4445
4546 Attributes:
4647 ENCODING_UNSPECIFIED (int): Not specified.
4748 LINEAR16 (int): Uncompressed 16-bit signed little-endian samples (Linear PCM).
48- FLAC (int): ``FLAC`` (Free Lossless Audio
49- Codec) is the recommended encoding because it is
50- lossless--therefore recognition is not compromised--and
51- requires only about half the bandwidth of ``LINEAR16``. ``FLAC`` stream
52- encoding supports 16-bit and 24-bit samples, however, not all fields in
49+ FLAC (int): ``FLAC`` (Free Lossless Audio Codec) is the recommended encoding because
50+ it is lossless--therefore recognition is not compromised--and requires
51+ only about half the bandwidth of ``LINEAR16``. ``FLAC`` stream encoding
52+ supports 16-bit and 24-bit samples; however, not all fields in
5353 ``STREAMINFO`` are supported.
5454 MULAW (int): 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law.
55- AMR (int): Adaptive Multi-Rate Narrowband codec. ``sample_rate_hertz`` must be 8000.
55+ AMR (int): Adaptive Multi-Rate Narrowband codec. ``sample_rate_hertz`` must be
56+ 8000.
5657 AMR_WB (int): Adaptive Multi-Rate Wideband codec. ``sample_rate_hertz`` must be 16000.
5758 OGG_OPUS (int): Opus encoded audio frames in Ogg container
58- (`OggOpus <https://wiki.xiph.org/OggOpus>`_).
59- ``sample_rate_hertz`` must be one of 8000, 12000, 16000, 24000, or 48000.
59+ (`OggOpus <https://wiki.xiph.org/OggOpus>`__). ``sample_rate_hertz``
60+ must be one of 8000, 12000, 16000, 24000, or 48000.
6061 SPEEX_WITH_HEADER_BYTE (int): Although the use of lossy encodings is not recommended, if a very low
6162 bitrate encoding is required, ``OGG_OPUS`` is highly preferred over
62- Speex encoding. The `Speex <https://speex.org/>`_ encoding supported by
63+ Speex encoding. The `Speex <https://speex.org/>`__ encoding supported by
6364 Cloud Speech API has a header byte in each block, as in MIME type
64- ``audio/x-speex-with-header-byte``.
65- It is a variant of the RTP Speex encoding defined in
66- `RFC 5574 <https://tools.ietf.org/html/rfc5574>`_.
65+ ``audio/x-speex-with-header-byte``. It is a variant of the RTP Speex
66+ encoding defined in `RFC 5574 <https://tools.ietf.org/html/rfc5574>`__.
6767 The stream is a sequence of blocks, one block per RTP packet. Each block
68- starts with a byte containing the length of the block, in bytes, followed
69- by one or more frames of Speex data, padded to an integral number of
70- bytes (octets) as specified in RFC 5574. In other words, each RTP header
71- is replaced with a single byte containing the block length. Only Speex
72- wideband is supported. ``sample_rate_hertz`` must be 16000.
68+ starts with a byte containing the length of the block, in bytes,
69+ followed by one or more frames of Speex data, padded to an integral
70+ number of bytes (octets) as specified in RFC 5574. In other words, each
71+ RTP header is replaced with a single byte containing the block length.
72+ Only Speex wideband is supported. ``sample_rate_hertz`` must be 16000.
7373 """
7474 ENCODING_UNSPECIFIED = 0
7575 LINEAR16 = 1
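
The sample-rate constraints stated in the ``AudioEncoding`` docstring above can be collected in a lookup table. This is a minimal sketch, not part of the library: `ALLOWED_SAMPLE_RATES` and `check_sample_rate` are hypothetical names, encodings are keyed by their enum member names as strings, and any encoding for which the docstring states no constraint is assumed to accept any positive rate.

```python
from typing import Dict, FrozenSet

# Constraints copied from the AudioEncoding docstring.
# The table name and the helper below are hypothetical, for illustration only.
ALLOWED_SAMPLE_RATES: Dict[str, FrozenSet[int]] = {
    "AMR": frozenset({8000}),
    "AMR_WB": frozenset({16000}),
    "OGG_OPUS": frozenset({8000, 12000, 16000, 24000, 48000}),
    "SPEEX_WITH_HEADER_BYTE": frozenset({16000}),
}


def check_sample_rate(encoding: str, sample_rate_hertz: int) -> bool:
    """Return True if the rate is consistent with the documented constraint."""
    allowed = ALLOWED_SAMPLE_RATES.get(encoding)
    if allowed is None:
        # Assumption: the docstring states no constraint for this encoding,
        # so only reject rates that cannot be valid at all.
        return sample_rate_hertz > 0
    return sample_rate_hertz in allowed
```

Running a check like this on the client side would surface a mismatched configuration before the request is sent, rather than relying on the service to reject it.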
@@ -91,9 +91,9 @@ class InteractionType(enum.IntEnum):
9191 INTERACTION_TYPE_UNSPECIFIED (int): Use case is either unknown or is something other than one of the other
9292 values below.
9393 DISCUSSION (int): Multiple people in a conversation or discussion. For example in a
94- meeting with two or more people actively participating. Typically
95- all the primary people speaking would be in the same room (if not,
96- see PHONE_CALL )
94+ meeting with two or more people actively participating. Typically all
95+ the primary people speaking would be in the same room (if not, see
96+ PHONE\_CALL )
9797 PRESENTATION (int): One or more persons lecturing or presenting to others, mostly
9898 uninterrupted.
9999 PHONE_CALL (int): A phone-call or video-conference in which two or more people, who are
@@ -178,9 +178,10 @@ class SpeechEventType(enum.IntEnum):
178178 speech utterance and expects no additional speech. Therefore, the server
179179 will not process additional audio (although it may subsequently return
180180 additional results). The client should stop sending additional audio
181- data, half-close the gRPC connection, and wait for any additional results
182- until the server closes the gRPC connection. This event is only sent if
183- ``single_utterance`` was set to ``true``, and is not used otherwise.
181+ data, half-close the gRPC connection, and wait for any additional
182+ results until the server closes the gRPC connection. This event is only
183+ sent if ``single_utterance`` was set to ``true``, and is not used
184+ otherwise.
184185 """
185186 SPEECH_EVENT_UNSPECIFIED = 0
186187 END_OF_SINGLE_UTTERANCE = 1
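
The client behaviour described for ``END_OF_SINGLE_UTTERANCE`` (stop sending audio, half-close, keep reading results) can be sketched without any network code. Everything below is an illustrative assumption: `audio_requests` and `consume` are hypothetical helpers, and responses are modelled as plain `(event_type, result)` tuples rather than the library's response messages. In gRPC Python, exhausting the request iterator is generally what half-closes a streaming call, which is why the generator simply returns.

```python
import threading
from typing import Iterable, Iterator, List, Optional, Tuple

END_OF_SINGLE_UTTERANCE = 1  # value shown in the SpeechEventType enum


def audio_requests(chunks: Iterable[bytes], stop: threading.Event) -> Iterator[bytes]:
    """Yield audio chunks until the stop flag is set, then end the stream."""
    for chunk in chunks:
        if stop.is_set():
            return  # ending the iterator stands in for the half-close
        yield chunk


def consume(
    responses: Iterable[Tuple[int, Optional[str]]], stop: threading.Event
) -> List[str]:
    """Read responses to the end, flagging the sender to stop on the event."""
    results: List[str] = []
    for event_type, result in responses:
        if event_type == END_OF_SINGLE_UTTERANCE:
            stop.set()  # tell the sender to stop; keep reading results
        elif result is not None:
            results.append(result)
    return results
```

Note that `consume` does not break out of its loop on the event, since the server may still return additional results before it closes the connection.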
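
The block layout described for ``SPEEX_WITH_HEADER_BYTE`` in the first hunk (one length byte, then the Speex frames for that block) can be walked with a short loop. This is a sketch under one stated assumption: the length byte counts only the payload that follows it, not the byte itself, which the docstring does not spell out. `split_speex_blocks` is a hypothetical helper, not a library function.

```python
from typing import List


def split_speex_blocks(stream: bytes) -> List[bytes]:
    """Split a header-byte Speex stream into its per-block payloads."""
    blocks: List[bytes] = []
    pos = 0
    while pos < len(stream):
        length = stream[pos]  # single header byte holding the block length
        start = pos + 1
        end = start + length  # assumption: length excludes the header byte
        if end > len(stream):
            raise ValueError(
                f"block at offset {pos} declares {length} bytes, "
                f"but only {len(stream) - start} remain"
            )
        blocks.append(stream[start:end])
        pos = end
    return blocks
```

Because the length fits in one byte, each block payload is at most 255 bytes under this reading.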