# Python Sample Authoring Guide

We're happy you want to write a Python sample! Like a lot of Pythonistas,
we're opinioned and fussy. This guide is a reference for the format and
style expected of samples contributed to the
[python-docs-samples](https://github.com/GoogleCloudPlatform/python-docs-samples)
repo. The guidelines below are intended to ensure that all Python samples
meet the following goals:

* **Copy-paste-runnable.** A developer should be able to copy and paste the
code into their own environment and run it with as few modifications as
possible.
* **Teach through code.** Each sample should demonstrate best practices for
interacting with Google Cloud libraries, APIs, or services.
* **Idiomatic.** Each sample should follow widely accepted Python best
practices as covered below.

## Sample Guidelines

This section covers guidelines for Python samples. Note that
[Testing Guidelines](#testing-guidelines) are covered separately below.

### Folder Location

Samples that primarily show the use of one client library should be placed in the
client library repository. Other samples should be placed in this repository
`python-docs-samples`.

**Library repositories:** Each sample should be in the top-level samples folder `samples`
in the client library repository. See the [Text-to-Speech samples](https://github.com/googleapis/python-texttospeech/tree/master/samples)
for an example.

**python-docs-samples:** Each sample should be in a folder under the top-level folder of
[python-docs-samples](https://github.com/GoogleCloudPlatform/python-docs-samples)
that corresponds to the Google Cloud service or API used by the sample.
For example, a sample demonstrating how to work with BigTable should be
in a subfolder under the
[python-docs-samples/bigtable](https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/bigtable)
folder.

Conceptually related samples under a service or API should be grouped into
a subfolder. For example, App Engine Standard samples are under the
[appengine/standard](https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/appengine/standard)
folder, and App Engine Flex samples are under the
[appengine/flexible](https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/appengine/flexible)
folder.

If your sample is a set of discrete code snippets that each demonstrate a
single operation, these should be grouped into a `snippets` folder. For
example, see the snippets in the
[bigtable/snippets/writes](https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/bigtable/snippets/writes)
folder.

If your sample is a quickstart — intended to demonstrate how to quickly get
started with using a service or API — it should be in a _quickstart_ folder.

### Python Version

Samples should support Python 3.6, 3.7, and 3.8.

If the API or service your sample works with has specific Python version
requirements different from those mentioned above, the sample should support
those requirements.

### License Header

Source code files should always begin with an Apache 2.0 license header. See
the instructions in the repo license file on [how to apply the Apache license
to your work](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/LICENSE#L178-L201).
For example, see the license header for the [Datastore client quickstart
sample](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/datastore/cloud-client/quickstart.py#L1-L15).

### Shebang

If, and only if, your sample application is a command-line application, then
include a [shebang](https://en.wikipedia.org/wiki/Shebang_(Unix)) as the
first line. Separate the shebang line from the rest of the application with
a blank line. The shebang line for a Python application should always be:

```python
#!/usr/bin/env python
```

Don't include shebang lines in web applications or test files.

### Coding Style

All Python samples should follow the best practices defined in the
[PEP 8 style guide](https://www.python.org/dev/peps/pep-0008/) and the
[Google Python Style Guide](http://google.github.io/styleguide/pyguide.html).
The automated linting process for Python samples uses
[flake8](http://flake8.pycqa.org/en/latest/) to verify conformance to common
Python coding standards, so the use of flake8 is recommended.

If you prefer to use [pylint](https://www.pylint.org/), note that Python
samples for this repo are not required to conform to pylint’s default settings
outside the scope of PEP 8, such as the “too many arguments” or “too many
local variables” warnings.

The use of [Black](https://pypi.org/project/black/) to standardize code
formatting and simplify diffs is recommended, but optional.

The default noxfile has `blacken` session for convenience. Here are
some examples.

If you have pyenv configured:
```sh
nox -s blacken
```

If you only have docker:
```
cd proj_directory
../scripts/run_tests_local.sh . blacken
```

In addition to the syntax guidelines covered in PEP 8, samples should strive
to follow the Pythonic philosophy outlined in the
[PEP 20 - Zen of Python](https://www.python.org/dev/peps/pep-0020/) as well
as the readability tenets presented in Donald Knuth's
_[Literate Programming](https://en.wikipedia.org/wiki/Literate_programming)_.
Notably, your sample program should be self-contained, readable from top to
bottom, and fairly self-documenting. Prefer descriptive names, and use
comments and docstrings only as needed to further clarify the code’s intent.
Always introduce functions and variables before they are used. Prefer less
indirection. Prefer imperative programming as it is easier to understand.

### Functions and Classes

Very few samples will require authoring classes. Prefer functions whenever
possible. See [this video](https://www.youtube.com/watch?v=o9pEzgHorH0) for
some insight into why classes aren't as necessary as you might think in Python.
Classes also introduce cognitive load. If you do write a class in a sample, be
prepared to justify its existence during code review.

#### Descriptive function names

Always prefer descriptive function names, even if they are long.
For example `upload_file`, `upload_encrypted_file`, and `list_resource_records`.
Similarly, prefer long and descriptive parameter names. For example
`source_file_name`, `dns_zone_name`, and `base64_encryption_key`.

Here's an example of a top-level function in a command-line application:

```python
def list_blobs(bucket_name):
    """Lists all the blobs in the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)

    blobs = bucket.list_blobs()

    for blob in blobs:
        print(blob.name)
```

Notice the simple docstring and descriptive argument name (`bucket_name`
implying a string instead of just `bucket` which could imply a class instance).

This particular function is intended to be the "top of the stack" - the function
executed when the command-line sample is run by the user. As such, notice that
it prints the blobs instead of returning. In general, top of the stack
functions in command-line applications should print, but use your best
judgment.

#### Documenting arguments

Here's an example of a more complicated top-level function in a command-line
application:

```python
def download_encrypted_blob(
        bucket_name, source_blob_name, destination_file_name,
        base64_encryption_key):
    """Downloads a previously-encrypted blob from Google Cloud Storage.

    The encryption key provided must be the same key provided when uploading
    the blob.
    """
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(source_blob_name)

    # Encryption key must be an AES256 key represented as a bytestring with
    # 32 bytes. Since it's passed in as a base64 encoded string, it needs
    # to be decoded.
    encryption_key = base64.b64decode(base64_encryption_key)

    blob.download_to_filename(
        destination_file_name, encryption_key=encryption_key)

    print(f'Blob {source_blob_name} downloaded to {destination_file_name}.'
```

Note the verbose parameter names and the extended description that helps the
user form context. If there were more parameters or if the parameters had
complex context, then it might make sense to expand the docstring to include
an `Args` section such as:

```
Args:
    bucket_name: The name of the cloud storage bucket.
    source_blob_name: The name of the blob in the bucket to download.
    destination_file_name: The blob will be downloaded to this path.
    base64_encryption_key: A base64-encoded RSA256 encryption key. Must be the
        same key used to encrypt the file.
```

Generally, however, it's rarely necessary to exhaustively document the
parameters this way. Lean towards unsurprising arguments with descriptive
names, as having to resort to this kind of docstring might be extremely
accurate but it comes at the cost of high redundancy, signal-to-noise ratio,
and increased cognitive load.

#### Documenting types

Argument types should be documented using Python type annotations as
introduced in [PEP 484](https://www.python.org/dev/peps/pep-0484/). For example:

```py
def hello_world(name: string):
    print(f"Hello {name}!")
```

If there is an `Args` section within the function's docstring, consider
documenting the argument types there as well. For example:

```
Args:
    credentials (google.oauth2.credentials.Credentials): Credentials
      authorized for the current user.
```

When documenting primitive types, be sure to note if they have a particular set
of constraints. For example, `A base64-encoded string` or `Must be between 0
and 10`.

### README File

Each sample should have a `README.md` file that provides instructions for how
to install, configure, and run the sample. Setup steps that cover creating
Google Cloud projects and resources should link to appropriate pages in the
[Google Cloud Documentation](https://cloud.google.com/docs/), to avoid
duplication and simplify maintenance.

### Dependencies

Every sample should include a
[requirements.txt](https://pip.pypa.io/en/stable/user_guide/#requirements-files)
file that lists all of its dependencies, to enable others to re-create the
environment that was used to create and test the sample. All dependencies
should be pinned to a specific version, as in this example:

```
Flask==1.1.1
PyMySQL==0.9.3
SQLAlchemy==1.3.12
```

If a sample has testing requirements that differ from its runtime requirements
(such as dependencies on [pytest](http://pytest.org/en/latest/) or other
testing libraries), the testing requirements may be listed in a separate
`requirements-test.txt` file instead of the main `requirements.txt` file.

### Region Tags

Sample code may be integrated into Google Cloud Documentation through the use
of region tags, which are comments added to the source code to identify code
blocks that correspond to specific topics covered in the documentation. For
example, see
[this sample](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/cloud-sql/mysql/sqlalchemy/main.py)
— the region tags are the comments that begin with `[START` or `[END`.

The use of region tags is beyond the scope of this document, but if you’re
using region tags they should start after the source code header
(license/copyright information), imports, and global configuration such as
initializing constants.

### Exception Handling

Sample code should use standard Python exception handling techniques as
covered in the [Google Python Style
Guide](http://google.github.io/styleguide/pyguide.html#24-exceptions).

## Testing Guidelines

Samples should include tests to verify that the sample runs correctly and
generates the intended output. Follow these guidelines while writing your
tests:

* Use [pytest](https://docs.pytest.org/en/latest/)-style tests and plain
asserts. Don't use `unittest`-style tests or `assertX` methods.
* Whenever possible, tests should allow for future changes or additions to
APIs that are unrelated to the code being tested.
For example, if a test is intended to verify a JSON payload
returned from an endpoint, it should only check for the existence of the
expected keys and values, and the test should continue to work correctly
if the order of keys changes or new keys are added to the response in a future
version of the API. In some cases, it may make sense for tests to simply
verify that an API call was successful rather than checking the response
payload.
* Samples that use App Engine Standard should use the [App Engine
testbed](https://cloud.google.com/appengine/docs/standard/python/refdocs/google.appengine.ext.testbed)
for system testing, as shown in [this
example](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/appengine/standard/localtesting/datastore_test.py).
* All tests should be independent of one another and order-independent.
* We use parallel processing for tests, so tests should be capable of running in parallel with one another.
* Use pytest's fixture for resource setup and teardown, instead of
  having them in the test itself.
* Avoid infinite loops.
* Retry RPCs

### Arrange, Act, Assert

Tests for samples should follow the “Arrange, Act, Assert” structure:

* _Arrange_ — create and configure the components required for the test.
Avoid nesting; prioritize readability and simplicity over efficiency. For
Python tests, typical "arrange" steps include imports, copying environment
variables to local variables, and so on.
* _Act_ — execute the code to be tested, such as sending a request to an API and
receiving a response.
* _Assert_ — verify that the test results match what is expected, using an
`assert` statement.

### External Resources

Whenever possible, tests should run against the live production version of
cloud APIs and resources. This will assure that any breaking changes in those
resources are identified by the tests.

External resources that must exist prior to the test (for example, a
Cloud SQL instance) should be identified and passed in through an environment
variable. If specific data needs to exist within such infrastructure resources,
however, the test should create this data as part of its _Arrange_ steps and
then clean up when the test is completed.

Creating mocks for external resources is strongly discouraged. Tests should
verify the validity of the sample against the APIs, and not against a mock
that embodies assumptions about the behavior of the APIs.

### Temporary Resources

When tests need temporary resources (such as a temp file or folder), they
should create reasonable names for these resources with a UUID attached to
assure uniqueness. Use the Python ```uuid``` package from the standard
library to generate UUIDs for resource names. For example:

```python
glossary_id = f'test-glossary-{uuid.uuid4()}'
```

or:

```python
# If full uuid4 is too long, use its hex representation.
encrypted_disk_name = f'test-disk-{uuid.uuid4().hex}'
```

```python
# If the hex representation is also too long, slice it.
encrypted_disk_name = f'test-disk-{uuid.uuid4().hex[:5]}'
```

All temporary resources should be explicitly deleted when testing is
complete. Use pytest's fixture for cleaning up these resouces instead
of doing it in test itself.

### Console Output

If the sample prints output to the console, the test should capture stdout to
a file and verify that the captured output contains the key information that
is expected. Strive to verify the content of the output rather than the syntax.
For example, the test might verify that a string is included in the output,
without taking a dependency on where that string occurs in the output.

### Avoid infinite loops

Never put potential infinite loops in the test code path. A typical
example is about gRPC's LongRunningOperations. Make sure you pass the
timeout parameter to the `result()` call.

Good:

```python
# will raise google.api_core.GoogleAPICallError after 60 seconds
operation.result(60)
```

Bad:

```python
operation.result()  # this could wait forever.
```

We recommend the timeout parameter to be around the number that gives
you more than 90% success rate. Don't put too long a timeout.

Now this test is inevitably flaky, so consider marking the test as
`flaky` as follows:

```python

@pytest.mark.flaky(max_runs=3, min_passes=1)
def my_flaky_test():
    # test that involves LRO poling with the timeout
```

This combination will give you very high success rate with fixed test
execution time (0.999 success rate and 180 seconds operation wait time
in the worst case in this example).

### Retry RPCs

All the RPCs are inevitably flaky. It can fail for many reasons. The
`google-cloud` Python client retries requests automatically for most
cases.

The old api-client doesn't retry automatically, so consider using
[`backoff`](https://pypi.org/project/backoff/) for retrying. Here is a
simple example:

```python

import backoff
from googleapiclient.errors import HttpError

@pytest.fixture(scope='module')
def test_resource():
    @backoff.on_exception(backoff.expo, HttpError, max_time=60)
    def create_resource():
        try:
            return client.projects().imaginaryResource().create(
                name=resource_id, body=body).execute()
        except HttpError as e:
            if '409' in str(e):
                # Ignore this case and get the existing one.
                return client.projects().imaginaryResource().get(
                    name=resource_id).execute()
            else:
                raise

    resource = create_resource()

    yield resource

    # cleanup
    ...
```

### Use filters with list methods

When writing a test for a `list` method, consider filtering the possible results.
Listing all resources in the test project may take a considerable amount of time.
The exact way to do this depends on the API.

Some `list` methods take a `filter`/`filter_` parameter:

```python
from datetime import datetime

from google.cloud import logging_v2

client = logging_v2.LoggingServiceV2Client()
resource_names = [f"projects/{project}"]
   # We add timestamp for making the query faster.
    now = datetime.datetime.now(datetime.timezone.utc)
    filter_date = now - datetime.timedelta(minutes=1)
    filters = (
        f"timestamp>=\"{filter_date.isoformat('T')}\" "
        "resource.type=cloud_run_revision "
        "AND severity=NOTICE "
)

entries = client.list_log_entries(resource_names, filter_=filters)

```

Others allow you to limit the result set with additional arguments
to the request:

```python
from google.cloud import asset_v1p5beta1

# TODO project_id = 'Your Google Cloud Project ID'
# TODO asset_types = 'Your asset type list, e.g.,
# ["storage.googleapis.com/Bucket","bigquery.googleapis.com/Table"]'
# TODO page_size = 'Num of assets in one page, which must be between 1 and
# 1000 (both inclusively)'

project_resource = "projects/{}".format(project_id)
content_type = asset_v1p5beta1.ContentType.RESOURCE
client = asset_v1p5beta1.AssetServiceClient()

# Call ListAssets v1p5beta1 to list assets.
response = client.list_assets(
    request={
        "parent": project_resource,
        "read_time": None,
        "asset_types": asset_types,
        "content_type": content_type,
        "page_size": page_size,
    }
)
```

### Test Environment Setup

Because all tests are system tests that use live resources, running tests
requires a Google Cloud project with billing enabled, as covered under
[Creating and Managing Projects](https://cloud.google.com/resource-manager/docs/creating-managing-projects).

Once you have your project created and configured, you'll need to set
environment variables to identify the project and resources to be used
by tests. See
[testing/test-env.tmpl.sh](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/testing/test-env.tmpl.sh)
for a list of all environment variables used by all tests. Not every
test needs all of these variables. All required environment variables
should be listed in the README and `testing/test-env.tmpl.sh`. If you
find one is missing, please add instructions for setting it as part of
your PR.

We suggest that you copy this file as follows:

```sh
$ cp testing/test-env.tmpl.sh testing/test-env.sh
$ editor testing/test-env.sh  # change the value of `GCLOUD_PROJECT`.
```

You can easily `source` this file for exporting the environment variables.

#### Development environment setup

This repository supports two ways to run tests locally.

1. nox

    This is the recommended way. Setup takes little more efforts than
    the second one, but the test execution will be faster.

2. Docker

    This is another way of running the tests. Setup is easier because
    you only need to instal Docker. The test execution will be bit
    slower than the first one.

#### nox setup

Please read the [MAC Setup Guide](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/MAC_SETUP.md).

### Running tests with nox

Automated testing for samples is managed by
[nox](https://nox.readthedocs.io). Nox allows us to run a variety of tests,
including the flake8 linter, Python 2.7, Python 3.x, and App Engine tests,
as well as automated README generation.

__Note:__

**Library repositories:** If you are working on an existing project, a `noxfile.py` will already exist.
For new samples, create a new `noxfile.py` and paste the contents of
[noxfile-template.py](https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/noxfile-template.py)

**python-docs-samples:** As a temporary workaround, each project currently uses first
`noxfile-template.py` found in a parent folder above the current sample. In
order to simulate this locally, you need to copy + rename the parent
`noxfile-template.py` as `noxfile.py` in the folder of the project (containing the `requirements.txt` for the file).

```console
cd python-docs-samples
cp noxfile-template.py PATH/TO/YOUR/PROJECT/noxfile.py
cd PATH/TO/YOUR/PROJECT/
```


To use nox, install it globally with `pip`:

```console
$ pip install nox
```

To run style checks on your samples:
```console
nox -s lint
```

To run tests with a python version, use the correct `py-3.*` sessions:
```console
nox -s py-3.6
```

To run a specific file:
```console
nox -s py-3.7 -- snippets_test.py
```

To run a specific test from a specific following:
```console
nox -s py-3.7 -- snippets_test.py:test_list_blobs
```

### Running tests with Docker

__Note__: This is currently only available for samples in `python-docs-samples`.

If you have [Docker](https://www.docker.com) installed and runnable by
the local user, you can use `scripts/run_tests_local.sh` helper script
to run the tests. For example, let's say you want to modify the code
in `cdn` directory, then you can do:

```sh
$ cd cdn
$ ../scripts/run_tests_local.sh .
# This will run the default sessions; lint, py-3.6, and py-3.7
$ ../scripts/run_tests_local.sh . lint
# Running only lint
```

If your test needs a service account, you have to create a service
account and download the JSON key to `testing/service-account.json`.

On MacOS systems, you also need to install `coreutils` to use
`scripts/run_tests_local.sh`. Here is how to install it with `brew`:

```sh
$ brew install coreutils
```

### Google Cloud Storage Resources

Certain samples require integration with Google Cloud Storage (GCS), most
commonly for APIs that read files from GCS. To run the tests for these
samples, configure your GCS bucket name via the `CLOUD_STORAGE_BUCKET`
environment variable.

The resources required by tests can usually be found in the `./resources`
folder inside the sample directory, as in [this
example](https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/automl/cloud-client/resources).
You can upload those resources to your own GCS bucket to run the tests with
[gsutil](https://cloud.google.com/storage/docs/gsutil). For example:

```console
gsutil cp ./resources/* gs://$CLOUD_STORAGE_BUCKET/
```