
Google Batch provider

The google-batch API is in active development; some features may not yet work reliably as development continues.

The google-batch provider uses the Google Batch API to queue a request that carries out the following sequence of events:

  1. Create a Google Compute Engine Virtual Machine (VM) instance.
  2. Create a Google Compute Engine Persistent Disk and mount it as a "data disk".
  3. Localize files from Google Cloud Storage to the data disk.
  4. Execute your --script or --command in your Docker container.
  5. Delocalize files from the data disk to Google Cloud Storage.
  6. Destroy the VM.

Orchestration

When the Batch jobs.create() API is called, it creates a Batch Job. The Batch API service will then create the VM and disk when your Cloud Project has sufficient Compute Engine quota.

Execution of dsub features is handled by a series of Docker containers on the VM. The sequence of containers executed is:

  1. logging (copy logs to GCS; run in background)
  2. prepare (prepare data disk and save your script to the data disk)
  3. localization (copy GCS objects to the data disk)
  4. user-command (execute the user command)
  5. delocalization (copy files from the data disk to GCS)
  6. final_logging (copy logs to GCS; always run)

The prepare step does the following:

  1. Create runtime directories (script, tmp, workingdir).
  2. Write the user --script or --command to a file and make it executable.
  3. Create the directories for --input and --output parameters.
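
The prepare step can be sketched in shell. This is an illustrative sketch, not dsub's actual implementation; the directory layout mirrors the data disk paths described in the next section, with the root defaulting to a local directory here so the sketch can run anywhere:

```shell
# Hedged sketch of the prepare step (not dsub's actual code).
# DATA_ROOT stands in for the data disk mount point (/mnt/disks/data on the VM).
DATA_ROOT="${DATA_ROOT:-./data-disk}"

# 1. Create runtime directories.
mkdir -p "${DATA_ROOT}/script" "${DATA_ROOT}/tmp" "${DATA_ROOT}/workingdir"

# 2. Write the user --script or --command to a file and make it executable.
cat > "${DATA_ROOT}/script/user-script.sh" <<'EOF'
#!/bin/bash
echo "Hello World"
EOF
chmod u+x "${DATA_ROOT}/script/user-script.sh"

# 3. Create the directories for --input and --output parameters.
mkdir -p "${DATA_ROOT}/input" "${DATA_ROOT}/output"
```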

Container runtime environment

The data disk path in the Docker containers is:

  • /mnt/disks/data

The /mnt/disks/data folder contains:

  • input: location of localized --input and --input-recursive parameters.
  • output: location for your script to write files to be delocalized for --output and --output-recursive parameters.
  • script: location of your dsub --script or --command script.
  • tmp: temporary directory for your script. TMPDIR is set to this directory.
  • workingdir: the working directory set before your script runs.
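
A user script might use these locations as in the following sketch. MNT stands in for the data disk mount (/mnt/disks/data in the container) and defaults to a throwaway local directory here so the sketch can run outside the VM; the file names are hypothetical:

```shell
# Hedged sketch of a user script working against the data disk layout.
MNT="${MNT:-$(mktemp -d)}"
mkdir -p "${MNT}/input" "${MNT}/output" "${MNT}/tmp"   # normally done by the prepare step
export TMPDIR="${MNT}/tmp"                             # dsub sets TMPDIR for the user command

# A localized --input file would appear under ${MNT}/input.
echo "sample" > "${MNT}/input/sample.txt"

# Scratch files created via mktemp land under TMPDIR, i.e. on the data disk.
scratch="$(mktemp)"
tr 'a-z' 'A-Z' < "${MNT}/input/sample.txt" > "${scratch}"

# Files written under ${MNT}/output are delocalized for --output parameters.
cp "${scratch}" "${MNT}/output/result.txt"
```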

Task status

The Batch API defines the following task states:

Task State         Description
STATE_UNSPECIFIED  Unknown state.
PENDING            The Task is created and waiting for resources.
ASSIGNED           The Task is assigned to at least one VM.
RUNNING            The Task is running.
FAILED             The Task has failed.
SUCCEEDED          The Task has succeeded.
UNEXECUTED         The Task was not executed when the Job finished.

dsub interprets the above to provide task statuses of:

  • RUNNING (PENDING, ASSIGNED, RUNNING)
  • SUCCESS (SUCCEEDED)
  • FAILURE (FAILED)
  • CANCELED (UNEXECUTED)
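
A minimal sketch of this mapping as a shell function (illustrative only, not dsub's actual implementation):

```shell
# Map a Batch API task state to the status dsub reports it as.
batch_to_dsub_status() {
  case "$1" in
    PENDING|ASSIGNED|RUNNING) echo "RUNNING"  ;;
    SUCCEEDED)                echo "SUCCESS"  ;;
    FAILED)                   echo "FAILURE"  ;;
    UNEXECUTED)               echo "CANCELED" ;;
    *)                        echo "UNKNOWN"  ;;  # e.g. STATE_UNSPECIFIED
  esac
}
```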

Logging

The google-batch provider saves 3 log files to Cloud Storage every 5 minutes, at the --logging location specified to dsub:

  • [prefix].log: log generated by all containers running on the VM
  • [prefix]-stdout.log: stdout from your Docker container
  • [prefix]-stderr.log: stderr from your Docker container

Logging paths and the [prefix] are discussed further in Logging.

Resource requirements

The google-batch provider supports many resource-related flags to configure the Compute Engine VMs that tasks run on, such as --machine-type or --min-cores and --min-ram, as well as --boot-disk-size and --disk-size. Additional provider-specific parameters are available and documented below.
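
For example, a task can be given a larger VM and disks with a combination of these flags (values illustrative):

```shell
dsub \
    --provider google-batch \
    --project my-cloud-project \
    --regions us-central1 \
    --logging gs://my-bucket/logging/ \
    --machine-type n1-standard-4 \
    --boot-disk-size 50 \
    --disk-size 500 \
    --command 'echo "resources configured"'
```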

Disk allocation

The Docker containers launched by the Batch API use the host VM's boot disk for the system services needed to orchestrate the set of Docker actions defined by dsub. All other directories set up by dsub, including TMPDIR (as discussed above), are placed on the data disk. End users should generally never need to change --boot-disk-size; setting --disk-size should be sufficient. One known exception is when very large Docker images are used, as such images must be pulled to the boot disk.

Getting started on Google Batch

The steps for getting started are summarized below:

  1. Sign up for a Google account and create a project.

  2. Enable the Batch, Storage, and Compute APIs.
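
     The APIs can also be enabled from the command line, for example (service names assumed current):

     gcloud services enable batch.googleapis.com \
         compute.googleapis.com storage.googleapis.com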

  3. Provide credentials so dsub can call Google APIs:

     gcloud auth application-default login
    
  4. Create a Google Cloud Storage bucket.

    The dsub logs and output files will be written to a bucket. Create a bucket using the storage browser or by running the command-line utility gsutil, included in the Cloud SDK.

    gsutil mb gs://my-bucket
    

    Change my-bucket to a unique name that follows the bucket-naming conventions.

    (By default, the bucket will be in the US, but you can change or refine the location setting with the -l option.)

  5. Run a very simple "Hello World" dsub job and wait for completion.

    dsub \
        --provider google-batch \
        --project my-cloud-project \
        --regions us-central1 \
        --logging gs://my-bucket/logging/ \
        --output OUT=gs://my-bucket/output/out.txt \
        --command 'echo "Hello World" > "${OUT}"' \
        --wait
    

    Change my-cloud-project to your Google Cloud project, and my-bucket to the bucket you created above.

    The output of the script command will be written to the OUT file in Cloud Storage that you specify.

  6. View the output file.

     gsutil cat gs://my-bucket/output/out.txt