Docker Initialization Action

This initialization action installs a binary release of Docker on a Google Cloud Dataproc cluster. After installation, it adds the yarn user to the docker group so that applications executed by YARN can access Docker.

Using this initialization action

⚠️ NOTICE: See the best practices for using initialization actions in production.

  1. Use the gcloud command to create a new cluster with this initialization action.

    REGION=<region>
    CLUSTER_NAME=<cluster_name>
    gcloud dataproc clusters create ${CLUSTER_NAME} \
        --region ${REGION} \
        --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/docker/docker.sh

  2. Docker is installed and configured on all nodes of the cluster (both master and workers). You can log in to the master node and run a test command to confirm that it works:

    sudo docker run hello-world

    Or, to run it as the yarn user:

    sudo su yarn
    docker run hello-world
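
The test command from step 2 can also be run without opening an interactive session. The following is a minimal sketch, not part of this initialization action: `master_node` and `verify_docker` are hypothetical helper names, and the sketch assumes a single-master cluster (where Dataproc names the master `<cluster_name>-m`) and that you know the zone the cluster was created in.

```shell
#!/usr/bin/env bash

# Print the name of the master node.
# Assumption: a single-master cluster, whose master is named "<cluster_name>-m".
master_node() {
  printf '%s-m\n' "$1"
}

# Run the hello-world test container on the master node over SSH.
# Hypothetical helper; arguments are the cluster name and its zone.
verify_docker() {
  local cluster_name="$1" zone="$2"
  gcloud compute ssh "$(master_node "${cluster_name}")" \
      --zone "${zone}" \
      --command "sudo docker run hello-world"
}
```

For example, `verify_docker "${CLUSTER_NAME}" <zone>` should print Docker's hello-world greeting if the installation succeeded.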