Permit mounting via volumes-from by passing orchestrator ID #924
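tl;dr This helps CodeClimate engines not need intimate Docker host
knowledge, which permits the usage of CodeClimate outside of
Docker-in-Docker setups. In particular, it makes it easy to run
CodeClimate checks in Gitlab while retaining Docker layer caching,
vastly improving the runtime of each build.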
In contexts like self-hosted Gitlab, an invoking runner such as Gitlab
CI may run the Docker executor, which exposes the Docker socket to the
running job so that the job may invoke its own Docker jobs on the host.
Gitlab's top-level job will set up some filesystem context (/builds,
mounted as a Docker volume, in the Gitlab case).
Right now, Gitlab can only support CodeClimate in a Docker-in-Docker
runner, because CodeClimate performs volume mounting for the individual
engines via Docker's --volume flag, which mounts not the path from the
invoking container, but rather a path on the docker host. This requires
that the path passed to CodeClimate as the CODECLIMATE_CODE variable
match the real host path, and in the Gitlab CI case, we don't want that,
so we have to "hide" the host with a DinD approach. However, DinD also
means that we don't get any layer caching between jobs, which makes
running CodeClimate prohibitively expensive, as all the engines etc.
have to be downloaded for each job.
By supporting Docker's `volumes-from` mounting option, we can instead
tell the engines to inherit any mounts from the invoking orchestrator.
This lets the top-level context set up a Docker volume, bind it to the
orchestrator, and then have the orchestrator pass that to invoked
children. This sidesteps the issue of the engines needing to know the
actual host path; as long as the orchestrator's /code directory is
mounted, the children can just presume to use it as-is.
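To make the difference between the two mounting modes concrete, here is a
rough shell sketch of the decision the orchestrator makes when it starts an
engine. `engine_mount_args` is a hypothetical helper written only for
illustration; it is not the CLI's actual code, and it assumes the engines
see the code at /code in both modes.

```shell
# Hypothetical helper, not part of the CodeClimate CLI: prints the Docker
# mount arguments an engine container would be started with.
engine_mount_args() {
  if [ -n "${CODECLIMATE_ORCHESTRATOR:-}" ]; then
    # Inherit every mount (including /code) from the named orchestrator.
    printf '%s\n' "--volumes-from ${CODECLIMATE_ORCHESTRATOR}"
  else
    # Existing behaviour: CODECLIMATE_CODE must be a real path on the Docker host.
    printf '%s\n' "--volume ${CODECLIMATE_CODE}:/code"
  fi
}
```

With CODECLIMATE_ORCHESTRATOR unset, the sketch falls back to the host-path
bind mount, so setups that already pass a real host path keep working.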
To accomplish this, we just a) name the top-level container, and b) pass
that name via the CODECLIMATE_ORCHESTRATOR env var:
docker run \
--interactive --tty --rm \
--name codeclimate_orchestrator \
--env CODECLIMATE_ORCHESTRATOR="codeclimate_orchestrator" \
--env CODECLIMATE_CODE="/code" \
--volume "$PWD":/code \
--volume /var/run/docker.sock:/var/run/docker.sock \
--volume /tmp/cc:/tmp/cc \
codeclimate/codeclimate-wrapped analyze
In the bare-metal case, this doesn't change anything: we're mounting
the real host path, which then gets passed to the individual children
via the /code mount.
While not immediately pertinent to the CodeClimate PR, in Gitlab we can
invoke the Gitlab codequality image like so:
script:
  - CONTAINER_ID=$(docker ps -q -f "label=com.gitlab.gitlab-runner.job.id=${CI_JOB_ID}")
  - BUILDS_VOLUME_ID=$(docker inspect $CONTAINER_ID --format '{{ range .Mounts }}{{ if eq .Destination "/builds" }}{{ .Name }}{{ end }}{{ end }}')
  - docker run
      --rm
      --name "codeclimate_orchestrator_${CI_JOB_ID}"
      --env SOURCE_CODE="/code"
      --env CODECLIMATE_VERSION="volumes-from"
      --env ORCHESTRATOR_ID="codeclimate_orchestrator_${CI_JOB_ID}"
      --volume /var/run/docker.sock:/var/run/docker.sock
      --volume "${BUILDS_VOLUME_ID}":/code
      codequality:orch /code
("volumes-from" is my local Docker image for the altered CodeClimate
build, and "codequality:orch" is my altered Gitlab codequality image)
Because this job _must_ be executed in a context that is visible to
Docker, we can query Docker to get the current job's container ID, and
from there get the volume ID mounted as `/builds`. We then volume mount
that volume as /code, and specify /code as the "host" location of our
code to be evaluated. The orchestrator will use the passed volume as
/code, which is then passed on to the engine jobs, allowing the entire
process to run against an ephemeral Docker volume rather than requiring
a known path on the host.
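One caveat with the lookup above: as far as I know, the `{{ .Name }}`
template prints an empty string when /builds is a bind mount rather than a
named Docker volume, which would produce a malformed `--volume` argument.
A small guard can fail early in that case. `require_builds_volume` is a
hypothetical helper for illustration, not something the Gitlab image
provides.

```shell
# Hypothetical guard: succeed and echo the volume name if one was found,
# otherwise print a diagnostic to stderr and fail.
require_builds_volume() {
  if [ -z "${1:-}" ]; then
    echo "no named Docker volume is mounted at /builds" >&2
    return 1
  fi
  printf '%s\n' "$1"
}
```

In the job script this would sit between the `docker inspect` line and the
`docker run` line, for example as
`BUILDS_VOLUME_ID=$(require_builds_volume "$BUILDS_VOLUME_ID") || exit 1`.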

We are facing the same issue: our application has no access to the Docker host, only to the daemon itself via the remote API. With this approach we could create a codeclimate container, copy all files to

@efueger I think this is worth looking at, WDYT?

@cheald I can't find
I guess the image was built locally from this PR's branch and tagged