Problem
The agent bootstrap script (provisionersdk/scripts/bootstrap_linux.sh) is embedded into templates at build time and inlined into shell commands via sh -c '${init_script}' in multiple official and third-party templates.
Any single quote (apostrophe) character in the bootstrap script silently breaks the shell quoting, causing the script to terminate early. The agent never starts, and the only indication is the waitonexit trap firing with a cryptic message.
This was introduced when comments containing apostrophes were added to bootstrap_linux.sh:
# unit instead of exec'ing it directly.
#                     ^
# We attempt it here but don't fail if it's denied - the template's
#                            ^            ^
The apostrophe in exec'ing terminates the single-quoted sh -c '...' string. Everything after it is parsed as a broken shell command, the script errors out, and the EXIT trap fires. The error is near-invisible because set -eux traces just show the trap firing, with no clear indication of what went wrong.
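The failure can be reproduced locally without any cloud provider. The following is a minimal sketch, not the real bootstrap script: `broken_script`, `fixed_script`, `render`, and `run_rendered` are hypothetical names, and `render` only imitates the text substitution that a template performs for `sh -c '${init_script}'`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the real bootstrap script: one comment contains
# an apostrophe, the other is reworded to avoid it.
broken_script="# unit instead of exec'ing it directly.
echo agent-started"
fixed_script="# unit instead of running exec on it directly.
echo agent-started"

# Imitate the template: the script text is spliced into the command line
# before any shell parses it, so an apostrophe becomes part of the quoting.
render() {
  printf "sh -c '%s'" "$1"
}

# Let a fresh shell parse the rendered command line, as the VM would.
run_rendered() {
  bash -c "$(render "$1")" 2>&1
}

broken_output=$(run_rendered "$broken_script")
broken_status=$?
fixed_output=$(run_rendered "$fixed_script")
fixed_status=$?

echo "broken: status=$broken_status output=$broken_output"
echo "fixed:  status=$fixed_status output=$fixed_output"
```

In this sketch the broken variant never prints `agent-started` and exits non-zero, because the apostrophe closes the quoted string early and the real closing quote is left unterminated. The reworded variant runs normally.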
Affected templates
Templates using the sh -c '${init_script}' pattern (tracked separately in https://github.com/coder/registry — see related issue):
- coder/templates/aws-linux (cloud-init userdata)
- coder/templates/gcp-linux (metadata_startup_script)
- mossylion/templates/scaleway-instance
- ericpaulsen/templates/k8s-username
Templates using the safer write_files pattern (NOT affected):
- coder/templates/azure-linux
- coder/templates/digitalocean-linux
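The reason the file-based pattern is safe can be illustrated locally. This is a rough simulation under stated assumptions, not cloud-init itself: `init_script` is a hypothetical stand-in for the bootstrap script, and writing it to a temporary file stands in for what `write_files` does on the VM.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the bootstrap script, apostrophe included.
init_script="# unit instead of exec'ing it directly.
echo agent-started"

# Local simulation of the write_files idea: store the script verbatim in a
# file, then run the file. The content is never spliced into a quoted string.
script_path=$(mktemp)
printf '%s\n' "$init_script" > "$script_path"

file_output=$(sh "$script_path" 2>&1)
file_status=$?
rm -f "$script_path"

echo "status=$file_status output=$file_output"
```

Because the apostrophe sits inside a comment in a script file, the shell ignores it and the `echo` runs as expected.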
Impact
- Agent fails to start on new workspace creation
- Extremely difficult to debug: cloud-init logs show exit code 0 or 2 with no clear cause
- Users cannot connect to the workspace (since the agent never starts), making diagnosis even harder
Proposed fix
Two-pronged approach:
1. Remove existing apostrophes from bootstrap scripts
Already done — see the associated PR.
2. Add a CI lint check
Add a script (e.g. scripts/check_bootstrap_quotes.sh) that rejects single quotes in the bootstrap scripts (*.sh and *.ps1) under provisionersdk/scripts/. This prevents future regressions, since the constraint is non-obvious and easy to violate.
Example:
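#!/usr/bin/env bash
# Rejects single quotes in bootstrap scripts, which templates inline via sh -c '...'.
set -euo pipefail
source "$(dirname "${BASH_SOURCE[0]}")/lib.sh"
cdroot

found=0
# Read NUL-delimited paths so filenames containing spaces are handled correctly.
while IFS= read -r -d '' f; do
  # grep -n prints each offending line together with its line number.
  if grep -n "'" "$f"; then
    echo "ERROR: $f contains single quotes (apostrophes)."
    echo "  Bootstrap scripts are inlined via sh -c '...' in templates."
    echo "  Single quotes break this quoting. Use alternative phrasing."
    found=1
  fi
done < <(find provisionersdk/scripts -type f \( -name '*.sh' -o -name '*.ps1' \) -print0)

if [ "$found" -ne 0 ]; then
  exit 1
fi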
Wire it into make lint as lint/bootstrap or similar.
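Before wiring the check into CI, its core test can be tried against throwaway fixtures. This is a sketch only: `contains_single_quote` is a hypothetical helper that mirrors the `grep` call in the proposed script, and the fixture files are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the grep test in the proposed lint script.
contains_single_quote() {
  grep -q "'" "$1"
}

# Throwaway fixtures: one comment with an apostrophe, one reworded without.
fixture_dir=$(mktemp -d)
printf '%s\n' "# unit instead of exec'ing it directly." > "$fixture_dir/bad.sh"
printf '%s\n' "# unit instead of running exec on it directly." > "$fixture_dir/good.sh"

bad_result=clean
good_result=clean
if contains_single_quote "$fixture_dir/bad.sh"; then bad_result=flagged; fi
if contains_single_quote "$fixture_dir/good.sh"; then good_result=flagged; fi
rm -rf "$fixture_dir"

echo "bad.sh: $bad_result"
echo "good.sh: $good_result"
```

The file with the apostrophe is flagged and the reworded one passes, which is the behavior the lint target needs to enforce.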