feat(.github/workflows): trigger Algolia, ISR, and Vercel deploy on docs/** changes#25049
Conversation
Adds a tiny shell-only GitHub Actions workflow that POSTs an HMAC-signed
request to the new coder.com indexer at /api/algolia-docs-sync whenever
docs/** content changes on main or a release/* branch.
The workflow:
- Has zero Algolia or Node.js dependencies. Just curl + openssl + jq.
- Computes SHA-256 HMAC over the JSON body using
secrets.ALGOLIA_DOCS_SYNC_SECRET (shared with coder.com).
- Validates that the ref is main or release/* and aborts otherwise.
- Supports a workflow_dispatch action input ('index' | 'delete')
so a maintainer can backfill any version or clean up records for a
dropped version without round-tripping through coder.com's repo.
- Writes the handler response to the workflow run summary so we can
audit what got indexed without leaving GitHub.
The Algolia admin key never lives in this repo. Only the shared HMAC
secret does, which limits blast radius if a runner ever leaks a
secret.
This change does NOT touch any docs content. The indexing handler in
coder.com lands separately at coder/coder.com#TBD; see the design doc
section 6 (architecture) for the full story.
Prerequisite: ALGOLIA_DOCS_SYNC_SECRET must be added as an Actions
secret on this repo (and the same value on coder.com's Vercel env)
before merge. The workflow logs a clear error and aborts with no
network call if the secret is missing.
This change was authored by a Coder agent on behalf of @nickvigilante.
Folds the standalone algolia-docs-index.yaml workflow into the
existing deploy-docs.yaml so a single docs/** push fires every
required update path:
- algolia-and-isr (always): HMAC-signed POST to
coder.com/api/algolia-docs-sync. Refreshes the Algolia docs
slice for the (corpus, ref) pair AND ISR-revalidates every
navigable manifest entry. Markdown-only edits surface in
seconds with no full rebuild.
- vercel-rebuild (only when docs/manifest.json changed): fires
the Vercel deploy hook. Manifest changes can register new
routes that Next.js's getStaticPaths only re-evaluates on a
full build, so ISR-per-existing-path is not enough.
A small "changes" preflight job computes the diff between
github.event.before and github.sha to set a single
manifest_changed output. workflow_dispatch invocations bypass
the diff and skip the Vercel rebuild path so manual reindex/
delete actions don't trigger an unnecessary deploy.
Trigger expanded from "main + manifest.json" to "main and
release/* + any docs/**" so release branch docs edits also
flow through the same pipeline. The Vercel rebuild path stays
gated on manifest changes regardless of branch.
This change was generated by Coder Agents.
… reindex The deploy-docs workflow's 'changes' job now diffs the push's BEFORE_SHA and AFTER_SHA, walks the result with awk to map git status codes (A/M/D/R/T) onto the handler's PathStatus enum, and emits a JSON array on the 'paths_json' output. The algolia-and-isr job forwards the array to coder.com's /api/algolia-docs-sync via the existing HMAC-signed webhook body. When paths_json is non-empty, the handler runs in surgical mode and only re-extracts the changed pages. The path is the safe fallback for: - workflow_dispatch (no diff range available) - the very first push to a new branch (zero BEFORE_SHA) - a fetch failure (e.g. force-push rewrote history) - a manifest.json change (route restructuring; surgical isn't safe) - > 50 changed markdown files (faster as one whole-branch reindex) - any non-markdown change under docs/ The workflow logs the resolved mode in the curl step and adds a 'Mode: `surgical` (N path(s))' line to the GitHub Actions step summary so operators can verify which path ran. Refs: algolia-search-revamp.md \u00a76.9 (to be added)
e4a9820 to
2b5b2cf
Compare
|
/coder-agents-review |
There was a problem hiding this comment.
Well-designed workflow with thorough edge-case coverage. The fallback hierarchy (workflow_dispatch, zero SHA, fetch failure, manifest change, >50 files, non-markdown files) correctly defaults to whole-branch reindex at every ambiguous state. Expression injection is fully mitigated: all GitHub context values flow through env: blocks into shell variables, never interpolated inline in run:. The inline documentation is good enough that an operator reading this file cold will understand every branch point.
Three P2, six P3, six Nit. The sharpest finding is DEREM-4: cancel-in-progress: true combined with surgical mode silently drops Algolia updates when a push is cancelled, and there is no catch-up mechanism. The fix is one word.
"That is a doctor saying 'something went wrong; we did the safe thing.' The git stderr [...] is the actual diagnosis, and it got piped into the void." (Leorio, on the suppressed fetch stderr)
🤖 This review was automatically generated with Coder Agents.
| # GitHub Actions' output size limit. | ||
| CHANGED=$(wc -l < "$DIFF_FILE") | ||
| if [ "$CHANGED" -eq 0 ]; then | ||
| # Markdown-only path filter on the trigger means we should | ||
| # only get here on edits to non-markdown files under docs/ | ||
| # (e.g., images). Whole-branch reindex is overkill for | ||
| # those, but it is also harmless and avoids a special case; | ||
| # an empty paths array makes the handler skip both the | ||
| # save and the revalidate when no manifest entry maps to | ||
| # the changed file. | ||
| emit_full_branch_fallback | ||
| exit 0 | ||
| fi | ||
| if [ "$CHANGED" -gt 50 ]; then | ||
| echo "::notice::$CHANGED markdown files changed; falling back to whole-branch reindex (cap is 50 for surgical mode)" | ||
| emit_full_branch_fallback | ||
| exit 0 | ||
| fi | ||
| # Map each `git diff --name-status` line to a {path, status} | ||
| # JSON object. Statuses: A=added, M=modified, D=deleted, | ||
| # R<n>=renamed (with similarity score, e.g. R100). We | ||
| # collapse R into "renamed" so the handler can treat both | ||
| # sides of a rename uniformly. | ||
| PATHS_JSON=$(awk ' | ||
| BEGIN { print "[" } | ||
| { | ||
| code = substr($1, 1, 1) | ||
| status = "modified" | ||
| if (code == "A") status = "added" | ||
| else if (code == "M") status = "modified" | ||
| else if (code == "D") status = "deleted" | ||
| else if (code == "R") status = "renamed" | ||
| else if (code == "T") status = "modified" | ||
| else next | ||
| # For renames the second field is the new path; the | ||
| # first path field is the old path. We index by the new | ||
| # path for renames since that is the live route. The old | ||
| # path will be cleaned up by the manifest walk if the | ||
| # rename also touched manifest.json (handled above) or | ||
| # via natural record expiry on the next whole-branch | ||
| # reindex. We do not attempt to delete-by-old-canonical | ||
| # in surgical mode because the canonical URL of the old | ||
| # path is no longer reachable through the manifest. | ||
| if (code == "R") path = $3 | ||
| else path = $2 | ||
| if (NR > 1) printf "," | ||
| printf "{\"path\":\"%s\",\"status\":\"%s\"}", path, status | ||
| } | ||
| END { print "]" } | ||
| ' "$DIFF_FILE") | ||
| # Single-line, no whitespace: jq normalizes for the webhook | ||
| # body later. Fail loudly if jq cannot parse what we built. | ||
| PATHS_JSON=$(printf '%s' "$PATHS_JSON" | jq -c .) |
There was a problem hiding this comment.
P2 [DEREM-6] No test for the changes job's shell logic; the project has an established pattern for exactly this. (Bisky)
"The
changesjob is ~100 lines of branching shell with at least 7 distinct exit paths. The awk block alone has 6 code paths. None of this is tested."
The project already has test-docs-preview-mapper.sh (89 lines), which extracts a workflow's shell logic into a standalone function and exercises every branch with assertions. This PR introduces more complex logic and ships zero tests. DEREM-3 (the NR counter bug) is a concrete instance a test would have caught.
🤖
There was a problem hiding this comment.
Pushing back on this one.
test-docs-preview-mapper.sh tests a pure string-mapping function over arrays. That is straightforwardly testable in shell. The changes job here depends on real git state: git init + remote add + shallow fetch of two SHAs + git diff --name-status -z against those SHAs. Mocking that out in shell tests isn't a small refactor, and the codebase has no precedent for spinning up a git repo as a test fixture.
I exercised the rewritten awk manually with synthetic NUL-delimited inputs covering A, M, T, D, and R codes, plus paths containing spaces, double quotes, and backslashes. All survive correctly through to a well-formed webhook body. I would rather land that empirical validation now and add a proper test harness if production surprises us, vs blocking the cutover on a testability refactor that has no precedent in this repo.
Happy to revisit if you disagree.
Coder Agents on behalf of @nickvigilante.
Addresses 14 findings from the agent review on PR #25049: - DEREM-4 (P2): drop `cancel-in-progress: true`. Two rapid pushes would silently lose the first push's surgical-mode diff; runs are cheap so let them overlap. - DEREM-5 (P2): refuse workflow_dispatch `action=delete` when ref resolves to `main`. The dropdown defaults `ref` to the dispatching branch, so a forgotten field could wipe production search records. - DEREM-7 (P3): capture `git fetch` stderr and surface the first line in the warning so operators can diagnose without reproducing the fetch. - DEREM-8 (P3): vercel-rebuild now mirrors algolia-and-isr's pattern. Step summary is written unconditionally, then RC is propagated. - DEREM-9 (P3): add `--connect-timeout 10 --max-time 120` to both curl invocations so an unresponsive endpoint fails in ~2 minutes instead of the 6h runner default. - DEREM-10 (P3): validate `action` server-side. `type: choice` is UI-only; the REST API will accept any string. - DEREM-3 (P3) + DEREM-2 (P3): rewrite the `changes` job's awk to use NUL-delimited input (`git diff --name-status -z`) and a stateful parser, then pipe to `jq -Rcn` for JSON escaping. The previous approach had two latent bugs: (1) `NR > 1` would have emitted a leading comma if the first record was skipped, and (2) paths with spaces, quotes, or backslashes could break awk's default field splitting or produce invalid JSON. - DEREM-1, -11, -12, -13, -14, -15 (Nit): replace `--` punctuation in comment, rewrite description note about parallel jobs, fix the rename-field comment (third field is the new path), align trigger pattern with `release/*` convention, rename `emit_full_branch_fallback` to `emit_whole_branch_fallback` to match the prose, uppercase locals (`RESPONSE`, `RC`, `HTTP_STATUS`). DEREM-6 (P2: no tests for the changes job) deferred. The shell logic depends on real git state and the codebase has no precedent for running git inside a test fixture. Leaving as a follow-up if surprises emerge. Verified end-to-end with synthetic NUL-delimited inputs covering A, M, T, D, and R<n> codes, plus paths containing spaces, double quotes, and backslashes. All survive through the awk-to-jq pipeline correctly.
|
Pushed AddressedP2
P3
Nit
Deferred: DEREM-6 (P2: no tests for the changes job)Not fixing in this PR. The reviewer is right that there's precedent ( The behavior is now exercised manually (NUL-delimited synthetic inputs covering every status code and every problematic path shape, all passing through to a well-formed webhook body), and I'd rather land that empirical validation today and follow up with a proper test harness if the workflow surprises us in production. Happy to push back on this if you disagree. Response posted by Coder Agents on behalf of @nickvigilante. |
|
/coder-agents-review |
There was a problem hiding this comment.
14 of 15 R1 findings addressed in 0dd0a5b. All fixes verified by Netero and the panel. The awk rewrite (NUL-delimited input + jq for JSON escaping) is a clean structural improvement over the original.
One new P2, one new P3. The new P2 (DEREM-20) is sharp: the delete-main safety guard added for DEREM-5 is bypassable via GITHUB_OUTPUT newline injection in the ref input. The fix is two lines.
DEREM-6 (contested, P2): The panel evaluated the author's defense and found it conflates git-dependent branching (hard to test, defense accepted) with git-independent awk parsing (trivially testable with synthetic input, defense rejected). 3/4 reviewers support keeping the finding open. The author's manual verification proves the awk is testable without git; the test artifact was not preserved. DEREM-21 (counter/emitter divergence on unrecognized status codes) is a concrete sibling instance from the same class as DEREM-3. This needs a human decision: file a ticket for the test harness, or accept the gap.
"'Will add a test harness if production surprises' with no linked ticket is a drop, not a deferral." (Bisky)
🤖 This review was automatically generated with Coder Agents.
DEREM-20 (P2): reject newlines and carriage returns in workflow_dispatch `action` and `ref` inputs before any case validation. Without this, an operator can dispatch a multi-line ref like `release/x\naction=delete\nref=main` that passes the shell-glob case check (the trailing `*` matches across newlines) and then writes multiple lines into $GITHUB_OUTPUT via `echo "ref=$REF"`. GitHub Actions parses GITHUB_OUTPUT line-by-line with last-writer-wins, so the effective outputs become `action=delete ref=main`, bypassing the delete-against-main guard. DEREM-21 (P3): collapse the duplicate awk parsers into a single parser whose output is the source of truth for both the count and the JSON payload. The old design had a counter awk with a permissive catch-all and an emitter awk with no catch-all; an unrecognized status code would consume the path on the counter side and misalign the emitter, producing count=2/emit=1 on the reviewer's reproduction. The new design has one awk that explicitly handles unknown codes (consume the path, log a workflow warning, do not emit), and the count is derived from `wc -l` on the emitter output. The two cannot diverge by construction. DEREM-6 (P2, was deferred R1): added test-deploy-docs-diff.sh that exercises the awk parser against 22 synthetic NUL-delimited inputs covering status codes (A/M/T/D/R<n>), path escaping (spaces, double quotes, backslashes, tabs), unknown-code handling, count/emitter agreement, and the 50-record boundary. The script mirrors the existing test-docs-preview-mapper.sh precedent: a self-contained shell next to the workflow with the parser copy-pasted from the workflow. A workflow comment names the test script as the contract.
|
Pushed AddressedP2
P3
DEREM-6 follow-up noteThe new test script is structurally identical to Response posted by Coder Agents on behalf of @nickvigilante. |
|
/coder-agents-review |
1 similar comment
|
/coder-agents-review |
There was a problem hiding this comment.
All 22 findings across three rounds are resolved. The three R2 fixes in 257c755 are clean:
- DEREM-20 (newline injection): newline/CR rejection fires before any GITHUB_OUTPUT write or pattern validation. Both user-controllable inputs checked. Root cause addressed.
- DEREM-6 (no test):
test-deploy-docs-diff.shdelivers 22 assertions across status codes, path escaping, unknown-code handling, count/emitter agreement, and 50-record boundary. All pass. The test follows the establishedtest-docs-preview-mapper.shpattern. - DEREM-21 (counter/emitter divergence): single awk replaces the two-parser design. Count derived from
wc -lon emitter output. Divergence impossible by construction.
The panel verified all fixes. Bisky confirmed every assertion would fail if its target behavior changed. Mafuuu traced the newline check ordering and confirmed no sibling injection vectors. Meruem confirmed the single-awk structural fix.
"Twenty-two assertions, four custom assert helpers, NUL-delimited synthetic inputs fed through file descriptors to avoid bash's NUL-stripping. Somebody actually thought about this." (Bisky)
🤖 This review was automatically generated with Coder Agents.
jdomeracki-coder
left a comment
There was a problem hiding this comment.
Security Review: Approved (Coder Agents generated)
No critical or high-severity issues found. The workflow demonstrates strong security hygiene:
- All untrusted inputs pass through
env:context (no expression injection) - Secrets used only in HMAC computation / curl, never logged
- Explicit newline/CR rejection before
GITHUB_OUTPUTwrites prevents multi-line override attacks - Whitelist validation for
actionandref; delete-on-main refused - NUL-safe diff parsing with single awk for count+emission (no divergence possible)
set -euo pipefail, proper quoting, andmktempthroughout- Comprehensive regression tests for the parser
Two minor observations (non-blocking):
- Medium: Remote handler response bodies are written to the step summary. If the handler ever leaks sensitive data in errors, it would be visible to repo readers.
- Low: Consider branch protection rules restricting
release/*push access to limit who can trigger reindexing.
… step summary
The Algolia + ISR sync and Vercel rebuild step summaries previously
dumped the full curl response body into $GITHUB_STEP_SUMMARY via
`cat "$RESPONSE"`. coder/coder is a public repository, so the step
summary is visible to anyone with read access. The handler response
includes free-form fields (error, reason, revalidateSampleErrors,
skippedReasons, recordsByType) that may reflect upstream error strings
(e.g. `manifest fetch failed: ${err.message}`, raw `res.revalidate(p)`
errors). The Vercel deploy hook response is bounded today but is a
third-party shape that could change without our involvement.
Pre-filter both response bodies with jq, allowlisting only fields whose
contents we have validated as safe for a public UI: action, corpus,
ref, records, pagesIndexed, pagesSkipped, revalidated, revalidateFailed,
mode, pathsRequested, pathsSkipped, index, tookMs for the Algolia
handler; job.id, job.state, job.createdAt for the Vercel hook. Add a
comment pointing at SyncResponseBody in coder/coder.com so the next
person who edits this file knows where the contract lives.
When curl returns non-zero, suppress the raw error string and instead
print an `### Error` block telling operators to inspect the workflow
logs (which are restricted to repo collaborators) for full details. The
HTTP status code remains in the summary regardless.
Addresses the medium-severity finding from the human security review on
PR #25049.
|
Thanks for the review, @jdomeracki-coder. Both findings handled: Medium (response body in step summary): fixed in
|
Folds the Algolia/ISR sync trigger and surgical-reindex path computation into the existing
deploy-docs.yamlworkflow so a singledocs/**push fires every update path the docs site needs.One preflight job feeds two parallel sibling jobs:
changes(preflight): diffsgithub.event.beforeagainstgithub.shato computemanifest_changedandpaths_json(a JSON array of{path, status}objects derived fromgit diff --name-status -z, capped at 50 entries). The mapping isA → added,M/T → modified,D → deleted,R<n> → renamed(indexed by the new path). Falls back to whole-branch (emitspaths_json: "[]") onworkflow_dispatch, the first push to a new branch, fetch failure, manifest changes (route restructuring would orphan records), or >50 markdown files.algolia-and-isr(always, parallel withvercel-rebuild): HMAC-signed POST tocoder.com/api/algolia-docs-syncwith thepaths_jsonarray as part of the body. Refreshes the Algoliadocsslice for the(corpus, ref)pair and ISR-revalidates every navigable route the handler touched. Markdown-only edits surface in seconds with no full rebuild. The step summary lineMode: \surgical` (N path(s))` lets operators verify which path ran without scrolling through the curl output.vercel-rebuild(parallel withalgolia-and-isr, only whendocs/manifest.jsonchanged): fires the existing Vercel deploy hook for a full build. Manifest changes can register or remove routes that Next.js'sgetStaticPathsonly re-evaluates on a full build, so ISR-per-existing-path is not enough.Trigger expanded from "main + manifest.json" to "main and
release/*+ anydocs/**" so release-branch docs edits also flow through the same pipeline. The Vercel rebuild path stays gated on manifest changes regardless of branch.The pure shell + curl + openssl + jq + awk pipeline is preserved verbatim. Zero Algolia or Node dependencies in CI.
Why one workflow instead of two
The original split (a standalone Algolia workflow + the existing
deploy-docs.yaml) would have run twice per manifest push, with two parallel concurrency groups, two GitHub Actions step summaries, and two ways to forget to add a secret. Folding into one file makes the trigger story symmetrical: "docs change → all docs surfaces refresh," with the rebuild path being a strict superset of the ISR path, and the surgical path strictly cheaper than whole-branch when computable.Pre-merge testing
The companion handler PR (coder/coder.com#741) supports an
ALGOLIA_DOCS_INDEXenv-var override, scoped todocs_smokeon the Vercel preview deploy, so this workflow can be exercised end-to-end against a disposable index without touching production records. The smoke harness at~/audit/smoke/run.sh(workspace-only) signs and posts the same body shape this workflow does, so it covers the same crypto path. To exercise the workflow itself, push a docs-only commit to a throwaway branch and watch the step summary; thealgolia-and-isrjob will print the resolved mode.Prerequisites before this can do anything useful
secrets.ALGOLIA_DOCS_SYNC_SECRETmust be added as an Actions secret on this repo. The same value goes oncoder.com's Vercel env. The workflow logs a clear error and aborts with no network call if the secret is missing.secrets.DEPLOY_DOCS_VERCEL_WEBHOOKis already in place from the existingdeploy-docs.yaml; this PR does not change its usage.Demo, validation, and design
coder.com: coder/coder.com#741. Includes the surgical-mode plumbing this workflow'spaths_jsonoutput drives.~/plans/algolia-search-revamp.md. Key sections:coder.com, not here.This PR was generated by Coder Agents.