feat(vendor): download prebuilt patched packages from patch.socket.dev (--vendor-source)#116

Open
Mikola Lysenko (mikolalysenko) wants to merge 9 commits into main from feat/vendor-service-download

@mikolalysenko

Collaborator

What

socket-patch vendor now downloads the already-built, integrity-verified patched package from the patch.socket.dev vendoring service instead of always building it locally — across every vendorable ecosystem: npm, pypi, cargo, golang, composer, and gem.

New --vendor-source <mode> / SOCKET_VENDOR_SOURCE:

  • auto (default) — download the prebuilt package; silently fall back to a local build on any miss.
  • service — require the service, fail closed.
  • build — always build locally (the pre-service behavior).

Plus host overrides for testing against staging / local dev: --vendor-url / SOCKET_VENDOR_URL (step-1 package-reference host) and --patch-server-url / SOCKET_PATCH_SERVER_URL (step-2 download host).

Verification is fail-closed: a downloaded artifact that fails its sha512 (and, for golang, the h1: dirhash) is never used — it's a hard error, never a silent fallback to wrong bytes.
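The mode semantics and the fail-closed rule above can be summarised as a small decision table. This is a minimal sketch, not the PR's code: `ServiceOutcome`, `Action`, and `decide` are hypothetical names introduced for illustration, and the real implementation distinguishes more outcomes (pending, unavailable, unsupported artifact, and so on) than the single `Miss` used here.

```rust
/// Mirrors the three `--vendor-source` modes described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VendorSource {
    Auto,
    Service,
    Build,
}

impl std::str::FromStr for VendorSource {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(Self::Auto),
            "service" => Ok(Self::Service),
            "build" => Ok(Self::Build),
            other => Err(format!("unknown vendor source: {other}")),
        }
    }
}

/// Hypothetical summary of what a service attempt produced.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ServiceOutcome {
    /// Downloaded and the integrity check passed.
    Verified,
    /// Nothing usable was served (assumed to cover every kind of miss).
    Miss,
    /// Bytes were served but did not match the advertised hash.
    IntegrityMismatch,
}

/// Hypothetical result of the decision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    UsePrebuilt,
    BuildLocally,
    Fail,
}

/// In `Build` mode the service is never consulted, so `outcome` is ignored.
pub fn decide(mode: VendorSource, outcome: ServiceOutcome) -> Action {
    match (mode, outcome) {
        (VendorSource::Build, _) => Action::BuildLocally,
        // Fail-closed: wrong bytes are a hard error, never a fallback.
        (_, ServiceOutcome::IntegrityMismatch) => Action::Fail,
        (_, ServiceOutcome::Verified) => Action::UsePrebuilt,
        (VendorSource::Auto, ServiceOutcome::Miss) => Action::BuildLocally,
        (VendorSource::Service, ServiceOutcome::Miss) => Action::Fail,
    }
}
```

The point of the ordering is that an integrity mismatch is matched before any fallback arm, so `auto` can never quietly rebuild after receiving bad bytes.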

Per-ecosystem

  • npm (all lock flavors), pypi (wheel; sdist falls back/refuses) — Tier A: write the prebuilt tarball.
  • cargo / golang / composer / gem — Tier B: download the patched archive, integrity-verify, extract into the vendor copy dir (the same source tree a local build commits), then run the existing path-dep wiring unchanged.
  • gem additionally downloads a gem-stub-gemspec second artifact (the converter-generated path-source stub gemspec a .gem doesn't carry in bundler's required form) and writes it as <name>.gemspec. An absent stub (native-extension gem, or a patch built before the server-side rollout) is a service miss — auto falls back to the local build, service refuses. Server side: SocketDev/depscan#21768.

Notable internals

  • api/client.rs: surface non-tarball served artifacts as secondary_artifacts (host-rewritten URL + sha512); download_artifact fetches one lazily.
  • service_fetch.rs: shared download-and-verify funnel; fetch_verified_secondary for named second artifacts.
  • registry_fetch.rs: pub(crate) extract helpers reused by the service paths (incl. a factored extract_gem_data).
  • Offline + --vendor-source=service refuses with vendor_service_offline_conflict. A successful service vend emits vendor_prebuilt_downloaded. Unrelated to --download-mode (which selects the local build's patch-content format).

Verification

  • cargo clippy --workspace --all-features -- -D warnings — clean.
  • cargo test — full core lib (1514) + CLI (297) green; per-ecosystem service paths covered by wiremock-backed unit tests (success, integrity-mismatch hard-fail, pending/unavailable fallback, offline refuse; gem also stub-missing fallback-vs-refuse and native-ext refuse).
  • Tier-B build-equivalence is exercised by the toolchain-backed e2e suites (skip when the package manager is absent) — including the gem bundle install e2e that validates the TS-generated stub's fidelity.

🤖 Generated with Claude Code

…v (npm + shared core)

Add a service-download path to `vendor`: instead of always building the
installable patched artifact locally, download the already-built tarball +
integrity from the patch.socket.dev vendoring service, falling back to the
local build on any miss. This commit lands the shared infrastructure + the
npm flavor (package-lock / pnpm / yarn-classic / yarn-berry / bun); other
ecosystems follow.

- config: --vendor-source {auto|service|build} (SOCKET_VENDOR_SOURCE, default
  auto), --vendor-url (SOCKET_VENDOR_URL), --patch-server-url
  (SOCKET_PATCH_SERVER_URL); all env-var-backed with parse/tripwire tests
- api client: ApiClient::fetch_vendor_package — two-step package-reference POST
  (/v0/orgs/{slug}/patches/package or proxy /patch/package) -> grant-tokenized
  serve GET, with host rewrite + status mapping; 12 wiremock tests
- core: VendorServiceConfig, service_fetch (sha512 + golang-h1 verify,
  fail-closed), PackedTarball::from_bytes (DRY with pack_deterministic)
- threading: Option<&VendorServiceConfig> through the vendor dispatch chain
  (scan --vendor / repair pass None = build-only, unchanged)
- npm: service path in stage_patch_pack with the auto/service/build fallback
  table; integrity always re-verified before write; 9 integration tests cover
  both the service download and the local-build fallback
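The "host rewrite" applied to the step-2 serve URL can be pictured as follows. This is a sketch under assumptions: the PR does not show the rewrite logic, so it is assumed here that the override replaces only the scheme and authority and preserves the path and query (so the grant token survives). `rewrite_host` and the sample URL path are hypothetical.

```rust
/// Replace the scheme and host of `url` with `override_base`, keeping the
/// path, query, and fragment untouched. Returns `None` when `url` has no
/// `scheme://` part.
pub fn rewrite_host(url: &str, override_base: &str) -> Option<String> {
    // Skip past "scheme://".
    let after_scheme = url.find("://")? + 3;
    let rest = &url[after_scheme..];

    // The authority ends at the first '/', '?' or '#', or at end of string.
    let tail_start = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let tail = &rest[tail_start..];

    // Avoid a doubled slash when the override is written with a trailing '/'.
    let base = override_base.trim_end_matches('/');
    Some(format!("{base}{tail}"))
}
```

For example, `rewrite_host("https://patch.socket.dev/serve/abc?grant=t", "http://localhost:8080/")` yields `http://localhost:8080/serve/abc?grant=t`, which is the shape a `--patch-server-url` override for local development would need.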

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Extend the prebuilt-package download to pypi. `vendor_pypi` now acquires the
patched wheel service-first (skipping the installed-dist requirement), falling
back to the local wheel build on any miss.

- acquire_patched_wheel: service-first then local-build; the service path
  writes the downloaded wheel, recomputes sha256 (lockfiles embed sha256 while
  the service reports sha512), and derives the platform-locked advisory from
  the wheel filename's tag triple
- only .whl artifacts are usable (pypi vendoring is wheel-based) — an sdist
  (or any miss) falls back under `auto` and hard-fails under `service`
- in_sync_outcome refactored onto a shared synthesized_apply_result
- 5 integration tests: service success (wheel written + requirements line
  wired to the recomputed sha256), sdist-fallback (auto) / sdist-hard-fail
  (service), integrity-mismatch hard-fail, offline+service refusal
- box the large service-decision enum variants (clippy large_enum_variant)
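Deriving a platform-locked advisory from the wheel filename's tag triple could look like the sketch below. It relies on the standard wheel naming scheme, `{name}-{version}(-{build})?-{python}-{abi}-{platform}.whl`. `WheelTags`, `parse_wheel_tags`, and `is_platform_locked` are hypothetical names, and treating anything other than `none`/`any` as platform-locked is an assumption about what the advisory means.

```rust
/// The compatibility tag triple at the end of a wheel filename.
#[derive(Debug, PartialEq, Eq)]
pub struct WheelTags<'a> {
    pub python: &'a str,
    pub abi: &'a str,
    pub platform: &'a str,
}

/// Parse the tag triple from a wheel filename. Returns `None` for anything
/// that is not a `.whl` (for example an sdist) or has too few fields.
pub fn parse_wheel_tags(filename: &str) -> Option<WheelTags<'_>> {
    let stem = filename.strip_suffix(".whl")?;

    // Split from the right: the last three fields are always the tags,
    // whether or not an optional build tag precedes them.
    let mut parts = stem.rsplitn(4, '-');
    let platform = parts.next()?;
    let abi = parts.next()?;
    let python = parts.next()?;

    // What remains must still hold at least "name-version".
    let rest = parts.next()?;
    if !rest.contains('-') {
        return None;
    }

    Some(WheelTags { python, abi, platform })
}

/// Assumption: a wheel is "platform-locked" unless it is ABI-agnostic
/// (`none`) and platform-agnostic (`any`).
pub fn is_platform_locked(tags: &WheelTags<'_>) -> bool {
    !(tags.abi == "none" && tags.platform == "any")
}
```

Because an sdist filename does not end in `.whl`, it parses to `None`, which matches the commit's rule that only wheel artifacts are usable.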

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Extend the prebuilt-package download to cargo (the first Tier-B / directory-
vendored ecosystem). `vendor_cargo_crate` now materialises the patched copy
service-first: download the prebuilt `.crate`, verify sha512, and extract it
into `.socket/vendor/cargo/<uuid>/<name>-<version>/` (dropping any
`.cargo-checksum.json` so it stays a path dep) — no pristine source needed.
Falls back to the existing copy-pristine-and-patch build on any miss.

- expose registry_fetch::extract_tgz as pub(crate) for the .crate extraction
- cargo_service_copy helper + boxed CargoServiceCopy enum; auto/service
  fallback policy; offline+service refusal; existing config + Cargo.lock
  wiring is unchanged (it never read the copy contents)
- 4 integration tests: service success (extracts patched crate, wires config,
  no sidecar, no pristine needed), integrity-mismatch hard-fail, not-built
  auto-fallback-to-build, offline+service refusal
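The destination layout and the checksum cleanup described in this commit can be sketched with the standard library alone. `vendor_copy_dir` and `drop_cargo_checksum` are hypothetical helpers, not the functions in the PR; only the directory layout and the `.cargo-checksum.json` filename come from the commit message.

```rust
use std::path::{Path, PathBuf};
use std::{fs, io};

/// Build `.socket/vendor/cargo/<uuid>/<name>-<version>/` under `root`.
pub fn vendor_copy_dir(root: &Path, uuid: &str, name: &str, version: &str) -> PathBuf {
    root.join(".socket")
        .join("vendor")
        .join("cargo")
        .join(uuid)
        .join(format!("{name}-{version}"))
}

/// Remove `.cargo-checksum.json` from an extracted crate, if present.
/// Returns `Ok(true)` when a file was removed and `Ok(false)` when there
/// was nothing to remove, so the call is safe to repeat.
pub fn drop_cargo_checksum(copy_dir: &Path) -> io::Result<bool> {
    match fs::remove_file(copy_dir.join(".cargo-checksum.json")) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```

Treating a missing file as success keeps the step idempotent, which suits the "dropping any" wording: most archives are not expected to carry the file at all.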

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Close the fail-closed gap for the partial rollout and document the feature.

- dispatch_vendor_one: under `--vendor-source=service`, ecosystems without a
  service path yet (golang, gem, composer, maven, nuget) now refuse with
  `vendor_service_unsupported_ecosystem` instead of silently building locally
  (which would violate the fail-closed contract). `auto`/`build` are unchanged.
- CLI_CONTRACT.md: --vendor-source/--vendor-url/--patch-server-url flag rows,
  the env-var table, and a "Prebuilt vendor artifacts" section (two-step flow,
  fail-closed integrity, per-outcome fallback table, current ecosystem coverage
  npm/pypi/cargo, and the new event codes)
- README.md: the three new flags + env vars
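The gate added to `dispatch_vendor_one` amounts to an allowlist check that only bites under `service`. The sketch below is an assumption about its shape: `Gate` and `gate` are hypothetical names, and the list reflects coverage at this commit (npm, pypi, cargo). Only the event code `vendor_service_unsupported_ecosystem` is taken from the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VendorSource {
    Auto,
    Service,
    Build,
}

/// Ecosystems with a service download path at this commit.
const SERVICE_ECOSYSTEMS: &[&str] = &["npm", "pypi", "cargo"];

/// Hypothetical gate result: proceed, or refuse with an event code.
#[derive(Debug, PartialEq, Eq)]
pub enum Gate {
    Proceed,
    Refuse(&'static str),
}

/// Under `service`, an ecosystem without a service path must refuse rather
/// than build locally. `auto` and `build` always proceed, unchanged.
pub fn gate(mode: VendorSource, ecosystem: &str) -> Gate {
    let covered = SERVICE_ECOSYSTEMS.iter().any(|e| *e == ecosystem);
    if mode == VendorSource::Service && !covered {
        Gate::Refuse("vendor_service_unsupported_ecosystem")
    } else {
        Gate::Proceed
    }
}
```

Keeping the check in the dispatcher rather than in each backend means a backend that has not been ported yet cannot accidentally weaken the fail-closed contract.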

Service coverage today: npm (all lock flavors), pypi (wheel), cargo (.crate).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… zip)

Extend the prebuilt-package download to golang. `vendor_go_module` now
materialises the patched module service-first: download the prebuilt module
zip, verify it (sha512 + the `h1:` dirhash), extract it into
`.socket/vendor/golang/<uuid>/<module>@<version>/` (stripping the zip's
`{module}@{version}/` prefix), synthesize a minimal go.mod if absent, and wire
the go.mod `replace` via `ensure_replace_entry` — the same end state
`apply_go_redirect` produces, minus the copy + local apply, and with no
pristine module source needed. Falls back to the engine build on any miss.

- expose registry_fetch::extract_zip_with_prefix + go_redirect::ensure_module_go_mod
  as pub(crate)
- go_service_redirect helper + boxed GoServiceRedirect enum; auto/service
  fallback; offline+service refusal; empty-files patches defer to the engine
- add golang to dispatch_vendor_one's SERVICE_ECOSYSTEMS allowlist
- 4 integration tests: service success (extracts module, wires replace, no
  pristine needed), wrong-h1-dirhash hard-fail (exercises the golang dirhash
  check), not-built auto-fallback, offline+service refusal
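Stripping the zip's `{module}@{version}/` prefix can be illustrated per entry name. This is a sketch, not `extract_zip_with_prefix` itself: `strip_module_prefix` is a hypothetical helper, and rejecting entries that are absolute or contain `..` is an assumed safety measure the commit message does not mention.

```rust
use std::path::{Component, Path, PathBuf};

/// Map a module-zip entry name to its path relative to the vendor copy dir.
/// Returns `None` for entries outside the expected prefix, for the prefix
/// directory itself, and for paths that could escape the destination.
pub fn strip_module_prefix(entry: &str, module: &str, version: &str) -> Option<PathBuf> {
    let prefix = format!("{module}@{version}/");
    let rel = entry.strip_prefix(prefix.as_str())?;
    if rel.is_empty() {
        return None;
    }

    // Only plain path segments are allowed: no root, no "..", no ".".
    let path = Path::new(rel);
    if path.components().all(|c| matches!(c, Component::Normal(_))) {
        Some(path.to_path_buf())
    } else {
        None
    }
}
```

An entry such as `example.com/mod@v1.2.3/pkg/a.go` maps to `pkg/a.go`, which is then written under `.socket/vendor/golang/<uuid>/<module>@<version>/`.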

Service coverage now: npm, pypi, cargo, golang.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… zip)

Extend the prebuilt-package download to composer. `vendor_composer` now
materialises the patched copy service-first: download the prebuilt dist zip,
verify sha512, and extract it into `.socket/vendor/composer/<uuid>/<vendor>/<name>@<version>/`
(dropping the zip's variable top-level dir) — no installed package needed.
Falls back to copy-installed-and-patch on any miss.

- expose registry_fetch::extract_zip as pub(crate)
- composer_service_copy helper + boxed ComposerServiceCopy enum; auto/service
  fallback; offline+service refusal; composer.lock dist->path rewiring unchanged
- add composer to dispatch_vendor_one's SERVICE_ECOSYSTEMS allowlist
- 4 integration tests: service success (extracts dist, rewrites lock, no
  install needed), integrity-mismatch hard-fail, not-built auto-fallback,
  offline+service refusal
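Because a composer dist zip's top-level directory name varies, it has to be discovered from the entries rather than predicted. The sketch below shows one way to do that. `strip_common_top_dir` is a hypothetical helper, and returning `None` when the entries do not share a single top-level directory is an assumption about how the caller would want to handle that case.

```rust
/// If every entry lives under the same top-level directory, return the
/// entry names with that directory removed. The bare directory entry
/// itself (for example "top/") is dropped from the output.
pub fn strip_common_top_dir<'a>(entries: &[&'a str]) -> Option<Vec<&'a str>> {
    // The first entry decides which top-level directory to expect.
    let first: &'a str = entries.first().copied()?;
    let (top, _) = first.split_once('/')?;

    let mut out = Vec::with_capacity(entries.len());
    for &entry in entries {
        // An entry with no '/' is a file at the archive root: no common dir.
        let (dir, rest) = entry.split_once('/')?;
        if dir != top {
            return None;
        }
        if !rest.is_empty() {
            out.push(rest);
        }
    }
    Some(out)
}
```

With entries `acme-lib-1a2b3c/composer.json` and `acme-lib-1a2b3c/src/Foo.php` (an invented archive layout), the result is `composer.json` and `src/Foo.php`, ready to be written under the `<vendor>/<name>@<version>/` copy dir.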

Service coverage now: npm, pypi, cargo, golang, composer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…omposer); gem gated

gem stays build-local: a path-sourced gem needs a stub gemspec that the `.gem`
archive doesn't carry in bundler's required eval-able form (it's metadata.gz
YAML; RubyGems generates the stub into specifications/). A clean service path
can't produce it without the local install or Ruby-specific serialization.

- dispatch_vendor_one gate comment + detail message updated to the final set
- CLI_CONTRACT.md "Coverage today" + README.md flag doc updated; note Tier-B
  build-equivalence is exercised by the toolchain-backed e2e suites

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nd artifact)

Close the last gap in `vendor --vendor-source`: gem now downloads the prebuilt
patched `.gem` from patch.socket.dev instead of always building locally, like
npm/pypi/cargo/golang/composer.

Bundler's path source needs an eval-able Ruby `<name>.gemspec`, but a `.gem`
only carries the gemspec as YAML inside `metadata.gz`. The converter generates
that stub and serves it as a `gem-stub-gemspec` SECOND artifact alongside the
`.gem` (mirroring npm's `yarn-berry-zip`); the gem backend downloads and
integrity-verifies BOTH, extracts the `.gem`'s `data.tar.gz` into the vendor
copy dir, and writes the stub as `<name>.gemspec`. The Gemfile + Gemfile.lock
pair wiring is unchanged — only how the copy dir + its `.gemspec` are produced
differs.

- api/client.rs: surface non-tarball served artifacts on `FetchedVendorPackage`
  as `secondary_artifacts` (host-rewritten URL + sha512), and add
  `download_artifact` to fetch one lazily.
- service_fetch.rs: carry the secondary refs on `VerifiedArchive` and add
  `fetch_verified_secondary` (download + fail-closed sha512 verify).
- registry_fetch.rs: factor a `pub(crate) extract_gem_data` out of `fetch_gem`
  so the service path reuses the exact same `.gem` extraction.
- gem.rs: thread `service` through `vendor_gem`; `gem_service_copy` downloads +
  verifies the `.gem` and the stub (absent stub => miss: native-ext gem or a
  pre-rollout patch), refuses a native-ext stub, extracts, writes the stub;
  `materialise_patched_copy` unifies service-first / local-fallback across both
  the full path and the hot-path artifact rebuild. The local stub read is now
  non-fatal so an auto-fetched (not-installed) gem can still vendor via the
  service. 8 new wiremock-backed tests.
- vendor.rs: add `gem` to `SERVICE_ECOSYSTEMS`; pass `service` to `vendor_gem`.
- README / CLI_CONTRACT: gem is now service-covered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The `test (macos-latest)` matrix job installs vexctl via `go install` and runs
tests/e2e_vex.rs against it. The macOS-latest runner image (Sequoia+) has a
dyld that refuses to load a Mach-O binary lacking an LC_UUID load command, and
Go's linker only began emitting one in 1.24 — so the 1.22-built vexctl crashed
on launch ("dyld: missing LC_UUID load command in .../vexctl") and every
e2e_vex assertion failed with "vexctl rejected the document". Environmental,
not a code regression (ubuntu/windows were unaffected); the shared matrix pin
just needed bumping.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>