
Commit 0b46c60

feat: GET /idp/account/export — self-service pod data download (JavaScriptSolidServer#353) (JavaScriptSolidServer#449)
* feat: GET /idp/account/export — self-service pod data download (JavaScriptSolidServer#353)

The export side of the user-rights trio (JavaScriptSolidServer#351 password change, JavaScriptSolidServer#352 account delete, this). MVP slice of the Credible Exit ladder (JavaScriptSolidServer#448) — the L0-3 'take your stuff with you' deliverable.

Authenticated owner gets a streamed tar.gz containing:

  jss-export/
    manifest.json — webId, username, email, podName, mode, createdAt, exportedAt, jssVersion
    account.json  — full account record minus passwordHash (single-user without an IDP account: omitted)
    pod/...       — entire pod tree, including /private/

Per the Credible Exit framing in JavaScriptSolidServer#448, /private/privkey.jsonld IS included in the archive when the pod was provisioned with --provision-keys. The user's secret is theirs; refusing to export would make L4+ identity migration impossible. The endpoint is owner-authenticated; the secret never leaves the WAC perimeter to anyone but the owner.

Streaming pipeline: tar.pack → zlib.createGzip → reply. Memory stays constant regardless of pod size; multi-GB pods don't OOM.

Per-IP rate limit at 3/min — heavy read, but a legitimate operator pulling a backup shouldn't hit it.

Failure modes:
  401 — unauthenticated
  403 — multi-user: no account for the caller's WebID
  404 — pod directory unexpectedly missing

Out of scope (per the issue): re-import, cross-server pod migration, periodic scheduled backups, partial / per-resource selection, UI button on /idp/account.

New dep: tar-stream@^3.2.0. Pure JS, ~20KB, no native bindings — runs cleanly on Termux / mobile per the deployment target.

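The manifest.json fields listed above can be sketched as a small builder. This is only an illustration under stated assumptions: the field names come from the commit message, while the function name `buildManifest`, its argument shape, and the example `mode` and `jssVersion` values are hypothetical and not taken from the actual export.js.

```javascript
// Hypothetical helper. Field names follow the commit message; the function
// name, argument shape and example values are assumptions for this sketch.
function buildManifest(account, { webId, mode, jssVersion, now = new Date() }) {
  // Copy fields one by one so nothing else from the account record
  // (for example passwordHash) can end up in the manifest.
  return {
    webId,
    username: account.username ?? null,
    email: account.email ?? null,
    podName: account.podName ?? null,
    mode,
    createdAt: account.createdAt ?? null,
    exportedAt: now.toISOString(),
    jssVersion,
  };
}

const manifest = buildManifest(
  {
    username: 'alice',
    email: 'alice@example.org',
    podName: 'alice',
    createdAt: '2024-01-01T00:00:00.000Z',
    passwordHash: 'never-copied',
  },
  {
    webId: 'https://pod.example/alice/profile/card#me',
    mode: 'multi-user', // assumed label, the real values are not shown here
    jssVersion: '0.0.0', // placeholder version
    now: new Date('2024-06-01T12:00:00.000Z'),
  },
);
console.log(JSON.stringify(manifest, null, 2));
```

Building the object field by field, rather than spreading the account record, keeps the manifest shape fixed even if the account record later gains new properties.
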
Tests:
- 401 unauthenticated
- 200 owner export → valid tar.gz with manifest + account + pod tree
- account.json never carries passwordHash
- cross-account: each authenticated caller can only get their own data (the endpoint takes no target parameter)
- single-user + --provision-keys: archive contains the on-disk /private/privkey.jsonld

864/864 tests pass.

Closes JavaScriptSolidServer#353. First slice of JavaScriptSolidServer#448.

* review: critical .idp leak fix + stream errors + dir entries + cleanup (JavaScriptSolidServer#449)

Ten batched pickups from Copilot. The first is security-critical; the rest are real cleanup.

CRITICAL — single-user root-pod data leak. In root-pod single-user mode (the default since JavaScriptSolidServer#348), podDir IS dataRoot — and dataRoot contains .idp/accounts/*.json (every IdP account incl. passwordHash) and .idp/keys/ (the IdP signing keys that mint tokens for any user). The previous handler walked the directory naively and would have shipped all of that to the caller, then shrugged off responsibility because account.json had passwordHash stripped — the same hashes (and worse) re-leaked via the raw walk.

Add a first-level denylist (currently {.idp}) applied only when podDir is the data root. Named-pod / multi-user paths don't hit this code path because the pod tree is already isolated by the filesystem layout. New regression test pins the property by provisioning a root pod, asserting the .idp/ tree exists on disk, and confirming nothing from it lands in the archive.

Other fixes batched into the same commit:
- Stream-level error handler. pack.on('error') + gzip.on('error') now log + destroy the pipeline so an EACCES / mid-pack failure doesn't silently produce a 200 + truncated gzip.
- Empty-directory tar entries via pack.entry({ type: 'directory' }). Preserves LDP container shape on round-trip restore — without this, an empty container provisioned by the operator vanishes.
- TOCTOU on stat→read fix.
  Open the fd once, fstat off the same fd, stream through createReadStream({ autoClose: false }). A concurrent truncate/grow can no longer desync the declared tar size from the bytes piped, which would otherwise produce a corrupt tar that fails extraction.
- Rate limit keyed by request.webId with IP fallback. One user's heavy backup pull no longer locks out everyone behind a shared NAT / proxy egress.
- Merged the duplicate `import from './credentials.js'` blocks in src/idp/index.js — the split was a diff artifact.
- Removed the contradictory symlink comment. The implementation uses dirent.isFile() and intentionally skips non-regular entries; the older "follows symlinks" comment was wrong.
- Renamed the cross-account test ('scopes the export to the authenticated caller — no cross-account exposure') to reflect what it actually proves. The endpoint takes no target parameter so cross-account is structurally impossible; the docstring + PR description claim of 403 was inaccurate. Removed the 403 claim from the export.js header.
- Content-Disposition: added the RFC 5987 `filename*=UTF-8''…` fallback so a future non-ASCII slip through sanitizeSlug degrades to valid percent-encoding rather than a malformed header.

Out of scope here: hoisting the duplicate package.json read into a single helper (JavaScriptSolidServer#449 review JavaScriptSolidServer#9). Two callsites, two lines each; not worth the refactor right now.

865/865 tests pass.

* review: expand denylist (.private), strengthen tests, honest rate-limit (JavaScriptSolidServer#449)

- ROOT_POD_EXCLUDE: add `.private` (pay handler's Bitcoin keypair + UTXO state). Without this, single-user root-pod export ships a drainable Bitcoin keypair to the recipient. Same class of leak as `.idp/`, missed in the first pass.
- account.json: switch from denylist (`{ passwordHash: _omit, ...rest }`) to allowlist (`ACCOUNT_EXPORT_FIELDS`).
  Future secret-bearing fields (passkey credentials, OIDC client secrets, recovery tokens) won't silently land in exports without a security review.
- Client-disconnect handler on `reply.raw.on('close')`: destroy pack + gzip when the socket goes away so walkAndPack stops reading every file in a multi-GB pod into a dead pipeline.
- Stop misleading rate-limit-by-WebID: /idp/* skips the global auth hook so `request.webId` is always undefined here, meaning the fallback to IP was the only thing actually running. Documented honestly; per-user keying is a follow-up needing pre-rate-limit auth resolution.
- Tests: hard-assert ownerToken instead of `t.skip()` (a creds regression must not silently turn the denylist + Credible-Exit assertions into no-ops). Cross-account test seeds an alice-only canary file + asserts bob's archive contains neither the name nor the bytes (prior `/alice/` regex was tautological — the archive prefix never embeds a username segment). Denylist test also synthesizes `.private/keypair.json` so it asserts a real on-disk entry isn't packed.
- Tiny: drop `request.log?.` optional chaining; Fastify always decorates the request with `log`.

* review: 403 third-party WebID + manifest parity + safer denylist tests (JavaScriptSolidServer#449)

CRITICAL FIX:
- Single-user mode previously trusted ANY successfully-authenticated WebID and shipped the entire pod (incl. /private/privkey.jsonld) to the caller. With LWS-CID JWTs / external Solid-OIDC issuers in scope, a third-party bearer was sufficient to download the operator's secret. Handler now refuses with 403 when the authenticated WebID does NOT match the seeded single-user account — same shape as the multi-user "no local account" 403. New test mints a third-party HMAC token and asserts 403.

Other:
- Single-user manifest now populates `username`, `email`, `createdAt` from the resolved account record so manifest shape matches the multi-user branch — downstream importers see one shape, not two.
- ROOT_POD_EXCLUDE: stronger inline guidance that adding a new server-internal top-level dir without updating this set + a regression test is a security bug. Allowlist alternative would break Credible Exit (drops legit user content) — denylist + test is the deliberate trade-off.
- Removed misleading "may be null in --no-idp mode" comment; idpPlugin doesn't mount when idp is disabled, so the only path to a null accountRecord here is the third-party-WebID case handled by the new 403.
- Tests: snapshot/restore process.env.DATA_ROOT around start/stop so cross-test leakage doesn't leave a stale path for any unrelated test that reads DATA_ROOT after this file runs.
- Denylist test now asserts presence of an actual secret-bearing file (`.idp/accounts/*.json`), not just `.idp/` existence — pins the property "no IdP secrets in the export", not the implementation.

* review: manifest podName parity + dead-guard cleanup + JSDoc + idempotent error handler (JavaScriptSolidServer#449)

- manifest.podName in single-user now reads from accountRecord (same source as the multi-user branch), so manifest.podName === account.json.podName in every archive shape. Previously root-pod emitted manifest.podName=null while accountRecord.podName='me' — silent disagreement in the same archive that a downstream importer would see depending on which field it read.
- packExport: dropped the now-dead `if (accountRecord)` guard. Both branches in handleExportAccount return 403 on null accountRecord, so by the time control reaches packExport the record is always defined and account.json is always emitted.
- onStreamError: idempotent via `if (gzip.destroyed || pack.destroyed) return;`. Destroying a stream re-emits 'error' which would re-enter this handler from three sources (pack.error, gzip.error, streamingPromise.catch) and produce duplicate "pod export stream error" log lines for one underlying failure.
- idpPlugin JSDoc: added `inviteOnly`, `singleUser`, `singleUserName`, `jssVersion` (previously only `issuer` was documented). A future caller reading the function header now discovers the export endpoint's plumbing requirements without grepping the body.
- Test: added root-pod manifest shape + parity assertions (manifest.podName === account.podName === 'me'). Pins the consistency property — a regression to the previous `null`-vs-`'me'` disagreement now fails loudly.

* review: close-handler race + defensive multi-user denylist + nested-allowlist + comments (JavaScriptSolidServer#449)

- Close-handler race: `reply.raw.on('close')` could briefly observe `writableEnded === false` on a normal completion (window between socket close and writableEnded flip), producing spurious "client disconnected" warn lines for successful exports on slow writes. Added `packFinished` flag set after pack.finalize() resolves; the close handler short-circuits when set, distinguishing "real disconnect mid-walk" from "natural end of stream".
- Defense-in-depth check on podDir basename: refuse with 500 if a multi-user export resolves podDir to a server-internal name in ROOT_POD_EXCLUDE. Account-creation validation rejects these names today; this is a second-line guard against an upstream regression silently letting `<dataRoot>/.idp` be walked as a "user pod".
- ACCOUNT_EXPORT_FIELDS: documented top-level-only behavior. A future field that's itself an object (e.g. `oidcClientConfig: { secret }`) would export the entire nested structure including secrets. Comment now warns maintainers and points at projection as the right pattern for nested data.
- Error vocabulary: added comment explaining the deliberate mixed convention (401 → OAuth-defined `invalid_token`, 403/404 → plain HTTP semantic), matching credentials.js and the rest of src/idp/.
- src/server.js: dropped misleading "same approach the existing config code uses" comment.
  Updated to honestly note the duplicate package.json read (sync at idpPlugin registration vs async in onReady) and flag a memoized helper as a follow-up out of scope for JavaScriptSolidServer#353.

* review: drop unused fs import + refresh failure-modes doc + don't echo podDir in 404 (JavaScriptSolidServer#449)

- Drop unused `import fs from 'fs'` (all I/O goes through fsp).
- Failure-modes JSDoc was stale: now lists the single-user 403 (third-party WebID, the security-critical fix from round 3) and the new 500 defense-in-depth code (podDir resolved to a server-internal name).
- 404 error_description no longer echoes the resolved podDir back to the caller. Leaking the operator's filesystem layout to an authenticated owner whose pod is missing was needless info disclosure. Path stays in the server log for debugging.

* review: pre-flight readability + named-pod podName consistency + drop void no-op (JavaScriptSolidServer#449)

- Pre-flight readability via fsp.readdir(podDir) BEFORE flushing response headers. A first-byte EACCES on the top-level readdir was previously surfacing inside the streaming pipeline AFTER reply.send(gzip) had flushed headers — client got 200 + truncated/empty body instead of clean 5xx JSON. Pre-flight catches this and routes to a clean 404 (ENOENT/ENOTDIR) or 500 (anything else) before headers go out. Both error_descriptions stay generic; resolved podDir stays in server log only.
- Named-pod single-user consistency: assert accountRecord.podName === options.singleUserName before walking. If seedSingleUserIdpAccount ever drifts so the seeded podName diverges from the CLI option that derives podDir, the manifest would advertise X while the export's tree comes from <dataRoot>/Y. Refuse with 500 + log so the seeding regression fails loudly. Skipped in root-pod (where podDir is dataRoot regardless of the seeded podName).
- Drop `void streamingPromise` no-op + the local.
  `.catch()` is already attached so no unhandled-rejection risk; the void was suppressing nothing. Chain `.then().catch()` directly.

* review: gate disconnect on response 'finish' + hard-fail pod creation + comment refresh (JavaScriptSolidServer#449)

- packFinished was set when packExport (i.e. pack.finalize()) resolved — that's "we're done writing into the pipeline", not "the client has received the bytes". For a multi-GB gzipped response over a slow link the gap can be substantial, and a real client disconnect during the final flush would be silently swallowed (close handler short-circuits on a flag set too early). Renamed to responseFinished and gated on `reply.raw.on('finish')` instead — Node emits 'finish' when the last byte hits the OS socket buffer, after which a 'close' is "client gone after we were done" (correctly ignored). Before 'finish', a 'close' is a genuine mid-stream disconnect.
- Multi-user before hook: hard-assert `aliceRes.ok` + `aliceToken` (and bob) so a /.pods shape regression surfaces as "pod creation failed: <status>" rather than as cryptic 401s in every downstream test. Same hard-fail pattern as the single-user before hooks.
- onStreamError comment refreshed to mention the inline `.catch(onStreamError)` chained on packExport instead of the no-longer-existing streamingPromise local.

* review: refuse empty singleUserName + access vs readdir + filename collision suffix (JavaScriptSolidServer#449)

- singleUserName='' would have silently switched the handler into root-pod mode under the previous `!singleUserName` falsy-check. Worse, naively switching to a strict null check would route '' through the named-pod branch with podDir = path.join(dataRoot, '') = dataRoot AND excludeAtRoot = null, silently exporting server-internal `.idp/` and `.private/`. Now refuse empty-string / non-string singleUserName explicitly with 500 + log before deciding isRootPod, then `isRootPod = singleUserName == null` for the legitimate cases.
- Pre-flight readability switched from `fsp.readdir(podDir)` (which reads + discards the entire top-level entry list before walkAndPack re-reads it) to `fsp.access(podDir, R_OK | X_OK)` — same semantic guard against EACCES, half the syscalls.
- Filename: 6-hex-char crypto.randomBytes(3) suffix appended to `jss-export-<slug>-<isoDate>-<rand>.tar.gz`. ISO timestamps have ms resolution; consecutive exports in the same ms would otherwise produce identical Content-Disposition filenames that overwrite each other on the client side.
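
The singleUserName guard from the final review round is easy to get subtly wrong, so here is a minimal sketch of the decision order it describes. It is not the handler's actual code: the helper name `resolveSingleUserPod`, its return shape, and the string concatenation used for podDir are assumptions made for illustration. Only `ROOT_POD_EXCLUDE`, `isRootPod`, `excludeAtRoot` and the 500 refusal come from the commit message.

```javascript
// Server-internal top-level directories named in the commit message.
const ROOT_POD_EXCLUDE = new Set(['.idp', '.private']);

// Hypothetical helper: name, return shape and podDir construction are
// assumptions for this sketch, not the real handler.
function resolveSingleUserPod(dataRoot, singleUserName) {
  // Refuse '' and non-strings BEFORE deciding isRootPod. A falsy check would
  // treat '' as root-pod mode, and a bare null check would send '' down the
  // named-pod branch, walking dataRoot with no denylist.
  if (singleUserName != null &&
      (typeof singleUserName !== 'string' || singleUserName === '')) {
    return { ok: false, status: 500 };
  }

  // Only null / undefined mean "root pod" at this point.
  const isRootPod = singleUserName == null;
  return {
    ok: true,
    isRootPod,
    podDir: isRootPod ? dataRoot : `${dataRoot}/${singleUserName}`,
    // The denylist applies only when the pod directory is the data root.
    excludeAtRoot: isRootPod ? ROOT_POD_EXCLUDE : null,
  };
}

console.log(resolveSingleUserPod('/data', undefined).isRootPod); // true
console.log(resolveSingleUserPod('/data', 'me').podDir);         // /data/me
console.log(resolveSingleUserPod('/data', '').status);           // 500
```

Putting the refusal first means both legitimate branches can rely on singleUserName being either absent or a non-empty string, which is what lets `singleUserName == null` serve as the sole root-pod test.
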
1 parent 7e22183 commit 0b46c60

6 files changed

Lines changed: 1312 additions & 6 deletions

File tree

package-lock.json

Lines changed: 267 additions & 1 deletion

package.json

Lines changed: 4 additions & 3 deletions
@@ -24,11 +24,11 @@
     "benchmark": "node benchmark.js"
   },
   "dependencies": {
-    "@noble/curves": "^1.2.0",
-    "@noble/hashes": "^1.3.2",
     "@fastify/middie": "^8.3.3",
     "@fastify/rate-limit": "^9.1.0",
     "@fastify/websocket": "^8.3.1",
+    "@noble/curves": "^1.2.0",
+    "@noble/hashes": "^1.3.2",
     "@simplewebauthn/server": "^13.2.2",
     "bcryptjs": "^3.0.3",
     "commander": "^14.0.2",
@@ -38,7 +38,8 @@
     "microfed": "^0.0.14",
     "n3": "^1.26.0",
     "oidc-provider": "^9.6.0",
-    "sql.js": "^1.13.0"
+    "sql.js": "^1.13.0",
+    "tar-stream": "^3.2.0"
   },
   "engines": {
     "node": ">=18.0.0"
