Commit 0b46c60
feat: GET /idp/account/export — self-service pod data download (JavaScriptSolidServer#353) (JavaScriptSolidServer#449)
* feat: GET /idp/account/export — self-service pod data download (JavaScriptSolidServer#353)
The export side of the user-rights trio (JavaScriptSolidServer#351 password change,
JavaScriptSolidServer#352 account delete, this). MVP slice of the Credible Exit ladder
(JavaScriptSolidServer#448) — the L0-3 'take your stuff with you' deliverable.
Authenticated owner gets a streamed tar.gz containing:
  jss-export/
    manifest.json   — webId, username, email, podName, mode,
                      createdAt, exportedAt, jssVersion
    account.json    — full account record minus passwordHash
                      (single-user without an IDP account: omitted)
    pod/...         — entire pod tree, including /private/
Per the Credible Exit framing in JavaScriptSolidServer#448, /private/privkey.jsonld
IS included in the archive when the pod was provisioned with
--provision-keys. The user's secret is theirs; refusing to export
would make L4+ identity migration impossible. The endpoint is
owner-authenticated; the secret never leaves the WAC perimeter
to anyone but the owner.
Streaming pipeline: tar.pack → zlib.createGzip → reply. Memory
stays constant regardless of pod size; multi-GB pods don't OOM.
Per-IP rate limit at 3/min — heavy read, but a legitimate
operator pulling a backup shouldn't hit it.
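The 3/min per-IP limit described above can be pictured as a fixed-window counter. This is a minimal sketch only: `createRateLimiter` and its options are hypothetical names, and the server may well rely on a rate-limit plugin rather than hand-rolled code like this.

```javascript
// Hypothetical fixed-window limiter: at most `max` hits per `windowMs` per key.
// `now` is injectable so the window logic can be exercised without real time.
function createRateLimiter({ max = 3, windowMs = 60_000, now = Date.now } = {}) {
  const windows = new Map(); // key -> { start, count }; never evicted in this sketch

  return function allow(key) {
    const t = now();
    const w = windows.get(key);
    if (!w || t - w.start >= windowMs) {
      // First hit for this key, or the previous window has expired.
      windows.set(key, { start: t, count: 1 });
      return true;
    }
    if (w.count >= max) return false; // over budget inside the current window
    w.count += 1;
    return true;
  };
}

// Assumed usage: key by client IP and answer 429 when `allow` returns false.
const allowExport = createRateLimiter({ max: 3, windowMs: 60_000 });
```

A production limiter would also evict stale keys; that is left out here to keep the window arithmetic visible.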
Failure modes:
401 — unauthenticated
403 — multi-user: no account for the caller's WebID
404 — pod directory unexpectedly missing
Out of scope (per the issue): re-import, cross-server pod
migration, periodic scheduled backups, partial / per-resource
selection, UI button on /idp/account.
New dep: tar-stream@^3.2.0. Pure JS, ~20KB, no native bindings —
runs cleanly on Termux / mobile per the deployment target.
Tests:
- 401 unauthenticated
- 200 owner export → valid tar.gz with manifest + account +
pod tree
- account.json never carries passwordHash
- cross-account: each authenticated caller can only get their
own data (the endpoint takes no target parameter)
- single-user + --provision-keys: archive contains the on-disk
/private/privkey.jsonld
864/864 tests pass.
Closes JavaScriptSolidServer#353. First slice of JavaScriptSolidServer#448.
* review: critical .idp leak fix + stream errors + dir entries + cleanup (JavaScriptSolidServer#449)
Ten batched pickups from Copilot. The first is security-critical;
the rest are real cleanup.
CRITICAL — single-user root-pod data leak. In root-pod single-user
mode (the default since JavaScriptSolidServer#348), podDir IS dataRoot — and dataRoot
contains .idp/accounts/*.json (every IdP account incl.
passwordHash) and .idp/keys/ (the IdP signing keys that mint
tokens for any user). The previous handler walked the directory
naively and would have shipped all of that to the caller, then
shrugged off responsibility because account.json had passwordHash
stripped — the same hashes (and worse) re-leaked via the raw walk.
Add a first-level denylist (currently {.idp}) applied only when
podDir is the data root. Named-pod / multi-user paths don't hit
this code path because the pod tree is already isolated by the
filesystem layout. New regression test pins the property by
provisioning a root pod, asserting the .idp/ tree exists on disk,
and confirming nothing from it lands in the archive.
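As a rough illustration of the first-level denylist described above (not the actual handler code), the filter only applies when the pod directory is the data root. `topLevelEntriesToPack` is a hypothetical helper name; the set contents mirror the `{.idp}` denylist from this round.

```javascript
// Server-internal top-level names that must never be packed from the data root.
const ROOT_POD_EXCLUDE = new Set(['.idp']);

// Hypothetical helper: given the names returned by a top-level readdir,
// return the ones the walker is allowed to descend into or pack.
function topLevelEntriesToPack(entryNames, isRootPod) {
  // Named-pod and multi-user layouts are already isolated on disk,
  // so the denylist is only consulted for root-pod exports.
  if (!isRootPod) return [...entryNames];
  return entryNames.filter((name) => !ROOT_POD_EXCLUDE.has(name));
}
```

Because the check is first-level only, a user directory that happens to be called `.idp` deeper in the tree would still be exported under this sketch.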
Other fixes batched into the same commit:
- Stream-level error handler. pack.on('error') + gzip.on('error')
now log + destroy the pipeline so an EACCES / mid-pack failure
doesn't silently produce a 200 + truncated gzip.
- Empty-directory tar entries via pack.entry({ type: 'directory' }).
Preserves LDP container shape on round-trip restore — without
this, an empty container provisioned by the operator vanishes.
- TOCTOU on stat→read fix. Open the fd once, fstat off the same
fd, stream through createReadStream({ autoClose: false }). A
concurrent truncate/grow can no longer desync the declared tar
size from the bytes piped, which would otherwise produce a
corrupt tar that fails extraction.
- Rate limit keyed by request.webId with IP fallback. One user's
heavy backup pull no longer locks out everyone behind a shared
NAT / proxy egress.
- Merged the duplicate `import from './credentials.js'` blocks in
src/idp/index.js — the split was a diff artifact.
- Removed the contradictory symlink comment. The implementation
uses dirent.isFile() and intentionally skips non-regular
entries; the older "follows symlinks" comment was wrong.
- Renamed the cross-account test ('scopes the export to the
authenticated caller — no cross-account exposure') to reflect
what it actually proves. The endpoint takes no target parameter
so cross-account is structurally impossible; the docstring +
PR description claim of 403 was inaccurate. Removed the 403
claim from the export.js header.
- Content-Disposition: added the RFC 5987 `filename*=UTF-8''…`
fallback so a future non-ASCII slip through sanitizeSlug
degrades to valid percent-encoding rather than a malformed
header.
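One plausible shape for the Content-Disposition fallback in the last bullet above is sketched here. `contentDisposition` is a hypothetical name, and the ASCII replacement rule is an assumption; the real handler sanitizes through `sanitizeSlug`, which is not reproduced.

```javascript
// Build a header value carrying both the legacy `filename=` parameter and the
// RFC 5987 `filename*=` parameter with a percent-encoded UTF-8 value.
function contentDisposition(filename) {
  // Assumed fallback rule: anything outside a conservative ASCII set becomes
  // "_" so the quoted-string form can never be malformed.
  const asciiFallback = filename.replace(/[^A-Za-z0-9._-]/g, '_');

  // encodeURIComponent leaves ' ( ) * unescaped, but RFC 5987 does not allow
  // them unencoded in an ext-value, so escape those four as well.
  const encoded = encodeURIComponent(filename).replace(
    /['()*]/g,
    (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase(),
  );

  return `attachment; filename="${asciiFallback}"; filename*=UTF-8''${encoded}`;
}
```

Clients that understand `filename*` prefer it; older ones fall back to the plain ASCII parameter.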
Out of scope here: hoisting the duplicate package.json read into
a single helper (JavaScriptSolidServer#449 review JavaScriptSolidServer#9). Two callsites, two lines each;
not worth the refactor right now.
865/865 tests pass.
* review: expand denylist (.private), strengthen tests, honest rate-limit (JavaScriptSolidServer#449)
- ROOT_POD_EXCLUDE: add `.private` (pay handler's Bitcoin keypair +
UTXO state). Without this, single-user root-pod export ships a
drainable Bitcoin keypair to the recipient. Same class of leak
as `.idp/`, missed in the first pass.
- account.json: switch from denylist (`{ passwordHash: _omit, ...rest }`)
to allowlist (`ACCOUNT_EXPORT_FIELDS`). Future secret-bearing
fields (passkey credentials, OIDC client secrets, recovery
tokens) won't silently land in exports without a security review.
- Client-disconnect handler on `reply.raw.on('close')`: destroy
pack + gzip when the socket goes away so walkAndPack stops
reading every file in a multi-GB pod into a dead pipeline.
- Stop misleading rate-limit-by-WebID: /idp/* skips the global auth
hook so `request.webId` is always undefined here, meaning the
fallback to IP was the only thing actually running. Documented
honestly; per-user keying is a follow-up needing pre-rate-limit
auth resolution.
- Tests: hard-assert ownerToken instead of `t.skip()` (a creds
regression must not silently turn the denylist + Credible-Exit
assertions into no-ops). Cross-account test seeds an alice-only
canary file + asserts bob's archive contains neither the name
nor the bytes (prior `/alice/` regex was tautological — the
archive prefix never embeds a username segment). Denylist test
also synthesizes `.private/keypair.json` so it asserts a real
on-disk entry isn't packed.
- Tiny: drop `request.log?.` optional chaining; Fastify always
decorates the request with `log`.
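The denylist-to-allowlist switch for account.json in the list above can be sketched as a plain projection. The field names below are assumptions borrowed from the manifest description, not the real contents of `ACCOUNT_EXPORT_FIELDS`, and `projectAccountForExport` is a hypothetical name.

```javascript
// Assumed field list for illustration only.
const ACCOUNT_EXPORT_FIELDS = ['username', 'email', 'webId', 'podName', 'createdAt'];

// Copy only allowlisted top-level fields. Anything not named above
// (passwordHash, or a secret-bearing field added later) is dropped by default.
function projectAccountForExport(record) {
  const out = {};
  for (const field of ACCOUNT_EXPORT_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      out[field] = record[field];
    }
  }
  return out;
}
```

With a denylist, a newly added secret field leaks until someone remembers to list it; with the allowlist it stays out until someone deliberately adds it.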
* review: 403 third-party WebID + manifest parity + safer denylist tests (JavaScriptSolidServer#449)
CRITICAL FIX:
- Single-user mode previously trusted ANY successfully-authenticated
WebID and shipped the entire pod (incl. /private/privkey.jsonld)
to the caller. With LWS-CID JWTs / external Solid-OIDC issuers
in scope, a third-party bearer was sufficient to download the
operator's secret. Handler now refuses with 403 when the
authenticated WebID does NOT match the seeded single-user
account — same shape as the multi-user "no local account" 403.
New test mints a third-party HMAC token and asserts 403.
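The single-user ownership check described in the fix above reduces to comparing the authenticated WebID with the seeded account. This sketch assumes the account record carries a `webId` field; `authorizeSingleUserExport` is a hypothetical name, not the handler's real function.

```javascript
// Return the HTTP status the export handler should answer with.
function authorizeSingleUserExport(callerWebId, seededAccount) {
  // No authenticated identity at all.
  if (!callerWebId) return 401;
  // Authenticated, but not the pod owner (for example a token from an
  // external issuer): being authenticated is not the same as being the owner.
  if (!seededAccount || seededAccount.webId !== callerWebId) return 403;
  return 200;
}
```

The key point is that a valid token is necessary but not sufficient; only the WebID that matches the seeded account may download the archive.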
Other:
- Single-user manifest now populates `username`, `email`, `createdAt`
from the resolved account record so manifest shape matches the
multi-user branch — downstream importers see one shape, not two.
- ROOT_POD_EXCLUDE: stronger inline guidance that adding a new
server-internal top-level dir without updating this set + a
regression test is a security bug. Allowlist alternative would
break Credible Exit (drops legit user content) — denylist + test
is the deliberate trade-off.
- Removed misleading "may be null in --no-idp mode" comment;
idpPlugin doesn't mount when idp is disabled, so the only path
to a null accountRecord here is the third-party-WebID case
handled by the new 403.
- Tests: snapshot/restore process.env.DATA_ROOT around start/stop
so cross-test leakage doesn't leave a stale path for any
unrelated test that reads DATA_ROOT after this file runs.
- Denylist test now asserts presence of an actual secret-bearing
file (`.idp/accounts/*.json`), not just `.idp/` existence — pins
the property "no IdP secrets in the export", not the implementation.
* review: manifest podName parity + dead-guard cleanup + JSDoc + idempotent error handler (JavaScriptSolidServer#449)
- manifest.podName in single-user now reads from accountRecord (same
source as the multi-user branch), so manifest.podName ===
account.json.podName in every archive shape. Previously root-pod
emitted manifest.podName=null while accountRecord.podName='me' —
silent disagreement in the same archive that a downstream importer
would see depending on which field it read.
- packExport: dropped the now-dead `if (accountRecord)` guard.
Both branches in handleExportAccount return 403 on null
accountRecord, so by the time control reaches packExport the
record is always defined and account.json is always emitted.
- onStreamError: idempotent via `if (gzip.destroyed || pack.destroyed)
return;`. Destroying a stream re-emits 'error' which would re-enter
this handler from three sources (pack.error, gzip.error,
streamingPromise.catch) and produce duplicate "pod export stream
error" log lines for one underlying failure.
- idpPlugin JSDoc: added `inviteOnly`, `singleUser`, `singleUserName`,
`jssVersion` (previously only `issuer` was documented). A future
caller reading the function header now discovers the export
endpoint's plumbing requirements without grepping the body.
- Test: added root-pod manifest shape + parity assertions
(manifest.podName === account.podName === 'me'). Pins the
consistency property — a regression to the previous
`null`-vs-`'me'` disagreement now fails loudly.
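The idempotent `onStreamError` guard from the list above can be tried out with plain `PassThrough` streams standing in for the tar pack and gzip streams. `wireStreamErrors` and the `log` callback are hypothetical; only the early-return condition is taken from the commit text.

```javascript
const { PassThrough } = require('node:stream');

// Attach one shared error handler to both stages of the pipeline.
function wireStreamErrors(pack, gzip, log) {
  function onStreamError(err) {
    // Destroying a stream re-emits 'error', which would re-enter this
    // handler; bail out once teardown has already started.
    if (gzip.destroyed || pack.destroyed) return;
    log(`pod export stream error: ${err.message}`);
    pack.destroy(err);
    gzip.destroy(err);
  }
  pack.on('error', onStreamError);
  gzip.on('error', onStreamError);
  return onStreamError;
}

// Stand-ins for the real streams, for demonstration only.
const logs = [];
const pack = new PassThrough();
const gzip = new PassThrough();
const onStreamError = wireStreamErrors(pack, gzip, (line) => logs.push(line));

onStreamError(new Error('EACCES')); // logs once and destroys both streams
onStreamError(new Error('EACCES')); // no-op: teardown already in progress
```

The 'error' events emitted later by the two `destroy(err)` calls also land in the same handler and return early for the same reason.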
* review: close-handler race + defensive multi-user denylist + nested-allowlist + comments (JavaScriptSolidServer#449)
- Close-handler race: `reply.raw.on('close')` could briefly observe
`writableEnded === false` on a normal completion (window between
socket close and writableEnded flip), producing spurious "client
disconnected" warn lines for successful exports on slow writes.
Added `packFinished` flag set after pack.finalize() resolves; the
close handler short-circuits when set, distinguishing "real
disconnect mid-walk" from "natural end of stream".
- Defense-in-depth check on podDir basename: refuse with 500 if a
multi-user export resolves podDir to a server-internal name in
ROOT_POD_EXCLUDE. Account-creation validation rejects these names
today; this is a second-line guard against an upstream regression
silently letting `<dataRoot>/.idp` be walked as a "user pod".
- ACCOUNT_EXPORT_FIELDS: documented top-level-only behavior. A
future field that's itself an object (e.g. `oidcClientConfig:
{ secret }`) would export the entire nested structure including
secrets. Comment now warns maintainers and points at projection
as the right pattern for nested data.
- Error vocabulary: added comment explaining the deliberate mixed
convention (401 → OAuth-defined `invalid_token`, 403/404 → plain
HTTP semantic), matching credentials.js and the rest of src/idp/.
- src/server.js: dropped misleading "same approach the existing
config code uses" comment. Updated to honestly note the duplicate
package.json read (sync at idpPlugin registration vs async in
onReady) and flag a memoized helper as a follow-up out of scope
for JavaScriptSolidServer#353.
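The defense-in-depth basename check from the list above might look roughly like this. `isServerInternalPodDir` is a hypothetical name, and the paths in the comments are made-up examples.

```javascript
const path = require('node:path');

// Server-internal top-level names as of this review round.
const ROOT_POD_EXCLUDE = new Set(['.idp', '.private']);

// True when a resolved multi-user podDir points at a server-internal
// directory such as <dataRoot>/.idp; the handler would answer 500 then.
function isServerInternalPodDir(podDir) {
  // path.basename ignores trailing separators, so "/data/.idp/" is caught too.
  return ROOT_POD_EXCLUDE.has(path.basename(podDir));
}
```

Account-creation validation should make this branch unreachable; the guard exists so an upstream regression fails closed rather than exporting IdP state.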
* review: drop unused fs import + refresh failure-modes doc + don't echo podDir in 404 (JavaScriptSolidServer#449)
- Drop unused `import fs from 'fs'` (all I/O goes through fsp).
- Failure-modes JSDoc was stale: now lists the single-user 403
(third-party WebID, the security-critical fix from round 3) and
the new 500 defense-in-depth code (podDir resolved to a server-
internal name).
- 404 error_description no longer echoes the resolved podDir back
to the caller. Leaking the operator's filesystem layout to an
authenticated owner whose pod is missing was needless info
disclosure. Path stays in the server log for debugging.
* review: pre-flight readability + named-pod podName consistency + drop void no-op (JavaScriptSolidServer#449)
- Pre-flight readability via fsp.readdir(podDir) BEFORE flushing
response headers. A first-byte EACCES on the top-level readdir
was previously surfacing inside the streaming pipeline AFTER
reply.send(gzip) had flushed headers — client got 200 +
truncated/empty body instead of clean 5xx JSON. Pre-flight
catches this and routes to a clean 404 (ENOENT/ENOTDIR) or 500
(anything else) before headers go out. Both error_descriptions
stay generic; resolved podDir stays in server log only.
- Named-pod single-user consistency: assert
accountRecord.podName === options.singleUserName before walking.
If seedSingleUserIdpAccount ever drifts so the seeded podName
diverges from the CLI option that derives podDir, the manifest
would advertise X while the export's tree comes from
<dataRoot>/Y. Refuse with 500 + log so the seeding regression
fails loudly. Skipped in root-pod (where podDir is dataRoot
regardless of the seeded podName).
- Drop `void streamingPromise` no-op + the local. `.catch()` is
already attached so no unhandled-rejection risk; the void was
suppressing nothing. Chain `.then().catch()` directly.
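A sketch of the pre-flight step from the first bullet above: probe the directory before any headers are sent, then map the failure to a status. `preflightPodDir` and `statusForPreflightError` are hypothetical names; the 404/500 split follows the commit text.

```javascript
const fsp = require('node:fs/promises');

// ENOENT / ENOTDIR mean "no such pod directory"; anything else
// (EACCES, EIO, ...) is treated as a server-side failure.
function statusForPreflightError(err) {
  return err.code === 'ENOENT' || err.code === 'ENOTDIR' ? 404 : 500;
}

// Resolve to null when the pod directory is readable, or to the HTTP
// status to send. Runs before the streaming response is started, so the
// caller can still reply with a clean JSON error.
async function preflightPodDir(podDir) {
  try {
    await fsp.readdir(podDir);
    return null;
  } catch (err) {
    return statusForPreflightError(err);
  }
}
```

Once the gzip stream has been handed to the reply, the status line is already committed, which is why this probe has to happen first.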
* review: gate disconnect on response 'finish' + hard-fail pod creation + comment refresh (JavaScriptSolidServer#449)
- packFinished was set when packExport (i.e. pack.finalize()) resolved
— that's "we're done writing into the pipeline", not "the client
has received the bytes". For a multi-GB gzipped response over a
slow link the gap can be substantial, and a real client disconnect
during the final flush would be silently swallowed (close handler
short-circuits on a flag set too early). Renamed to responseFinished
and gated on `reply.raw.on('finish')` instead — Node emits 'finish'
when the last byte hits the OS socket buffer, after which a 'close'
is "client gone after we were done" (correctly ignored). Before
'finish', a 'close' is a genuine mid-stream disconnect.
- Multi-user before hook: hard-assert `aliceRes.ok` + `aliceToken`
(and bob) so a /.pods shape regression surfaces as "pod creation
failed: <status>" rather than as cryptic 401s in every downstream
test. Same hard-fail pattern as the single-user before hooks.
- onStreamError comment refreshed to mention the inline
`.catch(onStreamError)` chained on packExport instead of the
no-longer-existing streamingPromise local.
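The 'finish'-gated disconnect detection from the first bullet above can be modelled with a bare `EventEmitter` in place of `reply.raw`. `watchForDisconnect` and `simulate` are hypothetical helpers written for this illustration.

```javascript
const { EventEmitter } = require('node:events');

// Call onDisconnect only when the socket closes before the response finished.
function watchForDisconnect(raw, onDisconnect) {
  let responseFinished = false;
  raw.on('finish', () => {
    responseFinished = true; // the response has been fully written out
  });
  raw.on('close', () => {
    if (responseFinished) return; // client went away after we were done
    onDisconnect(); // genuine mid-stream disconnect: tear the pipeline down
  });
}

// Drive a fake response through a sequence of events and count disconnects.
function simulate(events) {
  const raw = new EventEmitter(); // stand-in for reply.raw
  let disconnects = 0;
  watchForDisconnect(raw, () => {
    disconnects += 1;
  });
  for (const name of events) raw.emit(name);
  return disconnects;
}
```

Calling `simulate(['finish', 'close'])` models a normal completion, while `simulate(['close'])` models a client that dropped mid-transfer.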
* review: refuse empty singleUserName + access vs readdir + filename collision suffix (JavaScriptSolidServer#449)
- singleUserName='' would have silently switched the handler into
root-pod mode under the previous `!singleUserName` falsy-check.
Worse, naively switching to a strict null check would route ''
through the named-pod branch with podDir = path.join(dataRoot, '')
= dataRoot AND excludeAtRoot = null, silently exporting
server-internal `.idp/` and `.private/`. Now refuse empty-string /
non-string singleUserName explicitly with 500 + log before
deciding isRootPod, then `isRootPod = singleUserName == null` for
the legitimate cases.
- Pre-flight readability switched from `fsp.readdir(podDir)` (which
reads + discards the entire top-level entry list before
walkAndPack re-reads it) to `fsp.access(podDir, R_OK | X_OK)` —
same semantic guard against EACCES, half the syscalls.
- Filename: 6-hex-char crypto.randomBytes(3) suffix appended to
`jss-export-<slug>-<isoDate>-<rand>.tar.gz`. ISO timestamps have
ms resolution; consecutive exports in the same ms would otherwise
produce identical Content-Disposition filenames that overwrite
each other on the client side.
1 parent 7e22183 commit 0b46c60
6 files changed
Lines changed: 1312 additions & 6 deletions