Symptom
test/nostr-cid-vm.test.js:223 (it('rejects a JWK with the right x but wrong y (curve-point integrity)', ...)) intermittently fails with:
ERR_ASSERTION
actual: 'https://alice.example.com/profile/card.jsonld#me'
expected: 'did:nostr:<hex pubkey>'
operator: 'strictEqual'
The test expects auth to fall back to did:nostr: when the JWK presents a corrupted y coordinate — but on some runs the WebID upgrade succeeds anyway, meaning the corrupted JWK was accepted as a valid curve point.
Root cause hypothesis
Each test run generates a fresh secp256k1 keypair (generateSecretKey() in the before/beforeEach). The corruption applied to y is:
const badJwk = {
...goodJwk,
y: goodJwk.y.slice(0, -1) + (goodJwk.y.endsWith('A') ? 'B' : 'A')
};
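That is, it flips the last base64url character of y. For a random public-key point P = (x, y), the result y' is almost always not a valid y for that x, so the JWK is correctly rejected. But not always: each valid x has exactly two y solutions (y and p − y); if the single-character flip happens to land on the negation −y mod p, the corrupted JWK is still on the curve and the test fails as observed. A second path is worth ruling out: a 32-byte coordinate encodes to 43 unpadded base64url characters, and the last one carries only 4 data bits, so when y ends in 'A' the replacement 'B' differs only in the two unused trailing bits. A decoder that ignores those bits (an assumption about the code under test) reads the "corrupted" y as the original bytes, and the JWK is accepted because it is in fact unchanged.
The −y collision alone would be vanishingly rare; the padding path would affect roughly one key in 16, which fits better with seeing a failure within tens of runs. Either way the outcome depends only on the randomly generated key, which makes this a true flake: same code, same test, different outcome.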
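To see what the last-character swap does at the byte level, here is a minimal sketch. It assumes the code under test decodes coordinates leniently, the way Node's built-in Buffer decoder does; the two sample coordinates are made up purely to control the final character and are not real curve points.

```javascript
// Decode a base64url string to raw bytes with Node's built-in decoder.
// Assumption: the code under test decodes in a similarly lenient way.
const decode = (s) => Buffer.from(s, 'base64url');

// The corruption used by the test: swap the last character for 'A' or 'B'.
const corrupt = (y) => y.slice(0, -1) + (y.endsWith('A') ? 'B' : 'A');

// Hypothetical 32-byte values, chosen only to control the last character.
// The last base64url character encodes the low nibble of the final byte.
const endsInA = Buffer.alloc(32, 0x10).toString('base64url'); // low nibble 0 -> 'A'
const endsInE = Buffer.alloc(32, 0x11).toString('base64url'); // low nibble 1 -> 'E'

for (const y of [endsInA, endsInE]) {
  const bad = corrupt(y);
  const sameBytes = decode(y).equals(decode(bad));
  console.log(
    `${y.slice(-1)} -> ${bad.slice(-1)}:`,
    sameBytes ? 'decodes to the SAME bytes' : 'decodes to different bytes'
  );
}
```

If the first case reports identical bytes in the real decoding path too, the "corrupted" JWK is simply the original key whenever y happens to end in 'A'.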
Reproduction
Run the suite repeatedly:
for i in {1..50}; do npm test 2>&1 | grep -E "^ℹ fail" | grep -v "fail 0" && echo "FLAKED ON RUN $i"; done
Expect at least one failure within tens of runs.
Fix options
1. Deterministic keypair. Hardcode a known-good secretKey/pubkey pair for this test: same behaviour every run, nothing probabilistic. Simplest. Loses a little "tries lots of values" coverage, but the test isn't fuzzing the curve implementation.
2. Provably invalid corruption. Replace y with a value that cannot be on the curve, e.g. all zeroes (no point on secp256k1 has y = 0: such a point would have order 2, and the group order is an odd prime). Or set y to MAX_FIELD_VAL + 1, which is out of range for a field element. Removes the y' == −y collision path entirely.
3. Pre-check the corruption. After producing badJwk, run a curve-point validation against it; if it happens to be valid, regenerate. Effectively rejection sampling. Works but feels like working around the symptom.
(1) is the smallest, clearest change. (2) is more thorough — corruption stays semantically equivalent to "not on the curve" without depending on the random keypair.
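As a rough illustration of options (2) and (3), the sketch below checks the secp256k1 equation y² = x³ + 7 (mod p) directly with BigInt and builds a corrupted JWK whose y is all zeroes. The helper names (coordToBigInt, isOnCurve, jwkIsOnCurve, toB64u) are hypothetical and not taken from the repo; the generator point G stands in for a real test key.

```javascript
// secp256k1 field prime: p = 2^256 - 2^32 - 977.
const P = 2n ** 256n - 2n ** 32n - 977n;

// Convert an unpadded base64url coordinate to a BigInt (hypothetical helper).
function coordToBigInt(b64u) {
  const hex = Buffer.from(b64u, 'base64url').toString('hex');
  return hex.length > 0 ? BigInt('0x' + hex) : 0n;
}

// True only if both coordinates are in range and satisfy y^2 = x^3 + 7 (mod p).
function isOnCurve(x, y) {
  if (x < 0n || x >= P || y < 0n || y >= P) return false;
  return (y * y) % P === (x * x * x + 7n) % P;
}

// Check a JWK-shaped object with base64url x and y members.
function jwkIsOnCurve(jwk) {
  return isOnCurve(coordToBigInt(jwk.x), coordToBigInt(jwk.y));
}

// Encode a BigInt as a 32-byte big-endian base64url string.
const toB64u = (n) =>
  Buffer.from(n.toString(16).padStart(64, '0'), 'hex').toString('base64url');

// The standard secp256k1 generator point, used here as a known-valid key.
const GX = 0x79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798n;
const GY = 0x483ada7726a3c4655da4fbfc0e1108a8fd17b448a68554199c47d08ffb10d4b8n;

const goodJwk = { kty: 'EC', crv: 'secp256k1', x: toB64u(GX), y: toB64u(GY) };

// Option (2): y = 0 can never be on the curve, whatever x is.
const badJwk = { ...goodJwk, y: toB64u(0n) };

console.log('good JWK on curve:', jwkIsOnCurve(goodJwk));
console.log('bad JWK on curve: ', jwkIsOnCurve(badJwk));
```

For option (3), the same jwkIsOnCurve check could guard the existing corruption: regenerate the keypair while the corrupted JWK still passes. With the all-zero y from option (2) that loop is unnecessary.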
Severity
Low — test only, doesn't affect runtime behaviour. Annoying because it can fail any unrelated PR's CI run, eroding signal.
Out of scope
- A general audit of other tests that use generateSecretKey() for similar randomness assumptions (test/nostr-event.test.js etc.). Worth a follow-up sweep but not urgent.