Conversation
|
Updated 9:46 PM PT - Apr 8th, 2026
❌ @robobun, your commit bb1e3c5 has 2 failures in
🧪 To try this PR locally: bunx bun-pr 29050That installs a local version of the PR into your bun-29050 --bun |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughIntroduces a wire-agnostic subscription frame type and parser, centralizes subscription-frame detection in response handling, refactors subscription dispatch to operate on parsed frames (.message/.subscribe/.unsubscribe) without in-function promise rejection, and adds PUB/SUB integration and regression tests exercising RESP2 (HELLO 2). Changes
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/valkey/valkey.zig`:
- Around line 673-690: The code currently uses a locally-computed subscription
count via p._subscription_ctx.channelsSubscribedToCount(globalThis) before
resolving the SUBSCRIBE promise; change this to use the server-reported count in
the response by extracting frame.data[1] (via frame.data[1].asInteger()) and
pass that value into req_pair.promise.promise.resolve(...) so that
addSubscription(), onValkeySubscribe(), and the promise resolution use the
server's authoritative count for multi-channel SUBSCRIBE responses.
In `@test/regression/issue/29042.test.ts`:
- Around line 47-60: The test awaits (the Promise created via
Promise.withResolvers and other async assertions around sub.subscribe /
pub.send) can hang; wrap those awaits with a bounded timeout helper (e.g.,
withTimeout) so failures are immediate and diagnostic. Replace direct awaits of
promise/received (and the other places flagged: the other awaits around pub.send
/ subscriber resolution) with await withTimeout(promiseOrOperation, TIMEOUT_MS)
and keep using Promise.withResolvers, resolve, reject inside sub.subscribe to
trigger the resolver; ensure you apply this pattern to the other two spots
referenced (the awaits at the other subscribe/publish assertions).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c137757c-86bf-4cef-8b64-e2249fe242f5
📥 Commits
Reviewing files that changed from the base of the PR and between 1afabdd and 08152112e611cf0f3db9bf27979d711b319cff1e.
📒 Files selected for processing (4)
src/valkey/valkey.zigsrc/valkey/valkey_protocol.zigtest/js/valkey/valkey.test.tstest/regression/issue/29042.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/regression/issue/29042.test.ts`:
- Around line 57-89: The test leaks Redis sockets because pub.close() and
sub.close() only run on the happy path; change the setup to use scoped disposal
so clients are always closed even on assertion/timeout failures: create the
clients with the Disposable pattern (use await using or using) around
RedisClient and its connect() result (e.g., replace manual const pub = new
RedisClient(...); await pub.connect(); ... pub.close() with an await using
(const pub = await RedisClient.connect("redis://localhost:6379") or equivalent)
and similarly for sub; ensure the test body uses the scoped variables (sub, pub)
and remove the explicit close() calls so close() runs automatically on scope
exit.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: c79ca729-d7bf-4070-b878-dc1bf4f90ed9
📥 Commits
Reviewing files that changed from the base of the PR and between 08152112e611cf0f3db9bf27979d711b319cff1e and ab1d5e97ca35fc831934a1388c1ee09a8656ab26.
📒 Files selected for processing (1)
test/regression/issue/29042.test.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/regression/issue/29042.test.ts`:
- Around line 19-28: The helper redisReachable creates a RedisClient (variable
c) but only calls c.close() on the success path so a thrown connect() leaves a
stray connection; change the function redisReachable to ensure the client is
always closed by moving c.close() into a finally block (guarding for c being
defined) and await the close if it returns a promise so the connection is
cleaned up regardless of whether connect() throws.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7ac281e1-35e2-4297-9fd3-46cbb16efe17
📥 Commits
Reviewing files that changed from the base of the PR and between ab1d5e97ca35fc831934a1388c1ee09a8656ab26 and 7e4aac57ddb55a7fcbd8857c3797d147af8a3a42.
📒 Files selected for processing (1)
test/regression/issue/29042.test.ts
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/valkey/valkey.zig`:
- Around line 844-847: The branch inside valkey.zig that checks if (value.* ==
.Error) currently calls this.fail(value.Error,
protocol.RedisError.InvalidResponse) after the subscriber request pair may have
already been popped from in_flight, which causes global rejection of remaining
queues and leaves the original caller unresolved; instead, detect whether
pair_maybe was removed and if so reject only the specific promise tied to that
subscriber (or return the appropriate error to the caller) rather than invoking
this.fail; update the logic around pair_maybe/in_flight handling so you either:
1) resolve/reject the popped pair directly when value.* == .Error, or 2) if the
pair is already removed, propagate the error only to that subscriber's promise,
and only call this.fail when the error should close the whole client.
- Around line 801-810: Restore the guard that prevents unsupported RESP3 Push
frames from entering the normal reply queue: before calling readItem() check if
the incoming value is a RESP3 Push (value.kind == .Push) and that
protocol.SubscriptionPushMessage.asFrame(value) returns null; if so, treat it as
an unsupported push (route to the subscription/push handler or drop/log it)
instead of letting it fall through into the usual reply handling/promise pair
consumption. Ensure this same guard is applied in the other affected block
mentioned (the code region around the existing SubscriptionPushMessage.asFrame
usage and the similar code at the other spot around lines 834–856) so only
non-push RESP2 replies are allowed to proceed into the normal reply queue.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a7010d48-8a22-4e50-b5aa-1d3e3f27ac56
📥 Commits
Reviewing files that changed from the base of the PR and between 7e4aac57ddb55a7fcbd8857c3797d147af8a3a42 and 928512cf7c74069519692167c9830170d14da817.
📒 Files selected for processing (2)
src/valkey/valkey.zigtest/regression/issue/29042.test.ts
There was a problem hiding this comment.
LGTM — the RESP2 delivery fix is correct, all reviewer feedback was addressed, and the code quality issues I raised were cleaned up in the follow-up commit.
Extended reasoning...
Overview
The PR introduces SubscriptionPushMessage.asFrame() in valkey_protocol.zig to normalize subscription frames from both RESP3 .Push and RESP2 .Array shapes into a single Frame struct. handleSubscribeResponse and handleResponse in valkey.zig are updated to dispatch on the parsed frame kind rather than the raw wire type, fixing silent message drops after HELLO 2. Two test files add coverage.
Security risks
No security surface — this is client-side protocol routing with no auth, crypto, or permission-checking involved.
Level of scrutiny
Moderate: the subscriber dispatch loop is on a hot path and correctness matters, but the change is well-bounded. The asFrame helper is a pure read with no side effects, and every code path through handleSubscribeResponse was already present in the old code — just gated on the wrong type tag. The regression test confirms the fix and verifies both SUBSCRIBE confirmation and message delivery under RESP2.
Other factors
- My earlier inline comments about the dead
fallthroughreturn type and the unnecessarychannelsSubscribedToCountcall on message frames were both addressed in the clean-up commit (928512c): the function now returnsvoidandsub_countis computed only inside the.subscribebranch. - The pre-existing SUBSCRIBE error hang (promise never rejected on server
.Errorresponse) is correctly identified as predating this PR and is out of scope. - The unrouted-push edge case (keyspace notifications arriving while a SUBSCRIBE is in-flight) was discussed; the PR author acknowledged it and committed to filing a separate tracking issue. The pre-PR behavior for that path was a hard connection failure, which is also wrong — the risk profile is comparable.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/valkey/valkey.zig (1)
823-886:⚠️ Potential issue | 🟠 MajorUnsupported RESP2 async arrays can still be silently dropped in subscriber mode.
At Line 824 and Line 885, if
subscription_frame == nulland there is no in-flight pair, execution reachespair_maybe orelse return, which silently discards unrecognized RESP2.Arraynotifications. This reintroduces silent-loss behavior for unsupported pub/sub kinds under RESP2.🛠️ Suggested fix
if (this.parent().isSubscriber()) { if (subscription_frame) |frame| { switch (frame.kind) { @@ } + } else if (value.* == .Array and this.in_flight.readableLength() == 0) { + // On subscriber connections, a RESP2 array with no associated in-flight + // command is server-initiated async data. If we can't classify it as a + // supported subscription frame, fail loudly instead of silently dropping. + try this.fail("Unexpected RESP2 pub/sub frame kind on subscriber connection", protocol.RedisError.InvalidResponse); + return; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/valkey/valkey.zig` around lines 823 - 886, The code currently returns silently when pair_maybe is null which causes unsupported RESP2 async Array notifications to be dropped; change the path where pair_maybe orelse { return; } to surface an error when we're in subscriber context and received a non-subscribe async Array with no in-flight pair: instead of returning, call this.fail(...) with an appropriate protocol.RedisError (e.g., InvalidResponse or a new UnsupportedAsyncNotification error) passing the received value so the client is notified; reference the symbols pair_maybe, subscription_frame, this.parent().isSubscriber(), value, and this.fail to locate where to implement this explicit error handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/valkey/valkey.zig`:
- Around line 823-886: The code currently returns silently when pair_maybe is
null which causes unsupported RESP2 async Array notifications to be dropped;
change the path where pair_maybe orelse { return; } to surface an error when
we're in subscriber context and received a non-subscribe async Array with no
in-flight pair: instead of returning, call this.fail(...) with an appropriate
protocol.RedisError (e.g., InvalidResponse or a new UnsupportedAsyncNotification
error) passing the received value so the client is notified; reference the
symbols pair_maybe, subscription_frame, this.parent().isSubscriber(), value, and
this.fail to locate where to implement this explicit error handling.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: bbefa202-d43e-4c64-b7ad-bf167b7a96c2
📥 Commits
Reviewing files that changed from the base of the PR and between 928512cf7c74069519692167c9830170d14da817 and d6cfe63b11ddecad4f8f23746e4d3800347b2ea0.
📒 Files selected for processing (1)
src/valkey/valkey.zig
|
On the outside-diff RESP2 array concern: declining as out-of-scope for this PR. That path ( |
Bun's Valkey/Redis client dispatches subscriber notifications by
matching on the RESP3 `.Push` (`>`) wire type. When a user puts the
subscriber connection into RESP2 mode via `sub.send("HELLO", ["2"])`,
Redis delivers the same notifications as ordinary RESP2 arrays
(`*3\r\n$7\r\nmessage\r\n$...`). Those arrays were silently dropped:
PUBLISH kept reporting the subscriber as present but the JS handler
never fired and no error was surfaced.
Unify both wire shapes behind `SubscriptionPushMessage.asFrame`, which
recognises a subscription frame whether it arrives as a `.Push` or as
a `.Array` whose first element is a matching kind tag, and route both
through the same handler path. This also fixes the sibling symptoms
(SUBSCRIBE/UNSUBSCRIBE confirmations hanging under RESP2).
Without the fix the RESP2 subscribe/unsubscribe/message paths hang rather than throw, so the previous test surfaced a generic 5s global timeout. withTimeout() races every network-dependent await and reports '#29042 regression? Timed out after Xms waiting for: <label>' so a future regression is immediately obvious.
A timeout or assertion failure in the middle of a test was leaking the Redis sockets because the close() calls were only on the happy path. RedisClient doesn't implement Symbol.dispose yet, so use explicit try/finally — that guarantees sub.close()/pub.close() runs even when an earlier withTimeout() rejects.
Three small follow-ups from review:
- handleSubscribeResponse now returns `bun.JSError!void`. The switch on
`frame.kind` is exhaustive, so the old `enum { handled, fallthrough }`
return type only ever produced `.handled`; the caller's `== .handled`
check was dead.
- Only compute `sub_count` on the `.subscribe` arm where it is actually
used. Incoming message frames — the hot path during active pub/sub —
no longer do a JSMap size lookup whose result gets discarded.
- Wrap the RedisClient in `redisReachable` in try/finally so the probe
doesn't leak a socket when `connect()` throws.
… errors Two subscriber dispatch corner cases flagged in review: - An unrouted RESP3 `.Push` frame (keyspace notification, CLIENT TRACKING `invalidate`, or a sharded/pattern push kind the client doesn't route yet) is a server-initiated async notification, not a reply to any in-flight command. Old code hard-failed the connection for these; the earlier refactor silently let them fall through into the regular command handler, which could corrupt an unrelated command's result. Drop the frame early, before consuming a promise pair. - A SUBSCRIBE command that the server answers with `-ERR` (NOPERM, ACL rejection, etc.) used to end up with the popped pair never getting rejected: by the time we reached `this.fail()`, the pair had already been dequeued from `in_flight`, so `rejectAllPendingCommands` couldn't see it and the caller hung. Reject the popped pair directly when it is present.
The previous unrouted-push guard dropped every RESP3 `.Push` whose kind wasn't in the supported set, intending to catch server-initiated async notifications (keyspace events, CLIENT TRACKING invalidations). But a pattern subscription's confirmation also arrives as an unrouted push kind (`psubscribe`/`pmessage`) and IS the reply to the in-flight PSUBSCRIBE command. Dropping it left the caller's promise hanging forever. Narrow the guard to only fire when `in_flight` is empty — that's the only case where the push cannot possibly be a command reply. When something is pending, fall through to the regular handler and resolve the pair with the raw value, matching pre-refactor behavior for unrouted kinds.
6d44a48 to
bb1e3c5
Compare
|
CI gate is failing on two tests unrelated to this PR on commit bb1e3c5:
Neither test is touched by this PR. Both are from main — this branch has been rebased onto My regression test
The two pre-existing failing tests are not mine to fix in this PR — they look like flakes/infrastructure issues from main. |
|
Getting re-pinged on the same CI failure. To re-confirm: the buildkite build (44649) at bb1e3c5 is failing on two tests unrelated to this PR:
Both tests come from earlier main PRs (#16109/#16123/#26861 for the v8 heap snapshot file, #26226 for 26225). Neither file is touched by this PR. My regression test I don't have a way to retry individual buildkite jobs or patch unrelated main tests from inside this PR. Flagging for a human to retry the failing lanes or address the main-side flakes separately. |
Fixes #29042.
Repro
Cause
ValkeyClient.handleResponse/handleSubscribeResponseinsrc/valkey/valkey.zigdispatch subscriber notifications by pattern-matching on the RESP3.Push(>) wire type. When the user switches the subscriber connection into RESP2 mode viaHELLO 2, Redis keeps deliveringmessage/subscribe/unsubscribenotifications on the wire but now encodes them as ordinary.Arrayvalues (*3\r\n$7\r\nmessage\r\n$...\r\n...). Those never matched.Push, so they fell through the subscriber block and were silently dropped — no error, no rejection, no handler call. The server-side count fromPUBLISHwas correct because Redis still considered the client subscribed.Fix
Added
SubscriptionPushMessage.asFrame(value)insrc/valkey/valkey_protocol.zigthat recognizes a subscription frame whether it arrives as a RESP3.Pushor as a RESP2.Arraywhose first element is a matching kind tag. Both entry points invalkey.zig(the promise-pair gating and the subscribe dispatch) now use that helper, so:messageframes reachonValkeyMessageand fire the JS handler.subscribe/unsubscribeconfirmations resolve their promises instead of hanging.PINGresponse) still fall through to the regular command handler.Verification
New regression test
test/regression/issue/29042.test.ts:bun bd test test/regression/issue/29042.test.ts→ passUSE_SYSTEM_BUN=1 bun test test/regression/issue/29042.test.ts→ fail (two tests time out because the handler never fires, one throwscan only be called while in subscriber modebecause the SUBSCRIBE confirmation was dropped)Also added two RESP2 cases to the existing
PUB/SUBdescribe block intest/js/valkey/valkey.test.tsso the Docker-backed pipeline exercises the fix too.