fix(rtc): avoid KeyError in local_track_unpublished handler during teardown#692
Merged
davidzhao merged 3 commits intoJun 8, 2026
Merged
Conversation
…ardown Room._on_room_event did an unchecked `self.local_participant.track_publications[sid]` lookup on the `local_track_unpublished` branch. LocalParticipant.unpublish_track removes the publication from `_track_publications` when its FFI async response is processed, and that response races the `local_track_unpublished` room event. When the response is handled first, the SID is already gone and the handler raises KeyError, which _listen_task logs as a spurious ERROR with a full traceback on every affected disconnect cleanup. Pop the publication defensively in the handler and only emit when it is still tracked, mirroring the remote `track_unpublished` and `local_track_republished` handlers. This also folds removal into the handler, fixing a latent leak where a server-forced unpublish (no unpublish_track call) left the publication in the dict. Because the handler now owns removal, the common (non-racing) ordering is that the room event is processed before unpublish_track's pop. Make unpublish_track's pop tolerant (`pop(track_sid, None)` + guarded `_track = None`) so it does not raise a new KeyError once the handler has already removed the publication. Fixes livekit#681 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2dbb047422
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
… race Address two issues from review of the local_track_unpublished fix: 1. mypy strict (CI gate) flagged room.py: `pop(sid, None)` was assigned to `lpublication`, which is also bound to `track_publications[sid]` (non-optional LocalTrackPublication) in sibling branches of the same method. mypy fixes the variable's type from that first binding and pushes it as the expected return type into `pop`, selecting the `pop(key, default: _VT) -> _VT` overload and rejecting None. Use a fresh variable with `.get()` + conditional `del`, mirroring the existing local_track_republished handler, which already passes the gate. 2. With the handler now removing the publication first in the common ordering, unpublish_track's `pop(track_sid, None)` returned None and skipped the `_track = None` cleanup, leaving a LocalTrackPublication reference held by application code reporting a stale non-None track after `await unpublish_track(...)` completed. Capture the publication reference before the FFI round-trip so the track is cleared once unpublish completes regardless of which path removes it from the dict. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
Author
|
@cloudwebrtc @lukasIO @xianshijing-lk The failing checks seem to be due to flakiness. Please let me know, if something needs to be done to merge this PR. |
davidzhao
approved these changes
Jun 8, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #681 — #681
Problem
Room._on_room_eventdoes an uncheckedself.local_participant.track_publications[sid]lookup on thelocal_track_unpublishedbranch (room.py).LocalParticipant.unpublish_trackremoves the publication from_track_publicationswhen its FFI async response is processed, and that response races thelocal_track_unpublishedroom event. When the response is handled first, the SID is already gone by the time the event is dispatched and the handler raisesKeyError._listen_taskcatches it and logserror running user callback for local_track_unpublished: ...with a full traceback, so the worker does not crash — but every affected disconnect cleanup produces a spurious ERROR log. In a production agent fleet this fired ~18×/24h on early/CLIENT_INITIATEDdisconnects (~1 in several hundred). Thelocal_track_unpublishedevent is also silently dropped when the race fires.This is the only
local_track_*handler that crashes on a missing SID —track_unpublished(remote) andlocal_track_republished(#655) already tolerate it.Fix
room.py— handler: pop defensively and only emit when present, mirroring the remotetrack_unpublishedandlocal_track_republishedhandlers (with adebuglog for the dropped case):This also folds removal into the handler, fixing a latent leak where a server-forced unpublish (no
unpublish_trackcall) previously left the publication in the dict forever.participant.py—unpublish_track: make its pop tolerant.This second change is required, not cosmetic. The KeyError is rare (~1 in several hundred), which means the common ordering is the room event being processed while the SID is still present — i.e. before
unpublish_track's pop. Once the handler itself starts popping, in that common path the old uncheckedpop(track_sid)inunpublish_trackwould raise a newKeyErroron nearly every unpublish. Making it tolerant prevents simply relocating the crash.Behavior across both race orderings
unpublish_trackgetsNone, skips — no crash, event still delivered.unpublish_trackpops + clears_track; handler getsNone, skips emit — no crash. The event is dropped, but it was crashing (never emitting) before, so this is strictly better.Testing
ast-parsed both files; no new lint/type diagnostics._on_room_event(existing tests are e2e and require a live server), so no test was added. Happy to add one with a mocked participant if preferred.🤖 Generated with Claude Code