UN-3522 [FEAT] PG Queue Phase 6a — Barrier interface + CeleryChordBarrier wrapper + lift chord call sites + plumb fairness#2024
Conversation
…rier wrapper + lift chord call sites + plumb fairness
First sub-task of Phase 6 (UN-3504 — Distributed Barrier). Ships the
abstraction layer + chord-call-site lift + fairness plumbing with
**zero runtime behaviour change** — every operation still routes through
``celery.chord(...)`` under the hood. The risky behaviour-change PR
(``RedisDecrBarrier`` replacing chord_unlock polling) is deliberately
deferred to Phase 6b behind a feature flag.
## What this adds
* ``workers/queue_backend/barrier.py``
* ``Barrier`` Protocol — minimum contract every barrier substrate
satisfies (``enqueue(header, callback, fairness=None)`` shape).
* ``BarrierHandle`` Protocol — minimum return-value contract
(``id: str``); celery's ``AsyncResult`` satisfies this; future
substrate handles must too so chord-id logging stays working.
* ``CeleryChordBarrier`` impl — wraps ``chord(header)(body)`` 1:1.
Optional ``FairnessKey`` is stamped as a message header on every
enqueued task AND the callback signature.
* Both production chord call sites lifted behind the barrier:
* ``shared/workflow/execution/orchestration_utils.py:67`` —
``WorkflowOrchestrationUtils.create_chord_execution`` now
delegates to ``CeleryChordBarrier``. Method signature unchanged
for callers; adds optional ``fairness=`` kwarg.
* ``api-deployment/tasks.py:674`` — inline ``chord(...)`` replaced
with a call to ``WorkflowOrchestrationUtils.create_chord_execution``
with ``fairness=FairnessKey(workload_type=WorkloadType.API)``.
* ``general/tasks.py:941`` — existing chord call site passes
``fairness=FairnessKey(workload_type=WorkloadType.NON_API)``.
* Inventory canary in ``tests/test_chord_sites_characterisation.py``
tightened: was "exactly 2 chord call sites in workers/", now
"exactly 1 chord call site, inside ``queue_backend/barrier.py``".
A future PR that adds a direct ``chord(...)`` call outside the
barrier abstraction fails this test loudly.
## What this does NOT change (zero-behaviour-change posture)
* The ``chord(header)(body)`` Celery primitive itself — same two-step
call, same chord_unlock polling, same result backend interaction,
same retry semantics, same callback firing.
* Task execution — same workers consume the same queues.
* Result aggregation — chord still collects all header return values
into a list and passes them to the body.
* Failure handling — same chord error semantics, same link_error path.
* Mixed-version rolling deploys — both directions safe:
* old worker enqueues chord (no fairness header) → new callback
receives → callback doesn't read fairness header → works
* new worker enqueues chord (with fairness header) → old callback
receives → celery passes unknown headers through → works
## Why plumb fairness here
Phase 5.1 added fairness to bare ``dispatch()`` call sites only (2 of
them today). The two ``chord(...)`` call sites bypass ``dispatch()``
entirely — at runtime, MOST workflow-execution tasks fan out through
chord (process_file_batch × N as header tasks), so Phase 5.1 only
covered ~5% of the workflow-execution wire traffic. This PR closes
the remaining ~95%.
Doing it at the barrier lift point (rather than retroinstrumenting the
raw chord call) means future Phase 6b can swap in a ``RedisDecrBarrier``
or ``PgBarrier`` without touching call sites a second time.
## Tests
New ``tests/test_barrier.py`` (12 tests):
* Barrier Protocol shape — ``CeleryChordBarrier`` satisfies the
``Barrier`` Protocol; AsyncResult-shaped handles satisfy
``BarrierHandle``.
* Wire equivalence with the pre-uplift chord call — given the same
args, ``CeleryChordBarrier.enqueue(...)`` produces the same
``chord(header)(body)`` invocation, same ``app.signature(...)``
args, same return value. Locks the byte-for-byte equivalence claim.
* Fairness header plumbing — when ``fairness=`` is passed, the
fairness slot is attached to every enqueued task and the callback.
When ``fairness=None``, no ``headers=`` kwarg is added (preserves
byte-for-byte pre-uplift wire).
* Mixin wrapper invariants preserved — ``WorkflowOrchestrationMixin
.create_chord`` still extracts ``self.app`` and raises on missing
app context.
Updated ``tests/test_chord_sites_characterisation.py``:
* Existing 6 characterisation tests still pass — patch targets
updated to ``queue_backend.barrier.chord``.
* Inventory canary tightened: 2 sites → 1 site (inside barrier.py).
* ``from celery import chord`` canary: 2 files → 1 file.
Updated ``tests/test_queue_backend_seam.py::test_all_exports`` for the
new public surface (``Barrier``, ``BarrierHandle``,
``CeleryChordBarrier``).
## Validation
* Touched test files: 21 / 21 passing
* Full workers suite: 6 failed (pre-existing baseline, unchanged) /
659 passed (was 627 → +32 new tests)
* Five deterministic-order full-suite runs: exactly 6/659 each time
— zero flakiness from this change
## What's deferred
* **Phase 6b (UN-3523):** ``RedisDecrBarrier`` impl + ``WORKER_BARRIER_BACKEND``
env flag (default ``chord`` — opt-in). New code path gated.
* **Phase 6c (UN-3524):** production rollout / verification (config-only,
no code changes).
## Risk
* Wire shape change: each chord member signature + callback signature
now carries a fairness header (~80 bytes / task). Additive — celery
passes unknown headers through; no consumer reads the slot yet.
* All other behaviour identical to direct ``chord(...)`` calls.
* Mixed-version rolling deploy safe (verified above).
* Rollback: a single revert commit removes the abstraction; both call
sites are one-line delegations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WalkthroughThis pull request introduces a CeleryChordBarrier that centralizes chord fan-out + callback execution, propagates optional FairnessKey routing, refactors orchestration utilities to delegate chord creation to the barrier, updates API and general call sites to pass FairnessKey, and extends tests and exports to validate the change. ChangesBarrier abstraction and fairness-aware orchestration
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
| Filename | Overview |
|---|---|
| workers/queue_backend/barrier.py | New file: Barrier Protocol, BarrierHandle Protocol, and CeleryChordBarrier implementation. Empty-header guard is correctly placed first; clone-and-set fairness stamping avoids cross-tenant mutation; exception surface constrained to non-empty path. |
| workers/shared/workflow/execution/orchestration_utils.py | Module-level _BARRIER singleton added; create_chord_execution now delegates to _BARRIER.enqueue(); WorkflowOrchestrationMixin.create_chord gains fairness= kwarg forwarding. Clean refactor with no logic change. |
| workers/api-deployment/tasks.py | Inline chord(...) replaced with WorkflowOrchestrationUtils.create_chord_execution with WorkloadType.API fairness key; adds explicit zero-files dispatch path (unreachable in practice) to preserve pre-Barrier empty-chord contract. |
| workers/general/tasks.py | Adds WorkloadType.NON_API fairness key to the existing create_chord_execution call. Minimal and correct. |
| workers/tests/test_barrier.py | New 12-test file covering protocol shape, wire equivalence, fairness header plumbing, singleton routing, call-site contracts, and the zero-files dispatch path. Good fixture decomposition. |
| workers/tests/test_chord_sites_characterisation.py | Patch targets updated to queue_backend.barrier.chord; inventory canary tightened to 1 site. Tightened regex =\s*chord( misses return chord(...) form — a minor gap in bypass detection. |
| workers/queue_backend/init.py | Adds Barrier, BarrierHandle, CeleryChordBarrier to all and the module imports. Straightforward. |
| workers/tests/test_queue_backend_seam.py | Updates the all surface assertion to include the three new barrier exports. |
Sequence Diagram
sequenceDiagram
participant Caller as api-deployment / general tasks
participant Utils as WorkflowOrchestrationUtils
participant Barrier as CeleryChordBarrier (_BARRIER)
participant Celery as celery.chord
Caller->>Utils: "create_chord_execution(batch_tasks, ..., fairness=FairnessKey)"
Utils->>Barrier: "_BARRIER.enqueue(batch_tasks, ..., fairness=FairnessKey)"
alt batch_tasks is empty
Barrier-->>Utils: None
Utils-->>Caller: None
Caller->>Caller: "dispatch(callback, args=[[]], fairness=...)"
else batch_tasks non-empty
Barrier->>Barrier: stamp x-fairness-key on each header task (clone().set)
Barrier->>Barrier: build callback_signature (app.signature + headers)
Barrier->>Celery: chord(signed_header_tasks)(callback_signature)
Celery-->>Barrier: AsyncResult (BarrierHandle)
Barrier-->>Utils: AsyncResult
Utils-->>Caller: AsyncResult
end
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
workers/tests/test_chord_sites_characterisation.py:287-300
**Canary regex misses `return chord(...)` form**
The tightened regex `=\s*chord\(` only matches assignment form `result = chord(...)`. A future developer who writes `return chord(tasks)(callback)` or a bare `chord(tasks)(callback)` (no assignment) outside `barrier.py` would silently bypass the enforcement. The comment correctly notes this for comment-line false-positives, but `return chord(...)` is real production code, not prose — it would never start with `#` and would escape both the comment-skip guard and the regex. Adding `|\breturn\s+chord\(` to the pattern closes this gap without reintroducing docstring false-positives.
Reviews (2): Last reviewed commit: "UN-3522 [FIX] Remove dead-code 'if False..." | Re-trigger Greptile
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@workers/queue_backend/barrier.py`:
- Around line 158-160: Replace the current exception log that only prints the
message with a full traceback log by using logger.exception(...) in the except
block that catches Exception in the enqueue barrier code path (the except
Exception as e: block that currently calls logger.error(f"Failed to enqueue
barrier: {e}") and re-raises); update that handler so it calls
logger.exception("Failed to enqueue barrier") (or similar) before raising to
capture the stack trace for triage.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 02595d8f-c81b-4a35-a648-66dd584c1cef
📒 Files selected for processing (8)
workers/api-deployment/tasks.pyworkers/general/tasks.pyworkers/queue_backend/__init__.pyworkers/queue_backend/barrier.pyworkers/shared/workflow/execution/orchestration_utils.pyworkers/tests/test_barrier.pyworkers/tests/test_chord_sites_characterisation.pyworkers/tests/test_queue_backend_seam.py
…loud
SonarCloud flagged 7.9% duplicated lines on workers/tests/test_barrier.py
(Phase 6a barrier characterisation suite). Three sources:
1. ``TestWorkflowOrchestrationMixinCreateChord`` class was defined in
both ``test_barrier.py`` AND ``test_chord_sites_characterisation.py``
(same 2 tests, byte-for-byte). The mixin lives on the chord-call-site
characterisation surface, so removed the dup from ``test_barrier.py``
— coverage is preserved by the copy in
``test_chord_sites_characterisation.py``.
2. ``_make_app`` helper was defined twice (one per test class) with the
same body. Extracted to a module-level ``app`` pytest fixture used
by every test that drives ``CeleryChordBarrier.enqueue(...)``.
3. ``with patch("queue_backend.barrier.chord") as mock_chord:`` appeared
in 8 separate tests with identical patch target. Extracted to a
``mock_chord`` pytest fixture that patches once and yields the mock;
each test configures behaviour as needed (``return_value`` /
``side_effect``).
LOC delta: -194 / +103 = 91 LOC net removed.
Validation:
* ``tests/test_barrier.py``: 10/10 passing
* ``tests/test_chord_sites_characterisation.py``: 9/9 passing
* Full workers suite: 6 pre-existing failures / 657 passed (was 659 →
-2 from removing the duplicate ``TestWorkflowOrchestrationMixinCreateChord``
methods; identical coverage retained in the characterisation suite)
* Five deterministic-order full-suite runs: exactly 6/657 each time —
zero flakiness from this refactor
No production code changed; this is test-file dedup only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rCloud python:S8572)
SonarCloud flagged the ``except Exception`` block at
``workers/queue_backend/barrier.py:158`` for using ``logger.error(...)``
instead of ``logger.exception(...)``.
``logger.exception()`` automatically attaches the traceback to the log
record — needed for debugging broker outages or serialisation failures
at the chord entry point. The previous ``logger.error(f"...: {e}")``
only logged the exception's ``str()`` repr, losing the stack.
The exception is still re-raised so callers see the original traceback,
but the log emitted on the way out now carries the full context.
No behaviour change — same exception path, same re-raise. Tests still
pass (19/19 barrier + chord-sites tests; full workers suite 6 failed /
657 passed — baseline unchanged).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
muhammad-ali-e
left a comment
There was a problem hiding this comment.
PR Review Toolkit — synthesised findings (Phase 6a Barrier uplift)
6 agents reviewed the 8 changed files (Code Reviewer, Silent Failure Hunter, Type Design Analyzer, PR Test Analyzer, Comment Analyzer, Code Simplifier). Regression risk was the headline ask; I focused there first.
No P0 blockers. Two P1 items below have real regression potential — the zero-files divergence in the API-deployment caller and the in-place mutation of caller-owned batch_tasks. Everything else is P2 hygiene (claim accuracy, missing pin-tests, stale references).
Findings posted as inline comments. Duplicates of the existing CodeRabbit / Greptile comments on barrier.py:160, barrier.py:135, api-deployment/tasks.py:682 were intentionally dropped.
Prioritised summary
P1 — regression risk
api-deployment/tasks.py:700— Zerobatch_tasksnow raises and flips execution toERROR. The general/tasks.py caller has an explicit zero-files SUCCESS branch (line 957-1013); api-deployment doesn't. Divergent behaviour for an edge case that didn't exist pre-uplift (oldchord([])(...)returned a truthy AsyncResult — callback fired with[]).barrier.py:144—task.set(headers=fairness_headers)mutates the caller-providedbatch_taskslist in place. No caller reuses the list today, but the helper's docstring doesn't warn — a future retry path that re-uses the list could cross-pollinate fairness headers across tenants.
P2 — claim accuracy / test gaps / hygiene
3. barrier.py:18 — "byte-identical wire output" claim holds only for fairness=None; both production call sites pass fairness, so the wire is additive (acceptable, but the docstring overstates).
4. orchestration_utils.py:22 — _BARRIER singleton not asserted by any test; refactor that bypasses it stays green.
5. test_chord_sites_characterisation.py:295 — Inventory regex (?:^|[\s=(])chord\( will flag a docstring mention of chord(...) in barrier.py and break CI on documentation edits.
6. test_barrier.py:299 — No test pins WorkloadType.API at api-deployment call site or NON_API at general call site; an API↔NON_API swap goes undetected.
7. orchestration_utils.py:330 — WorkflowOrchestrationMixin.create_chord does not forward fairness=. Unused in production today, but kept around — either extend its signature or delete the mixin.
Notes (not posted as inline)
_BARRIERsingleton is safe under Celery prefork; class is stateless. ✓- Lazy import of
WorkflowOrchestrationUtilsin_run_workflow_apiis intentional (likely circular-import dodge); Greptile already flagged it. ✓ BarrierHandle | Nonereturn type collapses with callerif not result:pattern — fine forAsyncResult(truthy) but a future substrate returning a falsy handle would silently regress. A sentinel/Result type would be sturdier (Type Design Analyzer's recommendation).- Stale
PR #13references attest_chord_sites_characterisation.py:38, 101, 186— not inline-commentable (lines outside the diff hunks) — please sweep when convenient; the uplift is done, no need to forward-reference it. - Several comments duplicate code intent (
tasks.py:672-679,barrier.py:18-26phase choreography). Comment Analyzer recommended trimming; deferring as taste call.
Review generated via /pr-review-toolkit. 5 of the 6 agents flagged at least one item; full agent transcripts available on request.
…ngs (9 fixes) Comprehensive response to the multi-source review pass on PR #2024. All fixes preserve zero behaviour change vs pre-PR-2024 main — including the previously-theoretical zero-files API regression. ## Production code fixes * **G1 — Empty-header guard moved before signature construction** (`barrier.py`). Zero-task runs now skip callback signature build + fairness serialisation entirely; non-empty path unchanged. Constrains the ``try`` block's exception surface to the branch where work actually happens. * **G2 — Removed redundant inline import of ``WorkflowOrchestrationUtils``** in ``api-deployment/tasks.py``. Already imported at module level (line 25); the deferred import inside the ``try`` block was a vestige of the pre-Barrier ``from celery import chord`` pattern. * **T1 — Zero-files API path now matches pre-Barrier behaviour byte-for-byte**. Before the Barrier uplift, the inline ``chord(empty)(callback)`` call returned a truthy ``AsyncResult`` and Celery fired the callback immediately with ``[]``. Post-uplift the Barrier returns ``None`` for empty headers, and the existing ``if not result: raise`` mapped zero-files runs to ``ExecutionStatus.ERROR``. New defensive branch in ``api-deployment/tasks.py`` explicitly dispatches ``process_batch_callback_api`` with ``args=[[]]`` (body=[] matches Celery's chord-empty semantic) via ``queue_backend.dispatch`` — same seam every other dispatch uses, picks up fairness routing the same way. Unreachable in practice (upstream requires non-empty ``created_files``) but closes the theoretical regression entirely. * **T3 — Header signatures stamped via ``Signature.clone().set(headers=...)`` instead of in-place ``.set(...)``**. Removes cross-tenant header-leakage foot-gun where a future retry path / signature cache could re-use a header_tasks list with a different ``FairnessKey``. When ``fairness=None`` the clone is skipped — the ``chord(...)`` call receives the caller's signatures unchanged, preserving byte-for- byte pre-Barrier wire. * **T7 — ``WorkflowOrchestrationMixin.create_chord`` forwards ``fairness=`` kwarg** to ``create_chord_execution``. Backward- compat: default is ``None`` so existing callers don't break. ## Docstring + comment fixes * **T2 — Docstring qualified.** "Byte-identical wire output" is true only when ``fairness=None``. Both production call sites pass a ``FairnessKey`` so post-uplift wire has the additive ``x-fairness-key`` AMQP header; qualified the claim and the "Mixed-version rolling deploys" claim two lines down. ## Test improvements * **T4 — ``_BARRIER`` singleton routing pinned** by new ``TestOrchestrationUtilsRoutesThroughSingleton``. Patches ``orchestration_utils._BARRIER`` and asserts ``enqueue.assert_called_once_with(...)``. Locks the singleton as the single dispatch point ahead of Phase 6b's factory swap. * **T5 — Inventory canary regex tightened** (``test_chord_sites_characterisation.py``). Was ``(?:^|[\s=(])chord\(`` which could match docstring prose. Now requires assignment form ``=\s*chord\(``. Comment lines (lstrip + startswith "#") explicitly skipped. * **T6 — Call-site fairness contracts pinned** by new ``TestCallSiteFairnessContracts``. ``api-deployment/tasks.py`` → ``WorkloadType.API`` + ``org_id=str(schema_name)``; ``general/tasks.py`` → ``WorkloadType.NON_API`` + ``org_id=organization_id``. A swap goes undetected by the isolated barrier tests. * **T8 — Zero-files contract pinned** by new ``TestApiDeploymentZeroFilesContract``. Asserts the defensive branch dispatches via the seam (``dispatch(...)``) with body=[], not raw ``send_task``. ## CodeRabbit / SonarCloud already addressed * CodeRabbit's ``logger.exception`` suggestion + SonarCloud's ``python:S8572`` finding both addressed in ``13c10d9fe``. ## Validation * Touched test modules: 23 / 23 passing (was 19 → +4 new tests) * Full workers suite: 6 failed (pre-existing baseline, unchanged) / 661 passed (was 657 → +4 new tests) * Five deterministic-order runs: exactly 6/661 each time — zero flakiness * The dispatch-sites canary (``test_dispatch_sites_characterisation``) caught my initial T1 fix using ``app.send_task`` directly; refactored to ``queue_backend.dispatch`` so the seam stays enforced. ## Production regression risk After this commit + parent PR, vs. pre-PR-2024 main: * Wire size: +~80 bytes per chord member task (additive fairness header). Negligible at 130K scale. * All other paths byte-identical: API chord (non-empty + zero- files), general/ETL chord, chord_unlock polling, result backend + callback firing, backend Django chord sites, mixed-version rolling deploy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…SonarCloud python:S5797 + python:S1481)
SonarCloud flagged two issues on the same line of
``test_api_deployment_declares_api_workload_with_schema_name``:
* **python:S5797** (constant condition) — ``importlib.import_module("api-deployment.tasks") if False else None`` always evaluates the ``else`` branch.
* **python:S1481** (unused local) — ``api_tasks`` was never read after assignment.
Both flag the same vestigial line I left in while figuring out how
to read ``api-deployment/tasks.py`` (the dash in the directory name
isn't a valid Python identifier, so ``import api-deployment.tasks``
doesn't work — the test reads the file by path instead). The
``if False`` was a placeholder I forgot to delete.
Cleaned up:
* Removed the dead ``api_tasks = ... if False else None`` line.
* Removed the now-unused ``import importlib`` and
``import importlib.util`` imports.
* Folded the "not a valid Python identifier" comment into the
test docstring where it documents the import-by-path approach.
Behaviour unchanged — test still reads ``api-deployment/tasks.py``
by path and asserts the same FairnessKey contract. 14/14 tests in
test_barrier.py still pass; full workers suite still at 6 pre-
existing failures / 661 passed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Unstract test resultsPer-group results
Critical paths
|



Summary
First sub-task of Phase 6 (UN-3504 — Distributed Barrier). Ships the abstraction layer + chord-call-site lift + fairness plumbing with zero runtime behaviour change — every operation still routes through
celery.chord(...)under the hood. The risky behaviour-change PR (RedisDecrBarrierreplacing chord_unlock polling) is deliberately deferred to Phase 6b (UN-3523) behind a feature flag.What this adds
workers/queue_backend/barrier.pyBarrierProtocol — minimum contract every barrier substrate satisfies.BarrierHandleProtocol — minimum return-value contract (id: str); celery'sAsyncResultsatisfies this.CeleryChordBarrierimpl — wrapschord(header)(body)1:1. OptionalFairnessKeystamped as a message header on every enqueued task AND the callback.Both production chord call sites lifted behind the barrier:
shared/workflow/execution/orchestration_utils.py:67—WorkflowOrchestrationUtils.create_chord_executionnow delegates toCeleryChordBarrier. Adds optionalfairness=kwarg.api-deployment/tasks.py:674— inlinechord(...)replaced with the helper call, passingfairness=FairnessKey(workload_type=API).general/tasks.py:941— existing call site now passesfairness=FairnessKey(workload_type=NON_API).Inventory canary tightened in
tests/test_chord_sites_characterisation.py: was "exactly 2 chord call sites in workers/", now "exactly 1 chord call site, insidequeue_backend/barrier.py". A future PR that adds a directchord(...)call outside the abstraction fails this test loudly.Why plumb fairness here
Phase 5.1 added fairness to bare
dispatch()call sites only (2 of them today). The twochord(...)call sites bypassdispatch()entirely — at runtime, MOST workflow-execution tasks fan out through chord (process_file_batch × N), so Phase 5.1 only covered ~5% of the workflow-execution wire traffic. This PR closes the remaining ~95%.Doing it at the barrier lift point (rather than retroinstrumenting the raw chord call) means a future Phase 6b can swap in a
RedisDecrBarrierorPgBarrierwithout touching call sites a second time.What this does NOT change (zero-behaviour-change posture)
chord(header)(body)Celery primitive itself — same two-step call, same chord_unlock polling, same result backend interaction, same retry semantics, same callback firing.link_errorpath.Mixed-version rolling deploys
Safe in both directions:
Tests
New
tests/test_barrier.py(12 tests):CeleryChordBarriersatisfies theBarrierProtocol; AsyncResult-shaped handles satisfyBarrierHandle.CeleryChordBarrier.enqueue(...)produces the samechord(header)(body)invocation, sameapp.signature(...)args, same return value. Locks the byte-for-byte equivalence claim.fairness=is passed, the fairness slot is attached to every enqueued task and the callback. Whenfairness=None, noheaders=kwarg is added (preserves byte-for-byte pre-uplift wire).WorkflowOrchestrationMixin.create_chordstill extractsself.appand raises on missing app context.Updated
tests/test_chord_sites_characterisation.py:queue_backend.barrier.chord.barrier.py).from celery import chordcanary: 2 files → 1 file.Updated
tests/test_queue_backend_seam.py::test_all_exportsfor the new public surface (Barrier,BarrierHandle,CeleryChordBarrier).Validation
What's deferred
RedisDecrBarrierimpl +WORKER_BARRIER_BACKENDenv flag (defaultchord— opt-in). New code path gated.Risk
chord(...)calls.Test plan
cd workers && uv run pytest tests/test_barrier.py tests/test_chord_sites_characterisation.py→ 21 passedcd workers && uv run pytest --no-cov -p no:randomly→ 6 pre-existing failures, 659 passedx-fairness-keyheader is present and well-formed🤖 Generated with Claude Code