Skip to content

Zend: Fix swapped arguments in MSVC InterlockedCompareExchange CAS#21759

Open
EdmondDantes wants to merge 1 commit intophp:masterfrom
true-async:fix-msvc-atomic-cas-argument-order
Open

Zend: Fix swapped arguments in MSVC InterlockedCompareExchange CAS#21759
EdmondDantes wants to merge 1 commit intophp:masterfrom
true-async:fix-msvc-atomic-cas-argument-order

Conversation

@EdmondDantes
Copy link
Copy Markdown

Summary

The MSVC fallback implementations of zend_atomic_bool_compare_exchange_ex and zend_atomic_int_compare_exchange_ex in Zend/zend_atomic.h pass their arguments to InterlockedCompareExchange* in the wrong order.

/* before */
bool prev = (bool) InterlockedCompareExchange8(&obj->value, *expected, desired);
int  prev = (int)  InterlockedCompareExchange (&obj->value, *expected, desired);

The WinAPI signature is:

InterlockedCompareExchange(Destination, Exchange, Comparand)

i.e. (new, old). Every other CAS intrinsic used in this file — C11 __c11_atomic_compare_exchange_strong, GCC __atomic_compare_exchange, legacy __sync_val_compare_and_swap — takes (expected, desired) i.e. (old, new). The MSVC branch was written as if it followed the same convention, so Exchange and Comparand ended up swapped.

Effect:

  • The hardware CAS compares the destination against desired (the intended new value) instead of *expected.
  • On the (rare) success path, it writes *expected (the intended old value) back into the destination — essentially a no-op.
  • prev comes back holding the destination's actual value, which is then compared against *expected:
    • If they already matched (the destination already held the expected value), the function returns true without having performed any real exchange. Callers observe "success" but the store never happened.
    • Otherwise it updates *expected and returns false, so CAS loops do eventually converge — but every iteration is wasted and the store on the final iteration is still the no-op above.

In short: on this code path CAS silently degrades to "load, maybe report success, never store".

How it was found

While working on a downstream project I noticed zend_atomic_int_fetch_add-style refcount updates silently failing on a Windows MSVC build. Tracing it back, every CAS iteration was reporting success without actually updating the value, which pointed at this function. Swapping the two arguments to match the WinAPI order fixes the refcount behaviour.

Why it has gone unnoticed

This branch is selected by:

#if defined(ZEND_WIN32) && !defined(HAVE_C11_ATOMICS)

On modern MSVC + clang-cl builds HAVE_C11_ATOMICS is defined and the __c11_atomic_* branch is used instead, which is correct. The buggy code is only reached on MSVC configurations that do not advertise C11 atomics support (older toolchains, or builds where the feature test fails for whatever reason). Callers that only use zend_atomic_*_load / store / exchange are unaffected — those wrap InterlockedOr / InterlockedExchange, which only take one data argument and are correct. It's specifically the compare_exchange helpers (and therefore anything built on top of them, e.g. fetch-add loops) that are broken.

The fix

Swap Exchange and Comparand so the call matches the documented WinAPI contract:

/* after */
bool prev = (bool) InterlockedCompareExchange8(&obj->value, desired, *expected);
int  prev = (int)  InterlockedCompareExchange (&obj->value, desired, *expected);

No behavioural change on any non-MSVC / C11-atomics build.

Suspicions / things worth a second look

A few things I'd like more eyes on, since I only have one MSVC toolchain in front of me:

  1. Is the !HAVE_C11_ATOMICS MSVC branch still reachable in supported builds? If every supported MSVC configuration now goes through HAVE_C11_ATOMICS, this bug is latent — worth fixing anyway, but it would be good to know whether it actually manifests in CI or only in the wild. If the branch is dead code, it might be cleaner to delete it entirely in a follow-up.
  2. Memory ordering. InterlockedCompareExchange* on x86/x64 is a full barrier, so seq-cst semantics are preserved. I haven't audited ARM64 MSVC — someone more familiar with that target should double-check there isn't an implicit assumption baked in here.
  3. Test coverage. There doesn't appear to be any unit-level test for zend_atomic_*_compare_exchange_ex under this specific #if branch, which is why this slipped through. Might be worth adding one; happy to do it in a follow-up if reviewers agree.
  4. bool round-trip through InterlockedCompareExchange8. InterlockedCompareExchange8 operates on char. The cast (bool) InterlockedCompareExchange8(...) is fine for the return, but the *expected / desired arguments are implicitly widened from bool (0/1) to char, which is well-defined. Just flagging it for completeness.

Test plan

  • CI on MSVC build with HAVE_C11_ATOMICS defined — should be a no-op (branch not compiled).
  • CI on MSVC build without HAVE_C11_ATOMICS, if such a configuration still exists.
  • Existing Zend atomic-related tests continue to pass on Linux / macOS (branch not compiled there either).
  • Manual: a CAS-based refcount increment on Windows MSVC actually increments the refcount after this change.

The MSVC fallback implementations of zend_atomic_bool_compare_exchange_ex
and zend_atomic_int_compare_exchange_ex passed *expected as the Exchange
argument and desired as the Comparand argument to InterlockedCompareExchange*.

The WinAPI signature is InterlockedCompareExchange(Destination, Exchange,
Comparand) — i.e. (new, old) — whereas the C11 / __atomic / __sync CAS
intrinsics take (expected, desired) i.e. (old, new). The two arguments
were therefore swapped: the function wrote the old value back on success
and compared against the new value, so the CAS became a no-op in every
case except the degenerate *expected == desired one (where it still
reported success without having performed a real exchange).

On x86/x64 with HAVE_C11_ATOMICS this code path is not compiled, which
is why the bug has gone unnoticed. It is reachable on MSVC builds that
do not define HAVE_C11_ATOMICS (e.g. older toolchains or the !C11 branch
selected by the preprocessor), where any caller relying on CAS semantics
— notably zend_atomic_int_fetch_add-style loops — silently fails to
update the value.

Swap Exchange and Comparand to match the documented WinAPI order.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant