gh-151640: avoid sharing BytesIO buffer in free-threaded builds #151651

Open
pedramkarimii wants to merge 3 commits into python:main from pedramkarimii:gh-151640-bytesio-ft-race
@pedramkarimii

Fixes gh-151640.

In free-threaded builds, BytesIO.read() could return self->buf directly for whole-buffer reads via Py_NewRef(). BytesIO.getvalue() could also return the internal buffer directly. A concurrent writer may then resize the same internal bytes object while another thread decrefs the returned reference, producing a TSAN-reported race between _PyBytes_Resize() and _Py_DecRefShared().

This change avoids exposing the internal BytesIO buffer in Py_GIL_DISABLED builds. Whole-buffer read() and getvalue() now return a copy in free-threaded builds, while keeping the existing fast path unchanged for regular GIL builds.
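From Python code the change should be invisible apart from the extra copy: the bytes object returned by a whole-buffer read() or getvalue() behaves as a snapshot in both build types. A minimal sketch of that contract follows; the helper name `snapshot_survives_write` is made up for illustration and is not part of the patch, and the build check relies on `sysconfig.get_config_var("Py_GIL_DISABLED")`.

```python
import io
import sysconfig

# True on free-threaded (Py_GIL_DISABLED) builds, False otherwise.
FREE_THREADED = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def snapshot_survives_write():
    """Return (read_result, getvalue_result, final_contents).

    Hypothetical helper: takes both kinds of snapshot, then
    overwrites and grows the stream to show they are unaffected.
    """
    bio = io.BytesIO(b"abc")
    whole = bio.read()        # whole-buffer read from position 0
    value = bio.getvalue()    # full contents, independent of position
    bio.seek(0)
    bio.write(b"x" * 4096)    # overwrite and grow the internal buffer
    return whole, value, bio.getvalue()


whole, value, final = snapshot_survives_write()
print(FREE_THREADED, whole, value, len(final))
```

Whether `whole` and `value` share storage with the stream is an implementation detail; only their contents are observable here, and those stay `b"abc"` after the write.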

This PR also adds a focused free-threading regression test that runs whole-buffer read()/getvalue() calls concurrently with buffer-resizing writes.
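The access pattern such a test exercises can be sketched roughly as below. This is not the test added by the PR; `stress_bytesio` and its parameters are hypothetical, and the race itself would only be reported when the code runs on a free-threaded build under TSAN. On a regular GIL build the sketch simply completes without errors.

```python
import io
import threading


def stress_bytesio(rounds=200, readers=2):
    """Race whole-buffer read()/getvalue() against resizing writes.

    Returns the list of exceptions raised in worker threads
    (expected to be empty).
    """
    bio = io.BytesIO()
    errors = []
    start = threading.Barrier(readers + 1)  # release all threads together

    def writer():
        try:
            start.wait()
            for i in range(rounds):
                bio.seek(0)
                bio.truncate(0)
                bio.write(b"x" * (64 + i * 16))  # a larger write each round
        except Exception as exc:
            errors.append(exc)

    def reader():
        try:
            start.wait()
            for _ in range(rounds):
                bio.seek(0)
                data = bio.read()      # whole-buffer read
                assert isinstance(data, bytes)
                data = bio.getvalue()  # may expose the same buffer
                assert isinstance(data, bytes)
                del data               # drop the reference right away
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=writer)]
    threads += [threading.Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


print(stress_bytesio())
```

The sketch makes no assertion about the contents of `data`, because the threads share one stream position and interleave freely; it only checks that each call returns a bytes object and that no thread raises.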

Tests run:

  • ./python -m test test_free_threading.test_io -v -m test_concurrent_whole_buffer_read_and_resize
  • ./python -m test test_free_threading.test_io -v
  • ./python -m test test_io -v
  • ./python -m test test_free_threading -v
  • ./python ../gh151640_reproducer.py

@bedevere-app

bedevere-app Bot commented Jun 18, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@python-cla-bot

python-cla-bot Bot commented Jun 18, 2026

All commit authors signed the Contributor License Agreement.

@pedramkarimii

Author

Additional context:

This PR is intended to fix the specific _io.BytesIO free-threaded race reported in gh-151640. The issue report focuses on the whole-buffer BytesIO.read() fast path returning self->buf directly via Py_NewRef(), while a concurrent writer may later resize that same internal bytes object.

While investigating the code path, I also found that BytesIO.getvalue() can expose the same internal buffer in a similar way. For that reason, this PR handles both whole-buffer read() and getvalue() in Py_GIL_DISABLED builds.

The chosen fix is intentionally conservative: in free-threaded builds, avoid returning the internal BytesIO buffer directly to user code, while keeping the existing fast path unchanged for regular GIL builds. This avoids changing behavior/performance for non-free-threaded builds and keeps the patch localized to _io.BytesIO.

I also considered fixing this lower in the resize/shared-buffer ownership path, but that seemed more invasive because it would require changing the interaction between SHARED_BUF(), _PyBytes_Resize(), and buffer ownership/copy-on-write behavior. If maintainers prefer preserving this optimization in free-threaded builds, I’m happy to rework the patch in that direction.

Development

Successfully merging this pull request may close these issues.

Data race between resize_buffer_lock_held and _Py_DecRefShared on a shared BytesIO buffer
