gh-141968: Use take_bytes to remove copy in _pyio.BytesIO.read() #149850

Open
lgeiger wants to merge 3 commits into python:main from lgeiger:pyio-take-bytes

@lgeiger lgeiger commented May 14, 2026

This removes a copy going from bytearray to bytes in _pyio.BytesIO.read().

import timeit
from statistics import mean, stdev

times = timeit.repeat(
    "bio.seek(0); bio.read()",
    setup="import _pyio; bio = _pyio.BytesIO(b'x' * 1024 * 1024)",
    number=50_000,
    repeat=5,
)
print(f"{mean(times) * 1e3:.2f} ± {stdev(times) * 1e3:.2f} ms")
main:    1403.29 ± 14.36 ms
this PR: 785.91 ± 4.27 ms 
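
For reference, the two totals above can be turned into a speedup factor and a per-call cost. This is only arithmetic on the reported means (each total covers `number=50_000` calls on a 1 MiB buffer); the variable names are illustrative and not part of the PR.

```python
# Totals copied from the benchmark output above (milliseconds per 50,000 calls).
NUMBER = 50_000
main_ms = 1403.29
pr_ms = 785.91

speedup = main_ms / pr_ms               # how many times faster the PR branch is
reduction = 1 - pr_ms / main_ms         # fraction of wall time removed
per_call_main_us = main_ms * 1e3 / NUMBER   # microseconds per seek+read on main
per_call_pr_us = pr_ms * 1e3 / NUMBER       # microseconds per seek+read with the PR

print(f"speedup: {speedup:.2f}x, time saved: {reduction:.0%}")
print(f"per call: {per_call_main_us:.1f} µs -> {per_call_pr_us:.1f} µs")
```

A reduction of a bit under half would be consistent with dropping one of two full-buffer copies per read, though the benchmark alone does not prove that is the whole story.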

Comment thread Lib/_pyio.py
 b = self._buffer[self._pos : newpos]
 self._pos = newpos
-return bytes(b)
+return b.take_bytes()
Contributor Author

As far as I can tell, b = self._buffer[self._pos : newpos] already copies the bytearray, so we can safely use take_bytes() on the result without affecting self._buffer.
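
A quick way to check the claim that the slice is an independent copy: mutate the slice and confirm the original buffer is unchanged. This is a minimal sketch, not code from the PR; `buf`, `pos`, and `newpos` are stand-ins for `self._buffer`, `self._pos`, and `newpos`. It assumes `bytearray.take_bytes()` exists only on recent CPython builds, so it falls back to `bytes(b)` elsewhere.

```python
buf = bytearray(b"hello world")   # stand-in for self._buffer
pos, newpos = 0, 5

b = buf[pos:newpos]               # slicing a bytearray allocates a new bytearray
b[0] = ord("J")                   # mutate only the slice
slice_is_copy = buf == bytearray(b"hello world")   # original is untouched

# Assumption: take_bytes() is only present on newer interpreters.
if hasattr(b, "take_bytes"):
    result = b.take_bytes()       # hands the slice's storage over as bytes
else:
    result = bytes(b)             # older interpreters: one extra copy

print(slice_is_copy, result)
```

Because `b` is a temporary that nothing else references, giving up its contents has no visible effect on the stream's own buffer.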

Comment thread Misc/NEWS.d/next/Library/2026-05-14-22-50-54.gh-issue-141968.eRqeCL.rst Outdated
@lgeiger lgeiger force-pushed the pyio-take-bytes branch from 0893c75 to d67f989 Compare May 14, 2026 23:12
@lgeiger lgeiger force-pushed the pyio-take-bytes branch from d67f989 to d32d3b0 Compare May 14, 2026 23:23
maurycy commented May 15, 2026

cc @cmaloney

@cmaloney cmaloney left a comment

I think your analysis is correct: the slice already makes a copy, so take_bytes() is safe in this case.

I've been experimenting to see if there is a way to avoid the temporary bytearray and go straight from the original buffer to the final bytes, but I haven't found anything simpler so far (it would save allocating the bytearray head). A memoryview can be swapped in, but I don't find that simpler, just a different shape.
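
For illustration, one possible shape of the memoryview variant mentioned here. This is a guess at what such a version could look like, not the code that was actually tried; `SketchBytesIO` is a hypothetical stand-in that only mirrors the `_buffer` / `_pos` attributes of `_pyio.BytesIO`.

```python
class SketchBytesIO:
    """Hypothetical, cut-down stand-in for _pyio.BytesIO (read path only)."""

    def __init__(self, initial=b""):
        self._buffer = bytearray(initial)
        self._pos = 0

    def read(self, size=-1):
        if size is None or size < 0:
            size = len(self._buffer)
        newpos = min(len(self._buffer), self._pos + size)
        # Slicing a memoryview does not copy; bytes() then makes the only copy.
        # The with-block releases the view so the bytearray can be resized later.
        with memoryview(self._buffer) as view:
            data = bytes(view[self._pos:newpos])
        self._pos = newpos
        return data


bio = SketchBytesIO(b"abcdef")
first = bio.read(2)
rest = bio.read()
print(first, rest)
```

This skips the intermediate bytearray entirely, at the cost of the explicit view management, which matches the "different shape, not simpler" assessment.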

Could you remove the news file? I added skip news so PR checks should pass. This is just an optimizing refactor, and _pyio is not widely used, so people are unlikely to see an impact from this. End users only see the C _io implementation. The CPython test suite will run a little faster though, which is nice :)

Also: how did you spot/find this one? (Wondering if there are more tips and tricks we should add to the whatsnew.)
