gh-151814: Fix unbounded memory growth from repeated empty writes to io.TextIOWrapper
#151817
Open
StanFromIreland wants to merge 1 commit into python:main from StanFromIreland:textio-acc
+39
−19
2 changes: 2 additions & 0 deletions
Misc/NEWS.d/next/Library/2026-06-20-21-15-13.gh-issue-151814.OIbgsO.rst
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| Fix unbounded memory growth in :class:`io.TextIOWrapper` when repeatedly | ||
| writing an empty string. |
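
The pattern described in the NEWS entry can be exercised with a short script. This is a minimal sketch, not the test from the PR: `traced_growth` is a hypothetical helper, and the reported number depends on whether the interpreter includes the fix, so no expected value is given.

```python
import io
import tracemalloc


def traced_growth(n_writes):
    """Return the change in traced memory (bytes) across n_writes empty writes."""
    txt = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(n_writes):
            txt.write("")  # zero-length write: never reaches chunk_size
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


growth = traced_growth(100_000)
print(f"traced growth after 100,000 empty writes: {growth} bytes")
```

On an interpreter without the fix the growth should scale with the number of writes. With the fix it should stay roughly flat.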
Modules/_io/textio.c
@@ -1820,32 +1820,38 @@ _io_TextIOWrapper_write_impl(textio *self, PyObject *text)
         }
     }
 
-    if (self->pending_bytes == NULL) {
-        assert(self->pending_bytes_count == 0);
-        self->pending_bytes = b;
-    }
-    else if (!PyList_CheckExact(self->pending_bytes)) {
-        PyObject *list = PyList_New(2);
-        if (list == NULL) {
+    if (bytes_len > 0) {
Member
Author
Git seems to render the diff poorly. Locally, I see the following with:
$ git show -w HEAD -- Modules/_io/textio.c
commit c6b5163133619febd0fbe8c327e52399b1a54ffd (HEAD -> textio-acc, origin/textio-acc)
Author: Stan Ulbrych <stan@python.org>
Date: Sat Jun 20 21:16:54 2026 +0100
Fix unbounded memory growth from repeated empty writes to io.TextIOWrapper
diff --git a/Modules/_io/textio.c b/Modules/_io/textio.c
index 24e08cec88f..5b2a20a30c2 100644
--- a/Modules/_io/textio.c
+++ b/Modules/_io/textio.c
@@ -1820,6 +1820,7 @@ _io_TextIOWrapper_write_impl(textio *self, PyObject *text)
}
}
+ if (bytes_len > 0) {
if (self->pending_bytes == NULL) {
assert(self->pending_bytes_count == 0);
self->pending_bytes = b;
@@ -1846,6 +1847,11 @@ _io_TextIOWrapper_write_impl(textio *self, PyObject *text)
}
self->pending_bytes_count += bytes_len;
+ }
+ else {
+ Py_DECREF(b);
+ }
+
if (self->pending_bytes_count >= self->chunk_size || needflush ||
text_needflush) {
if (_textiowrapper_writeflush(self) < 0)
cmaloney marked this conversation as resolved.
+        if (self->pending_bytes == NULL) {
+            assert(self->pending_bytes_count == 0);
+            self->pending_bytes = b;
+        }
+        else if (!PyList_CheckExact(self->pending_bytes)) {
+            PyObject *list = PyList_New(2);
+            if (list == NULL) {
+                Py_DECREF(b);
+                return NULL;
+            }
+            // Since Python 3.12, allocating GC object won't trigger GC and release
+            // GIL. See https://github.com/python/cpython/issues/97922
+            assert(!PyList_CheckExact(self->pending_bytes));
+            PyList_SET_ITEM(list, 0, self->pending_bytes);
+            PyList_SET_ITEM(list, 1, b);
+            self->pending_bytes = list;
+        }
+        else {
+            if (PyList_Append(self->pending_bytes, b) < 0) {
+                Py_DECREF(b);
+                return NULL;
+            }
             Py_DECREF(b);
-            return NULL;
         }
-        // Since Python 3.12, allocating GC object won't trigger GC and release
-        // GIL. See https://github.com/python/cpython/issues/97922
-        assert(!PyList_CheckExact(self->pending_bytes));
-        PyList_SET_ITEM(list, 0, self->pending_bytes);
-        PyList_SET_ITEM(list, 1, b);
-        self->pending_bytes = list;
+
+        self->pending_bytes_count += bytes_len;
     }
     else {
-        if (PyList_Append(self->pending_bytes, b) < 0) {
-            Py_DECREF(b);
-            return NULL;
-        }
         Py_DECREF(b);
     }
 
-    self->pending_bytes_count += bytes_len;
     if (self->pending_bytes_count >= self->chunk_size || needflush ||
         text_needflush) {
         if (_textiowrapper_writeflush(self) < 0)
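
To see why the guard in the hunk above matters, the bookkeeping can be approximated with a toy model in Python. This is a sketch under stated assumptions, not CPython's implementation: `PendingWriteModel` and its `skip_empty` switch are hypothetical names, and the `needflush` / `text_needflush` triggers are left out.

```python
class PendingWriteModel:
    """Toy model of the pending-chunk bookkeeping (not CPython's actual code)."""

    def __init__(self, chunk_size=8192, skip_empty=True):
        self.chunk_size = chunk_size
        self.skip_empty = skip_empty  # True models the patched behaviour
        self.pending = []             # stands in for self->pending_bytes
        self.pending_count = 0        # stands in for self->pending_bytes_count
        self.flushed = bytearray()

    def write(self, data):
        # Patched path: only non-empty chunks are queued.
        if data or not self.skip_empty:
            self.pending.append(data)
            self.pending_count += len(data)
        # An empty chunk adds 0 to the count, so this threshold is never hit.
        if self.pending_count >= self.chunk_size:
            self.flush()

    def flush(self):
        self.flushed += b"".join(self.pending)
        self.pending.clear()
        self.pending_count = 0


old = PendingWriteModel(skip_empty=False)
new = PendingWriteModel(skip_empty=True)
for _ in range(10_000):
    old.write(b"")
    new.write(b"")

print(len(old.pending), len(new.pending))  # prints "10000 0"
```

In the unpatched model the list grows by one entry per empty write while the byte count stays at zero, so the flush threshold is never reached.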
This test passes for me without this patch. Maybe expose `pending_bytes` privately, to be able to test that it isn't getting longer?

I wanted a simple little test to stress the patch; it does pass on modern systems with vast amounts of resources. I'm not sure about exposing new attributes, even privately. We could do messy things like `gc.get_referents(txt)`, but I worry it might break in the future for other reasons.

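
For reference, the `gc.get_referents(txt)` idea mentioned above might look roughly like the following. This is only a sketch: `pending_list_length` is a hypothetical helper, and it assumes the C `TextIOWrapper` reports its pending-chunk list to the garbage collector, which is an implementation detail that may not hold across versions.

```python
import gc
import io


def pending_list_length(txt):
    """Best-effort peek at queued chunks via GC referents (CPython-specific).

    Assumption: any list directly referenced by the wrapper is the pending
    chunk list. Returns 0 when no such list is visible.
    """
    lengths = [len(obj) for obj in gc.get_referents(txt) if type(obj) is list]
    return max(lengths, default=0)


txt = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
for _ in range(1000):
    txt.write("")

# The value depends on the interpreter version, so none is asserted here.
print(pending_list_length(txt))
```

The fragility is visible in the helper itself: it has to guess which referent is the buffer, which is the concern raised in the comment.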
👍 to not relying on GC (in theory we should get the zero-length immortal bytes object here; lots of internals).
This test on my machine only peaks at ~700 MB of RAM while taking ~2.5 seconds. Most of that time is in `_pyio`, which looks like it needs a similar improvement (although probably in `BufferedWriter` for that one...).
I'm okay with this test if it's tagged walltime, but would really prefer a test which fails if the empty string path is removed.

I looked at both `_pyio.TextIOWrapper.write`, which writes straight through and has no accumulation, and `BufferedWriter`, which extends a bytearray, so adding an empty string shouldn't be a problem there.
As Zach noted on the issue, this is a "degenerate case"; I don't think the complexity of testing this precisely is warranted here.

In that case I'd prefer no test and just comment well in the code.
If the test doesn't regress, it's not adding a lot of value for the complexity/runtime. For me, on debug Linux, `./python -m test test_io -j12` currently takes 8.3 seconds. Adding 2.5 seconds with this test, which doesn't fail if the new code is removed, isn't worth it.