gh-151814: Fix unbounded memory growth from repeated empty writes to io.TextIOWrapper#151817

Open
StanFromIreland wants to merge 1 commit into
python:mainfrom
StanFromIreland:textio-acc

@StanFromIreland StanFromIreland commented Jun 20, 2026

Member

Comment thread Modules/_io/textio.c
else if (!PyList_CheckExact(self->pending_bytes)) {
PyObject *list = PyList_New(2);
if (list == NULL) {
if (bytes_len > 0) {

@StanFromIreland StanFromIreland Jun 20, 2026

Member Author

Git seems to render the diff poorly; locally, with -w (--ignore-all-space), I see:

$ git show -w HEAD -- Modules/_io/textio.c
commit c6b5163133619febd0fbe8c327e52399b1a54ffd (HEAD -> textio-acc, origin/textio-acc)
Author: Stan Ulbrych <stan@python.org>
Date:   Sat Jun 20 21:16:54 2026 +0100

    Fix unbounded memory growth from repeated empty writes to io.TextIOWrapper

diff --git a/Modules/_io/textio.c b/Modules/_io/textio.c
index 24e08cec88f..5b2a20a30c2 100644
--- a/Modules/_io/textio.c
+++ b/Modules/_io/textio.c
@@ -1820,6 +1820,7 @@ _io_TextIOWrapper_write_impl(textio *self, PyObject *text)
         }
     }
 
+    if (bytes_len > 0) {
         if (self->pending_bytes == NULL) {
             assert(self->pending_bytes_count == 0);
             self->pending_bytes = b;
@@ -1846,6 +1847,11 @@ _io_TextIOWrapper_write_impl(textio *self, PyObject *text)
         }
 
         self->pending_bytes_count += bytes_len;
+    }
+    else {
+        Py_DECREF(b);
+    }
+
     if (self->pending_bytes_count >= self->chunk_size || needflush ||
         text_needflush) {
         if (_textiowrapper_writeflush(self) < 0)
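
To make the effect of the guard concrete, here is a small Python model of the pending-bytes bookkeeping. It is a sketch under assumptions, not the C implementation: `PendingBuffer`, `skip_empty`, and `pending_entries_after_empty_writes` are hypothetical names, and the list stands in for `self->pending_bytes`.

```python
class PendingBuffer:
    """Toy model of the pending-bytes accumulation (not CPython code)."""

    def __init__(self, chunk_size=8192, skip_empty=True):
        self.chunk_size = chunk_size
        self.skip_empty = skip_empty   # True models the `bytes_len > 0` guard
        self.pending = []              # stands in for self->pending_bytes
        self.pending_count = 0         # stands in for self->pending_bytes_count
        self.flushed = bytearray()

    def write(self, data: bytes) -> None:
        # Without the guard, an empty chunk is still appended, but it never
        # raises pending_count, so the flush threshold is never reached.
        if data or not self.skip_empty:
            self.pending.append(data)
            self.pending_count += len(data)
        if self.pending_count >= self.chunk_size:
            self.flush()

    def flush(self) -> None:
        self.flushed += b"".join(self.pending)
        self.pending.clear()
        self.pending_count = 0


def pending_entries_after_empty_writes(n: int, skip_empty: bool) -> int:
    """Number of queued chunks after n empty writes."""
    buf = PendingBuffer(skip_empty=skip_empty)
    for _ in range(n):
        buf.write(b"")
    return len(buf.pending)
```

In this model the queue grows by one entry per empty write when the guard is off and stays empty when it is on, which mirrors the growth described in the PR title.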

@StanFromIreland StanFromIreland changed the title gh-151814: Fix unbounded memory growth from repeated empty writes to io.TextIOWr… gh-151814: Fix unbounded memory growth from repeated empty writes to io.TextIOWrapper Jun 20, 2026
Comment thread Modules/_io/textio.c
self.assertRaises(TypeError, txt.writelines, None)
self.assertRaises(TypeError, txt.writelines, b'abc')

def test_write_empty_stress(self):

Contributor

This test passes for me without this patch. Maybe expose pending_bytes privately, so a test can check that it isn't getting longer?

Member Author

This test passes for me without this patch.

I wanted a simple little test to stress the patch; it does pass on modern systems with vast amounts of resources. I'm not sure about exposing new attributes, even privately. We could do messy things like gc.get_referents(txt), but I worry it might break in the future for other reasons.
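
For illustration, the `gc.get_referents(txt)` idea could look roughly like the sketch below. It assumes the C `TextIOWrapper` exposes its pending list to the GC as a direct referent, which is an implementation detail and exactly why this approach is fragile. `longest_referent_list` is a hypothetical helper name.

```python
import gc
import io


def longest_referent_list(n_empty_writes: int) -> int:
    """Length of the longest list directly referenced by a TextIOWrapper
    after n empty writes.

    Assumption: the pending chunks live in a plain list that the wrapper's
    traverse function reports. Nothing in the public API guarantees this.
    """
    txt = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
    for _ in range(n_empty_writes):
        txt.write("")
    lengths = [len(obj) for obj in gc.get_referents(txt) if type(obj) is list]
    return max(lengths, default=0)
```

A test built on this would compare the result against a small bound, but it would silently stop checking anything if the internal representation changed.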

Contributor

👍 to not relying on GC (in theory we should get the zero-length immortal bytes object here; lots of internals).

This test on my machine only peaks at ~700 MB of RAM while taking ~2.5 seconds. Most of that time is in _pyio, which looks like it needs a similar improvement (although probably in BufferedWriter for that one...).

I'm okay with this test if it's tagged walltime but would really prefer a test which fails if the empty string path is removed.
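
One possible shape for a test that is sensitive to the empty-string path is to measure allocation growth with `tracemalloc` instead of relying on the machine running out of memory. This is only a sketch: `growth_from_empty_writes` is a hypothetical helper, and any threshold a real test would assert against is left open here.

```python
import io
import tracemalloc


def growth_from_empty_writes(n: int) -> int:
    """Bytes of traced memory gained while performing n empty writes."""
    txt = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(n):
            txt.write("")
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

A test could then assert that the growth for a modest `n` stays below a bound that does not scale with `n`, which keeps the runtime short compared with a stress loop.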

Member Author

This test on my machine only peaks at ~700 MB of RAM while taking ~2.5 seconds. Most of that time is in _pyio, which looks like it needs a similar improvement (although probably in BufferedWriter for that one...).

I looked at both _pyio.TextIOWrapper.write, which writes straight through and has no accumulation, and BufferedWriter, which extends a bytearray, so adding an empty string shouldn't be a problem there.
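
The difference between the two accumulation strategies can be seen in isolation with a minimal sketch (`sizes_after_empty_appends` is a hypothetical name used only for this illustration):

```python
def sizes_after_empty_appends(n: int) -> tuple[int, int]:
    """Compare a bytearray buffer with a list of chunks after n empty appends."""
    buf = bytearray()   # bytearray-style buffer, extended in place
    parts = []          # list-of-chunks style accumulation
    for _ in range(n):
        buf.extend(b"")     # adds zero bytes; the buffer does not grow
        parts.append(b"")   # adds one list slot per call, even for empty data
    return len(buf), len(parts)
```

Extending a bytearray with empty data leaves its length unchanged, while appending to a list adds an entry every time regardless of the chunk's size.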

I'm okay with this test if it's tagged walltime but would really prefer a test which fails if the empty string path is removed.

As Zach noted on the issue, this is a "degenerate case"; I don't think the complexity of testing this precisely is warranted here.

Contributor

In that case I'd prefer no test and just comment well in the code.

If the test doesn't fail on a regression, it's not adding a lot of value for the complexity/runtime. For me, on a debug Linux build, ./python -m test test_io -j12 currently takes 8.3 seconds. Adding 2.5 seconds with a test that doesn't fail if the new code is removed isn't worth it.
