py/objarray: Guard array allocation size against overflow.#19222
Open
kbhetrr wants to merge 1 commit into
Open
py/objarray: Guard array allocation size against overflow.#19222kbhetrr wants to merge 1 commit into
kbhetrr wants to merge 1 commit into
Conversation
When constructing array.array() from an object whose __len__ returns
a very large value, the backing-buffer size computation
typecode_size * n could wrap around size_t and produce a small or
zero-sized allocation while o->len remained the original huge value.
Subsequent writes during construction then dereferenced a NULL or
undersized items pointer, crashing the VM.
Reproducer (Unix port, 64-bit):
from array import array
class Boom:
def __len__(self):
return 1 << 61
def __iter__(self):
for _ in range(4):
yield 1.23
array("d", Boom())
Without this change the Unix port crashes with a write to address
zero in mp_binary_set_val_array(); an ASan/UBSan build reports a
null pointer store in py/binary.c.
Check the multiplication in array_new() before allocating and raise
MemoryError via the existing m_malloc_fail() path when the requested
array would exceed SIZE_MAX bytes.
Fixes micropython#18620.
Signed-off-by: Kyounghwan Kim <kimkh7534@pnucslab.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #19222 +/- ##
=======================================
Coverage 98.47% 98.47%
=======================================
Files 176 176
Lines 22845 22847 +2
=======================================
+ Hits 22497 22499 +2
Misses 348 348 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Code size report: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #18620.
Summary
array_new()allocates the backing buffer forarray.array()astypecode_size * nbytes, wherencomes from the initializer's__len__. With a large enoughnthe multiplication overflowssize_t, so the allocation is sized for a small (or zero-byte) buffer whileo->lenstill records the original huge value. The subsequent fill loop inarray_construct()then writes through a NULL or undersizeditemspointer.This PR adds an overflow check before allocation and raises
MemoryErrorvia the existingm_malloc_fail()path when the requested array would exceedSIZE_MAXbytes.Reproducer
Before the patch, on a 64-bit Unix port build:
mp_binary_set_val_array().py/binary.c.After the patch the same script raises
MemoryErrorcleanly.Tests
A regression test is added at
tests/basics/array_construct_len_overflow_intbig.py. The_intbigsuffix gates it on builds with long-int support, since the literal1 << 61is not representable as a small int on 32-bit small-int-only configurations.Result: 545 tests pass, 30 skipped (no new failures or regressions). Targeted check:
An ASan/UBSan Unix build was also used to verify the original reproducer no longer crashes.
Code formatting
tools/codeformat.py -c -f py/objarray.cwith uncrustify v0.72.ruff checkandruff format --checkon the new test file (both clean).Notes
array_new()rather thanarray_construct()so it also coversbytearray(<huge int>)and any other call site that funnels througharray_new().__len__value on 32-bit configs would already break the existingMP_OBJ_SMALL_INT_VALUE(len_in)assumption inarray_construct(), which is a separate latent issue; the regression test deliberately scopes itself to bignum-capable builds via the_intbigsuffix.