Skip to content

py/modstruct: Check for integer overflow in calc_size_items.#19189

Open
beastbroak30 wants to merge 1 commit into
micropython:masterfrom
beastbroak30:fix/struct-calcsize-overflow
Open

py/modstruct: Check for integer overflow in calc_size_items.#19189
beastbroak30 wants to merge 1 commit into
micropython:masterfrom
beastbroak30:fix/struct-calcsize-overflow

Conversation

@beastbroak30
Copy link
Copy Markdown

Summary

calc_size_items() in py/modstruct.c accumulates the total buffer size
for a struct format string into a size_t variable without any overflow
check. On 32-bit targets where size_t is 4 bytes -- ESP32, RP2040,
STM32, SAMD21/51, nRF52 -- a format string such as '1500000000I'
silently wraps the accumulator from ~6 GB back to a small positive value.
vstr_init_len() allocates only that small buffer. When
struct_pack_into_internal() writes the real payload it walks far past
the end of the allocation, corrupting the MicroPython heap.

On embedded targets there is no ASLR, no heap canary, and no MMU
protection between heap regions, so the corruption is reachable from pure
Python with no native-code access required:

import struct
struct.calcsize('1500000000I')  # silently returns 4 on 32-bit
struct.pack('1500000000I', 0)   # writes ~6 GB into a 4-byte buffer

The fix adds a single pre-loop overflow guard in calc_size_items():

if (sz > 0 && cnt > (SIZE_MAX - size) / sz) {
    mp_raise_ValueError(MP_ERROR_TEXT("pack format too large"));
}

sz equals align for every MicroPython struct format code (all
primitive types), so consecutive items of the same type add exactly sz
bytes each. The check is therefore exact for all single-type format
strings and conservative (safe) for mixed layouts.

Testing

I reproduced the overflow by compiling a standalone C harness with a
uint32_t accumulator to simulate 32-bit size_t (gcc -O2). The buggy
path allocates 4 bytes and then attempts a ~6 GB write; the fixed path
raises immediately with no allocation.

A regression test is added at tests/basics/struct_overflow.py. On
32-bit targets the test expects ValueError, OverflowError, or
MemoryError to be raised. On 64-bit hosts SIZE_MAX is large enough
that a uint32_t repeat count cannot trigger the overflow, so the test
prints SKIP and exits early.

Tested on: unix port x86-64 (test prints SKIP as expected). The fix
targets 32-bit embedded ports; I do not have physical hardware available,
but the guard path is straightforward and covered by the simulated
experiment.

Trade-offs and Alternatives

Code size increase is 32--80 bytes depending on target (see CI code size
report). The added cost is one integer division per format specifier.
Because sz is always a power of two for every MicroPython struct format
code, the compiler can replace the division with a right shift on
architectures that support it, making the runtime cost negligible.

An alternative per-iteration check (if (size + sz < size) overflow)
would be simpler but would fire on mixed-type format strings where
alignment padding temporarily makes size + sz non-monotone. The
pre-loop check is safe for all layouts.

Generative AI

I used generative AI tools when creating this PR, but a human has checked
the code and is responsible for the code and the description above.

@dpgeorge
Copy link
Copy Markdown
Member

dpgeorge commented May 5, 2026

Please check the failing CI.

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label May 5, 2026
Without a bounds check a format string like `struct.calcsize("NI")`
with a large repeat count N can overflow the 32-bit size_t accumulator,
wrapping it to a small value.  A subsequent struct.pack call then
allocates a tiny buffer and writes far past its end.

Add an O(1) guard in calc_size_items(): before entering the per-item
loop, check whether cnt items of sz bytes each would exceed the
remaining size_t budget.  The check is:

  if (sz > 0 && cnt > (SIZE_MAX - size) / sz) { raise ValueError }

This is safe against integer overflow in the right-hand side because
size <= SIZE_MAX is maintained as a loop invariant.

The test format `"1I1073741823I"` is used rather than the more
obvious `"1500000000I"` because format counts are parsed via
get_fmt_num(), which calls MP_OBJ_SMALL_INT_VALUE() and therefore
only works correctly for values within MicroPython's small-integer
range (0 .. 2**30 - 1 on 32-bit).  The value 1073741823 (= 2**30 - 1)
is the largest count that is a 32-bit small integer; combined with a
4-byte prefix it just exceeds the overflow threshold (1073741822).

Signed-off-by: beastbroak30 <akantarip30@gmail.com>
@beastbroak30 beastbroak30 force-pushed the fix/struct-calcsize-overflow branch from b50bc6a to 1bfc154 Compare May 5, 2026 09:30
@codecov
Copy link
Copy Markdown

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 98.45%. Comparing base (9f396bb) to head (1bfc154).
⚠️ Report is 62 commits behind head on master.

Files with missing lines Patch % Lines
py/modstruct.c 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19189      +/-   ##
==========================================
- Coverage   98.46%   98.45%   -0.01%     
==========================================
  Files         176      176              
  Lines       22811    22813       +2     
==========================================
+ Hits        22460    22461       +1     
- Misses        351      352       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

Code size report:

Reference:  samd/machine_adc: Fix the configuration with averaging enabled. [21b3a51]
Comparison: py/modstruct: Check for integer overflow in calc_size_items. [merge of 1bfc154]
  mpy-cross:    +0 +0.000% 
   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:   +80 +0.009% standard
      stm32:   +32 +0.008% PYBV10
      esp32:   +44 +0.003% ESP32_GENERIC[incl +16(data)]
     mimxrt:   +40 +0.010% TEENSY40
        rp2:   +40 +0.004% RPI_PICO_W
       samd:   +48 +0.017% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:   +39 +0.009% VIRT_RV32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

py-core Relates to py/ directory in source

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants