py/gc: Track data and skip scan, MICROPY_GC_NO_SCAN by Gadgetoid · Pull Request #19367 · micropython/micropython

Gadgetoid · 2026-06-22T11:41:59Z

Summary

Buffers don't contain pointers. Don't scan them for pointers. This was never really a problem until we had 8MB of PSRAM with in-RAM font data and images.

Currently this brings new m_new_no_scan() and m_malloc_no_scan() methods for data that's guaranteed (by a conscientious user who understands how painful use-after-free bugs are to trace) to contain absolutely 100% no pointers. These mirror the existing methods, using a new GC_ALLOC_FLAG_NO_SCAN flag.
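To make the "no pointers, guaranteed" requirement concrete, here is a minimal toy model in Python rather than the real C allocator. The `reachable` helper, the dict-based heap and the block names are all hypothetical; the only idea taken from this PR is that a tagged block is marked but its contents are never scanned.

```python
def reachable(heap, roots, no_scan=frozenset()):
    """Return the set of block names a toy mark phase would keep alive.

    heap maps a block name to its list of words. Any word that is also a
    block name is treated as a pointer (a stand-in for conservative scanning).
    """
    marked = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in marked or name not in heap:
            continue
        marked.add(name)
        if name in no_scan:
            continue  # tagged: marked, but its words are never inspected
        stack.extend(word for word in heap[name] if word in heap)
    return marked


heap = {
    "root": ["pixels", "table"],   # holds two pointers
    "pixels": [0xFF, 0x00, 0x7F],  # raw data only, safe to tag
    "table": ["glyph"],            # holds a pointer, NOT safe to tag
    "glyph": [1, 2, 3],
}

safe = reachable(heap, ["root"], no_scan={"pixels"})
unsafe = reachable(heap, ["root"], no_scan={"table"})
```

With only `pixels` tagged, all four blocks stay marked. With `table` wrongly tagged, `glyph` is never marked, so a sweep would free it while `table` still points at it. That is the use-after-free the tagging contract is meant to rule out.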

For better or worse we'll be carrying this change downstream for Tufty 2350, since GC hangups are painful when trying to hit 30-60FPS screen updates. This doesn't eliminate them, but turns a 300ms pause into a 30ms one, the rest of which is handled by #19363.

Might be of interest to @sfe-SparkFro

Testing

Aggressive multi-hour tests of both real-world examples (on Tufty 2350) and synthetic GC thrashing benchmarks.

Again this change does not make much of an impact on perfbench since we don't really benchmark GC, and in some cases it can cause a net loss (RP2040 XIP cache lottery).

Trade-offs and Alternatives

This feature sacrifices heap for the additional flag bit in the allocation table, and is thus disabled by default. I'd recommend everyone shipping a board with PSRAM enable it as a matter of course, and suggest leaving the RAM/performance tradeoff to the vendor of each board.
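For a rough sense of the heap cost, the arithmetic can be sketched as below. It assumes 16-byte GC blocks (4 words of 4 bytes, which I believe is the 32-bit default) and 1 bit per block for the no-scan table. The helper name and the 192 KiB example heap are made up for illustration, and the whole region is treated as pool, so the result slightly overestimates.

```python
BYTES_PER_BLOCK = 16  # assumption: 4 words x 4 bytes per GC block


def no_scan_table_bytes(pool_bytes, bytes_per_block=BYTES_PER_BLOCK):
    """Bytes needed for a 1-bit-per-block table covering pool_bytes of heap."""
    blocks = pool_bytes // bytes_per_block
    return (blocks + 7) // 8  # round up to whole bytes


def overhead_percent(pool_bytes):
    return 100 * no_scan_table_bytes(pool_bytes) / pool_bytes


psram_cost = no_scan_table_bytes(8 * 1024 * 1024)  # 8 MiB PSRAM heap
small_cost = no_scan_table_bytes(192 * 1024)       # hypothetical 192 KiB heap
```

Under those assumptions the table is 1/128 of the heap (about 0.78%): 64 KiB for an 8 MiB pool and 1536 bytes for the 192 KiB example.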

This is a big change to a scary part of MicroPython and as such I'm raising it as a draft in the hope others will exercise it downstream and feed back. I don't expect or need it to be merged, but it's fun to share!

Generative AI

I used generative AI tools when creating this PR, but a human has checked the
code and is responsible for the code and the description above.

The mark phase conservatively scans every word of every reachable block for pointers, so a large bytearray/array buffer is scanned in full on every collection despite holding no pointers.

Add an optional per-block "no-scan table" (NTB, 1 bit/block, like the finaliser/weakref tables) and a GC_ALLOC_FLAG_NO_SCAN; tagged head blocks are marked but their contents are not scanned.

A no-scan block has no child pointers, so the mark phase also skips the chain-walk for it (n_blocks left 0) and avoids re-reading the allocation table for every block of the buffer just to mark it - this matters for large buffers in slow PSRAM.

The tag is written on every allocation (so a reused block never inherits a stale bit) and preserved across realloc moves.

Exposed as m_new_no_scan() / m_malloc_no_scan(), which alias plain m_new()/gc_alloc() when disabled, and gated behind MICROPY_GC_NO_SCAN (default off).

This commit adds the mechanism only; callers are converted separately.

Signed-off-by: Phil Howard <github@gadgetoid.com>
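The effect on the mark phase can be illustrated with a small simulation. This is a toy model in Python, not the code in py/gc.c: the `ToyHeap` class, the state values, the base address and the counters are all assumptions made for the sketch. It only mirrors the behaviour described above, where a tagged head block is marked and both the chain-walk and the word scan are skipped.

```python
FREE, HEAD, TAIL, MARK = range(4)  # toy allocation-table states
WORDS_PER_BLOCK = 4                # assumption: 16-byte blocks of 4-byte words
BYTES_PER_BLOCK = 16
BASE = 0x20000000                  # made-up heap base address


class ToyHeap:
    def __init__(self, n_blocks):
        self.atb = [FREE] * n_blocks   # allocation table, one state per block
        self.ntb = [False] * n_blocks  # no-scan table, one bit per block
        self.words = [0] * (n_blocks * WORDS_PER_BLOCK)
        self.atb_reads = 0
        self.words_scanned = 0

    def alloc(self, n_blocks, no_scan=False):
        """First-fit allocation; returns the head block index."""
        run = 0
        for i, state in enumerate(self.atb):
            run = run + 1 if state == FREE else 0
            if run == n_blocks:
                head = i - n_blocks + 1
                self.atb[head] = HEAD
                for block in range(head + 1, i + 1):
                    self.atb[block] = TAIL
                self.ntb[head] = no_scan  # written on every allocation
                return head
        raise MemoryError("toy heap full")

    def addr(self, block):
        return BASE + block * BYTES_PER_BLOCK

    def store(self, block, index, value):
        self.words[block * WORDS_PER_BLOCK + index] = value

    def _kind(self, block):
        self.atb_reads += 1  # count every allocation-table read
        return self.atb[block]

    def _as_block(self, word):
        """Conservative test: does this word look like a block address?"""
        offset = word - BASE
        if offset >= 0 and offset % BYTES_PER_BLOCK == 0:
            block = offset // BYTES_PER_BLOCK
            if block < len(self.atb):
                return block
        return None

    def mark(self, roots):
        stack = list(roots)
        while stack:
            head = stack.pop()
            if self._kind(head) != HEAD:
                continue  # free, tail, or already marked
            self.atb[head] = MARK
            if self.ntb[head]:
                continue  # no children: skip the chain-walk and the scan
            n = 1  # chain-walk the tail blocks to find the allocation length
            while head + n < len(self.atb) and self._kind(head + n) == TAIL:
                n += 1
            start = head * WORDS_PER_BLOCK
            for word in self.words[start:start + n * WORDS_PER_BLOCK]:
                self.words_scanned += 1
                block = self._as_block(word)
                if block is not None:
                    stack.append(block)


def demo(no_scan):
    """Mark a root that points at one 96 KiB buffer; return (table reads, words scanned)."""
    heap = ToyHeap(8192)
    root = heap.alloc(1)
    buf = heap.alloc(96 * 1024 // BYTES_PER_BLOCK, no_scan=no_scan)
    heap.store(root, 0, heap.addr(buf))  # root holds a pointer to the buffer
    heap.mark([root])
    return heap.atb_reads, heap.words_scanned
```

In this model `demo(True)` touches 3 table entries and scans 4 words, while `demo(False)` touches 6147 entries and scans 24580 words. The absolute numbers are artefacts of the toy layout; the point is that the cost of a tagged buffer stops scaling with its size.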

Tag the buffers that only ever hold raw data (never heap pointers) with m_new_no_scan(), so the GC mark phase skips scanning them once MICROPY_GC_NO_SCAN is enabled (a no-op otherwise):

py/objarray.c: array/bytearray item storage.
py/objstr.c: str/bytes payloads.
py/vstr.c: the vstr builder; growth via gc_realloc preserves the tag.

Signed-off-by: Phil Howard <github@gadgetoid.com>

For CI, build tests only. Signed-off-by: Phil Howard <github@gadgetoid.com>

github-actions · 2026-06-22T11:54:40Z

Code size report:

Reference:  tools/mpy_ld.py: Allow overriding the internal MPY file name. [b49f098]
Comparison: rp2: Enable no-scan GC. [merge of 1fb9532]
  mpy-cross:    +0 +0.000% 
   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +0 +0.000% PYBV10
      esp32:    +0 +0.000% ESP32_GENERIC
     mimxrt:    +0 +0.000% TEENSY40
        rp2:  +108 +0.012% RPI_PICO_W[incl +4(bss)]
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

codecov · 2026-06-22T11:55:04Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.51%. Comparing base (b49f098) to head (1fb9532).

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #19367   +/-   ##
=======================================
  Coverage   98.51%   98.51%           
=======================================
  Files         176      176           
  Lines       22904    22905    +1     
=======================================
+ Hits        22563    22564    +1     
  Misses        341      341


Gadgetoid · 2026-06-22T12:05:00Z

This incredibly unhelpful graph illustrates that there's a very real, if vanishingly small, cost to adding this change, affecting routine garbage collection in smaller memory environments.

It's only paid if the feature is turned on however, and the relative benefit for even modest buffers (96k is absolutely peanuts on 8MB PSRAM, but this was tested on an RP2040 to show the worst case tradeoff) far, far outweighs any cost.

And here's the same test on a Pico LiPo 2 with 8MB PSRAM enabled:

What each test measures

frame_manual - Per-frame time (us) of a render loop that allocates a 1 KB bytearray + a small list each frame and calls gc.collect() EVERY frame (predictable pacing). p50=median frame, p99/max=worst-case stalls.
frame_ondemand - Same render loop with no manual/threshold control - GC only fires when the heap fills (cheap average, large spikes).
frame_threshold - Same render loop, but using gc.threshold() auto-collection instead of collecting manually.
churn_per_alloc_ns - Per-allocation time (ns) churning 50000 transient 4-element lists - allocation throughput including amortised GC.
collect_empty_us - gc.collect() pause (us) on a near-empty heap - the fixed overhead of a collection.
collect_3000_lists_us - gc.collect() pause (us) with 3000 live small lists (real pointers) - these must be scanned, so no-scan cannot help.
free_start - Free heap (bytes) after a full collect at start. Higher is better; the no-scan table costs ~1 bit/block of heap.
collect_96k_bytearray_us - gc.collect() pause (us) with one live 96 KB bytearray. The headline case: a large pure-data buffer the mark phase would otherwise scan word-by-word for pointers.

Note that the 96k bytearray is only allocated in that specific test, since it would massively skew the other tests in favour of this feature being turned on.
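For anyone wanting to reproduce something similar, the measurements described above could be sketched roughly as follows. This is not the benchmark script used for the graphs: the function names, frame counts, repeat counts and the nearest-rank percentile definition are all assumptions. It uses time.ticks_us() on MicroPython and falls back to time.perf_counter_ns() so it also runs on desktop Python, where the numbers say nothing about this change.

```python
import gc

try:
    from time import ticks_us, ticks_diff  # MicroPython
except ImportError:
    from time import perf_counter_ns       # CPython fallback for desktop runs

    def ticks_us():
        return perf_counter_ns() // 1000

    def ticks_diff(end, start):
        return end - start


def percentile(samples, p):
    """Nearest-rank percentile (an assumed definition, p in 0..100)."""
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, (p * len(ordered)) // 100)]


def frame_manual(frames=200):
    """Per-frame times (us): 1 KB bytearray + small list, gc.collect() every frame."""
    times = []
    for _ in range(frames):
        start = ticks_us()
        pixels = bytearray(1024)
        sprites = [pixels, 0, 1, 2]  # small list allocated alongside the buffer
        gc.collect()
        times.append(ticks_diff(ticks_us(), start))
    return times


def collect_pause_us(make_live, repeats=5):
    """Best-case gc.collect() pause (us) while the object from make_live() is live."""
    live = make_live()  # allocated only for this measurement
    gc.collect()
    pauses = []
    for _ in range(repeats):
        start = ticks_us()
        gc.collect()
        pauses.append(ticks_diff(ticks_us(), start))
    del live  # release it so it cannot skew later tests
    gc.collect()
    return min(pauses)


def report():
    times = frame_manual()
    print("frame_manual p50/p99/max:",
          percentile(times, 50), percentile(times, 99), max(times))
    print("collect_96k_bytearray_us:",
          collect_pause_us(lambda: bytearray(96 * 1024)))
    print("collect_3000_lists_us:",
          collect_pause_us(lambda: [[i, i, i, i] for i in range(3000)]))
```

Calling `report()` on a board built with and without MICROPY_GC_NO_SCAN should show the difference mainly in the 96 KB bytearray line. The 3000-lists case holds real pointers, so it is expected to be unaffected.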

Gadgetoid · 2026-06-22T12:48:48Z

And since those were so bad, here's a graph which cuts right to the gist of the change:

Again I'm shooting low here, this is just the difference in collect times in SRAM, which probably makes it a reasonable sell even for a stock Pico 2 / Pico 2 W. PSRAM's speed (or lack thereof) compounds this effect.

Gadgetoid · 2026-06-22T13:39:49Z

Since this ties in strongly with the optimised tail scan (#19363) here's a graph of them working together, again just SRAM:

Gadgetoid added 3 commits June 22, 2026 12:25

rp2: Enable no-scan GC.

1fb9532

For CI, build tests only. Signed-off-by: Phil Howard <github@gadgetoid.com>

Gadgetoid mentioned this pull request Jun 22, 2026

py/gc: Add per-word GC table scans behind MICROPY_GC_FAST_TABLE_SCANS. #19363

Draft

dpgeorge added the py-core Relates to py/ directory in source label Jun 22, 2026

Gadgetoid wants to merge 3 commits into micropython:master from pimoroni:gc-no-scan
