py/objarray: Avoid double zero init on sized bytearrays.#18771
py/objarray: Avoid double zero init on sized bytearrays.#18771Gadgetoid wants to merge 2 commits into
Conversation
As per the implementation of m_malloc0, if MICROPY_GC_CONSERVATIVE_CLEAR is set then all RAM is guaranteed to be zero-init by gc_alloc. py/objarray.c: Guard the explicit zero init in bytearray_make_new against being run, initialising the RAM to zero a second time, if this flag is set. Note that MICROPY_GC_CONSERVATIVE_CLEAR is default enabled by MICROPY_ENABLE_GC, and no ports currently override this value. Co-authored-by: Mike Bell <mdb036@gmail.com> Signed-off-by: Phil Howard <github@gadgetoid.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #18771 +/- ##
==========================================
- Coverage 98.41% 98.41% -0.01%
==========================================
Files 171 171
Lines 22324 22322 -2
==========================================
- Hits 21971 21969 -2
Misses 353 353 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Code size report: |
|
Quick eyeball over the MicroPython codebase turns up Line 633 in cef2538 There's also Lines 284 to 285 in cef2538 vstr_init_len calls vstr_init which does vstr->buf = m_new(char, vstr->alloc); This is what underpins bytes(n) and, as such, it has exactly the same issue as bytearray(n).
|
As per the implementation of m_malloc0, if MICROPY_GC_CONSERVATIVE_CLEAR is set then all RAM is guaranteed to be zero-init by gc_alloc. py/objstr.c: Guard the explicit zero init in bytes_make_new against being run, initialising the RAM to zero a second time, if this flag is set. Signed-off-by: Phil Howard <github@gadgetoid.com>
|
Nice find! |
dpgeorge
left a comment
There was a problem hiding this comment.
This looks good, thanks.
(What we really should do is remove the MICROPY_GC_CONSERVATIVE_CLEAR option. It shouldn't be necessary. Someone needs to find time to investigate why that option was added in the first place, as per the comment in py/mpconfig.h.)
See #2195. |
As per the implementation of m_malloc0, if
MICROPY_GC_CONSERVATIVE_CLEARis set then all RAM is guaranteed to be zero-init by gc_alloc.py/objarray.c: Guard the explicit zero init in bytearray_make_new against being run, initialising the RAM to zero a second time, if this flag is set.
py/objstr.c: Guard the explicit zero init in bytes_make_new against being run, initialising the RAM to zero a second time, if this flag is set.
Note that
MICROPY_GC_CONSERVATIVE_CLEARis default enabled byMICROPY_ENABLE_GC, and no ports currently override this value.We stumbled across this inconsistency when testing the timings of allocating, collecting and collecting/freeing PSRAM, with a script looking something like the following:
We had a custom
imageclass which did its own allocation viam_mallocinternally, and found - quite unintentionally - that it was twice as fast to allocate our customimageversus an identically sizedbytearray. A little sleuthing revealed thatbytearrayignores the value ofMICROPY_GC_CONSERVATIVE_CLEARand will zero (usingmemset()) RAM that is already guaranteed to be zeroed bygc_alloc, eg:micropython/py/objarray.c
Lines 190 to 196 in cef2538
when
gc_allocalready does:micropython/py/gc.c
Lines 885 to 888 in edab53c
Testing
To ensure no non-zero values leaked into bytearrays I ran this very basic test with various maximum sizes:
And here are the
bytes(n)alloc times, on PSRAM with an RP2350 clocked to 200MHz both before and after the change`:Trade-offs and Alternatives
I am unsure if there is some subtle caveat for
gc_alloczero init that might cause problems specifically with a bytearray, but afaict this change is minimal and gives us bytearray allocations in roughly half the time.This is less noticeable on SRAM (tested on RP2350) where a 200k allocation takes 2.6ms, but on PSRAM it brings a 200k
bytearrayalloc from roughly 50ms down to 25ms, and a modest 10k alloc from 2ms down to 1ms. The alloc time for sizes between these two extremes is linear.