Port, board and/or hardware
stm32 and mimxrt
MicroPython version
Latest
Reproduction
This is a systemic design issue. I found it by adding a breakpoint to every cache maintenance operation that triggers on a non-cache-aligned access. It was hit immediately on both the STM32 and the MIMXRT when the SD card is mounted at startup.
Expected behaviour
No response
Observed behaviour
The FATFS buffer allocated by fat_vfs_make_new is not cache aligned.
Because of this, functions that read data from the SD card, such as sdcard_read_blocks on the STM32, clean and/or invalidate the win byte array in the FATFS struct together with whatever shares its first and last cache lines. The data in the cache lines just before and after win can therefore be corrupted if a second bus master other than the SD card controller modifies RAM.
Additionally, sdcard_read_blocks does not invalidate the cache after the SD card controller has written data to RAM. A clean and invalidate operation is required before a DMA master writes to a memory buffer the CPU wants to use, but the buffer must also be invalidated again after the DMA master has finished writing, so that any lines the CPU read speculatively in the meantime are dropped.
Yes, this is a real thing; we had to fight it quite a bit. Note that this second invalidation is also not safe unless the memory is cache aligned.
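To illustrate why an unaligned win is a problem, here is a minimal host-side sketch of how a by-address cache operation rounds a buffer out to whole cache lines. It assumes a 32-byte data cache line (as on Cortex-M7); the helper names are hypothetical and are not part of MicroPython or FatFs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte D-cache lines, as on Cortex-M7 parts. */
#define CACHE_LINE_SIZE 32u

/* Round an address down to the start of its cache line. */
static inline uintptr_t cache_line_floor(uintptr_t addr) {
    return addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1u);
}

/* Round an address up to the next cache line boundary. */
static inline uintptr_t cache_line_ceil(uintptr_t addr) {
    return (addr + CACHE_LINE_SIZE - 1u) & ~(uintptr_t)(CACHE_LINE_SIZE - 1u);
}

/*
 * Number of bytes OUTSIDE [addr, addr + len) that live in cache lines
 * touched by a clean/invalidate of that range. Anything non-zero means
 * neighbouring data is affected by the maintenance operation.
 */
static inline size_t collateral_bytes(uintptr_t addr, size_t len) {
    assert(len > 0);
    uintptr_t start = cache_line_floor(addr);
    uintptr_t end = cache_line_ceil(addr + len);
    return (size_t)(end - start) - len;
}
```

For a 512-byte buffer that starts on a line boundary the result is 0. If the same buffer starts 4 bytes into a line, 32 neighbouring bytes (4 before and 28 after) are swept up in the operation.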
...
How to fix:
#if FF_FS_EXFAT
DWORD bitbase; /* Allocation bitmap base sector */
#endif
DWORD winsect; /* Current sector appearing in the win[] */
BYTE win[FF_MAX_SS]; /* Disk access window for Directory, FAT (and file data at tiny cfg) */
} FATFS;
Aligning the win array to cache lines is the correct thing to do. This can be accomplished by specifying cache-line alignment on the win array and updating fat_vfs_make_new to allocate fs_user_mount_t on a cache-line boundary (by allocating an extra 31 bytes in mp_obj_malloc and then adjusting the returned pointer). vfs->fatfs will need to track both the original non-aligned pointer and the aligned pointer to prevent the garbage collector from collecting the allocation.
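The over-allocate-and-adjust approach could look roughly like the sketch below. It uses plain malloc as a stand-in for the MicroPython allocator, and cache_aligned_block_t and alloc_cache_aligned are hypothetical names used only for illustration. It aligns only the start of the allocation; win itself would still need an alignment attribute so that its offset inside the struct is a multiple of the line size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 32-byte D-cache lines, as on Cortex-M7 parts. */
#define CACHE_LINE_SIZE 32u

/* Hypothetical holder for both pointers. The raw pointer must stay
 * reachable so the allocator (or a garbage collector) still sees the
 * block as live; the aligned pointer is the one actually used. */
typedef struct {
    void *raw;
    void *aligned;
} cache_aligned_block_t;

/* Allocate size bytes starting on a cache-line boundary by asking for
 * CACHE_LINE_SIZE - 1 extra bytes and rounding the result up. */
static cache_aligned_block_t alloc_cache_aligned(size_t size) {
    cache_aligned_block_t blk = { NULL, NULL };
    assert(size > 0);
    blk.raw = malloc(size + CACHE_LINE_SIZE - 1u);  /* stand-in allocator */
    if (blk.raw != NULL) {
        uintptr_t p = (uintptr_t)blk.raw + CACHE_LINE_SIZE - 1u;
        p &= ~(uintptr_t)(CACHE_LINE_SIZE - 1u);
        blk.aligned = (void *)p;
    }
    return blk;
}
```

Keeping raw next to aligned is the point of the design: freeing, or keeping the block alive, always goes through the pointer the allocator originally returned.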
Alternatively, 32 bytes of padding before and after win resolves the problem by ensuring nothing before and after can be corrupted. However, this would be a hack.
Finally, the cache must be invalidated after DMA finishes writing data from the SD card to RAM to ensure speculative CPU reads are dropped.
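The ordering of cache maintenance around a DMA read can be sketched as follows. This is a host-side illustration only: dcache_clean_invalidate, dcache_invalidate and dma_read_blocking are hypothetical stand-ins, not the actual MicroPython port code. On a Cortex-M7 target the two cache helpers would presumably map onto the CMSIS calls SCB_CleanInvalidateDCache_by_Addr and SCB_InvalidateDCache_by_Addr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 32-byte D-cache lines, as on Cortex-M7 parts. */
#define CACHE_LINE_SIZE 32u

/* Host-side stand-ins that only count calls. */
static int clean_invalidate_calls;
static int invalidate_calls;

static void dcache_clean_invalidate(void *addr, size_t len) {
    (void)addr; (void)len;
    clean_invalidate_calls++;
}

static void dcache_invalidate(void *addr, size_t len) {
    (void)addr; (void)len;
    invalidate_calls++;
}

/* Hypothetical blocking DMA read; here it just fills the buffer. */
static int dma_read_blocking(uint8_t *dest, size_t len) {
    memset(dest, 0xA5, len);
    return 0;
}

/* Example buffer aligned to a cache line (GCC/Clang attribute syntax). */
static uint8_t sector_buf[512] __attribute__((aligned(CACHE_LINE_SIZE)));

static int read_with_cache_maintenance(uint8_t *dest, size_t len) {
    /* By-address maintenance works on whole lines, so insist on alignment. */
    assert(((uintptr_t)dest % CACHE_LINE_SIZE) == 0);
    assert((len % CACHE_LINE_SIZE) == 0);

    /* 1. Before DMA: write back dirty lines and drop them from the cache. */
    dcache_clean_invalidate(dest, len);
    /* 2. The DMA master writes the data to RAM. */
    int err = dma_read_blocking(dest, len);
    /* 3. After DMA: drop any lines the CPU fetched speculatively meanwhile. */
    dcache_invalidate(dest, len);
    return err;
}
```

Step 3 is the one this issue reports as missing. The alignment asserts reflect the earlier point that invalidating a partially owned cache line can discard a neighbour's data.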
Additional Information
No, I've provided everything above.
Code of Conduct
Yes, I agree