SD Card Cache Line Corruption #19010

@kwagyeman

Description

Port, board and/or hardware

stm32 and mimxrt

MicroPython version

Latest

Reproduction

This is a systemic design issue. I found it by adding a breakpoint to every cache maintenance operation, set to trigger on a non-cache-aligned access. It was hit immediately on both the STM32 and the MIMXRT when the SD card is mounted at startup.

Expected behaviour

No response

Observed behaviour

The FATFS buffer allocated by fat_vfs_make_new is not cache aligned.

Because of this, functions that read data from the SD card, such as sdcard_read_blocks on the STM32, clean and/or invalidate not only the win byte array in the FATFS struct but also whatever shares its first and last cache lines. The data just before and after win can therefore be corrupted if a second bus master other than the SD card controller modifies RAM.
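To make the collateral range concrete, here is a minimal sketch that computes which bytes a by-address cache operation really covers. It assumes 32-byte data cache lines (the Cortex-M7 value, consistent with the 31/32-byte figures below); `cache_span`, `collateral_before` and `collateral_after` are hypothetical helper names, not MicroPython or CMSIS functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte D-cache lines, as on Cortex-M7 parts. */
#define CACHE_LINE_SIZE 32u

/* Memory range actually covered by a by-address cache operation. */
typedef struct {
    uintptr_t start; /* first byte of the first affected cache line */
    uintptr_t end;   /* one past the last byte of the last affected line */
} cache_span_t;

/* Round the buffer outwards to whole cache lines. */
static cache_span_t cache_span(uintptr_t addr, size_t len) {
    const uintptr_t mask = (uintptr_t)CACHE_LINE_SIZE - 1u;
    cache_span_t s;
    s.start = addr & ~mask;
    s.end = (addr + len + mask) & ~mask;
    return s;
}

/* Bytes in front of the buffer that share its first cache line. */
static size_t collateral_before(uintptr_t addr, size_t len) {
    return (size_t)(addr - cache_span(addr, len).start);
}

/* Bytes behind the buffer that share its last cache line. */
static size_t collateral_after(uintptr_t addr, size_t len) {
    return (size_t)(cache_span(addr, len).end - (addr + len));
}
```

For a 512-byte win that starts 4 bytes past a line boundary, this reports 4 bytes of unrelated data in front and 28 bytes behind that get cleaned or invalidated along with it. For an aligned buffer whose size is a multiple of the line size, both values are zero.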

Additionally, sdcard_read_blocks does not invalidate the cache after the SD card controller has written data to RAM. A clean and invalidate operation is required before another DMA master writes to a memory buffer the CPU wants to use, but the buffer must also be invalidated again after the DMA master has finished writing, so that any lines the CPU read speculatively during the transfer are dropped.

Yes, this is a real thing; we had to fight it quite a bit. Note that this invalidation is also not safe unless the memory is cache aligned.

...

How to fix:

#if FF_FS_EXFAT
    DWORD   bitbase;        /* Allocation bitmap base sector */
#endif
    DWORD   winsect;        /* Current sector appearing in the win[] */
    BYTE    win[FF_MAX_SS]; /* Disk access window for Directory, FAT (and file data at tiny cfg) */
} FATFS;

Aligning the win array to a cache line is the correct thing to do. This can be accomplished by specifying cache-line alignment on the win member and updating fat_vfs_make_new to allocate fs_user_mount_t on a cache line (by allocating an extra 31 bytes in mp_obj_malloc and then fixing up the returned pointer). vfs->fatfs will need to track both the original unaligned pointer and the aligned pointer, so that the garbage collector does not collect the allocation.
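The over-allocate-and-fix-up step could look roughly like the sketch below. It uses plain `malloc` as a stand-in for the MicroPython allocator so that it runs on a host; `aligned_block_t`, `alloc_cache_aligned` and `free_cache_aligned` are hypothetical names, and the 32-byte line size is an assumption. The size is also rounded up to whole lines so the tail of the block does not share a line with a neighbouring allocation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 32-byte D-cache lines. */
#define CACHE_LINE_SIZE 32u

/* Both pointers are kept: 'raw' is what the allocator (and, in MicroPython,
 * the garbage collector) knows about; 'aligned' is what the driver uses. */
typedef struct {
    void *raw;
    void *aligned;
} aligned_block_t;

static aligned_block_t alloc_cache_aligned(size_t size) {
    const size_t mask = (size_t)CACHE_LINE_SIZE - 1u;
    aligned_block_t b = { NULL, NULL };

    /* Round the payload up to whole cache lines, then add up to 31 bytes
     * of slack so an aligned start always fits inside the block. */
    size_t padded = (size + mask) & ~mask;
    b.raw = malloc(padded + mask);
    if (b.raw != NULL) {
        uintptr_t p = ((uintptr_t)b.raw + mask) & ~(uintptr_t)mask;
        b.aligned = (void *)p;
    }
    return b;
}

static void free_cache_aligned(aligned_block_t *b) {
    free(b->raw); /* always release the original pointer, never 'aligned' */
    b->raw = NULL;
    b->aligned = NULL;
}
```

Aligning the start of the containing object only helps if win itself sits at a line-aligned offset inside it, which is why the alignment attribute on the member is needed as well.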

Alternatively, 32 bytes of padding before and after win resolves the problem by ensuring nothing before and after can be corrupted. However, this would be a hack.

Finally, the cache must be invalidated after DMA finishes writing data from the SD card to RAM to ensure speculative CPU reads are dropped.
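The ordering described above can be sketched as follows. The cache and DMA functions here are logging stubs with hypothetical names so the sequence can be checked on a host; on a Cortex-M7 target they would typically map to the CMSIS calls `SCB_CleanInvalidateDCache_by_Addr` and `SCB_InvalidateDCache_by_Addr` plus the port's own DMA read. The 512-byte block size and 32-byte line size are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 32u /* assumption: Cortex-M7 D-cache line size */
#define SD_BLOCK_SIZE 512u  /* assumption: standard SD block size */

/* Operation log, only used to make the ordering observable in this sketch. */
enum { OP_CLEAN_INVALIDATE = 1, OP_DMA_READ = 2, OP_INVALIDATE = 3 };
static int op_log[8];
static int op_count;

static void log_op(int op) {
    if (op_count < 8) {
        op_log[op_count++] = op;
    }
}

/* Hypothetical stand-ins for the real cache maintenance and DMA calls. */
static void cache_clean_invalidate(void *buf, size_t len) {
    (void)buf; (void)len;
    log_op(OP_CLEAN_INVALIDATE);
}

static void cache_invalidate(void *buf, size_t len) {
    (void)buf; (void)len;
    log_op(OP_INVALIDATE);
}

static int dma_read_blocking(uint8_t *dest, uint32_t block, uint32_t count) {
    (void)dest; (void)block; (void)count;
    log_op(OP_DMA_READ);
    return 0; /* 0 means success in this sketch */
}

/* Read 'count' blocks into 'dest' with cache maintenance on both sides. */
static int read_blocks_cache_safe(uint8_t *dest, uint32_t block, uint32_t count) {
    size_t len = (size_t)count * SD_BLOCK_SIZE;

    /* Refuse buffers that share cache lines with other data. */
    if (((uintptr_t)dest % CACHE_LINE_SIZE) != 0 || (len % CACHE_LINE_SIZE) != 0) {
        return -1;
    }

    cache_clean_invalidate(dest, len); /* before DMA: no dirty lines remain */
    int err = dma_read_blocking(dest, block, count);
    cache_invalidate(dest, len);       /* after DMA: drop speculative fills */
    return err;
}
```

Rejecting unaligned buffers outright is a design choice made for this sketch; a real driver might instead bounce through an aligned scratch buffer.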

Additional Information

No, I've provided everything above.

Code of Conduct

Yes, I agree
