gh-103200: Fix performance issues with zipimport.invalidate_caches()
#103208
+35
−25
This PR fixes the over-eagerness of the original zipimport.invalidate_caches() implementation.
Currently in zipimport.invalidate_caches(), the cache of zip files is repopulated at the point of invalidation. This makes cache invalidation slow and violates the semantics of cache invalidation, which should simply clear the cache. Cache repopulation should occur on the next access of files.
There are three relevant events to consider:
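1. The cache is accessed before invalidate_caches() is called
2. invalidate_caches() is called
3. The cache is accessed after invalidate_caches() is called
Events (1) and (2) should be fast, while event (3) can be slow since we're repopulating a cache. In the original implementation, (1) and (3) are fast, but (2) is slow.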
This PR shifts the cost of reading the directory out of cache invalidation and back to cache access, while avoiding any behaviour change introduced in Python 3.10+ and keeping the common path of reading the cache performant.
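The general pattern can be sketched as follows. This is an illustrative toy, not the code in this PR: LazyDirectoryCache, fake_read_directory, and the reads counter are hypothetical names introduced only to show invalidation dropping state while the next access pays for the re-read.

```python
class LazyDirectoryCache:
    """Hypothetical cache whose invalidation only drops state."""

    def __init__(self, read_directory):
        # read_directory stands in for an expensive directory scan.
        self._read_directory = read_directory
        self._files = None
        self.reads = 0  # counts how often the expensive scan runs

    def invalidate(self):
        # Cheap: forget the listing, do not re-read it here.
        self._files = None

    def files(self):
        # The common path is a single None check; the scan only runs
        # on the first access after construction or invalidation.
        if self._files is None:
            self.reads += 1
            self._files = self._read_directory()
        return self._files


def fake_read_directory():
    # Stand-in for reading a zip archive's central directory.
    return {"pkg/__init__.py": 0, "pkg/mod.py": 1}


cache = LazyDirectoryCache(fake_read_directory)

cache.files()
cache.files()
reads_before = cache.reads            # one scan for repeated access

cache.invalidate()
cache.invalidate()
reads_after_invalidate = cache.reads  # invalidation triggers no scan

cache.files()
reads_after_access = cache.reads      # the next access repopulates
```

With this shape, repeated invalidation costs nothing beyond resetting a reference, and the scan happens at most once per invalidation, on the first access that needs it.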
Ideally, this fix should be backported to Python 3.10+.
zipimport.invalidate_caches() implementation causes performance regression for importlib.invalidate_caches() #103200