Fix several MacOS memory management bugs#31769
Conversation
Before calling [NSApp run], we need to drain the previous pool as the windows have been added to it during enumeration.
There exists a (likely rare) scenario in which new NSEvents can be added to the queue indefinitely, and thus the outermost pool will never be drained. Hence, create a new pool and drain it once per iteration.
Now that we are setting up and draining pools, our MatplotlibAppDelegate is short lived, as NSApplication.delegate is a weak reference.
The alloc/init of the NSFileHandle needs a -release The object returned by -addObserverForName:object:queue:usingBlock: is retained by the system and needs a corresponding -removeObserver
The call to -[NSApp windows] needs to be inside of an @autoreleasepool block, else there exists the possibility that the windows array will go into a lower pool that won't be drained.
Using setReleasedWhenClosed:NO has been the standard for programmatically-created windows since the introduction of ARC. Also add calls to setDelegate:nil as NSWindow.delegate was assign until macOS 10.13.
|
Thank you for opening your first PR into Matplotlib! If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks. We also ask that you please finish addressing any review comments on this PR and wait for it to be merged (or closed) before opening a new one, as it can be a valuable learning experience to go through the review process. You can also join us on discourse chat for real-time discussion. For details on testing, writing docs, and our review process, please see the developer guide. We strive to be a welcoming and open project. Please follow our Code of Conduct. |
PR summary
This pull request fixes several memory management issues in
_macosx.m. It allows for windows to be properly dealloc'd, which should fix #31106.I realize that I'm a stranger and first-time contributor, please feel free to push back and be honest! I use to do these types of object ownership audits when I worked at Apple, but it's been almost 15 years since I've looked at manual-reference-counted code :)
Autorelease Pools
One of the root causes of #31106 is that each NSWindow allocated in
FigureManager_new()was added to an NSAutoreleasePool that wasn't drained by the time-[NSApplication run]was called. macOS performed several-retain/-autoreleasepairs on the windows. By the time ourshow()function was called, the retain count of each windows was over 600 (See #31751).Pools at Boundaries
Although messy, a good guideline is to setup an autorelease pool each time you cross from the Python layer into the Objective-C layer. This prevents objects from going into a pool that you have no control over, and may never be drained.
I have added
@autoreleasepoolblocks to every static function exposed to Python. Some of these aren't needed since there are no Obj-C calls (event_loop_is_running(),Timer_dealloc(), etc). I can removed the unneeded ones; however, I felt like it was better to be consistent and guard against the possibility of a future contributor using Obj-C fromevent_loop_is_running()/etc and then re-introducing this issue.I followed the existing coding style and indented each
@autoreleasepoolblock. This resulted in a messy diff due to the number of lines changed. I could instead use preprocessor macros similar toPy_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADSwithout indentation. Perhaps something likeBEGIN_OBJC_CALLS?Pools in Loops
When a loop can iterate indefinitely, it's a good idea to create a pool for each iteration. I did so in
FigureCanvas__start_event_loop()andflushEvents(). There existed a possibility that the call to-sendEvent:could add additional events to the event queue and thus spin the loop indefinitely without pool drainage.Pools before -run
The structure of
show()andwait_for_stdin()is a bit different. Here, we need to use two pools. The first pool will catch the autoreleased value of-[NSApp windows]and will be drained prior to the call to[NSApp run]. Without two pools, the windows will have a +1 retain count and will be waiting to be drained while[NSApp run]is running.NSTrackingArea Leak
#31754, also fixed by PR #31661 . I can back this out if desired.
The
NSTrackingAreacreated inFigureCanvas_init()was leaking. The+alloc/-initneeds a balancing-release.Technically,
NSTrackingArea.ownerwas an unsafeassignreference until macOS 10.13 when it was turned into a saferweakreference. There now exists the possibility of a crash on macOS 10.12 if the owning NSView is deallocated before the NSTrackingArea. I'm not sure if this is worth addressing, however. I can fix it if desired.wake_on_fd_write leaks
#31755
There are two leaks in
wake_on_fd_write():NSFileHandleis never-releaseed.-addObserverForName:object:queue:usingBlock:is retained by the system and not released until-removeObserver:.I fixed both of these by adding cleanup code to the end of the block.
MatplotlibAppDelegate
With pools in place and draining, the
MatplotlibAppDelegateobject created inlazy_init()was short-lived. AsNSApplication.delegateis a weak property, there was no strong reference to it.Per ARC conventions, I made a new
staticvariable to point to the object. While this isn't necessary under manually reference counting, it's typically used to indicate that an+alloc/-initwithout a balanced-releaseis intentional. It also saves a step if a migration to ARC is ever desired.isReleasedWhenClosed
Setting
isReleasedWhenClosedtoNOhas been the standard for programmatically-created windows since the introduction of ARC in 2011. Parts of AppKit, such as NSWindowController, will automatically call[theWindow setReleasedWhenClosed:NO]on adopted windows.While the existing code should correctly release the window per historic API documentation, I'm not sure what kind of test coverage
isReleasedWhenClosed=YESwindows have inside of Apple, as the recommendation is to use NSWindowController.I changed the window to
isReleasedWhenClosed=NOand modified the calls to-closeas follows:The
setDelegate:nilcall is needed asNSWindow.delegatedidn't becomeweakuntil macOS 10.13. I believe that matplotlib officially still supports 10.12.Testing
Checking for Over-releases
Historically, fixing leaks in one section of code can expose over-releases/crashes in other sections. Prior to this PR, we are "leaking" several Objective-C objects due to them going into a pool that is never drained.
To mitigate this, I ran several built-in examples with permutations of the following environmental variables:
I also checked using Instruments with the Zombies template.
That said, the amount of "leaked" objects was large, so this PR may expose existing over-releases as a crash.
Checking for Leaks
I ran the following example and then attached with Instruments with the Allocations template.
I performed several iterations of clicking "Mark Generation" and then closing the window. I proceeded to let the loop run 20+ more times in order to let Python and malloc clear/recycle any memory regions. I found three leaks (the
NSTrackingAreainFigureCanvas_init()and the two inwake_on_fd_write()).AI Disclosure
assigntoweak. Apple doesn't make this information easily accessible.PR checklist
Note
I ran
pytestand got:8858 passed, 1555 skipped, 28 xfailed, 2 xpassedAs a first time contributor, I'm not sure how concerned I should be with "28 xfailed". I don't believe my change could affect those tests.