Fix MethodBinding/OverloadMapper memory leak (#691) #2719

Open
greateggsgreg wants to merge 2 commits into pythonnet:master from greateggsgreg:memleak-691

Conversation

@greateggsgreg greateggsgreg commented May 9, 2026

Fixes #691.

Cause

MethodBinding and OverloadMapper each held a PyObject target but didn't release it in tp_clear, so the underlying CLR instance waited on the .NET finalizer chain to drop the refcount. They also shared the same C# PyObject instance across the mp_subscript / Overloads paths, so disposing one wrapper corrupted the others.

Fix

  • ExtensionType: add virtual OnClear() hook called from tp_clear.
  • MethodBinding / OverloadMapper: override OnClear to dispose target. (targetType left alone — disposing it broke unrelated subclass tests.)
  • Each sharing site now passes new PyObject(self.target.Reference) so each wrapper owns its own INCREF'd reference.
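The ownership change in the last bullet can be illustrated with a toy model. This is a sketch only, not pythonnet code: `PyHandle`, `OwnedRef`, `Binding`, `derive_shared` and `derive_owned` are hypothetical names, and the refcount is a plain integer standing in for the native one.

```python
class PyHandle:
    """Toy stand-in for a native Python object with a manual refcount."""

    def __init__(self):
        self.refcount = 1  # one reference held by "Python" itself


class OwnedRef:
    """Toy stand-in for a C# PyObject: owns exactly one reference."""

    def __init__(self, handle):
        handle.refcount += 1  # models the INCREF taken on construction
        self.handle = handle
        self.disposed = False

    def dispose(self):
        if not self.disposed:
            self.disposed = True
            self.handle.refcount -= 1  # models the DECREF on dispose


class Binding:
    """Toy stand-in for MethodBinding / OverloadMapper."""

    def __init__(self, target):
        self.target = target

    def on_clear(self):
        # Models the OnClear override: eagerly drop the owned reference.
        self.target.dispose()


def derive_shared(binding):
    # Old behaviour (assumed shape): the new wrapper reuses the same OwnedRef.
    return Binding(binding.target)


def derive_owned(binding):
    # New behaviour (assumed shape): the new wrapper takes its own reference.
    return Binding(OwnedRef(binding.target.handle))
```

With `derive_shared`, clearing the first binding leaves the second holding an already-disposed target. With `derive_owned`, each binding disposes only the reference it took, so the refcount returns to its starting value once both are cleared.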

Tests

The three existing *_does_not_leak_memory tests cover the three sharing sites, but their 0.9 MB/iter threshold was too loose: master was leaking ~600 KB/iter and still passing. Tightened to 0.1 MB/iter (about 104 KB/iter).
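For readers unfamiliar with this style of test, a per-iteration leak check can be sketched as follows. This is an assumption-laden illustration, not the actual test code: it swaps in the standard-library `tracemalloc` module, which only sees Python-side allocations, and the helper names `growth_per_iteration` and `leaks` are hypothetical.

```python
import gc
import tracemalloc

MB = 1024 * 1024


def growth_per_iteration(action, iterations=20):
    """Return the average number of traced bytes retained per call to action()."""
    gc.collect()
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            action()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return (after - before) / iterations


def leaks(action, allocation=MB, threshold=0.1, iterations=20):
    """True if action() retains more than threshold * allocation bytes per call."""
    return growth_per_iteration(action, iterations) > allocation * threshold
```

Under this sketch, an action that retains about 600 KB of a 1 MB allocation per call is accepted at `threshold=0.9` and rejected at `threshold=0.1`, which mirrors why the looser limit did not catch the leak.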

Verification (Python 3.14 GIL, linux-aarch64)

|         | .NET 8             | .NET 10               |
|---------|--------------------|-----------------------|
| master  | FAIL, 765 KB/iter  | FAIL, 572-647 KB/iter |
| this PR | PASS, -0.6 KB/iter | PASS, -0.5 KB/iter    |

MethodBinding and OverloadMapper held PyObject `target` references that
were not disposed during tp_clear, leaving Python-side refcount drops to
wait on the multi-hop .NET finalizer chain. They also shared the same
C# PyObject instance across mp_subscript/Overloads paths, so freeing one
could free the underlying Python object out from under the others.

- ExtensionType: add virtual OnClear() hook called from tp_clear before
  the GCHandle is released, letting subclasses eagerly drop owned
  Python references.
- MethodBinding/OverloadMapper: override OnClear to dispose `target`.
  (`targetType` is intentionally not disposed since Python types are
  long-lived and tracked by other caches.)
- Take an independent INCREF'd PyObject copy at every site that hands a
  shared target into a new MethodBinding or OverloadMapper, so each
  wrapper owns its own reference.

Result: the three _does_not_leak_memory tests drop from ~485 MB delta
to ~10 KB delta on Python 3.14.

The previous 90% threshold (0.9 MB/iter against a 1 MB allocation)
documented the issue but did not reproduce it: master leaks
~600-765 KB/iter, which the 0.9 MB threshold accepts as passing.

Drop the threshold to 10% (104 KB/iter). On the 2026-05-09 verification
run with Python 3.14 GIL on linux-aarch64:

  Without fix (master):   ~572-765 KB/iter (FAIL)
  With fix (this branch): ~-500 B/iter     (PASS)

Margin is roughly 6x in either direction across .NET 8 and .NET 10, so
the threshold cleanly separates buggy from fixed states without being
sensitive to GC noise.
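
The threshold arithmetic above can be checked with a few lines. This is a sketch under stated assumptions: the reported "KB" figures are treated as 1000 bytes, the allocation as 2**20 bytes, and `verdict` is a hypothetical helper rather than anything in the test suite.

```python
ALLOCATION = 2 ** 20               # 1 MB allocated per iteration
OLD_THRESHOLD = 0.9 * ALLOCATION   # previous limit, about 944 KB/iter
NEW_THRESHOLD = 0.1 * ALLOCATION   # new limit, about 104.9 KB/iter
KB = 1000                          # assumption: reported KB are decimal


def verdict(leak_bytes, threshold):
    """A run fails when its per-iteration leak exceeds the threshold."""
    return "FAIL" if leak_bytes > threshold else "PASS"


# Extremes of the master measurements quoted above, and the fixed branch.
master_leaks = [572 * KB, 765 * KB]
fixed_leak = -0.5 * KB

old_verdicts = [verdict(leak, OLD_THRESHOLD) for leak in master_leaks]
new_verdicts = [verdict(leak, NEW_THRESHOLD) for leak in master_leaks]
ratios = [leak / NEW_THRESHOLD for leak in master_leaks]
```

Both master figures pass the old limit and fail the new one, and they sit roughly 5.5x and 7.3x above the new threshold, while the fixed branch stays below it.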
Member

lostmsu commented May 11, 2026

If the objects are no longer shared and are always owned by MethodBinding, then NewReference is a better type to hold them than PyObject.

Comment on lines +87 to +94
/// <summary>
/// Called during tp_clear before the GCHandle is released.
/// Override to eagerly dispose Python object references (PyObject fields)
/// held by the subclass, preventing the multi-hop .NET finalizer chain
/// from delaying Python-side refcount decrements.
/// </summary>
protected virtual void OnClear() { }

Member

Is there a reason to not have ExtensionType be IDisposable instead of exposing OnClear?

Author

@greateggsgreg greateggsgreg May 12, 2026

I think tp_clear and Dispose have different contracts. tp_clear releases references to break GC cycles but the Python object can still be reachable afterward, whereas Dispose should mean that the instance is dead. ExtensionType instances are owned by Python's GC, not by .NET callers — there's no using site and no one calls Dispose() on them. OnClear makes it clear that it's a hook fired from tp_clear.
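
The difference between the two contracts can be sketched with a toy model. This is an illustration under assumptions, not pythonnet code: the `Extension` class and its method names are hypothetical, and `RuntimeError` stands in for a disposed-object error.

```python
class Extension:
    """Toy model of an object whose owned references a GC pass may clear."""

    def __init__(self, target):
        self.target = target  # owned reference; None once cleared
        self.alive = True

    def on_clear(self):
        """Hook: drop owned references. Safe to call more than once."""
        self.target = None

    def clear(self):
        """Models tp_clear: references are broken, the object itself survives."""
        self.on_clear()

    def dispose(self):
        """Models a Dispose-style contract: the instance is dead afterwards."""
        self.clear()
        self.alive = False

    def describe(self):
        if not self.alive:
            raise RuntimeError("object used after dispose")
        return "bound" if self.target is not None else "cleared"
```

After `clear()` the object can still be reached and queried, and calling `clear()` again is harmless. After `dispose()` any further use is an error, which is the stronger promise the comment argues against attaching to a tp_clear hook.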

Author

greateggsgreg commented May 12, 2026

NewReference is declared as a ref struct, so it can't be a class field. The closest owned type is PyObject, unless I'm missing something? We'd also have to swap out a few of the call sites in MethodBinding that expect a PyObject.

Ownership is still explicit: OnClear disposes it, the field has no external readers, and every sharing site INCREFs into its own Python object.

Development

Successfully merging this pull request may close these issues.

Memory Leak on MethodBinding for generic method