Skip to content

gh-149590: Remove faulthandler_traverse#150023

Draft
armaan-v924 wants to merge 1 commit into
python:mainfrom
armaan-v924:armaan/issue-149590-multi-instance-faulthandler
Draft

gh-149590: Remove faulthandler_traverse#150023
armaan-v924 wants to merge 1 commit into
python:mainfrom
armaan-v924:armaan/issue-149590-multi-instance-faulthandler

Conversation

@armaan-v924
Copy link
Copy Markdown

@armaan-v924 armaan-v924 commented May 18, 2026

tldr; faulthandler_traverse visits Python objects owned by _PyRuntime, not by the module instance. With multi-phase init allowing multiple module instances, each instance's GC traversal decrements gc_refs on the same runtime-owned objects, driving it negative when two instances are collected simultaneously. Cleanup is already handled well, traversal is redundant, and harmless with a single instantiation.

Expanded Explanation:

Author's note: This is my first foray into the CPython codebase. Would appreciate all the scrutiny :D

RCA:

BG: faulthandler uses multi-phase init, so deleting it from sys.modules and re-importing produces a second, independent module object. Both module objects are tracked by the garbage collector and both have md_state_traverse = faulthandler_traverse stored.

  1. subtract_refs iterates every object in the GC generation list.
  2. For each object it calls Py_TYPE(op)->tp_traverse — for module objects that's module_traverse in Objects/moduleobject.c
  3. module_traverse calls md_state_traverse (i.e. faulthandler_traverse) then visits md_dict
  4. faulthandler_traverse calls Py_VISIT(fatal_error.file)(expands to visit_decref(_PyRuntime.faulthandler.fatal_error.file))
  5. With two module instances this runs twice; gc_refs on fatal_error.file (stderr, by default) is decremented twice
  6. But fatal_error.file's real refcount only has one reference from _PyRuntime; gc_refs goes negative → debug assertion fires, silent UAF in release

Safety

m_traverse exists to tell the GC about Python objects owned by per-module C state (md_state, allocated when m_size > 0). faulthandler has m_size = 0, so there is no per-module C state. fatal_error.file, thread.file, and user_signals[signum].file are all owned by _PyRuntime, not by any module instance. _PyRuntime is not a Python object and is never in containers, so its references are never passed to visit_decref, they contribute to ob_refcnt but are never subtracted, meaning gc_refs >= 1 after subtract_refs. GC will never mark these objects as unreachable regardless of whether the module traverse visits them. Cleanup is already handled correctly: faulthandler_disable() calls Py_CLEAR(fatal_error.file) (and equivalents), and _PyFaulthandler_Fini() calls faulthandler_disable() at shutdown. No GC involvement is needed.

Functionality

From main:
image

After removal:
image

`faulthandler_traverse` visits Python objects owned by `_PyRuntime`, not
by the module instance. With multi-phase init allowing multiple module
instances, each instance's GC traversal decrements `gc_refs` on the same
runtime-owned objects, driving it negative when two instances are
collected simultaneously.
@bedevere-app
Copy link
Copy Markdown

bedevere-app Bot commented May 18, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@python-cla-bot
Copy link
Copy Markdown

python-cla-bot Bot commented May 18, 2026

All commit authors signed the Contributor License Agreement.

CLA signed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant