gh-151618: Add nesting depth limit to xml.etree.ElementTree TreeBuilder#151620
Closed
NaveenKumarG-dev wants to merge 3 commits into
Closed
gh-151618: Add nesting depth limit to xml.etree.ElementTree TreeBuilder#151620NaveenKumarG-dev wants to merge 3 commits into
NaveenKumarG-dev wants to merge 3 commits into
Conversation
…eBuilder Add a MAX_XML_NESTING_DEPTH constant (5000 levels) in treebuilder_handle_start() to prevent C stack overflows caused by deeply nested XML documents. When the limit is exceeded, ParseError is raised with a descriptive message instead of silently building a tree that could crash the interpreter during GC or deepcopy. The FIXME comment /* FIXME: check stack size? */ in treebuilder_done() is replaced with an accurate comment explaining where the guard lives. Note: _Py_EnterRecursiveCall is not suitable here because in Python 3.12+ it checks the C machine stack pointer, but Expat calls treebuilder_handle_start iteratively at a constant depth, so the pointer check never triggers. Also document 'deeply nested elements' as an XML DoS attack vector in Doc/library/xml.rst, which previously listed only 4 vectors.
Documentation build overview
6 files changed ·
|
Member
|
I'm closing this PR because it's premature. We need a discussion first especially to consider whether this is really an issue or not. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes gh-151618.
Summary
xml.etree.ElementTreehad no limit on XML element nesting depth. A/* FIXME: check stack size? */comment inModules/_elementtree.cacknowledged this gap. Without a guard, parsing deeply nested XML silently builds an arbitrarily deepElementtree that can crash the interpreter (SIGSEGV) when Python's GC traverses it recursively (element_gc_clear→Py_DECREF→ destructor chain).Changes
Modules/_elementtree.c#define MAX_XML_NESTING_DEPTH 5000abovetreebuilder_handle_startParseError("xml nesting depth limit (5000 levels) exceeded")whenself->index >= MAX_XML_NESTING_DEPTHFIXMEcomment intreebuilder_donewith an accurate commentLib/test/test_xml_etree.pyNestingDepthTestwith 6 tests (deep nesting, shallow nesting, exact boundary conditions, TreeBuilder, XMLParser)test_xml_etree_c.pyDoc/library/xml.rstMisc/NEWS.d/Why not
_Py_EnterRecursiveCall?In Python 3.12+,
_Py_EnterRecursiveCallchecks the C machine stack pointer, not a counter. Expat callstreebuilder_handle_startiteratively at a constant C depth — the pointer check never triggers. A dedicated counter onself->indexis required.Test Results
0 regressions. 6 new
NestingDepthTestcases all pass viatest_xml_etree_c.