Python: visit function parameter and return annotations in new CFG #21943
Draft
yoff wants to merge 1 commit into
The new (shared-CFG-based) Python control flow graph in `semmle.python.controlflow.internal.Cfg` previously did not emit CFG nodes for parameter type annotations (`def f(x: T): ...`) or for the return type annotation (`-> T`). The legacy CFG emitted both, and a small number of framework models rely on this: `LocalSources.qll`'s `annotatedInstance` walks the parameter annotation expression by way of its CFG node to track that a parameter receives an instance of the annotated class.

After the dataflow flip to the new CFG/SSA, this regression manifested as lost flows in any test exercising annotation-based parameter tracking: FastAPI `Depends()` receivers, Pydantic request bodies, Starlette `WebSocket`, the call-graph type-annotation test, and so on.

Extend `FunctionDefExpr` to visit each annotation as a child of the function-def expression, in CPython evaluation order: positional parameter annotations, `*args` annotation, keyword-only parameter annotations, `**kwargs` annotation, then the return annotation. (Lambda expressions have no annotations in Python syntax, so `LambdaExpr` is unchanged.) PEP 695 type parameters remain out of scope; they belong to the inner annotation scope, not the enclosing CFG.

Restored test results across `framework/aiohttp`, `framework/fastapi`, `framework/lxml`, the `CallGraph-type-annotations` test, and `CWE-022-PathInjection`. Two FastAPI list-comprehension MISSING markers become positive (`taint_test.py:41,55`). CPython CFG consistency remains clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
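The visit order described above can be sketched in Python with the standard `ast` module. This is only an analogue for illustration, not the QL implementation of `FunctionDefExpr`; the helper name `annotations_in_visit_order` is hypothetical, and treating positional-only parameters as coming before regular positional ones is an assumption of this sketch.

```python
import ast


def annotations_in_visit_order(func: ast.FunctionDef) -> list[str]:
    """Return the source text of each annotation on `func`, ordered as:
    positional parameters, *args, keyword-only parameters, **kwargs, return.
    """
    args = func.args
    ordered = []
    # Positional parameters (assumption: positional-only first, then regular).
    for param in args.posonlyargs + args.args:
        ordered.append(param.annotation)
    # The *args parameter, if present.
    if args.vararg is not None:
        ordered.append(args.vararg.annotation)
    # Keyword-only parameters.
    for param in args.kwonlyargs:
        ordered.append(param.annotation)
    # The **kwargs parameter, if present.
    if args.kwarg is not None:
        ordered.append(args.kwarg.annotation)
    # The return annotation comes last.
    ordered.append(func.returns)
    # Unannotated parameters contribute nothing.
    return [ast.unparse(node) for node in ordered if node is not None]


SOURCE = "def f(a: A, b, *args: B, c: C, **kwargs: D) -> R: ...\n"
func = ast.parse(SOURCE).body[0]
print(annotations_in_visit_order(func))  # prints ['A', 'B', 'C', 'D', 'R']
```

Unannotated parameters such as `b` are skipped, which mirrors the idea that only annotation expressions that actually exist become children.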
Follow-up to #21937 (and the broader new-CFG stack: #21921, #21923, #21925).
Root cause. The new control flow graph (`semmle.python.controlflow.internal.Cfg`, introduced in #21921) did not emit CFG nodes for parameter type annotations or for the return-type annotation on function definitions. The legacy CFG emitted both. `LocalSources::annotatedInstance` and a small handful of framework models walk the parameter-annotation expression via its CFG node to identify that a parameter receives an instance of the annotated class, so after the dataflow flip (#21925) annotation-based parameter tracking was silently broken.

Fix. Extend `FunctionDefExpr.getChild` to visit each annotation as a child of the function-def expression, in CPython evaluation order: positional parameter annotations, `*args` annotation, keyword-only parameter annotations, `**kwargs` annotation, then the return annotation. Lambda expressions have no annotations in Python syntax, so `LambdaExpr` is unchanged. PEP 695 type parameters remain out of scope (they belong to the inner annotation scope, not the enclosing CFG).

Results.

- FastAPI `Depends()` receivers, Pydantic request bodies, Starlette `WebSocket` handlers, and the experimental call-graph type-annotation test all detect again.
- `CWE-022-PathInjection` regains the lost `fastapi_path_injection.py:26-27` flow path through `file_handler: FileHandler = Depends()`.
- Two FastAPI list-comprehension MISSING markers become positive (`taint_test.py:41,55`).

Stack: #21921 (P3) → #21923 (P4) → #21925 (Flip) → #21937 (CFG-Exc) → this.
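For readers unfamiliar with the pattern being tracked, here is a minimal runtime sketch of a parameter shaped like `file_handler: FileHandler = Depends()`. `FileHandler` and `Depends` below are hypothetical stand-ins so the snippet runs without FastAPI, and `annotated_classes` is an illustrative helper; the CodeQL analysis derives the same association statically from the annotation's CFG node rather than through `inspect`.

```python
import inspect


class FileHandler:
    """Hypothetical stand-in for the annotated class."""


def Depends():
    """Hypothetical stand-in for FastAPI's Depends(); returns a placeholder."""
    return None


def download(file_handler: FileHandler = Depends(), count=1):
    """Handler whose first parameter is typed only by its annotation."""


def annotated_classes(func):
    """Map each parameter name to the class named in its annotation."""
    result = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        # Parameter.empty is itself a class, so exclude it explicitly.
        if annotation is not inspect.Parameter.empty and inspect.isclass(annotation):
            result[name] = annotation
    return result


print(annotated_classes(download) == {"file_handler": FileHandler})  # prints True
```

The default value `Depends()` says nothing about the type of `file_handler`; only the annotation does, which is why losing the annotation's CFG node also loses the flow.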