Python: reintroduce instance-attribute type-tracking without the classInstanceTracker blow-up#22097
Conversation
|
@copilot add a case to one of the existing tests with a MISSING tag which demonstrates the shortcomings of the approach taken in this PR. |
Added a |
…ker blowup" This reverts commit f251a57.
4181855 to
7acbbb7
Compare
|
I've run DCA. The analysis times varied from +8% to -8%. There is one extra alert, which I haven't looked into yet. |
There was a problem hiding this comment.
Pull request overview
This PR reintroduces Python instance-attribute type-tracking for values written to self.attr in methods and later read as instance.attr, while avoiding the performance blow-up previously caused by using classInstanceTracker inside levelStepCall.
Changes:
- Re-enabled
instanceFieldStepinTypeTrackingInput::levelStepCalland rewroteinstanceAttrReadto identify instances via local flow from resolved constructor calls (instead ofclassInstanceTracker). - Updated type-tracking library tests to reflect the intended precision trade-off (no longer tracking instances that cross a call/return boundary before the read).
- Updated the SQL injection query test annotations/expected results and added a change note documenting the behavior/performance change.
Show a summary per file
| File | Description |
|---|---|
| python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackingImpl.qll | Re-enables instanceFieldStep and changes instance identification to avoid recursive type-tracker dependency. |
| python/ql/test/library-tests/dataflow/typetracking/attribute_tests.py | Adjusts inline expectations and adds a regression-style example for the reduced-precision case. |
| python/ql/test/query-tests/Security/CWE-089-SqlInjection/app.py | Fixes inline expectations so unsafe3 is treated as a source/alert pair. |
| python/ql/test/query-tests/Security/CWE-089-SqlInjection/SqlInjection.expected | Updates expected results/edges/nodes to include the newly-detected unsafe3 finding. |
| python/ql/lib/change-notes/2026-06-30-instance-attribute-typetracking-performance.md | Documents the performance-motivated redesign and its precision trade-off. |
Review details
- Files reviewed: 5/5 changed files
- Comments generated: 0
- Review effort level: Low
#22092 disabled the
instanceFieldStepdisjunct ofTypeTrackingInput::levelStepCallbecause it identified instances viaclassInstanceTracker— a full type-tracker run — from insidelevelStepCall, creating a mutual recursion that caused catastrophic slowdowns on OOP-heavy code bases (e.g.mypy,dask). This reintroduces the same flow (a value stored onself.attrin a method, read later asinstance.attr) via a cheaper instance-identification strategy.Change
instanceAttrReadnow identifies the instance using local flow (getALocalSource(), purelysimpleLocalFlowStep-based) from a constructor call resolved byresolveClassCall/classTracker, instead ofclassInstanceTracker. This drops the recursive dependency to the same call-graph machinery already used by the un-disabledinheritedFieldStep.instanceFieldStepis otherwise unchanged (samelevelStepCallshape, samegetADirectSuperclass*relation).Trade-off
Precision drops for instances that flow across a call/return before being read — these are no longer covered, since the instance is now tracked only by local flow rather than a dedicated instance type-tracker.
Notes for reviewers
instanceFieldStepare reproduced; typetracking, dataflow, callgraph, ApiGraphs, and taint library tests pass.classTracker+ local flow vs.classInstanceTracker); it was not measured againstmypy/daskin this environment, so confirming the blow-up is resolved on those code bases is worth doing before merge.