Skip to content

Python: reintroduce instance-attribute type-tracking without the classInstanceTracker blow-up#22097

Open
owen-mc with Copilot wants to merge 4 commits into
codeql-cli-2.26.0from
copilot/reintroduce-functionality
Open

Python: reintroduce instance-attribute type-tracking without the classInstanceTracker blow-up#22097
owen-mc with Copilot wants to merge 4 commits into
codeql-cli-2.26.0from
copilot/reintroduce-functionality

Conversation

Copilot AI commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

#22092 disabled the instanceFieldStep disjunct of TypeTrackingInput::levelStepCall because it identified instances via classInstanceTracker — a full type-tracker run — from inside levelStepCall, creating a mutual recursion that caused catastrophic slowdowns on OOP-heavy code bases (e.g. mypy, dask). This reintroduces the same flow (a value stored on self.attr in a method, read later as instance.attr) via a cheaper instance-identification strategy.

Change

  • instanceAttrRead now identifies the instance using local flow (getALocalSource(), purely simpleLocalFlowStep-based) from a constructor call resolved by resolveClassCall/classTracker, instead of classInstanceTracker. This drops the recursive dependency to the same call-graph machinery already used by the un-disabled inheritedFieldStep.
  • instanceFieldStep is otherwise unchanged (same levelStepCall shape, same getADirectSuperclass* relation).
  • Updated QLDoc + change note.
private predicate instanceAttrRead(Class cls, string attr, DataFlowPublic::AttrRead read) {
  exists(DataFlowPublic::CallCfgNode construction |
    DataFlowDispatch::resolveClassCall(construction.asCfgNode(), cls) and
    read.getObject().getALocalSource() = construction
  ) and
  read.mayHaveAttributeName(attr)
}

Trade-off

Precision drops for instances that flow across a call/return before being read — these are no longer covered, since the instance is now tracked only by local flow rather than a dedicated instance type-tracker.

Notes for reviewers

  • The 4 external-instance-read results previously produced by instanceFieldStep are reproduced; typetracking, dataflow, callgraph, ApiGraphs, and taint library tests pass.
  • The performance improvement is argued from the dependency structure (classTracker + local flow vs. classInstanceTracker); it was not measured against mypy/dask in this environment, so confirming the blow-up is resolved on those code bases is worth doing before merge.

@owen-mc

owen-mc commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

@copilot add a case to one of the existing tests with a MISSING tag which demonstrates the shortcomings of the approach taken in this PR.

Copilot AI commented Jun 30, 2026

Copy link
Copy Markdown
Contributor Author

@copilot add a case to one of the existing tests with a MISSING tag which demonstrates the shortcomings of the approach taken in this PR.

Added a MISSING case to attribute_tests.py (in the "attribute set in method" section). It defines a factory make_my_class2() returning a MyClass2(), then reads .foo on the returned instance; since instanceFieldStep now identifies instances only by local flow from the constructor call, the value stored on self.foo is no longer seen once the instance crosses the return boundary, so the read is annotated # $ MISSING: tracked. The typetracking tests still pass.

Copilot AI requested a review from owen-mc June 30, 2026 20:25
@owen-mc owen-mc force-pushed the copilot/reintroduce-functionality branch from 4181855 to 7acbbb7 Compare July 2, 2026 13:20
@owen-mc owen-mc changed the base branch from main to codeql-cli-2.26.0 July 2, 2026 13:25
@owen-mc

owen-mc commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

I've run DCA. The analysis times varied from +8% to -8%. There is one extra alert, which I haven't looked into yet.

@owen-mc owen-mc marked this pull request as ready for review July 3, 2026 19:56
@owen-mc owen-mc requested a review from a team as a code owner July 3, 2026 19:56
Copilot AI review requested due to automatic review settings July 3, 2026 19:56

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reintroduces Python instance-attribute type-tracking for values written to self.attr in methods and later read as instance.attr, while avoiding the performance blow-up previously caused by using classInstanceTracker inside levelStepCall.

Changes:

  • Re-enabled instanceFieldStep in TypeTrackingInput::levelStepCall and rewrote instanceAttrRead to identify instances via local flow from resolved constructor calls (instead of classInstanceTracker).
  • Updated type-tracking library tests to reflect the intended precision trade-off (no longer tracking instances that cross a call/return boundary before the read).
  • Updated the SQL injection query test annotations/expected results and added a change note documenting the behavior/performance change.
Show a summary per file
File Description
python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackingImpl.qll Re-enables instanceFieldStep and changes instance identification to avoid recursive type-tracker dependency.
python/ql/test/library-tests/dataflow/typetracking/attribute_tests.py Adjusts inline expectations and adds a regression-style example for the reduced-precision case.
python/ql/test/query-tests/Security/CWE-089-SqlInjection/app.py Fixes inline expectations so unsafe3 is treated as a source/alert pair.
python/ql/test/query-tests/Security/CWE-089-SqlInjection/SqlInjection.expected Updates expected results/edges/nodes to include the newly-detected unsafe3 finding.
python/ql/lib/change-notes/2026-06-30-instance-attribute-typetracking-performance.md Documents the performance-motivated redesign and its precision trade-off.

Review details

  • Files reviewed: 5/5 changed files
  • Comments generated: 0
  • Review effort level: Low

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants