Align codegen metadata with CPython#7952
Conversation
|
Caution Review failedFailed to post review comments 📝 WalkthroughWalkthroughAdds lineno_override and a line-only sentinel to codegen IR and preserves it through cloning/jump-threading; introduces ASYNC_GENERATOR flag and integrates it into VM fast-paths and frame handling; refines SymbolTable for methods and type-parameter defaults; aligns float formatting/to_hex with CPython; normalizes PEP 263 encodings; and updates spellfile and dis_dump. ChangesIR Location & Optimization
ASYNC_GENERATOR Code Flag Support
Symbol Table: methods & type-params
Float literal formatting & to_hex
Source encoding & stdlib
Supporting updates
🎯 4 (Complex) | ⏱️ ~75 minutes Possibly Related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [ ] test: cpython/Lib/test/test_structseq.py dependencies: dependent tests: (no tests depend on structseq) [x] lib: cpython/Lib/py_compile.py dependencies:
dependent tests: (42 tests)
[x] test: cpython/Lib/test/test_float.py (TODO: 4) dependencies: dependent tests: (no tests depend on float) [x] test: cpython/Lib/test/test_sys.py (TODO: 7) dependencies: dependent tests: (233 tests)
[x] lib: cpython/Lib/socketserver.py dependencies:
dependent tests: (11 tests)
[ ] test: cpython/Lib/test/test_timeout.py dependencies: dependent tests: (no tests depend on timeout) [x] lib: cpython/Lib/inspect.py dependencies:
dependent tests: (90 tests)
[ ] test: cpython/Lib/test/test_exceptions.py (TODO: 23) dependencies: dependent tests: (no tests depend on exception) [x] test: cpython/Lib/test/test_compile.py (TODO: 14) dependencies: dependent tests: (no tests depend on compile) [x] lib: cpython/Lib/dis.py dependencies:
dependent tests: (73 tests)
[ ] lib: cpython/Lib/unittest dependencies:
dependent tests: (395 tests)
Legend:
|
8a0a382 to
a8f9a90
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
crates/codegen/src/ir.rs (1)
6632-6644:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve
lineno_overrideon synthesized control-flow markers.These initializers copy an existing source span but reset
lineno_overridetoNone. That dropsSome(-1)/LINE_ONLY_LOCATION_OVERRIDE, so laterinstruction_lineno()andgenerate_linetable()treat the syntheticNotTaken/Nopas normal column-bearing instructions and change the emitted linetable.Suggested fix
- lineno_override: None, + lineno_override: last_ins.lineno_override,- lineno_override: None, + lineno_override: source.lineno_override,Also applies to: 6677-6711, 7811-7820, 7848-7857
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/codegen/src/ir.rs` around lines 6632 - 6644, Synthesized control-flow marker InstructionInfo instances (e.g., the NotTaken and Nop entries) currently reset lineno_override to None, which drops special values like Some(-1); change those initializers to preserve the source span's override by copying last_ins.lineno_override (e.g., set lineno_override: last_ins.lineno_override) so instruction_lineno()/generate_linetable() sees the original override; apply the same change to all similar blocks that construct InstructionInfo (the ones around NotTaken/Nop at the other locations mentioned).crates/codegen/src/symboltable.rs (2)
1508-1514:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPass a real
has_defaultspredicate here.Hard-coding
truemeans every generic function gets a.defaultsparameter, even when it has no positional defaults. That undercuts the new conditional registration inenter_type_param_block()and changes the type-parameter scope signature.Suggested fix
self.enter_type_param_block( &format!("<generic parameters of {}>", name.as_str()), self.line_index_start(type_params.range), false, - true, + Self::has_positional_defaults(parameters), Self::has_kwonlydefaults(parameters), )?;fn has_positional_defaults(parameters: &ast::Parameters) -> bool { parameters .posonlyargs .iter() .chain(parameters.args.iter()) .any(|arg| arg.default.is_some()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/codegen/src/symboltable.rs` around lines 1508 - 1514, The call to enter_type_param_block currently passes a hard-coded true for the has_defaults predicate which forces every generic to get a .defaults parameter; replace that literal with a real predicate like has_positional_defaults(parameters). Add a helper function has_positional_defaults(&ast::Parameters) -> bool that returns true if any posonlyargs or args have a default (iterate parameters.posonlyargs.chain(parameters.args) and check arg.default.is_some()), then call enter_type_param_block(..., Self::has_kwonlydefaults(parameters), has_positional_defaults(parameters)) so the type-parameter scope signature and conditional registration remain correct.
1126-1146:⚠️ Potential issue | 🟠 Major | ⚡ Quick winLook through synthetic type-parameter scopes when setting
is_method.Generic class methods are entered under
CompilerScope::TypeParams, soparent.typ == CompilerScope::Classmisses cases likeclass C: def f[T](...). That leaves the new metadata false for generic methods.Suggested fix
- let is_method = parent.is_some_and(|table| { - table.typ == CompilerScope::Class + let is_method = parent.is_some_and(|table| { + (table.typ == CompilerScope::Class + || (table.typ == CompilerScope::TypeParams && table.can_see_class_scope)) && matches!( typ, CompilerScope::Function | CompilerScope::AsyncFunction | CompilerScope::Lambda | CompilerScope::Comprehension ) });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/codegen/src/symboltable.rs` around lines 1126 - 1146, The is_method check only looks at the immediate parent and misses synthetic CompilerScope::TypeParams wrappers (so generic class methods under a TypeParams scope are not detected); change the logic that computes is_method to walk up self.tables backwards skipping any CompilerScope::TypeParams entries (e.g., find the nearest ancestor whose typ != CompilerScope::TypeParams) and then test whether that ancestor.typ == CompilerScope::Class and the new scope typ matches function-like kinds (the existing match on CompilerScope::Function | AsyncFunction | Lambda | Comprehension); update the local is_method computation (the variable named is_method) to use that ancestor test instead of parent.is_some_and(|table| table.typ == CompilerScope::Class).
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)
452-469: Clean build recommended for bytecode changes.Since
CodeFlagsconstants are being added, consider performing a clean build to ensure no stale bytecode artifacts remain. As per coding guidelines, runcargo fmtto format the code andcargo clippyto check for any new lints before completing the task.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/compiler-core/src/bytecode.rs` around lines 452 - 469, You added new CodeFlags constants in the CodeFlags bitflags block (in bytecode.rs) which can leave stale artifacts or style/lint issues; perform a clean rebuild and run cargo fmt and cargo clippy: clean the build artifacts (cargo clean), build and test (cargo build && cargo test) to ensure no stale bytecode or compilation errors, run cargo fmt to apply formatting to crates/compiler-core/src/bytecode.rs and the workspace, and run cargo clippy to address any new lints introduced by the added flags (fix or silence issues reported for CodeFlags, e.g., unused or naming lints).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@crates/codegen/src/ir.rs`:
- Around line 6632-6644: Synthesized control-flow marker InstructionInfo
instances (e.g., the NotTaken and Nop entries) currently reset lineno_override
to None, which drops special values like Some(-1); change those initializers to
preserve the source span's override by copying last_ins.lineno_override (e.g.,
set lineno_override: last_ins.lineno_override) so
instruction_lineno()/generate_linetable() sees the original override; apply the
same change to all similar blocks that construct InstructionInfo (the ones
around NotTaken/Nop at the other locations mentioned).
In `@crates/codegen/src/symboltable.rs`:
- Around line 1508-1514: The call to enter_type_param_block currently passes a
hard-coded true for the has_defaults predicate which forces every generic to get
a .defaults parameter; replace that literal with a real predicate like
has_positional_defaults(parameters). Add a helper function
has_positional_defaults(&ast::Parameters) -> bool that returns true if any
posonlyargs or args have a default (iterate
parameters.posonlyargs.chain(parameters.args) and check arg.default.is_some()),
then call enter_type_param_block(..., Self::has_kwonlydefaults(parameters),
has_positional_defaults(parameters)) so the type-parameter scope signature and
conditional registration remain correct.
- Around line 1126-1146: The is_method check only looks at the immediate parent
and misses synthetic CompilerScope::TypeParams wrappers (so generic class
methods under a TypeParams scope are not detected); change the logic that
computes is_method to walk up self.tables backwards skipping any
CompilerScope::TypeParams entries (e.g., find the nearest ancestor whose typ !=
CompilerScope::TypeParams) and then test whether that ancestor.typ ==
CompilerScope::Class and the new scope typ matches function-like kinds (the
existing match on CompilerScope::Function | AsyncFunction | Lambda |
Comprehension); update the local is_method computation (the variable named
is_method) to use that ancestor test instead of parent.is_some_and(|table|
table.typ == CompilerScope::Class).
---
Nitpick comments:
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 452-469: You added new CodeFlags constants in the CodeFlags
bitflags block (in bytecode.rs) which can leave stale artifacts or style/lint
issues; perform a clean rebuild and run cargo fmt and cargo clippy: clean the
build artifacts (cargo clean), build and test (cargo build && cargo test) to
ensure no stale bytecode or compilation errors, run cargo fmt to apply
formatting to crates/compiler-core/src/bytecode.rs and the workspace, and run
cargo clippy to address any new lints introduced by the added flags (fix or
silence issues reported for CodeFlags, e.g., unused or naming lints).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: c62028dd-45be-4d98-88b4-f81b2cf7f8e6
⛔ Files ignored due to path filters (9)
Lib/test/test_compile.pyis excluded by!Lib/**Lib/test/test_dis.pyis excluded by!Lib/**Lib/test/test_exceptions.pyis excluded by!Lib/**Lib/test/test_inspect/test_inspect.pyis excluded by!Lib/**Lib/test/test_py_compile.pyis excluded by!Lib/**Lib/test/test_strtod.pyis excluded by!Lib/**Lib/test/test_sys_settrace.pyis excluded by!Lib/**Lib/test/test_unittest/test_async_case.pyis excluded by!Lib/**crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__nested_double_async_with.snapis excluded by!**/*.snap
📒 Files selected for processing (10)
.cspell.dict/cpython.txtcrates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/codegen/src/symboltable.rscrates/compiler-core/src/bytecode.rscrates/literal/src/float.rscrates/vm/src/builtins/function.rscrates/vm/src/frame.rscrates/vm/src/stdlib/builtins.rsscripts/dis_dump.py
a8f9a90 to
0961896
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)
1508-1514:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPass the real positional-default presence into the type-params scope.
Line 1512 hardcodes
has_defaultstotrue, so a generic function with no positional defaults still gets a synthetic.defaultsparameter. That changes the helper scope’s parameter metadata and can shift later synthetic slots away from CPython.Suggested fix
if let Some(type_params) = type_params { self.enter_type_param_block( &format!("<generic parameters of {}>", name.as_str()), self.line_index_start(type_params.range), false, - true, + Self::has_defaults(parameters), Self::has_kwonlydefaults(parameters), )?; self.scan_type_params(type_params)?; }Add a helper alongside
has_kwonlydefaults:fn has_defaults(parameters: &ast::Parameters) -> bool { parameters .posonlyargs .iter() .chain(parameters.args.iter()) .any(|arg| arg.default.is_some()) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/codegen/src/symboltable.rs` around lines 1508 - 1514, The call to enter_type_param_block currently hardcodes has_defaults to true causing synthetic `.defaults` to be added even when no positional defaults exist; add a helper fn has_defaults(parameters: &ast::Parameters) -> bool that checks posonlyargs and args for any arg.default.is_some(), then replace the hardcoded true in the enter_type_param_block invocation with Self::has_defaults(parameters) (leave Self::has_kwonlydefaults(parameters) as-is) so the type-params scope gets the correct positional-default presence; update any visibility (impl Self) as needed where has_kwonlydefaults is defined.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 4597-4599: debug_late_cfg_trace() must mirror finalize_code()
exactly by running the same constant-folding passes around peephole
optimization: add calls to fold_constants_per_block() before and after the
peephole/optimize steps so the trace includes the two new
fold_constants_per_block() passes that finalize_code() runs; ensure these new
calls are placed in the same order relative to optimize_build_tuple_unpack(),
eliminate_dead_stores(), apply_static_swaps(), and peephole_optimize() so the
trace output stays in lockstep with finalize_code().
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 912-921: The PartialEq implementation for ConstantData currently
treats Float/Complex containing NaN as unequal to themselves, violating Rust's
Eq reflexivity; update ConstantData::PartialEq (the Float and Complex match
arms) to compare floats/complex parts by their bit-patterns (use to_bits()
equality for float components) so NaNs compare equal to themselves, and keep
impl Eq and the Hash impl unchanged; then move CPython's "don't merge NaNs"
behavior out of ConstantData into the constant interning/dedup layer (the
constant-key/wrapper or the interning/merge function) so that the interner
refuses to merge constants that contain NaNs instead of encoding that behavior
in ConstantData::PartialEq.
---
Outside diff comments:
In `@crates/codegen/src/symboltable.rs`:
- Around line 1508-1514: The call to enter_type_param_block currently hardcodes
has_defaults to true causing synthetic `.defaults` to be added even when no
positional defaults exist; add a helper fn has_defaults(parameters:
&ast::Parameters) -> bool that checks posonlyargs and args for any
arg.default.is_some(), then replace the hardcoded true in the
enter_type_param_block invocation with Self::has_defaults(parameters) (leave
Self::has_kwonlydefaults(parameters) as-is) so the type-params scope gets the
correct positional-default presence; update any visibility (impl Self) as needed
where has_kwonlydefaults is defined.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 24688510-8965-405d-895b-9187436f34e1
⛔ Files ignored due to path filters (9)
Lib/test/test_compile.pyis excluded by!Lib/**Lib/test/test_dis.pyis excluded by!Lib/**Lib/test/test_exceptions.pyis excluded by!Lib/**Lib/test/test_inspect/test_inspect.pyis excluded by!Lib/**Lib/test/test_py_compile.pyis excluded by!Lib/**Lib/test/test_strtod.pyis excluded by!Lib/**Lib/test/test_sys_settrace.pyis excluded by!Lib/**Lib/test/test_unittest/test_async_case.pyis excluded by!Lib/**crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__nested_double_async_with.snapis excluded by!**/*.snap
📒 Files selected for processing (10)
.cspell.dict/cpython.txtcrates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/codegen/src/symboltable.rscrates/compiler-core/src/bytecode.rscrates/literal/src/float.rscrates/vm/src/builtins/function.rscrates/vm/src/frame.rscrates/vm/src/stdlib/builtins.rsscripts/dis_dump.py
✅ Files skipped from review due to trivial changes (1)
- .cspell.dict/cpython.txt
| // Match CPython's _PyCode_ConstantKey(): NaN constants are not | ||
| // merged because nan != nan at the Python level. | ||
| (Float { value: a }, Float { value: b }) => { | ||
| !a.is_nan() && !b.is_nan() && a.to_bits() == b.to_bits() | ||
| } | ||
| (Complex { value: a }, Complex { value: b }) => { | ||
| if a.re.is_nan() || a.im.is_nan() || b.re.is_nan() || b.im.is_nan() { | ||
| return false; | ||
| } | ||
| a.re.to_bits() == b.re.to_bits() && a.im.to_bits() == b.im.to_bits() |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does Rust's Eqtrait require reflexive equality, and what issues can occur if aHash/Eq key type compares unequal to itself for NaN values?
💡 Result:
Yes, Rust's Eq trait strictly requires reflexive equality [1][2]. Specifically, for any type that implements Eq, the relation must satisfy the following properties for all values a, b, and c: - Reflexivity: a == a must always be true [1][2]. - Symmetry: a == b implies b == a [1][2]. - Transitivity: a == b and b == c implies a == c [1][2]. While PartialEq only requires symmetry and transitivity, Eq acts as a marker trait (it has no methods of its own) that informs the compiler and other developers that the equality relation is a total equivalence relation, including reflexivity [1][2][3]. Issues with NaN and Hash/Eq Keys: Floating-point types like f32 and f64 do not implement Eq precisely because NaN (Not-a-Number) values are not equal to themselves (NaN!= NaN), which violates the reflexivity requirement [4][5]. If you were to force a type that compares unequal to itself (like a raw float) to be used as a key in a HashMap or HashSet, it would cause significant logic errors [6][7]: 1. Unreachable Data: HashMap and HashSet rely on the assumption that a key is equal to itself [6][7]. If you insert a value (like NaN) as a key, you may be unable to retrieve, remove, or even verify the existence of that entry later, because the lookup process will fail to find the key even if it is present in the map [4][5]. 2. Logic Errors: Violating the Eq contract is considered a logic error [1][6]. While it will not cause undefined behavior (memory safety issues) in safe Rust, it leads to unpredictable behavior in collections, such as silent data loss or the inability to manage stored items [6][7]. To use floating-point numbers as keys in Rust, the standard approach is to use a wrapper type (such as those provided by the ordered-float crate) that defines a total ordering and equality for floats, typically by treating all NaN values as equal to each other or by rejecting them entirely [4][8].
Citations:
- 1: https://rust.docs.kernel.org/6.14/core/cmp/trait.Eq.html
- 2: https://doc.rust-lang.org/std/cmp/trait.Eq.html
- 3: https://www.linkedin.com/pulse/eq-partialeq-rust-amit-nadiger-514uc
- 4: https://users.rust-lang.org/t/hashmap-key-cant-be-float-number-type-why/7892
- 5: Problem with HashMaps and Resources bevyengine/bevy#6818
- 6: https://dev-doc.rust-lang.org/stable/std/collections/hash_map/struct.HashMap.html
- 7: https://dev-doc.rust-lang.org/core/hash/trait.Hash.html
- 8: https://www.reddit.com/r/rust/comments/1dfpnok/the_best_way_to_make_a_hash_map_of_floats/
Fix ConstantData’s Eq contract for NaN values.
Eq in Rust must be reflexive (a == a always). The current ConstantData PartialEq logic for Float/Complex makes NaN-bearing constants compare unequal to themselves (around 912–921), while impl Eq for ConstantData and its Hash impl remain in place (around 937 and 939–962). This violates Rust’s Eq/Hash assumptions and can break constant interning/dedup/lookups. Keep ConstantData’s equality reflexive and move CPython’s “don’t merge NaNs” behavior into the constant-key/wrapper or the interning/dedup logic instead of encoding it in ConstantData::PartialEq.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/compiler-core/src/bytecode.rs` around lines 912 - 921, The PartialEq
implementation for ConstantData currently treats Float/Complex containing NaN as
unequal to themselves, violating Rust's Eq reflexivity; update
ConstantData::PartialEq (the Float and Complex match arms) to compare
floats/complex parts by their bit-patterns (use to_bits() equality for float
components) so NaNs compare equal to themselves, and keep impl Eq and the Hash
impl unchanged; then move CPython's "don't merge NaNs" behavior out of
ConstantData into the constant interning/dedup layer (the constant-key/wrapper
or the interning/merge function) so that the interner refuses to merge constants
that contain NaNs instead of encoding that behavior in ConstantData::PartialEq.
0961896 to
8b87ce4
Compare
Summary by CodeRabbit
Bug Fixes
Chores
Tests