Commit 46cd747

youknowone and Copilot authored and committed
Bytecode parity (#7475)
* Emit TO_BOOL before conditional jumps, fix class/module prologue
  - Emit TO_BOOL before POP_JUMP_IF_TRUE/FALSE in the general case of compile_jump_if (Compare expressions excluded since they already produce a bool)
  - Module-level __doc__: use STORE_NAME instead of STORE_GLOBAL
  - Class body __module__: use LOAD_NAME instead of LOAD_GLOBAL
  - Class body: store __firstlineno__ before __doc__
* Emit MAKE_CELL and COPY_FREE_VARS before RESUME
  Emit MAKE_CELL for each cell variable and COPY_FREE_VARS N for free variables at the start of each code object, before RESUME. These instructions are no-ops in the VM but align the bytecode with CPython 3.14's output.
* Emit __static_attributes__ at end of class bodies
  Store a tuple of attribute names (currently always empty) as __static_attributes__ in the class namespace, matching CPython 3.14's class body epilogue. Attribute name collection from self.xxx accesses is a follow-up task.
* Remove expectedFailure from DictProxyTests iter tests
  test_iter_keys, test_iter_values, test_iter_items now pass because class bodies emit __static_attributes__ and __firstlineno__, matching the expected dict key set.
* Use 1-based stack indexing for LIST_EXTEND, SET_UPDATE, etc.
  Switch LIST_APPEND, LIST_EXTEND, SET_ADD, SET_UPDATE, MAP_ADD from 0-based to 1-based stack depth argument, matching CPython's PEEK(oparg) convention. Adjust the VM to subtract 1 before calling nth_value.
* Use plain LOAD_ATTR + PUSH_NULL for calls on imported names
  When the call target is an attribute of an imported name (e.g., logging.getLogger()), use plain LOAD_ATTR (method_flag=0) with a separate PUSH_NULL instead of method-mode LOAD_ATTR. This matches CPython 3.14's behavior which avoids the method call optimization for module attribute access.
* Duplicate return-None epilogue for fall-through blocks
  When the last block in a code object is exactly LOAD_CONST None + RETURN_VALUE (the implicit return), duplicate these instructions into blocks that would otherwise fall through to it. This matches CPython 3.14's behavior of giving each code path its own explicit return instruction.
* Run cargo fmt on ir.rs
* Remove expectedFailure from test_intrinsic_1 in test_dis
* Emit TO_BOOL before conditional jumps for all expressions including Compare
* Add __classdict__ cell for classes with function definitions
  Set needs_classdict=true for class scopes that contain function definitions (def/async def), matching CPython 3.14's behavior for PEP 649 deferred annotation support. Also restore the Compare expression check in compile_jump_if to skip TO_BOOL for comparison operations.
* Emit __classdictcell__ store in class body epilogue
  Store the __classdict__ cell reference as __classdictcell__ in the class namespace when the class has __classdict__ as a cell variable. Uses LOAD_DEREF (RustPython separates cell vars from fast locals unlike CPython's unified array).
* Always run DCE to remove dead code after terminal instructions
  Run basic dead code elimination (truncating instructions after RETURN_VALUE/RAISE/JUMP within blocks) at all optimization levels, not just optimize > 0. CPython always removes this dead code during assembly.
* Restrict LOAD_ATTR plain mode to module/class scope imports
  Only use plain LOAD_ATTR + PUSH_NULL for imports at module or class scope. Function-local imports use method call mode LOAD_ATTR, matching CPython 3.14's behavior.
* Eliminate unreachable blocks after jump normalization
  Split DCE into two phases: (1) within-block truncation after terminal instructions (always runs), (2) whole-block elimination for blocks only reachable via fall-through from terminal blocks (runs after normalize_jumps when dead jump instructions exist).
* Fold BUILD_TUPLE 0 into LOAD_CONST empty tuple
  Convert BUILD_TUPLE with size 0 to LOAD_CONST () during constant folding, matching CPython's optimization for empty tuple literals.
* Handle __classcell__ and __classdictcell__ in type.__new__
  - Remove __classcell__ from class dict after setting the cell value
  - Add __classdictcell__ handling: set cell to class namespace dict, then remove from class dict
  - Register __classdictcell__ identifier
  - Use LoadClosure instead of LoadDeref for __classdictcell__ emission
  - Reorder MakeFunctionFlag bits to match CPython
  - Run ruff format on scripts
* Revert __classdict__ cell and __classdictcell__ changes
  The __classdict__ cell addition (for classes with function defs) and __classdictcell__ store caused cell initialization failures in importlib. These require deeper VM changes to properly support the cell variable lifecycle. Reverted for stability.
* Fix unreachable block elimination with fixpoint reachability
  Use fixpoint iteration to properly determine block reachability: only mark jump targets of already-reachable blocks, preventing orphaned blocks from falsely marking their targets as reachable. Also add a final DCE pass after assembly NOP removal to catch dead code created by normalize_jumps.
* Check enclosing scopes for IMPORTED flag in LOAD_ATTR mode
  When deciding whether to use plain LOAD_ATTR for attribute calls, check if the name is imported in any enclosing scope (not just the current scope). This handles the common pattern where a module is imported at module level but used inside functions.
* Add __classdict__ cell for classes with function definitions
  Set needs_classdict=true when a class scope contains function definitions (def/async def), matching CPython 3.14 which always creates a __classdict__ cell for PEP 649 support in such classes.
* Store __classdictcell__ in class body epilogue
  Store the __classdict__ cell reference as __classdictcell__ in the class namespace using LoadClosure (which loads the cell object itself, not the value inside). This matches CPython 3.14's class body epilogue.
* Fix clippy collapsible_if warnings and cargo fmt
* Revert __classdict__ and __classdictcell__ changes (cause import failures)
* Revert type.__new__ __classcell__ removal and __classdictcell__ handling
  Revert the class cell cleanup changes from e6975f9 that cause import failures when frozen module bytecode is stale. The original behavior (not removing __classcell__ from class dict) is restored.
* Re-add __classdict__ cell and __classdictcell__ store
  Restore the __classdict__ cell for classes with function definitions and __classdictcell__ store in class body epilogue. Previous failure was caused by stale .pyc cache files containing bytecode from an intermediate MakeFunctionFlag reorder attempt, not by these changes themselves.
* Reorder MakeFunctionFlag to match CPython's SET_FUNCTION_ATTRIBUTE
  Reorder discriminants: Defaults=0, KwOnlyDefaults=1, Annotations=2, Closure=3, Annotate=4, TypeParams=5. This aligns the oparg values with CPython 3.14's convention. Note: after this change, stale .pyc cache files must be deleted (find . -name '*.pyc' -delete) to avoid bytecode mismatch errors.
* Use CPython-compatible power-of-two encoding for SET_FUNCTION_ATTRIBUTE
  Override From/TryFrom for MakeFunctionFlag to use power-of-two values (1,2,4,8,16,32) matching CPython's SET_FUNCTION_ATTRIBUTE oparg encoding, instead of sequential discriminants (0,1,2,3,4,5).
* Remove expectedFailure from test_elim_jump_after_return1 and test_no_jump_over_return_out_of_finally_block
* Remove __classcell__ and __classdictcell__ from class dict in type.__new__
* Remove expectedFailure from test___classcell___expected_behaviour, cargo fmt
* Handle MakeCell and CopyFreeVars as no-ops in JIT
  These prologue instructions are handled at frame creation time by the VM. The JIT operates on already-initialized frames, so these can be safely skipped during compilation.
* Remove expectedFailure from test_load_fast_known_simple
* Restore expectedFailure for test_load_fast_known_simple
  The test expects LOAD_FAST_BORROW_LOAD_FAST_BORROW superinstruction which RustPython does not emit yet.
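
The last two MakeFunctionFlag items in the message describe an enum whose discriminants stay sequential while its oparg encoding becomes a power of two. A minimal sketch of that idea follows. The enum below is a stand-alone mirror written for illustration and is not the RustPython definition; the variant names and values are taken from the message, while the `u32` oparg type and the error type are assumptions.

```rust
use std::convert::TryFrom;

/// Illustrative mirror of the flag enum named in the commit message.
/// Discriminants are the sequential values listed there (0..=5).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MakeFunctionFlag {
    Defaults = 0,
    KwOnlyDefaults = 1,
    Annotations = 2,
    Closure = 3,
    Annotate = 4,
    TypeParams = 5,
}

/// Encode a flag as its power-of-two oparg: 1, 2, 4, 8, 16, 32.
impl From<MakeFunctionFlag> for u32 {
    fn from(flag: MakeFunctionFlag) -> u32 {
        1u32 << (flag as u32)
    }
}

/// Decode an oparg; anything that is not a single known bit is rejected.
/// Returning the raw value as the error is an assumption of this sketch.
impl TryFrom<u32> for MakeFunctionFlag {
    type Error = u32;

    fn try_from(oparg: u32) -> Result<Self, Self::Error> {
        match oparg {
            1 => Ok(Self::Defaults),
            2 => Ok(Self::KwOnlyDefaults),
            4 => Ok(Self::Annotations),
            8 => Ok(Self::Closure),
            16 => Ok(Self::Annotate),
            32 => Ok(Self::TypeParams),
            other => Err(other),
        }
    }
}
```

Because the encoded values change, bytecode written with the sequential encoding decodes to the wrong flag or fails to decode, which is consistent with the note about deleting stale .pyc files.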
1 parent ab9e1d2 commit 46cd747

17 files changed: +511 -246 lines changed

Lib/test/test_descr.py

Lines changed: 0 additions & 3 deletions
@@ -5179,7 +5179,6 @@ def meth(self):
                 pass
         self.C = C

-    @unittest.expectedFailure # TODO: RUSTPYTHON
     @unittest.skipIf(hasattr(sys, 'gettrace') and sys.gettrace(),
                      'trace function introduces __local__')
     def test_iter_keys(self):
@@ -5193,7 +5192,6 @@ def test_iter_keys(self):
                                 '__static_attributes__', '__weakref__',
                                 'meth'])

-    @unittest.expectedFailure # TODO: RUSTPYTHON; AssertionError: 5 != 7
     @unittest.skipIf(hasattr(sys, 'gettrace') and sys.gettrace(),
                      'trace function introduces __local__')
     def test_iter_values(self):
@@ -5203,7 +5201,6 @@ def test_iter_values(self):
         values = list(it)
         self.assertEqual(len(values), 7)

-    @unittest.expectedFailure # TODO: RUSTPYTHON
     @unittest.skipIf(hasattr(sys, 'gettrace') and sys.gettrace(),
                      'trace function introduces __local__')
     def test_iter_items(self):

Lib/test/test_dis.py

Lines changed: 0 additions & 1 deletion
@@ -1134,7 +1134,6 @@ def test_kw_names(self):
         # Test that value is displayed for keyword argument names:
         self.do_disassembly_test(wrap_func_w_kwargs, dis_kw_names)

-    @unittest.expectedFailure # TODO: RUSTPYTHON
     def test_intrinsic_1(self):
         # Test that argrepr is displayed for CALL_INTRINSIC_1
         self.do_disassembly_test("from math import *", dis_intrinsic_1_2)

Lib/test/test_peepholer.py

Lines changed: 1 addition & 2 deletions
@@ -612,7 +612,6 @@ def f():
                 print(i)
         self.check_jump_targets(f)

-    @unittest.expectedFailure # TODO: RUSTPYTHON; 611 JUMP_BACKWARD 16
     def test_elim_jump_after_return1(self):
         # Eliminate dead code: jumps immediately after returns can't be reached
         def f(cond1, cond2):
@@ -863,7 +862,7 @@ def setUp(self):
         self.addCleanup(sys.settrace, sys.gettrace())
         sys.settrace(None)

-    @unittest.expectedFailure # TODO: RUSTPYTHON; BINARY_OP 0 (+)
+    @unittest.expectedFailure # TODO: RUSTPYTHON; no LOAD_FAST_BORROW_LOAD_FAST_BORROW superinstruction
     def test_load_fast_known_simple(self):
         def f():
             x = 1
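
The expectedFailure removal on test_elim_jump_after_return1 relies on the two dead-code phases described in the commit message: truncation after a terminal instruction, then block elimination driven by fixpoint reachability. The sketch below illustrates both under simplifying assumptions. `Instr`, the block layout (a list of instruction vectors, entry block at index 0, jump targets as block indices), and both function names are hypothetical and do not reproduce the code in ir.rs.

```rust
/// Hypothetical, simplified instruction set for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    Op(&'static str), // any non-control-flow instruction
    Jump(usize),      // unconditional jump to a block index
    JumpIf(usize),    // conditional jump; execution may also fall through
    Return,
    Raise,
}

impl Instr {
    /// Terminal instructions never fall through to the next instruction.
    fn is_terminal(&self) -> bool {
        matches!(self, Instr::Jump(_) | Instr::Return | Instr::Raise)
    }
}

/// Phase 1: drop everything after the first terminal instruction in a block.
fn truncate_after_terminal(block: &mut Vec<Instr>) {
    if let Some(pos) = block.iter().position(|i| i.is_terminal()) {
        block.truncate(pos + 1);
    }
}

/// Phase 2: iterate to a fixpoint, marking only targets of blocks that are
/// already known to be reachable. Assumes every jump target is in range.
fn reachable_blocks(blocks: &[Vec<Instr>]) -> Vec<bool> {
    let mut reachable = vec![false; blocks.len()];
    if blocks.is_empty() {
        return reachable;
    }
    reachable[0] = true; // entry block
    let mut changed = true;
    while changed {
        changed = false;
        for idx in 0..blocks.len() {
            if !reachable[idx] {
                continue; // an orphaned block must not mark its targets
            }
            let mut falls_through = true;
            for instr in &blocks[idx] {
                if let Instr::Jump(t) | Instr::JumpIf(t) = instr {
                    if !reachable[*t] {
                        reachable[*t] = true;
                        changed = true;
                    }
                }
                if instr.is_terminal() {
                    falls_through = false;
                    break;
                }
            }
            if falls_through && idx + 1 < blocks.len() && !reachable[idx + 1] {
                reachable[idx + 1] = true;
                changed = true;
            }
        }
    }
    reachable
}
```

In this model a block that is only entered from an unreachable block stays unmarked, which is the property the "fixpoint reachability" item in the message calls out.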

Lib/test/test_super.py

Lines changed: 0 additions & 1 deletion
@@ -209,7 +209,6 @@ def f():

         self.assertIs(test_class, A)

-    @unittest.expectedFailure # TODO: RUSTPYTHON
     def test___classcell___expected_behaviour(self):
         # See issue #23722
         class Meta(type):

Lib/test/test_sys_settrace.py

Lines changed: 0 additions & 2 deletions
@@ -2063,8 +2063,6 @@ async def test_jump_between_async_with_blocks(output):
         async with asynctracecontext(output, 4):
             output.append(5)

-    # TODO: RUSTPYTHON
-    @unittest.expectedFailure
     @jump_test(5, 7, [2, 4], (ValueError, "after"))
     def test_no_jump_over_return_out_of_finally_block(output):
         try:

crates/codegen/src/compile.rs

Lines changed: 108 additions & 37 deletions
@@ -610,13 +610,13 @@ impl Compiler {
                     self.compile_expression(value)?;
                     match collection_type {
                         CollectionType::List => {
-                            emit!(self, Instruction::ListExtend { i: 0 });
+                            emit!(self, Instruction::ListExtend { i: 1 });
                         }
                         CollectionType::Set => {
-                            emit!(self, Instruction::SetUpdate { i: 0 });
+                            emit!(self, Instruction::SetUpdate { i: 1 });
                         }
                         CollectionType::Tuple => {
-                            emit!(self, Instruction::ListExtend { i: 0 });
+                            emit!(self, Instruction::ListExtend { i: 1 });
                         }
                     }
                 } else {
@@ -627,13 +627,13 @@ impl Compiler {
                         // Sequence already exists, append to it
                         match collection_type {
                             CollectionType::List => {
-                                emit!(self, Instruction::ListAppend { i: 0 });
+                                emit!(self, Instruction::ListAppend { i: 1 });
                             }
                             CollectionType::Set => {
-                                emit!(self, Instruction::SetAdd { i: 0 });
+                                emit!(self, Instruction::SetAdd { i: 1 });
                             }
                             CollectionType::Tuple => {
-                                emit!(self, Instruction::ListAppend { i: 0 });
+                                emit!(self, Instruction::ListAppend { i: 1 });
                             }
                         }
                     } else {
@@ -692,6 +692,23 @@ impl Compiler {
             .expect("symbol_table_stack is empty! This is a compiler bug.")
     }

+    /// Check if a name is imported in current scope or any enclosing scope.
+    fn is_name_imported(&self, name: &str) -> bool {
+        if let Some(sym) = self.current_symbol_table().symbols.get(name) {
+            if sym.flags.contains(SymbolFlags::IMPORTED) {
+                return true;
+            } else if sym.scope == SymbolScope::Local {
+                return false;
+            }
+        }
+        self.symbol_table_stack.iter().rev().skip(1).any(|table| {
+            table
+                .symbols
+                .get(name)
+                .is_some_and(|sym| sym.flags.contains(SymbolFlags::IMPORTED))
+        })
+    }
+
     /// Get the cell-relative index of a free variable.
     /// Returns ncells + freevar_idx. Fixed up to localsplus index during finalize.
     fn get_free_var_index(&mut self, name: &str) -> CompileResult<oparg::VarNum> {
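
The lookup order in is_name_imported can be restated as a stand-alone sketch: the current scope is consulted first, a non-imported local binding there shadows any outer import, and only otherwise are enclosing scopes searched. The types below (`Scope`, `Symbol`, `Table`) are simplified stand-ins invented for this illustration and are not the compiler's symbol-table types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the compiler's symbol scope.
#[derive(Clone, Copy, PartialEq)]
enum Scope {
    Local,
    Global,
}

/// Hypothetical stand-in for a symbol entry; `imported` models the IMPORTED flag.
#[derive(Clone, Copy)]
struct Symbol {
    imported: bool,
    scope: Scope,
}

type Table = HashMap<String, Symbol>;

/// `stack` is ordered outermost scope first; the last entry is the current scope.
fn is_name_imported(stack: &[Table], name: &str) -> bool {
    if let Some(sym) = stack.last().and_then(|t| t.get(name)) {
        if sym.imported {
            return true;
        }
        if sym.scope == Scope::Local {
            // A plain local binding shadows any import in an outer scope.
            return false;
        }
    }
    // Walk enclosing scopes from innermost to outermost, skipping the current one.
    stack
        .iter()
        .rev()
        .skip(1)
        .any(|t| t.get(name).map_or(false, |s| s.imported))
}
```

With a module-level `import logging` and a function that only reads `logging`, the sketch reports the name as imported; if the function rebinds `logging` locally, it does not.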
@@ -1151,7 +1168,16 @@ impl Compiler {
             self.set_qualname();
         }

-        // Emit COPY_FREE_VARS and MAKE_CELL prolog before RESUME
+        // Emit MAKE_CELL for each cell variable (before RESUME)
+        {
+            let ncells = self.code_stack.last().unwrap().metadata.cellvars.len();
+            for i in 0..ncells {
+                let i_varnum: oparg::VarNum = u32::try_from(i).expect("too many cellvars").into();
+                emit!(self, Instruction::MakeCell { i: i_varnum });
+            }
+        }
+
+        // Emit COPY_FREE_VARS if there are free variables (before RESUME)
         {
             let nfrees = self.code_stack.last().unwrap().metadata.freevars.len();
             if nfrees > 0 {
@@ -1162,11 +1188,6 @@ impl Compiler {
                     }
                 );
             }
-            let ncells = self.code_stack.last().unwrap().metadata.cellvars.len();
-            for i in 0..ncells {
-                let i_varnum: oparg::VarNum = u32::try_from(i).expect("too many cellvars").into();
-                emit!(self, Instruction::MakeCell { i: i_varnum });
-            }
         }

         // Emit RESUME (handles async preamble and module lineno 0)
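
Taken together, the two hunks above move the MAKE_CELL loop ahead of COPY_FREE_VARS. The resulting prologue order can be sketched as follows. `Instr` and `emit_prologue` are hypothetical names used only for this illustration, and the instruction operands are simplified to plain `u32` values.

```rust
use std::convert::TryFrom;

/// Hypothetical, simplified prologue instructions.
#[derive(Debug, PartialEq)]
enum Instr {
    MakeCell(u32),
    CopyFreeVars(u32),
    Resume,
}

/// Build the prologue in the order the new code emits it:
/// one MAKE_CELL per cell variable, then COPY_FREE_VARS (only when there
/// are free variables), then RESUME.
fn emit_prologue(ncells: usize, nfrees: usize) -> Vec<Instr> {
    let mut code = Vec::new();
    for i in 0..ncells {
        let idx = u32::try_from(i).expect("too many cellvars");
        code.push(Instr::MakeCell(idx));
    }
    if nfrees > 0 {
        let n = u32::try_from(nfrees).expect("too many freevars");
        code.push(Instr::CopyFreeVars(n));
    }
    code.push(Instr::Resume);
    code
}
```

A code object with two cell variables and one free variable would therefore start with MAKE_CELL 0, MAKE_CELL 1, COPY_FREE_VARS 1, RESUME in this model.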
@@ -1739,7 +1760,7 @@ impl Compiler {
                 value: value.into(),
             });
             let doc = self.name("__doc__");
-            emit!(self, Instruction::StoreGlobal { namei: doc })
+            emit!(self, Instruction::StoreName { namei: doc })
         }

         // Handle annotations based on future_annotations flag
@@ -3424,7 +3445,7 @@ impl Compiler {
         if n == 0 {
             // Empty handlers (invalid AST) - append rest to list and proceed
             // Stack: [prev_exc, orig, list, rest]
-            emit!(self, Instruction::ListAppend { i: 0 });
+            emit!(self, Instruction::ListAppend { i: 1 });
             // Stack: [prev_exc, orig, list]
             emit!(
                 self,
@@ -3542,7 +3563,7 @@ impl Compiler {
             // After pop: [prev_exc, orig, list, new_rest, lasti] (len=5)
             // nth_value(i) = stack[len - i - 1], we need stack[2] = list
             // stack[5 - i - 1] = 2 -> i = 2
-            emit!(self, Instruction::ListAppend { i: 2 });
+            emit!(self, Instruction::ListAppend { i: 3 });
             // Stack: [prev_exc, orig, list, new_rest, lasti]

             // POP_TOP - pop lasti
@@ -3571,7 +3592,7 @@ impl Compiler {
             // PEEK(1) = stack[len-1] after pop
             // RustPython nth_value(i) = stack[len-i-1] after pop
             // For LIST_APPEND 1: stack[len-1] = stack[len-i-1] -> i = 0
-            emit!(self, Instruction::ListAppend { i: 0 });
+            emit!(self, Instruction::ListAppend { i: 1 });
             // Stack: [prev_exc, orig, list]
             emit!(
                 self,
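
Note that the unchanged context comments in the two hunks above still derive the old 0-based values (i = 2 and i = 0), while the emitted opargs are now 3 and 1. The relationship between the two conventions is sketched below. Both helpers are written for this illustration: `nth_value` follows the formula quoted in the comments, and `peek` models the "subtract 1 before calling nth_value" adjustment described in the commit message. Neither is the VM's actual code.

```rust
/// 0-based lookup from the top of the stack, as in the quoted comment:
/// nth_value(i) = stack[len - i - 1], so nth_value(0) is the top.
fn nth_value<T>(stack: &[T], i: usize) -> &T {
    &stack[stack.len() - i - 1]
}

/// 1-based lookup in the style of CPython's PEEK(oparg): peek(1) is the top.
/// The oparg is converted to the 0-based index by subtracting 1.
fn peek<T>(stack: &[T], oparg: usize) -> &T {
    assert!(oparg >= 1, "PEEK-style opargs are 1-based");
    nth_value(stack, oparg - 1)
}
```

For the five-element stack [prev_exc, orig, list, new_rest, lasti], the list sits at 0-based depth 2 and is therefore addressed with oparg 3; for [prev_exc, orig, list] it is the top of stack and is addressed with oparg 1.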
@@ -4561,9 +4582,9 @@ impl Compiler {
         // 2. Set up class namespace
         let (doc_str, body) = split_doc(body, &self.opts);

-        // Load (global) __name__ and store as __module__
+        // Load __name__ and store as __module__
         let dunder_name = self.name("__name__");
-        self.emit_load_global(dunder_name, false);
+        emit!(self, Instruction::LoadName { namei: dunder_name });
         let dunder_module = self.name("__module__");
         emit!(
             self,
@@ -4584,14 +4605,7 @@ impl Compiler {
             }
         );

-        // Store __doc__ only if there's an explicit docstring
-        if let Some(doc) = doc_str {
-            self.emit_load_const(ConstantData::Str { value: doc.into() });
-            let doc_name = self.name("__doc__");
-            emit!(self, Instruction::StoreName { namei: doc_name });
-        }
-
-        // Store __firstlineno__ (new in Python 3.12+)
+        // Store __firstlineno__ before __doc__
         self.emit_load_const(ConstantData::Integer {
             value: BigInt::from(firstlineno),
         });
@@ -4603,6 +4617,13 @@ impl Compiler {
             }
         );

+        // Store __doc__ only if there's an explicit docstring
+        if let Some(doc) = doc_str {
+            self.emit_load_const(ConstantData::Str { value: doc.into() });
+            let doc_name = self.name("__doc__");
+            emit!(self, Instruction::StoreName { namei: doc_name });
+        }
+
         // Set __type_params__ if we have type parameters
         if type_params.is_some() {
             // Load .type_params from enclosing scope
@@ -4661,6 +4682,44 @@ impl Compiler {
             .iter()
             .position(|var| *var == "__class__");

+        // Emit __static_attributes__ tuple
+        {
+            let attrs: Vec<String> = self
+                .code_stack
+                .last()
+                .unwrap()
+                .static_attributes
+                .as_ref()
+                .map(|s| s.iter().cloned().collect())
+                .unwrap_or_default();
+            self.emit_load_const(ConstantData::Tuple {
+                elements: attrs
+                    .into_iter()
+                    .map(|s| ConstantData::Str { value: s.into() })
+                    .collect(),
+            });
+            let static_attrs_name = self.name("__static_attributes__");
+            emit!(
+                self,
+                Instruction::StoreName {
+                    namei: static_attrs_name
+                }
+            );
+        }
+
+        // Store __classdictcell__ if __classdict__ is a cell variable
+        if self.current_symbol_table().needs_classdict {
+            let classdict_idx = u32::from(self.get_cell_var_index("__classdict__")?);
+            emit!(self, PseudoInstruction::LoadClosure { i: classdict_idx });
+            let classdictcell = self.name("__classdictcell__");
+            emit!(
+                self,
+                Instruction::StoreName {
+                    namei: classdictcell
+                }
+            );
+        }
+
         if let Some(classcell_idx) = classcell_idx {
             emit!(
                 self,
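
The class-body hunks above change which names are stored into the class namespace and in what order. The sketch below lists only the stores that are visible in these hunks; stores whose names are not shown in this excerpt, and the class body statements themselves, are deliberately left out. `class_namespace_stores` and its two parameters are hypothetical names for illustration.

```rust
/// Names stored into the class namespace, in emission order, restricted to
/// the stores visible in the hunks above. Other stores and the class body
/// statements that run between prologue and epilogue are omitted.
fn class_namespace_stores(has_docstring: bool, needs_classdict: bool) -> Vec<&'static str> {
    // Prologue: __module__ (loaded via LOAD_NAME __name__), then __firstlineno__.
    let mut names = vec!["__module__", "__firstlineno__"];
    // __doc__ now follows __firstlineno__ and is stored only for an explicit docstring.
    if has_docstring {
        names.push("__doc__");
    }
    // Epilogue: the (currently empty) tuple of static attribute names.
    names.push("__static_attributes__");
    // __classdictcell__ is stored only when __classdict__ is a cell variable.
    if needs_classdict {
        names.push("__classdictcell__");
    }
    names
}
```

The presence of __firstlineno__ and __static_attributes__ in every class namespace is what the DictProxyTests iteration tests in Lib/test/test_descr.py check for, according to the commit message.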
@@ -4810,11 +4869,11 @@ impl Compiler {
                 if let ast::Expr::Starred(ast::ExprStarred { value, .. }) = arg {
                     // Starred: compile and extend
                     self.compile_expression(value)?;
-                    emit!(self, Instruction::ListExtend { i: 0 });
+                    emit!(self, Instruction::ListExtend { i: 1 });
                 } else {
                     // Non-starred: compile and append
                     self.compile_expression(arg)?;
-                    emit!(self, Instruction::ListAppend { i: 0 });
+                    emit!(self, Instruction::ListAppend { i: 1 });
                 }
             }
         }
@@ -4826,7 +4885,7 @@ impl Compiler {
                     namei: dot_generic_base
                 }
             );
-            emit!(self, Instruction::ListAppend { i: 0 });
+            emit!(self, Instruction::ListAppend { i: 1 });

             // Convert list to tuple
             emit!(
@@ -6495,7 +6554,7 @@ impl Compiler {
                 self.emit_load_const(ConstantData::Integer {
                     value: annotation_index.into(),
                 });
-                emit!(self, Instruction::SetAdd { i: 0 });
+                emit!(self, Instruction::SetAdd { i: 1 });
                 emit!(self, Instruction::PopTop);
             }
         }
     }
@@ -6742,6 +6801,10 @@ impl Compiler {
             _ => {
                 // Fall back case which always will work!
                 self.compile_expression(expression)?;
+                // Compare already produces a bool; everything else needs TO_BOOL
+                if !matches!(expression, ast::Expr::Compare(_)) {
+                    emit!(self, Instruction::ToBool);
+                }
                 if condition {
                     emit!(
                         self,
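
The fallback arm above can be summarized as: compile the expression, insert TO_BOOL unless the expression is a comparison, then emit the conditional jump. A reduced sketch follows. `Expr`, `Instr`, and `jump_if_fallback` are hypothetical, and the mapping of `condition` to POP_JUMP_IF_TRUE versus POP_JUMP_IF_FALSE is an assumption, since the hunk is cut off before the jump is emitted.

```rust
/// Hypothetical expression kinds: comparisons versus everything else.
enum Expr {
    Compare,
    Other,
}

/// Hypothetical, simplified instructions.
#[derive(Debug, PartialEq)]
enum Instr {
    ToBool,
    PopJumpIfTrue,
    PopJumpIfFalse,
}

/// Instructions emitted after the expression itself has been compiled.
fn jump_if_fallback(expr: &Expr, condition: bool) -> Vec<Instr> {
    let mut code = Vec::new();
    // A comparison already leaves a bool on the stack, so TO_BOOL is skipped.
    if !matches!(expr, Expr::Compare) {
        code.push(Instr::ToBool);
    }
    // Assumed mapping: jump when the value matches `condition`.
    code.push(if condition {
        Instr::PopJumpIfTrue
    } else {
        Instr::PopJumpIfFalse
    });
    code
}
```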
@@ -7240,7 +7303,7 @@ impl Compiler {
                     emit!(
                         compiler,
                         Instruction::ListAppend {
-                            i: generators.len().to_u32(),
+                            i: (generators.len() + 1).to_u32(),
                         }
                     );
                     Ok(())
@@ -7266,7 +7329,7 @@ impl Compiler {
                     emit!(
                         compiler,
                         Instruction::SetAdd {
-                            i: generators.len().to_u32(),
+                            i: (generators.len() + 1).to_u32(),
                         }
                     );
                     Ok(())
@@ -7298,7 +7361,7 @@ impl Compiler {
                     emit!(
                         compiler,
                         Instruction::MapAdd {
-                            i: generators.len().to_u32(),
+                            i: (generators.len() + 1).to_u32(),
                         }
                     );

@@ -7516,11 +7579,19 @@ impl Compiler {
                     // CALL at .method( line (not the full expression line)
                     self.codegen_call_helper(0, args, attr.range())?;
                 } else {
-                    // Normal method call: compile object, then LOAD_ATTR with method flag
-                    // LOAD_ATTR(method=1) pushes [method, self_or_null] on stack
                     self.compile_expression(value)?;
                     let idx = self.name(attr.as_str());
-                    self.emit_load_attr_method(idx);
+                    // Imported names use plain LOAD_ATTR + PUSH_NULL;
+                    // other names use method call mode LOAD_ATTR.
+                    // Check current scope and enclosing scopes for IMPORTED flag.
+                    let is_import = matches!(value.as_ref(), ast::Expr::Name(ast::ExprName { id, .. })
+                        if self.is_name_imported(id.as_str()));
+                    if is_import {
+                        self.emit_load_attr(idx);
+                        emit!(self, Instruction::PushNull);
+                    } else {
+                        self.emit_load_attr_method(idx);
+                    }
                     self.codegen_call_helper(0, args, call_range)?;
                 }
             } else {
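
The branch added in this hunk selects between two instruction sequences for loading the call target of `value.attr(...)`. The choice is reduced to a small function below. `Instr` and `emit_attr_call_target` are hypothetical names, and the `method` field stands in for the method flag on LOAD_ATTR mentioned in the removed comment.

```rust
/// Hypothetical, simplified instructions.
#[derive(Debug, PartialEq)]
enum Instr {
    /// `method: true` models method-mode LOAD_ATTR, which per the removed
    /// comment pushes [method, self_or_null].
    LoadAttr { method: bool },
    PushNull,
}

/// Instructions that load the call target, after the receiver has been compiled.
fn emit_attr_call_target(receiver_is_imported_name: bool) -> Vec<Instr> {
    if receiver_is_imported_name {
        // e.g. logging.getLogger(): plain attribute load plus an explicit NULL.
        vec![Instr::LoadAttr { method: false }, Instr::PushNull]
    } else {
        // Any other receiver keeps the method-call form.
        vec![Instr::LoadAttr { method: true }]
    }
}
```

Either way the call site ends up with a callable and a second slot beneath the arguments, so the subsequent call helper is invoked identically in both branches of the diff.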
@@ -7558,7 +7629,7 @@ impl Compiler {
                 self.compile_expression(&kw.value)?;

                 if big {
-                    emit!(self, Instruction::MapAdd { i: 0 });
+                    emit!(self, Instruction::MapAdd { i: 1 });
                 }
             }
