Align compiler and AST behavior with CPython#8138
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR updates compile-time plumbing, code generation, AST conversion and validation, runtime pattern handling, and builtin execution paths to align behavior with newer CPython-style semantics. ChangesCompiler and AST parity
Sequence Diagram(s)sequenceDiagram
participant Caller
participant builtins.rs
participant _ast
participant codegen
participant VM
Caller->>builtins.rs: compile/eval/exec input
builtins.rs->>_ast: parse, preprocess, or validate AST/source
_ast->>codegen: preprocess module and future flags
codegen->>VM: compile code object with flags
VM-->>builtins.rs: code object or exception
builtins.rs-->>Caller: result or raised exception
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [ ] test: cpython/Lib/test/test_syntax.py (TODO: 65) dependencies: dependent tests: (no tests depend on syntax) [x] lib: cpython/Lib/pydoc.py dependencies:
dependent tests: (5 tests)
[x] test: cpython/Lib/test/test_unpack.py dependencies: dependent tests: (no tests depend on unpack) [x] test: cpython/Lib/test/test_cmd_line_script.py (TODO: 14) dependencies: dependent tests: (no tests depend on cmd_line_script) [x] lib: cpython/Lib/ast.py dependencies:
dependent tests: (148 tests)
[x] test: cpython/Lib/test/test_global.py dependencies: dependent tests: (no tests depend on global) [ ] test: cpython/Lib/test/test_generators.py (TODO: 11) dependencies: dependent tests: (no tests depend on generator) [x] test: cpython/Lib/test/test_list.py (TODO: 4) dependencies: dependent tests: (no tests depend on list) [x] test: cpython/Lib/test/test_compile.py dependencies: dependent tests: (no tests depend on compile) [x] lib: cpython/Lib/codeop.py dependencies:
dependent tests: (103 tests)
[x] lib: cpython/Lib/pdb.py dependencies:
dependent tests: (1 tests)
[x] test: cpython/Lib/test/test_builtin.py (TODO: 15) dependencies: dependent tests: (no tests depend on builtin) [x] test: cpython/Lib/test/test_str.py (TODO: 6) dependencies: dependent tests: (no tests depend on str) [ ] test: cpython/Lib/test/test_grammar.py (TODO: 1) dependencies: dependent tests: (no tests depend on grammar) [x] test: cpython/Lib/test/test_sys.py (TODO: 7) dependencies: dependent tests: (240 tests)
[x] lib: cpython/Lib/symtable.py dependencies:
dependent tests: (2 tests)
[x] test: cpython/Lib/test/test_named_expressions.py (TODO: 2) dependencies: dependent tests: (no tests depend on named_expressions) [x] lib: cpython/Lib/future.py dependencies:
dependent tests: (35 tests)
[ ] test: cpython/Lib/test/test_pep646_syntax.py (TODO: 2) dependencies: dependent tests: (no tests depend on pep646_syntax) [x] lib: cpython/Lib/typing.py dependencies:
dependent tests: (19 tests)
[ ] test: cpython/Lib/test/test_pyrepl (TODO: 23) dependencies: dependent tests: (no tests depend on pyrepl) [ ] test: cpython/Lib/test/test_patma.py (TODO: 1) dependencies: dependent tests: (no tests depend on patma) [ ] test: cpython/Lib/test/test_exceptions.py (TODO: 22) dependencies: dependent tests: (no tests depend on exception) Legend:
|
3946343 to
e90500a
Compare
| } | ||
|
|
||
| #[unsafe(no_mangle)] | ||
| pub unsafe extern "C" fn Py_CompileStringFlags( |
There was a problem hiding this comment.
Py_CompileStringFlags is not part of the stable ABI.
There was a problem hiding this comment.
I also reverted more misoperations on ceval.rs
| vm.compile(code, mode, filename) | ||
| .map_err(|err| vm.new_syntax_error(&err, Some(code))) | ||
| #[unsafe(no_mangle)] | ||
| pub unsafe extern "C" fn Py_CompileStringExFlags( |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
1ecac1d to
c4bd2c9
Compare
There was a problem hiding this comment.
Actionable comments posted: 13
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/_ast/repr.rs (1)
82-86:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winPreserve the comma for one-element tuple fields.
repr_ast_tuplenow renders a single-item tuple as(value), which is no longer a tuple representation. AST reprs containing tuple-valued fields should keep(value,).🐛 Proposed fix
} else if parts.len() > 1 { rendered.push_wtf8(", ".as_ref()); rendered.push_wtf8(&parts[1]); + } else if items.len() == 1 { + rendered.push_wtf8(",".as_ref()); } rendered.push_wtf8(")".as_ref());🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/vm/src/stdlib/_ast/repr.rs` around lines 82 - 86, The repr_ast_tuple function currently does not preserve the trailing comma for single-element tuples, causing them to render as `(value)` instead of the correct `(value,)`. Add logic to handle the case when parts.len() equals 1, where you should push the first element from parts[0] followed by a comma before the closing parenthesis is added. The existing condition checking parts.len() > 1 should remain unchanged, but you need to add an else-if branch or similar handling to address the single-element case explicitly.
🧹 Nitpick comments (3)
crates/vm/src/stdlib/_ast/operator.rs (1)
19-30: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winConsolidate singleton-node dispatch into one shared matcher.
These four functions duplicate the same “scan node type candidates, return enum variant, else type error” logic. A shared helper + per-enum table will keep mappings centralized and easier to audit.
♻️ Suggested refactor shape
+fn match_singleton_node<T: Copy>( + vm: &VirtualMachine, + object: &PyObjectRef, + expected: &str, + variants: &[(&'static Py<PyType>, T)], +) -> PyResult<T> { + for (node_type, value) in variants { + if is_node_instance(vm, object, *node_type)? { + return Ok(*value); + } + } + Err(vm.new_type_error(format!( + "expected some sort of {expected}, but got {}", + object.repr(vm)? + ))) +} + fn ast_from_object( vm: &VirtualMachine, _source_file: &SourceFile, object: PyObjectRef, ) -> PyResult<Self> { - Ok( - if is_node_instance(vm, &object, pyast::NodeOperatorAdd::static_type())? { - Self::Add - } else if ... - ) + match_singleton_node( + vm, + &object, + "operator", + &[ + (pyast::NodeOperatorAdd::static_type(), Self::Add), + (pyast::NodeOperatorSub::static_type(), Self::Sub), + // ... + (pyast::NodeOperatorFloorDiv::static_type(), Self::FloorDiv), + ], + ) }As per coding guidelines, “When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code.”
Also applies to: 60-93, 114-129, 156-183
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/vm/src/stdlib/_ast/operator.rs` around lines 19 - 30, The boolop dispatch logic in this function is duplicated across four functions with only differing node type checks and enum variant returns. Extract a shared helper function that takes a mapping or table of node type candidates paired with their corresponding enum variants, then performs the common dispatch pattern (check each node type and return the matching variant, or throw a type error if none match). Replace the duplicate dispatch code in all four functions (including the blocks at lines 60-93, 114-129, and 156-183) with calls to this shared helper, passing only the differing node-type-to-variant mappings.Source: Coding guidelines
crates/vm/src/stdlib/_ast/module.rs (1)
43-67: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winExtract
Modnode-kind detection into a shared matcher helper.The dispatch in Line 43-Line 67 repeats the same branch shape; extracting the detected kind first will keep conversion logic single-path and reduce drift with similar AST dispatchers.
♻️ Suggested refactor shape
+enum ModKind { + Module, + Interactive, + Expression, + FunctionType, +} + +fn mod_kind_from_object(vm: &VirtualMachine, object: &PyObjectRef) -> PyResult<ModKind> { + if object.is_instance(pyast::NodeModModule::static_type().as_object(), vm)? { + Ok(ModKind::Module) + } else if object.is_instance(pyast::NodeModInteractive::static_type().as_object(), vm)? { + Ok(ModKind::Interactive) + } else if object.is_instance(pyast::NodeModExpression::static_type().as_object(), vm)? { + Ok(ModKind::Expression) + } else if object.is_instance(pyast::NodeModFunctionType::static_type().as_object(), vm)? { + Ok(ModKind::FunctionType) + } else { + Err(vm.new_type_error(format!( + "expected some sort of mod, but got {}", + object.repr(vm)? + ))) + } +} + fn ast_from_object( vm: &VirtualMachine, source_file: &SourceFile, object: PyObjectRef, ) -> PyResult<Self> { - Ok( - if object.is_instance(pyast::NodeModModule::static_type().as_object(), vm)? { - Self::Module(ModModule::ast_from_object(vm, source_file, object)?) - } else if object - .is_instance(pyast::NodeModInteractive::static_type().as_object(), vm)? - { - Self::Interactive(ModInteractive::ast_from_object(vm, source_file, object)?) - } else if object.is_instance(pyast::NodeModExpression::static_type().as_object(), vm)? { - Self::Expression(ast::ModExpression::ast_from_object( - vm, - source_file, - object, - )?) - } else if object - .is_instance(pyast::NodeModFunctionType::static_type().as_object(), vm)? - { - Self::FunctionType(ModFunctionType::ast_from_object(vm, source_file, object)?) - } else { - return Err(vm.new_type_error(format!( - "expected some sort of mod, but got {}", - object.repr(vm)? - ))); - }, - ) + match mod_kind_from_object(vm, &object)? { + ModKind::Module => Ok(Self::Module(ModModule::ast_from_object(vm, source_file, object)?)), + ModKind::Interactive => Ok(Self::Interactive(ModInteractive::ast_from_object(vm, source_file, object)?)), + ModKind::Expression => Ok(Self::Expression(ast::ModExpression::ast_from_object(vm, source_file, object)?)), + ModKind::FunctionType => Ok(Self::FunctionType(ModFunctionType::ast_from_object(vm, source_file, object)?)), + } }As per coding guidelines, “When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/vm/src/stdlib/_ast/module.rs` around lines 43 - 67, The dispatch logic in the conversion function for Mod contains repeated branches that all follow the same pattern: checking is_instance for different node types and then calling ast_from_object. Extract the detection logic by first identifying which mod kind the object is (ModModule, ModInteractive, ModExpression, or ModFunctionType) into a separate helper or pattern match, then perform the ast_from_object conversion call once using the detected kind rather than repeating the conversion call in each branch. This eliminates duplicate code where the only difference between branches is the specific node type being checked.Source: Coding guidelines
crates/vm/src/stdlib/_ast/string.rs (1)
730-730: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winReplace the decorative section separator.
This adds a
// ===== ... =====separator, which the repository guidelines explicitly disallow. Use a plain short comment or remove it.Suggested fix
-// ===== TString (Template String) Support ===== +// Template string support.As per coding guidelines, “Do not add decorative section separators (e.g.
// -----------,// ===,/* *** */). Use///doc-comments or short//comments only when they add value.”🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/vm/src/stdlib/_ast/string.rs` at line 730, The decorative section separator comment using the `// ===== ... =====` pattern violates repository coding guidelines that disallow such decorative separators. Replace the separator comment at the TString (Template String) Support section with either a plain short comment like `// TString (Template String) Support` if the section heading adds value, or remove it entirely if the context is clear from the surrounding code.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 3601-3608: The pattern matching for is_copy_of_load_const only
checks for Instruction::LoadConst, but line 3595 can canonicalize a LOAD_CONST
integer to LOAD_SMALL_INT before curr is captured, causing valid COPY patterns
after LOAD_SMALL_INT to not be recognized. Update the matches! condition for
is_copy_of_load_const to also handle Instruction::LoadSmallInt in addition to
Instruction::LoadConst so that patterns like LOAD_SMALL_INT followed by COPY are
properly tracked and can be fused with subsequent operations like TO_BOOL.
In `@crates/vm/src/frame.rs`:
- Around line 3198-3200: The current `seen_keys: Vec<PyObjectRef>` approach uses
linear scanning with `vm.bool_eq` which changes observable behavior by calling
`__eq__` for keys CPython would separate by hash and allows unhashable keys to
reach custom implementations. Replace this vector-based tracking with a Python
set-backed collection that respects proper hash and equality semantics. Extract
the duplicate key checking logic into a single shared function that works across
both lookup paths instead of duplicating the check logic in the branches around
lines 3217-3230 and 3247-3260, following the guideline to extract differing
values first and call common logic once.
- Line 3056: The code uses Vec::<String>::new() to track seen attributes and
calls expect_str() on user-defined __match_args__, which panics on strings with
lone surrogates. Change seen_attrs from Vec::<String> to Vec::<PyObjectRef> to
store the actual Python objects instead of converting to strings. In both the
positional attribute tracking section (around lines 3103-3116) and keyword
attribute tracking section (around lines 3159-3172), replace direct string
equality comparisons with vm.bool_eq() calls to safely compare the PyObjectRef
values. Use .repr() on PyObjectRef objects when constructing error messages
instead of trying to extract string values. This preserves the original Python
objects and avoids panicking on surrogates.
In `@crates/vm/src/stdlib/_abc.rs`:
- Around line 244-250: The collection_flags calculation only reads from
cls_type.slots.flags but ignores the abc_tpflags field, which stores virtual
ABC-derived sequence and mapping markers. This causes transitive virtual
registrations to be lost during propagation. Modify the collection_flags
assignment to include flags from both cls_type.slots.flags and
cls_type.abc_tpflags by combining them appropriately before passing to
set_abc_collection_flags_recursive, ensuring that virtual ABC registrations are
preserved when propagating collection flags.
In `@crates/vm/src/stdlib/_ast.rs`:
- Around line 337-347: The error message in the node_object_to_ast_string
function on line 345 only mentions str as an acceptable type, but the function
actually accepts both str and bytes as shown in the type check on line 342.
Update the error message passed to vm.new_type_error to accurately reflect that
both str and bytes types are accepted, so users receive an accurate description
of what types are allowed.
In `@crates/vm/src/stdlib/_ast/constant.rs`:
- Around line 1124-1144: The first_invalid_constant_type function recursively
scans nested tuples and frozensets without checking the VM's recursion limit,
which can cause a Rust stack overflow on deeply nested structures. Guard the
recursive calls to first_invalid_constant_type_opt with vm.with_recursion to
enforce the VM's recursion limit before recursing into items in both the tuple
iteration block (where items are processed from tuple.iter()) and the frozenset
iteration block (where items are processed from set.elements()). Apply the same
recursion guard pattern to any other similar recursive scans referenced around
lines 1209-1216.
In `@crates/vm/src/stdlib/_ast/pyast.rs`:
- Around line 91-109: The base AST classes generated by this macro (such as
stmt, expr, and excepthandler) are missing initialization of _field_types and
__annotations__ attributes, which are needed for CPython-compatible
introspection metadata. In the same macro branch where _fields is being set to
ctx.empty_tuple, add initialization for _field_types and __annotations__ as
empty dictionaries using appropriate ctx methods. This should be done before or
after the _fields initialization to ensure all base classes maintain proper
metadata structure.
In `@crates/vm/src/stdlib/_ast/python.rs`:
- Around line 478-480: In the branch where
ftype.fast_isinstance(vm.ctx.types.union_type) evaluates to true (handling
optional T | None fields), instead of leaving the optional field unset with just
a comment, assign a default value of None to the field. This ensures that
optional attributes always exist on AST nodes and prevents missing field errors
in constructors, __replace__ methods, and repr/validation operations that expect
the field to be present.
In `@crates/vm/src/stdlib/_ast/statement.rs`:
- Line 104: The issue is that stmt_range_from_object is being called on an
object before validating it is actually a statement type, causing location field
errors to appear before the type-check error. Move the range extraction call
from before the match statement into each individual successful match branch
(for each specific statement type being handled), and leave the final else
clause to report the "expected some sort of stmt" error without requiring
location fields to be present on invalid node types.
- Around line 29-42: In the function definition_range_from_name, replace the
incorrect keyword_len arithmetic calculation with a search-based approach.
Instead of using checked_sub with the keyword_len parameter to compute
keyword_start, use rfind to search backwards from name_start for the actual
keyword position. The current implementation with checked_sub and keyword_len is
fragile and produces correct results only by accident through the
max(line_start) clamping. You will need to pass the keyword string as a
parameter to the function so rfind can locate its actual position in the source
code.
In `@crates/vm/src/stdlib/_ast/string.rs`:
- Around line 913-940: The interpolation_to_expr function currently loses the
original location information by setting both the ast::TString range and
ast::ExprTString range to TextRange::default(). Capture the range from the
interpolation parameter before it is moved into TemplateStrPart::Interpolation,
then use this captured range to populate the range field for both the
ast::TString struct and the ast::ExprTString struct instead of using
TextRange::default().
In `@crates/vm/src/stdlib/_ast/type_parameters.rs`:
- Around line 87-118: The function currently calls type_param_range_from_object
on the object before validating that it is actually a supported type_param node
kind, which can cause it to fail while reading location fields instead of
returning the intended error message. Restructure the logic to first classify
the object by checking if it is an instance of TypeVar, ParamSpec, or
TypeVarTuple using is_node_instance, and only after confirming it is a valid
type should type_param_range_from_object be called to extract the range. If none
of the three type checks match, immediately return the type_error before
attempting to access any fields on the invalid object.
In `@crates/vm/src/stdlib/builtins.rs`:
- Around line 374-377: The constant CODEGEN_NOT_SUPPORTED is defined only under
the feature guard #[cfg(not(feature = "rustpython-compiler"))] but is being used
under #[cfg(not(feature = "compiler"))] in the code blocks at lines 374-377 and
397-400. Since "compiler" is a feature alias that depends on
"rustpython-compiler", these conditions are not equivalent and the constant may
be unavailable, causing a compilation error. Fix this by making the feature
guard consistent for the CODEGEN_NOT_SUPPORTED constant definition and its usage
locations—either move the constant definition to be guarded by #[cfg(not(feature
= "compiler"))] or change all usage sites to reference #[cfg(not(feature =
"rustpython-compiler"))] instead, ensuring the same feature flag is used
throughout.
---
Outside diff comments:
In `@crates/vm/src/stdlib/_ast/repr.rs`:
- Around line 82-86: The repr_ast_tuple function currently does not preserve the
trailing comma for single-element tuples, causing them to render as `(value)`
instead of the correct `(value,)`. Add logic to handle the case when parts.len()
equals 1, where you should push the first element from parts[0] followed by a
comma before the closing parenthesis is added. The existing condition checking
parts.len() > 1 should remain unchanged, but you need to add an else-if branch
or similar handling to address the single-element case explicitly.
---
Nitpick comments:
In `@crates/vm/src/stdlib/_ast/module.rs`:
- Around line 43-67: The dispatch logic in the conversion function for Mod
contains repeated branches that all follow the same pattern: checking
is_instance for different node types and then calling ast_from_object. Extract
the detection logic by first identifying which mod kind the object is
(ModModule, ModInteractive, ModExpression, or ModFunctionType) into a separate
helper or pattern match, then perform the ast_from_object conversion call once
using the detected kind rather than repeating the conversion call in each
branch. This eliminates duplicate code where the only difference between
branches is the specific node type being checked.
In `@crates/vm/src/stdlib/_ast/operator.rs`:
- Around line 19-30: The boolop dispatch logic in this function is duplicated
across four functions with only differing node type checks and enum variant
returns. Extract a shared helper function that takes a mapping or table of node
type candidates paired with their corresponding enum variants, then performs the
common dispatch pattern (check each node type and return the matching variant,
or throw a type error if none match). Replace the duplicate dispatch code in all
four functions (including the blocks at lines 60-93, 114-129, and 156-183) with
calls to this shared helper, passing only the differing node-type-to-variant
mappings.
In `@crates/vm/src/stdlib/_ast/string.rs`:
- Line 730: The decorative section separator comment using the `// ===== ...
=====` pattern violates repository coding guidelines that disallow such
decorative separators. Replace the separator comment at the TString (Template
String) Support section with either a plain short comment like `// TString
(Template String) Support` if the section heading adds value, or remove it
entirely if the context is clear from the surrounding code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: e18de394-1f13-4b89-b45b-7957a283e3cb
⛔ Files ignored due to path filters (13)
Cargo.lockis excluded by!**/*.lockLib/test/test_compile.pyis excluded by!Lib/**Lib/test/test_grammar.pyis excluded by!Lib/**Lib/test/test_patma.pyis excluded by!Lib/**Lib/test/test_syntax.pyis excluded by!Lib/**Lib/test/test_type_comments.pyis excluded by!Lib/**crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__bare_function_annotations_check_attribute_and_subscript_expressions.snapis excluded by!**/*.snapcrates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__const_no_op.snapis excluded by!**/*.snapcrates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__constant_true_if_pass_keeps_line_anchor_nop.snapis excluded by!**/*.snapcrates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__if_ands.snapis excluded by!**/*.snapcrates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__if_mixed.snapis excluded by!**/*.snapcrates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__if_ors.snapis excluded by!**/*.snapcrates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__nested_bool_op.snapis excluded by!**/*.snap
📒 Files selected for processing (62)
.cspell.json.pre-commit-config.yamlCargo.tomlcrates/capi/src/ceval.rscrates/codegen/src/compile.rscrates/codegen/src/error.rscrates/codegen/src/ir.rscrates/codegen/src/lib.rscrates/codegen/src/preprocess.rscrates/codegen/src/symboltable.rscrates/codegen/src/unparse.rscrates/compiler-core/src/bytecode.rscrates/compiler-core/src/bytecode/instruction.rscrates/compiler-core/src/bytecode/oparg.rscrates/compiler/src/lib.rscrates/stdlib/src/_opcode.rscrates/vm/Cargo.tomlcrates/vm/src/builtins/type.rscrates/vm/src/eval.rscrates/vm/src/exceptions.rscrates/vm/src/frame.rscrates/vm/src/import.rscrates/vm/src/object/core.rscrates/vm/src/stdlib/_abc.rscrates/vm/src/stdlib/_ast.rscrates/vm/src/stdlib/_ast/argument.rscrates/vm/src/stdlib/_ast/basic.rscrates/vm/src/stdlib/_ast/constant.rscrates/vm/src/stdlib/_ast/elif_else_clause.rscrates/vm/src/stdlib/_ast/exception.rscrates/vm/src/stdlib/_ast/expression.rscrates/vm/src/stdlib/_ast/module.rscrates/vm/src/stdlib/_ast/node.rscrates/vm/src/stdlib/_ast/operator.rscrates/vm/src/stdlib/_ast/other.rscrates/vm/src/stdlib/_ast/parameter.rscrates/vm/src/stdlib/_ast/pattern.rscrates/vm/src/stdlib/_ast/pyast.rscrates/vm/src/stdlib/_ast/python.rscrates/vm/src/stdlib/_ast/repr.rscrates/vm/src/stdlib/_ast/statement.rscrates/vm/src/stdlib/_ast/string.rscrates/vm/src/stdlib/_ast/type_ignore.rscrates/vm/src/stdlib/_ast/type_parameters.rscrates/vm/src/stdlib/_ast/validate.rscrates/vm/src/stdlib/_symtable.rscrates/vm/src/stdlib/builtins.rscrates/vm/src/stdlib/sys.rscrates/vm/src/vm/compile.rscrates/vm/src/vm/interpreter.rscrates/vm/src/vm/mod.rscrates/vm/src/vm/python_run.rscrates/vm/src/vm/vm_new.rscrates/wasm/Cargo.tomlcrates/wasm/src/lib.rscrates/wasm/src/vm_class.rsexamples/hello_embed.rsexamples/mini_repl.rsexamples/parse_folder.rsextra_tests/snippets/builtin_compile.pysrc/shell.rstools/opcode_metadata/generate_rs_opcode_metadata.py
| pub(super) fn node_object_to_ast_string( | ||
| vm: &VirtualMachine, | ||
| obj: PyObjectRef, | ||
| ) -> PyResult<PyObjectRef> { | ||
| let cls = obj.class(); | ||
| if cls.is(vm.ctx.types.str_type) || cls.is(vm.ctx.types.bytes_type) { | ||
| Ok(obj) | ||
| } else { | ||
| Err(vm.new_type_error("AST string must be of type str")) | ||
| } | ||
| } |
There was a problem hiding this comment.
Error message does not reflect that bytes is also accepted.
The function accepts both str and bytes types (line 342), but the error message on line 345 only mentions str. This could confuse users who encounter the error after passing a different type.
Suggested fix
} else {
- Err(vm.new_type_error("AST string must be of type str"))
+ Err(vm.new_type_error("AST string must be of type str or bytes"))
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub(super) fn node_object_to_ast_string( | |
| vm: &VirtualMachine, | |
| obj: PyObjectRef, | |
| ) -> PyResult<PyObjectRef> { | |
| let cls = obj.class(); | |
| if cls.is(vm.ctx.types.str_type) || cls.is(vm.ctx.types.bytes_type) { | |
| Ok(obj) | |
| } else { | |
| Err(vm.new_type_error("AST string must be of type str")) | |
| } | |
| } | |
| pub(super) fn node_object_to_ast_string( | |
| vm: &VirtualMachine, | |
| obj: PyObjectRef, | |
| ) -> PyResult<PyObjectRef> { | |
| let cls = obj.class(); | |
| if cls.is(vm.ctx.types.str_type) || cls.is(vm.ctx.types.bytes_type) { | |
| Ok(obj) | |
| } else { | |
| Err(vm.new_type_error("AST string must be of type str or bytes")) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/vm/src/stdlib/_ast.rs` around lines 337 - 347, The error message in
the node_object_to_ast_string function on line 345 only mentions str as an
acceptable type, but the function actually accepts both str and bytes as shown
in the type check on line 342. Update the error message passed to
vm.new_type_error to accurately reflect that both str and bytes types are
accepted, so users receive an accurate description of what types are allowed.
754a4a4 to
cfc9db9
Compare
cfc9db9 to
5235b03
Compare
| // Invalidate negative cache | ||
| increment_invalidation_counter(); | ||
|
|
||
| if let Some(cls_type) = cls.downcast_ref::<PyType>() |
There was a problem hiding this comment.
check if this have CPython reference
| rustpython-wtf8 = { path = "crates/wtf8", version = "0.5.0" } | ||
| rustpython-doc = { path = "crates/doc", version = "0.5.0" } | ||
|
|
||
| # Use RustPython-packaged Ruff crates from the published fork while keeping |
There was a problem hiding this comment.
minimize diff and keep rustpython-ruff_python_parser as commented
| @@ -65,8 +65,8 @@ def fib(n): | |||
| // this line also automatically prints the output | |||
| // (note that this is only the case when compiler::Mode::Single is passed to vm.compile) | |||
| match vm | |||
| .compile(&input, vm::compiler::Mode::Single, "<embedded>") | |||
| .map_err(|err| vm.new_syntax_error(&err, Some(&input))) | |||
| .compile(&input, vm::compiler::Mode::Single, "<embedded>".to_owned()) | |||
There was a problem hiding this comment.
why redundant .to_owned() is added?
| syntax_error.as_object(), self, unwrap, | ||
| "_metadata" => metadata, | ||
| ); | ||
| syntax_error |
There was a problem hiding this comment.
keep set_attrs! usage
| } | ||
| } | ||
|
|
||
| impl<T: Node> Node for ThinVec<T> { |
There was a problem hiding this comment.
if we added ThinVec support, does we still need Vec support?
| impl Transformer for AstPreprocessor { | ||
| fn visit_stmt(&self, stmt: &mut ast::Stmt) { | ||
| match stmt { | ||
| ast::Stmt::FunctionDef(function) => { |
There was a problem hiding this comment.
do we really need to full-implement this match arms again?
3324654 to
ba0113c
Compare
Update the compiler pipeline, AST module support, and related VM compile/eval paths toward CPython 3.14 behavior. Use upstream Ruff parser crates directly and keep RustPython-specific syntax preflight handling in the compile path. Refresh opcode metadata, snapshots, and targeted tests for the aligned bytecode and AST behavior.
ba0113c to
af789fa
Compare
Summary
_ast,symtable, and compile-path handling so RustPython follows CPython semantics while using the Ruff parser AST surface.Summary by CodeRabbit
Release Notes
__future__flags, including improved future-annotation handling.finallyblocks during preprocessing.execto accept closure objects.Py_CompileStringhandling.