Align compiler and AST behavior with CPython by youknowone · Pull Request #8138 · RustPython/RustPython

youknowone · 2026-06-22T03:45:42Z

Closes #xxxx
This PR follows our AI policy

Summary

Align RustPython compiler/codegen behavior more closely with CPython, including bytecode layout, control-flow lowering, optimizer behavior, source locations, exception tables, and future flags.
Expand _ast, symtable, and compile-path handling so RustPython follows CPython semantics while using the Ruff parser AST surface.
Remove the local Ruff parser vendor fork and depend on upstream Ruff by pinned commit for reproducible builds.
Update compiler, opcode, and Python-level tests/snapshots for the new CPython-aligned behavior.

Summary by CodeRabbit

Release Notes

New Features
- Extended parsing/compilation support for more __future__ flags, including improved future-annotation handling.
- Added a warning for invalid control flow inside finally blocks during preprocessing.
- Enhanced exec to accept closure objects.
Bug Fixes
- Improved error reporting for compile/import/eval failures to raise the correct Python exceptions.
- Strengthened structural pattern matching validation (duplicate attribute/key detection).
- Fixed dict-comprehension un-parsing behavior and improved embedding-facing Py_CompileString handling.
Chores
- Updated lint/spell configuration and refreshed tooling dependencies.

coderabbitai · 2026-06-22T03:45:51Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 7746687c-05fe-4a91-95d4-1f319aff1191

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR updates compile-time plumbing, code generation, AST conversion and validation, runtime pattern handling, and builtin execution paths to align behavior with newer CPython-style semantics.

Changes

Compiler and AST parity

Layer / File(s)	Summary
Tooling, C-API compile path, and compiler metadata `.cspell.json`, `.pre-commit-config.yaml`, `Cargo.toml`, `crates/capi/src/ceval.rs`, `crates/compiler-core/src/bytecode*.rs`, `crates/stdlib/src/_opcode.rs`, `crates/vm/Cargo.toml`	Updates Ruff sourcing/configuration, adds `thin-vec`, extends C-ABI compile helpers and compiler flags, widens unpack arguments, and adjusts low-level compile/error conversion paths and opcode tests.
Codegen errors, preprocessing, and optimizer flow `crates/codegen/src/{error.rs,ir.rs,lib.rs,preprocess.rs,unparse.rs}`	Adds public preprocessing APIs and AST side tables, revises error variants, changes future-feature and docstring preprocessing, updates constant folding and jump threading, and expands optimizer regression tests.
Runtime type flags and match execution `crates/vm/src/{builtins/type.rs,frame.rs,object/core.rs,stdlib/{_abc.rs,_symtable.rs}}`	Caches inherited ABC collection flags on types, propagates them through registration, uses them in match operations, checks duplicate class attributes and mapping keys, and extends symbol-table scope typing.
AST core helpers and Python-side node classes `crates/vm/src/stdlib/_ast/{constant.rs,pyast.rs,python.rs,repr.rs,type_ignore.rs,type_parameters.rs,validate.rs,basic.rs,operator.rs,other.rs,parameter.rs,module.rs,node.rs,argument.rs}`	Introduces public AST override storage and lookup paths, rewrites shared list and field extraction helpers, updates Python-side AST class behavior, and makes validation consume override-backed node data.
AST node conversion refactors `crates/vm/src/stdlib/_ast/{expression.rs,statement.rs,pattern.rs,string.rs,exception.rs,elif_else_clause.rs}`	Refactors expression, statement, pattern, exception, f-string, and template-string conversions around range-aware constructors and public-node reconstruction paths.
Builtin compile, eval, and exec flow `crates/vm/src/stdlib/builtins.rs`, `crates/vm/src/{eval.rs,import.rs,exceptions.rs,sys.rs}`	Reworks builtin compilation and execution to merge future flags, audit source compilation, decode byte inputs, validate closures, and adjust related runtime error conversion paths.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant builtins.rs
  participant _ast
  participant codegen
  participant VM

  Caller->>builtins.rs: compile/eval/exec input
  builtins.rs->>_ast: parse, preprocess, or validate AST/source
  _ast->>codegen: preprocess module and future flags
  codegen->>VM: compile code object with flags
  VM-->>builtins.rs: code object or exception
  builtins.rs-->>Caller: result or raised exception

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

RustPython/RustPython#7927: Also changes crates/capi/src/ceval.rs around Py_CompileString.
RustPython/RustPython#7700: Also changes crates/vm/src/builtins/type.rs around __abc_tpflags__ handling.
RustPython/RustPython#7721: Also changes crates/codegen/src/ir.rs optimizer and constant-folding logic.

Suggested reviewers

ShaharNaveh

Poem

🐇 I hopped through syntax, flags, and trees,
and tugged at bytecode in the breeze.
With AST burrows neat and deep,
the matchers now more closely keep.
A compile-time carrot, crisp and bright—
thump thump, the tests all passed tonight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 15.56% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title "Align compiler and AST behavior with CPython" accurately and clearly describes the primary objective of this comprehensive changeset, which updates RustPython's compiler pipeline and AST handling to match CPython semantics.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

github-actions · 2026-06-22T03:46:35Z

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[ ] test: cpython/Lib/test/test_syntax.py (TODO: 65)

dependencies:

dependent tests: (no tests depend on syntax)

[x] lib: cpython/Lib/pydoc.py
[x] lib: cpython/Lib/pydoc_data
[ ] test: cpython/Lib/test/test_pydoc (TODO: 34)

dependencies:

pydoc

dependent tests: (5 tests)

pydoc: test_enum test_pydoc
- pdb: test_pdb
- xmlrpc.server: test_docxmlrpc test_xmlrpc

[x] test: cpython/Lib/test/test_unpack.py
[ ] test: cpython/Lib/test/test_unpack_ex.py (TODO: 4)

dependencies:

dependent tests: (no tests depend on unpack)

[x] test: cpython/Lib/test/test_cmd_line_script.py (TODO: 14)

dependencies:

dependent tests: (no tests depend on cmd_line_script)

[x] lib: cpython/Lib/ast.py
[x] lib: cpython/Lib/_ast_unparse.py
[x] test: cpython/Lib/test/test_unparse.py
[x] test: cpython/Lib/test/test_type_comments.py

dependencies:

ast

dependent tests: (148 tests)

[x] test: cpython/Lib/test/test_global.py

dependencies:

dependent tests: (no tests depend on global)

[ ] test: cpython/Lib/test/test_generators.py (TODO: 11)
[ ] test: cpython/Lib/test/test_genexps.py (TODO: 4)
[x] test: cpython/Lib/test/test_generator_stop.py
[x] test: cpython/Lib/test/test_yield_from.py (TODO: 1)

dependencies:

dependent tests: (no tests depend on generator)

[x] test: cpython/Lib/test/test_list.py (TODO: 4)
[ ] test: cpython/Lib/test/test_listcomps.py (TODO: 4)
[ ] test: cpython/Lib/test/test_userlist.py (TODO: 1)

dependencies:

dependent tests: (no tests depend on list)

[x] test: cpython/Lib/test/test_compile.py
[x] test: cpython/Lib/test/test_compiler_assemble.py
[x] test: cpython/Lib/test/test_compiler_codegen.py
[x] test: cpython/Lib/test/test_peepholer.py (TODO: 3)

dependencies:

dependent tests: (no tests depend on compile)

[x] lib: cpython/Lib/codeop.py
[x] test: cpython/Lib/test/test_codeop.py (TODO: 2)

dependencies:

codeop

dependent tests: (103 tests)

[x] lib: cpython/Lib/pdb.py
[ ] test: cpython/Lib/test/test_pdb.py (TODO: 47)

dependencies:

pdb

dependent tests: (1 tests)

pdb: test_pdb

[x] test: cpython/Lib/test/test_builtin.py (TODO: 15)

dependencies:

dependent tests: (no tests depend on builtin)

[x] test: cpython/Lib/test/test_str.py (TODO: 6)
[x] test: cpython/Lib/test/test_fstring.py (TODO: 14)
[x] test: cpython/Lib/test/test_string_literals.py (TODO: 4)

dependencies:

dependent tests: (no tests depend on str)

[ ] test: cpython/Lib/test/test_grammar.py (TODO: 1)

dependencies:

dependent tests: (no tests depend on grammar)

[x] test: cpython/Lib/test/test_sys.py (TODO: 7)
[x] test: cpython/Lib/test/test_syslog.py (TODO: 1)
[x] test: cpython/Lib/test/test_sys_setprofile.py (TODO: 2)
[x] test: cpython/Lib/test/test_sys_settrace.py (TODO: 168)
[x] test: cpython/Lib/test/test_audit.py (TODO: 14)
[x] test: cpython/Lib/test/audit-tests.py

dependencies:

dependent tests: (240 tests)

sys: regrtestdata test___all__ test__colorize test__locale test__osx_support test_android test_annotationlib test_argparse test_array test_asdl_parser test_ast test_asyncio test_audit test_bdb test_bigaddrspace test_bigmem test_bisect test_buffer test_builtin test_bytes test_bz2 test_c_locale_coercion test_calendar test_class test_clinic test_cmath test_cmd test_cmd_line test_cmd_line_script test_code test_code_module test_codeccallbacks test_codecs test_collections test_compile test_compileall test_complex test_concurrent_futures test_context test_contextlib test_coroutines test_cprofile test_csv test_ctypes test_datetime test_dbm test_dbm_sqlite3 test_decimal test_descr test_dict test_difflib test_dis test_doctest test_doctest2 test_docxmlrpc test_dtrace test_dynamic test_dynamicclassattribute test_email test_embed test_ensurepip test_enum test_enumerate test_eof test_except_star test_exceptions test_faulthandler test_fcntl test_file test_file_eintr test_fileinput test_fileio test_float test_fork1 test_format test_fractions test_frame test_frozen test_functools test_future_stmt test_gc test_generated_cases test_generators test_genericpath test_genexps test_getopt test_glob test_grammar test_gzip test_hash test_hashlib test_http_cookiejar test_httpservers test_importlib test_inspect test_int test_io test_ioctl test_iter test_itertools test_json test_largefile test_launcher test_list test_listcomps test_locale test_logging test_long test_lzma test_mailbox test_marshal test_math test_memoryio test_memoryview test_metaclass test_mimetypes test_mmap test_monitoring test_msvcrt test_multibytecodec test_multiprocessing_forkserver test_multiprocessing_main_handling test_multiprocessing_spawn test_netrc test_ntpath test_numeric_tower test_operator test_optparse test_ordered_dict test_os test_osx_env test_pathlib test_patma test_pdb test_peepholer test_perfmaps test_pickle test_pkg test_pkgutil test_platform test_plistlib test_popen test_posix test_posixpath test_print test_profile test_property test_pty test_pwd test_py_compile test_pyclbr test_pydoc test_pyexpat test_pyrepl test_quopri test_raise test_range test_re test_regrtest test_repl test_reprlib test_resource test_runpy test_sax test_scope test_script_helper test_select test_selectors test_shutil test_signal test_site test_slice test_smtplib test_socket test_sqlite3 test_ssl test_stable_abi_ctypes test_stat test_statistics test_str test_strftime test_string_literals test_strptime test_strtod test_struct test_subprocess test_support test_sys test_sys_setprofile test_sys_settrace test_sysconfig test_syslog test_tarfile test_tempfile test_termios test_threading test_threading_local test_threadsignals test_time test_timeit test_tomllib test_tools test_trace test_traceback test_tracemalloc test_tuple test_type_cache test_type_comments test_types test_typing test_unicode_file test_unicode_file_functions test_unicodedata test_unittest test_univnewlines test_urllib test_urllib2 test_urllib2net test_urlparse test_utf8_mode test_uuid test_venv test_wait3 test_wait4 test_wave test_weakref test_webbrowser test_winconsoleio test_winreg test_with test_wsgiref test_xml_etree test_xmlrpc test_xpickle test_zipapp test_zipfile test_zipfile64 test_zipimport test_zipimport_support test_zlib

[x] lib: cpython/Lib/symtable.py
[x] test: cpython/Lib/test/test_symtable.py (TODO: 14)

dependencies:

symtable

dependent tests: (2 tests)

symtable: test_inspect test_symtable

[x] test: cpython/Lib/test/test_named_expressions.py (TODO: 2)

dependencies:

dependent tests: (no tests depend on named_expressions)

[x] lib: cpython/Lib/future.py

dependencies:

future

dependent tests: (35 tests)

future: test_flufl test_future_stmt test_generator_stop test_pydoc test_zoneinfo
- codeop: test_codeop
  - pdb: test_pdb
  - traceback: test_asyncio test_builtin test_code_module test_contextlib test_contextlib_async test_coroutines test_dictcomps test_exceptions test_http_cookiejar test_importlib test_iter test_listcomps test_pyexpat test_setcomps test_socket test_ssl test_subprocess test_sys test_threadedtempfile test_threading test_traceback test_unittest test_with test_zipimport
- importlib.metadata: test_importlib
- pydoc: test_enum
  - xmlrpc.server: test_docxmlrpc test_xmlrpc

[ ] test: cpython/Lib/test/test_pep646_syntax.py (TODO: 2)

dependencies:

dependent tests: (no tests depend on pep646_syntax)

[x] lib: cpython/Lib/typing.py
[ ] test: cpython/Lib/test/test_typing.py (TODO: 2)
[x] test: cpython/Lib/test/test_type_aliases.py
[x] test: cpython/Lib/test/test_type_annotations.py (TODO: 1)
[x] test: cpython/Lib/test/test_type_params.py (TODO: 1)
[x] test: cpython/Lib/test/test_genericalias.py

dependencies:

typing

dependent tests: (19 tests)

typing: test_annotationlib test_builtin test_copy test_enum test_fractions test_funcattrs test_functools test_genericalias test_grammar test_inspect test_isinstance test_patma test_peg_generator test_pydoc test_pyrepl test_type_aliases test_type_params test_types test_typing

[ ] test: cpython/Lib/test/test_pyrepl (TODO: 23)
[ ] test: cpython/Lib/test/test_repl.py

dependencies:

dependent tests: (no tests depend on pyrepl)

[ ] test: cpython/Lib/test/test_patma.py (TODO: 1)

dependencies:

dependent tests: (no tests depend on patma)

[ ] test: cpython/Lib/test/test_exceptions.py (TODO: 22)
[ ] test: cpython/Lib/test/test_baseexception.py
[x] test: cpython/Lib/test/test_except_star.py (TODO: 1)
[ ] test: cpython/Lib/test/test_exception_group.py (TODO: 3)
[x] test: cpython/Lib/test/test_exception_hierarchy.py (TODO: 2)
[x] test: cpython/Lib/test/test_exception_variations.py

dependencies:

dependent tests: (no tests depend on exception)

Legend:

[+] path exists in CPython
[x] up-to-date, [ ] outdated

bschoenmaeckers · 2026-06-22T08:07:48Z

+}
+
+#[unsafe(no_mangle)]
+pub unsafe extern "C" fn Py_CompileStringFlags(


Py_CompileStringFlags is not part of the stable ABI.

I also reverted more misoperations on ceval.rs

Sign in to view

-        vm.compile(code, mode, filename)
-            .map_err(|err| vm.new_syntax_error(&err, Some(code)))
+#[unsafe(no_mangle)]
+pub unsafe extern "C" fn Py_CompileStringExFlags(


coderabbitai

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/vm/src/stdlib/_ast/repr.rs (1)

82-86: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve the comma for one-element tuple fields.

repr_ast_tuple now renders a single-item tuple as (value), which is no longer a tuple representation. AST reprs containing tuple-valued fields should keep (value,).

🐛 Proposed fix

     } else if parts.len() > 1 {
         rendered.push_wtf8(", ".as_ref());
         rendered.push_wtf8(&parts[1]);
+    } else if items.len() == 1 {
+        rendered.push_wtf8(",".as_ref());
     }
     rendered.push_wtf8(")".as_ref());

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/stdlib/_ast/repr.rs` around lines 82 - 86, The repr_ast_tuple
function currently does not preserve the trailing comma for single-element
tuples, causing them to render as `(value)` instead of the correct `(value,)`.
Add logic to handle the case when parts.len() equals 1, where you should push
the first element from parts[0] followed by a comma before the closing
parenthesis is added. The existing condition checking parts.len() > 1 should
remain unchanged, but you need to add an else-if branch or similar handling to
address the single-element case explicitly.

🧹 Nitpick comments (3)

crates/vm/src/stdlib/_ast/operator.rs (1)

19-30: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Consolidate singleton-node dispatch into one shared matcher.

These four functions duplicate the same “scan node type candidates, return enum variant, else type error” logic. A shared helper + per-enum table will keep mappings centralized and easier to audit.

♻️ Suggested refactor shape

+fn match_singleton_node<T: Copy>(
+    vm: &VirtualMachine,
+    object: &PyObjectRef,
+    expected: &str,
+    variants: &[(&'static Py<PyType>, T)],
+) -> PyResult<T> {
+    for (node_type, value) in variants {
+        if is_node_instance(vm, object, *node_type)? {
+            return Ok(*value);
+        }
+    }
+    Err(vm.new_type_error(format!(
+        "expected some sort of {expected}, but got {}",
+        object.repr(vm)?
+    )))
+}
+
     fn ast_from_object(
         vm: &VirtualMachine,
         _source_file: &SourceFile,
         object: PyObjectRef,
     ) -> PyResult<Self> {
-        Ok(
-            if is_node_instance(vm, &object, pyast::NodeOperatorAdd::static_type())? {
-                Self::Add
-            } else if ...
-        )
+        match_singleton_node(
+            vm,
+            &object,
+            "operator",
+            &[
+                (pyast::NodeOperatorAdd::static_type(), Self::Add),
+                (pyast::NodeOperatorSub::static_type(), Self::Sub),
+                // ...
+                (pyast::NodeOperatorFloorDiv::static_type(), Self::FloorDiv),
+            ],
+        )
     }

As per coding guidelines, “When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code.”

Also applies to: 60-93, 114-129, 156-183

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/stdlib/_ast/operator.rs` around lines 19 - 30, The boolop
dispatch logic in this function is duplicated across four functions with only
differing node type checks and enum variant returns. Extract a shared helper
function that takes a mapping or table of node type candidates paired with their
corresponding enum variants, then performs the common dispatch pattern (check
each node type and return the matching variant, or throw a type error if none
match). Replace the duplicate dispatch code in all four functions (including the
blocks at lines 60-93, 114-129, and 156-183) with calls to this shared helper,
passing only the differing node-type-to-variant mappings.

Source: Coding guidelines

crates/vm/src/stdlib/_ast/module.rs (1)

43-67: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Extract Mod node-kind detection into a shared matcher helper.

The dispatch in Line 43-Line 67 repeats the same branch shape; extracting the detected kind first will keep conversion logic single-path and reduce drift with similar AST dispatchers.

♻️ Suggested refactor shape

+enum ModKind {
+    Module,
+    Interactive,
+    Expression,
+    FunctionType,
+}
+
+fn mod_kind_from_object(vm: &VirtualMachine, object: &PyObjectRef) -> PyResult<ModKind> {
+    if object.is_instance(pyast::NodeModModule::static_type().as_object(), vm)? {
+        Ok(ModKind::Module)
+    } else if object.is_instance(pyast::NodeModInteractive::static_type().as_object(), vm)? {
+        Ok(ModKind::Interactive)
+    } else if object.is_instance(pyast::NodeModExpression::static_type().as_object(), vm)? {
+        Ok(ModKind::Expression)
+    } else if object.is_instance(pyast::NodeModFunctionType::static_type().as_object(), vm)? {
+        Ok(ModKind::FunctionType)
+    } else {
+        Err(vm.new_type_error(format!(
+            "expected some sort of mod, but got {}",
+            object.repr(vm)?
+        )))
+    }
+}
+
     fn ast_from_object(
         vm: &VirtualMachine,
         source_file: &SourceFile,
         object: PyObjectRef,
     ) -> PyResult<Self> {
-        Ok(
-            if object.is_instance(pyast::NodeModModule::static_type().as_object(), vm)? {
-                Self::Module(ModModule::ast_from_object(vm, source_file, object)?)
-            } else if object
-                .is_instance(pyast::NodeModInteractive::static_type().as_object(), vm)?
-            {
-                Self::Interactive(ModInteractive::ast_from_object(vm, source_file, object)?)
-            } else if object.is_instance(pyast::NodeModExpression::static_type().as_object(), vm)? {
-                Self::Expression(ast::ModExpression::ast_from_object(
-                    vm,
-                    source_file,
-                    object,
-                )?)
-            } else if object
-                .is_instance(pyast::NodeModFunctionType::static_type().as_object(), vm)?
-            {
-                Self::FunctionType(ModFunctionType::ast_from_object(vm, source_file, object)?)
-            } else {
-                return Err(vm.new_type_error(format!(
-                    "expected some sort of mod, but got {}",
-                    object.repr(vm)?
-                )));
-            },
-        )
+        match mod_kind_from_object(vm, &object)? {
+            ModKind::Module => Ok(Self::Module(ModModule::ast_from_object(vm, source_file, object)?)),
+            ModKind::Interactive => Ok(Self::Interactive(ModInteractive::ast_from_object(vm, source_file, object)?)),
+            ModKind::Expression => Ok(Self::Expression(ast::ModExpression::ast_from_object(vm, source_file, object)?)),
+            ModKind::FunctionType => Ok(Self::FunctionType(ModFunctionType::ast_from_object(vm, source_file, object)?)),
+        }
     }

As per coding guidelines, “When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/stdlib/_ast/module.rs` around lines 43 - 67, The dispatch logic
in the conversion function for Mod contains repeated branches that all follow
the same pattern: checking is_instance for different node types and then calling
ast_from_object. Extract the detection logic by first identifying which mod kind
the object is (ModModule, ModInteractive, ModExpression, or ModFunctionType)
into a separate helper or pattern match, then perform the ast_from_object
conversion call once using the detected kind rather than repeating the
conversion call in each branch. This eliminates duplicate code where the only
difference between branches is the specific node type being checked.

Source: Coding guidelines

crates/vm/src/stdlib/_ast/string.rs (1)

730-730: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Replace the decorative section separator.

This adds a // ===== ... ===== separator, which the repository guidelines explicitly disallow. Use a plain short comment or remove it.
Suggested fix
-// ===== TString (Template String) Support =====
+// Template string support.
As per coding guidelines, “Do not add decorative section separators (e.g. // -----------, // ===, /* *** */). Use /// doc-comments or short // comments only when they add value.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/stdlib/_ast/string.rs` at line 730, The decorative section
separator comment using the `// ===== ... =====` pattern violates repository
coding guidelines that disallow such decorative separators. Replace the
separator comment at the TString (Template String) Support section with either a
plain short comment like `// TString (Template String) Support` if the section
heading adds value, or remove it entirely if the context is clear from the
surrounding code.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/codegen/src/ir.rs`:
- Around line 3601-3608: The pattern matching for is_copy_of_load_const only
checks for Instruction::LoadConst, but line 3595 can canonicalize a LOAD_CONST
integer to LOAD_SMALL_INT before curr is captured, causing valid COPY patterns
after LOAD_SMALL_INT to not be recognized. Update the matches! condition for
is_copy_of_load_const to also handle Instruction::LoadSmallInt in addition to
Instruction::LoadConst so that patterns like LOAD_SMALL_INT followed by COPY are
properly tracked and can be fused with subsequent operations like TO_BOOL.

In `@crates/vm/src/frame.rs`:
- Around line 3198-3200: The current `seen_keys: Vec<PyObjectRef>` approach uses
linear scanning with `vm.bool_eq` which changes observable behavior by calling
`__eq__` for keys CPython would separate by hash and allows unhashable keys to
reach custom implementations. Replace this vector-based tracking with a Python
set-backed collection that respects proper hash and equality semantics. Extract
the duplicate key checking logic into a single shared function that works across
both lookup paths instead of duplicating the check logic in the branches around
lines 3217-3230 and 3247-3260, following the guideline to extract differing
values first and call common logic once.
- Line 3056: The code uses Vec::<String>::new() to track seen attributes and
calls expect_str() on user-defined __match_args__, which panics on strings with
lone surrogates. Change seen_attrs from Vec::<String> to Vec::<PyObjectRef> to
store the actual Python objects instead of converting to strings. In both the
positional attribute tracking section (around lines 3103-3116) and keyword
attribute tracking section (around lines 3159-3172), replace direct string
equality comparisons with vm.bool_eq() calls to safely compare the PyObjectRef
values. Use .repr() on PyObjectRef objects when constructing error messages
instead of trying to extract string values. This preserves the original Python
objects and avoids panicking on surrogates.

In `@crates/vm/src/stdlib/_abc.rs`:
- Around line 244-250: The collection_flags calculation only reads from
cls_type.slots.flags but ignores the abc_tpflags field, which stores virtual
ABC-derived sequence and mapping markers. This causes transitive virtual
registrations to be lost during propagation. Modify the collection_flags
assignment to include flags from both cls_type.slots.flags and
cls_type.abc_tpflags by combining them appropriately before passing to
set_abc_collection_flags_recursive, ensuring that virtual ABC registrations are
preserved when propagating collection flags.

In `@crates/vm/src/stdlib/_ast.rs`:
- Around line 337-347: The error message in the node_object_to_ast_string
function on line 345 only mentions str as an acceptable type, but the function
actually accepts both str and bytes as shown in the type check on line 342.
Update the error message passed to vm.new_type_error to accurately reflect that
both str and bytes types are accepted, so users receive an accurate description
of what types are allowed.

In `@crates/vm/src/stdlib/_ast/constant.rs`:
- Around line 1124-1144: The first_invalid_constant_type function recursively
scans nested tuples and frozensets without checking the VM's recursion limit,
which can cause a Rust stack overflow on deeply nested structures. Guard the
recursive calls to first_invalid_constant_type_opt with vm.with_recursion to
enforce the VM's recursion limit before recursing into items in both the tuple
iteration block (where items are processed from tuple.iter()) and the frozenset
iteration block (where items are processed from set.elements()). Apply the same
recursion guard pattern to any other similar recursive scans referenced around
lines 1209-1216.

In `@crates/vm/src/stdlib/_ast/pyast.rs`:
- Around line 91-109: The base AST classes generated by this macro (such as
stmt, expr, and excepthandler) are missing initialization of _field_types and
__annotations__ attributes, which are needed for CPython-compatible
introspection metadata. In the same macro branch where _fields is being set to
ctx.empty_tuple, add initialization for _field_types and __annotations__ as
empty dictionaries using appropriate ctx methods. This should be done before or
after the _fields initialization to ensure all base classes maintain proper
metadata structure.

In `@crates/vm/src/stdlib/_ast/python.rs`:
- Around line 478-480: In the branch where
ftype.fast_isinstance(vm.ctx.types.union_type) evaluates to true (handling
optional T | None fields), instead of leaving the optional field unset with just
a comment, assign a default value of None to the field. This ensures that
optional attributes always exist on AST nodes and prevents missing field errors
in constructors, __replace__ methods, and repr/validation operations that expect
the field to be present.

In `@crates/vm/src/stdlib/_ast/statement.rs`:
- Line 104: The issue is that stmt_range_from_object is being called on an
object before validating it is actually a statement type, causing location field
errors to appear before the type-check error. Move the range extraction call
from before the match statement into each individual successful match branch
(for each specific statement type being handled), and leave the final else
clause to report the "expected some sort of stmt" error without requiring
location fields to be present on invalid node types.
- Around line 29-42: In the function definition_range_from_name, replace the
incorrect keyword_len arithmetic calculation with a search-based approach.
Instead of using checked_sub with the keyword_len parameter to compute
keyword_start, use rfind to search backwards from name_start for the actual
keyword position. The current implementation with checked_sub and keyword_len is
fragile and produces correct results only by accident through the
max(line_start) clamping. You will need to pass the keyword string as a
parameter to the function so rfind can locate its actual position in the source
code.

In `@crates/vm/src/stdlib/_ast/string.rs`:
- Around line 913-940: The interpolation_to_expr function currently loses the
original location information by setting both the ast::TString range and
ast::ExprTString range to TextRange::default(). Capture the range from the
interpolation parameter before it is moved into TemplateStrPart::Interpolation,
then use this captured range to populate the range field for both the
ast::TString struct and the ast::ExprTString struct instead of using
TextRange::default().

In `@crates/vm/src/stdlib/_ast/type_parameters.rs`:
- Around line 87-118: The function currently calls type_param_range_from_object
on the object before validating that it is actually a supported type_param node
kind, which can cause it to fail while reading location fields instead of
returning the intended error message. Restructure the logic to first classify
the object by checking if it is an instance of TypeVar, ParamSpec, or
TypeVarTuple using is_node_instance, and only after confirming it is a valid
type should type_param_range_from_object be called to extract the range. If none
of the three type checks match, immediately return the type_error before
attempting to access any fields on the invalid object.

In `@crates/vm/src/stdlib/builtins.rs`:
- Around line 374-377: The constant CODEGEN_NOT_SUPPORTED is defined only under
the feature guard #[cfg(not(feature = "rustpython-compiler"))] but is being used
under #[cfg(not(feature = "compiler"))] in the code blocks at lines 374-377 and
397-400. Since "compiler" is a feature alias that depends on
"rustpython-compiler", these conditions are not equivalent and the constant may
be unavailable, causing a compilation error. Fix this by making the feature
guard consistent for the CODEGEN_NOT_SUPPORTED constant definition and its usage
locations—either move the constant definition to be guarded by #[cfg(not(feature
= "compiler"))] or change all usage sites to reference #[cfg(not(feature =
"rustpython-compiler"))] instead, ensuring the same feature flag is used
throughout.

---

Outside diff comments:
In `@crates/vm/src/stdlib/_ast/repr.rs`:
- Around line 82-86: The repr_ast_tuple function currently does not preserve the
trailing comma for single-element tuples, causing them to render as `(value)`
instead of the correct `(value,)`. Add logic to handle the case when parts.len()
equals 1, where you should push the first element from parts[0] followed by a
comma before the closing parenthesis is added. The existing condition checking
parts.len() > 1 should remain unchanged, but you need to add an else-if branch
or similar handling to address the single-element case explicitly.

---

Nitpick comments:
In `@crates/vm/src/stdlib/_ast/module.rs`:
- Around line 43-67: The dispatch logic in the conversion function for Mod
contains repeated branches that all follow the same pattern: checking
is_instance for different node types and then calling ast_from_object. Extract
the detection logic by first identifying which mod kind the object is
(ModModule, ModInteractive, ModExpression, or ModFunctionType) into a separate
helper or pattern match, then perform the ast_from_object conversion call once
using the detected kind rather than repeating the conversion call in each
branch. This eliminates duplicate code where the only difference between
branches is the specific node type being checked.

In `@crates/vm/src/stdlib/_ast/operator.rs`:
- Around line 19-30: The boolop dispatch logic in this function is duplicated
across four functions with only differing node type checks and enum variant
returns. Extract a shared helper function that takes a mapping or table of node
type candidates paired with their corresponding enum variants, then performs the
common dispatch pattern (check each node type and return the matching variant,
or throw a type error if none match). Replace the duplicate dispatch code in all
four functions (including the blocks at lines 60-93, 114-129, and 156-183) with
calls to this shared helper, passing only the differing node-type-to-variant
mappings.

In `@crates/vm/src/stdlib/_ast/string.rs`:
- Line 730: The decorative section separator comment using the `// ===== ...
=====` pattern violates repository coding guidelines that disallow such
decorative separators. Replace the separator comment at the TString (Template
String) Support section with either a plain short comment like `// TString
(Template String) Support` if the section heading adds value, or remove it
entirely if the context is clear from the surrounding code.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: e18de394-1f13-4b89-b45b-7957a283e3cb

📥 Commits

Reviewing files that changed from the base of the PR and between 55b96f9 and c4bd2c9.

⛔ Files ignored due to path filters (13)

Cargo.lock is excluded by !**/*.lock
Lib/test/test_compile.py is excluded by !Lib/**
Lib/test/test_grammar.py is excluded by !Lib/**
Lib/test/test_patma.py is excluded by !Lib/**
Lib/test/test_syntax.py is excluded by !Lib/**
Lib/test/test_type_comments.py is excluded by !Lib/**
crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__bare_function_annotations_check_attribute_and_subscript_expressions.snap is excluded by !**/*.snap
crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__const_no_op.snap is excluded by !**/*.snap
crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__constant_true_if_pass_keeps_line_anchor_nop.snap is excluded by !**/*.snap
crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__if_ands.snap is excluded by !**/*.snap
crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__if_mixed.snap is excluded by !**/*.snap
crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__if_ors.snap is excluded by !**/*.snap
crates/stdlib/src/snapshots/rustpython_stdlib___opcode__tests__nested_bool_op.snap is excluded by !**/*.snap

📒 Files selected for processing (62)

.cspell.json
.pre-commit-config.yaml
Cargo.toml
crates/capi/src/ceval.rs
crates/codegen/src/compile.rs
crates/codegen/src/error.rs
crates/codegen/src/ir.rs
crates/codegen/src/lib.rs
crates/codegen/src/preprocess.rs
crates/codegen/src/symboltable.rs
crates/codegen/src/unparse.rs
crates/compiler-core/src/bytecode.rs
crates/compiler-core/src/bytecode/instruction.rs
crates/compiler-core/src/bytecode/oparg.rs
crates/compiler/src/lib.rs
crates/stdlib/src/_opcode.rs
crates/vm/Cargo.toml
crates/vm/src/builtins/type.rs
crates/vm/src/eval.rs
crates/vm/src/exceptions.rs
crates/vm/src/frame.rs
crates/vm/src/import.rs
crates/vm/src/object/core.rs
crates/vm/src/stdlib/_abc.rs
crates/vm/src/stdlib/_ast.rs
crates/vm/src/stdlib/_ast/argument.rs
crates/vm/src/stdlib/_ast/basic.rs
crates/vm/src/stdlib/_ast/constant.rs
crates/vm/src/stdlib/_ast/elif_else_clause.rs
crates/vm/src/stdlib/_ast/exception.rs
crates/vm/src/stdlib/_ast/expression.rs
crates/vm/src/stdlib/_ast/module.rs
crates/vm/src/stdlib/_ast/node.rs
crates/vm/src/stdlib/_ast/operator.rs
crates/vm/src/stdlib/_ast/other.rs
crates/vm/src/stdlib/_ast/parameter.rs
crates/vm/src/stdlib/_ast/pattern.rs
crates/vm/src/stdlib/_ast/pyast.rs
crates/vm/src/stdlib/_ast/python.rs
crates/vm/src/stdlib/_ast/repr.rs
crates/vm/src/stdlib/_ast/statement.rs
crates/vm/src/stdlib/_ast/string.rs
crates/vm/src/stdlib/_ast/type_ignore.rs
crates/vm/src/stdlib/_ast/type_parameters.rs
crates/vm/src/stdlib/_ast/validate.rs
crates/vm/src/stdlib/_symtable.rs
crates/vm/src/stdlib/builtins.rs
crates/vm/src/stdlib/sys.rs
crates/vm/src/vm/compile.rs
crates/vm/src/vm/interpreter.rs
crates/vm/src/vm/mod.rs
crates/vm/src/vm/python_run.rs
crates/vm/src/vm/vm_new.rs
crates/wasm/Cargo.toml
crates/wasm/src/lib.rs
crates/wasm/src/vm_class.rs
examples/hello_embed.rs
examples/mini_repl.rs
examples/parse_folder.rs
extra_tests/snippets/builtin_compile.py
src/shell.rs
tools/opcode_metadata/generate_rs_opcode_metadata.py

coderabbitai · 2026-06-22T10:29:29Z

+pub(super) fn node_object_to_ast_string(
+    vm: &VirtualMachine,
+    obj: PyObjectRef,
+) -> PyResult<PyObjectRef> {
+    let cls = obj.class();
+    if cls.is(vm.ctx.types.str_type) || cls.is(vm.ctx.types.bytes_type) {
+        Ok(obj)
+    } else {
+        Err(vm.new_type_error("AST string must be of type str"))
+    }
+}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Error message does not reflect that bytes is also accepted.

The function accepts both str and bytes types (line 342), but the error message on line 345 only mentions str. This could confuse users who encounter the error after passing a different type.

Suggested fix

} else { - Err(vm.new_type_error("AST string must be of type str")) + Err(vm.new_type_error("AST string must be of type str or bytes")) }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

pub(super) fn node_object_to_ast_string(

vm: &VirtualMachine,

obj: PyObjectRef,

) -> PyResult<PyObjectRef> {

let cls = obj.class();

if cls.is(vm.ctx.types.str_type) || cls.is(vm.ctx.types.bytes_type) {

Ok(obj)

} else {

Err(vm.new_type_error("AST string must be of type str"))

}

}

pub(super) fn node_object_to_ast_string(

vm: &VirtualMachine,

obj: PyObjectRef,

) -> PyResult<PyObjectRef> {

let cls = obj.class();

if cls.is(vm.ctx.types.str_type) || cls.is(vm.ctx.types.bytes_type) {

Ok(obj)

} else {

Err(vm.new_type_error("AST string must be of type str or bytes"))

}

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/vm/src/stdlib/_ast.rs` around lines 337 - 347, The error message in the node_object_to_ast_string function on line 345 only mentions str as an acceptable type, but the function actually accepts both str and bytes as shown in the type check on line 342. Update the error message passed to vm.new_type_error to accurately reflect that both str and bytes types are accepted, so users receive an accurate description of what types are allowed.

youknowone · 2026-06-22T13:36:52Z

        // Invalidate negative cache
        increment_invalidation_counter();

+        if let Some(cls_type) = cls.downcast_ref::<PyType>()


check if this have CPython reference

youknowone · 2026-06-22T13:37:36Z

 rustpython-wtf8 = { path = "crates/wtf8", version = "0.5.0" }
 rustpython-doc = { path = "crates/doc", version = "0.5.0" }

-# Use RustPython-packaged Ruff crates from the published fork while keeping


minimize diff and keep rustpython-ruff_python_parser as commented

youknowone · 2026-06-22T13:38:09Z

@@ -65,8 +65,8 @@ def fib(n):
        // this line also automatically prints the output
        // (note that this is only the case when compiler::Mode::Single is passed to vm.compile)
        match vm
-            .compile(&input, vm::compiler::Mode::Single, "<embedded>")
-            .map_err(|err| vm.new_syntax_error(&err, Some(&input)))
+            .compile(&input, vm::compiler::Mode::Single, "<embedded>".to_owned())


why redundant .to_owned() is added?

youknowone · 2026-06-22T13:39:39Z

-                syntax_error.as_object(), self, unwrap,
-                "_metadata" => metadata,
-            );
+            syntax_error


keep set_attrs! usage

youknowone · 2026-06-22T13:42:57Z

+    }
+}
+
+impl<T: Node> Node for ThinVec<T> {


if we added ThinVec support, does we still need Vec support?

youknowone · 2026-06-22T13:46:43Z

 impl Transformer for AstPreprocessor {
+    fn visit_stmt(&self, stmt: &mut ast::Stmt) {
+        match stmt {
+            ast::Stmt::FunctionDef(function) => {


do we really need to full-implement this match arms again?

Update the compiler pipeline, AST module support, and related VM compile/eval paths toward CPython 3.14 behavior. Use upstream Ruff parser crates directly and keep RustPython-specific syntax preflight handling in the compile path. Refresh opcode metadata, snapshots, and targeted tests for the aligned bytecode and AST behavior.

youknowone changed the title ~~Bytecode parity~~ Align compiler and AST behavior with CPython Jun 22, 2026

youknowone force-pushed the bytecode-parity branch from 3946343 to e90500a Compare June 22, 2026 07:41

bschoenmaeckers reviewed Jun 22, 2026

View reviewed changes

Comment thread crates/capi/src/ceval.rs Outdated

vm.compile(code, mode, filename)

.map_err(|err| vm.new_syntax_error(&err, Some(code)))

#[unsafe(no_mangle)]

pub unsafe extern "C" fn Py_CompileStringExFlags(

This comment was marked as resolved.

Sign in to view

youknowone force-pushed the bytecode-parity branch 2 times, most recently from 1ecac1d to c4bd2c9 Compare June 22, 2026 09:35

youknowone marked this pull request as ready for review June 22, 2026 09:53

coderabbitai Bot reviewed Jun 22, 2026

View reviewed changes

youknowone force-pushed the bytecode-parity branch 3 times, most recently from 754a4a4 to cfc9db9 Compare June 22, 2026 13:27

youknowone requested a review from bschoenmaeckers June 22, 2026 13:29

youknowone marked this pull request as draft June 22, 2026 13:38

youknowone force-pushed the bytecode-parity branch from cfc9db9 to 5235b03 Compare June 22, 2026 14:00

youknowone commented Jun 22, 2026

View reviewed changes

youknowone force-pushed the bytecode-parity branch 3 times, most recently from 3324654 to ba0113c Compare June 22, 2026 14:41

youknowone force-pushed the bytecode-parity branch from ba0113c to af789fa Compare June 22, 2026 20:53

youknowone added 2 commits June 23, 2026 13:39

fix

308df73

Fix PR 8138 CI failures

4abf27b

Conversation

youknowone commented Jun 22, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

github-actions Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📦 Library Dependencies

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

This comment was marked as resolved.

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 22, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

youknowone commented Jun 22, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 22, 2026 •

edited

Loading

github-actions Bot commented Jun 22, 2026 •

edited

Loading