Skip to content

Autogen opcodes metadata#7983

Open
ShaharNaveh wants to merge 4 commits into
RustPython:mainfrom
ShaharNaveh:autogen-opcodes-metadata
Open

Autogen opcodes metadata#7983
ShaharNaveh wants to merge 4 commits into
RustPython:mainfrom
ShaharNaveh:autogen-opcodes-metadata

Conversation

@ShaharNaveh
Copy link
Copy Markdown
Contributor

@ShaharNaveh ShaharNaveh commented May 26, 2026

Inspired by #7797 (comment)

Summary by CodeRabbit

  • Chores
    • Restructured opcode metadata generation system with improved build pipeline integration
    • Separated metadata generation into distinct development stages for better workflow efficiency
    • Updated pre-commit hook configuration for enhanced code generation processes
    • Adjusted compiler recursion limit settings

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 26, 2026

📝 Walkthrough

Walkthrough

This PR refactors RustPython's opcode metadata generation infrastructure. The changes migrate from a monolithic legacy approach (single generate.py script reading opcode.toml) to a modular, CPython-aware system. Macro-driven Rust instruction definitions now generate core behavior; separate Python tools produce both Rust and Python opcode metadata. CI and pre-commit workflows are updated to support the new generation pathways.

Changes

Opcode Metadata Generation Refactor

Layer / File(s) Summary
Utility Foundations and Configuration
tools/opcode_metadata/utils.py, tools/opcode_metadata/conf.toml
Stack-effect data structures, case conversion helpers (to_pascal_case, to_upper_snake_case), TOML configuration loading, and Rust enum body extraction utilities.
CPython Opcode Analysis Bridge
tools/opcode_metadata/cpython.py
Dynamically loads CPython's opcode analyzer via CPYTHON_ROOT environment variable, merges instruction and pseudo-instruction analysis, and provides unified get_analysis() entrypoint.
Rust Opcode Definition Parsing
tools/opcode_metadata/opcodes.py
Parses define_opcodes!(…) macro blocks from Rust source, extracts opcode metadata, matches against CPython analysis data, applies override configuration, and computes deoptimization targets via OpcodeInfo and Opcode dataclasses.
Rust Opcode Metadata Generation
tools/opcode_metadata/generate_rs_opcode_metadata.py
Reads parsed opcode definitions, generates Rust impl blocks with metadata accessors (numeric conversions, cache entries, flag predicates, stack-effect computation, and variant conversions), formats via rustfmt, and writes to crates/compiler-core/src/bytecode/opcode_metadata.rs.
Generated Opcode Metadata Output
crates/compiler-core/src/bytecode/opcode_metadata.rs, .gitattributes
Generated file containing Opcode and PseudoOpcode metadata methods: as_u8/as_u16, cache_entries, deopt, predicate helpers (has_arg, has_const, has_free, has_jump, has_local, has_name, is_instrumented), stack-effect computation, and variant conversions (to_base, to_instrumented, try_from_u8/try_from_u16). Marked as generated in .gitattributes.
Python Opcode Metadata Generation
tools/opcode_metadata/generate_py_opcode_metadata.py
Generates Lib/_opcode_metadata.py for CPython 3.14 compatibility by aggregating opcode definitions, building _specializations mapping, computing HAVE_ARGUMENT and MIN_INSTRUMENTED_OPCODE constants, and emitting prioritized opmap.
Instruction Module Macro Refactoring
crates/compiler-core/src/bytecode/instruction.rs
Refactors to use define_opcodes! macro for generating Opcode and PseudoOpcode enums and their conversion/predicate methods. Explicitly implements PseudoOpcode::is_scope_exit (returns false) and is_unconditional_jump (restricted to Jump/JumpNoInterrupt). Updates AnyInstruction and AnyOpcode delegation via either_real_pseudo! to use macro-generated methods. Adds Eq and PartialEq to AnyOpcode trait derives.
Bytecode Module Reorganization
crates/compiler-core/src/bytecode.rs, crates/compiler-core/src/bytecode/oparg.rs
Re-exports Instruction/Opcode/PseudoInstruction/PseudoOpcode from instruction submodule instead of separate instructions module. Declares opcode_metadata module instead of instructions. Updates oparg.rs to import Instruction from crate::bytecode directly.
Rust Compiler Configuration
crates/compiler-core/src/lib.rs
Adds #![recursion_limit = "256"] crate attribute to support macro expansion.
CI and Pre-commit Workflow Updates
.github/workflows/ci.yaml, .pre-commit-config.yaml
ci.yaml: installs prek separately, reads CPython version from .python-version, clones the matching CPython tag, and runs prek for manual hooks with CPYTHON_ROOT set. Adds explicit persist-credentials: false to checkout. .pre-commit-config.yaml: replaces single generate-opcode-metadata hook with two separate hooks (generate-rs-opcode-metadata and generate-py-opcode-metadata), each pointing to its dedicated generator script with proper serial execution and ordering.
Legacy Code Removal
Deleted: crates/compiler-core/generate.py, crates/compiler-core/opcode.toml, scripts/generate_opcode_metadata.py
Removes obsolete files: generate.py (old Rust generation with OpcodeGen and InstructionGen classes), opcode.toml (old metadata definitions), and scripts/generate_opcode_metadata.py (old Python generation).

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

  • RustPython/RustPython#7797: Earlier PR in the opcode/instruction autogeneration refactor that updates the compiler-core model and generation inputs.
  • RustPython/RustPython#7573: Overlapping changes to instruction.rs macro generation and bytecode module re-exports.
  • RustPython/RustPython#7757: Related changes to AnyInstruction/PseudoInstruction helper methods and block/pseudo-opcode identification logic.

Suggested Reviewers

  • youknowone
  • fanninpm

🐰 The rabbits hopped through fields of code,
Where opcodes once lived, now macros flow,
Generated metadata shines so bright,
Separate paths for Python and for Rust—
A refactored garden, unified and true!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title "Autogen opcodes metadata" directly describes the main change in the changeset: the PR implements autogenerated opcode metadata across the repository. The title is concise and accurately represents the primary objective.
Docstring Coverage ✅ Passed Docstring coverage is 83.78% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] lib: cpython/Lib/opcode.py
[ ] lib: cpython/Lib/_opcode_metadata.py
[x] test: cpython/Lib/test/test__opcode.py
[x] test: cpython/Lib/test/test_opcodes.py

dependencies:

  • opcode (native: _opcode, builtins)
    • _opcode_metadata
    • _opcode_metadata

dependent tests: (47 tests)

  • opcode: test__opcode test_code test_compile test_dis test_peepholer
    • dis: test_ast test_compiler_assemble test_dtrace test_fstring test_inspect test_monitoring test_opcache test_patma test_positional_only_arg test_type_cache
      • bdb: test_bdb
      • inspect: test_abc test_argparse test_asyncgen test_buffer test_builtin test_clinic test_collections test_coroutines test_decimal test_enum test_functools test_generators test_grammar test_ntpath test_operator test_posixpath test_pydoc test_signal test_sqlite3 test_traceback test_type_annotations test_types test_typing test_unittest test_yield_from test_zipimport test_zipimport_support test_zoneinfo
      • modulefinder: test_importlib test_modulefinder
      • trace: test_trace

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@ShaharNaveh ShaharNaveh marked this pull request as ready for review May 26, 2026 15:18
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (2)
tools/opcode_metadata/cpython.py (1)

5-8: ⚡ Quick win

Chain the exception in the except KeyError path.

Use raise ... from err to preserve the original failure context and satisfy B904.

Suggested fix
-try:
-    CPYTHON_ROOT = pathlib.Path(os.environ["CPYTHON_ROOT"]).expanduser().resolve()
-except KeyError:
-    raise ValueError("Missing environment variable 'CPYTHON_ROOT'")
+try:
+    CPYTHON_ROOT = pathlib.Path(os.environ["CPYTHON_ROOT"]).expanduser().resolve()
+except KeyError as err:
+    raise ValueError("Missing environment variable 'CPYTHON_ROOT'") from err

As per coding guidelines, "Use ruff for linting Python code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/cpython.py` around lines 5 - 8, The except KeyError
block that raises ValueError for missing CPYTHON_ROOT should chain the original
KeyError so the original context is preserved; modify the except clause to
capture the exception (e.g., except KeyError as err) and re-raise the ValueError
using "raise ValueError(\"Missing environment variable 'CPYTHON_ROOT'\") from
err" so the CPYTHON_ROOT resolution code
(pathlib.Path(os.environ["CPYTHON_ROOT"]).expanduser().resolve()) preserves the
original error context for lint B904.
tools/opcode_metadata/opcodes.py (1)

95-98: ⚡ Quick win

Set stacklevel on warnings.warn for correct caller attribution.

This warning should point to the caller site, not inside helper internals.

Suggested fix
-            warnings.warn(
+            warnings.warn(
                 f"Could not get instruction metadata for {rust_name}"
-                " from CPython or override conf"
+                " from CPython or override conf",
+                stacklevel=2,
             )

As per coding guidelines, "Use ruff for linting Python code".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/opcodes.py` around lines 95 - 98, The warnings.warn
call that emits "Could not get instruction metadata for {rust_name} from CPython
or override conf" should include a stacklevel so the warning points at the
caller, not this helper; update the warnings.warn call to pass stacklevel=2
(e.g., warnings.warn(..., stacklevel=2)) referencing the existing rust_name
variable and the warnings.warn invocation so linting (ruff) and caller
attribution are correct.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.pre-commit-config.yaml:
- Around line 46-47: The hook `files` regex currently uses
`tools/opcode_metadata/*` which doesn't match files like
`tools/opcode_metadata/opcodes.py`; update both occurrences of the `files`
pattern in the pre-commit config (the lines with `files:` that currently contain
`tools/opcode_metadata/*`) to use `tools/opcode_metadata/.*` so the
opcode-metadata hooks trigger on changes under that directory (keep
`pass_filenames: false` as-is).

In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 213-233: The macro arm that detects label fields only matches bare
"Label" but opcode variants use "oparg::Label", so add explicit macro arms to
match oparg::Label (mirror the existing Label arms) — e.g. add (`@match`
$self:expr, $name:ident, [$variant:ident { $field:ident : oparg::Label } ,
$($rest:tt)*]) and the final-arm variant (`@match` $self:expr, $name:ident,
[$variant:ident { $field:ident : oparg::Label }]) with the same bodies that
return Some(*$field); this ensures define_opcodes! macro recognizes oparg::Label
fields and preserves CodeObject::label_targets().

In `@tools/opcode_metadata/cpython.py`:
- Around line 17-18: The module currently imports SKIP_PROPERTIES, Analysis,
Family, Properties, analyze_files and get_stack_effect only for downstream
re-export but lint flags them as unused; make this explicit by adding an __all__
=
("SKIP_PROPERTIES","Analysis","Family","Properties","analyze_files","get_stack_effect")
at top (or alternatively append targeted "# noqa: F401" to those import lines)
so the re-exports are intentional and ruff/CI no longer reports F401. Ensure the
symbol names match the imported identifiers exactly (SKIP_PROPERTIES, Analysis,
Family, Properties, analyze_files, get_stack_effect).

In `@tools/opcode_metadata/generate_py_opcode_metadata.py`:
- Around line 12-15: Remove the unused imports that trigger F401 by deleting
"typing" from the top-level imports and removing "to_pascal_case" from the
from-utils import list in generate_py_opcode_metadata.py; update the import line
to only keep used symbols (OpcodeInfo, DEFAULT_INPUT, ROOT, get_conf,
to_upper_snake_case) so the module passes ruff/flake8 linting.
- Around line 84-85: Replace the lambda assigned to key with a small local def
to satisfy E731: create a function (e.g., def sort_key(op): return
(op.cpython_name not in PRIORITY_OPMAP, op.id)) and pass sort_key into
sorted(opcodes, key=sort_key) instead of using key = lambda ...; keep references
to PRIORITY_OPMAP and op.id/op.cpython_name unchanged.

In `@tools/opcode_metadata/generate_rs_opcode_metadata.py`:
- Line 23: The name OpcodeDef used as a type annotation is undefined; either
import the correct type that represents entries returned by
OpcodeInfo.iter_infos (e.g., add an import for OpcodeDef from the module that
defines it) or change the annotation to the actual type produced by
OpcodeInfo.iter_infos (for example Iterator[OpcodeInfo] or list[OpcodeInfo]) and
add the necessary typing imports (from typing import Iterator, List) so
ruff/F821 stops complaining; update the annotation on the variable/parameter
named info (or wherever OpcodeDef is used) accordingly.
- Around line 4-16: Remove the unused imports in the top import block: delete
"collections", "os", "sys", "typing", "tomllib" (if not used elsewhere), and the
unused names from the cpython from-import ("Analysis", "get_analysis",
"get_stack_effect"), and remove "to_pascal_case" from the utils import; keep
only the actually used symbols (e.g., dataclasses, io, pathlib, subprocess,
OpcodeInfo, DEFAULT_INPUT, ROOT, get_conf). Update the import lines in
generate_rs_opcode_metadata.py accordingly and run ruff to verify no F401s
remain.

In `@tools/opcode_metadata/opcodes.py`:
- Around line 35-37: The comparison that skips the family root uses raw
member.name against the normalized family_name, causing mismatches; update the
filter to compare the normalized member_name (the result of
to_pascal_case(member.name)) to family_name (or normalize family_name with
to_pascal_case) so the family opcode is correctly excluded—adjust the logic
around member_name, member.name, family_name and the call to to_pascal_case in
opcodes.py.

In `@tools/opcode_metadata/utils.py`:
- Line 25: Replace the PEP 695 type-alias statement so it parses on Python 3.11:
change the declaration "type OverrideConfs = dict[str, Override]" to an
assignment-style alias "OverrideConfs = dict[str, Override]" (keep the same name
OverrideConfs and the referenced Override type), ensuring any necessary imports
for Override remain unchanged.

---

Nitpick comments:
In `@tools/opcode_metadata/cpython.py`:
- Around line 5-8: The except KeyError block that raises ValueError for missing
CPYTHON_ROOT should chain the original KeyError so the original context is
preserved; modify the except clause to capture the exception (e.g., except
KeyError as err) and re-raise the ValueError using "raise ValueError(\"Missing
environment variable 'CPYTHON_ROOT'\") from err" so the CPYTHON_ROOT resolution
code (pathlib.Path(os.environ["CPYTHON_ROOT"]).expanduser().resolve()) preserves
the original error context for lint B904.

In `@tools/opcode_metadata/opcodes.py`:
- Around line 95-98: The warnings.warn call that emits "Could not get
instruction metadata for {rust_name} from CPython or override conf" should
include a stacklevel so the warning points at the caller, not this helper;
update the warnings.warn call to pass stacklevel=2 (e.g., warnings.warn(...,
stacklevel=2)) referencing the existing rust_name variable and the warnings.warn
invocation so linting (ruff) and caller attribution are correct.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 3d449e36-7fc4-4c30-a9b0-afbda43241c9

📥 Commits

Reviewing files that changed from the base of the PR and between dcb273b and f9a754c.

⛔ Files ignored due to path filters (1)
  • Lib/_opcode_metadata.py is excluded by !Lib/**
📒 Files selected for processing (18)
  • .gitattributes
  • .github/workflows/ci.yaml
  • .pre-commit-config.yaml
  • crates/compiler-core/generate.py
  • crates/compiler-core/opcode.toml
  • crates/compiler-core/src/bytecode.rs
  • crates/compiler-core/src/bytecode/instruction.rs
  • crates/compiler-core/src/bytecode/instructions.rs
  • crates/compiler-core/src/bytecode/oparg.rs
  • crates/compiler-core/src/bytecode/opcode_metadata.rs
  • crates/compiler-core/src/lib.rs
  • scripts/generate_opcode_metadata.py
  • tools/opcode_metadata/conf.toml
  • tools/opcode_metadata/cpython.py
  • tools/opcode_metadata/generate_py_opcode_metadata.py
  • tools/opcode_metadata/generate_rs_opcode_metadata.py
  • tools/opcode_metadata/opcodes.py
  • tools/opcode_metadata/utils.py
💤 Files with no reviewable changes (3)
  • scripts/generate_opcode_metadata.py
  • crates/compiler-core/opcode.toml
  • crates/compiler-core/generate.py

Comment thread .pre-commit-config.yaml
Comment on lines +46 to 47
files: '^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$'
pass_filenames: false
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

python - <<'PY'
import re
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$')
samples = [
    "crates/compiler-core/src/bytecode/instruction.rs",
    "tools/opcode_metadata/opcodes.py",
    "tools/opcode_metadata/utils.py",
]
print("Sample matches:")
for s in samples:
    print(f"{s}: {bool(pat.search(s))}")
PY

python - <<'PY'
import re, subprocess
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$')
files = subprocess.check_output(["git", "ls-files"], text=True).splitlines()
matches = [f for f in files if pat.search(f)]
print(f"repo_match_count={len(matches)}")
for f in matches[:20]:
    print(f)
PY

Repository: RustPython/RustPython

Length of output: 282


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the exact current lines around the two reported occurrences
sed -n '35,70p' .pre-commit-config.yaml | nl -ba | sed -n '30,55p'

python3 - <<'PY'
import re, subprocess
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$')
files = subprocess.check_output(["git", "ls-files"], text=True).splitlines()
matches = [f for f in files if pat.search(f)]
print(f"repo_match_count={len(matches)}")
for f in matches:
    print(f)
PY

python3 - <<'PY'
import re, subprocess
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/.*)$')
files = subprocess.check_output(["git", "ls-files"], text=True).splitlines()
matches = [f for f in files if pat.search(f)]
print(f"repo_match_count_with_fixed_regex={len(matches)}")
for f in matches[:30]:
    print(f)
PY

Repository: RustPython/RustPython

Length of output: 108


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Display relevant section with line numbers (cat -n is widely available)
cat -n .pre-commit-config.yaml | sed -n '35,70p'

python3 - <<'PY'
import re, subprocess
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$')
files = subprocess.check_output(["git", "ls-files"], text=True).splitlines()
matches = [f for f in files if pat.search(f)]
print("repo_match_count=", len(matches))
for f in matches:
    print(f)
PY

python3 - <<'PY'
import re, subprocess
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/.*)$')
files = subprocess.check_output(["git", "ls-files"], text=True).splitlines()
matches = [f for f in files if pat.search(f)]
print("repo_match_count_with_fixed_regex=", len(matches))
for f in matches[:50]:
    print(f)
PY

Repository: RustPython/RustPython

Length of output: 1850


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Display relevant section with line numbers (cat -n is widely available)
cat -n .pre-commit-config.yaml | sed -n '35,70p'

python3 - <<'PY'
import re, subprocess
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$')
files = subprocess.check_output(["git", "ls-files"], text=True).splitlines()
matches = [f for f in files if pat.search(f)]
print("repo_match_count=", len(matches))
for f in matches:
    print(f)
PY

python3 - <<'PY'
import re, subprocess
pat = re.compile(r'^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/.*)$')
files = subprocess.check_output(["git", "ls-files"], text=True).splitlines()
matches = [f for f in files if pat.search(f)]
print("repo_match_count_with_fixed_regex=", len(matches))
for f in matches[:50]:
    print(f)
PY

Repository: RustPython/RustPython

Length of output: 1850


Fix files regex so opcode-metadata hooks trigger on tools/opcode_metadata/* changes

The current regex tools/opcode_metadata/* doesn’t match real files like tools/opcode_metadata/opcodes.py, so both manual hooks only match crates/compiler-core/src/bytecode/instruction.rs. Update both hook files patterns to tools/opcode_metadata/.*.

💡 Proposed fix
-        files: '^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$'
+        files: '^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/.*)$'
...
-        files: '^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/*)$'
+        files: '^(crates/compiler-core/src/bytecode/instruction\.rs|tools/opcode_metadata/.*)$'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.pre-commit-config.yaml around lines 46 - 47, The hook `files` regex
currently uses `tools/opcode_metadata/*` which doesn't match files like
`tools/opcode_metadata/opcodes.py`; update both occurrences of the `files`
pattern in the pre-commit config (the lines with `files:` that currently contain
`tools/opcode_metadata/*`) to use `tools/opcode_metadata/.*` so the
opcode-metadata hooks trigger on changes under that directory (keep
`pass_filenames: false` as-is).

Comment on lines +213 to +233
(@match $self:expr, $name:ident, [$variant:ident { $field:ident : Label } , $($rest:tt)*]) => {
match $self {
$name::$variant { $field } => Some(*$field),
other => define_opcodes!(@match other, $name, [$($rest)*]),
}
};

// Label field variant (last in list)
(@match $self:expr, $name:ident, [$variant:ident { $field:ident : Label }]) => {
match $self {
$name::$variant { $field } => Some(*$field),
other => define_opcodes!(@match other, $name, []),
}
};

// Non-Label field variant (with trailing variants)
(@match $self:expr, $name:ident, [$variant:ident { $field:ident : $type:ty } , $($rest:tt)*]) => {
match $self {
$name::$variant { .. } => None,
other => define_opcodes!(@match other, $name, [$($rest)*]),
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

label_arg() macro arm does not match oparg::Label typed fields.

The label-detection arm matches ... : Label, but opcode declarations pass oparg::Label. Those variants fall into the non-label arm and return None, which can break CodeObject::label_targets().

Targeted fix
-    (`@match` $self:expr, $name:ident, [$variant:ident { $field:ident : Label } , $($rest:tt)*]) => {
+    (`@match` $self:expr, $name:ident, [$variant:ident { $field:ident : oparg::Label } , $($rest:tt)*]) => {
         match $self {
             $name::$variant { $field } => Some(*$field),
             other => define_opcodes!(`@match` other, $name, [$($rest)*]),
         }
     };
@@
-    (`@match` $self:expr, $name:ident, [$variant:ident { $field:ident : Label }]) => {
+    (`@match` $self:expr, $name:ident, [$variant:ident { $field:ident : oparg::Label }]) => {
         match $self {
             $name::$variant { $field } => Some(*$field),
             other => define_opcodes!(`@match` other, $name, []),
         }
     };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
(@match $self:expr, $name:ident, [$variant:ident { $field:ident : Label } , $($rest:tt)*]) => {
match $self {
$name::$variant { $field } => Some(*$field),
other => define_opcodes!(@match other, $name, [$($rest)*]),
}
};
// Label field variant (last in list)
(@match $self:expr, $name:ident, [$variant:ident { $field:ident : Label }]) => {
match $self {
$name::$variant { $field } => Some(*$field),
other => define_opcodes!(@match other, $name, []),
}
};
// Non-Label field variant (with trailing variants)
(@match $self:expr, $name:ident, [$variant:ident { $field:ident : $type:ty } , $($rest:tt)*]) => {
match $self {
$name::$variant { .. } => None,
other => define_opcodes!(@match other, $name, [$($rest)*]),
}
(`@match` $self:expr, $name:ident, [$variant:ident { $field:ident : oparg::Label } , $($rest:tt)*]) => {
match $self {
$name::$variant { $field } => Some(*$field),
other => define_opcodes!(`@match` other, $name, [$($rest)*]),
}
};
// Label field variant (last in list)
(`@match` $self:expr, $name:ident, [$variant:ident { $field:ident : oparg::Label }]) => {
match $self {
$name::$variant { $field } => Some(*$field),
other => define_opcodes!(`@match` other, $name, []),
}
};
// Non-Label field variant (with trailing variants)
(`@match` $self:expr, $name:ident, [$variant:ident { $field:ident : $type:ty } , $($rest:tt)*]) => {
match $self {
$name::$variant { .. } => None,
other => define_opcodes!(`@match` other, $name, [$($rest)*]),
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 213 - 233, The
macro arm that detects label fields only matches bare "Label" but opcode
variants use "oparg::Label", so add explicit macro arms to match oparg::Label
(mirror the existing Label arms) — e.g. add (`@match` $self:expr, $name:ident,
[$variant:ident { $field:ident : oparg::Label } , $($rest:tt)*]) and the
final-arm variant (`@match` $self:expr, $name:ident, [$variant:ident {
$field:ident : oparg::Label }]) with the same bodies that return Some(*$field);
this ensures define_opcodes! macro recognizes oparg::Label fields and preserves
CodeObject::label_targets().

Comment on lines +17 to +18
from analyzer import SKIP_PROPERTIES, Analysis, Family, Properties, analyze_files
from stack import get_stack_effect
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make intentional re-exports explicit to clear F401 lint failures.

These names are imported for downstream use (opcodes.py), but lint treats them as unused. Add __all__ (or targeted # noqa: F401) so the module intent is explicit and CI lint won’t fail.

Suggested fix
 from analyzer import SKIP_PROPERTIES, Analysis, Family, Properties, analyze_files
 from stack import get_stack_effect
+
+__all__ = [
+    "SKIP_PROPERTIES",
+    "Analysis",
+    "Family",
+    "Properties",
+    "analyze_files",
+    "get_stack_effect",
+    "get_analysis",
+]

As per coding guidelines, "Use ruff for linting Python code".

🧰 Tools
🪛 Flake8 (7.3.0)

[error] 17-17: 'analyzer.SKIP_PROPERTIES' imported but unused

(F401)


[error] 17-17: 'analyzer.Family' imported but unused

(F401)


[error] 17-17: 'analyzer.Properties' imported but unused

(F401)


[error] 18-18: 'stack.get_stack_effect' imported but unused

(F401)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/cpython.py` around lines 17 - 18, The module currently
imports SKIP_PROPERTIES, Analysis, Family, Properties, analyze_files and
get_stack_effect only for downstream re-export but lint flags them as unused;
make this explicit by adding an __all__ =
("SKIP_PROPERTIES","Analysis","Family","Properties","analyze_files","get_stack_effect")
at top (or alternatively append targeted "# noqa: F401" to those import lines)
so the re-exports are intentional and ruff/CI no longer reports F401. Ensure the
symbol names match the imported identifiers exactly (SKIP_PROPERTIES, Analysis,
Family, Properties, analyze_files, get_stack_effect).

Comment on lines +12 to +15
import typing

from opcodes import OpcodeInfo
from utils import DEFAULT_INPUT, ROOT, get_conf, to_pascal_case, to_upper_snake_case
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove unused imports to satisfy lint checks (Line 12, Line 15).

typing and to_pascal_case are unused and trigger F401. Please drop both imports to keep ruff/flake8 clean.

As per coding guidelines **/*.py: "Follow PEP 8 style for custom Python code" and "Use ruff for linting Python code".

🧰 Tools
🪛 Flake8 (7.3.0)

[error] 12-12: 'typing' imported but unused

(F401)


[error] 15-15: 'utils.to_pascal_case' imported but unused

(F401)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/generate_py_opcode_metadata.py` around lines 12 - 15,
Remove the unused imports that trigger F401 by deleting "typing" from the
top-level imports and removing "to_pascal_case" from the from-utils import list
in generate_py_opcode_metadata.py; update the import line to only keep used
symbols (OpcodeInfo, DEFAULT_INPUT, ROOT, get_conf, to_upper_snake_case) so the
module passes ruff/flake8 linting.

Comment on lines +84 to +85
key = lambda op: (op.cpython_name not in PRIORITY_OPMAP, op.id)
for opcode in sorted(opcodes, key=key):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace the lambda sort key with a named function (E731)

Ruff reports E731 Do not assign a lambda expression, use a def for key = lambda .... Replace it with a small local def.

Suggested fix
-    key = lambda op: (op.cpython_name not in PRIORITY_OPMAP, op.id)
-    for opcode in sorted(opcodes, key=key):
+    def sort_key(opcode):
+        return (opcode.cpython_name not in PRIORITY_OPMAP, opcode.id)
+
+    for opcode in sorted(opcodes, key=sort_key):
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 84-84: do not assign a lambda expression, use a def

(E731)

🪛 Ruff (0.15.14)

[error] 84-84: Do not assign a lambda expression, use a def

Rewrite key as a def

(E731)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/generate_py_opcode_metadata.py` around lines 84 - 85,
Replace the lambda assigned to key with a small local def to satisfy E731:
create a function (e.g., def sort_key(op): return (op.cpython_name not in
PRIORITY_OPMAP, op.id)) and pass sort_key into sorted(opcodes, key=sort_key)
instead of using key = lambda ...; keep references to PRIORITY_OPMAP and
op.id/op.cpython_name unchanged.

Comment on lines +4 to +16
import collections
import dataclasses
import io
import os
import pathlib
import subprocess
import sys
import typing

import tomllib
from cpython import Analysis, get_analysis, get_stack_effect
from opcodes import OpcodeInfo
from utils import DEFAULT_INPUT, ROOT, get_conf, to_pascal_case
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove unused imports to keep the generator Ruff-clean.

This block contains multiple unused imports (collections, os, sys, typing, tomllib, Analysis/get_analysis/get_stack_effect, to_pascal_case), which fail lint (F401).

Suggested cleanup
-import collections
 import dataclasses
 import io
-import os
 import pathlib
 import subprocess
-import sys
-import typing
 
-import tomllib
-from cpython import Analysis, get_analysis, get_stack_effect
+from cpython import Analysis, get_analysis, get_stack_effect
 from opcodes import OpcodeInfo
-from utils import DEFAULT_INPUT, ROOT, get_conf, to_pascal_case
+from utils import DEFAULT_INPUT, ROOT, get_conf

As per coding guidelines "**/*.py: Use ruff for linting Python code".

🧰 Tools
🪛 Flake8 (7.3.0)

[error] 4-4: 'collections' imported but unused

(F401)


[error] 7-7: 'os' imported but unused

(F401)


[error] 10-10: 'sys' imported but unused

(F401)


[error] 11-11: 'typing' imported but unused

(F401)


[error] 13-13: 'tomllib' imported but unused

(F401)


[error] 14-14: 'cpython.Analysis' imported but unused

(F401)


[error] 14-14: 'cpython.get_analysis' imported but unused

(F401)


[error] 14-14: 'cpython.get_stack_effect' imported but unused

(F401)


[error] 16-16: 'utils.to_pascal_case' imported but unused

(F401)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/generate_rs_opcode_metadata.py` around lines 4 - 16,
Remove the unused imports in the top import block: delete "collections", "os",
"sys", "typing", "tomllib" (if not used elsewhere), and the unused names from
the cpython from-import ("Analysis", "get_analysis", "get_stack_effect"), and
remove "to_pascal_case" from the utils import; keep only the actually used
symbols (e.g., dataclasses, io, pathlib, subprocess, OpcodeInfo, DEFAULT_INPUT,
ROOT, get_conf). Update the import lines in generate_rs_opcode_metadata.py
accordingly and run ruff to verify no F401s remain.


@dataclasses.dataclass(frozen=True, slots=True)
class OpcodeGen:
info: OpcodeDef
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix unresolved type annotation (OpcodeDef) before merge.

OpcodeDef is undefined in this module, so linting fails (F821). Import the right type or update the annotation to the actual symbol returned by OpcodeInfo.iter_infos(...).

As per coding guidelines "**/*.py: Use ruff for linting Python code".

🧰 Tools
🪛 Flake8 (7.3.0)

[error] 23-23: undefined name 'OpcodeDef'

(F821)

🪛 Ruff (0.15.14)

[error] 23-23: Undefined name OpcodeDef

(F821)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/generate_rs_opcode_metadata.py` at line 23, The name
OpcodeDef used as a type annotation is undefined; either import the correct type
that represents entries returned by OpcodeInfo.iter_infos (e.g., add an import
for OpcodeDef from the module that defines it) or change the annotation to the
actual type produced by OpcodeInfo.iter_infos (for example Iterator[OpcodeInfo]
or list[OpcodeInfo]) and add the necessary typing imports (from typing import
Iterator, List) so ruff/F821 stops complaining; update the annotation on the
variable/parameter named info (or wherever OpcodeDef is used) accordingly.

Comment on lines +35 to +37
member_name = to_pascal_case(member.name)
if member.name == family_name:
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix mismatched-name comparison when filtering family root members.

You normalize both names to PascalCase, but compare raw member.name to normalized family_name. This can incorrectly include the family opcode in its own deopt list.

Suggested fix
-                if member.name == family_name:
+                if member_name == family_name:
                     continue
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/opcodes.py` around lines 35 - 37, The comparison that
skips the family root uses raw member.name against the normalized family_name,
causing mismatches; update the filter to compare the normalized member_name (the
result of to_pascal_case(member.name)) to family_name (or normalize family_name
with to_pascal_case) so the family opcode is correctly excluded—adjust the logic
around member_name, member.name, family_name and the call to to_pascal_case in
opcodes.py.

stack_effect: StackEffect = dataclasses.field(default_factory=StackEffect)


type OverrideConfs = dict[str, Override]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify Python-version expectations used by CI/tooling and where this alias appears.
fd -HI "pyproject.toml|setup.cfg|tox.ini|.python-version|ci.yaml|*.yml|*.yaml" .github workflows . 2>/dev/null | xargs -r rg -n "python-version|requires-python|3\\.(1[01]|12|13)"
rg -n "type\\s+OverrideConfs\\s*=" tools/opcode_metadata/utils.py

Repository: RustPython/RustPython

Length of output: 109


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Python version constraints (best-effort) =="
rg -n --hidden --no-ignore \
  "requires-python|python-version|target-version|minimum-python-version|Programming Language :: Python" \
  . \
  || true

echo
echo "== Ruff target-version (if any) =="
rg -n --hidden --no-ignore \
  "target-version" \
  pyproject.toml setup.cfg tox.ini .ruff.toml ruff.toml 2>/dev/null || true

echo
echo "== Inspect tools/opcode_metadata/utils.py around line 25 =="
python3 - <<'PY'
from pathlib import Path
p = Path("tools/opcode_metadata/utils.py")
if p.exists():
    lines = p.read_text().splitlines()
    for i in range(18, 33):
        if 0 <= i-1 < len(lines):
            print(f"{i}:{lines[i-1]}")
else:
    print("File not found:", p)
PY

Repository: RustPython/RustPython

Length of output: 50378


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== .python-version =="
if [ -f .python-version ]; then cat .python-version; else echo "missing"; fi

echo
echo "== requires-python in packaging configs (limited) =="
rg -n --hidden --no-ignore "requires-python" pyproject.toml setup.cfg tox.ini . 2>/dev/null || true
rg -n --hidden --no-ignore "Programming Language :: Python" pyproject.toml setup.cfg setup.* 2>/dev/null || true

echo
echo "== ruff target-version (limited) =="
rg -n --hidden --no-ignore "target-version" pyproject.toml ruff.toml .ruff.toml setup.cfg tox.ini 2>/dev/null || true

echo
echo "== Inspect tools/opcode_metadata/utils.py lines 1-80 (for context around line 25) =="
nl -ba tools/opcode_metadata/utils.py | sed -n '1,80p'

Repository: RustPython/RustPython

Length of output: 50440


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Ruff config files ==" 
ls -la .ruff.toml ruff.toml pyproject.toml setup.cfg 2>/dev/null || true

echo
echo "== Search ruff target-version =="
rg -n --hidden --no-ignore "target-version|python-version|requires-python" .ruff.toml ruff.toml pyproject.toml setup.cfg 2>/dev/null || true

echo
echo "== Search ruff target-version in workflow (limited) =="
rg -n --hidden --no-ignore "target-version|requires-python|python-version" .github/workflows 2>/dev/null | head -n 50 || true

echo
echo "== tools/opcode_metadata/utils.py around line 25 =="
python3 - <<'PY'
from pathlib import Path
p = Path("tools/opcode_metadata/utils.py")
lines = p.read_text().splitlines()
start, end = 18, 32
for i in range(start, end+1):
    if i-1 < len(lines):
        print(f"{i}:{lines[i-1]}")
PY

Repository: RustPython/RustPython

Length of output: 2817


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== sandbox python3 version =="
python3 --version || true

echo
echo "== ruff.toml (first 120 lines) =="
sed -n '1,120p' ruff.toml

echo
echo "== ruff.toml Python-version related keys =="
rg -n "target-version|python-version|requires-python" ruff.toml || true

echo
echo "== ruff usage in CI workflows (limited context) =="
rg -n "ruff" .github/workflows/ci.yaml .github/workflows/*.yml .github/workflows/*.yaml 2>/dev/null | head -n 80 || true

echo
echo "== python-version in CI workflow where ruff runs (best-effort) =="
rg -n "python-version" .github/workflows/ci.yaml 2>/dev/null | head -n 80 || true

Repository: RustPython/RustPython

Length of output: 1567


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== ruff references in ci.yaml =="
rg -n "ruff" .github/workflows/ci.yaml || true

echo
echo "== setup-python / python-version in ci.yaml (context) =="
rg -n "setup-python|python-version:" .github/workflows/ci.yaml || true

echo
echo "== Lint job section (best-effort) =="
rg -n "lint|format|check" .github/workflows/ci.yaml | head -n 80 || true

echo
echo "== show around first ruff match (if any) =="
ruff_line=$(rg -n "ruff" .github/workflows/ci.yaml | head -n 1 | cut -d: -f1 || true)
if [ -n "${ruff_line:-}" ]; then
  start=$((ruff_line-25)); end=$((ruff_line+25))
  sed -n "${start},${end}p" .github/workflows/ci.yaml
else
  echo "No ruff match found."
fi

Repository: RustPython/RustPython

Length of output: 2502


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== py_compile with sandbox python =="
python3 - <<'PY'
import py_compile, sys
try:
    py_compile.compile("tools/opcode_metadata/utils.py", doraise=True)
    print("py_compile: OK")
except Exception as e:
    print("py_compile: FAIL")
    print(type(e).__name__ + ":", e)
PY

echo
echo "== Inspect CI lint job section in .github/workflows/ci.yaml =="
sed -n '500,620p' .github/workflows/ci.yaml

echo
echo "== Search for ruff usage in workflows =="
rg -n "ruff" .github/workflows || true

echo
echo "== Current ruff version (if installed) =="
ruff --version || true

Repository: RustPython/RustPython

Length of output: 4287


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Locate ruff in .pre-commit-config.yaml =="
rg -n "ruff" .pre-commit-config.yaml || true

echo
echo "== Show ruff-related hook blocks (around matches) =="
rg -n -C 3 "ruff" .pre-commit-config.yaml || true

echo
echo "== Show full .pre-commit-config.yaml (limited) if small =="
wc -l .pre-commit-config.yaml || true

Repository: RustPython/RustPython

Length of output: 721


🏁 Script executed:

#!/bin/bash
set -euo pipefail
ruff check tools/opcode_metadata/utils.py -v || true

Repository: RustPython/RustPython

Length of output: 632


Replace the PEP 695 type alias to avoid a Python 3.11 parse-time SyntaxError.

tools/opcode_metadata/utils.py line 25 uses type OverrideConfs = dict[str, Override] (PEP 695), which fails to parse under Python 3.11 (py_compile reports SyntaxError: invalid syntax). Use the assignment alias form instead:

Suggested fix
- type OverrideConfs = dict[str, Override]
+OverrideConfs = dict[str, Override]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
type OverrideConfs = dict[str, Override]
OverrideConfs = dict[str, Override]
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 25-25: SyntaxError: invalid syntax

(E999)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/opcode_metadata/utils.py` at line 25, Replace the PEP 695 type-alias
statement so it parses on Python 3.11: change the declaration "type
OverrideConfs = dict[str, Override]" to an assignment-style alias "OverrideConfs
= dict[str, Override]" (keep the same name OverrideConfs and the referenced
Override type), ensuring any necessary imports for Override remain unchanged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant