Skip to content

General code nits#8107

Draft
ShaharNaveh wants to merge 5 commits into
RustPython:mainfrom
ShaharNaveh:nits-7
Draft

General code nits#8107
ShaharNaveh wants to merge 5 commits into
RustPython:mainfrom
ShaharNaveh:nits-7

Conversation

@ShaharNaveh

@ShaharNaveh ShaharNaveh commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Summary

Summary by CodeRabbit

  • Bug Fixes

    • Adjusted ASCII-based string lower/upper checks so strings with no ASCII alphabetic characters now match the expected semantics.
  • Performance Improvements

    • Reduced allocations during struct-format and decoding/encoding error construction by reusing message representations more efficiently.
  • Internal Improvements

    • Streamlined codec and error-handler setup, refined compiler feature gating and public error exposure under non-compiler builds.
    • Improved garbage-collection timing behavior on WebAssembly and enhanced GC result/stat value ergonomics (more copy/const behavior).

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: ca4f2a11-ebfb-4147-9c37-327f0e14d7a0

📥 Commits

Reviewing files that changed from the base of the PR and between 2f1287a and 2075c16.

📒 Files selected for processing (1)
  • crates/vm/src/class.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/vm/src/class.rs

📝 Walkthrough

Walkthrough

This PR applies broad code-quality improvements across the RustPython VM crate: eliminates intermediate string allocations in error paths by changing new_struct_error to accept Wtf8Buf, alters py_islower/py_isupper edge-case semantics via iterator .all(...), refactors cformat and codecs control flow, restructures compiler feature gating with compile_error!/cfg_select!, makes GC timing wasm32-safe, and adds Clone/Copy/const fn to GC public types. Widespread import reordering and formatting cleanup is also included.

Changes

VM crate code quality and refactoring

Layer / File(s) Summary
String allocation reduction: new_struct_error and callers
crates/vm/src/buffer.rs, crates/stdlib/src/pystruct.rs, crates/vm/src/builtins/float.rs
new_struct_error signature changes from impl Into<String> to T: Into<Wtf8Buf>; call sites in pystruct.rs and float.rs drop .to_owned() and pass string literals directly.
anystr.rs: py_islower/py_isupper semantic change and single_or_tuple_any refactor
crates/vm/src/anystr.rs
py_islower and py_isupper switch from explicit loops to .all(...) iterator predicates, changing empty/non-alphabetic input from returning false to true. single_or_tuple_any is rewritten from match to if let Ok.
cformat.rs: printf-style formatting control flow refactor
crates/vm/src/cformat.rs
Literal-only, mapping-required, and positional tuple-consuming paths in cformat_bytes and cformat_string are rewritten using explicit early returns, if let Some extraction, unreachable!() guards, and value.clone() at formatting call sites.
codecs.rs: imports, control flow, and derive refactoring
crates/vm/src/codecs.rs
Switches codec imports to rustpython_common; simplifies encoder/decoder arg construction via map_or_else; refactors SurrogatePass, ErrorsHandler::new, and strict_errors from match to if let; adds Copy to StandardEncoding; adjusts PyDecodeData::Deref form; widespread whitespace cleanup.
compiler.rs: feature-gate restructuring and error enum promotion
crates/vm/src/compiler.rs
Replaces runtime panic! with compile_error! for invalid feature combos; consolidates re-exports under cfg_select!; promotes CompileErrorType/CompileError from private mod error to top-level public enums, still gated by #[cfg(not(feature = "compiler"))].
gc_state.rs: wasm32-safe timing, Clone/Copy derives, and const fn
crates/vm/src/gc_state.rs
Removes wasm32 elapsed_secs stub; all duration sites use cfg_select! (0.0 on wasm32). CollectResult and GcStats gain Clone/Copy derives. GcState::new becomes pub const fn.
Formatting, import reordering, and minor cosmetic changes
crates/vm/src/byte.rs, crates/vm/src/suggestion.rs, crates/vm/src/convert/try_from.rs, crates/vm/src/class.rs, crates/vm/src/recursion.rs, crates/vm/src/builtins/staticmethod.rs
Import consolidation in byte.rs and suggestion.rs; intra-doc link in try_from.rs; class.rs reorders PyClassImpl items and switches to debug_assert!; recursion.rs updates docs and uses Self; staticmethod.rs whitespace addition.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • RustPython/RustPython#5858: Touches crates/vm/src/builtins/float.rs with the same pattern of passing string literals directly to error constructors instead of allocating owned Strings.

Suggested reviewers

  • youknowone

Poem

🐇 Hop, hop! No more .to_owned() today,
The rabbit clips strings that got in the way.
cfg_select! guides us through wasm and back,
.all(...) now rules the lowercase track.
Early returns bloom where match used to nest—
Clean code is the warren the rabbit loves best! 🌿

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.76% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'General code nits' is vague and generic, using non-descriptive language that doesn't convey specific information about the changeset despite spanning 14 modified files with varying types of improvements. Replace with a more specific title that highlights the primary refactoring theme, such as 'Refactor error handling and optimize string allocations across stdlib and vm modules' or similar.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/vm/src/anystr.rs`:
- Around line 414-419: The py_islower and py_isupper methods are using
filter().all() which returns vacuously true when no ASCII alphabetic characters
are present, causing empty strings and digit-only strings to incorrectly return
true. In both functions (py_islower at lines 414-419 and py_isupper at lines
423-428), you need to ensure at least one ASCII alphabetic character exists
before checking the case condition. Modify the logic to use both an any() check
to verify at least one ASCII alphabetic byte is present AND the all() check to
ensure all such bytes match the required case, combining them with AND logic so
both conditions must be true for the function to return true.

In `@crates/vm/src/cformat.rs`:
- Around line 361-379: The guard condition in the bytes formatting branch at
crates/vm/src/cformat.rs#L361-L379 currently uses OR logic which makes the
literal-only path unreachable. The condition should reject only when the value
is neither a mapping nor an empty tuple. Change the boolean operator from || to
&& in the condition checking !is_mapping and the tuple predicate to properly
accept mapping objects or empty tuples while rejecting other values. Apply the
identical fix to the string formatting branch at
crates/vm/src/cformat.rs#L458-L476, changing the same OR operator to AND in the
corresponding guard condition.

In `@crates/vm/src/compiler.rs`:
- Around line 1-2: The ToPyException impl blocks at lines 41-52 have feature
gates that don't match the VM method gates they depend on. Locate the two
ToPyException impl blocks that call the methods new_syntax_error and
new_syntax_error_maybe_incomplete, and change their cfg attribute from
any(feature = "parser", feature = "codegen") to any(feature = "parser", feature
= "compiler") to match the gate on those called VM methods, preventing linking
failures when building with --features=codegen alone.
- Around line 24-27: The module paths in the error enum variant definitions are
stale and no longer valid. The Codegen variant references
super::codegen::error::CodegenErrorType which should be replaced with the
absolute path rustpython_codegen::error::CodegenErrorType. The Parse variant
references super::parser::ParseErrorType which should be replaced with the local
parser alias that is already defined on line 12, changing it to simply
parser::ParseErrorType.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: fa25c5e2-0970-4999-807c-f9e62d8ebbfc

📥 Commits

Reviewing files that changed from the base of the PR and between dbfd69e and 2f1287a.

📒 Files selected for processing (14)
  • crates/stdlib/src/pystruct.rs
  • crates/vm/src/anystr.rs
  • crates/vm/src/buffer.rs
  • crates/vm/src/builtins/float.rs
  • crates/vm/src/builtins/staticmethod.rs
  • crates/vm/src/byte.rs
  • crates/vm/src/cformat.rs
  • crates/vm/src/class.rs
  • crates/vm/src/codecs.rs
  • crates/vm/src/compiler.rs
  • crates/vm/src/convert/try_from.rs
  • crates/vm/src/gc_state.rs
  • crates/vm/src/recursion.rs
  • crates/vm/src/suggestion.rs

Comment thread crates/vm/src/anystr.rs
Comment on lines +414 to 419
self.as_bytes()
.iter()
.copied()
.filter(u8::is_ascii_alphabetic)
{
if byte.is_ascii_uppercase() {
return false;
}
lower = true;
}
lower
.all(|byte| byte.is_ascii_lowercase())
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restore the “at least one ASCII cased letter” requirement in py_islower/py_isupper.

Line 414 and Line 423 now use filter(...).all(...); on inputs with no ASCII alphabetic bytes, all() is vacuously true, so these functions return true for empty/digit-only content.

💡 Suggested fix
 fn py_islower(&self) -> bool {
-    self.as_bytes()
-        .iter()
-        .copied()
-        .filter(u8::is_ascii_alphabetic)
-        .all(|byte| byte.is_ascii_lowercase())
+    let mut seen_alpha = false;
+    for byte in self
+        .as_bytes()
+        .iter()
+        .copied()
+        .filter(u8::is_ascii_alphabetic)
+    {
+        seen_alpha = true;
+        if !byte.is_ascii_lowercase() {
+            return false;
+        }
+    }
+    seen_alpha
 }
 
 // Py_bytes_isupper
 fn py_isupper(&self) -> bool {
-    self.as_bytes()
-        .iter()
-        .copied()
-        .filter(u8::is_ascii_alphabetic)
-        .all(|byte| byte.is_ascii_uppercase())
+    let mut seen_alpha = false;
+    for byte in self
+        .as_bytes()
+        .iter()
+        .copied()
+        .filter(u8::is_ascii_alphabetic)
+    {
+        seen_alpha = true;
+        if !byte.is_ascii_uppercase() {
+            return false;
+        }
+    }
+    seen_alpha
 }

Also applies to: 423-428

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/anystr.rs` around lines 414 - 419, The py_islower and
py_isupper methods are using filter().all() which returns vacuously true when no
ASCII alphabetic characters are present, causing empty strings and digit-only
strings to incorrectly return true. In both functions (py_islower at lines
414-419 and py_isupper at lines 423-428), you need to ensure at least one ASCII
alphabetic character exists before checking the case condition. Modify the logic
to use both an any() check to verify at least one ASCII alphabetic byte is
present AND the all() check to ensure all such bytes match the required case,
combining them with AND logic so both conditions must be true for the function
to return true.

Comment thread crates/vm/src/cformat.rs
Comment on lines 361 to +379
if num_specifiers == 0 {
// literal only
return if is_mapping
|| values_obj
if !is_mapping
|| !values_obj
.downcast_ref::<tuple::PyTuple>()
.is_some_and(|e| e.is_empty())
{
for (_, part) in format.iter_mut() {
match part {
CFormatPart::Literal(literal) => result.append(literal),
CFormatPart::Spec(_) => unreachable!(),
}
return Err(vm.new_type_error("not all arguments converted during bytes formatting"));
}

// literal only
for (_, part) in format.iter_mut() {
if let CFormatPart::Literal(literal) = part {
result.append(literal)
} else {
unreachable!()
}
Ok(result)
} else {
Err(vm.new_type_error("not all arguments converted during bytes formatting"))
};
}

return Ok(result);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restore the literal-only success condition.

The guard now makes the literal-only success path unreachable: tuples are excluded from is_mapping, while mappings fail the empty-tuple predicate. This makes valid literal-only % formatting with {} or () raise instead of returning the literal.

  • crates/vm/src/cformat.rs#L361-L379: change the bytes branch guard to reject only when the value is neither a mapping nor an empty tuple.
  • crates/vm/src/cformat.rs#L458-L476: apply the same boolean fix for string formatting.
Proposed fix
-        if !is_mapping
-            || !values_obj
+        if !is_mapping
+            && !values_obj
                 .downcast_ref::<tuple::PyTuple>()
                 .is_some_and(|e| e.is_empty())
         {
             return Err(vm.new_type_error("not all arguments converted during bytes formatting"));
         }
-        if !is_mapping
-            || !values_obj
+        if !is_mapping
+            && !values_obj
                 .downcast_ref::<tuple::PyTuple>()
                 .is_some_and(|e| e.is_empty())
         {
             return Err(vm.new_type_error("not all arguments converted during string formatting"));
         }
📍 Affects 1 file
  • crates/vm/src/cformat.rs#L361-L379 (this comment)
  • crates/vm/src/cformat.rs#L458-L476
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/cformat.rs` around lines 361 - 379, The guard condition in the
bytes formatting branch at crates/vm/src/cformat.rs#L361-L379 currently uses OR
logic which makes the literal-only path unreachable. The condition should reject
only when the value is neither a mapping nor an empty tuple. Change the boolean
operator from || to && in the condition checking !is_mapping and the tuple
predicate to properly accept mapping objects or empty tuples while rejecting
other values. Apply the identical fix to the string formatting branch at
crates/vm/src/cformat.rs#L458-L476, changing the same OR operator to AND in the
corresponding guard condition.

Comment thread crates/vm/src/compiler.rs
Comment on lines +1 to +2
#[cfg(all(not(feature = "compiler"), feature = "parser", feature = "codegen",))]
compile_error!("Use --features=compiler to enable both parser and codegen");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect the relevant feature declarations and cfg gates without building.

rg -n -C3 '\bcompiler\s*=|\bparser\s*=|\bcodegen\s*=' --iglob 'Cargo.toml'
rg -n -C3 'new_syntax_error(_maybe_incomplete)?|impl crate::convert::ToPyException for \(CompileError' --type=rust

Repository: RustPython/RustPython

Length of output: 14894


Change the ToPyException impl gates to match the VM method gates.

The new compile-time guard only rejects not(compiler) && parser && codegen, but codegen alone is a valid feature combination. Building with --features=codegen without parser or compiler would compile the ToPyException impls (lines 41-52) while the VM methods they call—new_syntax_error and new_syntax_error_maybe_incomplete—remain gated on any(feature = "parser", feature = "compiler"), causing a linking failure.

Fix: change the cfg gates on both impls from #[cfg(any(feature = "parser", feature = "codegen"))] to #[cfg(any(feature = "parser", feature = "compiler"))] to match the availability of the called methods.

Suggested diff
-#[cfg(any(feature = "parser", feature = "codegen"))]
+#[cfg(any(feature = "parser", feature = "compiler"))]
 impl crate::convert::ToPyException for (CompileError, Option<&str>) {
     fn to_pyexception(&self, vm: &crate::VirtualMachine) -> crate::builtins::PyBaseExceptionRef {
         vm.new_syntax_error(&self.0, self.1)
     }
 }
 
-#[cfg(any(feature = "parser", feature = "codegen"))]
+#[cfg(any(feature = "parser", feature = "compiler"))]
 impl crate::convert::ToPyException for (CompileError, Option<&str>, bool) {
     fn to_pyexception(&self, vm: &crate::VirtualMachine) -> crate::builtins::PyBaseExceptionRef {
         vm.new_syntax_error_maybe_incomplete(&self.0, self.1, self.2)
     }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/compiler.rs` around lines 1 - 2, The ToPyException impl blocks
at lines 41-52 have feature gates that don't match the VM method gates they
depend on. Locate the two ToPyException impl blocks that call the methods
new_syntax_error and new_syntax_error_maybe_incomplete, and change their cfg
attribute from any(feature = "parser", feature = "codegen") to any(feature =
"parser", feature = "compiler") to match the gate on those called VM methods,
preventing linking failures when building with --features=codegen alone.

Comment thread crates/vm/src/compiler.rs
Comment on lines +24 to +27
Codegen(#[from] super::codegen::error::CodegenErrorType),
#[cfg(feature = "parser")]
#[error(transparent)]
Parse(#[from] super::parser::ParseErrorType),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect whether `super::parser` / `super::codegen` can resolve from `crate::compiler`.

rg -n -C3 'super::(parser|codegen)::|pub use ruff_python_parser as parser|rustpython_codegen::' --type=rust
rg -n -C3 '^\s*(pub\s+)?mod\s+(parser|codegen)\b|^\s*pub\s+use\s+.*\s+as\s+(parser|codegen)\b' --type=rust

Repository: RustPython/RustPython

Length of output: 4574


🏁 Script executed:

#!/bin/bash
# Check if these are truly at top level or in a nested module
head -50 crates/vm/src/compiler.rs | cat -n

Repository: RustPython/RustPython

Length of output: 2043


🏁 Script executed:

#!/bin/bash
# Look for where codegen module might be defined or re-exported
rg -n 'mod codegen|pub use.*codegen' crates/vm/src/compiler.rs

Repository: RustPython/RustPython

Length of output: 108


🏁 Script executed:

#!/bin/bash
# Check rustpython_codegen crate structure for the error types
fd -t f '\.rs$' crates/codegen/src/ 2>/dev/null | head -20 || echo "codegen crate not found, checking if it's external"

Repository: RustPython/RustPython

Length of output: 307


🏁 Script executed:

#!/bin/bash
# Verify what types are actually available from ruff_python_parser via the parser alias
rg -n 'ParseError' crates/compiler/src/ | head -10

Repository: RustPython/RustPython

Length of output: 893


🏁 Script executed:

#!/bin/bash
# Check what error types rustpython_codegen exports from its error module
rg -n 'pub.*CodegenError' crates/codegen/src/error.rs | head -5

Repository: RustPython/RustPython

Length of output: 260


🏁 Script executed:

#!/bin/bash
# Check what ParseErrorType is exported as in ruff_python_parser
# Since ruff_python_parser is external, check how it's used in crates/compiler
rg -n 'ParseErrorType|ParseError' crates/compiler/src/lib.rs | grep -E '(use|import|pub)'

Repository: RustPython/RustPython

Length of output: 287


🏁 Script executed:

#!/bin/bash
# Verify that super::parser and super::codegen don't resolve by checking if these modules exist in vm crate
ls -la crates/vm/src/ | grep -E 'codegen|parser' || echo "No codegen or parser directories found"

Repository: RustPython/RustPython

Length of output: 104


🏁 Script executed:

#!/bin/bash
# Check if the file actually compiles with these broken paths
# by looking for any build/compilation context
cd crates/vm && cargo check --no-default-features --features parser 2>&1 | head -50 || echo "Cannot build in sandbox"

Repository: RustPython/RustPython

Length of output: 151


🏁 Script executed:

#!/bin/bash
# Double-check the exact exports from rustpython_codegen error module
rg -n 'pub (enum|struct) (CodegenError|CodegenErrorType)' crates/codegen/src/error.rs

Repository: RustPython/RustPython

Length of output: 125


🏁 Script executed:

#!/bin/bash
# Verify the cfg_select conditions that guard the parser alias availability
rg -B5 -A10 'pub use ruff_python_parser as parser' crates/vm/src/compiler.rs

Repository: RustPython/RustPython

Length of output: 449


Update stale super:: paths after promoting the error enums.

The super::codegen::error::* and super::parser::* paths are broken. These enums are at the top level of compiler.rs with no parent module containing codegen or parser subdirectories. Use the local parser alias (defined on line 12) and the absolute rustpython_codegen::error::* path instead.

Proposed fix
     #[cfg(feature = "codegen")]
     #[error(transparent)]
-    Codegen(#[from] super::codegen::error::CodegenErrorType),
+    Codegen(#[from] rustpython_codegen::error::CodegenErrorType),
     #[cfg(feature = "parser")]
     #[error(transparent)]
-    Parse(#[from] super::parser::ParseErrorType),
+    Parse(#[from] parser::ParseErrorType),
 }
@@
     #[cfg(feature = "codegen")]
     #[error(transparent)]
-    Codegen(#[from] super::codegen::error::CodegenError),
+    Codegen(#[from] rustpython_codegen::error::CodegenError),
     #[cfg(feature = "parser")]
     #[error(transparent)]
-    Parse(#[from] super::parser::ParseError),
+    Parse(#[from] parser::ParseError),
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/vm/src/compiler.rs` around lines 24 - 27, The module paths in the
error enum variant definitions are stale and no longer valid. The Codegen
variant references super::codegen::error::CodegenErrorType which should be
replaced with the absolute path rustpython_codegen::error::CodegenErrorType. The
Parse variant references super::parser::ParseErrorType which should be replaced
with the local parser alias that is already defined on line 12, changing it to
simply parser::ParseErrorType.

@ShaharNaveh ShaharNaveh marked this pull request as draft June 15, 2026 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant