Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Add ConstantData::Frozenset variant (type only, folding deferred)
Add Frozenset to ConstantData, BorrowedConstant, and marshal support.
Actual frozenset folding (BUILD_SET + CONTAINS_OP → LOAD_CONST frozenset)
requires VirtualMachine for element hashing and is deferred.
  • Loading branch information
youknowone committed Mar 30, 2026
commit f8927e11225c10453dbba486c38ead4ed15bbc2e
24 changes: 24 additions & 0 deletions crates/compiler-core/src/bytecode.rs
Original file line number Diff line number Diff line change
Expand Up @@ -293,6 +293,7 @@ impl Constant for ConstantData {
Self::Code { code } => Code { code },
Self::Tuple { elements } => Tuple { elements },
Self::Slice { elements } => Slice { elements },
Self::Frozenset { elements } => Frozenset { elements },
Self::None => None,
Self::Ellipsis => Ellipsis,
}
Expand Down Expand Up @@ -860,6 +861,7 @@ pub enum ConstantData {
Code { code: Box<CodeObject> },
/// Constant slice(start, stop, step)
Slice { elements: Box<[ConstantData; 3]> },
Frozenset { elements: Vec<ConstantData> },
None,
Ellipsis,
}
Comment on lines +854 to 887
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if there are other match statements on ConstantData that may need updating
rg -n "match.*ConstantData" --type=rust -A10 | head -100

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

cat crates/jit/tests/common.rs

Repository: RustPython/RustPython

Length of output: 14312


🏁 Script executed:

rg "ConstantData::" --type=rust -B2 -A5 | head -150

Repository: RustPython/RustPython

Length of output: 9976


🏁 Script executed:

fd . crates/jit/tests --type f -name "*.rs"

Repository: RustPython/RustPython

Length of output: 295


🏁 Script executed:

rg "match.*\{" --type=rust -B3 -A15 | grep -A15 "ConstantData"

Repository: RustPython/RustPython

Length of output: 41022


🏁 Script executed:

rg "Slice\s*\{|Frozenset\s*\{" --type=rust -B2 -A2

Repository: RustPython/RustPython

Length of output: 16928


🏁 Script executed:

fd . crates/jit/tests --type f -name "*.rs" -exec grep -l "slice\|frozenset" {} \;

Repository: RustPython/RustPython

Length of output: 295


Update crates/jit/tests/common.rs to handle ConstantData::Slice and ConstantData::Frozenset variants.

The From<ConstantData> for StackValue implementation uses a wildcard match arm that will panic at runtime if these new constant types are encountered. Since the compiler actively generates ConstantData::Slice constants, any JIT test using slice expressions will fail with an unimplemented! panic. Add handling for both variants in the match statement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/compiler-core/src/bytecode.rs` around lines 854 - 887, The
From<ConstantData> for StackValue implementation currently uses a wildcard arm
and will panic for the new ConstantData::Slice and ConstantData::Frozenset
variants; update the match in From<ConstantData> for StackValue to explicitly
handle ConstantData::Slice (map it to an appropriate StackValue variant
representing a slice object, constructing start/stop/step from the elements
array) and ConstantData::Frozenset (convert elements Vec<ConstantData> into the
corresponding StackValue frozenset representation), ensuring you pattern-match
on ConstantData::Slice and ConstantData::Frozenset instead of relying on the
unimplemented wildcard arm.

Expand All @@ -882,6 +884,7 @@ impl PartialEq for ConstantData {
(Code { code: a }, Code { code: b }) => core::ptr::eq(a.as_ref(), b.as_ref()),
(Tuple { elements: a }, Tuple { elements: b }) => a == b,
(Slice { elements: a }, Slice { elements: b }) => a == b,
(Frozenset { elements: a }, Frozenset { elements: b }) => a == b,
(None, None) => true,
(Ellipsis, Ellipsis) => true,
_ => false,
Expand Down Expand Up @@ -909,6 +912,7 @@ impl hash::Hash for ConstantData {
Code { code } => core::ptr::hash(code.as_ref(), state),
Tuple { elements } => elements.hash(state),
Slice { elements } => elements.hash(state),
Frozenset { elements } => elements.hash(state),
None => {}
Ellipsis => {}
}
Expand All @@ -926,6 +930,7 @@ pub enum BorrowedConstant<'a, C: Constant> {
Code { code: &'a CodeObject<C> },
Tuple { elements: &'a [C] },
Slice { elements: &'a [C; 3] },
Frozenset { elements: &'a [C] },
None,
Ellipsis,
}
Expand Down Expand Up @@ -972,6 +977,19 @@ impl<C: Constant> BorrowedConstant<'_, C> {
elements[2].borrow_constant().fmt_display(f)?;
write!(f, ")")
}
BorrowedConstant::Frozenset { elements } => {
write!(f, "frozenset({{")?;
let mut first = true;
for c in *elements {
if first {
first = false
} else {
write!(f, ", ")?;
}
c.borrow_constant().fmt_display(f)?;
}
write!(f, "}})")
}
BorrowedConstant::None => write!(f, "None"),
BorrowedConstant::Ellipsis => write!(f, "..."),
}
Expand Down Expand Up @@ -1005,6 +1023,12 @@ impl<C: Constant> BorrowedConstant<'_, C> {
BorrowedConstant::Slice { elements } => Slice {
elements: Box::new(elements.each_ref().map(|c| c.borrow_constant().to_owned())),
},
BorrowedConstant::Frozenset { elements } => Frozenset {
elements: elements
.iter()
.map(|c| c.borrow_constant().to_owned())
.collect(),
},
BorrowedConstant::None => None,
BorrowedConstant::Ellipsis => Ellipsis,
}
Expand Down
6 changes: 4 additions & 2 deletions crates/compiler-core/src/marshal.rs
Original file line number Diff line number Diff line change
Expand Up @@ -464,8 +464,9 @@ impl<Bag: ConstantBag> MarshalBag for Bag {
Err(MarshalError::BadType)
}

fn make_frozenset(&self, _: impl Iterator<Item = Self::Value>) -> Result<Self::Value> {
Err(MarshalError::BadType)
fn make_frozenset(&self, it: impl Iterator<Item = Self::Value>) -> Result<Self::Value> {
let elements: Vec<Self::Value> = it.collect();
Ok(self.make_constant::<Bag::Constant>(BorrowedConstant::Frozenset { elements: &elements }))
}
Comment on lines +471 to 478
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n crates/compiler-core/src/marshal.rs | sed -n '460,490p'

Repository: RustPython/RustPython

Length of output: 1125


🏁 Script executed:

# Search for context around frozenset handling
rg -A 20 -B 5 "make_frozenset" crates/compiler-core/src/marshal.rs

Repository: RustPython/RustPython

Length of output: 2488


🏁 Script executed:

# Check if there's decoding logic that calls make_frozenset
rg -B 10 "TYPE_FROZENSET|frozenset" crates/compiler-core/src/marshal.rs | head -100

Repository: RustPython/RustPython

Length of output: 1368


🏁 Script executed:

# Find BorrowedConstant definition and Frozenset variant
rg -A 3 "enum BorrowedConstant|Frozenset {" crates/compiler-core/src/marshal.rs | head -50

Repository: RustPython/RustPython

Length of output: 194


🏁 Script executed:

# Search for BorrowedConstant definition in related files
fd -e rs | xargs rg "BorrowedConstant" | grep -E "(enum|struct)" | head -20

Repository: RustPython/RustPython

Length of output: 1031


🏁 Script executed:

# Check what makes up a constant and how frozensets are handled in constant bag
rg -B 5 -A 10 "struct.*Constant|enum.*Constant" crates/compiler-core/src/ | grep -A 10 -B 5 frozenset

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Find the BorrowedConstant enum definition
rg -A 40 "pub enum BorrowedConstant" crates/compiler-core/src/

Repository: RustPython/RustPython

Length of output: 3104


🏁 Script executed:

# Check if there's any deduplication logic in set/frozenset handling
rg -i "dedup\|duplicate\|hashset" crates/compiler-core/src/marshal.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look at the make_constant implementation to understand how frozensets are stored
rg -B 5 -A 15 "fn make_constant" crates/compiler-core/src/marshal.rs

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Search for the make_constant method implementation more broadly
fd -e rs -path "*/compiler-core/src/*" | xargs rg -A 10 "fn make_constant"

Repository: RustPython/RustPython

Length of output: 2908


🏁 Script executed:

# Look at how frozensets are used in the codebase
rg "Frozenset" crates/compiler-core/src/ -A 2 -B 2

Repository: RustPython/RustPython

Length of output: 4297


🏁 Script executed:

# Check if there's a ConstantBag trait that make_constant is part of
rg -B 5 -A 20 "trait ConstantBag" crates/compiler-core/src/

Repository: RustPython/RustPython

Length of output: 1662


🏁 Script executed:

# Check the marshal deserialization code for frozenset more carefully
rg -B 10 -A 5 "Type::FrozenSet" crates/compiler-core/src/marshal.rs

Repository: RustPython/RustPython

Length of output: 1290


🏁 Script executed:

# Look at how frozensets are handled in VM when executing constants
rg "Frozenset" crates/vm/src/ -A 3 -B 3 | head -80

Repository: RustPython/RustPython

Length of output: 3098


🏁 Script executed:

# Check if there are any tests for frozenset marshaling
fd -e rs -path "*/tests/*" -o -path "*test*" | xargs rg -l "frozenset\|Frozenset" | head -10

Repository: RustPython/RustPython

Length of output: 233


🏁 Script executed:

# Find test files for marshal
find crates -name "*test*" -o -name "*tests*" | grep -E "\.rs$" | xargs rg -l "frozenset\|Frozenset" 2>/dev/null | head -10

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Check the VM's marshal implementation for frozenset handling
rg -B 5 -A 20 "PyMarshalBag\|impl.*MarshalBag" crates/vm/src/stdlib/marshal.rs | head -100

Repository: RustPython/RustPython

Length of output: 47


🏁 Script executed:

# Look for how frozenset constants are created in the VM context
rg -B 3 -A 10 "make_frozenset" crates/vm/src/stdlib/marshal.rs

Repository: RustPython/RustPython

Length of output: 501


🏁 Script executed:

# Find PyFrozenSet::from_iter implementation
rg -B 5 -A 20 "impl.*PyFrozenSet|fn from_iter" crates/vm/src/builtins/frozenset.rs | head -150

Repository: RustPython/RustPython

Length of output: 140


🏁 Script executed:

# Look at PyFrozenSet implementation
cat crates/vm/src/builtins/frozenset.rs | head -150

Repository: RustPython/RustPython

Length of output: 132


🏁 Script executed:

# Find PyFrozenSet definition
fd -e rs | xargs rg "struct PyFrozenSet|impl PyFrozenSet" | head -20

Repository: RustPython/RustPython

Length of output: 335


🏁 Script executed:

# Search for frozenset implementation in builtins
fd -e rs crates/vm/src/builtins/ | xargs rg -l "PyFrozenSet\|frozenset"

Repository: RustPython/RustPython

Length of output: 464


🏁 Script executed:

# Check PyFrozenSet implementation and from_iter method
rg -B 5 -A 30 "impl PyFrozenSet" crates/vm/src/builtins/set.rs | head -200

Repository: RustPython/RustPython

Length of output: 1897


🏁 Script executed:

# Search for from_iter implementation
rg -B 3 -A 15 "from_iter" crates/vm/src/builtins/set.rs

Repository: RustPython/RustPython

Length of output: 3361


Deduplicate frozenset elements before constructing the constant.

When deserializing TYPE_FROZENSET, the marshal code collects items directly into a Vec without deduplication, preserving duplicates from the marshal data. CPython's equivalent (PySet_Add in marshal.c) automatically deduplicates. This means crafted marshal data with duplicate elements will produce a different constant representation in the bytecode than CPython does, affecting constant equality and hashing. Normalize the elements before passing to make_constant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/compiler-core/src/marshal.rs` around lines 471 - 478, The
make_frozenset implementation collects elements into a Vec and passes them
straight to make_constant, which preserves duplicates; update make_frozenset to
deduplicate elements before constructing the BorrowedConstant::Frozenset so its
behavior matches CPython. Concretely, in make_frozenset use a set (e.g., HashSet
or BTreeSet) to track seen values and build a new Vec of unique elements
(preserving a deterministic order such as first-seen or sorted if needed), then
call self.make_constant::<Bag::Constant>(BorrowedConstant::Frozenset { elements:
&unique_elements }) instead of passing the original elements Vec. Ensure you
reference the existing function make_frozenset and the constant constructor
BorrowedConstant::Frozenset when applying the change.


fn make_dict(
Expand Down Expand Up @@ -710,6 +711,7 @@ impl<'a, C: Constant> From<BorrowedConstant<'a, C>> for DumpableValue<'a, C> {
BorrowedConstant::Slice { elements } => {
Self::Slice(&elements[0], &elements[1], &elements[2])
}
BorrowedConstant::Frozenset { elements } => Self::Frozenset(elements),
BorrowedConstant::None => Self::None,
BorrowedConstant::Ellipsis => Self::Ellipsis,
}
Expand Down
15 changes: 15 additions & 0 deletions crates/vm/src/builtins/code.rs
Original file line number Diff line number Diff line change
Expand Up @@ -242,6 +242,13 @@ fn borrow_obj_constant(obj: &PyObject) -> BorrowedConstant<'_, Literal> {
let arr = Box::leak(Box::new([Literal(start), Literal(stop), Literal(step)]));
BorrowedConstant::Slice { elements: arr }
}
ref fs @ super::set::PyFrozenSet => {
// Box::leak the elements so they outlive the borrow. Leak is acceptable since
// constant pool objects live for the program's lifetime.
let elems: Vec<Literal> = fs.elements().into_iter().map(Literal).collect();
let elements = Box::leak(elems.into_boxed_slice());
BorrowedConstant::Frozenset { elements }
}
_ => panic!("unexpected payload for constant python value"),
})
}
Expand Down Expand Up @@ -309,6 +316,14 @@ impl ConstantBag for PyObjBag<'_> {
.into_ref(ctx)
.into()
}
BorrowedConstant::Frozenset { elements: _ } => {
// Creating a frozenset requires VirtualMachine for element hashing.
// PyObjBag only has Context, so we cannot construct PyFrozenSet here.
// Frozenset constants from .pyc are handled by PyMarshalBag which has VM access.
unimplemented!(
"frozenset constant in PyObjBag::make_constant requires VirtualMachine"
)
}
BorrowedConstant::None => ctx.none(),
BorrowedConstant::Ellipsis => ctx.ellipsis.clone().into(),
};
Expand Down