Support per-type-instance vectorcall for builtin types (dict, list, int, ...)
Summary
Currently, all type-object constructor calls (e.g. dict(x=1), list([1,2,3]), int("42")) go through a single vectorcall_type function, which only has a fast path for type(x) and falls back to the generic PyType::call slow path for everything else. This means every builtin type constructor pays the cost of slot_new(args.clone()) + slot_init(args) dispatch, including an unnecessary args.clone().
CPython avoids this by giving each PyTypeObject its own tp_vectorcall function pointer. When you call dict(...), CPython reads PyDict_Type.tp_vectorcall (= dict_vectorcall) directly, bypassing the generic type.__call__ → __new__ + __init__ chain entirely. Over 15 builtin types have dedicated vectorcall implementations.
Current Architecture in RustPython
    dict(x=1)
      → PyCallable::new(dict_type_obj)
        → obj.class() = type (metatype)
        → type.slots.vectorcall = vectorcall_type
      → vectorcall_type: not type(x), so fallback
        → PyType::call(dict_type, args)
          → slot_new(args.clone()) ← unnecessary clone
          → slot_init(args)
Each PyType instance already has its own slots: PyTypeSlots with vectorcall: AtomicCell<Option<VectorCallFunc>>, but the dispatch path never reads it. vectorcall_type receives the type object as its first argument but ignores the type's own slots.vectorcall.
Proposed Solution
1. Modify vectorcall_type to dispatch per-type vectorcall
In crates/vm/src/builtins/type.rs, add a branch that checks the called type's own slots.vectorcall:
    fn vectorcall_type(...) -> PyResult {
        let zelf: &Py<PyType> = zelf_obj.downcast_ref().unwrap();
        if zelf.is(vm.ctx.types.type_type) {
            // type(x) fast path (existing)
            ...
        } else if let Some(type_vc) = zelf.slots.vectorcall.load() {
            // Per-type vectorcall: dict_vectorcall, list_vectorcall, etc.
            return type_vc(zelf_obj, args, nargs, kwnames, vm);
        }
        // Fallback to PyType::call
        ...
    }
The else if structure prevents infinite recursion: when zelf is type_type itself, we never read zelf.slots.vectorcall (which would be vectorcall_type again).
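To make the branch order concrete, here is a minimal, self-contained sketch of the dispatch. It is a toy model, not RustPython code: `ToyType`, its `is_type_type` flag, the simplified `VectorCallFunc` signature, and the string return values are all assumptions made for illustration, and `Cell` stands in for the real `AtomicCell`.

```rust
use std::cell::Cell;

// Hypothetical, simplified signature; the real VectorCallFunc takes more arguments.
type VectorCallFunc = fn(&ToyType, &[i64]) -> String;

// Toy stand-in for a type object with its own optional vectorcall slot.
struct ToyType {
    name: &'static str,
    is_type_type: bool, // assumption: replaces the `zelf.is(type_type)` identity check
    vectorcall: Cell<Option<VectorCallFunc>>,
}

// Stand-in for a dedicated per-type fast path such as dict_vectorcall.
fn dict_vectorcall(zelf: &ToyType, args: &[i64]) -> String {
    format!("fast:{}:{}", zelf.name, args.len())
}

// Stand-in for the generic PyType::call slow path.
fn generic_call(zelf: &ToyType, args: &[i64]) -> String {
    format!("slow:{}:{}", zelf.name, args.len())
}

fn vectorcall_type(zelf: &ToyType, args: &[i64]) -> String {
    if zelf.is_type_type {
        // type(x) fast path: the slot is never read in this branch,
        // so a slot pointing back at vectorcall_type cannot recurse.
        return format!("type-of:{}", args.len());
    } else if let Some(type_vc) = zelf.vectorcall.get() {
        // Per-type vectorcall registered on the called type itself.
        return type_vc(zelf, args);
    }
    // No per-type vectorcall: fall back to the generic path.
    generic_call(zelf, args)
}
```

In this model, a type whose slot holds `vectorcall_type` itself (as `type` would) still terminates, because the first branch returns before the slot is consulted.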
2. Clear vectorcall on __init__/__new__ override
In crates/vm/src/types/slot.rs, when update_slot processes TpInit or TpNew with ADD=true, also clear the type's vectorcall slot by calling self.slots.vectorcall.store(None).
This ensures subclasses that override __init__ or __new__ don't inherit an incorrect per-type vectorcall. For example, class MyDict(dict): def __init__(self, ...): ... must NOT use dict_vectorcall, which would skip the Python __init__ override.
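The clearing rule can be sketched in isolation as follows. This is an illustrative model only: `SlotKind`, `ToySlots`, `call_path`, and the zero-argument `VectorCallFunc` are hypothetical names, and the real update_slot handles far more than shown here.

```rust
use std::cell::Cell;

// Hypothetical, simplified signature used only for this sketch.
type VectorCallFunc = fn() -> &'static str;

// Hypothetical subset of slot kinds; TpRepr is included only as a "does not clear" case.
#[derive(Clone, Copy)]
enum SlotKind {
    TpInit,
    TpNew,
    TpRepr,
}

struct ToySlots {
    vectorcall: Cell<Option<VectorCallFunc>>,
}

fn dict_vectorcall() -> &'static str {
    "dict_vectorcall"
}

// Toy analogue of update_slot: adding __init__ or __new__ invalidates the fast path.
fn update_slot(slots: &ToySlots, kind: SlotKind, add: bool) {
    match kind {
        SlotKind::TpInit | SlotKind::TpNew if add => slots.vectorcall.set(None),
        _ => {} // other slots leave the vectorcall untouched
    }
}

// Reports which path a constructor call would take for these slots.
fn call_path(slots: &ToySlots) -> &'static str {
    match slots.vectorcall.get() {
        Some(vc) => vc(),
        None => "PyType::call",
    }
}
```

With the slot cleared, the dispatch in step 1 falls through to PyType::call, so the Python-level __init__ or __new__ runs as expected.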
3. Implement and register per-type vectorcall functions
Each builtin type gets a dedicated vectorcall function registered in its init():
| Type | File | Pattern |
|---|---|---|
| dict | builtins/dict.rs | DefaultConstructor + Initializer (skip slot_new, pass args only to slot_init) |
| list | builtins/list.rs | Constructor |
| tuple | builtins/tuple.rs | Constructor |
| int | builtins/int.rs | Constructor |
| float | builtins/float.rs | Constructor |
| str | builtins/pystr.rs | Constructor |
| bool | builtins/bool_.rs | Constructor |
| set | builtins/set.rs | DefaultConstructor + Initializer |
| frozenset | builtins/set.rs | Constructor |
The key optimization for DefaultConstructor + Initializer types (like dict, set) is avoiding the args.clone() in PyType::call (line 2216): since Default::default() needs no args, we construct the object first, then pass args only to slot_init.
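A rough sketch of why the clone disappears for this pattern is shown below. Everything here is a simplified assumption: `ToyArgs`, `ToyDict`, and the clone counter exist only to make the difference observable, and the function bodies are not the real slot_new / slot_init implementations.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Toy argument bundle that counts how often it is cloned.
struct ToyArgs {
    items: Vec<(String, i64)>,
    clone_count: Rc<Cell<usize>>, // shared counter, for illustration only
}

impl Clone for ToyArgs {
    fn clone(&self) -> Self {
        self.clone_count.set(self.clone_count.get() + 1);
        ToyArgs {
            items: self.items.clone(),
            clone_count: Rc::clone(&self.clone_count),
        }
    }
}

#[derive(Default)]
struct ToyDict {
    entries: Vec<(String, i64)>,
}

// For a DefaultConstructor-style type, slot_new ignores its arguments.
fn slot_new(_args: ToyArgs) -> ToyDict {
    ToyDict::default()
}

fn slot_init(obj: &mut ToyDict, args: ToyArgs) {
    obj.entries = args.items;
}

// Generic path: both slots take the args by value, so one clone is required.
fn generic_call(args: ToyArgs) -> ToyDict {
    let mut obj = slot_new(args.clone());
    slot_init(&mut obj, args);
    obj
}

// Per-type path: build the default object directly, then hand args to init once.
fn dict_vectorcall(args: ToyArgs) -> ToyDict {
    let mut obj = ToyDict::default();
    slot_init(&mut obj, args);
    obj
}
```

Both paths produce the same object in this model; the per-type path simply never touches `Clone`.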
Inheritance Behavior
- Vectorcall is already inherited alongside call via copyslot_if_none in slot_defs.rs (lines 574-577)
- class MyDict(dict): pass → inherits dict_vectorcall ✓
- class MyDict(dict): def __init__(self, ...): ... → vectorcall cleared to None, falls back to PyType::call ✓
- Custom metaclass with __call__ override → vectorcall cleared on metaclass, per-type vectorcall never reached ✓
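The first three cases above can be modelled with a small copy-if-none sketch. It is a toy under stated assumptions: `new_subclass` and its `overrides_init` flag are hypothetical and compress subclass creation plus the slot update into one function, which is not how the real copyslot_if_none and update_slot are wired together.

```rust
use std::cell::Cell;

// Hypothetical, simplified signature used only for this sketch.
type VectorCallFunc = fn(&ToyType) -> String;

struct ToyType {
    name: &'static str,
    vectorcall: Cell<Option<VectorCallFunc>>,
}

fn dict_vectorcall(t: &ToyType) -> String {
    format!("dict_vectorcall({})", t.name)
}

// Toy subclass creation: inherit the base slot if unset, then clear on override.
fn new_subclass(name: &'static str, base: &ToyType, overrides_init: bool) -> ToyType {
    let sub = ToyType { name, vectorcall: Cell::new(None) };
    // Copy-if-none: only fill the slot when the subclass has nothing of its own.
    if sub.vectorcall.get().is_none() {
        sub.vectorcall.set(base.vectorcall.get());
    }
    // A Python-level __init__ (or __new__) invalidates the inherited fast path.
    if overrides_init {
        sub.vectorcall.set(None);
    }
    sub
}

// Reports which path constructing an instance of `t` would take.
fn construct(t: &ToyType) -> String {
    match t.vectorcall.get() {
        Some(vc) => vc(t),
        None => format!("PyType::call({})", t.name),
    }
}
```

Note that the inherited function receives the subclass as its type argument, so a real per-type vectorcall has to build an instance of the called type rather than of the base type.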
Key Files
- crates/vm/src/builtins/type.rs: vectorcall_type dispatch modification
- crates/vm/src/types/slot.rs: update_slot vectorcall clearing for TpInit/TpNew
- crates/vm/src/types/slot_defs.rs: copyslot_if_none inheritance (already correct)
- crates/vm/src/protocol/callable.rs: PyCallable::new (no changes needed)
- Individual builtin type files for vectorcall implementations
References
- Objects/typeobject.c: type_vectorcall, inherit_special
- Include/internal/pycore_call.h: _PyVectorcall_FunctionInline, _PyObject_VectorcallTstate