Settle on Py_ssize_t type and sys.maxsize value #447

Description

@silmeth

The sys.maxsize constant defines the maximum value of the Py_ssize_t type, which is the type used for indexing collections (and thus the theoretical maximum size of a collection in a given Python implementation).

CPython defines Py_ssize_t as C ssize_t (a signed integer of the machine’s pointer size, equivalent to Rust’s isize) and justifies the use of a signed type in PEP 353 with two reasons:

  1. Python allows negative indices in many contexts anyway (so using isize makes arithmetic on indices easier, though it is nothing that can’t be done with BigInts and usize),
  2. the CPython codebase has many places where it depends on the indexing variable being signed (for loops that check the variable is non-negative, possibly also some error handling by returning -1); this does not concern RustPython in any way.
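To illustrate the first point, here is a minimal sketch of resolving a Python-style index when the index is held as isize. The function name normalize_index is hypothetical and is not taken from the RustPython codebase.

```rust
/// Resolve a Python-style index against a container of length `len`.
/// Negative indices count from the end; out-of-range indices give `None`.
/// (`normalize_index` is a hypothetical helper, not a RustPython function.)
fn normalize_index(index: isize, len: usize) -> Option<usize> {
    if index >= 0 {
        // A non-negative isize always fits in usize.
        let i = index as usize;
        if i < len {
            Some(i)
        } else {
            None
        }
    } else {
        // `unsigned_abs` avoids the overflow that `-index` hits for isize::MIN.
        let from_end = index.unsigned_abs();
        // len - from_end, or None if the index reaches before the start.
        len.checked_sub(from_end)
    }
}
```

With a three-element list, an index of -1 resolves to position 2, and -4 is rejected.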

At the moment, RustPython defines sys.maxsize as the maximal value of usize:

ctx.set_attr(&sys_mod, "maxsize", ctx.new_int(std::usize::MAX));

and documents it as ‘the largest supported length of containers’.

But at the same time, RustPython is starting to use isize as the limit for container sizes: at the moment, list.insert() checks whether the index fits in isize.

Using usize sounds sensible because that is what Rust expects for vector indexing (so all indices must eventually be cast to it). However, a Rust Vec will apparently panic anyway if you try to grow it beyond isize::MAX bytes, because Rust cannot allocate more than that as a single chunk of memory. On the other hand, it does let you have usize::MAX-long vectors of zero-sized values (but those, afaik, have no use in RustPython).
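The Vec limit can be observed without triggering a panic by using the fallible Vec::try_reserve instead of reserve. This is only a sketch; the two function names are made up for illustration.

```rust
use std::collections::TryReserveError;

/// Ask a byte vector for one byte more than isize::MAX.
/// `try_reserve` reports the failure as an error where `reserve` would panic.
fn reserve_past_isize_max() -> Result<(), TryReserveError> {
    let mut bytes: Vec<u8> = Vec::new();
    bytes.try_reserve(isize::MAX as usize + 1)
}

/// Capacity reported by a vector of zero-sized values.
/// No memory is allocated for such elements, so the byte limit does not apply.
fn zero_sized_capacity() -> usize {
    Vec::<()>::new().capacity()
}
```

On current Rust the first function returns an error, while the second reports a capacity above isize::MAX without allocating anything.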

Using isize would also probably make it easier to eventually implement the Python C API (e.g. the PyList_Size() function) and C-based FFI.

I think RustPython should settle on either usize or isize as the underlying indexing type. For the sake of interoperability, and because of the technical limitations above, that should probably be isize, even though I personally like usize more and find it cleaner. Collection functions should then raise OverflowError when an index cannot be converted to isize, and perhaps also when a collection would grow beyond isize::MAX elements (hmm, is that even possible? wouldn’t it panic earlier anyway because the allocation fails?).
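A rough sketch of what such checks could look like follows. To stay self-contained it uses i128 as a stand-in for an arbitrary-precision Python int and a plain struct as a stand-in for the Python exception; OverflowError, index_to_isize and checked_new_len are all hypothetical names, not RustPython APIs.

```rust
/// Stand-in for Python's OverflowError (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
struct OverflowError(String);

/// Convert an index to isize, failing if it does not fit.
/// `i128` stands in for a BigInt here; a real implementation would
/// convert from its arbitrary-precision integer type instead.
fn index_to_isize(value: i128) -> Result<isize, OverflowError> {
    isize::try_from(value)
        .map_err(|_| OverflowError("int too large to convert to isize".to_string()))
}

/// Compute the length after adding `additional` elements, refusing any
/// result above isize::MAX (and guarding against usize overflow as well).
fn checked_new_len(len: usize, additional: usize) -> Result<usize, OverflowError> {
    match len.checked_add(additional) {
        Some(new_len) if new_len <= isize::MAX as usize => Ok(new_len),
        _ => Err(OverflowError("container would exceed isize::MAX elements".to_string())),
    }
}
```

Doing the length check before growing means the caller gets an error it can turn into a Python exception, rather than relying on a later allocation failure.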

Perhaps adding an alias like type PySize = isize (or PySsize, or PySSize) and using that instead of isize directly would also make sense?
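A minimal sketch of the alias idea, assuming isize is chosen. PySize is one of the names suggested above; MAXSIZE and list_size are hypothetical and only show how the alias would flow into sys.maxsize and a PyList_Size()-like function.

```rust
/// Single place that fixes the indexing type (name taken from the suggestion above).
type PySize = isize;

/// Value that sys.maxsize would expose under this choice (hypothetical constant).
const MAXSIZE: PySize = PySize::MAX;

/// Length of a list as PySize, similar in spirit to what PyList_Size() returns.
/// (`list_size` is a hypothetical helper.)
fn list_size(items: &[i64]) -> PySize {
    // Slices of non-zero-sized elements never exceed isize::MAX elements,
    // so this conversion is not expected to fail for i64.
    PySize::try_from(items.len()).expect("length fits in PySize")
}
```

Switching the alias to another integer type would then change the limit in one place, although any code relying on negative values would still need reviewing.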

Sorry for a wall of text…
