BUG: np.clip has surprising object-dtype behavior for huge Python ints (0-D return type and where without out) #31145

@francof2a

Describe the issue

I am using dtype=object with np.clip to preserve exact Python integers larger than int64.

While testing this, I found two object-dtype behaviors that are surprising and hard to handle downstream:

  1. np.clip on a 0-D object array returns a plain Python int, not a NumPy scalar / 0-D array-like result.
  2. np.clip(..., where=mask) on an object array with out=None leaves None in untouched positions.

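The second behavior may be inherited from generic ufunc semantics: with where=... and out=None, the result positions where the mask is False are left uninitialized. For object arrays this is especially surprising, because an uninitialized object slot shows up as an actual None object.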

For context: I originally hit this area because on NumPy 1.26.4, clipping huge positive Python ints near 2**64 - 1 also lost exactness by promoting to float64. That scalar precision issue appears fixed in current NumPy (2.4.4), but the object-dtype behaviors below still reproduce.
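
The loss of exactness seen on 1.26.4 is consistent with how float64 rounds integers of this size: a float64 has a 53-bit significand, so 2**64 - 1 is not representable and rounds to 2**64. A minimal pure-Python sketch of that rounding follows (it does not call np.clip and only illustrates the float conversion, not NumPy's promotion logic):

```python
# Illustration only: what happens to 2**64 - 1 once it passes through float64.
big = 2**64 - 1

as_float = float(big)        # Python floats are IEEE 754 float64
round_trip = int(as_float)   # convert back to an exact Python int

print(round_trip == big)     # False: the value was rounded
print(round_trip - big)      # 1: it rounded up to 2**64
```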

Reproduce the code example

import numpy as np

print("numpy", np.__version__)

print("\n0-D object array input")
x0 = np.array(2**64 - 1, dtype=object)
y0 = np.clip(x0, 0, 2**64 - 1)
print("input:", x0, x0.dtype, x0.shape)
print("result:", y0)
print("result type:", type(y0))
print("result dtype:", getattr(y0, "dtype", None))
print("result shape:", getattr(y0, "shape", None))

print("\n1-D object array with where and no out")
x1 = np.array([2**64 - 1, 1, -(2**64) + 1], dtype=object)
mask = np.array([True, False, True])
y1 = np.clip(x1, -(2**64) + 1, 2**64 - 1, where=mask)
print("input:", x1, x1.dtype)
print("mask:", mask)
print("result:", y1)
print("result dtype:", y1.dtype)

Current output

Observed on NumPy 2.4.4:

numpy 2.4.4

0-D object array input
input: 18446744073709551615 object ()
result: 18446744073709551615
result type: <class 'int'>
result dtype: None
result shape: None

1-D object array with where and no out
input: [18446744073709551615 1 -18446744073709551615] object
mask: [ True False  True]
result: [18446744073709551615 None -18446744073709551615]
result dtype: object

Expected behavior

At least one of these would help:

  1. Preserve array/scalar shape semantics for 0-D object array input in a way that is consistent with non-object dtypes.
  2. Clarify in np.clip documentation that for object arrays, where=... with out=None may leave untouched entries as None objects.
  3. If feasible, make np.clip preserve original values in untouched object-array positions when where is used without out.
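
Until something like item 3 exists, one possible workaround is to pass a copy of the input as out, so that positions excluded by where keep their original values instead of becoming None. This is only a sketch: the helper name clip_where_keep_original is hypothetical and not part of NumPy, and it assumes the input can be converted to an object array.

```python
import numpy as np


def clip_where_keep_original(a, lo, hi, where):
    """Hypothetical helper: clip only where `where` is True, keep the rest."""
    a = np.asarray(a, dtype=object)
    # Pre-fill the output with the original values; the ufunc machinery then
    # overwrites only the positions selected by `where`.
    out = a.copy()
    np.clip(a, lo, hi, out=out, where=where)
    return out


x = np.array([2**70, 2**70, -(2**70)], dtype=object)
mask = np.array([True, False, True])
result = clip_where_keep_original(x, -(2**64), 2**64, mask)
print(result)
```

The middle entry is outside the bounds but is masked out, so it should come back unchanged rather than as None.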

Additional context

For comparison, non-object 0-D inputs keep NumPy scalar types:

np.clip(np.array(5, dtype=np.int64), 0, 10)
# np.int64(5)

np.clip(np.array(2**64 - 1, dtype=np.uint64), 0, np.uint64(2**64 - 1))
# np.uint64(18446744073709551615)
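
For downstream code that needs a uniform return type, a possible workaround for the 0-D object case is to re-wrap whatever np.clip returns. The helper name clip_object_keep_ndim below is hypothetical, and the sketch assumes the elements are plain Python ints (np.asarray would expand a sequence-like element into extra dimensions).

```python
import numpy as np


def clip_object_keep_ndim(a, lo, hi):
    """Hypothetical helper: clip and always return an object-dtype ndarray."""
    a = np.asarray(a, dtype=object)
    result = np.clip(a, lo, hi)
    # For 0-D input np.clip may hand back the bare Python object, so wrap it
    # again; for N-D input this is a no-op on an array that is already object.
    return np.asarray(result, dtype=object)


r0 = clip_object_keep_ndim(2**70, 0, 2**64)
r1 = clip_object_keep_ndim([2**70, 3], 0, 2**64)
print(type(r0), r0.shape, r0.dtype)
print(type(r1), r1.shape, r1.dtype)
```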

Also, exact huge integers are preserved fine for 1-D object arrays when where is not used:

arr = np.array([2**64 - 1, -(2**64) + 1], dtype=object)
np.clip(arr, -(2**64) + 1, 2**64 - 1)
# array([18446744073709551615, -18446744073709551615], dtype=object)

Python and NumPy Versions

Python: 3.13.x / 3.12.x
NumPy: 2.4.4
Platform: Linux x86_64

Reproduce the code example:

import numpy as np


def _print_status(ok):
    print("STATUS:", "RIGHT RESULT" if ok else "WRONG RESULT")


def report_scalar_case(name, x, lo, hi, expected_value, expected_type):
    y = np.clip(x, lo, hi)
    ok = int(y) == expected_value and type(y) is expected_type
    print(f"\n{name}")
    print("input:", x)
    print("bounds:", (lo, hi))
    print("result:", y)
    print("result type:", type(y))
    print("int(result):", int(y))
    print("int(result) - input:", int(y) - x)
    print("expected value:", expected_value)
    print("expected type:", expected_type)
    _print_status(ok)


def report_array_case(name, x, lo, hi, expected_value, expected_type, expected_dtype=None, **kwargs):
    y = np.clip(x, lo, hi, **kwargs)
    ok = (
        type(y) is expected_type
        and np.array_equal(np.asarray(y, dtype=object), np.asarray(expected_value, dtype=object))
        and (expected_dtype is None or getattr(y, "dtype", None) == expected_dtype)
    )
    print(f"\n{name}")
    print("input:", x, "dtype=", getattr(x, "dtype", None), "shape=", getattr(x, "shape", None))
    print("kwargs:", kwargs)
    print("result:", y)
    print("result type:", type(y))
    print("result dtype:", getattr(y, "dtype", None))
    print("result shape:", getattr(y, "shape", None))
    print("expected value:", expected_value)
    print("expected type:", expected_type)
    print("expected dtype:", expected_dtype)
    _print_status(ok)


def main():
    print("numpy", np.__version__)

    print("\n== Scalar cases that behave well before the edge ==")
    report_scalar_case(
        "scalar positive int64 max (works)",
        2**63 - 1,
        0,
        2**63 - 1,
        expected_value=2**63 - 1,
        expected_type=np.int64,
    )
    report_scalar_case(
        "scalar negative int64 min (works)",
        -(2**63),
        -(2**63),
        0,
        expected_value=-(2**63),
        expected_type=np.int64,
    )

    print("\n== Scalar positive cases that lose exactness above int64 ==")
    report_scalar_case(
        "scalar uint64-2",
        2**64 - 2,
        0,
        2**64 - 2,
        expected_value=2**64 - 2,
        expected_type=int,
    )
    report_scalar_case(
        "scalar uint64-1",
        2**64 - 1,
        0,
        2**64 - 1,
        expected_value=2**64 - 1,
        expected_type=int,
    )

    print("\n== Scalar negative controls beyond int64 ==")
    report_scalar_case(
        "scalar negative uint64+2 equivalent",
        -(2**64) + 2,
        -(2**64) + 2,
        0,
        expected_value=-(2**64) + 2,
        expected_type=int,
    )
    report_scalar_case(
        "scalar negative uint64+1 equivalent",
        -(2**64) + 1,
        -(2**64) + 1,
        0,
        expected_value=-(2**64) + 1,
        expected_type=int,
    )

    print("\n== Object-dtype cases ==")
    report_array_case(
        "0-D object array returns a Python int instead of an ndarray",
        np.asarray(2**64 - 1, dtype=object),
        0,
        2**64 - 1,
        expected_value=np.asarray(2**64 - 1, dtype=object),
        expected_type=np.ndarray,
        expected_dtype=object,
    )
    report_array_case(
        "1-D object array preserves exact huge integers",
        np.array([2**64 - 1, -(2**64) + 1], dtype=object),
        -(2**64) + 1,
        2**64 - 1,
        expected_value=np.array([2**64 - 1, -(2**64) + 1], dtype=object),
        expected_type=np.ndarray,
        expected_dtype=object,
    )
    report_array_case(
        "1-D object array with where and no out leaves invalid untouched entries",
        np.array([2**64 - 1, 1, -(2**64) + 1], dtype=object),
        -(2**64) + 1,
        2**64 - 1,
        expected_value=np.array([2**64 - 1, 1, -(2**64) + 1], dtype=object),
        expected_type=np.ndarray,
        expected_dtype=object,
        where=np.array([True, False, True]),
    )
    report_array_case(
        "1-D int64 array with where and no out also leaves implementation-defined untouched entries",
        np.array([1, 5, 3], dtype=np.int64),
        2,
        4,
        expected_value=np.array([2, 5, 3], dtype=np.int64),
        expected_type=np.ndarray,
        expected_dtype=np.int64,
        where=np.array([True, False, True]),
    )


if __name__ == "__main__":
    main()

Python and NumPy Versions:

NumPy 2.4.4 and 1.26.4

    Labels

    00 - Bug, 57 - Close? (issues which may be closable unless discussion continued)
