ENH, PERF: allocate memory as part of the array object for scalars #31092

Open
mhvk wants to merge 2 commits into numpy:main from mhvk:array-allocate-scalar-or-strides-via-object-flags

Conversation

@mhvk
Contributor

@mhvk mhvk commented Mar 29, 2026

@seberg, @eendebakpt: this is an updated (and greatly improved) version of #29878, making use of the fact that in previous PRs I removed the explicit dependence on how data were allocated.

Details

Allocate memory for the dimensions and strides, as well as small amounts of data, as part of the object, to avoid the overhead of multiple allocations (with the allocation of tracked memory for data especially large).

Note that the above is done only for standard arrays, though I think in principle it could be extended to subtypes with a suitable check.

Also, allocating space for data is done only for the standard allocator.
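The single-allocation layout described above can be sketched in plain C. This is a toy struct with invented names, not NumPy's actual `PyArrayObject_fields` layout: the fixed-size header and a variable-size tail holding dims, strides, and small data all come from one malloc, so a single free releases everything.

```c
/* Toy sketch of the single-allocation idea (invented names, not
 * NumPy's actual PyArrayObject_fields layout). */
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int ndim;
    intptr_t *dims;     /* points into the tail of this same block */
    intptr_t *strides;  /* likewise */
    char *data;         /* small data also lives in the tail */
} toy_array;

static toy_array *
toy_array_new(int ndim, size_t databytes)
{
    /* one allocation: header + 2*ndim intptr_t + data */
    size_t tail = 2 * (size_t)ndim * sizeof(intptr_t) + databytes;
    toy_array *a = malloc(sizeof(toy_array) + tail);
    if (a == NULL) {
        return NULL;
    }
    a->ndim = ndim;
    a->dims = (intptr_t *)(a + 1);      /* first bytes past the header */
    a->strides = a->dims + ndim;
    a->data = (char *)(a->strides + ndim);
    return a;
}
```

A subclass with a larger `tp_basicsize` would break the fixed-offset assumption here, which is one reason the PR restricts this to `subtype == &PyArray_Type`.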

Timings
```
%timeit np.empty(())
133->68.2 ns

a = 1.6
%timeit np.array(a)
151->83 ns

%timeit np.add(a, a)
581->423 ns

b = np.array(a)
%timeit np.add(b, b)
293->225 ns
```

No AI was used.

@mhvk mhvk requested review from eendebakpt and seberg March 29, 2026 12:54
@mhvk mhvk force-pushed the array-allocate-scalar-or-strides-via-object-flags branch 2 times, most recently from 87e035f to 0f8ba39 on March 29, 2026 13:02
npy_free_cache_dim(PyArray_DIMS(arr), PyArray_NDIM(arr));
/* deallocate if not part of the array instance */
if ((PyArray_DIMS(arr) != NULL)
&& (PyArray_DIMS(arr) !=
Contributor Author

Possibly, there should be another flag for whether the strides and dimensions are stored on the array, but it seemed overkill.

Member

NPY_ARRAY_DIMSONINSTANCE seems clearer to me than the pointer math that happens here but I agree it's not a big deal
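For illustration, the pointer comparison being discussed, checking whether the dims pointer coincides with the address just past the fixed-size header rather than consulting a flag, might look like this toy C sketch (the struct and helper names are hypothetical, standing in for `PyArrayObject_fields` and `PyArray_DIMS`):

```c
/* Hypothetical sketch of the flagless check: dims are "on the
 * instance" exactly when the pointer equals the address just past
 * the fixed-size header. */
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int ndim;
    intptr_t *dims;
} toy_header;

static int
dims_on_instance(const toy_header *h)
{
    /* cf. comparing PyArray_DIMS(arr) against the end of the
     * PyArrayObject_fields struct */
    return h->dims == (const intptr_t *)(h + 1);
}
```

A dedicated flag would say the same thing more explicitly; the pointer test avoids spending a flag bit at the cost of this bit of address arithmetic.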

@mhvk mhvk force-pushed the array-allocate-scalar-or-strides-via-object-flags branch 3 times, most recently from c5f5dfd to 4ee2623 on March 29, 2026 14:18
Contributor

@eendebakpt eendebakpt left a comment

Some comments and small things, but overall I am +1 on this

There are a couple of cases (pickling, reshapes) where we could make more use of the inline data, dims and strides, but since these are deprecated or not performance critical I suggest we do not add them here.

* This flag may be tested for in PyArray_FLAGS(arr).
* (MAYBE: allow it to be requested as well?)
*/
#define NPY_ARRAY_DATAONINSTANCE 0x0008
Contributor

This is not strictly needed (where the data is allocated is an implementation detail, in any case for the default memory handler). But I am fine with having this in the interface. If we keep this, can we add a test for it?

Contributor Author

Indeed, this should be an implementation detail, and in principle one could also tell that the data is on the instance from the fact that mem_handler=NULL -- unfortunately, elsewhere in the code that is taken to mean that someone messed around with the instance and just set data (it is free'd in the deallocator).

That said, I agree this does not need to be public; I just need to have some way to know the data are on the instance internally.

I actually wondered whether one could adjust the default memory handler to somehow keep track of this, but am not sure how... (it doesn't have access to the instance, after all).

Comment thread numpy/_core/src/multiarray/ctors.c Outdated
Comment thread numpy/_core/src/multiarray/ctors.c
Comment thread numpy/_core/src/multiarray/ctors.c Outdated
Allocate memory for the dimensions and strides as part of the object,
to avoid the overhead of an extra allocation.  This will only become
useless memory if the array shape is set (now deprecated) or the array
is resized in-place (frowned upon).

Note that the above is done only for standard arrays, though I think
in principle it could be extended to subtypes with a suitable check.

Also, allocating space for data is done only for the standard
allocator.

Timings
```
%timeit np.empty(())
133->127 ns

a = 1.6
%timeit np.array(a)
151->137 ns

%timeit np.add(a, a)
581->557 ns

b = np.array(a)
%timeit np.add(b, b)
293->279 ns
```
@mhvk mhvk force-pushed the array-allocate-scalar-or-strides-via-object-flags branch from 4ee2623 to 50923bb on March 30, 2026 13:40

fail:
NPY_traverse_info_xfree(&fill_zero_info);
Py_XDECREF(fa->mem_handler);
Contributor Author

This was a mistake in the failure path - mem_handler gets decref'd in the deallocator too.

Allocate memory for small amounts of data as part of the object, to
avoid the overhead of the allocation of tracked memory.

Like for the dimensions and strides, this is done only for standard
arrays.  Furthermore, allocating space for data is done only for the
standard allocator.

Timings relative to having dimensions, strides and data not on the
object:
```
%timeit np.empty(())
133->68.2 ns

a = 1.6
%timeit np.array(a)
151->83 ns

%timeit np.add(a, a)
581->423 ns

b = np.array(a)
%timeit np.add(b, b)
293->225 ns
```
@mhvk mhvk force-pushed the array-allocate-scalar-or-strides-via-object-flags branch from 50923bb to 13df5f5 on March 30, 2026 14:18
@mhvk
Contributor Author

mhvk commented Mar 30, 2026

OK, now split into 2 commits, one for storing the dimensions and strides on the object (which has only a modest benefit for performance, see commit message), and the second for also storing small amounts of data.

I'm still not quite happy with having to keep track of whether or not the data is on the instance via a flag; suggestions for alternatives appreciated! Ideally it would somehow be possible to let the default allocator do some of the work, by registering the address of the data-on-instance as memory that does not need freeing (so renew would just allocate new memory).
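The idea of letting the allocator treat the on-instance address as memory that must not be freed could look roughly like this standalone C sketch. This is not the actual `PyDataMem_Handler` API; all names here are invented for illustration: an allocator that remembers one "on-instance" address, makes free() a no-op for it, and satisfies realloc() with fresh memory instead.

```c
/* Rough sketch of a "registered address" allocator (invented names,
 * not the PyDataMem_Handler interface). */
#include <stdlib.h>
#include <string.h>

typedef struct {
    void *no_free;   /* address that must never reach free() */
} tracked_allocator;

static void
tracked_free(tracked_allocator *h, void *p)
{
    if (p != h->no_free) {
        free(p);
    }
}

static void *
tracked_realloc(tracked_allocator *h, void *p, size_t oldsize, size_t newsize)
{
    if (p != h->no_free) {
        return realloc(p, newsize);
    }
    /* grow off-instance: copy out, leave the on-instance bytes alone */
    void *q = malloc(newsize);
    if (q != NULL) {
        memcpy(q, p, oldsize < newsize ? oldsize : newsize);
    }
    return q;
}
```

The catch the comment hints at is state: a real memory handler is shared and stateless with respect to individual arrays, so it has no obvious place to store the per-instance address.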

Member

@ngoldbaum ngoldbaum left a comment

I left some minor comments below. I asked Claude Code to do a review pass and most of these are generated from the report it made. All the wording is from me and I tried to validate everything I commented about.

return PyErr_NoMemory();
}
/* PyType_GenericAlloc zeroes the extra bytes, but we don't need to. */
fa = (PyArrayObject_fields *)PyObject_Init((PyObject *)alloc, subtype);
Member

this can fail due to memory exhaustion, might as well add a NULL check just in case

Contributor Author

This cannot fail - the allocation has already been made above. Python also does not check:
https://github.com/python/cpython/blob/9e5b8383724211d14165a32c0e7682e56e13843a/Objects/object.c#L531-L540

Py_DECREF(descr);
return PyErr_NoMemory();
}
/* PyType_GenericAlloc zeroes the extra bytes, but we don't need to. */
Member

Suggested change
/* PyType_GenericAlloc zeroes the extra bytes, but we don't need to. */
/* PyType_GenericAlloc zeroes the extra bytes, but we don't need to
because all fields are explicitly initialized below */

for future readers, in case new fields ever get added?

Contributor Author

Agreed. I've done this in the other PR, though, since I think that ended up the better approach.

* TODO: make this a plain else; see comment in array_dealloc.
*/
if (newnbytes <= oldnbytes) {
new_data = PyArray_DATA(self);
Member

@ngoldbaum ngoldbaum Mar 30, 2026

If we get here and PyArray_NDIM(self) != new_nd, then we might get into a state where the dimensions are stored outside the array but the data are still stored on-instance. Maybe there should be a PyArray_NDIM(self) == new_nd check here?

/*
* Create the array instance.
*
* For standard arrays, we allocate extra memory for the dimensions and
Member

I would say "for ndarrays (but not subtypes)" instead of 'standard arrays'.

Py_DECREF(descr);
return NULL;
if (subtype == &PyArray_Type) {
size_t size = subtype->tp_basicsize;
Member

On 64-bit architectures, this is 96 bytes, which is enough space for twelve 64-bit data elements. So this optimization applies for small arrays as well as scalars. Unless I'm missing something that restricts it to just scalars.

The PR description and title talk about scalars, so maybe that behavior is unintentional.

Contributor Author

Sorry, originally I only did scalars, but now it is for small arrays as well, indeed.
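As a back-of-envelope check of the reviewer's arithmetic (assuming `tp_basicsize` is 96 bytes on a 64-bit build, as stated above; the helper name is invented):

```c
/* 96 bytes of reserved tail / 8 bytes per element = 12 elements,
 * so small arrays, not just scalars, fit inline. */
#include <stddef.h>

static size_t
inline_capacity(size_t availbytes, size_t itemsize)
{
    return itemsize != 0 ? availbytes / itemsize : 0;
}
```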

@mhvk
Contributor Author

mhvk commented Mar 31, 2026

See gh-31108 for an alternative implementation that uses a "hacked" allocator to deal with possible resize, etc., but avoids having a flag. It has the advantage that it needs far less change elsewhere in the code base, thus giving hope that if people use mem_handler outside of numpy, things will just continue to work.
