BUG: Use dict to store ufunc loops (fixes thread-safety) by seberg · Pull Request #31184 · numpy/numpy

seberg · 2026-04-07T18:03:46Z

Change _loops to use a dict instead to be able to use setdefault for adding new loops.

A dict is also ordered, although we need to use `.items()` to iterate it. Otherwise things get nicer overall (although the key is repeated in the value for now, to simplify passing the info around).
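As a rough Python-level sketch of the idea (the actual change is in C; `loops`, `add_loop`, and the string stand-ins for DTypes and methods are made up for illustration): `dict.setdefault` either inserts the new entry or hands back the one already stored, and iterating `.items()` follows insertion order.

```python
# Hypothetical stand-in for ufunc->_loops: DType tuple -> (DType tuple, method).
loops = {}


def add_loop(dtypes, method):
    """Register `method` for `dtypes`; return True only if it was newly added."""
    info = (dtypes, method)  # the key is repeated inside the value, as in the PR
    stored = loops.setdefault(dtypes, info)
    return stored is info


added_first = add_loop(("int64", "int64"), "int_add")
added_second = add_loop(("float64", "float64"), "float_add")
added_again = add_loop(("int64", "int64"), "other_int_add")  # key already present

# Iteration goes through .items() and follows insertion order.
registered = [(key, info[1]) for key, info in loops.items()]
print(added_first, added_second, added_again)
print(registered)
```

The second registration for the same DType tuple does not overwrite the first; the caller can tell from the return value that nothing was added.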

Closes gh-31112

This could even be a global mutex, since adding new loops in a deallocator is not reasonable. Of course, there are other complexities around promotion and the loop cache that are indirectly related, and it may thus be that we'll want to use this (or a similar) mutex more widely. I.e. this does _not_ enable simple thread-safe dynamic addition of user loops (users would have to lock themselves and track whether they have already added the loop dynamically).

seberg · 2026-04-08T17:42:09Z

While a mutex/critical section should be OK here (although I forgot to replace the PySequence_Fast_GET_ITEM with a PyList_GetItemRef), replacing it with a dict seems a bit nicer (as it has the builtin setdefault which is nicely atomic and even informs if things were added).
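To illustrate the "atomic and tells you whether it added something" point at the Python level, here is a small sketch, assuming CPython and keys with built-in hashing. The names `register`, `winners`, and the string method stand-ins are hypothetical. Several threads race to register the same key, and exactly one of them ends up owning the stored entry.

```python
import threading

loops = {}
key = ("int64", "int64")
n_threads = 8
winners = []  # indices of threads whose candidate was actually inserted
barrier = threading.Barrier(n_threads)


def register(i):
    candidate = (key, f"method-from-thread-{i}")
    barrier.wait()  # release all threads at roughly the same time
    stored = loops.setdefault(key, candidate)
    if stored is candidate:  # identity check: did *we* insert it?
        winners.append(i)


threads = [threading.Thread(target=register, args=(i,)) for i in range(n_threads)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(loops), len(winners))
```

Every thread sees the same stored entry afterwards, so losers can compare their method against the winner's instead of silently replacing it.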

The downside is that the best solution for the actual promotion is probably just to call tuple(dict), but since we normally hit the cache, that should probably be fine.
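A minimal single-threaded sketch of why a snapshot helps (the dict contents are invented for illustration): iterating a dict that grows during the loop raises, while iterating `tuple(dict)` works on a fixed copy of the keys.

```python
def fresh_loops():
    # Hypothetical registry contents, not real NumPy loops.
    return {("int64",): "int_loop", ("float64",): "float_loop"}


# Iterating the live dict while adding entries fails.
live = fresh_loops()
try:
    for key in live:
        live[key + ("promoted",)] = "added during iteration"
    live_iteration_failed = False
except RuntimeError:
    live_iteration_failed = True

# Iterating a snapshot of the keys is unaffected by additions.
snap = fresh_loops()
seen = []
for key in tuple(snap):
    seen.append(key)
    snap[key + ("promoted",)] = "added during iteration"

print(live_iteration_failed, seen, len(snap))
```

The cost is one copy per iteration, which matches the remark that this only matters when the cache is missed.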

seberg · 2026-04-08T20:29:36Z

    int nin = ufunc->nin, nargs = ufunc->nargs;
-    Py_ssize_t size = PySequence_Length(ufunc->_loops);
+    int ret = -1;
+    /* PyDict_Values returns a snapshot, safe against concurrent additions. */


Yeah, I used cursor/claude (although had to tell it to modify almost everything to my liking... just because you set out to make something thread-safe, doesn't mean it quite remembered that part, hehe), had to tell it off to reduce its comment load, but this seems pretty OK ;).

One note: I decided to keep the tuple -> (tuple, meth/promoter) for now. Just like that one `// borrow-ref OK` comment, this is currently not designed to allow deletion and it would just add churn here...
(I am not even sure if we want to ever delete things, rather than just put it somewhere else... even if we allow replacing, it shouldn't happen often and immortalizing may be pragmatic.)

~~EDIT: As a note, I suspect all is good. But I'll go once more myself tomorrow morning to see if there aren't ref-counting issues or so.~~ (done)

So, the value of the dict effectively contains a tuple (key, real-value)? Or can the tuple of dtypes possibly be different? If not, then my sense would be to remove the key from the value, to help the logic: a dict keyed by the dtypes that returns a promotor or method seems much clearer conceptually.

That said, I think it makes sense to not make that change here if it is more intrusive, to keep this PR focussed on thread safety.

Right, the thing is that info currently ends up in the cache (where the key isn't identical anymore) as borrowed.

So it looks awkward, but changing it seems really awkward too.

The practical refactor might be different: Right now there is an awkward distinction between "BoundArrayMethod" and the "ArrayMethod". If we remove it, the need for the tuple should just go away (of course the DTypes will still be duplicated, but it shouldn't feel as awkward ;)).

(The backstory is to accept that DTypes are immortal -- with a plausible exception for some future dynamically created subclasses of a superclass DType which would have no loops registered for the subclass. At the time that felt, and was pointed out, like we should keep it plausible to make DTypes mortal in practice and that requires the split. But it was never practical -- Python doesn't even have the necessary technology to do it nicely/correctly!)

OK, that makes sense -- and getting rid of bound vs unbound (a difference I don't fully understand...) sounds like a plan!

Too much information, so feel free to ignore :).

> and getting rid of bound vs unbound (a difference I don't fully understand...)

Like a bound method: A bound method keeps self alive, a BoundArrayMethod keeps the DTypes (i.e. the multiple "self") alive.
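The plain-Python half of that analogy can be checked directly. This is only a sketch of ordinary bound-method behaviour (the `Greeter` class is invented), not of NumPy's BoundArrayMethod.

```python
import gc
import weakref


class Greeter:
    def hello(self):
        return "hi"


obj = Greeter()
obj_ref = weakref.ref(obj)  # lets us observe whether the object is still alive
bound = obj.hello           # bound method: holds a strong reference to `obj`

del obj
gc.collect()
alive_while_bound = obj_ref() is not None
same_object = bound.__self__ is obj_ref()

del bound                   # dropping the bound method releases `self`
gc.collect()
alive_after = obj_ref() is not None

print(alive_while_bound, same_object, alive_after)
```

The unbound function `Greeter.hello` holds no such reference, which is the role the plain ArrayMethod plays in the comparison.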

With that distinction it was plausible to at least clean up many/most cases (still annoying as the key needs to be something like a weakref).

Without... Maybe it is still plausible if maybe more awkward; but all cases would require technology Python doesn't have (i.e. ephemerons).

In a variation, too much ignorance, so feel free to ignore :)

But why do array methods need to keep dtypes alive at all? Isn't it only the other way around, an otherwise unused array method should just disappear if all dtypes that use it have disappeared? Or, trying not to be completely ignorant, is it because the array method has internal links to the dtypes it works for? (I really should re-read the relevant NEPs and documentation and try to improve/summarize them myself, so people like me who have an incomplete idea can understand how it all works...)

Since I started to try to explain it, I'll continue. But to be clear: The only thing that matters is if there is a real reason to think that cleanup matters...
I am not 100% sure, but to me it seems rather implausible (it still works for dynamically created DTypes if you only register their loops via a superclass DTypes -- although at least casts don't work for that pattern right now).

I doubt the NEPs help; this is just unwieldy to think about, but a pretty small detail in a sense...

We need the DTypes when executing any ArrayMethod at some point (we also pass them in to `resolve_descriptors`, etc., but we need them for more things before that as well).
So at that time you need them guaranteed alive in some form, and those DTypes aren't the ones passed in by the user:

1. Because promotion might have happened.
2. If output DTypes differ from inputs, then those need to be kept somewhere and at least in principle such an ArrayMethod (or ufunc loop) should keep the output DType alive as long as it is callable.

Now, I think at the time I mostly said that 2 is the thing that just can't work (except maybe for casts).

The thing you can possibly make work, is if the ArrayMethods can become "invalid". I don't think I really thought about it at the time (as it breaks 2).
Having a BoundArrayMethod solves the need to "invalidate" an ArrayMethod (whether we can be sure to clean it up or not), while allowing us to have an object that represents the full thing.

Now, the only truly correct path is if the ufunc loop (effectively) keeps its DTypes alive but is cleaned up when it becomes inaccessible. If the ArrayMethod doesn't own the DTypes this can work for many cases (e.g. if all input and output DTypes are identical and maybe casts -- if all are the same, a weakref can be used).

But for all other cases you need an Ephemeron to spell something like: As long as the input DType is alive, this ArrayMethod/key is alive and keeps all the (output) DTypes alive.
(If Python had an ephemeron aware GC, the cyclic GC could then figure out if a DType becomes truly inaccessible.)
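A sketch of the gap, using `weakref.WeakKeyDictionary` as the closest thing Python offers. The `DType` and `Method` classes and the `registry` are hypothetical stand-ins. The mapping behaves like "value lives as long as the key" only when the value does not refer back to the key; once it does, the key can no longer be collected, which is what an ephemeron would fix.

```python
import gc
import weakref


class DType:  # hypothetical stand-in for a DType class
    def __init__(self, name):
        self.name = name


class Method:  # hypothetical stand-in for an ArrayMethod owning its output DType
    def __init__(self, out_dtype):
        self.out_dtype = out_dtype


registry = weakref.WeakKeyDictionary()  # input DType -> Method

# Case 1: the method only references a *different* (output) DType.
in_a, out_a = DType("A"), DType("A_out")
registry[in_a] = Method(out_a)
out_a_ref = weakref.ref(out_a)
del in_a, out_a
gc.collect()
case1_entries = len(registry)          # entry vanished together with the key
out_collected = out_a_ref() is None    # and the output DType went with it

# Case 2: the method references its own key (output DType == input DType).
in_b = DType("B")
registry[in_b] = Method(in_b)
in_b_ref = weakref.ref(in_b)
del in_b
gc.collect()
case2_entries = len(registry)          # entry is stuck
leaked = in_b_ref() is not None        # the value keeps its own key alive

print(case1_entries, out_collected, case2_entries, leaked)
```

In the second case the method would have to hold its DType weakly (or not at all) to be collectable, which is where the "invalid" state discussed in this thread comes from.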

But as you can (hopefully) see, you do need a split if you want cleanup in the generic case. Right now it is a (barely used/existing) BoundArrayMethod vs. ArrayMethod. If the ArrayMethod knew its DTypes but didn't keep them alive, then it would mean that it needs to have an "invalid" state.
And technically, if you expose it to Python you still kinda need the BoundArrayMethod as at least in that case it should guarantee staying valid!

(I suppose I could also store the BoundArrayMethod as the info for now, but it is also a step in the "we'll never implement cleanup anyway" direction ;).)

Anyway, this is too much and I dunno if it can be understood :). In the end, the question is really if there actually is some reason to care about plausible generic cleanup... My gut feeling is "no" as it is just really impractical, but if there is a sensible reason, then keeping the current split may be better to not move in a direction that would make it even harder.

Thanks, yes, the output dtype is a tricky one. Conceptually, I think it doesn't strictly need to be on the array method:

- For dtype casts, the casts themselves could be a dict of methods keyed by output dtype, so that the cast machinery rather than the array method becomes responsible for keeping the references.
- For ufuncs, I guess the method would need to be the value in a dict keyed by both input and output dtype, and anything that is incomplete has to be handled by a promoter.

But, really, there is a lot to be said for an arraymethod specification to be self-describing, which does mean it is logical to have references to the dtypes.

And I think you are right that throw-away dtypes that also have their own loops are an unlikely prospect, so in practice there really should not be a problem.

mhvk

Overall, this looks nice! Indeed, the dict seems a more logical solution to the problem.

But it does seem to me that, perhaps in follow-up, we should move to have a dict that just has the promoter or array method as its value.

mhvk · 2026-04-09T05:57:21Z


ngoldbaum

Spotted a couple minor issues (with the help of Claude). Also ping @kumaraditya303 to give this a once-over.

Also can you update the PR description so this gets merged with a more relevant commit message?

ngoldbaum · 2026-04-09T14:47:12Z

+    }
+    if (existing_info != NULL) {
+        PyObject *existing_meth = PyTuple_GET_ITEM(existing_info, 1);
+        Py_DECREF(existing_info);


I think it's possible this decref could cause the last reference to existing_meth to go away. Since you have a borrowed reference, that means existing_meth is possibly invalid after here. There might be larger architectural reasons why that's not possible, so a comment explaining might be nice. You could also move the existing_info decref to line 272 or below, to ensure existing_meth stays alive until its last use.

The dict holds a strong ref to existing_info so this is safe, but it would still be nice to move the decref.
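Python has no borrowed references, so the following is only a loose analogy with invented names: the method inside the `(key, method)` tuple stays alive after the local reference to the tuple is dropped only because the dict still owns the tuple. If entries could ever be removed, that guarantee would disappear.

```python
import gc
import weakref


class Method:  # hypothetical stand-in for an ArrayMethod
    pass


loops = {}
key = ("int64", "int64")
loops[key] = (key, Method())

existing_info = loops[key]                  # our own strong reference to the tuple
meth_ref = weakref.ref(existing_info[1])    # observe the method without owning it
del existing_info                           # loosely analogous to Py_DECREF(existing_info)
gc.collect()
alive_while_in_dict = meth_ref() is not None   # the dict still owns the tuple

del loops[key]                              # hypothetical removal of the entry
gc.collect()
alive_after_removal = meth_ref() is not None   # nothing owns the method any more

print(alive_while_in_dict, alive_after_removal)
```

Moving the decref below the last use, as suggested here, removes the dependence on the dict never dropping entries.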

ngoldbaum · 2026-04-09T14:49:24Z

+    }
+    if (existing_item != NULL) {
+        PyObject *registered = PyTuple_GET_ITEM(existing_item, 1);
+        Py_DECREF(existing_item);


same pattern here, either leave a comment explaining why this is safe or move the decref below the last use of registered.

Yeah, nicer to reorganize, claude is right, that claude didn't do this perfectly ;) (even if fine as the old code also relied on borrowing being OK)

I think there's still an issue - the DECREF has to be after the last use of registered on line 5259 below.

ngoldbaum · 2026-04-09T14:58:51Z

I tried to trigger the crash on my Mac for about 5 minutes running the command in #31112 (comment) and didn't see any failures. Hard to prove a negative but I think we can conclude this avoids the race that causes the crash. main crashes after about 30 seconds of looping on that test on the same Mac.

kumaraditya303 · 2026-04-09T15:07:42Z

        /* New private fields related to dispatching */
        void *_dispatch_cache;
-        /* A PyListObject of `(tuple of DTypes, ArrayMethod/Promoter)` */
+        /* Ordered dict `tuple of DTypes -> (tuple of DTypes, ArrayMethod/Promoter)` */


I find it a little confusing because there is a separate collections.OrderedDict. Dicts are insertion ordered by default, so maybe just remove the "ordered"?

True, but I want to remind whoever is about to change it that the order may matter.

ngoldbaum

One last comment, I think there's still a refcounting issue in one of the code paths I commented on. Also the PR description still needs a rewrite before merging. Approving though since these are just touchup-level changes.

ngoldbaum · 2026-04-09T16:25:36Z


kumaraditya303

LGTM, thanks for fixing this

ngoldbaum · 2026-04-09T21:28:45Z

Thanks Sebastian!

seberg mentioned this pull request Apr 7, 2026

BUG: flaky test under pytest-run-parallel #31112

Closed

seberg force-pushed the add-loop-mutex branch from 8c43cbd to 1196738 Compare April 7, 2026 18:10

charris added the 00 - Bug label Apr 7, 2026

Instead, use a dict and rewire things

a17f4b5

seberg force-pushed the add-loop-mutex branch from 38cbfda to a17f4b5 Compare April 8, 2026 20:26

seberg commented Apr 8, 2026


mhvk reviewed Apr 9, 2026


seberg changed the title ~~BUG: Make PyUFunc_AddLoop thread-safe with a per-ufunc mutex~~ BUG: Use dict to store _loops (fixes thread-safety) Apr 9, 2026

seberg changed the title ~~BUG: Use dict to store _loops (fixes thread-safety)~~ BUG: Use dict to store ufunc loops (fixes thread-safety) Apr 9, 2026

ngoldbaum reviewed Apr 9, 2026


ngoldbaum added this to the 2.5.0 Release milestone Apr 9, 2026

seberg commented Apr 9, 2026


Comment thread numpy/_core/src/umath/ufunc_object.c

kumaraditya303 reviewed Apr 9, 2026


Comment thread numpy/_core/src/umath/dispatching.cpp Outdated

seberg added 2 commits April 9, 2026 17:21

Don't make the code rely on borrowed refs (fine for now but not nice)

55f1105

Delete unnecessary late init of ufunc->_loops

e0f5db0

ngoldbaum approved these changes Apr 9, 2026


move all of the if out

0a17b4c

kumaraditya303 approved these changes Apr 9, 2026


ngoldbaum merged commit 563c604 into numpy:main Apr 9, 2026
86 checks passed
