ENH: Descending sorts with built-in dtype implementations#31345
ENH: Descending sorts with built-in dtype implementations#31345MaanasArora wants to merge 51 commits into
Conversation
| static int less_equal(type const& a, type const& b) { | ||
| return !less(b, a); | ||
| } | ||
| static int greater(type const& a, type const& b) { |
There was a problem hiding this comment.
Arguably we can just create new tags, but doing this seems nicer for consolidation.
There was a problem hiding this comment.
OK, I guess it's worth looking if we can't just clean this up... and in the age of agents, maybe just with the help of an agent...
This whole tag business was always annoying. One small part that would help here, is to push the new sort loop creation into C++ so we only have a make_sort_loops_@name@(...) and everything else can use templating without having to export dozens of functions explicitly.
(And even if this tag business makes sense, I really don't get why it can't use a subclassed template to clean this up. Probably also a C++11 limitation?)
EDIT: I'll see if I can't make a gross simplification in a separate PR quickly... Done in gh-31352, probably more possible, but a minimal start that is probably sufficient for this...
| } | ||
| }; | ||
|
|
||
| template <typename Tag> |
There was a problem hiding this comment.
I think we can just use bool reverse now, maybe that wasn't possible with C++11 (where all of this text monstrosity comes from).
|
One more thing: It seems to me that both highway and x86 sort should have a descending parameter. For highway, I didn't immediately find whether its NaN handling is what we need, but we can probably just try (if not, CI should fail as we'll have some test for it :)). |
Push into C++ or do you mean
Will do this and above now! We actually already have a basic test for NaN handling for descending in this PR (though only floats I think)! Edit: I have a draft for quicksort, but let me wait for the other two PRs to merge to rebase before pushing or this will be challenging to rebase :) |
3cc3f6d to
f3b5797
Compare
|
I rebased over the other PR, and moved the sort loops registration to C++, it really is much nicer! I've pushed something broken... the issue is mainly, though, that I'm including Edit: I've moved the template code to header files, and everything works well! So if you like I can push after you have a brief pass at the line-by-line changes? |
|
I've pushed what I have now - going to clean things up and add back the |
|
I did a couple more iterations:
What is remaining is adding SIMD sorts (though not sure if that should be follow-up, it's not too invasive, but still requires some changes to x86_simd_sort), and cleaning up the But I think this should be mostly ready now! Thanks. Edit: oh, and need a release note, and some explanatory docs maybe. |
|
OK, I pushed SIMD sorts for ascending, still trying to check if descending can be included. Also added a release note. Aside from this, mainly just cleanup remaining, hopefully. Edit: nevermind, some strange bug with the SIMD on some systems... (Seems fixed now.) |
|
The SIMD bug was for np::half, seems solved now! I was able to thread descending through SIMD, but as far as I can tell even x86 simd sort doesn't sort nans to the end on descending, so haven't pushed. Closer look, and it seems nans are replaced with positive infinity for both orders. |
seberg
left a comment
There was a problem hiding this comment.
Thanks, two more small nits, I'll have another look over the tests, the large moves and things later, but I think this is settling.
(One silly thing: I find the _generic name a bit awkward, because it's exactly the other file that implements the "generic" version that uses the cmp function. Not sure whats best, maybe _templates or even swapping the name -- anyway, just a nit. )
|
Thanks for reviewing! It seems the We do need better tests for complex nans, given three "types" of (heterogenous) nans! Will try to draft some soon.
Ha, I was thinking of generics on types aka templates and somehow thought in the moment that would be clear - I'll probably just swap the names unless you have a preference for another name! |
seberg
left a comment
There was a problem hiding this comment.
Thanks, I should go through once more (but I am not sure I'll get to it this week).
But I think this is ready now so if anyone else wants to take a look, please do, because otherwise I may just make a pass put it in (probably early next week).
| r += make_sorts_<npy::timedelta_tag>(&PyArray_TimedeltaDType, "timedelta"); | ||
| r += make_string_sorts_<npy::string_tag>(&PyArray_BytesDType, "string"); | ||
| r += make_string_sorts_<npy::unicode_tag>(&PyArray_UnicodeDType, "unicode"); | ||
| r += make_sorts_<npy::half_tag>(&PyArray_HalfDType, "half"); |
There was a problem hiding this comment.
Hmmm, we don't have error handling on the other side now. (Although, it probably doesn't matter much, if this fails some Python error will bubble up and that is all that matters. -- it may be slightly nicer to deal with the errors right away, with a && chain or just a loop, although that may need making some local struct array to loop.)
There was a problem hiding this comment.
Ah right - I made a suggestion in another comment, though yeah, not sure if it's much better given it's not chained!
| * Add sorting array methods for the new types. | ||
| */ | ||
|
|
||
| register_all_sorts(); |
There was a problem hiding this comment.
| register_all_sorts(); | |
| if (register_all_sorts() < 0) { | |
| return -1; | |
| } |
Perhaps this suffices, import will crash anyway this way?
|
Thanks for reviewing! I swapped the names for |
|
Added various complex tests (maybe too complex :)) and fixed a SIMD dispatch bug. @charris brought up benchmarks in the community call, so I ran them against the latest commit. Sort BenchmarksArgsort BenchmarksBasically:
Not sure if we should add either in this PR though, the diff may explode. Radixsort needs to be done sometime, as it's a significant regression, though (perhaps moreso for bytes and bools, which are not benchmarked)... |
…method types instead
Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
|
Yeah, checking benchmarks was still missing and clearly important! The heapsort thing seems also like a simple bug: We weren't using heapsort before, but now you are mapping heap-sort to stable rather than unstable. The sort path should be clearing the HEAPSORT flag before setting it for the sort parameter flags (and then just delete it from the enum). |
|
Thanks! Added back radixsort (just a rename for the file), and fixed the heapsort bug (you were right, it was in Sort BenchmarksArgsort benchmarks |
|
OK, to wrap the current state: I would like to push forward, I think. Right now the performance for descending sorts is lower if we should be using x86-simd sort, because it's NaN handling is incompatible. That isn't ideal and it is unclear if bad performance for Otherwise, there are no performance changes (beyond small random things) for the existing sorts. |
This is a draft PR depending on #31278 that enables descending sorts on the Python side and adds support to built-in dtypes (bools, ints, floats, dates). It copies the comparison functions rather than the sorts, but the template wrappers are still a bit verbose. I added some tests for only sorting (not argsorting) but more are needed.
Note that SIMD sorts are not used for descending, nor is radixsort, as these would require more changes to the sort functions themselves (radixsort uses raw memory, for example).
AI Disclosure
Claude wrote the
(arg)sort_*_descendfunctions (not the comparisons), and I looked over very briefly so far.