ENH, API: New sorting mechanism for DType API #28516
…lic API
@@ -485,16 +485,16 @@ typedef int (PyArray_CompareFuncWithDescr)(const void *, const void *,
                                            PyArray_Descr *);
 typedef int (PyArray_SortCompareFunc)(const void *, const void *,
                                       PyArray_Descr *);
-typedef int (PyArray_SortFunc)(PyArrayMethod_Context *,
+typedef int (PyArray_SortFuncWithContext)(PyArrayMethod_Context *,
                                void *, npy_intp,
                                NpyAuxData *);
-typedef int (PyArray_ArgSortFunc)(PyArrayMethod_Context *,
+typedef int (PyArray_ArgSortFuncWithContext)(PyArrayMethod_Context *,
                                   void *, npy_intp *, npy_intp,
                                   NpyAuxData *);
These two need different names, and you need to leave the original typedefs in.

Thanks for reviewing! This is done.
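To make the renamed typedef concrete, here is a minimal sketch of a function with the `PyArray_SortFuncWithContext` shape from the diff. It is not code from this PR: the NumPy types are replaced by stand-ins so the block compiles on its own, the `descending` field on the context is purely hypothetical, and `sort_doubles` and `example_sort` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without NumPy headers.
 * The real types are defined by NumPy and look different. */
typedef intptr_t npy_intp;
typedef struct {
    int descending; /* hypothetical field, for illustration only */
} PyArrayMethod_Context;
typedef struct NpyAuxData_tag NpyAuxData;

/* Same shape as the context-taking typedef in the diff above. */
typedef int (PyArray_SortFuncWithContext)(PyArrayMethod_Context *,
                                          void *, npy_intp,
                                          NpyAuxData *);

/* In-place insertion sort over doubles. The (hypothetical) context
 * field selects the direction. Returns 0 on success. */
static int
sort_doubles(PyArrayMethod_Context *context, void *data, npy_intp n,
             NpyAuxData *auxdata)
{
    double *v = data;
    (void)auxdata; /* unused in this sketch */
    for (npy_intp i = 1; i < n; i++) {
        double key = v[i];
        npy_intp j = i;
        while (j > 0 && (context->descending ? v[j - 1] < key
                                             : v[j - 1] > key)) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
    return 0;
}

/* The function is assignable to a pointer of the typedef'd type. */
PyArray_SortFuncWithContext *example_sort = &sort_doubles;
```

Because the context-taking signature lives under its own name, a separate typedef with the original name can keep its original signature, which is what the review comment above asks for.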

 typedef int *(PyArrayDTypeMeta_GetSortFunction)(PyArray_Descr *,
-        npy_intp, int, PyArray_SortFunc **, NpyAuxData **);
+        npy_intp, int, PyArray_SortFuncWithContext **, NpyAuxData **);
 typedef int *(PyArrayDTypeMeta_GetArgSortFunction)(PyArray_Descr *,
-        npy_intp, int, PyArray_ArgSortFunc **, NpyAuxData **);
+        npy_intp, int, PyArray_ArgSortFuncWithContext **, NpyAuxData **);

New stuff in the public API needs new API docs as well as a release note describing the new features.

Maybe also as a proof of concept: it looks like both quaddtype and mpfdtype in numpy-user-dtypes implement sorting. Would you be willing to update them to use the new API in a PR to numpy-user-dtypes that depends on this PR to numpy? That should give you a feeling for whether this API is helpful for someone writing a new user dtype. It'll also be a form of documentation, since we don't have great docs for writing user dtypes besides the examples in numpy-user-dtypes.

Also what should we do about the flags that got added before we made the dtype API public, e.g.

It's easy to generate a deprecation warning during registration (a bit tedious maybe, as you need an explicit check).

Sure, I'll add API docs and a release note, and I'm willing to make a PR to numpy-user-dtypes! Will look into that. Just to be clear,

We can't change slot numbers (unless they are guarded as private)! So the numbers are fixed (at least until they have gone unused for a while).
So, we just have to live with the numbering we got. I half thought I asked for an offset for the [^depr] I think this is as simple as asking users to compile with the new NumPy, and then adding

There is an offset, numpy/numpy/_core/src/multiarray/dtypemeta.h Lines 94 to 95 in 9389862
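The "explicit check" at registration mentioned above could look roughly like the following. This is a sketch under loud assumptions, not NumPy's implementation: `ExampleSlot`, the slot ids, and `example_warn_deprecated_slots` are all hypothetical, the slot array is assumed to be zero-terminated, and the warning goes to `stderr` where real code would presumably go through Python's warning machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one (slot id, function pointer) entry in a
 * zero-terminated slot array passed at DType registration. */
typedef struct {
    int slot;
    void *pfunc;
} ExampleSlot;

/* Hypothetical slot ids. The point from the thread: once published,
 * these numbers stay fixed, so an old slot can only be deprecated,
 * never renumbered. */
enum {
    EXAMPLE_SLOT_OLD_SORT = 1, /* hypothetical: superseded slot */
    EXAMPLE_SLOT_NEW_SORT = 2  /* hypothetical: its replacement */
};

/* Walk the slot array once and warn for each deprecated slot id.
 * Returns the number of deprecated slots that were found. */
int
example_warn_deprecated_slots(const ExampleSlot *slots)
{
    int n_deprecated = 0;
    for (; slots->slot != 0; slots++) {
        if (slots->slot == EXAMPLE_SLOT_OLD_SORT) {
            fprintf(stderr,
                    "DeprecationWarning: slot %d is deprecated, "
                    "use slot %d instead\n",
                    EXAMPLE_SLOT_OLD_SORT, EXAMPLE_SLOT_NEW_SORT);
            n_deprecated++;
        }
    }
    return n_deprecated;
}
```

The check is tedious only in the sense that every deprecated id needs its own branch; the old id keeps working in the meantime.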

 #endif /* NUMPY_CORE_INCLUDE_NUMPY___DTYPE_API_H_ */
The naming is a bit weird here, but I didn't want to disturb the original type as it's used a lot. I think the `SortCompareFunc` should still be a unique type, so I will do that (even if it is only a clone of this type).

I have slightly mixed feelings. On the one hand, I think this is the pragmatic thing to have.
On the other hand, we could also look up this function from the `np.less` or `np.greater` ufunc to implement sorting, I think.
(The problem there is still how to deal with unordered elements; a compare ufunc would work better...)

But then again, it seems pragmatic: even if it won't work well for e.g. structured dtypes (performance issues), it will always work and provides an easy entry point (we can also use this to define default comparison ufuncs).

So overall, I think I end up at just doing this, although I could imagine punting if we don't need it for `StringDType` (I suspect we do, though).

Would like to hear if @ngoldbaum has an opinion.

(A neater future path would also be if this was more of a header-only code binding generator job, maybe with us making the sorting patterns available. I.e. if this was defined in a C++ class and our sort code was available, the DType could compile the full loop and avoid calling such a helper everywhere.)
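As an illustration of the "easy entry point" argument, the sketch below drives a generic sort entirely through a three-way compare callback with the `PyArray_SortCompareFunc` shape from the diff. It is an assumption-laden example, not the PR's code: `PyArray_Descr` is replaced by a stand-in holding only an element size, and `example_generic_sort` and `example_compare_double` are hypothetical names. The three-way return value is also what lets the callback decide where unordered elements such as NaN go, which a plain less-than cannot express.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PyArray_Descr; only the element size is modelled. */
typedef struct {
    size_t elsize;
} PyArray_Descr;

/* Same shape as the compare typedef shown in the diff. */
typedef int (PyArray_SortCompareFunc)(const void *, const void *,
                                      PyArray_Descr *);

/* Generic insertion sort: it knows nothing about the element type and
 * only moves bytes and calls the compare callback. Returns 0 on
 * success, -1 if the element is too large for the scratch buffer. */
int
example_generic_sort(void *data, size_t n, PyArray_Descr *descr,
                     PyArray_SortCompareFunc *cmp)
{
    char *base = data;
    size_t sz = descr->elsize;
    char tmp[64];
    if (sz > sizeof(tmp)) {
        return -1;
    }
    for (size_t i = 1; i < n; i++) {
        memcpy(tmp, base + i * sz, sz);
        size_t j = i;
        while (j > 0 && cmp(base + (j - 1) * sz, tmp, descr) > 0) {
            memcpy(base + j * sz, base + (j - 1) * sz, sz);
            j--;
        }
        memcpy(base + j * sz, tmp, sz);
    }
    return 0;
}

/* Three-way compare for doubles that places NaN after every number. */
int
example_compare_double(const void *a, const void *b, PyArray_Descr *descr)
{
    double x, y;
    (void)descr; /* unused in this sketch */
    memcpy(&x, a, sizeof(x));
    memcpy(&y, b, sizeof(y));
    int x_nan = (x != x);
    int y_nan = (y != y);
    if (x_nan || y_nan) {
        return x_nan - y_nan;
    }
    return (x > y) - (x < y);
}
```

The cost is one indirect call per comparison, which is the performance concern raised above for structured dtypes.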
IMO this is fine, if only because it exists right now 😄