BUG: scalars missing several methods for array api compat #27305

Illviljan · 2024-08-28T20:44:36Z

Describe the issue:

I keep getting stuck trying to get the tests in https://github.com/data-apis/array-api-tests to pass. The final errors are often caused by NumPy arrays having turned into scalars, which the tests do not expect (see https://data-apis.org/array-api/draft/API_specification/array_object.html).

Reading #26850 it seems the intention is that the scalars should support the array api spec?

Reproduce the code example:

import numpy as np
import array_api_strict as xps


# Some examples:
xps.mean(xps.asarray(4, dtype=xps.float32)).__iadd__(1)
np.mean(np.asarray(4, dtype=np.float32)).__iadd__(1) # AttributeError: 'numpy.float32' object has no attribute '__iadd__'

xps.all(xps.asarray(True, dtype=xps.bool)).__ior__(False)
np.all(np.asarray(True, dtype=np.bool)).__ior__(False)  # AttributeError: 'numpy.bool' object has no attribute '__ior__'

xps.mean(xps.asarray(4, dtype=xps.float32)).__complex__()
np.mean(np.asarray(4, dtype=np.float32)).__complex__() # AttributeError: 'numpy.float32' object has no attribute '__complex__'

Error message:

No response

Python and NumPy Versions:

import sys, numpy; print(numpy.__version__); print(sys.version)
2.1.0
3.12.2 | packaged by conda-forge | (main, Feb 16 2024, 20:42:31) [MSC v.1937 64 bit (AMD64)]

Runtime Environment:

No response

Context for the issue:

No response

ngoldbaum · 2024-08-29T17:31:25Z

So one issue with the in-place operators is that scalars are immutable. Maybe we could implement them but make them return a copy? But maybe an in-place operation returning a copy is confusing?

Implementing __complex__ makes sense. I guess the reason it's missing is there isn't a PyNumberMethod slot for it as far as I can see: https://docs.python.org/3/c-api/typeobj.html. You'd probably need to "manually" define a function named __complex__.

ngoldbaum · 2024-09-04T18:14:33Z

This is the current behavior:

In [6]: a = np.int64(3)

In [7]: id(a)
Out[7]: 2199088643840

In [8]: a += 3

In [9]: id(a)
Out[9]: 2199088641120

I think implementing __iadd__ and making it return a copy doesn't actually change this behavior. So we should do that.

I added a 2.2.0 milestone.

I think __iadd__ and __ior__ are probably the easiest since there are PyNumberMethod slots for them.

For __complex__ someone would need to define a python function (in C) named __complex__ that does the conversion to a PyComplex.
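To make the `__complex__` gap concrete at the Python level, here is a minimal sketch. The classes `OnlyFloat` and `WithComplex` are hypothetical stand-ins, not NumPy types. It shows that `complex(obj)` already works through the `__float__` fallback, while calling `obj.__complex__()` directly only works once the method is explicitly defined.

```python
class OnlyFloat:
    """Hypothetical scalar-like type: defines __float__ but not __complex__."""

    def __init__(self, value):
        self._value = value

    def __float__(self):
        return float(self._value)


class WithComplex(OnlyFloat):
    """Same hypothetical type with an explicit __complex__ method added."""

    def __complex__(self):
        return complex(self._value, 0.0)


x = OnlyFloat(4)
y = WithComplex(4)

# complex() falls back to __float__ when __complex__ is not defined,
# so the conversion itself succeeds for both objects.
print(complex(x))                 # (4+0j)
print(complex(y))                 # (4+0j)

# Only the direct dunder call distinguishes the two.
print(hasattr(x, "__complex__"))  # False
print(y.__complex__())            # (4+0j)
```

Python's built-in `float` behaves like `OnlyFloat` here: it has no `__complex__` attribute, yet `complex(1.0)` works.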

Illviljan · 2024-09-06T05:07:55Z

Setting up array-api-tests to run with scalars will probably be helpful as well.
It might smoke out more missing methods.

Ishankoradia · 2024-09-30T17:02:17Z

@ngoldbaum have we decided to do this ?
I wanted to give it a shot.

ngoldbaum · 2024-09-30T17:05:35Z

@Ishankoradia go ahead. Having a milestone means we're planning to do it. We also don't claim issues, just go ahead and work on it.

Ishankoradia · 2024-09-30T17:32:20Z

Gotcha !!

@ngoldbaum I am looking at core/__init__.pyi, and I can see that the __iadd__ method only accepts NDArray. I am guessing my first step would be to add a method overload that accepts scalars. Am I headed in the right direction?

ngoldbaum · 2024-09-30T17:41:01Z

No, in order to implement these functions you'll need to modify the C internals of NumPy.

Here is where the PyNumberMethods struct is set up for all of the NumPy scalar types:

numpy/numpy/_core/src/umath/scalarmath.c.src

Lines 1974 to 1997 in f37a4f3

    
    static PyNumberMethods @name@_as_number = {
        .nb_add = (binaryfunc)@name@_add,
        .nb_subtract = (binaryfunc)@name@_subtract,
        .nb_multiply = (binaryfunc)@name@_multiply,
        .nb_remainder = (binaryfunc)@name@_remainder,
        .nb_divmod = (binaryfunc)@name@_divmod,
        .nb_power = (ternaryfunc)@name@_power,
        .nb_negative = (unaryfunc)@name@_negative,
        .nb_positive = (unaryfunc)@name@_positive,
        .nb_absolute = (unaryfunc)@name@_absolute,
        .nb_bool = (inquiry)@name@_bool,
        .nb_invert = (unaryfunc)@name@_invert,
        .nb_lshift = (binaryfunc)@name@_lshift,
        .nb_rshift = (binaryfunc)@name@_rshift,
        .nb_and = (binaryfunc)@name@_and,
        .nb_xor = (binaryfunc)@name@_xor,
        .nb_or = (binaryfunc)@name@_or,
        .nb_int = (unaryfunc)@name@_int,
        .nb_float = (unaryfunc)@name@_float,
        .nb_floor_divide = (binaryfunc)@name@_floor_divide,
        .nb_true_divide = (binaryfunc)@name@_true_divide,
        /* TODO: This struct/initialization should not be split between files */
        .nb_index = (unaryfunc)NULL,  /* set in add_scalarmath below */
    };

This is inside a file that is written in NumPy's custom templating language used for codegen internally.

You can see that none of the inplace methods listed in the CPython docs are implemented. I think we need to implement all of them, not just __iadd__ and __ior__. You'd also need to add tests and I'd also double check my analysis above that defining in-place operators that return copies is actually OK. Do other array libraries do that?
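One way to approach the "do other libraries do that?" question empirically is to introspect which in-place dunder methods a type exposes. The sketch below uses a hypothetical helper, `inplace_methods`, and only checks built-in Python types; the same helper could be pointed at a NumPy scalar type or another library's scalar type.

```python
# In-place operator methods from the Python data model.
INPLACE_DUNDERS = [
    "__iadd__", "__isub__", "__imul__", "__imatmul__",
    "__itruediv__", "__ifloordiv__", "__imod__", "__ipow__",
    "__ilshift__", "__irshift__", "__iand__", "__ixor__", "__ior__",
]


def inplace_methods(tp):
    """Return the in-place dunder methods that the type `tp` defines."""
    return [name for name in INPLACE_DUNDERS if hasattr(tp, name)]


# Immutable built-in numbers define none of them.
print(inplace_methods(int))    # []
print(inplace_methods(float))  # []

# A mutable container defines the ones it can apply in place.
print(inplace_methods(list))   # ['__iadd__', '__imul__']
```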

For __complex__, as noted above, you'd need to add an entry and implementation for __complex__ to the PyMethodDef array for the scalar types:

numpy/numpy/_core/src/multiarray/scalartypes.c.src

Line 2586 in f37a4f3

static PyMethodDef gentype_methods[] = {


If you've never touched the CPython C API before this is probably a big project, although tbh it would be a decent way to learn about the C internals of NumPy or how to work with Python C extensions using the C API directly.

Ishankoradia · 2024-09-30T17:48:00Z

Thanks a ton @ngoldbaum , this helps a lot.
I think I can do it. I have limited knowledge of Python C extensions, but like you said, what better way to learn them and the NumPy C internals.

ngoldbaum · 2024-09-30T17:51:58Z

I found going through https://docs.python.org/3/extending/extending.html and https://llllllllll.github.io/c-extension-tutorial/ helped immensely to understand this stuff better.

Ishankoradia · 2024-09-30T17:52:44Z

Got it !! I will read through them before I dive in.

Thank you for sharing them.

Ishankoradia · 2024-10-04T15:46:00Z

@ngoldbaum I have spent a lot of time reading the material you shared. I have a good understanding of how C extensions work. I built two (easy ones) and was able to run them from Python.

I was looking at the template file you pointed out, scalarmath.c.src. It's very interesting. I think all @name@ placeholders are replaced by the correct dtypes, and I also see their corresponding method implementations in the compiled file scalarmath.c. But I can't figure out where the source code for those functions (e.g. @name@_add) is coming from. Could you point me to that file?

[updated]
Ohh, is the function @name@_ctype_add the one used for .nb_add = (binaryfunc)@name@_add?

ngoldbaum · 2024-10-07T08:42:15Z

Hi, sorry for taking a few days to respond, I've been on vacation.

The @name@ is part of the template system NumPy uses for codegen. All files with .c.src extensions use this template system.

The templating for the block of code I linked to is set up immediately above that code:

numpy/numpy/_core/src/umath/scalarmath.c.src

Lines 1968 to 1973 in f37a4f3

    
    /**begin repeat
     *  #name = byte, ubyte, short, ushort, int, uint,
     *          long, ulong, longlong, ulonglong,
     *          half, float, double, longdouble,
     *          cfloat, cdouble, clongdouble#
    **/

Here, the template system is saying to replace @name@ with each of the names in the comma-separated list, one for each scalar type.
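The substitution described above can be imitated in a few lines of Python. This is a toy sketch only, not NumPy's actual template processor: `parse_names` and `expand` are hypothetical helpers, and the header is shortened to six names.

```python
import re

HEADER = """
/**begin repeat
 *  #name = byte, ubyte, short, ushort,
 *          int, uint#
 **/
"""

BODY = ".nb_add = (binaryfunc)@name@_add,"


def parse_names(header):
    """Extract the values from a '#name = a, b, c#' template header."""
    match = re.search(r"#name\s*=\s*(.*?)#", header, flags=re.DOTALL)
    # Continuation lines inside the comment start with '*'; drop those markers.
    raw = match.group(1).replace("*", " ")
    return [part.strip() for part in raw.split(",") if part.strip()]


def expand(body, names):
    """Emit one copy of `body` per name, with @name@ replaced."""
    return "\n".join(body.replace("@name@", name) for name in names)


names = parse_names(HEADER)
print(names)
print(expand(BODY, names))
```

Each generated line refers to a per-type function such as `byte_add` or `uint_add`, which is why those names only appear in the generated scalarmath.c and not literally in the .c.src file.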

In order to implement __iadd__, __ior__, and the rest of the in-place operators, you're going to need to define new templates that define implementations for these operators (or maybe you can just re-use the existing implementations for the non-in-place operators? not sure), then add new entries to the table I linked to above for all the in-place operators, like nb_inplace_add.

Ishankoradia · 2024-10-07T16:44:49Z

@ngoldbaum no problem. Thanks for getting back. (Also, hope you had a great vacation and got good time to recharge)

So my first attempt was to use the same table. I tried to add a new entry there for .nb_inplace_add like this

But the mistake I made was that I implemented a method @name@_ctype_inplace_add instead of @name@_inplace_add. I assumed ctype was some kind of prefix stub that the methods need, because I see it everywhere.

I have added this now, and I see that the compiled file scalarmath.c has this dummy implementation.

What is that ctype prefix?
Right now I have just copied the dummy implementation from _ctype_add. I guess I will have to break it down once I start implementing it for each data type, based on what the logic looks like, right?
How are the inputs to these methods handled here? Can I assume that in an in-place operation the first input will be self?

If you can clear these up, it would be great. Sorry for the back and forth. Thank you for all the help on this one.

ngoldbaum · 2024-10-14T16:27:53Z

> What is that ctype prefix?
>
> Right now I have just copied the dummy implementation from _ctype_add. I guess I will have to break it down once I start implementing it for each data type, based on what the logic looks like, right?

The ctype functions are used to define the "c-level" version of the operation. There's also a "python-level" version of the operation that is defined using a second level of templating to define a function named e.g. float_add. See the template function defined starting at line 1175 in scalarmath.c.src. I guess to get this to work, you'll also need to extend this to generate in-place python wrappers, along with the C-level wrappers. Although that said, maybe you can just re-use the existing wrappers that are already defined, so just set nb_inplace_add to e.g. (binaryfunc)@name@_add.

If you look up at my analysis earlier in this issue - numpy scalars already let you use the in-place operators, they just return a copy, so maybe in principle you can just re-use the existing implementations that are getting used already, but now by explicitly using the slot so people can call the dunder methods directly from Python.
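A Python-level analogue of that re-use idea, as a rough sketch: the class `ImmutableScalar` is hypothetical, and aliasing `__iadd__` to `__add__` stands in for pointing `nb_inplace_add` at the existing `@name@_add` function. It is not the NumPy implementation.

```python
class ImmutableScalar:
    """Hypothetical immutable scalar whose in-place add returns a copy."""

    def __init__(self, value):
        self._value = value

    @property
    def value(self):
        return self._value

    def __add__(self, other):
        # Always build a new object; `self` is never modified.
        return ImmutableScalar(self._value + getattr(other, "value", other))

    # Re-use the out-of-place implementation for the in-place method.
    __iadd__ = __add__


a = ImmutableScalar(3)
b = a.__iadd__(3)   # the dunder can now be called directly

print(a.value)      # 3, the original is untouched
print(b.value)      # 6
print(b is a)       # False, a copy was returned
```

With this aliasing, `a += 3` still rebinds `a` to a new object, which matches the `id()` experiment shown earlier in the thread.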

> How are the inputs to these methods handled here? Can I assume that in an in-place operation the first input will be self?

I think that's right, although instead of asking me and then waiting for a response, you should try experimenting to see for yourself. I would need to poke around with a C debugger to figure that out.

There are spin gdb and spin lldb commands. You'll also need to make sure you're building NumPy with debug symbols. There's no need to use a debug build of CPython unless you need to step through CPython, which is not needed for most things.

If you don't like debuggers, printf debugging also works :)

Ishankoradia · 2024-10-15T03:42:43Z


Understood. Thank you Nathan !!
This helps a lot. I will get back in 3-4 days hopefully with something implemented.

mhvk · 2024-10-15T13:02:06Z

Sorry for a late side comment, but reading this are we sure that this #27305 (comment) has been followed up:

> You can see that none of the inplace methods listed in the CPython docs are implemented. I think we need to implement all of them, not just __iadd__ and __ior__. You'd also need to add tests and I'd also double check my analysis above that defining in-place operators that return copies is actually OK. Do other array libraries do that?

Do other array libraries in fact do that for immutable scalars? I ask because certainly python ints do not implement __iadd__ - within python at least those methods should not be used directly, but tested through simply doing a += b (which goes through a.__add__(b) if there is no __iadd__).
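The fallback described here can be checked with a small sketch. The class `Immutable` is a hypothetical example type that defines `__add__` only. Both the `+=` statement and `operator.iadd` work without any `__iadd__`, which is why test code can exercise in-place operators without calling the dunder method directly.

```python
import operator


class Immutable:
    """Hypothetical immutable value type that defines __add__ only."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Immutable(self.value + other)


a = Immutable(3)
before = a

# There is no __iadd__, so Python evaluates this as a = a.__add__(3).
a += 3
print(a is before)            # False, the name was rebound to a new object
print(before.value, a.value)  # 3 6

# operator.iadd applies the same fallback.
c = operator.iadd(before, 1)
print(c.value)                # 4

# Built-in ints behave the same way and expose no __iadd__.
print(hasattr(int, "__iadd__"))        # False
print(hasattr(Immutable, "__iadd__"))  # False
```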

seberg · 2024-10-15T13:11:54Z

That is correct, inplace operators cannot and thus must not be implemented. __complex__ makes a lot of sense to be missing, precisely because it is not an nb_ slot, but rather only a Python defined method.

Illviljan added the 00 - Bug label Aug 28, 2024

ngoldbaum added the "40 - array API standard" and "triage review" labels Aug 29, 2024

Illviljan mentioned this issue Sep 1, 2024

Avoid inplace operators in general tests data-apis/array-api-tests#287

Open

ngoldbaum added this to the 2.2.0 release milestone Sep 4, 2024

ngoldbaum removed the "triage review" label Sep 4, 2024

charris modified the milestones: 2.2.0 release, 2.3.0 release Nov 22, 2024

charris modified the milestones: 2.3.0 release, 2.4.0 release May 20, 2025
