DEP: Deprecate constructing dtypes from any object having a .dtype attribute #25306

Open
timhoffm opened this issue Dec 4, 2023 · 6 comments

Comments

@timhoffm
Contributor

timhoffm commented Dec 4, 2023

Proposed new feature or change:

I suggest deprecating this feature. This was essentially already alluded to with "we could consider deprecating even more" in #13578 (comment).

It seems too permissive (allowing sloppy user code) and complicates NumPy code (e.g. #13003). Users should instead dereference the .dtype attribute explicitly ("explicit is better than implicit"), e.g.

np.array(data, dtype=obj.dtype)
# instead of
np.array(data, dtype=obj)

Also, the current documentation is not exactly correct. It claims

Any type object with a .dtype attribute

but NumPy explicitly disallows this for ndarrays - for good reason, because np.array(data, dtype=other_array) would be semantically sloppy.

OTOH, I can for example do this with a pandas.Series (np.array(data, dtype=series)). IMHO this should not work for pandas.Series or other objects. An additional unwanted side effect is that one can get quite strange comparisons, because dtypes compare equal to any valid data type specification:

>>> np.dtype('int64') == pd.Series([1, 2])
True
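
For readers who want to try this without pandas, here is a minimal sketch of the behaviour described above. `HasDtype` is a hypothetical stand-in class, not part of NumPy or pandas, and the results reflect NumPy's behaviour around the time of this issue; later releases may warn or change it.

```python
import numpy as np


class HasDtype:
    """Hypothetical stand-in for any non-array object exposing `.dtype`."""

    def __init__(self, dtype):
        self.dtype = np.dtype(dtype)


obj = HasDtype("int64")

# Implicit: NumPy falls back to reading `obj.dtype` behind the scenes.
implicit = np.array([1, 2, 3], dtype=obj)

# Explicit: the caller dereferences the attribute, as suggested above.
explicit = np.array([1, 2, 3], dtype=obj.dtype)

same_dtype = implicit.dtype == explicit.dtype

# The same fallback lets a dtype compare equal to the object itself.
compares_equal = bool(np.dtype("int64") == obj)

# ndarrays also have `.dtype`, but NumPy rejects them as dtype specifiers.
try:
    np.dtype(np.zeros(3))
    array_rejected = False
except TypeError:
    array_rejected = True
```

Both spellings produce the same array; only the explicit one makes clear at the call site that a dtype, not the object, is being passed.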
@jakevdp
Contributor

jakevdp commented Dec 4, 2023

Hi! I appreciate the motivation for this, but one data point: as proposed, this will make jax scalar type aliases (e.g. jax.numpy.float32, jax.numpy.int32, etc.) unusable as dtypes.

Some background: JAX made the choice early to not have distinct scalar type classes, but rather to represent scalars as 0-dimensional arrays. What this means is that when it comes to the jax.numpy API, we want jnp.float32(0) to be a function that returns a zero-dimensional jax.Array.

Unfortunately, a common pattern in numpy is to use scalar constructors as dtype arguments; e.g. jnp.arange(4, dtype=jnp.float32). In order to support this, we've made the scalar constructors objects that have a dtype attribute, so that np.dtype(jnp.float32) will return dtype('float32'). If that mechanism is taken away, I suspect it would break a lot of existing JAX code.

Perhaps it would be possible to replace this with some other better-defined mechanism, e.g. using a __dtype__ attribute to tell NumPy how np.dtype should treat a particular object?
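
To make the pattern concrete, here is a rough sketch of a callable that doubles as a dtype specifier. It is an illustration only: `ScalarConstructor` is a hypothetical name, NumPy arrays stand in for jax.Array, and JAX's real implementation differs in its details.

```python
import numpy as np


class ScalarConstructor:
    """Hypothetical callable that also acts as a dtype specifier."""

    def __init__(self, name):
        # The attribute NumPy currently inspects when given this object.
        self.dtype = np.dtype(name)

    def __call__(self, value):
        # Return a zero-dimensional array instead of a scalar-type instance.
        return np.asarray(value, dtype=self.dtype)


float32 = ScalarConstructor("float32")

zero = float32(0)                   # zero-dimensional array
ramp = np.arange(4, dtype=float32)  # relies on the `.dtype` attribute lookup
resolved = np.dtype(float32)        # same lookup via np.dtype
```

If the `.dtype` lookup were removed without a replacement, the last two lines would stop working, which is the breakage described above.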

@rkern
Member

rkern commented Dec 4, 2023

Adding a designed-for-purpose mechanism like .__dtype__, then deprecating the it-seemed-like-a-good-idea-at-the-time mechanism of inspecting the multipurpose .dtype makes sense to me.

@timhoffm
Contributor Author

timhoffm commented Dec 4, 2023

Thanks for the comments. Makes sense. __dtype__ may even be something to be formalized in the array API standard. Do you have any opinion on whether one should first bring this up there, or whether introducing it into numpy directly is more appropriate?

@seberg
Member

seberg commented Dec 5, 2023

__dtype__ may even be something to be formalized in the array API standard

I don't really see it fitting there. We are talking about mixing different implementations and I would say it is better if it is __numpy_dtype__ here anyway.

But yes, I agree with the idea that it is odd, but also agree that it needs a replacement since JAX and probably also pandas rely on it.

@jakevdp
Contributor

jakevdp commented Feb 14, 2024

Circling back here: has there been any more discussion of replacing .dtype with .__dtype__ or __numpy_dtype__ to specify objects that are coercible to numpy dtype? It seems to me this would be a good fit for NumPy 2.0.

@seberg
Member

seberg commented Feb 15, 2024

I don't think it matters, as we should just deprecate .dtype attribute lookup slowly when you don't find __numpy_dtype__, but would probably press the merge button on a PR adding it (with or without deprecation).
It seems like a pretty clear idea to try to be specific if we already need such logic...
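
As a rough illustration of that lookup order, here is a pure-Python sketch. It is not NumPy's implementation: `resolve_dtype_like`, `NewStyle` and `OldStyle` are hypothetical names, and plain strings stand in for real dtype objects.

```python
import warnings

_MISSING = object()


def resolve_dtype_like(obj):
    """Hypothetical resolver: prefer `__numpy_dtype__`, fall back to `.dtype`."""
    spec = getattr(obj, "__numpy_dtype__", _MISSING)
    if spec is not _MISSING:
        # Purpose-built protocol found: use it silently.
        return spec

    spec = getattr(obj, "dtype", _MISSING)
    if spec is not _MISSING:
        # Legacy fallback: still works, but signals the deprecation.
        warnings.warn(
            "using the .dtype attribute as a dtype specifier is deprecated; "
            "define __numpy_dtype__ instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return spec

    raise TypeError(f"cannot interpret {obj!r} as a data type")


class NewStyle:
    __numpy_dtype__ = "float32"  # stand-in for a real dtype object


class OldStyle:
    dtype = "int64"  # stand-in for a real dtype object
```

With this ordering, libraries that adopt the new attribute see no warning, while objects that only expose `.dtype` keep working during the deprecation period.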

4 participants