Make eig/eigvals always return complex eigenvalues #29000
Comments
I think we'd have to come up with a new function that has a different return type. A few versions later, we could deprecate the old one. We can't just change the return types though, IMO, because then it becomes annoying to simultaneously support several NumPy versions. We also can't generate new deprecation warnings for the existing functions.
Quite the other way around. I think this is an intrusive change that makes things really complicated for linear algebra code, because it will require checking the imaginary part every time you call eigvals. I would argue that this change is not a usability improvement but a regularization of the type system, which I don't find a sufficiently strong reason. I am aware of the array API work, but it should not come at the expense of giving up the convenience of a dynamic language.
This one is also very problematic. Tons of code relies on this behavior, and the behavior is pretty much written in stone at this point. Instead of changing established conventions, the other libraries should look for ways to mimic this behavior.
The array API discussion is for context only and is not the motivation for this request, so let's not tangle it in. [1]
Will it really? To me, it's a pure math thing: eigenvalues of a real matrix are complex-valued. They can, by accident, have zero imaginary parts. These zero imaginary parts can, again by accident, be identically zero in floating point, but in general one needs some tolerance to decide if an eigenvalue is real. And I stand by the OP statement that the very need to decide whether eigenvalues are or are not exactly on the real axis of the complex plane is rare. That a bunch of code has possibly already been written to expect handholding from numpy is a separate matter. A small experiment running the scipy and scikit-{learn,image} test suites above seems to hint that the amount of such code is not that large. We can of course question how representative that exercise is, but by all means, if we reject the change because of potential backwards compatibility concerns, let's be clear about it and not mix in any other reasons.

[1] For the record (and I'm already on record elsewhere): IMO the array API movement is in itself not a reason to change anything, full stop. It can serve as a motivation, yes, but any change initiated by the array API should be weighed on its own merits, not only on consistency between array backends.
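The tolerance point above can be illustrated with NumPy's `np.real_if_close`, which downcasts only when imaginary parts fall below a threshold measured in machine epsilons (a minimal sketch):

```python
import numpy as np

# Eigenvalues of a real matrix can carry tiny, nonzero imaginary parts
# purely from floating-point rounding, so "is this eigenvalue real?"
# needs a tolerance, not an exact comparison against zero.
w = np.array([1.0 + 1e-14j, 2.0 - 3e-15j])

exactly_real = (w.imag == 0).all()     # False: bitwise check fails
w_real = np.real_if_close(w, tol=100)  # tol is in units of float64 eps
print(exactly_real, w_real.dtype)      # False float64
```

With `tol=100` the threshold is about 2.2e-14 for float64, so both imaginary parts are treated as rounding noise and the array is downcast to real.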
IMO this is worth doing. Having consistent output types is also better for JITs like numba or AoT compilers. Maybe it could be done by adding a new keyword argument that forces complex output? In a future version we could deprecate the current default. Either way, we need a story for library authors that ideally allows them to avoid adding code that depends on the NumPy version. Library authors who want to jump the gun and write version-dependent code can do so, but we shouldn't generate any new deprecation warnings that force library authors to have version-dependent code.
I am assuming this is the reason; otherwise, what's the reason to change it?
Yes, and that's how Python took off, with hand-holding everywhere, meaning providing a convenient interface; otherwise we would still be coding in braces. The type-safe way of doing this is the classical return-two-arrays convention of LAPACK and other compiled sources, where you get one array for the real parts and one for the imaginary parts. I also want to be clear about our intentions and not make up reasons to avoid the fact that this is coming from the array API, so it should go both ways. Independent from the dtype, the eigenvalue ordering, that's a major battle altogether.
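For reference, the LAPACK convention alluded to here: the real-matrix eigenvalue driver `dgeev` reports eigenvalues as two real arrays, `wr` and `wi`, sidestepping the "which dtype do I return?" question entirely. A sketch of that decomposition, recovered from NumPy's combined output:

```python
import numpy as np

# LAPACK's dgeev hands back eigenvalues of a real matrix as two real
# arrays: wr (real parts) and wi (imaginary parts). NumPy folds the
# pair into a single array, choosing the dtype based on the values.
a = np.array([[0.0, -1.0],
              [1.0,  0.0]])       # 90-degree rotation, eigenvalues are +-1j
w = np.linalg.eigvals(a)          # complex128 here
wr, wi = w.real, w.imag           # the LAPACK-style pair of real arrays
print(wr, wi)
```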
The array API discussion linked above is the context where this issue originates, yes. NumPy uses one convention and PyTorch uses a different convention. FTR, I'm suggesting that CuPy (which is the original motivation for the whole story) follow pytorch, not numpy, in cupy/cupy#8980.
That might be, but I'd argue that "everywhere" should be "where reasonable", as with time we've seen both kinds of interfaces. So for me, the reason is just the math. With my comp-phys / applied-maths hat on, every time I see a return dtype that depends on the values, it looks like a wart. Also, what are the applications/use cases where returning complex eigenvalues with zero imaginary parts breaks something? I don't mean existing implementations (where the scipy/scikit-learn/scikit-image test suites seem to hint at an answer), I mean computational problems. I can only think of a few.
Sure, this is in the OP :-).
With a downstream library maintainer hat on, I don't see how a keyword helps. Downstream, we'll need to first add a keyword (with a version check), then silence the deprecation warnings, then remove it. It's way simpler to just add an explicit cast.
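The explicit-cast approach can indeed be a one-line shim downstream; `eigvals_complex` below is a hypothetical helper name for illustration, not an existing API:

```python
import numpy as np

def eigvals_complex(a):
    """Hypothetical downstream shim: always hand back complex
    eigenvalues, regardless of NumPy version or value-dependent
    downcasting. copy=False avoids a copy when already complex."""
    return np.linalg.eigvals(a).astype(np.complex128, copy=False)

w = eigvals_complex(np.eye(2))  # upstream result is real here...
print(w.dtype)                  # ...but complex128 after the cast
```

Because the cast is a no-op on already-complex results, this works identically whichever convention the underlying NumPy follows.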
SciPy is a very engaged downstream. I'm thinking of people with fewer resources, who would prefer to have a one-liner they could use, or even retain their old code, which after all isn't broken. Another advantage of the keyword-argument approach is that we can make the deprecation warning trigger only if you don't specify the keyword. Anyone who wants their code to always handle imaginaries can choose to make it depend on the numpy version and use the keyword instead of casting if it's available. Anyone who wants to keep their code as-is just explicitly sets the keyword to the old behavior. In a few numpy versions we expire the deprecation. At that point, the lazy people who don't care about deprecations can just set the keyword or do the migration to always handle imaginaries. Either way it's not much work, and they don't need to figure out the correct way to make their code depend on the numpy version.
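A rough sketch of this keyword-plus-deprecation idea; `force_complex` is a made-up parameter name for illustration, not a real or proposed NumPy API:

```python
import warnings
import numpy as np

_unset = object()  # sentinel: distinguishes "not passed" from False

def eigvals(a, force_complex=_unset):
    """Hypothetical wrapper: warn only when the keyword is omitted,
    so callers who set it explicitly see no deprecation noise."""
    if force_complex is _unset:
        warnings.warn(
            "eigvals will always return complex eigenvalues in the "
            "future; pass force_complex explicitly to silence this",
            DeprecationWarning, stacklevel=2)
        force_complex = False  # keep the old behavior for now
    w = np.linalg.eigvals(a)
    return w.astype(np.complex128, copy=False) if force_complex else w
```

Callers who opt in with `force_complex=True` get the new behavior immediately; callers who pin `force_complex=False` keep their code byte-for-byte compatible until the deprecation expires.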
For a real-valued input, NumPy's eigenvalue routines currently may return either real or complex eigenvalues: both eig and eigvals check for the imaginary parts being exactly zero, and downcast the output if they are [1]. Other array libraries skip this handholding and always return complex eigenvalues. For instance:
```python
In [10]: torch.linalg.eig(torch.arange(4).reshape(2, 2)*1.0).eigenvalues
Out[10]: tensor([-0.5616+0.j,  3.5616+0.j])
```
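For contrast, NumPy's current value-dependent behavior (a minimal demonstration):

```python
import numpy as np

# NumPy's result dtype depends on the computed values: if every
# imaginary part comes out exactly zero, the output is downcast
# to a real array; otherwise it stays complex.
sym = np.array([[0.0,  1.0], [1.0, 0.0]])  # eigenvalues -1, +1
rot = np.array([[0.0, -1.0], [1.0, 0.0]])  # eigenvalues -1j, +1j
print(np.linalg.eigvals(sym).dtype)        # float64
print(np.linalg.eigvals(rot).dtype)        # complex128
```

Two inputs of identical shape and dtype thus produce outputs of different dtypes, which is exactly the behavior this issue proposes to remove.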
A recent array-api discussion [2] raised the question of whether NumPy would consider changing this value-dependent behavior and always returning a complex array for eigenvalues.
What is the downstream effect? Were we starting from scratch, I'd expect that calling real_if_close is not that taxing for a user, and in fact in the vast majority of use cases it's just not needed. Since we're not starting from scratch, and to roughly assess the blast radius, I applied the following patch (hidden under the fold) and ran the test suites of scipy, scikit-learn and scikit-image.
Here's the summary:
scipy: 4 failures in scipy.signal, all look trivial to fix
FAILED signal/tests/test_dltisys.py::TestStateSpaceDisc::test_properties - AssertionError: dtypes do not match.
FAILED signal/tests/test_dltisys.py::TestTransferFunction::test_properties - AssertionError: dtypes do not match.
FAILED signal/tests/test_ltisys.py::TestStateSpace::test_properties - AssertionError: dtypes do not match.
FAILED signal/tests/test_ltisys.py::TestTransferFunction::test_properties - AssertionError: dtypes do not match.
scikit-learn:
all tests pass
scikit-image:
FAILED measure/tests/test_fit.py::test_ellipse_model_estimate - TypeError: unsupported operand type(s) for %=: 'numpy.complex128' and 'float'
FAILED measure/tests/test_fit.py::test_ellipse_parameter_stability - TypeError: unsupported operand type(s) for %=: 'numpy.complex128' and 'float'
FAILED measure/tests/test_fit.py::test_ellipse_model_estimate_from_data - TypeError: unsupported operand type(s) for %=: 'numpy.complex128' and 'float'
FAILED measure/tests/test_fit.py::test_ellipse_model_estimate_from_far_shifted_data - TypeError: unsupported operand type(s) for %=: 'numpy.complex128' and 'float'
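These scikit-image failures come from applying the modulo operator to a value that is now `complex128`; `%` is undefined for complex numbers in both Python and NumPy. A minimal reproduction, with one plausible trivial fix (taking the real part first):

```python
import numpy as np

theta = np.complex128(3.5)  # e.g. an angle recovered from an eigenvalue
try:
    theta %= np.pi          # modulo is not defined for complex numbers
except TypeError as exc:
    print(type(exc).__name__)  # TypeError, as in the test failures above

# One plausible fix: drop the (zero) imaginary part before the modulo.
theta = float(np.complex128(3.5).real) % np.pi
print(theta)
```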
[1] https://github.com/numpy/numpy/blob/v2.1.0/numpy/linalg/_linalg.py#L1226
[2] data-apis/array-api#935 (comment)