
ENH: Improve performance of np.linalg._linalg._commonType #28686


Open · wants to merge 2 commits into main

Conversation

eendebakpt (Contributor)

We improve the performance of _commonType, which benefits np.linalg.det and several other methods for small arrays.

  • We return the output of isComplexType so that the value can be reused by the calling methods.
  • We pass the dtypes instead of the arrays so that we can use a cache (see the sketch below).
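
A minimal sketch of the dtype-based, cached approach (illustrative names and a functools.lru_cache; not the exact code in this PR):

from functools import lru_cache

import numpy as np
from numpy import single, double, csingle, cdouble

# Real (computation) precision of each supported inexact dtype.
_real_of = {np.dtype(single): single, np.dtype(double): double,
            np.dtype(csingle): single, np.dtype(cdouble): double}

@lru_cache
def _common_type_from_dtypes(*dtypes):
    """Return (computation_type, result_type, is_complex) for the given dtypes.

    dtype objects are hashable, so the result can be memoized; ndarrays are
    not, which is why the caller passes a.dtype instead of the array itself.
    """
    is_complex = False
    result_type = single
    for dt in dtypes:
        if dt.kind in "fc":                   # floating or complex floating
            if dt.kind == "c":
                is_complex = True
            real = _real_of.get(dt)
            if real is None:                  # e.g. float16, longdouble
                raise TypeError(f"array type {dt.name} is unsupported in linalg")
            if real is double:
                result_type = double
        else:                                 # bool/integers: compute in double
            result_type = double
    if is_complex:
        result_type = csingle if result_type is single else cdouble
        return cdouble, result_type, True
    return double, result_type, False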

Benchmark:

np.linalg.det(x22): Mean +- std dev: [main_commontype] 3.38 us +- 0.15 us -> [pr_commontype] 2.90 us +- 0.17 us: 1.17x faster
np.linalg.det(x33): Mean +- std dev: [main_commontype] 3.57 us +- 0.28 us -> [pr_commontype] 3.01 us +- 0.23 us: 1.18x faster
np.linalg.inv(x22): Mean +- std dev: [main_commontype] 5.81 us +- 0.48 us -> [pr_commontype] 5.32 us +- 0.30 us: 1.09x faster
np.linalg.inv(x33): Mean +- std dev: [main_commontype] 6.83 us +- 0.49 us -> [pr_commontype] 6.14 us +- 0.34 us: 1.11x faster
np.linalg.eig(x22): Mean +- std dev: [main_commontype] 25.5 us +- 1.3 us -> [pr_commontype] 24.6 us +- 1.1 us: 1.04x faster

Geometric mean: 1.12x faster
Test script
# /// script
# requires-python = ">=3.10"
# dependencies = ['numpy', 'pyperf']
# ///

import pyperf

setup = """
import numpy as np
x22 = np.arange(4.).reshape( (2,2) ) + np.eye(2)
x33 = np.arange(9.).reshape( (3,3) ) + np.eye(3)
"""

runner = pyperf.Runner()
runner.timeit(name="np.linalg.det(x22)", stmt="np.linalg.det(x22)", setup=setup)
runner.timeit(name="np.linalg.det(x33)", stmt="np.linalg.det(x33)", setup=setup)
runner.timeit(name="np.linalg.inv(x22)", stmt="np.linalg.inv(x22)", setup=setup)
runner.timeit(name="np.linalg.inv(x33)", stmt="np.linalg.inv(x33)", setup=setup)
runner.timeit(name="np.linalg.eig(x22)", stmt="np.linalg.eig(x22)", setup=setup)

@eendebakpt eendebakpt changed the title ENH: Improve peformance of np.linalg._linalg._commonType ENH: Improve performance of np.linalg._linalg._commonType Apr 10, 2025
@mhvk (Contributor) left a comment

@eendebakpt - I'm wondering if this isn't just adding complexity for something where it might pay to take a closer look at what is being done. E.g., I'm rather confused why one cannot use promote_types and cut this short a bit. A quick test on det shows that the following passes all tests:

diff --git a/numpy/linalg/_linalg.py b/numpy/linalg/_linalg.py
index e181e1a5d8..67e36debb8 100644
--- a/numpy/linalg/_linalg.py
+++ b/numpy/linalg/_linalg.py
@@ -31,7 +31,7 @@
     reciprocal, overrides, diagonal as _core_diagonal, trace as _core_trace,
     cross as _core_cross, outer as _core_outer, tensordot as _core_tensordot,
     matmul as _core_matmul, matrix_transpose as _core_matrix_transpose,
-    transpose as _core_transpose, vecdot as _core_vecdot,
+    promote_types, transpose as _core_transpose, vecdot as _core_vecdot,
 )
 from numpy._globals import _NoValue
 from numpy.lib._twodim_base_impl import triu, eye
@@ -2367,10 +2367,9 @@ def det(a):
     """
     a = asarray(a)
     _assert_stacked_square(a)
-    t, result_t = _commonType(a)
-    signature = 'D->D' if isComplexType(t) else 'd->d'
-    r = _umath_linalg.det(a, signature=signature)
-    r = r.astype(result_t, copy=False)
+    r = _umath_linalg.det(a, dtype=promote_types(a.dtype, double))
+    if r.dtype != a.dtype:
+        r = r.astype(promote_types(a.dtype, single), copy=False)
     return r

With that, the test from your script on a 2x2 matrix improves:

x22 = np.arange(4.).reshape( (2,2) ) + np.eye(2)
%timeit np.linalg.det(x22)
7.09 -> 4.48 us

The only possible downside is that this does not raise an error on float16 (f2) input -- but why should it anyway?

p.s. For det at least, there is no need for _assert_stacked_square either - the gufunc will already check that there are at least 2 dimensions and that the last two are equal.
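
For reference, a quick illustration (not from the PR or the diff above) of how the two promote_types calls map the input dtypes -- computation in d/D, result in f/d/F/D -- including the float16 case mentioned above:

import numpy as np

for dt in (np.int64, np.float16, np.float32, np.float64,
           np.complex64, np.complex128):
    compute = np.promote_types(dt, np.double)  # dtype the gufunc computes in
    result = np.promote_types(dt, np.single)   # dtype of the returned value
    print(f"{np.dtype(dt)!s:>10} -> compute {compute}, result {result}")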

@jorenham (Member)

How about something like this:

_DTYPE_RANK = dict(zip(map(dtype, "fdFD"), range(4)))

max_rank = -1
for dtype in dtypes:
    if dtype.num < 11:  # <: integer | bool
        continue
    if (rank := _DTYPE_RANK.get(dtype)) is None:
        raise TypeError(...)
    if rank == 3:  # no need to go on
        return cdouble, cdouble
    if rank > max_rank:
        max_rank = rank

if max_rank > 1:
    return cdouble, (csingle, cdouble)[max_rank - 2]
else:
    return double, (single, double)[max_rank]

I didn't test it, but I expect this to be quite a bit faster (and it might even be correct, too). Anyway, even if not correct, I'm sure you get the idea.
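
For concreteness, a self-contained variant of this idea might look as follows (the function name is made up, a bitwise OR over the ranks replaces the plain max so that e.g. csingle + double combines to cdouble, and bool/integer input folds in as double):

import numpy as np
from numpy import single, double, csingle, cdouble

# bit 0 = needs double precision, bit 1 = complex: f=0b00, d=0b01, F=0b10, D=0b11
_DTYPE_RANK = dict(zip(map(np.dtype, "fdFD"), range(4)))

def _common_type_by_rank(*dtypes):
    """Rank-table sketch of _commonType; illustrative, not the PR's code."""
    combined = 0
    for dt in dtypes:
        if dt.num < 11:                 # bool and integers: computed in double
            combined |= 0b01
            continue
        if (rank := _DTYPE_RANK.get(dt)) is None:
            raise TypeError(f"array type {dt.name} is unsupported in linalg")
        combined |= rank                # OR acts as the join on this small lattice
    if combined & 0b10:                 # at least one complex input
        return cdouble, (csingle, cdouble)[combined & 0b01]
    return double, (single, double)[combined]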

mhvk (Contributor) commented Apr 11, 2025

Ideally, we don't rely on implementation details like type numbers... Also, no real reason to exclude user dtypes that know how to convert to double, etc. Using promote_types puts the burden where it belongs (and is in C, so pretty fast).

@eendebakpt (Contributor, Author)

The promote_types approach is a bit faster than the PR here, so I will look into that option.

The main performance gain is from the if r.dtype != a.dtype: check. I want to see whether we can handle that inside the astype.

@eendebakpt (Contributor, Author)

@mhvk To avoid the copy on the scalar, we can also check in astype whether a copy is needed. A prototype for this is:

main...eendebakpt:numpy:astype

The advantage over the if r.dtype != a.dtype check is that it makes calls to scalar.astype faster (when no conversion is needed) in other cases as well. The disadvantage is that this adds some more complexity on the C side (and a very minor slowdown for the path where a conversion is needed).

At this moment I have no strong preference for either option, so any arguments in either direction are welcome.

mhvk (Contributor) commented Apr 12, 2025

@eendebakpt - I think your patch makes sense in principle, but is perhaps a bit orthogonal to the goals here? At least, I wrote the if statement mostly to avoid the second call to promote_types.

@eendebakpt (Contributor, Author)

@mhvk Your patch looks good; I might end up refactoring this PR in that way.

It would be nice to also refactor the other methods calling _commonType in the same style. np.promote_types only handles 2 arguments (we need 3 for some methods). np.result_type handles any number of arguments, but is quite a bit slower. I opened #28710 to improve its performance, but it does not come close to the promote_types performance (mainly due to dispatcher overhead).
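
For example (illustrative only), the three-dtype case can be expressed by nesting two promote_types calls, which is what result_type computes in a single call:

import numpy as np

a, b, c = np.dtype("f4"), np.dtype("i8"), np.dtype("c8")

# promote_types is binary, so three dtypes need two nested calls ...
print(np.promote_types(np.promote_types(a, b), c))  # complex128
# ... whereas result_type accepts any number of arguments at once.
print(np.result_type(a, b, c))                      # complex128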

@eendebakpt (Contributor, Author)

p.s. For det at least, there is no need for _assert_stacked_square either - the gufunc will already check that there are at least 2 dimensions and that the last two are equal.

True, but _assert_stacked_square raises a LinAlgError and the gufunc a ValueError. So removing that check might lead to some backwards compatibility issues.
