BUG: Polynomial package slower than polynomial module for python <= 3.11. Both are worse for python > 3.11. #28948
Comments
Could you give more details on your use case? |
Sure, what exactly would you like to know? The use case is not much more complicated than the example code I provided. We have image arrays that need a correction applied, and the correction is parameterized as a polynomial of arbitrary degree. The full example looks something like this:
def correct_image(data: np.ndarray, correction_polynomial: np.poly1d | np.polynomial.polynomial.Polynomial) -> np.ndarray:
    correction = correction_polynomial(data)
    corrected_data = data / correction
    return corrected_data
So there's an extra division, but I removed that common step in my tests above. In the tests I ran above I used a stack of images (i.e., fourteen 2048 x 2048 images), mostly just to give |
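For readers following along, here is a minimal runnable sketch of that use case. The coefficients and the small array shape are hypothetical and chosen only for illustration; the one thing worth noting is that `np.poly1d` takes coefficients from highest to lowest degree, while `Polynomial` takes them from lowest to highest.

```python
import numpy as np
from numpy.polynomial import Polynomial


def correct_image(data, correction_polynomial):
    """Divide `data` by the polynomial evaluated element-wise on `data`."""
    correction = correction_polynomial(data)
    return data / correction


# Hypothetical coefficients for 1 + 0.5*x + 0.25*x**2 (illustration only).
coeffs_low_to_high = [1.0, 0.5, 0.25]
new_style = Polynomial(coeffs_low_to_high)       # lowest degree first
old_style = np.poly1d(coeffs_low_to_high[::-1])  # highest degree first

# Small stand-in for an image stack; the real data in this thread is much larger.
rng = np.random.default_rng(0)
data = rng.uniform(1.0, 2.0, size=(2, 64, 64))

corrected_new = correct_image(data, new_style)
corrected_old = correct_image(data, old_style)
```

Both objects are callable on arrays, so the same function body works for either one; only the coefficient ordering at construction time differs.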
It'd be useful to see a low-level profile generated using e.g. samply on a Python version that is seeing a slowdown and comparing that with one that isn't. That'll show where Python is spending its time, which might give a hint at where the difference is coming from. |
I am having some problems confirming the dependence on the python version and the slowdowns. @eigenbrot Could you also benchmark with smaller There are a couple of open PRs to improve the performance btw: #24499, #24467, #26885, #24531 |
@ngoldbaum, I'm attaching some profiles captured with
The traces mean nothing to me, but if someone would like different samples please let me know. @eendebakpt, I re-ran the test described in my initial post but with two modifications you requested: the shape of the data is now
|
Where? If you upload the profile you should be able to share Firefox profiler links. |
@ngoldbaum, sorry. I borked the upload. They should be there now. |
Ah darn, unfortunately the profiles you uploaded don't have debug symbols, so they're pretty useless. If I have time I can try doing this myself to give you something more useful to look at. In the meantime you should be able to build NumPy from source using the meson debugoptimized build profile to get an optimized executable but with useful debugging info and then regenerate the profiles. If you're on Linux, you can also pass |
Ok, I've built numpy with the debugoptimized profile and regenerated the profiles and test results. Just to make sure we're all on the same page, here is how I built numpy for each python version.
All were built against openblas 0.3.29. The profiles were then collected with (note the use of
I had to go the PYTHONPATH route because Profiles: 11_poly1d_debug_perf.json 12_poly1d_debug_perf.json 13_poly1d_debug_perf.json And here are the results of the timing test (with smaller arrays and more samples). Note that these were run in the same conda environment used to build numpy, prefixed with
|
Unfortunately the profiles still don't have C debugging information so that won't tell me anything useful. |
Here's a profile generated on Python 3.11 using just Polynomial with 100 timeit iterations: https://share.firefox.dev/4j0GCyo And here's the same thing, generated by calling poly1d with timeit 100 times: https://share.firefox.dev/4miarx9 Unfortunately this is on a Mac, so I can't use the perf integration. |
And here's a profile that does both: https://share.firefox.dev/4jQ2MEN At a high level, without any Python frames, it sort of looks like Polynomial is doing more ufunc operations? I don't see anything obvious in the profiles though - both are spending almost all their time in the add and multiply ufunc implementations. There's probably something different in the algorithm that's being used that's causing some extra ufunc calculations to happen that aren't needed by poly1d. |
Both implementations use numpy/numpy/polynomial/_polybase.py Lines 510 to 512 in 8d722b8
numpy/numpy/polynomial/polynomial.py Line 1580 in 8d722b8
and numpy/numpy/lib/_polynomial_impl.py Lines 1342 to 1343 in 8d722b8
For @eigenbrot What timings do you get if you use |
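As a rough illustration of why the profiles are dominated by the add and multiply ufuncs, here is a minimal sketch of a Horner-style evaluation loop. This is my own simplified version written for this comment, not the NumPy source, and the coefficients are hypothetical. Each coefficient costs one array multiply and one array add.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def horner(coeffs_high_to_low, x):
    """Evaluate a polynomial with one multiply and one add per coefficient."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for c in coeffs_high_to_low:
        y = y * x + c  # one multiply ufunc call, one add ufunc call
    return y


# Hypothetical coefficients for 0.25*x**2 + 0.5*x + 1.0 (highest degree first).
coeffs = [0.25, 0.5, 1.0]
x = np.linspace(-1.0, 1.0, 5)

result = horner(coeffs, x)
ref_old = np.polyval(coeffs, x)       # highest degree first
ref_new = P.polyval(x, coeffs[::-1])  # lowest degree first, x comes first
```

Note the different argument order and coefficient order between `np.polyval` and `numpy.polynomial.polynomial.polyval`; it is easy to mix them up when comparing timings.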
@eendebakpt just to make sure we're all on the same page, it looks like there are two versions of Here are the results of running similar timing tests with the two versions of
and the results:
I'd say the overall trend is still the same, but the difference between the two implementations is much much smaller. There still seems to be an inflection point across the 3.11 -> 3.12 transition. |
@eigenbrot You are completely right there are two versions of In the old version there is an iteration over the coefficients with With your example script I do not get the inflection across 3.11 to 3.12, but I do see another effect: for Because I do not see the performance change from 3.10 to 3.11 (or 3.12 or 3.13) on my system it is a bit hard to further investigate this. Could you try to narrow down the issue even more by testing a few different shapes of |
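One way to try a few shapes is sketched below. This is not the script used earlier in this thread; the function name `time_evaluation`, the shapes, and the coefficients are all hypothetical, and the arrays are kept small so it runs quickly. Larger shapes can be substituted to match the real workload.

```python
import timeit

import numpy as np
from numpy.polynomial import Polynomial


def time_evaluation(shape, coeffs_low_to_high, repeats=3, number=5):
    """Return the best per-call time in seconds for each polynomial type."""
    rng = np.random.default_rng(0)
    data = rng.random(shape)
    candidates = {
        "Polynomial": Polynomial(coeffs_low_to_high),
        "poly1d": np.poly1d(coeffs_low_to_high[::-1]),
    }
    timings = {}
    for name, poly in candidates.items():
        # Bind `poly` as a default argument so each lambda times its own object.
        runs = timeit.repeat(lambda poly=poly: poly(data), repeat=repeats, number=number)
        timings[name] = min(runs) / number
    return timings


# Hypothetical shapes and coefficients, chosen only to keep the run short.
shapes = [(64, 64), (4, 64, 64), (256, 256)]
results = {shape: time_evaluation(shape, [1.0, 0.5, 0.25, 0.125]) for shape in shapes}

for shape, timings in results.items():
    print(shape, {name: f"{seconds:.2e} s" for name, seconds in timings.items()})
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the difference between the two implementations is small.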
Describe the issue:
I really would like to use the new Polynomial package, but I'm finding the evaluation performance is much worse than np.poly1d for python versions 3.10 and 3.11. Running the example code below on different python versions gives the following results:
Polynomial
np.poly1d
Notice that for python versions <= 3.11 np.poly1d is significantly faster than using the Polynomial package machinery. The performance of the two methods converges for python > 3.11, but mostly because np.poly1d is slowing down.
Is this expected behavior? I didn't see any mention of performance in the docs for the new Polynomial package. I have seen some reports that the Polynomial package is faster than np.poly1d (which would be great), but that's not what I'm seeing with my tests.
Any advice/insight would be greatly appreciated. For now the clear solution is to just use np.poly1d.
Reproduce the code example:
Error message:
Python and NumPy Versions:
For 3.10:
For 3.11:
For 3.12:
For 3.13:
Runtime Environment:
For 3.10:
For 3.11:
For 3.12:
For 3.13:
Context for the issue:
I need to apply polynomials to many large arrays and do it quickly. I want to use the new-and-improved Polynomial package, but its performance is forcing me to use the older np.poly1d.