BLD: bump OpenBLAS version, use OpenBLAS for win-arm64 #29039

mattip · 2025-05-23T12:19:14Z

Closes #29035 by adding openblas support to the arm64 windows builds

This bumps the version of OpenBLAS from the 0.3.29 release to the latest develop HEAD (which had wheels for OpenBLAS on win-arm64), so it may impact other things.

requirements/ci_requirements.txt

mattip · 2025-05-25T06:21:25Z

Hmm. Wheel builder is failing to upload arm64 windows wheels since there is no micromamba

mattip · 2025-05-25T07:02:39Z

Ahh, can't win-arm64 run x86 code using emulation? I wonder if the wheel uploader could just use the win-x86_64 micromamba install

seberg

Not sure I follow how that is a problem with this PR, so approving: merge when you think it's done.

charris · 2025-05-25T14:02:26Z

Wheel builder is failing to upload arm64 windows wheels since there is no micromamba

I have the same problem uploading the numpy win_arm64 wheels for the 2.3.0rc1 release.

charris · 2025-05-25T14:18:46Z

I've grabbed the win_arm wheels on github and will try using those.

matthew-brett · 2025-05-25T14:48:27Z

@charris, @mattip - could you use my fix for the OpenBLAS wheel uploading for WoA? I installed anaconda-client via pip instead of using MicroMamba:

https://github.com/MacPython/openblas-libs/blob/main/.github/workflows/windows-arm.yml#L82
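The fallback described above can be sketched as a small decision function. This is a hypothetical helper, not the linked workflow step (which is GitHub Actions YAML); the function name and the micromamba command line are assumptions for illustration. The only facts taken from the thread are that micromamba was not available on the win-arm64 runner and that anaconda-client can be installed with pip.

```python
import platform
import sys


def uploader_install_command(system: str, machine: str) -> list[str]:
    """Return a command that installs anaconda-client for the given runner.

    Hypothetical helper: the name and the returned commands are illustrative.
    """
    if system == "Windows" and machine.upper() in {"ARM64", "AARCH64"}:
        # No micromamba on this runner, so fall back to pip.
        return [sys.executable, "-m", "pip", "install", "anaconda-client"]
    # Assumption: micromamba is already set up on every other runner.
    return ["micromamba", "install", "-y", "-c", "conda-forge", "anaconda-client"]


# Decide for the machine this script is running on.
command = uploader_install_command(platform.system(), platform.machine())
print(" ".join(command))
```

Passing the system and machine strings in as arguments keeps the decision testable on any host.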

mattip · 2025-05-25T17:44:49Z

Thanks @matthew-brett. I used the pip-install trick and also disallowed building a win-arm64 wheel without OpenBLAS. Please check the wheel building logs and/or the artifact to make sure the win-arm64 wheel is using OpenBLAS before merging.
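One way to check the artifact is to list the wheel's members and look for a vendored OpenBLAS library. The sketch below assumes the repair step copies the DLL into the wheel under a file name that contains "openblas"; the directory and mangled file name used in the demo are made up, not taken from a real wheel.

```python
import io
import zipfile


def bundled_openblas(wheel) -> list[str]:
    """List wheel members whose file name mentions openblas."""
    with zipfile.ZipFile(wheel) as zf:
        return [
            name for name in zf.namelist()
            if "openblas" in name.rsplit("/", 1)[-1].lower()
        ]


# Demo on an in-memory stand-in for a wheel (member names are hypothetical).
fake_wheel = io.BytesIO()
with zipfile.ZipFile(fake_wheel, "w") as zf:
    zf.writestr("numpy/__init__.py", "")
    zf.writestr("numpy.libs/libscipy_openblas-0123abcd.dll", b"")
fake_wheel.seek(0)

found = bundled_openblas(fake_wheel)
print(found or "no OpenBLAS library found in wheel")
```

For a real check, pass the path of the downloaded `.whl` artifact instead of the in-memory buffer.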

mattip · 2025-05-25T17:56:58Z

CirrusCI macOS-arm64 builds are failing. There is this message, then another #28227 heisenbug failure

Only [ghcr.io/cirruslabs/macos-runner:sonoma, ghcr.io/cirruslabs/macos-runner:sequoia] is allowed. Automatically upgraded to ghcr.io/cirruslabs/macos-runner:sequoia.

mattip · 2025-05-26T10:56:52Z

The repair-wheel-command is not running. In the x86_64 logs I see for instance

Successfully built numpy-2.4.0.dev0-cp312-cp312-win_amd64.whl
##[endgroup]
##[group]Repairing wheel...

but in the arm64 run the build goes right to testing

Successfully built numpy-2.4.0.dev0-cp311-cp311-win_arm64.whl
##[endgroup]
##[group]Testing wheel...

The repair-wheel-command is inside a [tool.cibuildwheel.windows] section in the pyproject.toml. Is a different selector needed for win-arm64?

numpy/pyproject.toml

Lines 180 to 184 in 3c995e7

    
    [tool.cibuildwheel.windows]
    # This does not work, use CIBW_ENVIRONMENT_WINDOWS
    environment = {PKG_CONFIG_PATH="./.openblas"}
    config-settings = "setup-args=--vsenv setup-args=-Dallow-noblas=false build-dir=build"
    repair-wheel-command = "bash -el ./tools/wheels/repair_windows.sh {wheel} {dest_dir}"

joerick · 2025-05-26T13:28:00Z

That's the right selector, but the build options output at the start of a build shows an override: https://github.com/numpy/numpy/actions/runs/15246740791/job/42874812888#step:8:8542

That's just a bit further down the file:

numpy/pyproject.toml

Lines 191 to 195 in f3edb9f

    
    [[tool.cibuildwheel.overrides]]
    select = "*-win_arm64"
    config-settings = "setup-args=--vsenv setup-args=-Dallow-noblas=true build-dir=build"
    repair-wheel-command = ""
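The interaction between the platform section and the override can be modelled in a few lines. This is a simplified sketch of the precedence rule, not cibuildwheel's implementation: the dictionary mirrors only the two pyproject.toml fragments quoted in this thread, `effective_option` is a hypothetical helper, and `select` is assumed to be a space-separated string of glob patterns.

```python
from fnmatch import fnmatchcase

# Mirrors the quoted [tool.cibuildwheel.windows] section and the
# [[tool.cibuildwheel.overrides]] entry; all other options are left out.
CIBUILDWHEEL = {
    "windows": {
        "repair-wheel-command":
            "bash -el ./tools/wheels/repair_windows.sh {wheel} {dest_dir}",
    },
    "overrides": [
        {"select": "*-win_arm64", "repair-wheel-command": ""},
    ],
}


def effective_option(config, platform, identifier, option):
    """Resolve an option: platform section first, then matching overrides."""
    value = config.get(platform, {}).get(option)
    for override in config.get("overrides", []):
        patterns = override["select"].split()
        if any(fnmatchcase(identifier, p) for p in patterns):
            # An override that sets the option wins over the platform value.
            value = override.get(option, value)
    return value


arm64 = effective_option(CIBUILDWHEEL, "windows", "cp311-win_arm64", "repair-wheel-command")
amd64 = effective_option(CIBUILDWHEEL, "windows", "cp312-win_amd64", "repair-wheel-command")
print(repr(arm64))
print(repr(amd64))
```

In this model the win_arm64 build resolves to an empty repair command, which is consistent with the arm64 log going straight from "Successfully built" to "Testing wheel".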

mattip · 2025-05-26T15:15:08Z

Ahh, thanks I missed that.

mattip · 2025-05-26T17:27:33Z

Wheel repairing is running but...

C:\mingw64\bin\strip.exe: ./numpy/fft/_pocketfft_umath.cp311-win_arm64.pyd: file format not recognized

Is this using an x86_64 mingw64 installation?
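To see which architecture a `.pyd` or `.exe` targets, one can read the Machine field of its PE/COFF header. The sketch below is a diagnostic aid only and does not confirm which strip build the runner used; `pe_machine` and `fake_pe` are hypothetical helpers, and the demo runs on synthetic headers rather than real binaries.

```python
import struct

# Machine constants from the PE/COFF format.
PE_MACHINES = {0x014C: "x86", 0x8664: "x86_64", 0xAA64: "arm64"}


def pe_machine(data: bytes) -> str:
    """Return the target architecture recorded in a PE image header."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # Offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The COFF header follows the signature; its first field is Machine.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return PE_MACHINES.get(machine, hex(machine))


def fake_pe(machine: int) -> bytes:
    """Build the smallest header that pe_machine can parse (demo only)."""
    header = bytearray(0x46)
    header[:2] = b"MZ"
    struct.pack_into("<I", header, 0x3C, 0x40)
    header[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", header, 0x44, machine)
    return bytes(header)


print(pe_machine(fake_pe(0xAA64)))
print(pe_machine(fake_pe(0x8664)))
```

On a real runner the same function could be applied to the bytes of both the failing `.pyd` and `strip.exe` to compare their targets.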

mattip · 2025-05-26T17:46:55Z

Since delvewheel updated the way it mangles, maybe there is no more need to strip the pyds?

mattip · 2025-05-26T19:25:12Z

Cool. OpenBLAS is properly mangled on both windows platforms. Only the cp313t-win32 wheel test run failed, with worker 'gw0' crashed while running '_core/tests/test_arrayprint.py::test_multithreaded_array_printing'. Is that a known thing?

The windows-arm64 wheels weigh in at about 9.5MB, where the windows-x86_64 ones are about 13MB.

Anyone want to take a look?

matthew-brett · 2025-05-26T19:50:22Z

Installed and imported OK. A couple of test failures.

============================================== FAILURES ===============================================
___________________________ TestComplexFunctions.test_branch_cuts_complex64 ___________________________ 

self = <test_umath.TestComplexFunctions object at 0x0000029C0E2DF650>

    @pytest.mark.xfail(IS_WASM, reason="doesn't work")
    def test_branch_cuts_complex64(self):
        # check branch cuts and continuity on them
        _check_branch_cut(np.log,   -0.5, 1j, 1, -1, True, np.complex64)  # noqa: E221
        _check_branch_cut(np.log2,  -0.5, 1j, 1, -1, True, np.complex64)  # noqa: E221
        _check_branch_cut(np.log10, -0.5, 1j, 1, -1, True, np.complex64)
        _check_branch_cut(np.log1p, -1.5, 1j, 1, -1, True, np.complex64)
        _check_branch_cut(np.sqrt,  -0.5, 1j, 1, -1, True, np.complex64)  # noqa: E221

        _check_branch_cut(np.arcsin, [ -2, 2],   [1j, 1j], 1, -1, True, np.complex64)
        _check_branch_cut(np.arccos, [ -2, 2],   [1j, 1j], 1, -1, True, np.complex64)
>       _check_branch_cut(np.arctan, [0 - 2j, 2j],  [1,  1], -1, 1, True, np.complex64)

self       = <test_umath.TestComplexFunctions object at 0x0000029C0E2DF650>

envs\py312\Lib\site-packages\numpy\_core\tests\test_umath.py:4295:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

f = <ufunc 'arctan'>, x0 = array([0.-2.j, 0.+2.j], dtype=complex64)
dx = array([1.+0.j, 1.+0.j], dtype=complex64), re_sign = -1, im_sign = 1, sig_zero_ok = True
dtype = <class 'numpy.complex64'>

    def _check_branch_cut(f, x0, dx, re_sign=1, im_sign=-1, sig_zero_ok=False,
                          dtype=complex):
        """
        Check for a branch cut in a function.

        Assert that `x0` lies on a branch cut of function `f` and `f` is
        continuous from the direction `dx`.

        Parameters
        ----------
        f : func
            Function to check
        x0 : array-like
            Point on branch cut
        dx : array-like
            Direction to check continuity in
        re_sign, im_sign : {1, -1}
            Change of sign of the real or imaginary part expected
        sig_zero_ok : bool
            Whether to check if the branch cut respects signed zero (if applicable)
        dtype : dtype
            Dtype to check (should be complex)

        """
        x0 = np.atleast_1d(x0).astype(dtype)
        dx = np.atleast_1d(dx).astype(dtype)

        if np.dtype(dtype).char == 'F':
            scale = np.finfo(dtype).eps * 1e2
            atol = np.float32(1e-2)
        else:
            scale = np.finfo(dtype).eps * 1e3
            atol = 1e-4

        y0 = f(x0)
        yp = f(x0 + dx * scale * np.absolute(x0) / np.absolute(dx))
        ym = f(x0 - dx * scale * np.absolute(x0) / np.absolute(dx))

        assert_(np.all(np.absolute(y0.real - yp.real) < atol), (y0, yp))
        assert_(np.all(np.absolute(y0.imag - yp.imag) < atol), (y0, yp))
>       assert_(np.all(np.absolute(y0.real - ym.real * re_sign) < atol), (y0, ym))
E       AssertionError: (array([-1.3112233-0.23887786j,  1.3112233+0.23887786j], dtype=complex64), array([-1.3112233-0.23887786j,  1.3112233+0.23887786j], dtype=complex64))

atol       = np.float32(0.01)
dtype      = <class 'numpy.complex64'>
dx         = array([1.+0.j, 1.+0.j], dtype=complex64)
f          = <ufunc 'arctan'>
im_sign    = 1
re_sign    = -1
scale      = np.float32(1.1920929e-05)
sig_zero_ok = True
x0         = array([0.-2.j, 0.+2.j], dtype=complex64)
y0         = array([-1.3112233-0.23887786j,  1.3112233+0.23887786j], dtype=complex64)
ym         = array([-1.3112233-0.23887786j,  1.3112233+0.23887786j], dtype=complex64)
yp         = array([-1.3112233-0.23887786j,  1.3112233+0.23887786j], dtype=complex64)

envs\py312\Lib\site-packages\numpy\_core\tests\test_umath.py:4540: AssertionError
_______________________ TestComplexFunctions.test_loss_of_precision[complex64] ________________________

self = <test_umath.TestComplexFunctions object at 0x0000029C0E2DFB90>, dtype = <class 'numpy.complex64'>

    @pytest.mark.xfail(
        # manylinux2014 uses glibc2.17
        _glibc_older_than("2.18"),
        reason="Older glibc versions are imprecise (maybe passes with SIMD?)"
    )
    @pytest.mark.xfail(IS_WASM, reason="doesn't work")
    @pytest.mark.parametrize('dtype', [
        np.complex64, np.complex128, np.clongdouble
    ])
    def test_loss_of_precision(self, dtype):
        """Check loss of precision in complex arc* functions"""
        if dtype is np.clongdouble and platform.machine() != 'x86_64':
            # Failures on musllinux, aarch64, s390x, ppc64le (see gh-17554)
            pytest.skip('Only works reliably for x86-64 and recent glibc')

        # Check against known-good functions

        info = np.finfo(dtype)
        real_dtype = dtype(0.).real.dtype
        eps = info.eps

        def check(x, rtol):
            x = x.astype(real_dtype)

            z = x.astype(dtype)
            d = np.absolute(np.arcsinh(x) / np.arcsinh(z).real - 1)
            assert_(np.all(d < rtol), (np.argmax(d), x[np.argmax(d)], d.max(),
                                      'arcsinh'))

            z = (1j * x).astype(dtype)
            d = np.absolute(np.arcsinh(x) / np.arcsin(z).imag - 1)
            assert_(np.all(d < rtol), (np.argmax(d), x[np.argmax(d)], d.max(),
                                      'arcsin'))

            z = x.astype(dtype)
            d = np.absolute(np.arctanh(x) / np.arctanh(z).real - 1)
            assert_(np.all(d < rtol), (np.argmax(d), x[np.argmax(d)], d.max(),
                                      'arctanh'))

            z = (1j * x).astype(dtype)
            d = np.absolute(np.arctanh(x) / np.arctan(z).imag - 1)
            assert_(np.all(d < rtol), (np.argmax(d), x[np.argmax(d)], d.max(),
                                      'arctan'))

        # The switchover was chosen as 1e-3; hence there can be up to
        # ~eps/1e-3 of relative cancellation error before it

        x_series = np.logspace(-20, -3.001, 200)
        x_basic = np.logspace(-2.999, 0, 10, endpoint=False)

        if dtype is np.clongdouble:
            if bad_arcsinh():
                pytest.skip("Trig functions of np.clongdouble values known "
                            "to be inaccurate on aarch64 and PPC for some "
                            "compilation configurations.")
            # It's not guaranteed that the system-provided arc functions
            # are accurate down to a few epsilons. (Eg. on Linux 64-bit)
            # So, give more leeway for long complex tests here:
            check(x_series, 50.0 * eps)
        else:
>           check(x_series, 2.1 * eps)

check      = <function TestComplexFunctions.test_loss_of_precision.<locals>.check at 0x0000029C2D9E4A40>
dtype      = <class 'numpy.complex64'>
eps        = np.float32(1.1920929e-07)
info       = finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)
real_dtype = dtype('float32')
self       = <test_umath.TestComplexFunctions object at 0x0000029C0E2DFB90>
x_basic    = array([0.00100231, 0.0019994 , 0.00398841, 0.0079561 , 0.01587084,
       0.0316592 , 0.06315387, 0.12597953, 0.25130435, 0.50130265])
x_series   = array([1.00000000e-20, 1.21736864e-20, 1.48198641e-20, 1.80412378e-20,
       2.19628372e-20, 2.67368693e-20, 3.254862...3.06526013e-04, 3.73155156e-04, 4.54267386e-04,       
       5.53010871e-04, 6.73218092e-04, 8.19554595e-04, 9.97700064e-04])

envs\py312\Lib\site-packages\numpy\_core\tests\test_umath.py:4392:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

x = array([9.99999968e-21, 1.21736865e-20, 1.48198648e-20, 1.80412373e-20,
       2.19628376e-20, 2.67368701e-20, 3.254862...55170e-04, 4.54267400e-04,
       5.53010846e-04, 6.73218106e-04, 8.19554611e-04, 9.97700030e-04],
      dtype=float32)
rtol = np.float32(2.503395e-07)

    def check(x, rtol):
        x = x.astype(real_dtype)

        z = x.astype(dtype)
        d = np.absolute(np.arcsinh(x) / np.arcsinh(z).real - 1)
        assert_(np.all(d < rtol), (np.argmax(d), x[np.argmax(d)], d.max(),
                                  'arcsinh'))

        z = (1j * x).astype(dtype)
        d = np.absolute(np.arcsinh(x) / np.arcsin(z).imag - 1)
>       assert_(np.all(d < rtol), (np.argmax(d), x[np.argmax(d)], d.max(),
                                  'arcsin'))
E       AssertionError: (np.int64(198), np.float32(0.0008195546), np.float32(3.5762787e-07), 'arcsin')  

d          = array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, 0.0000000e+00,
       0.0000000e+00, 0.0000000e+00, 0.0000000e+00,... 0.0000000e+00, 5.9604645e-08,
       1.1920929e-07, 2.3841858e-07, 3.5762787e-07, 3.5762787e-07],
      dtype=float32)
dtype      = <class 'numpy.complex64'>
real_dtype = dtype('float32')
rtol       = np.float32(2.503395e-07)
x          = array([9.99999968e-21, 1.21736865e-20, 1.48198648e-20, 1.80412373e-20,
       2.19628376e-20, 2.67368701e-20, 3.254862...55170e-04, 4.54267400e-04,
       5.53010846e-04, 6.73218106e-04, 8.19554611e-04, 9.97700030e-04],
      dtype=float32)
z          = array([0.+9.99999968e-21j, 0.+1.21736865e-20j, 0.+1.48198648e-20j,
       0.+1.80412373e-20j, 0.+2.19628376e-20j, 0.+2...54267400e-04j, 0.+5.53010846e-04j, 0.+6.73218106e-04j,
       0.+8.19554611e-04j, 0.+9.97700030e-04j], dtype=complex64)

envs\py312\Lib\site-packages\numpy\_core\tests\test_umath.py:4363: AssertionError
======================================= short test summary info =======================================
FAILED envs/py312/Lib/site-packages/numpy/_core/tests/test_umath.py::TestComplexFunctions::test_branch_cuts_complex64 - AssertionError: (array([-1.3112233-0.23887786j,  1.3112233+0.23887786j], dtype=complex64), array([-...
FAILED envs/py312/Lib/site-packages/numpy/_core/tests/test_umath.py::TestComplexFunctions::test_loss_of_precision[complex64] - AssertionError: (np.int64(198), np.float32(0.0008195546), np.float32(3.5762787e-07), 'arcsin')
2 failed, 43460 passed, 1142 skipped, 2644 deselected, 28 xfailed, 5 xpassed in 80.00s (0:01:19)
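The numbers in the two tracebacks can be put in context with a short double-precision check using only the standard library. This is a reference sketch, not a reproduction of the win-arm64 failure: it uses `cmath` rather than NumPy's complex64 loops, and the step size is an arbitrary choice.

```python
import cmath
import math

# Branch cut of arctan: the imaginary axis with |Im z| > 1.
x0 = 2j
step = 1e-5
right = cmath.atan(x0 + step)  # approach the cut from Re z > 0
left = cmath.atan(x0 - step)   # approach the cut from Re z < 0

# The test expects the real part to flip sign across the cut
# (re_sign = -1) while the imaginary part keeps its sign (im_sign = 1).
print("right:", right)
print("left: ", left)
print("pi/2 =", math.pi / 2, " 0.5*ln(3) =", 0.5 * math.log(3))

# Second traceback: compare the reported error with the tolerance.
eps = 2.0 ** -23          # float32 machine epsilon (1.1920929e-07)
rtol = 2.1 * eps          # tolerance passed to check()
d_max = 3.5762787e-07     # largest relative error reported for 'arcsin'
print("d_max / eps =", d_max / eps, " exceeds rtol:", d_max > rtol)
```

In double precision the real parts on either side of the cut are close to +pi/2 and -pi/2, whereas the complex64 values in the first traceback have real parts of magnitude 1.3112233 and show no sign flip. The second failure amounts to an error of about 3 eps against a limit of 2.1 eps. Neither observation identifies a cause.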

andyfaff · 2025-05-27T02:58:22Z

CirrusCI macOS-arm64 builds are failing. There is this message, then another #28227 heisenbug failure

Only [ghcr.io/cirruslabs/macos-runner:sonoma, ghcr.io/cirruslabs/macos-runner:sequoia] is allowed. Automatically upgraded to ghcr.io/cirruslabs/macos-runner:sequoia.

I think the monterey image is no longer available, so it may be time to update it to Sonoma/Sequoia.

mattip · 2025-05-27T04:09:00Z

@matthew-brett are those failures new to the wheels-with-openblas or do they occur also in the wheel artifacts from, say, this CI run? If the latter, maybe that should be part of a different PR to add win-arm64 testing and blocklist some trig functions?

I was too dismissive of the cirrus failure. I see it is not only the heisenbug: there are 47!! failures when using OpenBLAS with macos-arm64 (targeting macos_11 without accelerate). I don't know whether this is due to the automatic update to a newer macOS version or to the newer OpenBLAS version. I don't see any issues in upstream OpenBLAS that might be relevant. I opened #29061 to change only the macOS version; let's see if the build passes there.

mattip · 2025-05-27T05:56:32Z

Rebased off main to get the cirrus-ci update to sonoma, which passed CI. Let's see if the OpenBLAS update is the problem or the further macos update to sequoia is the problem.

mattip · 2025-05-27T06:27:47Z

All that changed is the OpenBLAS version, and instead of passing there are now 43 failures 😞. @martin-frbg do the failures on macos-arm64 when moving from 0.3.29 to the latest develop HEAD ring any bells? I checked that both builds are using the scipy_openblas64 ilp64 interfaces. From what I can see, the failures are in float32 when using matmul or power-like operations. For instance, here is the failure from test_ufunc_noncontiguous[matmul]:

E                    ACTUAL: array([[[ 1.,  2.,  3.,  4.,  5.,  6.],
E                           [ 7.,  8.,  9., 10., 11., 12.],
E                           [13., 14., 15., 16., 17., 18.],...
E                    DESIRED: array([[[7.812911e-03, 2.000110e+00, 3.200184e+01, 5.120308e+02,
E                            2.048128e+03, 8.192533e+03],
E                           [3.277197e+04, 1.310888e+05, 2.621793e+05, 5.243622e+05,...

martin-frbg · 2025-05-27T07:02:30Z

No idea, there have been way too many changes since 0.3.29 but not that many that would affect OSX/Arm64 (I assume you are building for the "VORTEX" Apple M target there, which is mostly NeoverseN1 kernels ?) and none that manifest themselves as OpenBLAS errors

mattip · 2025-05-27T09:49:36Z

I cannot reproduce the failures locally on a macbook M2 using Sequoia 15.4.1. I wonder what I am missing.

python3.12 -m venv /tmp/venv312
source /tmp/venv312/bin/activate
pip install cibuildwheel
export CIBW_BUILD="cp312*"
export CIBW_ARCH=arm64
export INSTALL_OPENBLAS=true
export CIBW_ENVIRONMENT_MACOS="MACOSX_DEPLOYMENT_TARGET='11.0' INSTALL_OPENBLAS=true RUNNER_OS=macOS PKG_CONFIG_PATH=$PWD/.openblas"
cibuildwheel
# wheel builds and tests without failure

mattip · 2025-05-27T09:59:19Z

I assume you are building for the "VORTEX" Apple M target there

The config string is OpenBLAS 0.3.29.dev USE64BITINT DYNAMIC_ARCH NO_AFFINITY neoversen1 MAX_THREADS=64, which is identical to the run before this PR (just with OpenBLAS 0.3.29, no dev).
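Comparing two such config strings field by field makes the "identical except for the version" claim easy to verify. The parser below is a hypothetical helper that relies on a token heuristic (upper-case tokens are flags, `KEY=VALUE` tokens are settings, the remaining lower-case token is the core); the "before" string is reconstructed from the statement above, not copied from a log.

```python
def parse_openblas_config(config: str) -> dict:
    """Split an OpenBLAS config string into version, flags, core, settings."""
    tokens = config.split()
    if len(tokens) < 2 or tokens[0] != "OpenBLAS":
        raise ValueError(f"unexpected config string: {config!r}")
    parsed = {"version": tokens[1], "flags": set(), "core": None, "settings": {}}
    for token in tokens[2:]:
        if "=" in token:
            key, value = token.split("=", 1)
            parsed["settings"][key] = value
        elif token.isupper():
            parsed["flags"].add(token)
        else:
            # Heuristic: the one lower-case token names the detected core.
            parsed["core"] = token
    return parsed


after = parse_openblas_config(
    "OpenBLAS 0.3.29.dev USE64BITINT DYNAMIC_ARCH NO_AFFINITY neoversen1 MAX_THREADS=64"
)
# Reconstructed: same string without the ".dev" suffix.
before = parse_openblas_config(
    "OpenBLAS 0.3.29 USE64BITINT DYNAMIC_ARCH NO_AFFINITY neoversen1 MAX_THREADS=64"
)

changed = sorted(key for key in after if after[key] != before[key])
print("fields that differ:", changed)
```

If the build options had drifted between the two runs (for example a different core or thread limit), the extra field names would show up in `changed`.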

martin-frbg · 2025-05-27T10:07:29Z

I don't see this on the M1 in the GCC Compile farm either (targeted build, will try a dynamic_arch build too but don't expect that to be different). Can try with my M4 later today but don't see why that would be different except that it probably has a newer OS

mattip · 2025-05-27T12:10:39Z

Grasping at straws: maybe this is due to some quirk in the VM hosting at Cirrus and the number of parallel test runners (currently set to -n auto in the testing script). I am playing with this in #29069

github-actions bot added the 36 - Build Build related PR label May 23, 2025

seberg reviewed May 23, 2025

seberg approved these changes May 25, 2025

mattip force-pushed the openblas-win-arm64 branch from b277e71 to 6c8f0f7 Compare May 25, 2025 18:26

charris added the 09 - Backport-Candidate PRs tagged should be backported label May 26, 2025

charris added this to the 2.3.0 release milestone May 26, 2025

mattip mentioned this pull request May 27, 2025

BLD: use sonoma image on Cirrus for wheel build #29061

Merged

mattip mentioned this pull request May 27, 2025

MNT: Update windows-2019 to windows-latest #28955

Open

mattip and others added 6 commits May 27, 2025 08:52

BLD: bump OpenBLAS version, use OpenBLAS for win-arm64 [wheel build]

9cf70dc

Update requirements/ci_requirements.txt

c6f764b

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

use pip to install anaconda-client on win-arm64 [wheel build]

7ed91ee

allow noblas in win32 wheels, use scipy-openblas32 on win-arm64 [whee…

eb7a3d7

…l build]

improve runner arch detection logic [wheel build]

66364b8

remove win_arm64 cibuildwheel override

a27c638

remove 'strip' before calling delvewheel [wheel build]

5bc41aa

mattip force-pushed the openblas-win-arm64 branch from fd2732c to 5bc41aa Compare May 27, 2025 05:54

mattip mentioned this pull request May 27, 2025

BLD: use github to build macos-arm64 wheels with OpenBLAS #29069

Open

charris mentioned this pull request May 27, 2025

BLD: use sonoma image on Cirrus for wheel build #29073

Merged
