
BLAS desiderata

Matti Picus edited this page Aug 23, 2018 · 6 revisions

The numerical ecosystem could really use a modern, optionally-multithreaded BLAS under a BSD-like license with a priority on

  • Correctness
  • Out-of-the-box single-binary functionality (e.g., runtime kernel selection, runtime thread control)
  • Speed
  • Portability

...in roughly that order.

OpenBLAS is currently the library that comes closest to providing these things, but a number of problems remain. Fixing them would make good concrete targets for people to go after:

  • The path leading to getting a generally-useful build is lined with tricky booby-traps (e.g., automagic capping of the maximum number of threads and the famous NO_AFFINITY).
  • There are concerns about the lack of tests. The linked discussion lists a number of specific bugs that made it past the existing test suite and are still not tested for; in general it would be very useful to build up a comprehensive set of BLAS/LAPACK tests that includes tests for realistic problem sizes.
  • It's not possible (?) to override CPU detection at runtime, which makes it hard to run comprehensive tests.
  • The use of AT&T-syntax inline asm (?) prevents the use of MSVC; using intrinsics instead might be more maintainable and certainly more portable. (Update: MSVC is now supported.)
  • ...any more?
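
To make the testing point above a little more concrete, here is a minimal sketch of the kind of size-sweeping check it has in mind: compare a GEMM implementation against a naive reference over several shapes, including ones large enough to reach code paths that tiny inputs never touch. Everything here is hypothetical and for illustration only (`reference_gemm`, `check_gemm`, `buggy_gemm`, the shapes, and the tolerance); it is not part of OpenBLAS or of any existing test suite, and a real suite would use far larger sizes and a faster reference than pure Python.

```python
import random


def reference_gemm(a, b):
    """Naive triple-loop matrix product, used as the trusted reference."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def random_matrix(rows, cols, rng):
    """Matrix of uniform random values in [-1, 1] as a list of lists."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def max_abs_error(x, y):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(u - v) for rx, ry in zip(x, y) for u, v in zip(rx, ry))


def check_gemm(gemm, shapes, tol=1e-10, seed=0):
    """Return the (m, k, n) shapes for which gemm disagrees with the reference."""
    rng = random.Random(seed)
    failures = []
    for m, k, n in shapes:
        a = random_matrix(m, k, rng)
        b = random_matrix(k, n, rng)
        if max_abs_error(gemm(a, b), reference_gemm(a, b)) > tol:
            failures.append((m, k, n))
    return failures


def buggy_gemm(a, b):
    """Hypothetical kernel bug: once the inner dimension reaches 8, the
    remainder past the last multiple of 8 is silently dropped."""
    k = len(b)
    if k >= 8:
        k -= k % 8
    return reference_gemm([row[:k] for row in a], b[:k])


if __name__ == "__main__":
    shapes = [(3, 5, 2), (17, 33, 9)]
    print(check_gemm(reference_gemm, shapes))
    print(check_gemm(buggy_gemm, shapes))
```

In this sketch the small shape passes for both implementations, while the larger shape exposes the deliberately planted bug, which is the sort of failure that tests restricted to tiny inputs would miss.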