* [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
@ 2025-09-11 18:07 Michał Górny
2025-09-11 19:18 ` Matthias Maier
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Michał Górny @ 2025-09-11 18:07 UTC
To: gentoo-dev
Hi, everyone.
TL;DR: I'm proposing removing eselect-blas/lapack and the virtuals,
and using sci-libs/flexiblas instead.
1. The problem
==============
The current Gentoo infrastructure for supporting multiple BLAS/LAPACK
variants doesn't really work. The primary problem is that while
the interfaces are standardized, they are also versioned and different
providers (and their different versions) tend to diverge over what's
implemented. On top of that, packages generally don't stop at a kind of
"common subset" (whatever that would be), but tend to detect which
routines are implemented by the BLAS/LAPACK libraries actually used.
The end result is that if you build a package against library A, then it
may or may not work with library B after switching.
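To make the failure mode concrete, here is a toy model of it. This is only a sketch: the provider names and the routine called new_routine are made up for illustration, and real packages detect routines through their build systems rather than through a Python dictionary.

```python
# Toy model of the ABI problem. Provider names and "new_routine" are
# hypothetical; only the mechanism matters.
PROVIDERS = {
    "library_a": {"dgemm", "dgesv", "new_routine"},  # more complete provider
    "library_b": {"dgemm", "dgesv"},                 # lacks one routine
}


def configure(build_provider, wanted):
    """Mimic a build system probing which wanted routines are available."""
    return {name for name in wanted if name in PROVIDERS[build_provider]}


def missing_at_runtime(linked_symbols, runtime_provider):
    """Return the symbols the built package needs but the provider lacks."""
    return sorted(linked_symbols - PROVIDERS[runtime_provider])


# The package is built against library_a and picks up everything it finds.
linked = configure("library_a", {"dgemm", "new_routine"})

print(missing_at_runtime(linked, "library_a"))  # nothing is missing
print(missing_at_runtime(linked, "library_b"))  # unresolved after switching
```

The second call reports the routine that was detected at build time but is absent from the provider selected afterwards, which is the "may or may not work" case.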
While I agree in principle with eselect-ldso being superior to regular
alternatives, it suffers from this problem even more. Since we're
building packages against sci-libs/lapack, i.e. the reference
implementation, we're effectively building against the most complete
implementation.
Given the ABI problem, no solution based on alternatives or virtuals can
really work.
2. Proposed solution
====================
I'd like to propose a two-tier replacement to the current system:
1. Base BLAS/LAPACK support via sci-libs/flexiblas.
2. USE flags for specific implementations for packages that use
additional functions beyond standard BLAS/LAPACK interfaces.
2.1. sci-libs/flexiblas
-----------------------
FlexiBLAS is a thin wrapper over BLAS/LAPACK that provides runtime-
switchable backends. It's similar in principle to media-libs/libglvnd.
It provides a modern complete BLAS/LAPACK interface that other packages
can build on. At runtime, it dispatches calls to the selected provider,
or if it doesn't implement the call in question, to sci-libs/lapack,
or a built-in fallback.
We get the best of both worlds: packages can use the complete API, we
use the preferred provider when we can, and fall back when we can't.
On top of that, unlike with eselect, the preferred provider can be
overridden via user configuration and environment variables too.
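The dispatch-with-fallback idea can be sketched roughly as follows. This is not how FlexiBLAS is implemented; it only models the behaviour described above. The backend names, the routine tables and the environment variable SKETCH_BLAS_BACKEND are all assumptions made for the example.

```python
import os


# Stand-in routines; a real backend would be a shared library.
def _ddot_fast(x, y):
    return sum(a * b for a, b in zip(x, y))


def _ddot_ref(x, y):
    return sum(a * b for a, b in zip(x, y))


def _dasum_ref(x):
    return sum(abs(a) for a in x)


# Hypothetical backends: the preferred one is incomplete, the reference
# one implements everything.
BACKENDS = {
    "fast": {"ddot": _ddot_fast},
    "reference": {"ddot": _ddot_ref, "dasum": _dasum_ref},
}


def resolve(routine, preferred=None):
    """Return (backend_name, function), trying the preferred backend first."""
    if preferred is None:
        # SKETCH_BLAS_BACKEND is a made-up variable name for this sketch.
        preferred = os.environ.get("SKETCH_BLAS_BACKEND", "fast")
    for name in (preferred, "reference"):
        table = BACKENDS.get(name, {})
        if routine in table:
            return name, table[routine]
    raise LookupError(routine)


print(resolve("ddot", "fast")[0])   # served by the preferred backend
print(resolve("dasum", "fast")[0])  # falls back to the reference backend
```

A caller always sees the complete interface; which backend actually serves a call depends on what the preferred backend implements.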
On the minus side, FlexiBLAS currently doesn't support the mixed
LP64/ILP64 interface provided by the latest sci-libs/lapack (upstream is
planning to add it). So we're stuck with LP64, unless we add USE=64bit-
index to sci-libs/lapack, for the more traditional ILP64 interface.
Another potential concern is that FlexiBLAS seems to be largely
university-developed software, with a single maintainer, no public bug
tracker and a semi-public code repository (it seems that only releases
are pushed there rather than all changes). The author seems quite
friendly, and addressed issues I've reported so far, but the risk is
there. However, we're not alone here: Fedora has switched to FlexiBLAS
already.
2.2. USE flags
--------------
I also propose adding USE flags to packages whenever this makes sense.
In particular, this could be useful for packages that e.g. use
the additional functions MKL provides.
I'm opposed to adding USE flags for different BLAS/LAPACK providers all
over the place, since most of the time these flags would make no real
difference and only harm the ability to dynamically dispatch.
Of course, I do realize that some people will inevitably add unnecessary
flags. They probably already did.
3. The transition
=================
A major problem with this is transitioning from the current model to
FlexiBLAS. I can think of two different approaches to that:
1. The hard approach -- we migrate packages one-by-one to use FlexiBLAS
in place of the virtuals.
2. The lazy approach (untested) -- we replace BLAS/LAPACK libraries with
symlinks to FlexiBLAS.
3.1. The hard approach
----------------------
The advantage of the hard approach is that it's clean and explicit. We
update packages to link to libflexiblas explicitly. Other BLAS/LAPACK
providers, including sci-libs/lapack, are left as-is, i.e. following
upstream install. While transitioning packages, we test them,
and eventually we remove the virtuals and eselect modules.
Bad news is that:
- it's a lot of work
- a lot of packages will require patching to use flexiblas
- people will keep mistakenly linking to sci-libs/lapack directly
This is the approach Fedora took.
3.2. The lazy approach
----------------------
The rough idea is that we move sci-libs/lapack away, via renaming
the libraries or moving them into a subdirectory, and put symlinks to
flexiblas in their place. Packages can continue searching for libblas,
liblapack, etc. as usual, and they will end up being linked to flexiblas
instead.
Note that I haven't found time to test this yet, so I may be missing
some problem with it, but I can't think of one right now. What's
unclear to me is how we handle pkg-config files and CMake: if we also
replace them with symlinks to flexiblas ones, or leave as-is but with
the original paths that are now replaced by symlinks. I also need to
check if the CMake files installed by sci-libs/lapack are actually used
at all.
Bad news is that it's less clean, and leaves us forever diverging in how
sci-libs/lapack is installed, and apps linking to '-lblas -lcblas
-llapack -llapacke' instead of '-lflexiblas'.
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Matthias Maier @ 2025-09-11 19:18 UTC
To: gentoo-dev
> TL;DR: I'm proposing removing eselect-blas/lapack and the virtuals,
> and using sci-libs/flexiblas instead.
Yes, please!
> Another potential concern is that FlexiBLAS seems to be largely
> university-developed software, with a single maintainer, no public bug
> tracker and a semi-public code repository (it seems that only releases
> are pushed there rather than all changes). The author seems quite
> friendly, and addressed issues I've reported so far, but the risk is
> there. However, we're not alone here: Fedora has switched to FlexiBLAS
> already.
On the other hand, BLAS/LAPACK is a *very* slowly evolving standard. It
would also be feasible to consider picking up maintenance of FlexiBLAS
if the need arises.
We could wait for support for LP64 / ILP64 in FlexiBLAS before
switching. Or just switch and let ILP64 hang in the air for a bit.
Best,
Matthias
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michał Górny @ 2025-09-11 19:38 UTC
To: gentoo-dev
On Thu, 2025-09-11 at 14:18 -0500, Matthias Maier wrote:
> > TL;DR: I'm proposing removing eselect-blas/lapack and the virtuals,
> > and using sci-libs/flexiblas instead.
>
> Yes, please!
>
>
> > Another potential concern is that FlexiBLAS seems to be largely
> > university-developed software, with a single maintainer, no public bug
> > tracker and a semi-public code repository (it seems that only releases
> > are pushed there rather than all changes). The author seems quite
> > friendly, and addressed issues I've reported so far, but the risk is
> > there. However, we're not alone here: Fedora has switched to FlexiBLAS
> > already.
>
> On the other hand, BLAS/LAPACK is a *very* slowly evolving standard. It
> would also be feasible to consider picking up maintenance of FlexiBLAS
> if the need arises.
>
>
> We could wait for support for LP64 / ILP64 in FlexiBLAS before
> switching. Or just switch and let ILP64 hang in the air for a bit.
Given that we don't really support ILP64 now (it's semi-accidental
at best), I suppose there's no harm in doing it now.
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Benda Xu @ 2025-09-13 14:01 UTC
To: gentoo-dev
Hi Michał,
Michał Górny <mgorny@gentoo.org> writes:
> TL;DR: I'm proposing removing eselect-blas/lapack and the virtuals,
> and using sci-libs/flexiblas instead.
After reading the FlexiBLAS-related publications, I find it a cleaner
solution than our eselect-ldso.
> 1. The problem
> ==============
> The current Gentoo infrastructure for supporting multiple BLAS/LAPACK
> variants doesn't really work. The primary problem is that while the
> interfaces are standardized, they are also versioned and different
> providers (and their different versions) tend to diverge over what's
> implemented. On top of that, packages generally don't stop at a kind
> of "common subset" (whatever that would be), but tend to detect which
> routines are implemented by the BLAS/LAPACK libraries actually used.
> The end result is that if you build a package against library A, then
> it may or may not work with library B after switching.
>
> While I agree in principle with eselect-ldso being superior to regular
> alternatives, it suffers from this problem even more. Since we're
> building packages against sci-libs/lapack, i.e. the reference
> implementation, we're effectively building against the most complete
> implementation.
>
> Given the ABI problem, no solution based on alternatives or virtuals
> can really work.
>
>
> 2. Proposed solution
> ====================
> I'd like to propose a two-tier replacement to the current system:
>
> 1. Base BLAS/LAPACK support via sci-libs/flexiblas.
>
> 2. USE flags for specific implementations for packages that use
> additional functions beyond standard BLAS/LAPACK interfaces.
>
> [...]
While I have not personally encountered the ABI incompatibilities
described with my setups with blis and OpenBLAS, I found the analysis
and proposed solution to be sound and well-reasoned. The proposed
approach makes sense to me from a technical standpoint.
Yours,
Benda
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michael Orlitzky @ 2025-09-14 21:45 UTC
To: gentoo-dev
On 2025-09-11 20:07:11, Michał Górny wrote:
> Hi, everyone.
>
> TL;DR: I'm proposing removing eselect-blas/lapack and the virtuals,
> and using sci-libs/flexiblas instead.
I wish this (not your proposal necessarily, but all of it) were
simpler in the common case where OpenBLAS works well enough that the
user would never think about switching, much less switching at
runtime.
I'm a little skeptical of the FlexiBLAS benchmarks. The only ones I've
seen are in lawn284.pdf, where they measure in seconds(!) on a very
fast CPU the extra time taken by FlexiBLAS... on top of an already
slow computation. The wrapper could be 1000% slower than a direct
library call, but you'd never know because they multiply two big
matrices afterwards and then tell you how long it took to do the fast
thing and the slow thing together.
All I personally want is to be able to dump OpenBLAS in $libdir, the
same way you're planning to do with FlexiBLAS. I can do this now with
eselect-ldso, but I don't care about runtime switching and would be
fine if that went away or was replaced with FlexiBLAS.
Instead of putting FlexiBLAS in $libdir, I guess we could make the
$libdir implementation configurable via USE and have FlexiBLAS be one
of the options. Default to OpenBLAS on most arches, unless you need
the reference implementation as a dependency? But then we would have
to make sure that every implementation knows how to install itself to
two places, etc. It would be even more work than doing it 1x for
FlexiBLAS, and I'm not volunteering, so this is just thinking out
loud.
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michał Górny @ 2025-09-15 2:12 UTC
To: gentoo-dev
On Sun, 2025-09-14 at 17:45 -0400, Michael Orlitzky wrote:
> Instead of putting FlexiBLAS in $libdir, I guess we could make the
> $libdir implementation configurable via USE and have FlexiBLAS be one
> of the options. Default to OpenBLAS on most arches, unless you need
> the reference implementation as a dependency? But then we would have
> to make sure that every implementation knows how to install itself to
> two places, etc. It would be even more work than doing it 1x for
> FlexiBLAS, and I'm not volunteering, so this is just thinking out
> loud.
You're missing the point above. The implementations aren't guaranteed
to be ABI-compatible, and USE flags can't handle that, unless every
single package has a [openblas=] and so on.
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Mitchell Dorrell @ 2025-09-15 3:44 UTC
To: gentoo-dev
On Sun, Sep 14, 2025, 22:13 Michał Górny <mgorny@gentoo.org> wrote:
> unless every single package has a [openblas=] and so on.
Suppose we take a step back to consider a general case. A software package
needs some library routines which can be fulfilled by any of a few
different non-ABI-compatible dependencies. Would this ("[openblas=]") be
the normal way to handle the situation?
Why are linear algebra libraries a special case?
I don't know enough about FlexiBLAS to support or oppose the change, but
I'm very interested in the outcome of this discussion.
-Mitchell Dorrell
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michael Orlitzky @ 2025-09-15 12:40 UTC
To: gentoo-dev
On 2025-09-15 04:12:01, Michał Górny wrote:
>
> You're missing the point above. The implementations aren't guaranteed
> to be ABI-compatible, and USE flags can't handle that, unless every
> single package has a [openblas=] and so on.
>
I know, but I don't think that's too crazy, or even the hardest part
of this. With a USE_EXPAND and an eclass, the flag and dependency
could be handled automatically.
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michał Górny @ 2025-09-15 14:05 UTC
To: gentoo-dev
On Mon, 2025-09-15 at 08:40 -0400, Michael Orlitzky wrote:
> On 2025-09-15 04:12:01, Michał Górny wrote:
> >
> > You're missing the point above. The implementations aren't guaranteed
> > to be ABI-compatible, and USE flags can't handle that, unless every
> > single package has a [openblas=] and so on.
> >
>
> I know, but I don't think that's too crazy, or even the hardest part
> of this. With a USE_EXPAND and an eclass, the flag and dependency
> could be handled automatically.
Can you suggest a good (and easy to use) benchmark for BLAS/LAPACK?
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michael Orlitzky @ 2025-09-15 15:05 UTC
To: gentoo-dev
On 2025-09-15 16:05:53, Michał Górny wrote:
>
> Can you suggest a good (and easy to use) benchmark for BLAS/LAPACK?
No, I'm honestly not that knowledgeable about it. Despite nominally
working in linear algebra, I mainly use the computer to solve systems
of low dimension that arise in other algorithms. There are no doubt a
few thousand such uses in R, Octave, SageMath, etc.
My only real concern is that the FlexiBLAS benchmarks appear targeted
at people who use linear algebra to do large-scale linear algebra.
Their smallest n=500 is larger than the matrices I usually work
with. But if the numbers look the same for n=3, then there's nothing
to worry about. You could steal the FlexiBLAS benchmarks and tone down
"n" to test this?
(You can always turn many small systems into one big system to cut
down on the number of API calls, but the size of the system grows
quadratically so this isn't a real strategy.)
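One way to read the remark about quadratic growth, assuming the small systems are combined into a single dense block-diagonal matrix (an assumption made for this sketch, not something stated above):

```python
def separate_entries(k, n):
    """Matrix entries handled when k systems of size n x n are solved one by one."""
    return k * n * n


def combined_entries(k, n):
    """Entries of one dense (k*n) x (k*n) matrix holding all k systems."""
    return (k * n) ** 2


# n=3 systems, as in the example above; k is the number of systems merged.
for k in (1, 10, 1000):
    print(k, separate_entries(k, 3), combined_entries(k, 3))
```

Under that assumption the combined matrix has k times as many entries as the separate systems put together, so the saving in API calls is paid for with rapidly growing storage and work.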
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Matthias Maier @ 2025-09-15 15:43 UTC
To: gentoo-dev
Dear all,
Looking at the publication for FlexiBLAS I'd say the runtime overhead is
negligible.
Just another pointer: glibc is already doing dynamic dispatch for
transcendental functions (sin, exp, pow, etc.) to specialized
implementations. This is a noticeable pointer walk but still small
compared to the actual runtime of, say, pow().
So if flexiblas has implemented their dynamic dispatch in a similar
fashion then it's going to be reasonably fast.
I doubt that n=3 is the target application for BLAS/LAPACK
routines. Setting up everything and then calling into the routines
(without flexiblas) will already cost you much more than the 10
instructions that are executed in the end. [1]
In that sense the benchmarks presented are quite representative, I would
say. Maybe n=100 would have been a nice data point to have.
But all that said, our current support for blas/lapack in gentoo isn't
particularly great and I think that flexiblas has the potential to
improve that situation significantly.
Best,
Matthias
[1] And from my experience, *every* project that cares about efficient
algorithms for n=3 handrolls these 10 instructions anyway...
On Mon, Sep 15, 2025, at 10:05 CDT, Michael Orlitzky <mjo@gentoo.org> wrote:
> On 2025-09-15 16:05:53, Michał Górny wrote:
>>
>> Can you suggest a good (and easy to use) benchmark for BLAS/LAPACK?
>
> No, I'm honestly not that knowledgeable about it. Despite nominally
> working in linear algebra, I mainly use the computer to solve systems
> of low dimension that arise in other algorithms. There are no doubt a
> few thousand such uses in R, Octave, SageMath, etc.
>
> My only real concern is that the FlexiBLAS benchmarks appear targeted
> at people who use linear algebra to do large-scale linear algebra.
> Their smallest n=500 is larger than the matrices I usually work
> with. But if the numbers look the same for n=3, then there's nothing
> to worry about. You could steal the FlexiBLAS benchmarks and tone down
> "n" to test this?
>
> (You can always turn many small systems into one big system to cut
> down on the number of API calls, but the size of the system grows
> quadratically so this isn't a real strategy.)
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michael Orlitzky @ 2025-09-16 17:56 UTC
To: gentoo-dev
On 2025-09-15 10:43:51, Matthias Maier wrote:
> Dear all,
>
> Looking at the publication for FlexiBLAS I'd say the runtime overhead is
> negligible.
It is, if you are multiplying huge matrices with each API call...
> Just another pointer: glibc is already doing dynamic dispatch for
> transcendental functions (sin, exp, pow, etc.) to specialized
> implementations. This is a noticeable pointer walk but still small
> compared to the actual runtime of, say, pow().
>
> So if flexiblas has implemented their dynamic dispatch in a similar
> fashion then it's going to be reasonably fast.
To avoid getting into an argument over hypotheticals, I wrote a small
C program that I assume is the worst case for FlexiBLAS: doing
100,000,000 scalar (1x1 matrix) multiplications with repeated calls to
cblas_dgemm(). These numbers obviously fluctuate, but they are
representative:
* netlib: 8.51s
* openblas: 13.75s
* flexiblas (netlib): 24.71s
* flexiblas (openblas): 27.45s
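For reference, the totals above can be turned into a per-call figure. The sketch below only does arithmetic on the four numbers quoted in this message; it does not rerun the benchmark, and it assumes the whole difference between a wrapped and a direct run is wrapper overhead.

```python
CALLS = 100_000_000  # number of cblas_dgemm() calls in the test

# Wall-clock totals in seconds, copied from the list above.
seconds = {
    "netlib": 8.51,
    "openblas": 13.75,
    "flexiblas (netlib)": 24.71,
    "flexiblas (openblas)": 27.45,
}


def per_call_ns(total_seconds, calls=CALLS):
    """Convert a total runtime into nanoseconds per call."""
    return total_seconds / calls * 1e9


def wrapper_overhead_ns(backend):
    """Extra time per call when the same backend is reached through FlexiBLAS."""
    direct = seconds[backend]
    wrapped = seconds[f"flexiblas ({backend})"]
    return per_call_ns(wrapped - direct)


for backend in ("netlib", "openblas"):
    print(backend, round(wrapper_overhead_ns(backend)), "ns per call")
```

With these figures the difference works out to roughly 160 ns per call for netlib and roughly 140 ns per call for OpenBLAS.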
> I doubt that n=3 is the target application for BLAS/LAPACK
> routines. Setting up everything and then calling into the routines
> (without flexiblas) will already cost you much more than the 10
> instructions that are executed in the end. [1]
n=3 is just an example, but I think you will find dimension three to
be quite popular among the people who live there.
For every researcher working directly with BLAS, there are a thousand
users of software like Mathematica, MATLAB, Magma, Maple, Octave,
SageMath, SciPy, etc. Typically these will hand off your computation
regardless of size. Likewise for graphics and signal processing
applications. On a distro where BLAS gets pulled in as a dependency, I
think this generic use case has to be considered the main one.
> But all that said, our current support for blas/lapack in gentoo isn't
> particularly great and I think that flexiblas has the potential to
> improve that situation significantly.
No argument here. I'm not going to die if the Sage test suite gets 1%
slower, I would just rather be clear about the overhead.
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michał Górny @ 2025-09-16 18:37 UTC
To: gentoo-dev
On Tue, 2025-09-16 at 13:56 -0400, Michael Orlitzky wrote:
> n=3 is just an example, but I think you will find dimension three to
> be quite popular among the people who live there.
>
> For every researcher working directly with BLAS, there are a thousand
> users of software like Mathematica, MATLAB, Magma, Maple, Octave,
> SageMath, SciPy, etc. Typically these will hand off your computation
> regardless of size. Likewise for graphics and signal processing
> applications. On a distro where BLAS gets pulled in as a dependency, I
> think this generic use case has to be considered the main one.
I don't think the number of users alone is a key issue here. The real
question is, do these users actually perform a humongous number of
small-dimension computations where such an overhead will actually
matter?
--
Best regards,
Michał Górny
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Matthias Maier @ 2025-09-16 20:20 UTC
To: gentoo-dev
On Tue, Sep 16, 2025, at 12:56 CDT, Michael Orlitzky <mjo@gentoo.org> wrote:
> C program that I assume is the worst case for FlexiBLAS: doing
> 100,000,000 scalar (1x1 matrix) multiplications with repeated calls to
> cblas_dgemm(). These numbers obviously fluctuate, but they are
> representative:
>
> * netlib: 8.51s
> * openblas: 13.75s
> * flexiblas (netlib): 24.71s
> * flexiblas (openblas): 27.45s
So this is basically measuring the pure function call overhead
introduced by flexiblas which amounts to roughly 15s for 100,000,000
repeated calls to cblas_dgemm(). This is pretty good I would say.
Best,
Matthias
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Matthias Maier @ 2025-09-16 20:36 UTC
To: gentoo-dev
On Tue, Sep 16, 2025, at 15:20 CDT, Matthias Maier <tamiko@gentoo.org> wrote:
>> C program that I assume is the worst case for FlexiBLAS: doing
>> 100,000,000 scalar (1x1 matrix) multiplications with repeated calls to
>> cblas_dgemm(). These numbers obviously fluctuate, but they are
>> representative:
>>
>> * netlib: 8.51s
>> * openblas: 13.75s
>> * flexiblas (netlib): 24.71s
>> * flexiblas (openblas): 27.45s
> So this is basically measuring the pure function call overhead
> introduced by flexiblas which amounts to roughly 15s for 100,000,000
> repeated calls to cblas_dgemm(). This is pretty good I would say.
Just for reference what we are talking about: 100,000,000 scalar
multiplications (with 2 cycle inverse throughput) on a core with 3GHz
will amount to 0.07s.
So calling into BLAS for that adds roughly a factor of ~100 in
overhead. FlexiBLAS will worsen that to a factor of ~300.
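The estimate can be reproduced with a few lines of arithmetic. The cycle count and clock rate are the assumptions stated in this message, and the measured times are the netlib figures quoted above; nothing here is a new measurement.

```python
CALLS = 100_000_000
CYCLES_PER_MULTIPLY = 2  # inverse throughput assumed in this message
CLOCK_HZ = 3e9           # 3 GHz core

# Time for the bare multiplications, with no library call involved.
ideal_seconds = CALLS * CYCLES_PER_MULTIPLY / CLOCK_HZ

# Measured totals in seconds, taken from the quoted benchmark.
measured = {
    "netlib": 8.51,
    "flexiblas (netlib)": 24.71,
}

# How many times slower each measured run is than the bare multiplications.
slowdown = {name: total / ideal_seconds for name, total in measured.items()}

print(round(ideal_seconds, 3), "s for the bare multiplications")
for name, factor in slowdown.items():
    print(name, round(factor))
```

The bare multiplications come out at about 0.067 s, and the slowdown factors at roughly 130 and 370, which matches the ~100 and ~300 above to within the precision of a back-of-the-envelope estimate.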
Best,
Matthias
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michael Orlitzky @ 2025-09-16 22:58 UTC
To: gentoo-dev
On 2025-09-16 20:37:39, Michał Górny wrote:
>
> I don't think the number of users alone is a key issue here. The real
> question is, do these users actually perform a humongous number of
> small-dimension computations where such an overhead will actually
> matter?
Ultimately I don't think it will *matter* to anyone. But ironically,
the type of person it will bother most is the sort who wants to switch
his BLAS implementation.
It's worth it to fix the other issues you pointed out. From what I can
tell, your approach can even be extended in the future with other
system-blas providers if anyone cares enough to do it.
* Re: [gentoo-dev] Redoing BLAS/LAPACK in Gentoo, using FlexiBLAS
From: Michał Górny @ 2025-09-20 17:12 UTC
To: gentoo-dev
Hi,
Some updates.
On Thu, 2025-09-11 at 20:07 +0200, Michał Górny wrote:
> We get the best of both worlds: packages can use the complete API, we
> use the preferred provider when we can, and fall back when we can't.
> On top of that, unlike with eselect, the preferred provider can be
> overridden via user configuration and environment variables too.
We're currently blocked on the missing xerbla_array_ function, but
upstream promised to add it for us.
On the positive side, it looks like FlexiBLAS also implements functions
specific to OpenBLAS that Netlib LAPACK doesn't have, so we'll be able
to remove the OpenBLAS hack from dev-python/qiskit-aer.
> On the minus side, FlexiBLAS currently doesn't support the mixed
> LP64/ILP64 interface provided by the latest sci-libs/lapack (upstream is
> planning to add it). So we're stuck with LP64, unless we add USE=64bit-
> index to sci-libs/lapack, for the more traditional ILP64 interface.
Upstream is planning to add support for that. However, all things
considered it's not a high priority -- it's just an implementation
detail for us.
> 3.2. The lazy approach
> ----------------------
> The rough idea is that we move sci-libs/lapack away, via renaming
> the libraries or moving them into a subdirectory, and put symlinks to
> flexiblas in their place. Packages can continue searching for libblas,
> liblapack, etc. as usual, and they will end up being linked to flexiblas
> instead.
>
> Note that I haven't found time to test this yet, so I may be missing
> some problem with it, but I can't think of one right now. What's
> unclear to me is how we handle pkg-config files and CMake: if we also
> replace them with symlinks to flexiblas ones, or leave as-is but with
> the original paths that are now replaced by symlinks. I also need to
> check if the CMake files installed by sci-libs/lapack are actually used
> at all.
>
> Bad news is that it's less clean, and leaves us forever diverging in how
> sci-libs/lapack is installed, and apps linking to '-lblas -lcblas
> -llapack -llapacke' instead of '-lflexiblas'.
I've been testing this for a while, and it's working quite well (modulo
the xerbla_array_ problem mentioned). I haven't seen any problems with
packages built before the switch (i.e. working via symlinks), nor with
packages built after the switch.
Bad news is that it's a bit unidirectional. Once you switch to
FlexiBLAS, the linker ends up linking to libflexiblas.so, and if you
switch back, you have to rebuild stuff built with flexiblas (with
preserved-libs, there is no immediate breakage). Technically, this
could be solved by providing libblas.so, etc. wrappers instead of
symlinks but I couldn't find a way to do that.
I thought about copying what eselect-ldso does, but it turns out it's
just horribly wrong (it duplicates all symbols) and shouldn't have ever
been merged. One more reason to kill it with fire.
I've created a tracker at:
https://bugs.gentoo.org/963034
--
Best regards,
Michał Górny