Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BLAS/LAPACK switching mechanism #83888

Merged
merged 8 commits into from Apr 17, 2020

Conversation

matthewbauer
Copy link
Member

@matthewbauer matthewbauer commented Mar 31, 2020

Motivation for this change

This is based on previous work for switching between BLAS and LAPACK
implementation in Debian[1] and Gentoo[2]. The goal is to have one way
to depend on the BLAS/LAPACK libraries that all packages must use. The
attrs “blas” and “lapack” are used to represent a wrapped BLAS/LAPACK
provider. Derivations that don’t care how BLAS and LAPACK are
implemented can just use blas and lapack directly. If you do care what
you get (perhaps for some CPP), you should verify that blas and lapack
match what you expect with an assertion.

The “blas” package collides with the old “blas” reference
implementation. This has been renamed to “blas-reference”. In
addition, “lapack-reference” is also included, corresponding to
“liblapack” from Netlib.org.

Currently, there are 3 providers of the BLAS and LAPACK interfaces:

  • lapack-reference: the BLAS/LAPACK implementation maintained by netlib.org
  • OpenBLAS: an optimized version of BLAS and LAPACK
  • MKL: Intel’s unfree but highly optimized BLAS/LAPACK implementation

By default, the above implementations all use the “LP64” BLAS and
LAPACK ABI. This corresponds to “openblasCompat” and is the safest way
to use BLAS/LAPACK. You may received some benefits from “ILP64” or
8-byte integer BLAS at the expense of breaking compatibility with some
packages.

This can be switched at build time with an override like:

import <nixpkgs> {
    config.allowUnfree = true;
    overlays = [(self: super: {
      lapack = super.lapack.override {
        lapackProvider = super.lapack-reference;
      };
      blas = super.blas.override {
        blasProvider = super.lapack-reference;
      };
    })];
  }

or, switched at runtime via LD_LIBRARY_PATH like:

$ LD_LIBRARY_PATH=$(nix-build -E '(with import <nixpkgs> {}).lapack.override { lapackProvider = pkgs.mkl; is64bit = true; })')/lib:$(nix-build -E '(with import <nixpkgs> {}).blas.override { blasProvider = pkgs.mkl; is64bit = true; })')/lib ./your-blas-linked-binary

By default, we use OpenBLAS LP64 also known in Nixpkgs as
openblasCompat.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@jonringer
Copy link
Contributor

looks like you put a lot of effort into this, and thank you for doing so.

cc @bhipple @costrouc @FRidh @tbenst (people I can think of which may have mentioned blas implementations before)

Copy link
Member

@timokau timokau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, though I haven't taken the time to understand all the details. Thank you!

Documentation is needed though.

pkgs/build-support/alternatives/blas/default.nix Outdated Show resolved Hide resolved
pkgs/build-support/alternatives/blas/default.nix Outdated Show resolved Hide resolved
pkgs/build-support/alternatives/blas/default.nix Outdated Show resolved Hide resolved
pkgs/build-support/alternatives/blas/default.nix Outdated Show resolved Hide resolved
pkgs/build-support/alternatives/blas/default.nix Outdated Show resolved Hide resolved
"lammps" "lammps-mpi"

# requires openblas
# "caffe" "mxnet" "flint" "sage" "sageWithDoc"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think flint and sage necessarily require openblas. But since you added the comment I'm assuming you tested them. Maybe I'll have time to look into that once this is merged.

@@ -0,0 +1,91 @@
{ pkgsFun ? import ../..
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the intention of this file? Is it only for testing purposes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this was used in conjunction with Hydra to test this PR. It's still worth keeping although its not release *alternatives but blas only.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - mainly to test that everything works. It certainly doesn't have to be included, but may help if we ever want to make a jobset to test this stuff.

if ! isELF "$i"; then continue; fi

if $OBJDUMP -p "$i" | grep 'NEEDED' | awk '{ print $2; }' | grep -q '\(libmkl_rt.so\|libopenblas.so.0\)'; then
echo "Found direct BLAS/LAPACK implementation reference in $i. Depend on blas / lapack instead."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error should be a bit more actionable. It should explain why you should depend on blas / lapack instead and how you can still use a specific implementation if you know what you're doing.

Copy link
Member

@FRidh FRidh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR allows for a central configuration of blas to be applied on all packages. Before different packages were using different blas, either because no attention was paid, or because a package requires a specific one (e.g. numpy requires openblasCompat).

Will passing in a specific blas for a specific package still function due to the renaming of -l flags?

@@ -275,20 +275,16 @@ self: super:
old:
let
blas = old.passthru.args.blas or pkgs.openblasCompat;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @adisbladis for the poetry2nix repo

@@ -0,0 +1,91 @@
{ pkgsFun ? import ../..
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this was used in conjunction with Hydra to test this PR. It's still worth keeping although its not release *alternatives but blas only.

@matthewbauer
Copy link
Member Author

Will passing in a specific blas for a specific package still function due to the renaming of -l flags?

No, they need to use the generic name if you are using the generic "blas" or "lapack". But, you can still directly depend on openblas or mkl if you want and use the directly.

@FRidh FRidh added this to Needs review in Staging Apr 3, 2020
@FRidh
Copy link
Member

FRidh commented Apr 4, 2020

No, they need to use the generic name if you are using the generic "blas" or "lapack". But, you can still directly depend on openblas or mkl if you want and use the directly.

This confuses me. So, we use the default blas and lapack everywhere, but then in one package I choose an different blas, so I pass in e.g. blas = mkl;. Then the -lblas that is introduced here won't function and -lmkl (don't know if that's the right name) would have to be added. Is that correct?

And to work-around that issue, one needs to wrap it first and pass in the wrapped blas.

let
  my_blas = blas.override {blasProvider = mkl;};
in numpy.override {blas = my_blas;}

It's a trivial operation but makes me think we should maybe have a mkl-unwrapped and mkl and same for others blas and lapack implementations so we can just do

@markuskowa
Copy link
Member

@matthewbauer
Copy link
Member Author

No, they need to use the generic name if you are using the generic "blas" or "lapack". But, you can still directly depend on openblas or mkl if you want and use the directly.

This confuses me. So, we use the default blas and lapack everywhere, but then in one package I choose an different blas, so I pass in e.g. blas = mkl;. Then the -lblas that is introduced here won't function and -lmkl (don't know if that's the right name) would have to be added. Is that correct?

Yes, you wouldn't be able to do this because neither mkl, or openblas provide a "libblas" or "liblapack.

And to work-around that issue, one needs to wrap it first and pass in the wrapped blas.

let
  my_blas = blas.override {blasProvider = mkl;};
in numpy.override {blas = my_blas;}

It's a trivial operation but makes me think we should maybe have a mkl-unwrapped and mkl and same for others blas and lapack implementations so we can just do

I think better than a wrapper would be just providing a symlink in mkl and openblas for this. I think this would actually make some other logic simpler.

vbgl pushed a commit that referenced this pull request May 6, 2020
drewrisinger added a commit to drewrisinger/nur-packages that referenced this pull request Oct 17, 2020
drewrisinger added a commit to drewrisinger/nur-packages that referenced this pull request Oct 20, 2020
risicle added a commit to risicle/nixpkgs that referenced this pull request Oct 25, 2020
without master's fix in NixOS#83888, opencv3 & opencv4 end up with an 8-byte
openblas, which it does work with. however this causes the python
bindings to also end up with an 8-byte openblas, which numpy doesn't work
with. force 4-byte openblas for opencv.
@alexvorobiev
Copy link
Contributor

alexvorobiev commented Nov 16, 2020

I have built R using MKL for both blas and lapack:

self: super: {
  blas = super.blas.override { blasProvider = self.mkl; };
  lapack = super.lapack.override { lapackProvider = self.mkl; };
}

Now matrix multiplication for large matrices produces almost random results:

> res <- rep(0, 30)
> mat <- matrix(data = rep(1,30000), nrow = 10000, ncol = 3)
> vect <- c(1,-1,3)
> for (i in 1:1000) { r <- (mat %*% vect)[1]; res[[r]] <- res[[r]] + 1 }
> res
 [1]   0   0  26   0   0  51   0   0  82   0   0 143   0   0  96   0   0 114   0
[20]   0 146   0   0 342   0   0   0   0   0   0

For comparison, Microsoft R Open, also with MKL, produces correct results:

> for (i in 1:1000) { r <- (mat %*% vect)[1]; res[[r]] <- res[[r]] + 1 }
> res
 [1]    0    0 1000    0    0    0    0    0    0    0    0    0    0    0    0
[16]    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/improving-nixos-data-science-infrastructure-ci-for-mkl-cuda/5074/50

FRidh pushed a commit that referenced this pull request Nov 21, 2020
without master's fix in #83888, opencv3 & opencv4 end up with an 8-byte
openblas, which it does work with. however this causes the python
bindings to also end up with an 8-byte openblas, which numpy doesn't work
with. force 4-byte openblas for opencv.
markuskowa added a commit to markuskowa/nixpkgs that referenced this pull request Dec 26, 2020
Removed non-functional lines.
Openmolcas does not build with the blas/lapack switching mechanism
markuskowa added a commit that referenced this pull request Dec 26, 2020
balodja added a commit to balodja/nixpkgs that referenced this pull request May 13, 2021
balodja added a commit to balodja/nixpkgs that referenced this pull request Jul 23, 2021
Most of them are:
* separate packages for different openmodelica components,
* qt4 -> qt5,
* patches to instruct the OMEdit wrapper with stdenv executables
  location,
* adoption of NixOS#89731 and NixOS#109595,
* openblas -> blas, lapack according to NixOS#83888,
* parallel building,
* getting rid of spurious build phases,
* correct the license,
* cross-compilation,
* forcing compiler to clang++ according to OM build recommendations,
* drop of pangox_compat according to NixOS#75909 and NixOS#76412,
* better dependencies, and more.
balodja added a commit to balodja/nixpkgs that referenced this pull request Jul 24, 2021
Co-authored-by: Jaakko Luttinen <jaakko.luttinen@iki.fi>

Most of changes are:
* separate packages for different openmodelica components,
* qt4 -> qt5,
* patches to instruct the OMEdit wrapper with stdenv executables
  location,
* adoption of NixOS#89731 and NixOS#109595,
* openblas -> blas, lapack according to NixOS#83888,
* parallel building,
* getting rid of spurious build phases,
* correct the license,
* cross-compilation,
* forcing compiler to clang++ according to OM build recommendations,
* drop of pangox_compat according to NixOS#75909 and NixOS#76412,
* better dependencies, and more.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Staging
  
Done
Development

Successfully merging this pull request may close these issues.

None yet