Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mkl: wrap with openmp in LD_LIBRARY_PATH #48926

Closed
wants to merge 1 commit into from

Conversation

bhipple
Copy link
Contributor

@bhipple bhipple commented Oct 23, 2018

Fixes #48876. We could also fix this in numpy itself, but since any other
package using the multi-threaded portions of mkl needs this, it makes sense to
fix it in the source itself.

As mentioned, the upstream binary is very large, so we create a wrapper package
for the openmp-dependent setup hook to avoid rebuilding mkl itself.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@bhipple
Copy link
Contributor Author

bhipple commented Oct 23, 2018

CC @FRidh @costrouc @xeji @orivej this follows up on my original intention to make mkl "just work", but without blowing up the size on disk with new versions for each stdenv.

Using this, I can successfully build numpy off the staging branch and pass all of the tests except for this one:

msg        = 'Warning: neither f2py nor f2py3 nor f2py3.6 found in path'
success    = False
version    = sys.version_info(major=3, minor=6, micro=7, releaselevel='final', serial=0)

/nix/store/srycdc4h12dvdygp23vv8ld138jjkawv-python3.6-numpy-1.15.2/lib/python3.6/site-packages/numpy/tests/test_scripts.py:93: AssertionError
= 1 failed, 4979 passed, 41 skipped, 88 deselected, 9 xfailed in 72.91 seconds =

which is significantly further than I get without it (fails immediately otherwise).

Moreover, this package also builds and tests with mkl + openmp:
bhipple@508c54e

(Note that I'm adding mkl itself to the path in the above test, but not openmp; it has the same issue where it doesn't add SO_NEEDED entries and instead lets users fill it in with the env. Since that one isn't proprietary, I'm going to fix it up with a patch in the build before submitting, but that's a separate PR.)

SO_NEEDED entry directly to the library, since the license prevents
modification.
*/
let
version = "${date}.${rel}";
date = "2019.0";
rel = "117";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at this in split view with whitespace ignored, I didn't really change much; it's just that there's a let ... in ... in this file now, so the indentation is different.

};

libvar = if stdenvNoCC.isDarwin then "DYLD_LIBRARY_PATH" else "LD_LIBRARY_PATH";
in
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is unfriendly to overlays being able to override mkl versions easily. Is there an idiomatic way to proceed? Pass all the versions and fixed-output shas as defaulted parameters? Seems clunky enough that I bet a user who wants to fork it will just copy the expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just expose unwrapped mkl as attribute in pkgs? Then it can be overridden like any other package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense; updated accordingly!

@GrahamcOfBorg
Copy link

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.


@GrahamcOfBorg
Copy link

No attempt on x86_64-linux (full log)

The following builds were skipped because they don't evaluate on x86_64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.



libvar = if stdenvNoCC.isDarwin then "DYLD_LIBRARY_PATH" else "LD_LIBRARY_PATH";
in
stdenvNoCC.mkDerivation rec {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for wrapping. Let's create a hook with makeSetupHook. When the hook is needed, it can be added to the buildInputs.

Copy link
Contributor Author

@bhipple bhipple Oct 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have to wrap if we want to maintain the large binaries being in a fixed-output derivation, since otherwise our output hash changes based on openmp (or the setupHook arg). @veprbl's suggestion to expose both attributes instead of keeping the unwrapped one private seems like the best of both worlds, though.

@GrahamcOfBorg
Copy link

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.


@GrahamcOfBorg
Copy link

No attempt on x86_64-linux (full log)

The following builds were skipped because they don't evaluate on x86_64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.


@bhipple
Copy link
Contributor Author

bhipple commented Nov 4, 2018

Ping @FRidh any further comments or suggestions for improvement?

@bhipple bhipple changed the base branch from master to staging November 5, 2018 04:15
@GrahamcOfBorg
Copy link

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.


@GrahamcOfBorg
Copy link

No attempt on x86_64-linux (full log)

The following builds were skipped because they don't evaluate on x86_64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.


Fixes NixOS#48876. We could also fix this in `numpy` itself, but since any other
package using the multi-threaded portions of `mkl` needs this, it makes sense to
fix it in the source itself.

As mentioned, the upstream binary is very large, so we create a wrapper package
for the openmp-dependent setup hook to avoid rebuilding `mkl` itself.

With this fix, the setup-hook allows us to simplify `numexpr`; likewise, both
`numpy` and `numexpr` pass all of their test cases.
@@ -71,8 +74,6 @@ in buildPythonPackage rec {
inherit blasImplementation cfg;
};

doCheck = blasImplementation != "mkl";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Numpy actualy passes all of its test cases with mkl as its blas implementation.

@@ -28,6 +28,9 @@ in buildPythonPackage rec {
nativeBuildInputs = [ gfortran pytest ];
buildInputs = [ blas ];

# If the blas imlementation is MKL, we need to propagate it for the setup-hook.
propagatedBuildInputs = [ ] ++ lib.optional (blasImplementation == "mkl") blas;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since callers all the way up to the top of the stack will need to run mkl's setup-hook, it has to be propagated as far as I can tell.

@@ -47,4 +41,4 @@ buildPythonPackage rec {
homepage = "https://github.com/pydata/numexpr";
license = lib.licenses.mit;
};
}
Copy link
Contributor Author

@bhipple bhipple Nov 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup hook makes mkl and numpy "just work", without any customization needed in numexpr; this passes all of its test cases for both python2 and python3.

@GrahamcOfBorg
Copy link

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.


@GrahamcOfBorg
Copy link

No attempt on x86_64-linux (full log)

The following builds were skipped because they don't evaluate on x86_64-linux: mkl

Partial log (click to expand)


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnfree = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnfree = true; }
to ~/.config/nixpkgs/config.nix.


@bhipple
Copy link
Contributor Author

bhipple commented Nov 10, 2018

Any chance you could take a look, @FRidh? I'm happy to make further changes if you have suggestions for improvement.

@tbenst
Copy link
Contributor

tbenst commented Nov 15, 2018

I tried to use these changes to compile Julia with MKL, but I unfortunately I still see

/build/source/usr/bin/julia: symbol lookup error: /nix/store/82brdma6j2yjfx8vi1mkm8v8fyx6h1rv-mkl-2019.0.117/lib/libmkl_intel_thread.so: undefined symbol: omp_get_num_procs

@bhipple any thoughts?

https://github.com/tbenst/nixpkgs/blob/master/pkgs/development/compilers/julia-mkl/shared.nix

Edit: I believe this happens during the test phase. It seems like Julia itself compiles ok, but when testing the various .jl files, Julia spits this out when it tries to compile .jl code using MKL

@bhipple
Copy link
Contributor Author

bhipple commented Nov 16, 2018

Actually, I've found a much better and more robust way to do this, and will open up a follow-up PR this weekend with the details. The implementation in this PR works -- but only if you're in a builder context where you've sourced the information, which means finished/tested packages may be unusable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants