Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mxnet: use openblas, fix build #87118

Merged
merged 1 commit into from May 28, 2020
Merged

mxnet: use openblas, fix build #87118

merged 1 commit into from May 28, 2020

Conversation

jonringer
Copy link
Contributor

@jonringer jonringer commented May 6, 2020

Motivation for this change

noticed it was failing when reviewing #87102

previous failure:

builder for '/nix/store/1zcd9mmm3dyyxcr588cszdclrsnjbga1-mxnet-1.6.0.drv' failed with exit code 1; last 10 log lines:
  -- Could not find OpenBLAS lib. Turning OpenBLAS_FOUND off
  CMake Error at cmake/Modules/FindOpenBLAS.cmake:82 (MESSAGE):
    Could not find OpenBLAS
  Call Stack (most recent call first):
    cmake/ChooseBlas.cmake:42 (find_package)
    CMakeLists.txt:310 (include)


  -- Configuring incomplete, errors occurred!
  See also "/build/apache-mxnet-src-1.6.0-incubating/build/CMakeFiles/CMakeOutput.log".
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

Copy link
Contributor

@bhipple bhipple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep using the virtual package, so that if someone overlays mkl as the blas implementation mxnet will pick it up correctly.

Sometimes, we need the specific provider package in buildInputs as well, to get things like cmake hooks or path searches working. The way to do that is to also add blas.provider to the buildInputs, which will cleanly and consistently handle the choice.

CC @matthewbauer I'm pretty sure the above is correct; I had to do something similar for pytorch. Perhaps we should add to the manual?

@@ -1,5 +1,5 @@
{ config, stdenv, lib, fetchurl, bash, cmake
, opencv3, gtest, blas, perl
, opencv3, gtest, openblas, perl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
, opencv3, gtest, openblas, perl
, opencv3, gtest, blas, perl

@@ -17,7 +17,7 @@ stdenv.mkDerivation rec {

nativeBuildInputs = [ cmake perl ];

buildInputs = [ opencv3 gtest blas ]
buildInputs = [ opencv3 gtest openblas ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
buildInputs = [ opencv3 gtest openblas ]
buildInputs = [ opencv3 gtest blas blas.provider ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, wasn't aware of the provider usage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any need for the actual blas package/attr set? Seems to build fine with only the provider

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, linking maybe

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build just failed for me; looks like we need to do the same with lapack to make it visible to the linker.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any need for the actual blas package/attr set? Seems to build fine with only the provider

Not actually sure ... in principle it might be sufficient to just pass in blas.provider, but it seems reasonable to expose the meta-package as well, as that has the actual public API shared objects defined.

Copy link
Contributor

@bhipple bhipple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mxnet built on my machine with this update now; kicking off a nixpkgs-review now.

@jonringer
Copy link
Contributor Author

jonringer commented May 6, 2020

I'm not a big fan of having to do both blas blas.provider, is there a way that we can symLinkJoin (or something else) them so that people only need to specify blas

@jonringer
Copy link
Contributor Author

AFAIK, the provider pass thru is pretty special to blas libraries

@bhipple
Copy link
Contributor

bhipple commented May 6, 2020

Result of nixpkgs-review pr 87118 1

1 package marked as broken and skipped:
- python27Packages.myfitnesspal
5 packages built:
- mxnet
- python27Packages.mxnet
- python37Packages.mxnet
- python37Packages.optuna
- python38Packages.mxnet

@jonringer
Copy link
Contributor Author

I would approve your push, but I can't approve my own PR. Lol

@jonringer
Copy link
Contributor Author

also, wouldn't it be better to get the liblapack.so from the blas.provider?

[12:24:46] jon@jon-desktop /home/jon/projects/nixpkgs (fix-mxnet)
$ nix-build -A blas.provider
/nix/store/hwdbqpqknffqnc30fjxzvrpmp2d3lcg2-openblas-0.3.9
[12:24:58] jon@jon-desktop /home/jon/projects/nixpkgs (fix-mxnet)
$ ls ./result/lib/lib*
./result/lib/libblas.so.3     ./result/lib/liblapack.so.3          ./result/lib/libopenblas.so.0
./result/lib/libcblas.so.3    ./result/lib/libopenblasp-r0.3.9.so
./result/lib/liblapacke.so.3  ./result/lib/libopenblas.so

@jonringer
Copy link
Contributor Author

the main thing I see, is that the blas.provider only provides the libraries with the .3 suffix, maybe we should add symlinks to their unversioned variants

@bhipple
Copy link
Contributor

bhipple commented May 6, 2020

Agreed, the current setup is a little clunky, though I'm not sure symlinkJoin would cover it: IIRC we need the actual package to be in buildInputs, so that setupHooks for doing things like adding to PKG_CONFIG_PATH and the CMake search path work.

@matthewbauer anything you think we should change here?

@jonringer
Copy link
Contributor Author

actually, this is specific to openblas

    ln -s $out/lib/libopenblas${stdenv.hostPlatform.extensions.sharedLibrary} $out/lib/libblas${stdenv.hostPlatform.extensions.sharedLibrary}${stdenv.lib.optionalString stdenv.hostPlatform.isLinux ".3"}
    ln -s $out/lib/libopenblas${stdenv.hostPlatform.extensions.sharedLibrary} $out/lib/libcblas${stdenv.hostPlatform.extensions.sharedLibrary}${stdenv.lib.optionalString stdenv.hostPlatform.isLinux ".3"}
    ln -s $out/lib/libopenblas${stdenv.hostPlatform.extensions.sharedLibrary} $out/lib/liblapack${stdenv.hostPlatform.extensions.sharedLibrary}${stdenv.lib.optionalString stdenv.hostPlatform.isLinux ".3"}
    ln -s $out/lib/libopenblas${stdenv.hostPlatform.extensions.sharedLibrary} $out/lib/liblapacke${stdenv.hostPlatform.extensions.sharedLibrary}${stdenv.lib.optionalString stdenv.hostPlatform.isLinux ".3"}

@jonringer
Copy link
Contributor Author

by modifying the openblas package (in the PR that's now linked to this thread), i was able to build this while only supplying blas.provider

@jonringer
Copy link
Contributor Author

currently blocked by #87134

@matthewbauer
Copy link
Member

CC @matthewbauer I'm pretty sure the above is correct; I had to do something similar for pytorch. Perhaps we should add to the manual?

Yes that sounds right. Something in the manual would be helpful. We may need a "maintainers" section or something since this is more about how you make sure your package is correct vs. just how you use it.

@matthewbauer anything you think we should change here?

Adding the unversioned library seems like a good idea.

@jonringer
Copy link
Contributor Author

should I just target staging and get this merged?

@bhipple
Copy link
Contributor

bhipple commented May 9, 2020

Sounds good, let's just merge this on staging then so it can bake with all the other openblas / numpy changes. and we don't have to remember to circle back to it.

@jonringer jonringer changed the base branch from master to staging May 11, 2020 18:33
@jonringer
Copy link
Contributor Author

@GrahamcOfBorg build mxnet

@doronbehar
Copy link
Contributor

There's a CI error.

@jonringer
Copy link
Contributor Author

@GrahamcOfBorg eval

@jonringer
Copy link
Contributor Author

pretty sure staging was broken at the time of PR creation

@jonringer jonringer merged commit 3a4fec8 into NixOS:staging May 28, 2020
@jonringer jonringer deleted the fix-mxnet branch May 28, 2020 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants