pytorch: Move cudatoolkit to nativeBuildInputs #60002

andersk · 2019-04-22T03:01:30Z

Motivation for this change

nvcc must be available in PATH at build time; otherwise CUDA support will be disabled.

This is a more minimal version of #57438.
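As a rough illustration of the idea (a hypothetical fragment, not the actual diff from this PR), a derivation that needs nvcc during the build can list the CUDA toolkit under nativeBuildInputs, the attribute intended for tools that are executed at build time. The package name, the `cudaSupport` flag, and the empty `buildInputs` below are assumptions made for the sketch.

```nix
# Hypothetical package expression, meant to be used via callPackage.
# It is not the real pytorch derivation from nixpkgs.
{ stdenv, lib, cudaSupport ? false, cudatoolkit ? null }:

stdenv.mkDerivation {
  name = "example-cuda-package";  # assumed name, for illustration only
  src = ./.;

  # Tools that are run while building (here: nvcc from cudatoolkit)
  # belong in nativeBuildInputs.
  nativeBuildInputs = lib.optionals cudaSupport [ cudatoolkit ];

  # Libraries that the result links against would stay in buildInputs.
  buildInputs = [ ];
}
```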

Things done

nvcc must be available in PATH at build time; otherwise CUDA support will be disabled. Signed-off-by: Anders Kaseorg <andersk@mit.edu>

teh

Thanks!

jyp · 2019-04-26T14:53:46Z

I tried your patch, but unfortunately the build did not go through. Here is the tail of the build:

[ 79%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/queue/queue_ops_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/sgd/iter_op_gpu.cc.o
[ 80%] Building CXX object caffe2/CMakeFiles/caffe2_gpu.dir/sgd/learning_rate_op_gpu.cc.o
[ 80%] Linking CXX shared library ../lib/libcaffe2_gpu.so
impure path `/usr/local/cuda/lib/libcudnn.so.7' used in link
collect2: error: ld returned 1 exit status
make[2]: *** [caffe2/CMakeFiles/caffe2_gpu.dir/build.make:4687: lib/libcaffe2_gpu.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:5422: caffe2/CMakeFiles/caffe2_gpu.dir/all] Error 2
make: *** [Makefile:141: all] Error 2
setup.py::build_deps::run()
Failed to run 'bash ../tools/build_pytorch_libs.sh --use-cuda --use-nnpack --use-mkldnn --use-qnnpack caffe2'
builder for '/nix/store/i5n4iqqk9kzkhhn4a8v4bnm3b00f3k64-python3.7-pytorch-1.0.0.drv' failed with exit code 1
cannot build derivation '/nix/store/5mhc9w2bi01f0p1jpb0j8fc9hp76pziy-python3-3.7.3-env.drv': 1 dependencies couldn't be built
error: build of '/nix/store/5mhc9w2bi01f0p1jpb0j8fc9hp76pziy-python3-3.7.3-env.drv' failed

andersk · 2019-04-26T19:13:57Z

@jyp It sounds like you’ve run into a different problem: the Nix package expects to use the packaged CUDA and cuDNN in /nix/store, but you have a local copy installed in /usr/local/cuda. Can you provide some more context? Are you using NixOS or something else? Why do you have a /usr/local/cuda, and does it work if you remove that? What happens without the patch? What happens with only the first hunk of the patch (i.e. add cudatoolkit_joined to nativeBuildInputs without removing it from buildInputs)?

FRidh · 2019-04-27T05:49:15Z

Clearly sandboxing was disabled in @jyp's build.

jyp · 2019-04-29T09:48:24Z

Indeed, it works with sandboxing enabled. (I was confused by #51671.)

mtn · 2019-05-30T06:04:09Z

I'm here after reading this and a few related issues (e.g. #51671). Here is my shell.nix. On first dropping into this shell, PyTorch was built from source:

with import <nixpkgs> {};

let
  py = pkgs.python37;
in
stdenv.mkDerivation rec {
  name = "python-environment";

  buildInputs = [
    py
    py.pkgs.matplotlib
    py.pkgs.tkinter
    py.pkgs.numpy
    py.pkgs.pytorchWithCuda
  ];
}

However, during the build, there were several messages like "CUDA not available, skipping tests". Maybe relatedly, there were errors like "Error in cpuinfo: failed to parse the list of present procesors in /sys/devices/system/cpu/present". I can provide a full log if need be.

Additional outputs:

$LD_LIBRARY_PATH=/run/opengl-driver/lib

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.27       Driver Version: 415.27       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+

In a python session:

>>> import torch
>>> torch.cuda.is_available()
False
>>>

I'm still pretty new to nix, so I might be doing something wrong. Should this have worked/what can I do to fix it?

jyp · 2019-05-30T07:20:44Z

@mtn These errors were fixed for me when using a sandboxed build.

mtn · 2019-05-30T11:33:37Z

How can I do this, and can I still use nix-shell? I'm finding this issue hard to parse: #903. Or maybe there's an easier way I should be building and using this? I just want the result to be isolated.

Edit: @jyp I'm running nixos 19.03, so shouldn't sandboxing be happening by default?
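For readers who want to set sandboxing explicitly on NixOS, a minimal sketch follows. It assumes the option is named `nix.useSandbox`, as in NixOS releases of this era (later releases renamed it to `nix.settings.sandbox`). Whether this affects the CUDA problem described above is not established by this thread.

```nix
# Fragment of /etc/nixos/configuration.nix (assumed location).
{ config, pkgs, ... }:

{
  # Build derivations inside the Nix sandbox, so that paths outside the
  # store such as /usr/local/cuda are not visible to builders.
  nix.useSandbox = true;
}
```

The setting takes effect after rebuilding the system configuration, for example with `nixos-rebuild switch`.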

haskelious · 2019-06-07T20:14:37Z

I have the same problem as @mtn and I have no idea how to resolve it on 19.03.

OmnipotentEntity · 2019-07-06T23:48:03Z

@mtn @fkstef I just ran into this problem and spent a long time trying to solve it. The answer is actually quite simple: this fix has not been backported to 19.03. You can either pin nixpkgs to a revision that includes the fix or use overrideAttrs. I did the second, but the first might be more useful if you're planning to use this long term.

For instance, here is an example shell.nix:

let
  pkgs = import <nixpkgs> {};
  # Make cudatoolkit (which provides nvcc) available at build time.
  pytorch-cuda = pkgs.python37Packages.pytorchWithCuda.overrideAttrs (oldAttrs: {
    nativeBuildInputs = oldAttrs.nativeBuildInputs ++ [ pkgs.cudatoolkit ];
  });
  # Python environment that uses the overridden PyTorch.
  python3 = pkgs.python3.withPackages (ps: with ps; [opencv4 numpy pytorch-cuda]);

in pkgs.stdenv.mkDerivation (with pkgs; {
  name = "env";

  buildInputs = [
    python3
  ];

})
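The pinning alternative mentioned above could look roughly like the sketch below. The `rev` value is a placeholder, not a real hash: the thread only gives the merge commit in abbreviated form (27d1362), so the full hash of a nixpkgs commit containing the fix has to be filled in by the reader. `config.allowUnfree` is set on the assumption that the CUDA packages are marked unfree in nixpkgs.

```nix
let
  # Placeholder: replace with the full hash of a nixpkgs commit that
  # includes the fix.
  rev = "REPLACE_WITH_FULL_COMMIT_HASH";

  # Import nixpkgs at the pinned revision instead of using <nixpkgs>.
  pkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/${rev}.tar.gz";
  }) {
    config.allowUnfree = true;
  };

  # Python environment using pytorchWithCuda from the pinned nixpkgs.
  python = pkgs.python37.withPackages (ps: with ps; [ numpy pytorchWithCuda ]);
in
pkgs.mkShell {
  buildInputs = [ python ];
}
```

With a pinned revision no override is needed, at the cost of taking every other package from that revision as well.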

pytorch: Move cudatoolkit to nativeBuildInputs

ad98f2e

nvcc must be available in PATH at build time; otherwise CUDA support will be disabled. Signed-off-by: Anders Kaseorg <andersk@mit.edu>

andersk requested a review from FRidh as a code owner April 22, 2019 03:01

andersk mentioned this pull request Apr 22, 2019

pytorch: fix CUDA support #57438

Closed

10 tasks

GrahamcOfBorg added the 6.topic: python label Apr 22, 2019

GrahamcOfBorg requested a review from thoughtpolice April 22, 2019 03:18

GrahamcOfBorg added 10.rebuild-darwin: 0 10.rebuild-linux: 1-10 labels Apr 22, 2019

teh approved these changes Apr 22, 2019

FRidh merged commit 27d1362 into NixOS:master Apr 27, 2019

jyp mentioned this pull request Apr 29, 2019

Cannot depend on pytorch + CUDA in sandboxed build #51671

Closed

GuillaumeDesforges mentioned this pull request Sep 30, 2019

pytorchWithCuda does not install with CUDA #69424

Closed

andersk deleted the pytorch-cuda branch December 10, 2019 09:01

andersk mentioned this pull request Dec 11, 2019

PyTorch with CUDA broken "AssertionError: Torch not compiled with CUDA enabled" #58934

Open
