pythonPackages.tensorflow: 1.3.1 -> 1.4.1 #34420
@jyp Apparently Arch is able to build 1.5x from git. I might try to dig into it this weekend. When I try to build 1.5, bazel throws errors about infinite symlinks because the output directory is inside the source tree. Setting the output dir to $(mktemp -d) fixes the first part.
Thanks for your work!
I did some minor tweaks to make your PR work on Darwin, here's the hello world output:
[nix-shell:~]$ ipython --quick
Python 3.6.4 (default, Jan 17 2018, 00:49:01)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import tensorflow as tf
/nix/store/wc4m3g4grivhy26sdck44jxwkvz1j17n-python3-3.6.4/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
  return f(*args, **kwds)
In [2]: hello = tf.constant('Hello, TensorFlow!')
In [3]: sess = tf.Session()
2018-02-09 18:21:45.788566: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
In [4]: print(sess.run(hello))
b'Hello, TensorFlow!'
The word2vec_basic example seems to work well too.
Tensorboard seems to be broken though:
[nix-shell:~]$ tensorboard
Traceback (most recent call last):
  File "/nix/store/jvckx7illg4s3y64443swlgbsmv0n1w6-python3.6-tensorflow-1.4.1/bin/.tensorboard-wrapped", line 8, in <module>
    from tensorboard.main import main
ModuleNotFoundError: No module named 'tensorboard'
# tensorflow depends on tensorflow_tensorboard, which cannot be
# cudatoolkit is split (see https://github.com/NixOS/nixpkgs/commit/bb1c9b027d343f2ce263496582d6b56af8af92e6)
# However this means that libcusolver is not loadable by tensor flow. So we undo the split here.
cudatoolkit_joined = symlinkJoin {
This needs to be wrapped in if cudaSupport or something like that.
Is that really so? I thought that Nix had lazy evaluation.
I think all derivation (or stdenv.mkDerivation?) attributes are forced to become environment variables in a generated build script.
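The two positions in this thread can be illustrated with a minimal sketch in plain Nix (no nixpkgs needed). It is only an illustration under stated assumptions: the names unusedBinding, lazyResult, drv and extra are hypothetical and do not appear in the PR, and the system string is an arbitrary choice. A let binding that is never referenced is never evaluated, whereas every attribute passed to the derivation builtin is coerced to a string for the builder's environment once drvPath or outPath is demanded.

```nix
let
  # Never demanded below, so the throw is never evaluated.
  unusedBinding = throw "this is never evaluated";

  # The untaken branch of a conditional is not evaluated either.
  lazyResult = if false then unusedBinding else "ok";

  # Hypothetical derivation: all attributes, including `extra`, are
  # coerced to strings when drvPath (or outPath) is forced.
  drv = derivation {
    name = "forced-attrs-demo";
    system = "x86_64-linux";
    builder = "/bin/sh";
    extra = throw "derivation attributes are forced";
  };
in {
  inherit lazyResult;
  # tryEval reports whether forcing drvPath succeeded; it is expected
  # to fail here because `extra` throws while being coerced.
  drvForced = (builtins.tryEval drv.drvPath).success;
}
```

Whether this matters for cudatoolkit_joined depends on where the value ends up: a let binding that is only referenced behind a cudaSupport check stays lazy, while a value placed directly in the derivation's attributes does not.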
if isPy3k then
  dls.mac_py_2_cpu
else
  dls.mac_py_3_cpu
Seems to be a mixup here.
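A minimal sketch of the selection with the branches the right way around. The contents of dls and the helper pickWheel are placeholders for illustration; only the attribute names mac_py_2_cpu and mac_py_3_cpu and the isPy3k flag come from the diff under review.

```nix
let
  # Placeholder values; the real `dls` set holds the wheel download info.
  dls = {
    mac_py_2_cpu = "wheel-for-python2";
    mac_py_3_cpu = "wheel-for-python3";
  };

  # Pick the wheel matching the interpreter: Python 3 gets the py_3 wheel.
  pickWheel = isPy3k: if isPy3k then dls.mac_py_3_cpu else dls.mac_py_2_cpu;
in {
  py3 = pickWheel true;
  py2 = pickWheel false;
}
```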
pkgs/top-level/python-packages.nix
cudatoolkit = pkgs.cudatoolkit8;
cudnn = pkgs.cudnn6_cudatoolkit8;
cudatoolkit = pkgs.cudatoolkit9;
cudnn = pkgs.cudnn_cudatoolkit9;
I had to remove these entirely so my darwin build goes through.
This comes from master; I'd suggest fixing it in a separate patch.
@jyp (continuation from #31492)
I'm not against getting this merged in the short term, but ideally we would track down why the bazel build fails (a source build would make #31046 straightforward to resolve). It's easy to undo this if/when we get the build from source working. This method is nice since Bazel on Darwin is broken.
@lukeadams "Ideally", yes. If someone can fix it I'll be happy to see the bazel build restored. In practice tensorflow has been broken on master for three months now, hence this PR.
@proger Tensorboard is incompatible with Python 3.6. See tensorflow/tensorboard#427.
c7552c0 to 81a868c
@proger Thanks a lot for your review. I acted on your suggestions but I've left the top-level definition as it was to minimize the diff with master. We can change that one separately. (It's hard enough to get the simplest PR merged.)
@jyp This patch is all that's necessary to build on Darwin:
From 5cbbb34a43cc431aeb40591042f3d61c31c88085 Mon Sep 17 00:00:00 2001
From: Luke Adams <luke.adams@belljar.io>
Date: Sat, 10 Feb 2018 10:18:50 -0600
Subject: [PATCH] tensorflow: move cudatoolkit_joined to top-level for clarity
allows for simplification of postFixup
---
.../python-modules/tensorflow/default.nix | 24 +++++++++++-----------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/pkgs/development/python-modules/tensorflow/default.nix b/pkgs/development/python-modules/tensorflow/default.nix
index 5c5d98e7ace..ff34d02c4fa 100644
--- a/pkgs/development/python-modules/tensorflow/default.nix
+++ b/pkgs/development/python-modules/tensorflow/default.nix
@@ -31,8 +31,18 @@ assert ! (stdenv.isDarwin && cudaSupport);
 # project's build system is an arcane beast based on
 # bazel. Untangling it and building the wheel from source is an open
 # problem.
+let
+  # cudatoolkit is split (see https://github.com/NixOS/nixpkgs/commit/bb1c9b027d343f2ce263496582d6b56af8af92e6)
+  # However this means that libcusolver is not loadable by tensor flow. So we undo the split here.
+  cudatoolkit_joined = symlinkJoin {
+    name = "unsplit_cudatoolkit";
+    paths = [
+      cudatoolkit.out
+      cudatoolkit.lib
+    ];
+  };
-buildPythonPackage rec {
+in buildPythonPackage rec {
   pname = "tensorflow";
   version = "1.4.1";
   name = "${pname}-${version}";
@@ -82,17 +92,7 @@ buildPythonPackage rec {
   # patchelf --shrink-rpath will remove the cuda libraries.
   postFixup = let
     rpath = stdenv.lib.makeLibraryPath
-      (if cudaSupport then
-        # cudatoolkit is split (see https://github.com/NixOS/nixpkgs/commit/bb1c9b027d343f2ce263496582d6b56af8af92e6)
-        # However this means that libcusolver is not loadable by tensor flow. So we undo the split here.
-        (let cudatoolkit_joined = symlinkJoin {
-          name = "unsplit_cudatoolkit";
-          paths = [ cudatoolkit.out
-                    cudatoolkit.lib ];};
-        in [ stdenv.cc.cc.lib zlib cudatoolkit_joined cudnn nvidia_x11 ])
-      else
-        [ stdenv.cc.cc.lib zlib ]
-      );
+      ([ stdenv.cc.cc.lib zlib ] ++ lib.optionals cudaSupport [ cudatoolkit_joined cudnn nvidia_x11 ]);
   in
   ''
     rrPath="$out/${python.sitePackages}/tensorflow/:${rpath}"
--
2.15.1
Also, Tensorboard can be used with an older Python version, e.g.
81a868c to 5d29538
I have applied @lukeadams's patch. Could this PR be considered for merging, @FRidh?
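For reference, the list construction that the applied patch introduces in postFixup can be sketched in isolation. This is an illustration under stated assumptions, not the package code: optionals is a local stand-in written to behave like nixpkgs' lib.optionals, rpathInputs is a hypothetical helper name, and strings stand in for the real packages.

```nix
let
  # Local stand-in with the same behaviour as nixpkgs' lib.optionals:
  # return the list when the condition holds, otherwise an empty list.
  optionals = cond: elems: if cond then elems else [ ];

  # Strings stand in for the real packages.
  rpathInputs = cudaSupport:
    [ "stdenv.cc.cc.lib" "zlib" ]
    ++ optionals cudaSupport [ "cudatoolkit_joined" "cudnn" "nvidia_x11" ];
in {
  withoutCuda = rpathInputs false;
  withCuda = rpathInputs true;
}
```

With cudaSupport false, the CUDA entries are never demanded, so the base list is returned unchanged.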
5d29538 to cbd252e
Also revert to a wheel-based build (the bazel-based build is broken and it is unclear how to repair it)
cbd252e to 9635a45
That's very strange as I'm using it right now. Can you show your error?
Ah, I see the failure on Hydra (it was down this morning so I didn't check it first). The error is very strange; for now my theory is that something in nixpkgs has been updated since the last successful build (https://hydra.nixos.org/build/69832610). Currently tensorflow doesn't build on master for unrelated reasons; I'll see if I can reproduce the issue on an older commit.
Thanks! Please report progress on #31492, I'll ping you from there.
@lukeadams Yeah, I want to tackle that next, or bring back a wheel-based build specifically for Darwin.
I've tried to set up a Mac OS X VirtualBox machine, but apparently AMD processors are not supported, and on an image from https://forum.amd-osx.com/ the mouse and keyboard do not work. @LnL7 Can you perhaps set a Darwin Hydra job for my
@abbradar in case it is not possible to get the source builds working entirely on all supported archs, I think we should still provide the wheel-based solution, at least for 18.03, for all archs. In that case, @jyp, I recommend including the script you used as an actual script in the tree. Also, instead of bothering to write .nix you can just output .json.
@FRidh Yeah, that's my opinion too. I thought I'd try to tackle this one over the holidays before giving up completely.
@jyp I feel too defeated by Darwin for now D: Can you rebase this pull request and make it be used only for Darwin? I can do it myself if you don't have enough time. Thanks in advance!
@jyp Ah, for some reason I thought you had a Darwin machine too, sorry. I did this myself in #37044. @lukeadams Maybe you can help me test it?
Motivation for this change
Two motivations: