-
-
Notifications
You must be signed in to change notification settings - Fork 15.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes for CUDA and Tensorflow update #31507
Conversation
4a9d312
to
2ff37b5
Compare
Cool, I'll try this once it is merged, when this is merged, does it come with tensorboard as well!? |
@CMCDragonkai Yes,it includes an update. |
Unfortunately I could not build tensorflow/cuda with this PR. Here is the tail of the log:
|
Notice that this PR requires another one (which was recently merged) - the error is because of this. It should land in master soon. I haven't included respective patches here because they induce mass rebuilds.
--
Nikolay.
|
Ok, I had not gotten that his PR did not include the other one. Please let
me know when this is ready to test. :)
…On Wed, Nov 15, 2017 at 1:56 PM, Nikolay Amiantov ***@***.***> wrote:
Notice that this PR requires another one (which was recently merged) - the
error is because of this. It should land in master soon. I haven't included
respective patches here because they induce mass rebuilds.
--
Nikolay.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#31507 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AABsY_mAeEYAtFs3Xmt1TzZk0cda0lNzks5s2t93gaJpZM4QaTWY>
.
|
2ff37b5
to
b181ea5
Compare
My new build attempt fails with
|
Oh no, that's what I feared very much would happen and it didn't... Until now. I'll see if I can reproduce, this means dependencies fetch is not fully deterministic after all.
…On November 20, 2017 5:39:08 PM GMT+03:00, Jean-Philippe Bernardy ***@***.***> wrote:
My build attempt fails with
```
output path
‘/nix/store/bwpd405zwcq15zirf8khd6i6gv6lw4ha-tensorflow-build-1.4.0-deps’
has r:sha256 hash
‘10k7i61ya33dcy98i0s7r8f1d4s4rwjl5myfyiyr46skjpzydxdv’ when
‘0sq0a7vsajzqwxgg82xw1q74n7vdq37n9d5z7p0c8gzpmyw7mgc9’ was expected
```
--
You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#31507 (comment)
--
Nikolay.
|
Are you sure it’s a determinism problem? I just tried this and got the same error with the same hash as @jyp:
|
@abbradar Still working on this? TensorFlow 1.5 is available now, based on CUDA 9 and cuDNN 7. |
@abbradar has not been responding to any message for a couple of months |
Sorry for the long absence, had some real life difficulties. The hash may be the same for you @andersk but that means that it has changed over time and that is a problem. I can't reproduce it now; I'll rebase this work and check if it works -- we cannot determine nondeterminism cause until we hit it again. |
A separate function for building Bazel-bazed packages. Internally it splits the build into two phases, fetching and building. Users are expected to provide `fetchArgs.sha256` -- checksum of fetched dependencies. Local dependencies should be removed in `fetchArgs.preInstall`. Overall `fetchArgs` and `buildArgs` can be used to add specific steps to fetch and build.
b181ea5
to
690a2f8
Compare
I've bumped Tensorflow version to 1.5 and Tensorboard to 1.5.1. Works fine with MNIST test from Keras. |
++ lib.optionals (pythonOlder "3.4") [ backports_weakref enum34 ] | ||
++ lib.optional withTensorboard tensorflow-tensorboard; | ||
|
||
# For some reason, CUDA is not retained in RPATH. | ||
doInstallCheck = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering in buildPython*
the checkPhase
corresponds to installCheckPhase
, I suggest removing doCheck = false;
and setting instead checkPhase
(but of course with the comment that the actual tests are slow and impure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
I'm building Python 2 version now just in case, shiny new Ryzen helps but it still takes a while -- I'll update when it's done.
690a2f8
to
8aeea49
Compare
Tested on Python 2 and Python 3.5 (not on 3.6 because of no Tensorboard support -- but it should work). |
Thank you! |
Will this be ported to 17.09? Or is it unstable only? |
@CMCDragonkai I don't think we'll port this -- it's a version bump which we don't do unless there's a security or other serious reason. Better to stay wheel-based in release. |
Motivation for this change
Fix problems with certain usage of cudatoolkit 8 and update TensorFlow (which depends on it). This PR also depends on #31497 -- I plan to merge as soon as it goes in master.
Things done
build-use-sandbox
innix.conf
on non-NixOS)nix-shell -p nox --run "nox-review wip"
./result/bin/
)cc @cstrahan -- you may be interested in new
buildBazelPackage
. I hope we can incorporate your future Bazel fixes there. I also plan to build TensorBoard with it in future (that's why it was made) but it turns out TensorBoard is even more messy than TensorFlow :D