tensorflow: update to 2.3.0 #95824
AFAICT touches similar files as #95736 (probably supersedes). That one only updates to 2.2.0, and also adds a new package which should be able to be separated out. |
Seems like the scipy version bound also needs to be relaxed:
|
Needed a ton of memory to build, but after that indeed seems to work! |
lots of noise, don't know what's a regression
https://github.com/NixOS/nixpkgs/pull/95824
40 packages failed to build:
poretools python27Packages.Keras python27Packages.annoy python27Packages.awkward python27Packages.h5py python27Packages.h5py-mpi python27Packages.keras-applications python27Packages.phonopy python27Packages.tensorflow-bin python27Packages.tensorflow-bin_2 python27Packages.uproot python27Packages.uproot-methods python27Packages.worldengine python37Packages.arviz python37Packages.baselines python37Packages.clifford python37Packages.dm-sonnet python37Packages.edward python37Packages.graph_nets python37Packages.mask-rcnn python37Packages.optuna python37Packages.phonopy python37Packages.pymc3 python37Packages.qasm2image python37Packages.qiskit python37Packages.qiskit-aqua python37Packages.rl-coach python37Packages.sumo python37Packages.tensorflow python37Packages.tensorflow-bin python37Packages.tensorflow-bin_2 python37Packages.tensorflow-probability python37Packages.tensorflowWithCuda python37Packages.tflearn python38Packages.clifford python38Packages.phonopy python38Packages.qasm2image python38Packages.qiskit python38Packages.qiskit-aqua python38Packages.sumo
51 packages built:
python27Packages.tensorflow-estimator_2 python37Packages.Keras python37Packages.annoy python37Packages.awkward python37Packages.bayespy python37Packages.caffe python37Packages.dcmstack python37Packages.dicom2nifti python37Packages.dipy python37Packages.h5netcdf python37Packages.h5py python37Packages.h5py-mpi python37Packages.heudiconv python37Packages.hickle python37Packages.keras-applications python37Packages.nibabel python37Packages.nilearn python37Packages.nipy python37Packages.nipype python37Packages.nitime python37Packages.pybids python37Packages.pywick python37Packages.tensorflow_2 python37Packages.tensorflow-estimator_2 python37Packages.uproot python37Packages.uproot-methods python37Packages.worldengine python38Packages.Keras python38Packages.annoy python38Packages.awkward python38Packages.bayespy python38Packages.caffe python38Packages.dicom2nifti python38Packages.dipy python38Packages.h5netcdf python38Packages.h5py python38Packages.h5py-mpi python38Packages.hickle python38Packages.keras-applications python38Packages.nibabel python38Packages.nilearn python38Packages.nipy python38Packages.nipype python38Packages.pybids python38Packages.pywick python38Packages.tensorflow_2 python38Packages.tensorflow-estimator_2 python38Packages.uproot python38Packages.uproot-methods worldengine-cli visidata
looks like the h5py bump is the cause of a lot of the regressions for python2, but I don't really care about keeping python2 green... |
Builds successfully with CUDA. Our GPUs are too busy to test any serious training, but a small sanity check works. (We do use impure CUDA dependencies though, since the machine is running Ubuntu.) |
Tested with CUDA 11, python37 and python38, trained a small model, worked perfectly!
the changes look good, but do you mind cleaning up the git history? Also, the commit messages should adhere to CONTRIBUTING.md, specifically (my template): To comply with CONTRIBUTING.md please have the commit message name be of the format
for more examples, please look at https://github.com/NixOS/nixpkgs/blob/master/.github/CONTRIBUTING.md#submitting-changes |
not sure what is or isn't noise:
https://github.com/NixOS/nixpkgs/pull/95824
25 packages failed to build:
python37Packages.arviz python37Packages.baselines python37Packages.clifford python37Packages.dm-sonnet python37Packages.edward python37Packages.graph_nets python37Packages.hickle python37Packages.mask-rcnn python37Packages.optuna python37Packages.phonopy python37Packages.pymc3 python37Packages.pywick python37Packages.rl-coach python37Packages.sumo python37Packages.tensorflow python37Packages.tensorflow-bin python37Packages.tensorflow-bin_2 python37Packages.tensorflow-probability python37Packages.tensorflowWithCuda python37Packages.tflearn python38Packages.clifford python38Packages.hickle python38Packages.phonopy python38Packages.pywick python38Packages.sumo
53 packages built:
python27Packages.tensorflow-estimator_2 python37Packages.Keras python37Packages.annoy python37Packages.awkward python37Packages.bayespy python37Packages.caffe python37Packages.dcmstack python37Packages.dicom2nifti python37Packages.dipy python37Packages.h5netcdf python37Packages.h5py python37Packages.h5py-mpi python37Packages.heudiconv python37Packages.keras-applications python37Packages.nibabel python37Packages.nilearn python37Packages.nipy python37Packages.nipype python37Packages.nitime python37Packages.pybids python37Packages.qasm2image python37Packages.qiskit python37Packages.qiskit-aqua python37Packages.tensorflow_2 python37Packages.tensorflow-estimator_2 python37Packages.uproot python37Packages.uproot-methods python37Packages.worldengine python38Packages.Keras python38Packages.annoy python38Packages.awkward python38Packages.bayespy python38Packages.caffe python38Packages.dicom2nifti python38Packages.dipy python38Packages.h5netcdf python38Packages.h5py python38Packages.h5py-mpi python38Packages.keras-applications python38Packages.nibabel python38Packages.nilearn python38Packages.nipy python38Packages.nipype python38Packages.pybids python38Packages.qasm2image python38Packages.qiskit python38Packages.qiskit-aqua python38Packages.tensorflow_2 python38Packages.tensorflow-estimator_2 python38Packages.uproot python38Packages.uproot-methods worldengine-cli visidata
Seems like all of these are misfires because nixpkgs-review believes that the
(Unless I missed something) |
There are 16 commits. |
Ah right! Checking Hydra:
Of course, the h5py bump could have caused extra breakage that we do not see for the existing reasons. |
This version does not seem to detect the GPU: the test gives the following warning
|
tensorflow_1 should still be removed because it's broken (numpy version) and can't be fixed |
I made a PR to mark |
Interesting, I am getting that error as well, but it works fine after the build. The library has the correct run path:
Not sure why it fails during the check phase, maybe it is not loading the module from At any rate, even if this is fixed, the check would fail to run on the GPU, because Could you try if the module works with your GPU after the build? |
@danieldk You're right, it is in fact able to find some CUDA libraries in my production environment. Unfortunately, this is not the version which corresponds to my kernel.
Any ideas would be helpful here. |
Overriding should still be possible (didn't test it though), but you have to override
tensorflow_2.override {
  cudatoolkit = pkgs.cudatoolkit_10_1;
  cudnn = pkgs.cudnn_cudatoolkit_10_1;
  nccl = pkgs.nccl_cudatoolkit_10;
} |
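For anyone who wants to try the override above, here is a rough, untested sketch of a `shell.nix` built around it. The attribute path `pkgs.python3Packages.tensorflow_2`, the `cudaSupport` argument, and the `allowUnfree` setting are assumptions on my part and are not taken from this thread; only the three overridden CUDA attributes come from the comment above.

```nix
# shell.nix: a sketch only, not tested against this PR.
# Assumption: the CUDA packages are unfree, so allowUnfree is enabled here.
{ pkgs ? import <nixpkgs> { config.allowUnfree = true; } }:

let
  # Assumption: the derivation is reachable as python3Packages.tensorflow_2
  # and accepts a cudaSupport argument.
  tensorflowCuda101 = pkgs.python3Packages.tensorflow_2.override {
    cudaSupport = true;
    # These three attributes are the ones named in the comment above.
    cudatoolkit = pkgs.cudatoolkit_10_1;
    cudnn = pkgs.cudnn_cudatoolkit_10_1;
    nccl = pkgs.nccl_cudatoolkit_10;
  };
in
pkgs.mkShell {
  # A Python interpreter with the overridden tensorflow on its path.
  buildInputs = [
    (pkgs.python3.withPackages (ps: [ tensorflowCuda101 ]))
  ];
}
```

Running `nix-shell` next to this file should trigger a full source build of tensorflow, which (as noted earlier in the thread) needs a lot of memory.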
Also: - patch to remove scipy requirement - add cuda to RPATH - don’t include nvidia_x11 (This isn’t needed, we can get it from /run/opengl-driver being in the RPATH.) Co-authored-by: Arnout Engelen <arnout@bzzt.net> Co-authored-by: Daniël de Kok <me@github.danieldk.eu>
also disable on python 2.7 Co-authored-by: Jon <jonringer@users.noreply.github.com>
d8ec6b4
to
59eecac
Compare
I put libcuda.so.1 in /run/opengl-driver/lib:
and it appears that the library can be loaded, but then it fails with:
Am I forgetting something in /run/opengl-driver? Edit: also, I did not override the cudatoolkit version. |
On NixOS, RTX 2060 Super:
I am going to compile with the same capabilities as upstream and see if that fixes this. |
That worked, so maybe we should switch to the same capabilities that upstream currently uses:
(We currently have 3.5 and 5.2. Even 5.2 is pretty old by now.) |
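Until the defaults change, a local override might be enough to test a different capability list. This is only a sketch: it assumes the derivation exposes the list as an argument named `cudaCapabilities` and is reachable as `pkgs.python3Packages.tensorflow_2`, neither of which is confirmed in this thread. The `"7.5"` entry is the compute capability of Turing cards such as the RTX 2060 Super; it is not necessarily the list upstream uses.

```nix
# A sketch, assuming a cudaCapabilities argument on the tensorflow derivation.
{ pkgs ? import <nixpkgs> { config.allowUnfree = true; } }:

pkgs.python3Packages.tensorflow_2.override {
  # Assumption: cudaSupport is the flag that enables the CUDA build.
  cudaSupport = true;
  # "3.5" and "5.2" are the current values mentioned above;
  # "7.5" (Turing) is added here purely as an illustration.
  cudaCapabilities = [ "3.5" "5.2" "7.5" ];
}
```

Whether adding a single capability is sufficient has not been checked here; the report above only confirms that building with upstream's capability list worked.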
There's also #98522 now which could be included in here which upgrades the tensorflow bin derivation |
Currently build fails with that on #98522
For that PR so both would have to be upgraded simultaneously |
There's also #95736 which upgrades bin to 2.2.0, but has some other changes |
@jonringer I propose that we merge this now. This works on NixOS and it would be nice if we could backport this to 20.03 to have a working Tensorflow 2. Remaining issues:
|
Ok, considering this is getting stale repeatedly and the current drv is broken anyway, I'm merging in 3 days. Any objections? |
Let's do this: (1) it's a large improvement over master (which does not build); (2) the only other affected derivation is h5py; (3) 2.3.1 is out already (which also fixes a lot of CVEs); (4) multiple people have gone over this PR and @matthewbauer has addressed all the comments. |