python3Packages.pytorch: 1.2.0 -> 1.4.1, python3Packages.ignite: 0.2.1 -> 0.3.0 #75827
Removed the nix-protobuf code (it will just live in pytorch-world for now). Rebuilding everything again -- I just had to deal with some faulty RAM slots so updates will have a slightly faster turnaround again.
fwiw, 1.4 is now out.
There may soon also be a 1.4.1 release that is compatible with gcc 9:
@GrahamcOfBorg build python2Packages.pytorch
please address failures :(
looks like a lot of failures are related to
On x86 Ubuntu with Cuda support, the package builds but is broken:
Possibly setting
Seems fine without Cuda but didn't run the test suite.
@GrahamcOfBorg build python37Packages.pytorch
@stites I pushed an update to your PR with the following changes:
I've managed to build this successfully with both the FOSS stack and with
not sure about regressions, but at the very least, we should disable for python38
builder for '/nix/store/a70hd44hmh22wxsr0w7sl3wnrfgfxpjb-python3.7-ignite-0.2.1.drv' failed with exit code 1; last 10 log lines:
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
tests/ignite/contrib/handlers/test_param_scheduler.py::test_lr_scheduler
/nix/store/m4i2i1ax2ch5y1ql8ng7f014mrmk63ja-python3.7-pytorch-1.4.1/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:342: DeprecationWarning: To get the last learning rate computed by the scheduler, please use `get_last_lr()`.
"please use `get_last_lr()`.", DeprecationWarning)
-- Docs: https://docs.pytest.org/en/latest/warnings.html
===== 4 failed, 275 passed, 4 skipped, 72 deselected, 20 warnings in 6.45s =====
builder for '/nix/store/zyxn5nxx22jwdj5xhvxx173ngkms6ryi-python3.8-pytorch-1.4.1.drv' failed with exit code 1; last 10 log lines:
----------------------------------------------------------------------
Ran 2276 tests in 93.630s
FAILED (failures=1, skipped=66, expected failures=1)
Traceback (most recent call last):
File "test/run_test.py", line 455, in <module>
main()
File "test/run_test.py", line 448, in main
raise RuntimeError(message)
RuntimeError: test_jit failed!
cannot build derivation '/nix/store/iraap11hf3k4yax6bzpmnmrif3pzli5l-python3.8-ignite-0.2.1.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/8qdsp4ga63vmwbq8w0b1vf33avb848rx-python3.8-tensorly-0.4.5.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/qkmv9p848yh1l62cn9yy3wb6id0hh1ac-python3.8-torchvision-0.2.1.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/11waqaqv5xq4276f1npv1dmpj31hgprx-python3.8-pywick-0.5.6.drv': 2 dependencies couldn't be built
builder for '/nix/store/c8jvyrqblrk4a95rlnv5ji142d7sv33m-python3.7-pytorch-1.4.1.drv' failed with exit code 1; last 10 log lines:
wrapping `/nix/store/28cbbgvg7mm81l4q7qac7pmmqycjizj2-python3.7-pytorch-1.4.1/bin/convert-caffe2-to-onnx'...
Executing pythonRemoveTestsDir
Finished executing pythonRemoveTestsDir
running install tests
Traceback (most recent call last):
File "test/run_test.py", line 14, in <module>
import torch
File "/nix/store/28cbbgvg7mm81l4q7qac7pmmqycjizj2-python3.7-pytorch-1.4.1/lib/python3.7/site-packages/torch/__init__.py", line 81, in <module>
from torch._C import *
ImportError: /nix/store/28cbbgvg7mm81l4q7qac7pmmqycjizj2-python3.7-pytorch-1.4.1/lib/python3.7/site-packages/torch/lib/libtorch_python.so: undefined symbol: _ZN5torch4cuda4nccl6detail16throw_nccl_errorE12ncclResult_t
builder for '/nix/store/f2lxq1zcw1zbc4q3snnskdaadm5vyw2x-python3.8-pytorch-1.4.1.drv' failed with exit code 1; last 10 log lines:
wrapping `/nix/store/q9diyfdsl34cy1iqcnkg7c6g650aif52-python3.8-pytorch-1.4.1/bin/convert-caffe2-to-onnx'...
Executing pythonRemoveTestsDir
Finished executing pythonRemoveTestsDir
running install tests
Traceback (most recent call last):
File "test/run_test.py", line 14, in <module>
import torch
File "/nix/store/q9diyfdsl34cy1iqcnkg7c6g650aif52-python3.8-pytorch-1.4.1/lib/python3.8/site-packages/torch/__init__.py", line 81, in <module>
from torch._C import *
ImportError: /nix/store/q9diyfdsl34cy1iqcnkg7c6g650aif52-python3.8-pytorch-1.4.1/lib/python3.8/site-packages/torch/lib/libtorch_python.so: undefined symbol: _ZN5torch4cuda4nccl6detail16throw_nccl_errorE12ncclResult_t
cannot build derivation '/nix/store/rqcqydv29f29gn957kfb5vysx4ppgnsv-env.drv': 8 dependencies couldn't be built
[0 built (4 failed), 0.0 MiB DL]
error: build of '/nix/store/rqcqydv29f29gn957kfb5vysx4ppgnsv-env.drv' failed
https://github.com/NixOS/nixpkgs/pull/75827
2 package marked as broken and skipped:
python37Packages.pyro-ppl python38Packages.pyro-ppl
8 package failed to build:
python37Packages.ignite python37Packages.pytorchWithCuda python38Packages.ignite python38Packages.pytorch python38Packages.pytorchWithCuda python38Packages.pywick python38Packages.tensorly python38Packages.torchvision
4 package built:
python37Packages.pytorch python37Packages.pywick python37Packages.tensorly python37Packages.torchvision
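The undefined symbol in the `ImportError` lines of the log above is a C++ name mangled under the Itanium ABI. As a rough reading aid (not part of this PR), the sketch below decodes only the simple nested-name form that appears here. `demangle_simple` is a hypothetical helper written for illustration; a real tool such as `c++filt` handles the full mangling grammar.

```python
import re

_LENGTH = re.compile(r"\d+")


def _read_name(symbol, pos):
    """Read one length-prefixed identifier starting at pos; return (name, next_pos)."""
    match = _LENGTH.match(symbol, pos)
    if match is None:
        raise ValueError(f"expected a length prefix at offset {pos}")
    start = match.end()
    end = start + int(match.group())
    if end > len(symbol):
        raise ValueError("length prefix runs past the end of the symbol")
    return symbol[start:end], end


def demangle_simple(symbol):
    """Decode symbols shaped like _ZN<len><name>...E<len><type>...

    Assumption: only plain nested identifiers and length-prefixed
    parameter types are supported. Anything else raises ValueError.
    """
    if not symbol.startswith("_ZN"):
        raise ValueError("only nested names (_ZN...) are handled")
    pos = 3
    parts = []
    # Namespace and function components, terminated by 'E'.
    while pos < len(symbol) and symbol[pos] != "E":
        name, pos = _read_name(symbol, pos)
        parts.append(name)
    if pos >= len(symbol):
        raise ValueError("missing 'E' terminator")
    pos += 1  # skip the 'E'
    # Remaining components are the parameter types.
    params = []
    while pos < len(symbol):
        name, pos = _read_name(symbol, pos)
        params.append(name)
    return "::".join(parts) + "(" + ", ".join(params) + ")"


print(demangle_simple(
    "_ZN5torch4cuda4nccl6detail16throw_nccl_errorE12ncclResult_t"
))
```

The symbol decodes to `torch::cuda::nccl::detail::throw_nccl_error(ncclResult_t)`, so `libtorch_python.so` references an NCCL error helper that nothing it is linked against provides. That points toward the NCCL part of the Cuda build, though the thread itself does not confirm the cause.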
Hmm, I think I'll go back to just not running the test suite -- particularly since users of MKL may be running without a binary cache and need to wait for the package to build themselves.
or just reduce the checkphase to something simple. I would be fine with some
As long as the maintainer verified that it worked for more in-depth cases locally
👍 thanks for bumping this pr @bhipple !
Co-authored-by: Benjamin Hipple <bhipple@protonmail.com>
- Pass `blas.provider` into `buildInputs`, so that CMake can find the actual `mkl` for inspection of its cmake files and headers.
- Add `USE_MKL` correctly when the blas provider is `mkl`.
- Use the MKLDNN and MKLDNN_CBLAS flags by default, since `mkldnn` is FOSS and always available.
- Remove a patch for MKL 2019, since we've moved to 2020.
- Add a pythonImportsCheck for "torch" as a basic sanity check.
- Remove some unused variables at the top of the file.
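An import check of this kind amounts to trying to import the named module in the built environment, which is enough to catch failures like the undefined-symbol `ImportError` in the logs above without running the full test suite. The following is a minimal sketch of the idea in plain Python. It makes no claim about how nixpkgs implements `pythonImportsCheck` internally, and `check_imports` is a hypothetical helper.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: error message} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # A missing module or a shared library that fails to load
            # both surface here as an exception at import time.
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# "json" stands in for "torch" so that the sketch runs anywhere.
print(check_imports(["json"]))
```

An empty result means every module imported cleanly. This only verifies that the package loads, so it complements rather than replaces the more in-depth local testing mentioned in the discussion.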
Result of
2 packages marked as broken and skipped:
- python37Packages.pyro-ppl
- python38Packages.pyro-ppl
4 packages failed to build:
- python37Packages.ignite
- python37Packages.pytorchWithCuda
- python38Packages.ignite
- python38Packages.pytorchWithCuda
8 packages built:
- python37Packages.pytorch (python37Packages.pytorchWithoutCuda)
- python37Packages.pywick
- python37Packages.tensorly
- python37Packages.torchvision
- python38Packages.pytorch (python38Packages.pytorchWithoutCuda)
- python38Packages.pywick
- python38Packages.tensorly
- python38Packages.torchvision
@GrahamcOfBorg eval
One last change: in #85839 I changed the name of
Updated pytorch to reference
Result of
2 packages failed to build:
- python37Packages.pytorchWithCuda
- python38Packages.pytorchWithCuda
10 packages built:
- python37Packages.ignite
- python37Packages.pytorch (python37Packages.pytorchWithoutCuda)
- python37Packages.pywick
- python37Packages.tensorly
- python37Packages.torchvision
- python38Packages.ignite
- python38Packages.pytorch (python38Packages.pytorchWithoutCuda)
- python38Packages.pywick
- python38Packages.tensorly
- python38Packages.torchvision
great work! thank you all!
what is the best way to get
for now I will try to follow #75827 (comment)
that looks like a linking error to cuda. Not super familiar with the cuda toolchain to know where an assumption is being broken
If you manage to get it working, please do send a PR!
Interesting note: when we upgraded pytorch from 1.0.0 -> 1.2.0, we must've leaked a proprietary dependency into the default expression, because Hydra stopped building it. That's now been fixed with this PR, so we once again have binary cache builds of the default pytorch: https://hydra.nixos.org/job/nixpkgs/trunk/python38Packages.pytorch.x86_64-linux/all
I tried a few things but couldn't make it work with the current version.
Motivation for this change
Update pytorch to 1.3.1. A detailed commit history can be seen here.
Check phase verified on python36.pytorch, python36.pytorchWithMkl, python36.pytorchWithCuda10, python36.pytorchWithCuda10Mkl.
Cachix pending (my machine with the keys is down).
Relevant changelog:
- `buildDocs` flag added
- `buildNamedTensor` is now true by default
- `useNixProtobuf` but disables this functionality.
To build I believe the following should work:
Things done
- `sandbox` in `nix.conf` on non-NixOS linux)
- `nix-shell -p nix-review --run "nix-review wip"`
- `./result/bin/`)
- `nix path-info -S` before and after)
Notify maintainers
cc @teh @thoughtpolice @stites @tscholak @bhipple