Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tensorflow: 0.10.0 -> 0.12.0rc0 #20794

Closed
wants to merge 2 commits into from
Closed

Conversation

tindzk
Copy link
Contributor

@tindzk tindzk commented Nov 29, 2016

Motivation for this change

Update to latest TensorFlow version and support Python 3.

Things done
  • Tested using sandboxing
    (nix.useSandbox on NixOS,
    or option build-use-sandbox in nix.conf
    on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • Linux
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@mention-bot
Copy link

@tindzk, thanks for your PR! By analyzing the history of the files in this pull request, we identified @nlewo, @houqp and @FRidh to be potential reviewers.

Copy link
Member

@FRidh FRidh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit messy with the urls and hashes, but looking at the available wheels it doesn't look like it can be done much prettier either.

While I haven't tested the PR it looks good.

@tindzk tindzk force-pushed the tensorflow branch 2 times, most recently from baeb6e6 to 2596ae0 Compare November 29, 2016 15:55
@tindzk tindzk mentioned this pull request Nov 29, 2016
7 tasks
Copy link
Member

@FRidh FRidh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having looked at it again I think the names of the packages should be different.


tensorflowNoGpuSupport = buildPythonPackage rec {
tensorflowWithoutGpu = buildPythonPackage rec {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this with generic GPU support, or specifically Cuda? If I am correct it is Cuda only, and therefore the name should reflect that: tensorflowWithoutCuda and tensorflowWithCuda

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, TensorFlow does not support OpenCL. However, if we were to introduce theanoWithGpu and theanoWithoutGpu, I would suggest to use a similar naming here (for sake of consistency). What do you think?

@tindzk
Copy link
Contributor Author

tindzk commented Nov 30, 2016

I updated the package as per your suggestion.

protobuf2_6 = self.protobufBuild pkgs.protobuf2_6;
protobuf2_5 = self.protobufBuild pkgs.protobuf2_5;
protobufBuild = protobuf: buildPythonPackage rec {
inherit (protobuf) name src;
disabled = isPy3k || isPyPy;
isV2_6 = versionAtLeast protobuf.version "2.6.0";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isV2_6 = protobuf.version == "2.6.0";?

@tindzk
Copy link
Contributor Author

tindzk commented Nov 30, 2016

Thanks! I renamed the variables to make it clearer. versionAtLeast is correct.

@jyp
Copy link
Contributor

jyp commented Dec 1, 2016

I tested the PR on darwin, and it works great with python 3.5 👍
However I got the following strange error with 2.7:

    import tensorflow as tf
  File "/nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 60, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: dlopen(/nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so, 10): no suitable image found.  Did find:
	/nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so: file too short

@tindzk
Copy link
Contributor Author

tindzk commented Dec 1, 2016

Does it work when you change the six dependency back to google_apputils?

@jyp
Copy link
Contributor

jyp commented Dec 2, 2016 via email

@jyp
Copy link
Contributor

jyp commented Dec 5, 2016

To be sure: I'd love this PR to be merged, even in its present state, because I prefer "only python 3.5" to "only python 2.7".

@FRidh
Copy link
Member

FRidh commented Dec 5, 2016

@tindzk you update two separate functions, so we need two commits. The version increments are also different for each. We typically also include only releases, not RC's.

@jyp it was originally added for 2.7 so we won't break that.

@teh @anderspapitto you've updated tensorflow is in the past. Comments?

Finally, these two packages need a maintainer.

@tindzk
Copy link
Contributor Author

tindzk commented Dec 5, 2016

@FRidh The previous version of the CUDA package was also a release candidate which lead me believe that RCs are fine. Should we wait until the final version has been released? Which two functions do you refer to? I agree that Python 2.7 support would be desirable.

@jyp It seems to me that _pywrap_tensorflow.so is corrupt. Is it a broken symlink?

@teh
Copy link
Contributor

teh commented Dec 5, 2016

Nice cleanup change, lgtm from me.

@jyp
Copy link
Contributor

jyp commented Dec 5, 2016

@tindzk For some reason it the .so file contains a script:

cat /nix/store/qprvcgvh7l0rw99ywnzcr5f9fkqrwsm9-python-2.7.12-env/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so
#! /nix/store/8r21d159vxacg598hy2v2cyn4dargp6f-bash-4.3-p46/bin/bash -e
export PATH=/nix/store/sajcffrl9jlqp9ywpf2bs6qvq4niy0w4-python-2.7.12/bin:/nix/store/bd2wijgnwxan0dq3ix32wj73sjm0m4zp-python2.7-tensorflow/bin:/nix/store/651fsmql7jjqgaxzldd500qzbj1b3a1v-python2.7-numpy-1.11.2/bin:/nix/store/wa63dlrhqx8kpkv8mazhgwx1mfhwgfxw-openblas-0.2.19/bin:/nix/store/f0l6g80v5nyc06iv975va4ps4mhlwwvs-python2.7-setuptools-28.8.0/bin:/nix/store/45fcvk3bdlby9yv709cd3gbdbnppvav9-protobuf-3.1.0/bin:/nix/store/3f5kc88shqgnh6i91kabm2wzgglfib4s-swig-3.0.10/bin:/nix/store/ma9rlqcr9az6znph5r0q68wf9ypwvdjw-python2.7-pbr-1.8.1/bin${PATH:+:}$PATH
exec -a "$0" /nix/store/bd2wijgnwxan0dq3ix32wj73sjm0m4zp-python2.7-tensorflow/lib/python2.7/site-packages/tensorflow/python/._pywrap_tensorflow.so-wrapped "${extraFlagsArray[@]}" "$@"

@jyp
Copy link
Contributor

jyp commented Dec 12, 2016

@tindzk Is there anything I can do to help the PR moving forward?

@tindzk
Copy link
Contributor Author

tindzk commented Dec 12, 2016

@jyp From my understanding, PatchELF was applied to the .so file and replaced it by a shell script. This seems to be the issue we are encountering: NixOS/patchelf#108

@jyp
Copy link
Contributor

jyp commented Dec 12, 2016

So do we need to wait for NixOS/patchelf#108 to be fixed? Or should we attempt a workaround?

@tindzk
Copy link
Contributor Author

tindzk commented Dec 13, 2016

I'd prefer to avoid workarounds and get the issue fixed in patchelf instead as it will benefit all packages with symbolic links.

@vcunat
Copy link
Member

vcunat commented Dec 26, 2016

No, patchelf leaves ELF files as ELF files; it doesn't change them to a script.

@jluttine
Copy link
Member

jluttine commented Jan 2, 2017

Perhaps not the right place to ask but anyway: Would it be possible to compile Tensorflow from source instead of installing binary wheel? I tried it but the bazel build system seems to make it very difficult because it does a bit what Nix should do (handles and installs dependencies). But any insight whether it would be possible and how?

@jyp
Copy link
Contributor

jyp commented Jan 2, 2017

@jluttine Unfortunately bazel is currently broken on Darwin, see #20579
So this should be fixed first.

@anderspapitto
Copy link
Contributor

@jluttine https://github.com/nixos/nixpkgs/blob/f0373f4ceea40405491d1316529b91171744dcb3/pkgs/top-level/python-packages.nix#L30928

probably the best way forward on that is to see how debian/other big distros deal with bazel

@teh
Copy link
Contributor

teh commented Jan 2, 2017

@jluttine - I'd vastly prefer a build from source as well, but: I spent 6+ hours trying to get bazel to work for tensorflow with not much to show. I have wrangled some really byzantine build systems but bazel exceeds my frustration tolerance. I will cheer the brave soul that accomplishes this.

@jyp
Copy link
Contributor

jyp commented Jan 9, 2017

@tindzk Can I do something to help? According to @vcunat patchelf is not the culprit for the problem that I am observing.

@tindzk
Copy link
Contributor Author

tindzk commented Jan 9, 2017

If patchelf is not the culprit, I do not know what causes the problem. Perhaps we could get a lead developer involved in the conversation? Apparently, TensorFlow isn't the only package that is affected as @quanah describes the same bug in NixOS/patchelf#108.

@edolstra
Copy link
Member

edolstra commented Jan 9, 2017

As an aside, I think large, non-trivial packages like TensorFlow should really be separate Nix expressions rather than being in python-packages.nix. The use of a mega-expression like python-packages.nix prevents (for example) the use of git log to see the history of an individual package, and increases the probability of merge conflicts.

@jyp
Copy link
Contributor

jyp commented Jan 24, 2017

@FRidh perhaps a way to move forward is to push the cudnn version, which never worked with macos before. Would that be acceptable?

@Mic92
Copy link
Member

Mic92 commented Jan 24, 2017

see also #22061 regarding protobuf and python3. Your version check is a bit more elegant, but my pull requests enables also tests. A combination would be good. I would also move the build expressions into an external file for both protobuf and tensorflow.

@teh
Copy link
Contributor

teh commented Feb 14, 2017

Currently tensorflow doesn't work on any platform due to a broken protobuf dependency - I'm +1 on merging this after the updates in #22061 to make at least Linux work again.

Edit: Also +1 on moving this to its own directory under ./pkgs/development/python-modules/

@tindzk
Copy link
Contributor Author

tindzk commented Feb 14, 2017

Is anyone interested in taking over the maintenance of the TensorFlow package? I am currently not using Nix, so I do not want to prevent this from being merged. We would have to re-base this pull request and adapt it to the latest TensorFlow version (0.12.1).

@FRidh FRidh self-assigned this Feb 15, 2017
@jyp
Copy link
Contributor

jyp commented Feb 15, 2017

As it were, I've done the update to 0.12.1 and used this version of tensorflow/cuda sucessfully.

I'll be using tensorflow in nix for my research group in the foreseeable future. Thus I can try to take it over, but I am a relative beginner in nix.

@teh
Copy link
Contributor

teh commented Feb 15, 2017

@jyp - Im a nix cuda-tensorflow user as well. Happy to review and / or help. Let me know what I can do.

@FRidh
Copy link
Member

FRidh commented Feb 15, 2017

Note that I've moved the expressions and renamed the attributes in a069c16 and 97c3beb.

@jyp
Copy link
Contributor

jyp commented Feb 15, 2017

@teh - AFAIU we should first wait for #22061 to be merged. @Mic92 if you need help on your PR please let us know.

@tindzk
Copy link
Contributor Author

tindzk commented Feb 15, 2017

Thanks a lot for the help, @jyp and @teh! I'm going to close this pull request then. Please feel free to open a new one.

@tindzk tindzk closed this Feb 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants