Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Theano cleanup #26580

Merged
merged 3 commits into from Aug 8, 2018
Merged

Theano cleanup #26580

merged 3 commits into from Aug 8, 2018

Conversation

twhitehead
Copy link
Contributor

Motivation for this change

The first patch merges the cuda and non-cuda derivations. This removes a lot of duplication and was mentioned in the original commit 07dcc4f as something that should be done.

The second patch bakes the build-time compiler into Theano for it to use when it compiles up and loads it EDSL. Without this, it uses whatever compiler is current in the environment. This can cause issues, especially in the non-NixOS case where the system compiler is often not a good choice.

I've built and tested this against the 17.03 release. It wouldn't build against master for me due to issues with the dependencies unreleated to these patches.

Things done
  • Tested using sandboxing
    (nix.useSandbox on NixOS,
    or option build-use-sandbox in nix.conf
    on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • Linux
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

postPatch =
let
# Closure for running a command in a stdenv for building with given buildInputs
wrapped = command: buildInputs:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use makeWrapper/wrapProgram?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I ran into is that Theano tries to compile the code it generates with just a standard g++ call, and then expects to be able to dynamically load the results into itself. Generally this goes wrong because

  1. the system g++ cannot find the required libraries or, even if the first succeeds,

  2. the resulting binary is not compatible with running Theano (i.e., depends on the system versions of common libraries like glibc which, especially if you are not using NixOS, may not match the ones Theano was compiled with).

Ideally I think you would want the runtime builds to execute in the same sort of environment that build-time builds were done in. Hence the above code to capture an appropriate stdenv. I emailed the nix-dev mailing list about this earlier, but haven't received any replies.

It would be possible to further patch Theano or use makeWrapper/wrapProgram to add the appropriate compiler flags so it can find its libraries. I think the above may be more elegant and less maintenance though as you automatically get the full build environment (i.e., the closure of the required includes, the setting of the appropriate NIX_* flags to enforce compiler purity, etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this sort of function could be added the standard libraries?

I say that as I think there are actually quite a few places it could be used. For example, I recently packaged up COFFEE which has similar issues (another python package that compiles code on the fly and then dynamically loads it into itself).

I also noticed in another package that behind the scenes nvcc was being called and it was using my system gcc. This was failing as nvcc wasn't compatible with the system gcc version. Ideally a Nix provided nvcc should be using a Nix provided gcc determined at the installation time of the nvcc.

'';

in ''
sed -i -e 's|@g++@|${wrapped "g++" [ libgpuarray ]}|' theano/configdefaults.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nicer to create a new variable, say compiler = ${wrapped "g++" [ libgpuarray ]} and then modify the patch to use @compiler@ (or @stdenv.cc@).

# Test whether or not g++ is present: disable C code if it is not.
try:
- rc = call_subprocess_Popen(['g++', '-v'])
+ rc = call_subprocess_Popen(['@g++@', '-v'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Directly using g++ is indeed not going to work because with Nix we wrap our compilers as you've noticed. So its likely best here to just use stdenv.cc (though that might not be ideal on Darwin since they use LLVM).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a link to the relevant bit in the patched file. As you can see, they treat the Mac case specially by trying to run clang++.

Perhaps the patch should delete this compiler determination mess and replace it with

param = "@compiler@"

to be substituted at build time by according to the Nix expression.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, purge that mess :)

@twhitehead twhitehead force-pushed the theano-cleanup branch 3 times, most recently from 19f2309 to 2d6e314 Compare June 19, 2017 22:56
@twhitehead
Copy link
Contributor Author

  1. Cleaned up all the compiler stuff as per the prior discussion (used substituteInPlace as a cleaner substitution system without as many escaping issues as sed for this).
  2. Locked nvcc to a Nix provided gcc as ran into issues with it using whatever one happens to be in the path as well (would be ideal if the nvcc package did this itself, but that seems outside the scope of the Theano work).

@twhitehead
Copy link
Contributor Author

nvcc wants the bin directory, had accidentally given it the top level directory.

@twhitehead
Copy link
Contributor Author

Noticed the non-gpu version had been updated to 0.9.0. Added an update of the combined to 0.9.0. Not sure why the travis failed on the prior. Some sort of error about the python dogpile cache module. Suspect it may have have been some (hopefully transient) issue with the travis system/scripts.

@FRidh
Copy link
Member

FRidh commented Jun 23, 2017

Something is not good yet. When I build python.pkgs.Theano it starts downloading the cuda toolkit and such. Looks like the patches are used for each build (with and without cuda), introducing a dependency on cuda regardless. I suppose the second patch is best split in two, with one only used for TheanoWithCuda.

@twhitehead
Copy link
Contributor Author

Thanks for the heads up @FRidh. I'll have a look and fix it up.

@twhitehead
Copy link
Contributor Author

@FRidh I believe I have address the issue. It was as you had figured (dependencies added by the patches). I hadn't realized that callPackage provided values for arguments with default values as well.

'';

# Theano spews warnings and disabled flags if the compiler isn't named g++
cxx_compiler = wrapped "g++" [ libgpuarray cudnn ];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a cuda dependency still exists because of this here

cxx_compiler = wrapped "g++" [ libgpuarray cudnn ];
in ''
substituteInPlace theano/configdefaults.py \
--subst-var-by cxx_compiler '${cxx_compiler}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

@twhitehead
Copy link
Contributor Author

Yes. You are correct. I thought it was good for sure when I checked the derivation file and verified there were no cuda related items in it. Should have checked the closure of the built package.

Sorry about all the back and forth and thanks for the prompt reviews.

@FRidh
Copy link
Member

FRidh commented Jul 17, 2017

No problem. It can be hard, especially in the beginning, to see what's all happening. Just check with $ nix-store -qR $(nix-build -A python.pkgs.Theano) what it needs during run-time and
$ nix-store -qR $(nix-instantiate -A python.pkgs.Theano) during build-time.

@twhitehead
Copy link
Contributor Author

twhitehead commented Nov 5, 2017

Sorry for the delay. Had to make some serious progress on some other projects. I see master has now picked up its own merged CUDA and non-CUDA versions. I gave this one a try and it doesn't work. Here is the test code (lifted from the Theano GPU documentation)

cat temp.py
from theano import function, config, shared, tensor
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, tensor.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

commands (note that TheanoWithoutCuda and TheanoWithCuda are backwards in master) and results

nix-build -E '(import <nixpkgs> { }).TheanoWithoutCuda'
THEANO_FLAGS='device=cuda' ./result/bin/python temp.py
...
Problem occurred during compilation with the command line below:
/nix/store/dn8rmk842arvn8nggiqzla72bbw9a1wy-g++/bin/g++ -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wl,-rpath,"/opt/sharcnet/cuda/7.5.18/lib64" -march=haswell -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mxsave -mxsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-clwb -mno-mwaitx -mno-clzero -mno-pku --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=20480 -mtune=haswell -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -fPIC -I/opt/sharcnet/cuda/7.5.18/include -I/nix/store/aq6nbc2v4inzlnmwwdxn6yi72z4yyvhb-python-2.7.14-env/lib/python2.7/site-packages/numpy/core/include -I/nix/store/aq6nbc2v4inzlnmwwdxn6yi72z4yyvhb-python-2.7.14-env/include/python2.7 -I/nix/store/aq6nbc2v4inzlnmwwdxn6yi72z4yyvhb-python-2.7.14-env/lib/python2.7/site-packages/theano/gof -L/opt/sharcnet/cuda/7.5.18/lib64 -L/nix/store/5nb6vmj382dh9dddkk82d4pwkmvxmgwh-python-2.7.14/lib -L/nix/store/aq6nbc2v4inzlnmwwdxn6yi72z4yyvhb-python-2.7.14-env/lib -fvisibility=hidden -o /home/tyson/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-centos-6.8-Final-x86_64-2.7.14-64/tmpNX_tXh/11617ef9bd367a3818a3537ffe37d382.so /home/tyson/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-centos-6.8-Final-x86_64-2.7.14-64/tmpNX_tXh/mod.cpp -lcudnn -lpython2.7
/home/tyson/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-centos-6.8-Final-x86_64-2.7.14-64/tmpNX_tXh/mod.cpp:4:19: fatal error: cudnn.h: No such file or directory
 #include "cudnn.h"
                   ^
compilation terminated.

ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
...
Exception: ('The following error happened while compiling the node', DnnVersion(), '\n', 'Compilation failed (return status=1): /home/tyson/.theano/compiledir_Linux-2.6-el6.x86_64-x86_64-with-centos-6.8-Final-x86_64-2.7.14-64/tmpNX_tXh/mod.cpp:4:19: fatal error: cudnn.h: No such file or directory.  #include "cudnn.h".                    ^. compilation terminated.. ', '[DnnVersion()]')

I've merged it with mine to produce something that works. I also finally fixed the dependencies so the non-CUDA one is no longer needlessly pulling in CUDA dependencies.

Thanks! -Tyson

PS: At least the following problems exist in the merged Theano in master which are addressed with this pull

  1. TheanoWithCuda and TheanoWithoutCuda are reversed (i.e., with is really without and vice versa).
  2. The runtime compiler isn't being invoked in away that successfully compiles code (as seen above).
  3. Theano won't work with python < 3 without the future dependency which was stripped out.
  4. Theano won't work with cudnn > 6 and spews a warning for cudnn > 5.
  5. Theano won't work with cuda > 7.5.

@twhitehead
Copy link
Contributor Author

@FRidh thanks for the tip about nix-store -qR BTW. I hadn't realized you could run that on a derivation. Nice!

@twhitehead
Copy link
Contributor Author

@FRidh Any thoughts on this one? Not sure what the grahamcofborg-eval thing is.

@idontgetoutmuch
Copy link
Contributor

Is this related in any way? #33073

@twhitehead
Copy link
Contributor Author

Any thoughts on this anyone? Thanks!

@twhitehead
Copy link
Contributor Author

Another month so figured I would post again.

Any chance this can be merged?

As far as I know, it's good to go.

Thanks! -Tyson

@twhitehead
Copy link
Contributor Author

Been a couple of months again now, so posting again.

Could someone please give me some directions on how to proceed here?

Do I have to close this and open a new ticket?

Thanks! -Tyson

@FRidh
Copy link
Member

FRidh commented Apr 17, 2018

It would be nice if the history would be "fixed". e3f2c12 seems to be undone again (partially) somewhere.

@FRidh
Copy link
Member

FRidh commented Apr 17, 2018

Other than that, the final changes look fine to me.

@twhitehead
Copy link
Contributor Author

Looks like master bumped the Theano version back on December 30th so the patches don't apply cleanly anymore. I'll have a look into it.

@FRidh was there anything particular you were wanting fixed up about the history (other than making sure the cuda and non-cuda version are unified in the final version)?

@FRidh
Copy link
Member

FRidh commented Aug 2, 2018

@twhitehead master should not be merged into your changes. Instead, your changes should be rebased onto it.

@twhitehead
Copy link
Contributor Author

Ok. I'm going to do a merge squash back onto master and then force push that back to the pull request. I believe this should give one simple commit on top of master instead of the current very complex history.

Original was a mix of config changes and code changes with a search
and replace that also changed unintended bits such as messages.
@twhitehead
Copy link
Contributor Author

@FRidh I believe this should finally be good. I've brought it back up to date so grahamcofborg is now happy, and I've replaced the convoluted history with just three self-contained commits on the top of master.

@FRidh FRidh merged commit 52e7817 into NixOS:master Aug 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants