python.pkgs.tensorflow: 1.15.0 -> 2.0.0 #70910

Closed

timokau
Member

@timokau timokau commented Oct 10, 2019

Motivation for this change

Tensorflow 2.0 was released. This is WIP. The only testing I did for now was to build the py3, non-cuda variant on aarch64 using the community builder. Things to do:

  • more testing
  • fill in cuda hash
  • think about whether or not we want to enable cuda by default, as upstream seems to do now. Least surprise for users, easiest to handle. But annoying since cuda is unfree, thereby "infecting" tensorflow.
  • somehow keep 1.14? It is problematic to mix python versions, so maybe we shouldn't do it. I think users could just use tf 2 in its compatibility mode instead.

@timokau timokau requested a review from FRidh as a code owner October 10, 2019 13:13
@timokau timokau mentioned this pull request Oct 10, 2019
@timokau timokau changed the title python.pkgs.tensorflow: 1.14.0 -> 2.0. [WIP] python.pkgs.tensorflow: 1.14.0 -> 2.0. Oct 10, 2019
@timokau timokau changed the title [WIP] python.pkgs.tensorflow: 1.14.0 -> 2.0. [WIP] python.pkgs.tensorflow: 1.14.0 -> 2.0.0 Oct 10, 2019
@FRidh
Member

FRidh commented Oct 10, 2019

If users need 1.14 they can take it from an older Nixpkgs.

@danieldk
Contributor

danieldk commented Oct 10, 2019

Please keep 1.14/1.15 around for a while! A lot of users cannot switch to TF 2.0's compat mode: TF 2.0 completely dropped tf.contrib, on which a lot of models depend (for instance, all our models use at least something from tf.contrib), and tensorflow-addons currently only provides a subset of the original functionality.

Upstream will soon release 1.15 and they will provide security updates for another year:
https://twitter.com/TensorFlow/status/1182007823120633864

Edit: just wanted to add: I would think that for most folks who are still stuck on 1.x, having libtensorflow-bin and pythonPackages.libtensorflow-bin would be good enough (which is what we have been using so far).

@timokau
Member Author

timokau commented Oct 10, 2019

A lot of users cannot switch to TF 2.0's compat mode.

Why is that? If I understand the release notes correctly (I haven't tested this yet) you should be able to just set the environment variable TF2_BEHAVIOR=0 and have it act like tf1.

Keeping multiple versions with python is not simple, since the python interpreter will not be able to deal with different versions. Also there would need to be duplicates for all reverse deps etc.

@danieldk
Contributor

danieldk commented Oct 10, 2019

A lot of users cannot switch to TF 2.0's compat mode.

Why is that? If I understand the release notes correctly (I haven't tested this yet) you should be able to just set the environment variable TF2_BEHAVIOR=0 and have it act like tf1.

As I mentioned, TF 2.0 with compat does not have tf.contrib. First sentence from the TF2 migration pages:

It is still possible to run 1.X code, unmodified (except for contrib), in TensorFlow 2.0:

https://www.tensorflow.org/guide/migrate

We are currently migrating code to Tensorflow 2.0, but it will take a while before all the contrib functionality is rewritten.

Keeping multiple versions with python is not simple, since the python interpreter will not be able to deal with different versions.

I am not sure if I understand. If we had a 1.14 bin version besides 2.0 bin/Bazel versions, that would be fine as long as people do not use both packages at the same time, right?

Also there would need to be duplicates for all reverse deps etc.

Don't 1.14/2.0 largely share the same dependencies?

@timokau
Member Author

timokau commented Oct 10, 2019

Ah right, I seem to have misunderstood. So tf 1.15 can work as tf1 or tf2, while tf 2.0.0 can only work as tf2. But tf 1.15 still won't be able to execute tf 2.0 code entirely without modification, since you need to actively enable the tf2 features.

What a mess. In that case it's probably best to wait for 1.15 to be released (currently in rc3) and ship that first. Then, maybe after a month or two, when the core infrastructure has had time to adapt (for example, sonnet 2.0 is still in beta), do the 2.0 update.

@jonringer
Contributor

Yeah, I would hold off on bumping this. The data science community needs some time.

@timokau
Member Author

timokau commented Oct 17, 2019

1.15.0 has been released. The update is in #71282, I'll keep this open as a WIP for the eventual 2.0 update.

@timokau timokau changed the title [WIP] python.pkgs.tensorflow: 1.14.0 -> 2.0.0 python.pkgs.tensorflow: 1.15.0 -> 2.0.0 Nov 20, 2019
@timokau
Member Author

timokau commented Nov 20, 2019

tensorflow 2.0.0 works now (with bazel 1.19). So from a technical point of view this is ready. Any strong opinions on when to merge this?

@jonringer
Contributor

When most of the downstream packages have had time to bump to tensorflow 2. I would almost be willing to make an exception to the "only 1 package version for python" rule. This package is pretty much impossible to get without significant effort, so NixOS users who need tensorflow<2 will have a difficult time.

@jonringer
Contributor

Also, tensorflow 1.15 contains an implementation of both the v1 and v2 APIs, so I think this is the correct version to have during this transition period.

@danieldk
Contributor

danieldk commented Nov 20, 2019

Also, I think Tensorflow 2.0 does not provide a C library yet:

Note: There is no libtensorflow support for TensorFlow 2 yet. It is expected in a future release.

https://www.tensorflow.org/install/lang_c

Which means that graphs created with Tensorflow 2.0 (in contrast to v2 compat in 1.15.0) cannot be used with some bindings yet (e.g. the Rust binding uses libtensorflow).

@timokau
Member Author

timokau commented Nov 20, 2019

When most of the downstream packages have had time to bump to tensorflow 2. I would almost be willing to make an exception to the "only 1 package version for python" rule. This package is pretty much impossible to get without significant effort, so NixOS users who need tensorflow<2 will have a difficult time.

Well, as @FRidh suggested, it's always possible to take it from an earlier nixpkgs. Since 1.15.0 is in the channels now, that should be as easy as

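let
  # Package set pinned to a nixpkgs revision that still ships tensorflow 1.15.0.
  tf1_pkgs = let
    nixpkgs-rev = "e89b21504f3e61e535229afa0b121defb52d2a50"; # nixos-unstable from 2019-11-20
  in import (builtins.fetchTarball {
    name = "nixpkgs-${nixpkgs-rev}";
    url = "https://github.com/nixos/nixpkgs/archive/${nixpkgs-rev}.tar.gz";
  }) {};
in
something # placeholder: any expression that takes tensorflow from tf1_pkgs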
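
For a project, the pin above could be dropped into a shell.nix roughly as follows. This is a minimal sketch and has not been tested against this exact revision; it assumes that python3Packages.tensorflow at the pinned revision is the 1.15.0 build mentioned in this thread, and the name pythonEnv is purely illustrative.

```nix
# shell.nix: sketch of a project shell that takes tensorflow 1.x from a pinned nixpkgs.
let
  # Revision quoted in this thread (nixos-unstable from 2019-11-20).
  nixpkgs-rev = "e89b21504f3e61e535229afa0b121defb52d2a50";
  tf1_pkgs = import (builtins.fetchTarball {
    name = "nixpkgs-${nixpkgs-rev}";
    url = "https://github.com/nixos/nixpkgs/archive/${nixpkgs-rev}.tar.gz";
  }) {};
  # Assumption: python3Packages.tensorflow is 1.15.0 at this revision.
  pythonEnv = tf1_pkgs.python3.withPackages (ps: [ ps.tensorflow ]);
in
tf1_pkgs.mkShell {
  buildInputs = [ pythonEnv ];
}
```

Running nix-shell in the project directory would then give a Python interpreter with the pinned tensorflow, independent of whatever the current channel ships.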

@timokau
Member Author

timokau commented Jan 16, 2020

I'm not sure if it is worth it. The pinning experience is not that horrible (especially if you use a shell.nix in your project), and might create incentives in the right direction. For what it's worth:

  • ArchLinux already updated (and they don't even have the pinning option)
  • optuna seems to support tf2, although some examples are failing.

Is anyone aware of packages in nixpkgs that actually still have a hard dependency on tf1?

@danieldk
Contributor

Is anyone aware of packages in nixpkgs that actually still have a hard dependency on tf1?

As far as I know the Haskell tensorflow package, because it requires libtensorflow. @basvandijk

@jonringer
Contributor

I'm not sure if it is worth it. The pinning experience is not that horrible (especially if you use a shell.nix in your project), and might create incentives in the right direction. For what it's worth:

* ArchLinux already [updated](https://www.archlinux.org/packages/community/x86_64/python-tensorflow/) (and they don't even have the pinning option)

* optuna seems to support tf2, although some examples are failing.

Is anyone aware of packages in nixpkgs that actually still have a hard dependency on tf1?

There are a few Python packages: graph_nets, imbalanced_learn, tflearn and a few others. Looking at Arch, it looks like they don't have any packages that directly rely on tensorflow besides the ones bundled with it, like tensorboard.

@jonringer
Contributor

Is anyone aware of packages in nixpkgs that actually still have a hard dependency on tf1?

As far as I know the Haskell tensorflow package, because it requires libtensorflow. @basvandijk

Note: There is no libtensorflow support for TensorFlow 2 yet. It is expected in a future release. https://www.tensorflow.org/install/lang_c

@cdepillabout
Member

As far as I know the Haskell tensorflow package, because it requires libtensorflow.

tensorflow-haskell hasn't seen much love lately. There doesn't appear to be anyone working on even updating it for tensorflow-1.15.

We shouldn't use tensorflow-haskell as a reason to keep back tensorflow-2.

@jonringer
Contributor

@cdepillabout agreed.

My point was more to illustrate that the ecosystem around tensorflow is largely in disarray with the switch. This is only amplified by how many resources are invested in developing ML models. There's a large resistance to upgrading just because a new version is out.

@Ericson2314
Member

Ericson2314 commented Jan 17, 2020

Yeah I would think it's fine to have tensorflow_1 and tensorflow_2 and a tensorflow alias, with the restriction that libraries only depend on tensorflow. Then both can be cached but there is no package coherence problem.
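
As a rough illustration of that layout, written here as an overlay: this is only a sketch, and both the attribute names and the ./tensorflow/1 and ./tensorflow/2 paths are hypothetical, not what nixpkgs actually contains.

```nix
# Hypothetical overlay: two versioned tensorflow attributes plus one alias.
self: super: {
  python3 = super.python3.override {
    packageOverrides = pyself: pysuper: {
      # Both versions are plain attributes, so both can be built and cached.
      tensorflow_1 = pyself.callPackage ./tensorflow/1 { };
      tensorflow_2 = pyself.callPackage ./tensorflow/2 { };
      # Libraries depend only on this alias, so a single package set
      # never pulls in both major versions at once.
      tensorflow = pyself.tensorflow_1;
    };
  };
}
```

Switching the default would then be a one-line change to the alias, while end users who need a specific major version can still ask for tensorflow_1 or tensorflow_2 directly.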

@Zhen-hao

What is blocking the merge now? Is there anything I can help with?
Meanwhile, I've moved to PyTorch for my project since I can't get TF2 on NixOS.
Disclaimer: I am new to Python.

@jonringer
Contributor

If we document a way to get tensorflow_1, then I'm fine with merging. Still unfortunate about the major breakage

@FRidh
Member

FRidh commented Feb 9, 2020

Yeah I would think it's fine to have tensorflow_1 and tensorflow_2 and a tensorflow alias, with the restriction that libraries only depend on tensorflow. Then both can be cached but there is no package coherence problem.

There are some other special cases where we do this, e.g. Django.

@jonringer
Contributor

We should probably get this in before the feature freeze tomorrow

@danieldk
Contributor

We should probably get this in before the feature freeze tomorrow

Agreed! I think the same counts for

#75827

@timokau
Member Author

timokau commented Feb 10, 2020

For the plain update, this should be good to go IIRC. My time is pretty limited right now, so I'll not be working on this before the branchoff. Merge / fork off at your own discretion.

@jonringer
Contributor

Seems that the branch-off has been delayed a bit.

@jonringer
Contributor

Well, the current build is broken, so I don't see any harm in merging this.

@jonringer
Contributor

@GrahamcOfBorg build python3Packages.tensorflow

@mjlbach
Contributor

mjlbach commented Feb 27, 2020

There are current PRs that will depend on having a packaged TF 1 variant due to the sunsetting of contrib, see #61253

@danieldk
Contributor

danieldk commented Feb 27, 2020

There are current PRs that will depend on having a packaged TF 1 variant due to the sunsetting of contrib, see #61253

Which currently does not work, because neither tensorflow nor tensorflow-bin currently works. The former because it relies on Bazel 1 (#77771), so this requires keeping a Bazel 1 derivation around. Also, Tensorflow 1.15 requires patches to work with the current glibc (#81035, but you'd know ;)).

These things are only going to get worse as the ecosystem moves on and Google loses interest in maintaining the 1.15 branch.

I think at this point, it makes most sense to switch to Tensorflow 2, since that is what most people will expect now and is required by various high-profile projects (e.g. Hugging Face transformers). Tensorflow 1.15 and projects that rely on it could always be packaged in NUR.

Random observation: a lot of these packages are quite high-friction. I have started packaging my own PyTorch because the nixpkgs version is usually trailing quite a bit. Also, the libtorch C++ API breaks between releases, so we always have to cherry pick releases that line up with the tch-rs binding anyway.

I wonder if these packages belong in nixpkgs rather than in a specialized NUR repository. PyTorch and Tensorflow, probably, because people expect to be able to fire up a Jupyter notebook and experiment. But if they are always going to be held back because they are moving fast and updating them breaks other derivations in nixpkgs, we'd always be shipping ancient Tensorflow and PyTorch versions.

@mjlbach
Contributor

mjlbach commented Feb 27, 2020

Ha, true, but once my PRs get merged they will! I agree with you in principle (I have migrated all of my active projects to tf2). However, several libraries, like Sonnet 1, Deepmind's graph neural network library (the name escapes me), and Deeplabcut, all depend on tf1 and are unlikely to be updated. I also have a PR for adding a configurable bazel build with bazel 1 (#81033), so I don't see the harm in having a legacy tensorflow implementation; we could even add a separate tensorflow_1 derivation.

@danieldk
Contributor

danieldk commented Feb 27, 2020

we could even add a separate tensorflow_1 derivation.

That was my initial suggestion, but this is probably not a good idea as discussed upthread. The objection is that this could result in closures that contain two versions of Tensorflow, which conflict in Python.

@tbenst
Contributor

tbenst commented Feb 27, 2020

Hm, is there some workaround for applications?

The PR that @mjlbach referenced would be of high use to the scientific community. Deeplabcut has been used in dozens or more high profile papers, and popularity is only increasing.

Unfortunately the package is a PITA to install, and neither conda nor docker is fully functional due to various restrictions. Nix would really shine here, and it’s a gateway to growing the Nix data science community.

TF1 vs TF2 is more analogous to Python 2 vs Python 3 than a typical package update. There’s a huge ecosystem dependent on tf1 that is not switching in the foreseeable future.

@Ericson2314
Member

I am currently rebasing this to be a new additional version rather than replacement.

@jonringer
Contributor

related: #83518

@Ericson2314
Member

We've landed the 2.1 change.

@CMCDragonkai
Member

CMCDragonkai commented Jul 8, 2020

Can someone explain the differences and equivalences?

In TF1, I used tensorflowWithCuda and tensorflowWithoutCuda.

Now with TF2, what is the equivalent here?

I see:

  • tensorflow_2
  • tensorflow-bin_2
  • tensorflow-build_2

What translates to the "CUDA" and non-CUDA variant?

Also:

Python 3.8 support requires TensorFlow 2.2 or later.

I see that 2.2 is not yet in nixpkgs.

@jonringer
Contributor

be the change you want to see :)

@Zhen-hao

Zhen-hao commented Jul 8, 2020

tensorflowWithoutCuda

@CMCDragonkai
use python37Packages.tensorflow_2.override { cudaSupport = true; }
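
Spelled out as a shell.nix, that could look roughly like the sketch below. It assumes python37Packages.tensorflow_2 accepts a cudaSupport argument, as the override above suggests; the names tfWithCuda and tfWithoutCuda are only illustrative, and allowUnfree is set because CUDA is unfree, as noted at the top of this thread.

```nix
# shell.nix: sketch of selecting the CUDA or non-CUDA variant via override.
{ pkgs ? import <nixpkgs> { config.allowUnfree = true; } }:
let
  # Assumption: tensorflow_2 exposes a cudaSupport argument.
  tfWithCuda = pkgs.python37Packages.tensorflow_2.override { cudaSupport = true; };
  tfWithoutCuda = pkgs.python37Packages.tensorflow_2.override { cudaSupport = false; };
in
pkgs.mkShell {
  # Swap tfWithCuda for tfWithoutCuda to get the CPU-only build.
  buildInputs = [ (pkgs.python37.withPackages (ps: [ tfWithCuda ])) ];
}
```

Note that an overridden variant is unlikely to be in the binary cache, so expect a long local build.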
