Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Nix-builtin fetchurl where possible #41147

Closed

Conversation

bobvanderlinden
Copy link
Member

@bobvanderlinden bobvanderlinden commented May 27, 2018

Motivation for this change

Nixpkgs uses the fetchurl function quite a lot. That function uses Curl binary which runs inside a build sandbox. This requires the netrc file required to download private resources to be exposed inside of the sandbox. This was mentioned in #41000.

I was interested how viable it would be to use the Nix-builtin fetchurl instead of using the Curl binary.

This PR changes the way fetchurl works by using fetchurl from Nix by default, but fallback to the Curl variant when arguments are being used that are not supported by the Nix fetchurl.

The original fetchurl/default.nix was renamed to fetchurl/curl.nix. fetchurl/default.nix uses a derivation with builder = "builtins:fetchurl", similar to fetchurl/boot.nix. fetchurl/default.nix has the same signature as fetchurl/curl.nix, but will check for incompatible arguments to see whether it should fallback to fetchurl/curl.nix.

Consider this PR a work in progress and a way to gauge interest/motivation/problems/compatiblity.

The problems that I anticipate are:

  • Mirror support are not in Nix yet, so quite a lot of fetchurl calls fallback to curl.
  • NIX_CURL_FLAGS will be broken for those calls that do use Nix.
  • nix-build does not show progress of downloads.
Things done
  • Tested using nix-build -A nginx and nix-build -A mysql.
  • Tested using sandboxing (nix.useSandbox on NixOS, or option build-use-sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@bobvanderlinden
Copy link
Member Author

It seems @GrahamcOfBorg is failing on a derivation that's already failing on master (ran this locally).

@matthewbauer
Copy link
Member

Could we do some sort of feature detection for this? Like checking whether the new fetchurl is available to retain compatibility with old Nix?

@bobvanderlinden
Copy link
Member Author

That's a good idea. I've added an additional check for the Nix version. I wasn't able to find a good way to check for the builtin:fetchurl feature, so it's checking for Nix version (which needs to be 2.0 or higher).

@matthewbauer
Copy link
Member

matthewbauer commented Jun 1, 2018

That's a good idea. I've added an additional check for the Nix version. I wasn't able to find a good way to check for the builtin:fetchurl feature, so it's checking for Nix version (which needs to be 2.0 or higher).

That makes sense. I think that's the only proper way to do it.

Also:

  • Mirror support are not in Nix yet, so quite a lot of fetchurl calls fallback to curl.

Can we implement this in Nixpkgs? It should just be some string manipulation and a lookup of mirrors.nix.

  • NIX_CURL_FLAGS will be broken for those calls that do use Nix.

I'm not sure I understand this. I think we will also have to rewrite builders that do use NIX_CURL_FLAGS though: https://github.com/NixOS/nixpkgs/search?q=NIX_CURL_FLAGS

  • nix-build does not show progress of downloads.

I think this is okay for now. nix-build never had good progress support anyway & as long as nix build progress works I'm fine with that.

@shlevy
Copy link
Member

shlevy commented Jun 1, 2018

I don't understand what private files have to do with fetchurl usage in nixpkgs. What's wrong with the statusquo exactly?

@shlevy
Copy link
Member

shlevy commented Jun 1, 2018

Note that there is unfortunately no way to actually check if builtins:fetchurl is available, as the version of Nix used to evaluate the expressions is not necessarily the same as the version of the nix store builder. Indeed, you can instantiate a derivation on one machine, copy it to another, and build it there... So I'm not sure the version checks gain us much. We should just decide as a community when it's OK for nix 2.0+ store features, like builtins:fetchurl and structured attrs, are OK in nixpkgs.

@bobvanderlinden
Copy link
Member Author

bobvanderlinden commented Jun 1, 2018

Can we implement this in Nixpkgs? It should just be some string manipulation and a lookup of mirrors.nix.

I don't think we can handle 404's from outside the derivation, but maybe I'm misunderstanding this part?

EDIT: I think we probably need to add support for mirrors inside Nix itself, however we can already take the first site from the mirrors to make most fetches use the builtin method.

I'm not sure I understand this. I think we will also have to rewrite builders that do use NIX_CURL_FLAGS though: https://github.com/NixOS/nixpkgs/search?q=NIX_CURL_FLAGS

Yes, that's true. We also need to add options to builtin:fetchurl to support the use-cases of those builders. However, another part is that you can pass NIX_CURL_FLAGS to nix and nix-daemon. So if we drop support for NIX_CURL_FLAGS, existing users with tooling around NIX_CURL_FLAGS might be affected.

I think that is the right way to go though, because NIX_CURL_FLAGS leans too much on the interface of the curl command-line tool.

I don't understand what private files have to do with fetchurl usage in nixpkgs. What's wrong with the statusquo exactly?

The documentation mentions to use netrc-file for fetching resources that require authorization. However with the current implementation, the netrc-file only applies to fetches being handled by Nix/nix-daemon. It does not actually apply to curl used within derivations (like fetchurl).

The most popular way to get a netrc-file used by fetchurl is using NIX_CURL_FLAGS. This environment variable can be passed through the command-line nix tool to the derivation. There are 2 problems with this approach.

One problem is that people need to pass it to their command-line tool as well as nix-daemon. On OSX nix-daemon is called from launchd using org.nixos.nix-daemon.plist. That file does not supply any NIX_CURL_FLAGS that refers to a netrc file. Changing org.nixos.nix-daemon.plist is awkward, since the file is defined inside your default Nix profile under root and should move along with upgrades of Nix.

Another problem is that NIX_CURL_FLAGS is being used inside the sandbox. That means that any netrc file referred inside the NIX_CURL_FLAGS will need to be exposed inside the sandbox. That is certainly possible using sandbox-paths, but that would expose all your authorisations to the sandbox. It would be nice if that part could be handled by Nix/nix-daemon, so that the built itself can be safer.

That is why I think it would be nice to move more towards using the methods in Nix to fetch HTTP(S) resources. That way we can make this process less vague for the user, the documentation more consistent and the builds more secure.

Note that there is unfortunately no way to actually check if builtins:fetchurl is available, as the version of Nix used to evaluate the expressions is not necessarily the same as the version of the nix store builder. Indeed, you can instantiate a derivation on one machine, copy it to another, and build it there... So I'm not sure the version checks gain us much. We should just decide as a community when it's OK for nix 2.0+ store features, like builtins:fetchurl and structured attrs, are OK in nixpkgs.

Hmm, good point. I ran into builtins:fetchurl due to <nix/fetchurl.nix>, which is being used by fetchurlBoot. So it is already being used to bootstrap tools like curl itself, but I guess those usually are retrieved from cache.nixos.org. Suggestions on how to make such a decision are very welcome. Should I make an issue about plans for Nix < 2.0 deprecation?

@shlevy
Copy link
Member

shlevy commented Jun 1, 2018

But none of the fetchurl calls in nixpkgs itself should require authorization. I get why you might want to use bultins:fetchurl or even builtins.fetchurl (which operates at eval time, not build time) in a private project, but what does it matter to nixpkgs?

@bobvanderlinden
Copy link
Member Author

That's true, but fetchurl is also being used by packages that are based on bundix/node2nix/mvn2nix/etc. For those you do want to be able to fetch private resources. I fixed this issue by overriding fetchurl with a built-in fetchurl for only these packages, but this is somewhat non-trivial. This is not something I'd like to recommend beginning Nix users. It hurts the UX and is hard to document.

A global netrc file would help quite a lot in this regard. It will work for private Nix imports, private cache as well as private src. A lot easier to grasp imo.

@matthewbauer
Copy link
Member

Aren’t there other benefits besides private files? For instance:

  • progress updating integration with Nix 2.0
  • simplify bootstrapping of Nixpkgs (no more fetchurlBoot)
  • configure once, work everywhere with Nix’s netrc. Users behind corporate firewalls may need some special options and needing to change both nix.conf and NIX_CURL_FLAGS to get this to work. That’s a little unintuitive unless you know how Nixpkgs internals work w.r.t. Nix.

@Ericson2314
Copy link
Member

Ericson2314 commented Jun 3, 2018

Yeah I'd be excited to get rid of fetchurlBoot

@bobvanderlinden
Copy link
Member Author

bobvanderlinden commented Jun 3, 2018

Shall I make removal of fetchurlBoot part of this PR, or would it be better to create a separate PR afterwards?

Edit: just remembered, the fetchurlBoot does work with mirrors as it will just take the first entry and use that one. So we cannot replace fetchurlBoot with the implementation in this PR yet.

@bobvanderlinden
Copy link
Member Author

What can I do to get this moving forward?

@matthewbauer matthewbauer modified the milestones: 18.09, 19.03 Aug 2, 2018
@ElvishJerricco
Copy link
Contributor

This is awesome. I'd love to see this merged, if only for the ability to see download progress with Nix 2.0.

@bobvanderlinden
Copy link
Member Author

@matthewbauer @shlevy @Ericson2314 @edolstra What can I do to move this forward?
Should we first support all features in Nix?
I'd like something actionable, as it seems this is a desired feature.

@edolstra
Copy link
Member

There are a few things missing in builtin fetchurl:

  • Mirror support. (It does already have support for content-addressed mirrors so maybe this is not a big deal.)

  • Support for unpacking tarballs/zipfiles. This would allow it so handle most fetchzip-based functions like fetchFromGitHub. Currently it can only unpack NARs. (The alternative is to merge impure derivation support, since then the result of fetchurl can be impure so long as its output is only used by a fixed-output derivation like an unpacker.)

@bobvanderlinden
Copy link
Member Author

@edolstra Indeed, for cases where mirrors are used, curl is still being used as a fallback:

# Fallback to fetchurlCurl when builtin:fetchurl is not supported
# or arguments are used that are not supported by the Nix-builtin fetchurl.
if (builtins.compareVersions "2.0" builtins.nixVersion >= 0)
|| curlOpts != ""
|| netrcPhase != null
|| postFetch != ""
|| downloadToTemp
|| showURLs
|| urls != []
|| (lib.strings.hasPrefix "mirror:" url)
then
fetchurlCurl args
else

fetchTarball is already built-in right? I thought fetchzip is used in fewer cases.

Would you say these features need to be implemented in Nix before moving over using fallbacks like this PR does?

@danbst
Copy link
Contributor

danbst commented Jan 14, 2019

So far I can't see benefits, except slightly more cleaner in some cases:

progress updating integration with Nix 2.0

isn't it already a case? Or would it make nix-build to produce nix build-like progress bar?

simplify bootstrapping of Nixpkgs (no more fetchurlBoot)

you've shown already it doesn't

configure once, work everywhere with Nix’s netrc. Users behind corporate firewalls may need some special options and needing to change both nix.conf and NIX_CURL_FLAGS to get this to work. That’s a little unintuitive unless you know how Nixpkgs internals work w.r.t. Nix.

because this is a partial fetchurl replacement, it still won't work "in general case". Even 1 curl-fetchurl package can spoil a build.

Also, if it gets merged, it will deprecate things like postFetch and urls, because whenever one adds those, an objection will come: "but wait, this may break someone's setup with netrc!".

So, maybe it would make more sense to replace fetchrul with builtins.fetchurl only in places, that guarantee not to use any extended attribute (bundix/node2nix/mvn2nix/etc like you've described). Or deprecate extended attributes in Nixpkgs altogether? (unlikely, because we then would want to deprecate mirrors as well, which is no-no)

@bobvanderlinden @matthewbauer what do you think?

@bobvanderlinden
Copy link
Member Author

I was thinking that eventually we want all url fetches to be done by Nix itself, instead of the curl cmdline. We still need to support some features of the curl solution, so until those are implemented in Nix, curl will need to be used as fallback. This PR was a first step towards gradually migrating all url fetches to Nix.

If 1 curl fetch spoils your build, it is because it is using features not supported by Nix. Explicit usage of builtins would be cool, but a lot of helperfunctions are already referring to the nixpkgs fetchurl.

This PR isn't intended to be a really good and clean solution, but just a way to migrate over to full builtins.fetchurl coverage.

As for the download progress (correct me if I'm wrong, because I haven't tried this is a while), it reports downloads using Nix fetchurl differently than using curl. Currently fetches mostly reported as builds, whereas fetches for Nix files and the lines are reported as downloads. It would be nice to have all fetches be reported the same. This PR isn't a full solution for that, but it is a step toward that goal.

That said, I was only making this PR because of frustration regarding half-functioning netrc file for Nix and thereby the inability to fetch private files needed to set up company environments. This is one of the points blocking potential adoption at my workplace. I cannot 'sell' Nix if it doesn't have a consistent answer for fetching private files.

@edolstra
Copy link
Member

It would definitely be better if fetches are done by Nix rather than by builders, since then (one day) we can remove the ability of fixed-output derivations to fetch things from the network and thereby be arbitrarily irreproducible (NixOS/nix#2270).

@danbst
Copy link
Contributor

danbst commented Jan 15, 2019

I've added a trace to find out how many fetchurl calls are non-standard when building my system:

    820 trace: downloadToTemp, name, postFetch, recursiveHash
     71 trace: postFetch
     65 trace: urls
     51 trace: name, postFetch
     31 trace: name
      2 trace: curlOpts, downloadToTemp, name, postFetch, recursiveHash

Pretty much. Please review patch, maybe I've did it wrong:

diff --git a/pkgs/build-support/fetchurl/default.nix b/pkgs/build-support/fetchurl/default.nix
index 5f0c1384c79..d4b79ce918e 100644
--- a/pkgs/build-support/fetchurl/default.nix
+++ b/pkgs/build-support/fetchurl/default.nix
@@ -87,9 +87,17 @@ in

   # Passthru information, if any.
 , passthru ? {}
-}:
+}@args:

-assert sha512 != "" -> builtins.compareVersions "1.11" builtins.nixVersion <= 0;
+let allURLs = if args ? url then [url] else urls;
+    isMirror = lib.hasPrefix "mirror";
+    nonStandardArgs = (builtins.removeAttrs args ["sha256" "url" "meta" "sha512"])
+      // (if (lib.any isMirror allURLs) then { isMirror = true; } else {});
+    nonStandardArgsString = lib.concatStringsSep ", " (builtins.attrNames nonStandardArgs);
+    in
+
+assert (if nonStandardArgs != {} then (builtins.trace nonStandardArgsString) else lib.id)
+  sha512 != "" -> builtins.compareVersions "1.11" builtins.nixVersion <= 0;

 let
   urls_ =

EDIT: fix bug in script

@danbst
Copy link
Contributor

danbst commented Jan 15, 2019

alright, I've fixed a bug in patch above, here's new result:

$ nix-build --no-out-link ./nixos -A system 2>&1 | grep trace | sort | uniq -c | sort -k1nr
    817 trace: downloadToTemp, postFetch, recursiveHash
    628 trace: isMirror
    366 trace: NORMAL_URL
    122 trace: postFetch
     60 trace: urls
      5 trace: isMirror, urls
      3 trace: downloadToTemp, isMirror, postFetch, recursiveHash
      2 trace: curlOpts, downloadToTemp, postFetch, recursiveHash

Out of ~1800 fetchurl calls only 337 will be replaced with builtin fetchurl. So, I'd say, this PR is too much ahead of time.

I wonder also if it's possible to add netrc for fetchurl via overlay.

@bobvanderlinden
Copy link
Member Author

bobvanderlinden commented Jan 15, 2019

@danbst That's a really good idea! It's unfortunate so little fetches will use Nix this way. Your statistics do show what has the biggest impact to tackle to get Nix fetch dependencies. postFetch doesn't seem like a straightforward one to resolve, but mirror support could be a good first step.

Apart from those issues, I looked into getting netrc into fetchurl via overlay, but it all felt confusing and dirty to do it that way. I thought I ran into the solution that would force me to allow a sandbox to access my netrc file. As it includes really private information, that wasn't something I'd like to recommend to coworkers.

That said, I'm still a bit hesitant to just close this PR, as it does give some more motivation to implement (for instance) mirror support in Nix.

@FRidh FRidh modified the milestones: 19.03, 19.09 Mar 5, 2019
@bobvanderlinden
Copy link
Member Author

Closing this one because it probably will need more work on Nix itself because this PR is a viable option.

@FRidh
Copy link
Member

FRidh commented Jul 11, 2019

@danbst we need to be careful with interpreting those kind of results. fetchzip uses downloadToTemp, postFetch and recursiveHash, and is itself primarily used by fetchFromGitHub for which we could use builtins.fetchGit (although it really should do shallow clones by then). Other than that recursivehash is used primarily by fetchcrate and LaTeX, with the latter performing a lot of fetches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants