Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an optional hash parameter to builtins.fetchGit #3216

Closed
wants to merge 1 commit into from

Conversation

Ma27
Copy link
Member

@Ma27 Ma27 commented Nov 9, 2019

cc @globin @xbreak


This is particularly useful to ensure the purity e.g. of a repo which is
pinned to a git tag (tags can be force-pushed which makes builds silently irreproducible)
or when trying to evaluate a nix expression without internet connectivity (if the
cache-entry is expired, the eval will break as the git repo can't be re-fetched).

This change adds an optional hash parameter (with an SRI-hash[1]) to the builtin
fetchGit and checks if there's a content-addressable path which matches the hash
(the hash can be determined using nix to-sri --type sha256 $(nix-prefetch-git ...)).

If no such path exists, it will be attempted to fetch the Git repository in
order to compare the hashes. Please note that caching is still used here, so
if the repo is already fetched and the only hash in the expression changes,
the evaluation will fail pretty fast.

[1] https://www.w3.org/TR/SRI/

Copy link
Member

@FRidh FRidh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also started to have my doubts about refs like in #2582. For example

builtins.fetchGit {
  url = https://github.com/NixOS/nixpkgs.git;
  ref = "refs/pull/1024/head;
}

While I want to be able to fetch them, they are mutable. Maybe we should discourage the use of them?

doc/manual/expressions/builtins.xml Outdated Show resolved Hide resolved
doc/manual/expressions/builtins.xml Outdated Show resolved Hide resolved
@Ma27
Copy link
Member Author

Ma27 commented Nov 9, 2019

While I want to be able to fetch them, they are mutable

The ref argument is needed to e.g. fetch git tags (checking out a git tag as a rev didn't work here, as far as I understand the fetchgit code from nixpkgs, ref/rev is mixed together, so rev = "v1.2.3" works there, but not with builtins.fetchGit).

@Ericson2314
Copy link
Member

CC @matthewbauer

@matthewbauer
Copy link
Member

matthewbauer commented Nov 13, 2019

While I want to be able to fetch them, they are mutable

The ref argument is needed to e.g. fetch git tags (checking out a git tag as a rev didn't work here, as far as I understand the fetchgit code from nixpkgs, ref/rev is mixed together, so rev = "v1.2.3" works there, but not with builtins.fetchGit).

I would consider fetchGit with "rev" to be fixed-output and fetchGit with "ref" to be not fixed-output. In Git, branches and tags are considered references, and are not content-addressable (although tags require force push to change). Perhaps another arg could be added like tag or namedRev for only the tag subset of refs.

@Ma27
Copy link
Member Author

Ma27 commented Nov 13, 2019

@globin and I actually discussed about this last weekend :)

The reason why ref is currently used is because tags can be fetched with it and we use tags (although those are not 100% content-addressable) pretty often in nixpkgs.

Perhaps another arg could be added like tag or namedRev for only the tag subset of refs.

That's a good idea actually! Will implement this probably next weekend :)

@edolstra
Copy link
Member

Could you replace sha256 with generic hash argument containing an SRI hash, as @FRidh suggests? We've been trying to move away from sha1/sha256/... arguments (see e.g. <nix/fetchurl.nix>).

@edolstra
Copy link
Member

I would consider fetchGit with "rev" to be fixed-output and fetchGit with "ref" to be not fixed-output.

To be clear, neither is fixed-output because they're not derivations. But they both produce content-addressable store paths. The difference is that with rev you'll always get the same CA store path and with ref you won't necessarily. But in neither case can you compute the store path from the rev or ref alone.

Morally speaking, fetching using a ref and a hash is not worse than using rev.

@matthewbauer
Copy link
Member

matthewbauer commented Nov 14, 2019

Morally speaking, fetching using a ref and a hash is not worse than using rev.

Won't fetching using a ref and a hash inevitably lead to hash mismatch errors, though? Perhaps to Nix it's all the same, but not to Git. The rev commit hash is content-addressable: derived from its parent commit(s), working tree, and commit metadata. Nix could be taught one day to recognize it as such, making the hash argument here optional. On the other hand we could never do the same for ref as it is only a pointer to a commit.

@globin
Copy link
Member

globin commented Nov 14, 2019

Yeah but we often want to pin on a tag and guard that with the hash to notice if people force push a new version and still have a pure evaluation.

@edolstra
Copy link
Member

Nix could be taught one day to recognize it as such, making the hash argument here optional.

That would be tricky because the rev is a hash over the entire history rather than the current tree. So unless we copy the entire history to the store, Nix won't be able to verify the rev.

BTW with flakes it's fine to use just a tag/branch because it will get locked to a specific rev in the lock file.

@Ma27 Ma27 force-pushed the sha256-support-for-builtins.fetchGit branch from 0872bb5 to 1b1b6ab Compare November 14, 2019 16:27
@Ma27 Ma27 changed the title Add an optional sha256 parameter to builtins.fetchGit Add an optional hash parameter to builtins.fetchGit Nov 14, 2019
@Ma27
Copy link
Member Author

Ma27 commented Dec 6, 2019

@edolstra may I ask if there's anything else to add/fix? :)

@Ericson2314
Copy link
Member

@edolstra Matt and I were also talking about using tree hashes, which Nix could verify without storing history.

This is particularly useful to ensure the purity e.g. of a repo which is
pinned to a git tag (tags can be force-pushed which makes builds silently irreproducible)
or when trying to evaluate a nix expression without internet connectivity (if the
cache-entry is expired, the eval will break as the git repo can't be re-fetched).

This change adds an optional `hash` parameter (with an SRI-hash[1]) to the builtin
`fetchGit` and checks if there's a content-addressable path which matches the hash
(the hash can be determined using `nix to-sri --type sha256 $(nix-prefetch-git ...)`).

If no such path exists, it will be attempted to fetch the Git repository in
order to compare the hashes. Please note that caching is still used here, so
if the repo is already fetched and the only hash in the expression changes,
the evaluation will fail pretty fast.

[1] https://www.w3.org/TR/SRI/
@Ma27 Ma27 force-pushed the sha256-support-for-builtins.fetchGit branch from 1b1b6ab to 24e14cc Compare December 21, 2019 01:40
@Ma27
Copy link
Member Author

Ma27 commented Dec 21, 2019

Just updated the patch to work with the refactorings on the store API on master :)

@Ma27
Copy link
Member Author

Ma27 commented Jan 17, 2020

@edolstra @Ericson2314 is there anything I can address here atm? :)

@edolstra
Copy link
Member

The new fetcher infrastructure adds a generic narHash attribute to all input types, e.g.

fetchTree { type = "tarball"; url = file://foo/bar.tar.gz; narHash = "sha256-xdKv2pq/IiwLSnBBJXW8hNowI4MrdZfW+SYqDQs7Tzc="; }

I think that's better than adding ad hoc support to specific fetchers. Also, the name hash is pretty ambiguous since it could be confused with a commit hash (i.e. the rev attribute).

@jtojnar
Copy link
Contributor

jtojnar commented Feb 19, 2020

Related: NixOS/nixpkgs#79987

@Ma27
Copy link
Member Author

Ma27 commented Feb 19, 2020

@jtojnar did you link this just for documentation purposes? This patch only supports SRI hashes as well :)

I'll take a closer look at the new fetcher code later to confirm that this can be closed now.

@Ma27
Copy link
Member Author

Ma27 commented Feb 20, 2020

@edolstra so I took a closer look at the new fetcher-infrastructure today. I currently have two questions about this:

  • Would it be possible to "backport" this to nix-master? I'd really love to use this feature on stable nix :)
  • The git-fetcher doesn't seem to support a narHash-attribute atm. I guess that in favor of this PR I could file a patch that adds narHash-support to fetchTree { type = "git"; }, right?

@Ma27
Copy link
Member Author

Ma27 commented Apr 29, 2020

Closing now. The fetcher-API has been backported to nix-master quite recently. I already have a patch locally which adds support for the narHash-functionality (same hash-values as in my hash-param).... will file a PR tonight or tomorrow. Thanks for all the feedback!

@Ma27 Ma27 closed this Apr 29, 2020
Ma27 added a commit to Ma27/nix that referenced this pull request Apr 29, 2020
Replacement for NixOS#3216 which became obsolete after the fetchers-API was
introduced which provides a `narHash`-argument for each fetcher
which is a SRI-hash of a content-addressable path. However, the
`fetchGit`-primop required some changes to work with it.

To summarize, the following things changed:

* The hash-mismatch error in `libfetchers` now returns `102` as exit
  code which is the default for those errors.

* The `git`-fetcher computes an SRI-hash of the result for the
  hash-validation.

* It's possible to specify an SRI-hash using the `narHash`-attribute in
  `builtins.fetchGit`. This is only allowed if a revision is specified.
@Ma27 Ma27 deleted the sha256-support-for-builtins.fetchGit branch April 29, 2020 23:12
Ma27 added a commit to Ma27/nix that referenced this pull request May 8, 2020
The original idea was to implement a git-fetcher in Nix's core that
supports content hashes[1]. In NixOS#3549[2] it has been suggested to
actually use `fetchTree` for this since it's a fairly generic wrapper
over the new fetcher-API[3] and already supports content-hashes.

This patch implements a new git-fetcher based on `fetchTree` by
incorporating the following changes:

* Removed the original `fetchGit`-implementation and replaced it with an
  alias on the `fetchTree` implementation.

* Ensured that the `git`-fetcher from `libfetchers` always computes a
  content-hash and returns an "empty" revision on dirty trees (the
  latter one is needed to retain backwards-compatibility).

* The hash-mismatch error in the fetcher-API exits with code 102 as it
  usually happens whenever a hash-mismatch is detected by Nix.

* Removed the `flakes`-feature-flag: I didn't see a reason why this API
  is so tightly coupled to the flakes-API and at least `fetchGit` should
  remain usable without any feature-flags.

* It's only possible to specify a `narHash` for a `git`-tree if either a
  `ref` or a `rev` is given[4].

* It's now possible to specify an URL without a protocol. If it's missing,
  `file://` is automatically added as it was the case in the original
  `fetchGit`-implementation.

[1] NixOS#3216
[2] NixOS#3549 (comment)
[3] NixOS#3459
[4] NixOS#3216 (comment)
Ma27 added a commit to Ma27/nix that referenced this pull request Jun 4, 2020
The original idea was to implement a git-fetcher in Nix's core that
supports content hashes[1]. In NixOS#3549[2] it has been suggested to
actually use `fetchTree` for this since it's a fairly generic wrapper
over the new fetcher-API[3] and already supports content-hashes.

This patch implements a new git-fetcher based on `fetchTree` by
incorporating the following changes:

* Removed the original `fetchGit`-implementation and replaced it with an
  alias on the `fetchTree` implementation.

* Ensured that the `git`-fetcher from `libfetchers` always computes a
  content-hash and returns an "empty" revision on dirty trees (the
  latter one is needed to retain backwards-compatibility).

* The hash-mismatch error in the fetcher-API exits with code 102 as it
  usually happens whenever a hash-mismatch is detected by Nix.

* Removed the `flakes`-feature-flag: I didn't see a reason why this API
  is so tightly coupled to the flakes-API and at least `fetchGit` should
  remain usable without any feature-flags.

* It's only possible to specify a `narHash` for a `git`-tree if either a
  `ref` or a `rev` is given[4].

* It's now possible to specify an URL without a protocol. If it's missing,
  `file://` is automatically added as it was the case in the original
  `fetchGit`-implementation.

[1] NixOS#3216
[2] NixOS#3549 (comment)
[3] NixOS#3459
[4] NixOS#3216 (comment)
Ma27 added a commit to Ma27/nix that referenced this pull request Jun 19, 2020
The original idea was to implement a git-fetcher in Nix's core that
supports content hashes[1]. In NixOS#3549[2] it has been suggested to
actually use `fetchTree` for this since it's a fairly generic wrapper
over the new fetcher-API[3] and already supports content-hashes.

This patch implements a new git-fetcher based on `fetchTree` by
incorporating the following changes:

* Removed the original `fetchGit`-implementation and replaced it with an
  alias on the `fetchTree` implementation.

* Ensured that the `git`-fetcher from `libfetchers` always computes a
  content-hash and returns an "empty" revision on dirty trees (the
  latter one is needed to retain backwards-compatibility).

* The hash-mismatch error in the fetcher-API exits with code 102 as it
  usually happens whenever a hash-mismatch is detected by Nix.

* Removed the `flakes`-feature-flag: I didn't see a reason why this API
  is so tightly coupled to the flakes-API and at least `fetchGit` should
  remain usable without any feature-flags.

* It's only possible to specify a `narHash` for a `git`-tree if either a
  `ref` or a `rev` is given[4].

* It's now possible to specify an URL without a protocol. If it's missing,
  `file://` is automatically added as it was the case in the original
  `fetchGit`-implementation.

[1] NixOS#3216
[2] NixOS#3549 (comment)
[3] NixOS#3459
[4] NixOS#3216 (comment)
@Ericson2314
Copy link
Member

@edolstra Matt and I were also talking about using tree hashes, which Nix could verify without storing history.

For anyone else reading this, that's now done in #3635

Ma27 added a commit to Ma27/nix that referenced this pull request Jul 17, 2020
The original idea was to implement a git-fetcher in Nix's core that
supports content hashes[1]. In NixOS#3549[2] it has been suggested to
actually use `fetchTree` for this since it's a fairly generic wrapper
over the new fetcher-API[3] and already supports content-hashes.

This patch implements a new git-fetcher based on `fetchTree` by
incorporating the following changes:

* Removed the original `fetchGit`-implementation and replaced it with an
  alias on the `fetchTree` implementation.

* Ensured that the `git`-fetcher from `libfetchers` always computes a
  content-hash and returns an "empty" revision on dirty trees (the
  latter one is needed to retain backwards-compatibility).

* The hash-mismatch error in the fetcher-API exits with code 102 as it
  usually happens whenever a hash-mismatch is detected by Nix.

* Removed the `flakes`-feature-flag: I didn't see a reason why this API
  is so tightly coupled to the flakes-API and at least `fetchGit` should
  remain usable without any feature-flags.

* It's only possible to specify a `narHash` for a `git`-tree if either a
  `ref` or a `rev` is given[4].

* It's now possible to specify an URL without a protocol. If it's missing,
  `file://` is automatically added as it was the case in the original
  `fetchGit`-implementation.

[1] NixOS#3216
[2] NixOS#3549 (comment)
[3] NixOS#3459
[4] NixOS#3216 (comment)
Ma27 added a commit to Ma27/nix that referenced this pull request Jul 27, 2020
The original idea was to implement a git-fetcher in Nix's core that
supports content hashes[1]. In NixOS#3549[2] it has been suggested to
actually use `fetchTree` for this since it's a fairly generic wrapper
over the new fetcher-API[3] and already supports content-hashes.

This patch implements a new git-fetcher based on `fetchTree` by
incorporating the following changes:

* Removed the original `fetchGit`-implementation and replaced it with an
  alias on the `fetchTree` implementation.

* Ensured that the `git`-fetcher from `libfetchers` always computes a
  content-hash and returns an "empty" revision on dirty trees (the
  latter one is needed to retain backwards-compatibility).

* The hash-mismatch error in the fetcher-API exits with code 102 as it
  usually happens whenever a hash-mismatch is detected by Nix.

* Removed the `flakes`-feature-flag: I didn't see a reason why this API
  is so tightly coupled to the flakes-API and at least `fetchGit` should
  remain usable without any feature-flags.

* It's only possible to specify a `narHash` for a `git`-tree if either a
  `ref` or a `rev` is given[4].

* It's now possible to specify an URL without a protocol. If it's missing,
  `file://` is automatically added as it was the case in the original
  `fetchGit`-implementation.

[1] NixOS#3216
[2] NixOS#3549 (comment)
[3] NixOS#3459
[4] NixOS#3216 (comment)
Ma27 added a commit to Ma27/nix that referenced this pull request Jul 27, 2020
The original idea was to implement a git-fetcher in Nix's core that
supports content hashes[1]. In NixOS#3549[2] it has been suggested to
actually use `fetchTree` for this since it's a fairly generic wrapper
over the new fetcher-API[3] and already supports content-hashes.

This patch implements a new git-fetcher based on `fetchTree` by
incorporating the following changes:

* Removed the original `fetchGit`-implementation and replaced it with an
  alias on the `fetchTree` implementation.

* Ensured that the `git`-fetcher from `libfetchers` always computes a
  content-hash and returns an "empty" revision on dirty trees (the
  latter one is needed to retain backwards-compatibility).

* The hash-mismatch error in the fetcher-API exits with code 102 as it
  usually happens whenever a hash-mismatch is detected by Nix.

* Removed the `flakes`-feature-flag: I didn't see a reason why this API
  is so tightly coupled to the flakes-API and at least `fetchGit` should
  remain usable without any feature-flags.

* It's only possible to specify a `narHash` for a `git`-tree if either a
  `ref` or a `rev` is given[4].

* It's now possible to specify an URL without a protocol. If it's missing,
  `file://` is automatically added as it was the case in the original
  `fetchGit`-implementation.

[1] NixOS#3216
[2] NixOS#3549 (comment)
[3] NixOS#3459
[4] NixOS#3216 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants