Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fetchSubmodules to builtins.fetchGit #3166

Merged
merged 17 commits into from Apr 7, 2020

Conversation

blitz
Copy link
Contributor

@blitz blitz commented Oct 25, 2019

This pull request adds the ability to fetch submodules with builtins.fetchGit. Use it like: builtins.fetchGit { fetchSubmodules = true; ... }

There are some downsides to this feature:

  • Submodules are not cached (unlike the root repo),
  • Full checkouts are created in a temporary directory.

These could be fixed, but require far more code and would require re-implementing some of the submodule cloning feature. This PR does not change any of the fetchGit behavior when submodules are not used (the default), so only people using this feature are paying these costs.

Other missing things:

  • Tests
  • Documentation

Note: This is my first contribution to Nix, so please expect stupid mistakes. :)

Fixes #2151

@edolstra @tfc

@blitz
Copy link
Contributor Author

blitz commented Oct 26, 2019

I've added documentation and a test.

@blitz blitz changed the title PoC: Add fetchSubmodules to builtins.fetchGit Add fetchSubmodules to builtins.fetchGit Oct 28, 2019
@nmattia
Copy link
Contributor

nmattia commented Nov 22, 2019

@edolstra WDYT? this would be super helpful!

@blitz
Copy link
Contributor Author

blitz commented Dec 3, 2019

@edolstra @tfc I've resolved merge conflicts.

@blitz
Copy link
Contributor Author

blitz commented Jan 8, 2020

@edolstra It would be really nice to find a solution here to get submodule support working with builtins.fetchGit, because I'm currently building workarounds again. :)

@bjornfor
Copy link
Contributor

Here is a patch with my proposed changes:

diff --git a/src/libexpr/local.mk b/src/libexpr/local.mk
index 26b9f14ba..e8c019365 100644
--- a/src/libexpr/local.mk
+++ b/src/libexpr/local.mk
@@ -8,7 +8,7 @@ libexpr_SOURCES := $(wildcard $(d)/*.cc) $(wildcard $(d)/primops/*.cc) $(d)/lexe
 
 libexpr_LIBS = libutil libstore libnixrust
 
-libexpr_LDFLAGS =
+libexpr_LDFLAGS = -lstdc++fs
 ifneq ($(OS), FreeBSD)
  libexpr_LDFLAGS += -ldl
 endif
diff --git a/src/libexpr/primops/fetchGit.cc b/src/libexpr/primops/fetchGit.cc
index b08e48404..c58d28bbc 100644
--- a/src/libexpr/primops/fetchGit.cc
+++ b/src/libexpr/primops/fetchGit.cc
@@ -6,6 +6,7 @@
 #include "hash.hh"
 #include "tarfile.hh"
 
+#include <filesystem>
 #include <sys/time.h>
 
 #include <regex>
@@ -182,9 +183,13 @@ GitInfo exportGit(ref<Store> store, const std::string & uri,
 
         runProgram("git", true, { "-C", tmpDir, "checkout", "--quiet", "FETCH_HEAD" });
         runProgram("git", true, { "-C", tmpDir, "remote", "add", "origin", uri });
-        runProgram("git", true, { "-C", tmpDir, "submodule", "--quiet", "update", "--init" });
+        runProgram("git", true, { "-C", tmpDir, "submodule", "--quiet", "update", "--init", "--recursive" });
 
-        deletePath(tmpDir + "/.git");
+        for (const auto& p : std::filesystem::recursive_directory_iterator(tmpDir)) {
+            if (p.path().filename() == ".git") {
+                std::filesystem::remove_all(p.path());
+            }
+        }
 
         gitInfo.submodules = true;
     } else {
diff --git a/tests/fetchGitSubmodules.sh b/tests/fetchGitSubmodules.sh
index c821d2caa..987bfd69e 100644
--- a/tests/fetchGitSubmodules.sh
+++ b/tests/fetchGitSubmodules.sh
@@ -47,4 +47,5 @@ pathWithSubmodules=$(nix eval --raw "(builtins.fetchGit { url = file://$rootRepo
 [[ ! -e $pathWithoutSubmodules/sub/content ]]
 [[ -e $pathWithSubmodules/sub/content ]]
 
-
+# No .git directory or submodule reference files must be left
+test "$(find "$pathWithSubmodules" -name .git)" = ""

I used std::filesystem for removing the .git paths. Nix already depends on C++17, but apparently one needs -lstdc++fs to get this library. The internet says clang needs a different flag, but the current flag worked in my testing (./shell.nix has a useClang flag which I enabled). So perhaps I was reading outdated info.

@nmattia
Copy link
Contributor

nmattia commented Jan 13, 2020

By the way, I had a chat with @edolstra . He's refactoring the fetcher, which will need to land before this. Sharing this to clarify, and ideally to save people some time in case some of the changes don't apply anymore!

@blitz
Copy link
Contributor Author

blitz commented Jan 13, 2020

@bjornfor Thanks for the review and fixes! As @nmattia pointed out this will likely conflict with upcoming changes, so I'd rather wait before spending more time here. Or do you think there is a chance of sneaking this in before the refactoring?

@nmattia @edolstra It would be nice if there is some issue to watch regarding the fetcher changes.

@bjornfor
Copy link
Contributor

I'm eagerly awaiting a status update here :-)

@bjornfor
Copy link
Contributor

I did some real life testing with this patch and found two issues.

  1. The result of builtins.fetchGit { fetchSubmodules = true; ... } changes from the first run to the next. The first run results in { outPath = ; ...; submodules = true; }, but the next calls result in { outPath = ; ...; submodules = false; }. I think the output submodules attribute would follow the input fetchSubmodules parameter exactly and not change on the 2nd (and following) runs.

  2. Using fetchSubmodules = true on repos that have no submodules is a fatal error:

nix --experimental-features nix-command eval '(builtins.fetchGit { url = "https://github.com/nixos/nix.git"; ref = "refs/heads/master"; fetchSubmodules = true; } )'
fatal: Refusing to fetch into current branch refs/heads/master of non-bare repository
fatal: The remote end hung up unexpectedly
error: program 'git' failed with exit code 128

Allowing fetchSubmodules = true on non-submodule using repos is a requirement for my use case, and is supported by git itself: git clone --recursive https://github.com/nixos/nix.git is no error even if nix.git uses no submodules.

@edolstra: Are you refactoring the fetcher(s)? I would really like for nix to have git submodule support and can probably spend a bit more time on this if there is interest. But no sign of interest kills my motivation.

@edolstra
Copy link
Member

Yes, the new fetcher infrastructure is in the flakes branch: https://github.com/NixOS/nix/tree/flakes/src/libstore/fetchers. It replaces fetchGit / fetchMercurial with a generic fetchTree builtin that uses the same attributes as flake inputs (e.g. fetchTree { type = "github"; owner = "NixOS"; repo = "nix"; rev = ...; }). So submodules would be an attribute supported by the git fetcher. And then flakes would automatically support Git repos with submodules as an input.

@blitz
Copy link
Contributor Author

blitz commented Feb 20, 2020

@edolstra @tfc @bjornfor I was wondering how to implement submodule support while keeping all the nice caching in place. I'm thinking about something along the lines of:

  1. Fetch a git repo
  2. Export it into the Nix store
  3. If submodule fetching is enabled:
  • enumerate all submodules, their URLs and commit hashes,
  • recursively use the normal git fetching logic to fetch them into the Nix store
  • join all submodules together using hard/symlinks

This would keep the normal caching of git repositories in place and submodules wouldn't be treated any differently from normal git checkouts.

The downside is that the implementation is somewhat complex compared to just using git clone --recursive and we are partially reimplementing the git submodule command.

To limit the complexity in Nix itself, most of this could be implemented as Nix expression, given that a builtin is available that returns a list of submodules and their properties.

Opinions?

@edolstra
Copy link
Member

I think we should ignore the caching issue for now. I still want to add a generic caching mechanism to the fetcher infrastructure that generalizes the Git-specific caching (in ~/.cache/nix/git-revs). So Nix would keep a mapping fetcher input attributes -> store path which would also work for git-with-submodules.

@bjornfor
Copy link
Contributor

Yes, the new fetcher infrastructure is in the flakes branch: https://github.com/NixOS/nix/tree/flakes/src/libstore/fetchers.

@blitz @edolstra: I'm confused about what the next action is. Ok, so flakes branch has new fetcher infrastructure, and this is where the future git fetcher with submodule support will live. But flakes is an experimental branch and it would be nice to have submodules support in standard Nix. (I don't know how long until flakes get stabilized and land in master.)

So, if this PR is refreshed (fixing the outstanding issues as commented in this thread), would it be merged?

@blitz
Copy link
Contributor Author

blitz commented Feb 26, 2020

@bjornfor I'll post an updated version soon. If I should rebuild this on top of flakes, this would be good to know.

@edolstra
Copy link
Member

@bjornfor Yes, I'll take care of porting it to the new fetcher infrastructure.

@blitz
Copy link
Contributor Author

blitz commented Mar 2, 2020

I've incorporated all fixes. I just have to investigate the test failures.

@blitz
Copy link
Contributor Author

blitz commented Mar 3, 2020

@edolstra @bjornfor @nmattia @tfc I'd appreciate a second look here. I think I have added tests for all issues that were pointed out and these tests are passing now.

Two tests are failing (tests/multiple-outputs.sh and `tests/nix-build.sh), but these are not regressions and are already failing on the master branch.

@blitz blitz requested a review from edolstra March 3, 2020 10:42
@bjornfor
Copy link
Contributor

bjornfor commented Mar 5, 2020

@blitz: Did you build it? I'm getting this error:

/nix/store/m8ih8gw7qnd2c7p0qgv3cn4b6abkbanb-binutils-2.31.1/bin/ld: /build/nix-2.4pre7279_7fee022/src/libexpr/primops/fetchGit.cc:188: undefined reference to `std::filesystem::__cxx11::recursive_directory_iterator::operator++()'

In my patch I added libexpr_LDFLAGS = -lstdc++fs, but I don't see that in your PR.

@blitz
Copy link
Contributor Author

blitz commented Mar 9, 2020

@bjornfor

@blitz: Did you build it? I'm getting this error:

/nix/store/m8ih8gw7qnd2c7p0qgv3cn4b6abkbanb-binutils-2.31.1/bin/ld: /build/nix-2.4pre7279_7fee022/src/libexpr/primops/fetchGit.cc:188: undefined reference to `std::filesystem::__cxx11::recursive_directory_iterator::operator++()'

In my patch I added libexpr_LDFLAGS = -lstdc++fs, but I don't see that in your PR.

My understanding was that -lstdc++fs is only required for the experimental filesystem support and is not required anymore since the filesystem support moved out of its experimental state. That may be wrong for some platforms. I'm using:

% nix-build release.nix -A build.x86_64-linux

blitz and others added 11 commits March 29, 2020 22:29
Some platforms seem to still require linking with stdc++fs to enable
STL std::filesystem support.
Before this change it would be false for all evaluations but the first.
Now it follows the input argument (as it should).
The .link file is used as a lock, so I think we should put the
"submodule" attribute in there since turning on submodules creates a new
.link file path.
Major bugfix for the submodules = true code path.

TODO: Add tests.
Due to fetchGit not checking if rev is an ancestor of ref (there is even
a FIXME comment about it in the code), the cache repo might not have the
ref even though it has the rev. This doesn't matter when submodule =
false, but the submodule = true code blows up because it tries to fetch
the (missing) ref from the cache repo.

Fix this in the simplest way possible: fetch all refs from the local
cache repo when submodules = true.

TODO: Add tests.
Using std::filesystem means also having to link with -lstdc++fs on
some platforms and it's hard to discover for what platforms this is
needed. As all the functionality is already implemented as utilities,
use those instead.
@blitz
Copy link
Contributor Author

blitz commented Mar 29, 2020

@edolstra @bjornfor

I've removed std::filesystem use in favor of using the filter support in addToStore. This seemed simpler than using readDirectory and deletePath.

@tfc
Copy link
Contributor

tfc commented Apr 2, 2020

@edolstra @bjornfor anything else that needs to be done? i am very interested in seeing this patch merged.

@edolstra edolstra self-assigned this Apr 7, 2020
@edolstra edolstra merged commit 55cefd4 into NixOS:master Apr 7, 2020
@edolstra
Copy link
Member

edolstra commented Apr 7, 2020

Merged, thanks!

@blitz blitz deleted the fetchgit-recursive branch April 7, 2020 13:53
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/why-doesnt-fetchfromgithub-support-submodules-with-private/6600/6

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/fetchgit-and-fetchgit-are-not-working-for-private-repos-with-submodules/6673/2

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-flakes-and-submodules/7904/2

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/using-crate2nix-to-create-a-cargo-alternative-or-wrapper-that-transparently-uses-nix-store-as-a-cache/12095/5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

builtins.fetchGit doesn't fetch submodules
7 participants