Experimenting with submodules in nixpkgs #37527
1 Description
=============

This is an exploratory pull request to examine one possible future for a project like nixpkgs.

These changes add support to nixpkgs for using git submodules in the nixpkgs git repository to find sources for derivations. This uses `builtins.fetchgit' to remove the need to specify a sha256, while still maintaining complete reproducibility.

To demonstrate these changes, two submodules are added, and the corresponding two packages are updated to use those two submodules for their sources.

To keep things simple, most of the git interaction is currently implemented in a simple Python script, not in Nix expressions; thus you must run `./prepare_sources.py > sources.json' in the root of the Nixpkgs repository whenever you init or deinit a submodule. This limitation can be removed with additional work by porting that Python code to Nix expressions.

2 Motivation and purpose
========================

First off, the most pedestrian benefit. Representing the source version as a submodule makes it easier to update the version of sources used in Nixpkgs. Just `git pull' in the submodule, `git add' in Nixpkgs, and commit, and you're done. It's equally easy to update to a new tag; just `git fetch && git checkout' instead of `git pull'. In either case, there's no need to edit any Nix expression.

Beyond that, this simple change enables an extremely powerful new workflow for open source software developers.

If I wish to make a change to some software, I just initialize the corresponding submodule in nixpkgs, and start hacking:

,----
| git submodule init sources/tools/system/supervise
| cd sources/tools/system/supervise
| vi foo.c # hack hack hack
`----

Any changes in the now-initialized submodule will be automatically picked up when I next build:

,----
| nix-build ~/my-nixpkgs -A supervise
`----

If I have committer privileges to the project, pushing my changes is then as easy as `git push' in the submodule. If I don't have commit privileges, I can just go through the project's normal contribution workflow; with GitHub, that would be as easy as `hub fork && hub pull-request'.

In either case, to add my change to nixpkgs, all I have to do is `git add' the now-changed submodule, commit, and push:

,----
| git add sources/tools/system/supervise
| git commit -m "supervise: update from blah to blah"
| git push
`----

This is already a huge win on its own. But it's what happens when I'm working on multiple pieces of software at once that really makes this transformative.

I can check out any number of different projects with arbitrary dependency relationships, and it's as easy as `nix-build -A somepkg' to automatically rebuild the tree with all my changes.

If, for example, I'm working on a project written in Python packaged in nixpkgs, and I see an issue in CPython that I should fix, I just open up the CPython submodule, fix the issue, and my project immediately benefits. If I didn't quite fix the problem, I can easily keep iterating, tweaking my own project and CPython simultaneously.

It's just as easy to distribute in-progress changes to multiple projects. I commit my in-progress changes and push them to my nixpkgs fork. Then if someone clones my nixpkgs fork, they can immediately start working on the same changed codebases with the same changes. (This can be made incredibly simple with some tooling that uses techniques such as on-demand creation of an "[omega repo]".)

I suspect there are numerous other advantages as well which are not yet obvious, and that making common operations so much cheaper may unlock a radically different way of working on open source.

[omega repo] https://github.com/twosigma/git-meta/wiki/The-Omega-Repo

3 Conclusion
============

I think this would be a truly transformative way to organize software development, and it would be a major incentive for using Nix as the backbone of a software project. In a certain way, this would position Nixpkgs as an "open source monorepo", a place where cross-project integration work could be done with ease, without any of the scaling issues of traditional monorepos.

Of course, tying Nixpkgs so deeply to Git may be undesirable, though it wouldn't prevent us from doing anything we currently do. There are some practical downsides as well; git submodules can be tricky to work with, though there are [projects] attempting to make them easier to use. `builtins.fetchgit' currently has some scaling problems with its git cache, which may make it difficult to do something like this before the issues are fixed.

Nevertheless, for projects other than Nixpkgs, such as separate Nixpkgs overlays for a few related packages, I think this kind of organization makes a lot of sense. It would be nice to see some projects openly experiment with a submodule-based Nixpkgs overlay.

What do you think?

[projects] https://github.com/twosigma/git-meta
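
The description refers to `./prepare_sources.py > sources.json', but the script itself is not shown in this thread. As a rough illustration only, the Python sketch below shows one way such a step could work: it parses the output of `git submodule status' and records, for each submodule path, the pinned commit and whether the submodule is currently initialized. The function names and the JSON layout are assumptions made for this example, not the format used by the pull request.

```python
import json
import subprocess


def parse_submodule_status(text):
    """Parse the output of `git submodule status` into a dict keyed by path.

    Hypothetical helper: the layout of the returned dict is an assumption.
    """
    sources = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # The first column is a status flag:
        #   '-' = submodule not initialized
        #   ' ' = initialized and matching the recorded commit
        #   '+' = checked-out commit differs from the recorded one
        #   'U' = merge conflicts
        flag, rest = line[0], line[1:]
        rev, _, path = rest.partition(" ")
        # Initialized submodules carry a trailing " (describe-output)".
        # Note: a path that itself contains " (" would be cut short here.
        if path.endswith(")") and " (" in path:
            path = path.rpartition(" (")[0]
        sources[path] = {"rev": rev, "initialized": flag != "-"}
    return sources


def collect_sources(repo_root="."):
    """Run `git submodule status` in repo_root and return JSON text."""
    status = subprocess.run(
        ["git", "submodule", "status"],
        cwd=repo_root,
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.dumps(parse_submodule_status(status), indent=2, sort_keys=True)
```

Printing the result of `collect_sources()' from the root of a checkout would produce a file of this shape. A Nix expression could then, presumably, use the `initialized' field to choose between the local working tree and the pinned commit, though how the pull request actually makes that choice is not stated here.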
See previous commit message for details.
Thank you for looking into this. There are pros and cons to using submodules.
The advantages you've listed are nice ones. If we moved not just the source but also the expression into a submodule, we would get the added benefit of being able to set up different permissions, although it also has a clear disadvantage: not being able to see all expressions directly. Simply initializing all submodules is not an option, as it takes too long.
Yes, I think that with regard to putting package expressions in the "same place" as package sources, submodules-in-nixpkgs has all the same issues as tarballs. Getting the package expression out of the submodule will have the same advantages and the same disadvantages as getting a package expression out of a tarball.

I think submodules might reduce the need to move package expressions to the "same place" as package sources, though. If the package source is a submodule of the nixpkgs repo, the package expressions and package source are already conceptually a lot "closer together", and maybe that obsoletes some of the reasons for putting expressions next to sources. (Sorry, super vague I know :))

On the other hand, Nix expressions in submodules could be really interesting for another use case: an overlay repo (or even an individual project) could put Nixpkgs in a submodule, rather than doing pinning by other means. Or to reverse it: maybe you could modularize Nixpkgs into multiple overlays, pulled in as submodules, which themselves have submodules pointing to the source code for their packages. Of course, then you'd have nested submodules, which sounds like a nightmare, but maybe with sufficient tooling it could actually be very cool.
While it was an interesting experiment, it won't go in, so I am closing this.