New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
r-modules: add GitHub only r packages #93883
base: master
Are you sure you want to change the base?
Conversation
Having |
The CRAN bump should probably be done separately. |
Since Github is not even minimally reviewed unlike CRAN/bioc, the script to generate the package set should probably also take commits and hashes as inputs as well as owner and repo for at least security reasons. |
Is there anything special about GitHub here or should this be generalized to arbitrary source repos? |
Happy to split that out, it may resolve the current breakage with I think I agree regarding freezing commit and sha256 for the github packages (instead of refreshing each time DESCRIPTION has a version bump). This logic could be separated to another script. Alternatively, since As for generalizing to other sources of git packages, some of the logic is github specific. For example downloading the raw DESCRIPTION file without cloning the repo is likely github (possibly gitlab) exclusive. Addendum: This doesn't stop us from generalizing, since we're running |
b68a319
to
784a42c
Compare
I've added commandline handling, fixed the code so old packages are maintained if not passed on the commandline, modified the code to use the parallel map for the git fetches and parsing. And squashed the history. @peti what are your thoughts on adding this feature? |
32c515d
to
072af71
Compare
- adds a (currently two package github-packages.nix) - add github case to generate-R-packages.R - accept repo specs of the form user/repo@commit on the commandline - preserves old versions unchanged if not passed on the commandline - use the multicore map to handle the repo fetching/parsing - use nix-prefetch-git to fetch sha256 from repo - reads DESCRIPTION directly from github (TODO use prefetched repo)
072af71
to
9406644
Compare
Thanks for this! We also have a number of proprietary R packages in the internal git repository. Can the PR be extended to support that? |
@alexvorobiev It could, but I think we're basically waiting for some signal from @peti that this will be merged at some point or what sort of modifications to make. I think we need to split the CRAN package regeneration into a separate PR at least but other than that this should be ready to go. |
@bcdarwin Since the R packages in nixpkgs are updated so infrequently I am working on an overlay that uses the copies of the files from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I think it is fine except that some documentation needs to be updated.
# Templates for generating Bioconductor and CRAN packages | ||
# from the name, version, sha256, and optional per-package arguments above | ||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that github packages are supported, imho this comment and /doc/languages-frameworks/r.section.md
should be updated appropriately.
# Execute the following command to update the file. | ||
# | ||
# Rscript generate-r-packages.R github >new && mv new github-packages.nix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This instruction is incorrect for github so misleading for users.
Are there any updates to this PR or obstacles? In general I think it would be very useful to have mechanisms to
without modifying https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/r-modules/default.nix or https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/r-modules/generate-r-packages.R. I have an internal mirror of CRAN and a number of proprietary packages, so every time there are updates to that file (I copied it to an overlay) I need to merge my changes to the new version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have discussed my thoughts about this issue in great detail at https://www.twitch.tv/videos/878608118?t=1h5m36s.
The gist of if it is that IMHO the complexity of generate-r-packages.R
is getting out of hand and that I feel less and less able to maintain that code. I am a moderately competent user of R, but I am by no means a developer. I have never written an R package myself. I came up with some simple and straightforward solution to generate our package set a few years ago that worked, but my impression is that this code should really be taken over by someone more knowledgeable than myself.
So, I am perfectly happy to see these changes merged, but I would prefer someone else to take over the responsibility of maintaining of the R package set in that case because I don't want to find myself in a situation where that generate-r-packages.R
script no longer works and I have to feel like I'm responsible for fixing it.
If anyone is up to the task, please let me know and I'll make the necessary changes in .github/CODEOWNERS
to make sure that all pull requests go through you. Anyone?
Hi @peti, I have discussed this with my colleague @cfhammill and we're willing to become co-maintainers. Realistically we can't promise to refactor this code in the near term but we'll certainly try to stop it from growing further out of control. (Of course, this PR shouldn't be merged yet due to unresolved issues.) |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/automated-package-updates/12908/9 |
@peti @bcdarwin @cfhammill I'd also be willing to be a co-maintainer. I've reasonably frequently updated this package set and am a frequent user of R and it's packages in Nix. As for this pull request, I also feel it's a bit too github specific and that it should be generalised, especially since the repository is cloned anyway to compute a hash so we may as well read the DESCRIPTION file from there. |
Hi @peti -- any word on this? Three people in this thread have responded to your call to volunteer to co-maintain the R packages set. Admittedly no-one has generalized this PR yet but maybe out of a sense that merging of changes to the R infrastructure has somewhat stalled, r-packages is not being regularly evaluated on Hydra, etc. Thanks!! cc @cfhammill @jbedo |
@@ -36,11 +36,31 @@ let | |||
meta.platforms = R.meta.platforms; | |||
meta.hydraPlatforms = hydraPlatforms; | |||
meta.broken = broken; | |||
}); | |||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}); | |
}); |
@@ -3,7 +3,7 @@ | |||
{ R, pkgs, overrides }: | |||
|
|||
let | |||
inherit (pkgs) cacert fetchurl stdenv lib; | |||
inherit (pkgs) cacert fetchurl fetchgit stdenv lib; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use fetchFromGitHub?
lgpr = derive2 { name="lgpr"; version="0.33.2"; sha256="14i1vf9119cypjnacy43flfzx3lsjhax5zg33h4kjsyjbfl68ngi"; depends=[MASS Rcpp bayesplot ggplot2 ggpubr rstan rstantools]; url = "https://github.com/jtimonen/lgpr"; rev = "5a25fa8a1b21a34f915461802e21f592184d6e67"; }; | ||
posterior = derive2 { name="posterior"; version="0.1.0"; sha256="08psb9cknkva4awbrz38pdr7ssic0za31hji4i90vjb0vdnjn4k0"; depends=[abind checkmate rlang tibble]; url = "https://github.com/stan-dev/posterior"; rev = "4794130b548511c9861fc0085ba5a677afeba6fe"; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgpr = derive2 { name="lgpr"; version="0.33.2"; sha256="14i1vf9119cypjnacy43flfzx3lsjhax5zg33h4kjsyjbfl68ngi"; depends=[MASS Rcpp bayesplot ggplot2 ggpubr rstan rstantools]; url = "https://github.com/jtimonen/lgpr"; rev = "5a25fa8a1b21a34f915461802e21f592184d6e67"; }; | |
posterior = derive2 { name="posterior"; version="0.1.0"; sha256="08psb9cknkva4awbrz38pdr7ssic0za31hji4i90vjb0vdnjn4k0"; depends=[abind checkmate rlang tibble]; url = "https://github.com/stan-dev/posterior"; rev = "4794130b548511c9861fc0085ba5a677afeba6fe"; }; | |
lgpr = derive2 { name = "lgpr"; version = "0.33.2"; sha256 = "14i1vf9119cypjnacy43flfzx3lsjhax5zg33h4kjsyjbfl68ngi"; depends = [ MASS Rcpp bayesplot ggplot2 ggpubr rstan rstantools ]; url = "https://github.com/jtimonen/lgpr"; rev = "5a25fa8a1b21a34f915461802e21f592184d6e67"; }; | |
posterior = derive2 { name = "posterior"; version = "0.1.0"; sha256 = "08psb9cknkva4awbrz38pdr7ssic0za31hji4i90vjb0vdnjn4k0"; depends = [ abind checkmate rlang tibble ]; url = "https://github.com/stan-dev/posterior"; rev = "4794130b548511c9861fc0085ba5a677afeba6fe"; }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whitespace changes need to be fixed in generate_R_packages.nix
I want to know if I understand this PR. Does the workflow for github packages look like this:
So users wanting to update the github package set do so on a per package basis, with no bulk update of github packages for security reasons? I frequently use overlays to bring in a package from Github, sometimes specifically the development version of a CRAN package, which needs to override the CRAN version. How would I do that? Similar to @alexvorobiev, I would like to see functions like environment.systemPackages = [
pkgs.rstudioWrapper.override {
packages = [
(pkgs.buildRpackage { name="lgpr"; version="0.33.2"; sha256="14i1vf9119cypjnacy43flfzx3lsjhax5zg33h4kjsyjbfl68ngi"; depends=[MASS Rcpp bayesplot ggplot2 ggpubr rstan rstantools]; url = "https://github.com/jtimonen/lgpr"; rev = "5a25fa8a1b21a34f915461802e21f592184d6e67"; }; };)
];
]; |
Yeah, this is great! Is there anything you guys need from me to get started maintaining the package set? What is blocking you?
Hydra frequently evaluates https://hydra.nixos.org/jobset/nixpkgs/r-updates. One just needs to push the to-be-evaluated code to the |
I think you just need to add some of them to CODEOWNERS for R. It looks like you could add all of Line 98 in 2689bd2
Thanks, I didn't know that it needed to go to a specific branch. |
I added @jbedo and @bcdarwin in 430a043. I didn't add @cfhammill because that user name is not listed in |
You need write access to the repo to get notifications via CODEOWNERS. |
I am starting a hydra job for r-updates but it seems that Hydra's config causes it to check out @peti's |
Haskell's update branch is off NixOS/nixpkgs, so maybe we should be the same? |
I think it'd be great to have some support for github based r packages, but I'm now not sure they should be part of rPackages. Is there any reason to have them part of the rPackages tree? I'm concerned about conflicts and how to resolve them. Specifically, it's not unusual to have a github hosted R package get listed in CRAN/BIOC, at which point there is a conflict to resolve as it will show up as part of the CRAN/BIOC package set automatic update. Another option to consider is to just manually manage these github packages outside of the automatically generated rPackages tree, like any other package in nixpkgs. This would allow us to leverage tools like One final though is that many use cases might be covered by an r2nix tool, or with a recursive nix expression that parses |
@jbedo I see where you're coming from. It was so long ago now, but I think the order preference was CRAN, BioC, then git for the case of cross-listed packages. This issue would already be occurring with many packages appearing both on CRAN and in BioC. I think it is probably preferable to append the git packages to the rPackages tree, perhaps with a e.g. |
It's part of BioC naming policy to avoid naming collisions with CRAN so this should be very infrequent (from https://www.bioconductor.org/developers/package-submission/):
Currently there is one package (interacCircos) that is both in CRAN and BioC, and it probably should be reported upstream to resolve the issue.
Even if it's hundreds I don't think it will be an issue, after all most of nixpkgs is handled this way. I still feel like generate-r-packages is the wrong machinery for updating github hosted packages and that other existing tooling ( I also agree with you and think it's useful to have them in grouped in a subtree not top-level, maybe in another top-level tree called |
Ok all sounds good to me. I will take a stab at redoing this PR, adding a top level |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/is-there-a-way-to-add-r-packages-straight-from-github/17871/2 |
@cfhammill - do you have a branch started for adding a top-level rPackagesExtra? I'm interested in picking this up again. |
@nviets I haven't been working on it, if you wanted to start a new PR with the feature I'd be happy to review |
Thanks for confirming. I'll try to take a look soon. |
Motivation for this change
Frequently for an analysis a package is required that isn't released on CRAN or bioconductor, but has a usable version on github. This PR adds the ability to integrate these packages with the
rPackages
set and adds two example packages https://github.com/jtimonen/lgpr and https://github.com/stan-dev/posterior.Things done
In order to do this I:
github
github
is passed, a fixed list of github packages are traversed, usingnix-prefetch-git
to get the commit hash and sha256 for the current commit at HEAD. (This list is currently kept as an R vector in generate-R-packages.R, but could easily be made external as a text file).deriveGithub
that constructs the homepage as the repo url and usesfetchgit
to download the sources.nix-shell -p rPackages.posterior
works,nix-shell -p rPackages.lgpr
fails to build due to failure to buildrPackages.StanHeaders
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)