Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bioconductor packages: 3.8 -> 3.10, update R packages, other fixes #85522

Closed
wants to merge 1 commit into from

Conversation

timsears
Copy link

Motivation for this change

Updates cran and bioconductor packages.
Adds bioconductor workflows
Add sample shell environment
Fixes hundreds of existing packages
Marks dozens broken packages as broken, adding hints for next steps to fix them
Fixes many R packages on darwin.
Adds selected packages to the generic builder so that many future R packages will be able to build without customization in nixpkgs.
Adds scripts to help generate lists of selected package names to ease testing.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@peti
Copy link
Member

peti commented Apr 19, 2020

Why do you make a new PR instead of just updating the old one?

@timsears
Copy link
Author

I ran into trouble updating the feature branch on my end. Also did not seem like the original one received much notice so I didn’t think anyone would care. Lmk if you have any suggestion. Happy to take advice.

(lib.optionals stdenv.isDarwin [ libiconv flock which ])
++ [ pkgs.gsl liblapack gmp.dev gtk2.dev bzip2.dev pkgs.pkgconfig
lzma icu.dev pcre.dev zlib.dev curl.dev libpng libtiff libjpeg
]);
Copy link
Member

@garbas garbas Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will increase the closure size quite a bit for packages that don't need this dependencies.

I'm assuming this is an optimization for your time since it is hard to track down who needs what, but for including this in nixpkgs I think it would be smart to add it on per package basis. I'm no expert in R, but I would assume this would apply also to rPackages. But I understand it adds much more work.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I thought about that. If you add these it means a new R packages is more likely to just work. However, what you say about closures is true. A mitigating factor is that the target audience for R mainly wants a development environment with many packages. In that scenario you only pay the size price once. If you think certain packages should be removed from this list I would be willing to add them explicitly as needed. However I don't thing the list should be totally empty. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A mitigating factor is that the target audience for R mainly wants a development environment with many packages. In that scenario you only pay the size price once.

I don't believe that you can legitimately speak for "the target audience" unless you have polled them first in some significant way. As it is, we don't know other people want. We can only speak about what we want.

Personally, I want accurate dependencies. I don't want packages to depend on stuff that they actually don't depend on. Furthermore, this is a guiding principle we have applied all over Nixpkgs, not just for R. This change goes against that design goal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current branch there are packages already added to every R derivation, whether they are needed or not. The question is what is the criteria for adding those? I am sure I added some that are reasonable, but perhaps some should be done per-package. What's the right decision process? Should it include likelihood/ease of adding a future R package to nixpkgs? If not, we could make all dependencies explicit. More accurate, smaller closure size. But more breakage, more effort now and more in the future. I honestly don't know the correct answer here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the target audience, it might be the case that the nix/R user only runs batch jobs and need only a few R packages at a time. That would be a fair comment. On the other hand, I can confidently say that there are few if any data scientists using nix/R to build and maintain a dev environment that includes bioconductor packages. That's what prompted me to attempt to improve the situation. Even if this PR got merged, serious problems remain for source packages that update frequently. We don't need a poll to see how conda supports those users better right now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current branch there are packages already added to every R derivation, whether they are needed or not.

Which ones are you referring to?

On the target audience, it might be the case that the nix/R user only runs batch jobs and need only a few R packages at a time.

Nobody said that. I all said was that I don't like it when people try to justify technical decisions with the needs of some "silent majority" without any evidence that they have actual knowledge about the needs of that group. I'd rather you speak for what you need than about what some mystical "target audience" needs.

@peti
Copy link
Member

peti commented May 7, 2020

I ran into trouble updating the feature branch on my end.

What kind of trouble?

Copy link
Member

@peti peti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR changes too many things at once. I would really love to have this submission broken up into separate PRs that are easier to review.

@@ -0,0 +1,27 @@
#!/bin/sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this script?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a test script to build packages. I used it to test all of the packages. Complements the generation of all the packages created by generate-r-packages.R. Intended for use by someone updating all the R packages, not a user trying to install some.

_self = { inherit buildRPackage; } //
import ./bioc-packages.nix { inherit self; derive = deriveBioc; } //
import ./bioc-packages.nix { inherit self; derive = deriveBioc; } //
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't mix indentation changes into a PR that changes functionality. This makes the diffs unnecessarily long and hard to read. If you want to improve the formatting, please use a separate PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that looks like a mistake.

};

packagesWithNativeBuildInputs = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. Please make such a stylistic change in a separate PR. As it is, it's hard hard to tell what you actually changed because the diff is so long.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sure the diff could be improved and welcome suggestions. But the diff long mainly because it fixes hundreds of packages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the diff is unnecessarily long because you've replaced every explicit pkgs path with an implicit with pkgs; scope at the top of the attribute. There might be lots of build fixes included in the change you've submitted, but they aren't recognizable because of this noise.

@timsears
Copy link
Author

timsears commented May 8, 2020

@peti I would attempt to meet the requested changes if the target was clearer. Simply saying to break up the PR doesn't tell me what would make the PR acceptable.

@peti
Copy link
Member

peti commented May 8, 2020

I would attempt to meet the requested changes if the target was clearer. Simply saying to break up the PR doesn't tell me what would make the PR acceptable.

  • Please don't submit cosmetic white-space changes at the same time as a functional change.
  • Please don't include your test scripts in a PR that's supposed to update the biodata package set.
  • Please don't submit changes in name scoping (by adding or removing 'with xyz;' statements) at the same time as functional changes.
  • Please don't change which packages are included in the build environment by default at the same time together with other functional changes.

@histonemark
Copy link

histonemark commented Jun 1, 2020

The patch Rhdf5lib.patch also fails for me. And I wholeheartedly agree with @timsears with the need of rapid adaptation for data analysis environments in which packages are changing their dependencies very other day.

@peti
Copy link
Member

peti commented Jun 2, 2020

@timsears said "would attempt to meet the requested changes if the target was clearer". Then I went out of my way to say what changes are required in #85522 (comment), but the effect has been zilch. What a waste of time. :-( I'm closing this PR for now. If anyone wants to pick up the baton, please don't hesitate to submit your changes in a new PR.

@peti peti closed this Jun 2, 2020
@timsears
Copy link
Author

@peti It's fine to close this PR. For a new person on nixpkgs I feel your attitude is discouraging at best and borderline toxic. It's not going to be easy to attract new nixpkgs committers without a change of approach and maybe documentation. I truly hope this is not a common experience for others.

On the technical side I am going to start over and start by thinking about about the source caching issue, which I uncovered only by attempting this PR and which it does not address.

@peti
Copy link
Member

peti commented Jun 25, 2020

@timsears, this whole interaction would be so much more constructive if you could cut out the ad hominem attacks and instead address the technical issues I pointed out about your PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants