Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wip] haskell: add core libraries logic to remove need for global package db #43300

Closed
wants to merge 5 commits into from

Conversation

matthewbauer
Copy link
Member

@matthewbauer matthewbauer commented Jul 10, 2018

This moves core libraries in Haskell to its own derivations. This is
needed to avoid references to the GHC compiler ending up in binaries
built by GHC. Instead of referencing GHC directly, these binaries will
now link to the copies.

Unfortunately, for this to work, we need to know which GHC libraries
exist at evaluation time. This requires maintaining a list of the
builtin library names & their versions in the GHC compiler’s “libs”
attribute. A better way of handling this is needed.

This is still a work in progress.

/cc @kirelagin @bgamari @ElvishJerricco

@matthewbauer matthewbauer requested a review from peti as a code owner July 10, 2018 19:16
@matthewbauer matthewbauer requested review from Ericson2314 and removed request for peti July 10, 2018 19:16
@matthewbauer matthewbauer requested a review from peti July 10, 2018 19:24
@matthewbauer matthewbauer changed the title [wip] haskell: add core libraries logic to remove libraries from GHC [wip] haskell: add core libraries logic to remove need for global package db Jul 10, 2018
@peti
Copy link
Member

peti commented Jul 10, 2018

How do these changes relate to #43070?

@matthewbauer
Copy link
Member Author

matthewbauer commented Jul 10, 2018

I think they accomplish the same thing - just in different ways. Both should reduce closure sizes.

Pros of this approach:

  • Avoids rebuilding GHC.
  • Meant as a stopgap fix until we can split the GHC compiler from its core libraries (more work on GHC is needed for this to be possible IIUC).
  • Doesn't rely on multiple outputs (perhaps this is a con for some but IMO having a separate output for each library is kind of an abuse of the idea of multiple outputs).

Cons of this approach:

  • You end up with two copies of each core library in your Nix store.
  • Requires keeping track of name + version of every core library in the GHC compiler (more maintenance especially because it looks like they change from version to version).

@matthewbauer
Copy link
Member Author

This is currently blocked on an issue that I have been unable to debug:

Configuring hscolour-1.24.4...
Dependency base <10: using base-4.12.0.0
Dependency containers -any: using containers-0.6.0.1
Dependency base <10: using base-4.12.0.0
Dependency containers -any: using containers-0.6.0.1
<command line>: unknown package: rts

It looks like it picks up base but not rts?

@Ericson2314
Copy link
Member

Ericson2314 commented Jul 11, 2018

@peti Unless we're worried about churn for the builders, I'd love to land #43070 as soon as possible, as it was passing deps before mere conflicts. Then we can do this approach, with the libraries having been listed for us, right after.

With NixOS/cabal2nix#358 enabling a new strictDeps PR with far fewer overrides + all libraries coming from separate non-GHC derivations, we should be able to both 1) finish cross support and 2) remove all duplicate args, in less than a week! [I say finish because while it would be nice to e.g. build components, including Setup, separately, there is no burden for correctness forcing us to do so.]

@domenkozar
Copy link
Member

This is really good work, it will reduce haskell closures significantly for packages dynamically linked.

@kirelagin
Copy link
Member

Both should reduce closure sizes.

Actually, I don’t think that my approach affects closure size in any way.

@kirelagin
Copy link
Member

I might be missing something, but can’t we just “merge” our two PRs by moving the code from mkCoreLib here into the install phase in my PR? This way boot libraries outputs will not retain a link to the main output and we will get the closure size reduced.

P. S. I wonder if the GHC build system can be told to install those packages straight into the right location instead of us copying them around manually.

@kirelagin
Copy link
Member

having a separate output for each library is kind of an abuse of the idea of multiple outputs

I don’t see how this is an abuse TBH. The GHC build process is structured in such a way that it produces both the compiler and libraries for it to use and, furthermore, it is the only supported way of getting those libraries. What I mean is that some of the boot libraries can be then recompiled, but some of them just cannot be built outside of the ghc compilation process, and, AFAIU, this is exactly the purpose of multiple outputs.

What we can do is limit those libraries that go to multiple outputs strictly to those that cannot be built separately.

@Ericson2314
Copy link
Member

@kirelagin I personally don't mind multiple outputs, but I do think it's important that we have it so generic-builder and ghc with generic-builder can find them exactly one way. Whether we do that with multiple outputs, or shim derivations I guess doesn't matter as much on that front as long as GHC-without-libraries has its own output / shim derivation. Put another way, the -B for GHC should give it a thing without libraries.

@matthewbauer
Copy link
Member Author

matthewbauer commented Jul 11, 2018

I don’t see how this is an abuse TBH. The GHC build process is structured in such a way that it produces both the compiler and libraries for it to use and, furthermore, it is the only supported way of getting those libraries. What I mean is that some of the boot libraries can be then recompiled, but some of them just cannot be built outside of the ghc compilation process, and, AFAIU, this is exactly the purpose of multiple outputs.

The original use case was for splitting up /include from /lib. I don't think anyone has ever generated them from a variable list like this before. There's probably no technical reason why it can't be done, it is just a little bit unexpected. A lot of this can probably be solved by better tooling - I'm just not sure if we are ready for something like this. A lot of users are surprised when something like this doesn't work:

nix-env -iA ghc.base

(just ghc.out is installed IIRC)

@kirelagin
Copy link
Member

Well, those multiple outputs are just a technical detail, it is actually a good thing that there is no easy way to discover them. They are intended to be used only as top-level haskell packages set attributes.

@Ericson2314
Copy link
Member

Another concern is that more invasive the procedure gets, The move convenient it is to have two derivations for purpose of debug cycles. Kind of like how we don't wrap gcc when be build it but separately.

matthewbauer and others added 4 commits July 18, 2018 15:30
This moves core libraries in Haskell to its own derivations. This is
needed to avoid references to the GHC compiler ending up in binaries
built by GHC. Instead of referencing GHC directly, these binaries will
now link to the copies.

Unfortunately, for this to work, we need to know which GHC libraries
exist at evaluation time. This requires maintaining a list of the
builtin library names & their versions in the GHC compiler’s “libs”
attribute. A better way of handling this is needed.
ghc8.2.2 apparently doesn’t like moving rts from its original
location. Hopefully this was fixed in ghc 8.6.1.
This may not be needed but it is more consistent with the idea of
treating core libraries as ordinary haskell libraries.
Move score libs to pkgs/development/haskell-modules/default.nix. This
seems like a better place for them.

Instead of doing “base = null;” we now can use the actual base here.
@matthewbauer
Copy link
Member Author

I honestly have no idea how to fix the unknown package: rts. Any debugging points to it being some sort of Cabal thing. I filed an issue here: haskell/cabal#5445. Does anyone else have experience with using -no-global-package-db?


eval fixupPhase
'';
}; in builtins.listToAttrs (map (mkCoreLib ghc) (super.ghc.libraries or []));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation?

@@ -208,6 +208,45 @@ stdenv.mkDerivation (rec {

# Our Cabal compiler name
haskellCompilerName = "ghc-8.4.3";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated issue here, btw

@cdepillabout
Copy link
Member

@matthewbauer This PR is quite old at this point, is there anything in particular that needs to be done to move it forward?

Or could we go ahead and close it?

@matthewbauer
Copy link
Member Author

Yes closing for now. It may be useful in the future but not in the current state

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants