glibc: don't use bootstrap libgcc_s #36948

Merged
merged 3 commits into from Jan 23, 2019


dtzWill
Member

@dtzWill dtzWill commented Mar 14, 2018

Fixes #36947

  • Tested using sandboxing (nix.useSandbox on NixOS, or option build-use-sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

Huge rebuild, have not tested everything obviously :).

Wanted to see what broke and was surprised to find
that things didn't immediately and completely break.

I'm curious if clang works after this, we'll see!

@GrahamcOfBorg

No attempt on x86_64-darwin

The following builds were skipped because they don't evaluate on x86_64-darwin: glibc

No log is available.

@GrahamcOfBorg

Success on aarch64-linux (full log)

Attempted: glibc

Partial log:

cannot find section .dynamic
moving /nix/store/8jga76ik7x3wfsg5m2hf9jmbdxczzgsf-glibc-2.26-131-bin/sbin/* to /nix/store/8jga76ik7x3wfsg5m2hf9jmbdxczzgsf-glibc-2.26-131-bin/bin
checking for references to /build in /nix/store/r45ws72pm72hba06dd69gw3w6qhjigzz-glibc-2.26-131-dev...
could not find build ID of /nix/store/0mgm69j09xql4vsb063hlwc31l04kzjj-glibc-2.26-131-static/lib/libmcheck.a, skipping
could not find build ID of /nix/store/0mgm69j09xql4vsb063hlwc31l04kzjj-glibc-2.26-131-static/lib/libieee.a, skipping
checking for references to /build in /nix/store/0mgm69j09xql4vsb063hlwc31l04kzjj-glibc-2.26-131-static...
wrong ELF type
wrong ELF type
checking for references to /build in /nix/store/3gg8dg63z571zzj152rd4h70psbh5qxf-glibc-2.26-131-debug...
/nix/store/0hsa094362ksjmfr8klay9wkj85frd49-glibc-2.26-131

@GrahamcOfBorg

Success on x86_64-linux (full log)

Attempted: glibc

Partial log:

cannot find section .dynamic
moving /nix/store/nwrvymzvv0xp2rslag6crfc018dwwndz-glibc-2.26-131-bin/sbin/* to /nix/store/nwrvymzvv0xp2rslag6crfc018dwwndz-glibc-2.26-131-bin/bin
checking for references to /build in /nix/store/v1qs4k70dbh094ms4qmvjncsnfw5m5b7-glibc-2.26-131-dev...
could not find build ID of /nix/store/bzj1lrd6qghn0bj6i12i4pv3ymsxssga-glibc-2.26-131-static/lib/libieee.a, skipping
could not find build ID of /nix/store/bzj1lrd6qghn0bj6i12i4pv3ymsxssga-glibc-2.26-131-static/lib/libmcheck.a, skipping
checking for references to /build in /nix/store/bzj1lrd6qghn0bj6i12i4pv3ymsxssga-glibc-2.26-131-static...
wrong ELF type
wrong ELF type
checking for references to /build in /nix/store/nsxjns63rfg6dlf10hl213vdk4404wkx-glibc-2.26-131-debug...
/nix/store/jcs0l7vdp162qjixs0i7wdxva303dh59-glibc-2.26-131

@dtzWill
Member Author

dtzWill commented Mar 14, 2018

Yep, tests.cc-wrapper-clang-5 fails with this, as expected. Need to teach it to find libs in gcc.lib or perhaps create a merged variant and point clang at that...

@dtzWill
Member Author

dtzWill commented Mar 14, 2018

build log of the failure, for the curious: https://gist.github.com/dtzWill/9aea1fb5695788fcb0a63c96f32d36d4

This is the error we currently hit when trying to use clang as a cross-compiler, which is what started me looking into all of this :).

cc #36867

@Mic92
Member

Mic92 commented Mar 14, 2018

@dtzWill if you are at it, could you maybe also teach clang to find c++ stdlib headers without compiler wrappers on linux? This would fix all tooling that uses libclang. I tried to fix it in https://github.com/llvm-mirror/clang/blob/master/lib/Driver/ToolChains/Linux.cpp#L758 but this only affects clang from a build directory.

@Mic92
Member

Mic92 commented Mar 14, 2018

@Ericson2314
Member

Yeah, we need to change GCC at some point so that the compiler itself and the runtime libraries are built separately. That will make the one problem caused by this go away!

@vcunat
Member

vcunat commented Mar 14, 2018

Hmm, I suppose we can afford this now that we have gcc.lib split away.

@Ekleog
Member

Ekleog commented Mar 14, 2018

Hmm, I think this issue is currently making some builds fail for 18.03, esp. ones that are required for the channel to move forward and actually come into existence (spidermonkey_38 for i686 in particular is required by tests that are required for 18.03, if I read hydra correctly).

Would it make sense to merge this PR into release-18.03 and see what happens, so that we get a better idea of what exactly is broken by it? Anyway I feel like the channel won't move forward until this PR or a similar one is merged, so I don't think there is much to lose (apart from hydra building time)?

(cc ZHF #36453)

@vcunat
Member

vcunat commented Mar 14, 2018

Eh, better not to run such experiments directly on 18.03 anymore. Once it's confirmed that this works well, we can cherry-pick it into 18.03. We may have a separate temporary jobset if you think there's a risk of mass breakage, but I suppose it won't be that risky.

@pbogdan
Member

pbogdan commented Mar 14, 2018

Plus 18.03's tested job already succeeded so as I understand it this shouldn't block channel updates (I can't see the channel being created yet but I suppose that's a different issue).

@dtzWill
Member Author

dtzWill commented Mar 14, 2018

Perhaps a new jobset would be appropriate then?

@Ekleog
Member

Ekleog commented Mar 14, 2018

@pbogdan oh indeed, I was misreading the jobs tab as the tested job, because I didn't understand why 18.03 hadn't been created yet -- my reasoning was completely wrong, then, sorry for the noise :)

@Ericson2314 Ericson2314 added this to the 18.09 milestone Mar 15, 2018
@Ericson2314 Ericson2314 added the 6.topic: portability General portability concerns, not specific to cross-compilation or a specific platform label Mar 15, 2018
@Ericson2314
Member

Ericson2314 commented Mar 15, 2018

I added the 18.09 milestone because I definitely want this to happen by then. Not trying to say it shouldn't also happen for 18.03 :).

@vcunat
Member

vcunat commented Mar 15, 2018

For reference, the 18.03 channels are there now, for about a day. The machine bumping them wasn't immediately re-deployed with the new configuration, by mistake.

@matthewbauer matthewbauer added this to Planned for 18.09 in Cross compilation Mar 19, 2018
@Ericson2314
Member

Can we still do this? A quick hack for LLVM in cc-wrapper should be easy enough.

@dtzWill
Member Author

dtzWill commented May 2, 2018

Agreed, and we can probably improve the libclang situation while we're at it.

Definitely want this to happen, just not sure when :)

@lopsided98
Contributor

Doing this would make everything that needs libgcc_s.so depend on gcc, right? Shouldn't we have a libgcc package like pretty much every other distribution to avoid this?
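
One rough way to gauge how many binaries would be affected is to check which ones list libgcc_s as a direct (`DT_NEEDED`) dependency. The following is a minimal sketch, assuming binutils' `readelf` is on `PATH`; the helper names `needed_libraries` and `links_libgcc_s` and the `SAMPLE` text are hypothetical and only for illustration. It does not catch libraries that are loaded via `dlopen()` at runtime.

```python
import re
import subprocess

# Matches the DT_NEEDED rows in the dynamic section printed by `readelf -d`.
_NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[([^\]]+)\]")


def needed_libraries(readelf_output: str) -> list[str]:
    """Extract DT_NEEDED sonames from the text printed by `readelf -d`."""
    return _NEEDED_RE.findall(readelf_output)


def links_libgcc_s(path: str) -> bool:
    """Run `readelf -d` on `path` and report whether libgcc_s is a direct dependency."""
    out = subprocess.run(
        ["readelf", "-d", path], capture_output=True, text=True, check=True
    ).stdout
    return any(lib.startswith("libgcc_s.so") for lib in needed_libraries(out))


# Illustrative input only; not taken from a real build.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libexample.so.0]
"""

if __name__ == "__main__":
    print(needed_libraries(SAMPLE))
```

Running `links_libgcc_s` over the files in a build's `./result` would give a first approximation of which outputs pull in the library directly.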

@Ericson2314
Member

We soon will for compiler-rt! #39743

@GrahamcOfBorg

Unexpected error: command failed with exit code 1 on aarch64-linux (full log)

Attempted: pkgsi686Linux.dejagnu

Partial log:

copying path '/nix/store/j566kypswj02xsv5mas0i84hdwz01gm6-patchelf-0.9' from 'https://cache.nixos.org'...
copying path '/nix/store/c1c89krxy60ba0v7y6d40zhkv5bdgsri-paxctl-0.9' from 'https://cache.nixos.org'...
copying path '/nix/store/8ir1cbiqgg8g5n60hcx10zcpw60is06z-perl-5.28.0' from 'https://cache.nixos.org'...
copying path '/nix/store/858i1ji39dqzawm4w6byw4cirfp90z3n-zlib-1.2.11' from 'https://cache.nixos.org'...
copying path '/nix/store/jdaazj4akbrvch36ph5zw9in2pk0bmg0-bison-3.1' from 'https://cache.nixos.org'...
copying path '/nix/store/238m5cy2ahxx4j5zawx98ykgxcv52p95-binutils-2.30' from 'https://cache.nixos.org'...
copying path '/nix/store/5kn286sai38qfgwn2cmd89fkinb23y8q-binutils-wrapper-2.30' from 'https://cache.nixos.org'...
copying path '/nix/store/yfx0d30vihr85l335l0y98izpvzc0xhz-bootstrap-stage2-gcc-wrapper' from 'https://cache.nixos.org'...
copying path '/nix/store/2vjfknwnzpqsp1dld5f7wirx30j1gxn9-bootstrap-stage2-stdenv-linux' from 'https://cache.nixos.org'...
error: a 'i686-linux' is required to build '/nix/store/gy8l79r0r5np5l2ha1x110mhcls5z5d1-linux-headers-4.18.3.drv', but I am a 'aarch64-linux'

@GrahamcOfBorg

Timed out, unknown build status on x86_64-linux (full log)

Attempted: pkgsi686Linux.dejagnu

Partial log:

cannot build derivation '/nix/store/46i4abslll9px32gdiqhmqy6g71qy794-hook.drv': 3 dependencies couldn't be built
cannot build derivation '/nix/store/pph0yn3wla4idmy5rdrrfdvfqcrrnipz-patch-2.7.6.drv': 3 dependencies couldn't be built
cannot build derivation '/nix/store/0wr8wi99cljhmjzk3hznym4ffra5vrk1-stdenv-linux.drv': 23 dependencies couldn't be built
cannot build derivation '/nix/store/yfjcxdg24qf2drdibgpgq4bmkcy5l5nm-stdenv-linux.drv': 23 dependencies couldn't be built
cannot build derivation '/nix/store/6bq22cai0890gnsnz0cl3579bjspyy88-hook.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/y5i5gcdp78c1fcqrdw4pw3zlfhxj1kp0-tcl-8.6.6.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/riy20wqr2jq9j1xvanwldlx48m5fcwmc-hook.drv': 3 dependencies couldn't be built
cannot build derivation '/nix/store/9i0qyghb3pwv6nvcx2fp5jkhxhfaiasl-expect-5.45.4.drv': 4 dependencies couldn't be built
cannot build derivation '/nix/store/x6ndlqzda010q1hf90yri3n6axdwzdm1-dejagnu-1.6.1.drv': 4 dependencies couldn't be built
error: build of '/nix/store/x6ndlqzda010q1hf90yri3n6axdwzdm1-dejagnu-1.6.1.drv' failed

dtzWill and others added 2 commits November 3, 2018 19:05
The underlying problem with libgcc is worked around.

(cherry picked from commit afea12f)
@GrahamcOfBorg

No attempt on x86_64-darwin (full log)

The following builds were skipped because they don't evaluate on x86_64-darwin: glibc

Partial log:


a) For `nixos-rebuild` you can set
  { nixpkgs.config.allowUnsupportedSystem = true; }
in configuration.nix to override this.

b) For `nix-env`, `nix-build`, `nix-shell` or any other Nix command you can add
  { allowUnsupportedSystem = true; }
to ~/.config/nixpkgs/config.nix.


@GrahamcOfBorg

Success on aarch64-linux (full log)

Attempted: glibc

Partial log:

checking for references to /build in /nix/store/wsa5nkdf644ryv9p8zpn2d2qwsva9dff-glibc-2.27-bin...
cannot find section .dynamic
cannot find section .dynamic
moving /nix/store/wsa5nkdf644ryv9p8zpn2d2qwsva9dff-glibc-2.27-bin/sbin/* to /nix/store/wsa5nkdf644ryv9p8zpn2d2qwsva9dff-glibc-2.27-bin/bin
checking for references to /build in /nix/store/5ds0vi1w2fki4v061ppm5xp225dgyw7j-glibc-2.27-dev...
could not find build ID of /nix/store/0a1hhqqcvp4akjd1s20wq3qx298ld4in-glibc-2.27-static/lib/libmcheck.a, skipping
checking for references to /build in /nix/store/0a1hhqqcvp4akjd1s20wq3qx298ld4in-glibc-2.27-static...
wrong ELF type
checking for references to /build in /nix/store/8w2dkwgjd22xbqlmvyjpd1rvwm77yrar-glibc-2.27-debug...
/nix/store/4y1w2p3c5m0gh9xphfir1zfw09jan57f-glibc-2.27

@GrahamcOfBorg

Success on x86_64-linux (full log)

Attempted: glibc

Partial log:

checking for references to /build in /nix/store/3713rx0rbs3mjyny61warmd0ain3wmfi-glibc-2.27-bin...
cannot find section .dynamic
cannot find section .dynamic
moving /nix/store/3713rx0rbs3mjyny61warmd0ain3wmfi-glibc-2.27-bin/sbin/* to /nix/store/3713rx0rbs3mjyny61warmd0ain3wmfi-glibc-2.27-bin/bin
checking for references to /build in /nix/store/0p4swxfiga1y4xzb6db9wxbwyxw9dk5k-glibc-2.27-dev...
could not find build ID of /nix/store/5yzm7bdpvlrhfrgcpzlmf465brfxk9dr-glibc-2.27-static/lib/libmcheck.a, skipping
checking for references to /build in /nix/store/5yzm7bdpvlrhfrgcpzlmf465brfxk9dr-glibc-2.27-static...
wrong ELF type
checking for references to /build in /nix/store/yim1ca15is3k2zld4ywci4pl7d6cv7l2-glibc-2.27-debug...
/nix/store/mk5bqc17fvjxif36k13r3nikfl381x4f-glibc-2.27

@matthewbauer matthewbauer force-pushed the fix/glibc-libgcc_s branch 2 times, most recently from 8d7d962 to e26c8aa Compare January 23, 2019 21:52
@pbogdan
Member

pbogdan commented Jan 24, 2019

FWIW dejagnu is broken on x86_64-linux and i686-linux with these changes. As I understand it the library may be dynamically loaded by glibc at runtime which makes hunting down failures more difficult (in this case it seems it's some code in tcl that triggers loading the library which is no longer available on the search path).
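
Because a library loaded with `dlopen()` does not appear among a binary's `DT_NEEDED` entries, static inspection will not reveal this kind of dependency; it only shows up when the code path runs. A minimal sketch of a runtime probe, assuming Python with `ctypes` is available in the affected environment. The helper name `can_dlopen` is hypothetical, and `libgcc_s.so.1` is assumed to be the soname in question. Note that the result reflects the search path of the Python process itself, not necessarily that of the failing program.

```python
import ctypes


def can_dlopen(soname: str) -> bool:
    """Return True if the dynamic loader can resolve and load `soname`."""
    try:
        # ctypes.CDLL calls dlopen() with the given name, so the loader's
        # usual search rules apply (RPATH, LD_LIBRARY_PATH, default dirs).
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    name = "libgcc_s.so.1"  # assumed soname; adjust for the platform
    status = "loadable" if can_dlopen(name) else "NOT loadable"
    print(f"{name}: {status}")
```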

@pbogdan
Member

pbogdan commented Jan 24, 2019

gnutls test suite fails too.

@matthewbauer
Member

@dtzWill is there any way to avoid these issues? I still like getting rid of references to libgcc... But it may need to wait for another time?

@matthewbauer
Member

OK, let's revert for now. Ideally we can get this fixed at a later point. 319ebef is the revert.

matthewbauer added a commit that referenced this pull request Jan 27, 2019
@dtzWill
Member Author

dtzWill commented Jan 27, 2019

Aww can't reopen this? 😿

Suppose can always make a new one if someone finds time to tackle the issues.

Not sure I prefer the current state (subtle issues everywhere) over breaking things explicitly and forcing the issue, but that's just me O:). Probably not best to do that with a release looming, I admit... :)

Gaelan added a commit to Gaelan/nixpkgs that referenced this pull request Apr 28, 2022
Let's try this again. See
NixOS#36947 and
NixOS#36948
for history
ajs124 pushed a commit to helsinki-systems/nixpkgs that referenced this pull request Oct 22, 2022
Let's try this again. See
NixOS#36947 and
NixOS#36948
for history
@ghost

ghost commented Jan 11, 2023

I figured out the root cause of these segfaults.

wegank pushed a commit to wegank/nixpkgs that referenced this pull request Feb 1, 2023
#### Immediate Benefits

- Allow `gcc11` on `aarch64`
- No more copying `libgcc_s` out of the bootstrap-files or other
  derivations
- No more [static `lib{mpfr,mpc,gmp,isl}.a`
  hack](https://github.com/NixOS/nixpkgs/blob/2f1948af9c984ebb82dfd618e67dc949755823e2/pkgs/stdenv/linux/default.nix#L380)
- *Zero* additional `gcc` builds (stage1+stage2+stageCompare)
  - The `gcc` derivation builds `gcc` once instead of three times.
  - The libraries that are linked into the final `pkgs.gcc` (`mpfr`,
    `mpc`, `gmp`, `isl`, `glibc`) are built by
    `stdenv.__bootPkgs.gcc` rather than by the `bootstrapFiles`.  No
    more Frankenstein compiler!
  - stageCompare runs **concurrently** with (not in series with)
    `stdenv`'s dependees.
- Many other `stdenv` hacks eliminated.
  - `gcc` and `clang` share the same codepath for more of
    `cc-wrapper`.
  - Makes the cross and native codepaths much more similar --
    another step towards "cross by default".

Note that *all* the changes in this PR are controlled by flags; no
old codepaths need to be removed until/if we're completely certain
that this is the right way to go.

#### Future Benefits

- This should allow using a [foreign] `bootstrap-files` so long as
  `hostPlatform.canExecute bootstrapFiles`.
- There will be an "avalanche of simplification" when we set
  `enableGccExternalBootstrap=true` and run dead code elimination.
  It's really quite a huge amount of code that goes away.
  Native-gcc has its own special codepath in so many places, while
  cross-gcc and clang work the same way (and are much simpler).
- This should allow each of the libraries that ship with `gcc`
  (`lib{backtrace,atomic,cc1,decnumber,ffi,gomp,iberty,offloadatomic,quadmath,sanitizer,ssp,stdc++-v3,vtv}`)
  to be built in separate (one-liner) derivations which `inherit
  src;` from `gcc`.
  - Building `libstdc++-v3` in a separate derivation will eliminate
    a lot of accidental-reference-to-the-`bootstrapFiles` landmines.

#### Incorporates

- NixOS#209054
- NixOS#210004
- NixOS#36948 (unreverted)
- NixOS#210325
- NixOS#210118
- NixOS#210132
- NixOS#210109

#### Closes

- Closes NixOS#208412
- Closes NixOS#108111
- Closes NixOS#108305
- Closes NixOS#201254

#### Build history

- First successful builds (stage1/stage2):
  - powerpc64le-linux at 9c7e9ef
  - x86_64-linux at 9c7e9ef
  - aarch64-linux at 4d5bc7d

- First successful comparisons (stageCompare):
  - at 81949cf
  - [aarch64-linux][aarch64-compare-ofborg]
  - [x86\_64-linux][amd64-compare-ofborg]

#### Credits

This project was made possible by three important insights, none of
which were mine:

1. @Ericson2314 was the first to advocate for this change, and
   probably the first to appreciate its advantages.  External
   bootstrap is "cross by default".

2. @trofi has figured out a lot about how to get gcc to not mix up
   the copy of `libstdc++` that it depends on with the copy that it
   builds.  Now that gcc is written in C++, it depends on
   `libstdc++`, builds a copy of `libstdc++`, and builds auxiliary
   products (like `libplugin`) which depend on `libstdc++`.  @trofi
   developed two important techniques for keeping this straight: the
   use of a [nonexistent sysroot] and moving the `bootstrapFiles`'
   `libstdc++` into a [versioned directory].  Without these two
   discoveries, external bootstrap would be impossible, because the
   final gcc would still have references to the `bootstrapFiles`.

3. Using the undocumented variable [`user-defined-trusted-dirs`]
   when building glibc.  When glibc `dlopen()`s `libgcc_s.so`, it
   uses a completely different and totally special set of rules for
   finding `libgcc_s.so`.  This trick is the only way we can put
   `libgcc_s.so` in its own separate outpath without creating
   circular dependencies or dependencies on the bootstrapFiles.  I
   would never have guessed to use this (or that it existed!) if it
   were not for a [comment in guix] which @Mic92 [mentioned].

My own role in this PR was basically: being available to go on a
coding binge at an opportune moment, so we wouldn't waste a
[crisis].

[aarch64-compare-ofborg]: https://github.com/NixOS/nixpkgs/pull/209870/checks?check_run_id=10662822938
[amd64-compare-ofborg]: https://github.com/NixOS/nixpkgs/pull/209870/checks?check_run_id=10662825857
[nonexistent sysroot]: NixOS#210004
[versioned directory]: NixOS#209054
[`user-defined-trusted-dirs`]: https://sourceware.org/legacy-ml/libc-help/2013-11/msg00026.html
[comment in guix]: https://github.com/guix-mirror/guix/blob/5e4ec8218142eee8e6e148e787381a5ef891c5b1/gnu/packages/gcc.scm#L253
[mentioned]: NixOS#210112 (comment)
[crisis]: NixOS#108305
[foreign]: NixOS#170857 (comment)
Labels
6.topic: portability General portability concerns, not specific to cross-compilation or a specific platform 10.rebuild-darwin: 0 10.rebuild-linux: 501+ 10.rebuild-linux-stdenv
Projects
No open projects
Cross compilation
Planned for 18.09