Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "meson: 0.46.1 -> 0.47.0" #44040

Closed

Conversation

dtzWill
Copy link
Member

@dtzWill dtzWill commented Jul 24, 2018

With meson 0.47.0 (or 0.47.1, or git)
things are very wrong re:rpath handling
resulting in at best missing libs but
even corrupt binaries :(.

When we run patchelf it masks the problem
by removing obviously busted paths.
Which is probably why this wasn't noticed immediately.

Unfortunately the binary already
has a long series of paths scribbled
in a space intended for a much smaller string;
in my testing it was something like
lengths were 67 with 300+ written to it.

I think we've reported the relevant issues upstream,
but unfortunately it appears our patches
are what introduces the overwrite/corruption
(by no longer being correct in what they assume)

This doesn't look so bad to fix but it's
not something I can spend more time on
at the moment.

--

Interestingly the overwritten string data
(because it is scribbled past the bounds)
remains in the binary and is why we're suddenly
seeing unexpected references in various builds
-- notably this is is the reason we're
seeing the "extra-utils" breakage
that entirely crippled NixOS on master
(and probably on staging before?).

Fixes #43650.

This reverts commit 305ac4d.


cc #43546


  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

With meson 0.47.0 (or 0.47.1, or git)
things are very wrong re:rpath handling
resulting in at best missing libs but
even corrupt binaries :(.

When we run patchelf it masks the problem
by removing obviously busted paths.
Which is probably why this wasn't noticed immediately.

Unfortunately the binary already
has a long series of paths scribbled
in a space intended for a much smaller string;
in my testing it was something like
lengths were 67 with 300+ written to it.

I think we've reported the relevant issues upstream,
but unfortunately it appears our patches
are what introduces the overwrite/corruption
(by no longer being correct in what they assume)

This doesn't look so bad to fix but it's
not something I can spend more time on
at the moment.

--

Interestingly the overwritten string data
(because it is scribbled past the bounds)
remains in the binary and is why we're suddenly
seeing unexpected references in various builds
-- notably this is is the reason we're
seeing the "extra-utils" breakage
that entirely crippled NixOS on master
(and probably on staging before?).

Fixes NixOS#43650.

This reverts commit 305ac4d.
@dtzWill
Copy link
Member Author

dtzWill commented Jul 24, 2018

As mentioned mass failures on Hydra: https://hydra.nixos.org/eval/1470463#tabs-still-fail

So honestly if this fixes even just this test it's an improvement ;)

@GrahamcOfBorg test systemd

@GrahamcOfBorg
Copy link

Success on x86_64-linux (full log)

Attempted: tests.systemd

Partial log (click to expand)

syncing
machine: running command: sync
machine: exit status 0
2 out of 2 tests succeeded
test script finished in 48.70s
cleaning up
killing machine (pid 624)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/3mvyfz2zg6h21dilp3vy2mbi1c23128w-vm-test-run-systemd

@GrahamcOfBorg
Copy link

Success on aarch64-linux (full log)

Attempted: tests.systemd

Partial log (click to expand)

machine# [   25.815009] display-manager[698]: (II) Module glx: vendor="X.Org Foundation"
machine# [   25.829322] display-manager[698]:   compiled for 1.19.6, module version = 1.0.0
machine: exit status 0
2 out of 2 tests succeeded
test script finished in 73.12s
cleaning up
killing machine (pid 649)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/nm3lacblnq4khcr6w8m13wd7qx8d3xh8-vm-test-run-systemd

@domenkozar
Copy link
Member

I'd like to merge staging with this as well.

@dtzWill
Copy link
Member Author

dtzWill commented Jul 24, 2018

Sounds good to me--just wanted someone to confirm before merging since it'll cause all the rebuilds :P. Well a lot anyway :).

@domenkozar
Copy link
Member

Oh there is already a big rebuild on staging, so I'll merge there and reeval.

@domenkozar
Copy link
Member

Pushed to staging-next.

@domenkozar domenkozar closed this Jul 24, 2018
@bkchr
Copy link
Contributor

bkchr commented Jul 24, 2018

But the fix is required on master too, or do you will merge staging-next into master also?

@domenkozar
Copy link
Member

I think it's best to do so, given that staging-next has (more or less) rebuilt and except for a few hiccups on the Python side that @FRidh seems to be looking into, it looks good.

@jtojnar jtojnar mentioned this pull request Aug 8, 2018
@jtojnar
Copy link
Contributor

jtojnar commented Aug 13, 2018

@dtzWill What package breaks for you with meson 0.47?

@dtzWill
Copy link
Member Author

dtzWill commented Aug 13, 2018

All NixOS test VMs, blocked things for quite some time and was a beast to track down. See description at top, the breakage is subtle (although corruption of binarirs is not small issue!!) as things appear to work at first.

I'll check later (soon as I can) but most glaring symptom of our incorrectly scribbling into produced binaries is that pretty much 100% NixOS VM tests failed and probably NixOS failed to boot although I don't know that I bothered checking the latter. I think the self-check at end of producing the extras for stage 1 is where it crashed.

@dtzWill dtzWill mentioned this pull request Sep 4, 2018
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

staging can't build any NixOS test VM's? (extra-utils references libidn2 unexpectedly?)
5 participants