Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

firefox: fix running wayland firefox built with LTO and some miscellaneous improvements #106617

Merged
merged 7 commits into from Jan 16, 2021

Conversation

S-NA
Copy link
Contributor

@S-NA S-NA commented Dec 11, 2020

Motivation for this change

Resolves #101429 which is Firefox crashing when running under wayland and being built with LTO. This means building with LTO support by default on Linux once again. There are also some commits with cleaning up the overall expression which hopefully is fine but if it is preferred to pull them out to another pull request I can do that.

The commits contain relevant information on the changes done but to recap, the issue seemed to be a borked generation of dependentlibs.list which caused Firefox's wayland stub library to be loaded first. I could not discern why readelf -d was returning incorrect results when building under LTO support but fine when building without LTO. However, switching to a different tool in this case llvm-objdump fixed the issue.

I have built and tested firefox with LTO support on X11 and Wayland. Would appreciate if the other firefox variants could be tested too.

cc @andir @artemist

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@lovesegfault
Copy link
Member

Tested and working!

@@ -341,14 +341,6 @@ buildStdenv.mkDerivation ({
gappsWrapperArgs+=(--argv0 "$out/bin/.${binaryName}-wrapped")
'';

postFixup = lib.optionalString buildStdenv.isLinux ''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this break executing firefox without the wrapper? If so this sound wrong. I've never really used the wrapper much...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it still executes and functions as a web browser. Both are optional runtime libraries and firefox can function fine without them. That being said it does degrade the feature set provided by firefox-unwrapped. Namely firefox goes back to using non-native notifications and there is no more screensaver inhibitor. I am not sure if that is a problem though since firefox-unwrapped already lacks things like VA-API, audio, etc.

@Mic92 Mic92 removed their request for review December 13, 2020 03:47
@ajs124
Copy link
Member

ajs124 commented Dec 25, 2020

Can you rebase this on current master? The 84 release upgrade probably caused some conflicts.

Are there any upstream issues or do we know why we're running into this when it's working fine for other distros? Is our readelf any different from what they use?

This looks reasonable enough to me, otherwise. I know you're not officially maintaining firefox anymore @andir, but I would still like to know if you have any more concerns/objections besides the one you've brought up about firefox-unwrapped.

@andir
Copy link
Member

andir commented Dec 25, 2020

IMHO just switch to clang stdenv as that is the only upstream tested compiler...

@S-NA
Copy link
Contributor Author

S-NA commented Dec 27, 2020

Are there any upstream issues or do we know why we're running into this when it's working fine for other distros? Is our readelf any different from what they use?

There is now, bmo#1684291. Seems that the LTO supporting stdenv provides a symlink of readelf to llvm-readelf (or is that part of the LLVM binutils package/PATH priority?). Anyway, it really should not matter since llvm-readelf is supposed to provide GNU style output by default; however, Firefox's dependentlibs.py is sensitive to some trivial white-space discrepancy that causes the script to fail.

I suppose this also means if there is a way to make stdenv provide readelf from GNU binutils the patches could be dropped. (Or default to GNU binutils in general.)

It was added for nspr and nss back in the 55.0.3 to 56.0 upgrade. It
also served as a workaround for an undeclared gio-unix-2.0 dependency.
Sometime afterwards nspr was removed, leaving just the two. Since then,
upstream has added a declaration for gio-unix-2.0 (in FF62). As for the
nss include it seemingly has no purpose since current firefox builds
with it removed.
It only affected FF80 so place an upper bound restriction. See
bmo#1661096 for details.

This fixes substituteStream() warnings about missing patterns which
appeared in the logs.
Change `lib.optionals a [ b ]` to `lib.optional a b`.
Firefox has a number of optional dependencies that get dlopened.
Instead of using patchelf to set the RPATH use LD_LIBRARY_PATH.
The motivation for this is we already set LD_LIBRARY_PATH in the
wrapper on Linux.
This was required to solve the XPCOMGlueLoad error when building with
LTO. However, it turns out libxul.so is supposed to have some libraries
that are reported as not found by ldd. Setting the RPATH worked around
the error as it forced dependency resolution but failed to fix the real
issue of broken generation of dependentlibs.list.

The libraries that are reported as not found by ldd are supposed to be
dlopened through the logic found in nsXPCOMGlue.cpp. However since the
generation of dependentlibs.list is broken under LTO this did not
happen. Instead of pulling libwayland-client.so from the GTK libraries
it found the stub library first (libmozwayland.so). The stub library
causes (as it should) wl_display_connect to always return NULL which is
the cause of the segmentation fault and LTO breaking wayland support.

Remove the hardcoded path used for the XPCOMGlueLoad error workaround
in NIX_LDFLAGS. libunwind is still unfortunately needed. Once the issue
of the generation of dependentlibs.list being borked is fixed it should
remedy the wayland crash issue on LTO.
Enable LTO support on Linux by default again.

Add patch to fix dependentlibs.list generation under LTO. This is
necessary for fixing firefox-wayland crashing when built with LTO.

Add makeFlags which set ar, ranlib, and nm to be llvm-ar, llvm-ranlib
and llvm-nm when building with llvm-based LTO. (bmo#1480005)
Python 2 is no longer required to build Firefox.
@S-NA
Copy link
Contributor Author

S-NA commented Jan 13, 2021

Upstream has accepted a patch, changes will arrive in Firefox 86. I rebased off of current master and changed that the patch is fetched from upstream for Firefox 84/85. Nixpkgs still will have to carry a patch locally for firefox-esr.

I expect firefox to build and function as normal since these changes are not different from the ones prior, but I am still going to build firefox and test everything works against master.

Please let me know if there is anything else I need to do to have this merged.

@lovesegfault
Copy link
Member

Testing this

@lovesegfault
Copy link
Member

lovesegfault commented Jan 14, 2021

Tested and working. I also ran some benchmarks:

Without LTO

Speedometer

Iteration 1	 35.97 runs/min
Iteration 2	 35.23 runs/min
Iteration 3	 34.02 runs/min
Iteration 4	 37.19 runs/min
Iteration 5	 36.18 runs/min
Iteration 6	 34.07 runs/min
Iteration 7	 35.89 runs/min
Iteration 8	 34.87 runs/min
Iteration 9	 35.49 runs/min
Iteration 10     36.10 runs/min
Arithmetic Mean:35.5 ± 0.71 (2.0%)

MotionMark 1.1

114.71

Size

$ du -L -sh /nix/store/fbswdb1w6ajxc5244c02pqz7n984v80i-firefox-84.0.2/
185M	/nix/store/fbswdb1w6ajxc5244c02pqz7n984v80i-firefox-84.0.2/

With LTO

Speedometer

Iteration 1	 37.22 runs/min
Iteration 2	 35.54 runs/min
Iteration 3	 35.88 runs/min
Iteration 4	 36.26 runs/min
Iteration 5	 34.95 runs/min
Iteration 6	 37.55 runs/min
Iteration 7	 34.65 runs/min
Iteration 8	 37.83 runs/min
Iteration 9	 34.46 runs/min
Iteration 10     37.44 runs/min
Arithmetic Mean:36.2 ± 0.91 (2.5%)

~1.97% mean improvement

MotionMark 1.1

117.90

~2.78% improvement

Size

$ du -L -sh /nix/store/siavzav33v0zrq1bny9m9qnl4m1xa1hc-firefox-84.0.2/
183M	/nix/store/siavzav33v0zrq1bny9m9qnl4m1xa1hc-firefox-84.0.2/

~1.08% improvement

@lovesegfault
Copy link
Member

firefox-bin

Speedometer

Iteration 1  39.27 runs/min
Iteration 2  39.73 runs/min
Iteration 3  39.12 runs/min
Iteration 4  39.30 runs/min
Iteration 5  37.57 runs/min
Iteration 6  38.02 runs/min
Iteration 7  33.87 runs/min
Iteration 8  34.18 runs/min
Iteration 9  34.64 runs/min
Iteration 10 38.75 runs/min
Arithmetic Mean:37.4 ± 1.7 (4.4%)

~3.31% mean improvement (over LTO)

MotionMark 1.1

118.65

~0.64% improvement (over LTO)

Size

$ du -L -sh /nix/store/69jn8rhrzlkgq48acqj0kw5xhzdav6vp-firefox-bin-84.0.2/
228M	/nix/store/69jn8rhrzlkgq48acqj0kw5xhzdav6vp-firefox-bin-84.0.2/

~24.6% worse (than LTO)

@lovesegfault
Copy link
Member

@S-NA Could you try using clangStdenv?

@lovesegfault
Copy link
Member

This patch compiles and works, FWIW

diff --git a/pkgs/applications/networking/browsers/firefox/common.nix b/pkgs/applications/networking/browsers/firefox/common.nix
index 123c3cc08a6..4cb149f3efe 100644
--- a/pkgs/applications/networking/browsers/firefox/common.nix
+++ b/pkgs/applications/networking/browsers/firefox/common.nix
@@ -2,7 +2,7 @@
 , src, unpackPhase ? null, patches ? []
 , extraNativeBuildInputs ? [], extraConfigureFlags ? [], extraMakeFlags ? [] }:
 
-{ lib, stdenv, pkgconfig, pango, perl, python3, zip
+{ lib, clangStdenv, pkgconfig, pango, perl, python3, zip
 , libjpeg, zlib, dbus, dbus-glib, bzip2, xorg
 , freetype, fontconfig, file, nspr, nss, nss_3_53
 , yasm, libGLU, libGL, sqlite, unzip, makeWrapper
@@ -18,12 +18,12 @@
 
 ## optional libraries
 
-, alsaSupport ? stdenv.isLinux, alsaLib
-, pulseaudioSupport ? stdenv.isLinux, libpulseaudio
+, alsaSupport ? clangStdenv.isLinux, alsaLib
+, pulseaudioSupport ? clangStdenv.isLinux, libpulseaudio
 , ffmpegSupport ? true
 , gtk3Support ? true, gtk2, gtk3, wrapGAppsHook
 , waylandSupport ? true, libxkbcommon
-, ltoSupport ? stdenv.isLinux, overrideCC, buildPackages
+, ltoSupport ? clangStdenv.isLinux, overrideCC, buildPackages
 , gssSupport ? true, kerberos
 , pipewireSupport ? waylandSupport && webrtcSupport, pipewire
 
@@ -71,22 +71,22 @@
 , enableOfficialBranding ? true
 }:
 
-assert stdenv.cc.libc or null != null;
+assert clangStdenv.cc.libc or null != null;
 assert pipewireSupport -> !waylandSupport || !webrtcSupport -> throw "pipewireSupport requires both wayland and webrtc support.";
-assert ltoSupport -> stdenv.isDarwin -> throw "LTO is broken on Darwin (see PR#19312).";
+assert ltoSupport -> clangStdenv.isDarwin -> throw "LTO is broken on Darwin (see PR#19312).";
 
 let
   flag = tf: x: [(if tf then "--enable-${x}" else "--disable-${x}")];
 
-  default-toolkit = if stdenv.isDarwin then "cairo-cocoa"
+  default-toolkit = if clangStdenv.isDarwin then "cairo-cocoa"
                     else "cairo-gtk${if gtk3Support then "3${lib.optionalString waylandSupport "-wayland"}" else "2"}";
 
   binaryName = "firefox";
   binaryNameCapitalized = lib.toUpper (lib.substring 0 1 binaryName) + lib.substring 1 (-1) binaryName;
 
-  browserName = if stdenv.isDarwin then binaryNameCapitalized else binaryName;
+  browserName = if clangStdenv.isDarwin then binaryNameCapitalized else binaryName;
 
-  execdir = if stdenv.isDarwin
+  execdir = if clangStdenv.isDarwin
             then "/Applications/${binaryNameCapitalized}.app/Contents/MacOS"
             else "/bin";
 
@@ -98,7 +98,7 @@ let
   # clang LTO on Darwin is broken so the stdenv is not being changed.
   # Target the LLVM version that rustc -Vv reports it is built with for LTO.
   # rustPackages_1_45 -> LLVM 10, rustPackages -> LLVM 11
-  llvmPackages = if stdenv.isDarwin
+  llvmPackages = if clangStdenv.isDarwin
                  then buildPackages.llvmPackages
                  else if lib.versionAtLeast rustc.llvm.version "11"
                       then buildPackages.llvmPackages_11
@@ -107,8 +107,8 @@ let
   # When LTO for Darwin is fixed, the following will need updating as lld
   # doesn't work on it. For now it is fine since ltoSupport implies no Darwin.
   buildStdenv = if ltoSupport
-                then overrideCC stdenv llvmPackages.lldClang
-                else stdenv;
+                then overrideCC clangStdenv llvmPackages.lldClang
+                else clangStdenv;
 
   nss_pkg = if lib.versionOlder ffversion "83" then nss_3_53 else nss;
 in

@S-NA
Copy link
Contributor Author

S-NA commented Jan 15, 2021

@lovesegfault I do not mind applying that patch but from what I can tell (using nix repl) both overrideCC clangStdenv llvmPackages.lldClang and overrideCC stdenv llvmPackages.lldClang end up pointing to the same /nix/store/<hash>-stdenv-linux. The only difference is that non-LTO will be built with clangStdenv (I have not confirmed if it builds). If that is the intended goal I will go ahead apply and test the patch with ltoSupport set to false.

@lovesegfault
Copy link
Member

@lovesegfault I do not mind applying that patch but from what I can tell (using nix repl) both overrideCC clangStdenv llvmPackages.lldClang and overrideCC stdenv llvmPackages.lldClang end up pointing to the same /nix/store/<hash>-stdenv-linux. The only difference is that non-LTO will be built with clangStdenv (I have not confirmed if it builds). If that is the intended goal I will go ahead apply and test the patch with ltoSupport set to false.

I think ultimately this is @andir's call since he knows this best.

Personally though, I am in favor of just always building FF with clang, regardless of LTO being enabled.

@andir
Copy link
Member

andir commented Jan 15, 2021 via email

@S-NA
Copy link
Contributor Author

S-NA commented Jan 16, 2021

I don't care about Firefox anymore. Infact I purged it from my systems after the Firefox extensions garbage.

That is understandable, my apologies. I agree with your points and will keep them in mind.

I will be comfortable to converting non-LTO Firefox to use Clang but only after this PR gets merged (and some time passes). I want to rely on being able to fall back to a known working configuration, even if it deviates from upstream though I do agree we should not deviate from upstream, and would like to see non-LTO Firefox be built with Clang. As witnessed prior when Firefox built with LTO support was merged it was discovered it broke Wayland support (sorry). Being able to fall back to a known working configuration (hopefully) minimized the annoyance while I could spend the time working out the issue. As this PR goes it does bring nixpkg's Firefox closer to being built how upstream's Firefox is.

@lovesegfault
Copy link
Member

Alright, I will merge this as-is and we can convert to clang-by-default later.

Thanks for your work on this @S-NA :)

@lovesegfault lovesegfault merged commit aa4a14b into NixOS:master Jan 16, 2021
S-NA added a commit to S-NA/nixpkgs that referenced this pull request Jan 17, 2021
This is a follow up to NixOS#106617 which brought LTO support but broke the
AArch64 Firefox builds.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Firefox 82 and firefox-bin 82 do not run under Wayland when LTO is enabled
4 participants