Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

treewide: migrate to -fno-common #110571

Merged
merged 1 commit into from Jul 7, 2022
Merged

treewide: migrate to -fno-common #110571

merged 1 commit into from Jul 7, 2022

Conversation

Gaelan
Copy link
Contributor

@Gaelan Gaelan commented Jan 23, 2021

Motivation for this change

GCC 10 defaults to -fno-common, a change that broke some packages. To ease the transition when migrating to GCC 10, we changed the default back in our build of GCC. This commit undoes that, bringing us closer to upstream and reducing our technical debt. (Clang, which made the same change, also cites "performance and code-size benefits"; it's unclear if this is the case for GCC.)

In an effort to reduce the potential breakage from this PR, I've made no effort to patch or upgrade packages broken by this. Instead, for any package that fails to build, I fix it by setting NIX_CFLAGS_COMPILE = "-fcommon", i.e. returning to the status quo. This brings the vast majority of packages onto the new default, and has no risk of introducing new bugs due to incorrect patches. As the rest of the world migrates towards GCC 10, most of these will probably get fixed by upstream, and we can remove the flag; for the rest, we can decide whether to write/find a patch or keep the flag indefinitely.

This PR is not ready to merge at this point. Before doing so, I'd like to try building at least these things:

  • The configuration.nix of the laptop I'm testing this on
  • Some (or all) of the configurations in release.nix
  • Packages listed as broken (or previously broken) in the Gentoo and Fedora issues for migrating to GCC 10.

It might also be a good idea to set up a hydra job.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@Gaelan
Copy link
Contributor Author

Gaelan commented Jan 23, 2021

Huh, it's weird that this causes an stdenv rebuild on darwin. It should only affect GCC.

@SuperSandro2000
Copy link
Member

https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html

-fcommon
In C code, this option controls the placement of global variables defined without an initializer, known as tentative definitions in the C standard. Tentative definitions are distinct from declarations of a variable with the extern keyword, which do not allocate storage.

The default is -fno-common, which specifies that the compiler places uninitialized global variables in the BSS section of the object file. This inhibits the merging of tentative definitions by the linker so you get a multiple-definition error if the same variable is accidentally defined in more than one compilation unit.

The -fcommon places uninitialized global variables in a common block. This allows the linker to resolve all tentative definitions of the same variable in different compilation units to the same object, or to a non-tentative definition. This behavior is inconsistent with C++, and on many targets implies a speed and code size penalty on global variable references. It is mainly useful to enable legacy code to link without errors.

Copy link
Contributor

@r-burns r-burns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this would warrant a hydra job. Since this primarily affects linux, it could be a good task for spare x86_64-linux builders.

# GCC 10 changed this default, causing cpio not to compile. This was
# necessary as of Jan 2021 (version 2.13); if a new release comes out, check
# if it's still needed and remove it if not.
NIX_CFLAGS_COMPILE = "-fcommon";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The darwin stdenv contains cpio, but we can avoid the rebuild:

Suggested change
NIX_CFLAGS_COMPILE = "-fcommon";
NIX_CFLAGS_COMPILE = if stdenv.cc.isGNU then "-fcommon" else null;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it looks like gentoo has a fix for this: https://bugs.gentoo.org/705900

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It goes to staging anyway so we don't really need to worry about stdenv rebuilds.

Copy link
Contributor Author

@Gaelan Gaelan Jan 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, that solves the Darwin mystery.

My inclination is to use CFLAGS instead of patches initially, because it keeps the PR simple and seems less likely to break things; we can always switch to patches on a per-package basis later. But I can probably be convinced otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into this for #126411 too. I opted for patches in the cases I came across because I figured they'd break sooner and trigger people to reevaluate them. For cpio in particular I've also sent an email to upstream (no response so far). So I'm in favor of having this apply for Darwin and clang too : )

@Gaelan
Copy link
Contributor Author

Gaelan commented Jan 25, 2021

iso_minimal now builds. I can upload to cachix if it helps anyone.

@Gaelan
Copy link
Contributor Author

Gaelan commented Jan 25, 2021

Bah, there's a merge conflict, but if I try to merge in aa8868c (the output of git merge-base origin/master origin/staging), the PR pulls in a bunch of unrelated commits for some reason.

@ruuda
Copy link
Contributor

ruuda commented Jan 25, 2021

There’s a conflict in squashfsTools because lz4 support recently became unconditional, and it touched related lines.

@bobrik bobrik mentioned this pull request Feb 4, 2021
21 tasks
@thefloweringash
Copy link
Member

Clang 11 is making the same change (changelog, D75056) so these changes shouldn't be restricted to GCC. I ran into this when working on aarch64-darwin which has a hydra job. It could be useful to pick through the set of new failures in the change from clang 10 to clang 11: https://hydra.nixos.org/eval/1646384?compare=1646102#tabs-now-fail.

@Gaelan
Copy link
Contributor Author

Gaelan commented Feb 15, 2021

Sorry this PR is languishing (and for all the force pushes) - I can't manage to fix the merge conflict without GitHub pulling in thousands of unrelated commits. I thought git merge xyz, where xyz is a commit appearing in both master and staging, would do the trick, but it doesn't seem to work.

@trofi
Copy link
Contributor

trofi commented Jun 18, 2022

All failing packages noticed on hydra are now fixed in master or staging. I'd like to do one extra run once staging trickles down into master to check if we have any extra darwin fallout: #177171

After that we'll be ready for merge.

@trofi
Copy link
Contributor

trofi commented Jun 20, 2022

@vcunat
Copy link
Member

vcunat commented Jun 25, 2022

Jobset: as x86_64-darwin is now catching up about a week of outage on Hydra, I canceled it here and put in aarch64-darwin instead (was idling right now).

@trofi trofi force-pushed the no-common branch 2 times, most recently from 3fdf85c to e9ee2e0 Compare July 2, 2022 17:16
GCC 10 sets -fno-common by default. This broke some packages, so
when moving to GCC 10 we initially disabled this behavior. This
commit reverts that, bringing us closer to the standard and
upstream.

Co-authored-by: Sergei Trofimovich <slyich@gmail.com>
@trofi trofi changed the title WIP: treewide: migrate to -fno-common treewide: migrate to -fno-common Jul 6, 2022
@trofi trofi marked this pull request as ready for review July 6, 2022 07:40
@trofi trofi requested a review from matthewbauer as a code owner July 6, 2022 07:40
@trofi
Copy link
Contributor

trofi commented Jul 6, 2022

I hereby declare all (~240) the knows failures to be fixed. Should be ready for staging.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/help-disabling-fno-common-hack/19031/3

@Artturin
Copy link
Member

Artturin commented Jul 6, 2022

@ofborg build nixosTests.gnome

@vcunat vcunat merged commit 43f9c19 into NixOS:staging Jul 7, 2022
@vcunat
Copy link
Member

vcunat commented Jul 7, 2022

Thanks. This must've taken really lots of work. I'm convinced it's in good enough shape for staging. Minor remaining issues can always get fixed during staging-next.

@trofi
Copy link
Contributor

trofi commented Jul 9, 2022

Recent package additions that fail now:

@Artturin
Copy link
Member

Artturin commented Jul 9, 2022

Thank you for your efforts trofi

@trofi trofi mentioned this pull request Jul 14, 2022
13 tasks
vbgl added a commit to vbgl/nixpkgs that referenced this pull request Aug 3, 2022
vbgl added a commit that referenced this pull request Aug 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet