buildRustCrate: use $NIX_BUILD_CORES for each of the crates #57936

Merged
andir merged 1 commit into NixOS:master on Mar 29, 2019

Conversation

andir
Member

@andir andir commented Mar 20, 2019

Motivation for this change

I started to compile a project over and over and over and over again and was wondering why not all of my cores were being used, as they are when compiling with cargo in development mode.

@P-E-Meunier you restricted the core count to 1 back in #39003. Was there some specific reasoning?

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Assured whether relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@andir
Member Author

andir commented Mar 20, 2019

@GrahamcOfBorg build cargo-vendor

@P-E-Meunier
Contributor

The reason is the grain of parallelism we want: rustc alone is not capable of predicting the number of available cores. I believe Cargo does something like that, where the last step is parallelized on all cores. Unfortunately, this would be hard to model in a purely functional way.

If we don't restrict this, Nix runs on all available cores, and each rustc runs on all available cores, so we get n^2 threads, which is really bad for performance (and super heavy on memory).

@andir
Member Author

andir commented Mar 27, 2019

That is why I am using $NIX_BUILD_CORES since that is what we use everywhere else. It comes with the same issues of oversubscribing cores - if that is what the admin configured.

Having used that on a few projects now, it seems to just do what is expected: use up to $NIX_BUILD_CORES cores at a time (per rustc invocation).

My concern (and the reason I am asking you) was whether there was some build breakage you initially observed that would be a blocker for this.

@P-E-Meunier
Contributor

I have never seen any problem related to this, so I think it's fine. So, let's say I have 5 cores: what setting should I choose if I want to keep my cores as busy as possible, but not oversubscribe?

@andir
Member Author

andir commented Mar 28, 2019

That is a tradeoff that you have to make yourself. One simple solution would be to always have numCores = buildCores * buildJobs. Then you can have up to the maximum number of cores used during the build but not more. If all of your builds only run on a single core you are of course wasting resources. It mostly depends on the machine, the builds you are running, etc.
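
As a rough illustration of the numCores = buildCores * buildJobs rule, the sketch below lists every split of a machine's cores that satisfies it exactly. The function name `balanced_splits` is hypothetical and not part of Nix; it only enumerates the arithmetic.

```python
def balanced_splits(num_cores: int) -> list[tuple[int, int]]:
    """Return every (build_jobs, build_cores) pair whose product equals num_cores."""
    return [
        (jobs, num_cores // jobs)
        for jobs in range(1, num_cores + 1)
        if num_cores % jobs == 0  # only exact divisors give a product of num_cores
    ]


# A 5-core machine (the example asked about above) has only two exact splits.
print(balanced_splits(5))
# A 24-core machine offers more choices between many small jobs and few large ones.
print(balanced_splits(24))
```

With a prime core count such as 5, the only exact choices are one 5-core build or five 1-core builds, which is why the answer depends on the workload.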

Oversubscribing should not be that bad with decent scheduling. That is a completely different research topic.

@P-E-Meunier
Contributor

Ok, maybe I wasn't clear in my first post, because I'm not sure I get it.

Nix runs in parallel on $NIX_BUILD_CORES cores, by launching multiple processes. Rustc also tries to run in parallel. Let's say we set NIX_BUILD_CORES to the number of cores on the machine.

They don't know about each other, so they will both try to use all cores available on the machine, by launching a quadratic number of jobs. But if we let one of them deal with it completely, we could get a more optimal number of jobs:

  • if we allowed Nix to run on only one core, and Rustc on N cores, we would be using N cores sometimes, and 1 core often (especially when Rustc starts and finishes).

  • If we allowed Nix to deal with everything, and ran Rustc on 1 core each time, Nix would spawn N instances of "1-core rustcs". Maybe the final step would not utilise all the available cores (which happens to me in practice), but that final step usually takes a negligible fraction of the total time for me.

However, allowing both Nix and Rustc to start N jobs does result in the machine hanging, crazy memory usage filling all the swap and memory, etc. That has happened to me many times, and is the reason why I changed that setting from "all available cores on both" to "all available cores on Nix, 1 on Rustc". My builds now run in parallel, and much, much faster than with the previous quadratic strategy, because they don't fill the memory and don't spend their whole time in context switches.
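
The three strategies described in this comment can be compared with a little arithmetic. This is only a worst-case sketch: it assumes every Nix job and every rustc thread is runnable at the same moment, and the machine size `n = 8` and the helper name `oversubscription` are made up for illustration.

```python
def oversubscription(nix_jobs: int, rustc_threads: int, system_cores: int) -> float:
    """Worst-case runnable threads divided by physical cores (1.0 means busy but not oversold)."""
    return (nix_jobs * rustc_threads) / system_cores


n = 8  # hypothetical number of cores on the machine

# (parallel Nix jobs, threads per rustc) for each strategy in the comment above
strategies = {
    "Nix on 1 core, rustc on N": (1, n),
    "Nix on N cores, rustc on 1": (n, 1),
    "Nix on N cores, rustc on N": (n, n),
}

for name, (jobs, threads) in strategies.items():
    ratio = oversubscription(jobs, threads, n)
    print(f"{name}: up to {jobs * threads} threads, {ratio:.1f}x the cores")
```

Only the last strategy grows quadratically with the core count; the first two stay at or below one thread per core.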

@grahamc
Member

grahamc commented Mar 28, 2019

I hope I can help explain this a bit.

Nix has two relevant settings with regards to how your CPU cores will be utilized: max-jobs and cores.

  • max-jobs dictates how many separate derivations will be built at the same time. If you set this to zero, the local machine will attempt to do no builds (forcing them to be built remotely).
  • cores dictates how many cores each derivation is allowed to use. If you set this to zero, the builds will be given access to all the cores of your system. This setting determines the value of NIX_BUILD_CORES (if cores == 0 then the total number of cores in the system, else the literal value of the cores setting.)

The total number of consumed cores is a simple multiplication, cores * max-jobs (unless cores is 0... in which case total-system-cores * max-jobs)
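
The rule above can be written out as a small sketch. The function names are hypothetical, and the code only models the arithmetic described in this comment; it does not read any real Nix configuration.

```python
def nix_build_cores(cores: int, system_cores: int) -> int:
    """Value NIX_BUILD_CORES takes: all system cores if cores == 0, else the literal setting."""
    return system_cores if cores == 0 else cores


def max_consumed_cores(max_jobs: int, cores: int, system_cores: int) -> int:
    """Upper bound on simultaneously used cores: cores-per-build times concurrent builds."""
    return max_jobs * nix_build_cores(cores, system_cores)


# On a 24-core machine, cores = 0 expands to 24 before the multiplication.
print(nix_build_cores(0, 24))
print(max_consumed_cores(4, 6, 24))
print(max_consumed_cores(24, 0, 24))
```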

The balance on how to set these two independent variables depends upon each builder's workload and hardware. Here are a few example scenarios on a machine with 24 cores:

| max-jobs | cores | NIX_BUILD_CORES value | Maximum possible simultaneous processes | Result |
|---|---|---|---|---|
| 1 | 24 | 24 | 24 | One derivation will be built at a time, each one can use 24 cores. Undersold if a job can’t use 24 cores. |
| 4 | 6 | 6 | 24 | Four derivations will be built at once, each given access to six cores. |
| 12 | 6 | 6 | 72 | 12 derivations will be built at once, each given access to six cores. This configuration is over-sold. If all 12 derivations being built simultaneously try to use all six cores, the machine's performance will be degraded due to extensive context switching between the 12 builds. |
| 24 | 1 | 1 | 24 | 24 derivations can build at the same time, each using a single core. Never oversold, but derivations which require many cores will be very slow to compile. |
| 24 | 0 | 24 | 576 | 24 derivations can build at the same time, each using all the available cores of the machine. Very likely to be oversold, and very likely to suffer context switches. |

Right now, it is up to the builders to respect the host's requested cores-per-build by following $NIX_BUILD_CORES. By forcing this to be 1, Rust compilations will not adequately utilize Hydra's build resources, wasting (for big-parallel aarch64) up to 44 available cores.
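
A minimal sketch of what "following $NIX_BUILD_CORES" could look like for a builder that invokes rustc. This is not the actual buildRustCrate code (which is written in Nix and shell); the helper `rustc_args`, the fallback to 1 when the variable is unset or invalid, and the mapping onto rustc's `-C codegen-units` option are assumptions, the last one inferred from the later comments in this thread that mention codegen-units.

```python
import os
from typing import Mapping, Optional


def rustc_args(crate_root: str, env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Build a rustc command line whose parallelism follows NIX_BUILD_CORES."""
    env = os.environ if env is None else env
    try:
        cores = int(env.get("NIX_BUILD_CORES", "1"))
    except ValueError:
        cores = 1  # assumption: treat an unparsable value like an unset one
    cores = max(cores, 1)  # never ask rustc for fewer than one codegen unit
    return ["rustc", "-C", f"codegen-units={cores}", crate_root]


# Hypothetical crate path, shown with an explicit environment for clarity.
print(rustc_args("src/lib.rs", {"NIX_BUILD_CORES": "6"}))
```

Passing the environment as a parameter keeps the sketch easy to exercise without changing the process environment.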

I don't believe Nix currently reduces the actual, total number of available cores with something like taskset, but it certainly could.

Does this help?

PS: I have duplicated this in the Nix manual: NixOS/nix#2749

@P-E-Meunier
Contributor

Ok, this is more clever and subtle than how I usually use Nix. So, indeed setting the variable to $NIX_BUILD_CORES seems the only reasonable thing to do, I agree 100%.

@andir andir merged commit e0b4356 into NixOS:master Mar 29, 2019
@andir andir deleted the build-rust-crate-nix-build-cores branch March 29, 2019 11:30
@lagoda-antithesis

After trying to track down a Rust build that wasn't reproducible across machines, I'm pretty sure that this PR introduced a non-reproducibility, because codegen-units actually changes the resulting binary.

I opened an issue with a demonstration, but was advised to comment here by @nh2.

davidscherer added a commit to davidscherer/nixpkgs that referenced this pull request Apr 30, 2022
The codegen-units setting affects the build output and must be set consistently.
@davidscherer davidscherer mentioned this pull request Apr 30, 2022
9 tasks