
support multi threaded xz encoder #1848

Merged
merged 7 commits into NixOS:master on Feb 9, 2018

Conversation

AmineChikhaoui
Member

This seems to work for me, but it will probably need more testing, and we should check whether some corner cases need handling: e.g. if the number of CPU cores cannot be detected, fall back to lzma_easy_encoder or set the thread count to 1, as suggested in some comments in the xz sources/examples.

The intention of this PR is mainly to take a stab at reducing the overhead of compression in a Hydra + binary cache setup, where a considerable amount of time seems to be spent in compression whenever hydra-queue-runner sends derivation inputs to the binary cache or uploads build outputs.

In the Hydra case the overhead of single-threaded encoding is more noticeable: e.g. most of the time spent in "Sending inputs"/"Receiving outputs" is due to compression, while the actual upload to the binary cache seems to be negligible. Some comments about possible improvements w.r.t. memory usage/threading are included inline.
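The fallback described above (undetectable core count ⇒ single-threaded encoding) can be sketched roughly like this. This is a Python illustration of the thread-count logic only, not the actual C++/liblzma change in this PR; `pick_xz_threads` is a hypothetical helper name:

```python
import os

def pick_xz_threads(requested=None):
    """Pick a thread count for the xz encoder.

    Mirrors the fallback above: an explicit setting wins; if the
    number of CPU cores cannot be detected, use a single thread,
    which amounts to falling back to the single-threaded encoder.
    """
    if requested:
        return requested
    detected = os.cpu_count()  # may return None if undetectable
    return detected if detected else 1
```

With liblzma itself, the equivalent decision is between `lzma_stream_encoder_mt` and `lzma_easy_encoder`.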
@AmineChikhaoui AmineChikhaoui changed the title [WIP] support multi threaded xz encoder support multi threaded xz encoder Feb 7, 2018
@AmineChikhaoui
Member Author

Some numbers to compare the 2 options:

[nix-shell:~/src/nix]$ cat /tmp/parallel-xz.nix
let
  pkgs = (import <nixpkgs> {});
in
  pkgs.runCommand "parallel-xz" {
    buildInputs = [ pkgs.utillinux ];
  } ''
  mkdir $out
  fallocate -l 1G  $out/foo
  ''

[nix-shell:~/src/nix]$ nix-build /tmp/parallel-xz.nix
warning: in configuration file '/etc/nix/nix.conf': unknown setting 'signed-binary-caches'
these derivations will be built:
  /nix/store/d4i32c1gcs2xz7n5x94h5834ph3mrm9m-parallel-xz.drv
building '/nix/store/d4i32c1gcs2xz7n5x94h5834ph3mrm9m-parallel-xz.drv'...
/nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz

[nix-shell:~/src/nix]$ nix copy --to "s3://amine-testing?profile=lb-dev" /nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz  -vvv
querying info about '/nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz' on 's3://amine-testing'...
copying 1 paths...
copying path '/nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz' to 's3://amine-testing'...
warning: dumping very large path (> 256 MiB); this may run out of memory
copying path '/nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz' (1073742104 bytes, compressed 100.0% in 24765 ms) to binary cache
uploaded 's3://amine-testing/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw.narinfo' (399 bytes) in 242 ms
[1 copied (1024.0 MiB)]

[nix-shell:~/src/nix]$ nix copy --to "s3://amine-testing?profile=lb-dev" /nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz  --option parallel-compression true -vvv
downloaded 's3://amine-testing/nix-cache-info' (21 bytes) in 133 ms
querying info about '/nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz' on 's3://amine-testing'...
copying 1 paths...
copying path '/nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz' to 's3://amine-testing'...
warning: dumping very large path (> 256 MiB); this may run out of memory
copying path '/nix/store/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw-parallel-xz' (1073742104 bytes, compressed 100.0% in 11477 ms) to binary cache
uploaded 's3://amine-testing/nar/1va60z9sa87wsy0lsnvir9arf3w7bsh1b1jnqxx3nbr1qjc5rxz1.nar.xz' (161192 bytes) in 732 ms
uploaded 's3://amine-testing/0hrl8vyf5xqbsj1czq0v6gkvxr22kvyw.narinfo' (399 bytes) in 263 ms
[1 copied (1024.0 MiB)]

So for a 1 GiB build output, parallel compression takes 11477 ms compared to 24765 ms with single-threaded xz. The compression ratio seems to deteriorate a bit, though:

[nix-shell:~/src/nix]$ aws s3 ls s3://amine-testing/nar/05b1bnpxhhlg5wpjfrxy3wsl2fv695wfpgfl9i1bngdw8zkxpzxd.nar.xz --profile lb-dev
2018-02-07 18:44:39     156416 05b1bnpxhhlg5wpjfrxy3wsl2fv695wfpgfl9i1bngdw8zkxpzxd.nar.xz

[nix-shell:~/src/nix]$ aws s3 ls s3://amine-testing/nar/1va60z9sa87wsy0lsnvir9arf3w7bsh1b1jnqxx3nbr1qjc5rxz1.nar.xz --profile lb-dev
2018-02-07 18:49:12     161192 1va60z9sa87wsy0lsnvir9arf3w7bsh1b1jnqxx3nbr1qjc5rxz1.nar.xz
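The small size difference is expected: a multi-threaded xz encoder compresses the input as independent blocks, so redundancy spanning block boundaries is lost and each block carries its own overhead. A rough illustration using Python's stdlib `lzma`, simulating the block split by hand (this is not how liblzma's threaded encoder is actually invoked):

```python
import lzma

# 4 MiB of zeros: highly compressible, like the fallocate'd file above.
data = bytes(4 * 1024 * 1024)

# One stream over the whole input (single-threaded style).
single = len(lzma.compress(data, preset=6))

# Independently compressed 1 MiB chunks, roughly what per-block
# parallelism amounts to: each chunk pays for its own stream
# headers and starts with an empty dictionary.
chunk = 1024 * 1024
chunked = sum(len(lzma.compress(data[i:i + chunk], preset=6))
              for i in range(0, len(data), chunk))

print(single, chunked)  # chunked is expected to be at least as large
```

The trade-off is the usual one: worse ratio by a small constant factor in exchange for near-linear speedup on the compression step.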

@edolstra
Member

edolstra commented Feb 7, 2018

@AmineChikhaoui How many cores was that?

@edolstra
Member

edolstra commented Feb 7, 2018

Hm, maybe the setting should be moved to BinaryCacheStore. Then it can be specified in the store URI, e.g. s3://nix-cache?parallel-compression=1.
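Store-URI settings like this are plain key/value query parameters appended to the store URL. As an illustration of the parsing involved (Python stdlib; `store_uri_params` is a hypothetical helper, Nix's real parsing lives in its C++ store code):

```python
from urllib.parse import urlsplit, parse_qs

def store_uri_params(uri):
    """Extract key/value parameters from a store URI such as
    's3://nix-cache?parallel-compression=1'."""
    query = urlsplit(uri).query
    # parse_qs maps each key to a list; keep the last value.
    return {k: v[-1] for k, v in parse_qs(query).items()}

params = store_uri_params("s3://nix-cache?parallel-compression=1")
# params["parallel-compression"] == "1"
```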

@AmineChikhaoui
Member Author

@edolstra that was on my laptop with 4 cores.

@AmineChikhaoui
Member Author

Yeah, I'll move the setting to the binary cache store; it seems better to have it in the URL 👍

@dtzWill
Member

dtzWill commented Feb 7, 2018

May want to touch up Hydra's handling of its "compress_num_threads" option to avoid unexpected behavior when used with a store URI that sets parallel-compression.

@AmineChikhaoui
Member Author

@dtzWill from what I can see in Hydra, compress_num_threads is only used in src/lib/Hydra/View/NixNAR.pm:

open $fh, "nix-store --dump '$storePath' | pixz -0 $pParam |";

I don't think this PR has an impact on that part, though?

@dtzWill
Member

dtzWill commented Feb 8, 2018

@AmineChikhaoui eep my mistake! Thanks for taking a look :).

@edolstra edolstra merged commit 3d2d207 into NixOS:master Feb 9, 2018