Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdenv: introduce -frandom-seed #102251

Merged
merged 1 commit into from Nov 19, 2020
Merged

stdenv: introduce -frandom-seed #102251

merged 1 commit into from Nov 19, 2020

Conversation

andir
Copy link
Member

@andir andir commented Oct 31, 2020

Motivation for this change

This is some random work that I did while looking at r13y.com and rsync.

Apparently the gcc-unwrapped attribute reproduces.

cc @zimbatm

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@ofborg ofborg bot added the 6.topic: stdenv Standard environment label Oct 31, 2020
@@ -0,0 +1,2 @@
#export NIX_SET_BUILD_ID=${SOURCE_DATE_EPOCH:1}
export NIX_CFLAGS_COMPILE+=" -frandom-seed=${SOURCE_DATE_EPOCH:1} "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about

Suggested change
export NIX_CFLAGS_COMPILE+=" -frandom-seed=${SOURCE_DATE_EPOCH:1} "
export NIX_CFLAGS_COMPILE+=" -frandom-seed=$out "

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question. We also had that idea.

Andi's argument was that it would prevent content-addressable outputs since the output will ten always change if $out changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I even think that $SOURCE_DATE_EPOCH has that problem. I've not yet come up with a good idea here. Apparently, based on random googling, a single, static value that is the same across all builds isn't great. This went to the extend of being different per input file (e.g. it's hash). I am not convinced or an expert here :/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up going with the bash equivalent of basename $out as I want to avoid embedding a store path into the binary (and in e.g. multi-output derivations carrying a reference).

This adds -frandom-seed to each compiler invocation in stdenv. The
object here is to make the compierl invocations produce the same output
every time they are called (for the same derivation). When the
-frandom-seed option is not set the compiler will use a combination of
random numbers (in GCC's case from /dev/urandom) and the durrent time to
produce a "random" input per file. This can (among other things) lead to
different ordering of symbols in the produced object files.

For reason of reproducibility we prefer having the same derivation
produce the exact same outputs. This is not a silver bullet but one way
to tame the compiler.
@andir andir changed the base branch from master to staging November 1, 2020 18:41
@andir andir marked this pull request as ready for review November 1, 2020 18:41
@andir andir changed the title WIP: introduce -frandom-seed in stdenv stdenv: introduce -frandom-seed Nov 1, 2020
@andir
Copy link
Member Author

andir commented Nov 1, 2020

@Ericson2314 Would be nice if you can give us some feedback on this. This effectively made derivations more reproducible. It solves all of the cases where e.g. symbol ordering in the build result would be random across rebuilds. This was discovered why looking at the current state of https://r13y.com and the rsync derivation.

Copy link
Member

@zimbatm zimbatm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rsync is now reproducible

@andir andir merged commit 278b273 into NixOS:staging Nov 19, 2020
@andir andir deleted the random-seed branch November 19, 2020 00:08
@SuperSandro2000 SuperSandro2000 mentioned this pull request Nov 24, 2020
10 tasks
@zimbatm zimbatm added this to In progress in R13y via automation Nov 24, 2020
@zimbatm zimbatm moved this from In progress to Done in R13y Nov 24, 2020
@r-burns
Copy link
Contributor

r-burns commented Dec 13, 2020

I think this broke GCC cross builds (i.e. when build != host == target) somehow, by causing cyclical references in the outputs (out <-> lib)

cycle detected in the references of '/nix/store/3gli04fpx2cn31h5b1sj8hk1j8f8rlbi-gcc-debug-9.3.0-powerpc64le-unknown-linux-musl' from '/nix/store/av9qib1pma6c5vdm74rind16f36xc33v-gcc-debug-9.3.0-powerpc64le-unknown-linux-musl-lib'

I bisected the issue with pkgsCross.powernv.gcc down to this merge, and it looks like it might have caused similar failures for other platforms' cross builds:
https://hydra.nixos.org/eval/1631713#tabs-now-fail

I'm not sure why this is happening - any ideas?

@r-burns
Copy link
Contributor

r-burns commented Dec 13, 2020

Looks like the out path ends up in the random seed, which is stored along with the rest of the command line args:

strings /nix/store/89ircz9c1930mald5pqji1vb25ds1jsz-gcc-debug-9.3.0-powerpc64le-unknown-linux-gnu-lib/lib/libasan.so.5.0.0 | grep 2vbd2sf8mmkpv4z6fx4wd90qhka1hfdw
GNU C++11 9.3.0 -g -g -O2 -O2 -O2 -std=gnu++11 -fstack-protector-strong -fno-strict-overflow -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -fno-ipa-icf -fPIC -frandom-seed=2vbd2sf8mmkpv4z6fx4wd90qhka1hfdw-gcc-debug-9.3.0-powerpc64le-unknown-linux-gnu --param ssp-buffer-size=4
GNU C++11 9.3.0 -g -g -O2 -O2 -O2 -std=gnu++11 -fstack-protector-strong -fno-strict-overflow -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -fPIC -frandom-seed=2vbd2sf8mmkpv4z6fx4wd90qhka1hfdw-gcc-debug-9.3.0-powerpc64le-unknown-linux-gnu --param ssp-buffer-size=4
GNU C17 9.3.0 -g -g -O2 -O2 -O2 -fstack-protector-strong -fno-strict-overflow -fPIC -frandom-seed=2vbd2sf8mmkpv4z6fx4wd90qhka1hfdw-gcc-debug-9.3.0-powerpc64le-unknown-linux-gnu --param ssp-buffer-size=4
GNU C++11 9.3.0 -g -g -O2 -O2 -O2 -std=gnu++11 -fstack-protector-strong -fno-strict-overflow -fno-rtti -fno-exceptions -fPIC -frandom-seed=2vbd2sf8mmkpv4z6fx4wd90qhka1hfdw-gcc-debug-9.3.0-powerpc64le-unknown-linux-gnu --param ssp-buffer-size=4

@symphorien
Copy link
Member

local randomseed="${out##*/}"
export NIX_CFLAGS_COMPILE+=" -frandom-seed=${randomseed:0:10}"

should fix this issue, because the full hash is not present anymore.

@andir
Copy link
Member Author

andir commented Dec 14, 2020 via email

@symphorien
Copy link
Member

It looks like the hash is enough:

/tmp 
$  cat foo.nix
with import <nixpkgs> {};
runCommand "foo" {} ''
  local foo=${sl}; echo ''${foo##*/} > $out
''

/tmp 
$  nix-build foo.nix
these derivations will be built:
  /nix/store/zhb7sp79ijsq6cx68dah18qd3jcsfi8x-foo.drv
building '/nix/store/zhb7sp79ijsq6cx68dah18qd3jcsfi8x-foo.drv'...
/nix/store/6gj5v1vh6n8c3djf7csqshbvsbhigb22-foo
                                                                                                                                                                                              
/tmp 
$  cat result
zbylvjxqs1qvw25n655sg6vly1rb2wfk-sl-5.05
                                                                                                                                                                                              
/tmp 
$  nix-store -q --references ./result
/nix/store/zbylvjxqs1qvw25n655sg6vly1rb2wfk-sl-5.05
                                                                                                                                                                                              

@andir
Copy link
Member Author

andir commented Dec 14, 2020 via email

@r-burns
Copy link
Contributor

r-burns commented Dec 14, 2020

I fail to see how this is a flaw in nix - checking the hash is more robust to scripts joining individual path components, no?

@r-burns
Copy link
Contributor

r-burns commented Dec 15, 2020

I had an idea for a different approach - what if this logic is moved to the cc-wrapper script, so that the random seed can depend on the input files? Then we can follow GCC's recommendation to use a different seed for each file.

Anyway, the suggested trimmed seed seems to work for me so I will submit a patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
R13y
Done
Development

Successfully merging this pull request may close these issues.

None yet

5 participants