Add a fake uname implementation to the Linux stdenv (that returns hardcoded values) #25240

Closed

dezgeg
Contributor

@dezgeg dezgeg commented Apr 26, 2017

This has two purity advantages:

  • This makes uname -m return the processor architecture we're building
    for, not the architecture the build is running on. This mainly
    affects ARM, where we want to build armv6l packages on an armv7l
    machine. The v7 processors are very capable of executing v6 code,
    but many build systems will look at uname -m and decide to do
    different things based on the result. Examples:

    • coreutils & openssl both add some ARMv7-specific FPU optimization
      flags to gcc's command line which causes the binaries to SIGILL
      on ARMv6 processors.
    • The emacs, perl & ruby builds create machine-specific directories
      like $out/libexec/emacs/25.1/armv6l-unknown-linux-gnueabihf

    Both of these are now fixed. And no, unlike PER_LINUX32 (which
    makes uname -m report i686 when run on x86_64) there is no kernel
    option to achieve the same thing for returning armv6l on an ARMv7.

  • Some packages like Xorg capture the kernel version where the build
    was done (it prints "Build Operating System: Linux 4.4.45 x86_64"
    to the logs). This improves binary reproducibility of such packages.

I've added this just for Linux for now (as it ended up requiring tweaking the bootstrap stages to get it right) but in principle it could be added for Darwin as well if there's a use case.
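
A minimal sketch of what a constant-data uname could look like, written as a shell function; it is not the code from this PR. The function name `fake_uname` is hypothetical, the node name, release and version strings are the ones shown in the diff excerpt later in this thread, and the `Linux` kernel name and `armv6l` machine are assumptions for illustration. It handles only separate `-s`, `-n`, `-r`, `-v` and `-m` flags and prints fields in argument order, so it is far from a complete coreutils-compatible replacement.

```shell
# Hypothetical fake uname: every value is hardcoded, nothing is read
# from the running kernel.
fake_uname() {
    sysname="Linux"                                      # assumed kernel name
    nodename="localhost"
    release="4.4.0-not-a-real-version"
    version="#1 SMP PREEMPT Thu Jan 1 00:00:01 GMT 1970"
    machine="armv6l"                                     # assumed target machine

    # Like coreutils uname, default to the kernel name when no flag is given.
    if [ "$#" -eq 0 ]; then
        set -- -s
    fi

    out=""
    for arg in "$@"; do
        case "$arg" in
            -s) field=$sysname ;;
            -n) field=$nodename ;;
            -r) field=$release ;;
            -v) field=$version ;;
            -m) field=$machine ;;
            *)  echo "fake_uname: unsupported option: $arg" >&2; return 1 ;;
        esac
        # Join the requested fields with single spaces.
        out="${out:+$out }$field"
    done
    printf '%s\n' "$out"
}
```

A build script that runs `fake_uname -m` would then see `armv6l` regardless of the processor the build runs on.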

cc @edolstra

This is an implementation of the uname command that just returns
constant data. It should be compatible with coreutils uname.

Makes it much easier to inspect what's included in a particular stdenv
stage.
@mention-bot

@dezgeg, thanks for your PR! By analyzing the history of the files in this pull request, we identified @Ericson2314, @errge and @edolstra to be potential reviewers.

@dezgeg dezgeg changed the base branch from master to staging April 26, 2017 10:19
@dezgeg dezgeg changed the title from "Add a fake uname implementation to stdenv (that returns hardcoded values)" to "Add a fake uname implementation to the Linux stdenv (that returns hardcoded values)" Apr 26, 2017
@expipiplus1
Contributor

expipiplus1 commented Apr 26, 2017

I was playing with a similar solution last week (https://github.com/expipiplus1/nixpkgs/blob/17c8652edbe164457e763d37cdaea03cd7251f91/pkgs/tools/misc/coreutils/uname-armv7l.patch) and discovered that (annoyingly) some packages use a system call to determine the machine instead of calling uname; the Go 1.4 compiler is an example.

@grahamc @bennofs mused if it would be possible to use some systemtap magic to intercept the system call

@bennofs has suggested overriding the uname function in libc by using LD_PRELOAD

Edit, I misremembered who mentioned systemtap

@edolstra
Member

I'm not really in favor of this. It seems a bit hacky and adds considerable complexity to stdenv. (E.g. you have to be careful to ensure that fake-uname appears before coreutils' uname.)

An alternative would be to patch coreutils' uname to return a deterministic value based on some environment variable (say NIX_UNAME=<sysname>:<machine>:...). Also hacky but it would probably be just a few lines of code and wouldn't require duplicating all of uname's argument parsing. It wouldn't fix the uname() syscall, but please let's not do global LD_PRELOAD hacks or (shudder) systemtap magic just for go.

(Actually the environment variable hack could also be done in Glibc's uname(). Then it would work for everything except direct syscalls.)
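
To make the environment-variable idea concrete, here is a rough sketch in shell rather than as a patch to coreutils or glibc. `NIX_UNAME` and its `<sysname>:<machine>:...` layout come from the comment above; the function name `nix_uname_machine` and the fallback to the real `uname -m` when the variable is unset or malformed are assumptions.

```shell
# Hypothetical helper: report the machine field from NIX_UNAME if it is set,
# otherwise fall back to whatever the real uname reports.
nix_uname_machine() {
    case "${NIX_UNAME:-}" in
        *:*)
            # Strip the leading "<sysname>:" and then anything after the
            # next ':' so only the machine field remains.
            rest=${NIX_UNAME#*:}
            printf '%s\n' "${rest%%:*}"
            ;;
        *)
            # Unset or without a ':' separator: use the real value.
            uname -m
            ;;
    esac
}
```

With `NIX_UNAME=Linux:armv6l` set in the build environment, `nix_uname_machine` prints `armv6l`. As noted above, this kind of override only helps programs that go through the patched code path, not ones that issue the syscall directly.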

@dezgeg
Contributor Author

dezgeg commented Apr 26, 2017

The glibc idea sounds good, I will give that a shot...

@dezgeg
Contributor Author

dezgeg commented Apr 26, 2017

Apparently that's not easy, as the glibc build just auto-generates some wrappers from a system call list instead of having uname() written in C code...

@edolstra
Member

Oh, too bad :-(

@bennofs
Contributor

bennofs commented Apr 26, 2017

@dezgeg perhaps you can remove uname from the "list of auto-generated stuff" and add it manually?

@dezgeg
Contributor Author

dezgeg commented Apr 26, 2017

That seems too complex and risky for very little gain. For Go you can set GOHOSTARCH, GOARCH and GOARM manually anyway to avoid the autodetection. And in fact you really have to set them to avoid impurities because the build system also tries to execute various floating point instructions and checks if a SIGILL was raised or not...

Coreutils seems a better place to hack.
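
As a sketch of pinning the Go variables by hand: `GOARCH`, `GOHOSTARCH` and `GOARM` are the real Go environment variables named above, while the helper name `set_go_arch`, the machine-name table, and setting `GOHOSTARCH` equal to `GOARCH` (which assumes a native, non-cross build) are assumptions for illustration.

```shell
# Hypothetical helper: derive the Go architecture variables from a
# uname-style machine name instead of letting the build autodetect them.
set_go_arch() {
    unset GOARM
    case "$1" in
        armv6l)  GOARCH=arm;   GOARM=6 ;;
        armv7l)  GOARCH=arm;   GOARM=7 ;;
        aarch64) GOARCH=arm64 ;;
        x86_64)  GOARCH=amd64 ;;
        i686)    GOARCH=386 ;;
        *) echo "set_go_arch: unknown machine: $1" >&2; return 1 ;;
    esac
    # Assumption: native build, so host and target architecture coincide.
    GOHOSTARCH=$GOARCH
    export GOARCH GOHOSTARCH
    # GOARM is only meaningful for 32-bit ARM targets.
    if [ -n "${GOARM:-}" ]; then
        export GOARM
    fi
}
```

Calling `set_go_arch armv6l` before the Go build exports `GOARCH=arm`, `GOHOSTARCH=arm` and `GOARM=6`, so neither `uname` nor the SIGILL probing decides the target.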

@Ericson2314
Member

I'd blacklist the system call itself at build time for sandboxed builds. Autodetection during configuration is rarely a good idea, period. Go wouldn't use glibc to do the system call anyways, would it?

@dezgeg
Contributor Author

dezgeg commented Apr 26, 2017

At least in the bootstrap Go, uname is called only in shell scripts and some C code.

/* node-name */ "localhost"
/* kernel-release */ "4.4.0-not-a-real-version"
/* kernel-version */ "#1 SMP PREEMPT Thu Jan 1 00:00:01 GMT 1970"
/* machine */ (builtins.head (lib.splitString "-" stdenv.system))
Member

For the record, hostPlatform.system is probably a better way to get what you want. Makes which of the 3 platforms you want explicit, and during cross compilation the stdenv.* stuff incorrectly reflects the build platform.

@edolstra
Member

Not sure if blacklisting is a good idea, but we could use @aszlig's seccomp patch for Nix to make the uname syscall return a different value. This would compensate for the lack of a personality() for ARMv6.

@dezgeg
Contributor Author

dezgeg commented Apr 26, 2017

If I remember correctly, besides optionally killing the process seccomp BPF filters can only affect the return value of a system call, not any values in userspace memory.

@aszlig
Member

aszlig commented Apr 26, 2017

@dezgeg: Well, you can via SECCOMP_RET_TRAP and SECCOMP_RET_TRACE, but that will be complicated as well, is very architecture specific and isn't supported by libseccomp.

For example this was my WIP handler for rewriting the stat-family of syscalls:

void rewriteStatHandler(int signo, siginfo_t *si, void *ctx) {
    ucontext_t *uc = reinterpret_cast<ucontext_t*>(ctx);

    if (signo != SIGSYS || si->si_code != SYS_SECCOMP || !uc)
        throw SysError("unexpected SIGSYS triggered");

    if (si->si_arch == SCMP_ARCH_X86_64) {
        long ret = syscall(si->si_syscall,
                           uc->uc_mcontext.gregs[REG_RDI],
                           uc->uc_mcontext.gregs[REG_RSI]);
        uc->uc_mcontext.gregs[REG_RAX] = ret;
        if (ret != 0) return;
        struct stat st;
        memcpy(&st, &uc->uc_mcontext.gregs[REG_RSI], sizeof(st));
        st.st_uid = 0;
        st.st_gid = 0;
        uc->uc_mcontext.gregs[REG_RSI] = reinterpret_cast<greg_t>(&st);
    }
}

So you can change any register with this, but because you're working at the register level, you're also highly architecture dependent.

Another reason why it's probably not a good idea is that seccomp also comes with a bit of overhead.

@Ericson2314
Member

By "blacklisting" I mean just fail the build. If killing the process is also easier / more performant, all the better!

@veprbl
Member

veprbl commented Apr 27, 2017

On darwin it would be really nice to have a fake sw_vers. In my experience, missing it was leading to much build breakage.

@matthewbauer
Member

We actually have a real sw_vers in Darwin.DarwinTools but a fake one might be useful too.

@mmahut
Member

mmahut commented Aug 19, 2019

Are there any updates on this pull request, please?

@expipiplus1
Contributor

expipiplus1 commented Aug 19, 2019 via email

@veprbl veprbl added the 6.topic: cross-compilation Building packages on a different sort of platform than they will be run on label Oct 27, 2019
@stale

stale bot commented Jun 1, 2020

Thank you for your contributions.
This has been automatically marked as stale because it has had no activity for 180 days.
If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.
Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the
    related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 1, 2020
@zimbatm zimbatm added this to To do in R13y Nov 21, 2020
@misuzu
Contributor

misuzu commented Jun 4, 2022

There is this patch from Ubuntu folks and it works great: https://lists.ubuntu.com/archives/kernel-team/2016-January/068203.html

{ config, pkgs, ... }:
{
  boot.kernelPatches = [
    rec {
      name = "compat_uts_machine";
      patch = pkgs.fetchpatch {
        inherit name;
        url = "https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/patch/?id=c1da50fa6eddad313360249cadcd4905ac9f82ea";
        sha256 = "sha256-mpq4YLhobWGs+TRKjIjoe5uDiYLVlimqWUCBGFH/zzU=";
      };
    }
  ];
  boot.kernelParams = [
    "compat_uts_machine=armv7l"
  ];
}
[misuzu@oracle:~]$ uname -m
aarch64
[misuzu@oracle:~]$ linux32
[misuzu@oracle:~]$ uname -m
armv7l

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 4, 2022
@lucasew
Contributor

lucasew commented Aug 24, 2022

Is it still relevant?

@Mic92
Member

Mic92 commented Nov 20, 2022

This PR seems stalled. Closing.

@Mic92 Mic92 closed this Nov 20, 2022
R13y automation moved this from Inbox to Done Nov 20, 2022
@Artturin
Member

A deterministic uname was finally added in #210102

Labels
2.status: merge conflict 6.topic: cross-compilation Building packages on a different sort of platform than they will be run on 6.topic: reproducible builds 11.by: nixpkgs-member
Projects
R13y
Done