nscd: disable by default #50042

Closed


@lheckemann
Member

lheckemann commented Nov 9, 2018

Motivation for this change

nscd seems to cause occasional spurious errors along the lines of "Could not resolve <hostname>: System Error". I propose disabling it by default.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@grahamc
Member

grahamc commented Nov 9, 2018

We don't have many services enabled by default, so turning one off probably requires some explanation about why it was on, and if the trade-offs of turning it off make sense.

@lheckemann
Member Author

@GrahamcOfBorg test networking nsd avahi

@GrahamcOfBorg

Success on x86_64-linux (full log)

Attempted: tests.networking, tests.nsd

The following builds were skipped because they don't evaluate on x86_64-linux: tests.avahi

Partial log:

clientv6: exit status 0
2 out of 2 tests succeeded
test script finished in 36.66s
cleaning up
killing server (pid 597)
killing clientv4 (pid 609)
killing clientv6 (pid 621)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/qnh1wy07m30bkwsamw45r2g0snmv9qab-vm-test-run-nsd

@lheckemann
Member Author

Agreed, I figure this isn't a trivial merge, and I would appreciate it if we could understand why it's been enabled by default up to now. Git archaeology has revealed that its existence goes all the way back to 2007 (9963b26) and that it wasn't even possible to disable it until 2009 (3f6ca96).
cc @viric who introduced the enable option, @edolstra who introduced it in the first place.

Also cc @fpletz who gave me the idea that it might not actually be necessary.

@GrahamcOfBorg

Timed out, unknown build status on aarch64-linux (full log)

Attempted: tests.networking, tests.nsd

The following builds were skipped because they don't evaluate on aarch64-linux: tests.avahi

Partial log:

cannot build derivation '/nix/store/ndyr5zk5fcwgmnzb03lgvhyg1c16yx7s-closure-info.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/4a2hrcrb21i01h480r8ahmmsyi0g1av9-run-nixos-vm.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/rk8cd77nh5rd57vggn8lq98aj2x61d8v-run-nixos-vm.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/v68l88mwrn6yrp85rm1lmjx9kq8pzg39-run-nixos-vm.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/hyagsk1yf2f1l002m1qdmxn6k652bgqq-nixos-vm.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/l7d1rmxgsgs2176q2c1ybq0b4yinlgn4-nixos-vm.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/laxgx19063xb41r5l0vdc7jx8ns3ayl8-nixos-vm.drv': 2 dependencies couldn't be built
cannot build derivation '/nix/store/zpk4p6r01v5qj2i3spsf3ar2gn34f3p8-nixos-test-driver-nsd.drv': 3 dependencies couldn't be built
cannot build derivation '/nix/store/j8fvcg4lsn1gn5kl6vsl3q89vjm0gkik-vm-test-run-nsd.drv': 1 dependencies couldn't be built
error: build of '/nix/store/j8fvcg4lsn1gn5kl6vsl3q89vjm0gkik-vm-test-run-nsd.drv' failed

@edolstra
Member

edolstra commented Nov 9, 2018

No, nscd should not be disabled. It's essential on NixOS to ensure that NSS modules work properly (including for 32-bit programs on 64-bit NixOS).

A possible replacement might be unscd.

@matthewbauer
Member

matthewbauer commented Nov 9, 2018

An alternative to nscd would be great. nscd sometimes increases startup time slightly because this hack blocks:

https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/services/system/nscd.nix#L79-L86

@matthewbauer
Member

If anyone wants to try out unscd, they can use this patch:

30952b9

Needs more testing before opening a PR though.

@orivej
Contributor

orivej commented Nov 11, 2018

I'm running with services.nscd.enable = false; because nscd bypasses network namespaces. (I have nameserver 127.0.0.1 in /etc/resolv.conf and want each network namespace to use its own instance of a DNS server, but with nscd name resolution is done in the namespace where nscd is running.)

> It's essential on NixOS to ensure that NSS modules work properly (including for 32-bit programs on 64-bit NixOS).

NSS modules seem to work properly without nscd:

$ python -m SimpleHTTPServer &
Serving HTTP on 0.0.0.0 port 8000 ...

$ strace -e '!rt_sigaction' -f curl -I http://lh:8000/

[pid 28117] connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 28117] openat(AT_FDCWD, "/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
[pid 28117] openat(AT_FDCWD, "/nix/store/g2yk54hifqlsjiha3szr4q3ccmdzyrdv-glibc-2.27/lib/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
[pid 28117] openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3

HTTP/1.0 200 OK

$ strace -e '!rt_sigaction' -f $(nix-build '<nixpkgs>' --no-out-link -A pkgsi686Linux.curl)/bin/curl -I http://lh:8000/

[pid  1887] connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid  1887] openat(AT_FDCWD, "/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
[pid  1887] openat(AT_FDCWD, "/nix/store/7y10kn6791h88vmykdrddb178pjid5bv-glibc-2.27/lib/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
[pid  1887] openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3

HTTP/1.0 200 OK

where lh is defined in /etc/hosts; it is not resolvable by DNS:

$ host lh
Host lh not found: 3(NXDOMAIN)

$ grep lh /etc/hosts
127.0.0.1 lh

If I replace hosts: files dns with hosts: dns in /etc/nsswitch.conf, curl fails:

$ curl http://lh:8000/ -I
curl: (6) Could not resolve host: lh
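
The difference between the two configurations comes down to which sources are listed on the hosts: line, since files is the source that reads /etc/hosts. As a rough illustration only, here is a minimal Python sketch that extracts that list from the text of an nsswitch.conf. The function names hosts_sources and can_use_etc_hosts are made up for this example, and the parsing is deliberately simplified compared to what glibc does.

```python
import re


def hosts_sources(nsswitch_text):
    """Return the ordered source names on the `hosts:` line.

    Simplified on purpose: comments and bracketed action items such as
    [NOTFOUND=return] are dropped, everything else is treated as a source.
    """
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line.startswith("hosts:"):
            continue
        rest = line[len("hosts:"):]
        rest = re.sub(r"\[[^\]]*\]", " ", rest)  # remove [STATUS=action] groups
        return rest.split()
    return []


def can_use_etc_hosts(nsswitch_text):
    # "files" is the source that consults /etc/hosts
    return "files" in hosts_sources(nsswitch_text)
```

With hosts: files dns the sketch reports that /etc/hosts is consulted; with hosts: dns it does not, which matches the curl failure shown above.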

@lheckemann
Member Author

@matthewbauer the hack could be replaced by a nicer hack using inotifywait? :D

@edolstra
Member

@orivej That only works for NSS modules built into Glibc. It won't be able to find e.g. libnss_myhostname or libnss_mymachines from systemd because they are not in the library search path.
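
To illustrate the search-path point: glibc loads the backend for an NSS source named X from a shared object called libnss_X.so.2. The following is a minimal Python sketch of a lookup over a list of directories, not glibc's actual loader logic. The helper names and the temporary directories are hypothetical, and giving the daemon an extra module directory is an assumption made purely for the illustration.

```python
import os
import tempfile


def nss_module_filename(service):
    # Naming convention used by glibc for NSS backends
    return f"libnss_{service}.so.2"


def find_nss_module(service, search_dirs):
    """Return the first matching module path in search_dirs, or None."""
    name = nss_module_filename(service)
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None


def demo():
    # Two stand-in library directories: one for glibc, one for systemd.
    with tempfile.TemporaryDirectory() as glibc_lib, \
            tempfile.TemporaryDirectory() as systemd_lib:
        open(os.path.join(glibc_lib, nss_module_filename("files")), "w").close()
        open(os.path.join(systemd_lib, nss_module_filename("myhostname")), "w").close()

        client_dirs = [glibc_lib]               # ordinary program: glibc only
        daemon_dirs = [glibc_lib, systemd_lib]  # assumed: daemon sees extra modules

        return (
            find_nss_module("files", client_dirs) is not None,
            find_nss_module("myhostname", client_dirs) is not None,
            find_nss_module("myhostname", daemon_dirs) is not None,
        )
```

In this sketch the ordinary client finds the files module but not myhostname, while a process with the additional directory finds both.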

@Mic92
Member

Mic92 commented Nov 12, 2018

unscd seems to bind the socket before it forks itself, making the hack unnecessary.
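
The general pattern described here, binding the socket before forking, can be sketched in Python as follows. This is not unscd's code. The function names, the temporary socket path and the one-shot "ready" reply are all invented for the example. The point is only that once bind and listen have happened in the parent, a client can connect without polling for the socket to appear.

```python
import os
import socket
import tempfile


def start_bound_then_forked(path):
    """Bind and listen first, then fork; return the child's pid to the caller."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)   # the socket file exists from this point on
    server.listen(1)    # connections now queue in the kernel backlog
    pid = os.fork()
    if pid == 0:
        # Child: answer a single connection, then exit immediately.
        try:
            conn, _ = server.accept()
            conn.sendall(b"ready")
            conn.close()
        finally:
            os._exit(0)
    server.close()      # the child keeps its own copy of the listening socket
    return pid


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "socket")
        pid = start_bound_then_forked(path)
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)  # no wait loop needed: the socket is already bound
        reply = client.recv(16)
        client.close()
        os.waitpid(pid, 0)
        return reply
```

If the order were reversed (fork first, bind in the child), the caller would have no guarantee that the socket exists when it returns, which is the kind of race a startup wait loop works around.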

@Mic92
Member

Mic92 commented Nov 12, 2018

@matthewbauer if you made unscd an option, we could test it a bit before switching to it by default.

@Mic92
Member

Mic92 commented Nov 12, 2018

I applied several fixes to unscd here: Mic92@8ff2c10

@arianvp mentioned this pull request Nov 12, 2018
@matthewbauer
Member

@Mic92 feel free to take this up. I'm a little too busy to work on it further.

@Mic92
Member

Mic92 commented Nov 13, 2018

#50316 is an alternative approach. Disabling negative caches for host lookups could solve some issues we currently have with nscd.
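
Why a negative cache can turn a transient failure into a longer-lasting one can be shown with a toy model. The Python sketch below is not nscd's implementation and does not reflect its configuration syntax. LookupCache, its TTL parameters and the fake clock are hypothetical names introduced for the illustration, and the resolver is a stub that fails once and then succeeds.

```python
class LookupCache:
    """Toy cache with separate lifetimes for successful and failed lookups."""

    def __init__(self, resolve, positive_ttl, negative_ttl, clock):
        self.resolve = resolve            # returns an address, or None if not found
        self.positive_ttl = positive_ttl
        self.negative_ttl = negative_ttl  # 0 disables caching of failures
        self.clock = clock
        self.entries = {}                 # name -> (result, expiry time)

    def lookup(self, name):
        now = self.clock()
        hit = self.entries.get(name)
        if hit is not None and now < hit[1]:
            return hit[0]                 # may be a cached failure (None)
        result = self.resolve(name)
        ttl = self.positive_ttl if result is not None else self.negative_ttl
        if ttl > 0:
            self.entries[name] = (result, now + ttl)
        else:
            self.entries.pop(name, None)
        return result


def demo(negative_ttl):
    now = [0.0]
    answers = iter([None, "192.0.2.1"])   # first attempt fails, second succeeds
    cache = LookupCache(lambda name: next(answers), 600, negative_ttl, lambda: now[0])
    first = cache.lookup("example.invalid")
    now[0] = 1.0                          # one second later, e.g. network is up
    second = cache.lookup("example.invalid")
    return first, second
```

With a non-zero negative lifetime the second lookup still returns the cached failure; with the negative lifetime set to 0 the second lookup reaches the resolver again and succeeds.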

@lheckemann
Member Author

I think we've concluded that we don't want to do this, but at least it's spawned some other approaches with their respective discussions :)

@lheckemann closed this Nov 15, 2018