Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/systemd: install sysctl snippets #66482

Merged
merged 4 commits into from Aug 19, 2019
Merged

Conversation

flokli
Copy link
Contributor

@flokli flokli commented Aug 11, 2019

systemd provides two sysctl snippets, 50-coredump.conf and
50-default.conf.

Apart from some loose reverse path filtering and source routing
filtering, this enables fq_codel as a packet scheduler to fight
bufferbloat, and configures systemd-coredump properly.

Let's start using these, like other distros already do for quite some
time, and remove those duplicate boot.kernel.sysctl options we
previously did set.

In the case of rp_filter (which systemd would set to 2 (loose)), make
our overrides to "1" more explicit.

sysctl.d(5) recommends prefixing all filenames in /etc/sysctl.d with a
two-digit number and a dash, to simplify the ordering of the files.

Some packages provide custom files, often with "50-" prefix.
To ensure user-supplied configuration takes precedence over the one
specified via boot.kernel.sysctl, prefix the file generated there with
"60-".

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
Notify maintainers

cc @

@flokli flokli requested review from Mic92, fpletz and andir August 11, 2019 14:40
@flokli flokli requested a review from joachifm as a code owner August 11, 2019 14:40
@flokli
Copy link
Contributor Author

flokli commented Aug 11, 2019

Currently, the systemd test seems to break at another subtest ("file system with x-initrd.mount is not unmounted"), but the fq_codel one succeeds.

I didn't yet test whether coredumps are available via coredumpctl.

boot.kernel.sysctl."net.ipv4.conf.all.log_martians" = mkDefault true;
boot.kernel.sysctl."net.ipv4.conf.all.rp_filter" = mkDefault true;
boot.kernel.sysctl."net.ipv4.conf.all.rp_filter" = mkDefault "1";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to change this? The sysctl module maps true to 1 for us (since toString true == "1").

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered the same. I think the answer is

In the case of rp_filter (which systemd would set to 2 (loose)), make
our overrides to "1" more explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. systemd sets these to "2" (loose) in the snippets already, the hardenened rules set it to "1" (strict).

Having a boolean here doesn't really explain what and why we're doing.
I also updated the comment above, to add the "strict" bit.

@andir
Copy link
Member

andir commented Aug 11, 2019 via email

<para>
We now install the sysctl snippets shipped with systemd. Apart from some loose reverse path
filtering and source routing filtering, this enables <literal>fq_codel</literal> as a packet
scheduler to fight bufferbloat, and configures <literal>systemd-coredump</literal> properly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I maybe should have left the above comment down a few lines. In general when it comes to release notes and setting changes, avoid weasel words.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the rp_filter option from above comment: systemd sets net.ipv4.conf.all.rp_filter to 2 by default, which does enable some reverse path filtering, while we currently don't do any reverse path filtering at all. On hardened and google-compute-config, we set it to 1 (strict).

Detailled explanation about the options:

  • 0 No source address validation is performed and any packet is forwarded to the destination network
  • 1 Strict Mode as defined in RFC 3074. Each incoming packet to a router is tested against the routing table and if the interface that the packet is received on is not the best return path for the packet then the packet is dropped.
  • 2 Loose mode as defines in RFC 3074 Loose Reverse Path. Each incoming packet is tested against the route table and the packet is dropped if the source address is not routable through any interface. The allows for asymmetric routing where the return path may not be the same as the source path.

As a lot of distros enabled the systemd sysctls, they enabled loose reverse path filtering, so I thought it should be fine to do so too. Same goes for net.ipv4.conf.all.accept_source_route = 0.

I didn't want to explicitly list every option currently set in ${systemd}/example/sysctl.d/50-default.conf - only the fact that we now apply these options, plus some broad summary about what's changed, so people are aware when reading release notes.

@flokli
Copy link
Contributor Author

flokli commented Aug 11, 2019

Regarding systemd-coredump:

We seem to currently have a systemd.coredump module, which duplicates the kernel.core_pattern from the upstream systemd unit if systemd.coredump.enable is set to true, otherwise sets it back to core.

I was a bit baffled coredumpctl doesn't save coredumps on NixOS by default, and wonder if it's time to merge the things enabled there to the systemd module itself (and remove the module) - people could still get the old behaviour by setting boot.kernel.sysctl."kernel.core_pattern" = "core".

@nh2, @Jookia @auntieNeo, what do you think?

@nh2
Copy link
Contributor

nh2 commented Aug 17, 2019

As long as we keep the ability to use normal core instead of systemd's coredump handling (because it's very slow slow), I'm fine with things being moved around.

@flokli
Copy link
Contributor Author

flokli commented Aug 17, 2019

@nh2 Yes, if you set the sysctl parameter as described in the changelog, systemd-coredump won't be handling coredumps at all, and the kernel will dump the coredump file by itself.

@flokli
Copy link
Contributor Author

flokli commented Aug 17, 2019

I'd be curious to know about when the coredump performance becomes critical though ;-)

@flokli flokli force-pushed the systemd-sysctl branch 2 times, most recently from ccd422d to c7363bd Compare August 17, 2019 20:39
@flokli
Copy link
Contributor Author

flokli commented Aug 17, 2019

I removed weaselness in the release notes and explained getting back the old coredump behaviour a bit more clearly in the release notes.

@nh2
Copy link
Contributor

nh2 commented Aug 17, 2019

I'd be curious to know about when the coredump performance becomes critical though ;-)

@flokli I have observed many times that when a program with a couple GB resident memory core dumps, systemd's coredump program was spinning on 100% CPU for many minutes, while core terminated almost instantly.

Probably it's this issue.

I didn't investigate it, I just found it very annoying to have to wait so long.

@flokli
Copy link
Contributor Author

flokli commented Aug 17, 2019

This was fixed upstream February 2016 in bdfd7b2c63cac944a3aa1fc0fafd19f41789e208 - anyways, when setting sysctl, it won't be piped through it.

@worldofpeace
Copy link
Contributor

worldofpeace commented Aug 18, 2019

I found it impossible to suggest this, but my personal preference found this part to read pretty like this

diff --git a/nixos/doc/manual/release-notes/rl-1909.xml b/nixos/doc/manual/release-notes/rl-1909.xml
index f9363561bdc..3ae0492ebfa 100644
--- a/nixos/doc/manual/release-notes/rl-1909.xml
+++ b/nixos/doc/manual/release-notes/rl-1909.xml
@@ -435,13 +435,26 @@
    </listitem>
    <listitem>
      <para>
-       We now install the sysctl snippets shipped with systemd. Apart from enabling loose reverse path
-       filtering and source routing filtering, this enables <literal>fq_codel</literal> as a packet
-       scheduler to fight bufferbloat, and configures the kernel to pass coredumps to
-       <literal>systemd-coredump</literal>.
-       These sysctl snippets can be found in <literal>/etc/sysctl.d/50-*.conf</literal>,
-       and overridden via <link linkend="opt-boot.kernel.sysctl">boot.kernel.sysctl</link>
-       (which will place the parameters in <literal>/etc/sysctl.d/60-nixos.conf</literal>).
+       We now install the sysctl snippets that are shipped with systemd.
+       <itemizedlist>
+        <para>This enables:</para>
+        <listitem>
+          <para>Loose reverse path filtering</para>
+        </listitem>
+        <listitem>
+          <para>Source routing filtering</para>
+        </listitem>
+        <listitem>
+         <para>
+          <literal>fq_codel</literal> as a packet scheduler (this helps to fight bufferbloat)
+         </para>
+        </listitem>
+       </itemizedlist>
+
+       And configures the kernel to pass its coredumps to <literal>systemd-coredump</literal>.
+       The sysctl snippets that are shipped with systemd can be found in <literal>/etc/sysctl.d/50-*conf</literal> and can
+       be overridden with the <link linkend="opt-boot.kernel.sysctl">boot.kernel.sysctl</link> option.
+       Note these overrides still go into <literal>/etc/sysctl.d/60-nixos.conf</literal>.
      </para>
    </listitem>
    <listitem>

Screenshot from 2019-08-17 21 40 10

@flokli flokli force-pushed the systemd-sysctl branch 3 times, most recently from b42797b to 2f3e548 Compare August 18, 2019 15:50
sysctl.d(5) recommends prefixing all filenames in /etc/sysctl.d with a
two-digit number and a dash, to simplify the ordering of the files.

Some packages provide custom files, often with "50-" prefix.
To ensure user-supplied configuration takes precedence over the one
specified via `boot.kernel.sysctl`, prefix the file generated there with
"60-".
systemd provides two sysctl snippets, 50-coredump.conf and
50-default.conf.

These enable:
 - Loose reverse path filtering
 - Source route filtering
 - `fq_codel` as a packet scheduler (this helps to fight bufferbloat)

This also configures the kernel to pass coredumps to `systemd-coredump`.
These sysctl snippets can be found in `/etc/sysctl.d/50-*.conf`,
and overridden via `boot.kernel.sysctl`
(which will place the parameters in `/etc/sysctl.d/60-nixos.conf`.

Let's start using these, like other distros already do for quite some
time, and remove those duplicate `boot.kernel.sysctl` options we
previously did set.

In the case of rp_filter (which systemd would set to 2 (loose)), make
our overrides to "1" more explicit.
@flokli
Copy link
Contributor Author

flokli commented Aug 18, 2019

Docs should be good, any objections? ;-)

Copy link
Contributor

@worldofpeace worldofpeace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs are good 👍

@flokli flokli merged commit 93a0317 into NixOS:master Aug 19, 2019
@flokli flokli deleted the systemd-sysctl branch August 19, 2019 14:32
@gebner
Copy link
Member

gebner commented Aug 31, 2019

This change disables all SysRq keys (except sync) by default. Was this intentional? In any case, this needs to be documented in the release notes.

@flokli
Copy link
Contributor Author

flokli commented Aug 31, 2019

@gebner I indeed missed that line!

It's part of ${pkgs.systemd}/example/sysctl.d/50-default.conf:

# Use kernel.sysrq = 1 to allow all keys.
# See https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html for a list
# of values and keys.
kernel.sysrq = 16

I'll update the release notes accordingly and file a follow-up PR, thanks!

flokli added a commit to flokli/nixpkgs that referenced this pull request Aug 31, 2019
jtojnar added a commit to jtojnar/nixfiles that referenced this pull request Nov 5, 2019
Since NixOS/nixpkgs#66482, it needs an explicit override.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants