Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos-containers: fix atomic restart #80169

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

danbst
Copy link
Contributor

@danbst danbst commented Feb 15, 2020

nixos containers: remove --keep-unit (revert 810680b)

Fixes #43652
Fixes #16753
Alternative fix for #39717
The fix in #76719 was not enough

`--keep-unit` ties systemd-nspawn container to systemd unit. When
unit is restarted (ie atomic operation "restart", not two separate
"stop" and "start") systemd assumes that unit didn't disappear.
Hence machine didn't disappear. Though it may as well be systemd bug
that machine is left in `closing` state forever.

If unit is stopped (or stopped due to failure), associated machine is also stopped,
so service can start fresh.

When --keep-unit is removed, resource slice isn't changed though - it is machine.slice by default for
nspawn containers.

cc @erikarvstedt @peterhoeg @flokli @edolstra

Fixes NixOS#43652
Fixes NixOS#16753
Alternative fix for NixOS#39717
The fix in NixOS#76719 was not enough

`--keep-unit` ties systemd-nspawn container to systemd unit. When
unit is restarted (ie atomic operation "restart", not two separate
"stop" and "start") systemd assumes that unit didn't disappear.
Hence machine won't disappear.

If unit is stopped (or stopped due to failure), associated machine is also stopped,
so service can start fresh.

Resource slice isn't changed though - it is machine.slice by default for
nspawn containers.
@danbst
Copy link
Contributor Author

danbst commented Feb 15, 2020

if this will be merged, we can go on and cleanup the artifacts from old solutions - WINCH signal, mixed killmode, PID mgmt in imperative containers, restart as stop/start...

@flokli
Copy link
Contributor

flokli commented Feb 15, 2020

cc @arianvp here.

@flokli flokli requested a review from arianvp February 15, 2020 20:50
flokli referenced this pull request Feb 18, 2020
Previously, we were storing the leader pid in a runtime file and
signalled SIGRTMIN+4 manually.

In systemd 219, the `machinectl poweroff` command was introduced, which
does that for us.
@d-xo
Copy link
Contributor

d-xo commented Feb 19, 2020

I tested this locally and can confirm that it fixes the regression of #6212 introduced in 90a3908.

@arianvp
Copy link
Member

arianvp commented Feb 19, 2020

This seems fine. though im a bit confused by why it was originally introduced.

I also don't understand why we need machinectl poweroff to turn off a container; but maybe this is what you're referring to with:

WINCH signal, mixed killmode, PID mgmt in imperative containers, restart as stop/start...

From my understanding, systemd-nspawn should be able to just run in the unit file, and then when the unit is stopped, systemd-nspawn should stop automatically... E.g. make this unit look more like the upstream systemd-nspawn@.service unit

#  SPDX-License-Identifier: LGPL-2.1+
#
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Container %i
Documentation=man:systemd-nspawn(1)
Wants=modprobe@tun.service modprobe@loop.service modprobe@dm-mod.service
PartOf=machines.target
Before=machines.target
After=network.target systemd-resolved.service modprobe@tun.service modprobe@loop.service modprobe@dm-mod.service
RequiresMountsFor=/var/lib/machines

[Service]
# Make sure the DeviceAllow= lines below can properly resolve the 'block-loop' expression (and others)
ExecStart=systemd-nspawn --quiet --keep-unit --boot --link-journal=try-guest --network-veth -U --settings=override --machine=%i
KillMode=mixed
Type=notify
RestartForceExitStatus=133
SuccessExitStatus=133
Slice=machine.slice
Delegate=yes
TasksMax=16384
@SERVICE_WATCHDOG@

# Enforce a strict device policy, similar to the one nspawn configures when it
# allocates its own scope unit. Make sure to keep these policies in sync if you
# change them!
DevicePolicy=closed
DeviceAllow=/dev/net/tun rwm
DeviceAllow=char-pts rw

# nspawn itself needs access to /dev/loop-control and /dev/loop, to implement
# the --image= option. Add these here, too.
DeviceAllow=/dev/loop-control rw
DeviceAllow=block-loop rw
DeviceAllow=block-blkext rw

# nspawn can set up LUKS encrypted loopback files, in which case it needs
# access to /dev/mapper/control and the block devices /dev/mapper/*.
DeviceAllow=/dev/mapper/control rw
DeviceAllow=block-device-mapper rw

[Install]
WantedBy=machines.target

@arianvp
Copy link
Member

arianvp commented Feb 19, 2020

So I think, by merging this change, we can completely get rid of the machinectl stop command, as doing systemctl stop container@blah should be enough to shut down the container.

This is the same issue we're fixing as documented here: systemd/systemd#2770

@flokli
Copy link
Contributor

flokli commented Feb 19, 2020

@danbst this looks good - can you do the above mentioned cleanups in this PR, too?

@danbst
Copy link
Contributor Author

danbst commented Feb 19, 2020

@arianvp omg, there was an upstream issue for this! Interesting, they mentioned RemainAfterExit, I think we can't get rid of it in NixOS, as it will break NixOS restart logic.

@flokli ok, I'll try cleanup that

@arianvp
Copy link
Member

arianvp commented Feb 19, 2020

We'll have to keep KillMode=mixed by the way. it's also used upstream. I asked for the reason and this was the answer:

michich, when the kernel wants to stop properly the system (for instance, because you pressed the power button) it sends
SIGTERM to init, wait for some time, then powers down
to emulate that for a container, you send SIGTERM to the main process (the init of the container)
then you wait a bit
then you SIGKILL everything, which is the equivalent for a container of shutting down

@flokli
Copy link
Contributor

flokli commented Feb 22, 2020

@danbst poke ;-)

@danbst
Copy link
Contributor Author

danbst commented Feb 22, 2020

@flokli I'm debugging tests, they fail on nixos-container destroy command

* machinectl terminate during container start was a hack, probably wrong one
* use "systemctl restart" consistently for declarative and imperative containers
* fix "nixos-container terminate" command. We have to notify systemd we want
  unit to be stopped, because otherwise it will be started back due to
  restart on failure policy.
* add a bit docs to stop/terminate/destroy, to clarify which one does what
* add missing "nixos-container restart" docs
@danbst
Copy link
Contributor Author

danbst commented Feb 22, 2020

Cleanups (from commit message):

* machinectl terminate during container start was a hack, probably wrong one
* use "systemctl restart" consistently for declarative and imperative containers
* fix "nixos-container terminate" command. We have to notify systemd we want
  unit to be stopped, because otherwise it will be started back due to
  restart on failure policy.
* add a bit docs to stop/terminate/destroy, to clarify which one does what
* add missing "nixos-container restart" docs

@arianvp I was not able to remove machinectl poweroff, for some reason it doesn't poweroff machine, but just kills it.

Also, with this PR machinectl terminate behavior had changed. I think I'll add release notes, to clarify this bit. nixos-container terminate/destroy works though.

@danbst
Copy link
Contributor Author

danbst commented Feb 24, 2020

@GrahamcOfBorg test containers-imperative

@@ -25,7 +25,7 @@ sub showHelp {
nixos-container create <container-name>
[--nixos-path <path>]
[--system-path <path>]
[--config <string>]
[--config <string>] -- config without outer braces, for example 'services.nginx.enable = true;'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already documented in nixos/doc/manual/administration/imperative-containers.xml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's fine to duplicate it here. The hands-on --help is too scarce.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @danbst here. This is a common misunderstanding (which usually seems like a bug to an end-user), so mentioning this here as well is fine IMHO.

nixos-container restart <container-name>
nixos-container stop <container-name> -- shutdown container cleanly, wait until stopped
nixos-container terminate <container-name> -- halt container, like hard poweroff
nixos-container destroy <container-name> -- terminate (halt) container and remove state data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should move into nixos/doc/manual/administration/imperative-containers.xml (if not already there). Why were things reordered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, technically yes, but it's fine duplicating it here. No need to consult manual when stopping container.

Things were reordered to group stop/terminate/destroy commands. They do same, but with different nuances.

Comment on lines +276 to 277
system("systemctl", "stop", "--no-block", "container\@$containerName");
system("machinectl", "terminate", $containerName) == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Urgh, this feels like a hack. Doesn't machinectl terminate by itself make the systemd unit become stopped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is consequence of removing --keep-unit, I haven't found a better way to stop unit with machine.

machinectl stop does stop unit fine. It is because machine exits with normal result. But when it runs terminate, it exits with failure result, and systemd restarts unit due to Restart=on-failure policy.

@arianvp
Copy link
Member

arianvp commented Feb 26, 2020 via email

@danbst
Copy link
Contributor Author

danbst commented Mar 1, 2020

@arianvp kind ping :)

@arianvp
Copy link
Member

arianvp commented Mar 1, 2020 via email

@Ma27
Copy link
Member

Ma27 commented Apr 4, 2020

@danbst @arianvp may I ask if there are any updates here? :)

Ma27 added a commit to Ma27/nixpkgs that referenced this pull request Apr 7, 2020
I started working on a draft for improved nixos-containers (using
`networkd` and `.nspawn` units) after the networkd hackathon[1] which
isn't published yet.

Quite recently I realized that when changing a `.nspawn`-unit, the
`switch-to-configuration.pl` doesn't activate those changes. This patch
takes care of it with the following changes:

* It's possible to declare whether to restart or reload such a unit. The
  restart option is the default. In that case the
  `systemd-nspawn@<machine-name>.service`[2]-unit will be restarted or reloaded.

* By default, all `.nspawn`-units are part of the `machines.target`.

* A VM-test covers all those cases including a custom reload-script to
  activate a new configuration in the machine.

* I had to remove the `--keep-unit` flag on startup to fix the restart
  of the unit. This is a known issue[3].

It's also possible to use a reload to activate a new configuration
inside a nspawn-machine with a config like this:

``` nix
{ pkgs, ... }: {
  systemd.nspawn.test-container.reloadOnChange = true;
  systemd.nspawn.test-container.restartOnChange = false;
  systemd.services."systemd-nspawn@test-container".serviceConfig.ExecReload = "${pkgs.writeScriptBin "activate" ''
    #! ${pkgs.runtimeShell} -xe
    systemd-run --machine test-container --pty --quiet -- /bin/sh --login -c \
      '${containerCfg}/bin/switch-to-configuration test'
  ''}/bin/activate";
}
```

[1] https://discourse.nixos.org/t/networkd-sprint-2019-11-23-24-in-munich/4578
[2] https://github.com/systemd/systemd/blob/v243/units/systemd-nspawn@.service.in
[3] NixOS#80169
@arianvp
Copy link
Member

arianvp commented Apr 7, 2020

I'm okay with this approach for now if we are adamant on getting it out for 20.03

If not, I would like to wait until we complete the 245 release, and see if the issues I encountered in the linked systemd issue goes away. If we keep using --keep-unit with the inclusion of --kill-signal=SIGTRM3 then systemctl stop container@foo should be sufficient to kill the container, without the machinectl terminate call

@flokli
Copy link
Contributor

flokli commented Apr 7, 2020

I'm not sure which approach is referred to by "this", whether it's this PR, or the one referencing it (#84608). I don't think any of these two should be backported to 20.03. It's too late for that, and I don't want to cause breakages that late in the release cycle.

@tadfisher
Copy link
Contributor

Ping @danbst; this is still an issue in master, and it would be nice to have this fixed there at least.

@maralorn
Copy link
Member

maralorn commented Feb 7, 2021

Hello @arianvp and @danbst, are there any news on this one?

lIt would be so great to have some kind of fix. Today was another day where I woke up to all my containers being down after the nightly system update.

@stale
Copy link

stale bot commented Aug 6, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Aug 6, 2021
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Apr 13, 2022
@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Nov 2, 2022
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Sep 7, 2023
@wegank wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 19, 2024
@wegank wegank marked this pull request as draft March 20, 2024 14:55
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Restarting containers is broken nixos-container doesn't have restart subcommand