Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos system-activation: support activation scripts run in a user context #47842

Merged
merged 4 commits into from Oct 5, 2018

Conversation

peterhoeg
Copy link
Member

Motivation for this change

Some system.activationScripts are user specific but are run by root using su. That causes a number of problems:

  • if $HOME is on a dm-crypt volume, stage-2-init will try to activate the volume (and fail) during boot causing a hang until the user hits escape which isn't obvious if plymouth and KDE is enabled
  • it only works with users defined declaratively and not for those from other sources (LDAP, NIS, local)
  • the script will run for each user even if they are not logged in

There is now system.userActivationScripts which will be run by a systemd user service when $XDG_RUNTIME_DIR/.nixos-activate is changed.

We could easily use a different activation mechanism but the file based option chosen here is very straight forward.

I have tried it out here and tested as follows:

  1. rebooted with this PR applied
  2. there was no hang during boot and no stage-2-init errors
  3. added ktorrent to environment.systemPackages
  4. ran nixos-rebuild switch
  5. verified that the user service nixos-active.service ran and as a consequence kbuildsycoca5 as well
  6. verified that ktorrent is now available in the menu

I haven't gone through all the various activation scripts but there are likely more that should be moved to system.userActivationScripts.

Fixes #47577. Supersedes #45290

Cc: @vcunat @samueldr

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@peterhoeg
Copy link
Member Author

Cc @bkchr @jerith666 @worldofpeace

@peterhoeg
Copy link
Member Author

@GrahamcOfBorg test plasma5

@GrahamcOfBorg
Copy link

Success on aarch64-linux (full log)

Attempted: tests.plasma5

Partial log (click to expand)

machine: exit status 1
syncing
machine: running command: sync
machine: exit status 0
test script finished in 206.04s
cleaning up
killing machine (pid 631)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/3ida9v6kdif14k14y5pzphrnpaqgqmpi-vm-test-run-plasma5

@GrahamcOfBorg
Copy link

Success on x86_64-linux (full log)

Attempted: tests.plasma5

Partial log (click to expand)

machine: exit status 1
syncing
machine: running command: sync
machine: exit status 0
test script finished in 92.64s
cleaning up
killing machine (pid 600)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/qc5sx74c1v6y0vpxnvbbg1wl85wrs9b1-vm-test-run-plasma5

@bkchr
Copy link
Contributor

bkchr commented Oct 4, 2018

Should we add a test for this?

@peterhoeg
Copy link
Member Author

Should we add a test for this?

We definitely should - any good ideas?

@teto
Copy link
Member

teto commented Oct 4, 2018

This might be useful for things like home-manager/nixup !

@bkchr
Copy link
Contributor

bkchr commented Oct 4, 2018

Can we maybe encrypt the home directory and enable auto login?

@samueldr
Copy link
Member

samueldr commented Oct 5, 2018

Hmmm, it almost works, but it looks like the activation scripts are always "one build behind" here. (In addition to needing to at least systemctl --user restart nixos-activation.path, not sure if the systemctl --user daemon-reload was necessary too.) (See update)

Here's my test setup:

On top of 7f70ebf, applied the user-activation commit (e0e8874a8d0a55588d509641cf01c3c3d86ada65).

My configuration is as such:

{
  system.userActivationScripts.annoy = ''
    ${pkgs.libnotify}/bin/notify-send -a nixos -t 5000 -u critical "AAA"
  '';
}

Before doing time sudo nixos-rebuild test, I change the message to something (BBB, CCC, DDD...) when test finishes activating, I always see the previously expected message (so with the system having BBB, I see AAA).

I'm rebuilding without the patch, rebooting and checking what it takes so it works at the first switch adding that activation service.


Update:

.../nixos/nixpkgs ⬤  systemctl --user status nixos-activation.path
Unit nixos-activation.path could not be found.

.../nixos/nixpkgs 4 ⬤  sudo nixos-rebuild test
[...]

.../nixos/nixpkgs ⬤  systemctl --user status nixos-activation.path
● nixos-activation.path - Trigger user specific NixOS activation on /run/user/1000/.nixos-activate
   Loaded: loaded (/nix/store/bcmrl7v0vc5iraz6vvdmc95ga29lfw75-unit-nixos-activation.path/nixos-activation.>
   Active: inactive (dead)

.../nixos/nixpkgs ⬤  systemctl --user restart nixos-activation

.../nixos/nixpkgs ⬤  systemctl --user status nixos-activation
● nixos-activation.service - Run user specific NixOS activation
   Loaded: loaded (/nix/store/xd7205bfwqdchz5b6ff6rm5j8liq3vj0-unit-nixos-activation.service/nixos-activati>
   Active: inactive (dead) since Thu 2018-10-04 21:48:27 EDT; 1s ago
  Process: 10692 ExecStart=/nix/store/wws8a9lzkqhbvny1p1inpvx2jybkg007-unit-script-nixos-activation-start (>
 Main PID: 10692 (code=exited, status=0/SUCCESS)

Oct 04 21:48:26 ralphwiggum systemd[2414]: Starting Run user specific NixOS activation...
Oct 04 21:48:27 ralphwiggum systemd[2414]: Started Run user specific NixOS activation.

I don't know what it takes to make it "Active: active (waiting)", as even after a restart of the .path it still was inactive, after restarting both the service and the path it started working :/.

Though, once it is active, it will solidly activate with the previous generation's script!

@samueldr
Copy link
Member

samueldr commented Oct 5, 2018

After a reboot, the service doesn't need any kind of massaging, but still seems to lag behind one generation.

@peterhoeg
Copy link
Member Author

Yeah, probably because systemctl --user daemon-reload hasn't run yet. A slight variation of this, is instead of touching the file, that we simply activate nixos-activation.service in switch-to-configuration.pl after running daemon-reload.

@peterhoeg
Copy link
Member Author

I just pushed some untested commits (I'll try them out now) that uses switch-to-configuration.pl instead.

@samueldr
Copy link
Member

samueldr commented Oct 5, 2018

Quick update, before more in-depth testing it doesn't lag behind one generation anymore!

Now testing a fresh boot without the change, if the the first switch works.

@samueldr
Copy link
Member

samueldr commented Oct 5, 2018

And confirming that the changes (1353ba2 and 8118d6e) on 18.09 does activate right on first activation.

Seems fine as far as I can tell (and can now annoy myself with notifications on rebuilds!)

That feature is 👍 imho.

@peterhoeg
Copy link
Member Author

So this is good to go despite the lack of tests? We really should have some (although I think the encryptfs test isn't relevant in this case) but it's of course your call if it's more important to ship 18.09 assuming this is the last pending blocker.

@samueldr
Copy link
Member

samueldr commented Oct 5, 2018

I would prefer having tests for the feature; though I don't know how the testing infrastructure can be used to make this work since it seems to need a user session to be active (though I'm sure there's already good material in there).

Do you have any idea on how to proceed with a test and the time to implement it at short notice? (DO NOT feel pressured!) Otherwise I feel like I have tested it right, and with the latest changes seems solid. With another pair of eyes looking at it and testing it I would be okay to merge without tests, hoping to later see tests pop into existence. (Which I would gladly adopt into backporting.)

@peterhoeg
Copy link
Member Author

Do you have any idea on how to proceed with a test and the time to implement it at short notice? (DO NOT feel pressured!)

One way to do it would be to have the plasma test see if the .cache/ksycoca5_bla_bla_bla file is created but I will need to look at this later. Will a promise to implement that soonish be enough?

Copy link
Member

@samueldr samueldr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving after successful manual testing.

Code is minimal and looks fine to me.

@peterhoeg
Copy link
Member Author

It's also working fine for me in the last iteration. I'll open a separate PR for 18.09.

@chkno
Copy link
Member

chkno commented Feb 15, 2021

Tests popping into existence: #113241

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Boot hangs with encrypted home and plasma
6 participants