-
-
Notifications
You must be signed in to change notification settings - Fork 15.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] nixos/systemd-boot: boot counting and automatic fallback #84204
Conversation
cc @adisbladis for interest |
9584565
to
4c5d3e4
Compare
@danielfullmer with 245.3 merged into master, can you rebase this PR and see if it's still broken upstream? |
4c5d3e4
to
2fe51f7
Compare
@flokli Sure, just rebased. There haven't been any relavent changes to Since I haven't received any response about the menu ordering here systemd/systemd#15256, I'll send a patch upstream to see if that sparks any discussion. I also recently patched Just to be clear, the logic is now this: We select the default entry specified in Upstream PR here: systemd/systemd#15577 |
0db9bed
to
2f2e083
Compare
I just prepared the bump to We also might need to address #86422 (comment), so actual sd-boot fixes manifest on existing installations. |
2f2e083
to
c03ed62
Compare
I've resolved the earlier issue with the My fix was this: instead of setting (for example) Since the proper default generation might not be the highest numbered generation, I've renamed the default entry to (for example) This approach has the additional benefit that non-default system profiles would be respected. If the user has run |
c03ed62
to
72a37ec
Compare
Hi @danielfullmer, I just wanted to swing by and say: thank you for your work on this! I tested this on my machine just now and it works flawlessly. Is there anything I can do to help move this into a mergeable state? EDIT: Removed from the Staging project because it looks like this is no longer a mass rebuild. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I just noticed (after having switched off of using this PR) -- I still have boot entries with counters attached, but I also have the same profiles without counters. e.g.:
$ ls /boot/loader/entries
[...]
.rwxr-xr-x 690 root 22 Mar 11:49 nixos-generation-70+0-1.conf
.rwxr-xr-x 580 root 23 Mar 13:29 nixos-generation-70.conf
.rwxr-xr-x 690 root 22 Mar 11:49 nixos-generation-71+0-1.conf
.rwxr-xr-x 580 root 23 Mar 13:29 nixos-generation-71.conf
.rwxr-xr-x 690 root 22 Mar 11:49 nixos-generation-72+0-1.conf
.rwxr-xr-x 580 root 23 Mar 13:29 nixos-generation-72.conf
.rwxr-xr-x 690 root 22 Mar 11:49 nixos-generation-73+0-3.conf
.rwxr-xr-x 580 root 23 Mar 13:29 nixos-generation-73.conf
.rwxr-xr-x 580 root 23 Mar 13:29 nixos-generation-74.conf
.rwxr-xr-x 580 root 23 Mar 13:29 nixos-generation-76.conf
.rwxr-xr-x 690 root 22 Mar 11:49 nixos-generation-default-74+3.conf
If boot counters are ever disabled and we see that there are profiles with a counter attached, what should be done? Nothing (and continue to regenerate the counted profiles on nixos-rebuild switch
)? Skip it (e.g. don't regenerate that profile)?
|
||
generation_dir = os.readlink(system_dir(self.profile, self.generation)) | ||
tmp_path = self.path + ".tmp" | ||
kernel_params = "systemConfig=%s init=%s/init " % (generation_dir, generation_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of 5276ebb, systemConfig
has been removed.
kernel_params = "systemConfig=%s init=%s/init " % (generation_dir, generation_dir) | |
kernel_params = "init=%s/init " % generation_dir |
I marked this as stale due to inactivity. → More info |
This still looks interesting, can we do something to help? |
I marked this as stale due to inactivity. → More info |
Still interested, can we help you @danielfullmer ? |
Man this has been on my TODO list for so long, apologies for not taking it the last 20% and trying to get it merged. I now have a concrete need for this feature so I intend to work on this soon (on the order of weeks). If anyone gets to it before I do: I know we'll at least need to (1) Resolve any conflicts and ensure the feature still works (2) Figure out a solution to @cole-h 's issue of how to handle boot entries if we turn the feature off and on, and (3) Write some documentation/release notes for the feature. This is very likely too late to make it into 22.05, but it'd be nice to get it merged into master for the release after that. |
@danielfullmer Gentlest of pings :) |
Partly related, I have a half-baked implementation of this for grub at home. Main issue is that it's shitty to implement when we want proper counting and that I cannot get the test runner to do what I want |
As part of Cachix Deploy, I'd very much like to sponsor this work to get it fixed for everyone in NixOS itself. Let me know if someone is willing to get it done :) |
@domenkozar I'm actually currently working on bringing this PR up-to-date with master right now. I know I've been meaning to work on this for a while now, and I don't want to prevent you from sponsoring someone else to work on this if needed. But if I don't have something within ~2 weeks, then take that as a sign that I might not get to it for a while longer. |
Are you unable to get sponsored for this? Not sure if that's what @domenkozar could have had in mind. |
It might be complicated for some unrelated reasons--but I'll reach out directly. |
I'm interested into taking over for this and bringing this to completion if @danielfullmer cannot really anymore. |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: |
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/pi4-broken-on-23-11-uefi-zfs-root/36375/3 |
@dasJ Could you share more about your grub implementation? Would love to see this PR go forward! |
Closing in favor of #273062 which is close to completion. |
Motivation for this change
This draft PR adds support for
systemd-boot
"automatic boot assessment".For background information, please first read the following document: https://systemd.io/AUTOMATIC_BOOT_ASSESSMENT/
Setting
boot.loader.systemd-boot.counters.enable = true
enables this optional functionality.Since what a "successful boot" exactly means is somewhat context dependent, systemd provides
boot-complete.target
which serves as a synchronization point that an end user could order necessary services before. The boot entry will be marked as successfully booting only if this target succeeds.For unattended boots, there are some other potentially useful options such as adding
panic=1
toboot.kernelParams
or settingFailureAction=reboot
for certain necessary services. However, these are end-user concerns that are somewhat orthogonal to boot counting and automatic fallback, and they probably do not belong in this PR.This is a draft PR, as there are a few issues that still need to be resolved:
systemd-boot
test needs a writable boot disk (since boot counters are state included in the entry filenames). I have a hack enabling this, which should be cleaned up somehow.(edit: the issue below has been resolved to my satisfaction)
Perhaps the biggest current issue which could use some feedback on is how we set the "default" for
systemd-boot
. Normally,systemd-boot-builder.py
sets thedefault
entry inloader.conf
. The logic insystemd-boot
means that this entry would always be selected--overriding anything having to do with boot counters. I've disabled settingdefault
in this draft PR, and sosystemd-boot
should select the latest generation which does not have 0 tries left. However, this would break the ability for the user to switch back to a previous generation, which is clearly undesirable. Perhaps we need to further patchsystemd-boot
to usedefault
only if it has remaining tries left, and otherwise select the latest generation with remaining tries left.Things done
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)