
zfs: Improve import handling #42269

Merged: 1 commit into NixOS:master on Jul 3, 2018

Conversation

Baughn (Contributor) commented on Jun 19, 2018

###### Motivation for this change

Addresses #42178

This is a big sledgehammer, but it should be a reasonably safe one: use the ZFS tools to check whether the pool is importable as ONLINE (as opposed to DEGRADED), and if not, sleep and retry. It does cost some boot-time performance, but in practice that means adding another second or two to boot time, worst case.
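Roughly, the check amounts to parsing the listing printed by `zpool import` for the pool's would-be state. A minimal sketch of that idea, assuming a bare `zpool` and `awk` on the PATH and a pool named `stash` (the actual patch wires the tool paths and pool name through the NixOS module):

```sh
# Succeed only if `zpool import` reports the pool as importable and ONLINE.
# The helper name and device directory are illustrative, not the patch's exact code.
pool_online() {
  pool="$1"
  state="$(zpool import -d /dev/disk/by-id 2>/dev/null \
    | awk -v pool="$pool" '
        $1 == "pool:" && $2 == pool { found = 1 }
        found && $1 == "state:"     { print $2; exit }')"
  if [ "$state" = "ONLINE" ]; then
    return 0
  else
    echo "Pool $pool in state ${state:-MISSING}, waiting"
    return 1
  fi
}
```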

What I don't like, and what I can't find a cause for, is that this also increases the size of a ZFS-enabled initrd from 12 MB to 120 MB. On examination, this is because it ends up including all of ncurses, the glibc headers, and so on and so annoyingly forth.

I don't know why. The only added dependency is gawk, which only depends on glibc, which is already in the initrd.


...nope, busybox awk has the same problem. I'll get back to this when I'm more awake.
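For reference, one way to chase down this kind of initrd bloat is to inspect the closure directly; the commands below are a generic sketch (the store paths are placeholders), not part of this PR:

```sh
# List everything in the initrd's runtime closure with per-path and closure sizes,
# then ask Nix why a suspicious dependency (e.g. ncurses) is reachable at all.
nix path-info -rsSh /nix/store/<hash>-initrd | sort -hk2
nix why-depends /nix/store/<hash>-initrd /nix/store/<hash>-ncurses-6.1
```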

###### Things done


- [x] Tested using sandboxing ([nix.useSandbox](http://nixos.org/nixos/manual/options.html#opt-nix.useSandbox) on NixOS, or option `sandbox` in [`nix.conf`](http://nixos.org/nix/manual/#sec-conf-file) on non-NixOS)
- Built on platform(s)
   - [x] NixOS
   - [ ] macOS
   - [ ] other Linux distributions
- [ ] Tested via one or more NixOS test(s) if existing and applicable for the change (look inside [nixos/tests](https://github.com/NixOS/nixpkgs/blob/master/nixos/tests))
- [x] Tested compilation of all pkgs that depend on this change using `nix-shell -p nox --run "nox-review wip"`
- [x] Tested execution of all binary files (usually in `./result/bin/`)
- [x] Fits [CONTRIBUTING.md](https://github.com/NixOS/nixpkgs/blob/master/.github/CONTRIBUTING.md).

---

Baughn (author) commented on Jun 19, 2018

I've tested this manually, by restarting a machine in various stages of (occasionally non-)degradation. I haven't been able to test degradation of the root device, as I only have a single NVMe root disk, but I see no reason why that case should fail.

Baughn (author) commented on Jun 19, 2018

Probable output when no devices are broken:

-- Logs begin at Sat 2018-06-16 01:41:18 IST, end at Tue 2018-06-19 20:59:04 IST. --
Jun 19 20:53:44 tsugumi systemd[1]: Starting Import ZFS pool "stash"...
Jun 19 20:53:44 tsugumi zfs-import-stash-start[3705]: importing ZFS pool "stash"...no pools available to import
Jun 19 20:53:44 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:45 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:45 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:46 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:46 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:47 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:47 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:48 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:48 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:49 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:49 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:50 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:50 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:51 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:51 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:52 tsugumi zfs-import-stash-start[3705]: no pools available to import
Jun 19 20:53:52 tsugumi zfs-import-stash-start[3705]: Pool stash in state MISSING, waiting
Jun 19 20:53:55 tsugumi zfs-import-stash-start[3705]: Pool stash in state DEGRADED, waiting
Jun 19 20:53:58 tsugumi zfs-import-stash-start[3705]: Successfully imported stash
Jun 19 20:53:58 tsugumi systemd[1]: Started Import ZFS pool "stash".

@@ -58,6 +58,26 @@ let

  snapshotNames = [ "frequent" "hourly" "daily" "weekly" "monthly" ];

  importLib = {pool, cfgZfs}: ''
    zpool_cmd="${packages.zfsUser}/sbin/zpool"
    awk_cmd="${pkgs.gawk}/bin/awk"
Member:

Maybe use the busybox awk instead

Baughn (author):

Now there's an idea.

Well, it worked, but it didn't affect the closure size. I'm starting to wonder if there's a spurious dependency from the scripting, somehow.

Baughn (author):

...false. It seems I'm getting too sleepy for this, as I forgot to git commit.

Swapping to busybox awk fixes the closure size... though it doesn't tell me why that was a problem to begin with.

Baughn (author):

...false, again. I'll get back to this when I'm more awake, but it happens independently of the awk selection.

Baughn (author) commented on Jun 19, 2018

Got it. It's obvious in retrospect; I'd replaced the carefully copied zpool binary with a direct reference to the zpool package. That bloat never had anything to do with awk at all.

Now it's good.

Baughn (author) commented on Jun 19, 2018

...and this should be cherry-picked to stable as well, because the previous code is a data-loss event waiting to happen.

until msg="$(zpool import -d ${cfgZfs.devNodes} -N $ZFS_FORCE '${pool}' 2>&1)"; do
sleep 0.25
# Loop across the import until it succeeds, because the devices needed may not be discovered yet.
imported && exit
Mic92 (Member) commented on Jun 20, 2018:

This is not an independent script. I don't think `exit` is a good choice here; it may interfere with other commands later in this file.

Baughn (author):

Fixed.
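A minimal illustration of the concern, using the `imported`/`import` helpers from the hunk above (the guarded form is only illustrative, not necessarily the exact fix):

```sh
# Before: `exit` terminates the whole generated script, so anything appended
# after this fragment never runs.
imported && exit

# After: guard just the import step; later commands in the file still execute.
if ! imported; then
  import
fi
```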

zpoolCmd = "zpool";
awkCmd = "awk";
inherit pool cfgZfs;
}) + ''
Member:

Here you overwrite the existing function if multiple pools are defined. Can you define each function only once?

Baughn (author):

If multiple root pools are defined?

Well, I suppose it'd be cleaner to not map across it. I'm not sure why this is a map in the first place, but let's do it right.

"${zpoolCmd}" list "${pool}" >/dev/null 2>/dev/null
}
import() {
"${zpoolCmd}" import -d "${cfgZfs.devNodes}" -N $ZFS_FORCE "${pool}"
Member:

I would recommend renaming `ready` to `zpool-ready`, taking the pool name as the first argument.
It would actually be more readable if the other helper functions were inlined, since the commands are not much more complex than their replacements.
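A sketch of that suggestion, defining each helper once and passing the pool name as the first argument (names and the device directory are illustrative, not the final ones in the patch):

```sh
pool_imported() {
  zpool list "$1" >/dev/null 2>&1
}

pool_import() {
  zpool import -d /dev/disk/by-id -N "$1"
}

# Works for any number of pools without redefining the functions:
pool_imported stash || pool_import stash
pool_imported tank  || pool_import tank
```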

Baughn (author):

Fixed the names.

I think the functions are quite a bit more readable than their contents, but that's not the only reason they exist. It's also to avoid duplicating the code. We might need to change some of this in the future, and it's good not to have to change it in two places.

Note that the import systemd unit and the import-root-pool scripts had already diverged, so that isn't an academic concern.

sleep 1
done
imported || import # Try one last time, e.g. to import a degraded pool.
if imported; then
Member:

What is the return value of `zpool import` in case of a failure? Shouldn't the script already exit before it reaches this line?

Baughn (author):

Most likely. I'm going to call this defensive coding; no matter what happens, we won't claim the pool is successfully imported if it isn't.

ready && import && break
sleep 1
done
imported || import # Try one last time, e.g. to import a degraded pool.
Member:

I don't understand why this is tried one last time when it has already failed 60 times before.

Baughn (author):

It isn't tried in the same way. "ready && import" would only import the pool if every device is available, i.e. it's in state ONLINE and not state DEGRADED.

The unconditional import at the end will fail if it's in state FAULTED or MISSING, of course, but not if it's DEGRADED. That's a feature: The system will still boot if one or two devices are missing, just not as quickly. But we don't want it to import the pool in a degraded state if it can avoid that by waiting another few seconds.
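Condensed, the control flow being described looks roughly like this (reusing the illustrative `pool_online` check sketched in the description; the 60-iteration bound matches the discussion above):

```sh
# Poll for up to ~60 seconds: while waiting, only import if the pool would come
# up ONLINE, i.e. every device has been discovered.
for _ in $(seq 1 60); do
  pool_online stash && zpool import -d /dev/disk/by-id -N stash && break
  sleep 1
done

# Last resort: import even if the pool is only DEGRADED, so the system still
# boots with a missing device or two; a FAULTED or absent pool still fails here.
zpool list stash >/dev/null 2>&1 || zpool import -d /dev/disk/by-id -N stash
```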

Member:

Alright, can you also mention this in the comment?

Mic92 (Member) commented on Jun 20, 2018:

Also mention that it waits for devices to become visible before the pool is imported the first time.

Baughn (author):

Done.

Baughn (author) commented on Jun 22, 2018

Anything else that needs doing?

Mic92 (Member) commented on Jun 28, 2018

@GrahamcOfBorg test installer.zfsroot

Mic92 (Member) commented on Jun 28, 2018

@woffs can you test the import service? I only have zfs as a root pool.

GrahamcOfBorg:

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: tests.installer.zfsroot

Partial log:

while evaluating anonymous function at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/nixos/release.nix:30:25, called from /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/lib/attrsets.nix:199:52:
while evaluating 'hydraJob' at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/lib/customisation.nix:162:14, called from /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/nixos/release.nix:30:28:
while evaluating the attribute 'drvPath' at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/lib/customisation.nix:179:13:
while evaluating the attribute 'drvPath' at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/lib/customisation.nix:146:13:
while evaluating the attribute 'buildCommand' of the derivation 'vm-test-run-installer-zfs-root' at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/pkgs/stdenv/generic/make-derivation.nix:170:11:
while evaluating the attribute 'testScript' of the derivation 'nixos-test-driver-installer-zfs-root' at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/pkgs/stdenv/generic/make-derivation.nix:170:11:
while evaluating 'isFunction' at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/lib/trivial.nix:213:16, called from /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/nixos/lib/testing.nix:104:12:
while evaluating 'testScriptFun' at /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/nixos/tests/installer.nix:68:19, called from /var/lib/gc-of-borg/nix-test-rs-21/repo/38dca4e3aa6bca43ea96d2fcc04e8229/builder/grahamc-aarch64-community-21/nixos/tests/installer.nix:267:20:
Non-EFI boot methods are only supported on i686 / x86_64

GrahamcOfBorg:

Success on x86_64-linux (full log)

Attempted: tests.installer.zfsroot

Partial log:

machine# [    6.627932] systemd-shutdown[1]: All loop devices detached.
machine# [    6.628714] systemd-shutdown[1]: Detaching DM devices.
machine# [    6.630723] ACPI: Preparing to enter system sleep state S5
machine# [    6.631594] reboot: Power down
collecting coverage data
syncing
test script finished in 468.35s
vde_switch: EOF on stdin, cleaning up and exiting
cleaning up
/nix/store/q9mkir8x3i29ffy2j7xlnmah9m0md04y-vm-test-run-installer-zfs-root

woffs (Contributor) commented on Jun 28, 2018

@Mic92 I applied 1f50c87c83d0c46a11d44958f59bd408db74bbc2 onto nixos-18.03-small and rebooted. It worked, the pool (consisting of 24 devices) was cleanly imported.

Mic92 (Member) commented on Jun 28, 2018

@Baughn or @woffs can you re-add data pool encryption support?

woffs (Contributor) commented on Jun 28, 2018

Should be easily done, but what's the best way to do it? Rebase this onto current master, or reapply #41846 onto #42269? I guess I have no push access to this.

Mic92 (Member) commented on Jun 28, 2018

@woffs you could open a new pull request on top of this one.

Baughn (author) commented on Jun 29, 2018

I can try updating this one, but it's problematic to rebase it on top of unstable -- I don't run unstable on any of my computers anymore. Perhaps I should.

Baughn (author) commented on Jul 2, 2018

I rebased it. It might work. But at the moment unstable doesn't build, so we'll never know.

Baughn (author) commented on Jul 2, 2018

I was able to test it. It seems to work...

Mic92 merged commit 54da292 into NixOS:master on Jul 3, 2018.
Baughn deleted the zfs branch on Jul 3, 2018 at 12:38.
sjau commented on Aug 14, 2018

This seems to cause a 2 min 25 s delay for me when booting my notebooks from mirrored SSDs.

#43797

gebner (Member) commented on Aug 31, 2018

@Baughn Thanks for fixing this! I just spent a few hours yesterday trying to figure out why two of the devices in my pool were UNAVAIL ("label is missing or invalid"), even though the disks were clearly there in /dev/...

Taneb added a commit to Taneb/nixpkgs that referenced this pull request on Nov 4, 2020:
It looks like its functionality was (possibly accidentally) removed in NixOS#42269

7 participants