Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZTS: init zfs test suite nixos tests (WIP) #110873

Closed
wants to merge 2 commits into from

Conversation

Mindavi
Copy link
Contributor

@Mindavi Mindavi commented Jan 26, 2021

Motivation for this change

#108649

Using the official test suite is nice. Currently some random tests are enabled, the hardest part of this whole story seems to be: which tests should be enabled?

Note that running the 'sanity' tests already takes over 40 minutes, which is unacceptable. @adisbladis and me had a quick chat over on IRC (https://logs.nix.samueldr.com/nixos/2021-01-10, search for Mindavi and adisbladis), and we determined that ~10 mins is a good amount of time to shoot for (this is quite arbitrary, and based on the fact that ZFS is a fundamental part of the system).

There are some things I'd like feedback on:

  • Do we want to combine the zfs.nix and this one (zfs-tests.nix currently)?
  • Is there a way to conditionally not remove the test suite files? I've commented out the line and tried some things with a variable as function argument, but couldn't figure out how to combine override and overrideAttrs (they're both needed).
  • Is it acceptable that ZFS is rebuilt? This is currently done because the build scripts are changed (due to changes required to the test scripts).

All the currently enabled tests are passing, but (as can be seen in the test script) some of the tests fail. I'm not sure why, and the failing tests may be contributing to the long sanity test time as well.

This one in particular seems to be a culprit:

file_wait_event exceeded 120.289 seconds
ERROR: file_wait_event /var/tmp/zed/zed.debug.log sysevent.fs.zfs.scrub_start exited 1

It can be seen multiple times in a failure log.

I think the first step for this should be that we decide on where to put this and how to handle the ZFS conditional building / inclusion of the test suite. After this is merged, it's possible to look into what tests are best to run. This set will at least do something in the good direction. If you have suggestions, please let me know. I don't want to be bikeshedding about which ones to run exactly now though. I'll trust any good advice on this, as long as the runtime stays acceptable.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@adisbladis
@Atemu

Attached is a test log of some of the failing tests, with the data of the logfiles that the ZTS generates included.

zfs-tests-failing-tests-dump.txt

@hmenke
Copy link
Member

hmenke commented Jan 26, 2021

It looks like there is a test set linux-fast upstream. Maybe that would be suitable?
https://github.com/openzfs/zfs/blob/f4f50a70488ebad3fa96c2fe4142924215733113/scripts/zfs-tests.sh#L338-L339

@Mindavi
Copy link
Contributor Author

Mindavi commented Jan 26, 2021

It does mention a linux-fast test, but it doesn't exist. A linux test does exist though, but I've started it an it has been running for 1830 seconds already now -- so that's too long.

@stale
Copy link

stale bot commented Aug 21, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Aug 21, 2021
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Aug 21, 2021
nixos/tests/zfs-tests.nix Outdated Show resolved Hide resolved
nixos/tests/zfs-tests.nix Outdated Show resolved Hide resolved
nixos/tests/zfs-tests.nix Outdated Show resolved Hide resolved
pkgs/os-specific/linux/zfs/default.nix Outdated Show resolved Hide resolved
nixos/tests/zfs-tests.nix Outdated Show resolved Hide resolved
@@ -17,6 +17,14 @@

# for determining the latest compatible linuxPackages
, linuxPackages_5_13

# By default, do not include tests as they cause gcc to be in the closure
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're also quite large, multiple dozen MiB.

Comment on lines 99 to 120
'' + optionalString (buildUser && includeTests) ''
# TODO(ricsch): Is overriding with /run/current-system/sw/bin acceptable?
substituteInPlace scripts/zfs-tests.sh \
--replace '$STF_PATH/ksh' ${ksh}/bin/ksh \
--replace 'LOSETUP=' 'LOSETUP=${util-linux}/bin/losetup #' \
--replace 'export PATH=$STF_PATH' '#' \
--replace 'DMSETUP=' 'DMSETUP=${lvm2}/bin/dmsetup #' \
--replace '$STF_PATH/gzip' '${gzip}/bin/gzip' \
--replace '$STF_PATH/gunzip' '${gzip}/bin/gunzip' \
--replace '/sbin/fsck.ext4' '${e2fsprogs}/bin/fsck.ext4' \
--replace '/sbin/mkfs.ext4' '${e2fsprogs}/bin/mkfs.ext4' \
--replace '$STF_PATH/awk' '${gawk}/bin/awk' \
--replace 'SYSTEM_DIRS="/usr/local/bin /usr/local/sbin"' 'SYSTEM_DIRS="/run/current-system/sw/bin"'

substituteInPlace scripts/common.sh.in \
--replace 'export ZTS_DIR=' 'export ZTS_DIR=${placeholder "out"}/share/zfs #' \
--replace 'export SCRIPT_DIR=' 'export SCRIPT_DIR=${placeholder "out"}/share/zfs #' \
--replace 'export ZDB=' 'export ZDB=${placeholder "out"}/bin/zdb #' \
--replace 'export ZFS=' 'export ZFS=${placeholder "out"}/bin/zfs #' \
--replace 'export ZPOOL=' 'export ZPOOL=${placeholder "out"}/bin/zpool #' \
--replace 'export ZTEST=' 'export ZTEST=${placeholder "out"}/bin/ztest #' \
--replace 'export ZFS_SH=' 'export ZFS_SH=${placeholder "out"}/share/zfs/zfs.sh #'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no better way to provide these dependencies? How do upstream testers declare these binaries? It'd probably be best to just get them from $PATH which is hermetic anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the PATH is being modified when these tests are run, which is slighly hocus-pocus to me.

@Mindavi
Copy link
Contributor Author

Mindavi commented Aug 23, 2021

@Atemu do you have a tip on how to run those tests in parallel on my machine? Is there some way to run them all instead of having to nix-build -A <testname> for each of them?

@Atemu
Copy link
Member

Atemu commented Aug 24, 2021

Usually, you can build a set of drvs with nix-build but the tests evaluate to lambdas which nix-build is somehow able to eval to a drv? I don't understand this magic well enough.

An easy way of doing this is to generate a list instead of a set; just map mkTest tests.

@Mindavi
Copy link
Contributor Author

Mindavi commented Oct 7, 2021

I did some experiments with making multiple outputs and by doing that properly, there was no loop in the references anymore. I should look into that some time (this PR isn't high-priority for me, just a nice to have).

@hmenke
Copy link
Member

hmenke commented Nov 11, 2021

It would be amazing if we had the full ZTS as a multiple output derivation. This would allow us to run the full ZTS as a single-shot test when bumping the ZFS version. With zfsUnstable there are often demands to “unlock” it for kernel version that have not been marked as compatible by upstream. If such a configuration passes the ZTS I would be much more comfortable approving such a merge, see also #145458

@Mindavi
Copy link
Contributor Author

Mindavi commented Dec 6, 2021

Rebased on top of latest master, but the deadman test now crashes the kernel and hangs indefinitely.

I think this is starting to be ready to go though. The tests are now put in a separate 'zfs_tests' output, to ensure they're properly separated.

Only thing that'd still be nice is if we could remove all hardcoded paths to the tools, but since the ZTS does a little dance for setting up the PATH that I don't fully understand, I'm not 100% sure how to approach that properly.

@Mindavi Mindavi marked this pull request as ready for review May 26, 2022 17:31
@Mindavi
Copy link
Contributor Author

Mindavi commented May 26, 2022

@ofborg test zfs

@Mindavi
Copy link
Contributor Author

Mindavi commented May 26, 2022

@ofborg build zfs

@Mindavi
Copy link
Contributor Author

Mindavi commented May 26, 2022

Hmm, so I should add it to passthru and to the toplevel tests file still, I think.

Copy link
Member

@Atemu Atemu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things:

  • The test isn't discoverable in nixosTests
  • We need to be able to test a matrix of zfs and kernel versions.

Comment on lines 106 to 136
'' + optionalString (buildUser) ''
# TODO(ricsch): Is overriding with /run/current-system/sw/bin acceptable?
substituteInPlace scripts/zfs-tests.sh \
--replace '$STF_PATH/ksh' ${ksh}/bin/ksh \
--replace 'LOSETUP=' 'LOSETUP=${util-linux}/bin/losetup #' \
--replace 'export PATH=$STF_PATH' '#' \
--replace 'DMSETUP=' 'DMSETUP=${lvm2}/bin/dmsetup #' \
--replace '$STF_PATH/gzip' '${gzip}/bin/gzip' \
--replace '$STF_PATH/gunzip' '${gzip}/bin/gunzip' \
--replace '/sbin/fsck.ext4' '${e2fsprogs}/bin/fsck.ext4' \
--replace '/sbin/mkfs.ext4' '${e2fsprogs}/bin/mkfs.ext4' \
--replace '$STF_PATH/awk' '${gawk}/bin/awk' \
--replace 'SYSTEM_DIRS="/usr/local/bin /usr/local/sbin"' 'SYSTEM_DIRS="/run/current-system/sw/bin"'

substituteInPlace scripts/common.sh.in \
--replace 'export ZTS_DIR=' 'export ZTS_DIR=${placeholder "zfs_tests"}/share/zfs #' \
--replace 'export SCRIPT_DIR=' 'export SCRIPT_DIR=${placeholder "zfs_tests"}/share/zfs #' \
--replace 'export ZDB=' 'export ZDB=${placeholder "out"}/bin/zdb #' \
--replace 'export ZFS=' 'export ZFS=${placeholder "out"}/bin/zfs #' \
--replace 'export ZPOOL=' 'export ZPOOL=${placeholder "out"}/bin/zpool #' \
--replace 'export ZTEST=' 'export ZTEST=${placeholder "out"}/bin/ztest #' \
--replace 'export ZFS_SH=' 'export ZFS_SH=${placeholder "zfs_tests"}/share/zfs/zfs.sh #'

# Fix ksh paths in test suite.
# patchShebangs doesn't work due to the scripts not being executable.
# It doesn't seem logical to make them executable either.
find -name "*.ksh" -exec sed -i 's,/bin/ksh,${ksh}/bin/ksh,' {} \;

# Patching maybe required for more binaries
# Maybe FHS environment would be better?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy, look at the time! It's time to resurrect this PR yet again :)

I think these scripts could use makeWrapper instead of awkwardly patching them. Most variables use the pattern LOSETUP=${LOSETUP:-/sbin/losetup}, so they can be overridden using makeWrapper ... --set LOSETUP "${util-linux}/bin/losetup".

Also what's up with all the ksh stuff? I've read through zfs-tests.sh and to me it looks like ksh is only used on FreeBSD.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! Hopefully I can find some time soon to maybe clean this up :). Of course, feel free to contribute if you feel like it.

@Mindavi
Copy link
Contributor Author

Mindavi commented Nov 22, 2022

I rebased this on top of master, didn't change anything else for now (other than solve a minor merge conflict). Seems to be still working in this form.

@ofborg ofborg bot requested a review from hmenke November 22, 2022 19:39
@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label May 21, 2023
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 20, 2024
@wegank wegank marked this pull request as draft March 20, 2024 22:53
@Mindavi Mindavi force-pushed the feature/zfs-tests branch 2 times, most recently from d0483ba to 496808d Compare March 25, 2024 19:58
@Mindavi
Copy link
Contributor Author

Mindavi commented Mar 25, 2024

I'm not really interested in getting this working anymore. I'm not that big of a ZFS user and I did this at some point to scratch an itch. It seems like the tests are now broken and there were some features I couldn't figure out how to implement (test all zfs versions). Hopefully someone can pick up from where I left, if someone's still interested in this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants