hdf4: fix tests on some machines #61059

Merged
merged 1 commit into from May 12, 2019

lopsided98
Contributor

Motivation for this change

Some of the hdf4 package's tests fail on certain machines. In my testing, a given machine either passes or fails consistently; I have not seen any transient failures.

I discovered this issue because the Hydra builder "ike" failed to build this package, so one of my builders attempted to build it and also failed. The build previously succeeded on builder "t4a". I tested both the previously successful commit and the failing one, and reproduced the failure each time, so this is not a regression.

Here are the results of tests on different machines:

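| Status | CPU | OS | Cores |
|--------|-----|----|-------|
| ✔️ | i7-6700HQ | Arch Linux | 8 |
| ✔️ | Xeon E5-1620 v2 | NixOS | 8 |
| ❌ | Xeon E5-2640 v4 | Ubuntu 18.04 | 40 |
| ❌ | Xeon E5-2620 | Ubuntu 18.04 | 24 |
| ❌ | Xeon E5-2623 v3 | Fedora 30 | 16 |
| ❌ | Xeon E5-2660 v3 | Fedora 29 | 40 |
| ❌ | Xeon E5-2623 v3 | Fedora 24 | 16 |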

All of these tests were done with sandboxing enabled.

Things done

Interestingly, the most reliable way I can find to fix the tests is to enable all the tests. The tests that were previously disabled pass on all machines I was able to test, including armv7l and aarch64, and they seem to make the other failing tests pass as well. I was not able to test macOS, but that can be done by ofborg.

I'm not sure what's going on here, but I suspect it fails on machines that have large numbers of cores, maybe due to a race condition.

cc @knedlsepp

  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Assured whether relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@knedlsepp
Member

It might be worth checking whether disabling parallel building with enableParallelBuilding = false; has any effect.

@c0bw3b
Contributor

c0bw3b commented May 7, 2019

Or you could tell Nix to use no more than 8 cores for the build:

  preBuild = ''
    # Cap the number of build cores at 8.
    export NIX_BUILD_CORES=$(( $NIX_BUILD_CORES <= 8 ? $NIX_BUILD_CORES : 8 ))
  '';
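
The arithmetic expansion in that preBuild hook can be checked in isolation. Here is a minimal sketch, assuming a POSIX shell; clamp_cores is a hypothetical helper used only for illustration and is not part of nixpkgs:

```shell
# Hypothetical helper (not part of nixpkgs): print the smaller of a
# requested core count ($1) and an upper limit ($2).
clamp_cores() {
  echo $(( $1 <= $2 ? $1 : $2 ))
}

clamp_cores 40 8   # prints 8: a 40-core builder is capped
clamp_cores 4 8    # prints 4: smaller machines are unaffected
```

With a limit of 8, the two machines reported as passing would keep their full core count, and only the larger builders would be throttled.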

@lopsided98
Contributor Author

This PR seems to fix the problem in practice, so I'd rather just merge this as is and only disable/limit parallel building if we actually observe build failures.

@c0bw3b
Contributor

c0bw3b commented May 7, 2019

@GrahamcOfBorg build hdf4

@c0bw3b
Contributor

c0bw3b commented May 7, 2019

Failing on Darwin:

The following tests FAILED:
        130 - MFHDF_TEST-hdftest (SEGFAULT)
        134 - MFHDF_TEST-hdftest-shared (SEGFAULT)
        228 - HDP-dumpsds-18 (Failed)
        442 - NC_TEST-nctest (SEGFAULT)

@lopsided98
Contributor Author

Those tests must have been disabled unconditionally even though they only fail on macOS.
