Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/slurm: fix default module parameters, update documenation #41016

Merged
merged 3 commits into from May 24, 2018

Conversation

markuskowa
Copy link
Member

@markuskowa markuskowa commented May 24, 2018

Motivation for this change

Slurm's default process tracking facility is now cgroups. However, the lack of the cgroups.conf config files, required by slurm, breaks the test. I added the option procTrackType to the module, setting the
default to proctrack/linuxproc. This enables the minimal example of the test to run again.

This PR follows the idea that a minimal setup configured through nix should be functional.
Integrating cgroups.conf requires an additional option in the module, since it is required
to reside in the same directory as slurm.conf (will be part of a future PR).

Things done
  • New option procTrackType
  • Refer to munge in the description (required to run in the default setting).
  • [EDIT]: enable munge by default.
  • [EDIT]: added slurm test to release.nix
  • Tested using sandboxing (nix.useSandbox on NixOS, or option build-use-sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
    nixos/tests/slurm.nix completes successfully
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Fits CONTRIBUTING.md.

@markuskowa markuskowa changed the title slurm/module: fix default module parameters, update documenation nixos/slurm: fix default module parameters, update documenation May 24, 2018
Wether to enable the slurm control daemon.
Note that the standard authentication method is "munge".
The "munge" service needs to be setup up in order for
slurm to work properly (see <literal>services.munge<literal>).
Copy link
Member

@Mic92 Mic92 May 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be a good idea to enable munge automatically?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a good idea. However, you need to provide a key file to the munge module to make it work.

@xeji
Copy link
Contributor

xeji commented May 24, 2018

Please add the slurm test to tests in nixos/release.nix so it will be run on Hydra.

@xeji
Copy link
Contributor

xeji commented May 24, 2018

@GrahamcOfBorg test slurm

@GrahamcOfBorg
Copy link

Success on x86_64-linux (full log)

Attempted: tests.slurm

Partial log (click to expand)

3 out of 3 tests succeeded
test script finished in 34.68s
cleaning up
killing node3 (pid 593)
killing node1 (pid 605)
killing node2 (pid 617)
killing control (pid 629)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/rp14m7d2b9wqnbrrsw38xalsxswc1lhw-vm-test-run-slurm

@@ -150,6 +169,8 @@ in

environment.systemPackages = [ wrappedSlurm ];

services.munge.enable = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be possible to use slurm without munge ? If yes, please change the value to mkDefault true so users can easily override it in their config.

However, you need to provide a key file to the munge module to make it work.

Can this change break existing configs that use slurm without munge ? If yes, that should be documented in the release notes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can run slurm without munge by turning off the authentication via extraConfig.
If you do not set services.munge.enable=false explicitly in that case, then this will result in the munged.service to fail on startup (given that no service.munge.password was provided).
I am not sure that this would really result in a breakage of an existing installation.
The way the slurm module is coded now munge is assumed to be there:
the slurmctld.service specifies requires = [ "munged.service" ].
Munge is practially "hardcoded" into the nixos module already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. So there's no need for extra documentation.

@GrahamcOfBorg
Copy link

Success on aarch64-linux (full log)

Attempted: tests.slurm

Partial log (click to expand)

3 out of 3 tests succeeded
test script finished in 25.76s
cleaning up
killing node1 (pid 627)
killing control (pid 640)
killing node2 (pid 653)
killing node3 (pid 666)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/1clxvvhfhyb0nml3v5y72wixrxxsjwmr-vm-test-run-slurm

@xeji xeji merged commit 8d4716a into NixOS:master May 24, 2018
@markuskowa markuskowa deleted the slurm-pr branch May 24, 2018 17:55
xeji added a commit that referenced this pull request May 24, 2018
build of nixos manual broke because of mismatched xml tags
in an option description introduced in #41016.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants