Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos-containers: add TimeoutStartSec option #65661

Merged
merged 1 commit into from Aug 28, 2019

Conversation

davidak
Copy link
Member

@davidak davidak commented Jul 31, 2019

Motivation for this change

Fixes #65001

Default is now 1m instead of global default of 15sec. It is also
configurable.

This is not a very elegant solution since the system load is still very high at boot, but it fixes the issue for now.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.
Notify maintainers

cc @arianvp @edolstra @kampfschlaefer

@davidak
Copy link
Member Author

davidak commented Jul 31, 2019

To test this, i created an Installer live image with 100 containers configured and bootet it.

All containers start now. The start fails without this fix.

@davidak
Copy link
Member Author

davidak commented Aug 14, 2019

@infinisil can you review it again?

Copy link
Member

@infinisil infinisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, can you fix the merge conflict? Looks good to me. Feel free to ping me again when done (or somebody else that is actively merging PR's)

@infinisil
Copy link
Member

Oh actually, while running nix-build nixos/release.nix -A tests.containers-imperative.x86_64-linux:

machine# [   61.148540] systemd[1]: /nix/store/ar6287z36akl2m2a9mdvan37ad17scjh-unit-container-.service/container@.service:31:
  Failed to parse TimeoutStartSec= parameter, ignoring:

You should probably make the dummy be valid too

Default is now 1m instead of global default of 15sec. It is also
configurable.

Fixes issue where start of many containers (40+) fail
NixOS#65001
@davidak
Copy link
Member Author

davidak commented Aug 28, 2019

@infinisil thanks for mentioning the tests. They run sucessfull now!

I set the dummy to the current default (systemd default) of 15s.

@infinisil infinisil merged commit fb7611a into NixOS:master Aug 28, 2019
@davidak davidak deleted the containers branch August 28, 2019 20:03
@maralorn
Copy link
Member

Hello @davidak, I am slightly confused by this PR.

You say the current default set by systemd is 15s. But I don‘t think this is the case. I am curious where you got this number. At least on my systems (using nixos for two years) the timeout is and has been since the introduction of systemd in arch years ago 90s.

I mean it doesn‘t really matter, but this PR actually lowered the default TimeoutStartSec for my containers and subsequently my containers now fail more often.

So I am wondering: What do you think about raising the timeout to 90s or making this option null-able and just not setting it by default to make sure we use the systemd default (even if they change it)?

@davidak
Copy link
Member Author

davidak commented Sep 11, 2019

@maralorn i got the information from this stackexchange answer :D

#65001 (comment)

I mean it doesn‘t really matter, but this PR actually lowered the default TimeoutStartSec for my containers and subsequently my containers now fail more often.

That was not my intention. Do you use NixOS stable or unstable? They might have different systemd versions.

So I am wondering: What do you think about raising the timeout to 90s or making this option null-able and just not setting it by default to make sure we use the systemd default (even if they change it)?

When it's actually 90s, we might be fine with a null-able option. But since we depend on it, it might be more stable to set 90s as default. Would you create a PR for that? I can review it then.

@davidak davidak mentioned this pull request Jan 8, 2022
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't start 100 nixos-containers
3 participants