Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

glusterfs service: Switch to simple unit instead of forking #49736

Merged

Conversation

nh2
Copy link
Contributor

@nh2 nh2 commented Nov 4, 2018

Motivation for this change

Gluster's pidfile handling is bug-ridden.

I have fixed https://bugzilla.redhat.com/show_bug.cgi?id=1509340
in an attempt to improve it but that is far from enough.

The gluster developers describe another pidfile issue as
"our brick-process management is a total nightmare", see
https://github.com/gluster/glusterfs/blob/f1071f17e02502c24375c0b480d369d37f4e4054/xlators/mgmt/glusterd/src/glusterd-utils.c#L5907-L5924

I have observed multiple cases where glusterd doesn't start correctly
and systemd doesn't notice because of the erroneous pidfile handling.

To improve the situation, we don't let glusterd daemonize itself any more
and instead use --no-daemon and the Simple service type.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

CC @bachp

I have tested this for a long time on my Gluster cluster.

Now I'm moved to ceph, but I still want to upstream the improvements to NixOS.

Gluster's pidfile handling is bug-ridden.

I have fixed https://bugzilla.redhat.com/show_bug.cgi?id=1509340
in an attempt to improve it but that is far from enough.

The gluster developers describe another pidfile issue as
"our brick-process management is a total nightmare", see
https://github.com/gluster/glusterfs/blob/f1071f17e02502c24375c0b480d369d37f4e4054/xlators/mgmt/glusterd/src/glusterd-utils.c#L5907-L5924

I have observed multiple cases where glusterd doesn't start correctly
and systemd doesn't notice because of the erroneous pidfile handling.

To improve the situation, we don't let glusterd daemonize itself any more
and instead use `--no-daemon` and the `Simple` service type.
@bachp
Copy link
Member

bachp commented Nov 4, 2018

The current service was based on the one in the upstream repository.

But I'm completly fine if we change it for NixOS as I also don't like the complexity of pid files.

@nh2
Copy link
Contributor Author

nh2 commented Nov 4, 2018

@bachp That's right. Note gluster may still be as confused as before about its own children, but at least with this patch, systemd knows correctly whether gluster is running or not.

@lheckemann lheckemann merged commit 231e671 into NixOS:master Nov 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants