-
-
Notifications
You must be signed in to change notification settings - Fork 15.5k
nixos/buildkite-agents: support multiple buildkite agents #78373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@GrahamcOfBorg test buildkite-agents |
Oh you didn't rename the test attr @GrahamcOfBorg test buildkite-agent |
2cc5149
to
ea954b3
Compare
(test attr renamed now) |
enable = mkEnableOption "buildkite-agent"; | ||
buildkiteOptions = { name ? "", config, ... }: { | ||
options = { | ||
enable = mkOption { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the fact that it is declared assume that you want it enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Would be weird if you declared an agent and then you have to explicitly enable it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still want to have the option to then explicitly disable it? Seems weird, as we filter by enabled agents…
nixos/modules/services/continuous-integration/buildkite-agents.nix
Outdated
Show resolved
Hide resolved
Do we not want to keep compatibility? Provide |
dataDir = mkOption { | ||
default = "/var/lib/buildkite-agent"; | ||
default = "/var/lib/buildkite-agent-${name}"; | ||
description = "The workdir for the agent"; | ||
type = types.str; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you know usecases of where this should be somewhere else than at this location?
If we restrict it to /var/lib/buildkite-agent-${name}
, we could just set StateDirectory=buildkite-agent=${name}
, and DynamicUser=true
, and don't need to setup a system-wide user at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we use the buildkite directory to store some gc roots that are used for Continuous Delivery. We could change to a ReadWriteDirectory, but I don't think DynamicUser
is really set up for communication with the outside world.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our approach to CD is arguably horrible by abusing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'd recommend using some post-build-hook on the global nix-daemon to sign and upload to a binary cache, but hmm… I can understand you can't change this that quick.
Still, what about defaulting to use DynamicUser=true
, and StateDirectory=buidlkite-agent-${name}
, and only use a system-wide user if both a user
attribute and that dataDir
attribute is set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using some post-build-hook on the global nix-daemon to sign and upload to a binary cache
Been there, tried that :(. Builds become horribly slow and the host becomes almost unusable as it is uploading everything, even your user environment after nix-env -i
(and that’s slow too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the naive approach blocks the build loop, as documented in https://nixos.org/nix/manual/#chap-post-build-hook-caveats .
In your case, you might want to have the executed hook to fork away as fast as possible.
However, I see that discussion as not really related to this PR - it doesn't yet introduce DynamicUser
.
nixos/modules/services/continuous-integration/buildkite-agents.nix
Outdated
Show resolved
Hide resolved
I'm fine with some breaking changes, if they're documented in the release notes and remove some cruft. We did some other updates when moving to buildkite 3, so I'd expect people to take a look at the release notes anyway and update their configuration. |
4e26f22
to
520361f
Compare
(mkRemovedOptionModule [ "services" "buildkite-agent" "openssh" "publicKey" ] "SSH public keys aren't necessary to clone private repos.") | ||
(mkRemovedOptionModule [ "services" "buildkite-agent" "openssh" "publicKeyPath" ] "SSH public keys aren't necessary to clone private repos.") | ||
(mkRenamedOptionModule [ "services" "buildkite-agent" "meta-data"] [ "services" "buildkite-agent" "tags" ]) | ||
(mkRemovedOptionModule [ "services" "buildkite-agent"] "services.buildkite-agent has been moved to an attribute set at services.buildkite-agents") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be relatively easy to make this backwards compatible with mkChangedOptionModule
, which will also issue a warning when the old option is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do that, but what should we make the default agent name? The state paths for it would still change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since people are expected to have used dataDir
to refer to the path, I don't think it should be a problem. Also from this PR I can see that the service previously didn't even use dataDir
to get its config, seemingly indicating that it was never fully supported anyways.
Also I correct myself: A simple
mkRenamedOptionModule [ "services" "buildkite-agent" ]
[ "services" "buildkite-agents" "legacy" ]
should work (using the name legacy
, or maybe single
or nixos-single
or so)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the dataDir
actually contains long-term state. People would have to move the contents of /var/lib/buildkite-agent
to /var/lib/buildkite-agent-legacy
, or get unexpected behavior. I think it's safer to have it break during the build, instead of during runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohh good point, didn't think of that. Okay so a mkRenamedOptionModule
won't work, but a mkChangedOptionModule
should still be possible with a legacy
attribute (or similar) and by defaulting its dataDir
to /var/lib/buildkite-agent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think breaking old config and making sure the user did read the release notes is a better way forward here.
There have been quite some changes, and just silently assuming everything will still work as it did before is probably a bad idea anyways.
I also proposed to default to use DynamicUsers
, and make the usage of a non-dynamic user, and custom StateDirectory
the non-default case, which might cause nontrivial migrations aswell.
I think batching all these with the option syntax change is a reasonable path forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the "quite some changes"? From what I can see all this PR does is to add support for multiple agents as it says, which imo really doesn't warrant breaking people's configs, especially if it only takes a couple lines to make it automatic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@infinisil see the earlier PRs and the corresponding release notes:
https://github.com/NixOS/nixpkgs/blob/master/nixos/doc/manual/release-notes/rl-2003.xml#L417
As for the "defaulting to DynamicUser
": #78373 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were any of those not backwards compatible? If so they should've added a warning/error themselves. This PR itself seems 100% backwards compatible with a mkChangedOptionModule
. Unless backwards compat has to be broken I'd rather not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the module options themselves not, but the upgrade from buildkite 2 to 3 might have introduced some behavorial changes in how the bootstrap/hooks were configured: https://buildkite.com/docs/agent/v3/upgrading
04b97b7
to
e242ecc
Compare
@GrahamcOfBorg test buildkite-agent |
@GrahamcOfBorg test buildkite-agents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
a16d757
to
e242ecc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this on IRC: https://logs.nix.samueldr.com/nixos-dev/2020-02-10#1581336556-1581341910;
Looks good to me now :)
After this PR I get:
but I am not specifying hooksPath, and am only specifying hooks.environment. |
|
||
assertions = [ | ||
config.assertions = mapAgents (name: cfg: [ | ||
{ assertion = cfg.hooksPath == hooksDir || all (v: v == null) (attrValues cfg.hooks); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ assertion = cfg.hooksPath == hooksDir || all (v: v == null) (attrValues cfg.hooks); | |
{ assertion = cfg.hooksPath == (hooksDir cfg) || all (v: v == null) (attrValues cfg.hooks); |
Motivation for this change
Continued from #68378, which was continued from #59358.
Things done
sandbox
innix.conf
on non-NixOS linux)cc
@flokli @mkaito @balsoft @infinisil