New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add package & NixOS module for journaldriver #42134
Conversation
5453ed1
to
98ad891
Compare
98ad891
to
9797564
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can one know whether its a GCP machine or not?
And a nitpick: Only half of the option declarations have the =
lined up nicely :)
{ lib, fetchFromGitHub, rustPlatform, pkgconfig, openssl, systemd }: | ||
|
||
rustPlatform.buildRustPackage rec { | ||
name = "journaldriver"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version should be appended to the name, separated by a dash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
}; | ||
|
||
# Disable tests in nixpkgs | ||
doCheck = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for the tests being disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly decided to do that due to the time it takes to build the entire dependency chain, which will happen twice for running tests & non-tests.
The release versions have already had tests run (otherwise they wouldn't have gone into upstream master) and since all dependencies are pinned it should be fine.
Normally I'd package a Rust tool by using the buildRustCrate
(which doesn't run tests at all), but that's not resolving the dependency on libsystemd correctly for some reason and has no obvious (read: documented) way to override the dependencies it figures out.
If you think tests should be enabled I don't mind enabling them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as the tests work, I think they should be on by default. Hydra will have it in the cache anyways, so it doesn't really matter how long it takes for the end user. Usually the only good reason to disable tests is because the tests require network access or something else that Nix doesn't have in the sandbox.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Updating with tests enabled.
# Disable tests in nixpkgs | ||
doCheck = false; | ||
|
||
buildInputs = [ pkgconfig openssl systemd.dev ]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
buildInputs
should choose the dev outputs automatically, so I think it should also work with s/systemd.dev/systemd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, pkgconfig
should go in nativeBuildInputs
|
||
config = mkIf cfg.enable { | ||
users.extraUsers.journaldriver = { | ||
isSystemUser = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this set to true
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand the differentiation this is only to indicate that the user account is not used by a human, which results in the UID of the account being allocated in a different range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not entirely sure, there are a couple services that use this, but others set a fixed id in https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/misc/ids.nix#L36. I hope somebody else can comment on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment at the top of ids.nix
sounds to me like the preferred way is to not hardcode an ID unless necessary.
It may be possible to use the DynamicUser=
option in systemd. Journaldriver needs to have a persistent file to store its cursor position in, this should be possible to achieve with StateDirectory=
in systemd - it's not obvious how to grant the dynamic user membership in the journal access group though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update:
This configuration seems to work without specifying a user at all:
serviceConfig = {
Restart = "always";
DynamicUser = true;
StateDirectory = "journaldriver";
# This group is required for accessing journald.
SupplementaryGroups = "systemd-journal";
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! I didn't know of that :o
}; | ||
|
||
systemd.services.journaldriver = { | ||
inherit environment; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can move the section at the start here, and change it to something like this:
environment = {
RUST_LOG = cfg.logLevel;
LOG_NAME = cfg.logName;
# ...
}
This works because environment
allows null
values which has the effect of not setting the vars at all already.
|
||
logLevel = mkOption { | ||
type = types.string; | ||
default = "info"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What valid values are there? You can change the type to types.enum [ "info" "debug" ... ]
to only allow certain values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The values aren't constrained by an enum in this case, Rust's logging library accepts pretty complex configuration via this environment variable (for example one could enable trace logging for the HTTP client used within while not touching the log-level of the rest of the service).
@infinisil Thanks for the review! Good feedback 👌
Hmm, presumably people know who their hosting provider is ... I'd hope? Technically one could try to run some kind of detection by attempting to speak to Google's instance metadata service, but that should probably be a change in journaldriver and not something that Nix should do. I'm pushing an updated version that addresses some of your points and will quickly test it on some test machines. |
9797564
to
66c5129
Compare
Update: Tests successful :) |
}; | ||
|
||
# Disable tests in nixpkgs | ||
doCheck = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as the tests work, I think they should be on by default. Hydra will have it in the cache anyways, so it doesn't really matter how long it takes for the end user. Usually the only good reason to disable tests is because the tests require network access or something else that Nix doesn't have in the sandbox.
"LOG_NAME" = cfg.logName; | ||
"LOG_STREAM" = cfg.logStream; | ||
"GOOGLE_CLOUD_PROJECT" = cfg.googleCloudProject; | ||
"GOOGLE_APPLICATION_CREDENTIALS" = cfg.applicationCredentials; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nicer to not use the "
here, RUST_LOG = cfg.logLevel
works just fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Updated
|
||
config = mkIf cfg.enable { | ||
users.extraUsers.journaldriver = { | ||
isSystemUser = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'm not entirely sure, there are a couple services that use this, but others set a fixed id in https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/misc/ids.nix#L36. I hope somebody else can comment on this
66c5129
to
3ed9bfc
Compare
Refactored to use the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Another thing I'm unsure about is network.target
vs network-online.target
. I'd think that most services would require network-online.target
, but there are vastly more services actually using network.target
. I suspect that's just because of confusion regarding those two, network-online.target
seems to be the right one here.
One last thing to change and then this is looking good to me :)
}; | ||
|
||
logLevel = mkOption { | ||
type = types.string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost missed this: The type should be types.str
instead:
Lines 203 to 205 in 7296c26
# Deprecated; should not be used because it quietly concatenates | |
# strings, which is usually not what you want. | |
string = separatedString ""; |
The same for all other options that use types.string
except applicationCredentials
, which should be changed to a path: types.path
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
50a4d12
to
bfea9b2
Compare
Regarding the network targets: I read up on that a bit before making the change and according to this article, the |
|
||
# The directory is set up and permissions are configured by | ||
# systemd via the `StateDirectory` configuration option. | ||
CURSOR_POSITION_FILE = "/var/lib/private/journaldriver/cursor.pos"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading man systemd.exec
on StateDirectory
tells me that symlinks should be set up to point /var/lib/journaldriver -> /var/lib/private/journaldriver
, so I think the private/
infix could be dropped.
Other than that, I'd squash the 3 module commits into 1, and then I got nothing more to say :) |
bfea9b2
to
6f3d976
Compare
@infinisil Upstream now uses I've squashed the commits together now. Thanks for your review, I learned a bunch of interesting things that aren't immediately obvious from the Nix docs :) |
Adds a module for running the journaldriver log forwarding agent via systemd. The agent can be deployed on both GCP instances and machines hosted elsewhere to forward all logs from journald to Stackdriver Logging. Consult the module options and upstream documentation for more information. Implementation notes: * The service unit is configured to use systemd's dynamic user feature which will let systemd set up the state directory and appropriate user configuration at unit launch time instead of hardcoding it. * The module depends on `network-online.target` to prevent a situation where journaldriver is failing and restarting multiple times before the network is online.
6f3d976
to
59e5aab
Compare
Verified the final version in our test environment, good to go from my side! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks for the PR @tazjin and @infinisil for the thorough review. 🍻
@fpletz Thank you! :) |
Imports the `journaldriver` module into the top-level NixOS module list to make it usable without extra work. This went unnoticed in NixOS#42134 (mostly because my setup imports modules explicitly from pinned versions). Fixes NixOS#50390
Motivation for this change
Add support for the journaldriver log forwarding agent, primarily for use on Google Cloud Platform instances.
This PR adds a derivation for building
journaldriver
, as well as a NixOS module that configures a user and service for running the log forwarder.Things done
sandbox
innix.conf
on non-NixOS)nix-shell -p nox --run "nox-review wip"
./result/bin/
)We have deployed this module on a bunch of machines in our test environments (both on Google Cloud machines and VMs hosted elsewhere) and it works as expected.