New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataDog Agent v6 #40399
DataDog Agent v6 #40399
Conversation
Will the old version be supported for a while? |
@Mic92 Yes I assume it will be supported by DataDog for a while because v6 still lacks features that v5 has (changes, missing_features). Also users might need time to update any custom Python check scripts that they have. |
|
||
# DataDog use paths relative to the agent binary, so fix these. | ||
postPatch = '' | ||
sed -e "s|PyChecksPath =.*|PyChecksPath = \"$bin/${python.sitePackages}\"|" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expect this to do the same and it is simpler (no escaping). Apply in multiple places.
sed -e 's|PyChecksPath =.*|PyChecksPath = "$bin/${python.sitePackages}"|' \
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think $bin
would be expanded in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. In that case
sed -e 's|PyChecksPath =.*|PyChecksPath = "'"$bin"'/${python.sitePackages}"|' \
@rvl This looks exactly like what I had in mind when I opened the issue. I hope it will be merged soon. |
description = "Datadog agent monitor"; | ||
} // (if newDog then { | ||
preStart = '' | ||
chown datadog /etc/dd-agent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not done recursively?
serviceConfig = { | ||
ExecStart = "${cfg.package}/bin/dogstatsd"; | ||
Type = "forking"; | ||
PIDFile = "/tmp/dogstatsd.pid"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/tmp/
is a strange place for this. I hope statsd checks that no other user has created a malicious symlink there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are established locations for this stuff indeed. /tmp
is not the right location.
For my use case this kind of security is not important, but for some people it might be.
OK thanks @Mic92 I have fixed those issues. Also I noticed that the delightful v5 dd-agent code apart from putting PID files in In no hurry to merge this until it gets some testing from datadog users. |
I think it is used in the core infrastructure of NixOS itself. |
I haven't tested this on an actual machine yet, but from a superficial look I think this is missing the inclusion of the core integrations. The agent's go rewrite only covers some of the core functionality, most checks are still run via Python. If you look at the derivation output of the V5 Datadog's upstream packages (for Debian & co) include these in the agent package and they also note that they're included in the documentation. I've previously read through the upstream build process that they use to create the deb & RPM packages and it's ... complicated, to say the least. I'm not really familiar with the Python ecosystem but if someone wants to take a look at this together, please ping me :) |
4129a85
to
fb27009
Compare
Thanks @tazjin - I have added the python core integrations. I have also decided to not make the dd-agent module work with both v5 and v6. I have kept the v5 dd-agent temp files fix which @Mic92 pointed out. |
@rvl Awesome work, I'll try it in one of our environments with a bunch of custom integration configs and report back. Probably won't get around to it before Tuesday though. |
The merge of #41035 created a merge conflict, please rebase. |
7599676
to
53a1148
Compare
OK thanks @xeji I have rebased. Now that there are separate NixOS modules for v5 and v6, I am more keen to get this reviewed and merged. The testing I have done is with nixops and virtualbox, with the basic host dashboard metrics, and the postgresql integration. |
I have a build failure when I build from the PR as well as via a
I added |
This is the new v6 version of datadog-agent. The old v5 module is kept as dd-agent.
rm -f /etc/datadog-agent/auth_token | ||
''; | ||
script = '' | ||
export DD_API_KEY=$(head -n1 ${cfg.apiKeyFile}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
head -n 1
is POSIX correct. head -n1
is not (it just happens to work).
Refactors the datadog-agent (i.e. V6) module to let users configure arbitrary checks, not just a limited set, without having to resort to linking the files manually and updating the systemd unit. Checks are now configured via a `services.datadog-agent.checks` option which takes an attribute set in which the keys refer directly to Datadog check names, and the values are attribute sets representing Datadog's configuration structure. With this mechanism users can configure arbitrary integrations, for example for the `ntp`-check, simply by saying: services.datadog-agent.checks.ntp = { init_config = null; # ... other check configuration options as per Datadog # documentation }; The previous check-specific configuration options for non-default checks have been removed. Disk & network check configuration options have been kept rather than making them a `default`-value of the `checks`-option because they will be overridden by user-configurations in that case. Relates to NixOS#40399.
Refactors the process used to build the Datadog core integrations to be more easily extensible with integrations other than the ones built and installed by default. Documentation has been added in relevant parts of the module to describe how the process works. As a high-level overview: The `datadog-integrations-core` attribute in the top-level package set now accepts an extra parameter. This parameter is an attribute set where each key is the name of a Datadog integration as it appears in Datadog's integrations-core repository[1], and the value is a function that receives the Python package set and returns the required dependencies of this integration. For example: datadog-integrations-core { ntp = (ps: [ ps.ntplib ]); }; This would build the default integrations and, additionally, the `ntp` integration. To support passing the modified Python environment to the datadog-agent itself, the `python` key has been moved inside of the derivation which means that it will be made overridable. This relates to NixOS#40399. [1]: https://github.com/DataDog/integrations-core
Introduces an option `services.datadog-agent.extraIntegrations` that can be set to include additional Datadog agent integrations from the integrations-core repository. Documentation and an example is provided with the change. Relates to NixOS#40399
I am force-pushing over the current tip ( The configuration is tested on a few machines with both custom check config and extra integrations. From my side this should be done, but I haven't explicitly tested whether this is affected by #44362 - maybe @coretemp can do that, or maybe we can look at it later. |
Refactors the datadog-agent (i.e. V6) module to let users configure arbitrary checks, not just a limited set, without having to resort to linking the files manually and updating the systemd unit. Checks are now configured via a `services.datadog-agent.checks` option which takes an attribute set in which the keys refer directly to Datadog check names, and the values are attribute sets representing Datadog's configuration structure. With this mechanism users can configure arbitrary integrations, for example for the `ntp`-check, simply by saying: services.datadog-agent.checks.ntp = { init_config = null; # ... other check configuration options as per Datadog # documentation }; The previous check-specific configuration options for non-default checks have been removed. Disk & network check configuration options have been kept rather than making them a `default`-value of the `checks`-option because they will be overridden by user-configurations in that case. Relates to NixOS#40399.
Refactors the process used to build the Datadog core integrations to be more easily extensible with integrations other than the ones built and installed by default. Documentation has been added in relevant parts of the module to describe how the process works. As a high-level overview: The `datadog-integrations-core` attribute in the top-level package set now accepts an extra parameter. This parameter is an attribute set where each key is the name of a Datadog integration as it appears in Datadog's integrations-core repository[1], and the value is a function that receives the Python package set and returns the required dependencies of this integration. For example: datadog-integrations-core { ntp = (ps: [ ps.ntplib ]); }; This would build the default integrations and, additionally, the `ntp` integration. To support passing the modified Python environment to the datadog-agent itself, the `python` key has been moved inside of the derivation which means that it will be made overridable. This relates to NixOS#40399. [1]: https://github.com/DataDog/integrations-core
Introduces an option `services.datadog-agent.extraIntegrations` that can be set to include additional Datadog agent integrations from the integrations-core repository. Documentation and an example is provided with the change. Relates to NixOS#40399
@tazjin Looking at it later is the option that I prefer. It's just that I raised it, because the costs of fixing something earlier is generally lower. However, I consider getting it into 18.09 even with this minor bug to be even more important. I don't plan to fix it in the next 4 months for example. |
@tazjin Sorry for the delay. I have been busy moving house. I have just tested your fixes and refactors with nginx and postgres checks. It looks good and works well. Thanks. |
Thanks @disassembler! 🥇 |
Great to see this merged. Is there a work flow to use this module on 18.03 that does not involve forking? |
@coretemp Yes, you can do something roughly like this: { config, pkgs, ... }:
let
# Download nixpkgs from master (you may want to pin a commit, or maybe
# the unstable channel, instead):
unstableSrc = fetchTarball "https://github.com/NixOS/nixpkgs/archive/master.zip";
# The above gives is the raw (unimported) Nix source, which we'll need
# in a second, but we also need to import the actual package set:
unstable = import unstableSrc {};
in {
# Import the module itself from unstable:
imports = [
"${unstableSrc}/nixos/modules/services/monitoring/datadog-agent.nix"
];
config = {
# Pick the relevant packages into your local overrides:
nixpkgs.config.packageOverrides = base: base // {
inherit (unstable) datadog-agent datadog-integrations-core;
};
# And use the module:
services.datadog-agent = {
enable = true;
apiKeyFile = whatever;
tags = [ "my-env" ];
};
};
} I wrote this from memory so it may be slightly off, but that's roughly how we do it atm. |
Thanks all! |
Thanks @tazjin. I am using the following (I don't like a moving target):
|
@tazjin What do you do about?
Also, I am not sure why, but despite the module running fine, the Host map in Datadog doesn't show the machine anymore after the upgrade. |
@tazjin I found a more serious limitation: I added
|
@tazjin When I use |
@coretemp Working on something else right now, can take a look at this tomorrow, but some small comments:
Depending on how your hosts were set up they may end up with new hostnames. This doesn't happen in my environments because we explicitly configure the hostnames used in Datadog, so I'm not sure if there is a difference in behaviour between this and the V5 module. Will check tomorrow.
I'm not sure if it's common or expected for modules to install CLI tools system-wide, but I guess it makes sense here because of flares and retrieving the agent status. This should be a tiny change.
What sort of error do you get? |
@tazjin There is no error. I am certain that if you try to send a flare, that it will also not work. |
Motivation for this change
I wanted to store the
api_key
setting in a file, saw that datadog-agent v6 supports using an environment variable, then foolishly attempted packaging datadog-agent v6.They decided to rewrite most parts of the agent in Go. The new version is here: https://github.com/DataDog/datadog-agent
The NixOS module supports both versions using the
service.dd-agent.package
option.Things done
It's hard to automatically test this because it depends on a proprietary service.
I have tested a basic setup using a nixops VM and a DataDog trial account and this works.
If someone could help test more elaborate configs, that would help.
build-use-sandbox
innix.conf
on non-NixOS)nix-shell -p nox --run "nox-review wip"
./result/bin/
)/cc @thoughtpolice @domenkozar @mbbx6spp @Fuuzetsu @bflyblue @coretemp