Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the permissions for the datadog-agent need to be modifiable #91649

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

maxgitt
Copy link

@maxgitt maxgitt commented Jun 27, 2020

Motivation for this change

I can't get any disk metrics with the user-group datadog. My system runs all daemons through root. Below is my error message that I was able to resolve by creating a permissions option in the datadog-agent module setting my services.datadog-agent.permissions = { User = "root"; Group = "root"; };

Jun 27 02:14:59 nixos zx0f096xv8lgg1wxgnwhspxq0n4lj96g-unit-script-datadog-agent-start[16394]: 2020-06-27 02:14:59 UTC | CORE | WARN | (pkg/collector/py/datadog_agent.go:148 in LogMessage) | (disk.py:107) | Unable to get disk metrics for /run/user/1000/gvfs: [Errno 13] Permission denied: '/run/user/1000/gvfs'
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.


permissions = mkOption {
description = "User and group permissions for datadog-agent";
type = types.attrs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a submodule so the type documents itself or switch to flat options instead of this nested approach.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to use a submodule. let me know if there is something more you'd like in here.

@@ -201,6 +201,16 @@ in {
excluded_interfaces = [ "lo" "lo0" ]; } ];
};
};

permissions = mkOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If User/Group don't equal datadog, the datadog user shouldn't be created probably.

Copy link
Author

@maxgitt maxgitt Jun 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts are that the developer should be able to atleast set the User/Group to root. In my case, the datadog agent couldn't retrieve disk info.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but in either way a user named datadog will be created in the module and you probably don't want to create a new user that is unused then.

Copy link
Author

@maxgitt maxgitt Jun 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah looks like you're right. I have this unused datadog user.
datadog_user

Maybe I'm asking the wrong question here. Perhaps it should be, Why can't datadog access disk metrics?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Ma27 I did some digging. Here is what I have:

The datadog-agent's User/Group is datadog by default from the datadog module. The agent attempts to call into run/user/1000/gvfs or the Gnome Virtual File System. Take a look.

[myorg@myhost:/run/user]$ ll
total 0
drwx------ 5 myorg myorg 100 Jun 30 07:41 1000

[myorg@myhost:/run/user/1000]$ ll
total 0
drwx------ 3 myorg myorg 60 Jun 30 07:41 dbus-1
dr-x------ 2 myorg myorg  0 Jun 30 07:41 gvfs
drwxr-xr-x 2 myorg myorg 80 Jun 30 19:41 systemd

Of course it makes sense that the agent doesn't have access to gvfs. I also found similar issues here and here where the fix is to either set root permissions or blacklist the offending error. So naturally I tried

[myorg@myhost:~]$ cat /etc/datadog-agent/conf.d/disk.d/conf.yaml | jq
{
  "init_config": null,
  "instances": [
    {
      "device_blacklist": [
        "gvfs"
      ],
      "file_system_blacklist": [
        "gvfs"
      ],
      "mount_point_blacklist": [
        "gvfs"
      ],
      "use_mount": false
    }
  ]
}

And, yet even after attempting to blacklist the gvfs I'm still left with the same error as before:

Jun 30 20:41:37 myhost somehash-unit-script-datadog-agent-start[32208]: 2020-06-30 20:41:37 UTC | CORE | WARN | (pkg/collector/py/datadog_agent.go:148 in LogMessage) | (disk.py:107) | Unable to get disk metrics for /run/user/1000/gvfs: [Errno 13] Permission denied: '/run/user/1000/gvfs'

I would agree that datadog shouldn't be run as root but the only solution that has worked for me thus far is modifying upstream to run datadog via root.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Without taking a look at the details, you may want to take a look at the parameters in systemd.exec(5)
  • I'm afraid that you're misunderstanding my point: the module creates a new user named datadog. But this shouldn't be done IMHO if you declare a user which is not named datadog.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point. The module creates a datadog user. Therefore it would be unwise to use the module and then set the User/Group to some other user i.e. root.

Then it comes down to getting the datadog user to access the /run/user/1000/gvfs which is owned by my user named myorg.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about adding the datadog group as member to the gvfs group?

@maxgitt maxgitt requested review from Ma27 and dasJ June 30, 2020 22:09
@@ -231,7 +250,7 @@ in {
datadog-agent = makeService {
description = "Datadog agent monitor";
preStart = ''
chown -R datadog: /etc/datadog-agent
chown -R ${cfg.permissions.User}: /etc/datadog-agent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does datadog need to write to this directory? If not, you could replace this line with serviceConfig.ConfigurationDirectory = "datadog-agent" and drop the (deprecated) serviceConfig.PermissionsStartOnly

@dasJ
Copy link
Member

dasJ commented Aug 5, 2020

Also, it might be easier to just do

{
  services.datadog-agent = { /* whatever */ };
  systemd.services.datadog-agent.serviceConfig = {
    # Either this (but results in an unused datadog user)
    User = "root";
    Group = "root";
    # Or this
    SupplementaryGroups = [ "myorg" ];
    # Or maybe this?
    InaccessiblePaths = [ "/run/user" ];
  };
}

This works because the module system merges all definitions (the one from your config and from this module) of systemd.services.datadog-agent.

@flokli
Copy link
Contributor

flokli commented Sep 25, 2020

I think adding to SupplementaryGroups from inside the datadog module, if certain checks requiring it is fine. @maxgitt, WDYT?

@stale
Copy link

stale bot commented Mar 26, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 26, 2021
@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jun 12, 2023
@wegank wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants