Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

thanos: init at 0.6.0 & NixOS module #60500

Merged
merged 12 commits into from Jul 28, 2019
Merged

Conversation

basvandijk
Copy link
Member

@basvandijk basvandijk commented Apr 30, 2019

Please don't merge since this is work-in-progress. EDIT: this is now ready for review.

Motivation for this change

I would like to use Thanos in NixOS.

0.4.0 Release notes.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nix-review --run "nix-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Assured whether relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@basvandijk
Copy link
Member Author

@GrahamcOfBorg build thanos

The following contains some preliminary tests for thanos:
@GrahamcOfBorg test prometheus2

@basvandijk
Copy link
Member Author

@Mic92 I see you've done some work on buildGoModule. It would be great if you could review commit c93e5ca. The thanos go package needs to have access to the bzr executable when downloading the go module dependencies. This commit adds a way to override the go-modules derivation so that we can add bzr to the nativeBuildInputs.

@ofborg ofborg bot requested a review from kalbasit April 30, 2019 16:43
@basvandijk basvandijk force-pushed the thanos-init branch 3 times, most recently from 930bccf to 8ef1718 Compare May 2, 2019 15:28
@basvandijk basvandijk changed the title WIP: thanos: init at 0.4.0-rc.1 & NixOS module thanos: init at 0.4.0-rc.1 & NixOS module May 2, 2019
@basvandijk basvandijk removed their assignment May 2, 2019
@infinisil
Copy link
Member

I really don't like having more such huge modules in nixpkgs. Having options for every config setting has some serious downsides, check out NixOS/rfcs#42 if you haven't seen it already.

@basvandijk basvandijk changed the title thanos: init at 0.4.0-rc.1 & NixOS module thanos: init at 0.4.0 & NixOS module May 10, 2019
@basvandijk
Copy link
Member Author

@infinisil thanks for pointing me to that RFC. I didn't see it yet. You raise some interesting points. I guess I'll have to make my mind up what my opinion is on the matter.

@basvandijk
Copy link
Member Author

@GrahamcOfBorg test prometheus2

@basvandijk basvandijk force-pushed the thanos-init branch 2 times, most recently from 4aeb105 to 13c264b Compare May 16, 2019 23:26
@basvandijk
Copy link
Member Author

@GrahamcOfBorg test prometheus2

@basvandijk basvandijk changed the title thanos: init at 0.4.0 & NixOS module thanos: init at 0.5.0-rc.0 & NixOS module Jun 3, 2019
@basvandijk
Copy link
Member Author

@GrahamcOfBorg test prometheus2

@basvandijk
Copy link
Member Author

basvandijk commented Jun 3, 2019

@infinisil regarding NixOS/rfcs#42, I understand the motivation for it: NixOS modules that encode every setting as a NixOS option are, to quote the RFC, "time consuming to implement, tedious to review and hard to maintain". This is true. I do think though that the time invested in implementing, reviewing and maintaining the module will benefit the user significantly. She'll get a module which is

  • discoverable: every setting is a NixOS options which is searchable and documented
  • safe to use: eval-time syntax- and type-checking for every option.

OTOH reducing the time spent implementing, reviewing and maintaining modules is also important so I'm willing to investigate a config-based approach. Regarding that I have a few questions:

  • Thanos doesn't work with a configuration file but accepts command line arguments instead. It's unclear to me how to map an option of type types.json to a list of command line arguments.

    For most type of options the mapping is not that hard. For example we can have:
    { store.sd-interval = "10m"; store.unhealthy-timeout = "3m"; } ->
    [ ''--store.sd-interval="10m"'' ''store.unhealthy-timeout="3m"'' ]

    But what to do with repeatable options like:

    --store=<store> ...        Addresses of statically configured store API
                               servers (repeatable). The scheme may be
                               prefixed with 'dns+' or 'dnssrv+' to detect
                               store API servers through respective DNS
                               lookups.  
    

    This PR solves this by having the option:
    store.addresses = mkOption { type = types.listOf types.str; ... }
    This option will then get converted to a list of multiple --store="<address>" arguments.

    Of course we could implement the same kind of logic with the config setting where we would check if config has a store.addresses field of type list of string and then converting that. However, in contrast to the NixOS option, that field would not be discoverable.

  • Similarly to the above list-based arguments Thanos has key-value-based arguments like:

    --label=<name>="<value>" ...
                               Labels to be applied to all generated metrics
                               (repeated). Similar to external labels for
                               Prometheus, used to identify ruler and its
                               blocks as unique source.
    

    How would we specify and convert these? Maybe by checking for a field named label in the config which should be an attribute set, then converting the name-value pairs to --label=<name>="<value>". This could work but is also not very discoverable.

@mmahut
Copy link
Member

mmahut commented Jun 4, 2019

I've tested the binary and this works as expected.

Copy link
Member

@kalbasit kalbasit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not tested yet, unsure if I can afford to do so myself. This is a pure CR.

nixos/modules/services/monitoring/thanos.nix Outdated Show resolved Hide resolved
nixos/modules/services/monitoring/thanos.nix Outdated Show resolved Hide resolved
(mkIf cfg.sidecar.enable {
systemd.services.thanos-sidecar = {
wantedBy = [ "multi-user.target" ];
after = [ "network.target" "prometheus2.service" ];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we enable prometheus (services.prometheus.enable) or write an assertion for it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an assertion is an excellent idea! I'll add it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thank you for adding the assertions. Not a blocker, but it would be great to add a test for the assertions.

runCommandNoCC "bzr-no-cert-validation" {
inherit bazaar;
buildInputs = [ makeWrapper ];
} "makeWrapper $bazaar/bin/bzr $out/bin/bzr --add-flags -Ossl.cert_reqs=none";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way this can go into go-modules instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add it to nativeBuildInputs by default but that would cause an unnecessary build dependency for all go-modules (except for thanos of course).

However git is also added by default so there's precedent. OTOH git is used by most go-modules while bzr isn't.

In general I think it's worthwhile to have support for overriding go-modules.

Copy link
Member

@kalbasit kalbasit Jun 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the need to be able to support overriding the go-modules and I would leave that. I'm just thinking that any package using bzr will have to copy your hack over to their package and most likely spend time figuring out why they're getting a TLS error in the first place. We could maybe control it with a flag defaulting to false? Something like enable-bzr ? false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently thanos-0.5.0 doesn't depend on bzr anymore so I removed the overrideModAttrs setting.

It would still be nice to add a enableBzr flag but this can be done in a separate PR.

pkgs/servers/monitoring/thanos/default.nix Outdated Show resolved Hide resolved
@ofborg ofborg bot requested a review from kalbasit June 5, 2019 11:24
@basvandijk
Copy link
Member Author

@GrahamcOfBorg test prometheus2

@basvandijk basvandijk changed the title thanos: init at 0.5.0-rc.0 & NixOS module thanos: init at 0.5.0 & NixOS module Jun 16, 2019
@basvandijk
Copy link
Member Author

The test fails on aarch64-linux because of insufficient disk space:

store# [  126.911144] thanos[739]: level=error ts=2019-06-16T14:00:26.59870538Z caller=main.go:182 msg="running command failed" err="error executing compaction: first pass of downsampling failed: create dir: mkdir /var/lib/thanos-compact/downsample: no space left on device"
store# [  126.942655] systemd[1]: thanos-compact.service: Main process exited, code=exited, status=1/FAILURE

I'll increase the disk space on store and run the test again...

@basvandijk
Copy link
Member Author

@GrahamcOfBorg test prometheus2

@basvandijk basvandijk changed the title thanos: init at 0.5.0 & NixOS module thanos: init at 0.6.0 & NixOS module Jul 28, 2019
@basvandijk
Copy link
Member Author

@GrahamcOfBorg test prometheus2

I'll leave in the support for overrideModAttrs because that can be
useful for other go packages.
This is to fix the following error in the test on aarch64-linux:

store# [  126.911144] thanos[739]: level=error ts=2019-06-16T14:00:26.59870538Z caller=main.go:182 msg="running command failed" err="error executing compaction: first pass of downsampling failed: create dir: mkdir /var/lib/thanos-compact/downsample: no space left on device"
store# [  126.942655] systemd[1]: thanos-compact.service: Main process exited, code=exited, status=1/FAILURE
@basvandijk
Copy link
Member Author

@GrahamcOfBorg test prometheus2

@basvandijk basvandijk merged commit 9ff408a into NixOS:master Jul 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants