Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus: upgrade configuration module to use the 2.x package #46069

Closed
wants to merge 6,397 commits into from

Conversation

azazel75
Copy link
Contributor

@azazel75 azazel75 commented Sep 4, 2018

Although it's not possible to migrate an existing 1.x to 2.x this upgrade seems appropriate because there are no more 1.x docs available online and the configuration format changed without backward compatibility letting people installing it having serious issues.

This upgrade is conservative: it keeps all the old configuation items without introducing new ones that may be preferable to configure "alertmanager service discovery" optional feature. (It's possible to configure it specifying a configuration file)

I've personally spent some ours finding out that the configuration error reported by the prometheus daemon on my installation was really a misconfiguration due to the use of the old prometheus 1.x by the configuration module, despite the fact that the prometheus 2.x it's already available in nixpackages.

Motivation for this change

Configure a version of the prometheus services that you can actually find documentation for.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@fpletz
Copy link
Member

fpletz commented Sep 5, 2018

I've thought about this a lot in the past. I don't think it's a good idea to update prometheus to 2.x quietly because old data won't be available anymore as you've already mentioned. On the other hand I agree that we have to update to 2.x as soon as possible.

Also, some options slightly changed but we've already made these changes in our fork (we've got a dedicated services.prometheus2 module) and I would be happy to merge those into your PR.

If we do this like you propose, we would have to at least mention this change in the release notes because it is a backwards-incompatible change.

@vcunat & @samueldr What do you think about this change for 18.09? Do you think that adding a new services.prometheus2 module might be better than breaking current setups?

@azazel75
Copy link
Contributor Author

azazel75 commented Sep 5, 2018

I would also favor a prometheus2 module name as the recommended way to migrate from 1.x to 2.x is actually to run the two instances in parallel, see https://prometheus.io/docs/prometheus/latest/migration/#storage . I can make the necessary changes. @fpletz what options slightly changed? I only stumbled upon the recording and alerting rules (which are just superficially managed by the module) and the "alert manager service discovery" stuff (see https://prometheus.io/docs/prometheus/latest/migration/#alertmanager-service-discovery ) which my revised module handles "statically" as it was in 1.x (leaving the possibility to the user to use the textual rephresentation to specify a more "advanced" configuration)

@vcunat
Copy link
Member

vcunat commented Sep 6, 2018

If this is about DB data, have you considered using system.stateVersion?

@azazel75
Copy link
Contributor Author

azazel75 commented Sep 7, 2018

@vcunat can you elaborate please?

@bachp
Copy link
Member

bachp commented Oct 1, 2018

I think what @vcunat mean it to just make it depend on system.stateVersion which prometheus version to use. So basically for system.stateVersion >= 18.09 it would be prometheus 2 for earlier ones it would be 1.

I actually also thought about making it depend on system.stateVersion, but I think creating a prometheus2 module is probably the better way. Especially because it would support the recommended "migration strategy" of keeping prometheus running in parallel.

7c6f434c and others added 17 commits November 2, 2018 08:06
Fix flightgear, simgear & speed_dreams
ntp: fix ntpd shutdown by using upstream patch
activation-script: add libc to path to provide nscd when needed
As reported by @andir, the regular expressions that match the sandbox
output are no longer matching in the recent Chromium bump as of
bb03fbc.

Instead of a boolean field that determines whether namespace sandboxes
are on, the namespace sandbox is now an enum within "Layer 1 Sandbox".

I've modified the regular expressions accordingly and also ran the test
for the stable branch, which now succeeds.

Signed-off-by: aszlig <aszlig@nix.build>
Issue: NixOS#49442
Cc: @bendlas, @andir
nixos/syncthing: move configuration to condigDir
@azazel75
Copy link
Contributor Author

azazel75 commented Nov 5, 2018

So, now that I've completed the merge: I spent the time of running the two in parallel because it seemed to me that this was the idea of most of the people that wrote here. Now that I've done this it seems that the idea has changed to trash the old 1.x service and keep the 2.x. Maybe you guys should come to an agreement ;-)

@azazel75
Copy link
Contributor Author

azazel75 commented Nov 5, 2018

Uhumm, it seems I've made some serious errors in this branch while trying to rebasing it, I'm sorry. I'll try to fix it

@globin
Copy link
Member

globin commented Nov 5, 2018

I would keep both modules for the next release (to be able to run them in parallel for one release), but with actually having copied the previous module (with intended code duplication). This will make it easier to just remove prometheus(1) after the next release and not having to disentangle them again by hand.

@azazel75
Copy link
Contributor Author

azazel75 commented Nov 7, 2018

Please see #49802

@benley
Copy link
Member

benley commented Nov 7, 2018

mind if we close this PR since it looks like we're continuing over in the other one now?

@benley benley closed this Nov 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet