Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/curator: init elasticsearch curator #44340

Merged
merged 11 commits into from Aug 25, 2018
Merged

Conversation

shmish111
Copy link
Contributor

@shmish111 shmish111 commented Aug 2, 2018

See elasticsearch curator

Motivation for this change

I needed to use this tool on my NixOS ES cluster

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.


options.services.curator = {

enable = mkOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use mkEnableOption here.

};
in {

options.services.curator = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should name the module elasticsearch-curator instead of just curator because it's more descriptive and consistent with the name of the Python package.


config = mkIf cfg.enable {

systemd.services.curator = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with the module name I think we should also name the systemd unit elasticsearch-curator instead of just curator.

enable = cfg.enable;
startAt = cfg.interval;
serviceConfig = {
ExecStart = ''${pkgs.python36Packages.elasticsearch-curator}/bin/curator --config ${curatorConfig} ${curatorAction}'';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<nitpicking> could you indent this line? </nitpicking>

# not a Python "NoneType"
client:
hosts: ${builtins.toJSON cfg.hosts}
port: 9200
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the port also be configurable? We should probably default it to services.elasticsearch.port.

@shmish111
Copy link
Contributor Author

@basvandijk I have updated the PR according to your PR, do I need to squash the extra commit?

Copy link
Member

@basvandijk basvandijk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also rename the file to elasticsearch-curator.nix?

with lib;

let
cfg = config.services.curator;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should now be: config.services.elasticsearch-curator

};
port = mkOption {
description = "the port that elasticsearch is listening on";
type = types.int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing a semicolon here.

@shmish111
Copy link
Contributor Author

fixed. should I squash @basvandijk ?

@basvandijk
Copy link
Member

@shmish111 thank you. A squash would be nice.

Any chance you could add a NixOS test for this? Maybe extend the existing ELK test.

@shmish111
Copy link
Contributor Author

@basvandijk do you think it is ok to add it to the ELK test? I feel like it would be unnecessary overhead to have a test that boots up another ES machine when I could just add in some test cases to the existing ELK test. I suppose I would just create an index in ES, run a curator action that should delete it and then check it has been deleted.

@basvandijk
Copy link
Member

@shmish111 extending the existing ELK test like you suggest is the way to go IMHO. It would not only test curator but also elasticsearch itself.

@shmish111
Copy link
Contributor Author

@basvandijk how do I run the nixpkgs tests? I can't find documentation anywhere :-(

@shmish111
Copy link
Contributor Author

@basvandijk I have squashed. Also, I can't get the test passing on master so I can't really do anything until I do that.

uname -a
Linux ip-10-0-4-156.us-east-2.compute.internal 4.14.32 #1-NixOS SMP Sat Mar 31 16:10:43 UTC 2018 x86_64 GNU/Linux

nix-build nixos/tests/elk.nix

This fails to connect to elasticsearch. It seems that elasticsearch has failed to start but I can't work out how to debug in order to find out why.

@basvandijk
Copy link
Member

See the following on how to run tests interactively:
https://nixos.org/nixos/manual/index.html#sec-running-nixos-tests
That will get you into a console on the VM with which you can debug the issue.

@shmish111
Copy link
Contributor Author

@basvandijk I've tried to play around but it's difficult to debug as I can't get into a shell in the running VM. All I can see is the output of startAll etc. I managed to see that elasticsearch does start eventually however it take a long long time for some reason. Then logstash is failing, with "TypeError('Invalid argument: family must be 4 or 6');". Also cat /tmp/logstash.out never appears.

Have you managed to run these tests successfully? I am trying to run these on a NixOS machine on AWS with the image NixOS x86_64-linux 18.03.132946.1caae7247b8 (hvm-ebs) as a t2.xlarge instance.

I think the test maintainers need to look at this, does this need to prevent my PR? I'm happy to add a test but the existing test needs to be fixed first and at the moment that is beyond me.

Copy link
Member

@basvandijk basvandijk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the test, I can successfully run nix-build nixos/tests/elk.nix on my Mac Book Pro mid 2015 running NixOS 18.03. Maybe a t2.xlarge can't run VMs?

Do you have some statements available that create an index on elasticsearch and an actionYAML definition that deletes that index? Then I can try running that test to see if it works.

config = mkIf cfg.enable {

systemd.services.elasticsearch-curator = {
enable = cfg.enable;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to set this since enable is true by default and the whole definition is already under a mkIf.

@shmish111
Copy link
Contributor Author

@basvandijk ok, I've fixed the cfg.enable thing. I've also added a test for curator, this just runs curator after the other tests and checks that the logstash index has been deleted. This index must have been created for the logstash queries to succeed. Can you run these tests because I have no access to a NixOS machine that isn't on AWS at the moment, and I suspect you are correct and that VMs aren't running correctly on AWS instances.

@shmish111 shmish111 requested a review from FRidh as a code owner August 21, 2018 15:42
@basvandijk
Copy link
Member

basvandijk commented Aug 21, 2018

Great we're getting closer! I pushed some more fixes to your branch. In the end you might want to squash all of them.

Could you look into fixing the elasticsearch-curator python package because building it fails with:

nix-build -A python36Packages.elasticsearch-curator
...
Collecting requests-aws4auth>=0.9 (from elasticsearch-curator==5.5.4)
  Could not find a version that satisfies the requirement requests-aws4auth>=0.9 (from elasticsearch-curator==5.5.4) (from versions: )
No matching distribution found for requests-aws4auth>=0.9 (from elasticsearch-curator==5.5.4)

I see we don't have a package for requests-aws4auth so that needs to be packaged.

@basvandijk
Copy link
Member

You might also want to add yourself as maintainer of the module using:

  meta.maintainers = with maintainers; [ shmish111  ];

and add shmish111 to the maintainer-list.nix.

@basvandijk basvandijk force-pushed the es-curator branch 2 times, most recently from 8c8862c to 50dd177 Compare August 25, 2018 14:53
@basvandijk
Copy link
Member

@GrahamcOfBorg test elk.ELK-5 elk.ELK-6

@GrahamcOfBorg
Copy link

No attempt on aarch64-linux (full log)

The following builds were skipped because they don't evaluate on aarch64-linux: tests.elk.ELK-5, tests.elk.ELK-6

Partial log (click to expand)

Cannot nix-instantiate `tests.elk.ELK-5' because:
error: attribute 'ELK-5' in selection path 'tests.elk.ELK-5' not found

Cannot nix-instantiate `tests.elk.ELK-6' because:
error: attribute 'ELK-6' in selection path 'tests.elk.ELK-6' not found

@GrahamcOfBorg
Copy link

Timed out, unknown build status on x86_64-linux (full log)

Attempted: tests.elk.ELK-5, tests.elk.ELK-6

Partial log (click to expand)

one: running command: cat /tmp/logstash.out | grep flowers
one# cat: /tmp/logstash.out: No such file or directory
one: exit status 1
one: running command: cat /tmp/logstash.out | grep flowers
one# cat: /tmp/logstash.out: No such file or directory
one: exit status 1
one: running command: cat /tmp/logstash.out | grep flowers
one# cat: /tmp/logstash.out: No such file or directory
building of '/nix/store/q5wv95vqb8bifpzyz0qvcvb78r5vddi8-vm-test-run-ELK-5.drv' timed out after 1800 seconds
error: build of '/nix/store/mp71gw6kysf1y7nilnnvz5gq70dyh7di-vm-test-run-ELK-6.drv', '/nix/store/q5wv95vqb8bifpzyz0qvcvb78r5vddi8-vm-test-run-ELK-5.drv' failed


propagatedBuildInputs = [ requests ];

doCheck = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the tests disabled? Please include a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test hang. I'm not sure why. Good point about adding a comment. Will do.

@basvandijk basvandijk merged commit a144c79 into NixOS:master Aug 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants