Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/cockroachdb: create new service #51306

Merged
merged 1 commit into from Dec 2, 2018

Conversation

thoughtpolice
Copy link
Member

This adds a module to fully configure CockroachDB. This also includes a full end-to-end CockroachDB clustering test to ensure everything basically works.

This subsumes #38665, and was based off it with several amendments (including most of the changes I requested in #38665).

/cc @jbboehr

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Assured whether relevant documentation is up to date
  • Fits CONTRIBUTING.md.

@thoughtpolice
Copy link
Member Author

@GrahamcOfBorg test cockroachdb

1 similar comment
@thoughtpolice
Copy link
Member Author

@GrahamcOfBorg test cockroachdb

@thoughtpolice
Copy link
Member Author

@GrahamcOfBorg build nixosTests.cockroachdb

@thoughtpolice
Copy link
Member Author

thoughtpolice commented Dec 1, 2018

See also #51338 in my never-ending quest to fix things.

This also includes a full end-to-end CockroachDB clustering test to
ensure everything basically works. However, this test is not currently
enabled by default, though it can be run manually. See the included
comments in the test for more information.

Closes NixOS#51306. Closes NixOS#38665.

Co-authored-by: Austin Seipp <aseipp@pobox.com>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
@nixos-discourse
Copy link

This pull request has been mentioned on Nix community. There might be relevant details there:

https://discourse.nixos.org/t/rfc-accurate-time-synchronization-dependencies-via-time-sync-target/1582/1

preStart = ''
if ! test -e ${cfg.dataDir}; then
mkdir -m 0700 -p ${cfg.dataDir}
chown -R ${cfg.user} ${cfg.dataDir}
Copy link
Member

@aszlig aszlig Dec 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thoughtpolice: What about using the StateDirectory option (see systemd.exec(5)) instead, if cfg.dataDir is within /var/lib?

};

port = mkOption {
type = types.int;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe types.port?

like datacenter. The tiers and order must be the same on all nodes.
Including more tiers is better than including fewer. For example:

country=us,region=us-west,datacenter=us-west-1b,rack=12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use the docbook <example/> tag here?

http = addressOption "http-based Admin UI" 8080;

locality = mkOption {
type = types.nullOr types.str;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest types.commas here, because it correctly merges multiple option definitions.

That aside, maybe listOf (attrsOf str) makes more sense here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These strings need to be topographically ordered (from "least specific" qualifier to "most specific" in terms of where they are located regionally), and it's very unclear to me what happens when you merge things from two different modules and how the ordering works out.

Besides, individual generic modules cannot possibly know the datacenter/system the database is deployed on, making locality information impossible to derive reliably, outside of your own deploy-specific logic. Anything implicitly setting the locality option except for the actual top-level configuration.nix declaration itself in a single place is doing something wrong, AFAICS.

Perhaps option merging isn't the correct way to assert this, though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and it's very unclear to me what happens when you merge things from two different modules and how the ordering works out.

This is controlled via mkBefore/mkAfter/mkOrder based on a priority value, so if you want to append a locality you'd just do locality = mkAfter "foobar".

But okay, if you want to prevent that from being possible (except via overrides), then it makes sense to leave it at str.

Besides, individual generic modules cannot possibly know the datacenter/system the database is deployed on

Maybe generic modules might not know, but let's say you have a module that defines these values by discovery of other nodes, eg. this comes into my mind, where the build slaves for Hydra are discovered using the nodes attribute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe generic modules might not know, but let's say you have a module that defines these values by discovery of other nodes, eg. this comes into my mind, where the build slaves for Hydra are discovered using the nodes attribute.

That use case works fine, unless I'm missing something -- just set the locality setting based on the attribute values you produce. After all, you are writing the code to instantiate services.cockroachdb, so just do it right there. You're done! The real question is, why do you want to set locality in two places, in two modules? Locality is really a cluster-level concept, and the keys/values must be the same for every node in the cluster. It's not something generic modules can (or should) have an interest in, really. Furthermore CockroachDB is a "global service" for NixOS, like most services -- there can't be more than one of it at a time (though it may serve multiple uses). In a sense, it's just like a stronger version of something like the port setting -- of which, there can only be one meaningful instantiation of it.

So under the assumption that one NixOS instance = one CockroachDB node, there is really only one global possible locality configuration for any usage of services.cockroachdb. And there's no generic module that can know how to configure that. Any module that does set it has to be done with your specific deployment in mind. In your scenario, you're already writing code to instantiate the module.

The use case that won't work here is if you want to import module a.nix and b.nix and have them both, somehow, set locality information (specifically locality: they can set other mergeable options just fine). For example, a.nix could say planet=earth and b.nix could say continent=us, and they would like to "share" the CockroachDB service, and enable it themselves each. This seems like a misdesign. First off every locality setting must be consistent across all nodes, so every node has to now have a.nix and b.nix instantiated, or have it replicated (which is a weird workaround, say, if you don't want to enable b.nix, but now you have to duplicate the options). Second, if they're going to share a database, you probably want to configure it independently of either module, in a single place -- and you can fix that yourself!

(This is sort of a bigger problem with the module system in general and also affects things like PostgreSQL. A lot of things try to offer extensible attributes through merging so that one service can perform multiple roles -- it can be 'reused' through other services -- but it often hits lots of fiddly edge cases because NixOS modules are a global concept -- for example, two services can configure/demand mutually exclusive things from a service. In an ideal world, it would be possible to instantiate any module at an arbitrary number of call-sites and have multiple copies coexist, of course. I'm sure you know all this, though)

In short, I think any generic module can't (and should not) have any interest in the locality information, and any situation where you do want it should be done by a instantiation of it, per instance, in a single place. I'm having a very hard time coming up with use cases where this doesn't work out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, on this note, what would be cool actually is if I could somehow make locality a non-mergeable attribute set, like a special version of attrsOf str

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, agreed, my example wasn't a particular good one, so maybe it would make sense to wait until the first user comes along and has a use-case for this.

If you want to make the attribute set non-mergeable, you can just use the mergeOneOption merge function, like this:

{ lib, ... }:

let
  inherit (lib) types;
in {
  imports = [
    { foo.bar = "xxx"; }
    { foo.xxx = "yyy"; }
  ];

  options.foo = lib.mkOption {
    type = (types.attrsOf types.str) // {
      merge = lib.mergeOneOption;
    };
    default = {};
    description = "foo";
  };
}

Output of evaluation:

$ nix-instantiate --eval '<nixpkgs/nixos>' --arg configuration /path/to/above/file.nix -A config.foo
error: The unique option `foo' is defined multiple times, in `/home/aszlig/mergey.nix' and `/home/aszlig/mergey.nix'.
(use '--show-trace' to show detailed location information)
$

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn, just missed this comment! I just simplified things a bit in 2a22554 based on your other comments. I'll submit a follow up with this attribute change, too, since a typed version is much nicer.

This can be a percentage, expressed with a fraction sign or as a
decimal-point number, or any bytes-based unit. For example, "25%",
"0.25" both represent 25% of the available system memory. The values
"1000000000" and "1GB" both represent 1 gigabyte of memory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use <literal>1000000000</literal> here instead of using quotes, same for the cache option.

TimeoutStopSec="60";
RestartSec="10";
StandardOutput="syslog";
StandardError="syslog";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably use = instead of =, but that aside, why force the use of syslog here? Users can still decide for themselves in journald.conf whether they want this to be logged to syslog.

{ ExecStart = startupCommand;
Type = "notify";
User = cfg.user;
PermissionsStartOnly = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using StateDirectory above should make this superfluous (at least when using /var/lib/...).

@thoughtpolice
Copy link
Member Author

Thanks @aszlig, I'll look into these changes later today.

@thoughtpolice thoughtpolice deleted the nixos/cockroachdb branch January 12, 2019 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants