Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace simp-le with lego and support DNS-01 challenge #77578

Merged
merged 10 commits into from Feb 10, 2020

Conversation

m1cr0man
Copy link
Contributor

Motivation for this change

Continuation of #63613

Closes #34941, #35168

I have the following things still to do:

  • Add a deprecation notice for security.acme.certs.<name>.plugins (I don't know how, need help)
  • Similarly add some equivalent of mkChangedOptionModule to convert extraDomains to a list since custom webroots are no longer possible.

It is mostly backwards compatible, with a few caveats.

  • extraDomains can no longer have different webroots to the main webroot for the cert.
  • An email address is now mandatory for CA registration. I added a more globally scoped security.acme.email to make transitioning easier, and an assertion.
  • Accepting terms of service was a topic discussed in the other PR. For now, I have added
    security.acme.acceptTerms and defaulted it to true to avoid breaking user's configs. There may be reason to change this default depending on what reviewers think.

The following other changes were required:

  • Deprecate security.acme.certs.<name>.plugins, as this was specific to simp-le
  • Rename security.acme.validMin to validMinDays, to avoid confusion and errors. Lego requires the TTL to be specified in days. I have added a mkChangedOptionModule to cover this.
  • Add options to cover DNS-01 challenge (dnsProvider, credentialsFile, dnsPropagationCheck)
  • A shared state directory is now used (/var/lib/acme/.lego) to avoid account creation rate limits and share credentials between certs

I have tested converting my own simp-le generated certs to lego and it worked fine. I also have lego in production generating 28 HTTP validated certs and a DNS-01 cert with a bind DNS server (also on a NixOS box 😄 ). This is a non-breaking change if you haven't used custom webroots or plugins.

Also I have tried to keep the diff as small as possible but that other PR was so old it took me a full day to manually rebase on top of all the more recent changes to the ACME module. I'd be tempted to do some other cleanup to the module but there's good git blame as it is now.

Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

Lego allows users to use the DNS-01 challenge to validate their
certificates. It is mostly backwards compatible, with a few
caveats.

 - extraDomains can no longer have different webroots to the
   main webroot for the cert.
 - An email address is now mandatory for account creation

The following other changes were required:
 - Deprecate security.acme.certs.<name>.plugins, as this was
   specific to simp-le
 - Rename security.acme.validMin to validMinDays, to avoid
   confusion and errors. Lego requires the TTL to be specified in
   days
 - Add options to cover DNS challenge (dnsProvider,
   credentialsFile, dnsPropagationCheck)
 - A shared state directory is now used (/var/lib/acme/.lego)
   to avoid account creation rate limits and share credentials
   between certs
++ concatMap (p: [ "-f" p ]) data.plugins
++ concatLists (mapAttrsToList (name: root: [ "-d" (if root == null then name else "${name}:${root}")]) data.extraDomains)
email = if data.email == null then cfg.email else data.email;
globalOpts = [ "-d" data.domain "--email" email "--path" "." ]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick note on --path .. Somewhere between lego 3.0.2 and lego 3.2.0, the default path changed from ./.lego to ~/.lego. I had to add this statement so that lego always runs in the service's WorkingDirectory.

@m1cr0man
Copy link
Contributor Author

Ping @aanderse

Finally, the PR is here.

@aanderse
Copy link
Member

Thanks @m1cr0man! This is great work. I hope to review and test this over the next few weeks, switching some of my test (and hopefully production) systems over to this.

@arianvp
Copy link
Member

arianvp commented Jan 13, 2020

I will have a ook at this Tuesday. Thanks for the work

# Only try loading the credentialsFile if the dns challenge is enabled
EnvironmentFile = if data.dnsProvider != null then data.credentialsFile else null;
ExecStart = pkgs.writeScript "acme-start" ''
#!${pkgs.runtimeShell} -e
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it not possible to do what we want in one run of the binary?

Copy link
Member

@Mic92 Mic92 Jan 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not look like it does: go-acme/lego#216

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to enforce the validMinDays you have to run renew instead of run, which seems like an oversight in lego honestly. Run creates the account credentials if they don't exist but forces a new cert to be generated. I was tempted to add explicit logic to check for the cert's credentials files but I feel that's too solution specific. This conditional logic is the same as in the other PR.

Copy link
Member

@aanderse aanderse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple notes as I work through this...

Some issues with webroot at the moment. The tmpfiles rules reference webroot but for DNS certs this won't be defined.

nixos/modules/security/acme.nix Outdated Show resolved Hide resolved
@@ -173,6 +189,15 @@ in
'';
};

acceptTerms = mkOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do other people think of this? Personally I think it is overkill... but am fairly indifferent I guess 🤷‍♂️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Let's Encrypt wants users to accept their terms I don't think implicitly accepting it for endusers should be the way to go. This knob is the exact place where we can point users to the terms of services of Lets Encrypt.

This doesn't make much sense though when overriding the API endpoints (-s|--server) because then we can't know where the correct ToS are.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a really strong feeling about this. We were implicitly accepting it before no? This seems to make the behaviour slightly better than before

@m1cr0man
Copy link
Contributor Author

If anyone can help me on my two todo's mentioned in the description that would be great. I tried adding a regular mkRemovedOptionModule under certOpts, with a relative attribute path. In order to change extraDomains I tried mkChangedOptionModule but I doubt this works since the attribute path has not changed, and the warning it would produce wouldn't explain the change to a list.

   cfg = config.security.acme;
 
   certOpts = { name, ... }: {
+    imports = [
+      (mkRemovedOptionModule [ "plugins" ] ''
+        This option has been removed since simp-le was replaced with LEGo.
+        You may be able to achieve the desired functionality with postRun
+      '')
+      (mkChangedOptionModule [ "extraDomains" ] [ "extraDomains" ] (config: let
+        domains = config.security.acme."${name}".extraDomains;
+      in if isList domains then domains else attrNames domains))
+    ];
+

These are the errors I got:

The option `security.acme.certs.mydomain.warnings' defined in `/root/nixpkgs/nixos/modules/security/acme.nix' does not exist.
# Removing the mkChangedOptionModule....
The option `security.acme.certs.mydomain.assertions' defined in `/root/nixpkgs/nixos/modules/security/acme.nix' does not exist.

@aanderse
Copy link
Member

If anyone can help me on my two todo's mentioned in the description that would be great. I tried adding a regular mkRemovedOptionModule under certOpts, with a relative attribute path. In order to change extraDomains I tried mkChangedOptionModule but I doubt this works since the attribute path has not changed, and the warning it would produce wouldn't explain the change to a list.

@m1cr0man this is what @infinisil had to say on the matter: https://logs.nix.samueldr.com/nixos-dev/2020-01-15#1579087078-1579096064

@Mic92
Copy link
Member

Mic92 commented Jan 16, 2020

It also seems that simp-le still uses acmev1 which will be deprecated by letsencrypt in June.

@arianvp
Copy link
Member

arianvp commented Jan 16, 2020

@Mic92 nope don't worry; we already addressed this #71953

@aanderse
Copy link
Member

I'm very happy with the new DNS challenge functionality. After a few minor changes I mentioned it worked flawlessly. Next step is to test migration of existing HTTP challenge based certificates.

Great work @m1cr0man!

@arianvp
Copy link
Member

arianvp commented Jan 16, 2020

@aanderse how do you envision that migration path? for example; would keeping around account information and private keys from simp_le around be a requirement for you?

@m1cr0man
Copy link
Contributor Author

@arianvp Those are already being kept around. I'm not explicitly deleting the JSON files in /var/lib/acme/${name} so you can revert back pretty painlessly.

@m1cr0man
Copy link
Contributor Author

m1cr0man commented Feb 3, 2020

Sorry for the triple-comment, but I just want to point out that I never figured out how to add the assertions to the submodule. The plugins attribute will raise an assertion anyway since it's not defined anymore. extraModules is currently unchanged but setting a custom webroot is now impossible so it makes more sense to convert it to a list. I'm happy to merge without these and just the tests for now, and work on them later.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-20-03-feature-freeze/5655/27

@m1cr0man
Copy link
Contributor Author

m1cr0man commented Feb 9, 2020

Ok it was super difficult to figure out, but I got the DNS-01 challenge test working. However in doing this I've somehow broken the test for b.example.com and I'm not sure how. Would someone else be able to take a look at it(@arianvp)?

@m1cr0man
Copy link
Contributor Author

m1cr0man commented Feb 9, 2020

Fixed it - http-01 validation was running before nginx was started. :) This is ready for review again

nixos/tests/acme.nix Outdated Show resolved Hide resolved
Comment on lines +119 to +120
user = config.services.nginx.user;
group = config.services.nginx.group;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? Can't we use the per-vhost enableACME option in the nginx module, and just reconfigure the dns provider in the acme module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing wildcard certs. You can't specify a wildcard cert through enableACME, and that option is tested elsewhere. Personally I use the cert for more than just nginx so I want to ensure it works hard way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure - I'd expect serverAliases to work with that - if not, I'd consider it a bug.

I just find this test hard to understand. - If we can't use the nginx-provided ACME functionalities, probably it'd be more understandable to just not use nginx in the test, only configure the acme module, and check test certificate files appear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's sufficient to simply check does the file exist to truly test wildcard certs. It should work when used with any service (in this case, nginx) serving under that domain. Using nginx here keeps things consistent with the other tests. You're right I could've jused used serverAliases and added *.example.com there and it would've worked fine, but I'm testing the acme module, not nginx.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant using nginx, but not the acme-specific support of it makes the test much more verbose. Maybe a nginx test withserverAliases, and a simple one which just looks at the certificate via openssl x509 -in full.pem -text -noout | grep DNS?

user = config.services.nginx.user;
group = config.services.nginx.group;
};
systemd.targets."acme-finished-example.com" = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we do that? This looks like you're workarounding something…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a comment on the earlier ones. It's so that it can be waited upon in the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please copy that comment over - I got drowned in the rest.

Comment on lines +123 to +127
systemd.services."acme-example.com" = {
wants = [ "acme-finished-example.com.target" ];
before = [ "acme-finished-example.com.target" "nginx.service" ];
wantedBy = [ "nginx.service" ];
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here - I wouldn't expect to have to specify that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nginx was starting after the certs consistently when I was testing, causing the HTTP-01 validation to fail. Roll it back and run nix-build acme.nix and you'll see what I mean

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's only for the wildcard certificate, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it was for the HTTP-01 challenge certs, aka the ones using enableACME

wantedBy = [ "nginx.service" ];
};
services.nginx.virtualHosts."c.example.com" = {
forceSSL = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why can't we use enableACME here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing wildcard certs. I don't want a cert for c.example.com

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above.

Comment on lines +130 to +132
sslCertificate = config.security.acme.certs."example.com".directory + "/cert.pem";
sslTrustedCertificate = config.security.acme.certs."example.com".directory + "/full.pem";
sslCertificateKey = config.security.acme.certs."example.com".directory + "/key.pem";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is then probably not necessary anymore, too.

Merge remote-tracking branch 'remotes/upstream/master'
user = config.services.nginx.user;
group = config.services.nginx.group;
};
systemd.targets."acme-finished-example.com" = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please copy that comment over - I got drowned in the rest.

Comment on lines +119 to +120
user = config.services.nginx.user;
group = config.services.nginx.group;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure - I'd expect serverAliases to work with that - if not, I'd consider it a bug.

I just find this test hard to understand. - If we can't use the nginx-provided ACME functionalities, probably it'd be more understandable to just not use nginx in the test, only configure the acme module, and check test certificate files appear.

Comment on lines +123 to +127
systemd.services."acme-example.com" = {
wants = [ "acme-finished-example.com.target" ];
before = [ "acme-finished-example.com.target" "nginx.service" ];
wantedBy = [ "nginx.service" ];
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's only for the wildcard certificate, right?

systemd.targets."acme-finished-b.example.com" = {};
systemd.services."acme-b.example.com" = {
wants = [ "acme-finished-b.example.com.target" ];
before = [ "acme-finished-b.example.com.target" ];
after = [ "nginx.service" ];
};
services.nginx.virtualHosts."b.example.com" = {
enableACME = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should name these "subdomain certs" a bit more self-explaining, like "webroot-nginx" maybe?

wantedBy = [ "nginx.service" ];
};
services.nginx.virtualHosts."c.example.com" = {
forceSSL = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above.

@disassembler
Copy link
Member

@m1cr0man are you going to be able to get this addressed before feature freeze tomorrow?

@flokli
Copy link
Contributor

flokli commented Feb 10, 2020 via email

@m1cr0man
Copy link
Contributor Author

@flokli in general I found these acme tests to be a bit too complex. I would love to rewrite the whole lot of them and incorporate some Apache based tests too, but I can't do that today. I feel that given that there is at least a working test for DNS-01 and the numerous production instances of this code now running that it is stable for 20.03

@arianvp
Copy link
Member

arianvp commented Feb 10, 2020 via email

@flokli
Copy link
Contributor

flokli commented Feb 10, 2020

Okay, let's merge this, and improve the tests later on.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/setup-a-wildcard-certificate-with-acme-on-a-custom-domain-name-hosted-by-powerdns/15055/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants