New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix letsencrypt #60219
Fix letsencrypt #60219
Conversation
f87c497
to
4f79d6d
Compare
I am planning to deprecate the I think a good replacement instead would be to create a
This way, the delay isn't an arbitrary number, but mandated by the script that you use to publish to See e70d293#commitcomment-34131909 for context |
CC @pngwjpgh |
SGTM. Any fixed "delay" value is inherently wrong. |
One big problem: You can't If anybody has an idea on how to test for oneshots firing, please let me know |
Even with remain after exit?
…On Sun, Jun 30, 2019, at 9:28 AM, Arian van Putten wrote:
One big problem:
You can't `waitForUnit` on `oneshot` services, as their state goes from `inactive => activating => inactive`. This makes the tests currently fail.
If anybody has an idea on how to test for oneshots firing, please let me know
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub <#60219?email_source=notifications&email_token=AAASXLGF3OCJIHVAAU76XCTP5CYGLA5CNFSM4HIOERIKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODY4MGII#issuecomment-507036449>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAASXLDKA3PTU4J5O4WJLO3P5CYGLANCNFSM4HIOERIA>.
|
RemainAfterExit will work but then the unit remains active which means the
timer unit will not periodically activate it causing your certs to expire :p
On Sun, Jun 30, 2019, 15:37 Graham Christensen <notifications@github.com>
wrote:
… Even with remain after exit?
On Sun, Jun 30, 2019, at 9:28 AM, Arian van Putten wrote:
> One big problem:
> You can't `waitForUnit` on `oneshot` services, as their state goes from
`inactive => activating => inactive`. This makes the tests currently fail.
> If anybody has an idea on how to test for oneshots firing, please let me
know
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub <
#60219?email_source=notifications&email_token=AAASXLGF3OCJIHVAAU76XCTP5CYGLA5CNFSM4HIOERIKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODY4MGII#issuecomment-507036449>,
or mute the thread <
https://github.com/notifications/unsubscribe-auth/AAASXLDKA3PTU4J5O4WJLO3P5CYGLANCNFSM4HIOERIA
>.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#60219?email_source=notifications&email_token=AAEZNI6PE2RMN22SUGLFHPDP5CZIXA5CNFSM4HIOERIKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODY4MK3I#issuecomment-507037037>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEZNI4YSU7EOVB64Q2VNWTP5CZIXANCNFSM4HIOERIA>
.
|
@grahamc I've fixed the test by letting each oneshot activate a target (Which remains active after running) to use as a probe to test for unit activation. It's a bit hacky, but it works! |
@GrahamcOfBorg test acme |
A few problems remain (that were already present before I started coding):
Number 2. was introduced by @bjornfor in 7a0e958 Would you mind if I remove this coupling altogether, and document instead that people
whenever you want to depend on certs being generated on startup? For This will also fix #62958 |
@arianvp: I don't mind. I only added that line because nginx had it, and it felt like the natural thing to do at that time. |
This is ready for review. I can refactor even more but lets keep that for a new PR or a new NixOS Module altogether. At least the bugs that were present before are fixed now. |
I think we should do the big refactor to DNS challenges and |
910c6cb
to
89af1ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rebase this on latest master?
89af1ae
to
1ccd9f8
Compare
When subsequent certificates would be added, they would not wake up nginx correctly due to target units only being triggered once. We now added more fine-grained systemd dependencies to make sure nginx always is aware of new certificates and doesn't restart too early resulting in a crash. Furthermore, the acme module has been refactored. Mostly to get rid of the deprecated PermissionStartOnly systemd options which were deprecated. Below is a summary of changes made. * Use SERVICE_RESULT to determine status This was added in systemd v232. we don't have to keep track of the EXITCODE ourselves anymore. * Add regression test for requesting mutliple domains * Deprecate 'directory' option We now use systemd's StateDirectory option to manage create and permissions of the acme state directory. * The webroot is created using a systemd.tmpfiles.rules rule instead of the preStart script. * Depend on certs directly By getting rid of the target units, we make sure ordering is correct in the case that you add new certs after already having deployed some. Reason it broke before: acme-certificates.target would be in active state, and if you then add a new cert, it would still be active and hence nginx would restart without even requesting a new cert. Not good! We make the dependencies more fine-grained now. this should fix that * Remove activationDelay option It complicated the code a lot, and is rather arbitrary. What if your activation script takes more than activationDelay seconds? Instead, one should use systemd dependencies to make sure some action happens before setting the certificate live. e.g. If you want to wait until your cert is published in DNS DANE / TLSA, you could create a unit that blocks until it appears in DNS: ``` RequiredBy=acme-${cert}.service After=acme-${cert}.service ExecStart=publish-wait-for-dns-script ```
1ccd9f8
to
ba447fa
Compare
Co-Authored-By: Florian Klink <flokli@flokli.de>
Thanks! |
FWIW, I was actually using |
Feel free to come up with an alternative. We could use |
Not sure |
This is error log is generated because of:
Note that it's just a warning that is logged. not something that makes fatally fail. But still; it's something worth having a look at. Poking @aanderse I still think that |
I'm happy to hear the what and why of your custom directory to change my opinion, but as it currently stands if symlinks would do the job I continue to agree with the decision to remove the directory option. |
@arianvp Hmm, yeah, I see why my setup is causing the warning now. My config predates @aanderse Sure, I can detail the reasoning a bit: essentially, I keep any service-related data that is worth backing up* under a directory tree rooted at
Could I use a symlink? Often. But then I have more problems:
Pointing the service in question directly to the directory I actually want it to use is always going to be the better option. *: By "worth backing up", I mean "not trivially reproduceable". LE certs fall under this classification because acquiring them is rate-limited, so having to recreate them en masse would potentially be problematic. Combine that with HTTP Strict Transport Security and you could end up with effectively inaccessible domains for a while. |
@Shados to address your first two points: if you use I'm not against having an option for specifying the state directory when there is a justification for such an option (like an application that can't deal with symlinks as you mentioned), but it would be nice if there was justification, as opposed to preference. But then again... this is just my opinion, and I don't make the rules. |
@aanderse I got around to taking the symlink approach with Initially, I went with just a directory entry for Namely, I was back to getting the "unsafe path transition" warnings. And while this is just a warning for The cause of the warning was:
All up: I do think that using a tool designed to declaratively manipulate a filesystem is the right approach to solving this kind of issue in NixOS, but that using |
* nixos/acme: Fix ordering of cert requests When subsequent certificates would be added, they would not wake up nginx correctly due to target units only being triggered once. We now added more fine-grained systemd dependencies to make sure nginx always is aware of new certificates and doesn't restart too early resulting in a crash. Furthermore, the acme module has been refactored. Mostly to get rid of the deprecated PermissionStartOnly systemd options which were deprecated. Below is a summary of changes made. * Use SERVICE_RESULT to determine status This was added in systemd v232. we don't have to keep track of the EXITCODE ourselves anymore. * Add regression test for requesting mutliple domains * Deprecate 'directory' option We now use systemd's StateDirectory option to manage create and permissions of the acme state directory. * The webroot is created using a systemd.tmpfiles.rules rule instead of the preStart script. * Depend on certs directly By getting rid of the target units, we make sure ordering is correct in the case that you add new certs after already having deployed some. Reason it broke before: acme-certificates.target would be in active state, and if you then add a new cert, it would still be active and hence nginx would restart without even requesting a new cert. Not good! We make the dependencies more fine-grained now. this should fix that * Remove activationDelay option It complicated the code a lot, and is rather arbitrary. What if your activation script takes more than activationDelay seconds? Instead, one should use systemd dependencies to make sure some action happens before setting the certificate live. e.g. If you want to wait until your cert is published in DNS DANE / TLSA, you could create a unit that blocks until it appears in DNS: ``` RequiredBy=acme-${cert}.service After=acme-${cert}.service ExecStart=publish-wait-for-dns-script ``` (cherry picked from commit 604b7c1)
* nixos/acme: Fix ordering of cert requests When subsequent certificates would be added, they would not wake up nginx correctly due to target units only being triggered once. We now added more fine-grained systemd dependencies to make sure nginx always is aware of new certificates and doesn't restart too early resulting in a crash. Furthermore, the acme module has been refactored. Mostly to get rid of the deprecated PermissionStartOnly systemd options which were deprecated. Below is a summary of changes made. * Use SERVICE_RESULT to determine status This was added in systemd v232. we don't have to keep track of the EXITCODE ourselves anymore. * Add regression test for requesting mutliple domains * Deprecate 'directory' option We now use systemd's StateDirectory option to manage create and permissions of the acme state directory. * The webroot is created using a systemd.tmpfiles.rules rule instead of the preStart script. * Depend on certs directly By getting rid of the target units, we make sure ordering is correct in the case that you add new certs after already having deployed some. Reason it broke before: acme-certificates.target would be in active state, and if you then add a new cert, it would still be active and hence nginx would restart without even requesting a new cert. Not good! We make the dependencies more fine-grained now. this should fix that * Remove activationDelay option It complicated the code a lot, and is rather arbitrary. What if your activation script takes more than activationDelay seconds? Instead, one should use systemd dependencies to make sure some action happens before setting the certificate live. e.g. If you want to wait until your cert is published in DNS DANE / TLSA, you could create a unit that blocks until it appears in DNS: ``` RequiredBy=acme-${cert}.service After=acme-${cert}.service ExecStart=publish-wait-for-dns-script ``` (cherry picked from commit 604b7c1)
* nixos/acme: Fix ordering of cert requests When subsequent certificates would be added, they would not wake up nginx correctly due to target units only being triggered once. We now added more fine-grained systemd dependencies to make sure nginx always is aware of new certificates and doesn't restart too early resulting in a crash. Furthermore, the acme module has been refactored. Mostly to get rid of the deprecated PermissionStartOnly systemd options which were deprecated. Below is a summary of changes made. * Use SERVICE_RESULT to determine status This was added in systemd v232. we don't have to keep track of the EXITCODE ourselves anymore. * Add regression test for requesting mutliple domains * Deprecate 'directory' option We now use systemd's StateDirectory option to manage create and permissions of the acme state directory. * The webroot is created using a systemd.tmpfiles.rules rule instead of the preStart script. * Depend on certs directly By getting rid of the target units, we make sure ordering is correct in the case that you add new certs after already having deployed some. Reason it broke before: acme-certificates.target would be in active state, and if you then add a new cert, it would still be active and hence nginx would restart without even requesting a new cert. Not good! We make the dependencies more fine-grained now. this should fix that * Remove activationDelay option It complicated the code a lot, and is rather arbitrary. What if your activation script takes more than activationDelay seconds? Instead, one should use systemd dependencies to make sure some action happens before setting the certificate live. e.g. If you want to wait until your cert is published in DNS DANE / TLSA, you could create a unit that blocks until it appears in DNS: ``` RequiredBy=acme-${cert}.service After=acme-${cert}.service ExecStart=publish-wait-for-dns-script ``` FCIO: * cherry-picked from nixpkgs-unstable ba447fa * upstream PR NixOS#60219 * included in 19.09
FCIO: This is a patch for the upstream PR NixOS#60219. It improved the situation but there were still failures because the challenge was sent to the webserver before it was accepting requests for the new domain. Maybe using another LE client would fix this, too. Should be discussed with upstream.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/please-update-to-latest-19-09-if-using-letsencrypt/4595/3 |
Motivation for this change
Fixes #60180
Futhermore refactors the module a bit. I'm trying to get rid of
PermissionStartOnly
Things done
sandbox
innix.conf
on non-NixOS)nix-shell -p nix-review --run "nix-review wip"
./result/bin/
)nix path-info -S
before and after)