Describe the bug
The change from 19.09 to 20.03 (or unstable) breaks a potentially common workflow for certificate renewal:
This leads to situations where previously valid certificates are silently overwritten by lego, for instance if two different certificates have the same "main domain" key: the first systemd job to run will have a valid certificate, while the second one will get the other job's certificate.
https://github.com/go-acme/lego/issues/838 explains that the option (--filename) that would avoid the issue is deprecated.
In case my issue isn’t clear enough, the second point of the linked issue is the same as this one, so it seems like without this --filename option it would become impossible to be backward compatible with simp_le.
The solution may then be to raise an error (or at least warn) when more than one certificate has a given "domain" key.
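A minimal sketch of what such a check could look like as a NixOS module, assuming the security.acme.certs.<name>.domain option discussed in this issue. The module itself and the names namesByDomain and duplicated are hypothetical and not part of nixpkgs:

```nix
# Hypothetical module sketch; not part of nixpkgs.
{ config, lib, ... }:

let
  certs = config.security.acme.certs;

  # Map each main domain to the certificate attribute names that use it,
  # e.g. { "example.com" = [ "web" "xmpp" ]; }.
  namesByDomain = lib.groupBy (name: certs.${name}.domain) (lib.attrNames certs);

  # Keep only the domains claimed by more than one certificate.
  duplicated = lib.filterAttrs (_domain: names: lib.length names > 1) namesByDomain;
in
{
  # Emit one evaluation-time warning per shared main domain.
  warnings = lib.mapAttrsToList
    (domain: names:
      "security.acme.certs: ${lib.concatStringsSep ", " names} share the main domain "
      + "${domain}; lego may silently overwrite one certificate with the other.")
    duplicated;
}
```

To fail evaluation instead of warning, the same duplicated set could presumably feed an entry in the assertions option, with `duplicated == { }` as the condition.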
ping @arianvp @Mic92 @m1cr0man @emilazy @flokli
@immae I'm not sure if I understood the issue correctly, but you should refer to certificates from /var/lib/acme/${name}/*.pem, not /var/lib/acme/.lego/certificates/….
security.acme.certs.web = {
  domain = "example.com";
};
security.acme.certs.xmpp = {
  domain = "example.com";
  extraDomains = { "uploads.example.com" = null; };
};
This should create two timers writing to /var/lib/acme/web/*.pem and /var/lib/acme/xmpp/*.pem, right?
Thanks for your example @flokli I have trouble explaining it correctly :)
It will create two timers, yes, but /var/lib/acme/.lego/certificates will contain only one set of files, example.com.*, which may or may not include uploads.example.com depending on which timer ran first.
This result will then be copied to both /var/lib/acme/web/*.pem and /var/lib/acme/xmpp/*.pem, so xmpp may have uploads.example.com missing if acme-web.timer was the first one to run.
Is it clearer presented like this?
I solved it using: https://github.com/Mic92/nixpkgs/commit/41f3375edccf6e28af09291da8b8d653dbf5b03b
This fix is not upstream though.
This looks reasonable. @Mic92, could you file a PR?
With this applied, will it re-request certificates entirely from scratch?
IMHO, this is a regression that should also be fixed for 20.03.
@flokli: It will, but as far as I saw it is already the case in the current state (although I wouldn’t bet on it, it’s the impression I had when upgrading and I don’t see in the code why it would be otherwise)
The problem appears when you try to create different certificates that have the same domain as their common name (CN). Lego uses the CN as a directory name internally in its .lego directory. In my case I need multiple certificates for a common domain for the following use cases:
/var/lib/acme/.lego/
I think this is really ugly. The system certificates are installed in a well-known path (/var/lib/acme), and they shouldn't be in a hidden directory, even worse, one whose name depends on an implementation detail (the name of the ACME client). This will break again the next time the ACME client is changed.
Can we get rid of it somehow?
@rnhmjoj these are lego-internal details - you shouldn't need to ever access them from there manually. The script is taking care to copy them to the appropriate locations, with the right permissions.
@Mic92 These are all valid use cases for multiple certificates for the same domain, and I see this as a regression from 19.09.
I just wonder whether we should apply something like https://github.com/Mic92/nixpkgs/commit/41f3375edccf6e28af09291da8b8d653dbf5b03b to nixpkgs and get it backported to 20.03, so that we avoid yet another round of SSL certificate re-requests: lego will re-request certs another time because of the internally changed /var/lib/acme/.lego paths.
The script is taking care to copy them to the appropriate locations, with the right permissions.
I must have misunderstood the issue, then. So certificates are still under /var/lib/acme/${name}, where name is the attribute name of the configuration, but having multiple certificates with the same CN results in one overwriting another?
Certificates are exposed in /var/lib/acme/${name} (with name being the name of the attrset, not the main domain). The certificate handling in the internal .lego folder is messed up, which caused certs to be overwritten in there (and then copied out), IIRC.
Ok, thank you. I get it now.
@Mic92 can you prepare a PR for master?
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/go-no-go-meeting-nixos-20-03-markhor/6495/19