We upgraded to Concourse v5.3.0 back in June and started using the Let's Encrypt functionality. It has been running fine and we were happy with it. However, the original certificate now expires in a couple of weeks. We expected the certificate to renew around 30 days before expiration, but it has not.
Looking at the certificate transparency logs for our domain at https://crt.sh, we see that the certificate was actually renewed around 30 days before expiration, and apparently again a week after that. However, our Concourse is still serving the old certificate, which expires in two weeks.
Enable "Let's Encrypt" functionality and see that the certificate does not get updated.
We expected the certificate to automatically update.
Looking into this we started seeing errors in the concourse db logs around 30 days prior to expiration:
ERROR: duplicate key value violates unique constraint "cert_cache_pkey"
DETAIL: Key (domain)=(OUR-FANCY-DOMAIN) already exists.
STATEMENT: INSERT INTO cert_cache (domain, cert, nonce) VALUES ($1, $2, $3)
One hypothesis we have is that the new certificate is not being stored in the cache, so the served certificate never refreshes. In particular, the query that stores state in the cert_cache table is an INSERT, but perhaps it needs to be an UPSERT instead.
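To make the hypothesis concrete, here is a minimal sketch of what an upsert could look like in PostgreSQL. It assumes the table and column names shown in the failing statement above and that `domain` is the primary key, as the `cert_cache_pkey` error suggests. It is not the actual Concourse fix.

```sql
-- Sketch only: same columns as the failing INSERT, but an existing row
-- for the domain is overwritten instead of raising a duplicate-key error.
-- Requires PostgreSQL 9.5 or later for ON CONFLICT.
INSERT INTO cert_cache (domain, cert, nonce)
VALUES ($1, $2, $3)
ON CONFLICT (domain) DO UPDATE
SET cert  = EXCLUDED.cert,
    nonce = EXCLUDED.nonce;
```

With a statement like this, a renewed certificate would replace the cached one on the next write instead of being rejected.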
Thanks for looking into it, that theory about needing an UPSERT sounds accurate. I'll prioritize this so we get to it soon. v5.5 is about to ship and won't include this, but we can try to get it in a patch release soon after.
Rename the domain on the record that holds your cert and restart concourse web. On restart it will fetch a new certificate.
update cert_cache set domain='old_cert' where domain='your.domain.com';
We applied a similar workaround, but restarting concourse web initially failed for us. By then autocert had already used up the ~5 renewal attempts allowed per week, so we hit the Let's Encrypt API rate limit.
We ended up moving the old cert_cache entry out of the way about a day before the rate limits reset. After the reset, the cert refreshed automatically. So the workaround can be a bit more annoying than just running a query and restarting services.
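Given the rate-limit caveat above, a slightly more defensive version of the workaround might look like the following sketch. The name `old_cert` and the domain `your.domain.com` are placeholders taken from the earlier comment. The revert step is an assumption: it should only apply if no new row for the domain was written in the meantime.

```sql
-- 1. Check which entries are currently cached before changing anything.
SELECT domain FROM cert_cache;

-- 2. Move the stale entry out of the way, then restart concourse web.
UPDATE cert_cache SET domain = 'old_cert' WHERE domain = 'your.domain.com';

-- 3. Assumed fallback: if the renewal fails (for example, because of the
--    Let's Encrypt rate limit), move the old entry back so the
--    still-valid certificate can be served until the limit resets.
UPDATE cert_cache SET domain = 'your.domain.com' WHERE domain = 'old_cert';
```

The recent issuance history on https://crt.sh gives a rough idea of whether the weekly limit has already been reached before you try step 2.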