Concourse: acme/autocert managed certificate not updating

Created on 29 Aug 2019 · 3 Comments · Source: concourse/concourse

Bug Report

We upgraded to Concourse v5.3.0 back in June and started using the Let's Encrypt functionality. This has been running fine and we were happy. However, the original certificate is now expiring in a couple of weeks. We expected our certificate to update around 30 days before expiration, but it has not.

Looking at the certificate transparency logs for our domain at https://crt.sh, we see that a certificate was in fact issued around 30 days before expiration, and apparently again a week after that. However, our Concourse is still serving the old certificate, which expires in two weeks.
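To confirm which certificate is actually being served, as opposed to what crt.sh shows as issued, a quick check along these lines can help. This is a minimal sketch using only the Python standard library; the helper names `served_not_after` and `days_until_expiry` are made up for illustration, and `ci.example.com` in the usage note is a placeholder hostname.

```python
import socket
import ssl
import time

def served_not_after(host, port=443):
    """Return the notAfter string of the certificate the server presents."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]

def days_until_expiry(not_after, now=None):
    """Whole days between `now` (epoch seconds, default: current time) and notAfter."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return int((expires - now) // 86400)
```

Calling `days_until_expiry(served_not_after("ci.example.com"))` against the web node reports how many days the served certificate has left. If that number keeps shrinking below 30 while crt.sh lists newer certificates, the renewed certificate is being issued but not served.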

Steps to Reproduce

Enable "Let's Encrypt" functionality and see that the certificate does not get updated.

Expected Results

We expected the certificate to automatically update.

Additional Context

Looking into this, we found errors in the Concourse db logs starting around 30 days prior to expiration:

ERROR:  duplicate key value violates unique constraint "cert_cache_pkey"
DETAIL:  Key (domain)=(OUR-FANCY-DOMAIN) already exists.
STATEMENT:  INSERT INTO cert_cache (domain, cert, nonce) VALUES ($1, $2, $3)

One hypothesis we have is that the renewed certificate is never stored in the cache, so the served certificate is never refreshed. In particular, the query that stores state in the cert_cache table is a plain INSERT, which fails once a row for the domain already exists; perhaps it needs to be an UPSERT instead.
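The difference between the two statements can be reproduced in isolation. The sketch below is an illustration, not Concourse's actual code: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table schema is an assumption inferred from the columns and constraint name in the error log. The function names and the `ci.example.com` domain are hypothetical. The `ON CONFLICT ... DO UPDATE` clause shown here is accepted by both PostgreSQL 9.5+ and SQLite 3.24+.

```python
import sqlite3

# In-memory SQLite stands in for Concourse's PostgreSQL database.
# The schema is an assumption based on the error log above.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cert_cache (domain TEXT PRIMARY KEY, cert TEXT, nonce TEXT)")

def put_insert(domain, cert, nonce):
    """Plain INSERT, matching the statement in the log."""
    db.execute(
        "INSERT INTO cert_cache (domain, cert, nonce) VALUES (?, ?, ?)",
        (domain, cert, nonce),
    )

def put_upsert(domain, cert, nonce):
    """INSERT that overwrites the existing row when the domain already exists."""
    db.execute(
        "INSERT INTO cert_cache (domain, cert, nonce) VALUES (?, ?, ?) "
        "ON CONFLICT (domain) DO UPDATE "
        "SET cert = excluded.cert, nonce = excluded.nonce",
        (domain, cert, nonce),
    )

def cached_cert(domain):
    """Return the cached cert for a domain, or None if there is no row."""
    row = db.execute(
        "SELECT cert FROM cert_cache WHERE domain = ?", (domain,)
    ).fetchone()
    return row[0] if row else None

# First issuance: the row does not exist yet, so the INSERT works.
put_insert("ci.example.com", "original-cert", "n1")

# Renewal with a plain INSERT: rejected by the primary key.
try:
    put_insert("ci.example.com", "renewed-cert", "n2")
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True  # the renewed cert is dropped; the old one stays cached

cert_after_insert = cached_cert("ci.example.com")

# Renewal with an upsert: the existing row is replaced.
put_upsert("ci.example.com", "renewed-cert", "n2")
cert_after_upsert = cached_cert("ci.example.com")
```

With the plain INSERT the cache keeps returning the original certificate after the renewal attempt, which matches the behavior described in this report. With the upsert the renewed certificate replaces it.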

Version Info

  • Concourse version: 5.3.0
  • Deployment type (BOSH/Docker/binary): BOSH
  • Infrastructure/IaaS: GCP
bug

All 3 comments

Thanks for looking into it, that theory about needing an UPSERT sounds accurate. I'll prioritize this so we get to it soon. v5.5 is about to ship and won't include this, but we can try to get it in a patch release soon after.

Workaround

Rename the domain on the cert_cache record that holds your cert, then restart concourse web. Upon restart it'll grab a new certificate.

update cert_cache set domain='old_cert' where domain='your.domain.com';
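Why the rename helps can be sketched the same way. This is an illustration only, again using in-memory SQLite in place of PostgreSQL and an assumed schema based on the error log; the cert and nonce values are dummies. Once the old row is moved to a different key, the primary key for the real domain is free and the plain INSERT from the log no longer conflicts.

```python
import sqlite3

# In-memory SQLite as a stand-in for PostgreSQL; schema is an assumption.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cert_cache (domain TEXT PRIMARY KEY, cert TEXT, nonce TEXT)")
db.execute("INSERT INTO cert_cache VALUES ('your.domain.com', 'expiring-cert', 'n1')")

# The workaround: move the old row out of the way under a different key.
db.execute("UPDATE cert_cache SET domain = 'old_cert' WHERE domain = 'your.domain.com'")

# The key is free again, so the plain INSERT now succeeds.
db.execute(
    "INSERT INTO cert_cache (domain, cert, nonce) VALUES (?, ?, ?)",
    ("your.domain.com", "renewed-cert", "n2"),
)

# Both rows now exist: the old cert under 'old_cert', the new one under the domain.
rows = dict(db.execute("SELECT domain, cert FROM cert_cache"))
```

The old certificate stays in the table under `old_cert`, so it can be renamed back if the new issuance fails.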

We applied a similar workaround, but restarting concourse web initially failed for us. This is because autocert had already exhausted the ~5 renew attempts per week and we hit the Let's Encrypt API rate limit.

We ended up moving the old cert_cache entry out of the way about a day before the rate limits reset. The cert then refreshed automatically. So the workaround can be a bit more annoying than just running a query and restarting services.
