Cert-manager: Waiting for http-01 challenge propagation: failed to perform self check GET request

Created on 31 Aug 2020 · 24 Comments · Source: jetstack/cert-manager

Status:
Presented: true
Processing: true
Reason: Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://abc.com/.well-known/acme-challenge/Oej8tloD2wuHNBWS6eVhSKmGkZNfjLRemPmpJoHOPkA': Get "http://abc.com/.well-known/acme-challenge/Oej8tloD2wuHNBWS6eVhSKmGkZNfjLRemPmpJoHOPkA": dial tcp 18.192.17.98:80: connect: connection timed out
State: pending

Installation:
I am using AWS EKS.
I cloned nginx-ingress locally and am installing it on EKS using the annotation
service.beta.kubernetes.io/aws-load-balancer-type: nlb

I installed certbot using Helm.
I applied an Issuer and an Ingress resource. So far I haven't created any application deployment.

When I run kubectl describe challenge, I get the error message above.

I am doing nothing extra. I have tried every approach I could think of, but it's not working. Can anyone help here?

triage/support

dineshgupta04

All 24 comments

Can you access https://abc.com/.well-known/acme-challenge/Oej8tloD2wuHNBWS6eVhSKmGkZNfjLRemPmpJoHOPkA from within a pod inside the cluster?

/triage support

meyskens on 22 Sep 2020

@meyskens I was getting the same issue. Here is the result.
[Screenshot: Capture]

I was using the HAProxy ingress controller but had created the Ingress with nginx annotations.

gudipudipradeep on 24 Sep 2020

Hi, I'm having the same problem (with nginx ingress). I tried curling the validation URL from a pod in the cluster and got the following response:

curl: (52) Empty reply from server

Whereas doing the same from outside the cluster returns the secret as expected.

Does this mean there's something strange going on with DNS in Kubernetes?
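
A minimal sketch of that inside-versus-outside comparison, in Python. It only approximates the self check: it assumes the check passes when a plain GET returns exactly the expected key authorization. The name self_check and its return strings are made up for illustration and are not cert-manager's API.

```python
import http.client
import urllib.error
import urllib.request

# Ignore proxy environment variables so the request takes the direct route.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def self_check(url: str, expected_body: str, timeout: float = 5.0) -> str:
    """GET the challenge URL and describe the outcome as a short string."""
    try:
        with _DIRECT.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace").strip()
    except urllib.error.HTTPError as err:
        # The server answered, but with a non-2xx status.
        return f"http error: status {err.code}"
    except (OSError, http.client.HTTPException) as err:
        # Covers timeouts, refused or reset connections, and empty replies.
        return f"connection problem: {err}"
    return "ok" if body == expected_body else "unexpected body"
```

Running the same call from a pod in the cluster and from a machine outside it shows whether only the in-cluster path is broken, which is the pattern described here.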

nabsul on 28 Sep 2020

👀1

Interestingly, I can access external domains from the pod, but I can't seem to access any of the domains that are hosted inside this cluster. For example:

> curl https://nabeel.blog
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to nabeel.blog:443

vs:

> curl https://bing.com
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="https://www.bing.com:443/?toWww=1&amp;redig=E2BCEB95F2954770B50A53D9BBBE3C3D">here</a>.</h2>
</body></html>
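
To test whether DNS is really at fault, name resolution can be separated from the TCP connect. Note that the error in the original report is a connect timeout after the name had already resolved to 18.192.17.98, which may point at routing rather than DNS. The following is a rough sketch only; probe and its return strings are hypothetical names chosen for this example.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Resolve host, then try a TCP connect to the first address returned."""
    try:
        addrs = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as err:
        # The name did not resolve at all: a genuine DNS problem.
        return f"dns failure: {err}"
    ip = addrs[0][4][0]
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return f"connected to {ip}"
    except socket.timeout:
        # The name resolved, but packets to the address went unanswered.
        return f"timed out connecting to {ip}"
    except OSError as err:
        return f"connect failed to {ip}: {err}"
```

Called from inside a pod, for example probe("nabeel.blog", 443), a "dns failure" result would support the DNS theory, while a timeout or a successful connect would suggest the problem sits later in the path.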

nabsul on 28 Sep 2020

👀1

I'm facing the same problem as @nabsul

Serrvosky on 6 Oct 2020

Based on some other threads I've been reading, this seems to be related to a bug in Kubernetes DNS rather than in cert-manager itself (although it certainly affects cert-manager heavily).

In case this is helpful to others stuck on this issue, I've unblocked myself by manually generating certs and uploading them to my cluster. (Note: I prefer to use a VM or pod to do the following because the source IP address gets logged to public records at Let's Encrypt.)

1) Run certbot certonly --manual --preferred-challenges dns and follow the instructions. You'll need to update a TXT record in your domain settings to complete the process.

2) Go to the directory where the certs were created and run kubectl create secret tls [secret-name] --cert=fullchain.pem --key=privkey.pem

It's tedious, but at least my certs won't expire while we're waiting for this bug to get fixed.

Docs:

https://certbot.eff.org/docs/using.html#manual
https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-secret-tls-em-
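
Step 2 above amounts to creating a Secret of type kubernetes.io/tls whose data holds the base64-encoded certificate and key under tls.crt and tls.key. A small sketch that builds the same manifest in Python, in case scripting the upload is preferable; the function name tls_secret_manifest and the default namespace are assumptions made for this example.

```python
import base64
import json


def tls_secret_manifest(name: str, cert_pem: bytes, key_pem: bytes,
                        namespace: str = "default") -> dict:
    """Build a TLS Secret manifest from PEM-encoded certificate and key bytes."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/tls",
        "metadata": {"name": name, "namespace": namespace},
        "data": {
            # Secret data values must be base64-encoded strings.
            "tls.crt": base64.b64encode(cert_pem).decode("ascii"),
            "tls.key": base64.b64encode(key_pem).decode("ascii"),
        },
    }


# Example use (file names as produced by certbot):
#   cert = open("fullchain.pem", "rb").read()
#   key = open("privkey.pem", "rb").read()
#   print(json.dumps(tls_secret_manifest("my-tls", cert, key), indent=2))
```

The printed JSON can be piped to kubectl apply -f - instead of running kubectl create secret tls by hand.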

nabsul on 6 Oct 2020

In the meantime, I wonder how hard/bad it would be to hack cert-manager itself to skip the "self check GET request" step altogether. It's a great idea to do this check, but I don't think it's absolutely necessary to the cert renewal process.

nabsul on 6 Oct 2020

WARNING: This is a random idea that I haven't fully thought through. Attempt at your own risk :-).

Although I would love to, I most likely don't have time to mess with this idea, but if anyone wants to give it a shot, I would try replacing the testReachability() function here with a simple return nil.

You'd then need to build a Docker image, upload it to docker hub, and use it instead of the official image in your cluster.

Again, if this works at all, it should be considered a temporary solution until a formal fix comes out.

nabsul on 6 Oct 2020

I strongly recommend not doing that. Have you tried https://cert-manager.io/docs/usage/certificate/#temporary-certificates-whilst-issuing ?

meyskens on 8 Oct 2020

@meyskens what do you recommend?

mohamedalaa33 on 17 Oct 2020

Hi all. Has there been any news on this issue? I gave up on looking for a solution and instead figured out how to manually renew Let's Encrypt certs in Kubernetes.

In case any of you are still stuck, I've shared code and instructions to do this here: https://github.com/nabsul/k8s-letsencrypt

The instructions are long for the sake of clarity, but it's actually not that bad to do it manually.

nabsul on 23 Oct 2020

Is there any new development on this issue?

UsernameAlvarez on 12 Nov 2020

I've lost faith in cert-manager due to the lack of progress on this issue. I'm writing a replacement that automates what I described in my previous comment.

nabsul on 13 Nov 2020

👍6

I am also having this issue. It's far too on-and-off.

just1689 on 22 Nov 2020

I also have the same issue.

nimerfarahty on 25 Nov 2020

If anyone is stuck and willing to try out some experimental code, please reach out to me (LinkedIn and Twitter username is the same as here).

nabsul on 25 Nov 2020

@nabsul Hi there. Are you referring to https://github.com/nabsul/k8s-letsencrypt ? I'd like to give it a go. Need to find time this week. I'd be happy to move the conversation over to your repo if that works for you.

just1689 on 25 Nov 2020

Hi here,

I got exactly the same issue while deploying the GitLab Helm charts, which rely on Jetstack's cert-manager (version v0.10.1).

The log of the acme-http-solver pod kept reporting "Failed to perform self check GET request" for the challenge URL on the public domain.

I was working on Scaleway, a French Kubernetes-as-a-service provider (https://www.scaleway.com/en/kubernetes-kapsule/).

The default network setup uses Cilium as the container network interface (CNI).

On a cluster with the same provider using the Calico CNI, the problem is gone and certificates are properly issued.

rcomblen on 25 Nov 2020

👍1

I finally found a solution: I followed the tutorial on DigitalOcean, and step 5 solved the issue.

nimerfarahty on 25 Nov 2020

👍4

@nimerfarahty Do you know if the solution suggested by DigitalOcean works if you have multiple domains on the same load balancer?

nabsul on 25 Nov 2020

@just1689 It's actually a new project based on the one you referenced. I'll be making the code public today and will tag you from there.

nabsul on 25 Nov 2020

👍1

Interestingly, I can access external domains from the pod, but I can't seem to access any of the domains that are hosted inside this cluster. For example:
> curl https://nabeel.blog
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to nabeel.blog:443
vs:
> curl https://bing.com
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="https://www.bing.com:443/?toWww=1&amp;redig=E2BCEB95F2954770B50A53D9BBBE3C3D">here</a>.</h2>
</body></html>

Same issue here (DigitalOcean). This probably points to a bug upstream. I'm not sure whether the issue only started occurring after cert-manager was installed, though; if it did, maybe it's not upstream.

chrissound on 26 Nov 2020

Seems to be the same issue here: https://github.com/jetstack/cert-manager/issues/466

chrissound on 26 Nov 2020

This PR (an implementation of the KEP) might help: https://github.com/kubernetes/kubernetes/pull/92312
It will probably be merged for 1.21 (next cycle). Until then, maybe use the workarounds or use DNS challenges instead of HTTP.