Cert-manager: Waiting for http-01 challenge propagation: failed to perform self check GET request

Created on 31 Aug 2020 · 24 Comments · Source: jetstack/cert-manager

Status:
Presented: true
Processing: true
Reason: Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://abc.com/.well-known/acme-challenge/Oej8tloD2wuHNBWS6eVhSKmGkZNfjLRemPmpJoHOPkA': Get "http://abc.com/.well-known/acme-challenge/Oej8tloD2wuHNBWS6eVhSKmGkZNfjLRemPmpJoHOPkA": dial tcp 18.192.17.98:80: connect: connection timed out
State: pending

Installation:
I am using AWS EKS.
I cloned nginx-ingress locally and am installing it on EKS with the annotation
service.beta.kubernetes.io/aws-load-balancer-type: nlb

I installed cert-manager using Helm.
I applied an Issuer and an Ingress resource. So far I haven't created any application deployment.

When I run kubectl describe challenge, I get the error message above.

I am doing nothing extra. I have tried everything I could think of, but it's not working. Can anyone help?

All 24 comments

Can you access https://abc.com/.well-known/acme-challenge/Oej8tloD2wuHNBWS6eVhSKmGkZNfjLRemPmpJoHOPkA from within a pod inside the cluster?
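
One rough way to run this check from inside a pod, assuming a Python interpreter is available in the container image, is sketched below. The helper name `self_check` and the 5-second timeout are illustrative choices and not part of cert-manager; the function simply issues a plain GET like the one in the error message and reports whether it succeeded.

```python
import http.client
import urllib.error
import urllib.request


def self_check(url, timeout=5.0):
    """Issue a GET against the challenge URL and return (ok, detail).

    Hypothetical debugging helper: it only approximates the kind of
    reachability check described in the error message.
    """
    # Ignore proxy settings from the environment so the request goes
    # straight to the target address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return True, f"HTTP {resp.status}: {body[:80]}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # DNS failure, refused connection, timeout, empty reply, and so on.
        return False, f"request failed: {exc}"
```

Calling `self_check("http://abc.com/.well-known/acme-challenge/<token>")` from a pod and again from a machine outside the cluster makes it easy to compare the two results side by side.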

/triage support

@meyskens I was getting the same issue too. Here is the result.
[screenshot not preserved]

I was using the HAProxy ingress controller but had created the Ingress with an nginx-based annotation.
[screenshot not preserved]

Hi, I'm having the same problem (with nginx ingress). I tried curling the validation URL from a pod in the cluster and got the following response:

curl: (52) Empty reply from server

Whereas doing the same from outside the cluster returns the secret as expected.

Does this mean there's something strange going on with DNS in Kubernetes?

Interestingly, I can access external domains from the pod, but I can't seem to access any of the domains that are hosted inside this cluster. For example:

> curl https://nabeel.blog
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to nabeel.blog:443

vs:

> curl https://bing.com
<html><head><title>Object moved</title></head><body>
<h2>Object moved to <a href="https://www.bing.com:443/?toWww=1&amp;redig=E2BCEB95F2954770B50A53D9BBBE3C3D">here</a>.</h2>
</body></html>
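
To help narrow down the DNS question, a small sketch like the following separates name resolution from the TCP connection, so you can see which step fails from inside the pod. The helper name `probe` and its defaults are assumptions made for illustration.

```python
import socket


def probe(host, port=443, timeout=5.0):
    """Return a short diagnosis: DNS failure, connect failure, or success."""
    try:
        # Step 1: name resolution only.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return f"dns-failed: {exc}"

    addresses = sorted({info[4][0] for info in infos})
    last_error = None
    # Step 2: try a TCP connection to each resolved address.
    for addr in addresses:
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                return f"connected: {addr}:{port}"
        except OSError as exc:
            last_error = exc
    return f"connect-failed: {', '.join(addresses)} ({last_error})"
```

If `probe("nabeel.blog")` reports `connected` from the pod while curl still fails, that would suggest the failure happens after the connection is established rather than in DNS.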

I'm facing the same problem as @nabsul

Based on some other threads I've been reading, this seems to be related to a bug in Kubernetes DNS, not directly to cert-manager (though it certainly affects cert-manager heavily).

In case this is helpful to others stuck on this issue, I've unblocked myself by manually generating certs and uploading them to my cluster. (Note: I prefer to use a VM or pod to do the following because the source IP address gets logged to public records at Let's Encrypt.)

1) Run certbot certonly --manual --preferred-challenges dns and follow the instructions. You'll need to update a TXT record in your domain settings to complete the process.

2) Go to the directory where the certs were created and run kubectl create secret tls [secret-name] --cert=fullchain.pem --key=privkey.pem

It's tedious, but at least my certs won't expire while we're waiting for this bug to get fixed.

Docs:

https://certbot.eff.org/docs/using.html#manual
https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-secret-tls-em-
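
For anyone who would rather script step 2, here is a minimal sketch that builds the same kind of object `kubectl create secret tls` produces: a Secret of type `kubernetes.io/tls` with base64-encoded `tls.crt` and `tls.key` entries. The function names, the secret name, and the namespace are placeholders chosen for this example.

```python
import base64
import json


def tls_secret_manifest(name, cert_pem, key_pem, namespace="default"):
    """Build a kubernetes.io/tls Secret manifest from PEM bytes."""

    def b64(data):
        return base64.b64encode(data).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/tls",
        "metadata": {"name": name, "namespace": namespace},
        "data": {"tls.crt": b64(cert_pem), "tls.key": b64(key_pem)},
    }


def manifest_from_files(name, cert_path="fullchain.pem",
                        key_path="privkey.pem", namespace="default"):
    """Read the certbot output files and return the manifest as JSON text."""
    with open(cert_path, "rb") as cert_file, open(key_path, "rb") as key_file:
        manifest = tls_secret_manifest(
            name, cert_file.read(), key_file.read(), namespace
        )
    return json.dumps(manifest, indent=2)
```

The JSON text can be written to a file and applied with `kubectl apply -f`, which makes it easier to repeat the upload on each renewal.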

In the meantime, I wonder how hard/bad it would be to hack cert-manager itself to skip the "self check GET request" step altogether. It's a great idea to do this check, but I don't think it's absolutely necessary to the cert renewal process.

WARNING: This is a random idea that I haven't fully thought through. Attempt at your own risk :-).

Although I would love to, I most likely don't have time to mess with this idea, but if anyone wants to give it a shot, I would try replacing the testReachability() function here with a simple return nil.

You'd then need to build a Docker image, upload it to docker hub, and use it instead of the official image in your cluster.

Again, if this works at all, it should be considered a temporary solution until a formal fix comes out.

I strongly recommend not doing that. Have you tried https://cert-manager.io/docs/usage/certificate/#temporary-certificates-whilst-issuing ?

@meyskens what do you recommend?

Hi all. Has there been any news on this issue? I gave up on looking for a solution and instead figured out how to manually renew Let's Encrypt certs in Kubernetes.

In case any of you are still stuck, I've shared code and instructions to do this here: https://github.com/nabsul/k8s-letsencrypt

The instructions are long for the sake of clarity, but it's actually not that bad to do it manually.

Is there any new development on this issue?

I've lost faith in cert-manager due to the lack of progress on this issue. I'm writing a replacement that automates what I described in my previous comment.

I am also having this issue. It's far too on-and-off.

I also have the same issue.

If anyone is stuck and willing to try out some experimental code, please reach out to me (LinkedIn and Twitter username is the same as here).

@nabsul Hi there. Are you referring to https://github.com/nabsul/k8s-letsencrypt ? I'd like to give it a go. Need to find time this week. I'd be happy to move the conversation over to your repo if that works for you.

Hi here,

I got exactly the same issue while deploying the GitLab Helm charts, which rely on Jetstack's cert-manager (version v0.10.1).

The log of the acme-http-solver pod kept reporting "Failed to perform self check GET request" for the challenge URL on the public domain.

I was working on Scaleway, a French Kubernetes-as-a-service provider (https://www.scaleway.com/en/kubernetes-kapsule/).

The default network setup uses Cilium as the container network interface (CNI).

On a cluster with the same provider using the Calico CNI, the problem is gone and certificates are issued properly.

I finally found a solution: I followed the tutorial on DigitalOcean, and step 5 solved the issue.

@nimerfarahty Do you know if the solution suggested by DigitalOcean works if you have multiple domains on the same load balancer?

@just1689 It's actually a new project based on the one you referenced. I'll be making the code public today and will tag you from there.

Same issue here (DigitalOcean), which probably points to a bug upstream. I'm not sure whether this issue only started occurring after installing cert-manager, though; if so, maybe it's not upstream.

Seems to be the same issue here: https://github.com/jetstack/cert-manager/issues/466

This PR (implementation of the KEP) might help: https://github.com/kubernetes/kubernetes/pull/92312
It will probably be merged for 1.21 (next cycle). Until then, maybe use the workarounds or use the DNS challenge instead of HTTP.
