Describe the bug:
I have an nginx ingress controller on my cluster and I'm also using cert-manager version v0.9.1. I have an Ingress that I'm trying to get a certificate for. The issue is that the created challenge goes to a pending state, with the reason: Waiting for http-01 challenge propagation: wrong status code '404', expected '200'.
I did some digging and found a thread on the Let's Encrypt forums stating that this issue might be caused by a preflight check done by cert-manager. My question is: is this a bug, or is the problem caused by my running an old version of cert-manager?
Expected behaviour:
A new Ready certificate issued for the Ingress I created.
Steps to reproduce the bug:
Create the manifest provided below
Anything else we need to know?:
My ingress manifest:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    certmanager.k8s.io/acme-http01-edit-in-place: "false"
    ingress.kubernetes.io/ssl-redirect: "false"
    kubernetes.io/tls-acme: "true"
  name: test-ingress
  namespace: default
spec:
  rules:
  - host: test.domain.com
    http:
      paths:
      - backend:
          serviceName: backendservice
          servicePort: 80
        path: /
  tls:
  - hosts:
    - test.domain.com
    secretName: test-tls-secret
I also have a backend that serves static content on its index page.
The created challenge fails fairly quickly, after something like 7 seconds, which made me think it might be a preflight check.
Environment details:
- Kubernetes version (e.g. v1.14.8):
- cert-manager version (e.g. v0.9.1):
/kind bug
/kind support
@MohiK98: The label(s) kind/support cannot be applied, because the repository doesn't have them
Yep, cert-manager will perform a self check to verify that your HTTP solving configuration has been correctly applied/adopted by your ingress controller, and to ensure that your DNS is configured to correctly route traffic to your ingress controller. Without this check, when Let's Encrypt attempts to verify your ownership of the domain, the verification would fail and you would not be given a certificate.
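Concretely, the self check is just an HTTP GET of http://<your-domain>/.well-known/acme-challenge/<token>, expecting a 200 response whose body is the key authorisation. To serve that URL, cert-manager creates a temporary solver pod, service, and ingress for each challenge, roughly like this (the names are generated, and <token> comes from the ACME order):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: cm-acme-http-solver-abcde   # generated name, illustrative only
  namespace: default
spec:
  rules:
  - host: test.domain.com
    http:
      paths:
      - path: /.well-known/acme-challenge/<token>
        backend:
          serviceName: cm-acme-http-solver-abcde   # generated solver service
          servicePort: 8089                        # the solver pod's listen port

The 404 in your error means that GET is not reaching the solver, either because the ingress controller hasn't picked up the solver ingress yet, or because the request never reaches the controller at all.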
First, I'd advise updating to a current/supported version of cert-manager (v0.14).
I'd then also check through the documentation for info on how to set up HTTP01 with your own ingress controller. Where are you running cert-manager, what ingress controller are you using, etc.?
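For reference, once you're on v0.14 a minimal HTTP01 issuer looks something like this (the email and names below are placeholders, and the class must match your ingress controller):

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com                  # placeholder
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - http01:
        ingress:
          class: nginx                      # must match your controller's class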
/triage support
/remove-kind bug
/area acme/http01
Thanks for the feedback. There are a couple of issues I need to address before upgrading cert-manager; I'll post an update when that's done.
My nginx ingress controller image is: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
As for my cluster, it's self-hosted; I'm not using any cloud provider's services.
I'm running cert-manager v0.14.3 and I run into this issue as well.
The strange thing is that I currently only see this for domain names with a dash in them, e.g. my-api.domain.com (not sure whether the dash is actually related to the issue).
@munnerz could you explain in more detail what the self check does? What is it checking?
Okay, I think I found my issue. I kept thinking about what the check could be and suddenly realized that my cluster is on an internal network while the domain name is a public record. So when cert-manager resolves the name and tries to connect to it, the request has to go out to the internet and then back inside through the firewall. I have seen multiple kinds of issues where such connections fail, so I added an internal DNS record as well, so that when cert-manager resolves the name it gets the internal record instead (and doesn't go through the firewall to reach my ingress). Only 5 minutes after that, my certificates were installed and running fine.
I was misled by the message saying that propagation failed; in reality, cert-manager was simply unable to verify that the propagation had succeeded.
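In my case the record went onto our internal DNS server, but if your cluster runs CoreDNS you can get the same effect in the Corefile itself; a rough sketch, where 10.0.0.50 stands in for the ingress controller's internal address:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        # Resolve the public hostname to the ingress controller's
        # internal address so the self check never leaves the network.
        hosts {
            10.0.0.50 test.domain.com
            fallthrough
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }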
May I ask where, exactly, you added this record?
And is it pointing to the LB / NodePort / the Ingress itself?
Thank you in advance for clarifying.
@bvbek there are several places/ways you can do this: on an internal/corporate DNS server, or directly in CoreDNS, as sketched above.
Can confirm: pointing the CoreDNS config at a corporate DNS server instead of 8.8.8.8 helps a lot. I suppose the problem was in network routing, where the cert-manager challenger could not reach itself from the local network.
I am trying to resolve a similar issue in a local setup, which is set up as follows:
www -> public IP:80 -> router firewall -> ingress-nginx-controller:30080 -> ing -> svc -> pod
Here, my ingress controller's service is of type NodePort and it is accessed via port 30080 instead of port 80.
What type of configuration would I need to set up in order for cert-manager to be able to reach my service's ACME HTTP solver from within my own private network?
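One idea I'm considering (untested) is a CoreDNS rewrite, so that from inside the cluster the public hostname resolves to the ingress controller's in-cluster Service, which listens on plain port 80 regardless of the NodePort mapping. The service name/namespace below are the stock ingress-nginx ones and may differ in your install; www.mydomain.example is a placeholder for my real domain:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        # Answer in-cluster queries for the public name with the
        # ingress controller's Service, reachable on port 80 directly
        # (no NodePort hop, no trip through the router firewall).
        rewrite name www.mydomain.example ingress-nginx-controller.ingress-nginx.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
        cache 30
    }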
I think this is possibly the same as https://github.com/jetstack/cert-manager/issues/1292.