Describe the bug:
I have an nginx ingress controller on my cluster and I'm also using cert-manager version v0.9.1. I have an Ingress that I'm trying to get a certificate for. The issue is that the created challenge goes to a pending state, with the reason: Waiting for http-01 challenge propagation: wrong status code '404', expected '200'.
I did some digging and found a thread on the Let's Encrypt forums stating that this issue might be caused by a preflight check done by cert-manager. My question is: is this a bug, or is the problem caused by my running an old version of cert-manager?
Expected behaviour:
A new Ready certificate issued for the Ingress I created.
Steps to reproduce the bug:
Create the manifest provided below
Anything else we need to know?:
My ingress manifest:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    certmanager.k8s.io/acme-http01-edit-in-place: "false"
    ingress.kubernetes.io/ssl-redirect: "false"
    kubernetes.io/tls-acme: "true"
  name: test-ingress
  namespace: default
spec:
  rules:
  - host: test.domain.com
    http:
      paths:
      - backend:
          serviceName: backendservice
          servicePort: 80
        path: /
  tls:
  - hosts:
    - test.domain.com
    secretName: test-tls-secret
I also have a backend that serves static content on its index page.
The created challenge fails fairly quickly, after something like 7 seconds, which made me think it might be a preflight check.
Environment details:
- Kubernetes version (e.g. v1.14.8):
- cert-manager version (e.g. v0.9.1):
/kind bug
/kind support
@MohiK98: The label(s) kind/support cannot be applied, because the repository doesn't have them
Yep, cert-manager will perform a self check to verify that your HTTP solving configuration has been correctly applied/adopted by your ingress controller, and to ensure that your DNS is configured to correctly route traffic to your ingress controller. Without this check, when Let's Encrypt attempts to verify your ownership of the domain, the verification would fail and you would not be given a certificate.
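Concretely, the self check is just an HTTP GET of http://<your-domain>/.well-known/acme-challenge/<token>, expecting a 200 response whose body is the key authorisation. To serve that URL, cert-manager creates a temporary solver pod, service, and ingress for each challenge, roughly like this (the names are generated, and <token> comes from the ACME order):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: cm-acme-http-solver-abcde   # generated name, illustrative only
  namespace: default
spec:
  rules:
  - host: test.domain.com
    http:
      paths:
      - path: /.well-known/acme-challenge/<token>
        backend:
          serviceName: cm-acme-http-solver-abcde   # generated solver service
          servicePort: 8089                        # the solver pod's listen port

The 404 in your error means that GET is not reaching the solver, either because the ingress controller hasn't picked up the solver ingress yet, or because the request never reaches the controller at all.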
First, I'd advise updating to a current/supported version of cert-manager (v0.14).
I'd then also check through the documentation for info on how to set up HTTP01 with your own ingress controller. Where are you running cert-manager, what ingress controller are you using, etc.?
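For reference, once you're on v0.14 a minimal HTTP01 issuer looks something like this (the email and names below are placeholders, and the class must match your ingress controller):

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com                  # placeholder
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - http01:
        ingress:
          class: nginx                      # must match your controller's class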
/triage support
/remove-kind bug
/area acme/http01
Thanks for the feedback. There are a couple of issues I need to address before upgrading cert-manager; I'll post an update when that's done.
My nginx ingress controller image is: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
As for my cluster, it's self-hosted; I'm not using any cloud provider's services.
I'm running cert-manager v0.14.3 and I run into this issue as well.
The strange thing is that I currently only see this for domain names with a dash in them, e.g. my-api.domain.com (not sure whether the dash is actually related to the issue).
@munnerz could you explain in more detail what the self check does? What is it checking?
Okay, I think I found my issue. I kept thinking about what the check could be and suddenly realized that my cluster is on an internal network while the domain name is a public record. So when cert-manager resolves the name and tries to connect to it, the request has to go out to the internet and then back inside through the firewall. I have seen multiple kinds of issues where such connections fail, so I added an internal DNS record as well, so that when cert-manager resolves the name it gets the internal record instead (and doesn't go through the firewall to reach my ingress). Only 5 minutes after that, my certificates were installed and running fine.
I was misled by the message saying that propagation failed; in reality, cert-manager was simply unable to verify that the propagation had succeeded.
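In my case the record went onto our internal DNS server, but if your cluster runs CoreDNS you can get the same effect in the Corefile itself; a rough sketch, where 10.0.0.50 stands in for the ingress controller's internal address:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        # Resolve the public hostname to the ingress controller's
        # internal address so the self check never leaves the network.
        hosts {
            10.0.0.50 test.domain.com
            fallthrough
        }
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }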
May I ask where, exactly, you added this record?
And is it pointing to the LB / NodePort / the Ingress itself?
Thank you in advance for clarifying.
@bvbek there are several places/ways you can do this: on an internal/corporate DNS server, or directly in CoreDNS, as sketched above.
Can confirm: pointing the CoreDNS config at a corporate DNS server instead of 8.8.8.8 helps a lot. I suppose the problem was in network routing, where the cert-manager challenger could not reach itself from the local network.
I am trying to resolve a similar issue in a local setup, which is set up as follows:
www -> public IP:80 -> router firewall -> ingress-nginx-controller:30080 -> ing -> svc -> pod
Here, my ingress controller's service is of type NodePort and it is accessed via port 30080 instead of port 80.
What type of configuration would I need to set up in order for cert-manager to be able to reach my service's ACME HTTP solver from within my own private network?
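One idea I'm considering (untested) is a CoreDNS rewrite, so that from inside the cluster the public hostname resolves to the ingress controller's in-cluster Service, which listens on plain port 80 regardless of the NodePort mapping. The service name/namespace below are the stock ingress-nginx ones and may differ in your install; www.mydomain.example is a placeholder for my real domain:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        # Answer in-cluster queries for the public name with the
        # ingress controller's Service, reachable on port 80 directly
        # (no NodePort hop, no trip through the router firewall).
        rewrite name www.mydomain.example ingress-nginx-controller.ingress-nginx.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
        cache 30
    }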
I think this is possibly the same as https://github.com/jetstack/cert-manager/issues/1292.