Bugs should be filed for issues encountered whilst operating cert-manager.
You should first attempt to resolve your issues through the community support
channels, e.g. Slack, in order to rule out individual configuration errors.
Please provide as much detail as possible.
Describe the bug:
I was following the steps here: https://cert-manager.io/docs/installation/kubernetes/
Expected behaviour:
I ran into an issue when trying to test the installation:
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: dial tcp 10.43.18.211:443: i/o timeout
Steps to reproduce the bug:
I followed the steps in the link above and got the issue when testing the installation.
Anything else we need to know?:
The installation step completed successfully, as I verified as follows:
kubectl get pods --namespace cert-manager Thu Apr 16 18:33:31 2020
NAME READY STATUS RESTARTS AGE
cert-manager-cainjector-79f4496665-7gptd 1/1 Running 0 11m
cert-manager-57cdd66b-7xvj2 1/1 Running 0 11m
cert-manager-webhook-6d57dbf4f-28zjc 1/1 Running 0 11m
I'm using VMs from Google.
Environment details:
/kind bug
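For anyone debugging this, a couple of commands that may help confirm whether the webhook Service actually has endpoints and whether the pod is serving (assuming the default cert-manager namespace and resource names shown above, including a deployment named cert-manager-webhook):
kubectl get svc,endpoints -n cert-manager cert-manager-webhook   # the Endpoints object should list the webhook pod IP
kubectl logs -n cert-manager deploy/cert-manager-webhook         # webhook pod logs, assuming the deployment name matches the pod prefix above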
I have the same scenario. I installed the latest version of cert-manager (0.15 alpha.1).
When trying to create the issuer and certificate from test-resources.yaml I get the following error. Everything is up and running, but I am still facing this error:
Error from server (InternalError): error when creating "test-resources.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Hey @sangnguyen7
How have you deployed k3s/what environment is it deployed into? This error indicates that your apiserver is unable to route traffic to the cert-manager webhook pod, which is a required component.
As far as I'm aware, this is required to be working for the Kubernetes conformance tests to pass, so it indicates that somewhere along the line your cluster is not configured correctly.
Are you able to run Sonobuoy to check and ensure your cluster is set up properly? This will hopefully help you to pinpoint what's going on 😄
/triage support
/area deploy
/remove-kind bug
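In case it helps, a minimal Sonobuoy run might look like the following (assuming the sonobuoy CLI is installed; exact flags can differ between versions):
sonobuoy run --wait                      # run the conformance tests and wait for completion
sonobuoy results $(sonobuoy retrieve)    # download the results tarball and print a summary
sonobuoy delete --wait                   # clean up the sonobuoy namespace afterwards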
Thanks @munnerz. No, I have not tried the tool yet. I will double-check with it.
I had the same issue in k8s, not k3s. I resolved it by switching from flannel to calico.
Same issue here using k3s. Resolved it using the flag --flannel-backend host-gw during the k3s setup. So it looks like something is wrong in the default flannel setup, but I didn't investigate further
Are there any related issues in the k3s repository that this could be linked to?
@munnerz, it seems the issue is related to the network setup and not related to cert-manager, if there is nobody else having other issues, you can close this issue. Thanks!
Maybe related to https://github.com/coreos/flannel/issues/1243
@alecunsolo, same issue in k3s. I changed the flannel backend from vxlan to host-gw, but it doesn't work.
vim /etc/systemd/system/k3s.service
ExecStart=/usr/local/bin/k3s server --flannel-backend host-gw
[root@bowser1704 ~]# cat /var/lib/rancher/k3s/agent/etc/flannel/net-conf.json
{
  "Network": "10.42.0.0/16",
  "Backend": {
    "Type": "host-gw"
  }
}
But it still doesn't work.
[root@bowser1704 ~]# kubectl apply -f cert-manager/cluster-issuer.yaml
Error from server (InternalError): error when creating "cert-manager/cluster-issuer.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: context deadline exceeded
Do you have any suggestions?
thanks.
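One thing worth checking here: with the host-gw backend, flannel programs plain routes to the other nodes' pod subnets on each host, so you can verify whether the backend switch actually took effect on the data path (assuming the default 10.42.0.0/16 cluster CIDR shown above):
ip route | grep 10.42    # with host-gw, per-node pod subnets should be routed via the other nodes' IPs rather than via flannel.1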
Maybe related to rancher/k3s#1266 and rancher/k3s#1958.
It took a few days to fix this problem.
It may be related to coreos/flannel#1268, or to vxlan bugs:
- Super slow access to the service IP from the host, maybe a 60s delay
- Can't access pods by service cluster IP except from the node where the pod is running
One workaround is to pin cert-manager-webhook to the node where you are by using spec.nodeSelector. Maybe related to rancher/k3s#1638.
I fixed it with this command:
ethtool -K flannel.1 tx-checksum-ip-generic off
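To confirm the offload setting actually took effect (note that ethtool changes are not persistent, so the flag would need to be reapplied if the node reboots or flannel.1 is recreated), you can check:
ethtool -k flannel.1 | grep tx-checksum-ip-generic    # should now report "off"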
Partly comment, partly question:
We're running k3s (v1.17.7+k3s1, flannel, AWS AMI2), but the master is separated from the cluster by a firewall (allowing port 6443 from workers towards the master; EDIT: the master runs no pods and is not part of the cluster workload). Until now we've been using the "no-webhook" version of cert-manager (until 0.13) without problems, but after upgrading yesterday we get the same error when we try to list certs using kubectl on the master:
# /usr/local/bin/kubectl get cert --all-namespaces
Error from server: conversion webhook for cert-manager.io/v1alpha2, Kind=Certificate failed: Post https://cert-manager-webhook.cert-manager.svc:443/convert?timeout=30s: context deadline exceeded
From a pod running on the cluster I can retrieve the certs (I tested the pod on different nodes, it always works):
curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
-H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
https://kubernetes.default.svc/apis/cert-manager.io/v1alpha2/certificates
Any idea why this would happen?
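For what it's worth, the apiserver dials the webhook's ClusterIP directly (as seen in the errors above), which only works on hosts where kube-proxy and the CNI are running. A quick, assumption-laden way to check that on the master:
ip link show flannel.1                       # missing if the k3s agent/flannel isn't running on this host
iptables-save | grep cert-manager-webhook    # kube-proxy's rules for the webhook Service, assuming iptables mode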
UPDATE: I moved my master server into the same security group and started k3s agent on the master node and it started working. So for anyone Googling this : if the master node is separated from the cluster and not running k3s agent, it doesn't appear that it will be able to contact the webhook server.
Having the same issue with our EKS cluster. Wondering if there is a fix to stop the pod from restarting on the timeout and just gracefully log an error?
Error from server (InternalError): error when creating "clusterIssuer.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": Post https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=30s: context deadline exceeded
Getting the above error.
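On EKS this often comes down to the control plane not being able to reach the webhook's port on the worker nodes (a security-group rule), rather than anything in cert-manager itself; that's an assumption, but worth ruling out. You can check which pod port the Service targets, i.e. the port the apiserver needs to reach:
kubectl -n cert-manager get svc cert-manager-webhook -o jsonpath='{.spec.ports[0].targetPort}'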
This problem may be caused by the CNI. After I modified the MTU of calico, the problem was solved.
"mtu": 1440-> "mtu": 1420,
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "log_file_path": "/var/log/calico/cni/cni.log",
      "datastore_type": "kubernetes",
      "nodename": "k3s-operator-1",
      "mtu": 1420,
      "ipam": {
        "type": "calico-ipam"
      },
      "policy": {
        "type": "k8s"
      },
      "kubernetes": {
        "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {"portMappings": true}
    },
    {
      "type": "bandwidth",
      "capabilities": {"bandwidth": true}
    }
  ]
}
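If you suspect an MTU problem like the one above, a quick sanity check is to compare the host interface MTU with the MTU the CNI is configured to use; the pod MTU needs headroom for the encapsulation overhead (roughly 20 bytes for IPIP, 50 for VXLAN). The interface name and path below are examples:
ip link show eth0 | grep -o 'mtu [0-9]*'    # MTU of the host's primary interface
grep -r '"mtu"' /etc/cni/net.d/             # MTU configured in the CNI config files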