Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.): no
What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.): DoS
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
NGINX Ingress controller version: 0.21.0
Kubernetes version (use kubectl version): v1.15.3
Environment:
uname -a):What happened:
The nginx-ingress-controller allows Ingress resources to include configuration snippets (https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#configuration-snippet)
It seems the nginx-ingress-controller does not pre-check these snippets for valid syntax before applying them to the controller's global nginx.conf. That means a syntax error can cause nginx to crash with a syntax error:
nginx: [emerg] unknown directive "|" in /tmp/nginx-cfg258693461:6221
nginx: configuration file /tmp/nginx-cfg258693461 test failed
and all the nginx-ingress-controller pods eventually bail out, leaving the cluster offline.
What you expected to happen:
Ingress controller to validate invalid config snippet, and either fall back to an earlier config, or exclude the invalid directive.
How to reproduce it (as minimally and precisely as possible):
Create ingress with metadata.annotations["nginx.ingress.kubernetes.io/configuration-snippet"]' having "|" in it's value
Anything else we need to know:
As per best k8s best practices, nginx ingress controller acts as a global LB for the cluster, possibility of one invalid ingress crashing the entire LB seems concerning. Any pointers of how to prevent this sitatuion from happening, are appreciated.
Any pointers of how to prevent this sitatuion from happening, are appreciated.
Please check https://kubernetes.github.io/ingress-nginx/deploy/validating-webhook/
Right now this is optional but it could be mandatory.
@aledbf great, thank you, will check!
We'll look into enabling it as an option for our users, but making the check mandatory by default on ingress-nginx side would be great.
Just gonna add my 2c here, we ran into this last night because somebody used a very long value for their path parameter. This took out all of our ingress controllers.
We're going to add the validating webhook, but does the ingress controller really do no validation before it merges a new ingress config into the nginx config? This was a little surprising to us.
Also, is there a list of what it actually does during validation? Perhaps just on the nginx side?
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Closing. The validation webhook is enabled by default now #5399
Most helpful comment
We'll look into enabling it as an option for our users, but making the check mandatory by default on ingress-nginx side would be great.