Describe the bug
New loki v1.5.0 is part of helm/loki-stack 0.37.0. The PodSecurityPolicy in the chart is configured to drop ALL capabilities, which was compatible with previous loki versions such as v1.4.1. In my cluster, 0.36.2 works great.
Related image version bump PR: https://github.com/grafana/loki/pull/2100
Pod loki-0 status is: CrashLoopBackOff, displayed log is:
standard_init_linux.go:211: exec user process caused "operation not permitted"
stream closed
To Reproduce
Steps to reproduce the behavior:
1. Install helm/loki-stack 0.37.0 onto a Kubernetes cluster where the PodSecurityPolicy admission controller is enabled
2. Check the loki-0 pod's log
Expected behavior
Track down why grafana/loki:1.5.0 needs an extra capability when grafana/loki:1.4.1 did not. If it is really required, then modify the PSP in the Helm chart.
Environment:
Kubernetes: v1.18.0
Helm: v3.2.1
Screenshots, Promtail config, or terminal output
standard_init_linux.go:211: exec user process caused "operation not permitted"
stream closed
Good find! I think you're right, would love a PR for this.
This is required to bind ports below 1024, since we now use a non-root user.
We have this now in the Dockerfile: RUN setcap cap_net_bind_service=+ep /usr/bin/loki
Thank you! I would support another approach: use a different port which does not need any capability. I am writing from my mobile, and I was not able to find which Dockerfile is used to build the loki image. What do you think?
We did this to avoid trouble for people using ports below 1024, including us. Having a separate image just for us (Grafana Labs) is annoying.
/cc @slim-bean @owen-d WDYT ? we broke PSP users !
I am a little bit confused, I see EXPOSE 3100:
https://github.com/grafana/loki/blob/12c7eab8bb94fd82b184c1c222200e37f2ca050a/cmd/loki/Dockerfile#L27
I think the EXPOSE in the dockerfile is a remnant that we should remove as it's misleading and doesn't actually change docker's functionality (just acts as documentation).
Can we just have this parameterized in the helm config such that we only request the capability if running against a privileged port?
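As a rough sketch of what that parameterization could look like in the chart's PodSecurityPolicy template: the value path `.Values.config.server.http_listen_port` is an assumption about where the listen port lives in the chart values, and the fragment is illustrative rather than the chart's actual template.

```yaml
# templates/podsecuritypolicy.yaml (fragment, hypothetical)
spec:
  privileged: false
  allowPrivilegeEscalation: false
  # Only grant NET_BIND_SERVICE when Loki is configured to listen on a
  # privileged port (below 1024). The value path below is an assumption.
  {{- if lt (int .Values.config.server.http_listen_port) 1024 }}
  defaultAddCapabilities:
    - NET_BIND_SERVICE
  {{- end }}
```

With the default port 3100 the block renders nothing, so the policy stays as restrictive as before.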
I think we should modify the pod security policy in the helm chart to allow the capability we added.
The EXPOSE 3100 could probably be removed, but the default Loki config file does run on port 3100, so it does make it easier to run the image in its default state, especially with docker compose and grafana.
Hi, I stumbled across this issue when I updated to loki 1.5.0 with helm.
@cyriltovena Is there any reason why you can't use a port above 1024? In docker, you can use port forwards and in Kubernetes you can use services to publish the container port to a public port below 1024.
I don't think it's a good idea to add the cap_net_bind_service capability in general. I use the default configuration with port 3100 (and I think others using helm do too), which does not actually require cap_net_bind_service. So users (like me) who do not want to allow cap_net_bind_service in the Kubernetes PSP, in order to apply least privilege, cannot use the recent loki releases anymore.
Yep, I hear you. We do this because it's easier for us: when we port-forward, we know it's always 80.
But the consequences are just too annoying for sure. I need some time to update everything and remove the capabilities; I will keep you posted.
Ed, let's use 3100 everywhere and publish with a service on 80.
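For illustration, publishing the unprivileged container port on port 80 through a Service could look roughly like the following. The Service name and the `app: loki` selector label are assumptions, not taken from the chart.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: loki              # hypothetical name
spec:
  selector:
    app: loki             # hypothetical pod label
  ports:
    - name: http
      protocol: TCP
      port: 80            # port clients connect to on the Service
      targetPort: 3100    # unprivileged port Loki listens on in the container
```

Because the container itself only binds 3100, no NET_BIND_SERVICE capability is needed and the PSP can keep dropping ALL capabilities.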
Also got hit by this. It seems the following patch can be used as a workaround:
diff --git loki-stack/charts/loki/templates/podsecuritypolicy.yaml loki-stack-fixed/charts/loki/templates/podsecuritypolicy.yaml
index 6a6444e..71062d3 100755
--- loki-stack/charts/loki/templates/podsecuritypolicy.yaml
+++ loki-stack-fixed/charts/loki/templates/podsecuritypolicy.yaml
@@ -12,6 +12,8 @@ metadata:
 spec:
   privileged: false
   allowPrivilegeEscalation: false
+  defaultAddCapabilities:
+    - NET_BIND_SERVICE
   volumes:
     - 'configMap'
     - 'emptyDir'
I'm still planning to remove this capability and bind only to 3100, even internally.
Sorry.