I'm deploying Conduit to OpenShift 3.7.
The steps for my deployment can be found here:
https://github.com/raffaelespazzoli/openshift-enablement-exam/tree/master/misc/conduit
I get an error when the init container initializes the pod's iptables rules. Here is the log:
[rspazzol@rspazzol conduit]$ oc logs emoji-svc-3777961096-ws7nv -c conduit-init
2018/01/15 15:07:50 Tracing this script execution as [1516028870]
2018/01/15 15:07:50 State of iptables rules before run:
2018/01/15 15:07:50 > iptables -t nat -vnL
2018/01/15 15:07:50 < Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain CONDUIT_REDIRECT (0 references)
pkts bytes target prot opt in out source destination
0 0 RETURN tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 /* conduit/ignore-port-80/1516028824 */
0 0 RETURN tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:4190 /* conduit/ignore-port-4190/1516028824 */
2018/01/15 15:07:50 > iptables -t nat -F CONDUIT_REDIRECT
2018/01/15 15:07:50 <
2018/01/15 15:07:50 > iptables -t nat -X CONDUIT_REDIRECT
2018/01/15 15:07:50 <
2018/01/15 15:07:50 Will ignore port 80 on chain CONDUIT_REDIRECT
2018/01/15 15:07:50 Will ignore port 4190 on chain CONDUIT_REDIRECT
2018/01/15 15:07:50 Will redirect all INPUT ports to proxy
2018/01/15 15:07:50 > iptables -t nat -F CONDUIT_OUTPUT
2018/01/15 15:07:50 < iptables: No chain/target/match by that name.
2018/01/15 15:07:50 > iptables -t nat -X CONDUIT_OUTPUT
2018/01/15 15:07:50 < iptables: No chain/target/match by that name.
2018/01/15 15:07:50 Ignoring uid 2102
2018/01/15 15:07:50 Redirecting all OUTPUT to 4140
2018/01/15 15:07:50 Executing commands:
2018/01/15 15:07:50 > iptables -t nat -N CONDUIT_REDIRECT -m comment --comment conduit/redirect-common-chain/1516028870
2018/01/15 15:07:50 <
2018/01/15 15:07:50 > iptables -t nat -A CONDUIT_REDIRECT -p tcp --destination-port 80 -j RETURN -m comment --comment conduit/ignore-port-80/1516028870
2018/01/15 15:07:50 <
2018/01/15 15:07:50 > iptables -t nat -A CONDUIT_REDIRECT -p tcp --destination-port 4190 -j RETURN -m comment --comment conduit/ignore-port-4190/1516028870
2018/01/15 15:07:50 <
2018/01/15 15:07:50 > iptables -t nat -A CONDUIT_REDIRECT -p tcp -j REDIRECT --to-port 4143 -m comment --comment conduit/redirect-all-incoming-to-proxy-port/1516028870
2018/01/15 15:07:50 < iptables: No chain/target/match by that name.
2018/01/15 15:07:50 Aborting firewall configuration
2018/01/15 15:07:50 exit status 1
Any suggestions? It doesn't seem OCP-dependent to me.
@raffaelespazzoli do you have SELinux enabled by any chance?
Yes, SELinux is enabled. What is the right setting to make it work?
SELinux might be preventing Conduit from changing iptables in the pod. Per https://stackoverflow.com/questions/39059149/iptables-error-prevents-pod-starting-in-kubernetes/39063558#39063558, you can "search for messages containing the string 'AVC' inside /var/log/audit/audit.log in order to confirm that theory"
If that's the case, the quick workaround is to disable SELinux. If you don't want to do that, you can investigate some of the options in the StackOverflow link to change the SELinux policy to allow the iptables changes that Conduit needs to make in the pod.
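For the quick workaround, something like the following sketch can be run on the node where the pod was scheduled (not inside the pod). It is guarded so it's a no-op on hosts without SELinux tooling; switching to permissive mode needs root and is non-persistent:

```shell
# Sketch of the quick workaround: check the node's SELinux mode and, if it is
# Enforcing, switch to permissive mode (non-persistent; requires root).
if command -v getenforce >/dev/null 2>&1; then
    selinux_mode=$(getenforce)
else
    selinux_mode="unavailable"
fi
echo "SELinux mode: $selinux_mode"

if [ "$selinux_mode" = "Enforcing" ]; then
    setenforce 0 || echo "setenforce failed (are you root?)"
fi
# To persist across reboots, set SELINUX=permissive in /etc/selinux/config.
```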
This is the error I'm seeing in the logs:
type=AVC msg=audit(1516045749.591:563630): avc: denied { module_request } for pid=122393 comm="iptables" kmod="ipt_REDIRECT" scontext=system_u:system_r:svirt_lxc_net_t:s0:c10,c11 tcontext=system_u:system_r:kernel_t:s0 tclass=system
Via the security context it is possible to modify the SELinux context under which the containers will run. Shouldn't conduit inject take care of this as well?
If anyone here is a SELinux expert: what SELinux context would make the above error go away?
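Not an answer on the exact context, but one common approach (an assumption on my part, not something the Conduit docs prescribe) is to let audit2allow generate a local policy module from the denial itself. The extraction step below is shown against the exact AVC line quoted above; audit2allow and semodule come from the policycoreutils tooling and must be run on the node as root:

```shell
# Sketch: derive an SELinux policy module from the AVC denial quoted above.
avc='type=AVC msg=audit(1516045749.591:563630): avc: denied { module_request } for pid=122393 comm="iptables" kmod="ipt_REDIRECT" scontext=system_u:system_r:svirt_lxc_net_t:s0:c10,c11 tcontext=system_u:system_r:kernel_t:s0 tclass=system'

# Which kernel module was the container denied from loading?
kmod=$(printf '%s\n' "$avc" | sed -n 's/.*kmod="\([^"]*\)".*/\1/p')
echo "denied module: $kmod"

# On the node, as root (commented out here; needs the real audit log):
#   grep ipt_REDIRECT /var/log/audit/audit.log | audit2allow -M conduit_iptables
#   semodule -i conduit_iptables.pp
```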
Do we need to do the same thing here as https://github.com/linkerd/linkerd-inject/issues/6 ?
@klingerf to evaluate whether linkerd/linkerd-inject#7 will fix this too.
RE: The solution in https://github.com/linkerd/linkerd-inject/pull/7, I think that's a possible short-term workaround but it isn't a long-term solution. See https://github.com/istio/issues/issues/34#issuecomment-331047150 and https://github.com/istio/issues/issues/172 and https://github.com/istio/issues/issues/172#issuecomment-361348278.
https://docs.openshift.com/enterprise/3.0/admin_guide/manage_scc.html shows that OpenShift intends to limit which service accounts can run privileged containers to a small set of service accounts. The conduit init container runs under the service account assigned to the pod. We want people to be able to use Conduit in all pods, which implies using it in all service accounts. That would effectively require disabling the SCC functionality in OpenShift.
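For context, the grant OpenShift requires for a privileged init container looks roughly like the sketch below. The service account ("default") and namespace ("emojivoto") are assumptions; substitute the values your deployment uses, and run it as a cluster admin:

```shell
# Sketch: grant the privileged SCC to the pod's service account.
# "default" and "emojivoto" are placeholder assumptions.
SA=default
NS=emojivoto
if command -v oc >/dev/null 2>&1; then
    oc adm policy add-scc-to-user privileged -z "$SA" -n "$NS"
else
    echo "oc not on PATH; run this from a machine with cluster-admin access"
fi
```

This is exactly the grant the SCC documentation intends to keep rare, which is why handing it to every injected pod's service account effectively defeats the feature.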
This is similarly at odds with the Kubernetes PodSecurityPolicy feature.
Therefore, we shouldn't make privileged=true the default since, IIUC, privileged=true is only required for OpenShift. Also, assuming we have some short-term solution, we should have a new issue open for tracking the long-term solution.
I think we should defer this work until we have OpenShift actually working. This only affects OpenShift. Based on https://blog.openshift.com/running-istio-service-mesh-openshift/, OpenShift support likely requires additional work, at least w.r.t. documenting how to configure things and/or automating other OpenShift-specific configuration settings. Maybe we can find a better long-term solution in the interim, perhaps in coordination with the OpenShift team.
Longer-term solutions might involve https://docs.google.com/document/d/1QQ5u1RBDLXWvC8K3pscTtTRThsOeBSts_imYEoRyw8A/edit#heading=h.t0gz595nqls3
I am ok with deferring this until we have more fully-featured OpenShift support. @raffaelespazzoli what do you think? You work on OpenShift, right?
I have been thinking about this issue and I believe that we should start considering a different approach (quoting from my proposal to Istio; you can replace Istio with Conduit):
"We could define a CRD that describes how connections should be routed inside a pod. These rules would be expressed in a technology-agnostic manner, so that they could be implemented not only with iptables but with other means. This CRD would also have a pod selector to specify which pods should get those rules. The Istio control plane would create these CRD objects. The CNI driver would honor them when the pod is created, but before any container is started. I believe this way no specific privileges would be required by the pod. Also, this approach is probably general enough that other service mesh implementations could leverage it."
Keep in mind that this is my opinion and does not reflect Red Hat's thinking.
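To make the proposal concrete, one rule object under such a CRD might look like the sketch below, reusing the ports from the log earlier in this thread. Every name and field here is hypothetical; no such resource exists in Conduit, Istio, or Kubernetes today:

```shell
# Purely illustrative: a hypothetical routing-rule object a CNI plugin could
# honor at pod creation time. All names and fields are invented for this sketch.
cat > podroutingrule.yaml <<'EOF'
apiVersion: mesh.example.com/v1alpha1
kind: PodRoutingRule
metadata:
  name: conduit-redirect
spec:
  podSelector:
    matchLabels:
      conduit.io/injected: "true"
  inbound:
    redirectToPort: 4143      # proxy inbound port from the log above
    ignorePorts: [80, 4190]   # ports the proxy should not intercept
  outbound:
    redirectToPort: 4140      # proxy outbound port
# A CNI plugin honoring this object would set up redirection before any
# container starts, so the pod itself would need no NET_ADMIN capability.
EOF
echo "wrote podroutingrule.yaml"
```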
I have been thinking about this issue and I believe that we should start considering a different approach
@raffaelespazzoli I agree in the long term (see https://github.com/kubernetes/kubernetes/issues/55435#issuecomment-365469675 for example). However, what should we do in the short- to mid- term? Realistically I would expect it might take a year or more for Kubernetes to spec out an interface like you suggest and for CNI implementations to implement it.
I agree that it will take a long time. And I don't have a good answer to your question. But I work in consulting, and I am skeptical that the customers I deal with (especially the large ones) will allow pods with CAP_NET_ADMIN to run in a production environment.
I suggest we leave this issue open for now, and if we encounter more people who are evaluating Conduit on SELinux clusters, we can do the short-term solution described in linkerd/linkerd-inject#7.
See the documentation here: https://docs.openshift.com/container-platform/3.7/admin_guide/manage_scc.html#provide-additional-capabilities. If I am reading it correctly, we might be able to do something relatively simple to resolve this that doesn't require running the container as privileged.
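If that reading is right, the "relatively simple" option might be a custom SCC that grants only the NET_ADMIN capability (what iptables needs) rather than full privileged mode. The sketch below is untested; the SCC name and the strategy fields beyond allowedCapabilities are assumptions about what conduit-init would require:

```shell
# Sketch: a minimal custom SCC adding only NET_ADMIN. Untested assumption,
# not a verified configuration for conduit-init.
cat > conduit-scc.yaml <<'EOF'
apiVersion: v1
kind: SecurityContextConstraints
metadata:
  name: conduit-net-admin
allowedCapabilities:
- NET_ADMIN
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
EOF
# As cluster admin:
#   oc create -f conduit-scc.yaml
#   oc adm policy add-scc-to-user conduit-net-admin -z <service-account> -n <namespace>
echo "wrote conduit-scc.yaml"
```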
See https://github.com/kubernetes/kubernetes/issues/55435#issuecomment-366703250 which suggests this isn't totally unreasonable to do.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
So I am encountering this on a non-OCP cluster, just a vanilla Kubernetes 1.10 cluster. What is the workaround that has been settled on? I see there was a PR that addresses the issue, maybe?
@kurktchiev what is the error you're seeing?