I am running into a very strange issue and have not been able to find its cause. I am using a Calico NetworkPolicy so that the DB accepts connections from one specific namespace only:
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: network-policy-171-946
  namespace: ns-restriction-demo-2
spec:
  selector: app == 'db-demo-2'
  ingress:
  - action: Allow
    protocol: TCP
    source:
      selector: app == 'node-demo-1'
      namespaceSelector: name == 'ns-restriction-demo-1'
  - action: Allow
    protocol: TCP
    source:
      namespaceSelector: name == 'ns-restriction-demo-2'
When I apply the network policy, it should work regardless of which Kubernetes worker node the pods are running on.
When I apply the network policy, it works only if the DB and the application connecting to it are both on the same Kubernetes worker node.
Client Version: v3.5.8
Git commit: 107e128
Cluster Version: v3.9.1
Cluster Type: k8s,bgp,kdd,typha
Kubernetes: 1.13.6
Istio: 1.1.10
Please help me understand or debug the issue.
Thanks
I seem to have the same problem with a different network policy: #2896
My pods can communicate only if they are on the same node.
@Woap Yeah something is wrong.
Are you running in the cloud?
Are you using IPIP? Is IPIP traffic allowed between your nodes?
Are the calico-node pods Running (not erroring/crashing)?
Hi, I've also run into the same issue (although with canal).
Edit: Updated following more investigation. I believe my issue was down to the flannel version change rather than Calico.
In my env I had updated Calico from v3.3.0 to v3.6.1 and flannel from v0.9.0 to v0.11.0.
Prior to the upgrade, the POSTROUTING chain was as follows:
Chain POSTROUTING (policy ACCEPT)
target prot opt in out source destination
KUBE-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
RETURN all -- * * 10.244.0.0/16 10.244.0.0/16
MASQUERADE all -- * * 10.244.0.0/16 !224.0.0.0/4
RETURN all -- * * !10.244.0.0/16 10.244.1.0/24
MASQUERADE all -- * * !10.244.0.0/16 10.244.0.0/16
cali-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:O3lYWMrLQYEMJtB5 */
Following the update of the DaemonSet and all the pods recycling, the POSTROUTING chain on all nodes had gotten into the following state:
Chain POSTROUTING (policy ACCEPT)
target prot opt in out source destination
KUBE-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
MASQUERADE all -- * * 10.244.0.0/16 !224.0.0.0/4
MASQUERADE all -- * * !10.244.0.0/16 10.244.0.0/16
RETURN all -- * * 10.244.0.0/16 10.244.0.0/16
MASQUERADE all -- * * 10.244.0.0/16 !224.0.0.0/4 random-fully
RETURN all -- * * !10.244.0.0/16 10.244.1.0/24
MASQUERADE all -- * * !10.244.0.0/16 10.244.0.0/16 random-fully
cali-POSTROUTING all -- * * 0.0.0.0/0 0.0.0.0/0 /* cali:O3lYWMrLQYEMJtB5 */
Snippet of the kube-flannel logs:
iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j MASQUERADE --random-fully
main.go:317] Wrote subnet file to /run/flannel/subnet.env
main.go:321] Running backend.
main.go:339] Waiting for all goroutines to exit
vxlan_network.go:60] watching for new subnet leases
iptables.go:145] Some iptables rules are missing; deleting and recreating rules
iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN
iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN
iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
When flannel starts up, it attempts to detect and remove four rules (RETURN, MASQUERADE, RETURN, MASQUERADE) before re-adding them, but in this scenario it detected and removed only the two RETURN rules. The two old MASQUERADE rules were therefore left in place (the new version's rules differ slightly in that they specify --random-fully). Because the leftover rules sit ahead of the re-added ones, all traffic ends up hitting them, which causes this issue.
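To make the rule-ordering effect concrete, here is a minimal Python sketch of first-match evaluation over the two POSTROUTING listings above. It is a simplified model, not how netfilter is implemented: the KUBE-POSTROUTING and cali-POSTROUTING jumps are left out on the assumption that they do not terminate processing for this traffic, and the pod addresses 10.244.1.5 and 10.244.2.7 are hypothetical.

```python
import ipaddress


def matches(addr, spec):
    """Return True if addr matches spec, e.g. '10.244.0.0/16' or '!224.0.0.0/4'."""
    negate = spec.startswith("!")
    network = ipaddress.ip_network(spec.lstrip("!"))
    hit = ipaddress.ip_address(addr) in network
    return hit != negate


def first_match(chain, src, dst):
    """Return the target of the first rule matching (src, dst), or the chain policy."""
    for target, src_spec, dst_spec in chain:
        if matches(src, src_spec) and matches(dst, dst_spec):
            return target
    return "ACCEPT"  # chain policy shown in the listings


# (target, source, destination), copied from the pre-upgrade listing.
BEFORE = [
    ("RETURN", "10.244.0.0/16", "10.244.0.0/16"),
    ("MASQUERADE", "10.244.0.0/16", "!224.0.0.0/4"),
    ("RETURN", "!10.244.0.0/16", "10.244.1.0/24"),
    ("MASQUERADE", "!10.244.0.0/16", "10.244.0.0/16"),
]

# Post-upgrade listing: the two leftover MASQUERADE rules come first.
AFTER = [
    ("MASQUERADE", "10.244.0.0/16", "!224.0.0.0/4"),
    ("MASQUERADE", "!10.244.0.0/16", "10.244.0.0/16"),
] + BEFORE

# Hypothetical pod IPs inside the 10.244.0.0/16 pod network.
SRC_POD, DST_POD = "10.244.1.5", "10.244.2.7"

print(first_match(BEFORE, SRC_POD, DST_POD))  # RETURN
print(first_match(AFTER, SRC_POD, DST_POD))   # MASQUERADE
```

In this model, pod-to-pod traffic returns before any NAT rule in the pre-upgrade chain, but reaches a leftover MASQUERADE rule first in the post-upgrade chain.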
To solve it without cycling the nodes, I flushed the POSTROUTING chain (iptables -t nat -F POSTROUTING), and it was reconfigured correctly shortly afterwards. Alternatively, you could delete just those two MASQUERADE rules individually.
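For the alternative of removing only the two leftover rules, the following Python sketch shows one way to pick them out. It assumes the rules are available in the format printed by `iptables -t nat -S POSTROUTING`, and that a leftover rule is a MASQUERADE rule mentioning the pod CIDR without --random-fully. The helper name `stale_masquerade_deletes` and the `SAMPLE` lines are hypothetical: the sample was written by hand from the listing in this thread, not captured from a node. The script only builds the delete commands so they can be reviewed before anything is run.

```python
import shlex


def stale_masquerade_deletes(rules, pod_cidr="10.244.0.0/16", chain="POSTROUTING"):
    """Build 'iptables -t nat -D' commands for MASQUERADE rules lacking --random-fully."""
    commands = []
    for line in rules:
        tokens = shlex.split(line)
        if tokens[:2] != ["-A", chain]:
            continue  # skip the policy line and rules from other chains
        if "MASQUERADE" not in tokens or "--random-fully" in tokens:
            continue  # keep non-NAT rules and the rules added by the new flannel
        if pod_cidr not in tokens:
            continue  # only touch rules that reference the pod network
        commands.append(shlex.join(["iptables", "-t", "nat", "-D", chain] + tokens[2:]))
    return commands


# Hand-written sample in `iptables -S` style, based on the post-upgrade listing.
SAMPLE = [
    "-P POSTROUTING ACCEPT",
    '-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING',
    "-A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE",
    "-A POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE",
    "-A POSTROUTING -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN",
    "-A POSTROUTING -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully",
    "-A POSTROUTING ! -s 10.244.0.0/16 -d 10.244.1.0/24 -j RETURN",
    "-A POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully",
]

for command in stale_masquerade_deletes(SAMPLE):
    print(command)
```

The printed commands quote the `!` token, which a shell passes to iptables unchanged. Flushing the whole chain, as described above, is simpler when it is acceptable for the other rules to be dropped and recreated.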
This issue is related: #2169
@venomwaqar please provide additional information, and then we can re-open this issue.
Does the traffic you are attempting work when you do not have policies in place?