/kind bug
Description
Containers cannot access published ports of other containers on the same host. The port can be accessed from the host itself (via localhost and the external IP) and from other hosts.
Steps to reproduce the issue:
Start a server and publish the port on the machine: sudo podman run -d --rm -p 8080:80 nginx:alpine
Start another container and try to access the server via the server name / server IP:
$ sudo podman run -it --rm alpine
/ # wget my-server.my-domain:8080
Connecting to my-server.my-domain:8080 (10.0.1.4:8080)
The request hangs. Other ports on the host system can be accessed.
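For convenience, here is the whole reproduction as a sketch in one place (the host IP 10.0.1.4 is the one from this report, the container name "web" is illustrative, and busybox wget's -T option turns the hang into a visible timeout):
# terminal 1: publish nginx on host port 8080
$ sudo podman run -d --rm --name web -p 8080:80 nginx:alpine
# terminal 2: from a second container, request the published port via the host IP
$ sudo podman run --rm alpine wget -T 5 -O- http://10.0.1.4:8080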
Describe the results you received:
The second container cannot access the service on port 8080, but it can access all other ports on the host system.
Netstat shows that a SYN packet is sent.
/ # netstat -n
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 1 10.88.0.11:48968 10.0.1.4:8080 SYN_SENT
Other ports can be accessed:
/ # nc 10.0.1.4 22
SSH-2.0-OpenSSH_7.4
Describe the results you expected:
The second pod / container should be able to access the published port on the host.
Additional information you deem important (e.g. issue happens only occasionally):
This can be reproduced every time.
I also tried to use the gateway IP to access the published port - this does not work either. Other ports can also be accessed this way:
/ # route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
default 10.88.0.1 0.0.0.0 UG 0 0 0 eth0
10.88.0.0 * 255.255.0.0 U 0 0 0 eth0
/ # wget 10.88.0.1:8080
Connecting to 10.88.0.1:8080 (10.88.0.1:8080)
/ # nc 10.88.0.1 22
SSH-2.0-OpenSSH_7.4
Output of podman version:
Version: 1.2.0
RemoteAPI Version: 1
Go Version: go1.10.2
OS/Arch: linux/amd64
Output of podman info --debug:
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.2
  podman version: 1.2.0
host:
  BuildahVersion: 1.7.2
  Conmon:
    package: podman-1.2-2.git3bd528e.el7.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.14.0-dev, commit: 345710c5d359e8d5b126906e24615d6a3e28c131-dirty'
  Distribution:
    distribution: '"centos"'
    version: "7"
  MemFree: 6445105152
  MemTotal: 8353083392
  OCIRuntime:
    package: runc-1.0.0-60.dev.git2abd837.el7.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.0'
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 2
  hostname: podman-test
  kernel: 3.10.0-862.11.6.el7.x86_64
  os: linux
  rootless: false
  uptime: 1h 25m 15.91s (Approximately 0.04 days)
insecure registries:
  registries: []
registries:
  registries:
  - registry.access.redhat.com
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.centos.org
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: overlay
  GraphOptions: null
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 2
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes
Additional environment details (AWS, VirtualBox, physical, etc.):
Tried on CentOS 7.6 on Azure and RHEL 7.6 on-premises (VMware).
Could it be the firewall? I've tried on Fedora 29 with the latest podman release, and a container can connect to the host.
Nope - it's not the firewall. The container can connect to all ports on the host except the one published by the other container. When I run a server on the same port directly on the host then the container can connect to it.
Probably still the firewall, though indirectly. CNI uses IPTables to manage port forwarding, and I think it's entirely possible that we neglect to create rules for container IPs to access the forwarded port.
Here is a tcpdump of this issue - here are the IPs involved:
- 10.0.1.4 is the host address
- 10.88.0.2 is the nginx container
- 10.88.0.7 is the alpine container running the wget
$ sudo tcpdump -i cni0 -nn
listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
07:10:20.352613 IP 10.88.0.7.53136 > 10.0.1.4.8080: Flags [S], seq 496137395, win 29200, options [mss 1460,sackOK,TS val 2442627 ecr 0,nop,wscale 7], length 0
07:10:20.352675 IP 10.88.0.7.53136 > 10.88.0.2.80: Flags [S], seq 496137395, win 29200, options [mss 1460,sackOK,TS val 2442627 ecr 0,nop,wscale 7], length 0
07:10:20.352716 IP 10.88.0.2.80 > 10.88.0.7.53136: Flags [S.], seq 1883965819, ack 496137396, win 28960, options [mss 1460,sackOK,TS val 2442627 ecr 2442627,nop,wscale 7], length 0
07:10:20.352739 IP 10.88.0.7.53136 > 10.88.0.2.80: Flags [R], seq 496137396, win 0, length 0
The two connect attempts seem strange: 10.88.0.7.53136 > 10.0.1.4.8080 and 10.88.0.7.53136 > 10.88.0.2.80.
And here's the iptables nat table:
$ sudo iptables -nvL -t nat
Chain PREROUTING (policy ACCEPT 121 packets, 7180 bytes)
pkts bytes target prot opt in out source destination
140 8376 CNI-HOSTPORT-DNAT all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain INPUT (policy ACCEPT 121 packets, 7180 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 3112 packets, 187K bytes)
pkts bytes target prot opt in out source destination
12 720 CNI-HOSTPORT-DNAT all -- * * 0.0.0.0/0 0.0.0.0/0 ADDRTYPE match dst-type LOCAL
Chain POSTROUTING (policy ACCEPT 3114 packets, 187K bytes)
pkts bytes target prot opt in out source destination
4733 285K CNI-HOSTPORT-MASQ all -- * * 0.0.0.0/0 0.0.0.0/0 /* CNI portfwd requiring masquerade */
16 982 CNI-ab9cc4b6e05a65d35ab8b9e5 all -- * * 10.88.0.0/16 0.0.0.0/0 /* name: "podman" id: "cb991a9e3818251276266d9de37ffbde0a28ffd050f12bb932d99f3b9a0e5d0d" */
0 0 CNI-fb900a39655af1e6d4ff7db6 all -- * * 10.88.0.0/16 0.0.0.0/0 /* name: "podman" id: "1288995927d80d5f36309fa6a78228523745b93651ea273dfba36e1ce6438ce8" */
Chain CNI-DN-ab9cc4b6e05a65d35ab8b (1 references)
pkts bytes target prot opt in out source destination
0 0 CNI-HOSTPORT-SETMARK tcp -- * * 10.88.0.2 0.0.0.0/0 tcp dpt:8080
5 300 CNI-HOSTPORT-SETMARK tcp -- * * 127.0.0.1 0.0.0.0/0 tcp dpt:8080
20 1200 DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:8080 to:10.88.0.2:80
Chain CNI-HOSTPORT-DNAT (2 references)
pkts bytes target prot opt in out source destination
20 1200 CNI-DN-ab9cc4b6e05a65d35ab8b tcp -- * * 0.0.0.0/0 0.0.0.0/0 /* dnat name: "podman" id: "cb991a9e3818251276266d9de37ffbde0a28ffd050f12bb932d99f3b9a0e5d0d" */ multiport dports 8080
Chain CNI-HOSTPORT-MASQ (1 references)
pkts bytes target prot opt in out source destination
5 300 MASQUERADE all -- * * 0.0.0.0/0 0.0.0.0/0 mark match 0x2000/0x2000
Chain CNI-HOSTPORT-SETMARK (2 references)
pkts bytes target prot opt in out source destination
5 300 MARK all -- * * 0.0.0.0/0 0.0.0.0/0 /* CNI portfwd masquerade mark */ MARK or 0x2000
Chain CNI-ab9cc4b6e05a65d35ab8b9e5 (1 references)
pkts bytes target prot opt in out source destination
11 660 ACCEPT all -- * * 0.0.0.0/0 10.88.0.0/16 /* name: "podman" id: "cb991a9e3818251276266d9de37ffbde0a28ffd050f12bb932d99f3b9a0e5d0d" */
5 322 MASQUERADE all -- * * 0.0.0.0/0 !224.0.0.0/4 /* name: "podman" id: "cb991a9e3818251276266d9de37ffbde0a28ffd050f12bb932d99f3b9a0e5d0d" */
Chain CNI-fb900a39655af1e6d4ff7db6 (1 references)
pkts bytes target prot opt in out source destination
0 0 ACCEPT all -- * * 0.0.0.0/0 10.88.0.0/16 /* name: "podman" id: "1288995927d80d5f36309fa6a78228523745b93651ea273dfba36e1ce6438ce8" */
0 0 MASQUERADE all -- * * 0.0.0.0/0 !224.0.0.0/4 /* name: "podman" id: "1288995927d80d5f36309fa6a78228523745b93651ea273dfba36e1ce6438ce8" */
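If the missing-rule theory above is right, one speculative manual test (untested, and not a proper fix; the chain name, port, and subnet are taken from the dump above) would be to mark hairpin traffic from the container subnet so that the existing CNI-HOSTPORT-MASQ rule masquerades it and replies get reverse-NATted on the way back:
# insert a SETMARK rule for the whole container subnet ahead of the DNAT rule
$ sudo iptables -t nat -I CNI-DN-ab9cc4b6e05a65d35ab8b 1 -s 10.88.0.0/16 -p tcp --dport 8080 -j CNI-HOSTPORT-SETMARK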
Does it work if you connect to the other container via its 10.x.x.x IP? If so, I think we probably have an issue with the IPTables rules added.
It can be accessed via 10.88.0.2:80.
The iptables rules are above, but I don't know enough about them to find the issue.
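A sketch of one way to narrow it down (the chain names are the ones from the dump above): zero the nat counters, repeat the wget from the alpine container, and check which rules actually matched:
$ sudo iptables -t nat -Z
$ # ... repeat the wget from the alpine container ...
$ sudo iptables -t nat -nvL CNI-DN-ab9cc4b6e05a65d35ab8b
$ sudo iptables -t nat -nvL CNI-HOSTPORT-MASQ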
I think this PR could fix the issue we are seeing here: https://github.com/containers/libpod/pull/2940
Any plans on merging the PR and fixing this for RHEL 7.7? This issue stops us from using podman.
I think this is being worked on as part of BZ 1703261 on the Red Hat Bugzilla - it's reported against 7.6 there. We were planning on including a more recent Podman in the next 7.x anyway, so if the patch lands by then we should have it included.
(Though I'm sufficiently fuzzy on the RHEL 7 schedule to not know whether we're upgrading for 7.7 or 7.8.)
We have point releases, so it could be in 7.6.* or 7.7.
https://bugzilla.redhat.com/show_bug.cgi?id=1703261 for reference
Loading the kernel module br_netfilter apparently fixes this. Just tried that and it works.
Should the podman install load/require this module?
I don't know. I guess we could drop a file into /etc/modules-load.d/
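A minimal sketch of that approach for anyone applying it by hand (the file name is arbitrary; see modules-load.d(5)):
# load the module immediately
$ sudo modprobe br_netfilter
# persist it across reboots via systemd's modules-load.d
$ echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
# confirm that bridged traffic now traverses iptables
$ sysctl net.bridge.bridge-nf-call-iptables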
We should also update the tutorial and troubleshooting guide so people on older Podman versions know what they need to do.
I am running podman on CentOS 7, and this is still an issue. After every reboot I manually load the br_netfilter module and then restart my containers. The containers seem to drift back into a bad state. Just want to know if there is an official fix.
Is there a fix for this, or is it just manually installing the br_netfilter module?
Is this fixed in RHEL 8?
"Drift back into a bad state" - do you mean that, after loading the module, things break again after a time?
If the issue is just loading the module, can you use /etc/modules-load.d/ to automatically load on system start?
Yeah, I have one container that stops connecting/communicating with other containers' ports after a few days/weeks. I haven't spent a ton of time tracking it down.
I am aware I can put something in /etc/modules-load.d; I'm just asking if this is the "official" fix for RHEL/CentOS 7.
I think that loading the module on startup is the official fix, though I believe most installs load it by default - we don't get many reports of it not working out of the box.
Communication no longer working sounds like it's definitely a bug of some sort. If you can get more details, we'd be interested in hearing about it.
Firewall rules? This would probably be a CNI issue, and not an issue directly in podman.
I am experiencing the same on CentOS 8.
The original issue here, or the newer one where containers stop communicating after several days/weeks?
My containers actually never start to communicate at all. I get that there is no name resolution like with Docker, and I am OK with that. I tried to publish ports to the host, but from another container on the same host I can't get to the port. I can connect to the published port from another server without any problem.
Are you trying to communicate with the other containers via their assigned IPs, or via the IP of the host? We expect the former to work, but not the latter.
I confirm that: I can connect to the IP:PORT using the internal CNI network, but since there is no name resolution and these IPs change, I can't use it. I forgot to mention this works, which is quite a surprise, since the documentation says this should only work for containers that are part of a pod.
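For reference, a container's current bridge IP can be looked up at run time (a sketch; "web" is a hypothetical container name, and the template field matches the Docker-compatible inspect output of Podman 1.x):
$ sudo podman inspect -f '{{.NetworkSettings.IPAddress}}' web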