So it seems the problem is that ARP replies for my host (10.89.0.1) are
not making it into the container while I have a routing rule in place.
For example:
sudo podman network create -d bridge net1
sudo podman run -dt --name test --network net1 --cap-add NET_RAW --rm busybox
On the container:
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0@if542: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
link/ether d2:a6:90:7b:60:80 brd ff:ff:ff:ff:ff:ff
inet 10.89.0.4/24 brd 10.89.0.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::d0a6:90ff:fe7b:6080/64 scope link
valid_lft forever preferred_lft forever
Then in the container:
/ # ping 1.1.1.1
Then on the host:
sudo tcpdump -nn -i cni-podman1
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on cni-podman1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
02:20:02.428435 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:20:04.411105 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:20:05.415224 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:20:06.428555 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
^C
4 packets captured
4 packets received by filter
0 packets dropped by kernel
/ # arp -a
host.containers.internal (10.89.0.1) at <incomplete> on eth0
Seems that my container doesn't know who 10.89.0.1 is.
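To check this from a script rather than by eye, here is a minimal sketch
that reads `arp -a` output and reports the state of one entry. The
function name neigh_state is made up for illustration, and the field
positions assume the busybox `arp -a` format shown above.

```shell
#!/bin/sh
# neigh_state IP: read `arp -a` output on stdin and print the MAC address
# recorded for IP, "incomplete" if the entry has no MAC yet, or "missing"
# if there is no entry at all. (Hypothetical helper, not part of podman.)
neigh_state() {
    awk -v ip="($1)" '
        # Field 2 is the address in parentheses, field 4 the MAC or <incomplete>.
        $2 == ip {
            print ($4 == "<incomplete>" ? "incomplete" : $4)
            found = 1
            exit
        }
        END { if (!found) print "missing" }
    '
}
```

In the container this would be run as `arp -a | neigh_state 10.89.0.1`.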
If I remove the rule:
sudo ip rule del from 10.89.0.0/24 lookup CONTAINERS
the ARP reply comes through:
sudo tcpdump -nn -i cni-podman1
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on cni-podman1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
02:22:09.563747 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:22:09.563801 ARP, Reply 10.89.0.1 is-at ce:b3:e0:ab:b0:ff, length 28
02:22:09.563831 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 14, seq 0, length 64
02:22:09.812966 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 14, seq 0, length 64
02:22:10.563915 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 14, seq 1, length 64
02:22:10.807300 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 14, seq 1, length 64
02:22:14.935078 ARP, Request who-has 10.89.0.4 tell 10.89.0.1, length 28
02:22:14.935128 ARP, Reply 10.89.0.4 is-at d2:a6:90:7b:60:80, length 28
and the entry remains for as long as the ARP cache is valid:
/ # arp -a
host.containers.internal (10.89.0.1) at ce:b3:e0:ab:b0:ff [ether] on eth0
If I add the rule again on my host:
sudo ip rule add from 10.89.0.0/24 lookup CONTAINERS
things continue to work for a while, until the ARP cache entry expires.
For example, ping stops:
/ # ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
64 bytes from 1.1.1.1: seq=0 ttl=59 time=8.701 ms
64 bytes from 1.1.1.1: seq=1 ttl=59 time=8.810 ms
64 bytes from 1.1.1.1: seq=2 ttl=59 time=9.335 ms
64 bytes from 1.1.1.1: seq=3 ttl=59 time=9.660 ms
64 bytes from 1.1.1.1: seq=4 ttl=59 time=8.742 ms
64 bytes from 1.1.1.1: seq=5 ttl=59 time=8.242 ms
64 bytes from 1.1.1.1: seq=6 ttl=59 time=8.940 ms
64 bytes from 1.1.1.1: seq=7 ttl=59 time=8.987 ms
64 bytes from 1.1.1.1: seq=8 ttl=59 time=9.302 ms
^C
--- 1.1.1.1 ping statistics ---
27 packets transmitted, 9 packets received, 66% packet loss
round-trip min/avg/max = 8.242/8.968/9.660 ms
sudo tcpdump -nn -i cni-podman1
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on cni-podman1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
02:24:47.781760 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 0, length 64
02:24:47.790349 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 0, length 64
02:24:48.781992 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 1, length 64
02:24:48.790710 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 1, length 64
02:24:49.782086 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 2, length 64
02:24:49.791350 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 2, length 64
02:24:50.782279 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 3, length 64
02:24:50.791865 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 3, length 64
02:24:51.782334 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 4, length 64
02:24:51.791010 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 4, length 64
02:24:52.782396 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 5, length 64
02:24:52.790556 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 5, length 64
02:24:52.801752 ARP, Request who-has 10.89.0.4 tell 10.89.0.1, length 28
02:24:52.801792 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:24:52.801803 ARP, Reply 10.89.0.4 is-at d2:a6:90:7b:60:80, length 28
02:24:53.782472 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 6, length 64
02:24:53.791345 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 6, length 64
02:24:53.815092 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:24:54.782662 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 7, length 64
02:24:54.791579 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 7, length 64
02:24:54.828530 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:24:55.782856 IP 10.89.0.4 > 1.1.1.1: ICMP echo request, id 18, seq 8, length 64
02:24:55.792086 IP 1.1.1.1 > 10.89.0.4: ICMP echo reply, id 18, seq 8, length 64
02:24:56.783054 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:24:57.791743 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:24:58.801868 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:00.783485 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:01.788525 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:02.801872 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:04.784141 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:05.788525 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:06.801764 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:08.784601 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:09.788526 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:10.801864 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
02:25:12.785253 ARP, Request who-has 10.89.0.1 tell 10.89.0.4, length 28
^C
36 packets captured
36 packets received by filter
0 packets dropped by kernel
and then we get back to:
/ # arp -a
host.containers.internal (10.89.0.1) at <incomplete> on eth0
on the container.
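The captures above can be summarised by counting ARP requests and
replies for the gateway address. This is a rough sketch; arp_summary is
a made-up name, and it assumes the `tcpdump -nn` text format shown in
this mail.

```shell
#!/bin/sh
# arp_summary IP: read `tcpdump -nn` text on stdin and print how many ARP
# requests asked for IP and how many ARP replies announced it.
# (Hypothetical helper for illustration.)
arp_summary() {
    awk -v ip="$1" '
        # "... ARP, Request who-has <ip> tell <asker>, length 28"
        / ARP, Request who-has / && $5 == ip { req++ }
        # "... ARP, Reply <ip> is-at <mac>, length 28"
        / ARP, Reply /           && $4 == ip { rep++ }
        END { printf "requests=%d replies=%d\n", req, rep }
    '
}
```

On the host it could be fed from a saved capture, for example
`sudo tcpdump -nn -i cni-podman1 arp > arp.txt` followed by
`arp_summary 10.89.0.1 < arp.txt`. Requests with no replies match what
happens while the rule is present.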
So it seems I need another solution for routing out via bond0.7, one
that doesn't interfere with ARP requests between the containers and
cni-podman1.
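Before changing anything, it may help to look at how the host resolves
routes while the rule is present. The sketch below only prints
read-only commands unless RUN is set. The function name and its
arguments are made up for illustration. Checking rp_filter is a guess
at one possible factor, not a confirmed cause.

```shell
#!/bin/sh
# inspect_policy_routing TABLE GATEWAY PEER BRIDGE
# Prints (RUN unset) or runs (e.g. RUN=sudo) a few read-only checks.
# Hypothetical helper; nothing here modifies rules or routes.
inspect_policy_routing() {
    table="$1"; gw="$2"; peer="$3"; bridge="$4"
    run="${RUN:-echo}"

    # Which rules exist, and in what priority order?
    $run ip rule show
    # What does the custom table contain? Is there a route for the bridge subnet?
    $run ip route show table "$table"
    # Which route would host traffic sourced from the bridge address take?
    $run ip route get "$peer" from "$gw"
    # Reverse-path filter settings (assumption: these may be relevant).
    $run sysctl net.ipv4.conf.all.rp_filter "net.ipv4.conf.$bridge.rp_filter"
}
```

For the setup in this mail the call would be
`RUN=sudo inspect_policy_routing CONTAINERS 10.89.0.1 10.89.0.4 cni-podman1`.
If the `ip route get` lookup resolves via bond0.7 rather than
cni-podman1, that would suggest the CONTAINERS table has no route for
the bridge subnet, which would be worth ruling out first.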
--
Daniel Gray 0x41911F722B0F9AE3
https://mastodon.social/@dngray