Why can't I ping hosts on two different internet-facing interfaces?

Hi all, a networking question for a box running AlmaLinux 9.1.

I have two connections, each going to a different ISP. Primary is on eno1 and has a metric of 100. Secondary is on eno2 and has a metric of 200. Both have default routes defined.

I want to automate failover between the two interfaces. Manual failover is validated and works, but the problem I encounter is that I can’t get ping responses out of the secondary interface.

A packet capture shows the ping replies coming in, but the ping application doesn’t see them.

Strace on the ping process shows the packet replies, and then I see this error:

sendto(3, "\10\0\201\335\0\0\0\1\257\237/d\0\0\0\0\327J\1\0\0\0\0\0\20\21\22\23\24\25\26\27"..., 64, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("8.8.8.8")}, 16) = 64
recvmsg(3, {msg_namelen=128}, 0) = -1 EAGAIN (Resource temporarily unavailable)
sendto(3, "\10\0\16^\0\0\0\2\260\237/d\0\0\0\0I\311\1\0\0\0\0\0\20\21\22\23\24\25\26\27"..., 64, 0, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("8.8.8.8")}, 16) = 64
recvmsg(3, {msg_namelen=128}, 0) = -1 EAGAIN (Resource temporarily unavailable)

This happens, by the way, with no firewall enabled, iptables flushed clean, and no unusual routing.

How can I get ping replies on both interfaces to work? Doing so would make automatic failure detection possible.

For traffic there are two distinct cases:

  • This machine sends something new out. If the destination is not among the (statically) known destinations, the packet uses the default route.
  • This machine sends a reply to something that came in.

For the latter, the “two routes” setup requires policy routing (also called source-based routing).

Let's say that eno1 has IP1 and eno2 has IP2. If a packet comes in on eno2, its destination is IP2 and the reply will by default have IP2 as its source.

Without policy routing, the reply would be routed out via eno1 as long as the default route is on that side, but a packet with IP2 as its source cannot leave from eno1, which has address IP1 (in a different subnet).

The policy routing setup uses two additional routing tables:

// rt1
to subnet of IP1 via eno1
default via GW1  // GW1 is in the subnet of IP1
// rt2
to subnet of IP2 via eno2
default via GW2  // GW2 is in the subnet of IP2

Furthermore, there must be rules:

from IP1 use table rt1
from IP2 use table rt2

Now a reply packet that is “from IP2” will use table rt2 for routing, because the rule says so, and according to table rt2 that packet is sent to GW2 via eno2.
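
As a rough sketch, the tables and rules above could be expressed with iproute2 along these lines. All addresses are made-up documentation-range placeholders, the table numbers 101 and 102 are arbitrary stand-ins for rt1 and rt2, and setup_uplink is just a helper name invented for this example. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical placeholder addresses; substitute the real values for each ISP.
IP1=192.0.2.10;    NET1=192.0.2.0/24;    GW1=192.0.2.1      # eno1 side
IP2=198.51.100.10; NET2=198.51.100.0/24; GW2=198.51.100.1   # eno2 side

# Dry run by default: each command is printed, not executed.
# Run as root with RUN= (set but empty) to actually apply the commands.
RUN=${RUN-echo}

# setup_uplink TABLE DEV ADDR SUBNET GATEWAY  (helper name is an assumption)
setup_uplink() {
    # Connected subnet of this uplink, in its own table
    $RUN ip route replace "$4" dev "$2" src "$3" table "$1"
    # Default route of this table points at this uplink's gateway
    $RUN ip route replace default via "$5" dev "$2" table "$1"
    # Packets sourced from this uplink's address consult this table
    $RUN ip rule add from "$3" table "$1"
}

setup_uplink 101 eno1 "$IP1" "$NET1" "$GW1"   # rt1
setup_uplink 102 eno2 "$IP2" "$NET2" "$GW2"   # rt2
```

After applying it for real, ip rule show and ip route show table 102 should list the new rule and the routes in the second table.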


The ‘ping’ command has an option (-I) to send the packet from a specific interface, rather than the one determined by the routing tables.
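
For the failure-detection goal, a per-interface probe could be wrapped roughly like this. It is only a sketch: check_link is a helper name made up for this example, and the probe count and timeout are arbitrary choices.

```shell
#!/bin/sh
# check_link IFACE TARGET
# Succeeds if TARGET answers at least one of three echo requests sent via IFACE.
# -c 3: send three probes; -W 2: wait up to two seconds for each reply.
check_link() {
    ping -c 3 -W 2 -I "$1" "$2" >/dev/null 2>&1
}

# Example use (not executed here):
#   check_link eno2 8.8.8.8 && echo "eno2 up" || echo "eno2 down"
```

The exit status comes straight from ping, so the function can be used directly in an if or a loop that drives the failover.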


“Flushing iptables” is deprecated because the kernel uses nftables in both el8 and el9.
Use 'nft list ruleset' to see the current rules and 'man nft' for how to flush them.

Thanks! Before I reply specifically, let me clarify a few things that might help narrow it down.

I was using the ping utility with the -I option, and the results are as follows:

[router ~]# ping -c 4 -I eno1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from {IP1} eno1: 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=59 time=11.7 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=59 time=12.3 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=59 time=11.3 ms
64 bytes from 8.8.8.8: icmp_seq=4 ttl=59 time=12.3 ms

--- 8.8.8.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms
rtt min/avg/max/mdev = 11.310/11.911/12.311/0.414 ms
[router ~]# ping -c 4 -I eno2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) from {IP2} eno2: 56(84) bytes of data.

--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3049ms

In the case where I ping with -I eno2, the packet captures and strace show the ping reply packets arriving at the box, so I am assuming the ping utility is sending the correct source address. I’m unsure why the ping utility itself doesn’t see the responses and what the “resource temporarily unavailable” error is indicative of in the strace.

Good call on nftables. I checked my firewall to make sure the rules were completely cleared out while I was testing:

]# nft list ruleset
table ip filter {
    chain INPUT {
        type filter hook input priority filter; policy accept;
    }
}

I have also tested policy routing with the firewall running, building the two route tables in a similar manner to what you described. When I take the IP of a machine behind the firewall and set a rule for it to use the eno2 gateway, that machine correctly routes all traffic out the eno2 interface, and the masquerading seems to have no issue returning pings and all other traffic.

Hope that clears it up a bit.