Why Are Packets Going Out The Wrong Interface--Preserving For Historical Reasons
By Stw-Oracle on Apr 04, 2011
Dated: Thursday Apr 30, 2009
A common complaint for Solaris users runs something like this:
- I have a Solaris system with two Ethernet interfaces connected to different subnets. Sometimes, I see an IP packet come in on one interface, but the packet goes back out a different one.
- This behavior is bad for my network, because I have firewalls that check the packet sources, and they drop these misdirected packets. Why does Solaris do this? And how can I fix it? I've tried disabling routing, but that doesn't seem to help.
The underlying problem here is at least partly a misunderstanding of how TCP/IP works. When a system transmits a packet, it must locate the "best" interface over which to send it. By default, the algorithm for doing that is as described in RFC 1122 section 3.3.1. Note in particular section 3.3.1.1. This requires the system to look at local interfaces first -- all of them -- to try to match the destination address. And once we find the interface by the destination address, we're done.
That alone is enough to make things not work as expected. If you send a packet to the local address on ce0 from some other system, but that other system is best reachable through bge0, then we'll send the reply via bge0. It doesn't go back out through ce0, even if the original request came in that way.
When considering a non-interface route (whether only the "default routes" of RFC 1122 or the more flexible CIDR routes of RFC 1812), the system will look up the route by destination IP address alone, and then use the route to obtain the output interface. This often causes the same sort of confusion: a "default route" ends up sending packets to the default router that the administrator thinks don't belong there.
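To make the destination-only lookup concrete, here is a minimal sketch in Python. It is a toy model and not how the Solaris kernel is implemented: the interface names ce0 and bge0 come from the example above, but the prefixes, the table layout, and the function names are all assumptions made up for illustration. It simply picks the longest matching prefix for the destination and ignores everything else.

```python
import ipaddress

# Hypothetical forwarding table: (destination prefix, output interface).
# The interface names follow the article; the prefixes are invented
# documentation addresses, not anything from a real system.
FORWARDING_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "ce0"),      # on-link subnet of ce0
    (ipaddress.ip_network("198.51.100.0/24"), "bge0"),  # on-link subnet of bge0
    (ipaddress.ip_network("0.0.0.0/0"), "bge0"),        # default route via bge0
]


def output_interface(destination):
    """Return the interface of the longest prefix matching the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, ifname) for net, ifname in FORWARDING_TABLE if addr in net]
    if not matches:
        return None  # no route to host
    _, ifname = max(matches, key=lambda m: m[0].prefixlen)
    return ifname


def reply_interface(arrival_interface, peer_address):
    """Pick the interface for a reply.

    arrival_interface is accepted only to show that it plays no part in
    the decision: the lookup uses the peer's address alone.
    """
    return output_interface(peer_address)
```

In this model, `reply_interface("ce0", "198.51.100.7")` yields `bge0`: the request arrived on ce0, but the peer is best reached through bge0, so that's where the reply goes.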
I actually consider this a design feature of TCP/IP, and not a flaw. It's part of the robustness that IP's datagram routing system offers: every node in the network -- hosts and routers alike -- independently determines the best way to send each distinct datagram based solely on the destination IP address. This allows for "healing" of broken networks, as the failure of one interface or router means that you can potentially still use a different (perhaps less preferred) one to send your message.
There are some related bits of confusion in this area. For example, some programmers think that binding to a particular IP address means that the interface with that address is "bound" and all packets will go out that way. That's not correct. The system still uses the destination address to pick the output path for each individual IP packet, even if your socket is bound to an address on some particular interface. And, as long as you don't set the ip_strict_dst_multihoming ndd flag (it's not set by default), binding to an address doesn't mean that packets will only arrive on that corresponding interface. They can arrive on any interface in the system, as long as the IP address matches the one bound.
There are many ways to fix this issue, and the right answer for a given situation likely depends on the details of that situation.
- The main issue here is the kernel's forwarding table, so putting the right things into the forwarding table is one of the first tasks.
A common problem is that the administrator has set up a "default router," but that specified router cannot correctly forward to all possible IP destinations. Some packets the system sends end up getting misdirected or lost as a result. The solution is to stop using that router as a "default router," and instead to use more specific routes (perhaps running a listen-only routing protocol to simplify the administrative burden).
- Some systems have a "route by source address" feature. Solaris isn't one of those, though there is an RFE open on it (see CR 4777670). A better answer, in my opinion, would be to do something similar to what's suggested in CR 4173841. That would be, when we have multiple matching routes, to prefer a route that gives us an output interface in the same subnet as the source address.
It's a simple tweak, and would at least help the folks who have problems with default route selection. It would not fix the problems people with interfaces on separate subnets have, though.
- Applications that care about interface selection can use IP_BOUND_IF or IP_PKTINFO to select the specific interface desired.
See the ip(7P) man page on your system for details.
- If all else fails, you can use IP Filter's fastroute/to keyword on an output interface to put packets right where you want them. You should be aware that when you do this, you're circumventing IP's routing features, which means that if there's an interface or path failure, you may cause connections to fail that didn't need to fail.
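For the tie-break described in the CR 4173841 item above, a rough sketch may help show how small the change is. This is only my reading of the idea as summarized here, not Solaris code: the route table, subnets, gateway addresses, and the `pick_route` name are all hypothetical. When several routes match equally well, it prefers one whose output interface sits in the same subnet as the source address, and otherwise falls back to the first match.

```python
import ipaddress

# Hypothetical interfaces and their on-link subnets (made-up addresses).
INTERFACE_SUBNETS = {
    "ce0": ipaddress.ip_network("192.0.2.0/24"),
    "bge0": ipaddress.ip_network("198.51.100.0/24"),
}

# Two equally specific default routes, one per interface.
# Each entry is (destination prefix, gateway, output interface).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "198.51.100.1", "bge0"),
    (ipaddress.ip_network("0.0.0.0/0"), "192.0.2.1", "ce0"),
]


def pick_route(destination, source):
    """Choose a route by destination, breaking ties with the source subnet."""
    dst = ipaddress.ip_address(destination)
    src = ipaddress.ip_address(source)

    # Step 1: ordinary destination-based match, keeping the longest prefixes.
    matching = [r for r in ROUTES if dst in r[0]]
    if not matching:
        return None
    best_len = max(r[0].prefixlen for r in matching)
    best = [r for r in matching if r[0].prefixlen == best_len]

    # Step 2: among equally good routes, prefer one whose output interface
    # is on the same subnet as the source address.
    for route in best:
        if src in INTERFACE_SUBNETS[route[2]]:
            return route

    # Step 3: no interface matches the source, so keep the usual behavior.
    return best[0]
```

With a source of 192.0.2.10 the sketch returns the route through ce0, and with 198.51.100.10 it returns the one through bge0. Note that the destination still drives the lookup; the source only breaks ties, which is why this would not help people whose interfaces are on separate subnets with no competing routes.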