ARP Internals: an in-depth exploration of how ARP functions

Introduction

Have you ever wondered how your computer finds the right device to send data to on a network?

The Address Resolution Protocol (ARP) makes this possible. It acts as the bridge between IP addresses and physical MAC (Media Access Control) addresses in Ethernet networks. While higher-level protocols like IP use logical addresses, devices on the same link deliver frames to each other using MAC addresses. ARP ensures that when a device knows the IP address of another device on the same local network, it can discover the corresponding MAC address, so that frames reach the intended device.

This blog aims to provide an in-depth exploration of how ARP functions within the Linux operating system. We’ll dive into the packet structure, the internal workings of ARP in the Linux kernel, and the processes involved in ARP requests and replies. By the end, readers will have a comprehensive understanding of ARP’s role in networking and how it’s implemented and managed in Linux.

ARP Basics

ARP Functionality

ARP plays a critical role in enabling communication over a network by mapping IP addresses (used by the Internet Protocol) to MAC addresses (used by Ethernet). This mapping process is necessary because while IP addresses are used to identify devices at the network layer, data link layer protocols like Ethernet require MAC addresses to forward packets within the local network.

For example, when a device wants to send data to another device on the same network, it first checks its ARP cache to see if the MAC address corresponding to the destination IP address is already known. If not, it broadcasts an ARP request packet to all devices on the network, asking, “Who has this IP address?” The device with the matching IP address replies with its MAC address, which is then cached and used to send the data.
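
The lookup-then-broadcast flow described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how any real network stack is written; resolve, fake_broadcast, and the sample addresses are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, Optional


def resolve(ip: str,
            cache: Dict[str, str],
            broadcast_who_has: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the MAC for `ip`, consulting the cache before asking the network."""
    mac = cache.get(ip)
    if mac is not None:
        return mac                   # cache hit: no ARP traffic needed
    mac = broadcast_who_has(ip)      # cache miss: "Who has this IP address?"
    if mac is not None:
        cache[ip] = mac              # remember the answer for next time
    return mac


# Hypothetical network with a single host that answers for 192.0.2.10.
hosts = {"192.0.2.10": "02:00:00:00:00:10"}
requests_sent = []


def fake_broadcast(ip: str) -> Optional[str]:
    """Stand-in for the broadcast: records the request and returns any answer."""
    requests_sent.append(ip)
    return hosts.get(ip)


cache: Dict[str, str] = {}
first = resolve("192.0.2.10", cache, fake_broadcast)   # triggers one request
second = resolve("192.0.2.10", cache, fake_broadcast)  # served from the cache
```

Only the first call reaches the network; the second is answered from the cache, which is exactly the saving the ARP cache is meant to provide.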

ARP Request Process

When a device on a network needs to communicate with another device, it requires the MAC address corresponding to the target device’s IP address to generate the Layer 2 header. If this MAC address isn’t already cached, the device initiates an ARP request. This process is broadcast-based, meaning the request is sent to all devices on the local network segment.

The ARP request is sent as a broadcast Ethernet frame (destination MAC FF:FF:FF:FF:FF:FF) because the sender does not yet know the MAC address of the target device. The packet contains the sender’s IP and MAC addresses and the target’s IP address, while the target MAC address field inside the ARP payload is set to 00:00:00:00:00:00.
In the Linux kernel, the ARP request is generated by the arp_send() function, which constructs the ARP packet and broadcasts it. The kernel’s network stack ensures that the ARP request is sent on the correct network interface.
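
To make the request concrete, here is a minimal sketch that assembles the bytes of an ARP request frame in user space. It does not reproduce the kernel's arp_send(); build_arp_request and mac_to_bytes are hypothetical helpers, and the addresses are documentation examples. Note that the broadcast address goes in the Ethernet header, while the target MAC field in the ARP payload is zero-filled.

```python
import socket
import struct

ETH_P_ARP = 0x0806        # EtherType for ARP
ETH_P_IP = 0x0800         # protocol type for IPv4
ARPHRD_ETHER = 1          # hardware type for Ethernet
ARPOP_REQUEST = 1         # opcode for an ARP request
BROADCAST_MAC = b"\xff" * 6


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Return a 42-byte Ethernet frame carrying an ARP request."""
    src = mac_to_bytes(sender_mac)
    # Ethernet header: destination (broadcast), source, EtherType.
    eth_header = BROADCAST_MAC + src + struct.pack("!H", ETH_P_ARP)
    # ARP payload: the target MAC is unknown, so it is zero-filled.
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        ARPHRD_ETHER, ETH_P_IP, 6, 4, ARPOP_REQUEST,
        src, socket.inet_aton(sender_ip),
        b"\x00" * 6, socket.inet_aton(target_ip),
    )
    return eth_header + arp_payload


frame = build_arp_request("02:00:00:00:00:01", "192.0.2.1", "192.0.2.10")
```

Actually transmitting such a frame would require a raw socket and elevated privileges, which is left out of this sketch.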

ARP Reply Process

Upon receiving an ARP request, all devices on the network inspect the target IP address within the packet. The device with a matching IP address responds with an ARP reply, which is unicast directly back to the requester.

The ARP reply contains the MAC address of the responding device, enabling the original sender to map the IP address to this MAC address and store it in the ARP cache. This reply is handled in the Linux kernel by the arp_process() function.
Once the ARP reply is received, the requester updates its ARP cache with the new IP-to-MAC mapping. This cached entry allows subsequent communications to bypass the ARP request process, as the MAC address is now known.
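
The reply side can be sketched the same way. The snippet below is an illustrative user-space model, not the kernel's arp_process(); make_reply and learn are hypothetical helpers. It shows the responder swapping the sender and target roles, and the requester storing the resulting mapping.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"     # 28-byte ARP payload for Ethernet/IPv4
ARPOP_REQUEST, ARPOP_REPLY = 1, 2


def make_reply(request: bytes, my_mac: bytes, my_ip: bytes):
    """Return an ARP reply payload if `request` asks for `my_ip`, else None."""
    htype, ptype, hlen, plen, op, sha, spa, _tha, tpa = struct.unpack(
        ARP_FORMAT, request)
    if op != ARPOP_REQUEST or tpa != my_ip:
        return None               # not addressed to us: stay silent
    # Swap roles: we become the sender, the requester becomes the target.
    return struct.pack(ARP_FORMAT, htype, ptype, hlen, plen, ARPOP_REPLY,
                       my_mac, my_ip, sha, spa)


def learn(reply: bytes, cache: dict) -> None:
    """Store the sender's IP-to-MAC mapping from a reply in `cache`."""
    fields = struct.unpack(ARP_FORMAT, reply)
    op, sha, spa = fields[4], fields[5], fields[6]
    if op == ARPOP_REPLY:
        cache[socket.inet_ntoa(spa)] = ":".join(f"{b:02x}" for b in sha)


requester_mac = bytes.fromhex("020000000001")
requester_ip = socket.inet_aton("192.0.2.1")
target_mac = bytes.fromhex("020000000010")
target_ip = socket.inet_aton("192.0.2.10")

request = struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, ARPOP_REQUEST,
                      requester_mac, requester_ip, b"\x00" * 6, target_ip)
reply = make_reply(request, target_mac, target_ip)

arp_cache = {}
learn(reply, arp_cache)
```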

ARP Packet Structure

An ARP packet consists of several fields that define its structure and purpose:

Hardware Type: Specifies the type of hardware address (e.g., Ethernet).
Protocol Type: Defines the type of protocol address (e.g., IPv4).
Hardware Size: The length of the hardware address in bytes (6 for Ethernet).
Protocol Size: The length of the protocol address in bytes (4 for IPv4).
Opcode: Indicates whether the packet is an ARP request (1) or reply (2).
Sender MAC Address: The MAC address of the sender.
Sender IP Address: The IP address of the sender.
Target MAC Address: The MAC address of the intended recipient (set to 00:00:00:00:00:00 in ARP requests).
Target IP Address: The IP address of the intended recipient.

This structure allows the ARP protocol to carry out its function of resolving IP addresses to MAC addresses efficiently.
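
As an illustration of the field layout listed above, the following sketch decodes a 28-byte ARP payload (the IPv4-over-Ethernet case) into named fields. ArpPacket and parse_arp are hypothetical names, and the sample bytes are hand-written for this example rather than captured from a real network.

```python
import socket
import struct
from typing import NamedTuple


class ArpPacket(NamedTuple):
    hardware_type: int
    protocol_type: int
    hardware_size: int
    protocol_size: int
    opcode: int
    sender_mac: str
    sender_ip: str
    target_mac: str
    target_ip: str


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated MAC string."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_arp(payload: bytes) -> ArpPacket:
    """Decode the 28-byte ARP payload used for IPv4 over Ethernet."""
    htype, ptype, hlen, plen, op, sha, spa, tha, tpa = struct.unpack(
        "!HHBBH6s4s6s4s", payload[:28])
    return ArpPacket(htype, ptype, hlen, plen, op,
                     format_mac(sha), socket.inet_ntoa(spa),
                     format_mac(tha), socket.inet_ntoa(tpa))


# Hand-written sample: a request from 192.0.2.1 asking for 192.0.2.10.
sample = bytes.fromhex(
    "0001" "0800" "06" "04" "0001"      # hw type, proto type, sizes, opcode
    "020000000001" "c0000201"           # sender MAC, sender IP
    "000000000000" "c000020a"           # target MAC (unknown), target IP
)
packet = parse_arp(sample)
```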

Gratuitous ARP

Gratuitous ARP (GARP) is a type of ARP message that is not a response to any ARP request but is instead sent as a broadcast proactively. It serves two main purposes:

Network Announcement: GARP proactively informs other devices about changes in IP-MAC associations, ensuring ARP caches remain updated and preventing stale entries.
Failover in High Availability Setups: In redundancy configurations, two devices may share the same IP but have different MAC addresses. When failover occurs, GARP prompts the other devices on the network to update their ARP caches immediately, allowing uninterrupted communication with the IP address that has moved.
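
The failover case can be modelled with a short sketch. A gratuitous ARP is recognisable because its sender IP and target IP are the same address. The helper names below are hypothetical, and the policy of refreshing only entries that already exist is an assumption made for this example; how a real receiver treats unsolicited ARP depends on its operating system and configuration.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"     # Ethernet/IPv4 ARP payload, 28 bytes


def build_gratuitous_arp(mac: bytes, ip: str) -> bytes:
    """Request-style GARP: sender IP and target IP are the same address."""
    addr = socket.inet_aton(ip)
    return struct.pack(ARP_FORMAT, 1, 0x0800, 6, 4, 1,
                       mac, addr, b"\x00" * 6, addr)


def is_gratuitous(payload: bytes) -> bool:
    """True when the packet announces the sender's own address."""
    fields = struct.unpack(ARP_FORMAT, payload[:28])
    sender_ip, target_ip = fields[6], fields[8]
    return sender_ip == target_ip


def handle_announcement(payload: bytes, cache: dict) -> None:
    """Refresh an existing cache entry from a gratuitous ARP (assumed policy)."""
    fields = struct.unpack(ARP_FORMAT, payload[:28])
    sender_mac, sender_ip = fields[5], socket.inet_ntoa(fields[6])
    if is_gratuitous(payload) and sender_ip in cache:
        cache[sender_ip] = sender_mac


# Failover: the shared IP moves from the primary's MAC to the standby's MAC.
primary_mac = bytes.fromhex("020000000001")
standby_mac = bytes.fromhex("020000000002")
peer_cache = {"192.0.2.100": primary_mac}
handle_announcement(build_gratuitous_arp(standby_mac, "192.0.2.100"), peer_cache)
```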

ARP in Linux

ARP Implementation in the Linux Kernel

In Linux, ARP is implemented as part of the kernel’s networking stack. The ARP module is responsible for maintaining ARP tables, processing ARP requests and replies, and ensuring that the MAC-to-IP mappings are kept up-to-date. The Linux kernel handles ARP operations primarily through the net/ipv4/arp.c file, where core ARP-related functions are defined.

Key kernel functions involved in ARP processing include:

arp_send(): Responsible for crafting and sending ARP requests.
arp_rcv(): Handles incoming ARP packets, determining if they are requests or replies and processing them accordingly.
arp_process(): Manages the logic for updating the ARP cache and generating ARP replies when necessary.

Linux also provides a rich set of tools and configurations that allow one to control how ARP operates, including adjusting timeouts, cache sizes, and enabling or disabling specific ARP features like Proxy ARP, which is out of scope of our discussion.

ARP State Transitions

In Linux, the ARP cache entries undergo various state transitions based on network activity and communication patterns. Each state reflects the current status of an IP-to-MAC mapping, helping the kernel determine how to handle ARP requests, replies, and probes.

The primary ARP states include:

NONE: This state indicates that no valid neighbour entry exists.
INCOMPLETE: The kernel has sent an ARP request for a given IP address but has not yet received a reply. The entry is in a temporary state, waiting to be resolved.
REACHABLE: The IP-to-MAC mapping is confirmed, and the entry is actively being used. The mapping is considered valid and can be used for packet transmission without further ARP requests.
STALE: The entry has not been used for a certain period, making the mapping potentially outdated. The kernel does not immediately remove it but will transition the entry to another state if further communication occurs.
DELAY: After an entry becomes STALE, it transitions to the DELAY state when traffic is detected. The kernel will wait for a configured delay (as controlled by /proc/sys/net/ipv4/neigh/default/delay_first_probe_time) before probing the entry to check if it’s still valid.
PROBE: If the DELAY timer expires without validation, the kernel sends ARP probes to confirm the IP-to-MAC mapping. During this state, the entry is being actively verified.
FAILED: The ARP request has failed after multiple attempts, meaning the IP-to-MAC mapping could not be resolved. The entry is usually removed from the cache.
PERMANENT: This state is assigned to manually configured static ARP entries that do not expire and are not subject to automatic garbage collection.

[Figure: ARP state transition diagram]

Note: 1. The START state is included for conceptual understanding and is not part of the actual ARP state machine. 2. All the parameters mentioned in the above state diagram are available under /proc/sys/net/ipv4/neigh/default/.
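
The states listed above can also be written down as a small transition table. This is a simplified model based only on the descriptions in this section, not a transcription of the kernel's neighbour code; the event names (packet_queued, idle_timeout, and so on) are hypothetical labels invented for this sketch.

```python
from enum import Enum


class NeighState(Enum):
    NONE = "NONE"
    INCOMPLETE = "INCOMPLETE"
    REACHABLE = "REACHABLE"
    STALE = "STALE"
    DELAY = "DELAY"
    PROBE = "PROBE"
    FAILED = "FAILED"
    PERMANENT = "PERMANENT"


S = NeighState

# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    (S.NONE, "packet_queued"): S.INCOMPLETE,        # first ARP request sent
    (S.INCOMPLETE, "reply_received"): S.REACHABLE,
    (S.INCOMPLETE, "retries_exhausted"): S.FAILED,
    (S.REACHABLE, "idle_timeout"): S.STALE,
    (S.STALE, "packet_queued"): S.DELAY,            # traffic on a stale entry
    (S.DELAY, "confirmation_received"): S.REACHABLE,
    (S.DELAY, "delay_timer_expired"): S.PROBE,
    (S.PROBE, "reply_received"): S.REACHABLE,
    (S.PROBE, "retries_exhausted"): S.FAILED,
}


def step(state: NeighState, event: str) -> NeighState:
    """Apply one event; unknown events (and PERMANENT entries) keep the state."""
    return TRANSITIONS.get((state, event), state)


def run(events, start: NeighState = S.NONE):
    """Return the list of states visited while applying `events` in order."""
    state, history = start, [start]
    for event in events:
        state = step(state, event)
        history.append(state)
    return history


history = run(["packet_queued", "reply_received", "idle_timeout",
               "packet_queued", "delay_timer_expired", "reply_received"])
```

The sample run follows an entry from first resolution, through going stale, to being re-confirmed by a probe.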

ARP Cache

The ARP cache stores the IP-to-MAC mappings that have been learned through ARP requests and replies. This table is maintained by the kernel and can be viewed and managed using user-space tools like ip and arp.

To view the current ARP table, you can use the following command:

ip -s neighbour

This command displays the list of IP addresses and their corresponding MAC addresses, along with additional information such as the interface through which the address was learned and the state of the entry (e.g., REACHABLE, STALE).

<ip_address_1> dev <IF_1> lladdr <MAC_1> used 62/61/49 probes 1 STALE
<ip_address_2> dev <IF_2> lladdr <MAC_2> ref 1 used 6/6/6 probes 1 REACHABLE
<ip_address_3> dev <IF_3> lladdr <MAC_3> ref 1 used 0/0/213 probes 1 REACHABLE

The output typically includes:

The IP address and MAC address (lladdr) of the neighbour devices are listed.
The interface (dev) through which the neighbour is connected is specified.
The statistics include:
- ref: The reference count (number of users of this entry).
- used: Three values giving the time (in seconds) since the
  - entry was last used (a packet was sent to the neighbour)
  - neighbour’s reachability was last confirmed
  - entry was last updated.
- probes: The number of ARP probes sent to confirm the neighbour’s reachability.
The state of the neighbour entry (such as delay, failed, incomplete, noarp, none, permanent, probe, reachable, stale ) is shown.
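
For scripting, the fields described above can be pulled out of each output line. The sketch below parses lines shaped like the sample output shown earlier; parse_neigh_line is a hypothetical helper, the key names used, confirmed, and updated are labels chosen here for the three "used" values, and real output can contain extra tokens (for example a router flag) that this pattern does not handle.

```python
import re
from typing import Optional

# Matches lines such as:
#   <ip> dev <if> lladdr <mac> ref 1 used 6/6/6 probes 1 REACHABLE
LINE_RE = re.compile(
    r"^(?P<ip>\S+)\s+dev\s+(?P<dev>\S+)"
    r"(?:\s+lladdr\s+(?P<lladdr>\S+))?"
    r"(?:\s+ref\s+(?P<ref>\d+))?"
    r"(?:\s+used\s+(?P<used>\d+)/(?P<confirmed>\d+)/(?P<updated>\d+))?"
    r"(?:\s+probes\s+(?P<probes>\d+))?"
    r"\s+(?P<state>[A-Z]+)$"
)

NUMERIC_KEYS = ("ref", "used", "confirmed", "updated", "probes")


def parse_neigh_line(line: str) -> Optional[dict]:
    """Parse one line of `ip -s neighbour` output; None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    for key in NUMERIC_KEYS:
        if entry[key] is not None:
            entry[key] = int(entry[key])   # counters and ages become integers
    return entry


stale = parse_neigh_line(
    "192.0.2.1 dev eth0 lladdr 02:00:00:00:00:01 used 62/61/49 probes 1 STALE")
reachable = parse_neigh_line(
    "192.0.2.2 dev eth0 lladdr 02:00:00:00:00:02 ref 1 used 6/6/6 probes 1 REACHABLE")
```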

Modifying the ARP table (e.g., adding or deleting entries) can be done using the ip or arp commands. For example, to manually add an entry to the ARP table, you can use:

sudo ip neigh add <addr> lladdr <mac> dev <if_name>

ARP Cache Management

Each entry in the ARP cache is associated with specific attributes, including a state, a timeout value, and flags that guide how the kernel should handle the entry. The kernel automatically manages the cache by adding new entries when ARP replies are received and by removing stale or outdated entries over time. This dynamic management ensures that the cache remains efficient and up-to-date.

Linux provides several tunable parameters that control the behavior of the ARP cache. These settings can be adjusted to optimize network performance, depending on the specific requirements of the environment.

Adjusting Cache Timeouts

One important setting to consider is the /proc/sys/net/ipv4/neigh/default/gc_stale_time parameter. The arp(7) manual page describes it as determining how often the kernel checks for stale neighbour entries, with a default of 60 seconds; how long a confirmed entry stays REACHABLE before it is marked as STALE is governed by the related base_reachable_time parameter. By adjusting gc_stale_time, you can influence how long unused entries linger in the cache, balancing between freshness and resource usage.

A few scenarios:

Too High (>120s): Stale ARP entries linger, leading to packet loss, delayed failover, and slower network adaptation. While it reduces ARP processing overhead, it risks outdated MAC mappings.
Too Low (<30s): ARP requests become frequent, increasing CPU and network overhead. However, this ensures faster updates and better responsiveness in dynamic environments.

Choosing the right value depends on your network’s stability and performance needs. For most setups, 60 seconds offers a balanced approach.
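
A small helper can read the current value and compare it against the rough guidance above. This is a sketch under stated assumptions: read_neigh_tunable and describe_gc_stale_time are hypothetical helpers, and the 30 and 120 second cut-offs are simply the rule-of-thumb figures from this section, not limits enforced by the kernel.

```python
from pathlib import Path

NEIGH_DIR = Path("/proc/sys/net/ipv4/neigh/default")


def read_neigh_tunable(name: str, base: Path = NEIGH_DIR) -> int:
    """Read an integer neighbour tunable such as gc_stale_time."""
    return int((base / name).read_text().strip())


def describe_gc_stale_time(seconds: int) -> str:
    """Classify a value using the rough 30s/120s guidance from the text."""
    if seconds > 120:
        return "high"       # entries linger; slower adaptation to changes
    if seconds < 30:
        return "low"        # frequent revalidation; more ARP traffic
    return "balanced"
```

On a Linux host, describe_gc_stale_time(read_neigh_tunable("gc_stale_time")) reports where the running system falls. The base argument exists so the reader function can be pointed at a per-interface directory or at a test directory instead of the default one.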

Managing Cache Size

The ARP cache size is controlled by the garbage collection thresholds gc_thresh1, gc_thresh2, and gc_thresh3. As documented in arp(7), gc_thresh1 is the minimum number of entries to keep (the garbage collector does not run below it), gc_thresh2 is a soft maximum that may be exceeded for about 5 seconds before collection is performed, and gc_thresh3 is the hard maximum. Once the cache grows past these limits, the system purges older entries to make room for new ones. Properly configuring these values helps maintain the right balance between memory usage and cache effectiveness.

A few scenarios:

Too Low (gc_thresh3 < 256): The ARP cache fills up quickly, forcing premature purging of valid entries. This leads to frequent ARP lookups, causing network delays and increased latency, especially in environments with multiple active devices.
Too High (gc_thresh3 > 4096): The ARP cache grows excessively, consuming more memory than necessary. On resource-constrained devices like routers, this can lead to performance degradation due to inefficient memory allocation.

Choosing the right values depends on network size and device capacity. For small networks, a balanced configuration (gc_thresh3 = 1024) ensures optimal ARP resolution while preventing unnecessary cache evictions.
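
The role of the three thresholds can be illustrated with a small classifier. This is a simplified model, not the kernel's garbage collector: gc_pressure is a hypothetical function, the default arguments 128, 512, and 1024 are the defaults listed in the arp(7) manual page, and the exact boundary behaviour in a given kernel version may differ.

```python
def gc_pressure(entries: int,
                thresh1: int = 128,
                thresh2: int = 512,
                thresh3: int = 1024) -> str:
    """Describe which garbage-collection regime a cache of `entries` is in."""
    if not (0 <= thresh1 <= thresh2 <= thresh3):
        raise ValueError("expected gc_thresh1 <= gc_thresh2 <= gc_thresh3")
    if entries < thresh1:
        return "below_minimum"         # collector leaves the cache alone
    if entries < thresh2:
        return "normal"                # regular periodic collection
    if entries < thresh3:
        return "soft_limit_exceeded"   # tolerated only briefly
    return "hard_limit_exceeded"       # old entries must go before new ones fit


small_office = gc_pressure(50)
busy_subnet = gc_pressure(600)
overloaded = gc_pressure(2000)
```

With the default thresholds, a cache of 600 entries is already past the soft limit, which is why larger subnets usually call for raising all three values together.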

Controlling Delay Before Probing

Another useful parameter is /proc/sys/net/ipv4/neigh/default/delay_first_probe_time, which controls the initial delay before the kernel sends the first ARP probe for a STALE entry that is being used again. By default, this is set to 5 seconds, but adjusting it can reduce unnecessary probing. The delay gives the entry a chance to return to the REACHABLE state without a probe, for example when the neighbour’s reachability is confirmed in the meantime, which minimizes network traffic and improves overall system performance.

A few scenarios:

Too High (>10s): The system waits too long before revalidating ARP entries, leading to temporary connectivity issues. This can cause packet loss or delays when communicating with previously idle hosts, especially in dynamic networks where IP addresses frequently change.
Too Low (<2s): ARP probes are sent too aggressively, increasing network traffic and CPU usage. This can overwhelm the system in large deployments or busy networks, leading to unnecessary processing overhead.

Choosing the right value depends on network stability and traffic patterns. For most environments, the default 5 seconds strikes a good balance, ensuring timely revalidation without excessive ARP probing.
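
The effect of this timer can be shown with a toy model of the DELAY window. state_after_delay is a hypothetical function written for this post; it only captures the rule described above, namely that a confirmation arriving within the delay avoids the probe.

```python
from typing import Optional


def state_after_delay(delay_first_probe_time: float,
                      confirmed_after: Optional[float]) -> str:
    """Return the state an entry reaches once the DELAY window is over.

    confirmed_after is the number of seconds after entering DELAY at which
    the neighbour's reachability was confirmed, or None if it never was.
    """
    if confirmed_after is not None and confirmed_after <= delay_first_probe_time:
        return "REACHABLE"   # confirmed in time: no ARP probe needed
    return "PROBE"           # timer expired first: start probing


# Same confirmation at t = 3s, two different timer settings.
with_default = state_after_delay(5, confirmed_after=3)
with_short_timer = state_after_delay(2, confirmed_after=3)
```

With the default of 5 seconds the confirmation arrives in time and no probe is sent, while a 2 second timer would already have triggered probing.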

Adding Static ARP Entries

In some cases, you may want to add static ARP entries to the cache. These entries do not expire and are not subject to the usual garbage collection process. Static entries are useful for devices with fixed IP-to-MAC mappings, such as network infrastructure components. You can add these entries manually using the arp -s or ip neigh add commands, ensuring that key devices are always reachable without needing to re-resolve their addresses.
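
When static entries are added from a script, it helps to validate the inputs before calling ip. The sketch below only builds the argument list for an ip neigh add command with the permanent state set explicitly; static_neigh_command is a hypothetical helper, and the address, MAC, and interface name are placeholders.

```python
import ipaddress
import re
from typing import List

MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def static_neigh_command(addr: str, mac: str, dev: str) -> List[str]:
    """Build the argv for adding a permanent IPv4 neighbour entry."""
    ipaddress.IPv4Address(addr)            # raises ValueError on a bad address
    if not MAC_RE.match(mac):
        raise ValueError(f"invalid MAC address: {mac!r}")
    if not dev:
        raise ValueError("interface name must not be empty")
    return ["ip", "neigh", "add", addr, "lladdr", mac,
            "dev", dev, "nud", "permanent"]


cmd = static_neigh_command("192.0.2.1", "02:00:00:00:00:01", "eth0")
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a Linux host; doing so requires root privileges, so it is not executed here.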

Conclusion

While ARP is a mature and well-understood protocol, its role in networking is as critical as ever. Understanding the intricacies of ARP, including its implementation in Linux, its advanced features, and the potential security implications, equips network administrators and developers to build more secure and efficient networks.

As networking continues to evolve, particularly with the advent of IPv6 (which uses the Neighbour Discovery Protocol instead of ARP), the principles behind ARP will remain relevant to future protocols and practices. For IPv4-based networks, ARP will continue to play a pivotal role.

Mohith Kumar Thummaluru
