Modern storage stacks are expected to deliver more than just raw speed. They also need resilience, predictable failover, and efficient path management. In traditional SAN environments, multipathing has long been handled through SCSI-based mechanisms such as DM-Multipath in Oracle Linux. But with NVMe, especially in high-performance and low-latency environments, the model changes.

Native NVMe multipathing is designed to let hosts access the same NVMe namespace through multiple paths without relying on the older SCSI multipath stack. The result is a cleaner, NVMe-aware approach that better fits the protocol’s performance goals.

This post explains what native NVMe multipathing is, how it works, and why it matters for modern infrastructure.

What is multipathing?

Multipathing means a host can reach the same storage device through more than one physical or logical path. Those paths may exist because of multiple network links, multiple HBAs, multiple controllers, or multiple fabrics.

The main goals are:

  • High availability if one path fails
  • Better fault tolerance during maintenance or transient link issues
  • Optional load distribution across available paths

In the SCSI world, this has commonly been handled by userspace and device-mapper layers. In the NVMe world, the protocol and host drivers are built with a more native understanding of paths, controllers, and namespaces.

Why NVMe changes the picture

NVMe was designed for parallelism and low overhead. It avoids much of the legacy complexity that came with older storage protocols. Because of that, it also benefits from a path-management model that understands NVMe concepts directly.

An NVMe subsystem can expose the same namespace through multiple controllers. From the host’s perspective, those controllers may correspond to different routes to the same storage target. Instead of treating these as unrelated block devices that must be glued together later, native NVMe multipathing recognizes that they are alternate paths to the same namespace.

That matters because the host can make path decisions with protocol-level context instead of through a more generic abstraction.

What is native NVMe multipathing?

Native NVMe multipathing is the kernel’s built-in ability to manage multiple access paths to the same NVMe namespace. Rather than stacking a generic multipath layer on top, the NVMe host driver presents a single logical block device and handles path selection internally.

In practice, this means:

  • Multiple controllers can point to the same namespace
  • The host groups those paths under a single namespace head
  • I/O is routed using NVMe-aware policies
  • Path failures can be handled without exposing unnecessary complexity to applications

This is especially relevant for NVMe over Fabrics environments such as NVMe/TCP, NVMe/RDMA, or NVMe over Fibre Channel, where a host may connect to the same subsystem through multiple network or fabric paths.

At a high level, native NVMe multipathing works as follows:

  1. The host discovers multiple controllers for the same NVMe subsystem.
  2. Those controllers expose access to the same namespace.
  3. The kernel associates them under one logical namespace device.
  4. Path selection logic decides which path should carry I/O.
  5. If a path becomes unavailable, I/O can move to another valid path.

The important point is that the namespace remains the identity anchor. The host is not just merging random disks. It is recognizing that several transport paths lead to the same storage object.

Below is an example of native NVMe multipathing over a Fibre Channel fabric.

# nvme list-subsys
nvme-subsys2 - NQN=nqn.1992-08.com.netapp:sn.5f88dab0ec9a11eca063d039ea23c270:subsystem.nvme_2
\
 +- nvme0 fc traddr=nn-0x200ad039ea23c26f:pn-0x200cd039ea23c26f host_traddr=nn-0x200000109b9b1a6d:pn-0x100000109b9b1a6d live 
 +- nvme1 fc traddr=nn-0x200ad039ea23c26f:pn-0x200fd039ea23c26f host_traddr=nn-0x200000109b9b1a6d:pn-0x100000109b9b1a6d live 
 +- nvme3 fc traddr=nn-0x200ad039ea23c26f:pn-0x200bd039ea23c26f host_traddr=nn-0x200000109b9b1a6d:pn-0x100000109b9b1a6d live 
 +- nvme4 fc traddr=nn-0x200ad039ea23c26f:pn-0x200ed039ea23c26f host_traddr=nn-0x200000109b9b1a6c:pn-0x100000109b9b1a6c live 

There are four physical paths (nvme0, nvme1, nvme3 and nvme4) beneath the OS representation of the logical path, ‘nvme2’.
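
Counting paths says little about whether they are physically independent. The helper below is a minimal sketch for checking that: it counts the distinct host and target port addresses in ‘nvme list-subsys’ output. The function name port_diversity is hypothetical, and the sketch assumes the traddr= and host_traddr= fields are separated by spaces or commas, as in the listings in this post. Output formats differ between nvme-cli versions, so treat it as a starting point.

```shell
# port_diversity: read `nvme list-subsys` text on stdin and report how many
# distinct host (initiator) and target port addresses appear across all
# listed paths. Hypothetical helper, not part of nvme-cli.
port_diversity() {
    # Put every space- or comma-separated field on its own line first,
    # so both "traddr=... host_traddr=..." and "traddr=...,host_traddr=..."
    # layouts are handled the same way.
    tr ' ,' '\n\n' | awk '
        /^host_traddr=/ { host[$0] = 1; next }   # initiator port address
        /^traddr=/      { target[$0] = 1 }       # target port address
        END {
            nh = 0; nt = 0
            for (h in host) nh++
            for (t in target) nt++
            print "host ports: " (nh) ", target ports: " (nt)
        }'
}

# Typical use on a host with nvme-cli installed:
#   nvme list-subsys | port_diversity
```

For the Fibre Channel listing above, this reports two host ports and four target ports, so the four paths do not all share one initiator port.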

Path states and selection policies

One of the most important concepts here is ANA, or Asymmetric Namespace Access.

ANA: the NVMe equivalent of path state awareness

ANA allows an NVMe subsystem to tell the host which paths are optimized and which are still available but less preferred. This is similar in spirit to ALUA in the SCSI world, but it is designed for NVMe.

With ANA, paths may be reported as:

  • Optimized
  • Non-optimized
  • Inaccessible
  • Persistent loss
  • Change state

This helps the host make smarter routing decisions. Instead of blindly sending traffic down every path equally, it can prefer the best path and still keep alternates ready for failover. That improves both performance and recovery behavior.

ANA multipath is enabled by default on Oracle Linux. If it is not, it can be enabled by setting nvme_core.multipath to ‘Y’, either on the kernel command line or in the nvme_core module configuration file.
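
As a quick check, the current setting can be read from the nvme_core module parameter in sysfs, which reports Y or N. The sketch below wraps that check in a small function. The function name multipath_enabled and its optional path argument are hypothetical, and the file name under /etc/modprobe.d/ shown in the comments is a common convention rather than a requirement.

```shell
# multipath_enabled: succeed (exit status 0) if the nvme_core "multipath"
# parameter reads Y. The optional argument overrides the parameter path,
# which is useful for trying the function on a machine without NVMe.
multipath_enabled() {
    param=${1:-/sys/module/nvme_core/parameters/multipath}
    [ -r "$param" ] && [ "$(cat "$param")" = "Y" ]
}

# Typical use:
#   multipath_enabled && echo "native NVMe multipath is on"
#
# To enable it persistently, one of these is usually sufficient:
#   kernel command line:  nvme_core.multipath=Y
#   module option, e.g. in /etc/modprobe.d/nvme_core.conf:
#       options nvme_core multipath=Y
```

A module option takes effect only when nvme_core is next loaded. If the module is loaded from the initramfs, the initramfs may need to be rebuilt before the change applies.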

Running ‘nvme list-subsys’ against a namespace device shows the ANA state of each path.

# nvme list-subsys /dev/nvme4n10
nvme-subsys4 - NQN=nqn.1992-08.com.netapp:sn.5f88dab0ec9a11eca063d039ea23c270:subsystem.nvme_1 hostnqn=nqn.2014-08.org.nvmexpress:uuid:080020ff-ffff-ffff-ffff-0010e06fd6d4 iopolicy=round-robin
\
 +- nvme3 fc traddr=nn-0x200ad039ea23c26f:pn-0x200cd039ea23c26f,host_traddr=nn-0x200034800dee51ad:pn-0x210034800dee51ad live non-optimized
 +- nvme5 fc traddr=nn-0x200ad039ea23c26f:pn-0x200fd039ea23c26f,host_traddr=nn-0x200034800dee51ad:pn-0x210034800dee51ad live optimized
 +- nvme6 fc traddr=nn-0x200ad039ea23c26f:pn-0x200ed039ea23c26f,host_traddr=nn-0x200034800dee51ac:pn-0x210034800dee51ac live non-optimized
 +- nvme7 fc traddr=nn-0x200ad039ea23c26f:pn-0x200bd039ea23c26f,host_traddr=nn-0x200034800dee51ad:pn-0x210034800dee51ad live optimized
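
When scripting around this output, it helps to extract only the controllers that are both live and ANA-optimized. The following is a minimal sketch. The function name optimized_paths is hypothetical, and it assumes the controller state and ANA state are the last two fields of each path line, as in the listing above. Other nvme-cli versions may format this differently.

```shell
# optimized_paths: read `nvme list-subsys <namespace>` text on stdin and
# print the controllers whose path is live and ANA-optimized.
# Hypothetical helper; it relies on the last two fields of each path line
# being "<controller state> <ANA state>".
optimized_paths() {
    awk '/^ *[+]- nvme[0-9]+ / && $(NF-1) == "live" && $NF == "optimized" {
        print $2        # controller name, e.g. nvme5
    }'
}

# Typical use on a host with nvme-cli installed:
#   nvme list-subsys /dev/nvme4n10 | optimized_paths
```

For the listing above, this prints nvme5 and nvme7.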

The kernel uses different I/O policies to decide which path to use for a given I/O request.

The available policies include:

  • numa: The default policy, which prefers the path whose controller is closest to the Non-Uniform Memory Access (NUMA) node of the CPU issuing the I/O, keeping memory access latency low.
  • round-robin: Distributes I/O requests evenly across all available paths in sequence, which is effective for balanced workloads and homogeneous path performance.
  • queue-depth: Selects the path with the fewest pending I/O requests (lowest queue depth) to balance the load dynamically, which is effective under high, unpredictable loads.

The policy currently in effect for a subsystem can be read from sysfs:

# cat /sys/class/nvme-subsystem/nvme-subsys2/iopolicy
round-robin
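
The same sysfs file is writable, so the policy can be changed at runtime. Here is a minimal sketch of a wrapper that validates the policy name before writing it. The function name set_iopolicy and its optional third argument (an alternative sysfs directory, used for dry runs) are hypothetical. Whether queue-depth is accepted depends on the running kernel.

```shell
# set_iopolicy SUBSYS POLICY [SYSFS_DIR]
# Write POLICY to the iopolicy file of SUBSYS and print the value read back.
# SYSFS_DIR defaults to /sys/class/nvme-subsystem; overriding it is a
# hypothetical convenience for testing without NVMe hardware.
set_iopolicy() {
    subsys=$1
    policy=$2
    dir=${3:-/sys/class/nvme-subsystem}

    # Accept only the policy names discussed in this post.
    case $policy in
        numa|round-robin|queue-depth) ;;
        *) echo "unknown iopolicy: $policy" >&2; return 2 ;;
    esac

    file=$dir/$subsys/iopolicy
    [ -w "$file" ] || { echo "cannot write $file" >&2; return 1; }

    printf '%s\n' "$policy" > "$file" && cat "$file"
}

# Typical use, as root:
#   set_iopolicy nvme-subsys2 round-robin
```

A value written this way does not survive a reboot. A udev rule that sets the iopolicy attribute is one common way to make it persistent.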

Why use native NVMe multipathing instead of DM-Multipath?

This is a common question, especially in environments that already use DM-Multipath heavily.

The short answer is that native NVMe multipathing is usually a better fit for NVMe devices because it is protocol-aware and simpler in the I/O stack.

Key advantages include:

  • Less layering in the storage path
  • Better alignment with NVMe subsystem and namespace concepts
  • Cleaner integration with ANA
  • Lower operational complexity in many setups
  • Potentially lower overhead

DM-Multipath still has a place in mixed or legacy environments, but for native NVMe deployments, especially NVMe-oF, using the NVMe stack’s built-in multipathing often leads to a more natural design.

Operational benefits

For platform and storage teams, native NVMe multipathing offers several practical benefits.

  1. Higher availability
    If a cable, NIC, switch port, controller path, or fabric session drops, I/O can continue through another path. This reduces single points of failure and supports rolling maintenance with less disruption.

  2. Better protocol alignment
    The NVMe host stack understands controllers, namespaces, ANA states, and subsystem relationships. That makes troubleshooting and behavior more predictable than forcing NVMe into older SCSI-era models.

  3. Simpler architecture
    Fewer layers often mean easier debugging. When path management lives inside the NVMe stack, there is less translation between storage concepts and fewer moving pieces to keep aligned.

  4. Improved performance behavior
    By preferring optimized paths and reducing unnecessary abstraction, native multipathing can better preserve the low-latency benefits that made NVMe attractive in the first place.

Where native multipathing is most useful

Native NVMe multipathing is especially useful in:

  • NVMe over Fabrics deployments
  • Dual-controller storage systems
  • High-availability clusters
  • Performance-sensitive databases
  • Virtualization hosts with redundant fabric connectivity
  • Kubernetes or container platforms backed by NVMe-oF storage

Any environment that wants both redundancy and high throughput can benefit from understanding this capability.

Things to keep in mind

Native NVMe multipathing is not just a box to check. To get the expected behavior, a few things need to line up:

  • The storage subsystem must present the namespace correctly across controllers
  • The transport and fabric configuration must support redundant connectivity
  • The host kernel and NVMe tooling must support the desired multipath features
  • ANA behavior should be validated during failover testing
  • Monitoring should distinguish path health from namespace health

It is also important to test real failover scenarios instead of assuming path redundancy works because it is configured. Multipathing is only valuable if it behaves correctly under actual fault conditions.
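
During a failover test, a simple before-and-after comparison of path counts makes the result easier to judge. The sketch below classifies each subsystem as ok (all paths live), degraded (some paths lost, namespace still reachable), or down (no live path), which separates path health from namespace health. The function name path_health and the three labels are hypothetical, and the parsing assumes the ‘nvme list-subsys’ layout shown earlier in this post.

```shell
# path_health: read `nvme list-subsys` text on stdin and print one line per
# subsystem: "<subsystem> <ok|degraded|down> <live>/<total>".
# Hypothetical helper; the labels are this sketch's own convention.
path_health() {
    awk '
        function report() {
            if (subsys == "") return
            status = (live == 0) ? "down" : ((live < total) ? "degraded" : "ok")
            print subsys, status, live "/" total
        }
        /^nvme-subsys[0-9]+ / {          # header line starts a new subsystem
            report()
            subsys = $1; live = 0; total = 0
            next
        }
        /^ *[+]- nvme[0-9]+ / {          # one line per controller (path)
            total++
            for (i = 1; i <= NF; i++)
                if ($i == "live") { live++; break }
        }
        END { report() }
    '
}

# Typical use: run before and after pulling a cable or disabling a port.
#   nvme list-subsys | path_health
```

A subsystem that reports degraded while application I/O continues is the expected outcome of a single-path fault. A result of down points to a redundancy gap.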

A practical mindset for adoption

When teams migrate from legacy block storage to NVMe, they sometimes focus only on IOPS and latency. That is understandable, but incomplete. Performance without resilient pathing can still leave a platform fragile.

Native NVMe multipathing helps close that gap. It brings high availability into the same modernized storage stack, without forcing NVMe to inherit assumptions from older protocols. That makes it a key building block for production-grade NVMe deployments.

Conclusion

Native NVMe multipathing is more than a feature checkbox. It reflects a broader shift in storage architecture: letting modern protocols manage themselves on their own terms.

By handling multiple paths within the NVMe host stack, systems can make smarter path decisions, reduce overhead, and improve resilience. For teams building high-performance Linux platforms or NVMe-oF environments, understanding native multipathing is essential.

The big takeaway is simple: if you are deploying NVMe at scale, do not think only about speed. Think about paths, failover, and how the host understands the storage it is talking to. Native multipathing is a big part of making NVMe both fast and dependable.
