NIU Hybrid I/O

The NIU Hybrid I/O feature is one of the significant network features released with LDoms 1.1. It provides high bandwidth with low latency directly to a Guest domain. This technology is currently limited to platforms based on the UltraSPARC-T2 (Niagara II) CPU. More details below:

Hybrid I/O model explained:

The Hybrid I/O model combines direct (or physical) and virtualized I/O to allow flexible deployment of I/O resources to virtual machines (or Guest Domains). It is particularly useful when direct I/O does not provide full I/O functionality for a Virtual Machine (VM), or when direct I/O may not be persistently or consistently available to the VM (due to resource availability or VM migration). The hybrid I/O architecture is well suited to Niagara II's Network Interface Unit (NIU), a network I/O interface integrated on chip. It allows the dynamic assignment of DMA resources to virtual networking devices, thereby providing consistent performance to applications in the domain.

Hybrid I/O and Virtualized I/O differences:

In Virtualized I/O, network packets flow through another domain (the Service Domain), which sends and receives them on a physical NIC via direct I/O. In the case of LDoms networking, a Guest domain exchanges packets with the vswitch in a Service Domain, and the vswitch in turn sends and receives them via the physical NIC. The Virtualized I/O model is flexible, but performance suffers from the overhead of the intermediate domain that performs direct I/O to and from the physical device.

In Hybrid I/O, unicast packets are sent and received directly via physical DMA resources, while virtualized I/O is used for broadcast and multicast packets. That is, all unicast packets from the external network are received directly via the DMA resources assigned to the domain, but broadcast and multicast packets go through the vswitch in the Service domain. Hybrid I/O provides high bandwidth with low latency, and it is flexible because the assignment of DMA resources is dynamic, but it is a more complex solution.

Hybrid I/O specific details of NIU:

  • The Network Interface Unit (NIU) is integrated on chip in the UltraSPARC-T2 CPU.
  • Currently, two platforms ship with the UltraSPARC-T2 CPU: the T5120 and the T5220.
  • The NIU supports 2 ports. Both the T5120 and the T5220 have two slots for the NIU, one slot for each port. XAUI adapters need to be installed in order to gain access to the NIU ports.
  • Each NIU port can share up to 3 Hybrid mode resources, for a total of 6 Hybrid resources available to virtual network devices. That means up to 6 virtual network devices can have a hybrid resource assigned to them.

The following diagram shows the internal blocks of NIU Hybrid I/O for two Guest domains with Hybrid resources assigned.

Hybrid I/O diagram

  • NIU -- Network Interface Unit
  • RX/TX -- A pair of Receive and Transmit DMA channels
  • VR -- Virtual Region where DMA resources are grouped and assigned to a Guest domain.
  • HIO -- Hybrid I/O
  • VIO -- Virtualized I/O
  • Hybrid nxge -- Nxge driver operating in Hybrid mode
  • Vnet -- Virtual network device
  • Virtual switch -- A vswitch that provides communication to vnet devices

Hybrid I/O CLIs:

Hybrid I/O for a vnet is enabled or disabled via the 'mode' option of the vnet CLIs. If the mode is set to 'hybrid', Hybrid I/O is enabled for that vnet device; otherwise it is disabled. By default, 'hybrid' mode is disabled.

To enable hybrid mode, use either of the commands below (add-vnet for a new vnet, set-vnet for an existing one):

 primary# ldm add-vnet mode=hybrid vnet0 primary-vsw0 ldg1
 primary# ldm set-vnet mode=hybrid vnet0 ldg1

To disable hybrid mode, either add the vnet without the mode option or clear the mode on an existing vnet:

 primary# ldm add-vnet vnet0 primary-vsw0 ldg1
 primary# ldm set-vnet mode= vnet0 ldg1

Important Note: Setting "mode=hybrid" is treated only as a hint, or desired flag. Enabling hybrid mode does not guarantee that a hybrid resource is assigned, and no specific checks are made when hybrid mode is enabled for a vnet device. Actual assignment depends on several criteria, including the following:

    1. The vswitch to which the vnet device is connected is backed by one of the ports of the NIU.
    2. The system has the right level of Firmware, that is, a Firmware released with LDoms 1.1 or later.
    3. Both the Service domain (where the vswitch exists) and the Guest domain (where the vnet exists) run Solaris 10 10/08 (a few patches may be required) or later.
    4. A free hybrid resource (VR + DMA channels) is available. Note that each port of an NIU has only 3 hybrid resources available.
    5. The vnet interface in the Guest domain is plumbed.
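Criterion 4 is the one that is easiest to trip over when planning a configuration, so here is a minimal sketch of how it could be checked ahead of time. It assumes a hand-maintained inventory file in a made-up format (one vnet per line: domain, vnet, vswitch, mode); this is not ldm output, and the function name is hypothetical. The limit of 3 comes from the per-port figure above, with one vswitch backed by one NIU port.

```shell
#!/bin/sh
# Hypothetical inventory format (NOT ldm output), one vnet per line:
#   <domain> <vnet> <vswitch> <mode>
# where <mode> is "hybrid", or "-" when no mode is set.

HYBRID_PER_PORT=3   # from the text: each NIU port offers 3 hybrid resources

# Read the inventory on stdin and print "<vswitch> <count>" for every
# vswitch that has more hybrid-mode vnets than one NIU port can serve.
over_subscribed() {
    awk -v limit="$HYBRID_PER_PORT" '
        $4 == "hybrid" { count[$3]++ }
        END {
            for (vsw in count)
                if (count[vsw] > limit)
                    printf "%s %d\n", vsw, count[vsw]
        }
    ' | sort
}

# Example run with made-up data: primary-vsw0 has 4 hybrid vnets.
over_subscribed <<'EOF'
ldg1 vnet0 primary-vsw0 hybrid
ldg1 vnet1 primary-vsw0 hybrid
ldg2 vnet0 primary-vsw0 hybrid
ldg3 vnet0 primary-vsw0 hybrid
ldg3 vnet1 primary-vsw1 hybrid
ldg4 vnet0 primary-vsw1 -
EOF
```

A vswitch listed by this check can still be used; the extra vnets simply stay on virtualized I/O, since mode=hybrid is only a hint.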

Important points to note about NIU Hybrid I/O:

  1. Today, the Hybrid I/O feature is implemented only for the NIU, which is on chip in the UltraSPARC-T2 CPU. That means this feature is available on all platforms that have a Niagara II CPU. Note that it is not available on the UltraSPARC-T2+ CPU, which doesn't have an NIU on chip.
  2. The nxge driver is used for the NIU, and the same driver is used for other NICs such as the Neptune PCIe card. So, when the system has a mix of these devices, it is important to identify which nxge interfaces belong to NIU ports.
  3. This technology provides direct DMA for unicast packets only, so the performance improvement is seen only for unicast packets. Applications that rely on broadcast or multicast traffic may not benefit from this feature.
  4. The hybrid resource (shown as VR (NIU) in the diagram above) is assigned only when a vnet is plumbed. That means DMA resources are not wasted on a virtual network device that is not currently active.
  5. OBP doesn't use Hybrid I/O, so boot net type traffic in OBP doesn't benefit from Hybrid I/O.
  6. Guest Domain to Guest Domain communication within the same system does not use Hybrid I/O.
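For point 2, the device path recorded in /etc/path_to_inst is what tells NIU ports apart from other nxge devices. The following is a rough sketch, not an official tool: the function name is my own, and the PCIe path in the sample data is made up purely for illustration. It reads path_to_inst style lines on stdin and prints only the nxge instances whose path starts with /niu@.

```shell
#!/bin/sh
# Print "nxge<instance>" for every nxge entry whose device path begins
# with /niu@, i.e. an on-chip NIU port rather than another nxge NIC.
# Input: lines in /etc/path_to_inst format:  "<path>" <instance> "<driver>"
niu_nxge_instances() {
    awk '$3 == "\"nxge\"" && $1 ~ /^"\/niu@/ { print "nxge" $2 }'
}

# Example run. The two /niu@80 lines match the output shown in this post;
# the /pci@... line is a made-up stand-in for a PCIe nxge card.
niu_nxge_instances <<'EOF'
"/niu@80/network@0" 0 "nxge"
"/niu@80/network@1" 1 "nxge"
"/pci@0/pci@0/pci@9/network@0" 2 "nxge"
EOF

# On a real Service Domain the input would be the actual file:
#   niu_nxge_instances < /etc/path_to_inst
```

Only the instances printed here are candidates for the net-dev of a vswitch that is meant to serve hybrid mode vnets.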

Troubleshooting tips:

  1. Hybrid resource not assigned to a vnet device?
    • Identify the vswitch to which the vnet device connects and find its assigned net-dev. It should be an nxge device. Verify that the nxge device really belongs to the NIU and not to another nxge-supported device, as shown below. Run this grep in the Service Domain (where the vswitch is created). If you see '/niu@80' in front of the nxge instance you are using, then it is the right device to use.
      • Service# grep nxge /etc/path_to_inst 
        "/niu@80/network@0" 0 "nxge"
        "/niu@80/network@1" 1 "nxge"
    • Verify that the required Firmware and Software are installed on the system.
    • Verify that the mode is set to 'hybrid' for that vnet.
    • Verify that the vnet interface is plumbed in the Guest domain.
    • Verify that no more than 3 vnet devices per vswitch have 'hybrid' mode enabled.
  2. How to confirm that a Hybrid resource is assigned to a vnet device?
    • Run "kstat nxge" in the Guest domain. If you see any nxge kstats, that indicates a hybrid mode nxge instance has been created. If you have multiple vnet devices with hybrid mode enabled in the same Guest domain, verify that the "kstat nxge" output shows that many nxge instances.
    • The next Solaris 10 update release will have Hybrid I/O kstats to make monitoring more meaningful and easier.
  3. How to verify that unicast traffic is really going through the hybrid resource (nxge)?
    • The nxge kstats show the packets going through the hybrid nxge device.
      # kstat -p nxge:\*:\*:\*packets 
      nxge:0:RDC Channel 0 Stats:rdc_packets  20
      nxge:0:RDC Channel 1 Stats:rdc_packets  30
      nxge:0:TDC Channel 0 Stats:tdc_packets  20
      nxge:0:TDC Channel 1 Stats:tdc_packets  20
      • tdc_packets – packets sent via the TX DMA channel
      • rdc_packets – packets received by the RX DMA channel
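When a guest has several DMA channels, reading the per-channel counters by eye gets tedious. Here is a small sketch that totals them per direction from "kstat -p" style output. The function name is hypothetical, and the sample input is simply the output shown in tip 3; on a real Guest domain the counters will of course differ.

```shell
#!/bin/sh
# Sum the per-channel packet counters from "kstat -p" output on stdin.
# Each input line looks like:  nxge:0:RDC Channel 0 Stats:rdc_packets  20
# The value is always the last whitespace-separated field.
sum_hybrid_packets() {
    awk '
        /:rdc_packets[ \t]/ { rx += $NF }   # receive DMA channels
        /:tdc_packets[ \t]/ { tx += $NF }   # transmit DMA channels
        END { printf "rx=%d tx=%d\n", rx, tx }
    '
}

# Example run using the sample output from this post.
sum_hybrid_packets <<'EOF'
nxge:0:RDC Channel 0 Stats:rdc_packets  20
nxge:0:RDC Channel 1 Stats:rdc_packets  30
nxge:0:TDC Channel 0 Stats:tdc_packets  20
nxge:0:TDC Channel 1 Stats:tdc_packets  20
EOF

# In a Guest domain the input would come from kstat itself:
#   kstat -p nxge:\*:\*:\*packets | sum_hybrid_packets
```

Running it twice a few seconds apart while unicast traffic is flowing should show both totals growing if the hybrid resource is really in use; totals that stay flat suggest the traffic is still going through the vswitch.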

Nice Blog Raghuram - I'll add you to our bloggers roll in our LDoms community!

Posted by John F on December 27, 2008 at 03:10 PM PST #

Does this mean that we can now do link based IPMP within a LDOM?

Posted by Cliff Pearcy on January 22, 2009 at 02:10 AM PST #

Hi Cliff, while I am not sure whether hybrid I/O will pass the link failure (I suspect not as multi- and broad-cast traffic must still go via the vswitch, so the full semantics of a physical link are not there), there is work in progress in both LDom and Solaris 10 to propagate link failure out of band to mpathd, so a link failure will initiate IPMP action. The Solaris part is in Solaris 10 10/09. Don't know about the LDom part.


Posted by Steffen Weiberle on September 09, 2009 at 11:12 PM PDT #


Raghuram Kothakota

