Saturday Mar 29, 2014

Fibre Channel SR-IOV

OVM Server for SPARC 3.1.1 introduces support for a new class of SR-IOV devices: Fibre Channel. Fibre Channel SR-IOV is an exciting new feature that brings native Fibre Channel performance to Logical domains. There is no additional overhead from software layers, because the SR-IOV Virtual Functions are implemented in hardware and accessed directly by the Logical domains. This technology greatly increases the utilization of HBA adapters and reduces cost by decreasing the number of adapters required, which lowers energy consumption and, more importantly, the cost of FC switch ports, which is substantial. All of this is accomplished without the performance loss that is typical of virtualized I/O.

NOTE: The HBA adapter's hardware resources, including the FC port bandwidth, are divided among the Virtual Functions, so the total performance will not exceed what a single adapter can deliver; it is effectively shared among the Virtual Functions. VFs that are not performing I/O do not consume any bandwidth, so that bandwidth is available to the other VFs.

This feature is fully dynamic, like Ethernet SR-IOV. That is, you can create and destroy VFs dynamically without rebooting the Root domain, and you can dynamically add VFs to and remove them from Logical domains. A few constraints must be met to accomplish this; see later in this blog for those details. This feature is also fully compatible with our non-primary Root domains. That is, you can assign a PCIe bus to a Logical domain (known as a Root domain), create VFs from its Physical Functions, and then assign the VFs to IO domains. This provides a way to reduce single points of failure in a large deployment.
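
For example, a minimal sketch of the non-primary Root domain flow, assuming a PCIe bus pci_1 that has already been released from the primary domain, a Root domain named rootdom1, and an IO domain named ldg0 (all names are illustrative; enabling IOV and creating the VFs are covered in the steps later in this post):

# ldm add-io pci_1 rootdom1
# <enable IOV on the bus and create the VFs from the Root domain's FC PF as described below>
# ldm add-io /SYS/MB/PCIE7/IOVFC.PF0.VF0 ldg0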

The following is an example view of 3 VFs from one port of an FC HBA assigned to three different IO domains. 

FC SR-IOV Example View

Getting Started

1. Install the required software:

Install the FC SR-IOV card in the appropriate PCIe bus and slot in your system. The following document provides the list of adapters that support FC SR-IOV.

Oracle VM Server for SPARC PCIe Direct I/O and SR-IOV Features (Doc ID 1325454.1)

NOTE: Both Qlogic and Emulex cards are supported now, but only Ganymede-class (16Gb) adapters support this feature.

Ensure the required firmware is installed on your platform. You can find the firmware support information in the release notes at the following URL:

Install or update the OS in the Root domain to a version that supports FC SR-IOV. You can find the OS versions that support FC SR-IOV in the release notes at the following URL:

Solaris OS Version Requirements

Ensure that the LDoms Manager 3.1.1 software is installed in the control domain. Note that the LDoms Manager 3.1.1 software is installed automatically if Solaris 11.1 SRU17 or later is installed in the control domain. You can verify this with the "ldm -V" command.

NOTE: If you are using a Qlogic adapter, you need S11.2 installed in the Root domain that owns the corresponding PCIe bus.

Support for Solaris 10 in both Root domains and IO domains is also available; please check with your support contact for details. Using different OS versions (S11+ or S10) in the Root domain and in the IO domains is supported as well. A very common configuration we expect is a Root domain installed with an S11 version that supports the adapter you are using, and an IO domain running S10 with the patches that provide the VF driver for that adapter.

2. Update the FC HBA adapter Firmware

Update the FC HBA adapter Firmware to the version that supports FC SR-IOV.

The firmware for Emulex 16Gb HBAs can be found at:

Firmware for Qlogic 16Gb HBAs can be found at:

FW for Part No: 7101674

FW for Part No: 7101682

It is important to power cycle the system (strictly, it is the HBA that needs the power cycle) after updating the firmware.

3. FC Connection requirements

Ensure that the FC SR-IOV HBA is connected to a compatible FC switch that supports NPIV. It is very important to ensure that this condition is met. This feature is not supported if the FC port is connected directly to the storage.

4. Verify FC Physical Functions

Each FC HBA port shows up as one Physical Function. The FC PFs are named "IOVFC.PFx"; you can identify them in the output of the "ldm list-io" command. The following is example output:

# ldm ls-io
NAME                                      TYPE   BUS      DOMAIN   STATUS   
----                                      ----   ---      ------   ------   
pci_0                                     BUS    pci_0    primary  IOV      
pci_1                                     BUS    pci_1    primary  IOV      
niu_0                                     NIU    niu_0    primary           
niu_1                                     NIU    niu_1    primary           
/SYS/MB/PCIE0                             PCIE   pci_0    primary  OCC      
/SYS/MB/PCIE2                             PCIE   pci_0    primary  OCC      
/SYS/MB/PCIE4                             PCIE   pci_0    primary  EMP      
/SYS/MB/PCIE6                             PCIE   pci_0    primary  EMP      
/SYS/MB/PCIE8                             PCIE   pci_0    primary  EMP      
/SYS/MB/SASHBA                            PCIE   pci_0    primary  OCC      
/SYS/MB/NET0                              PCIE   pci_0    primary  OCC      
/SYS/MB/PCIE1                             PCIE   pci_1    primary  EMP      
/SYS/MB/PCIE3                             PCIE   pci_1    primary  OCC      
/SYS/MB/PCIE5                             PCIE   pci_1    primary  OCC      
/SYS/MB/PCIE7                             PCIE   pci_1    primary  OCC      
/SYS/MB/PCIE9                             PCIE   pci_1    primary  OCC      
/SYS/MB/NET2                              PCIE   pci_1    primary  OCC      
/SYS/MB/NET0/IOVNET.PF0                   PF     pci_0    primary           
/SYS/MB/NET0/IOVNET.PF1                   PF     pci_0    primary           
/SYS/MB/PCIE5/IOVNET.PF0                  PF     pci_1    primary           
/SYS/MB/PCIE5/IOVNET.PF1                  PF     pci_1    primary           
/SYS/MB/PCIE7/IOVFC.PF0                   PF     pci_1    primary           
/SYS/MB/PCIE7/IOVFC.PF1                   PF     pci_1    primary           
/SYS/MB/NET2/IOVNET.PF0                   PF     pci_1    primary           
/SYS/MB/NET2/IOVNET.PF1                   PF     pci_1    primary           

5. Understand the capabilities of each FC Physical Function

FC Physical Functions have only one capability detail: the maximum number of VFs they support. You can use "ldm list-io -l <pf-name>" to find this information. For example:

# ldm list-io -l /SYS/MB/PCIE7/IOVFC.PF0
/SYS/MB/PCIE7/IOVFC.PF0                   PF     pci_1    primary
[pci@500/pci@1/pci@0/pci@6/SUNW,emlxs@0]
    maxvfs = 8

6. Create the Virtual Functions (VFs)

We recommend creating all VFs in one step; this is an optimized way to create the VFs, and you can then use them as needed. There is no performance penalty if some VFs are unused. You can use the "ldm create-vf" command to accomplish this. For example:


# ldm create-vf -n max /SYS/MB/PCIE7/IOVFC.PF0
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF0
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF1
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF2
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF3
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF4
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF5
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF6
Created new vf: /SYS/MB/PCIE7/IOVFC.PF0.VF7


NOTE: If the IOV option is not enabled for the PCIe bus where the FC HBA is installed, the above command will fail. Enabling the IOV option is not a dynamic operation today, so you have to reboot the Root domain to accomplish it. If you have to reboot the Root domain to enable IOV, we recommend creating the VFs at the same time so that the VFs are available as soon as the reboot completes. This can be done with the following commands:

# ldm start-reconf <root domain>
# ldm set-io iov=on pci_X
# ldm create-vf -n max <PF-name>
# <reboot the root domain to effect the changes>

7. Understand VF WWN assignment

The LDoms Manager automatically assigns a Port-WWN and a Node-WWN to each FC VF. The auto-allocated WWNs are guaranteed to be unique only if all of the SPARC systems connected to a given SAN fabric are also connected to the same Ethernet multicast domain; if not, they may not be unique. Also, if you ever destroy and recreate the VFs, they may not get the same WWNs. Because you may use these WWNs for zoning or LUN masking, we recommend using manual WWN allocation. See the admin guide for more details.


You can manually set the WWNs using the following command. You can change the WWNs dynamically as long as that VF is not assigned to any domain.

# ldm set-io port-wwn=<Port WWN> node-wwn=<Node WWN> <vf-name>
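
For example, a hedged illustration using one of the VFs created earlier; the WWN values below are made-up placeholders, so substitute WWNs that are unique on your SAN:

# ldm set-io port-wwn=10:00:00:14:4f:f8:00:01 node-wwn=20:00:00:14:4f:f8:00:01 /SYS/MB/PCIE7/IOVFC.PF0.VF0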

8. Configuration of SAN Storage

Configure your SAN storage to assign LUNs to each VF. It is highly recommended to use LUN masking so that the LUNs are visible only to the VF to which they are assigned. This is no different from how LUNs are assigned to different HBAs on different systems. NOTE: One important point to notice here is that you can assign LUNs to IO domains such that they are not even visible in the Root domain. This provides the same level of secure access that you get with separate HBAs, which is not possible with virtual I/O methods.

9. Assigning VFs to Logical Domains

You can now assign VFs to Logical domains using the "add-io" command. For example, the following commands assign VFs to three different domains.

# ldm add-io /SYS/MB/PCIE7/IOVFC.PF0.VF0 ldg0
# ldm add-io /SYS/MB/PCIE7/IOVFC.PF0.VF1 ldg1
# ldm add-io /SYS/MB/PCIE7/IOVFC.PF0.VF2 ldg2

Make sure to set up the failure-policy settings to handle unexpected Root domain reboot or crash cases. The following failure-policy causes the IO domains (here ldg0, ldg1, and ldg2) to be reset along with the Root domain.

# ldm set-domain failure-policy=reset primary
# ldm set-domain master=primary ldg0
# ldm set-domain master=primary ldg1
# ldm set-domain master=primary ldg2 

NOTE: You can assign VFs dynamically too. That is, if the given Logical domain is running the required OS version, you can simply run the same commands to add the VFs dynamically. The IO domain OS requirements are the same as those mentioned for the Root domains, which you can find at:
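
For example, a sketch of adding a VF to a running IO domain and later removing it dynamically, reusing the VF and domain names from the examples above:

# ldm add-io /SYS/MB/PCIE7/IOVFC.PF0.VF3 ldg0
# ldm remove-io /SYS/MB/PCIE7/IOVFC.PF0.VF3 ldg0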

10. Using the VFs in IO domains

If you added the VFs statically, you can now start the IO domains and use the VFs like any FC HBA. For example, at the OBP prompt of the IO domain, you can run "probe-scsi-all" to see all LUNs visible to that VF. Installing the OS and booting from a LUN is fully supported. Features like MPxIO are fully supported; for example, you can assign VFs from different Root domains and configure MPxIO across them.
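
As a quick check from inside a running IO domain, a minimal sketch assuming Solaris 11 (mpathadm is the standard Solaris multipath administration tool; the expected path counts depend on your configuration):

# mpathadm list lu
# <each LUN should report the expected number of operational paths when VFs from multiple Root domains are configured>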

Caution: A reboot or crash of a Root domain will impact its IO domains. Having VFs from different Root domains does not increase the availability of the IO domains.

Documentation

You can find the detailed documentation in the OVM Server for SPARC 3.1.1 Admin Guide.

FC SR-IOV Limitations

There are a few limitations that need to be understood. These are:

  • No support when the HBA is connected to the storage directly.
  • No NPIV support on top of Virtual functions. NPIV on the physical function is supported as usual. 






Tuesday Aug 20, 2013

Virtual network performance greatly improved!

With the latest OVM Server for SPARC, virtual network performance has improved greatly. We are now able to drive line rate (9.x Gbps) on a 10Gbps NIC and up to 16Gbps for guest-to-guest communication. These numbers are achieved with the standard MTU (1500); there is no need to use Jumbo Frames to get this performance. This is made possible by introducing support for LSO (Large Send Offload) in our networking stack. The following graphs are from a SPARC T5-2 platform, with 2 cores assigned to the Control domain and to the Guest domains.

LDoms Virtual network performance graphs

Note: In general, for any network device, the performance numbers depend on the type of workload; the numbers above were obtained with an iperf workload and a message size of 8k, with the interface configured with the standard MTU.
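
For reference, a hedged sketch of the kind of iperf run described above (iperf 2 syntax; only the 8k message size comes from the text, and the parallel stream count and duration are illustrative choices):

# iperf -s
# iperf -c <receiver-IP> -l 8k -P 4 -t 60

Run the first command in the receiving domain and the second in the sending domain, replacing <receiver-IP> with the receiver's address.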

These improvements are available in S11.1 SRU9; the latest SRU is always recommended. S10 patches with the same improvements will be available very soon. We highly recommend using S11.1 in the Service domains.

What do you need to get this performance?

  • Install S11.1 SRU9 or later in both the Service domain (the domain that hosts the LDoms vsw) and the Guest domains. It is important that both the Service domain and the Guest domains are updated to get this performance.
    • S10 patches with equivalent performance are also available. The S10 patch 150031-07 is required in the S10 domain(s). Please contact Oracle support for any additional information.
  • Update to the latest system firmware available for your platform.
    • These performance numbers can be expected only on SPARC T4 and newer platforms.
  • Ensure that extended-mapin-space is set to on for both the Service domain and the Guest domains.
    • Note that OVM Server for SPARC 3.1 software and the associated firmware set extended-mapin-space to on by default, so this performance comes out of the box; in any case, confirm that it is set to on in all domains.
    • You can check this with the following command:

# ldm ls -l <domain-name> |grep extended-mapin
   extended-mapin-space=on
 
    • If extended-mapin-space is not set to on, you can set it with the following command. Note that changing extended-mapin-space triggers a delayed reconfiguration for the primary domain, which requires a reboot, and the Guest domains must be stopped before the change; see the sketch after this list.
# ldm set-domain extended-mapin-space=on <domain-name>

  • Ensure that sufficient CPU and memory resources are assigned to both the Service domain and the Guest domains. To drive 10Gbps or higher, a Guest domain needs to be configured accordingly; we recommend assigning 2 or more CPU cores and 4GB or more memory to each Guest domain. Because the Service domain also proxies the data for the Guest domains, it is very important to assign it sufficient CPU and memory resources as well; we recommend 2 or more CPU cores and 4GB or more memory for the Service domain.
  • No Jumbo Frames configuration is required; this performance improvement is available with the standard MTU (1500) as well. We introduced LSO support specifically to optimize performance for the standard MTU. In fact, we recommend avoiding Jumbo Frames unless you have a specific need.
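
As referenced above, a minimal sketch of applying extended-mapin-space, assuming the Service domain is the primary domain and a single Guest domain named ldg1 (the name is illustrative):

# ldm start-reconf primary
# ldm set-domain extended-mapin-space=on primary
# <reboot the primary domain>
# ldm stop-domain ldg1
# ldm set-domain extended-mapin-space=on ldg1
# ldm start-domain ldg1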

