Using Solaris and SPARC Networking and Virtualization

Using Aggregations and VLANs with LDoms and Zones


People often ask how to use link aggregations and VLANs with Oracle VM Server for SPARC (Logical Domains, or LDoms). My goal here is to give a brief description of, and the steps for, configuring a link aggregation in a Service Domain (in this case also the Control Domain) and then setting up different VLAN configurations.

I am showing this with Solaris 11.3, though the steps will work for any Solaris 11. Due to networking differences in Solaris 10, the principles will apply yet the steps will be different.

My Setup


I am using a T4-1 as the system to demonstrate the networking and LDom setup, a T5120 as the remote system on the network, and a Netgear GS716T Smart Switch between the two. The GS716T can do link aggregation, but not IEEE 802.3ad LACP. Solaris supports link aggregation with or without LACP and, since Solaris 11.1, also Datalink Multipathing (DLMP). The functionality and steps are almost identical; only some options differ when setting up the link aggregation.

I find it useful for my understanding to see not just command line input. I also like to see the output, and validation that steps I perform actually do something. When doing network configurations, I prefer to see network traffic. This session will include all of that. Some of the networking requires a second system, and I will show the setup of that as well. I hope this is useful for others.

The Remote System


For my network target, as it were, I am using a SPARC T5120 running Solaris 11.2. The actual release is not as important for this, as I am using only basic VLAN features.

The initial network configuration is as follows. The system has some other things on it that I am cutting from the output, as they are not relevant to this topic.

root@remote# dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net1 Ethernet up 1000 full e1000g1
net2 Ethernet up 1000 full e1000g2
net0 Ethernet up 1000 full e1000g0
net3 Ethernet up 1000 full e1000g3
root@remote# dladm show-link
LINK CLASS MTU STATE OVER
net1 phys 1500 up --
net2 phys 1500 up --
net0 phys 1500 up --
net3 phys 9000 up --
root@remote# dladm show-vlan
root@remote#

First I will create three VLANs that are configured on the switch, 111, 112, and 113.
 
root@remote# dladm create-vlan -l net3 -v 111 net3111
root@remote# dladm create-vlan -l net3 -v 112 net3112
root@remote# dladm create-vlan -l net3 -v 113 net3113
root@remote#
root@remote# dladm show-link
LINK CLASS MTU STATE OVER
net1 phys 1500 up --
net2 phys 1500 up --
net0 phys 1500 up --
net3 phys 9000 up --
net3111 vlan 9000 up net3
net3112 vlan 9000 up net3
net3113 vlan 9000 up net3
root@remote#
root@remote# dladm show-vlan
LINK VID SVID PVLAN-TYPE FLAGS OVER
net3111 111 -- -- ----- net3
net3112 112 -- -- ----- net3
net3113 113 -- -- ----- net3
root@remote#

If I had not set the data link name, in my case net3111 for the first one, Solaris would have used the old PPA (Physical Point of Attachment) standard that has been used in Solaris for a long time. They would have been net111003, net112003, and net113003. Those names require more typing. I do like naming where it is easy to recognize the data link the VLAN is on as well as the VLAN ID.
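
The two naming styles can be sketched as a small shell helper. This is only an illustration: the function names are hypothetical, the helpers just format strings and never call dladm, and the PPA rule used here (VLAN ID times 1000 plus the instance number) is inferred from the names listed above.

```shell
#!/bin/sh
# Hypothetical helpers that build VLAN data link names; they do not call dladm.

# Custom name: link name followed by the VLAN ID, e.g. net3 + 111 -> net3111.
vlan_custom_name() {
    printf '%s%s\n' "$1" "$2"
}

# PPA-style name: driver prefix followed by (VLAN ID * 1000 + instance),
# e.g. net3 + 111 -> net111003. Assumes the link name ends in digits.
vlan_ppa_name() {
    link=$1
    vid=$2
    inst=${link##*[!0-9]}      # trailing digits of the link name
    prefix=${link%"$inst"}     # everything before those digits
    printf '%s%d\n' "$prefix" $((vid * 1000 + inst))
}

for vid in 111 112 113; do
    echo "VLAN $vid on net3: $(vlan_custom_name net3 "$vid") or $(vlan_ppa_name net3 "$vid")"
done
```

The custom form keeps the base link visible at the start of the name, which is the property I care about when reading output.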

The next step is to put IP addresses on those VLANs. I use 192.168.VLAN.x as my subnet, and I set "x" to the host part of the IP address of the base system, in this case "1".

root@remote# ipadm create-ip net3111
root@remote# ipadm create-ip net3112
root@remote# ipadm create-ip net3113
root@remote#
root@remote# ipadm create-addr -a 192.168.111.1/24 net3111
net3111/v4
root@remote# ipadm create-addr -a 192.168.112.1/24 net3112
net3112/v4
root@remote# ipadm create-addr -a 192.168.113.1/24 net3113
net3113/v4
root@remote#
root@remote# ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 172.16.1.1/22
net3111/v4 static ok 192.168.111.1/24
net3112/v4 static ok 192.168.112.1/24
net3113/v4 static ok 192.168.113.1/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::214:4fff:feac:57c4/10
net0/v6 addrconf ok 2606:b400:602:c080:214:4fff:feac:57c4/64
root@remote#
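
Because the VLAN ID, the link name, and the subnet all follow one pattern, the commands above can be generated rather than typed. The following is a minimal sketch; plan_vlans is a hypothetical helper that only prints the commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Print (do not run) the VLAN and IP commands for a list of VLAN IDs,
# following the 192.168.<VLAN>.<host> scheme. plan_vlans is hypothetical.
plan_vlans() {
    link=$1     # base data link, e.g. net3
    host=$2     # host part of the address, e.g. 1
    shift 2
    for vid in "$@"; do
        vlan="${link}${vid}"
        echo "dladm create-vlan -l $link -v $vid $vlan"
        echo "ipadm create-ip $vlan"
        echo "ipadm create-addr -a 192.168.$vid.$host/24 $vlan"
    done
}

plan_vlans net3 1 111 112 113
```

Note that putting the VLAN ID into the third octet only works for VLAN IDs up to 255.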

The remote system setup is complete.

Creating a Link Aggregation


On the SPARC T4-1 running Solaris 11.3, I will first create an aggregation and test it in the Control/Service Domain. I will use interfaces 1 and 3 on the system, since those are using two different physical chips on the system motherboard. In production, they would likely be ports on two different NICs.
root@cdom# dladm create-aggr -l net1 -l net3 aggr1
root@cdom#
root@cdom# dladm show-aggr
LINK MODE POLICY ADDRPOLICY LACPACTIVITY LACPTIMER
aggr1 trunk L4 auto off short
root@cdom#
root@cdom# dladm show-aggr -P
LINK MODE POLICY ADDRPOLICY LACPACTIVITY LACPTIMER
aggr1 trunk L4 auto off short
root@cdom# dladm show-aggr -L
LINK PORT AGGREGATABLE SYNC COLL DIST DEFAULTED EXPIRED
aggr1 net1 no no no no no no
-- net3 no no no no no no
root@cdom# dladm show-aggr -x
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 1000Mb full up 0:21:28:d2:17:f9 --
net1 1000Mb full up 0:21:28:d2:17:f9 attached
net3 1000Mb full up 0:21:28:d2:17:fb attached
root@cdom#

I show different outputs of the dladm(1M) command here, and we'll see some differences later.
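
The aggregation above uses the defaults: trunk mode with LACP off, which is what my switch needs. As a hedged sketch of how the other variants mentioned earlier differ, the helper below prints the create-aggr command for each case. The function and the variant names are my own; the -L and -m options are the ones I understand dladm create-aggr to accept on Solaris 11.1 and later, so check dladm(1M) on your release before relying on them.

```shell
#!/bin/sh
# Print the dladm command for three aggregation variants.
# aggr_create_cmd and its variant names are hypothetical; nothing is executed.
aggr_create_cmd() {
    variant=$1
    aggr=$2
    shift 2
    links=""
    for l in "$@"; do
        links="$links -l $l"
    done
    case $variant in
        static) echo "dladm create-aggr$links $aggr" ;;             # trunk, LACP off
        lacp)   echo "dladm create-aggr -L active$links $aggr" ;;   # trunk with LACP
        dlmp)   echo "dladm create-aggr -m dlmp$links $aggr" ;;     # DLMP mode
        *)      echo "unknown variant: $variant" >&2; return 1 ;;
    esac
}

for v in static lacp dlmp; do
    aggr_create_cmd "$v" aggr1 net1 net3
done
```

The "static" line is the command I actually ran; the other two are shown only for comparison.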

The aggregation is on a private network on the Netgear switch, so a snoop will not show a lot of traffic. I will generate some traffic using ping, and I will be switching between the two systems for that.

root@remote# ping 192.168.111.101 2
no answer from 192.168.111.101
root@remote# ping 192.168.112.101 2
no answer from 192.168.112.101
root@remote# ping 192.168.113.101 2
no answer from 192.168.113.101
root@remote#

To keep output short, and make testing faster, I only sent two packets per IP address, since I know there is not going to be an answer. So what does this look like on the system with the aggregation?
root@cdom# snoop -d aggr1
Using device aggr1 (promiscuous mode)
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.1, 192.168.111.1 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.1, 192.168.112.1 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.101, 192.168.112.101 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.101, 192.168.112.101 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.101, 192.168.112.101 ?
VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.1, 192.168.113.1 ?
VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.101, 192.168.113.101 ?
VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.101, 192.168.113.101 ?
VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.101, 192.168.113.101 ?

First notice that each line includes the VLAN ID. This is a new feature in Solaris 11, and may have been back ported to a late update of Solaris 10 (I will have to check and come back to that.)

You can see the ARP requests for all three VLANs with the target address on each. This is why I like to have the VLAN ID and the subnet the same. I am beginning to notice this with some customers as well.
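
When there is more traffic than fits on a screen, a quick tally per VLAN is easier to read than the raw lines. This is a small sketch that counts snoop summary lines by their VLAN# prefix; count_by_vlan is a hypothetical helper and it reads captured text from standard input rather than running snoop itself.

```shell
#!/bin/sh
# Count snoop summary lines per VLAN ID. Lines without a VLAN# prefix are
# counted as "untagged". count_by_vlan is a hypothetical helper.
count_by_vlan() {
    awk '
        /^Using device/ { next }                      # snoop banner line
        /^VLAN#[0-9]+:/ {
            split($1, a, "#")                         # a[2] is e.g. "111:"
            sub(/:$/, "", a[2])
            count[a[2]]++
            next
        }
        NF { count["untagged"]++ }
        END { for (v in count) print v, count[v] }
    ' | sort
}

count_by_vlan <<'EOF'
Using device aggr1 (promiscuous mode)
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.101, 192.168.112.101 ?
VLAN#113: 192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.101, 192.168.113.101 ?
EOF
```

On a live system the input would come from a saved snoop session instead of the sample lines used here.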

My first test is to bring down one or both ports and see the changes in the aggregation and the network. After turning off the port of net1, this is how things look.

root@cdom# dladm show-aggr -x
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 1000Mb full up 0:21:28:d2:17:f9 --
net1 0Mb unknown down 0:21:28:d2:17:f9 standby
net3 1000Mb full up 0:21:28:d2:17:fb attached
root@cdom#
root@cdom# snoop -d aggr1
Using device aggr1 (promiscuous mode)
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.101, 192.168.111.101 ?
^Croot@cdom#

The aggregation stays up, and traffic continues to come into the system. Port status changes are also logged in /var/adm/messages. I don't see anything going to the console, however. Messages are limited because the aggregation is not plumbed and IP is not using it, even when both ports are down.
root@cdom# dladm show-aggr -x
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 0Mb unknown down 0:21:28:d2:17:f9 --
net1 0Mb unknown down 0:21:28:d2:17:f9 standby
net3 0Mb unknown down 0:21:28:d2:17:fb standby
root@cdom#
root@cdom# tail /var/adm/messages
...
Apr 21 11:00:07 gravity mac: [ID 486395 kern.info] NOTICE: igb3 link down
Apr 21 11:05:40 gravity mac: [ID 486395 kern.info] NOTICE: igb1 link down
Apr 21 11:05:40 gravity mac: [ID 486395 kern.info] NOTICE: aggr1 link down
root@cdom#
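
To pull only the link state changes out of a busy messages file, a short filter helps. The sketch below assumes the message layout shown above (a NOTICE from the mac module ending in "link up" or "link down"); link_events is a hypothetical helper that reads log text from standard input.

```shell
#!/bin/sh
# Print "<month> <day> <time> <link> <state>" for link up/down notices.
# link_events is a hypothetical helper; the line format is taken from the
# sample messages above.
link_events() {
    awk '/NOTICE: [^ ]+ link (up|down)$/ { print $1, $2, $3, $(NF-2), $NF }'
}

link_events <<'EOF'
Apr 21 11:00:07 gravity mac: [ID 486395 kern.info] NOTICE: igb3 link down
Apr 21 11:05:40 gravity mac: [ID 486395 kern.info] NOTICE: igb1 link down
Apr 21 11:05:40 gravity mac: [ID 486395 kern.info] NOTICE: aggr1 link down
EOF
```

On the Service Domain itself, the same filter could be fed with "link_events < /var/adm/messages".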

To better see network connectivity, I will create a VLAN and configure an IP address.
root@cdom# dladm create-vlan -l aggr1 -v 111 aggr1111
root@cdom# ipadm create-ip aggr1111
root@cdom# ipadm create-addr -a 192.168.111.5/24 aggr1111
aggr1111/v4
root@cdom#
root@cdom# ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.1.5/22
net4/v4 static ok 169.254.182.77/24
aggr1111/v4 static ok 192.168.111.5/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::221:28ff:fed2:17f8/10
net0/v6 addrconf ok 2606:b400:602:c080:221:28ff:fed2:17f8/64
root@cdom#

The network works, as seen from the remote system.
root@remote# ping 192.168.111.5 2
192.168.111.5 is alive
root@remote#

I will first bring one port down, then both.
root@cdom# dladm show-aggr -x
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 1000Mb full up 0:21:28:d2:17:f9 --
net1 1000Mb full up 0:21:28:d2:17:f9 attached
net3 1000Mb full up 0:21:28:d2:17:fb attached
root@cdom#
root@cdom# dladm show-aggr -x
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 1000Mb full up 0:21:28:d2:17:f9 --
net1 0Mb unknown down 0:21:28:d2:17:f9 standby
net3 1000Mb full up 0:21:28:d2:17:fb attached
root@cdom#
root@remote# ping 192.168.111.5 2
192.168.111.5 is alive
root@remote#
root@cdom# dladm show-aggr -x
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 0Mb unknown down 0:21:28:d2:17:f9 --
net1 0Mb unknown down 0:21:28:d2:17:f9 standby
net3 0Mb unknown down 0:21:28:d2:17:fb standby
root@cdom#
root@cdom# ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.1.5/22
net4/v4 static ok 169.254.182.77/24
aggr1111/v4 static inaccessible 192.168.111.5/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::221:28ff:fed2:17f8/10
net0/v6 addrconf ok 2606:b400:602:c080:221:28ff:fed2:17f8/64
root@cdom#
root@remote# ping 192.168.111.5 2
no answer from 192.168.111.5
root@remote#

With this I have created an aggregation, put a VLAN on it, and demonstrated what happens when one or both ports in the aggregation fail. The aggregation remains functional with one port failure and networking continues. The aggregation fails with both ports down, and the IP address shows as inaccessible.
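
A single failed port is easy to miss, precisely because networking keeps working. As a rough sketch, the filter below reads the text of "dladm show-aggr -x" and lists every port that is not attached. The helper name is hypothetical, and it relies on the column layout shown in the output above (seven fields on the aggregation row, six on each port row).

```shell
#!/bin/sh
# List aggregation ports whose PORTSTATE is not "attached".
# aggr_ports_not_attached is a hypothetical helper that parses the
# human-readable "dladm show-aggr -x" layout from standard input.
aggr_ports_not_attached() {
    awk '
        $1 == "LINK" { next }                 # header row
        NF == 7      { aggr = $1; next }      # aggregation summary row
        NF == 6 && $NF != "attached" { print aggr, $1, $NF }
    '
}

aggr_ports_not_attached <<'EOF'
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 1000Mb full up 0:21:28:d2:17:f9 --
net1 0Mb unknown down 0:21:28:d2:17:f9 standby
net3 1000Mb full up 0:21:28:d2:17:fb attached
EOF
```

An empty result means every port is attached; any output is worth a look even though the aggregation itself still reports up.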

Setting up the Virtual Switch in a Service Domain


I use the terms Service Domain and Control Domain to refer to the specific function I am doing or working on. On this system, there is only one Service Domain, and it is also the Control Domain. The concepts and steps I am outlining here apply also to second or redundant Service Domains when a system is configured with more than one.

This is an area where there are differences between Solaris 11 and Solaris 10 Service Domains. Oracle highly recommends that all Service Domains are running Solaris 11.

I will be creating a virtual switch on top of the aggr1 data link while the existing VLANs are already there. If this were Solaris 10, I'd likely remove them and if I need Service Domain access to the aggregation, I would use the virtual switch.

In Solaris 11, there is no need to set or modify the pvid and vid parameters on the virtual switch. If this were Solaris 10 and I wanted to get access to VLANs on the data link (in this case aggr1) I would need to set those.

Let us get started on the virtual switch.

root@cdom# ldm add-vsw net-dev=aggr1 primary-vsw1 primary
root@cdom#
root@cdom# ldm list-services
...
VSW
NAME LDOM MAC NET-DEV ID DEVICE LINKPROP DEFAULT-VLAN-ID PVID VID MTU MODE INTER-VNET-LINK
primary-vsw0 primary 00:14:4f:f9:b4:9f net0 0 switch@0 1 1 1500 on
primary-vsw1 primary 00:14:4f:f8:44:87 aggr1 1 switch@1 1 1 1500 on
...
root@cdom#
root@cdom# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 16 7680M 0.2% 0.2% 69d 17h 38m
host1 active -n---- 5000 8 4G 0.2% 0.2% 5d 21h 23m
root@cdom#

There is an existing Guest Domain on the system, and I will use that one to demonstrate the networking. I will do this in steps, to cover a range of LDom networking items. First item is to create a new virtual network device (vnet) for the Guest Domain. I show before and after.
guest@host1:~$ pfbash [1]
guest@host1:~$
guest@host1:~$ PS1="guest-pf@host1$ "
guest-pf@host1$
guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
guest-pf@host1$
guest-pf@host1$ dladm show-link
LINK CLASS MTU STATE OVER
net0 phys 1500 up --
guest-pf@host1$
root@cdom# ldm add-vnet linkprop=phys-state vnet1 primary-vsw1 host1
root@cdom#
guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
net1 Ethernet unknown 0 unknown vnet1
guest-pf@host1$

In the background, on the remote system, I have a steady ping(1) running on all three subnets. A snoop shows no traffic (I won't bother to show "nothing" here; how do I add a smiley?). So we have successfully added a virtualized network interface where the underlying data link is an aggregation.

[1] You may wonder what I did here. If you don't, skip to the next section.

I am using Solaris' Role Based Access Control feature, where this user has been given a lot of privileges. I could just do an su(1) to root. Instead, I am running as the user in the profile shell version of bash. Every command is then checked for authorization. This is easier than running "pfexec command" each time, which is the counterpart of "sudo command" for those familiar with sudo. The pfexec(1) command does not prompt for a password.

Starting to Work with VLANs


I keep the snoop running while I add a VLAN ID to the vnet, as a "vid", which means it will also show up in the Guest Domain with the VLAN tag.
root@cdom# ldm set-vnet vid=111 vnet1 host1
root@cdom#
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
^Cguest-pf@host1$

Not easy to show here, but almost immediately after running the set-vnet command snoop sees traffic on VLAN 111. Just as expected. The pings/ARPs on the other two VLANs are still not coming through, as they are not assigned to the vnet. Now I will add VLAN 112 also as a vid, and I will add VLAN 113 as a pvid. That means 113 traffic will come in with the VLAN tag removed.
root@cdom# ldm set-vnet vid=111,112 pvid=113 vnet1 host1
root@cdom#
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.6, 192.168.113.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.6, 192.168.113.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
192.168.113.1 -> (broadcast) ARP C Who is 192.168.113.6, 192.168.113.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
^Cguest-pf@host1$

As I expected, the VLAN 111 traffic continues, and then I see VLAN 112 also tagged (the VLAN#112: prefix) and the VLAN 113 traffic untagged. This is easy to see because of the third octet in each IP address.

Note that I can add or remove VLANs while the interface is running in the Guest Domain.
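
The vid and pvid behavior seen in the snoop output can be summarized as a tiny model. This is a simplified sketch of what the guest sees for a frame arriving on a given VLAN, not a description of how the virtual switch is implemented; classify_frame is a hypothetical helper.

```shell
#!/bin/sh
# classify_frame PVID VID_LIST FRAME_VLAN
# Simplified model of what a vnet passes to the guest:
#   frame on the pvid        -> delivered untagged
#   frame on one of the vids -> delivered with its VLAN tag
#   anything else            -> not delivered
classify_frame() {
    pvid=$1
    vids=$2      # comma-separated list, e.g. 111,112
    frame=$3
    if [ "$frame" = "$pvid" ]; then
        echo "untagged"
        return 0
    fi
    case ",$vids," in
        *",$frame,"*) echo "tagged" ;;
        *)            echo "dropped" ;;
    esac
}

# The configuration from above: vid=111,112 pvid=113
for v in 111 112 113 114; do
    echo "VLAN $v -> $(classify_frame 113 111,112 "$v")"
done
```

VLAN 114 is not configured anywhere in my setup; it is included only to show the case where nothing reaches the guest.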

Testing Link Failure with LDoms


The next step is to show how a failure of one or both ports in the aggregation affects the Guest Domain. I add an IP address in the Guest, and then turn off one, then both ports.
guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
net1 Ethernet down 0 unknown vnet1
guest-pf@host1$
guest-pf@host1$ ipadm create-ip net1
guest-pf@host1$ ipadm create-addr -a 192.168.113.6/24 net1
net1/v4
guest-pf@host1$
guest-pf@host1$ ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.1.6/22
net1/v4 static ok 192.168.113.6/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::214:4fff:fef9:fc75/10
net0/v6 addrconf ok 2606:b400:602:c080:214:4fff:fef9:fc75/64
guest-pf@host1$
guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
net1 Ethernet up 0 unknown vnet1
guest-pf@host1$
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 8405)
192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 8405)
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 8406)
192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 8406)
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
^Cguest-pf@host1$
guest-pf@host1$

After configuring the address, I can see ICMP echo and reply messages to that address. I chose to do this on VLAN 113, which is untagged; however, the same would work if I created a VLAN data link. I will show that in a later step.

I quickly tested the configuration with one port on the switch down, and in the guest it looks the same. Then I marked the second port down. This takes the aggregation down.

root@cdom# dladm show-aggr -x
LINK PORT SPEED DUPLEX STATE ADDRESS PORTSTATE
aggr1 -- 0Mb unknown down 0:21:28:d2:17:f9 --
net1 0Mb unknown down 0:21:28:d2:17:f9 standby
net3 0Mb unknown down 0:21:28:d2:17:fb standby
root@cdom#

So how does this look in the Guest Domain? You can see that it also sees the virtual network interface down. How did that happen? I'll explain shortly.
guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
net1 Ethernet down 0 unknown vnet1
guest-pf@host1$
guest-pf@host1$ ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.1.6/22
net1/v4 static inaccessible 192.168.113.6/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::214:4fff:fef9:fc75/10
net0/v6 addrconf ok 2606:b400:602:c080:214:4fff:fef9:fc75/64
guest-pf@host1$
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
^Cguest-pf@host1$
guest-pf@host1$

You may have noticed that when I set up the vnet in the Service Domain, I used an option "linkprop=phys-state". This LDom option uses an out of band protocol to pass the link state of the underlying data link to the guest. Without this, because there is a virtual switch between the physical data link or aggregation and the virtual network interface (vnet), the latter would not see a hardware failure. It can still communicate with other vnets on the same virtual switch. This link state propagation was added to LDoms a number of years ago.
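
With link state propagation on, a failed uplink shows up in the guest as an inaccessible address, which makes it easy to check for. The sketch below filters "ipadm show-addr" text for such addresses; inaccessible_addrs is a hypothetical helper and it reads the output from standard input.

```shell
#!/bin/sh
# Print address objects whose STATE is "inaccessible".
# inaccessible_addrs is a hypothetical helper that parses the
# "ipadm show-addr" layout shown above.
inaccessible_addrs() {
    awk '$1 != "ADDROBJ" && $3 == "inaccessible" { print $1, $4 }'
}

inaccessible_addrs <<'EOF'
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.1.6/22
net1/v4 static inaccessible 192.168.113.6/24
lo0/v6 static ok ::1/128
EOF
```

Without linkprop=phys-state the guest would keep reporting "ok" for net1/v4, so a check like this would stay silent even though the uplink is gone.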

To demonstrate, I will turn linkprop off, and then look at the interface in the Guest Domain.

root@cdom# ldm set-vnet linkprop="" vnet1 host1
root@cdom#
guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
net1 Ethernet up 0 unknown vnet1
guest-pf@host1$
guest-pf@host1$ ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.1.6/22
net1/v4 static ok 192.168.113.6/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::214:4fff:fef9:fc75/10
net0/v6 addrconf ok 2606:b400:602:c080:214:4fff:fef9:fc75/64
guest-pf@host1$

The Guest thinks the link is working. A snoop was completely quiet. I'll turn linkprop back on, and then enable the ports again to put everything into a working state. Behind the scenes I see my ping showing success on the remote system. Another validation that the network is working again.

Using Solaris Virtual Network Interfaces (VNICs) in LDoms


Most customers using LDoms are also using Solaris Zones. A key feature in Solaris 11 is network virtualization. This allows a user, or the Solaris Zones framework, to create individual virtual NICs (VNICs) for Zones, making consolidation much easier and making the Zones behave more as if they were different systems with their own networking hardware. Before moving on to Zones, I'd like to test this manually with a VNIC.

Let's give it a try.

guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
net1 Ethernet up 0 unknown vnet1
guest-pf@host1$
guest-pf@host1$ dladm show-phys -m
LINK SLOT ADDRESS INUSE CLIENT
net0 primary 0:14:4f:f9:fc:75 yes net0
1 0:14:4f:fb:a1:78 no --
2 0:14:4f:f8:f9:32 no --
3 0:14:4f:f9:ab:37 no --
4 0:14:4f:f8:1:93 no --
net1 primary 0:14:4f:f8:3e:e5 yes net1
guest-pf@host1$
guest-pf@host1$ dladm create-vnic -l net1 vnic11
dladm: vnic creation failed: operation not supported
guest-pf@host1$
guest-pf@host1$ dladm create-vnic -l net0 vnic1
guest-pf@host1$
guest-pf@host1$ dladm show-phys -m
LINK SLOT ADDRESS INUSE CLIENT
net0 primary 0:14:4f:f9:fc:75 yes net0
1 0:14:4f:fb:a1:78 yes vnic1
2 0:14:4f:f8:f9:32 no --
3 0:14:4f:f9:ab:37 no --
4 0:14:4f:f8:1:93 no --
net1 primary 0:14:4f:f8:3e:e5 yes net1
guest-pf@host1$

Oops. Creating a VNIC on net1 failed. Why is that? It turns out each VNIC needs its own MAC address, since it will have its own IP address on it; this is definitely the case in a Zone. However, the underlying "physical" interface, in this case a vnet, only has one MAC address. And while on an actual physical interface it is possible to add more MAC addresses through some device driver mechanics, this is not possible on a vnet.
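
The failure is visible in the "dladm show-phys -m" output if you count the unused MAC slots per link. Here is a minimal sketch that does the counting; free_mac_slots is a hypothetical helper, and it assumes the layout shown above, where rows that start a new link have five fields and the remaining slot rows have four.

```shell
#!/bin/sh
# Count unused MAC address slots per link from "dladm show-phys -m" text.
# free_mac_slots is a hypothetical helper reading from standard input.
free_mac_slots() {
    awk '
        $1 == "LINK" { next }                          # header row
        NF == 5 {                                      # first row of a link
            link = $1
            if (!(link in free)) { free[link] = 0; order[++n] = link }
            if ($4 == "no") free[link]++
        }
        NF == 4 { if ($3 == "no") free[link]++ }       # additional slot rows
        END { for (i = 1; i <= n; i++) print order[i], free[order[i]] }
    '
}

free_mac_slots <<'EOF'
LINK SLOT ADDRESS INUSE CLIENT
net0 primary 0:14:4f:f9:fc:75 yes net0
1 0:14:4f:fb:a1:78 yes vnic1
2 0:14:4f:f8:f9:32 no --
3 0:14:4f:f9:ab:37 no --
4 0:14:4f:f8:1:93 no --
net1 primary 0:14:4f:f8:3e:e5 yes net1
EOF
```

For the sample taken after vnic1 was created, this reports three free slots on net0 and none on net1, which matches the VNIC creation failing on net1.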

This is also why I chose to show VNICs outside of Zones. If we had gone straight to Zone creation and start-up, this failure might have been harder to track down.

Several years ago LDoms added a new feature to assign additional MAC addresses to a vnet. The property is called "alt-mac-addrs". It allows a fixed number of MAC addresses to be assigned to the vnet. Unfortunately, this vnet property cannot be set or changed while a Guest Domain is running. So I will shut the Guest down.

guest-pf@host1$ init 5
updating /platform/sun4v/boot_archive
guest-pf@host1$
root@cdom# ldm set-vnet alt-mac-addrs=auto,auto,auto,auto,auto,auto vnet1 host1
Please perform the operation while the LDom is bound or inactive
root@cdom#
root@cdom# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 16 7680M 1.3% 1.3% 69d 22h 15m
host1 active -n---- 5000 8 4G 0.1% 0.1% 3h 54m
root@cdom#
root@cdom# ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 16 7680M 0.5% 0.5% 69d 22h 16m
host1 bound ------ 5000 8 4G
root@cdom#
root@cdom# ldm set-vnet alt-mac-addrs=auto,auto,auto,auto,auto,auto vnet1 host1
root@cdom#
root@cdom# ldm start host1
LDom host1 started
root@cdom#

I show the error message from when I tried to change the vnet while the Guest Domain was running. Once it was stopped, the operation was successful. You may notice that I list the word "auto" six times. I am adding six MAC addresses to the vnet, and I am allowing each MAC address to be automatically generated. If I need to keep MAC addresses across configurations, I can set them explicitly.
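
Typing "auto" many times is error prone once the count grows. A small sketch that builds the list for a given number of addresses follows; alt_mac_list is a hypothetical helper, and the ldm command is only printed, not executed.

```shell
#!/bin/sh
# Build a comma-separated list of N "auto" entries for alt-mac-addrs.
# alt_mac_list is a hypothetical helper.
alt_mac_list() {
    n=$1
    list=""
    i=0
    while [ "$i" -lt "$n" ]; do
        list="${list:+$list,}auto"    # add a comma only between entries
        i=$((i + 1))
    done
    echo "$list"
}

# Print the command used above for six addresses on vnet1 of host1.
echo "ldm set-vnet alt-mac-addrs=$(alt_mac_list 6) vnet1 host1"
```

The count is the number of VNICs the vnet will be able to carry, so it is worth sizing it for the Zones planned in the guest.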

Once the Guest Domain is back up, I can see what things look like now.

guest-pf@host1$ dladm show-phys -m
LINK SLOT ADDRESS INUSE CLIENT
net0 primary 0:14:4f:f9:fc:75 yes net0
1 0:14:4f:fb:a1:78 yes vnic1
2 0:14:4f:f8:f9:32 no --
3 0:14:4f:f9:ab:37 no --
4 0:14:4f:f8:1:93 no --
net1 primary 0:14:4f:f8:3e:e5 yes net1
1 0:14:4f:fa:a6:5e no --
2 0:14:4f:f8:92:c0 no --
3 0:14:4f:f9:77:8c no --
4 0:14:4f:fb:d8:33 no --
5 0:14:4f:f8:50:1 no --
6 0:14:4f:fa:bc:2d no --
guest-pf@host1$

Here you see the six MAC addresses on the second interface. That is one reason I chose a number other than my typical four MACs.
This time the operation to create a VNIC on net1 should succeed.
guest-pf@host1$ dladm create-vnic -l net1 vnic11
guest-pf@host1$
guest-pf@host1$ dladm show-phys -m
LINK SLOT ADDRESS INUSE CLIENT
net0 primary 0:14:4f:f9:fc:75 yes net0
1 0:14:4f:fb:a1:78 yes vnic1
2 0:14:4f:f8:f9:32 no --
3 0:14:4f:f9:ab:37 no --
4 0:14:4f:f8:1:93 no --
net1 primary 0:14:4f:f8:3e:e5 yes net1
1 0:14:4f:fa:a6:5e yes vnic11
2 0:14:4f:f8:92:c0 no --
3 0:14:4f:f9:77:8c no --
4 0:14:4f:fb:d8:33 no --
5 0:14:4f:f8:50:1 no --
6 0:14:4f:fa:bc:2d no --
guest-pf@host1$
guest-pf@host1$ dladm show-vnic
LINK OVER SPEED MACADDRESS MACADDRTYPE IDS
vnic1 net0 0 0:14:4f:fb:a1:78 factory, slot 1 VID:0
vnic11 net1 0 0:14:4f:fa:a6:5e factory, slot 1 VID:0
guest-pf@host1$

Success indeed. I will get rid of the VNIC on net0 to simplify output.
guest-pf@host1$ dladm delete-vnic vnic1
guest-pf@host1$
guest-pf@host1$ dladm show-vnic
LINK OVER SPEED MACADDRESS MACADDRTYPE IDS
vnic11 net1 0 0:14:4f:fa:a6:5e factory, slot 1 VID:0
guest-pf@host1$

Before moving on to Zones, I want to show two things: creating an interface on a VLAN, and showing that full aggregation failure also propagates to the VNIC. There are two types of operations: on VLANs and on VNICs. When creating a VNIC I can specify a VLAN ID, so I can show both in a single operation.
guest-pf@host1$ dladm create-vnic -l net1 -v 111 vnic1111
guest-pf@host1$
guest-pf@host1$ dladm show-vnic
LINK OVER SPEED MACADDRESS MACADDRTYPE IDS
vnic11 net1 0 0:14:4f:fa:a6:5e factory, slot 1 VID:0
vnic1111 net1 0 0:14:4f:f8:92:c0 factory, slot 2 VID:111
guest-pf@host1$ dladm show-vlan
guest-pf@host1$
guest-pf@host1$ ipadm create-ip vnic1111
guest-pf@host1$ ipadm create-addr -a 192.168.111.6/24 vnic1111
vnic1111/v4
guest-pf@host1$
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> 192.168.111.6 ICMP Echo request (ID: 26167 Sequence number: 13612)
VLAN#111: 192.168.111.6 -> 192.168.111.1 ICMP Echo reply (ID: 26167 Sequence number: 13612)
192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 13601)
192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 13601)
VLAN#111: 192.168.111.6 -> (broadcast) ARP C Who is 192.168.111.6, 192.168.111.6 ?
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
VLAN#111: 192.168.111.1 -> 192.168.111.6 ICMP Echo request (ID: 26167 Sequence number: 13613)
VLAN#111: 192.168.111.6 -> 192.168.111.1 ICMP Echo reply (ID: 26167 Sequence number: 13613)
192.168.113.1 -> 192.168.113.6 ICMP Echo request (ID: 26169 Sequence number: 13602)
192.168.113.6 -> 192.168.113.1 ICMP Echo reply (ID: 26169 Sequence number: 13602)
VLAN#112: 192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
^Cguest-pf@host1$

Here I created a VNIC on top of net1 with VLAN ID 111. I can see those details with dladm(1M).

And snoop now shows that pings are working on 192.168.113.6 and now 192.168.111.6. Now I will disable both interfaces on the switch.

guest-pf@host1$ dladm show-phys
LINK MEDIA STATE SPEED DUPLEX DEVICE
net0 Ethernet up 0 unknown vnet0
net1 Ethernet down 0 unknown vnet1
guest-pf@host1$
guest-pf@host1$ dladm show-link
LINK CLASS MTU STATE OVER
net0 phys 1500 up --
net1 phys 1500 up --
vnic11 vnic 1500 up net1
vnic1111 vnic 1500 down net1
guest-pf@host1$
guest-pf@host1$ ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.1.6/22
net1/v4 static ok 192.168.113.6/24
vnic1111/v4 static inaccessible 192.168.111.6/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::214:4fff:fef9:fc75/10
net0/v6 addrconf ok 2606:b400:602:c080:214:4fff:fef9:fc75/64
guest-pf@host1$
guest-pf@host1$ snoop -d net1
Using device net1 (promiscuous mode)
^Cguest-pf@host1$

I am a bit stumped. The VNIC on net1 shows as down; however, the base interface does not. I see that both at the data link layer with dladm and at the IP layer with ipadm. I thought this might be a bug; however, Solaris network engineering says this is expected behavior when only one VNIC is up. The VNICs can still be used to communicate with each other, even though the underlying data link is down, as would be the case with any switch where the uplink is down: hosts can still communicate.

Note: I may come back to this later and update details.

Let us move on to Zones.

Using the LDoms and Solaris Zones Network Virtualization Features Together


Now I would like to combine all the features into creating a Zone. The Link Aggregation is being handled by the Service Domain. This is really convenient, as all LDoms and Zones will benefit from the increased availability of the aggregation. And since each VNIC has its own MAC address, inbound traffic that is hashed at Layer 2 may still have its load spread across the member links in the aggregation. Solaris' load spreading is at L4, using TCP or UDP headers, so it is already likely to spread.

I will not focus on the mechanics of creating a Solaris Zone here. Others and I have done that elsewhere. However, the network details of the Zone configuration are important to highlight.

guest-pf@host1$ zonecfg -z myzone info anet
anet:
        linkname: net0
        lower-link: net1
        allowed-address not specified
        configure-allowed-address: true
        defrouter not specified
        allowed-dhcp-cids not specified
        link-protection: mac-nospoof
        mac-address: auto
        mac-prefix not specified
        mac-slot not specified
        vlan-id not specified
        priority not specified
        rxrings not specified
        txrings not specified
        mtu not specified
        maxbw not specified
        bwshare not specified
        rxfanout not specified
        vsi-typeid not specified
        vsi-vers not specified
        vsi-mgrid not specified
        etsbw-lcl not specified
        cos not specified
        pkey not specified
        linkmode not specified
        evs not specified
        vport not specified
anet:
        linkname: net1
        lower-link: net1
        ...
        vlan-id: 111
        ...
anet:
        linkname: net2
        lower-link: net1
        ...
        vlan-id: 112
        ...
guest-pf@host1$

Each network section is started with "anet" for Automated Network. This feature in Solaris 11 will create a VNIC for each entry when the Zone boots, and will remove it when the Zone halts. This simplifies Zone networks and limits the privileges an administrator needs to those for Zone Configuration. The user "guest" has those privileges.

The link "net0" has the defaults, and is using the net1 interface. Since "vlan-id" is not specified, it will use the untagged interface, or VLAN 113.

The other two interfaces, net1 and net2 will use VLAN IDs 111 and 112, respectively.
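
For readers who want to script this, the two VLAN entries could come from a zonecfg command file. The contents of my myzone.cfg are not shown here, so the following Python sketch is only an assumption of what such a file might contain. The helper name anet_commands is hypothetical; the property names (linkname, lower-link, vlan-id) are the ones shown in the zonecfg output above.

```python
def anet_commands(linkname, lower_link, vlan_id=None):
    """Return zonecfg commands that add one anet resource (hypothetical helper)."""
    lines = [
        "add anet",
        f"set linkname={linkname}",
        f"set lower-link={lower_link}",
    ]
    if vlan_id is not None:
        # Leaving vlan-id unset keeps the VNIC on the untagged network.
        lines.append(f"set vlan-id={vlan_id}")
    lines.append("end")
    return lines


# The two tagged interfaces from the configuration shown above.
vlan_anets = [("net1", "net1", 111), ("net2", "net1", 112)]

script = "\n".join(
    line for spec in vlan_anets for line in anet_commands(*spec)
)
print(script)
```

The printed text could be saved to a file and applied with zonecfg -z myzone -f, as in the session below. If the Zone was created from the default template, the net0 entry usually exists already, so it would be modified rather than added.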

Because I did not give guest all Zone privileges, I perform a few operations here as root. User guest can start and stop Zones, and also log into the Zone.

guest-pf@host1$ su
Password:
root@host1:~#
root@host1:~# zonecfg -z myzone -f myzone.cfg
UX: /usr/sbin/usermod: guest is currently logged in, some changes may not take effect until next login.
root@host1:~#
root@host1:~# zoneadm -z myzone install -c myzone.xml
The following ZFS file system(s) have been created:
rpool/zones
rpool/zones/myzone
Progress being logged to /var/log/zones/zoneadm.20160421T225323Z.myzone.install
Image: Preparing at /zones/myzone/root.
Install Log: /system/volatile/install.1585/install_log
AI Manifest: /tmp/manifest.xml.P6aOed
SC Profile: /export/home/guest/myzone.xml
Zonename: myzone
Installation: Starting ...
Creating IPS image
Startup linked: 1/1 done
Installing packages from:
solaris
origin: http://172.16.1.1/
DOWNLOAD PKGS FILES XFER (MB) SPEED
Completed 279/279 48306/48306 354.4/354.4 1.6M/s
PHASE ITEMS
Installing new actions 66017/66017
Updating package state database Done
Updating package cache 0/0
Updating image state Done
Creating fast lookup database Done
Updating package cache 1/1
Installation: Succeeded
Note: Man pages can be obtained by installing pkg:/system/manual
done.
Done: Installation completed in 431.151 seconds.
Next Steps: Boot the zone, then log into the zone console (zlogin -C)
to complete the configuration process.
Log saved in non-global zone as /zones/myzone/root/var/log/zones/zoneadm.20160421T225323Z.myzone.install
root@host1:~#
root@host1:~# exit
exit
guest-pf@host1$
guest-pf@host1$ zoneadm -z myzone boot
guest-pf@host1$

I save myself a few steps with a System Configuration profile (the myzone.xml passed to zoneadm install with -c above) that sets the hostname, IP addresses, and the like, so I am not prompted for that information the first time the Zone boots.

guest-pf@host1$ zlogin myzone
[Connected to zone 'myzone' pts/2]
Last login: Thu Apr 21 19:15:02 2016 on pts/2
Oracle Corporation SunOS 5.11 11.3 February 2016
root@myzone:~#
root@myzone:~# dladm show-phys
root@myzone:~# dladm show-link
LINK CLASS MTU STATE OVER
net2 vnic 1500 up ?
net1 vnic 1500 up ?
net0 vnic 1500 up ?
root@myzone:~#
root@myzone:~# ipadm show-addr
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 static ok 192.168.113.16/24
lo0/v6 static ok ::1/128
net0/v6 addrconf ok fe80::214:4fff:fef8:5001/10
root@myzone:~#
root@myzone:~# ping 192.168.113.1
192.168.113.1 is alive
root@myzone:~#
root@myzone:~# snoop -d net0
Using device net0 (promiscuous mode)
^Croot@myzone:~#
root@myzone:~#
root@myzone:~# snoop -d net1
Using device net1 (promiscuous mode)
192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.2, 192.168.111.2 ?
192.168.111.1 -> (broadcast) ARP C Who is 192.168.111.2, 192.168.111.2 ?
^Croot@myzone:~#
root@myzone:~# snoop -d net2
Using device net2 (promiscuous mode)
192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
192.168.112.1 -> (broadcast) ARP C Who is 192.168.112.6, 192.168.112.6 ?
^Croot@myzone:~#

While the 113 VLAN on net0 is relatively quiet (all ping attempts are being answered and thus no broadcasting is going on), there is traffic visible on VLANs 111 and 112. What you may note here is that the VNICs are bringing data into the Zone without the VLAN tags. At this time only one VLAN ID can be set for a VNIC, so there is no need to bring in the tag, and leaving it out actually hides some network details and complexity from the Zone.
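
Conceptually, what the Zone sees is the frame with the 4-byte IEEE 802.1Q tag removed. The following Python sketch illustrates that idea only; it is not how Solaris implements VNICs, and the function name, MAC addresses, and placeholder payload are assumptions.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an IEEE 802.1Q tag


def strip_vlan_tag(frame: bytes, vlan_id: int):
    """Return the frame without its 802.1Q tag if it is on vlan_id, else None."""
    if len(frame) < 18:  # two MACs (12) + tag (4) + inner EtherType (2)
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q or (tci & 0x0FFF) != vlan_id:
        return None  # untagged, or tagged for a different VLAN
    # Keep destination and source MAC, drop the 4 tag bytes, keep the rest.
    return frame[:12] + frame[16:]


dst = bytes.fromhex("ffffffffffff")   # broadcast
src = bytes.fromhex("00144ff85001")   # hypothetical sender MAC
arp_ethertype = struct.pack("!H", 0x0806)
payload = b"\x00" * 28                # placeholder for an ARP body

tagged = dst + src + struct.pack("!HH", TPID_8021Q, 111) + arp_ethertype + payload
untagged = strip_vlan_tag(tagged, 111)

print(len(tagged), "bytes on the wire,", len(untagged), "bytes without the tag")
```

A VNIC configured for VLAN 112 would simply never see this frame, which corresponds to the second call returning nothing in a check like strip_vlan_tag(tagged, 112).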

The final item I want to show is the link failure as seen in the Zone.

root@myzone:~# dladm show-link
LINK CLASS MTU STATE OVER
net2 vnic 1500 down ?
net1 vnic 1500 up ?
net0 vnic 1500 up ?

Again, only one of the VNICs shows as down, not all of them. What does it look like in the Global Zone?

guest-pf@host1$ dladm show-link
LINK CLASS MTU STATE OVER
net0 phys 1500 up --
net1 phys 1500 up --
vnic11 vnic 1500 up net1
vnic1111 vnic 1500 up net1
myzone/net2 vnic 1500 down net1
myzone/net1 vnic 1500 up net1
myzone/net0 vnic 1500 up net1
guest-pf@host1$

Here, too, only one VNIC shows the link as down. You can also see another benefit of using the anet feature: each VNIC of the Zone is identified by a prefix with the Zone's name.
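
That naming convention makes it easy to pick out a Zone's links in a script. As a minimal sketch, the following Python snippet parses the dladm show-link output shown above (pasted in as a string) and reports the Zone VNICs that are down. The function name is hypothetical, and the parsing assumes the five-column layout from this session.

```python
SAMPLE = """\
LINK CLASS MTU STATE OVER
net0 phys 1500 up --
net1 phys 1500 up --
vnic11 vnic 1500 up net1
vnic1111 vnic 1500 up net1
myzone/net2 vnic 1500 down net1
myzone/net1 vnic 1500 up net1
myzone/net0 vnic 1500 up net1
"""


def down_zone_vnics(text):
    """Return (zone, link, lower-link) for each Zone VNIC whose state is down."""
    found = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or unexpected lines
        link, link_class, _mtu, state, over = fields
        # anet VNICs appear in the Global Zone as "<zonename>/<linkname>".
        if link_class == "vnic" and state == "down" and "/" in link:
            zone, name = link.split("/", 1)
            found.append((zone, name, over))
    return found


for zone, name, over in down_zone_vnics(SAMPLE):
    print(f"zone {zone}: {name} is down (over {over})")
```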

Wrapping Things Up


So we have gone over the following items:
  • Creating an aggregation in Solaris 11
  • Creating a VLAN on an aggregation
  • Showing what happens when link(s) fail
  • Creating an LDom virtual switch in a Solaris 11 Service Domain
  • Adding a virtual network (vnet) interface to an LDom Guest Domain
  • Configuring and testing VLANs on the vnet
  • Demonstrating link failure propagation with an LDom vnet
  • Creating a Solaris 11 VNIC in a Guest Domain
  • Showing how Zones use VNICs and VLANs

Wow, that was a lot of territory. I thought it took a while.

I hope it is useful for you!

Regards,

Steffen

Appreciations


Thanks to Nicolas Droux for a quick reply to my question on the VNIC behavior when the link is down, and his ongoing internal answers to my deeper Solaris networking questions.

Thanks to Jeff Savit for a quick review and editorial suggestions. He and I discussed the need for this topic several times.

Revision History

(Other than minor typographical changes)

2016.04.22: Posted

2016.04.21: Created

