Multi-node Solaris 11.2 OpenStack on SPARC Servers

In this blog post we will look at how to partition a single Oracle SPARC server running OVM Server for SPARC (LDoms) and configure multi-node OpenStack across the resulting domains.

If we are going to partition the server into multiple Root domains and, optionally, I/O domains (i.e., domains that own a PCIe device via Direct I/O rather than an SR-IOV VF),
then configuring Solaris OpenStack Havana on these domains is very similar to setting up OpenStack on multiple individual physical machines.

On the other hand, if we are going to partition the server into multiple domains such that each domain (other than the primary domain) uses either

   -- the virtual networking (VNET) service provided by the primary domain, OR
   -- an SR-IOV Virtual Function (VF),

then there are some networking constraints that dictate how these domains can be used to run OpenStack services and how they can be used as compute nodes to host zones. We will look into these constraints and see how we can use VXLAN tunneling technology to overcome them.

Note: For the purposes of this blog, any non-primary domain is a guest domain. It is assumed that the user is familiar with LDoms Virtual Networking, SR-IOV VFs, and Crossbow VNICs.

Networking Constraint

To support a solaris brand or solaris-kz brand zone inside a guest domain, or even just a VNIC inside a guest domain, the VNET device (or VF device) must be instantiated with several alternate MAC addresses (see here). If the device has only one MAC address, then VNIC creation fails as shown below:
   +-------------------------------------------------------------------------+
   |guest_domain_1# dladm show-phys net0                                     |
   |LINK              MEDIA                STATE      SPEED  DUPLEX    DEVICE|
   |net0              Ethernet             up         0      unknown   vnet0 |
   |guest_domain_1# dladm show-phys -m net0                                  |
   |LINK                SLOT     ADDRESS            INUSE CLIENT             |
   |net0                primary  0:14:4f:fb:37:a    yes   net0               |
   |guest_domain_1# dladm create-vnic -l net0 vnic0                          |
   |dladm: vnic creation failed: operation not supported                     |
   +-------------------------------------------------------------------------+

If the VNET device was added with several alternate MAC addresses, then one can create a VNIC:

   +--------------------------------------------------------------------------------+
   |guest_domain_1# dladm show-phys -m net1                                         |
   |LINK                SLOT     ADDRESS            INUSE CLIENT                    |
   |net1                primary  0:14:4f:fb:af:ed   no    --                        |
   |                    1        0:14:4f:fb:4c:8a   no    --                        |
   |                    2        0:14:4f:fb:ea:71   no    --                        |
   |                    3        0:14:4f:fa:e9:b8   no    --                        |
   |guest_domain_1# dladm create-vnic -l net1 vnic0                                 |
   |guest_domain_1# dladm show-vnic vnic0                                           |
   |LINK                OVER              SPEED  MACADDRESS        MACADDRTYPE VIDS |
   |vnic0               net1              0      0:14:4f:fb:4c:8a  factory, slot 1 0|
   +--------------------------------------------------------------------------------+
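For reference, the alternate MAC addresses themselves are assigned from the control domain when the virtual network device (or VF) is configured. Below is a minimal sketch of what that might look like; the device name, VF path, and number of alternate addresses are placeholders, not values from the setup described here:

   +------------------------------------------------------------------------------------+
   |# on the control domain: give the guest's vnet (or VF) extra MAC addresses           |
   |primary_domain# ldm set-vnet alt-mac-addrs=auto,auto,auto vnet1 guest_domain_1       |
   |primary_domain# ldm set-io alt-mac-addrs=auto,auto,auto /SYS/MB/NET0/IOVNET.PF0.VF0  |
   +------------------------------------------------------------------------------------+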

Note, however, that we cannot create a VNIC with an arbitrary MAC address: the MAC address must be one of the alternate MAC addresses. This is the constraint I was alluding to earlier. OpenStack Neutron, through Solaris EVS, assigns a random MAC address to each Neutron port. When a VM is launched inside the guest domain, it tries to create a VNIC with this random MAC address, and the zone fails to boot.

In the case of para-virtualized networking, guest domains transmit and receive packets through the primary domain's physical device. If the physical device in the primary domain is unaware of the MAC addresses used inside the guest domains, then the zones or VNICs using those random MAC addresses will not receive packets.

In the case of SR-IOV VFs, guest domains transmit and receive packets through the VF inside the guest domain. However, these VFs are pre-programmed with MAC addresses, and the guest cannot create VNICs with MAC addresses outside of that list.

Upstream OpenStack has resolved this issue for other hypervisors by re-creating the port at VM launch time, using one of the hypervisor's unused MAC addresses. However, this is not as straightforward in Solaris: instead of a list of MAC addresses per server, Solaris maintains a list of MAC addresses per device. We realize this is a gap, and we are working toward fixing it.

VXLAN (Virtual eXtensible LAN) to the rescue

VXLAN, or Virtual eXtensible LAN, is a tunneling mechanism that provides isolated virtual Layer 2 (L2) segments that can span multiple physical L2 segments. Since it is a tunneling mechanism, it uses IP (IPv4 or IPv6) as its underlying network, which means we can have isolated virtual L2 segments over networks connected by IP. This allows Virtual Machines (VMs) to be in the same L2 segment even if they are located on systems in different physical networks. For more info on VXLAN, do read this blog post.

VXLAN enables you to create VNICs with any MAC address on top of VXLAN datalinks; the packets from these VNICs are wrapped in IP packets that use the primary MAC address of the VNET or VF device. The inner MAC address plays no role in getting packets in and out of the guest domain.



In the above case, packets from the VNIC (vnic0) are wrapped in a UDP -> IP -> Ethernet packet before finally being delivered out of net0.
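To illustrate the point outside of OpenStack, here is a minimal sketch of manually creating a VXLAN datalink and putting a VNIC with an arbitrary MAC address on top of it (the link names, VNI, and MAC address are made up; EVS performs the equivalent steps automatically):

   +-------------------------------------------------------------------------+
   |# create a VXLAN datalink over the domain's IP interface (VNI 2000)      |
   |guest_domain_1# dladm create-vxlan -p addr=10.129.192.3,vni=2000 vxlan1  |
   |# a VNIC with an arbitrary MAC address now works over the VXLAN link     |
   |guest_domain_1# dladm create-vnic -m 2:8:20:aa:bb:cc -l vxlan1 vnic1     |
   +-------------------------------------------------------------------------+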

Basic requirements to use VXLAN 

(a) The IP interfaces of the primary domain and all the guest domains should be in the same subnet. This is not a hard requirement, but it avoids the need for multicast routing.


In the setup above, all the domains are part of the 10.129.192.0/24 subnet: 10.129.192.1 is the default gateway, the primary domain is assigned 10.129.192.2, and the guest domains guest_domain_1 and guest_domain_2 are assigned 10.129.192.3 and 10.129.192.4 respectively. The various VXLAN datalinks will be created on top of these IP interfaces. Note that one VXLAN datalink is created for each OpenStack network.
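For reference, each domain's IP interface could be configured along these lines (shown for guest_domain_1, assuming net0 is its VNET or VF datalink):

   +------------------------------------------------------------------------+
   |guest_domain_1# ipadm create-ip net0                                     |
   |guest_domain_1# ipadm create-addr -T static -a 10.129.192.3/24 net0/v4   |
   |guest_domain_1# route -p add default 10.129.192.1                        |
   +------------------------------------------------------------------------+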


(b) OpenStack services placement

Strictly speaking, only the OpenStack Neutron L3 agent needs to run in the primary domain; the rest of the OpenStack services can run in a guest domain. The Neutron L3 agent deals with infrastructure that needs VLANs, for example providing public (floating) IP addresses for tenants' VMs.

In the setup described below, the OpenStack services are placed as shown in the following list:
   +------------------------------+
   |Primary Domain:               |
   |  - Neutron server            |
   |  - Neutron L3 agent          |
   |  - Neutron DHCP agent        |
   |  - EVS controller            |
   |                              |
   |Guest Domain (guest_domain_1):|
   |  - Cinder services           |
   |  - Glance services           |
   |  - Nova services             |
   |  - Keystone services         |
   |  - Horizon services          |
   |                              |
   |Guest Domain (guest_domain_2):|
   |  - Nova compute              |
   +------------------------------+

Configuring OpenStack services on individual nodes

On the primary domain:

   - Modify the following options in /etc/neutron/neutron.conf
   +-------------------------------------------+
   |rabbit_host = 10.129.192.3                 |
   |auth_host = 10.129.192.3                   |
   |identity_uri = http://10.129.192.3:35357   |
   |auth_uri = http://10.129.192.3:5000/v2.0   |
   +-------------------------------------------+

   - Set the EVS controller to 10.129.192.2
   +-------------------------------------------------------------------------+
   |primary_domain# evsadm set-prop -p controller=ssh://evsuser@10.129.192.2 |
   +-------------------------------------------------------------------------+

     Copy neutron's, root's, and evsuser's public keys into /var/user/evsuser/.ssh/authorized_keys so that those users can
     do password-less ssh into 10.129.192.2 as evsuser.
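      One way to generate and install these keys is sketched below; the home directory paths shown are the Solaris 11.2
      defaults and may differ on your system.
   +--------------------------------------------------------------------------------------------+
   |primary_domain# su - evsuser -c "ssh-keygen -N '' -f /var/user/evsuser/.ssh/id_rsa -t rsa"   |
   |primary_domain# su - neutron -c "ssh-keygen -N '' -f /var/lib/neutron/.ssh/id_rsa -t rsa"    |
   |primary_domain# ssh-keygen -N '' -f /root/.ssh/id_rsa -t rsa                                 |
   |primary_domain# cat /var/user/evsuser/.ssh/id_rsa.pub /var/lib/neutron/.ssh/id_rsa.pub \     |
   |/root/.ssh/id_rsa.pub >> /var/user/evsuser/.ssh/authorized_keys                              |
   +--------------------------------------------------------------------------------------------+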

   - Set the following properties on the EVS controller
   +----------------------------------------------------------------+
   |primary_domain# evsadm set-controlprop -p l2-type=vxlan         |
   |primary_domain# evsadm set-controlprop -p uplink-port=net0      |
   |primary_domain# evsadm set-controlprop -p vxlan-range=2000-3000 |
   |primary_domain# evsadm set-controlprop -p vlan-range=1          |
   +----------------------------------------------------------------+

   - Enable the Solaris IP Filter feature (svcadm enable ipfilter)

   - Enable IP forwarding (ipadm set-prop -p forwarding=on ipv4)

   - Enable Neutron server (svcadm enable neutron-server)

   - Enable Neutron DHCP agent (svcadm enable neutron-dhcp-agent)
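   - Optionally, verify that the services came online and that forwarding is enabled (a quick check; svcs accepts the
     short service names used above)
   +-----------------------------------------------------------------+
   |primary_domain# svcs ipfilter neutron-server neutron-dhcp-agent  |
   |primary_domain# ipadm show-prop -p forwarding ipv4               |
   +-----------------------------------------------------------------+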

On guest_domain_1:

   - Delete the keystone service endpoint that says neutron is available on 10.129.192.3,
     and add a new service endpoint for neutron as shown in the following keystone command.

    guest_domain_1# set |grep OS_
    OS_AUTH_URL=http://10.129.192.3:5000/v2.0
    OS_PASSWORD=neutron
    OS_TENANT_NAME=service
    OS_USERNAME=neutron
   +---------------------------------------------------------------------------------+
   |guest_domain_1# keystone endpoint-create --region RegionOne \                   |
   |--service 4f49dea054b46cf6f83afff4a216aa13 --publicurl http://10.129.192.2:9696 \|
   |--adminurl http://10.129.192.2:9696 --internalurl http://10.129.192.2:9696       |
   +---------------------------------------------------------------------------------+
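      For the deletion step mentioned above, the existing neutron endpoint (the one pointing at 10.129.192.3) can be
      located and removed along these lines; the UUID is a placeholder:
   +------------------------------------------------------------+
   |guest_domain_1# keystone endpoint-list                      |
   |guest_domain_1# keystone endpoint-delete <old_endpoint_id>  |
   +------------------------------------------------------------+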

   - Set the EVS controller to 10.129.192.2
   +--------------------------------------------------------------------------+
   |guest_domain_1# evsadm set-prop -p controller=ssh://evsuser@10.129.192.2  |
   +--------------------------------------------------------------------------+
      Copy root's public key into 10.129.192.2:/var/user/evsuser/.ssh/authorized_keys so that root on this machine can ssh
      as evsuser into 10.129.192.2. This is needed by zoneadmd, which runs as root, to fetch EVS information from the EVS
      controller.
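      One way to do this is sketched below (the temporary file name is arbitrary); the same steps apply to guest_domain_2
      in the next section:
   +-----------------------------------------------------------------------------+
   |guest_domain_1# ssh-keygen -N '' -f /root/.ssh/id_rsa -t rsa                 |
   |guest_domain_1# scp /root/.ssh/id_rsa.pub root@10.129.192.2:/tmp/gd1.pub     |
   |primary_domain# cat /tmp/gd1.pub >> /var/user/evsuser/.ssh/authorized_keys   |
   +-----------------------------------------------------------------------------+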

On guest_domain_2:

   - Set the EVS controller to 10.129.192.2
   +-------------------------------------------------------------------------+
   |guest_domain_2# evsadm set-prop -p controller=ssh://evsuser@10.129.192.2 |
   +-------------------------------------------------------------------------+

      Copy root's public key into 10.129.192.2:/var/user/evsuser/.ssh/authorized_keys so that root on this machine can ssh
      as evsuser into 10.129.192.2. This is needed by zoneadmd, which runs as root, to fetch EVS information from the EVS
      controller.

Create Networks

Creating an internal network for tenant demo:
    guest_domain_1# set |grep OS_
    OS_AUTH_URL=http://10.129.192.3:5000/v2.0
    OS_PASSWORD=secrete
    OS_TENANT_NAME=demo
    OS_USERNAME=admin
   +---------------------------------------------------------------------------------+
   |guest_domain_1# neutron net-create eng_net                                       |
   |guest_domain_1# neutron subnet-create --name eng_subnet eng_net 192.168.10.0/24 |
   +---------------------------------------------------------------------------------+
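The UUIDs of the network and subnet just created (needed later when adding the router interface) can be listed with:

   +----------------------------------------+
   |guest_domain_1# neutron net-list        |
   |guest_domain_1# neutron subnet-list     |
   +----------------------------------------+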

Creating an external network for the service tenant:
    primary_domain# set |grep OS_
    OS_AUTH_URL=http://10.129.192.3:5000/v2.0
    OS_PASSWORD=neutron
    OS_TENANT_NAME=service
    OS_USERNAME=neutron
   +------------------------------------------------------------------------------+
   |primary_domain# neutron net-create --router:external=true ext_net \           |
   |--provider:network_type=vlan                                                  |
   |                                                                              |
   |primary_domain# neutron subnet-create --name ext_subnet --enable_dhcp=false \ |
   |ext_net 10.129.192.0/24                                                       |
   +------------------------------------------------------------------------------+

Creating a router and adding interfaces to it
   +------------------------------------------------------+
   |primary_domain# neutron router-create provider_router |
   +------------------------------------------------------+
       Copy the router UUID from the above output and set it to router_id in /etc/neutron/l3_agent.ini.
   +---------------------------------------------------------------------------------+
   |primary_domain# neutron router-gateway-set <router_uuid> <external_network_uuid> |
   |primary_domain# neutron router-interface-add <router_uuid> <internal_subnet_uuid>|
   +---------------------------------------------------------------------------------+
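        The l3_agent.ini change mentioned above amounts to a fragment like the following; the UUID is a placeholder:
   +--------------------------------------------------+
   |# /etc/neutron/l3_agent.ini (fragment)            |
   |[DEFAULT]                                         |
   |router_id = <router_uuid_from_above>              |
   +--------------------------------------------------+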

Enable the Neutron L3 agent
   +-----------------------------------------------+
   |primary_domain# svcadm enable neutron-l3-agent |
   +-----------------------------------------------+

        At this point, the following resources have been created in primary_domain; they are depicted in the diagram below.

   +--------------------------------------------------------------------------------+
   |primary_domain# dladm show-vxlan                                           |
   |LINK                ADDR                     VNI   MGROUP                     |
   |evs-vxlan2000       10.129.192.2            2000  224.0.0.1                   |
   |primary_domain# dladm show-vnic                                             |
   |LINK                OVER              SPEED  MACADDRESS        MACADDRTYPE VIDS |
   |ldoms-vsw0.vport0   net0              1000   0:14:4f:fb:37:a   fixed       0    |
   |evsb0abc182_2_0     evs-vxlan2000     1000   2:8:20:c9:ee:39   fixed       0    |
   |l3id27a4750_2_0     evs-vxlan2000     1000   2:8:20:af:d0:65   fixed       0    |
   |l3ec631ab64_2_0     net0              1000   2:8:20:32:84:94   fixed       0    |
   |primary_domain# ipadm                                                       |
   |NAME                 CLASS/TYPE STATE        UNDER      ADDR    |
   |evsb0abc182_2_0      ip         ok           --         --                    |
   |  evsb0abc182_2_0/v4 static     ok           --         192.168.10.2/24        |
   |l3ec631ab64_2_0      ip         ok           --         --    |
   |  l3ec631ab64_2_0/v4 static     ok           --         10.129.192.5/24        |
   |l3id27a4750_2_0      ip         ok           --         --                    |
   |  l3id27a4750_2_0/v4 static     ok           --         192.168.10.1/24        |
   |net0                 ip         ok           --         --                    |
   |  net0/v4            static     ok           --         10.129.192.2/24        |
   +--------------------------------------------------------------------------------+

Launch a VM

Launch a VM connected to the internal network. Once the VM is in the Active state, you will see the following resources created in guest_domain_1:
   +--------------------------------------------------------------------------------+
   |guest_domain_1# dladm show-vxlan                                             |
   |LINK                ADDR                     VNI   MGROUP                      |
   |evs-vxlan2000       10.129.192.3           2000  224.0.0.1                    |
   |guest_domain_1# dladm show-vnic                                               |
   |LINK                OVER              SPEED  MACADDRESS        MACADDRTYPE VIDS |
   |instance-00000005/net0 evs-vxlan2000  0      2:8:20:5b:ec:6b   fixed       0    |
   +--------------------------------------------------------------------------------+
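For reference, the VM could have been launched from the command line with something like the following; the image UUID, flavor, network UUID, and instance name are placeholders:

   +-------------------------------------------------------------+
   |guest_domain_1# nova boot --image <image_uuid> --flavor 1 \  |
   |--nic net-id=<eng_net_uuid> demo_vm                          |
   +-------------------------------------------------------------+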

From within the zone, you can ping the default gateway IP of 192.168.10.1, which is present in the primary domain. The diagram below shows the path taken by the ICMP packets.

   +---------------------------------------------------------------+
   |root@host-192-168-10-3:~# ping -s 192.168.10.1                 |
   |PING 192.168.10.1: 56 data bytes                               |
   |64 bytes from 192.168.10.1: icmp_seq=0. time=0.432 ms          |
   |64 bytes from 192.168.10.1: icmp_seq=1. time=0.452 ms          |
   |64 bytes from 192.168.10.1: icmp_seq=2. time=0.326 ms          |
   |^C                                                             |
   |----192.168.10.1 PING Statistics----                           |
   |3 packets transmitted, 3 packets received, 0% packet loss      |
   |round-trip (ms)  min/avg/max/stddev = 0.326/0.403/0.452/0.068  |
   +---------------------------------------------------------------+

Create and associate a Floating IP
   +-----------------------------------------------------------------------------+
   |guest_domain_1# neutron floatingip-create <external_network_uuid>            |
   |guest_domain_1# neutron floatingip-associate <floatingip_uuid> <VM_Port_UUID>|
   +-----------------------------------------------------------------------------+
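The floating IP and VM port UUIDs used above can be listed with:

   +---------------------------------------------+
   |guest_domain_1# neutron floatingip-list      |
   |guest_domain_1# neutron port-list            |
   +---------------------------------------------+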

Check the IP Filter and IP NAT rules on the primary domain:
   +------------------------------------------------------------------------+
   |primary_domain# ipadm show-addr l3ec631ab64_2_0/                      |
   |ADDROBJ           TYPE     STATE        ADDR                          |
   |l3ec631ab64_2_0/v4 static  ok           10.129.192.5/24                |
   |l3ec631ab64_2_0/v4a static ok           10.129.192.6/32                |
   |                                                                   |
   |primary_domain# ipfstat -io                                         |
   |empty list for ipfilter(out)                                          |
   |block in quick on l3id27a4750_2_0 from 192.168.10.0/24 to pool/11522149 |
   |                              |
   |primary_domain# ipnat -l      |
   |List of active MAP/Redirect filters:                                  |
   |bimap l3ec631ab64_2_0 192.168.10.3/32 -> 10.129.192.6/32              |
   |                                                                   |
   |List of active sessions:      |
   +------------------------------------------------------------------------+

Now the VM should be accessible from the external network as 10.129.192.6.

   +------------------------------------------------------------------------+
   |[gmoodalb@thunta:~]                                                  |
   |>ping -ns 10.129.192.6                                                |
   |PING 10.129.192.6 (10.129.192.6): 56 data bytes                         |
   |64 bytes from 10.129.192.6: icmp_seq=0. time=0.919 ms                   |
   |64 bytes from 10.129.192.6: icmp_seq=1. time=0.854 ms                   |
   |64 bytes from 10.129.192.6: icmp_seq=2. time=0.828 ms                   |
   |^C                                                                 |
   |----10.129.192.6 PING Statistics----                                    |
   |3 packets transmitted, 3 packets received, 0% packet loss               |
   |round-trip (ms)  min/avg/max/stddev = 0.828/0.867/0.919/0.047           |
   |[gmoodalb@thunta:~]                                                  |
   |>ssh root@10.129.192.6                                                |
   |Password:                                                            |
   |Last login: Fri Aug 22 21:32:38 2014 from 10.132.146.13                 |
   |Oracle Corporation      SunOS 5.11      11.2    June 2014               |
   |root@host-192-168-10-3:~# zonename                                      |
   |instance-00000005                                                    |
   |root@host-192-168-10-3:~#                                               |
   +------------------------------------------------------------------------+
