Wednesday Sep 24, 2014

Announcing General Availability of Oracle OpenStack for Oracle Linux

Today we announce the general availability of Oracle OpenStack for Oracle Linux 1.0. With this announcement we are now providing production OpenStack support to Oracle Linux and Oracle VM customers. The support for OpenStack is available as part of the Oracle Linux and Oracle VM Premier Support at no additional cost.

In December 2013 Oracle joined the OpenStack Foundation. In May, during the OpenStack conference in Atlanta, we released a technology preview based on the Havana release. Since then we have refreshed the tech preview to support Icehouse and continued improving it by simplifying the install process and fixing bugs. The GA release, available today, is based on the latest Icehouse release.

Our goal is to help make OpenStack an enterprise-grade solution, capable of operating an entire datacenter and all types of workloads while meeting requirements for reliability, security and performance. We would like OpenStack to become a first-class solution and the cloud operating system of choice for customers of any size. One of the ways Oracle can help make this a reality is to test and enable demanding enterprise workloads to run on OpenStack. Making sure that those workloads can run on OpenStack in a reliable and supported way is an important step in making OpenStack an enterprise-grade, general-purpose cloud management solution. With our experience in deploying and managing enterprise workloads in virtual environments we are in a good position to drive such enabling activity forward, and that is what we plan to do.

Some resources and pointers to learn more about Oracle OpenStack for Oracle Linux:

  • Oracle OpenStack for Oracle Linux (aka O3L) is available for download from the Oracle Linux public yum repository at http://public-yum.oracle.com
  • The Oracle Technical Network (OTN) page for O3L is located here and includes documentation, links and a VirtualBox VM which has O3L fully installed, configured and ready to use on your own laptop
  • More information can also be found at http://www.oracle.com/goto/openstack, including a data sheet, an interview with Wim Coekaerts and more

Next week is Oracle OpenWorld, where we will showcase Oracle OpenStack for Oracle Linux 1.0 and hold sessions and a hands-on lab to discuss OpenStack. You are welcome to join us.

We discuss the upcoming OpenWorld sessions here.

@RonenKofman

Monday Aug 18, 2014

Running OpenStack Icehouse with ZFS Storage Appliance

A couple of months ago Oracle announced support for an OpenStack Cinder plugin for the ZFS Storage Appliance (aka ZFSSA). With our recent release of the Icehouse tech preview, I thought it was a good opportunity to demonstrate the ZFSSA plugin working with Icehouse.

One thing that helps a lot in getting started with ZFSSA is that it has a VirtualBox simulator. This simulator allows users to try out the appliance’s features before getting to a real box. Users can test the functionality and design an environment even before they have a real appliance, which makes the deployment process much more efficient. With OpenStack this is especially nice because having a simulator on the other end allows us to test the complete feature set of the Cinder plugin and check the entire integration on a single server or even a laptop. Let’s see how this works.

Installing and Configuring the Simulator

To get started we first need to download the simulator, which is available here. Unzip it and it is ready to be imported into VirtualBox. If you do not already have VirtualBox installed you can download it from here for your platform of choice.

To import the simulator, go to File -> Import Appliance in the VirtualBox console, navigate to the location of the simulator and import the virtual machine.

When opening the virtual machine you will need to make the following changes:

- Network – by default the network is “Host Only”; change that to “Bridged” so the VM can connect to the network and be accessible.

- Memory (optional) – the VM comes with a default of 2560MB, which may be fine, but more memory will not hurt if you have it. In my case I decided to give it 8192MB.

- vCPU (optional) – by default the VM comes with 1 vCPU. I decided to change it to two; you are welcome to do so too.

And here is what the VM looks like:

Start the VM. When the boot process completes we will need to change the root password, and then the simulator is running and ready to go.

Now that the simulator is up and running we can access the simulated appliance using the URL https://<IP or DNS name>:215/. The IP is shown on the virtual machine console. At this stage we will need to configure the appliance. In my case I did not change any of the defaults (in other words, I pressed ‘commit’ several times) and the simulated appliance was configured and ready to go.

We will need to enable REST access, otherwise Cinder will not be able to call the appliance. We do that in Configuration -> Services: at the end of the page there is a ‘REST’ button, enable it. If you are a more advanced user you can set additional features in the appliance, but for the purpose of this demo this is sufficient. One final step is to create a pool: go to Configuration -> Storage and add a pool as shown below. The pool is named “default”:


The simulator is now running, configured and ready for action.

Configuring Cinder

Back to OpenStack. I have a multi-node deployment which we created according to the “Getting Started with Oracle VM, Oracle Linux and OpenStack” guide using the Icehouse tech preview release. Now we need to install and configure the ZFSSA Cinder plugin using the README file. In short, the steps are as follows:

1. Copy the files from here to the control node and place them in: /usr/lib/python2.6/site-packages/cinder/volume/drivers/zfssa

2. Configure the plugin, editing /etc/cinder/cinder.conf

# Driver to use for volume creation (string value)

#volume_driver=cinder.volume.drivers.lvm.LVMISCSIDriver

volume_driver=cinder.volume.drivers.zfssa.zfssaiscsi.ZFSSAISCSIDriver

zfssa_host = <HOST IP>

zfssa_auth_user = root

zfssa_auth_password = <ROOT PASSWORD>

zfssa_pool = default

zfssa_target_portal = <HOST IP>:3260

zfssa_project = test

zfssa_initiator_group = default

zfssa_target_interfaces = e1000g0

3. Restart the cinder-volume service: service openstack-cinder-volume restart

4. Look at the log file; it will tell us whether everything works well so far. If you see any errors, fix them before continuing.

5. Install the iscsi-initiator-utils package. This is important since the plugin uses iSCSI commands from this package:

yum install -y iscsi-initiator-utils

The installation and configuration are very simple. We do not need to have a “project” in the ZFSSA but we do need to define a pool.
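Before restarting the service it can be handy to check that none of the options from step 2 were forgotten. Here is a minimal sketch of such a check. It assumes the options live in the [DEFAULT] section of cinder.conf, and the helper name missing_zfssa_options and the 192.0.2.10 address are made up for illustration; only the option names come from the snippet above.

```python
import configparser

# Option names taken from the cinder.conf snippet in step 2.
REQUIRED_OPTIONS = [
    "volume_driver",
    "zfssa_host",
    "zfssa_auth_user",
    "zfssa_auth_password",
    "zfssa_pool",
    "zfssa_target_portal",
    "zfssa_project",
    "zfssa_initiator_group",
    "zfssa_target_interfaces",
]


def missing_zfssa_options(conf_text, section="DEFAULT"):
    """Return the required option names that are absent or empty.

    Assumption: the options are set in the given section (DEFAULT here).
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    values = parser[section]
    return [name for name in REQUIRED_OPTIONS
            if not values.get(name, "").strip()]


# Sample configuration with a placeholder host; the password line is
# left out on purpose so the check has something to report.
SAMPLE = """
[DEFAULT]
volume_driver = cinder.volume.drivers.zfssa.zfssaiscsi.ZFSSAISCSIDriver
zfssa_host = 192.0.2.10
zfssa_auth_user = root
zfssa_pool = default
zfssa_target_portal = 192.0.2.10:3260
zfssa_project = test
zfssa_initiator_group = default
zfssa_target_interfaces = e1000g0
"""

print(missing_zfssa_options(SAMPLE))
```

To run it against a real file, read /etc/cinder/cinder.conf into a string and pass that in instead of SAMPLE.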

Creating and Using Volumes in OpenStack

We are now ready to work. To get started, let’s create a volume in OpenStack and see it show up on the simulator:

#  cinder create 2 --display-name my-volume-1

+---------------------+--------------------------------------+

|       Property      |                Value                 |

+---------------------+--------------------------------------+

|     attachments     |                  []                  |

|  availability_zone  |                 nova                 |

|       bootable      |                false                 |

|      created_at     |      2014-08-12T04:24:37.806752      |

| display_description |                 None                 |

|     display_name    |             my-volume-1              |

|      encrypted      |                False                 |

|          id         | df67c447-9a36-4887-a8ff-74178d5d06ee |

|       metadata      |                  {}                  |

|         size        |                  2                   |

|     snapshot_id     |                 None                 |

|     source_volid    |                 None                 |

|        status       |               creating               |

|     volume_type     |                 None                 |

+---------------------+--------------------------------------+

In the simulator:

Extending the volume to 5G:

# cinder extend df67c447-9a36-4887-a8ff-74178d5d06ee 5

In the simulator:

Creating templates using Cinder Volumes

By default OpenStack supports ephemeral storage, where an image is copied into the run area during instance launch and deleted when the instance is terminated. With Cinder we can create persistent storage and launch instances from a Cinder volume. Booting from a volume has several advantages, and one of the main ones is speed. No matter how large the volume is, the launch operation is immediate: there is no copying of an image to a run area, an operation which can take a long time when using ephemeral storage (depending on image size).

In this deployment we have a Glance image of Oracle Linux 6.5 which I would like to make into a volume I can boot from. When creating a volume from an image we actually “download” the image into the volume and make the volume bootable. This process can take some time depending on the image size. During the download we will see the following status:

# cinder create --image-id 487a0731-599a-499e-b0e2-5d9b20201f0f --display-name ol65 2

# cinder list

+--------------------------------------+-------------+--------------+------+-------------+

|                  ID                  |    Status   | Display Name | Size | Volume Type | …

+--------------------------------------+-------------+--------------+------+-------------+

| df67c447-9a36-4887-a8ff-74178d5d06ee |  available  | my-volume-1  |  5   |     None    | …

| f61702b6-4204-4f10-8bdf-7da792f15c28 | downloading |     ol65     |  2   |     None    | …

+--------------------------------------+-------------+--------------+------+-------------+

After the download is complete we will see that the volume status changed to “available” and that the bootable state is “true”.

We can use this new volume to boot an instance from or we can use it as a template. Cinder can create a volume from another volume and ZFSSA can replicate volumes instantly in the back end. The result is an efficient template model where users can spawn an instance from a “template” instantly even if the template is very large in size.

Let’s try replicating the bootable volume that holds Oracle Linux 6.5, creating 3 additional bootable volumes:

# cinder create 2 --source-volid f61702b6-4204-4f10-8bdf-7da792f15c28 --display-name ol65-bootable-1

# cinder create 2 --source-volid f61702b6-4204-4f10-8bdf-7da792f15c28 --display-name ol65-bootable-2

# cinder create 2 --source-volid f61702b6-4204-4f10-8bdf-7da792f15c28 --display-name ol65-bootable-3

# cinder list

+--------------------------------------+-----------+-----------------+------+-------------+----------+-------------+

|                  ID                  |   Status  |   Display Name  | Size | Volume Type | Bootable | Attached to |

+--------------------------------------+-----------+-----------------+------+-------------+----------+-------------+

| 9bfe0deb-b9c7-4d97-8522-1354fc533c26 | available | ol65-bootable-2 |  2   |     None    |   true   |             |

| a311a855-6fb8-472d-b091-4d9703ef6b9a | available | ol65-bootable-1 |  2   |     None    |   true   |             |

| df67c447-9a36-4887-a8ff-74178d5d06ee | available |   my-volume-1   |  5   |     None    |  false   |             |

| e7fbd2eb-e726-452b-9a88-b5eee0736175 | available | ol65-bootable-3 |  2   |     None    |   true   |             |

| f61702b6-4204-4f10-8bdf-7da792f15c28 | available |       ol65      |  2   |     None    |   true   |             |

+--------------------------------------+-----------+-----------------+------+-------------+----------+-------------+
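When scripting around the CLI it helps to turn a table like the one above into something we can filter. The following is a small sketch, not part of any OpenStack client: parse_cli_table is a hypothetical helper that reads the ASCII table layout shown above, and the sample text is the cinder list output copied from this post.

```python
def parse_cli_table(text):
    """Parse an ASCII table as printed above into a list of dicts.

    Border lines (starting with '+') and blank lines are skipped; the
    first '|' row is treated as the header.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        # Splitting on '|' leaves an empty string at each end; drop both.
        cells = [cell.strip() for cell in line.split("|")[1:-1]]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


CINDER_LIST = """
+--------------------------------------+-----------+-----------------+------+-------------+----------+-------------+
|                  ID                  |   Status  |   Display Name  | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+-----------------+------+-------------+----------+-------------+
| 9bfe0deb-b9c7-4d97-8522-1354fc533c26 | available | ol65-bootable-2 |  2   |     None    |   true   |             |
| a311a855-6fb8-472d-b091-4d9703ef6b9a | available | ol65-bootable-1 |  2   |     None    |   true   |             |
| df67c447-9a36-4887-a8ff-74178d5d06ee | available |   my-volume-1   |  5   |     None    |  false   |             |
| e7fbd2eb-e726-452b-9a88-b5eee0736175 | available | ol65-bootable-3 |  2   |     None    |   true   |             |
| f61702b6-4204-4f10-8bdf-7da792f15c28 | available |       ol65      |  2   |     None    |   true   |             |
+--------------------------------------+-----------+-----------------+------+-------------+----------+-------------+
"""

volumes = parse_cli_table(CINDER_LIST)

# Names of the volumes that are ready to boot from.
bootable = [v["Display Name"] for v in volumes
            if v["Bootable"] == "true" and v["Status"] == "available"]
print(bootable)
```

Parsing printed tables is fragile by nature, so treat this as a convenience for quick experiments rather than a stable interface.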

Note that the creation of those 3 volumes was almost immediate. There is no need to download or copy; ZFSSA takes care of the volume copy for us.

Start 3 instances:

# nova boot --boot-volume a311a855-6fb8-472d-b091-4d9703ef6b9a --flavor m1.tiny ol65-instance-1 --nic net-id=25b19746-3aea-4236-8193-4c6284e76eca

# nova boot --boot-volume 9bfe0deb-b9c7-4d97-8522-1354fc533c26 --flavor m1.tiny ol65-instance-2 --nic net-id=25b19746-3aea-4236-8193-4c6284e76eca

# nova boot --boot-volume e7fbd2eb-e726-452b-9a88-b5eee0736175 --flavor m1.tiny ol65-instance-3 --nic net-id=25b19746-3aea-4236-8193-4c6284e76eca

Instantly replicating volumes is a very powerful feature, especially for large templates. The ZFSSA Cinder plugin allows us to take advantage of this feature of ZFSSA. By offloading some of the operations to the array, OpenStack creates a highly efficient environment where persistent volumes can be instantly created from a template.

That’s all for now. With this environment you can continue to test ZFSSA with OpenStack, and when you are ready for the real appliance the operations will look the same.

@RonenKofman

Sunday Jul 13, 2014

Diving into OpenStack Network Architecture - Part 4 - Connecting to Public Network

In the previous post we discussed routing in OpenStack and saw how routing is done between two networks inside an OpenStack deployment using a router implemented inside a network namespace. In this post we will extend the routing capabilities and show how we can route not only between two internal networks but also to a public network. We will also see how Neutron can assign a floating IP to allow VMs to receive a public IP and become accessible from the public network.

Use case #5: Connecting VMs to the public network

A “public network”, for the purpose of this discussion, is any network which is external to the OpenStack deployment. This could be another network inside the data center or the internet or just another private network which is not controlled by OpenStack.

To connect the deployment to a public network we first have to create a network in OpenStack and designate it as public. This network will be the target for all outgoing traffic from VMs inside the OpenStack deployment. At this time VMs cannot be directly connected to a network designated as public; the traffic can only be routed from a private network to a public network using an OpenStack-created router. To create a public network in OpenStack we simply use the net-create command from Neutron and set the router:external option to True. In our example we will create a public network in OpenStack called “my-public”:

# neutron net-create my-public --router:external=True

Created a new network:

+---------------------------+--------------------------------------+

| Field                     | Value                                |

+---------------------------+--------------------------------------+

| admin_state_up            | True                                 |

| id                        | 5eb99ac3-905b-4f0e-9c0f-708ce1fd2303 |

| name                      | my-public                            |

| provider:network_type     | vlan                                 |

| provider:physical_network | default                              |

| provider:segmentation_id  | 1002                                 |

| router:external           | True                                 |

| shared                    | False                                |

| status                    | ACTIVE                               |

| subnets                   |                                      |

| tenant_id                 | 9796e5145ee546508939cd49ad59d51f     |

+---------------------------+--------------------------------------+

In our deployment eth3 on the control node is an interface without an IP address, and we will use it as the connection point to the external public network. To do that we simply add eth3 to a bridge on OVS called “br-ex”. This is the bridge Neutron will route the traffic to when a VM connects to the public network:

# ovs-vsctl add-port br-ex eth3

# ovs-vsctl show

8a069c7c-ea05-4375-93e2-b9fc9e4b3ca1

.

.

.

    Bridge br-ex

        Port br-ex

            Interface br-ex

                type: internal

        Port "eth3"

            Interface "eth3"

.

.

.

 

For this exercise we have created a public network with the IP range 180.180.180.0/24, accessible from eth3. This public network is provided from the datacenter side and has a gateway at 180.180.180.1 which connects it to the datacenter network. To connect this network to our OpenStack deployment we will create a subnet on our “my-public” network with the same IP range and tell Neutron what its gateway is:

# neutron subnet-create my-public 180.180.180.0/24 --name public_subnet --enable_dhcp=False --allocation-pool start=180.180.180.2,end=180.180.180.100 --gateway=180.180.180.1

Created a new subnet:

+------------------+------------------------------------------------------+

| Field            | Value                                                |

+------------------+------------------------------------------------------+

| allocation_pools | {"start": "180.180.180.2", "end": "180.180.180.100"} |

| cidr             | 180.180.180.0/24                                     |

| dns_nameservers  |                                                      |

| enable_dhcp      | False                                                |

| gateway_ip       | 180.180.180.1                                        |

| host_routes      |                                                      |

| id               | ecadf103-0b3b-46e8-8492-4c5f4b3ea4cd                 |

| ip_version       | 4                                                    |

| name             | public_subnet                                        |

| network_id       | 5eb99ac3-905b-4f0e-9c0f-708ce1fd2303                 |

| tenant_id        | 9796e5145ee546508939cd49ad59d51f                     |

+------------------+------------------------------------------------------+
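To get a feel for the numbers in the subnet we just created, here is a short sketch using Python's ipaddress module. The function name check_allocation_pool and its sanity checks are my own illustration and do not claim to mirror Neutron's validation; the addresses are the ones passed to subnet-create above.

```python
import ipaddress


def check_allocation_pool(cidr, start, end, gateway):
    """Sanity-check an allocation pool and return how many addresses it holds.

    These checks are illustrative assumptions, not Neutron's own rules.
    """
    network = ipaddress.ip_network(cidr)
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    gw = ipaddress.ip_address(gateway)

    if first not in network or last not in network:
        raise ValueError("allocation pool falls outside the subnet")
    if first > last:
        raise ValueError("pool start must not come after pool end")
    if first <= gw <= last:
        raise ValueError("gateway must not sit inside the allocation pool")

    # Both ends of the range are usable, hence the +1.
    return int(last) - int(first) + 1


size = check_allocation_pool(
    "180.180.180.0/24", "180.180.180.2", "180.180.180.100", "180.180.180.1"
)
print(size)  # 99 addresses in the pool
```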

Next we need to connect the router to our newly created public network. We do this using the following command:

# neutron router-gateway-set my-router my-public

Set gateway for router my-router

Note: We use the term “public network” for two things. One is the actual public network available from the datacenter (180.180.180.0/24); for clarity we’ll call this network the “external public network”. The other is the network inside OpenStack that we call “my-public”, which is the interface network inside the OpenStack deployment. We also refer to two “gateways”: one is the gateway used by the external public network (180.180.180.1) and the other is the gateway interface on the router (180.180.180.2).

After the operation above, the router, which had two interfaces, also has a third interface called the gateway (this is the router gateway). A router can have multiple interfaces to connect to regular internal subnets, and one gateway to connect to the “my-public” network. A common mistake is to try to connect the public network as a regular interface; the operation can succeed, but no connection will be made to the external world. After we have created a public network and a subnet and connected them to the router, the network topology view will look like this:

Looking into the router’s namespace we see that another interface was added with an IP on the 180.180.180.0/24 network. This IP is 180.180.180.2, which is the router gateway interface:

# ip netns exec qrouter-fce64ebe-47f0-4846-b3af-9cf764f1ff11 ip addr

.

.

22: qg-c08b8179-3b: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN                                                      

    link/ether fa:16:3e:a4:58:40 brd ff:ff:ff:ff:ff:ff

    inet 180.180.180.2/24 brd 180.180.180.255 scope global qg-c08b8179-3b

    inet6 2606:b400:400:3441:f816:3eff:fea4:5840/64 scope global dynamic

       valid_lft 2591998sec preferred_lft 604798sec

    inet6 fe80::f816:3eff:fea4:5840/64 scope link

       valid_lft forever preferred_lft forever

.

.

At this point the router’s gateway address (180.180.180.2) is reachable from the VMs and the VMs can ping it. We can also ping the external gateway (180.180.180.1) from the VMs as well as reach the network this gateway is connected to.

If we look into the router namespace we see that two lines were added to the NAT table in iptables:

# ip netns exec qrouter-fce64ebe-47f0-4846-b3af-9cf764f1ff11 iptables-save

.

.

-A neutron-l3-agent-snat -s 20.20.20.0/24 -j SNAT --to-source 180.180.180.2

-A neutron-l3-agent-snat -s 10.10.10.0/24 -j SNAT --to-source 180.180.180.2

 

.

.

This will change the source IP of outgoing packets from the networks net1 and net2 to 180.180.180.2. When we ping from within a VM, the request appears on the public network as if it came from this IP address.
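To make the effect of the two SNAT rules concrete, here is a small sketch that mimics them in Python. It only models the two rules shown above (nothing else from the real iptables chains), and snat_source is a name I made up for the illustration.

```python
import ipaddress

# The two rules from the neutron-l3-agent-snat chain shown above:
# (source network, address to rewrite the source to).
SNAT_RULES = [
    (ipaddress.ip_network("20.20.20.0/24"), "180.180.180.2"),
    (ipaddress.ip_network("10.10.10.0/24"), "180.180.180.2"),
]


def snat_source(src_ip):
    """Return the source address a packet carries after the SNAT rules."""
    addr = ipaddress.ip_address(src_ip)
    for network, to_source in SNAT_RULES:
        if addr in network:
            return to_source
    # No rule matched: the source address is left untouched.
    return src_ip


print(snat_source("20.20.20.2"))   # rewritten to the router gateway IP
print(snat_source("192.0.2.7"))    # not on net1 or net2, unchanged
```

Both internal networks share the same translated address, which is why the external side cannot tell the VMs apart at this stage.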

The routing table inside the namespace will route any outgoing traffic to the gateway of the public network as we defined it when we created the subnet, in this case 180.180.180.1:

#  ip netns exec  qrouter-fce64ebe-47f0-4846-b3af-9cf764f1ff11 route -n

Kernel IP routing table

Destination     Gateway         Genmask         Flags Metric Ref    Use Iface

0.0.0.0         180.180.180.1   0.0.0.0         UG    0      0        0 qg-c08b8179-3b

10.10.10.0      0.0.0.0         255.255.255.0   U     0      0        0 qr-15ea2dd1-65

20.20.20.0      0.0.0.0         255.255.255.0   U     0      0        0 qr-dc290da0-0a

180.180.180.0   0.0.0.0         255.255.255.0   U     0      0        0 qg-c08b8179-3b

 

Those two pieces ensure that a request from a VM trying to reach the public network will be NAT’ed to 180.180.180.2 as its source and routed to the public network’s gateway. We can also see that IP forwarding is enabled inside the namespace to allow routing:

# ip netns exec qrouter-fce64ebe-47f0-4846-b3af-9cf764f1ff11 sysctl net.ipv4.ip_forward

net.ipv4.ip_forward = 1
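The way the kernel picks a route from the table shown earlier can be sketched as a longest-prefix match. The code below is only an illustration of that idea using the four routes from the route -n output in this post; lookup is a hypothetical helper, and it ignores metrics and everything else a real routing stack considers.

```python
import ipaddress

# (destination, gateway, interface) copied from the route -n output.
# A gateway of None means the destination is directly connected.
ROUTES = [
    ("0.0.0.0/0", "180.180.180.1", "qg-c08b8179-3b"),
    ("10.10.10.0/24", None, "qr-15ea2dd1-65"),
    ("20.20.20.0/24", None, "qr-dc290da0-0a"),
    ("180.180.180.0/24", None, "qg-c08b8179-3b"),
]


def lookup(dest_ip):
    """Return (gateway, interface) of the most specific matching route."""
    addr = ipaddress.ip_address(dest_ip)
    matches = []
    for cidr, gateway, iface in ROUTES:
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, gateway, iface))
    # The default route (prefix length 0) always matches, so this is
    # never empty; the longest prefix wins.
    _, gateway, iface = max(matches, key=lambda m: m[0])
    return gateway, iface


print(lookup("20.20.20.2"))      # directly connected via the qr- interface
print(lookup("180.180.180.50"))  # directly connected via the qg- interface
print(lookup("8.8.8.8"))         # falls through to the default route
```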

 

Use case #6: Attaching a floating IP to a VM

Now that the VMs can access the public network we would like to take the next step and allow an external client to access the VMs inside the OpenStack deployment. We will do that using a floating IP. A floating IP is an IP provided by the public network which the user can assign to a particular VM, making it accessible to an external client.

To create a floating IP, the first step is to connect the VM to a public network as we have shown in the previous use case. The second step is to generate a floating IP from the command line:

# neutron floatingip-create my-public

Created a new floatingip:

+---------------------+--------------------------------------+

| Field               | Value                                |

+---------------------+--------------------------------------+

| fixed_ip_address    |                                      |

| floating_ip_address | 180.180.180.3                        |

| floating_network_id | 5eb99ac3-905b-4f0e-9c0f-708ce1fd2303 |

| id                  | 25facce9-c840-4607-83f5-d477eaceba61 |

| port_id             |                                      |

| router_id           |                                      |

| tenant_id           | 9796e5145ee546508939cd49ad59d51f     |

+---------------------+--------------------------------------+

The user can generate as many IPs as are available on the “my-public” network. Assigning the floating IP can be done either from the GUI or from the command line. In this example we go to the GUI:

Under the hood we can look at the router namespace and see the following additional lines in its iptables rules:

-A neutron-l3-agent-OUTPUT -d 180.180.180.3/32 -j DNAT --to-destination 20.20.20.2

-A neutron-l3-agent-PREROUTING -d 180.180.180.3/32 -j DNAT --to-destination 20.20.20.2

-A neutron-l3-agent-float-snat -s 20.20.20.2/32 -j SNAT --to-source 180.180.180.3

These lines perform the NAT operation for the floating IP. If an incoming request arrives and its destination is 180.180.180.3, it will be translated to 20.20.20.2, and vice versa for outgoing traffic.

Once a floating IP is associated we can connect to the VM. It is important to make sure there are security group rules that allow this, for example:
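The one-to-one mapping that these three rules set up can be sketched as a pair of lookups. This is a simplified model of the rules shown above and nothing more; the function names are made up for the example.

```python
# Floating IP -> fixed IP, as in the DNAT rules shown above.
FLOATING_TO_FIXED = {"180.180.180.3": "20.20.20.2"}

# Fixed IP -> floating IP, as in the float-snat rule shown above.
FIXED_TO_FLOATING = {fixed: floating
                     for floating, fixed in FLOATING_TO_FIXED.items()}


def translate_inbound(dst_ip):
    """Destination rewrite for a packet arriving from the public side."""
    return FLOATING_TO_FIXED.get(dst_ip, dst_ip)


def translate_outbound(src_ip):
    """Source rewrite for a packet leaving a VM that has a floating IP."""
    return FIXED_TO_FLOATING.get(src_ip, src_ip)


print(translate_inbound("180.180.180.3"))  # delivered to 20.20.20.2
print(translate_outbound("20.20.20.2"))    # leaves as 180.180.180.3
```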

nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0

nova secgroup-add-rule default tcp 22 22 0.0.0.0/0

 

Those will allow ping and ssh.

Iptables is a sophisticated and powerful tool. To better understand how the chains are structured in the different tables, we can look at one of the many iptables tutorials available online and read up on any specific details.

Summary

This post was about connecting VMs in the OpenStack deployment to a public network. It shows how, using namespaces and routing tables, we can route not only inside the OpenStack environment but also to the outside world.

This will also be the last post in the series for now. Networking is one of the most complicated areas in OpenStack and gaining a good understanding of it is key. If you read all four posts you should have a good starting point to analyze and understand different network topologies in OpenStack. We can apply the same principles shown here to understand more network concepts such as Firewall as a Service, Load Balancer as a Service, the metadata service etc. The general method is to look into a namespace and figure out how certain functionality is implemented using the regular Linux networking features, in the same way we did throughout this series.

As we said in the beginning, the use cases shown here are just examples of one method to configure networking in OpenStack and there are many others. All the examples here use the Open vSwitch plugin and can be used right out of the box. When analyzing another plugin or a specific feature it will be useful to compare the features here to their equivalents in the plugin you choose to use. In many cases vendor plugins will use Open vSwitch, bridges or namespaces and some of the same principles and methods shown here.

The goal of this series is to make OpenStack networking accessible to the average user. The series takes a bottom-up approach and, using simple use cases, tries to build a complete picture of how the network architecture works. Unlike some other resources we did not start out by explaining the different agents and their functionality, but tried to explain what they do and what the end result looks like. A good next step would be to go to one of those resources and see how the different agents implement the functionality explained here.

That’s it for now

@RonenKofman

Wednesday Jun 18, 2014

Diving into OpenStack Network Architecture - Part 3 - Routing

In the previous posts we saw the basic components of OpenStack networking and then described three simple use cases that explain how network connectivity is achieved. In this short post we will continue to explore the networking setup by looking at a more sophisticated (but still pretty basic) use case: routing between two isolated networks. Routing uses the same basic components to achieve inter-subnet connectivity and uses another namespace to create an isolated container that allows forwarding from one subnet to another.

As a reminder of what we said in the first post, this is just an example using the out-of-the-box OVS plugin. This is only one of the options for networking in OpenStack and there are many plugins that use different means.

Use case #4: Routing traffic between two isolated networks

In a real-world deployment we would like to create different networks for different purposes. We would also like to be able to connect those networks as needed. Since those networks have different IP ranges we need a router to connect them. To explore this setup we will first create an additional network called net2, using 20.20.20.0/24 as its subnet. After creating the network we will launch an instance of Oracle Linux and connect it to net2. This is how this looks in the network topology tab from the OpenStack GUI:

If we further explore what happened we can see that another namespace has appeared on the network node, this namespace will be serving the newly created network. Now we have two namespaces, one for each network:

# ip netns list

qdhcp-63b7fcf2-e921-4011-8da9-5fc2444b42dd

qdhcp-5f833617-6179-4797-b7c0-7d420d84040c

To match each namespace to its network by ID we can use net-list or simply look at the network information in the UI:

# nova net-list

+--------------------------------------+-------+------+

| ID                                   | Label | CIDR |

+--------------------------------------+-------+------+

| 5f833617-6179-4797-b7c0-7d420d84040c | net1  | None |

| 63b7fcf2-e921-4011-8da9-5fc2444b42dd | net2  | None |

+--------------------------------------+-------+------+

Our newly created network, net2, has its own namespace separate from net1. When we look into the namespace we see that it has two interfaces: a loopback interface and an interface with an IP which will also serve DHCP requests:

# ip netns exec qdhcp-63b7fcf2-e921-4011-8da9-5fc2444b42dd ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

       valid_lft forever preferred_lft forever

19: tap16630347-45: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN

    link/ether fa:16:3e:bd:94:42 brd ff:ff:ff:ff:ff:ff

    inet 20.20.20.3/24 brd 20.20.20.255 scope global tap16630347-45

    inet6 fe80::f816:3eff:febd:9442/64 scope link

       valid_lft forever preferred_lft forever

Those two networks, net1 and net2, are not connected at this time. To connect them we need to add a router and connect both networks to the router. OpenStack Neutron provides users with the capability to create a router to connect two or more networks. This router will simply be an additional namespace.

Creating a router with Neutron can be done from the GUI or from the command line:
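The reason a router is needed can be shown with a few lines of Python: two hosts can talk directly only if they sit on the same subnet. This is a simplified sketch; same_subnet is a hypothetical helper, and 10.10.10.2 is just an assumed address for a VM on net1 (20.20.20.2 is used for the VM on net2).

```python
import ipaddress

# The two subnets used in this series.
NETWORKS = {
    "net1": ipaddress.ip_network("10.10.10.0/24"),
    "net2": ipaddress.ip_network("20.20.20.0/24"),
}


def network_of(ip):
    """Return the name of the network that contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, network in NETWORKS.items():
        if addr in network:
            return name
    return None


def same_subnet(ip_a, ip_b):
    """True if both addresses belong to the same known network."""
    net_a = network_of(ip_a)
    return net_a is not None and net_a == network_of(ip_b)


# Assumed VM addresses, one on each network.
print(same_subnet("10.10.10.2", "10.10.10.3"))  # True: direct delivery
print(same_subnet("10.10.10.2", "20.20.20.2"))  # False: needs a router
```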

# neutron router-create my-router

Created a new router:

+-----------------------+--------------------------------------+

| Field                 | Value                                |

+-----------------------+--------------------------------------+

| admin_state_up        | True                                 |

| external_gateway_info |                                      |

| id                    | fce64ebe-47f0-4846-b3af-9cf764f1ff11 |

| name                  | my-router                            |

| status                | ACTIVE                               |

| tenant_id             | 9796e5145ee546508939cd49ad59d51f     |

+-----------------------+--------------------------------------+

We now connect the router to the two networks:

Checking which subnets are available:

# neutron subnet-list

+--------------------------------------+------+---------------+------------------------------------------------+

| id                                   | name | cidr          | allocation_pools                               |

+--------------------------------------+------+---------------+------------------------------------------------+

| 2d7a0a58-0674-439a-ad23-d6471aaae9bc |      | 10.10.10.0/24 | {"start": "10.10.10.2", "end": "10.10.10.254"} |

| 4a176b4e-a9b2-4bd8-a2e3-2dbe1aeaf890 |      | 20.20.20.0/24 | {"start": "20.20.20.2", "end": "20.20.20.254"} |

+--------------------------------------+------+---------------+------------------------------------------------+

Adding the 10.10.10.0/24 subnet to the router:

# neutron router-interface-add fce64ebe-47f0-4846-b3af-9cf764f1ff11 subnet=2d7a0a58-0674-439a-ad23-d6471aaae9bc

Added interface 0b7b0b40-f952-41dd-ad74-2c15a063243a to router fce64ebe-47f0-4846-b3af-9cf764f1ff11.

Adding the 20.20.20.0/24 subnet to the router:

# neutron router-interface-add fce64ebe-47f0-4846-b3af-9cf764f1ff11 subnet=4a176b4e-a9b2-4bd8-a2e3-2dbe1aeaf890

Added interface dc290da0-0aa4-4d96-9085-1f894cf5b160 to router fce64ebe-47f0-4846-b3af-9cf764f1ff11.

At this stage we can look at the network topology view and see that the two networks are connected to the router:

We can also see that the interfaces connected to the router are the interfaces we have defined as gateways for the subnets.

We can also see that another namespace was created for the router:

# ip netns list

qrouter-fce64ebe-47f0-4846-b3af-9cf764f1ff11

qdhcp-63b7fcf2-e921-4011-8da9-5fc2444b42dd

qdhcp-5f833617-6179-4797-b7c0-7d420d84040c

When looking into the namespace we see the following:

# ip netns exec qrouter-fce64ebe-47f0-4846-b3af-9cf764f1ff11 ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

       valid_lft forever preferred_lft forever

20: qr-0b7b0b40-f9: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN

    link/ether fa:16:3e:82:47:a6 brd ff:ff:ff:ff:ff:ff

    inet 10.10.10.1/24 brd 10.10.10.255 scope global qr-0b7b0b40-f9

    inet6 fe80::f816:3eff:fe82:47a6/64 scope link

       valid_lft forever preferred_lft forever

21: qr-dc290da0-0a: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN

    link/ether fa:16:3e:c7:7c:9c brd ff:ff:ff:ff:ff:ff

    inet 20.20.20.1/24 brd 20.20.20.255 scope global qr-dc290da0-0a

    inet6 fe80::f816:3eff:fec7:7c9c/64 scope link

       valid_lft forever preferred_lft forever
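The qr- device names in this listing line up with the interface IDs returned by router-interface-add earlier: each name looks like the prefix followed by the start of the port ID, cut to 14 characters in total. The sketch below reproduces that mapping. The function name is made up for illustration, and the 14-character rule is an assumption inferred from the names shown in this post rather than something taken from the Neutron documentation.

```shell
# Hypothetical helper: guess the device name Neutron gives to a port.
# Assumption: name = prefix + port ID, truncated to 14 characters.
# $1 = prefix (for example "qr-"), $2 = full port ID
neutron_dev_name() {
    printf '%s%s\n' "$1" "$2" | cut -c1-14
}

# Example with the interface IDs returned by router-interface-add:
neutron_dev_name qr- 0b7b0b40-f952-41dd-ad74-2c15a063243a
neutron_dev_name qr- dc290da0-0aa4-4d96-9085-1f894cf5b160
```

Going the other way, from a device name back to a port, is then a matter of matching the 11 characters after the prefix against the start of the port IDs.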

We see the two interfaces, “qr-dc290da0-0a” and “qr-0b7b0b40-f9”. These interfaces use the IP addresses which were defined as gateways when we created the networks and subnets. They are connected to OVS:

# ovs-vsctl show

8a069c7c-ea05-4375-93e2-b9fc9e4b3ca1

    Bridge "br-eth2"

        Port "br-eth2"

            Interface "br-eth2"

                type: internal

        Port "eth2"

            Interface "eth2"

        Port "phy-br-eth2"

            Interface "phy-br-eth2"

    Bridge br-ex

        Port br-ex

            Interface br-ex

                type: internal

    Bridge br-int

        Port "int-br-eth2"

            Interface "int-br-eth2"

        Port "qr-dc290da0-0a"

            tag: 2

            Interface "qr-dc290da0-0a"

                type: internal

        Port "tap26c9b807-7c"

            tag: 1

            Interface "tap26c9b807-7c"

                type: internal

        Port br-int

            Interface br-int

                type: internal

        Port "tap16630347-45"

            tag: 2

            Interface "tap16630347-45"

                type: internal

        Port "qr-0b7b0b40-f9"

            tag: 1

            Interface "qr-0b7b0b40-f9"

                type: internal

    ovs_version: "1.11.0"
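When the output gets long it can help to reduce it to just the ports that carry a VLAN tag. The following is a minimal sketch: ovs_port_tags is a made-up name, and the parsing assumes the layout of the ovs-vsctl show output printed above (a Port line followed by an optional tag line).

```shell
# Hypothetical helper: print "<port> <tag>" for every tagged port found
# in `ovs-vsctl show` output read from standard input.
ovs_port_tags() {
    awk '
        $1 == "Port" { port = $2; gsub(/"/, "", port) }  # remember the port, drop quotes
        $1 == "tag:" { print port, $2 }                  # tag belongs to the last port seen
    '
}

# Usage on a node running Open vSwitch:
#   ovs-vsctl show | ovs_port_tags
```

Untagged ports such as br-int and int-br-eth2 are skipped, so what remains is the list of interfaces together with their local VLAN.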

As we see, those interfaces are connected to “br-int” and tagged with the VLAN corresponding to their respective networks. At this point we should be able to successfully ping the router namespace using the gateway address (20.20.20.1 in this case):

We can also see that the VM with IP 20.20.20.2 can ping the VM with IP 10.10.10.2, which shows that the routing is actually being done:

The two subnets are connected to the namespace through interfaces in the namespace. Inside the namespace Neutron enables forwarding by setting the net.ipv4.ip_forward parameter to 1. We can see that here:

# ip netns exec qrouter-fce64ebe-47f0-4846-b3af-9cf764f1ff11 sysctl net.ipv4.ip_forward

net.ipv4.ip_forward = 1

This net.ipv4.ip_forward setting is specific to the namespace and is not affected by changing the parameter outside the namespace.

Summary

When a router is created, Neutron creates a namespace called qrouter-<router id>. The subnets are connected to the router through interfaces on the OVS br-int bridge. The interfaces are tagged with the correct VLAN so they can connect to their respective networks. In the example above the interface qr-0b7b0b40-f9 is assigned IP 10.10.10.1 and is tagged with VLAN 1, which allows it to be connected to “net1”. The routing action itself is enabled by the net.ipv4.ip_forward parameter set to 1 inside the namespace.
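The naming convention makes it easy to tell which namespaces belong to routers and which to DHCP services. The sketch below sorts the names printed by ip netns list into the two kinds. The function name and the output labels are invented for this illustration, and only the qrouter- and qdhcp- prefixes seen in this series are recognized.

```shell
# Hypothetical helper: classify Neutron namespaces read from standard
# input (one name per line, as printed by `ip netns list`).
neutron_namespaces() {
    # Only the first word of each line is used as the namespace name.
    while read -r ns rest; do
        case $ns in
            qrouter-*) echo "router ${ns#qrouter-}" ;;
            qdhcp-*)   echo "dhcp-network ${ns#qdhcp-}" ;;
        esac
    done
}

# Usage on the control node:
#   ip netns list | neutron_namespaces
```

For every router ID reported this way, the forwarding flag can then be checked with the same ip netns exec ... sysctl net.ipv4.ip_forward command used earlier.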

This post shows how a router is created using just a network namespace. In the next post we will see how floating IPs work using iptables. This becomes a bit more sophisticated but still uses the same basic components.

@RonenKofman

 

 

Wednesday Jun 04, 2014

Diving into OpenStack Network Architecture - Part 2 - Basic Use Cases

 

In the previous post we reviewed several network components including Open vSwitch, Network Namespaces, Linux Bridges and veth pairs. In this post we will take three simple use cases and see how those basic components come together to create a complete SDN solution in OpenStack. With those three use cases we will review almost the entire network setup and see how all the pieces work together. The use cases we will use are:

1. Create network – what happens when we create a network, and how we can create multiple isolated networks.

2. Launch a VM – once we have networks we can launch VMs and connect them to networks.

3. DHCP request from a VM – OpenStack can automatically assign IP addresses to VMs. This is done through a local DHCP service controlled by OpenStack Neutron. We will see how this service runs and what a DHCP request and response look like.

In this post we will show connectivity: we will see how packets get from point A to point B. We first focus on what a configured deployment looks like, and only later will we discuss how and when the configuration is created. Personally I found it very valuable to see the actual interfaces and how they connect to each other through examples and hands-on experiments. Once the end game is clear and we know how the connectivity works, in a later post we will take a step back and explain how Neutron configures the components to provide such connectivity.

We are going to get pretty technical shortly and I recommend trying these examples on your own deployment or using the Oracle OpenStack Tech Preview. Understanding these three use cases thoroughly and how to look at them will be very helpful when trying to debug a deployment in case something does not work.

Use case #1: Create Network

Creating a network is a simple operation that can be performed from the GUI or the command line. When we create a network in OpenStack, the network is only available to the tenant who created it, unless it is defined as “shared”, in which case it can be used by all tenants. A network can have multiple subnets, but for this demonstration and for simplicity we will assume that each network has exactly one subnet. Creating a network from the command line looks like this:

# neutron net-create net1

Created a new network:

+---------------------------+--------------------------------------+

| Field                     | Value                                |

+---------------------------+--------------------------------------+

| admin_state_up            | True                                 |

| id                        | 5f833617-6179-4797-b7c0-7d420d84040c |

| name                      | net1                                 |

| provider:network_type     | vlan                                 |

| provider:physical_network | default                              |

| provider:segmentation_id  | 1000                                 |

| shared                    | False                                |

| status                    | ACTIVE                               |

| subnets                   |                                      |

| tenant_id                 | 9796e5145ee546508939cd49ad59d51f     |

+---------------------------+--------------------------------------+

Creating a subnet for this network will look like this:

# neutron subnet-create net1 10.10.10.0/24

Created a new subnet:

+------------------+------------------------------------------------+

| Field            | Value                                          |

+------------------+------------------------------------------------+

| allocation_pools | {"start": "10.10.10.2", "end": "10.10.10.254"} |

| cidr             | 10.10.10.0/24                                  |

| dns_nameservers  |                                                |

| enable_dhcp      | True                                           |

| gateway_ip       | 10.10.10.1                                     |

| host_routes      |                                                |

| id               | 2d7a0a58-0674-439a-ad23-d6471aaae9bc           |

| ip_version       | 4                                              |

| name             |                                                |

| network_id       | 5f833617-6179-4797-b7c0-7d420d84040c           |

| tenant_id        | 9796e5145ee546508939cd49ad59d51f               |

+------------------+------------------------------------------------+

We now have a network and a subnet. On the network topology view this looks like this:

Now let’s dive in and see what happened under the hood. Looking at the control node we will discover that a new namespace was created:

# ip netns list

qdhcp-5f833617-6179-4797-b7c0-7d420d84040c

 

The name of the namespace is qdhcp-<network id> (see above). Let’s look into the namespace and see what’s in it:

# ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c ip addr

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN

    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

    inet 127.0.0.1/8 scope host lo

    inet6 ::1/128 scope host

       valid_lft forever preferred_lft forever

12: tap26c9b807-7c: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN

    link/ether fa:16:3e:1d:5c:81 brd ff:ff:ff:ff:ff:ff

    inet 10.10.10.3/24 brd 10.10.10.255 scope global tap26c9b807-7c

    inet6 fe80::f816:3eff:fe1d:5c81/64 scope link

       valid_lft forever preferred_lft forever

 

We see two interfaces in the namespace: one is the loopback and the other is an interface called “tap26c9b807-7c”. This interface has the IP address 10.10.10.3 and it will also serve DHCP requests in a way we will see later. Let’s trace the connectivity of the “tap26c9b807-7c” interface from the namespace. The first stop is OVS, where we see that the interface connects to the bridge “br-int”:

# ovs-vsctl show

8a069c7c-ea05-4375-93e2-b9fc9e4b3ca1

    Bridge "br-eth2"

        Port "br-eth2"

            Interface "br-eth2"

                type: internal

        Port "eth2"

            Interface "eth2"

        Port "phy-br-eth2"

            Interface "phy-br-eth2"

    Bridge br-ex

        Port br-ex

            Interface br-ex

                type: internal

    Bridge br-int

        Port "int-br-eth2"

            Interface "int-br-eth2"

        Port "tap26c9b807-7c"

            tag: 1

            Interface "tap26c9b807-7c"

                type: internal

        Port br-int

            Interface br-int

                type: internal

    ovs_version: "1.11.0"

 

In the picture above we have a veth pair with two ends called “int-br-eth2” and "phy-br-eth2". This veth pair is used to connect two bridges in OVS, "br-eth2" and "br-int". In the previous post we explained how to check the veth connectivity using the ethtool command. It shows that the two are indeed a pair:

# ethtool -S int-br-eth2

NIC statistics:

     peer_ifindex: 10

.

.

 

# ip link

.

.

10: phy-br-eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

.

.
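The two lookups above can be chained so that the peer of a veth end is resolved in one go. This is only a sketch: the helper names are invented here, and it assumes the driver reports a peer_ifindex statistic as in the ethtool output shown above.

```shell
# Hypothetical helpers for following a veth pair.

# Read `ethtool -S <dev>` output on stdin and print the peer_ifindex value.
veth_peer_index() {
    awk '$1 == "peer_ifindex:" { print $2 }'
}

# Read `ip -o link` (or `ip link`) output on stdin and print the name of
# the interface whose index is $1. Any "@peer" suffix is stripped.
ifname_by_index() {
    awk -F': ' -v idx="$1" '$1 == idx { sub(/@.*/, "", $2); print $2 }'
}

# Usage:
#   idx=$(ethtool -S int-br-eth2 | veth_peer_index)
#   ip -o link | ifname_by_index "$idx"
```

With the values shown in this post the first command would yield 10 and the second would resolve that index to phy-br-eth2.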

Note that “phy-br-eth2” is connected to a bridge called "br-eth2" and one of this bridge's interfaces is the physical link eth2. This means that the network which we have just created has created a namespace which is connected to the physical interface eth2. eth2 is the “VM network”, the physical interface to which all the virtual machines are connected.

About network isolation:

OpenStack supports the creation of multiple isolated networks and can use several mechanisms to isolate them from one another. The isolation mechanism can be VLANs, VXLANs or GRE tunnels; this is configured as part of the initial setup, and in our deployment we use VLANs. When VLAN tagging is used as the isolation mechanism, Neutron allocates a VLAN tag from a pre-defined pool of VLAN tags and assigns it to the newly created network. By provisioning VLAN tags to the networks, Neutron allows the creation of multiple isolated networks on the same physical link. The big difference between this and other platforms is that the user does not have to deal with allocating and managing VLANs for networks. VLAN allocation and provisioning is handled by Neutron, which keeps track of the VLAN tags and is responsible for allocating and reclaiming them. In the example above net1 has the VLAN tag 1000, which means that whenever a VM is created and connected to this network, the packets from that VM have to be tagged with VLAN tag 1000 to travel on this particular network. This is true for namespaces as well: if we would like to connect a namespace to a particular network, we have to make sure that the packets to and from the namespace are correctly tagged when they reach the VM network.

In the example above we see that the namespace interface “tap26c9b807-7c” has VLAN tag 1 assigned to it. If we examine OVS we see that it has flows which modify VLAN tag 1 to VLAN tag 1000 when a packet goes to the VM network on eth2, and vice versa. We can see this using the dump-flows command on OVS. For packets going to the VM network we see the modification done on br-eth2:

#  ovs-ofctl dump-flows br-eth2

NXST_FLOW reply (xid=0x4):

 cookie=0x0, duration=18669.401s, table=0, n_packets=857, n_bytes=163350, idle_age=25, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL

 cookie=0x0, duration=165108.226s, table=0, n_packets=14, n_bytes=1000, idle_age=5343, hard_age=65534, priority=2,in_port=2 actions=drop

 cookie=0x0, duration=165109.813s, table=0, n_packets=1671, n_bytes=213304, idle_age=25, hard_age=65534, priority=1 actions=NORMAL

 

For packets coming from the VM network to the namespace we see the following modification, done on br-int:

#  ovs-ofctl dump-flows br-int

NXST_FLOW reply (xid=0x4):

 cookie=0x0, duration=18690.876s, table=0, n_packets=1610, n_bytes=210752, idle_age=1, priority=3,in_port=1,dl_vlan=1000 actions=mod_vlan_vid:1,NORMAL

 cookie=0x0, duration=165130.01s, table=0, n_packets=75, n_bytes=3686, idle_age=4212, hard_age=65534, priority=2,in_port=1 actions=drop

 cookie=0x0, duration=165131.96s, table=0, n_packets=863, n_bytes=160727, idle_age=1, hard_age=65534, priority=1 actions=NORMAL
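To read the VLAN translation without scanning every field, the flow dump can be filtered down to the rewrite rules. The following sketch uses an invented function name and assumes the flows are printed in the dl_vlan=... actions=mod_vlan_vid:... form shown above.

```shell
# Hypothetical helper: list VLAN rewrites found in `ovs-ofctl dump-flows`
# output read from standard input, as "<matched vlan> -> <new vlan>".
vlan_rewrites() {
    sed -n 's/.*dl_vlan=\([0-9][0-9]*\).*mod_vlan_vid:\([0-9][0-9]*\).*/\1 -> \2/p'
}

# Usage:
#   ovs-ofctl dump-flows br-eth2 | vlan_rewrites
#   ovs-ofctl dump-flows br-int  | vlan_rewrites
```

Flows without a VLAN rewrite, such as the drop and NORMAL rules, produce no output.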

 

To summarize we can see that when a user creates a network Neutron creates a namespace and this namespace is connected through OVS to the “VM network”. OVS also takes care of tagging the packets from the namespace to the VM network with the correct VLAN tag and knows to modify the VLAN for packets coming from VM network to the namespace. Now let’s see what happens when a VM is launched and how it is connected to the “VM network”.

Use case #2: Launch a VM

Launching a VM can be done from Horizon or from the command line. This is how we do it from Horizon:

Attach the network:

And Launch

Once the virtual machine is up and running we can see the associated IP using the nova list command:

# nova list

+--------------------------------------+--------------+--------+------------+-------------+-----------------+

| ID                                   | Name         | Status | Task State | Power State | Networks        |

+--------------------------------------+--------------+--------+------------+-------------+-----------------+

| 3707ac87-4f5d-4349-b7ed-3a673f55e5e1 | Oracle Linux | ACTIVE | None       | Running     | net1=10.10.10.2 |

+--------------------------------------+--------------+--------+------------+-------------+-----------------+

The nova list command shows us that the VM is running and that the IP 10.10.10.2 is assigned to this VM. Let’s trace the connectivity from the VM to the VM network on eth2, starting with the VM definition file. The configuration files of the VM, including the virtual disk(s) in the case of ephemeral storage, are stored on the compute node at /var/lib/nova/instances/<instance-id>/. Looking into the VM definition file, libvirt.xml, we see that the VM is connected to an interface called “tap53903a95-82” which is connected to a Linux bridge called “qbr53903a95-82”:

<interface type="bridge">

      <mac address="fa:16:3e:fe:c7:87"/>

      <source bridge="qbr53903a95-82"/>

      <target dev="tap53903a95-82"/>

    </interface>

 

Looking at the bridge using the brctl show command we see this:

# brctl show

bridge name     bridge id               STP enabled     interfaces

qbr53903a95-82          8000.7e7f3282b836       no              qvb53903a95-82

                                                        tap53903a95-82

 

The bridge has two interfaces: one connected to the VM (“tap53903a95-82”) and another one (“qvb53903a95-82”) which leads to the “br-int” bridge on OVS, where its peer appears as the port “qvo53903a95-82”:

# ovs-vsctl show

83c42f80-77e9-46c8-8560-7697d76de51c

    Bridge "br-eth2"

        Port "br-eth2"

            Interface "br-eth2"

                type: internal

        Port "eth2"

            Interface "eth2"

        Port "phy-br-eth2"

            Interface "phy-br-eth2"

    Bridge br-int

        Port br-int

            Interface br-int

                type: internal

        Port "int-br-eth2"

            Interface "int-br-eth2"

        Port "qvo53903a95-82"

            tag: 3

            Interface "qvo53903a95-82"

    ovs_version: "1.11.0"

 

As we showed earlier “br-int” is connected to “br-eth2” on OVS using the veth pair int-br-eth2,phy-br-eth2 and br-eth2 is connected to the physical interface eth2. The whole flow end to end looks like this:

VM => tap53903a95-82 (virtual interface) => qbr53903a95-82 (Linux bridge) => qvb53903a95-82 (interface connected from Linux bridge to OVS bridge br-int) => int-br-eth2 (veth one end) => phy-br-eth2 (veth the other end) => eth2 physical interface.

The purpose of the Linux bridge connecting to the VM is to allow security group enforcement with iptables. Security groups are enforced at the edge point, which is the interface of the VM. Since iptables cannot be applied to OVS bridges, we use a Linux bridge to apply the rules. In the future we hope to see this Linux bridge going away.

VLAN tags: As we discussed in the first use case, net1 is using VLAN tag 1000. Looking at OVS above we see that qvo53903a95-82 is tagged with VLAN tag 3. The modification from VLAN tag 3 to 1000 as we go to the physical network is done by OVS as part of the packet flow of br-eth2, in the same way we showed before.

To summarize, when a VM is launched it is connected to the VM network through a chain of elements as described here. As a packet travels from the VM to the network and back, the VLAN tag is modified.

Use case #3: Serving a DHCP request coming from the virtual machine

In the previous use cases we have shown that both the namespace called qdhcp-<network id> and the VM end up connecting to the physical interface eth2 on their respective nodes, and both tag their packets with VLAN tag 1000. We saw that the namespace has an interface with the IP 10.10.10.3. Since the VM and the namespace are connected to each other and have interfaces on the same subnet, they can ping each other. In this picture we see a ping from the VM, which was assigned 10.10.10.2, to the namespace:

The fact that they are connected and can ping each other can become very handy when something doesn’t work right and we need to isolate the problem. In such case knowing that we should be able to ping from the VM to the namespace and back can be used to trace the disconnect using tcpdump or other monitoring tools.

To serve DHCP requests coming from VMs on the network, Neutron uses a Linux tool called “dnsmasq”, a lightweight DNS and DHCP service. If we look at the dnsmasq process on the control node with the ps command we see this:

dnsmasq --no-hosts --no-resolv --strict-order --bind-interfaces --interface=tap26c9b807-7c --except-interface=lo --pid-file=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host --dhcp-optsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/opts --leasefile-ro --dhcp-range=tag0,10.10.10.0,static,120s --dhcp-lease-max=256 --conf-file= --domain=openstacklocal

The service binds to the tap interface in the namespace (“--interface=tap26c9b807-7c”). If we look at the hosts file we see this:

# cat  /var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host

fa:16:3e:fe:c7:87,host-10-10-10-2.openstacklocal,10.10.10.2

 

If you look at the console output above you can see the MAC address fa:16:3e:fe:c7:87, which is the VM MAC. This MAC address is mapped to IP 10.10.10.2, so when a DHCP request comes in with this MAC, dnsmasq will return 10.10.10.2. If we look into the namespace at the time we initiate a DHCP request from the VM (this can be done by simply restarting the network service in the VM) we see the following:

# ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c tcpdump -n

19:27:12.191280 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:fe:c7:87, length 310

19:27:12.191666 IP 10.10.10.3.bootps > 10.10.10.2.bootpc: BOOTP/DHCP, Reply, length 325

 

To summarize, the DHCP service is handled by dnsmasq, which is configured by Neutron to listen on the interface in the DHCP namespace. Neutron also configures dnsmasq with the combination of MAC and IP, so that a VM sending a DHCP request receives its assigned IP.
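When a VM does not get the address we expect, a quick check is to look up its MAC in the hosts file that Neutron wrote for dnsmasq. The sketch below assumes the "<mac>,<hostname>,<ip>" line format shown above; the function name is made up for this example.

```shell
# Hypothetical helper: print the IP that dnsmasq is configured to hand
# out for a MAC address.
# $1 = MAC address, $2 = path to the dnsmasq hosts file for the network
ip_for_mac() {
    # Compare case-insensitively, since MACs may be written either way.
    awk -F, -v mac="$1" 'tolower($1) == tolower(mac) { print $3 }' "$2"
}

# Usage on the control node:
#   ip_for_mac fa:16:3e:fe:c7:87 /var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host
```

An empty result suggests that the port is not known to the DHCP agent for that network, which narrows down where to look next.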

Summary

In this post we relied on the components described in the previous post and saw how network connectivity is achieved using three simple use cases. These use cases gave a good view of the entire network stack and helped understand how an end to end connection is being made between a VM on a compute node and the DHCP namespace on the control node. One conclusion we can draw from what we saw here is that if we launch a VM and it is able to perform a DHCP request and receive a correct IP then there is reason to believe that the network is working as expected. We saw that a packet has to travel through a long list of components before reaching its destination and if it has done so successfully this means that many components are functioning properly.

In the next post we will look at some more sophisticated services Neutron supports and see how they work. We will see that while there are some more components involved for the most part the concepts are the same.

@RonenKofman

Wednesday May 28, 2014

Diving into OpenStack Network Architecture - Part 1

Before we begin

OpenStack networking has very powerful capabilities but at the same time it is quite complicated. In this blog series we will review an existing OpenStack setup using the Oracle OpenStack Tech Preview and explain the different network components through use cases and examples. The goal is to show how the different pieces come together and provide a bigger picture view of the network architecture in OpenStack. This can be very helpful to users making their first steps in OpenStack or to anyone who wishes to understand how networking works in this environment. We will go through the basics first and build the examples as we go.

According to the recent Icehouse user survey and the one before it, Neutron with the Open vSwitch plug-in is the most widely used network setup both in production and in POCs (in terms of number of customers), and so in this blog series we will analyze this specific OpenStack networking setup. As we know there are many options to set up OpenStack networking, and while Neutron + Open vSwitch is the most popular setup there is no claim that it is either the best or the most efficient option. Neutron + Open vSwitch is an example, one which provides a good starting point for anyone interested in understanding OpenStack networking. Even if you are using a different kind of network setup, such as a different Neutron plug-in, or even not using Neutron at all, this will still be a good starting point to understand the network architecture in OpenStack.

The setup we are using for the examples is the one used in the Oracle OpenStack Tech Preview. Installing it is simple and it would be helpful to have it as a reference. In this setup we use eth2 on all servers for the VM network; all VM traffic flows through this interface. The Oracle OpenStack Tech Preview uses VLANs for L2 isolation to provide tenant and network isolation. The following diagram shows how we have configured our deployment:


This first post is a bit long and will focus on some basic concepts in OpenStack networking. The components we will be discussing are Open vSwitch, network namespaces, Linux bridge and veth pairs. Note that this is not meant to be a comprehensive review of these components; it is meant to describe each component as much as needed to understand the OpenStack network architecture. All the components described here can be further explored using other resources.

Open vSwitch (OVS)

In the Oracle OpenStack Tech Preview, OVS is used to connect virtual machines to the physical port (in our case eth2) as shown in the deployment diagram. OVS contains bridges and ports. The OVS bridges are different from the Linux bridges (controlled by the brctl command) which are also used in this setup. To get started let’s view the OVS structure using the following command:

# ovs-vsctl show

7ec51567-ab42-49e8-906d-b854309c9edf

    Bridge br-int

        Port br-int

            Interface br-int

                type: internal

        Port "int-br-eth2"

            Interface "int-br-eth2"

    Bridge "br-eth2"

        Port "br-eth2"

            Interface "br-eth2"

                type: internal

        Port "eth2"

            Interface "eth2"

        Port "phy-br-eth2"

            Interface "phy-br-eth2"

    ovs_version: "1.11.0"

We see a standard post-deployment OVS on a compute node with two bridges and several ports hanging off each of them. The example above is a compute node without any VMs. We can see that the physical port eth2 is connected to a bridge called “br-eth2”. We also see two ports, "int-br-eth2" and "phy-br-eth2", which are actually a veth pair and form a virtual wire between the two bridges. Veth pairs are discussed later in this post.

When a virtual machine is created, a port is created on the br-int bridge and this port is eventually connected to the virtual machine (we will discuss the exact connectivity later in the series). Here is how OVS looks after a VM was launched:

# ovs-vsctl show

efd98c87-dc62-422d-8f73-a68c2a14e73d

    Bridge br-int

        Port "int-br-eth2"

            Interface "int-br-eth2"

        Port br-int

            Interface br-int

                type: internal

        Port "qvocb64ea96-9f"

            tag: 1

            Interface "qvocb64ea96-9f"

    Bridge "br-eth2"

        Port "phy-br-eth2"

            Interface "phy-br-eth2"

        Port "br-eth2"

            Interface "br-eth2"

                type: internal

        Port "eth2"

            Interface "eth2"

    ovs_version: "1.11.0"

Bridge "br-int" now has a new port "qvocb64ea96-9f" which connects to the VM and is tagged with VLAN 1. Every VM that is launched adds a port on the “br-int” bridge for every network interface the VM has.

Another useful command on OVS is dump-flows, for example:

# ovs-ofctl dump-flows br-int

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=735.544s, table=0, n_packets=70, n_bytes=9976, idle_age=17, priority=3,in_port=1,dl_vlan=1000 actions=mod_vlan_vid:1,NORMAL

cookie=0x0, duration=76679.786s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,in_port=1 actions=drop

cookie=0x0, duration=76681.36s, table=0, n_packets=68, n_bytes=7950, idle_age=17, hard_age=65534, priority=1 actions=NORMAL

As we see, the port which is connected to the VM has the VLAN tag 1. However, the port on the VM network (eth2) will be using tag 1000. OVS modifies the VLAN tag as the packet flows from the VM to the physical interface. In OpenStack the Open vSwitch agent takes care of programming the flows in Open vSwitch, so users do not have to deal with this at all. If you wish to learn more about how to program Open vSwitch, you can read about it at http://openvswitch.org in the documentation describing the ovs-ofctl command.

Network Namespaces (netns)

Network namespaces are a very cool Linux feature that can be used for many purposes and is heavily used in OpenStack networking. Network namespaces are isolated containers which can hold a network configuration that is not visible from outside the namespace. A network namespace can be used to encapsulate specific network functionality or provide a network service in isolation, as well as simply to help organize a complicated network setup. With the Oracle OpenStack Tech Preview we are using the latest Unbreakable Enterprise Kernel R3 (UEK3), which provides complete support for netns.

Let's see how namespaces work through a couple of examples. To control network namespaces we use the ip netns command.

Defining a new namespace:

# ip netns add my-ns

# ip netns list

my-ns

As mentioned, the namespace is an isolated container. We can perform all the normal actions in the namespace context using the exec command, for example running the ifconfig command:

# ip netns exec my-ns ifconfig -a

lo        Link encap:Local Loopback

          LOOPBACK  MTU:16436 Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

We can run every command in the namespace context. This is especially useful for debugging with the tcpdump command; we can also ping, ssh or define iptables rules, all within the namespace.

Connecting the namespace to the outside world:

There are various ways to connect into a namespace and between namespaces. We will focus on how this is done in OpenStack. OpenStack uses a combination of Open vSwitch and network namespaces. OVS defines the interfaces and then we can add those interfaces to a namespace.

So first let's add a bridge to OVS:

# ovs-vsctl add-br my-bridge

Now let's add a port on the OVS and make it internal:

# ovs-vsctl add-port my-bridge my-port

# ovs-vsctl set Interface my-port type=internal

And let's connect it into the namespace:

# ip link set my-port netns my-ns

Looking inside the namespace:

# ip netns exec my-ns ifconfig -a

lo        Link encap:Local Loopback

          LOOPBACK  MTU:65536 Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

my-port   Link encap:Ethernet HWaddr 22:04:45:E2:85:21

          BROADCAST  MTU:1500 Metric:1

          RX packets:0 errors:0 dropped:0 overruns:0 frame:0

          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:0

          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

Now we can add more ports to the OVS bridge and connect them to other namespaces or to other devices such as physical interfaces.
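The steps of this walkthrough can be collected into a small script, which makes it easier to repeat the experiment and clean up afterwards. This is a sketch rather than a recommended tool: the function names and the DRY_RUN switch are invented here, and the teardown assumes that deleting the bridge is an acceptable way to remove the port.

```shell
# Hypothetical wrapper: with DRY_RUN set to a non-empty value commands are
# only printed; otherwise they are executed (root and Open vSwitch needed).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Same sequence as in the walkthrough above.
# $1 = namespace, $2 = OVS bridge, $3 = internal port
ns_port_setup() {
    run ip netns add "$1"
    run ovs-vsctl add-br "$2"
    run ovs-vsctl add-port "$2" "$3"
    run ovs-vsctl set Interface "$3" type=internal
    run ip link set "$3" netns "$1"
}

# Undo the setup: remove the bridge (and its ports), then the namespace.
ns_port_teardown() {
    run ovs-vsctl del-br "$2"
    run ip netns delete "$1"
}

# Usage:
#   DRY_RUN=1 ns_port_setup my-ns my-bridge my-port    # print only
#   ns_port_setup my-ns my-bridge my-port              # execute as root
#   ns_port_teardown my-ns my-bridge my-port
```

Running with DRY_RUN=1 first is a cheap way to review the exact commands before touching the network configuration of a node.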

Neutron uses network namespaces to implement network services such as DHCP, routing, gateway, firewall, load balancing and more. In the next post we will go into this in further detail.

Linux Bridge and veth pairs

Linux bridge is used to connect the port from OVS to the VM. Every port goes from the OVS bridge to a Linux bridge and from there to the VM. The reason for using regular Linux bridges is for security groups’ enforcement. Security groups are implemented using iptables and iptables can only be applied to Linux bridges and not to OVS bridges.

Veth pairs are used extensively throughout the network setup in OpenStack and are also a good tool for debugging a network problem. A veth pair is simply a virtual wire, which is why veths always come in pairs. Typically one side of the veth pair connects to a bridge and the other side either connects to another bridge or is simply left as a usable interface.

In this example we will create some veth pairs, connect them to bridges and test connectivity. This example uses a regular Linux server and not an OpenStack node.

Creating a veth pair (note that we define names for both ends):

# ip link add veth0 type veth peer name veth1

# ifconfig -a

.

.

veth0     Link encap:Ethernet HWaddr 5E:2C:E6:03:D0:17
          BROADCAST MULTICAST  MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

veth1     Link encap:Ethernet HWaddr E6:B6:E2:6D:42:B8
          BROADCAST MULTICAST  MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

.

.

To make the example more meaningful we will create the following setup:

veth0 => veth1 => br-eth3 => eth3 ======> eth2 on another Linux server

br-eth3 – a regular Linux bridge which will be connected to veth1 and eth3

eth3 – a physical interface with no IP on it, connected to a private network

eth2 – a physical interface on the remote Linux box connected to the private network and configured with the IP of 50.50.50.51

Once we create the setup we will ping 50.50.50.51 (the remote IP) through veth0 to test that the connection is up:

# brctl addbr br-eth3

# brctl addif br-eth3 eth3

# brctl addif br-eth3 veth1

# brctl show

bridge name     bridge id               STP enabled     interfaces
br-eth3         8000.00505682e7f6       no              eth3
                                                        veth1

# ifconfig veth0 50.50.50.50

# ping -I veth0 50.50.50.51

PING 50.50.50.51 (50.50.50.51) from 50.50.50.50 veth0: 56(84) bytes of data.

64 bytes from 50.50.50.51: icmp_seq=1 ttl=64 time=0.454 ms

64 bytes from 50.50.50.51: icmp_seq=2 ttl=64 time=0.298 ms
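The commands above do not show the interfaces being brought up. As a hedged sketch, the script below repeats the same setup and adds explicit `ip link set ... up` steps, on the assumption that veth1, eth3 and br-eth3 start out down. The /24 prefix length and the helper names run and setup_veth_bridge are also assumptions. Commands are printed by default and only executed with APPLY=1 as root.

```shell
#!/bin/sh
# Print commands by default; set APPLY=1 (as root) to execute them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Hypothetical helper: build  near => far => bridge => phys
setup_veth_bridge() {
    near="$1"; far="$2"; bridge="$3"; phys="$4"; addr="$5"
    run ip link add "$near" type veth peer name "$far"
    run brctl addbr "$bridge"
    run brctl addif "$bridge" "$phys"
    run brctl addif "$bridge" "$far"
    # Every hop has to be up before traffic can flow
    for dev in "$far" "$phys" "$bridge" "$near"; do
        run ip link set "$dev" up
    done
    run ip addr add "$addr" dev "$near"
}

setup_veth_bridge veth0 veth1 br-eth3 eth3 50.50.50.50/24
run ping -c 2 -I veth0 50.50.50.51
```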

When the naming is not as obvious as in the previous example and we don't know which veth interfaces are paired, we can use the ethtool command to figure this out. The ethtool command returns the interface index of the peer, which we can look up using the ip link command. For example:

# ethtool -S veth1

NIC statistics:

peer_ifindex: 12

# ip link

.

.

12: veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
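The two-step lookup above can be sketched as a small script. The function names peer_ifindex, ifname_by_index and find_veth_peer are hypothetical, and the parsing assumes output shaped like the samples shown above. Newer versions of ip may print names such as veth0@veth1, so anything after an `@` is stripped.

```shell
#!/bin/sh
# Read `ethtool -S <veth>` output on stdin and print the peer's interface index.
peer_ifindex() {
    awk -F': *' '/peer_ifindex/ { print $2; exit }'
}

# Read `ip link` output on stdin and print the name of the interface
# whose index is given as the first argument.
ifname_by_index() {
    awk -F': ' -v idx="$1" '$1 == idx { sub(/@.*/, "", $2); print $2; exit }'
}

# Combine both steps for a given veth interface (needs ethtool and ip).
find_veth_peer() {
    idx=$(ethtool -S "$1" | peer_ifindex) || return 1
    ip -o link | ifname_by_index "$idx"
}

# Usage: find_veth_peer veth1
```

Keeping the parsing in separate functions that read from stdin makes it possible to check them against saved command output without touching a live system.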

Summary

That’s all for now. We quickly reviewed OVS, network namespaces, Linux bridges and veth pairs. These components are heavily used in the OpenStack network architecture we are exploring, and understanding them well will be very useful when reviewing the different use cases. In the next post we will look at how the OpenStack network is laid out, connecting the virtual machines to each other and to the external world.

@RonenKofman

Friday May 16, 2014

OpenStack Summit - What a Great Week This Was

Close to 5000 OpenStack developers, customers, vendors and technology enthusiasts met this week in Atlanta, Georgia for the semi-annual OpenStack summit. It was an exciting event; there is something about the OpenStack project that makes people more passionate than any technology I have seen before. The metaphor commonly used at the summit, describing OpenStack as the “Rebel Alliance”, seems to accurately capture the mindset of the OpenStack community. The feeling is that we are making something big and revolutionary, changing the world in a big way. As a technology conference veteran I felt that this one was very different. There is a vibrant discussion inside the OpenStack community about what to do and how to do it. There are different schools of thought but everyone has a common goal: to make OpenStack successful.

The sessions covered every aspect of OpenStack and allowed anyone to learn about what is happening in the different projects and how they will evolve. An interesting aspect of the OpenStack community is its openness: people feel comfortable having a completely open discussion about what works well and what doesn’t, what is better handled differently and what can be improved. Many of the sessions and discussions revolved around genuine analysis of the best path forward, suggesting improvements and changes and encouraging discussion and involvement from the audience.

The future direction of networking in OpenStack came up in many sessions and discussions. While Neutron provides a lot of features and is built in a way that abstracts the networking implementation from the user interface, there are still many challenges to overcome. The main challenges with Neutron are its complexity, which has grown significantly, and scalability. While there are technical challenges to overcome, the community is working together towards a common goal; despite disagreements and different mindsets, everyone is looking at how to make OpenStack better.

Another interesting and completely non-technical session was one by Thierry Carrez, Chair of the Technical Committee and Release Manager of OpenStack. This session discussed the challenges of managing a community and how to create the right governance structure and the right process to assure proper evaluation of ideas and foster a merit-based community. The rapid growth of the OpenStack community, with many participants who have different goals, has made the management of OpenStack a very challenging task in and of itself. You can watch the recording here.

On Tuesday of this week we announced the technology preview for Oracle’s Distribution of OpenStack, which was very well received by our customers, partners and the community. Our goal is simple: work with the OpenStack community to make OpenStack an enterprise grade platform capable of running enterprise applications. We want to work with our partners to integrate their solutions with Oracle’s OpenStack distribution. We are also striving to be open and collaborative, providing full support for Oracle Linux and Oracle VM to anyone who deploys OpenStack, whether from Oracle, DIY or any other OpenStack distribution. More details can be found on Wim’s blog.

It was an exciting week and we are looking forward to the weeks and months ahead, working with our partners and customers to make the Oracle OpenStack distribution available as well as working with the community to make OpenStack better. You are welcome to try the Oracle OpenStack technology preview here and we look forward to your feedback.

@RonenKofman

Tuesday May 13, 2014

Oracle Announces OpenStack Support for Oracle Linux and Oracle VM

Today we are announcing OpenStack support for Oracle Linux and Oracle VM. This support will be included as part of Premier Support for Oracle Linux and Oracle VM at no extra cost.

Existing customers can deploy OpenStack with Oracle Linux and Oracle VM and receive support for the entire software stack. This support includes the base operating system which OpenStack installs on, the OpenStack bits, the hypervisor and the guest operating system with Oracle Linux. This complete end-to-end support, which includes all the components of the stack, is a key requirement for enterprise customers and therefore we will be providing it as part of Oracle Linux / Oracle VM Premier Support.

Today we are releasing a technology preview which allows customers and partners to test OpenStack with Oracle VM and Oracle Linux. The technology preview is available here. Together with the tech preview we are providing a Getting Started guide which walks the user through the steps of setting up an OpenStack deployment; the Getting Started guide is available here. The technology preview and Getting Started guide give users a simple and easy way to deploy OpenStack and take their first steps exploring the features available with it. The technology preview is also suitable for vendors who would like to integrate with OpenStack from Oracle.

As OpenStack matures and evolves it provides a solution which can meet the growing need for standardization in the IT infrastructure. With the wider adoption of OpenStack, customers are looking to use it for a broader set of applications, including enterprise applications. One of our main goals is to work with the community and help OpenStack take the next step into the enterprise application space, allowing customers to deploy enterprise applications in a reliable, supportable manner.

The press release is available here

To those of you at the Atlanta OpenStack conference: we will be happy to discuss this further. We are located at booth D14 and we are looking forward to seeing you there.


About

Ronen is Director of Product Development for Oracle OpenStack. You are welcome to follow Ronen on Twitter at @RonenKofman
