
  • October 11, 2015

How to build a large scale BGP MPLS Service Provider Network and model on AWS – Part 2

Author:
Matt Conran
Matt Conran is a Network Architect based in Ireland and a prolific blogger at Network Insight. In his spare time he writes on topics including SDN, OpenFlow, NFV, OpenStack, cloud, automation and programming.

Service Providers around the globe deploy MPLS networks, using label-switched paths to deliver the quality of service (QoS) and meet the specific SLAs that enterprises demand for their traffic. This is a two-part post: Part 1 introduces MPLS constructs and Part 2 (this post) describes how to set up a fully functional MPLS network using Ravello's Network Smart Lab. Interested in playing with this 14-node Service Provider MPLS deployment? Just add this blueprint from Ravello Repo.

The blueprint uses Multiprotocol Extensions for BGP (MP-BGP) to pass customer routing information, with both Border Gateway Protocol (BGP) route reflection and full mesh designs. The MPLS core is implemented with OSPF as the IGP and the Label Distribution Protocol (LDP) to distribute labels. To transport IPv6 packets over the IPv4 MPLS core, an additional mechanism known as 6PE is implemented on a standalone 6PE route reflector: labels are assigned to IPv6 prefixes so that both IPv4 and IPv6 packets can be label-switched across the IPv4 core.

Creating an MPLS Service Provider Network on Ravello

I decided to use the Cisco CSR1000v to create my MPLS Service Provider network on Ravello, as it supports a strong feature set including MP-BGP, LDP, 6PE, and route reflection. The CSR1000v is a fully featured Layer 3 router and fulfilled all device roles (P, PE, RR and jump) for the MPLS/VPN network.

I created a mini MPLS/VPN network consisting of 2 x P nodes, 8 x PEs, 2 x IPv4 route reflectors, and 1 x 6PE route reflector. A JUMP host was used for device reachability. The P nodes provide core functionality and switch packets based on labels. The PEs accept customer prefixes and peer with either a route reflector or other PE nodes. The route reflectors negate the need for a full mesh in the lower half of the network.

Once the initial design was in place, I was able to drag and drop Cisco CSR1000v VMs to build the 14-node design with great ease. I also expanded the core (on the fly) to a 4-node square design and scaled back down to two nodes for simplicity. This type of elasticity is hard to replicate in the physical world. All devices are accessed from the management JUMP host. A local hosts file allows you to telnet by name rather than IP address, e.g. telnet p1, telnet PE2 (a sketch of such a file follows the table below).

The diagram below displays the physical interface interconnects per device. The core nodes P1 and P2 form the hub for all connections; every node connects to either P1 or P2.

The table below displays the management address for each node, and confirms the physical interconnects.

Device Name   Connecting To                      Mgmt Address
PE1           P1                                 192.168.254.20
PE2           P2                                 192.168.254.21
PE3           P1                                 192.168.254.22
PE4           P2                                 192.168.254.23
PE5           P1                                 192.168.254.24
PE6           P1                                 192.168.254.25
PE7           P2                                 192.168.254.26
PE8           P2                                 192.168.254.27
RR1           P2                                 192.168.254.28
RR2           P2                                 192.168.254.29
RR3           P1                                 192.168.254.31
P1            P2, RR3, PE1, PE3, PE5, PE6        192.168.254.10
P2            P1, RR1, RR2, PE2, PE4, PE7, PE8   192.168.254.12
MGMT          All nodes                          External
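
Based on the management addresses above, the jump host's local hosts file would contain entries along these lines (an illustrative, abbreviated sketch; the blueprint's actual file may differ):

# /etc/hosts on the MGMT jump host (illustrative excerpt)
192.168.254.10   p1
192.168.254.12   p2
192.168.254.20   pe1
192.168.254.21   pe2
192.168.254.28   rr1
192.168.254.31   rr3

With these names resolving locally, telnet pe1 connects to 192.168.254.20 without having to remember any addresses.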

Logical Setup

In this lab, there are two types of BGP route propagation: a) full mesh and b) route reflection.

A full mesh design entails every BGP speaker peering (forming a neighbor relationship) with every other speaker. If for some reason a PE node is left out of the peering, BGP's loop-prevention rules mean it will receive no routes. In a large BGP design, a full mesh creates a lot of BGP neighbor relationships and strains router resources: n speakers require n(n-1)/2 iBGP sessions, so even 8 PEs would need 28 sessions. For demonstration purposes, PE1, PE2, PE3, and PE4 peer directly with each other, creating a BGP full mesh design.

For large BGP networks, designers employ BGP route reflection. In a route reflection design, BGP speakers do not need to peer with each other; instead, each peers directly with a central control-plane point known as a route reflector. This significantly reduces the number of BGP peering sessions per device. For demonstration purposes, PE5, PE6, PE7 and PE8 peer directly with RR1 and RR2 (the IPv4 route reflectors).

In summary, there are two sections of the network. PE1 to PE4 are in the top section and participate in a BGP full mesh design. PE5 to PE8 are in the bottom section and participate in a BGP route reflection design. All PEs are connected to a Provider node, either P1 or P2. The PEs do not have any physical connectivity to each other, but they do have logical connectivity. The top and bottom PEs cannot communicate with each other and have separate VRFs for testing purposes. However, this can be changed by adding additional peerings with RR1 and RR2 or by participating in the BGP full mesh design.

The third BGP route reflector, RR3, serves as the 6PE device. Both PE1 and PE2 peer with the 6PE route reflector for IPv6 connectivity.

The Provider (P) nodes have interconnect addresses to ALL PE and RR nodes, assigned from 172.16.x.x. The P-to-P interconnects are addressed from 10.1.1.x. The IPv4 and IPv6 route reflectors are interconnected to the P nodes, also with 172.16.x.x addresses; they do not have any direct connections to the PE devices. The following screenshot shows how all the Service Provider MPLS network nodes are set up on Ravello.

BGP and MPLS Configuration

PE1, PE2, PE3 and PE4 are configured in a BGP full mesh; each node is a BGP peer of every other. There are three stages to complete this design.

The first stage is to create the BGP neighbor, specify the BGP remote AS number, and set the source of the TCP session. Both BGP neighbors are in the same BGP AS, making the connection an iBGP session rather than an eBGP session. By default, BGP neighbor relationships are not dynamic; neighbors must be explicitly specified on both ends. The remote-as command determines whether the relationship is iBGP or eBGP, and update-source Loopback100 sources the BGP session from the loopback interface.

router bgp 100
 bgp log-neighbor-changes
 neighbor 10.10.10.x remote-as 100
 neighbor 10.10.10.x update-source Loopback100

The second stage is to activate the neighbor under the IPv4 address family.

address-family ipv4
 neighbor 10.10.10.x activate

The third stage is to activate the neighbor under the VPNv4 address family. We also need to make sure we are sending both standard and extended BGP communities, since the extended communities carry the VPN route targets.

address-family vpnv4
 neighbor 10.10.10.x activate
 neighbor 10.10.10.x send-community both
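
Once all three stages are applied on each PE, the VPNv4 session state can be checked with the standard BGP summary command:

show bgp vpnv4 unicast all summary

Each full mesh peer should appear with a prefix count rather than an Idle or Active state.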

A test VRF named PE1 is created to test connectivity across PE1 through PE4. PE1 has a test IP address of 10.10.10.10, PE2 has 10.10.10.20, PE3 has 10.10.10.30 and PE4 has 10.10.10.40. These addresses are reachable through MP-BGP and are present on the top-half PEs. The test interfaces sit within the test VRF and not the global routing table.

interface Loopback10
 ip vrf forwarding PE1
 ip address 10.10.10.x 255.255.255.255
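
For completeness, the VRF referenced above must be defined before the loopback can be placed into it. A minimal sketch, assuming illustrative RD and route-target values of 100:1 (the blueprint's actual values may differ):

ip vrf PE1
 rd 100:1
 route-target export 100:1
 route-target import 100:1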

The diagram below displays the routing table for PE1 and the test results from pinging within the VRF. The VRF creates a routing table separate from the global table, so the ping command must be executed within the VRF instance.
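For example, a VRF-aware ping from PE1 to PE2's test loopback would look like this (addresses taken from the test plan above):

ping vrf PE1 10.10.10.20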

PE5, PE6, PE7 and PE8 are configured as route reflector clients of RR1 and RR2. Each of these PEs has a BGP session to both RR1 and RR2. RR1 and RR2 are BGP route reflectors configured with the same cluster ID for redundancy; to prevent loops, a cluster ID of 1.1.1.1 is implemented. They reflect routes from PE5, PE6, PE7 and PE8, not PE1, PE2, PE3 and PE4.

The main configuration points for a route reflection design are on the actual route reflectors, RR1 and RR2. The configuration commands on the PEs stay the same; the only difference is that they have BGP peerings to the route reflectors and not to each other.

Similar to the PE devices, the route reflector sources the TCP session from Loopback100 and specifies the remote AS number, which determines whether this is an iBGP or eBGP session. The cluster ID is used to prevent loops, as the bottom-half PEs peer with two route reflectors.

router bgp 200
 bgp cluster-id 1.1.1.1
 bgp log-neighbor-changes
 neighbor 10.10.10.x remote-as 200
 neighbor 10.10.10.x update-source Loopback100

The PE neighbor is activated under the IPv4 address family.

address-family ipv4
 neighbor 10.10.10.x activate

Finally, the PE neighbor is activated under the VPNv4 address family. The main difference is that the route-reflector-client command is applied to the neighbor relationships for the PE nodes. This single command enables route reflection capability.

address-family vpnv4
 neighbor 10.10.10.x activate
 neighbor 10.10.10.x send-community both
 neighbor 10.10.10.x route-reflector-client

The following displays PE8's test loopback of 10.10.10.80 within the test VRF PE2. The cluster ID of 1.1.1.1 is present in the BGP table for that VRF.
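
To check this yourself, a per-prefix BGP query of this form displays the path attributes for the route, including the cluster list (VRF name and prefix taken from the text above):

show ip bgp vpnv4 vrf PE2 10.10.10.80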

RR3 is a standalone IPv6 route reflector. It interconnects with P1 using IPv4 addressing, not IPv6. It does not serve the IPv4 address family and is used for IPv6 only. The send-label command labels IPv6 packets for transport over an IPv4-only MPLS core. The command is configured on the PE side under the IPv6 address family.

The following snippet displays the additional configuration on PE1 for 6PE functionality. Note the send-label command.

address-family ipv6
 redistribute connected
 neighbor 10.10.10.11 activate
 neighbor 10.10.10.11 send-community extended
 neighbor 10.10.10.11 send-label

The IPv6 6PE RR has a similar configuration to the IPv4 RRs, except that the IPv6 address family is used instead of the IPv4 address family. Note that the 6PE RR does not have any neighbors activated under the IPv4 address family.

The following snippet displays the configuration on RR3 (the IPv6 6PE route reflector) for the PE1 neighbor relationship.

router bgp 100
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 10.10.10.1 remote-as 100
 neighbor 10.10.10.1 update-source Loopback100
 !
 address-family ipv4
 exit-address-family
 !
 address-family ipv6
  neighbor 10.10.10.1 activate
  neighbor 10.10.10.1 route-reflector-client
  neighbor 10.10.10.1 send-label
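
Once the peering is established, the IPv6 BGP session state on RR3 can be confirmed with the standard summary command:

show bgp ipv6 unicast summary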

RR3 serves only PE1 and PE2 and implements a mechanism known as 6PE. PE1 and PE2 are chosen as they are physically connected to different P nodes. A trace from PE1 to PE2 displays the additional labels added for IPv6 end-to-end reachability: an extra label is assigned to the IPv6 prefix so it can be label-switched across the IPv4 MPLS core. If we had configured a mechanism known as 6VPE (VPNv6), we would see a three-label stack. However, in the current configuration of 6PE (IPv6, not VPNv6) we have two labels: one to reach the remote PE (assigned by LDP) and another for the IPv6 prefix (assigned by BGP). These two labels, label 18 and label 41, are displayed in the diagram below as a two-label stack.
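To view the BGP-assigned labels for IPv6 prefixes on a PE, a command along these lines should work on the CSR1000v (the exact keywords can vary by IOS release, so treat this as an assumption to verify):

show bgp ipv6 unicast labels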

The MPLS core consists of the P1 and P2 Provider nodes. These devices switch packets based on labels and run LDP to each other and to the PE routers. LDP is enabled simply with the mpls ip command under the connecting PE and P interfaces.

interface Gix
 mpls ip
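
With mpls ip applied on both ends of each link, LDP neighbor discovery can be verified with the standard show command:

show mpls ldp neighbor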

OSPF Area 0 is used to pass internal routing information; there are no BGP or customer routes in the core, so the core nodes hold only internal reachability information. There are two ways to configure OSPF to advertise routes, and for demonstration purposes this blueprint uses both.

The snippets below display both ways to configure OSPF: enabled directly under the interface, or configured within the OSPF process.

interface GigabitEthernet
 ip ospf 1 area 0

router ospf 1
 network 0.0.0.0 255.255.255.255 area 0

The image below displays the MPLS forwarding plane via the show mpls forwarding-table command. The table displays the incoming-to-outgoing label allocation for each prefix in the routing table. The outgoing action can be either a POP label or an outgoing label assignment. For example, there is a POP action for the PE1 and PE3 loopbacks, as these two nodes are directly connected to P1. However, for PE2 and PE4, which are connected to the other P node, there are outgoing label actions of 18 and 20.

As discussed, OSPF is the IGP, running in Area 0. The show ip ospf neighbor command displays the OSPF neighbors for P1; it should be adjacent to all the directly connected PEs, P2 and RR3.

The complete configuration for this setup can be accessed at this GitHub account.

Conclusion

This post walks through step-by-step instructions on how to create a 14-node MPLS network using Ravello's Network Smart Lab. Interested in playing with this MPLS core network? Just open a Ravello account and add this fully functional MPLS network blueprint to your library.
