I have been messing around with my new Nexus 9000v switches and wanted to have a crack at setting up VXLAN using MP-BGP EVPN as the control plane. There is a lot of literature available on the topic; however, the topology I had in mind didn’t seem to be covered in any detail. I wanted to use it as a simple two-site DCI solution to stretch some VLANs, with just a pair of vPC-connected leafs in each site. I didn’t want to use multicast for BUM (broadcast, unknown unicast and multicast) traffic, and I wanted to keep the BGP configuration as simple as possible. After doing a lot of reading and piecing together various configuration snippets, this is what I came up with. It is based upon two sites with a pair of port-based point-to-point Ethernet services connecting them, and uses ingress replication instead of multicast to handle BUM traffic. This type of use case would be pretty common in the small to medium enterprise world. There is a good guide available here which gives an overview of VXLAN BGP EVPN.
This post assumes you already have a pair of Nexus 9Ks configured with vPC in each site.
The first step is to enable all of the required features on the switches with the following commands:
    feature ospf
    feature bgp
    feature udld
    feature interface-vlan
    feature vn-segment-vlan-based
    feature nv overlay
    nv overlay evpn
Now we need to set up the L3 underlay to provide basic connectivity and IGP routing between all of the switches. My protocol of choice is OSPF; however, IS-IS and other IGPs will work fine.
Start by configuring a loopback interface with a unique IP for each switch.
    interface loopback0
      ip address 172.16.1.1/32
Next we configure OSPF using the loopback address as the router-id.
    router ospf L3Core
      router-id 172.16.1.1
      log-adjacency-changes
      passive-interface default
And then we add the loopback interface to the OSPF instance.
    interface loopback0
      ip address 172.16.1.1/32
      ip router ospf L3Core area 0.0.0.0
Now we need to configure L3 interfaces on our switches to terminate the inter-site port services and the connections between the switches within each DC. I used additional L3 connections between the switches within each site. You could also use an SVI and the existing vPC peer-link to connect them; you just need to make sure you allow the VLAN across the peer-link and do NOT configure that VLAN on any other ports (i.e. the VLAN must be dedicated to the L3 connection only). Notice the MTU is set to 9216. It is best to set this as high as the underlying network allows, as the VXLAN encapsulation adds roughly 50 bytes of outer headers to each packet.
    interface Ethernet1/3
      description L3 to SW-NEXUS03
      no switchport
      mtu 9216
      ip address 10.10.0.97/30
      ip ospf dead-interval 2
      ip ospf hello-interval 1
      ip ospf network point-to-point
      no ip ospf passive-interface
      ip router ospf L3Core area 0.0.0.0
      no shutdown
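To make the MTU point concrete, here is a minimal Python sketch of the encapsulation arithmetic. It assumes an IPv4 underlay with no IP options and the standard VXLAN header layout; the function names are my own, and exactly how a given platform counts its interface MTU is something to verify rather than take from this sketch.

```python
# Header sizes for VXLAN over an IPv4 underlay (standard layout, no IP options).
OUTER_ETHERNET = 14  # outer MAC header, untagged
OUTER_IPV4 = 20      # outer IPv4 header
OUTER_UDP = 8        # outer UDP header
VXLAN_HEADER = 8     # VXLAN header carrying the 24-bit VNI
INNER_ETHERNET = 14  # MAC header of the original (inner) frame, untagged
DOT1Q_TAG = 4        # optional 802.1Q tag


def vxlan_overhead(outer_dot1q: bool = False) -> int:
    """Bytes placed on the wire in front of the original Ethernet frame."""
    overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
    return overhead + (DOT1Q_TAG if outer_dot1q else 0)


def outer_ip_packet_size(host_mtu: int, inner_dot1q: bool = False) -> int:
    """Size of the outer IP packet needed to carry a host packet of host_mtu bytes."""
    inner_frame = host_mtu + INNER_ETHERNET + (DOT1Q_TAG if inner_dot1q else 0)
    return inner_frame + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4


if __name__ == "__main__":
    print("overhead:", vxlan_overhead())
    for mtu in (1500, 9000):
        print(f"host MTU {mtu} -> outer IP packet of {outer_ip_packet_size(mtu)} bytes")
```

With an underlay MTU of 9216, even hosts using a 9000-byte MTU fit comfortably. On a 1500-byte underlay, the host MTU would have to be reduced instead.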
Once you have finished configuring all of the L3 interconnects, it should resemble the below diagram. If you do a show ip route you should see all of the IPs listed.
That’s the underlay sorted, now it is time for the overlay.
We start with an additional loopback interface for each switch. As we are using vPC, this interface will have two IP addresses. The first is unique to each switch; the second must be the same on both switches in a vPC pair and will be used as the VTEP (VXLAN tunnel endpoint) address.
    interface loopback1
      ip address 192.168.1.1/32
      ip address 192.168.1.10/32 secondary
      ip router ospf L3Core area 0.0.0.0
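The addressing rule for loopback1 (unique primary per switch, shared secondary per vPC pair) is easy to get wrong when copying config between switches, so here is a small Python sketch that checks a plan before you apply it. Only 192.168.1.1 and 192.168.1.10 come from the config above. The switch-to-site mapping and every other address are hypothetical values I made up for illustration.

```python
from collections import defaultdict

# Hypothetical address plan. Only 192.168.1.1 / 192.168.1.10 appear in the post;
# the switch names per site and the remaining addresses are assumptions.
LOOPBACK1_PLAN = {
    "SW-NEXUS01": {"site": "A", "primary": "192.168.1.1", "secondary": "192.168.1.10"},
    "SW-NEXUS02": {"site": "A", "primary": "192.168.1.2", "secondary": "192.168.1.10"},
    "SW-NEXUS03": {"site": "B", "primary": "192.168.2.1", "secondary": "192.168.2.10"},
    "SW-NEXUS04": {"site": "B", "primary": "192.168.2.2", "secondary": "192.168.2.10"},
}


def check_vtep_plan(plan: dict) -> list:
    """Return a list of problems found in the loopback1 address plan."""
    errors = []

    # Rule 1: the primary address is unique to each switch.
    primaries = [sw["primary"] for sw in plan.values()]
    if len(set(primaries)) != len(primaries):
        errors.append("primary loopback1 addresses must be unique per switch")

    # Rule 2: both vPC peers in a site share one secondary (VTEP) address.
    secondaries_by_site = defaultdict(set)
    for sw in plan.values():
        secondaries_by_site[sw["site"]].add(sw["secondary"])
    for site, secondaries in sorted(secondaries_by_site.items()):
        if len(secondaries) != 1:
            errors.append(f"site {site}: vPC peers must share one secondary address")

    # Rule 3: different vPC pairs must not reuse the same VTEP address.
    all_secondaries = [ip for ips in secondaries_by_site.values() for ip in ips]
    if len(set(all_secondaries)) != len(all_secondaries):
        errors.append("each vPC pair needs its own VTEP address")

    return errors


if __name__ == "__main__":
    problems = check_vtep_plan(LOOPBACK1_PLAN)
    print("plan OK" if not problems else "\n".join(problems))
```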
The next step is to map the VLANs that we wish to stretch between the data centres to VXLAN VNIDs (VXLAN Network Identifiers).
    vlan 100
      name Servers
      vn-segment 1100
    vlan 200
      name Management
      vn-segment 1200
    vlan 300
      name Backups
      vn-segment 1300
Now we configure BGP. We will be using the loopback0 address for the router-id and defining each of the other switches as an iBGP neighbor (a full mesh within AS 100), so the config on each switch will be slightly different.
    router bgp 100
      router-id 172.16.1.1
      address-family ipv4 unicast
      address-family l2vpn evpn
      neighbor 172.16.2.2 remote-as 100
        update-source loopback0
        address-family ipv4 unicast
        address-family l2vpn evpn
          send-community
          send-community extended
      neighbor 172.16.3.3 remote-as 100
        update-source loopback0
        address-family ipv4 unicast
        address-family l2vpn evpn
          send-community
          send-community extended
      neighbor 172.16.4.4 remote-as 100
        update-source loopback0
        address-family ipv4 unicast
        address-family l2vpn evpn
          send-community
          send-community extended
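Since the BGP stanza differs on each switch only in the router-id and in which three peers are listed, it can be generated rather than hand-edited. The following Python sketch does that using the four loopback0 addresses from the config above. The function name and layout are my own, and the output is meant to be reviewed before pasting, not treated as an official template.

```python
# loopback0 addresses of the four switches, as used in the BGP config above.
ROUTER_IDS = ["172.16.1.1", "172.16.2.2", "172.16.3.3", "172.16.4.4"]
LOCAL_AS = 100


def bgp_config(local_id: str, all_ids=ROUTER_IDS, asn: int = LOCAL_AS) -> str:
    """Build the full-mesh iBGP EVPN stanza for the switch owning local_id."""
    if local_id not in all_ids:
        raise ValueError(f"{local_id} is not one of the known router IDs")

    lines = [
        f"router bgp {asn}",
        f"  router-id {local_id}",
        "  address-family ipv4 unicast",
        "  address-family l2vpn evpn",
    ]
    # One neighbor block for every switch except ourselves.
    for peer in all_ids:
        if peer == local_id:
            continue
        lines += [
            f"  neighbor {peer} remote-as {asn}",
            "    update-source loopback0",
            "    address-family ipv4 unicast",
            "    address-family l2vpn evpn",
            "      send-community",
            "      send-community extended",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    for router_id in ROUTER_IDS:
        print(f"! --- switch with loopback0 {router_id} ---")
        print(bgp_config(router_id))
        print()
```

A full mesh is fine for four switches. With many more, the number of neighbor blocks grows quickly and route reflectors become the more common design.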
And now EVPN:
    evpn
      vni 1100 l2
        rd auto
        route-target import auto
        route-target export auto
      vni 1200 l2
        rd auto
        route-target import auto
        route-target export auto
      vni 1300 l2
        rd auto
        route-target import auto
        route-target export auto
The last step is the NVE interface.
    interface nve1
      no shutdown
      source-interface loopback1
      host-reachability protocol bgp
      source-interface hold-down-time 30
      member vni 1100
        ingress-replication protocol bgp
      member vni 1200
        ingress-replication protocol bgp
      member vni 1300
        ingress-replication protocol bgp
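Each stretched VLAN shows up in three places: the VLAN-to-VNI mapping, the evpn block, and the nve1 member list. If you add VLANs later, it is easy to miss one of them. Here is a Python sketch that renders all three sections from a single table so they stay in step. The table holds the values from this post; the function name and the duplicate-VNI check are my own additions.

```python
# VLAN ID -> (name, VNI), taken from the VLAN mapping in this post.
STRETCHED_VLANS = {
    100: ("Servers", 1100),
    200: ("Management", 1200),
    300: ("Backups", 1300),
}


def overlay_config(vlans=STRETCHED_VLANS) -> str:
    """Render the vlan, evpn and nve1 sections from one VLAN/VNI table."""
    vnis = [vni for _name, vni in vlans.values()]
    if len(set(vnis)) != len(vnis):
        raise ValueError("each VLAN must map to its own VNI")

    ordered = sorted(vlans.items())
    lines = []

    # Section 1: VLAN to VNI mapping.
    for vlan_id, (name, vni) in ordered:
        lines += [f"vlan {vlan_id}", f"  name {name}", f"  vn-segment {vni}"]

    # Section 2: EVPN instance per L2 VNI.
    lines.append("evpn")
    for _vlan_id, (_name, vni) in ordered:
        lines += [
            f"  vni {vni} l2",
            "    rd auto",
            "    route-target import auto",
            "    route-target export auto",
        ]

    # Section 3: NVE interface with ingress replication per VNI.
    lines += [
        "interface nve1",
        "  no shutdown",
        "  source-interface loopback1",
        "  host-reachability protocol bgp",
        "  source-interface hold-down-time 30",
    ]
    for _vlan_id, (_name, vni) in ordered:
        lines += [f"  member vni {vni}", "    ingress-replication protocol bgp"]

    return "\n".join(lines)


if __name__ == "__main__":
    print(overlay_config())
```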
If you run show nve peers you should now see the IP of the other VTEP.
Hosts connected to one of the stretched VLANs in DC A should now be able to communicate with hosts on the same VLAN in DC B.
Job done!
Nice Lab!
I did almost exactly the same lab in GNS3 before finding your page; the only difference is I’m using multicast for BUM. Now I have also tried your config. The only issue I see is that when using iperf to send multicast from an Ubuntu server at one site, I receive duplicates at the other side. I wonder if this is due to my setup or an actual problem? I use 7.0(3)I7(3) on the 9000v.
Thanks
Nikas
Hey Niklas
I don’t recall experiencing that issue with my setup. When I next run it up I will check it out.
Regards
Kirin
Would it be possible to have the full config file?
thx
Is the secondary link between the DCs down, or is there any ECMP?
The secondary link is there as an alternate route in case the primary goes down. For the use case I had, I needed to steer traffic over the primary unless there was an outage. This was done by adding cost to the route over the secondary link, so that the switches directly connected to the secondary prefer reaching the primary via their local L3 point-to-point link. I am not sure if you could do true ECMP, as I am not fully across the behaviour of the VTEP in a vPC situation (i.e. whether the VTEP interface is available simultaneously on both switches in the vPC pair). Might be something worth testing out!
Regards
Kirin
Hi Kirin,
If I have two Nexus switches at each site, do all of the switches have to be interconnected? For example, does SW-NEXUS01 at one DC site have to connect to both SW-NEXUS03 and SW-NEXUS04? Just a clarification.
Hi Ashley
Each switch participating in the overlay needs to be able to reach every other switch via the underlay. The network topology doesn’t have to look like the diagram in the post, but there does need to be Layer 3 connectivity between all of the switches.
Regards
Kirin
Hello all
Could you please help me with some details? I have a similar topology, but instead of two vPC pairs I have a vPC pair in my primary DC and only one Nexus (not vPC) in my secondary DC. Could I configure VXLAN between my DCs?