APRESIA Technical Blog

Building VXLAN Routing on Cumulus Linux

This is a follow-up to "Building an IP CLOS with EVPN-VXLAN on Cumulus Linux." This time, we configure VXLAN routing on top of the EVPN-VXLAN fabric.

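About VXLAN Routing

Cumulus Linux supports two VXLAN routing architectures.

  • Centralized routing
    • A single gateway device (or a redundant pair) performs all VXLAN routing
    • The simplest design
    • When routed traffic is mostly east-west rather than north-south, every routed flow still has to pass through the gateway, which adds unnecessary east-west traffic inside the DC

Figure: centralized routing

  • Distributed routing (asymmetric routing, symmetric routing)
    • With an anycast gateway, VXLAN routing is performed on the leaf closest to the host
    • Traffic flows can be optimized better than with centralized routing
    • Both asymmetric routing and symmetric routing are supported

Figure: distributed routing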

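Distributed Routing

Asymmetric routing

  • The ingress VTEP routes between VNIs, and the egress VTEP only bridges.
  • Both the source VNI and the destination VNI must be configured on every leaf, even on a leaf switch that has no hosts in that VLAN/VNI. Each leaf switch therefore has to hold more IP and MAC addresses, so this model scales less well than symmetric routing.
  • Configuring every VNI in the network on every leaf switch does simplify the network configuration and preserves VM mobility.

Symmetric routing

  • Traffic is routed through an L3 VNI.
  • Each leaf switch is configured only with the VNIs of its local hosts, so this model scales better than asymmetric routing.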
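The difference in per-leaf VNI footprint described above can be sketched in a few lines of Python. This is only an illustration: the inventory dictionary and the function names are hypothetical, and the leaf names and VNI numbers are borrowed from the lab built later in this post.

```python
# Hypothetical inventory: L2 VNIs that have locally attached hosts on each leaf.
# Leaf names and VNI numbers mirror the lab configuration in this post.
LOCAL_L2_VNIS = {
    "Leaf01": {10100},
    "Leaf02": {10100, 10101},
}
L3_VNI = 104001  # one L3 VNI for the tenant VRF


def asymmetric_vnis(local_l2_vnis):
    """Asymmetric IRB: every leaf carries every L2 VNI of the tenant."""
    all_vnis = set().union(*local_l2_vnis.values())
    return {leaf: set(all_vnis) for leaf in local_l2_vnis}


def symmetric_vnis(local_l2_vnis, l3_vni):
    """Symmetric IRB: each leaf carries its local L2 VNIs plus the L3 VNI."""
    return {leaf: vnis | {l3_vni} for leaf, vnis in local_l2_vnis.items()}


if __name__ == "__main__":
    print("asymmetric:", asymmetric_vnis(LOCAL_L2_VNIS))
    print("symmetric: ", symmetric_vnis(LOCAL_L2_VNIS, L3_VNI))
```

With only two L2 VNIs the two models need the same number of VNIs on Leaf01. The gap opens up as the tenant grows: the asymmetric set grows with every VNI in the network, while the symmetric set grows only with the VNIs that actually have local hosts.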

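That is enough theory. Let's verify symmetric routing on real hardware!

Network Topology

This is a distributed (symmetric routing) design.
Figure: network topology
  • VXLAN routing design
    • Symmetric IRB routes traffic through the L3 VNI; the L3 VNI (VNI 104001) is configured on every leaf switch
    • The same IP address is configured on each SVI across the leaf switches as the anycast gateway
    • Multi-tenant design using VRFs
    • EVPN Type-5 routes (prefix routes) are used for external connectivity; the border leaf (Exit01) advertises the default route (0.0.0.0/0) to Leaf01 and Leaf02 via EVPN
  • The leaf switches are Edgecore Networks AS5812-54T
  • The spine switches are Edgecore Networks AS5812-54X
  • Exit01 (border leaf) and Router (external connectivity) run Cumulus VX (virtual OS)
  • Cumulus Linux 3.7.3 is used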


Configuration

Leaf01

  • /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback
    address 10.0.0.11/32

# The primary network interface
auto eth0
iface eth0
    address 192.168.100.90/24
    gateway 192.168.100.1

auto swp1
iface swp1
    bridge-vids 100

auto swp49
iface swp49
    link-speed 40000

auto swp50
iface swp50
    link-speed 40000

auto bridge
iface bridge
    bridge-ports swp1 vni-10100 vni-104001
    bridge-vids 100 4001
    bridge-vlan-aware yes

# SVI for VLAN 100: the address is the anycast gateway address; placed in vrf1
auto vlan100
iface vlan100
    address 172.16.100.1/24
    vlan-id 100
    vlan-raw-device bridge
    vrf vrf1

# VLAN for the L3 VNI
auto vlan4001
iface vlan4001
    vlan-id 4001
    vlan-raw-device bridge
    vrf vrf1

# L2 VNI
auto vni-10100
iface vni-10100
    bridge-access 100
    bridge-arp-nd-suppress on
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10100
    vxlan-local-tunnelip 10.0.0.11

# L3 VNI
auto vni-104001
iface vni-104001
    bridge-access 4001
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 104001
    vxlan-local-tunnelip 10.0.0.11

# vrf1
auto vrf1
iface vrf1
    vrf-table auto


  • /etc/frr/frr.conf
frr version 4.0+cl3u9
frr defaults datacenter
hostname Leaf01
username cumulus nopassword
!
service integrated-vtysh-config
!
log syslog
!
# Assign the L3 VNI to vrf1
vrf vrf1
 vni 104001
 exit-vrf
!
interface swp49
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
interface swp50
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
router bgp 65011
 bgp router-id 10.0.0.11
 bgp bestpath as-path multipath-relax
 neighbor FABRIC peer-group
 neighbor FABRIC remote-as external
 neighbor FABRIC bfd
 neighbor FABRIC capability extended-nexthop
 neighbor swp49 interface peer-group FABRIC
 neighbor swp50 interface peer-group FABRIC
 !
 address-family ipv4 unicast
  network 10.0.0.11/32
 exit-address-family
 !
 # EVPN
 address-family l2vpn evpn
  neighbor FABRIC activate
  advertise-all-vni
 exit-address-family
!
line vty
!


Leaf02

  • /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback
    address 10.0.0.12/32

# The primary network interface
auto eth0
iface eth0
    address 192.168.100.91/24
    gateway 192.168.100.1

auto swp1
iface swp1
    bridge-vids 100-101

auto swp49
iface swp49
    link-speed 40000

auto swp50
iface swp50
    link-speed 40000

auto bridge
iface bridge
    bridge-ports swp1 vni-10100 vni-10101 vni-104001
    bridge-vids 100-101 4001
    bridge-vlan-aware yes

# SVI for VLAN 100: the address is the anycast gateway address; placed in vrf1
auto vlan100
iface vlan100
    address 172.16.100.1/24
    vlan-id 100
    vlan-raw-device bridge
    vrf vrf1

# SVI for VLAN 101: the address is the anycast gateway address; placed in vrf1
auto vlan101
iface vlan101
    address 172.16.101.1/24
    vlan-id 101
    vlan-raw-device bridge
    vrf vrf1

# VLAN for the L3 VNI
auto vlan4001
iface vlan4001
    vlan-id 4001
    vlan-raw-device bridge
    vrf vrf1

# L2 VNI
auto vni-10100
iface vni-10100
    bridge-access 100
    bridge-arp-nd-suppress on
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10100
    vxlan-local-tunnelip 10.0.0.12

# L2 VNI
auto vni-10101
iface vni-10101
    bridge-access 101
    bridge-arp-nd-suppress on
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10101
    vxlan-local-tunnelip 10.0.0.12

# L3 VNI
auto vni-104001
iface vni-104001
    bridge-access 4001
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 104001
    vxlan-local-tunnelip 10.0.0.12

# vrf1
auto vrf1
iface vrf1
    vrf-table auto


  • /etc/frr/frr.conf
frr version 4.0+cl3u9
frr defaults datacenter
hostname Leaf02
username cumulus nopassword
!
service integrated-vtysh-config
!
log syslog
!
# Assign the L3 VNI to vrf1
vrf vrf1
 vni 104001
 exit-vrf
!
interface swp49
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
interface swp50
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
router bgp 65012
 bgp router-id 10.0.0.12
 bgp bestpath as-path multipath-relax
 neighbor FABRIC peer-group
 neighbor FABRIC remote-as external
 neighbor FABRIC bfd
 neighbor FABRIC capability extended-nexthop
 neighbor swp49 interface peer-group FABRIC
 neighbor swp50 interface peer-group FABRIC
 !
 address-family ipv4 unicast
  network 10.0.0.12/32
 exit-address-family
 !
 # EVPN
 address-family l2vpn evpn
  neighbor FABRIC activate
  advertise-all-vni
 exit-address-family
!
line vty
!
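The L2 VNI stanzas in the Leaf01 and Leaf02 interfaces files all follow one pattern, so they lend themselves to templating. The Python sketch below renders one such stanza. The function name is hypothetical, and the rule "VNI = 10000 + VLAN ID" is an assumption inferred from the pairs used in this lab (VLAN 100 and VNI 10100, VLAN 101 and VNI 10101), not a Cumulus Linux requirement.

```python
def l2_vni_stanza(vlan_id, local_tunnel_ip, vni_base=10000):
    """Render an ifupdown2 stanza for one L2 VNI.

    vni_base is an assumed numbering convention (VNI = vni_base + VLAN ID).
    """
    vni = vni_base + vlan_id
    lines = [
        f"auto vni-{vni}",
        f"iface vni-{vni}",
        f"    bridge-access {vlan_id}",
        "    bridge-arp-nd-suppress on",
        "    bridge-learning off",
        "    mstpctl-bpduguard yes",
        "    mstpctl-portbpdufilter yes",
        f"    vxlan-id {vni}",
        f"    vxlan-local-tunnelip {local_tunnel_ip}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Leaf02 carries VLANs 100 and 101 and uses 10.0.0.12 as its VTEP address.
    for vlan in (100, 101):
        print(l2_vni_stanza(vlan, "10.0.0.12"))
        print()
```

The L3 VNI stanza differs only in that it omits bridge-arp-nd-suppress, so the same approach would extend to it with one extra parameter.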


Spine01

  • /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback
    address 10.0.0.1/32

# The primary network interface
auto eth0
iface eth0
    address 192.168.100.85/24
    gateway 192.168.100.1

auto swp48
iface swp48

auto swp49
iface swp49
    link-speed 40000

auto swp50
iface swp50
    link-speed 40000


  • /etc/frr/frr.conf
frr version 4.0+cl3u9
frr defaults datacenter
hostname Spine01
username cumulus nopassword
!
service integrated-vtysh-config
!
log syslog
!
interface swp48
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
interface swp49
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
interface swp50
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
router bgp 65020
 bgp router-id 10.0.0.1
 bgp bestpath as-path multipath-relax
 neighbor FABRIC peer-group
 neighbor FABRIC remote-as external
 neighbor FABRIC bfd
 neighbor FABRIC capability extended-nexthop
 neighbor swp48 interface peer-group FABRIC
 neighbor swp49 interface peer-group FABRIC
 neighbor swp50 interface peer-group FABRIC
 !
 address-family ipv4 unicast
  network 10.0.0.1/32
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor FABRIC activate
 exit-address-family
!
line vty
!


Exit01

  • /etc/network/interfaces
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*.intf

# The loopback network interface
auto lo
iface lo inet loopback
    address 10.0.0.41/32

# The primary network interface
auto eth0
iface eth0
    address 192.168.100.94/24
    gateway 192.168.100.1

auto swp1
iface swp1

auto swp2
iface swp2

auto swp3
iface swp3
    bridge-vids 2001

auto bridge
iface bridge
    bridge-ports swp3 vni-104001
    bridge-vids 2001 4001
    bridge-vlan-aware yes

# Interface facing Router
auto vlan2001
iface vlan2001
    address 10.10.10.0/31
    vlan-id 2001
    vlan-raw-device bridge
    vrf vrf1

# VLAN for the L3 VNI
auto vlan4001
iface vlan4001
    vlan-id 4001
    vlan-raw-device bridge
    vrf vrf1

# L3 VNI
auto vni-104001
iface vni-104001
    bridge-access 4001
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 104001
    vxlan-local-tunnelip 10.0.0.41

# vrf1
auto vrf1
iface vrf1
    vrf-table auto


  • /etc/frr/frr.conf
frr version 4.0+cl3u8
frr defaults datacenter
hostname Exit01
username cumulus nopassword
!
service integrated-vtysh-config
!
log syslog
!
vrf vrf1
 vni 104001
 exit-vrf
!
interface swp1
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
interface swp2
 ipv6 nd ra-interval 10
 no ipv6 nd suppress-ra
!
router bgp 65041
 bgp router-id 10.0.0.41
 bgp bestpath as-path multipath-relax
 neighbor FABRIC peer-group
 neighbor FABRIC remote-as external
 neighbor FABRIC bfd
 neighbor FABRIC capability extended-nexthop
 neighbor swp1 interface peer-group FABRIC
 neighbor swp2 interface peer-group FABRIC
 !
 address-family ipv4 unicast
  network 10.0.0.41/32
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor FABRIC activate
  advertise-all-vni
 exit-address-family
!
# BGP peering with Router
router bgp 65041 vrf vrf1
 bgp router-id 10.10.10.1
 neighbor 10.10.10.1 remote-as external
 !
 # Advertise the host routes to Router (summarized as /24 aggregates)
 address-family ipv4 unicast
  aggregate-address 172.16.100.0/24 summary-only
  aggregate-address 172.16.101.0/24 summary-only
 exit-address-family
 !
 # EVPN: advertise only the default gateway route
 address-family l2vpn evpn
  advertise ipv4 unicast route-map DGW
 exit-address-family
!
ip prefix-list DGW seq 5 permit 0.0.0.0/0
!
route-map DGW permit 10
 match ip address prefix-list DGW
!
line vty
!
  • Spine02 is omitted (it differs only in device-specific parameters such as IP address, ASN, and router ID)
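The DGW route-map in Exit01's frr.conf restricts what is exported into EVPN as Type-5 routes. A prefix-list entry without ge/le matches only the exact prefix, and a route-map ends with an implicit deny, so only 0.0.0.0/0 passes. The Python sketch below models that behavior. It is an illustration of the matching logic, not FRR code, and the function name is hypothetical.

```python
import ipaddress

# Mirrors: ip prefix-list DGW seq 5 permit 0.0.0.0/0
# No ge/le is given, so only the exact prefix 0.0.0.0/0 matches.
DGW_PREFIX_LIST = [ipaddress.ip_network("0.0.0.0/0")]


def route_map_dgw_permits(prefix):
    """Return True if the prefix would be exported as an EVPN Type-5 route.

    Models 'route-map DGW permit 10 / match ip address prefix-list DGW'
    followed by the implicit deny at the end of the route-map.
    """
    return ipaddress.ip_network(prefix) in DGW_PREFIX_LIST


if __name__ == "__main__":
    for candidate in ("0.0.0.0/0", "172.16.100.0/24", "10.10.10.0/31"):
        verdict = "permit" if route_map_dgw_permits(candidate) else "deny"
        print(f"{candidate:18} {verdict}")
```

Note that the membership test is an equality check on the whole network, not a containment check. Every IPv4 prefix is contained in 0.0.0.0/0, so a containment check would wrongly permit everything.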


Verification

  • Ping from VM1 to VM2: OK (L2 traffic, VNI-10100)
[root@PCVM-1 ~]# ping 172.16.100.20 -c 1
PING 172.16.100.20 (172.16.100.20) 56(84) bytes of data.
64 bytes from 172.16.100.20: icmp_seq=1 ttl=64 time=1.36 ms

--- 172.16.100.20 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.369/1.369/1.369/0.000 ms

  • Ping from VM1 to VM3: OK (different subnets, between leaves, L3 VNI-104001)
# ping 172.16.101.10 -c 1
PING 172.16.101.10 (172.16.101.10) 56(84) bytes of data.
64 bytes from 172.16.101.10: icmp_seq=1 ttl=62 time=1.34 ms

--- 172.16.101.10 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.340/1.340/1.340/0.000 ms

  • Ping from VM2 to VM3: OK (different subnets, within the same leaf)
# ping 172.16.101.10 -c 1
PING 172.16.101.10 (172.16.101.10) 56(84) bytes of data.
64 bytes from 172.16.101.10: icmp_seq=1 ttl=63 time=1.43 ms

--- 172.16.101.10 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.436/1.436/1.436/0.000 ms
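The TTL values in the three ping replies above also say something about the forwarding path. The sketch below counts routed hops from the observed TTL. It assumes the replying VM sends with an initial TTL of 64, which is consistent with the same-subnet ping returning ttl=64. The dictionary and function names are hypothetical.

```python
INITIAL_TTL = 64  # assumption: initial TTL used by the replying VM

# TTL values copied from the three ping outputs above.
OBSERVED_TTL = {
    "VM1->VM2 (same subnet, L2 VNI)": 64,
    "VM1->VM3 (inter-subnet, between leaves)": 62,
    "VM2->VM3 (inter-subnet, same leaf)": 63,
}


def routed_hops(observed_ttl, initial_ttl=INITIAL_TTL):
    """Each routing hop decrements TTL by one; bridging does not."""
    return initial_ttl - observed_ttl


if __name__ == "__main__":
    for test, ttl in OBSERVED_TTL.items():
        print(f"{test}: {routed_hops(ttl)} routed hop(s)")
```

Two routed hops between leaves is what symmetric routing predicts: the ingress leaf routes into the L3 VNI and the egress leaf routes again into the destination VLAN. Within a single leaf the packet is routed only once, and within one subnet it is only bridged.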

  • BGP route information (Leaf01)
cumulus@Leaf01:~$ net show route vrf vrf1

show ip route vrf vrf1
=======================
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR,
       > - selected route, * - FIB route

VRF vrf1:
B>* 0.0.0.0/0 [20/0] via 10.0.0.41, vlan4001 onlink, 07:50:31
K * 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 03w6d08h
C>* 172.16.100.0/24 is directly connected, vlan100, 03w6d08h
B>* 172.16.100.20/32 [20/0] via 10.0.0.12, vlan4001 onlink, 07:50:31
B>* 172.16.101.10/32 [20/0] via 10.0.0.12, vlan4001 onlink, 07:50:31

cumulus@Leaf01:~$ net show bgp vrf vrf1

show bgp vrf vrf1 ipv4 unicast
==============================
BGP table version is 24, local router ID is 172.16.100.1
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
              i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  0.0.0.0          10.0.0.41                              0 65020 65041 25253 i
*>                  10.0.0.41                              0 65020 65041 25253 i
*  172.16.100.20/32 10.0.0.12                              0 65020 65012 i
*>                  10.0.0.12                              0 65020 65012 i
*  172.16.101.10/32 10.0.0.12                              0 65020 65012 i
*>                  10.0.0.12                              0 65020 65012 i

Leaf01 learns the host routes (172.16.100.20 and 172.16.101.10) from Leaf02 (10.0.0.12) over multiple paths. It likewise learns the default route (0.0.0.0/0) from Exit01 (10.0.0.41) over multiple paths.
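How Leaf01 picks a next hop from this table can be illustrated with a longest-prefix-match lookup. The sketch below is a simplified model, not how the kernel or FRR implements forwarding. The route list is copied from the selected (>) entries of the vrf1 routing table above, and the names are hypothetical.

```python
import ipaddress

# Selected routes from Leaf01's vrf1 table: (prefix, next hop, interface).
# A next hop of None means the prefix is directly connected.
VRF1_ROUTES = [
    ("0.0.0.0/0", "10.0.0.41", "vlan4001"),
    ("172.16.100.0/24", None, "vlan100"),
    ("172.16.100.20/32", "10.0.0.12", "vlan4001"),
    ("172.16.101.10/32", "10.0.0.12", "vlan4001"),
]


def lookup(dst):
    """Return (prefix, next hop, interface) of the longest matching route."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(prefix), next_hop, dev)
        for prefix, next_hop, dev in VRF1_ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    # The default route matches every address, so candidates is never empty.
    net, next_hop, dev = max(candidates, key=lambda route: route[0].prefixlen)
    return str(net), next_hop, dev


if __name__ == "__main__":
    for dst in ("172.16.101.10", "172.16.100.10", "8.8.8.8"):
        print(dst, "->", lookup(dst))
```

Traffic to VM3 (172.16.101.10) matches the /32 host route and leaves through vlan4001, the SVI of the L3 VNI, toward Leaf02. Anything without a more specific route follows the default route toward Exit01.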

  • EVPN route information (Leaf01)
cumulus@Leaf01:~$ net show bgp evpn route type macip
BGP table version is 11, local router ID is 10.0.0.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 10.0.0.11:5
*> [2]:[0]:[0]:[48]:[08:00:27:81:70:77]
                    10.0.0.11                          32768 i
*> [2]:[0]:[0]:[48]:[08:00:27:81:70:77]:[32]:[172.16.100.10]
                    10.0.0.11                          32768 i
Route Distinguisher: 10.0.0.12:6
*  [2]:[0]:[0]:[48]:[08:00:27:99:55:95]
                    10.0.0.12                              0 65020 65012 i
*> [2]:[0]:[0]:[48]:[08:00:27:99:55:95]
                    10.0.0.12                              0 65020 65012 i
*  [2]:[0]:[0]:[48]:[08:00:27:99:55:95]:[32]:[172.16.101.10]
                    10.0.0.12                              0 65020 65012 i
*> [2]:[0]:[0]:[48]:[08:00:27:99:55:95]:[32]:[172.16.101.10]
                    10.0.0.12                              0 65020 65012 i
Route Distinguisher: 10.0.0.12:7
*  [2]:[0]:[0]:[48]:[08:00:27:8e:a4:c7]
                    10.0.0.12                              0 65020 65012 i
*> [2]:[0]:[0]:[48]:[08:00:27:8e:a4:c7]
                    10.0.0.12                              0 65020 65012 i
*  [2]:[0]:[0]:[48]:[08:00:27:8e:a4:c7]:[32]:[172.16.100.20]
                    10.0.0.12                              0 65020 65012 i
*> [2]:[0]:[0]:[48]:[08:00:27:8e:a4:c7]:[32]:[172.16.100.20]
                    10.0.0.12                              0 65020 65012 i

Displayed 6 prefixes (10 paths) (of requested type)

The output shows that host information (MAC + IP) is learned through EVPN Type-2 routes.

cumulus@Leaf01:~$ net show bgp evpn route type prefix
BGP table version is 53, local router ID is 10.0.0.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 10.10.10.1:2
*  [5]:[0]:[0]:[0]:[0.0.0.0]
                    10.0.0.41                              0 65020 65041 25253 i
*> [5]:[0]:[0]:[0]:[0.0.0.0]
                    10.0.0.41                              0 65020 65041 25253 i

The output shows that the prefix (0.0.0.0/0) is learned through an EVPN Type-5 route.
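The bracketed route strings in the two EVPN outputs follow the legend printed at the top of each command. The sketch below splits such a string into named fields for Type-2 and Type-5 routes. It is a convenience parser written for this post's output only: the function name and the dictionary keys are hypothetical, and other route types are deliberately rejected.

```python
import re

# Each field is wrapped in [...]; MAC addresses contain ':' themselves,
# so the string is split on the brackets rather than on ':'.
FIELD_RE = re.compile(r"\[([^\]]*)\]")


def parse_evpn_route(text):
    """Parse a Type-2 or Type-5 route string as printed in the output above."""
    fields = FIELD_RE.findall(text)
    route_type = int(fields[0])
    if route_type == 2:
        # [2]:[ESI]:[EthTag]:[MAClen]:[MAC] optionally followed by :[IPlen]:[IP]
        route = {"type": 2, "esi": fields[1], "eth_tag": fields[2],
                 "mac": fields[4], "ip": None}
        if len(fields) == 7:
            route["ip"] = fields[6]
        return route
    if route_type == 5:
        # [5]:[ESI]:[EthTag]:[IPlen]:[IP]
        return {"type": 5, "esi": fields[1], "eth_tag": fields[2],
                "prefix": f"{fields[4]}/{fields[3]}"}
    raise ValueError(f"unsupported EVPN route type: {route_type}")


if __name__ == "__main__":
    print(parse_evpn_route("[2]:[0]:[0]:[48]:[08:00:27:81:70:77]:[32]:[172.16.100.10]"))
    print(parse_evpn_route("[5]:[0]:[0]:[0]:[0.0.0.0]"))
```

A MAC-only Type-2 route has five fields and a MAC+IP route has seven, which is how the parser tells the two apart.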

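Summary

With the symmetric routing design, we confirmed the following:

  • Coexistence of L2 VNIs (bridged traffic) and an L3 VNI (routed traffic)
  • Anycast gateway
  • EVPN Type-2 routes (host routes) and EVPN Type-5 routes (prefixes)

This post concludes a three-part series, following "Building an IP CLOS on Cumulus Linux" and "Building an IP CLOS with EVPN-VXLAN on Cumulus Linux." Thank you for reading to the end.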
