
Now that you have hardware for your SOHO appliance, have installed an operating system, and have prepared a systemd init script so that freeRouter resumes operation after an unexpected outage (power cut, reset button, etc.), let's proceed to the RARE/freeRouter installation itself.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

When installing RARE/freeRouter on x86, you have 2 choices:

  • installation with a software dataplane
  • installation with a DPDK dataplane


In this case, we will use a DPDK dataplane installation, as our hardware meets the requirements listed below.

DPDK requirements

  • CPU with SSE4 support
  • DPDK compatible NIC 

Note that freeRouter is available wherever a JVM is available:

  • x86
  • ARM

Article objective

In this article we will continue the SOHO network appliance installation based on the diagram below, and install freeRouter with a DPDK dataplane. In this situation, the appliance sits behind the ISP FTTH box demarcation point, as is typical of French domestic FTTH deployments.

Deployment consideration

In this case, RARE/freeRouter is connected to an ISP box demarcation point that delivers copper connectivity. Depending on your context, nothing prevents you from deploying similar equipment with SFP uplinks directly connected to your Provider Edge backbone routers, if you also own the dark fiber paths local to the MAN.

Diagrams

[ #003 ] - RARE/freeRouter DPDK SOHO installation 

 Requirements

  • Own hardware similar to the one described in SOHO #001
  • Have an operating system installed with a Java Runtime Environment
  • Have systemd configured so that RARE/freeRouter can take over networking at each reboot, as described in SOHO #002

 IPv4 addressing plan

Let's consider the following assumptions:

  • The ISP box comes with the 192.168.0.0/24 subnet configured at its RJ45 demarcation point
  • Home networks will live within 192.168.128.0/17
  • 192.168.128.0/17 will be subnetted further into multiple /24 networks in order to accommodate home network requirements
  • RARE/freeRouter is connected to the FTTH ISP box via appliance port #1, which is DPDK port #0 (interface sdn1), and will have an IP address within 192.168.0.0/24
  • Home traffic going to the outside world will be subject to port address translation (NAT/PAT) using an IPv4 address within the ISP subnet range
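
The addressing assumptions above can be checked with a few lines of Python. This is only a minimal sketch using the standard ipaddress module; the variable and function names are illustrative and not part of freeRouter.

```python
import ipaddress

# Values taken from the assumptions listed above.
HOME_SUPERNET = ipaddress.ip_network("192.168.128.0/17")
ISP_SUBNET = ipaddress.ip_network("192.168.0.0/24")


def home_subnets(prefix_len=24):
    """Return every /prefix_len subnet carved out of the home supernet."""
    return list(HOME_SUPERNET.subnets(new_prefix=prefix_len))


subnets = home_subnets()
print(len(subnets))                        # 128 possible /24 home networks
print(subnets[0], subnets[-1])             # 192.168.128.0/24 192.168.255.0/24
print(HOME_SUPERNET.overlaps(ISP_SUBNET))  # False: home and ISP ranges are disjoint
```

The last check matters for the NAT/PAT step: the home supernet must not overlap the ISP subnet that the uplink interface lives in.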

The IPv6 addressing plan has not been forgotten. It is left out here on purpose in order not to complicate the explanations. IPv6 will be the subject of further articles. It is not that IPv6 is a complex topic, it is just that it deserves special attention. You might not have realised it, but IPv6 is everywhere and is used by default between peers as soon as IPv6 is enabled. So IMHO we need to get used to it as soon as possible, especially if you are a network administrator.

 Create configuration files for RARE/freerouter

freeRouter uses 2 configuration files in order to run. Let's write these configuration files in /rtr.

freeRouter hardware configuration file: rtr-hw.txt
hwid j1900-i211
! cpu_port
int eth0 eth - 127.0.0.1 20001 127.0.0.1 20002
! freerouter control port for message packet-in/out in P4 VRF _ONLY_
tcp2vrf 9080 p4 9080
! freerouter local access in p4 VRF _ONLY_
tcp2vrf 2323 p4 23
! launch a process called "veth0" that actually links to veth0b
! cmd for control plane/dataplane communication unified messaging: ip link add veth0a type veth peer name veth0b
! cmd for appliance Linux access: ip link add veth1a type veth peer name veth1b
! cmd for integrated wifi: ip link add veth2a type veth peer name veth2b
! external wifi AP
proc hostapd /usr/sbin/hostapd /etc/hostapd/hostapd.conf
! integrated wifi AP
proc wlan /rtr/pcap2pcap.bin wlan0 veth2a
! DP/CP communication process
proc veth0 /rtr/pcapInt.bin veth0a 20002 127.0.0.1 20001 127.0.0.1
! DP DPDK process
proc p4emu /rtr/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b --vdev=net_af_packet1,iface=veth2b --vdev=net_af_packet2,iface=veth1b 127.0.0.1 9080 6

Note:

Let's spend some time on this hardware configuration file. As you might have noticed, a few additional lines are worth mentioning:

  • Lines starting with an exclamation mark "!" are comments
  • hwid is a free-text field that simply designates the hardware on which freeRouter is running (it appears in the output of: show platform)
  • proc <process-name>

freeRouter can launch processes at startup. We use this feature here to start the control plane / dataplane communication over the veth pair veth0a and veth0b, as well as the P4Emu/dpdk packet processing backend, p4dpdk.bin.

  • proc p4emu /rtr/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b --vdev=net_af_packet1,iface=veth2b --vdev=net_af_packet2,iface=veth1b 127.0.0.1 9080 6

In DPDK, interfaces are by default given port_ids that are allocated sequentially, in their order of appearance in the dpdk-devbind.py --status output, which is usually sorted by pci_id. With the output below, interface enp1s0 has port_id #0 and is known to DPDK as pci_id 01:00.0:

  • enp1s0 would be: #0 with pci_id: 01:00.0
  • enp2s0 would be: #1 with pci_id: 02:00.0
  • enp5s0 would be: #2 with pci_id: 05:00.0
  • enp6s0 would be: #3 with pci_id: 06:00.0
  • enp7s0 would be: #4 with pci_id: 07:00.0
  • enp8s0 would be: #5 with pci_id: 08:00.0

DPDK diagnosis
dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:01:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:02:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:05:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:06:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:07:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:08:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb

Network devices using kernel driver
===================================
0000:09:00.0 'AR928X Wireless Network Adapter (PCI-Express) 002a' if=wlan0 drv=ath9k unused=uio_pci_generic 

No 'Baseband' devices detected
==============================

Other Crypto devices
====================
0000:00:1a.0 'Atom Processor Z36xxx/Z37xxx Series Trusted Execution Engine 0f18' unused=uio_pci_generic

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
===================================
  • DPDK --vdev additions. Here we instruct DPDK to take into account the additional veth endpoints we created, respectively, for:
    • Control plane / data plane communication
    • Linux out-of-band management access via SSH, which we installed previously during Debian package installation
    • integrated hardware WIFI access point
  • DPDK vdev interfaces get their port_id in order of appearance on the command line:
    • DP/CP communication: 6 ↔ veth0b
    • integrated WIFI: 7 ↔ veth2b
    • Linux out-of-band management access: 8 ↔ veth1b
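
The port numbering described above can be sketched in a few lines of Python. This is not how DPDK is implemented, just an illustration of the rule stated in this article: physical NICs are numbered in sorted pci_id order, then --vdev devices follow in command-line order. The function name is hypothetical.

```python
def dpdk_port_ids(pci_addresses, vdev_ifaces):
    """Map port_id -> device, following the ordering rule described in the article.

    Assumption: PCI NICs come first, sorted by PCI address, and --vdev
    interfaces are appended in their order on the command line.
    """
    ports = {}
    for port_id, pci in enumerate(sorted(pci_addresses)):
        ports[port_id] = pci
    offset = len(ports)
    for index, iface in enumerate(vdev_ifaces):
        ports[offset + index] = iface
    return ports


# PCI addresses from the dpdk-devbind.py --status output above.
pci = [
    "0000:01:00.0", "0000:02:00.0", "0000:05:00.0",
    "0000:06:00.0", "0000:07:00.0", "0000:08:00.0",
]
# --vdev interfaces in the order used on the p4dpdk.bin command line.
vdevs = ["veth0b", "veth2b", "veth1b"]

port_map = dpdk_port_ids(pci, vdevs)
for port_id, device in port_map.items():
    print(port_id, device)
```

With six physical ports, veth0b lands on port 6, which matches the trailing 6 passed to p4dpdk.bin in rtr-hw.txt for the DP/CP communication port.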

The external WIFI access point will be bound directly to an interface of the appliance via DPDK. This will be described in future articles.

freeRouter software configuration file: rtr-sw.txt
hostname mjolnir
buggy
!
!
vrf definition inet
 exit
!
vrf definition p4
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth0a]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@DPDK[port-1]
 mtu 1500
 vrf forwarding inet
 ipv4 address 192.168.0.90 255.255.255.0
 no shutdown
 no log-link-change
 exit
!
interface sdn2
 description freerouter@DPDK[port-2]
 mtu 1500
 shutdown
 no log-link-change
 exit
!
interface sdn3
 description freerouter@DPDK[port-3]
 mtu 1500
 shutdown
 no log-link-change
 exit
!
interface sdn4
 description freerouter@DPDK[port-4]
 mtu 1500
 shutdown
 no log-link-change
 exit
!
interface sdn5
 description freerouter@DPDK[port-5]
 mtu 1500
 shutdown
 no log-link-change
 exit
!
interface sdn6
 description freerouter@DPDK[port-6]
 mtu 1500
 shutdown
 no log-link-change
 exit
!
interface sdn998
 description freerouter@DPDK[port-7 --> veth2a] integrated wifi
 mtu 1500
 shutdown
 no log-link-change
 exit
!
interface sdn999
 description freerouter@OOBM[port-8 --> veth1a] Linux management
 mtu 1500
 vrf forwarding inet
 ipv4 address 192.168.128.1 255.255.255.0
 no shutdown
 no log-link-change
 exit

server telnet telnet
 security protocol telnet
 no exec authorization
 no login authentication
 vrf p4
 exit
!
server p4lang p4
 export-vrf inet 1
 export-port sdn1 0 1 0 0 0
 export-port sdn2 1 1 0 0 0
 export-port sdn3 2 1 0 0 0
 export-port sdn4 3 1 0 0 0
 export-port sdn5 4 1 0 0 0
 export-port sdn6 5 1 0 0 0
 export-port sdn998 7 1 0 0 0
 export-port sdn999 8 1 0 0 0
 interconnect ethernet0
 vrf p4
 exit
!
!
end
  • For now the integrated wifi is shut down. We will see in a later article how to activate it
  • At Linux level, as you may have noticed in the previous article:
    • the management IP subnet is 192.168.128.0/24, so the OOBM appliance IP is 192.168.128.254
appliance management IP@Linux level (check previous article)
ip addr flush dev veth1a
ip addr add 192.168.128.254/24 dev veth1a
  • The management IP seen from freeRouter@sdn999 is 192.168.128.1, within 192.168.128.0/24
  • A static default route is configured on the Linux side
Add default route to OOBM sdn999@Linux level (check previous article)
route add default gw 192.168.128.1

Security note

  • Notice that the p4lang server runs in the p4 VRF
    • This VRF has no interface bound to it
    • It is therefore isolated from the other VRFs
  • As a result, only the local Linux host can take part in control plane / dataplane communication

Verification

 connectivity check from freeRouter to ISP IPv4 gateway
ping ISP demarcation point IP
ping 192.168.0.254 /vrf inet /interface sdn1                           
pinging 192.168.0.254, src=192.168.0.90, vrf=inet, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=0/0/1/4
ARP discovery
mjolnir#sh ipv4 arp sdn1                                                       
mac             address        time      static
0024.d4a0.0cd3  192.168.0.254  00:00:20  false
 Check freeRouter interface configuration
sdn1 interface status
sh int sdn1                                                          
sdn1 is up (since 13:14:14, 2 changes)
 description: mjolnir@LAN1[01:00.0]
 type is sdn, hwaddr=003b.7671.764f, mtu=1500, bw=8000kbps, vrf=inet
 ip4 address=192.168.0.90/24, netmask=255.255.255.0, ifcid=10014
 received 64038 packets (17841459 bytes) dropped 4 packets (326 bytes)
 transmitted 250217 packets (38032822 bytes) promisc=false macsec=false
interface summary
show interfaces summary                                                
interface    state  tx        rx        drop
ethernet0    up     74690935  51798769  0
sdn1         up     37954707  17828649  326
sdn2         admin  0         0         0
sdn3         admin  0         0         0
sdn4         admin  0         0         0
sdn5         admin  0         0         0
sdn6         admin  0         0         0
sdn998       admin  0         0         0
sdn999       up     23646     17904     0
interface summary
interface   state  tx          rx          drop
sdn1        up     674397352   3883928390  948
sdn2        admin  0           0           0
sdn3        admin  0           0           0
sdn4        admin  0           0           0
sdn5        admin  0           0           0
sdn6        admin  0           0           0
sdn998      admin  0           0           0
sdn999      up     110520      85072       0
 Check freeRouter CLI access via localhost

Check Linux appliance local routes

From linux terminal
root@mjolnir:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.128.1   0.0.0.0         UG    0      0        0 veth1a

Test local telnet access from linux/localhost
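
As a quick sanity check before opening a telnet session, you can verify that the local port accepts TCP connections. The sketch below assumes the "tcp2vrf 2323 p4 23" line from rtr-hw.txt, so that 127.0.0.1:2323 leads to the telnet server in the p4 VRF. It only tests that the port is open, not that a login succeeds, and the function name is illustrative.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, etc.
        return False


if __name__ == "__main__":
    # 2323 is the local port taken from "tcp2vrf 2323 p4 23" in rtr-hw.txt.
    print(tcp_port_open("127.0.0.1", 2323))
```

If this prints True, "telnet 127.0.0.1 2323" from the appliance's Linux shell should reach the freeRouter CLI.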

Conclusion

In this article

  • we finally launched RARE/freeRouter with a DPDK dataplane
  • configured RARE/freeRouter with a vanilla config that takes into account all the appliance physical interfaces
  • added veth pairs to the config in order to take into account:
    • Control plane / data plane communication
    • Linux OOBM
    • integrated WIFI
  • enabled and checked IPv4 connectivity between freeRouter@sdn1 and the ISP demarcation point
  • checked telnet access to freeRouter from localhost only

RARE validated design: [ SOHO #003 ] - key take-away

From this point you have a complete freeRouter connected to the ISP box via sdn1 as uplink in the 192.168.0.0/24 subnet. We will extend this base configuration step by step in order to enrich the user experience!

  • Now you will want to enable IPv4/IPv6 connectivity for all potential hosts@home, whether they are connected via RJ45 or via the built-in WIFI.
  • You will also want to distribute IPv4 and IPv6 addresses to all of the hosts@home.
  • IPv4/IPv6 connectivity is not enough: you would like to provide Domain Name Service to them.
  • Domain Name Service is not enough if they can't reach the outside world. As we are using an RFC1918 addressing plan, we should figure out a way to provide NAT/PAT address translation in order to enable egress traffic toward the Internet.
  • Your home might have several floors, and a single WIFI access point is not enough? Let's see how we can add additional WIFI APs to the network.
  • Maybe you have an outsourced network management service? Let's see how connectivity can be enabled via an OpenVPN encrypted tunnel.
  • Last but not least, let's see how we can connect to the DN42 parallel network using a Wireguard tunnel relying on an IPv6 underlay.

You've guessed it, all of these points will be elaborated in future articles. So stay tuned!

This is a new article for the blog series called "RARE Day One". Today we will explore one freeRouter feature that is used a lot in trusted Service Provider environments: the TFTP server.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

I'm not sure if this is still the case now, but back in 1999 I had the opportunity to manage multiple VPNs at a very large French Service Provider. I say large because in this type of MPLS multi-service core network, you could have hundreds of VRFs in the same PE router, connecting a myriad of CPEs via X25 (XOT), frame-relay and, at best, ATM PVCs. In that context, some companies could have several thousand routers in their VPNs, and it was not uncommon to sustain a deployment pace of ~10 CPEs per day for a new customer VPN implementation. So one of my favorite CLI commands was:

staging a CPE with its final configuration
copy tftp run
Address or name of remote host []? <x.y.z.t>
Source filename []? <router-cpe-config-file-name>
Destination filename [running-config]?
...

That being said, I'm not sure whether this has evolved since then. As TFTP ran inside a very protected out-of-band management network, it was very good and did a perfect job. Keep in mind that there could be hundreds of "VPN owners" deploying CPEs at the same time, so this had to be highly available.

That was for the anecdote. Recently I attempted to upgrade my OpenWRT wifi router from 18.06.02 to the latest code train, 19.07.4. As I'm lazy, I just stuck with the OpenWRT web upgrade via LuCI. Not sure if I was right ... I don't know why or how, but the upgrade failed and my wifi router got "bricked". (smile)

After a lot of googling and reading, I concluded that I had only one solution: restore the factory image and re-install OpenWRT 19.07.4 by hand. You have guessed the rest of the article: the factory-reset procedure requires a TFTP server. (smile)

Note

But before that, I had to solder a USB - UART module as described here.

Article objective

As, once again, I was too lazy to install a TFTP server on my Mac and to disconnect my current LAN access in order to have direct connectivity with the OpenWRT box, I had an idea off the top of my head (this does not happen often (smile)): "Hey, maybe freeRouter has a TFTP server that I can activate in a few lines?"... Well, after a terminal connection to my home router, let me introduce you to the freeRouter TFTP server:

 

[ #004 ] - "Saving private OpenWRT", thanks to freeRouter's TFTP server!

 Log into freeRouter

If you are familiar with Cisco operating system you will feel at home with this TFTP server. 

Log into freeRouter in config mode:
   __               ____             _
  / _|_ __ ___  ___|  _ \ ___  _   _| |_ ___ _ __
 | |_| '__/ _ \/ _ \ |_) / _ \| | | | __/ _ \ '__|
 |  _| | |  __/  __/  _ < (_) | |_| | ||  __/ |
 |_| |_|  \___|\___|_| \_\___/ \__,_|\__\___|_|
  _ __ ___   ___| | _____  | |
 | '__/ _ \ / __| |/ / __| | |
 | | | (_) | (__|   <\__ \ |_|
 |_|  \___/ \___|_|\_\___/ (_)

welcome
line ready
mjolnir#conf t                                                                 
mjolnir(cfg)#server tftp openwrt                                               
mjolnir(cfg-server)#?                                                          
  access-blackhole4 - propagate and check violating prefixes
  access-blackhole6 - propagate and check violating prefixes
  access-class      - set access list
  access-log        - log dropped attemps
  access-map        - set route map
  access-peer       - per client session limit
  access-policy     - set route policy
  access-prefix     - set prefix list
  access-rate       - access rate for this server
  access-startup    - initial downtime for this server
  access-subnet     - per subnet session limit
  access-total      - session limit for this server
  do                - execute one exec command
  end               - close this config session
  exit              - go back to previous mode
  interface         - interface to bind to
  no                - negate a command
  path              - set root folder
  port              - set port to listen on
  protocol          - set lower protocols to use
  readonly          - set write protection
  security          - set security parameters
  show              - running system information
  vrf               - set vrf to use

 TFTP server configuration

sdn6 is port #6 of my SOHO router, which is connected to the OpenWRT router.

TFTP server configuration
sh run tftp                                                            
server tftp openwrt
 path /rtr/owrt/
 interface sdn6
 vrf inet
 exit
!
sh run sdn6                                                           
interface sdn6
 description mjolnir@LAN6[08:00.0]
 mtu 1500
 macaddr 004c.7307.0a77
 vrf forwarding inet
 ipv4 address 192.168.136.1 255.255.255.0
 ipv4 broadcast-multicast
 no shutdown
 no log-link-change
 exit
!
...

So the LAN port of my OpenWRT router is like this:

OpenWRT config (this can be done via Web GUI)
...
config interface 'lan'
        option type 'bridge'
        option proto 'static'
        option ipaddr '192.168.136.2'
        option netmask '255.255.255.0'
        option broadcast '192.168.136.255'
        option gateway '192.168.136.1'
        option ip6assign '60'
        list dns '192.168.254.1'
        option ifname 'eth0 eth0.1 eth0.2 wlan0 wlan1' 
...

Basic connectivity check (well, technically you could not ping the box during the TFTP restore-to-factory process. Remember, our box crashed! (smile))

ping OpenWRT
ping 192.168.136.2 /vrf inet                                           
pinging 192.168.136.2, src=null, vrf=inet, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=0/1/2/5
...
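
For readers who have never looked at TFTP on the wire: the client (here the router's bootloader) sends a read request (RRQ) to UDP port 69, the server answers with numbered DATA blocks, and each block is acknowledged with an ACK (RFC 1350). The sketch below only builds and parses these packets. It is not freeRouter's implementation, and the file name firmware.bin is a hypothetical placeholder.

```python
import struct

# TFTP opcodes defined in RFC 1350.
OP_RRQ, OP_DATA, OP_ACK = 1, 3, 4


def build_rrq(filename, mode="octet"):
    """Read request: opcode, file name, NUL, transfer mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\x00"
            + mode.encode("ascii") + b"\x00")


def build_ack(block):
    """Acknowledge the DATA block with the given block number."""
    return struct.pack("!HH", OP_ACK, block)


def parse_data(packet):
    """Split a DATA packet into (block number, payload)."""
    opcode, block = struct.unpack("!HH", packet[:4])
    if opcode != OP_DATA:
        raise ValueError("not a TFTP DATA packet")
    return block, packet[4:]


print(build_rrq("firmware.bin"))  # hypothetical file name
print(build_ack(1))
print(parse_data(b"\x00\x03\x00\x01" + b"payload"))
```

A payload shorter than 512 bytes marks the last block of the transfer, which is how the client knows the image is complete.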
 Launch OpenWRT TFTP factory reset

So we are basically ready ...

Initiate OpenWRT factory restore process via TFTP
===================================================================
                MT7621   stage1 code 10:33:11 (ASIC)
                CPU=50000000 HZ BUS=12500000 HZ
==================================================================
Change MPLL source from XTAL to CR...
do MEMPLL setting..
MEMPLL Config : 0x11100000
3PLL mode + External loopback
=== XTAL-40Mhz === DDR-1200Mhz ===
PLL2 FB_DL: 0x9, 1/0 = 567/457 25000000
PLL3 FB_DL: 0xc, 1/0 = 596/428 31000000
PLL4 FB_DL: 0x11, 1/0 = 560/464 45000000
do DDR setting..[00320381]
Apply DDR3 Setting...(use customer AC)
          0    8   16   24   32   40   48   56   64   72   80   88   96  104  112  120
      --------------------------------------------------------------------------------
0000:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0001:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0002:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0003:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0004:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0005:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0006:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0007:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0008:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0009:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
000A:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
000B:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
000C:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
000D:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1
000E:|    0    0    0    0    0    0    0    0    0    1    1    1    1    1    1    1
000F:|    0    0    0    0    1    1    1    1    1    1    1    1    1    1    0    0
0010:|    1    1    1    1    1    1    1    1    1    0    0    0    0    0    0    0
0011:|    1    1    1    0    0    0    0    0    0    0    0    0    0    0    0    0
0012:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0013:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0014:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0015:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0016:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0017:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0018:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
0019:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
001A:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
001B:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
001C:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
001D:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
001E:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
001F:|    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
rank 0 coarse = 15
rank 0 fine = 72
B:|    0    0    0    0    0    0    0    0    0    0    1    1    1    0    0    0
opt_dle value:11
DRAMC_R0DELDLY[018]=00001F1F
==================================================================
                RX      DQS perbit delay software calibration 
==================================================================
1.0-15 bit dq delay value
==================================================================
bit|     0  1  2  3  4  5  6  7  8  9
--------------------------------------
0 |    10 7 9 9 7 7 8 7 3 6 
10 |    6 7 7 9 6 9 
--------------------------------------

==================================================================
2.dqs window
x=pass dqs delay value (min~max)center 
y=0-7bit DQ of every group
input delay:DQS0 =31 DQS1 = 31
==================================================================
bit     DQS0     bit      DQS1
0  (1~61)31  8  (1~56)28
1  (1~58)29  9  (1~61)31
2  (1~60)30  10  (1~59)30
3  (1~58)29  11  (1~57)29
4  (1~57)29  12  (1~60)30
5  (1~61)31  13  (1~60)30
6  (1~58)29  14  (1~61)31
7  (1~62)31  15  (1~61)31
==================================================================
3.dq delay value last
==================================================================
bit|    0  1  2  3  4  5  6  7  8   9
--------------------------------------
0 |    10 9 10 11 9 7 10 7 6 6 
10 |    7 9 8 10 6 9 
==================================================================
==================================================================
     TX  perbyte calibration 
==================================================================
DQS loop = 15, cmp_err_1 = ffff0000 
dqs_perbyte_dly.last_dqsdly_pass[0]=15,  finish count=1 
dqs_perbyte_dly.last_dqsdly_pass[1]=15,  finish count=2 
DQ loop=15, cmp_err_1 = ffff0080
dqs_perbyte_dly.last_dqdly_pass[1]=15,  finish count=1 
DQ loop=14, cmp_err_1 = ffff0000
dqs_perbyte_dly.last_dqdly_pass[0]=14,  finish count=2 
byte:0, (DQS,DQ)=(8,8)
byte:1, (DQS,DQ)=(8,8)
20,data:88
[EMI] DRAMC calibration passed

===================================================================
                MT7621   stage1 code done 
                CPU=50000000 HZ BUS=12500000 HZ
===================================================================


U-Boot 1.1.3 (Apr 17 2017 - 17:00:02)

Board: Ralink APSoC DRAM:  256 MB
Power on memory test. Memory size= 256 MB...OK!
relocate_code Pointer at: 8ffac000

Config XHCI 40M PLL 
******************************
Software System Reset Occurred
******************************
Allocate 16 byte aligned buffer: 8ffdffd0
Enable NFI Clock
# MTK NAND # : Use HW ECC
NAND ID [C8 D1 80 95 42]
Device not found, ID: c8d1
Not Support this Device! 
chip_mode=00000001
Support this Device in MTK table! c8d1 
select_chip
[NAND]select ecc bit:4, sparesize :64 spare_per_sector=16
Signature matched and data read!
load_fact_bbt success 1023
load fact bbt success
[mtk_nand] probe successfully!
mtd->writesize=2048 mtd->oobsize=64,    mtd->erasesize=131072  devinfo.iowidth=8
..============================================ 
Ralink UBoot Version: 5.0.0.0
-------------------------------------------- 
ASIC MT7621A DualCore (MAC to MT7530 Mode)
DRAM_CONF_FROM: Auto-Detection 
DRAM_TYPE: DDR3 
DRAM bus: 16 bit
Xtal Mode=5 OCP Ratio=1/4
Flash component: NAND Flash
Date:Apr 17 2017  Time:17:00:02
============================================ 
icache: sets:256, ways:4, linesz:32 ,total:32768
dcache: sets:256, ways:4, linesz:32 ,total:32768 

 ##### The CPU freq = 880 MHZ #### 
 estimate memory size =256 Mbytes
#Reset_MT7530
set LAN/WAN LWLLL

Please choose the operation: 
   1: Load system code to SDRAM via TFTP. 
   2: Load system code then write to Flash via TFTP. 
   3: Boot system code via Flash (default).
   4: Entr boot command line interface.
   7: Load Boot Loader code then write to Flash via Serial. 
   9: Load Boot Loader code then write to Flash via TFTP. 
 4 
You choosed 2

 0 

   
2: System Load Linux Kernel then write to Flash via TFTP. 
 Warning!! Erase Linux in Flash then burn new one. Are you sure?(Y/N)
 Please Input new ones /or Ctrl-C to discard
        Input device IP (192.168.31.1) ==:192.168.31.1
        Input server IP (192.168.31.100) ==:192.168.31.2  
        Input Linux Kernel filename () ==: <my_factory_router_image>


...

And ... Voilà !

Note

We won this factory reset battle, but the war is not over. After having restored the genuine official vendor image, we need to re-install OpenWRT with the latest 19.07.4 image and configure OpenWRT so that it can act as a "dummy Wifi Access Point". DHCP and DNS will be served by the SOHO router.

Discussion

You can deploy freeRouter manually in a VM or container and bind it to a Linux interface if you need a TFTP server in order to apply configuration to all your equipment. When final staging is done in a secure out-of-band management network, having a TFTP server is a blessing, as it saves time in a production environment. Imagine hundreds of people in an SP environment, all working at the same time.

Conclusion

In this 4th article:

  • We presented freeRouter TFTP embedded server 
  • You can use it in order to undertake network equipment deployment requiring TFTP
  • This TFTP server is compatible with IPv4/IPv6

TFTP is a basic but common tool in SP environments (or it was? If it is still used, please confirm!). In this example, I demonstrated the use of the TFTP server in order to flash a wifi router back to factory default. I have 802.11ac back up and running!

Final words

freeRouter can be perceived not only as a router but also as a networking Swiss army knife. In further articles we will shed some light on various treasures hidden in freeRouter... And for free!

Last but not least, you can play with these different servers from this sandbox (you'll be able to spot amazing servers that will be the subject of further articles):

type "telnet dl.nop.hu" in a terminal and choose "1"
Trying 193.224.23.5...
Connected to dl.nop.hu.
Escape character is '^]'.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX     XXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX  XXXX XX XXXX XX XXXX XX XX XX XXXX XXXXX/~~~~~~\XXXXXX
XXXX X XXX XX XXXX XX XXXX XX XX XX XXXX XXXX| player |XXXXX
XXXX XX XX XX XXXX XX     XXX    XX XXXX XXXXX\______/XXXXXX
XXXX XXX X XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXX  XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX XXX XXX XX XXX    XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
welcome
line ready
menu lab:
# - reboot router1
$ - reboot router2
% - reboot router3
1 - connect to router1
2 - connect to router2
3 - connect to router3
^ - rebuild routers
l - connect to lg.nop.dn42
x - exit
choose:1 - attach vdc lab1 

welcome
line ready
yourname#conf t                                                                
warning user.userLineHandler.doExec:userLine.java:606 <nobody> configuring from tty1
yourname(cfg)#server ?                                                         
  bmp2mrt      - configure an bmp to mrt server
  bstun        - configure a bstun server
  chargen      - configure a chargen server
  daytime      - configure a daytime server
  dcp          - configure a dcp server
  dhcp4        - configure a dhcp4 server
  dhcp6        - configure a dhcp6 server
  discard      - configure a discard server
  dns          - configure a dns server
  echo         - configure an echo server
  etherip      - configure a etherip server
  forwarder    - configure a forwarder server
  ftp          - configure a ftp server
  geneve       - configure a geneve server
  gopher       - configure a gopher server
  gre          - configure a gre server
  gtp          - configure a gtp server
  honeypot     - configure a honeypot server
  http         - configure a http server
  irc          - configure an irc server
  iscsi        - configure an iscsi server
  l2f          - configure a l2f server
  l2tp2        - configure a l2tp v2 server
  l2tp3        - configure a l2tp v3 server
  loadbalancer - configure a loadbalancer server
  lpd          - configure a lpd server
  modem        - configure a modem server
  mplsip       - configure a mplsip server
  mplsudp      - configure a mplsudp server
  multiplexer  - configure a multiplexer server
  netflow      - configure an netflow server
  nrpe         - configure a nrpe server
  ntp          - configure a ntp server
  openflow     - configure an openflow server
  p4lang       - configure an p4lang server
  pcep         - configure a pcep server
  pckodtls     - configure a pckodtls server
  pckotcp      - configure a pckotcp server
  pckotxt      - configure a pckotxt server
  pckoudp      - configure a pckoudp server
  pop3         - configure a pop3 server
  pptp         - configure a pptp server
  prometheus   - configure a prometheus server
  quote        - configure a quote server
  radius       - configure a radius server
  rfb          - configure a rfb server
  rpki         - configure a rpki server
  sip          - configure a sip server
  smtp         - configure a smtp server
  snmp         - configure a snmp server
  socks        - configure a socks server
  streamingmdt - configure a streaming telemetry server
  stun         - configure a stun server
  syslog       - configure a syslog server
  tacacs       - configure a tacacs server
  telnet       - configure a telnet server
  tftp         - configure a tftp server
  time         - configure a time server
  udpfwd       - configure an udp forwarder server
  udptn        - configure an udptn server
  upnpfwd      - configure an upnp forwarder server
  upnphub      - configure an upnp hub server
  voice        - configure a voice server
  vxlan        - configure a vxlan server

yourname(cfg)#server                              
...

In order to exit the sandbox session use the following escape sequence: Ctrl-c + Ctrl-x









This is a new category of article in the "RARE software architecture" special blog series. As its name implies, it deals with topics related to RARE/freeRouter software, in this case monitoring.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

In Greek mythology, Prometheus is a Titan credited with creating mankind, who stole fire from the gods and gave it to humans. In the RARE context, Prometheus is the software from the prometheus.io project. It became very popular in the IT industry because it is simple to implement and configure while providing a great number of metrics without impacting application performance. It is heavily used in microservices environments such as Docker and Kubernetes. The mythological reference gives us an indication of how Prometheus operates: at a constant rate, the Prometheus metric collector (or server) "steals" metrics from the Prometheus agents. All the stolen metrics are then consolidated in a time series database, ready to be queried for visualization.

Before going further, allow me a brief digression to share a small anecdote that led to this ongoing work on network monitoring for RARE. As mentioned previously, our focus is to give the RARE/freeRouter solution the ability to be monitored in an operational environment. In that context, we started with the implementation of a lightweight SNMP stack that provided relevant results via SNMP tools like LibreNMS. This is great for organisations that do not want to invest time in anything but SNMP.

However, we felt a lack of flexibility due to the inherent structure of SNMP, and we needed more versatile and instant monitoring capabilities. More importantly, the need arose to export an open-ended set of metric types from the control plane in a more flexible way. How could metrics such as the number of IPv4/IPv6 routes, IPv4 BGP prefixes, IPv6 BGP prefixes, platform JVM memory etc. be shared without too much hassle?

After some internal discussion, I just said: "I’m not a monitoring expert but we have tools like ELK, Prometheus and Grafana in the NMaaS catalog … Shouldn’t we consider using them?"

The answer was: « Let’s give it a try and fire up a Prometheus and Grafana instance from NMaaS platform !»

Some hacking at the control plane code level was initiated. After a few hours, the freeRouter lead developer came up with a solution and said: "Let me introduce you to the freeRouter Prometheus agent."

And thanks to the great support of the NMaaS team, in a few minutes and a few points and clicks (it took longer than expected as I’m not good with GUIs) we were able to test this agent.

Why is it important, you might say? Simply because, with the simplicity and low resource overhead of Prometheus, we have full visibility of control plane metrics!

As a side note, this is not a replacement for INT/Telemetry/Netflow/IPFIX, which provide different types of data that are not at the same scale.
People working with INT/Telemetry/Netflow/IPFIX talk about a "data lake" or a "data deluge", which is correct if you think about the complexity of resolving a gigantic producer/consumer data problem. This needs the relevant IT infrastructure in order to process all of the data provided by these protocols at the NREN scale.

In our case, we are just focusing on exposing CONTROL PLANE METRICS at the network element level. We simply monitor a router and ensure its operation by using Prometheus metrics.

Note

While the above might be true, the number of metrics exported from a Prometheus target can be very high. Fine tuning might be necessary in order to make sure that all exported metrics are really necessary for network monitoring purposes. An explosion of exposed metrics can add unnecessary workload at the control plane level.

Again, kudos to NMaaS team that made this happen so that we could test this on the P4 LAB with — ZERO — effort.

Article objective

In this article, we will present the freeRouter and Prometheus integration and, as an example, we will implement one of the 22 Grafana dashboards that we developed and published here. In the rest of the article we will assume that you are running one or more freeRouter nodes.

Diagram

[ #001 ] - Cookbook

 Configure a Prometheus server

The first step is to implement a Prometheus server. Using NMaaS it is pretty much instantaneous. However, if you plan to deploy Prometheus on another platform, just follow the installation guide here.

Once deployed you can push the following prometheus.yaml config:

prometheus.yaml
global:
  scrape_interval: 15s
  evaluation_interval: 30s
alerting:
  alertmanagers:
    - static_configs:
      - targets:
rule_files:
scrape_configs:
  - job_name: 'router'
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
    - targets: ['192.168.0.1:9001','192.168.0.2:9001']
      labels:

In this configuration we assume that we have 2 freeRouters reachable at the addresses above (192.168.0.1:9001 and 192.168.0.2:9001). In the Prometheus world these are called targets:

  • each target is interrogated, or "scraped", every "scrape_interval", which is 15s here
  • the main job name is "router"
  • metrics_path is "/metrics", so the scraped URL is "http://192.168.0.1:9001/metrics"

Note that this has to be deployed only once for all of your routers. However, each time you'd like to add a new router, you have to add a new target to the "targets" YAML list.
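
As a quick sanity check of the target list, the small Python sketch below derives the URL that would be scraped for each target from the values used in prometheus.yaml above. The helper name scrape_urls is hypothetical and the "http" scheme is an assumption (it is the Prometheus default when no scheme is configured).

```python
from urllib.parse import urlunsplit


def scrape_urls(targets, metrics_path="/metrics", scheme="http"):
    """Return the URL that would be scraped for each 'host:port' target."""
    return [urlunsplit((scheme, target, metrics_path, "", "")) for target in targets]


# Targets copied from the static_configs section of prometheus.yaml
targets = ["192.168.0.1:9001", "192.168.0.2:9001"]

for url in scrape_urls(targets):
    print(url)
```

Opening one of these URLs from the Prometheus server host (with a browser or curl) is a simple way to confirm that a newly added router is reachable before editing the "targets" list.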

 Configure Prometheus FreeRouter control plane

In this example let's focus on interface metrics. Please note that this configuration should be deployed on each freeRouter, and connectivity should be available between all targets and the Prometheus server.

The objective is to tell the freeRouter control plane to expose hardware and software interface counter metrics. In order to do this, just copy/paste the stanza below via the freeRouter CLI:

prometheus interface metric configuration
!
server prometheus <PROMETHEUS_SERVER_NAME>
 metric inthw command sho inter hwsumm
 metric inthw prepend iface_hw_byte_
 metric inthw name 0 ifc=
 metric inthw replace \. _
 metric inthw column 1 name st
 metric inthw column 1 replace admin -1
 metric inthw column 1 replace down 0
 metric inthw column 1 replace up 1
 metric inthw column 2 name tx
 metric inthw column 3 name rx
 metric inthw column 4 name dr

 metric intsw command sho inter summ
 metric intsw prepend iface_sw_byte_
 metric intsw name 0 ifc=
 metric intsw replace \. _
 metric intsw column 1 name st
 metric intsw column 1 replace admin -1
 metric intsw column 1 replace down 0
 metric intsw column 1 replace up 1
 metric intsw column 2 name tx
 metric intsw column 3 name rx
 metric intsw column 4 name dr
 vrf <VRF_NAME>
 exit
!

So this basically means:

  • From freeRouter CLI, issue the following command:
"sho inter hwsumm" output
sho inter hwsumm
interface   state  tx          rx          drop
hairpin41   up     67404       0           0
hairpin42   up     153134      0           0
sdn1        up     412319805   1057514903  1152305
sdn2        up     1038840147  407307558   202
sdn3        admin  0           0           0
sdn4        admin  0           0           0
sdn5        admin  0           0           0
sdn6        admin  0           0           0
sdn998      up     9154        0           0
sdn999      up     199178      262939      0
tunnel1965  up     0           9122896     0 
  • prepend "iface_hw_byte_" to the metric name
  • column 0 provides the value of the Prometheus label "ifc="
  • replace all dots "." with "_" (so interface bundle1.123 becomes bundle1_123)
  • column 1 defines the metric name "iface_hw_byte_" concatenated with "st" => "iface_hw_byte_st", which is essentially the interface status
  • if the column 1 "state" value is admin/down/up, we associate the value -1/0/1
  • column 2 defines the metric name "iface_hw_byte_" concatenated with "tx" => "iface_hw_byte_tx", which is essentially the interface bytes transmitted counter
  • column 3 defines the metric name "iface_hw_byte_" concatenated with "rx" => "iface_hw_byte_rx", which is essentially the interface bytes received counter
  • column 4 defines the metric name "iface_hw_byte_" concatenated with "dr" => "iface_hw_byte_dr", which is essentially the interface bytes dropped counter

And if you followed this correctly, the same lines are then repeated for the software interface counter metrics.
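
To make the row/column logic above more concrete, here is a minimal Python sketch that applies the same rules to a table shaped like the "sho inter hwsumm" output. This is not the freeRouter implementation: the function name table_to_metrics is hypothetical, and the exact text format exposed by the agent (label quoting in particular) is an assumption. The bundle1.123 row is an illustrative entry, not taken from a real router.

```python
import re

# "metric inthw column 1 replace ..." rules from the stanza above
STATE_VALUES = {"admin": "-1", "down": "0", "up": "1"}
# "metric inthw column N name ..." rules from the stanza above
COLUMNS = {1: "st", 2: "tx", 3: "rx", 4: "dr"}

SAMPLE = """
interface    state  tx          rx          drop
sdn1         up     412319805   1057514903  1152305
sdn3         admin  0           0           0
bundle1.123  down   0           0           0
"""


def table_to_metrics(cli_output, prefix="iface_hw_byte_", label="ifc"):
    """Turn a summary table into Prometheus-style 'name{label="x"} value' lines."""
    lines = []
    rows = cli_output.strip().splitlines()[1:]  # skip the header row
    for row in rows:
        cells = row.split()
        name = re.sub(r"\.", "_", cells[0])  # "replace \. _" applied to column 0
        for idx, suffix in COLUMNS.items():
            value = cells[idx]
            if idx == 1:  # state column: admin/down/up becomes -1/0/1
                value = STATE_VALUES.get(value, value)
            lines.append(f'{prefix}{suffix}{{{label}="{name}"}} {value}')
    return lines


for line in table_to_metrics(SAMPLE):
    print(line)
```

Calling the same function with prefix="iface_sw_byte_" on the "sho inter summ" table would mirror the second half of the stanza.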

Tip

You can view the Prometheus configuration for various Grafana dashboards here. Feel free to study these Prometheus configurations and activate them as you see fit depending on your requirements. The set of dashboards is neither exhaustive nor definitive. Feel free to submit additional dashboards! We would gladly add them to the current list of freeRouter dashboards.

Note

After this definition at freeRouter level you should have:

4 metrics related to hardware counters 

  • iface_hw_byte_st
  • iface_hw_byte_tx
  • iface_hw_byte_rx
  • iface_hw_byte_dr

4 metrics related to software counters

  • iface_sw_byte_st
  • iface_sw_byte_tx
  • iface_sw_byte_rx
  • iface_sw_byte_dr

Which is a total of 8 metrics

Tip

From that point you can check via the Prometheus console:

check the "Targets" entry in the menu drop-down

From there you should be able to use the PromQL query field in order to check that you can retrieve the metrics we defined above.
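
The same check can be scripted against the Prometheus HTTP API (the /api/v1/query endpoint). The sketch below is only an illustration: the server address prometheus.example:9090 is a placeholder, the helper names are hypothetical, and the "ifc" label is assumed to be exposed as configured in the stanza above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(payload):
    """Map the 'ifc' label of each returned series to its sample value."""
    body = json.loads(payload)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    return {
        series["metric"].get("ifc", ""): float(series["value"][1])
        for series in body["data"]["result"]
    }


def fetch(base_url, promql):
    """Run the query against a live server (requires network access)."""
    with urlopen(query_url(base_url, promql), timeout=5) as response:
        return parse_instant_vector(response.read())


# Placeholder server address, replace with your own Prometheus instance
print(query_url("http://prometheus.example:9090", "iface_hw_byte_tx"))
```

Calling fetch() with the address of your own Prometheus server and one of the 8 metric names should return one entry per interface once the targets have been scraped at least once.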

 Grafana configuration

For metric visualisation, we will use Grafana. Therefore:

  • install Grafana from official web site.
  • Once installed configure Prometheus as Grafana data source:

 

  • fill in all the prometheus server information

  • check that the data source is defined correctly by clicking the "Save & test" button

At that point your Grafana and Prometheus are correctly bound.

  • now you need to import "RARE/freeRouter interface bytes" dashboard

  • download freeRouter interface bytes dashboard here 


  • import the dashboard via ID or simply download JSON or use JSON panel

And Voila ! 

In order to see the graph immediately, zoom in to a 5m period with a refresh of 5s and you should automagically see the interface bytes TX/RX on all interfaces for each target.

Discussion

This example related to interface metrics is universal, as the metrics at freeRouter level are yielded through a generic CLI command:

  • "show interface hwsummary"
  • or "show interface swsummary".

However, some metrics cannot be retrieved through a generic command. Some metrics will be tied to the specifics of your network. These can be the AS number, IGP process name, VRF name etc.

Let me give you a couple of examples:

 the metrics below assume that you have deployed a link state IGP called: "isis 1" 

But in your network context you could have arbitrarily deployed "isis 2200". (2200 is the RENATER AS number)

prometheus interface metric configuration
 metric lsigp4int command sho ipv4 isis 1 interface
 metric lsigp4int prepend lsigp4_iface_
 metric lsigp4int name 0 proto="isis1",ifc=
 metric lsigp4int replace \. _
 metric lsigp4int column 1 name neighbors
 metric lsigp4peer command sho ipv4 isis 1 topology 2
 metric lsigp4peer prepend lsigp4_peers_
 metric lsigp4peer name 0 proto="isis1",node=
 metric lsigp4peer replace \. _
 metric lsigp4peer column 1 name reachable
 metric lsigp4peer column 1 replace false 0
 metric lsigp4peer column 1 replace true 1
 metric lsigp4peer column 6 name neighbors
 metric lsigp4perf command sho ipv4 isis 1 spf 2 | inc reachable|fill|calc|run
 metric lsigp4perf prepend lsigp4_perf_
 metric lsigp4perf labels proto="isis1"
 metric lsigp4perf skip 0
 metric lsigp4perf column 1 name val

 metric lsigp6int command sho ipv6 isis 1 interface
 metric lsigp6int prepend lsigp6_iface_
 metric lsigp6int name 0 proto="isis1",ifc=
 metric lsigp6int replace \. _
 metric lsigp6int column 1 name neighbors
 metric lsigp6peer command sho ipv6 isis 1 topology 2
 metric lsigp6peer name 0 proto="isis1",node=
 metric lsigp6peer prepend lsigp6_peers_
 metric lsigp6peer replace \. _
 metric lsigp6peer column 1 name reachable
 metric lsigp6peer column 1 replace false 0
 metric lsigp6peer column 1 replace true 1
 metric lsigp6peer column 6 name neighbors
 metric lsigp6perf command sho ipv6 isis 1 spf 2 | inc reachable|fill|calc|run
 metric lsigp6perf prepend lsigp6_perf_
 metric lsigp6perf labels proto="isis1"
 metric lsigp6perf skip 0
 metric lsigp6perf column 1 name val
 in the metric below the variable is BGP AS number "65535": 


prometheus interface metric configuration
 metric bgp4peer command sho ipv4 bgp 65535 summ
 metric bgp4peer prepend bgp4_peer_
 metric bgp4peer name 4 peer=
 metric bgp4peer replace \. _
 metric bgp4peer column 1 name learn
 metric bgp4peer column 2 name advert
 metric bgp4peer column 3 name state
 metric bgp4peer column 3 replace false 0
 metric bgp4peer column 3 replace true 1
 metric bgp4perf command sho ipv4 bgp 65535 best | exc last
 metric bgp4perf prepend bgp4_perf_
 metric bgp4perf replace \s _
 metric bgp4perf column 1 name val

 metric bgp6peer command sho ipv6 bgp 65535 summ
 metric bgp6peer prepend bgp6_peer_
 metric bgp6peer name 4 peer=
 metric bgp6peer replace \: _
 metric bgp6peer column 1 name learn
 metric bgp6peer column 2 name advert
 metric bgp6peer column 3 name state
 metric bgp6peer column 3 replace false 0
 metric bgp6peer column 3 replace true 1
 metric bgp6perf command sho ipv6 bgp 65535 best | exc last
 metric bgp6perf prepend bgp6_perf_
 metric bgp6perf replace \s _
 metric bgp6perf column 1 name val 
 Last example with "LDP null" metrics, in this particular case the variable object is the VRF name: "inet"
prometheus interface metric configuration
 metric ldp4nul command sho ipv4 ldp inet nulled-summary
 metric ldp4nul prepend ldp4null_
 metric ldp4nul name 3 ip=
 metric ldp4nul skip 2
 metric ldp4nul replace \. _
 metric ldp4nul column 0 name prefix_learn
 metric ldp4nul column 1 name prefix_advert
 metric ldp4nul column 2 name prefix_nulled

 metric ldp6nul command sho ipv6 ldp inet nulled-summary
 metric ldp6nul prepend ldp6null_
 metric ldp6nul name 3 ip=
 metric ldp6nul skip 2
 metric ldp6nul replace \: _
 metric ldp6nul column 0 name prefix_learn
 metric ldp6nul column 1 name prefix_advert
 metric ldp6nul column 2 name prefix_nulled 


Conclusion

In this 1st article, you were presented with:

  • freeRouter/Prometheus integration 
  • How to add a new router in the list of Prometheus target
  • How to integrate a RARE/freeRouter Grafana Dashboard. (Feel free to adapt the other available dashboard query to your context !)

Final words

In the Prometheus philosophy, the user should normally do only a minimum of configuration tweaking. Ultimately, they should only need to enable a metric, or simply disable it if the scrape cost is too high. However, in the freeRouter/Prometheus integration process, you can see that some metrics are issued using a specific $variable (VRF, BGP/IGP process number ...), which makes it impossible to maintain this universality. However, from the network operator's point of view this should not be a showstopper. On the contrary, it is a powerful choice to be able to alter these commands via $variables.
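
One way to handle such $variables in practice is to keep the metric stanza as a template and render it per router. The Python sketch below does this for the bgp4peer lines shown in the Discussion section. The helper name render_bgp4_peer is hypothetical and the templating approach is only a suggestion, not something provided by freeRouter itself.

```python
from string import Template

# Lines copied from the bgp4peer stanza in the Discussion section,
# with the BGP AS number turned into the $asn placeholder.
BGP4_PEER = Template("\n".join([
    " metric bgp4peer command sho ipv4 bgp $asn summ",
    " metric bgp4peer prepend bgp4_peer_",
    " metric bgp4peer name 4 peer=",
    r" metric bgp4peer replace \. _",
    " metric bgp4peer column 1 name learn",
    " metric bgp4peer column 2 name advert",
    " metric bgp4peer column 3 name state",
    " metric bgp4peer column 3 replace false 0",
    " metric bgp4peer column 3 replace true 1",
]))


def render_bgp4_peer(asn):
    """Return the bgp4peer metric lines for the given BGP process number."""
    return BGP4_PEER.substitute(asn=asn)


# Example: a router running BGP process 2200 instead of 65535
print(render_bgp4_peer(2200))
```

The rendered lines can then be pasted under the "server prometheus" stanza of the corresponding router. The same idea applies to the IGP process name or the VRF name.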

Remember that in the freeRouter philosophy you can have multiple VRFs, multiple IGPs and multiple BGP process numbers! (Which is not the case for all routing platforms)

Last but not least, this Prometheus agent was developed quickly for one reason: all the objects at the control plane level were already well structured in table form, as previously described in this article. So implementing this table row/column logic in order to derive a Prometheus metric was technically possible without too much hassle.





This is a new article for the blog series called "RARE Day One". Today we will explore one of the freeRouter features meant to fine tune the terminal user environment and behaviour to best match your taste and preferences.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

We all have habits inherited from our past experience. Some people are used to IOS, IOS-XR, NX-OS, IOS-XE, others prefer Junos etc. Using freeRouter provides a different user experience. Some features, such as the show/view/watch/differ diagnosis commands, are pretty unique to freeRouter. However, freeRouter has some cards up its sleeve in order to provide you with a familiar experience.

Article objective

In this article, we will focus on these features:

  • monitor
  • length
  • width
  • spacetab
  • tablemode
  • timestamps
  • colorized

Basically these commands are accessed through freeRouter user mode. If you need to use them from config mode, please use the "do" keyword.

[ #003 ] - "monitor/length/width/spacetab/tablemode/timestamps/colorized"

 monitor

If you are familiar with Cisco operating systems you will feel at home with the "terminal monitor" mode. This mode is usually used in combination with the "debug" diagnosis command and redirects console and debug output to your current terminal session as well. (VTY in Cisco language)

Let's assume we want to debug IPv4 BGP

BGP debug options
r1#debug proto bgp ?                                                      
  computation - computation events
  event       - table events
  full        - full events
  incremental - incremental events
  traffic     - interface packets

Let's activate BGP debug event

BGP debug output
r1#debug rtr.rtrBgpSpeak.packRecv:rtrBgpSpeak.java:1134 got update from 172.23.215.177
debug rtr.rtrBgp.routerCreateComputed:rtrBgp.java:1902 create table
debug rtr.rtrBgpSpeak.packSend:rtrBgpSpeak.java:1080 sending update to 172.23.215.177
r1#                                                                       
r1#debug rtr.rtrBgpSpeak.packRecv:rtrBgpSpeak.java:1134 got keepalive from 172.23.215.177
debug rtr.rtrBgpSpeak.packSend:rtrBgpSpeak.java:1080 sending keepalive to 172.23.215.177
debug rtr.rtrBgpSpeak.packRecv:rtrBgpSpeak.java:1134 got update from fd40:cc1e:c0de::151
debug rtr.rtrBgp.routerCreateComputed:rtrBgp.java:1902 create table
debug rtr.rtrBgpSpeak.packSend:rtrBgpSpeak.java:1080 sending update to fd40:cc1e:c0de::151
debug rtr.rtrBgpSpeak.packRecv:rtrBgpSpeak.java:1134 got update from fd40:cc1e:c0de::151
debug rtr.rtrBgp.routerCreateComputed:rtrBgp.java:1902 create table
debug rtr.rtrBgpSpeak.packRecv:rtrBgpSpeak.java:1134 got update from fd40:cc1e:c0de::151
debug rtr.rtrBgp.routerCreateComputed:rtrBgp.java:1902 create table
debug rtr.rtrBgpSpeak.packSend:rtrBgpSpeak.java:1080 sending update to fd40:cc1e:c0de::151
debug rtr.rtrBgpSpeak.packRecv:rtrBgpSpeak.java:1134 got update from fd40:cc1e:c0de::151
debug rtr.rtrBgp.routerCreateComputed:rtrBgp.java:1902 create table
...

in order to cancel debug:

undebug all (or specific debug)
r1#un all
...

in order to stop console output from your terminal session:

terminal no monitor
r1#term no mon
...

Note

Similar to Cisco gear, "debug" can be very chatty. Therefore, be ready to issue "term no mon" or, better, log debug output into a file for further off-line forensics.

 terminal length

In my previous examples, the output of the "show ipv4 bgp 42 unicast database" command could not fit in my window as it has too many lines. "terminal length" can be used to alter the number of lines of the terminal.

terminal length
r1#terminal length ?                                                      
  <num> - height in lines

r1#terminal length
...

Note

"terminal length" may have no effect if you are using a more sophisticated terminal. However, it will have a visible impact on the view/display/differ buffer.

 terminal spacetab

"terminal spacetab" is specifically dedicated to Junos users. It basically has the same effect as the <TAB> key, but it also adds contextual completion with the <SPACE> key.

terminal spacetab
r1#terminal spacetab ?                                                    
  <cr>

r1#terminal spacetab                                                      
...

Note

"terminal spacetab" does not remove <TAB> key behaviour.

 terminal tablemode

"terminal tablemode" provides pre-formatted table output.

"terminal tablemode" available format
mjolnir#terminal tablemode ?                                                   
  csv    - select csv mode
  fancy  - select fancy mode
  html   - select html mode
  normal - select normal mode
  raw    - select raw mode
  table  - select table mode

Let's select "fancy"

Fancy mode
r1#show ipv4 bgp 42 summary                                               
 |~~~~~~~~~~~~|~~~~~~~|~~~~~~|~~~~~~~|~~~~~~~~~~~~~~~~|~~~~~~~~~~|
 | as         | learn | done | ready | neighbor       | uptime   |
 |------------|-------|------|-------|----------------|----------|
 | 4242421955 | 516   | 517  | true  | 172.23.215.177 | 01:16:23 |
 |____________|_______|______|_______|________________|__________|

Note

Feel free to play with all the formats proposed by "terminal tablemode". This is pretty useful when you have to prepare a report related to the VPN or network you are currently managing.
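
To give an idea of what a table mode does, the Python sketch below renders the same rows either in the "fancy" layout shown above or as CSV. This is an illustration only, not freeRouter code: the function names are hypothetical, and what the csv mode emits on a real router may differ from the standard CSV produced here.

```python
import csv
import io


def render_csv(header, rows):
    """Render the table as standard comma-separated values."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows([header, *rows])
    return buffer.getvalue()


def render_fancy(header, rows):
    """Render the table with the border style of the 'fancy' example above."""
    widths = [max(len(str(cell)) for cell in column) for column in zip(header, *rows)]

    def rule(char):
        return " |" + "|".join(char * (width + 2) for width in widths) + "|"

    def line(cells):
        return " |" + "|".join(f" {str(c):<{w}} " for c, w in zip(cells, widths)) + "|"

    return "\n".join([rule("~"), line(header), rule("-"), *map(line, rows), rule("_")])


header = ["as", "learn", "done", "ready", "neighbor", "uptime"]
rows = [["4242421955", "516", "517", "true", "172.23.215.177", "01:16:23"]]

print(render_fancy(header, rows))
print(render_csv(header, rows), end="")
```

The point is that the data stays the same and only the presentation changes, which is why these modes are handy when the output has to be pasted into a report or a spreadsheet.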

 terminal timestamps

"terminal timestamps" will simply put a timestamp before the output of each executed command.

Fancy mode
r1#show ipv4 bgp 42 summary                                               
2020-09-30 09:46:32
 |~~~~~~~~~~~~|~~~~~~~|~~~~~~|~~~~~~~|~~~~~~~~~~~~~~~~|~~~~~~~~~~|
 | as         | learn | done | ready | neighbor       | uptime   |
 |------------|-------|------|-------|----------------|----------|
 | 4242421955 | 516   | 517  | true  | 172.23.215.177 | 01:18:45 |
 |____________|_______|______|_______|________________|__________|

Note

As you can see, you can stack these modes. Here we activated "terminal tablemode + timestamps"

 terminal colorized

"terminal colorized" will simply color your prompt.

 Make terminal <mode> permanent

"terminal <mode>" is specific to your current session. If you want a persistent behaviour, you need to activate these features from the "server telnet" stanza which, contrary to what its name implies, is not only about configuring a telnet server. From this stanza you'll be able to configure any type of server dedicated to terminal connections (e.g. SSH).

"server telnet" exec options
r1(cfg-server)#exec ?                                                     
  authorization - set authorization
  autocommand   - set automatic command
  autohangup    - disconnect user after autocommand
  bye           - set goodbye message
  colorized     - enable colorization
  height        - set height of terminal
  interface     - set interface to use for framing
  logging       - enable logging
  privilege     - set default privilege
  ready         - set ready message
  spacetab      - enable space as tab
  tablemode     - set table mode
  timeout       - set timeout value
  timestamp     - enable timestamps
  welcome       - set welcome message
  width         - number of columns

Note

This "server telnet" section gives you lots of possibilities to fine tune your terminal access! Feel free to use them in order to feel at home!

Discussion

monitor/length/width/spacetab/tablemode/timestamps/colorized is a set of features meant to ease your experience with freeRouter by mimicking well-known behaviour and offering additional convenient features. One key behaviour is that every command issued from the CLI is instantly taken into account.

Conclusion

In this 3rd article:

  • We presented the freeRouter monitor/length/width/spacetab/tablemode/timestamps/colorized terminal customization commands
  • These are very useful if you come from the Cisco or Junos world as they mimic some terminal behaviour.

Final words

As said, these terminal commands are not specific to freeRouter. Some behaviours are mimicked from IOS and Junos. Anyway, these have been developed for one purpose:

"Make network engineers feel at ease and provide them with an enjoyable operation experience"

Feel free to try them and use them according to your environment and taste!

Last but not least, you can play with these different modes from this sandbox:

type "telnet dl.nop.hu" in a terminal and choose "1"
telnet dl.nop.hu
Trying 193.224.23.5...
Connected to dl.nop.hu.
Escape character is '^]'.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX     XXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX  XXXX XX XXXX XX XXXX XX XX XX XXXX XXXXX/~~~~~~\XXXXXX
XXXX X XXX XX XXXX XX XXXX XX XX XX XXXX XXXX| player |XXXXX
XXXX XX XX XX XXXX XX     XXX    XX XXXX XXXXX\______/XXXXXX
XXXX XXX X XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXX  XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX XXX XXX XX XXX    XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
welcome
line ready
menu lab:
# - reboot router1
$ - reboot router2
% - reboot router3
1 - connect to router1
2 - connect to router2
3 - connect to router3
^ - rebuild routers
l - connect to lg.nop.dn42
x - exit
choose:1 - attach vdc lab1 

welcome
line ready
yourname#terminal ?                                                            
  colorized  - sending to ansi terminal
  length     - set terminal length
  monitor    - log to this terminal
  no         - negate a parameter
  spacetab   - treat space as tabulator
  tablemode  - select table formatting mode
  timestamps - put time before each executed command
  width      - set terminal width

yourname#terminal                            
...

In order to exit the sandbox session use the following escape sequence: Ctrl-c + Ctrl-x








This is a new article for the blog series called "RARE Day One". Today we will explore one of freeRouter's killer features that will make your life easier during your day to day operation: freeRouter assisted diagnosis commands.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

As mentioned in the previous article, when you log into a piece of network equipment such as a router, you tend to have some automatic reflexes. You usually:

  • Check router configuration: show run or sh conf
  • Check ipv4 / ipv6 / or LFIB forwarding table
  • So you basically issue diagnosis, troubleshooting commands
  • And then you want to configure the router

Article objective

In this article, we will focus on the 3rd bullet point and present the diagnosis commands available in freeRouter. They are grouped into 5 categories:

  • show 
  • view
  • watch
  • display
  • differ

Basically these commands are accessed through freeRouter user mode. If you need to use them from config mode, please use the "do" keyword.

[ #002 ] - "show/view/watch/display/differ"

 show

You are most likely familiar with the "show" command. It can basically be used to get output from any control plane object. Most of the time it is used against static objects like the configuration.

Let's assume that I would like to get BGP config from my home router:

show BGP configuration from running config
show running-config bgp4                                               
router bgp4 42                                                                 
 vrf dn42                                                                      
 local-as 4242421975                                                           
 router-id 172.22.105.65                                                       
 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 neighbor 172.23.215.177 remote-as 4242421955                                  
 neighbor 172.23.215.177 description NOP.DN42                                  
 neighbor 172.23.215.177 local-as 4242421975                                   
 neighbor 172.23.215.177 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 neighbor 172.23.215.177 distance 20                                           
 justadvert loopback42                                                         
 exit                      

But I can also check the status of the BGP peering in VRF dn42:

Check BGP IPv4 peers status in VRF dn42
show ipv4 bgp 42 summary                                               
as          learn  done  ready  neighbor        uptime
4242421955  517    518   true   172.23.215.177  00:38:18                      

Check the same BGP peering but now for IPv6

Check BGP IPv6 peers status in VRF dn42
r1#show ipv6 bgp 42 summary                                               
as          learn  done  ready  neighbor             uptime
4242421955  351    352   true   fd40:cc1e:c0de::151  00:40:40

Let's see some BGP prefixes received in the VRF dn42 BGP table:

So my screen is too small for all the IPv6 BGP prefixes in the DN42 VRF.

As a last example, something we usually do as network operators is to check ongoing interface traffic level:

Check interface traffic level (received/transmitted)
r1#sh int sdn1                                                            
sdn1 is up (since 09:41:21, 2 changes)
 description: mjolnir@LAN1[01:00.0]
 type is sdn, hwaddr=003b.7671.764f, mtu=1500, bw=8000kbps, vrf=inet
 ip4 address=192.168.0.90/24, netmask=255.255.255.0, ifcid=10013
 ip6 address=2a01:e0a:159:2850::666/64, netmask=ffff:ffff:ffff:ffff::, ifcid=10013
 received 52013 packets (17638316 bytes) dropped 5 packets (448 bytes)
 transmitted 80765 packets (15101696 bytes) promisc=false macsec=false

r1#sh int sdn1                                                            
sdn1 is up (since 09:41:22, 2 changes)
 description: mjolnir@LAN1[01:00.0]
 type is sdn, hwaddr=003b.7671.764f, mtu=1500, bw=8000kbps, vrf=inet
 ip4 address=192.168.0.90/24, netmask=255.255.255.0, ifcid=10013
 ip6 address=2a01:e0a:159:2850::666/64, netmask=ffff:ffff:ffff:ffff::, ifcid=10013
 received 52013 packets (17638316 bytes) dropped 5 packets (448 bytes)
 transmitted 80766 packets (15101778 bytes) promisc=false macsec=false

r1#sh int sdn1                                                            
sdn1 is up (since 09:41:24, 2 changes)
 description: mjolnir@LAN1[01:00.0]
 type is sdn, hwaddr=003b.7671.764f, mtu=1500, bw=8000kbps, vrf=inet
 ip4 address=192.168.0.90/24, netmask=255.255.255.0, ifcid=10013
 ip6 address=2a01:e0a:159:2850::666/64, netmask=ffff:ffff:ffff:ffff::, ifcid=10013
 received 52015 packets (17638418 bytes) dropped 5 packets (448 bytes)
 transmitted 80766 packets (15101778 bytes) promisc=false macsec=false
                        

In the last example we repeatedly issue the "sh int sdn1" command and try to see if TX/RX packets counters increment or not.

This command can be improved in order to be less chatty:

Check interface traffic level (received/transmitted)
r1#sh int sdn1 | i received|transmitted                            
 received 52256 packets (17681204 bytes) dropped 5 packets (448 bytes)
 transmitted 81130 packets (15162642 bytes) promisc=false macsec=false

r1#sh int sdn1 | i received|transmitted                            
 received 52256 packets (17681204 bytes) dropped 5 packets (448 bytes)
 transmitted 81130 packets (15162642 bytes) promisc=false macsec=false

r1#sh int sdn1 | i received|transmitted                            
 received 52260 packets (17681496 bytes) dropped 5 packets (448 bytes)
 transmitted 81132 packets (15162790 bytes) promisc=false macsec=false
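
What we are doing by eye here can be sketched in a few lines of Python: parse two of the captured outputs above and compute the counter deltas. This is only an illustration; the function names are my own (they are not part of freeRouter), and the regular expression assumes nothing more than the line layout shown in this article.

```python
import re

# Matches the two counter lines printed by "sh int <name>".
COUNTER_RE = re.compile(
    r"^\s*(received|transmitted) (\d+) packets \((\d+) bytes\)", re.M)

def parse_counters(output: str) -> dict:
    """Return {'received': (packets, bytes), 'transmitted': (packets, bytes)}."""
    return {m.group(1): (int(m.group(2)), int(m.group(3)))
            for m in COUNTER_RE.finditer(output)}

def counter_delta(before: str, after: str) -> dict:
    """Packets/bytes difference per direction between two snapshots."""
    b, a = parse_counters(before), parse_counters(after)
    return {k: (a[k][0] - b[k][0], a[k][1] - b[k][1]) for k in a}

# First and third outputs copied from the session above.
first = """ received 52256 packets (17681204 bytes) dropped 5 packets (448 bytes)
 transmitted 81130 packets (15162642 bytes) promisc=false macsec=false"""
third = """ received 52260 packets (17681496 bytes) dropped 5 packets (448 bytes)
 transmitted 81132 packets (15162790 bytes) promisc=false macsec=false"""

print(counter_delta(first, third))
# {'received': (4, 292), 'transmitted': (2, 148)}
```

So between the first and the third command, the interface received 4 packets and transmitted 2.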

The same goes if we want the traffic level for all interfaces:

Check interface traffic level (received/transmitted)
show interfaces summary                                                
interface   state  tx        rx        drop
loopback0   up     65856     0         0
loopback42  up     65856     0         0
ethernet0   up     31071917  33183183  0
hairpin41   up     85806     85552     0
hairpin42   up     85806     85552     0
sdn1        up     15200591  17703953  448
sdn2        up     15563546  8000994   794
sdn3        admin  0         0         0
sdn4        admin  0         0         0
sdn5        admin  0         0         0
sdn6        admin  0         0         0
sdn998      up     5850      0         0
sdn999      up     23268     18666     0
tunnel1965  up     5222281   7124950   0

The output above shows the interface counters for software-switched packets. What if I want to check the counters for packets switched in hardware by P4 or DPDK?

Check interface traffic level (received/transmitted)
show interfaces hwsummary                                              
interface   state  tx         rx         drop
hairpin41   up     0          0          0
hairpin42   up     0          0          0
sdn1        up     317902736  590402538  1162971
sdn2        up     574923844  310497399  203
sdn3        admin  0          0          0
sdn4        admin  0          0          0
sdn5        admin  0          0          0
sdn6        admin  0          0          0
sdn998      up     9062       0          0
sdn999      up     103804     64470      0
tunnel1965  up     0          1301312    0
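
If you collect such a table from a script, it is easy to turn it into a data structure. The Python sketch below assumes only the five-column layout shown above; the function name and the idea of listing interfaces with a non-zero drop counter are mine, for illustration.

```python
def parse_summary(output: str) -> dict:
    """Parse 'show interfaces (hw)summary' style text into {name: counters}."""
    rows = [line.split() for line in output.strip().splitlines()[1:]]  # skip header
    return {name: {"state": state, "tx": int(tx), "rx": int(rx), "drop": int(drop)}
            for name, state, tx, rx, drop in rows}

# A few rows copied from the hwsummary output above.
hwsummary = """\
interface   state  tx         rx         drop
sdn1        up     317902736  590402538  1162971
sdn2        up     574923844  310497399  203
sdn3        admin  0          0          0
tunnel1965  up     0          1301312    0
"""

counters = parse_summary(hwsummary)
# Interfaces worth a closer look: those with a non-zero drop counter.
dropping = [name for name, c in counters.items() if c["drop"] > 0]
print(dropping)  # ['sdn1', 'sdn2']
```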

Note

As a network operator, the "show" command is your best friend, your wingman. Just explore the freeRouter CLI using "show ?" and you'll discover the amazing list of diagnosis commands available.

 view

In my previous examples, the output of "show ipv4 bgp 42 unicast database" command could not fit my window. Say hello to "view" keyword then !  

Let's look at the BGP prefixes of the dn42 VRF through the viewer:

View BGP prefix from dn42 VRF
r1#view ipv4 bgp 42 unicast database
...

You'll then see a READ-ONLY text buffer in which you can navigate and check the output that lies beyond the boundaries of your screen!

Note

"view" is similar to "show" but it will let you deal with a fixed buffer. "view" buffer won't be refreshed.

 watch

As mentioned above, "show" gives you an instant snapshot of a control plane object. To see counters increment, you would have to issue "show" repeatedly. To avoid that, let me introduce the "watch" command.

Let's now try to get hardware counters as above:

Check hardware traffic counters
r1#watch interfaces hwsummary
...

It clears the terminal session and gives you the same output as above, but with the counters updated on a regular basis.

So in this example you'll see a live output with the counters incrementing. It is not noticeable in a screenshot, but in real life it is impressive. Think of "watch interfaces" as pretty much like the Junos "monitor" keyword.
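
Conceptually, "watch" behaves like a loop around "show". The Python sketch below is only a mental model and not how freeRouter implements it: `watch`, `fetch` and the fake counter are hypothetical names used for illustration.

```python
import itertools
import time

CLEAR = "\033[2J\033[H"  # ANSI escape codes: clear screen, cursor to top-left

def watch(fetch, interval=1.0, rounds=3, out=print):
    """Redraw the result of fetch() `rounds` times, `interval` seconds apart."""
    for i in range(rounds):
        out(CLEAR + fetch())
        if i < rounds - 1:
            time.sleep(interval)

# Stand-in for a real query: a counter that grows by 2 at every refresh.
packets = itertools.count(start=81130, step=2)
frames = []
watch(lambda: f"transmitted {next(packets)} packets",
      interval=0.01, rounds=3, out=frames.append)

# Show what each refresh would have drawn (without clearing your terminal).
print([frame.removeprefix(CLEAR) for frame in frames])
```

Each refresh replaces the previous screen, which is why the counters appear to increment in place.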

Needless to say, "watch" is applicable to every control plane object, such as BGP:

Amazing, don't you think ?

 display

In my previous examples, the output of "show ipv4 bgp 42 unicast database" command could not fit my window. Say hello to "display" keyword then !  

Display BGP prefix from dn42 VRF
r1#display ipv4 bgp 42 unicast database
...

You'll then see a READ-ONLY text buffer in which you can navigate and check the output that lies beyond the boundaries of your screen!

As a side note, you can benefit from online help by pressing <f1>

You can press Ctrl+q in order to exit the viewer. As the viewer is a READ-ONLY buffer, there is nothing to save on exit.

Note

Use "display" for output that does not fit on your screen. "display" shows a buffer that is auto-refreshed, similar to "watch", but the output is placed into a buffer in which you can navigate. "display" is very useful to diagnose huge objects such as:

  • ACL
  • prefix-list
  • route policy list
  • route-map

As opposed to "view", "display" offers an auto-refreshed version of the buffer!

 differ

Last but not least: "differ". It splits the window into 2 buffers showing the same output at two different points in time, and it flags the lines that have changed.

Check BGP best path computation for BGP process 42
r1#diff ipv4 bgp 42 bestpath
...

With this view you can easily spot the differences between 2 advertisement intervals.
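
The idea behind "differ" (compare two successive snapshots and flag what changed) can be illustrated with Python's standard difflib module. This is a sketch of the concept only, not freeRouter's implementation. The first snapshot reuses two rows from the interface summary shown earlier; the sdn1 values in the second snapshot are made up.

```python
import difflib

def changed_lines(old: str, new: str) -> list:
    """Return only the lines that differ, prefixed '- ' (old) or '+ ' (new)."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]

before = """sdn1        up     15200591  17703953  448
sdn3        admin  0         0         0"""
# Second snapshot: the sdn1 counters are made-up values for the illustration.
after = """sdn1        up     15200999  17704100  448
sdn3        admin  0         0         0"""

for line in changed_lines(before, after):
    print(line)
```

Only the sdn1 row is reported, once in its old form and once in its new form; the unchanged sdn3 row is left out.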

To be honest, when I used this feature for the first time I was totally stunned and said: Wow ...

Simply amazing ... 

Discussion

show/view/watch/display/differ is pretty unique to freeRouter, and is really meant to provide you with the best user experience as a network operator! These commands have proven to be helpful, especially if you deal with huge feeds. However, be careful when you are working with very big outputs such as a BGP full feed. This won't crash the router, of course, as used to happen when we issued "debug ip packet", but it will certainly cause high CPU usage due to the regular refresh at the control plane level.

Conclusion

In this 2nd article:

  • We presented freeRouter show/watch/display/differ diagnosis commands
  • These are very useful when you have to deal with huge command output buffers.

Final words

As said, these diagnosis commands are specific to freeRouter. 2 decades of know-how and network experience have been poured into these features. They have been developed for one purpose:

"Provide a unique operation experience to network engineers"

Feel free to try them and use them as suits your environment and taste!

Last but not least, you can play with these different modes from this sandbox:

type "ssh dl.nop.hu" in a terminal (any user/pass will do) and choose "l"
ssh dl.nop.hu -l random_user                                                                                                                              
Warning: Permanently added 'dl.nop.hu,193.224.23.5' (RSA) to the list of known hosts.
random_user@dl.nop.hu's password: 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX     XXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX  XXXX XX XXXX XX XXXX XX XX XX XXXX XXXXX/~~~~~~\XXXXXX
XXXX X XXX XX XXXX XX XXXX XX XX XX XXXX XXXX| player |XXXXX
XXXX XX XX XX XXXX XX     XXX    XX XXXX XXXXX\______/XXXXXX
XXXX XXX X XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXX  XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX XXX XXX XX XXX    XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
welcome
line ready
menu lab:
# - reboot router1
$ - reboot router2
% - reboot router3
1 - connect to router1
2 - connect to router2
3 - connect to router3
^ - rebuild routers
l - connect to lg.nop.dn42
x - exit
choose:l - telnet 172.23.199.110 23 /telnet 
 - connecting to 172.23.199.110 23
 - securing connection

hi there!
try the following:
  show ipv4 route dn42
  show ipv6 route dn42
  show ipv4 bgp 65535 vpnuni summary
  show ipv6 bgp 65535 vpnuni summary
  show ipv4 bgp 65535 vpnuni database
  show ipv6 bgp 65535 vpnuni database
  show ipv4 bgp 65535 vpnuni allroute <prefix> 65535:42
  show ipv6 bgp 65535 vpnuni allroute <prefix> 65535:42
  show ipv4 logger 42 flapstat 10
  show ipv6 logger 42 flapstat 10
  show ipv4 bgp 65535 vpnuni flapstat 10
  show ipv6 bgp 65535 vpnuni flapstat 10
  show ipv4 bgp 65535 vpnuni flappath <prefix> 65535:42
  show ipv6 bgp 65535 vpnuni flappath <prefix> 65535:42
have fun!
mc36
welcome
line ready
player-dn42>                                                                   
player-dn42>                   
...

Then issue a "diff" command:

differ example with BGP command
player-dn42>diff ipv4 bgp 65535 vpnuni database 10.11.160.0/20 65535:42
...

You'll be rewarded with the diff output for this command, which means:

"show me the prefix status of 10.11.160.0/20 within BGP process 65535 with rd: 65535:42"

A quick look at the VRF definitions indicates that rd 65535:42 is tied to VRF dn42:

Check vrf list on router
player-dn42>sh start vrf                                                       
vrf definition dn42
 rd 65535:42
 rt-import 65535:42
 rt-export 65535:42
 source4route all
 source6route all
 mdt4
 mdt6
 exit
vrf definition rtbh
 rd 65535:666
 rt-import 65535:666
 rt-export 65535:666
 exit
vrf definition vpn
 rd 65535:1
 rt-import 65535:1
 rt-export 65535:1
 mdt4
 mdt6
 exit
...

In order to exit the sandbox session use the following escape sequence: Ctrl-c + Ctrl-x







This is a special blog series called "RARE Day One". I've always been a huge fan of Cisco and JUNIPER: Cisco has unparalleled documentation, and I really like the JUNIPER "Day One" and "This Week" booklets. Similar to the JUNIPER approach, RARE "Day One" articles deal with essential topics that you need to get familiar with and that will come in handy during your "RARE-freeRouter"-FU practice!

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

Even in the era of zero-touch configuration, where everything can be modelled by YANG and automated by Ansible, CLI configuration mode is essential and will keep a special place in network engineers' hearts.

Any network engineer in the room who never issued this command ?

Mythical "configure terminal" command
conf t
...

Article objective

In this article, we will present the configuration modes available in freeRouter. This is an essential article, as it will help you in your daily operation tasks.

Diagram

[ #001 ] - "configure <mode>"

 configure terminal

When you log into a piece of network equipment such as a router, you tend to have some automatic reflexes. You usually:

  • Check router configuration: show run or sh conf
  • Check ipv4 / ipv6 / or LFIB forwarding table
  • And then you want to configure the router

Let's assume you want to configure interface sdn3 description:

Mythical "configure terminal" command
r1#sh run sdn3                                                            
interface sdn3
 description r1@LAN3[05:00.0]
 mtu 1500
 macaddr 007b.0c15.1e0c
 shutdown
 no log-link-change
 exit

...

r1#conf t
r1(cfg)#                                                                  
r1(cfg)#int sdn3                                                          
r1(cfg-if)#                                                               
r1(cfg)#int sdn3                                                          
r1(cfg-if)#description Hello Workd SDN3                                
r1(cfg-if)#      

As you will notice, configuration entered from the "configure terminal" prompt takes effect immediately. Please note that you can issue "show" commands from config mode using the "do" keyword:

issue "show" command in configuration mode using "do" keyword
r1#conf t
r1(cfg)#                                                                  
r1(cfg)#int sdn3                                                          
r1(cfg-if)#                                                               
r1(cfg)#int sdn3                                                          
r1(cfg-if)#description Hello Workd SDN3                       
r1(cfg-if)#   
r1(cfg-if)#do sh run sdn3                                                 
interface sdn3
 description Hello Workd SDN3
 mtu 1500
 macaddr 007b.0c15.1e0c
 shutdown
 no log-link-change
 exit
   

At that point you have a running-config in router memory and you have a startup-config written into the freeRouter flash. In order to see the difference:

issue "show" command in configuration mode
...

r1(cfg-if)#do sh config                                                   
interface sdn3
 no description old_descrption
 description Hello Workd SDN3
 exit
...
r1(cfg-if)# end                                                           
r1#show config-differences                                                
interface sdn3
 no description old_descrption
 description Hello Workd SDN3
 exit   

Notice the use of the "end" primitive in order to leave configuration mode and return to user mode. In the example we used both the shortcut and the full command name:

  • sh config
  • show config-differences

So basically this command shows you the difference between running-config and startup-config. This is similar to Junos "show | compare", except that in this context it is a comparison between the running and startup configs.

In this case it just deletes the current description and replaces it with the new one.
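
To make the shape of that output concrete, here is a deliberately naive Python sketch that compares one stanza of a startup-config with the same stanza of a running-config and prints the commands leading from one to the other. It is my own simplification and certainly not freeRouter's algorithm: it compares lines literally and would mishandle lines that already start with "no". The stanza content is taken from the sdn3 examples above.

```python
def config_delta(old: list, new: list) -> list:
    """Commands that turn stanza `old` into stanza `new` (naive line comparison)."""
    header, old_body, new_body = new[0], old[1:-1], new[1:-1]
    removed = [f" no {line.strip()}" for line in old_body if line not in new_body]
    added = [line for line in new_body if line not in old_body]
    if not (removed or added):
        return []  # configs are aligned: nothing to show
    return [header, *removed, *added, " exit"]

startup = ["interface sdn3", " description r1@LAN3[05:00.0]",
           " mtu 1500", " shutdown", " exit"]
running = ["interface sdn3", " description Hello Workd SDN3",
           " mtu 1500", " shutdown", " exit"]

print("\n".join(config_delta(startup, running)))
```

Swapping the two arguments gives the commands needed to go back from the running-config to the startup-config.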

Once you are happy you can write the running-config into the startup-config:

write running-config into startup-config
...
r1#wr                                                                     
% success
r1#sh conf                                                                

r1#           

You can observe that "show config-differences" now has no output: running-config is aligned with startup-config!

Note

This is the most intuitive and recommended way to start learning freeRouter, as in this interactive mode you benefit from the contextual help that can be triggered by '?'. In this way you'll even be able to discover new freeRouter features yourself! This piece of software holds a tremendous amount of hidden functionality. In the output below we just check which control plane protocols can be activated ...

contextual help triggered by "?"
...
r1(cfg)#router ?                                                          
  babel4     - babel routing protocol
  babel6     - babel routing protocol
  bgp4       - border gateway protocol
  bgp6       - border gateway protocol
  blackhole4 - blackhole collector
  blackhole6 - blackhole collector
  deaggr4    - deaggregate creator
  deaggr6    - deaggregate creator
  download4  - route download
  download6  - route download
  eigrp4     - enhanced interior gateway routing protocol
  eigrp6     - enhanced interior gateway routing protocol
  flowspec4  - flowspec to flowspec rewriter
  flowspec6  - flowspec to flowspec rewriter
  isis4      - intermediate system intermediate system
  isis6      - intermediate system intermediate system
  logger4    - route logger
  logger6    - route logger
  lsrp4      - link state routing protocol
  lsrp6      - link state routing protocol
  mobile4    - mobile route creator
  mobile6    - mobile route creator
  msdp4      - multicast source discovery protocol
  msdp6      - multicast source discovery protocol
  olsr4      - optimized link state routing protocol
  olsr6      - optimized link state routing protocol
  ospf4      - open shortest path first
  ospf6      - open shortest path first
  pvrp4      - path vector routing protocol
  pvrp6      - path vector routing protocol
  rip4       - routing information protocol
  rip6       - routing information protocol
  uni2flow4  - unicast to flowspec converter
  uni2flow6  - unicast to flowspec converter
  uni2multi4 - unicast to multicast converter
  uni2multi6 - unicast to multicast converter         
       
 configure viewer

"configure viewer" is a very interesting mode, as it gives you the possibility to review the router configuration from a viewer inspired by "mcedit" (the Midnight Commander editor).

configure viewer
r1#configure viewer
...

Then you'll be able to read your configuration from a READ-ONLY text buffer:

As a side note, you can benefit from online help by pressing <f1>

But what if I just want to view a specific object ? Let's find out how to check ONLY BGP configuration @ home:

config viewer <regexp>
r1#configure viewer bgp4
...

So in this case it will just put my IPv4 BGP config snippet into the viewer buffer.

The same applies if I want to view only the sdn<x> interfaces from the router config:

config viewer <regexp>
r1#configure viewer sdn
...

This is so cool, isn't it?
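
To picture what such a filter does, here is a small Python sketch that splits a configuration into stanzas and keeps those whose first line matches a regular expression. It is only a model: I assume here that the match is done on the stanza header and that a stanza ends at a single-space "exit" line, as in the outputs of this article. How freeRouter actually matches is not covered here, and the sdn4 stanza is a made-up example.

```python
import re

def filter_stanzas(config: str, pattern: str) -> list:
    """Return the stanzas whose header line matches `pattern`."""
    stanzas, current = [], []
    for line in config.splitlines():
        if line and not line.startswith(" "):
            current = [line]          # an unindented line opens a stanza
        elif current:
            current.append(line)
            if line == " exit":       # a single-space "exit" closes it
                stanzas.append(current)
                current = []
    return ["\n".join(s) for s in stanzas if re.search(pattern, s[0])]

config = """vrf definition dn42
 rd 65535:42
 exit
interface sdn3
 description r1@LAN3[05:00.0]
 shutdown
 exit
interface sdn4
 shutdown
 exit
"""

for stanza in filter_stanzas(config, "sdn"):
    print(stanza)
```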

Note

In big TELCO Service Provider environments, most of the time you have Technical Project Managers who just need to perform some checks related to a specific customer VPN deployment. So sometimes I received calls like: "Can you please check that the HUB site prefix 1.2.3.0/24 is configured and advertised into BGP for customer ABC in VRF YXZ?" With "configure viewer <object>", the TPM can just check it for themselves without bothering you at all! And this without the fear of altering the router configuration by accident.

PS: For that you'll need to create an aaa security config with:

  • proper router aaa security policy with privilege level 1
  • with or without TACACS/RADIUS authentication / authorisation and accounting
  • and apply it to a specific OOBM SSH/telnet server in a specific VRF,

but this is not in the scope of the present article and will be the subject of further articles.

In SP environments, you should not be surprised to see router configurations that have 100k lines or even more. In these environments, I've seen configs with countless VRFs, NAT, DLSW, GRE and IPSEC tunnels, BGP peers ... "config viewer" is a great tool when you want to verify a specific stanza on a per-customer or per-object basis and, as a bonus, without any risk to the Provider Edge router configuration.

 configure editor

"configure viewer" gives you the possibility to view the config, or some parts of it, in read-only mode. "configure editor" simply gives you the possibility to also edit the running-config, or a specific stanza of it.

config editor
r1#configure editor
...

Then you'll be able to edit your configuration from a READ-WRITE text buffer:

As a side note, you can benefit from online help by pressing <f1>

You can press Ctrl+q in order to exit the editor. As you did not change anything it will exit the editor.

But what if I just want to edit a specific object ? Let's find out how to edit ONLY BGP configuration @ home:

config editor <regexp>
r1#configure editor bgp4
...

So in this case it will just put my IPv4 BGP config into the editor buffer.

In this buffer let's just create a description for BGP neighbor 172.23.215.177.

Now just press Ctrl-q (as per the online help accessible using <f1>). However, freeRouter detects that the buffer has changed, as we added a BGP description. Therefore it will ask you if you want to save the buffer changes into the running-config and apply them.

At that moment you'll be shown a small recap of what has been applied.

Even cooler, no?

Warning

Even if "config editor" is seductive and seems more appealing, especially for beginners, it is absolutely not meant for them. "configure editor" mode is meant for advanced users who know the freeRouter CLI by heart. Why, you might say? Just try to edit a gigantic BGP configuration without any contextual help, just by writing a text file, and you'll understand the risk behind using "config editor". Therefore it is not recommended to use it against complex control plane objects.

Please take note that "config editor" alters the running-config directly when you save the editor buffer!

Note

So what's the point of having this cool feature? It is powerful when it comes to simple control plane objects or big repetitive objects. It is very practical to use this feature against:

  • ACL
  • prefix-list
  • route policy list
  • route-map

but nothing prevents you from editing a BGP stanza if you feel that your freeRouter-fu needs to be challenged (wink)

 configure startup

Same as "config editor", but instead of working against the running-config you are editing the startup-config, which is safer ... till the next reload (wink)

config startup
r1#configure startup
...

 configure reload

"configure reload", despite what its name suggests, is not about reloading a router whatsoever (smile)

config reload
r1#configure reload ?                                                     
  <url> - source url

r1#configure reload    
...

This command takes a <url> as argument. Basically it fetches the router configuration from the specified <url> and loads it into the startup-config. It is equivalent to Cisco "copy <url> start". From that point:

  • it is up to the network operator to check the startup configuration
  • and issue a reload warm in order to restart the router and test that connectivity is resuming as expected
  • Check the running-config is aligned to startup-config


Warning

(repetition is not harmful) As said before, "configure reload" does not reload the router. It just loads the config from the specified <url> into the startup-config. This step precedes a reload that has to be triggered manually by the operator after having checked the config.

Note

In day-to-day operation, startup-config is usually not altered directly. In TELCO SP environments, IIRC, I used it mainly to retrieve configuration from a CMDB server in 2 situations:

  • Router first time installation after basic configuration staging enabling minimum connectivity
  • Router hardware replacement

Note that in SP environments, as VPN owner we could handle a portfolio of customers (~10). Each customer could have ~2000 CPEs. You can see why "config reload" can be very handy.

 configure network

"configure network" gives you the possibility to update/merge the existing running-config with a config exposed by a web server.

config network
r1#configure network ?                                                    
  <url> - source url

r1#configure network
...

This command takes a <url> as argument. Basically it fetches the configuration from the specified <url> and merges it into the running-config. It is equivalent to Cisco "copy <url> run". So, from that point:

Warning

  • only running-config is altered.
  • If not saved all changes will be lost in the next reload

Note

In day-to-day operation in TELCO SP environments, "configure network" is very useful when you want to apply the same configuration stanza to several routers at the same time.
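
If you do not have a web server at hand for a lab test, one way to expose a config snippet is Python's built-in http.server module. The sketch below serves a temporary directory on a free localhost port and fetches the file back the way any HTTP client would. The file name snippet.txt and its content are hypothetical, and on a real network you would bind to an address your router can reach.

```python
import functools
import http.server
import pathlib
import tempfile
import threading
import urllib.request

def serve_directory(directory: str) -> http.server.ThreadingHTTPServer:
    """Serve `directory` over HTTP on a free localhost port (background thread)."""
    handler = functools.partial(http.server.SimpleHTTPRequestHandler,
                                directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

# Hypothetical snippet to share between routers.
snippet = "interface sdn3\n description pushed from web server\n exit\n"

with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "snippet.txt").write_bytes(snippet.encode())
    server = serve_directory(tmp)
    url = f"http://127.0.0.1:{server.server_address[1]}/snippet.txt"
    # Fetch it back as a client would, bypassing any configured proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    fetched = opener.open(url).read().decode()
    server.shutdown()
    server.server_close()

print(url)
print(fetched)
```

The printed URL is the kind of <url> argument you would then hand to "configure network".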

 configure overwrite-network

"configure overwrite-network" is the same as "configure network", except that it replaces the running-config with the config exposed by a web server instead of merging into it.

config overwrite-network
r1#configure overwrite-network ?                                                    
  <url> - source url

r1#configure overwrite-network
...

This command takes a <url> as argument. Basically it fetches the configuration from the specified <url> and replaces the running-config with it. The closest Cisco equivalent is "configure replace <url>" rather than "copy <url> run", which merges. So, from that point:

Warning

  • only running-config is altered.
  • If not saved all changes will be lost in the next reload

Note

In day-to-day operation in TELCO SP environments, "configure overwrite-network" is very useful when you want to apply the same configuration to several routers at the same time from a clean slate (no merge).

 configure banner

"configure banner" is one of my favorite modes. It opens an editor allowing you to edit the banner of your router.

config banner
r1#configure banner                                                   
...

Press Ctrl-q and then y in order to save the banner.

Log in to your router again in order to check your new banner:


Note

In day-to-day operation, this banner can be written into the configuration using the "banner encoded" command:

banner encoded
banner encoded ICAgX18gICAgICAgICAgICAgICBfX19fICAgICAgICAgICAgIF8NCiAgLyBffF8gX18gX19fICBfX198ICBfIFwgX19fICBfICAgX3wgfF8gX19fIF8gX18NCiB8IHxffCAnX18vIF8gXC8gXyBcIHxfKSAvIF8gXHwgfCB8IHwgX18vIF8gXCAnX198DQogfCAgX3wgfCB8ICBfXy8gIF9fLyAgXyA8IChfKSB8IHxffCB8IHx8ICBfXy8gfA0KIHxffCB8X3wgIFxfX198XF9fX3xffCBcX1xfX18vIFxfXyxffFxfX1xfX198X3wNCiAgXyBfXyBfX18gICBfX198IHwgX19fX18gIHwgfA0KIHwgJ19fLyBfIFwgLyBfX3wgfC8gLyBfX3wgfCB8DQogfCB8IHwgKF8pIHwgKF9ffCAgIDxcX18gXCB8X3wNCiB8X3wgIFxfX18vIFxfX198X3xcX1xfX18vIChfKQ0KDQo=

This command corresponds to the banner mentioned above.
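
The encoded string above appears to be plain Base64. Assuming that is the case, you can prepare or inspect a banner offline with a few lines of Python. The helper names and the sample banner text are mine, for illustration only.

```python
import base64

def encode_banner(text: str) -> str:
    """Base64-encode a banner so it fits on one configuration line."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")

def decode_banner(encoded: str) -> str:
    """Decode a Base64 banner back to readable text."""
    return base64.b64decode(encoded).decode("ascii")

# Hypothetical banner text.
banner = "Welcome to r1\r\nAuthorized users only\r\n"
encoded = encode_banner(banner)

print(f"banner encoded {encoded}")
print(decode_banner(encoded))
```

Passing the long string from the configuration line above to decode_banner should give you back the ASCII-art banner.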

 configure revert

"configure revert" reverts the running-config to the startup-config. For Junos fans, it is equivalent to "rollback 0".

configure description
r1#sh run int sdn3                                                        
interface sdn3
 description r1@LAN3[05:00.0]
 mtu 1500
 macaddr 007b.0c15.1e0c
 shutdown
 no log-link-change
 exit
!
configure description
r1# conf t                                                                
r1(cfg)#int sdn3
r1(cfg-if)#description "This is the new description"

mjolnir(cfg-if)#do sh conf                                                     
interface sdn3
 no description r1@LAN3[05:00.0]
 description "This is the new description "
 exit
Let's diff between running and startup config
r1(cfg-if)#do sh conf                                                     
interface sdn3
 no description r1@LAN3[05:00.0]
 description "This is the new description "
 exit
sh run sdn3
mjolnir(cfg)#exit                                                              
mjolnir#sh run sdn3                                                            
interface sdn3
 description "This is the new description "
 mtu 1500
 macaddr 007b.0c15.1e0c
 shutdown
 no log-link-change
 exit
configure revert
mjolnir#configure revert                                                       
     1: interface sdn3
     2:  no description "This is the new description "
     3:  description r1@LAN3[05:00.0]
     4:  exit

errors=0
sh run sdn3
mjolnir#sh run sdn3                                                            
interface sdn3
 description r1@LAN3[05:00.0]
 mtu 1500
 macaddr 007b.0c15.1e0c
 shutdown
 no log-link-change
 exit

Note

In day-to-day operation in TELCO SP environments, "configure revert" should be used like "rollback 0" upon the running-config. This means that you are about to abandon the current running-config and re-apply the config that is in the startup-config. In our case it was only a changed description, but in some cases it can have more impact (changes to route filtering, route advertising etc.).

 configure rollback

"configure rollback" is very useful when you are in an operational situation that calls for a "trial and error" approach, and sometimes the error can lead to loss of connectivity to the router itself... Who has never experienced that?

First of all, we have a French saying: "Il n'y a que ceux qui ne font rien qui ne font pas de bêtise" (only those who do nothing never make mistakes). So don't feel guilty about that... I remember having isolated some sites just by accident ...

In this situation "configure rollback" is a combination of "configure revert" and the loss of the CLI TCP session. What does this mean in practice?

Imagine you are configuring redistribution between IS-IS and OSPF and that you forgot that the network has 2 connections. Without careful route filtering, this redistribution will result in a routing loop, and you may lose connectivity as a result of that configuration (never-ending routing advertisement loop, high CPU load etc.).

Upon losing the TCP connection, in "configure rollback" freeRouter will automatically revert to its startup-config.

You will therefore auto-magically get your connection back, in the state it was in before the route redistribution caused havoc.
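
The logic can be pictured with a toy Python model. This is not freeRouter code: `Router`, `rollback_session` and the config lines are hypothetical, and the "session loss" is simulated with a Python exception.

```python
class Router:
    """Toy model: two configs held as lists of lines."""
    def __init__(self, startup):
        self.startup = list(startup)
        self.running = list(startup)

class rollback_session:
    """Revert running-config to startup-config if the session is lost."""
    def __init__(self, router):
        self.router = router

    def __enter__(self):
        return self.router

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None and issubclass(exc_type, ConnectionError):
            self.router.running = list(self.router.startup)
            return True   # session loss handled: swallow the exception
        return False      # normal exit: keep the changes

r1 = Router(["placeholder line A", "placeholder line B"])

with rollback_session(r1) as r:
    r.running.append("risky redistribution")          # placeholder, not real syntax
    raise ConnectionResetError("CLI session lost")    # simulated loss of TCP session

print(r1.running == r1.startup)  # True: the risky change is gone
```

If the block ends normally, the changes stay in the running-config, exactly as with a plain "conf t" session.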

How cool is that !

Note

In IOS, I used to use the "reload in <x>" command; in Junos of course you have "commit confirmed", and the same goes for IOS-XR. So this airbag is not unique to freeRouter, but IT IS THERE!

 configure file

"configure file" gives you the possibility to update/merge the running-config from a local file on the flash filesystem.

config file
r1#configure file ?                                                       
  <file> - source file

r1#configure file
...

This command takes a <file> as argument. Basically it loads the configuration from the specified <file> and merges it into the running-config. It is equivalent to Cisco "copy <flash:file> run". The files available on the flash can be listed with "show flash":

show flash
mjolnir#show flash /rtr                                                        
date                 size     name
2009-12-31 23:00:00  18048    bundle.bin
2020-07-30 15:47:05  2477     c.sh
2009-12-31 23:00:00  22648    hdlcInt.bin
2020-08-26 07:35:35  2937     hwdet-all.sh
2020-07-31 13:31:28  203      hwdet-main.sh
2009-12-31 23:00:00  18616    mapInt.bin
2020-09-29 08:58:48  554856   mjolnir.log
2009-12-31 23:00:00  18088    modem.bin
2009-12-31 23:00:00  131432   p4dpdk.bin
2009-12-31 23:00:00  121896   p4emu.bin
2009-12-31 23:00:00  63144    p4pkt.bin
2009-12-31 23:00:00  18088    pcap2pcap.bin
2009-12-31 23:00:00  18608    pcapInt.bin
2009-12-31 23:00:00  18384    rawInt.bin
2020-09-28 11:54:12  598      rtr-hw.txt
2020-09-28 21:16:19  14607    rtr-sw.txt
2020-07-30 15:47:37  2022     rtr.err
2020-09-29 03:09:25  5587321  rtr.jar
2020-09-29 03:09:16  5585713  rtr.jar.bak
2020-09-29 03:09:26  24       rtr.rld
2020-09-23 03:06:12  529      rtr.scr
2020-09-23 03:06:11  483      rtr.scr.bak
2020-08-23 17:34:19  46       rtr.scr2
2020-08-23 17:34:18  0        rtr.scr2.bak
2020-09-23 03:06:11  542720   rtr.tar
2020-09-23 03:06:09  522240   rtr.tar.bak
2020-09-29 03:11:04  2330     rtr.ver
2020-09-29 03:11:03  3790694  rtr.zip
2020-09-29 03:10:57  3789659  rtr.zip.bak
2020-07-30 15:47:05  388      setup_dpdk.sh
2020-07-30 15:47:05  48       setup_route.sh
2020-07-30 15:47:05  2171     setup_veth.sh
2009-12-31 23:00:00  18048    stdLin.bin
2009-12-31 23:00:00  18440    tapInt.bin
2009-12-31 23:00:00  18224    ttyLin.bin
2009-12-31 23:00:00  18256    vlan.bin

 configure replace

"configure replace" gives you the possibility to replace the running-config with a local file from the flash filesystem.

config replace
r1#configure replace ?                                                       
  <file> - source file

r1#configure replace
...

This command takes a <file> as argument. Basically it loads the configuration from the specified <file> and replaces the running-config with it. The closest Cisco equivalent is "configure replace <flash:file>" rather than "copy <flash:file> run", which merges. The files available on the flash can be listed with "show flash":

show flash
mjolnir#show flash /rtr                                                        
date                 size     name
2009-12-31 23:00:00  18048    bundle.bin
2020-07-30 15:47:05  2477     c.sh
2009-12-31 23:00:00  22648    hdlcInt.bin
2020-08-26 07:35:35  2937     hwdet-all.sh
2020-07-31 13:31:28  203      hwdet-main.sh
2009-12-31 23:00:00  18616    mapInt.bin
2020-09-29 08:58:48  554856   mjolnir.log
2009-12-31 23:00:00  18088    modem.bin
2009-12-31 23:00:00  131432   p4dpdk.bin
2009-12-31 23:00:00  121896   p4emu.bin
2009-12-31 23:00:00  63144    p4pkt.bin
2009-12-31 23:00:00  18088    pcap2pcap.bin
2009-12-31 23:00:00  18608    pcapInt.bin
2009-12-31 23:00:00  18384    rawInt.bin
2020-09-28 11:54:12  598      rtr-hw.txt
2020-09-28 21:16:19  14607    rtr-sw.txt
2020-07-30 15:47:37  2022     rtr.err
2020-09-29 03:09:25  5587321  rtr.jar
2020-09-29 03:09:16  5585713  rtr.jar.bak
2020-09-29 03:09:26  24       rtr.rld
2020-09-23 03:06:12  529      rtr.scr
2020-09-23 03:06:11  483      rtr.scr.bak
2020-08-23 17:34:19  46       rtr.scr2
2020-08-23 17:34:18  0        rtr.scr2.bak
2020-09-23 03:06:11  542720   rtr.tar
2020-09-23 03:06:09  522240   rtr.tar.bak
2020-09-29 03:11:04  2330     rtr.ver
2020-09-29 03:11:03  3790694  rtr.zip
2020-09-29 03:10:57  3789659  rtr.zip.bak
2020-07-30 15:47:05  388      setup_dpdk.sh
2020-07-30 15:47:05  48       setup_route.sh
2020-07-30 15:47:05  2171     setup_veth.sh
2009-12-31 23:00:00  18048    stdLin.bin
2009-12-31 23:00:00  18440    tapInt.bin
2009-12-31 23:00:00  18224    ttyLin.bin
2009-12-31 23:00:00  18256    vlan.bin

Discussion

Most of you will simply use the basic "conf t" mode, but keep in mind that, depending on your context, all the other modes can prove very handy. The possibility to configure 1000 routers with one single config file using "config network" is a lifesaver. Being able to trigger automatic, definitive router staging using "conf reload" is tremendously useful when you have to deploy 10 routers a day. As said, "config view" lets non-operations staff check whether some configuration is present or not... "config editor" is very powerful when you want to edit a never-ending access-list, but remember to avoid using it for complex BGP configuration... You have been warned!

Conclusion

In this 1st article:

  • We presented the freeRouter configuration modes
  • Most of them are useful in a variety of contexts

Final words

None of these modes are new. IOS, IOS-XR, IOS-XE, NX-OS and JUNOS have their own config modes that are somewhat similar. In any case, the freeRouter config mode implementation is meant to address all needs from the network operator's perspective. As you can observe, configure mode offers an impressive list of modes. Feel free to try them and use them according to your environment and taste!

Last but not least, you can play with these different modes from this sandbox:

type "telnet dl.nop.hu" in a terminal and choose "1"
telnet dl.nop.hu
Trying 193.224.23.5...
Connected to dl.nop.hu.
Escape character is '^]'.
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX     XXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX  XXXX XX XXXX XX XXXX XX XX XX XXXX XXXXX/~~~~~~\XXXXXX
XXXX X XXX XX XXXX XX XXXX XX XX XX XXXX XXXX| player |XXXXX
XXXX XX XX XX XXXX XX     XXX    XX XXXX XXXXX\______/XXXXXX
XXXX XXX X XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXX  XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX XXX XXX XX XXX    XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
welcome
line ready
menu lab:
...
type "ssh dl.nop.hu" in a terminal (any user/pass will do) and choose "1"
ssh dl.nop.hu -l random_user
Warning: Permanently added 'dl.nop.hu,193.224.23.5' (RSA) to the list of known hosts.
random_user@dl.nop.hu's password: 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX     XXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX  XXXX XX XXXX XX XXXX XX XX XX XXXX XXXXX/~~~~~~\XXXXXX
XXXX X XXX XX XXXX XX XXXX XX XX XX XXXX XXXX| player |XXXXX
XXXX XX XX XX XXXX XX     XXX    XX XXXX XXXXX\______/XXXXXX
XXXX XXX X XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXX  XX XXXX XX XXXXXXX XX XX XXXX XXXXXXXXXXXXXXXXXXX
XXXX XXXXX XXX    XXX XXX XXX XX XXX    XXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
welcome
line ready
menu lab:
# - reboot router1
$ - reboot router2
% - reboot router3
1 - connect to router1
2 - connect to router2
3 - connect to router3
^ - rebuild routers
l - connect to lg.nop.dn42
x - exit
choose:1 - attach vdc lab1 

yourname#                                                                      
yourname#configure ?                                                           
  <cr>
  banner            - edit the banner
  editor            - configure from editor
  file              - append to running configuration
  network           - append to running configuration
  overwrite-network - overwrite the running configuration
  reapply           - !!!EXPERiMENTAL!!! try to reapply current configuration
  reload            - overwrite the startup configuration
  replace           - overwrite the running configuration
  revert            - revert to startup configuration
  rollback          - configure within auto-revert session
  startup           - edit the startup configuration
  terminal          - configure from this terminal
  viewer            - view current configuration

yourname#configure                                      
...

In order to exit the sandbox session use the following escape sequence: Ctrl-c + Ctrl-x

Another way to access the sandbox is a terminal web app that opens in your browser.







This is a special blog series called "RARE software architecture". As its name implies, it deals with topics related to RARE/freeRouter software design choices.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

The RARE project's objective is to provide a routing platform offering solutions for multiple use cases in the R&E landscape. In the picture below, the different use cases are shown in purple:

As you can see, each use case runs on different hardware, potentially with a different dataplane. As we were starting from a clean slate without much choice, especially regarding P4 programmability, the first dataplane (or P4 target) we considered was BMv2. BMv2 is an excellent way to learn P4, and it is still the first target we use to program and validate new features. After 6 months of practising our "P4-fu" we had:

  • a P4lang repository for Ubuntu bionic and focal
  • a Debian 10 repository
  • our first RARE/FreeRouter prototype powered by a P4 BMv2 dataplane!

Our initial plan, considering FreeRouter's Java nature, was to write a Java P4Runtime gRPC client able to program the entries in the tables exposed by BMv2 via the P4Info file. However, this would have tied FreeRouter code intimately to P4Runtime gRPC code. Even if this was the more natural choice, going in that direction assumed that dataplanes other than BMv2 would also be P4Runtime compliant. It turns out that this is not the case. We therefore opted for a simple message API over a bi-directional raw UNIX socket. We will see what this means later in this article.

Motivated by the successful experience with BMv2, we decided to move forward and started to study TOFINO as a target. We were eager to apply our P4 code to multi-terabit traffic. After a few P4 program compilations, my personal first impression was ... mind blowing! INTEL/BAREFOOT TOFINO effectively opened the door to multi-terabit packet processing... Just having the possibility to process traffic at these levels at your fingertips was exciting!

As a side note, the journey was not without suffering and pain... (smile) We had to port our BMv2 code, and porting to TOFINO was not "une lettre à la poste" (a French idiom for something that goes through effortlessly)... It is not that TOFINO programming is gratuitously painful. It is just that it is p4c-tofino's job to make sure that our packets are processed at silicon lightning speed. Imagine you are asked to convey parcels by driving from Paris to Amsterdam in a car with an infinitely large trunk, an infinite gas tank and no particular speed limit along the road. And then you are asked to do the same trip, but with a real car that has a fixed-size trunk and a 50 litre gas tank, and of course you have to obey the speed signs along the road.

In the first case, you would load as many parcels as you like, you would not even bother looking at your fuel gauge, and you might set the speed to 200 km/h. The second case forces you to think carefully about how many parcels fit in your trunk, to check whether one full tank is sufficient for the trip and, of course, to obey the speed signs.

If you allow me this comparison, this is where BMv2 and TOFINO programming differ.

But this pain was not in vain, it was for the greater good... You can't imagine the joy of seeing the TOFINO compiler display the word DONE! For the veterans who remember, it is the same feeling as when you manage to compile your first program in the Ada language. The compiler is so strict that compiling an Ada program is in itself a feat. No wonder this language is used in space rockets (Ariane).

Back to our dataplane interface story: even though TOFINO and BMv2 share some roots, and while BMv2 has P4Runtime as its northbound interface, INTEL/BAREFOOT pushed their own gRPC interface counterpart into the TOFINO platform with P4_16: BfRuntime.

Our best bet paid off as FreeRouter message API was unchanged and without much effort we could add a new dataplane "wingman" to the FreeRouter control plane.

To recap:

  • For BMv2: our interface issues P4Runtime RPC calls. This program is called forwarder.py
  • For TOFINO: our interface issues BfRuntime RPC calls. This program is called, without too much originality, bf_forwarder.py

At that point we were starting to have a decent LSR/LER router for CORE and Aggregation use cases.

But we still had nothing to propose at the EDGE/AGGREGATION layer: deploying P4 hardware might be far too expensive in specific contexts such as small R&E institutions like primary schools or small R&E labs. To that end, we started to study new targets such as VMWARE XDP and a very promising project: T4P4S ELTE. While we could not use XDP without a lot of P4 code rewriting and compromise, T4P4S ELTE looked very promising from our perspective. But due to a compilation issue, we could not move forward.

FPGA was also a solution that we considered, but we had no access to any P4-compliant FPGA hardware.

As a result, we were a little bitter and started to read about the DPDK library and to play with the DPDK examples... These examples were tremendously useful as they sparked some DPDK development within the RARE team. Step by step, Csaba, the FreeRouter lead developer, came up with this GENIUS idea: why don't we just emulate the RARE P4 dataplane program? We can still switch to T4P4S ELTE when it is ready.

P4emu/P4dpdk was then born ! 

To conclude this short story, RARE/FreeRouter has now 3 completely different dataplanes: (in order of appearance)

  • BMv2
  • TOFINO
  • DPDK


Unique RARE/FreeRouter feature

However, please note that the FreeRouter message API is common to the three dataplanes listed above. You'll see below how this structure makes the solution open, modular and interchangeable.

Article objective

In this article, we present the RARE/FreeRouter platform structure and focus on the interface(s) between the FreeRouter control plane and the various dataplanes.

Diagram

[ #001 ] - Modular design

 FreeRouter control plane

In this design, FreeRouter focuses on running control plane processes such as routing protocols: IGP(s) and BGP(s). There are other control plane processes, but let's focus on these. At some point in time, all IGPs/EGPs converge and have to create entries in a routing table. For IPv4, the entry is created in an IPv4 forwarding table and, similarly, an IPv6 route entry is created in an IPv6 forwarding table. From the FreeRouter point of view, these entry creations are triggered by calling one Java function twice, which generates 2 API messages: one for IPv4 and one for IPv6.

 Common message API

Let's add an IPv4 route into freeRouter CLI

route addition via freeRouter
conf t
ipv4 route v1 1.2.3.0 255.255.255.0 4.4.4.4
...

Upon entering the ipv4 route and pressing <enter>, you'll see the following message appearing

message API: route4_add
...
rx: ['route4_add', '1.2.3.0/24', '13063', '4.4.4.4', '1', '\n']
...

Let's delete the route via FreeRouter CLI

route deletion via freeRouter
conf t
no ipv4 route v1 1.2.3.0 255.255.255.0 4.4.4.4
...
message API: route4_del
...
rx: ['route4_del', '1.2.3.0/24', '13063', '4.4.4.4', '1', '\n']
...

Important note

In short, the message API is simply a collection of messages that trigger an entry ADD/DELETE/MODIFY in the corresponding dataplane table.

The documentation of this message API will be published soon, but those who are curious and can't wait for it can read the forwarder.py, bf_forwarder.py or p4dpdk.bin source code.
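
Until that documentation exists, the "rx:" traces above are the only description of the wire format in this article. They look like a space-separated text line, so here is a minimal Python sketch that tokenises such a line. The names ApiMessage and parse_message are hypothetical, and the meaning of the numeric fields (13063, 1) is not explained here, so they are kept as opaque strings.

```python
from typing import List, NamedTuple


class ApiMessage(NamedTuple):
    """One control-plane message: a command name plus its raw arguments."""
    command: str
    args: List[str]


def parse_message(line: str) -> ApiMessage:
    """Split one text line into a command and its arguments.

    Assumption: messages are space-separated and newline-terminated,
    as the 'rx:' traces suggest. This is not the documented format.
    """
    fields = line.split()  # also drops the trailing newline
    if not fields:
        raise ValueError("empty message")
    return ApiMessage(command=fields[0], args=fields[1:])


msg = parse_message("route4_add 1.2.3.0/24 13063 4.4.4.4 1 \n")
print(msg.command)  # route4_add
print(msg.args)     # ['1.2.3.0/24', '13063', '4.4.4.4', '1']
```

The first argument is the prefix and the third is the next hop entered on the CLI; the other two are left uninterpreted on purpose.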

 Candidate dataplane platform

As said at the beginning of the article, the freeRouter control plane has to deal with dataplanes of different natures, and for now freeRouter has three of them. Each of these dataplanes has its own northbound interface: P4Runtime for BMv2, BfRuntime for TOFINO, and P4DPDK for systems compatible with DPDK and equipped with a DPDK-compliant NIC.

For BMv2 we just had to write an interface that translates freeRouter API messages into P4Runtime gRPC calls. For BMv2 this interface is called forwarder.py.

For TOFINO we just had to write an interface that translates freeRouter API messages into BfRuntime gRPC calls. For TOFINO this interface is called bf_forwarder.py.

For DPDK we just had to write an interface that translates freeRouter API messages into DPDK primitives. This interface is included in the DPDK dataplane bundled with the freeRouter binaries: p4dpdk.bin

It is just as simple as that !

Discussion

 Dataplane addition made easy

This design is pretty unique: if for any reason you would like to "hook" the freeRouter control plane to another dataplane such as:

  • FPGA
  • or dataplane powered by kernel bypass technique such as RDMA
  • Or other NPU based dataplane
  • etc.

This is possible !

You would "just" have to port your P4 code logic to the target dataplane and create an interface able to translate API messages from FreeRouter into messages that the target dataplane understands.
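
To make the shape of such an interface concrete, here is a minimal Python sketch of the adapter idea. Everything in it is hypothetical: the Dataplane class, the in-memory stand-in target and the dispatch function are illustrations, not code from forwarder.py or bf_forwarder.py. It only covers the two route messages shown earlier and assumes space-separated message lines.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Dataplane(ABC):
    """Hypothetical interface that a new target would implement."""

    @abstractmethod
    def route4_add(self, prefix: str, args: List[str]) -> None: ...

    @abstractmethod
    def route4_del(self, prefix: str, args: List[str]) -> None: ...


class InMemoryDataplane(Dataplane):
    """Stand-in target: keeps the IPv4 table in a dict instead of hardware."""

    def __init__(self) -> None:
        self.ipv4_table: Dict[str, List[str]] = {}

    def route4_add(self, prefix: str, args: List[str]) -> None:
        self.ipv4_table[prefix] = args

    def route4_del(self, prefix: str, args: List[str]) -> None:
        self.ipv4_table.pop(prefix, None)


def dispatch(dataplane: Dataplane, line: str) -> bool:
    """Translate one message line into a call on the target dataplane.

    Returns False for messages this sketch does not handle.
    """
    fields = line.split()
    if len(fields) < 2:
        return False
    handlers = {"route4_add": dataplane.route4_add,
                "route4_del": dataplane.route4_del}
    handler = handlers.get(fields[0])
    if handler is None:
        return False
    handler(fields[1], fields[2:])
    return True


target = InMemoryDataplane()
dispatch(target, "route4_add 1.2.3.0/24 13063 4.4.4.4 1 \n")
print(target.ipv4_table)  # {'1.2.3.0/24': ['13063', '4.4.4.4', '1']}
```

A real port would replace InMemoryDataplane with calls into the target's own API, which is where most of the effort goes.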

Be cautious with the word "just"

The word "just" can be misleading. Indeed, depending on the target dataplane, it can be a huge task. With DPDK, we were lucky to get enough material to move forward and, again, p4dpdk.bin was a simple trial at the very beginning. But some other dataplanes may simply have to be ignored if we don't get enough material/support from NPU vendors.

 You can use your own control plane too !

This is something we have not experienced yet, but it may become a reality one day.

What if you have your own control plane and absolutely want to keep it, but would like to re-use the BMv2/TOFINO or DPDK RARE dataplane?

Well this is possible !

A long time ago I met Thomas MANGIN (yet another cool and nice French guy (smile) ), who is the author of ExaBGP. I did not talk to him about this and I don't want to give him bad ideas, but what if he wanted to hook a TOFINO P4 dataplane to ExaBGP?

Well, he would actually just have to teach ExaBGP to handle entry ADD/DELETE/MODIFY messages according to the message API above.
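
A control plane other than FreeRouter would sit on the sending side of the message API. The Python sketch below shows what that could look like, under loud assumptions: format_route4 is a hypothetical helper, the message layout is copied from the route4_add trace earlier in this article, the numeric fields are reused verbatim without knowing their meaning, and a local socket pair stands in for the real link to the dataplane interface.

```python
import socket


def format_route4(action: str, prefix: str, *fields: str) -> bytes:
    """Build one message line such as b'route4_add 1.2.3.0/24 ... \\n'."""
    if action not in ("add", "del"):
        raise ValueError("action must be 'add' or 'del'")
    line = " ".join([f"route4_{action}", prefix, *fields]) + " \n"
    return line.encode("ascii")


# A connected socket pair stands in for the control-plane/dataplane link.
control_side, dataplane_side = socket.socketpair()
control_side.sendall(format_route4("add", "1.2.3.0/24", "13063", "4.4.4.4", "1"))
received = dataplane_side.recv(4096).decode("ascii")
print(received.split(" "))
control_side.close()
dataplane_side.close()
```

Splitting the received line on spaces gives the same list shape as the "rx:" trace shown for route4_add.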

I also love the work done in the SONiC project, and I know that SONiC already has a P4 dataplane called switch.p4. I doubt it will ever be the case but, what if the SONiC project wanted to re-use the RARE dataplane, especially for its Service Provider capabilities?

OK, this sounds crazy, but the modular design we propose here is valid and can make the RARE dataplane available to other control planes.

Of course, we strongly suggest you stick with FreeRouter as, IMHO, you will realize that it has no match in the TELCO Service Provider space. You have the venerable IOS-XR and JUNOS, but these are not Open Source.


Conclusion

In this 1st article you:

  • had a 10,000-foot view of the RARE/FreeRouter modular design
  • saw that this design allows rapid dataplane addition without altering the FreeRouter code base whatsoever
  • saw that re-using the BMv2/TOFINO/P4DPDK dataplanes with another control plane has never been implemented, but is possible!

Message API documentation

For the time being, this message API is not yet publicly documented. However, it is available, buried inside the forwarder.py and bf_forwarder.py source code. This is work in progress, but if you feel an urgent need to use it, feel free to read the code.

PS: We will publish this document ASAP, but time plays against us ...




This is a special blog series called "RARE hardware platform". As its name implies, it deals with certified and tested platforms on which RARE/freeRouter can run out of the box.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

This series of articles covers the STORDIS BF2556X-1T P4 switch. The key highlights of this box are:

  • It is a P4 TOFINO NPU based switch
  • This TOFINO version has 2 cores (i.e. 2 pipes) and can handle up to 2 Tbps
  • It offers multiple connection types and rates:
    • 48x25G SFP28 and 8x100G QSFP28
      • SFP28 ports [1 - 16] can be configured as 1G/10G/25G
      • SFP28 ports [17 - 48] can be configured as 10G/25G
      • QSFP28 ports [49 - 56] can each be configured in 1x100G, 2x50G, 4x25G, 1x40G or 4x10G mode
  • SyncE and 1588 support

Article objective

In this article, we will just do a basic introduction of the BF2556X-1T

[ #001 ] - BF2556X-1T in a nutshell

 BF2556X-1T unboxing

Parcel

What's in the box

Included items

Quick Installation guide

Front panel

Back panel

STORDIS BF2556X-1T Racked

STORDIS BF2556X-1T alongside its P4 sibling: Edgecore WEDGE100BF32X

 Hardware specification

BF2556X-1T specification

The system uses Barefoot BFN-T10-032D-020 (Tofino 2.0T) Switch Chip which can support 20 x 100GE ports.

Major features are:

  • 2.0 Tbps bandwidth
  • One Barefoot BFN-T10-032D-020 (Tofino 2.0T) Switch ASIC
    • Ethernet support 80x25G SERDES ports
    • Management SERDES support four ports 10G-KR
    • PCIe Gen3 x 4lanes
  • Eight Marvell 98PX1024
    • Single chip support 4x25G SERDES
  • Network Interface
    • 48x25G SFP28 and 8x100G QSFP28
    • SFP28 ports 1~16 can be configured as 1G/10G/25G.
    • SFP28 ports 17~48 can be configured as 10G/25G.
    • Each QSFP28 port can be configured in 1x100G, 2x50G, 4x25G, 1x40G or 4x10G mode.
  • CPU Module: Optional Module design for flexibility
    • Intel® Xeon® Processors D1527 (BDXDE)
  • BMC: Base Board Management Controller
    • BMC is a specialized service processor that monitors the physical state of a system.
    • ASPEED AST2520
  • Management Port:
    • 3x RJ45 10/100/1000 Mbps OOBM (Out Of Band Management) ports
    • 1x console RJ45
    • 1x USB 3.0
  • FAN Tray:
    • Four 40mmx56mm fan trays
    • 3+1 redundancy
    • Front-to-back and back-to-front airflow supported
  • PSU:
    • 1+1 redundant PSU
    • Each PSU supplies 850 W to the system.
    • 12V standby power for system management chips.
    • Support DC power supply
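
As a quick sanity check, the port counts in the list above can be added up to see how they relate to the 2.0 Tbps and 80x25G SERDES figures. The Python sketch below is nominal arithmetic on the numbers quoted in this specification only; it says nothing about how lanes are wired inside the box, and the variable names are mine.

```python
SFP28_PORTS, SFP28_GBPS = 48, 25      # 48x25G SFP28
QSFP28_PORTS, QSFP28_GBPS = 8, 100    # 8x100G QSFP28
LANE_GBPS = 25                        # one 25G SERDES lane

front_panel_gbps = SFP28_PORTS * SFP28_GBPS + QSFP28_PORTS * QSFP28_GBPS
lanes = front_panel_gbps // LANE_GBPS

print(front_panel_gbps)  # 2000
print(lanes)             # 80

# QSFP28 breakout modes from the list above: (sub-ports, Gbps per sub-port).
QSFP28_MODES = {
    "1x100G": (1, 100),
    "2x50G": (2, 50),
    "4x25G": (4, 25),
    "1x40G": (1, 40),
    "4x10G": (4, 10),
}
for mode, (count, speed) in QSFP28_MODES.items():
    print(mode, count * speed, "Gbps per QSFP28 port")
```

The front panel total of 2000 Gbps matches the 2.0 Tbps figure, and dividing it by 25G gives the 80 SERDES lanes listed for the ASIC.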
CPU specification
lscpu
Architecture:        x86_64                                                         
CPU op-mode(s):      32-bit, 64-bit                                                 
Byte Order:          Little Endian                                                  
CPU(s):              16                                                             
On-line CPU(s) list: 0-15                                                           
Thread(s) per core:  2                                                              
Core(s) per socket:  8                                                              
Socket(s):           1                                                              
NUMA node(s):        1                                                              
Vendor ID:           GenuineIntel                                                   
CPU family:          6                                                              
Model:               86                                                             
Model name:          Intel(R) Xeon(R) CPU D-1548 @ 2.00GHz                          
Stepping:            3                                                              
CPU MHz:             799.832                                                        
CPU max MHz:         2600.0000                                                      
CPU min MHz:         800.0000                                                       
BogoMIPS:            4000.16                                                        
Virtualization:      VT-x                                                           
L1d cache:           32K                                                            
L1i cache:           32K                                                            
L2 cache:            256K                                                           
L3 cache:            12288K                                                         
NUMA node0 CPU(s):   0-15                                                           
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts md_clear flush_l1d
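
The lscpu output can also be read programmatically, for instance to confirm the logical CPU count or to look for the SSE4 flags that this series lists as a DPDK requirement. The Python sketch below works on a shortened sample of the output above; the function names are hypothetical and the sample keeps only the lines it needs.

```python
def parse_lscpu(text: str) -> dict:
    """Turn 'Key:   value' lines from lscpu into a dict of stripped strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def logical_cpus(info: dict) -> int:
    """Sockets x cores per socket x threads per core."""
    return (int(info["Socket(s)"])
            * int(info["Core(s) per socket"])
            * int(info["Thread(s) per core"]))


def has_sse4(info: dict) -> bool:
    """True when both sse4_1 and sse4_2 appear in the CPU flags."""
    flags = set(info.get("Flags", "").split())
    return {"sse4_1", "sse4_2"} <= flags


# Shortened sample taken from the lscpu output above.
sample = """\
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           1
Flags:               fpu sse sse2 ssse3 sse4_1 sse4_2 avx avx2
"""
info = parse_lscpu(sample)
print(logical_cpus(info))  # 16
print(has_sse4(info))      # True
```

On a live system the same functions could be fed the text returned by running lscpu.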

Discussion

 Hardware consideration

The BF2556X-1T is a powerhouse:

  • Its 8 cores, each with 2 threads, speed up P4 program compilation (the BF2556X-1T has twice as many cores as the WEDGE100BF32X)
  • SyncE and 1588 support will matter if your P4 application requires precise time synchronisation
  • 1G/10G/25G/40G/50G/100G connectivity via SFP28 and QSFP28 makes the BF2556X-1T suitable for multiple use cases:
    • MPLS PE in a P/PE architecture, offering 1G/10G access connectivity with uplinks toward the core
    • MPLS PE router in a collapsed-core design
    • Leaf or ToR switch/router
    • BRAS/BNG router

Conclusion

In this 1st article you:

  • had a brief description of the STORDIS BF2556X-1T hardware platform
  • saw that the hardware provides P4 connectivity at 1GE (16x1GE ports are available)
  • saw that, in addition to 1GE, it also provides 10/25/40/50/100G connectivity

RARE hardware platform: [ BF2556X-1T #001 ] - key take-away

  • From the RARE/FreeRouter point of view, the BF2556X-1T is a very good candidate for a PE (Provider Edge) router.

The 8x100G ports make it a strong candidate in a collapsed-core architecture (P functions merged with PE functions). With 32 GB of RAM (~10 full BGP feeds), the box can also be used as a BGP-only box, but you would then not take advantage of its ports. It can implement the BRAS/BNG use case, and it is also a good candidate as a ToR in a data center environment, with BGP/MPLS capability and the possibility to provide 1GE connections to servers purchased beforehand.

  • SyncE and 1588 support is a key feature if your application needs the precision provided by PTP

As we discover the box, we will explain in further articles how to benefit from this feature.

  • By design, RARE/freeRouter can coexist with virtualisation technology on the BF2556X-1T

We have just started our experience with this box. A series of articles dedicated to the BF2556X-1T will follow, depicting:

  • How to proceed with the initial OS installation
  • How to install the STORDIS BF2556X-1T software (TOFINO SDE and Gearbox)
  • Port operations on TOFINO ports: SFP28 ports 17-48 and QSFP28 ports 49-56
  • Port operations on Gearbox ports: SFP28 ports 1-16 (1G/10G/25G)
  • How to benefit from SyncE and 1588 support
  • The actual RARE/freeRouter installation

The installation should be compliant with ISP/telecom standards: it should survive power outages, be easy to upgrade, and start automatically at boot time without any human intervention.



The 1st article presented the hardware platform and the rationale behind the choices. Let's dive into the subject now!

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

Several choices were possible; we finally ended up following the KISS method. The operating system requirements are:

  • requirement #0: LTS operating system 
  • requirement #1: Benefit from LTS security patches
  • requirement #2: Must be able to run DPDK
  • requirement #3: (personal requirement) Must be familiar to me
  • requirement #4: Able to run Java software as freeRouter is written in Java
  • requirement #5: Small operating system software footprint
  • requirement #6: Support for IPv4/IPv6

The hardest path would be:

The objective is to have tight control of the software installed on the appliance. This guarantees the smallest footprint we hope to obtain. For those familiar with OpenWRT, we can reach a tiny image size. My OpenWRT image is 5Mb.

  • Use of NixOS or Nix package manager

This provides an incredible feature: commit/rollback functionality at the package management level!

Note

The features above are still under study within the RARE group. We will introduce these technologies once we feel more confident about how to integrate them into a streamlined deployment process.

Article objective

In this article we will go through the major steps in deploying Debian 10 stable aka Buster in order to prepare freeRouter installation.

Diagrams

[ #002 ] - Cookbook

 Operating system installation preparation
Get debian 10 minimal ISO
wget http://ftp.nl.debian.org/debian/dists/buster/main/installer-amd64/current/images/

On macOS, write the ISO to a USB key using balenaEtcher

balenaEtcher can be downloaded here

Via the appliance BIOS settings:

  • activate console port redirection:

Serial port activation option

  • configure serial port settings

Now that you have activated console port:

  • plug the USB key on which you previously burnt Debian 10
  • make sure you  set boot option from USB in BIOS settings
  • reboot

You can now proceed to the next step: Debian 10 installation

 Operating system installation

We will assume that you have installed Debian 10 on the 256 GB SSD.

As a side note, during the installation process you will be shown the "Software selection" window. At this step we will:

  • unselect everything
  • select "SSH server"

Software selection

This guarantees the smallest Debian 10 software footprint. We will install the packages we need manually, on demand.

 packages installation needed by RARE/freeRouter

On a minimal installation, sudo is not installed, so all the software installation is done as root.

minimal Java installation
apt-get update
apt-get install default-jre-headless

The latest DPDK software is needed. We use the Debian 10 backports repository in order to get DPDK 19.11.2-1~bpo10+1

dpdk from debian 10 backports repository
echo "deb http://deb.debian.org/debian buster-backports main" | tee /etc/apt/sources.list.d/buster-backports.list
apt-get update
apt-get install -t buster-backports dpdk dpdk-dev
Check DPDK version
dpkg -l | grep dpdk
ii  dpdk                                    19.11.2-1~bpo10+1            amd64        Data Plane Development Kit (runtime)
ii  dpdk-dev                                19.11.2-1~bpo10+1            amd64        Data Plane Development Kit (dev tools)
ii  libdpdk-dev:amd64                       19.11.2-1~bpo10+1            amd64        Data Plane Development Kit (basic development files)
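
If you want to script this check, the dpkg listing can be parsed to confirm that the DPDK packages really come from backports. The Python sketch below is an illustration only: the function names are hypothetical, the sample is a condensed copy of the listing above, and it relies on the "~bpo" marker that Debian backport versions carry.

```python
def installed_versions(dpkg_output: str) -> dict:
    """Map package name -> version for lines marked 'ii' (installed)."""
    versions = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii":
            name = fields[1].split(":")[0]  # drop an ':amd64' style suffix
            versions[name] = fields[2]
    return versions


def from_backports(version: str) -> bool:
    """Debian backport versions carry a '~bpo' marker."""
    return "~bpo" in version


# Condensed copy of the 'dpkg -l | grep dpdk' listing above.
sample = """\
ii  dpdk               19.11.2-1~bpo10+1  amd64  Data Plane Development Kit (runtime)
ii  dpdk-dev           19.11.2-1~bpo10+1  amd64  Data Plane Development Kit (dev tools)
ii  libdpdk-dev:amd64  19.11.2-1~bpo10+1  amd64  Data Plane Development Kit (basic development files)
"""
versions = installed_versions(sample)
print(versions["dpdk"])                                    # 19.11.2-1~bpo10+1
print(all(from_backports(v) for v in versions.values()))   # True
```

A version without the marker would suggest the package came from the main buster repository instead.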
additional 3rd party software used by freeRouter
apt-get update
apt-get install unzip net-tools libpcap-dev ethtool default-jre-headless psmisc tcpdump
 create freeRouter /rtr folder

In this setup we will create a freeRouter folder at the filesystem root directory

Create freeRouter folder at filesystem root directory
mkdir /rtr
get freeRouter control plane software
cd /rtr 
wget http://freerouter.nop.hu/rtr.jar
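get freeRouter net-tools tarball
cd /rtr 
# assumption: rtr.tar is published next to rtr.jar on the same server
wget http://freerouter.nop.hu/rtr.tar
tar xvf rtr.tar -C /rtr
rm rtr.tar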
 Disable host networking (One time installation)

As freeRouter is handling the networking task, we have to disable the appliance networking. Forgetting to do so will result in conflicts and unpredictable behaviour. 

Disable networking from systemd perspective
systemctl set-default multi-user.target
rm /usr/lib/systemd/network/*
SVC="network-manager NetworkManager ModemManager systemd-network-generator systemd-networkd systemd-networkd-wait-online systemd-resolved hostapd wpa_supplicant"
systemctl disable $SVC
systemctl mask $SVC
 freeRouter systemd startup script
freeRouter systemd startup script
cat /lib/systemd/system/rtr.service

[Unit]
Description=router processes
Wants=network.target
After=network-pre.target
Before=network.target

[Service]
Type=forking
ExecStart=/rtr/hwdet-all.sh

[Install]
WantedBy=multi-user.target
/rtr/hwdet-all.sh script
cat /rtr/hwdet-all.sh

#!/bin/sh

cd /rtr
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6
echo 0 > /proc/sys/net/ipv6/conf/lo/disable_ipv6
ip link set lo up mtu 65535
ip addr add 127.0.0.1/8 dev lo
ip addr add ::1/128 dev lo

# DPDK
echo 96 > /proc/sys/vm/nr_hugepages
modprobe uio_pci_generic

dpdk-devbind.py -b uio_pci_generic 01:00.0 
dpdk-devbind.py -b uio_pci_generic 02:00.0 
dpdk-devbind.py -b uio_pci_generic 05:00.0 
dpdk-devbind.py -b uio_pci_generic 06:00.0 
dpdk-devbind.py -b uio_pci_generic 07:00.0 
dpdk-devbind.py -b uio_pci_generic 08:00.0 

#VETH for CPU_PORT and OOBM_PORT
ip link add veth0a type veth peer name veth0b

ip link set veth0a multicast on
ip link set veth0a allmulti on
ip link set veth0a promisc on
ip link set veth0a mtu 8192
ip link set veth0a up

ip link set veth0b multicast on
ip link set veth0b allmulti on
ip link set veth0b promisc on
ip link set veth0b mtu 8192
ip link set veth0b up

ethtool -K veth0a rx off
ethtool -K veth0a tx off
ethtool -K veth0a sg off
ethtool -K veth0a tso off
ethtool -K veth0a ufo off
ethtool -K veth0a gso off
ethtool -K veth0a gro off
ethtool -K veth0a lro off
ethtool -K veth0a rxvlan off
ethtool -K veth0a txvlan off
ethtool -K veth0a ntuple off
ethtool -K veth0a rxhash off
ethtool --set-eee veth0a eee off

ethtool -K veth0b rx off
ethtool -K veth0b tx off
ethtool -K veth0b sg off
ethtool -K veth0b tso off
ethtool -K veth0b ufo off
ethtool -K veth0b gso off
ethtool -K veth0b gro off
ethtool -K veth0b lro off
ethtool -K veth0b rxvlan off
ethtool -K veth0b txvlan off
ethtool -K veth0b ntuple off
ethtool -K veth0b rxhash off
ethtool --set-eee veth0b eee off

ip link add veth1a type veth peer name veth1b

ip link set veth1a multicast on
ip link set veth1a allmulti on
ip link set veth1a promisc on
ip link set veth1a mtu 1500
ip link set veth1a up

ip link set veth1b multicast on
ip link set veth1b allmulti on
ip link set veth1b promisc on
ip link set veth1b mtu 8192
ip link set veth1b up

ip link set wlan0 up

ethtool -K veth1a rx off
ethtool -K veth1a tx off
ethtool -K veth1a sg off
ethtool -K veth1a tso off
ethtool -K veth1a ufo off
ethtool -K veth1a gso off
ethtool -K veth1a gro off
ethtool -K veth1a lro off
ethtool -K veth1a rxvlan off
ethtool -K veth1a txvlan off
ethtool -K veth1a ntuple off
ethtool -K veth1a rxhash off
ethtool --set-eee veth1a eee off

ethtool -K veth1b rx off
ethtool -K veth1b tx off
ethtool -K veth1b sg off
ethtool -K veth1b tso off
ethtool -K veth1b ufo off
ethtool -K veth1b gso off
ethtool -K veth1b gro off
ethtool -K veth1b lro off
ethtool -K veth1b rxvlan off
ethtool -K veth1b txvlan off
ethtool -K veth1b ntuple off
ethtool -K veth1b rxhash off
ethtool --set-eee veth1b eee off

ip addr flush dev veth1a 
ip addr add 192.168.128.254/24 dev veth1a

#ADD DEFAULT ROUTE to OOBM SDN999
route add default gw 192.168.128.1

# START RTR !
start-stop-daemon -S -b -x /rtr/hwdet-main.sh
make hwdet-main.sh executable
chmod u+x /rtr/hwdet-main.sh

A bit of explanation

Disable IPv6
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6
echo 0 > /proc/sys/net/ipv6/conf/lo/disable_ipv6
ip link set lo up mtu 65535

Note that no IPv6 processing happens on the host itself (only the loopback keeps IPv6 enabled); IPv6 is handled at the freeRouter level.

DPDK configuration
echo 96 > /proc/sys/vm/nr_hugepages
modprobe uio_pci_generic

dpdk-devbind.py -b uio_pci_generic 01:00.0 
dpdk-devbind.py -b uio_pci_generic 02:00.0 
dpdk-devbind.py -b uio_pci_generic 05:00.0 
dpdk-devbind.py -b uio_pci_generic 06:00.0 
dpdk-devbind.py -b uio_pci_generic 07:00.0 
dpdk-devbind.py -b uio_pci_generic 08:00.0 

In the stanza above, we configure DPDK (required)

  • Configure HugePages

In this case we use 96 hugepages. This value can differ if your box has different characteristics (number of ports, memory, etc.). The objective is to configure a value that is neither too high (a waste of resources) nor too small (otherwise p4dpdk won't run). In this case, this leaves 10 free HugePages.

HugePages verification
grep HugePages_ /proc/meminfo
HugePages_Total:      96
HugePages_Free:       10
HugePages_Rsvd:        0
HugePages_Surp:        0 
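If you want to see how many HugePages are actually consumed, you can subtract the free counter from the total. The helper below is only a sketch: hugepages_in_use is a name I made up for illustration, and it simply parses meminfo-style text on standard input.

```shell
#!/bin/sh
# Hypothetical helper (not part of freeRouter): report how many HugePages
# are in use, given /proc/meminfo-style text on stdin.
hugepages_in_use() {
  awk '/^HugePages_Total:/ {total=$2}
       /^HugePages_Free:/  {free=$2}
       END {print total - free}'
}

# On the appliance you would feed it the live counters:
#   hugepages_in_use < /proc/meminfo
# Here we reuse the sample values shown above.
printf 'HugePages_Total:      96\nHugePages_Free:       10\n' | hugepages_in_use
```

With the values shown above this prints 86, i.e. 96 configured minus 10 free.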
  • Activate UIO_PCI_GENERIC driver
  • Bind the interfaces to DPDK. DPDK controls them from now on; keep in mind that they are now invisible to the Linux kernel.

This command uses the device PCI ID. To check the PCI IDs, just issue the command below:

List PCI devices and the driver they are bound to (DPDK or kernel)
 dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:01:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:02:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:05:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:06:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:07:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:08:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb

Network devices using kernel driver
===================================
0000:09:00.0 'AR928X Wireless Network Adapter (PCI-Express) 002a' if=wlan0 drv=ath9k unused=uio_pci_generic 

No 'Baseband' devices detected
==============================

Other Crypto devices
====================
0000:00:1a.0 'Atom Processor Z36xxx/Z37xxx Series Trusted Execution Engine 0f18' unused=uio_pci_generic

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
=================================== 
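
To double check that the ports really moved to the DPDK-compatible driver, the status output can be filtered. This is a minimal sketch that assumes the output format shown above; bound_to_uio is a hypothetical helper name, not a DPDK tool.

```shell
#!/bin/sh
# Hypothetical helper: print the PCI address of every device whose ACTIVE
# driver (drv=...) is uio_pci_generic. Devices that merely list it under
# "unused=" are ignored.
bound_to_uio() {
  awk '/drv=uio_pci_generic/ {print $1}'
}

# On the appliance: dpdk-devbind.py --status | bound_to_uio
# Sample lines copied from the status output above:
bound_to_uio <<'EOF'
0000:01:00.0 'I211 Gigabit Network Connection 1539' drv=uio_pci_generic unused=igb
0000:09:00.0 'AR928X Wireless Network Adapter (PCI-Express) 002a' if=wlan0 drv=ath9k unused=uio_pci_generic
EOF
```

Only the I211 port is reported; the wireless adapter stays with its kernel driver (ath9k), which is why wlan0 remains visible to Linux.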
  • Configure the appliance OOBM via veth pair (as all physical ports are handled by DPDK and will be invisible from the Linux kernel)
veth0a/veth0b for CPU_PORT
#VETH for CPU_PORT and OOBM_PORT
ip link add veth0a type veth peer name veth0b

ip link set veth0a multicast on
ip link set veth0a allmulti on
ip link set veth0a promisc on
ip link set veth0a mtu 8192
ip link set veth0a up

ip link set veth0b multicast on
ip link set veth0b allmulti on
ip link set veth0b promisc on
ip link set veth0b mtu 8192
ip link set veth0b up

ethtool -K veth0a rx off
ethtool -K veth0a tx off
ethtool -K veth0a sg off
ethtool -K veth0a tso off
ethtool -K veth0a ufo off
ethtool -K veth0a gso off
ethtool -K veth0a gro off
ethtool -K veth0a lro off
ethtool -K veth0a rxvlan off
ethtool -K veth0a txvlan off
ethtool -K veth0a ntuple off
ethtool -K veth0a rxhash off
ethtool --set-eee veth0a eee off

ethtool -K veth0b rx off
ethtool -K veth0b tx off
ethtool -K veth0b sg off
ethtool -K veth0b tso off
ethtool -K veth0b ufo off
ethtool -K veth0b gso off
ethtool -K veth0b gro off
ethtool -K veth0b lro off
ethtool -K veth0b rxvlan off
ethtool -K veth0b txvlan off
ethtool -K veth0b ntuple off
ethtool -K veth0b rxhash off
ethtool --set-eee veth0b eee off

So the above section is pretty straightforward:

  • It creates veth0a / veth0b pair. For those familiar with P4, this is similar to the channel between the control plane (freeRouter) and p4dpdk (dataplane) using CPU_PORT
  • It sets for veth0a/veth0b: multicast/allmulti/promisc flag + mtu=8192
  • It disables the offload features (checksum, segmentation, VLAN, etc.) and EEE for veth0a/veth0b
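
The ethtool stanzas are identical for every veth interface, so they could be factored into a loop. The sketch below is not how hwdet-all.sh is written; disable_offloads and the RUN switch are names I introduce for illustration. By default it only prints the commands (dry run), so you can compare them with the script above before applying anything.

```shell
#!/bin/sh
# Offload features switched off in the script above (same order).
OFFLOADS="rx tx sg tso ufo gso gro lro rxvlan txvlan ntuple rxhash"

# Hypothetical helper: print (dry run) or execute the ethtool calls for one
# interface. Set RUN=1 to really apply them (needs root and ethtool).
# The "ethtool --set-eee <if> eee off" call is left out of this sketch.
disable_offloads() {
  ifname="$1"
  for feature in $OFFLOADS; do
    if [ "${RUN:-0}" = "1" ]; then
      ethtool -K "$ifname" "$feature" off
    else
      echo "ethtool -K $ifname $feature off"
    fi
  done
}

disable_offloads veth0a
```

Calling it once per interface (veth0a, veth0b, veth1a, veth1b) yields the same twelve ethtool -K commands per interface as the explicit version.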

We do the same thing for the Out Of Band management (linux access)

veth1a/veth1b for OOB management
ip link add veth1a type veth peer name veth1b

ip link set veth1a multicast on
ip link set veth1a allmulti on
ip link set veth1a promisc on
ip link set veth1a mtu 1500
ip link set veth1a up

ip link set veth1b multicast on
ip link set veth1b allmulti on
ip link set veth1b promisc on
ip link set veth1b mtu 8192
ip link set veth1b up

ip link set wlan0 up

ethtool -K veth1a rx off
ethtool -K veth1a tx off
ethtool -K veth1a sg off
ethtool -K veth1a tso off
ethtool -K veth1a ufo off
ethtool -K veth1a gso off
ethtool -K veth1a gro off
ethtool -K veth1a lro off
ethtool -K veth1a rxvlan off
ethtool -K veth1a txvlan off
ethtool -K veth1a ntuple off
ethtool -K veth1a rxhash off
ethtool --set-eee veth1a eee off

ethtool -K veth1b rx off
ethtool -K veth1b tx off
ethtool -K veth1b sg off
ethtool -K veth1b tso off
ethtool -K veth1b ufo off
ethtool -K veth1b gso off
ethtool -K veth1b gro off
ethtool -K veth1b lro off
ethtool -K veth1b rxvlan off
ethtool -K veth1b txvlan off
ethtool -K veth1b ntuple off
ethtool -K veth1b rxhash off
ethtool --set-eee veth1b eee off

ip addr flush dev veth1a 
ip addr add 192.168.128.254/24 dev veth1a

Add default route to SDN999 for OOBM return traffic (192.168.128.1 is freeRouter sdn999: we will see the full config later)

#ADD DEFAULT ROUTE to OOBM SDN999
route add default gw 192.168.128.1

Finally, start the freeRouter main loop

Start freeRouter inside main loop
start-stop-daemon -S -b -x /rtr/hwdet-main.sh

This main loop is triggered by the script hwdet-main.sh below:

/rtr/hwdet-main.sh script
cat /rtr/hwdet-main.sh 

#!/bin/sh

while (true); do
  cd /rtr/
  stty raw < /dev/tty
  java -Xmx4g -jar /rtr/rtr.jar router /rtr/rtr-
  if [ $? -eq 4 ] ; then
    sync
    reboot -f
  fi
  stty cooked < /dev/tty
  sleep 1
done  

A bit of explanation

Requirement considerations:

  • The box should run 24x7
  • It must survive a power cut, i.e. the service should be restored each time power comes back, whatever the reason for the cut
  • If there was no power cut but freeRouter has crashed for any reason, it should be restarted

Let me reassure you: freeRouter usually doesn't crash. Most often, freeRouter restarts because of a manual or, better, an automatic upgrade :)

freeRouter infinite loop: freeRouter autoupgrade process restarts and self-restarts
while (true); do
  ...
done  
  • The appliance has 8 GB of RAM, which is enough for the JVM running freeRouter. (Full IPv4/IPv6 routing at the control plane is possible at home!  ← OK, this is useless but cool, no? :3 )
    • RAM allocation is for JVM and its tables
    • Additional RAM allocation is for p4dpdk and p4emu, as we have to store the table once for the native code too
    • Lastly the kernel also needs memory, so it's a good idea to leave some free RAM and not give everything to JVM.
Start freeRouter
java -Xmx4g -jar /rtr/rtr.jar router /rtr/rtr-
  • freeRouter "Cold reboot": when freeRouter exits with return code 4, the script flushes the disks (sync) and forces a reboot
Cold reboot
if [ $? -eq 4 ] ; then
  sync
  reboot -f
fi
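
To make the loop's decision explicit, here is a small sketch of the same logic with the java call taken out. next_action is a hypothetical function used only for illustration; the only fact taken from hwdet-main.sh is that exit code 4 leads to sync + reboot -f, while any other exit code simply falls through to sleep 1 and a new iteration.

```shell
#!/bin/sh
# Hypothetical helper mirroring the decision taken in hwdet-main.sh.
next_action() {
  if [ "$1" -eq 4 ]; then
    echo "cold-reboot"   # real script: sync; reboot -f
  else
    echo "restart"       # real script: sleep 1, then start java again
  fi
}

# Walk through a few exit codes to see what the loop would do.
for rc in 0 1 4; do
  echo "exit code $rc -> $(next_action "$rc")"
done
```

Only the third line reports a cold reboot; a normal exit or a crash just restarts the router process.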

Discussion

 Design choice considerations

All the choices have been made to make the appliance as resilient as possible and to provide an enjoyable user experience. In a later article, we will see a feature that I love: auto-upgrade. It keeps your appliance up to date over the network with the latest freeRouter train during a low-traffic period. Of course, for ISP P/PE core routers we don't want this. But hey, why not? As long as all customers are dual-homed to 2 different PEs reachable via 2 direct core paths, this can be achieved during a low-traffic period after having set the metric to infinity on all the PE/P boxes to be upgraded (use the IS-IS overload bit or OSPF max-metric router-lsa).

Conclusion

In this article, we got our hands dirty and manually installed freeRouter with a DPDK dataplane from a clean-slate environment. This was done on purpose, as I'd like you to understand the whole installation process in detail. There is an automated installation alternative that will also install freeRouter. However, it installs freeRouter with the software backend. If your hardware (CPU + NIC) is compatible, you can just replace the software backend with the DPDK backend. At this point we have a vanilla installation of freeRouter with a DPDK dataplane on an appliance that can survive a harsh physical environment and power cuts. We now just have to create the 2 freeRouter configuration files:

freeRouter configuration files
ls -l rtr-*
-rw-r--r-- 1 root root  646 Jul 31 17:03 rtr-hw.txt
-rw-r--r-- 1 root root 9027 Aug 25 10:02 rtr-sw.txt


RARE validated design: [ SOHO #003 ] - key take-away

  • freeRouter installation is not complex. It boils down to installing a basic supported Linux OS, Java, some 3rd-party software, and the freeRouter jar and binaries themselves
  • In the binary list you'll find a special one called p4dpdk: it is the freeRouter DPDK dataplane, which emulates the RARE P4 program on BMv2 (it does not emulate BMv2!)
  • Though this installation is manual for pedagogical purposes, it can be fully automated: just fire up a VM with a bunch of interfaces and test it!
  • The proposed installation is highly resilient and will ease upgrades of the appliance (we will see in a subsequent article what this means ;) )

In the next article, we will configure the freeRouter appliance, start the router, and provide configuration in order to have effective basic ping reachability to the FTTH BROADBAND internal IP.

The "RARE/FreeRouter-101" series of articles is meant to help you quickly kickstart your very first RARE/freeRouter deployment and understand via a series of tutorials how it can be powered by various dataplanes. 101 article series also explained how RARE/freeRouter could be configured in order to be integrated into the external network environment. 101- [ #006 ] introduced an interesting solution for SOHO (small office/home office). You'll see in this "RARE validated design" series of articles,  an innovative implementation of a SOHO routing platform. These articles will draw your attention to an exceptional SOHO router with features usually implemented only by commercial solutions in service provider environments.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

Back in 2004, I deployed an 8 Mbps ATM circuit that connected an airline company hub site. Traffic has grown amazingly since then! In 2020, what does SOHO (Small Office, Home Office) mean? In our use case we will consider a SOHO connected via a 1GE link. This covers, for example:

  • Primary schools, Secondary schools
  • Small R&E institution spoke sites
  • Home office (especially considering the COVID context)
  • Small company spoke agencies

Article objective

In this article we will describe how to build a carrier-grade SOHO router (aka CPE) from an actual, real platform. In this example, let me share my personal story and introduce you to the SOHO hardware that I'm using at home. It is compliant with the requirements implied by the use cases listed above:

Requirements

  • requirement #0: n×1GE capable, ISP uplink is 1GE 
  • requirement #1: completely silent, the box can be placed in a crowded room
  • requirement #2: small power consumption, as it is meant to run 24x7. (I'm paying the bill! :) )
  • requirement #3: Run 64-bit linux 
  • requirement #4: native support of DPDK

Diagrams

[ #001 ] - Cookbook

 Hardware selection

Hardware specification

  • 6× Intel I211-AT Gigabit Ethernet, supports Wake-on-LAN
  • Supports 1× mSATA SSD, 1× DDR3L 1.35V memory 1333/1600MHz, max 8GB
  • 1× VGA max resolution 1920x1080P
  • 1× COM RJ45 console
  • Supports an add-on WiFi module (Mini PCI-E, half-height size)
  • Supports automatic power-on after power is restored
  • Ultra compact: measures 180×175×34mm
  • Low power requirements: saves money and is more eco-friendly
  • Fanless, passive cooling, noise-less

CPU specification

  • CPU identifier: J1900
  • # of cores: 4
  • # of threads: 4
  • Processor base frequency: 2.00 GHz
  • Burst frequency: 2.42 GHz
  • Cache: 2 MB L2 cache
  • TDP: 10 W

freeRouter is heavily multithreaded, so 4 cores are appreciated. As this is a budget SOHO router, hardware assistance for VPN is not required. If a VPN concentrator is needed, we can deploy a dedicated box with a CPU that supports AES-NI in the SOHO environment. freeRouter won't run as a VM here, so VT-x, VT-d and VT-c are not required.

SOHO usage

  • home office work
  • regular 720p/1080p/4K (and more) on-line VC via RENATER RENDEZ-VOUS or ZOOM
  • intensive online gaming by the grown-up kids (2–3 persons can play an online game at the same time)
  • these kids + wife can multitask and watch 480p/720p YouTube videos at the same time (these are the digital natives...)
  • streaming video from MyCanal (French Netflix competitor)
  • Operating system/school educational material  parallel downloads
  • Intensive social network usage via native mobile client having integrated video in the apps ...

Bandwidth check

All of the above usage requires a high amount of bandwidth, as all of these activities can occur in parallel. This is the Speedtest result during crowded working hours:

So my ISP was not totally lying after all, though I could not reach the theoretical 1GE that the ISP advertisement boasts. ;)

SOHO comments

Please note that this hardware has no optical/SFP port. There are indeed similar configurations with 1 optical uplink port, in case you are also the service provider in your environment. This hardware is specific to the FTTH environment currently deployed in France.

 Operating system selection

Operating system specification

  • Debian 10 (aka Buster) 
  • netinstall is used
  • minimal vanilla installation

Requirements

  • requirement #0: LTS operating system 
  • requirement #1: Benefit from LTS security patches
  • requirement #2: Must be able to run dpdk
  • requirement #3: (personal requirement) Must be familiar to me
  • requirement #4: Able to run java software as freeRouter is written in Java
  • requirement #5: small operating system software footprint
  • requirement #6: Support for IPv4/IPv6

Additional nice to have features (but not used here as we are not using VM nor require high VPN traffic load)

  • Virtualisation support: check CPU support for VT-x (Intel) / AMD-V (AMD)
  • I/O MMU virtualisation (kernel bypass mechanism): check CPU support for VT-d (Intel) / AMD-Vi (AMD), needed by DPDK with the VFIO driver in order to ensure hardware NIC packet forwarding
  • Network virtualisation: check CPU support for VT-c (SR-IOV)
  • Hardware encryption: check CPU support for AES-NI (for tunnel mechanisms using AES such as OpenVPN; however, this is useless for other tunnel types such as WireGuard)

Discussion

 Design choice considerations

Though the traffic distribution is totally different from a school or SOHO site traffic patterns, we can consider this hardware platform as a viable choice.

Platform considerations:

  • each 1GE port is wired to an Intel I211-AT chipset. DPDK will take advantage of the packet processing power burnt into this silicon in order to relieve the CPU load.
  • WiFi is not mandatory and the included hardware is not bleeding edge, but considering the uplink bandwidth, 802.11ax is not necessary. At least for northbound traffic we are safe for the moment. At some point, if east-west traffic such as NAS to WiFi client requires a 10G traffic rate, it will be time to buy a new appliance. If a WiFi improvement is needed, an 802.11ac card can be purchased on a 15€ budget. For WiFi client to WiFi client traffic at 10GE rates, you can still purchase an 802.11ax mini PCIe card for around the same budget.

 freeRouter is supported on:

  • linux based system
  • android → yes, you can install freeRouter on your mobile phone and wander around your house, IPv4/IPv6 WIFI roaming will occur automagically!
  • freeRouter has a DPDK dataplane as well as a libpcap dataplane for older hardware
  • in this example I selected an appliance for convenience, but nothing prevents you from recycling an old laptop/desktop PC with multiple DPDK-capable NICs. We can run a small PE (provider edge) router with multiple 1GE/10GE NICs. Note that the appliance can act as a 6×1GE provider edge router. This is the edge of the MPLS seamless architecture.

Operating system future considerations:

  • In an SP environment, the ideal situation is to have a custom operating system (we are studying the Yocto project in order to create this custom OS)
  • This custom OS will encompass the strict minimum of software, thus reducing the software footprint to its minimum
  • A very promising and unique feature is also provided by NixOS / the Nix package manager: it enables atomic commit/rollback at the package management level

The combination of Yocto + Nix can help develop your own specific DIY hardware (or for your company/organisation/institution) based on the popular concept that French ISPs love: "INTERNET BOX"

Conclusion

In this 1st article you:

  • had a brief description of a hardware platform suitable for SOHO
  • had a description of the SOHO use case in 2020
  • got a rationale for why this platform was chosen
  • had a brief description of the selected operating system
  • got a rationale for why this OS was chosen

RARE validated design: [ SOHO #001 ] - key take-away

  • RARE/FreeRouter is a strong candidate for SOHO with multiple dataplane support solution.

If you are a company, you can run RARE/freeRouter with a versatile P4 switch such as the STORDIS BF2556X-1T or WEDGE; as a SOHO with a small budget, you can run it with a DPDK dataplane; and for older hardware you still have the possibility to run it with a pure software dataplane.

  • RARE/freeRouter is the first element at the very edge of the MPLS seamless architecture

End to end MPLS is now possible for the Service provider at an affordable price

  • RARE/freeRouter design can coexist with Virtualisation technology

CPU extensions such as VT-x/AMD-V, VT-d/AMD-Vi and VT-c can provide coexistence between RARE/freeRouter and a small number of storage and compute nodes (such as micro-K8/docker).

In the next article we will start our journey in creating a carrier grade CPE using the platform above.

After following the P4Lang P4 for dummies [ #002 ] article, you should now have a working P4 development environment.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

Let's start writing, compiling and running our first P4 program.

Article objective

This 3rd article proposes to write your first P4 program based on the my_program.p4 specification from P4Lang P4 for dummies [ #001 ].

Diagram: my_program.p4

[ #003 ] - Cookbook

 my_program.p4

P4 program specification

my_program.p4 packet processing logic: "all packets arriving at port 4 are switched/forwarded to port 8"

  • In this example, the switch has 8 ports
  • An ingress packet arrives at port 4
  • the ingress port is then checked
  • If it is port 4, then the packet is switched to port 8
  • my_program.p4 does not implement a default condition, so all the packets not arriving on port 4 are then dropped
  • the ingress packets arrive with a header whose characteristics were set by the previous node
  • if needed, my_program.p4 is able to modify the egress packet header for further processing by the next network node (e.g. in-band network telemetry)

Let's first create the P4 program environment:

my_program.p4
mkdir -p ~/my_program/bin ~/my_program/p4src ~/my_program/p4rt_python ~/my_program/build  
Where
tree -d my_program/
my_program/         <------- top folder
├── bin             <------- executable binary folder
├── build           <------- contains p4 compilation artefacts
├── p4rt_python     <------- python/scapy folder containing test scripts
└── p4src           <------- contains p4 code
~/my_program/p4src/my_program.p4
/*
 * P4 language version: P4_16 
 */

/*
 * include P4 core library 
 */
#include <core.p4>

/* 
 * include P4 v1model library implemented by simple_switch 
 */
#include <v1model.p4>

#define PORT_4 4 
#define PORT_8 8 


/*
 * egress_spec port encoded using 9 bits
 */ 
typedef bit<9>  nexthop_id_t;

/*
 * metadata type  
 */
struct metadata_t {
   nexthop_id_t nexthop_id;
}

/*
 * Our P4 program header structure 
 */
struct headers {
}

/*
 * V1Model PARSER
 */
parser prs_main(packet_in packet,
                out headers hdr,
                inout metadata_t md,
                inout standard_metadata_t std_md) {

   state start {
      transition select(std_md.ingress_port) {
         PORT_4: prs_port_4;
         default: accept;
      }
   }

   state prs_port_4 {
      md.nexthop_id = PORT_8;
      transition accept;     
   }
}

/*
 * V1Model CHECKSUM VERIFICATION 
 */
control ctl_verify_checksum(inout headers hdr, inout metadata_t metadata) {
    apply {
  }
}


/*
 * V1Model INGRESS
 */
control ctl_ingress(inout headers hdr,
                  inout metadata_t md,
                  inout standard_metadata_t std_md) {

   apply {
      if (std_md.ingress_port == PORT_4) {
         std_md.egress_spec = md.nexthop_id;
      } 
   }
}


/*
 * V1Model EGRESS
 */

control ctl_egress(inout headers hdr,
                 inout metadata_t md,
                 inout standard_metadata_t std_md) {
   apply {
   }
}

/*
 * V1Model CHECKSUM COMPUTATION
 */
control ctl_compute_checksum(inout headers hdr, inout metadata_t md) {
   apply {
   }
}

/*
 * V1Model DEPARSER
 */
control ctl_deprs(packet_out packet, in headers hdr) {
    apply {
        /*
         * emit hdr
         */
        packet.emit(hdr);
    }
}


/*
 * V1Model P4 Switch defined in v1model.p4
 */
V1Switch(
prs_main(),
ctl_verify_checksum(),
ctl_ingress(),
ctl_egress(),
ctl_compute_checksum(),
ctl_deprs()
) main;
Compilation of my_program.p4 using P4lang p4c
p4c --std p4-16 --target bmv2 --arch v1model -I ./include -o ./build --p4runtime-files ./build/my_program.json ./p4src/my_program.p4

Verification

 my_program.p4 verification
Compilation of my_program.p4 artefact in ./build
floui@ubi16:~/my_program$ ls -l build/
total 44
-rw-rw-r-- 1 floui floui  7532 Jul 24 14:23 my_program.json  <------ output used when launching bmv2
-rw-rw-r-- 1 floui floui 35462 Jul 24 14:23 my_program.p4ip  <------ other usage (not taken into account by the example)

Create veth pair before ...

Before launching our BMv2 virtual switch we need to create the veth pairs that will be bound to the P4 switch.

For that, we will reuse bash scripts from Andy Fingerhut's public GitHub repository:

veth pairs setup
cd ~/my_program/bin
wget https://raw.githubusercontent.com/jafingerhut/p4-guide/master/bin/veth_setup.sh
wget https://raw.githubusercontent.com/jafingerhut/p4-guide/master/bin/veth_teardown.sh
chmod u+x ./veth_setup.sh
chmod u+x ./veth_teardown.sh
sudo ./veth_setup.sh

ip link | grep veth
4: veth1@veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
6: veth3@veth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
7: veth2@veth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
8: veth5@veth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
9: veth4@veth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
10: veth7@veth6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
11: veth6@veth7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
12: veth9@veth8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
13: veth8@veth9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
14: veth11@veth10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
15: veth10@veth11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
16: veth13@veth12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
17: veth12@veth13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
18: veth15@veth14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
19: veth14@veth15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
20: veth17@veth16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
21: veth16@veth17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000

We can now launch BMv2 simple_switch and bind 8 of the veth pairs we just configured

start bmv2 simple_switch (load my_program.json)
sudo simple_switch --log-console -i 1@veth2 -i 2@veth4 -i 3@veth6 -i 4@veth8 -i 5@veth10 -i 6@veth12 -i 7@veth14 -i 8@veth16 ./build/my_program.json
Calling target program-options parser
[14:28:41.364] [bmv2] [D] [thread 15917] Set default default entry for table 'tbl_my_program76': my_program76 - 
Adding interface veth2 as port 1
[14:28:41.364] [bmv2] [D] [thread 15917] Adding interface veth2 as port 1
Adding interface veth4 as port 2
[14:28:41.415] [bmv2] [D] [thread 15917] Adding interface veth4 as port 2
Adding interface veth6 as port 3
[14:28:41.455] [bmv2] [D] [thread 15917] Adding interface veth6 as port 3
Adding interface veth8 as port 4
[14:28:41.503] [bmv2] [D] [thread 15917] Adding interface veth8 as port 4
Adding interface veth10 as port 5
[14:28:41.547] [bmv2] [D] [thread 15917] Adding interface veth10 as port 5
Adding interface veth12 as port 6
[14:28:41.587] [bmv2] [D] [thread 15917] Adding interface veth12 as port 6
Adding interface veth14 as port 7
[14:28:41.635] [bmv2] [D] [thread 15917] Adding interface veth14 as port 7
Adding interface veth16 as port 8
[14:28:41.683] [bmv2] [D] [thread 15917] Adding interface veth16 as port 8
[14:28:41.727] [bmv2] [I] [thread 15917] Starting Thrift server on port 9090
[14:28:41.728] [bmv2] [I] [thread 15917] Thrift server was started
...
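
The -i options above follow a simple pattern: switch port N is bound to veth(2N). If you need a different number of ports, the argument list can be generated instead of typed. This is only a sketch; port_args is a hypothetical helper and not part of BMv2.

```shell
#!/bin/sh
# Hypothetical helper: build the "-i <port>@<veth>" arguments for ports 1..N,
# following the mapping used above (port N <-> veth(2N)).
port_args() {
  n="$1"
  args=""
  port=1
  while [ "$port" -le "$n" ]; do
    args="$args -i ${port}@veth$((port * 2))"
    port=$((port + 1))
  done
  echo "${args# }"   # strip the leading space
}

# Print the argument list for the 8-port switch used in this article.
port_args 8
```

The printed list can then be passed to simple_switch, for example: sudo simple_switch --log-console $(port_args 8) ./build/my_program.json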
tcpdump veth8 (port 4)
sudo tcpdump -i veth8
...
tcpdump veth16 (port 8)
sudo tcpdump -i veth16
...

Now you need to find a way to:

  • send a packet to simple_switch@PORT_4 (veth8)
  • send another packet to simple_switch@PORT_1 (veth2)

We will use scapy for that:

scapy installation as root
pip3 install --pre scapy[complete]

Run scapy with sufficient privileges to send packets on specific interface

sudo scapy3
/usr/lib/python3/dist-packages/IPython/utils/module_paths.py:29: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
                                      
                     aSPY//YASa       
             apyyyyCY//////////YCa       |
            sY//////YSpcs  scpCY//Pp     | Welcome to Scapy
 ayp ayyyyyyySCP//Pp           syY//C    | Version 2.4.3~bionic
 AYAsAYYYYYYYY///Ps              cY//S   |
         pCCCCY//p          cSSps y//Y   | https://github.com/secdev/scapy
         SPPPP///a          pP///AC//Y   |
              A//A            cyP////C   | Have fun!
              p///Ac            sC///a   |
              P////YCpc           A//A   | Craft packets like I craft my beer.
       scccccp///pSP///p          p//Y   |               -- Jean De Clerck
      sY/////////y  caa           S//P   |
       cayCyayP//Ya              pY/Ya
        sY/PsY////YCc          aC//Yp 
         sc  sccaCY//PCypaapyCP//YSs  
                  spCPY//////YPSps    
                       ccaacs         
                                       using IPython 5.5.0
>>> 
From scapy prompt, send a packet to PORT_4 (veth8)
>>> sendp(IP(dst="1.2.3.4")/ICMP(),iface="veth8")
.
Sent 1 packets.
>>> 
Check tcpdump on veth8 (PORT_4)
floui@ubi16:~$ sudo tcpdump -i veth8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth8, link-type EN10MB (Ethernet), capture size 262144 bytes
14:58:23.404299 00:00:40:01:9d:d2 (oui Unknown) > 45:00:00:1c:00:01 (oui Unknown), ethertype Unknown (0xc1e0), length 28: 
        0x0000:  1728 0102 0304 0800 f7ff 0000 0000       .(............ 
Check tcpdump on veth16 (PORT_8)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth16, link-type EN10MB (Ethernet), capture size 262144 bytes
14:58:23.406042 00:00:40:01:9d:d2 (oui Unknown) > 45:00:00:1c:00:01 (oui Unknown), ethertype Unknown (0xc1e0), length 28: 
        0x0000:  1728 0102 0304 0800 f7ff 0000 0000       .(............ 

Conclusion

Congratulations!

You have successfully written, compiled and loaded your program onto the P4Lang P4 virtual switch! In addition, you checked that the logic of your program is implemented correctly by sending a packet to PORT_4 using the scapy Python 3 tool. You then checked with tcpdump that your packet ingressed the P4 switch via PORT_4 and egressed via PORT_8, as expected.

What happens to packets arriving on a port other than PORT_4?

Let's find out: send an ingress packet to PORT_1 (veth2) of the switch and see what happens.

From scapy prompt, send a packet to PORT_1 (veth2)
>>> sendp(IP(dst="1.2.3.4")/ICMP(),iface="veth2")
.
Sent 1 packets.
>>> 

In that case we don't know what the egress port is, so let's look at the simple_switch console.

simple_switch console
floui@ubi16:~/my_program$ sudo simple_switch --log-console -i 1@veth2 -i 2@veth4 -i 3@veth6 -i 4@veth8 -i 5@veth10 -i 6@veth12 -i 7@veth14 -i 8@veth16 ./build/my_program.json
Calling target program-options parser
[15:10:55.525] [bmv2] [D] [thread 16129] Set default default entry for table 'tbl_my_program76': my_program76 - 
Adding interface veth2 as port 1
[15:10:55.525] [bmv2] [D] [thread 16129] Adding interface veth2 as port 1
Adding interface veth4 as port 2
[15:10:55.555] [bmv2] [D] [thread 16129] Adding interface veth4 as port 2
Adding interface veth6 as port 3
[15:10:55.603] [bmv2] [D] [thread 16129] Adding interface veth6 as port 3
Adding interface veth8 as port 4
[15:10:55.651] [bmv2] [D] [thread 16129] Adding interface veth8 as port 4
Adding interface veth10 as port 5
[15:10:55.691] [bmv2] [D] [thread 16129] Adding interface veth10 as port 5
Adding interface veth12 as port 6
[15:10:55.739] [bmv2] [D] [thread 16129] Adding interface veth12 as port 6
Adding interface veth14 as port 7
[15:10:55.791] [bmv2] [D] [thread 16129] Adding interface veth14 as port 7
Adding interface veth16 as port 8
[15:10:55.839] [bmv2] [D] [thread 16129] Adding interface veth16 as port 8
[15:10:55.879] [bmv2] [I] [thread 16129] Starting Thrift server on port 9090
[15:10:55.880] [bmv2] [I] [thread 16129] Thrift server was started
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Processing packet received on port 1
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser 'parser': start
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser 'parser' entering state 'start'
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser state 'start': key is 0001
[15:11:00.449] [bmv2] [T] [thread 16135] [0.0] [cxt 0] Bytes parsed: 0
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser 'parser': end
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Pipeline 'ingress': start
[15:11:00.450] [bmv2] [T] [thread 16135] [0.0] [cxt 0] ./p4src/my_program.p4(75) Condition "std_md.ingress_port == 4" (node_2) is false
[15:11:00.450] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Pipeline 'ingress': end
[15:11:00.450] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Egress port is 0
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Pipeline 'egress': start
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Pipeline 'egress': end
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Deparser 'deparser': start
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Deparser 'deparser': end
[15:11:00.450] [bmv2] [D] [thread 16140] [0.0] [cxt 0] Transmitting packet of size 28 out of port 0

So in that case the log shows "Egress port is 0". No interface is bound to port 0 in this setup, so the packet goes nowhere: port 0 behaves here like a null0 interface and the packet is dropped.

Let's now resend a packet to PORT_4 and observe the simple_switch console log.

simple_switch console
sudo simple_switch --log-console -i 1@veth2 -i 2@veth4 -i 3@veth6 -i 4@veth8 -i 5@veth10 -i 6@veth12 -i 7@veth14 -i 8@veth16 ./build/my_program.json
Calling target program-options parser
[15:14:51.047] [bmv2] [D] [thread 16151] Set default default entry for table 'tbl_my_program76': my_program76 - 
Adding interface veth2 as port 1
[15:14:51.048] [bmv2] [D] [thread 16151] Adding interface veth2 as port 1
Adding interface veth4 as port 2
[15:14:51.099] [bmv2] [D] [thread 16151] Adding interface veth4 as port 2
Adding interface veth6 as port 3
[15:14:51.139] [bmv2] [D] [thread 16151] Adding interface veth6 as port 3
Adding interface veth8 as port 4
[15:14:51.175] [bmv2] [D] [thread 16151] Adding interface veth8 as port 4
Adding interface veth10 as port 5
[15:14:51.207] [bmv2] [D] [thread 16151] Adding interface veth10 as port 5
Adding interface veth12 as port 6
[15:14:51.239] [bmv2] [D] [thread 16151] Adding interface veth12 as port 6
Adding interface veth14 as port 7
[15:14:51.271] [bmv2] [D] [thread 16151] Adding interface veth14 as port 7
Adding interface veth16 as port 8
[15:14:51.319] [bmv2] [D] [thread 16151] Adding interface veth16 as port 8
[15:14:51.347] [bmv2] [I] [thread 16151] Starting Thrift server on port 9090
[15:14:51.348] [bmv2] [I] [thread 16151] Thrift server was started
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Processing packet received on port 4
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser': start
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser' entering state 'start'
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser state 'start': key is 0004
[15:14:58.053] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Bytes parsed: 0
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser' entering state 'prs_port_4'
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser set: setting field 'scalars.userMetadata.nexthop_id' to 8
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser state 'prs_port_4' has no switch, going to default next state
[15:14:58.053] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Bytes parsed: 0
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser': end
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Pipeline 'ingress': start
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] ./p4src/my_program.p4(75) Condition "std_md.ingress_port == 4" (node_2) is true
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Applying table 'tbl_my_program76'
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Looking up key:

[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Table 'tbl_my_program76': miss
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Action entry is my_program76 - 
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Action my_program76
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] ./p4src/my_program.p4(76) Primitive std_md.egress_spec = md.nexthop_id
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Pipeline 'ingress': end
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Egress port is 8
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Pipeline 'egress': start
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Pipeline 'egress': end
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Deparser 'deparser': start
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Deparser 'deparser': end
[15:14:58.054] [bmv2] [D] [thread 16163] [0.0] [cxt 0] Transmitting packet of size 28 out of port 8

This clearly confirms what tcpdump showed: a packet ingressing on PORT_4 is switched to PORT_8.
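
Reading these console logs by eye gets tedious once several packets are involved. Here is a minimal sketch that extracts (ingress port, egress port) pairs from simple_switch log lines. It relies only on the two message formats visible in the logs above and assumes packets are logged one after the other, without interleaving. The name port_pairs is ours.

```python
import re

INGRESS_RE = re.compile(r"Processing packet received on port (\d+)")
EGRESS_RE = re.compile(r"Egress port is (\d+)")


def port_pairs(log_lines):
    """Return a list of (ingress_port, egress_port) tuples found in the log."""
    pairs = []
    ingress = None
    for line in log_lines:
        match = INGRESS_RE.search(line)
        if match:
            ingress = int(match.group(1))
            continue
        match = EGRESS_RE.search(line)
        if match and ingress is not None:
            pairs.append((ingress, int(match.group(1))))
            ingress = None
    return pairs


# Lines copied from the two console logs above.
SAMPLE_LOG = [
    "[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Processing packet received on port 1",
    "[15:11:00.450] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Egress port is 0",
    "[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Processing packet received on port 4",
    "[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Egress port is 8",
]

print(port_pairs(SAMPLE_LOG))
```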

Conclusion

In this article you:

  • wrote your first P4 program
  • used p4c in order to compile it
  • learned how to instantiate virtual ethernet pairs in order to bind them with simple_switch
  • launched simple_switch and loaded your program on it
  • set up a test environment using scapy
  • and verified your program using a combination of scapy and tcpdump
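
The simple_switch command line used in this article follows a regular pattern: port N is bound to interface veth(2N). As a small sketch, the port bindings can be generated instead of typed by hand. The helper names below are ours, and the veth(2N) rule is only the convention observed in this article, not something simple_switch imposes.

```python
def veth_for_port(port):
    """Interface name bound to a switch port, following this article's convention."""
    return f"veth{2 * port}"


def simple_switch_command(json_path, num_ports=8):
    """Build the simple_switch argument list for ports 1..num_ports."""
    args = ["simple_switch", "--log-console"]
    for port in range(1, num_ports + 1):
        args += ["-i", f"{port}@{veth_for_port(port)}"]
    args.append(json_path)
    return args


print(" ".join(simple_switch_command("./build/my_program.json")))
```

The resulting list can be passed to subprocess.run (prefixed with sudo, as in the article) if you want to launch the switch from a script.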

P4Lang P4 for dummies [ #002 ] - key take-away

  • my_program.p4 follows the V1Model architecture, which defines:
    • a parsing stage
    • a checksum verification stage
    • an ingress packet processing control stage
    • an egress packet processing control stage
    • a checksum computation stage
    • a deparser stage
V1model PISA model
V1Switch( prs_main(), ctl_verify_checksum(), ctl_ingress(), ctl_egress(), ctl_compute_checksum(), ctl_deprs() ) main; 

It is described by the diagram below:

In a subsequent article we will dissect my_program.p4. As you could observe, P4 programming is quite intuitive: it is all about switching a packet based on the values of its ingress header fields and metadata (such as the ingress port).






In P4Lang P4 for dummies [ #001 ], you learned that a behavioural language gives you access to dataplane programming.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge


Overview

To start P4 programming, we first set up a P4 development environment using the Open Source P4Lang P4 community software.

Article objective

This article explains how to install:

  • P4Lang PI
  • P4Lang BMv2
  • P4Lang p4c

Operating system supported

  • Debian 10 (stable aka buster)
  • Ubuntu 18.04 (Bionic beaver)
  • Ubuntu 20.04 (Focal fossa)

Note

You can of course use the distribution of your choice, as long as the Operating System you are using provides all the third party dependencies required by the P4Lang software, mainly:

  • protobuf
  • grpc
  • thrift
  • nanomsg
  • nnpy

You can find the full list here in Launchpad.

Diagram: 

[ #002 ] - Cookbook

 Install your favorite operating system

In our example we will use the same Debian stable image (buster), installed as a VirtualBox VM, and we add a bridged network interface attached to our laptop's RJ45 connection.

 Install P4lang environment on Debian 10
add p4lang repository in /etc/apt/sources.list.d/p4.list
deb https://download.opensuse.org/repositories/home:/frederic-loui:/p4lang:/p4c:/master/Debian_10/ ./
add debian 10 repository key from download.opensuse.org
wget https://download.opensuse.org/repositories/home:/frederic-loui:/p4lang:/p4c:/master/Debian_10/Release.key
sudo apt-key add ./Release.key
install p4lang packages (just install p4c and it will install p4lang-pi and bmv2)
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install p4c

Note

Installing p4lang software with apt-get will download and install:

  • p4c
  • bmv2
  • p4lang-pi
 Or install P4lang environment on Ubuntu 18.04
add p4lang bionic 3rd party repository
sudo add-apt-repository ppa:frederic-loui/p4lang-3rd-party
sudo apt-get update
add p4lang bionic nightly build repository
sudo add-apt-repository ppa:frederic-loui/p4lang-master-bionic-nightly
sudo apt-get update
install p4lang packages (just install p4c and it will install p4lang-pi and bmv2)
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install p4c bmv2 p4lang-pi

Note

Installing p4lang software with apt-get will download and install:

  • p4lang-3rd-party (bionic)

alongside:

  • p4c
  • bmv2
  • p4lang-pi
 Or install P4lang environment on Ubuntu 20.04
add p4lang focal 3rd party repository
sudo add-apt-repository ppa:frederic-loui/p4lang-3rd-party-focal
sudo apt-get update
add p4lang focal nightly build repository
sudo add-apt-repository ppa:frederic-loui/p4lang-master-focal-nightly
sudo apt-get update
install p4lang packages (just install p4c and it will install p4lang-pi and bmv2)
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install p4c bmv2 p4lang-pi

Note

Installing p4lang software with apt-get will download and install:

  • p4lang-3rd-party (focal)

alongside:

  • p4c
  • bmv2
  • p4lang-pi

Verification

 RARE P4 code compilation
check p4lang packages installation on Debian
dpkg -l | grep p4lang
ii  bmv2                                   20200615~d447b6a~release~nightly-0+57.1 amd64        p4lang behavioral-model
ii  p4c                                    20200628~7c03f854~release~nightly-0     amd64        p4c p4lang project compiler
ii  p4lang-pi                              20200601~822a0d1~release~nightly-0+39.1 amd64        Implementation framework of a P4Runtime server
check p4lang packages installation on Ubuntu 18.04 (same for 20.04)
dpkg -l | grep p4lang
ii  bmv2                                     1.13.0-202006160902-d447b6a~ubuntu18.04.1    amd64        p4lang behavioral-model
ii  p4c                                      1.1.0-rc1-202006191103-3917a1c~ubuntu18.04.1 amd64        p4c p4lang project compiler
ii  p4lang-3rd-party                         1.1~bionic-1                                 all          This package installs 3rd party software needed by p4lang software
ii  p4lang-pi                                0.8-202006020517-822a0d1~ubuntu18.04.1       amd64        Implementation framework of a P4Runtime server
Clone RARE code from repository
cd ~/
git clone https://github.com/frederic-loui/RARE.git
compile RARE router.p4
cd ~/RARE/02-PE-labs/p4src
make build
mkdir -p ../build ../run/log
p4c --std p4-16 --target bmv2 --arch v1model \
        -I ./ -o ../build --p4runtime-files ../build/router.txt router.p4 
check RARE router.p4 compilation result:
ls -l ../build/
total 572
-rw-r--r-- 1 root root 448313 Jul 22 10:15 router.json
-rw-r--r-- 1 root root 100912 Jul 22 10:15 router.p4i
-rw-r--r-- 1 root root  32764 Jul 22 10:15 router.txt
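
If you script this verification, a minimal sketch like the one below can check that the three artifacts listed above are present in the build directory. The function name missing_artifacts is ours, and the build directory path is left as a parameter because it depends on where you run the check from.

```python
from pathlib import Path

# File names taken from the listing above.
EXPECTED_ARTIFACTS = ("router.json", "router.p4i", "router.txt")


def missing_artifacts(build_dir, expected=EXPECTED_ARTIFACTS):
    """Return the expected artifact names that are not present in build_dir."""
    build = Path(build_dir)
    return [name for name in expected if not (build / name).is_file()]
```

An empty list means the compilation produced everything this article expects.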

Conclusion

In this article you learned how to set up a P4 development environment on:

  • Debian 10
  • Ubuntu 18.04
  • Ubuntu 20.04

And tested the installation by compiling RARE P4 code.


P4Lang P4 for dummies [ #002 ] - key take-away

  • P4Lang P4 development environment creation is easy
  • it uses P4Lang packages on Debian and Ubuntu
  • These packages are maintained by RARE project and are nightly built based on P4Lang official GitHub

In the next article we will:

  • compile my_program.p4
  • launch P4Lang virtual switch called simple_switch and load my_program.p4 on it
  • perform basic verification





While the "RARE/FreeRouter-101" series teaches you how to start using RARE/freeRouter, in the "P4Lang P4 for dummies" article series you'll learn how to start programming with the P4Lang P4 language. As a reminder, a P4 dataplane is one type of dataplane that can be coupled to RARE/freeRouter, as described in articles 101-#003 and 101-#004. The final objective of this article series is to help you compile the VERY FIRST RARE/freeRouter test case, which covers:

  • basic control plane communication between freeRouter and BMv2/simple_switch_grpc
  • and the simple packet_in/packet_out header used in this interface communication.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge


Overview

P4 is a language for programming the data plane of network devices. From p4.org web site:

«P4 is a domain-specific programming language for specifying the behaviour of the dataplanes of network-forwarding elements. »

Article objective

This first article presents:

  • A brief introduction to the P4 language
  • A basic P4 development workflow
  • Some basic specificities of the P4 language

Note

This article is primarily a pure introduction to P4Lang P4. It is in no way an extensive description of the programming language, nor a P4 compilation guide.

Diagram: P4 development workflow

[ #001 ] - Cookbook: P4 development workflow

 P4 in a nutshell

Based on what we mentioned, what does the "P4 Domain specific language" give you ? Concretely:

  1. You can write a program as you would in C or C++, but you have to follow the P4 language specification. (The current one is P4_16; there is also a previous P4_14 specification.)
  2. That program is compiled with a P4 compiler against the P4_16 or P4_14 specification (much as a C++ compiler targets C++14 or C++11)
  3. The resulting compilation artifacts can then be loaded into a device implementing a P4 model, commonly called a P4 target, that is able to interpret/run P4 binaries. Here we will be using BMv2, a software P4 target intended for learning.


Take away

The specificities are:

  • This P4 program is YOUR program
  • This P4 program allows you to define YOUR OWN packet processing logic

In short, you can now program:

«how a packet that comes into your system, is processed and goes out your system»

Diagram packet processing description

The diagram above depicts 2 perspectives: 

  • P4 program development workflow
    • It starts by writing your P4 program using your favorite editor
    • compile your program with the P4 compiler of your target
    • load your program into the P4 target
  • my_program.p4 packet processing logic: "all packets arriving at port 4 are switched/forwarded to port 8"
    • In this example, the switch has 8 ports
    • An ingress packet arrives at port 4
    • the ingress port is then checked
    • If it is port 4, then the packet is switched to port 8
    • my_program.p4 does not implement a default condition, so all the packets not arriving on port 4 are dropped
    • the ingress packets arrive with a header whose characteristics were set by the previous node
    • if needed, my_program.p4 is able to modify the egress packet header for further processing by the next network node (in-band network telemetry is an example)
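
To make this logic concrete, here is a plain Python model of the behaviour described in the list above. It is not P4 code and not the content of my_program.p4, only an illustration. The function name and the use of None to mean "dropped" are our own choices.

```python
NUM_PORTS = 8  # the example switch has 8 ports


def my_program_forward(ingress_port):
    """Return the egress port for a packet, or None if the packet is dropped."""
    if not 1 <= ingress_port <= NUM_PORTS:
        raise ValueError(f"port {ingress_port} does not exist on this switch")
    if ingress_port == 4:
        return 8   # all packets arriving at port 4 are switched to port 8
    return None    # no default condition: every other packet is dropped


for port in range(1, NUM_PORTS + 1):
    print(port, "->", my_program_forward(port))
```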

Router for Academia Research & Education (RARE) & P4

 P4 rationale

The RARE project objective is to provide a networking solution for Research & Education institution use cases. We have witnessed the birth of several software control planes such as GNU Zebra, Bird, exaBGP, etc. What they have in common is that they cannot (yet) be coupled easily with a hardware dataplane. Simply put, these software control planes cannot be used without significant specific development on equipment able to forward traffic over n×100GE links at a high Mpps rate.

There have been attempts with DPDK and other kernel bypass mechanisms that enabled higher throughput, but this is not comparable to the packet processing power of commercial/vendor equipment.

P4:

  • opens you the door to software AND hardware dataplane programmability
  • gives you the possibility to implement YOUR own packet processing algorithm 

RARE control plane: freeRouter

In the RARE project, we are using a software control plane called freeRouter:

  • It is an open source control plane
  • It has been deployed since 2014 and has accumulated many hours of production use in various environments
  • Interworking has been extensively and continuously tested with major equipment vendors
  • Last but not least, freeRouter's maintainer is in the RARE team, which allowed rapid application development and prototyping in order to build the communication between the control plane and the P4 dataplane.
 P4 targets

P4 use cases are inherently tied to the P4 target you plan to use in order to run your P4 program:

A comprehensive list can be found here

  • P4Lang BMv2 V1Model target:

It is the P4Lang virtual model that emulates a PISA architecture. You can run it on a VM and start writing your first P4 program and load it on simple_switch and/or simple_switch_grpc (if you plan to use P4Runtime). While this is a great solution in order to learn P4 and sketch your packet processing algorithm, it is not recommended for production use.

  • INTEL/BAREFOOT TOFINO/TOFINO 2

This target also implements a PISA architecture and provides a virtual model so that you can validate your algorithm. Once validated on the virtual model, you can load your program into a hardware switch running an NPU called TOFINO or its bigger brother TOFINO2. While TOFINO is able to handle 6.4 Tbps of traffic, TOFINO2 doubles this (12.8 Tbps). In addition, TOFINO2 offers larger buffers, memory and TCAM than its little brother.

 P4 RARE use cases

These are the use cases enabled by the combination of P4 and RARE software:

  • Service Provider core router:

You can build a robust packet switching fabric at the scale of a Telecom Service Provider, able to switch packets at n×100GE.

  • Service Provider edge router: 

You can build an edge router and interconnect it with the core routers above. These routers will terminate your backbone network services such as L2/L3 plain IP or VPN services (IPv4 - IPv6).

  • Datacenter ToR switch 

With the WEDGE100BF32X you can have 2x100GE uplinks toward 2 distinct "leaf switches", which leaves you 30x100GE server connections.

  • Datacenter Spine/Leaf switch 

The WEDGE100BF32X is also a good candidate router in a DC as a core/spine switch. You can create a fabric able to switch 6.4 Tbps of traffic.

  • Internet Exchange

In this case, the WEDGE100BF32X is a 100GE peer aggregator or is simply integrated into the IXP distributed core fabric.

  • MAN/CPE router

The STORDIS BF2556X-1T, with its flexible connectivity options, is a good candidate for regional network implementation. It has 8x100GE ports: 2 of them can be used as uplinks toward the main transit provider and 2 others can provide EAST/WEST connections via 2 different fiber routes, which leaves 4x100GE ports in case you need to increase capacity. The STORDIS also has 16x[1/10/25] GbE ports and 32x[10/25] GbE ports, which makes it possible to interconnect users at various access port bandwidths.

Conclusion

In this article you:

  • had a brief introduction to the P4Lang P4 language
  • were given a 10,000-foot view of the P4 development workflow
  • were shown a list of P4 targets and the use cases enabled by these targets

P4Lang P4 for dummies [ #001 ] - key take-away

THE exciting INNOVATION provided by P4 is a community language that opens the door to a system's dataplane for you. Until now, dataplane programming was reserved to commercial vendors. Some of these dataplanes, like the well-known CEF (Cisco Express Forwarding), are specific to Cisco equipment. Juniper has its own dataplane (we are not sure of its name), implemented by the Forwarding Plane component (the vMX architecture is an example).

P4 language inherent characteristics:

  • Behavioral programming language
  • Language with constraints 
  • Limited number of variable types
  • With fixed size
  • P4 is not a general purpose language: unlike C, C++ or Java, you cannot use it to write arbitrary software

It is therefore a simple language that is easier for network managers to tame than it is for pure software developers. Indeed, writing a P4 program is all about defining the behavior of a network packet processing algorithm based on intrinsic variables encoded in the packet header.




The "RARE/FreeRouter-101" series of articles is meant to help you quickly kickstart your very first RARE/freeRouter deployment and understand, via a series of tutorials, how it can be powered by various dataplanes. The 101 series also explained how RARE/freeRouter can be configured in order to be integrated into the external network environment. However, even if 101- [ #006 ] is a robust and interesting solution for SOHO, you'll see many more interesting use cases in the "RARE validated design" series of articles. These articles will draw your attention to mind-blowing use cases that are usually implemented only by commercial solutions in service provider environments.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

BGP is THE protocol of the Internet: it is used to exchange routing information between BGP systems across Internet domains. It comes in two flavours:

External BGP (eBGP): Network Layer Reachability Information (NLRI) is exchanged between network domains called Autonomous Systems, which are usually administratively independent. This is BGP inter-domain routing. As an example, let us assume a BGP speaker from AS2200 (RENATER) advertising NLRI to AS20965 (GÉANT R&E). From that point, AS20965 knows how to reach any network advertised by AS2200 based on this NLRI.

Internal BGP (iBGP): NLRI is propagated between BGP speakers inside the same domain. This is BGP intra-domain routing. As an example, assume an AS2200 border router in Paris is connected to the GÉANT network and receives NLRI from AS20965. It will then propagate this information internally, advertising GÉANT NLRI via iBGP sessions to the other BGP speakers inside AS2200.

iBGP requires a full mesh of sessions between all BGP speakers inside a domain, because a route learned via iBGP is not re-advertised to other iBGP peers (a loop avoidance rule). A full mesh of n speakers requires n*(n-1)/2 sessions. BGP route reflection removes the full mesh requirement: each BGP edge router now needs only 1 BGP session toward the RR, which reduces the workload of network equipment.
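
To get a feel for the difference, here is a small sketch comparing the number of sessions in both designs. The function names are ours. For the route reflector case we assume every client peers with every RR and that the RRs also peer with each other, which is one common arrangement, not a description of the setup built in this article.

```python
def full_mesh_sessions(n_speakers):
    """Number of iBGP sessions in a full mesh of n speakers: n*(n-1)/2."""
    return n_speakers * (n_speakers - 1) // 2


def route_reflector_sessions(n_clients, n_rr=1):
    """Sessions when each client peers with each RR, plus a full mesh among RRs."""
    client_sessions = n_clients * n_rr
    rr_mesh_sessions = n_rr * (n_rr - 1) // 2
    return client_sessions + rr_mesh_sessions


for n in (10, 50, 100):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n, n_rr=2))
```

The full mesh grows quadratically with the number of routers, while the route reflector design grows linearly.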

Article objective

In this article we will describe how to build a carrier grade route reflector cluster composed of RR1 and RR2, in order to reach the 99.999% availability expected of a Telecom Internet Service Provider.

Let's consider the network architecture of the fictitious service provider below, in which route reflectors RR1 and RR2 are dual homed to core P routers.
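
To put the availability target in perspective, the sketch below converts an availability figure into downtime per year, and shows why a cluster of two route reflectors helps. The function names are ours, and the parallel formula assumes the two RRs fail independently, which is precisely why distinct data paths and power sources matter. The 99.9% figure in the example is an arbitrary illustration, not a measured value.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability):
    """Yearly downtime, in minutes, allowed by an availability ratio (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def parallel_availability(single, copies=2):
    """Availability of a cluster that works as long as one copy works.

    Assumes failures are independent.
    """
    return 1.0 - (1.0 - single) ** copies


print(round(downtime_minutes_per_year(0.99999), 2))
print(parallel_availability(0.999, copies=2))
```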

Diagram

[ #001 ] - Cookbook

 BGP Route Reflector implementation

BGP RR main requirements

SR655 1 x EPYC 7302P, 64GB RAM, 2G CONTROLLER CACHE FLASH, 4x10G ports + SFP+ and 4x1G ports, 3 SSD 480GB MAINSTREAM, XCLARITY ENTERPRISE.

SR655 AMD EPYC 7302P (16C 2.8GHz 128MB Cache/155W) 32GB (2x32GB, 2Rx4 3200MHz RDIMM), No Backplane, SATA, 1x750W, Tooless Rails
ThinkSystem 2x32GB TruDDR4 3200MHz (2Rx4 1.2V) RDIMM-A
ThinkSystem SR655 2.5 SATA/SAS 8-Bay Backplane Kit
ThinkSystem RAID 930-8i 2GB Flash PCIe 12Gb Adapter
ThinkSystem 2.5 5300 480GB Mainstream SATA 6Gb Hot Swap SSD
ThinkSystem SR655 x16/x8/x8 PCIe Riser1 FH Kit
ThinkSystem SR635/SR655 x8 PCIe Internal Riser Kit
ThinkSystem Broadcom 57454 10/25GbE SFP28 4-port OCP Ethernet Adapter
ThinkSystem Broadcom 5720 1GbE RJ45 2-Port PCIe Ethernet Adapter
SFP+ SR Transceiver
ThinkSystem 750W(230/115V) Platinum Hot-Swap Power Supply
2.8m, 10A/100-250V, C13 to IEC 320-C14 Rack Power Cable
ThinkSystem Toolless Slide Rail Kit with 2U CMA
ThinkSystem SR655 Fan Option Kit
ThinkSystem SR635/SR655 Supercap Installation Kit

BGP RR main requirements

RR is a specific component inside a service provider environment:

  • The BGP RR is not in the data path inside the backbone; this can be ensured by setting high IGP metrics inside the core backbone.
  • BGP traffic does not require tremendous throughput, so there is no need for a hardware NIC-assisted forwarding mechanism such as DPDK.
  • An NREN route reflector with 2xIPv4 and 2xIPv6 full views coming from 2 upstream providers requires a steady ~10 Mbps of traffic, so we can assume that a 10GE connection will be sufficient for the next decades, all address families included.
  • As of 2020/07/13, the Internet IPv4 routing table size is 839945 entries
  • As of 2020/07/13, the Internet IPv6 routing table size is 91062 entries

Both tables, together with the other BGP address families, need a constant ~4GB of memory:

# show watchdog memory

  • So in the config above, 64 Gbytes of RAM is sufficient to hold the full IPv4 and IPv6 routing tables in memory (along with the other BGP address family tables). It is also largely enough in case of network instability, when convergence computation involves more CPU and memory usage.

Disclaimer

  • We have no incentive in proposing a server of the above brand. It just happens that this server was already bought and its configuration matches the use case requirements perfectly, but again, this is pure coincidence
  • A 10GE port connection might be overkill, but in a Service Provider context this is the norm. It avoids having to implement 1GE connectivity on adjacent core routers
  • PCIe GEN4 is available and thus provides a tremendous amount of bandwidth for disk R/W operations. Though useful for the OS and applications, the BGP RR setup won't take direct advantage of PCIe GEN4.
  • Indeed, in this configuration, considering the amount of RAM we have, we will disable swap.
 Server network wiring


BGP RR distinct data path

  • Connect the server with 2 NICs using optical SFPs (Broadcom 57454 10/25GbE SFP28 4-port OCP Ethernet Adapter) to core backbone routers following distinct dark fiber paths.
  • The link between C1 - C2 provides an additional level of redundancy

BGP RR out of band management

  • Connect the server with 1 NIC using RJ45 (Broadcom 5720 1GbE RJ45 2-Port PCIe Ethernet Adapter) to the KVM or out-of-band management network

Do not forget ...

One point often overlooked is the environment. As said, the BGP RR is a central component in a service provider network. It must be deployed considering the following recommendations:

  • Deploy an RR in carrier hotel
  • With sufficient cooling
  • With sufficient power. Make also sure to have redundant power and use dual PSU connected to different energy source
  • Rack properly the server and make sure it is installed without blocking airflow as per server vendor advice
 Operating system requirement


Install OS supported in your company

  • Use only a stable, long-term supported operating system release such as Debian 10, Ubuntu 18.04 or Ubuntu 20.04
  • Apply your IT department's hardening and security patches, and enrol the server in your server maintenance process
  • In our case we will use Debian 10

BGP RR Life cycle management

It is important to note that the BGP RR is now subject to your company's server hardware maintenance, and that the software is not part of it.

  • Server hardware maintenance is now applied to a network equipment
  • The software is maintained by freeRouter project members
 freeRouter installation as in RARE/FreeRouter-101 [ #002 ] article
mkdir -p ~/freeRouter/bin ~/freeRouter/lib ~/freeRouter/etc ~/freeRouter/log
cd ~/freeRouter/lib
wget http://freerouter.nop.hu/rtr.jar
freeRouter directory structure
╭─[11:11:54]floui@debian ~ 
╰─➤ tree freeRouter
freeRouter
├── bin   # binary files      
├── etc   # configuration files      
├── lib   # library files      
└── log   # log files      

 Install freeRouter net-tools
get freeRouter net-tools tarball
wget freerouter.nop.hu/rtr.tar
extract freeRouter net-tools tarball
tar xvf rtr.tar -C ~/freeRouter/bin/

For those who would like to rebuild these binaries, the compilation shell script can be found in the cloned freeRouter git repository at: ~/freeRouter/src/native/c.sh

No throughput required

  • In this case simple pcapInt packet forwarding is recommended
  • In this setup all freeRouter functionalities are natively available
  • freeRouter heavily uses the concept of thread, hence 16 CPU cores will be fully exploited 

freeRouter upgrade

freeRouter upgrade involves 3 aspects:

  • It is pretty unusual for a router, but as freeRouter runs on Java, you have to follow Java software update recommendations
  • The freeRouter control plane software itself: it is essentially an rtr.jar file that has to be replaced by the latest version
  • The freeRouter dataplane software pcapInt: pcapInt upgrades are unusual but still have to be checked in the freeRouter release notes

We are (at last) now ready to configure freeRouter as a BGP route reflector !

 Create configuration files for router: bgp-rr-freerouter

freeRouter uses 2 configuration files in order to run. Let's write these configuration files for rr1 in ~/freeRouter/etc

freeRouter hardware file: bgp-rr-freerouter-hw.txt
int eth1 eth 0000.1111.0001 127.0.0.1 10011 127.0.0.1 10012
int eth2 eth 0000.2222.0002 127.0.0.1 10021 127.0.0.1 10022
tcp2vrf 2323 v1 23

BGP RR interfaces

  • eth1 is the first BGP RR port: port 10011 is the freeRouter side, while port 10012 is the pcapInt side, bound to the Linux interface of NIC #1
  • eth2 is the second BGP RR port: port 10021 is the freeRouter side, while port 10022 is the pcapInt side, bound to the Linux interface of NIC #2
  • For now freeRouter will be accessible only via a telnet session on port 2323
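
To make the layout of the "int" lines explicit, here is a small sketch that splits them into named fields. The field names (freerouter_port, pcapint_port and so on) are our own labels, based on the explanation above; they are not freeRouter terminology, and this is not how freeRouter itself parses the file.

```python
HW_FILE = """\
int eth1 eth 0000.1111.0001 127.0.0.1 10011 127.0.0.1 10012
int eth2 eth 0000.2222.0002 127.0.0.1 10021 127.0.0.1 10022
tcp2vrf 2323 v1 23
"""


def parse_int_lines(text):
    """Return one dict per 'int' line of a freeRouter hardware file."""
    interfaces = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 8 or fields[0] != "int":
            continue  # skip tcp2vrf and any other kind of line
        _, name, kind, mac, fr_ip, fr_port, pc_ip, pc_port = fields
        interfaces.append({
            "name": name,
            "type": kind,
            "mac": mac,
            "freerouter_ip": fr_ip,
            "freerouter_port": int(fr_port),
            "pcapint_ip": pc_ip,
            "pcapint_port": int(pc_port),
        })
    return interfaces


for interface in parse_int_lines(HW_FILE):
    print(interface["name"], interface["freerouter_port"], interface["pcapint_port"])
```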
freeRouter software configuration file: bgp-rr-freerouter-sw.txt
hostname rr1
buggy
!
!
access-list ACL-IPv4-RR-CLIENT
 sequence 10 permit all 1.1.1.1 255.255.255.255 all any all
 sequence 20 permit all 2.2.2.2 255.255.255.255 all any all
 sequence 30 permit all 3.3.3.3 255.255.255.255 all any all
 sequence 40 permit all 4.4.4.4 255.255.255.255 all any all
 sequence 50 permit all 5.5.5.5 255.255.255.255 all any all
 sequence 60 permit all 6.6.6.6 255.255.255.255 all any all
 sequence 70 permit all 7.7.7.7 255.255.255.255 all any all
 sequence 80 permit all 8.8.8.8 255.255.255.255 all any all
 exit
!
access-list ACL-IPv6-RR-CLIENT
 sequence 10 deny all fd00::a ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff all any all
 sequence 20 deny all fd00::b ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff all any all
 sequence 30 permit all fd00:: ffff:: all any all
 exit
!
prefix-list PFX-IPv4-NHT
 sequence 10 permit 1.1.1.1/32 ge 32 le 32
 sequence 20 permit 2.2.2.2/32 ge 32 le 32
 sequence 30 permit 3.3.3.3/32 ge 32 le 32
 sequence 40 permit 4.4.4.4/32 ge 32 le 32
 sequence 50 permit 5.5.5.5/32 ge 32 le 32
 sequence 60 permit 6.6.6.6/32 ge 32 le 32
 sequence 70 permit 7.7.7.7/32 ge 32 le 32
 sequence 80 permit 8.8.8.8/32 ge 32 le 32
 sequence 100 permit 10.10.10.10/32 ge 32 le 32
 sequence 110 permit 11.11.11.11/32 ge 32 le 32
 exit
!
prefix-list PFX-IPv6-NHT
 sequence 10 permit fd00::/32 ge 128 le 128
 exit
!
route-policy NHT
 sequence 10 if distance 110
 sequence 20   pass
 sequence 30 else
 sequence 40   drop
 sequence 50 enif
 exit
!
vrf definition v1
 rd 1:1
 exit
!
router ospf4 1
 vrf v1
 router-id 4.4.4.10
 traffeng-id 0.0.0.0
 area 0 enable
 redistribute connected
 exit
!
router ospf6 1
 vrf v1
 router-id 6.6.6.10
 traffeng-id ::
 area 0 enable
 redistribute connected
 exit
!
interface loopback1
 no description
 vrf forwarding v1
 ipv4 address 10.10.10.10 255.255.255.255
 ipv6 address fd00::a ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
 router ospf4 1 enable
 router ospf4 1 area 0
 router ospf4 1 passive
 router ospf6 1 enable
 router ospf6 1 area 0
 router ospf6 1 passive
 no shutdown
 no log-link-change
 exit
!
interface ethernet1
 no description
 vrf forwarding v1
 ipv4 address 10.1.10.10 255.255.255.0
 ipv6 address fd00:cafe::1:10:10 ffff:ffff:ffff:ffff:ffff:ffff:ffff::
 router ospf4 1 enable
 router ospf4 1 area 0
 router ospf4 1 cost 4444
 router ospf6 1 enable
 router ospf6 1 area 0
 router ospf6 1 cost 6666
 no shutdown
 no log-link-change
 exit
!
interface ethernet2
 no description
 vrf forwarding v1
 ipv4 address 10.4.10.10 255.255.255.0
 ipv6 address fd00:cafe::4:10:10 ffff:ffff:ffff:ffff:ffff:ffff:ffff::
 router ospf4 1 enable
 router ospf4 1 area 0
 router ospf4 1 cost 4444
 router ospf6 1 enable
 router ospf6 1 area 0
 router ospf6 1 cost 6666
 no shutdown
 no log-link-change
 exit
!
router bgp4 65535
 vrf v1
 local-as 65535
 router-id 10.10.10.10
 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 nexthop route-policy NHT
 nexthop prefix-list PFX-IPv4-NHT
 template bgp4 remote-as 65535
 template bgp4 description rr clients
 template bgp4 local-as 65535
 template bgp4 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 template bgp4 distance 255
 template bgp4 connection-mode active
 template bgp4 compression both
 template bgp4 update-source loopback1
 template bgp4 hostname
 template bgp4 aigp
 template bgp4 traffeng
 template bgp4 pmsitun
 template bgp4 tunenc
 template bgp4 attribset
 template bgp4 segrout
 template bgp4 bier
 template bgp4 route-reflector-client
 template bgp4 next-hop-unchanged
 template bgp4 send-community all
 listen ACL-IPv4-RR-CLIENT bgp4
 exit
!
router bgp6 65535
 vrf v1
 local-as 65535
 router-id 10.10.10.10
 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 nexthop route-policy NHT
 nexthop prefix-list PFX-IPv6-NHT
 template bgp6 remote-as 65535
 template bgp6 description rr clients
 template bgp6 local-as 65535
 template bgp6 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 template bgp6 distance 255
 template bgp6 connection-mode active
 template bgp6 compression both
 template bgp6 update-source loopback1
 template bgp6 hostname
 template bgp6 aigp
 template bgp6 traffeng
 template bgp6 pmsitun
 template bgp6 tunenc
 template bgp6 attribset
 template bgp6 segrout
 template bgp6 bier
 template bgp6 route-reflector-client
 template bgp6 next-hop-unchanged
 template bgp6 send-community all
 listen ACL-IPv6-RR-CLIENT bgp6
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet tel
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
!
end
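The `nexthop route-policy NHT` and `nexthop prefix-list PFX-IPv4-NHT` statements constrain which routes may be used to resolve BGP next hops. The Python sketch below illustrates one reading of that intent: a next hop is considered usable only when it resolves through one of the exact /32 loopbacks listed in the prefix list and the resolving route has distance 110 (OSPF in this lab). This illustrates the filtering logic under that assumption. It is not freeRouter's implementation, and the function names are hypothetical.

```python
import ipaddress

# Loopbacks permitted by PFX-IPv4-NHT (each entry is "ge 32 le 32", i.e. an exact /32).
NHT_PREFIXES = [
    ipaddress.ip_network(f"{n}.{n}.{n}.{n}/32")
    for n in (1, 2, 3, 4, 5, 6, 7, 8, 10, 11)
]


def prefix_list_permits(route):
    """Return True if the route is one of the exact /32 entries of the prefix list."""
    return ipaddress.ip_network(route) in NHT_PREFIXES


def route_policy_permits(distance):
    """Mirror route-policy NHT: pass when the distance is 110, drop otherwise."""
    return distance == 110


def nexthop_usable(route, distance):
    """Assumption: both filters must accept the resolving route."""
    return prefix_list_permits(route) and route_policy_permits(distance)


print(nexthop_usable("5.5.5.5/32", 110))   # OSPF-learned client loopback
print(nexthop_usable("10.1.5.0/24", 110))  # OSPF-learned link subnet, not a listed /32
```

With such a filter, a client loopback learned via OSPF is accepted, while an interconnection subnet or a route with another distance is not.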
 Launch router: rr1
freeRouter launch with supplied rr1-hw.txt and rr1-sw.txt with a console prompt
╭─[6:06:13]floui@debian ~/freeRouter  
╰─➤  java -jar lib/rtr.jar routersc etc/rr1-hw.txt etc/rr1-sw.txt                                                                                      
info cfg.cfgInit.doInit:cfgInit.java:556 booting
info cfg.cfgInit.doInit:cfgInit.java:680 initializing hardware
info cfg.cfgInit.doInit:cfgInit.java:687 applying defaults
info cfg.cfgInit.doInit:cfgInit.java:695 applying configuration
info cfg.cfgInit.doInit:cfgInit.java:721 done
welcome
line ready
rr1#                   
Launch pcapInt in order to bind the socket for interface enp0s9
╭─[6:06:13]floui@debian[1]  ~/freeRouter/bin  
╰─➤  sudo ./pcapInt.bin enp0s9 10012 127.0.0.1 10011 127.0.0.1                                                                                                       
binded to local port 127.0.0.1 10012.
will send to 127.0.0.1 10011.
pcap version: libpcap version 1.8.1
opening interface enp0s9 with pcap1.x api
serving others
> 
Launch pcapInt in order to bind the socket for interface enp0s10
╭─[6:06:13]floui@debian[1]  ~/freeRouter/bin  
╰─➤  sudo ./pcapInt.bin enp0s10 10022 127.0.0.1 10021 127.0.0.1                                                                                                      
binded to local port 127.0.0.1 10022.
will send to 127.0.0.1 10021.
pcap version: libpcap version 1.8.1
opening interface enp0s10 with pcap1.x api
serving others
> 
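The two pcapInt invocations above follow the same pattern: interface name, local port, local address, remote port, remote address. As a minimal sketch, the helper below rebuilds that argument list from the freeRouter-side port. It assumes the pcapInt side always uses the freeRouter port plus one, as in the 10011/10012 and 10021/10022 pairs of this article. The function name is hypothetical.

```python
def pcapint_argv(iface, router_port, addr="127.0.0.1"):
    """Build the pcapInt argument list for one Linux interface.

    Assumption: pcapInt listens on router_port + 1, as in the
    10011/10012 and 10021/10022 pairs used in this article.
    """
    pcap_port = router_port + 1
    # Order seen above: interface, local port, local address, remote port, remote address.
    return ["./pcapInt.bin", iface, str(pcap_port), addr, str(router_port), addr]


print(" ".join(pcapint_argv("enp0s9", 10011)))
print(" ".join(pcapint_argv("enp0s10", 10021)))
```

The resulting list can be passed to `subprocess.run` (prefixed with `sudo` if needed) when the launch has to be scripted.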

Verification

 Check telnet access for rr1@10010
rr1 telnet access via port 10010
╭─[1:09:28]floui@debian ~  
╰─➤  telnet localhost 10010
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
welcome
line ready
rr1#                   
 Check that rr1 is not in the backbone data path
From rr1 perspective:
rr1# sh ipv4 route v1                                                          
typ  prefix          metric    iface      hop        time
O    1.1.1.1/32      110/4444  ethernet1  10.1.10.1  00:05:05
O    2.2.2.2/32      110/4445  ethernet1  10.1.10.1  00:04:50
O    3.3.3.3/32      110/4445  ethernet2  10.4.10.4  00:04:32
O    4.4.4.4/32      110/4444  ethernet2  10.4.10.4  00:04:18
O    5.5.5.5/32      110/4445  ethernet1  10.1.10.1  00:04:00
O    6.6.6.6/32      110/4445  ethernet1  10.1.10.1  00:03:42
O    7.7.7.7/32      110/4446  ethernet1  10.1.10.1  00:03:28
O    8.8.8.8/32      110/4445  ethernet2  10.4.10.4  00:02:59
O    10.1.2.0/24     110/4444  ethernet1  10.1.10.1  00:22:47
O    10.1.4.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
O    10.1.5.0/24     110/4444  ethernet1  10.1.10.1  00:22:47
O    10.1.6.0/24     110/4444  ethernet1  10.1.10.1  00:22:47
C    10.1.10.0/24    0/0       ethernet1  null       00:22:49
LOC  10.1.10.10/32   0/1       ethernet1  null       00:22:49
O    10.2.3.0/24     110/4445  ethernet2  10.4.10.4  00:22:35
O    10.2.6.0/24     110/4445  ethernet1  10.1.10.1  00:22:47
O    10.2.7.0/24     110/4445  ethernet1  10.1.10.1  00:22:38
O    10.2.11.0/24    110/4445  ethernet1  10.1.10.1  00:22:38
O    10.3.4.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
O    10.3.7.0/24     110/4445  ethernet2  10.4.10.4  00:22:35
O    10.3.8.0/24     110/4445  ethernet2  10.4.10.4  00:22:32
O    10.3.11.0/24    110/4445  ethernet2  10.4.10.4  00:22:35
O    10.4.5.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
O    10.4.8.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
C    10.4.10.0/24    0/0       ethernet2  null       00:22:49
LOC  10.4.10.10/32   0/1       ethernet2  null       00:22:49
C    10.10.10.10/32  0/0       loopback1  null       00:22:49
O    11.11.11.11/32  110/8889  ethernet1  10.1.10.1  00:06:43

rr1# sh ipv4 ospf 1 topo 0                                                     
node      reach  via        ifc        met   hop  conn  sr  br  neighbors
4.4.4.1   true   10.1.10.1  ethernet1  4444  1    5     0   0   4.4.4.2=1=10.1.2.1 4.4.4.4=1=10.1.4.1 4.4.4.5=1=10.1.5.1 4.4.4.6=1=10.1.6.1 4.4.4.10=4444=10.1.10.1
4.4.4.2   true   10.1.10.1  ethernet1  4445  2    5     0   0   4.4.4.1=1=10.1.2.2 4.4.4.3=1=10.2.3.2 4.4.4.7=1=10.2.7.2 4.4.4.6=1=10.2.6.2 4.4.4.11=4444=10.2.11.2
4.4.4.3   true   10.4.10.4  ethernet2  4445  2    4     0   0   4.4.4.2=1=10.2.3.3 4.4.4.4=1=10.3.4.3 4.4.4.8=1=10.3.8.3 4.4.4.7=1=10.3.7.3
4.4.4.4   true   10.4.10.4  ethernet2  4444  1    5     0   0   4.4.4.3=1=10.3.4.4 4.4.4.8=1=10.4.8.4 4.4.4.5=1=10.4.5.4 4.4.4.1=1=10.1.4.4 4.4.4.10=4444=10.4.10.4
4.4.4.5   true   10.1.10.1  ethernet1  4445  2    2     0   0   4.4.4.1=1=10.1.5.5 4.4.4.4=1=10.4.5.5
4.4.4.6   true   10.1.10.1  ethernet1  4445  2    2     0   0   4.4.4.1=1=10.1.6.6 4.4.4.2=1=10.2.6.6
4.4.4.7   true   10.1.10.1  ethernet1  4446  3    2     0   0   4.4.4.2=1=10.2.7.7 4.4.4.3=1=10.3.7.7
4.4.4.8   true   10.4.10.4  ethernet2  4445  2    2     0   0   4.4.4.3=1=10.3.8.8 4.4.4.4=1=10.4.8.8
4.4.4.10  true   null       null       0     0    2     0   0   4.4.4.1=4444=10.1.10.10 4.4.4.4=4444=10.4.10.10
4.4.4.11  true   10.1.10.1  ethernet1  8889  3    1     0   0   4.4.4.2=4444=10.2.11.11

rr1# sh ipv6 route v1                                                          
typ  prefix                  metric     iface      hop                time
O    fd00::1/128             110/6666   ethernet1  fd00:cafe::1:10:1  00:06:01
O    fd00::2/128             110/6667   ethernet1  fd00:cafe::1:10:1  00:05:46
O    fd00::3/128             110/6667   ethernet2  fd00:cafe::4:10:4  00:05:28
O    fd00::4/128             110/6666   ethernet2  fd00:cafe::4:10:4  00:05:14
O    fd00::5/128             110/6667   ethernet1  fd00:cafe::1:10:1  00:04:56
O    fd00::6/128             110/6667   ethernet1  fd00:cafe::1:10:1  00:04:38
O    fd00::7/128             110/6668   ethernet1  fd00:cafe::1:10:1  00:04:24
O    fd00::8/128             110/6667   ethernet2  fd00:cafe::4:10:4  00:03:56
C    fd00::a/128             0/0        loopback1  null               00:23:45
O    fd00::b/128             110/13333  ethernet1  fd00:cafe::1:10:1  00:07:40
O    fd00:cafe::1:2:0/112    110/6666   ethernet1  fd00:cafe::1:10:1  00:23:43
O    fd00:cafe::1:4:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
O    fd00:cafe::1:5:0/112    110/6666   ethernet1  fd00:cafe::1:10:1  00:23:43
O    fd00:cafe::1:6:0/112    110/6666   ethernet1  fd00:cafe::1:10:1  00:23:43
C    fd00:cafe::1:10:0/112   0/0        ethernet1  null               00:23:45
LOC  fd00:cafe::1:10:10/128  0/1        ethernet1  null               00:23:45
O    fd00:cafe::2:3:0/112    110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::2:6:0/112    110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::2:7:0/112    110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::2:11:0/112   110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::3:4:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
O    fd00:cafe::3:7:0/112    110/6667   ethernet2  fd00:cafe::4:10:4  00:23:32
O    fd00:cafe::3:8:0/112    110/6667   ethernet2  fd00:cafe::4:10:4  00:23:32
O    fd00:cafe::3:11:0/112   110/6667   ethernet2  fd00:cafe::4:10:4  00:23:32
O    fd00:cafe::4:5:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
O    fd00:cafe::4:8:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
C    fd00:cafe::4:10:0/112   0/0        ethernet2  null               00:23:45
LOC  fd00:cafe::4:10:10/128  0/1        ethernet2  null               00:23:45

rr1# sh ipv6 ospf 1 topo 0                                                     
node               reach  via                ifc        met    hop  conn  sr  br  neighbors
6.6.6.1/00000000   true   fd00:cafe::1:10:1  ethernet1  6666   1    5     0   0   6.6.6.2/00000000=1=10012 6.6.6.4/00000000=1=10015 6.6.6.5/00000000=1=10012 6.6.6.6/00000000=1=10012 6.6.6.10/00000000=6666=10012
6.6.6.2/00000000   true   fd00:cafe::1:10:1  ethernet1  6667   2    5     0   0   6.6.6.1/00000000=1=10012 6.6.6.3/00000000=1=10012 6.6.6.7/00000000=1=10012 6.6.6.6/00000000=1=10013 6.6.6.11/00000000=6666=10012
6.6.6.3/00000000   true   fd00:cafe::4:10:4  ethernet2  6667   2    4     0   0   6.6.6.2/00000000=1=10013 6.6.6.4/00000000=1=10012 6.6.6.8/00000000=1=10012 6.6.6.7/00000000=1=10013
6.6.6.4/00000000   true   fd00:cafe::4:10:4  ethernet2  6666   1    5     0   0   6.6.6.3/00000000=1=10013 6.6.6.8/00000000=1=10013 6.6.6.5/00000000=1=10013 6.6.6.1/00000000=1=10013 6.6.6.10/00000000=6666=10013
6.6.6.5/00000000   true   fd00:cafe::1:10:1  ethernet1  6667   2    2     0   0   6.6.6.1/00000000=1=10014 6.6.6.4/00000000=1=10014
6.6.6.6/00000000   true   fd00:cafe::1:10:1  ethernet1  6667   2    2     0   0   6.6.6.1/00000000=1=10015 6.6.6.2/00000000=1=10015
6.6.6.7/00000000   true   fd00:cafe::1:10:1  ethernet1  6668   3    2     0   0   6.6.6.2/00000000=1=10014 6.6.6.3/00000000=1=10015
6.6.6.8/00000000   true   fd00:cafe::4:10:4  ethernet2  6667   2    2     0   0   6.6.6.3/00000000=1=10014 6.6.6.4/00000000=1=10013
6.6.6.10/00000000  true   null               null       0      0    2     0   0   6.6.6.1/00000000=6666=10016 6.6.6.4/00000000=6666=10016
6.6.6.11/00000000  true   fd00:cafe::1:10:1  ethernet1  13333  3    1     0   0   6.6.6.2/00000000=6666=10016
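The metrics above can be cross-checked with a small shortest-path computation. The sketch below rebuilds the IPv4 topology from the OSPF output and runs a plain Dijkstra over it. The names c1..c8, rr1 and rr2 are assumed to correspond to router IDs 4.4.4.1 to 4.4.4.8, 4.4.4.10 and 4.4.4.11. The link costs are the ones shown in the neighbors column.

```python
import heapq

# Link costs taken from the OSPFv2 topology output (names are assumed, see above).
LINKS = {
    ("c1", "c2"): 1, ("c1", "c4"): 1, ("c1", "c5"): 1, ("c1", "c6"): 1,
    ("c2", "c3"): 1, ("c2", "c6"): 1, ("c2", "c7"): 1,
    ("c3", "c4"): 1, ("c3", "c7"): 1, ("c3", "c8"): 1,
    ("c4", "c5"): 1, ("c4", "c8"): 1,
    ("c1", "rr1"): 4444, ("c4", "rr1"): 4444, ("c2", "rr2"): 4444,
}


def shortest_paths(source):
    """Plain Dijkstra; returns {node: (cost, path)} for every reachable node."""
    graph = {}
    for (a, b), cost in LINKS.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    best = {}
    queue = [(0, source, [source])]
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node in best:
            continue
        best[node] = (cost, path)
        for neighbor, link_cost in graph[node]:
            if neighbor not in best:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return best


clients = [f"c{i}" for i in range(1, 9)]
# Client-to-client shortest paths that would transit through a route reflector.
transit = [
    (src, dst)
    for src in clients
    for dst, (_, path) in shortest_paths(src).items()
    if dst in clients and ("rr1" in path or "rr2" in path)
]
print(transit)                          # [] : no client-to-client path uses a RR
print(shortest_paths("rr1")["c7"][0])   # 4446, as in the 7.7.7.7/32 route above
```

Because every path through rr1 costs at least 8888, while the client mesh uses unit costs, the route reflectors never attract transit traffic.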
 Connectivity test between rr1 and other BGP speakers
Check reachability from one RR client (c5 for example)
c5#sh ipv4 route v1                                                            
typ  prefix          metric    iface      hop       time
O    1.1.1.1/32      110/1     ethernet1  10.1.5.1  00:07:22
O    2.2.2.2/32      110/2     ethernet1  10.1.5.1  00:07:07
O    3.3.3.3/32      110/2     ethernet2  10.4.5.4  00:06:49
O    4.4.4.4/32      110/1     ethernet2  10.4.5.4  00:06:35
C    5.5.5.5/32      0/0       loopback1  null      00:25:07
O    6.6.6.6/32      110/2     ethernet1  10.1.5.1  00:06:00
O    7.7.7.7/32      110/3     ethernet1  10.1.5.1  00:05:46
O    8.8.8.8/32      110/2     ethernet2  10.4.5.4  00:05:17
O    10.1.2.0/24     110/1     ethernet1  10.1.5.1  00:25:06
O    10.1.4.0/24     110/1     ethernet2  10.4.5.4  00:25:05
C    10.1.5.0/24     0/0       ethernet1  null      00:25:07
LOC  10.1.5.5/32     0/1       ethernet1  null      00:25:07
O    10.1.6.0/24     110/1     ethernet1  10.1.5.1  00:25:06
O    10.1.10.0/24    110/1     ethernet1  10.1.5.1  00:25:06
O    10.2.3.0/24     110/2     ethernet2  10.4.5.4  00:24:53
O    10.2.6.0/24     110/2     ethernet1  10.1.5.1  00:25:05
O    10.2.7.0/24     110/2     ethernet1  10.1.5.1  00:24:56
O    10.2.11.0/24    110/2     ethernet1  10.1.5.1  00:24:56
O    10.3.4.0/24     110/1     ethernet2  10.4.5.4  00:25:05
O    10.3.7.0/24     110/2     ethernet2  10.4.5.4  00:24:53
O    10.3.8.0/24     110/2     ethernet2  10.4.5.4  00:24:50
O    10.3.11.0/24    110/2     ethernet2  10.4.5.4  00:24:53
C    10.4.5.0/24     0/0       ethernet2  null      00:25:07
LOC  10.4.5.5/32     0/1       ethernet2  null      00:25:07
O    10.4.8.0/24     110/1     ethernet2  10.4.5.4  00:25:05
O    10.4.10.0/24    110/1     ethernet2  10.4.5.4  00:25:05
O    10.10.10.10/32  110/4445  ethernet1  10.1.5.1  00:11:05
O    11.11.11.11/32  110/4446  ethernet1  10.1.5.1  00:09:01

c5#sh ipv4 ospf 1 topo 0                                                       
node      reach  via       ifc        met   hop  conn  sr  br  neighbors
4.4.4.1   true   10.1.5.1  ethernet1  1     1    5     0   0   4.4.4.2=1=10.1.2.1 4.4.4.4=1=10.1.4.1 4.4.4.5=1=10.1.5.1 4.4.4.6=1=10.1.6.1 4.4.4.10=4444=10.1.10.1
4.4.4.2   true   10.1.5.1  ethernet1  2     2    5     0   0   4.4.4.1=1=10.1.2.2 4.4.4.3=1=10.2.3.2 4.4.4.7=1=10.2.7.2 4.4.4.6=1=10.2.6.2 4.4.4.11=4444=10.2.11.2
4.4.4.3   true   10.4.5.4  ethernet2  2     2    4     0   0   4.4.4.2=1=10.2.3.3 4.4.4.4=1=10.3.4.3 4.4.4.8=1=10.3.8.3 4.4.4.7=1=10.3.7.3
4.4.4.4   true   10.4.5.4  ethernet2  1     1    5     0   0   4.4.4.3=1=10.3.4.4 4.4.4.8=1=10.4.8.4 4.4.4.5=1=10.4.5.4 4.4.4.1=1=10.1.4.4 4.4.4.10=4444=10.4.10.4
4.4.4.5   true   null      null       0     0    2     0   0   4.4.4.1=1=10.1.5.5 4.4.4.4=1=10.4.5.5
4.4.4.6   true   10.1.5.1  ethernet1  2     2    2     0   0   4.4.4.1=1=10.1.6.6 4.4.4.2=1=10.2.6.6
4.4.4.7   true   10.1.5.1  ethernet1  3     3    2     0   0   4.4.4.2=1=10.2.7.7 4.4.4.3=1=10.3.7.7
4.4.4.8   true   10.4.5.4  ethernet2  2     2    2     0   0   4.4.4.3=1=10.3.8.8 4.4.4.4=1=10.4.8.8
4.4.4.10  true   10.1.5.1  ethernet1  4445  2    2     0   0   4.4.4.1=4444=10.1.10.10 4.4.4.4=4444=10.4.10.10
4.4.4.11  true   10.1.5.1  ethernet1  4446  3    1     0   0   4.4.4.2=4444=10.2.11.11

c5#sh ipv6 route v1                                                            
typ  prefix                 metric    iface      hop               time
O    fd00::1/128            110/1     ethernet1  fd00:cafe::1:5:1  00:08:06
O    fd00::2/128            110/2     ethernet1  fd00:cafe::1:5:1  00:07:51
O    fd00::3/128            110/2     ethernet2  fd00:cafe::4:5:4  00:07:33
O    fd00::4/128            110/1     ethernet2  fd00:cafe::4:5:4  00:07:19
C    fd00::5/128            0/0       loopback1  null              00:25:51
O    fd00::6/128            110/2     ethernet1  fd00:cafe::1:5:1  00:06:43
O    fd00::7/128            110/3     ethernet1  fd00:cafe::1:5:1  00:06:29
O    fd00::8/128            110/2     ethernet2  fd00:cafe::4:5:4  00:06:01
O    fd00::a/128            110/6667  ethernet1  fd00:cafe::1:5:1  00:11:45
O    fd00::b/128            110/6668  ethernet1  fd00:cafe::1:5:1  00:09:45
O    fd00:cafe::1:2:0/112   110/1     ethernet1  fd00:cafe::1:5:1  00:25:49
O    fd00:cafe::1:4:0/112   110/1     ethernet2  fd00:cafe::4:5:4  00:25:49
C    fd00:cafe::1:5:0/112   0/0       ethernet1  null              00:25:51
LOC  fd00:cafe::1:5:5/128   0/1       ethernet1  null              00:25:51
O    fd00:cafe::1:6:0/112   110/1     ethernet1  fd00:cafe::1:5:1  00:25:49
O    fd00:cafe::1:10:0/112  110/1     ethernet1  fd00:cafe::1:5:1  00:25:49
O    fd00:cafe::2:3:0/112   110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::2:6:0/112   110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::2:7:0/112   110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::2:11:0/112  110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::3:4:0/112   110/1     ethernet2  fd00:cafe::4:5:4  00:25:49
O    fd00:cafe::3:7:0/112   110/2     ethernet2  fd00:cafe::4:5:4  00:25:37
O    fd00:cafe::3:8:0/112   110/2     ethernet2  fd00:cafe::4:5:4  00:25:37
O    fd00:cafe::3:11:0/112  110/2     ethernet2  fd00:cafe::4:5:4  00:25:37
C    fd00:cafe::4:5:0/112   0/0       ethernet2  null              00:25:51
LOC  fd00:cafe::4:5:5/128   0/1       ethernet2  null              00:25:51
O    fd00:cafe::4:8:0/112   110/1     ethernet2  fd00:cafe::4:5:4  00:25:49
O    fd00:cafe::4:10:0/112  110/1     ethernet2  fd00:cafe::4:5:4  00:25:49

c5#sh ipv6 ospf 1 topo 0                                                       
node               reach  via               ifc        met   hop  conn  sr  br  neighbors
6.6.6.1/00000000   true   fd00:cafe::1:5:1  ethernet1  1     1    5     0   0   6.6.6.2/00000000=1=10012 6.6.6.4/00000000=1=10015 6.6.6.5/00000000=1=10012 6.6.6.6/00000000=1=10012 6.6.6.10/00000000=6666=10012
6.6.6.2/00000000   true   fd00:cafe::1:5:1  ethernet1  2     2    5     0   0   6.6.6.1/00000000=1=10012 6.6.6.3/00000000=1=10012 6.6.6.7/00000000=1=10012 6.6.6.6/00000000=1=10013 6.6.6.11/00000000=6666=10012
6.6.6.3/00000000   true   fd00:cafe::4:5:4  ethernet2  2     2    4     0   0   6.6.6.2/00000000=1=10013 6.6.6.4/00000000=1=10012 6.6.6.8/00000000=1=10012 6.6.6.7/00000000=1=10013
6.6.6.4/00000000   true   fd00:cafe::4:5:4  ethernet2  1     1    5     0   0   6.6.6.3/00000000=1=10013 6.6.6.8/00000000=1=10013 6.6.6.5/00000000=1=10013 6.6.6.1/00000000=1=10013 6.6.6.10/00000000=6666=10013
6.6.6.5/00000000   true   null              null       0     0    2     0   0   6.6.6.1/00000000=1=10014 6.6.6.4/00000000=1=10014
6.6.6.6/00000000   true   fd00:cafe::1:5:1  ethernet1  2     2    2     0   0   6.6.6.1/00000000=1=10015 6.6.6.2/00000000=1=10015
6.6.6.7/00000000   true   fd00:cafe::1:5:1  ethernet1  3     3    2     0   0   6.6.6.2/00000000=1=10014 6.6.6.3/00000000=1=10015
6.6.6.8/00000000   true   fd00:cafe::4:5:4  ethernet2  2     2    2     0   0   6.6.6.3/00000000=1=10014 6.6.6.4/00000000=1=10013
6.6.6.10/00000000  true   fd00:cafe::1:5:1  ethernet1  6667  2    2     0   0   6.6.6.1/00000000=6666=10016 6.6.6.4/00000000=6666=10016
6.6.6.11/00000000  true   fd00:cafe::1:5:1  ethernet1  6668  3    1     0   0   6.6.6.2/00000000=6666=10016
Ping rr1 from c5
c5#ping 10.10.10.10 /vrf v1                                                    
pinging 10.10.10.10, src=null, vrf=v1, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=0/0/1/4
c5#ping fd00::a /vrf v1                                                        
pinging fd00::a, src=null, vrf=v1, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=0/0/1/4
c5#                                                                                                                                                 
 BGP connectivity check on rr1
BGP summary
rr1#sh ipv4 bgp 65535 sum                                                      
as     learn  done  ready  neighbor  uptime
65535  0      0     true   1.1.1.1   16:22:28
65535  0      0     true   2.2.2.2   16:17:26
65535  0      0     true   3.3.3.3   16:16:44
65535  0      0     true   4.4.4.4   16:16:01
65535  0      0     true   5.5.5.5   16:15:32
65535  0      0     true   6.6.6.6   16:14:56
65535  0      0     true   7.7.7.7   16:14:30
65535  0      0     true   8.8.8.8   16:13:37

rr1#sh ipv6 bgp 65535 sum                                                      
as     learn  done  ready  neighbor  uptime
65535  0      0     true   fd00::1   16:20:41
65535  0      0     true   fd00::2   16:18:27
65535  0      0     true   fd00::3   16:17:32
65535  0      0     true   fd00::4   16:16:59
65535  0      0     true   fd00::5   16:16:22
65535  0      0     true   fd00::6   16:15:57
65535  0      0     true   fd00::7   16:15:15
65535  0      0     true   fd00::8   16:14:45
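The IPv6 neighbors listed above are the peers admitted by `listen ACL-IPv6-RR-CLIENT bgp6`. As a rough illustration of how that access list selects them, the sketch below evaluates the three entries in sequence order with address/mask matching. It assumes first-match-wins semantics and an implicit deny when nothing matches. Only the source address part of each entry is modelled.

```python
import ipaddress

# (action, source address, source mask) in sequence order, from ACL-IPv6-RR-CLIENT.
ACL_IPV6_RR_CLIENT = [
    ("deny", "fd00::a", "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"),
    ("deny", "fd00::b", "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"),
    ("permit", "fd00::", "ffff::"),
]


def acl_permits(peer):
    """First matching entry wins; no match is treated as deny (assumption)."""
    peer_int = int(ipaddress.IPv6Address(peer))
    for action, addr, mask in ACL_IPV6_RR_CLIENT:
        mask_int = int(ipaddress.IPv6Address(mask))
        if peer_int & mask_int == int(ipaddress.IPv6Address(addr)) & mask_int:
            return action == "permit"
    return False


for peer in ("fd00::1", "fd00::8", "fd00::a", "fd00::b"):
    print(peer, acl_permits(peer))
```

Under these assumptions the client loopbacks fd00::1 to fd00::8 are accepted, while fd00::a (rr1 itself) and fd00::b (presumably the second route reflector) are excluded from the dynamic client template.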

From rr1 check c1 BGP status (pay attention to type = routeReflectorClient)
rr1#show ipv4 bgp 65535 neighbor 1.1.1.1 status                                
peer = 1.1.1.1
reachable state = true
reachable changed = 16:24:12
reachable changes = 1
fallover = null
update group = 0
type = routeReflectorClient
safi =  unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
local = 10.10.10.10
router id = 1.1.1.1
uptime = 16:24:12
hold time = 00:03:00
keepalive time = 00:01:00
32bit as = true
refresh = true, rx=0, tx=0
description = rr clients
hostname = null
compression = rx=true, tx=false
graceful = 
addpath rx = 
addpath tx = 
unicast advertised = 0 of 0, list = 0, accepted = 0 of 0
multicast advertised = 0 of 0, list = 0, accepted = 0 of 0
other advertised = 0 of 0, list = 0, accepted = 0 of 0
flowspec advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
vpls advertised = 0 of 0, list = 0, accepted = 0 of 0
mspw advertised = 0 of 0, list = 0, accepted = 0 of 0
evpn advertised = 0 of 0, list = 0, accepted = 0 of 0
mdt advertised = 0 of 0, list = 0, accepted = 0 of 0
srte advertised = 0 of 0, list = 0, accepted = 0 of 0
mvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
omvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
version = 14 of 14, needfull=0, buffull=0
full = 9, 2020-07-27 16:32:29, 16:15:21 ago, 0 ms
incr = 2, 2020-07-28 08:13:10, 00:34:40 ago, 0 ms
connection = tx=173(987) rx=158(986) drp=0(0)
uncompressed = tx=0(0) rx=0(0) drp=0(0)
buffer = max=65536 rx=0 tx=65536

rr1#show ipv6 bgp 65535 neighbor fd00::1 status                                
peer = fd00::1
reachable state = true
reachable changed = 16:22:33
reachable changes = 1
fallover = null
update group = 0
type = routeReflectorClient
safi =  unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
local = fd00::a
router id = 1.1.1.1
uptime = 16:22:33
hold time = 00:03:00
keepalive time = 00:01:00
32bit as = true
refresh = true, rx=0, tx=0
description = rr clients
hostname = null
compression = rx=true, tx=false
graceful = 
addpath rx = 
addpath tx = 
unicast advertised = 0 of 0, list = 0, accepted = 0 of 0
multicast advertised = 0 of 0, list = 0, accepted = 0 of 0
other advertised = 0 of 0, list = 0, accepted = 0 of 0
flowspec advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
vpls advertised = 0 of 0, list = 0, accepted = 0 of 0
mspw advertised = 0 of 0, list = 0, accepted = 0 of 0
evpn advertised = 0 of 0, list = 0, accepted = 0 of 0
mdt advertised = 0 of 0, list = 0, accepted = 0 of 0
srte advertised = 0 of 0, list = 0, accepted = 0 of 0
mvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
omvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
version = 14 of 14, needfull=0, buffull=0
full = 9, 2020-07-27 16:32:15, 16:16:37 ago, 0 ms
incr = 2, 2020-07-28 08:13:15, 00:35:38 ago, 0 ms
connection = tx=173(985) rx=158(984) drp=0(0)
uncompressed = tx=0(0) rx=0(0) drp=0(0)
buffer = max=65536 rx=0 tx=65536                                                                                                                                          

Conclusion

In this article you:

  • had a brief introduction to the BGP protocol and the rationale for BGP route reflectors
  • learned the design considerations related to a BGP RR setup
  • got a typical BGP configuration example with a long list of AFI/SAFI enabled (this configuration is not exhaustive: BGP add-path, for example, is available but not configured)
  • verified BGP RR operation

RARE validated design: [ BGP RR #001 ]- key take-away

  • The BGP Route Reflector use case does not require a commercial vendor router; it can be handled perfectly well by a software solution running on a server with enough RAM.

The example above is a high-availability Route Reflector design that is able to handle BGP signalling for a large Service Provider, for all address families.

  • Redundant BGP Route Reflection is ensured by deploying 2 RRs (at minimum) belonging to the same BGP RR cluster

In addition to having several RRs for the whole domain, it is also common to see hierarchical RR designs. Some Service Providers deploy dedicated RRs for specific address families (L3VPN unicast for example).

  • RRs in the same cluster run a basic iBGP session between them

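These RRs also share the same cluster ID, so that a route reflected by one RR is ignored by the other RR when it finds its own cluster ID in the route's CLUSTER_LIST attribute, which prevents advertisement loops inside the cluster.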

  • RR should not be in the traffic datapath

This is the reason why we set a high cost (4444 for IPv4 and 6666 for IPv6) in both directions on the RR interconnection ports.

  • RR design for a multi-service backbone

In the example, the RR clients are running only IPv4/IPv6, but the RR design above can empower a Service Provider backbone with additional services running on top of MPLS: L3VPN, 6VPE, VPLS, EVPN, etc.

  • In the next article we will dissect the rr1 configuration

This will demonstrate some nice features offered by freeRouter, such as BGP templates and next-hop tracking, among other features not mentioned here (like BGP add-path).
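To make the role of the shared cluster ID more concrete, here is a minimal sketch of the loop-prevention rule of BGP route reflection (RFC 4456): a reflector prepends its cluster ID to the CLUSTER_LIST when reflecting a route, and ignores any route whose CLUSTER_LIST already contains its own cluster ID. The `Route` class, the `reflect` function and the cluster ID value are hypothetical and only serve the illustration. They do not come from freeRouter.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    prefix: str
    originator_id: str = ""
    cluster_list: list = field(default_factory=list)


def reflect(route, cluster_id, client_router_id):
    """Return the route as a RR would reflect it, or None if it must be ignored."""
    if cluster_id in route.cluster_list:
        return None  # the route already went through this cluster: loop detected
    return Route(
        prefix=route.prefix,
        # ORIGINATOR_ID is set once, by the first RR that reflects the route.
        originator_id=route.originator_id or client_router_id,
        # The reflecting RR prepends its own CLUSTER_ID.
        cluster_list=[cluster_id] + route.cluster_list,
    )


CLUSTER_ID = "10.10.10.10"  # hypothetical value shared by rr1 and rr2

from_c1 = Route("192.0.2.0/24")
via_rr1 = reflect(from_c1, CLUSTER_ID, "1.1.1.1")
print(via_rr1.cluster_list)                         # ['10.10.10.10']
print(reflect(via_rr1, CLUSTER_ID, "10.10.10.10"))  # None: rr2 ignores rr1's copy
```

With two reflectors in the same cluster, each client still receives the route from both RRs, but the RRs do not bounce each other's reflected copies back and forth.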


RR design test

You can test the design above in order to check RR and backbone router signalling.

  • Set up the freeRouter environment as described above
  • Get RARE code
Clone RARE code from repository
 git clone https://github.com/frederic-loui/RARE.git
Launch the Service Provider example (diagram above)
cd RARE/00-unit-labs/0101-rare-validated-design-bgp/
make
Access routers using the following command:
c1: telnet localhost 10001 
c2: telnet localhost 10002 
c3: telnet localhost 10003 
c4: telnet localhost 10004 
c5: telnet localhost 10005 
c6: telnet localhost 10006 
c7: telnet localhost 10007 
c8: telnet localhost 10008 
rr1: telnet localhost 10010 
rr2: telnet localhost 10011 
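Once the lab is running, it can be handy to check that every router console is listening before opening telnet sessions by hand. The sketch below is one possible way to do so with plain TCP connection attempts. The port map mirrors the list above (with c5 on port 10005, assumed from the numbering pattern). The helper name is hypothetical.

```python
import socket

# Console ports of the lab routers, as listed above
# (c5 on 10005 is assumed from the numbering pattern).
LAB_PORTS = {f"c{i}": 10000 + i for i in range(1, 9)}
LAB_PORTS.update({"rr1": 10010, "rr2": 10011})


def console_is_open(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to the router console succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A typical use is to loop over `LAB_PORTS.items()` and print the name of each router together with the result of `console_is_open(port)`.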
Stop the Service Provider example
cd RARE/00-unit-labs/0101-rare-validated-design-bgp/
make clean

In article #005 you learned how RARE/freeRouter is controlling a P4Emu/pcap dataplane. We also demonstrated that this setup could be integrated into real networks.

Requirement

  • Basic Linux/Unix knowledge
  • Basic networking knowledge

Overview

Though P4Emu/pcap can be used for SOHO and can handle nx1GE of traffic, this comes at a high CPU load cost and thus a higher power consumption. 

"Why write yet another software dataplane as freeRouter has already a working native software dataplane ?"

The partial answer to the question raised in the previous article was:

"decoupling control plane from the dataplane"

We learned that P4Emu:

  • is able to understand the very same control messages from freeRouter as a P4 dataplane does
  • is able to switch packets, emulating router.p4, using libpcap as the packet forwarding backend.

However, even though libpcap is a performant packet processing library, the kernel is still heavily solicited: the higher the traffic rate, the higher the CPU workload.

Article objective

In this article we'll reuse the freeRouter setup deployed in #005 and replace the P4Emu/pcap dataplane with a P4Emu/dpdk dataplane. 

Source Wikipedia: https://en.wikipedia.org/wiki/Data_Plane_Development_Kit

The Data Plane Development Kit (DPDK) is an Open source software project managed by the Linux Foundation. It provides a set of data plane libraries and network interface controller polling-mode drivers for offloading TCP packet processing from the operating system kernel to processes running in user space. This offloading achieves higher computing efficiency and higher packet throughput than is possible using the interrupt-driven processing provided in the kernel.


It is important to note that, despite what its name might suggest, P4Emu/dpdk is not emulating V1Model. P4Emu emulates the router.p4 packet processing logic and uses a packet forwarding library to transmit packets received on a given ingress port to the right egress port, as defined by freeRouter control plane messages. However, in this precise case, packet processing is offloaded from the kernel to user space. As a consequence, with a DPDK-compatible NIC and driver, very high traffic rates can be reached. DPDK is not available on all hardware; please refer to the DPDK HCL.


Diagram

[ #006 ] - Cookbook

 Install your favorite operating system

In our example we will use Ubuntu Focal, as we need dpdk 19.11.1 (the latest version at the time of writing is 20.05.0), and we add a bridged network interface attached to our laptop RJ45 connection.

 Install FreeRouter as per #001 article: "Create freeRouter environment"
Install dpdk and dpdk-dev
apt-get update
apt-get upgrade
apt-get install dpdk dpdk-dev --no-install-recommends
Flush enp0s3 so that it can be controlled by dpdk
ip addr flush enp0s3

Add out of band management enp0s8 with Virtualbox

You can add a second Host-only interface (enp0s8) in VirtualBox in order to stay connected to the Ubuntu Focal VM guest, as you might lose the connection once enp0s3 is flushed.
Set up dpdk and the veth pair for control plane / dataplane communication via pcapInt
#!/bin/bash
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 64 > /proc/sys/vm/nr_hugepages
modprobe uio_pci_generic
dpdk-devbind.py -b uio_pci_generic 00:03.0
ip link add veth0a type veth peer name veth0b
ip link set veth0a up
ip link set veth0b up
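The script above reserves 64 hugepages for DPDK. Assuming the common 2 MB default hugepage size on x86-64, that amounts to 128 MB of memory. The following sketch shows one way to confirm the reservation by parsing `/proc/meminfo`. The function name is hypothetical and the sample text is an illustrative excerpt, not output captured from the article's VM.

```python
def hugepage_summary(meminfo_text):
    """Extract hugepage counters from /proc/meminfo content."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("HugePages_Total", "HugePages_Free", "Hugepagesize"):
            fields[key] = int(value.split()[0])
    # Hugepagesize is reported in kB; convert the reservation to MiB.
    fields["reserved_MiB"] = fields["HugePages_Total"] * fields["Hugepagesize"] // 1024
    return fields


# Illustrative excerpt only; on a real system read the file instead.
sample = """HugePages_Total:      64
HugePages_Free:       64
Hugepagesize:       2048 kB"""
print(hugepage_summary(sample)["reserved_MiB"])  # 128
```

On the target machine, the same function can be fed with `open("/proc/meminfo").read()`.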
Check that dpdk is able to control enp0s3
dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:00:03.0 '82540EM Gigabit Ethernet Controller 100e' drv=uio_pci_generic unused=e1000,vfio-pci

Network devices using kernel driver
===================================
0000:00:08.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s8 drv=e1000 unused=vfio-pci,uio_pci_generic *Active*

No 'Baseband' devices detected
==============================

No 'Crypto' devices detected
============================

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
===================================
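If you want to script this verification, the status output above can be filtered to keep only the devices bound to a DPDK-compatible driver. This is a rough sketch: the function name dpdk_bound_devices is hypothetical, and it assumes the section layout shown above (a section title, a line of "=", the devices, then a blank line).

```shell
#!/bin/bash
# Hypothetical helper: reads "dpdk-devbind.py --status" output on stdin
# and prints the PCI id of every network device bound to a DPDK driver.
# Assumption: each section ends with a blank line, as in the output above.
dpdk_bound_devices() {
    awk '
        /^Network devices using DPDK-compatible driver/ { in_section = 1; next }
        /^[ \t\r]*$/                                    { in_section = 0 }
        in_section && /drv=/                            { print $1 }
    '
}
```

Usage: dpdk-devbind.py --status | dpdk_bound_devices. With the output above, the only line printed is 0000:00:03.0.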
Create the freeRouter directory layout and download rtr.jar
mkdir -p ~/freeRouter/bin ~/freeRouter/lib ~/freeRouter/etc ~/freeRouter/log
cd ~/freeRouter/lib
wget http://freerouter.nop.hu/rtr.jar
Check the resulting directory layout
tree freeRouter
freeRouter
├── bin   # binary files      
├── etc   # configuration files      
├── lib   # library files      
└── log   # log files      

 Install freeRouter net-tools
get freeRouter net-tools tarball
wget freerouter.nop.hu/rtr.tar
Extract the tarball into ~/freeRouter/bin
tar xvf rtr.tar -C ~/freeRouter/bin/

For those who would like to rebuild these binaries, the compilation shell script is available in the cloned freeRouter git repository: ~/freeRouter/src/native/c.sh
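Before going further, it can be useful to check that the two binaries used later in this article, pcapInt.bin and p4dpdk.bin, are present and executable. The sketch below is only an illustration: the function name missing_tools is hypothetical, and it assumes the tarball was extracted into the directory passed as argument.

```shell
#!/bin/bash
# Hypothetical helper: prints the name of each expected binary that is
# missing (or not executable) in the directory given as first argument.
# The two names are the binaries referenced later in this article.
missing_tools() {
    local dir=$1 tool
    for tool in pcapInt.bin p4dpdk.bin; do
        [ -x "$dir/$tool" ] || echo "$tool"
    done
}
```

Usage: missing_tools ~/freeRouter/bin. An empty output means both binaries were found.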

 Create configuration files for router: freerouter

FreeRouter uses 2 configuration files in order to run. Let's write these configuration files for R1 in ~/freeRouter/etc

freeRouter hardware configuration file: dpdk-focal-1-hw.txt
hwid hp
! cpu_port
int eth0 eth - 127.0.0.1 20001 127.0.0.1 20002
! freerouter control port for message
tcp2vrf 9080 v1 9080
! freerouter cli
tcp2vrf 2323 v1 23
! launch a process called "veth0" that actually link to veth0b
! cmd: ip link add veth0a type veth peer name veth0b
proc veth0 /root/freertr/bin/pcapInt.bin veth0a 20002 127.0.0.1 20001 127.0.0.1
proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 1
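In the hardware configuration file above, the ports on the "int eth0" line (20001 then 20002) are mirrored on the pcapInt line (20002 then 20001). A mismatch here silently breaks control plane / dataplane communication, so a small check can help. This is a sketch under assumptions: the function name check_cpu_port_pairing is hypothetical, and the pcapInt argument order (interface, local port, local address, remote port, remote address) is inferred from the example, not taken from the pcapInt documentation.

```shell
#!/bin/bash
# Hypothetical helper: reads a freeRouter hardware configuration on stdin
# and checks that the ports of the "int ... eth" line mirror those of the
# pcapInt process line.
# Assumed pcapInt argument order (inferred from the example above):
#   <iface> <local port> <local addr> <remote port> <remote addr>
check_cpu_port_pairing() {
    awk '
        $1 == "int"  && $3 == "eth"          { rtr_local = $6;  rtr_remote = $8 }
        $1 == "proc" && $3 ~ /pcapInt\.bin$/ { pcap_local = $5; pcap_remote = $7 }
        END {
            if (rtr_local != "" && rtr_local == pcap_remote && rtr_remote == pcap_local) {
                print "ok"
                exit 0
            }
            print "mismatch"
            exit 1
        }
    '
}
```

Usage: check_cpu_port_pairing < ~/freeRouter/etc/dpdk-focal-1-hw.txt. It prints "ok" when the two lines mirror each other and "mismatch" (with a non-zero exit status) otherwise.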

Note:

Let's spend some time on this hardware configuration file. As you might have noticed, there are additional interesting lines worth mentioning:

  • proc <process-name>

freeRouter can launch processes at startup. Here we use this feature to start the control plane / dataplane communication via the veth pair (veth0a and veth0b), and also to start p4dpdk.bin, the P4Emu/dpdk packet processing backend.

  • proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 1

In dpdk, interfaces are by default given port_ids that are allocated sequentially, in the order of appearance in the dpdk-devbind.py --status output, which is usually sorted by pci_id. In the previous output, interface enp0s3 has port_id #0. The veth0b interface (CPU_PORT) always gets the last port_id, right after the dpdk data port_ids, so here it is 1. If, for example, we dedicate enp0s3, enp0s8, enp0s9 and enp0s10 in VirtualBox, the command would be:

proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 4

enp0s3 would be: #0 with pci_id: 00:03.0

enp0s8 would be: #1 with pci_id: 00:08.0

enp0s9 would be: #2 with pci_id: 00:09.0

enp0s10 would be: #3 with pci_id: 00:0a.0
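The allocation rule described above can be sketched as a small helper that, given the PCI ids dedicated to dpdk, prints the expected port_id of each one and the resulting CPU_PORT value to pass to p4dpdk.bin. The function name dpdk_port_map is hypothetical, and the sketch assumes that port_ids strictly follow the sorted pci_id order, which the text above only describes as the usual case.

```shell
#!/bin/bash
# Hypothetical helper: given the PCI ids dedicated to dpdk, prints the
# expected "port_id pci_id" mapping, then the CPU_PORT id.
# Assumption: port_ids follow the sorted pci_id order. The plain byte-wise
# sort used here is only valid for ids written in the same format
# (same width, lowercase hexadecimal).
dpdk_port_map() {
    printf '%s\n' "$@" | LC_ALL=C sort |
        awk '{ printf "%d %s\n", NR - 1, $1 }
             END { printf "cpu_port %d\n", NR }'
}
```

For example, dpdk_port_map 00:03.0 00:08.0 00:09.0 00:0a.0 lists the four data ports as 0 to 3 and ends with "cpu_port 4", which is the last argument of the p4dpdk.bin command above.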

freeRouter software configuration file: dpdk-freerouter-sw.txt
hostname dpdk-freerouter
buggy
!
!
vrf definition v1
 rd 1:1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth0a]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@P4_CPU_PORT[enp0s3]
 mtu 1500
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet telnet
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 0
 interconnect ethernet0
 vrf v1
 exit
!
!
end
 Launch freeRouter control plane alongside P4Emu/dpdk dataplane
Launch freeRouter with the supplied hardware and software configuration files, with a console prompt
java -jar lib/rtr.jar routersc dpdk-focal-1-hw.txt dpdk-focal-1-sw.txt
info cfg.cfgInit.doInit:cfgInit.java:556 booting
info cfg.cfgInit.doInit:cfgInit.java:680 initializing hardware
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 2:! cpu_port
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 4:! freerouter control port for message
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 6:! freerouter cli
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 8:! launch a process called "veth0" that actually link to veth0b
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 9:! cmd: ip link add veth0a type veth peer name veth0b
info cfg.cfgInit.doInit:cfgInit.java:687 applying defaults
info cfg.cfgInit.doInit:cfgInit.java:695 applying configuration
info cfg.cfgInit.doInit:cfgInit.java:721 done
welcome
line ready
dpdk-freertr-1#

Verification

 Check telnet access for freerouter@2323
FreeRouter telnet access from Virtualbox VM guest via port 2323
root@focal-1:~# telnet 127.0.0.1 2323
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
welcome
line ready
dpdk-freerouter#  
 Check running-config
freerouter running configuration
dpdk-freerouter#term len 0                                                      
dpdk-freerouter#sh run                                                          
hostname dpdk-freerouter
buggy
!
!
vrf definition v1
 rd 1:1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth0a]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@P4_CPU_PORT[enp0s3]
 mtu 1500
 macaddr 0078.5223.343c
 lldp enable
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet telnet
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 0
 interconnect ethernet0
 vrf v1
 exit
!
!
end

Check control plane is communicating with P4Emu/dpdk dataplane
dpdk-freerouter#show interfaces summary                                         
interface  state  tx     rx       drop
ethernet0  up     43567  8727278  0
sdn1       up     42659  8675606  0
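If the summary above is captured to a file (for example from the telnet session), a short filter can flag any interface that is not up. This is a minimal sketch: the function name interfaces_not_up is hypothetical, and it assumes the five-column layout shown above.

```shell
#!/bin/bash
# Hypothetical helper: reads "show interfaces summary" output on stdin
# and prints the name of every interface whose state is not "up".
# Assumption: five columns (interface, state, tx, rx, drop) as shown above;
# the header line and shorter lines (such as the prompt) are skipped.
interfaces_not_up() {
    awk '$1 != "interface" && NF >= 5 && $2 != "up" { print $1 }'
}
```

With the output above, nothing is printed, since both ethernet0 and sdn1 are up.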
 Connectivity IPv4/IPv6 check for freeRouter
Ping IPv4 from freerouter -> LAN router gateway
dpdk-freerouter#ping 192.168.0.254 /vrf v1                                      
pinging 192.168.0.254, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=1/1/1/6
Ping IPv4 from freerouter -> LAN server
dpdk-freerouter#ping 192.168.0.62 /vrf v1                                       
pinging 192.168.0.62, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005

Please note the loss of the 1st ICMP packet towards 192.168.0.62: this is the packet that triggered ARP learning. The resulting ARP entries for 192.168.0.254 and 192.168.0.62 are listed below.
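When running such checks repeatedly, it helps to tell a harmless first-probe loss (address resolution) from a real connectivity problem. The sketch below classifies the lines of "!" and "." marks printed by the ping command. The function name classify_ping_marks and the three labels are hypothetical, and the reading of "!" as a reply and "." as a loss is inferred from the recv/sent/lost counters shown above.

```shell
#!/bin/bash
# Hypothetical helper: reads freeRouter ping output on stdin and, for each
# line made only of "!" (reply) and "." (loss) marks, prints:
#   no-loss     all probes answered
#   first-lost  only the first probe lost (typical of ARP/ND resolution)
#   other-loss  any other loss pattern
classify_ping_marks() {
    awk '
        { sub(/[ \t\r]+$/, "") }          # drop trailing blanks
        /^[.!]+$/ {
            if ($0 ~ /^!+$/)        print "no-loss"
            else if ($0 ~ /^\.!+$/) print "first-lost"
            else                    print "other-loss"
        }
    '
}
```

Applied to the two pings above, it reports "no-loss" for 192.168.0.254 and "first-lost" for 192.168.0.62.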

IPv4 arp check
dpdk-freerouter#sh ipv4 arp sdn1                                                
mac             address        time      static
e03f.496d.1899  192.168.0.62   00:00:24  false    <----- Host server
0024.d4a0.0cd3  192.168.0.254  00:00:24  false    <----- LAN gateway
Ping IPv6 from freerouter -> LAN router
dpdk-freerouter#ping 2a01:e0a:159:2850::1 /vrf v1                               
pinging 2a01:e0a:159:2850::1, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005
Ping IPv6 from freerouter -> Host server
dpdk-freerouter#ping 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1              
pinging 2a01:e0a:159:2850:e23f:49ff:fe6d:1899, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1006

Please note the loss of the 1st ICMP packet, which triggered IPv6 neighbor discovery for 2a01:e0a:159:2850::1 and 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 respectively.

IPv6 neighbor discovery check
dpdk-freerouter#show ipv6 neighbors sdn1                                        
mac             address                                time      static  router
0024.d4a0.0cd3  2a01:e0a:159:2850::1                   00:00:39  false   false    <----- LAN gateway
e03f.496d.1899  2a01:e0a:159:2850:e23f:49ff:fe6d:1899  00:00:39  false   false    <----- Host server
0024.d4a0.0cd3  fe80::224:d4ff:fea0:cd3                00:00:39  false   false    <----- Link local LAN gateway EUI64 IPv6 address
e03f.496d.1899  fe80::e23f:49ff:fe6d:1899              00:00:39  false   false    <----- Link local host server IPv6 address
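The link-local entries above are derived from the MAC addresses using the modified EUI-64 scheme: the universal/local bit of the first byte is flipped and ff:fe is inserted in the middle. As an illustration, the derivation can be sketched as follows. The function name mac_to_link_local is hypothetical, and the sketch does not compress zero groups beyond the fe80:: prefix.

```shell
#!/bin/bash
# Hypothetical helper: derives the EUI-64 based IPv6 link-local address
# from a MAC address written as in the tables above (e.g. 0024.d4a0.0cd3).
mac_to_link_local() {
    local hex b0 g1 g2 g3 g4
    # keep only the 12 hex digits, lowercased
    hex=$(printf '%s' "$1" | tr -d '.:-' | tr 'A-F' 'a-f')
    [ "${#hex}" -eq 12 ] || return 1
    # flip the universal/local bit (0x02) of the first byte
    b0=$(( 0x$(printf '%s' "$hex" | cut -c1-2) ^ 0x02 ))
    g1=$(( (b0 << 8) | 0x$(printf '%s' "$hex" | cut -c3-4) ))
    # insert ff:fe between the third and fourth bytes
    g2=$(( (0x$(printf '%s' "$hex" | cut -c5-6) << 8) | 0xff ))
    g3=$(( 0xfe00 | 0x$(printf '%s' "$hex" | cut -c7-8) ))
    g4=$(( 0x$(printf '%s' "$hex" | cut -c9-12) ))
    # %x drops the leading zeros of each 16-bit group
    printf 'fe80::%x:%x:%x:%x\n' "$g1" "$g2" "$g3" "$g4"
}
```

For instance, mac_to_link_local 0024.d4a0.0cd3 gives fe80::224:d4ff:fea0:cd3, which matches the LAN gateway entry in the table above.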
Initiate IPv4 ssh from freerouter -> Host server
dpdk-freerouter#ssh 192.168.0.62 /vrf v1 /user my-nas                          
 - connecting to 192.168.0.62 22
password: *******
                
 - securing connection

Last login: Tue Jul  7 17:40:55 2020 from 2a01:e0a:159:2850::666
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS%


Initiate IPv6 ssh from freerouter -> Host server
dpdk-freerouter#ssh 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1 /user my-nas 
 - connecting to 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 22
password: *******
                
 - securing connection

Last login: Wed Jul  8 11:28:32 2020 from 192.168.0.131
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS% 
freeRouter p4dpdk hardware statistics
dpdk-freerouter#sh int sdn1 hw
  hwcounters     - hardware counters
  hwdrhistory    - hardware historic drop byte counters
  hwdrphistory   - hardware historic drop packet counters
  hwhistory      - hardware historic byte counters
  hwnumhist      - hardware numeric historic byte counters
  hwnumphist     - hardware numeric historic packet counters
  hwphistory     - hardware historic packet counters
  hwrates        - hardware traffic rates
  hwrealtime     - hardware realtime counters
  hwrxhistory    - hardware historic rx byte counters
  hwrxphistory   - hardware historic rx packet counters
  hwtxhistory    - hardware historic tx byte counters
  hwtxphistory   - hardware historic tx packet counters

dpdk-freerouter#show interfaces sdn1 hwrates                                    
       packet         byte
time   tx  rx   drop  tx     rx      drop
1sec   5   20   0     1498   4668    0
1min   39  104  0     48056  56745   0
1hour  31  174  0     10162  137481  0

dpdk-freerouter#show interfaces sdn1 hwhistory                                  
        217k|                                                            
        195k|                            #                               
        173k|                            #                               
        151k|   #                        #                            #  
        130k| # #           #          # #                 #          #  
        108k| # #           #          # #        #        #   #      #  
         86k| # #           #  #     # ###   #    #        #  ##      ## 
         65k| # # #    # #  #  #     # ##### ## # #        #####    # ## 
         43k|## # #### # ####  ### # ########## ###### #   ##### # ##### 
         21k|## ###### # ##### ##### ########## ######################## 
           0|########################################################### 
         bps|0---------10--------20--------30--------40--------50-------- seconds

         43m|                                                            
         39m| *                                                          
         34m| *                                                          
         30m| *                                         *                
         26m| *                                         *                
         21m| *                                         *                
         17m| *                                         *                
         13m| *                                         *       *        
       8684k| *                                    *  * *  *    *        
       4342k| *       **                           *  * * ** * **      * 
           0|########################################################### 
         bps|0---------10--------20--------30--------40--------50-------- minutes

         70m|                                                            
         63m| * *                                                        
         56m| * *                                                        
         49m| * *                                                        
         42m| * *                                                        
         35m| * *                                                        
         28m| * *                                                        
         21m|** *                                                        
         14m|** *                                                        
       7017k|****                                                        
           0|##*#                                                        
         bps|0---------10--------20--------30--------40--------50-------- hours


Conclusion

In this article you:

  • had a demonstration of how to integrate freeRouter into a local area network (similar to article #002)
  • this time, however, instead of a P4Emu/pcap dataplane, we used a P4Emu/dpdk dataplane
  • communication between the freeRouter control plane and P4Emu/dpdk is ensured by pcapInt via the veth pair [ veth0a - veth0b ]
  • in this example, freeRouter with P4Emu/dpdk has only 1 dataplane interface, bound to the enp0s3 VM interface, which is exposed to the local network as a bridged interface

[ #006 ] RARE/FreeRouter-101 - key take-away

  • FreeRouter uses UNIX sockets to forward the packets dedicated to control plane + dataplane communication.

This essential paradigm is used to ensure communication between freeRouter and the P4Emu/dpdk dataplane. It is ensured by the pcapInt binary from freeRouter net-tools, which binds the freeRouter socket (localhost:20001) to the veth0a virtual network interface (pcapInt socket localhost:20002). Its peer, veth0b, is connected to CPU_PORT 1.

  • freeRouter is the control plane for P4Emu/dpdk dataplane

freeRouter does all the control plane route computation and sends write/modify/remove messages; P4 entries are created/modified/removed accordingly in the P4Emu/dpdk tables. Although the name is P4Emu, it does not emulate BMv2 V1Model.p4, but rather router.p4.

  • dpdk port_id allocation

dpdk port_id allocation follows the pci_id ordering, starting from id 0. p4dpdk.bin is invoked with the parameter: (number_of_dpdk_port - 1) + 1 <--- CPU_PORT, that is, the last dpdk data port_id plus one.

  • In this setup, the combination of freeRouter/P4Emu/dpdk delivers a solution for small campus networks with 10GE links (100GE links to be validated)

dpdk removes the kernel calls otherwise needed for each packet processed. In this configuration, packet processing is offloaded to user space, reducing kernel intervention to ~ 0%. Congratulations, you now have a hardware NIC assisted forwarding system!

In a subsequent article we will see how this setup behaves with a DELL 640 server powered by Intel(R) Xeon(R) Gold 6138 CPU x 2 and equipped with a Mellanox ConnectX-5 EX Dual Port 100GbE QSFP28 PCIe Adapter Low Profile card. We will also see how to connect this server to a P4 switch, BF2556X-1T. So stay tuned!