In article #005 you learned how RARE/freeRouter controls a P4Emu/pcap dataplane. We also demonstrated that this setup can be integrated into real networks.

Requirement

  • Basic Linux/Unix knowledge
  • Basic networking knowledge

Overview

Though P4Emu/pcap can be used for SOHO and can handle n x 1GE of traffic, this comes at the cost of a high CPU load and thus higher power consumption.

"Why write yet another software dataplane when freeRouter already has a working native software dataplane?"

The partial answer to the question raised in the previous article was:

"decoupling control plane from the dataplane"

We learned that P4Emu:

  • is able to understand the very same strict control messages from freeRouter as a P4 dataplane does
  • is able to switch packets, emulating router.p4, using libpcap as the packet forwarding backend.

However, even though libpcap is a performant packet processing library, the kernel is still heavily solicited: the higher the traffic rate, the higher the CPU workload becomes.

Article objective

In this article we'll use the freeRouter setup deployed in #005 and replace the P4Emu/pcap dataplane with the P4Emu/dpdk dataplane.

The Data Plane Development Kit (DPDK) is an Open source software project managed by the Linux Foundation. It provides a set of data plane libraries and network interface controller polling-mode drivers for offloading TCP packet processing from the operating system kernel to processes running in user space. This offloading achieves higher computing efficiency and higher packet throughput than is possible using the interrupt-driven processing provided in the kernel.


It is important to note that, despite what its name suggests, P4Emu/dpdk does not emulate V1Model. P4Emu emulates the router.p4 packet processing logic and uses a packet forwarding library to transmit packets arriving on a specific ingress port to the right egress port, as defined by freeRouter control plane messages. In this case, however, packet processing is offloaded from the kernel to user space. As a consequence, with a dpdk-compatible NIC and driver, very high traffic rates can be reached. DPDK is not available on all hardware; please refer to the DPDK HCL.


Diagram

[ #006 ] - Cookbook

In our example we will use Ubuntu focal, as we need dpdk 19.11.1 (the latest version at the time of writing is 20.05.0), and we add a bridged network interface to our laptop's RJ45 connection.



apt-get update
apt-get upgrade
apt-get install dpdk dpdk-dev --no-install-recommends


ip addr flush enp0s3


You can add a second Host-only interface (enp0s8) in VirtualBox in order to reach the Ubuntu focal VM guest, as you might lose your connection once enp0s3 is flushed.


#!/bin/bash
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 64 > /proc/sys/vm/nr_hugepages
modprobe uio_pci_generic
dpdk-devbind.py -b uio_pci_generic 00:03.0
ip link add veth0a type veth peer name veth0b
ip link set veth0a up
ip link set veth0b up
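
As a rough sanity check on the hugepage reservation in the script above, the sketch below computes how much memory a given number of hugepages pins. The function name hugepage_total_mb is hypothetical, and the 2048 kB page size is an assumption (the usual default on x86_64); the actual value is reported as Hugepagesize in /proc/meminfo.

```shell
#!/bin/bash
# Hypothetical helper: memory (in MB) pinned by a hugepage reservation.
# $1 = number of hugepages, $2 = hugepage size in kB
hugepage_total_mb() {
  echo $(( $1 * $2 / 1024 ))
}

# Assumption: 2048 kB hugepages unless /proc/meminfo reports otherwise.
size_kb=2048
if [ -r /proc/meminfo ]; then
  detected=$(awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo)
  if [ -n "$detected" ]; then
    size_kb=$detected
  fi
fi

echo "64 hugepages of ${size_kb} kB reserve $(hugepage_total_mb 64 "$size_kb") MB"
```

With 2048 kB pages, the 64 hugepages written to /proc/sys/vm/nr_hugepages correspond to 128 MB of memory set aside for DPDK.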


dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:00:03.0 '82540EM Gigabit Ethernet Controller 100e' drv=uio_pci_generic unused=e1000,vfio-pci

Network devices using kernel driver
===================================
0000:00:08.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s8 drv=e1000 unused=vfio-pci,uio_pci_generic *Active*

No 'Baseband' devices detected
==============================

No 'Crypto' devices detected
============================

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
===================================
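
To check the binding from a script rather than by eye, the status listing above can be filtered for one PCI address. This is a minimal sketch: bound_driver is a hypothetical helper that relies only on the drv= field visible in the output above, and it reads the listing from standard input so that it can be tried on saved output.

```shell
#!/bin/bash
# Hypothetical helper: print the driver bound to a PCI device, reading
# `dpdk-devbind.py --status` output on stdin.
# $1 = PCI address or a suffix of it, e.g. 00:03.0
bound_driver() {
  awk -v dev="$1" '
    index($1, dev) {                      # first field holds the PCI address
      for (i = 2; i <= NF; i++)
        if ($i ~ /^drv=/) { print substr($i, 5); exit }
    }'
}

# Sample lines copied from the status output above.
status="0000:00:03.0 '82540EM Gigabit Ethernet Controller 100e' drv=uio_pci_generic unused=e1000,vfio-pci
0000:00:08.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s8 drv=e1000 unused=vfio-pci,uio_pci_generic *Active*"

printf '%s\n' "$status" | bound_driver 00:03.0
printf '%s\n' "$status" | bound_driver 00:08.0
```

On a live system the same helper could be fed directly, for instance with dpdk-devbind.py --status | bound_driver 00:03.0, which should report uio_pci_generic once the bind step has succeeded.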


mkdir -p ~/freeRouter/bin ~/freeRouter/lib ~/freeRouter/etc ~/freeRouter/log
cd ~/freeRouter/lib
wget http://freerouter.nop.hu/rtr.jar


tree freeRouter
freeRouter
├── bin   # binary files      
├── etc   # configuration files      
├── lib   # library files      
└── log   # log files      




wget http://www.freertr.net/rtr-`uname -m`.tar -O rtr.tar


tar xvf rtr.tar -C ~/freeRouter/bin/


For those who would like to rebuild these binaries, the compilation shell script can be found in the cloned freeRouter git repository at: ~/freeRouter/src/native/c.sh
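
Before moving on, it can be worth confirming that the two native binaries referenced later in this article are actually in place. The sketch below assumes that the archive unpacks pcapInt.bin and p4dpdk.bin directly into the target directory, which the source does not state explicitly; missing_binaries is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print every name from $2.. that is not an
# executable file inside directory $1.
missing_binaries() {
  local dir=$1 name
  shift
  for name in "$@"; do
    [ -x "$dir/$name" ] || echo "$name"
  done
}

# Assumption: the tarball was extracted into ~/freeRouter/bin as shown above.
missing=$(missing_binaries "$HOME/freeRouter/bin" pcapInt.bin p4dpdk.bin)
if [ -n "$missing" ]; then
  echo "missing binaries: $missing"
else
  echo "all expected binaries are present"
fi
```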



freeRouter uses 2 configuration files in order to run. Let's write these configuration files for R1 in ~/freeRouter/etc

hwid hp
! cpu_port
int eth0 eth - 127.0.0.1 20001 127.0.0.1 20002
! freerouter control port for message
tcp2vrf 9080 v1 9080
! freerouter cli
tcp2vrf 2323 v1 23
! launch a process called "veth0" that actually link to veth0b
! cmd: ip link add veth0a type veth peer name veth0b
proc veth0 /root/freertr/bin/pcapInt.bin veth0a 20002 127.0.0.1 20001 127.0.0.1
proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 1
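
The int eth0 line and the proc veth0 line above describe the two ends of one local socket pair, so their port numbers have to mirror each other. The sketch below checks this on a hardware file. It assumes the argument order "local address, local port, remote address, remote port" for the int line and "interface, local port, remote address, remote port, local address" for pcapInt.bin; this order is inferred from the example above rather than taken from freeRouter documentation, and check_socket_pair is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: verify that the freeRouter interface socket and the
# pcapInt process point at each other. Reads a hardware file on stdin.
check_socket_pair() {
  awk '
    $1 == "int"  && $3 == "eth"    { rtr_local = $6; rtr_remote = $8 }
    $1 == "proc" && $3 ~ /pcapInt/ { cap_local = $5; cap_remote = $7 }
    END {
      if (rtr_local != "" && rtr_local == cap_remote && rtr_remote == cap_local) {
        print "ok: freeRouter " rtr_local " <-> pcapInt " cap_local
      } else {
        print "mismatch: freeRouter " rtr_local "/" rtr_remote " pcapInt " cap_local "/" cap_remote
        exit 1
      }
    }'
}

# The two relevant lines from the hardware file above.
check_socket_pair <<'EOF'
int eth0 eth - 127.0.0.1 20001 127.0.0.1 20002
proc veth0 /root/freertr/bin/pcapInt.bin veth0a 20002 127.0.0.1 20001 127.0.0.1
EOF
```

The function returns a non-zero status on a mismatch, so it can gate the router start in a wrapper script.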


Let's spend some time on this hardware configuration file. As you might have noticed, it contains additional lines worth mentioning:

  • proc <process-name>

freeRouter can launch processes at startup. We use this feature here to start the control plane / dataplane communication over the veth pair (veth0a and veth0b), and also to start p4dpdk.bin, the P4Emu/dpdk packet processing backend.

  • proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 1

In dpdk, interfaces are by default given port_ids that are allocated sequentially, in their order of appearance in the dpdk-devbind.py --status output, which is usually sorted by pci_id. In the previous output, interface enp0s3 has port_id #0. In dpdk, veth0b (the CPU_PORT) always gets the last port_id, right after the dpdk data port_ids, so here it is 1. If, for example, we dedicated enp0s3, enp0s8, enp0s9 and enp0s10 in VirtualBox, the command would have been:

proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 4

enp0s3 would be: #0 with pci_id: 00:03.0

enp0s8 would be: #1 with pci_id: 00:08.0

enp0s9 would be: #2 with pci_id: 00:09.0

enp0s10 would be: #3 with pci_id: 00:0a.0
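
The numbering rule above can be sketched as a small script: sort the PCI addresses bound to dpdk, number them from 0, and give the CPU_PORT the next free port_id, which is also the last argument passed to p4dpdk.bin. This assumes the usual case described above, where ports appear in pci_id order; assign_port_ids is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: number PCI addresses in sorted order, then append
# the CPU_PORT with the next free port_id.
# Note: a plain byte-wise sort is only valid for addresses of equal width
# written in lowercase hex, as in the example of this article.
assign_port_ids() {
  printf '%s\n' "$@" | LC_ALL=C sort |
    awk '{ printf "%d %s\n", NR - 1, $0 }
         END { printf "%d CPU_PORT\n", NR }'
}

# The four-interface VirtualBox example from the text, given out of order.
assign_port_ids 00:0a.0 00:03.0 00:09.0 00:08.0
```

For the four-interface example, the last line printed is "4 CPU_PORT", which matches the final argument of the proc p4emu line shown above; with the single interface 00:03.0 it is "1 CPU_PORT".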


hostname dpdk-freerouter
buggy
!
!
vrf definition v1
 rd 1:1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth0a]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@P4_CPU_PORT[enp0s3]
 mtu 1500
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet telnet
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 0
 interconnect ethernet0
 vrf v1
 exit
!
!
end




java -jar lib/rtr.jar routersc dpdk-focal-1-hw.txt dpdk-focal-1-sw.txt
info cfg.cfgInit.doInit:cfgInit.java:556 booting
info cfg.cfgInit.doInit:cfgInit.java:680 initializing hardware
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 2:! cpu_port
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 4:! freerouter control port for message
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 6:! freerouter cli
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 8:! launch a process called "veth0" that actually link to veth0b
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 9:! cmd: ip link add veth0a type veth peer name veth0b
info cfg.cfgInit.doInit:cfgInit.java:687 applying defaults
info cfg.cfgInit.doInit:cfgInit.java:695 applying configuration
info cfg.cfgInit.doInit:cfgInit.java:721 done
welcome
line ready
dpdk-freertr-1#


Verification


root@focal-1:~# telnet 127.0.0.1 2323
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
welcome
line ready
dpdk-freerouter#  




dpdk-freerouter#term len 0                                                      
dpdk-freerouter#sh run                                                          
hostname dpdk-freerouter
buggy
!
!
vrf definition v1
 rd 1:1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth0a]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@P4_CPU_PORT[enp0s3]
 mtu 1500
 macaddr 0078.5223.343c
 lldp enable
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet telnet
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 0
 interconnect ethernet0
 vrf v1
 exit
!
!
end


dpdk-freerouter#show interfaces summary                                         
interface  state  tx     rx       drop
ethernet0  up     43567  8727278  0
sdn1       up     42659  8675606  0




dpdk-freerouter#ping 192.168.0.254 /vrf v1                                      
pinging 192.168.0.254, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=1/1/1/6


dpdk-freerouter#ping 192.168.0.62 /vrf v1                                       
pinging 192.168.0.62, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005


Note the loss of the first ICMP packet towards 192.168.0.62, which triggered ARP learning. Both 192.168.0.254 and 192.168.0.62 are then present in the ARP table.


dpdk-freerouter#sh ipv4 arp sdn1                                                
mac             address        time      static
e03f.496d.1899  192.168.0.62   00:00:24  false    <----- Host server
0024.d4a0.0cd3  192.168.0.254  00:00:24  false    <----- LAN gateway


dpdk-freerouter#ping 2a01:e0a:159:2850::1 /vrf v1                               
pinging 2a01:e0a:159:2850::1, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005


dpdk-freerouter#ping 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1              
pinging 2a01:e0a:159:2850:e23f:49ff:fe6d:1899, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1006


Note the loss of the first ICMP packet, which triggered IPv6 neighbor discovery for 2a01:e0a:159:2850::1 and 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 respectively.


dpdk-freerouter#show ipv6 neighbors sdn1                                        
mac             address                                time      static  router
0024.d4a0.0cd3  2a01:e0a:159:2850::1                   00:00:39  false   false    <----- LAN gateway
e03f.496d.1899  2a01:e0a:159:2850:e23f:49ff:fe6d:1899  00:00:39  false   false    <----- Host server
0024.d4a0.0cd3  fe80::224:d4ff:fea0:cd3                00:00:39  false   false    <----- Link local LAN gateway EUI64 IPv6 address
e03f.496d.1899  fe80::e23f:49ff:fe6d:1899              00:00:39  false   false    <----- Link local host server IPv6 address
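
When this verification is scripted, the result= line printed by ping can be parsed so that a single lost packet, as expected while ARP or neighbor discovery resolves the target, is tolerated. This is a minimal sketch using hypothetical helpers ping_lost and check_ping, with a result line copied from the transcripts above.

```shell
#!/bin/bash
# Hypothetical helper: extract the number of lost packets from a
# freeRouter ping "result=" line read on stdin.
ping_lost() {
  sed -n 's|^result=.*recv/sent/lost=[0-9]*/[0-9]*/\([0-9]*\).*|\1|p'
}

# Hypothetical helper: accept at most one lost packet (the one spent on
# address resolution). $1 = the "result=" line.
check_ping() {
  local lost
  lost=$(printf '%s\n' "$1" | ping_lost)
  [ -n "$lost" ] && [ "$lost" -le 1 ]
}

line='result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005'
if check_ping "$line"; then
  echo "reachable, lost=$(printf '%s\n' "$line" | ping_lost)"
else
  echo "unreachable or too many losses"
fi
```

The threshold of one lost packet is a choice made for this sketch; a stricter check could rerun the ping once the neighbor entry exists and require zero losses.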


dpdk-freerouter#ssh 192.168.0.62 /vrf v1 /user my-nas                          
 - connecting to 192.168.0.62 22
password: *******
                
 - securing connection

Last login: Tue Jul  7 17:40:55 2020 from 2a01:e0a:159:2850::666
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS%


dpdk-freerouter#ssh 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1 /user my-nas 
 - connecting to 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 22
password: *******
                
 - securing connection

Last login: Wed Jul  8 11:28:32 2020 from 192.168.0.131
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS% 


dpdk-freerouter#sh int sdn1 hw
  hwcounters     - hardware counters
  hwdrhistory    - hardware historic drop byte counters
  hwdrphistory   - hardware historic drop packet counters
  hwhistory      - hardware historic byte counters
  hwnumhist      - hardware numeric historic byte counters
  hwnumphist     - hardware numeric historic packet counters
  hwphistory     - hardware historic packet counters
  hwrates        - hardware traffic rates
  hwrealtime     - hardware realtime counters
  hwrxhistory    - hardware historic rx byte counters
  hwrxphistory   - hardware historic rx packet counters
  hwtxhistory    - hardware historic tx byte counters
  hwtxphistory   - hardware historic tx packet counters

dpdk-freerouter#show interfaces sdn1 hwrates                                    
       packet         byte
time   tx  rx   drop  tx     rx      drop
1sec   5   20   0     1498   4668    0
1min   39  104  0     48056  56745   0
1hour  31  174  0     10162  137481  0

dpdk-freerouter#show interfaces sdn1 hwhistory                                  
        217k|                                                            
        195k|                            #                               
        173k|                            #                               
        151k|   #                        #                            #  
        130k| # #           #          # #                 #          #  
        108k| # #           #          # #        #        #   #      #  
         86k| # #           #  #     # ###   #    #        #  ##      ## 
         65k| # # #    # #  #  #     # ##### ## # #        #####    # ## 
         43k|## # #### # ####  ### # ########## ###### #   ##### # ##### 
         21k|## ###### # ##### ##### ########## ######################## 
           0|########################################################### 
         bps|0---------10--------20--------30--------40--------50-------- seconds

         43m|                                                            
         39m| *                                                          
         34m| *                                                          
         30m| *                                         *                
         26m| *                                         *                
         21m| *                                         *                
         17m| *                                         *                
         13m| *                                         *       *        
       8684k| *                                    *  * *  *    *        
       4342k| *       **                           *  * * ** * **      * 
           0|########################################################### 
         bps|0---------10--------20--------30--------40--------50-------- minutes

         70m|                                                            
         63m| * *                                                        
         56m| * *                                                        
         49m| * *                                                        
         42m| * *                                                        
         35m| * *                                                        
         28m| * *                                                        
         21m|** *                                                        
         14m|** *                                                        
       7017k|****                                                        
           0|##*#                                                        
         bps|0---------10--------20--------30--------40--------50-------- hours



Conclusion

Key takeaways from this article:

  • freeRouter uses UNIX sockets to forward the packets dedicated to control plane / dataplane communication.

This essential paradigm is used to ensure communication between freeRouter and the P4Emu/dpdk dataplane. It is ensured by the pcapInt binary from freeRouter net-tools, which binds the freeRouter socket (localhost:20001) to a virtual network interface (veth0a, reached via localhost:20002) whose veth peer, veth0b, is connected to CPU_PORT 1.

  • freeRouter is the control plane for P4Emu/dpdk dataplane

freeRouter performs all the control plane route computation and sends write/modify/remove messages; P4 entries are created/modified/removed accordingly in the P4Emu/dpdk tables. Although the name is P4Emu, it does not emulate BMv2 V1Model.p4, but rather router.p4.

  • dpdk port_id allocation

dpdk port_id allocation follows the pci_id ordering, starting from id 0. p4dpdk.bin is invoked with the CPU_PORT port_id as its last parameter: (number_of_dpdk_ports - 1) + 1, i.e. the port_id right after the last dpdk data port.

  • In this setup the combination of freeRouter/P4Emu/dpdk delivers a solution for small campus networks with 10GE links (100GE links to be validated)

dpdk removes the kernel intervention for each packet processed. In this configuration, packet processing is offloaded to user space, reducing kernel intervention to ~0%. Congratulations, you now have a hardware NIC assisted forwarding system!

In a subsequent article we will see how this setup behaves on a DELL 640 server powered by 2 x Intel(R) Xeon(R) Gold 6138 CPUs and equipped with a Mellanox ConnectX-5 EX Dual Port 100GbE QSFP28 PCIe Adapter Low Profile card. We will also see how to connect this server to a P4 switch, the BF2556X-1T. So stay tuned!