Blog from July, 2020

After following the P4Lang P4 for dummies [ #002 ] article, you should now have a working P4 development environment.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge


Overview

Let's start writing, compiling and running our first P4 program.

Article objective

This 3rd article walks you through writing your first P4 program, based on the my_program.p4 specification from P4Lang P4 for dummies [ #001 ]. 

Diagram: my_program.p4

[ #003 ] - Cookbook

P4 program specification

my_program.p4 packet processing logic: "all packets arriving at port 4 are switched/forwarded to port 8"

  • In this example, the switch has 8 ports
  • An ingress packet arrives at port 4
  • The ingress port is then checked
  • If it is port 4, then the packet is switched to port 8
  • my_program.p4 does not implement a default condition, so all packets not arriving on port 4 are dropped
  • Ingress packets arrive with a header whose characteristics were set by the previous node
  • If needed, my_program.p4 can modify the egress packet header for further processing by the next network node (for example, in-band network telemetry)
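Before writing any P4, the specification above can be modelled in a few lines of Python. This is only a sketch to make the intended behaviour concrete: the function name `forward` and the constant `UNSET_EGRESS_SPEC` are illustrative assumptions, not part of P4 or BMv2.

```python
PORT_4 = 4
PORT_8 = 8

# Assumption: when the program never assigns an egress port, the value stays at 0
# (this matches the "Egress port is 0" line in the simple_switch log later on).
UNSET_EGRESS_SPEC = 0


def forward(ingress_port):
    """Return the egress port chosen for a packet received on ingress_port."""
    if ingress_port == PORT_4:
        return PORT_8
    # No default condition: the egress port is simply left untouched
    return UNSET_EGRESS_SPEC


for port in range(1, 9):
    print(f"ingress port {port} -> egress port {forward(port)}")
```

Only port 4 yields a usable egress port; every other ingress port keeps the unset value, which is how the "no default condition" bullet plays out in practice.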

Let's first create the P4 program environment:

my_program.p4
mkdir -p ~/my_program/bin ~/my_program/p4src ~/my_program/p4rt_python ~/my_program/build  
Where
tree -d my_program/
my_program/         <------- top folder            
├── bin             <------- executable binary folder            
├── build           <------- containing p4 compilation artefacts            
├── p4rt_python     <------- python/scapy folder containing test scripts            
└── p4src           <------- containing p4 code
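If you prefer to script the layout, the same folders can be created from Python. This is a minimal sketch that mirrors the mkdir command above; the helper name `create_layout` is hypothetical, and the demo runs in a temporary directory so that it does not touch your home folder.

```python
import tempfile
from pathlib import Path

# Same sub-folders as in the mkdir command
SUBDIRS = ("bin", "build", "p4rt_python", "p4src")


def create_layout(top):
    """Create the project sub-folders under top and return their paths."""
    created = []
    for name in SUBDIRS:
        path = Path(top) / name
        # parents/exist_ok behave like `mkdir -p`
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demo in a throw-away directory
with tempfile.TemporaryDirectory() as tmp:
    for folder in create_layout(Path(tmp) / "my_program"):
        print(folder.name)
```

To create the real layout you would call `create_layout(Path.home() / "my_program")`.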
~/my_program/p4src/my_program.p4
/*
 * P4 language version: P4_16 
 */

/*
 * include P4 core library 
 */
#include <core.p4>

/* 
 * include P4 v1model library implemented by simple_switch 
 */
#include <v1model.p4>

#define PORT_4 4 
#define PORT_8 8 


/*
 * egress_spec port encoded using 9 bits
 */ 
typedef bit<9>  nexthop_id_t;

/*
 * metadata type  
 */
struct metadata_t {
   nexthop_id_t nexthop_id;
}

/*
 * Our P4 program header structure 
 */
struct headers {
}

/*
 * V1Model PARSER
 */
parser prs_main(packet_in packet,
                out headers hdr,
                inout metadata_t md,
                inout standard_metadata_t std_md) {

   state start {
      transition select(std_md.ingress_port) {
         PORT_4: prs_port_4;
         default: accept;
      }
   }

   state prs_port_4 {
      md.nexthop_id = PORT_8;
      transition accept;     
   }
}

/*
 * V1Model CHECKSUM VERIFICATION 
 */
control ctl_verify_checksum(inout headers hdr, inout metadata_t metadata) {
   apply {
   }
}


/*
 * V1Model INGRESS
 */
control ctl_ingress(inout headers hdr,
                  inout metadata_t md,
                  inout standard_metadata_t std_md) {

   apply {
      if (std_md.ingress_port == PORT_4) {
         std_md.egress_spec = md.nexthop_id;
      } 
   }
}


/*
 * V1Model EGRESS
 */

control ctl_egress(inout headers hdr,
                 inout metadata_t md,
                 inout standard_metadata_t std_md) {
   apply {
   }
}

/*
 * V1Model CHECKSUM COMPUTATION
 */
control ctl_compute_checksum(inout headers hdr, inout metadata_t md) {
   apply {
   }
}

/*
 * V1Model DEPARSER
 */
control ctl_deprs(packet_out packet, in headers hdr) {
    apply {
        /*
         * emit hdr
         */
        packet.emit(hdr);
    }
}


/*
 * V1Model P4 Switch defined in v1model.p4
 */
V1Switch(
prs_main(),
ctl_verify_checksum(),
ctl_ingress(),
ctl_egress(),
ctl_compute_checksum(),
ctl_deprs()
) main;
Compilation of my_program.p4 using P4lang p4c
p4c --std p4-16 --target bmv2 --arch v1model -I ./include -o ./build --p4runtime-files ./build/my_program.json ./p4src/my_program.p4
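If you compile often, it can help to assemble this command line programmatically. The sketch below only builds the argument list, using exactly the options shown in the command above; the helper name `p4c_command` and its default folders are assumptions made for illustration.

```python
import shlex
from pathlib import Path


def p4c_command(source, build_dir="./build", include_dir="./include"):
    """Build the p4c argument list for a BMv2/v1model compilation."""
    name = Path(source).stem  # "my_program" for ./p4src/my_program.p4
    return [
        "p4c",
        "--std", "p4-16",
        "--target", "bmv2",
        "--arch", "v1model",
        "-I", include_dir,
        "-o", build_dir,
        "--p4runtime-files", f"{build_dir}/{name}.json",
        source,
    ]


cmd = p4c_command("./p4src/my_program.p4")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

From there, `subprocess.run(cmd, check=True)` would launch the compiler, provided p4c is installed and you are in the ~/my_program folder.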

Verification

my_program.p4 compilation artefacts in ./build
floui@ubi16:~/my_program$ ls -l build/
total 44
-rw-rw-r-- 1 floui floui  7532 Jul 24 14:23 my_program.json  <------ output used when launching bmv2
-rw-rw-r-- 1 floui floui 35462 Jul 24 14:23 my_program.p4i   <------ other usage (not taken into account by the example)

Create veth pair before ...

Before launching our BMv2 virtual switch, we need to create the veth pairs that will be bound to the P4 switch.

For that, we will reuse bash scripts from Andy Fingerhut's public GitHub repository:

veth pairs setup
cd ~/my_program/bin
wget https://raw.githubusercontent.com/jafingerhut/p4-guide/master/bin/veth_setup.sh
wget https://raw.githubusercontent.com/jafingerhut/p4-guide/master/bin/veth_teardown.sh
chmod u+x ./veth_setup.sh
chmod u+x ./veth_teardown.sh
sudo ./veth_setup.sh

ip link | grep veth
4: veth1@veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
6: veth3@veth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
7: veth2@veth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
8: veth5@veth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
9: veth4@veth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
10: veth7@veth6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
11: veth6@veth7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
12: veth9@veth8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
13: veth8@veth9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
14: veth11@veth10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
15: veth10@veth11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
16: veth13@veth12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
17: veth12@veth13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
18: veth15@veth14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
19: veth14@veth15: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
20: veth17@veth16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
21: veth16@veth17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
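To check the result without reading the listing line by line, the output of `ip link` can be reduced to a list of veth pairs. This is a small sketch: the sample text is copied from the output above, and the helper names `veth_index` and `veth_pairs` are hypothetical.

```python
import re

# Four lines copied from the `ip link | grep veth` output above
SAMPLE = """\
4: veth1@veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
5: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
6: veth3@veth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
7: veth2@veth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
"""

# "<index>: <name>@<peer>:" at the start of each line
LINK_RE = re.compile(r"^\d+:\s+(veth\d+)@(veth\d+):", re.MULTILINE)


def veth_index(name):
    """Numeric suffix of a veth interface name, e.g. 'veth16' -> 16."""
    return int(name[len("veth"):])


def veth_pairs(ip_link_output):
    """Return the sorted list of (lower, higher) veth pairs found in the text."""
    pairs = set()
    for name, peer in LINK_RE.findall(ip_link_output):
        # Each pair shows up twice (once per end), so normalise the order
        pairs.add(tuple(sorted((name, peer), key=veth_index)))
    return sorted(pairs, key=lambda pair: veth_index(pair[0]))


print(veth_pairs(SAMPLE))
```

Fed with the full listing, the same function should report nine pairs, from veth0/veth1 up to veth16/veth17.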

We can now launch BMv2 simple_switch and bind it to 8 of the veth pairs we just configured

start bmv2 simple_switch (load my_program.json)
sudo simple_switch --log-console -i 1@veth2 -i 2@veth4 -i 3@veth6 -i 4@veth8 -i 5@veth10 -i 6@veth12 -i 7@veth14 -i 8@veth16 ./build/my_program.json
Calling target program-options parser
[14:28:41.364] [bmv2] [D] [thread 15917] Set default default entry for table 'tbl_my_program76': my_program76 - 
Adding interface veth2 as port 1
[14:28:41.364] [bmv2] [D] [thread 15917] Adding interface veth2 as port 1
Adding interface veth4 as port 2
[14:28:41.415] [bmv2] [D] [thread 15917] Adding interface veth4 as port 2
Adding interface veth6 as port 3
[14:28:41.455] [bmv2] [D] [thread 15917] Adding interface veth6 as port 3
Adding interface veth8 as port 4
[14:28:41.503] [bmv2] [D] [thread 15917] Adding interface veth8 as port 4
Adding interface veth10 as port 5
[14:28:41.547] [bmv2] [D] [thread 15917] Adding interface veth10 as port 5
Adding interface veth12 as port 6
[14:28:41.587] [bmv2] [D] [thread 15917] Adding interface veth12 as port 6
Adding interface veth14 as port 7
[14:28:41.635] [bmv2] [D] [thread 15917] Adding interface veth14 as port 7
Adding interface veth16 as port 8
[14:28:41.683] [bmv2] [D] [thread 15917] Adding interface veth16 as port 8
[14:28:41.727] [bmv2] [I] [thread 15917] Starting Thrift server on port 9090
[14:28:41.728] [bmv2] [I] [thread 15917] Thrift server was started
...
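The `-i` options in the launch command follow a regular pattern: switch port N is bound to interface veth(2N). The sketch below regenerates those options from that pattern; `veth_for_port` and `interface_args` are hypothetical helper names, and the 2N rule is inferred from the command above rather than taken from the BMv2 documentation.

```python
def veth_for_port(port):
    """Interface bound to a switch port in this setup (port N -> veth2N)."""
    return f"veth{2 * port}"


def interface_args(num_ports=8):
    """Return the simple_switch '-i <port>@<interface>' options as a list."""
    args = []
    for port in range(1, num_ports + 1):
        args += ["-i", f"{port}@{veth_for_port(port)}"]
    return args


print(" ".join(interface_args()))
```

This also gives a quick way to find which interface to watch with tcpdump for a given port, for example `veth_for_port(4)` for PORT_4.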
tcpdump veth8 (port 4)
sudo tcpdump -i veth8
...
tcpdump veth16 (port 8)
sudo tcpdump -i veth16
...

Now you need to find a way to:

  • send a packet to simple_switch@PORT_4 (veth8)
  • send another packet to simple_switch@PORT_1 (veth2)

We will use scapy for that:

scapy installation as root
pip3 install --pre scapy[complete]

Run scapy with sufficient privileges to send packets on a specific interface

sudo scapy3
/usr/lib/python3/dist-packages/IPython/utils/module_paths.py:29: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp
                                      
                     aSPY//YASa       
             apyyyyCY//////////YCa       |
            sY//////YSpcs  scpCY//Pp     | Welcome to Scapy
 ayp ayyyyyyySCP//Pp           syY//C    | Version 2.4.3~bionic
 AYAsAYYYYYYYY///Ps              cY//S   |
         pCCCCY//p          cSSps y//Y   | https://github.com/secdev/scapy
         SPPPP///a          pP///AC//Y   |
              A//A            cyP////C   | Have fun!
              p///Ac            sC///a   |
              P////YCpc           A//A   | Craft packets like I craft my beer.
       scccccp///pSP///p          p//Y   |               -- Jean De Clerck
      sY/////////y  caa           S//P   |
       cayCyayP//Ya              pY/Ya
        sY/PsY////YCc          aC//Yp 
         sc  sccaCY//PCypaapyCP//YSs  
                  spCPY//////YPSps    
                       ccaacs         
                                       using IPython 5.5.0
>>> 
From scapy prompt, send a packet to PORT_4 (veth8)
>>> sendp(IP(dst="1.2.3.4")/ICMP(),iface="veth8")
.
Sent 1 packets.
>>> 
Check tcpdump on veth8 (PORT_4)
floui@ubi16:~$ sudo tcpdump -i veth8
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth8, link-type EN10MB (Ethernet), capture size 262144 bytes
14:58:23.404299 00:00:40:01:9d:d2 (oui Unknown) > 45:00:00:1c:00:01 (oui Unknown), ethertype Unknown (0xc1e0), length 28: 
        0x0000:  1728 0102 0304 0800 f7ff 0000 0000       .(............ 
Check tcpdump on veth16 (PORT_8)
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth16, link-type EN10MB (Ethernet), capture size 262144 bytes
14:58:23.406042 00:00:40:01:9d:d2 (oui Unknown) > 45:00:00:1c:00:01 (oui Unknown), ethertype Unknown (0xc1e0), length 28: 
        0x0000:  1728 0102 0304 0800 f7ff 0000 0000       .(............ 

Conclusion

Congratulations!

You have successfully written and compiled your program, and loaded it onto the P4Lang P4 virtual switch! In addition, you checked that the logic of your program is implemented correctly by sending a packet to PORT_4 using the scapy Python 3 tool. You then verified with tcpdump that your packet entered the P4 switch via PORT_4 and left via PORT_8, as expected.
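You may have noticed that tcpdump printed odd MAC addresses and an "ethertype Unknown" for our packet. A likely explanation is that sendp() works at layer 2 while the packet was built as IP()/ICMP() without an Ethernet layer, so tcpdump reads the first 14 bytes of the IP header as if they were an Ethernet header. The sketch below reassembles the 28 bytes exactly as tcpdump displayed them and decodes them as IPv4; the function name `decode_ipv4` is hypothetical.

```python
import ipaddress
import struct

# Bytes as printed by tcpdump on veth8, in the order they appear on the wire
captured = bytes.fromhex(
    "4500001c0001"                  # shown as destination MAC 45:00:00:1c:00:01
    "000040019dd2"                  # shown as source MAC 00:00:40:01:9d:d2
    "c1e0"                          # shown as ethertype 0xc1e0
    "1728010203040800f7ff00000000"  # the 14 bytes of the hexdump line
)


def decode_ipv4(raw):
    """Decode the fixed 20-byte IPv4 header found at the start of raw."""
    (ver_ihl, _tos, total_length, _ident, _frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_words": ver_ihl & 0x0F,
        "total_length": total_length,
        "ttl": ttl,
        "protocol": protocol,  # 1 means ICMP
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


fields = decode_ipv4(captured)
for key, value in fields.items():
    print(f"{key}: {value}")
```

Read this way, the frame is a plain IPv4 packet of 28 bytes carrying ICMP to 1.2.3.4, which matches what we sent. Prepending an Ether() layer in scapy should give tcpdump a conventional frame to decode, but it is not needed for this test since my_program.p4 only looks at the ingress port.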

What happens to packets arriving on a port other than PORT_4?

Let's find out by sending an ingress packet to PORT_1 (veth2) of the switch.

From scapy prompt, send a packet to PORT_1 (veth2)
>>> sendp(IP(dst="1.2.3.4")/ICMP(),iface="veth2")
.
Sent 1 packets.
>>> 

In that case we don't know what the egress port is, so let's look at the simple_switch console.

simple_switch console
floui@ubi16:~/my_program$ sudo simple_switch --log-console -i 1@veth2 -i 2@veth4 -i 3@veth6 -i 4@veth8 -i 5@veth10 -i 6@veth12 -i 7@veth14 -i 8@veth16 ./build/my_program.json
Calling target program-options parser
[15:10:55.525] [bmv2] [D] [thread 16129] Set default default entry for table 'tbl_my_program76': my_program76 - 
Adding interface veth2 as port 1
[15:10:55.525] [bmv2] [D] [thread 16129] Adding interface veth2 as port 1
Adding interface veth4 as port 2
[15:10:55.555] [bmv2] [D] [thread 16129] Adding interface veth4 as port 2
Adding interface veth6 as port 3
[15:10:55.603] [bmv2] [D] [thread 16129] Adding interface veth6 as port 3
Adding interface veth8 as port 4
[15:10:55.651] [bmv2] [D] [thread 16129] Adding interface veth8 as port 4
Adding interface veth10 as port 5
[15:10:55.691] [bmv2] [D] [thread 16129] Adding interface veth10 as port 5
Adding interface veth12 as port 6
[15:10:55.739] [bmv2] [D] [thread 16129] Adding interface veth12 as port 6
Adding interface veth14 as port 7
[15:10:55.791] [bmv2] [D] [thread 16129] Adding interface veth14 as port 7
Adding interface veth16 as port 8
[15:10:55.839] [bmv2] [D] [thread 16129] Adding interface veth16 as port 8
[15:10:55.879] [bmv2] [I] [thread 16129] Starting Thrift server on port 9090
[15:10:55.880] [bmv2] [I] [thread 16129] Thrift server was started
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Processing packet received on port 1
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser 'parser': start
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser 'parser' entering state 'start'
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser state 'start': key is 0001
[15:11:00.449] [bmv2] [T] [thread 16135] [0.0] [cxt 0] Bytes parsed: 0
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Parser 'parser': end
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Pipeline 'ingress': start
[15:11:00.450] [bmv2] [T] [thread 16135] [0.0] [cxt 0] ./p4src/my_program.p4(75) Condition "std_md.ingress_port == 4" (node_2) is false
[15:11:00.450] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Pipeline 'ingress': end
[15:11:00.450] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Egress port is 0
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Pipeline 'egress': start
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Pipeline 'egress': end
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Deparser 'deparser': start
[15:11:00.450] [bmv2] [D] [thread 16136] [0.0] [cxt 0] Deparser 'deparser': end
[15:11:00.450] [bmv2] [D] [thread 16140] [0.0] [cxt 0] Transmitting packet of size 28 out of port 0

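Here the console shows "Egress port is 0". The ingress control never assigned std_md.egress_spec, so it kept its initial value of 0. Since our simple_switch command binds no interface to port 0, the packet transmitted "out of port 0" goes nowhere and is effectively dropped. (Port 0 is not a dedicated drop port: by default simple_switch uses port 511 for explicit drops such as mark_to_drop().)

Let's now resend a packet to PORT_4 and observe the simple_switch console log.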

simple_switch console
sudo simple_switch --log-console -i 1@veth2 -i 2@veth4 -i 3@veth6 -i 4@veth8 -i 5@veth10 -i 6@veth12 -i 7@veth14 -i 8@veth16 ./build/my_program.json
Calling target program-options parser
[15:14:51.047] [bmv2] [D] [thread 16151] Set default default entry for table 'tbl_my_program76': my_program76 - 
Adding interface veth2 as port 1
[15:14:51.048] [bmv2] [D] [thread 16151] Adding interface veth2 as port 1
Adding interface veth4 as port 2
[15:14:51.099] [bmv2] [D] [thread 16151] Adding interface veth4 as port 2
Adding interface veth6 as port 3
[15:14:51.139] [bmv2] [D] [thread 16151] Adding interface veth6 as port 3
Adding interface veth8 as port 4
[15:14:51.175] [bmv2] [D] [thread 16151] Adding interface veth8 as port 4
Adding interface veth10 as port 5
[15:14:51.207] [bmv2] [D] [thread 16151] Adding interface veth10 as port 5
Adding interface veth12 as port 6
[15:14:51.239] [bmv2] [D] [thread 16151] Adding interface veth12 as port 6
Adding interface veth14 as port 7
[15:14:51.271] [bmv2] [D] [thread 16151] Adding interface veth14 as port 7
Adding interface veth16 as port 8
[15:14:51.319] [bmv2] [D] [thread 16151] Adding interface veth16 as port 8
[15:14:51.347] [bmv2] [I] [thread 16151] Starting Thrift server on port 9090
[15:14:51.348] [bmv2] [I] [thread 16151] Thrift server was started
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Processing packet received on port 4
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser': start
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser' entering state 'start'
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser state 'start': key is 0004
[15:14:58.053] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Bytes parsed: 0
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser' entering state 'prs_port_4'
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser set: setting field 'scalars.userMetadata.nexthop_id' to 8
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser state 'prs_port_4' has no switch, going to default next state
[15:14:58.053] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Bytes parsed: 0
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Parser 'parser': end
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Pipeline 'ingress': start
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] ./p4src/my_program.p4(75) Condition "std_md.ingress_port == 4" (node_2) is true
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Applying table 'tbl_my_program76'
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Looking up key:

[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Table 'tbl_my_program76': miss
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Action entry is my_program76 - 
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] Action my_program76
[15:14:58.054] [bmv2] [T] [thread 16158] [0.0] [cxt 0] ./p4src/my_program.p4(76) Primitive std_md.egress_spec = md.nexthop_id
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Pipeline 'ingress': end
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Egress port is 8
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Pipeline 'egress': start
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Pipeline 'egress': end
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Deparser 'deparser': start
[15:14:58.054] [bmv2] [D] [thread 16159] [0.0] [cxt 0] Deparser 'deparser': end
[15:14:58.054] [bmv2] [D] [thread 16163] [0.0] [cxt 0] Transmitting packet of size 28 out of port 8

This confirms what tcpdump showed: a packet entering on PORT_4 is switched to PORT_8.
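When the console log gets long, the two lines that matter for each packet are "Processing packet received on port N" and "Egress port is M". The sketch below extracts them with regular expressions; the sample lines are copied from the two logs above, the function name `port_decisions` is hypothetical, and the pairing logic assumes packets are processed one at a time (no interleaved log lines).

```python
import re

# Lines copied from the two simple_switch console logs above
LOG = """\
[15:11:00.449] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Processing packet received on port 1
[15:11:00.450] [bmv2] [D] [thread 16135] [0.0] [cxt 0] Egress port is 0
[15:14:58.053] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Processing packet received on port 4
[15:14:58.054] [bmv2] [D] [thread 16158] [0.0] [cxt 0] Egress port is 8
"""

INGRESS_RE = re.compile(r"Processing packet received on port (\d+)")
EGRESS_RE = re.compile(r"Egress port is (\d+)")


def port_decisions(log_text):
    """Return a list of (ingress_port, egress_port) tuples found in the log."""
    decisions = []
    ingress = None
    for line in log_text.splitlines():
        match = INGRESS_RE.search(line)
        if match:
            ingress = int(match.group(1))
            continue
        match = EGRESS_RE.search(line)
        if match and ingress is not None:
            decisions.append((ingress, int(match.group(1))))
            ingress = None
    return decisions


for ingress_port, egress_port in port_decisions(LOG):
    print(f"port {ingress_port} -> port {egress_port}")
```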

Conclusion

In this article you:

  • wrote your first P4 program
  • used p4c to compile it
  • learned how to instantiate virtual ethernet pairs in order to bind them to simple_switch
  • launched simple_switch and loaded your program on it
  • set up a test environment using scapy
  • and verified your program using a combination of scapy and tcpdump

P4Lang P4 for dummies [ #003 ] - key take-away

  • my_program.p4 is written following the V1Model definition, which defines:
    • a parsing stage
    • a checksum verification stage
    • an ingress packet processing control stage
    • an egress packet processing control stage
    • a checksum computation stage
    • a deparser stage
V1model PISA model
V1Switch( prs_main(), ctl_verify_checksum(), ctl_ingress(), ctl_egress(), ctl_compute_checksum(), ctl_deprs() ) main; 

It is described by the diagram below:

In a subsequent article we will dissect my_program.p4, but as you could observe, P4 programming is quite intuitive: it is all about switching a packet based on the values of its header fields and of intrinsic metadata (like the packet ingress port).






In P4Lang P4 for dummies [ #001 ], you learned that a behavioural language gives you access to dataplane programming. 

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge


Overview

In order to start P4 programming, we will now set up a P4 development environment using the open source P4Lang P4 community software. 

Article objective

This article exposes how to install:

  • P4Lang PI
  • P4Lang BMv2
  • P4Lang p4c

Operating system supported

  • Debian 10 (stable aka buster)
  • Ubuntu 18.04 (Bionic beaver)
  • Ubuntu 20.04 (Focal fossa)

Note

You can of course use the distribution of your choice, as long as the operating system you are using has all the third-party dependencies required by P4Lang software, mainly:

  • protobuf
  • grpc
  • thrift
  • nanomsg
  • nnpy

You can find the full list here in Launchpad.

Diagram: 

[ #002 ] - Cookbook

In our example we will use the same Debian stable image (buster), installed as a VirtualBox VM, and we add a network interface bridged to our laptop's RJ45 connection.

add p4lang repository in /etc/apt/sources.list.d/p4.list
deb https://download.opensuse.org/repositories/home:/frederic-loui:/p4lang:/p4c:/master/Debian_10/ ./
add debian 10 repository key from download.opensuse.org
wget https://download.opensuse.org/repositories/home:/frederic-loui:/p4lang:/p4c:/master/Debian_10/Release.key
sudo apt-key add ./Release.key
install p4lang packages (just install p4c and it will install p4lang-pi and bmv2)
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install p4c

Note

Installing p4lang software with apt-get will download and install:

  • p4c
  • bmv2
  • p4lang-pi
add p4lang bionic 3rd party repository
sudo add-apt-repository ppa:frederic-loui/p4lang-3rd-party
sudo apt-get update
add p4lang bionic nightly build repository
sudo add-apt-repository ppa:frederic-loui/p4lang-master-bionic-nightly
sudo apt-get update
install p4lang packages (just install p4c and it will install p4lang-pi and bmv2)
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install p4c bmv2 p4lang-pi

Note

Installing p4lang software with apt-get will download and install:

  • p4lang-3rd-party (bionic)

alongside:

  • p4c
  • bmv2
  • p4lang-pi
add p4lang focal 3rd party repository
sudo add-apt-repository ppa:frederic-loui/p4lang-3rd-party-focal
sudo apt-get update
add p4lang focal nightly build repository
sudo add-apt-repository ppa:frederic-loui/p4lang-master-focal-nightly
sudo apt-get update
install p4lang packages (just install p4c and it will install p4lang-pi and bmv2)
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install p4c bmv2 p4lang-pi

Note

Installing p4lang software with apt-get will download and install:

  • p4lang-3rd-party (focal)

alongside:

  • p4c
  • bmv2
  • p4lang-pi

Verification

check p4lang packages installation on Debian
dpkg -l | grep p4lang
ii  bmv2                                   20200615~d447b6a~release~nightly-0+57.1 amd64        p4lang behavioral-model
ii  p4c                                    20200628~7c03f854~release~nightly-0     amd64        p4c p4lang project compiler
ii  p4lang-pi                              20200601~822a0d1~release~nightly-0+39.1 amd64        Implementation framework of a P4Runtime server
check p4lang packages installation on Ubuntu 18.04 (same for 20.04)
dpkg -l | grep p4lang
ii  bmv2                                     1.13.0-202006160902-d447b6a~ubuntu18.04.1    amd64        p4lang behavioral-model
ii  p4c                                      1.1.0-rc1-202006191103-3917a1c~ubuntu18.04.1 amd64        p4c p4lang project compiler
ii  p4lang-3rd-party                         1.1~bionic-1                                 all          This package installs 3rd party software needed by p4lang software
ii  p4lang-pi                                0.8-202006020517-822a0d1~ubuntu18.04.1       amd64        Implementation framework of a P4Runtime server
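The same check can be automated by parsing the dpkg listing. The sketch below uses the Debian output shown above as sample data and reports any required package that is not in the installed ("ii") state; the names `REQUIRED` and `installed_packages` are assumptions made for this example.

```python
# Sample copied from the Debian `dpkg -l | grep p4lang` output above
DPKG_OUTPUT = """\
ii  bmv2                                   20200615~d447b6a~release~nightly-0+57.1 amd64        p4lang behavioral-model
ii  p4c                                    20200628~7c03f854~release~nightly-0     amd64        p4c p4lang project compiler
ii  p4lang-pi                              20200601~822a0d1~release~nightly-0+39.1 amd64        Implementation framework of a P4Runtime server
"""

# Packages this article expects to find
REQUIRED = {"p4c", "bmv2", "p4lang-pi"}


def installed_packages(dpkg_text):
    """Return the names of packages whose dpkg status column is 'ii'."""
    names = set()
    for line in dpkg_text.splitlines():
        columns = line.split()
        if len(columns) >= 2 and columns[0] == "ii":
            names.add(columns[1])
    return names


missing = REQUIRED - installed_packages(DPKG_OUTPUT)
print("missing packages:", sorted(missing) if missing else "none")
```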
Clone RARE code from repository
cd ~/
git clone https://github.com/frederic-loui/RARE.git
compile RARE router.p4
cd ~/RARE/02-PE-labs/p4src
make build
mkdir -p ../build ../run/log
p4c --std p4-16 --target bmv2 --arch v1model \
        -I ./ -o ../build --p4runtime-files ../build/router.txt router.p4 
check RARE router.p4 compilation result:
ls -l ../build/
total 572
-rw-r--r-- 1 root root 448313 Jul 22 10:15 router.json
-rw-r--r-- 1 root root 100912 Jul 22 10:15 router.p4i
-rw-r--r-- 1 root root  32764 Jul 22 10:15 router.txt

Conclusion

In this article you learned how to set up a P4 development environment on:

  • Debian 10
  • Ubuntu 18.04
  • Ubuntu 20.04

And tested the installation by compiling RARE P4 code.


P4Lang P4 for dummies [ #002 ] - key take-away

  • P4Lang P4 development environment creation is easy
  • it uses P4Lang packages on Debian and Ubuntu
  • These packages are maintained by RARE project and are nightly built based on P4Lang official GitHub

In the next article we will:

  • compile my_program.p4
  • launch P4Lang virtual switch called simple_switch and load my_program.p4 on it
  • perform basic verification





While the "RARE/FreeRouter-101" series teaches you how to start using RARE/freeRouter, in the "P4Lang P4 for dummies" article series, you'll learn how to start programming with the P4Lang P4 language. As a reminder, P4 dataplane is a type of dataplane that can be coupled to RARE/freeRouter as it is described in article 101-#003 and 101-#004. The final objective of this article series is to help you compile the VERY FIRST RARE/freeRouter test case that is covering:

  • basic control plane communication between freeRouter and BMv2/simple_switch_grpc
  • and the simple packet_in/packet_out header used in this interface communication.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge


Overview

P4 is a language for programming the data plane of network devices. From p4.org web site:

«P4 is a domain-specific programming language for specifying the behaviour of the dataplanes of network-forwarding elements. »

Article objective

This 1st article exposes:

  • A brief introduction to the P4 language
  • A basic P4 development workflow
  • Some basic specificities of the P4 language

Note

This article is primarily a pure introduction to P4lang P4. It is in no way an extensive description of the programming language, nor a P4 compilation guide.

Diagram: P4 development workflow

[ #001 ] - Cookbook: P4 development workflow

Based on what we mentioned, what does the "P4 Domain specific language" give you ? Concretely:

  1. You can write a program as you would in C or C++ but you'd have to follow the P4 language specification. (The current one is P4_16, there is also a previous P4_14 specification.)
  2. That program is compiled with a p4 compiler in P4_16 or P4_14 (similar to C++14/C++11) 
  3. The resulting compilation artifacts can be then loaded into an equipment implementing a P4 model commonly called a P4 target that is able to interpret/run p4 binaries. Here we will be using BMv2, a softwarized P4 target intended for learning.


Take away

The specificities are:

  • This P4 program is YOUR program
  • This P4 program allows you to define YOUR OWN packet processing logic

In short, you can now program:

«how a packet that comes into your system, is processed and goes out your system»

Diagram packet processing description

The diagram above depicts 2 perspectives: 

  • P4 program development workflow
    • It starts by writing your P4 program using your favorite editor
    • compile your program with the P4 compiler of your target
    • load your program into the P4 target
  • my_program.p4 packet processing logic: "all packets arriving at port 4 are switched/forwarded to port 8"
    • In this example, the switch has 8 ports
    • An ingress packet arrives at port 4
    • The ingress port is then checked
    • If it is port 4, then the packet is switched to port 8
    • my_program.p4 does not implement a default condition, so all packets not arriving on port 4 are dropped
    • Ingress packets arrive with a header whose characteristics were set by the previous node
    • If needed, my_program.p4 can modify the egress packet header for further processing by the next network node (for example, in-band network telemetry)

Router for Academia Research & Education (RARE) & P4

The RARE project objective is to provide a networking solution for Research & Education institution use cases. We have witnessed the birth of several control planes such as GNU Zebra, Bird, exaBGP, etc. What these software projects have in common is that they cannot (yet) be coupled easily with a hardware dataplane. Simply put, these software control planes cannot be used without significant specific development in order to run on equipment able to forward nx100GE links at a high Mpps rate. 

There have been attempts with DPDK and other kernel bypass mechanisms that enabled higher throughput, but this is not comparable to the packet processing power of commercial/vendor equipment. 

P4:

  • opens you the door to software AND hardware dataplane programmability
  • gives you the possibility to implement YOUR own packet processing algorithm 

RARE control plane: freeRouter

In the RARE project, we are using a software control plane called freeRouter:

  • It is an open source control plane
  • It has been deployed since 2014 and benefits from many hours of production use in various environments
  • Interworking has been extensively and continuously tested with major equipment vendors
  • Last but not least freeRouter's maintainer is in the RARE team which allowed Rapid Application development and prototyping in order to build control plane and P4 dataplane communication.

P4 use cases are mostly inherently linked to the P4 target you plan to use in order to run your P4 program: 

A comprehensive list can be found here

  • P4Lang BMv2 V1Model target:

It is the P4Lang virtual model that emulates a PISA architecture. You can run it on a VM and start writing your first P4 program and load it on simple_switch and/or simple_switch_grpc (if you plan to use P4Runtime). While this is a great solution in order to learn P4 and sketch your packet processing algorithm, it is not recommended for production use.

  • INTEL/BAREFOOT TOFINO/TOFINO 2

This target also implements a PISA architecture and provides a virtual model so that you can validate your algorithm. Once validated on the virtual model, you can load your program into a hardware switch running an NPU called TOFINO or its bigger brother TOFINO2. While TOFINO is able to handle 6.4 Tbps of traffic, TOFINO2 doubles this (12.8 Tbps). In addition, TOFINO2 offers larger buffers, memory and TCAM than its little brother.

These are the use cases enabled by the combination of P4 and RARE software:

  • Service Provider core router:

You can build a robust packet switching fabric at the scale of Telecom Service Provider able to switch packets at n×100GE

  • Service Provider edge router: 

You can build an edge router and interconnect it with the core routers above. These routers will terminate your backbone network services such as L2/L3 plain IP or VPN services (IPv4 - IPv6)  

  • Datacenter ToR switch 

With the WEDGE100BF32X you can have 2x100GE uplinks toward 2 distinct "leaf switches", it leaves you 30x100GE server connections.

  • Datacenter Spine/Leaf switch 

The WEDGE100BF32X is also a good candidate router in the DC as a core/spine switch. You can create a fabric able to switch traffic at 6.4 Tbps

  • Internet Exchange

In this case, the WEDGE100BF32X is a 100GE peer aggregator or is simply integrated into the IXP distributed core fabric.

  • MAN/CPE router

The STORDIS BF2556X-1T with its flexible connectivity options is a good candidate for regional network implementation. It has 8x100GE ports: 2 of them can be used as uplinks toward the main transit provider and 2 others can be used to provide EAST/WEST connections via 2 different fibre routes, which leaves 4x100GE ports in case you need to increase capacity. The STORDIS also has 16x[1/10/25] GbE ports and 32x[10/25] GbE ports, which makes it possible to connect users at various access port bandwidths.
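The port budgets quoted in these use cases are simple subtractions, and a few lines of Python make them easy to re-run with your own numbers. The figures below are the ones given in the text (the 32 ports of the WEDGE100BF32X are implied by "2 uplinks leave 30 server connections"); the helper name `remaining_ports` is hypothetical.

```python
def remaining_ports(total, *reserved):
    """Ports left once each reserved group has been taken out of total."""
    used = sum(reserved)
    if used > total:
        raise ValueError(f"cannot reserve {used} ports out of {total}")
    return total - used


# Datacenter ToR: 2 uplinks out of 32x100GE
wedge_server_ports = remaining_ports(32, 2)

# MAN/CPE: 2 uplinks and 2 EAST/WEST links out of 8x100GE
stordis_spare_100ge = remaining_ports(8, 2, 2)

# MAN/CPE access side: 16x[1/10/25] GbE plus 32x[10/25] GbE
stordis_access_ports = 16 + 32

print(wedge_server_ports, stordis_spare_100ge, stordis_access_ports)
```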

Conclusion

In this article you:

  • had a brief introduction to the P4Lang P4 language
  • were given a 10,000-foot view of the P4 development workflow
  • were shown a list of P4 targets and the use cases enabled by these targets

P4Lang P4 for dummy [ #001 ] - key take-away

__THE__ exciting INNOVATION provided by P4 is that this community language unlocks the door to the system's dataplane for you. Until now, dataplane programming was reserved for commercial vendors. Some of these dataplanes, like the well-known CEF (Cisco Express Forwarding), are specific to Cisco equipment. Juniper has its own dataplane (not sure about the name), implemented by the Forwarding Plane component (for example in the vMX architecture).

P4 language inherent characteristics:

  • Behavioral programming language
  • Language with constraints 
  • Limited number of variable types
  • With fixed size
  • P4 is not a general-purpose language: unlike C, C++ or Java, you cannot use it to write arbitrary software

It is therefore a simple language, one that network managers, and not only pure software developers, can tame. Indeed, writing a P4 program is all about defining the behavior of a packet processing algorithm based on intrinsic variables encoded in the packet header.




The "RARE/FreeRouter-101" series of articles is meant to help you quickly kickstart your very first RARE/freeRouter deployment and understand, via a series of tutorials, how it can be powered by various dataplanes. The 101 series also explained how RARE/freeRouter can be configured in order to be integrated into the external network environment. However, even if 101 - [ #006 ] is a robust and interesting solution for SOHO, you'll see a lot more interesting use cases in the "RARE validated design" series of articles. These articles will draw your attention to mind-blowing use cases that are usually implemented only by commercial solutions in service provider environments.

Requirement

  • Basic Linux/Unix knowledge
  • Service provider networking knowledge

Overview

BGP is THE protocol of the Internet: it is used to exchange routing information between BGP systems, within and between Internet domains. It comes in two flavours:

External BGP (eBGP): Network Layer Reachability Information (NLRI) is exchanged between network domains called Autonomous Systems, which are usually administratively independent. This is BGP inter-domain routing. As an example, let us assume a BGP speaker from AS2200 (RENATER) advertising NLRI to AS20965 (GÉANT R&E). From that point, AS20965 knows how to reach any network advertised by AS2200 based on that NLRI.

Internal BGP (iBGP): NLRI is propagated between BGP speakers inside the same domain. This is BGP intra-domain routing. As an example, assume an AS2200 border router in Paris is connected to the GÉANT network and gets NLRI from AS20965. It will then propagate this information internally, advertising the GÉANT NLRI via iBGP sessions to the other BGP speakers inside the AS2200 network domain.

iBGP requires a full mesh of sessions between all BGP speakers inside a domain, because a route learned from one iBGP peer is not re-advertised to another iBGP peer (a loop-avoidance rule). This means n*(n-1)/2 sessions have to be implemented. BGP route reflection is a proposal that removes the full-mesh requirement. A BGP edge router now needs only 1 BGP session toward each RR, thus reducing the workload on network equipment.
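To get a feel for the numbers, here is a minimal Python sketch comparing the session count of a full mesh with that of a route reflector design. The function names are made up for illustration, and the sketch assumes that every client peers with every RR and that the RRs peer with each other, which may differ from your own design.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions in a full mesh of n speakers: n*(n-1)/2."""
    return n * (n - 1) // 2


def route_reflector_sessions(clients: int, reflectors: int = 1) -> int:
    """Sessions when each client peers with every RR.

    Assumption of this sketch: the RRs themselves are fully meshed.
    """
    return clients * reflectors + full_mesh_sessions(reflectors)


if __name__ == "__main__":
    for n in (8, 50, 200):
        print(f"{n} routers: full mesh = {full_mesh_sessions(n)}, "
              f"with 2 RRs = {route_reflector_sessions(n, 2)}")
```

The full mesh grows quadratically with the number of routers, while the RR design grows linearly with the number of clients.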

Article objective

In this article we will describe how to build a carrier-grade route reflector cluster composed of RR1 and RR2, in order to reach the 99.999% availability expected of a Telecom Internet Service Provider.

Let's consider the network architecture of a fictitious service provider below: route reflectors RR1 and RR2 are dual-homed to the core P routers.
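As a rough illustration of what the 99.999% target means and why two route reflectors help, the following Python sketch converts an availability figure into yearly downtime and combines two redundant nodes. It assumes the two RRs fail independently, which is an idealisation, and the 99.9% per-node figure is only an example value.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Yearly downtime allowed by a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def parallel_availability(single: float, copies: int) -> float:
    """Availability (0..1) of `copies` redundant nodes.

    Assumption: failures are independent, so the service is down
    only when every node is down at the same time.
    """
    return 1 - (1 - single) ** copies


if __name__ == "__main__":
    print(f"99.999% allows {downtime_minutes_per_year(99.999):.2f} min/year")
    # Example value only: two nodes at 99.9% each
    print(f"2 x 99.9% nodes: {parallel_availability(0.999, 2) * 100:.4f}%")
```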

Diagram

[ #001 ] - Cookbook

BGP RR main requirements

SR655 1 x EPYC 7302P, 64GB RAM, 2G CONTROLLER CACHE FLASH, 4x10G ports + SFP+ and 4x1G ports, 3 SSD 480GB MAINSTREAM, XCLARITY ENTERPRISE.

SR655 AMD EPYC 7302P (16C 2.8GHz 128MB Cache/155W) 32GB (2x32GB, 2Rx4 3200MHz RDIMM), No Backplane, SATA, 1x750W, Tooless Rails
ThinkSystem 2x32GB TruDDR4 3200MHz (2Rx4 1.2V) RDIMM-A
ThinkSystem SR655 2.5 SATA/SAS 8-Bay Backplane Kit
ThinkSystem RAID 930-8i 2GB Flash PCIe 12Gb Adapter
ThinkSystem 2.5 5300 480GB Mainstream SATA 6Gb Hot Swap SSD
ThinkSystem SR655 x16/x8/x8 PCIe Riser1 FH Kit
ThinkSystem SR635/SR655 x8 PCIe Internal Riser Kit
ThinkSystem Broadcom 57454 10/25GbE SFP28 4-port OCP Ethernet Adapter
ThinkSystem Broadcom 5720 1GbE RJ45 2-Port PCIe Ethernet Adapter
SFP+ SR Transceiver
ThinkSystem 750W(230/115V) Platinum Hot-Swap Power Supply
2.8m, 10A/100-250V, C13 to IEC 320-C14 Rack Power Cable
ThinkSystem Toolless Slide Rail Kit with 2U CMA
ThinkSystem SR655 Fan Option Kit
ThinkSystem SR635/SR655 Supercap Installation Kit

BGP RR main requirements

RR is a specific component inside a service provider environment:

  • The BGP RR is not in the data path inside the backbone; this can be adjusted by setting high IGP metrics inside the core backbone.
  • BGP traffic does not require tremendous throughput, so there is no need for a hardware NIC-assisted forwarding mechanism such as DPDK.
  • An NREN route reflector with 2xIPv4 and 2xIPv6 full views coming from 2 upstream providers requires a steady traffic rate of ~10 Mbps, so we can assume that a 10GE connection will be sufficient for the next decades, all address families included.
  • As of 2020/07/13, the Internet IPv4 routing table size is 839945 entries
  • As of 2020/07/13, the Internet IPv6 routing table size is 91062 entries

Both tables, together with the other BGP address families, need a constant ~4 GB of memory:

# show watchdog memory

  • So in the config above, 64 GB of RAM is sufficient to cache the whole IPv4 and IPv6 routing tables in memory (as well as the other BGP address family tables). It is also more than enough in case of network instability, when convergence computation uses more CPU and memory.
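The sizing argument above can be restated as simple arithmetic. The sketch below only reuses the figures quoted in this section (table sizes as of 2020/07/13, 2 full views, ~4 GB used, 64 GB installed); the function names are illustrative.

```python
IPV4_ROUTES = 839_945   # Internet IPv4 table size quoted above
IPV6_ROUTES = 91_062    # Internet IPv6 table size quoted above


def total_paths(full_views: int) -> int:
    """Paths held by the RR when it receives `full_views` IPv4+IPv6 full views."""
    return full_views * (IPV4_ROUTES + IPV6_ROUTES)


def ram_headroom(installed_gb: float, used_gb: float) -> float:
    """How many times the observed usage fits into the installed RAM."""
    return installed_gb / used_gb


if __name__ == "__main__":
    print(f"paths with 2 full views: {total_paths(2)}")
    print(f"RAM headroom factor: {ram_headroom(64, 4):.0f}x")
```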

Disclaimer

  • We have no incentive to propose a server of the above brand. It just happens that this server had already been bought and that its configuration perfectly matches the use case requirements, but again, this is pure coincidence
  • 10GE port connections might be overkill, but in a Service Provider context this is the norm. It avoids having to implement 1GE connectivity on the adjacent core routers
  • PCIe GEN4 is available and thus provides a tremendous amount of bandwidth for disk R/W operations. Though useful for OS applications, the BGP RR setup won't take direct advantage of PCIe GEN4.
  • Indeed, considering the amount of RAM in this configuration, we will disable SWAP operations.


BGP RR distinct data path

  • Connect the server with 2 NIC ports using optical SFPs (Broadcom 57454 10/25GbE SFP28 4-port OCP Ethernet Adapter) to the core backbone routers, following distinct dark fiber paths.
  • The link between C1 - C2 provides an additional level of redundancy

BGP RR out of band management

  • Connect the server with 1 NIC port using RJ45 (Broadcom 5720 1GbE RJ45 2-Port PCIe Ethernet Adapter) to the KVM or out-of-band management network

Do not forget ...

One point that is often overlooked is the environment. As said, the BGP RR is a central component in a service provider network. It must be deployed considering the following recommendations:

  • Deploy an RR in carrier hotel
  • With sufficient cooling
  • With sufficient power. Also make sure to have redundant power: use dual PSUs connected to different energy sources
  • Rack properly the server and make sure it is installed without blocking airflow as per server vendor advice


Install OS supported in your company

  • Use only a stable branch, also called LTS, of an operating system, like Debian 10, Ubuntu 18.04 or Ubuntu 20.04
  • Apply your IT department's strip-down and security patches, and make the server part of your server maintenance process
  • In our case we will use Debian 10

BGP RR Life cycle management

It is important to note that the BGP RR is now subject to your company's server hardware maintenance, and that the software is not part of it.

  • Server hardware maintenance is now applied to a network equipment
  • The software is maintained by freeRouter project members
mkdir -p ~/freeRouter/bin ~/freeRouter/lib ~/freeRouter/etc ~/freeRouter/log
cd ~/freeRouter/lib
wget http://freerouter.nop.hu/rtr.jar
freeRouter directory layout
╭─[11:11:54]floui@debian ~ 
╰─➤ tree freeRouter
freeRouter
├── bin   # binary files      
├── etc   # configuration files      
├── lib   # library files      
└── log   # log files      

get freeRouter net-tools tarball
wget freerouter.nop.hu/rtr.tar
Extract the net-tools tarball into the bin folder
tar xvf rtr.tar -C ~/freeRouter/bin/

For those who would like to rebuild these binaries, the compilation shell script is in the cloned freeRouter git repository: ~/freeRouter/src/native/c.sh

No throughput required

  • In this case simple pcapInt packet forwarding is recommended
  • In this setup all freeRouter functionalities are natively available
  • freeRouter makes heavy use of threads, hence the 16 CPU cores will be fully exploited

freeRouter upgrade

A freeRouter upgrade involves 3 aspects:

  • It is pretty unusual for network equipment, but as freeRouter runs on Java, you have to follow Java software update recommendations
  • The freeRouter control plane software itself: it is essentially an rtr.jar file that has to be replaced by the latest version
  • The freeRouter dataplane software, pcapInt. pcapInt upgrades are unusual, but you still have to check for them in the freeRouter release notes

We are (at last) now ready to configure freeRouter as a BGP route reflector !

freeRouter uses 2 configuration files in order to run. Let's write these configuration files for RR1 in ~/freeRouter/etc

freeRouter hardware file: rr1-hw.txt
int eth1 eth 0000.1111.0001 127.0.0.1 10011 127.0.0.1 10012
int eth2 eth 0000.2222.0002 127.0.0.1 10021 127.0.0.1 10022
tcp2vrf 2323 v1 23

BGP RR interfaces

  • eth1 is the first BGP RR port: 10011 is the freeRouter-side port, while 10012 is the port bound by the pcapInt process attached to the Linux interface of NIC #1
  • eth2 is the second BGP RR port: 10021 is the freeRouter-side port, while 10022 is the port bound by the pcapInt process attached to the Linux interface of NIC #2
  • For now freeRouter will be accessible only via telnet session on port 2323 
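The relationship between a hardware file line and the matching pcapInt invocation can be sketched in Python. The field names (local/remote) are an interpretation used for this sketch, not official freeRouter terminology, and the Linux interface name (enp0s9 here) has to be supplied by you since it is not part of the hardware file.

```python
def parse_int_line(line: str) -> dict:
    """Split 'int <name> eth <mac> <addr> <port> <addr> <port>' into fields.

    Assumption: the first address/port pair is the freeRouter side and the
    second pair is the pcapInt side.
    """
    fields = line.split()
    if len(fields) != 8 or fields[0] != "int" or fields[2] != "eth":
        raise ValueError(f"unexpected hardware line: {line!r}")
    return {
        "name": fields[1],
        "mac": fields[3],
        "local_addr": fields[4],
        "local_port": int(fields[5]),
        "remote_addr": fields[6],
        "remote_port": int(fields[7]),
    }


def pcapint_command(entry: dict, linux_iface: str) -> list:
    """Build the pcapInt command line: it binds the port freeRouter sends to
    and sends to the port freeRouter listens on."""
    return [
        "sudo", "./pcapInt.bin", linux_iface,
        str(entry["remote_port"]), entry["remote_addr"],
        str(entry["local_port"]), entry["local_addr"],
    ]


if __name__ == "__main__":
    eth1 = parse_int_line(
        "int eth1 eth 0000.1111.0001 127.0.0.1 10011 127.0.0.1 10012")
    print(" ".join(pcapint_command(eth1, "enp0s9")))
```

The port pair is simply mirrored between the two processes, so a mistake on one side shows up as an interface that never comes up.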
freeRouter software configuration file: rr1-sw.txt
hostname rr1
buggy
!
!
access-list ACL-IPv4-RR-CLIENT
 sequence 10 permit all 1.1.1.1 255.255.255.255 all any all
 sequence 20 permit all 2.2.2.2 255.255.255.255 all any all
 sequence 30 permit all 3.3.3.3 255.255.255.255 all any all
 sequence 40 permit all 4.4.4.4 255.255.255.255 all any all
 sequence 50 permit all 5.5.5.5 255.255.255.255 all any all
 sequence 60 permit all 6.6.6.6 255.255.255.255 all any all
 sequence 70 permit all 7.7.7.7 255.255.255.255 all any all
 sequence 80 permit all 8.8.8.8 255.255.255.255 all any all
 exit
!
access-list ACL-IPv6-RR-CLIENT
 sequence 10 deny all fd00::a ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff all any all
 sequence 20 deny all fd00::b ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff all any all
 sequence 30 permit all fd00:: ffff:: all any all
 exit
!
prefix-list PFX-IPv4-NHT
 sequence 10 permit 1.1.1.1/32 ge 32 le 32
 sequence 20 permit 2.2.2.2/32 ge 32 le 32
 sequence 30 permit 3.3.3.3/32 ge 32 le 32
 sequence 40 permit 4.4.4.4/32 ge 32 le 32
 sequence 50 permit 5.5.5.5/32 ge 32 le 32
 sequence 60 permit 6.6.6.6/32 ge 32 le 32
 sequence 70 permit 7.7.7.7/32 ge 32 le 32
 sequence 80 permit 8.8.8.8/32 ge 32 le 32
 sequence 100 permit 10.10.10.10/32 ge 32 le 32
 sequence 110 permit 11.11.11.11/32 ge 32 le 32
 exit
!
prefix-list PFX-IPv6-NHT
 sequence 10 permit fd00::/32 ge 128 le 128
 exit
!
route-policy NHT
 sequence 10 if distance 110
 sequence 20   pass
 sequence 30 else
 sequence 40   drop
 sequence 50 enif
 exit
!
vrf definition v1
 rd 1:1
 exit
!
router ospf4 1
 vrf v1
 router-id 4.4.4.10
 traffeng-id 0.0.0.0
 area 0 enable
 redistribute connected
 exit
!
router ospf6 1
 vrf v1
 router-id 6.6.6.10
 traffeng-id ::
 area 0 enable
 redistribute connected
 exit
!
interface loopback1
 no description
 vrf forwarding v1
 ipv4 address 10.10.10.10 255.255.255.255
 ipv6 address fd00::a ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
 router ospf4 1 enable
 router ospf4 1 area 0
 router ospf4 1 passive
 router ospf6 1 enable
 router ospf6 1 area 0
 router ospf6 1 passive
 no shutdown
 no log-link-change
 exit
!
interface ethernet1
 no description
 vrf forwarding v1
 ipv4 address 10.1.10.10 255.255.255.0
 ipv6 address fd00:cafe::1:10:10 ffff:ffff:ffff:ffff:ffff:ffff:ffff::
 router ospf4 1 enable
 router ospf4 1 area 0
 router ospf4 1 cost 4444
 router ospf6 1 enable
 router ospf6 1 area 0
 router ospf6 1 cost 6666
 no shutdown
 no log-link-change
 exit
!
interface ethernet2
 no description
 vrf forwarding v1
 ipv4 address 10.4.10.10 255.255.255.0
 ipv6 address fd00:cafe::4:10:10 ffff:ffff:ffff:ffff:ffff:ffff:ffff::
 router ospf4 1 enable
 router ospf4 1 area 0
 router ospf4 1 cost 4444
 router ospf6 1 enable
 router ospf6 1 area 0
 router ospf6 1 cost 6666
 no shutdown
 no log-link-change
 exit
!
router bgp4 65535
 vrf v1
 local-as 65535
 router-id 10.10.10.10
 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 nexthop route-policy NHT
 nexthop prefix-list PFX-IPv4-NHT
 template bgp4 remote-as 65535
 template bgp4 description rr clients
 template bgp4 local-as 65535
 template bgp4 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 template bgp4 distance 255
 template bgp4 connection-mode active
 template bgp4 compression both
 template bgp4 update-source loopback1
 template bgp4 hostname
 template bgp4 aigp
 template bgp4 traffeng
 template bgp4 pmsitun
 template bgp4 tunenc
 template bgp4 attribset
 template bgp4 segrout
 template bgp4 bier
 template bgp4 route-reflector-client
 template bgp4 next-hop-unchanged
 template bgp4 send-community all
 listen ACL-IPv4-RR-CLIENT bgp4
 exit
!
router bgp6 65535
 vrf v1
 local-as 65535
 router-id 10.10.10.10
 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 nexthop route-policy NHT
 nexthop prefix-list PFX-IPv6-NHT
 template bgp6 remote-as 65535
 template bgp6 description rr clients
 template bgp6 local-as 65535
 template bgp6 address-family unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
 template bgp6 distance 255
 template bgp6 connection-mode active
 template bgp6 compression both
 template bgp6 update-source loopback1
 template bgp6 hostname
 template bgp6 aigp
 template bgp6 traffeng
 template bgp6 pmsitun
 template bgp6 tunenc
 template bgp6 attribset
 template bgp6 segrout
 template bgp6 bier
 template bgp6 route-reflector-client
 template bgp6 next-hop-unchanged
 template bgp6 send-community all
 listen ACL-IPv6-RR-CLIENT bgp6
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet tel
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
!
end
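In the configuration above, the "listen" statements accept dynamic BGP sessions only from peers matched by ACL-IPv4-RR-CLIENT and ACL-IPv6-RR-CLIENT. The Python sketch below mimics that first-match logic with the standard ipaddress module, to show which source addresses would be accepted. It is a simplified model, not freeRouter code: only the source address is evaluated, the mask ffff:: is read as a /16 prefix, and an implicit deny at the end of the list is assumed.

```python
import ipaddress

# Simplified copies of the two ACLs: (action, source prefix)
IPV4_RULES = [("permit", f"{i}.{i}.{i}.{i}/32") for i in range(1, 9)]
IPV6_RULES = [
    ("deny", "fd00::a/128"),    # rr1 loopback1 address in this config
    ("deny", "fd00::b/128"),
    ("permit", "fd00::/16"),    # mask ffff:: read as a /16
]


def acl_permits(rules, peer: str) -> bool:
    """First matching rule wins; no match means deny (assumed)."""
    address = ipaddress.ip_address(peer)
    for action, prefix in rules:
        if address in ipaddress.ip_network(prefix):
            return action == "permit"
    return False


if __name__ == "__main__":
    for peer in ("5.5.5.5", "11.11.11.11"):
        print(peer, acl_permits(IPV4_RULES, peer))
    for peer in ("fd00::5", "fd00::a", "fd00::b"):
        print(peer, acl_permits(IPV6_RULES, peer))
```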
freeRouter launch with supplied rr1-hw.txt and rr1-sw.txt with a console prompt
╭─[6:06:13]floui@debian ~/freeRouter  
╰─➤  java -jar lib/rtr.jar routersc etc/rr1-hw.txt etc/rr1-sw.txt                                                                                      
info cfg.cfgInit.doInit:cfgInit.java:556 booting
info cfg.cfgInit.doInit:cfgInit.java:680 initializing hardware
info cfg.cfgInit.doInit:cfgInit.java:687 applying defaults
info cfg.cfgInit.doInit:cfgInit.java:695 applying configuration
info cfg.cfgInit.doInit:cfgInit.java:721 done
welcome
line ready
rr1#                   
Launch pcapInt in order to bind the socket for interface enp0s9
╭─[6:06:13]floui@debian[1]  ~/freeRouter/bin  
╰─➤  sudo ./pcapInt.bin enp0s9 10012 127.0.0.1 10011 127.0.0.1                                                                                                       
binded to local port 127.0.0.1 10012.
will send to 127.0.0.1 10011.
pcap version: libpcap version 1.8.1
opening interface enp0s9 with pcap1.x api
serving others
> 
Launch pcapInt in order to bind the socket for interface enp0s10
╭─[6:06:13]floui@debian[1]  ~/freeRouter/bin  
╰─➤  sudo ./pcapInt.bin enp0s10 10022 127.0.0.1 10021 127.0.0.1                                                                                                      
binded to local port 127.0.0.1 10022.
will send to 127.0.0.1 10021.
pcap version: libpcap version 1.8.1
opening interface enp0s10 with pcap1.x api
serving others
> 

Verification

rr1 telnet access via port 10010
╭─[1:09:28]floui@debian ~  
╰─➤  telnet localhost 10010
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
welcome
line ready
rr1#                   
From rr1 perspective:
rr1# sh ipv4 route v1                                                          
typ  prefix          metric    iface      hop        time
O    1.1.1.1/32      110/4444  ethernet1  10.1.10.1  00:05:05
O    2.2.2.2/32      110/4445  ethernet1  10.1.10.1  00:04:50
O    3.3.3.3/32      110/4445  ethernet2  10.4.10.4  00:04:32
O    4.4.4.4/32      110/4444  ethernet2  10.4.10.4  00:04:18
O    5.5.5.5/32      110/4445  ethernet1  10.1.10.1  00:04:00
O    6.6.6.6/32      110/4445  ethernet1  10.1.10.1  00:03:42
O    7.7.7.7/32      110/4446  ethernet1  10.1.10.1  00:03:28
O    8.8.8.8/32      110/4445  ethernet2  10.4.10.4  00:02:59
O    10.1.2.0/24     110/4444  ethernet1  10.1.10.1  00:22:47
O    10.1.4.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
O    10.1.5.0/24     110/4444  ethernet1  10.1.10.1  00:22:47
O    10.1.6.0/24     110/4444  ethernet1  10.1.10.1  00:22:47
C    10.1.10.0/24    0/0       ethernet1  null       00:22:49
LOC  10.1.10.10/32   0/1       ethernet1  null       00:22:49
O    10.2.3.0/24     110/4445  ethernet2  10.4.10.4  00:22:35
O    10.2.6.0/24     110/4445  ethernet1  10.1.10.1  00:22:47
O    10.2.7.0/24     110/4445  ethernet1  10.1.10.1  00:22:38
O    10.2.11.0/24    110/4445  ethernet1  10.1.10.1  00:22:38
O    10.3.4.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
O    10.3.7.0/24     110/4445  ethernet2  10.4.10.4  00:22:35
O    10.3.8.0/24     110/4445  ethernet2  10.4.10.4  00:22:32
O    10.3.11.0/24    110/4445  ethernet2  10.4.10.4  00:22:35
O    10.4.5.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
O    10.4.8.0/24     110/4444  ethernet2  10.4.10.4  00:22:47
C    10.4.10.0/24    0/0       ethernet2  null       00:22:49
LOC  10.4.10.10/32   0/1       ethernet2  null       00:22:49
C    10.10.10.10/32  0/0       loopback1  null       00:22:49
O    11.11.11.11/32  110/8889  ethernet1  10.1.10.1  00:06:43

rr1# sh ipv4 ospf 1 topo 0                                                     
node      reach  via        ifc        met   hop  conn  sr  br  neighbors
4.4.4.1   true   10.1.10.1  ethernet1  4444  1    5     0   0   4.4.4.2=1=10.1.2.1 4.4.4.4=1=10.1.4.1 4.4.4.5=1=10.1.5.1 4.4.4.6=1=10.1.6.1 4.4.4.10=4444=10.1.10.1
4.4.4.2   true   10.1.10.1  ethernet1  4445  2    5     0   0   4.4.4.1=1=10.1.2.2 4.4.4.3=1=10.2.3.2 4.4.4.7=1=10.2.7.2 4.4.4.6=1=10.2.6.2 4.4.4.11=4444=10.2.11.2
4.4.4.3   true   10.4.10.4  ethernet2  4445  2    4     0   0   4.4.4.2=1=10.2.3.3 4.4.4.4=1=10.3.4.3 4.4.4.8=1=10.3.8.3 4.4.4.7=1=10.3.7.3
4.4.4.4   true   10.4.10.4  ethernet2  4444  1    5     0   0   4.4.4.3=1=10.3.4.4 4.4.4.8=1=10.4.8.4 4.4.4.5=1=10.4.5.4 4.4.4.1=1=10.1.4.4 4.4.4.10=4444=10.4.10.4
4.4.4.5   true   10.1.10.1  ethernet1  4445  2    2     0   0   4.4.4.1=1=10.1.5.5 4.4.4.4=1=10.4.5.5
4.4.4.6   true   10.1.10.1  ethernet1  4445  2    2     0   0   4.4.4.1=1=10.1.6.6 4.4.4.2=1=10.2.6.6
4.4.4.7   true   10.1.10.1  ethernet1  4446  3    2     0   0   4.4.4.2=1=10.2.7.7 4.4.4.3=1=10.3.7.7
4.4.4.8   true   10.4.10.4  ethernet2  4445  2    2     0   0   4.4.4.3=1=10.3.8.8 4.4.4.4=1=10.4.8.8
4.4.4.10  true   null       null       0     0    2     0   0   4.4.4.1=4444=10.1.10.10 4.4.4.4=4444=10.4.10.10
4.4.4.11  true   10.1.10.1  ethernet1  8889  3    1     0   0   4.4.4.2=4444=10.2.11.11

rr1# sh ipv6 route v1                                                          
typ  prefix                  metric     iface      hop                time
O    fd00::1/128             110/6666   ethernet1  fd00:cafe::1:10:1  00:06:01
O    fd00::2/128             110/6667   ethernet1  fd00:cafe::1:10:1  00:05:46
O    fd00::3/128             110/6667   ethernet2  fd00:cafe::4:10:4  00:05:28
O    fd00::4/128             110/6666   ethernet2  fd00:cafe::4:10:4  00:05:14
O    fd00::5/128             110/6667   ethernet1  fd00:cafe::1:10:1  00:04:56
O    fd00::6/128             110/6667   ethernet1  fd00:cafe::1:10:1  00:04:38
O    fd00::7/128             110/6668   ethernet1  fd00:cafe::1:10:1  00:04:24
O    fd00::8/128             110/6667   ethernet2  fd00:cafe::4:10:4  00:03:56
C    fd00::a/128             0/0        loopback1  null               00:23:45
O    fd00::b/128             110/13333  ethernet1  fd00:cafe::1:10:1  00:07:40
O    fd00:cafe::1:2:0/112    110/6666   ethernet1  fd00:cafe::1:10:1  00:23:43
O    fd00:cafe::1:4:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
O    fd00:cafe::1:5:0/112    110/6666   ethernet1  fd00:cafe::1:10:1  00:23:43
O    fd00:cafe::1:6:0/112    110/6666   ethernet1  fd00:cafe::1:10:1  00:23:43
C    fd00:cafe::1:10:0/112   0/0        ethernet1  null               00:23:45
LOC  fd00:cafe::1:10:10/128  0/1        ethernet1  null               00:23:45
O    fd00:cafe::2:3:0/112    110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::2:6:0/112    110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::2:7:0/112    110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::2:11:0/112   110/6667   ethernet1  fd00:cafe::1:10:1  00:23:32
O    fd00:cafe::3:4:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
O    fd00:cafe::3:7:0/112    110/6667   ethernet2  fd00:cafe::4:10:4  00:23:32
O    fd00:cafe::3:8:0/112    110/6667   ethernet2  fd00:cafe::4:10:4  00:23:32
O    fd00:cafe::3:11:0/112   110/6667   ethernet2  fd00:cafe::4:10:4  00:23:32
O    fd00:cafe::4:5:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
O    fd00:cafe::4:8:0/112    110/6666   ethernet2  fd00:cafe::4:10:4  00:23:43
C    fd00:cafe::4:10:0/112   0/0        ethernet2  null               00:23:45
LOC  fd00:cafe::4:10:10/128  0/1        ethernet2  null               00:23:45

rr1# sh ipv6 ospf 1 topo 0                                                     
node               reach  via                ifc        met    hop  conn  sr  br  neighbors
6.6.6.1/00000000   true   fd00:cafe::1:10:1  ethernet1  6666   1    5     0   0   6.6.6.2/00000000=1=10012 6.6.6.4/00000000=1=10015 6.6.6.5/00000000=1=10012 6.6.6.6/00000000=1=10012 6.6.6.10/00000000=6666=10012
6.6.6.2/00000000   true   fd00:cafe::1:10:1  ethernet1  6667   2    5     0   0   6.6.6.1/00000000=1=10012 6.6.6.3/00000000=1=10012 6.6.6.7/00000000=1=10012 6.6.6.6/00000000=1=10013 6.6.6.11/00000000=6666=10012
6.6.6.3/00000000   true   fd00:cafe::4:10:4  ethernet2  6667   2    4     0   0   6.6.6.2/00000000=1=10013 6.6.6.4/00000000=1=10012 6.6.6.8/00000000=1=10012 6.6.6.7/00000000=1=10013
6.6.6.4/00000000   true   fd00:cafe::4:10:4  ethernet2  6666   1    5     0   0   6.6.6.3/00000000=1=10013 6.6.6.8/00000000=1=10013 6.6.6.5/00000000=1=10013 6.6.6.1/00000000=1=10013 6.6.6.10/00000000=6666=10013
6.6.6.5/00000000   true   fd00:cafe::1:10:1  ethernet1  6667   2    2     0   0   6.6.6.1/00000000=1=10014 6.6.6.4/00000000=1=10014
6.6.6.6/00000000   true   fd00:cafe::1:10:1  ethernet1  6667   2    2     0   0   6.6.6.1/00000000=1=10015 6.6.6.2/00000000=1=10015
6.6.6.7/00000000   true   fd00:cafe::1:10:1  ethernet1  6668   3    2     0   0   6.6.6.2/00000000=1=10014 6.6.6.3/00000000=1=10015
6.6.6.8/00000000   true   fd00:cafe::4:10:4  ethernet2  6667   2    2     0   0   6.6.6.3/00000000=1=10014 6.6.6.4/00000000=1=10013
6.6.6.10/00000000  true   null               null       0      0    2     0   0   6.6.6.1/00000000=6666=10016 6.6.6.4/00000000=6666=10016
6.6.6.11/00000000  true   fd00:cafe::1:10:1  ethernet1  13333  3    1     0   0   6.6.6.2/00000000=6666=10016
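The OSPF metrics shown above can be cross-checked with a small shortest-path computation. The Python sketch below rebuilds the IPv4 topology from the "sh ipv4 ospf 1 topo 0" neighbor column (nodes are the last octet of each 4.4.4.x router-id) and runs Dijkstra's algorithm. It is a toy model for illustration, not how freeRouter computes its SPF.

```python
import heapq

RR_COST = 4444  # OSPFv2 cost configured on the RR-facing links

# (node, node, cost) taken from the neighbor column of the topology output
LINKS = [
    (1, 2, 1), (1, 4, 1), (1, 5, 1), (1, 6, 1),
    (2, 3, 1), (2, 6, 1), (2, 7, 1),
    (3, 4, 1), (3, 7, 1), (3, 8, 1),
    (4, 5, 1), (4, 8, 1),
    (1, 10, RR_COST), (4, 10, RR_COST), (2, 11, RR_COST),
]


def shortest_costs(source: int) -> dict:
    """Dijkstra over the undirected LINKS graph, returning cost per node."""
    graph = {}
    for a, b, cost in LINKS:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    costs = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs[node]:
            continue  # stale queue entry
        for neighbor, link_cost in graph[node]:
            candidate = cost + link_cost
            if candidate < costs.get(neighbor, float("inf")):
                costs[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return costs


if __name__ == "__main__":
    from_rr1 = shortest_costs(10)
    print("rr1 -> 7.7.7.7 :", from_rr1[7])
    print("rr1 -> 11.11.11.11 :", from_rr1[11])
    # Transit through rr1 would cost two RR links, far more than the core path
    print("c1 -> c4 direct:", shortest_costs(1)[4], "via rr1:", 2 * RR_COST)
```

Because any path through rr1 costs at least 2 x 4444, the core never selects the route reflector as a transit node, which is the intent behind the high interface costs.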
Check reachability from one RR client (c5 for example)
c5#sh ipv4 route v1                                                            
typ  prefix          metric    iface      hop       time
O    1.1.1.1/32      110/1     ethernet1  10.1.5.1  00:07:22
O    2.2.2.2/32      110/2     ethernet1  10.1.5.1  00:07:07
O    3.3.3.3/32      110/2     ethernet2  10.4.5.4  00:06:49
O    4.4.4.4/32      110/1     ethernet2  10.4.5.4  00:06:35
C    5.5.5.5/32      0/0       loopback1  null      00:25:07
O    6.6.6.6/32      110/2     ethernet1  10.1.5.1  00:06:00
O    7.7.7.7/32      110/3     ethernet1  10.1.5.1  00:05:46
O    8.8.8.8/32      110/2     ethernet2  10.4.5.4  00:05:17
O    10.1.2.0/24     110/1     ethernet1  10.1.5.1  00:25:06
O    10.1.4.0/24     110/1     ethernet2  10.4.5.4  00:25:05
C    10.1.5.0/24     0/0       ethernet1  null      00:25:07
LOC  10.1.5.5/32     0/1       ethernet1  null      00:25:07
O    10.1.6.0/24     110/1     ethernet1  10.1.5.1  00:25:06
O    10.1.10.0/24    110/1     ethernet1  10.1.5.1  00:25:06
O    10.2.3.0/24     110/2     ethernet2  10.4.5.4  00:24:53
O    10.2.6.0/24     110/2     ethernet1  10.1.5.1  00:25:05
O    10.2.7.0/24     110/2     ethernet1  10.1.5.1  00:24:56
O    10.2.11.0/24    110/2     ethernet1  10.1.5.1  00:24:56
O    10.3.4.0/24     110/1     ethernet2  10.4.5.4  00:25:05
O    10.3.7.0/24     110/2     ethernet2  10.4.5.4  00:24:53
O    10.3.8.0/24     110/2     ethernet2  10.4.5.4  00:24:50
O    10.3.11.0/24    110/2     ethernet2  10.4.5.4  00:24:53
C    10.4.5.0/24     0/0       ethernet2  null      00:25:07
LOC  10.4.5.5/32     0/1       ethernet2  null      00:25:07
O    10.4.8.0/24     110/1     ethernet2  10.4.5.4  00:25:05
O    10.4.10.0/24    110/1     ethernet2  10.4.5.4  00:25:05
O    10.10.10.10/32  110/4445  ethernet1  10.1.5.1  00:11:05
O    11.11.11.11/32  110/4446  ethernet1  10.1.5.1  00:09:01

c5#sh ipv4 ospf 1 topo 0                                                       
node      reach  via       ifc        met   hop  conn  sr  br  neighbors
4.4.4.1   true   10.1.5.1  ethernet1  1     1    5     0   0   4.4.4.2=1=10.1.2.1 4.4.4.4=1=10.1.4.1 4.4.4.5=1=10.1.5.1 4.4.4.6=1=10.1.6.1 4.4.4.10=4444=10.1.10.1
4.4.4.2   true   10.1.5.1  ethernet1  2     2    5     0   0   4.4.4.1=1=10.1.2.2 4.4.4.3=1=10.2.3.2 4.4.4.7=1=10.2.7.2 4.4.4.6=1=10.2.6.2 4.4.4.11=4444=10.2.11.2
4.4.4.3   true   10.4.5.4  ethernet2  2     2    4     0   0   4.4.4.2=1=10.2.3.3 4.4.4.4=1=10.3.4.3 4.4.4.8=1=10.3.8.3 4.4.4.7=1=10.3.7.3
4.4.4.4   true   10.4.5.4  ethernet2  1     1    5     0   0   4.4.4.3=1=10.3.4.4 4.4.4.8=1=10.4.8.4 4.4.4.5=1=10.4.5.4 4.4.4.1=1=10.1.4.4 4.4.4.10=4444=10.4.10.4
4.4.4.5   true   null      null       0     0    2     0   0   4.4.4.1=1=10.1.5.5 4.4.4.4=1=10.4.5.5
4.4.4.6   true   10.1.5.1  ethernet1  2     2    2     0   0   4.4.4.1=1=10.1.6.6 4.4.4.2=1=10.2.6.6
4.4.4.7   true   10.1.5.1  ethernet1  3     3    2     0   0   4.4.4.2=1=10.2.7.7 4.4.4.3=1=10.3.7.7
4.4.4.8   true   10.4.5.4  ethernet2  2     2    2     0   0   4.4.4.3=1=10.3.8.8 4.4.4.4=1=10.4.8.8
4.4.4.10  true   10.1.5.1  ethernet1  4445  2    2     0   0   4.4.4.1=4444=10.1.10.10 4.4.4.4=4444=10.4.10.10
4.4.4.11  true   10.1.5.1  ethernet1  4446  3    1     0   0   4.4.4.2=4444=10.2.11.11

c5#sh ipv6 route v1                                                            
typ  prefix                 metric    iface      hop               time
O    fd00::1/128            110/1     ethernet1  fd00:cafe::1:5:1  00:08:06
O    fd00::2/128            110/2     ethernet1  fd00:cafe::1:5:1  00:07:51
O    fd00::3/128            110/2     ethernet2  fd00:cafe::4:5:4  00:07:33
O    fd00::4/128            110/1     ethernet2  fd00:cafe::4:5:4  00:07:19
C    fd00::5/128            0/0       loopback1  null              00:25:51
O    fd00::6/128            110/2     ethernet1  fd00:cafe::1:5:1  00:06:43
O    fd00::7/128            110/3     ethernet1  fd00:cafe::1:5:1  00:06:29
O    fd00::8/128            110/2     ethernet2  fd00:cafe::4:5:4  00:06:01
O    fd00::a/128            110/6667  ethernet1  fd00:cafe::1:5:1  00:11:45
O    fd00::b/128            110/6668  ethernet1  fd00:cafe::1:5:1  00:09:45
O    fd00:cafe::1:2:0/112   110/1     ethernet1  fd00:cafe::1:5:1  00:25:49
O    fd00:cafe::1:4:0/112   110/1     ethernet2  fd00:cafe::4:5:4  00:25:49
C    fd00:cafe::1:5:0/112   0/0       ethernet1  null              00:25:51
LOC  fd00:cafe::1:5:5/128   0/1       ethernet1  null              00:25:51
O    fd00:cafe::1:6:0/112   110/1     ethernet1  fd00:cafe::1:5:1  00:25:49
O    fd00:cafe::1:10:0/112  110/1     ethernet1  fd00:cafe::1:5:1  00:25:49
O    fd00:cafe::2:3:0/112   110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::2:6:0/112   110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::2:7:0/112   110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::2:11:0/112  110/2     ethernet1  fd00:cafe::1:5:1  00:25:37
O    fd00:cafe::3:4:0/112   110/1     ethernet2  fd00:cafe::4:5:4  00:25:49
O    fd00:cafe::3:7:0/112   110/2     ethernet2  fd00:cafe::4:5:4  00:25:37
O    fd00:cafe::3:8:0/112   110/2     ethernet2  fd00:cafe::4:5:4  00:25:37
O    fd00:cafe::3:11:0/112  110/2     ethernet2  fd00:cafe::4:5:4  00:25:37
C    fd00:cafe::4:5:0/112   0/0       ethernet2  null              00:25:51
LOC  fd00:cafe::4:5:5/128   0/1       ethernet2  null              00:25:51
O    fd00:cafe::4:8:0/112   110/1     ethernet2  fd00:cafe::4:5:4  00:25:49
O    fd00:cafe::4:10:0/112  110/1     ethernet2  fd00:cafe::4:5:4  00:25:49

c5#sh ipv6 ospf 1 topo 0                                                       
node               reach  via               ifc        met   hop  conn  sr  br  neighbors
6.6.6.1/00000000   true   fd00:cafe::1:5:1  ethernet1  1     1    5     0   0   6.6.6.2/00000000=1=10012 6.6.6.4/00000000=1=10015 6.6.6.5/00000000=1=10012 6.6.6.6/00000000=1=10012 6.6.6.10/00000000=6666=10012
6.6.6.2/00000000   true   fd00:cafe::1:5:1  ethernet1  2     2    5     0   0   6.6.6.1/00000000=1=10012 6.6.6.3/00000000=1=10012 6.6.6.7/00000000=1=10012 6.6.6.6/00000000=1=10013 6.6.6.11/00000000=6666=10012
6.6.6.3/00000000   true   fd00:cafe::4:5:4  ethernet2  2     2    4     0   0   6.6.6.2/00000000=1=10013 6.6.6.4/00000000=1=10012 6.6.6.8/00000000=1=10012 6.6.6.7/00000000=1=10013
6.6.6.4/00000000   true   fd00:cafe::4:5:4  ethernet2  1     1    5     0   0   6.6.6.3/00000000=1=10013 6.6.6.8/00000000=1=10013 6.6.6.5/00000000=1=10013 6.6.6.1/00000000=1=10013 6.6.6.10/00000000=6666=10013
6.6.6.5/00000000   true   null              null       0     0    2     0   0   6.6.6.1/00000000=1=10014 6.6.6.4/00000000=1=10014
6.6.6.6/00000000   true   fd00:cafe::1:5:1  ethernet1  2     2    2     0   0   6.6.6.1/00000000=1=10015 6.6.6.2/00000000=1=10015
6.6.6.7/00000000   true   fd00:cafe::1:5:1  ethernet1  3     3    2     0   0   6.6.6.2/00000000=1=10014 6.6.6.3/00000000=1=10015
6.6.6.8/00000000   true   fd00:cafe::4:5:4  ethernet2  2     2    2     0   0   6.6.6.3/00000000=1=10014 6.6.6.4/00000000=1=10013
6.6.6.10/00000000  true   fd00:cafe::1:5:1  ethernet1  6667  2    2     0   0   6.6.6.1/00000000=6666=10016 6.6.6.4/00000000=6666=10016
6.6.6.11/00000000  true   fd00:cafe::1:5:1  ethernet1  6668  3    1     0   0   6.6.6.2/00000000=6666=10016
Ping rr1 from c5
c5#ping 10.10.10.10 /vrf v1                                                    
pinging 10.10.10.10, src=null, vrf=v1, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=0/0/1/4
c5#ping fd00::a /vrf v1                                                        
pinging fd00::a, src=null, vrf=v1, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=0/0/1/4
c5#                                                                                                                                                 
BGP summary
rr1#sh ipv4 bgp 65535 sum                                                      
as     learn  done  ready  neighbor  uptime
65535  0      0     true   1.1.1.1   16:22:28
65535  0      0     true   2.2.2.2   16:17:26
65535  0      0     true   3.3.3.3   16:16:44
65535  0      0     true   4.4.4.4   16:16:01
65535  0      0     true   5.5.5.5   16:15:32
65535  0      0     true   6.6.6.6   16:14:56
65535  0      0     true   7.7.7.7   16:14:30
65535  0      0     true   8.8.8.8   16:13:37

rr1#sh ipv6 bgp 65535 sum                                                      
as     learn  done  ready  neighbor  uptime
65535  0      0     true   fd00::1   16:20:41
65535  0      0     true   fd00::2   16:18:27
65535  0      0     true   fd00::3   16:17:32
65535  0      0     true   fd00::4   16:16:59
65535  0      0     true   fd00::5   16:16:22
65535  0      0     true   fd00::6   16:15:57
65535  0      0     true   fd00::7   16:15:15
65535  0      0     true   fd00::8   16:14:45

From rr1 check c1 BGP status (pay attention to type = routeReflectorClient)
rr1#show ipv4 bgp 65535 neighbor 1.1.1.1 status                                
peer = 1.1.1.1
reachable state = true
reachable changed = 16:24:12
reachable changes = 1
fallover = null
update group = 0
type = routeReflectorClient
safi =  unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
local = 10.10.10.10
router id = 1.1.1.1
uptime = 16:24:12
hold time = 00:03:00
keepalive time = 00:01:00
32bit as = true
refresh = true, rx=0, tx=0
description = rr clients
hostname = null
compression = rx=true, tx=false
graceful = 
addpath rx = 
addpath tx = 
unicast advertised = 0 of 0, list = 0, accepted = 0 of 0
multicast advertised = 0 of 0, list = 0, accepted = 0 of 0
other advertised = 0 of 0, list = 0, accepted = 0 of 0
flowspec advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
vpls advertised = 0 of 0, list = 0, accepted = 0 of 0
mspw advertised = 0 of 0, list = 0, accepted = 0 of 0
evpn advertised = 0 of 0, list = 0, accepted = 0 of 0
mdt advertised = 0 of 0, list = 0, accepted = 0 of 0
srte advertised = 0 of 0, list = 0, accepted = 0 of 0
mvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
omvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
version = 14 of 14, needfull=0, buffull=0
full = 9, 2020-07-27 16:32:29, 16:15:21 ago, 0 ms
incr = 2, 2020-07-28 08:13:10, 00:34:40 ago, 0 ms
connection = tx=173(987) rx=158(986) drp=0(0)
uncompressed = tx=0(0) rx=0(0) drp=0(0)
buffer = max=65536 rx=0 tx=65536

rr1#show ipv6 bgp 65535 neighbor fd00::1 status                                
peer = fd00::1
reachable state = true
reachable changed = 16:22:33
reachable changes = 1
fallover = null
update group = 0
type = routeReflectorClient
safi =  unicast multicast other flowspec vpnuni vpnmlt vpnflw ovpnuni ovpnmlt ovpnflw vpls mspw evpn mdt srte mvpn omvpn
local = fd00::a
router id = 1.1.1.1
uptime = 16:22:33
hold time = 00:03:00
keepalive time = 00:01:00
32bit as = true
refresh = true, rx=0, tx=0
description = rr clients
hostname = null
compression = rx=true, tx=false
graceful = 
addpath rx = 
addpath tx = 
unicast advertised = 0 of 0, list = 0, accepted = 0 of 0
multicast advertised = 0 of 0, list = 0, accepted = 0 of 0
other advertised = 0 of 0, list = 0, accepted = 0 of 0
flowspec advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
vpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnuni advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnmlt advertised = 0 of 0, list = 0, accepted = 0 of 0
ovpnflw advertised = 0 of 0, list = 0, accepted = 0 of 0
vpls advertised = 0 of 0, list = 0, accepted = 0 of 0
mspw advertised = 0 of 0, list = 0, accepted = 0 of 0
evpn advertised = 0 of 0, list = 0, accepted = 0 of 0
mdt advertised = 0 of 0, list = 0, accepted = 0 of 0
srte advertised = 0 of 0, list = 0, accepted = 0 of 0
mvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
omvpn advertised = 0 of 0, list = 0, accepted = 0 of 0
version = 14 of 14, needfull=0, buffull=0
full = 9, 2020-07-27 16:32:15, 16:16:37 ago, 0 ms
incr = 2, 2020-07-28 08:13:15, 00:35:38 ago, 0 ms
connection = tx=173(985) rx=158(984) drp=0(0)
uncompressed = tx=0(0) rx=0(0) drp=0(0)
buffer = max=65536 rx=0 tx=65536                                                                                                                                          

Conclusion

In this article you:

  • had a brief introduction to the BGP protocol and the rationale for BGP route reflectors
  • learned the design considerations related to a BGP RR setup
  • got a typical BGP configuration example with a long list of AFI/SAFI enabled
  • saw that this configuration is not exhaustive: BGP add-path, for example, is available but not configured
  • verified BGP RR operation

RARE validated design: [ BGP RR #001 ]- key take-away

  • The BGP Route Reflector use case does not require a commercial vendor router; it can be handled perfectly well by a software solution running on a server with enough RAM.

The example above is a high availability Route Reflector setup that is able to handle BGP signalling for a large Service Provider, for all address families.

  • Redundant BGP Route Reflection is ensured by deploying 2 RRs (at minimum) belonging to the same BGP RR cluster

In addition to having several RRs for the whole domain, it is also common to see hierarchical RR designs. Some Service Providers deploy dedicated RRs for specific address families (L3VPN unicast, for example).

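  • RRs in the same cluster run a basic iBGP session

These RRs also share the same cluster ID, so that a route already reflected by one RR of the cluster is recognised through its cluster list and ignored by the other RR, which prevents advertisement loops.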

  • RR should not be in the traffic datapath

This is the reason why we set a high cost (4444 for IPv4 and 6666 for IPv6) in both directions on the RR interconnection ports.

  • RR design for a multi-service backbone

In the example, the RR clients are running only IPv4/IPv6, but the RR design above can empower a Service Provider backbone with additional services running on top of MPLS: L3VPN, 6VPE, VPLS, EVPN, etc.

  • In the next article we will dissect the rr1 configuration

This will demonstrate some nice features offered by freeRouter, such as BGP templates and nexthop tracking, among a list of other features not mentioned here (like BGP add-path).
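The shared cluster ID take-away above can be illustrated with a small sketch of cluster-list based loop prevention, as described in RFC 4456. This is not freeRouter's implementation: the `Route` class, the `reflect` function and the cluster ID value `0.0.0.1` are hypothetical names and values chosen only for the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Route:
    prefix: str
    originator_id: Optional[str] = None                      # ORIGINATOR_ID attribute
    cluster_list: List[str] = field(default_factory=list)    # CLUSTER_LIST attribute


def reflect(route: Route, cluster_id: str, sender_router_id: str) -> Optional[Route]:
    """Return the reflected copy of a route, or None when it must be ignored.

    A reflector ignores any route whose cluster list already contains
    its own cluster ID: the route has already crossed this cluster.
    """
    if cluster_id in route.cluster_list:
        return None
    return Route(
        prefix=route.prefix,
        # ORIGINATOR_ID is set once, by the first reflector
        originator_id=route.originator_id or sender_router_id,
        # the reflector prepends its cluster ID before reflecting
        cluster_list=[cluster_id] + route.cluster_list,
    )


# Hypothetical cluster ID shared by rr1 and rr2
CLUSTER_ID = "0.0.0.1"

from_c1 = Route(prefix="192.0.2.0/24")
via_rr1 = reflect(from_c1, CLUSTER_ID, sender_router_id="1.1.1.1")
via_rr2 = reflect(via_rr1, CLUSTER_ID, sender_router_id="10.10.10.10")
print(via_rr1.cluster_list)   # cluster list after the first reflection
print(via_rr2)                # None: rr2 ignores the route already reflected by rr1
```

With two different cluster IDs, rr2 would reflect the route again; the shared ID is what makes the second reflector drop it.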


RR design test

You can test the design above in order to check RR and backbone router signalling.

  • Set up the freeRouter environment as described above
  • Get RARE code
Clone RARE code from repository
git clone https://github.com/frederic-loui/RARE.git
Launch the Service Provider example (diagram above)
cd RARE/00-unit-labs/0101-rare-validated-design-bgp/
make
Access routers using the following command:
c1: telnet localhost 10001 
c2: telnet localhost 10002 
c3: telnet localhost 10003 
c4: telnet localhost 10004 
c5: telnet localhost 10005 
c6: telnet localhost 10006 
c7: telnet localhost 10007 
c8: telnet localhost 10008 
rr1: telnet localhost 10010 
rr2: telnet localhost 10011 
Stop the Service Provider example and clean up
cd RARE/00-unit-labs/0101-rare-validated-design-bgp/
make clean

In article #005 you learned how RARE/freeRouter is controlling a P4Emu/pcap dataplane. We also demonstrated that this setup could be integrated into real networks.

Requirement

  • Basic Linux/Unix knowledge
  • Basic networking knowledge

Overview

Though P4Emu/pcap can be used for SOHO and can handle nx1GE of traffic, this comes at a high CPU load cost and thus a higher power consumption. 

"Why write yet another software dataplane as freeRouter has already a working native software dataplane ?"

The partial answer to the question raised in the previous article was:

"decoupling control plane from the dataplane"

We learned that P4Emu:

  • is able to understand the VERY same strict control messages from freeRouter as a P4 dataplane does
  • is able to switch packets, emulating router.p4, using a libpcap packet forwarding backend.

However, even though libpcap is a performant packet processing library, the kernel is still heavily solicited: the higher the traffic rate, the higher the CPU workload.

Article objective

In this article we'll use the freeRouter setup deployed in #005 and replace the P4Emu/pcap dataplane with a P4Emu/dpdk dataplane.

Source Wikipedia: https://en.wikipedia.org/wiki/Data_Plane_Development_Kit

The Data Plane Development Kit (DPDK) is an Open source software project managed by the Linux Foundation. It provides a set of data plane libraries and network interface controller polling-mode drivers for offloading TCP packet processing from the operating system kernel to processes running in user space. This offloading achieves higher computing efficiency and higher packet throughput than is possible using the interrupt-driven processing provided in the kernel.


It is important to note that, despite what its name implies, P4Emu/dpdk does not emulate V1Model. P4Emu emulates the router.p4 packet processing logic and uses a packet forwarding library to transmit packets from a given ingress port to the egress port defined by freeRouter control plane messages. In this precise case, however, packet processing is offloaded from the kernel to user space. As a consequence, with a DPDK-compatible NIC and driver, very high traffic rates can be reached. DPDK is not available on all hardware; please refer to the DPDK HCL.


Diagram

[ #006 ] - Cookbook

In our example we will use Ubuntu focal, as we need dpdk 19.11.1 (the latest version at the time of writing is 20.05.0),

and we add a bridged network interface to our laptop RJ45 connection.

Install dpdk and dpdk-dev
apt-get update
apt-get upgrade
apt-get install dpdk dpdk-dev --no-install-recommends
Flush enp0s3 so that it can be controlled by dpdk
ip addr flush enp0s3

Add out of band management enp0s8 with Virtualbox

You can add a second Host-only interface (enp0s8) in VirtualBox in order to reach the Ubuntu focal VM guest, as you might lose your connection once enp0s3 is flushed.
Set up dpdk and the veth pair used for control plane / dataplane communication via pcapInt
#!/bin/bash
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 64 > /proc/sys/vm/nr_hugepages
modprobe uio_pci_generic
dpdk-devbind.py -b uio_pci_generic 00:03.0
ip link add veth0a type veth peer name veth0b
ip link set veth0a up
ip link set veth0b up
Check that dpdk is able to control enp0s3
dpdk-devbind.py --status

Network devices using DPDK-compatible driver
============================================
0000:00:03.0 '82540EM Gigabit Ethernet Controller 100e' drv=uio_pci_generic unused=e1000,vfio-pci

Network devices using kernel driver
===================================
0000:00:08.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s8 drv=e1000 unused=vfio-pci,uio_pci_generic *Active*

No 'Baseband' devices detected
==============================

No 'Crypto' devices detected
============================

No 'Eventdev' devices detected
==============================

No 'Mempool' devices detected
=============================

No 'Compress' devices detected
==============================

No 'Misc (rawdev)' devices detected
===================================
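When several interfaces are involved, it can help to check this status output programmatically. The sketch below parses text shaped like the `dpdk-devbind.py --status` output shown above and groups devices by section; the function name `parse_devbind_status` and the parsing rules are assumptions derived only from this sample, not from a documented output format.

```python
import re

# One device line: PCI address, quoted device name, then key=value attributes
DEVICE_RE = re.compile(
    r"^([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s+'([^']*)'\s+(.*)$"
)


def parse_devbind_status(text):
    """Group devices by section title, keeping their key=value attributes."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) == {"="}:
            continue                      # blank line or '=====' underline
        match = DEVICE_RE.match(line)
        if match is None:
            current = line                # any other line is a section title
            sections.setdefault(current, [])
            continue
        pci, name, rest = match.groups()
        attrs = dict(re.findall(r"(\w+)=(\S+)", rest))
        sections.setdefault(current, []).append({"pci": pci, "name": name, **attrs})
    return sections


SAMPLE = """
Network devices using DPDK-compatible driver
============================================
0000:00:03.0 '82540EM Gigabit Ethernet Controller 100e' drv=uio_pci_generic unused=e1000,vfio-pci

Network devices using kernel driver
===================================
0000:00:08.0 '82540EM Gigabit Ethernet Controller 100e' if=enp0s8 drv=e1000 unused=vfio-pci,uio_pci_generic *Active*
"""

status = parse_devbind_status(SAMPLE)
for device in status["Network devices using DPDK-compatible driver"]:
    print(device["pci"], device["drv"])
```

Here enp0s3 (0000:00:03.0) shows up in the DPDK section, while the out of band interface enp0s8 stays under the kernel driver.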
mkdir -p ~/freeRouter/bin ~/freeRouter/lib ~/freeRouter/etc ~/freeRouter/log
cd ~/freeRouter/lib
wget http://freerouter.nop.hu/rtr.jar
Resulting freeRouter directory layout
tree freeRouter
freeRouter
├── bin   # binary files      
├── etc   # configuration files      
├── lib   # library files      
└── log   # log files      

get freeRouter net-tools tarball
wget http://www.freertr.net/rtr-`uname -m`.tar -O rtr.tar
Extract the net-tools binaries into ~/freeRouter/bin
tar xvf rtr.tar -C ~/freeRouter/bin/

For those who would like to rebuild these binaries, the compilation shell script can be found in the cloned freeRouter git repository: ~/freeRouter/src/native/c.sh

FreeRouter uses 2 configuration files in order to run, let's write these configuration files for R1 in ~/freeRouter/etc

freeRouter hardware configuration file: dpdk-focal-1-hw.txt
hwid hp
! cpu_port
int eth0 eth - 127.0.0.1 20001 127.0.0.1 20002
! freerouter control port for message
tcp2vrf 9080 v1 9080
! freerouter cli
tcp2vrf 2323 v1 23
! launch a process called "veth0" that actually link to veth0b
! cmd: ip link add veth0a type veth peer name veth0b
proc veth0 /root/freertr/bin/pcapInt.bin veth0a 20002 127.0.0.1 20001 127.0.0.1
proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 1

Note:

Let's spend some time on this hardware configuration file. As you might have noticed, there are additional interesting lines worth mentioning:

  • proc <process-name>

freeRouter can launch processes at startup. Here we use this feature to start the control plane / dataplane communication via the veth pair (veth0a and veth0b), and also to start p4dpdk.bin, the P4Emu/dpdk packet processing backend.

  • proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 1

By default, dpdk interfaces have port_ids that are allocated sequentially, in their order of appearance in the dpdk-devbind.py --status output, which is usually sorted by pci_id. In the previous output, interface enp0s3 has port_id #0, and veth0b (CPU_PORT) always gets the last port_id, right after the dpdk data port_ids, so here it is 1. If, for example, we dedicated enp0s3, enp0s8, enp0s9 and enp0s10 in VirtualBox, the command would have been:

proc p4emu /root/freertr/bin/p4dpdk.bin --vdev=net_af_packet0,iface=veth0b 127.0.0.1 9080 4

enp0s3 would be: #0 with pci_id: 00:03.0

enp0s8 would be: #1 with pci_id: 00:08.0

enp0s9 would be: #2 with pci_id: 00:09.0

enp0s10 would be: #3 with pci_id: 00:0a.0
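The allocation rule described above can be sketched in a few lines: sort the interfaces by pci_id, number them from 0, and give the CPU_PORT the next free id. The function names are hypothetical, and the rule itself is only the one stated in this article (actual dpdk ordering may differ on other systems).

```python
def pci_sort_key(pci_id):
    """Turn '[domain:]bus:device.function' (hex fields) into a sortable tuple."""
    parts = pci_id.split(":")
    domain = parts[0] if len(parts) == 3 else "0"
    bus = parts[-2]
    device, function = parts[-1].split(".")
    return (int(domain, 16), int(bus, 16), int(device, 16), int(function, 16))


def allocate_port_ids(pci_ids):
    """Return ({pci_id: port_id}, cpu_port) following the rule described above."""
    ordered = sorted(pci_ids, key=pci_sort_key)
    ports = {pci: port_id for port_id, pci in enumerate(ordered)}
    cpu_port = len(ordered)          # CPU_PORT comes right after the data ports
    return ports, cpu_port


# Single data port, as in this article: enp0s3 only
print(allocate_port_ids(["00:03.0"]))

# Four data ports: enp0s3, enp0s8, enp0s9, enp0s10
ports, cpu_port = allocate_port_ids(["00:0a.0", "00:03.0", "00:09.0", "00:08.0"])
print(ports, cpu_port)
```

The `cpu_port` value is the last argument passed to p4dpdk.bin in the hardware configuration file: 1 with a single data port, 4 with four data ports.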

freeRouter software configuration file: dpdk-focal-1-sw.txt
hostname dpdk-freerouter
buggy
!
!
vrf definition v1
 rd 1:1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth0a]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@P4_CPU_PORT[enp0s3]
 mtu 1500
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet telnet
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 0
 interconnect ethernet0
 vrf v1
 exit
!
!
end
Launch freeRouter with the supplied dpdk-focal-1-hw.txt and dpdk-focal-1-sw.txt, with a console prompt
java -jar lib/rtr.jar routersc dpdk-focal-1-hw.txt dpdk-focal-1-sw.txt
info cfg.cfgInit.doInit:cfgInit.java:556 booting
info cfg.cfgInit.doInit:cfgInit.java:680 initializing hardware
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 2:! cpu_port
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 4:! freerouter control port for message
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 6:! freerouter cli
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 8:! launch a process called "veth0" that actually link to veth0b
info cfg.cfgInit.executeHWcommands:cfgInit.java:469 9:! cmd: ip link add veth0a type veth peer name veth0b
info cfg.cfgInit.doInit:cfgInit.java:687 applying defaults
info cfg.cfgInit.doInit:cfgInit.java:695 applying configuration
info cfg.cfgInit.doInit:cfgInit.java:721 done
welcome
line ready
dpdk-freertr-1#

Verification

FreeRouter telnet access from Virtualbox VM guest via port 2323
root@focal-1:~# telnet 127.0.0.1 2323
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
welcome
line ready
dpdk-freerouter#  
freerouter running configuration
dpdk-freerouter#term len 0                                                      
dpdk-freerouter#sh run                                                          
hostname dpdk-freerouter
buggy
!
!
vrf definition v1
 rd 1:1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth0a]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@P4_CPU_PORT[enp0s3]
 mtu 1500
 macaddr 0078.5223.343c
 lldp enable
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet telnet
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 0
 interconnect ethernet0
 vrf v1
 exit
!
!
end

Check control plane is communicating with P4Emu/dpdk dataplane
dpdk-freerouter#show interfaces summary                                         
interface  state  tx     rx       drop
ethernet0  up     43567  8727278  0
sdn1       up     42659  8675606  0
Ping IPv4 from freerouter -> LAN router gateway
dpdk-freerouter#ping 192.168.0.254 /vrf v1                                      
pinging 192.168.0.254, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!
result=100%, recv/sent/lost=5/5/0, rtt min/avg/max/total=1/1/1/6
Ping IPv4 from freerouter -> LAN server
dpdk-freerouter#ping 192.168.0.62 /vrf v1                                       
pinging 192.168.0.62, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005

Note that the first ICMP packet can be lost while ARP learning is triggered for 192.168.0.254 and 192.168.0.62 (as seen in the second ping above).

IPv4 arp check
dpdk-freerouter#sh ipv4 arp sdn1                                                
mac             address        time      static
e03f.496d.1899  192.168.0.62   00:00:24  false    <----- Host server
0024.d4a0.0cd3  192.168.0.254  00:00:24  false    <----- LAN gateway
Ping IPv6 from freerouter -> LAN router
dpdk-freerouter#ping 2a01:e0a:159:2850::1 /vrf v1                               
pinging 2a01:e0a:159:2850::1, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005
Ping IPv6 from freerouter -> Host server and SSH connection test
dpdk-freerouter#ping 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1              
pinging 2a01:e0a:159:2850:e23f:49ff:fe6d:1899, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1006

Note the loss of the first ICMP packet in each ping, which triggered IPv6 neighbor discovery for 2a01:e0a:159:2850::1 and 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 respectively.

IPv6 neighbor discovery check
dpdk-freerouter#show ipv6 neighbors sdn1                                        
mac             address                                time      static  router
0024.d4a0.0cd3  2a01:e0a:159:2850::1                   00:00:39  false   false    <----- LAN gateway
e03f.496d.1899  2a01:e0a:159:2850:e23f:49ff:fe6d:1899  00:00:39  false   false    <----- Host server
0024.d4a0.0cd3  fe80::224:d4ff:fea0:cd3                00:00:39  false   false    <----- Link local LAN gateway EUI64 IPv6 address
e03f.496d.1899  fe80::e23f:49ff:fe6d:1899              00:00:39  false   false    <----- Link local host server IPv6 address
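The link-local addresses in this table are derived from the MAC addresses using the modified EUI-64 scheme: flip the universal/local bit of the first octet and insert ff:fe in the middle. The sketch below reproduces that derivation for the two neighbors above; the function name is hypothetical and the code only illustrates the arithmetic, not how freeRouter or the neighbors compute it.

```python
import ipaddress


def mac_to_link_local(mac):
    """Build the fe80::/64 modified EUI-64 address for a MAC address.

    Accepts dotted (e03f.496d.1899), colon or dash separated notation.
    """
    hex_digits = mac.replace(".", "").replace(":", "").replace("-", "")
    octets = bytearray.fromhex(hex_digits)
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    octets[0] ^= 0x02                                   # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    # fe80 prefix, 48 zero bits, then the 64-bit interface identifier
    return ipaddress.IPv6Address(b"\xfe\x80" + bytes(6) + interface_id)


print(mac_to_link_local("e03f.496d.1899"))   # host server
print(mac_to_link_local("0024.d4a0.0cd3"))   # LAN gateway
```

The host server's global address 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 uses the same interface identifier behind the LAN /64 prefix.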
Initiate IPv4 ssh from freerouter -> Host server
dpdk-freerouter#ssh 192.168.0.62 /vrf v1 /user my-nas                          
 - connecting to 192.168.0.62 22
password: *******
                
 - securing connection

Last login: Tue Jul  7 17:40:55 2020 from 2a01:e0a:159:2850::666
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS%


Initiate IPv6 ssh from freerouter -> Host server
dpdk-freerouter#ssh 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1 /user my-nas 
 - connecting to 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 22
password: *******
                
 - securing connection

Last login: Wed Jul  8 11:28:32 2020 from 192.168.0.131
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS% 
freeRouter p4dpdk hardware statistics
dpdk-freerouter#sh int sdn1 hw                                                  
  hwcounters     - hardware counters
  hwdrhistory    - hardware historic drop byte counters
  hwdrphistory   - hardware historic drop packet counters
  hwhistory      - hardware historic byte counters
  hwnumhist      - hardware numeric historic byte counters
  hwnumphist     - hardware numeric historic packet counters
  hwphistory     - hardware historic packet counters
  hwrates        - hardware traffic rates
  hwrealtime     - hardware realtime counters
  hwrxhistory    - hardware historic rx byte counters
  hwrxphistory   - hardware historic rx packet counters
  hwtxhistory    - hardware historic tx byte counters
  hwtxphistory   - hardware historic tx packet counters

dpdk-freerouter#show interfaces sdn1 hwrates                                    
       packet         byte
time   tx  rx   drop  tx     rx      drop
1sec   5   20   0     1498   4668    0
1min   39  104  0     48056  56745   0
1hour  31  174  0     10162  137481  0

dpdk-freerouter#show interfaces sdn1 hwhistory                                  
        217k|                                                            
        195k|                            #                               
        173k|                            #                               
        151k|   #                        #                            #  
        130k| # #           #          # #                 #          #  
        108k| # #           #          # #        #        #   #      #  
         86k| # #           #  #     # ###   #    #        #  ##      ## 
         65k| # # #    # #  #  #     # ##### ## # #        #####    # ## 
         43k|## # #### # ####  ### # ########## ###### #   ##### # ##### 
         21k|## ###### # ##### ##### ########## ######################## 
           0|########################################################### 
         bps|0---------10--------20--------30--------40--------50-------- seconds

         43m|                                                            
         39m| *                                                          
         34m| *                                                          
         30m| *                                         *                
         26m| *                                         *                
         21m| *                                         *                
         17m| *                                         *                
         13m| *                                         *       *        
       8684k| *                                    *  * *  *    *        
       4342k| *       **                           *  * * ** * **      * 
           0|########################################################### 
         bps|0---------10--------20--------30--------40--------50-------- minutes

         70m|                                                            
         63m| * *                                                        
         56m| * *                                                        
         49m| * *                                                        
         42m| * *                                                        
         35m| * *                                                        
         28m| * *                                                        
         21m|** *                                                        
         14m|** *                                                        
       7017k|****                                                        
           0|##*#                                                        
         bps|0---------10--------20--------30--------40--------50-------- hours


Conclusion

In this article you:

  • had a demonstration of how to integrate freeRouter into a local area network (similar to article #002)
  • however, instead of using P4Emu/pcap, we used a P4Emu/dpdk dataplane
  • communication between the freeRouter control plane and P4Emu/dpdk is ensured by pcapInt via the veth pair [ veth0a - veth0b ]
  • in this example, freeRouter with P4Emu/dpdk has only 1 dataplane interface, bound to the enp0s3 VM interface, which is exposed to the local network as a bridged interface

[ #006 ] RARE/FreeRouter-101 - key take-away

  • FreeRouter uses UNIX sockets in order to forward packets dedicated to control plane + dataplane communication.

This essential paradigm is used to ensure communication between freeRouter and the P4Emu/dpdk dataplane. It is ensured by the pcapInt binary from freeRouter net-tools, which stitches the freeRouter socket (localhost:20001) to the veth pair: pcapInt listens on localhost:20002 and opens veth0a, whose peer veth0b is handed to p4dpdk.bin as CPU_PORT 1.

  • freeRouter is the control plane for P4Emu/dpdk dataplane

freeRouter does all the control plane route computation and sends write/modify/remove messages; P4 entries are created, modified or removed accordingly in the P4Emu/dpdk tables. Although the name is P4Emu, it does not emulate BMv2 V1Model.p4, but rather router.p4.

  • dpdk port_id allocation

dpdk port_id allocation follows the pci_id ordering, starting from id 0. p4dpdk.bin is invoked with the parameter: (number_of_dpdk_port - 1) + 1 <--- CPU_PORT

  • In this setup the combination of freeRouter/P4Emu/dpdk delivers a solution for small campus network having 10GE links (100GE links to be validated)

dpdk removes the kernel intervention for each packet processed. In this configuration, packet processing is offloaded to user space, reducing kernel intervention to ~0%. Congratulations, you have a hardware NIC assisted forwarding system!

In a subsequent article we will see how this setup behaves with a DELL 640 server powered by 2 x Intel(R) Xeon(R) Gold 6138 CPU and equipped with a Mellanox ConnectX-5 EX Dual Port 100GbE QSFP28 PCIe Adapter Low Profile card. We will also see how to connect this server to a P4 switch, BF2556X-1T. So stay tuned!





In articles #003 and #004 you learned how RARE/freeRouter controls a P4 dataplane (BMv2 or TOFINO virtual model). We also demonstrated that this setup could be integrated into real networks. However, these P4 dataplanes are not suitable for day to day real operation, as they have inherent software limitations. While the freeRouter native software dataplane has the advantage of offering the entire feature set and is sufficient to handle a home network traffic load, we investigated ways to improve dataplane performance and studied the alternatives discussed in the overview below.

Requirement

  • Basic Linux/Unix knowledge
  • Basic networking knowledge

Overview

However, the XDP model was not complete enough to compile router.p4, and we could not generate the corresponding kernel bypass code with ELTE T4P4S based on BMv2 V1Model.p4 (a GitHub issue is still pending). In that context, Csaba, the freeRouter lead developer, decided to develop P4Emu, a software dataplane that has the particularity to:

  • understand freeRouter control plane messages meant for a P4 dataplane
  • thus keep the control plane decoupled from the dataplane, as was the case with BMv2 and BF_SWITCHD
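To give an idea of what these control plane messages look like, the P4Emu console output later in this article prints each received message as `rx:` followed by single-quoted fields. The sketch below splits such a line into a command and its arguments. The parsing rules are inferred only from that console output, the function name is hypothetical, and the meaning of each positional argument is not documented here, so the sketch leaves the arguments unnamed.

```python
import re

FIELD_RE = re.compile(r"'([^']*)'")


def parse_rx_line(line):
    """Split a P4Emu "rx:" console line into a command and its arguments.

    Returns None for lines that are not received control messages.
    """
    if not line.startswith("rx:"):
        return None
    fields = FIELD_RE.findall(line)
    if fields and fields[-1] == "":
        fields = fields[:-1]              # drop the trailing empty field
    if not fields:
        return None
    return {"command": fields[0], "args": fields[1:]}


lines = [
    "rx: 'myaddr4' 'add' '192.168.0.131' '32' '-1' '1' '' ",
    "rx: 'mtu' '1' '9000' '' ",
    "rx: 'keepalive' '' ",
    "opening interface veth250.",
]
for line in lines:
    print(parse_rx_line(line))
```

Because the messages are plain text sent over a TCP session (port 9080 in this setup), any dataplane able to interpret them can sit behind freeRouter, which is the decoupling discussed in this article.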

One could ask: why write yet another software dataplane when freeRouter already has a working native software dataplane? This is a very good and valid question. The answer boils down to:

"decoupling control plane from the dataplane"

We will see in subsequent articles how P4Emu unlocks new valid use cases.

Article objective

In this article we'll use the freeRouter setup deployed in #004 and replace bf_switchd, which provides the INTEL/BAREFOOT TOFINO dataplane to freeRouter, with P4Emu/pcap.

It is important to note that, despite its name, P4Emu/pcap does not emulate V1Model. P4Emu emulates the router.p4 packet processing logic and uses a packet forwarding library to transmit packets from a given ingress port to the egress port defined by freeRouter control plane messages.


Diagram

[ #005 ] - Cookbook

In our example we will use the same debian stable image (buster) installed as a VirtualBox VM as in #002.

and we add a bridged network interface to our laptop RJ45 connection.

Flush enp0s3 so that it can be controlled by P4Emu/pcap
ip addr flush enp0s3

Add out of band management enp0s8 with Virtualbox

You can add a second Host-only interface (enp0s8) in VirtualBox in order to reach the VM guest, as you might lose your connection once enp0s3 is flushed.
mkdir -p ~/freeRouter/bin ~/freeRouter/lib ~/freeRouter/etc ~/freeRouter/log
cd ~/freeRouter/lib
wget http://freerouter.nop.hu/rtr.jar
Resulting freeRouter directory layout
tree freeRouter
freeRouter
├── bin   # binary files      
├── etc   # configuration files      
├── lib   # library files      
└── log   # log files      

get freeRouter net-tools tarball
wget freerouter.nop.hu/rtr.tar
Extract the net-tools binaries into ~/freeRouter/bin
tar xvf rtr.tar -C ~/freeRouter/bin/

For those who would like to rebuild these binaries, the compilation shell script can be found in the cloned freeRouter git repository: ~/freeRouter/src/native/c.sh

FreeRouter uses 2 configuration files in order to run, let's write these configuration files for R1 in ~/freeRouter/etc

freeRouter hardware configuration file: pcap-freerouter-hw.txt
int eth0 eth 0000.1111.00fb 127.0.0.1 22710 127.0.0.1 22709
tcp2vrf 2323 v1 23
tcp2vrf 9080 v1 9080
freeRouter software configuration file: pcap-freerouter-sw.txt
hostname pcap-freerouter
buggy
!
vrf definition v1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth251]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@sdn1[enp0s3]
 mtu 9000
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet tel
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 1 0
 interconnect ethernet0
 vrf v1
 exit
!
end
Set up the P4Emu dataplane communication channel via a veth pair and adjust the interfaces (disable IPv6 at VM guest level, set MTU to 10240, disable TCP offload, etc.)
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6

ip link add veth251 type veth peer name veth250
ip link set veth250  up 
ip link set veth251  up 

ifconfig enp0s3 promisc
ifconfig veth250 promisc
ifconfig veth251 promisc

ip link set dev veth250 up mtu 10240
ip link set dev veth251 up mtu 10240
ip link set dev enp0s3 up mtu 10240
export TOE_OPTIONS="rx tx sg tso ufo gso gro lro rxvlan txvlan rxhash"

for TOE_OPTION in $TOE_OPTIONS; do
    /sbin/ethtool --offload veth250 "$TOE_OPTION" off &> /dev/null
    /sbin/ethtool --offload veth251 "$TOE_OPTION" off &> /dev/null
    /sbin/ethtool --offload enp0s3 "$TOE_OPTION" off &> /dev/null
done
Launch freeRouter with the supplied pcap-freerouter-hw.txt and pcap-freerouter-sw.txt, with a console prompt
java -jar lib/rtr.jar routersc etc/pcap-freerouter-hw.txt etc/pcap-freerouter-sw.txt
info cfg.cfgInit.doInit:cfgInit.java:556 booting
info cfg.cfgInit.doInit:cfgInit.java:680 initializing hardware
info cfg.cfgInit.doInit:cfgInit.java:687 applying defaults
info cfg.cfgInit.doInit:cfgInit.java:695 applying configuration
info cfg.cfgInit.doInit:cfgInit.java:721 done
welcome
line ready
pcap-freerouter#                   
launch freeRouter pcapInt in order to stitch control plane and P4Emu/pcap dataplane communication
cd ~/freeRouter/bin
./pcapInt.bin veth251 22709 127.0.0.1 22710 127.0.0.1
binded to local port 127.0.0.1 22709.
will send to 127.0.0.1 22710.
pcap version: libpcap version 1.8.1
opening interface veth251 with pcap1.x api
serving others
> 
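The hardware configuration line and the pcapInt command line have to mirror each other: each side sends to the port the other side listens on. Here is a small sketch that cross-checks the two. The field order of the `int` line (local address, local port, remote address, remote port) is an assumption inferred from the pcapInt output above, and the function names are hypothetical.

```python
def parse_hw_int(line):
    """Parse 'int eth0 eth <mac> <laddr> <lport> <raddr> <rport>'.

    The field order is an assumption inferred from the pcapInt output.
    """
    tok = line.split()
    return {"name": tok[1], "mac": tok[3],
            "local": (tok[4], int(tok[5])),
            "remote": (tok[6], int(tok[7]))}


def parse_pcapint(cmd):
    """Parse './pcapInt.bin <iface> <lport> <laddr> <rport> <raddr>'."""
    tok = cmd.split()
    return {"iface": tok[1],
            "local": (tok[3], int(tok[2])),
            "remote": (tok[5], int(tok[4]))}


def endpoints_match(hw, pcap):
    """True when each side sends to the address the other side listens on."""
    return hw["local"] == pcap["remote"] and hw["remote"] == pcap["local"]


hw = parse_hw_int("int eth0 eth 0000.1111.00fb 127.0.0.1 22710 127.0.0.1 22709")
pcap = parse_pcapint("./pcapInt.bin veth251 22709 127.0.0.1 22710 127.0.0.1")
print(endpoints_match(hw, pcap))
```

A mismatch here is a typical reason for ethernet0 staying silent, so this check is worth running before launching the dataplane.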
Launch P4Emu/pcap software dataplane
sudo ./p4emu.bin  127.0.0.1 9080 0 veth250 enp0s3
cpu port is #0 of 2...
pcap version: libpcap version 1.8.1
connecting 127.0.0.1 9080.
opening interface veth250.
opening interface enp0s3.
rx: 'myaddr4' 'add' '224.0.0.0' '4' '0' '1' '' 
rx: 'myaddr4' 'add' '255.255.255.255' '32' '0' '1' '' 
rx: 'myaddr6' 'add' 'ff00::' '8' '0' '1' '' 
rx: 'myaddr4' 'add' '192.168.0.0' '24' '-1' '1' '' 
rx: 'myaddr4' 'add' '192.168.0.131' '32' '-1' '1' '' 
rx: 'myaddr6' 'add' '2a01:e0a:159:2850::' '64' '-1' '1' '' 
rx: 'myaddr6' 'add' '2a01:e0a:159:2850::666' '128' '-1' '1' '' 
rx: 'myaddr6' 'add' 'fe80::' '64' '-1' '1' '' 
rx: 'mylabel4' 'add' '615589' '1' '' 
rx: 'mylabel6' 'add' '1036348' '1' '' 
rx: 'state' '1' '1' '0' '' 
rx: 'mtu' '1' '9000' '' 
rx: 'portvrf' 'add' '1' '1' '' 
rx: 'keepalive' '' 
rx: 'keepalive' '' 
rx: 'neigh6' 'add' '11120' 'fe80::224:d4ff:fea0:cd3' '00:24:d4:a0:0c:d3' '1' '00:72:3e:18:1b:6f' '1' '' 
rx: 'keepalive' '' 
rx: 'keepalive' '' 
rx: 'keepalive' '' 
rx: 'neigh4' 'add' '29738' '192.168.0.254' '00:24:d4:a0:0c:d3' '1' '00:72:3e:18:1b:6f' '1' '' 
rx: 'keepalive' '' 
rx: 'neigh4' 'add' '40470' '192.168.0.62' 'e0:3f:49:6d:18:99' '1' '00:72:3e:18:1b:6f' '1' '' 
rx: 'keepalive' '' 
rx: 'keepalive' '' 
rx: 'keepalive' '' 
rx: 'keepalive' '' 
rx: 'keepalive' '' 
rx: 'neigh6' 'add' '45820' '2a01:e0a:159:2850:e23f:49ff:fe6d:1899' 'e0:3f:49:6d:18:99' '1' '00:72:3e:18:1b:6f' '1' '' 
rx: 'keepalive' '' 
rx: 'neigh6' 'add' '49055' 'fe80::e23f:49ff:fe6d:1899' 'e0:3f:49:6d:18:99' '1' '00:72:3e:18:1b:6f' '1' '' 
rx: 'neigh6' 'add' '33334' '2a01:e0a:159:2850::
...
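The `rx:` lines above are the messages freeRouter sends to the dataplane. To get a quick overview of a long log, they can be split into tokens and counted per message type. This is only a sketch: the helper names are hypothetical and no meaning is attached to the individual fields beyond the first token.

```python
import re
from collections import Counter

TOKEN = re.compile(r"'([^']*)'")


def parse_rx(line):
    """Split one "rx: 'a' 'b' ..." log line into a list of tokens.

    Returns None for lines that are not rx messages. The trailing empty
    token printed at the end of each message is dropped.
    """
    if not line.startswith("rx:"):
        return None
    tokens = TOKEN.findall(line)
    if tokens and tokens[-1] == "":
        tokens.pop()
    return tokens


def summarize(lines):
    """Count rx messages by their first token (the message type)."""
    counts = Counter()
    for line in lines:
        tokens = parse_rx(line)
        if tokens:
            counts[tokens[0]] += 1
    return counts


sample = [
    "rx: 'myaddr4' 'add' '192.168.0.131' '32' '-1' '1' '' ",
    "rx: 'keepalive' '' ",
    "rx: 'neigh4' 'add' '29738' '192.168.0.254' '00:24:d4:a0:0c:d3' '1' '00:72:3e:18:1b:6f' '1' '' ",
    "opening interface enp0s3.",
]
print(summarize(sample))
```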

Verification

FreeRouter telnet access from Virtualbox VM guest via port 2323
telnet localhost 2323
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
welcome
line ready
pcap-freerouter#
freerouter running configuration
pcap-freerouter#term len 0                                                       
pcap-freerouter#sh run                                                           
hostname pcap-freerouter
buggy
!
vrf definition v1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth251]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@sdn1[enp0s3]
 mtu 9000
 macaddr 0072.3e18.1b6f
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet tel
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 1 0
 interconnect ethernet0
 vrf v1
 exit
!
end

Check control plane is communicating with P4Emu/pcap dataplane
pcap-freerouter#show interfaces summary                                          
interface  state  tx    rx      drop
ethernet0  up     8739  404545  0
sdn1       up     8535  400013  0
Ping IPv4 from freerouter -> LAN router gateway
pcap-freerouter#ping 192.168.0.254 /vrf v1                                       
pinging 192.168.0.254, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1011
Ping IPv4 from freerouter -> LAN server
pcap-freerouter#ping 192.168.0.62 /vrf v1                                        
pinging 192.168.0.62, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/2/1005 

Note that the first ICMP packet is lost in both cases: it triggered ARP learning for 192.168.0.254 and 192.168.0.62 respectively.
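The progress string printed by ping can be turned into numbers with a few lines of Python. This is a sketch under the assumption that `!` marks a received reply and `.` a probe without reply, which is what the outputs above suggest. The helper names are hypothetical and the percentage is rounded down.

```python
def ping_stats(progress):
    """Derive counters from a ping progress string such as '.!!!!'.

    Assumption: '!' is a received reply and '.' a probe without reply.
    Any other character is ignored.
    """
    received = progress.count("!")
    lost = progress.count(".")
    sent = received + lost
    percent = 100 * received // sent if sent else 0
    return {"sent": sent, "recv": received, "lost": lost, "percent": percent}


def first_reply_index(progress):
    """Index of the first answered probe, or -1 if none was answered."""
    return progress.find("!")


stats = ping_stats(".!!!!")
print(stats, first_reply_index(".!!!!"))
```

A first reply at index 1 means that only the very first probe was lost, which matches the address resolution happening on that first packet.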

IPv4 arp check
pcap-freerouter#sh ipv4 arp sdn1                                                 
mac             address        time      static
e03f.496d.1899  192.168.0.62   00:00:57  false    <----- Host server
0024.d4a0.0cd3  192.168.0.254  00:00:57  false    <----- LAN gateway
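The ARP table prints MAC addresses in dotted notation (e03f.496d.1899), while the dataplane log prints them colon-separated (e0:3f:49:6d:18:99). To compare both views, the table can be parsed and normalised as in the sketch below. The function names are hypothetical and the parser only relies on the column layout shown above.

```python
def dotted_to_colon(mac):
    """Convert 'e03f.496d.1899' to 'e0:3f:49:6d:18:99'."""
    digits = mac.replace(".", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def parse_arp_table(text):
    """Map IPv4 address -> colon-separated MAC from 'sh ipv4 arp' output."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # data rows start with a dotted MAC; the header row does not
        if len(fields) >= 2 and len(fields[0]) == 14 and fields[0].count(".") == 2:
            table[fields[1]] = dotted_to_colon(fields[0])
    return table


output = """mac             address        time      static
e03f.496d.1899  192.168.0.62   00:00:57  false
0024.d4a0.0cd3  192.168.0.254  00:00:57  false
"""
print(parse_arp_table(output))
```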
Ping IPv6 from freerouter -> LAN router
pcap-freerouter#ping 2a01:e0a:159:2850::1  /vrf v1                               
pinging 2a01:e0a:159:2850::1, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=0/1/2/1004
Ping IPv6 from freerouter -> Host server and SSH connection test
pcap-freerouter#ping 2a01:e0a:159:2850:e23f:49ff:fe6d:1899  /vrf v1              
pinging 2a01:e0a:159:2850:e23f:49ff:fe6d:1899, src=null, cnt=5, len=64, tim=1000, ttl=255, tos=0, sweep=false
.!!!!
result=80%, recv/sent/lost=4/5/1, rtt min/avg/max/total=1/1/1/1006

Note that the first ICMP packet is lost in both cases: it triggered IPv6 neighbor discovery for 2a01:e0a:159:2850::1 and 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 respectively.

IPv6 neighbor discovery check
pcap-freerouter#show ipv6 neighbors sdn1                                         
mac             address                                time      static  router
0024.d4a0.0cd3  2a01:e0a:159:2850::1                   00:00:53  false   false
e03f.496d.1899  2a01:e0a:159:2850:e23f:49ff:fe6d:1899  00:00:53  false   false
0024.d4a0.0cd3  fe80::224:d4ff:fea0:cd3                00:00:53  false   false
e03f.496d.1899  fe80::e23f:49ff:fe6d:1899              00:00:53  false   false
Initiate IPv4 ssh from freerouter -> Host server
pcap-freerouter#ssh 192.168.0.62 /vrf v1 /user my-nas                           
 - connecting to 192.168.0.62 22
password: *******
                
 - securing connection

Last login: Mon Jul  6 15:05:38 2020 from 192.168.0.77
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS% 


Initiate IPv6 ssh from freerouter -> Host server
pcap-freerouter#ssh 2a01:e0a:159:2850:e23f:49ff:fe6d:1899  /vrf v1 /user my-nas 
 - connecting to 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 22
password: *******
                
 - securing connection

Last login: Tue Jul  7 16:01:54 2020 from 2a01:e0a:159:2850::666
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS% 

Conclusion

In this article you:

  • had a demonstration of how to integrate freeRouter into a local area network (Similar to article #002)
  • However instead of using bmv2 or TOFINO we used a P4Emu/pcap dataplane
  • communication between freeRouter control plane and P4Emu/pcap is ensured by pcapInt via veth pair [ veth250 - veth251 ]
  • In this example the freeRouter with P4Emu/pcap has only 1 dataplane interface that is bound to enp0s3 VM interface exposed to the local network as a bridged interface

[ #005 ] RARE/FreeRouter-101 - key take-away

  • FreeRouter uses UNIX sockets to forward the packets dedicated to control plane + dataplane communication.

This essential paradigm is used to ensure communication between freeRouter and the P4Emu/pcap dataplane. It is ensured by the pcapInt binary from freeRouter net-tools, which binds the freeRouter socket (veth251@localhost:22710) to a virtual network interface (veth250@localhost:22709) connected to CPU_PORT 0.
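To make the socket pairing more tangible, here is a minimal sketch of two localhost endpoints exchanging one frame in each direction. It assumes that the sockets are UDP datagram sockets, which is what the pcapInt "binded to local port" / "will send to" output suggests. The names are hypothetical, ephemeral ports are used instead of 22709/22710 so the sketch does not clash with a running setup, and no real interface is involved.

```python
import socket


def make_endpoint():
    """Create a UDP socket bound to an ephemeral localhost port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2.0)
    return sock


def exchange(frame):
    """Send one frame in each direction between two localhost endpoints."""
    router = make_endpoint()   # stands in for freeRouter (22710 in the article)
    bridge = make_endpoint()   # stands in for pcapInt (22709 in the article)
    try:
        router.sendto(frame, bridge.getsockname())
        to_dataplane, _ = bridge.recvfrom(65535)
        # a real bridge would write the frame to the veth interface here
        bridge.sendto(to_dataplane, router.getsockname())
        back, _ = router.recvfrom(65535)
        return back
    finally:
        router.close()
        bridge.close()


frame = bytes.fromhex("00723e181b6f") + b"payload"
print(exchange(frame) == frame)
```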

  • freeRouter is the control plane for P4Emu/pcap dataplane

freeRouter does all the control plane route computation and sends write/modify/remove messages, so that entries are created/modified/removed accordingly in the P4Emu/pcap tables. Although the name is P4Emu, it does not emulate BMv2 V1Model.p4, but rather router.p4.

  • In this setup the combination of freeRouter/pcap delivers a solution for SOHO networks with 1GE links

However, a 1GE traffic rate requires 50% of one CPU thread. Nevertheless, the traffic rate achieved with P4Emu/pcap is higher than with freeRouter's native software packet forwarding.

In a subsequent article we will see how we can reduce this CPU requirement implied by P4Emu/pcap.



In the previous article #003 "Are you P4 compliant ?" we presented a setup where RARE/freeRouter was controlling a BMv2 P4 dataplane called simple_switch_grpc. In this article we replace the open source BMv2 target by a commercial virtual target provided by INTEL/BAREFOOT. As a side note, we will show that this setup can be integrated with real networks (with inherent software limitations).

Requirement

  • Basic Linux/Unix knowledge
  • Basic networking knowledge

Overview

I'm repeating the core message from #003. For those who are not familiar with data plane programming and especially with P4: "P4 is a domain-specific programming language for specifying the behaviour of the dataplanes of network-forwarding elements." (from p4.org). In short, it helps you to write a "program specifying how a switch processes packets".

Article objective

In this article we'll be using the freeRouter setup deployed in #003 and replace bmv2/simple_switch_grpc, which provided freeRouter's P4Lang dataplane, by INTEL BAREFOOT/bf_switchd. The effective dataplane is actually provided by the INTEL/BAREFOOT virtual bf_switchd model running the RARE P4 program called bf_router.p4.

Diagram

[ #004 ] - Cookbook

In our example we will use the OpenNetworkLinux KVM image (ONL9), which is the recommended build from INTEL/BAREFOOT for SDE-9.2.0, and we add a network interface bridged to our laptop RJ45 connection.

mkdir -p ~/freeRouter/bin ~/freeRouter/lib ~/freeRouter/etc ~/freeRouter/log
cd ~/freeRouter/lib
wget http://freerouter.nop.hu/rtr.jar
Resulting freeRouter directory layout
tree freeRouter
freeRouter
├── bin   # binary files      
├── etc   # configuration files      
├── lib   # library files      
└── log   # log files      

get freeRouter net-tools tarball
wget freerouter.nop.hu/rtr.tar
Extract freeRouter net-tools into ~/freeRouter/bin
tar xvf rtr.tar -C ~/freeRouter/bin/

For those who would like to rebuild these binaries, the compilation shell script can be found in the cloned freeRouter git repository at: ~/freeRouter/src/native/c.sh

In this section, you'll need access to the INTEL/BAREFOOT Software Development Environment. Research & Academia institutions can apply here in order to become a FASTER member and gain access to INTEL/BAREFOOT resources. You can find here a document describing the installation of the INTEL/BAREFOOT SDE on ONL for a WEDGE100BF32X system. In our case, we are setting up the following environment:

  • ONL9 as VM guest with 8192 MB of RAM and 2 vCPUs
  • SDE 9.2.0
  • VirtualBox is running on MACOSX host

Just for the sake of example, SDE 9.2.0 is installed in root home directory:

SDE installation environment
export SDE=/root/bf-sde-9.2.0
export SDE_INSTALL=/root/bf-sde-9.2.0/install
export PATH=$PATH:$SDE_INSTALL/bin:$SDE/tools

TOFINO RARE bitbucket is a private repository. It is currently being reworked in order to make it public as per INTEL/BAREFOOT decision to make P4 code related to TOFINO architecture public. (It is thus inaccessible for now but will be opened to the public soon.)

Clone RARE code from repository
cd ~/
git clone https://bitbucket.software.geant.org/scm/rare/rare.git
compile RARE bf_router.p4
p4_build.sh -I /root/rare/p4src/ -DHAVE_MPLS /root/rare/p4src/bf_router.p4 
Using SDE          /root/bf-sde-9.2.0
Using SDE_INSTALL /root/bf-sde-9.2.0/install
Using SDE version bf-sde-9.2.0

OS Name: Ubuntu 18.04.4 LTS
This system has 8GB of RAM and 1 CPU(s)
Parallelization:  Recommended: -j1   Actual: -j1

Compiling for p4_16/tna
P4 compiler path:    /root/bf-sde-9.2.0/install/bin/p4c
P4 compiler version: 9.2.0 (SHA: 639d9ec) (p4c-based)
Build Dir: /root/bf-sde-9.2.0/build/p4-build/bf_router
 Logs Dir: /root/bf-sde-9.2.0/logs/p4-build/bf_router

  Building bf_router        CLEAR CONFIGURE MAKE INSTALL ... DONE

FreeRouter uses two configuration files in order to run. Let's write these configuration files for R1 in ~/freeRouter/etc

freeRouter hardware configuration file: tna-freerouter-hw.txt
int eth0 eth 0000.1111.00fb 127.0.0.1 22710 127.0.0.1 22709
tcp2vrf 2323 v1 23
tcp2vrf 9080 v1 9080
freeRouter software configuration file: tna-freerouter-sw.txt
hostname tna-freerouter
buggy
!
!
vrf definition v1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth251]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@sdn1[enp0s3]
 mtu 9000
 macaddr 0072.3e18.1b6f
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet tel
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 10
 interconnect ethernet0
 vrf v1
 exit
!
client tcp-checksum transmit
!
end
Set up the bf_switchd dataplane communication channel via a veth pair and adjust the interfaces (disable IPv6 at VM guest level, MTU 10240, disable TCP offload, etc.)
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6

ip link add veth251 type veth peer name veth250
ip link set veth250  up 
ip link set veth251  up 

ifconfig enp0s3 promisc
ifconfig veth250 promisc
ifconfig veth251 promisc

ip link set dev veth250 up mtu 10240
ip link set dev veth251 up mtu 10240
ip link set dev enp0s3 up mtu 10240
export TOE_OPTIONS="rx tx sg tso ufo gso gro lro rxvlan txvlan rxhash"

for TOE_OPTION in $TOE_OPTIONS; do
    /sbin/ethtool --offload veth250 "$TOE_OPTION" off &> /dev/null
    /sbin/ethtool --offload veth251 "$TOE_OPTION" off &> /dev/null
    /sbin/ethtool --offload enp0s3 "$TOE_OPTION" off &> /dev/null
done
freeRouter launch with supplied tna-freerouter-hw.txt and tna-freerouter-sw.txt with a console prompt
java -jar lib/rtr.jar routersc etc/tna-freerouter-hw.txt etc/tna-freerouter-sw.txt
info cfg.cfgInit.doInit:cfgInit.java:556 booting
info cfg.cfgInit.doInit:cfgInit.java:680 initializing hardware
info cfg.cfgInit.doInit:cfgInit.java:687 applying defaults
info cfg.cfgInit.doInit:cfgInit.java:695 applying configuration
info cfg.cfgInit.doInit:cfgInit.java:721 done
welcome
line ready
tna-freerouter#
launch freeRouter pcapInt in order to stitch control plane and P4 bf_switchd dataplane communication
cd ~/freeRouter/bin
./pcapInt.bin veth251 22709 127.0.0.1 22710 127.0.0.1
binded to local port 127.0.0.1 22709.
will send to 127.0.0.1 22710.
pcap version: libpcap version 1.8.1
opening interface veth251 with pcap1.x api
serving others
> 
Create bf_switchd RARE running environment
mkdir -p ~/rare-run/etc ~/rare-run/logs ~/rare-run/mibs ~/rare-run/snmp
create a custom ports.json file for bf_switchd model
cat ~/rare-run/etc/ports.json
{
    "PortToIf" : [
        { "device_port" :  0, "if" : "enp0s3" },
        { "device_port" : 64, "if" : "veth250" }
    ]
}
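If the port mapping changes often, the ports.json content can be generated instead of being edited by hand. Below is a minimal sketch: the helper name `build_ports_config` is hypothetical, while the keys `PortToIf`, `device_port` and `if` are the ones shown in the file above. Writing the result to ~/rare-run/etc/ports.json is left to the reader.

```python
import json


def build_ports_config(port_map):
    """Build the ports.json structure from a {device_port: interface} dict."""
    return {"PortToIf": [{"device_port": port, "if": iface}
                         for port, iface in sorted(port_map.items())]}


# Same mapping as in the article: port 0 is the LAN side, port 64 the CPU port.
config = build_ports_config({0: "enp0s3", 64: "veth250"})
print(json.dumps(config, indent=4))
```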
Run the TOFINO model in quiet mode with bf_router as program; log files (if any) will be written to ~/rare-run/logs
cd $SDE
./run_tofino_model.sh -p bf_router -f ~/rare-run/etc/ports.json --log-dir ~/rare-run/logs/ -q
Run bf_switchd (logs will be in ~/rare-run/logs)
cd ~/rare-run/logs
 $SDE/run_switchd.sh -p bf_router
Launch RARE bf_forwarder.py (BfRuntime GRPC based interface)
cd ~/rare/bfrt_python/
./bf_forwarder.py --ifmibs-dir ~/rare-run/mibs/ --ifindex ~/rare-run/snmp/ifindex
bf_forwarder.py running on: MODEL
GRPC_ADDRESS: 127.0.0.1:50052
P4_NAME: bf_router
CLIENT_ID: 0
Subscribe attempt #1
Subscribe response received 0
Received bf_router on GetForwarding
Binding with p4_name bf_router
Binding with p4_name bf_router successful!!
BfForwarder - loop
  Clearing Table pipe.ig_ctl.ig_ctl_mpls.tbl_mpls_fib
  Clearing Table pipe.ig_ctl.ig_ctl_acl_in.tbl_ipv6_acl
BfIfSnmpClient - main
BfIfSnmpClient - No active ports
  Clearing Table pipe.ig_ctl.ig_ctl_ipv4.tbl_ipv4_fib_host
  Clearing Table pipe.ig_ctl.ig_ctl_copp.tbl_ipv6_copp
  Clearing Table pipe.ig_ctl.ig_ctl_acl_in.tbl_ipv4_acl
  Clearing Table pipe.ig_ctl.ig_ctl_ipv6.tbl_ipv6_fib_host
  Clearing Table pipe.ig_ctl.ig_ctl_mpls.tbl_mpls_fib_decap
  Clearing Table pipe.ig_ctl.ig_ctl_nexthop.tbl_nexthop
  Clearing Table pipe.ig_ctl.ig_ctl_vlan_out.tbl_vlan_out
  Clearing Table pipe.ig_ctl.ig_ctl_vlan_in.tbl_vlan_in
  Clearing Table pipe.ig_ctl.ig_ctl_acl_out.tbl_ipv6_acl
  Clearing Table pipe.ig_ctl.ig_ctl_ipv4.tbl_ipv4_fib_lpm
  Clearing Table pipe.ig_ctl.ig_ctl_acl_out.tbl_ipv4_acl
  Clearing Table pipe.ig_ctl.ig_ctl_vrf.tbl_vrf
  Clearing Table pipe.ig_ctl.ig_ctl_copp.tbl_ipv4_copp
  Clearing Table pipe.ig_ctl.ig_ctl_ipv6.tbl_ipv6_fib_lpm
  Clearing Table pipe.ig_ctl.ig_ctl_bridge.tbl_bridge_target
  Clearing Table pipe.ig_ctl.ig_ctl_bridge.tbl_bridge_learn
Bundle specific clearing: (Order matters)
  Clearing Bundle Table pipe.ig_ctl.ig_ctl_bundle.tbl_nexthop_bundle
  Clearing Bundle Table pipe.ig_ctl.ig_ctl_bundle.ase_bundle
  Clearing Bundle Table pipe.ig_ctl.ig_ctl_bundle.apr_bundle
BfForwarder - Main
BfForwarder - Entering message loop
rx: ['myaddr4_add', '224.0.0.0/4', '0', '1', '\n']
BfIfStatus - main
BfIfStatus - No active ports
rx: ['myaddr4_add', '255.255.255.255/32', '0', '1', '\n']
BfSubIfCounter - main
BfSubIfCounter - No active ports
rx: ['myaddr6_add', 'ff00::/8', '0', '1', '\n']
rx: ['myaddr4_add', '192.168.0.0/24', '-1', '1', '\n']
rx: ['myaddr4_add', '192.168.0.131/32', '-1', '1', '\n']
rx: ['myaddr6_add', '2a01:e0a:159:2850::/64', '-1', '1', '\n']
rx: ['myaddr6_add', '2a01:e0a:159:2850::666/128', '-1', '1', '\n']
rx: ['myaddr6_add', 'fe80::/64', '-1', '1', '\n']
rx: ['mylabel4_add', '186286', '1', '\n']
rx: ['mylabel6_add', '842368', '1', '\n']
rx: ['state', '0', '1', '10', '\n']
rx: ['mtu', '0', '9000', '\n']
rx: ['portvrf_add', '0', '1', '\n']
rx: ['neigh6_add', '20989', 'fe80::224:d4ff:fea0:cd3', '00:24:d4:a0:0c:d3', '1', '00:72:3e:18:1b:6f', '0', '\n']
BfIfSnmpClient - added stats for port 0
rx: ['keepalive', '\n']
rx: ['neigh4_add', '29777', '192.168.0.254', '00:24:d4:a0:0c:d3', '1', '00:72:3e:18:1b:6f', '0', '\n']
rx: ['keepalive', '\n']
rx: ['neigh6_add', '25745', 'fe80::bc6a:83ad:7897:8461', '00:13:46:3c:a9:4f', '1', '00:72:3e:18:1b:6f', '0', '\n']
rx: ['keepalive', '\n']
rx: ['neigh6_add', '41106', 'fe80::e23f:49ff:fe6d:1899', 'e0:3f:49:6d:18:99', '1', '00:72:3e:18:1b:6f', '0', '\n']
rx: ['keepalive', '\n']
rx: ['neigh6_add', '35111', '2a01:e0a:159:2850:e23f:49ff:fe6d:1899', 'e0:3f:49:6d:18:99', '1', '00:72:3e:18:1b:6f', '0', '\n']
rx: ['keepalive', '\n']
rx: ['neigh6_del', '25745', 'fe80::bc6a:83ad:7897:8461', '00:13:46:3c:a9:4f', '1', '00:72:3e:18:1b:6f', '0', '\n']
rx: ['keepalive', '\n']
rx: ['neigh6_add', '20371', 'fe80::bc6a:83ad:7897:8461', '00:13:46:3c:a9:4f', '1', '00:72:3e:18:1b:6f', '0', '\n']
rx: ['keepalive', '\n']
...
rx: ['keepalive', '\n']
rx: ['neigh4_add', '34182', '192.168.0.62', 'e0:3f:49:6d:18:99', '1', '00:72:3e:18:1b:6f', '0', '\n']
...
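The `rx:` lines printed by bf_forwarder.py are Python lists, so a saved log can be replayed offline to see which neighbors end up installed. The sketch below is not part of bf_forwarder.py and its function names are hypothetical. It assumes, based on the excerpt above, that field 2 of a `neigh*` message is the neighbor address and field 3 its MAC address.

```python
import ast


def parse_forwarder_rx(line):
    """Turn "rx: ['neigh4_add', ...]" into a list of fields, else None."""
    if not line.startswith("rx: ["):
        return None
    fields = ast.literal_eval(line[len("rx: "):])
    # drop the trailing newline field printed with every message
    return [f for f in fields if f != "\n"]


def apply_neighbors(lines):
    """Replay neigh*_add / neigh*_del messages into an {address: mac} table.

    Assumption: field 2 is the neighbor address, field 3 its MAC address.
    """
    table = {}
    for line in lines:
        msg = parse_forwarder_rx(line)
        if not msg or not msg[0].startswith("neigh"):
            continue
        address, mac = msg[2], msg[3]
        if msg[0].endswith("_add"):
            table[address] = mac
        elif msg[0].endswith("_del"):
            table.pop(address, None)
    return table


sample = [
    r"rx: ['neigh6_add', '25745', 'fe80::bc6a:83ad:7897:8461', '00:13:46:3c:a9:4f', '1', '00:72:3e:18:1b:6f', '0', '\n']",
    r"rx: ['keepalive', '\n']",
    r"rx: ['neigh6_del', '25745', 'fe80::bc6a:83ad:7897:8461', '00:13:46:3c:a9:4f', '1', '00:72:3e:18:1b:6f', '0', '\n']",
    r"rx: ['neigh4_add', '29777', '192.168.0.254', '00:24:d4:a0:0c:d3', '1', '00:72:3e:18:1b:6f', '0', '\n']",
    "BfIfStatus - main",
]
print(apply_neighbors(sample))
```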

Verification

FreeRouter telnet access from Virtualbox VM guest via port 2323
telnet localhost 2323
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
welcome
line ready
tna-freerouter#
freerouter running configuration
tna-freerouter#term len 0                                                       
tna-freerouter#sh run                                                           
hostname tna-freerouter
buggy
!
!
vrf definition v1
 exit
!
interface ethernet0
 description freerouter@P4_CPU_PORT[veth251]
 no shutdown
 no log-link-change
 exit
!
interface sdn1
 description freerouter@sdn1[enp0s3]
 mtu 9000
 macaddr 0072.3e18.1b6f
 vrf forwarding v1
 ipv4 address 192.168.0.131 255.255.255.0
 ipv6 address 2a01:e0a:159:2850::666 ffff:ffff:ffff:ffff::
 ipv6 enable
 no shutdown
 no log-link-change
 exit
!
!
!
!
!
!
!
!
!
!
!
!
!
!
server telnet tel
 security protocol telnet
 no exec authorization
 no login authentication
 vrf v1
 exit
!
server p4lang p4
 export-vrf v1 1
 export-port sdn1 0 10
 interconnect ethernet0
 vrf v1
 exit
!
client tcp-checksum transmit
!
end
Check control plane is communicating with bf_switchd p4 dataplane
tna-freerouter#sh int sum                                                       
interface  state  tx     rx         drop
ethernet0  up     89955  128007451  0
sdn1       up     87291  127572417  0
Ping IPv4 from freerouter -> LAN router gateway
tna-freerouter#ping 192.168.0.254 /vrf v1 /repeat 11111                         
pinging 192.168.0.254, src=null, cnt=11111, len=64, tim=1000, ttl=255, tos=0, sweep=false
..!........!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*
result=95%, recv/sent/lost=197/207/10, rtt min/avg/max/total=27/54/645/20764

The output above shows packet loss. As soon as bf_switchd port 0 is bridged to enp0s3 on the local area network, it receives a lot of packets. All the packets received via enp0s3 have to be queued and processed by the bf_switchd model, which increases the processing delay.
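To quantify the loss, the `result=` summary line can be parsed and checked for internal consistency. This is a small sketch with hypothetical names. The regular expression follows the summary format shown above, and the assumption that the percentage is recv/sent rounded down is mine.

```python
import re

RESULT = re.compile(
    r"result=(\d+)%, recv/sent/lost=(\d+)/(\d+)/(\d+), "
    r"rtt min/avg/max/total=(\d+)/(\d+)/(\d+)/(\d+)")


def parse_result(line):
    """Parse a ping 'result=...' summary line into a dict, or return None."""
    match = RESULT.search(line)
    if match is None:
        return None
    keys = ("percent", "recv", "sent", "lost",
            "rtt_min", "rtt_avg", "rtt_max", "rtt_total")
    return dict(zip(keys, map(int, match.groups())))


def is_consistent(result):
    """recv + lost must equal sent; percent is assumed to be rounded down."""
    return (result["recv"] + result["lost"] == result["sent"]
            and result["percent"] == 100 * result["recv"] // result["sent"])


line = ("result=95%, recv/sent/lost=197/207/10, "
        "rtt min/avg/max/total=27/54/645/20764")
parsed = parse_result(line)
print(parsed, is_consistent(parsed))
```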

IPv4 arp check
tna-freerouter#sh ipv4 arp sdn1                                                 
mac             address        time      static
e03f.496d.1899  192.168.0.62   00:05:27  false    <----- Host server
9ceb.e8d5.2c51  192.168.0.77   00:05:27  false    <----- VM guest bridged IP
0024.d4a0.0cd3  192.168.0.254  00:01:27  false    <----- LAN gateway

Ping IPv6 from freerouter -> Host server and SSH connection test
tna-freerouter#..1:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1 /repeat 111111      
pinging 2a01:e0a:159:2850:e23f:49ff:fe6d:1899, src=null, cnt=111111, len=64, tim=1000, ttl=255, tos=0, sweep=false
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!*
result=100%, recv/sent/lost=89/89/0, rtt min/avg/max/total=30/50/600/4467
IPv6 neighbor discovery check
tna-freerouter#show ipv6 neighbors sdn1                                         
mac             address                                time      static  router
0024.d4a0.0cd3  2a01:e0a:159:2850::1                   00:00:26  false   false    <----- LAN gateway
e03f.496d.1899  2a01:e0a:159:2850:e23f:49ff:fe6d:1899  00:03:26  false   false    <----- Host server
0024.d4a0.0cd3  fe80::224:d4ff:fea0:cd3                00:01:26  false   false
e03f.496d.1899  fe80::e23f:49ff:fe6d:1899              00:02:26  false   false
Initiate IPv4 ssh from freerouter -> Host server
tna-freerouter#ssh 192.168.0.62 /vrf v1 /user my-nas                           
 - connecting to 192.168.0.62 22
password: *******
                
 - securing connection

Last login: Fri Jul  3 10:57:02 2020 from 192.168.0.66
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS% 


Initiate IPv6 ssh from freerouter -> Host server
tna-freerouter#..:e0a:159:2850:e23f:49ff:fe6d:1899 /vrf v1 /user my-nas        
 - connecting to 2a01:e0a:159:2850:e23f:49ff:fe6d:1899 22
password: *******
                
 - securing connection

Last login: Mon Jul  6 11:05:31 2020 from 192.168.0.131
FreeBSD 11.3-RELEASE-p9 (FreeNAS.amd64) #0 r325575+588899735f7(HEAD): Mon Jun  1 15:04:31 EDT 2020

        FreeNAS (c) 2009-2020, The FreeNAS Development Team
        All rights reserved.
        FreeNAS is released under the modified BSD license.

        For more information, documentation, help or support, go here:
        http://freenas.org
Welcome to FreeNAS
MY-NAS% 

Conclusion

In this article you:

  • had a demonstration of how to integrate freeRouter into a local area network (Similar to article #002)
  • However, instead of using bmv2 we used an INTEL/BAREFOOT P4 dataplane called: TOFINO (bf_switchd)
  • TOFINO bf_switchd target is running RARE bf_router.p4
  • communication between freeRouter control plane and TOFINO is ensured by pcapInt via veth pair [ veth250 - veth251 ]
  • This communication is possible via RARE bf_forwarder.py based on GRPC P4Lang BfRuntime python binding
  • In this example the TOFINO bf_switchd P4 virtual switch model has only 1 dataplane interface that is bound to enp0s3 VM interface exposed to the local network as a bridged interface

[ #004 ] RARE/FreeRouter-101 - key take-away

  • FreeRouter uses UNIX sockets to forward the packets dedicated to control plane + dataplane communication.

This essential paradigm is used to ensure communication between freeRouter and the TOFINO bf_switchd P4 dataplane. It is ensured by the pcapInt binary from freeRouter net-tools, which binds the freeRouter socket (veth251@localhost:22710) to a virtual network interface (veth250@localhost:22709) connected to CPU_PORT 64.

  • freeRouter control plane and dataplane communication is enabled by RARE bf_forwarder.py 

bf_forwarder.py is a simple python script based on GRPC client BfRuntime python library.

freeRouter does all the control plane route computation and sends write/modify/remove messages via BfRuntime, so that P4 entries are created/modified/removed accordingly in the P4 tables.

While the TOFINO bf_switchd virtual model target is a very good choice for validating packet processing algorithms on the TOFINO platform, the virtual model is not a target for production use. We will see in the next articles how we can reach the TREMENDOUS traffic throughput required by Internet Service Provider use cases. Indeed, while the model lets us validate algorithm accuracy, the traffic throughput achieved is very low. (I could barely get the setup described above working.)

In a subsequent article we will demonstrate how RARE/freeRouter and the TOFINO TNA architecture can be combined to create a service provider/carrier grade router that is technically able to switch 3.3 Tbps of traffic (line rate) using the EdgeCore WEDGE100BF32X hardware switch.

The most powerful programmable switching ASIC of the TOFINO family can switch 6.5 Tbps of traffic. Our WEDGE100BF32X switches are powered by that ASIC's little brother, which handles 3.3 Tbps of line rate traffic.