NERSC Case Study: Stateful Firewall
This case study was presented at Joint Techs 2006 (Albuquerque, NM) by Brian Draney of NERSC. NERSC is the US DOE's scientific computing centre, with ~20 TFlops of processing power and 8.8 PB of storage. It uses a 10GE LAN backbone and connects to ESnet at 10 Gbps.
The symptom was a sluggish transfer of data between two end hosts.
For a specific IP flow, original packets did not seem to be getting through, but all the re-transmits were. In the Xplot below all the red points represent re-transmitted packets.
The sender's route table showed that the correct PMTU (Path Maximum Transmission Unit) was being used for the destination, but tcpdump showed 64 kB packets leaving a 9 kB-capable interface.
It was determined that a Large Send Offload (LSO) NIC was being used, and that it was not honouring the path MTU (because it did not appear to have access to the host's routing table). Over-sized packets were being sent and these were being dropped. However, the re-transmitted packets were handled by the host's kernel rather than the LSO engine; these did honour the PMTU and so did get through.
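The behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not a reconstruction of the NERSC setup: the case study does not give the actual path MTU, so the 1500-byte value is hypothetical, and the model assumes the LSO engine cuts the 64 kB buffer into packets sized to the 9000-byte interface MTU. The names `Packet`, `segment`, `delivered` and `simulate` are invented for this illustration.

```python
from dataclasses import dataclass

IP_TCP_HEADERS = 40  # assumed IPv4 + TCP header bytes, no options


@dataclass
class Packet:
    seq: int          # byte offset of the first payload byte
    size: int         # IP packet size in bytes (headers + payload)
    retransmit: bool  # True if built by the kernel after a loss


def segment(payload_len, mtu, start_seq=0, retransmit=False):
    """Split payload_len bytes into packets no larger than mtu."""
    mss = mtu - IP_TCP_HEADERS
    packets = []
    for offset in range(0, payload_len, mss):
        payload = min(mss, payload_len - offset)
        packets.append(
            Packet(start_seq + offset, payload + IP_TCP_HEADERS, retransmit)
        )
    return packets


def delivered(packet, path_mtu):
    """In this model, a packet larger than the path MTU is dropped."""
    return packet.size <= path_mtu


def simulate(buffer_len, interface_mtu, path_mtu):
    """Return (originals, retransmits) for one large send."""
    # Assumed LSO behaviour: segment to the interface MTU, ignoring the PMTU.
    originals = segment(buffer_len, interface_mtu)
    retransmits = []
    for pkt in originals:
        if not delivered(pkt, path_mtu):
            # The kernel re-sends the lost bytes, segmented to the PMTU.
            lost_payload = pkt.size - IP_TCP_HEADERS
            retransmits += segment(lost_payload, path_mtu, pkt.seq, True)
    return originals, retransmits


if __name__ == "__main__":
    # 64 kB send, 9000-byte interface MTU, hypothetical 1500-byte path MTU.
    originals, retransmits = simulate(65536, 9000, 1500)
    for label, pkts in (("original", originals), ("retransmit", retransmits)):
        ok = sum(delivered(p, 1500) for p in pkts)
        print(f"{label}: {len(pkts)} sent, {ok} delivered")
```

In this model every original packet exceeds the path MTU and is lost, while every kernel-built retransmission fits and is delivered, which matches the pattern in the trace where only the re-transmits got through.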
– Main.TobyRodwell - 16 Feb 2006