Skip to end of metadata
Go to start of metadata

BBR is a new algorithm for TCP Congestion Control. It was tested in Google's data center networks as well as on some of their public-facing Web servers including and

YouTube. It strives to optimize both throughput and latency/RTT by estimating the bottleneck bandwidth and RTT to compute a pacing rate. One goal — one that sets it apart from most traditional TCP variants — is to avoid filling up the bottleneck buffer, which might induce Bufferbloat.


A description of the algorithm was published in the September/October 2016 issue of ACM Queue. An implementation in the Linux kernel has been proposed as a patch. Dave Taht posted a preliminary evaluation ("a quick look...") on his blog. An good description of the BBR's motivation and approach is included in the proposed kernel patch (see below).

In the STARTUP phase, BBR tries to quickly approximate the bottleneck bandwidth. It does so by increasing the sending rate until the estimated bottleneck bandwidth stops growing.

Bottleneck bandwidth is estimated from the amount of data ACKed over a given period, filtered through a "windowed max-filter".

In the DRAIN phase, the sending (pacing) rate is reduced to get rid of the queue that BBR estimates to have created while probing the bottleneck bandwidth during STARTUP.

In steady state, BBR will pace to the estimated bottleneck bandwidth. Periodically it tries to improve its network model by doing probing:

PROBE_BW mode: BBR probes for a bandwidth increase at the bottleneck by increasing the pacing rate, then decreasing the rate to remove temporary queuing in case the bottleneck bandwidth hasn't grown.

PROBE_RTT mode: RTT is filtered through a windowed min-filter. Sometimes the algorithm will reduce the pacing rate to better approximate the base RTT in case queueing ("bufferbloat") is in effect.

State Diagram

A state diagram has been included in the Linux kernel source as a patch. I quote:

Here is a state transition diagram for BBR:

   +---> STARTUP  ----+
   |        |         |
   |        V         |
   |      DRAIN   ----+
   |        |         |
   |        V         |
   +---> PROBE_BW ----+
   |      ^    |      |
   |      |    |      |
   |      +----+      |
   |                  |
   +---- PROBE_RTT <--+

 A BBR flow starts in STARTUP, and ramps up its sending rate quickly.
 When it estimates the pipe is full, it enters DRAIN to drain the queue.
 In steady state a BBR flow only uses PROBE_BW and PROBE_RTT.
 A long-lived BBR flow spends the vast majority of its time remaining
 (repeatedly) in PROBE_BW, fully probing and utilizing the pipe's bandwidth
 in a fair manner, with a small, bounded queue. If a flow has been
 continuously sending for the entire min_rtt window, and hasn't seen an RTT
 sample that matches or decreases its min_rtt estimate for 10 seconds, then
 it briefly enters PROBE_RTT to cut inflight to a minimum value to re-probe
 the path's two-way propagation delay (min_rtt). When exiting PROBE_RTT, if
 we estimated that we reached the full bw of the pipe then we enter PROBE_BW;
 otherwise we enter STARTUP to try to fill the pipe.


Linux kernel TCP implementation

Linux 4.9 and above: An implementation for the Linux kernel was submitted and merged in September 2016. This initial kernel implementation relied on a scheduler that is capable of pacing, such as the fq scheduler.

Linux 4.13 and above: In May 2017, Éric Dumazet submitted a patch to implement pacing in TCP itself, removing the dependency on the fq scheduler. This makes BBR is simpler to enable, and allows its use together with other schedulers (such as the popular fq_codel).

TCP_CC_INFO instrumentation

BBR exports its bandwidth and RTT estimates using the getsockopt(TCP_CC_INFO) interface, see struct tcp_bbr_info . User-space applications can use this information as hints for adapting their use of a given connection, e.g. by selecting appropriate audio or video encodings.

Other implementations

According to the BBR update from IETF 99 (see reference below), Netflix is working on an implementation of BBR for FreeBSD.

BBR has also been implemented for QUIC, and is in active use on Google's servers.

Mark Claypool has written an implementation of BBR for ns-3, a popular network simulator.

History/Related Work

There is another instance of "BBR TCP", namely Bufferbloat Resistant TCP proposed by Michael Fitz Nowlan in his 2014 Ph.D. thesis (Yale library catalog entry, PDF). It shares the basic goals of (Google's) BBR TCP, namely to detect bottleneck rate and to avoid bufferbloat. But it uses a delay-based measurement approach rather than direct measurement of the amount of ACKed data per time, and doesn't seem to make use of pacing. Despite these differences, the link between the two BBR TCPs seems more than just a coincidental acronym collision: Fitz Nowlan is acknowledged in the Queue/CACM papers, and conversely acknowledges several of the "Google" BBR authors in his thesis.


– SimonLeinen

  • No labels