TCP Flow Control

Note: This topic describes the Reno enhancement of classical "Van Jacobson" or Tahoe congestion control. There have been many suggestions for improving this mechanism - see the topic on high-speed TCP variants.

TCP flow control and window size adjustment is mainly based on two key mechanism: Slow Start and Additive Increase/Multiplicative Decrease (AIMD), also known as Congestion Avoidance. (RFC 793 and RFC 5681)

Slow Start

To avoid that a starting TCP connection floods the network, a Slow Start
mechanism was introduced in TCP. This mechanism effectively
probes to find the available bandwidth.

In addition to the window advertised by the receiver, a Congestion Window ( cwnd=) value is used and the effective window size is the lesser of the two. The starting value of the =cwnd window is set initially to a value that has been evolving over the years, the TCP Initial Window. After each acknowledgment, the cwnd window is increased by one MSS. By this algorithm, the data rate of the sender doubles each round-trip time (RTT) interval (actually, taking into account Delayed ACKs, rate increases by 50% every RTT). For a properly implemented version of TCP this increase continues until:

the advertised window size is reached
congestion (packet loss) is detected on the connection.
there is no traffic waiting to take advantage of an increased window (i.e. cwnd should only grow if it needs to)

When congestion is detected, the TCP flow-control mode is changed from Slow Start to Congestion Avoidance. Note that some TCP implementations maintain cwnd in units of bytes, while others use units of full-sized segments.

Congestion Avoidance

Once congestion is detected (through timeout and/or duplicate ACKs), the data rate is reduced in order to let the network recover.

Slow Start uses an exponential increase in window size and thus also in data rate. Congestion Avoidance uses a linear growth function (additive increase). This is achieved by introducing - in addition to the cwnd window - a slow start threshold (ssthresh).

As long as cwnd is less than ssthresh , Slow Start applies. Once ssthresh is reached, cwnd is increased by at most one segment per RTT. The cwnd window continues to open with this linear rate until a congestion event is detected.

When congestion is detected, ssthresh is set to half the cwnd (or to be strictly accurate, half the "Flight Size". This distinction is important if the implementation lets cwnd grow beyond rwnd (the receiver's declared window)).
cwnd is either set to 1 if congestion was signalled by a timeout, forcing the sender to enter Slow Start, or to ssthresh if congestion was signalled by duplicate ACKs and the Fast Recovery algorithm has terminated. In either case, once the sender enters Congestion Avoidance, its rate has been reduced to half the value at the time of congestion. This multiplicative decrease causes the cwnd to close exponentially with each detected loss event.

Fast Retransmit

In Fast Retransmit, the arrival of three duplicate ACKs is interpreted as packet loss, and retransmission starts before the retransmission timer (RTO) expires.

The missing segment will be retransmitted immediately without going through the normal retransmission queue processing. This improves performance by eliminating delays that would suspend effective data flow on the link.

Fast Recovery

Fast Recovery is used to react quickly to a single packet loss. In Fast recovery, the receipt of 3 duplicate ACKs, while being taken to mean a loss of a segment, does not result in a full Slow Start. This is because obviously later segments got through, and hence congestion is not stopping everything. In fast recovery, ssthresh is set to half of the current send window size, the missing segment is retransmitted (Fast Retransmit) and cwnd is set to ssthresh plus three segments. Each additional duplicate ACK indicates that one segment has left the network at the receiver and cwnd is increased by one segment to allow the transmission of another segment if allowed by the new cwnd . When an ACK is received for new data, cwmd is reset to the ssthresh , and TCP enters congestion avoidance mode.

References

Congestion Avoidance and Control, V. Jacobson, Computer Communication Review, vol. 18, no. 4, pp. 314-329, August 1988, ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z
TCP Congestion Control, RFC 5681, M. Allman, V. Paxson, E. Blanton, September 2009
Congestion Control in the RFC Series, RFC 5783, M. Welzl, W. Eddy, February 2010
Computing TCP's Retransmission Timer, RFC 6298, V. Paxson, M. Allman, J. Chu, M. Sargent, June 2011
Congestion Control in Linux TCP, P. Sarolahti, A. Kuznetsov, USENIX Annual Technical Conference 2002, Freenix Track
The Great Internet TCP Congestion Control Census, A. Mishra, X. Sun, A. Jain, S. Pande, R. Joshi, B. Leong, ACM SIGMETRICS, December 2019 (PDF, presentation video, Gordon code)

– Main.UlrichSchmid - 07 Jun 2005
– Main.SimonLeinen - 27 Jan 2006 - 20 Jun 2020