TCP Flow Control
Note: This topic describes the Reno enhancement of classical "Van Jacobson" or Tahoe congestion control. There have been many suggestions for improving this mechanism - see the topic on high-speed TCP variants.
TCP flow control and window size adjustment are mainly based on two key mechanisms: Slow Start and Additive Increase/Multiplicative Decrease (AIMD), also known as Congestion Avoidance (see RFC 793 and RFC 5681).
To prevent a newly started TCP connection from flooding the network, a Slow Start mechanism was introduced in TCP. This mechanism effectively probes the path to find the available bandwidth.
In addition to the window advertised by the receiver, a Congestion Window (cwnd) value is used, and the effective window size is the lesser of the two. The starting value of cwnd, known as the TCP Initial Window, has evolved over the years. After each acknowledgment, cwnd is increased by one MSS. With this algorithm, the sender's data rate doubles every round-trip time (RTT); taking Delayed ACKs into account, the rate actually increases by about 50% every RTT. For a properly implemented version of TCP, this increase continues until:
- the advertised window size is reached
- congestion (packet loss) is detected on the connection
- there is no traffic waiting to take advantage of an increased window (i.e. cwnd should only grow if it needs to)
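The growth pattern described above can be illustrated with a small per-RTT model. This is a minimal sketch under simplifying assumptions (no loss, the sender always has data to send, windows counted in whole segments); the function name `slow_start_rounds` and its parameters are hypothetical and do not come from any TCP implementation.

```python
def slow_start_rounds(initial_window, rwnd, rounds, delayed_ack=False):
    """Return cwnd (in segments) at the start of each round trip in Slow Start.

    Simplified model: no loss, unlimited data to send. With delayed_ack=True
    the receiver is assumed to send one ACK per two segments (rounded up,
    since the delayed-ACK timer eventually acknowledges an odd segment).
    """
    cwnd = initial_window
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        window = min(cwnd, rwnd)            # effective window: lesser of the two
        acks = (window + 1) // 2 if delayed_ack else window
        cwnd = min(cwnd + acks, rwnd)       # +1 segment per ACK, capped at rwnd
    return history


print(slow_start_rounds(1, 64, 8))                     # doubling until rwnd
print(slow_start_rounds(2, 1000, 6, delayed_ack=True)) # roughly +50% per RTT
```

Without Delayed ACKs the window doubles each round until it reaches the advertised window; with them, each round adds about half the current window.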
When congestion is detected, the TCP flow-control mode is changed from Slow Start to Congestion Avoidance. Note that some TCP implementations maintain cwnd in units of bytes, while others use units of full-sized segments.
Once congestion is detected (through timeout and/or duplicate ACKs), the data rate is reduced in order to let the network recover.
Slow Start uses an exponential increase in window size and thus also in data rate. Congestion Avoidance uses a linear growth function (additive increase). This is achieved by introducing, in addition to cwnd, a slow start threshold (ssthresh). As long as cwnd is less than ssthresh, Slow Start applies. Once ssthresh is reached, cwnd is increased by at most one segment per RTT. cwnd continues to open at this linear rate until a congestion event is detected.
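The switch from exponential to linear growth at ssthresh can be sketched as a per-ACK update. The example below is illustrative only: it counts windows in whole segments and uses a simple ACK counter to add one segment per window's worth of ACKs, which is one possible way to realize "at most one segment per RTT". The class `Sender` and the helper `cwnd_per_rtt` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    cwnd: int        # congestion window, in segments
    ssthresh: int    # slow start threshold, in segments
    acked: int = 0   # ACKs counted toward the next linear increase

    def on_new_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1                 # Slow Start: +1 segment per ACK
        else:
            self.acked += 1                # Congestion Avoidance:
            if self.acked >= self.cwnd:    # +1 segment per full window of ACKs
                self.cwnd += 1
                self.acked = 0


def cwnd_per_rtt(sender, rtts):
    """Record cwnd at the start of each RTT, assuming one ACK per segment sent."""
    history = []
    for _ in range(rtts):
        history.append(sender.cwnd)
        for _ in range(sender.cwnd):       # one ACK for each segment in the window
            sender.on_new_ack()
    return history


print(cwnd_per_rtt(Sender(cwnd=1, ssthresh=8), 7))
```

In this model the window doubles each RTT up to ssthresh and then grows by one segment per RTT.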
When congestion is detected, ssthresh is set to half of cwnd (or, to be strictly accurate, half the "Flight Size"; this distinction is important if the implementation lets cwnd grow beyond rwnd, the receiver's declared window). cwnd is then set either to 1 if congestion was signalled by a timeout, forcing the sender to enter Slow Start, or to ssthresh if congestion was signalled by duplicate ACKs and the Fast Recovery algorithm has terminated. In either case, once the sender enters Congestion Avoidance, its rate has been reduced to half the value at the time of congestion. This multiplicative decrease causes cwnd to close exponentially with each detected loss event.
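The two reactions to congestion can be summarized in a short function. This is a sketch in units of segments, not an implementation: the name `on_congestion` and the string values for `signal` are hypothetical, and the lower bound of two segments on ssthresh is taken from RFC 5681 rather than from the description above.

```python
def on_congestion(flight_size, signal):
    """Return (cwnd, ssthresh) in segments after a congestion event.

    signal is "timeout" or "dupacks" (hypothetical labels for this sketch).
    """
    # Half the Flight Size; RFC 5681 additionally keeps ssthresh >= 2 segments.
    ssthresh = max(flight_size // 2, 2)
    if signal == "timeout":
        return 1, ssthresh            # restart from Slow Start
    if signal == "dupacks":
        return ssthresh, ssthresh     # value once Fast Recovery has terminated
    raise ValueError(f"unknown congestion signal: {signal!r}")


print(on_congestion(20, "timeout"))   # (1, 10)
print(on_congestion(20, "dupacks"))   # (10, 10)
```

Applying the function repeatedly with the previous result as the new flight size halves the window each time, which is the multiplicative decrease.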
In Fast Retransmit, the arrival of three duplicate ACKs is interpreted as packet loss, and retransmission starts before the retransmission timer (RTO) expires.
The missing segment will be retransmitted immediately without going through the normal retransmission queue processing. This improves performance by eliminating delays that would suspend effective data flow on the link.
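The duplicate-ACK trigger can be sketched as a simple counter over the stream of incoming ACK numbers. This is a simplified illustration: it treats any repeated ACK number as a duplicate, whereas RFC 5681 places further conditions on what counts as a duplicate ACK. The function name `fast_retransmit_points` is hypothetical.

```python
DUP_ACK_THRESHOLD = 3  # three duplicate ACKs are taken as a sign of loss


def fast_retransmit_points(ack_numbers):
    """Return the ACK numbers at which a Fast Retransmit would be triggered."""
    last_ack = None
    dup_count = 0
    triggers = []
    for ack in ack_numbers:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                triggers.append(ack)   # retransmit the segment starting at `ack`
        else:
            last_ack = ack             # new data acknowledged, reset the counter
            dup_count = 0
    return triggers


# The segment starting at 3000 is lost; later segments each produce a duplicate ACK.
print(fast_retransmit_points([1000, 2000, 3000, 3000, 3000, 3000, 3000, 8000]))
```

Only the third duplicate triggers a retransmission; further duplicates for the same ACK number do not trigger it again.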
Fast Recovery is used to react quickly to a single packet loss. In Fast Recovery, the receipt of three duplicate ACKs, while taken to mean the loss of a segment, does not result in a full Slow Start. The reason is that later segments evidently got through, so congestion is not stopping everything. In Fast Recovery, ssthresh is set to half of the current send window size, the missing segment is retransmitted (Fast Retransmit), and cwnd is set to ssthresh plus three segments. Each additional duplicate ACK indicates that one more segment has left the network and arrived at the receiver, so cwnd is increased by one segment, which allows another segment to be transmitted if the new cwnd permits. When an ACK is received for new data, cwnd is reset to ssthresh, and TCP enters Congestion Avoidance mode.
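The Fast Recovery steps can be laid out as a small state machine. This is a minimal sketch in units of segments, with hypothetical names (`RenoSender`, `on_duplicate_ack`, `on_new_ack`); it uses half the flight size for ssthresh with the two-segment floor from RFC 5681, and it leaves out normal window growth and timeout handling.

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    cwnd: int                      # congestion window, in segments
    ssthresh: int                  # slow start threshold, in segments
    dup_acks: int = 0
    in_fast_recovery: bool = False

    def on_duplicate_ack(self, flight_size):
        """Handle one duplicate ACK; return "retransmit" on the third one."""
        self.dup_acks += 1
        if self.in_fast_recovery:
            self.cwnd += 1                         # one more segment left the network
        elif self.dup_acks == 3:
            self.ssthresh = max(flight_size // 2, 2)
            self.cwnd = self.ssthresh + 3          # inflate by the three duplicate ACKs
            self.in_fast_recovery = True
            return "retransmit"                    # Fast Retransmit of the missing segment
        return None

    def on_new_ack(self):
        """Handle an ACK for new data (normal window growth omitted here)."""
        self.dup_acks = 0
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh              # deflate the window
            self.in_fast_recovery = False          # continue in Congestion Avoidance


s = RenoSender(cwnd=20, ssthresh=64)
events = [s.on_duplicate_ack(flight_size=20) for _ in range(3)]
print(events, s.cwnd, s.ssthresh)
```

After the third duplicate ACK the sketch retransmits and sets cwnd to ssthresh plus three; each further duplicate adds one segment, and the first ACK for new data brings cwnd back to ssthresh.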
- Congestion Avoidance and Control, V. Jacobson, Computer Communication Review, vol. 18, no. 4, pp. 314-329, August 1988, ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z
- TCP Congestion Control, RFC 5681, M. Allman, V. Paxson, E. Blanton, September 2009
- Congestion Control in the RFC Series, RFC 5783, M. Welzl, W. Eddy, February 2010
- Computing TCP's Retransmission Timer, RFC 6298, V. Paxson, M. Allman, J. Chu, M. Sargent, June 2011
- Congestion Control in Linux TCP, P. Sarolahti, A. Kuznetsov, USENIX Annual Technical Conference 2002, Freenix Track
- The Great Internet TCP Congestion Control Census, A. Mishra, X. Sun, A. Jain, S. Pande, R. Joshi, B. Leong, ACM SIGMETRICS, December 2019 (PDF, presentation video, Gordon code)
– Main.UlrichSchmid - 07 Jun 2005
– Main.SimonLeinen - 27 Jan 2006 - 20 Jun 2020