Transmission Control Protocol

TCP is transmission Control Protocol, the Layer4 protocol for communication over both wireline as well as wireless links. It is one of the most widespread of protocols in usage today. All key applications defining the web today, http, email transfer, file transfer, etc. use TCP as the backbone transmission protocol.

TCP offers a stream oriented reliable delivery service. Unusually, for a transport layer protocol, it has no explicit support for QoS. Perhaps this is one of the reasons for its success - by eschewing features, it has managed to be simultaneously very efficient and very powerful. Also, it is designed to operate on any standard network layer protocol with as few assumptions as possible.
The key feature in TCP is a very sophisticated scheme of congestion management and rate adaptation. There are multiple variants of TCP, each principally differing in the way they execute these features.

For a definitive specification of TCP protocol, please see the standard in rfc0793.txt

In the rest of this article, we shall review the theory of TCP congestion avoidance and the different algorithms available for handling the same.


Congestion collapse

Congestion collapse takes place when multiple sources start transmitting together at a constant rate. If the number of sources is sufficiently high, this will be greater than the clearing rate of the network. This will, then, lead to overflow of buffers and packet dropping. The dropped packets will, in turn, lead to further retransmissions, which will only increase the load in the network. Eventually, we will reach the stage where no data is going through. Congestion collapse was first seen and documented by (addref).

How to prevent congestion collapse

The job of the transmitter is then, to discover a transmission rate so that the combined transmission rate of all transmitters is less than the network clearing rate. The fact that this is even possible comes from a very deep result in distributed autonomous control, see [Kleinrock95].

The rate adaptation page describes the theory of rate adaptation.

TCP basics

TCP is a sliding window, asynchronous ARQ protocol. It is termed asynchronous because it does not use clocks explicitly for most operations and also, does not need clocks of a very fine granularity. Rather, whenever a packet is received, this is treated as an event, which triggers the next activity. The lack of clocks ensures that processing load is relatively low and TCP has run successfully on 5-6 generations of computers, ranging from the 5Mhz VAX machines of the 1970s to modern day embedded systems and teraflop supercomputers.

TCP Reno

TCP Reno is the most widespread variant of TCP currently. It is derived from the original version of TCP (TCP Tahoe) which was circulated as part of the FreeBSD protocol stack. The TCP stack in Solaris, Linux and Windows is a variant of TCP Reno.

TCP Reno uses two key techniques for congestion management. One is slow-start and the other is congestion avoidance.

Congestion detection

Standard TCP was designed for a network where links were relatively reliable, but processing computers were slow and starved for memory. Link speeds were limited (the first Internet was run on 56kb/s dialup lines, which, in the USA would have meant a noise margin of more than 40dB), but the ability of computers to transmit over them was slower still. Consequently, packet dropping was usually due to lack of buffering capability.

Standard TCP thus uses packet dropping as a sign of congestion. Packet dropping can be signalled when the receiver gets a discontinuous sequence, or, in cases of heavy congestion, an entire burst is lost. The difference between the first and the second is the method of feedback. While discontinuous packets can be signaled to the transmitter using duplicate acknowledgements, a whole burst being lost can only be detected autonomously by the transmitter using a retransmission timeout. One of the biggest challenges in TCP protocol design is to correctly compute the RTO wait period, taking into account queueing delay which can vary substantially and will also dominate link delay.

Slow Start

Slow start is the mechanism which is used when a TCP session is started. Instead of starting transmission at a fixed rate, the transmission is started at the lowest rate possible and doubled. The rate of doubling is tied to the rate at which acknowledgements come back; thus, the rate is doubled every round trip time. If congestion is detected, the transmission rate is reduced to half of what it is currently and the congestion avoidance process starts. The rate doubling and reduction is nothing but a binary search for the 'right' transmission bandwidth.

Congestion avoidance

The congestion avoidance algorithm starts where slow start stops. The TCP increases its transmission rate linearly, by adding one additional packet to its window each transmission time. If congestion is detected at any point, the TCP reduces its transmission rate to half the previous value. Thus the TCP seesaws its way to the right transmission rate. Clearly, the scheme is less aggressive than the slow-start phase (note the linear increase against the exponential growth in slow start).

The TCP Vegas and related methods

The initial design of TCP was for a certain network characterization as discussed above. While it was a very successful design from the point of view of a cooperative network being shared amongst a small group, it was soon clear that there were many issues when it came to scaling it to a worldwide heterogenous Internet.

Background

A key transition in analysis came when TCP was analyzed, not from the point of view of a single protocol operating in an uncertain environment, but a group of 'greedy' protocols competing with each other for limited resources. [Shenker95] gives a fascinating review in his seminal paper, showing how the 'bandwidth hunting' nature of TCP actually maps to zero-sum competitive games. Subsequently, using game theoretic techniques, equuilibrium behaviour and incentives have been developed.

Critiques and criticisms of TCP Reno

The analysis above yielded the following criticism of TCP Reno.

  1. By its very nature, a TCP is designed to drive a network into saturation. A TCP will never reduce its transmission rate unless it sees a packet drop. This will only occur when the network is overloaded. Thus a combination of TCPs will together ensure that a network is driven into overload constantly. This makes it difficult for new connections to enter the system, since they are at a natural disadvantage due to slow start.
  2. TCP Reno attempts to equalize the window size for competing connections. However, a measure of the bandwidth achieved is the bandwidth delay product, not the window size only. Thus, TCP Reno is unfair to connections with larger round-trip times.

How TCP Vegas works

TCP Vegas uses an alternate method to detect congestion. Instead of depending on packet drops, the TCP monitors variations in measured round-trip times and achieved throughputs. Depending on these changes, it decides whether to increase the window or decrease the window. To this end, the TCP Vegas compares the expected bandwidth (current window size / estimated link latency) against the actual bandwidth. The actual bandwidth is estimated using a heuristic algorithm, on which there is a lot of confusion. In our algorithm, we use the following formulae

alpha and beta are two variables which indicate how much surplus buffering is to be maintained in the network. Theoretically, the alpha-beta gap allows for some amount of hysteresis, without which there is unacceptably high thrashing.

There are two or three big issues here, which are often brushed under the carpet by Vegas enthusiasts:

  1. There is no good way to estimate the baseRTT, or the link latency. A standard mechanism used is the minimum measured RTT. However, this leads to a very insidious fairness problem. Assume that we have a connection A, operating on a link of latency l and bandwidth b. The connection A measures a link latency l and injects alpha packets into the link. A little later connection b starts up. The minimum link latency that it measures is . Since this is higher than what is measured by connection A, it has an obvious advantage.
  2. A second problem comes in due to re-routing of the connection. The link latency for the new route may well be larger than the old one, but the base RTT cannot be changed. Suggested solutions for this problem includes sending probing packets, etc.
  3. Measuring the transmitted bytes in a round trip time. The estimate for round-trip time is constantly changing as the load on the network goes up or down. This causes very large instabilities in the operation of the Vegas. In my own experience, it is necessary to use a damping parameter to ensure that the number of transmitted bytes roughly maps to the average window size.

Further details of the Vegas congestion algorithm are provided in [Brakmo94, Scowcroft95].

Another important innovation in TCP Vegas is the manner in which packet drops are detected. A study by [Balakrishnan98] showed that upto 50% of packet drops in standard TCP are only detected by timeout. In TCP Vegas, whenever a duplicate ack is received, the TCP immediately checks all pending retransmission timers. If a retransmission timer has only a small amount of time to expire (small with reference to the measured round trip time), it is retransmitted immediately.

Analysis of TCP Vegas

TCP VEGAS created a significant excitement when it was initially described. Analyses such as [Low02] showed that algorithmically, it was superior to the standard TCP Reno. This was confirmed by simulation results in [Mo99].

However, detailed analysis of TCP Vegas belies its claims. While it does indeed perform better than TCP Reno,this is not because of its superior congestion management algorithm, but because of the innovations in timeout management. The problem was pointed out in one of the very first papers, but disregarded by researchers for a long time; TCP Vegas convergence is problematic. Recently [Chloe03] and others have issued enhancements to TCP Vegas (Vegas-A) which claim better convergence properties.

Other variants of TCP

There are numerous other variants of TCP, but nothing with the widespread appeal of Reno or Vegas.

  1. TCP Peach was created specifically for satellite environments. It is based around the idea of using dummy segments to probe the network and determine key network states. This is incorporate into two new algorithms, sudden start and rapid recovery. The interested reader is referred to [Morabito01]
  2. TCP Westwood is designed for wireless links and uses a derivative of the Packet pair algorithm. Specifically, it tries to get information about duplicate acknowledgements by measuring the delivery rate of dup acks. The theory is that when packets are dropped, the link is in bottleneck, so the rate of delivery of dupacks is indicative of the bottleneck throughput of the link. These estimates are used as a less conservative (as compared to Reno) basis for backing off transmission. In our experience, Westwood has even worse stability problems with VEGAS and is very susceptible to assymetric links. The interested reader is referred to [Cassetti01]

References

[Shenker95] S. J. Shenker. Making greed work in networks: A game-theoretic analysis of switch service disciplines. IEEE/ACM Transactions on Networking, 3(6):819-- 831, 1995.


[Kleinrock93] B. Tung, and L. Kleinrock, "Distributed Control Methods," Proceedings of the Second International Conference on High Performance Distributed Computing, 1993.

[Balakrishnan98] Hari Balakrishnan, Katz, "TCP Behaviour of a busy Internet Server: Analysis and Improvements" Proc. of the IEEE Infocom, 1998

[Brakmo95] Brakmo, Peterson, "TCP Vegas: End-to-end congestion avoidance on a global internet" IEEE JSAC 1995

[Mo99] Mo, La, Anantharam, Valrand, "Analysis and comparision of TCP Reno and Vegas", Proceedings of the IEEE Infocom, 1999

[Low02] Low, Peterson, "Understanding TCP VEGAS: A duality model", Journal of the ACM, 2002

[Choe03] Choe, Low "Stabilized Vegas", Proceedings of the IEEE Infocom, 2003

[Morabito01] Akyildiz, Morabito, Palazzo, "TCP Peach: A new congestion control scheme for satellite IP networks " IEEE/ACM Transactions on Networking, August 2001

[Casetti01]Casetti, Gerla, Mascolo, Sanadidi, Wang "TCP Westwood: Bandwidth estimation for enhanced performance over wireless links", Proceedings of the ACM Mobicom 2001





Page Information

  • Changed 2 years ago [show history]
  • View page source
  • You're not logged in
  • Spam-like content was removed from this page.
  • No tags yet learn more

Wiki Information

Recent PBwiki Blog Posts