Spurious RTO problem
TCP retransmits data that has not been acknowledged. The retransmission timeout (RTO) is derived from the round trip time (RTT) experienced by the TCP connection.
In most situations an RTO indicates a packet loss in the network. It triggers the loss recovery procedure in TCP, whereby slow start is initiated and the congestion window is dropped to one. However, a sudden spike in round trip time (e.g. due to congestion or handoff) causes the RTO to expire for packets that are not actually lost in the network. Such a timeout is called a spurious RTO. Forward RTO-Recovery is a mechanism for telling spurious RTOs apart from genuine ones.
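To make the link between RTT and RTO concrete, here is a minimal sketch of the standard estimator from RFC 6298 (smoothed RTT plus four times the RTT variation). It is an illustration only: the names `rto_estimator`, `rto_update` and `rtt_spike_causes_timeout` are hypothetical, the minimum/maximum RTO clamps and clock granularity are left out, and the Linux kernel uses its own fixed-point variant of this calculation.

```c
#include <stdbool.h>

/* Hypothetical, simplified RTO estimator after RFC 6298 (times in ms).
 * Clamping and clock granularity are deliberately omitted. */
struct rto_estimator {
    bool   has_sample; /* false until the first RTT measurement */
    double srtt;       /* smoothed round trip time */
    double rttvar;     /* round trip time variation */
    double rto;        /* current retransmission timeout */
};

/* Feed one RTT measurement into the estimator and recompute the RTO. */
static void rto_update(struct rto_estimator *e, double rtt_ms)
{
    if (!e->has_sample) {
        /* First measurement: SRTT = R, RTTVAR = R / 2 */
        e->srtt = rtt_ms;
        e->rttvar = rtt_ms / 2.0;
        e->has_sample = true;
    } else {
        double err = e->srtt - rtt_ms;
        if (err < 0.0)
            err = -err;
        /* RTTVAR uses the old SRTT, so it is updated first. */
        e->rttvar = 0.75 * e->rttvar + 0.25 * err;
        e->srtt = 0.875 * e->srtt + 0.125 * rtt_ms;
    }
    e->rto = e->srtt + 4.0 * e->rttvar;
}

/* True if an ack delayed to rtt_ms would arrive only after the timer
 * has fired, i.e. the RTO would be spurious even though nothing was lost. */
static bool rtt_spike_causes_timeout(const struct rto_estimator *e, double rtt_ms)
{
    return rtt_ms > e->rto;
}
```

With two samples of 100 ms the sketch yields an RTO of 250 ms, so a handoff that delays one ack to 400 ms makes the timer fire although the packet was delivered.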
Forward RTO Recovery (F-RTO)
The basic principle of F-RTO is to observe two acks after the RTO before declaring the RTO genuine and triggering loss recovery. If these acks are not duplicates and acknowledge data that was not retransmitted, the RTO is declared spurious.
F-RTO can be implemented on the sender side alone and does not require negotiation with the peer TCP stack.
Detailed functionality
Checking whether F-RTO can be used (tcp_use_frto)
When the retransmission timer expires, the possibility of using F-RTO is checked:
- F-RTO is activated using a sysctl variable, sysctl_tcp_frto
- F-RTO works on the principle of sending new data after retransmitting the old segment. Thus new data must be available and allowed for transmission, i.e. it must lie within the transmission window, snd_wnd.
- If the above two conditions are met, F-RTO is initialized.
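The two conditions above can be sketched as a small predicate. This is a simplified model, not the kernel's tcp_use_frto: the struct `frto_sender`, its fields `unsent_bytes` and `mss`, and the function `can_use_frto` are hypothetical names introduced for illustration, and the window test assumes that the next new segment must fit entirely between snd_una and snd_una + snd_wnd.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; not the kernel's struct tcp_sock. */
struct frto_sender {
    uint32_t snd_una;      /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;      /* next sequence number to be sent */
    uint32_t snd_wnd;      /* transmission window, in bytes */
    uint32_t unsent_bytes; /* queued application data not yet sent (assumed field) */
    uint32_t mss;          /* maximum segment size (assumed field) */
};

/* Returns true if F-RTO may be attempted when the retransmission timer fires. */
static bool can_use_frto(const struct frto_sender *s, bool frto_enabled)
{
    /* Condition 1: the feature is switched on (sysctl_tcp_frto in the text). */
    if (!frto_enabled)
        return false;

    /* Condition 2a: there is new data to send. */
    if (s->unsent_bytes == 0)
        return false;

    /* Condition 2b: the next new segment fits in the transmission window.
     * Unsigned subtraction keeps the in-flight count correct across
     * sequence number wraparound. */
    uint32_t in_flight = s->snd_nxt - s->snd_una;
    uint32_t next_len = s->unsent_bytes < s->mss ? s->unsent_bytes : s->mss;
    return in_flight + next_len <= s->snd_wnd;
}
```

If the predicate is false, the sender has nothing new to probe the path with and conventional RTO recovery is used instead.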
Initializing F-RTO execution (tcp_enter_frto)
- If the connection is already recovering from a loss, snd_ssthresh is halved, but never reduced below 2.
- The variables for tracking F-RTO progress are initialized:
- retrans_out, the number of packets retransmitted, is reset.
- undo_marker is set to snd_una, the last acknowledged point in the transmission window at the time F-RTO is entered.
- undo_retrans, the count of retransmissions done after the undo_marker point, is reset.
- All packets in the queue waiting to be acknowledged are marked as not retransmitted.
- The congestion avoidance state, ca_state, is set to TCP_CA_Open.
- frto_highmark is set to snd_nxt, the point in the transmission window from which new segments are transmitted after entering F-RTO.
- frto_counter, which keeps the number of new acks processed after the RTO, is set to 1.
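The initialization steps listed above can be sketched as follows. This is a simplified model that mirrors the list, not the kernel's tcp_enter_frto: the types `frto_state` and `frto_seg`, the enum values, and the `already_recovering` parameter are hypothetical, and the retransmission queue is reduced to a plain array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's congestion states. */
enum frto_ca_state { FRTO_CA_OPEN, FRTO_CA_LOSS };

/* One queued, not yet acknowledged segment (simplified). */
struct frto_seg {
    uint32_t seq;
    bool     retransmitted;
};

/* Hypothetical, simplified subset of the per-connection TCP state. */
struct frto_state {
    uint32_t snd_una, snd_nxt;
    uint32_t snd_ssthresh;
    uint32_t retrans_out, undo_marker, undo_retrans;
    uint32_t frto_highmark;
    int      frto_counter;
    enum frto_ca_state ca_state;
};

static void enter_frto(struct frto_state *tp, struct frto_seg *queue,
                       size_t nsegs, bool already_recovering)
{
    /* Halve ssthresh, but never below 2 (condition as stated in the text). */
    if (already_recovering) {
        uint32_t half = tp->snd_ssthresh / 2;
        tp->snd_ssthresh = half < 2 ? 2 : half;
    }

    /* Reset retransmission bookkeeping and remember where F-RTO started. */
    tp->retrans_out = 0;
    tp->undo_marker = tp->snd_una;
    tp->undo_retrans = 0;

    /* Nothing in the queue counts as retransmitted any more. */
    for (size_t i = 0; i < nsegs; i++)
        queue[i].retransmitted = false;

    tp->ca_state = FRTO_CA_OPEN;
    tp->frto_highmark = tp->snd_nxt; /* new data starts from here */
    tp->frto_counter = 1;            /* waiting for the first ack */
}
```

After this call the connection looks as if no timeout had happened, except that frto_counter and frto_highmark record that the next two acks need special treatment.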
Retransmitting the packet for which the RTO expired (tcp_retransmit_skb)
Processing of ack after entering F-RTO (tcp_process_frto)
Acks received after entering F-RTO are analyzed to judge whether the RTO was genuine:
- The first ack received after entering F-RTO and retransmitting the timed-out packet shows the RTO to be genuine in either of the following cases:
  - Duplicate ack: receiving a dup ack indicates that the packet loss was genuine.
  - The ack is not a duplicate but acknowledges data up to (or beyond) frto_highmark: this indicates that the retransmission has most likely filled a hole and the subsequent packets had already been received. This would mean that the retransmitted packet was genuinely lost.
In case the RTO was considered genuine, TCP enters the loss state (tcp_enter_frto_loss).
If the first ack received does not show the RTO to be genuine, two more packets are let out. It is worth noting that conventional recovery also brings the congestion window to two on arrival of the first ack, so TCP can still fall back to conventional recovery if the analysis of the second ack calls for it.
frto_counter is incremented at this stage.
The second ack received is analyzed in the same way to check whether the RTO was genuine. If the RTO is found to be spurious, it is still taken as an indication of congestion and TCP enters congestion avoidance.
frto_counter is reset after processing the second ack, so that subsequent acks do not undergo F-RTO processing.
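The decision logic described in this section can be sketched as a small state machine. This is a simplified model rather than the kernel's tcp_process_frto: the names `frto_ack_state`, `frto_verdict` and `process_frto_ack` are hypothetical, any ack that does not advance snd_una is treated as a duplicate, and the two congestion window updates (packets in flight plus two after the first ack, at most snd_ssthresh after a spurious RTO) are assumptions about how "two more packets are let out" and "enters congestion avoidance" translate into numbers.

```c
#include <stdbool.h>
#include <stdint.h>

enum frto_verdict { FRTO_UNDECIDED, FRTO_GENUINE, FRTO_SPURIOUS };

/* Hypothetical, simplified subset of the per-connection TCP state. */
struct frto_ack_state {
    uint32_t snd_una;
    uint32_t frto_highmark;
    uint32_t snd_cwnd;
    uint32_t snd_ssthresh;
    uint32_t packets_in_flight;
    int      frto_counter; /* 1: waiting for first ack, 2: waiting for second */
};

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

static enum frto_verdict process_frto_ack(struct frto_ack_state *tp,
                                          uint32_t ack_seq)
{
    bool advanced = seq_before(tp->snd_una, ack_seq);
    if (advanced)
        tp->snd_una = ack_seq;

    /* Duplicate ack, or everything up to frto_highmark acknowledged:
     * the retransmission was needed, so the RTO was genuine. */
    if (!advanced || !seq_before(tp->snd_una, tp->frto_highmark)) {
        tp->frto_counter = 0;
        return FRTO_GENUINE;
    }

    if (tp->frto_counter == 1) {
        /* First ack advanced the window: let two new segments out. */
        tp->snd_cwnd = tp->packets_in_flight + 2;
        tp->frto_counter = 2;
        return FRTO_UNDECIDED;
    }

    /* Second ack also advanced the window: the RTO was spurious.
     * Still reduce the window and continue in congestion avoidance. */
    if (tp->snd_cwnd > tp->snd_ssthresh)
        tp->snd_cwnd = tp->snd_ssthresh;
    tp->frto_counter = 0;
    return FRTO_SPURIOUS;
}
```

A caller would enter the loss state on FRTO_GENUINE, keep sending on FRTO_UNDECIDED, and resume normal ack processing after FRTO_SPURIOUS.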
Entering loss state after entering F-RTO (tcp_enter_frto_loss)
In case the RTO was genuine:
- The packets prior to frto_highmark are marked as lost, unless they have been SACKed.
- The congestion window is set to two or three, depending on whether one or two acks have been received since the RTO. This matches the conventional loss recovery mechanism.
- ca_state is set to TCP_CA_Loss.
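The three steps above can be sketched as follows. This is a simplified model, not the kernel's tcp_enter_frto_loss: the types `frto_loss_seg` and `frto_loss_state` are hypothetical, the queue is a plain array, and the window is computed as frto_counter + 1, which assumes frto_counter is 1 while the first ack is being processed and 2 for the second, as described earlier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's congestion states. */
enum frto_loss_ca_state { FRTO_LOSS_CA_OPEN, FRTO_LOSS_CA_LOSS };

/* One queued, not yet cumulatively acknowledged segment (simplified). */
struct frto_loss_seg {
    uint32_t seq;    /* first sequence number of the segment */
    bool     sacked; /* selectively acknowledged by the receiver */
    bool     lost;   /* marked for retransmission */
};

/* Hypothetical, simplified subset of the per-connection TCP state. */
struct frto_loss_state {
    uint32_t frto_highmark;
    uint32_t snd_cwnd;
    int      frto_counter;
    enum frto_loss_ca_state ca_state;
};

/* Returns the number of segments marked as lost. */
static size_t enter_frto_loss(struct frto_loss_state *tp,
                              struct frto_loss_seg *queue, size_t nsegs)
{
    size_t marked = 0;

    for (size_t i = 0; i < nsegs; i++) {
        /* Segments sent after the RTO (at or beyond frto_highmark) and
         * segments already SACKed are not marked as lost. */
        bool before_highmark =
            (int32_t)(queue[i].seq - tp->frto_highmark) < 0;
        queue[i].lost = before_highmark && !queue[i].sacked;
        if (queue[i].lost)
            marked++;
    }

    /* One ack seen so far -> cwnd 2, two acks -> cwnd 3. */
    tp->snd_cwnd = (uint32_t)tp->frto_counter + 1;
    tp->ca_state = FRTO_LOSS_CA_LOSS;
    tp->frto_counter = 0; /* F-RTO processing is over */
    return marked;
}
```

From this point on the connection behaves as in conventional RTO recovery, retransmitting the segments marked as lost in slow start.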
Maintainer: seema.garg@hsc.com
Categories: Inside_Linux_TCP Software