
A common misconception is that TCP guarantees delivery of data. What TCP actually guarantees is that it will either deliver the data to the receiving host’s TCP stack or report an error to the sending application. Unfortunately, the error report does not indicate how much data was actually delivered. There is also a significant difference between delivery to the receiving TCP stack and delivery to the receiving application: data that the stack has acknowledged has not necessarily been read by the application.

There are two basic failure scenarios. In the first, the sending TCP stack receives no TCP acknowledgments for the data that it is transmitting. In this scenario, the sending application can continue to call send, putting more data into the TCP stack’s send buffer. Once the TCP stack times out the transmission, the next send (or receive) call made by the sending application fails with the ETIMEDOUT error. The application now knows there was a problem but has no idea how much data was successfully transmitted. Here, success means that the sending TCP stack received an acknowledgment for the data from the receiving TCP stack.

In the second scenario, the sending TCP stack transmits data and receives a reset instead of an acknowledgment. The reset may indicate that the receiving TCP stack has closed the socket, or that some network device, perhaps a firewall, has timed out the connection. The next time the sending application calls send (or receive), the call fails with the ECONNRESET error. You might think that in this scenario the sending application could infer that only the data from the last send call failed, but you would be wrong. The sending TCP stack may have combined the data from multiple send calls into one TCP segment, and that segment triggered the reset. Alternatively, a series of TCP segments may have been sent before the reset, triggered by the first segment in the series, arrived back at the sender. All the sending application can infer is that not all the data was successfully transmitted.
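Both scenarios look the same from the application's side: a send call fails and the error carries no delivery count. The following is a minimal Python sketch of that situation, not a definitive implementation. The names send_with_report and FlakySocket are hypothetical; FlakySocket is a stand-in for a real socket so that the failure can be simulated without a network.

```python
import errno


def send_with_report(sock, payload):
    """Hand payload to the local TCP stack.

    Returns (bytes_queued, error_name). bytes_queued counts the bytes that the
    local send buffer accepted, NOT the bytes acknowledged by the peer.
    """
    queued = 0
    try:
        while queued < len(payload):
            queued += sock.send(payload[queued:])
    except OSError as exc:
        # ETIMEDOUT: no acknowledgments arrived. ECONNRESET: a reset came back.
        if exc.errno in (errno.ETIMEDOUT, errno.ECONNRESET):
            return queued, errno.errorcode[exc.errno]
        raise
    return queued, None


class FlakySocket:
    """Hypothetical stand-in for a socket: accepts one chunk, then fails."""

    def __init__(self, error_number):
        self.error_number = error_number
        self.calls = 0

    def send(self, data):
        self.calls += 1
        if self.calls == 1:
            return min(4, len(data))  # the first call queues up to 4 bytes
        raise OSError(self.error_number, "simulated failure")
```

Calling send_with_report(FlakySocket(errno.ECONNRESET), b"0123456789") reports 4 bytes queued and the error name, but nothing in that result says whether any of those 4 bytes reached the peer. That is the gap the scenarios above describe.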

There is a third failure scenario that has nothing to do with either TCP stack or the network. Assume that the receiving application has a bug that prevents it from reading the data. The receiving TCP stack will continue to receive data and send TCP acknowledgments until the TCP receive buffer fills. That, however, can be up to 64 KB of data (or more if TCP window scaling is supported). If the receiving application needs to be restarted, all data in the TCP receive buffer will be lost. The receiving TCP stack should (but may not) send a reset to the sending TCP stack when the receiving application is terminated. In any case, it will send a reset the next time it receives a segment for the now-closed connection. From the point of view of the sending application, this is similar to the second scenario. However, it shows that even when the receiving TCP stack has acknowledged the data, in the event of an error it is not safe for the sender to assume that the receiving application has read the data.
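The third scenario can be reproduced on a single machine. The sketch below assumes a working loopback interface and a payload small enough to fit in the default socket buffers; the function name send_to_silent_receiver is hypothetical. The receiving side accepts the connection but never reads, yet the sender's call completes without error.

```python
import socket


def send_to_silent_receiver(payload):
    """Send payload to a loopback peer that accepts but never calls recv()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        # Returns normally once the local stack has accepted every byte,
        # even though the receiving application has read nothing.
        sender.sendall(payload)
        return len(payload)
    finally:
        # Closing the receiver discards whatever it never read.
        receiver.close()
        sender.close()
        listener.close()
```

A successful return here says only that the data was handed to TCP. The unread bytes are discarded when the receiver closes, and the sender is never told which of them, if any, the application consumed.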

The idea to take away from these failure scenarios is that, without an application-layer acknowledgment, a transmission timeout or reset error indicates that some, and possibly all, of the transmitted data may not have been read by the receiving application. For this reason, I recommend that all applications include application-layer acknowledgments, be prepared to reestablish a connection and retransmit unacknowledged data, and be able to deal with duplicate data, since what was lost may have been the acknowledgment rather than the data.
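One way to sketch this recommendation is with sequence numbers: the sender keeps each message until the receiving application acknowledges it, and the receiver ignores any sequence number it has already processed. The classes AckingSender and DedupReceiver below are hypothetical and work in memory only. Framing, socket I/O, and reconnection are left out, and cumulative acknowledgments with in-order delivery are assumptions of this sketch, not requirements stated above.

```python
class AckingSender:
    """Keeps every message until the peer application acknowledges it."""

    def __init__(self):
        self.next_seq = 0
        self.unacked = {}  # seq -> payload

    def prepare(self, payload):
        """Assign a sequence number; the caller writes the frame to the socket."""
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return seq, payload

    def on_ack(self, seq):
        """Cumulative ack: the peer processed everything up to and including seq."""
        for done in [s for s in self.unacked if s <= seq]:
            del self.unacked[done]

    def to_retransmit(self):
        """Messages to resend, in order, after reestablishing the connection."""
        return sorted(self.unacked.items())


class DedupReceiver:
    """Processes each sequence number once and acknowledges duplicates again."""

    def __init__(self):
        self.last_seq = -1
        self.delivered = []

    def on_message(self, seq, payload):
        if seq == self.last_seq + 1:  # the next expected message: process it
            self.delivered.append(payload)
            self.last_seq = seq
        # Duplicates and out-of-order frames are not processed again.
        return self.last_seq  # ack value to send back to the sender
```

If the connection fails after the receiver has processed three messages but only the first acknowledgment got through, to_retransmit() returns the last two. The receiver recognizes them as duplicates, acknowledges them again, and processes nothing twice.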

© 2024 Stratus Technologies.