diff mbox series

[net] UPSTREAM: tcp: fix DSACK undo in fast recovery to call tcp_try_to_open()

Message ID 20240627024227.3040278-1-ncardwell.sw@gmail.com (mailing list archive)
State Accepted
Commit a6458ab7fd4f427d4f6f54380453ad255b7fde83
Delegated to: Netdev Maintainers
Headers show
Series [net] UPSTREAM: tcp: fix DSACK undo in fast recovery to call tcp_try_to_open() | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 859 this patch: 859
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 2 maintainers not CCed: pabeni@redhat.com dsahern@kernel.org
netdev/build_clang success Errors and warnings before: 863 this patch: 863
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 864 this patch: 864
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 1 this patch: 1
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-06-28--06-00 (tests: 666)

Commit Message

Neal Cardwell June 27, 2024, 2:42 a.m. UTC
From: Neal Cardwell <ncardwell@google.com>

In some production workloads we noticed that connections could
sometimes close extremely prematurely with ETIMEDOUT after
transmitting only 1 TLP and RTO retransmission (when we would normally
expect roughly tcp_retries2 = TCP_RETR2 = 15 RTOs before a connection
closes with ETIMEDOUT).

From tracing we determined that these workloads can suffer from a
scenario where in fast recovery, after some retransmits, a DSACK undo
can happen at a point where the scoreboard is totally clear (we have
retrans_out == sacked_out == lost_out == 0). In such cases, calling
tcp_try_keep_open() means that we do not execute any code path that
clears tp->retrans_stamp to 0. That means that tp->retrans_stamp can
remain erroneously set to the start time of the undone fast recovery,
even after the fast recovery is undone. If minutes or hours elapse,
and then a TLP/RTO/RTO sequence occurs, then the start_ts value in
retransmits_timed_out() (which is from tp->retrans_stamp) will be
erroneously ancient (left over from the fast recovery undone via
DSACKs). Thus this ancient tp->retrans_stamp value can cause the
connection to die very prematurely with ETIMEDOUT via
tcp_write_err().

The fix: we change DSACK undo in fast recovery (TCP_CA_Recovery) to
call tcp_try_to_open() instead of tcp_try_keep_open(). This ensures
that if no retransmits are in flight at the time of DSACK undo in fast
recovery then we properly zero retrans_stamp. Note that calling
tcp_try_to_open() is more consistent with other loss recovery
behavior, since normal fast recovery (CA_Recovery) and RTO recovery
(CA_Loss) both normally end when tp->snd_una meets or exceeds
tp->high_seq and then in tcp_fastretrans_alert() the "default" switch
case executes tcp_try_to_open(). Also note that by inspection this
change to call tcp_try_to_open() implies at least one other nice bug
fix, where now an ECE-marked DSACK that causes an undo will properly
invoke tcp_enter_cwr() rather than ignoring the ECE mark.

Fixes: c7d9d6a185a7 ("tcp: undo on DSACK during recovery")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

patchwork-bot+netdevbpf@kernel.org June 28, 2024, 9:30 a.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Wed, 26 Jun 2024 22:42:27 -0400 you wrote:
> From: Neal Cardwell <ncardwell@google.com>
> 
> In some production workloads we noticed that connections could
> sometimes close extremely prematurely with ETIMEDOUT after
> transmitting only 1 TLP and RTO retransmission (when we would normally
> expect roughly tcp_retries2 = TCP_RETR2 = 15 RTOs before a connection
> closes with ETIMEDOUT).
> 
> [...]

Here is the summary with links:
  - [net] UPSTREAM: tcp: fix DSACK undo in fast recovery to call tcp_try_to_open()
    https://git.kernel.org/netdev/net/c/a6458ab7fd4f

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2e39cb881e209..e67cbeeeb95b4 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3077,7 +3077,7 @@  static void tcp_fastretrans_alert(struct sock *sk, const u32 prior_snd_una,
 			return;
 
 		if (tcp_try_undo_dsack(sk))
-			tcp_try_keep_open(sk);
+			tcp_try_to_open(sk, flag);
 
 		tcp_identify_packet_loss(sk, ack_flag);
 		if (icsk->icsk_ca_state != TCP_CA_Recovery) {