[net-next,2/2] tcp: more accurately check DSACKs to grow RACK reordering window

Message ID 20210727144258.946533-3-ncardwell@google.com (mailing list archive)
State Accepted
Commit a657db0350bb8f568897835b6189c84a89f13292
Delegated to: Netdev Maintainers
Series more accurate DSACK processing for RACK-TLP

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net-next
netdev/subject_prefix           success  Link
netdev/cc_maintainers           warning  3 maintainers not CCed: yoshfuji@linux-ipv6.org dsahern@kernel.org kuba@kernel.org
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 2 this patch: 2
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 2 this patch: 2
netdev/header_inline            success  Link

Commit Message

Neal Cardwell July 27, 2021, 2:42 p.m. UTC
Previously, a DSACK could expand the RACK reordering window when no
reordering had been seen, and/or when the DSACK was due to an
unnecessary TLP retransmit (rather than a spurious fast recovery due
to reordering). This could unnecessarily grow the RACK reordering
window and thus unnecessarily delay RACK-based fast recovery
episodes.

To avoid these issues, this commit tightens the conditions under which
a DSACK triggers the RACK reordering window to grow, so that a
connection only expands its RACK reordering window if:

(a) reordering has been seen in the connection, and
(b) a DSACKed range does not match the most recent TLP retransmit.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Priyaranjan Jha <priyarjha@google.com>
---
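Note: the two conditions above can be read as a single predicate. The
following is only a minimal standalone sketch of that predicate, not the
kernel code itself. The function names and SKETCH_FLAG_DSACK_TLP (with
its value) are hypothetical stand-ins for tp->reord_seen and
FLAG_DSACK_TLP as used in the hunk below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit for this sketch only; the real value of
 * FLAG_DSACK_TLP is not shown in this patch.
 */
#define SKETCH_FLAG_DSACK_TLP 0x1u

/* Before the patch: every DSACK counted toward growing the RACK
 * reordering window, regardless of connection state.
 */
bool sketch_dsack_counts_old(bool reord_seen, uint32_t flag)
{
	(void)reord_seen;
	(void)flag;
	return true;
}

/* After the patch: a DSACK counts only if (a) reordering has been
 * seen on the connection and (b) the DSACK did not match the most
 * recent TLP retransmit.
 */
bool sketch_dsack_counts_new(bool reord_seen, uint32_t flag)
{
	return reord_seen && !(flag & SKETCH_FLAG_DSACK_TLP);
}
```

Of the four input combinations, only reord_seen == true with the TLP
flag clear makes the new predicate return true; the old one returns
true in all four cases.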
 net/ipv4/tcp_input.c    | 9 ++++++++-
 net/ipv4/tcp_recovery.c | 3 ++-
 2 files changed, 10 insertions(+), 2 deletions(-)

Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 98408d520c32..3f7bd7ae7d7a 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1001,7 +1001,14 @@  static u32 tcp_dsack_seen(struct tcp_sock *tp, u32 start_seq,
 		return 0;
 
 	tp->rx_opt.sack_ok |= TCP_DSACK_SEEN;
-	tp->rack.dsack_seen = 1;
+	/* We increase the RACK ordering window in rounds where we receive
+	 * DSACKs that may have been due to reordering causing RACK to trigger
+	 * a spurious fast recovery. Thus RACK ignores DSACKs that happen
+	 * without having seen reordering, or that match TLP probes (TLP
+	 * is timer-driven, not triggered by RACK).
+	 */
+	if (tp->reord_seen && !(state->flag & FLAG_DSACK_TLP))
+		tp->rack.dsack_seen = 1;
 
 	state->flag |= FLAG_DSACKING_ACK;
 	/* A spurious retransmission is delivered */
diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
index 6f1b4ac7fe99..fd113f6226ef 100644
--- a/net/ipv4/tcp_recovery.c
+++ b/net/ipv4/tcp_recovery.c
@@ -172,7 +172,8 @@  void tcp_rack_reo_timeout(struct sock *sk)
 
 /* Updates the RACK's reo_wnd based on DSACK and no. of recoveries.
  *
- * If DSACK is received, increment reo_wnd by min_rtt/4 (upper bounded
+ * If a DSACK is received that seems like it may have been due to reordering
+ * triggering fast recovery, increment reo_wnd by min_rtt/4 (upper bounded
  * by srtt), since there is possibility that spurious retransmission was
  * due to reordering delay longer than reo_wnd.
  *