From patchwork Fri Aug 11 02:55:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13350031
X-Patchwork-Delegate: kuba@kernel.org
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 162E6EC0
 for ; Fri, 11 Aug 2023 03:01:45 +0000 (UTC)
Received: from mail-pf1-x444.google.com (mail-pf1-x444.google.com [IPv6:2607:f8b0:4864:20::444])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10CF62D60;
 Thu, 10 Aug 2023 20:01:44 -0700 (PDT)
Received: by mail-pf1-x444.google.com with SMTP id d2e1a72fcca58-68783004143so1232063b3a.2;
 Thu, 10 Aug 2023 20:01:44 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20221208; t=1691722903; x=1692327703;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:cc:to:from:from:to:cc:subject:date
 :message-id:reply-to;
 bh=c3VeUl42GIsBJoy2db7FH5pZ3W66hnMuj1immbzSRdg=;
 b=BXWcJn5q0lkz/7skqUZ3b6l6Q2Lv798L+/u0WP0i0ytMYpUvMGFpZsviVVOBXXgRD4
 3gmH23BPSCjzLDlkcBOSWkBmUIY+g3pLsl7X8KQUAvH1TLMCR4/4CxSR6dzvdPRFNaRg
 qqfK1eKicq8nJLLButyWq3Y9fyHWPj5ldXg8syCRJT4WrhGyic+x8kjtaNdICsrEFY4d
 zEgNLm7ERUGu6TrNMH9k4PwG3TGh1J68rVQ66E8yyVGa4YugRpEdGNAE4Qc9zgYwnJ7K
 njygRTNczL55Wp1IBSTKL7HnMnacCH0XlkPB7CWKKQML/LoSR0+EIc2dmHxm1VH7YAJF
 4PWQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20221208; t=1691722903; x=1692327703;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
 :subject:date:message-id:reply-to;
 bh=c3VeUl42GIsBJoy2db7FH5pZ3W66hnMuj1immbzSRdg=;
 b=VsNmrMT4b64nIvqpu4x3+tYNipWWcUuIU06pQyeKBWECR2Ggo6RF1+AtN4RMdQXu++
 ee4BC4jCV8zT78LAF2Q0IhhsUO/pxjiumAHrA6b2Jn2YhGtcnlvrafUkSvN93xCojQiH
 o9qPgsdqADKGoKCKAXjOKxixxS7ZM9pFKQVivRNghFkq1sKU29AE4WUFN8g7DvrMH/my
 UEjCDQqrbkqEoxQkrL3GszJKxvF5KSbzmPiuILPEUDejaKjdmrri59NRbgOK0yVevnI5
 1I963OGwhGjUhnM1vL83ddMXnpjWkSII/v9U3O2EP25IEuwkQfi65r2qozigKE9olObK
 nN4g==
X-Gm-Message-State: AOJu0Yy+7yc/CmXil7J62CJJwxlFiwj2ovr+pLxm5zfAhbtwORKD+egr
 jjwdFORWyPSRvx+iH38iaGo=
X-Google-Smtp-Source: AGHT+IHLpMnBAPjatoHM3usBXK2gLWCHJk9kgiqZQIH1oLxXGCgKsaIkRhQja3bfGRHLWjVidYblPw==
X-Received: by 2002:a05:6a20:4327:b0:12f:c0c1:d70 with SMTP id
 h39-20020a056a20432700b0012fc0c10d70mr919689pzk.40.1691722903447;
 Thu, 10 Aug 2023 20:01:43 -0700 (PDT)
Received: from localhost.localdomain ([203.205.141.10])
 by smtp.gmail.com with ESMTPSA id
 l5-20020a639845000000b005646868da17sm2281197pgo.72.2023.08.10.20.01.40
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Thu, 10 Aug 2023 20:01:42 -0700 (PDT)
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: edumazet@google.com, ncardwell@google.com
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 flyingpeng@tencent.com, Menglong Dong
Subject: [PATCH net-next v4 1/4] net: tcp: send zero-window ACK when no memory
Date: Fri, 11 Aug 2023 10:55:27 +0800
Message-Id: <20230811025530.3510703-2-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230811025530.3510703-1-imagedong@tencent.com>
References: <20230811025530.3510703-1-imagedong@tencent.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,
 RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham
 autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 lindbergh.monkeyblade.net

From: Menglong Dong

For now, an skb is dropped when there is no memory to queue it, which
makes the client keep retransmitting until it times out. This is not
friendly to the users.

In this patch, we reply with a zero-window ACK in this case to update
the snd_wnd of the sender to 0. Therefore, the sender won't time out the
connection and will probe the zero window with its retransmits.

Signed-off-by: Menglong Dong
Reviewed-by: Eric Dumazet
---
v3:
- refactor the code to avoid code duplication
v2:
- send 0 rwin ACK for the receive queue empty case when necessary
- send the ACK immediately by using the ICSK_ACK_NOW flag
---
 include/net/inet_connection_sock.h |  3 ++-
 net/ipv4/tcp_input.c               | 18 ++++++++++++------
 net/ipv4/tcp_output.c              | 14 +++++++++++---
 3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index c2b15f7e5516..be3c858a2ebb 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -164,7 +164,8 @@ enum inet_csk_ack_state_t {
 	ICSK_ACK_TIMER = 2,
 	ICSK_ACK_PUSHED = 4,
 	ICSK_ACK_PUSHED2 = 8,
-	ICSK_ACK_NOW = 16 /* Send the next ACK immediately (once) */
+	ICSK_ACK_NOW = 16, /* Send the next ACK immediately (once) */
+	ICSK_ACK_NOMEM = 32,
 };
 
 void inet_csk_init_xmit_timers(struct sock *sk,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 8e96ebe373d7..2ac059483410 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5059,13 +5059,19 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 
 		/* Ok. In sequence. In window. */
 queue_and_out:
-		if (skb_queue_len(&sk->sk_receive_queue) == 0)
-			sk_forced_mem_schedule(sk, skb->truesize);
-		else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
-			reason = SKB_DROP_REASON_PROTO_MEM;
-			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
+		if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
+			/* TODO: maybe ratelimit these WIN 0 ACK ? */
+			inet_csk(sk)->icsk_ack.pending |=
+					(ICSK_ACK_NOMEM | ICSK_ACK_NOW);
+			inet_csk_schedule_ack(sk);
 			sk->sk_data_ready(sk);
-			goto drop;
+
+			if (skb_queue_len(&sk->sk_receive_queue)) {
+				reason = SKB_DROP_REASON_PROTO_MEM;
+				NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
+				goto drop;
+			}
+			sk_forced_mem_schedule(sk, skb->truesize);
 		}
 
 		eaten = tcp_queue_rcv(sk, skb, &fragstolen);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index c5412ee77fc8..769a558159ee 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -257,11 +257,19 @@ EXPORT_SYMBOL(tcp_select_initial_window);
 static u16 tcp_select_window(struct sock *sk)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
-	u32 old_win = tp->rcv_wnd;
-	u32 cur_win = tcp_receive_window(tp);
-	u32 new_win = __tcp_select_window(sk);
 	struct net *net = sock_net(sk);
+	u32 old_win = tp->rcv_wnd;
+	u32 cur_win, new_win;
+
+	/* Make the window 0 if we failed to queue the data because we
+	 * are out of memory. The window is temporary, so we don't store
+	 * it on the socket.
+	 */
+	if (unlikely(inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOMEM))
+		return 0;
 
+	cur_win = tcp_receive_window(tp);
+	new_win = __tcp_select_window(sk);
 	if (new_win < cur_win) {
 		/* Danger Will Robinson!
 		 * Don't update rcv_wup/rcv_wnd here or else

From patchwork Fri Aug 11 02:55:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13350032
X-Patchwork-Delegate: kuba@kernel.org
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: edumazet@google.com, ncardwell@google.com
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 flyingpeng@tencent.com, Menglong Dong
Subject: [PATCH net-next v4 2/4] net: tcp: allow zero-window ACK update the window
Date: Fri, 11 Aug 2023 10:55:28 +0800
Message-Id: <20230811025530.3510703-3-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230811025530.3510703-1-imagedong@tencent.com>
References: <20230811025530.3510703-1-imagedong@tencent.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Menglong Dong

For now, an ACK can update the window in the following cases, according
to tcp_may_update_window():

1. the ACK acknowledges new data
2. the ACK has new data
3. the ACK expands the window and its seq is valid

Now, we also allow the ACK to update the window if the window is 0 and
its seq/ack is valid. This is for the case where the receiver replies
with a zero-window ACK when it is under memory stress and can't queue
the new data.

Signed-off-by: Menglong Dong
Reviewed-by: Eric Dumazet
---
 net/ipv4/tcp_input.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2ac059483410..d34d52fdfdb1 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3525,7 +3525,7 @@ static inline bool tcp_may_update_window(const struct tcp_sock *tp,
 {
 	return after(ack, tp->snd_una) ||
 		after(ack_seq, tp->snd_wl1) ||
-		(ack_seq == tp->snd_wl1 && nwin > tp->snd_wnd);
+		(ack_seq == tp->snd_wl1 && (nwin > tp->snd_wnd || !nwin));
 }
 
 /* If we update tp->snd_una, also update tp->bytes_acked */

From patchwork Fri Aug 11 02:55:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13350033
X-Patchwork-Delegate: kuba@kernel.org
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: edumazet@google.com, ncardwell@google.com
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 flyingpeng@tencent.com, Menglong Dong
Subject: [PATCH net-next v4 3/4] net: tcp: fix unexcepted socket die when snd_wnd is 0
Date: Fri, 11 Aug 2023 10:55:29 +0800
Message-Id: <20230811025530.3510703-4-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230811025530.3510703-1-imagedong@tencent.com>
References: <20230811025530.3510703-1-imagedong@tencent.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Menglong Dong

In tcp_retransmit_timer(), a connection whose window has shrunk is
regarded as timed out if 'tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX'.
This is not always right.

The retransmits become zero-window probes in tcp_retransmit_timer() if
'snd_wnd==0'. Therefore, icsk->icsk_rto will reach TCP_RTO_MAX sooner or
later.

However, the timer can be delayed and be triggered after 122877ms, not
TCP_RTO_MAX, as I tested. Therefore, 'tcp_jiffies32 - tp->rcv_tstamp >
TCP_RTO_MAX' is always true once the RTO reaches TCP_RTO_MAX, and the
socket will die.

Fix this by replacing 'tcp_jiffies32' with '(u32)icsk->icsk_timeout',
which is exactly the timestamp of the timeout.

However, "tp->rcv_tstamp" can restart from idle, in which case
tp->rcv_tstamp could already be a long time (minutes or hours) in the
past even on the first RTO. So we double check the timeout with the
duration of the retransmission.
Meanwhile, use "2 * TCP_RTO_MAX" as the timeout to avoid the socket
dying too soon.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/netdev/CADxym3YyMiO+zMD4zj03YPM3FBi-1LHi6gSD2XT8pyAMM096pg@mail.gmail.com/
Signed-off-by: Menglong Dong
Reviewed-by: Eric Dumazet
---
v4:
- make the timeout "2 * TCP_RTO_MAX"
- tp->retrans_stamp is not based on jiffies and can't be compared with
  icsk->icsk_timeout. Fix it.
v3:
- use after() instead of max() in tcp_rtx_probe0_timed_out()
v2:
- consider the case of the connection restarting from idle, as Neal
  commented
---
 net/ipv4/tcp_timer.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index d45c96c7f5a4..f2a52c11e044 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -454,6 +454,22 @@ static void tcp_fastopen_synack_timer(struct sock *sk, struct request_sock *req)
 			  req->timeout << req->num_timeout, TCP_RTO_MAX);
 }
 
+static bool tcp_rtx_probe0_timed_out(const struct sock *sk,
+				     const struct sk_buff *skb)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+	const int timeout = TCP_RTO_MAX * 2;
+	u32 rcv_delta, rtx_delta;
+
+	rcv_delta = inet_csk(sk)->icsk_timeout - tp->rcv_tstamp;
+	if (rcv_delta <= timeout)
+		return false;
+
+	rtx_delta = (u32)msecs_to_jiffies(tcp_time_stamp(tp) -
+			(tp->retrans_stamp ?: tcp_skb_timestamp(skb)));
+
+	return rtx_delta > timeout;
+}
 
 /**
  * tcp_retransmit_timer() - The TCP retransmit timeout handler
@@ -519,7 +535,7 @@ void tcp_retransmit_timer(struct sock *sk)
 					    tp->snd_una, tp->snd_nxt);
 		}
 #endif
-		if (tcp_jiffies32 - tp->rcv_tstamp > TCP_RTO_MAX) {
+		if (tcp_rtx_probe0_timed_out(sk, skb)) {
 			tcp_write_err(sk);
 			goto out;
 		}

From patchwork Fri Aug 11 02:55:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13350034
X-Patchwork-Delegate: kuba@kernel.org
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: edumazet@google.com, ncardwell@google.com
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 flyingpeng@tencent.com, Menglong Dong
Subject: [PATCH net-next v4 4/4] net: tcp: refactor the dbg message in tcp_retransmit_timer()
Date: Fri, 11 Aug 2023 10:55:30 +0800
Message-Id: <20230811025530.3510703-5-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230811025530.3510703-1-imagedong@tencent.com>
References: <20230811025530.3510703-1-imagedong@tencent.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Menglong Dong

The debug message in tcp_retransmit_timer() is slightly wrong, because
it could be printed even if we did not receive a new ACK packet from the
remote peer.

Change it to probing zero-window, as it is an expected case now. The
description may not be correct.

Add the duration since the last ACK we received and the duration of the
retransmission, which are useful for debugging. The message now looks
like this:

Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 209ms ago, lasting 209ms
Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 404ms ago, lasting 408ms
Probing zero-window on 127.0.0.1:9999/46946, seq=3737778959:3737791503, recv 812ms ago, lasting 1224ms

Signed-off-by: Menglong Dong
Reviewed-by: Eric Dumazet
---
 net/ipv4/tcp_timer.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index f2a52c11e044..74c70fc1003c 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -519,20 +519,23 @@ void tcp_retransmit_timer(struct sock *sk)
 		 * we cannot allow such beasts to hang infinitely.
 		 */
 		struct inet_sock *inet = inet_sk(sk);
+		u32 rtx_delta;
+
+		rtx_delta = tcp_time_stamp(tp) - (tp->retrans_stamp ?: tcp_skb_timestamp(skb));
 		if (sk->sk_family == AF_INET) {
-			net_dbg_ratelimited("Peer %pI4:%u/%u unexpectedly shrunk window %u:%u (repaired)\n",
-					    &inet->inet_daddr,
-					    ntohs(inet->inet_dport),
-					    inet->inet_num,
-					    tp->snd_una, tp->snd_nxt);
+			net_dbg_ratelimited("Probing zero-window on %pI4:%u/%u, seq=%u:%u, recv %ums ago, lasting %ums\n",
+					    &inet->inet_daddr, ntohs(inet->inet_dport),
+					    inet->inet_num, tp->snd_una, tp->snd_nxt,
+					    jiffies_to_msecs(jiffies - tp->rcv_tstamp),
+					    rtx_delta);
 		}
 #if IS_ENABLED(CONFIG_IPV6)
 		else if (sk->sk_family == AF_INET6) {
-			net_dbg_ratelimited("Peer %pI6:%u/%u unexpectedly shrunk window %u:%u (repaired)\n",
-					    &sk->sk_v6_daddr,
-					    ntohs(inet->inet_dport),
-					    inet->inet_num,
-					    tp->snd_una, tp->snd_nxt);
+			net_dbg_ratelimited("Probing zero-window on %pI6:%u/%u, seq=%u:%u, recv %ums ago, lasting %ums\n",
+					    &sk->sk_v6_daddr, ntohs(inet->inet_dport),
+					    inet->inet_num, tp->snd_una, tp->snd_nxt,
+					    jiffies_to_msecs(jiffies - tp->rcv_tstamp),
+					    rtx_delta);
 		}
 #endif
 		if (tcp_rtx_probe0_timed_out(sk, skb)) {
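
For readers who want to try the window-update rule from patch 2/4 outside the kernel, here is a minimal userspace sketch. It is not kernel code: struct snd_state, seq_after() and the allow_zero parameter are hypothetical stand-ins (for struct tcp_sock, the kernel's after() macro, and "before/after this series" respectively), and it assumes a two's-complement target for the signed cast.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is after b" for 32-bit sequence numbers,
 * modeled on the kernel's after() macro.
 */
static bool seq_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Hypothetical stand-in for the three tcp_sock fields the check reads. */
struct snd_state {
	uint32_t snd_una;	/* oldest unacknowledged sequence number */
	uint32_t snd_wl1;	/* seq of the segment used for the last window update */
	uint32_t snd_wnd;	/* window currently advertised by the peer */
};

/* allow_zero == false models the rule before patch 2/4,
 * allow_zero == true models the rule after it.
 */
static bool may_update_window(const struct snd_state *tp, uint32_t ack,
			      uint32_t ack_seq, uint32_t nwin, bool allow_zero)
{
	return seq_after(ack, tp->snd_una) ||
	       seq_after(ack_seq, tp->snd_wl1) ||
	       (ack_seq == tp->snd_wl1 &&
		(nwin > tp->snd_wnd || (allow_zero && !nwin)));
}
```

With snd_una = 1000, snd_wl1 = 5000 and snd_wnd = 65535, a pure ACK carrying ack = 1000, ack_seq = 5000 and nwin = 0 is rejected by the old rule and accepted by the new one. The same ACK with nwin = 1024 is still rejected, since it neither acknowledges new data nor expands the window.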