From patchwork Wed May 17 12:41:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13244844
X-Patchwork-Delegate: kuba@kernel.org
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: kuba@kernel.org
Cc: davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 Menglong Dong
Subject: [PATCH net-next 1/3] net: tcp: add sysctl for controlling tcp window shrink
Date: Wed, 17 May 2023 20:41:59 +0800
Message-Id: <20230517124201.441634-2-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230517124201.441634-1-imagedong@tencent.com>
References: <20230517124201.441634-1-imagedong@tencent.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Menglong Dong

Introduce the sysctl 'tcp_wnd_shrink', which will be used in the
following patches.

Signed-off-by: Menglong Dong
---
 include/net/tcp.h          | 1 +
 net/ipv4/sysctl_net_ipv4.c | 9 +++++++++
 net/ipv4/tcp.c             | 3 +++
 3 files changed, 13 insertions(+)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index a0a91a988272..a6cf6d823e34 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -247,6 +247,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo);
 /* sysctl variables for tcp */
 extern int sysctl_tcp_max_orphans;
 extern long sysctl_tcp_mem[3];
+extern int sysctl_tcp_wnd_shrink;
 
 #define TCP_RACK_LOSS_DETECTION 0x1 /* Use RACK to detect losses */
 #define TCP_RACK_STATIC_REO_WND 0x2 /* Use static RACK reo wnd */
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0d0cc4ef2b85..fd6cb5a5c2b9 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -577,6 +577,15 @@ static struct ctl_table ipv4_table[] = {
 		.extra1 = &sysctl_fib_sync_mem_min,
 		.extra2 = &sysctl_fib_sync_mem_max,
 	},
+	{
+		.procname = "tcp_wnd_shrink",
+		.data = &sysctl_tcp_wnd_shrink,
+		.maxlen = sizeof(int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = SYSCTL_ZERO,
+		.extra2 = SYSCTL_ONE
+	},
 	{ }
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fd68d49490f2..db0483b2159f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -297,6 +297,9 @@ EXPORT_SYMBOL(tcp_memory_allocated);
 DEFINE_PER_CPU(int, tcp_memory_per_cpu_fw_alloc);
 EXPORT_PER_CPU_SYMBOL_GPL(tcp_memory_per_cpu_fw_alloc);
 
+int sysctl_tcp_wnd_shrink __read_mostly;
+EXPORT_SYMBOL(sysctl_tcp_wnd_shrink);
+
 #if IS_ENABLED(CONFIG_SMC)
 DEFINE_STATIC_KEY_FALSE(tcp_have_smc);
 EXPORT_SYMBOL(tcp_have_smc);

From patchwork Wed May 17 12:42:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13244845
X-Patchwork-Delegate: kuba@kernel.org
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: kuba@kernel.org
Cc: davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 Menglong Dong
Subject: [PATCH net-next 2/3] net: tcp: send zero-window when no memory
Date: Wed, 17 May 2023 20:42:00 +0800
Message-Id: <20230517124201.441634-3-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230517124201.441634-1-imagedong@tencent.com>
References: <20230517124201.441634-1-imagedong@tencent.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Menglong Dong

Currently, an skb is dropped when protocol memory is exhausted, which
makes the client keep retransmitting until it times out. This is
unfriendly to users.

Therefore, force the current socket to receive one packet when protocol
memory exceeds its limit. The socket then stays in the 'no mem' state
until protocol memory becomes available.

While a socket is in the 'no mem' state, its receive window becomes 0,
which means the window shrinks. The sender needs to handle such a window
shrink properly, which is done in the next commit.

Signed-off-by: Menglong Dong
---
 include/net/sock.h    |  1 +
 net/ipv4/tcp_input.c  | 12 ++++++++++++
 net/ipv4/tcp_output.c |  7 +++++++
 3 files changed, 20 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 5edf0038867c..90db8a1d7f31 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -957,6 +957,7 @@ enum sock_flags {
 	SOCK_XDP, /* XDP is attached */
 	SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
 	SOCK_RCVMARK, /* Receive SO_MARK ancillary data with packet */
+	SOCK_NO_MEM, /* protocol memory limitation happened */
 };
 
 #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a057330d6f59..56e395cb4554 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5047,10 +5047,22 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 		if (skb_queue_len(&sk->sk_receive_queue) == 0)
 			sk_forced_mem_schedule(sk, skb->truesize);
 		else if (tcp_try_rmem_schedule(sk, skb, skb->truesize)) {
+			if (sysctl_tcp_wnd_shrink)
+				goto do_wnd_shrink;
+
 			reason = SKB_DROP_REASON_PROTO_MEM;
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVQDROP);
 			sk->sk_data_ready(sk);
 			goto drop;
+do_wnd_shrink:
+			if (sock_flag(sk, SOCK_NO_MEM)) {
+				NET_INC_STATS(sock_net(sk),
+					      LINUX_MIB_TCPRCVQDROP);
+				sk->sk_data_ready(sk);
+				goto out_of_window;
+			}
+			sk_forced_mem_schedule(sk, skb->truesize);
+			sock_set_flag(sk, SOCK_NO_MEM);
 		}
 
 		eaten = tcp_queue_rcv(sk, skb, &fragstolen);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index cfe128b81a01..21dc4f7e0a12 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -300,6 +300,13 @@ static u16 tcp_select_window(struct sock *sk)
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFROMZEROWINDOWADV);
 	}
 
+	if (sock_flag(sk, SOCK_NO_MEM)) {
+		if (sk_memory_allocated(sk) < sk_prot_mem_limits(sk, 2))
+			sock_reset_flag(sk, SOCK_NO_MEM);
+		else
+			new_win = 0;
+	}
+
 	return new_win;
 }
 

From patchwork Wed May 17 12:42:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Menglong Dong
X-Patchwork-Id: 13244846
X-Patchwork-Delegate: kuba@kernel.org
From: menglong8.dong@gmail.com
X-Google-Original-From: imagedong@tencent.com
To: kuba@kernel.org
Cc: davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
 dsahern@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 Menglong Dong
Subject: [PATCH net-next 3/3] net: tcp: handle window shrink properly
Date: Wed, 17 May 2023 20:42:01 +0800
Message-Id: <20230517124201.441634-4-imagedong@tencent.com>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <20230517124201.441634-1-imagedong@tencent.com>
References: <20230517124201.441634-1-imagedong@tencent.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Menglong Dong

Window shrink is currently neither allowed nor handled, but it is needed
in some cases.

In the original logic, the zero-window probe is triggered only when
there is no data in the retransmit queue and the receive window cannot
hold the first packet in the send queue.

Change this so that the zero-window probe is triggered in the following
cases:

- the retransmit queue has data and its first packet is not within the
  receive window
- the retransmit queue has no data and the first packet in the send
  queue extends past the end of the receive window

Signed-off-by: Menglong Dong
---
 include/net/tcp.h     | 21 +++++++++++++++++++++
 net/ipv4/tcp_input.c  | 41 +++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_output.c |  3 +--
 net/ipv4/tcp_timer.c  |  4 +---
 4 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index a6cf6d823e34..9625d0bf00e1 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1910,6 +1910,27 @@ static inline void tcp_add_write_queue_tail(struct sock *sk, struct sk_buff *skb
 	tcp_chrono_start(sk, TCP_CHRONO_BUSY);
 }
 
+static inline bool tcp_rtx_overflow(const struct sock *sk)
+{
+	struct sk_buff *rtx_head = tcp_rtx_queue_head(sk);
+
+	return rtx_head && after(TCP_SKB_CB(rtx_head)->end_seq,
+				 tcp_wnd_end(tcp_sk(sk)));
+}
+
+static inline bool tcp_probe0_needed(const struct sock *sk)
+{
+	/* for the normal case */
+	if (!tcp_sk(sk)->packets_out && !tcp_write_queue_empty(sk))
+		return true;
+
+	if (!sysctl_tcp_wnd_shrink)
+		return false;
+
+	/* for the window shrink case */
+	return tcp_rtx_overflow(sk);
+}
+
 /* Insert new before skb on the write queue of sk. */
 static inline void tcp_insert_write_queue_before(struct sk_buff *new,
 						 struct sk_buff *skb,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 56e395cb4554..a9ac295502ee 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3188,6 +3188,14 @@ void tcp_rearm_rto(struct sock *sk)
 /* Try to schedule a loss probe; if that doesn't work, then schedule an RTO. */
 static void tcp_set_xmit_timer(struct sock *sk)
 {
+	/* Check if we are already in probe0 state, which means it's
+	 * not needed to schedule the RTO. The normal probe0 can't reach
+	 * here, so it must be window-shrink probe0 case here.
+	 */
+	if (unlikely(inet_csk(sk)->icsk_pending == ICSK_TIME_PROBE0) &&
+	    sysctl_tcp_wnd_shrink)
+		return;
+
 	if (!tcp_schedule_loss_probe(sk, true))
 		tcp_rearm_rto(sk);
 }
@@ -3465,6 +3473,38 @@ static void tcp_ack_probe(struct sock *sk)
 	}
 }
 
+/**
+ * This function is called only when there are packets in the rtx queue,
+ * which means that the packets out is not 0.
+ *
+ * NOTE: we only handle window shrink case in this part.
+ */
+static void tcp_ack_probe_shrink(struct sock *sk)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	unsigned long when;
+
+	if (!sysctl_tcp_wnd_shrink)
+		return;
+
+	if (tcp_rtx_overflow(sk)) {
+		when = tcp_probe0_when(sk, TCP_RTO_MAX);
+
+		when = tcp_clamp_probe0_to_user_timeout(sk, when);
+		tcp_reset_xmit_timer(sk, ICSK_TIME_PROBE0, when, TCP_RTO_MAX);
+	} else {
+		/* check if recover from window shrink */
+		if (icsk->icsk_pending != ICSK_TIME_PROBE0)
+			return;
+
+		icsk->icsk_backoff = 0;
+		icsk->icsk_probes_tstamp = 0;
+		inet_csk_clear_xmit_timer(sk, ICSK_TIME_PROBE0);
+		if (!tcp_rtx_queue_empty(sk))
+			tcp_retransmit_timer(sk);
+	}
+}
+
 static inline bool tcp_ack_is_dubious(const struct sock *sk, const int flag)
 {
 	return !(flag & FLAG_NOT_DUP) || (flag & FLAG_CA_ALERT) ||
@@ -3908,6 +3948,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 	if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
 		sk_dst_confirm(sk);
 
+	tcp_ack_probe_shrink(sk);
 	delivered = tcp_newly_delivered(sk, delivered, flag);
 	lost = tp->lost - lost; /* freshly marked lost */
 	rs.is_ack_delayed = !!(flag & FLAG_ACK_MAYBE_DELAYED);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 21dc4f7e0a12..eac0532edb61 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -4089,14 +4089,13 @@ int tcp_write_wakeup(struct sock *sk, int mib)
 void tcp_send_probe0(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
-	struct tcp_sock *tp = tcp_sk(sk);
 	struct net *net = sock_net(sk);
 	unsigned long timeout;
 	int err;
 
 	err = tcp_write_wakeup(sk, LINUX_MIB_TCPWINPROBE);
 
-	if (tp->packets_out || tcp_write_queue_empty(sk)) {
+	if (!tcp_probe0_needed(sk)) {
 		/* Cancel probe timer, if it is not required. */
 		icsk->icsk_probes_out = 0;
 		icsk->icsk_backoff = 0;
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index b839c2f91292..a28606291b7e 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -350,11 +350,9 @@ static void tcp_delack_timer(struct timer_list *t)
 static void tcp_probe_timer(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
-	struct sk_buff *skb = tcp_send_head(sk);
-	struct tcp_sock *tp = tcp_sk(sk);
 	int max_probes;
 
-	if (tp->packets_out || !skb) {
+	if (!tcp_probe0_needed(sk)) {
 		icsk->icsk_probes_out = 0;
 		icsk->icsk_probes_tstamp = 0;
 		return;
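
The probe0 decision that patch 3 adds in tcp_probe0_needed() and tcp_rtx_overflow() can be modelled outside the kernel. The following is only a rough sketch and not part of the series: `struct snd_model`, `seq_after()`, `model_rtx_overflow()`, `model_probe0_needed()` and `model_selftest()` are hypothetical user-space stand-ins, and plain fields replace the real queue, window and sysctl lookups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the sender state consulted by
 * tcp_probe0_needed(); field names are illustrative, not kernel API.
 */
struct snd_model {
	uint32_t packets_out;      /* segments sent but not yet acked */
	bool write_queue_empty;    /* true if no unsent data is queued */
	bool rtx_queue_empty;      /* true if nothing awaits an ACK */
	uint32_t rtx_head_end_seq; /* end_seq of the first rtx-queue skb */
	uint32_t wnd_end;          /* right edge of the peer's window */
	bool wnd_shrink;           /* models sysctl_tcp_wnd_shrink */
};

/* Wraparound-safe "seq1 is after seq2" on 32-bit sequence numbers. */
static bool seq_after(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq2 - seq1) < 0;
}

/* The head of the rtx queue ends beyond the advertised window. */
static bool model_rtx_overflow(const struct snd_model *m)
{
	return !m->rtx_queue_empty &&
	       seq_after(m->rtx_head_end_seq, m->wnd_end);
}

static bool model_probe0_needed(const struct snd_model *m)
{
	/* normal case: nothing in flight, but unsent data is waiting */
	if (!m->packets_out && !m->write_queue_empty)
		return true;

	if (!m->wnd_shrink)
		return false;

	/* window shrink case */
	return model_rtx_overflow(m);
}

/* Small self-check of the model; call it from any test harness. */
void model_selftest(void)
{
	struct snd_model m = {
		.packets_out = 1,
		.rtx_head_end_seq = 2000,
		.wnd_end = 1000,
	};

	assert(!model_probe0_needed(&m)); /* sysctl off: original rule */
	m.wnd_shrink = true;
	assert(model_probe0_needed(&m));  /* sysctl on: rtx head overflows */
}
```

With `wnd_shrink` set and the head of the rtx queue ending past `wnd_end`, the model reports that a probe is needed even though `packets_out` is non-zero. With `wnd_shrink` clear it falls back to the original condition.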