From patchwork Fri Oct 30 22:45:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mat Martineau X-Patchwork-Id: 11870933 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB5DCC00A89 for ; Fri, 30 Oct 2020 22:45:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A5B342224E for ; Fri, 30 Oct 2020 22:45:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726090AbgJ3WpO (ORCPT ); Fri, 30 Oct 2020 18:45:14 -0400 Received: from mga11.intel.com ([192.55.52.93]:45959 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726068AbgJ3WpO (ORCPT ); Fri, 30 Oct 2020 18:45:14 -0400 IronPort-SDR: dbkSNhSiWfreoCaA98m8vtHNyu8Yph1NKCX7oq5vO6AeC8uR0PvixsyWVUHmb+SoDnHg0Usanc PYdF2VMMmaHA== X-IronPort-AV: E=McAfee;i="6000,8403,9790"; a="165177395" X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="165177395" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:14 -0700 IronPort-SDR: zyh2+/nDjWOvorFNOopKsYG8uNkMCuGemac7Oiqwbzs5BteZoI+2eiG9nmpbb7A37rEeoJFjWm /dyzXBhNUCeg== X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="361983934" Received: from mjmartin-nuc02.amr.corp.intel.com ([10.254.102.227]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:12 -0700 From: Mat Martineau To: netdev@vger.kernel.org Cc: Florian Westphal , mptcp@lists.01.org, kuba@kernel.org, davem@davemloft.net, Mat Martineau Subject: [PATCH net-next 1/6] mptcp: adjust mptcp receive buffer limit if subflow has larger one Date: Fri, 30 Oct 2020 15:45:01 -0700 Message-Id: <20201030224506.108377-2-mathew.j.martineau@linux.intel.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> References: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Florian Westphal In addition to tcp autotuning during read, it may also increase the receive buffer in tcp_clamp_window(). In this case, mptcp should adjust its receive buffer size as well so it can move all pending skbs from the subflow socket to the mptcp socket. At this time, TCP can have more skbs ready for processing than what the mptcp receive buffer size allows. In the mptcp case, the receive window announced is based on the free space of the mptcp parent socket instead of the individual subflows. Following the subflow allows mptcp to grow its receive buffer. This is especially noticeable for loopback traffic where two skbs are enough to fill the initial receive window. In mptcp_data_ready() we do not hold the mptcp socket lock, so modifying mptcp_sk->sk_rcvbuf is racy. Do it when moving skbs from subflow to mptcp socket, both sockets are locked in this case. Reviewed-by: Mat Martineau Signed-off-by: Florian Westphal --- net/mptcp/protocol.c | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e7419fd15d84..e010ef7585bf 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -466,6 +466,18 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk, struct tcp_sock *tp; u32 old_copied_seq; bool done = false; + int sk_rbuf; + + sk_rbuf = READ_ONCE(sk->sk_rcvbuf); + + if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) { + int ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf); + + if (unlikely(ssk_rbuf > sk_rbuf)) { + WRITE_ONCE(sk->sk_rcvbuf, ssk_rbuf); + sk_rbuf = ssk_rbuf; + } + } pr_debug("msk=%p ssk=%p", msk, ssk); tp = tcp_sk(ssk); @@ -528,7 +540,7 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk, WRITE_ONCE(tp->copied_seq, seq); more_data_avail = mptcp_subflow_data_available(ssk); - if (atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf)) { + if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf) { done = true; break; } @@ -622,6 +634,7 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); struct mptcp_sock *msk = mptcp_sk(sk); + int sk_rbuf, ssk_rbuf; bool wake; /* move_skbs_to_msk below can legitly clear the data_avail flag, @@ -632,12 +645,16 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk) if (wake) set_bit(MPTCP_DATA_READY, &msk->flags); - if (atomic_read(&sk->sk_rmem_alloc) < READ_ONCE(sk->sk_rcvbuf) && - move_skbs_to_msk(msk, ssk)) + ssk_rbuf = READ_ONCE(ssk->sk_rcvbuf); + sk_rbuf = READ_ONCE(sk->sk_rcvbuf); + if (unlikely(ssk_rbuf > sk_rbuf)) + sk_rbuf = ssk_rbuf; + + /* over limit? can't append more skbs to msk */ + if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf) goto wake; - /* don't schedule if mptcp sk is (still) over limit */ - if (atomic_read(&sk->sk_rmem_alloc) > READ_ONCE(sk->sk_rcvbuf)) + if (move_skbs_to_msk(msk, ssk)) goto wake; /* mptcp socket is owned, release_cb should retry */ From patchwork Fri Oct 30 22:45:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mat Martineau X-Patchwork-Id: 11870939 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0ACBCC00A89 for ; Fri, 30 Oct 2020 22:45:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BD1D122228 for ; Fri, 30 Oct 2020 22:45:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726120AbgJ3WpR (ORCPT ); Fri, 30 Oct 2020 18:45:17 -0400 Received: from mga11.intel.com ([192.55.52.93]:45959 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726091AbgJ3WpP (ORCPT ); Fri, 30 Oct 2020 18:45:15 -0400 IronPort-SDR: YGhEjGwpUp4BBq+uL/ual+83fPD77LTq+n+IfGWzr5PIH0aHIsJOHlBP9ALEs9yg1k/3W2EqEc 6fmRew+rNb2w== X-IronPort-AV: E=McAfee;i="6000,8403,9790"; a="165177397" X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="165177397" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:14 -0700 IronPort-SDR: boY8rD4fhd6diZ4mUkz3dYybOb34BWQxjqfyTPOEyYXLoUN+PI60A6L7mPiT+cMYbBOfAR4UTl cPDGCjryxhEg== X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="361983938" Received: from mjmartin-nuc02.amr.corp.intel.com ([10.254.102.227]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:12 -0700 From: Mat Martineau To: netdev@vger.kernel.org Cc: Florian Westphal , mptcp@lists.01.org, kuba@kernel.org, davem@davemloft.net, Paolo Abeni , Mat Martineau Subject: [PATCH net-next 2/6] mptcp: use _fast lock version in __mptcp_move_skbs Date: Fri, 30 Oct 2020 15:45:02 -0700 Message-Id: <20201030224506.108377-3-mathew.j.martineau@linux.intel.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> References: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Florian Westphal The function is short and won't sleep, so this can use the _fast version. Acked-by: Paolo Abeni Reviewed-by: Mat Martineau Signed-off-by: Florian Westphal --- net/mptcp/protocol.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e010ef7585bf..f5bacfc55006 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1493,13 +1493,14 @@ static bool __mptcp_move_skbs(struct mptcp_sock *msk) __mptcp_flush_join_list(msk); do { struct sock *ssk = mptcp_subflow_recv_lookup(msk); + bool slowpath; if (!ssk) break; - lock_sock(ssk); + slowpath = lock_sock_fast(ssk); done = __mptcp_move_skbs_from_subflow(msk, ssk, &moved); - release_sock(ssk); + unlock_sock_fast(ssk, slowpath); } while (!done); if (mptcp_ofo_queue(msk) || moved > 0) { From patchwork Fri Oct 30 22:45:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mat Martineau X-Patchwork-Id: 11870943 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CA82C388F9 for ; Fri, 30 Oct 2020 22:45:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5BDAB22245 for ; Fri, 30 Oct 2020 22:45:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726166AbgJ3WpY (ORCPT ); Fri, 30 Oct 2020 18:45:24 -0400 Received: from mga11.intel.com ([192.55.52.93]:45959 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726092AbgJ3WpP (ORCPT ); Fri, 30 Oct 2020 18:45:15 -0400 IronPort-SDR: dJDwoRjuYXNP7ZXxJg9Lzxy6HZuz0SUeZJ9eZR9Y10pboIPeysfTaew6BfHP8kfs6OiKrTOGNc NIflivYB4UAA== X-IronPort-AV: E=McAfee;i="6000,8403,9790"; a="165177398" X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="165177398" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:14 -0700 IronPort-SDR: EGFqJTfKjrCLDJ6CD7m6P1DNdqp6YqKD2wb+FlmB5qfiJf1Cj3ZM7YdFPERMmCAi2knh+T0ESE WnBPYEe9F2Hw== X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="361983939" Received: from mjmartin-nuc02.amr.corp.intel.com ([10.254.102.227]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:12 -0700 From: Mat Martineau To: netdev@vger.kernel.org Cc: Paolo Abeni , mptcp@lists.01.org, kuba@kernel.org, davem@davemloft.net, Mat Martineau Subject: [PATCH net-next 3/6] tcp: propagate MPTCP skb extensions on xmit splits Date: Fri, 30 Oct 2020 15:45:03 -0700 Message-Id: <20201030224506.108377-4-mathew.j.martineau@linux.intel.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> References: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Paolo Abeni When the TCP stack splits a packet on the write queue, the tail half currently lose the associated skb extensions, and will not carry the DSM on the wire. The above does not cause functional problems and is allowed by the RFC, but interact badly with GRO and RX coalescing, as possible candidates for aggregation will carry different TCP options. This change tries to improve the MPTCP behavior, propagating the skb extensions on split. Additionally, we must prevent the MPTCP stack from updating the mapping after the split occur: that will both violate the RFC and fool the reader. Reviewed-by: Mat Martineau Signed-off-by: Paolo Abeni --- include/net/mptcp.h | 21 ++++++++++++++++++++- net/ipv4/tcp_output.c | 3 +++ net/mptcp/protocol.c | 7 +++++-- 3 files changed, 28 insertions(+), 3 deletions(-) diff --git a/include/net/mptcp.h b/include/net/mptcp.h index 753ba7e755d6..6e706d838e4e 100644 --- a/include/net/mptcp.h +++ b/include/net/mptcp.h @@ -29,7 +29,8 @@ struct mptcp_ext { use_ack:1, ack64:1, mpc_map:1, - __unused:2; + frozen:1, + __unused:1; /* one byte hole */ }; @@ -106,6 +107,19 @@ static inline void mptcp_skb_ext_move(struct sk_buff *to, from->active_extensions = 0; } +static inline void mptcp_skb_ext_copy(struct sk_buff *to, + struct sk_buff *from) +{ + struct mptcp_ext *from_ext; + + from_ext = skb_ext_find(from, SKB_EXT_MPTCP); + if (!from_ext) + return; + + from_ext->frozen = 1; + skb_ext_copy(to, from); +} + static inline bool mptcp_ext_matches(const struct mptcp_ext *to_ext, const struct mptcp_ext *from_ext) { @@ -193,6 +207,11 @@ static inline void mptcp_skb_ext_move(struct sk_buff *to, { } +static inline void mptcp_skb_ext_copy(struct sk_buff *to, + struct sk_buff *from) +{ +} + static inline bool mptcp_skb_can_collapse(const struct sk_buff *to, const struct sk_buff *from) { diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index bf48cd73e967..9abf5a0358d5 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1569,6 +1569,7 @@ int tcp_fragment(struct sock *sk, enum tcp_queue tcp_queue, if (!buff) return -ENOMEM; /* We'll just try again later. */ skb_copy_decrypted(buff, skb); + mptcp_skb_ext_copy(buff, skb); sk_wmem_queued_add(sk, buff->truesize); sk_mem_charge(sk, buff->truesize); @@ -2123,6 +2124,7 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len, if (unlikely(!buff)) return -ENOMEM; skb_copy_decrypted(buff, skb); + mptcp_skb_ext_copy(buff, skb); sk_wmem_queued_add(sk, buff->truesize); sk_mem_charge(sk, buff->truesize); @@ -2393,6 +2395,7 @@ static int tcp_mtu_probe(struct sock *sk) skb = tcp_send_head(sk); skb_copy_decrypted(nskb, skb); + mptcp_skb_ext_copy(nskb, skb); TCP_SKB_CB(nskb)->seq = TCP_SKB_CB(skb)->seq; TCP_SKB_CB(nskb)->end_seq = TCP_SKB_CB(skb)->seq + probe_size; diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index f5bacfc55006..25d12183d1ca 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -771,8 +771,11 @@ static bool mptcp_skb_can_collapse_to(u64 write_seq, if (!tcp_skb_can_collapse_to(skb)) return false; - /* can collapse only if MPTCP level sequence is in order */ - return mpext && mpext->data_seq + mpext->data_len == write_seq; + /* can collapse only if MPTCP level sequence is in order and this + * mapping has not been xmitted yet + */ + return mpext && mpext->data_seq + mpext->data_len == write_seq && + !mpext->frozen; } static bool mptcp_frag_can_collapse_to(const struct mptcp_sock *msk, From patchwork Fri Oct 30 22:45:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mat Martineau X-Patchwork-Id: 11870941 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA930C4742C for ; Fri, 30 Oct 2020 22:45:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 695A322228 for ; Fri, 30 Oct 2020 22:45:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726133AbgJ3WpT (ORCPT ); Fri, 30 Oct 2020 18:45:19 -0400 Received: from mga11.intel.com ([192.55.52.93]:45959 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726070AbgJ3WpO (ORCPT ); Fri, 30 Oct 2020 18:45:14 -0400 IronPort-SDR: t7++ucT/0yKaeS7djikc8NyjwZ0nEGCFcgfma6JESe5ePqk4MO55NgNcUQZk11gTCd+AatI2pE 9E00P46icliQ== X-IronPort-AV: E=McAfee;i="6000,8403,9790"; a="165177399" X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="165177399" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:14 -0700 IronPort-SDR: wg9DkpDG9ivpmchvRsyJUjaM74V/ucGlRGNQTAIddg8qZE2OXoLenfDnys1fV8J0/s4s+RglSi ykBosuvmMTYA== X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="361983940" Received: from mjmartin-nuc02.amr.corp.intel.com ([10.254.102.227]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:13 -0700 From: Mat Martineau To: netdev@vger.kernel.org Cc: Florian Westphal , mptcp@lists.01.org, kuba@kernel.org, davem@davemloft.net, Mat Martineau Subject: [PATCH net-next 4/6] mptcp: split mptcp_clean_una function Date: Fri, 30 Oct 2020 15:45:04 -0700 Message-Id: <20201030224506.108377-5-mathew.j.martineau@linux.intel.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> References: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Florian Westphal mptcp_clean_una() will wake writers in case memory could be reclaimed. When called from mptcp_sendmsg the wakeup code isn't needed. Move the wakeup to a new helper and then use that from the mptcp worker. Reviewed-by: Mat Martineau Signed-off-by: Florian Westphal --- net/mptcp/protocol.c | 28 +++++++++++++++++----------- 1 file changed, 17 insertions(+), 11 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 25d12183d1ca..b84b84adc9ad 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -853,19 +853,25 @@ static void mptcp_clean_una(struct sock *sk) } out: - if (cleaned) { + if (cleaned) sk_mem_reclaim_partial(sk); +} - /* Only wake up writers if a subflow is ready */ - if (mptcp_is_writeable(msk)) { - set_bit(MPTCP_SEND_SPACE, &mptcp_sk(sk)->flags); - smp_mb__after_atomic(); +static void mptcp_clean_una_wakeup(struct sock *sk) +{ + struct mptcp_sock *msk = mptcp_sk(sk); - /* set SEND_SPACE before sk_stream_write_space clears - * NOSPACE - */ - sk_stream_write_space(sk); - } + mptcp_clean_una(sk); + + /* Only wake up writers if a subflow is ready */ + if (mptcp_is_writeable(msk)) { + set_bit(MPTCP_SEND_SPACE, &msk->flags); + smp_mb__after_atomic(); + + /* set SEND_SPACE before sk_stream_write_space clears + * NOSPACE + */ + sk_stream_write_space(sk); } } @@ -1769,7 +1775,7 @@ static void mptcp_worker(struct work_struct *work) long timeo = 0; lock_sock(sk); - mptcp_clean_una(sk); + mptcp_clean_una_wakeup(sk); mptcp_check_data_fin_ack(sk); __mptcp_flush_join_list(msk); if (test_and_clear_bit(MPTCP_WORK_CLOSE_SUBFLOW, &msk->flags)) From patchwork Fri Oct 30 22:45:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mat Martineau X-Patchwork-Id: 11870937 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 432E6C388F9 for ; Fri, 30 Oct 2020 22:45:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 026EB221EB for ; Fri, 30 Oct 2020 22:45:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726138AbgJ3WpU (ORCPT ); Fri, 30 Oct 2020 18:45:20 -0400 Received: from mga11.intel.com ([192.55.52.93]:45959 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726107AbgJ3WpP (ORCPT ); Fri, 30 Oct 2020 18:45:15 -0400 IronPort-SDR: 9TQoV+1H4LV8qYetLO+afrwes4x1EQ64f4QaNA9WD7Z8K+9H5ZgHiuIuezVHoeZRldd8Gb3Fvk uWSs2+E7cy1w== X-IronPort-AV: E=McAfee;i="6000,8403,9790"; a="165177401" X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="165177401" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:14 -0700 IronPort-SDR: WGBqc34Orx9fY4yHrfopmMb8cAmEPQWB1R9poj0ae6hBznj/tRq6nRMbbFCEIf5KJ3L/1ARELC +EMp+4FC/euw== X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="361983942" Received: from mjmartin-nuc02.amr.corp.intel.com ([10.254.102.227]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:13 -0700 From: Mat Martineau To: netdev@vger.kernel.org Cc: Geliang Tang , mptcp@lists.01.org, kuba@kernel.org, davem@davemloft.net, Matthieu Baerts , Paolo Abeni Subject: [PATCH net-next 5/6] mptcp: add a new sysctl add_addr_timeout Date: Fri, 30 Oct 2020 15:45:05 -0700 Message-Id: <20201030224506.108377-6-mathew.j.martineau@linux.intel.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> References: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Geliang Tang This patch added a new sysctl, named add_addr_timeout, to control the timeout value (in seconds) of the ADD_ADDR retransmission. Suggested-by: Matthieu Baerts Suggested-by: Paolo Abeni Acked-by: Paolo Abeni Reviewed-by: Matthieu Baerts Signed-off-by: Geliang Tang --- net/mptcp/ctrl.c | 14 ++++++++++++++ net/mptcp/pm_netlink.c | 8 ++++++-- net/mptcp/protocol.h | 1 + 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c index 54b888f94009..96ba616f59bf 100644 --- a/net/mptcp/ctrl.c +++ b/net/mptcp/ctrl.c @@ -18,6 +18,7 @@ struct mptcp_pernet { struct ctl_table_header *ctl_table_hdr; int mptcp_enabled; + unsigned int add_addr_timeout; }; static struct mptcp_pernet *mptcp_get_pernet(struct net *net) @@ -30,6 +31,11 @@ int mptcp_is_enabled(struct net *net) return mptcp_get_pernet(net)->mptcp_enabled; } +unsigned int mptcp_get_add_addr_timeout(struct net *net) +{ + return mptcp_get_pernet(net)->add_addr_timeout; +} + static struct ctl_table mptcp_sysctl_table[] = { { .procname = "enabled", @@ -40,12 +46,19 @@ static struct ctl_table mptcp_sysctl_table[] = { */ .proc_handler = proc_dointvec, }, + { + .procname = "add_addr_timeout", + .maxlen = sizeof(unsigned int), + .mode = 0644, + .proc_handler = proc_dointvec_jiffies, + }, {} }; static void mptcp_pernet_set_defaults(struct mptcp_pernet *pernet) { pernet->mptcp_enabled = 1; + pernet->add_addr_timeout = TCP_RTO_MAX; } static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet) @@ -61,6 +74,7 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet) } table[0].data = &pernet->mptcp_enabled; + table[1].data = &pernet->add_addr_timeout; hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table); if (!hdr) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 0d6f3d912891..ed60538df7b2 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -206,6 +206,7 @@ static void mptcp_pm_add_timer(struct timer_list *timer) struct mptcp_pm_add_entry *entry = from_timer(entry, timer, add_timer); struct mptcp_sock *msk = entry->sock; struct sock *sk = (struct sock *)msk; + struct net *net = sock_net(sk); pr_debug("msk=%p", msk); @@ -232,7 +233,8 @@ static void mptcp_pm_add_timer(struct timer_list *timer) } if (entry->retrans_times < ADD_ADDR_RETRANS_MAX) - sk_reset_timer(sk, timer, jiffies + TCP_RTO_MAX); + sk_reset_timer(sk, timer, + jiffies + mptcp_get_add_addr_timeout(net)); spin_unlock_bh(&msk->pm.lock); @@ -264,6 +266,7 @@ static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, { struct mptcp_pm_add_entry *add_entry = NULL; struct sock *sk = (struct sock *)msk; + struct net *net = sock_net(sk); if (lookup_anno_list_by_saddr(msk, &entry->addr)) return false; @@ -279,7 +282,8 @@ static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, add_entry->retrans_times = 0; timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); - sk_reset_timer(sk, &add_entry->add_timer, jiffies + TCP_RTO_MAX); + sk_reset_timer(sk, &add_entry->add_timer, + jiffies + mptcp_get_add_addr_timeout(net)); return true; } diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 13ab89dc1914..278c88c405e8 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -362,6 +362,7 @@ mptcp_subflow_get_mapped_dsn(const struct mptcp_subflow_context *subflow) } int mptcp_is_enabled(struct net *net); +unsigned int mptcp_get_add_addr_timeout(struct net *net); void mptcp_subflow_fully_established(struct mptcp_subflow_context *subflow, struct mptcp_options_received *mp_opt); bool mptcp_subflow_data_available(struct sock *sk); From patchwork Fri Oct 30 22:45:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mat Martineau X-Patchwork-Id: 11870945 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5703CC00A89 for ; Fri, 30 Oct 2020 22:45:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0E223221EB for ; Fri, 30 Oct 2020 22:45:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726151AbgJ3WpX (ORCPT ); Fri, 30 Oct 2020 18:45:23 -0400 Received: from mga11.intel.com ([192.55.52.93]:45968 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726108AbgJ3WpQ (ORCPT ); Fri, 30 Oct 2020 18:45:16 -0400 IronPort-SDR: mlY4AB9Ww4jyTTm0rb1PlAqkaB/KcnM3mLzdmDKmllRriBEiTUdelWhfhbObHqtyjk+L2SyuTS wQBiOdck9jGQ== X-IronPort-AV: E=McAfee;i="6000,8403,9790"; a="165177403" X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="165177403" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:14 -0700 IronPort-SDR: 9I3S7hOympe45pE8kgYfcZE2uQ9aiH8F1KlT5MMH2t4xNr7P9Sqim8vKrBvbUH2OwSsReiXI7l MYcaf/GYLfZw== X-IronPort-AV: E=Sophos;i="5.77,434,1596524400"; d="scan'208";a="361983944" Received: from mjmartin-nuc02.amr.corp.intel.com ([10.254.102.227]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2020 15:45:13 -0700 From: Mat Martineau To: netdev@vger.kernel.org Cc: Geliang Tang , mptcp@lists.01.org, kuba@kernel.org, davem@davemloft.net, Matthieu Baerts , Paolo Abeni Subject: [PATCH net-next 6/6] selftests: mptcp: add ADD_ADDR timeout test case Date: Fri, 30 Oct 2020 15:45:06 -0700 Message-Id: <20201030224506.108377-7-mathew.j.martineau@linux.intel.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> References: <20201030224506.108377-1-mathew.j.martineau@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Geliang Tang This patch added the test case for retransmitting ADD_ADDR when timeout occurs. It set NS1's add_addr_timeout to 1 second, and drop NS2's ADD_ADDR echo packets. Here we need to slow down the transfer process of all data to let the ADD_ADDR suboptions can be retransmitted three times. So we added a new parameter "speed" for do_transfer, it can be set with fast or slow. We also added three new optional parameters for run_tests, and dropped run_remove_tests function. Since we added the netfilter rules in this test case, we need to update the "config" file. Suggested-by: Matthieu Baerts Suggested-by: Paolo Abeni Acked-by: Paolo Abeni Reviewed-by: Matthieu Baerts Signed-off-by: Geliang Tang --- tools/testing/selftests/net/mptcp/config | 10 ++ .../testing/selftests/net/mptcp/mptcp_join.sh | 94 ++++++++++++++----- 2 files changed, 80 insertions(+), 24 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/config b/tools/testing/selftests/net/mptcp/config index 741a1c4f4ae8..0faaccd21447 100644 --- a/tools/testing/selftests/net/mptcp/config +++ b/tools/testing/selftests/net/mptcp/config @@ -5,3 +5,13 @@ CONFIG_INET_DIAG=m CONFIG_INET_MPTCP_DIAG=m CONFIG_VETH=y CONFIG_NET_SCH_NETEM=m +CONFIG_NETFILTER=y +CONFIG_NETFILTER_ADVANCED=y +CONFIG_NETFILTER_NETLINK=m +CONFIG_NF_TABLES=m +CONFIG_NFT_COUNTER=m +CONFIG_NFT_COMPAT=m +CONFIG_NETFILTER_XTABLES=m +CONFIG_NETFILTER_XT_MATCH_BPF=m +CONFIG_NF_TABLES_IPV4=y +CONFIG_NF_TABLES_IPV6=y diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index 08f53d86dedc..0d93b243695f 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -13,6 +13,24 @@ capture=0 TEST_COUNT=0 +# generated using "nfbpf_compile '(ip && (ip[54] & 0xf0) == 0x30) || +# (ip6 && (ip6[74] & 0xf0) == 0x30)'" +CBPF_MPTCP_SUBOPTION_ADD_ADDR="14, + 48 0 0 0, + 84 0 0 240, + 21 0 3 64, + 48 0 0 54, + 84 0 0 240, + 21 6 7 48, + 48 0 0 0, + 84 0 0 240, + 21 0 4 96, + 48 0 0 74, + 84 0 0 240, + 21 0 1 48, + 6 0 0 65535, + 6 0 0 0" + init() { capout=$(mktemp) @@ -82,6 +100,26 @@ reset_with_cookies() done } +reset_with_add_addr_timeout() +{ + local ip="${1:-4}" + local tables + + tables="iptables" + if [ $ip -eq 6 ]; then + tables="ip6tables" + fi + + reset + + ip netns exec $ns1 sysctl -q net.mptcp.add_addr_timeout=1 + ip netns exec $ns2 $tables -A OUTPUT -p tcp \ + -m tcp --tcp-option 30 \ + -m bpf --bytecode \ + "$CBPF_MPTCP_SUBOPTION_ADD_ADDR" \ + -j DROP +} + for arg in "$@"; do if [ "$arg" = "-c" ]; then capture=1 @@ -94,6 +132,17 @@ if [ $? -ne 0 ];then exit $ksft_skip fi +iptables -V > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run all tests without iptables tool" + exit $ksft_skip +fi + +ip6tables -V > /dev/null 2>&1 +if [ $? -ne 0 ];then + echo "SKIP: Could not run all tests without ip6tables tool" + exit $ksft_skip +fi check_transfer() { @@ -135,6 +184,7 @@ do_transfer() connect_addr="$5" rm_nr_ns1="$6" rm_nr_ns2="$7" + speed="$8" port=$((10000+$TEST_COUNT)) TEST_COUNT=$((TEST_COUNT+1)) @@ -159,7 +209,7 @@ do_transfer() sleep 1 fi - if [[ $rm_nr_ns1 -eq 0 && $rm_nr_ns2 -eq 0 ]]; then + if [ $speed = "fast" ]; then mptcp_connect="./mptcp_connect -j" else mptcp_connect="./mptcp_connect -r" @@ -250,26 +300,13 @@ run_tests() listener_ns="$1" connector_ns="$2" connect_addr="$3" + rm_nr_ns1="${4:-0}" + rm_nr_ns2="${5:-0}" + speed="${6:-fast}" lret=0 - do_transfer ${listener_ns} ${connector_ns} MPTCP MPTCP ${connect_addr} 0 0 - lret=$? - if [ $lret -ne 0 ]; then - ret=$lret - return - fi -} - -run_remove_tests() -{ - listener_ns="$1" - connector_ns="$2" - connect_addr="$3" - rm_nr_ns1="$4" - rm_nr_ns2="$5" - lret=0 - - do_transfer ${listener_ns} ${connector_ns} MPTCP MPTCP ${connect_addr} ${rm_nr_ns1} ${rm_nr_ns2} + do_transfer ${listener_ns} ${connector_ns} MPTCP MPTCP ${connect_addr} \ + ${rm_nr_ns1} ${rm_nr_ns2} ${speed} lret=$? if [ $lret -ne 0 ]; then ret=$lret @@ -491,12 +528,21 @@ run_tests $ns1 $ns2 10.0.1.1 chk_join_nr "multiple subflows and signal" 3 3 3 chk_add_nr 1 1 +# add_addr timeout +reset_with_add_addr_timeout +ip netns exec $ns1 ./pm_nl_ctl limits 0 1 +ip netns exec $ns2 ./pm_nl_ctl limits 1 1 +ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal +run_tests $ns1 $ns2 10.0.1.1 0 0 slow +chk_join_nr "signal address, ADD_ADDR timeout" 1 1 1 +chk_add_nr 4 0 + # single subflow, remove reset ip netns exec $ns1 ./pm_nl_ctl limits 0 1 ip netns exec $ns2 ./pm_nl_ctl limits 0 1 ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow -run_remove_tests $ns1 $ns2 10.0.1.1 0 1 +run_tests $ns1 $ns2 10.0.1.1 0 1 slow chk_join_nr "remove single subflow" 1 1 1 chk_rm_nr 1 1 @@ -506,7 +552,7 @@ ip netns exec $ns1 ./pm_nl_ctl limits 0 2 ip netns exec $ns2 ./pm_nl_ctl limits 0 2 ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow -run_remove_tests $ns1 $ns2 10.0.1.1 0 2 +run_tests $ns1 $ns2 10.0.1.1 0 2 slow chk_join_nr "remove multiple subflows" 2 2 2 chk_rm_nr 2 2 @@ -515,7 +561,7 @@ reset ip netns exec $ns1 ./pm_nl_ctl limits 0 1 ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal ip netns exec $ns2 ./pm_nl_ctl limits 1 1 -run_remove_tests $ns1 $ns2 10.0.1.1 1 0 +run_tests $ns1 $ns2 10.0.1.1 1 0 slow chk_join_nr "remove single address" 1 1 1 chk_add_nr 1 1 chk_rm_nr 0 0 @@ -526,7 +572,7 @@ ip netns exec $ns1 ./pm_nl_ctl limits 0 2 ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal ip netns exec $ns2 ./pm_nl_ctl limits 1 2 ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow -run_remove_tests $ns1 $ns2 10.0.1.1 1 1 +run_tests $ns1 $ns2 10.0.1.1 1 1 slow chk_join_nr "remove subflow and signal" 2 2 2 chk_add_nr 1 1 chk_rm_nr 1 1 @@ -538,7 +584,7 @@ ip netns exec $ns1 ./pm_nl_ctl add 10.0.2.1 flags signal ip netns exec $ns2 ./pm_nl_ctl limits 1 3 ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow ip netns exec $ns2 ./pm_nl_ctl add 10.0.4.2 flags subflow -run_remove_tests $ns1 $ns2 10.0.1.1 1 2 +run_tests $ns1 $ns2 10.0.1.1 1 2 slow chk_join_nr "remove subflows and signal" 3 3 3 chk_add_nr 1 1 chk_rm_nr 2 2