From patchwork Thu Feb 3 03:13:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733801 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CD4532C9D for ; Thu, 3 Feb 2022 03:13:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643858022; x=1675394022; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=31Rn1fVGrhzOGv3WF3Z/oTSCRdUAxWa4OxdrNJxlX2M=; b=mko7COpC3LIdxu0SQNvhDqlB5ewtceVfqsohzQyYUe3jz1THTt42FSUr 8jMCTf/QZ/HZJYlaZpKeODn235gydkPfCzpQXikzN5QDFzfiYLpJVdu6k 1KVhHzRmH7tV2KIdn8GeYV/gHjutUCZWqQdBm9r/7OAcNMPUX+JdgXsa/ mIB8kT5oPD0IlT6SqVk5QfdSCqbrQRDCgyXejLx01TDNbBPYMlcO47UiR z8I6DgXD4YhmheOykUXjBi/oNWitJjwQBsfNEust+5yujHezyq69BSvgo A0f53A3VxJzO3hPTIR5ywe10piFnxL52a7l4KaRpD234Lq8vdBujccRtj g==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795477" X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="308795477" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658250" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:39 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v4 1/8] mptcp: bypass in-kernel PM restrictions for non-kernel PMs Date: Wed, 2 Feb 2022 22:13:24 -0500 Message-Id: <20220203031331.2996457-2-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com> References: 
<20220203031331.2996457-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Current limits on the # of addresses/subflows must apply only to in-kernel PM managed sockets. Thus this change removes such restrictions for connections overseen by non-kernel (e.g. userspace) PMs. This change also ensures that the kernel does not record stats inside struct mptcp_pm_data updated along kernel code paths when exercised by non-kernel PMs. Signed-off-by: Kishen Maloor --- v4: rephrased commit message, add API mptcp_pm_is_kernel(), bypass accounting for non-kernel PM managed connections --- net/mptcp/pm.c | 6 +++++- net/mptcp/pm_netlink.c | 3 +++ net/mptcp/protocol.h | 9 +++++++-- net/mptcp/subflow.c | 3 ++- 4 files changed, 17 insertions(+), 4 deletions(-) diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 1f8878cc29e3..3e053b759181 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -87,6 +87,9 @@ bool mptcp_pm_allow_new_subflow(struct mptcp_sock *msk) unsigned int subflows_max; int ret = 0; + if (!mptcp_pm_is_kernel(msk)) + return true; + subflows_max = mptcp_pm_get_subflows_max(msk); pr_debug("msk=%p subflows=%d max=%d allow=%d", msk, pm->subflows, @@ -179,7 +182,8 @@ void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk, bool update_subflows; update_subflows = (ssk->sk_state == TCP_CLOSE) && - (subflow->request_join || subflow->mp_join); + (subflow->request_join || subflow->mp_join) && + mptcp_pm_is_kernel(msk); if (!READ_ONCE(pm->work_pending) && !update_subflows) return; diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 93800f32fcb6..bf24c1a74e1d 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -795,6 +795,9 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, if (!removed) continue; + if (!mptcp_pm_is_kernel(msk)) + continue; + if (rm_type == MPTCP_MIB_RMADDR) { msk->pm.add_addr_accepted--; 
WRITE_ONCE(msk->pm.accept_addr, true); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index f37f087caab3..ac8b57d4f853 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -804,9 +804,14 @@ static inline bool mptcp_pm_should_rm_signal(struct mptcp_sock *msk) return READ_ONCE(msk->pm.addr_signal) & BIT(MPTCP_RM_ADDR_SIGNAL); } -static inline bool mptcp_pm_is_userspace(struct mptcp_sock *msk) +static inline bool mptcp_pm_is_userspace(const struct mptcp_sock *msk) { - return READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL; + return READ_ONCE(msk->pm.pm_type) == MPTCP_PM_TYPE_USERSPACE; +} + +static inline bool mptcp_pm_is_kernel(const struct mptcp_sock *msk) +{ + return READ_ONCE(msk->pm.pm_type) == MPTCP_PM_TYPE_KERNEL; } static inline unsigned int mptcp_add_addr_len(int family, bool echo, bool port) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 88ee94adc38c..8c25a1122bfd 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -62,7 +62,8 @@ static void subflow_generate_hmac(u64 key1, u64 key2, u32 nonce1, u32 nonce2, static bool mptcp_can_accept_new_subflow(const struct mptcp_sock *msk) { return mptcp_is_fully_established((void *)msk) && - READ_ONCE(msk->pm.accept_subflow); + (!mptcp_pm_is_kernel(msk) || + READ_ONCE(msk->pm.accept_subflow)); } /* validate received token and create truncated hmac and nonce for SYN-ACK */ From patchwork Thu Feb 3 03:13:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733800 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E9B822CA1 for ; Thu, 3 Feb 2022 03:13:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643858022; x=1675394022; 
h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=t0F+LtZuBQhBzOAqb3kTvTyZmkbo6vTLqs3duOIuOUY=; b=U+FQPvOKaC23ciWDyiWPcbA/fzqd9NeEFzeBVDMTDbt+RMj8oR6lyQeF qg03m2U7AWHjbqn7M+knfcDI4hm/hIbCTMup/KJiuETU9R/0rz+EguxqY d5GG1aOzwLkyLs2ZKopIpKgHnZRvqjAIZanuWGBORNoWIK2bX6fZIchl2 dL3u61HN9qmrh3IltPW1NvRZ/riSdKjQEn9gr85eODzVi6j9eakajKia8 w3Rp7swQewqYEF/mSP/SvLWduzz7JoOuZMd1lADIZFrnQbFd7tWsEZtJl ThrIQtDmMg6OVLlzpu/4EqjzSaJkGso/V90M/v5HbPWnVBfT48X2FV3RW A==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795478" X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="308795478" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658253" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:39 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v4 2/8] mptcp: store remote id from MP_JOIN SYN/ACK in local ctx Date: Wed, 2 Feb 2022 22:13:25 -0500 Message-Id: <20220203031331.2996457-3-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com> References: <20220203031331.2996457-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This change reads the addr id assigned to the remote endpoint of a subflow from the MP_JOIN SYN/ACK message and stores it in the related subflow context. The remote id was not being captured prior to this change, and will now provide a consistent view of remote endpoints and their ids as seen through netlink events. 
Signed-off-by: Kishen Maloor --- net/mptcp/subflow.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 8c25a1122bfd..d3691b95401a 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -444,6 +444,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb) subflow->backup = mp_opt.backup; subflow->thmac = mp_opt.thmac; subflow->remote_nonce = mp_opt.nonce; + subflow->remote_id = mp_opt.join_id; pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u backup=%d", subflow, subflow->thmac, subflow->remote_nonce, subflow->backup); From patchwork Thu Feb 3 03:13:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733802 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2756D2CA1 for ; Thu, 3 Feb 2022 03:13:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643858024; x=1675394024; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=lb+/sjxrVOCu6IaiElGnADZNu5LQc1FIjY+PPW9x2pc=; b=D+LE5D633rF7r3Beofjs+6UHdmmvJryeW7TW64VlrvM15ZYZaiVhTQSr vpvK+a+7JVfmmb+JuFmNYRtXtxB1qo1WDQ2MrUZAGIYGg6zPQclCTYkCV DMeaUl0o62Zo7/eJ7mn8UmRPO8CYZXBmtdyZ9GdYAvbSrMqaaj7n19nCY nKwqiJqPYqMKt4OmB8TmynGuiamGHXj5I4X/rn0HzcDOIfD3Qwy/Lha89 xIqqMylawPamgtug42NRvI/B1G1bNvxvUt2CkKscyj0M5Kile5X4nvx/F cJjxz1SEjUWnQ5UqV94Y8I/5SqhS8tQZGiBos6v2nBxTI4Xq1dXU7mg7/ Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795479" X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="308795479" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 
X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658257" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v4 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events Date: Wed, 2 Feb 2022 22:13:26 -0500 Message-Id: <20220203031331.2996457-4-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com> References: <20220203031331.2996457-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP SHOULD attempt to connect to the specified address on the same port as the port that is already in use by the subflow on which the ADD_ADDR signal was sent. To facilitate that, this change reflects the specific remote port in use by that subflow in MPTCP_EVENT_ANNOUNCED events. 
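The port selection described above can be sketched in isolation. This is a minimal userspace illustration, not kernel code: announced_dport() is a hypothetical helper name, and both arguments are assumed to be in the same byte order (the kernel fields are __be16).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): pick the destination port
 * to report in an ANNOUNCED event. A port of 0 in the ADD_ADDR option
 * means "no port specified", so fall back to the remote port of the
 * subflow that carried the ADD_ADDR.
 */
static uint16_t announced_dport(uint16_t info_port, uint16_t subflow_dport)
{
	return info_port == 0 ? subflow_dport : info_port;
}
```

With this fallback a consumer of the event always sees a usable port and does not have to look up the subflow itself.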
Signed-off-by: Kishen Maloor --- v4: refactor mptcp_pm_add_addr_received() and mptcp_event_addr_announced() to eliminate a param --- net/mptcp/options.c | 2 +- net/mptcp/pm.c | 6 ++++-- net/mptcp/pm_netlink.c | 11 ++++++++--- net/mptcp/protocol.h | 4 ++-- 4 files changed, 15 insertions(+), 8 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 7b615dc10897..6dfaa8e11331 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -1131,7 +1131,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) if ((mp_opt.suboptions & OPTION_MPTCP_ADD_ADDR) && add_addr_hmac_valid(msk, &mp_opt)) { if (!mp_opt.echo) { - mptcp_pm_add_addr_received(msk, &mp_opt.addr); + mptcp_pm_add_addr_received(sk, &mp_opt.addr); MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR); } else { mptcp_pm_add_addr_echoed(msk, &mp_opt.addr); diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 3e053b759181..94f008b2d624 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -200,15 +200,17 @@ void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk, spin_unlock_bh(&pm->lock); } -void mptcp_pm_add_addr_received(struct mptcp_sock *msk, +void mptcp_pm_add_addr_received(const struct sock *ssk, const struct mptcp_addr_info *addr) { + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + struct mptcp_sock *msk = mptcp_sk(subflow->conn); struct mptcp_pm_data *pm = &msk->pm; pr_debug("msk=%p remote_id=%d accept=%d", msk, addr->id, READ_ONCE(pm->accept_addr)); - mptcp_event_addr_announced(msk, addr); + mptcp_event_addr_announced(ssk, addr); spin_lock_bh(&pm->lock); diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index bf24c1a74e1d..ff13012178ae 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -1974,10 +1974,12 @@ void mptcp_event_addr_removed(const struct mptcp_sock *msk, uint8_t id) kfree_skb(skb); } -void mptcp_event_addr_announced(const struct mptcp_sock *msk, +void mptcp_event_addr_announced(const struct sock 
*ssk, const struct mptcp_addr_info *info) { - struct net *net = sock_net((const struct sock *)msk); + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + struct mptcp_sock *msk = mptcp_sk(subflow->conn); + struct net *net = sock_net(ssk); struct nlmsghdr *nlh; struct sk_buff *skb; @@ -1999,7 +2001,10 @@ void mptcp_event_addr_announced(const struct mptcp_sock *msk, if (nla_put_u8(skb, MPTCP_ATTR_REM_ID, info->id)) goto nla_put_failure; - if (nla_put_be16(skb, MPTCP_ATTR_DPORT, info->port)) + if (nla_put_be16(skb, MPTCP_ATTR_DPORT, + info->port == 0 ? + ((struct inet_sock *)inet_sk(ssk))->inet_dport : + info->port)) goto nla_put_failure; switch (info->family) { diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index ac8b57d4f853..4371ac3fbde1 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -751,7 +751,7 @@ void mptcp_pm_subflow_established(struct mptcp_sock *msk); bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk); void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk, const struct mptcp_subflow_context *subflow); -void mptcp_pm_add_addr_received(struct mptcp_sock *msk, +void mptcp_pm_add_addr_received(const struct sock *ssk, const struct mptcp_addr_info *addr); void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk, struct mptcp_addr_info *addr); @@ -780,7 +780,7 @@ int mptcp_pm_remove_subflow(struct mptcp_sock *msk, const struct mptcp_rm_list * void mptcp_event(enum mptcp_event_type type, const struct mptcp_sock *msk, const struct sock *ssk, gfp_t gfp); -void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info); +void mptcp_event_addr_announced(const struct sock *ssk, const struct mptcp_addr_info *info); void mptcp_event_addr_removed(const struct mptcp_sock *msk, u8 id); static inline bool mptcp_pm_should_add_signal(struct mptcp_sock *msk) From patchwork Thu Feb 3 03:13:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733803 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 453D32CA4 for ; Thu, 3 Feb 2022 03:13:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643858024; x=1675394024; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=OACnXn1oJrS2hLkA30+p19VlWoYRQSzgjIal5iKRlMc=; b=X4eWZ8PTHW7Z7N/It3Lcb+UR2ZUbBgMTtPqOOmRGdVqPmsZd5/LTih/M 1R5qjfo6YrKQPtfPrJND5CW8l9OwGCV1zH2fnF+8hXGlejrb0wflIjX68 rbF4YI+xCH9PZVLsWSwvZn+otjsgY9JzVH5Vh9K318cQYUZQ6cdHfWPXR lTZrEsQOQwDosSJ8sB7Oe/JV1i+N6bBEBuaRbNKFvihiq3dYGP3NpUDcb YMf2EKg9VYDGFK7ksDmv87/9CAr8Yoh89b7Q7AnKYvI+p4P5rAZybF8YL TBr2D2iTXrCWC7oQrnCwt9SMJGxHGcg/A6AM6mbSO6wBGAwwJTpjut61+ g==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795480" X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="308795480" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658260" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v4 4/8] mptcp: establish subflows from either end of connection Date: Wed, 2 Feb 2022 22:13:27 -0500 Message-Id: <20220203031331.2996457-5-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com> References: <20220203031331.2996457-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev 
List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This change updates internal logic to permit subflows to be established from either the client or server ends of MPTCP connections. This symmetry and added flexibility may be harnessed by PM implementations running on either end in creating new subflows. The essence of this change lies in not relying on the "server_side" flag (which continues to be available if needed). Signed-off-by: Kishen Maloor --- v2: check for 3rd ACK retransmission only on passive side of the MPJ handshake v3: check for active subflow socket in subflow_simultaneous_connect --- net/mptcp/options.c | 2 +- net/mptcp/protocol.c | 5 +---- net/mptcp/protocol.h | 8 ++++++-- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 6dfaa8e11331..4f56e874c542 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -929,7 +929,7 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk, if (TCP_SKB_CB(skb)->seq == subflow->ssn_offset + 1 && TCP_SKB_CB(skb)->end_seq == TCP_SKB_CB(skb)->seq && subflow->mp_join && (mp_opt->suboptions & OPTIONS_MPTCP_MPJ) && - READ_ONCE(msk->pm.server_side)) + !subflow->request_join) tcp_send_ack(ssk); goto fully_established; } diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3324e1c61576..6142b4b25769 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3255,15 +3255,12 @@ bool mptcp_finish_join(struct sock *ssk) return false; } - if (!msk->pm.server_side) + if (!list_empty(&subflow->node)) goto out; if (!mptcp_pm_allow_new_subflow(msk)) goto err_prohibited; - if (WARN_ON_ONCE(!list_empty(&subflow->node))) - goto err_prohibited; - /* active connections are already on conn_list. 
* If we can't acquire msk socket lock here, let the release callback * handle it diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 4371ac3fbde1..1a8d09796627 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -908,13 +908,17 @@ static inline bool mptcp_check_infinite_map(struct sk_buff *skb) return false; } +static inline bool is_active_ssk(struct mptcp_subflow_context *subflow) +{ + return (subflow->request_mptcp || subflow->request_join); +} + static inline bool subflow_simultaneous_connect(struct sock *sk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); - struct sock *parent = subflow->conn; return sk->sk_state == TCP_ESTABLISHED && - !mptcp_sk(parent)->pm.server_side && + is_active_ssk(subflow) && !subflow->conn_finished; } From patchwork Thu Feb 3 03:13:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733804 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A75C2CA6 for ; Thu, 3 Feb 2022 03:13:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643858024; x=1675394024; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=gxuX5nBfNjm3eDOIguVloVSnnpRcwqw5nK5LrBxglK0=; b=Viw74cj12VPQDiioVJANyuCA1CURjAzmHnwt0GRvQfNXU9qfZKNdtah9 nv57L6VMF1+OYFIH82UCTHJBWsHmpJNJpN9XWnhPbzFNAD4mqzCyHdKFL yxrdkLy9TSf3JaJpaJwuHzm86ne/02CcEs4V9I9NhV2yCkO7TMduV+5QI HieKiF9A8xHKh5WOIFaJ1c/GTrnIlGGFu6WiCiS3HgYHRTbb+gaeFK8B+ BPYDBno+Y0Vo6MKdK1w8c5MgecKMsTDCptEznYzGlmYWGTlswo6kuahKo 0bWYv8hvqspkWxvu9rVelNSpGYlOFWQLiTvOh7LR+Q06fiGUMYdAdDpjq g==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795481" X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; 
d="scan'208";a="308795481" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658263" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v4 5/8] mptcp: netlink: store per namespace list of refcounted listen socks Date: Wed, 2 Feb 2022 22:13:28 -0500 Message-Id: <20220203031331.2996457-6-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com> References: <20220203031331.2996457-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The kernel can create listening sockets bound to announced addresses via the ADD_ADDR option for receiving MP_JOIN requests. Path managers may further choose to advertise the same addr+port over multiple MPTCP connections. So this change provides a simple framework to manage a list of all distinct listening sockets created in the kernel within a namespace by encapsulating each socket in a structure that is ref counted and can be shared across multiple connections. The sockets are released when there are no more references. 
Signed-off-by: Kishen Maloor --- v2: fixed formatting --- net/mptcp/pm_netlink.c | 76 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index ff13012178ae..3d6251baef26 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -22,6 +22,14 @@ static struct genl_family mptcp_genl_family; static int pm_nl_pernet_id; +struct mptcp_local_lsk { + struct list_head list; + struct mptcp_addr_info addr; + struct socket *lsk; + struct rcu_head rcu; + refcount_t refcount; +}; + struct mptcp_pm_addr_entry { struct list_head list; struct mptcp_addr_info addr; @@ -41,7 +49,10 @@ struct mptcp_pm_add_entry { struct pm_nl_pernet { /* protects pernet updates */ spinlock_t lock; + /* protects access to pernet lsk list */ + spinlock_t lsk_list_lock; struct list_head local_addr_list; + struct list_head lsk_list; unsigned int addrs; unsigned int stale_loss_cnt; unsigned int add_addr_signal_max; @@ -83,6 +94,69 @@ static bool addresses_equal(const struct mptcp_addr_info *a, return a->port == b->port; } +static struct mptcp_local_lsk *lsk_list_find(struct pm_nl_pernet *pernet, + struct mptcp_addr_info *addr) +{ + struct mptcp_local_lsk *lsk_ref = NULL; + struct mptcp_local_lsk *i; + + rcu_read_lock(); + + list_for_each_entry_rcu(i, &pernet->lsk_list, list) { + if (addresses_equal(&i->addr, addr, true)) { + if (refcount_inc_not_zero(&i->refcount)) { + lsk_ref = i; + break; + } + } + } + + rcu_read_unlock(); + + return lsk_ref; +} + +static void lsk_list_add_ref(struct mptcp_local_lsk *lsk_ref) +{ + refcount_inc(&lsk_ref->refcount); +} + +static struct mptcp_local_lsk *lsk_list_add(struct pm_nl_pernet *pernet, + struct mptcp_addr_info *addr, + struct socket *lsk) +{ + struct mptcp_local_lsk *lsk_ref; + + lsk_ref = kmalloc(sizeof(*lsk_ref), GFP_ATOMIC); + + if (!lsk_ref) + return NULL; + + lsk_ref->lsk = lsk; + memcpy(&lsk_ref->addr, addr, sizeof(struct mptcp_addr_info)); + 
refcount_set(&lsk_ref->refcount, 1); + + spin_lock_bh(&pernet->lsk_list_lock); + list_add_rcu(&lsk_ref->list, &pernet->lsk_list); + spin_unlock_bh(&pernet->lsk_list_lock); + + return lsk_ref; +} + +static void lsk_list_release(struct pm_nl_pernet *pernet, + struct mptcp_local_lsk *lsk_ref) +{ + if (lsk_ref && refcount_dec_and_test(&lsk_ref->refcount)) { + sock_release(lsk_ref->lsk); + + spin_lock_bh(&pernet->lsk_list_lock); + list_del_rcu(&lsk_ref->list); + spin_unlock_bh(&pernet->lsk_list_lock); + + kfree_rcu(lsk_ref, rcu); + } +} + static bool address_zero(const struct mptcp_addr_info *addr) { struct mptcp_addr_info zero; @@ -2141,12 +2215,14 @@ static int __net_init pm_nl_init_net(struct net *net) struct pm_nl_pernet *pernet = net_generic(net, pm_nl_pernet_id); INIT_LIST_HEAD_RCU(&pernet->local_addr_list); + INIT_LIST_HEAD_RCU(&pernet->lsk_list); /* Cit. 2 subflows ought to be enough for anybody. */ pernet->subflows_max = 2; pernet->next_id = 1; pernet->stale_loss_cnt = 4; spin_lock_init(&pernet->lock); + spin_lock_init(&pernet->lsk_list_lock); /* No need to initialize other pernet fields, the struct is zeroed at * allocation time. 
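The find/add/release life cycle introduced by this patch can be modelled in userspace as below. This is a simplified single-threaded sketch: it has no RCU and no lsk_list_lock, an int descriptor stands in for struct socket *, and the address is IPv4 only. All names (addr_model, lsk_model, model_find() and so on) are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mptcp_addr_info (IPv4 only). */
struct addr_model {
	uint32_t ip;
	uint16_t port;
};

/* Stand-in for struct mptcp_local_lsk; fd replaces struct socket *. */
struct lsk_model {
	struct lsk_model *next;
	struct addr_model addr;
	int fd;
	unsigned int refcount;
};

/* Stand-in for the per-namespace list; released counts "sock_release" calls. */
struct lsk_list_model {
	struct lsk_model *head;
	int released;
};

static bool addr_equal(const struct addr_model *a, const struct addr_model *b)
{
	return a->ip == b->ip && a->port == b->port;
}

/* Like lsk_list_find(): on a match, take an extra reference and return it. */
static struct lsk_model *model_find(struct lsk_list_model *l,
				    const struct addr_model *a)
{
	for (struct lsk_model *i = l->head; i; i = i->next) {
		if (addr_equal(&i->addr, a) && i->refcount > 0) {
			i->refcount++;
			return i;
		}
	}
	return NULL;
}

/* Like lsk_list_add(): a new entry starts with exactly one reference. */
static struct lsk_model *model_add(struct lsk_list_model *l,
				   const struct addr_model *a, int fd)
{
	struct lsk_model *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->addr = *a;
	e->fd = fd;
	e->refcount = 1;
	e->next = l->head;
	l->head = e;
	return e;
}

/* Like lsk_list_release(): the last reference unlinks and frees the entry. */
static void model_release(struct lsk_list_model *l, struct lsk_model *e)
{
	if (!e)
		return;
	assert(e->refcount > 0);
	if (--e->refcount > 0)
		return;
	for (struct lsk_model **pp = &l->head; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			break;
		}
	}
	l->released++;	/* the kernel code calls sock_release() here */
	free(e);
}
```

Two connections announcing the same addr+port would share one entry: the first calls model_add(), the second gets the same pointer from model_find(), and the socket is only released once both have called model_release().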
From patchwork Thu Feb 3 03:13:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733805 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3517A2C9D for ; Thu, 3 Feb 2022 03:13:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643858025; x=1675394025; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=g/x4Lyd02IWROGIZPOELi9G/bO8vO8QW8MjnKsw/N/0=; b=IFX6c4lJTV7K37H8fO6wZ5QgdyVJk05lCMGz4dFGuBwmIRciNgUIjnp8 SGk0/QKufyeahit01S6aeSVKxAZF55eydQq26fzGCbrc5lkE7QB1uKVDD VpkPIOkXtic2Mh9NGSU+ae7FdwDjd5rGe1XhWcrw45/tGmyLQlmxSItsa 3K+/3y6mOUYS2uGxMIXdxBb1OSpFacYt5Juzvb4xpp/xralJK7KyE/vH8 EIYSV/loJ2cLR/9StR6Vq6Kf0+v/qfnBflwR7soMOkeUE23c7H5w9GfQy B95/cKLDtyEIdGSBx3yH9iKkrmSaihbt1Cb0FQTYf2vHJWruhN9EgCIEU g==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795482" X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="308795482" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658266" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 19:13:40 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v4 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry Date: Wed, 2 Feb 2022 22:13:29 -0500 Message-Id: <20220203031331.2996457-7-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com> References: 
<20220203031331.2996457-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This change updates struct mptcp_pm_addr_entry to store a listening socket (lsk) reference, i.e. a pointer to a reference counted structure containing the lsk (struct socket *) instead of the lsk itself. Code blocks that previously operated on the lsk in struct mptcp_pm_addr_entry have been updated to work with the lsk ref instead, utilizing new helper functions. Signed-off-by: Kishen Maloor --- v2: fixed formatting v3: added helper lsk_list_find_or_create(), updated mptcp_pm_nl_create_listen_socket() to take struct net* as param v4: call lsk_list_find() after a failed lsk_list_find_or_create() for a chance to retrieve a recently created lsk by a simultaneous call --- net/mptcp/pm_netlink.c | 79 +++++++++++++++++++++++++++++++----------- 1 file changed, 58 insertions(+), 21 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 3d6251baef26..4c9567db56ff 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -35,7 +35,7 @@ struct mptcp_pm_addr_entry { struct mptcp_addr_info addr; u8 flags; int ifindex; - struct socket *lsk; + struct mptcp_local_lsk *lsk_ref; }; struct mptcp_pm_add_entry { @@ -157,6 +157,33 @@ static void lsk_list_release(struct pm_nl_pernet *pernet, } } +static struct mptcp_local_lsk *lsk_list_find_or_create(struct net *net, + struct pm_nl_pernet *pernet, + struct mptcp_pm_addr_entry *entry, + int *createlsk_err) +{ + struct mptcp_local_lsk *lsk_ref; + struct socket *lsk; + int err; + + lsk_ref = lsk_list_find(pernet, &entry->addr); + + if (!lsk_ref) { + err = mptcp_pm_nl_create_listen_socket(net, entry, &lsk); + + if (createlsk_err) + *createlsk_err = err; + + if (lsk) + lsk_ref = lsk_list_add(pernet, &entry->addr, lsk); + + if (lsk && !lsk_ref) + sock_release(lsk); + } + + return lsk_ref; +} + static bool address_zero(const struct 
mptcp_addr_info *addr) { struct mptcp_addr_info zero; @@ -999,8 +1026,9 @@ static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet, return ret; } -static int mptcp_pm_nl_create_listen_socket(struct sock *sk, - struct mptcp_pm_addr_entry *entry) +static int mptcp_pm_nl_create_listen_socket(struct net *net, + struct mptcp_pm_addr_entry *entry, + struct socket **lsk) { int addrlen = sizeof(struct sockaddr_in); struct sockaddr_storage addr; @@ -1009,12 +1037,12 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, int backlog = 1024; int err; - err = sock_create_kern(sock_net(sk), entry->addr.family, - SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk); + err = sock_create_kern(net, entry->addr.family, + SOCK_STREAM, IPPROTO_MPTCP, lsk); if (err) return err; - msk = mptcp_sk(entry->lsk->sk); + msk = mptcp_sk((*lsk)->sk); if (!msk) { err = -EINVAL; goto out; @@ -1046,7 +1074,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, return 0; out: - sock_release(entry->lsk); + sock_release(*lsk); + *lsk = NULL; return err; } @@ -1095,7 +1124,7 @@ int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct sock_common *skc) entry->addr.port = 0; entry->ifindex = 0; entry->flags = 0; - entry->lsk = NULL; + entry->lsk_ref = NULL; ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); if (ret < 0) kfree(entry); @@ -1304,18 +1333,25 @@ static int mptcp_nl_cmd_add_addr(struct sk_buff *skb, struct genl_info *info) *entry = addr; if (entry->addr.port) { - ret = mptcp_pm_nl_create_listen_socket(skb->sk, entry); - if (ret) { - GENL_SET_ERR_MSG(info, "create listen socket error"); + entry->lsk_ref = lsk_list_find_or_create(sock_net(skb->sk), pernet, entry, &ret); + + if (!entry->lsk_ref) + entry->lsk_ref = lsk_list_find(pernet, &entry->addr); + + if (!entry->lsk_ref) { + GENL_SET_ERR_MSG(info, "can't create/allocate lsk"); kfree(entry); + ret = (ret == 0) ? 
-ENOMEM : ret; return ret; } } + ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); + if (ret < 0) { GENL_SET_ERR_MSG(info, "too many addresses or duplicate one"); - if (entry->lsk) - sock_release(entry->lsk); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); return ret; } @@ -1418,10 +1454,11 @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net, } /* caller must ensure the RCU grace period is already elapsed */ -static void __mptcp_pm_release_addr_entry(struct mptcp_pm_addr_entry *entry) +static void __mptcp_pm_release_addr_entry(struct pm_nl_pernet *pernet, + struct mptcp_pm_addr_entry *entry) { - if (entry->lsk) - sock_release(entry->lsk); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); } @@ -1503,7 +1540,7 @@ static int mptcp_nl_cmd_del_addr(struct sk_buff *skb, struct genl_info *info) mptcp_nl_remove_subflow_and_signal_addr(sock_net(skb->sk), &entry->addr); synchronize_rcu(); - __mptcp_pm_release_addr_entry(entry); + __mptcp_pm_release_addr_entry(pernet, entry); return ret; } @@ -1559,7 +1596,7 @@ static void mptcp_nl_remove_addrs_list(struct net *net, } /* caller must ensure the RCU grace period is already elapsed */ -static void __flush_addrs(struct list_head *list) +static void __flush_addrs(struct pm_nl_pernet *pernet, struct list_head *list) { while (!list_empty(list)) { struct mptcp_pm_addr_entry *cur; @@ -1567,7 +1604,7 @@ static void __flush_addrs(struct list_head *list) cur = list_entry(list->next, struct mptcp_pm_addr_entry, list); list_del_rcu(&cur->list); - __mptcp_pm_release_addr_entry(cur); + __mptcp_pm_release_addr_entry(pernet, cur); } } @@ -1592,7 +1629,7 @@ static int mptcp_nl_cmd_flush_addrs(struct sk_buff *skb, struct genl_info *info) spin_unlock_bh(&pernet->lock); mptcp_nl_remove_addrs_list(sock_net(skb->sk), &free_list); synchronize_rcu(); - __flush_addrs(&free_list); + __flush_addrs(pernet, &free_list); return 0; } @@ -2242,7 +2279,7 @@ static void 
__net_exit pm_nl_exit_net(struct list_head *net_list)
 		 * other modifiers, also netns core already waited for a
 		 * RCU grace period.
 		 */
-		__flush_addrs(&pernet->local_addr_list);
+		__flush_addrs(pernet, &pernet->local_addr_list);
 	}
 }
 

From patchwork Thu Feb 3 03:13:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kishen Maloor
X-Patchwork-Id: 12733807
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 44BDC2CA4
 for ; Thu, 3 Feb 2022 03:13:45 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
 d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
 t=1643858025; x=1675394025;
 h=from:to:subject:date:message-id:in-reply-to:references:
 mime-version:content-transfer-encoding;
 bh=wZFF3l+YSojCqKSBtZHHf5PFM+bc+2bVtw2OO5X4ovc=;
 b=X4foyi4f5K1XEdGUZZQ4tuFduBzfGTaUTS0zh4x3ZXFOJdVbl3CBEga6
 ucJydGyhno67Xh1HKdF2XUNHIEBDdv25ZWh51JN+WMKnl1bZ08MHRCL3W
 4CJcOs/ZS1tAcaTAhmDQF6oDOQ5dFyT43TcKdZVWHDTkHCcVTQWve5Vk3
 btNu0fihYzli43PjdaHDJpcIkyZ8UuOHVz5p6XHkE7AIsE+Di9OgyMbcG
 dFHLxkW8YkU31Rw7KdJH2fR2X5m5ZqVvyx5/D6bQzISx8rm+v2N+FvJpB
 H9XLjWv09C6f7NT9wHh0q+BEcCz1hC0eqmhyp5rqI0o5Mskj2srQ/Tf9l
 w==;
X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795483"
X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="308795483"
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
 by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 02 Feb 2022 19:13:40 -0800
X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658271"
Received: from otc-tsn-4.jf.intel.com ([10.23.153.135])
 by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 02 Feb 2022 19:13:40 -0800
From: Kishen Maloor
To: kishen.maloor@intel.com, mptcp@lists.linux.dev
Subject: [PATCH mptcp-next v4 7/8] mptcp: attempt to add listening sockets
 for announced addrs
Date: Wed, 2 Feb 2022 22:13:30 -0500
Message-Id: <20220203031331.2996457-8-kishen.maloor@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com>
References: <20220203031331.2996457-1-kishen.maloor@intel.com>
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

When ADD_ADDR announcements use the port associated with an active
subflow, this change ensures that a listening socket is bound to the
announced addr+port in the kernel for subsequently receiving MP_JOINs.
But if a listening socket for this address is already held by the
application then no action is taken.

A listening socket is created (when there isn't a listener) just prior
to the addr advertisement. If it is desired to not create a listening
socket in the kernel for an address, then this can be requested by
including the MPTCP_PM_ADDR_FLAG_NO_LISTEN flag with the address.

When a listening socket is created, it is stored in struct
mptcp_pm_add_entry and released accordingly.
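
For readers following the description above, here is a minimal userspace
sketch of the two ideas it relies on: the condition under which the
in-kernel PM asks for a listener, and a reference-counted find-or-create
lookup. Only MPTCP_PM_ADDR_FLAG_NO_LISTEN and the flag/port condition come
from this patch. Every demo_* name, the fixed-size table, and the assumption
that a successful lookup returns with a reference already held are
illustrative and are not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value taken from the uapi hunk of this patch. */
#define MPTCP_PM_ADDR_FLAG_NO_LISTEN (1 << 4)

/* Hypothetical stand-in for a refcounted listener entry. */
struct demo_lsk {
	uint16_t port;
	int refcount;	/* 0 means the slot is free */
};

#define DEMO_LSK_MAX 4

/* Hypothetical per-netns table; zero-initialize before use. */
struct demo_lsk_table {
	struct demo_lsk slot[DEMO_LSK_MAX];
};

/* Mirrors the condition in mptcp_pm_create_subflow_or_signal_addr():
 * a listener is requested only when NO_LISTEN is clear and the endpoint
 * carries no explicit port.
 */
static bool demo_wants_listener(uint8_t flags, uint16_t port)
{
	return !(flags & MPTCP_PM_ADDR_FLAG_NO_LISTEN) && !port;
}

/* Return the entry for @port with one extra reference held (assumption),
 * creating it in a free slot if needed. NULL when the table is full.
 */
static struct demo_lsk *demo_lsk_find_or_create(struct demo_lsk_table *t,
						uint16_t port)
{
	struct demo_lsk *free_slot = NULL;
	size_t i;

	for (i = 0; i < DEMO_LSK_MAX; i++) {
		struct demo_lsk *l = &t->slot[i];

		if (l->refcount > 0 && l->port == port) {
			l->refcount++;
			return l;
		}
		if (l->refcount == 0 && !free_slot)
			free_slot = l;
	}
	if (!free_slot)
		return NULL;

	free_slot->port = port;
	free_slot->refcount = 1;
	return free_slot;
}

/* Drop one reference; the slot becomes reusable when the count hits 0. */
static void demo_lsk_release(struct demo_lsk *l)
{
	if (l->refcount > 0)
		l->refcount--;
}
```

In this model each announce-list entry and each caller holds its own
reference, so the listener goes away only after the last holder releases it.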
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/203
Signed-off-by: Kishen Maloor
---
v2: fixed formatting
v3: added new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a
    listening socket in the kernel during an ADD_ADDR request, use this
    flag along the in-kernel PM flow for ADD_ADDR requests (Note:
    listening sockets are always created for port-based endpoints as
    before), use the lsk_list_find_or_create() helper
v4: call lsk_list_find() after a failed lsk_list_find_or_create() for a
    chance to retrieve a recently created lsk by a simultaneous call
---
 include/uapi/linux/mptcp.h |  1 +
 net/mptcp/pm_netlink.c     | 50 ++++++++++++++++++++++++++++++++++++--
 2 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index f106a3941cdf..265cabc0d7aa 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -81,6 +81,7 @@ enum {
 #define MPTCP_PM_ADDR_FLAG_SUBFLOW			(1 << 1)
 #define MPTCP_PM_ADDR_FLAG_BACKUP			(1 << 2)
 #define MPTCP_PM_ADDR_FLAG_FULLMESH			(1 << 3)
+#define MPTCP_PM_ADDR_FLAG_NO_LISTEN			(1 << 4)
 
 enum {
 	MPTCP_PM_CMD_UNSPEC,
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 4c9567db56ff..9b3d871d3712 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -43,6 +43,7 @@ struct mptcp_pm_add_entry {
 	struct mptcp_addr_info	addr;
 	struct timer_list	add_timer;
 	struct mptcp_sock	*sock;
+	struct mptcp_local_lsk	*lsk_ref;
 	u8			retrans_times;
 };
 
@@ -66,6 +67,10 @@ struct pm_nl_pernet {
 #define MPTCP_PM_ADDR_MAX	8
 #define ADD_ADDR_RETRANS_MAX	3
 
+static int mptcp_pm_nl_create_listen_socket(struct net *net,
+					    struct mptcp_pm_addr_entry *entry,
+					    struct socket **lsk);
+
 static bool addresses_equal(const struct mptcp_addr_info *a,
 			    const struct mptcp_addr_info *b, bool use_port)
 {
@@ -465,7 +470,8 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk,
 }
 
 static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk,
-				     struct mptcp_pm_addr_entry *entry)
+				     struct
mptcp_pm_addr_entry *entry, + struct mptcp_local_lsk *lsk_ref) { struct mptcp_pm_add_entry *add_entry = NULL; struct sock *sk = (struct sock *)msk; @@ -485,6 +491,10 @@ static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, add_entry->addr = entry->addr; add_entry->sock = msk; add_entry->retrans_times = 0; + add_entry->lsk_ref = lsk_ref; + + if (lsk_ref) + lsk_list_add_ref(lsk_ref); timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); sk_reset_timer(sk, &add_entry->add_timer, @@ -497,8 +507,11 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk) { struct mptcp_pm_add_entry *entry, *tmp; struct sock *sk = (struct sock *)msk; + struct pm_nl_pernet *pernet; LIST_HEAD(free_list); + pernet = net_generic(sock_net(sk), pm_nl_pernet_id); + pr_debug("msk=%p", msk); spin_lock_bh(&msk->pm.lock); @@ -507,6 +520,8 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk) list_for_each_entry_safe(entry, tmp, &free_list, list) { sk_stop_timer_sync(sk, &entry->add_timer); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); } } @@ -611,7 +626,9 @@ lookup_id_by_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *add } static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) + __must_hold(&msk->pm.lock) { + struct mptcp_local_lsk *lsk_ref = NULL; struct sock *sk = (struct sock *)msk; struct mptcp_pm_addr_entry *local; unsigned int add_addr_signal_max; @@ -648,12 +665,34 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) local = select_signal_address(pernet, msk); if (local) { - if (mptcp_pm_alloc_anno_list(msk, local)) { + if (!(local->flags & MPTCP_PM_ADDR_FLAG_NO_LISTEN) && + !local->addr.port) { + local->addr.port = + ((struct inet_sock *)inet_sk + ((struct sock *)msk))->inet_sport; + + spin_unlock_bh(&msk->pm.lock); + + lsk_ref = lsk_list_find_or_create(sock_net(sk), pernet, + local, NULL); + + spin_lock_bh(&msk->pm.lock); + + if (!lsk_ref) + lsk_ref = lsk_list_find(pernet, 
&local->addr); + + local->addr.port = 0; + } + + if (mptcp_pm_alloc_anno_list(msk, local, lsk_ref)) { __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); msk->pm.add_addr_signaled++; mptcp_pm_announce_addr(msk, &local->addr, false); mptcp_pm_nl_addr_send_ack(msk); } + + if (lsk_ref) + lsk_list_release(pernet, lsk_ref); } } @@ -745,6 +784,7 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk, } static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) + __must_hold(&msk->pm.lock) { struct mptcp_addr_info addrs[MPTCP_PM_ADDR_MAX]; struct sock *sk = (struct sock *)msk; @@ -1385,11 +1425,17 @@ int mptcp_pm_get_flags_and_ifindex_by_id(struct net *net, unsigned int id, static bool remove_anno_list_by_saddr(struct mptcp_sock *msk, struct mptcp_addr_info *addr) { + struct sock *sk = (struct sock *)msk; struct mptcp_pm_add_entry *entry; + struct pm_nl_pernet *pernet; + + pernet = net_generic(sock_net(sk), pm_nl_pernet_id); entry = mptcp_pm_del_add_timer(msk, addr, false); if (entry) { list_del(&entry->list); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); return true; } From patchwork Thu Feb 3 03:13:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733806 Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A19062CA6 for ; Thu, 3 Feb 2022 03:13:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643858025; x=1675394025; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=GWC00DGcOIXpenWBtT7VMMPLeYZ8SNQrpkt8jAQUe5k=; b=c60xaXqVJ7aOJZ4PBDO9IBu17bs/zOHjAfzqLMMfLcUrnXquPOVpANgo 
 Hwsld/uZwI2zPQO8Ckr6yL+RL4HfjmgB+nT/YL+mVPeKM2GLF8llPcnaI
 DK1JABQtAuQCLzljpVIAxBrkmk19hmoTWSYC3ch5s5o9ldbb5iC8vI9AR
 oMrQcQi1y1gzT7D7Bt1lg9uBNtNRaSiWhV+j30HuJPewuQxI6QXZNmznc
 YqTyONZKmpOpVkNjrQR32KTHs1m7H2CK/VFKKLiZYmBT5mWzabdshhCus
 ggebu1DMHdc/ablJAk9CPM94bJy+lO0W0WNpRSD+VzxwD1UEQKiJL7BQi
 Q==;
X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="308795486"
X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="308795486"
Received: from fmsmga005.fm.intel.com ([10.253.24.32])
 by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 02 Feb 2022 19:13:41 -0800
X-IronPort-AV: E=Sophos;i="5.88,338,1635231600"; d="scan'208";a="771658274"
Received: from otc-tsn-4.jf.intel.com ([10.23.153.135])
 by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 02 Feb 2022 19:13:40 -0800
From: Kishen Maloor
To: kishen.maloor@intel.com, mptcp@lists.linux.dev
Subject: [PATCH mptcp-next v4 8/8] mptcp: expose server_side attribute in
 MPTCP netlink events
Date: Wed, 2 Feb 2022 22:13:31 -0500
Message-Id: <20220203031331.2996457-9-kishen.maloor@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220203031331.2996457-1-kishen.maloor@intel.com>
References: <20220203031331.2996457-1-kishen.maloor@intel.com>
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0

This change records the server_side attribute in MPTCP_EVENT_CREATED
and MPTCP_EVENT_ESTABLISHED events to inform the recipient of the role
of the associated MPTCP application (Client/Server) that is handling
its end of the MPTCP connection.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/246
Signed-off-by: Kishen Maloor
---
 include/uapi/linux/mptcp.h | 1 +
 net/mptcp/pm_netlink.c     | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h
index 265cabc0d7aa..0df44a116a31 100644
--- a/include/uapi/linux/mptcp.h
+++ b/include/uapi/linux/mptcp.h
@@ -188,6 +188,7 @@ enum mptcp_event_attr {
 	MPTCP_ATTR_IF_IDX,	/* s32 */
 	MPTCP_ATTR_RESET_REASON,/* u32 */
 	MPTCP_ATTR_RESET_FLAGS, /* u32 */
+	MPTCP_ATTR_SERVER_SIDE,	/* u8 */
 
 	__MPTCP_ATTR_AFTER_LAST
 };
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 9b3d871d3712..eaa1a5a21192 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -2097,6 +2097,9 @@ static int mptcp_event_created(struct sk_buff *skb,
 
 	if (err)
 		return err;
 
+	if (nla_put_u8(skb, MPTCP_ATTR_SERVER_SIDE, READ_ONCE(msk->pm.server_side)))
+		return -EMSGSIZE;
+
 	return mptcp_event_add_subflow(skb, ssk);
 }
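
As a rough illustration of how a recipient of these events might read the
new u8 attribute, the sketch below walks a buffer of netlink-style
attributes (a 16-bit length, a 16-bit type, then the payload padded to 4
bytes) and extracts a one-byte value. It is a self-contained model only:
the demo_* helpers and the DEMO_ATTR_SERVER_SIDE number are hypothetical.
A real listener would take MPTCP_ATTR_SERVER_SIDE from <linux/mptcp.h> and
would normally rely on a netlink library for parsing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type number, used only so the sketch is self-contained. */
#define DEMO_ATTR_SERVER_SIDE 100

/* Netlink attributes are padded to a 4-byte boundary. */
#define DEMO_ALIGN(n) (((n) + 3u) & ~3u)

/* Same field layout as a netlink attribute header: length, then type. */
struct demo_attr_hdr {
	uint16_t len;	/* header plus payload, excluding padding */
	uint16_t type;
};

/* Append a u8 attribute at @off; returns the new offset, or @off
 * unchanged when the buffer has no room.
 */
static size_t demo_put_u8(uint8_t *buf, size_t cap, size_t off,
			  uint16_t type, uint8_t val)
{
	struct demo_attr_hdr h = { .len = sizeof(h) + 1, .type = type };
	size_t total = DEMO_ALIGN(h.len);

	if (off > cap || total > cap - off)
		return off;

	memset(buf + off, 0, total);
	memcpy(buf + off, &h, sizeof(h));
	buf[off + sizeof(h)] = val;
	return off + total;
}

/* Find the first attribute of @type and copy its first payload byte to
 * @out. Returns false if it is absent or the buffer is malformed.
 */
static bool demo_get_u8(const uint8_t *buf, size_t buflen,
			uint16_t type, uint8_t *out)
{
	size_t off = 0;

	while (buflen - off >= sizeof(struct demo_attr_hdr)) {
		struct demo_attr_hdr h;
		size_t step;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > buflen - off)
			return false;
		if (h.type == type && h.len >= sizeof(h) + 1) {
			*out = buf[off + sizeof(h)];
			return true;
		}
		step = DEMO_ALIGN(h.len);
		if (step > buflen - off)
			break;
		off += step;
	}
	return false;
}
```

A consumer would treat a non-zero value as "this end accepted the
connection" and zero as "this end initiated it", which matches the
Client/Server wording of the commit message.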