From patchwork Thu Feb 3 07:25:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733865 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A9CB82C9D for ; Thu, 3 Feb 2022 07:25:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873116; x=1675409116; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=31Rn1fVGrhzOGv3WF3Z/oTSCRdUAxWa4OxdrNJxlX2M=; b=dMwnQ9NKhs+9pCgDLX/HKjCOMM7YgG0OWUWe5qsfW/xJk2wddW7aYMCQ iwsgArHWShwjKmu5JQqir0HAI9J6iFsEmjVjOUu5VYVNhdSdofqRzF7Ed 2Wg7ecXngRmMnDe6buEYCXnv1iYHGCB5dJArJtBcb1ebuy1vyZ0lfS9yU BrcYlktTp8W+XE6nwUEvRUr54tsQfWJVY9MnbuW0khaflHbRoNU+xmOBx zfQhila75P5uZjCezIwl/gP8qojDEEdfta9JY+nSurJmvyKv/3kBFf4kL Sb8z2fA3z/c8AJoGlfiBOjuT27YFvxCFh7Z6q4TSep5dxahjd9viE1LWL w==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580769" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580769" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118714" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 1/8] mptcp: bypass in-kernel PM restrictions for non-kernel PMs Date: Thu, 3 Feb 2022 02:25:01 -0500 Message-Id: <20220203072508.3072309-2-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Current limits on the # of addresses/subflows must apply only to in-kernel PM managed sockets. Thus this change removes such restrictions for connections overseen by non-kernel (e.g. userspace) PMs. This change also ensures that the kernel does not record stats inside struct mptcp_pm_data updated along kernel code paths when exercised by non-kernel PMs. Signed-off-by: Kishen Maloor --- v4: rephrased commit message, add API mptcp_pm_is_kernel(), bypass accounting fo non-kernel PM managed connections --- net/mptcp/pm.c | 6 +++++- net/mptcp/pm_netlink.c | 3 +++ net/mptcp/protocol.h | 9 +++++++-- net/mptcp/subflow.c | 3 ++- 4 files changed, 17 insertions(+), 4 deletions(-) diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 1f8878cc29e3..3e053b759181 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -87,6 +87,9 @@ bool mptcp_pm_allow_new_subflow(struct mptcp_sock *msk) unsigned int subflows_max; int ret = 0; + if (!mptcp_pm_is_kernel(msk)) + return true; + subflows_max = mptcp_pm_get_subflows_max(msk); pr_debug("msk=%p subflows=%d max=%d allow=%d", msk, pm->subflows, @@ -179,7 +182,8 @@ void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk, bool update_subflows; update_subflows = (ssk->sk_state == TCP_CLOSE) && - (subflow->request_join || subflow->mp_join); + (subflow->request_join || subflow->mp_join) && + mptcp_pm_is_kernel(msk); if (!READ_ONCE(pm->work_pending) && !update_subflows) return; diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 93800f32fcb6..bf24c1a74e1d 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -795,6 +795,9 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, if (!removed) continue; + if (!mptcp_pm_is_kernel(msk)) + continue; + if (rm_type == MPTCP_MIB_RMADDR) { msk->pm.add_addr_accepted--; WRITE_ONCE(msk->pm.accept_addr, true); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index f37f087caab3..ac8b57d4f853 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -804,9 +804,14 @@ static inline bool mptcp_pm_should_rm_signal(struct mptcp_sock *msk) return READ_ONCE(msk->pm.addr_signal) & BIT(MPTCP_RM_ADDR_SIGNAL); } -static inline bool mptcp_pm_is_userspace(struct mptcp_sock *msk) +static inline bool mptcp_pm_is_userspace(const struct mptcp_sock *msk) { - return READ_ONCE(msk->pm.pm_type) != MPTCP_PM_TYPE_KERNEL; + return READ_ONCE(msk->pm.pm_type) == MPTCP_PM_TYPE_USERSPACE; +} + +static inline bool mptcp_pm_is_kernel(const struct mptcp_sock *msk) +{ + return READ_ONCE(msk->pm.pm_type) == MPTCP_PM_TYPE_KERNEL; } static inline unsigned int mptcp_add_addr_len(int family, bool echo, bool port) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 88ee94adc38c..8c25a1122bfd 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -62,7 +62,8 @@ static void subflow_generate_hmac(u64 key1, u64 key2, u32 nonce1, u32 nonce2, static bool mptcp_can_accept_new_subflow(const struct mptcp_sock *msk) { return mptcp_is_fully_established((void *)msk) && - READ_ONCE(msk->pm.accept_subflow); + (!mptcp_pm_is_kernel(msk) || + READ_ONCE(msk->pm.accept_subflow)); } /* validate received token and create truncated hmac and nonce for SYN-ACK */ From patchwork Thu Feb 3 07:25:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733864 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB1F12CA1 for ; Thu, 3 Feb 2022 07:25:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873116; x=1675409116; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=t0F+LtZuBQhBzOAqb3kTvTyZmkbo6vTLqs3duOIuOUY=; b=TBl6e94gHuSWLjiAUrdOArLZahDxqUAtwmnMawUpFlbXXk0HL5KSspq+ JHAXjRFFOLKoohW3VLpWA6mqyuujK4rQ6+6NjTPtKBOS3H1YHQlMGsSps 8fTk80cpKfw29HlUpprOvxoa6QmUvLmejST18c1jEc8SWHgFJHmYTTHk8 XxwL/tuDI/+oCC9E+TxqmeMqWuBNEIcpbqcQoIInmpFlrWMD+5HzUqqWN PcfukvfKHEAD79TFAZS0YGze2azVvYxJVMtqpw/+rULMgBE56/mGS6gCt Bhu6/KZeBCJ08EjA/agAO20eE2RtUxOTmckRh/Ea6DEoEbckyAvaLj90C A==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580770" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580770" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118717" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 2/8] mptcp: store remote id from MP_JOIN SYN/ACK in local ctx Date: Thu, 3 Feb 2022 02:25:02 -0500 Message-Id: <20220203072508.3072309-3-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This change reads the addr id assigned to the remote endpoint of a subflow from the MP_JOIN SYN/ACK message and stores it in the related subflow context. The remote id was not being captured prior to this change, and will now provide a consistent view of remote endpoints and their ids as seen through netlink events. Signed-off-by: Kishen Maloor --- net/mptcp/subflow.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 8c25a1122bfd..d3691b95401a 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -444,6 +444,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb) subflow->backup = mp_opt.backup; subflow->thmac = mp_opt.thmac; subflow->remote_nonce = mp_opt.nonce; + subflow->remote_id = mp_opt.join_id; pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u backup=%d", subflow, subflow->thmac, subflow->remote_nonce, subflow->backup); From patchwork Thu Feb 3 07:25:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733867 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E40E72CA1 for ; Thu, 3 Feb 2022 07:25:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873117; x=1675409117; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=lb+/sjxrVOCu6IaiElGnADZNu5LQc1FIjY+PPW9x2pc=; b=dnEPQ73jHTxhagxkymoWEHOgGPva4k4gH4ctxatwxS4f+f/M/QOn/fqm M0TnQUosc2kWgD0ztid755r13EtEbYiOAacDXK+BMXfz9L6xN+31+ufbE o/3Hi8ruwFscW4Mc6tHC/HiLlGX1JL9q60NP/tjj5LwjHuBJSznV67uPK J9KSx3osd+pp9GV0VpGpCn+6ECnT2WGkl4bcAJEnM1pRt8eXjBEk4KjVT gfSfyrVBIFS2uGSBnR3cUx8B4Mq33tA3cneQ53mOplQbR9wwkqs9/Xf62 c1+kfL3SoOpzlSNGQl+Tu3I6WVHCGsMwR8dXXA3dRdVLb0F/cQLrxLolo w==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580772" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580772" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118720" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 3/8] mptcp: reflect remote port (not 0) in ANNOUNCED events Date: Thu, 3 Feb 2022 02:25:03 -0500 Message-Id: <20220203072508.3072309-4-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP SHOULD attempt to connect to the specified address on the same port as the port that is already in use by the subflow on which the ADD_ADDR signal was sent. To facilitate that, this change reflects the specific remote port in use by that subflow in MPTCP_EVENT_ANNOUNCED events. Signed-off-by: Kishen Maloor --- v4: refactor mptcp_pm_add_addr_received() and mptcp_event_addr_announced() to eliminate a param --- net/mptcp/options.c | 2 +- net/mptcp/pm.c | 6 ++++-- net/mptcp/pm_netlink.c | 11 ++++++++--- net/mptcp/protocol.h | 4 ++-- 4 files changed, 15 insertions(+), 8 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 7b615dc10897..6dfaa8e11331 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -1131,7 +1131,7 @@ bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) if ((mp_opt.suboptions & OPTION_MPTCP_ADD_ADDR) && add_addr_hmac_valid(msk, &mp_opt)) { if (!mp_opt.echo) { - mptcp_pm_add_addr_received(msk, &mp_opt.addr); + mptcp_pm_add_addr_received(sk, &mp_opt.addr); MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR); } else { mptcp_pm_add_addr_echoed(msk, &mp_opt.addr); diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 3e053b759181..94f008b2d624 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -200,15 +200,17 @@ void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk, spin_unlock_bh(&pm->lock); } -void mptcp_pm_add_addr_received(struct mptcp_sock *msk, +void mptcp_pm_add_addr_received(const struct sock *ssk, const struct mptcp_addr_info *addr) { + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + struct mptcp_sock *msk = mptcp_sk(subflow->conn); struct mptcp_pm_data *pm = &msk->pm; pr_debug("msk=%p remote_id=%d accept=%d", msk, addr->id, READ_ONCE(pm->accept_addr)); - mptcp_event_addr_announced(msk, addr); + mptcp_event_addr_announced(ssk, addr); spin_lock_bh(&pm->lock); diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index bf24c1a74e1d..ff13012178ae 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -1974,10 +1974,12 @@ void mptcp_event_addr_removed(const struct mptcp_sock *msk, uint8_t id) kfree_skb(skb); } -void mptcp_event_addr_announced(const struct mptcp_sock *msk, +void mptcp_event_addr_announced(const struct sock *ssk, const struct mptcp_addr_info *info) { - struct net *net = sock_net((const struct sock *)msk); + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk); + struct mptcp_sock *msk = mptcp_sk(subflow->conn); + struct net *net = sock_net(ssk); struct nlmsghdr *nlh; struct sk_buff *skb; @@ -1999,7 +2001,10 @@ void mptcp_event_addr_announced(const struct mptcp_sock *msk, if (nla_put_u8(skb, MPTCP_ATTR_REM_ID, info->id)) goto nla_put_failure; - if (nla_put_be16(skb, MPTCP_ATTR_DPORT, info->port)) + if (nla_put_be16(skb, MPTCP_ATTR_DPORT, + info->port == 0 ? + ((struct inet_sock *)inet_sk(ssk))->inet_dport : + info->port)) goto nla_put_failure; switch (info->family) { diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index ac8b57d4f853..4371ac3fbde1 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -751,7 +751,7 @@ void mptcp_pm_subflow_established(struct mptcp_sock *msk); bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk); void mptcp_pm_subflow_check_next(struct mptcp_sock *msk, const struct sock *ssk, const struct mptcp_subflow_context *subflow); -void mptcp_pm_add_addr_received(struct mptcp_sock *msk, +void mptcp_pm_add_addr_received(const struct sock *ssk, const struct mptcp_addr_info *addr); void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk, struct mptcp_addr_info *addr); @@ -780,7 +780,7 @@ int mptcp_pm_remove_subflow(struct mptcp_sock *msk, const struct mptcp_rm_list * void mptcp_event(enum mptcp_event_type type, const struct mptcp_sock *msk, const struct sock *ssk, gfp_t gfp); -void mptcp_event_addr_announced(const struct mptcp_sock *msk, const struct mptcp_addr_info *info); +void mptcp_event_addr_announced(const struct sock *ssk, const struct mptcp_addr_info *info); void mptcp_event_addr_removed(const struct mptcp_sock *msk, u8 id); static inline bool mptcp_pm_should_add_signal(struct mptcp_sock *msk) From patchwork Thu Feb 3 07:25:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733866 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 05FC22C9D for ; Thu, 3 Feb 2022 07:25:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873117; x=1675409117; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=OACnXn1oJrS2hLkA30+p19VlWoYRQSzgjIal5iKRlMc=; b=n2LI3hzQBmGj0mlvcnkxUZM3tQJlFAyiYS1XupP2iqIYz5/sUICDJKP7 x40A0S4iUPZx1TZKWUyzKNIOZVBy0zqccnxT86H/9gywQ4Jrf5+yM8JZr +k+X5LX/dYym+/9aEHzRGQQBHS3/uuaLFiduveVhmu5J3AdZeI47QRxjt V8OW+4Gk+arM/H+LR0kajUgobAz1W77Yte3udkwqupPRWQvwKoZmTf3F/ tCnt99fq+6IXvIAE+jOCgmGbJrCoDkD12nXx0gkDd4207eTgYv6/WtUMk /VUr4Z1s3IWbob7cTSuFsHC9F3IwfNmGtLX+DNCP1RZbpW434dbbQdlBC Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580773" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580773" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118723" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 4/8] mptcp: establish subflows from either end of connection Date: Thu, 3 Feb 2022 02:25:04 -0500 Message-Id: <20220203072508.3072309-5-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This change updates internal logic to permit subflows to be established from either the client or server ends of MPTCP connections. This symmetry and added flexibility may be harnessed by PM implementations running on either end in creating new subflows. The essence of this change lies in not relying on the "server_side" flag (which continues to be available if needed). Signed-off-by: Kishen Maloor --- v2: check for 3rd ACK retransmission only on passive side of the MPJ handshake v3: check for active subflow socket in subflow_simultaneous_connect --- net/mptcp/options.c | 2 +- net/mptcp/protocol.c | 5 +---- net/mptcp/protocol.h | 8 ++++++-- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 6dfaa8e11331..4f56e874c542 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -929,7 +929,7 @@ static bool check_fully_established(struct mptcp_sock *msk, struct sock *ssk, if (TCP_SKB_CB(skb)->seq == subflow->ssn_offset + 1 && TCP_SKB_CB(skb)->end_seq == TCP_SKB_CB(skb)->seq && subflow->mp_join && (mp_opt->suboptions & OPTIONS_MPTCP_MPJ) && - READ_ONCE(msk->pm.server_side)) + !subflow->request_join) tcp_send_ack(ssk); goto fully_established; } diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 3324e1c61576..6142b4b25769 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3255,15 +3255,12 @@ bool mptcp_finish_join(struct sock *ssk) return false; } - if (!msk->pm.server_side) + if (!list_empty(&subflow->node)) goto out; if (!mptcp_pm_allow_new_subflow(msk)) goto err_prohibited; - if (WARN_ON_ONCE(!list_empty(&subflow->node))) - goto err_prohibited; - /* active connections are already on conn_list. * If we can't acquire msk socket lock here, let the release callback * handle it diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 4371ac3fbde1..1a8d09796627 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -908,13 +908,17 @@ static inline bool mptcp_check_infinite_map(struct sk_buff *skb) return false; } +static inline bool is_active_ssk(struct mptcp_subflow_context *subflow) +{ + return (subflow->request_mptcp || subflow->request_join); +} + static inline bool subflow_simultaneous_connect(struct sock *sk) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); - struct sock *parent = subflow->conn; return sk->sk_state == TCP_ESTABLISHED && - !mptcp_sk(parent)->pm.server_side && + is_active_ssk(subflow) && !subflow->conn_finished; } From patchwork Thu Feb 3 07:25:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733868 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 258112CA5 for ; Thu, 3 Feb 2022 07:25:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873118; x=1675409118; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=gxuX5nBfNjm3eDOIguVloVSnnpRcwqw5nK5LrBxglK0=; b=OObXnGMjhHdslNm3V83HkXKeYivrN7ZvBLE8hNiSFMAXr1FC5GMNSMGi MPqdxTjUzgJGKHTSLeLGnTyz0wyVMxBitd6k/l85m6AyfanKO1VW0+KRR oyxRfvYDmHggDB2DiddP1jhIUQK/vtlckxEpD3SUwvhBVp5H2L45NThQH aKbcK6R2Aj44bkd58XQ/DVh39dCtUQCyyqtqStvIY9Z4cVSNa0bbnFuq7 p3VkbmooLxdnFz4lUcbIXmI5qGTDB6RPlt5cu+x0pxWQGeIosdZ2lorWG oQzuG3AEL37W5AsXeonWCFI/AA9XBTTE0PAoeiDgfHuZMVpb8Nx1g2E/d A==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580774" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580774" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118726" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:13 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 5/8] mptcp: netlink: store per namespace list of refcounted listen socks Date: Thu, 3 Feb 2022 02:25:05 -0500 Message-Id: <20220203072508.3072309-6-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The kernel can create listening sockets bound to announced addresses via the ADD_ADDR option for receiving MP_JOIN requests. Path managers may further choose to advertise the same addr+port over multiple MPTCP connections. So this change provides a simple framework to manage a list of all distinct listning sockets created in the kernel over a namespace by encapsulating the socket in a structure that is ref counted and can be shared across multiple connections. The sockets are released when there are no more references. Signed-off-by: Kishen Maloor --- v2: fixed formatting --- net/mptcp/pm_netlink.c | 76 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index ff13012178ae..3d6251baef26 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -22,6 +22,14 @@ static struct genl_family mptcp_genl_family; static int pm_nl_pernet_id; +struct mptcp_local_lsk { + struct list_head list; + struct mptcp_addr_info addr; + struct socket *lsk; + struct rcu_head rcu; + refcount_t refcount; +}; + struct mptcp_pm_addr_entry { struct list_head list; struct mptcp_addr_info addr; @@ -41,7 +49,10 @@ struct mptcp_pm_add_entry { struct pm_nl_pernet { /* protects pernet updates */ spinlock_t lock; + /* protects access to pernet lsk list */ + spinlock_t lsk_list_lock; struct list_head local_addr_list; + struct list_head lsk_list; unsigned int addrs; unsigned int stale_loss_cnt; unsigned int add_addr_signal_max; @@ -83,6 +94,69 @@ static bool addresses_equal(const struct mptcp_addr_info *a, return a->port == b->port; } +static struct mptcp_local_lsk *lsk_list_find(struct pm_nl_pernet *pernet, + struct mptcp_addr_info *addr) +{ + struct mptcp_local_lsk *lsk_ref = NULL; + struct mptcp_local_lsk *i; + + rcu_read_lock(); + + list_for_each_entry_rcu(i, &pernet->lsk_list, list) { + if (addresses_equal(&i->addr, addr, true)) { + if (refcount_inc_not_zero(&i->refcount)) { + lsk_ref = i; + break; + } + } + } + + rcu_read_unlock(); + + return lsk_ref; +} + +static void lsk_list_add_ref(struct mptcp_local_lsk *lsk_ref) +{ + refcount_inc(&lsk_ref->refcount); +} + +static struct mptcp_local_lsk *lsk_list_add(struct pm_nl_pernet *pernet, + struct mptcp_addr_info *addr, + struct socket *lsk) +{ + struct mptcp_local_lsk *lsk_ref; + + lsk_ref = kmalloc(sizeof(*lsk_ref), GFP_ATOMIC); + + if (!lsk_ref) + return NULL; + + lsk_ref->lsk = lsk; + memcpy(&lsk_ref->addr, addr, sizeof(struct mptcp_addr_info)); + refcount_set(&lsk_ref->refcount, 1); + + spin_lock_bh(&pernet->lsk_list_lock); + list_add_rcu(&lsk_ref->list, &pernet->lsk_list); + spin_unlock_bh(&pernet->lsk_list_lock); + + return lsk_ref; +} + +static void lsk_list_release(struct pm_nl_pernet *pernet, + struct mptcp_local_lsk *lsk_ref) +{ + if (lsk_ref && refcount_dec_and_test(&lsk_ref->refcount)) { + sock_release(lsk_ref->lsk); + + spin_lock_bh(&pernet->lsk_list_lock); + list_del_rcu(&lsk_ref->list); + spin_unlock_bh(&pernet->lsk_list_lock); + + kfree_rcu(lsk_ref, rcu); + } +} + static bool address_zero(const struct mptcp_addr_info *addr) { struct mptcp_addr_info zero; @@ -2141,12 +2215,14 @@ static int __net_init pm_nl_init_net(struct net *net) struct pm_nl_pernet *pernet = net_generic(net, pm_nl_pernet_id); INIT_LIST_HEAD_RCU(&pernet->local_addr_list); + INIT_LIST_HEAD_RCU(&pernet->lsk_list); /* Cit. 2 subflows ought to be enough for anybody. */ pernet->subflows_max = 2; pernet->next_id = 1; pernet->stale_loss_cnt = 4; spin_lock_init(&pernet->lock); + spin_lock_init(&pernet->lsk_list_lock); /* No need to initialize other pernet fields, the struct is zeroed at * allocation time. From patchwork Thu Feb 3 07:25:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733869 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 437B32C9D for ; Thu, 3 Feb 2022 07:25:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873119; x=1675409119; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=EtJNFBe+tNKy51yk4DnKNVwStlaZ9uX06yyF25qBonE=; b=k4UbdTwCsJF9/yhXvKYMExTgH8Vg/UYbFI7KzWW19IbR6tFrvE6rkMMx 8OhoDvr80WgRGcewvri60zjyc6VF1UPATe8fZ1BuRJ/vP68eiGKO0kMLO OstzROlZB6DccKsc8BL/4Y0tWPWWWaUQhhZMf2QR9DciVTgFFdQxUqEXw roUG1fbbojA0DaCHymWTcUSUNnxjZelThKwVtmhRM3m3ojD2dTzhy4gpM ac77ezGX55/Izk9F/xWkAwzhtFem/3L03/W2W02Lf7hpQDvTsaBpOzFxu MElNyQE4YTFZ70zj6bpux0B12K5pR+qIfLNMhb9on5B8/dY+WS2+D/mgI Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580775" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580775" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:14 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118729" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:14 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 6/8] mptcp: netlink: store lsk ref in mptcp_pm_addr_entry Date: Thu, 3 Feb 2022 02:25:06 -0500 Message-Id: <20220203072508.3072309-7-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This change updates struct mptcp_pm_addr_entry to store a listening socket (lsk) reference, i.e. a pointer to a reference counted structure containing the lsk (struct socket *) instead of the lsk itself. Code blocks that previously operated on the lsk in struct mptcp_pm_addr_entry have been updated to work with the lsk ref instead, utilizing new helper functions. Signed-off-by: Kishen Maloor --- v2: fixed formatting v3: added helper lsk_list_find_or_create(), updated mptcp_pm_nl_create_listen_socket() to take struct net* as param v4: call lsk_list_find() after a failed lsk_list_find_or_create() for a chance to retrieve a recently created lsk by a simultaneous call v5: fixed implicit declaration error --- net/mptcp/pm_netlink.c | 83 +++++++++++++++++++++++++++++++----------- 1 file changed, 62 insertions(+), 21 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 3d6251baef26..a4fb9acbba51 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -35,7 +35,7 @@ struct mptcp_pm_addr_entry { struct mptcp_addr_info addr; u8 flags; int ifindex; - struct socket *lsk; + struct mptcp_local_lsk *lsk_ref; }; struct mptcp_pm_add_entry { @@ -66,6 +66,10 @@ struct pm_nl_pernet { #define MPTCP_PM_ADDR_MAX 8 #define ADD_ADDR_RETRANS_MAX 3 +static int mptcp_pm_nl_create_listen_socket(struct net *net, + struct mptcp_pm_addr_entry *entry, + struct socket **lsk); + static bool addresses_equal(const struct mptcp_addr_info *a, const struct mptcp_addr_info *b, bool use_port) { @@ -157,6 +161,33 @@ static void lsk_list_release(struct pm_nl_pernet *pernet, } } +static struct mptcp_local_lsk *lsk_list_find_or_create(struct net *net, + struct pm_nl_pernet *pernet, + struct mptcp_pm_addr_entry *entry, + int *createlsk_err) +{ + struct mptcp_local_lsk *lsk_ref; + struct socket *lsk; + int err; + + lsk_ref = lsk_list_find(pernet, &entry->addr); + + if (!lsk_ref) { + err = mptcp_pm_nl_create_listen_socket(net, entry, &lsk); + + if (createlsk_err) + *createlsk_err = err; + + if (lsk) + lsk_ref = lsk_list_add(pernet, &entry->addr, lsk); + + if (lsk && !lsk_ref) + sock_release(lsk); + } + + return lsk_ref; +} + static bool address_zero(const struct mptcp_addr_info *addr) { struct mptcp_addr_info zero; @@ -999,8 +1030,9 @@ static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet, return ret; } -static int mptcp_pm_nl_create_listen_socket(struct sock *sk, - struct mptcp_pm_addr_entry *entry) +static int mptcp_pm_nl_create_listen_socket(struct net *net, + struct mptcp_pm_addr_entry *entry, + struct socket **lsk) { int addrlen = sizeof(struct sockaddr_in); struct sockaddr_storage addr; @@ -1009,12 +1041,12 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, int backlog = 1024; int err; - err = sock_create_kern(sock_net(sk), entry->addr.family, - SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk); + err = sock_create_kern(net, entry->addr.family, + SOCK_STREAM, IPPROTO_MPTCP, lsk); if (err) return err; - msk = mptcp_sk(entry->lsk->sk); + msk = mptcp_sk((*lsk)->sk); if (!msk) { err = -EINVAL; goto out; @@ -1046,7 +1078,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk, return 0; out: - sock_release(entry->lsk); + sock_release(*lsk); + *lsk = NULL; return err; } @@ -1095,7 +1128,7 @@ int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct sock_common *skc) entry->addr.port = 0; entry->ifindex = 0; entry->flags = 0; - entry->lsk = NULL; + entry->lsk_ref = NULL; ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); if (ret < 0) kfree(entry); @@ -1304,18 +1337,25 @@ static int mptcp_nl_cmd_add_addr(struct sk_buff *skb, struct genl_info *info) *entry = addr; if (entry->addr.port) { - ret = mptcp_pm_nl_create_listen_socket(skb->sk, entry); - if (ret) { - GENL_SET_ERR_MSG(info, "create listen socket error"); + entry->lsk_ref = lsk_list_find_or_create(sock_net(skb->sk), pernet, entry, &ret); + + if (!entry->lsk_ref) + entry->lsk_ref = lsk_list_find(pernet, &entry->addr); + + if (!entry->lsk_ref) { + GENL_SET_ERR_MSG(info, "can't create/allocate lsk"); kfree(entry); + ret = (ret == 0) ? -ENOMEM : ret; return ret; } } + ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); + if (ret < 0) { GENL_SET_ERR_MSG(info, "too many addresses or duplicate one"); - if (entry->lsk) - sock_release(entry->lsk); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); return ret; } @@ -1418,10 +1458,11 @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net, } /* caller must ensure the RCU grace period is already elapsed */ -static void __mptcp_pm_release_addr_entry(struct mptcp_pm_addr_entry *entry) +static void __mptcp_pm_release_addr_entry(struct pm_nl_pernet *pernet, + struct mptcp_pm_addr_entry *entry) { - if (entry->lsk) - sock_release(entry->lsk); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); } @@ -1503,7 +1544,7 @@ static int mptcp_nl_cmd_del_addr(struct sk_buff *skb, struct genl_info *info) mptcp_nl_remove_subflow_and_signal_addr(sock_net(skb->sk), &entry->addr); synchronize_rcu(); - __mptcp_pm_release_addr_entry(entry); + __mptcp_pm_release_addr_entry(pernet, entry); return ret; } @@ -1559,7 +1600,7 @@ static void mptcp_nl_remove_addrs_list(struct net *net, } /* caller must ensure the RCU grace period is already elapsed */ -static void __flush_addrs(struct list_head *list) +static void __flush_addrs(struct pm_nl_pernet *pernet, struct list_head *list) { while (!list_empty(list)) { struct mptcp_pm_addr_entry *cur; @@ -1567,7 +1608,7 @@ static void __flush_addrs(struct list_head *list) cur = list_entry(list->next, struct mptcp_pm_addr_entry, list); list_del_rcu(&cur->list); - __mptcp_pm_release_addr_entry(cur); + __mptcp_pm_release_addr_entry(pernet, cur); } } @@ -1592,7 +1633,7 @@ static int mptcp_nl_cmd_flush_addrs(struct sk_buff *skb, struct genl_info *info) spin_unlock_bh(&pernet->lock); mptcp_nl_remove_addrs_list(sock_net(skb->sk), &free_list); synchronize_rcu(); - __flush_addrs(&free_list); + __flush_addrs(pernet, &free_list); return 0; } @@ -2242,7 +2283,7 @@ static void __net_exit pm_nl_exit_net(struct list_head *net_list) * other modifiers, also netns core already waited for a * RCU grace period. */ - __flush_addrs(&pernet->local_addr_list); + __flush_addrs(pernet, &pernet->local_addr_list); } } From patchwork Thu Feb 3 07:25:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733870 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B43CF2CA1 for ; Thu, 3 Feb 2022 07:25:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873119; x=1675409119; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=OM51OlNI4UM843dpo4EL8un5G66rQEcWF4rAEz4dh1o=; b=QgOczU8MULUlvXhQO/cO/7bchyvsXrZ4ij7NZwvuDv9Gg+OpEBP19fNc B9Y10JC/w74a1nZw/6obYrGUj8VsUacfoMX2BuMIIk6cUjtKDN9gpFFNC vQfoIY9FOJr5EHLfEEfxn5Gr8DkuP/qD/gnUWx3KsKsp3iSg6Tz40Jdh/ x7/dout8HJ7Wa2C5+77LW1/M9tqgFapanO2yf1bjkLN06aGhQc4Y0pYVL y9Tw4vvlzqvhraZ6Q5nL0pvCKT0pE98kEB9FWP72UMizOpWgGH57QVDYZ AAu1EcWL0tdCed/3hOM0DaWY0pmaAT3M3QS5GA9MBQSq+Cwr2qvkqWB9/ Q==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580776" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580776" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:14 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118732" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:14 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 7/8] mptcp: attempt to add listening sockets for announced addrs Date: Thu, 3 Feb 2022 02:25:07 -0500 Message-Id: <20220203072508.3072309-8-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 When ADD_ADDR announcements use the port associated with an active subflow, this change ensures that a listening socket is bound to the announced addr+port in the kernel for subsequently receiving MP_JOINs. But if a listening socket for this address is already held by the application then no action is taken. A listening socket is created (when there isn't a listener) just prior to the addr advertisement. If it is desired to not create a listening socket in the kernel for an address, then this can be requested by including the MPTCP_PM_ADDR_FLAG_NO_LISTEN flag with the address. When a listening socket is created, it is stored in struct mptcp_pm_add_entry and released accordingly. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/203 Signed-off-by: Kishen Maloor --- v2: fixed formatting v3: added new addr flag MPTCP_PM_ADDR_FLAG_NO_LISTEN to skip creating a listening socket in the kernel during an ADD_ADDR request, use this flag along the in-kernel PM flow for ADD_ADDR requests (Note: listening sockets are always created for port-based endpoints as before), use the lsk_list_find_or_create() helper v4: call lsk_list_find() after a failed lsk_list_find_or_create() for a chance to retrieve a recently created lsk by a simultaneous call --- include/uapi/linux/mptcp.h | 1 + net/mptcp/pm_netlink.c | 46 ++++++++++++++++++++++++++++++++++++-- 2 files changed, 45 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h index f106a3941cdf..265cabc0d7aa 100644 --- a/include/uapi/linux/mptcp.h +++ b/include/uapi/linux/mptcp.h @@ -81,6 +81,7 @@ enum { #define MPTCP_PM_ADDR_FLAG_SUBFLOW (1 << 1) #define MPTCP_PM_ADDR_FLAG_BACKUP (1 << 2) #define MPTCP_PM_ADDR_FLAG_FULLMESH (1 << 3) +#define MPTCP_PM_ADDR_FLAG_NO_LISTEN (1 << 4) enum { MPTCP_PM_CMD_UNSPEC, diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index a4fb9acbba51..9b3d871d3712 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -43,6 +43,7 @@ struct mptcp_pm_add_entry { struct mptcp_addr_info addr; struct timer_list add_timer; struct mptcp_sock *sock; + struct mptcp_local_lsk *lsk_ref; u8 retrans_times; }; @@ -469,7 +470,8 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk, } static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, - struct mptcp_pm_addr_entry *entry) + struct mptcp_pm_addr_entry *entry, + struct mptcp_local_lsk *lsk_ref) { struct mptcp_pm_add_entry *add_entry = NULL; struct sock *sk = (struct sock *)msk; @@ -489,6 +491,10 @@ static bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, add_entry->addr = entry->addr; add_entry->sock = msk; add_entry->retrans_times = 0; + add_entry->lsk_ref = lsk_ref; + + if (lsk_ref) + lsk_list_add_ref(lsk_ref); timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); sk_reset_timer(sk, &add_entry->add_timer, @@ -501,8 +507,11 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk) { struct mptcp_pm_add_entry *entry, *tmp; struct sock *sk = (struct sock *)msk; + struct pm_nl_pernet *pernet; LIST_HEAD(free_list); + pernet = net_generic(sock_net(sk), pm_nl_pernet_id); + pr_debug("msk=%p", msk); spin_lock_bh(&msk->pm.lock); @@ -511,6 +520,8 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk) list_for_each_entry_safe(entry, tmp, &free_list, list) { sk_stop_timer_sync(sk, &entry->add_timer); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); } } @@ -615,7 +626,9 @@ lookup_id_by_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *add } static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) + __must_hold(&msk->pm.lock) { + struct mptcp_local_lsk *lsk_ref = NULL; struct sock *sk = (struct sock *)msk; struct mptcp_pm_addr_entry *local; unsigned int add_addr_signal_max; @@ -652,12 +665,34 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) local = select_signal_address(pernet, msk); if (local) { - if (mptcp_pm_alloc_anno_list(msk, local)) { + if (!(local->flags & MPTCP_PM_ADDR_FLAG_NO_LISTEN) && + !local->addr.port) { + local->addr.port = + ((struct inet_sock *)inet_sk + ((struct sock *)msk))->inet_sport; + + spin_unlock_bh(&msk->pm.lock); + + lsk_ref = lsk_list_find_or_create(sock_net(sk), pernet, + local, NULL); + + spin_lock_bh(&msk->pm.lock); + + if (!lsk_ref) + lsk_ref = lsk_list_find(pernet, &local->addr); + + local->addr.port = 0; + } + + if (mptcp_pm_alloc_anno_list(msk, local, lsk_ref)) { __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); msk->pm.add_addr_signaled++; mptcp_pm_announce_addr(msk, &local->addr, false); mptcp_pm_nl_addr_send_ack(msk); } + + if (lsk_ref) + lsk_list_release(pernet, lsk_ref); } } @@ -749,6 +784,7 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk, } static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) + __must_hold(&msk->pm.lock) { struct mptcp_addr_info addrs[MPTCP_PM_ADDR_MAX]; struct sock *sk = (struct sock *)msk; @@ -1389,11 +1425,17 @@ int mptcp_pm_get_flags_and_ifindex_by_id(struct net *net, unsigned int id, static bool remove_anno_list_by_saddr(struct mptcp_sock *msk, struct mptcp_addr_info *addr) { + struct sock *sk = (struct sock *)msk; struct mptcp_pm_add_entry *entry; + struct pm_nl_pernet *pernet; + + pernet = net_generic(sock_net(sk), pm_nl_pernet_id); entry = mptcp_pm_del_add_timer(msk, addr, false); if (entry) { list_del(&entry->list); + if (entry->lsk_ref) + lsk_list_release(pernet, entry->lsk_ref); kfree(entry); return true; } From patchwork Thu Feb 3 07:25:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kishen Maloor X-Patchwork-Id: 12733871 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7CE5E2CA5 for ; Thu, 3 Feb 2022 07:25:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1643873120; x=1675409120; h=from:to:subject:date:message-id:in-reply-to:references: mime-version:content-transfer-encoding; bh=GWC00DGcOIXpenWBtT7VMMPLeYZ8SNQrpkt8jAQUe5k=; b=XoXVE9Qxnhj0cT+y+BLfYOaTwFVbg3wyXJWUG1ZVdFOXU0+9fpN4r4Jx xrY1IHrPH+6dLPNy96NLpGdMjNpmsUkD2wa/QtQ0v7zgGMJts6d7dI3Bg qKLhsJ8KcQiFM57BY6jw9VEaVsCC9YKghexqIMzwrAZEwo3aJ2GtNn3oN E9D6plVjFQ+vAJjOUxuTiL1YlxcXNpo0qvvko/Hwh+arwiaQ8Pfz2y0Ch JNaSNTm5F+V3Xst3TWYW3MotbYrPHTLoY1D0Wy17DTjkMdufM23zhu04s 4d6YyPAIb9/IRu3SC7jtj2ufr/b3X8LhVsDlcxPSrDUyZFv4Dlh5/5M79 A==; X-IronPort-AV: E=McAfee;i="6200,9189,10246"; a="272580777" X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="272580777" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:14 -0800 X-IronPort-AV: E=Sophos;i="5.88,339,1635231600"; d="scan'208";a="535118736" Received: from otc-tsn-4.jf.intel.com ([10.23.153.135]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Feb 2022 23:25:14 -0800 From: Kishen Maloor To: kishen.maloor@intel.com, mptcp@lists.linux.dev Subject: [PATCH mptcp-next v5 8/8] mptcp: expose server_side attribute in MPTCP netlink events Date: Thu, 3 Feb 2022 02:25:08 -0500 Message-Id: <20220203072508.3072309-9-kishen.maloor@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220203072508.3072309-1-kishen.maloor@intel.com> References: <20220203072508.3072309-1-kishen.maloor@intel.com> Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 This change records the server_side attribute in MPTCP_EVENT_CREATED and MPTCP_EVENT_ESTABLISHED events to inform the recipient of the role of the associated MPTCP application (Client/Server) that is handling it's end of the MPTCP connection. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/246 Signed-off-by: Kishen Maloor --- include/uapi/linux/mptcp.h | 1 + net/mptcp/pm_netlink.c | 3 +++ 2 files changed, 4 insertions(+) diff --git a/include/uapi/linux/mptcp.h b/include/uapi/linux/mptcp.h index 265cabc0d7aa..0df44a116a31 100644 --- a/include/uapi/linux/mptcp.h +++ b/include/uapi/linux/mptcp.h @@ -188,6 +188,7 @@ enum mptcp_event_attr { MPTCP_ATTR_IF_IDX, /* s32 */ MPTCP_ATTR_RESET_REASON,/* u32 */ MPTCP_ATTR_RESET_FLAGS, /* u32 */ + MPTCP_ATTR_SERVER_SIDE, /* u8 */ __MPTCP_ATTR_AFTER_LAST }; diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 9b3d871d3712..eaa1a5a21192 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -2097,6 +2097,9 @@ static int mptcp_event_created(struct sk_buff *skb, if (err) return err; + if (nla_put_u8(skb, MPTCP_ATTR_SERVER_SIDE, READ_ONCE(msk->pm.server_side))) + return -EMSGSIZE; + return mptcp_event_add_subflow(skb, ssk); }