From patchwork Wed Nov 6 18:10:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov via B4 Relay X-Patchwork-Id: 13865261 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CCFA0208226; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918995; cv=none; b=aMzPf5v5LpALxWENJJma/9XM0QnGpEJ159BJR42tKi/91w2ZZy5unZsknOfMRfygqE/srGvWwRGf9PRHDUbB8C2XqvchfcR1rXFvgQjKKG1a015hsCTNsudfTMBkv+X6wzg+IFEUW/Np7D+DkMLbFv2zIhy2C1Xgm9cMQQsbkPQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918995; c=relaxed/simple; bh=N5LhYQ8LNuTU5/eVJ0fCcC0c8UEr5xiNw6btgM649Co=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ux+a8XYnFJu8Xg3vwKRtKBvhQl5h86/Bbu5R/Hq+uMtQ0M3j9ky8Y2g+Ssn4/ga66fzfLXIPR1CXGanmYmMbQrr4xHKnpwJojM56FgWEVhWEXoLqwOf8+/x7mqKD8xuQ3kBLjJQg9GGmfxfe3zN/FibRAYdDYHqclcI6MRomipg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=bQVMNPP2; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="bQVMNPP2" Received: by smtp.kernel.org (Postfix) with ESMTPS id 6BEBAC4CED0; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730918995; bh=N5LhYQ8LNuTU5/eVJ0fCcC0c8UEr5xiNw6btgM649Co=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=bQVMNPP272pg2AuJsWmI3GTaYGSpNUHUMfQPR+1lFSkx/6dk/XYrQSRWdnJBzJH54 9Ip0iAtDU/fAS13yOa/WZSvjVXtonw0bZBSgd+i2knKOLdNSj+gPABfJIgeJ7lXSOY GwkV4SF1xLDH0G+bpq6MNIqcgW7W4Et8Ih6R9ZTQIQglTewBF/KTrvuOvsxxwiP/7c atE+IB0+Q8WsD3y/0lM1YCixHcakB9iSBpsdHTFQ15EcxsiC7uvUqV2/LBQEB0EkzP E5aCh/FqxgB11UzA5/3hT/DoVIjcj9JxhOkRTL9KocBrpSIl0FnoaOxyANF+RXpVaa imQtuWo8XtvWA== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A02ED59F60; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) From: Dmitry Safonov via B4 Relay Date: Wed, 06 Nov 2024 18:10:14 +0000 Subject: [PATCH net 1/6] net/diag: Do not race on dumping MD5 keys with adding new MD5 keys Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241106-tcp-md5-diag-prep-v1-1-d62debf3dded@gmail.com> References: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> In-Reply-To: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , David Ahern , Ivan Delalande , Matthieu Baerts , Mat Martineau , Geliang Tang , Boris Pismenny , John Fastabend Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, mptcp@lists.linux.dev, Dmitry Safonov <0x7f454c46@gmail.com>, stable@vger.kernel.org X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1730918993; l=7958; i=0x7f454c46@gmail.com; s=20240410; h=from:subject:message-id; bh=k7l0ErglAWcq8Z7BhR+blvFMIj4mj3UET5TdAev1Vas=; b=4kkrARfKmGSYKInaqlsVGNk0snyMmMBem2SNfiGNydT3FoRdsvs0cQUXKI6dyaJsHqx3VfWo8 37jHk+dNc8hCPzikFbm19qQsZVziJVGSjSwVFO5/cY12wXVymJ30GFk X-Developer-Key: i=0x7f454c46@gmail.com; a=ed25519; pk=cFSWovqtkx0HrT5O9jFCEC/Cef4DY8a2FPeqP4THeZQ= X-Endpoint-Received: by B4 Relay for 0x7f454c46@gmail.com/20240410 with auth_id=152 X-Original-From: Dmitry Safonov <0x7f454c46@gmail.com> Reply-To: 0x7f454c46@gmail.com From: Dmitry Safonov <0x7f454c46@gmail.com> Inet-diag has two modes: (1) dumping information for a specific socket, for which kernel creates one netlink message with the information and (2) dumping information for multiple sockets (possibly with a filter), where for the reply kernel sends many messages, one for each matched socket. Currently those two modes work differently as the information about a specific socket is never split between multiple messages. For (2), multi-socket dump for the reply kernel allocates up to 32Kb skb and fills that with as many socket dumps as possible. For (1), one-socket dump kernel pre-calculates the required space for the reply, allocates a new skb and nlmsg and only then starts filling the socket's details. Preallocating the needed size quite makes sense as most of the details are fix-sized and provided for each socket, see inet_sk_attr_size(). But there's an exception: .idiag_get_aux_size() which is optional for a socket. This is provided only for TCP sockets by tcp_diag. For TCP-MD5 it calculates the memory needed to fill an array of (struct tcp_diag_md5sig). The issue here is that the amount of keys may change in inet_diag_dump_one_icsk() between inet_sk_attr_size() and sk_diag_fill() calls. As the code expects fix-sized information on any socket, it considers sk_diag_fill() failures by -EMSGSIZE reason as a bug, resulting in such WARN_ON(): [] ------------[ cut here ]------------ [] WARNING: CPU: 7 PID: 17420 at net/ipv4/inet_diag.c:586 inet_diag_dump_one_icsk+0x3c8/0x420 [] CPU: 7 UID: 0 PID: 17420 Comm: diag_ipv4 Tainted: G W 6.11.0-rc6-00022-gc9fd7a9f9aca-dirty #2 [] pc : inet_diag_dump_one_icsk+0x3c8/0x420 [] lr : inet_diag_dump_one_icsk+0x1d4/0x420 [] sp : ffff8000aef87460 ... [] Call trace: [] inet_diag_dump_one_icsk+0x3c8/0x420 [] tcp_diag_dump_one+0xa0/0xf0 [] inet_diag_cmd_exact+0x234/0x278 [] inet_diag_handler_cmd+0x16c/0x288 [] sock_diag_rcv_msg+0x1a8/0x550 [] netlink_rcv_skb+0x198/0x378 [] sock_diag_rcv+0x20/0x48 [] netlink_unicast+0x400/0x6a8 [] netlink_sendmsg+0x654/0xa58 [] __sys_sendto+0x1ec/0x330 [] __arm64_sys_sendto+0xc8/0x168 ... [] ---[ end trace 0000000000000000 ]--- One way to solve it would be to grab lock_sock() in inet_diag_dump_one_icsk(), but that may be costly and bring new lock dependencies. The alternative is to call tcp_diag_put_md5sig() as the last attribute of the netlink message and calculate how much space left after all previous attributes filled and translate it into (struct tcp_diag_md5sig)-sized units. If it turns out that there's not enough space for all TCP-MD5 keys, mark the dump as inconsistent by setting NLM_F_DUMP_INTR flag. Userspace may figure out that dumping raced with the socket properties change and retry again. Currently it may be unexpected by userspace that netlink message for one socket may be inconsistent, but I believe we're on a safe side from breaking userspace as previously dump would fail and an ugly WARN was produced in dmesg. IOW, it is a clear improvement. This is not a theoretical issue: I've written a test and that reproduces the issue I suspected (the backtrace above). Fixes: c03fa9bcacd9 ("tcp_diag: report TCP MD5 signing keys and addresses") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> --- include/linux/inet_diag.h | 3 ++- net/ipv4/inet_diag.c | 8 +++---- net/ipv4/tcp_diag.c | 55 ++++++++++++++++++++++++++++------------------- 3 files changed, 39 insertions(+), 27 deletions(-) diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h index a9033696b0aad36ab9abd47e4b68e272053019d7..cb2ba672eba131986d0432dd628fc42bbf800886 100644 --- a/include/linux/inet_diag.h +++ b/include/linux/inet_diag.h @@ -22,7 +22,8 @@ struct inet_diag_handler { int (*idiag_get_aux)(struct sock *sk, bool net_admin, - struct sk_buff *skb); + struct sk_buff *skb, + struct nlmsghdr *nlh); size_t (*idiag_get_aux_size)(struct sock *sk, bool net_admin); diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 67639309163d05c034fad80fc9a6096c3b79d42f..67b9cc4c0e47a596a4d588e793b7f13ee040a1e3 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -350,10 +350,6 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk, handler->idiag_get_info(sk, r, info); - if (ext & (1 << (INET_DIAG_INFO - 1)) && handler->idiag_get_aux) - if (handler->idiag_get_aux(sk, net_admin, skb) < 0) - goto errout; - if (sk->sk_state < TCP_TIME_WAIT) { union tcp_cc_info info; size_t sz = 0; @@ -368,6 +364,10 @@ int inet_sk_diag_fill(struct sock *sk, struct inet_connection_sock *icsk, goto errout; } + if (ext & (1 << (INET_DIAG_INFO - 1)) && handler->idiag_get_aux) + if (handler->idiag_get_aux(sk, net_admin, skb, nlh) < 0) + goto errout; + /* Keep it at the end for potential retry with a larger skb, * or else do best-effort fitting, which is only done for the * first_nlmsg. diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c index f428ecf9120f2f596e1d67db2b2a0d0d0e211905..36606a19b451f059e32c58c0d76a878dc9be5ff0 100644 --- a/net/ipv4/tcp_diag.c +++ b/net/ipv4/tcp_diag.c @@ -53,29 +53,39 @@ static void tcp_diag_md5sig_fill(struct tcp_diag_md5sig *info, } static int tcp_diag_put_md5sig(struct sk_buff *skb, - const struct tcp_md5sig_info *md5sig) + const struct tcp_md5sig_info *md5sig, + struct nlmsghdr *nlh) { + size_t key_size = sizeof(struct tcp_diag_md5sig); + unsigned int attrlen, md5sig_count; const struct tcp_md5sig_key *key; struct tcp_diag_md5sig *info; struct nlattr *attr; - int md5sig_count = 0; + /* + * Userspace doesn't like to see zero-filled key-values, so + * allocating too large attribute is bad. + */ hlist_for_each_entry_rcu(key, &md5sig->head, node) md5sig_count++; if (md5sig_count == 0) return 0; - attr = nla_reserve(skb, INET_DIAG_MD5SIG, - md5sig_count * sizeof(struct tcp_diag_md5sig)); + attrlen = skb_availroom(skb) - NLA_HDRLEN; + md5sig_count = min(md5sig_count, attrlen / key_size); + attr = nla_reserve(skb, INET_DIAG_MD5SIG, md5sig_count * key_size); if (!attr) return -EMSGSIZE; info = nla_data(attr); - memset(info, 0, md5sig_count * sizeof(struct tcp_diag_md5sig)); + memset(info, 0, md5sig_count * key_size); hlist_for_each_entry_rcu(key, &md5sig->head, node) { - tcp_diag_md5sig_fill(info++, key); - if (--md5sig_count == 0) + /* More keys on a socket than pre-allocated space available */ + if (md5sig_count-- == 0) { + nlh->nlmsg_flags |= NLM_F_DUMP_INTR; break; + } + tcp_diag_md5sig_fill(info++, key); } return 0; @@ -110,25 +120,11 @@ static int tcp_diag_put_ulp(struct sk_buff *skb, struct sock *sk, } static int tcp_diag_get_aux(struct sock *sk, bool net_admin, - struct sk_buff *skb) + struct sk_buff *skb, struct nlmsghdr *nlh) { struct inet_connection_sock *icsk = inet_csk(sk); int err = 0; -#ifdef CONFIG_TCP_MD5SIG - if (net_admin) { - struct tcp_md5sig_info *md5sig; - - rcu_read_lock(); - md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); - if (md5sig) - err = tcp_diag_put_md5sig(skb, md5sig); - rcu_read_unlock(); - if (err < 0) - return err; - } -#endif - if (net_admin) { const struct tcp_ulp_ops *ulp_ops; @@ -138,6 +134,21 @@ static int tcp_diag_get_aux(struct sock *sk, bool net_admin, if (err) return err; } + +#ifdef CONFIG_TCP_MD5SIG + if (net_admin) { + struct tcp_md5sig_info *md5sig; + + rcu_read_lock(); + md5sig = rcu_dereference(tcp_sk(sk)->md5sig_info); + if (md5sig) + err = tcp_diag_put_md5sig(skb, md5sig, nlh); + rcu_read_unlock(); + if (err < 0) + return err; + } +#endif + return 0; } From patchwork Wed Nov 6 18:10:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov via B4 Relay X-Patchwork-Id: 13865259 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA735208219; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918995; cv=none; b=VwY3l8KEWMo08CO8B40bfPMA/aKwmDg8MzREV2mffjyT4r1U8SrXINnMIvfQiueO6DGw4cdCKukPmwbZsCUk5HZq9672ulskzmo/iIgTOM2zFC7E+lk73rTOvQGa+labbDxogxqbFIDCft3AB275qDqq8uLeUdfyVD2Tlud5LQQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918995; c=relaxed/simple; bh=PTBVL2tPmPN6P0/h38z8Y2oc6jqzgBOjiS1aI01AKeM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=MhgQwbLvrOVfrmS/wQLP2K+tGvEDEFeWTWuCcOBGHuGjiCFkSc78Pno3ZG7MoafQJNW4RD8DfrCPvv0rfSNVE9uQT49H6uu89US+6EbxgGcuLLfFDEdW/hn97TECH4MANM3ggfhTNTVIwsmXSpNGK80d1rrgtQODrMbmWH4Pl/c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IUCgvukC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IUCgvukC" Received: by smtp.kernel.org (Postfix) with ESMTPS id 7CE1CC4CECD; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730918995; bh=PTBVL2tPmPN6P0/h38z8Y2oc6jqzgBOjiS1aI01AKeM=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=IUCgvukCE2xslOF/m4R+aX/b1Mo4ERsVCh22l19vBv/Pv1OUXioO/sJf33MH4ApjH kETYe1FfmJMKGCzG5J9Mu4Pb1zLlEAM6WvtYe7FP/yldtARcgjz/AHMxB9Rn4HKBzC miOFydWNx7l5xqv9xCsBUln6yFKeRmC/haMKodW54jwiomzNYV/BWg0BDldXXseX+g zPg3qg6LYFVrI54ZhhuQF/gUEZL9FfRLeue7gCn2RCKoL9BEisoKhPhX22qgDV7o/h h2+MTjTFyF+kl79lEtLq5OrlMTF6RDftRYaeWgIkYCX9xaA9iR8e0Bdl/LsqOvIvdG fszOAv0aWMTVg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C268D59F65; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) From: Dmitry Safonov via B4 Relay Date: Wed, 06 Nov 2024 18:10:15 +0000 Subject: [PATCH net 2/6] net/diag: Warn only once on EMSGSIZE Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241106-tcp-md5-diag-prep-v1-2-d62debf3dded@gmail.com> References: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> In-Reply-To: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , David Ahern , Ivan Delalande , Matthieu Baerts , Mat Martineau , Geliang Tang , Boris Pismenny , John Fastabend Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, mptcp@lists.linux.dev, Dmitry Safonov <0x7f454c46@gmail.com> X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1730918993; l=1040; i=0x7f454c46@gmail.com; s=20240410; h=from:subject:message-id; bh=yIywEl4vHF1DrEzww1U5WkkGaZ14BAMY3GjreNIr5EA=; b=bEe3Nj9Ft2qwDiKOg6KHN8UOZdyHHL7bHoshX5SjL0EAZKvZYSRS+mbNynLT6KyczIi/0eUzz 85bYlJhVu7LDv+U41rm8yFwML1IPrO7jppiwuqrOGVdv3zmyeZetppD X-Developer-Key: i=0x7f454c46@gmail.com; a=ed25519; pk=cFSWovqtkx0HrT5O9jFCEC/Cef4DY8a2FPeqP4THeZQ= X-Endpoint-Received: by B4 Relay for 0x7f454c46@gmail.com/20240410 with auth_id=152 X-Original-From: Dmitry Safonov <0x7f454c46@gmail.com> Reply-To: 0x7f454c46@gmail.com From: Dmitry Safonov <0x7f454c46@gmail.com> The code clearly expects that the pre-allocated skb will be enough for the netlink reply message. But if in an unbelievable situation there is a kernel issue and sk_diag_fill() fails with -EMSGSIZE, this WARN_ON() can be triggered from userspace. That aggravates the issue from KASLR leak into possible DOS vector. Use WARN_ON_ONCE() which is clearly enough to provide an information on a kernel issue. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> --- net/ipv4/inet_diag.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 67b9cc4c0e47a596a4d588e793b7f13ee040a1e3..ca9a7e61d8d7de80cb234c45c41d6357fde50c11 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -583,7 +583,7 @@ int inet_diag_dump_one_icsk(struct inet_hashinfo *hashinfo, err = sk_diag_fill(sk, rep, cb, req, 0, net_admin); if (err < 0) { - WARN_ON(err == -EMSGSIZE); + WARN_ON_ONCE(err == -EMSGSIZE); nlmsg_free(rep); goto out; } From patchwork Wed Nov 6 18:10:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov via B4 Relay X-Patchwork-Id: 13865258 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA67D20820D; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918995; cv=none; b=Sk4a9yHm3o4mciER59gnguqWkF2bP5uyS1GNj8GJwSCzTDz0vQVWngo93B67C/ZB2r+CZ6QJMhAsQeE3/a5AYyC04j49uqPeFQrQROSLTe3RobTJ1npeZDzWnRIIbCGgESQGsJWIQa8CQN+3lI4A75RcHaL1iClP7BogJjA7wto= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918995; c=relaxed/simple; bh=jLZjmYe2kGX8q5Z1lMDdOYCXLxA2nQXAkgTbQtibyss=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=YJmyUvFKe3D4lIShgtu7s5SIm8US44euELD16Y6w3KjPMFoA4FbNTjRIABa80VR0Rfu564bmk+eMqaJfFHkO7XSFd1G3QlzggNNWwiHw5Zx9mYgMoa8f3BLeCUIQM/oakXFMspBrbJYuN26sfXPPJqWc8Bl6YAcPv8QRRaKRPaA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=NNTLc3Ni; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="NNTLc3Ni" Received: by smtp.kernel.org (Postfix) with ESMTPS id 868CFC4CED2; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730918995; bh=jLZjmYe2kGX8q5Z1lMDdOYCXLxA2nQXAkgTbQtibyss=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=NNTLc3NiiL8R+ns6+jELxr/HgX8MKL3u/BSgwz3r95H0D1PJarlhju8bJ0tKbXZH2 mKJ4BGHUUlmLXCrC+Ja2pbA652nzOqFrhh5H+7rgiOms6aI5uOcy9ldEgEM2pt7zpG 7FJx+PO2kraJltT0I7/RbROMIJY/PKTR31021b0zc4Jn5VM3ObZncVlzSG4v1tC6l9 OdIAox3UJDwaDHHm8QwR2tacNnGcWD9qJa2VHIS5pd5JFFLDJTox4dbW1Q6hwq83X6 kFHPxbl4oreEZVZNITQ7XTvKuUw/SikqdwFSTQEm5a1PpNRs+NJ98o86eQOEqPII9o ycSgcQ0KQCtEw== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7E1D7D59F62; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) From: Dmitry Safonov via B4 Relay Date: Wed, 06 Nov 2024 18:10:16 +0000 Subject: [PATCH net 3/6] net/diag: Pre-allocate optional info only if requested Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241106-tcp-md5-diag-prep-v1-3-d62debf3dded@gmail.com> References: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> In-Reply-To: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , David Ahern , Ivan Delalande , Matthieu Baerts , Mat Martineau , Geliang Tang , Boris Pismenny , John Fastabend Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, mptcp@lists.linux.dev, Dmitry Safonov <0x7f454c46@gmail.com> X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1730918993; l=2351; i=0x7f454c46@gmail.com; s=20240410; h=from:subject:message-id; bh=fOCy3J49s/tDFSTizx6qbQKYeByBMBjOxYu/MDJO9nw=; b=2O5F1B9EZzEGVdrOCj2HomQpsZZzVQgWcK2OQNqsSIt8ApYIeTBk6eILHN1WF072Om+eeraJ5 QUK9xiYVjVnArf7LvqgTk0a7p39cA3kBkwAxQ9IEe4ea6jHcrE1bJQi X-Developer-Key: i=0x7f454c46@gmail.com; a=ed25519; pk=cFSWovqtkx0HrT5O9jFCEC/Cef4DY8a2FPeqP4THeZQ= X-Endpoint-Received: by B4 Relay for 0x7f454c46@gmail.com/20240410 with auth_id=152 X-Original-From: Dmitry Safonov <0x7f454c46@gmail.com> Reply-To: 0x7f454c46@gmail.com From: Dmitry Safonov <0x7f454c46@gmail.com> Those INET_DIAG_* flags from req->idiag_ext are provided by the userspace, so they are not going to change during one socket dump. This is going to save just nits and bits for typical netlink reply, which I'm going to utilise in the very next patch by always allocating tls_get_info_size(). It's possible to save even some more by checking the request in inet_diag_msg_attrs_size(), but that's being on very stingy side. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> --- net/ipv4/inet_diag.c | 29 ++++++++++++++++++----------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index ca9a7e61d8d7de80cb234c45c41d6357fde50c11..2dd173a73bd1e2657957e5e4ecb70401cc85dfda 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -102,24 +102,31 @@ static size_t inet_sk_attr_size(struct sock *sk, bool net_admin) { const struct inet_diag_handler *handler; - size_t aux = 0; + int ext = req->idiag_ext; + size_t ret = 0; rcu_read_lock(); handler = rcu_dereference(inet_diag_table[req->sdiag_protocol]); DEBUG_NET_WARN_ON_ONCE(!handler); if (handler && handler->idiag_get_aux_size) - aux = handler->idiag_get_aux_size(sk, net_admin); + ret += handler->idiag_get_aux_size(sk, net_admin); rcu_read_unlock(); - return nla_total_size(sizeof(struct tcp_info)) - + nla_total_size(sizeof(struct inet_diag_msg)) - + inet_diag_msg_attrs_size() - + nla_total_size(sizeof(struct inet_diag_meminfo)) - + nla_total_size(SK_MEMINFO_VARS * sizeof(u32)) - + nla_total_size(TCP_CA_NAME_MAX) - + nla_total_size(sizeof(struct tcpvegas_info)) - + aux - + 64; + ret += nla_total_size(sizeof(struct tcp_info)) + + nla_total_size(sizeof(struct inet_diag_msg)) + + inet_diag_msg_attrs_size() + + 64; + + if (ext & (1 << (INET_DIAG_MEMINFO - 1))) + ret += nla_total_size(sizeof(struct inet_diag_meminfo)); + if (ext & (1 << (INET_DIAG_SKMEMINFO - 1))) + ret += nla_total_size(SK_MEMINFO_VARS * sizeof(u32)); + if (ext & (1 << (INET_DIAG_CONG - 1))) + ret += nla_total_size(TCP_CA_NAME_MAX); + if (ext & (1 << (INET_DIAG_VEGASINFO - 1))) + ret += nla_total_size(sizeof(struct tcpvegas_info)); + + return ret; } int inet_diag_msg_attrs_fill(struct sock *sk, struct sk_buff *skb, From patchwork Wed Nov 6 18:10:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov via B4 Relay X-Patchwork-Id: 13865263 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 02FAC20823B; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918996; cv=none; b=ITZaSXFP/+tzAA2t/EHKIIn2/BZoyC/sI6d2TqZPv6y1N5LginORogzZ+zQjtl0P2ksD6l0Pm7p7j6X8PBMw4MUzVqZg9fNjn7ZhfzI5B7pC30pwf1Mz2H5mpSv8xLnUya0wieCsmVNky5FLj+/jhXr3Z0DnfuYe0mDQI1meZcI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918996; c=relaxed/simple; bh=x3OuOVXeYNMZ5J8mdlOMHUyUU2o5xCEWuMjEXd6ZnWo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=pSgK2hjjo7HplqYCnyK3UZfouDymDCpwa3/3qIePudAxZES/NTTnESHrCrDQ2CfSmT2mWEHHTFCoomFhhjQYGZNSCJtP5syc0v/t+cCW4DHXNqVenjjOT0LIlkp97OwyuUJ+9SnyY+UY9b3ZB+4cds+RDYQCwz2zrmLZGfAWCpY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=FwxUIK2y; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="FwxUIK2y" Received: by smtp.kernel.org (Postfix) with ESMTPS id 9A7DAC4CED9; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730918995; bh=x3OuOVXeYNMZ5J8mdlOMHUyUU2o5xCEWuMjEXd6ZnWo=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=FwxUIK2yGkiiEL1/AssTbOdkgDQ0MycZyp4Fw+NVfsxrjIWGaPeoj2J14lwVScdmb ryRpdHvQohFrcjV1zZKFwFKHE4yg2ya3gmrNFm4d01onoZFQVr/jt2GJBRY+CYJn7Q wK8w14uzsiOB+yjtekLonDl++r5PLS/SyLPy4EBX2kZlRDIp5Hgi+1RTINTz343Xc+ 2j18oT3R0HzBm3gByshemSeG2VZl3Vt4AZmXXyrbvDOVb9UUha20Uc4fu8guf8GAGD ZfCDPswMp7bCVY6YMgIY2SMJnOZLJmgdheM5fmfdbt+LvLqlicmaolRCgyyl52yoYG f3xlNuxccf1QQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9085DD59F60; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) From: Dmitry Safonov via B4 Relay Date: Wed, 06 Nov 2024 18:10:17 +0000 Subject: [PATCH net 4/6] net/diag: Always pre-allocate tcp_ulp info Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241106-tcp-md5-diag-prep-v1-4-d62debf3dded@gmail.com> References: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> In-Reply-To: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , David Ahern , Ivan Delalande , Matthieu Baerts , Mat Martineau , Geliang Tang , Boris Pismenny , John Fastabend Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, mptcp@lists.linux.dev, Dmitry Safonov <0x7f454c46@gmail.com> X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1730918993; l=6868; i=0x7f454c46@gmail.com; s=20240410; h=from:subject:message-id; bh=FncH4oZGHPV39oeGv/CSy8io/OgtTSiJCBLPRkjA1HU=; b=g5fDzb3sXLqQkmiILRw2wIeXsPHRbv9pl2o1FAODv7/R2cKEKtTaeulkQdFRC6H1J0eo3NWzZ w6fmt7kcGjJBGgN4ED3rKvsYJQ9vFjVlvQuavd8SqNYBMhKpV2Kfub6 X-Developer-Key: i=0x7f454c46@gmail.com; a=ed25519; pk=cFSWovqtkx0HrT5O9jFCEC/Cef4DY8a2FPeqP4THeZQ= X-Endpoint-Received: by B4 Relay for 0x7f454c46@gmail.com/20240410 with auth_id=152 X-Original-From: Dmitry Safonov <0x7f454c46@gmail.com> Reply-To: 0x7f454c46@gmail.com From: Dmitry Safonov <0x7f454c46@gmail.com> Currently there is a theoretical race between netlink one-socket dump and allocating icsk->icsk_ulp_ops. Simplify the expectations by always allocating maximum tcp_ulp-info. With the previous patch the typical netlink message allocation was decreased for kernel replies on requests without idiag_ext flags, so let's use it. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> --- include/net/tcp.h | 1 - net/ipv4/inet_diag.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++ net/ipv4/tcp_diag.c | 13 ------------- net/mptcp/diag.c | 20 -------------------- net/tls/tls_main.c | 17 ----------------- 5 files changed, 48 insertions(+), 51 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index d1948d357dade0842777265d3397842919f9eee0..757711aa5337ae7e6abee62d303eb66d37082e19 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -2568,7 +2568,6 @@ struct tcp_ulp_ops { void (*release)(struct sock *sk); /* diagnostic */ int (*get_info)(struct sock *sk, struct sk_buff *skb); - size_t (*get_info_size)(const struct sock *sk); /* clone ulp */ void (*clone)(const struct request_sock *req, struct sock *newsk, const gfp_t priority); diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index 2dd173a73bd1e2657957e5e4ecb70401cc85dfda..97862971d552216e574cac3dd2a8fc8c893888d3 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -97,6 +97,53 @@ void inet_diag_msg_common_fill(struct inet_diag_msg *r, struct sock *sk) } EXPORT_SYMBOL_GPL(inet_diag_msg_common_fill); +static size_t tls_get_info_size(void) +{ + size_t size = 0; + +#ifdef CONFIG_TLS + size += nla_total_size(0) + /* INET_ULP_INFO_TLS */ + nla_total_size(sizeof(u16)) + /* TLS_INFO_VERSION */ + nla_total_size(sizeof(u16)) + /* TLS_INFO_CIPHER */ + nla_total_size(sizeof(u16)) + /* TLS_INFO_RXCONF */ + nla_total_size(sizeof(u16)) + /* TLS_INFO_TXCONF */ + nla_total_size(0) + /* TLS_INFO_ZC_RO_TX */ + nla_total_size(0) + /* TLS_INFO_RX_NO_PAD */ + 0; +#endif + + return size; +} + +static size_t subflow_get_info_size(void) +{ + size_t size = 0; + +#ifdef CONFIG_MPTCP + size += nla_total_size(0) + /* INET_ULP_INFO_MPTCP */ + nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_TOKEN_REM */ + nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_TOKEN_LOC */ + nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_RELWRITE_SEQ */ + nla_total_size_64bit(8) + /* MPTCP_SUBFLOW_ATTR_MAP_SEQ */ + nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_MAP_SFSEQ */ + nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_SSN_OFFSET */ + nla_total_size(2) + /* MPTCP_SUBFLOW_ATTR_MAP_DATALEN */ + nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_FLAGS */ + nla_total_size(1) + /* MPTCP_SUBFLOW_ATTR_ID_REM */ + nla_total_size(1) + /* MPTCP_SUBFLOW_ATTR_ID_LOC */ + 0; +#endif + + return size; +} + +static size_t tcp_ulp_ops_size(void) +{ + size_t size = max(tls_get_info_size(), subflow_get_info_size()); + + return size + nla_total_size(0) + nla_total_size(TCP_ULP_NAME_MAX); +} + static size_t inet_sk_attr_size(struct sock *sk, const struct inet_diag_req_v2 *req, bool net_admin) @@ -115,6 +162,7 @@ static size_t inet_sk_attr_size(struct sock *sk, ret += nla_total_size(sizeof(struct tcp_info)) + nla_total_size(sizeof(struct inet_diag_msg)) + inet_diag_msg_attrs_size() + + tcp_ulp_ops_size() + 64; if (ext & (1 << (INET_DIAG_MEMINFO - 1))) diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c index 36606a19b451f059e32c58c0d76a878dc9be5ff0..722dbfd54d247b4def1e77b1674c5b207c5a939d 100644 --- a/net/ipv4/tcp_diag.c +++ b/net/ipv4/tcp_diag.c @@ -154,7 +154,6 @@ static int tcp_diag_get_aux(struct sock *sk, bool net_admin, static size_t tcp_diag_get_aux_size(struct sock *sk, bool net_admin) { - struct inet_connection_sock *icsk = inet_csk(sk); size_t size = 0; #ifdef CONFIG_TCP_MD5SIG @@ -174,18 +173,6 @@ static size_t tcp_diag_get_aux_size(struct sock *sk, bool net_admin) sizeof(struct tcp_diag_md5sig)); } #endif - - if (net_admin && sk_fullsock(sk)) { - const struct tcp_ulp_ops *ulp_ops; - - ulp_ops = icsk->icsk_ulp_ops; - if (ulp_ops) { - size += nla_total_size(0) + - nla_total_size(TCP_ULP_NAME_MAX); - if (ulp_ops->get_info_size) - size += ulp_ops->get_info_size(sk); - } - } return size; } diff --git a/net/mptcp/diag.c b/net/mptcp/diag.c index 2d3efb405437d85c0bca70d7a92ca3a7363365e1..8b36867e4ddd5f45cebcf60e9093a061d5208756 100644 --- a/net/mptcp/diag.c +++ b/net/mptcp/diag.c @@ -84,27 +84,7 @@ static int subflow_get_info(struct sock *sk, struct sk_buff *skb) return err; } -static size_t subflow_get_info_size(const struct sock *sk) -{ - size_t size = 0; - - size += nla_total_size(0) + /* INET_ULP_INFO_MPTCP */ - nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_TOKEN_REM */ - nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_TOKEN_LOC */ - nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_RELWRITE_SEQ */ - nla_total_size_64bit(8) + /* MPTCP_SUBFLOW_ATTR_MAP_SEQ */ - nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_MAP_SFSEQ */ - nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_SSN_OFFSET */ - nla_total_size(2) + /* MPTCP_SUBFLOW_ATTR_MAP_DATALEN */ - nla_total_size(4) + /* MPTCP_SUBFLOW_ATTR_FLAGS */ - nla_total_size(1) + /* MPTCP_SUBFLOW_ATTR_ID_REM */ - nla_total_size(1) + /* MPTCP_SUBFLOW_ATTR_ID_LOC */ - 0; - return size; -} - void mptcp_diag_subflow_init(struct tcp_ulp_ops *ops) { ops->get_info = subflow_get_info; - ops->get_info_size = subflow_get_info_size; } diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c index 6b4b9f2749a6fd6de495940c5cb3f2154a5a451e..f3491c4e942e08dc882cb81eef071203384b2b37 100644 --- a/net/tls/tls_main.c +++ b/net/tls/tls_main.c @@ -1072,22 +1072,6 @@ static int tls_get_info(struct sock *sk, struct sk_buff *skb) return err; } -static size_t tls_get_info_size(const struct sock *sk) -{ - size_t size = 0; - - size += nla_total_size(0) + /* INET_ULP_INFO_TLS */ - nla_total_size(sizeof(u16)) + /* TLS_INFO_VERSION */ - nla_total_size(sizeof(u16)) + /* TLS_INFO_CIPHER */ - nla_total_size(sizeof(u16)) + /* TLS_INFO_RXCONF */ - nla_total_size(sizeof(u16)) + /* TLS_INFO_TXCONF */ - nla_total_size(0) + /* TLS_INFO_ZC_RO_TX */ - nla_total_size(0) + /* TLS_INFO_RX_NO_PAD */ - 0; - - return size; -} - static int __net_init tls_init_net(struct net *net) { int err; @@ -1123,7 +1107,6 @@ static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = { .init = tls_init, .update = tls_update, .get_info = tls_get_info, - .get_info_size = tls_get_info_size, }; static int __init tls_register(void) From patchwork Wed Nov 6 18:10:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov via B4 Relay X-Patchwork-Id: 13865262 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 069AE208236; Wed, 6 Nov 2024 18:49:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918996; cv=none; b=tTYoHpK34I/VVSaeMj/K3tpSyLybF+D489B4w0VJnzIzCDqWjRzDQsGX7rRV60LlfLplEzQk2/b4lEnqcV3geA4eI+jRq4Ea5EQZ0n2KAw35b3sH1UaFajiWph3jh/teYJH9zvbF+fiGL2uK9GmIG4RwV/ckYlbILsSPF22LtpE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918996; c=relaxed/simple; bh=BKqLaO6XPMvQ5HpRGFcPhcfr1sfukA4hEzfeaiOil0w=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=OH0TVeNG/vrchY/3xWbQj31T8HkPILs6RFYsC6vL3jlRRocWf9D0vBeALALxvZHqXF8YLw0CZSFXdprpcieANn3qlhmvm2W9fI6MIPGNlxh6eTMvgCCmdjkUlzncQdc53f6XJ74IMBjQvG2VvJFe1/n6z3fX1B7/hdO4KdLRxeI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=X42fVoZF; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="X42fVoZF" Received: by smtp.kernel.org (Postfix) with ESMTPS id B4167C4CEDD; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730918995; bh=BKqLaO6XPMvQ5HpRGFcPhcfr1sfukA4hEzfeaiOil0w=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=X42fVoZF9AmC1mgk5dH5kfF7n6AUHEAvUh7vzTRgQK/hkcZc8ui5IxjUCZAfxi9rI XkxKhA5nGLh4w4JCbvRufqCJWQbspGRnk1u4JlE8StWFvyCsxITV0G0h6Et24u5Shz UgqkTC2onFiKnU76GU7LPJpfpmFcGOf4V+Ya1gnMpFB7H+ehiYXlmw0z/COlOyXBE9 2MjP/Jjtr4GnHgqxlIo57XUes2tBSxyVDeA9u61icNUSjlh/NYB8OAJsAWWjT/FINX NKfVQM57YI4Lt4Hik+WUkmlzi1xl2bMUcFb4wblNVb3r5Vky5vLOaPpJAWk7IWXpWi BJi8Zn8xGDbHQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id A775DD59F66; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) From: Dmitry Safonov via B4 Relay Date: Wed, 06 Nov 2024 18:10:18 +0000 Subject: [PATCH net 5/6] net/diag: Limit TCP-MD5-diag array by max attribute length Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241106-tcp-md5-diag-prep-v1-5-d62debf3dded@gmail.com> References: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> In-Reply-To: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , David Ahern , Ivan Delalande , Matthieu Baerts , Mat Martineau , Geliang Tang , Boris Pismenny , John Fastabend Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, mptcp@lists.linux.dev, Dmitry Safonov <0x7f454c46@gmail.com> X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1730918993; l=3314; i=0x7f454c46@gmail.com; s=20240410; h=from:subject:message-id; bh=PKo8fHDH+Y26BFpEtywm6891bWe9iV/N0VRAp9ggg7E=; b=GkfMWOeUE2s8NVrp9CoHDE9rOxuVaoN4dBrn5VXXR6UodtyakVhNuPnRIF/MWiE96me7arJ6t VWcWdJ1SnRBCM9WAsfrVxdklibcbnzPRrSaKm8pUMUFaOc9d2MO9qCG X-Developer-Key: i=0x7f454c46@gmail.com; a=ed25519; pk=cFSWovqtkx0HrT5O9jFCEC/Cef4DY8a2FPeqP4THeZQ= X-Endpoint-Received: by B4 Relay for 0x7f454c46@gmail.com/20240410 with auth_id=152 X-Original-From: Dmitry Safonov <0x7f454c46@gmail.com> Reply-To: 0x7f454c46@gmail.com From: Dmitry Safonov <0x7f454c46@gmail.com> Currently TCP-MD5 keys are dumped as an array of (struct tcp_diag_md5sig). All the keys from a socket go into the same netlink attribute. The maximum amount of TCP-MD5 keys on any socket is limited by /proc/sys/net/core/optmem_max, which post commit 4944566706b2 ("net: increase optmem_max default value") is now by default 128 KB. With the help of selftest I've figured out that equals to 963 keys, without user having to increase optmem_max: > test_set_md5() [963/1024]: Cannot allocate memory The maximum length of nlattr is limited by typeof(nlattr::nla_len), which is (U16_MAX - 1). When there are too many keys the array written overflows the netlink attribute. Here is what one can see on a test, with no adjustments to optmem_max defaults: > recv() = 65180 > socket: 10.0.254.1:7013->0.0.0.0:0 (intf 3) > family: 2 state: 10 timer: 0 retrans: 0 > expires: 0 rqueu: 0 wqueue: 1 uid: 0 inode: 456 > attr type: 8 (5) > attr type: 15 (8) > attr type: 21 (12) > attr type: 22 (6) > attr type: 2 (252) > attr type: 18 (64804) > recv() = 130680 > socket: 10.0.254.1:7013->0.0.0.0:0 (intf 3) > family: 2 state: 10 timer: 0 retrans: 0 > expires: 0 rqueu: 0 wqueue: 1 uid: 0 inode: 456 > attr type: 8 (5) > attr type: 15 (8) > attr type: 21 (12) > attr type: 22 (6) > attr type: 2 (252) > attr type: 18 (64768) > attr type: 29555 (25966) > recv() = 130680 > socket: 10.0.254.1:7013->0.0.0.0:0 (intf 3) > family: 2 state: 10 timer: 0 retrans: 0 > expires: 0 rqueu: 0 wqueue: 1 uid: 0 inode: 456 > attr type: 8 (5) > attr type: 15 (8) > attr type: 21 (12) > attr type: 22 (6) > attr type: 2 (252) > attr type: 18 (64768) > attr type: 29555 (25966) > attr type: 8265 (8236) Here attribute type 18 is INET_DIAG_MD5SIG, the following nlattr types are junk made of tcp_diag_md5sig's content. Here is the overflow of the nlattr size: >>> hex(64768) '0xfd00' >>> hex(130300) '0x1fcfc' Limit the size of (struct tcp_diag_md5sig) array in the netlink reply by maximum attribute length. Not perfect as NLM_F_DUMP_INTR will be set on the netlink header flags, but the userspace can differ if it's due to inconsistency or due to maximum size of the netlink attribute. In a following patch set, I'm planning to address this and re-introduce TCP-MD5-diag that actually works. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> --- net/ipv4/tcp_diag.c | 1 + 1 file changed, 1 insertion(+) diff --git a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c index 722dbfd54d247b4def1e77b1674c5b207c5a939d..d55a0ac39fa0853806efd4a6b38591255e0f4930 100644 --- a/net/ipv4/tcp_diag.c +++ b/net/ipv4/tcp_diag.c @@ -72,6 +72,7 @@ static int tcp_diag_put_md5sig(struct sk_buff *skb, return 0; attrlen = skb_availroom(skb) - NLA_HDRLEN; + attrlen = min(attrlen, U16_MAX - 1); /* attr->nla_len */ md5sig_count = min(md5sig_count, attrlen / key_size); attr = nla_reserve(skb, INET_DIAG_MD5SIG, md5sig_count * key_size); if (!attr) From patchwork Wed Nov 6 18:10:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Safonov via B4 Relay X-Patchwork-Id: 13865264 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7B541209695; Wed, 6 Nov 2024 18:49:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918996; cv=none; b=ci39HK2VrM0PYZ/ljTy7z9/iZZKHbqGkJ2YJufCOcatss2BrT2M21MR59+L22PwrZzhorLnOknvuDWbh+6+Mxxig5wJTt8fW8CzElItkkH0BFKPutZ7iL8TEnDXjfGnKtzK7Yt4jdqgyabq48dvMCvi24LrManMxOFyAJUG4SMU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730918996; c=relaxed/simple; bh=hUXnHZwdxyXGHKs/xhNvNy/CJiyEexLZlYu0RRseBcs=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=XN3TJRcMoF/PoP6unCwVVq6WHjB1/rFpSPeKB/2GAxF0lwXrH9hvmOTH4SlC7SrKxceqf4NV/FgDaTnAal3XB25+59Q1ter/eWXJxB1xGR0SWrSS4ThOTooikWWirDzbhrNEusNrnuj+DSjpVPAJnRNPG8I4BAa2fws4rgImTfw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JhmxUkMG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JhmxUkMG" Received: by smtp.kernel.org (Postfix) with ESMTPS id DEE18C4CEE1; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1730918996; bh=hUXnHZwdxyXGHKs/xhNvNy/CJiyEexLZlYu0RRseBcs=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=JhmxUkMG9i19apfIq+dUrff90TNz5ygITK22+ILWBtJOjLiLKs7J/dunyG9Y5ICG8 v0t9/D7kHGrpVyoXEUsCIUftXvrfrl8/YFXyV/nBC6kC0s4wshkZi+k/GjWa8ZpcEV D3TuBObS6l3IwbPXdm5ZSCwDxzDidv7hQ4AfAm4BSAGn2T7QzTdy7r8yv2rWUJDrLj 8lF8sIRPtNdvoRAC2OtddRzCM8GpluTlJqD84Hl0u1dBmZzyqPmuPbTvwZDUuHvfKX LPx6TL8+yZH2MhamG6QyavjUY92B4nbkjPbsOPctxOw5ckm5NTeHPuVunDEPWqv07I P6XaQzlV7CpeA== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2807D59F65; Wed, 6 Nov 2024 18:49:55 +0000 (UTC) From: Dmitry Safonov via B4 Relay Date: Wed, 06 Nov 2024 18:10:19 +0000 Subject: [PATCH net 6/6] net/netlink: Correct the comment on netlink message max cap Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241106-tcp-md5-diag-prep-v1-6-d62debf3dded@gmail.com> References: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> In-Reply-To: <20241106-tcp-md5-diag-prep-v1-0-d62debf3dded@gmail.com> To: "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , David Ahern , Ivan Delalande , Matthieu Baerts , Mat Martineau , Geliang Tang , Boris Pismenny , John Fastabend Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, mptcp@lists.linux.dev, Dmitry Safonov <0x7f454c46@gmail.com> X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1730918994; l=1447; i=0x7f454c46@gmail.com; s=20240410; h=from:subject:message-id; bh=Kmj21DwNJNOs65kiQumd7s7cSWkgFWMCJrFg+lJ9ggE=; b=ITm5BD/vJrJwTvNmV1V6c7esw02waru8s1F/KcRHpnPwih8VvLb5zrhOJkimS2VHKQBziP56v jdM9oPDoaHECjZpmFpJOoueAzKPXBTDCuVXLCz8+ouoidSykBvuUtd8 X-Developer-Key: i=0x7f454c46@gmail.com; a=ed25519; pk=cFSWovqtkx0HrT5O9jFCEC/Cef4DY8a2FPeqP4THeZQ= X-Endpoint-Received: by B4 Relay for 0x7f454c46@gmail.com/20240410 with auth_id=152 X-Original-From: Dmitry Safonov <0x7f454c46@gmail.com> Reply-To: 0x7f454c46@gmail.com From: Dmitry Safonov <0x7f454c46@gmail.com> Since commit d35c99ff77ec ("netlink: do not enter direct reclaim from netlink_dump()") the cap is 32KiB. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> --- net/netlink/af_netlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 0a9287fadb47a2afaf0babe675738bc43051c5a7..27979cefc06256bde052898d193ed99f710c2087 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -2279,7 +2279,7 @@ static int netlink_dump(struct sock *sk, bool lock_taken) goto errout_skb; /* NLMSG_GOODSIZE is small to avoid high order allocations being - * required, but it makes sense to _attempt_ a 16K bytes allocation + * required, but it makes sense to _attempt_ a 32KiB allocation * to reduce number of system calls on dump operations, if user * ever provided a big enough buffer. */ @@ -2301,7 +2301,7 @@ static int netlink_dump(struct sock *sk, bool lock_taken) goto errout_skb; /* Trim skb to allocated size. User is expected to provide buffer as - * large as max(min_dump_alloc, 16KiB (mac_recvmsg_len capped at + * large as max(min_dump_alloc, 32KiB (max_recvmsg_len capped at * netlink_recvmsg())). dump will pack as many smaller messages as * could fit within the allocated skb. skb is typically allocated * with larger space than required (could be as much as near 2x the