From patchwork Thu Sep 26 17:30:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Matthieu Baerts (NGI0)" X-Patchwork-Id: 13813542 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 062752914; Thu, 26 Sep 2024 17:30:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727371841; cv=none; b=JQg6D05PJqda9w3uuCIWb/c9F/mD5TnJNl779A9Pl1on45OAgZZs+d3lCqDS2OkIHtBChFvroLFwHwtC+AC1MLvHaesI6ignPbuPAobdcd/VZeH8aXkgr9oFHVJ2J0OQG1FWiacw3I7A/mZnNy4bym5XyAR11/fFV4+7flxpa+M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727371841; c=relaxed/simple; bh=D0SfxUVTuyTnYZC9kSremtU+kFqebil+/+UbQRnwwPM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=g0+OYabpdnlm8qiNfvNLqYSG0v9Hy8oeyRO++2DlUvlRKRKvMZFohll643cdyRTraSpAd8zfzlhfAL0aaUM4YF810rK/kc3Ib63BveDZBUk73rfVgbJUd/UUhla23sgZpTgmmbdvoG77c5oxjG1Nabg6EsQGeO8WeghCvdBVpBY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ENdTaTaY; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ENdTaTaY" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 31BB6C4CED1; Thu, 26 Sep 2024 17:30:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727371840; bh=D0SfxUVTuyTnYZC9kSremtU+kFqebil+/+UbQRnwwPM=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=ENdTaTaYpYfUoh1nCY22Dlg9jxssuLFBmmcTq6CYhQa886ueJQ8tiiTXWi6tGpMTk 
d3F7R/QhtWW6Ru1fhoVtB69awcUomzaOrUXJ9zAufdclqYV0ISOpjSvZ0aHoQWBHwG LtM6TZ2X74s+rw7FN0CuoRhGuv/xtj9ARwA3YLJWfU40gdV/pnCA6wjly1X67XFi0K +usO86pAlycVQl0N+QMnwV2pN0k77v+DI0bVfFta1/AILrLDJmwYVzrES6MyUsQ7Np 9/SHOh2Q6+Tgqe5OikuoS1/xvQPfa33s9duqHTkYDs1qyRfAO2b2oEk9J4swyASkaW WfA4F8qpw/54A== From: "Matthieu Baerts (NGI0)" Date: Thu, 26 Sep 2024 19:30:22 +0200 Subject: [PATCH bpf-next/net v7 1/3] selftests/bpf: Add mptcp subflow example Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-1-d26029e15cdd@kernel.org> References: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-0-d26029e15cdd@kernel.org> In-Reply-To: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-0-d26029e15cdd@kernel.org> To: mptcp@lists.linux.dev, Mat Martineau , Geliang Tang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Shuah Khan Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, "Matthieu Baerts (NGI0)" , Geliang Tang X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=4501; i=matttbe@kernel.org; h=from:subject:message-id; bh=EMtfZe45Hxqq8vNRVMFtfAoY8a5Xhn5SAcHBEaZLfUs=; b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBm9Zo22Nih81yx+/UrbcG5sVeZWMg1i7RIY8Bub iMpXaqmHveJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZvWaNgAKCRD2t4JPQmmg c7HeD/90DBPh+gzTYSCHAXU/40gke0uvwWDUZDPnF6/Hm40tFltxS+j6EhOR8VJMLo8tZtIGSWh OQhPRb02CN3tFehgMKI/reiIKC1VfyT5C7BRADrpClAhpB8s6+IK/47gXbaYgKkZ6z7PMliZ++Y 73sC5u99m9UWZ14ZGYAhpLxcKFhoeo87q8C1ysSBgKdkVZvq9h2q/YnfJRxM8mUORKyR6GfPqmZ XsdtcASlaPFD+KTrbs97u1Q77it08tPQCVW8f3MvdmeUzFJceN8USaRFBxKBgKR1cp6P8BiPBvf Li+A2XDz0WkEcKRktSTJhq+mgLsVPurYS/42DU1vFBwGRA7l0Nlxq6Zw+27dcTticMbjh2LH2k0 
G1Ije9DkeW0KDVQUAMK+qG6rBmug9uMc2WqFHH1DtVknAjNzFJEWCBgeJw0Eq0hEthX1JzGCKBZ U0IBGXWZW6TY5gtl8r99bBruSE0N+AUdx5xF5k9u5+BCsiY9NSCBCcWoxpqukL5w2xo9CuwJ9sI 7z7KFbPzwNXcD1YXWT5Ypbf1dTERknJmri6G1x9H6u5PggWIP9kWgKuctZ6u7QBIFPb8EDppu/0 W1R63azGj6JQknLsAEJe5lGESIiAzhEGYJSs5o1ZUW67wnJN+4S9m1dEWmzq9DQE0ydOWJhHnAy L5x0EHcT6hcnX9Q==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp; fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Nicolas Rybowski

Move Nicolas' patch into the BPF selftests directory. This example sets
a different mark (SO_MARK) on each subflow, and changes the TCP CC only
on the second subflow.

From userspace, an application can do a setsockopt() on an MPTCP socket,
and typically the same value will be propagated to all subflows (paths).
If someone wants to have different values per subflow, the recommended
way is to use BPF. So it is good to add such an example here, and to
make sure there are no regressions.

This example shows how it is possible to:

 - Identify the parent msk of an MPTCP subflow.
 - Set different socket options for each subflow of the same MPTCP
   connection.

Here especially, two different behaviours are implemented:

 - A socket mark (SOL_SOCKET SO_MARK) is set on each subflow of the same
   MPTCP connection. The order of creation of the current subflow
   defines its mark.
 - The TCP CC algorithm of the second subflow of an MPTCP connection is
   set to "reno".

This is just to show it is possible to identify an MPTCP connection, and
set socket options, from different SOL levels, per subflow. "reno" has
been picked because it is built-in and usually not set as the default
one. It is easy to verify with 'ss' that these modifications have been
applied correctly. That's what the next patch is going to do.

Nicolas' code comes from:

    commit 4d120186e4d6 ("bpf:examples: update mptcp_set_mark_kern.c")

from the MPTCP repo https://github.com/multipath-tcp/mptcp_net-next (the
"scripts" branch), and it has been adapted by Geliang.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76
Co-developed-by: Geliang Tang
Signed-off-by: Geliang Tang
Signed-off-by: Nicolas Rybowski
Reviewed-by: Mat Martineau
Signed-off-by: Matthieu Baerts (NGI0)
---
Notes:
 - v1 -> v2:
   - The commit message has been updated: why setting multiple socket
     options, why reno, the verification is done in a later patch
     (different author). (Alexei)
 - v2 -> v3:
   - Only #include "bpf_tracing_net.h", linked to:
     https://lore.kernel.org/20240509175026.3423614-1-martin.lau@linux.dev
 - v4 -> v5:
   - Set reno as TCP cc on the second subflow, not to influence the
     getsockopt() done from the userspace, which will return the one
     from the first subflow, the default TCP cc then, not the modified
     one.
---
 tools/testing/selftests/bpf/progs/mptcp_subflow.c | 59 +++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/mptcp_subflow.c b/tools/testing/selftests/bpf/progs/mptcp_subflow.c
new file mode 100644
index 0000000000000000000000000000000000000000..2e28f4a215b5469fcbc31168071887687ca34792
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_subflow.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020, Tessares SA. */
+/* Copyright (c) 2024, Kylin Software */
+
+/* vmlinux.h, bpf_helpers.h and other 'define' */
+#include "bpf_tracing_net.h"
+
+char _license[] SEC("license") = "GPL";
+
+char cc[TCP_CA_NAME_MAX] = "reno";
+
+/* Associate a subflow counter to each token */
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(key_size, sizeof(__u32));
+	__uint(value_size, sizeof(__u32));
+	__uint(max_entries, 100);
+} mptcp_sf SEC(".maps");
+
+SEC("sockops")
+int mptcp_subflow(struct bpf_sock_ops *skops)
+{
+	__u32 init = 1, key, mark, *cnt;
+	struct mptcp_sock *msk;
+	struct bpf_sock *sk;
+	int err;
+
+	if (skops->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
+		return 1;
+
+	sk = skops->sk;
+	if (!sk)
+		return 1;
+
+	msk = bpf_skc_to_mptcp_sock(sk);
+	if (!msk)
+		return 1;
+
+	key = msk->token;
+	cnt = bpf_map_lookup_elem(&mptcp_sf, &key);
+	if (cnt) {
+		/* A new subflow is added to an existing MPTCP connection */
+		__sync_fetch_and_add(cnt, 1);
+		mark = *cnt;
+	} else {
+		/* A new MPTCP connection is just initiated and this is its primary subflow */
+		bpf_map_update_elem(&mptcp_sf, &key, &init, BPF_ANY);
+		mark = init;
+	}
+
+	/* Set the mark of the subflow's socket based on appearance order */
+	err = bpf_setsockopt(skops, SOL_SOCKET, SO_MARK, &mark, sizeof(mark));
+	if (err < 0)
+		return 1;
+	if (mark == 2)
+		err = bpf_setsockopt(skops, SOL_TCP, TCP_CONGESTION, cc, TCP_CA_NAME_MAX);
+
+	return 1;
+}

From patchwork Thu Sep 26 17:30:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Matthieu Baerts (NGI0)"
X-Patchwork-Id: 13813543
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBE9F2914; Thu, 26 Sep 2024 17:30:45 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727371845; cv=none; b=Vy7qWkqONYY4cvKrihoJXkRn1DZ0+vsfJG/bpjKuT1kaYCA2zeBcV2dbcdcy/yJV5KjEkHbBrSl+t0pxHvSY80g2AKJwsPaFpaQmlenNjJTP93/0Ixkz0onbTNPdF7xHPRW1MQRm6uCkpHLs0rhaG2vqXSjYjRtY5OfjJPVFDJw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727371845; c=relaxed/simple; bh=lRLfi2UxFsB0Wd5r8DgWeVoWLcwTpOqLJtw/uZ5JA2w=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=LXUqHTjSW4T/l2JwYFMWzVwflGRqR/n4Ep4WPrYp8LNx8Ng/n9aOZ7X+NijVZhWiV2fzbeKFCiRVvNSx4GQNp5SNqDNWfpXlmiA3ywC27Len5tfih+s6SwHgUCzWiReMHIqRrqzV4N/bn7zpyOQVA9ZNxn3ndSduAiFbmt7kB8k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=GB+cVKTw; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="GB+cVKTw" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E6015C4CEC7; Thu, 26 Sep 2024 17:30:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727371845; bh=lRLfi2UxFsB0Wd5r8DgWeVoWLcwTpOqLJtw/uZ5JA2w=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=GB+cVKTw0N5j32bSDpHVsPy1HWvIht/1acmycM1/2g6KqPboNzMsqKLZYaukJxMqC p3IkkQze6iR9sLJTIIK92ln4lj0hr50sF+plVdr5agFGX0c6KTFxZKsB3wrNJZ4A0n l6rIFbbO76SnkGu2YFLr2Q3sLEhik5Xi+arGRvISSD9ZPkIw8iLOotrolhWZyfSExF qd7dhdh9HE5dQLi2J0UugqDOUUuZUPHWL3+Rl4y/mGFigGWlplR6SIoYNOCRhHmk2E /LdeBUu+f1P3ZV1k9jevCRCBd8NkCfbmGGLrVW6DPAQhfA//rkwIZaOMunwNSSPDGH GHc8L+g4qA7lw== From: "Matthieu Baerts (NGI0)" Date: Thu, 26 Sep 2024 19:30:23 +0200 Subject: [PATCH bpf-next/net v7 2/3] selftests/bpf: Add getsockopt to inspect mptcp subflow Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: 
<20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-2-d26029e15cdd@kernel.org> References: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-0-d26029e15cdd@kernel.org> In-Reply-To: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-0-d26029e15cdd@kernel.org> To: mptcp@lists.linux.dev, Mat Martineau , Geliang Tang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Shuah Khan Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, "Matthieu Baerts (NGI0)" , Martin KaFai Lau , Geliang Tang X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=6227; i=matttbe@kernel.org; h=from:subject:message-id; bh=Fv2w/TMtWqopJGbiMvr/rlywhKanG9Xb8OKXkobTytI=; b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBm9Zo2Iexf+V1fyQg7RvHtHdg61oFl/N02hef32 UGNDI7dG16JAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZvWaNgAKCRD2t4JPQmmg c1csEAC+W1TMx8o2k8XGnBaFVSLQo/hIdOkOw699bXzt+iCwnK7vbUT8nhXks/VA0hJ48f1tKgr D25ETN1kJbQIbZYaEmY0Mk4Kwlbao/2vhbpMU2X13nCAuwe1PijRSN7E+rznJCpjK4QdsRG5V3j Z6wHvunIKNYD3D41GQuGP8dpmEfIXvOF8e2vyn+PUrLuEkb1O8qvSnE1HNgjjLZILDy2hWO1eO5 Jv0N17EPndiVYm5Ow+jJgmISyLcMLDMWu8QwNLc9iKI9pTP23VSyqfXrUNoJYmOMb7uT04FvLrb r5mGBI+RWO4QURJPAujZRaJ7lU36hqn6Qp9xUsdXiiUWoeR6ARGhDqBFNXMAJVh1+XzB2IFGXPl ogzI/h2Iw2bUOx3IBYVTCjo2Z0MWxdXnbWTW5bDZNdz9xTkljCK4c8rETRl3Q+xnjs32MlZPce7 0IY2DhUiXQmSaNSYJ3gvOtlfd1FBMncUYeWsxrlP80D8GAqComYL94JtiCFytH0vwih75p/rDhb 4gCbXHWpEXli1Y//eJdY2P3UNMBQ+O3/NvDVG4vltOxeGXFNcwukGILwuAhzBD+G+vXwEzcY/lF KIhRCXhrEWUpsg1VcmGNvrJ/ctiXBfw66fpdmYaL/OPAEgxmbH8JQMJlkJ7sIBsB9cKGvJa3BdI PGNgZ1AQ/VxmBkw== X-Developer-Key: i=matttbe@kernel.org; a=openpgp; fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073 From: Geliang Tang This patch adds a "cgroup/getsockopt" way to inspect the subflows of an MPTCP socket, and verify the modifications done by the 
same BPF program in the previous commit: a different mark per subflow,
and a different TCP CC set on the second one. This new hook will be used
by the next commit to verify the socket options set on each subflow.

This extra "cgroup/getsockopt" prog walks msk->conn_list and uses
bpf_core_cast to cast a pointer for read-only access. This makes it
possible to inspect all the fields of a structure.

Note that on the kernel side, the MPTCP socket stores a list of subflows
under 'msk->conn_list'. They can be iterated using the generic 'list'
helpers. These helpers have been imported here, with a small difference:
list_for_each_entry() uses 'can_loop' to limit the number of iterations,
and ease its use. Because the data only needs to be read here, this
technique is sufficient. It is planned to use bpf_iter when BPF programs
are used to modify data from the different subflows.

The mptcp_subflow_tcp_sock() and mptcp_for_each_subflow() helpers have
also been imported.

Suggested-by: Martin KaFai Lau
Signed-off-by: Geliang Tang
Reviewed-by: Matthieu Baerts (NGI0)
Signed-off-by: Matthieu Baerts (NGI0)
---
Notes:
 - v5: new patch, instead of using 'ss' in the following patch
 - v7: use 'can_loop' instead of 'cond_break'.
   (Martin)
---
 MAINTAINERS                                       |  2 +-
 tools/testing/selftests/bpf/progs/mptcp_bpf.h     | 42 ++++++++++++++
 tools/testing/selftests/bpf/progs/mptcp_subflow.c | 69 +++++++++++++++++++++++
 3 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3bce6cc05553dad53db5f06d36e6957061886dd0..8817aa26b2fc0ba3581576d040f5093124cc60a7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16097,7 +16097,7 @@ F:	include/net/mptcp.h
 F:	include/trace/events/mptcp.h
 F:	include/uapi/linux/mptcp*.h
 F:	net/mptcp/
-F:	tools/testing/selftests/bpf/*/*mptcp*.c
+F:	tools/testing/selftests/bpf/*/*mptcp*.[ch]
 F:	tools/testing/selftests/net/mptcp/
 
 NETWORKING [TCP]
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf.h b/tools/testing/selftests/bpf/progs/mptcp_bpf.h
new file mode 100644
index 0000000000000000000000000000000000000000..3b188ccdcc4041acb4f7ed38ae8ddf5a7305466a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+#ifndef __MPTCP_BPF_H__
+#define __MPTCP_BPF_H__
+
+#include "bpf_experimental.h"
+
+/* list helpers from include/linux/list.h */
+static inline int list_is_head(const struct list_head *list,
+			       const struct list_head *head)
+{
+	return list == head;
+}
+
+#define list_entry(ptr, type, member)					\
+	container_of(ptr, type, member)
+
+#define list_first_entry(ptr, type, member)				\
+	list_entry((ptr)->next, type, member)
+
+#define list_next_entry(pos, member)					\
+	list_entry((pos)->member.next, typeof(*(pos)), member)
+
+#define list_entry_is_head(pos, head, member)				\
+	list_is_head(&pos->member, (head))
+
+/* small difference: 'can_loop' has been added in the conditions */
+#define list_for_each_entry(pos, head, member)				\
+	for (pos = list_first_entry(head, typeof(*pos), member);	\
+	     !list_entry_is_head(pos, head, member) && can_loop;	\
+	     pos = list_next_entry(pos, member))
+
+/* mptcp helpers from protocol.h */
+#define mptcp_for_each_subflow(__msk, __subflow)			\
+	list_for_each_entry(__subflow, &((__msk)->conn_list), node)
+
+static __always_inline struct sock *
+mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow)
+{
+	return subflow->tcp_sock;
+}
+
+#endif
diff --git a/tools/testing/selftests/bpf/progs/mptcp_subflow.c b/tools/testing/selftests/bpf/progs/mptcp_subflow.c
index 2e28f4a215b5469fcbc31168071887687ca34792..70302477e326eecaef6aad4ecf899aa3d6606f23 100644
--- a/tools/testing/selftests/bpf/progs/mptcp_subflow.c
+++ b/tools/testing/selftests/bpf/progs/mptcp_subflow.c
@@ -4,10 +4,12 @@
 
 /* vmlinux.h, bpf_helpers.h and other 'define' */
 #include "bpf_tracing_net.h"
+#include "mptcp_bpf.h"
 
 char _license[] SEC("license") = "GPL";
 
 char cc[TCP_CA_NAME_MAX] = "reno";
+int pid;
 
 /* Associate a subflow counter to each token */
 struct {
@@ -57,3 +59,70 @@ int mptcp_subflow(struct bpf_sock_ops *skops)
 
 	return 1;
 }
+
+static int _check_getsockopt_subflow_mark(struct mptcp_sock *msk, struct bpf_sockopt *ctx)
+{
+	struct mptcp_subflow_context *subflow;
+	int i = 0;
+
+	mptcp_for_each_subflow(msk, subflow) {
+		struct sock *ssk;
+
+		ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow,
+							   struct mptcp_subflow_context));
+
+		if (ssk->sk_mark != ++i) {
+			ctx->retval = -2;
+			break;
+		}
+	}
+
+	return 1;
+}
+
+static int _check_getsockopt_subflow_cc(struct mptcp_sock *msk, struct bpf_sockopt *ctx)
+{
+	struct mptcp_subflow_context *subflow;
+
+	mptcp_for_each_subflow(msk, subflow) {
+		struct inet_connection_sock *icsk;
+		struct sock *ssk;
+
+		ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow,
+							   struct mptcp_subflow_context));
+		icsk = bpf_core_cast(ssk, struct inet_connection_sock);
+
+		if (ssk->sk_mark == 2 &&
+		    __builtin_memcmp(icsk->icsk_ca_ops->name, cc, TCP_CA_NAME_MAX)) {
+			ctx->retval = -2;
+			break;
+		}
+	}
+
+	return 1;
+}
+
+SEC("cgroup/getsockopt")
+int _getsockopt_subflow(struct bpf_sockopt *ctx)
+{
+	struct bpf_sock *sk = ctx->sk;
+	struct mptcp_sock *msk;
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 1;
+
+	if (!sk || sk->protocol != IPPROTO_MPTCP ||
+	    (!(ctx->level == SOL_SOCKET && ctx->optname == SO_MARK) &&
+	     !(ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION)))
+		return 1;
+
+	msk = bpf_core_cast(sk, struct mptcp_sock);
+	if (msk->pm.subflows != 1) {
+		ctx->retval = -1;
+		return 1;
+	}
+
+	if (ctx->optname == SO_MARK)
+		return _check_getsockopt_subflow_mark(msk, ctx);
+	return _check_getsockopt_subflow_cc(msk, ctx);
+}

From patchwork Thu Sep 26 17:30:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Matthieu Baerts (NGI0)"
X-Patchwork-Id: 13813544
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 40F8E166F06; Thu, 26 Sep 2024 17:30:50 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727371850; cv=none; b=EK3SvyA8lDE5PejNifoZlwcPhXw951EAghdJrHp2w9RacpXGjl8qgmvlmuarCdXxWNYgStphc8Is5IxGP8dC6O7JjZmU7+5WbCXBKO2L3Qj4rDaIFo250y524JBpPUvA2usMEZpDS4Q0acvvsgtrQrpH+efxc93keLSf3mWmu5c=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727371850; c=relaxed/simple; bh=xuqwGLHjdr30dQ1lPi12br1Zy+HpIC4MWNhfUuOvRbQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Ft3HiNCcVBZQw8wmG1smQX9MSeCd6jHtig5df8oPZKxvdwSca8+k0tKANtlQilkfU3Nfv3ttBmyzKilCZlBPOCPTTL45m17fqj2wzmFhnmn9fF7YkymLJ31TeFkwlAnXeleSurJ4qk2YOO6lyW3M4N9SvtgJLi05R6DycBNDbDE=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=XuCF6/8c; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key)
header.d=kernel.org header.i=@kernel.org header.b="XuCF6/8c" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D1ED4C4CECF; Thu, 26 Sep 2024 17:30:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1727371850; bh=xuqwGLHjdr30dQ1lPi12br1Zy+HpIC4MWNhfUuOvRbQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=XuCF6/8csFYR+p4s+T13eTD0qj7XcQO3vN0Q7Pb0m7ggcUPa3/q93o63VTnceYYFx jT/Rej/SAbHsnlAx2ROmTo/7kdmf7JALbsuPlDJ8gzdBtU5EfpgG6FG2Nglo+L5dbj 2pMT+NrFyFr/QU5pZcNrRc+Zr04mYT11KFNCpLt+wEWql//bOLRCgbkif1IiDRQ1NS h7hzTYMcTcEqh9ZMoYk6vDsLw0ZZtcq85KXDHO/lEngIXEZJ9fI+pLfpOKLtyYAONS 0jXDD0BqtgSjDOKNDhp/HgKkEx/51c/R4bET+8CMnw+jVvHvGpmXt7eVz7LkEe2rew 9paTmqoJl61Ow== From: "Matthieu Baerts (NGI0)" Date: Thu, 26 Sep 2024 19:30:24 +0200 Subject: [PATCH bpf-next/net v7 3/3] selftests/bpf: Add mptcp subflow subtest Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-3-d26029e15cdd@kernel.org> References: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-0-d26029e15cdd@kernel.org> In-Reply-To: <20240926-upstream-bpf-next-20240506-mptcp-subflow-test-v7-0-d26029e15cdd@kernel.org> To: mptcp@lists.linux.dev, Mat Martineau , Geliang Tang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Shuah Khan Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, "Matthieu Baerts (NGI0)" , Geliang Tang X-Mailer: b4 0.14.2 X-Developer-Signature: v=1; a=openpgp-sha256; l=6499; i=matttbe@kernel.org; h=from:subject:message-id; bh=amjFCzfj2zTqypTjATD/9IUdR8Z/IjIKIf9BGZz5Jbw=; b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBm9Zo260GeawutDlv38rtlu+FTaCTBSDzsrUF4C 
egYQLmfs3CJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZvWaNgAKCRD2t4JPQmmg c8GWD/9jS3jl5phtJkZQFpEA4U0Cox68GLXuqP5WEHmhngkXVMG2eThOdx9RosWAaAp2mIqZihT 0M4I9909JHgumQDTCI3vAWcmXQmP1nYIEG9KiHDdKmwfJrR8HldEJ9G1TSUTyHwc8Fo+mnM2zx2 kaeS9MTqzLDetFTNyACCS/3XmZTbZQXr8fx1TaOzfXMiB4mmVG9dCqdbvPovGavdbFSvLpe5PXv aJ2yNASSEwGa9zrUevHTdDcc5JIf++AokonAGeZX5i1medODIc1GuNnmbAdSqTZwf6WEoNEJNTM 40Qghw9cEZrlK+1081c8X/9z7wRBEjEQpurIPdvA/VrsEJ8LZZiQSojd1PXtdVRO+ZS5EWHV5Mx o0asTgEG7UNNvy2DGOSt39InE/pf7duZ6kYS9OORKSSv2N70z80t/Na6zkiuwdW3FTckPc4CNRV Zal9TQ1i+L0B8QFke6fdks1qXNUrHGxoe4sfBJihBnSb9F+fzipljDysVr1cpznE2raFbBp+r1O QlUPcIplnkFjbDmbJYJL0eOhNB1YBw0g4bVW5TL0if1WqrGb9WmpGZ1zIs/4wf3AMEkmYECWgJb 16AjgMAoQkYAFbeWU1qzMYgZULMvzk+/sL6InO5f28JVHbz/cvw1y0FTlrkKtW9DrNF06HlRS0/ NtCktoFWGIQvD0A==
X-Developer-Key: i=matttbe@kernel.org; a=openpgp; fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073

From: Geliang Tang

This patch adds a subtest named test_subflow in test_mptcp to load and
verify the newly added MPTCP subflow BPF program. The goal is to make
sure it is possible to set different socket options per subflow, while
the userspace socket interface only lets the application set the same
socket options for the whole MPTCP connection and its multiple subflows.

To check that, a client and a server are started in a dedicated netns,
with veth interfaces to simulate multiple paths. They will exchange data
to allow the creation of an additional subflow.

When the different subflows are being created, the new MPTCP subflow BPF
program will set some socket options: marks and TCP CC. The validation
is done by the same program, when the userspace checks the value of the
modified socket options. On the userspace side, it will see that the
default values are still being used on the MPTCP connection, while the
BPF program will see different options set per subflow of the same MPTCP
connection.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76
Signed-off-by: Geliang Tang
Reviewed-by: Mat Martineau
Signed-off-by: Matthieu Baerts (NGI0)
---
Notes:
 - v2 -> v3:
   - Use './mptcp_pm_nl_ctl' instead of 'ip mptcp', not supported by the
     BPF CI running IPRoute 5.5.0.
   - Use SYS_NOFAIL() in _ss_search() instead of calling system()
 - v3 -> v4:
   - Drop './mptcp_pm_nl_ctl', but skip this new test if 'ip mptcp' is
     not supported.
 - v4 -> v5:
   - Note that this new test is no longer skipped on the BPF CI, because
     'ip mptcp' is now supported after the switch from Ubuntu 20.04 to
     22.04.
   - Update the commit message, reflecting the latest version.
   - The validations are no longer done using 'ss', but using the new
     BPF program added in the previous patch, to reduce the use of
     external dependencies. (Martin)
 - v5 -> v6:
   - Use usleep() instead of sleep().
 - v6 -> v7:
   - Drop mptcp_subflow__attach(), use bpf_program__attach_cgroup()
     instead of bpf_prog_attach(), plus assign the returned value to
     skel->links.* directly.
     (Martin)
---
 tools/testing/selftests/bpf/prog_tests/mptcp.c | 121 +++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c
index d2ca32fa3b21e686d6ef2673b5953d5417edfedb..b61f26b8cdf2540a34e28ddb8a5f1f2378cf8c06 100644
--- a/tools/testing/selftests/bpf/prog_tests/mptcp.c
+++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c
@@ -5,12 +5,17 @@
 #include
 #include
 #include
+#include
 #include "cgroup_helpers.h"
 #include "network_helpers.h"
 #include "mptcp_sock.skel.h"
 #include "mptcpify.skel.h"
+#include "mptcp_subflow.skel.h"
 
 #define NS_TEST "mptcp_ns"
+#define ADDR_1 "10.0.1.1"
+#define ADDR_2 "10.0.1.2"
+#define PORT_1 10001
 
 #ifndef IPPROTO_MPTCP
 #define IPPROTO_MPTCP 262
@@ -335,10 +340,126 @@ static void test_mptcpify(void)
 	close(cgroup_fd);
 }
 
+static int endpoint_init(char *flags)
+{
+	SYS(fail, "ip -net %s link add veth1 type veth peer name veth2", NS_TEST);
+	SYS(fail, "ip -net %s addr add %s/24 dev veth1", NS_TEST, ADDR_1);
+	SYS(fail, "ip -net %s link set dev veth1 up", NS_TEST);
+	SYS(fail, "ip -net %s addr add %s/24 dev veth2", NS_TEST, ADDR_2);
+	SYS(fail, "ip -net %s link set dev veth2 up", NS_TEST);
+	if (SYS_NOFAIL("ip -net %s mptcp endpoint add %s %s", NS_TEST, ADDR_2, flags)) {
+		printf("'ip mptcp' not supported, skip this test.\n");
+		test__skip();
+		goto fail;
+	}
+
+	return 0;
+fail:
+	return -1;
+}
+
+static void wait_for_new_subflows(int fd)
+{
+	socklen_t len;
+	u8 subflows;
+	int err, i;
+
+	len = sizeof(subflows);
+	/* Wait max 1 sec for new subflows to be created */
+	for (i = 0; i < 10; i++) {
+		err = getsockopt(fd, SOL_MPTCP, MPTCP_INFO, &subflows, &len);
+		if (!err && subflows > 0)
+			break;
+
+		usleep(100000); /* 0.1s */
+	}
+}
+
+static void run_subflow(void)
+{
+	int server_fd, client_fd, err;
+	char new[TCP_CA_NAME_MAX];
+	char cc[TCP_CA_NAME_MAX];
+	unsigned int mark;
+	socklen_t len;
+
+	server_fd = start_mptcp_server(AF_INET, ADDR_1, PORT_1, 0);
+	if (!ASSERT_OK_FD(server_fd, "start_mptcp_server"))
+		return;
+
+	client_fd = connect_to_fd(server_fd, 0);
+	if (!ASSERT_OK_FD(client_fd, "connect_to_fd"))
+		goto close_server;
+
+	send_byte(client_fd);
+	wait_for_new_subflows(client_fd);
+
+	len = sizeof(mark);
+	err = getsockopt(client_fd, SOL_SOCKET, SO_MARK, &mark, &len);
+	if (ASSERT_OK(err, "getsockopt(client_fd, SO_MARK)"))
+		ASSERT_EQ(mark, 0, "mark");
+
+	len = sizeof(new);
+	err = getsockopt(client_fd, SOL_TCP, TCP_CONGESTION, new, &len);
+	if (ASSERT_OK(err, "getsockopt(client_fd, TCP_CONGESTION)")) {
+		get_msk_ca_name(cc);
+		ASSERT_STREQ(new, cc, "cc");
+	}
+
+	close(client_fd);
+close_server:
+	close(server_fd);
+}
+
+static void test_subflow(void)
+{
+	struct mptcp_subflow *skel;
+	struct nstoken *nstoken;
+	int cgroup_fd;
+
+	cgroup_fd = test__join_cgroup("/mptcp_subflow");
+	if (!ASSERT_OK_FD(cgroup_fd, "join_cgroup: mptcp_subflow"))
+		return;
+
+	skel = mptcp_subflow__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open_load: mptcp_subflow"))
+		goto close_cgroup;
+
+	skel->bss->pid = getpid();
+
+	skel->links.mptcp_subflow =
+		bpf_program__attach_cgroup(skel->progs.mptcp_subflow, cgroup_fd);
+	if (!ASSERT_OK_PTR(skel->links.mptcp_subflow, "attach mptcp_subflow"))
+		goto skel_destroy;
+
+	skel->links._getsockopt_subflow =
+		bpf_program__attach_cgroup(skel->progs._getsockopt_subflow, cgroup_fd);
+	if (!ASSERT_OK_PTR(skel->links._getsockopt_subflow, "attach _getsockopt_subflow"))
+		goto skel_destroy;
+
+	nstoken = create_netns();
+	if (!ASSERT_OK_PTR(nstoken, "create_netns: mptcp_subflow"))
+		goto skel_destroy;
+
+	if (endpoint_init("subflow") < 0)
+		goto close_netns;
+
+	run_subflow();
+
+close_netns:
+	cleanup_netns(nstoken);
+skel_destroy:
+	mptcp_subflow__destroy(skel);
+close_cgroup:
+	close(cgroup_fd);
+}
+
 void test_mptcp(void)
 {
 	if (test__start_subtest("base"))
 		test_base();
 	if (test__start_subtest("mptcpify"))
 		test_mptcpify();
+	if (test__start_subtest("subflow"))
+		test_subflow();
 }