From patchwork Tue Sep 10 14:12:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Matthieu Baerts (NGI0)" X-Patchwork-Id: 13798592 X-Patchwork-Delegate: matthieu.baerts@tessares.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5EBEC188CBB; Tue, 10 Sep 2024 14:13:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725977606; cv=none; b=I4th1KCfX1HRVs5fxZXI6qlOa3A468ELfuBYQDCSgiQrA9OAM/+O3EoSK14KN4VEsiQo3qKCvkXyz1CSb/j1/viEzStNkFw6GYP5/hTogsjcpGWfXQsimjBHcEH0d00YeybCcQ03EENhuL4zh/YkfonzeUzG/SChlQyED4v8eKg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725977606; c=relaxed/simple; bh=hWl57FHY2fOdoggMMQbYWWGMnt4b33OvRJbD5uuAWPQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=GUiXSMUPt8XidyV3mn+ZAdRdRwTbI0s3fPhjgzDhA16monBb3Nb2ccvtxijPwxaeg5lmI7pAQjadW2D2SAh2/HTE99KQfby8hqCWT1muLTNjGy9f26RVXafFkPnBjtVAhXY/KrD/8ctFyMMyts7Ph9ihfaldI4d8oif/Dgjy6uI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IgBirhhN; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IgBirhhN" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8EDB9C4CED1; Tue, 10 Sep 2024 14:13:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1725977606; bh=hWl57FHY2fOdoggMMQbYWWGMnt4b33OvRJbD5uuAWPQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=IgBirhhNDnr0wa9pmkekuaGVxwMzodpWsGehEC9qffQv+Ps/3zJJlN40itLpzyDLb p1JTUlbNtPqYCi+I82F6fty5FKOihR0LuBCdcgdNuJFKJYDQguUnN8RzxR3PpSEIGH JT9VAQuYmDQbxHOLLA8HArKaJ77fIIG2WIq7IUgrZhbXyVBwkhIpKfJ0k8qjY5iXtJ 4tmaVXOtEcMRXuy2ZlAjlRz3NkQHE6is5INMjgLFEGSAxepoHAP7MFaNM1+zf82Bat lFA8m7lBSHU0ULoK8OvmXHUHHhJGJlqWLv4iHLSAYjy1iNGk10gRRMmkfYAMn/2UOo zTAVDFKyv7jBA== From: "Matthieu Baerts (NGI0)" Date: Tue, 10 Sep 2024 16:12:59 +0200 Subject: [PATCH bpf-next/net v5 1/3] selftests/bpf: Add mptcp subflow example Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-1-2c664a7da47c@kernel.org> References: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-0-2c664a7da47c@kernel.org> In-Reply-To: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-0-2c664a7da47c@kernel.org> To: mptcp@lists.linux.dev, Mat Martineau , Geliang Tang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Shuah Khan Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, "Matthieu Baerts (NGI0)" , Nicolas Rybowski , Geliang Tang X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=4445; i=matttbe@kernel.org; h=from:subject:message-id; bh=98MxbfTkgjSKHUOjjCbtRcgcjpjaNf9HbvOObpvAYhI=; b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBm4FP6OAgCr/76qaOirF0W9fH+xWuZwyjL1P1Di kaSlJekI1mJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZuBT+gAKCRD2t4JPQmmg c/YyEAC4I+ya+HE9RLuO9qLM8XrXnxmGObKEYeJJdZXW7utQnr784IXI//tGO6gyOXEBUfAcmOa bkohR1J7mrIMLfiAG30A4ZV0tQSFWXV3ruHZInBxkKula3Vm3wuRvIoqr2UZufVBBKBuBtEvTrR XAmvPRZiML4/zPg45m2P0DfAuHKKEwxd3hLoqTu9ahzYAQEqDsPvVdFCwjAxZTum1060WnhpGzf hjEJvK/1xI0iyvkf54OUQc8O2sHfXkSYrU8zNYBx+UhyTj3/c6wdkZevkeFGi02dvJdYQ5J+Km8 MPar6jDk+Hz1Z56Ggz8gHuDelQQ0BWmm6XJP9zgj6B0zl8DZZqVNjGrMvH2Xflw8XrRXo2a+P5u sNZKxK43Y5nw7vQmHzJ5ehcRA5pfbKV84MH9wVddgzUaLYYS0sE9KD5pmjdYRHDpX9J+oZCiLYh 03uD0h0PIOiBrX9YO0oVRxC7Dr2IxbSz+N5Vs0C652f8ya23LUfxbYvWzagxdios7bg+i7nEcT4 +qtu0Utwq2Ns/ccOakO087fUYvOjkdc+foCBbsQR0YBa8/F5BEeXZE9oVO2n9axEFDo5CCQsFfK Wf4Vt4862yfZ1u1rmLDf8yOHTyi4T1fjuSr/ihqgbttlj/fwW/j64ZKp/jjB+fjN22a53r7OjhJ ZnF1I0cLKPYRi+A== X-Developer-Key: i=matttbe@kernel.org; a=openpgp; fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073 From: Nicolas Rybowski Move Nicolas' patch into bpf selftests directory. This example adds a different mark (SO_MARK) on each subflow, and changes the TCP CC only on the first subflow. From the userspace, an application can do a setsockopt() on an MPTCP socket, and typically the same value will be propagated to all subflows (paths). If someone wants to have different values per subflow, the recommended way is to use BPF. So it is good to add such example here, and make sure there is no regressions. This example shows how it is possible to: Identify the parent msk of an MPTCP subflow. Put different sockopt for each subflow of a same MPTCP connection. Here especially, two different behaviours are implemented: A socket mark (SOL_SOCKET SO_MARK) is put on each subflow of a same MPTCP connection. The order of creation of the current subflow defines its mark. The TCP CC algorithm of the very first subflow of an MPTCP connection is set to "reno". This is just to show it is possible to identify an MPTCP connection, and set socket options, from different SOL levels, per subflow. "reno" has been picked because it is built-in and usually not set as default one. It is easy to verify with 'ss' that these modifications have been applied correctly. That's what the next patch is going to do. Nicolas' code comes from: commit 4d120186e4d6 ("bpf:examples: update mptcp_set_mark_kern.c") from the MPTCP repo https://github.com/multipath-tcp/mptcp_net-next (the "scripts" branch), and it has been adapted by Geliang. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76 Co-developed-by: Geliang Tang Signed-off-by: Geliang Tang Signed-off-by: Nicolas Rybowski Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts (NGI0) --- Notes: - v1 -> v2: - The commit message has been updated: why setting multiple socket options, why reno, the verification is done in a later patch (different author). (Alexei) - v2 -> v3: - Only #include "bpf_tracing_net.h", linked to: https://lore.kernel.org/20240509175026.3423614-1-martin.lau@linux.dev - v4 -> v5: - Set reno as TCP cc on the second subflow, not to influence the getsockopt() done from the userspace, which will return the one from the first subflow, the default TCP cc then, not the modified one. --- tools/testing/selftests/bpf/progs/mptcp_subflow.c | 59 +++++++++++++++++++++++ 1 file changed, 59 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/mptcp_subflow.c b/tools/testing/selftests/bpf/progs/mptcp_subflow.c new file mode 100644 index 000000000000..2e28f4a215b5 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/mptcp_subflow.c @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020, Tessares SA. */ +/* Copyright (c) 2024, Kylin Software */ + +/* vmlinux.h, bpf_helpers.h and other 'define' */ +#include "bpf_tracing_net.h" + +char _license[] SEC("license") = "GPL"; + +char cc[TCP_CA_NAME_MAX] = "reno"; + +/* Associate a subflow counter to each token */ +struct { + __uint(type, BPF_MAP_TYPE_HASH); + __uint(key_size, sizeof(__u32)); + __uint(value_size, sizeof(__u32)); + __uint(max_entries, 100); +} mptcp_sf SEC(".maps"); + +SEC("sockops") +int mptcp_subflow(struct bpf_sock_ops *skops) +{ + __u32 init = 1, key, mark, *cnt; + struct mptcp_sock *msk; + struct bpf_sock *sk; + int err; + + if (skops->op != BPF_SOCK_OPS_TCP_CONNECT_CB) + return 1; + + sk = skops->sk; + if (!sk) + return 1; + + msk = bpf_skc_to_mptcp_sock(sk); + if (!msk) + return 1; + + key = msk->token; + cnt = bpf_map_lookup_elem(&mptcp_sf, &key); + if (cnt) { + /* A new subflow is added to an existing MPTCP connection */ + __sync_fetch_and_add(cnt, 1); + mark = *cnt; + } else { + /* A new MPTCP connection is just initiated and this is its primary subflow */ + bpf_map_update_elem(&mptcp_sf, &key, &init, BPF_ANY); + mark = init; + } + + /* Set the mark of the subflow's socket based on appearance order */ + err = bpf_setsockopt(skops, SOL_SOCKET, SO_MARK, &mark, sizeof(mark)); + if (err < 0) + return 1; + if (mark == 2) + err = bpf_setsockopt(skops, SOL_TCP, TCP_CONGESTION, cc, TCP_CA_NAME_MAX); + + return 1; +} From patchwork Tue Sep 10 14:13:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Matthieu Baerts (NGI0)" X-Patchwork-Id: 13798593 X-Patchwork-Delegate: matthieu.baerts@tessares.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2FF36188CBB; Tue, 10 Sep 2024 14:13:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725977611; cv=none; b=Gc65cN+VaFMLTpXVhZBsWjtmoqraJ/FrBYNAEpDqJHavqcUXc+uHUA6T20ZY25SlHnaLfC68RCa+Rdd6OaDv+krOCevlxyUkAC9+wFbyOnwpOkhW6Lpu26nJ6VGUaaXm6Kt+fBBI97fKLDm0ynw4qpmWJLShKWnPT9JZnEoiUQ8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725977611; c=relaxed/simple; bh=Iv5TZEml/H6t7GaDZCEGZTMq7nkqIaIE/iw0KKiXtKo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=piWq95QVZwdGkHmyUbaiIHgWkTdkxpHeoyKIajrfO/LAjEY7uSAb/g7RO78OMExR4gvjdRYfhVpkANbVbDX3sdCIdS6MaAr2P2lYGEZw3qpLZ+VonC4Ek2L4fP57wUSoez84xyyJA6cqiu7fIXnNpbAbAEE5HrjWNe2vlgyyDgc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=jS2b/pyC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jS2b/pyC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81C9FC4CED2; Tue, 10 Sep 2024 14:13:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1725977611; bh=Iv5TZEml/H6t7GaDZCEGZTMq7nkqIaIE/iw0KKiXtKo=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=jS2b/pyCzUcgYPyHpg3WdNpOIRrXKr+rtT1OJwtziK8/h68Cg2PQu2NEgF8JNEnY+ ++jhom7vQSGIaLiA6EpIqG7tIT1n6hBD4I475KErWBLV39p10M1Lnfq10rkS70+R4m q/UDlRVWUItQeAoCzmvbYZghyFeOxHNmqzkVcRlxt3ckvr24wKOIkPMFZGIRtoI6kp k3EI7UEkGVzpOEGZfvVxEsvw4LGNG5+VmlJ3IHZIRo8S104yIhifqQavKTgoCrzMOn IbJYOU1daXp7ZTYDx2EpEkf4cgQp+8uBfMoCgseg5/ePqnGOrdTwiHEL9WQQfhi29I 8xD92KaUQ1iuw== From: "Matthieu Baerts (NGI0)" Date: Tue, 10 Sep 2024 16:13:00 +0200 Subject: [PATCH bpf-next/net v5 2/3] selftests/bpf: Add getsockopt to inspect mptcp subflow Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-2-2c664a7da47c@kernel.org> References: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-0-2c664a7da47c@kernel.org> In-Reply-To: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-0-2c664a7da47c@kernel.org> To: mptcp@lists.linux.dev, Mat Martineau , Geliang Tang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Shuah Khan Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, "Matthieu Baerts (NGI0)" , Martin KaFai Lau , Geliang Tang X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=6005; i=matttbe@kernel.org; h=from:subject:message-id; bh=FDhxqzjJz67+rEBh+dkc4FV5Kx80WdtuXY//a+1arxA=; b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBm4FP6PWb5L7K5OOzY7au9d3x/3Hvk2v60gTqqY Bv9MPDaWBeJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZuBT+gAKCRD2t4JPQmmg czZVD/9qboRpOedgzfsOTqY1wEOI8cgvLCZiKvxXfdhlDVNstUtgNLU7dXTB7J4g+at6SnS1xGz 8m36NXSobVs1bQRhlZX6tyWF6ng8x+LQQWvzOzOq9LMY2P7QCny5aTMdJzjdJX3L5aDVSowGe6/ Lrs8BVyqLX62rkiouMLxCSX3RIkxNykQ0h4y1Rne+RS0JYUuKZl2wnzUi0DxK4KuuNqpkSYYD/S OW2rBrvecA85G/yMG2eWlrgGWknze0MlPcbG+WXwmKFBFPIjMdD6PRZ0cWCE6l+97mx6hkjhCfF rBU5ZOqNUxs3CJKO+5J7hp1mmScwMnCsnHnbHK7PGBYksfzjS94pszRX6K1U7AlyZDLTUNhpY4v B4ovQKcr9zAseXa+lFLLh7peTlfFqbCf+kq6qDUCCKJl8wJWOUlJ28miU6j+iXJCe/bohOTDU7e DZk4SbsFS1I4o359JJV3Avz7kj/ziWV/fBdF/M++bxR45auCHDkmAVpl63a/9zCiOKDLN7Ul0Hv ZGswauwSTyJaTG7/UJ1pWe4cMVrNXfjiqs89vG3hdwbQpA17MtV8tkmOUKC3rq8Blw9Hs7iWDod omawYdkU73hAc64BQ2FlCxvagm8pvpRz+Hd8hq533K1zDBekUxqhLIUEP1bdsCHFCGv6DTUrYvn mR2TS+LJkcn9Jrw== X-Developer-Key: i=matttbe@kernel.org; a=openpgp; fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073 From: Geliang Tang This patch adds a "cgroup/getsockopt" way to inspect the subflows of an MPTCP socket, and verify the modifications done by the same BPF program in the previous commit: a different mark per subflow, and a different TCP CC set on the second one. This new hook will be used by the next commit to verify the socket options set on each subflow. This extra "cgroup/getsockopt" prog walks the msk->conn_list and use bpf_core_cast to cast a pointer for readonly. It allows to inspect all the fields of a structure. Note that on the kernel side, the MPTCP socket stores a list of subflows under 'msk->conn_list'. They can be iterated using the generic 'list' helpers. They have been imported here, with a small difference: list_for_each_entry() uses 'cond_break' to limit the number of iterations, and ease its use. Because only data need to be read here, it is enough to use this technique. It is planned to use bpf_iter, when BPF programs will be used to modify data from the different subflows. mptcp_subflow_tcp_sock() and mptcp_for_each_stubflow() helpers have also be imported. Suggested-by: Martin KaFai Lau Signed-off-by: Geliang Tang Reviewed-by: Matthieu Baerts (NGI0) Signed-off-by: Matthieu Baerts (NGI0) --- Notes: - v5: new patch, instead of using 'ss' in the following patch --- MAINTAINERS | 2 +- tools/testing/selftests/bpf/progs/mptcp_bpf.h | 42 ++++++++++++++ tools/testing/selftests/bpf/progs/mptcp_subflow.c | 69 +++++++++++++++++++++++ 3 files changed, 112 insertions(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 30a9b9450e11..2674b86209d1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -16058,7 +16058,7 @@ F: include/net/mptcp.h F: include/trace/events/mptcp.h F: include/uapi/linux/mptcp*.h F: net/mptcp/ -F: tools/testing/selftests/bpf/*/*mptcp*.c +F: tools/testing/selftests/bpf/*/*mptcp*.[ch] F: tools/testing/selftests/net/mptcp/ NETWORKING [TCP] diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf.h b/tools/testing/selftests/bpf/progs/mptcp_bpf.h new file mode 100644 index 000000000000..179b74c1205f --- /dev/null +++ b/tools/testing/selftests/bpf/progs/mptcp_bpf.h @@ -0,0 +1,42 @@ +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ +#ifndef __MPTCP_BPF_H__ +#define __MPTCP_BPF_H__ + +#include "bpf_experimental.h" + +/* list helpers from include/linux/list.h */ +static inline int list_is_head(const struct list_head *list, + const struct list_head *head) +{ + return list == head; +} + +#define list_entry(ptr, type, member) \ + container_of(ptr, type, member) + +#define list_first_entry(ptr, type, member) \ + list_entry((ptr)->next, type, member) + +#define list_next_entry(pos, member) \ + list_entry((pos)->member.next, typeof(*(pos)), member) + +#define list_entry_is_head(pos, head, member) \ + list_is_head(&pos->member, (head)) + +/* small difference: 'cond_break' has been added in the conditions */ +#define list_for_each_entry(pos, head, member) \ + for (pos = list_first_entry(head, typeof(*pos), member); \ + cond_break, !list_entry_is_head(pos, head, member); \ + pos = list_next_entry(pos, member)) + +/* mptcp helpers from protocol.h */ +#define mptcp_for_each_subflow(__msk, __subflow) \ + list_for_each_entry(__subflow, &((__msk)->conn_list), node) + +static __always_inline struct sock * +mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow) +{ + return subflow->tcp_sock; +} + +#endif diff --git a/tools/testing/selftests/bpf/progs/mptcp_subflow.c b/tools/testing/selftests/bpf/progs/mptcp_subflow.c index 2e28f4a215b5..70302477e326 100644 --- a/tools/testing/selftests/bpf/progs/mptcp_subflow.c +++ b/tools/testing/selftests/bpf/progs/mptcp_subflow.c @@ -4,10 +4,12 @@ /* vmlinux.h, bpf_helpers.h and other 'define' */ #include "bpf_tracing_net.h" +#include "mptcp_bpf.h" char _license[] SEC("license") = "GPL"; char cc[TCP_CA_NAME_MAX] = "reno"; +int pid; /* Associate a subflow counter to each token */ struct { @@ -57,3 +59,70 @@ int mptcp_subflow(struct bpf_sock_ops *skops) return 1; } + +static int _check_getsockopt_subflow_mark(struct mptcp_sock *msk, struct bpf_sockopt *ctx) +{ + struct mptcp_subflow_context *subflow; + int i = 0; + + mptcp_for_each_subflow(msk, subflow) { + struct sock *ssk; + + ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow, + struct mptcp_subflow_context)); + + if (ssk->sk_mark != ++i) { + ctx->retval = -2; + break; + } + } + + return 1; +} + +static int _check_getsockopt_subflow_cc(struct mptcp_sock *msk, struct bpf_sockopt *ctx) +{ + struct mptcp_subflow_context *subflow; + + mptcp_for_each_subflow(msk, subflow) { + struct inet_connection_sock *icsk; + struct sock *ssk; + + ssk = mptcp_subflow_tcp_sock(bpf_core_cast(subflow, + struct mptcp_subflow_context)); + icsk = bpf_core_cast(ssk, struct inet_connection_sock); + + if (ssk->sk_mark == 2 && + __builtin_memcmp(icsk->icsk_ca_ops->name, cc, TCP_CA_NAME_MAX)) { + ctx->retval = -2; + break; + } + } + + return 1; +} + +SEC("cgroup/getsockopt") +int _getsockopt_subflow(struct bpf_sockopt *ctx) +{ + struct bpf_sock *sk = ctx->sk; + struct mptcp_sock *msk; + + if (bpf_get_current_pid_tgid() >> 32 != pid) + return 1; + + if (!sk || sk->protocol != IPPROTO_MPTCP || + (!(ctx->level == SOL_SOCKET && ctx->optname == SO_MARK) && + !(ctx->level == SOL_TCP && ctx->optname == TCP_CONGESTION))) + return 1; + + msk = bpf_core_cast(sk, struct mptcp_sock); + if (msk->pm.subflows != 1) { + ctx->retval = -1; + return 1; + } + + if (ctx->optname == SO_MARK) + return _check_getsockopt_subflow_mark(msk, ctx); + return _check_getsockopt_subflow_cc(msk, ctx); +} From patchwork Tue Sep 10 14:13:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Matthieu Baerts (NGI0)" X-Patchwork-Id: 13798594 X-Patchwork-Delegate: matthieu.baerts@tessares.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E3D91188CBB; Tue, 10 Sep 2024 14:13:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725977616; cv=none; b=fzk17dFYxrWm6aZxBLufJaeU5sSCFLF76dqt405VSWHDLil++rbcq9m6EcBzhyM+HfyDSC7/4Tr2wAZ1dsdMngYi9lvA1vfosEOnfrEGWGBxcgIwcB4VgkVA08pnYEOA5zvalRFxsKHiCht9JxAsrr6Z/vNpBnS8gxW0R4VBZo4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1725977616; c=relaxed/simple; bh=zQY9ylJsw5DVKosYUHbe+6rAr4wehAk1+o/s2b7Heqo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Zvnh25J3wNaWNSGq7E4aYIhYHZiMEihPcOxEmIxTlNj4soUIHeXOZYOE1cbbSDqaEbhI+I2v4t4Y22bzct/YFO3UrPVTdj2vWyx9XUkmU3gNTYG00fKHz7fSmV6coIqTEQZ50XlzHfKL/8UH+h1iABx6DjkWaQ4DDzsKeixmYpI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tA3pojio; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tA3pojio" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 776F8C4CED0; Tue, 10 Sep 2024 14:13:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1725977615; bh=zQY9ylJsw5DVKosYUHbe+6rAr4wehAk1+o/s2b7Heqo=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=tA3pojioUVPd7sBSaFU+tYoEOTWRWMUMrb4Qt63RHrqrdHnl7gRBfwl8Do3D1Bjfr W+J6CgEcsnJXUJFpufW9nnQX2A6GtsXYAjAGNbFTg+kSk2cw+lwJPOIjiKIm1tCoN1 RSDQVJ/giR255Ta41PSJHMjEu5QNYFxMqI8sObRCrOU4pxdTCiS3mjhlgS19OJgSy4 riWWqUlgXw3LXY/KdbbuW7/rhmhKxGGnqKZ2n/ma4Fj4vJmrTgnpWbIIYtn25BuZPz uNuxHdhtVYnUuYX84AOCi/V5e9LzHQR9u3h6KMrep9PhHcDZxx0yUlML2j5jhO2ch5 sXtjIWJY5wVSA== From: "Matthieu Baerts (NGI0)" Date: Tue, 10 Sep 2024 16:13:01 +0200 Subject: [PATCH bpf-next/net v5 3/3] selftests/bpf: Add mptcp subflow subtest Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-3-2c664a7da47c@kernel.org> References: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-0-2c664a7da47c@kernel.org> In-Reply-To: <20240910-upstream-bpf-next-20240506-mptcp-subflow-test-v5-0-2c664a7da47c@kernel.org> To: mptcp@lists.linux.dev, Mat Martineau , Geliang Tang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Shuah Khan Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, "Matthieu Baerts (NGI0)" , Geliang Tang X-Mailer: b4 0.14.1 X-Developer-Signature: v=1; a=openpgp-sha256; l=6158; i=matttbe@kernel.org; h=from:subject:message-id; bh=bjjivKOUvp0bVG0rbMsxWryHAGBQZQyRzRlpNeCUmXk=; b=owEBbQKS/ZANAwAIAfa3gk9CaaBzAcsmYgBm4FP6+yFAYgRXp+ZVvgKXqjKPaGUYtTntUQRkz OMrVkkpKjaJAjMEAAEIAB0WIQToy4X3aHcFem4n93r2t4JPQmmgcwUCZuBT+gAKCRD2t4JPQmmg c1mHEADOQYiUZ3/fpHwHr+LkMlmigNvbCZApf3Tyv1xTgIHRS1MaQU9cvQuugsEEreoWAYarSA0 /2FVWIEBMkFPnz8538zaEQwe+IfxgnLZ2gJok8d8qlDxCqQWMz7tnLThF0jEIZs6DZtP6We6Y/+ Bx1IMh9h3yNW7F9qNJdiSO922dgbx0+mrKv377wrkc22p44SwlU99PBFbZUN9cVKYJAIc8Odwxm KwtteegAMvNEszQaMAX3mwO3ruoRn05JXrOB1XAqg8eA59CYcZfLxALRaiJyyvICNAIAaHMt8MY 1wvwJPQTqGlHXZA6jEs99tyYvIPek9nGei9hNjcXDOr+lxQM0Bwa3Y/EF9jk089IIcIqXwqcR5T edNfo5Npd5mXXQa8JQ0u2zI/eFN47ivAyRCjtvUsOM6IxgHEtjnKkpcfJAneFx0Coi2j4AD3zTa da8tX4arNppdy9ytLQZxI+VA2GRs/sE5jMo6vCMKFcD9Nny1GWxsZSLuhdoIVpx1aowc9k+pVYd 6i86Dvp5v7W4fhJikFqToSsh+ZndnD4bd6Yrg/uuTa0YGz6geZCVx8NyxtdNsxdwaU1mHNuJMVq AEACBysJSfwuAL0TlI5W6j0RWQnK4aPXz6JNWA1tPErtYJnb0Nc7WeAG1mOO6zHXdWzYbi6YCD0 8pdLmHcIyFoJpVw== X-Developer-Key: i=matttbe@kernel.org; a=openpgp; fpr=E8CB85F76877057A6E27F77AF6B7824F4269A073 From: Geliang Tang This patch adds a subtest named test_subflow in test_mptcp to load and verify the newly added MPTCP subflow BPF program. To goal is to make sure it is possible to set different socket options per subflows, while the userspace socket interface only lets the application to set the same socket options for the whole MPTCP connection and its multiple subflows. To check that, a client and a server are started in a dedicated netns, with veth interfaces to simulate multiple paths. They will exchange data to allow the creation of an additional subflow. When the different subflows are being created, the new MPTCP subflow BPF program will set some socket options: marks and TCP CC. The validation is done by the same program, when the userspace checks the value of the modified socket options. On the userspace side, it will see that the default values are still being used on the MPTCP connection, while the BPF program will see different options set per subflow of the same MPTCP connection. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/76 Signed-off-by: Geliang Tang Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts (NGI0) --- Notes: - v2 -> v3: - Use './mptcp_pm_nl_ctl' instead of 'ip mptcp', not supported by the BPF CI running IPRoute 5.5.0. - Use SYS_NOFAIL() in _ss_search() instead of calling system() - v3 -> v4: - Drop './mptcp_pm_nl_ctl', but skip this new test if 'ip mptcp' is not supported. - v4 -> v5: - Note that this new test is no longer skipped on the BPF CI, because 'ip mptcp' is now supported after the switch from Ubuntu 20.04 to 22.04. - Update the commit message, reflecting the latest version. - The validations are no longer done using 'ss', but using the new BPF program added in the previous patch, to reduce the use of external dependences. --- tools/testing/selftests/bpf/prog_tests/mptcp.c | 126 +++++++++++++++++++++++++ 1 file changed, 126 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index d2ca32fa3b21..c30f032edaca 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -9,8 +9,12 @@ #include "network_helpers.h" #include "mptcp_sock.skel.h" #include "mptcpify.skel.h" +#include "mptcp_subflow.skel.h" #define NS_TEST "mptcp_ns" +#define ADDR_1 "10.0.1.1" +#define ADDR_2 "10.0.1.2" +#define PORT_1 10001 #ifndef IPPROTO_MPTCP #define IPPROTO_MPTCP 262 @@ -335,10 +339,132 @@ static void test_mptcpify(void) close(cgroup_fd); } +static int endpoint_init(char *flags) +{ + SYS(fail, "ip -net %s link add veth1 type veth peer name veth2", NS_TEST); + SYS(fail, "ip -net %s addr add %s/24 dev veth1", NS_TEST, ADDR_1); + SYS(fail, "ip -net %s link set dev veth1 up", NS_TEST); + SYS(fail, "ip -net %s addr add %s/24 dev veth2", NS_TEST, ADDR_2); + SYS(fail, "ip -net %s link set dev veth2 up", NS_TEST); + if (SYS_NOFAIL("ip -net %s mptcp endpoint add %s %s", NS_TEST, ADDR_2, flags)) { + printf("'ip mptcp' not supported, skip this test.\n"); + test__skip(); + goto fail; + } + + return 0; +fail: + return -1; +} + +static void wait_for_new_subflows(int fd) +{ + socklen_t len; + u8 subflows; + int err, i; + + len = sizeof(subflows); + /* Wait max 1 sec for new subflows to be created */ + for (i = 0; i < 10; i++) { + err = getsockopt(fd, SOL_MPTCP, MPTCP_INFO, &subflows, &len); + if (!err && subflows > 0) + break; + + sleep(0.1); + } +} + +static void run_subflow(void) +{ + int server_fd, client_fd, err; + char new[TCP_CA_NAME_MAX]; + char cc[TCP_CA_NAME_MAX]; + unsigned int mark; + socklen_t len; + + server_fd = start_mptcp_server(AF_INET, ADDR_1, PORT_1, 0); + if (!ASSERT_OK_FD(server_fd, "start_mptcp_server")) + return; + + client_fd = connect_to_fd(server_fd, 0); + if (!ASSERT_OK_FD(client_fd, "connect_to_fd")) + goto close_server; + + send_byte(client_fd); + wait_for_new_subflows(client_fd); + + len = sizeof(mark); + err = getsockopt(client_fd, SOL_SOCKET, SO_MARK, &mark, &len); + if (ASSERT_OK(err, "getsockopt(client_fd, SO_MARK)")) + ASSERT_EQ(mark, 0, "mark"); + + len = sizeof(new); + err = getsockopt(client_fd, SOL_TCP, TCP_CONGESTION, new, &len); + if (ASSERT_OK(err, "getsockopt(client_fd, TCP_CONGESTION)")) { + get_msk_ca_name(cc); + ASSERT_STREQ(new, cc, "cc"); + } + + close(client_fd); +close_server: + close(server_fd); +} + +static void test_subflow(void) +{ + int cgroup_fd, prog_fd, err; + struct mptcp_subflow *skel; + struct nstoken *nstoken; + struct bpf_link *link; + + cgroup_fd = test__join_cgroup("/mptcp_subflow"); + if (!ASSERT_OK_FD(cgroup_fd, "join_cgroup: mptcp_subflow")) + return; + + skel = mptcp_subflow__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_load: mptcp_subflow")) + goto close_cgroup; + + skel->bss->pid = getpid(); + + err = mptcp_subflow__attach(skel); + if (!ASSERT_OK(err, "skel_attach: mptcp_subflow")) + goto skel_destroy; + + prog_fd = bpf_program__fd(skel->progs.mptcp_subflow); + err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0); + if (!ASSERT_OK(err, "prog_attach")) + goto skel_destroy; + + nstoken = create_netns(); + if (!ASSERT_OK_PTR(nstoken, "create_netns: mptcp_subflow")) + goto skel_destroy; + + if (endpoint_init("subflow") < 0) + goto close_netns; + + link = bpf_program__attach_cgroup(skel->progs._getsockopt_subflow, + cgroup_fd); + if (!ASSERT_OK_PTR(link, "getsockopt prog")) + goto close_netns; + + run_subflow(); + + bpf_link__destroy(link); +close_netns: + cleanup_netns(nstoken); +skel_destroy: + mptcp_subflow__destroy(skel); +close_cgroup: + close(cgroup_fd); +} + void test_mptcp(void) { if (test__start_subtest("base")) test_base(); if (test__start_subtest("mptcpify")) test_mptcpify(); + if (test__start_subtest("subflow")) + test_subflow(); }