From patchwork Fri Mar 21 01:49:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Geliang Tang X-Patchwork-Id: 14024767 X-Patchwork-Delegate: matthieu.baerts@tessares.net Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 574C31C84BF for ; Fri, 21 Mar 2025 01:49:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742521767; cv=none; b=DRtB7Ij+iVfYM1x39Uz37g10aczJAadsJ3exjKNQ9thTrEbNkJgYltlkmoflcSNRPhuDRydQxayB60w50MOC/eMPGBe+YfH3GUTk+60KSp5ZBdjjxhs8rWDVTZBnKcuaNnW6jIdzkhyy2btQ1GHbPg1IqfZYGhPbpdL+UEXnbcA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742521767; c=relaxed/simple; bh=NwQ2ZrePdycqSz9krQSAuwnz1skNScibMGwVHXBjl/g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=l5xOxg1BUcggzYRUmMCOwFH13sogAugjJTNe1AVCwGz/Nr0KFfwFtIO8NVLhDodx3syeuERoZXNy4YtJ2Y04O3rKahCBRVQzCgph7oL4As60fbPEmpDpPgh2r5wkqJJ5ciWm9R/6rJ1OvRvMOtrwfg2RG2nar5+z3deE50QdZd0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=lN2je9wa; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="lN2je9wa" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 39585C4CEE7; Fri, 21 Mar 2025 01:49:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742521767; bh=NwQ2ZrePdycqSz9krQSAuwnz1skNScibMGwVHXBjl/g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=lN2je9waULQIfZYFjAMWrxBqHWPzylYH+nq/H+YGHc+LHhyzMYWDot82y6PKrcAHX FteJz9AcudaF5SyXEzDKK5Af6zsn0yYfy13luvYEZDuMsSFRyp06/09fsT7rMATbYp 5Ssw6Hxm/9FsqoPrd1kdKFPcC1RpWPC2USpuFRW1iS+87QqwPjsiYh/Ukywxo32nM2 NlNbAM33/7DPkqvmzw9hqKFr4xHRj0AIlLdCdhKt+ayiF0UstGRE6cBVzMi70RFjz3 5PoZWQXhopjcZpQ4lkPRRSU0imkH0JcAVT/MROTXOb8mTknkgGrXrv3Vuw/Wu6wkVB JGLmyL0jb4bqg==
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang
Subject: [PATCH mptcp-next v1 1/4] bpf: Add mptcp path manager struct_ops
Date: Fri, 21 Mar 2025 09:49:15 +0800
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
MIME-Version: 1.0

From: Geliang Tang

This patch implements a new struct bpf_struct_ops for the MPTCP BPF path
manager: bpf_mptcp_pm_ops.

Register and unregister the BPF path manager in .reg and .unreg.

Add write access to some fields of struct mptcp_sock and struct
mptcp_pm_addr_entry in .btf_struct_access.

This MPTCP BPF path manager implementation is similar to BPF TCP CC, and
net/ipv4/bpf_tcp_ca.c served as a frame of reference for this patch.
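
The .btf_struct_access check follows an offset-whitelist pattern: a write
is accepted only if it starts exactly at the offset of a listed field and
ends within that field. A minimal userspace sketch of that pattern, for
illustration only; struct demo_pm, struct demo_addr and
demo_write_allowed() are hypothetical stand-ins and not part of this
patch, and the sketch returns 0 on success where the patch returns
NOT_INIT:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real layout. */
struct demo_addr {
	uint8_t  id;
	uint16_t family;
	uint16_t port;
};

struct demo_pm {
	struct demo_addr remote;
	uint8_t  subflows;
	uint32_t read_only_state;	/* deliberately not in the whitelist */
};

/* Userspace equivalent of the kernel's offsetofend() helper. */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/*
 * Return 0 if a write of 'size' bytes starting at 'off' is allowed,
 * -EACCES otherwise. Only writes that begin at the start of a listed
 * field and stay inside it are accepted.
 */
static int demo_write_allowed(size_t off, size_t size)
{
	size_t end;

	switch (off) {
	case offsetof(struct demo_pm, remote.id):
		end = demo_offsetofend(struct demo_pm, remote.id);
		break;
	case offsetof(struct demo_pm, remote.family):
		end = demo_offsetofend(struct demo_pm, remote.family);
		break;
	case offsetof(struct demo_pm, remote.port):
		end = demo_offsetofend(struct demo_pm, remote.port);
		break;
	case offsetof(struct demo_pm, subflows):
		end = demo_offsetofend(struct demo_pm, subflows);
		break;
	default:
		/* Field is not whitelisted, or the write starts mid-field. */
		return -EACCES;
	}

	/* Reject writes that would run past the end of the field. */
	if (off + size > end)
		return -EACCES;

	return 0;
}
```

Matching on the exact start offset keeps the whitelist explicit: a field
becomes writable only when a case label is added for it.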
Signed-off-by: Geliang Tang --- net/mptcp/bpf.c | 259 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 258 insertions(+), 1 deletion(-) diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c index 2b0cfb57df8c..596574102b89 100644 --- a/net/mptcp/bpf.c +++ b/net/mptcp/bpf.c @@ -17,10 +17,266 @@ #include "protocol.h" #ifdef CONFIG_BPF_JIT -static struct bpf_struct_ops bpf_mptcp_sched_ops; +static struct bpf_struct_ops bpf_mptcp_pm_ops, + bpf_mptcp_sched_ops; static u32 mptcp_sock_id, + mptcp_entry_id, mptcp_subflow_id; +/* MPTCP BPF path manager */ + +static const struct bpf_func_proto * +bpf_mptcp_pm_get_func_proto(enum bpf_func_id func_id, + const struct bpf_prog *prog) +{ + switch (func_id) { + case BPF_FUNC_sk_storage_get: + return &bpf_sk_storage_get_proto; + case BPF_FUNC_sk_storage_delete: + return &bpf_sk_storage_delete_proto; + default: + return bpf_base_func_proto(func_id, prog); + } +} + +static int bpf_mptcp_pm_btf_struct_access(struct bpf_verifier_log *log, + const struct bpf_reg_state *reg, + int off, int size) +{ + u32 id = reg->btf_id; + size_t end; + + if (id == mptcp_sock_id) { + switch (off) { + case offsetof(struct mptcp_sock, pm.remote.id): + end = offsetofend(struct mptcp_sock, pm.remote.id); + break; + case offsetof(struct mptcp_sock, pm.remote.family): + end = offsetofend(struct mptcp_sock, pm.remote.family); + break; + case offsetof(struct mptcp_sock, pm.remote.port): + end = offsetofend(struct mptcp_sock, pm.remote.port); + break; +#if IS_ENABLED(CONFIG_MPTCP_IPV6) + case offsetof(struct mptcp_sock, pm.remote.addr6.s6_addr32[0]): + end = offsetofend(struct mptcp_sock, pm.remote.addr6.s6_addr32[0]); + break; + case offsetof(struct mptcp_sock, pm.remote.addr6.s6_addr32[1]): + end = offsetofend(struct mptcp_sock, pm.remote.addr6.s6_addr32[1]); + break; + case offsetof(struct mptcp_sock, pm.remote.addr6.s6_addr32[2]): + end = offsetofend(struct mptcp_sock, pm.remote.addr6.s6_addr32[2]); + break; + case offsetof(struct mptcp_sock, 
pm.remote.addr6.s6_addr32[3]): + end = offsetofend(struct mptcp_sock, pm.remote.addr6.s6_addr32[3]); + break; +#else + case offsetof(struct mptcp_sock, pm.remote.addr.s_addr): + end = offsetofend(struct mptcp_sock, pm.remote.addr.s_addr); + break; +#endif + case offsetof(struct mptcp_sock, pm.work_pending): + end = offsetofend(struct mptcp_sock, pm.work_pending); + break; + case offsetof(struct mptcp_sock, pm.accept_addr): + end = offsetofend(struct mptcp_sock, pm.accept_addr); + break; + case offsetof(struct mptcp_sock, pm.accept_subflow): + end = offsetofend(struct mptcp_sock, pm.accept_subflow); + break; + case offsetof(struct mptcp_sock, pm.add_addr_signaled): + end = offsetofend(struct mptcp_sock, pm.add_addr_signaled); + break; + case offsetof(struct mptcp_sock, pm.local_addr_used): + end = offsetofend(struct mptcp_sock, pm.local_addr_used); + break; + case offsetof(struct mptcp_sock, pm.subflows): + end = offsetofend(struct mptcp_sock, pm.subflows); + break; + default: + bpf_log(log, "no write support to mptcp_sock at off %d\n", + off); + return -EACCES; + } + } else if (id == mptcp_entry_id) { + switch (off) { + case offsetof(struct mptcp_pm_addr_entry, addr.id): + end = offsetofend(struct mptcp_pm_addr_entry, addr.id); + break; + case offsetof(struct mptcp_pm_addr_entry, addr.port): + end = offsetofend(struct mptcp_pm_addr_entry, addr.port); + break; + default: + bpf_log(log, "no write support to mptcp_pm_addr_entry at off %d\n", + off); + return -EACCES; + } + } else { + bpf_log(log, "only access to mptcp sock or addr or entry is supported\n"); + return -EACCES; + } + + if (off + size > end) { + bpf_log(log, "access beyond %s at off %u size %u ended at %zu", + id == mptcp_sock_id ? "mptcp_sock" : + (id == mptcp_entry_id ? 
"mptcp_pm_addr_entry" : "mptcp_addr_info"), + off, size, end); + return -EACCES; + } + + return NOT_INIT; +} + +static const struct bpf_verifier_ops bpf_mptcp_pm_verifier_ops = { + .get_func_proto = bpf_mptcp_pm_get_func_proto, + .is_valid_access = bpf_tracing_btf_ctx_access, + .btf_struct_access = bpf_mptcp_pm_btf_struct_access, +}; + +static int bpf_mptcp_pm_reg(void *kdata, struct bpf_link *link) +{ + return mptcp_pm_register(kdata); +} + +static void bpf_mptcp_pm_unreg(void *kdata, struct bpf_link *link) +{ + mptcp_pm_unregister(kdata); +} + +static int bpf_mptcp_pm_check_member(const struct btf_type *t, + const struct btf_member *member, + const struct bpf_prog *prog) +{ + return 0; +} + +static int bpf_mptcp_pm_init_member(const struct btf_type *t, + const struct btf_member *member, + void *kdata, const void *udata) +{ + const struct mptcp_pm_ops *upm; + struct mptcp_pm_ops *pm; + u32 moff; + + upm = (const struct mptcp_pm_ops *)udata; + pm = (struct mptcp_pm_ops *)kdata; + + moff = __btf_member_bit_offset(t, member) / 8; + switch (moff) { + case offsetof(struct mptcp_pm_ops, name): + if (bpf_obj_name_cpy(pm->name, upm->name, + sizeof(pm->name)) <= 0) + return -EINVAL; + return 1; + } + + return 0; +} + +static int bpf_mptcp_pm_init(struct btf *btf) +{ + s32 type_id; + + type_id = btf_find_by_name_kind(btf, "mptcp_sock", + BTF_KIND_STRUCT); + if (type_id < 0) + return -EINVAL; + mptcp_sock_id = type_id; + + type_id = btf_find_by_name_kind(btf, "mptcp_pm_addr_entry", + BTF_KIND_STRUCT); + if (type_id < 0) + return -EINVAL; + mptcp_entry_id = type_id; + + return 0; +} + +static int bpf_mptcp_pm_validate(void *kdata) +{ + return mptcp_pm_validate(kdata); +} + +static int __bpf_mptcp_pm_get_local_id(struct mptcp_sock *msk, + struct mptcp_pm_addr_entry *skc) +{ + return 0; +} + +static bool __bpf_mptcp_pm_get_priority(struct mptcp_sock *msk, + struct mptcp_addr_info *skc) +{ + return false; +} + +static void __bpf_mptcp_pm_established(struct mptcp_sock *msk) +{ +} 
+ +static void __bpf_mptcp_pm_subflow_established(struct mptcp_sock *msk) +{ +} + +static bool __bpf_mptcp_pm_allow_new_subflow(struct mptcp_sock *msk) +{ + return false; +} + +static bool __bpf_mptcp_pm_accept_new_subflow(const struct mptcp_sock *msk) +{ + return false; +} + +static bool __bpf_mptcp_pm_add_addr_echo(struct mptcp_sock *msk, + const struct mptcp_addr_info *addr) +{ + return false; +} + +static int __bpf_mptcp_pm_add_addr_received(struct mptcp_sock *msk, + const struct mptcp_addr_info *addr) +{ + return 0; +} + +static void __bpf_mptcp_pm_rm_addr_received(struct mptcp_sock *msk) +{ +} + +static void __bpf_mptcp_pm_init(struct mptcp_sock *msk) +{ +} + +static void __bpf_mptcp_pm_release(struct mptcp_sock *msk) +{ +} + +static struct mptcp_pm_ops __bpf_mptcp_pm_ops = { + .get_local_id = __bpf_mptcp_pm_get_local_id, + .get_priority = __bpf_mptcp_pm_get_priority, + .established = __bpf_mptcp_pm_established, + .subflow_established = __bpf_mptcp_pm_subflow_established, + .allow_new_subflow = __bpf_mptcp_pm_allow_new_subflow, + .accept_new_subflow = __bpf_mptcp_pm_accept_new_subflow, + .add_addr_echo = __bpf_mptcp_pm_add_addr_echo, + .add_addr_received = __bpf_mptcp_pm_add_addr_received, + .rm_addr_received = __bpf_mptcp_pm_rm_addr_received, + .init = __bpf_mptcp_pm_init, + .release = __bpf_mptcp_pm_release, +}; + +static struct bpf_struct_ops bpf_mptcp_pm_ops = { + .verifier_ops = &bpf_mptcp_pm_verifier_ops, + .reg = bpf_mptcp_pm_reg, + .unreg = bpf_mptcp_pm_unreg, + .check_member = bpf_mptcp_pm_check_member, + .init_member = bpf_mptcp_pm_init_member, + .init = bpf_mptcp_pm_init, + .validate = bpf_mptcp_pm_validate, + .name = "mptcp_pm_ops", + .cfi_stubs = &__bpf_mptcp_pm_ops, +}; + /* MPTCP BPF packet scheduler */ static const struct bpf_func_proto * @@ -332,6 +588,7 @@ static int __init bpf_mptcp_kfunc_init(void) ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_mptcp_common_kfunc_set); #ifdef CONFIG_BPF_JIT + ret = ret ?: 
register_bpf_struct_ops(&bpf_mptcp_pm_ops, mptcp_pm_ops); ret = ret ?: register_bpf_struct_ops(&bpf_mptcp_sched_ops, mptcp_sched_ops); #endif

From patchwork Fri Mar 21 01:49:16 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Geliang Tang
X-Patchwork-Id: 14024768
X-Patchwork-Delegate: matthieu.baerts@tessares.net
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang
Subject: [PATCH mptcp-next v1 2/4] bpf: Export mptcp path manager kfuncs
Date: Fri, 21 Mar 2025 09:49:16 +0800
Message-ID: <0c324baa241000f32a1b97017da8e96a383767ad.1742521587.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev

From: Geliang Tang

This patch exports MPTCP path manager helpers to BPF and adds their
kfunc names to the MPTCP common kfunc_set.

bpf_kmemdup_entry() and bpf_kfree_entry() are wrappers of kmemdup() and
kfree(), used to allocate and free an MPTCP address entry.

bpf_set_bit() and bpf_bitmap_fill() are wrappers of __set_bit() and
bitmap_fill(), used for the MPTCP address ID bitmap.

bpf_spin_lock_bh() and bpf_spin_unlock_bh() are wrappers of
spin_lock_bh() and spin_unlock_bh(), used to lock and unlock the MPTCP
PM lock.
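
The address ID bitmap mentioned above keeps one bit per MPTCP address ID.
A rough userspace model of such a bitmap, assuming 8-bit address IDs; the
demo_* names are hypothetical and only approximate what __set_bit() and
bitmap_fill() do on whole words. How a set bit is interpreted (ID
available or ID in use) is left to the path manager:

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: address IDs are 8 bits wide, so 256 bits are needed. */
#define DEMO_MAX_ADDR_ID	255
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_ID_LONGS \
	((DEMO_MAX_ADDR_ID + 1 + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_id_bitmap {
	unsigned long bits[DEMO_ID_LONGS];
};

/* Non-atomic set of bit 'nr', in the spirit of __set_bit(). */
static void demo_set_bit(unsigned long nr, unsigned long *addr)
{
	addr[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

/* Return true if bit 'nr' is set. */
static bool demo_test_bit(unsigned long nr, const unsigned long *addr)
{
	return (addr[nr / DEMO_BITS_PER_LONG] >> (nr % DEMO_BITS_PER_LONG)) & 1UL;
}

/*
 * Set every word that holds one of the first 'nbits' bits to all ones.
 * Whole words are filled, so bits past 'nbits' in the last word are set
 * too when 'nbits' is not a multiple of the word size.
 */
static void demo_bitmap_fill(unsigned long *dst, unsigned int nbits)
{
	size_t longs = (nbits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;

	memset(dst, 0xff, longs * sizeof(unsigned long));
}
```

A path manager would typically fill the bitmap once at init time and then
flip individual bits as IDs are handed out or released.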
Signed-off-by: Geliang Tang --- net/mptcp/bpf.c | 48 +++++++++++++++++++++++++++++++++++++++++++ net/mptcp/pm_kernel.c | 27 ++++++++++++++++++++++++ 2 files changed, 75 insertions(+) diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c index 596574102b89..e411ae8382f2 100644 --- a/net/mptcp/bpf.c +++ b/net/mptcp/bpf.c @@ -540,6 +540,38 @@ bpf_iter_mptcp_subflow_destroy(struct bpf_iter_mptcp_subflow *it) { } +__bpf_kfunc static struct mptcp_pm_addr_entry * +bpf_kmemdup_entry(struct mptcp_pm_addr_entry *entry, int size, gfp_t priority) +{ + return kmemdup(entry, size, priority); +} + +__bpf_kfunc static void +bpf_kfree_entry(struct mptcp_pm_addr_entry *entry) +{ + kfree(entry); +} + +__bpf_kfunc static void bpf_set_bit(unsigned long nr, unsigned long *addr__ign) +{ + __set_bit(nr, addr__ign); +} + +__bpf_kfunc static void bpf_bitmap_fill(unsigned long *dst__ign, unsigned int nbits) +{ + bitmap_fill(dst__ign, nbits); +} + +__bpf_kfunc static void bpf_spin_lock_bh(spinlock_t *lock) +{ + spin_lock_bh(lock); +} + +__bpf_kfunc static void bpf_spin_unlock_bh(spinlock_t *lock) +{ + spin_unlock_bh(lock); +} + __bpf_kfunc static bool bpf_mptcp_subflow_queues_empty(struct sock *sk) { return tcp_rtx_queue_empty(sk); @@ -564,6 +596,22 @@ BTF_ID_FLAGS(func, bpf_mptcp_subflow_tcp_sock, KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_new, KF_ITER_NEW | KF_TRUSTED_ARGS) BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_next, KF_ITER_NEXT | KF_RET_NULL) BTF_ID_FLAGS(func, bpf_iter_mptcp_subflow_destroy, KF_ITER_DESTROY) +BTF_ID_FLAGS(func, bpf_kmemdup_entry) +BTF_ID_FLAGS(func, bpf_kfree_entry) +BTF_ID_FLAGS(func, bpf_set_bit) +BTF_ID_FLAGS(func, bpf_bitmap_fill) +BTF_ID_FLAGS(func, bpf_spin_lock_bh) +BTF_ID_FLAGS(func, bpf_spin_unlock_bh) +BTF_ID_FLAGS(func, mptcp_pm_nl_lookup_addr) +BTF_ID_FLAGS(func, mptcp_pm_nl_append_new_local_addr_msk) +BTF_ID_FLAGS(func, mptcp_pm_get_add_addr_signal_max) +BTF_ID_FLAGS(func, mptcp_pm_get_add_addr_accept_max) +BTF_ID_FLAGS(func, 
mptcp_pm_get_subflows_max) +BTF_ID_FLAGS(func, mptcp_pm_get_local_addr_max) +BTF_ID_FLAGS(func, mptcp_pm_add_addr_recv) +BTF_ID_FLAGS(func, mptcp_pm_is_init_remote_addr) +BTF_ID_FLAGS(func, mptcp_pm_create_subflow_or_signal_addr) +BTF_ID_FLAGS(func, mptcp_pm_rm_addr_recv) BTF_ID_FLAGS(func, mptcp_subflow_set_scheduled) BTF_ID_FLAGS(func, mptcp_subflow_active) BTF_ID_FLAGS(func, mptcp_set_timeout) diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c index 4f7b2e0e998d..3cf81986c70d 100644 --- a/net/mptcp/pm_kernel.c +++ b/net/mptcp/pm_kernel.c @@ -253,6 +253,9 @@ __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) return NULL; } +__bpf_kfunc_start_defs(); + +__bpf_kfunc static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) { struct sock *sk = (struct sock *)msk; @@ -367,6 +370,8 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) mptcp_pm_nl_check_work_pending(msk); } +__bpf_kfunc_end_defs(); + static void mptcp_pm_kernel_established(struct mptcp_sock *msk) { spin_lock_bh(&msk->pm.lock); @@ -1493,3 +1498,25 @@ void __init mptcp_pm_kernel_register(void) mptcp_pm_register(&mptcp_pm_kernel); } + +__bpf_kfunc_start_defs(); + +__bpf_kfunc static struct mptcp_pm_addr_entry * +mptcp_pm_nl_lookup_addr(struct mptcp_sock *msk, const struct mptcp_addr_info *info) +{ + struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); + + return __lookup_addr(pernet, info); +} + +__bpf_kfunc static int +mptcp_pm_nl_append_new_local_addr_msk(struct mptcp_sock *msk, + struct mptcp_pm_addr_entry *entry, + bool needs_id, bool replace) +{ + struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); + + return mptcp_pm_nl_append_new_local_addr(pernet, entry, needs_id, replace); +} + +__bpf_kfunc_end_defs(); From patchwork Fri Mar 21 01:49:17 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Geliang Tang X-Patchwork-Id: 14024769 
X-Patchwork-Delegate: matthieu.baerts@tessares.net
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang
Subject: [PATCH mptcp-next v1 3/4] selftests/bpf: Add mptcp netlink pm subtest
Date: Fri, 21 Mar 2025 09:49:17 +0800
Message-ID: <64d988e6d362e91e764ff235e45b28ba5e39ff4e.1742521587.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
MIME-Version: 1.0

From: Geliang Tang

To verify that the BPF path manager behaves the same as the in-kernel
netlink PM, add a netlink PM selftest. The BPF path manager added in the
next commit reuses this test.

Signed-off-by: Geliang Tang --- .../testing/selftests/bpf/prog_tests/mptcp.c | 236 ++++++++++++++++++ 1 file changed, 236 insertions(+) diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index 7c51250e7161..5303cbf38a44 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -56,6 +56,12 @@ #endif #define MPTCP_SCHED_NAME_MAX 16 +enum mptcp_pm_family { + IPV4 = 0, + IPV4MAPPED, + IPV6, +}; + static const unsigned int total_bytes = 10 * 1024 * 1024; static int duration; @@ -562,6 +568,234 @@ static void test_iters_subflow(void) close(cgroup_fd); } +static int recv_byte(int fd) +{ + char buf[1]; + ssize_t n; + + n = recv(fd, buf, sizeof(buf), 0); + if (CHECK(n <= 0, "recv_byte", "recv")) { + log_err("failed/partial recv"); + return -1; + } + return 0; +} + +static int netlink_pm_add_subflow(char *addr, __u8 id) +{ + return SYS_NOFAIL("ip -n %s mptcp endpoint add %s subflow id %u", + NS_TEST, addr, id); +} + +static int netlink_pm_rm_subflow(__u8 id) +{ + return SYS_NOFAIL("ip -n %s mptcp endpoint delete id %u", + NS_TEST, id); +} + +static int 
netlink_pm_add_addr(char *addr, __u8 id) +{ + return SYS_NOFAIL("ip -n %s mptcp endpoint add %s signal id %u", + NS_TEST, addr, id); +} + +static int netlink_pm_rm_addr(__u8 id) +{ + return SYS_NOFAIL("ip -n %s mptcp endpoint delete id %u", + NS_TEST, id); +} + +static int netlink_pm_rm_addr_id_0(char *addr) +{ + return SYS_NOFAIL("ip -n %s mptcp endpoint delete id 0 %s", + NS_TEST, addr); +} + +static int netlink_pm_set_flags(__u8 id, char *flags) +{ + return SYS_NOFAIL("ip -n %s mptcp endpoint change id %u %s", + NS_TEST, id, flags); +} + +static int netlink_pm_get_addr(__u8 id, char *output) +{ + char cmd[1024]; + FILE *fp; + + sprintf(cmd, "ip -n %s mptcp endpoint show id %u", NS_TEST, id); + fp = popen(cmd, "r"); + if (!fp) + return -1; + + bzero(output, BUFSIZ); + fread(output, 1, BUFSIZ, fp); + pclose(fp); + + return 0; +} + +static int netlink_pm_dump_addr(char *output) +{ + char cmd[1024]; + FILE *fp; + + sprintf(cmd, "ip -n %s mptcp endpoint show", NS_TEST); + fp = popen(cmd, "r"); + if (!fp) + return -1; + + bzero(output, BUFSIZ); + fread(output, 1, BUFSIZ, fp); + pclose(fp); + + return 0; +} + +static void run_netlink_pm(enum mptcp_pm_family family) +{ + bool ipv4mapped = (family == IPV4MAPPED); + bool ipv6 = (family == IPV6 || ipv4mapped); + int server_fd, client_fd, accept_fd; + char output[BUFSIZ], expect[1024]; + char *addr; + int err; + + addr = ipv6 ? (ipv4mapped ? "::ffff:"ADDR_1 : ADDR6_1) : ADDR_1; + server_fd = start_mptcp_server(ipv6 ? AF_INET6 : AF_INET, addr, PORT_1, 0); + if (!ASSERT_OK_FD(server_fd, "start_mptcp_server")) + return; + + client_fd = connect_to_fd(server_fd, 0); + if (!ASSERT_OK_FD(client_fd, "connect_to_fd")) + goto close_server; + + accept_fd = accept(server_fd, NULL, NULL); + if (!ASSERT_OK_FD(accept_fd, "accept")) + goto close_client; + + usleep(200000); /* 0.2s */ + send_byte(client_fd); + recv_byte(accept_fd); + usleep(200000); /* 0.2s */ + + addr = ipv6 ? (ipv4mapped ? 
"::ffff:"ADDR_2 : ADDR6_2) : ADDR_2; + err = netlink_pm_add_subflow(addr, 100); + if (!ASSERT_OK(err, "netlink_pm_add_subflow 100")) + goto close_accept; + + send_byte(accept_fd); + recv_byte(client_fd); + + sprintf(expect, "%s id 100 subflow \n", addr); + err = netlink_pm_get_addr(100, output); + if (!ASSERT_OK(err, "netlink_pm_get_addr 100") || + !ASSERT_STRNEQ(output, expect, sizeof(expect), "get_addr")) + goto close_accept; + + err = netlink_pm_set_flags(100, "backup"); + if (!ASSERT_OK(err, "netlink_pm_set_flags backup")) + goto close_accept; + + send_byte(client_fd); + recv_byte(accept_fd); + + sprintf(expect, "%s id 100 subflow backup \n", addr); + err = netlink_pm_get_addr(100, output); + if (!ASSERT_OK(err, "netlink_pm_get_addr 100") || + !ASSERT_STRNEQ(output, expect, sizeof(expect), "get_addr")) + goto close_accept; + + err = netlink_pm_set_flags(100, "nobackup"); + if (!ASSERT_OK(err, "netlink_pm_set_flags nobackup")) + goto close_accept; + + send_byte(accept_fd); + recv_byte(client_fd); + + sprintf(expect, "%s id 100 subflow \n", addr); + err = netlink_pm_get_addr(100, output); + if (!ASSERT_OK(err, "netlink_pm_get_addr 100") || + !ASSERT_STRNEQ(output, expect, sizeof(expect), "get_addr")) + goto close_accept; + + err = netlink_pm_rm_subflow(100); + if (!ASSERT_OK(err, "netlink_pm_rm_subflow 100")) + goto close_accept; + + send_byte(client_fd); + recv_byte(accept_fd); + + err = netlink_pm_dump_addr(output); + if (!ASSERT_OK(err, "netlink_pm_dump_addr") || + !ASSERT_STRNEQ(output, "", sizeof(output), "dump_addr")) + goto close_accept; + + addr = ipv6 ? (ipv4mapped ? 
"::ffff:"ADDR_3 : ADDR6_3) : ADDR_3; + err = netlink_pm_add_addr(addr, 200); + if (!ASSERT_OK(err, "netlink_pm_add_addr 200")) + goto close_accept; + + send_byte(accept_fd); + recv_byte(client_fd); + + sprintf(expect, "%s id 200 signal \n", addr); + err = netlink_pm_dump_addr(output); + if (!ASSERT_OK(err, "netlink_pm_dump_addr") || + !ASSERT_STRNEQ(output, expect, sizeof(expect), "dump_addr")) + goto close_accept; + + err = netlink_pm_rm_addr(200); + if (!ASSERT_OK(err, "netlink_pm_rm_addr 200")) + goto close_accept; + + send_byte(client_fd); + recv_byte(accept_fd); + + err = netlink_pm_rm_addr_id_0(addr); + ASSERT_OK(err, "netlink_pm_rm_addr 0"); + +close_accept: + close(accept_fd); +close_client: + close(client_fd); +close_server: + close(server_fd); +} + +static int pm_init(const char *pm_name) +{ + if (address_init()) + goto fail; + + SYS(fail, "ip netns exec %s sysctl -qw net.mptcp.path_manager=%s", + NS_TEST, pm_name); + SYS(fail, "ip -n %s mptcp limits set add_addr_accepted 4 subflows 4", + NS_TEST); + + return 0; +fail: + return -1; +} + +static void test_netlink_pm(void) +{ + struct netns_obj *netns; + int err; + + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new")) + return; + + err = pm_init("kernel"); + if (!ASSERT_OK(err, "pm_init: netlink pm")) + goto fail; + + run_netlink_pm(IPV4MAPPED); + +fail: + netns_free(netns); +} + static int sched_init(char *flags, char *sched) { if (endpoint_init(flags, 2) < 0) @@ -756,6 +990,8 @@ void test_mptcp(void) test_subflow(); if (test__start_subtest("iters_subflow")) test_iters_subflow(); + if (test__start_subtest("netlink_pm")) + test_netlink_pm(); if (test__start_subtest("default")) test_default(); if (test__start_subtest("first")) From patchwork Fri Mar 21 01:49:18 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Geliang Tang X-Patchwork-Id: 14024770 X-Patchwork-Delegate: matthieu.baerts@tessares.net Received: from 
smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F17C4C85 for ; Fri, 21 Mar 2025 01:49:31 +0000 (UTC)
From: Geliang Tang
To: mptcp@lists.linux.dev
Cc: Geliang Tang
Subject: [PATCH mptcp-next v1 4/4] selftests/bpf: Add mptcp bpf_netlink pm subtest
Date: Fri, 21 Mar 2025 09:49:18 +0800
Message-ID: <1360ae99ba7d866451098b9465037e62e601ca13.1742521587.git.tanggeliang@kylinos.cn>
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
MIME-Version: 1.0

From: Geliang Tang

This patch adds an MPTCP BPF netlink PM example program that implements
all interfaces of struct mptcp_pm_ops, using almost the same logic as
the in-kernel netlink PM.

Signed-off-by: Geliang Tang --- .../testing/selftests/bpf/prog_tests/mptcp.c | 48 +++++ tools/testing/selftests/bpf/progs/mptcp_bpf.h | 27 +++ .../bpf/progs/mptcp_bpf_netlink_pm.c | 204 ++++++++++++++++++ .../selftests/bpf/progs/mptcp_bpf_pm.h | 52 +++++ 4 files changed, 331 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/mptcp_bpf_netlink_pm.c create mode 100644 tools/testing/selftests/bpf/progs/mptcp_bpf_pm.h diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c index 5303cbf38a44..c0bc4cfb24d1 100644 --- a/tools/testing/selftests/bpf/prog_tests/mptcp.c +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -12,6 +12,7 @@ #include "mptcpify.skel.h" #include "mptcp_subflow.skel.h" #include "mptcp_bpf_iters.skel.h" +#include "mptcp_bpf_netlink_pm.skel.h" #include "mptcp_bpf_first.skel.h" #include "mptcp_bpf_bkup.skel.h" #include "mptcp_bpf_rr.skel.h" @@ -796,6 +797,51 @@ static void test_netlink_pm(void) netns_free(netns); } +static void test_bpf_netlink_pm(void) +{ + struct mptcp_bpf_netlink_pm *skel; + struct netns_obj *netns; + struct bpf_link *link; + int err; + + skel = mptcp_bpf_netlink_pm__open(); + if (!ASSERT_OK_PTR(skel, "open: bpf_netlink pm")) + return; + + err = 
bpf_program__set_flags(skel->progs.mptcp_pm_netlink_established, + BPF_F_SLEEPABLE); + err = err ?: bpf_program__set_flags(skel->progs.mptcp_pm_netlink_subflow_established, + BPF_F_SLEEPABLE); + err = err ?: bpf_program__set_flags(skel->progs.mptcp_pm_netlink_rm_addr_received, + BPF_F_SLEEPABLE); + if (!ASSERT_OK(err, "set sleepable flags")) + goto skel_destroy; + + if (!ASSERT_OK(mptcp_bpf_netlink_pm__load(skel), "load: bpf_netlink pm")) + goto skel_destroy; + + link = bpf_map__attach_struct_ops(skel->maps.bpf_netlink); + if (!ASSERT_OK_PTR(link, "attach_struct_ops: bpf_netlink pm")) + goto skel_destroy; + + netns = netns_new(NS_TEST, true); + if (!ASSERT_OK_PTR(netns, "netns_new")) + goto link_destroy; + + err = pm_init("bpf_netlink"); + if (!ASSERT_OK(err, "pm_init: bpf_netlink pm")) + goto close_netns; + + run_netlink_pm(skel->kconfig->CONFIG_MPTCP_IPV6 ? IPV6 : IPV4); + +close_netns: + netns_free(netns); +link_destroy: + bpf_link__destroy(link); +skel_destroy: + mptcp_bpf_netlink_pm__destroy(skel); +} + static int sched_init(char *flags, char *sched) { if (endpoint_init(flags, 2) < 0) @@ -992,6 +1038,8 @@ void test_mptcp(void) test_iters_subflow(); if (test__start_subtest("netlink_pm")) test_netlink_pm(); + if (test__start_subtest("bpf_netlink_pm")) + test_bpf_netlink_pm(); if (test__start_subtest("default")) test_default(); if (test__start_subtest("first")) diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf.h b/tools/testing/selftests/bpf/progs/mptcp_bpf.h index 4e901941d5dd..0d5cf8426bc5 100644 --- a/tools/testing/selftests/bpf/progs/mptcp_bpf.h +++ b/tools/testing/selftests/bpf/progs/mptcp_bpf.h @@ -4,6 +4,9 @@ #include "bpf_experimental.h" +#define READ_ONCE(x) (*(const volatile typeof(x) *)&(x)) +#define WRITE_ONCE(x, val) ((*(volatile typeof(x) *) &(x)) = (val)) + /* list helpers from include/linux/list.h */ static inline int list_is_head(const struct list_head *list, const struct list_head *head) @@ -33,6 +36,24 @@ static inline int 
list_is_head(const struct list_head *list,
 #define mptcp_for_each_subflow(__msk, __subflow)			\
 	list_for_each_entry(__subflow, &((__msk)->conn_list), node)
 
+/* errno macros from include/uapi/asm-generic/errno-base.h */
+#define ESRCH		3	/* No such process */
+#define ENOMEM		12	/* Out of Memory */
+#define EINVAL		22	/* Invalid argument */
+
+/* GFP macros from include/linux/gfp_types.h */
+#define __AC(X,Y)	(X##Y)
+#define _AC(X,Y)	__AC(X,Y)
+#define _UL(x)		(_AC(x, UL))
+#define UL(x)		(_UL(x))
+#define BIT(nr)		(UL(1) << (nr))
+
+#define ___GFP_HIGH		BIT(___GFP_HIGH_BIT)
+#define __GFP_HIGH		((gfp_t)___GFP_HIGH)
+#define ___GFP_KSWAPD_RECLAIM	BIT(___GFP_KSWAPD_RECLAIM_BIT)
+#define __GFP_KSWAPD_RECLAIM	((gfp_t)___GFP_KSWAPD_RECLAIM) /* kswapd can wake */
+#define GFP_ATOMIC		(__GFP_HIGH|__GFP_KSWAPD_RECLAIM)
+
 static __always_inline struct sock *
 mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow)
 {
@@ -40,6 +61,12 @@ mptcp_subflow_tcp_sock(const struct mptcp_subflow_context *subflow)
 }
 
 /* ksym */
+void bpf_rcu_read_lock(void) __ksym;
+void bpf_rcu_read_unlock(void) __ksym;
+
+extern void bpf_spin_lock_bh(spinlock_t *lock) __ksym;
+extern void bpf_spin_unlock_bh(spinlock_t *lock) __ksym;
+
 extern struct mptcp_subflow_context *
 bpf_mptcp_subflow_ctx(const struct sock *sk) __ksym;
 extern struct sock *
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_netlink_pm.c b/tools/testing/selftests/bpf/progs/mptcp_bpf_netlink_pm.c
new file mode 100644
index 000000000000..9a9a396bdf94
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_netlink_pm.c
@@ -0,0 +1,204 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2025, Kylin Software */
+
+#include "mptcp_bpf.h"
+#include "mptcp_bpf_pm.h"
+
+char _license[] SEC("license") = "GPL";
+
+extern bool CONFIG_MPTCP_IPV6 __kconfig __weak;
+
+extern unsigned int
+mptcp_pm_get_add_addr_signal_max(const struct mptcp_sock *msk) __ksym;
+extern unsigned int
+mptcp_pm_get_add_addr_accept_max(const struct
mptcp_sock *msk) __ksym;
+extern unsigned int
+mptcp_pm_get_subflows_max(const struct mptcp_sock *msk) __ksym;
+extern unsigned int
+mptcp_pm_get_local_addr_max(const struct mptcp_sock *msk) __ksym;
+extern void bpf_bitmap_fill(unsigned long *dst__ign, unsigned int nbits) __ksym;
+
+extern bool mptcp_pm_is_init_remote_addr(struct mptcp_sock *msk,
+					 const struct mptcp_addr_info *remote) __ksym;
+extern bool mptcp_pm_add_addr_recv(struct mptcp_sock *msk) __ksym;
+extern void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) __ksym;
+extern void mptcp_pm_rm_addr_recv(struct mptcp_sock *msk) __ksym;
+extern int mptcp_pm_nl_append_new_local_addr_msk(struct mptcp_sock *msk,
+						 struct mptcp_pm_addr_entry *entry,
+						 bool needs_id, bool replace) __ksym;
+extern struct mptcp_pm_addr_entry *
+mptcp_pm_nl_lookup_addr(struct mptcp_sock *msk,
+			const struct mptcp_addr_info *info) __ksym;
+
+extern struct mptcp_pm_addr_entry *
+bpf_kmemdup_entry(struct mptcp_pm_addr_entry *entry,
+		  int size, gfp_t priority) __ksym;
+extern void
+bpf_kfree_entry(struct mptcp_pm_addr_entry *entry) __ksym;
+
+static void mptcp_pm_copy_addr(struct mptcp_addr_info *dst,
+			       const struct mptcp_addr_info *src)
+{
+	dst->id = src->id;
+	dst->family = src->family;
+	dst->port = src->port;
+
+	if (src->family == AF_INET) {
+		dst->addr.s_addr = src->addr.s_addr;
+	} else if (src->family == AF_INET6) {
+		dst->addr6.s6_addr32[0] = src->addr6.s6_addr32[0];
+		dst->addr6.s6_addr32[1] = src->addr6.s6_addr32[1];
+		dst->addr6.s6_addr32[2] = src->addr6.s6_addr32[2];
+		dst->addr6.s6_addr32[3] = src->addr6.s6_addr32[3];
+	}
+}
+
+SEC("struct_ops")
+int BPF_PROG(mptcp_pm_netlink_get_local_id, struct mptcp_sock *msk,
+	     struct mptcp_pm_addr_entry *skc)
+{
+	struct mptcp_pm_addr_entry *entry;
+	int ret;
+
+	bpf_rcu_read_lock();
+	entry = mptcp_pm_nl_lookup_addr(msk, &skc->addr);
+	ret = entry ?
entry->addr.id : -1;
+	bpf_rcu_read_unlock();
+	if (ret >= 0)
+		return ret;
+
+	entry = bpf_kmemdup_entry(skc, sizeof(*skc), GFP_ATOMIC);
+	if (!entry)
+		return -ENOMEM;
+
+	entry->addr.port = 0;
+	ret = mptcp_pm_nl_append_new_local_addr_msk(msk, entry, true, false);
+	if (ret < 0)
+		bpf_kfree_entry(entry);
+
+	return ret;
+}
+
+SEC("struct_ops")
+bool BPF_PROG(mptcp_pm_netlink_get_priority, struct mptcp_sock *msk,
+	      struct mptcp_addr_info *skc)
+{
+	struct mptcp_pm_addr_entry *entry;
+	bool backup;
+
+	bpf_rcu_read_lock();
+	entry = mptcp_pm_nl_lookup_addr(msk, skc);
+	backup = entry && !!(entry->flags & MPTCP_PM_ADDR_FLAG_BACKUP);
+	bpf_rcu_read_unlock();
+
+	return backup;
+}
+
+SEC("struct_ops")
+void BPF_PROG(mptcp_pm_netlink_established, struct mptcp_sock *msk)
+{
+	bpf_spin_lock_bh(&msk->pm.lock);
+	mptcp_pm_create_subflow_or_signal_addr(msk);
+	bpf_spin_unlock_bh(&msk->pm.lock);
+}
+
+SEC("struct_ops")
+void BPF_PROG(mptcp_pm_netlink_subflow_established, struct mptcp_sock *msk)
+{
+	bpf_spin_lock_bh(&msk->pm.lock);
+	mptcp_pm_create_subflow_or_signal_addr(msk);
+	bpf_spin_unlock_bh(&msk->pm.lock);
+}
+
+SEC("struct_ops")
+bool BPF_PROG(mptcp_pm_netlink_allow_new_subflow, struct mptcp_sock *msk)
+{
+	struct mptcp_pm_data *pm = &msk->pm;
+	unsigned int subflows_max;
+	int ret = 0;
+
+	subflows_max = mptcp_pm_get_subflows_max(msk);
+
+	/* try to avoid acquiring the lock below */
+	if (!READ_ONCE(pm->accept_subflow))
+		return false;
+
+	bpf_spin_lock_bh(&pm->lock);
+	if (READ_ONCE(pm->accept_subflow)) {
+		ret = pm->subflows < subflows_max;
+		if (ret && ++pm->subflows == subflows_max)
+			WRITE_ONCE(pm->accept_subflow, false);
+	}
+	bpf_spin_unlock_bh(&pm->lock);
+
+	return ret;
+}
+
+SEC("struct_ops")
+bool BPF_PROG(mptcp_pm_netlink_accept_new_subflow, const struct mptcp_sock *msk)
+{
+	return READ_ONCE(msk->pm.accept_subflow);
+}
+
+SEC("struct_ops")
+bool BPF_PROG(mptcp_pm_netlink_add_addr_echo, struct mptcp_sock *msk,
+	      const struct mptcp_addr_info *addr)
+{
+
return (addr->id == 0 && !mptcp_pm_is_init_remote_addr(msk, addr)) ||
+	       (addr->id > 0 && !READ_ONCE(msk->pm.accept_addr));
+}
+
+SEC("struct_ops")
+int BPF_PROG(mptcp_pm_netlink_add_addr_received, struct mptcp_sock *msk,
+	     const struct mptcp_addr_info *addr)
+{
+	int ret = 0;
+
+	if (mptcp_pm_add_addr_recv(msk))
+		mptcp_pm_copy_addr(&msk->pm.remote, addr);
+	else
+		ret = -EINVAL;
+	return ret;
+}
+
+SEC("struct_ops")
+void BPF_PROG(mptcp_pm_netlink_rm_addr_received, struct mptcp_sock *msk)
+{
+	mptcp_pm_rm_addr_recv(msk);
+}
+
+SEC("struct_ops")
+void BPF_PROG(mptcp_pm_netlink_init, struct mptcp_sock *msk)
+{
+	bool subflows_allowed = !!mptcp_pm_get_subflows_max(msk);
+	struct mptcp_pm_data *pm = &msk->pm;
+
+	bpf_printk("BPF netlink PM (%s)",
+		   CONFIG_MPTCP_IPV6 ? "IPv6" : "IPv4");
+
+	WRITE_ONCE(pm->work_pending,
+		   (!!mptcp_pm_get_local_addr_max(msk) &&
+		    subflows_allowed) ||
+		   !!mptcp_pm_get_add_addr_signal_max(msk));
+	WRITE_ONCE(pm->accept_addr,
+		   !!mptcp_pm_get_add_addr_accept_max(msk) &&
+		   subflows_allowed);
+	WRITE_ONCE(pm->accept_subflow, subflows_allowed);
+
+	bpf_bitmap_fill(pm->id_avail_bitmap, MPTCP_PM_MAX_ADDR_ID + 1);
+}
+
+SEC(".struct_ops.link")
+struct mptcp_pm_ops bpf_netlink = {
+	.get_local_id		= (void *)mptcp_pm_netlink_get_local_id,
+	.get_priority		= (void *)mptcp_pm_netlink_get_priority,
+	.established		= (void *)mptcp_pm_netlink_established,
+	.subflow_established	= (void *)mptcp_pm_netlink_subflow_established,
+	.allow_new_subflow	= (void *)mptcp_pm_netlink_allow_new_subflow,
+	.accept_new_subflow	= (void *)mptcp_pm_netlink_accept_new_subflow,
+	.add_addr_echo		= (void *)mptcp_pm_netlink_add_addr_echo,
+	.add_addr_received	= (void *)mptcp_pm_netlink_add_addr_received,
+	.rm_addr_received	= (void *)mptcp_pm_netlink_rm_addr_received,
+	.init			= (void *)mptcp_pm_netlink_init,
+	.name			= "bpf_netlink",
+};
diff --git a/tools/testing/selftests/bpf/progs/mptcp_bpf_pm.h b/tools/testing/selftests/bpf/progs/mptcp_bpf_pm.h
new file mode 100644
index
000000000000..0ba21c743a13
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/mptcp_bpf_pm.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+
+#ifndef __MPTCP_BPF_PM_H__
+#define __MPTCP_BPF_PM_H__
+
+#include "bpf_tracing_net.h"
+
+/* mptcp helpers from include/net/mptcp.h */
+#define U8_MAX		((u8)~0U)
+
+/* max value of mptcp_addr_info.id */
+#define MPTCP_PM_MAX_ADDR_ID		U8_MAX
+
+/* mptcp macros from include/uapi/linux/mptcp.h */
+#define MPTCP_PM_ADDR_FLAG_SIGNAL	(1 << 0)
+#define MPTCP_PM_ADDR_FLAG_SUBFLOW	(1 << 1)
+#define MPTCP_PM_ADDR_FLAG_BACKUP	(1 << 2)
+#define MPTCP_PM_ADDR_FLAG_FULLMESH	(1 << 3)
+#define MPTCP_PM_ADDR_FLAG_IMPLICIT	(1 << 4)
+
+extern void bpf_set_bit(unsigned long nr, unsigned long *addr) __ksym;
+
+extern int mptcp_pm_remove_addr(struct mptcp_sock *msk,
+				const struct mptcp_rm_list *rm_list) __ksym;
+
+#define ipv6_addr_equal(a, b)	((a).s6_addr32[0] == (b).s6_addr32[0] &&	\
+				 (a).s6_addr32[1] == (b).s6_addr32[1] &&	\
+				 (a).s6_addr32[2] == (b).s6_addr32[2] &&	\
+				 (a).s6_addr32[3] == (b).s6_addr32[3])
+
+static __always_inline bool
+mptcp_addresses_equal(const struct mptcp_addr_info *a,
+		      const struct mptcp_addr_info *b, bool use_port)
+{
+	bool addr_equals = false;
+
+	if (a->family == b->family) {
+		if (a->family == AF_INET)
+			addr_equals = a->addr.s_addr == b->addr.s_addr;
+		else
+			addr_equals = ipv6_addr_equal(a->addr6, b->addr6);
+	}
+
+	if (!addr_equals)
+		return false;
+	if (!use_port)
+		return true;
+
+	return a->port == b->port;
+}
+
+#endif
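
For readers who want to try the comparison logic of mptcp_addresses_equal()
outside of a BPF program, here is a minimal user-space sketch. It is not part
of the patch: struct sketch_addr_info, sketch_addresses_equal() and
sketch_selfcheck() are hypothetical stand-ins, and the struct layout is an
assumption that only keeps the fields the helper reads. The IPv6 words are
compared with memcmp() here in place of the ipv6_addr_equal() macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical stand-in for struct mptcp_addr_info (layout assumed). */
struct sketch_addr_info {
	uint8_t  id;
	uint16_t family;
	uint16_t port;			/* treated as an opaque value here */
	union {
		uint32_t addr4;		/* IPv4 address */
		uint32_t addr6[4];	/* IPv6 address as four 32-bit words */
	};
};

/* Same decision order as the helper in the patch: the family must match,
 * then the address, and the port only when use_port is set.
 */
static bool sketch_addresses_equal(const struct sketch_addr_info *a,
				   const struct sketch_addr_info *b,
				   bool use_port)
{
	bool addr_equals = false;

	if (a->family == b->family) {
		if (a->family == AF_INET)
			addr_equals = a->addr4 == b->addr4;
		else
			addr_equals = memcmp(a->addr6, b->addr6,
					     sizeof(a->addr6)) == 0;
	}

	if (!addr_equals)
		return false;
	if (!use_port)
		return true;

	return a->port == b->port;
}

/* Small self-check: same IPv4 address, different ports and families. */
static int sketch_selfcheck(void)
{
	struct sketch_addr_info a = { .family = AF_INET, .port = 80,
				      .addr4 = 0x0100007f };
	struct sketch_addr_info b = a;

	b.port = 443;
	assert(sketch_addresses_equal(&a, &b, false));	/* port ignored */
	assert(!sketch_addresses_equal(&a, &b, true));	/* port compared */

	b.family = AF_INET6;
	assert(!sketch_addresses_equal(&a, &b, false));	/* family differs */
	return 0;
}
```

Passing use_port = false gives an address-only match, while use_port = true
additionally requires equal ports.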