diff mbox series

[bpf-next,v3,1/8] bpf: expose is_mptcp flag to bpf_tcp_sock

Message ID 20220502211235.142250-2-mathew.j.martineau@linux.intel.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Headers show
Series bpf: mptcp: Support for mptcp_sock and is_mptcp | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-VM_Test-2 fail Logs for Kernel LATEST on z15 + selftests
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for Kernel LATEST on ubuntu-latest + selftests
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1818 this patch: 1818
netdev/cc_maintainers warning 8 maintainers not CCed: songliubraving@fb.com davem@davemloft.net pabeni@redhat.com kpsingh@kernel.org kafai@fb.com yhs@fb.com john.fastabend@gmail.com kuba@kernel.org
netdev/build_clang success Errors and warnings before: 196 this patch: 196
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1828 this patch: 1828
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 35 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Mat Martineau May 2, 2022, 9:12 p.m. UTC
From: Nicolas Rybowski <nicolas.rybowski@tessares.net>

is_mptcp is a field from struct tcp_sock used to indicate that the
current tcp_sock is part of the MPTCP protocol.

In this protocol, a first socket (mptcp_sock) is created with
sk_protocol set to IPPROTO_MPTCP (=262) for control purpose but it
isn't directly on the wire. This is the role of the subflow (kernel)
sockets which are classical tcp_sock with sk_protocol set to
IPPROTO_TCP. The only way to differentiate such sockets from plain TCP
sockets is the is_mptcp field from tcp_sock.

Such an exposure in BPF is thus required to be able to differentiate
plain TCP sockets from MPTCP subflow sockets in BPF_PROG_TYPE_SOCK_OPS
programs.

The choice has been made to silently pass the case when CONFIG_MPTCP is
unset by defaulting is_mptcp to 0 in order to make BPF independent of
the MPTCP configuration. Another solution is to make the verifier fail
in 'bpf_tcp_sock_is_valid_ctx_access' but this will add an additional
'#ifdef CONFIG_MPTCP' in the BPF code and a same injected BPF program
will not run if MPTCP is not set.

An example use-case is provided in
https://github.com/multipath-tcp/mptcp_net-next/tree/scripts/bpf/examples

Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
---
 include/uapi/linux/bpf.h       | 1 +
 net/core/filter.c              | 9 ++++++++-
 tools/include/uapi/linux/bpf.h | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

Comments

Martin KaFai Lau May 11, 2022, 12:48 a.m. UTC | #1
On Mon, May 02, 2022 at 02:12:27PM -0700, Mat Martineau wrote:
> From: Nicolas Rybowski <nicolas.rybowski@tessares.net>
> 
> is_mptcp is a field from struct tcp_sock used to indicate that the
> current tcp_sock is part of the MPTCP protocol.
> 
> In this protocol, a first socket (mptcp_sock) is created with
> sk_protocol set to IPPROTO_MPTCP (=262) for control purpose but it
> isn't directly on the wire. This is the role of the subflow (kernel)
> sockets which are classical tcp_sock with sk_protocol set to
> IPPROTO_TCP. The only way to differentiate such sockets from plain TCP
> sockets is the is_mptcp field from tcp_sock.
> 
> Such an exposure in BPF is thus required to be able to differentiate
> plain TCP sockets from MPTCP subflow sockets in BPF_PROG_TYPE_SOCK_OPS
> programs.
> 
> The choice has been made to silently pass the case when CONFIG_MPTCP is
> unset by defaulting is_mptcp to 0 in order to make BPF independent of
> the MPTCP configuration. Another solution is to make the verifier fail
> in 'bpf_tcp_sock_is_valid_ctx_access' but this will add an additional
> '#ifdef CONFIG_MPTCP' in the BPF code and a same injected BPF program
> will not run if MPTCP is not set.
There is already bpf_skc_to_tcp_sock() and its returned tcp_sock pointer
can access all fields of the "struct tcp_sock" without extending
the bpf_tcp_sock.

iiuc, I believe the needs to extend bpf_tcp_sock here is to make the
same bpf sockops prog works for kernel with and without CONFIG_MPTCP
because tp->is_mptcp is not always available:

struct tcp_sock {
	/* ... */

#if IS_ENABLED(CONFIG_MPTCP)
	bool    is_mptcp;
#endif
};

Andrii, do you think bpf_core_field_exists() can be used in
the bpf prog to test if is_mptcp is available in the running kernel
such that the same bpf prog can be used in kernel with and without
CONFIG_MPTCP?
Andrii Nakryiko May 11, 2022, 5:02 a.m. UTC | #2
On Tue, May 10, 2022 at 5:48 PM Martin KaFai Lau <kafai@fb.com> wrote:
>
> On Mon, May 02, 2022 at 02:12:27PM -0700, Mat Martineau wrote:
> > From: Nicolas Rybowski <nicolas.rybowski@tessares.net>
> >
> > is_mptcp is a field from struct tcp_sock used to indicate that the
> > current tcp_sock is part of the MPTCP protocol.
> >
> > In this protocol, a first socket (mptcp_sock) is created with
> > sk_protocol set to IPPROTO_MPTCP (=262) for control purpose but it
> > isn't directly on the wire. This is the role of the subflow (kernel)
> > sockets which are classical tcp_sock with sk_protocol set to
> > IPPROTO_TCP. The only way to differentiate such sockets from plain TCP
> > sockets is the is_mptcp field from tcp_sock.
> >
> > Such an exposure in BPF is thus required to be able to differentiate
> > plain TCP sockets from MPTCP subflow sockets in BPF_PROG_TYPE_SOCK_OPS
> > programs.
> >
> > The choice has been made to silently pass the case when CONFIG_MPTCP is
> > unset by defaulting is_mptcp to 0 in order to make BPF independent of
> > the MPTCP configuration. Another solution is to make the verifier fail
> > in 'bpf_tcp_sock_is_valid_ctx_access' but this will add an additional
> > '#ifdef CONFIG_MPTCP' in the BPF code and a same injected BPF program
> > will not run if MPTCP is not set.
> There is already bpf_skc_to_tcp_sock() and its returned tcp_sock pointer
> can access all fields of the "struct tcp_sock" without extending
> the bpf_tcp_sock.
>
> iiuc, I believe the needs to extend bpf_tcp_sock here is to make the
> same bpf sockops prog works for kernel with and without CONFIG_MPTCP
> because tp->is_mptcp is not always available:
>
> struct tcp_sock {
>         /* ... */
>
> #if IS_ENABLED(CONFIG_MPTCP)
>         bool    is_mptcp;
> #endif
> };
>
> Andrii, do you think bpf_core_field_exists() can be used in
> the bpf prog to test if is_mptcp is available in the running kernel
> such that the same bpf prog can be used in kernel with and without
> CONFIG_MPTCP?

yep, absolutely:

bool is_mptcp = bpf_core_field_exists(struct tcp_sock, is_mptcp) ?
sock->is_mptcp : false;

One can also directly check if CONFIG_MPTCP is set with the following
in BPF-side code:

extern bool CONFIG_MPTCP __kconfig;
Geliang Tang May 11, 2022, 6:10 a.m. UTC | #3
Andrii Nakryiko <andrii.nakryiko@gmail.com> 于2022年5月11日周三 13:02写道:
>
> On Tue, May 10, 2022 at 5:48 PM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> > On Mon, May 02, 2022 at 02:12:27PM -0700, Mat Martineau wrote:
> > > From: Nicolas Rybowski <nicolas.rybowski@tessares.net>
> > >
> > > is_mptcp is a field from struct tcp_sock used to indicate that the
> > > current tcp_sock is part of the MPTCP protocol.
> > >
> > > In this protocol, a first socket (mptcp_sock) is created with
> > > sk_protocol set to IPPROTO_MPTCP (=262) for control purpose but it
> > > isn't directly on the wire. This is the role of the subflow (kernel)
> > > sockets which are classical tcp_sock with sk_protocol set to
> > > IPPROTO_TCP. The only way to differentiate such sockets from plain TCP
> > > sockets is the is_mptcp field from tcp_sock.
> > >
> > > Such an exposure in BPF is thus required to be able to differentiate
> > > plain TCP sockets from MPTCP subflow sockets in BPF_PROG_TYPE_SOCK_OPS
> > > programs.
> > >
> > > The choice has been made to silently pass the case when CONFIG_MPTCP is
> > > unset by defaulting is_mptcp to 0 in order to make BPF independent of
> > > the MPTCP configuration. Another solution is to make the verifier fail
> > > in 'bpf_tcp_sock_is_valid_ctx_access' but this will add an additional
> > > '#ifdef CONFIG_MPTCP' in the BPF code and a same injected BPF program
> > > will not run if MPTCP is not set.
> > There is already bpf_skc_to_tcp_sock() and its returned tcp_sock pointer
> > can access all fields of the "struct tcp_sock" without extending
> > the bpf_tcp_sock.
> >
> > iiuc, I believe the needs to extend bpf_tcp_sock here is to make the
> > same bpf sockops prog works for kernel with and without CONFIG_MPTCP
> > because tp->is_mptcp is not always available:
> >
> > struct tcp_sock {
> >         /* ... */
> >
> > #if IS_ENABLED(CONFIG_MPTCP)
> >         bool    is_mptcp;
> > #endif
> > };
> >
> > Andrii, do you think bpf_core_field_exists() can be used in
> > the bpf prog to test if is_mptcp is available in the running kernel
> > such that the same bpf prog can be used in kernel with and without
> > CONFIG_MPTCP?
>
> yep, absolutely:
>
> bool is_mptcp = bpf_core_field_exists(struct tcp_sock, is_mptcp) ?
> sock->is_mptcp : false;
>
> One can also directly check if CONFIG_MPTCP is set with the following
> in BPF-side code:
>
> extern bool CONFIG_MPTCP __kconfig;

Thanks Martin & Andrii, will update this in v4.

-Geliang
SUSE

>
diff mbox series

Patch

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 444fe6f1cf35..7043f3641534 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5706,6 +5706,7 @@  struct bpf_tcp_sock {
 	__u32 delivered;	/* Total data packets delivered incl. rexmits */
 	__u32 delivered_ce;	/* Like the above but only ECE marked packets */
 	__u32 icsk_retransmits;	/* Number of unrecovered [RTO] timeouts */
+	__u32 is_mptcp;		/* Is MPTCP subflow? */
 };
 
 struct bpf_sock_tuple {
diff --git a/net/core/filter.c b/net/core/filter.c
index b741b9f7e6a9..b474e5bd1458 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6754,7 +6754,7 @@  bool bpf_tcp_sock_is_valid_access(int off, int size, enum bpf_access_type type,
 				  struct bpf_insn_access_aux *info)
 {
 	if (off < 0 || off >= offsetofend(struct bpf_tcp_sock,
-					  icsk_retransmits))
+					  is_mptcp))
 		return false;
 
 	if (off % size != 0)
@@ -6888,6 +6888,13 @@  u32 bpf_tcp_sock_convert_ctx_access(enum bpf_access_type type,
 	case offsetof(struct bpf_tcp_sock, icsk_retransmits):
 		BPF_INET_SOCK_GET_COMMON(icsk_retransmits);
 		break;
+	case offsetof(struct bpf_tcp_sock, is_mptcp):
+#ifdef CONFIG_MPTCP
+		BPF_TCP_SOCK_GET_COMMON(is_mptcp);
+#else
+		*insn++ = BPF_MOV32_IMM(si->dst_reg, 0);
+#endif
+		break;
 	}
 
 	return insn - insn_buf;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 444fe6f1cf35..7043f3641534 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5706,6 +5706,7 @@  struct bpf_tcp_sock {
 	__u32 delivered;	/* Total data packets delivered incl. rexmits */
 	__u32 delivered_ce;	/* Like the above but only ECE marked packets */
 	__u32 icsk_retransmits;	/* Number of unrecovered [RTO] timeouts */
+	__u32 is_mptcp;		/* Is MPTCP subflow? */
 };
 
 struct bpf_sock_tuple {