[RFC,08/17] bpf: Add helpers to dequeue from a PIFO map

Message ID	20220713111430.134810-9-toke@redhat.com (mailing list archive)
State	RFC
Delegated to:	BPF
Headers	Return-Path: <netdev-owner@kernel.org>
	From: Toke Høiland-Jørgensen <toke@redhat.com>
	To: Alexei Starovoitov <ast@kernel.org>,
	    Daniel Borkmann <daniel@iogearbox.net>,
	    Andrii Nakryiko <andrii@kernel.org>,
	    Martin KaFai Lau <martin.lau@linux.dev>,
	    Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>,
	    John Fastabend <john.fastabend@gmail.com>,
	    KP Singh <kpsingh@kernel.org>,
	    Stanislav Fomichev <sdf@google.com>,
	    Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>,
	    "David S. Miller" <davem@davemloft.net>,
	    Jakub Kicinski <kuba@kernel.org>,
	    Jesper Dangaard Brouer <hawk@kernel.org>
	Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>, netdev@vger.kernel.org,
	    bpf@vger.kernel.org, Freysteinn Alfredsson <freysteinn.alfredsson@kau.se>,
	    Cong Wang <xiyou.wangcong@gmail.com>,
	    Toke Høiland-Jørgensen <toke@redhat.com>,
	    Eric Dumazet <edumazet@google.com>, Paolo Abeni <pabeni@redhat.com>
	Subject: [RFC PATCH 08/17] bpf: Add helpers to dequeue from a PIFO map
	Date: Wed, 13 Jul 2022 13:14:16 +0200
	Message-Id: <20220713111430.134810-9-toke@redhat.com>
	In-Reply-To: <20220713111430.134810-1-toke@redhat.com>
	References: <20220713111430.134810-1-toke@redhat.com>
	MIME-Version: 1.0
	Content-Type: text/plain; charset=UTF-8
	Content-Transfer-Encoding: 8bit
	Precedence: bulk
Series	xdp: Add packet queueing and scheduling capabilities
	[RFC,00/17] xdp: Add packet queueing and scheduling capabilities
	[RFC,01/17] dev: Move received_rps counter next to RPS members in softnet data
	[RFC,02/17] bpf: Expand map key argument of bpf_redirect_map to u64
	[RFC,03/17] bpf: Use 64-bit return value for bpf_prog_run
	[RFC,04/17] bpf: Add a PIFO priority queue map type
	[RFC,05/17] pifomap: Add queue rotation for continuously increasing rank mode
	[RFC,06/17] xdp: Add dequeue program type for getting packets from a PIFO
	[RFC,07/17] bpf: Teach the verifier about referenced packets returned from dequeue programs
	[RFC,08/17] bpf: Add helpers to dequeue from a PIFO map
	[RFC,09/17] bpf: Introduce pkt_uid member for PTR_TO_PACKET
	[RFC,10/17] bpf: Implement direct packet access in dequeue progs
	[RFC,11/17] dev: Add XDP dequeue hook
	[RFC,12/17] bpf: Add helper to schedule an interface for TX dequeue
	[RFC,13/17] libbpf: Add support for dequeue program type and PIFO map type
	[RFC,14/17] libbpf: Add support for querying dequeue programs
	[RFC,15/17] selftests/bpf: Add verifier tests for dequeue prog
	[RFC,16/17] selftests/bpf: Add test for XDP queueing through PIFO maps
	[RFC,17/17] samples/bpf: Add queueing support to xdp_fwd sample


Context	Check	Description
bpf/vmtest-bpf-next-PR	pending	PR summary
bpf/vmtest-bpf-next-VM_Test-2	pending	Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-next-VM_Test-3	pending	Logs for Kernel LATEST on z15 with gcc
bpf/vmtest-bpf-next-VM_Test-1	fail	Logs for Kernel LATEST on ubuntu-latest with gcc
netdev/tree_selection	success	Guessed tree name to be net-next, async
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/subject_prefix	success	Link
netdev/cover_letter	success	Series has a cover letter
netdev/patch_count	fail	Series longer than 15 patches (and no cover letter)
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 1779 this patch: 1779
netdev/cc_maintainers	success	CCed 18 of 18 maintainers
netdev/build_clang	success	Errors and warnings before: 186 this patch: 186
netdev/module_param	success	Was 0 now: 0
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 1790 this patch: 1790
netdev/checkpatch	warning	WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 88 exceeds 80 columns
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0


Commit Message

Toke Høiland-Jørgensen July 13, 2022, 11:14 a.m. UTC

This adds two new helpers for dequeue programs. bpf_packet_dequeue()
dequeues a packet from a PIFO map and returns a refcounted pointer to it.
The reference must be released either by dropping the packet with the
second helper, bpf_packet_drop(), or by returning it to the caller.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 include/uapi/linux/bpf.h       | 19 +++++++++++++++
 kernel/bpf/verifier.c          | 13 +++++++---
 net/core/filter.c              | 43 +++++++++++++++++++++++++++++++++-
 tools/include/uapi/linux/bpf.h | 19 +++++++++++++++
 4 files changed, 90 insertions(+), 4 deletions(-)
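
The acquire/release contract described in the commit message can be
illustrated with a small userspace model. This is only a sketch, not kernel
code: every model_* name below is hypothetical, the ctx and flags arguments
of the real helpers are left out, and it assumes the PIFO head is the packet
with the lowest rank. The "outstanding" counter stands in for the reference
tracking that the verifier does statically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct model_pkt {
	uint64_t rank;
};

struct model_pifo {
	struct model_pkt *slots[8];
	size_t len;
	int outstanding;	/* references handed out and not yet released */
};

/* Insert so that slots[] stays sorted by ascending rank (assumed PIFO order). */
static int model_enqueue(struct model_pifo *q, struct model_pkt *pkt,
			 uint64_t rank)
{
	size_t i;

	if (q->len == sizeof(q->slots) / sizeof(q->slots[0]))
		return -1;
	pkt->rank = rank;
	for (i = q->len; i > 0 && q->slots[i - 1]->rank > rank; i--)
		q->slots[i] = q->slots[i - 1];
	q->slots[i] = pkt;
	q->len++;
	return 0;
}

/* Models bpf_packet_dequeue(): returns the head packet or NULL, sets *rank. */
static struct model_pkt *model_packet_dequeue(struct model_pifo *q,
					      uint64_t *rank)
{
	struct model_pkt *pkt;
	size_t i;

	if (q->len == 0)
		return NULL;	/* *rank is left untouched in this model */
	pkt = q->slots[0];
	for (i = 1; i < q->len; i++)
		q->slots[i - 1] = q->slots[i];
	q->len--;
	q->outstanding++;	/* the caller now owns a reference */
	*rank = pkt->rank;
	return pkt;
}

/* Models bpf_packet_drop(): releases a non-NULL reference, returns zero. */
static int model_packet_drop(struct model_pifo *q, struct model_pkt *pkt)
{
	assert(pkt != NULL);
	q->outstanding--;
	return 0;
}

/* Shape of a dequeue program: every non-NULL packet is dropped or returned. */
static struct model_pkt *model_dequeue_prog(struct model_pifo *q,
					    uint64_t max_rank)
{
	uint64_t rank;
	struct model_pkt *pkt = model_packet_dequeue(q, &rank);

	if (!pkt)
		return NULL;		/* empty PIFO: nothing to release */
	if (rank > max_rank) {
		model_packet_drop(q, pkt);	/* release by dropping */
		return NULL;
	}
	q->outstanding--;	/* release by handing the packet to the caller */
	return pkt;
}
```

In this model a program that leaves "outstanding" non-zero on exit
corresponds to one the verifier would be expected to reject for leaking a
reference.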

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 974fb5882305..d44382644391 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5341,6 +5341,23 @@  union bpf_attr {
  *		**-EACCES** if the SYN cookie is not valid.
  *
  *		**-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
+ *
+ * long bpf_packet_dequeue(void *ctx, struct bpf_map *map, u64 flags, u64 *rank)
+ *	Description
+ *		Dequeue the packet at the head of the PIFO in *map* and return a pointer
+ *		to the packet (or NULL if the PIFO is empty).
+ *	Return
+ *		On success, a pointer to the packet, or NULL if the PIFO is empty. The
+ *		packet pointer must be freed using *bpf_packet_drop()* or returning
+ *		the packet pointer. The *rank* pointer will be set to the rank of
+ *		the dequeued packet on success, or a negative error code on error.
+ *
+ * long bpf_packet_drop(void *ctx, void *pkt)
+ *	Description
+ *		Drop *pkt*, which must be a reference previously returned by
+ *		*bpf_packet_dequeue()* (and checked to not be NULL).
+ *	Return
+ *		This always succeeds and returns zero.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5551,6 +5568,8 @@  union bpf_attr {
 	FN(tcp_raw_gen_syncookie_ipv6),	\
 	FN(tcp_raw_check_syncookie_ipv4),	\
 	FN(tcp_raw_check_syncookie_ipv6),	\
+	FN(packet_dequeue),		\
+	FN(packet_drop),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e3662460a095..68f98d76bc78 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -483,7 +483,8 @@  static bool may_be_acquire_function(enum bpf_func_id func_id)
 		func_id == BPF_FUNC_sk_lookup_udp ||
 		func_id == BPF_FUNC_skc_lookup_tcp ||
 		func_id == BPF_FUNC_map_lookup_elem ||
-	        func_id == BPF_FUNC_ringbuf_reserve;
+		func_id == BPF_FUNC_ringbuf_reserve ||
+		func_id == BPF_FUNC_packet_dequeue;
 }
 
 static bool is_acquire_function(enum bpf_func_id func_id,
@@ -495,7 +496,8 @@  static bool is_acquire_function(enum bpf_func_id func_id,
 	    func_id == BPF_FUNC_sk_lookup_udp ||
 	    func_id == BPF_FUNC_skc_lookup_tcp ||
 	    func_id == BPF_FUNC_ringbuf_reserve ||
-	    func_id == BPF_FUNC_kptr_xchg)
+	    func_id == BPF_FUNC_kptr_xchg ||
+	    func_id == BPF_FUNC_packet_dequeue)
 		return true;
 
 	if (func_id == BPF_FUNC_map_lookup_elem &&
@@ -6276,7 +6278,8 @@  static int check_map_func_compatibility(struct bpf_verifier_env *env,
 			goto error;
 		break;
 	case BPF_MAP_TYPE_PIFO_XDP:
-		if (func_id != BPF_FUNC_redirect_map)
+		if (func_id != BPF_FUNC_redirect_map &&
+		    func_id != BPF_FUNC_packet_dequeue)
 			goto error;
 		break;
 	default:
@@ -6385,6 +6388,10 @@  static int check_map_func_compatibility(struct bpf_verifier_env *env,
 		if (map->map_type != BPF_MAP_TYPE_TASK_STORAGE)
 			goto error;
 		break;
+	case BPF_FUNC_packet_dequeue:
+		if (map->map_type != BPF_MAP_TYPE_PIFO_XDP)
+			goto error;
+		break;
 	default:
 		break;
 	}
diff --git a/net/core/filter.c b/net/core/filter.c
index 30bd3a6aedab..893b75515859 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4430,6 +4430,40 @@  static const struct bpf_func_proto bpf_xdp_redirect_map_proto = {
 	.arg3_type      = ARG_ANYTHING,
 };
 
+BTF_ID_LIST_SINGLE(xdp_md_btf_ids, struct, xdp_md)
+
+BPF_CALL_4(bpf_packet_dequeue, struct dequeue_data *, ctx, struct bpf_map *, map,
+	   u64, flags, u64 *, rank)
+{
+	return (unsigned long)pifo_map_dequeue(map, flags, rank);
+}
+
+static const struct bpf_func_proto bpf_packet_dequeue_proto = {
+	.func           = bpf_packet_dequeue,
+	.gpl_only       = false,
+	.ret_type       = RET_PTR_TO_BTF_ID_OR_NULL,
+	.ret_btf_id	= xdp_md_btf_ids,
+	.arg1_type      = ARG_PTR_TO_CTX,
+	.arg2_type      = ARG_CONST_MAP_PTR,
+	.arg3_type      = ARG_ANYTHING,
+	.arg4_type      = ARG_PTR_TO_LONG,
+};
+
+BPF_CALL_2(bpf_packet_drop, struct dequeue_data *, ctx, struct xdp_frame *, pkt)
+{
+	xdp_return_frame(pkt);
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_packet_drop_proto = {
+	.func           = bpf_packet_drop,
+	.gpl_only       = false,
+	.ret_type       = RET_INTEGER,
+	.arg1_type      = ARG_PTR_TO_CTX,
+	.arg2_type      = ARG_PTR_TO_BTF_ID | OBJ_RELEASE,
+	.arg2_btf_id	= xdp_md_btf_ids,
+};
+
 static unsigned long bpf_skb_copy(void *dst_buff, const void *skb,
 				  unsigned long off, unsigned long len)
 {
@@ -8065,7 +8099,14 @@  xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 static const struct bpf_func_proto *
 dequeue_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
-	return bpf_base_func_proto(func_id);
+	switch (func_id) {
+	case BPF_FUNC_packet_dequeue:
+		return &bpf_packet_dequeue_proto;
+	case BPF_FUNC_packet_drop:
+		return &bpf_packet_drop_proto;
+	default:
+		return bpf_base_func_proto(func_id);
+	}
 }
 
 const struct bpf_func_proto bpf_sock_map_update_proto __weak;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4dd8a563f85d..1dab68a89e18 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5341,6 +5341,23 @@  union bpf_attr {
  *		**-EACCES** if the SYN cookie is not valid.
  *
  *		**-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
+ *
+ * long bpf_packet_dequeue(void *ctx, struct bpf_map *map, u64 flags, u64 *rank)
+ *	Description
+ *		Dequeue the packet at the head of the PIFO in *map* and return a pointer
+ *		to the packet (or NULL if the PIFO is empty).
+ *	Return
+ *		On success, a pointer to the packet, or NULL if the PIFO is empty. The
+ *		packet pointer must be freed using *bpf_packet_drop()* or returning
+ *		the packet pointer. The *rank* pointer will be set to the rank of
+ *		the dequeued packet on success, or a negative error code on error.
+ *
+ * long bpf_packet_drop(void *ctx, void *pkt)
+ *	Description
+ *		Drop *pkt*, which must be a reference previously returned by
+ *		*bpf_packet_dequeue()* (and checked to not be NULL).
+ *	Return
+ *		This always succeeds and returns zero.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5551,6 +5568,8 @@  union bpf_attr {
 	FN(tcp_raw_gen_syncookie_ipv6),	\
 	FN(tcp_raw_check_syncookie_ipv4),	\
 	FN(tcp_raw_check_syncookie_ipv6),	\
+	FN(packet_dequeue),		\
+	FN(packet_drop),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
