
[bpf-next,v1,2/3] bpf: add bpf_relay_output kfunc

Message ID 20231227100130.84501-3-lulie@linux.alibaba.com (mailing list archive)
State Changes Requested
Delegated to: BPF
Series bpf: introduce BPF_MAP_TYPE_RELAY

Checks

Context Check Description
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-32 fail Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-8 fail Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 fail Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 fail Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-33 fail Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-26 fail Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 fail Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 fail Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-27 fail Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 fail Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-22 fail Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-31 fail Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-37 fail Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-38 fail Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39 fail Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 fail Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 fail Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-16 fail Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 1161 this patch: 1162
netdev/cc_maintainers success CCed 12 of 12 maintainers
netdev/build_clang success Errors and warnings before: 1143 this patch: 1143
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 1188 this patch: 1189
netdev/checkpatch warning CHECK: Alignment should match open parenthesis
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-13 fail Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc

Commit Message

Philo Lu Dec. 27, 2023, 10:01 a.m. UTC
A kfunc, bpf_relay_output, is needed to write into the relay channel.
Its usage is the same as that of the bpf_ringbuf_output helper. It
only works after the relay files are set up, i.e., after calling
map_update_elem for the created relay map.

Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
---
 kernel/bpf/helpers.c  |  3 +++
 kernel/bpf/relaymap.c | 22 ++++++++++++++++++++++
 2 files changed, 25 insertions(+)

Comments

Hou Tao Dec. 27, 2023, 2:23 p.m. UTC | #1
Hi,

On 12/27/2023 6:01 PM, Philo Lu wrote:
> A kfunc is needed to write into the relay channel, named
> bpf_relay_output. The usage is same as bpf_ringbuf_output helper. It
> only works after relay files are set, i.e., after calling
> map_update_elem for the created relay map.
>
> Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
> ---
>  kernel/bpf/helpers.c  |  3 +++
>  kernel/bpf/relaymap.c | 22 ++++++++++++++++++++++
>  2 files changed, 25 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index be72824f32b2..22480b69ff27 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -2617,6 +2617,9 @@ BTF_ID_FLAGS(func, bpf_dynptr_is_null)
>  BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
>  BTF_ID_FLAGS(func, bpf_dynptr_size)
>  BTF_ID_FLAGS(func, bpf_dynptr_clone)
> +#ifdef CONFIG_RELAY
> +BTF_ID_FLAGS(func, bpf_relay_output)
> +#endif
>  BTF_SET8_END(common_btf_ids)

Could you explain why bpf_relay_output is placed in
common_btf_ids instead of generic_kfunc_set?
>  
>  static const struct btf_kfunc_id_set common_kfunc_set = {
> diff --git a/kernel/bpf/relaymap.c b/kernel/bpf/relaymap.c
> index 02b33a8e6b6c..37280d60133c 100644
> --- a/kernel/bpf/relaymap.c
> +++ b/kernel/bpf/relaymap.c
> @@ -6,6 +6,7 @@
>  #include <linux/slab.h>
>  #include <linux/bpf.h>
>  #include <linux/err.h>
> +#include <linux/btf.h>
>  
>  #define RELAY_CREATE_FLAG_MASK (BPF_F_OVERWRITE)
>  
> @@ -197,3 +198,24 @@ const struct bpf_map_ops relay_map_ops = {
>  	.map_mem_usage = relay_map_mem_usage,
>  	.map_btf_id = &relay_map_btf_ids[0],
>  };
> +
> +__bpf_kfunc_start_defs();
> +
> +__bpf_kfunc int bpf_relay_output(struct bpf_map *map,
> +				   void *data, u64 data__sz, u32 flags)
> +{
> +	struct bpf_relay_map *rmap;
> +
> +	/* not support any flag now */
> +	if (unlikely(!map || flags))
> +		return -EINVAL;
> +
> +	rmap = container_of(map, struct bpf_relay_map, map);

How does bpf_relay_output() guarantee that the passed map is a relay
map? And just like bpf_map_sum_elem_count(), I think KF_TRUSTED_ARGS is
also necessary for bpf_relay_output().
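
To make the concern concrete, here is a minimal userspace sketch of a type check done before the container_of() downcast. It is a model under stated assumptions, not the kernel code: the enum, the flattened has_base_filename field, the bytes_written counter, and the name relay_output_checked are all hypothetical stand-ins, and the real fix may instead rely on verifier-side argument checking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types (hypothetical, for illustration). */
enum model_map_type { MODEL_MAP_TYPE_HASH, MODEL_MAP_TYPE_RELAY };

struct bpf_map {
	enum model_map_type map_type;
};

struct bpf_relay_map {
	struct bpf_map map;
	int has_base_filename;	/* flattened; the patch reads relay_chan->has_base_filename */
	size_t bytes_written;	/* stands in for the relay channel contents */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int relay_output_checked(struct bpf_map *map, const void *data,
				size_t size, unsigned int flags)
{
	struct bpf_relay_map *rmap;

	if (!map || flags)
		return -EINVAL;

	/* Reject other map types before the downcast, otherwise
	 * container_of() yields a pointer into unrelated memory.
	 */
	if (map->map_type != MODEL_MAP_TYPE_RELAY)
		return -EINVAL;

	rmap = container_of(map, struct bpf_relay_map, map);
	if (!rmap->has_base_filename)
		return -ENOENT;

	assert(data != NULL);
	rmap->bytes_written += size;	/* stands in for relay_write() */
	return 0;
}
```

In this model a hash map passed by mistake is rejected with -EINVAL instead of being reinterpreted as a relay map.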
> +	if (!rmap->relay_chan->has_base_filename)
> +		return -ENOENT;
> +

I think a comment is needed here. It needs to explain why checking
->has_base_filename is enough to guarantee that running
bpf_relay_output() and .map_update_elem() concurrently is safe.

> +	relay_write(rmap->relay_chan, data, data__sz);
> +	return 0;
> +}
> +
> +__bpf_kfunc_end_defs();
Alexei Starovoitov Dec. 27, 2023, 6:07 p.m. UTC | #2
On Wed, Dec 27, 2023 at 2:01 AM Philo Lu <lulie@linux.alibaba.com> wrote:
>
> +__bpf_kfunc int bpf_relay_output(struct bpf_map *map,
> +                                  void *data, u64 data__sz, u32 flags)
> +{
> +       struct bpf_relay_map *rmap;
> +
> +       /* not support any flag now */
> +       if (unlikely(!map || flags))
> +               return -EINVAL;
> +
> +       rmap = container_of(map, struct bpf_relay_map, map);
> +       if (!rmap->relay_chan->has_base_filename)
> +               return -ENOENT;
> +
> +       relay_write(rmap->relay_chan, data, data__sz);
> +       return 0;

This just opens a can of worms.
The above is not NMI safe. relay_write() can be used only from a
known context, which effectively makes it unusable from bpf tracing
progs that can kprobe-attach anywhere in the kernel.
The perf_event buffer is the only sure way to deliver events to user
space with overwrite.
bpf ringbuf is best effort due to
if (in_nmi()) if (!spin_trylock_irqsave

Sorry, but it's a nack to allowing bpf progs to interface with relay.
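
The "best effort" behaviour mentioned for bpf ringbuf can be modelled in userspace roughly as below. This is a sketch under assumptions, not the kernel implementation: struct model_ring, model_output, the in_nmi parameter, and the -EBUSY return value are hypothetical, and a C11 atomic_flag stands in for the spinlock. The point it illustrates is that a writer in NMI context must not spin on the lock, because the lock holder may be the very context it interrupted, so it only tries once and drops the event on failure.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a lock-protected output buffer. */
struct model_ring {
	atomic_flag lock;
	char buf[64];
	size_t used;
};

static int model_output(struct model_ring *rb, const void *data,
			size_t size, bool in_nmi)
{
	int ret = 0;

	if (in_nmi) {
		/* Try once; spinning here could deadlock against the
		 * interrupted lock holder, so the event is dropped.
		 */
		if (atomic_flag_test_and_set_explicit(&rb->lock,
						      memory_order_acquire))
			return -EBUSY;
	} else {
		/* Outside NMI it is safe to wait for the lock. */
		while (atomic_flag_test_and_set_explicit(&rb->lock,
							 memory_order_acquire))
			;
	}

	if (size > sizeof(rb->buf) - rb->used) {
		ret = -ENOSPC;
	} else {
		memcpy(rb->buf + rb->used, data, size);
		rb->used += size;
	}

	atomic_flag_clear_explicit(&rb->lock, memory_order_release);
	return ret;
}
```

A writer that takes no such precaution, as relay_write() is described above, has no safe fallback when it interrupts another writer.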

Patch

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index be72824f32b2..22480b69ff27 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2617,6 +2617,9 @@  BTF_ID_FLAGS(func, bpf_dynptr_is_null)
 BTF_ID_FLAGS(func, bpf_dynptr_is_rdonly)
 BTF_ID_FLAGS(func, bpf_dynptr_size)
 BTF_ID_FLAGS(func, bpf_dynptr_clone)
+#ifdef CONFIG_RELAY
+BTF_ID_FLAGS(func, bpf_relay_output)
+#endif
 BTF_SET8_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
diff --git a/kernel/bpf/relaymap.c b/kernel/bpf/relaymap.c
index 02b33a8e6b6c..37280d60133c 100644
--- a/kernel/bpf/relaymap.c
+++ b/kernel/bpf/relaymap.c
@@ -6,6 +6,7 @@ 
 #include <linux/slab.h>
 #include <linux/bpf.h>
 #include <linux/err.h>
+#include <linux/btf.h>
 
 #define RELAY_CREATE_FLAG_MASK (BPF_F_OVERWRITE)
 
@@ -197,3 +198,24 @@  const struct bpf_map_ops relay_map_ops = {
 	.map_mem_usage = relay_map_mem_usage,
 	.map_btf_id = &relay_map_btf_ids[0],
 };
+
+__bpf_kfunc_start_defs();
+
+__bpf_kfunc int bpf_relay_output(struct bpf_map *map,
+				   void *data, u64 data__sz, u32 flags)
+{
+	struct bpf_relay_map *rmap;
+
+	/* not support any flag now */
+	if (unlikely(!map || flags))
+		return -EINVAL;
+
+	rmap = container_of(map, struct bpf_relay_map, map);
+	if (!rmap->relay_chan->has_base_filename)
+		return -ENOENT;
+
+	relay_write(rmap->relay_chan, data, data__sz);
+	return 0;
+}
+
+__bpf_kfunc_end_defs();