diff mbox series

[bpf,08/11] bpf: Defer bpf_map_put() for inner map in map htab

Message ID 20231107140702.1891778-9-houtao@huaweicloud.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series bpf: Fix the release of inner map | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for bpf, async
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1328 this patch: 1328
netdev/cc_maintainers fail 1 blamed authors not CCed: paulmck@kernel.org; 1 maintainers not CCed: paulmck@kernel.org
netdev/build_clang success Errors and warnings before: 1342 this patch: 1342
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1356 this patch: 1356
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-VM_Test-2 success Logs for Validate matrix.py
bpf/vmtest-bpf-VM_Test-3 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-VM_Test-8 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-VM_Test-4 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-7 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-6 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-5 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-VM_Test-9 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-VM_Test-14 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-VM_Test-15 success Logs for set-matrix
bpf/vmtest-bpf-VM_Test-16 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-VM_Test-17 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-18 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-19 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-20 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-21 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-22 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-23 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-VM_Test-24 success Logs for x86_64-llvm-16 / build / build for x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-25 success Logs for x86_64-llvm-16 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-26 success Logs for x86_64-llvm-16 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-27 success Logs for x86_64-llvm-16 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-28 success Logs for x86_64-llvm-16 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-VM_Test-29 success Logs for x86_64-llvm-16 / veristat
bpf/vmtest-bpf-VM_Test-13 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-VM_Test-12 fail Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-PR fail PR summary
bpf/vmtest-bpf-VM_Test-10 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-VM_Test-11 fail Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc

Commit Message

Hou Tao Nov. 7, 2023, 2:06 p.m. UTC
From: Hou Tao <houtao1@huawei.com>

When updating or deleting a map in map htab, the map may still be
accessed by non-sleepable program or sleepable program. However
bpf_fd_htab_map_update_elem() decreases the ref-count of the inner map
directly through bpf_map_put(), if the ref-count is the last ref-count
which is true for most cases, the inner map will be free by
ops->map_free() in a kworker. But for now, most .map_free() callbacks
don't use synchronize_rcu() or its variants to wait for the elapse of a
RCU grace period, so bpf program which is accessing the inner map may
incur use-after-free problem.

Fix it by deferring the invocation of bpf_map_put() after the elapse of
both one RCU grace period and one tasks trace RCU grace period.

Fixes: bba1dc0b55ac ("bpf: Remove redundant synchronize_rcu.")
Fixes: 638e4b825d52 ("bpf: Allows per-cpu maps and map-in-map in sleepable programs")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
 kernel/bpf/hashtab.c    | 25 +++++++++++++++----------
 kernel/bpf/map_in_map.h |  4 ++--
 2 files changed, 17 insertions(+), 12 deletions(-)
diff mbox series

Patch

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 81b9f237c942b..0013329af6d36 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -1813,10 +1813,10 @@  __htab_map_lookup_and_delete_batch(struct bpf_map *map,
 		} else {
 			value = l->key + roundup_key_size;
 			if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
-				struct bpf_map **inner_map = value;
+				void *inner = READ_ONCE(*(void **)value);
 
-				 /* Actual value is the id of the inner map */
-				map_id = map->ops->map_fd_sys_lookup_elem(*inner_map);
+				/* Actual value is the id of the inner map */
+				map_id = map->ops->map_fd_sys_lookup_elem(inner);
 				value = &map_id;
 			}
 
@@ -2553,12 +2553,16 @@  static struct bpf_map *htab_of_map_alloc(union bpf_attr *attr)
 
 static void *htab_of_map_lookup_elem(struct bpf_map *map, void *key)
 {
-	struct bpf_map **inner_map  = htab_map_lookup_elem(map, key);
+	struct bpf_inner_map_element *element;
+	void **ptr;
 
-	if (!inner_map)
+	ptr = htab_map_lookup_elem(map, key);
+	if (!ptr)
 		return NULL;
 
-	return READ_ONCE(*inner_map);
+	/* element must be no-NULL */
+	element = READ_ONCE(*ptr);
+	return element->map;
 }
 
 static int htab_of_map_gen_lookup(struct bpf_map *map,
@@ -2570,11 +2574,12 @@  static int htab_of_map_gen_lookup(struct bpf_map *map,
 	BUILD_BUG_ON(!__same_type(&__htab_map_lookup_elem,
 		     (void *(*)(struct bpf_map *map, void *key))NULL));
 	*insn++ = BPF_EMIT_CALL(__htab_map_lookup_elem);
-	*insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 2);
+	*insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 3);
 	*insn++ = BPF_ALU64_IMM(BPF_ADD, ret,
 				offsetof(struct htab_elem, key) +
 				round_up(map->key_size, 8));
 	*insn++ = BPF_LDX_MEM(BPF_DW, ret, ret, 0);
+	*insn++ = BPF_LDX_MEM(BPF_DW, ret, ret, 0);
 
 	return insn - insn_buf;
 }
@@ -2592,9 +2597,9 @@  const struct bpf_map_ops htab_of_maps_map_ops = {
 	.map_get_next_key = htab_map_get_next_key,
 	.map_lookup_elem = htab_of_map_lookup_elem,
 	.map_delete_elem = htab_map_delete_elem,
-	.map_fd_get_ptr = bpf_map_fd_get_ptr,
-	.map_fd_put_ptr = bpf_map_fd_put_ptr,
-	.map_fd_sys_lookup_elem = bpf_map_fd_sys_lookup_elem,
+	.map_fd_get_ptr = bpf_map_of_map_fd_get_ptr,
+	.map_fd_put_ptr = bpf_map_of_map_fd_put_ptr,
+	.map_fd_sys_lookup_elem = bpf_map_of_map_fd_sys_lookup_elem,
 	.map_gen_lookup = htab_of_map_gen_lookup,
 	.map_check_btf = map_check_no_btf,
 	.map_mem_usage = htab_map_mem_usage,
diff --git a/kernel/bpf/map_in_map.h b/kernel/bpf/map_in_map.h
index 1fa688b8882ae..f8719bcd7c254 100644
--- a/kernel/bpf/map_in_map.h
+++ b/kernel/bpf/map_in_map.h
@@ -10,8 +10,8 @@  struct file;
 struct bpf_map;
 
 struct bpf_inner_map_element {
-	/* map must be the first member, array_of_map_gen_lookup() depends on it
-	 * to dereference map correctly.
+	/* map must be the first member, array_of_map_gen_lookup() and
+	 * htab_of_map_lookup_elem() depend on it to dereference map correctly.
 	 */
 	struct bpf_map *map;
 	struct rcu_head rcu;