Message ID | 20240322050716.357983-1-yonghong.song@linux.dev (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] selftests/bpf: Fix flaky test btf_map_in_map/lookup_update | expand |
On 3/21/24 10:07 PM, Yonghong Song wrote: > Recently, I frequently hit the following test failure: > > [root@arch-fb-vm1 bpf]# ./test_progs -n 33/1 > test_lookup_update:PASS:skel_open 0 nsec > ... > test_lookup_update:PASS:sync_rcu 0 nsec > test_lookup_update:FAIL:map1_leak inner_map1 leaked! > #33/1 btf_map_in_map/lookup_update:FAIL > #33 btf_map_in_map:FAIL > > In the test, after map is closed and then after two rcu grace periods, > it is assumed that map_id is not available to user space. > > But the above assumption cannot be guaranteed. After zero or one > or two rcu grace periods in different siturations, the actual > freeing-map-work is put into a workqueue. Later on, when the work > is dequeued, the map will be actually freed. > See bpf_map_put() in kernel/bpf/syscall.c. > > By using workqueue, there is no ganrantee that map will be actually > freed after a couple of rcu grace periods. This patch removed > such map leak detection and then the test can pass consistently. Please ignore this one. There is another patch with the same subject later but with one more change to fix compilation error. The reason is I send this patch but forget to fold a simple fix in. I waited for half an hour but the email didn't reach me or mailing list. I do not know what is going and thought it may have some issues. So I amended the fix and send another one. It looks like the other patch appeared in the mailing list too. > > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > .../selftests/bpf/prog_tests/btf_map_in_map.c | 24 ------------------- > 1 file changed, 24 deletions(-) > > diff --git a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c > index a8b53b8736f0..5d389eb16295 100644 > --- a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c > +++ b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c > @@ -102,30 +102,6 @@ static void test_lookup_update(void) > CHECK(map1_id == 0, "map1_id", "failed to get ID 1\n"); > CHECK(map2_id == 0, "map2_id", "failed to get ID 2\n"); > > - test_btf_map_in_map__destroy(skel); > - skel = NULL; > - > - /* we need to either wait for or force synchronize_rcu(), before > - * checking for "still exists" condition, otherwise map could still be > - * resolvable by ID, causing false positives. > - * > - * Older kernels (5.8 and earlier) freed map only after two > - * synchronize_rcu()s, so trigger two, to be entirely sure. > - */ > - CHECK(kern_sync_rcu(), "sync_rcu", "failed\n"); > - CHECK(kern_sync_rcu(), "sync_rcu", "failed\n"); > - > - fd = bpf_map_get_fd_by_id(map1_id); > - if (CHECK(fd >= 0, "map1_leak", "inner_map1 leaked!\n")) { > - close(fd); > - goto cleanup; > - } > - fd = bpf_map_get_fd_by_id(map2_id); > - if (CHECK(fd >= 0, "map2_leak", "inner_map2 leaked!\n")) { > - close(fd); > - goto cleanup; > - } > - > cleanup: > test_btf_map_in_map__destroy(skel); > }
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c index a8b53b8736f0..5d389eb16295 100644 --- a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c +++ b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c @@ -102,30 +102,6 @@ static void test_lookup_update(void) CHECK(map1_id == 0, "map1_id", "failed to get ID 1\n"); CHECK(map2_id == 0, "map2_id", "failed to get ID 2\n"); - test_btf_map_in_map__destroy(skel); - skel = NULL; - - /* we need to either wait for or force synchronize_rcu(), before - * checking for "still exists" condition, otherwise map could still be - * resolvable by ID, causing false positives. - * - * Older kernels (5.8 and earlier) freed map only after two - * synchronize_rcu()s, so trigger two, to be entirely sure. - */ - CHECK(kern_sync_rcu(), "sync_rcu", "failed\n"); - CHECK(kern_sync_rcu(), "sync_rcu", "failed\n"); - - fd = bpf_map_get_fd_by_id(map1_id); - if (CHECK(fd >= 0, "map1_leak", "inner_map1 leaked!\n")) { - close(fd); - goto cleanup; - } - fd = bpf_map_get_fd_by_id(map2_id); - if (CHECK(fd >= 0, "map2_leak", "inner_map2 leaked!\n")) { - close(fd); - goto cleanup; - } - cleanup: test_btf_map_in_map__destroy(skel); }
Recently, I frequently hit the following test failure: [root@arch-fb-vm1 bpf]# ./test_progs -n 33/1 test_lookup_update:PASS:skel_open 0 nsec ... test_lookup_update:PASS:sync_rcu 0 nsec test_lookup_update:FAIL:map1_leak inner_map1 leaked! #33/1 btf_map_in_map/lookup_update:FAIL #33 btf_map_in_map:FAIL In the test, after map is closed and then after two rcu grace periods, it is assumed that map_id is not available to user space. But the above assumption cannot be guaranteed. After zero or one or two rcu grace periods in different siturations, the actual freeing-map-work is put into a workqueue. Later on, when the work is dequeued, the map will be actually freed. See bpf_map_put() in kernel/bpf/syscall.c. By using workqueue, there is no ganrantee that map will be actually freed after a couple of rcu grace periods. This patch removed such map leak detection and then the test can pass consistently. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> --- .../selftests/bpf/prog_tests/btf_map_in_map.c | 24 ------------------- 1 file changed, 24 deletions(-)