[bpf-next] selftests/bpf: fix potential premature unload in bpf_testmod

Message ID: 20240109164317.16371-1-asavkov@redhat.com
State: Superseded
Delegated to: BPF

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17 and -O2 optimization
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for bpf-next
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers success CCed 0 of 0 maintainers
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff fail author Signed-off-by missing
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch warning WARNING: Please use correct Fixes: style 'Fixes: <12 chars of sha1> ("<title line>")' - ie: 'Fixes: 65eb006d85a2 ("bpf: Move kernel test kfuncs to bpf_testmod")'
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Artem Savkov Jan. 9, 2024, 4:43 p.m. UTC
It is possible for bpf_kfunc_call_test_release() to be called from
bpf_map_free_deferred() when bpf_testmod is already unloaded and
prog_test_struct.cnt, which it tries to decrement, is no longer in memory.
This patch tries to fix the issue by waiting for all references to be
dropped in bpf_testmod_exit().

The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
synchronous grace periods urgently").

Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
---
 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
 1 file changed, 4 insertions(+)
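
For illustration only, the lifetime rule described in the commit message can be modeled in userspace C. This is a minimal sketch, not the bpf_testmod code: struct test_obj, module_obj, obj_acquire(), obj_release() and module_can_unload() are hypothetical names, and C11 atomics are assumed as a stand-in for the kernel's refcount_t.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the module-owned object; not the kernel struct. */
struct test_obj {
	atomic_int cnt;	/* starts at 1: the owner's own reference */
};

/* Lives in "module" memory, so it disappears when the module is unloaded. */
static struct test_obj module_obj = { .cnt = 1 };

/* Models the acquire side: takes a reference and hands out the object. */
static struct test_obj *obj_acquire(void)
{
	atomic_fetch_add(&module_obj.cnt, 1);
	return &module_obj;
}

/* Models the release side, which may run long after the acquire. */
static void obj_release(struct test_obj *p)
{
	atomic_fetch_sub(&p->cnt, 1);
}

/* Unloading is only safe once every outside reference has been dropped. */
static bool module_can_unload(void)
{
	return atomic_load(&module_obj.cnt) <= 1;
}
```

In this model, calling obj_acquire() twice and obj_release() once leaves module_can_unload() false; it becomes true only after the second release. A release that ran after the object's memory was gone would be the use-after-unload the patch is trying to avoid.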

Comments

Yonghong Song Jan. 9, 2024, 7:40 p.m. UTC | #1
On 1/9/24 8:43 AM, Artem Savkov wrote:
> It is possible for bpf_kfunc_call_test_release() to be called from
> bpf_map_free_deferred() when bpf_testmod is already unloaded and
> prog_test_struct.cnt, which it tries to decrement, is no longer in memory.
> This patch tries to fix the issue by waiting for all references to be
> dropped in bpf_testmod_exit().
>
> The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> synchronous grace periods urgently").
>
> Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")

Please add your Signed-off-by tag.

I think the root cause is that bpf_kfunc_call_test_acquire() kfunc
is defined in bpf_testmod and the kfunc returns some data in bpf_testmod.
But the release function bpf_kfunc_call_test_release() is in the kernel.
The release func tries to access some data in bpf_testmod which might
have been unloaded. The prog_test_ref_kfunc is defined in the kernel, so
no bpf_testmod BTF reference is held, so bpf_testmod can be unloaded before
bpf_kfunc_call_test_release().
As you mentioned, we won't have this issue if bpf_kfunc_call_test_acquire()
is also in the kernel.

I think putting bpf_kfunc_call_test_acquire() in bpf_testmod and
bpf_kfunc_call_test_release() in kernel is not a good idea and confusing.
But since this is only for tests, I guess we can live with that. With that,

Acked-by: Yonghong Song <yonghong.song@linux.dev>

> ---
>   tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> index 91907b321f913..63f0dbd016703 100644
> --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> @@ -2,6 +2,7 @@
>   /* Copyright (c) 2020 Facebook */
>   #include <linux/btf.h>
>   #include <linux/btf_ids.h>
> +#include <linux/delay.h>
>   #include <linux/error-injection.h>
>   #include <linux/init.h>
>   #include <linux/module.h>
> @@ -544,6 +545,9 @@ static int bpf_testmod_init(void)
>   
>   static void bpf_testmod_exit(void)
>   {
> +	while (refcount_read(&prog_test_struct.cnt) > 1)
> +		msleep(20);
> +
>   	return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
>   }
>
Artem Savkov Jan. 10, 2024, 8:14 a.m. UTC | #2
On Tue, Jan 09, 2024 at 11:40:38AM -0800, Yonghong Song wrote:
> 
> On 1/9/24 8:43 AM, Artem Savkov wrote:
> > It is possible for bpf_kfunc_call_test_release() to be called from
> > bpf_map_free_deferred() when bpf_testmod is already unloaded and
> > prog_test_struct.cnt, which it tries to decrement, is no longer in memory.
> > This patch tries to fix the issue by waiting for all references to be
> > dropped in bpf_testmod_exit().
> > 
> > The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> > but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> > synchronous grace periods urgently").
> > 
> > Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> 
> Please add your Signed-off-by tag.

Thanks for noticing. Will resend with signed-off-by and your ack.

> I think the root cause is that bpf_kfunc_call_test_acquire() kfunc
> is defined in bpf_testmod and the kfunc returns some data in bpf_testmod.
> But the release function bpf_kfunc_call_test_release() is in the kernel.
> The release func tries to access some data in bpf_testmod which might
> have been unloaded. The prog_test_ref_kfunc is defined in the kernel, so
> no bpf_testmod BTF reference is held, so bpf_testmod can be unloaded before
> bpf_kfunc_call_test_release().
> As you mentioned, we won't have this issue if bpf_kfunc_call_test_acquire()
> is also in the kernel.
> 
> I think putting bpf_kfunc_call_test_acquire() in bpf_testmod and
> bpf_kfunc_call_test_release() in kernel is not a good idea and confusing.
> But since this is only for tests, I guess we can live with that. With that,

Correct. 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
also mentions why bpf_kfunc_call_test_release() is not in the module and
states that this is temporary. I'll add a comment in v2 so the wait can
be removed once the functions are re-united.
 
> Acked-by: Yonghong Song <yonghong.song@linux.dev>
> 
> > ---
> >   tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > index 91907b321f913..63f0dbd016703 100644
> > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > @@ -2,6 +2,7 @@
> >   /* Copyright (c) 2020 Facebook */
> >   #include <linux/btf.h>
> >   #include <linux/btf_ids.h>
> > +#include <linux/delay.h>
> >   #include <linux/error-injection.h>
> >   #include <linux/init.h>
> >   #include <linux/module.h>
> > @@ -544,6 +545,9 @@ static int bpf_testmod_init(void)
> >   static void bpf_testmod_exit(void)
> >   {
> > +	while (refcount_read(&prog_test_struct.cnt) > 1)
> > +		msleep(20);
> > +
> >   	return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
> >   }
>
Jiri Olsa Jan. 10, 2024, 12:49 p.m. UTC | #3
On Wed, Jan 10, 2024 at 09:14:51AM +0100, Artem Savkov wrote:
> On Tue, Jan 09, 2024 at 11:40:38AM -0800, Yonghong Song wrote:
> > 
> > On 1/9/24 8:43 AM, Artem Savkov wrote:
> > > It is possible for bpf_kfunc_call_test_release() to be called from
> > > bpf_map_free_deferred() when bpf_testmod is already unloaded and
> > > prog_test_struct.cnt, which it tries to decrement, is no longer in memory.
> > > This patch tries to fix the issue by waiting for all references to be
> > > dropped in bpf_testmod_exit().
> > > 
> > > The issue can be triggered by running 'test_progs -t map_kptr' in 6.5,
> > > but is obscured in 6.6 by d119357d07435 ("rcu-tasks: Treat only
> > > synchronous grace periods urgently").
> > > 
> > > Fixes: 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> > 
> > Please add your Signed-off-by tag.
> 
> Thanks for noticing. Will resend with signed-off-by and your ack.
> 
> > I think the root cause is that bpf_kfunc_call_test_acquire() kfunc
> > is defined in bpf_testmod and the kfunc returns some data in bpf_testmod.
> > But the release function bpf_kfunc_call_test_release() is in the kernel.
> > The release func tries to access some data in bpf_testmod which might
> > have been unloaded. The prog_test_ref_kfunc is defined in the kernel, so
> > no bpf_testmod BTF reference is held, so bpf_testmod can be unloaded before
> > bpf_kfunc_call_test_release().
> > As you mentioned, we won't have this issue if bpf_kfunc_call_test_acquire()
> > is also in the kernel.
> > 
> > I think putting bpf_kfunc_call_test_acquire() in bpf_testmod and
> > bpf_kfunc_call_test_release() in kernel is not a good idea and confusing.
> > But since this is only for tests, I guess we can live with that. With that,
> 
> Correct. 65eb006d85a2a ("bpf: Move kernel test kfuncs to bpf_testmod")
> also mentions why bpf_kfunc_call_test_release() is not in the module and
> states that this is temporary. I'll add a comment in v2 so the wait can
> be removed once the functions are re-united.

I somehow recall it has to do with the fact that you can't have a trusted
pointer to a module's object, so that's why those structs had to stay
in the kernel, but I might be wrong.

jirka

>  
> > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > 
> > > ---
> > >   tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 4 ++++
> > >   1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > index 91907b321f913..63f0dbd016703 100644
> > > --- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > +++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
> > > @@ -2,6 +2,7 @@
> > >   /* Copyright (c) 2020 Facebook */
> > >   #include <linux/btf.h>
> > >   #include <linux/btf_ids.h>
> > > +#include <linux/delay.h>
> > >   #include <linux/error-injection.h>
> > >   #include <linux/init.h>
> > >   #include <linux/module.h>
> > > @@ -544,6 +545,9 @@ static int bpf_testmod_init(void)
> > >   static void bpf_testmod_exit(void)
> > >   {
> > > +	while (refcount_read(&prog_test_struct.cnt) > 1)
> > > +		msleep(20);
> > > +
> > >   	return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
> > >   }
> > 
> 
> -- 
> Regards,
>   Artem
>

Patch

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 91907b321f913..63f0dbd016703 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -2,6 +2,7 @@ 
 /* Copyright (c) 2020 Facebook */
 #include <linux/btf.h>
 #include <linux/btf_ids.h>
+#include <linux/delay.h>
 #include <linux/error-injection.h>
 #include <linux/init.h>
 #include <linux/module.h>
@@ -544,6 +545,9 @@  static int bpf_testmod_init(void)
 
 static void bpf_testmod_exit(void)
 {
+	while (refcount_read(&prog_test_struct.cnt) > 1)
+		msleep(20);
+
 	return sysfs_remove_bin_file(kernel_kobj, &bin_attr_bpf_testmod_file);
 }
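
The polling wait added to bpf_testmod_exit() can be tried out in userspace. The following is a hedged sketch under stated assumptions, not kernel code: ref_cnt, sleep_ms(), deferred_release(), wait_for_refs() and simulate_unload() are hypothetical names, a POSIX thread plays the role of the deferred release, and nanosleep() stands in for msleep().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical counter standing in for prog_test_struct.cnt (owner holds 1). */
static atomic_int ref_cnt = 1;

/* Userspace stand-in for msleep(): sleep for the given number of milliseconds. */
static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Models a deferred release: the reference is dropped some time later. */
static void *deferred_release(void *arg)
{
	(void)arg;
	sleep_ms(50);
	atomic_fetch_sub(&ref_cnt, 1);
	return NULL;
}

/* Models the exit path: poll until only the owner's reference remains. */
static int wait_for_refs(void)
{
	int polls = 0;

	while (atomic_load(&ref_cnt) > 1) {
		sleep_ms(20);
		polls++;
	}
	return polls;
}

/* Takes one extra reference, releases it from another thread, then waits.
 * Returns the number of polls, or -1 if the thread could not be created. */
static int simulate_unload(void)
{
	pthread_t t;
	int polls;

	atomic_fetch_add(&ref_cnt, 1);
	if (pthread_create(&t, NULL, deferred_release, NULL) != 0) {
		atomic_fetch_sub(&ref_cnt, 1);
		return -1;
	}
	polls = wait_for_refs();
	pthread_join(t, NULL);
	return polls;
}
```

Calling simulate_unload() returns only after the extra reference has been dropped, so ref_cnt is back to 1 afterwards. As in the patch, the loop has no timeout: if a reference were never released, the exit path would wait forever.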