Message ID | 20240930-b4-slub-kunit-fix-v1-2-32ca9dbbbc11@suse.cz (mailing list archive) |
---|---|
State | New |
Series | slub kunit tests fixes for 6.12 |
On 9/30/24 01:37, Vlastimil Babka wrote:
> Guenter Roeck reports that the new slub kunit tests added by commit
> 4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and
> test_leak_destroy()") cause a lockup on boot on several architectures
> when the kunit tests are configured to be built-in and not modules.
>
> These tests invoke kfree_rcu() and kvfree_rcu_barrier() and boot
> sequence inspection showed the runner for built-in kunit tests
> kunit_run_all_tests() is called before setting system_state to
> SYSTEM_RUNNING and calling rcu_end_inkernel_boot(), so this seems like a
> likely cause. So while I was unable to reproduce the problem myself,
> moving the call to kunit_run_all_tests() a bit later in the boot seems
> to have fixed the lockup problem according to Guenter's limited testing.
>
> No kunit tests should be broken by calling the built-in executor a bit
> later, as when compiled as modules, they are still executed even later
> than this.
>
> Fixes: 4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@roeck-us.net/
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Uladzislau Rezki <urezki@gmail.com>
> Cc: rcu@vger.kernel.org
> Cc: Brendan Higgins <brendanhiggins@google.com>
> Cc: David Gow <davidgow@google.com>
> Cc: Rae Moar <rmoar@google.com>
> Cc: linux-kselftest@vger.kernel.org
> Cc: kunit-dev@googlegroups.com
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  init/main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/init/main.c b/init/main.c
> index c4778edae7972f512d5eefe8400075ac35a70d1c..7890ebb00e84b8bd7bac28923fb1fe571b3e9ee2 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1489,6 +1489,8 @@ static int __ref kernel_init(void *unused)
>
>  	rcu_end_inkernel_boot();
>
> +	kunit_run_all_tests();
> +
>  	do_sysctl_args();
>
>  	if (ramdisk_execute_command) {
> @@ -1579,8 +1581,6 @@ static noinline void __init kernel_init_freeable(void)
>
>  	do_basic_setup();
>
> -	kunit_run_all_tests();
> -
>  	wait_for_initramfs();
>  	console_on_rootfs();
>

Unfortunately it doesn't work. With this patch applied, I get many backtraces
similar to the following, and ultimately the image crashes. This is with arm64.
I do not see the problem if I drop this patch.

Guenter

---
[ 9.465871] KTAP version 1
[ 9.465964] # Subtest: iov_iter
[ 9.466056] # module: kunit_iov_iter
[ 9.466115] 1..12
[ 9.467000] Unable to handle kernel paging request at virtual address ffffc37db5c9f26c
[ 9.467244] Mem abort info:
[ 9.467332] ESR = 0x0000000086000007
[ 9.467454] EC = 0x21: IABT (current EL), IL = 32 bits
[ 9.467576] SET = 0, FnV = 0
[ 9.467667] EA = 0, S1PTW = 0
[ 9.467762] FSC = 0x07: level 3 translation fault
[ 9.467912] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000042a59000
[ 9.468055] [ffffc37db5c9f26c] pgd=0000000000000000, p4d=1000000044b36003, pud=1000000044b37003, pmd=1000000044b3a003, pte=0000000000000000
[ 9.469430] Internal error: Oops: 0000000086000007 [#1] PREEMPT SMP
[ 9.469687] Modules linked in:
[ 9.470035] CPU: 0 UID: 0 PID: 550 Comm: kunit_try_catch Tainted: G N 6.12.0-rc1-00005-ga65e3eb58cdb #1
[ 9.470290] Tainted: [N]=TEST
[ 9.470356] Hardware name: linux,dummy-virt (DT)
[ 9.470530] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 9.470656] pc : iov_kunit_copy_to_kvec+0x0/0x334
[ 9.471055] lr : kunit_try_run_case+0x6c/0x15c
[ 9.471145] sp : ffff800080883de0
[ 9.471210] x29: ffff800080883e20 x28: 0000000000000000 x27: 0000000000000000
[ 9.471376] x26: 0000000000000000 x25: 0000000000000000 x24: ffff80008000bb68
[ 9.471501] x23: ffffc37db3f7093c x22: ffff80008000b940 x21: ffff545847af4c00
[ 9.471622] x20: ffff545847cd3940 x19: ffff80008000bb50 x18: 0000000000000006
[ 9.471742] x17: 6c61746f7420303a x16: 70696b7320303a6c x15: 0000000000000172
[ 9.471863] x14: 0000000000020000 x13: 0000000000000000 x12: ffffc37db6a600c8
[ 9.471983] x11: 0000000000000043 x10: 0000000000000043 x9 : 1fffffffffffffff
[ 9.472122] x8 : 00000000ffffffff x7 : 000000001040d4fd x6 : ffffc37db70c3810
[ 9.472243] x5 : 0000000000000000 x4 : ffffffffc4653600 x3 : 000000003b9ac9ff
[ 9.472363] x2 : 0000000000000001 x1 : ffffc37db5c9f26c x0 : ffff80008000bb50
[ 9.472572] Call trace:
[ 9.472636] iov_kunit_copy_to_kvec+0x0/0x334
[ 9.472740] kunit_generic_run_threadfn_adapter+0x28/0x4c
[ 9.472835] kthread+0x11c/0x120
[ 9.472903] ret_from_fork+0x10/0x20
[ 9.473146] Code: ???????? ???????? ???????? ???????? (????????)
[ 9.473505] ---[ end trace 0000000000000000 ]---
On 9/30/24 11:50, Guenter Roeck wrote:
> On 9/30/24 01:37, Vlastimil Babka wrote:
[...]
>> No kunit tests should be broken by calling the built-in executor a bit
>> later, as when compiled as modules, they are still executed even later
>> than this.
>>

Actually, that is wrong.

Turns out kunit_iov_iter (and other kunit tests) are marked __init.
That means those unit tests have to run before the init code is released,
and it actually _is_ harmful to run the tests after rcu_end_inkernel_boot()
because at that time free_initmem() has already been called.

Guenter

[...]
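The __init lifetime problem Guenter describes can be illustrated with a small userspace model. This is only a sketch: every name in it (model_test, model_free_initmem(), model_run_test()) is hypothetical and not a kernel API, and the "fault" is a return code rather than a real instruction abort like the one in the backtrace.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel APIs. */
typedef int (*model_test_fn)(void);

struct model_test {
	const char *name;
	model_test_fn fn;     /* entry point of the test body */
	bool in_init_text;    /* models a function marked __init */
};

/* Result codes returned by model_run_test(). */
enum { MODEL_OK = 0, MODEL_FAULT = -1 };

/* Models the state after free_initmem() has released init memory. */
static bool model_init_text_freed;

static void model_free_initmem(void)
{
	model_init_text_freed = true;
}

/*
 * Run one test. Calling an __init function after init memory is gone is
 * modelled as a fault instead of actually jumping to an unmapped address.
 */
static int model_run_test(const struct model_test *t)
{
	if (t->in_init_text && model_init_text_freed)
		return MODEL_FAULT;
	return t->fn() == 0 ? MODEL_OK : MODEL_FAULT;
}

/* Trivial test body that always succeeds. */
static int model_passing_body(void)
{
	return 0;
}
```

In this model a test flagged in_init_text passes when run before model_free_initmem() and faults afterwards, while an ordinary test passes in both cases. That matches the order issue in the patch: moving the executor past rcu_end_inkernel_boot() also moves it past free_initmem().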
On 10/1/24 1:55 AM, Guenter Roeck wrote:
> On 9/30/24 11:50, Guenter Roeck wrote:
>> On 9/30/24 01:37, Vlastimil Babka wrote:
[...]
>>> No kunit tests should be broken by calling the built-in executor a bit
>>> later, as when compiled as modules, they are still executed even later
>>> than this.
>>>
>
> Actually, that is wrong.
>
> Turns out kunit_iov_iter (and other kunit tests) are marked __init.
> That means those unit tests have to run before the init code is released,
> and it actually _is_ harmful to run the tests after rcu_end_inkernel_boot()
> because at that time free_initmem() has already been called.

Oh, guess that explains why the kunit_run_all_tests() executor is called so
suspiciously early. Of course when built as modules, __init has a different
lifetime.

Guess I will just skip the two new tests using kfree_rcu() when the slub
kunit is built-in then.

Thanks for testing.

> Guenter

[...]
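The short-term plan mentioned above, skipping the kfree_rcu() tests when the slub kunit suite is built-in, could look roughly like the following userspace sketch. The names model_result and model_test_kfree_rcu() are hypothetical. In the kernel, such a check might be written with IS_BUILTIN() on the test's Kconfig symbol and kunit_skip(), but that is an assumption about the eventual fix, not something stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these types are not KUnit's. */
enum model_status { MODEL_PASSED, MODEL_SKIPPED };

struct model_result {
	enum model_status status;
	const char *reason;   /* NULL unless the test was skipped */
};

/*
 * Model of a test that depends on RCU callbacks making progress.
 * 'builtin' stands in for a compile-time check of how the test was built.
 */
static struct model_result model_test_kfree_rcu(bool builtin)
{
	struct model_result r = { MODEL_PASSED, NULL };

	if (builtin) {
		/* Built-in tests run too early in boot for this to be safe. */
		r.status = MODEL_SKIPPED;
		r.reason = "kfree_rcu() test skipped when built-in";
		return r;
	}

	/* The real test body (allocate, kfree_rcu(), barrier) would go here. */
	return r;
}
```

The point of reporting a skip rather than silently returning is that the result stays visible in the test output, so a built-in run does not look like a pass.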
On Tue, 1 Oct 2024 at 07:55, Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 9/30/24 11:50, Guenter Roeck wrote:
> > On 9/30/24 01:37, Vlastimil Babka wrote:
[...]
> >> No kunit tests should be broken by calling the built-in executor a bit
> >> later, as when compiled as modules, they are still executed even later
> >> than this.
> >>
>
> Actually, that is wrong.
>
> Turns out kunit_iov_iter (and other kunit tests) are marked __init.
> That means those unit tests have to run before the init code is released,
> and it actually _is_ harmful to run the tests after rcu_end_inkernel_boot()
> because at that time free_initmem() has already been called.

Yeah: some tests are marked __init. KUnit does actually mark these with
an attribute, so we can potentially split the execution up into an
'init' part which runs early, and a later part, but there are some
complications if we still want to track the total number of tests and
support filtering, etc. properly.

That's something I think we'll look at for 6.13: in the meantime,
skipping the problematic slub tests when built-in seems to be the right
short-term fix. I'll look into having the built-in executor moved later
for non-init tests once we've worked out how best to adapt the
filter/KTAP output code to do so as cleanly as possible.

Cheers,
-- David

>
> Guenter

[...]
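The split David outlines, an 'init' part that runs early and a later part for everything else, can be sketched as a two-phase runner. This is a userspace model under stated assumptions: model_suite, its is_init flag, and model_run_phase() are hypothetical names, and the flag merely stands in for whatever attribute KUnit uses to mark init suites.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: not KUnit's data structures. */
enum { PHASE_NONE = 0, PHASE_EARLY = 1, PHASE_LATE = 2 };

struct model_suite {
	const char *name;
	bool is_init;        /* stands in for an "init" suite attribute */
	int ran_in_phase;    /* PHASE_NONE until the suite has been run */
};

/*
 * Run every suite that belongs to 'phase' and return how many ran.
 * Init suites belong to the early phase, all others to the late phase.
 */
static size_t model_run_phase(struct model_suite *suites, size_t n, int phase)
{
	size_t ran = 0;

	for (size_t i = 0; i < n; i++) {
		bool wants_early = suites[i].is_init;

		if ((phase == PHASE_EARLY) != wants_early)
			continue;
		suites[i].ran_in_phase = phase;
		ran++;
	}
	return ran;
}
```

Calling model_run_phase() once with PHASE_EARLY and once with PHASE_LATE runs each suite exactly once. The complication mentioned in the message shows up even here: a single plan line with the total count needs the sum of both phases, which is only known if the early phase already accounts for the suites it is deferring.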
diff --git a/init/main.c b/init/main.c
index c4778edae7972f512d5eefe8400075ac35a70d1c..7890ebb00e84b8bd7bac28923fb1fe571b3e9ee2 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1489,6 +1489,8 @@ static int __ref kernel_init(void *unused)
 
 	rcu_end_inkernel_boot();
 
+	kunit_run_all_tests();
+
 	do_sysctl_args();
 
 	if (ramdisk_execute_command) {
@@ -1579,8 +1581,6 @@ static noinline void __init kernel_init_freeable(void)
 
 	do_basic_setup();
 
-	kunit_run_all_tests();
-
 	wait_for_initramfs();
 	console_on_rootfs();
Guenter Roeck reports that the new slub kunit tests added by commit
4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and
test_leak_destroy()") cause a lockup on boot on several architectures
when the kunit tests are configured to be built-in and not modules.

These tests invoke kfree_rcu() and kvfree_rcu_barrier() and boot
sequence inspection showed the runner for built-in kunit tests
kunit_run_all_tests() is called before setting system_state to
SYSTEM_RUNNING and calling rcu_end_inkernel_boot(), so this seems like a
likely cause. So while I was unable to reproduce the problem myself,
moving the call to kunit_run_all_tests() a bit later in the boot seems
to have fixed the lockup problem according to Guenter's limited testing.

No kunit tests should be broken by calling the built-in executor a bit
later, as when compiled as modules, they are still executed even later
than this.

Fixes: 4e1c44b3db79 ("kunit, slub: add test_kfree_rcu() and test_leak_destroy()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/6fcb1252-7990-4f0d-8027-5e83f0fb9409@roeck-us.net/
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Uladzislau Rezki <urezki@gmail.com>
Cc: rcu@vger.kernel.org
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Rae Moar <rmoar@google.com>
Cc: linux-kselftest@vger.kernel.org
Cc: kunit-dev@googlegroups.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 init/main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)