diff mbox series

of: unittest: Run overlay apply/revert sequence three times

Message ID a9fb4eb560c58d11a7f167bc78a137b46e76cf15.1692699743.git.geert+renesas@glider.be (mailing list archive)
State Mainlined
Commit 02dfc6acea1ca652cacfc92f4231f058f84f0a73
Delegated to: Geert Uytterhoeven
Headers show
Series of: unittest: Run overlay apply/revert sequence three times | expand

Commit Message

Geert Uytterhoeven Aug. 22, 2023, 10:22 a.m. UTC
Run the test for the overlay apply/revert sequence three times, to
test if there are unbalanced of_node_put() calls causing reference
counts to become negative.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
This is a reproducer for the issue fixed by commit 7882541ca06d51a6
("of/platform: increase refcount of fwnode") in dt/linus.
---
 drivers/of/unittest.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Rob Herring (Arm) Aug. 22, 2023, 3:32 p.m. UTC | #1
On Tue, Aug 22, 2023 at 12:22:34PM +0200, Geert Uytterhoeven wrote:
> Run the test for the overlay apply/revert sequence three times, to
> test if there are unbalanced of_node_put() calls causing reference
> counts to become negative.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> This is a reproducer for the issue fixed by commit 7882541ca06d51a6
> ("of/platform: increase refcount of fwnode") in dt/linus.

Is this necessary? There were WARN backtraces without that fix.

Rob
Geert Uytterhoeven Aug. 22, 2023, 5:52 p.m. UTC | #2
Hi Rob,

On Tue, Aug 22, 2023 at 5:32 PM Rob Herring <robh@kernel.org> wrote:
> On Tue, Aug 22, 2023 at 12:22:34PM +0200, Geert Uytterhoeven wrote:
> > Run the test for the overlay apply/revert sequence three times, to
> > test if there are unbalanced of_node_put() calls causing reference
> > counts to become negative.
> >
> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > ---
> > This is a reproducer for the issue fixed by commit 7882541ca06d51a6
> > ("of/platform: increase refcount of fwnode") in dt/linus.
>
> Is this necessary? There were WARN backtraces without that fix.

Did you see them?
Peng saw them with the out-of-tree jailhouse hypervisor enable/disable
test, and I saw them with the out-of-tree overlay configfs patches.
I am not aware of any in-tree kernel code triggering them.  If we
would have had this in the unittests, I would have noticed this
regression earlier...

Gr{oetje,eeting}s,

                        Geert
Rob Herring (Arm) Aug. 22, 2023, 5:59 p.m. UTC | #3
On Tue, Aug 22, 2023 at 12:52 PM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>
> Hi Rob,
>
> On Tue, Aug 22, 2023 at 5:32 PM Rob Herring <robh@kernel.org> wrote:
> > On Tue, Aug 22, 2023 at 12:22:34PM +0200, Geert Uytterhoeven wrote:
> > > Run the test for the overlay apply/revert sequence three times, to
> > > test if there are unbalanced of_node_put() calls causing reference
> > > counts to become negative.
> > >
> > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > ---
> > > This is a reproducer for the issue fixed by commit 7882541ca06d51a6
> > > ("of/platform: increase refcount of fwnode") in dt/linus.
> >
> > Is this necessary? There were WARN backtraces without that fix.
>
> Did you see them?

Yes, but that was also with your series applied. When I tested the
fix, I had dropped that, so maybe your series triggered it too.

Rob
Rob Herring (Arm) Aug. 22, 2023, 7:21 p.m. UTC | #4
On Tue, Aug 22, 2023 at 12:59 PM Rob Herring <robh@kernel.org> wrote:
>
> On Tue, Aug 22, 2023 at 12:52 PM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> >
> > Hi Rob,
> >
> > On Tue, Aug 22, 2023 at 5:32 PM Rob Herring <robh@kernel.org> wrote:
> > > On Tue, Aug 22, 2023 at 12:22:34PM +0200, Geert Uytterhoeven wrote:
> > > > Run the test for the overlay apply/revert sequence three times, to
> > > > test if there are unbalanced of_node_put() calls causing reference
> > > > counts to become negative.
> > > >
> > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > > ---
> > > > This is a reproducer for the issue fixed by commit 7882541ca06d51a6
> > > > ("of/platform: increase refcount of fwnode") in dt/linus.
> > >
> > > Is this necessary? There were WARN backtraces without that fix.
> >
> > Did you see them?
>
> Yes, but that was also with your series applied. When I tested the
> fix, I had dropped that, so maybe your series triggered it too.

With the fix reverted on my dt/linus branch, I get this:

[ 1.269977] ### dt-test ### pass of_unittest_overlay_10():2507
[ 1.270123] ### dt-test ### pass of_unittest_overlay_10():2513
[ 1.270290] ### dt-test ### pass of_unittest_overlay_10():2519
[ 1.275673] ------------[ cut here ]------------
[ 1.275790] refcount_t: addition on 0; use-after-free.
[ 1.276118] WARNING: CPU: 1 PID: 1 at lib/refcount.c:25
refcount_warn_saturate+0x120/0x144
[ 1.276343] Modules linked in:
[ 1.276558] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G N
6.5.0-rc1-00010-g8e081e8346d1 #84
[ 1.276791] Hardware name: linux,dummy-virt (DT)
[ 1.276973] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1.277114] pc : refcount_warn_saturate+0x120/0x144
[ 1.277219] lr : refcount_warn_saturate+0x120/0x144
[ 1.277332] sp : ffff80008002b630
[ 1.277410] x29: ffff80008002b630 x28: ffff80008002b978 x27: ffff0a00ffffff05
[ 1.277585] x26: ffff80008002b9a9 x25: ffffd5ba808bec2f x24: ffff452a08002f18
[ 1.277738] x23: ffff0000ffffff00 x22: ffff80008002b978 x21: 0000000000000000
[ 1.277895] x20: ffff80008002b9dd x19: ffff452a08002f80 x18: 0000000000000006
[ 1.278052] x17: 73657474696e752d x16: 747365743a313174 x15: 0765076507720766
[ 1.278200] x14: 072d077207650774 x13: ffffd5b9814a0660 x12: 000000000000069c
[ 1.278357] x11: 0000000000000234 x10: ffffd5b9814f8660 x9 : ffffd5b9814a0660
[ 1.278529] x8 : 00000000ffffefff x7 : ffffd5b9814f8660 x6 : 80000000fffff000
[ 1.278680] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 1.278829] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff452a02cb8000
[ 1.279041] Call trace:
[ 1.279171] refcount_warn_saturate+0x120/0x144
[ 1.279322] kobject_get+0xb8/0xbc
[ 1.279416] of_node_get+0x20/0x34
[ 1.279497] of_fwnode_get+0x34/0x54
[ 1.279567] fwnode_get_nth_parent+0xf0/0x12c
[ 1.279666] fwnode_full_name_string+0x48/0xb8
[ 1.279764] device_node_string+0x380/0x5a4
[ 1.279841] pointer+0x38c/0x4ac
[ 1.279900] vsnprintf+0x14c/0x6d0
[ 1.279970] vprintk_store+0x168/0x47c
[ 1.280055] vprintk_emit+0x104/0x2b4
[ 1.280122] vprintk_default+0x38/0x44
[ 1.280189] vprintk+0xd4/0xf0
[ 1.280262] _printk+0x5c/0x84
[ 1.280332] of_node_release+0x1ac/0x1b4
[ 1.280413] kobject_put+0xb0/0x220
[ 1.280492] of_changeset_destroy+0x50/0xf4
[ 1.280584] free_overlay_changeset+0x24/0x104
[ 1.280680] of_overlay_remove+0x240/0x2b8
[ 1.280766] of_unittest_apply_revert_overlay_check.constprop.0+0xa8/0x310
[ 1.280904] of_unittest+0xbf4/0x2e64
[ 1.280986] do_one_initcall+0x7c/0x1c0
[ 1.281072] kernel_init_freeable+0x1c4/0x294
[ 1.281161] kernel_init+0x24/0x1dc
[ 1.281242] ret_from_fork+0x10/0x20

Then later on:

[ 1.459652] ### dt-test ### EXPECT \ : ------------[ cut here ]------------
[ 1.459877] ### dt-test ### EXPECT \ : WARNING: <<all>>
[ 1.460227] ### dt-test ### EXPECT \ : refcount_t: underflow; use-after-free.
[ 1.460508] ### dt-test ### EXPECT \ : ---[ end trace <<int>> ]---
[ 1.460860] ### dt-test ### pass of_unittest_lifecycle():3187
[ 1.461455] ### dt-test ### EXPECT / : ---[ end trace <<int>> ]---
[ 1.461463] ### dt-test ### EXPECT / : refcount_t: underflow; use-after-free.
[ 1.461789] ### dt-test ### EXPECT / : WARNING: <<all>>
[ 1.462137] ### dt-test ### EXPECT / : ------------[ cut here ]------------

So it seems we were getting the warning, but at the wrong point.

Rob
Geert Uytterhoeven Aug. 23, 2023, 7:41 a.m. UTC | #5
Hi Rob,

On Tue, Aug 22, 2023 at 9:22 PM Rob Herring <robh@kernel.org> wrote:
> On Tue, Aug 22, 2023 at 12:59 PM Rob Herring <robh@kernel.org> wrote:
> > On Tue, Aug 22, 2023 at 12:52 PM Geert Uytterhoeven
> > <geert@linux-m68k.org> wrote:
> > >
> > > Hi Rob,
> > >
> > > On Tue, Aug 22, 2023 at 5:32 PM Rob Herring <robh@kernel.org> wrote:
> > > > On Tue, Aug 22, 2023 at 12:22:34PM +0200, Geert Uytterhoeven wrote:
> > > > > Run the test for the overlay apply/revert sequence three times, to
> > > > > test if there are unbalanced of_node_put() calls causing reference
> > > > > counts to become negative.
> > > > >
> > > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > > > ---
> > > > > This is a reproducer for the issue fixed by commit 7882541ca06d51a6
> > > > > ("of/platform: increase refcount of fwnode") in dt/linus.
> > > >
> > > > Is this necessary? There were WARN backtraces without that fix.
> > >
> > > Did you see them?
> >
> > Yes, but that was also with your series applied. When I tested the
> > fix, I had dropped that, so maybe your series triggered it too.
>
> With the fix reverted on my dt/linus branch, I get this:
>
> [ 1.269977] ### dt-test ### pass of_unittest_overlay_10():2507
> [ 1.270123] ### dt-test ### pass of_unittest_overlay_10():2513
> [ 1.270290] ### dt-test ### pass of_unittest_overlay_10():2519
> [ 1.275673] ------------[ cut here ]------------
> [ 1.275790] refcount_t: addition on 0; use-after-free.
> [ 1.276118] WARNING: CPU: 1 PID: 1 at lib/refcount.c:25
> refcount_warn_saturate+0x120/0x144
> [ 1.276343] Modules linked in:
> [ 1.276558] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G N
> 6.5.0-rc1-00010-g8e081e8346d1 #84
> [ 1.276791] Hardware name: linux,dummy-virt (DT)
> [ 1.276973] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 1.277114] pc : refcount_warn_saturate+0x120/0x144
> [ 1.277219] lr : refcount_warn_saturate+0x120/0x144
> [ 1.277332] sp : ffff80008002b630
> [ 1.277410] x29: ffff80008002b630 x28: ffff80008002b978 x27: ffff0a00ffffff05
> [ 1.277585] x26: ffff80008002b9a9 x25: ffffd5ba808bec2f x24: ffff452a08002f18
> [ 1.277738] x23: ffff0000ffffff00 x22: ffff80008002b978 x21: 0000000000000000
> [ 1.277895] x20: ffff80008002b9dd x19: ffff452a08002f80 x18: 0000000000000006
> [ 1.278052] x17: 73657474696e752d x16: 747365743a313174 x15: 0765076507720766
> [ 1.278200] x14: 072d077207650774 x13: ffffd5b9814a0660 x12: 000000000000069c
> [ 1.278357] x11: 0000000000000234 x10: ffffd5b9814f8660 x9 : ffffd5b9814a0660
> [ 1.278529] x8 : 00000000ffffefff x7 : ffffd5b9814f8660 x6 : 80000000fffff000
> [ 1.278680] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
> [ 1.278829] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff452a02cb8000
> [ 1.279041] Call trace:
> [ 1.279171] refcount_warn_saturate+0x120/0x144
> [ 1.279322] kobject_get+0xb8/0xbc
> [ 1.279416] of_node_get+0x20/0x34
> [ 1.279497] of_fwnode_get+0x34/0x54
> [ 1.279567] fwnode_get_nth_parent+0xf0/0x12c
> [ 1.279666] fwnode_full_name_string+0x48/0xb8
> [ 1.279764] device_node_string+0x380/0x5a4
> [ 1.279841] pointer+0x38c/0x4ac
> [ 1.279900] vsnprintf+0x14c/0x6d0
> [ 1.279970] vprintk_store+0x168/0x47c
> [ 1.280055] vprintk_emit+0x104/0x2b4
> [ 1.280122] vprintk_default+0x38/0x44
> [ 1.280189] vprintk+0xd4/0xf0
> [ 1.280262] _printk+0x5c/0x84
> [ 1.280332] of_node_release+0x1ac/0x1b4
> [ 1.280413] kobject_put+0xb0/0x220
> [ 1.280492] of_changeset_destroy+0x50/0xf4
> [ 1.280584] free_overlay_changeset+0x24/0x104
> [ 1.280680] of_overlay_remove+0x240/0x2b8
> [ 1.280766] of_unittest_apply_revert_overlay_check.constprop.0+0xa8/0x310
> [ 1.280904] of_unittest+0xbf4/0x2e64
> [ 1.280986] do_one_initcall+0x7c/0x1c0
> [ 1.281072] kernel_init_freeable+0x1c4/0x294
> [ 1.281161] kernel_init+0x24/0x1dc
> [ 1.281242] ret_from_fork+0x10/0x20
>
> Then later on:
>
> [ 1.459652] ### dt-test ### EXPECT \ : ------------[ cut here ]------------
> [ 1.459877] ### dt-test ### EXPECT \ : WARNING: <<all>>
> [ 1.460227] ### dt-test ### EXPECT \ : refcount_t: underflow; use-after-free.
> [ 1.460508] ### dt-test ### EXPECT \ : ---[ end trace <<int>> ]---
> [ 1.460860] ### dt-test ### pass of_unittest_lifecycle():3187
> [ 1.461455] ### dt-test ### EXPECT / : ---[ end trace <<int>> ]---
> [ 1.461463] ### dt-test ### EXPECT / : refcount_t: underflow; use-after-free.
> [ 1.461789] ### dt-test ### EXPECT / : WARNING: <<all>>
> [ 1.462137] ### dt-test ### EXPECT / : ------------[ cut here ]------------
>
> So it seems we were getting the warning, but at the wrong point.

Thanks (and confirmed), I had missed that.

Note that it did not cause any test failures, though:

    ### dt-test ### end of unittest - 302 passed, 0 failed

With this patch:

    ### dt-test ### end of unittest - 303 passed, 1 failed

Anyway, it's up to you to decide to apply or not...

Gr{oetje,eeting}s,

                        Geert
Rob Herring (Arm) Aug. 23, 2023, 1:26 p.m. UTC | #6
On Wed, Aug 23, 2023 at 09:41:40AM +0200, Geert Uytterhoeven wrote:
> Hi Rob,
> 
> On Tue, Aug 22, 2023 at 9:22 PM Rob Herring <robh@kernel.org> wrote:
> > On Tue, Aug 22, 2023 at 12:59 PM Rob Herring <robh@kernel.org> wrote:
> > > On Tue, Aug 22, 2023 at 12:52 PM Geert Uytterhoeven
> > > <geert@linux-m68k.org> wrote:
> > > >
> > > > Hi Rob,
> > > >
> > > > On Tue, Aug 22, 2023 at 5:32 PM Rob Herring <robh@kernel.org> wrote:
> > > > > On Tue, Aug 22, 2023 at 12:22:34PM +0200, Geert Uytterhoeven wrote:
> > > > > > Run the test for the overlay apply/revert sequence three times, to
> > > > > > test if there are unbalanced of_node_put() calls causing reference
> > > > > > counts to become negative.
> > > > > >
> > > > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > > > > > ---
> > > > > > This is a reproducer for the issue fixed by commit 7882541ca06d51a6
> > > > > > ("of/platform: increase refcount of fwnode") in dt/linus.
> > > > >
> > > > > Is this necessary? There were WARN backtraces without that fix.
> > > >
> > > > Did you see them?
> > >
> > > Yes, but that was also with your series applied. When I tested the
> > > fix, I had dropped that, so maybe your series triggered it too.
> >
> > With the fix reverted on my dt/linus branch, I get this:
> >
> > [ 1.269977] ### dt-test ### pass of_unittest_overlay_10():2507
> > [ 1.270123] ### dt-test ### pass of_unittest_overlay_10():2513
> > [ 1.270290] ### dt-test ### pass of_unittest_overlay_10():2519
> > [ 1.275673] ------------[ cut here ]------------
> > [ 1.275790] refcount_t: addition on 0; use-after-free.
> > [ 1.276118] WARNING: CPU: 1 PID: 1 at lib/refcount.c:25
> > refcount_warn_saturate+0x120/0x144
> > [ 1.276343] Modules linked in:
> > [ 1.276558] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G N
> > 6.5.0-rc1-00010-g8e081e8346d1 #84
> > [ 1.276791] Hardware name: linux,dummy-virt (DT)
> > [ 1.276973] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 1.277114] pc : refcount_warn_saturate+0x120/0x144
> > [ 1.277219] lr : refcount_warn_saturate+0x120/0x144
> > [ 1.277332] sp : ffff80008002b630
> > [ 1.277410] x29: ffff80008002b630 x28: ffff80008002b978 x27: ffff0a00ffffff05
> > [ 1.277585] x26: ffff80008002b9a9 x25: ffffd5ba808bec2f x24: ffff452a08002f18
> > [ 1.277738] x23: ffff0000ffffff00 x22: ffff80008002b978 x21: 0000000000000000
> > [ 1.277895] x20: ffff80008002b9dd x19: ffff452a08002f80 x18: 0000000000000006
> > [ 1.278052] x17: 73657474696e752d x16: 747365743a313174 x15: 0765076507720766
> > [ 1.278200] x14: 072d077207650774 x13: ffffd5b9814a0660 x12: 000000000000069c
> > [ 1.278357] x11: 0000000000000234 x10: ffffd5b9814f8660 x9 : ffffd5b9814a0660
> > [ 1.278529] x8 : 00000000ffffefff x7 : ffffd5b9814f8660 x6 : 80000000fffff000
> > [ 1.278680] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
> > [ 1.278829] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff452a02cb8000
> > [ 1.279041] Call trace:
> > [ 1.279171] refcount_warn_saturate+0x120/0x144
> > [ 1.279322] kobject_get+0xb8/0xbc
> > [ 1.279416] of_node_get+0x20/0x34
> > [ 1.279497] of_fwnode_get+0x34/0x54
> > [ 1.279567] fwnode_get_nth_parent+0xf0/0x12c
> > [ 1.279666] fwnode_full_name_string+0x48/0xb8
> > [ 1.279764] device_node_string+0x380/0x5a4
> > [ 1.279841] pointer+0x38c/0x4ac
> > [ 1.279900] vsnprintf+0x14c/0x6d0
> > [ 1.279970] vprintk_store+0x168/0x47c
> > [ 1.280055] vprintk_emit+0x104/0x2b4
> > [ 1.280122] vprintk_default+0x38/0x44
> > [ 1.280189] vprintk+0xd4/0xf0
> > [ 1.280262] _printk+0x5c/0x84
> > [ 1.280332] of_node_release+0x1ac/0x1b4
> > [ 1.280413] kobject_put+0xb0/0x220
> > [ 1.280492] of_changeset_destroy+0x50/0xf4
> > [ 1.280584] free_overlay_changeset+0x24/0x104
> > [ 1.280680] of_overlay_remove+0x240/0x2b8
> > [ 1.280766] of_unittest_apply_revert_overlay_check.constprop.0+0xa8/0x310
> > [ 1.280904] of_unittest+0xbf4/0x2e64
> > [ 1.280986] do_one_initcall+0x7c/0x1c0
> > [ 1.281072] kernel_init_freeable+0x1c4/0x294
> > [ 1.281161] kernel_init+0x24/0x1dc
> > [ 1.281242] ret_from_fork+0x10/0x20
> >
> > Then later on:
> >
> > [ 1.459652] ### dt-test ### EXPECT \ : ------------[ cut here ]------------
> > [ 1.459877] ### dt-test ### EXPECT \ : WARNING: <<all>>
> > [ 1.460227] ### dt-test ### EXPECT \ : refcount_t: underflow; use-after-free.
> > [ 1.460508] ### dt-test ### EXPECT \ : ---[ end trace <<int>> ]---
> > [ 1.460860] ### dt-test ### pass of_unittest_lifecycle():3187
> > [ 1.461455] ### dt-test ### EXPECT / : ---[ end trace <<int>> ]---
> > [ 1.461463] ### dt-test ### EXPECT / : refcount_t: underflow; use-after-free.
> > [ 1.461789] ### dt-test ### EXPECT / : WARNING: <<all>>
> > [ 1.462137] ### dt-test ### EXPECT / : ------------[ cut here ]------------
> >
> > So it seems we were getting the warning, but at the wrong point.
> 
> Thanks (and confirmed), I had missed that.
> 
> Note that it did not cause any test failures, though:
> 
>     ### dt-test ### end of unittest - 302 passed, 0 failed
> 
> With this patch:
> 
>     ### dt-test ### end of unittest - 303 passed, 1 failed
> 
> Anyway, it's up to you to decide to apply or not...

I guess that is better than searching thru the log. Applied.

Rob
diff mbox series

Patch

diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
index 9af5337c76f62162..67e32977341a6f0c 100644
--- a/drivers/of/unittest.c
+++ b/drivers/of/unittest.c
@@ -3035,6 +3035,7 @@  static void __init of_unittest_overlay_notify(void)
 static void __init of_unittest_overlay(void)
 {
 	struct device_node *bus_np = NULL;
+	unsigned int i;
 
 	if (platform_driver_register(&unittest_driver)) {
 		unittest(0, "could not register unittest driver\n");
@@ -3072,7 +3073,8 @@  static void __init of_unittest_overlay(void)
 	of_unittest_overlay_2();
 	of_unittest_overlay_3();
 	of_unittest_overlay_4();
-	of_unittest_overlay_5();
+	for (i = 0; i < 3; i++)
+		of_unittest_overlay_5();
 	of_unittest_overlay_6();
 	of_unittest_overlay_8();