xfs: add __GFP_NOLOCKDEP when allocating memory in xfs_attr_shortform_list()

Message ID SEZPR01MB45270BCD2BC28813FCB39AEDA8D72@SEZPR01MB4527.apcprd01.prod.exchangelabs.com (mailing list archive)
State New
Series xfs: add __GFP_NOLOCKDEP when allocating memory in xfs_attr_shortform_list()

Commit Message

Jiwei Sun June 27, 2024, 1:12 p.m. UTC
From: Jiwei Sun <sunjw10@lenovo.com>

With CONFIG_LOCKDEP=y set, the following warning appears:

======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc4-dirty #81 Not tainted
------------------------------------------------------
kswapd1/1465 is trying to acquire lock:
ff11000928da0160 (&xfs_nondir_ilock_class){++++}-{4:4}, at: xfs_icwalk_ag+0x7cd/0x14c0 [xfs]

but task is already holding lock:
ffffffff9fd44100 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x856/0x11a0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
       lock_acquire+0x194/0x490
       fs_reclaim_acquire+0x103/0x160
       __kmalloc_noprof+0x9b/0x430
       xfs_attr_shortform_list+0x56a/0x15d0 [xfs]
       xfs_attr_list+0x1cb/0x250 [xfs]
       xfs_vn_listxattr+0xee/0x170 [xfs]
       listxattr+0x5b/0xf0
       __x64_sys_flistxattr+0x126/0x1b0
       do_syscall_64+0x8a/0x170
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

-> #0 (&xfs_nondir_ilock_class){++++}-{4:4}:
       validate_chain+0x171c/0x3270
       __lock_acquire+0xecd/0x1ed0
       lock_acquire+0x194/0x490
       down_write_nested+0xa2/0x510
       xfs_icwalk_ag+0x7cd/0x14c0 [xfs]
       xfs_icwalk+0x4f/0xd0 [xfs]
       xfs_reclaim_inodes_nr+0x144/0x1f0 [xfs]
       super_cache_scan+0x305/0x430
       do_shrink_slab+0x2f3/0xce0
       shrink_slab+0x507/0xcb0
       shrink_one+0x400/0x6d0
       shrink_many+0x2d5/0xc10
       shrink_node+0x1a0b/0x2110
       balance_pgdat+0x7a2/0x11a0
       kswapd+0x518/0x9c0
       kthread+0x2e9/0x3d0
       ret_from_fork+0x2d/0x60
       ret_from_fork_asm+0x1a/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&xfs_nondir_ilock_class);
                               lock(fs_reclaim);
  lock(&xfs_nondir_ilock_class);

 *** DEADLOCK ***

2 locks held by kswapd1/1465:
 #0: ffffffff9fd44100 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x856/0x11a0
 #1: ff110001d1dec0e8 (&type->s_umount_key#62){++++}-{4:4}, at: super_trylock_shared+0x18/0xa0

stack backtrace:
CPU: 182 PID: 1465 Comm: kswapd1 Kdump: loaded Not tainted 6.10.0-rc4-dirty #81 d8f3b21024b789e635a9c2daea46fdb7762f1b50
Hardware name: Lenovo ThinkServer SR660 V3/SR660 V3, BIOS T8E166X-2.54 05/30/2024
Call Trace:
 <TASK>
 dump_stack_lvl+0x7c/0xc0
 check_noncircular+0x31f/0x3f0
 validate_chain+0x171c/0x3270
 __lock_acquire+0xecd/0x1ed0
 lock_acquire+0x194/0x490
 down_write_nested+0xa2/0x510
 xfs_icwalk_ag+0x7cd/0x14c0 [xfs 681f3433bed0d714083e513d149a819b095e6e51]
 xfs_icwalk+0x4f/0xd0 [xfs 681f3433bed0d714083e513d149a819b095e6e51]
 xfs_reclaim_inodes_nr+0x144/0x1f0 [xfs 681f3433bed0d714083e513d149a819b095e6e51]
 super_cache_scan+0x305/0x430
 do_shrink_slab+0x2f3/0xce0
 shrink_slab+0x507/0xcb0
 shrink_one+0x400/0x6d0
 shrink_many+0x2d5/0xc10
 shrink_node+0x1a0b/0x2110
 balance_pgdat+0x7a2/0x11a0
 kswapd+0x518/0x9c0
 kthread+0x2e9/0x3d0
 ret_from_fork+0x2d/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

This is a false positive. If an inode is getting reclaimed, it cannot be
the target of a flistxattr operation. Commit 6dcde60efd94 ("xfs: more
lockdep whackamole with kmem_alloc*") has a similar root cause.

Fix the issue by adding __GFP_NOLOCKDEP in order to shut up lockdep.

Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
Suggested-by: Adrian Huang <ahuang12@lenovo.com>
---
 fs/xfs/xfs_attr_list.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Eric Sandeen June 28, 2024, 4:25 p.m. UTC | #1
On 6/27/24 8:12 AM, Jiwei Sun wrote:
> From: Jiwei Sun <sunjw10@lenovo.com>
> 
> If the following configuration is set
> CONFIG_LOCKDEP=y
> 
> The following warning log appears,

Was just about to send this. :)

I had talked to dchinner about this and he also suggested that this was 
missed in the series that removed GFP_NOFS, i.e.

[PATCH 00/12] xfs: remove remaining kmem interfaces and GFP_NOFS usage
at https://lore.kernel.org/linux-mm/20240622094411.GA830005@ceph-admin/T/

So, I think this could also use one or both of:

Fixes: 204fae32d5f7 ("xfs: clean up remaining GFP_NOFS users")
Fixes: 94a69db2367e ("xfs: use __GFP_NOLOCKDEP instead of GFP_NOFS")

...

> This is a false positive. If a node is getting reclaimed, it cannot be
> the target of a flistxattr operation. Commit 6dcde60efd94 ("xfs: more
> lockdep whackamole with kmem_alloc*") has the similar root cause.
> 
> Fix the issue by adding __GFP_NOLOCKDEP in order to shut up lockdep.
> 
> Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
> Suggested-by: Adrian Huang <ahuang12@lenovo.com>
> ---
>  fs/xfs/xfs_attr_list.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> index 5c947e5ce8b8..506ade0befa4 100644
> --- a/fs/xfs/xfs_attr_list.c
> +++ b/fs/xfs/xfs_attr_list.c
> @@ -114,7 +114,8 @@ xfs_attr_shortform_list(
>  	 * It didn't all fit, so we have to sort everything on hashval.
>  	 */
>  	sbsize = sf->count * sizeof(*sbuf);
> -	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL);
> +	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL |
> +			     __GFP_NOLOCKDEP);

Minor nitpick, style-wise we seem to do:

        sbp = sbuf = kmalloc(sbsize,
                        GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);

in most other places, and not split the flags onto 2 lines, since you need
to add a line anyway.

Otherwise,

Acked-by: Eric Sandeen <sandeen@redhat.com>
  
>  	/*
>  	 * Scan the attribute list for the rest of the entries, storing
Darrick J. Wong June 28, 2024, 5:01 p.m. UTC | #2
On Fri, Jun 28, 2024 at 11:25:10AM -0500, Eric Sandeen wrote:
> On 6/27/24 8:12 AM, Jiwei Sun wrote:
> > From: Jiwei Sun <sunjw10@lenovo.com>
> > 
> > If the following configuration is set
> > CONFIG_LOCKDEP=y
> > 
> > The following warning log appears,
> 
> Was just about to send this. :)
> 
> I had talked to dchinner about this and he also suggested that this was 
> missed in the series that removed GFP_NOFS, i.e.
> 
> [PATCH 00/12] xfs: remove remaining kmem interfaces and GFP_NOFS usage
> at https://lore.kernel.org/linux-mm/20240622094411.GA830005@ceph-admin/T/
> 
> So, I think this could also use one or both of:
> 
> Fixes: 204fae32d5f7 ("xfs: clean up remaining GFP_NOFS users")
> Fixes: 94a69db2367e ("xfs: use __GFP_NOLOCKDEP instead of GFP_NOFS")
> 
> ...
> 
> > This is a false positive. If a node is getting reclaimed, it cannot be
> > the target of a flistxattr operation. Commit 6dcde60efd94 ("xfs: more
> > lockdep whackamole with kmem_alloc*") has the similar root cause.
> > 
> > Fix the issue by adding __GFP_NOLOCKDEP in order to shut up lockdep.
> > 
> > Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
> > Suggested-by: Adrian Huang <ahuang12@lenovo.com>
> > ---
> >  fs/xfs/xfs_attr_list.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
> > index 5c947e5ce8b8..506ade0befa4 100644
> > --- a/fs/xfs/xfs_attr_list.c
> > +++ b/fs/xfs/xfs_attr_list.c
> > @@ -114,7 +114,8 @@ xfs_attr_shortform_list(
> >  	 * It didn't all fit, so we have to sort everything on hashval.
> >  	 */
> >  	sbsize = sf->count * sizeof(*sbuf);
> > -	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL);
> > +	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL |
> > +			     __GFP_NOLOCKDEP);
> 
> Minor nitpick, style-wise we seem to do:
> 
>         sbp = sbuf = kmalloc(sbsize,
>                         GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);
> 
> in most other places, and not split the flags onto 2 lines, since you need
> to add a line anyway.
> 
> Otherwise,
> 
> Acked-by: Eric Sandeen <sandeen@redhat.com>

Hey, could you all please read the list before sending duplicate
patches?

https://lore.kernel.org/linux-xfs/20240622082631.2661148-1-leo.lilong@huawei.com/

--D

> >  	/*
> >  	 * Scan the attribute list for the rest of the entries, storing
> 
>
Jiwei Sun June 28, 2024, 11:31 p.m. UTC | #3
On 6/29/24 01:01, Darrick J. Wong wrote:
> On Fri, Jun 28, 2024 at 11:25:10AM -0500, Eric Sandeen wrote:
>> On 6/27/24 8:12 AM, Jiwei Sun wrote:
>>> From: Jiwei Sun <sunjw10@lenovo.com>
>>>
>>> If the following configuration is set
>>> CONFIG_LOCKDEP=y
>>>
>>> The following warning log appears,
>>
>> Was just about to send this. :)
>>
>> I had talked to dchinner about this and he also suggested that this was 
>> missed in the series that removed GFP_NOFS, i.e.
>>
>> [PATCH 00/12] xfs: remove remaining kmem interfaces and GFP_NOFS usage
>> at https://lore.kernel.org/linux-mm/20240622094411.GA830005@ceph-admin/T/
>>
>> So, I think this could also use one or both of:
>>
>> Fixes: 204fae32d5f7 ("xfs: clean up remaining GFP_NOFS users")
>> Fixes: 94a69db2367e ("xfs: use __GFP_NOLOCKDEP instead of GFP_NOFS")
>>
>> ...
>>
>>> This is a false positive. If a node is getting reclaimed, it cannot be
>>> the target of a flistxattr operation. Commit 6dcde60efd94 ("xfs: more
>>> lockdep whackamole with kmem_alloc*") has the similar root cause.
>>>
>>> Fix the issue by adding __GFP_NOLOCKDEP in order to shut up lockdep.
>>>
>>> Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
>>> Suggested-by: Adrian Huang <ahuang12@lenovo.com>
>>> ---
>>>  fs/xfs/xfs_attr_list.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
>>> index 5c947e5ce8b8..506ade0befa4 100644
>>> --- a/fs/xfs/xfs_attr_list.c
>>> +++ b/fs/xfs/xfs_attr_list.c
>>> @@ -114,7 +114,8 @@ xfs_attr_shortform_list(
>>>  	 * It didn't all fit, so we have to sort everything on hashval.
>>>  	 */
>>>  	sbsize = sf->count * sizeof(*sbuf);
>>> -	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL);
>>> +	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL |
>>> +			     __GFP_NOLOCKDEP);
>>
>> Minor nitpick, style-wise we seem to do:
>>
>>         sbp = sbuf = kmalloc(sbsize,
>>                         GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);
>>
>> in most other places, and not split the flags onto 2 lines, since you need
>> to add a line anyway.
>>
>> Otherwise,
>>
>> Acked-by: Eric Sandeen <sandeen@redhat.com>
> 
> Hey, could you all please read the list before sending duplicate
> patches?

I'm very sorry for wasting everyone's time by missing that patch.
Thank you for pointing this out, @Darrick.
And thank you also for your review and suggestions, @Eric.

Thanks,
Regards,
Jiwei

> 
> https://lore.kernel.org/linux-xfs/20240622082631.2661148-1-leo.lilong@huawei.com/
> 
> --D
> 
>>>  	/*
>>>  	 * Scan the attribute list for the rest of the entries, storing
>>
>>
Patch

diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index 5c947e5ce8b8..506ade0befa4 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -114,7 +114,8 @@  xfs_attr_shortform_list(
 	 * It didn't all fit, so we have to sort everything on hashval.
 	 */
 	sbsize = sf->count * sizeof(*sbuf);
-	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL);
+	sbp = sbuf = kmalloc(sbsize, GFP_KERNEL | __GFP_NOFAIL |
+			     __GFP_NOLOCKDEP);
 
 	/*
 	 * Scan the attribute list for the rest of the entries, storing