diff mbox series

nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net

Message ID 20241021082540.2885014-1-yangerkun@huaweicloud.com (mailing list archive)
State New
Headers show
Series nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net | expand

Commit Message

yangerkun Oct. 21, 2024, 8:25 a.m. UTC
In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the
function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
release all resources related to the hashed `nfs4_client`. If the
`nfsd_client_shrinker` is running concurrently, the `expire_client`
function will first unhash this client and then destroy it. This can
lead to the following warning. Additionally, numerous use-after-free
errors may occur as well.

nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads

expire_client                nfsd_shutdown_net
  unhash_client                ...
                               nfs4_state_shutdown_net
                                 /* won't wait shrinker exit */
  /*                             cancel_work(&nn->nfsd_shrinker_work)
   * nfsd_file for this          /* won't destroy unhashed client1 */
   * client1 still alive         nfs4_state_destroy_net
   */

                               nfsd_file_cache_shutdown
                                 /* trigger warning */
                                 kmem_cache_destroy(nfsd_file_slab)
                                 kmem_cache_destroy(nfsd_file_mark_slab)
  /* release nfsd_file and mark */
  __destroy_client

====================================================================
BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
__kmem_cache_shutdown()
--------------------------------------------------------------------
CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1

 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1a5/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

====================================================================
BUG nfsd_file_mark (Tainted: G    B   W         ): Objects remaining
nfsd_file_mark on __kmem_cache_shutdown()
--------------------------------------------------------------------

 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1a5/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

To resolve this issue, cancel `nfsd_shrinker_work` using synchronous
mode in nfs4_state_shutdown_net.

Fixes: 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory condition")
Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com>
---
 fs/nfsd/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Chuck Lever Oct. 21, 2024, 2 p.m. UTC | #1
On Mon, Oct 21, 2024 at 04:25:40PM +0800, Yang Erkun wrote:
> In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the
> function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
> release all resources related to the hashed `nfs4_client`. If the
> `nfsd_client_shrinker` is running concurrently, the `expire_client`
> function will first unhash this client and then destroy it. This can
> lead to the following warning. Additionally, numerous use-after-free
> errors may occur as well.
> 
> nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads
> 
> expire_client                nfsd_shutdown_net
>   unhash_client                ...
>                                nfs4_state_shutdown_net
>                                  /* won't wait shrinker exit */
>   /*                             cancel_work(&nn->nfsd_shrinker_work)
>    * nfsd_file for this          /* won't destroy unhashed client1 */
>    * client1 still alive         nfs4_state_destroy_net
>    */
> 
>                                nfsd_file_cache_shutdown
>                                  /* trigger warning */
>                                  kmem_cache_destroy(nfsd_file_slab)
>                                  kmem_cache_destroy(nfsd_file_mark_slab)
>   /* release nfsd_file and mark */
>   __destroy_client
> 
> ====================================================================
> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> __kmem_cache_shutdown()
> --------------------------------------------------------------------
> CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1
> 
>  dump_stack_lvl+0x53/0x70
>  slab_err+0xb0/0xf0
>  __kmem_cache_shutdown+0x15c/0x310
>  kmem_cache_destroy+0x66/0x160
>  nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
>  nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>  nfsd_svc+0x125/0x1e0 [nfsd]
>  write_threads+0x16a/0x2a0 [nfsd]
>  nfsctl_transaction_write+0x74/0xa0 [nfsd]
>  vfs_write+0x1a5/0x6d0
>  ksys_write+0xc1/0x160
>  do_syscall_64+0x5f/0x170
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> ====================================================================
> BUG nfsd_file_mark (Tainted: G    B   W         ): Objects remaining
> nfsd_file_mark on __kmem_cache_shutdown()
> --------------------------------------------------------------------
> 
>  dump_stack_lvl+0x53/0x70
>  slab_err+0xb0/0xf0
>  __kmem_cache_shutdown+0x15c/0x310
>  kmem_cache_destroy+0x66/0x160
>  nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
>  nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>  nfsd_svc+0x125/0x1e0 [nfsd]
>  write_threads+0x16a/0x2a0 [nfsd]
>  nfsctl_transaction_write+0x74/0xa0 [nfsd]
>  vfs_write+0x1a5/0x6d0
>  ksys_write+0xc1/0x160
>  do_syscall_64+0x5f/0x170
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> To resolve this issue, cancel `nfsd_shrinker_work` using synchronous
> mode in nfs4_state_shutdown_net.
> 
> Fixes: 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory condition")

I think like:

Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")

a little better.

I'm very curious how you stumbled across this one!


> Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com>
> ---
>  fs/nfsd/nfs4state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 56b261608af4..158d67186777 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8684,7 +8684,7 @@ nfs4_state_shutdown_net(struct net *net)
>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>  
>  	shrinker_free(nn->nfsd_client_shrinker);
> -	cancel_work(&nn->nfsd_shrinker_work);
> +	cancel_work_sync(&nn->nfsd_shrinker_work);
>  	cancel_delayed_work_sync(&nn->laundromat_work);
>  	locks_end_grace(&nn->nfsd4_manager);
>  
> -- 
> 2.39.2
>
yangerkun Oct. 21, 2024, 2:18 p.m. UTC | #2
在 2024/10/21 22:00, Chuck Lever 写道:
> On Mon, Oct 21, 2024 at 04:25:40PM +0800, Yang Erkun wrote:
>> In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the
>> function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
>> release all resources related to the hashed `nfs4_client`. If the
>> `nfsd_client_shrinker` is running concurrently, the `expire_client`
>> function will first unhash this client and then destroy it. This can
>> lead to the following warning. Additionally, numerous use-after-free
>> errors may occur as well.
>>
>> nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads
>>
>> expire_client                nfsd_shutdown_net
>>    unhash_client                ...
>>                                 nfs4_state_shutdown_net
>>                                   /* won't wait shrinker exit */
>>    /*                             cancel_work(&nn->nfsd_shrinker_work)
>>     * nfsd_file for this          /* won't destroy unhashed client1 */
>>     * client1 still alive         nfs4_state_destroy_net
>>     */
>>
>>                                 nfsd_file_cache_shutdown
>>                                   /* trigger warning */
>>                                   kmem_cache_destroy(nfsd_file_slab)
>>                                   kmem_cache_destroy(nfsd_file_mark_slab)
>>    /* release nfsd_file and mark */
>>    __destroy_client
>>
>> ====================================================================
>> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
>> __kmem_cache_shutdown()
>> --------------------------------------------------------------------
>> CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1
>>
>>   dump_stack_lvl+0x53/0x70
>>   slab_err+0xb0/0xf0
>>   __kmem_cache_shutdown+0x15c/0x310
>>   kmem_cache_destroy+0x66/0x160
>>   nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
>>   nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>>   nfsd_svc+0x125/0x1e0 [nfsd]
>>   write_threads+0x16a/0x2a0 [nfsd]
>>   nfsctl_transaction_write+0x74/0xa0 [nfsd]
>>   vfs_write+0x1a5/0x6d0
>>   ksys_write+0xc1/0x160
>>   do_syscall_64+0x5f/0x170
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> ====================================================================
>> BUG nfsd_file_mark (Tainted: G    B   W         ): Objects remaining
>> nfsd_file_mark on __kmem_cache_shutdown()
>> --------------------------------------------------------------------
>>
>>   dump_stack_lvl+0x53/0x70
>>   slab_err+0xb0/0xf0
>>   __kmem_cache_shutdown+0x15c/0x310
>>   kmem_cache_destroy+0x66/0x160
>>   nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
>>   nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>>   nfsd_svc+0x125/0x1e0 [nfsd]
>>   write_threads+0x16a/0x2a0 [nfsd]
>>   nfsctl_transaction_write+0x74/0xa0 [nfsd]
>>   vfs_write+0x1a5/0x6d0
>>   ksys_write+0xc1/0x160
>>   do_syscall_64+0x5f/0x170
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> To resolve this issue, cancel `nfsd_shrinker_work` using synchronous
>> mode in nfs4_state_shutdown_net.
>>
>> Fixes: 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory condition")
> 
> I think like:
> 
> Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")

Hi,

Thanks a lot for your review! Before this, will this problem exist too
since the cocurrently run between upper two threads can happen too?

> 
> a little better.
> 
> I'm very curious how you stumbled across this one!

Our excellent test team has recently performed a lot of fault injection
tests on nfs/nfsd, which helps us find many nfs/nfsd problems. This
problem is only one of them. There will be some bugfixes for other
problems later.

> 
> 
>> Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com>
>> ---
>>   fs/nfsd/nfs4state.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>> index 56b261608af4..158d67186777 100644
>> --- a/fs/nfsd/nfs4state.c
>> +++ b/fs/nfsd/nfs4state.c
>> @@ -8684,7 +8684,7 @@ nfs4_state_shutdown_net(struct net *net)
>>   	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>>   
>>   	shrinker_free(nn->nfsd_client_shrinker);
>> -	cancel_work(&nn->nfsd_shrinker_work);
>> +	cancel_work_sync(&nn->nfsd_shrinker_work);
>>   	cancel_delayed_work_sync(&nn->laundromat_work);
>>   	locks_end_grace(&nn->nfsd4_manager);
>>   
>> -- 
>> 2.39.2
>>
>
Jeff Layton Oct. 21, 2024, 2:23 p.m. UTC | #3
On Mon, 2024-10-21 at 16:25 +0800, Yang Erkun wrote:
> In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the
> function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
> release all resources related to the hashed `nfs4_client`. If the
> `nfsd_client_shrinker` is running concurrently, the `expire_client`
> function will first unhash this client and then destroy it. This can
> lead to the following warning. Additionally, numerous use-after-free
> errors may occur as well.
> 
> nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads
> 
> expire_client                nfsd_shutdown_net
>   unhash_client                ...
>                                nfs4_state_shutdown_net
>                                  /* won't wait shrinker exit */
>   /*                             cancel_work(&nn->nfsd_shrinker_work)
>    * nfsd_file for this          /* won't destroy unhashed client1 */
>    * client1 still alive         nfs4_state_destroy_net
>    */
> 
>                                nfsd_file_cache_shutdown
>                                  /* trigger warning */
>                                  kmem_cache_destroy(nfsd_file_slab)
>                                  kmem_cache_destroy(nfsd_file_mark_slab)
>   /* release nfsd_file and mark */
>   __destroy_client
> 
> ====================================================================
> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> __kmem_cache_shutdown()
> --------------------------------------------------------------------
> CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1
> 
>  dump_stack_lvl+0x53/0x70
>  slab_err+0xb0/0xf0
>  __kmem_cache_shutdown+0x15c/0x310
>  kmem_cache_destroy+0x66/0x160
>  nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
>  nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>  nfsd_svc+0x125/0x1e0 [nfsd]
>  write_threads+0x16a/0x2a0 [nfsd]
>  nfsctl_transaction_write+0x74/0xa0 [nfsd]
>  vfs_write+0x1a5/0x6d0
>  ksys_write+0xc1/0x160
>  do_syscall_64+0x5f/0x170
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> ====================================================================
> BUG nfsd_file_mark (Tainted: G    B   W         ): Objects remaining
> nfsd_file_mark on __kmem_cache_shutdown()
> --------------------------------------------------------------------
> 
>  dump_stack_lvl+0x53/0x70
>  slab_err+0xb0/0xf0
>  __kmem_cache_shutdown+0x15c/0x310
>  kmem_cache_destroy+0x66/0x160
>  nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
>  nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>  nfsd_svc+0x125/0x1e0 [nfsd]
>  write_threads+0x16a/0x2a0 [nfsd]
>  nfsctl_transaction_write+0x74/0xa0 [nfsd]
>  vfs_write+0x1a5/0x6d0
>  ksys_write+0xc1/0x160
>  do_syscall_64+0x5f/0x170
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> To resolve this issue, cancel `nfsd_shrinker_work` using synchronous
> mode in nfs4_state_shutdown_net.
> 
> Fixes: 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory condition")
> Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com>
> ---
>  fs/nfsd/nfs4state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 56b261608af4..158d67186777 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8684,7 +8684,7 @@ nfs4_state_shutdown_net(struct net *net)
>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>  
>  	shrinker_free(nn->nfsd_client_shrinker);
> -	cancel_work(&nn->nfsd_shrinker_work);
> +	cancel_work_sync(&nn->nfsd_shrinker_work);
>  	cancel_delayed_work_sync(&nn->laundromat_work);
>  	locks_end_grace(&nn->nfsd4_manager);
>  

Nice catch and analysis!

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Chuck Lever Oct. 21, 2024, 2:25 p.m. UTC | #4
> On Oct 21, 2024, at 10:18 AM, yangerkun <yangerkun@huawei.com> wrote:
> 
> 
> 
> 在 2024/10/21 22:00, Chuck Lever 写道:
>> On Mon, Oct 21, 2024 at 04:25:40PM +0800, Yang Erkun wrote:
>>> 
>>> Fixes: 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory condition")
>> I think like:
>> Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")
> 
> Hi,
> 
> Thanks a lot for your review! Before this, will this problem exist too
> since the concurrently run between upper two threads can happen too?

Yes, the race can happen before 7c24fa, but your patch
won't apply to kernels that don't have 7c24fa applied.

The automation should pull both of these into LTS correctly.
If it doesn't happen as we expect, we can fix it up by hand.

I plan to apply this to nfsd-fixes (for v6.12-rc).


>> a little better.
>> I'm very curious how you stumbled across this one!
> 
> Our excellent test team has recently performed a lot of fault injection
> tests on nfs/nfsd, which helps us find many nfs/nfsd problems. This
> problem is only one of them. There will be some bugfixes for other
> problems later.

Excellent, looking forward to seeing those.


--
Chuck Lever
yangerkun Oct. 22, 2024, 1:06 a.m. UTC | #5
在 2024/10/21 22:25, Chuck Lever III 写道:
> 
> 
>> On Oct 21, 2024, at 10:18 AM, yangerkun <yangerkun@huawei.com> wrote:
>>
>>
>>
>> 在 2024/10/21 22:00, Chuck Lever 写道:
>>> On Mon, Oct 21, 2024 at 04:25:40PM +0800, Yang Erkun wrote:
>>>>
>>>> Fixes: 7746b32f467b ("NFSD: add shrinker to reap courtesy clients on low memory condition")
>>> I think like:
>>> Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")
>>
>> Hi,
>>
>> Thanks a lot for your review! Before this, will this problem exist too
>> since the concurrently run between upper two threads can happen too?
> 
> Yes, the race can happen before 7c24fa, but your patch
> won't apply to kernels that don't have 7c24fa applied.
> 
> The automation should pull both of these into LTS correctly.
> If it doesn't happen as we expect, we can fix it up by hand.

Thanks for explaining. Agree with you.

> 
> I plan to apply this to nfsd-fixes (for v6.12-rc).

Thanks!

> 
> 
>>> a little better.
>>> I'm very curious how you stumbled across this one!
>>
>> Our excellent test team has recently performed a lot of fault injection
>> tests on nfs/nfsd, which helps us find many nfs/nfsd problems. This
>> problem is only one of them. There will be some bugfixes for other
>> problems later.
> 
> Excellent, looking forward to seeing those.
> 
> 
> --
> Chuck Lever
> 
>
diff mbox series

Patch

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 56b261608af4..158d67186777 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -8684,7 +8684,7 @@  nfs4_state_shutdown_net(struct net *net)
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	shrinker_free(nn->nfsd_client_shrinker);
-	cancel_work(&nn->nfsd_shrinker_work);
+	cancel_work_sync(&nn->nfsd_shrinker_work);
 	cancel_delayed_work_sync(&nn->laundromat_work);
 	locks_end_grace(&nn->nfsd4_manager);