[5.10/5.15/6.1] nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net

Message ID 20241229144557.1203112-1-kovalev@altlinux.org (mailing list archive)
State Handled Elsewhere, archived

Commit Message

Vasiliy Kovalev Dec. 29, 2024, 2:45 p.m. UTC
From: Yang Erkun <yangerkun@huaweicloud.com>

[ Upstream commit d5ff2fb2e7167e9483846e34148e60c0c016a1f6 ]

Normally, when we execute `echo 0 > /proc/fs/nfsd/threads`,
`nfs4_state_destroy_net`, called from `nfs4_state_shutdown_net`, releases
all resources related to each hashed `nfs4_client`. If the
`nfsd_client_shrinker` is running concurrently, `expire_client` first
unhashes a client and only later destroys it, so shutdown no longer sees
that client and does not release its resources. This can lead to the
following warnings, and use-after-free errors may occur as well.

nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads

expire_client                nfsd_shutdown_net
  unhash_client                ...
                               nfs4_state_shutdown_net
                                 /* won't wait shrinker exit */
  /*                             cancel_work(&nn->nfsd_shrinker_work)
   * nfsd_file for this          /* won't destroy unhashed client1 */
   * client1 still alive         nfs4_state_destroy_net
   */

                               nfsd_file_cache_shutdown
                                 /* trigger warning */
                                 kmem_cache_destroy(nfsd_file_slab)
                                 kmem_cache_destroy(nfsd_file_mark_slab)
  /* release nfsd_file and mark */
  __destroy_client

====================================================================
BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
__kmem_cache_shutdown()
--------------------------------------------------------------------
CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1

 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1a5/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

====================================================================
BUG nfsd_file_mark (Tainted: G    B   W         ): Objects remaining
nfsd_file_mark on __kmem_cache_shutdown()
--------------------------------------------------------------------

 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1a5/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

To resolve this issue, cancel `nfsd_shrinker_work` synchronously in
`nfs4_state_shutdown_net` by using `cancel_work_sync()`, which waits for
a running shrinker to finish before shutdown continues.

Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")
Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
(cherry picked from commit f965dc0f099a54fca100acf6909abe52d0c85328)
Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org>
---
Backport to fix CVE-2024-50121
Link: https://www.cve.org/CVERecord/?id=CVE-2024-50121
---
 fs/nfsd/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
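
The race in the commit message can be reproduced in miniature in userspace.
The sketch below is not kernel code: `struct sim`, `shrinker_worker` and
`objects_left_at_cache_destroy` are hypothetical names, a pthread stands in
for the shrinker work item, and `pthread_join` stands in for the waiting that
`cancel_work_sync()` does. A condition-variable "gate" pins the worker between
unhashing and destroying the client so the outcome is deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for the race; none of these names are kernel APIs. */
struct sim {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool worker_started;	/* worker has "unhashed" the client */
	bool gate_open;		/* lets the paused worker continue */
	int live_objects;	/* models objects still allocated from the slab */
};

static void *shrinker_worker(void *arg)
{
	struct sim *s = arg;

	pthread_mutex_lock(&s->lock);
	s->worker_started = true;		/* models unhash_client */
	pthread_cond_broadcast(&s->cond);
	while (!s->gate_open)			/* paused mid-flight for reproducibility */
		pthread_cond_wait(&s->cond, &s->lock);
	s->live_objects--;			/* models __destroy_client freeing the object */
	pthread_mutex_unlock(&s->lock);
	return NULL;
}

/* Returns how many objects are still live when the "cache" would be destroyed. */
int objects_left_at_cache_destroy(bool wait_for_worker)
{
	struct sim s = { .live_objects = 1 };
	pthread_t tid;
	int remaining;

	pthread_mutex_init(&s.lock, NULL);
	pthread_cond_init(&s.cond, NULL);
	if (pthread_create(&tid, NULL, shrinker_worker, &s) != 0) {
		pthread_cond_destroy(&s.cond);
		pthread_mutex_destroy(&s.lock);
		return -1;
	}

	pthread_mutex_lock(&s.lock);
	while (!s.worker_started)
		pthread_cond_wait(&s.cond, &s.lock);

	if (wait_for_worker) {
		/* Analogue of cancel_work_sync(): block until the worker is done. */
		s.gate_open = true;
		pthread_cond_broadcast(&s.cond);
		pthread_mutex_unlock(&s.lock);
		pthread_join(tid, NULL);
		remaining = s.live_objects;
	} else {
		/* Analogue of cancel_work(): inspect the cache without waiting. */
		remaining = s.live_objects;
		s.gate_open = true;
		pthread_cond_broadcast(&s.cond);
		pthread_mutex_unlock(&s.lock);
		pthread_join(tid, NULL);	/* cleanup only, after the check */
	}

	assert(s.live_objects == 0);	/* the worker always frees eventually */
	pthread_cond_destroy(&s.cond);
	pthread_mutex_destroy(&s.lock);
	return remaining;
}
```

Under these assumptions, `objects_left_at_cache_destroy(false)` returns 1,
which corresponds to the "Objects remaining" warning, and
`objects_left_at_cache_destroy(true)` returns 0.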

Comments

Chuck Lever Dec. 29, 2024, 3:45 p.m. UTC | #1
On 12/29/24 9:45 AM, Vasiliy Kovalev wrote:
> From: Yang Erkun <yangerkun@huaweicloud.com>
> 
> [ Upstream commit d5ff2fb2e7167e9483846e34148e60c0c016a1f6 ]
> 
> In the normal case, when we execute `echo 0 > /proc/fs/nfsd/threads`, the
> function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will
> release all resources related to the hashed `nfs4_client`. If the
> `nfsd_client_shrinker` is running concurrently, the `expire_client`
> function will first unhash this client and then destroy it. This can
> lead to the following warning. Additionally, numerous use-after-free
> errors may occur as well.
> 
> nfsd_client_shrinker         echo 0 > /proc/fs/nfsd/threads
> 
> expire_client                nfsd_shutdown_net
>    unhash_client                ...
>                                 nfs4_state_shutdown_net
>                                   /* won't wait shrinker exit */
>    /*                             cancel_work(&nn->nfsd_shrinker_work)
>     * nfsd_file for this          /* won't destroy unhashed client1 */
>     * client1 still alive         nfs4_state_destroy_net
>     */
> 
>                                 nfsd_file_cache_shutdown
>                                   /* trigger warning */
>                                   kmem_cache_destroy(nfsd_file_slab)
>                                   kmem_cache_destroy(nfsd_file_mark_slab)
>    /* release nfsd_file and mark */
>    __destroy_client
> 
> ====================================================================
> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> __kmem_cache_shutdown()
> --------------------------------------------------------------------
> CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1
> 
>   dump_stack_lvl+0x53/0x70
>   slab_err+0xb0/0xf0
>   __kmem_cache_shutdown+0x15c/0x310
>   kmem_cache_destroy+0x66/0x160
>   nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
>   nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>   nfsd_svc+0x125/0x1e0 [nfsd]
>   write_threads+0x16a/0x2a0 [nfsd]
>   nfsctl_transaction_write+0x74/0xa0 [nfsd]
>   vfs_write+0x1a5/0x6d0
>   ksys_write+0xc1/0x160
>   do_syscall_64+0x5f/0x170
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> ====================================================================
> BUG nfsd_file_mark (Tainted: G    B   W         ): Objects remaining
> nfsd_file_mark on __kmem_cache_shutdown()
> --------------------------------------------------------------------
> 
>   dump_stack_lvl+0x53/0x70
>   slab_err+0xb0/0xf0
>   __kmem_cache_shutdown+0x15c/0x310
>   kmem_cache_destroy+0x66/0x160
>   nfsd_file_cache_shutdown+0xc8/0x210 [nfsd]
>   nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>   nfsd_svc+0x125/0x1e0 [nfsd]
>   write_threads+0x16a/0x2a0 [nfsd]
>   nfsctl_transaction_write+0x74/0xa0 [nfsd]
>   vfs_write+0x1a5/0x6d0
>   ksys_write+0xc1/0x160
>   do_syscall_64+0x5f/0x170
>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> To resolve this issue, cancel `nfsd_shrinker_work` using synchronous
> mode in nfs4_state_shutdown_net.
> 
> Fixes: 7c24fa225081 ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker")
> Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> (cherry picked from commit f965dc0f099a54fca100acf6909abe52d0c85328)
> Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org>
> ---
> Backport to fix CVE-2024-50121
> Link: https://www.cve.org/CVERecord/?id=CVE-2024-50121
> ---
>   fs/nfsd/nfs4state.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 8bceae771c1c75..f6fa719ee32668 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -8208,7 +8208,7 @@ nfs4_state_shutdown_net(struct net *net)
>   	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>   
>   	unregister_shrinker(&nn->nfsd_client_shrinker);
> -	cancel_work(&nn->nfsd_shrinker_work);
> +	cancel_work_sync(&nn->nfsd_shrinker_work);
>   	cancel_delayed_work_sync(&nn->laundromat_work);
>   	locks_end_grace(&nn->nfsd4_manager);
>   

Backport Acked-by: Chuck Lever <chuck.lever@oracle.com>

Not sure why automation didn't pick this one up.

Patch

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8bceae771c1c75..f6fa719ee32668 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -8208,7 +8208,7 @@  nfs4_state_shutdown_net(struct net *net)
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
 	unregister_shrinker(&nn->nfsd_client_shrinker);
-	cancel_work(&nn->nfsd_shrinker_work);
+	cancel_work_sync(&nn->nfsd_shrinker_work);
 	cancel_delayed_work_sync(&nn->laundromat_work);
 	locks_end_grace(&nn->nfsd4_manager);