From patchwork Fri Feb 18 18:31:13 2022
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 12751755
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com, linux-fsdevel@vger.kernel.org, paulmck@kernel.org,
 gscrivan@redhat.com, viro@zeniv.linux.org.uk, Rik van Riel,
 Eric Biederman, Chris Mason
Subject: [PATCH 1/2] vfs: free vfsmount through rcu work from kern_unmount
Date: Fri, 18 Feb 2022 13:31:13 -0500
Message-Id: <20220218183114.2867528-2-riel@surriel.com>
In-Reply-To: <20220218183114.2867528-1-riel@surriel.com>
References: <20220218183114.2867528-1-riel@surriel.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

After kern_unmount returns, callers can no longer access the vfsmount
structure. However, the vfsmount structure does need to be kept around
until the end of the RCU grace period, to make sure other accesses have
all gone away too.

This can be accomplished by either gating each kern_unmount on
synchronize_rcu (the comment in the code says it all), or by deferring
the freeing until the next grace period, where it needs to be handled
in a workqueue due to the locking in mntput_no_expire().

Suggested-by: Eric Biederman
Reported-by: Chris Mason
---
 fs/namespace.c        | 11 +++++++++--
 include/linux/mount.h |  2 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 40b994a29e90..9f62cf6c69de 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4384,13 +4384,20 @@ struct vfsmount *kern_mount(struct file_system_type *type)
 }
 EXPORT_SYMBOL_GPL(kern_mount);
 
+static void mntput_rcu_work(struct work_struct *work)
+{
+	struct vfsmount *mnt = container_of(to_rcu_work(work),
+				struct vfsmount, free_rwork);
+	mntput(mnt);
+}
+
 void kern_unmount(struct vfsmount *mnt)
 {
 	/* release long term mount so mount point can be released */
 	if (!IS_ERR_OR_NULL(mnt)) {
 		real_mount(mnt)->mnt_ns = NULL;
-		synchronize_rcu();	/* yecchhh... */
-		mntput(mnt);
+		INIT_RCU_WORK(&mnt->free_rwork, mntput_rcu_work);
+		queue_rcu_work(system_wq, &mnt->free_rwork);
 	}
 }
 EXPORT_SYMBOL(kern_unmount);
diff --git a/include/linux/mount.h b/include/linux/mount.h
index 7f18a7555dff..cd007cb70d57 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 
 struct super_block;
 struct vfsmount;
@@ -73,6 +74,7 @@ struct vfsmount {
 	struct super_block *mnt_sb;	/* pointer to superblock */
 	int mnt_flags;
 	struct user_namespace *mnt_userns;
+	struct rcu_work free_rwork;
 } __randomize_layout;
 
 static inline struct user_namespace *mnt_user_ns(const struct vfsmount *mnt)

From patchwork Fri Feb 18 18:31:14 2022
X-Patchwork-Submitter: Rik van Riel
X-Patchwork-Id: 12751754
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: kernel-team@fb.com, linux-fsdevel@vger.kernel.org, paulmck@kernel.org,
 gscrivan@redhat.com, viro@zeniv.linux.org.uk, Rik van Riel
Subject: [PATCH 2/2] ipc: get rid of free_ipc_work workqueue
Date: Fri, 18 Feb 2022 13:31:14 -0500
Message-Id: <20220218183114.2867528-3-riel@surriel.com>
In-Reply-To: <20220218183114.2867528-1-riel@surriel.com>
References: <20220218183114.2867528-1-riel@surriel.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

With kern_unmount deferring the freeing of the vfsmount structure
through queue_rcu_work, we no longer need a separate workqueue for
freeing up ipc_namespace structures.

Signed-off-by: Rik van Riel
---
 include/linux/ipc_namespace.h |  2 --
 ipc/namespace.c               | 21 +--------------------
 2 files changed, 1 insertion(+), 22 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index b75395ec8d52..5a3debde2f3d 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -67,8 +67,6 @@ struct ipc_namespace {
 	struct user_namespace *user_ns;
 	struct ucounts *ucounts;
 
-	struct llist_node mnt_llist;
-
 	struct ns_common ns;
 } __randomize_layout;
 
diff --git a/ipc/namespace.c b/ipc/namespace.c
index ae83f0f2651b..090a08b17710 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -117,9 +117,6 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
 
 static void free_ipc_ns(struct ipc_namespace *ns)
 {
-	/* mq_put_mnt() waits for a grace period as kern_unmount()
-	 * uses synchronize_rcu().
-	 */
 	mq_put_mnt(ns);
 	sem_exit_ns(ns);
 	msg_exit_ns(ns);
@@ -131,21 +128,6 @@ static void free_ipc_ns(struct ipc_namespace *ns)
 	kfree(ns);
 }
 
-static LLIST_HEAD(free_ipc_list);
-static void free_ipc(struct work_struct *unused)
-{
-	struct llist_node *node = llist_del_all(&free_ipc_list);
-	struct ipc_namespace *n, *t;
-
-	llist_for_each_entry_safe(n, t, node, mnt_llist)
-		free_ipc_ns(n);
-}
-
-/*
- * The work queue is used to avoid the cost of synchronize_rcu in kern_unmount.
- */
-static DECLARE_WORK(free_ipc_work, free_ipc);
-
 /*
  * put_ipc_ns - drop a reference to an ipc namespace.
  * @ns: the namespace to put
@@ -168,8 +150,7 @@ void put_ipc_ns(struct ipc_namespace *ns)
 		mq_clear_sbinfo(ns);
 		spin_unlock(&mq_lock);
 
-		if (llist_add(&ns->mnt_llist, &free_ipc_list))
-			schedule_work(&free_ipc_work);
+		free_ipc_ns(ns);
 	}
 }
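
The deferral pattern in patch 1 (embed a work item in the object, queue it from the unmount path, and recover the object with container_of() in the callback) can be sketched in plain userspace C. This is only an illustration, not kernel code: every fake_* name is hypothetical, the pending list stands in for the workqueue, and fake_run_pending() stands in for "the grace period has elapsed and the work runs". No RCU or locking is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(): recover the
 * enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical work item; the patch uses struct rcu_work instead. */
struct fake_work {
	void (*func)(struct fake_work *work);
	struct fake_work *next;
};

/* Hypothetical object with an embedded work item, like free_rwork. */
struct fake_mount {
	int refcount;
	struct fake_work free_work;
};

/* Deferred-work list; stands in for the workqueue (assumption). */
static struct fake_work *pending;

/* Stand-in for mntput(): drop one reference. */
static void fake_mntput(struct fake_mount *mnt)
{
	assert(mnt->refcount > 0);
	mnt->refcount--;
}

/* Callback, shaped like mntput_rcu_work(): map work item to object. */
static void fake_mntput_work(struct fake_work *work)
{
	struct fake_mount *mnt =
		container_of(work, struct fake_mount, free_work);

	fake_mntput(mnt);
}

/* Shaped like the patched kern_unmount(): queue the release and
 * return immediately instead of waiting for a grace period. */
static void fake_kern_unmount(struct fake_mount *mnt)
{
	if (!mnt)
		return;
	mnt->free_work.func = fake_mntput_work;
	mnt->free_work.next = pending;
	pending = &mnt->free_work;
}

/* Run everything queued so far; returns the number of items run. */
static int fake_run_pending(void)
{
	struct fake_work *work = pending;
	int n = 0;

	pending = NULL;
	while (work) {
		/* Read next first: the callback may release the object. */
		struct fake_work *next = work->next;

		work->func(work);
		work = next;
		n++;
	}
	return n;
}
```

Calling fake_kern_unmount() on an object leaves its refcount untouched until fake_run_pending() is called, which mirrors how the caller of kern_unmount() no longer blocks while the final mntput() happens later.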