From patchwork Tue Apr 1 00:32:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 14034235 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E8FC06F099; Tue, 1 Apr 2025 00:33:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467611; cv=none; b=trmlSQB0tps86OR7HS4jtXjk8soJxxlrRgMCtua+geuVkFzGECRcfUAyLllhvTh9Jvl6gsaU94azOab3fXaC8Sb9rd4PBkFrU5UFHDISyTG5zsY0CgrfsetdW71m0OQAtAbThKlAdoCN/v7t8CVBiV8DnRncWZ/DW4NQS3veyU8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467611; c=relaxed/simple; bh=v/2U6Lf3UPg1q5dA8hz9DqjrfG65mtSGjt9qiHZK+iU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Ow06JRjHbNIyPYZaYMw+lTgcsdOFE1/4oBnl5eBNRc7Vb/dZdkJ+13kkI50k9Bg9gxduPVocVMzyG3hVRIbgZuVbzBiWO4RqdkvGEdVGUGNefIGv24emv+s50FKGqOfKnFhJB38PQ09uL6iRzf+CkbfYYU03MEQ5NdMofIL6ypA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=VMON9pul; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="VMON9pul" Received: by smtp.kernel.org (Postfix) with ESMTPSA id ACBB9C4CEE5; Tue, 1 Apr 2025 00:33:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743467610; bh=v/2U6Lf3UPg1q5dA8hz9DqjrfG65mtSGjt9qiHZK+iU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VMON9pulmwp/ZtqWmuBSqNWW3AbYWmZblYTC16tKmiNm/DfBxsVKWe7vemiU7vmtB 
Vbho5t2kWOso3YuO9Ewm2dFB/yCISmcHT6Q8Bt3WH5lpzR+pru2o/mamdax8FfJ88K 9uZQaDrFHjAhKxlCcLF6eUJKqXGWGoiSzXL81eVp0xyStBWrMGmjKaKAC4vbHjq6Nm PlpVqVa8z5f18Xji1kHV97U/uRsVEwsoSuK37ONHwje/dmgGce0D/Xy2xWCjrTpppg H1Om10HS2e28vKs+YFGDAM3epsU/WAwdo9HfHJvsJp/idze3nAY/24xDS7Amkjbl+k wicCjLOpFnQEw==
From: Christian Brauner
To: linux-fsdevel@vger.kernel.org, jack@suse.cz
Cc: Christian Brauner, Ard Biesheuvel, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, James Bottomley, mcgrof@kernel.org, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com
Subject: [PATCH 1/6] ext4: replace kthread freezing with auto fs freezing
Date: Tue, 1 Apr 2025 02:32:46 +0200
Message-ID: <20250401-work-freeze-v1-1-d000611d4ab0@kernel.org>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org>
References: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Mailer: b4 0.15-dev-42535
X-Developer-Signature: v=1; a=openpgp-sha256; l=4014; i=brauner@kernel.org; h=from:subject:message-id; bh=I1JUqV0XESLI2NYVo+JV4P0snp7i2H23CqsH/UPiKWY=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaS/NrHelvdCcjfHx9N1dcvOm27qSpK5cXHXtLtrkxec5 fywYlHU0Y5SFgYxLgZZMUUWh3aTcLnlPBWbjTI1YOawMoEMYeDiFICJHPjC8M/owt23wbpxkhFT DX9zK76vdxSMPhq+2KW9cM/tDxlv3T4w/FM8l7HncLQBg4Dc4TrXxdUpTzVPmMp0+snfSPZfb+U pxwAA
X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624

From: Luis Chamberlain

The kernel power management now supports allowing the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove the
kthread freezing. This is needed so that we really stop IO in flight
without races after userspace has been frozen.
Without this we rely on kthread freezing, whose semantics are loose and
error prone. The filesystem is therefore in charge of properly quiescing
itself through its callbacks if it thinks it knows better than the VFS.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

make coccicheck MODE=patch SPFLAGS="--in-place --no-show-diff" COCCI=./fs-freeze-cleanup.cocci M=fs/ext4

virtual patch

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-    set_freezable();
|
-    if (try_to_freeze())
-        continue;
|
-    try_to_freeze();
|
-    freezable_schedule();
+    schedule();
|
-    freezable_schedule_timeout(time);
+    schedule_timeout(time);
|
-    if (freezing(task)) { S }
|
-    if (freezing(task)) { S }
-    else { S2 }
|
-    freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_FREEZABLE,
+        WQ_ARG2,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+        WQ_ARG2 | WQ_ARG3,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+        WQ_ARG2 | WQ_ARG3,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+        WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
         ...);
|
     WQ_E =
-        WQ_ARG1 | WQ_FREEZABLE
+        WQ_ARG1
|
     WQ_E =
-        WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+        WQ_ARG1 | WQ_ARG3
|
     fs_wq_fn(
-        WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+        WQ_ARG2 | WQ_ARG3
     )
|
     fs_wq_fn(
-        WQ_FREEZABLE | WQ_ARG2
+        WQ_ARG2
     )
|
     fs_wq_fn(
-        WQ_FREEZABLE
+        0
     )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

 struct file_system_type fs_type = {
     .fs_flags = E1
+        | FS_AUTOFREEZE
     ,
 };

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain
Link: https://lore.kernel.org/r/20250326112220.1988619-5-mcgrof@kernel.org
Signed-off-by: Christian Brauner
---
 fs/ext4/mballoc.c | 2 +-
 fs/ext4/super.c   | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git
a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 0d523e9fb3d5..ae235ec5ff3a 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -6782,7 +6782,7 @@ static ext4_grpblk_t ext4_last_grp_cluster(struct super_block *sb, static bool ext4_trim_interrupted(void) { - return fatal_signal_pending(current) || freezing(current); + return fatal_signal_pending(current); } static int ext4_try_to_trim_range(struct super_block *sb, diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 8122d4ffb3b5..020c818078d7 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -3778,7 +3778,6 @@ static int ext4_lazyinit_thread(void *arg) unsigned long next_wakeup, cur; BUG_ON(NULL == eli); - set_freezable(); cont_thread: while (true) { @@ -3837,8 +3836,6 @@ static int ext4_lazyinit_thread(void *arg) } mutex_unlock(&eli->li_list_mtx); - try_to_freeze(); - cur = jiffies; if (!next_wakeup_initialized || time_after_eq(cur, next_wakeup)) { cond_resched(); From patchwork Tue Apr 1 00:32:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 14034236 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1EBE215D5B6; Tue, 1 Apr 2025 00:33:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467615; cv=none; b=sm/CuhECPyJNJqH7VpN6hcONDib7b5cXgCpfVCzcBrt4pz9Gx9fXFFHk4c6Hdm0hqGSF0dh7Yl33m+gWrndws9srfGG8cswGNXPa1csEN4CG+tb4bdgVwu+bem3lJpMo5AlzvSLsO91GF2pHhX4EY8nsFM8g+jrR7DPRrI4FNFs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467615; c=relaxed/simple; bh=lCY/KJW9ikBmlJ/oV/9jDJ7hT8RIfBH0XkGe2K3kJYw=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=aKPlxz5UZkt7YxDrLL40f6YGRiyYTpYyA5peP7XVCVR4qnW84abl38X9FOogFOB40dV6ts82JmLtw1c/Z3wn6shWzrl9K9P0pc2n+MRokDK3u0+C/s1LWCbOlTO6SJ61wKSpRUKYyekT++4NWs+dZLWSz/IKNOaKH1TDvxAcZ5U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IoCzgbjj; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IoCzgbjj" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E6BB4C4CEEE; Tue, 1 Apr 2025 00:33:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743467614; bh=lCY/KJW9ikBmlJ/oV/9jDJ7hT8RIfBH0XkGe2K3kJYw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=IoCzgbjjADfRLGcBq8tQxtzAh+nw9hJTYSo1OFfKB4zT7VsZ7m8xZ2SiorMmHZThc SLmeaXTjuyKa1TdRxX35IRjX15UZpeG9HVUtBU/xcfre0Gv9Wc1aPzr7rUs5buzYo9 0DuBgxVLKBK0iGDuHdHmtQVjetfr13yVEE2W/AJQm3lsEde5QtUb+ikWQ3HEf81dI5 zgtHLR6kRTNFpPKx2TMwnBMylSN2ArPzKVmLmOeNJp0UTn5cSuYKlsWeYOHIzjj1wf TxxXXEXIuCijpGfnm618zzUfdtioMA5rzPNGgL+qXH74cMfcHJaq1MGqSKPOsiArlb I/1kqtF+NJ5/A== From: Christian Brauner To: linux-fsdevel@vger.kernel.org, jack@suse.cz Cc: Christian Brauner , Ard Biesheuvel , linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, James Bottomley , mcgrof@kernel.org, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [PATCH 2/6] btrfs: replace kthread freezing with auto fs freezing Date: Tue, 1 Apr 2025 02:32:47 +0200 Message-ID: <20250401-work-freeze-v1-2-d000611d4ab0@kernel.org> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> References: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> Precedence: bulk X-Mailing-List: 
linux-fsdevel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Mailer: b4 0.15-dev-42535
X-Developer-Signature: v=1; a=openpgp-sha256; l=4192; i=brauner@kernel.org; h=from:subject:message-id; bh=d3lub6zszyAvvsjIX6sIGC+CVKRgSmI0RRk0r2As0ac=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaS/NrGZYPf7dvHUt8sULp+RD/Yp9+Y7vj/isdrjz+orO /luMnArdpSyMIhxMciKKbI4tJuEyy3nqdhslKkBM4eVCWQIAxenAEyk6TjDH87C7t13p9/JMjb6 y+32xnrnzAVls7eukwkx/Oq2olGgdSFQRUqbf0d048Kv7FZ2vUrJh6TrdKceNeHRYF+aYHPB/ws HAA==
X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624

From: Luis Chamberlain

The kernel power management now supports allowing the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove the
kthread freezing. This is needed so that we really stop IO in flight
without races after userspace has been frozen. Without this we rely on
kthread freezing, whose semantics are loose and error prone. The
filesystem is therefore in charge of properly quiescing itself through
its callbacks if it thinks it knows better than the VFS.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

make coccicheck MODE=patch SPFLAGS="--in-place --no-show-diff" COCCI=./fs-freeze-cleanup.cocci M=fs/btrfs

virtual patch

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-    set_freezable();
|
-    if (try_to_freeze())
-        continue;
|
-    try_to_freeze();
|
-    freezable_schedule();
+    schedule();
|
-    freezable_schedule_timeout(time);
+    schedule_timeout(time);
|
-    if (freezing(task)) { S }
|
-    if (freezing(task)) { S }
-    else { S2 }
|
-    freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_FREEZABLE,
+        WQ_ARG2,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+        WQ_ARG2 | WQ_ARG3,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+        WQ_ARG2 | WQ_ARG3,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+        WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
         ...);
|
     WQ_E =
-        WQ_ARG1 | WQ_FREEZABLE
+        WQ_ARG1
|
     WQ_E =
-        WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+        WQ_ARG1 | WQ_ARG3
|
     fs_wq_fn(
-        WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+        WQ_ARG2 | WQ_ARG3
     )
|
     fs_wq_fn(
-        WQ_FREEZABLE | WQ_ARG2
+        WQ_ARG2
     )
|
     fs_wq_fn(
-        WQ_FREEZABLE
+        0
     )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

 struct file_system_type fs_type = {
     .fs_flags = E1
+        | FS_AUTOFREEZE
     ,
 };

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain
Link: https://lore.kernel.org/r/20250326112220.1988619-6-mcgrof@kernel.org
Signed-off-by: Christian Brauner
---
 fs/btrfs/disk-io.c | 4 ++--
 fs/btrfs/scrub.c   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 1a916716cefe..bce3ae569fe0 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1962,8 +1962,8 @@ static void btrfs_init_qgroup(struct btrfs_fs_info *fs_info)
 static int btrfs_init_workqueues(struct
btrfs_fs_info *fs_info) { u32 max_active = fs_info->thread_pool_size; - unsigned int flags = WQ_MEM_RECLAIM | WQ_FREEZABLE | WQ_UNBOUND; - unsigned int ordered_flags = WQ_MEM_RECLAIM | WQ_FREEZABLE; + unsigned int flags = WQ_MEM_RECLAIM | WQ_UNBOUND; + unsigned int ordered_flags = WQ_MEM_RECLAIM; fs_info->workers = btrfs_alloc_workqueue(fs_info, "worker", flags, max_active, 16); diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 2c5edcee9450..5790177b4c2f 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -2877,7 +2877,7 @@ static void scrub_workers_put(struct btrfs_fs_info *fs_info) static noinline_for_stack int scrub_workers_get(struct btrfs_fs_info *fs_info) { struct workqueue_struct *scrub_workers = NULL; - unsigned int flags = WQ_FREEZABLE | WQ_UNBOUND; + unsigned int flags = WQ_UNBOUND; int max_active = fs_info->thread_pool_size; int ret = -ENOMEM; From patchwork Tue Apr 1 00:32:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 14034237 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5937A3B2A0; Tue, 1 Apr 2025 00:33:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467619; cv=none; b=avweqoUn+hI/IWy1Ip4+vXlbdQJ6pzkPklX8KlnCh7YT3b500miCSM7Ks7oOoC/gU2eqNkCVMxXwOfNTdPJ4emfSPIKZVf+nxnVCD7UTWmQ0pbNiYKUOECM3Wa4hCZNiCT2AqnmQPyqz5nhMTrt83nfjGPWlpm3OaDouZsj4VUk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467619; c=relaxed/simple; bh=LgKEk3JEMCVIwbUc0LPca13vEvDo6/Cr7URqAgR/XqY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=Wb1tc/5TfD+YMQxTySAtiBuNh3L8r9JfrQw7rUkLCnIckjljypJ6NoLBYhwOwEP4MlZFMaHMHp9CCakgb+kvYpE8TrFCkOMFD2CJhYFteI19cfged4Wu4qJmF+pxhxoFShiX26tN0yfW+f2JtVZ/8K+c3ouvn58DW+mKVNV3lp0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=An6fGshs; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="An6fGshs" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0B414C4CEED; Tue, 1 Apr 2025 00:33:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743467618; bh=LgKEk3JEMCVIwbUc0LPca13vEvDo6/Cr7URqAgR/XqY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=An6fGshskWq3QX8K2FdJorEK8/jRxdV8qJFnAJ3r5Zlcj4tXqQhTB6vvqyKbBv1gS Zg/RHe2VIDB+Nrm2x6xq8FJcPucoIX0EzhoRockD1cLBNheXhuSIJWFJCrWOedkwy6 RiVVR54jefaVaQG529fPF4qmEurlvBollB7pOwIkbLCrIsKYZimTM15VUmp+NeVvbP 5Vq69BzwS7F/IdoEgojBFs09SZZ+3sp1tmYieWJsD4XBHsw4K9I+p53NuuPCJgyVw9 uk4juI9lqnBbY9XAJhIlDBQK1Noe7Su6lSg4DaK7MwIWEyS2vwjcyqPRw2UCOxIlwX K348wMhvuW0rQ== From: Christian Brauner To: linux-fsdevel@vger.kernel.org, jack@suse.cz Cc: Christian Brauner , Ard Biesheuvel , linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, James Bottomley , mcgrof@kernel.org, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [PATCH 3/6] xfs: replace kthread freezing with auto fs freezing Date: Tue, 1 Apr 2025 02:32:48 +0200 Message-ID: <20250401-work-freeze-v1-3-d000611d4ab0@kernel.org> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> References: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Mailer: 
b4 0.15-dev-42535
X-Developer-Signature: v=1; a=openpgp-sha256; l=8352; i=brauner@kernel.org; h=from:subject:message-id; bh=FtAblvbK+CSDReLl0ZY3VoDAalHQydNqGK9uQ+GdDdI=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaS/NrGJ874lqjwv3tLYOsvov2n+xYyaF0H9cikWFdPi6 1k0VG51lLIwiHExyIopsji0m4TLLeep2GyUqQEzh5UJZAgDF6cATMRQhuF/tle9eNHfU/WzXHUO 7Jb1ujT38NeJwifmLVhWJPfipOSzCYwMd36V/HP5wli56ETJu/1fk/asYrMM8VOUCXCKvnlyxcs MVgA=
X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624

From: Luis Chamberlain

The kernel power management now supports allowing the VFS to handle
filesystem freezing and thawing. Take advantage of that and remove the
kthread freezing. This is needed so that we really stop IO in flight
without races after userspace has been frozen. Without this we rely on
kthread freezing, whose semantics are loose and error prone. The
filesystem is therefore in charge of properly quiescing itself through
its callbacks if it thinks it knows better than the VFS.

The following Coccinelle rule was used to remove the now superfluous
freezer calls:

make coccicheck MODE=patch SPFLAGS="--in-place --no-show-diff" COCCI=./fs-freeze-cleanup.cocci M=fs/xfs

virtual patch

@ remove_set_freezable @
expression time;
statement S, S2;
expression task, current;
@@

(
-    set_freezable();
|
-    if (try_to_freeze())
-        continue;
|
-    try_to_freeze();
|
-    freezable_schedule();
+    schedule();
|
-    freezable_schedule_timeout(time);
+    schedule_timeout(time);
|
-    if (freezing(task)) { S }
|
-    if (freezing(task)) { S }
-    else { S2 }
|
-    freezing(current)
)

@ remove_wq_freezable @
expression WQ_E, WQ_ARG1, WQ_ARG2, WQ_ARG3, WQ_ARG4;
identifier fs_wq_fn;
@@

(
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_FREEZABLE,
+        WQ_ARG2,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_FREEZABLE | WQ_ARG3,
+        WQ_ARG2 | WQ_ARG3,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE,
+        WQ_ARG2 | WQ_ARG3,
         ...);
|
     WQ_E = alloc_workqueue(WQ_ARG1,
-        WQ_ARG2 | WQ_ARG3 | WQ_FREEZABLE | WQ_ARG4,
+        WQ_ARG2 | WQ_ARG3 | WQ_ARG4,
         ...);
|
     WQ_E =
-        WQ_ARG1 | WQ_FREEZABLE
+        WQ_ARG1
|
     WQ_E =
-        WQ_ARG1 | WQ_FREEZABLE | WQ_ARG3
+        WQ_ARG1 | WQ_ARG3
|
     fs_wq_fn(
-        WQ_FREEZABLE | WQ_ARG2 | WQ_ARG3
+        WQ_ARG2 | WQ_ARG3
     )
|
     fs_wq_fn(
-        WQ_FREEZABLE | WQ_ARG2
+        WQ_ARG2
     )
|
     fs_wq_fn(
-        WQ_FREEZABLE
+        0
     )
)

@ add_auto_flag @
expression E1;
identifier fs_type;
@@

 struct file_system_type fs_type = {
     .fs_flags = E1
+        | FS_AUTOFREEZE
     ,
 };

Generated-by: Coccinelle SmPL
Signed-off-by: Luis Chamberlain
Link: https://lore.kernel.org/r/20250326112220.1988619-7-mcgrof@kernel.org
Signed-off-by: Christian Brauner
---
 fs/xfs/xfs_discard.c   |  2 +-
 fs/xfs/xfs_log.c       |  3 +--
 fs/xfs/xfs_log_cil.c   |  2 +-
 fs/xfs/xfs_mru_cache.c |  2 +-
 fs/xfs/xfs_pwork.c     |  2 +-
 fs/xfs/xfs_super.c     | 14 +++++++-------
 fs/xfs/xfs_trans_ail.c |  3 ---
 fs/xfs/xfs_zone_gc.c   |  2 --
 8 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/fs/xfs/xfs_discard.c b/fs/xfs/xfs_discard.c
index
c1a306268ae4..1596cf0ecb9b 100644 --- a/fs/xfs/xfs_discard.c +++ b/fs/xfs/xfs_discard.c @@ -333,7 +333,7 @@ xfs_trim_gather_extents( static bool xfs_trim_should_stop(void) { - return fatal_signal_pending(current) || freezing(current); + return fatal_signal_pending(current); } /* diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c index 6493bdb57351..317f6db292fb 100644 --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -1489,8 +1489,7 @@ xlog_alloc_log( log->l_iclog->ic_prev = prev_iclog; /* re-write 1st prev ptr */ log->l_ioend_workqueue = alloc_workqueue("xfs-log/%s", - XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM | - WQ_HIGHPRI), + XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_HIGHPRI), 0, mp->m_super->s_id); if (!log->l_ioend_workqueue) goto out_free_iclog; diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c index 1ca406ec1b40..8ff5d68394e6 100644 --- a/fs/xfs/xfs_log_cil.c +++ b/fs/xfs/xfs_log_cil.c @@ -1932,7 +1932,7 @@ xlog_cil_init( * concurrency the log spinlocks will be exposed to. */ cil->xc_push_wq = alloc_workqueue("xfs-cil/%s", - XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM | WQ_UNBOUND), + XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_UNBOUND), 4, log->l_mp->m_super->s_id); if (!cil->xc_push_wq) goto out_destroy_cil; diff --git a/fs/xfs/xfs_mru_cache.c b/fs/xfs/xfs_mru_cache.c index d0f5b403bdbe..c9a49c6f6129 100644 --- a/fs/xfs/xfs_mru_cache.c +++ b/fs/xfs/xfs_mru_cache.c @@ -293,7 +293,7 @@ int xfs_mru_cache_init(void) { xfs_mru_reap_wq = alloc_workqueue("xfs_mru_cache", - XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_FREEZABLE), 1); + XFS_WQFLAGS(WQ_MEM_RECLAIM), 1); if (!xfs_mru_reap_wq) return -ENOMEM; return 0; diff --git a/fs/xfs/xfs_pwork.c b/fs/xfs/xfs_pwork.c index c283b801cc5d..3f5bf53f8778 100644 --- a/fs/xfs/xfs_pwork.c +++ b/fs/xfs/xfs_pwork.c @@ -72,7 +72,7 @@ xfs_pwork_init( trace_xfs_pwork_init(mp, nr_threads, current->pid); pctl->wq = alloc_workqueue("%s-%d", - WQ_UNBOUND | WQ_SYSFS | WQ_FREEZABLE, nr_threads, tag, + WQ_UNBOUND | WQ_SYSFS, nr_threads, tag, current->pid); if 
(!pctl->wq) return -ENOMEM; diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 53944cc7af24..06eb51a3d13b 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -565,37 +565,37 @@ xfs_init_mount_workqueues( struct xfs_mount *mp) { mp->m_buf_workqueue = alloc_workqueue("xfs-buf/%s", - XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM), + XFS_WQFLAGS(WQ_MEM_RECLAIM), 1, mp->m_super->s_id); if (!mp->m_buf_workqueue) goto out; mp->m_unwritten_workqueue = alloc_workqueue("xfs-conv/%s", - XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM), + XFS_WQFLAGS(WQ_MEM_RECLAIM), 0, mp->m_super->s_id); if (!mp->m_unwritten_workqueue) goto out_destroy_buf; mp->m_reclaim_workqueue = alloc_workqueue("xfs-reclaim/%s", - XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM), + XFS_WQFLAGS(WQ_MEM_RECLAIM), 0, mp->m_super->s_id); if (!mp->m_reclaim_workqueue) goto out_destroy_unwritten; mp->m_blockgc_wq = alloc_workqueue("xfs-blockgc/%s", - XFS_WQFLAGS(WQ_UNBOUND | WQ_FREEZABLE | WQ_MEM_RECLAIM), + XFS_WQFLAGS(WQ_UNBOUND | WQ_MEM_RECLAIM), 0, mp->m_super->s_id); if (!mp->m_blockgc_wq) goto out_destroy_reclaim; mp->m_inodegc_wq = alloc_workqueue("xfs-inodegc/%s", - XFS_WQFLAGS(WQ_FREEZABLE | WQ_MEM_RECLAIM), + XFS_WQFLAGS(WQ_MEM_RECLAIM), 1, mp->m_super->s_id); if (!mp->m_inodegc_wq) goto out_destroy_blockgc; mp->m_sync_workqueue = alloc_workqueue("xfs-sync/%s", - XFS_WQFLAGS(WQ_FREEZABLE), 0, mp->m_super->s_id); + XFS_WQFLAGS(0), 0, mp->m_super->s_id); if (!mp->m_sync_workqueue) goto out_destroy_inodegc; @@ -2488,7 +2488,7 @@ xfs_init_workqueues(void) * max_active value for this workqueue. 
*/ xfs_alloc_wq = alloc_workqueue("xfsalloc", - XFS_WQFLAGS(WQ_MEM_RECLAIM | WQ_FREEZABLE), 0); + XFS_WQFLAGS(WQ_MEM_RECLAIM), 0); if (!xfs_alloc_wq) return -ENOMEM; diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 0fcb1828e598..ad8183db0780 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -636,7 +636,6 @@ xfsaild( unsigned int noreclaim_flag; noreclaim_flag = memalloc_noreclaim_save(); - set_freezable(); while (1) { /* @@ -695,8 +694,6 @@ xfsaild( __set_current_state(TASK_RUNNING); - try_to_freeze(); - tout = xfsaild_push(ailp); } diff --git a/fs/xfs/xfs_zone_gc.c b/fs/xfs/xfs_zone_gc.c index c5136ea9bb1d..1875b6551ab0 100644 --- a/fs/xfs/xfs_zone_gc.c +++ b/fs/xfs/xfs_zone_gc.c @@ -993,7 +993,6 @@ xfs_zone_gc_handle_work( } __set_current_state(TASK_RUNNING); - try_to_freeze(); if (reset_list) xfs_zone_gc_reset_zones(data, reset_list); @@ -1041,7 +1040,6 @@ xfs_zoned_gcd( unsigned int nofs_flag; nofs_flag = memalloc_nofs_save(); - set_freezable(); for (;;) { set_current_state(TASK_INTERRUPTIBLE | TASK_FREEZABLE); From patchwork Tue Apr 1 00:32:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 14034238 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0D633B2A0; Tue, 1 Apr 2025 00:33:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467623; cv=none; b=NJYOmiNCrBAWR3yu2Kynmn8sHsKt588UPVeWEYncQA94Bz3TdZ6RZLO1FZWa6tguYCHlsGYsIDZYsCrm48jmSqO+0++il+Lh3iLRJiB8AVESvDv5u49Fn/oJzueolmuo5YCNrg9hkTSCMAZfGPvj4aJkyp1y3BzeF/wour8TNzY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1743467623; c=relaxed/simple; bh=OPGdbbL5QTv7IBZoEqw961G0IFKg5su4K0y3BuKLV8c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=jtBL96BpOzHsBMa19ds7Hkdp3RgCzaqd71d7nVl8lFwCOEmORPli395ba/oH+d4t0bD/2FjpolE4oFGSWoIvdJgmJGk/7uLS6CysnD5Q+8BJgbfoXjeosyBNYaJM0th9HSBzhQVPeil1AGThvaezNQwGRzyddyb7Dx25cgOpwnc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aOGACVc2; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aOGACVc2" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4EFC9C4CEE5; Tue, 1 Apr 2025 00:33:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743467623; bh=OPGdbbL5QTv7IBZoEqw961G0IFKg5su4K0y3BuKLV8c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=aOGACVc2bgJwGAFEz3QcjP2iCi/LsO4nuEZhl2eN69FADL3rhiuVoNBpr8vaztLio UfrKldSOaQD4pv22bM0qOR2VFSi3e9KIbwQyzLafomdil9w63wA0UjqUzAb3F5yiBa Wl8OSEadgGR1fYkdfIbNij9SKIWUJl/PaD0G5zjyK2d5Fm6da4rnXvboI44skYV2rA FhvdDVkP/Y4ZlRVrftjjfFG69GzAVJ7J11WjZ08wJo1705+858N10Ij5TSPVpxDZH9 nx6slNJ+i3HeHyrSMwcYCyeuDqh++m2S/751alobzbGNGHbjb8lTS5nJrW7dEIqi4x ANUELxJ0k5dxA== From: Christian Brauner To: linux-fsdevel@vger.kernel.org, jack@suse.cz Cc: Christian Brauner , Ard Biesheuvel , linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, James Bottomley , mcgrof@kernel.org, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [PATCH 4/6] fs: add owner of freeze/thaw Date: Tue, 1 Apr 2025 02:32:49 +0200 Message-ID: <20250401-work-freeze-v1-4-d000611d4ab0@kernel.org> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> References: 
<20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Mailer: b4 0.15-dev-42535 X-Developer-Signature: v=1; a=openpgp-sha256; l=17602; i=brauner@kernel.org; h=from:subject:message-id; bh=OPGdbbL5QTv7IBZoEqw961G0IFKg5su4K0y3BuKLV8c=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaS/NrE53qU2n7fAOU3k15IJxzYnajNsWN3joGD09Vd8i p1Iw/YzHaUsDGJcDLJiiiwO7Sbhcst5KjYbZWrAzGFlAhnCwMUpABPJPs/wV2bCztk2V6NVeXuN lUM8T/xyVj9yX/Xd41UxCrUb7gU+S2FkuHkpZeMCtduieiybvuw/99uhu2fTi3+q7ya7zotZXue ziRUA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 For some kernel subsystems it is paramount that they are guaranteed that they are the owner of the freeze to avoid any risk of deadlocks. This is the case for the power subsystem. Enable it to recognize whether it did actually freeze the filesystem. Signed-off-by: Christian Brauner --- fs/f2fs/gc.c | 6 ++-- fs/gfs2/super.c | 20 +++++++------ fs/gfs2/sys.c | 4 +-- fs/ioctl.c | 8 +++--- fs/super.c | 68 +++++++++++++++++++++++++++++++++++++-------- fs/xfs/scrub/fscounters.c | 4 +-- fs/xfs/xfs_notify_failure.c | 6 ++-- include/linux/fs.h | 13 ++++++--- 8 files changed, 91 insertions(+), 38 deletions(-) diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index 2b8f9239bede..3e8af62c9e15 100644 --- a/fs/f2fs/gc.c +++ b/fs/f2fs/gc.c @@ -2271,12 +2271,12 @@ int f2fs_resize_fs(struct file *filp, __u64 block_count) if (err) return err; - err = freeze_super(sbi->sb, FREEZE_HOLDER_USERSPACE); + err = freeze_super(sbi->sb, FREEZE_HOLDER_USERSPACE, NULL); if (err) return err; if (f2fs_readonly(sbi->sb)) { - err = thaw_super(sbi->sb, FREEZE_HOLDER_USERSPACE); + err = thaw_super(sbi->sb, FREEZE_HOLDER_USERSPACE, NULL); if (err) return err; return -EROFS; @@ -2333,6 +2333,6 @@ int f2fs_resize_fs(struct file *filp, __u64 block_count) out_err: f2fs_up_write(&sbi->cp_global_sem); f2fs_up_write(&sbi->gc_lock); - 
thaw_super(sbi->sb, FREEZE_HOLDER_USERSPACE); + thaw_super(sbi->sb, FREEZE_HOLDER_USERSPACE, NULL); return err; } diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 44e5658b896c..519943189109 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -674,7 +674,7 @@ static int gfs2_sync_fs(struct super_block *sb, int wait) return sdp->sd_log_error; } -static int gfs2_do_thaw(struct gfs2_sbd *sdp) +static int gfs2_do_thaw(struct gfs2_sbd *sdp, enum freeze_holder who, const void *freeze_owner) { struct super_block *sb = sdp->sd_vfs; int error; @@ -682,7 +682,7 @@ static int gfs2_do_thaw(struct gfs2_sbd *sdp) error = gfs2_freeze_lock_shared(sdp); if (error) goto fail; - error = thaw_super(sb, FREEZE_HOLDER_USERSPACE); + error = thaw_super(sb, who, freeze_owner); if (!error) return 0; @@ -703,14 +703,14 @@ void gfs2_freeze_func(struct work_struct *work) if (test_bit(SDF_FROZEN, &sdp->sd_flags)) goto freeze_failed; - error = freeze_super(sb, FREEZE_HOLDER_USERSPACE); + error = freeze_super(sb, FREEZE_HOLDER_USERSPACE, NULL); if (error) goto freeze_failed; gfs2_freeze_unlock(sdp); set_bit(SDF_FROZEN, &sdp->sd_flags); - error = gfs2_do_thaw(sdp); + error = gfs2_do_thaw(sdp, FREEZE_HOLDER_USERSPACE, NULL); if (error) goto out; @@ -731,7 +731,8 @@ void gfs2_freeze_func(struct work_struct *work) * */ -static int gfs2_freeze_super(struct super_block *sb, enum freeze_holder who) +static int gfs2_freeze_super(struct super_block *sb, enum freeze_holder who, + const void *freeze_owner) { struct gfs2_sbd *sdp = sb->s_fs_info; int error; @@ -744,7 +745,7 @@ static int gfs2_freeze_super(struct super_block *sb, enum freeze_holder who) } for (;;) { - error = freeze_super(sb, FREEZE_HOLDER_USERSPACE); + error = freeze_super(sb, who, freeze_owner); if (error) { fs_info(sdp, "GFS2: couldn't freeze filesystem: %d\n", error); @@ -758,7 +759,7 @@ static int gfs2_freeze_super(struct super_block *sb, enum freeze_holder who) break; } - error = gfs2_do_thaw(sdp); + error = gfs2_do_thaw(sdp, who, 
freeze_owner); if (error) goto out; @@ -799,7 +800,8 @@ static int gfs2_freeze_fs(struct super_block *sb) * */ -static int gfs2_thaw_super(struct super_block *sb, enum freeze_holder who) +static int gfs2_thaw_super(struct super_block *sb, enum freeze_holder who, + const void *freeze_owner) { struct gfs2_sbd *sdp = sb->s_fs_info; int error; @@ -814,7 +816,7 @@ static int gfs2_thaw_super(struct super_block *sb, enum freeze_holder who) atomic_inc(&sb->s_active); gfs2_freeze_unlock(sdp); - error = gfs2_do_thaw(sdp); + error = gfs2_do_thaw(sdp, who, freeze_owner); if (!error) { clear_bit(SDF_FREEZE_INITIATOR, &sdp->sd_flags); diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c index ecc699f8d9fc..748125653d6c 100644 --- a/fs/gfs2/sys.c +++ b/fs/gfs2/sys.c @@ -174,10 +174,10 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len) switch (n) { case 0: - error = thaw_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE); + error = thaw_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE, NULL); break; case 1: - error = freeze_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE); + error = freeze_super(sdp->sd_vfs, FREEZE_HOLDER_USERSPACE, NULL); break; default: return -EINVAL; diff --git a/fs/ioctl.c b/fs/ioctl.c index c91fd2b46a77..bedc83fc2f20 100644 --- a/fs/ioctl.c +++ b/fs/ioctl.c @@ -396,8 +396,8 @@ static int ioctl_fsfreeze(struct file *filp) /* Freeze */ if (sb->s_op->freeze_super) - return sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE); - return freeze_super(sb, FREEZE_HOLDER_USERSPACE); + return sb->s_op->freeze_super(sb, FREEZE_HOLDER_USERSPACE, NULL); + return freeze_super(sb, FREEZE_HOLDER_USERSPACE, NULL); } static int ioctl_fsthaw(struct file *filp) @@ -409,8 +409,8 @@ static int ioctl_fsthaw(struct file *filp) /* Thaw */ if (sb->s_op->thaw_super) - return sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE); - return thaw_super(sb, FREEZE_HOLDER_USERSPACE); + return sb->s_op->thaw_super(sb, FREEZE_HOLDER_USERSPACE, NULL); + return thaw_super(sb, 
FREEZE_HOLDER_USERSPACE, NULL); } static int ioctl_file_dedupe_range(struct file *file, diff --git a/fs/super.c b/fs/super.c index 3c4a496d6438..606072a3fab9 100644 --- a/fs/super.c +++ b/fs/super.c @@ -39,7 +39,8 @@ #include #include "internal.h" -static int thaw_super_locked(struct super_block *sb, enum freeze_holder who); +static int thaw_super_locked(struct super_block *sb, enum freeze_holder who, + const void *freeze_owner); static LIST_HEAD(super_blocks); static DEFINE_SPINLOCK(sb_lock); @@ -1148,7 +1149,7 @@ static void do_thaw_all_callback(struct super_block *sb, void *unused) if (IS_ENABLED(CONFIG_BLOCK)) while (sb->s_bdev && !bdev_thaw(sb->s_bdev)) pr_warn("Emergency Thaw on %pg\n", sb->s_bdev); - thaw_super_locked(sb, FREEZE_HOLDER_USERSPACE); + thaw_super_locked(sb, FREEZE_HOLDER_USERSPACE, NULL); return; } @@ -1522,10 +1523,10 @@ static int fs_bdev_freeze(struct block_device *bdev) if (sb->s_op->freeze_super) error = sb->s_op->freeze_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE); + FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); else error = freeze_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE); + FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); if (!error) error = sync_blockdev(bdev); deactivate_super(sb); @@ -1571,10 +1572,10 @@ static int fs_bdev_thaw(struct block_device *bdev) if (sb->s_op->thaw_super) error = sb->s_op->thaw_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE); + FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); else error = thaw_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE); + FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); deactivate_super(sb); return error; } @@ -1946,7 +1947,7 @@ static int wait_for_partially_frozen(struct super_block *sb) } #define FREEZE_HOLDERS (FREEZE_HOLDER_KERNEL | FREEZE_HOLDER_USERSPACE) -#define FREEZE_FLAGS (FREEZE_HOLDERS | FREEZE_MAY_NEST) +#define FREEZE_FLAGS (FREEZE_HOLDERS | FREEZE_MAY_NEST | FREEZE_EXCL) static inline int freeze_inc(struct super_block *sb, enum 
freeze_holder who) { @@ -1977,6 +1978,21 @@ static inline bool may_freeze(struct super_block *sb, enum freeze_holder who) WARN_ON_ONCE((who & ~FREEZE_FLAGS)); WARN_ON_ONCE(hweight32(who & FREEZE_HOLDERS) > 1); + if (who & FREEZE_EXCL) { + if (WARN_ON_ONCE(!(who & FREEZE_HOLDER_KERNEL))) + return false; + + if (who & ~(FREEZE_EXCL | FREEZE_HOLDER_KERNEL)) + return false; + + return (sb->s_writers.freeze_kcount + + sb->s_writers.freeze_ucount) == 0; + } + + /* This filesystem is already exclusively frozen. */ + if (sb->s_writers.freeze_owner) + return false; + if (who & FREEZE_HOLDER_KERNEL) return (who & FREEZE_MAY_NEST) || sb->s_writers.freeze_kcount == 0; @@ -1986,10 +2002,30 @@ static inline bool may_freeze(struct super_block *sb, enum freeze_holder who) return false; } +static inline bool may_unfreeze(struct super_block *sb, enum freeze_holder who, + const void *freeze_owner) +{ + WARN_ON_ONCE((who & ~FREEZE_FLAGS)); + WARN_ON_ONCE(hweight32(who & FREEZE_HOLDERS) > 1); + + if (who & FREEZE_EXCL) { + if (WARN_ON_ONCE(sb->s_writers.freeze_owner == NULL)) + return false; + if (WARN_ON_ONCE(!(who & FREEZE_HOLDER_KERNEL))) + return false; + if (who & ~(FREEZE_EXCL | FREEZE_HOLDER_KERNEL)) + return false; + return sb->s_writers.freeze_owner == freeze_owner; + } + + return sb->s_writers.freeze_owner == NULL; +} + /** * freeze_super - lock the filesystem and force it into a consistent state * @sb: the super to lock * @who: context that wants to freeze + * @freeze_owner: owner of the freeze * * Syncs the super to make sure the filesystem is consistent and calls the fs's * freeze_fs. Subsequent calls to this without first thawing the fs may return @@ -2041,7 +2077,7 @@ static inline bool may_freeze(struct super_block *sb, enum freeze_holder who) * Return: If the freeze was successful zero is returned. If the freeze * failed a negative error code is returned. 
*/ -int freeze_super(struct super_block *sb, enum freeze_holder who) +int freeze_super(struct super_block *sb, enum freeze_holder who, const void *freeze_owner) { int ret; @@ -2075,6 +2111,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who) if (sb_rdonly(sb)) { /* Nothing to do really... */ WARN_ON_ONCE(freeze_inc(sb, who) > 1); + sb->s_writers.freeze_owner = freeze_owner; sb->s_writers.frozen = SB_FREEZE_COMPLETE; wake_up_var(&sb->s_writers.frozen); super_unlock_excl(sb); @@ -2122,6 +2159,7 @@ int freeze_super(struct super_block *sb, enum freeze_holder who) * when frozen is set to SB_FREEZE_COMPLETE, and for thaw_super(). */ WARN_ON_ONCE(freeze_inc(sb, who) > 1); + sb->s_writers.freeze_owner = freeze_owner; sb->s_writers.frozen = SB_FREEZE_COMPLETE; wake_up_var(&sb->s_writers.frozen); lockdep_sb_freeze_release(sb); @@ -2136,13 +2174,17 @@ EXPORT_SYMBOL(freeze_super); * removes that state without releasing the other state or unlocking the * filesystem. */ -static int thaw_super_locked(struct super_block *sb, enum freeze_holder who) +static int thaw_super_locked(struct super_block *sb, enum freeze_holder who, + const void *freeze_owner) { int error = -EINVAL; if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) goto out_unlock; + if (!may_unfreeze(sb, who, freeze_owner)) + goto out_unlock; + /* * All freezers share a single active reference. * So just unlock in case there are any left. 
@@ -2152,6 +2194,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who) if (sb_rdonly(sb)) { sb->s_writers.frozen = SB_UNFROZEN; + sb->s_writers.freeze_owner = NULL; wake_up_var(&sb->s_writers.frozen); goto out_deactivate; } @@ -2169,6 +2212,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who) } sb->s_writers.frozen = SB_UNFROZEN; + sb->s_writers.freeze_owner = NULL; wake_up_var(&sb->s_writers.frozen); sb_freeze_unlock(sb, SB_FREEZE_FS); out_deactivate: @@ -2184,6 +2228,7 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who) * thaw_super -- unlock filesystem * @sb: the super to thaw * @who: context that wants to freeze + * @freeze_owner: owner of the freeze * * Unlocks the filesystem and marks it writeable again after freeze_super() * if there are no remaining freezes on the filesystem. @@ -2197,13 +2242,14 @@ static int thaw_super_locked(struct super_block *sb, enum freeze_holder who) * have been frozen through the block layer via multiple block devices. * The filesystem remains frozen until all block devices are unfrozen. 
*/ -int thaw_super(struct super_block *sb, enum freeze_holder who) +int thaw_super(struct super_block *sb, enum freeze_holder who, + const void *freeze_owner) { if (!super_lock_excl(sb)) { WARN_ON_ONCE("Dying superblock while thawing!"); return -EINVAL; } - return thaw_super_locked(sb, who); + return thaw_super_locked(sb, who, freeze_owner); } EXPORT_SYMBOL(thaw_super); diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c index e629663e460a..9b598c5790ad 100644 --- a/fs/xfs/scrub/fscounters.c +++ b/fs/xfs/scrub/fscounters.c @@ -123,7 +123,7 @@ xchk_fsfreeze( { int error; - error = freeze_super(sc->mp->m_super, FREEZE_HOLDER_KERNEL); + error = freeze_super(sc->mp->m_super, FREEZE_HOLDER_KERNEL, NULL); trace_xchk_fsfreeze(sc, error); return error; } @@ -135,7 +135,7 @@ xchk_fsthaw( int error; /* This should always succeed, we have a kernel freeze */ - error = thaw_super(sc->mp->m_super, FREEZE_HOLDER_KERNEL); + error = thaw_super(sc->mp->m_super, FREEZE_HOLDER_KERNEL, NULL); trace_xchk_fsthaw(sc, error); return error; } diff --git a/fs/xfs/xfs_notify_failure.c b/fs/xfs/xfs_notify_failure.c index ed8d8ed42f0a..3545dc1d953c 100644 --- a/fs/xfs/xfs_notify_failure.c +++ b/fs/xfs/xfs_notify_failure.c @@ -127,7 +127,7 @@ xfs_dax_notify_failure_freeze( struct super_block *sb = mp->m_super; int error; - error = freeze_super(sb, FREEZE_HOLDER_KERNEL); + error = freeze_super(sb, FREEZE_HOLDER_KERNEL, NULL); if (error) xfs_emerg(mp, "already frozen by kernel, err=%d", error); @@ -143,7 +143,7 @@ xfs_dax_notify_failure_thaw( int error; if (kernel_frozen) { - error = thaw_super(sb, FREEZE_HOLDER_KERNEL); + error = thaw_super(sb, FREEZE_HOLDER_KERNEL, NULL); if (error) xfs_emerg(mp, "still frozen after notify failure, err=%d", error); @@ -153,7 +153,7 @@ xfs_dax_notify_failure_thaw( * Also thaw userspace call anyway because the device is about to be * removed immediately. 
*/ - thaw_super(sb, FREEZE_HOLDER_USERSPACE); + thaw_super(sb, FREEZE_HOLDER_USERSPACE, NULL); } static int diff --git a/include/linux/fs.h b/include/linux/fs.h index 1aa578412f1b..b379a46b5576 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1307,6 +1307,7 @@ struct sb_writers { unsigned short frozen; /* Is sb frozen? */ int freeze_kcount; /* How many kernel freeze requests? */ int freeze_ucount; /* How many userspace freeze requests? */ + const void *freeze_owner; /* Owner of the freeze */ struct percpu_rw_semaphore rw_sem[SB_FREEZE_LEVELS]; }; @@ -2270,6 +2271,7 @@ extern loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos, * @FREEZE_HOLDER_KERNEL: kernel wants to freeze or thaw filesystem * @FREEZE_HOLDER_USERSPACE: userspace wants to freeze or thaw filesystem * @FREEZE_MAY_NEST: whether nesting freeze and thaw requests is allowed + * @FREEZE_EXCL: whether actual freezing must be done by the caller * * Indicate who the owner of the freeze or thaw request is and whether * the freeze needs to be exclusive or can nest. 
@@ -2283,6 +2285,7 @@ enum freeze_holder { FREEZE_HOLDER_KERNEL = (1U << 0), FREEZE_HOLDER_USERSPACE = (1U << 1), FREEZE_MAY_NEST = (1U << 2), + FREEZE_EXCL = (1U << 3), }; struct super_operations { @@ -2296,9 +2299,9 @@ struct super_operations { void (*evict_inode) (struct inode *); void (*put_super) (struct super_block *); int (*sync_fs)(struct super_block *sb, int wait); - int (*freeze_super) (struct super_block *, enum freeze_holder who); + int (*freeze_super) (struct super_block *, enum freeze_holder who, const void *owner); int (*freeze_fs) (struct super_block *); - int (*thaw_super) (struct super_block *, enum freeze_holder who); + int (*thaw_super) (struct super_block *, enum freeze_holder who, const void *owner); int (*unfreeze_fs) (struct super_block *); int (*statfs) (struct dentry *, struct kstatfs *); int (*remount_fs) (struct super_block *, int *, char *); @@ -2706,8 +2709,10 @@ extern int unregister_filesystem(struct file_system_type *); extern int vfs_statfs(const struct path *, struct kstatfs *); extern int user_statfs(const char __user *, struct kstatfs *); extern int fd_statfs(int, struct kstatfs *); -int freeze_super(struct super_block *super, enum freeze_holder who); -int thaw_super(struct super_block *super, enum freeze_holder who); +int freeze_super(struct super_block *super, enum freeze_holder who, + const void *freeze_owner); +int thaw_super(struct super_block *super, enum freeze_holder who, + const void *freeze_owner); extern __printf(2, 3) int super_setup_bdi_name(struct super_block *sb, char *fmt, ...); extern int super_setup_bdi(struct super_block *sb); From patchwork Tue Apr 1 00:32:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 14034239 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4DC7013C9C4; Tue, 1 Apr 2025 00:33:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467627; cv=none; b=bElBNgR5uXIfFs7colUQoQxKhnLV6EUhMX80bn5u/kXO0yLUANbgatlBCewdEzYWydYQHa2JR4j9iJyagqh+dFuecIMbamIO3QQNWl1BhfIsCTa0bT/Yvw4ifIqV+OaahL/p8B9pO5SwJUXnPGf3GUhUEgMpoHOx/3aPzqwNKow= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467627; c=relaxed/simple; bh=VvL58nItWEHY6IVraVsEyEezYA4/drjDHy8bBXigLc4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=JKulsxw4CFv+wmM9r9PkJ7sm7aBnt3gx/JVlPfj1G8J7asG+qv+TjX3t6zLJ6vheMMiINIsyoxjBktMs70AP3yP/x56smNilZeeDjxENFLC6RLFlSZ+z3tTHMdv4lMvXfR6r0Eme37F9dS6EyUB2oMneQy9KClD9F5X408YWhUk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=VUzj7xFe; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="VUzj7xFe" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 92A40C4CEE3; Tue, 1 Apr 2025 00:33:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743467627; bh=VvL58nItWEHY6IVraVsEyEezYA4/drjDHy8bBXigLc4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VUzj7xFeZYRm9TdE+rSHJKpL67o+RauEPjAPJsQkvE4OleSXC2vC2Eze/SGpfTC5i 07CeC6poBG7hQoO19an/r8zVBwgxbYD+69GPOi7FN1SINAWVXE7GPuNfGP9Bvq2eA4 3ZDdRCbvGGACdADnNNW4d/wtVrhDPB2XE2uD0SmEasS7DGPhFj3cgdkcGwSdRCuiD4 cCm1pIxmg+0gUyjUQqPYR6gZcvXbtAlmYGk2sE1l7eNOPC4Ckf2+TTtnDaoUEUDdzC IseMuqqLhWI96zP8EFohiTiIpunWRvAM6uy07qWuXUEbtEVqnAf087MvfFriRGhixw AcKV2Dhy59Lcg== From: Christian Brauner To: linux-fsdevel@vger.kernel.org, jack@suse.cz Cc: Christian Brauner , Ard Biesheuvel , 
linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, James Bottomley , mcgrof@kernel.org, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [PATCH 5/6] fs: allow pagefault based writers to be frozen Date: Tue, 1 Apr 2025 02:32:50 +0200 Message-ID: <20250401-work-freeze-v1-5-d000611d4ab0@kernel.org> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> References: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Mailer: b4 0.15-dev-42535 X-Developer-Signature: v=1; a=openpgp-sha256; l=871; i=brauner@kernel.org; h=from:subject:message-id; bh=VvL58nItWEHY6IVraVsEyEezYA4/drjDHy8bBXigLc4=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaS/NrGdL29xpkQnZp9ExoTU6XfTLYRSHjQ68CRt++veo nDpvmZTRykLgxgXg6yYIotDu0m43HKeis1GmRowc1iZQIYwcHEKwER+hDAyvDgrtzNtxyTr/ylJ x9fuq5hom/lGkq3O1qnrrP2pzKuFagz/g4+WrlxvyLVxiVxzmPHxRUfdwspndixdWvBJ7orjuwc d7AA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 Otherwise tasks such as systemd-journald that mmap a file and write to it will not be frozen after we've frozen the filesystem. 
Signed-off-by: Christian Brauner --- include/linux/fs.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index b379a46b5576..528e73f192ac 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1782,7 +1782,8 @@ static inline void __sb_end_write(struct super_block *sb, int level) static inline void __sb_start_write(struct super_block *sb, int level) { percpu_down_read_freezable(sb->s_writers.rw_sem + level - 1, - level == SB_FREEZE_WRITE); + (level == SB_FREEZE_WRITE || + level == SB_FREEZE_PAGEFAULT)); } static inline bool __sb_start_write_trylock(struct super_block *sb, int level) From patchwork Tue Apr 1 00:32:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 14034240 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1B5213B2A0; Tue, 1 Apr 2025 00:33:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467632; cv=none; b=Z+/oJYO0ihm9NxwpUTUquADeobKSInDW0saw41fI2de7VnyQg5Uuc3P6FfoE6D0KagUuzXtg/44Uphz8Kv7ybizKse0Z4vfJU6iSV8Db8rw5NyitANS1dRfGJHVJNQKXX0jRNyL2bWurguDk8sp1kG77eRgN0lEo6GsmtVdzl3g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743467632; c=relaxed/simple; bh=E6uWVHHjoPhRtgz4oW7o0triLrpRj/wcC1z2auGXS0M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=SxPlW3k+Rm+ddP1BAoT+K9R3rvTce9+TG//f8U875xzJ0fURlYRYL4FxkoV0jjo5NaYUnSKQ9zWat3zsX50HCkGukgM+it1yoOsLY3kYLC/F83uoeQNHGP4azXK7t7kzb2WdPAbW2lnYDt2it+mVM8db9WQeGwnF9Kbyw4zIlZo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pMY+ijcA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pMY+ijcA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 949EAC4CEEB; Tue, 1 Apr 2025 00:33:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1743467631; bh=E6uWVHHjoPhRtgz4oW7o0triLrpRj/wcC1z2auGXS0M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pMY+ijcAI7yJeb3c2qVB4U/X3FzVcvqBCOGQLKKQV7m26dzqMtW8Gp7b4tNikTo/G 6h9wydXvkgrtPzqItacfLtD2I2uMUsg05Juigl86a4/8OrImYLPVf6I/TqbTlhFdy9 t5utlaStPLY7dQ2cAgqp5E0nADwx/ZEI3Rc40H9sUTDJD9oiAc0v2BM3l3JHnvqat4 NFpneF43NtP16WRGCDu7BUIc3A/xeavmb1xGp5/cfelw38N2ng19IwGfPtBSCZgZ5Y 9rBSWNPRW6verGErN85+OCku2Ug0Rsg9o+GsKT5Wx2s49yPdY0/L1bTxFMhVEtQgqm 8tnw0XU61XX+A== From: Christian Brauner To: linux-fsdevel@vger.kernel.org, jack@suse.cz Cc: Christian Brauner , Ard Biesheuvel , linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, James Bottomley , mcgrof@kernel.org, hch@infradead.org, david@fromorbit.com, rafael@kernel.org, djwong@kernel.org, pavel@kernel.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com Subject: [PATCH 6/6] power: freeze filesystems during suspend/resume Date: Tue, 1 Apr 2025 02:32:51 +0200 Message-ID: <20250401-work-freeze-v1-6-d000611d4ab0@kernel.org> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> References: <20250401-work-freeze-v1-0-d000611d4ab0@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Mailer: b4 0.15-dev-42535 X-Developer-Signature: v=1; a=openpgp-sha256; l=5971; i=brauner@kernel.org; h=from:subject:message-id; bh=E6uWVHHjoPhRtgz4oW7o0triLrpRj/wcC1z2auGXS0M=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaS/NrHVrnxZbjJVjeXVhehQe53H83N8Fpw+qHfdp7Gvy 
1fpZO6yjlIWBjEuBlkxRRaHdpNwueU8FZuNMjVg5rAygQxh4OIUgInMa2Bk2LXaaWmP/8On4jlT 1xbK7ua2ti88/GypceBnFffe5xvevmFkWM6QeoB5rrpYN0fI8oN8oWY2/Eq9f15psWTrz3M9OHE CLwA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 Now all the pieces are in place to actually allow the power subsystem to freeze/thaw filesystems during suspend/resume. Filesystems are only frozen and thawed if the power subsystem does actually own the freeze. Otherwise it risks thawing filesystems it didn't own. This could be done differently by, e.g., keeping the filesystems that were actually frozen on a list and then unfreezing them from that list. This is disgustingly unclean though and reeks of an ugly hack. If the filesystem is already frozen by the time we've frozen all userspace processes we don't care to freeze it again. That's userspace's job once the process resumes. We only actually freeze filesystems if we absolutely have to and we ignore other failures to freeze for now. We could bubble up errors and fail suspend/resume if the error isn't EBUSY (aka it's already frozen) but I don't think that this is worth it. Filesystem freezing during suspend/resume is best-effort. If the user has 500 ext4 filesystems mounted and 4 fail to freeze for whatever reason then we simply skip them. What we have now is already a big improvement and let's see how we fare with it before making our lives even harder (and uglier) than we have to.
Signed-off-by: Christian Brauner --- fs/super.c | 14 ++++++++++---- kernel/power/hibernate.c | 13 ++++++++++++- kernel/power/suspend.c | 8 ++++++++ 3 files changed, 30 insertions(+), 5 deletions(-) diff --git a/fs/super.c b/fs/super.c index 606072a3fab9..dd0d6def4a55 100644 --- a/fs/super.c +++ b/fs/super.c @@ -1187,6 +1187,8 @@ static inline bool get_active_super(struct super_block *sb) return active; } +static const void *filesystems_freeze_ptr; + static void filesystems_freeze_callback(struct super_block *sb, void *unused) { if (!sb->s_op->freeze_fs && !sb->s_op->freeze_super) @@ -1196,9 +1198,11 @@ static void filesystems_freeze_callback(struct super_block *sb, void *unused) return; if (sb->s_op->freeze_super) - sb->s_op->freeze_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_KERNEL); + sb->s_op->freeze_super(sb, FREEZE_EXCL | FREEZE_HOLDER_KERNEL, + filesystems_freeze_ptr); else - freeze_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_KERNEL); + freeze_super(sb, FREEZE_EXCL | FREEZE_HOLDER_KERNEL, + filesystems_freeze_ptr); deactivate_super(sb); } @@ -1218,9 +1222,11 @@ static void filesystems_thaw_callback(struct super_block *sb, void *unused) return; if (sb->s_op->thaw_super) - sb->s_op->thaw_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_KERNEL); + sb->s_op->thaw_super(sb, FREEZE_EXCL | FREEZE_HOLDER_KERNEL, + filesystems_freeze_ptr); else - thaw_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_KERNEL); + thaw_super(sb, FREEZE_EXCL | FREEZE_HOLDER_KERNEL, + filesystems_freeze_ptr); deactivate_super(sb); } diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index 50ec26ea696b..1803b7d24757 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -777,6 +777,7 @@ int hibernate(void) goto Restore; ksys_sync_helper(); + filesystems_freeze(); error = freeze_processes(); if (error) @@ -841,6 +842,7 @@ int hibernate(void) error = load_image_and_restore(); } thaw_processes(); + filesystems_thaw(); /* Don't bother checking whether freezer_test_done is true */ 
freezer_test_done = false; @@ -881,6 +883,8 @@ int hibernate_quiet_exec(int (*func)(void *data), void *data) if (error) goto restore; + filesystems_freeze(); + error = freeze_processes(); if (error) goto exit; @@ -940,6 +944,7 @@ int hibernate_quiet_exec(int (*func)(void *data), void *data) thaw_processes(); exit: + filesystems_thaw(); pm_notifier_call_chain(PM_POST_HIBERNATION); restore: @@ -1028,19 +1033,25 @@ static int software_resume(void) if (error) goto Restore; + filesystems_freeze(); + pm_pr_dbg("Preparing processes for hibernation restore.\n"); error = freeze_processes(); - if (error) + if (error) { + filesystems_thaw(); goto Close_Finish; + } error = freeze_kernel_threads(); if (error) { thaw_processes(); + filesystems_thaw(); goto Close_Finish; } error = load_image_and_restore(); thaw_processes(); + filesystems_thaw(); Finish: pm_notifier_call_chain(PM_POST_RESTORE); Restore: diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c index 8eaec4ab121d..4c476271f7f2 100644 --- a/kernel/power/suspend.c +++ b/kernel/power/suspend.c @@ -30,6 +30,7 @@ #include #include #include +#include #include "power.h" @@ -374,6 +375,8 @@ static int suspend_prepare(suspend_state_t state) if (error) goto Restore; + if (sync_on_suspend_enabled) + filesystems_freeze(); trace_suspend_resume(TPS("freeze_processes"), 0, true); error = suspend_freeze_processes(); trace_suspend_resume(TPS("freeze_processes"), 0, false); @@ -550,6 +553,8 @@ int suspend_devices_and_enter(suspend_state_t state) static void suspend_finish(void) { suspend_thaw_processes(); + if (sync_on_suspend_enabled) + filesystems_thaw(); pm_notifier_call_chain(PM_POST_SUSPEND); pm_restore_console(); } @@ -587,6 +592,7 @@ static int enter_state(suspend_state_t state) trace_suspend_resume(TPS("sync_filesystems"), 0, true); ksys_sync_helper(); trace_suspend_resume(TPS("sync_filesystems"), 0, false); + filesystems_freeze(); } pm_pr_dbg("Preparing system for sleep (%s)\n", mem_sleep_labels[state]); @@ -609,6 +615,8 
@@ static int enter_state(suspend_state_t state) pm_pr_dbg("Finishing wakeup.\n"); suspend_finish(); Unlock: + if (sync_on_suspend_enabled) + filesystems_thaw(); mutex_unlock(&system_transition_mutex); return error; }
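For readers following the ownership rules that the may_freeze()/may_unfreeze() hunks introduce, here is a small userspace model of them. It is only a sketch under stated assumptions, not kernel code: struct model_sb, model_freeze(), model_thaw() and model_demo() are hypothetical names made up for illustration, the counters are handled in a simplified way, and only the flag values and the permission checks are taken from the hunks in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as in the enum freeze_holder hunk of this series. */
enum freeze_holder {
	FREEZE_HOLDER_KERNEL    = (1U << 0),
	FREEZE_HOLDER_USERSPACE = (1U << 1),
	FREEZE_MAY_NEST         = (1U << 2),
	FREEZE_EXCL             = (1U << 3),
};

/* Hypothetical stand-in for the freeze fields of struct sb_writers. */
struct model_sb {
	int freeze_kcount;
	int freeze_ucount;
	const void *freeze_owner;
};

/* Mirrors the checks in may_freeze(), minus the WARN_ON_ONCE()s. */
static bool model_may_freeze(const struct model_sb *sb, unsigned int who)
{
	if (who & FREEZE_EXCL) {
		if (!(who & FREEZE_HOLDER_KERNEL))
			return false;
		if (who & ~(FREEZE_EXCL | FREEZE_HOLDER_KERNEL))
			return false;
		/* Exclusive freezes only succeed on an unfrozen fs. */
		return (sb->freeze_kcount + sb->freeze_ucount) == 0;
	}
	/* Already exclusively frozen: nobody else may freeze. */
	if (sb->freeze_owner)
		return false;
	if (who & FREEZE_HOLDER_KERNEL)
		return (who & FREEZE_MAY_NEST) || sb->freeze_kcount == 0;
	if (who & FREEZE_HOLDER_USERSPACE)
		return (who & FREEZE_MAY_NEST) || sb->freeze_ucount == 0;
	return false;
}

/* Mirrors the checks in may_unfreeze(), minus the WARN_ON_ONCE()s. */
static bool model_may_unfreeze(const struct model_sb *sb, unsigned int who,
			       const void *owner)
{
	if (who & FREEZE_EXCL) {
		if (sb->freeze_owner == NULL)
			return false;
		if (!(who & FREEZE_HOLDER_KERNEL))
			return false;
		if (who & ~(FREEZE_EXCL | FREEZE_HOLDER_KERNEL))
			return false;
		return sb->freeze_owner == owner;
	}
	return sb->freeze_owner == NULL;
}

/* Simplified freeze: only the permission check and the bookkeeping. */
int model_freeze(struct model_sb *sb, unsigned int who, const void *owner)
{
	if (!model_may_freeze(sb, who))
		return -EBUSY;
	if (sb->freeze_kcount + sb->freeze_ucount == 0)
		sb->freeze_owner = owner;	/* first freezer records the owner */
	if (who & FREEZE_HOLDER_KERNEL)
		sb->freeze_kcount++;
	else
		sb->freeze_ucount++;
	return 0;
}

/* Simplified thaw: refuses callers that do not match the recorded owner. */
int model_thaw(struct model_sb *sb, unsigned int who, const void *owner)
{
	int *count = (who & FREEZE_HOLDER_KERNEL) ? &sb->freeze_kcount
						  : &sb->freeze_ucount;

	if (*count == 0 || !model_may_unfreeze(sb, who, owner))
		return -EINVAL;
	(*count)--;
	if (sb->freeze_kcount + sb->freeze_ucount == 0)
		sb->freeze_owner = NULL;
	return 0;
}

/* Walks through the suspend case described in the commit message. */
void model_demo(void)
{
	struct model_sb sb = { 0 };
	int pm_cookie;	/* hypothetical owner token, like filesystems_freeze_ptr */

	assert(model_freeze(&sb, FREEZE_EXCL | FREEZE_HOLDER_KERNEL, &pm_cookie) == 0);
	/* Userspace can neither freeze nor thaw an exclusively frozen fs. */
	assert(model_freeze(&sb, FREEZE_HOLDER_USERSPACE, NULL) == -EBUSY);
	assert(model_thaw(&sb, FREEZE_HOLDER_USERSPACE, NULL) == -EINVAL);
	assert(model_thaw(&sb, FREEZE_EXCL | FREEZE_HOLDER_KERNEL, &pm_cookie) == 0);
}
```

Calling model_demo() from a test program exercises the suspend path. The opposite case, where userspace froze the filesystem first, falls out of the same checks: the exclusive freeze returns -EBUSY and the matching exclusive thaw returns -EINVAL, so the userspace freeze is left untouched.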