From patchwork Wed Jun 16 23:55:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 12326295
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 174B6C48BE5
	for ; Wed, 16 Jun 2021 23:55:45 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id F2ABD6113C
	for ; Wed, 16 Jun 2021 23:55:44 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S234511AbhFPX5u (ORCPT ); Wed, 16 Jun 2021 19:57:50 -0400
Received: from mail.kernel.org ([198.145.29.99]:55218 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S229503AbhFPX5u (ORCPT ); Wed, 16 Jun 2021 19:57:50 -0400
Received: by mail.kernel.org (Postfix) with ESMTPSA id DEA1F60FD7;
	Wed, 16 Jun 2021 23:55:43 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1623887744;
	bh=V+Om14e3dFOtK4Atx9RV7J+I68Z2jf2MKd3eJwEIQaU=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=HuM1j5iha3OpU/FgqkgNhpsVbx9H+MLRQWYUoSPwuvdOXub6kfm3hUS/kEgfiSQHt
	 XipCIh6MeRsj5FsGH8mxsVfJsvM/mPVVUzVyX1EbDMjeF66sfJ3Ywqm0c7boQXLko0
	 ZJuAv3nFQZ7kLJ5WR0fDA+InkR5/RjYmBR6K3+GWWBx5Odvm7C1fXp7d/SRXB4jpn7
	 4JK2pblPyStKMXp2b4CSYzArj8ath+NzS1f+cLFLCK9JXc4gajBXk3dflQZVQrFiI4
	 YGguTeNuzqnfgyKqm5YJoSJ0Y+FNhPdEyTm6IPg7w3ZO7pJ+fnK7z4MEsTbqpziQWA
	 Cmeko1LvYMmEg==
Subject: [PATCH 1/2] xfs: fix log intent recovery ENOSPC shutdowns when
 inactivating inodes
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Wed, 16 Jun 2021 16:55:43 -0700
Message-ID: <162388774359.3427167.14326615553028119265.stgit@locust>
In-Reply-To: <162388773802.3427167.4556309820960423454.stgit@locust>
References: <162388773802.3427167.4556309820960423454.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

During regular operation, the xfs_inactive operations create
transactions with zero block reservation because in general we're
freeing space, not asking for more. The per-AG space reservations
created at mount time enable us to handle expansions of the refcount
btree without needing to reserve blocks to the transaction.

Unfortunately, log recovery doesn't create the per-AG space reservations
when intent items are being recovered. This isn't an issue for intent
item recovery itself because they explicitly request blocks, but any
inode inactivation that can happen during log recovery uses the same
xfs_inactive paths as regular runtime. If a refcount btree expansion
happens, the transaction will fail due to blk_res_used > blk_res, and we
shut down the filesystem unnecessarily.

Fix this problem by making per-AG reservations temporarily so that we
can handle the inactivations, and releasing them at the end. This brings
the recovery environment closer to the runtime environment.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_mount.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index c3a96fb3ad80..d0755494597f 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -859,9 +859,17 @@ xfs_mountfs(
 	/*
 	 * Finish recovering the file system. This part needed to be delayed
 	 * until after the root and real-time bitmap inodes were consistently
-	 * read in.
+	 * read in. Temporarily create per-AG space reservations for metadata
+	 * btree shape changes because space freeing transactions (for inode
+	 * inactivation) require the per-AG reservation in lieu of reserving
+	 * blocks.
 	 */
+	error = xfs_fs_reserve_ag_blocks(mp);
+	if (error && error == -ENOSPC)
+		xfs_warn(mp,
+	"ENOSPC reserving per-AG metadata pool, log recovery may fail.");
 	error = xfs_log_mount_finish(mp);
+	xfs_fs_unreserve_ag_blocks(mp);
 	if (error) {
 		xfs_warn(mp, "log mount finish failed");
 		goto out_rtunmount;

From patchwork Wed Jun 16 23:55:49 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 12326297
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.2 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 9ED2AC48BE5
	for ; Wed, 16 Jun 2021 23:55:50 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 820C06113C
	for ; Wed, 16 Jun 2021 23:55:50 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230043AbhFPX54 (ORCPT ); Wed, 16 Jun 2021 19:57:56 -0400
Received: from mail.kernel.org ([198.145.29.99]:55238 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S229503AbhFPX54 (ORCPT ); Wed, 16 Jun 2021 19:57:56 -0400
Received: by mail.kernel.org (Postfix) with ESMTPSA id 68ED460FD7;
	Wed, 16 Jun 2021 23:55:49 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1623887749;
	bh=BKJdqn3hJ1UqKIB6W3bjTrjzSzLqXM6IiA2AUQvQ/wM=;
	h=Subject:From:To:Cc:Date:In-Reply-To:References:From;
	b=JIbYhqxjsjtHh87gpXE1jaGlTK5hlDJ1rPtKybjVDXDiTrQckg2j+tQqSwTg8SrId
	 2Cx2g51QbtrOA8/ZxNI/hy5JnC305UQBYU/O07hQu2YwMCFtBsp4fFWaW/LpeIzsdO
	 f4Pn+inO+EodehdroEIsobe1+xZIsapLE0UES0ECH1p8R83NAQbqscZ9d7POpyLPT5
	 BgUQgfOiMv2xXmysN6KxzpPpX8YpxtfkPJD0Fl0qkPssE1HV3d0CqBUnJ2EdcMaNxD
	 3Py4T5XBSz5gJXSByY5q9uTTFRvFr2DHl1voF7k6N2o7xlWlHIKtjafvP2NvGwn1sU
	 3WivRSOVMG+ig==
Subject: [PATCH 2/2] xfs: force the log offline when log intent item recovery
 fails
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Wed, 16 Jun 2021 16:55:49 -0700
Message-ID: <162388774909.3427167.8813765394953438195.stgit@locust>
In-Reply-To: <162388773802.3427167.4556309820960423454.stgit@locust>
References: <162388773802.3427167.4556309820960423454.stgit@locust>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

If any part of log intent item recovery fails, we should shut down the
log immediately to stop the log from writing a clean unmount record to
disk, because the metadata is not consistent. The inability to cancel a
dirty transaction catches most of these cases, but there are a few
things that have slipped through the cracks, such as ENOSPC from a
transaction allocation, or runtime errors that result in cancellation of
a non-dirty transaction.

This solves some weird behaviors reported by customers where a system
goes down, the first mount fails, the second succeeds, but then the fs
goes down later because of inconsistent metadata.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_log.c         |    3 +++
 fs/xfs/xfs_log_recover.c |    5 ++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index e921b554b683..f945df46c7e1 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -776,6 +776,9 @@ xfs_log_mount_finish(
 	if (readonly)
 		mp->m_flags |= XFS_MOUNT_RDONLY;
 
+	/* Make sure the log is dead if we're returning failure. */
+	ASSERT(!error || (mp->m_log->l_flags & XLOG_IO_ERROR));
+
 	return error;
 }
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 1227503d2246..1721fce2ec94 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2458,8 +2458,10 @@ xlog_finish_defer_ops(
 
 		error = xfs_trans_alloc(mp, &resv, dfc->dfc_blkres,
 				dfc->dfc_rtxres, XFS_TRANS_RESERVE, &tp);
-		if (error)
+		if (error) {
+			xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR);
 			return error;
+		}
 
 		/*
 		 * Transfer to this new transaction all the dfops we captured
@@ -3449,6 +3451,7 @@ xlog_recover_finish(
 			 * this) before we get around to xfs_log_mount_cancel.
 			 */
 			xlog_recover_cancel_intents(log);
+			xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
 			xfs_alert(log->l_mp, "Failed to recover intents");
 			return error;
 		}