[RFC] xfs: Prevent umount from indefinitely waiting on XFS_IFLUSHING flag on stale inodes

Executing xfs/057 can cause an unmount task to wait indefinitely for the
XFS_IFLUSHING flag on some inodes to be cleared. The following timeline
describes how inodes can get into such a state.

  Task A               Task B                      Iclog endio processing
  ----------------------------------------------------------------------------
  Inodes are freed

  Inode items are
  added to the CIL

  CIL contents are
  written to iclog

  iclog->ic_fail_crc
  is set to true

  iclog is submitted
  for writing to the
  disk

                       Last inode in the cluster
                       buffer is freed

                       XFS_[ISTALE/IFLUSHING] is
                       set on all inodes in the
                       cluster buffer

                       XFS_STALE is set on
                       the cluster buffer
                                                   iclog crc error is detected
                       ...                         during endio processing

                       During xfs_trans_commit,    Set XFS_LI_ABORTED on inode
                       log shutdown is detected    items

                       XFS_LI_ABORTED is set       xfs_inode_item_committed()
                       on xfs_buf_log_item         - Unpin the inode since it
                                                   is stale and return -1
                       xfs_buf_log_item is freed
                                                   Inode log items are not
                       xfs_buf is not freed here   processed further since
                       since b_hold has a          xfs_inode_item_committed()
                       non-zero value              returns -1

During normal operation, stale inodes are processed by
xfs_buf_item_unpin() => xfs_buf_inode_iodone(), which ends up calling
xfs_iflush_abort() to clear the XFS_IFLUSHING flag. In this bug, however,
the xfs_buf_log_item is freed just before the high-level transaction would
have been committed to the CIL, so xfs_buf_item_unpin() never runs for the
cluster buffer and XFS_IFLUSHING stays set on its stale inodes.

To fix this, this commit removes the log shutdown check from the
high-level transaction commit path. The log items in the high-level
transaction are now committed to the CIL even when the log is shut
down. This allows the CIL processing logic (i.e. xlog_cil_push_work())
to invoke xlog_cil_committed() as part of error handling, which unpins
the xfs_buf log item, aborts the corresponding inodes, and clears their
XFS_IFLUSHING flag.
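
The difference between the two behaviours can be illustrated with a small
userspace toy model. This is only a sketch and not kernel code: every toy_*
name is hypothetical, the structures are drastically simplified, and the
single "unpin" helper stands in for the whole xfs_buf_item_unpin() =>
xfs_buf_inode_iodone() => xfs_iflush_abort() chain described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the names only echo the flags in this message. */
enum { TOY_ISTALE = 1u << 0, TOY_IFLUSHING = 1u << 1 };

struct toy_inode {
	unsigned int flags;
};

struct toy_cluster_buf {
	struct toy_inode *inodes;
	size_t nr_inodes;
	bool stale;
	bool in_cil;	/* buf log item reached the CIL */
};

/* Task B: freeing the last inode marks every inode stale and flushing. */
static void toy_stale_cluster(struct toy_cluster_buf *bp)
{
	bp->stale = true;
	for (size_t i = 0; i < bp->nr_inodes; i++)
		bp->inodes[i].flags |= TOY_ISTALE | TOY_IFLUSHING;
}

/* Stand-in for the unpin => iodone => iflush abort chain. */
static void toy_buf_item_unpin(struct toy_cluster_buf *bp)
{
	for (size_t i = 0; i < bp->nr_inodes; i++)
		bp->inodes[i].flags &= ~TOY_IFLUSHING;
}

/*
 * Transaction commit. With check_shutdown set (the old behaviour), a
 * shut-down log makes the commit bail out before the buf log item
 * reaches the CIL, so nothing will ever unpin it.
 */
static void toy_trans_commit(struct toy_cluster_buf *bp, bool log_shutdown,
			     bool check_shutdown)
{
	if (check_shutdown && log_shutdown)
		return;
	bp->in_cil = true;
}

/* CIL push error handling: abort every item that made it into the CIL. */
static void toy_cil_push(struct toy_cluster_buf *bp)
{
	if (bp->in_cil) {
		toy_buf_item_unpin(bp);
		bp->in_cil = false;
	}
}

/* Number of inodes an unmount would still be waiting on. */
static size_t toy_stuck_inodes(const struct toy_cluster_buf *bp)
{
	size_t stuck = 0;

	for (size_t i = 0; i < bp->nr_inodes; i++)
		if (bp->inodes[i].flags & TOY_IFLUSHING)
			stuck++;
	return stuck;
}

/* Run the timeline with the log already shut down at commit time. */
static size_t toy_run(bool check_shutdown)
{
	struct toy_inode inodes[4] = { { 0 } };
	struct toy_cluster_buf buf = { inodes, 4, false, false };

	toy_stale_cluster(&buf);
	toy_trans_commit(&buf, true, check_shutdown);
	toy_cil_push(&buf);
	return toy_stuck_inodes(&buf);
}
```

In this model, toy_run(true) leaves all four inodes with the flushing bit
set (the hang), while toy_run(false) leaves none, because the item reaches
the CIL and the error path gets a chance to abort it.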

Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
---
PS: I have tested this patch by executing xfs/057 in a loop for about 24 hours.
On an unpatched kernel, this issue reproduces within 24 hours.

 fs/xfs/xfs_trans.c | 11 -----------
 1 file changed, 11 deletions(-)

Message ID	20240902075045.1037365-1-chandanbabu@kernel.org (mailing list archive)
State	Not Applicable, archived
From: Chandan Babu R <chandanbabu@kernel.org>
To: linux-xfs@vger.kernel.org
Cc: Chandan Babu R <chandanbabu@kernel.org>
Subject: [RFC PATCH] xfs: Prevent umount from indefinitely waiting on XFS_IFLUSHING flag on stale inodes
Date: Mon, 2 Sep 2024 13:20:41 +0530