xfs: cancel intents immediately if process_intents fails

Message ID 20201019162917.GJ9832@magnolia
State Accepted
Series xfs: cancel intents immediately if process_intents fails

Commit Message

Darrick J. Wong Oct. 19, 2020, 4:29 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

If processing recovered log intent items fails, we need to cancel all
the unprocessed recovered items immediately so that a subsequent AIL
push in the bail out path won't get wedged on the pinned intent items
that didn't get processed.

This can happen if the log contains (1) an intent that gets and releases
an inode, (2) an intent that cannot be recovered successfully, and (3)
some third intent item.  When recovery of (2) fails, we leave (3) pinned
in memory.  Inode reclamation is called in the error-out path of
xfs_mountfs before xfs_log_mount_cancel.  Reclamation calls
xfs_ail_push_all_sync, which gets stuck waiting for (3).

Therefore, call xlog_recover_cancel_intents if
xlog_recover_process_intents fails.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c |    8 ++++++++
 1 file changed, 8 insertions(+)
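
The failure sequence in the commit message can be sketched as a small
userspace toy model. This is only an illustration under stated assumptions,
not kernel code: every toy_* name below is hypothetical, the "AIL" is just an
array, and the model leaves the failed item itself pinned along with the
items after it, which the real recovery code may handle differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_state { TOY_PINNED, TOY_DONE, TOY_CANCELLED };

struct toy_intent {
	enum toy_state	state;
	bool		will_fail;	/* simulates an unrecoverable intent */
};

/* Process intents in order and stop at the first failure. */
static int toy_process_intents(struct toy_intent *items, size_t n)
{
	assert(items != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (items[i].will_fail)
			return -1;	/* items[i..n-1] remain pinned */
		items[i].state = TOY_DONE;
	}
	return 0;
}

/* Release every intent that is still pinned. */
static void toy_cancel_intents(struct toy_intent *items, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (items[i].state == TOY_PINNED)
			items[i].state = TOY_CANCELLED;
}

/* A synchronous push can only complete when nothing is pinned. */
static bool toy_push_would_complete(const struct toy_intent *items, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (items[i].state == TOY_PINNED)
			return false;
	return true;
}

/*
 * Mirrors the shape of the patched error path: when cancel_on_error is
 * true, pinned intents are cancelled before the error is returned.
 */
static int toy_recover_finish(struct toy_intent *items, size_t n,
			      bool cancel_on_error)
{
	int error = toy_process_intents(items, n);

	if (error && cancel_on_error)
		toy_cancel_intents(items, n);
	return error;
}
```

With three intents where only the second has will_fail set, calling
toy_recover_finish() with cancel_on_error false leaves pinned items behind,
so toy_push_would_complete() returns false. That corresponds to the stuck
xfs_ail_push_all_sync described above. With cancel_on_error true, the same
input still returns the error, but the later push check succeeds.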

Comments

Brian Foster Oct. 20, 2020, 10:38 a.m. UTC | #1
On Mon, Oct 19, 2020 at 09:29:17AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If processing recovered log intent items fails, we need to cancel all
> the unprocessed recovered items immediately so that a subsequent AIL
> push in the bail out path won't get wedged on the pinned intent items
> that didn't get processed.
> 
> This can happen if the log contains (1) an intent that gets and releases
> an inode, (2) an intent that cannot be recovered successfully, and (3)
> some third intent item.  When recovery of (2) fails, we leave (3) pinned
> in memory.  Inode reclamation is called in the error-out path of
> xfs_mountfs before xfs_log_mount_cancel.  Reclamation calls
> xfs_ail_push_all_sync, which gets stuck waiting for (3).
> 
> Therefore, call xlog_recover_cancel_intents if
> xlog_recover_process_intents fails.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/xfs_log_recover.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index a8289adc1b29..87886b7f77da 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -3446,6 +3446,14 @@ xlog_recover_finish(
>  		int	error;
>  		error = xlog_recover_process_intents(log);
>  		if (error) {
> +			/*
> +			 * Cancel all the unprocessed intent items now so that
> +			 * we don't leave them pinned in the AIL.  This can
> +			 * cause the AIL to livelock on the pinned item if
> +			 * anyone tries to push the AIL (inode reclaim does
> +			 * this) before we get around to xfs_log_mount_cancel.
> +			 */
> +			xlog_recover_cancel_intents(log);
>  			xfs_alert(log->l_mp, "Failed to recover intents");
>  			return error;
>  		}
>
Patch

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index a8289adc1b29..87886b7f77da 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3446,6 +3446,14 @@  xlog_recover_finish(
 		int	error;
 		error = xlog_recover_process_intents(log);
 		if (error) {
+			/*
+			 * Cancel all the unprocessed intent items now so that
+			 * we don't leave them pinned in the AIL.  This can
+			 * cause the AIL to livelock on the pinned item if
+			 * anyone tries to push the AIL (inode reclaim does
+			 * this) before we get around to xfs_log_mount_cancel.
+			 */
+			xlog_recover_cancel_intents(log);
 			xfs_alert(log->l_mp, "Failed to recover intents");
 			return error;
 		}