From patchwork Fri Apr 17 15:08:56 2020
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 11495383
From: Brian Foster <bfoster@redhat.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 09/12] xfs: elide the AIL lock on log item failure tracking
Date: Fri, 17 Apr 2020 11:08:56 -0400
Message-Id: <20200417150859.14734-10-bfoster@redhat.com>
In-Reply-To: <20200417150859.14734-1-bfoster@redhat.com>
References: <20200417150859.14734-1-bfoster@redhat.com>
X-Mailing-List: linux-xfs@vger.kernel.org

The log item failed flag is used to indicate that a log item was flushed
but the underlying buffer was not successfully written to disk. If the
error configuration allows for writeback retries, xfsaild uses the flag to
resubmit the underlying buffer without acquiring the flush lock of the
item. The flag is currently set and cleared under the AIL lock and buffer
lock: it is set in the ->iop_error() callback, invoked via ->b_iodone() at
I/O completion time, and it is checked at xfsaild push time under the AIL
lock and cleared under the buffer lock before resubmission. If I/O
eventually succeeds, the flag is cleared in the _done() handler for the
associated item type, again under the AIL lock and buffer lock.
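To make the lifecycle above easier to follow, here is a condensed
userspace model of the failure tracking scheme: an atomic failed bit
paired with a temporarily held buffer pointer, plus the push-time
trylock-and-recheck that the rest of this patch relies on. This is only an
illustrative sketch, not kernel code: struct log_item, struct buf,
item_set_failed(), item_clear_failed() and push_failed() are simplified
stand-ins for struct xfs_log_item, struct xfs_buf, xfs_set_li_failed(),
xfs_clear_li_failed() and the xfsaild resubmit path, and a pthread mutex
stands in for the buffer lock.

/*
 * Illustrative userspace model only; see the caveats above. The failed
 * bit and the held buffer pointer change together, and only while the
 * (modeled) buffer lock is held.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct buf {
        pthread_mutex_t lock;   /* stands in for the xfs_buf buffer lock */
        int             hold;   /* stands in for the buffer hold count */
};

struct log_item {
        atomic_bool     failed; /* stands in for XFS_LI_FAILED */
        struct buf      *buf;   /* stands in for lip->li_buf */
};

/* I/O completion path: called with bp->lock held. */
static void item_set_failed(struct log_item *lip, struct buf *bp)
{
        if (!atomic_exchange(&lip->failed, true)) {
                bp->hold++;             /* models xfs_buf_hold() */
                lip->buf = bp;
        }
}

/* Resubmit or successful completion path: called with the buffer lock held. */
static void item_clear_failed(struct log_item *lip)
{
        struct buf *bp = lip->buf;

        if (atomic_exchange(&lip->failed, false)) {
                lip->buf = NULL;
                bp->hold--;             /* models xfs_buf_rele() */
        }
}

/* Models the xfsaild resubmit check: trylock the buffer, re-verify the bit. */
static bool push_failed(struct log_item *lip)
{
        struct buf *bp = lip->buf;      /* racy read, hence the recheck below */

        if (!bp || pthread_mutex_trylock(&bp->lock))
                return false;           /* treat as locked, retry later */
        if (!atomic_load(&lip->failed)) {
                pthread_mutex_unlock(&bp->lock);
                return false;           /* raced with I/O completion */
        }
        item_clear_failed(lip);
        pthread_mutex_unlock(&bp->lock);
        return true;                    /* buffer would be requeued for write */
}

int main(void)
{
        struct buf bp = { .hold = 1 };
        struct log_item lip = { .buf = NULL };
        bool requeued;

        pthread_mutex_init(&bp.lock, NULL);

        /* Simulate a failed write completion with the buffer locked. */
        pthread_mutex_lock(&bp.lock);
        item_set_failed(&lip, &bp);
        pthread_mutex_unlock(&bp.lock);

        requeued = push_failed(&lip);
        printf("requeued: %d, remaining holds: %d\n", requeued, bp.hold);
        return 0;
}

The property the model captures is that the bit and the pointer only ever
change together while the buffer lock is held, which is why the push side
can trylock the buffer and re-check the bit instead of relying on the AIL
lock, much as xfsaild_push_failed() does in the diff below.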
As far as I can tell, the only reason for holding the AIL lock across
sets/clears is to manage consistency between the log item bitop state and
the temporary buffer pointer that is attached to the log item. The bit
itself is used to manage consistency of the attached buffer, but is not
enough to guarantee the buffer is still attached by the time xfsaild
attempts to access it. However, since failure state is always set or
cleared under the buffer lock (either via I/O completion or xfsaild), this
particular case can be handled at item push time by verifying failure
state once under the buffer lock.

Remove the AIL lock protection from the various bits of log item failure
state management and simplify the surrounding code where applicable. Use
the buffer lock in the xfsaild resubmit code to detect failure state
changes and temporarily treat the item as locked.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_buf_item.c   |  4 ----
 fs/xfs/xfs_dquot.c      | 41 +++++++++++++++--------------------------
 fs/xfs/xfs_inode_item.c | 11 +++--------
 fs/xfs/xfs_trans_ail.c  | 12 ++++++++++--
 fs/xfs/xfs_trans_priv.h |  5 -----
 5 files changed, 28 insertions(+), 45 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 72d37a4609d8..6b000f855e13 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -1046,7 +1046,6 @@ xfs_buf_do_callbacks_fail(
         struct xfs_buf          *bp)
 {
         struct xfs_log_item     *lip;
-        struct xfs_ail          *ailp;
 
         /*
          * Buffer log item errors are handled directly by xfs_buf_item_push()
@@ -1058,13 +1057,10 @@ xfs_buf_do_callbacks_fail(
 
         lip = list_first_entry(&bp->b_li_list, struct xfs_log_item,
                         li_bio_list);
-        ailp = lip->li_ailp;
-        spin_lock(&ailp->ail_lock);
         list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
                 if (lip->li_ops->iop_error)
                         lip->li_ops->iop_error(lip, bp);
         }
-        spin_unlock(&ailp->ail_lock);
 }
 
 static bool
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 41750f797861..953059235130 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -1023,34 +1023,23 @@ xfs_qm_dqflush_done(
         struct xfs_ail          *ailp = lip->li_ailp;
 
         /*
-         * We only want to pull the item from the AIL if its
-         * location in the log has not changed since we started the flush.
-         * Thus, we only bother if the dquot's lsn has
-         * not changed. First we check the lsn outside the lock
-         * since it's cheaper, and then we recheck while
-         * holding the lock before removing the dquot from the AIL.
+         * Only pull the item from the AIL if its location in the log has not
+         * changed since it was flushed. Do a lockless check first to reduce
+         * lock traffic.
          */
-        if (test_bit(XFS_LI_IN_AIL, &lip->li_flags) &&
-            ((lip->li_lsn == qip->qli_flush_lsn) ||
-             test_bit(XFS_LI_FAILED, &lip->li_flags))) {
-
-                /* xfs_trans_ail_delete() drops the AIL lock. */
-                spin_lock(&ailp->ail_lock);
-                if (lip->li_lsn == qip->qli_flush_lsn) {
-                        xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
-                } else {
-                        /*
-                         * Clear the failed state since we are about to drop the
-                         * flush lock
-                         */
-                        xfs_clear_li_failed(lip);
-                        spin_unlock(&ailp->ail_lock);
-                }
-        }
+        if (!test_bit(XFS_LI_IN_AIL, &lip->li_flags) ||
+            lip->li_lsn != qip->qli_flush_lsn)
+                goto out;
 
-        /*
-         * Release the dq's flush lock since we're done with it.
-         */
+        spin_lock(&ailp->ail_lock);
+        if (lip->li_lsn == qip->qli_flush_lsn)
+                /* xfs_trans_ail_delete() drops the AIL lock */
+                xfs_trans_ail_delete(ailp, lip, SHUTDOWN_CORRUPT_INCORE);
+        else
+                spin_unlock(&ailp->ail_lock);
+
+out:
+        xfs_clear_li_failed(lip);
         xfs_dqfunlock(dqp);
 }
 
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 1d4d256a2e96..0ae61844b224 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -682,9 +682,7 @@ xfs_iflush_done(
          * Scan the buffer IO completions for other inodes being completed and
          * attach them to the current inode log item.
          */
-
         list_add_tail(&lip->li_bio_list, &tmp);
-
         list_for_each_entry_safe(blip, n, &bp->b_li_list, li_bio_list) {
                 if (lip->li_cb != xfs_iflush_done)
                         continue;
@@ -695,15 +693,13 @@ xfs_iflush_done(
                  * the AIL lock.
                  */
                 iip = INODE_ITEM(blip);
-                if ((iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn) ||
-                    test_bit(XFS_LI_FAILED, &blip->li_flags))
+                if (iip->ili_logged && blip->li_lsn == iip->ili_flush_lsn)
                         need_ail++;
         }
 
         /* make sure we capture the state of the initial inode. */
         iip = INODE_ITEM(lip);
-        if ((iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn) ||
-            test_bit(XFS_LI_FAILED, &lip->li_flags))
+        if (iip->ili_logged && lip->li_lsn == iip->ili_flush_lsn)
                 need_ail++;
 
         /*
@@ -732,8 +728,6 @@ xfs_iflush_done(
                                 xfs_lsn_t lsn = xfs_ail_delete_one(ailp, blip);
                                 if (!tail_lsn && lsn)
                                         tail_lsn = lsn;
-                        } else {
-                                xfs_clear_li_failed(blip);
                         }
                 }
                 xfs_ail_update_finish(ailp, tail_lsn);
@@ -746,6 +740,7 @@ xfs_iflush_done(
          */
         list_for_each_entry_safe(blip, n, &tmp, li_bio_list) {
                 list_del_init(&blip->li_bio_list);
+                xfs_clear_li_failed(blip);
                 iip = INODE_ITEM(blip);
                 iip->ili_logged = 0;
                 iip->ili_last_fields = 0;
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 0c709651a2c6..6af609070143 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -368,15 +368,23 @@ xfsaild_push_failed(
 {
         struct xfs_buf          *bp = lip->li_buf;
 
-        if (!xfs_buf_trylock(bp))
+        /*
+         * Log item state bits are racy so we cannot assume the temporary buffer
+         * pointer is set. Treat the item as locked if the pointer is clear or
+         * the failure state has changed once we've locked out I/O completion.
+         */
+        if (!bp || !xfs_buf_trylock(bp))
                 return XFS_ITEM_LOCKED;
+        if (!test_bit(XFS_LI_FAILED, &lip->li_flags)) {
+                xfs_buf_unlock(bp);
+                return XFS_ITEM_LOCKED;
+        }
 
         if (!xfs_buf_delwri_queue(bp, buffer_list)) {
                 xfs_buf_unlock(bp);
                 return XFS_ITEM_FLUSHING;
         }
 
-        /* protected by ail_lock */
         list_for_each_entry(lip, &bp->b_li_list, li_bio_list)
                 xfs_clear_li_failed(lip);
 
diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
index 35655eac01a6..9135afdcee9d 100644
--- a/fs/xfs/xfs_trans_priv.h
+++ b/fs/xfs/xfs_trans_priv.h
@@ -158,9 +158,6 @@ xfs_clear_li_failed(
 {
         struct xfs_buf  *bp = lip->li_buf;
 
-        ASSERT(test_bit(XFS_LI_IN_AIL, &lip->li_flags));
-        lockdep_assert_held(&lip->li_ailp->ail_lock);
-
         if (test_and_clear_bit(XFS_LI_FAILED, &lip->li_flags)) {
                 lip->li_buf = NULL;
                 xfs_buf_rele(bp);
@@ -172,8 +169,6 @@ xfs_set_li_failed(
         struct xfs_log_item     *lip,
         struct xfs_buf          *bp)
 {
-        lockdep_assert_held(&lip->li_ailp->ail_lock);
-
         if (!test_and_set_bit(XFS_LI_FAILED, &lip->li_flags)) {
                 xfs_buf_hold(bp);
                 lip->li_buf = bp;