From patchwork Thu Feb 27 13:43:13 2020
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 11408775
From: Brian Foster
To: linux-xfs@vger.kernel.org
Subject: [RFC v5 PATCH 1/9] xfs: set t_task at wait time instead of alloc time
Date: Thu, 27 Feb 2020 08:43:13 -0500
Message-Id: <20200227134321.7238-2-bfoster@redhat.com>
In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com>
References: <20200227134321.7238-1-bfoster@redhat.com>

The xlog_ticket structure contains a task reference to support blocking
for available log reservation. This reference is assigned at ticket
allocation time, which assumes that the transaction allocator will
acquire reservation in the same context. This is normally true, but
will not always be the case with automatic relogging.

There is otherwise no fundamental reason log space cannot be reserved
for a ticket from a context different from the allocating context. Move
the task assignment to the log reservation blocking code where it is
used.

Signed-off-by: Brian Foster
Reviewed-by: Allison Collins
---
 fs/xfs/xfs_log.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index f6006d94a581..df60942a9804 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -262,6 +262,7 @@ xlog_grant_head_wait(
 	int need_bytes) __releases(&head->lock)
 			__acquires(&head->lock)
 {
+	tic->t_task = current;
 	list_add_tail(&tic->t_queue, &head->waiters);

 	do {
@@ -3601,7 +3602,6 @@ xlog_ticket_alloc(
 	unit_res = xfs_log_calc_unit_res(log->l_mp, unit_bytes);

 	atomic_set(&tic->t_ref, 1);
-	tic->t_task = current;
 	INIT_LIST_HEAD(&tic->t_queue);
 	tic->t_unit_res = unit_res;
 	tic->t_curr_res = unit_res;

From patchwork Thu Feb 27 13:43:14 2020
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 11408653
From: Brian Foster
To: linux-xfs@vger.kernel.org
Subject: [RFC v5 PATCH 2/9] xfs: introduce ->tr_relog transaction
Date: Thu, 27 Feb 2020 08:43:14 -0500
Message-Id: <20200227134321.7238-3-bfoster@redhat.com>
In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com>
References: <20200227134321.7238-1-bfoster@redhat.com>

Create a transaction reservation specifically for relog transactions.
For now it only supports the quotaoff intent, so use the associated
reservation.

Signed-off-by: Brian Foster
Reviewed-by: Allison Collins
---
 fs/xfs/libxfs/xfs_trans_resv.c | 15 +++++++++++++++
 fs/xfs/libxfs/xfs_trans_resv.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
index 7a9c04920505..1f5c9e6e1afc 100644
--- a/fs/xfs/libxfs/xfs_trans_resv.c
+++ b/fs/xfs/libxfs/xfs_trans_resv.c
@@ -832,6 +832,17 @@ xfs_calc_sb_reservation(
 	return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
 }

+/*
+ * Internal relog transaction.
+ *   quotaoff intent
+ */
+STATIC uint
+xfs_calc_relog_reservation(
+	struct xfs_mount *mp)
+{
+	return xfs_calc_qm_quotaoff_reservation(mp);
+}
+
 void
 xfs_trans_resv_calc(
 	struct xfs_mount *mp,
@@ -946,4 +957,8 @@ xfs_trans_resv_calc(
 	resp->tr_clearagi.tr_logres = xfs_calc_clear_agi_bucket_reservation(mp);
 	resp->tr_growrtzero.tr_logres = xfs_calc_growrtzero_reservation(mp);
 	resp->tr_growrtfree.tr_logres = xfs_calc_growrtfree_reservation(mp);
+
+	resp->tr_relog.tr_logres = xfs_calc_relog_reservation(mp);
+	resp->tr_relog.tr_logcount = XFS_DEFAULT_PERM_LOG_COUNT;
+	resp->tr_relog.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
 }
diff --git a/fs/xfs/libxfs/xfs_trans_resv.h b/fs/xfs/libxfs/xfs_trans_resv.h
index 7241ab28cf84..b723979cad09 100644
--- a/fs/xfs/libxfs/xfs_trans_resv.h
+++ b/fs/xfs/libxfs/xfs_trans_resv.h
@@ -50,6 +50,7 @@ struct xfs_trans_resv {
 	struct xfs_trans_res tr_qm_equotaoff;/* end of turn quota off */
 	struct xfs_trans_res tr_sb; /* modify superblock */
 	struct xfs_trans_res tr_fsyncts; /* update timestamps on fsync */
+	struct xfs_trans_res tr_relog; /* internal relog transaction */
 };

 /* shorthand way of accessing reservation structure */

From patchwork Thu Feb 27 13:43:15 2020
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 11408655
From: Brian Foster
To: linux-xfs@vger.kernel.org
Subject: [RFC v5 PATCH 3/9] xfs: automatic relogging reservation management
Date: Thu, 27 Feb 2020 08:43:15 -0500
Message-Id: <20200227134321.7238-4-bfoster@redhat.com>
In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com>
References: <20200227134321.7238-1-bfoster@redhat.com>

Automatic item relogging will occur from xfsaild context. xfsaild
cannot acquire log reservation itself because it is also responsible
for writeback and thus making used log reservation available again.
Since there is no guarantee log reservation is available by the time a
relogged item reaches the AIL, this is prone to deadlock.

To guarantee log reservation for automatic relogging, implement a
reservation management scheme where a transaction that is capable of
enabling relogging of an item must contribute the necessary reservation
to the relog mechanism up front. Use reference counting to associate
the lifetime of pending relog reservation with the lifetime of in-core
log items with relogging enabled.

The basic log reservation sequence for a relog enabled transaction is
as follows:

- A transaction that uses relogging specifies XFS_TRANS_RELOG at
  allocation time.
- Once initialized, RELOG transactions check for the existence of the
  global relog log ticket. If it exists, grab a reference and return.
  If not, allocate an empty ticket and install it into the relog
  subsystem. Seed the relog ticket from the reservation of the current
  transaction. Roll the current transaction to replenish its
  reservation and return to the caller.
- The transaction is used as normal. If an item is relogged in the
  transaction, that item acquires a reference on the global relog
  ticket currently held open by the transaction.
  The item's reference persists until relogging is disabled on the
  item.
- The RELOG transaction commits and releases its reference to the
  global relog ticket. The global relog ticket is released once its
  reference count drops to zero.

This provides a central relog log ticket that guarantees reservation
availability for relogged items, avoids log reservation deadlocks and
is allocated and released on demand.

Signed-off-by: Brian Foster
Reviewed-by: Allison Collins
---
 fs/xfs/libxfs/xfs_shared.h |  1 +
 fs/xfs/xfs_trans.c         | 37 +++++++++++++---
 fs/xfs/xfs_trans.h         |  3 ++
 fs/xfs/xfs_trans_ail.c     | 89 ++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_trans_priv.h    |  1 +
 5 files changed, 126 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index c45acbd3add9..0a10ca0853ab 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -77,6 +77,7 @@ void xfs_log_get_max_trans_res(struct xfs_mount *mp,
  * made then this algorithm will eventually find all the space it needs.
  */
 #define XFS_TRANS_LOWMODE 0x100 /* allocate in low space mode */
+#define XFS_TRANS_RELOG 0x200 /* enable automatic relogging */

 /*
  * Field values for xfs_trans_mod_sb.
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 3b208f9a865c..8ac05ed8deda 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -107,9 +107,14 @@ xfs_trans_dup(

 	ntp->t_flags = XFS_TRANS_PERM_LOG_RES |
 		       (tp->t_flags & XFS_TRANS_RESERVE) |
-		       (tp->t_flags & XFS_TRANS_NO_WRITECOUNT);
-	/* We gave our writer reference to the new transaction */
+		       (tp->t_flags & XFS_TRANS_NO_WRITECOUNT) |
+		       (tp->t_flags & XFS_TRANS_RELOG);
+	/*
+	 * The writer reference and relog reference transfer to the new
+	 * transaction.
+	 */
 	tp->t_flags |= XFS_TRANS_NO_WRITECOUNT;
+	tp->t_flags &= ~XFS_TRANS_RELOG;
 	ntp->t_ticket = xfs_log_ticket_get(tp->t_ticket);

 	ASSERT(tp->t_blk_res >= tp->t_blk_res_used);
@@ -284,15 +289,25 @@ xfs_trans_alloc(
 	tp->t_firstblock = NULLFSBLOCK;

 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
-	if (error) {
-		xfs_trans_cancel(tp);
-		return error;
+	if (error)
+		goto error;
+
+	if (flags & XFS_TRANS_RELOG) {
+		error = xfs_trans_ail_relog_reserve(&tp);
+		if (error)
+			goto error;
 	}

 	trace_xfs_trans_alloc(tp, _RET_IP_);

 	*tpp = tp;
 	return 0;
+
+error:
+	/* clear relog flag if we haven't acquired a ref */
+	tp->t_flags &= ~XFS_TRANS_RELOG;
+	xfs_trans_cancel(tp);
+	return error;
 }

 /*
@@ -973,6 +988,10 @@ __xfs_trans_commit(

 	xfs_log_commit_cil(mp, tp, &commit_lsn, regrant);

+	/* release the relog ticket reference if this transaction holds one */
+	if (tp->t_flags & XFS_TRANS_RELOG)
+		xfs_trans_ail_relog_put(mp);
+
 	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
 	xfs_trans_free(tp);

@@ -1004,6 +1023,10 @@ __xfs_trans_commit(
 			error = -EIO;
 		tp->t_ticket = NULL;
 	}
+	/* release the relog ticket reference if this transaction holds one */
+	/* XXX: handle RELOG items on transaction abort */
+	if (tp->t_flags & XFS_TRANS_RELOG)
+		xfs_trans_ail_relog_put(mp);
 	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);
 	xfs_trans_free_items(tp, !!error);
 	xfs_trans_free(tp);
@@ -1064,6 +1087,10 @@ xfs_trans_cancel(
 		tp->t_ticket = NULL;
 	}

+	/* release the relog ticket reference if this transaction holds one */
+	if (tp->t_flags & XFS_TRANS_RELOG)
+		xfs_trans_ail_relog_put(mp);
+
 	/* mark this thread as no longer being in a transaction */
 	current_restore_flags_nested(&tp->t_pflags, PF_MEMALLOC_NOFS);

diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 752c7fef9de7..a032989943bd 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -236,6 +236,9 @@ int xfs_trans_roll_inode(struct xfs_trans **, struct xfs_inode *);
 void xfs_trans_cancel(xfs_trans_t *);
 int xfs_trans_ail_init(struct xfs_mount *);
 void xfs_trans_ail_destroy(struct xfs_mount *);
+int xfs_trans_ail_relog_reserve(struct xfs_trans **);
+bool xfs_trans_ail_relog_get(struct xfs_mount *);
+int xfs_trans_ail_relog_put(struct xfs_mount *);

 void xfs_trans_buf_set_type(struct xfs_trans *, struct xfs_buf *,
 		enum xfs_blft);
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 00cc5b8734be..a3fb64275baa 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -17,6 +17,7 @@
 #include "xfs_errortag.h"
 #include "xfs_error.h"
 #include "xfs_log.h"
+#include "xfs_log_priv.h"

 #ifdef DEBUG
 /*
@@ -818,6 +819,93 @@ xfs_trans_ail_delete(
 	xfs_log_space_wake(ailp->ail_mount);
 }

+bool
+xfs_trans_ail_relog_get(
+	struct xfs_mount *mp)
+{
+	struct xfs_ail *ailp = mp->m_ail;
+	bool ret = false;
+
+	spin_lock(&ailp->ail_lock);
+	if (ailp->ail_relog_tic) {
+		xfs_log_ticket_get(ailp->ail_relog_tic);
+		ret = true;
+	}
+	spin_unlock(&ailp->ail_lock);
+	return ret;
+}
+
+/*
+ * Reserve log space for the automatic relogging ->tr_relog ticket. This
+ * requires a clean, permanent transaction from the caller. Pull reservation
+ * for the relog ticket and roll the caller's transaction back to its fully
+ * reserved state. If the AIL relog ticket is already initialized, grab a
+ * reference and return.
+ */
+int
+xfs_trans_ail_relog_reserve(
+	struct xfs_trans **tpp)
+{
+	struct xfs_trans *tp = *tpp;
+	struct xfs_mount *mp = tp->t_mountp;
+	struct xfs_ail *ailp = mp->m_ail;
+	struct xlog_ticket *tic;
+	uint32_t logres = M_RES(mp)->tr_relog.tr_logres;
+
+	ASSERT(tp->t_flags & XFS_TRANS_PERM_LOG_RES);
+	ASSERT(!(tp->t_flags & XFS_TRANS_DIRTY));
+
+	if (xfs_trans_ail_relog_get(mp))
+		return 0;
+
+	/* no active ticket, fall into slow path to allocate one.. */
+	tic = xlog_ticket_alloc(mp->m_log, logres, 1, XFS_TRANSACTION, true, 0);
+	if (!tic)
+		return -ENOMEM;
+	ASSERT(tp->t_ticket->t_curr_res >= tic->t_curr_res);
+
+	/* check again since we dropped the lock for the allocation */
+	spin_lock(&ailp->ail_lock);
+	if (ailp->ail_relog_tic) {
+		xfs_log_ticket_get(ailp->ail_relog_tic);
+		spin_unlock(&ailp->ail_lock);
+		xfs_log_ticket_put(tic);
+		return 0;
+	}
+
+	/* attach and reserve space for the ->tr_relog ticket */
+	ailp->ail_relog_tic = tic;
+	tp->t_ticket->t_curr_res -= tic->t_curr_res;
+	spin_unlock(&ailp->ail_lock);
+
+	return xfs_trans_roll(tpp);
+}
+
+/*
+ * Release a reference to the relog ticket.
+ */
+int
+xfs_trans_ail_relog_put(
+	struct xfs_mount *mp)
+{
+	struct xfs_ail *ailp = mp->m_ail;
+	struct xlog_ticket *tic;
+
+	spin_lock(&ailp->ail_lock);
+	if (atomic_add_unless(&ailp->ail_relog_tic->t_ref, -1, 1)) {
+		spin_unlock(&ailp->ail_lock);
+		return 0;
+	}
+
+	ASSERT(atomic_read(&ailp->ail_relog_tic->t_ref) == 1);
+	tic = ailp->ail_relog_tic;
+	ailp->ail_relog_tic = NULL;
+	spin_unlock(&ailp->ail_lock);
+
+	xfs_log_done(mp, tic, NULL, false);
+	return 0;
+}
+
 int
 xfs_trans_ail_init(
 	xfs_mount_t *mp)
@@ -854,6 +942,7 @@ xfs_trans_ail_destroy(
 {
 	struct xfs_ail *ailp = mp->m_ail;

+	ASSERT(ailp->ail_relog_tic == NULL);
 	kthread_stop(ailp->ail_task);
 	kmem_free(ailp);
 }
diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
index 2e073c1c4614..839df6559b9f 100644
--- a/fs/xfs/xfs_trans_priv.h
+++ b/fs/xfs/xfs_trans_priv.h
@@ -61,6 +61,7 @@ struct xfs_ail {
 	int ail_log_flush;
 	struct list_head ail_buf_list;
 	wait_queue_head_t ail_empty;
+	struct xlog_ticket *ail_relog_tic;
 };

 /*

From patchwork Thu Feb 27 13:43:16 2020
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 11408773
From: Brian Foster
To: linux-xfs@vger.kernel.org
Subject: [RFC v5 PATCH 4/9] xfs: automatic relogging item management
Date: Thu, 27 Feb 2020 08:43:16 -0500
Message-Id: <20200227134321.7238-5-bfoster@redhat.com>
In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com>
References: <20200227134321.7238-1-bfoster@redhat.com>

As implemented by the previous patch, relogging can be enabled on any
item via a relog enabled transaction (which holds a reference to an
active relog ticket). Add a couple of log item flags to track the relog
state of an arbitrary log item. The item holds a reference to the
global relog ticket when relogging is enabled and releases the
reference when relogging is disabled.

Signed-off-by: Brian Foster
Reviewed-by: Allison Collins
---
 fs/xfs/xfs_trace.h      |  2 ++
 fs/xfs/xfs_trans.c      | 36 ++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_trans.h      |  6 +++++-
 fs/xfs/xfs_trans_priv.h |  2 ++
 4 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index a86be7f807ee..a066617ec54d 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1063,6 +1063,8 @@ DEFINE_LOG_ITEM_EVENT(xfs_ail_push);
 DEFINE_LOG_ITEM_EVENT(xfs_ail_pinned);
 DEFINE_LOG_ITEM_EVENT(xfs_ail_locked);
 DEFINE_LOG_ITEM_EVENT(xfs_ail_flushing);
+DEFINE_LOG_ITEM_EVENT(xfs_relog_item);
+DEFINE_LOG_ITEM_EVENT(xfs_relog_item_cancel);

 DECLARE_EVENT_CLASS(xfs_ail_class,
 	TP_PROTO(struct xfs_log_item *lip, xfs_lsn_t old_lsn, xfs_lsn_t new_lsn),
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 8ac05ed8deda..f7f2411ead4e 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -778,6 +778,41 @@ xfs_trans_del_item(
 	list_del_init(&lip->li_trans);
 }

+void
+xfs_trans_relog_item(
+	struct xfs_log_item *lip)
+{
+	if (!test_and_set_bit(XFS_LI_RELOG, &lip->li_flags)) {
+		xfs_trans_ail_relog_get(lip->li_mountp);
+		trace_xfs_relog_item(lip);
+	}
+}
+
+void
+xfs_trans_relog_item_cancel(
+	struct xfs_log_item *lip,
+	bool drain) /* wait for relogging to cease */
+{
+	struct xfs_mount *mp = lip->li_mountp;
+
+	if (!test_and_clear_bit(XFS_LI_RELOG, &lip->li_flags))
+		return;
+	xfs_trans_ail_relog_put(lip->li_mountp);
+	trace_xfs_relog_item_cancel(lip);
+
+	if (!drain)
+		return;
+
+	/*
+	 * Some operations might require relog activity to cease before they can
+	 * proceed. For example, an operation must wait before including a
+	 * non-lockable log item (i.e. intent) in another transaction.
+	 */
+	while (wait_on_bit_timeout(&lip->li_flags, XFS_LI_RELOGGED,
+				   TASK_UNINTERRUPTIBLE, HZ))
+		xfs_log_force(mp, XFS_LOG_SYNC);
+}
+
 /* Detach and unlock all of the items in a transaction */
 static void
 xfs_trans_free_items(
@@ -863,6 +898,7 @@ xfs_trans_committed_bulk(

 		if (aborted)
 			set_bit(XFS_LI_ABORTED, &lip->li_flags);
+		clear_and_wake_up_bit(XFS_LI_RELOGGED, &lip->li_flags);

 		if (lip->li_ops->flags & XFS_ITEM_RELEASE_WHEN_COMMITTED) {
 			lip->li_ops->iop_release(lip);
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index a032989943bd..fc4c25b6eee4 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -59,12 +59,16 @@ struct xfs_log_item {
 #define XFS_LI_ABORTED 1
 #define XFS_LI_FAILED 2
 #define XFS_LI_DIRTY 3 /* log item dirty in transaction */
+#define XFS_LI_RELOG 4 /* automatically relog item */
+#define XFS_LI_RELOGGED 5 /* item relogged (not committed) */

 #define XFS_LI_FLAGS \
 	{ (1 << XFS_LI_IN_AIL), "IN_AIL" }, \
 	{ (1 << XFS_LI_ABORTED), "ABORTED" }, \
 	{ (1 << XFS_LI_FAILED), "FAILED" }, \
-	{ (1 << XFS_LI_DIRTY), "DIRTY" }
+	{ (1 << XFS_LI_DIRTY), "DIRTY" }, \
+	{ (1 << XFS_LI_RELOG), "RELOG" }, \
+	{ (1 << XFS_LI_RELOGGED), "RELOGGED" }

 struct xfs_item_ops {
 	unsigned flags;
diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h
index 839df6559b9f..d1edec1cb8ad 100644
--- a/fs/xfs/xfs_trans_priv.h
+++ b/fs/xfs/xfs_trans_priv.h
@@ -16,6 +16,8 @@ struct xfs_log_vec;
 void xfs_trans_init(struct xfs_mount *);
 void xfs_trans_add_item(struct xfs_trans *, struct xfs_log_item *);
 void xfs_trans_del_item(struct xfs_log_item *);
+void xfs_trans_relog_item(struct xfs_log_item *);
+void xfs_trans_relog_item_cancel(struct xfs_log_item *, bool);
 void xfs_trans_unreserve_and_mod_sb(struct xfs_trans *tp);

 void xfs_trans_committed_bulk(struct xfs_ail *ailp, struct xfs_log_vec *lv,

From patchwork Thu Feb 27 13:43:17 2020
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 11408771
From: Brian Foster
To: linux-xfs@vger.kernel.org
Subject: [RFC v5 PATCH 5/9] xfs: automatic log item relog mechanism
Date: Thu, 27 Feb 2020 08:43:17 -0500
Message-Id: <20200227134321.7238-6-bfoster@redhat.com>
In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com>
References: <20200227134321.7238-1-bfoster@redhat.com>

Now that relog reservation is available and relog state tracking is in
place, all that remains to automatically relog items is the relog
mechanism itself. An item with relogging enabled is basically pinned
from writeback until relog is disabled. Instead of being written back,
the item must be periodically committed in a new transaction to move it
in the physical log. The purpose of moving the item is to avoid long
term tail pinning and thus avoid log deadlocks for long running
operations.

The ideal time to relog an item is in response to tail pushing
pressure. This accommodates the current workload at any given time as
opposed to a fixed time interval or log reservation heuristic, which
risks performance regression.
This is essentially the same heuristic that drives metadata writeback.
XFS already implements various log tail pushing heuristics that attempt
to keep the log progressing on an active filesystem under various
workloads.

The act of relogging an item simply requires adding it to a transaction
and committing. This pushes the already dirty item into a subsequent
log checkpoint and frees up its previous location in the on-disk log.
Joining an item to a transaction of course requires locking the item
first, which means we have to be aware of type-specific locks and lock
ordering wherever the relog takes place.

Fundamentally, this points to xfsaild as the ideal location to process
relog enabled items. xfsaild already processes log resident items, is
driven by log tail pushing pressure, processes arbitrary log item types
through callbacks, and is sensitive to type-specific locking rules by
design. The fact that automatic relogging essentially diverts items to
either writeback or relog also suggests xfsaild as an ideal location to
process items one way or the other.

Of course, we don't want xfsaild to process transactions as it is a
critical component of the log subsystem for driving metadata writeback
and freeing up log space. Therefore, similar to how xfsaild builds up a
writeback queue of dirty items and queues writes asynchronously, make
xfsaild responsible only for directing pending relog items into an
appropriate queue and create an async (workqueue) context for
processing the queue. The workqueue context utilizes the pre-reserved
relog ticket to drain the queue by rolling a permanent transaction.

Update the AIL pushing infrastructure to support a new RELOG item
state. If a log item push returns the relog state, queue the item for
relog instead of writeback. On completion of a push cycle, schedule the
relog task at the same point metadata buffer I/O is submitted.
This allows items to be relogged automatically under the same locking rules and pressure heuristics that govern metadata writeback. Signed-off-by: Brian Foster Reviewed-by: Allison Collins --- fs/xfs/xfs_trace.h | 1 + fs/xfs/xfs_trans.h | 1 + fs/xfs/xfs_trans_ail.c | 103 +++++++++++++++++++++++++++++++++++++++- fs/xfs/xfs_trans_priv.h | 3 ++ 4 files changed, 106 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index a066617ec54d..df0114ec66f1 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1063,6 +1063,7 @@ DEFINE_LOG_ITEM_EVENT(xfs_ail_push); DEFINE_LOG_ITEM_EVENT(xfs_ail_pinned); DEFINE_LOG_ITEM_EVENT(xfs_ail_locked); DEFINE_LOG_ITEM_EVENT(xfs_ail_flushing); +DEFINE_LOG_ITEM_EVENT(xfs_ail_relog); DEFINE_LOG_ITEM_EVENT(xfs_relog_item); DEFINE_LOG_ITEM_EVENT(xfs_relog_item_cancel); diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h index fc4c25b6eee4..1637df32c64c 100644 --- a/fs/xfs/xfs_trans.h +++ b/fs/xfs/xfs_trans.h @@ -99,6 +99,7 @@ void xfs_log_item_init(struct xfs_mount *mp, struct xfs_log_item *item, #define XFS_ITEM_PINNED 1 #define XFS_ITEM_LOCKED 2 #define XFS_ITEM_FLUSHING 3 +#define XFS_ITEM_RELOG 4 /* * Deferred operation item relogging limits. diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index a3fb64275baa..71a47faeaae8 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -144,6 +144,75 @@ xfs_ail_max_lsn( return lsn; } +/* + * Relog log items on the AIL relog queue. + */ +static void +xfs_ail_relog( + struct work_struct *work) +{ + struct xfs_ail *ailp = container_of(work, struct xfs_ail, + ail_relog_work); + struct xfs_mount *mp = ailp->ail_mount; + struct xfs_trans_res tres = {}; + struct xfs_trans *tp; + struct xfs_log_item *lip; + int error; + + /* + * The first transaction to submit a relog item contributed relog + * reservation to the relog ticket before committing. Create an empty + * transaction and manually associate the relog ticket. 
+ */ + error = xfs_trans_alloc(mp, &tres, 0, 0, 0, &tp); + ASSERT(!error); + if (error) + return; + tp->t_log_res = M_RES(mp)->tr_relog.tr_logres; + tp->t_log_count = M_RES(mp)->tr_relog.tr_logcount; + tp->t_flags |= M_RES(mp)->tr_relog.tr_logflags; + tp->t_ticket = xfs_log_ticket_get(ailp->ail_relog_tic); + + spin_lock(&ailp->ail_lock); + while ((lip = list_first_entry_or_null(&ailp->ail_relog_list, + struct xfs_log_item, + li_trans)) != NULL) { + /* + * Drop the AIL processing ticket reference once the relog list + * is emptied. At this point it's possible for our transaction + * to hold the only reference. + */ + list_del_init(&lip->li_trans); + if (list_empty(&ailp->ail_relog_list)) + xfs_log_ticket_put(ailp->ail_relog_tic); + spin_unlock(&ailp->ail_lock); + + xfs_trans_add_item(tp, lip); + set_bit(XFS_LI_DIRTY, &lip->li_flags); + tp->t_flags |= XFS_TRANS_DIRTY; + /* XXX: include ticket owner task fix */ + error = xfs_trans_roll(&tp); + ASSERT(!error); + if (error) + goto out; + spin_lock(&ailp->ail_lock); + } + spin_unlock(&ailp->ail_lock); + +out: + /* XXX: handle shutdown scenario */ + /* + * Drop the relog reference owned by the transaction separately because + * we don't want the cancel to release reservation if this isn't the + * final reference. The relog ticket and associated reservation needs + * to persist so long as relog items are active in the log subsystem. + */ + xfs_trans_ail_relog_put(mp); + + tp->t_ticket = NULL; + xfs_trans_cancel(tp); +} + /* * The cursor keeps track of where our current traversal is up to by tracking * the next item in the list for us. However, for this to be safe, removing an @@ -364,7 +433,7 @@ static long xfsaild_push( struct xfs_ail *ailp) { - xfs_mount_t *mp = ailp->ail_mount; + struct xfs_mount *mp = ailp->ail_mount; struct xfs_ail_cursor cur; struct xfs_log_item *lip; xfs_lsn_t lsn; @@ -426,6 +495,23 @@ xfsaild_push( ailp->ail_last_pushed_lsn = lsn; break; + case XFS_ITEM_RELOG: + /* + * The item requires a relog. 
Add to the pending relog + * list and set the relogged bit to prevent further + * relog requests. The relog bit and ticket reference + * can be dropped from the item at any point, so hold a + * relog ticket reference for the pending relog list to + * ensure the ticket stays around. + */ + trace_xfs_ail_relog(lip); + ASSERT(list_empty(&lip->li_trans)); + if (list_empty(&ailp->ail_relog_list)) + xfs_log_ticket_get(ailp->ail_relog_tic); + list_add_tail(&lip->li_trans, &ailp->ail_relog_list); + set_bit(XFS_LI_RELOGGED, &lip->li_flags); + break; + case XFS_ITEM_FLUSHING: /* * The item or its backing buffer is already being @@ -492,6 +578,9 @@ xfsaild_push( if (xfs_buf_delwri_submit_nowait(&ailp->ail_buf_list)) ailp->ail_log_flush++; + if (!list_empty(&ailp->ail_relog_list)) + queue_work(ailp->ail_relog_wq, &ailp->ail_relog_work); + if (!count || XFS_LSN_CMP(lsn, target) >= 0) { out_done: /* @@ -922,15 +1011,24 @@ xfs_trans_ail_init( spin_lock_init(&ailp->ail_lock); INIT_LIST_HEAD(&ailp->ail_buf_list); init_waitqueue_head(&ailp->ail_empty); + INIT_LIST_HEAD(&ailp->ail_relog_list); + INIT_WORK(&ailp->ail_relog_work, xfs_ail_relog); + + ailp->ail_relog_wq = alloc_workqueue("xfs-relog/%s", WQ_FREEZABLE, 0, + mp->m_super->s_id); + if (!ailp->ail_relog_wq) + goto out_free_ailp; ailp->ail_task = kthread_run(xfsaild, ailp, "xfsaild/%s", ailp->ail_mount->m_super->s_id); if (IS_ERR(ailp->ail_task)) - goto out_free_ailp; + goto out_destroy_wq; mp->m_ail = ailp; return 0; +out_destroy_wq: + destroy_workqueue(ailp->ail_relog_wq); out_free_ailp: kmem_free(ailp); return -ENOMEM; @@ -944,5 +1042,6 @@ xfs_trans_ail_destroy( ASSERT(ailp->ail_relog_tic == NULL); kthread_stop(ailp->ail_task); + destroy_workqueue(ailp->ail_relog_wq); kmem_free(ailp); } diff --git a/fs/xfs/xfs_trans_priv.h b/fs/xfs/xfs_trans_priv.h index d1edec1cb8ad..33a724534869 100644 --- a/fs/xfs/xfs_trans_priv.h +++ b/fs/xfs/xfs_trans_priv.h @@ -63,6 +63,9 @@ struct xfs_ail { int ail_log_flush; struct list_head 
ail_buf_list; wait_queue_head_t ail_empty; + struct work_struct ail_relog_work; + struct list_head ail_relog_list; + struct workqueue_struct *ail_relog_wq; struct xlog_ticket *ail_relog_tic; }; From patchwork Thu Feb 27 13:43:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11408769 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A510914D5 for ; Thu, 27 Feb 2020 14:48:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 79A4924691 for ; Thu, 27 Feb 2020 14:48:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WPleX09+" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729625AbgB0OsL (ORCPT ); Thu, 27 Feb 2020 09:48:11 -0500 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:56947 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729959AbgB0Nn1 (ORCPT ); Thu, 27 Feb 2020 08:43:27 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1582811006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qcVU/Ad5k6xmcd73Mj8LD9ajLQszhmypF3+9/C+QKkc=; b=WPleX09+xBQd+PpM5ag+EnJ4j8kjTQHMdJsF2cz3WMtBYXlMD8CF4kwPKqgf9zSoGLyQYv ud1dkATE9TNyl3wyFh0ye/Ag9HUue21MV8Bl6tCQPM0yklN3Sy4559woc9OU2Z5KyyqlE8 nXTBGy379ruRcGw0aMAFFfttBlKkVbM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-363-kWg0Li0CM8yLDhA-GojiJw-1; Thu, 27 Feb 2020 08:43:24 -0500 X-MC-Unique: 
kWg0Li0CM8yLDhA-GojiJw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id EC1FE1084437 for ; Thu, 27 Feb 2020 13:43:23 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id B53FC5DA7E for ; Thu, 27 Feb 2020 13:43:23 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [RFC v5 PATCH 6/9] xfs: automatically relog the quotaoff start intent Date: Thu, 27 Feb 2020 08:43:18 -0500 Message-Id: <20200227134321.7238-7-bfoster@redhat.com> In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com> References: <20200227134321.7238-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The quotaoff operation has a rare but longstanding deadlock vector in terms of how the operation is logged. A quotaoff start intent is logged (synchronously) at the onset to ensure recovery can handle the operation if interrupted before in-core changes are made. This quotaoff intent pins the log tail while the quotaoff sequence scans and purges dquots from all in-core inodes. While this operation generally doesn't generate much log traffic on its own, it can be time consuming. If unrelated, concurrent filesystem activity consumes remaining log space before quotaoff is able to acquire log reservation for the quotaoff end intent, the filesystem locks up indefinitely. quotaoff cannot allocate the end intent before the scan because the latter can result in transaction allocation itself in certain indirect cases (releasing an inode, for example). 
Further, rolling the original transaction is difficult because the scanning work occurs multiple layers down where caller context is lost and not much information is available to determine how often to roll the transaction. To address this problem, enable automatic relogging of the quotaoff start intent. This automatically relogs the intent whenever AIL pushing finds the item at the tail of the log. When quotaoff completes, wait for relogging to complete as the end intent expects to be able to permanently remove the start intent from the log subsystem. This ensures that the log tail is kept moving during a particularly long quotaoff operation and avoids the log reservation deadlock. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_trans_resv.c | 3 ++- fs/xfs/xfs_dquot_item.c | 7 +++++++ fs/xfs/xfs_qm_syscalls.c | 12 +++++++++++- 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 1f5c9e6e1afc..f49b20c9ca33 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -935,7 +935,8 @@ xfs_trans_resv_calc( resp->tr_qm_setqlim.tr_logcount = XFS_DEFAULT_LOG_COUNT; resp->tr_qm_quotaoff.tr_logres = xfs_calc_qm_quotaoff_reservation(mp); - resp->tr_qm_quotaoff.tr_logcount = XFS_DEFAULT_LOG_COUNT; + resp->tr_qm_quotaoff.tr_logcount = XFS_DEFAULT_PERM_LOG_COUNT; + resp->tr_qm_quotaoff.tr_logflags |= XFS_TRANS_PERM_LOG_RES; resp->tr_qm_equotaoff.tr_logres = xfs_calc_qm_quotaoff_end_reservation(); diff --git a/fs/xfs/xfs_dquot_item.c b/fs/xfs/xfs_dquot_item.c index d60647d7197b..ea5123678466 100644 --- a/fs/xfs/xfs_dquot_item.c +++ b/fs/xfs/xfs_dquot_item.c @@ -297,6 +297,13 @@ xfs_qm_qoff_logitem_push( struct xfs_log_item *lip, struct list_head *buffer_list) { + struct xfs_log_item *mlip = xfs_ail_min(lip->li_ailp); + + if (test_bit(XFS_LI_RELOG, &lip->li_flags) && + !test_bit(XFS_LI_RELOGGED, &lip->li_flags) && + !XFS_LSN_CMP(lip->li_lsn, mlip->li_lsn)) + return 
XFS_ITEM_RELOG; + return XFS_ITEM_LOCKED; } diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c index 1ea82764bf89..7b48d34da0f4 100644 --- a/fs/xfs/xfs_qm_syscalls.c +++ b/fs/xfs/xfs_qm_syscalls.c @@ -18,6 +18,7 @@ #include "xfs_quota.h" #include "xfs_qm.h" #include "xfs_icache.h" +#include "xfs_trans_priv.h" STATIC int xfs_qm_log_quotaoff( @@ -31,12 +32,14 @@ xfs_qm_log_quotaoff( *qoffstartp = NULL; - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_qm_quotaoff, 0, 0, 0, &tp); + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_qm_quotaoff, 0, 0, + XFS_TRANS_RELOG, &tp); if (error) goto out; qoffi = xfs_trans_get_qoff_item(tp, NULL, flags & XFS_ALL_QUOTA_ACCT); xfs_trans_log_quotaoff_item(tp, qoffi); + xfs_trans_relog_item(&qoffi->qql_item); spin_lock(&mp->m_sb_lock); mp->m_sb.sb_qflags = (mp->m_qflags & ~(flags)) & XFS_MOUNT_QUOTA_ALL; @@ -69,6 +72,13 @@ xfs_qm_log_quotaoff_end( int error; struct xfs_qoff_logitem *qoffi; + /* + * startqoff must be in the AIL and not the CIL when the end intent + * commits to ensure it is not readded to the AIL out of order. Wait on + * relog activity to drain to isolate startqoff to the AIL. 
+ */ + xfs_trans_relog_item_cancel(&startqoff->qql_item, true); + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_qm_equotaoff, 0, 0, 0, &tp); if (error) return error; From patchwork Thu Feb 27 13:43:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11408765 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D16041580 for ; Thu, 27 Feb 2020 14:48:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A74B724656 for ; Thu, 27 Feb 2020 14:48:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EOLhtC7P" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729969AbgB0OsK (ORCPT ); Thu, 27 Feb 2020 09:48:10 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:46035 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729967AbgB0Nn1 (ORCPT ); Thu, 27 Feb 2020 08:43:27 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1582811006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nmadNi8II9xm8yPBRY4OKlzh72r86cmU3sWMkDVLnJg=; b=EOLhtC7PuV+Q39NLcV2aVvOtKvjeGjEOsQu74dYjr0Qz7lIVPLGLpE+OoCd+F2hcue3wOB O1Ry4lj41wov4A4yi3CCSyNNu6vitin95He1X/bXJwJ4gQ/wpTPyWNm7j9J3JA8FcAGDe4 2v2ptmV2WzF990lhqeBU8K8kER6gRQk= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-399-zVWA0JNEMoWxJBXNs1-AuQ-1; Thu, 27 Feb 2020 08:43:25 -0500 X-MC-Unique: zVWA0JNEMoWxJBXNs1-AuQ-1 
Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4CE561005513 for ; Thu, 27 Feb 2020 13:43:24 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 14B295D9CD for ; Thu, 27 Feb 2020 13:43:24 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [RFC v5 PATCH 7/9] xfs: buffer relogging support prototype Date: Thu, 27 Feb 2020 08:43:19 -0500 Message-Id: <20200227134321.7238-8-bfoster@redhat.com> In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com> References: <20200227134321.7238-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Add a quick and dirty implementation of buffer relogging support. There is currently no use case for buffer relogging. This is for experimental use only and serves as an example to demonstrate the ability to relog arbitrary items in the future, if necessary. Add a hook to enable relogging a buffer in a transaction, update the buffer log item handlers to support relogged BLIs and update the relog handler to join the relogged buffer to the relog transaction. 
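For readers following the series, the push-side flow described above can be sketched in plain userspace C. This is an illustration only, not the kernel code: the enum values, flag names, struct item and struct relog_queue are hypothetical stand-ins for the XFS_ITEM_* states, the XFS_LI_* bits and the AIL relog list. The "already queued" check follows the quotaoff push handler earlier in the series; the buffer prototype in this patch tests only the relog bit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the XFS_ITEM_* push states. */
enum push_state { ITEM_SUCCESS, ITEM_LOCKED, ITEM_RELOG };

/* Hypothetical stand-ins for the XFS_LI_RELOG / XFS_LI_RELOGGED bits. */
#define LI_RELOG    (1u << 0)	/* relogging enabled on the item */
#define LI_RELOGGED (1u << 1)	/* item already queued for relog */

struct item {
	unsigned int	flags;
	int		locked;	/* nonzero if the item lock is held elsewhere */
	struct item	*next;	/* link on the pending relog queue */
};

struct relog_queue {
	struct item	*head;
	struct item	*tail;
	size_t		len;
};

/*
 * Push handler: skip locked items, divert relog-enabled items that are
 * not yet queued, and let everything else proceed to writeback.
 */
static enum push_state item_push(const struct item *it)
{
	if (it->locked)
		return ITEM_LOCKED;
	if (it->flags & LI_RELOG)
		return (it->flags & LI_RELOGGED) ? ITEM_LOCKED : ITEM_RELOG;
	return ITEM_SUCCESS;
}

/*
 * Dispatcher: on a relog result, append the item to the relog queue
 * instead of the writeback list and mark it as queued.
 */
static enum push_state push_one(struct relog_queue *q, struct item *it)
{
	enum push_state st = item_push(it);

	if (st == ITEM_RELOG) {
		assert(!(it->flags & LI_RELOGGED));
		it->next = NULL;
		if (q->tail)
			q->tail->next = it;
		else
			q->head = it;
		q->tail = it;
		q->len++;
		it->flags |= LI_RELOGGED;
	}
	return st;
}
```

In the sketch a second push of a queued item reports ITEM_LOCKED, so the item is neither written back nor queued twice while a relog is pending.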
Signed-off-by: Brian Foster --- fs/xfs/xfs_buf_item.c | 5 +++++ fs/xfs/xfs_trans.h | 1 + fs/xfs/xfs_trans_ail.c | 19 ++++++++++++++++--- fs/xfs/xfs_trans_buf.c | 22 ++++++++++++++++++++++ 4 files changed, 44 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c index 663810e6cd59..4ef2725fa8ce 100644 --- a/fs/xfs/xfs_buf_item.c +++ b/fs/xfs/xfs_buf_item.c @@ -463,6 +463,7 @@ xfs_buf_item_unpin( list_del_init(&bp->b_li_list); bp->b_iodone = NULL; } else { + xfs_trans_relog_item_cancel(lip, false); spin_lock(&ailp->ail_lock); xfs_trans_ail_delete(ailp, lip, SHUTDOWN_LOG_IO_ERROR); xfs_buf_item_relse(bp); @@ -528,6 +529,9 @@ xfs_buf_item_push( return XFS_ITEM_LOCKED; } + if (test_bit(XFS_LI_RELOG, &lip->li_flags)) + return XFS_ITEM_RELOG; + ASSERT(!(bip->bli_flags & XFS_BLI_STALE)); trace_xfs_buf_item_push(bip); @@ -956,6 +960,7 @@ STATIC void xfs_buf_item_free( struct xfs_buf_log_item *bip) { + ASSERT(!test_bit(XFS_LI_RELOG, &bip->bli_item.li_flags)); xfs_buf_item_free_format(bip); kmem_free(bip->bli_item.li_lv_shadow); kmem_cache_free(xfs_buf_item_zone, bip); diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h index 1637df32c64c..81cb42f552d9 100644 --- a/fs/xfs/xfs_trans.h +++ b/fs/xfs/xfs_trans.h @@ -226,6 +226,7 @@ void xfs_trans_inode_buf(xfs_trans_t *, struct xfs_buf *); void xfs_trans_stale_inode_buf(xfs_trans_t *, struct xfs_buf *); bool xfs_trans_ordered_buf(xfs_trans_t *, struct xfs_buf *); void xfs_trans_dquot_buf(xfs_trans_t *, struct xfs_buf *, uint); +bool xfs_trans_relog_buf(struct xfs_trans *, struct xfs_buf *); void xfs_trans_inode_alloc_buf(xfs_trans_t *, struct xfs_buf *); void xfs_trans_ichgtime(struct xfs_trans *, struct xfs_inode *, int); void xfs_trans_ijoin(struct xfs_trans *, struct xfs_inode *, uint); diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 71a47faeaae8..103ab62e61be 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -18,6 +18,7 @@ #include "xfs_error.h" #include 
"xfs_log.h" #include "xfs_log_priv.h" +#include "xfs_buf_item.h" #ifdef DEBUG /* @@ -187,9 +188,21 @@ xfs_ail_relog( xfs_log_ticket_put(ailp->ail_relog_tic); spin_unlock(&ailp->ail_lock); - xfs_trans_add_item(tp, lip); - set_bit(XFS_LI_DIRTY, &lip->li_flags); - tp->t_flags |= XFS_TRANS_DIRTY; + /* + * TODO: Ideally, relog transaction management would be pushed + * down into the ->iop_push() callbacks rather than playing + * games with ->li_trans and looking at log item types here. + */ + if (lip->li_type == XFS_LI_BUF) { + struct xfs_buf_log_item *bli = (struct xfs_buf_log_item *) lip; + xfs_buf_hold(bli->bli_buf); + xfs_trans_bjoin(tp, bli->bli_buf); + xfs_trans_dirty_buf(tp, bli->bli_buf); + } else { + xfs_trans_add_item(tp, lip); + set_bit(XFS_LI_DIRTY, &lip->li_flags); + tp->t_flags |= XFS_TRANS_DIRTY; + } /* XXX: include ticket owner task fix */ error = xfs_trans_roll(&tp); ASSERT(!error); diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c index 08174ffa2118..e17715ac23fc 100644 --- a/fs/xfs/xfs_trans_buf.c +++ b/fs/xfs/xfs_trans_buf.c @@ -787,3 +787,25 @@ xfs_trans_dquot_buf( xfs_trans_buf_set_type(tp, bp, type); } + +/* + * Enable automatic relogging on a buffer. This essentially pins a dirty buffer + * in-core until relogging is disabled. Note that the buffer must not already be + * queued for writeback. 
+ */ +bool +xfs_trans_relog_buf( + struct xfs_trans *tp, + struct xfs_buf *bp) +{ + struct xfs_buf_log_item *bip = bp->b_log_item; + + ASSERT(tp->t_flags & XFS_TRANS_RELOG); + ASSERT(xfs_buf_islocked(bp)); + + if (bp->b_flags & _XBF_DELWRI_Q) + return false; + + xfs_trans_relog_item(&bip->bli_item); + return true; +} From patchwork Thu Feb 27 13:43:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11408767 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0622417E0 for ; Thu, 27 Feb 2020 14:48:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D9C1924656 for ; Thu, 27 Feb 2020 14:48:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WIEgpNF1" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729703AbgB0OsK (ORCPT ); Thu, 27 Feb 2020 09:48:10 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:26860 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729969AbgB0Nn1 (ORCPT ); Thu, 27 Feb 2020 08:43:27 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1582811006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6xBO7XFaQ4Q2SAQVY0hNR9OS7IAuRDc8MaivqtJGyOY=; b=WIEgpNF1rFXhh20ZtZ6UkG3g9/g0QwUNXKoP34D9bHIkL1aMUyGvtX3vv13VRa2apqATYB 0tqJLDz3fuBrCw88HlvCxAflgyM8hkCb/21YFKscRgFI0jJFM5nBe70DBQD5Y005UrDTGS 14IujU2sS+D7Nxd+X8gReTkLBL399T0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) 
(Using TLS) by relay.mimecast.com with ESMTP id us-mta-356-FubtAjoGPs68uPRq0rZ3aQ-1; Thu, 27 Feb 2020 08:43:25 -0500 X-MC-Unique: FubtAjoGPs68uPRq0rZ3aQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9FF6418A8C88 for ; Thu, 27 Feb 2020 13:43:24 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 67F975D9CD for ; Thu, 27 Feb 2020 13:43:24 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [RFC v5 PATCH 8/9] xfs: create an error tag for random relog reservation Date: Thu, 27 Feb 2020 08:43:20 -0500 Message-Id: <20200227134321.7238-9-bfoster@redhat.com> In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com> References: <20200227134321.7238-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Create an errortag to randomly enable relogging on permanent transactions. This only stresses relog reservation management and does not enable relogging of any particular items. The tag will be reused in a subsequent patch to enable random item relogging. 
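The gating this tag adds to transaction allocation can be sketched in userspace C as follows. This is a minimal sketch under stated assumptions: errortag_fires() and trans_alloc_flags() are hypothetical names, rand() stands in for the kernel's random source, and the flag values are placeholders for XFS_TRANS_PERM_LOG_RES and XFS_TRANS_RELOG. The "1 means always, 2 means 1/2 time" behaviour follows the errortag header comment; treating a factor of 0 as "disabled" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Placeholder flag values, not the kernel's definitions. */
#define TRANS_PERM_LOG_RES (1u << 0)	/* reservation is permanent */
#define TRANS_RELOG        (1u << 1)	/* transaction is relog enabled */

/*
 * Fire roughly once per `randfactor` calls: 1 means always, 2 means
 * about half the time. A factor of 0 is treated as "tag disabled"
 * (assumption of this sketch).
 */
static bool errortag_fires(unsigned int randfactor)
{
	if (randfactor == 0)
		return false;
	return (unsigned int)rand() % randfactor == 0;
}

/*
 * Relogging requires a permanent transaction, so the relog flag is
 * only added when the tag fires and the reservation is permanent.
 */
static unsigned int trans_alloc_flags(unsigned int resv_flags,
				      unsigned int flags,
				      unsigned int randfactor)
{
	if (errortag_fires(randfactor) && (resv_flags & TRANS_PERM_LOG_RES))
		flags |= TRANS_RELOG;
	return flags;
}
```

With a factor of 1 every permanent transaction becomes relog enabled, which is the most aggressive setting for stressing relog reservation management; non-permanent transactions are never affected.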
Signed-off-by: Brian Foster Reviewed-by: Allison Collins --- fs/xfs/libxfs/xfs_errortag.h | 4 +++- fs/xfs/xfs_error.c | 3 +++ fs/xfs/xfs_trans.c | 6 ++++++ 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h index 79e6c4fb1d8a..ca7bcadb9455 100644 --- a/fs/xfs/libxfs/xfs_errortag.h +++ b/fs/xfs/libxfs/xfs_errortag.h @@ -55,7 +55,8 @@ #define XFS_ERRTAG_FORCE_SCRUB_REPAIR 32 #define XFS_ERRTAG_FORCE_SUMMARY_RECALC 33 #define XFS_ERRTAG_IUNLINK_FALLBACK 34 -#define XFS_ERRTAG_MAX 35 +#define XFS_ERRTAG_RELOG 35 +#define XFS_ERRTAG_MAX 36 /* * Random factors for above tags, 1 means always, 2 means 1/2 time, etc. @@ -95,5 +96,6 @@ #define XFS_RANDOM_FORCE_SCRUB_REPAIR 1 #define XFS_RANDOM_FORCE_SUMMARY_RECALC 1 #define XFS_RANDOM_IUNLINK_FALLBACK (XFS_RANDOM_DEFAULT/10) +#define XFS_RANDOM_RELOG XFS_RANDOM_DEFAULT #endif /* __XFS_ERRORTAG_H_ */ diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c index 331765afc53e..2838b909287e 100644 --- a/fs/xfs/xfs_error.c +++ b/fs/xfs/xfs_error.c @@ -53,6 +53,7 @@ static unsigned int xfs_errortag_random_default[] = { XFS_RANDOM_FORCE_SCRUB_REPAIR, XFS_RANDOM_FORCE_SUMMARY_RECALC, XFS_RANDOM_IUNLINK_FALLBACK, + XFS_RANDOM_RELOG, }; struct xfs_errortag_attr { @@ -162,6 +163,7 @@ XFS_ERRORTAG_ATTR_RW(buf_lru_ref, XFS_ERRTAG_BUF_LRU_REF); XFS_ERRORTAG_ATTR_RW(force_repair, XFS_ERRTAG_FORCE_SCRUB_REPAIR); XFS_ERRORTAG_ATTR_RW(bad_summary, XFS_ERRTAG_FORCE_SUMMARY_RECALC); XFS_ERRORTAG_ATTR_RW(iunlink_fallback, XFS_ERRTAG_IUNLINK_FALLBACK); +XFS_ERRORTAG_ATTR_RW(relog, XFS_ERRTAG_RELOG); static struct attribute *xfs_errortag_attrs[] = { XFS_ERRORTAG_ATTR_LIST(noerror), @@ -199,6 +201,7 @@ static struct attribute *xfs_errortag_attrs[] = { XFS_ERRORTAG_ATTR_LIST(force_repair), XFS_ERRORTAG_ATTR_LIST(bad_summary), XFS_ERRORTAG_ATTR_LIST(iunlink_fallback), + XFS_ERRORTAG_ATTR_LIST(relog), NULL, }; diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c index 
f7f2411ead4e..24e0208b74b8 100644 --- a/fs/xfs/xfs_trans.c +++ b/fs/xfs/xfs_trans.c @@ -19,6 +19,7 @@ #include "xfs_trace.h" #include "xfs_error.h" #include "xfs_defer.h" +#include "xfs_errortag.h" kmem_zone_t *xfs_trans_zone; @@ -263,6 +264,11 @@ xfs_trans_alloc( struct xfs_trans *tp; int error; + /* relogging requires permanent transactions */ + if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_RELOG) && + resp->tr_logflags & XFS_TRANS_PERM_LOG_RES) + flags |= XFS_TRANS_RELOG; + /* * Allocate the handle before we do our freeze accounting and setting up * GFP_NOFS allocation context so that we avoid lockdep false positives From patchwork Thu Feb 27 13:43:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11408657 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B3D8114B4 for ; Thu, 27 Feb 2020 13:43:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8FC67222C2 for ; Thu, 27 Feb 2020 13:43:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="GP2PxZbT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729974AbgB0Nn2 (ORCPT ); Thu, 27 Feb 2020 08:43:28 -0500 Received: from us-smtp-1.mimecast.com ([205.139.110.61]:44150 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729458AbgB0Nn2 (ORCPT ); Thu, 27 Feb 2020 08:43:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1582811007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=JnTyIZnYAlg9jJVw7ap0x4ErW8S92UHd74aIll+Pyjk=; b=GP2PxZbTBb3/jd5qUKAD84a9wQc53vbh6Jn8WLNsm41+FZVcgYoILVawbU5GQ/eCotHq/l gz5Ah5AbULlNnTfxOTniw5BDr45VyKTVQpzoEd8ILBFca1r4t9Z44aa6036AGW423jlRbQ VI88sZVyJHUxAgeAttj7Oq22YoXM32g= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-242-G12yHyi5O0y5Eehe-jmZ_w-1; Thu, 27 Feb 2020 08:43:25 -0500 X-MC-Unique: G12yHyi5O0y5Eehe-jmZ_w-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id F31B5107B27A for ; Thu, 27 Feb 2020 13:43:24 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id BD0515D9CD for ; Thu, 27 Feb 2020 13:43:24 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [RFC v5 PATCH 9/9] xfs: relog random buffers based on errortag Date: Thu, 27 Feb 2020 08:43:21 -0500 Message-Id: <20200227134321.7238-10-bfoster@redhat.com> In-Reply-To: <20200227134321.7238-1-bfoster@redhat.com> References: <20200227134321.7238-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Since there is currently no specific use case for buffer relogging, add some hacky and experimental code to relog random buffers when the associated errortag is enabled. Update the relog reservation calculation appropriately and use fixed termination logic to help ensure that the relog queue doesn't grow indefinitely. Note that this patch was useful in causing log reservation deadlocks on an fsstress workload if the relog mechanism code is modified to acquire its own log reservation rather than rely on the relog pre-reservation mechanism. 
In other words, this helps prove that the relog reservation management code effectively avoids log reservation deadlocks. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_trans_resv.c | 8 +++++++- fs/xfs/xfs_trans.h | 4 +++- fs/xfs/xfs_trans_ail.c | 11 +++++++++++ fs/xfs/xfs_trans_buf.c | 13 +++++++++++++ 4 files changed, 34 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index f49b20c9ca33..59a328a0dec6 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -840,7 +840,13 @@ STATIC uint xfs_calc_relog_reservation( struct xfs_mount *mp) { - return xfs_calc_qm_quotaoff_reservation(mp); + uint res; + + res = xfs_calc_qm_quotaoff_reservation(mp); +#ifdef DEBUG + res = max(res, xfs_calc_buf_res(4, XFS_FSB_TO_B(mp, 1))); +#endif + return res; } void diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h index 81cb42f552d9..1783441f6d03 100644 --- a/fs/xfs/xfs_trans.h +++ b/fs/xfs/xfs_trans.h @@ -61,6 +61,7 @@ struct xfs_log_item { #define XFS_LI_DIRTY 3 /* log item dirty in transaction */ #define XFS_LI_RELOG 4 /* automatically relog item */ #define XFS_LI_RELOGGED 5 /* item relogged (not committed) */ +#define XFS_LI_RELOG_RAND 6 #define XFS_LI_FLAGS \ { (1 << XFS_LI_IN_AIL), "IN_AIL" }, \ @@ -68,7 +69,8 @@ struct xfs_log_item { { (1 << XFS_LI_FAILED), "FAILED" }, \ { (1 << XFS_LI_DIRTY), "DIRTY" }, \ { (1 << XFS_LI_RELOG), "RELOG" }, \ - { (1 << XFS_LI_RELOGGED), "RELOGGED" } + { (1 << XFS_LI_RELOGGED), "RELOGGED" }, \ + { (1 << XFS_LI_RELOG_RAND), "RELOG_RAND" } struct xfs_item_ops { unsigned flags; diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c index 103ab62e61be..9b1d7c8df6d8 100644 --- a/fs/xfs/xfs_trans_ail.c +++ b/fs/xfs/xfs_trans_ail.c @@ -188,6 +188,17 @@ xfs_ail_relog( xfs_log_ticket_put(ailp->ail_relog_tic); spin_unlock(&ailp->ail_lock); + /* + * Terminate random/debug relogs at a fixed, aggressive rate to + * avoid building up too much relog activity. 
+ */ + if (test_bit(XFS_LI_RELOG_RAND, &lip->li_flags) && + ((prandom_u32() & 1) || + (mp->m_flags & XFS_MOUNT_UNMOUNTING))) { + clear_bit(XFS_LI_RELOG_RAND, &lip->li_flags); + xfs_trans_relog_item_cancel(lip, false); + } + /* * TODO: Ideally, relog transaction management would be pushed * down into the ->iop_push() callbacks rather than playing diff --git a/fs/xfs/xfs_trans_buf.c b/fs/xfs/xfs_trans_buf.c index e17715ac23fc..de7b9a68fe38 100644 --- a/fs/xfs/xfs_trans_buf.c +++ b/fs/xfs/xfs_trans_buf.c @@ -14,6 +14,8 @@ #include "xfs_buf_item.h" #include "xfs_trans_priv.h" #include "xfs_trace.h" +#include "xfs_error.h" +#include "xfs_errortag.h" /* * Check to see if a buffer matching the given parameters is already @@ -527,6 +529,17 @@ xfs_trans_log_buf( trace_xfs_trans_log_buf(bip); xfs_buf_item_log(bip, first, last); + + /* + * Relog random buffers so long as the transaction is relog enabled and + * the buffer wasn't already relogged explicitly. + */ + if (XFS_TEST_ERROR(false, tp->t_mountp, XFS_ERRTAG_RELOG) && + (tp->t_flags & XFS_TRANS_RELOG) && + !test_bit(XFS_LI_RELOG, &bip->bli_item.li_flags)) { + if (xfs_trans_relog_buf(tp, bp)) + set_bit(XFS_LI_RELOG_RAND, &bip->bli_item.li_flags); + } }