From patchwork Fri Jul 24 06:12:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gao Xiang X-Patchwork-Id: 11682099 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B39B6138C for ; Fri, 24 Jul 2020 06:13:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9C5C420767 for ; Fri, 24 Jul 2020 06:13:54 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Mw8uXqKl" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726567AbgGXGNy (ORCPT ); Fri, 24 Jul 2020 02:13:54 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:58848 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726178AbgGXGNy (ORCPT ); Fri, 24 Jul 2020 02:13:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1595571232; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=BO9n6hVZpzBnYCRr3ajb6n9ABkf1XIdzDLUFQFkkBXg=; b=Mw8uXqKlDUncI9OeWGZMffQq+DgoooXG845FVPkdNyjwboxJx5Rdy3WgcMVhQ+twYzI0hM FBah3SetED078JiWRL/iQncNIimbCHf4GwpJqZZ6rdzC2eRRRDizoBo8TidXwYCKFyi36i VUEsFkxq36cMR1K8A75ZSqpGq9+AF0g= Received: from mail-pl1-f200.google.com (mail-pl1-f200.google.com [209.85.214.200]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-25-fBNXa7xtPn6PKdQ7QccUVQ-1; Fri, 24 Jul 2020 02:13:50 -0400 X-MC-Unique: fBNXa7xtPn6PKdQ7QccUVQ-1 Received: by mail-pl1-f200.google.com with SMTP id o10so4887221plk.12 for ; Thu, 23 Jul 2020 23:13:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=BO9n6hVZpzBnYCRr3ajb6n9ABkf1XIdzDLUFQFkkBXg=; b=ahG6KqWOgPnREa4CjUIk3PcW1pqV+uVZa5GxRJPqyqJxizcFRKGa2vM6mHwjErwj7z 5pEGx9u56by/BYJBcJATbYxLX5uFlpr1y42s84JYXhDkWXFFYtXgEIaV69PakfoUxi0o +GZFVcFldjqHOfLFf3nFZ+q75AGtLOqdE0RUerekTXRHZ1CiusmHIt51Czua49GK9RBN 7ZGpRQKW+MVrO1VZSVHWVaQiWoerxr/eQeHScY/mwZ0tjPdwhFXgJc9ova6/AhMg2rNE a5oLjtLwTG+EKNX30v2qrWKFpk/BJM+TXthoiiCcUzmG51Rv6RRpAksN8L4JMqlJWyTz bkMg== X-Gm-Message-State: AOAM532u2m3WHKpZAlTk1oKy0cZnz0QVzIBp4k2nsM40TWyHhVEn0vM/ XxL/6F3JglaBnzwdFrlYMkttrMwGrp/LIq67jH79XWcrmLj/QoMcxVvkvuFGKJ6lNVU0QwFJMcl 91N3vLjubNx6oIALaGRJq X-Received: by 2002:a17:90b:142:: with SMTP id em2mr3698916pjb.236.1595571228764; Thu, 23 Jul 2020 23:13:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwi0lfJxN3LUPXyniNKbGZ5zXjR4At1YzAr1dGX1T7B5reH5/EJCMd0ieU4V59nKoevcW9czQ== X-Received: by 2002:a17:90b:142:: with SMTP id em2mr3698899pjb.236.1595571228533; Thu, 23 Jul 2020 23:13:48 -0700 (PDT) Received: from xiangao.remote.csb ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id n15sm4899232pjf.12.2020.07.23.23.13.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 23:13:47 -0700 (PDT) From: Gao Xiang To: linux-xfs@vger.kernel.org Cc: Dave Chinner , "Darrick J. Wong" , Brian Foster , Gao Xiang Subject: [RFC PATCH v2 1/3] xfs: arrange all unlinked inodes into one list Date: Fri, 24 Jul 2020 14:12:57 +0800 Message-Id: <20200724061259.5519-2-hsiangkao@redhat.com> X-Mailer: git-send-email 2.18.1 In-Reply-To: <20200724061259.5519-1-hsiangkao@redhat.com> References: <20200707135741.487-1-hsiangkao@redhat.com> <20200724061259.5519-1-hsiangkao@redhat.com> Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org There is no need to keep old multiple short unlink inode buckets since we have an in-memory double linked list for all unlinked inodes. Apart from the perspective of the necessity, the main advantage is the log and AGI update can be reduced since each AG has the only one head now, which is implemented in the following patches. Therefore, this patch applies the new way in xfs_iunlink() and keep the old approach in xfs_iunlink_remove_inode() path as well so inode eviction can still work properly in recovery. Signed-off-by: Gao Xiang --- fs/xfs/xfs_inode.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ab288424764c..4994398373df 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -33,6 +33,7 @@ #include "xfs_symlink.h" #include "xfs_trans_priv.h" #include "xfs_log.h" +#include "xfs_log_priv.h" #include "xfs_bmap_btree.h" #include "xfs_reflink.h" #include "xfs_iunlink_item.h" @@ -2001,14 +2002,13 @@ xfs_iunlink_insert_inode( xfs_agino_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino); xfs_agino_t next_agino; - short bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS; /* * Get the index into the agi hash table for the list this inode will * go on. Make sure the pointer isn't garbage and that this inode * isn't already on the list. */ - next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]); + next_agino = be32_to_cpu(agi->agi_unlinked[0]); if (next_agino == agino || !xfs_verify_agino_or_null(mp, agno, next_agino)) { xfs_buf_mark_corrupt(agibp); @@ -2036,7 +2036,7 @@ xfs_iunlink_insert_inode( } /* Point the head of the list to point to this inode. */ - return xfs_iunlink_update_bucket(tp, agno, agibp, bucket_index, agino); + return xfs_iunlink_update_bucket(tp, agno, agibp, 0, agino); } /* @@ -2055,10 +2055,17 @@ xfs_iunlink_remove_inode( xfs_agino_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino); xfs_agino_t next_agino = ip->i_next_unlinked; - short bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS; int error; if (ip->i_prev_unlinked == NULLAGINO) { + struct xlog *log = mp->m_log; + short bucket_index = 0; + + /* During recovery, the old multiple bucket can be applied */ + if ((!log || log->l_flags & XLOG_RECOVERY_NEEDED) && + be32_to_cpu(agi->agi_unlinked[0]) != agino) + bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS; + /* remove from head of list */ if (be32_to_cpu(agi->agi_unlinked[bucket_index]) != agino) { xfs_buf_mark_corrupt(agibp); From patchwork Fri Jul 24 06:12:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gao Xiang X-Patchwork-Id: 11682103 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9742E138C for ; Fri, 24 Jul 2020 06:14:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7999F20767 for ; Fri, 24 Jul 2020 06:14:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="H0+oaLr7" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726178AbgGXGN7 (ORCPT ); Fri, 24 Jul 2020 02:13:59 -0400 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:26250 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726539AbgGXGN7 (ORCPT ); Fri, 24 Jul 2020 02:13:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1595571238; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=EErgKm4WWBz+GEyxWgHWlNohF/fDsGVkUyWSU2ZPBSU=; b=H0+oaLr7ic9MuB444ZRQHMrSwXWlDkPBUrmysOVfZD7npzE7RalZmWyV5cUiuhABkbU6sJ 4UIaYubURpZsBL1R/2ojwVeYdaozdZKOp/SpzcX/BxGsj+t3t8JcaxS3a5iJf02CHhUFCK 4TXe5BssmFHcK4fDGQFvrRb1QBa7tf8= Received: from mail-pj1-f71.google.com (mail-pj1-f71.google.com [209.85.216.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-126-TfppMMQANfmQQAv6KboAkw-1; Fri, 24 Jul 2020 02:13:54 -0400 X-MC-Unique: TfppMMQANfmQQAv6KboAkw-1 Received: by mail-pj1-f71.google.com with SMTP id q5so4922169pjd.3 for ; Thu, 23 Jul 2020 23:13:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=EErgKm4WWBz+GEyxWgHWlNohF/fDsGVkUyWSU2ZPBSU=; b=jEk70u4YzB81GmdEywmEAkW85ceRqJC2AKLuHqA1VjnbGW5icQ6hD67JQG8o458pAj y9b4jqkv6kLll0O/b5OjRUgpSdwq1uymQ8bpdLYHNoSBc84e6ZwmuyKptxaLTzt11iIl zAMn6WvoERc5o/QdjE/hVlw2d8GFAHR1yFktQ4E+yQ+Hg0f5Zb/DTh3xSYA/LSiMmODp 1Gjmn3wimjHNQDK+TUTQbNzGSkkZdaV8MCZ0qHEu/Se4NIkTYmdpI0M9o7SL7YYxfzkU H/xvPQ+I5yaejTKFkQcyAAJePaQ0kvpVe8vFhkH3fWkbKNgu8cUWBmvt27p7fksQgmlB Cwow== X-Gm-Message-State: AOAM530kvdNc6F/138XYD4OV0lWzAXhUYd+1GCd4304CWM6qXjwXbETG A/0MKdTpsfezjcwGRthGRazXLEGHkg61JYWSAkrdYJcxpWH7K/AK8TWeG865ouM4Jn91BZ7XJl6 sHfw7plos6UDq8f+7j6UU X-Received: by 2002:a17:90a:c915:: with SMTP id v21mr3916298pjt.48.1595571232706; Thu, 23 Jul 2020 23:13:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwXxtUF4SFs2eZgwTp7ybL6EyK1Tjj5gqtBfSIzJYs/nbIBN3KW7C2Ds/Nq3ZG/zO5qOKy6tQ== X-Received: by 2002:a17:90a:c915:: with SMTP id v21mr3916279pjt.48.1595571232431; Thu, 23 Jul 2020 23:13:52 -0700 (PDT) Received: from xiangao.remote.csb ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id n15sm4899232pjf.12.2020.07.23.23.13.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 23:13:52 -0700 (PDT) From: Gao Xiang To: linux-xfs@vger.kernel.org Cc: Dave Chinner , "Darrick J. Wong" , Brian Foster , Gao Xiang Subject: [RFC PATCH v2 2/3] xfs: introduce perag iunlink lock Date: Fri, 24 Jul 2020 14:12:58 +0800 Message-Id: <20200724061259.5519-3-hsiangkao@redhat.com> X-Mailer: git-send-email 2.18.1 In-Reply-To: <20200724061259.5519-1-hsiangkao@redhat.com> References: <20200707135741.487-1-hsiangkao@redhat.com> <20200724061259.5519-1-hsiangkao@redhat.com> Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Currently, we use AGI buffer lock to protect in-memory linked list for unlinked inodes but since it's not necessary to modify AGI unless the head of the unlinked list is modified. Therefore, let's add another per-AG dedicated lock to protect the whole list, and removing AGI buffer lock in xfs_iunlink_remove() if the head is untouched. Note that some paths (e.g. xfs_create_tmpfile()) take AGI lock in its transaction in advance, so the proper locking order is only AGI lock -> unlinked list lock. Signed-off-by: Gao Xiang --- fs/xfs/xfs_inode.c | 79 ++++++++++++++++++++++++++++++++++++++-------- fs/xfs/xfs_mount.c | 1 + fs/xfs/xfs_mount.h | 3 ++ 3 files changed, 70 insertions(+), 13 deletions(-) diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 4994398373df..d78aaa8ce252 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2039,6 +2039,46 @@ xfs_iunlink_insert_inode( return xfs_iunlink_update_bucket(tp, agno, agibp, 0, agino); } +/* + * Lock the perag and take AGI lock if agi_unlinked is touched as well for + * xfs_iunlink_remove_inode(). + * + * Inode allocation in the O_TMPFILE path defines the AGI/unlinked list lock + * order as being AGI->perag unlinked list lock. We are inverting it here as + * the fast path tail addition does not need to modify the AGI at all. Hence + * we only need the AGI lock if the tail is empty, correct lock order. + */ +static struct xfs_perag * +xfs_iunlink_remove_lock( + xfs_agino_t agno, + struct xfs_trans *tp, + struct xfs_inode *ip, + struct xfs_buf **agibpp) +{ + struct xfs_mount *mp = tp->t_mountp; + struct xfs_perag *pag; + int error; + + pag = xfs_perag_get(mp, agno); + mutex_lock(&pag->pag_unlinked_mutex); + + if (ip->i_prev_unlinked == NULLAGINO) { + mutex_unlock(&pag->pag_unlinked_mutex); + /* + * some paths (e.g. xfs_create_tmpfile) could take AGI lock + * in this transaction in advance and the only locking order + * AGI buf lock -> pag_unlinked_mutex is safe. + */ + error = xfs_read_agi(mp, tp, agno, agibpp); + if (error) { + xfs_perag_put(pag); + return ERR_PTR(error); + } + mutex_lock(&pag->pag_unlinked_mutex); + } + return pag; +} + /* * Remove can be from anywhere in the list, so we have to do two adjacent inode * lookups here so we can update list pointers. We may be at the head or the @@ -2046,12 +2086,12 @@ xfs_iunlink_insert_inode( */ static int xfs_iunlink_remove_inode( + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agibp, struct xfs_inode *ip) { struct xfs_mount *mp = tp->t_mountp; - struct xfs_agi *agi = agibp->b_addr; xfs_agino_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino); xfs_agino_t next_agino = ip->i_next_unlinked; @@ -2059,6 +2099,7 @@ xfs_iunlink_remove_inode( if (ip->i_prev_unlinked == NULLAGINO) { struct xlog *log = mp->m_log; + struct xfs_agi *agi = agibp->b_addr; short bucket_index = 0; /* During recovery, the old multiple bucket can be applied */ @@ -2083,7 +2124,7 @@ xfs_iunlink_remove_inode( /* lookup previous inode and update to point at next */ struct xfs_inode *pip; - pip = xfs_iunlink_lookup_prev(agibp->b_pag, ip); + pip = xfs_iunlink_lookup_prev(pag, ip); if (IS_ERR_OR_NULL(pip)) return -EFSCORRUPTED; @@ -2102,7 +2143,7 @@ xfs_iunlink_remove_inode( if (ip->i_next_unlinked != NULLAGINO) { struct xfs_inode *nip; - nip = xfs_iunlink_lookup_next(agibp->b_pag, ip); + nip = xfs_iunlink_lookup_next(pag, ip); if (IS_ERR_OR_NULL(nip)) return -EFSCORRUPTED; @@ -2126,6 +2167,15 @@ xfs_iunlink_remove_inode( return 0; } +static void +xfs_iunlink_unlock( + struct xfs_perag *pag) +{ + /* Does not unlock AGI, ever. xfs_trans_commit() does that. */ + mutex_unlock(&pag->pag_unlinked_mutex); + xfs_perag_put(pag); +} + /* * This is called when the inode's link count has gone to 0 or we are creating * a tmpfile via O_TMPFILE. The inode @ip must have nlink == 0. @@ -2143,7 +2193,6 @@ xfs_iunlink( xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); int error; - ASSERT(ip->i_next_unlinked == NULLAGINO); ASSERT(VFS_I(ip)->i_nlink == 0); ASSERT(VFS_I(ip)->i_mode != 0); trace_xfs_iunlink(ip); @@ -2153,7 +2202,10 @@ xfs_iunlink( if (error) return error; - return xfs_iunlink_insert_inode(tp, agibp, ip); + mutex_lock(&agibp->b_pag->pag_unlinked_mutex); + error = xfs_iunlink_insert_inode(tp, agibp, ip); + mutex_unlock(&agibp->b_pag->pag_unlinked_mutex); + return error; } /* @@ -2164,19 +2216,20 @@ xfs_iunlink_remove( struct xfs_trans *tp, struct xfs_inode *ip) { - struct xfs_mount *mp = tp->t_mountp; - struct xfs_buf *agibp; - xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); + struct xfs_buf *agibp = NULL; + struct xfs_perag *pag; int error; trace_xfs_iunlink_remove(ip); - /* Get the agi buffer first. It ensures lock ordering on the list. */ - error = xfs_read_agi(mp, tp, agno, &agibp); - if (error) - return error; + pag = xfs_iunlink_remove_lock(XFS_INO_TO_AGNO(tp->t_mountp, ip->i_ino), + tp, ip, &agibp); + if (IS_ERR(pag)) + return PTR_ERR(pag); - return xfs_iunlink_remove_inode(tp, agibp, ip); + error = xfs_iunlink_remove_inode(pag, tp, agibp, ip); + xfs_iunlink_unlock(pag); + return error; } /* diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 031e96ff022d..f94e14059e61 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -223,6 +223,7 @@ xfs_initialize_perag( if (first_initialised == NULLAGNUMBER) first_initialised = index; spin_lock_init(&pag->pag_state_lock); + mutex_init(&pag->pag_unlinked_mutex); } index = xfs_set_inode_alloc(mp, agcount); diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index a72cfcaa4ad1..9a1d0f239fa4 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -372,6 +372,9 @@ typedef struct xfs_perag { /* reference count */ uint8_t pagf_refcount_level; + /* lock to protect unlinked inode list */ + struct mutex pag_unlinked_mutex; + /* * Unlinked inode information. This incore information reflects * data stored in the AGI, so callers must hold the AGI buffer lock From patchwork Fri Jul 24 06:12:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gao Xiang X-Patchwork-Id: 11682105 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2ED69138C for ; Fri, 24 Jul 2020 06:14:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1133E20767 for ; Fri, 24 Jul 2020 06:14:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="VDQyrPgP" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726525AbgGXGOC (ORCPT ); Fri, 24 Jul 2020 02:14:02 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:31851 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726539AbgGXGOC (ORCPT ); Fri, 24 Jul 2020 02:14:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1595571240; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:in-reply-to:in-reply-to:references:references; bh=SqeFsJ0aJwClQe3JpaTAOGzLruqTfdvKRMMbU8DxMEE=; b=VDQyrPgPhy0OC9PGE6yIAUqbSO5PNC7GoQJl6S7wxudRmxyQcG6RWQzb6rxxlewtuD0uxm YYu5ceA2Y/0HvnnABt25pvfzXP/3yG/Yw9cUdWc3JOxk9W02Bu49XFf+Rlplkii/fvYplj l+MNfp2AVM3J/PL016fOY2cxdze87UI= Received: from mail-pf1-f199.google.com (mail-pf1-f199.google.com [209.85.210.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-366-sGSMASS9NWuchUa7EA8TaA-1; Fri, 24 Jul 2020 02:13:58 -0400 X-MC-Unique: sGSMASS9NWuchUa7EA8TaA-1 Received: by mail-pf1-f199.google.com with SMTP id h75so5588101pfe.23 for ; Thu, 23 Jul 2020 23:13:57 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SqeFsJ0aJwClQe3JpaTAOGzLruqTfdvKRMMbU8DxMEE=; b=eBBg/ZAlC2cEyx7FWZMLuKt+wq9f3HsfwhfpT9nph6zjAKwibyX81Bdk374Cdfjgy9 0zGidB7UnXlHXKf86vdF13Uq56giwhTzNfSlwvLYnc0kfFsgRRvcF8IAspbbax3ioVMW x6WWLnoil2RVIKDTzqLIAcoTe7ZItxPgur8Y1miLDfhbfv/GG9caRiZ/8eldsJpeknfX qw9nS6Wzq91GmTdrgloJLuV4LrYagS1A8udh0CXRlmmAE+Uaa9S7wqHEiBLk6NfWmcXD QYlrRenMT88a1E2MfpuxsYxKxPTv7RSdBqA/wQGyYksVH6sMq4bPYDoWqMPps+fMt8yS zVWA== X-Gm-Message-State: AOAM533jCaNOjK8yzqremlJi8Gkq+v/2ii07RY9fON4mudyGXE8Leb7i K9ZeYdXbfWcUKK1d8m7ylJ3OQsiNOULew6cFLs01viXiHhYRkE8EhlpdQyFykCs3Cop6QNogQHC cQyqakZrnV9mvSXxJ6k9i X-Received: by 2002:a62:1e43:: with SMTP id e64mr7010914pfe.249.1595571236612; Thu, 23 Jul 2020 23:13:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz6vhxj0Lfw9p5TKyz28cO1zM1mBzm0ezYOw0ZeGVlWlMW5lwOsZ8tSlMMsJg4oFV/OP5dv1g== X-Received: by 2002:a62:1e43:: with SMTP id e64mr7010894pfe.249.1595571236251; Thu, 23 Jul 2020 23:13:56 -0700 (PDT) Received: from xiangao.remote.csb ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id n15sm4899232pjf.12.2020.07.23.23.13.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 23 Jul 2020 23:13:55 -0700 (PDT) From: Gao Xiang To: linux-xfs@vger.kernel.org Cc: Dave Chinner , "Darrick J. Wong" , Brian Foster , Gao Xiang Subject: [RFC PATCH v2 3/3] xfs: insert unlinked inodes from tail Date: Fri, 24 Jul 2020 14:12:59 +0800 Message-Id: <20200724061259.5519-4-hsiangkao@redhat.com> X-Mailer: git-send-email 2.18.1 In-Reply-To: <20200724061259.5519-1-hsiangkao@redhat.com> References: <20200707135741.487-1-hsiangkao@redhat.com> <20200724061259.5519-1-hsiangkao@redhat.com> Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Currently, AGI buffer is always touched since xfs_iunlink() adds unlinked inodes from head unconditionally, but since we now have the only one unlinked list and if we insert unlinked inodes from tail instead and there're more than 1 inodes, AGI buffer modification could be avoided. In order to do that, let's keep track of the tail of unlinked inode into the perag all the time in order for xfs_iunlink() to use. With this change, it shows that only 938 of 10000 operations modifies the head of unlinked list with the following workload: seq 1 10000 | xargs touch find . | xargs -n3 -P100 rm Signed-off-by: Gao Xiang --- fs/xfs/xfs_inode.c | 120 ++++++++++++++++++++++++--------------- fs/xfs/xfs_log_recover.c | 5 ++ fs/xfs/xfs_mount.c | 1 + fs/xfs/xfs_mount.h | 3 + 4 files changed, 84 insertions(+), 45 deletions(-) diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index d78aaa8ce252..3cfd84b76955 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -1986,6 +1986,38 @@ xfs_iunlink_update_bucket( return 0; } +/* + * Lock the perag and take AGI lock if agi_unlinked is touched as well + * for xfs_iunlink_insert_inode(). As for the details of locking order, + * refer to the comments of xfs_iunlink_remove_lock(). + */ +static struct xfs_perag * +xfs_iunlink_insert_lock( + xfs_agino_t agno, + struct xfs_trans *tp, + struct xfs_inode *ip, + struct xfs_buf **agibpp) +{ + struct xfs_mount *mp = tp->t_mountp; + struct xfs_perag *pag; + int error; + + pag = xfs_perag_get(mp, agno); + mutex_lock(&pag->pag_unlinked_mutex); + + if (!pag->pag_unlinked_tail) { + mutex_unlock(&pag->pag_unlinked_mutex); + + error = xfs_read_agi(mp, tp, agno, agibpp); + if (error) { + xfs_perag_put(pag); + return ERR_PTR(error); + } + mutex_lock(&pag->pag_unlinked_mutex); + } + return pag; +} + /* * Always insert at the head, so we only have to do a next inode lookup to * update it's prev pointer. The AGI bucket will point at the one we are @@ -1993,50 +2025,47 @@ xfs_iunlink_update_bucket( */ static int xfs_iunlink_insert_inode( + struct xfs_perag *pag, struct xfs_trans *tp, struct xfs_buf *agibp, struct xfs_inode *ip) { struct xfs_mount *mp = tp->t_mountp; - struct xfs_agi *agi = agibp->b_addr; - xfs_agino_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); + struct xfs_inode *pip = pag->pag_unlinked_tail; xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino); xfs_agino_t next_agino; - /* - * Get the index into the agi hash table for the list this inode will - * go on. Make sure the pointer isn't garbage and that this inode - * isn't already on the list. - */ - next_agino = be32_to_cpu(agi->agi_unlinked[0]); - if (next_agino == agino || - !xfs_verify_agino_or_null(mp, agno, next_agino)) { - xfs_buf_mark_corrupt(agibp); - return -EFSCORRUPTED; - } - - ip->i_prev_unlinked = NULLAGINO; - ip->i_next_unlinked = next_agino; - if (ip->i_next_unlinked != NULLAGINO) { - struct xfs_inode *nip; - - nip = xfs_iunlink_lookup_next(agibp->b_pag, ip); - if (IS_ERR_OR_NULL(nip)) - return -EFSCORRUPTED; + if (!pip) { + xfs_agino_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); + struct xfs_agi *agi = agibp->b_addr; + int error; - if (nip->i_prev_unlinked != NULLAGINO) { - xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, - NULL, 0, __this_address); + ip->i_prev_unlinked = NULLAGINO; + /* + * Get the index into the agi hash table for the list this + * inode will go on. Make sure the pointer isn't garbage + * and that this inode isn't already on the list. + */ + next_agino = be32_to_cpu(agi->agi_unlinked[0]); + if (next_agino == agino || + !xfs_verify_agino_or_null(mp, agno, next_agino)) { + xfs_buf_mark_corrupt(agibp); return -EFSCORRUPTED; } - nip->i_prev_unlinked = agino; - /* update the on disk inode now */ - xfs_iunlink_log(tp, ip); + /* Point the head of the list to point to this inode. */ + error = xfs_iunlink_update_bucket(tp, agno, agibp, 0, agino); + if (error) + return error; + } else { + ip->i_prev_unlinked = XFS_INO_TO_AGINO(mp, pip->i_ino); + ASSERT(pip->i_next_unlinked == NULLAGINO); + pip->i_next_unlinked = agino; + xfs_iunlink_log(tp, pip); } - - /* Point the head of the list to point to this inode. */ - return xfs_iunlink_update_bucket(tp, agno, agibp, 0, agino); + ip->i_next_unlinked = NULLAGINO; + pag->pag_unlinked_tail = ip; + return 0; } /* @@ -2095,6 +2124,7 @@ xfs_iunlink_remove_inode( xfs_agino_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); xfs_agino_t agino = XFS_INO_TO_AGINO(mp, ip->i_ino); xfs_agino_t next_agino = ip->i_next_unlinked; + struct xfs_inode *pip = NULL; int error; if (ip->i_prev_unlinked == NULLAGINO) { @@ -2122,8 +2152,6 @@ xfs_iunlink_remove_inode( return -EFSCORRUPTED; } else { /* lookup previous inode and update to point at next */ - struct xfs_inode *pip; - pip = xfs_iunlink_lookup_prev(pag, ip); if (IS_ERR_OR_NULL(pip)) return -EFSCORRUPTED; @@ -2139,8 +2167,12 @@ xfs_iunlink_remove_inode( xfs_iunlink_log(tp, pip); } - /* lookup the next inode and update to point at prev */ - if (ip->i_next_unlinked != NULLAGINO) { + if (next_agino == NULLAGINO) { + /* only care about the tail of bucket 0 for xfs_iunlink() */ + if (pag->pag_unlinked_tail == ip) + pag->pag_unlinked_tail = pip; + } else { + /* lookup the next inode and update to point at prev */ struct xfs_inode *nip; nip = xfs_iunlink_lookup_next(pag, ip); @@ -2188,23 +2220,21 @@ xfs_iunlink( struct xfs_trans *tp, struct xfs_inode *ip) { - struct xfs_mount *mp = tp->t_mountp; - struct xfs_buf *agibp; - xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, ip->i_ino); + struct xfs_buf *agibp = NULL; + struct xfs_perag *pag; int error; ASSERT(VFS_I(ip)->i_nlink == 0); ASSERT(VFS_I(ip)->i_mode != 0); trace_xfs_iunlink(ip); - /* Get the agi buffer first. It ensures lock ordering on the list. */ - error = xfs_read_agi(mp, tp, agno, &agibp); - if (error) - return error; + pag = xfs_iunlink_insert_lock(XFS_INO_TO_AGNO(tp->t_mountp, ip->i_ino), + tp, ip, &agibp); + if (IS_ERR(pag)) + return PTR_ERR(pag); - mutex_lock(&agibp->b_pag->pag_unlinked_mutex); - error = xfs_iunlink_insert_inode(tp, agibp, ip); - mutex_unlock(&agibp->b_pag->pag_unlinked_mutex); + error = xfs_iunlink_insert_inode(pag, tp, agibp, ip); + xfs_iunlink_unlock(pag); return error; } diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index d47eea31c165..30198aea7335 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2801,6 +2801,11 @@ xlog_recover_iunlink_ag( prev_ip->i_next_unlinked = NULLAGINO; break; } + + /* XXX: take pag_unlinked_mutex across the loop? */ + if (!bucket) + agibp->b_pag->pag_unlinked_tail = ip; + if (prev_ip) { ip->i_prev_unlinked = prev_agino; xfs_irele(prev_ip); diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index f94e14059e61..f1cd3e9c4da5 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -224,6 +224,7 @@ xfs_initialize_perag( first_initialised = index; spin_lock_init(&pag->pag_state_lock); mutex_init(&pag->pag_unlinked_mutex); + pag->pag_unlinked_tail = NULL; } index = xfs_set_inode_alloc(mp, agcount); diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index 9a1d0f239fa4..934e7c373042 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -375,6 +375,9 @@ typedef struct xfs_perag { /* lock to protect unlinked inode list */ struct mutex pag_unlinked_mutex; + /* point to the inode tail of AGI unlinked bucket 0 */ + struct xfs_inode *pag_unlinked_tail; + /* * Unlinked inode information. This incore information reflects * data stored in the AGI, so callers must hold the AGI buffer lock