From patchwork Sat Dec 9 12:21:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 13486028 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CFD2710C4 for ; Sat, 9 Dec 2023 04:17:40 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.163.252]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4SnRpl15Y7z14LCY; Sat, 9 Dec 2023 20:17:35 +0800 (CST) Received: from kwepemi500009.china.huawei.com (unknown [7.221.188.199]) by mail.maildlp.com (Postfix) with ESMTPS id 5FF0C1800CE; Sat, 9 Dec 2023 20:17:38 +0800 (CST) Received: from localhost.localdomain (10.175.127.227) by kwepemi500009.china.huawei.com (7.221.188.199) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Sat, 9 Dec 2023 20:17:37 +0800 From: Long Li To: , CC: , , , , Subject: [PATCH v2 1/3] xfs: add lock protection when remove perag from radix tree Date: Sat, 9 Dec 2023 20:21:05 +0800 Message-ID: <20231209122107.2422441-1-leo.lilong@huawei.com> X-Mailer: git-send-email 2.31.1 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To kwepemi500009.china.huawei.com (7.221.188.199) Take mp->m_perag_lock for deletions from the perag radix tree in xfs_initialize_perag to prevent racing with tagging operations. Lookups are fine - they are RCU protected so already deal with the tree changing shape underneath the lookup - but tagging operations require the tree to be stable while the tags are propagated back up to the root. Right now there's nothing stopping radix tree tagging from operating while a growfs operation is progress and adding/removing new entries into the radix tree. Hence we can have traversals that require a stable tree occurring at the same time we are removing unused entries from the radix tree which causes the shape of the tree to change. Likely this hasn't caused a problem in the past because we are only doing append addition and removal so the active AG part of the tree is not changing shape, but that doesn't mean it is safe. Just making the radix tree modifications serialise against each other is obviously correct. Signed-off-by: Long Li Reviewed-by: Christoph Hellwig --- v2: - Refer to Dave's explanation and rewrite the commit message. fs/xfs/libxfs/xfs_ag.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index f9f4d694640d..cc10a3ca052f 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -424,13 +424,17 @@ xfs_initialize_perag( out_remove_pag: xfs_defer_drain_free(&pag->pag_intents_drain); + spin_lock(&mp->m_perag_lock); radix_tree_delete(&mp->m_perag_tree, index); + spin_unlock(&mp->m_perag_lock); out_free_pag: kmem_free(pag); out_unwind_new_pags: /* unwind any prior newly initialized pags */ for (index = first_initialised; index < agcount; index++) { + spin_lock(&mp->m_perag_lock); pag = radix_tree_delete(&mp->m_perag_tree, index); + spin_unlock(&mp->m_perag_lock); if (!pag) break; xfs_buf_hash_destroy(pag); From patchwork Sat Dec 9 12:21:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 13486029 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 146B410DF for ; Sat, 9 Dec 2023 04:17:44 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.162.254]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4SnRpq6XypzYd1b; Sat, 9 Dec 2023 20:17:39 +0800 (CST) Received: from kwepemi500009.china.huawei.com (unknown [7.221.188.199]) by mail.maildlp.com (Postfix) with ESMTPS id 2A2DA180071; Sat, 9 Dec 2023 20:17:42 +0800 (CST) Received: from localhost.localdomain (10.175.127.227) by kwepemi500009.china.huawei.com (7.221.188.199) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Sat, 9 Dec 2023 20:17:41 +0800 From: Long Li To: , CC: , , , , Subject: [PATCH v2 2/3] xfs: don't assert perag when free perag Date: Sat, 9 Dec 2023 20:21:06 +0800 Message-ID: <20231209122107.2422441-2-leo.lilong@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231209122107.2422441-1-leo.lilong@huawei.com> References: <20231209122107.2422441-1-leo.lilong@huawei.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To kwepemi500009.china.huawei.com (7.221.188.199) When releasing the perag in xfs_free_perag(), the assertion that the perag in readix tree is correct in most cases. However, there is one corner case where the assertion is not true. During log recovery, the AGs become visible(that is included in mp->m_sb.sb_agcount) first, and then the perag is initialized. If the initialization of the perag fails, the assertion will be triggered. Worse yet, null pointer dereferencing can occur. Signed-off-by: Long Li --- fs/xfs/libxfs/xfs_ag.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index cc10a3ca052f..11ed048c350c 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -258,7 +258,8 @@ xfs_free_perag( spin_lock(&mp->m_perag_lock); pag = radix_tree_delete(&mp->m_perag_tree, agno); spin_unlock(&mp->m_perag_lock); - ASSERT(pag); + if (!pag) + break; XFS_IS_CORRUPT(pag->pag_mount, atomic_read(&pag->pag_ref) != 0); xfs_defer_drain_free(&pag->pag_intents_drain); From patchwork Sat Dec 9 12:21:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Long Li X-Patchwork-Id: 13486030 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BFF4510C4 for ; Sat, 9 Dec 2023 04:17:46 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.163.48]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4SnRpt2fGwz1Q6D7; Sat, 9 Dec 2023 20:17:42 +0800 (CST) Received: from kwepemi500009.china.huawei.com (unknown [7.221.188.199]) by mail.maildlp.com (Postfix) with ESMTPS id E0742180068; Sat, 9 Dec 2023 20:17:44 +0800 (CST) Received: from localhost.localdomain (10.175.127.227) by kwepemi500009.china.huawei.com (7.221.188.199) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Sat, 9 Dec 2023 20:17:44 +0800 From: Long Li To: , CC: , , , , Subject: [PATCH v2 3/3] xfs: fix perag leak when growfs fails Date: Sat, 9 Dec 2023 20:21:07 +0800 Message-ID: <20231209122107.2422441-3-leo.lilong@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231209122107.2422441-1-leo.lilong@huawei.com> References: <20231209122107.2422441-1-leo.lilong@huawei.com> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To kwepemi500009.china.huawei.com (7.221.188.199) During growfs, if new ag in memory has been initialized, however sb_agcount has not been updated, if an error occurs at this time it will cause perag leaks as follows, these new AGs will not been freed during umount , because of these new AGs are not visible(that is included in mp->m_sb.sb_agcount). unreferenced object 0xffff88810be40200 (size 512): comm "xfs_growfs", pid 857, jiffies 4294909093 hex dump (first 32 bytes): 00 c0 c1 05 81 88 ff ff 04 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 381741e2): [] __kmalloc+0x386/0x4f0 [] kmem_alloc+0xb5/0x2f0 [] xfs_initialize_perag+0xc5/0x810 [] xfs_growfs_data+0x9bc/0xbc0 [] xfs_file_ioctl+0x5fe/0x14d0 [] __x64_sys_ioctl+0x144/0x1c0 [] do_syscall_64+0x3f/0xe0 [] entry_SYSCALL_64_after_hwframe+0x62/0x6a unreferenced object 0xffff88810be40800 (size 512): comm "xfs_growfs", pid 857, jiffies 4294909093 hex dump (first 32 bytes): 20 00 00 00 00 00 00 00 57 ef be dc 00 00 00 00 .......W....... 10 08 e4 0b 81 88 ff ff 10 08 e4 0b 81 88 ff ff ................ backtrace (crc bde50e2d): [] __kmalloc_node+0x3da/0x540 [] kvmalloc_node+0x99/0x160 [] bucket_table_alloc.isra.0+0x5f/0x400 [] rhashtable_init+0x405/0x760 [] xfs_initialize_perag+0x3a3/0x810 [] xfs_growfs_data+0x9bc/0xbc0 [] xfs_file_ioctl+0x5fe/0x14d0 [] __x64_sys_ioctl+0x144/0x1c0 [] do_syscall_64+0x3f/0xe0 [] entry_SYSCALL_64_after_hwframe+0x62/0x6a Now, the logic for freeing perag in xfs_free_perag() and xfs_initialize_perag() error handling is essentially the same. Factor out xfs_free_perag_range() from xfs_free_perag(), used for freeing unused perag within a specified range, inclued when growfs fails. Signed-off-by: Long Li --- fs/xfs/libxfs/xfs_ag.c | 35 ++++++++++++++++++++--------------- fs/xfs/libxfs/xfs_ag.h | 2 ++ fs/xfs/xfs_fsops.c | 5 ++++- 3 files changed, 26 insertions(+), 16 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index 11ed048c350c..edec03ab09aa 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -245,16 +245,20 @@ __xfs_free_perag( } /* - * Free up the per-ag resources associated with the mount structure. + * Free per-ag within the specified range, if agno is not found in the + * radix tree, then it means that agno and subsequent AGs have not been + * initialized. */ void -xfs_free_perag( - struct xfs_mount *mp) +xfs_free_perag_range( + xfs_mount_t *mp, + xfs_agnumber_t agstart, + xfs_agnumber_t agend) { - struct xfs_perag *pag; xfs_agnumber_t agno; + struct xfs_perag *pag; - for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) { + for (agno = agstart; agno < agend; agno++) { spin_lock(&mp->m_perag_lock); pag = radix_tree_delete(&mp->m_perag_tree, agno); spin_unlock(&mp->m_perag_lock); @@ -274,6 +278,16 @@ xfs_free_perag( } } +/* + * Free up the per-ag resources associated with the mount structure. + */ +void +xfs_free_perag( + struct xfs_mount *mp) +{ + xfs_free_perag_range(mp, 0, mp->m_sb.sb_agcount); +} + /* Find the size of the AG, in blocks. */ static xfs_agblock_t __xfs_ag_block_count( @@ -432,16 +446,7 @@ xfs_initialize_perag( kmem_free(pag); out_unwind_new_pags: /* unwind any prior newly initialized pags */ - for (index = first_initialised; index < agcount; index++) { - spin_lock(&mp->m_perag_lock); - pag = radix_tree_delete(&mp->m_perag_tree, index); - spin_unlock(&mp->m_perag_lock); - if (!pag) - break; - xfs_buf_hash_destroy(pag); - xfs_defer_drain_free(&pag->pag_intents_drain); - kmem_free(pag); - } + xfs_free_perag_range(mp, first_initialised, agcount); return error; } diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h index 2e0aef87d633..927f737f84ec 100644 --- a/fs/xfs/libxfs/xfs_ag.h +++ b/fs/xfs/libxfs/xfs_ag.h @@ -136,6 +136,8 @@ __XFS_AG_OPSTATE(agfl_needs_reset, AGFL_NEEDS_RESET) int xfs_initialize_perag(struct xfs_mount *mp, xfs_agnumber_t agcount, xfs_rfsblock_t dcount, xfs_agnumber_t *maxagi); int xfs_initialize_perag_data(struct xfs_mount *mp, xfs_agnumber_t agno); +void xfs_free_perag_range(xfs_mount_t *mp, xfs_agnumber_t agstart, + xfs_agnumber_t agend); void xfs_free_perag(struct xfs_mount *mp); /* Passive AG references */ diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c index 57076a25f17d..144fea4374af 100644 --- a/fs/xfs/xfs_fsops.c +++ b/fs/xfs/xfs_fsops.c @@ -153,7 +153,7 @@ xfs_growfs_data_private( error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata, -delta, 0, 0, &tp); if (error) - return error; + goto out_free_unused_perag; last_pag = xfs_perag_get(mp, oagcount - 1); if (delta > 0) { @@ -227,6 +227,9 @@ xfs_growfs_data_private( out_trans_cancel: xfs_trans_cancel(tp); +out_free_unused_perag: + if (nagcount > oagcount) + xfs_free_perag_range(mp, oagcount, nagcount); return error; }