From patchwork Fri Dec 30 22:18:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085509 Subject: [PATCH 01/42] xfs: prepare refcount btree
cursor tracepoints for realtime From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:29 -0800 Message-ID: <167243870911.717073.3631164146808046063.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Rework the refcount btree cursor tracepoints in preparation to handle the realtime refcount btree cursor. Mostly this involves renaming the field to "refcbno" and extracting the group number from the cursor when possible. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_trace.c | 9 ++++ fs/xfs/xfs_trace.h | 114 ++++++++++++++++++++++++++++++---------------------- 2 files changed, 74 insertions(+), 49 deletions(-) diff --git a/fs/xfs/xfs_trace.c b/fs/xfs/xfs_trace.c index 0b9405749079..64f11a535763 100644 --- a/fs/xfs/xfs_trace.c +++ b/fs/xfs/xfs_trace.c @@ -64,6 +64,15 @@ xfs_rmapbt_crack_agno_opdev( } } +static inline void +xfs_refcountbt_crack_agno_opdev( + struct xfs_btree_cur *cur, + xfs_agnumber_t *agno, + dev_t *opdev) +{ + return xfs_rmapbt_crack_agno_opdev(cur, agno, opdev); +} + /* * We include this last to have the helpers above available for the trace * event implementations. diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index c22ffe459002..4e0c40934a7f 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -22,6 +22,8 @@ * * rmapbno: physical block number for a reverse mapping. This is an agbno for * per-AG rmap btrees or a rgbno for realtime rmap btrees. + * refcbno: physical block number for a refcount record. This is an agbno for + * per-AG refcount btrees or a rgbno for realtime refcount btrees. 
* * daddr: physical block number in 512b blocks * bbcount: number of blocks in a physical extent, in 512b blocks @@ -3230,56 +3232,60 @@ DEFINE_AG_ERROR_EVENT(xfs_ag_resv_init_error); /* refcount tracepoint classes */ DECLARE_EVENT_CLASS(xfs_refcount_class, - TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t agbno, + TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t refcbno, xfs_extlen_t len), - TP_ARGS(cur, agbno, len), + TP_ARGS(cur, refcbno, len), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, refcbno) __field(xfs_extlen_t, len) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; - __entry->agbno = agbno; + xfs_refcountbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); + __entry->refcbno = refcbno; __entry->len = len; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x fsbcount 0x%x", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x refcbno 0x%x fsbcount 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, - __entry->agbno, + __entry->refcbno, __entry->len) ); #define DEFINE_REFCOUNT_EVENT(name) \ DEFINE_EVENT(xfs_refcount_class, name, \ - TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t agbno, \ + TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t refcbno, \ xfs_extlen_t len), \ - TP_ARGS(cur, agbno, len)) + TP_ARGS(cur, refcbno, len)) TRACE_DEFINE_ENUM(XFS_LOOKUP_EQi); TRACE_DEFINE_ENUM(XFS_LOOKUP_LEi); TRACE_DEFINE_ENUM(XFS_LOOKUP_GEi); TRACE_EVENT(xfs_refcount_lookup, - TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t agbno, + TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t refcbno, xfs_lookup_t dir), - TP_ARGS(cur, agbno, dir), + TP_ARGS(cur, refcbno, dir), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, refcbno) __field(xfs_lookup_t, dir) ), 
TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; - __entry->agbno = agbno; + xfs_refcountbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); + __entry->refcbno = refcbno; __entry->dir = dir; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x cmp %s(%d)", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x refcbno 0x%x cmp %s(%d)", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, - __entry->agbno, + __entry->refcbno, __print_symbolic(__entry->dir, XFS_AG_BTREE_CMP_FORMAT_STR), __entry->dir) ) @@ -3290,6 +3296,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_extent_class, TP_ARGS(cur, irec), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, domain) __field(xfs_agblock_t, startblock) @@ -3298,14 +3305,15 @@ DECLARE_EVENT_CLASS(xfs_refcount_extent_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; + xfs_refcountbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); __entry->domain = irec->rc_domain; __entry->startblock = irec->rc_startblock; __entry->blockcount = irec->rc_blockcount; __entry->refcount = irec->rc_refcount; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x dom %s refcbno 0x%x fsbcount 0x%x refcount %u", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, __print_symbolic(__entry->domain, XFS_REFC_DOMAIN_STRINGS), __entry->startblock, @@ -3321,49 +3329,52 @@ DEFINE_EVENT(xfs_refcount_extent_class, name, \ /* single-rcext and an agbno tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_extent_at_class, TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *irec, - xfs_agblock_t agbno), - TP_ARGS(cur, irec, agbno), + xfs_agblock_t refcbno), + TP_ARGS(cur, irec, refcbno), TP_STRUCT__entry( 
__field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, domain) __field(xfs_agblock_t, startblock) __field(xfs_extlen_t, blockcount) __field(xfs_nlink_t, refcount) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, refcbno) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; + xfs_refcountbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); __entry->domain = irec->rc_domain; __entry->startblock = irec->rc_startblock; __entry->blockcount = irec->rc_blockcount; __entry->refcount = irec->rc_refcount; - __entry->agbno = agbno; + __entry->refcbno = refcbno; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u @ agbno 0x%x", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x dom %s refcbno 0x%x fsbcount 0x%x refcount %u @ refcbno 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, __print_symbolic(__entry->domain, XFS_REFC_DOMAIN_STRINGS), __entry->startblock, __entry->blockcount, __entry->refcount, - __entry->agbno) + __entry->refcbno) ) #define DEFINE_REFCOUNT_EXTENT_AT_EVENT(name) \ DEFINE_EVENT(xfs_refcount_extent_at_class, name, \ TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *irec, \ - xfs_agblock_t agbno), \ - TP_ARGS(cur, irec, agbno)) + xfs_agblock_t refcbno), \ + TP_ARGS(cur, irec, refcbno)) /* double-rcext tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_class, TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *i1, - struct xfs_refcount_irec *i2), + struct xfs_refcount_irec *i2), TP_ARGS(cur, i1, i2), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, i1_domain) __field(xfs_agblock_t, i1_startblock) @@ -3376,7 +3387,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = 
cur->bc_ag.pag->pag_agno; + xfs_refcountbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); __entry->i1_domain = i1->rc_domain; __entry->i1_startblock = i1->rc_startblock; __entry->i1_blockcount = i1->rc_blockcount; @@ -3386,9 +3397,10 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_class, __entry->i2_blockcount = i2->rc_blockcount; __entry->i2_refcount = i2->rc_refcount; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x dom %s refcbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s refcbno 0x%x fsbcount 0x%x refcount %u", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, __print_symbolic(__entry->i1_domain, XFS_REFC_DOMAIN_STRINGS), __entry->i1_startblock, @@ -3409,10 +3421,11 @@ DEFINE_EVENT(xfs_refcount_double_extent_class, name, \ /* double-rcext and an agbno tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *i1, - struct xfs_refcount_irec *i2, xfs_agblock_t agbno), - TP_ARGS(cur, i1, i2, agbno), + struct xfs_refcount_irec *i2, xfs_agblock_t refcbno), + TP_ARGS(cur, i1, i2, refcbno), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, i1_domain) __field(xfs_agblock_t, i1_startblock) @@ -3422,11 +3435,11 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, __field(xfs_agblock_t, i2_startblock) __field(xfs_extlen_t, i2_blockcount) __field(xfs_nlink_t, i2_refcount) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, refcbno) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; + xfs_refcountbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); __entry->i1_domain = i1->rc_domain; __entry->i1_startblock = i1->rc_startblock; __entry->i1_blockcount = 
i1->rc_blockcount; @@ -3435,11 +3448,12 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, __entry->i2_startblock = i2->rc_startblock; __entry->i2_blockcount = i2->rc_blockcount; __entry->i2_refcount = i2->rc_refcount; - __entry->agbno = agbno; + __entry->refcbno = refcbno; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u @ agbno 0x%x", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x dom %s refcbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s refcbno 0x%x fsbcount 0x%x refcount %u @ refcbno 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, __print_symbolic(__entry->i1_domain, XFS_REFC_DOMAIN_STRINGS), __entry->i1_startblock, @@ -3449,14 +3463,14 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, __entry->i2_startblock, __entry->i2_blockcount, __entry->i2_refcount, - __entry->agbno) + __entry->refcbno) ) #define DEFINE_REFCOUNT_DOUBLE_EXTENT_AT_EVENT(name) \ DEFINE_EVENT(xfs_refcount_double_extent_at_class, name, \ TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *i1, \ - struct xfs_refcount_irec *i2, xfs_agblock_t agbno), \ - TP_ARGS(cur, i1, i2, agbno)) + struct xfs_refcount_irec *i2, xfs_agblock_t refcbno), \ + TP_ARGS(cur, i1, i2, refcbno)) /* triple-rcext tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, @@ -3465,6 +3479,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, TP_ARGS(cur, i1, i2, i3), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, i1_domain) __field(xfs_agblock_t, i1_startblock) @@ -3481,7 +3496,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; + xfs_refcountbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); __entry->i1_domain = i1->rc_domain; 
__entry->i1_startblock = i1->rc_startblock; __entry->i1_blockcount = i1->rc_blockcount; @@ -3495,10 +3510,11 @@ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, __entry->i3_blockcount = i3->rc_blockcount; __entry->i3_refcount = i3->rc_refcount; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x dom %s refcbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s refcbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s refcbno 0x%x fsbcount 0x%x refcount %u", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, __print_symbolic(__entry->i1_domain, XFS_REFC_DOMAIN_STRINGS), __entry->i1_startblock, @@ -3568,21 +3584,21 @@ DECLARE_EVENT_CLASS(xfs_refcount_deferred_class, __field(dev_t, dev) __field(xfs_agnumber_t, agno) __field(int, op) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, refcbno) __field(xfs_extlen_t, len) ), TP_fast_assign( __entry->dev = mp->m_super->s_dev; __entry->agno = XFS_FSB_TO_AGNO(mp, refc->ri_startblock); __entry->op = refc->ri_type; - __entry->agbno = XFS_FSB_TO_AGBNO(mp, refc->ri_startblock); + __entry->refcbno = XFS_FSB_TO_AGBNO(mp, refc->ri_startblock); __entry->len = refc->ri_blockcount; ), - TP_printk("dev %d:%d op %s agno 0x%x agbno 0x%x fsbcount 0x%x", + TP_printk("dev %d:%d op %s agno 0x%x refcbno 0x%x fsbcount 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), __print_symbolic(__entry->op, XFS_REFCOUNT_INTENT_STRINGS), __entry->agno, - __entry->agbno, + __entry->refcbno, __entry->len) ); #define DEFINE_REFCOUNT_DEFERRED_EVENT(name) \ From patchwork Fri Dec 30 22:18:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085510 Subject: [PATCH 02/42] xfs: introduce realtime refcount btree definitions From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:29 -0800 Message-ID: <167243870925.717073.13533133077980484735.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Add new realtime refcount btree definitions. The realtime refcount btree will be rooted in a hidden inode, but it has its own shape and therefore needs mostly separate types of its own. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_btree.h | 1 + fs/xfs/libxfs/xfs_format.h | 6 ++++++ fs/xfs/libxfs/xfs_types.h | 5 +++-- fs/xfs/scrub/trace.h | 1 + fs/xfs/xfs_trace.h | 1 + 5 files changed, 12 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 20342ed62bf4..ce5ef798c3bc 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -65,6 +65,7 @@ union xfs_btree_rec { #define XFS_BTNUM_REFC ((xfs_btnum_t)XFS_BTNUM_REFCi) #define XFS_BTNUM_RCBAG ((xfs_btnum_t)XFS_BTNUM_RCBAGi) #define XFS_BTNUM_RTRMAP ((xfs_btnum_t)XFS_BTNUM_RTRMAPi) +#define XFS_BTNUM_RTREFC ((xfs_btnum_t)XFS_BTNUM_RTREFCi) struct xfs_btree_ops; uint32_t xfs_btree_magic(struct xfs_mount *mp, const struct xfs_btree_ops *ops); diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index a2b8d8ee8afd..c78fe8e78b8c 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1796,6 +1796,12 @@ struct xfs_refcount_key { /* btree pointer type */ typedef __be32 xfs_refcount_ptr_t; +/* + * Realtime Reference Count btree format definitions + * + * This is a btree for reference count records for realtime volumes + */ +#define XFS_RTREFC_CRC_MAGIC 0x52434e54 /* 'RCNT' */ /* * BMAP Btree format definitions diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h index
e6a4f4a7d009..92c60a9d5862 100644 --- a/fs/xfs/libxfs/xfs_types.h +++ b/fs/xfs/libxfs/xfs_types.h @@ -126,7 +126,7 @@ typedef enum { typedef enum { XFS_BTNUM_BNOi, XFS_BTNUM_CNTi, XFS_BTNUM_RMAPi, XFS_BTNUM_BMAPi, XFS_BTNUM_INOi, XFS_BTNUM_FINOi, XFS_BTNUM_REFCi, XFS_BTNUM_RCBAGi, - XFS_BTNUM_RTRMAPi, XFS_BTNUM_MAX + XFS_BTNUM_RTRMAPi, XFS_BTNUM_RTREFCi, XFS_BTNUM_MAX } xfs_btnum_t; #define XFS_BTNUM_STRINGS \ @@ -138,7 +138,8 @@ typedef enum { { XFS_BTNUM_FINOi, "finobt" }, \ { XFS_BTNUM_REFCi, "refcbt" }, \ { XFS_BTNUM_RCBAGi, "rcbagbt" }, \ - { XFS_BTNUM_RTRMAPi, "rtrmapbt" } + { XFS_BTNUM_RTRMAPi, "rtrmapbt" }, \ + { XFS_BTNUM_RTREFCi, "rtrefcbt" } struct xfs_name { const unsigned char *name; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 4cf8180173ca..8d66ab10e1fd 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -43,6 +43,7 @@ TRACE_DEFINE_ENUM(XFS_BTNUM_RMAPi); TRACE_DEFINE_ENUM(XFS_BTNUM_REFCi); TRACE_DEFINE_ENUM(XFS_BTNUM_RCBAGi); TRACE_DEFINE_ENUM(XFS_BTNUM_RTRMAPi); +TRACE_DEFINE_ENUM(XFS_BTNUM_RTREFCi); TRACE_DEFINE_ENUM(XFS_REFC_DOMAIN_SHARED); TRACE_DEFINE_ENUM(XFS_REFC_DOMAIN_COW); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 4e0c40934a7f..1f8ab7c436a9 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -2560,6 +2560,7 @@ TRACE_DEFINE_ENUM(XFS_BTNUM_RMAPi); TRACE_DEFINE_ENUM(XFS_BTNUM_REFCi); TRACE_DEFINE_ENUM(XFS_BTNUM_RCBAGi); TRACE_DEFINE_ENUM(XFS_BTNUM_RTRMAPi); +TRACE_DEFINE_ENUM(XFS_BTNUM_RTREFCi); DECLARE_EVENT_CLASS(xfs_btree_cur_class, TP_PROTO(struct xfs_btree_cur *cur, int level, struct xfs_buf *bp), From patchwork Fri Dec 30 22:18:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085511 Subject: [PATCH 03/42] xfs: namespace the maximum length/refcount symbols From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:29 -0800 Message-ID: <167243870940.717073.2275494462076625238.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Namespace these macros properly, so that readers can tell that they are XFS symbols and that they belong to the refcount functionality. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_format.h | 4 ++-- fs/xfs/libxfs/xfs_refcount.c | 18 +++++++++--------- fs/xfs/scrub/refcount.c | 2 +- fs/xfs/scrub/refcount_repair.c | 4 ++-- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index c78fe8e78b8c..c49a946e79f3 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1790,8 +1790,8 @@ struct xfs_refcount_key { __be32 rc_startblock; /* starting block number */ }; -#define MAXREFCOUNT ((xfs_nlink_t)~0U) -#define MAXREFCEXTLEN ((xfs_extlen_t)~0U) +#define XFS_REFC_REFCOUNT_MAX ((xfs_nlink_t)~0U) +#define XFS_REFC_LEN_MAX ((xfs_extlen_t)~0U) /* btree pointer type */ typedef __be32 xfs_refcount_ptr_t; diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 3d2269c6855a..e1f55edceccf 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -126,7 +126,7 @@ xfs_refcount_check_perag_irec( struct xfs_perag *pag, const struct xfs_refcount_irec *irec) { - if (irec->rc_blockcount == 0 || irec->rc_blockcount > MAXREFCEXTLEN) + if (irec->rc_blockcount == 0 || irec->rc_blockcount > XFS_REFC_LEN_MAX) return __this_address; if (!xfs_refcount_check_domain(irec)) @@ -136,7 +136,7 @@ xfs_refcount_check_perag_irec( if (!xfs_verify_agbext(pag, irec->rc_startblock, irec->rc_blockcount)) return __this_address; - if
(irec->rc_refcount == 0 || irec->rc_refcount > MAXREFCOUNT) + if (irec->rc_refcount == 0 || irec->rc_refcount > XFS_REFC_REFCOUNT_MAX) return __this_address; return NULL; @@ -860,9 +860,9 @@ xfs_refc_merge_refcount( const struct xfs_refcount_irec *irec, enum xfs_refc_adjust_op adjust) { - /* Once a record hits MAXREFCOUNT, it is pinned there forever */ - if (irec->rc_refcount == MAXREFCOUNT) - return MAXREFCOUNT; + /* Once a record hits XFS_REFC_REFCOUNT_MAX, it is pinned forever */ + if (irec->rc_refcount == XFS_REFC_REFCOUNT_MAX) + return XFS_REFC_REFCOUNT_MAX; return irec->rc_refcount + adjust; } @@ -905,7 +905,7 @@ xfs_refc_want_merge_center( * hence we need to catch u32 addition overflows here. */ ulen += cleft->rc_blockcount + right->rc_blockcount; - if (ulen >= MAXREFCEXTLEN) + if (ulen >= XFS_REFC_LEN_MAX) return false; *ulenp = ulen; @@ -940,7 +940,7 @@ xfs_refc_want_merge_left( * hence we need to catch u32 addition overflows here. */ ulen += cleft->rc_blockcount; - if (ulen >= MAXREFCEXTLEN) + if (ulen >= XFS_REFC_LEN_MAX) return false; return true; @@ -974,7 +974,7 @@ xfs_refc_want_merge_right( * hence we need to catch u32 addition overflows here. */ ulen += cright->rc_blockcount; - if (ulen >= MAXREFCEXTLEN) + if (ulen >= XFS_REFC_LEN_MAX) return false; return true; @@ -1201,7 +1201,7 @@ xfs_refcount_adjust_extents( * Adjust the reference count and either update the tree * (incr) or free the blocks (decr). 
*/ - if (ext.rc_refcount == MAXREFCOUNT) + if (ext.rc_refcount == XFS_REFC_REFCOUNT_MAX) goto skip; ext.rc_refcount += adj; trace_xfs_refcount_modify_extent(cur, &ext); diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c index 413885eca333..78b52c8a4d7f 100644 --- a/fs/xfs/scrub/refcount.c +++ b/fs/xfs/scrub/refcount.c @@ -421,7 +421,7 @@ xchk_refcount_mergeable( if (r1->rc_refcount != r2->rc_refcount) return false; if ((unsigned long long)r1->rc_blockcount + r2->rc_blockcount > - MAXREFCEXTLEN) + XFS_REFC_LEN_MAX) return false; return true; diff --git a/fs/xfs/scrub/refcount_repair.c b/fs/xfs/scrub/refcount_repair.c index 539548cdc65a..81709afdd9e6 100644 --- a/fs/xfs/scrub/refcount_repair.c +++ b/fs/xfs/scrub/refcount_repair.c @@ -176,7 +176,7 @@ xrep_refc_stash( if (xchk_should_terminate(sc, &error)) return error; - irec.rc_refcount = min_t(uint64_t, MAXREFCOUNT, refcount); + irec.rc_refcount = min_t(uint64_t, XFS_REFC_REFCOUNT_MAX, refcount); error = xrep_refc_check_ext(rr->sc, &irec); if (error) @@ -415,7 +415,7 @@ xrep_refc_find_refcounts( /* * Set up a bag to store all the rmap records that we're tracking to * generate a reference count record. If the size of the bag exceeds - * MAXREFCOUNT, we clamp rc_refcount. + * XFS_REFC_REFCOUNT_MAX, we clamp rc_refcount. */ error = rcbag_init(sc->mp, sc->xfile_buftarg, &rcstack); if (error) From patchwork Fri Dec 30 22:18:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085512 Subject: [PATCH 04/42] xfs: define the on-disk realtime refcount btree format From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:29 -0800 Message-ID: <167243870954.717073.1737780719903173961.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Start filling out the rtrefcount btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate refcount btree blocks. This prepares the way for connecting the btree operations implementation. Signed-off-by: Darrick J. Wong --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_btree.c | 6 + fs/xfs/libxfs/xfs_btree.h | 11 + fs/xfs/libxfs/xfs_format.h | 3 fs/xfs/libxfs/xfs_rtrefcount_btree.c | 311 ++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 71 ++++++++ fs/xfs/libxfs/xfs_sb.c | 8 + fs/xfs/libxfs/xfs_shared.h | 2 fs/xfs/xfs_mount.c | 7 + fs/xfs/xfs_mount.h | 9 + fs/xfs/xfs_ondisk.h | 1 11 files changed, 425 insertions(+), 5 deletions(-) create mode 100644 fs/xfs/libxfs/xfs_rtrefcount_btree.c create mode 100644 fs/xfs/libxfs/xfs_rtrefcount_btree.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 17c65dce6d26..9cc30333c089 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -47,6 +47,7 @@ xfs-y += $(addprefix libxfs/, \ xfs_rmap_btree.o \ xfs_refcount.o \ xfs_refcount_btree.o \ + xfs_rtrefcount_btree.o \ xfs_rtrmap_btree.o \ xfs_sb.o \ xfs_swapext.o \ diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 377dc9b0a6e6..a789fb75e77d 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -37,6 +37,7 @@ #include "xfs_rmap.h" #include "xfs_quota.h" #include "xfs_imeta.h" +#include "xfs_rtrefcount_btree.h" /* * Btree magic numbers. 
@@ -1388,6 +1389,7 @@ xfs_btree_set_refs( xfs_buf_set_ref(bp, XFS_RMAP_BTREE_REF); break; case XFS_BTNUM_REFC: + case XFS_BTNUM_RTREFC: xfs_buf_set_ref(bp, XFS_REFC_BTREE_REF); break; case XFS_BTNUM_RCBAG: @@ -5548,6 +5550,9 @@ xfs_btree_init_cur_caches(void) if (error) goto err; error = xfs_rtrmapbt_init_cur_cache(); + if (error) + goto err; + error = xfs_rtrefcountbt_init_cur_cache(); if (error) goto err; @@ -5567,6 +5572,7 @@ xfs_btree_destroy_cur_caches(void) xfs_rmapbt_destroy_cur_cache(); xfs_refcountbt_destroy_cur_cache(); xfs_rtrmapbt_destroy_cur_cache(); + xfs_rtrefcountbt_destroy_cur_cache(); } /* Move the btree cursor before the first record. */ diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index ce5ef798c3bc..97127030aea6 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -226,6 +226,11 @@ union xfs_btree_irec { struct xfs_refcount_irec rc; }; +struct xbtree_refc { + unsigned int nr_ops; /* # record updates */ + unsigned int shape_changes; /* # of extent splits */ +}; + /* Per-AG btree information. 
*/ struct xfs_btree_cur_ag { struct xfs_perag *pag; @@ -234,10 +239,7 @@ struct xfs_btree_cur_ag { struct xbtree_afakeroot *afake; /* for staging cursor */ }; union { - struct { - unsigned int nr_ops; /* # record updates */ - unsigned int shape_changes; /* # of extent splits */ - } refc; + struct xbtree_refc refc; struct { bool active; /* allocation cursor state */ } abt; @@ -258,6 +260,7 @@ struct xfs_btree_cur_ino { /* For extent swap, ignore owner check in verifier */ #define XFS_BTCUR_BMBT_INVALID_OWNER (1 << 1) + struct xbtree_refc refc; }; /* In-memory btree information */ diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index c49a946e79f3..d2270f95bfbc 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1803,6 +1803,9 @@ typedef __be32 xfs_refcount_ptr_t; */ #define XFS_RTREFC_CRC_MAGIC 0x52434e54 /* 'RCNT' */ +/* inode-rooted btree pointer type */ +typedef __be64 xfs_rtrefcount_ptr_t; + /* * BMAP Btree format definitions * diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c new file mode 100644 index 000000000000..dd8e628b068b --- /dev/null +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -0,0 +1,311 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_log_format.h" +#include "xfs_trans_resv.h" +#include "xfs_bit.h" +#include "xfs_sb.h" +#include "xfs_mount.h" +#include "xfs_defer.h" +#include "xfs_inode.h" +#include "xfs_trans.h" +#include "xfs_alloc.h" +#include "xfs_btree.h" +#include "xfs_btree_staging.h" +#include "xfs_rtrefcount_btree.h" +#include "xfs_trace.h" +#include "xfs_cksum.h" +#include "xfs_error.h" +#include "xfs_extent_busy.h" +#include "xfs_rtgroup.h" +#include "xfs_rtbitmap.h" + +static struct kmem_cache *xfs_rtrefcountbt_cur_cache; + +/* + * Realtime Reference Count btree. 
+ * + * This is a btree used to track the reference counts of extents on the + * realtime device. See the comments in xfs_refcount_btree.c for more information. + * + * This tree is basically the same as the regular refcount btree except that + * it's rooted in an inode. + */ + +static struct xfs_btree_cur * +xfs_rtrefcountbt_dup_cursor( + struct xfs_btree_cur *cur) +{ + struct xfs_btree_cur *new; + + new = xfs_rtrefcountbt_init_cursor(cur->bc_mp, cur->bc_tp, + cur->bc_ino.rtg, cur->bc_ino.ip); + + /* Copy the flags values since init cursor doesn't get them. */ + new->bc_ino.flags = cur->bc_ino.flags; + + return new; +} + +static xfs_failaddr_t +xfs_rtrefcountbt_verify( + struct xfs_buf *bp) +{ + struct xfs_mount *mp = bp->b_target->bt_mount; + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_failaddr_t fa; + int level; + + if (!xfs_verify_magic(bp, block->bb_magic)) + return __this_address; + + if (!xfs_has_reflink(mp)) + return __this_address; + fa = xfs_btree_lblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN); + if (fa) + return fa; + level = be16_to_cpu(block->bb_level); + if (level > mp->m_rtrefc_maxlevels) + return __this_address; + + return xfs_btree_lblock_verify(bp, mp->m_rtrefc_mxr[level != 0]); +} + +static void +xfs_rtrefcountbt_read_verify( + struct xfs_buf *bp) +{ + xfs_failaddr_t fa; + + if (!xfs_btree_lblock_verify_crc(bp)) + xfs_verifier_error(bp, -EFSBADCRC, __this_address); + else { + fa = xfs_rtrefcountbt_verify(bp); + if (fa) + xfs_verifier_error(bp, -EFSCORRUPTED, fa); + } + + if (bp->b_error) + trace_xfs_btree_corrupt(bp, _RET_IP_); +} + +static void +xfs_rtrefcountbt_write_verify( + struct xfs_buf *bp) +{ + xfs_failaddr_t fa; + + fa = xfs_rtrefcountbt_verify(bp); + if (fa) { + trace_xfs_btree_corrupt(bp, _RET_IP_); + xfs_verifier_error(bp, -EFSCORRUPTED, fa); + return; + } + xfs_btree_lblock_calc_crc(bp); + +} + +const struct xfs_buf_ops xfs_rtrefcountbt_buf_ops = { + .name = "xfs_rtrefcountbt", + .magic = { 0, 
cpu_to_be32(XFS_RTREFC_CRC_MAGIC) }, + .verify_read = xfs_rtrefcountbt_read_verify, + .verify_write = xfs_rtrefcountbt_write_verify, + .verify_struct = xfs_rtrefcountbt_verify, +}; + +const struct xfs_btree_ops xfs_rtrefcountbt_ops = { + .rec_len = sizeof(struct xfs_refcount_rec), + .key_len = sizeof(struct xfs_refcount_key), + .geom_flags = XFS_BTREE_LONG_PTRS | XFS_BTREE_ROOT_IN_INODE | + XFS_BTREE_CRC_BLOCKS | XFS_BTREE_IROOT_RECORDS, + + .dup_cursor = xfs_rtrefcountbt_dup_cursor, + .buf_ops = &xfs_rtrefcountbt_buf_ops, +}; + +/* Initialize a new rt refcount btree cursor. */ +static struct xfs_btree_cur * +xfs_rtrefcountbt_init_common( + struct xfs_mount *mp, + struct xfs_trans *tp, + struct xfs_rtgroup *rtg, + struct xfs_inode *ip) +{ + struct xfs_btree_cur *cur; + + ASSERT(xfs_isilocked(ip, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL)); + + cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_RTREFC, + &xfs_rtrefcountbt_ops, mp->m_rtrefc_maxlevels, + xfs_rtrefcountbt_cur_cache); + cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_refcbt_2); + + cur->bc_ino.ip = ip; + cur->bc_ino.allocated = 0; + cur->bc_ino.flags = 0; + cur->bc_ino.refc.nr_ops = 0; + cur->bc_ino.refc.shape_changes = 0; + + cur->bc_ino.rtg = xfs_rtgroup_bump(rtg); + return cur; +} + +/* Allocate a new rt refcount btree cursor. */ +struct xfs_btree_cur * +xfs_rtrefcountbt_init_cursor( + struct xfs_mount *mp, + struct xfs_trans *tp, + struct xfs_rtgroup *rtg, + struct xfs_inode *ip) +{ + struct xfs_btree_cur *cur; + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + + cur = xfs_rtrefcountbt_init_common(mp, tp, rtg, ip); + cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1; + cur->bc_ino.forksize = xfs_inode_fork_size(ip, XFS_DATA_FORK); + cur->bc_ino.whichfork = XFS_DATA_FORK; + return cur; +} + +/* Create a new rt refcount btree cursor with a fake root for staging. 
*/ +struct xfs_btree_cur * +xfs_rtrefcountbt_stage_cursor( + struct xfs_mount *mp, + struct xfs_rtgroup *rtg, + struct xfs_inode *ip, + struct xbtree_ifakeroot *ifake) +{ + struct xfs_btree_cur *cur; + + cur = xfs_rtrefcountbt_init_common(mp, NULL, rtg, ip); + cur->bc_nlevels = ifake->if_levels; + cur->bc_ino.forksize = ifake->if_fork_size; + cur->bc_ino.whichfork = -1; + xfs_btree_stage_ifakeroot(cur, ifake, NULL); + return cur; +} + +/* + * Install a new rt refcount btree root. Caller is responsible for + * invalidating and freeing the old btree blocks. + */ +void +xfs_rtrefcountbt_commit_staged_btree( + struct xfs_btree_cur *cur, + struct xfs_trans *tp) +{ + struct xbtree_ifakeroot *ifake = cur->bc_ino.ifake; + struct xfs_ifork *ifp; + int flags = XFS_ILOG_CORE | XFS_ILOG_DBROOT; + + ASSERT(cur->bc_flags & XFS_BTREE_STAGING); + + /* + * Free any resources hanging off the real fork, then shallow-copy the + * staging fork's contents into the real fork to transfer everything + * we just built. + */ + ifp = xfs_ifork_ptr(cur->bc_ino.ip, XFS_DATA_FORK); + xfs_idestroy_fork(ifp); + memcpy(ifp, ifake->if_fork, sizeof(struct xfs_ifork)); + + xfs_trans_log_inode(tp, cur->bc_ino.ip, flags); + xfs_btree_commit_ifakeroot(cur, tp, XFS_DATA_FORK, + &xfs_rtrefcountbt_ops); +} + +/* Calculate number of records in a realtime refcount btree block. */ +static inline unsigned int +xfs_rtrefcountbt_block_maxrecs( + unsigned int blocklen, + bool leaf) +{ + + if (leaf) + return blocklen / sizeof(struct xfs_refcount_rec); + return blocklen / (sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); +} + +/* + * Calculate number of records in a realtime refcount btree block. + */ +unsigned int +xfs_rtrefcountbt_maxrecs( + struct xfs_mount *mp, + unsigned int blocklen, + bool leaf) +{ + blocklen -= XFS_RTREFCOUNT_BLOCK_LEN; + return xfs_rtrefcountbt_block_maxrecs(blocklen, leaf); +} + +/* Compute the max possible height for realtime refcount btrees. 
*/ +unsigned int +xfs_rtrefcountbt_maxlevels_ondisk(void) +{ + unsigned int minrecs[2]; + unsigned int blocklen; + + blocklen = XFS_MIN_CRC_BLOCKSIZE - XFS_BTREE_LBLOCK_CRC_LEN; + + minrecs[0] = xfs_rtrefcountbt_block_maxrecs(blocklen, true) / 2; + minrecs[1] = xfs_rtrefcountbt_block_maxrecs(blocklen, false) / 2; + + /* We need at most one record for every block in an rt group. */ + return xfs_btree_compute_maxlevels(minrecs, XFS_MAX_RGBLOCKS); +} + +int __init +xfs_rtrefcountbt_init_cur_cache(void) +{ + xfs_rtrefcountbt_cur_cache = kmem_cache_create("xfs_rtrefcountbt_cur", + xfs_btree_cur_sizeof( + xfs_rtrefcountbt_maxlevels_ondisk()), + 0, 0, NULL); + + if (!xfs_rtrefcountbt_cur_cache) + return -ENOMEM; + return 0; +} + +void +xfs_rtrefcountbt_destroy_cur_cache(void) +{ + kmem_cache_destroy(xfs_rtrefcountbt_cur_cache); + xfs_rtrefcountbt_cur_cache = NULL; +} + +/* Compute the maximum height of a realtime refcount btree. */ +void +xfs_rtrefcountbt_compute_maxlevels( + struct xfs_mount *mp) +{ + unsigned int d_maxlevels, r_maxlevels; + + if (!xfs_has_rtreflink(mp)) { + mp->m_rtrefc_maxlevels = 0; + return; + } + + /* + * The realtime refcountbt lives on the data device, which means that + * its maximum height is constrained by the size of the data device and + * the height required to store one refcount record for each rtextent + * in an rt group. + */ + d_maxlevels = xfs_btree_space_to_height(mp->m_rtrefc_mnr, + mp->m_sb.sb_dblocks); + r_maxlevels = xfs_btree_compute_maxlevels(mp->m_rtrefc_mnr, + xfs_rtb_to_rtxt(mp, mp->m_sb.sb_rgblocks)); + + /* Add one level to handle the inode root level. */ + mp->m_rtrefc_maxlevels = min(d_maxlevels, r_maxlevels) + 1; +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h new file mode 100644 index 000000000000..d10ebdcf7727 --- /dev/null +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -0,0 +1,71 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Copyright (C) 2022 Oracle. 
All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_RTREFCOUNT_BTREE_H__ +#define __XFS_RTREFCOUNT_BTREE_H__ + +struct xfs_buf; +struct xfs_btree_cur; +struct xfs_mount; +struct xbtree_ifakeroot; +struct xfs_rtgroup; + +/* refcounts only exist on crc enabled filesystems */ +#define XFS_RTREFCOUNT_BLOCK_LEN XFS_BTREE_LBLOCK_CRC_LEN + +struct xfs_btree_cur *xfs_rtrefcountbt_init_cursor(struct xfs_mount *mp, + struct xfs_trans *tp, struct xfs_rtgroup *rtg, + struct xfs_inode *ip); +struct xfs_btree_cur *xfs_rtrefcountbt_stage_cursor(struct xfs_mount *mp, + struct xfs_rtgroup *rtg, struct xfs_inode *ip, + struct xbtree_ifakeroot *ifake); +void xfs_rtrefcountbt_commit_staged_btree(struct xfs_btree_cur *cur, + struct xfs_trans *tp); +unsigned int xfs_rtrefcountbt_maxrecs(struct xfs_mount *mp, + unsigned int blocklen, bool leaf); +void xfs_rtrefcountbt_compute_maxlevels(struct xfs_mount *mp); + +/* + * Addresses of records, keys, and pointers within an incore rtrefcountbt block. 
+ * + * (note that some of these may appear unused, but they are used in userspace) + */ +static inline struct xfs_refcount_rec * +xfs_rtrefcount_rec_addr( + struct xfs_btree_block *block, + unsigned int index) +{ + return (struct xfs_refcount_rec *) + ((char *)block + XFS_RTREFCOUNT_BLOCK_LEN + + (index - 1) * sizeof(struct xfs_refcount_rec)); +} + +static inline struct xfs_refcount_key * +xfs_rtrefcount_key_addr( + struct xfs_btree_block *block, + unsigned int index) +{ + return (struct xfs_refcount_key *) + ((char *)block + XFS_RTREFCOUNT_BLOCK_LEN + + (index - 1) * sizeof(struct xfs_refcount_key)); +} + +static inline xfs_rtrefcount_ptr_t * +xfs_rtrefcount_ptr_addr( + struct xfs_btree_block *block, + unsigned int index, + unsigned int maxrecs) +{ + return (xfs_rtrefcount_ptr_t *) + ((char *)block + XFS_RTREFCOUNT_BLOCK_LEN + + maxrecs * sizeof(struct xfs_refcount_key) + + (index - 1) * sizeof(xfs_rtrefcount_ptr_t)); +} + +unsigned int xfs_rtrefcountbt_maxlevels_ondisk(void); +int __init xfs_rtrefcountbt_init_cur_cache(void); +void xfs_rtrefcountbt_destroy_cur_cache(void); + +#endif /* __XFS_RTREFCOUNT_BTREE_H__ */ diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 570919c223c9..c002cf661912 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -28,6 +28,7 @@ #include "xfs_swapext.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" /* * Physical superblock buffer manipulations. Shared with libxfs in userspace. 
@@ -1075,6 +1076,13 @@ xfs_sb_mount_common( mp->m_refc_mnr[0] = mp->m_refc_mxr[0] / 2; mp->m_refc_mnr[1] = mp->m_refc_mxr[1] / 2; + mp->m_rtrefc_mxr[0] = xfs_rtrefcountbt_maxrecs(mp, sbp->sb_blocksize, + true); + mp->m_rtrefc_mxr[1] = xfs_rtrefcountbt_maxrecs(mp, sbp->sb_blocksize, + false); + mp->m_rtrefc_mnr[0] = mp->m_rtrefc_mxr[0] / 2; + mp->m_rtrefc_mnr[1] = mp->m_rtrefc_mxr[1] / 2; + mp->m_bsize = XFS_FSB_TO_BB(mp, 1); mp->m_alloc_set_aside = xfs_alloc_set_aside(mp); mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp); diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h index 31c577a94295..a1bfc98c47a3 100644 --- a/fs/xfs/libxfs/xfs_shared.h +++ b/fs/xfs/libxfs/xfs_shared.h @@ -42,6 +42,7 @@ extern const struct xfs_buf_ops xfs_rtbitmap_buf_ops; extern const struct xfs_buf_ops xfs_rtsummary_buf_ops; extern const struct xfs_buf_ops xfs_rtbuf_ops; extern const struct xfs_buf_ops xfs_rtsb_buf_ops; +extern const struct xfs_buf_ops xfs_rtrefcountbt_buf_ops; extern const struct xfs_buf_ops xfs_rtrmapbt_buf_ops; extern const struct xfs_buf_ops xfs_sb_buf_ops; extern const struct xfs_buf_ops xfs_sb_quiet_buf_ops; @@ -56,6 +57,7 @@ extern const struct xfs_btree_ops xfs_bmbt_ops; extern const struct xfs_btree_ops xfs_refcountbt_ops; extern const struct xfs_btree_ops xfs_rmapbt_ops; extern const struct xfs_btree_ops xfs_rtrmapbt_ops; +extern const struct xfs_btree_ops xfs_rtrefcountbt_ops; /* log size calculation functions */ int xfs_log_calc_unit_res(struct xfs_mount *mp, int unit_bytes); diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 2e64f18deabf..f3ef385f9aaf 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -37,6 +37,7 @@ #include "xfs_imeta.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" static DEFINE_MUTEX(xfs_uuid_table_mutex); static int xfs_uuid_table_size; @@ -655,7 +656,10 @@ static inline void xfs_rtbtree_compute_maxlevels( struct xfs_mount *mp) { - mp->m_rtbtree_maxlevels = 
mp->m_rtrmap_maxlevels; + unsigned int levels; + + levels = max(mp->m_rtrmap_maxlevels, mp->m_rtrefc_maxlevels); + mp->m_rtbtree_maxlevels = levels; } /* @@ -729,6 +733,7 @@ xfs_mountfs( xfs_rmapbt_compute_maxlevels(mp); xfs_rtrmapbt_compute_maxlevels(mp); xfs_refcountbt_compute_maxlevels(mp); + xfs_rtrefcountbt_compute_maxlevels(mp); xfs_agbtree_compute_maxlevels(mp); xfs_rtbtree_compute_maxlevels(mp); diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index b1ffab4cb9cd..487567d1839b 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -136,11 +136,14 @@ typedef struct xfs_mount { uint m_rtrmap_mnr[2]; /* min rtrmap btree records */ uint m_refc_mxr[2]; /* max refc btree records */ uint m_refc_mnr[2]; /* min refc btree records */ + uint m_rtrefc_mxr[2]; /* max rtrefc btree records */ + uint m_rtrefc_mnr[2]; /* min rtrefc btree records */ uint m_alloc_maxlevels; /* max alloc btree levels */ uint m_bm_maxlevels[2]; /* max bmap btree levels */ uint m_rmap_maxlevels; /* max rmap btree levels */ uint m_rtrmap_maxlevels; /* max rtrmap btree level */ uint m_refc_maxlevels; /* max refcount btree level */ + uint m_rtrefc_maxlevels; /* max rtrefc btree level */ unsigned int m_agbtree_maxlevels; /* max level of all AG btrees */ unsigned int m_rtbtree_maxlevels; /* max level of all rt btrees */ xfs_extlen_t m_ag_prealloc_blocks; /* reserved ag blocks */ @@ -369,6 +372,12 @@ static inline bool xfs_has_rtrmapbt(struct xfs_mount *mp) xfs_has_rmapbt(mp); } +static inline bool xfs_has_rtreflink(struct xfs_mount *mp) +{ + return xfs_has_metadir(mp) && xfs_has_realtime(mp) && + xfs_has_reflink(mp); +} + /* * Mount features * diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h index f24a08dd63e9..94bbb6351d3d 100644 --- a/fs/xfs/xfs_ondisk.h +++ b/fs/xfs/xfs_ondisk.h @@ -79,6 +79,7 @@ xfs_check_ondisk_structs(void) XFS_CHECK_STRUCT_SIZE(struct xfs_rtbuf_blkinfo, 48); XFS_CHECK_STRUCT_SIZE(xfs_rtrmap_ptr_t, 8); XFS_CHECK_STRUCT_SIZE(struct xfs_rtrmap_root, 4); + 
XFS_CHECK_STRUCT_SIZE(xfs_rtrefcount_ptr_t, 8); /* * m68k has problems with xfs_attr_leaf_name_remote_t, but we pad it to From patchwork Fri Dec 30 22:18:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085513
Subject: [PATCH 05/42] xfs: realtime refcount btree transaction reservations From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:29 -0800 Message-ID: <167243870969.717073.10174042531639869915.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Make sure that there's enough log reservation to handle mapping and unmapping realtime extents. We have to reserve enough space to handle a split in the rtrefcountbt to add the record and a second split in the regular refcountbt to record the rtrefcountbt split. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_trans_resv.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 52a4386a3d96..2b8b8dd5dec3 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -90,6 +90,14 @@ xfs_refcountbt_block_count( return num_ops * (2 * mp->m_refc_maxlevels - 1); } +static unsigned int +xfs_rtrefcountbt_block_count( + struct xfs_mount *mp, + unsigned int num_ops) +{ + return num_ops * (2 * mp->m_rtrefc_maxlevels - 1); +} + /* * Logging inodes is really tricksy. They are logged in memory format, * which means that what we write into the log doesn't directly translate into @@ -257,10 +265,13 @@ xfs_rtalloc_block_count( * Compute the log reservation required to handle the refcount update * transaction. Refcount updates are always done via deferred log items. 
* - * This is calculated as: + * This is calculated as the max of: * Data device refcount updates (t1): * the agfs of the ags containing the blocks: nr_ops * sector size * the refcount btrees: nr_ops * 1 trees * (2 * max depth - 1) * block size + * Realtime refcount updates (t2): + * the rt refcount inode + * the rtrefcount btrees: nr_ops * 1 trees * (2 * max depth - 1) * block size */ static unsigned int xfs_calc_refcountbt_reservation( @@ -268,12 +279,20 @@ xfs_calc_refcountbt_reservation( unsigned int nr_ops) { unsigned int blksz = XFS_FSB_TO_B(mp, 1); + unsigned int t1, t2 = 0; if (!xfs_has_reflink(mp)) return 0; - return xfs_calc_buf_res(nr_ops, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_refcountbt_block_count(mp, nr_ops), blksz); + t1 = xfs_calc_buf_res(nr_ops, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(xfs_refcountbt_block_count(mp, nr_ops), blksz); + + if (xfs_has_realtime(mp)) + t2 = xfs_calc_inode_res(mp, 1) + + xfs_calc_buf_res(xfs_rtrefcountbt_block_count(mp, nr_ops), + blksz); + + return max(t1, t2); } /* From patchwork Fri Dec 30 22:18:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085514 Subject: [PATCH 06/42] xfs: add realtime refcount btree operations From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:29 -0800 Message-ID: <167243870983.717073.6818978594509030568.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Implement the generic btree operations needed to manipulate rtrefcount btree blocks. This is different from the regular refcountbt in that we allocate space from the filesystem at large, and are neither constrained to the free space nor any particular AG. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rtrefcount_btree.c | 148 ++++++++++++++++++++++++++++++++++ 1 file changed, 148 insertions(+) diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index dd8e628b068b..bdefc4f5939d 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -19,6 +19,7 @@ #include "xfs_btree.h" #include "xfs_btree_staging.h" #include "xfs_rtrefcount_btree.h" +#include "xfs_refcount.h" #include "xfs_trace.h" #include "xfs_cksum.h" #include "xfs_error.h" @@ -53,6 +54,106 @@ xfs_rtrefcountbt_dup_cursor( return new; } +STATIC int +xfs_rtrefcountbt_get_minrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level == cur->bc_nlevels - 1) { + struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur); + + return xfs_rtrefcountbt_maxrecs(cur->bc_mp, ifp->if_broot_bytes, + level == 0) / 2; + } + + return cur->bc_mp->m_rtrefc_mnr[level != 0]; +} + +STATIC int +xfs_rtrefcountbt_get_maxrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level == cur->bc_nlevels - 1) { + struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur); + + return xfs_rtrefcountbt_maxrecs(cur->bc_mp, ifp->if_broot_bytes, + level == 0); + } + + return cur->bc_mp->m_rtrefc_mxr[level != 0]; +} + +STATIC void 
+xfs_rtrefcountbt_init_key_from_rec( + union xfs_btree_key *key, + const union xfs_btree_rec *rec) +{ + key->refc.rc_startblock = rec->refc.rc_startblock; +} + +STATIC void +xfs_rtrefcountbt_init_high_key_from_rec( + union xfs_btree_key *key, + const union xfs_btree_rec *rec) +{ + __u32 x; + + x = be32_to_cpu(rec->refc.rc_startblock); + x += be32_to_cpu(rec->refc.rc_blockcount) - 1; + key->refc.rc_startblock = cpu_to_be32(x); +} + +STATIC void +xfs_rtrefcountbt_init_rec_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec) +{ + const struct xfs_refcount_irec *irec = &cur->bc_rec.rc; + uint32_t start; + + start = xfs_refcount_encode_startblock(irec->rc_startblock, + irec->rc_domain); + rec->refc.rc_startblock = cpu_to_be32(start); + rec->refc.rc_blockcount = cpu_to_be32(cur->bc_rec.rc.rc_blockcount); + rec->refc.rc_refcount = cpu_to_be32(cur->bc_rec.rc.rc_refcount); +} + +STATIC void +xfs_rtrefcountbt_init_ptr_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + ptr->l = 0; +} + +STATIC int64_t +xfs_rtrefcountbt_key_diff( + struct xfs_btree_cur *cur, + const union xfs_btree_key *key) +{ + const struct xfs_refcount_key *kp = &key->refc; + const struct xfs_refcount_irec *irec = &cur->bc_rec.rc; + uint32_t start; + + start = xfs_refcount_encode_startblock(irec->rc_startblock, + irec->rc_domain); + return (int64_t)be32_to_cpu(kp->rc_startblock) - start; +} + +STATIC int64_t +xfs_rtrefcountbt_diff_two_keys( + struct xfs_btree_cur *cur, + const union xfs_btree_key *k1, + const union xfs_btree_key *k2, + const union xfs_btree_key *mask) +{ + ASSERT(!mask || mask->refc.rc_startblock); + + return (int64_t)be32_to_cpu(k1->refc.rc_startblock) - + be32_to_cpu(k2->refc.rc_startblock); +} + static xfs_failaddr_t xfs_rtrefcountbt_verify( struct xfs_buf *bp) @@ -119,6 +220,40 @@ const struct xfs_buf_ops xfs_rtrefcountbt_buf_ops = { .verify_struct = xfs_rtrefcountbt_verify, }; +STATIC int +xfs_rtrefcountbt_keys_inorder( + struct xfs_btree_cur *cur, + 
const union xfs_btree_key *k1, + const union xfs_btree_key *k2) +{ + return be32_to_cpu(k1->refc.rc_startblock) < + be32_to_cpu(k2->refc.rc_startblock); +} + +STATIC int +xfs_rtrefcountbt_recs_inorder( + struct xfs_btree_cur *cur, + const union xfs_btree_rec *r1, + const union xfs_btree_rec *r2) +{ + return be32_to_cpu(r1->refc.rc_startblock) + + be32_to_cpu(r1->refc.rc_blockcount) <= + be32_to_cpu(r2->refc.rc_startblock); +} + +STATIC enum xbtree_key_contig +xfs_rtrefcountbt_keys_contiguous( + struct xfs_btree_cur *cur, + const union xfs_btree_key *key1, + const union xfs_btree_key *key2, + const union xfs_btree_key *mask) +{ + ASSERT(!mask || mask->refc.rc_startblock); + + return xbtree_key_contig(be32_to_cpu(key1->refc.rc_startblock), + be32_to_cpu(key2->refc.rc_startblock)); +} + const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .rec_len = sizeof(struct xfs_refcount_rec), .key_len = sizeof(struct xfs_refcount_key), @@ -126,7 +261,20 @@ const struct xfs_btree_ops xfs_rtrefcountbt_ops = { XFS_BTREE_CRC_BLOCKS | XFS_BTREE_IROOT_RECORDS, .dup_cursor = xfs_rtrefcountbt_dup_cursor, + .alloc_block = xfs_btree_alloc_imeta_block, + .free_block = xfs_btree_free_imeta_block, + .get_minrecs = xfs_rtrefcountbt_get_minrecs, + .get_maxrecs = xfs_rtrefcountbt_get_maxrecs, + .init_key_from_rec = xfs_rtrefcountbt_init_key_from_rec, + .init_high_key_from_rec = xfs_rtrefcountbt_init_high_key_from_rec, + .init_rec_from_cur = xfs_rtrefcountbt_init_rec_from_cur, + .init_ptr_from_cur = xfs_rtrefcountbt_init_ptr_from_cur, + .key_diff = xfs_rtrefcountbt_key_diff, .buf_ops = &xfs_rtrefcountbt_buf_ops, + .diff_two_keys = xfs_rtrefcountbt_diff_two_keys, + .keys_inorder = xfs_rtrefcountbt_keys_inorder, + .recs_inorder = xfs_rtrefcountbt_recs_inorder, + .keys_contiguous = xfs_rtrefcountbt_keys_contiguous, }; /* Initialize a new rt refcount btree cursor. 
*/ From patchwork Fri Dec 30 22:18:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085515 Subject: [PATCH 07/42] xfs: prepare 
refcount functions to deal with rtrefcountbt From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:30 -0800 Message-ID: <167243870996.717073.1691112682357408512.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Prepare the high-level refcount functions to deal with the new realtime refcountbt and its slightly different conventions. Provide the ability to talk to either refcountbt or rtrefcountbt formats from the same high-level code. Note that we leave the _recover_cow_leftovers functions for a separate patch so that we can convert them all at once. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_refcount.c | 79 ++++++++++++++++++++++++++++++++++-------- 1 file changed, 64 insertions(+), 15 deletions(-) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index e1f55edceccf..a54a633f2ef9 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -24,6 +24,7 @@ #include "xfs_rmap.h" #include "xfs_ag.h" #include "xfs_health.h" +#include "xfs_rtgroup.h" struct kmem_cache *xfs_refcount_intent_cache; @@ -40,6 +41,16 @@ STATIC int __xfs_refcount_cow_alloc(struct xfs_btree_cur *rcur, STATIC int __xfs_refcount_cow_free(struct xfs_btree_cur *rcur, xfs_agblock_t agbno, xfs_extlen_t aglen); +/* Return the maximum startblock number of the refcountbt. */ +static inline xfs_agblock_t +xrefc_max_startblock( + struct xfs_btree_cur *cur) +{ + if (cur->bc_btnum == XFS_BTNUM_RTREFC) + return cur->bc_mp->m_sb.sb_rgblocks; + return cur->bc_mp->m_sb.sb_agblocks; +} + /* * Look up the first record less than or equal to [bno, len] in the btree * given by cur. 
@@ -142,12 +153,35 @@ xfs_refcount_check_perag_irec( return NULL; } +static inline xfs_failaddr_t +xfs_refcount_check_rtgroup_irec( + struct xfs_rtgroup *rtg, + const struct xfs_refcount_irec *irec) +{ + if (irec->rc_blockcount == 0 || irec->rc_blockcount > XFS_REFC_LEN_MAX) + return __this_address; + + if (!xfs_refcount_check_domain(irec)) + return __this_address; + + /* check for valid extent range, including overflow */ + if (!xfs_verify_rgbext(rtg, irec->rc_startblock, irec->rc_blockcount)) + return __this_address; + + if (irec->rc_refcount == 0 || irec->rc_refcount > XFS_REFC_REFCOUNT_MAX) + return __this_address; + + return NULL; +} + /* Simple checks for refcount records. */ xfs_failaddr_t xfs_refcount_check_irec( struct xfs_btree_cur *cur, const struct xfs_refcount_irec *irec) { + if (cur->bc_btnum == XFS_BTNUM_RTREFC) + return xfs_refcount_check_rtgroup_irec(cur->bc_ino.rtg, irec); return xfs_refcount_check_perag_irec(cur->bc_ag.pag, irec); } @@ -159,9 +193,15 @@ xfs_refcount_complain_bad_rec( { struct xfs_mount *mp = cur->bc_mp; - xfs_warn(mp, + if (cur->bc_btnum == XFS_BTNUM_RTREFC) { + xfs_warn(mp, + "RT Refcount BTree record corruption in rtgroup %u detected at %pS!", + cur->bc_ino.rtg->rtg_rgno, fa); + } else { + xfs_warn(mp, "Refcount BTree record corruption in AG %d detected at %pS!", cur->bc_ag.pag->pag_agno, fa); + } xfs_warn(mp, "Start block 0x%x, block count 0x%x, references 0x%x", irec->rc_startblock, irec->rc_blockcount, irec->rc_refcount); @@ -1054,6 +1094,15 @@ xfs_refcount_merge_extents( return 0; } +static inline struct xbtree_refc * +xrefc_btree_state( + struct xfs_btree_cur *cur) +{ + if (cur->bc_btnum == XFS_BTNUM_RTREFC) + return &cur->bc_ino.refc; + return &cur->bc_ag.refc; +} + /* * XXX: This is a pretty hand-wavy estimate. The penalty for guessing * true incorrectly is a shutdown FS; the penalty for guessing false @@ -1071,25 +1120,25 @@ xfs_refcount_still_have_space( * to handle each of the shape changes to the refcount btree. 
*/ overhead = xfs_allocfree_block_count(cur->bc_mp, - cur->bc_ag.refc.shape_changes); - overhead += cur->bc_mp->m_refc_maxlevels; + xrefc_btree_state(cur)->shape_changes); + overhead += cur->bc_maxlevels; overhead *= cur->bc_mp->m_sb.sb_blocksize; /* * Only allow 2 refcount extent updates per transaction if the * refcount continue update "error" has been injected. */ - if (cur->bc_ag.refc.nr_ops > 2 && + if (xrefc_btree_state(cur)->nr_ops > 2 && XFS_TEST_ERROR(false, cur->bc_mp, XFS_ERRTAG_REFCOUNT_CONTINUE_UPDATE)) return false; - if (cur->bc_ag.refc.nr_ops == 0) + if (xrefc_btree_state(cur)->nr_ops == 0) return true; else if (overhead > cur->bc_tp->t_log_res) return false; return cur->bc_tp->t_log_res - overhead > - cur->bc_ag.refc.nr_ops * XFS_REFCOUNT_ITEM_OVERHEAD; + xrefc_btree_state(cur)->nr_ops * XFS_REFCOUNT_ITEM_OVERHEAD; } /* @@ -1124,7 +1173,7 @@ xfs_refcount_adjust_extents( if (error) goto out_error; if (!found_rec || ext.rc_domain != XFS_REFC_DOMAIN_SHARED) { - ext.rc_startblock = cur->bc_mp->m_sb.sb_agblocks; + ext.rc_startblock = xrefc_max_startblock(cur); ext.rc_blockcount = 0; ext.rc_refcount = 0; ext.rc_domain = XFS_REFC_DOMAIN_SHARED; @@ -1148,7 +1197,7 @@ xfs_refcount_adjust_extents( * Either cover the hole (increment) or * delete the range (decrement). 
*/ - cur->bc_ag.refc.nr_ops++; + xrefc_btree_state(cur)->nr_ops++; if (tmp.rc_refcount) { error = xfs_refcount_insert(cur, &tmp, &found_tmp); @@ -1205,7 +1254,7 @@ xfs_refcount_adjust_extents( goto skip; ext.rc_refcount += adj; trace_xfs_refcount_modify_extent(cur, &ext); - cur->bc_ag.refc.nr_ops++; + xrefc_btree_state(cur)->nr_ops++; if (ext.rc_refcount > 1) { error = xfs_refcount_update(cur, &ext); if (error) @@ -1288,7 +1337,7 @@ xfs_refcount_adjust( if (shape_changed) shape_changes++; if (shape_changes) - cur->bc_ag.refc.shape_changes++; + xrefc_btree_state(cur)->shape_changes++; /* Now that we've taken care of the ends, adjust the middle extents */ error = xfs_refcount_adjust_extents(cur, agbno, aglen, adj); @@ -1380,8 +1429,8 @@ xfs_refcount_finish_one( */ rcur = *pcur; if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) { - nr_ops = rcur->bc_ag.refc.nr_ops; - shape_changes = rcur->bc_ag.refc.shape_changes; + nr_ops = xrefc_btree_state(rcur)->nr_ops; + shape_changes = xrefc_btree_state(rcur)->shape_changes; xfs_refcount_finish_one_cleanup(tp, rcur, 0); rcur = NULL; *pcur = NULL; @@ -1393,8 +1442,8 @@ xfs_refcount_finish_one( return error; rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, ri->ri_pag); - rcur->bc_ag.refc.nr_ops = nr_ops; - rcur->bc_ag.refc.shape_changes = shape_changes; + xrefc_btree_state(rcur)->nr_ops = nr_ops; + xrefc_btree_state(rcur)->shape_changes = shape_changes; } *pcur = rcur; @@ -1689,7 +1738,7 @@ xfs_refcount_adjust_cow_extents( goto out_error; } if (!found_rec) { - ext.rc_startblock = cur->bc_mp->m_sb.sb_agblocks; + ext.rc_startblock = xrefc_max_startblock(cur); ext.rc_blockcount = 0; ext.rc_refcount = 0; ext.rc_domain = XFS_REFC_DOMAIN_COW; From patchwork Fri Dec 30 22:18:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085516 Subject: [PATCH 08/42] xfs: add a realtime flag to the refcount update log redo items From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:30 -0800 Message-ID: <167243871011.717073.12110454853946893456.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Extend the refcount update (CUI) log items with a new realtime flag that indicates that the updates apply against the realtime refcountbt. We'll wire up the actual refcount code later. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_bmap.c | 10 ++- fs/xfs/libxfs/xfs_defer.c | 1 fs/xfs/libxfs/xfs_defer.h | 1 fs/xfs/libxfs/xfs_log_format.h | 5 + fs/xfs/libxfs/xfs_refcount.c | 156 +++++++++++++++++++++++++++++----------- fs/xfs/libxfs/xfs_refcount.h | 18 +++-- fs/xfs/scrub/cow_repair.c | 2 - fs/xfs/scrub/reap.c | 5 + fs/xfs/xfs_refcount_item.c | 32 ++++++++ fs/xfs/xfs_reflink.c | 19 +++-- 10 files changed, 184 insertions(+), 65 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 8c683db35788..b46504d861e3 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4529,8 +4529,9 @@ xfs_bmapi_write( * the refcount btree for orphan recovery. */ if (whichfork == XFS_COW_FORK) - xfs_refcount_alloc_cow_extent(tp, bma.blkno, - bma.length); + xfs_refcount_alloc_cow_extent(tp, + XFS_IS_REALTIME_INODE(ip), + bma.blkno, bma.length); } /* Deal with the allocated space we found. 
*/ @@ -4696,7 +4697,8 @@ xfs_bmapi_convert_delalloc( *seq = READ_ONCE(ifp->if_seq); if (whichfork == XFS_COW_FORK) - xfs_refcount_alloc_cow_extent(tp, bma.blkno, bma.length); + xfs_refcount_alloc_cow_extent(tp, XFS_IS_REALTIME_INODE(ip), + bma.blkno, bma.length); error = xfs_bmap_btree_to_extents(tp, ip, bma.cur, &bma.logflags, whichfork); @@ -5313,7 +5315,7 @@ xfs_bmap_del_extent_real( */ if (want_free) { if (xfs_is_reflink_inode(ip) && whichfork == XFS_DATA_FORK) { - xfs_refcount_decrease_extent(tp, del); + xfs_refcount_decrease_extent(tp, isrt, del); } else { unsigned int efi_flags = 0; diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c index ce3bc5fe2bdc..1aefb4c99e7b 100644 --- a/fs/xfs/libxfs/xfs_defer.c +++ b/fs/xfs/libxfs/xfs_defer.c @@ -186,6 +186,7 @@ static struct kmem_cache *xfs_defer_pending_cache; static const struct xfs_defer_op_type *defer_op_types[] = { [XFS_DEFER_OPS_TYPE_BMAP] = &xfs_bmap_update_defer_type, [XFS_DEFER_OPS_TYPE_REFCOUNT] = &xfs_refcount_update_defer_type, + [XFS_DEFER_OPS_TYPE_REFCOUNT_RT] = &xfs_refcount_update_defer_type, [XFS_DEFER_OPS_TYPE_RMAP] = &xfs_rmap_update_defer_type, [XFS_DEFER_OPS_TYPE_RMAP_RT] = &xfs_rmap_update_defer_type, [XFS_DEFER_OPS_TYPE_FREE] = &xfs_extent_free_defer_type, diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h index 89c279185ce6..8564777c4c49 100644 --- a/fs/xfs/libxfs/xfs_defer.h +++ b/fs/xfs/libxfs/xfs_defer.h @@ -16,6 +16,7 @@ struct xfs_defer_capture; enum xfs_defer_ops_type { XFS_DEFER_OPS_TYPE_BMAP, XFS_DEFER_OPS_TYPE_REFCOUNT, + XFS_DEFER_OPS_TYPE_REFCOUNT_RT, XFS_DEFER_OPS_TYPE_RMAP, XFS_DEFER_OPS_TYPE_RMAP_RT, XFS_DEFER_OPS_TYPE_FREE, diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h index 3a23282d6e6f..66cfcafae9b8 100644 --- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -800,7 +800,10 @@ struct xfs_phys_extent { /* Type codes are taken directly from enum xfs_refcount_intent_type. 
*/ #define XFS_REFCOUNT_EXTENT_TYPE_MASK 0xFF -#define XFS_REFCOUNT_EXTENT_FLAGS (XFS_REFCOUNT_EXTENT_TYPE_MASK) +#define XFS_REFCOUNT_EXTENT_REALTIME (1U << 31) + +#define XFS_REFCOUNT_EXTENT_FLAGS (XFS_REFCOUNT_EXTENT_TYPE_MASK | \ + XFS_REFCOUNT_EXTENT_REALTIME) /* * This is the structure used to lay out a cui log item in the diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index a54a633f2ef9..999ba2c5c37d 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -25,6 +25,7 @@ #include "xfs_ag.h" #include "xfs_health.h" #include "xfs_rtgroup.h" +#include "xfs_rtalloc.h" struct kmem_cache *xfs_refcount_intent_cache; @@ -1141,6 +1142,28 @@ xfs_refcount_still_have_space( xrefc_btree_state(cur)->nr_ops * XFS_REFCOUNT_ITEM_OVERHEAD; } +/* Schedule an extent free. */ +static void +xrefc_free_extent( + struct xfs_btree_cur *cur, + struct xfs_refcount_irec *rec) +{ + xfs_fsblock_t fsbno; + unsigned int flags = 0; + + if (cur->bc_btnum == XFS_BTNUM_RTREFC) { + flags |= XFS_FREE_EXTENT_REALTIME; + fsbno = xfs_rgbno_to_rtb(cur->bc_mp, cur->bc_ino.rtg->rtg_rgno, + rec->rc_startblock); + } else { + fsbno = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_ag.pag->pag_agno, + rec->rc_startblock); + } + + xfs_free_extent_later(cur->bc_tp, fsbno, rec->rc_blockcount, NULL, + flags); +} + /* * Adjust the refcounts of middle extents. At this point we should have * split extents that crossed the adjustment range; merged with adjacent @@ -1157,7 +1180,6 @@ xfs_refcount_adjust_extents( struct xfs_refcount_irec ext, tmp; int error; int found_rec, found_tmp; - xfs_fsblock_t fsbno; /* Merging did all the work already. 
*/ if (*aglen == 0) @@ -1210,11 +1232,7 @@ xfs_refcount_adjust_extents( goto out_error; } } else { - fsbno = XFS_AGB_TO_FSB(cur->bc_mp, - cur->bc_ag.pag->pag_agno, - tmp.rc_startblock); - xfs_free_extent_later(cur->bc_tp, fsbno, - tmp.rc_blockcount, NULL, 0); + xrefc_free_extent(cur, &tmp); } (*agbno) += tmp.rc_blockcount; @@ -1270,11 +1288,7 @@ xfs_refcount_adjust_extents( } goto advloop; } else { - fsbno = XFS_AGB_TO_FSB(cur->bc_mp, - cur->bc_ag.pag->pag_agno, - ext.rc_startblock); - xfs_free_extent_later(cur->bc_tp, fsbno, - ext.rc_blockcount, NULL, 0); + xrefc_free_extent(cur, &ext); } skip: @@ -1358,19 +1372,31 @@ xfs_refcount_finish_one_cleanup( struct xfs_btree_cur *rcur, int error) { - struct xfs_buf *agbp; + struct xfs_buf *agbp = NULL; if (rcur == NULL) return; - agbp = rcur->bc_ag.agbp; + if (rcur->bc_btnum == XFS_BTNUM_REFC) + agbp = rcur->bc_ag.agbp; xfs_btree_del_cursor(rcur, error); - if (error) + if (agbp) xfs_trans_brelse(tp, agbp); } +/* Does this btree cursor match the given AG? */ +static inline bool +xfs_refcount_is_wrong_cursor( + struct xfs_btree_cur *cur, + struct xfs_refcount_intent *ri) +{ + if (cur->bc_btnum == XFS_BTNUM_RTREFC) + return cur->bc_ino.rtg != ri->ri_rtg; + return cur->bc_ag.pag != ri->ri_pag; +} + /* * Set up a continuation a deferred refcount operation by updating the intent. - * Checks to make sure we're not going to run off the end of the AG. + * Checks to make sure we're not going to run off the end of the AG or rtgroup. 
*/ static inline int xfs_refcount_continue_op( @@ -1379,19 +1405,35 @@ xfs_refcount_continue_op( xfs_agblock_t new_agbno) { struct xfs_mount *mp = cur->bc_mp; - struct xfs_perag *pag = cur->bc_ag.pag; - if (XFS_IS_CORRUPT(mp, !xfs_verify_agbext(pag, new_agbno, - ri->ri_blockcount))) { - xfs_btree_mark_sick(cur); - return -EFSCORRUPTED; + if (ri->ri_realtime) { + struct xfs_rtgroup *rtg = ri->ri_rtg; + + if (XFS_IS_CORRUPT(mp, !xfs_verify_rgbext(rtg, new_agbno, + ri->ri_blockcount))) { + xfs_btree_mark_sick(cur); + return -EFSCORRUPTED; + } + + ri->ri_startblock = xfs_rgbno_to_rtb(mp, rtg->rtg_rgno, new_agbno); + + ASSERT(xfs_verify_rtbext(mp, ri->ri_startblock, ri->ri_blockcount)); + ASSERT(rtg->rtg_rgno == xfs_rtb_to_rgno(mp, ri->ri_startblock)); + } else { + struct xfs_perag *pag = cur->bc_ag.pag; + + if (XFS_IS_CORRUPT(mp, !xfs_verify_agbext(pag, new_agbno, + ri->ri_blockcount))) { + xfs_btree_mark_sick(cur); + return -EFSCORRUPTED; + } + + ri->ri_startblock = XFS_AGB_TO_FSB(mp, pag->pag_agno, new_agbno); + + ASSERT(xfs_verify_fsbext(mp, ri->ri_startblock, ri->ri_blockcount)); + ASSERT(pag->pag_agno == XFS_FSB_TO_AGNO(mp, ri->ri_startblock)); } - ri->ri_startblock = XFS_AGB_TO_FSB(mp, pag->pag_agno, new_agbno); - - ASSERT(xfs_verify_fsbext(mp, ri->ri_startblock, ri->ri_blockcount)); - ASSERT(pag->pag_agno == XFS_FSB_TO_AGNO(mp, ri->ri_startblock)); - return 0; } @@ -1416,10 +1458,16 @@ xfs_refcount_finish_one( unsigned long nr_ops = 0; int shape_changes = 0; - bno = XFS_FSB_TO_AGBNO(mp, ri->ri_startblock); - trace_xfs_refcount_deferred(mp, ri); + if (ri->ri_realtime) { + xfs_rgnumber_t rgno; + + bno = xfs_rtb_to_rgbno(mp, ri->ri_startblock, &rgno); + } else { + bno = XFS_FSB_TO_AGBNO(mp, ri->ri_startblock); + } + if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REFCOUNT_FINISH_ONE)) return -EIO; @@ -1428,7 +1476,7 @@ xfs_refcount_finish_one( * the startblock, get one now. 
*/ rcur = *pcur; - if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) { + if (rcur != NULL && xfs_refcount_is_wrong_cursor(rcur, ri)) { nr_ops = xrefc_btree_state(rcur)->nr_ops; shape_changes = xrefc_btree_state(rcur)->shape_changes; xfs_refcount_finish_one_cleanup(tp, rcur, 0); @@ -1436,12 +1484,19 @@ xfs_refcount_finish_one( *pcur = NULL; } if (rcur == NULL) { - error = xfs_alloc_read_agf(ri->ri_pag, tp, - XFS_ALLOC_FLAG_FREEING, &agbp); - if (error) - return error; + if (ri->ri_realtime) { + /* coming in a later patch */ + ASSERT(0); + return -EFSCORRUPTED; + } else { + error = xfs_alloc_read_agf(ri->ri_pag, tp, + XFS_ALLOC_FLAG_FREEING, &agbp); + if (error) + return error; - rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, ri->ri_pag); + rcur = xfs_refcountbt_init_cursor(mp, tp, agbp, + ri->ri_pag); + } xrefc_btree_state(rcur)->nr_ops = nr_ops; xrefc_btree_state(rcur)->shape_changes = shape_changes; } @@ -1492,10 +1547,12 @@ static void __xfs_refcount_add( struct xfs_trans *tp, enum xfs_refcount_intent_type type, + bool isrt, xfs_fsblock_t startblock, xfs_extlen_t blockcount) { struct xfs_refcount_intent *ri; + enum xfs_defer_ops_type optype; ri = kmem_cache_alloc(xfs_refcount_intent_cache, GFP_NOFS | __GFP_NOFAIL); @@ -1503,11 +1560,24 @@ __xfs_refcount_add( ri->ri_type = type; ri->ri_startblock = startblock; ri->ri_blockcount = blockcount; + ri->ri_realtime = isrt; trace_xfs_refcount_defer(tp->t_mountp, ri); + /* + * Deferred refcount updates for the realtime and data sections must + * use separate transactions to finish deferred work because updates to + * realtime metadata files can lock AGFs to allocate btree blocks and + * we don't want that mixing with the AGF locks taken to finish data + * section updates. 
+ */ + if (isrt) + optype = XFS_DEFER_OPS_TYPE_REFCOUNT_RT; + else + optype = XFS_DEFER_OPS_TYPE_REFCOUNT; + xfs_refcount_update_get_group(tp->t_mountp, ri); - xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_REFCOUNT, &ri->ri_list); + xfs_defer_add(tp, optype, &ri->ri_list); } /* @@ -1516,12 +1586,13 @@ __xfs_refcount_add( void xfs_refcount_increase_extent( struct xfs_trans *tp, + bool isrt, struct xfs_bmbt_irec *PREV) { if (!xfs_has_reflink(tp->t_mountp)) return; - __xfs_refcount_add(tp, XFS_REFCOUNT_INCREASE, PREV->br_startblock, + __xfs_refcount_add(tp, XFS_REFCOUNT_INCREASE, isrt, PREV->br_startblock, PREV->br_blockcount); } @@ -1531,12 +1602,13 @@ xfs_refcount_increase_extent( void xfs_refcount_decrease_extent( struct xfs_trans *tp, + bool isrt, struct xfs_bmbt_irec *PREV) { if (!xfs_has_reflink(tp->t_mountp)) return; - __xfs_refcount_add(tp, XFS_REFCOUNT_DECREASE, PREV->br_startblock, + __xfs_refcount_add(tp, XFS_REFCOUNT_DECREASE, isrt, PREV->br_startblock, PREV->br_blockcount); } @@ -1892,6 +1964,7 @@ __xfs_refcount_cow_free( void xfs_refcount_alloc_cow_extent( struct xfs_trans *tp, + bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len) { @@ -1900,16 +1973,17 @@ xfs_refcount_alloc_cow_extent( if (!xfs_has_reflink(mp)) return; - __xfs_refcount_add(tp, XFS_REFCOUNT_ALLOC_COW, fsb, len); + __xfs_refcount_add(tp, XFS_REFCOUNT_ALLOC_COW, isrt, fsb, len); /* Add rmap entry */ - xfs_rmap_alloc_extent(tp, false, fsb, len, XFS_RMAP_OWN_COW); + xfs_rmap_alloc_extent(tp, isrt, fsb, len, XFS_RMAP_OWN_COW); } /* Forget a CoW staging event in the refcount btree. 
*/ void xfs_refcount_free_cow_extent( struct xfs_trans *tp, + bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len) { @@ -1919,8 +1993,8 @@ xfs_refcount_free_cow_extent( return; /* Remove rmap entry */ - xfs_rmap_free_extent(tp, false, fsb, len, XFS_RMAP_OWN_COW); - __xfs_refcount_add(tp, XFS_REFCOUNT_FREE_COW, fsb, len); + xfs_rmap_free_extent(tp, isrt, fsb, len, XFS_RMAP_OWN_COW); + __xfs_refcount_add(tp, XFS_REFCOUNT_FREE_COW, isrt, fsb, len); } struct xfs_refcount_recovery { @@ -2026,7 +2100,7 @@ xfs_refcount_recover_cow_leftovers( /* Free the orphan record */ fsb = XFS_AGB_TO_FSB(mp, pag->pag_agno, rr->rr_rrec.rc_startblock); - xfs_refcount_free_cow_extent(tp, fsb, + xfs_refcount_free_cow_extent(tp, false, fsb, rr->rr_rrec.rc_blockcount); /* Free the block. */ diff --git a/fs/xfs/libxfs/xfs_refcount.h b/fs/xfs/libxfs/xfs_refcount.h index 7713bb908bdc..4e725d723e88 100644 --- a/fs/xfs/libxfs/xfs_refcount.h +++ b/fs/xfs/libxfs/xfs_refcount.h @@ -56,10 +56,14 @@ enum xfs_refcount_intent_type { struct xfs_refcount_intent { struct list_head ri_list; - struct xfs_perag *ri_pag; + union { + struct xfs_perag *ri_pag; + struct xfs_rtgroup *ri_rtg; + }; enum xfs_refcount_intent_type ri_type; xfs_extlen_t ri_blockcount; xfs_fsblock_t ri_startblock; + bool ri_realtime; }; /* Check that the refcount is appropriate for the record domain. 
*/ @@ -77,9 +81,9 @@ xfs_refcount_check_domain( void xfs_refcount_update_get_group(struct xfs_mount *mp, struct xfs_refcount_intent *ri); -void xfs_refcount_increase_extent(struct xfs_trans *tp, +void xfs_refcount_increase_extent(struct xfs_trans *tp, bool isrt, struct xfs_bmbt_irec *irec); -void xfs_refcount_decrease_extent(struct xfs_trans *tp, +void xfs_refcount_decrease_extent(struct xfs_trans *tp, bool isrt, struct xfs_bmbt_irec *irec); extern void xfs_refcount_finish_one_cleanup(struct xfs_trans *tp, @@ -91,10 +95,10 @@ extern int xfs_refcount_find_shared(struct xfs_btree_cur *cur, xfs_agblock_t agbno, xfs_extlen_t aglen, xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_end_of_shared); -void xfs_refcount_alloc_cow_extent(struct xfs_trans *tp, xfs_fsblock_t fsb, - xfs_extlen_t len); -void xfs_refcount_free_cow_extent(struct xfs_trans *tp, xfs_fsblock_t fsb, - xfs_extlen_t len); +void xfs_refcount_alloc_cow_extent(struct xfs_trans *tp, bool isrt, + xfs_fsblock_t fsb, xfs_extlen_t len); +void xfs_refcount_free_cow_extent(struct xfs_trans *tp, bool isrt, + xfs_fsblock_t fsb, xfs_extlen_t len); extern int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp, struct xfs_perag *pag); diff --git a/fs/xfs/scrub/cow_repair.c b/fs/xfs/scrub/cow_repair.c index 5292171e6a2b..a0c1d97ab8b6 100644 --- a/fs/xfs/scrub/cow_repair.c +++ b/fs/xfs/scrub/cow_repair.c @@ -336,7 +336,7 @@ xrep_cow_alloc( if (args.fsbno == NULLFSBLOCK) return -ENOSPC; - xfs_refcount_alloc_cow_extent(sc->tp, args.fsbno, args.len); + xfs_refcount_alloc_cow_extent(sc->tp, false, args.fsbno, args.len); irec->br_startblock = args.fsbno; irec->br_blockcount = args.len; diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c index b0b29b1e139b..77354bdb0511 100644 --- a/fs/xfs/scrub/reap.c +++ b/fs/xfs/scrub/reap.c @@ -349,7 +349,8 @@ xreap_agextent( * If we're unmapping CoW staging extents, remove the * records from the refcountbt as well. 
*/ - xfs_refcount_free_cow_extent(sc->tp, fsbno, *aglenp); + xfs_refcount_free_cow_extent(sc->tp, false, fsbno, + *aglenp); return 0; } return xfs_rmap_free(sc->tp, sc->sa.agf_bp, sc->sa.pag, agbno, @@ -381,7 +382,7 @@ xreap_agextent( ASSERT(rs->resv == XFS_AG_RESV_NONE); rs->force_roll = true; - xfs_refcount_free_cow_extent(sc->tp, fsbno, *aglenp); + xfs_refcount_free_cow_extent(sc->tp, false, fsbno, *aglenp); xfs_free_extent_later(sc->tp, fsbno, *aglenp, NULL, XFS_FREE_EXTENT_SKIP_DISCARD); return 0; diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index ccc334d482a4..7a366b316e79 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -21,6 +21,7 @@ #include "xfs_log_priv.h" #include "xfs_log_recover.h" #include "xfs_ag.h" +#include "xfs_rtgroup.h" struct kmem_cache *xfs_cui_cache; struct kmem_cache *xfs_cud_cache; @@ -286,6 +287,11 @@ xfs_refcount_update_diff_items( ra = container_of(a, struct xfs_refcount_intent, ri_list); rb = container_of(b, struct xfs_refcount_intent, ri_list); + ASSERT(ra->ri_realtime == rb->ri_realtime); + + if (ra->ri_realtime) + return ra->ri_rtg->rtg_rgno - rb->ri_rtg->rtg_rgno; + return ra->ri_pag->pag_agno - rb->ri_pag->pag_agno; } @@ -324,6 +330,8 @@ xfs_refcount_update_log_item( default: ASSERT(0); } + if (ri->ri_realtime) + pmap->pe_flags |= XFS_REFCOUNT_EXTENT_REALTIME; } static struct xfs_log_item * @@ -365,6 +373,15 @@ xfs_refcount_update_get_group( { xfs_agnumber_t agno; + if (ri->ri_realtime) { + xfs_rgnumber_t rgno; + + rgno = xfs_rtb_to_rgno(mp, ri->ri_startblock); + ri->ri_rtg = xfs_rtgroup_get(mp, rgno); + xfs_rtgroup_bump_intents(ri->ri_rtg); + return; + } + agno = XFS_FSB_TO_AGNO(mp, ri->ri_startblock); ri->ri_pag = xfs_perag_get(mp, agno); xfs_perag_bump_intents(ri->ri_pag); @@ -375,6 +392,12 @@ static inline void xfs_refcount_update_put_group( struct xfs_refcount_intent *ri) { + if (ri->ri_realtime) { + xfs_rtgroup_drop_intents(ri->ri_rtg); + xfs_rtgroup_put(ri->ri_rtg); + 
return; + } + xfs_perag_drop_intents(ri->ri_pag); xfs_perag_put(ri->ri_pag); } @@ -536,6 +559,7 @@ xfs_cui_item_recover( goto abort_error; } + fake.ri_realtime = pmap->pe_flags & XFS_REFCOUNT_EXTENT_REALTIME; fake.ri_startblock = pmap->pe_startblock; fake.ri_blockcount = pmap->pe_len; @@ -561,18 +585,22 @@ xfs_cui_item_recover( switch (fake.ri_type) { case XFS_REFCOUNT_INCREASE: - xfs_refcount_increase_extent(tp, &irec); + xfs_refcount_increase_extent(tp, + fake.ri_realtime, &irec); break; case XFS_REFCOUNT_DECREASE: - xfs_refcount_decrease_extent(tp, &irec); + xfs_refcount_decrease_extent(tp, + fake.ri_realtime, &irec); break; case XFS_REFCOUNT_ALLOC_COW: xfs_refcount_alloc_cow_extent(tp, + fake.ri_realtime, irec.br_startblock, irec.br_blockcount); break; case XFS_REFCOUNT_FREE_COW: xfs_refcount_free_cow_extent(tp, + fake.ri_realtime, irec.br_startblock, irec.br_blockcount); break; diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index cf514af238ce..52e73aa2c38e 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -585,6 +585,7 @@ xfs_reflink_cancel_cow_blocks( struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_COW_FORK); struct xfs_bmbt_irec got, del; struct xfs_iext_cursor icur; + bool isrt = XFS_IS_REALTIME_INODE(ip); int error = 0; if (!xfs_inode_has_cow_data(ip)) @@ -614,11 +615,12 @@ xfs_reflink_cancel_cow_blocks( ASSERT((*tpp)->t_firstblock == NULLFSBLOCK); /* Free the CoW orphan record. */ - xfs_refcount_free_cow_extent(*tpp, del.br_startblock, - del.br_blockcount); + xfs_refcount_free_cow_extent(*tpp, isrt, + del.br_startblock, del.br_blockcount); xfs_free_extent_later(*tpp, del.br_startblock, - del.br_blockcount, NULL, 0); + del.br_blockcount, NULL, + isrt ? 
XFS_FREE_EXTENT_REALTIME : 0); /* Roll the transaction */ error = xfs_defer_finish(tpp); @@ -726,6 +728,7 @@ xfs_reflink_end_cow_extent( struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_COW_FORK); unsigned int resblks; int nmaps; + bool isrt = XFS_IS_REALTIME_INODE(ip); int error; /* No COW extents? That's easy! */ @@ -803,7 +806,7 @@ xfs_reflink_end_cow_extent( * or not), unmap the extent and drop its refcount. */ xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &data); - xfs_refcount_decrease_extent(tp, &data); + xfs_refcount_decrease_extent(tp, isrt, &data); xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, -data.br_blockcount); } else if (data.br_startblock == DELAYSTARTBLOCK) { @@ -823,7 +826,8 @@ xfs_reflink_end_cow_extent( } /* Free the CoW orphan record. */ - xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount); + xfs_refcount_free_cow_extent(tp, isrt, del.br_startblock, + del.br_blockcount); /* Map the new blocks into the data fork. */ xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &del); @@ -1160,6 +1164,7 @@ xfs_reflink_remap_extent( bool quota_reserved = true; bool smap_real; bool dmap_written = xfs_bmap_is_written_extent(dmap); + bool isrt = XFS_IS_REALTIME_INODE(ip); int iext_delta = 0; int nimaps; int error; @@ -1291,7 +1296,7 @@ xfs_reflink_remap_extent( * or not), unmap the extent and drop its refcount. */ xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &smap); - xfs_refcount_decrease_extent(tp, &smap); + xfs_refcount_decrease_extent(tp, isrt, &smap); qdelta -= smap.br_blockcount; } else if (smap.br_startblock == DELAYSTARTBLOCK) { int done; @@ -1314,7 +1319,7 @@ xfs_reflink_remap_extent( * its refcount and map it into the file. 
*/ if (dmap_written) { - xfs_refcount_increase_extent(tp, dmap); + xfs_refcount_increase_extent(tp, isrt, dmap); xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, dmap); qdelta += dmap->br_blockcount; } From patchwork Fri Dec 30 22:18:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AAA61C4332F for ; Sat, 31 Dec 2022 01:50:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236095AbiLaBuY (ORCPT ); Fri, 30 Dec 2022 20:50:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236079AbiLaBuX (ORCPT ); Fri, 30 Dec 2022 20:50:23 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 492A41DDD1 for ; Fri, 30 Dec 2022 17:50:23 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DB6C461CD0 for ; Sat, 31 Dec 2022 01:50:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 484C4C433EF; Sat, 31 Dec 2022 01:50:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451422; bh=1j548RtOEwsQjE2/4fOsZFv6dRRwWYIP8Kis1UBfTfw=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=s1cgw9UBc7n4yjlPWLuFfnSY11VRuM13L/+RGux4dVJhLEcXFGk33dMGrfJ4BBA/j IC/b+eHflxsiwQ5TUZ5Ffwr2Jl8USypu9lJpTQ04NnwjfQ2Re2thTphy6HmHIqsuYg 
sZ4+cU2hTkVIC/EV8lfvlb6RGGd3cOiumlQ5/0aaqlkRo53HnY8Po1ZNh/bZ9Gj04i fxpjDReqai4Ckrs6U5KOHMSIzgzf02syVk3EWDJZ4VzPdP/mr9RXD9vKcCa+NGu8Ix Q3YiaDWCJvbHsqXeFlv6slGgAAzlVBPgz/PPHRB81CihVFJDuVJy//XcMnBLL0bRFD Up4eBdQw/vdHQ== Subject: [PATCH 09/42] xfs: support recovering refcount intent items targetting realtime extents From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:30 -0800 Message-ID: <167243871026.717073.14257109009386052103.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Now that we have reflink on the realtime device, refcount intent items have to support remapping extents on the realtime volume. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_refcount_item.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index 7a366b316e79..fc6dbbb17ad7 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -482,6 +482,9 @@ xfs_cui_validate_phys( return false; } + if (pmap->pe_flags & XFS_REFCOUNT_EXTENT_REALTIME) + return xfs_verify_rtbext(mp, pmap->pe_startblock, pmap->pe_len); + return xfs_verify_fsbext(mp, pmap->pe_startblock, pmap->pe_len); } From patchwork Fri Dec 30 22:18:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085518 Subject: [PATCH 10/42] xfs: add realtime refcount btree block detection to log recovery From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:30 -0800 Message-ID: <167243871039.717073.15249797934773660086.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Identify rt refcount btree blocks in the log correctly so that we can validate them during log recovery. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_buf_item_recover.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c index 496260c9d8cd..5368a0d34452 100644 --- a/fs/xfs/xfs_buf_item_recover.c +++ b/fs/xfs/xfs_buf_item_recover.c @@ -268,6 +268,9 @@ xlog_recover_validate_buf_type( case XFS_REFC_CRC_MAGIC: bp->b_ops = &xfs_refcountbt_buf_ops; break; + case XFS_RTREFC_CRC_MAGIC: + bp->b_ops = &xfs_rtrefcountbt_buf_ops; + break; default: warnmsg = "Bad btree block magic!"; break; @@ -772,6 +775,7 @@ xlog_recover_get_buf_lsn( break; } case XFS_RTRMAP_CRC_MAGIC: + case XFS_RTREFC_CRC_MAGIC: case XFS_BMAP_CRC_MAGIC: case XFS_BMAP_MAGIC: { struct xfs_btree_block *btb = blk; From patchwork Fri Dec 30 22:18:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085519 Subject: [PATCH 11/42] xfs: add realtime refcount btree inode to metadata directory From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:30 -0800 Message-ID: <167243871053.717073.8108075011573657245.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Add a metadir path to select the realtime refcount btree inode and load it at mount time. The rtrefcountbt inode will have a unique extent format code, which means that we also have to update the inode validation and flush routines to look for it. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_bmap.c | 8 +++- fs/xfs/libxfs/xfs_format.h | 4 ++ fs/xfs/libxfs/xfs_inode_buf.c | 6 +++ fs/xfs/libxfs/xfs_inode_fork.c | 9 +++++ fs/xfs/libxfs/xfs_rtgroup.h | 3 ++ fs/xfs/libxfs/xfs_rtrefcount_btree.c | 33 ++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 4 ++ fs/xfs/xfs_inode.c | 13 +++++++ fs/xfs/xfs_inode_item.c | 2 + fs/xfs/xfs_inode_item_recover.c | 1 + fs/xfs/xfs_rtalloc.c | 63 ++++++++++++++++++++++++++++++++++ fs/xfs/xfs_trace.h | 1 + 12 files changed, 144 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index b46504d861e3..fe31f3cb5d91 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -5148,9 +5148,13 @@ xfs_bmap_del_extent_real( * the same order of operations as the data device, which is: * Remove the file mapping, remove the reverse mapping, and * then free the blocks. This means that we must delay the - * freeing until after we've scheduled the rmap update. + * freeing until after we've scheduled the rmap update. If + * realtime reflink is enabled, use deferred refcount intent + * items to decide what to do with the extent, just like we do + * for the data device. 
*/ - if (want_free && !xfs_has_rtrmapbt(mp)) { + if (want_free && !xfs_has_rtrmapbt(mp) && + !xfs_has_rtreflink(mp)) { error = xfs_rtfree_blocks(tp, del->br_startblock, del->br_blockcount); if (error) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index d2270f95bfbc..20af5b730d6d 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1011,6 +1011,7 @@ enum xfs_dinode_fmt { XFS_DINODE_FMT_BTREE, /* struct xfs_bmdr_block */ XFS_DINODE_FMT_UUID, /* added long ago, but never used */ XFS_DINODE_FMT_RMAP, /* reverse mapping btree */ + XFS_DINODE_FMT_REFCOUNT, /* reference count btree */ }; #define XFS_INODE_FORMAT_STR \ @@ -1019,7 +1020,8 @@ enum xfs_dinode_fmt { { XFS_DINODE_FMT_EXTENTS, "extent" }, \ { XFS_DINODE_FMT_BTREE, "btree" }, \ { XFS_DINODE_FMT_UUID, "uuid" }, \ - { XFS_DINODE_FMT_RMAP, "rmap" } + { XFS_DINODE_FMT_RMAP, "rmap" }, \ + { XFS_DINODE_FMT_REFCOUNT, "refcount" } /* * Max values for extnum and aextnum. diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 9ac84be391b3..dcf816f2643b 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -414,6 +414,12 @@ xfs_dinode_verify_fork( if (!(dip->di_flags2 & cpu_to_be64(XFS_DIFLAG2_METADATA))) return __this_address; break; + case XFS_DINODE_FMT_REFCOUNT: + if (!xfs_has_rtreflink(mp)) + return __this_address; + if (!(dip->di_flags2 & cpu_to_be64(XFS_DIFLAG2_METADATA))) + return __this_address; + break; default: return __this_address; } diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index 61926c07aad3..e69ec68b5a9d 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -266,6 +266,11 @@ xfs_iformat_data_fork( return -EFSCORRUPTED; } return xfs_iformat_rtrmap(ip, dip); + case XFS_DINODE_FMT_REFCOUNT: + if (!xfs_has_rtreflink(ip->i_mount)) + return -EFSCORRUPTED; + ASSERT(0); /* to be implemented later */ + return -EFSCORRUPTED; default: 
xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip, sizeof(*dip), __this_address); @@ -652,6 +657,10 @@ xfs_iflush_fork( xfs_iflush_rtrmap(ip, dip); break; + case XFS_DINODE_FMT_REFCOUNT: + ASSERT(0); /* to be implemented later */ + break; + default: ASSERT(0); break; diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 4e9b9098f2f2..0f400f133d88 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -23,6 +23,9 @@ struct xfs_rtgroup { /* reverse mapping btree inode */ struct xfs_inode *rtg_rmapip; + /* refcount btree inode */ + struct xfs_inode *rtg_refcountip; + /* Number of blocks in this group */ xfs_rgblock_t rtg_blockcount; diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index bdefc4f5939d..40524fee3860 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -26,6 +26,7 @@ #include "xfs_extent_busy.h" #include "xfs_rtgroup.h" #include "xfs_rtbitmap.h" +#include "xfs_imeta.h" static struct kmem_cache *xfs_rtrefcountbt_cur_cache; @@ -354,6 +355,7 @@ xfs_rtrefcountbt_commit_staged_btree( int flags = XFS_ILOG_CORE | XFS_ILOG_DBROOT; ASSERT(cur->bc_flags & XFS_BTREE_STAGING); + ASSERT(ifake->if_fork->if_format == XFS_DINODE_FMT_REFCOUNT); /* * Free any resources hanging off the real fork, then shallow-copy the @@ -457,3 +459,34 @@ xfs_rtrefcountbt_compute_maxlevels( /* Add one level to handle the inode root level. */ mp->m_rtrefc_maxlevels = min(d_maxlevels, r_maxlevels) + 1; } + +#define XFS_RTREFC_NAMELEN 21 + +/* Create the metadata directory path for an rtrefcount btree inode. 
*/ +int +xfs_rtrefcountbt_create_path( + struct xfs_mount *mp, + xfs_rgnumber_t rgno, + struct xfs_imeta_path **pathp) +{ + struct xfs_imeta_path *path; + char *fname; + int error; + + error = xfs_imeta_create_file_path(mp, 2, &path); + if (error) + return error; + + fname = kmalloc(XFS_RTREFC_NAMELEN, GFP_KERNEL); + if (!fname) { + xfs_imeta_free_path(path); + return -ENOMEM; + } + + snprintf(fname, XFS_RTREFC_NAMELEN, "%u.refcount", rgno); + path->im_path[0] = "realtime"; + path->im_path[1] = fname; + path->im_dynamicmask = 0x2; + *pathp = path; + return 0; +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h index d10ebdcf7727..1f3f590c68e6 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.h +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -11,6 +11,7 @@ struct xfs_btree_cur; struct xfs_mount; struct xbtree_ifakeroot; struct xfs_rtgroup; +struct xfs_imeta_path; /* refcounts only exist on crc enabled filesystems */ #define XFS_RTREFCOUNT_BLOCK_LEN XFS_BTREE_LBLOCK_CRC_LEN @@ -68,4 +69,7 @@ unsigned int xfs_rtrefcountbt_maxlevels_ondisk(void); int __init xfs_rtrefcountbt_init_cur_cache(void); void xfs_rtrefcountbt_destroy_cur_cache(void); +int xfs_rtrefcountbt_create_path(struct xfs_mount *mp, xfs_rgnumber_t rgno, + struct xfs_imeta_path **pathp); + #endif /* __XFS_RTREFCOUNT_BTREE_H__ */ diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 3b0c04b6bcdf..d50cbd0eb260 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2580,6 +2580,14 @@ xfs_iflush( __func__, ip->i_ino, ip); goto flush_out; } + } else if (ip->i_df.if_format == XFS_DINODE_FMT_REFCOUNT) { + if (!S_ISREG(VFS_I(ip)->i_mode) || + !(ip->i_diflags2 & XFS_DIFLAG2_METADATA)) { + xfs_alert_tag(mp, XFS_PTAG_IFLUSH, + "%s: Bad rt refcountbt inode %Lu, ptr "PTR_FMT, + __func__, ip->i_ino, ip); + goto flush_out; + } } else if (S_ISREG(VFS_I(ip)->i_mode)) { if (XFS_TEST_ERROR( ip->i_df.if_format != XFS_DINODE_FMT_EXTENTS && @@ -2626,6 +2634,11 @@ xfs_iflush( "%s: 
rt rmapbt in inode %Lu attr fork, ptr "PTR_FMT, __func__, ip->i_ino, ip); goto flush_out; + } else if (ip->i_af.if_format == XFS_DINODE_FMT_REFCOUNT) { + xfs_alert_tag(mp, XFS_PTAG_IFLUSH, + "%s: rt refcountbt in inode %Lu attr fork, ptr "PTR_FMT, + __func__, ip->i_ino, ip); + goto flush_out; } } diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index b6e374744474..7cbc79e3997a 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -63,6 +63,7 @@ xfs_inode_item_data_fork_size( break; case XFS_DINODE_FMT_BTREE: case XFS_DINODE_FMT_RMAP: + case XFS_DINODE_FMT_REFCOUNT: if ((iip->ili_fields & XFS_ILOG_DBROOT) && ip->i_df.if_broot_bytes > 0) { *nbytes += ip->i_df.if_broot_bytes; @@ -184,6 +185,7 @@ xfs_inode_item_format_data_fork( break; case XFS_DINODE_FMT_BTREE: case XFS_DINODE_FMT_RMAP: + case XFS_DINODE_FMT_REFCOUNT: iip->ili_fields &= ~(XFS_ILOG_DDATA | XFS_ILOG_DEXT | XFS_ILOG_DEV); diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c index 4f1ed1f6a34d..feeba1dff01e 100644 --- a/fs/xfs/xfs_inode_item_recover.c +++ b/fs/xfs/xfs_inode_item_recover.c @@ -417,6 +417,7 @@ xlog_recover_inode_commit_pass2( if (unlikely(S_ISREG(ldip->di_mode))) { if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) && (ldip->di_format != XFS_DINODE_FMT_RMAP) && + (ldip->di_format != XFS_DINODE_FMT_REFCOUNT) && (ldip->di_format != XFS_DINODE_FMT_BTREE)) { XFS_CORRUPTION_ERROR( "Bad log dinode data fork format for regular file", diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 0f31680284fb..c998e26f5db9 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -31,6 +31,7 @@ #include "xfs_btree.h" #include "xfs_rmap.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" /* * Realtime metadata files are not quite regular files because userspace can't @@ -42,6 +43,7 @@ static struct lock_class_key xfs_rbmip_key; static struct lock_class_key xfs_rsumip_key; static struct lock_class_key xfs_rrmapip_key; +static 
struct lock_class_key xfs_rrefcountip_key; /* * Read and return the summary information for a given extent size, @@ -1855,6 +1857,47 @@ xfs_rtmount_iread_extents( return error; } +/* Load realtime refcount btree inode. */ +STATIC int +xfs_rtmount_refcountbt( + struct xfs_rtgroup *rtg) +{ + struct xfs_mount *mp = rtg->rtg_mount; + struct xfs_imeta_path *path; + struct xfs_inode *ip; + xfs_ino_t ino; + int error; + + if (!xfs_has_rtreflink(mp)) + return 0; + + error = xfs_rtrefcountbt_create_path(mp, rtg->rtg_rgno, &path); + if (error) + return error; + + error = xfs_imeta_lookup(mp, path, &ino); + if (error) + goto out_path; + + error = xfs_rt_iget(mp, ino, &xfs_rrefcountip_key, &ip); + if (error) + goto out_path; + + if (XFS_IS_CORRUPT(mp, ip->i_df.if_format != XFS_DINODE_FMT_REFCOUNT)) { + error = -EFSCORRUPTED; + goto out_rele; + } + + rtg->rtg_refcountip = ip; + ip = NULL; +out_rele: + if (ip) + xfs_imeta_irele(ip); +out_path: + xfs_imeta_free_path(path); + return error; +} + /* * Get the bitmap and summary inodes and the summary cache into the mount * structure at mount time. 
@@ -1902,6 +1945,10 @@ xfs_rtmount_inodes( xfs_rtgroup_put(rtg); goto out_rele_rtgroup; } + + error = xfs_rtmount_refcountbt(rtg); + if (error) + goto out_rele_rtgroup; } xfs_alloc_rsum_cache(mp, sbp->sb_rbmblocks); @@ -1909,6 +1956,10 @@ xfs_rtmount_inodes( out_rele_rtgroup: for_each_rtgroup(mp, rgno, rtg) { + if (rtg->rtg_refcountip) + xfs_imeta_irele(rtg->rtg_refcountip); + rtg->rtg_refcountip = NULL; + if (rtg->rtg_rmapip) xfs_imeta_irele(rtg->rtg_rmapip); rtg->rtg_rmapip = NULL; @@ -1945,6 +1996,14 @@ xfs_rtmount_dqattach( return error; } } + + if (rtg->rtg_refcountip) { + error = xfs_qm_dqattach(rtg->rtg_refcountip); + if (error) { + xfs_rtgroup_put(rtg); + return error; + } + } } return 0; @@ -1960,6 +2019,10 @@ xfs_rtunmount_inodes( kmem_free(mp->m_rsum_cache); for_each_rtgroup(mp, rgno, rtg) { + if (rtg->rtg_refcountip) + xfs_imeta_irele(rtg->rtg_refcountip); + rtg->rtg_refcountip = NULL; + if (rtg->rtg_rmapip) xfs_imeta_irele(rtg->rtg_rmapip); rtg->rtg_rmapip = NULL; diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 1f8ab7c436a9..d07947451ec9 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -2239,6 +2239,7 @@ TRACE_DEFINE_ENUM(XFS_DINODE_FMT_EXTENTS); TRACE_DEFINE_ENUM(XFS_DINODE_FMT_BTREE); TRACE_DEFINE_ENUM(XFS_DINODE_FMT_UUID); TRACE_DEFINE_ENUM(XFS_DINODE_FMT_RMAP); +TRACE_DEFINE_ENUM(XFS_DINODE_FMT_REFCOUNT); DECLARE_EVENT_CLASS(xfs_swap_extent_class, TP_PROTO(struct xfs_inode *ip, int which), From patchwork Fri Dec 30 22:18:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085520 Subject: [PATCH 12/42] xfs: add metadata reservations for realtime refcount btree From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:30 -0800 Message-ID: <167243871068.717073.4070369282152095464.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Reserve some free blocks so that we will always have enough free blocks in the data volume to handle expansion of the realtime refcount btree. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rtrefcount_btree.c | 39 ++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 2 ++ fs/xfs/xfs_rtalloc.c | 9 +++++++- 3 files changed, 49 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index 40524fee3860..74c5cf9a0d3a 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -490,3 +490,42 @@ xfs_rtrefcountbt_create_path( *pathp = path; return 0; } + +/* Calculate the rtrefcount btree size for some records. */ +static unsigned long long +xfs_rtrefcountbt_calc_size( + struct xfs_mount *mp, + unsigned long long len) +{ + return xfs_btree_calc_size(mp->m_rtrefc_mnr, len); +} + +/* + * Calculate the maximum refcount btree size. + */ +static unsigned long long +xfs_rtrefcountbt_max_size( + struct xfs_mount *mp, + xfs_rtblock_t rtblocks) +{ + /* Bail out if we're uninitialized, which can happen in mkfs. */ + if (mp->m_rtrefc_mxr[0] == 0) + return 0; + + return xfs_rtrefcountbt_calc_size(mp, rtblocks); +} + +/* + * Figure out how many blocks to reserve and how many are used by this btree. + * We need enough space to hold one record for every rt extent in the rtgroup. 
+ */ +xfs_filblks_t +xfs_rtrefcountbt_calc_reserves( + struct xfs_mount *mp) +{ + if (!xfs_has_rtreflink(mp)) + return 0; + + return xfs_rtrefcountbt_max_size(mp, + xfs_rtb_to_rtxt(mp, mp->m_sb.sb_rgblocks)); +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h index 1f3f590c68e6..ffda0b063bcf 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.h +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -72,4 +72,6 @@ void xfs_rtrefcountbt_destroy_cur_cache(void); int xfs_rtrefcountbt_create_path(struct xfs_mount *mp, xfs_rgnumber_t rgno, struct xfs_imeta_path **pathp); +xfs_filblks_t xfs_rtrefcountbt_calc_reserves(struct xfs_mount *mp); + #endif /* __XFS_RTREFCOUNT_BTREE_H__ */ diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index c998e26f5db9..48c7cc28b7f2 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1733,8 +1733,10 @@ xfs_rt_resv_free( struct xfs_rtgroup *rtg; xfs_rgnumber_t rgno; - for_each_rtgroup(mp, rgno, rtg) + for_each_rtgroup(mp, rgno, rtg) { + xfs_imeta_resv_free_inode(rtg->rtg_refcountip); xfs_imeta_resv_free_inode(rtg->rtg_rmapip); + } } /* Reserve space for rt metadata inodes' space expansion. */ @@ -1754,6 +1756,11 @@ xfs_rt_resv_init( err2 = xfs_imeta_resv_init_inode(rtg->rtg_rmapip, ask); if (err2 && !error) error = err2; + + ask = xfs_rtrefcountbt_calc_reserves(mp); + err2 = xfs_imeta_resv_init_inode(rtg->rtg_refcountip, ask); + if (err2 && !error) + error = err2; } return error; From patchwork Fri Dec 30 22:18:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085521 Subject: [PATCH 13/42] xfs: wire up a new inode fork type for the realtime refcount From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:30 -0800 Message-ID: <167243871082.717073.14113526570889873496.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Plumb in the pieces we need to embed the root of the realtime refcount btree in an inode's data fork, complete with new fork type and on-disk interpretation functions. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_format.h | 8 + fs/xfs/libxfs/xfs_inode_fork.c | 8 + fs/xfs/libxfs/xfs_rtrefcount_btree.c | 236 ++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 112 ++++++++++++++++ fs/xfs/xfs_inode_item_recover.c | 4 + fs/xfs/xfs_ondisk.h | 1 6 files changed, 366 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index 20af5b730d6d..17be73c45226 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1805,6 +1805,14 @@ typedef __be32 xfs_refcount_ptr_t; */ #define XFS_RTREFC_CRC_MAGIC 0x52434e54 /* 'RCNT' */ +/* + * rt refcount root header, on-disk form only. 
+ */ +struct xfs_rtrefcount_root { + __be16 bb_level; /* 0 is a leaf */ + __be16 bb_numrecs; /* current # of data records */ +}; + /* inode-rooted btree pointer type */ typedef __be64 xfs_rtrefcount_ptr_t; diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index e69ec68b5a9d..7aae3ae810b7 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -28,6 +28,7 @@ #include "xfs_health.h" #include "xfs_symlink_remote.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" struct kmem_cache *xfs_ifork_cache; @@ -269,8 +270,7 @@ xfs_iformat_data_fork( case XFS_DINODE_FMT_REFCOUNT: if (!xfs_has_rtreflink(ip->i_mount)) return -EFSCORRUPTED; - ASSERT(0); /* to be implemented later */ - return -EFSCORRUPTED; + return xfs_iformat_rtrefcount(ip, dip); default: xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip, sizeof(*dip), __this_address); @@ -658,7 +658,9 @@ xfs_iflush_fork( break; case XFS_DINODE_FMT_REFCOUNT: - ASSERT(0); /* to be implemented later */ + ASSERT(whichfork == XFS_DATA_FORK); + if (iip->ili_fields & brootflag[whichfork]) + xfs_iflush_rtrefcount(ip, dip); break; default: diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index 74c5cf9a0d3a..a43ee6d7b547 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -85,6 +85,41 @@ xfs_rtrefcountbt_get_maxrecs( return cur->bc_mp->m_rtrefc_mxr[level != 0]; } +/* + * Calculate number of records in a realtime refcount btree inode root. + */ +unsigned int +xfs_rtrefcountbt_droot_maxrecs( + unsigned int blocklen, + bool leaf) +{ + blocklen -= sizeof(struct xfs_rtrefcount_root); + + if (leaf) + return blocklen / sizeof(struct xfs_refcount_rec); + return blocklen / (2 * sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); +} + +/* + * Get the maximum records we could store in the on-disk format. 
+ * + * For non-root nodes this is equivalent to xfs_rtrefcountbt_get_maxrecs, but + * for the root node this checks the available space in the dinode fork so that + * we can resize the in-memory buffer to match it. After a resize to the + * maximum size this function returns the same value as + * xfs_rtrefcountbt_get_maxrecs for the root node, too. + */ +STATIC int +xfs_rtrefcountbt_get_dmaxrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level != cur->bc_nlevels - 1) + return cur->bc_mp->m_rtrefc_mxr[level != 0]; + return xfs_rtrefcountbt_droot_maxrecs(cur->bc_ino.forksize, level == 0); +} + STATIC void xfs_rtrefcountbt_init_key_from_rec( union xfs_btree_key *key, @@ -255,6 +290,68 @@ xfs_rtrefcountbt_keys_contiguous( be32_to_cpu(key2->refc.rc_startblock)); } +/* Move the rt refcount btree root from one incore buffer to another. */ +static void +xfs_rtrefcountbt_broot_move( + struct xfs_inode *ip, + int whichfork, + struct xfs_btree_block *dst_broot, + size_t dst_bytes, + struct xfs_btree_block *src_broot, + size_t src_bytes, + unsigned int level, + unsigned int numrecs) +{ + struct xfs_mount *mp = ip->i_mount; + void *dptr; + void *sptr; + + ASSERT(xfs_rtrefcount_droot_space(src_broot) <= + xfs_inode_fork_size(ip, whichfork)); + + /* + * We always have to move the pointers because they are not butted + * against the btree block header. + */ + if (numrecs && level > 0) { + sptr = xfs_rtrefcount_broot_ptr_addr(mp, src_broot, 1, + src_bytes); + dptr = xfs_rtrefcount_broot_ptr_addr(mp, dst_broot, 1, + dst_bytes); + memmove(dptr, sptr, numrecs * sizeof(xfs_fsblock_t)); + } + + if (src_broot == dst_broot) + return; + + /* + * If the root is being totally relocated, we have to migrate the block + * header and the keys/records that come after it. 
+ */ + memcpy(dst_broot, src_broot, XFS_RTREFCOUNT_BLOCK_LEN); + + if (!numrecs) + return; + + if (level == 0) { + sptr = xfs_rtrefcount_rec_addr(src_broot, 1); + dptr = xfs_rtrefcount_rec_addr(dst_broot, 1); + memcpy(dptr, sptr, + numrecs * sizeof(struct xfs_refcount_rec)); + } else { + sptr = xfs_rtrefcount_key_addr(src_broot, 1); + dptr = xfs_rtrefcount_key_addr(dst_broot, 1); + memcpy(dptr, sptr, + numrecs * sizeof(struct xfs_refcount_key)); + } +} + +static const struct xfs_ifork_broot_ops xfs_rtrefcountbt_iroot_ops = { + .maxrecs = xfs_rtrefcountbt_maxrecs, + .size = xfs_rtrefcount_broot_space_calc, + .move = xfs_rtrefcountbt_broot_move, +}; + const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .rec_len = sizeof(struct xfs_refcount_rec), .key_len = sizeof(struct xfs_refcount_key), @@ -266,6 +363,7 @@ const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .free_block = xfs_btree_free_imeta_block, .get_minrecs = xfs_rtrefcountbt_get_minrecs, .get_maxrecs = xfs_rtrefcountbt_get_maxrecs, + .get_dmaxrecs = xfs_rtrefcountbt_get_dmaxrecs, .init_key_from_rec = xfs_rtrefcountbt_init_key_from_rec, .init_high_key_from_rec = xfs_rtrefcountbt_init_high_key_from_rec, .init_rec_from_cur = xfs_rtrefcountbt_init_rec_from_cur, @@ -276,6 +374,7 @@ const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .keys_inorder = xfs_rtrefcountbt_keys_inorder, .recs_inorder = xfs_rtrefcountbt_recs_inorder, .keys_contiguous = xfs_rtrefcountbt_keys_contiguous, + .iroot_ops = &xfs_rtrefcountbt_iroot_ops, }; /* Initialize a new rt refcount btree cursor. */ @@ -529,3 +628,140 @@ xfs_rtrefcountbt_calc_reserves( return xfs_rtrefcountbt_max_size(mp, xfs_rtb_to_rtxt(mp, mp->m_sb.sb_rgblocks)); } + +/* + * Convert on-disk form of btree root to in-memory form. 
+ */ +STATIC void +xfs_rtrefcountbt_from_disk( + struct xfs_inode *ip, + struct xfs_rtrefcount_root *dblock, + int dblocklen, + struct xfs_btree_block *rblock) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_refcount_key *fkp; + __be64 *fpp; + struct xfs_refcount_key *tkp; + __be64 *tpp; + struct xfs_refcount_rec *frp; + struct xfs_refcount_rec *trp; + unsigned int numrecs; + unsigned int maxrecs; + unsigned int rblocklen; + + rblocklen = xfs_rtrefcount_broot_space(mp, dblock); + + xfs_btree_init_block(mp, rblock, &xfs_rtrefcountbt_ops, 0, 0, + ip->i_ino); + + rblock->bb_level = dblock->bb_level; + rblock->bb_numrecs = dblock->bb_numrecs; + + if (be16_to_cpu(rblock->bb_level) > 0) { + maxrecs = xfs_rtrefcountbt_droot_maxrecs(dblocklen, false); + fkp = xfs_rtrefcount_droot_key_addr(dblock, 1); + tkp = xfs_rtrefcount_key_addr(rblock, 1); + fpp = xfs_rtrefcount_droot_ptr_addr(dblock, 1, maxrecs); + tpp = xfs_rtrefcount_broot_ptr_addr(mp, rblock, 1, rblocklen); + numrecs = be16_to_cpu(dblock->bb_numrecs); + memcpy(tkp, fkp, 2 * sizeof(*fkp) * numrecs); + memcpy(tpp, fpp, sizeof(*fpp) * numrecs); + } else { + frp = xfs_rtrefcount_droot_rec_addr(dblock, 1); + trp = xfs_rtrefcount_rec_addr(rblock, 1); + numrecs = be16_to_cpu(dblock->bb_numrecs); + memcpy(trp, frp, sizeof(*frp) * numrecs); + } +} + +/* Load a realtime reference count btree root in from disk. 
*/ +int +xfs_iformat_rtrefcount( + struct xfs_inode *ip, + struct xfs_dinode *dip) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + struct xfs_rtrefcount_root *dfp = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + unsigned int numrecs; + unsigned int level; + int dsize; + + dsize = XFS_DFORK_SIZE(dip, mp, XFS_DATA_FORK); + numrecs = be16_to_cpu(dfp->bb_numrecs); + level = be16_to_cpu(dfp->bb_level); + + if (level > mp->m_rtrefc_maxlevels || + xfs_rtrefcount_droot_space_calc(level, numrecs) > dsize) + return -EFSCORRUPTED; + + xfs_iroot_alloc(ip, XFS_DATA_FORK, + xfs_rtrefcount_broot_space_calc(mp, level, numrecs)); + xfs_rtrefcountbt_from_disk(ip, dfp, dsize, ifp->if_broot); + return 0; +} + +/* + * Convert in-memory form of btree root to on-disk form. + */ +void +xfs_rtrefcountbt_to_disk( + struct xfs_mount *mp, + struct xfs_btree_block *rblock, + int rblocklen, + struct xfs_rtrefcount_root *dblock, + int dblocklen) +{ + struct xfs_refcount_key *fkp; + __be64 *fpp; + struct xfs_refcount_key *tkp; + __be64 *tpp; + struct xfs_refcount_rec *frp; + struct xfs_refcount_rec *trp; + unsigned int maxrecs; + unsigned int numrecs; + + ASSERT(rblock->bb_magic == cpu_to_be32(XFS_RTREFC_CRC_MAGIC)); + ASSERT(uuid_equal(&rblock->bb_u.l.bb_uuid, &mp->m_sb.sb_meta_uuid)); + ASSERT(rblock->bb_u.l.bb_blkno == cpu_to_be64(XFS_BUF_DADDR_NULL)); + ASSERT(rblock->bb_u.l.bb_leftsib == cpu_to_be64(NULLFSBLOCK)); + ASSERT(rblock->bb_u.l.bb_rightsib == cpu_to_be64(NULLFSBLOCK)); + + dblock->bb_level = rblock->bb_level; + dblock->bb_numrecs = rblock->bb_numrecs; + + if (be16_to_cpu(rblock->bb_level) > 0) { + maxrecs = xfs_rtrefcountbt_droot_maxrecs(dblocklen, false); + fkp = xfs_rtrefcount_key_addr(rblock, 1); + tkp = xfs_rtrefcount_droot_key_addr(dblock, 1); + fpp = xfs_rtrefcount_broot_ptr_addr(mp, rblock, 1, rblocklen); + tpp = xfs_rtrefcount_droot_ptr_addr(dblock, 1, maxrecs); + numrecs = be16_to_cpu(rblock->bb_numrecs); + memcpy(tkp, fkp, 2 * 
sizeof(*fkp) * numrecs); + memcpy(tpp, fpp, sizeof(*fpp) * numrecs); + } else { + frp = xfs_rtrefcount_rec_addr(rblock, 1); + trp = xfs_rtrefcount_droot_rec_addr(dblock, 1); + numrecs = be16_to_cpu(rblock->bb_numrecs); + memcpy(trp, frp, sizeof(*frp) * numrecs); + } +} + +/* Flush a realtime reference count btree root out to disk. */ +void +xfs_iflush_rtrefcount( + struct xfs_inode *ip, + struct xfs_dinode *dip) +{ + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + struct xfs_rtrefcount_root *dfp = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + + ASSERT(ifp->if_broot != NULL); + ASSERT(ifp->if_broot_bytes > 0); + ASSERT(xfs_rtrefcount_droot_space(ifp->if_broot) <= + xfs_inode_fork_size(ip, XFS_DATA_FORK)); + xfs_rtrefcountbt_to_disk(ip->i_mount, ifp->if_broot, + ifp->if_broot_bytes, dfp, + XFS_DFORK_SIZE(dip, ip->i_mount, XFS_DATA_FORK)); +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h index ffda0b063bcf..d2fe2004568d 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.h +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -27,6 +27,7 @@ void xfs_rtrefcountbt_commit_staged_btree(struct xfs_btree_cur *cur, unsigned int xfs_rtrefcountbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen, bool leaf); void xfs_rtrefcountbt_compute_maxlevels(struct xfs_mount *mp); +unsigned int xfs_rtrefcountbt_droot_maxrecs(unsigned int blocklen, bool leaf); /* * Addresses of records, keys, and pointers within an incore rtrefcountbt block. @@ -74,4 +75,115 @@ int xfs_rtrefcountbt_create_path(struct xfs_mount *mp, xfs_rgnumber_t rgno, xfs_filblks_t xfs_rtrefcountbt_calc_reserves(struct xfs_mount *mp); +/* Addresses of key, pointers, and records within an ondisk rtrefcount block. 
*/ + +static inline struct xfs_refcount_rec * +xfs_rtrefcount_droot_rec_addr( + struct xfs_rtrefcount_root *block, + unsigned int index) +{ + return (struct xfs_refcount_rec *) + ((char *)(block + 1) + + (index - 1) * sizeof(struct xfs_refcount_rec)); +} + +static inline struct xfs_refcount_key * +xfs_rtrefcount_droot_key_addr( + struct xfs_rtrefcount_root *block, + unsigned int index) +{ + return (struct xfs_refcount_key *) + ((char *)(block + 1) + + (index - 1) * sizeof(struct xfs_refcount_key)); +} + +static inline xfs_rtrefcount_ptr_t * +xfs_rtrefcount_droot_ptr_addr( + struct xfs_rtrefcount_root *block, + unsigned int index, + unsigned int maxrecs) +{ + return (xfs_rtrefcount_ptr_t *) + ((char *)(block + 1) + + maxrecs * sizeof(struct xfs_refcount_key) + + (index - 1) * sizeof(xfs_rtrefcount_ptr_t)); +} + +/* + * Address of pointers within the incore btree root. + * + * These are to be used when we know the size of the block and + * we don't have a cursor. + */ +static inline xfs_rtrefcount_ptr_t * +xfs_rtrefcount_broot_ptr_addr( + struct xfs_mount *mp, + struct xfs_btree_block *bb, + unsigned int index, + unsigned int block_size) +{ + return xfs_rtrefcount_ptr_addr(bb, index, + xfs_rtrefcountbt_maxrecs(mp, block_size, false)); +} + +/* + * Compute the space required for the incore btree root containing the given + * number of records. + */ +static inline size_t +xfs_rtrefcount_broot_space_calc( + struct xfs_mount *mp, + unsigned int level, + unsigned int nrecs) +{ + size_t sz = XFS_RTREFCOUNT_BLOCK_LEN; + + if (level > 0) + return sz + nrecs * (sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); + return sz + nrecs * sizeof(struct xfs_refcount_rec); +} + +/* + * Compute the space required for the incore btree root given the ondisk + * btree root block. 
+ */ +static inline size_t +xfs_rtrefcount_broot_space(struct xfs_mount *mp, struct xfs_rtrefcount_root *bb) +{ + return xfs_rtrefcount_broot_space_calc(mp, be16_to_cpu(bb->bb_level), + be16_to_cpu(bb->bb_numrecs)); +} + +/* Compute the space required for the ondisk root block. */ +static inline size_t +xfs_rtrefcount_droot_space_calc( + unsigned int level, + unsigned int nrecs) +{ + size_t sz = sizeof(struct xfs_rtrefcount_root); + + if (level > 0) + return sz + nrecs * (sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); + return sz + nrecs * sizeof(struct xfs_refcount_rec); +} + +/* + * Compute the space required for the ondisk root block given an incore root + * block. + */ +static inline size_t +xfs_rtrefcount_droot_space(struct xfs_btree_block *bb) +{ + return xfs_rtrefcount_droot_space_calc(be16_to_cpu(bb->bb_level), + be16_to_cpu(bb->bb_numrecs)); +} + +int xfs_iformat_rtrefcount(struct xfs_inode *ip, struct xfs_dinode *dip); +void xfs_rtrefcountbt_to_disk(struct xfs_mount *mp, + struct xfs_btree_block *rblock, int rblocklen, + struct xfs_rtrefcount_root *dblock, int dblocklen); +void xfs_iflush_rtrefcount(struct xfs_inode *ip, struct xfs_dinode *dip); + #endif /* __XFS_RTREFCOUNT_BTREE_H__ */ diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c index feeba1dff01e..f13bf35793f1 100644 --- a/fs/xfs/xfs_inode_item_recover.c +++ b/fs/xfs/xfs_inode_item_recover.c @@ -23,6 +23,7 @@ #include "xfs_icache.h" #include "xfs_bmap_btree.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" STATIC void xlog_recover_inode_ra_pass2( @@ -284,6 +285,9 @@ xlog_recover_inode_dbroot( case XFS_DINODE_FMT_RMAP: xfs_rtrmapbt_to_disk(mp, src, len, dfork, dsize); break; + case XFS_DINODE_FMT_REFCOUNT: + xfs_rtrefcountbt_to_disk(mp, src, len, dfork, dsize); + break; default: ASSERT(0); return -EFSCORRUPTED; diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h index 94bbb6351d3d..7c14dd104191 100644 --- a/fs/xfs/xfs_ondisk.h 
+++ b/fs/xfs/xfs_ondisk.h @@ -80,6 +80,7 @@ xfs_check_ondisk_structs(void) XFS_CHECK_STRUCT_SIZE(xfs_rtrmap_ptr_t, 8); XFS_CHECK_STRUCT_SIZE(struct xfs_rtrmap_root, 4); XFS_CHECK_STRUCT_SIZE(xfs_rtrefcount_ptr_t, 8); + XFS_CHECK_STRUCT_SIZE(struct xfs_rtrefcount_root, 4); /* * m68k has problems with xfs_attr_leaf_name_remote_t, but we pad it to From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085522 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1B525C4332F for ; Sat, 31 Dec 2022 01:51:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236136AbiLaBvp (ORCPT ); Fri, 30 Dec 2022 20:51:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236099AbiLaBvo (ORCPT ); Fri, 30 Dec 2022 20:51:44 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9FD191DDD3 for ; Fri, 30 Dec 2022 17:51:42 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 5E0BAB81DFC for ; Sat, 31 Dec 2022 01:51:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 10046C433D2; Sat, 31 Dec 2022 01:51:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451500; bh=U4rDCvlqw8j5LqX4RctbnuqbHVf3uo5s2jb5vM/3mTo=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; 
b=J4PSdYVjNPWspHM/kTQCGYOoaPb3/1Osl66Lo1uvZOo7dkNUK7OM7/C/cyb0ouTsc YPBoiQGdAdWrnv9zFOkrtp39v+6JfRCwypK4IqF+98cFMOXhqNRxhzu7eD3lKmsE5I YNYNxpNprjkUfmIj8JFH1hwKXpRu9YniWJNbdva72U8UseaiaAOBEEs2jnCkW4AvjE 0l8VjHcLn6eS3Qd3mPcYhrSXJGoUI3Wi5IReMoWxFve12TYOdqhRp9RETaBRjpbD90 5TljgdlibJXAwCEA9bZwWh7rToi++LBhxV2QloJnc9k7r7kgiNxXhAFF5Ui4WBmCs7 sBqz8fLW1Vopg== Subject: [PATCH 14/42] xfs: wire up realtime refcount btree cursors From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871097.717073.7853127836397868533.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Wire up realtime refcount btree cursors wherever they're needed throughout the code base. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_refcount.c | 7 ++- fs/xfs/libxfs/xfs_rtgroup.c | 10 ++++ fs/xfs/libxfs/xfs_rtgroup.h | 5 ++ fs/xfs/xfs_fsmap.c | 22 ++++++--- fs/xfs/xfs_reflink.c | 99 ++++++++++++++++++++++++++++++++++-------- 5 files changed, 111 insertions(+), 32 deletions(-) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 999ba2c5c37d..c4ab749c78e4 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -26,6 +26,7 @@ #include "xfs_health.h" #include "xfs_rtgroup.h" #include "xfs_rtalloc.h" +#include "xfs_rtrefcount_btree.h" struct kmem_cache *xfs_refcount_intent_cache; @@ -1485,9 +1486,9 @@ xfs_refcount_finish_one( } if (rcur == NULL) { if (ri->ri_realtime) { - /* coming in a later patch */ - ASSERT(0); - return -EFSCORRUPTED; + xfs_rtgroup_lock(tp, ri->ri_rtg, XFS_RTGLOCK_REFCOUNT); + rcur = xfs_rtrefcountbt_init_cursor(mp, tp, ri->ri_rtg, + ri->ri_rtg->rtg_refcountip); } else { error = xfs_alloc_read_agf(ri->ri_pag, tp, 
XFS_ALLOC_FLAG_FREEING, &agbp); diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index bd878e65bc44..836b19e0406d 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -524,6 +524,13 @@ xfs_rtgroup_lock( if (tp) xfs_trans_ijoin(tp, rtg->rtg_rmapip, XFS_ILOCK_EXCL); } + + if ((rtglock_flags & XFS_RTGLOCK_REFCOUNT) && rtg->rtg_refcountip) { + xfs_ilock(rtg->rtg_refcountip, XFS_ILOCK_EXCL); + if (tp) + xfs_trans_ijoin(tp, rtg->rtg_refcountip, + XFS_ILOCK_EXCL); + } } /* Unlock metadata inodes associated with this rt group. */ @@ -536,6 +543,9 @@ xfs_rtgroup_unlock( ASSERT(!(rtglock_flags & XFS_RTGLOCK_BITMAP_SHARED) || !(rtglock_flags & XFS_RTGLOCK_BITMAP)); + if ((rtglock_flags & XFS_RTGLOCK_REFCOUNT) && rtg->rtg_refcountip) + xfs_iunlock(rtg->rtg_refcountip, XFS_ILOCK_EXCL); + if ((rtglock_flags & XFS_RTGLOCK_RMAP) && rtg->rtg_rmapip) xfs_iunlock(rtg->rtg_rmapip, XFS_ILOCK_EXCL); diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 0f400f133d88..4f0358d63457 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -237,10 +237,13 @@ int xfs_rtgroup_init_secondary_super(struct xfs_mount *mp, xfs_rgnumber_t rgno, #define XFS_RTGLOCK_BITMAP_SHARED (1U << 1) /* Lock the rt rmap inode in exclusive mode */ #define XFS_RTGLOCK_RMAP (1U << 2) +/* Lock the rt refcount inode in exclusive mode */ +#define XFS_RTGLOCK_REFCOUNT (1U << 3) #define XFS_RTGLOCK_ALL_FLAGS (XFS_RTGLOCK_BITMAP | \ XFS_RTGLOCK_BITMAP_SHARED | \ - XFS_RTGLOCK_RMAP) + XFS_RTGLOCK_RMAP | \ + XFS_RTGLOCK_REFCOUNT) void xfs_rtgroup_lock(struct xfs_trans *tp, struct xfs_rtgroup *rtg, unsigned int rtglock_flags); diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index efbcc4b1d850..5f7e7ea2fde3 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -27,6 +27,7 @@ #include "xfs_ag.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" /* Convert an xfs_fsmap to an fsmap. 
*/ static void @@ -209,14 +210,16 @@ xfs_getfsmap_is_shared( *stat = false; if (!xfs_has_reflink(mp)) return 0; - /* rt files will have no perag structure */ - if (!info->pag) - return 0; + + if (info->rtg) + cur = xfs_rtrefcountbt_init_cursor(mp, tp, info->rtg, + info->rtg->rtg_refcountip); + else + cur = xfs_refcountbt_init_cursor(mp, tp, info->agf_bp, + info->pag); /* Are there any shared blocks here? */ flen = 0; - cur = xfs_refcountbt_init_cursor(mp, tp, info->agf_bp, info->pag); - error = xfs_refcount_find_shared(cur, rec->rm_startblock, rec->rm_blockcount, &fbno, &flen, false); @@ -820,7 +823,8 @@ xfs_getfsmap_rtdev_rmapbt_query( return xfs_getfsmap_rtdev_helper(*curpp, &info->high, info); /* Query the rtrmapbt */ - xfs_rtgroup_lock(NULL, info->rtg, XFS_RTGLOCK_RMAP); + xfs_rtgroup_lock(NULL, info->rtg, XFS_RTGLOCK_RMAP | + XFS_RTGLOCK_REFCOUNT); *curpp = xfs_rtrmapbt_init_cursor(mp, tp, info->rtg, info->rtg->rtg_rmapip); return xfs_rmap_query_range(*curpp, &info->low, &info->high, @@ -893,7 +897,8 @@ xfs_getfsmap_rtdev_rmapbt( if (bt_cur) { xfs_rtgroup_unlock(bt_cur->bc_ino.rtg, - XFS_RTGLOCK_RMAP); + XFS_RTGLOCK_RMAP | + XFS_RTGLOCK_REFCOUNT); xfs_btree_del_cursor(bt_cur, XFS_BTREE_NOERROR); bt_cur = NULL; } @@ -934,7 +939,8 @@ xfs_getfsmap_rtdev_rmapbt( } if (bt_cur) { - xfs_rtgroup_unlock(bt_cur->bc_ino.rtg, XFS_RTGLOCK_RMAP); + xfs_rtgroup_unlock(bt_cur->bc_ino.rtg, XFS_RTGLOCK_RMAP | + XFS_RTGLOCK_REFCOUNT); xfs_btree_del_cursor(bt_cur, error < 0 ? 
XFS_BTREE_ERROR : XFS_BTREE_NOERROR); } diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 52e73aa2c38e..1a8a254c81f4 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -30,6 +30,9 @@ #include "xfs_ag.h" #include "xfs_ag_resv.h" #include "xfs_health.h" +#include "xfs_rtrefcount_btree.h" +#include "xfs_rtalloc.h" +#include "xfs_rtgroup.h" /* * Copy on Write of Shared Blocks @@ -155,6 +158,38 @@ xfs_reflink_find_shared( return error; } +/* + * Given an RT extent, find the lowest-numbered run of shared blocks + * within that range and return the range in fbno/flen. If + * find_end_of_shared is true, return the longest contiguous extent of + * shared blocks. If there are no shared extents, fbno and flen will + * be set to NULLRGBLOCK and 0, respectively. + */ +static int +xfs_reflink_find_rtshared( + struct xfs_rtgroup *rtg, + struct xfs_trans *tp, + xfs_agblock_t rtbno, + xfs_extlen_t rtlen, + xfs_agblock_t *fbno, + xfs_extlen_t *flen, + bool find_end_of_shared) +{ + struct xfs_mount *mp = rtg->rtg_mount; + struct xfs_btree_cur *cur; + int error; + + BUILD_BUG_ON(NULLRGBLOCK != NULLAGBLOCK); + + xfs_rtgroup_lock(NULL, rtg, XFS_RTGLOCK_REFCOUNT); + cur = xfs_rtrefcountbt_init_cursor(mp, tp, rtg, rtg->rtg_refcountip); + error = xfs_refcount_find_shared(cur, rtbno, rtlen, fbno, flen, + find_end_of_shared); + xfs_btree_del_cursor(cur, error); + xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_REFCOUNT); + return error; +} + /* * Trim the mapping to the next block where there's a change in the * shared/unshared status. 
More specifically, this means that we @@ -172,9 +207,7 @@ xfs_reflink_trim_around_shared( bool *shared) { struct xfs_mount *mp = ip->i_mount; - struct xfs_perag *pag; - xfs_agblock_t agbno; - xfs_extlen_t aglen; + xfs_agblock_t orig_bno; xfs_agblock_t fbno; xfs_extlen_t flen; int error = 0; @@ -187,13 +220,25 @@ xfs_reflink_trim_around_shared( trace_xfs_reflink_trim_around_shared(ip, irec); - pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, irec->br_startblock)); - agbno = XFS_FSB_TO_AGBNO(mp, irec->br_startblock); - aglen = irec->br_blockcount; + if (XFS_IS_REALTIME_INODE(ip)) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; - error = xfs_reflink_find_shared(pag, NULL, agbno, aglen, &fbno, &flen, - true); - xfs_perag_put(pag); + orig_bno = xfs_rtb_to_rgbno(mp, irec->br_startblock, &rgno); + rtg = xfs_rtgroup_get(mp, rgno); + error = xfs_reflink_find_rtshared(rtg, NULL, orig_bno, + irec->br_blockcount, &fbno, &flen, true); + xfs_rtgroup_put(rtg); + } else { + struct xfs_perag *pag; + + pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, + irec->br_startblock)); + orig_bno = XFS_FSB_TO_AGBNO(mp, irec->br_startblock); + error = xfs_reflink_find_shared(pag, NULL, orig_bno, + irec->br_blockcount, &fbno, &flen, true); + xfs_perag_put(pag); + } if (error) return error; @@ -203,7 +248,7 @@ xfs_reflink_trim_around_shared( return 0; } - if (fbno == agbno) { + if (fbno == orig_bno) { /* * The start of this extent is shared. Truncate the * mapping at the end of the shared region so that a @@ -221,7 +266,7 @@ xfs_reflink_trim_around_shared( * extent so that a subsequent iteration starts at the * start of the shared region. 
*/ - irec->br_blockcount = fbno - agbno; + irec->br_blockcount = fbno - orig_bno; return 0; } @@ -1574,9 +1619,6 @@ xfs_reflink_inode_has_shared_extents( *has_shared = false; found = xfs_iext_lookup_extent(ip, ifp, 0, &icur, &got); while (found) { - struct xfs_perag *pag; - xfs_agblock_t agbno; - xfs_extlen_t aglen; xfs_agblock_t rbno; xfs_extlen_t rlen; @@ -1584,12 +1626,29 @@ xfs_reflink_inode_has_shared_extents( got.br_state != XFS_EXT_NORM) goto next; - pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, got.br_startblock)); - agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock); - aglen = got.br_blockcount; - error = xfs_reflink_find_shared(pag, tp, agbno, aglen, - &rbno, &rlen, false); - xfs_perag_put(pag); + if (XFS_IS_REALTIME_INODE(ip)) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + xfs_rgblock_t rgbno; + + rgbno = xfs_rtb_to_rgbno(mp, got.br_startblock, &rgno); + rtg = xfs_rtgroup_get(mp, rgno); + error = xfs_reflink_find_rtshared(rtg, tp, rgbno, + got.br_blockcount, &rbno, &rlen, + false); + xfs_rtgroup_put(rtg); + } else { + struct xfs_perag *pag; + xfs_agblock_t agbno; + + pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, + got.br_startblock)); + agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock); + error = xfs_reflink_find_shared(pag, tp, agbno, + got.br_blockcount, &rbno, &rlen, + false); + xfs_perag_put(pag); + } if (error) return error; From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085523 Subject: [PATCH 15/42] xfs: create routine to allocate and initialize a realtime refcount btree inode From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871111.717073.3368219455039072884.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Create a library routine to allocate and initialize an empty realtime refcountbt inode. We'll use this for growfs, mkfs, and repair. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rtrefcount_btree.c | 41 ++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 6 +++++ 2 files changed, 47 insertions(+) diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index a43ee6d7b547..0a6fa9851371 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -765,3 +765,44 @@ xfs_iflush_rtrefcount( ifp->if_broot_bytes, dfp, XFS_DFORK_SIZE(dip, ip->i_mount, XFS_DATA_FORK)); } + +/* + * Create a realtime refcount btree inode. + * + * Regardless of the return value, the caller must clean up @ic. If a new + * inode is returned through *ipp, the caller must finish setting up the incore + * inode and release it. + */ +int +xfs_rtrefcountbt_create( + struct xfs_trans **tpp, + struct xfs_imeta_path *path, + struct xfs_imeta_update *upd, + struct xfs_inode **ipp) +{ + struct xfs_mount *mp = (*tpp)->t_mountp; + struct xfs_ifork *ifp; + struct xfs_inode *ip; + int error; + + *ipp = NULL; + + error = xfs_imeta_create(tpp, path, S_IFREG, 0, &ip, upd); + if (error) + return error; + + ifp = &ip->i_df; + ifp->if_format = XFS_DINODE_FMT_REFCOUNT; + ASSERT(ifp->if_broot_bytes == 0); + ASSERT(ifp->if_bytes == 0); + + /* Initialize the empty incore btree root. 
*/ + xfs_iroot_alloc(ip, XFS_DATA_FORK, + xfs_rtrefcount_broot_space_calc(mp, 0, 0)); + xfs_btree_init_block(ip->i_mount, ifp->if_broot, &xfs_rtrefcountbt_ops, + 0, 0, ip->i_ino); + xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE | XFS_ILOG_DBROOT); + + *ipp = ip; + return 0; +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h index d2fe2004568d..86a547529c9d 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.h +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -186,4 +186,10 @@ void xfs_rtrefcountbt_to_disk(struct xfs_mount *mp, struct xfs_rtrefcount_root *dblock, int dblocklen); void xfs_iflush_rtrefcount(struct xfs_inode *ip, struct xfs_dinode *dip); +struct xfs_imeta_update; + +int xfs_rtrefcountbt_create(struct xfs_trans **tpp, + struct xfs_imeta_path *path, struct xfs_imeta_update *ic, + struct xfs_inode **ipp); + #endif /* __XFS_RTREFCOUNT_BTREE_H__ */ From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085524 Subject: [PATCH 16/42] xfs: update rmap to allow cow staging extents in the rt rmap From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871125.717073.9802618689897794582.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Don't error out on CoW staging extent records when realtime reflink is enabled. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rmap.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index a533588a9b5b..891af03afccc 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -274,6 +274,7 @@ xfs_rmap_check_rtgroup_irec( bool is_unwritten; bool is_bmbt; bool is_attr; + bool is_cow; if (irec->rm_blockcount == 0) return __this_address; @@ -285,6 +286,12 @@ xfs_rmap_check_rtgroup_irec( return __this_address; if (irec->rm_offset != 0) return __this_address; + } else if (irec->rm_owner == XFS_RMAP_OWN_COW) { + if (!xfs_has_rtreflink(mp)) + return __this_address; + if (!xfs_verify_rgbext(rtg, irec->rm_startblock, + irec->rm_blockcount)) + return __this_address; } else { if (!xfs_verify_rgbext(rtg, irec->rm_startblock, irec->rm_blockcount)) @@ -301,8 +308,10 @@ xfs_rmap_check_rtgroup_irec( is_bmbt = irec->rm_flags & XFS_RMAP_BMBT_BLOCK; is_attr = irec->rm_flags & XFS_RMAP_ATTR_FORK; is_unwritten = irec->rm_flags & XFS_RMAP_UNWRITTEN; + is_cow = xfs_has_rtreflink(mp) && + irec->rm_owner == XFS_RMAP_OWN_COW; - if (!is_inode && irec->rm_owner != XFS_RMAP_OWN_FS) + if (!is_inode && !is_cow && irec->rm_owner != XFS_RMAP_OWN_FS) return __this_address; if (!is_inode && irec->rm_offset != 0) @@ -314,6 +323,9 @@ xfs_rmap_check_rtgroup_irec( if (is_unwritten && !is_inode) return __this_address; + if (is_unwritten && is_cow) + return __this_address; + /* 
Check for a valid fork offset, if applicable. */ if (is_inode && !xfs_verify_fileext(mp, irec->rm_offset, irec->rm_blockcount)) From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B124C4332F for ; Sat, 31 Dec 2022 01:52:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236099AbiLaBw3 (ORCPT ); Fri, 30 Dec 2022 20:52:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49976 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231444AbiLaBw2 (ORCPT ); Fri, 30 Dec 2022 20:52:28 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D142D1DDD3 for ; Fri, 30 Dec 2022 17:52:27 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6D22361CE1 for ; Sat, 31 Dec 2022 01:52:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CAEEAC433D2; Sat, 31 Dec 2022 01:52:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451546; bh=oZYAXYKPcDokCJoxBBeSm/PtbG1kJkfKJUs6po1xRjQ=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=eP8GFlCuMaeiITyRLgKSxTVnzprYaZK+AmHrP0jRPK7ptOhdCCjnBkM+DdWcrtX8I gxgYngCZR5SKgmJTa1mwXIufQi4JCfvtnaNQ5imaCHmrQHKKtcKMsYMKb/pv5LmnaU 71nL3cAZHJA9rciQKSQ9uQ4XjqSyfBFnDcs5R9oX9NPAoLpOjc0HeV+Uz74QU84T8p vn4COV0Zsp6cbZZwGVWOUTlzicWuj/J3pV35Hz718Hz5jucTEYXsyQmMtZJx3sp9UJ 
USwP+cnbcJnmYkWxoDqvmSEg5JpXoNyno6HF5C/OPhvSHq9w7jBq2Z7wlE0n+ebUXv In8+3kA+u4npw== Subject: [PATCH 17/42] xfs: compute rtrmap btree max levels when reflink enabled From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871139.717073.10822502195576681130.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Compute the maximum possible height of the realtime rmap btree when reflink is enabled. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rtrmap_btree.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 878bfeed411f..35ae3171a0cc 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -737,6 +737,7 @@ xfs_rtrmapbt_maxrecs( unsigned int xfs_rtrmapbt_maxlevels_ondisk(void) { + unsigned long long max_dblocks; unsigned int minrecs[2]; unsigned int blocklen; @@ -745,8 +746,20 @@ xfs_rtrmapbt_maxlevels_ondisk(void) minrecs[0] = xfs_rtrmapbt_block_maxrecs(blocklen, true) / 2; minrecs[1] = xfs_rtrmapbt_block_maxrecs(blocklen, false) / 2; - /* We need at most one record for every block in an rt group. */ - return xfs_btree_compute_maxlevels(minrecs, XFS_MAX_RGBLOCKS); + /* + * Compute the asymptotic maxlevels for an rtrmapbt on any rtreflink fs. + * + * On a reflink filesystem, each block in an rtgroup can have up to + * 2^32 (per the refcount record format) owners, which means that + * theoretically we could face up to 2^64 rmap records. 
However, we're + * likely to run out of blocks in the data device long before that + * happens, which means that we must compute the max height based on + * what the btree will look like if it consumes almost all the blocks + * in the data device due to maximal sharing factor. + */ + max_dblocks = -1U; /* max ag count */ + max_dblocks *= XFS_MAX_CRC_AG_BLOCKS; + return xfs_btree_space_to_height(minrecs, max_dblocks); } int __init @@ -785,9 +798,20 @@ xfs_rtrmapbt_compute_maxlevels( * maximum height is constrained by the size of the data device and * the height required to store one rmap record for each block in an * rt group. + * + * On a reflink filesystem, each rt block can have up to 2^32 (per the + * refcount record format) owners, which means that theoretically we + * could face up to 2^64 rmap records. This makes the computation of + * maxlevels based on record count meaningless, so we only consider the + * size of the data device. */ d_maxlevels = xfs_btree_space_to_height(mp->m_rtrmap_mnr, mp->m_sb.sb_dblocks); + if (xfs_has_rtreflink(mp)) { + mp->m_rtrmap_maxlevels = d_maxlevels + 1; + return; + } + r_maxlevels = xfs_btree_compute_maxlevels(mp->m_rtrmap_mnr, mp->m_sb.sb_rgblocks); From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085526 Subject: [PATCH 18/42] xfs: refactor reflink quota updates From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871153.717073.17408045000888577090.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Hoist all quota updates for reflink into a helper function, since things are about to become more complicated. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_reflink.c | 37 ++++++++++++++++++++++++++++++++----- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 1a8a254c81f4..455adcce994d 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -750,6 +750,35 @@ xfs_reflink_cancel_cow_range( return error; } +#ifdef CONFIG_XFS_QUOTA +/* + * Update quota accounting for a remapping operation. When we're remapping + * something from the CoW fork to the data fork, we must update the quota + * accounting for delayed allocations. For remapping from the data fork to the + * data fork, use regular block accounting. + */ +static inline void +xfs_reflink_update_quota( + struct xfs_trans *tp, + struct xfs_inode *ip, + bool is_cow, + int64_t blocks) +{ + unsigned int qflag; + + if (XFS_IS_REALTIME_INODE(ip)) { + qflag = is_cow ? XFS_TRANS_DQ_DELRTBCOUNT : + XFS_TRANS_DQ_RTBCOUNT; + } else { + qflag = is_cow ? XFS_TRANS_DQ_DELBCOUNT : + XFS_TRANS_DQ_BCOUNT; + } + xfs_trans_mod_dquot_byino(tp, ip, qflag, blocks); +} +#else +# define xfs_reflink_update_quota(tp, ip, is_cow, blocks) ((void)0) +#endif + /* * Remap part of the CoW fork into the data fork. 
* @@ -852,8 +881,7 @@ xfs_reflink_end_cow_extent( */ xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &data); xfs_refcount_decrease_extent(tp, isrt, &data); - xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, - -data.br_blockcount); + xfs_reflink_update_quota(tp, ip, false, -data.br_blockcount); } else if (data.br_startblock == DELAYSTARTBLOCK) { int done; @@ -878,8 +906,7 @@ xfs_reflink_end_cow_extent( xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &del); /* Charge this new data fork mapping to the on-disk quota. */ - xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT, - (long)del.br_blockcount); + xfs_reflink_update_quota(tp, ip, true, del.br_blockcount); /* Remove the mapping from the CoW fork. */ xfs_bmap_del_extent_cow(ip, &icur, &got, &del); @@ -1369,7 +1396,7 @@ xfs_reflink_remap_extent( qdelta += dmap->br_blockcount; } - xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, qdelta); + xfs_reflink_update_quota(tp, ip, false, qdelta); /* Update dest isize if needed. */ newlen = XFS_FSB_TO_B(mp, dmap->br_startoff + dmap->br_blockcount); From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085527 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D7DCC4332F for ; Sat, 31 Dec 2022 01:53:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231444AbiLaBxA (ORCPT ); Fri, 30 Dec 2022 20:53:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50010 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231231AbiLaBw7 (ORCPT ); Fri, 30 Dec 2022 20:52:59 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 089F51DDD3 for ; Fri, 30 Dec 2022 17:52:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8C19261CCE for ; Sat, 31 Dec 2022 01:52:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EA12EC433EF; Sat, 31 Dec 2022 01:52:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451578; bh=Bd7p0pl+varYl9Kmk+tqpMiZ1+ztzbVdTLv8zWJUhwg=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=DhM0gWp1RdqnDtSYtUhVkp9DbbPkbpUSW7RGY2aalGnbUbDVMSjGb3heiOqKLuqx0 tVHrN5HfY3QQ4W2nqLpUTI6rP1rilp9pw6CQf5NWsAZwNj5wxqS2u5qs5OJY3hBkYw dYN2loKUEzDNVh03Y4ZhdH4uj7Kvthmu3YRhRZl6l05SzhXCAa83x4Ss0DRaOycRYN CSU/9rdrJGdWK0RPWO92EwETXbfGVKJAu35Kc924JfFO3TjAuPQder6PFMbVGOfdnN g3uiTHBv8LrvFX1Yolbu9RgqM4jvcNRE3mqAXtKvQLPaN5P4SW/hRkB4gxquVREl0g Z5NBX1l34WGZw== Subject: [PATCH 19/42] xfs: enable CoW for realtime data From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871166.717073.11067416256842216428.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Update our write paths to support copy on write on the rt volume. This works in more or less the same way as it does on the data device, with the major exception that we never do delalloc on the rt volume. Because we consider unwritten CoW fork staging extents to be incore quota reservation, we update xfs_quota_reserve_blkres to support this case. Though xfs doesn't allow rt and quota together, the change is trivial and we shouldn't leave a logic bomb here. While we're at it, add a missing xfs_mod_delalloc call when we remove delalloc block reservation from the inode. This is largely irrelevant since realtime files do not use delalloc, but we want to avoid leaving logic bombs. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_bmap_util.c | 61 ++++++++++++++++++++++++++++++++++++++-------- fs/xfs/xfs_quota.h | 6 +---- fs/xfs/xfs_reflink.c | 36 +++++++++++++++++++++------ fs/xfs/xfs_trans_dquot.c | 11 ++++++++ 4 files changed, 90 insertions(+), 24 deletions(-) diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 447c057c9331..842f472292cd 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -71,6 +71,55 @@ xfs_zero_extent( } #ifdef CONFIG_XFS_RT + +/* Update all inode and quota accounting for the allocation we just did. */ +static void +xfs_bmap_rtalloc_accounting( + struct xfs_bmalloca *ap) +{ + if (ap->flags & XFS_BMAPI_COWFORK) { + /* + * COW fork blocks are in-core only and thus are treated as + * in-core quota reservation (like delalloc blocks) even when + * converted to real blocks.
The quota reservation is not + * accounted to disk until blocks are remapped to the data + * fork. So if these blocks were previously delalloc, we + * already have quota reservation and there's nothing to do + * yet. + */ + if (ap->wasdel) { + xfs_mod_delalloc(ap->ip->i_mount, -(int64_t)ap->length); + return; + } + + /* + * Otherwise, we've allocated blocks in a hole. The transaction + * has acquired in-core quota reservation for this extent. + * Rather than account these as real blocks, however, we reduce + * the transaction quota reservation based on the allocation. + * This essentially transfers the transaction quota reservation + * to that of a delalloc extent. + */ + ap->ip->i_delayed_blks += ap->length; + xfs_trans_mod_dquot_byino(ap->tp, ap->ip, + XFS_TRANS_DQ_RES_RTBLKS, -(long)ap->length); + return; + } + + /* data fork only */ + ap->ip->i_nblocks += ap->length; + xfs_trans_log_inode(ap->tp, ap->ip, XFS_ILOG_CORE); + if (ap->wasdel) { + ap->ip->i_delayed_blks -= ap->length; + xfs_mod_delalloc(ap->ip->i_mount, -(int64_t)ap->length); + } + + /* Adjust the disk quota also. This was reserved earlier. */ + xfs_trans_mod_dquot_byino(ap->tp, ap->ip, + ap->wasdel ? XFS_TRANS_DQ_DELRTBCOUNT : + XFS_TRANS_DQ_RTBCOUNT, ap->length); +} + int xfs_bmap_rtalloc( struct xfs_bmalloca *ap) @@ -166,17 +215,7 @@ xfs_bmap_rtalloc( if (rtx != NULLRTEXTNO) { ap->blkno = xfs_rtx_to_rtb(mp, rtx); ap->length = xfs_rtxlen_to_extlen(mp, ralen); - ap->ip->i_nblocks += ap->length; - xfs_trans_log_inode(ap->tp, ap->ip, XFS_ILOG_CORE); - if (ap->wasdel) - ap->ip->i_delayed_blks -= ap->length; - /* - * Adjust the disk quota also. This was reserved - * earlier. - */ - xfs_trans_mod_dquot_byino(ap->tp, ap->ip, - ap->wasdel ? 
XFS_TRANS_DQ_DELRTBCOUNT : - XFS_TRANS_DQ_RTBCOUNT, ap->length); + xfs_bmap_rtalloc_accounting(ap); return 0; } diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h index 0cb52d5be4aa..fa34d997b747 100644 --- a/fs/xfs/xfs_quota.h +++ b/fs/xfs/xfs_quota.h @@ -124,11 +124,7 @@ int xfs_qm_mount_quotas(struct xfs_mount *mp); extern void xfs_qm_unmount(struct xfs_mount *); extern void xfs_qm_unmount_quotas(struct xfs_mount *); -static inline int -xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks) -{ - return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false); -} +int xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks); bool xfs_inode_near_dquot_enforcement(struct xfs_inode *ip, xfs_dqtype_t type); # ifdef CONFIG_XFS_LIVE_HOOKS diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 455adcce994d..3b5d144bef41 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -434,20 +434,26 @@ xfs_reflink_fill_cow_hole( struct xfs_mount *mp = ip->i_mount; struct xfs_trans *tp; xfs_filblks_t resaligned; - xfs_extlen_t resblks; + unsigned int dblocks = 0, rblocks = 0; int nimaps; int error; bool found; resaligned = xfs_aligned_fsb_count(imap->br_startoff, imap->br_blockcount, xfs_get_cowextsz_hint(ip)); - resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); + if (XFS_IS_REALTIME_INODE(ip)) { + dblocks = XFS_DIOSTRAT_SPACE_RES(mp, 0); + rblocks = resaligned; + } else { + dblocks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); + rblocks = 0; + } xfs_iunlock(ip, *lockmode); *lockmode = 0; - error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, resblks, 0, - false, &tp); + error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, dblocks, + rblocks, false, &tp); if (error) return error; @@ -1232,7 +1238,7 @@ xfs_reflink_remap_extent( struct xfs_trans *tp; xfs_off_t newlen; int64_t qdelta = 0; - unsigned int resblks; + unsigned int dblocks, rblocks, resblks; bool quota_reserved = true; bool smap_real; bool dmap_written = 
xfs_bmap_is_written_extent(dmap); @@ -1263,8 +1269,15 @@ xfs_reflink_remap_extent( * we're remapping. */ resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); + if (XFS_IS_REALTIME_INODE(ip)) { + dblocks = resblks; + rblocks = dmap->br_blockcount; + } else { + dblocks = resblks + dmap->br_blockcount; + rblocks = 0; + } error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, - resblks + dmap->br_blockcount, 0, false, &tp); + dblocks, rblocks, false, &tp); if (error == -EDQUOT || error == -ENOSPC) { quota_reserved = false; error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, @@ -1344,8 +1357,15 @@ xfs_reflink_remap_extent( * done. */ if (!quota_reserved && !smap_real && dmap_written) { - error = xfs_trans_reserve_quota_nblks(tp, ip, - dmap->br_blockcount, 0, false); + if (XFS_IS_REALTIME_INODE(ip)) { + dblocks = 0; + rblocks = dmap->br_blockcount; + } else { + dblocks = dmap->br_blockcount; + rblocks = 0; + } + error = xfs_trans_reserve_quota_nblks(tp, ip, dblocks, rblocks, + false); if (error) goto out_cancel; } diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c index f5e9d76fb9a2..31ab1c5d6b13 100644 --- a/fs/xfs/xfs_trans_dquot.c +++ b/fs/xfs/xfs_trans_dquot.c @@ -1009,3 +1009,14 @@ xfs_trans_free_dqinfo( kmem_cache_free(xfs_dqtrx_cache, tp->t_dqinfo); tp->t_dqinfo = NULL; } + +int +xfs_quota_reserve_blkres( + struct xfs_inode *ip, + int64_t blocks) +{ + if (XFS_IS_REALTIME_INODE(ip)) + return xfs_trans_reserve_quota_nblks(NULL, ip, 0, blocks, + false); + return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false); +} From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085528 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01EFDC4332F for ; Sat, 31 Dec 2022 01:53:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235981AbiLaBxQ (ORCPT ); Fri, 30 Dec 2022 20:53:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50040 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231231AbiLaBxP (ORCPT ); Fri, 30 Dec 2022 20:53:15 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 866331DDD6 for ; Fri, 30 Dec 2022 17:53:14 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2348D61CD3 for ; Sat, 31 Dec 2022 01:53:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 851ECC433D2; Sat, 31 Dec 2022 01:53:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451593; bh=/DHm+0caYfymiolz4YDzY3knvfM1O7xCGUJ+4KAnYlI=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=SuX868hdM4FjvR2QWSrupd2e8uSTA8njrS2snG8UD7S8xRgh9MfmIWf+qbQUzgglZ TBrDn7Esg3wz502nPj+Lcx+V+eIH8Oq5q0Lqcm/kQhyTmCFLGDCuZOvrNGOagz4H1Q 89BKGGh4iGmeGanPBKz9kDH4uBXgF+J9+0VuKKckNvnDQ+hnxBOVc5oNfTuJtQ3h3s MOc5+ThVoWYtF+XqCjIlvFna6Uzfzj3Uga0DrbV7xfa/PznQJw2KNv881w8JEyHXtt ujhKGI5dz3X1PuqpXJpk/SmdHm7JOROcJuBjBw1KB+BEFzjqr8YwpnISJx+qyjULd5 QcmXH58HQ4Emg== Subject: [PATCH 20/42] xfs: enable sharing of realtime file blocks From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871180.717073.13325087628739180105.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Update the remapping routines to be able to handle realtime files. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_reflink.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 3b5d144bef41..3cead39e4308 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -33,6 +33,7 @@ #include "xfs_rtrefcount_btree.h" #include "xfs_rtalloc.h" #include "xfs_rtgroup.h" +#include "xfs_imeta.h" /* * Copy on Write of Shared Blocks @@ -1207,14 +1208,29 @@ xfs_reflink_update_dest( static int xfs_reflink_ag_has_free_space( struct xfs_mount *mp, - xfs_agnumber_t agno) + struct xfs_inode *ip, + xfs_fsblock_t fsb) { struct xfs_perag *pag; + xfs_agnumber_t agno; int error = 0; if (!xfs_has_rmapbt(mp)) return 0; + if (XFS_IS_REALTIME_INODE(ip)) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + rgno = xfs_rtb_to_rgno(mp, fsb); + rtg = xfs_rtgroup_get(mp, rgno); + if (xfs_imeta_resv_critical(rtg->rtg_rmapip) || + xfs_imeta_resv_critical(rtg->rtg_refcountip)) + error = -ENOSPC; + xfs_rtgroup_put(rtg); + return error; + } + + agno = XFS_FSB_TO_AGNO(mp, fsb); pag = xfs_perag_get(mp, agno); if (xfs_ag_resv_critical(pag, XFS_AG_RESV_RMAPBT) || xfs_ag_resv_critical(pag, XFS_AG_RESV_METADATA)) @@ -1328,8 +1344,8 @@ xfs_reflink_remap_extent( /* No reflinking if the AG of the dest mapping is low on space. 
*/ if (dmap_written) { - error = xfs_reflink_ag_has_free_space(mp, - XFS_FSB_TO_AGNO(mp, dmap->br_startblock)); + error = xfs_reflink_ag_has_free_space(mp, ip, + dmap->br_startblock); if (error) goto out_cancel; } @@ -1589,8 +1605,8 @@ xfs_reflink_remap_prep( /* Check file eligibility and prepare for block sharing. */ ret = -EINVAL; - /* Don't reflink realtime inodes */ - if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) + /* Can't reflink between data and rt volumes */ + if (XFS_IS_REALTIME_INODE(src) != XFS_IS_REALTIME_INODE(dest)) goto out_unlock; /* Don't share DAX file data with non-DAX file. */ From patchwork Fri Dec 30 22:18:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085529 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1ABADC4332F for ; Sat, 31 Dec 2022 01:53:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236042AbiLaBxe (ORCPT ); Fri, 30 Dec 2022 20:53:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50060 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231231AbiLaBxc (ORCPT ); Fri, 30 Dec 2022 20:53:32 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C08DA1DDD3 for ; Fri, 30 Dec 2022 17:53:31 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4CB42B81DFC for ; Sat, 31 Dec 2022 01:53:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 16660C433D2; Sat, 
31 Dec 2022 01:53:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451609; bh=G7M0GPo2QMSQtwDDmJ22pseEXMJxV4Gxuw4AtltWmno=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=cCajrsjpbg4Hrrr75DnKii6PRDilqvLBlf1DEln47ffEcVQ1eJxfbb+IJ7Spike3g CKPckWrmHRCwYu5WgD7aUiM027V8W3nb8C1OPCvKImQMvwIEQfv9FseGH0HCFFMmQu 0tm051tnB62XLqcqIEazVKBwhhkq1bjJYLZduGAlpHhyVy7iQz2+9yFo0DFySZROaP /rlKCw9Wtn2s3/wNQY2tlMrRATKPwa8pl5POCt9gInvnP8El7w+55p92IQi+vaP4gH ldn+N4pbDyuZsAvc4ikKWCrcHlKB7BVb70TX1gyDbPfh6PfXr6VUYYvUW80ztoxBDj 82sqMEDownM4w== Subject: [PATCH 21/42] xfs: allow inodes to have the realtime and reflink flags From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:31 -0800 Message-ID: <167243871194.717073.18192234817486773032.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Now that we can share blocks between realtime files, allow this combination. Signed-off-by: Darrick J. 
Wong --- fs/xfs/libxfs/xfs_inode_buf.c | 3 ++- fs/xfs/scrub/inode.c | 5 +++-- fs/xfs/scrub/inode_repair.c | 6 ------ fs/xfs/xfs_ioctl.c | 4 ---- 4 files changed, 5 insertions(+), 13 deletions(-) diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index dcf816f2643b..0db719f80bf2 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -675,7 +675,8 @@ xfs_dinode_verify( return __this_address; /* don't let reflink and realtime mix */ - if ((flags2 & XFS_DIFLAG2_REFLINK) && (flags & XFS_DIFLAG_REALTIME)) + if ((flags2 & XFS_DIFLAG2_REFLINK) && (flags & XFS_DIFLAG_REALTIME) && + !xfs_has_rtreflink(mp)) return __this_address; /* COW extent size hint validation */ diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index f2c60c3515e7..3b19976b6066 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -329,8 +329,9 @@ xchk_inode_flags2( if ((flags2 & XFS_DIFLAG2_REFLINK) && !S_ISREG(mode)) goto bad; - /* realtime and reflink make no sense, currently */ - if ((flags & XFS_DIFLAG_REALTIME) && (flags2 & XFS_DIFLAG2_REFLINK)) + /* realtime and reflink don't always go together */ + if ((flags & XFS_DIFLAG_REALTIME) && (flags2 & XFS_DIFLAG2_REFLINK) && + !xfs_has_rtreflink(mp)) goto bad; /* no bigtime iflag without the bigtime feature */ diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index 8566282827f8..9f946406cfa0 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -391,8 +391,6 @@ xrep_dinode_flags( flags2 |= XFS_DIFLAG2_REFLINK; else flags2 &= ~(XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE); - if (flags & XFS_DIFLAG_REALTIME) - flags2 &= ~XFS_DIFLAG2_REFLINK; if (flags2 & XFS_DIFLAG2_REFLINK) flags2 &= ~XFS_DIFLAG2_DAX; if (!xfs_has_bigtime(mp)) @@ -1480,10 +1478,6 @@ xrep_inode_flags( if (!(S_ISREG(mode) || S_ISDIR(mode))) sc->ip->i_diflags2 &= ~XFS_DIFLAG2_DAX; - /* No reflink files on the realtime device. 
*/ - if (sc->ip->i_diflags & XFS_DIFLAG_REALTIME) - sc->ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK; - /* No mixing reflink and DAX yet. */ if (sc->ip->i_diflags2 & XFS_DIFLAG2_REFLINK) sc->ip->i_diflags2 &= ~XFS_DIFLAG2_DAX; diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index fbe9bc50fc20..939cc6d862da 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1115,10 +1115,6 @@ xfs_ioctl_setattr_xflags( return -EINVAL; } - /* Clear reflink if we are actually able to set the rt flag. */ - if ((fa->fsx_xflags & FS_XFLAG_REALTIME) && xfs_is_reflink_inode(ip)) - ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK; - /* diflags2 only valid for v3 inodes. */ i_flags2 = xfs_flags2diflags2(ip, fa->fsx_xflags); if (i_flags2 && !xfs_has_v3inodes(mp)) From patchwork Fri Dec 30 22:18:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085530 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BCCB4C4332F for ; Sat, 31 Dec 2022 01:53:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236004AbiLaBxs (ORCPT ); Fri, 30 Dec 2022 20:53:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231231AbiLaBxs (ORCPT ); Fri, 30 Dec 2022 20:53:48 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1C6F41DDD3 for ; Fri, 30 Dec 2022 17:53:47 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 
D16FCB81E07 for ; Sat, 31 Dec 2022 01:53:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 955CEC433EF; Sat, 31 Dec 2022 01:53:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451624; bh=T66MDmflSIvYa/B7ALYYEzgRG4aKnzn/x+yTSG43sMg=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=Cn3UZOmgOReNLircsE8pmROL6heSUhqnDZbvILLBOdpI5S8Do8S5VY4H4bxL3oKQP WqlBJwyu+CnK3/GqjVu/3rvQhyUI0QZ6d1qiv76aBQyFumqR3+Hu1ZOqVERUKH0xok aY/kY1KFsUGDz1eRYelwMAM23Vklicz4VGZMvjz0eW9CxN3ru19I89rsf1rx7yuMaX U+m2zHbruo84MpJHQi0M6XnMEkciMQOqEycQLcUNuoVSi9xDZXXLGGE7/aSAm2KBOc HOMqNKoe6Rx4lLVXjCwBBrSgb5jfDxCsH/HpULYNexPyVBXRLMGDXe/pmky4NzW7gp tRuJYEct1GvwA== Subject: [PATCH 22/42] xfs: recover CoW leftovers in the realtime volume From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:32 -0800 Message-ID: <167243871209.717073.14280502191537429879.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Scan the realtime refcount tree at mount time to get rid of leftover CoW staging extents. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_refcount.c | 63 +++++++++++++++++++++++++++++++++--------- fs/xfs/libxfs/xfs_refcount.h | 5 +++ fs/xfs/xfs_reflink.c | 14 ++++++++- 3 files changed, 65 insertions(+), 17 deletions(-) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index c4ab749c78e4..8b878a7a5a3e 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -2037,14 +2037,15 @@ xfs_refcount_recover_extent( } /* Find and remove leftover CoW reservations.
*/ -int -xfs_refcount_recover_cow_leftovers( +static int +xfs_refcount_recover_group_cow_leftovers( struct xfs_mount *mp, - struct xfs_perag *pag) + struct xfs_perag *pag, + struct xfs_rtgroup *rtg) { struct xfs_trans *tp; struct xfs_btree_cur *cur; - struct xfs_buf *agbp; + struct xfs_buf *agbp = NULL; struct xfs_refcount_recovery *rr, *n; struct list_head debris; union xfs_btree_irec low; @@ -2054,7 +2055,12 @@ xfs_refcount_recover_cow_leftovers( /* reflink filesystems mustn't have AGs larger than 2^31-1 blocks */ BUILD_BUG_ON(XFS_MAX_CRC_AG_BLOCKS >= XFS_REFC_COWFLAG); - if (mp->m_sb.sb_agblocks > XFS_MAX_CRC_AG_BLOCKS) + if (pag && mp->m_sb.sb_agblocks > XFS_MAX_CRC_AG_BLOCKS) + return -EOPNOTSUPP; + + /* rtreflink filesystems can't have rtgroups larger than 2^31-1 blocks */ + BUILD_BUG_ON(XFS_MAX_RGBLOCKS >= XFS_REFC_COWFLAG); + if (rtg && mp->m_sb.sb_rgblocks >= XFS_MAX_RGBLOCKS) return -EOPNOTSUPP; INIT_LIST_HEAD(&debris); @@ -2073,10 +2079,16 @@ xfs_refcount_recover_cow_leftovers( if (error) return error; - error = xfs_alloc_read_agf(pag, tp, 0, &agbp); - if (error) - goto out_trans; - cur = xfs_refcountbt_init_cursor(mp, tp, agbp, pag); + if (rtg) { + xfs_rtgroup_lock(NULL, rtg, XFS_RTGLOCK_REFCOUNT); + cur = xfs_rtrefcountbt_init_cursor(mp, tp, rtg, + rtg->rtg_refcountip); + } else { + error = xfs_alloc_read_agf(pag, tp, 0, &agbp); + if (error) + goto out_trans; + cur = xfs_refcountbt_init_cursor(mp, tp, agbp, pag); + } /* Find all the leftover CoW staging extents. 
*/ memset(&low, 0, sizeof(low)); @@ -2086,7 +2098,10 @@ xfs_refcount_recover_cow_leftovers( error = xfs_btree_query_range(cur, &low, &high, xfs_refcount_recover_extent, &debris); xfs_btree_del_cursor(cur, error); - xfs_trans_brelse(tp, agbp); + if (agbp) + xfs_trans_brelse(tp, agbp); + else + xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_REFCOUNT); xfs_trans_cancel(tp); if (error) goto out_free; @@ -2099,14 +2114,18 @@ xfs_refcount_recover_cow_leftovers( goto out_free; /* Free the orphan record */ - fsb = XFS_AGB_TO_FSB(mp, pag->pag_agno, - rr->rr_rrec.rc_startblock); - xfs_refcount_free_cow_extent(tp, false, fsb, + if (rtg) + fsb = xfs_rgbno_to_rtb(mp, rtg->rtg_rgno, + rr->rr_rrec.rc_startblock); + else + fsb = XFS_AGB_TO_FSB(mp, pag->pag_agno, + rr->rr_rrec.rc_startblock); + xfs_refcount_free_cow_extent(tp, rtg != NULL, fsb, rr->rr_rrec.rc_blockcount); /* Free the block. */ xfs_free_extent_later(tp, fsb, rr->rr_rrec.rc_blockcount, NULL, - 0); + rtg != NULL ? XFS_FREE_EXTENT_REALTIME : 0); error = xfs_trans_commit(tp); if (error) @@ -2128,6 +2147,22 @@ xfs_refcount_recover_cow_leftovers( return error; } +int +xfs_refcount_recover_cow_leftovers( + struct xfs_mount *mp, + struct xfs_perag *pag) +{ + return xfs_refcount_recover_group_cow_leftovers(mp, pag, NULL); +} + +int +xfs_refcount_recover_rtcow_leftovers( + struct xfs_mount *mp, + struct xfs_rtgroup *rtg) +{ + return xfs_refcount_recover_group_cow_leftovers(mp, NULL, rtg); +} + /* * Scan part of the keyspace of the refcount records and tell us if the area * has no records, is fully mapped by records, or is partially filled. 
diff --git a/fs/xfs/libxfs/xfs_refcount.h b/fs/xfs/libxfs/xfs_refcount.h index 4e725d723e88..c7907119d10c 100644 --- a/fs/xfs/libxfs/xfs_refcount.h +++ b/fs/xfs/libxfs/xfs_refcount.h @@ -12,6 +12,7 @@ struct xfs_perag; struct xfs_btree_cur; struct xfs_bmbt_irec; struct xfs_refcount_irec; +struct xfs_rtgroup; extern int xfs_refcount_lookup_le(struct xfs_btree_cur *cur, enum xfs_refc_domain domain, xfs_agblock_t bno, int *stat); @@ -99,8 +100,10 @@ void xfs_refcount_alloc_cow_extent(struct xfs_trans *tp, bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len); void xfs_refcount_free_cow_extent(struct xfs_trans *tp, bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len); -extern int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp, +int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp, struct xfs_perag *pag); +int xfs_refcount_recover_rtcow_leftovers(struct xfs_mount *mp, + struct xfs_rtgroup *rtg); /* * While we're adjusting the refcounts records of an extent, we have diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 3cead39e4308..13a613c077df 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1002,7 +1002,9 @@ xfs_reflink_recover_cow( struct xfs_mount *mp) { struct xfs_perag *pag; + struct xfs_rtgroup *rtg; xfs_agnumber_t agno; + xfs_rgnumber_t rgno; int error = 0; if (!xfs_has_reflink(mp)) @@ -1012,11 +1014,19 @@ xfs_reflink_recover_cow( error = xfs_refcount_recover_cow_leftovers(mp, pag); if (error) { xfs_perag_put(pag); - break; + return error; } } - return error; + for_each_rtgroup(mp, rgno, rtg) { + error = xfs_refcount_recover_rtcow_leftovers(mp, rtg); + if (error) { + xfs_rtgroup_put(rtg); + return error; + } + } + + return 0; } /* From patchwork Fri Dec 30 22:18:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085531 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90756C4332F for ; Sat, 31 Dec 2022 01:54:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236043AbiLaByD (ORCPT ); Fri, 30 Dec 2022 20:54:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231231AbiLaByC (ORCPT ); Fri, 30 Dec 2022 20:54:02 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2F07A1DDD3 for ; Fri, 30 Dec 2022 17:54:01 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id BE52D61CE2 for ; Sat, 31 Dec 2022 01:54:00 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 25AC3C433D2; Sat, 31 Dec 2022 01:54:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451640; bh=pg4B0x7QVtQ7GFIuSuXOfuGL4WVeub5JqBpUKgko6S4=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=gNhUw08tiJmq2zwDM7MNZ8120I2/hUsgxYqU3/LQXj8bpTAsD92tMHAjwx7hG4lc4 N+CGELYABHVoY6HhyTxqIFBXVR+6BnJXaRBqfANJoRL6UquCSdm1a84EbKZjDzjCKi aMt2b181qkYGJ4WBlQlsLeMR5u2znkJFL/Czfm1JpC1RymUhIOHyhBf+9zgJwJDtI8 qa4ce51h7rf+QBXK5dyz3OmJrzFvaTcpB883a+hp6ls0OxhJQaj9TMXBnvUX7dmlgH 44oJMC8/8XqwKoDPbLIIrXSHnBCrGxgRAk61uPmKWQWGcbDnLlDWHrroa9yl8nAlHP im7SqFilVxgdA== Subject: [PATCH 23/42] xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:32 -0800 Message-ID: <167243871222.717073.2381159476227762923.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Currently, we (ab)use xfs_get_extsz_hint so that it always returns a nonzero value for realtime files. This apparently was done to disable delayed allocation for realtime files. However, once we enable realtime reflink, we can also turn on the alwayscow flag to force CoW writes to realtime files. In this case, the logic will incorrectly send the write through the delalloc write path. Fix this by adjusting the logic slightly. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_bmap.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index fe31f3cb5d91..552875ddcc4a 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -6427,9 +6427,8 @@ xfs_get_extsz_hint( * No point in aligning allocations if we need to COW to actually * write to them. */ - if (xfs_is_always_cow_inode(ip)) - return 0; - if ((ip->i_diflags & XFS_DIFLAG_EXTSIZE) && ip->i_extsize) + if (!xfs_is_always_cow_inode(ip) && + (ip->i_diflags & XFS_DIFLAG_EXTSIZE) && ip->i_extsize) return ip->i_extsize; if (XFS_IS_REALTIME_INODE(ip)) return ip->i_mount->m_sb.sb_rextsize; From patchwork Fri Dec 30 22:18:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2022C3DA7D for ; Sat, 31 Dec 2022 01:54:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236005AbiLaByi (ORCPT ); Fri, 30 Dec 2022 20:54:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236044AbiLaByR (ORCPT ); Fri, 30 Dec 2022 20:54:17 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B89721DDD3 for ; Fri, 30 Dec 2022 17:54:16 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 547DF61CE2 for ; Sat, 31 Dec 2022 01:54:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B1CDCC433EF; Sat, 31 Dec 2022 01:54:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451655; bh=gGKxD2s3JA9L92bHOMiv4ee+wnVvd56nKq1aifBWVAE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=CDOAJkiBiMNcbS/AOPQ2RCPQJ8QrxFT7cMR/0OazBBuZPmpdWYUEpIeTJqFCWmyvQ GtVYk+7lRl3aQrDeAMMb9Mwu4i5BwbqxBhn/EZHQqJvcCHib/yEQaZEUD6XHid8V/W xOMxM7EFDW8CMlgqQ92XT77PCoHPWP8BJb3y5Fz4jND1BPXiVd1uXpOYEE64TtWoNI ob40Q6LSAgdRy1VUs4BMBmCdj7rh/nNvGFRY6BP/P/CFIrecoe+gxgrkftRWW30j16 xST1eRi81w7uh4NxT9nm1OvfB/dKfhO3qmKrs8R0JN8a58r/B8pnA7QMIyes3LiDCp lURw10L8ketjA== Subject: [PATCH 24/42] xfs: apply rt extent alignment constraints to CoW extsize hint From: "Darrick J. 
Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Fri, 30 Dec 2022 14:18:32 -0800
Message-ID: <167243871237.717073.10361772909161650509.stgit@magnolia>
In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia>
References: <167243870849.717073.203452386730176902.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

The copy-on-write extent size hint is subject to the same alignment
constraints as the regular extent size hint. Since we're in the process
of adding reflink (and therefore CoW) to the realtime device, we must
apply the same scattered rextsize alignment validation strategies to
both hints to deal with the possibility of rextsize changing. Therefore,
fix the inode validator to perform rextsize alignment checks on regular
realtime files, and to remove misaligned directory hints.

Signed-off-by: Darrick J. Wong
---
 fs/xfs/libxfs/xfs_inode_buf.c   |   25 ++++++++++++++++++++-----
 fs/xfs/libxfs/xfs_trans_inode.c |   14 ++++++++++++++
 fs/xfs/xfs_ioctl.c              |   17 +++++++++++++++--
 3 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 0db719f80bf2..09dafa8a9ab2 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -831,11 +831,29 @@ xfs_inode_validate_cowextsize(
 	bool				rt_flag;
 	bool				hint_flag;
 	uint32_t			cowextsize_bytes;
+	uint32_t			blocksize_bytes;
 
 	rt_flag = (flags & XFS_DIFLAG_REALTIME);
 	hint_flag = (flags2 & XFS_DIFLAG2_COWEXTSIZE);
 	cowextsize_bytes = XFS_FSB_TO_B(mp, cowextsize);
 
+	/*
+	 * Similar to extent size hints, a directory can be configured to
+	 * propagate realtime status and a CoW extent size hint to newly
+	 * created files even if there is no realtime device, and the hints on
+	 * disk can become misaligned if the sysadmin changes the rt extent
+	 * size while adding the realtime device.
+	 *
+	 * Therefore, we can only enforce the rextsize alignment check against
+	 * regular realtime files, and rely on callers to decide when alignment
+	 * checks are appropriate, and fix things up as needed.
+	 */
+
+	if (rt_flag)
+		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
+	else
+		blocksize_bytes = mp->m_sb.sb_blocksize;
+
 	if (hint_flag && !xfs_has_reflink(mp))
 		return __this_address;
 
@@ -849,16 +867,13 @@ xfs_inode_validate_cowextsize(
 	if (mode && !hint_flag && cowextsize != 0)
 		return __this_address;
 
-	if (hint_flag && rt_flag)
-		return __this_address;
-
-	if (cowextsize_bytes % mp->m_sb.sb_blocksize)
+	if (cowextsize_bytes % blocksize_bytes)
 		return __this_address;
 
 	if (cowextsize > XFS_MAX_BMBT_EXTLEN)
 		return __this_address;
 
-	if (cowextsize > mp->m_sb.sb_agblocks / 2)
+	if (!rt_flag && cowextsize > mp->m_sb.sb_agblocks / 2)
 		return __this_address;
 
 	return NULL;
diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inode.c
index 4571db873f14..e292851e3b9d 100644
--- a/fs/xfs/libxfs/xfs_trans_inode.c
+++ b/fs/xfs/libxfs/xfs_trans_inode.c
@@ -160,6 +160,20 @@ xfs_trans_log_inode(
 		flags |= XFS_ILOG_CORE;
 	}
 
+	/*
+	 * Inode verifiers do not check that the CoW extent size hint is an
+	 * integer multiple of the rt extent size on a directory with both
+	 * rtinherit and cowextsize flags set. If we're logging a directory
+	 * that is misconfigured in this way, clear the hint.
+	 */
+	if ((ip->i_diflags & XFS_DIFLAG_RTINHERIT) &&
+	    (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) &&
+	    (ip->i_cowextsize % ip->i_mount->m_sb.sb_rextsize) > 0) {
+		ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE;
+		ip->i_cowextsize = 0;
+		flags |= XFS_ILOG_CORE;
+	}
+
 	/*
 	 * Record the specific change for fdatasync optimisation. This allows
 	 * fdatasync to skip log forces for inodes that are only timestamp
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 939cc6d862da..abca384c86a4 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1054,8 +1054,21 @@ xfs_fill_fsxattr(
 		}
 	}
 
-	if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
-		fa->fsx_cowextsize = XFS_FSB_TO_B(mp, ip->i_cowextsize);
+	if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
+		/*
+		 * Don't let a misaligned CoW extent size hint on a directory
+		 * escape to userspace if it won't pass the setattr checks
+		 * later.
+		 */
+		if ((ip->i_diflags & XFS_DIFLAG_RTINHERIT) &&
+		    ip->i_cowextsize % mp->m_sb.sb_rextsize > 0) {
+			fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE;
+			fa->fsx_cowextsize = 0;
+		} else {
+			fa->fsx_cowextsize = XFS_FSB_TO_B(mp, ip->i_cowextsize);
+		}
+	}
+
 	fa->fsx_projid = ip->i_projid;
 	if (ifp && !xfs_need_iread_extents(ifp))
 		fa->fsx_nextents = xfs_iext_count(ifp);

From patchwork Fri Dec 30 22:18:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13085532 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C17BBC4332F for ; Sat, 31 Dec 2022 01:54:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236018AbiLaByj (ORCPT ); Fri, 30 Dec 2022 20:54:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50236 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236139AbiLaByc (ORCPT ); Fri, 30 Dec 2022 20:54:32 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 444C11DDD3 for ; Fri, 30 Dec 2022 17:54:32 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D651761CD6 for ; Sat, 31 Dec 2022 01:54:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 40FA8C433EF; Sat, 31 Dec 2022 01:54:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451671; bh=ainxLkxNFBefq74OUmSYZRTRopOIVf7cuw5vRbe4wVE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=JkpzCBhF0M2T82/NIBNfR82ofuVJLHaPkLXYSS1KpQLK2RmS3ROzZHKwNIzhVJ+jR LN8yOWaS0j5KBfpXgn/aIQuYU2u+ImsCu/JM2F8UmLWLjhSjJ9g0NUfrQBfzAX7lZf L7b9dBeG/4NdFEUNGIhhNZHGbyGMJsS55mOJU0GNVFgkk0FkbodAu0qevs3PbqBWGF /OSk7h4RpoI47+EEmmrr8ed3ooiDe+GZ4NTtrk5Jt92HSVFWDA0G2VYwG48IxRjH+q t7hTA3ApBZ2WFWKsxgaKX1HveHqBK1Mc9HC4Dx7swXFP30fjv3BjV4cywxlPn/ij2Y SPMl7yLlDI9wA== Subject: [PATCH 25/42] xfs: enable extent size hints for CoW operations From: "Darrick J. 
Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Fri, 30 Dec 2022 14:18:32 -0800
Message-ID: <167243871251.717073.7768252010589495025.stgit@magnolia>
In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia>
References: <167243870849.717073.203452386730176902.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Wire up the copy-on-write extent size hint for realtime files, and
connect it to the rt allocator so that we avoid fragmentation on rt
filesystems.

Signed-off-by: Darrick J. Wong
---
 fs/xfs/libxfs/xfs_bmap.c |    8 +++++++-
 fs/xfs/xfs_bmap_util.c   |    5 ++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 552875ddcc4a..b2bc39b1f9b7 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -6450,7 +6450,13 @@ xfs_get_cowextsz_hint(
 	a = 0;
 	if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE)
 		a = ip->i_cowextsize;
-	b = xfs_get_extsz_hint(ip);
+	if (XFS_IS_REALTIME_INODE(ip)) {
+		b = 0;
+		if (ip->i_diflags & XFS_DIFLAG_EXTSIZE)
+			b = ip->i_extsize;
+	} else {
+		b = xfs_get_extsz_hint(ip);
+	}
 
 	a = max(a, b);
 	if (a == 0)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 842f472292cd..a54ed26e1cc0 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -138,7 +138,10 @@ xfs_bmap_rtalloc(
 	bool			ignore_locality = false;
 	int			error;
 
-	align = xfs_get_extsz_hint(ap->ip);
+	if (ap->flags & XFS_BMAPI_COWFORK)
+		align = xfs_get_cowextsz_hint(ap->ip);
+	else
+		align = xfs_get_extsz_hint(ap->ip);
 retry:
 	prod = xfs_extlen_to_rtxlen(mp, align);
 	error = xfs_bmap_extsize_align(mp, &ap->got, &ap->prev,

From patchwork Fri Dec 30 22:18:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13085534 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 555E4C4332F for ; Sat, 31 Dec 2022 01:55:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235902AbiLaBzR (ORCPT ); Fri, 30 Dec 2022 20:55:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235911AbiLaByv (ORCPT ); Fri, 30 Dec 2022 20:54:51 -0500 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5251F1DDDD for ; Fri, 30 Dec 2022 17:54:50 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 97952CE19F4 for ; Sat, 31 Dec 2022 01:54:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D977CC433EF; Sat, 31 Dec 2022 01:54:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451686; bh=om4+79GL+45ZRqsCK0wB4AkGZWkh1Jrw1QfFPDlns2o=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=TyO0rb0kDov7C8Jg7jTlALmY2OdrUchrHMKh8gLjJGwo4vA5i6ff0o1/9ELAC7Jh/ OQQA9fmt0Pt0qH94bVuN4ccsFrQ1S26Q8MiSuJlg0DCoFoh+4FjS73jBSkmx7slK2+ 42hK5vwat+ccTLExj/0Yf47MPE10qmmWg+gpnebo/XJHIpYFTg5cW7tf2wxBOk+rIT zRmhtvcpuDaG6cYwd7aeqOEfv5MXaQn2e9p6k3CytRsUzmNPJgT1PsVX4aaL8pT0Xd 33Nq+hIhEk/kmjo7ho+AwKMj+CZkZUK2akIq8wPW2Deg7wo5QXYKU0KAPs429Nk4gO 9n8tIzYgXuBaQ== Subject: [PATCH 26/42] xfs: check that the rtrefcount maxlevels doesn't increase when growing fs From: "Darrick J. 
Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Fri, 30 Dec 2022 14:18:32 -0800
Message-ID: <167243871265.717073.12647043438282202534.stgit@magnolia>
In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia>
References: <167243870849.717073.203452386730176902.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

The size of filesystem transaction reservations depends on the maximum
height (maxlevels) of the realtime btrees. Since we don't want a grow
operation to increase the reservation size enough that we'll fail the
minimum log size checks on the next mount, constrain growfs operations
if they would cause an increase in the rt refcount btree maxlevels.

Signed-off-by: Darrick J. Wong
---
 fs/xfs/xfs_fsops.c   |    2 ++
 fs/xfs/xfs_rtalloc.c |    2 ++
 2 files changed, 4 insertions(+)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 65b44ad8884e..317f0461f490 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -24,6 +24,7 @@
 #include "xfs_rtgroup.h"
 #include "xfs_rtalloc.h"
 #include "xfs_rtrmap_btree.h"
+#include "xfs_rtrefcount_btree.h"
 
 /*
  * Write new AG headers to disk. Non-transactional, but need to be
@@ -225,6 +226,7 @@ xfs_growfs_data_private(
 
 		/* Compute new maxlevels for rt btrees. */
 		xfs_rtrmapbt_compute_maxlevels(mp);
+		xfs_rtrefcountbt_compute_maxlevels(mp);
 	}
 
 	return error;
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 48c7cc28b7f2..7f1ee9432e71 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1172,6 +1172,7 @@ xfs_growfs_check_rtgeom(
 		fake_mp->m_features |= XFS_FEAT_REALTIME;
 
 	xfs_rtrmapbt_compute_maxlevels(fake_mp);
+	xfs_rtrefcountbt_compute_maxlevels(fake_mp);
 
 	xfs_trans_resv_calc(fake_mp, M_RES(fake_mp));
 	min_logfsbs = xfs_log_calc_minimum_size(fake_mp);
@@ -1474,6 +1475,7 @@ xfs_growfs_rt(
 		 */
 		mp->m_features |= XFS_FEAT_REALTIME;
 		xfs_rtrmapbt_compute_maxlevels(mp);
+		xfs_rtrefcountbt_compute_maxlevels(mp);
 	}
 	if (error)
 		goto out_free;

From patchwork Fri Dec 30 22:18:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13085535
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23ACAC3DA7D for ; Sat, 31 Dec 2022 01:55:19 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231231AbiLaBzR (ORCPT ); Fri, 30 Dec 2022 20:55:17 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50428 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236030AbiLaBzG (ORCPT ); Fri, 30 Dec 2022 20:55:06 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0EEDE1DDD3 for ; Fri, 30 Dec 2022 17:55:05 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id B5E92B81E07 for ; Sat, 31 Dec 2022 01:55:03 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 67AAFC433D2; Sat, 31 Dec 2022 01:55:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451702; bh=QQamTzppxlCOWg3ZBDU8rMfJrtsB47pcOFWzLb6tW1o=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=fNkkegF/BbuHSI5O3O3IHnKu7uZxd7SDplIL9Jex3voKL0IuKe46XkwcsNXfCSlO8 jB1fKWW96tEyqwDRnbLtg3Iz7HfGzhShd+OUqI7M60MBM1rqbjcWYoOciDVKG+gt3T iDGAHG9AxERaHmk5P0A5s3Bgj1UigOLQsCPdgwHi6sCXKMAnUCDxYcM3j5hXSLsOAD qbURVYavvwqWXsnE949X954wu+8ymrzXxYPGYI6QdyeUuIwEMZNnUpfc86AmTGMuSa uV2SQe9puKrCUj99KxmD0GuPkFioKDLLF5ObDit1bf9Zrnq1q3P9HB/L5cyWI7nagT SdtEOuBZE0NDg== Subject: [PATCH 27/42] xfs: add realtime refcount btree when adding rt volume From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:32 -0800 Message-ID: <167243871279.717073.11606527521201327303.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong If we're adding enough space to the realtime section to require the creation of new realtime groups, create the rt refcount btree inode before we start adding the space. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_rtalloc.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 77 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 7f1ee9432e71..8929c4fffb53 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1134,6 +1134,73 @@ xfs_growfsrt_create_rtrmap( return error; } +/* Add a metadata inode for a realtime refcount btree. 
*/
+static int
+xfs_growfsrt_create_rtrefcount(
+	struct xfs_rtgroup	*rtg)
+{
+	struct xfs_mount	*mp = rtg->rtg_mount;
+	struct xfs_imeta_update	upd;
+	struct xfs_imeta_path	*path;
+	struct xfs_trans	*tp;
+	struct xfs_inode	*ip = NULL;
+	int			error;
+
+	if (!xfs_has_rtreflink(mp) || rtg->rtg_refcountip)
+		return 0;
+
+	error = xfs_rtrefcountbt_create_path(mp, rtg->rtg_rgno, &path);
+	if (error)
+		return error;
+
+	error = xfs_imeta_ensure_dirpath(mp, path);
+	if (error)
+		goto out_path;
+
+	error = xfs_imeta_start_update(mp, path, &upd);
+	if (error)
+		goto out_path;
+
+	error = xfs_qm_dqattach(upd.dp);
+	if (error)
+		goto out_upd;
+
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_imeta_create,
+			xfs_imeta_create_space_res(mp), 0, 0, &tp);
+	if (error)
+		goto out_end;
+
+	error = xfs_rtrefcountbt_create(&tp, path, &upd, &ip);
+	if (error)
+		goto out_cancel;
+
+	lockdep_set_class(&ip->i_lock.mr_lock, &xfs_rrefcountip_key);
+
+	error = xfs_trans_commit(tp);
+	if (error)
+		goto out_end;
+
+	xfs_imeta_end_update(mp, &upd, error);
+	xfs_imeta_free_path(path);
+	xfs_finish_inode_setup(ip);
+	rtg->rtg_refcountip = ip;
+	return 0;
+
+out_cancel:
+	xfs_trans_cancel(tp);
+out_end:
+	/* Have to finish setting up the inode to ensure it's deleted. */
+	if (ip) {
+		xfs_finish_inode_setup(ip);
+		xfs_irele(ip);
+	}
+out_upd:
+	xfs_imeta_end_update(mp, &upd, error);
+out_path:
+	xfs_imeta_free_path(path);
+	return error;
+}
+
 /*
  * Check that changes to the realtime geometry won't affect the minimum
  * log size, which would cause the fs to become unusable.
@@ -1241,9 +1308,11 @@ xfs_growfs_rt(
 		return -EINVAL;
 
 	/* Unsupported realtime features. */
-	if (!xfs_has_rtgroups(mp) && xfs_has_rmapbt(mp))
+	if (!xfs_has_rtgroups(mp) && (xfs_has_rmapbt(mp) || xfs_has_reflink(mp)))
 		return -EOPNOTSUPP;
-	if (xfs_has_reflink(mp) || xfs_has_quota(mp))
+	if (xfs_has_quota(mp))
+		return -EOPNOTSUPP;
+	if (xfs_has_reflink(mp) && in->extsize != 1)
 		return -EOPNOTSUPP;
 
 	nrblocks = in->newblocks;
@@ -1378,6 +1447,12 @@ xfs_growfs_rt(
 				xfs_rtgroup_put(rtg);
 				break;
 			}
+
+			error = xfs_growfsrt_create_rtrefcount(rtg);
+			if (error) {
+				xfs_rtgroup_put(rtg);
+				break;
+			}
 		}
 	}
 

From patchwork Fri Dec 30 22:18:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13085536
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96EABC4332F for ; Sat, 31 Dec 2022 01:55:23 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236006AbiLaBzW (ORCPT ); Fri, 30 Dec 2022 20:55:22 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50442 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235911AbiLaBzV (ORCPT ); Fri, 30 Dec 2022 20:55:21 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9F0D11DDDA for ; Fri, 30 Dec 2022 17:55:20 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4B7F4B81E08 for ; Sat, 31 Dec 2022 01:55:19 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 018A6C433EF; Sat, 31 Dec 2022 01:55:17 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202;
t=1672451718; bh=6SLftXjIPDFS8ycIrKppO90HNSzeEZmUUAqiqJ8kaHQ=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=Fl47TkN4bPJdO4w0NzP3vXsMDhZeFEdEBsUzpQYjtSDdkCdxdjoOKnqJxig3LhmX9 qTAZ+ndyLtjbZk7V1okTGF5LAkBvRjT1zKWHDh9o951IT3wQhOU4dDLwuE5N8PKJcm s++OOwXUj4vkjqY2wPdDQlcbXT9eYuws55lzEu27EYZnWKCF4ayFARLytbWV0TVrU2 OzWGR5QoDv7MVBAag96N1VsANnCx51uDUfrpy0saS1ew7J+34LmCr/bYukaApgs/mu TzWmx+eXi7hPVT0mKhgtoKK7j94dgnqqJtG3/j2EWAXmXuo9eWDMbiYgPCHDYbkW6O yvx8UMyV/vDmw== Subject: [PATCH 28/42] xfs: report realtime refcount btree corruption errors to the health system From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:32 -0800 Message-ID: <167243871294.717073.3857302599711174842.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Whenever we encounter corrupt realtime refcount btree blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: Darrick J. 
Wong
---
 fs/xfs/libxfs/xfs_fs.h               |    1 +
 fs/xfs/libxfs/xfs_health.h           |    4 +++-
 fs/xfs/libxfs/xfs_inode_fork.c       |    4 +++-
 fs/xfs/libxfs/xfs_rtrefcount_btree.c |    5 ++++-
 fs/xfs/xfs_health.c                  |    4 ++++
 fs/xfs/xfs_rtalloc.c                 |    1 +
 6 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index 8547ba85c550..5819576a51a1 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -314,6 +314,7 @@ struct xfs_rtgroup_geometry {
 #define XFS_RTGROUP_GEOM_SICK_SUPER	(1 << 0)  /* superblock */
 #define XFS_RTGROUP_GEOM_SICK_BITMAP	(1 << 1)  /* rtbitmap for this group */
 #define XFS_RTGROUP_GEOM_SICK_RMAPBT	(1 << 2)  /* reverse mappings */
+#define XFS_RTGROUP_GEOM_SICK_REFCNTBT	(1 << 3)  /* reference counts */
 
 /*
  * Structures for XFS_IOC_FSGROWFSDATA, XFS_IOC_FSGROWFSLOG & XFS_IOC_FSGROWFSRT
diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h
index d5976f6b0de1..131282167548 100644
--- a/fs/xfs/libxfs/xfs_health.h
+++ b/fs/xfs/libxfs/xfs_health.h
@@ -68,6 +68,7 @@ struct xfs_rtgroup;
 #define XFS_SICK_RT_SUMMARY	(1 << 1)  /* realtime summary */
 #define XFS_SICK_RT_SUPER	(1 << 2)  /* rt group superblock */
 #define XFS_SICK_RT_RMAPBT	(1 << 3)  /* reverse mappings */
+#define XFS_SICK_RT_REFCNTBT	(1 << 4)  /* reference counts */
 
 /* Observable health issues for AG metadata. */
 #define XFS_SICK_AG_SB		(1 << 0)  /* superblock */
@@ -106,7 +107,8 @@ struct xfs_rtgroup;
 #define XFS_SICK_RT_PRIMARY	(XFS_SICK_RT_BITMAP | \
 				 XFS_SICK_RT_SUMMARY | \
 				 XFS_SICK_RT_SUPER | \
-				 XFS_SICK_RT_RMAPBT)
+				 XFS_SICK_RT_RMAPBT | \
+				 XFS_SICK_RT_REFCNTBT)
 
 #define XFS_SICK_AG_PRIMARY	(XFS_SICK_AG_SB | \
 				 XFS_SICK_AG_AGF | \
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 7aae3ae810b7..5d5134a61994 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -268,8 +268,10 @@ xfs_iformat_data_fork(
 			}
 			return xfs_iformat_rtrmap(ip, dip);
 		case XFS_DINODE_FMT_REFCOUNT:
-			if (!xfs_has_rtreflink(ip->i_mount))
+			if (!xfs_has_rtreflink(ip->i_mount)) {
+				xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 				return -EFSCORRUPTED;
+			}
 			return xfs_iformat_rtrefcount(ip, dip);
 		default:
 			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c
index 0a6fa9851371..4bbda3ff0b39 100644
--- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c
+++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c
@@ -27,6 +27,7 @@
 #include "xfs_rtgroup.h"
 #include "xfs_rtbitmap.h"
 #include "xfs_imeta.h"
+#include "xfs_health.h"
 
 static struct kmem_cache	*xfs_rtrefcountbt_cur_cache;
 
@@ -693,8 +694,10 @@ xfs_iformat_rtrefcount(
 	level = be16_to_cpu(dfp->bb_level);
 
 	if (level > mp->m_rtrefc_maxlevels ||
-	    xfs_rtrefcount_droot_space_calc(level, numrecs) > dsize)
+	    xfs_rtrefcount_droot_space_calc(level, numrecs) > dsize) {
+		xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE);
 		return -EFSCORRUPTED;
+	}
 
 	xfs_iroot_alloc(ip, XFS_DATA_FORK,
 			xfs_rtrefcount_broot_space_calc(mp, level, numrecs));
diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c
index 80cc735b52d1..3a6684acd858 100644
--- a/fs/xfs/xfs_health.c
+++ b/fs/xfs/xfs_health.c
@@ -532,6 +532,7 @@ static const struct ioctl_sick_map rtgroup_map[] = {
 	{ XFS_SICK_RT_SUPER,	XFS_RTGROUP_GEOM_SICK_SUPER },
 	{ XFS_SICK_RT_BITMAP,	XFS_RTGROUP_GEOM_SICK_BITMAP },
 	{ XFS_SICK_RT_RMAPBT,	XFS_RTGROUP_GEOM_SICK_RMAPBT },
+	{ XFS_SICK_RT_REFCNTBT,	XFS_RTGROUP_GEOM_SICK_REFCNTBT },
 	{ 0, 0 },
 };
 
@@ -634,6 +635,9 @@ xfs_btree_mark_sick(
 	case XFS_BTNUM_RTRMAP:
 		xfs_rtgroup_mark_sick(cur->bc_ino.rtg, XFS_SICK_RT_RMAPBT);
 		return;
+	case XFS_BTNUM_RTREFC:
+		xfs_rtgroup_mark_sick(cur->bc_ino.rtg, XFS_SICK_RT_REFCNTBT);
+		return;
 	case XFS_BTNUM_BNO:
 		mask = XFS_SICK_AG_BNOBT;
 		break;
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 8929c4fffb53..75d39c3274df 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -1968,6 +1968,7 @@ xfs_rtmount_refcountbt(
 		goto out_path;
 
 	if (XFS_IS_CORRUPT(mp, ip->i_df.if_format != XFS_DINODE_FMT_REFCOUNT)) {
+		xfs_rtgroup_mark_sick(rtg, XFS_SICK_RT_REFCNTBT);
 		error = -EFSCORRUPTED;
 		goto out_rele;
 	}

From patchwork Fri Dec 30 22:18:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13085537
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0DD2C4332F for ; Sat, 31 Dec 2022 01:55:42 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236027AbiLaBzl (ORCPT ); Fri, 30 Dec 2022 20:55:41 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50476 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235911AbiLaBzl (ORCPT ); Fri, 30 Dec 2022 20:55:41 -0500
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58F751DDD3 for ; Fri, 30 Dec 2022 17:55:36 -0800 (PST)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by
ams.source.kernel.org (Postfix) with ESMTPS id DEE5BB81E07 for ; Sat, 31 Dec 2022 01:55:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A29C5C433D2; Sat, 31 Dec 2022 01:55:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451733; bh=xVdRo37W1K9A07RrNOkmsHTdmOZqhJ9RRl47ndgk8e4=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=d8+640svzhLooir49NVODfYFRxDG7P6JJ8BTviT2tSlOTOu2uxo1xrjJDOFygIiIT OAK9RgbsLf9VPlbbH13Ex8XGYioNnuNPua/DYFnOW9pVZaOt0rhuCnUReo5qlfXO4n IeMieomKUBVeG4QF7SR3A0RceE78VpYVLsvWOawp4MYkBjQlnGqQb0NzTrRQxLVVOx k17+0PFl64xE7QtAK1YBquPNUf40+ffcJQxRIlUB7jniTqaEGRWyAnPTKF19LUSU11 WWwPd4UrH4I4Cgw7N/ory/5FKxb759r04jzwTIMP1c7GTqdNHsMcJgvvrCMrGmx2Eq Z9io6NSaAcDqA== Subject: [PATCH 29/42] xfs: scrub the realtime refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:33 -0800 Message-ID: <167243871308.717073.691655306599152783.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Add code to scrub realtime refcount btrees. Signed-off-by: Darrick J. 
Wong
---
 fs/xfs/Makefile            |    1 
 fs/xfs/libxfs/xfs_fs.h     |    3 
 fs/xfs/scrub/bmap.c        |    1 
 fs/xfs/scrub/bmap_repair.c |    1 
 fs/xfs/scrub/common.c      |   40 +++-
 fs/xfs/scrub/common.h      |    5 
 fs/xfs/scrub/health.c      |    1 
 fs/xfs/scrub/inode.c       |    1 
 fs/xfs/scrub/rtrefcount.c  |  495 ++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/scrub/scrub.c       |    7 +
 fs/xfs/scrub/scrub.h       |    3 
 fs/xfs/scrub/trace.h       |    4 
 12 files changed, 548 insertions(+), 14 deletions(-)
 create mode 100644 fs/xfs/scrub/rtrefcount.c

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 9cc30333c089..cb1074c67dc5 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -182,6 +182,7 @@ xfs-y += $(addprefix scrub/, \
 xfs-$(CONFIG_XFS_RT)		+= $(addprefix scrub/, \
 				   rgsuper.o \
 				   rtbitmap.o \
+				   rtrefcount.o \
 				   rtrmap.o \
 				   rtsummary.o \
 				   )
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index 5819576a51a1..453b08612256 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -746,9 +746,10 @@ struct xfs_scrub_metadata {
 #define XFS_SCRUB_TYPE_RGSUPER	28	/* realtime superblock */
 #define XFS_SCRUB_TYPE_RGBITMAP	29	/* realtime group bitmap */
 #define XFS_SCRUB_TYPE_RTRMAPBT	30	/* rtgroup reverse mapping btree */
+#define XFS_SCRUB_TYPE_RTREFCBT	31	/* realtime reference count btree */
 
 /* Number of scrub subcommands. */
-#define XFS_SCRUB_TYPE_NR	31
+#define XFS_SCRUB_TYPE_NR	32
 
 /* i: Repair this metadata. */
 #define XFS_SCRUB_IFLAG_REPAIR		(1u << 0)
diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c
index 8ce279ae9c95..f18b22bc2548 100644
--- a/fs/xfs/scrub/bmap.c
+++ b/fs/xfs/scrub/bmap.c
@@ -909,6 +909,7 @@ xchk_bmap(
 	case XFS_DINODE_FMT_DEV:
 	case XFS_DINODE_FMT_LOCAL:
 	case XFS_DINODE_FMT_RMAP:
+	case XFS_DINODE_FMT_REFCOUNT:
 		/* No mappings to check. */
 		if (whichfork == XFS_COW_FORK)
 			xchk_fblock_set_corrupt(sc, whichfork, 0);
diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c
index b8cdcba984f3..5dca4680657f 100644
--- a/fs/xfs/scrub/bmap_repair.c
+++ b/fs/xfs/scrub/bmap_repair.c
@@ -807,6 +807,7 @@ xrep_bmap_check_inputs(
 	case XFS_DINODE_FMT_LOCAL:
 	case XFS_DINODE_FMT_UUID:
 	case XFS_DINODE_FMT_RMAP:
+	case XFS_DINODE_FMT_REFCOUNT:
 		return -ECANCELED;
 	case XFS_DINODE_FMT_EXTENTS:
 	case XFS_DINODE_FMT_BTREE:
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index c2c379aae770..a632d56f255f 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -37,6 +37,7 @@
 #include "xfs_rtgroup.h"
 #include "xfs_rtrmap_btree.h"
 #include "xfs_bmap_util.h"
+#include "xfs_rtrefcount_btree.h"
 #include "scrub/scrub.h"
 #include "scrub/common.h"
 #include "scrub/trace.h"
@@ -856,6 +857,10 @@ xchk_rtgroup_init(
 		sr->rmap_cur = xfs_rtrmapbt_init_cursor(sc->mp, sc->tp,
 				sr->rtg, sr->rtg->rtg_rmapip);
 
+	if (xfs_has_rtreflink(sc->mp) && (rtglock_flags & XFS_RTGLOCK_REFCOUNT))
+		sr->refc_cur = xfs_rtrefcountbt_init_cursor(sc->mp, sc->tp,
+				sr->rtg, sr->rtg->rtg_refcountip);
+
 	return 0;
 }
 
@@ -870,7 +875,10 @@ xchk_rtgroup_btcur_free(
 {
 	if (sr->rmap_cur)
 		xfs_btree_del_cursor(sr->rmap_cur, XFS_BTREE_ERROR);
+	if (sr->refc_cur)
+		xfs_btree_del_cursor(sr->refc_cur, XFS_BTREE_ERROR);
 
+	sr->refc_cur = NULL;
 	sr->rmap_cur = NULL;
 }
 
@@ -1556,16 +1564,26 @@ xchk_inode_count_blocks(
 		}
 		cur = xfs_rtrmapbt_init_cursor(sc->mp, sc->tp, sc->sr.rtg,
 				sc->ip);
-		error = xfs_btree_count_blocks(cur, &btblocks);
-		xfs_btree_del_cursor(cur, error);
-		if (error)
-			return error;
-
-		*nextents = 0;
-		*count = btblocks - 1;
-		return 0;
-	default:
-		return xfs_bmap_count_blocks(sc->tp, sc->ip, whichfork,
-				nextents, count);
+		goto meta_btree;
+	case XFS_DINODE_FMT_REFCOUNT:
+		if (!sc->sr.rtg) {
+			ASSERT(0);
+			return -EFSCORRUPTED;
+		}
+		cur = xfs_rtrefcountbt_init_cursor(sc->mp, sc->tp, sc->sr.rtg,
+				sc->ip);
+		goto meta_btree;
 	}
+
+	return xfs_bmap_count_blocks(sc->tp, sc->ip, whichfork, nextents,
+			count);
+meta_btree:
+	error = xfs_btree_count_blocks(cur, &btblocks);
+	xfs_btree_del_cursor(cur, error);
+	if (error)
+		return error;
+
+	*nextents = 0;
+	*count = btblocks - 1;
+	return 0;
 }
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index e135f792cfcc..dd1b838a183f 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -108,12 +108,14 @@ int xchk_setup_rtsummary(struct xfs_scrub *sc);
 int xchk_setup_rgsuperblock(struct xfs_scrub *sc);
 int xchk_setup_rgbitmap(struct xfs_scrub *sc);
 int xchk_setup_rtrmapbt(struct xfs_scrub *sc);
+int xchk_setup_rtrefcountbt(struct xfs_scrub *sc);
 #else
 # define xchk_setup_rtbitmap		xchk_setup_nothing
 # define xchk_setup_rtsummary		xchk_setup_nothing
 # define xchk_setup_rgsuperblock	xchk_setup_nothing
 # define xchk_setup_rgbitmap		xchk_setup_nothing
 # define xchk_setup_rtrmapbt		xchk_setup_nothing
+# define xchk_setup_rtrefcountbt	xchk_setup_nothing
 #endif
 #ifdef CONFIG_XFS_QUOTA
 int xchk_ino_dqattach(struct xfs_scrub *sc);
@@ -174,7 +176,8 @@ void xchk_rt_unlock(struct xfs_scrub *sc, struct xchk_rt *sr);
 
 /* All the locks we need to check an rtgroup. */
 #define XCHK_RTGLOCK_ALL	(XFS_RTGLOCK_BITMAP_SHARED | \
-				 XFS_RTGLOCK_RMAP)
+				 XFS_RTGLOCK_RMAP | \
+				 XFS_RTGLOCK_REFCOUNT)
 
 int xchk_rtgroup_init(struct xfs_scrub *sc, xfs_rgnumber_t rgno,
 		struct xchk_rt *sr, unsigned int rtglock_flags);
diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c
index 061f6f73b666..cb3b0b221275 100644
--- a/fs/xfs/scrub/health.c
+++ b/fs/xfs/scrub/health.c
@@ -114,6 +114,7 @@ static const struct xchk_health_map type_to_health_flag[XFS_SCRUB_TYPE_NR] = {
 	[XFS_SCRUB_TYPE_NLINKS]		= { XHG_FS,  XFS_SICK_FS_NLINKS },
 	[XFS_SCRUB_TYPE_RGSUPER]	= { XHG_RTGROUP, XFS_SICK_RT_SUPER },
 	[XFS_SCRUB_TYPE_RTRMAPBT]	= { XHG_RTGROUP, XFS_SICK_RT_RMAPBT },
+	[XFS_SCRUB_TYPE_RTREFCBT]	= { XHG_RTGROUP, XFS_SICK_RT_REFCNTBT },
 };
 
 /* Return the health status mask for this scrub type.
*/ diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index 3b19976b6066..be9739035226 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -468,6 +468,7 @@ xchk_dinode( xchk_ino_set_corrupt(sc, ino); break; case XFS_DINODE_FMT_RMAP: + case XFS_DINODE_FMT_REFCOUNT: if (!S_ISREG(mode)) xchk_ino_set_corrupt(sc, ino); break; diff --git a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c new file mode 100644 index 000000000000..528a056c7932 --- /dev/null +++ b/fs/xfs/scrub/rtrefcount.c @@ -0,0 +1,495 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_btree.h" +#include "xfs_rmap.h" +#include "xfs_refcount.h" +#include "xfs_inode.h" +#include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/btree.h" + +/* Set us up with the realtime refcount metadata locked. */ +int +xchk_setup_rtrefcountbt( + struct xfs_scrub *sc) +{ + struct xfs_rtgroup *rtg; + int error; + + if (xchk_need_fshook_drain(sc)) + xchk_fshooks_enable(sc, XCHK_FSHOOKS_DRAIN); + + rtg = xfs_rtgroup_get(sc->mp, sc->sm->sm_agno); + if (!rtg) + return -ENOENT; + + error = xchk_setup_rt(sc); + if (error) + goto out_rtg; + + error = xchk_install_live_inode(sc, rtg->rtg_refcountip); + if (error) + goto out_rtg; + + error = xchk_ino_dqattach(sc); + if (error) + goto out_rtg; + + error = xchk_rtgroup_init(sc, rtg->rtg_rgno, &sc->sr, XCHK_RTGLOCK_ALL); +out_rtg: + xfs_rtgroup_put(rtg); + return error; +} + +/* Realtime Reference count btree scrubber. 
*/ + +/* + * Confirming Reference Counts via Reverse Mappings + * + * We want to count the reverse mappings overlapping a refcount record + * (bno, len, refcount), allowing for the possibility that some of the + * overlap may come from smaller adjoining reverse mappings, while some + * comes from single extents which overlap the range entirely. The + * outer loop is as follows: + * + * 1. For all reverse mappings overlapping the refcount extent, + * a. If a given rmap completely overlaps, mark it as seen. + * b. Otherwise, record the fragment (in agbno order) for later + * processing. + * + * Once we've seen all the rmaps, we know that for all blocks in the + * refcount record we want to find $refcount owners and we've already + * visited $seen extents that overlap all the blocks. Therefore, we + * need to find ($refcount - $seen) owners for every block in the + * extent; call that quantity $target_nr. Proceed as follows: + * + * 2. Pull the first $target_nr fragments from the list; all of them + * should start at or before the start of the extent. + * Call this subset of fragments the working set. + * 3. Until there are no more unprocessed fragments, + * a. Find the shortest fragments in the set and remove them. + * b. Note the block number of the end of these fragments. + * c. Pull the same number of fragments from the list. All of these + * fragments should start at the block number recorded in the + * previous step. + * d. Put those fragments in the set. + * 4. Check that there are $target_nr fragments remaining in the list, + * and that they all end at or beyond the end of the refcount extent. + * + * If the refcount is correct, all the check conditions in the algorithm + * should always hold true. If not, the refcount is incorrect. 
+ */ +struct xchk_rtrefcnt_frag { + struct list_head list; + struct xfs_rmap_irec rm; +}; + +struct xchk_rtrefcnt_check { + struct xfs_scrub *sc; + struct list_head fragments; + + /* refcount extent we're examining */ + xfs_rgblock_t bno; + xfs_extlen_t len; + xfs_nlink_t refcount; + + /* number of owners seen */ + xfs_nlink_t seen; +}; + +/* + * Decide if the given rmap is large enough that we can redeem it + * towards refcount verification now, or if it's a fragment, in + * which case we'll hang onto it in the hopes that we'll later + * discover that we've collected exactly the correct number of + * fragments as the rtrefcountbt says we should have. + */ +STATIC int +xchk_rtrefcountbt_rmap_check( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xchk_rtrefcnt_check *refchk = priv; + struct xchk_rtrefcnt_frag *frag; + xfs_rgblock_t rm_last; + xfs_rgblock_t rc_last; + int error = 0; + + if (xchk_should_terminate(refchk->sc, &error)) + return error; + + rm_last = rec->rm_startblock + rec->rm_blockcount - 1; + rc_last = refchk->bno + refchk->len - 1; + + /* Confirm that a single-owner refc extent is a CoW stage. */ + if (refchk->refcount == 1 && rec->rm_owner != XFS_RMAP_OWN_COW) { + xchk_btree_xref_set_corrupt(refchk->sc, cur, 0); + return 0; + } + + if (rec->rm_startblock <= refchk->bno && rm_last >= rc_last) { + /* + * The rmap overlaps the refcount record, so we can confirm + * one refcount owner seen. + */ + refchk->seen++; + } else { + /* + * This rmap covers only part of the refcount record, so + * save the fragment for later processing. If the rmapbt + * is healthy each rmap_irec we see will be in agbno order + * so we don't need insertion sort here. 
+ */ + frag = kmalloc(sizeof(struct xchk_rtrefcnt_frag), + XCHK_GFP_FLAGS); + if (!frag) + return -ENOMEM; + memcpy(&frag->rm, rec, sizeof(frag->rm)); + list_add_tail(&frag->list, &refchk->fragments); + } + + return 0; +} + +/* + * Given a bunch of rmap fragments, iterate through them, keeping + * a running tally of the refcount. If this ever deviates from + * what we expect (which is the rtrefcountbt's refcount minus the + * number of extents that totally covered the rtrefcountbt extent), + * we have a rtrefcountbt error. + */ +STATIC void +xchk_rtrefcountbt_process_rmap_fragments( + struct xchk_rtrefcnt_check *refchk) +{ + struct list_head worklist; + struct xchk_rtrefcnt_frag *frag; + struct xchk_rtrefcnt_frag *n; + xfs_rgblock_t bno; + xfs_rgblock_t rbno; + xfs_rgblock_t next_rbno; + xfs_nlink_t nr; + xfs_nlink_t target_nr; + + target_nr = refchk->refcount - refchk->seen; + if (target_nr == 0) + return; + + /* + * There are (refchk->refcount - refchk->seen) + * references we haven't found yet. Pull that many off the + * fragment list and figure out where the smallest rmap ends + * (and therefore the next rmap should start). All the rmaps + * we pull off should start at or before the beginning of the + * refcount record's range. + */ + INIT_LIST_HEAD(&worklist); + rbno = NULLRGBLOCK; + + /* Make sure the fragments actually /are/ in bno order. */ + bno = 0; + list_for_each_entry(frag, &refchk->fragments, list) { + if (frag->rm.rm_startblock < bno) + goto done; + bno = frag->rm.rm_startblock; + } + + /* + * Find all the rmaps that start at or before the refc extent, + * and put them on the worklist.
+ */ + nr = 0; + list_for_each_entry_safe(frag, n, &refchk->fragments, list) { + if (frag->rm.rm_startblock > refchk->bno || nr > target_nr) + break; + bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; + if (bno < rbno) + rbno = bno; + list_move_tail(&frag->list, &worklist); + nr++; + } + + /* + * We should have found exactly $target_nr rmap fragments starting + * at or before the refcount extent. + */ + if (nr != target_nr) + goto done; + + while (!list_empty(&refchk->fragments)) { + /* Discard any fragments ending at rbno from the worklist. */ + nr = 0; + next_rbno = NULLRGBLOCK; + list_for_each_entry_safe(frag, n, &worklist, list) { + bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; + if (bno != rbno) { + if (bno < next_rbno) + next_rbno = bno; + continue; + } + list_del(&frag->list); + kfree(frag); + nr++; + } + + /* Try to add nr rmaps starting at rbno to the worklist. */ + list_for_each_entry_safe(frag, n, &refchk->fragments, list) { + bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; + if (frag->rm.rm_startblock != rbno) + goto done; + list_move_tail(&frag->list, &worklist); + if (next_rbno > bno) + next_rbno = bno; + nr--; + if (nr == 0) + break; + } + + /* + * If we get here and nr > 0, this means that we added fewer + * items to the worklist than we discarded because the fragment + * list ran out of items. Therefore, we cannot maintain the + * required refcount. Something is wrong, so we're done. + */ + if (nr) + goto done; + + rbno = next_rbno; + } + + /* + * Make sure the last extent we processed ends at or beyond + * the end of the refcount extent. + */ + if (rbno < refchk->bno + refchk->len) + goto done; + + /* Actually record us having seen the remaining refcount. */ + refchk->seen = refchk->refcount; +done: + /* Delete fragments and work list. 
*/ + list_for_each_entry_safe(frag, n, &worklist, list) { + list_del(&frag->list); + kfree(frag); + } + list_for_each_entry_safe(frag, n, &refchk->fragments, list) { + list_del(&frag->list); + kfree(frag); + } +} + +/* Use the rmap entries covering this extent to verify the refcount. */ +STATIC void +xchk_rtrefcountbt_xref_rmap( + struct xfs_scrub *sc, + const struct xfs_refcount_irec *irec) +{ + struct xchk_rtrefcnt_check refchk = { + .sc = sc, + .bno = irec->rc_startblock, + .len = irec->rc_blockcount, + .refcount = irec->rc_refcount, + .seen = 0, + }; + struct xfs_rmap_irec low; + struct xfs_rmap_irec high; + struct xchk_rtrefcnt_frag *frag; + struct xchk_rtrefcnt_frag *n; + int error; + + if (!sc->sr.rmap_cur || xchk_skip_xref(sc->sm)) + return; + + /* Cross-reference with the rmapbt to confirm the refcount. */ + memset(&low, 0, sizeof(low)); + low.rm_startblock = irec->rc_startblock; + memset(&high, 0xFF, sizeof(high)); + high.rm_startblock = irec->rc_startblock + irec->rc_blockcount - 1; + + INIT_LIST_HEAD(&refchk.fragments); + error = xfs_rmap_query_range(sc->sr.rmap_cur, &low, &high, + xchk_rtrefcountbt_rmap_check, &refchk); + if (!xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur)) + goto out_free; + + xchk_rtrefcountbt_process_rmap_fragments(&refchk); + if (irec->rc_refcount != refchk.seen) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); + +out_free: + list_for_each_entry_safe(frag, n, &refchk.fragments, list) { + list_del(&frag->list); + kfree(frag); + } +} + +/* Cross-reference with the other btrees. */ +STATIC void +xchk_rtrefcountbt_xref( + struct xfs_scrub *sc, + const struct xfs_refcount_irec *irec) +{ + xfs_rtblock_t rtbno; + + if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + + rtbno = xfs_rgbno_to_rtb(sc->mp, sc->sr.rtg->rtg_rgno, + irec->rc_startblock); + xchk_xref_is_used_rt_space(sc, rtbno, irec->rc_blockcount); + xchk_rtrefcountbt_xref_rmap(sc, irec); +} + +struct xchk_rtrefcbt_records { + /* Previous refcount record. 
*/ + struct xfs_refcount_irec prev_rec; + + /* Number of CoW blocks we expect. */ + xfs_extlen_t cow_blocks; +}; + +static inline bool +xchk_rtrefcount_mergeable( + struct xchk_rtrefcbt_records *rrc, + const struct xfs_refcount_irec *r2) +{ + const struct xfs_refcount_irec *r1 = &rrc->prev_rec; + + /* Ignore if prev_rec is not yet initialized. */ + if (r1->rc_blockcount == 0) + return false; + + if (r1->rc_startblock + r1->rc_blockcount != r2->rc_startblock) + return false; + if (r1->rc_refcount != r2->rc_refcount) + return false; + if ((unsigned long long)r1->rc_blockcount + r2->rc_blockcount > + XFS_REFC_LEN_MAX) + return false; + + return true; +} + +/* Flag failures for records that could be merged. */ +STATIC void +xchk_rtrefcountbt_check_mergeable( + struct xchk_btree *bs, + struct xchk_rtrefcbt_records *rrc, + const struct xfs_refcount_irec *irec) +{ + if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + + if (xchk_rtrefcount_mergeable(rrc, irec)) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + + memcpy(&rrc->prev_rec, irec, sizeof(struct xfs_refcount_irec)); +} + +/* Scrub a rtrefcountbt record. */ +STATIC int +xchk_rtrefcountbt_rec( + struct xchk_btree *bs, + const union xfs_btree_rec *rec) +{ + struct xfs_mount *mp = bs->cur->bc_mp; + struct xchk_rtrefcbt_records *rrc = bs->private; + struct xfs_refcount_irec irec; + u32 mod; + + xfs_refcount_btrec_to_irec(rec, &irec); + if (xfs_refcount_check_irec(bs->cur, &irec) != NULL) { + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + return 0; + } + + /* We can only share full rt extents.
*/ + xfs_rtb_to_rtx(mp, irec.rc_startblock, &mod); + if (mod) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + xfs_rtb_to_rtx(mp, irec.rc_blockcount, &mod); + if (mod) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + + if (irec.rc_domain == XFS_REFC_DOMAIN_COW) + rrc->cow_blocks += irec.rc_blockcount; + + xchk_rtrefcountbt_check_mergeable(bs, rrc, &irec); + xchk_rtrefcountbt_xref(bs->sc, &irec); + + return 0; +} + +/* Make sure we have as many refc blocks as the rmap says. */ +STATIC void +xchk_refcount_xref_rmap( + struct xfs_scrub *sc, + const struct xfs_owner_info *btree_oinfo, + xfs_extlen_t cow_blocks) +{ + xfs_extlen_t refcbt_blocks = 0; + xfs_filblks_t blocks; + int error; + + if (!sc->sr.rmap_cur || !sc->sa.rmap_cur || xchk_skip_xref(sc->sm)) + return; + + /* Check that we saw as many refcbt blocks as the rmap knows about. */ + error = xfs_btree_count_blocks(sc->sr.refc_cur, &refcbt_blocks); + if (!xchk_btree_process_error(sc, sc->sr.refc_cur, 0, &error)) + return; + error = xchk_count_rmap_ownedby_ag(sc, sc->sa.rmap_cur, btree_oinfo, + &blocks); + if (!xchk_should_check_xref(sc, &error, &sc->sa.rmap_cur)) + return; + if (blocks != refcbt_blocks) + xchk_btree_xref_set_corrupt(sc, sc->sa.rmap_cur, 0); + + /* Check that we saw as many cow blocks as the rmap knows about. */ + error = xchk_count_rmap_ownedby_ag(sc, sc->sr.rmap_cur, + &XFS_RMAP_OINFO_COW, &blocks); + if (!xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur)) + return; + if (blocks != cow_blocks) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); +} + +/* Scrub the refcount btree for some AG. 
*/ +int +xchk_rtrefcountbt( + struct xfs_scrub *sc) +{ + struct xfs_owner_info btree_oinfo; + struct xchk_rtrefcbt_records rrc = { + .cow_blocks = 0, + }; + int error; + + error = xchk_metadata_inode_forks(sc); + if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) + return error; + + xfs_rmap_ino_bmbt_owner(&btree_oinfo, sc->sr.rtg->rtg_refcountip->i_ino, + XFS_DATA_FORK); + error = xchk_btree(sc, sc->sr.refc_cur, xchk_rtrefcountbt_rec, + &btree_oinfo, &rrc); + if (error) + goto out_unlock; + + xchk_refcount_xref_rmap(sc, &btree_oinfo, rrc.cow_blocks); + +out_unlock: + return error; +} diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index ab7a36efab3b..ad6f297ae6cf 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -432,6 +432,13 @@ static const struct xchk_meta_ops meta_scrub_ops[] = { .has = xfs_has_rtrmapbt, .repair = xrep_rtrmapbt, }, + [XFS_SCRUB_TYPE_RTREFCBT] = { /* realtime refcountbt */ + .type = ST_RTGROUP, + .setup = xchk_setup_rtrefcountbt, + .scrub = xchk_rtrefcountbt, + .has = xfs_has_rtreflink, + .repair = xrep_notsupported, + }, }; static int diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index d47db84e6b7f..3a9dd26eca7d 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -81,6 +81,7 @@ struct xchk_rt { /* rtgroup btrees */ struct xfs_btree_cur *rmap_cur; + struct xfs_btree_cur *refc_cur; }; struct xfs_scrub { @@ -194,12 +195,14 @@ int xchk_rtsummary(struct xfs_scrub *sc); int xchk_rgsuperblock(struct xfs_scrub *sc); int xchk_rgbitmap(struct xfs_scrub *sc); int xchk_rtrmapbt(struct xfs_scrub *sc); +int xchk_rtrefcountbt(struct xfs_scrub *sc); #else # define xchk_rtbitmap xchk_nothing # define xchk_rtsummary xchk_nothing # define xchk_rgsuperblock xchk_nothing # define xchk_rgbitmap xchk_nothing # define xchk_rtrmapbt xchk_nothing +# define xchk_rtrefcountbt xchk_nothing #endif #ifdef CONFIG_XFS_QUOTA int xchk_quota(struct xfs_scrub *sc); diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 
8d66ab10e1fd..8070d946ae1d 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -79,6 +79,7 @@ TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_HEALTHY); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RGSUPER); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RGBITMAP); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RTRMAPBT); +TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RTREFCBT); #define XFS_SCRUB_TYPE_STRINGS \ { XFS_SCRUB_TYPE_PROBE, "probe" }, \ @@ -111,7 +112,8 @@ TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RTRMAPBT); { XFS_SCRUB_TYPE_HEALTHY, "healthy" }, \ { XFS_SCRUB_TYPE_RGSUPER, "rgsuper" }, \ { XFS_SCRUB_TYPE_RGBITMAP, "rgbitmap" }, \ - { XFS_SCRUB_TYPE_RTRMAPBT, "rtrmapbt" } + { XFS_SCRUB_TYPE_RTRMAPBT, "rtrmapbt" }, \ + { XFS_SCRUB_TYPE_RTREFCBT, "rtrefcountbt" } #define XFS_SCRUB_FLAG_STRINGS \ { XFS_SCRUB_IFLAG_REPAIR, "repair" }, \ From patchwork Fri Dec 30 22:18:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085538
Subject: [PATCH 30/42] xfs: cross-reference checks with the rt refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:33 -0800 Message-ID: <167243871324.717073.10721947638785793216.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Use the realtime refcount btree to implement cross-reference checks in other data structures. Signed-off-by: Darrick J.
Wong --- fs/xfs/scrub/bmap.c | 27 ++++++++++++-- fs/xfs/scrub/rtbitmap.c | 2 + fs/xfs/scrub/rtrefcount.c | 86 +++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/rtrmap.c | 37 +++++++++++++++++++ fs/xfs/scrub/scrub.h | 9 +++++ 5 files changed, 156 insertions(+), 5 deletions(-) diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index f18b22bc2548..8191a67598d0 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -352,11 +352,28 @@ xchk_bmap_rt_iextent_xref( xchk_xref_is_used_rt_space(info->sc, irec->br_startblock, irec->br_blockcount); xchk_bmap_xref_rmap(info, irec, rgbno); - - xfs_rmap_ino_owner(&oinfo, info->sc->ip->i_ino, info->whichfork, - irec->br_startoff); - xchk_xref_is_only_rt_owned_by(info->sc, rgbno, irec->br_blockcount, - &oinfo); + switch (info->whichfork) { + case XFS_DATA_FORK: + if (!xfs_is_reflink_inode(info->sc->ip)) { + xfs_rmap_ino_owner(&oinfo, info->sc->ip->i_ino, + info->whichfork, irec->br_startoff); + xchk_xref_is_only_rt_owned_by(info->sc, rgbno, + irec->br_blockcount, &oinfo); + xchk_xref_is_not_rt_shared(info->sc, rgbno, + irec->br_blockcount); + } + xchk_xref_is_not_rt_cow_staging(info->sc, rgbno, + irec->br_blockcount); + break; + case XFS_COW_FORK: + xchk_xref_is_only_rt_owned_by(info->sc, rgbno, + irec->br_blockcount, &XFS_RMAP_OINFO_COW); + xchk_xref_is_rt_cow_staging(info->sc, rgbno, + irec->br_blockcount); + xchk_xref_is_not_rt_shared(info->sc, rgbno, + irec->br_blockcount); + break; + } out_free: xchk_rtgroup_btcur_free(&info->sc->sr); diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c index ca478fbd514e..9419219a534f 100644 --- a/fs/xfs/scrub/rtbitmap.c +++ b/fs/xfs/scrub/rtbitmap.c @@ -121,6 +121,8 @@ xchk_rtbitmap_xref( rgbno = xfs_rtb_to_rgbno(sc->mp, startblock, &rgno); xchk_xref_has_no_rt_owner(sc, rgbno, blockcount); + xchk_xref_is_not_rt_shared(sc, rgbno, blockcount); + xchk_xref_is_not_rt_cow_staging(sc, rgbno, blockcount); if (rtb->next_free_rtblock < startblock) { xfs_rgblock_t 
next_rgbno; diff --git a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c index 528a056c7932..05512f8443a2 100644 --- a/fs/xfs/scrub/rtrefcount.c +++ b/fs/xfs/scrub/rtrefcount.c @@ -493,3 +493,89 @@ xchk_rtrefcountbt( out_unlock: return error; } + +/* xref check that a cow staging extent is marked in the rtrefcountbt. */ +void +xchk_xref_is_rt_cow_staging( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + struct xfs_refcount_irec rc; + int has_refcount; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + /* Find the CoW staging extent. */ + error = xfs_refcount_lookup_le(sc->sr.refc_cur, XFS_REFC_DOMAIN_COW, + bno, &has_refcount); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (!has_refcount) { + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); + return; + } + + error = xfs_refcount_get_rec(sc->sr.refc_cur, &rc, &has_refcount); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (!has_refcount) { + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); + return; + } + + /* CoW lookup returned a shared extent record? */ + if (rc.rc_domain != XFS_REFC_DOMAIN_COW) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); + + /* Must be at least as long as what was passed in */ + if (rc.rc_blockcount < len) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} + +/* + * xref check that the extent is not shared. Only file data blocks + * can have multiple owners.
+ */ +void +xchk_xref_is_not_rt_shared( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + enum xbtree_recpacking outcome; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + error = xfs_refcount_has_records(sc->sr.refc_cur, + XFS_REFC_DOMAIN_SHARED, bno, len, &outcome); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (outcome != XBTREE_RECPACKING_EMPTY) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} + +/* xref check that the extent is not being used for CoW staging. */ +void +xchk_xref_is_not_rt_cow_staging( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + enum xbtree_recpacking outcome; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + error = xfs_refcount_has_records(sc->sr.refc_cur, XFS_REFC_DOMAIN_COW, + bno, len, &outcome); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (outcome != XBTREE_RECPACKING_EMPTY) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c index 5442325a6982..e89d5310117a 100644 --- a/fs/xfs/scrub/rtrmap.c +++ b/fs/xfs/scrub/rtrmap.c @@ -21,6 +21,7 @@ #include "xfs_inode.h" #include "xfs_rtalloc.h" #include "xfs_rtgroup.h" +#include "xfs_refcount.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -157,6 +158,37 @@ xchk_rtrmapbt_check_mergeable( memcpy(&cr->prev_rec, irec, sizeof(struct xfs_rmap_irec)); } +/* Cross-reference a rmap against the refcount btree. 
*/ +STATIC void +xchk_rtrmapbt_xref_rtrefc( + struct xfs_scrub *sc, + struct xfs_rmap_irec *irec) +{ + xfs_rgblock_t fbno; + xfs_extlen_t flen; + bool is_inode; + bool is_bmbt; + bool is_attr; + bool is_unwritten; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + is_inode = !XFS_RMAP_NON_INODE_OWNER(irec->rm_owner); + is_bmbt = irec->rm_flags & XFS_RMAP_BMBT_BLOCK; + is_attr = irec->rm_flags & XFS_RMAP_ATTR_FORK; + is_unwritten = irec->rm_flags & XFS_RMAP_UNWRITTEN; + + /* If this is shared, must be a data fork extent. */ + error = xfs_refcount_find_shared(sc->sr.refc_cur, irec->rm_startblock, + irec->rm_blockcount, &fbno, &flen, false); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (flen != 0 && (!is_inode || is_attr || is_bmbt || is_unwritten)) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} + /* Cross-reference with other metadata. */ STATIC void xchk_rtrmapbt_xref( @@ -172,6 +204,11 @@ xchk_rtrmapbt_xref( irec->rm_startblock); xchk_xref_is_used_rt_space(sc, rtbno, irec->rm_blockcount); + if (irec->rm_owner == XFS_RMAP_OWN_COW) + xchk_xref_is_cow_staging(sc, irec->rm_startblock, + irec->rm_blockcount); + else + xchk_rtrmapbt_xref_rtrefc(sc, irec); } /* Scrub a realtime rmapbt record. 
*/ diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index 3a9dd26eca7d..0a3b151f9870 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -242,11 +242,20 @@ void xchk_xref_has_rt_owner(struct xfs_scrub *sc, xfs_rgblock_t rgbno, xfs_extlen_t len); void xchk_xref_is_only_rt_owned_by(struct xfs_scrub *sc, xfs_rgblock_t rgbno, xfs_extlen_t len, const struct xfs_owner_info *oinfo); +void xchk_xref_is_rt_cow_staging(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); +void xchk_xref_is_not_rt_shared(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); +void xchk_xref_is_not_rt_cow_staging(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); #else # define xchk_xref_is_used_rt_space(sc, rtbno, len) do { } while (0) # define xchk_xref_has_no_rt_owner(sc, rtbno, len) do { } while (0) # define xchk_xref_has_rt_owner(sc, rtbno, len) do { } while (0) # define xchk_xref_is_only_rt_owned_by(sc, bno, len, oinfo) do { } while (0) +# define xchk_xref_is_rt_cow_staging(sc, bno, len) do { } while (0) +# define xchk_xref_is_not_rt_shared(sc, bno, len) do { } while (0) +# define xchk_xref_is_not_rt_cow_staging(sc, bno, len) do { } while (0) #endif #endif /* __XFS_SCRUB_SCRUB_H__ */ From patchwork Fri Dec 30 22:18:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085539 Subject: [PATCH 31/42] xfs: allow overlapping rtrmapbt records for shared data extents From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:33 -0800 Message-ID: <167243871338.717073.13427743306191179302.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Allow overlapping realtime reverse mapping records if they both describe shared data extents and the fs supports reflink on the realtime volume. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/rtrmap.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c index e89d5310117a..3ff4151b2c0a 100644 --- a/fs/xfs/scrub/rtrmap.c +++ b/fs/xfs/scrub/rtrmap.c @@ -86,6 +86,18 @@ struct xchk_rtrmap { struct xfs_rmap_irec prev_rec; }; +static inline bool +xchk_rtrmapbt_is_shareable( + struct xfs_scrub *sc, + const struct xfs_rmap_irec *irec) +{ + if (!xfs_has_rtreflink(sc->mp)) + return false; + if (irec->rm_flags & XFS_RMAP_UNWRITTEN) + return false; + return true; +} + /* Flag failures for records that overlap but cannot. */ STATIC void xchk_rtrmapbt_check_overlapping( @@ -107,7 +119,10 @@ xchk_rtrmapbt_check_overlapping( if (pnext <= irec->rm_startblock) goto set_prev; - xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + /* Overlap is only allowed if both records are data fork mappings. */ + if (!xchk_rtrmapbt_is_shareable(bs->sc, &cr->overlap_rec) || + !xchk_rtrmapbt_is_shareable(bs->sc, irec)) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); /* Save whichever rmap record extends furthest. */ inext = irec->rm_startblock + irec->rm_blockcount; From patchwork Fri Dec 30 22:18:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085540 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C476C4332F for ; Sat, 31 Dec 2022 01:56:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236031AbiLaB4X (ORCPT ); Fri, 30 Dec 2022 20:56:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50838 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235949AbiLaB4W (ORCPT ); Fri, 30 Dec 2022 20:56:22 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4756FAE5C for ; Fri, 30 Dec 2022 17:56:21 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D9B6661C63 for ; Sat, 31 Dec 2022 01:56:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 431F0C433EF; Sat, 31 Dec 2022 01:56:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451780; bh=8nyt3bfFmYyo3to7g0XUFNOT/zOvYt6kGnZ9Vy3Rtaw=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=Ekaq2HqeZNM2VXh6FW4b1IoHAZ+zohXwyuwr4VLiYsybLVqV9XclNYLwrPuR2Qhjf yTQ0x8hsG6mR6pOFxL7QUntEMwyZHPU+abi64cxldi7ID29htR9iXWZHdtAK+M4X6E RSrKdAZhTZt/dDT956NhbQTvpvS7JcFaOfUjcwrcYa1R0ZXRb5e043jo6XGGkt73bt m3JszyHTTvnnUfRALWR2Jo3QKzwDudhmfutwG6X3U+6p9KdMLdGcz61s7MbSoJ8Ga0 JROdHsrEOoXDDtn3hAgQY5I8FVewBV+w/+95nUqhn+yK0R7ym2rVUISiMzJp301CFZ 35RPYedyA8WZw== Subject: [PATCH 32/42] xfs: check reference counts of gaps between rt refcount records From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:33 -0800 Message-ID: <167243871351.717073.16895686653523718159.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong If there's a gap between records in the rt refcount btree, we ought to cross-reference the gap with the rtrmap records to make sure that there aren't any overlapping records for a region that doesn't have any shared ownership. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/rtrefcount.c | 81 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 80 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c index 05512f8443a2..3cb2ff8443da 100644 --- a/fs/xfs/scrub/rtrefcount.c +++ b/fs/xfs/scrub/rtrefcount.c @@ -15,6 +15,7 @@ #include "xfs_inode.h" #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" +#include "xfs_rtalloc.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/btree.h" @@ -356,8 +357,14 @@ struct xchk_rtrefcbt_records { /* Previous refcount record. */ struct xfs_refcount_irec prev_rec; + /* The next rtgroup block where we aren't expecting shared extents. */ + xfs_rgblock_t next_unshared_rgbno; + /* Number of CoW blocks we expect. */ xfs_extlen_t cow_blocks; + + /* Was the last record a shared or CoW staging extent? 
*/ + enum xfs_refc_domain prev_domain; }; static inline bool @@ -398,6 +405,53 @@ xchk_rtrefcountbt_check_mergeable( memcpy(&rrc->prev_rec, irec, sizeof(struct xfs_refcount_irec)); } +STATIC int +xchk_rtrefcountbt_rmap_check_gap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + xfs_rgblock_t *next_bno = priv; + + if (*next_bno != NULLRGBLOCK && rec->rm_startblock < *next_bno) + return -ECANCELED; + + *next_bno = rec->rm_startblock + rec->rm_blockcount; + return 0; +} + +/* + * Make sure that a gap in the reference count records does not correspond to + * overlapping records (i.e. shared extents) in the reverse mappings. + */ +static inline void +xchk_rtrefcountbt_xref_gaps( + struct xfs_scrub *sc, + struct xchk_rtrefcbt_records *rrc, + xfs_rtblock_t bno) +{ + struct xfs_rmap_irec low; + struct xfs_rmap_irec high; + xfs_rgblock_t next_bno = NULLRGBLOCK; + int error; + + if (bno <= rrc->next_unshared_rgbno || !sc->sr.rmap_cur || + xchk_skip_xref(sc->sm)) + return; + + memset(&low, 0, sizeof(low)); + low.rm_startblock = rrc->next_unshared_rgbno; + memset(&high, 0xFF, sizeof(high)); + high.rm_startblock = bno - 1; + + error = xfs_rmap_query_range(sc->sr.rmap_cur, &low, &high, + xchk_rtrefcountbt_rmap_check_gap, &next_bno); + if (error == -ECANCELED) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); + else + xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur); +} + /* Scrub a rtrefcountbt record. */ STATIC int xchk_rtrefcountbt_rec( @@ -426,9 +480,26 @@ xchk_rtrefcountbt_rec( if (irec.rc_domain == XFS_REFC_DOMAIN_COW) rrc->cow_blocks += irec.rc_blockcount; + /* Shared records always come before CoW records. 
*/ + if (irec.rc_domain == XFS_REFC_DOMAIN_SHARED && + rrc->prev_domain == XFS_REFC_DOMAIN_COW) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + rrc->prev_domain = irec.rc_domain; + xchk_rtrefcountbt_check_mergeable(bs, rrc, &irec); xchk_rtrefcountbt_xref(bs->sc, &irec); + /* + * If this is a record for a shared extent, check that all blocks + * between the previous record and this one have at most one reverse + * mapping. + */ + if (irec.rc_domain == XFS_REFC_DOMAIN_SHARED) { + xchk_rtrefcountbt_xref_gaps(bs->sc, rrc, irec.rc_startblock); + rrc->next_unshared_rgbno = irec.rc_startblock + + irec.rc_blockcount; + } + return 0; } @@ -473,7 +544,9 @@ xchk_rtrefcountbt( { struct xfs_owner_info btree_oinfo; struct xchk_rtrefcbt_records rrc = { - .cow_blocks = 0, + .cow_blocks = 0, + .next_unshared_rgbno = 0, + .prev_domain = XFS_REFC_DOMAIN_SHARED, }; int error; @@ -488,6 +561,12 @@ xchk_rtrefcountbt( if (error) goto out_unlock; + /* + * Check that all blocks between the last refcount > 1 record and the + * end of the rt volume have at most one reverse mapping. + */ + xchk_rtrefcountbt_xref_gaps(sc, &rrc, sc->mp->m_sb.sb_rblocks); + xchk_refcount_xref_rmap(sc, &btree_oinfo, rrc.cow_blocks); out_unlock: From patchwork Fri Dec 30 22:18:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085541 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FBC4C4332F for ; Sat, 31 Dec 2022 01:56:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236044AbiLaB4i (ORCPT ); Fri, 30 Dec 2022 20:56:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235949AbiLaB4h (ORCPT ); Fri, 30 Dec 2022 20:56:37 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB7A3A19D for ; Fri, 30 Dec 2022 17:56:36 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 76D1961C3A for ; Sat, 31 Dec 2022 01:56:36 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D412AC433D2; Sat, 31 Dec 2022 01:56:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451795; bh=G1YF0RANLr8MRX4xpE1edrgpIqOdx5gB7Xc9+v09PgU=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=Vt25lnLlpg5CacWHBInzMK8Vf1Ft8HeZB3MJabcv3KxOjLI+EnAzxECqTyz4lnoLN PLf23pIMgPRKikn6mfXXg+VZyrhGOGXkuZ7SHzwRo4TkYOS+Zy8+NLLUS9iQ6ZtBjY Uq5rhvK/Niy6nFUEzjsBF+xWceKhglIL0U3PCCYnaqAT6D/U/bqRIdOeK87+duS2ju eS0lc7hlXjrl9cVWCMQPeI2qLlAsnTL3/PvwJ+MJ42NpEUOMWm7T1tjEMLaPcEoS70 T3LQ0X4/wtUAEEVl9kPrlf0gXT2AeoV1sJsm29oNAKBANsBNwl6vcjBzGwgxFxpzPf vlRPZerCD5aAA== Subject: [PATCH 33/42] xfs: allow dquot rt block count to exceed rt blocks on reflink fs From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:33 -0800 Message-ID: <167243871365.717073.11026110278057252995.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Update the quota scrubber to allow dquots where the realtime block count exceeds the block count of the rt volume if reflink is enabled. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/quota.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c index 714bd4c0753a..2d2064126bc9 100644 --- a/fs/xfs/scrub/quota.c +++ b/fs/xfs/scrub/quota.c @@ -139,12 +139,18 @@ xchk_quota_item( if (mp->m_sb.sb_dblocks < dq->q_blk.count) xchk_fblock_set_warning(sc, XFS_DATA_FORK, offset); + if (mp->m_sb.sb_rblocks < dq->q_rtb.count) + xchk_fblock_set_warning(sc, XFS_DATA_FORK, + offset); } else { if (mp->m_sb.sb_dblocks < dq->q_blk.count) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, offset); + if (mp->m_sb.sb_rblocks < dq->q_rtb.count) + xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, + offset); } - if (dq->q_ino.count > fs_icount || dq->q_rtb.count > mp->m_sb.sb_rblocks) + if (dq->q_ino.count > fs_icount) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, offset); /* From patchwork Fri Dec 30 22:18:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085542 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D6F0C4332F for ; Sat, 31 Dec 2022 01:56:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236056AbiLaB4y (ORCPT ); Fri, 30 Dec 2022 20:56:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50902 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235949AbiLaB4x (ORCPT ); Fri, 30 Dec 2022 20:56:53 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6D0841C430 for ; Fri, 30 Dec 2022 17:56:52 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 07A9161C3A for ; Sat, 31 Dec 2022 01:56:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 64EAFC433D2; Sat, 31 Dec 2022 01:56:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451811; bh=P1MBvR8Ig3ZbN/D1yJEWZqJp27h0H59cU9Ks1EOyRM0=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=CDcO6eNPUNcgwfyDOnuU7/LpYQ6GhqCKv1U1w53E2fTVhzIHWdemmoWIP4BU5gqrM sGJhj+PE/cdGqyKDd7KA5EHY9wrYAxbY1th0xk6yF27K4HiUSzeMHEZE7qTwWLKPoS 8KffuQNgpZBtJcFGbDcgezu+RlM6Z/5kv5U3qLPsgJ1w2pCE+/EsbHg0ws3PnZfjRP HVFoSVvr9UaxbjwhP6WuCNKiSo+tzyMl5kfzacA1df5hky3Mqo1a8DrgHkMDEsvNZL fuexlcHA1EFdxCGncsz4oBq94iVN+xYPpK/a0KURfPNXR7XyuGkX3Gu7RVvAsGDUz3 iNWpY2havgZfA== Subject: [PATCH 34/42] xfs: detect and repair misaligned rtinherit directory cowextsize hints From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:33 -0800 Message-ID: <167243871379.717073.9331036711596389250.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong If we encounter a directory that has been configured to pass on a CoW extent size hint to a new realtime file and the hint isn't an integer multiple of the rt extent size, we should flag the hint for administrative review and/or turn it off because that is a misconfiguration. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/inode.c | 26 +++++++++++++++++--------- fs/xfs/scrub/inode_repair.c | 15 +++++++++++++++ 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index be9739035226..6a37973823d2 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -229,12 +229,7 @@ xchk_inode_extsize( xchk_ino_set_warning(sc, ino); } -/* - * Validate di_cowextsize hint. - * - * The rules are documented at xfs_ioctl_setattr_check_cowextsize(). - * These functions must be kept in sync with each other. - */ +/* Validate di_cowextsize hint. */ STATIC void xchk_inode_cowextsize( struct xfs_scrub *sc, @@ -245,12 +240,25 @@ xchk_inode_cowextsize( uint64_t flags2) { xfs_failaddr_t fa; + uint32_t value = be32_to_cpu(dip->di_cowextsize); - fa = xfs_inode_validate_cowextsize(sc->mp, - be32_to_cpu(dip->di_cowextsize), mode, flags, - flags2); + fa = xfs_inode_validate_cowextsize(sc->mp, value, mode, flags, flags2); if (fa) xchk_ino_set_corrupt(sc, ino); + + /* + * XFS allows a sysadmin to change the rt extent size when adding a rt + * section to a filesystem after formatting. If there are any + * directories with cowextsize and rtinherit set, the hint could become + * misaligned with the new rextsize. 
The verifier doesn't check this, + * because we allow rtinherit directories even without an rt device. + * Flag this as an administrative warning since we will clean this up + * eventually. + */ + if ((flags & XFS_DIFLAG_RTINHERIT) && + (flags2 & XFS_DIFLAG2_COWEXTSIZE) && + value % sc->mp->m_sb.sb_rextsize > 0) + xchk_ino_set_warning(sc, ino); } /* Make sure the di_flags make sense for the inode. */ diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index 9f946406cfa0..3ce9ac5b0fc4 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -1572,6 +1572,20 @@ xrep_inode_extsize( } } +/* Fix COW extent size hint problems. */ +STATIC void +xrep_inode_cowextsize( + struct xfs_scrub *sc) +{ + /* Fix misaligned CoW extent size hints on a directory. */ + if ((sc->ip->i_diflags & XFS_DIFLAG_RTINHERIT) && + (sc->ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) && + sc->ip->i_cowextsize % sc->mp->m_sb.sb_rextsize > 0) { + sc->ip->i_cowextsize = 0; + sc->ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE; + } +} + /* Fix any irregularities in an inode that the verifiers don't catch. */ STATIC int xrep_inode_problems( @@ -1587,6 +1601,7 @@ xrep_inode_problems( xrep_inode_ids(sc); xrep_inode_size(sc); xrep_inode_extsize(sc); + xrep_inode_cowextsize(sc); trace_xrep_inode_fixed(sc); xfs_trans_log_inode(sc->tp, sc->ip, XFS_ILOG_CORE); From patchwork Fri Dec 30 22:18:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13085543 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFE39C4332F for ; Sat, 31 Dec 2022 01:57:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236067AbiLaB5L (ORCPT ); Fri, 30 Dec 2022 20:57:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235949AbiLaB5K (ORCPT ); Fri, 30 Dec 2022 20:57:10 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6A0571C900 for ; Fri, 30 Dec 2022 17:57:09 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 2B135B81E07 for ; Sat, 31 Dec 2022 01:57:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id ECAE1C433D2; Sat, 31 Dec 2022 01:57:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451827; bh=der+2yVxVJpSIva0yR9JgMa+rs6Bq4oPOKBMQsYDAfg=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=Sv0DETkE2Qrbbji5muh/JzPQK7piT9lGaiWTYbDv1sWUK0GN8SaZYarwJnMrbjJZL AGUibxSorKsbR1myOQMp2WSmwZweCYfsS+pqLxtj/+jivXYcY/jIQTTdBcjuHK5Iu7 VduqwSy9BDiB3zLCmKO0EIlbRpAc7IcXdQ0ELF8SaCPY8G3d0A42Y7Ky9JLbQA/Xoi 3qbc9U6NlBLGdbGzYLNAfIsjO8K34OjwJaLyTiZouIJX2AWOXe8DVBaG1S1dtBFU8O z8pRb6Xc3bD8SoADpgQzTBR2Rlp4ErXcIk+HhmnmLBgnSq8hXmQ1YoLYpLbrfL9QQ2 b7mOH81XS8Zew== Subject: [PATCH 35/42] xfs: don't flag quota rt block usage on rtreflink filesystems From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:33 -0800 Message-ID: <167243871393.717073.13834650298966488896.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Quota space usage is allowed to exceed the size of the physical storage when reflink is enabled. Now that we have reflink for the realtime volume, apply this same logic to the rtb repair logic. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/quota_repair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/scrub/quota_repair.c b/fs/xfs/scrub/quota_repair.c index a150719c2b90..c79c47714eb6 100644 --- a/fs/xfs/scrub/quota_repair.c +++ b/fs/xfs/scrub/quota_repair.c @@ -101,7 +101,7 @@ xrep_quota_item( rqi->need_quotacheck = true; dirty = true; } - if (dqp->q_rtb.count > mp->m_sb.sb_rblocks) { + if (!xfs_has_reflink(mp) && dqp->q_rtb.count > mp->m_sb.sb_rblocks) { dqp->q_rtb.reserved -= dqp->q_rtb.count; dqp->q_rtb.reserved += mp->m_sb.sb_rblocks; dqp->q_rtb.count = mp->m_sb.sb_rblocks; From patchwork Fri Dec 30 22:18:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085544 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0D6AC4332F for ; Sat, 31 Dec 2022 01:57:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235949AbiLaB52 (ORCPT ); Fri, 30 Dec 2022 20:57:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50964 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236076AbiLaB50 (ORCPT ); Fri, 30 Dec 2022 20:57:26 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 221671C438 for ; Fri, 30 Dec 2022 17:57:25 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C0A1EB81E0A for ; Sat, 31 Dec 2022 01:57:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6F5D7C433D2; Sat, 31 Dec 2022 01:57:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451842; bh=Xuh+vrOz7S8uZilC3phCTjHHgHHjs5Dtkc8QLFfl9ls=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=BQxkFfkSuL1UBRsukgvVVGjLI+U06/ghhDSEDfew0g+zXWXW6MkmPhrf7td98JyjL /tYkQ6rF/0pKkQ4bi4GRkO/7CoE/vSNeYx/Az4y6PlI3ycdGePddS9S3LZNweW3Azt rKYoX0J0Zyxje1DbIH6LWOcmAlK8tDQ+RjcEm1kPYtKnG7J4K9JBsS2R0E8IBc0R8b r1jDgeRQRi5v5WW9J5gw0XmqqU5N7hHY4H7sn3sVxR4Tv4Wp1eRyNAsQ6o6DYhVroP Ia95wyAIHogW3QNDUXRdCT/3MT6a1Ya3yOf2L09IZf1B2km87zKlHrRGZySx7gJXmt vd3wta032unHA== Subject: [PATCH 36/42] xfs: check new rtbitmap records against rt refcount btree From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:34 -0800 Message-ID: <167243871407.717073.3930845877467547286.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong When we're rebuilding the realtime bitmap, check the proposed free extents against the rt refcount btree to make sure we don't commit any grievous errors. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/repair.c | 7 +++++++ fs/xfs/scrub/rtbitmap_repair.c | 21 +++++++++++++++++++++ 2 files changed, 28 insertions(+) diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index b76c01e9f540..3bde5ea86cf5 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -40,6 +40,7 @@ #include "xfs_rtgroup.h" #include "xfs_rtalloc.h" #include "xfs_imeta.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -991,6 +992,12 @@ xrep_rtgroup_btcur_init( xfs_has_rtrmapbt(mp)) sr->rmap_cur = xfs_rtrmapbt_init_cursor(mp, sc->tp, sr->rtg, sr->rtg->rtg_rmapip); + + if (sc->sm->sm_type != XFS_SCRUB_TYPE_RTREFCBT && + (sr->rtlock_flags & XFS_RTGLOCK_REFCOUNT) && + xfs_has_rtreflink(mp)) + sr->refc_cur = xfs_rtrefcountbt_init_cursor(mp, sc->tp, + sr->rtg, sr->rtg->rtg_refcountip); } /* diff --git a/fs/xfs/scrub/rtbitmap_repair.c b/fs/xfs/scrub/rtbitmap_repair.c index 0fa8942d14e7..d099f988274e 100644 --- a/fs/xfs/scrub/rtbitmap_repair.c +++ b/fs/xfs/scrub/rtbitmap_repair.c @@ -22,6 +22,7 @@ #include "xfs_swapext.h" #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" +#include "xfs_refcount.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -447,6 +448,7 @@ xrep_rgbitmap_mark_free( unsigned int bufwsize; xfs_extlen_t mod; xfs_rtword_t mask; + enum 
xbtree_recpacking outcome; int error; if (!xfs_verify_rgbext(rtg, rb->next_rgbno, rgbno - rb->next_rgbno)) @@ -466,6 +468,25 @@ xrep_rgbitmap_mark_free( if (mod != mp->m_sb.sb_rextsize - 1) return -EFSCORRUPTED; + /* Must not be shared or CoW staging. */ + if (rb->sc->sr.refc_cur) { + error = xfs_refcount_has_records(rb->sc->sr.refc_cur, + XFS_REFC_DOMAIN_SHARED, rb->next_rgbno, + rgbno - rb->next_rgbno, &outcome); + if (error) + return error; + if (outcome != XBTREE_RECPACKING_EMPTY) + return -EFSCORRUPTED; + + error = xfs_refcount_has_records(rb->sc->sr.refc_cur, + XFS_REFC_DOMAIN_COW, rb->next_rgbno, + rgbno - rb->next_rgbno, &outcome); + if (error) + return error; + if (outcome != XBTREE_RECPACKING_EMPTY) + return -EFSCORRUPTED; + } + trace_xrep_rgbitmap_record_free(mp, startrtx, nextrtx - 1); /* Set bits as needed to round startrtx up to the nearest word. */ From patchwork Fri Dec 30 22:18:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085545 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A111AC4332F for ; Sat, 31 Dec 2022 01:57:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236076AbiLaB5k (ORCPT ); Fri, 30 Dec 2022 20:57:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236091AbiLaB5j (ORCPT ); Fri, 30 Dec 2022 20:57:39 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1FEA71C438 for ; Fri, 30 Dec 2022 17:57:39 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AE8FE61C44 for ; Sat, 31 Dec 2022 01:57:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1572BC433D2; Sat, 31 Dec 2022 01:57:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451858; bh=j6sJA1G9PKSKfiUoy+6U45B4jtSSLXXWQIEfyAOtGSg=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=Xh+tbRMiNDXuQ6SXdzlISTTqsnpR0z7ao9q2UCaahIObMr7SPzJ+XuRKs8c7JMgI8 fNoiMA4ho7+78Nh+yiSdvucbwfI8C9lUDg1VoOJKefORuTu7MeZnHAuZfCa8bhCp9v SCGSog3v6OE+geLRgZYPdBx7KdxrBUyBqZzcDkpzv93wTztIrIv/STxsYiSIy8gO// Yn3NfB3vX/XKUsr8a8kYzx7T7ZBrk6rmEkIDM+5v2pmJJ93N1fd++M9Uiq9LyQQZar 2LEgTv2Eqk0AS4+8HCIMmn9lUJG9V5HdBvUnrn05M1FuAlKFZQrP42cLZEf8i2EgbS tnRQn/rlWa/Og== Subject: [PATCH 37/42] xfs: walk the rt reference count tree when rebuilding rmap From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:34 -0800 Message-ID: <167243871421.717073.4694070852781910720.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong When we're rebuilding the data device rmap, if we encounter a "refcount" format fork, we have to walk the (realtime) refcount btree inode to build the appropriate mappings. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/rmap_repair.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/fs/xfs/scrub/rmap_repair.c b/fs/xfs/scrub/rmap_repair.c index 86c5338a12b9..24dcd3842ce6 100644 --- a/fs/xfs/scrub/rmap_repair.c +++ b/fs/xfs/scrub/rmap_repair.c @@ -32,6 +32,7 @@ #include "xfs_ag.h" #include "xfs_rtrmap_btree.h" #include "xfs_rtgroup.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -530,6 +531,39 @@ xrep_rmap_scan_rtrmapbt( return -EFSCORRUPTED; } +static int +xrep_rmap_scan_rtrefcountbt( + struct xrep_rmap_ifork *rf, + struct xfs_inode *ip) +{ + struct xfs_scrub *sc = rf->rr->sc; + struct xfs_btree_cur *cur; + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + int error; + + if (rf->whichfork != XFS_DATA_FORK) + return -EFSCORRUPTED; + + for_each_rtgroup(sc->mp, rgno, rtg) { + if (ip == rtg->rtg_refcountip) { + cur = xfs_rtrefcountbt_init_cursor(sc->mp, sc->tp, rtg, + ip); + error = xrep_rmap_scan_iroot_btree(rf, cur); + xfs_btree_del_cursor(cur, error); + xfs_rtgroup_put(rtg); + return error; + } + } + + /* + * We shouldn't find a refcount format inode that isn't associated with + * an rtgroup! + */ + ASSERT(0); + return -EFSCORRUPTED; +} + /* Find all the extents from a given AG in an inode fork. 
*/ STATIC int xrep_rmap_scan_ifork( @@ -561,6 +595,8 @@ xrep_rmap_scan_ifork( return error; } else if (ifp->if_format == XFS_DINODE_FMT_RMAP) { return xrep_rmap_scan_rtrmapbt(&rf, ip); + } else if (ifp->if_format == XFS_DINODE_FMT_REFCOUNT) { + return xrep_rmap_scan_rtrefcountbt(&rf, ip); } else if (ifp->if_format != XFS_DINODE_FMT_EXTENTS) { return 0; } From patchwork Fri Dec 30 22:18:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085546 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CAC7DC4332F for ; Sat, 31 Dec 2022 01:58:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236091AbiLaB6A (ORCPT ); Fri, 30 Dec 2022 20:58:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51032 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236092AbiLaB5z (ORCPT ); Fri, 30 Dec 2022 20:57:55 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A78141C900 for ; Fri, 30 Dec 2022 17:57:54 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4276D61C3A for ; Sat, 31 Dec 2022 01:57:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9FFF3C433D2; Sat, 31 Dec 2022 01:57:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451873; bh=iXypW6LnMW1mE59fQwWyjPjmGl3EC+78XTk3T/SrAuE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; 
b=PNRptMMyZUuNbOi2z1YKCzVCYSH9tPCeozxMnmmCtf+BTqF62iqO6vdKzGgBgSKt6 DFRiQ/JU2ew3IRtf6TxLVrah2hQVDhnXa13TzfQy5c+RuCOCVogOv7GveSiNl2Znjz qPRZ1Ps1Tt9E8i2VzTREPFUTLXQUabylJ5q/zWPRX89GV/kbnctol57u4ChWagDF3D jGCSHYFLiE2EDjgAYP5Z1tDOFwWQIluveMphR/+onYOtj7NWjR1xrHhEy/NBtCuw/z caqI+1YKNjdOwwwMnyZ7WMtHhEG5/l6C0EaUiDl2I45nl1ymz2njlSqTd2ekkcDrZi IHFClQW1duP8w== Subject: [PATCH 38/42] xfs: capture realtime CoW staging extents when rebuilding rt rmapbt From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:34 -0800 Message-ID: <167243871435.717073.14055309745166425902.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Walk the realtime refcount btree to find the CoW staging extents when we're rebuilding the realtime rmap btree. Signed-off-by: Darrick J. 
Wong --- fs/xfs/scrub/bitmap.h | 28 ++++++++++++ fs/xfs/scrub/repair.h | 1 fs/xfs/scrub/rtrmap_repair.c | 102 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 131 insertions(+) diff --git a/fs/xfs/scrub/bitmap.h b/fs/xfs/scrub/bitmap.h index d59d5e76782c..29faf2b63715 100644 --- a/fs/xfs/scrub/bitmap.h +++ b/fs/xfs/scrub/bitmap.h @@ -111,6 +111,34 @@ int xagb_bitmap_set_btblocks(struct xagb_bitmap *bitmap, int xagb_bitmap_set_btcur_path(struct xagb_bitmap *bitmap, struct xfs_btree_cur *cur); +/* Bitmaps, but for type-checked for xfs_rgblock_t */ + +struct xrgb_bitmap { + struct xbitmap rgbitmap; +}; + +static inline void xrgb_bitmap_init(struct xrgb_bitmap *bitmap) +{ + xbitmap_init(&bitmap->rgbitmap); +} + +static inline void xrgb_bitmap_destroy(struct xrgb_bitmap *bitmap) +{ + xbitmap_destroy(&bitmap->rgbitmap); +} + +static inline int xrgb_bitmap_set(struct xrgb_bitmap *bitmap, + xfs_rgblock_t start, xfs_extlen_t len) +{ + return xbitmap_set(&bitmap->rgbitmap, start, len); +} + +static inline int xrgb_bitmap_walk(struct xrgb_bitmap *bitmap, + xbitmap_walk_fn fn, void *priv) +{ + return xbitmap_walk(&bitmap->rgbitmap, fn, priv); +} + /* Bitmaps, but for type-checked for xfs_fsblock_t */ struct xfsb_bitmap { diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index ff8605849a72..4a0cedea3fe0 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -39,6 +39,7 @@ xrep_trans_commit( struct xbitmap; struct xagb_bitmap; +struct xrgb_bitmap; struct xfsb_bitmap; int xrep_fix_freelist(struct xfs_scrub *sc, int alloc_flags); diff --git a/fs/xfs/scrub/rtrmap_repair.c b/fs/xfs/scrub/rtrmap_repair.c index e26847784d21..36c03e48c3fb 100644 --- a/fs/xfs/scrub/rtrmap_repair.c +++ b/fs/xfs/scrub/rtrmap_repair.c @@ -29,6 +29,7 @@ #include "xfs_rtalloc.h" #include "xfs_ag.h" #include "xfs_rtgroup.h" +#include "xfs_refcount.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -420,6 +421,100 @@ xrep_rtrmap_scan_ag( return 
error; } +struct xrep_rtrmap_stash_run { + struct xrep_rtrmap *rr; + uint64_t owner; +}; + +static int +xrep_rtrmap_stash_run( + uint64_t start, + uint64_t len, + void *priv) +{ + struct xrep_rtrmap_stash_run *rsr = priv; + struct xrep_rtrmap *rr = rsr->rr; + xfs_rgblock_t rgbno = start; + + return xrep_rtrmap_stash(rr, rgbno, len, rsr->owner, 0, 0); +} + +/* + * Emit rmaps for every extent of bits set in the bitmap. Caller must ensure + * that the ranges are in units of FS blocks. + */ +STATIC int +xrep_rtrmap_stash_bitmap( + struct xrep_rtrmap *rr, + struct xrgb_bitmap *bitmap, + const struct xfs_owner_info *oinfo) +{ + struct xrep_rtrmap_stash_run rsr = { + .rr = rr, + .owner = oinfo->oi_owner, + }; + + return xrgb_bitmap_walk(bitmap, xrep_rtrmap_stash_run, &rsr); +} + +/* Record a CoW staging extent. */ +STATIC int +xrep_rtrmap_walk_cowblocks( + struct xfs_btree_cur *cur, + const struct xfs_refcount_irec *irec, + void *priv) +{ + struct xrgb_bitmap *bitmap = priv; + + if (!xfs_refcount_check_domain(irec) || + irec->rc_domain != XFS_REFC_DOMAIN_COW) + return -EFSCORRUPTED; + + return xrgb_bitmap_set(bitmap, irec->rc_startblock, + irec->rc_blockcount); +} + +/* + * Collect rmaps for the blocks containing the refcount btree, and all CoW + * staging extents. + */ +STATIC int +xrep_rtrmap_find_refcount_rmaps( + struct xrep_rtrmap *rr) +{ + struct xrgb_bitmap cow_blocks; /* COWBIT */ + struct xfs_refcount_irec low = { + .rc_startblock = 0, + .rc_domain = XFS_REFC_DOMAIN_COW, + }; + struct xfs_refcount_irec high = { + .rc_startblock = -1U, + .rc_domain = XFS_REFC_DOMAIN_COW, + }; + struct xfs_scrub *sc = rr->sc; + int error; + + if (!xfs_has_rtreflink(sc->mp)) + return 0; + + xrgb_bitmap_init(&cow_blocks); + + /* Collect rmaps for CoW staging extents. */ + error = xfs_refcount_query_range(sc->sr.refc_cur, &low, &high, + xrep_rtrmap_walk_cowblocks, &cow_blocks); + if (error) + goto out_bitmap; + + /* Generate rmaps for everything. 
*/ + error = xrep_rtrmap_stash_bitmap(rr, &cow_blocks, &XFS_RMAP_OINFO_COW); + if (error) + goto out_bitmap; + +out_bitmap: + xrgb_bitmap_destroy(&cow_blocks); + return error; +} + /* Count and check all collected records. */ STATIC int xrep_rtrmap_check_record( @@ -467,6 +562,13 @@ xrep_rtrmap_find_rmaps( if (error) return error; + /* Find CoW staging extents. */ + xrep_rtgroup_btcur_init(sc, &sc->sr); + error = xrep_rtrmap_find_refcount_rmaps(rr); + xchk_rtgroup_btcur_free(&sc->sr); + if (error) + return error; + /* * Set up for a potentially lengthy filesystem scan by reducing our * transaction resource usage for the duration. Specifically: From patchwork Fri Dec 30 22:18:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085547 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 906EBC4332F for ; Sat, 31 Dec 2022 01:58:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236094AbiLaB6P (ORCPT ); Fri, 30 Dec 2022 20:58:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51068 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236092AbiLaB6O (ORCPT ); Fri, 30 Dec 2022 20:58:14 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 19E521C430 for ; Fri, 30 Dec 2022 17:58:12 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A382CB81DF0 for ; Sat, 31 Dec 2022 01:58:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with 
ESMTPSA id 372B2C433D2; Sat, 31 Dec 2022 01:58:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451889; bh=wE06vUwYdY7FEPxjV29yTeG8qlvSvfE79IlrmE/CzDQ=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=RLp68sthSAQjLXUj/tLKtZWEi+LgGeSVEfnq/2ip7bK4Lg5N6zpizZXPZb6PvJ21r bJi8zF6BOif6C7szgPYDrVfe3di5LsjuKxfAEGUwEIWsNPDjpDufxknQaLAnmljh71 tfBzUicmA+2iDhRTEahq+mP3QTiyb1DbYY+KeqvLRU+dilEz2BvxkyZUqmJ6OQGht6 i39TP0gUC3l6Dc0MzuboVMq2uo2J8jLnRG+LX4a3wgIUml+0pQWLtgXaaIYz6HmrjK 3JVFpHX4P52oHeyf6X4+SaEQsb3dSMgu2JqvLbgc6XYQ/pOs8XBlfV8dRXV71T3EFd FyvrKC6y8bhxw== Subject: [PATCH 39/42] xfs: online repair of the realtime refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:34 -0800 Message-ID: <167243871450.717073.11301330207569929154.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Port the data device's refcount btree repair code to the realtime refcount btree. Signed-off-by: Darrick J. 
Wong --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_refcount.c | 2 fs/xfs/libxfs/xfs_refcount.h | 2 fs/xfs/scrub/bmap_repair.c | 2 fs/xfs/scrub/repair.c | 20 + fs/xfs/scrub/repair.h | 7 fs/xfs/scrub/rtrefcount.c | 9 fs/xfs/scrub/rtrefcount_repair.c | 783 ++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/rtrmap_repair.c | 2 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/trace.h | 31 ++ 11 files changed, 852 insertions(+), 9 deletions(-) create mode 100644 fs/xfs/scrub/rtrefcount_repair.c diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index cb1074c67dc5..2f84dff55b6e 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -223,6 +223,7 @@ xfs-y += $(addprefix scrub/, \ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \ rgsuper_repair.o \ rtbitmap_repair.o \ + rtrefcount_repair.o \ rtrmap_repair.o \ rtsummary_repair.o \ ) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 8b878a7a5a3e..e3e349cad04f 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -155,7 +155,7 @@ xfs_refcount_check_perag_irec( return NULL; } -static inline xfs_failaddr_t +inline xfs_failaddr_t xfs_refcount_check_rtgroup_irec( struct xfs_rtgroup *rtg, const struct xfs_refcount_irec *irec) diff --git a/fs/xfs/libxfs/xfs_refcount.h b/fs/xfs/libxfs/xfs_refcount.h index c7907119d10c..790d7fe9e67e 100644 --- a/fs/xfs/libxfs/xfs_refcount.h +++ b/fs/xfs/libxfs/xfs_refcount.h @@ -132,6 +132,8 @@ extern void xfs_refcount_btrec_to_irec(const union xfs_btree_rec *rec, struct xfs_refcount_irec *irec); xfs_failaddr_t xfs_refcount_check_perag_irec(struct xfs_perag *pag, const struct xfs_refcount_irec *irec); +xfs_failaddr_t xfs_refcount_check_rtgroup_irec(struct xfs_rtgroup *rtg, + const struct xfs_refcount_irec *irec); xfs_failaddr_t xfs_refcount_check_irec(struct xfs_btree_cur *cur, const struct xfs_refcount_irec *irec); extern int xfs_refcount_insert(struct xfs_btree_cur *cur, diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c index 
5dca4680657f..4df6ce7beef4 100644 --- a/fs/xfs/scrub/bmap_repair.c +++ b/fs/xfs/scrub/bmap_repair.c @@ -349,7 +349,7 @@ xrep_bmap_check_rtfork_rmap( /* Make sure this isn't free space. */ rtbno = xfs_rgbno_to_rtb(sc->mp, cur->bc_ino.rtg->rtg_rgno, rec->rm_startblock); - return xrep_require_rtext_inuse(sc, rtbno, rec->rm_blockcount); + return xrep_require_rtext_inuse(sc, rtbno, rec->rm_blockcount, false); } /* Record realtime extents that belong to this inode's fork. */ diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 3bde5ea86cf5..566fff059384 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -1022,21 +1022,31 @@ xrep_rtgroup_init( return 0; } -/* Ensure that all rt blocks in the given range are not marked free. */ +/* + * Ensure that all rt blocks in the given range are not marked free. If + * @must_align is true, then both ends must be aligned to a rt extent. + */ int xrep_require_rtext_inuse( struct xfs_scrub *sc, xfs_rtblock_t rtbno, - xfs_filblks_t len) + xfs_filblks_t len, + bool must_align) { struct xfs_mount *mp = sc->mp; xfs_rtxnum_t startrtx; xfs_rtxnum_t endrtx; + xfs_extlen_t mod; bool is_free = false; int error; - startrtx = xfs_rtb_to_rtxt(mp, rtbno); - endrtx = xfs_rtb_to_rtxt(mp, rtbno + len - 1); + startrtx = xfs_rtb_to_rtx(mp, rtbno, &mod); + if (must_align && mod != 0) + return -EFSCORRUPTED; + + endrtx = xfs_rtb_to_rtx(mp, rtbno + len - 1, &mod); + if (must_align && mod != mp->m_sb.sb_rextsize - 1) + return -EFSCORRUPTED; error = xfs_rtalloc_extent_is_free(mp, sc->tp, startrtx, endrtx - startrtx + 1, &is_free); @@ -1393,6 +1403,8 @@ xrep_is_rtmeta_ino( /* Newer rt metadata files are not guaranteed to exist */ if (rtg->rtg_rmapip && ino == rtg->rtg_rmapip->i_ino) return true; + if (rtg->rtg_refcountip && ino == rtg->rtg_refcountip->i_ino) + return true; return false; } diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index 4a0cedea3fe0..aa15aeffa724 100644 --- a/fs/xfs/scrub/repair.h +++ 
b/fs/xfs/scrub/repair.h @@ -90,6 +90,7 @@ int xrep_setup_nlinks(struct xfs_scrub *sc); int xrep_setup_symlink(struct xfs_scrub *sc, unsigned int *resblks); int xrep_setup_rgbitmap(struct xfs_scrub *sc, unsigned int *resblks); int xrep_setup_rtrmapbt(struct xfs_scrub *sc); +int xrep_setup_rtrefcountbt(struct xfs_scrub *sc); int xrep_xattr_reset_fork(struct xfs_scrub *sc); @@ -108,7 +109,7 @@ int xrep_rtgroup_init(struct xfs_scrub *sc, struct xfs_rtgroup *rtg, struct xchk_rt *sr, unsigned int rtglock_flags); void xrep_rtgroup_btcur_init(struct xfs_scrub *sc, struct xchk_rt *sr); int xrep_require_rtext_inuse(struct xfs_scrub *sc, xfs_rtblock_t rtbno, - xfs_filblks_t len); + xfs_filblks_t len, bool must_align); xfs_extlen_t xrep_calc_rtgroup_resblks(struct xfs_scrub *sc); #else # define xrep_rtgroup_init(sc, rtg, sr, lockflags) (-ENOSYS) @@ -153,12 +154,14 @@ int xrep_rtsummary(struct xfs_scrub *sc); int xrep_rgsuperblock(struct xfs_scrub *sc); int xrep_rgbitmap(struct xfs_scrub *sc); int xrep_rtrmapbt(struct xfs_scrub *sc); +int xrep_rtrefcountbt(struct xfs_scrub *sc); #else # define xrep_rtbitmap xrep_notsupported # define xrep_rtsummary xrep_notsupported # define xrep_rgsuperblock xrep_notsupported # define xrep_rgbitmap xrep_notsupported # define xrep_rtrmapbt xrep_notsupported +# define xrep_rtrefcountbt xrep_notsupported #endif /* CONFIG_XFS_RT */ #ifdef CONFIG_XFS_QUOTA @@ -230,6 +233,7 @@ xrep_setup_nothing( #define xrep_setup_parent xrep_setup_nothing #define xrep_setup_nlinks xrep_setup_nothing #define xrep_setup_rtrmapbt xrep_setup_nothing +#define xrep_setup_rtrefcountbt xrep_setup_nothing #define xrep_setup_inode(sc, imap) ((void)0) @@ -286,6 +290,7 @@ static inline int xrep_setup_rgbitmap(struct xfs_scrub *sc, unsigned int *x) #define xrep_rgsuperblock xrep_notsupported #define xrep_rgbitmap xrep_notsupported #define xrep_rtrmapbt xrep_notsupported +#define xrep_rtrefcountbt xrep_notsupported #endif /* CONFIG_XFS_ONLINE_REPAIR */ diff --git 
a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c index 3cb2ff8443da..8eb79f7030e7 100644 --- a/fs/xfs/scrub/rtrefcount.c +++ b/fs/xfs/scrub/rtrefcount.c @@ -7,8 +7,10 @@ #include "xfs_fs.h" #include "xfs_shared.h" #include "xfs_format.h" +#include "xfs_log_format.h" #include "xfs_trans_resv.h" #include "xfs_mount.h" +#include "xfs_trans.h" #include "xfs_btree.h" #include "xfs_rmap.h" #include "xfs_refcount.h" @@ -19,6 +21,7 @@ #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/btree.h" +#include "scrub/repair.h" /* Set us up with the realtime refcount metadata locked. */ int @@ -31,6 +34,12 @@ xchk_setup_rtrefcountbt( if (xchk_need_fshook_drain(sc)) xchk_fshooks_enable(sc, XCHK_FSHOOKS_DRAIN); + if (xchk_could_repair(sc)) { + error = xrep_setup_rtrefcountbt(sc); + if (error) + return error; + } + rtg = xfs_rtgroup_get(sc->mp, sc->sm->sm_agno); if (!rtg) return -ENOENT; diff --git a/fs/xfs/scrub/rtrefcount_repair.c b/fs/xfs/scrub/rtrefcount_repair.c new file mode 100644 index 000000000000..f56966aaaad8 --- /dev/null +++ b/fs/xfs/scrub/rtrefcount_repair.c @@ -0,0 +1,783 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. 
Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_defer.h" +#include "xfs_btree.h" +#include "xfs_btree_staging.h" +#include "xfs_bit.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_sb.h" +#include "xfs_alloc.h" +#include "xfs_ialloc.h" +#include "xfs_rmap.h" +#include "xfs_rmap_btree.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_refcount.h" +#include "xfs_rtrefcount_btree.h" +#include "xfs_error.h" +#include "xfs_health.h" +#include "xfs_inode.h" +#include "xfs_quota.h" +#include "xfs_rtalloc.h" +#include "xfs_ag.h" +#include "xfs_rtgroup.h" +#include "scrub/xfs_scrub.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/btree.h" +#include "scrub/trace.h" +#include "scrub/repair.h" +#include "scrub/bitmap.h" +#include "scrub/xfile.h" +#include "scrub/xfarray.h" +#include "scrub/newbt.h" +#include "scrub/reap.h" +#include "scrub/rcbag.h" + +/* + * Rebuilding the Reference Count Btree + * ==================================== + * + * This algorithm is "borrowed" from xfs_repair. Imagine the rmap + * entries as rectangles representing extents of physical blocks, and + * that the rectangles can be laid down to allow them to overlap each + * other; then we know that we must emit a refcnt btree entry wherever + * the amount of overlap changes, i.e. the emission stimulus is + * level-triggered: + * + * - --- + * -- ----- ---- --- ------ + * -- ---- ----------- ---- --------- + * -------------------------------- ----------- + * ^ ^ ^^ ^^ ^ ^^ ^^^ ^^^^ ^ ^^ ^ ^ ^ + * 2 1 23 21 3 43 234 2123 1 01 2 3 0 + * + * For our purposes, a rmap is a tuple (startblock, len, fileoff, owner). + * + * Note that in the actual refcnt btree we don't store the refcount < 2 + * cases because the bnobt tells us which blocks are free; single-use + * blocks aren't recorded in the bnobt or the refcntbt. 
If the rmapbt + * supports storing multiple entries covering a given block we could + * theoretically dispense with the refcntbt and simply count rmaps, but + * that's inefficient in the (hot) write path, so we'll take the cost of + * the extra tree to save time. Also there's no guarantee that rmap + * will be enabled. + * + * Given an array of rmaps sorted by physical block number, a starting + * physical block (sp), a bag to hold rmaps that cover sp, and the next + * physical block where the level changes (np), we can reconstruct the + * rt refcount btree as follows: + * + * While there are still unprocessed rmaps in the array, + * - Set sp to the physical block (pblk) of the next unprocessed rmap. + * - Add to the bag all rmaps in the array where startblock == sp. + * - Set np to the physical block where the bag size will change. This + * is the minimum of (the pblk of the next unprocessed rmap) and + * (startblock + len of each rmap in the bag). + * - Record the bag size as old_bag_size. + * + * - While the bag isn't empty, + * - Remove from the bag all rmaps where startblock + len == np. + * - Add to the bag all rmaps in the array where startblock == np. + * - If the bag size isn't old_bag_size, store the refcount entry + * (sp, np - sp, bag_size) in the refcnt btree. + * - If the bag is empty, break out of the inner loop. + * - Set old_bag_size to the bag size + * - Set sp = np. + * - Set np to the physical block where the bag size will change. + * This is the minimum of (the pblk of the next unprocessed rmap) + * and (startblock + len of each rmap in the bag). + * + * Like all the other repairers, we make a list of all the refcount + * records we need, then reinitialize the rt refcount btree root and + * insert all the records. 
+ */ + +struct xrep_rtrefc { + /* refcount extents */ + struct xfarray *refcount_records; + + /* new refcountbt information */ + struct xrep_newbt new_btree; + + /* old refcountbt blocks */ + struct xfsb_bitmap old_rtrefcountbt_blocks; + + struct xfs_scrub *sc; + + /* get_records()'s position in the rt refcount record array. */ + xfarray_idx_t array_cur; + + /* # of refcountbt blocks */ + xfs_filblks_t btblocks; +}; + +/* Set us up to repair refcount btrees. */ +int +xrep_setup_rtrefcountbt( + struct xfs_scrub *sc) +{ + return xrep_setup_buftarg(sc, "rtrefcount bag"); +} + +/* Check for any obvious conflicts with this shared/CoW staging extent. */ +STATIC int +xrep_rtrefc_check_ext( + struct xfs_scrub *sc, + const struct xfs_refcount_irec *rec) +{ + xfs_rtblock_t rtbno; + + if (xfs_refcount_check_rtgroup_irec(sc->sr.rtg, rec) != NULL) + return -EFSCORRUPTED; + + /* Make sure this isn't free space or misaligned. */ + rtbno = xfs_rgbno_to_rtb(sc->mp, sc->sr.rtg->rtg_rgno, + rec->rc_startblock); + return xrep_require_rtext_inuse(sc, rtbno, rec->rc_blockcount, true); +} + +/* Record a reference count extent. */ +STATIC int +xrep_rtrefc_stash( + struct xrep_rtrefc *rr, + enum xfs_refc_domain domain, + xfs_rgblock_t bno, + xfs_extlen_t len, + uint64_t refcount) +{ + struct xfs_refcount_irec irec = { + .rc_startblock = bno, + .rc_blockcount = len, + .rc_refcount = refcount, + .rc_domain = domain, + }; + int error = 0; + + if (xchk_should_terminate(rr->sc, &error)) + return error; + + irec.rc_refcount = min_t(uint64_t, XFS_REFC_REFCOUNT_MAX, refcount); + + error = xrep_rtrefc_check_ext(rr->sc, &irec); + if (error) + return error; + + trace_xrep_rtrefc_found(rr->sc->sr.rtg, &irec); + + return xfarray_append(rr->refcount_records, &irec); +} + +/* Record a CoW staging extent. 
*/ +STATIC int +xrep_rtrefc_stash_cow( + struct xrep_rtrefc *rr, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + return xrep_rtrefc_stash(rr, XFS_REFC_DOMAIN_COW, bno, len, 1); +} + +/* Decide if an rmap could describe a shared extent. */ +static inline bool +xrep_rtrefc_rmap_shareable( + const struct xfs_rmap_irec *rmap) +{ + /* rt metadata are never shareable */ + if (XFS_RMAP_NON_INODE_OWNER(rmap->rm_owner)) + return false; + + /* Unwritten file blocks are not shareable. */ + if (rmap->rm_flags & XFS_RMAP_UNWRITTEN) + return false; + + return true; +} + +/* Grab the next (abbreviated) rmap record from the rmapbt. */ +STATIC int +xrep_rtrefc_walk_rmaps( + struct xrep_rtrefc *rr, + struct xfs_rmap_irec *rmap, + bool *have_rec) +{ + struct xfs_btree_cur *cur = rr->sc->sr.rmap_cur; + struct xfs_mount *mp = cur->bc_mp; + int have_gt; + int error = 0; + + *have_rec = false; + + /* + * Loop through the remaining rmaps. Remember CoW staging + * extents and the refcountbt blocks from the old tree for later + * disposal. We can only share written data fork extents, so + * keep looping until we find an rmap for one.
+ */ + do { + if (xchk_should_terminate(rr->sc, &error)) + return error; + + error = xfs_btree_increment(cur, 0, &have_gt); + if (error) + return error; + if (!have_gt) + return 0; + + error = xfs_rmap_get_rec(cur, rmap, &have_gt); + if (error) + return error; + if (XFS_IS_CORRUPT(mp, !have_gt)) { + xfs_btree_mark_sick(cur); + return -EFSCORRUPTED; + } + + if (rmap->rm_owner == XFS_RMAP_OWN_COW) { + error = xrep_rtrefc_stash_cow(rr, rmap->rm_startblock, + rmap->rm_blockcount); + if (error) + return error; + } else if (xfs_internal_inum(mp, rmap->rm_owner) || + (rmap->rm_flags & (XFS_RMAP_ATTR_FORK | + XFS_RMAP_BMBT_BLOCK))) { + xfs_btree_mark_sick(cur); + return -EFSCORRUPTED; + } + } while (!xrep_rtrefc_rmap_shareable(rmap)); + + *have_rec = true; + return 0; +} + +static inline uint32_t +xrep_rtrefc_encode_startblock( + const struct xfs_refcount_irec *irec) +{ + uint32_t start; + + start = irec->rc_startblock & ~XFS_REFC_COWFLAG; + if (irec->rc_domain == XFS_REFC_DOMAIN_COW) + start |= XFS_REFC_COWFLAG; + + return start; +} + +/* + * Compare two refcount records. We want to sort in order of increasing block + * number. + */ +static int +xrep_rtrefc_extent_cmp( + const void *a, + const void *b) +{ + const struct xfs_refcount_irec *ap = a; + const struct xfs_refcount_irec *bp = b; + uint32_t sa, sb; + + sa = xrep_rtrefc_encode_startblock(ap); + sb = xrep_rtrefc_encode_startblock(bp); + + if (sa > sb) + return 1; + if (sa < sb) + return -1; + return 0; +} + +/* + * Sort the refcount extents by startblock or else the btree records will be in + * the wrong order. Make sure the records do not overlap in physical space. 
+ */ +STATIC int +xrep_rtrefc_sort_records( + struct xrep_rtrefc *rr) +{ + struct xfs_refcount_irec irec; + xfarray_idx_t cur; + enum xfs_refc_domain dom = XFS_REFC_DOMAIN_SHARED; + xfs_rgblock_t next_rgbno = 0; + int error; + + error = xfarray_sort(rr->refcount_records, xrep_rtrefc_extent_cmp, + XFARRAY_SORT_KILLABLE); + if (error) + return error; + + foreach_xfarray_idx(rr->refcount_records, cur) { + if (xchk_should_terminate(rr->sc, &error)) + return error; + + error = xfarray_load(rr->refcount_records, cur, &irec); + if (error) + return error; + + if (dom == XFS_REFC_DOMAIN_SHARED && + irec.rc_domain == XFS_REFC_DOMAIN_COW) { + dom = irec.rc_domain; + next_rgbno = 0; + } + + if (dom != irec.rc_domain) + return -EFSCORRUPTED; + if (irec.rc_startblock < next_rgbno) + return -EFSCORRUPTED; + + next_rgbno = irec.rc_startblock + irec.rc_blockcount; + } + + return error; +} + +/* Record extents that belong to the realtime refcount inode. */ +STATIC int +xrep_rtrefc_walk_rmap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xrep_rtrefc *rr = priv; + struct xfs_mount *mp = cur->bc_mp; + xfs_fsblock_t fsbno; + int error = 0; + + if (xchk_should_terminate(rr->sc, &error)) + return error; + + /* Skip extents which are not owned by this inode and fork. */ + if (rec->rm_owner != rr->sc->ip->i_ino) + return 0; + + error = xrep_check_ino_btree_mapping(rr->sc, rec); + if (error) + return error; + + fsbno = XFS_AGB_TO_FSB(mp, cur->bc_ag.pag->pag_agno, + rec->rm_startblock); + + return xfsb_bitmap_set(&rr->old_rtrefcountbt_blocks, fsbno, + rec->rm_blockcount); +} + +/* + * Walk forward through the rmap btree to collect all rmaps starting at + * @bno in @rcstack. These represent the file(s) that share ownership of + * the current block. Upon return, the rmap cursor points to the last record + * satisfying the startblock constraint.
+ */ +static int +xrep_rtrefc_push_rmaps_at( + struct xrep_rtrefc *rr, + struct rcbag *rcstack, + xfs_rgblock_t bno, + struct xfs_rmap_irec *rmap, + bool *have) +{ + struct xfs_scrub *sc = rr->sc; + int have_gt; + int error; + + while (*have && rmap->rm_startblock == bno) { + error = rcbag_add(rcstack, rr->sc->tp, rmap); + if (error) + return error; + + error = xrep_rtrefc_walk_rmaps(rr, rmap, have); + if (error) + return error; + } + + error = xfs_btree_decrement(sc->sr.rmap_cur, 0, &have_gt); + if (error) + return error; + if (XFS_IS_CORRUPT(sc->mp, !have_gt)) { + xfs_btree_mark_sick(sc->sr.rmap_cur); + return -EFSCORRUPTED; + } + + return 0; +} + +/* Scan one AG for reverse mappings for the realtime refcount btree. */ +STATIC int +xrep_rtrefc_scan_ag( + struct xrep_rtrefc *rr, + struct xfs_perag *pag) +{ + struct xfs_scrub *sc = rr->sc; + int error; + + error = xrep_ag_init(sc, pag, &sc->sa); + if (error) + return error; + + error = xfs_rmap_query_all(sc->sa.rmap_cur, xrep_rtrefc_walk_rmap, rr); + xchk_ag_free(sc, &sc->sa); + return error; +} + +/* Iterate all the rmap records to generate reference count data. */ +STATIC int +xrep_rtrefc_find_refcounts( + struct xrep_rtrefc *rr) +{ + struct xfs_scrub *sc = rr->sc; + struct rcbag *rcstack; + struct xfs_perag *pag; + uint64_t old_stack_height; + xfs_rgblock_t sbno; + xfs_rgblock_t cbno; + xfs_rgblock_t nbno; + xfs_agnumber_t agno; + bool have; + int error; + + /* Scan for old rtrefc btree blocks. */ + for_each_perag(sc->mp, agno, pag) { + error = xrep_rtrefc_scan_ag(rr, pag); + if (error) { + xfs_perag_put(pag); + return error; + } + } + + xrep_rtgroup_btcur_init(sc, &sc->sr); + + /* + * Set up a bag to store all the rmap records that we're tracking to + * generate a reference count record. If this exceeds + * XFS_REFC_REFCOUNT_MAX, we clamp rc_refcount. + */ + error = rcbag_init(sc->mp, sc->xfile_buftarg, &rcstack); + if (error) + goto out_cur; + + /* Start the rtrmapbt cursor to the left of all records. 
*/ + error = xfs_btree_goto_left_edge(sc->sr.rmap_cur); + if (error) + goto out_bag; + + /* Process reverse mappings into refcount data. */ + while (xfs_btree_has_more_records(sc->sr.rmap_cur)) { + struct xfs_rmap_irec rmap; + + /* Push all rmaps with pblk == sbno onto the stack */ + error = xrep_rtrefc_walk_rmaps(rr, &rmap, &have); + if (error) + goto out_bag; + if (!have) + break; + sbno = cbno = rmap.rm_startblock; + error = xrep_rtrefc_push_rmaps_at(rr, rcstack, sbno, &rmap, + &have); + if (error) + goto out_bag; + + /* Set nbno to the bno of the next refcount change */ + error = rcbag_next_edge(rcstack, sc->tp, &rmap, have, &nbno); + if (error) + goto out_bag; + + ASSERT(nbno > sbno); + old_stack_height = rcbag_count(rcstack); + + /* While stack isn't empty... */ + while (rcbag_count(rcstack) > 0) { + /* Pop all rmaps that end at nbno */ + error = rcbag_remove_ending_at(rcstack, sc->tp, nbno); + if (error) + goto out_bag; + + /* Push array items that start at nbno */ + error = xrep_rtrefc_walk_rmaps(rr, &rmap, &have); + if (error) + goto out_bag; + if (have) { + error = xrep_rtrefc_push_rmaps_at(rr, rcstack, + nbno, &rmap, &have); + if (error) + goto out_bag; + } + + /* Emit refcount if necessary */ + ASSERT(nbno > cbno); + if (rcbag_count(rcstack) != old_stack_height) { + if (old_stack_height > 1) { + error = xrep_rtrefc_stash(rr, + XFS_REFC_DOMAIN_SHARED, + cbno, nbno - cbno, + old_stack_height); + if (error) + goto out_bag; + } + cbno = nbno; + } + + /* Stack empty, go find the next rmap */ + if (rcbag_count(rcstack) == 0) + break; + old_stack_height = rcbag_count(rcstack); + sbno = nbno; + + /* Set nbno to the bno of the next refcount change */ + error = rcbag_next_edge(rcstack, sc->tp, &rmap, have, + &nbno); + if (error) + goto out_bag; + + ASSERT(nbno > sbno); + } + } + + ASSERT(rcbag_count(rcstack) == 0); +out_bag: + rcbag_free(&rcstack); +out_cur: + xchk_rtgroup_btcur_free(&sc->sr); + return error; +} + +/* Retrieve refcountbt data for bulk load. 
*/ +STATIC int +xrep_rtrefc_get_records( + struct xfs_btree_cur *cur, + unsigned int idx, + struct xfs_btree_block *block, + unsigned int nr_wanted, + void *priv) +{ + struct xrep_rtrefc *rr = priv; + union xfs_btree_rec *block_rec; + unsigned int loaded; + int error; + + for (loaded = 0; loaded < nr_wanted; loaded++, idx++) { + error = xfarray_load(rr->refcount_records, rr->array_cur++, + &cur->bc_rec.rc); + if (error) + return error; + + block_rec = xfs_btree_rec_addr(cur, idx, block); + cur->bc_ops->init_rec_from_cur(cur, block_rec); + } + + return loaded; +} + +/* Feed one of the new btree blocks to the bulk loader. */ +STATIC int +xrep_rtrefc_claim_block( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr, + void *priv) +{ + struct xrep_rtrefc *rr = priv; + int error; + + error = xrep_newbt_relog_autoreap(&rr->new_btree); + if (error) + return error; + + return xrep_newbt_claim_block(cur, &rr->new_btree, ptr); +} + +/* Figure out how much space we need to create the incore btree root block. */ +STATIC size_t +xrep_rtrefc_iroot_size( + struct xfs_btree_cur *cur, + unsigned int level, + unsigned int nr_this_level, + void *priv) +{ + return xfs_rtrefcount_broot_space_calc(cur->bc_mp, level, + nr_this_level); +} + +/* + * Use the collected refcount information to stage a new rt refcount btree. If + * this is successful we'll return with the new btree root information logged + * to the repair transaction but not yet committed. 
+ */ +STATIC int +xrep_rtrefc_build_new_tree( + struct xrep_rtrefc *rr) +{ + struct xfs_owner_info oinfo; + struct xfs_scrub *sc = rr->sc; + struct xfs_mount *mp = sc->mp; + struct xfs_rtgroup *rtg = sc->sr.rtg; + struct xfs_btree_cur *refc_cur; + int error; + + error = xrep_rtrefc_sort_records(rr); + if (error) + return error; + + /* + * Prepare to construct the new btree by reserving disk space for the + * new btree and setting up all the accounting information we'll need + * to root the new btree while it's under construction and before we + * attach it to the realtime refcount inode. + */ + xfs_rmap_ino_bmbt_owner(&oinfo, rtg->rtg_refcountip->i_ino, + XFS_DATA_FORK); + error = xrep_newbt_init_inode(&rr->new_btree, sc, XFS_DATA_FORK, + &oinfo); + if (error) + return error; + rr->new_btree.bload.get_records = xrep_rtrefc_get_records; + rr->new_btree.bload.claim_block = xrep_rtrefc_claim_block; + rr->new_btree.bload.iroot_size = xrep_rtrefc_iroot_size; + + /* Compute how many blocks we'll need. */ + refc_cur = xfs_rtrefcountbt_stage_cursor(mp, rtg, rtg->rtg_refcountip, + &rr->new_btree.ifake); + error = xfs_btree_bload_compute_geometry(refc_cur, &rr->new_btree.bload, + xfarray_length(rr->refcount_records)); + if (error) + goto err_cur; + + /* Last chance to abort before we start committing fixes. */ + if (xchk_should_terminate(sc, &error)) + goto err_cur; + + /* + * Guess how many blocks we're going to need to rebuild an entire + * rtrefcountbt from the number of extents we found, and pump up our + * transaction to have sufficient block reservation. We're allowed + * to exceed quota to repair inconsistent metadata, though this is + * unlikely. + */ + error = xfs_trans_reserve_more_inode(sc->tp, rtg->rtg_refcountip, + rr->new_btree.bload.nr_blocks, 0, true); + if (error) + goto err_cur; + + /* Reserve the space we'll need for the new btree. 
*/ + error = xrep_newbt_alloc_blocks(&rr->new_btree, + rr->new_btree.bload.nr_blocks); + if (error) + goto err_cur; + + /* Add all observed refcount records. */ + rr->new_btree.ifake.if_fork->if_format = XFS_DINODE_FMT_REFCOUNT; + rr->array_cur = XFARRAY_CURSOR_INIT; + error = xfs_btree_bload(refc_cur, &rr->new_btree.bload, rr); + if (error) + goto err_cur; + + /* + * Install the new rtrefc btree in the inode. After this point the old + * btree is no longer accessible, the new tree is live, and we can + * delete the cursor. + */ + xfs_rtrefcountbt_commit_staged_btree(refc_cur, sc->tp); + xrep_inode_set_nblocks(rr->sc, rr->new_btree.ifake.if_blocks); + xfs_btree_del_cursor(refc_cur, 0); + + /* Dispose of any unused blocks and the accounting information. */ + error = xrep_newbt_commit(&rr->new_btree); + if (error) + return error; + + return xrep_roll_trans(sc); +err_cur: + xfs_btree_del_cursor(refc_cur, error); + xrep_newbt_cancel(&rr->new_btree); + return error; +} + +/* + * Now that we've logged the roots of the new btrees, invalidate all of the + * old blocks and free them. + */ +STATIC int +xrep_rtrefc_remove_old_tree( + struct xrep_rtrefc *rr) +{ + struct xfs_owner_info oinfo; + int error; + + xfs_rmap_ino_bmbt_owner(&oinfo, rr->sc->ip->i_ino, XFS_DATA_FORK); + + /* + * Free all the extents that were allocated to the former rtrefcountbt + * and aren't cross-linked with something else. If the incore space + * reservation for the rtrefcount inode is insufficient, this will refill + * it. + */ + error = xrep_reap_fsblocks(rr->sc, &rr->old_rtrefcountbt_blocks, + &oinfo, XFS_AG_RESV_IMETA); + if (error) + return error; + + /* + * Ensure the proper reservation for the rtrefcount inode so that we + * don't fail to expand the btree. + */ + return xrep_reset_imeta_reservation(rr->sc); +} + +/* Rebuild the rt refcount btree.
*/ +int +xrep_rtrefcountbt( + struct xfs_scrub *sc) +{ + struct xrep_rtrefc *rr; + struct xfs_mount *mp = sc->mp; + int error; + + /* We require the rmapbt to rebuild anything. */ + if (!xfs_has_rtrmapbt(mp)) + return -EOPNOTSUPP; + + /* Make sure any problems with the fork are fixed. */ + error = xrep_metadata_inode_forks(sc); + if (error) + return error; + + rr = kzalloc(sizeof(struct xrep_rtrefc), XCHK_GFP_FLAGS); + if (!rr) + return -ENOMEM; + rr->sc = sc; + + /* Set up enough storage to handle one refcount record per rt extent. */ + error = xfarray_create(mp, "rtrefcount records", + mp->m_sb.sb_rextents, + sizeof(struct xfs_refcount_irec), + &rr->refcount_records); + if (error) + goto out_rr; + + /* Collect all reference counts. */ + xfsb_bitmap_init(&rr->old_rtrefcountbt_blocks); + error = xrep_rtrefc_find_refcounts(rr); + if (error) + goto out_bitmap; + + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* Rebuild the refcount information. */ + error = xrep_rtrefc_build_new_tree(rr); + if (error) + goto out_bitmap; + + /* Kill the old tree. */ + error = xrep_rtrefc_remove_old_tree(rr); + +out_bitmap: + xfsb_bitmap_destroy(&rr->old_rtrefcountbt_blocks); + xfarray_destroy(rr->refcount_records); +out_rr: + kfree(rr); + return error; +} diff --git a/fs/xfs/scrub/rtrmap_repair.c b/fs/xfs/scrub/rtrmap_repair.c index 36c03e48c3fb..fb841036b89f 100644 --- a/fs/xfs/scrub/rtrmap_repair.c +++ b/fs/xfs/scrub/rtrmap_repair.c @@ -119,7 +119,7 @@ xrep_rtrmap_check_mapping( /* Make sure this isn't free space. */ rtbno = xfs_rgbno_to_rtb(sc->mp, sc->sr.rtg->rtg_rgno, rec->rm_startblock); - return xrep_require_rtext_inuse(sc, rtbno, rec->rm_blockcount); + return xrep_require_rtext_inuse(sc, rtbno, rec->rm_blockcount, false); } /* Store a reverse-mapping record. 
*/ diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index ad6f297ae6cf..2f60fd6b86a9 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -437,7 +437,7 @@ static const struct xchk_meta_ops meta_scrub_ops[] = { .setup = xchk_setup_rtrefcountbt, .scrub = xchk_rtrefcountbt, .has = xfs_has_rtreflink, - .repair = xrep_notsupported, + .repair = xrep_rtrefcountbt, }, }; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 8070d946ae1d..d74bba391854 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -3063,6 +3063,37 @@ TRACE_EVENT(xrep_rtrmap_live_update, __entry->offset, __entry->flags) ); + +TRACE_EVENT(xrep_rtrefc_found, + TP_PROTO(struct xfs_rtgroup *rtg, const struct xfs_refcount_irec *rec), + TP_ARGS(rtg, rec), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(xfs_rgnumber_t, rgno) + __field(enum xfs_refc_domain, domain) + __field(xfs_rgblock_t, startblock) + __field(xfs_extlen_t, blockcount) + __field(xfs_nlink_t, refcount) + ), + TP_fast_assign( + __entry->dev = rtg->rtg_mount->m_super->s_dev; + __entry->rtdev = rtg->rtg_mount->m_rtdev_targp->bt_dev; + __entry->rgno = rtg->rtg_rgno; + __entry->domain = rec->rc_domain; + __entry->startblock = rec->rc_startblock; + __entry->blockcount = rec->rc_blockcount; + __entry->refcount = rec->rc_refcount; + ), + TP_printk("dev %d:%d rtdev %d:%d rgno 0x%x dom %s rgbno 0x%x fsbcount 0x%x refcount %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->rgno, + __print_symbolic(__entry->domain, XFS_REFC_DOMAIN_STRINGS), + __entry->startblock, + __entry->blockcount, + __entry->refcount) +); #endif /* CONFIG_XFS_RT */ #endif /* IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) */ From patchwork Fri Dec 30 22:18:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085548 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2536DC4332F for ; Sat, 31 Dec 2022 01:58:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236101AbiLaB6b (ORCPT ); Fri, 30 Dec 2022 20:58:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51096 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236092AbiLaB61 (ORCPT ); Fri, 30 Dec 2022 20:58:27 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4942E1C430 for ; Fri, 30 Dec 2022 17:58:27 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0E14EB81DF0 for ; Sat, 31 Dec 2022 01:58:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CCE69C433D2; Sat, 31 Dec 2022 01:58:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451904; bh=QY4pfgUyOI8NRJ4WzmUz+mRd40gjGvc/tIkqvCVD7m0=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=dTpjysoMeQ+tfm5lcKLT+0mA8Es+0H/fWT6vL81EeXi0h8uWviAQxK++28I79R/pf 0vA/0g2dbYcEN9JPqmxt6sByDk+E6kfZ9tLjfZknu8tHzvREVYXEm9hjTT/WrKKtiX 8D6xrtN76KIoPxN1PVnycSSS+E9ctmmr4+eghkNoh4uQrXljl03wiyRZBd20LLJTjj 0L6NUhKF60ZRqvQ122Dcs8z/R2T2IQbi5xx+PBwFag2B20dsve7JOlM/oV1nsLC7vD MyezV9zdtZJODuNZZOA9oBSHUj0Dwsdujj/+ysiNbCw4tH4eGfneu029oppd991Z66 hdm7ksnXRXN6Q== Subject: [PATCH 40/42] xfs: repair inodes that have a refcount btree in the data fork From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:34 -0800 Message-ID: <167243871465.717073.11753987612330457810.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Plumb knowledge of refcount btrees into the inode core repair code. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/inode_repair.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index 3ce9ac5b0fc4..15dbb8a08b81 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -39,6 +39,7 @@ #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -977,6 +978,7 @@ xrep_dinode_ensure_forkoff( { struct xfs_bmdr_block *bmdr; struct xfs_rtrmap_root *rmdr; + struct xfs_rtrefcount_root *rcdr; struct xfs_scrub *sc = ri->sc; xfs_extnum_t attr_extents, data_extents; size_t bmdr_minsz = xfs_bmdr_space_calc(1); @@ -1087,6 +1089,10 @@ xrep_dinode_ensure_forkoff( rmdr = XFS_DFORK_PTR(dip, XFS_DATA_FORK); dfork_min = xfs_rtrmap_broot_space(sc->mp, rmdr); break; + case XFS_DINODE_FMT_REFCOUNT: + rcdr = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + dfork_min = xfs_rtrefcount_broot_space(sc->mp, rcdr); + break; default: dfork_min = 0; break; From patchwork Fri Dec 30 22:18:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085549 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD0E5C4332F for ; Sat, 31 Dec 2022 01:58:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236092AbiLaB6n (ORCPT ); Fri, 30 Dec 2022 20:58:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51130 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236104AbiLaB6n (ORCPT ); Fri, 30 Dec 2022 20:58:43 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6CFF51C900 for ; Fri, 30 Dec 2022 17:58:41 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id EEF4361C5B for ; Sat, 31 Dec 2022 01:58:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5A4B8C433EF; Sat, 31 Dec 2022 01:58:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451920; bh=u108L5Hf7YJiJdLSMrc7QIzC9x6KpwVqwyBKYqlyrA8=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=csFoMq+FLxxklGXPa6Lcetx7ujroi753qSG0UBPprW5uX/A5TYfR2JqYP2XSBmhND 6dsorqFRMi4XSrprJ9g84uhNJ4P83Bd5vvPf5zo4hexTlxluQphqWAbhGl2ybetf6B zyD7Hma5QRujZ1iPKBw6qKMsO4nGNC4yLGP0M5FtbsZT0knnWYTBW9KwAZ/ERZYB3L /lBSb39CnXJhHwIYPa+SQzIO0r6OjjsKO5F30NipnIVgobVbsTr+kWjI0hBNnTN5gx 34MB+R5IB1+QJqKGQ6/WRlG1RgQILWUpPeHGgNDtUC65NQUira+2gnNzUC7vJV5IOq jSSw5k/EDXvdw== Subject: [PATCH 41/42] xfs: fix cow forks for realtime files From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:34 -0800 Message-ID: <167243871478.717073.12651516252296016150.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Port the CoW fork repair to realtime files. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/bitmap.h | 28 ++++++ fs/xfs/scrub/cow_repair.c | 210 ++++++++++++++++++++++++++++++++++++++++--- fs/xfs/scrub/reap.c | 219 +++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/reap.h | 7 + fs/xfs/scrub/repair.h | 1 fs/xfs/scrub/trace.h | 72 +++++++++++++++ 6 files changed, 520 insertions(+), 17 deletions(-) diff --git a/fs/xfs/scrub/bitmap.h b/fs/xfs/scrub/bitmap.h index 29faf2b63715..7541b5fd68f8 100644 --- a/fs/xfs/scrub/bitmap.h +++ b/fs/xfs/scrub/bitmap.h @@ -167,4 +167,32 @@ static inline int xfsb_bitmap_walk(struct xfsb_bitmap *bitmap, return xbitmap_walk(&bitmap->fsbitmap, fn, priv); } +/* Bitmaps, but type-checked for xfs_rtblock_t */ + +struct xrtb_bitmap { + struct xbitmap rtbitmap; +}; + +static inline void xrtb_bitmap_init(struct xrtb_bitmap *bitmap) +{ + xbitmap_init(&bitmap->rtbitmap); +} + +static inline void xrtb_bitmap_destroy(struct xrtb_bitmap *bitmap) +{ + xbitmap_destroy(&bitmap->rtbitmap); +} + +static inline int xrtb_bitmap_set(struct xrtb_bitmap *bitmap, + xfs_rtblock_t start, xfs_filblks_t len) +{ + return xbitmap_set(&bitmap->rtbitmap, start, len); +} + +static inline int xrtb_bitmap_walk(struct xrtb_bitmap *bitmap, + xbitmap_walk_fn fn, void *priv) +{ + return xbitmap_walk(&bitmap->rtbitmap, fn, priv); +} + #endif /* __XFS_SCRUB_BITMAP_H__ */ diff --git a/fs/xfs/scrub/cow_repair.c b/fs/xfs/scrub/cow_repair.c index a0c1d97ab8b6..5605c4ecbdca 100644 --- a/fs/xfs/scrub/cow_repair.c +++ b/fs/xfs/scrub/cow_repair.c @@
-26,6 +26,9 @@ #include "xfs_errortag.h" #include "xfs_icache.h" #include "xfs_refcount_btree.h" +#include "xfs_rtalloc.h" +#include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -59,7 +62,10 @@ struct xrep_cow { struct xbitmap bad_fileoffs; /* Bitmap of fsblocks that were removed from the CoW fork. */ - struct xfsb_bitmap old_cowfork_fsblocks; + union { + struct xfsb_bitmap old_cowfork_fsblocks; + struct xrtb_bitmap old_cowfork_rtblocks; + }; /* CoW fork mappings used to scan for bad CoW staging extents. */ struct xfs_bmbt_irec irec; @@ -137,8 +143,12 @@ xrep_cow_mark_shared_staging( xrep_cow_trim_refcount(xc, &rrec, rec); - fsbno = XFS_AGB_TO_FSB(xc->sc->mp, cur->bc_ag.pag->pag_agno, - rrec.rc_startblock); + if (XFS_IS_REALTIME_INODE(xc->sc->ip)) + fsbno = xfs_rgbno_to_rtb(xc->sc->mp, cur->bc_ino.rtg->rtg_rgno, + rrec.rc_startblock); + else + fsbno = XFS_AGB_TO_FSB(xc->sc->mp, cur->bc_ag.pag->pag_agno, + rrec.rc_startblock); return xrep_cow_mark_file_range(xc, fsbno, rrec.rc_blockcount); } @@ -158,6 +168,7 @@ xrep_cow_mark_missing_staging( { struct xrep_cow *xc = priv; struct xfs_refcount_irec rrec; + xfs_fsblock_t fsbno; int error; if (!xfs_refcount_check_domain(rec) || @@ -169,9 +180,13 @@ xrep_cow_mark_missing_staging( if (xc->next_bno >= rrec.rc_startblock) goto next; - error = xrep_cow_mark_file_range(xc, - XFS_AGB_TO_FSB(xc->sc->mp, cur->bc_ag.pag->pag_agno, - xc->next_bno), + if (XFS_IS_REALTIME_INODE(xc->sc->ip)) + fsbno = xfs_rgbno_to_rtb(xc->sc->mp, cur->bc_ino.rtg->rtg_rgno, + xc->next_bno); + else + fsbno = XFS_AGB_TO_FSB(xc->sc->mp, cur->bc_ag.pag->pag_agno, + xc->next_bno); + error = xrep_cow_mark_file_range(xc, fsbno, rrec.rc_startblock - xc->next_bno); if (error) return error; @@ -214,7 +229,12 @@ xrep_cow_mark_missing_staging_rmap( rec_len -= adj; } - fsbno = XFS_AGB_TO_FSB(xc->sc->mp, cur->bc_ag.pag->pag_agno, rec_bno); + if (XFS_IS_REALTIME_INODE(xc->sc->ip)) + 
fsbno = xfs_rgbno_to_rtb(xc->sc->mp, cur->bc_ino.rtg->rtg_rgno, + rec_bno); + else + fsbno = XFS_AGB_TO_FSB(xc->sc->mp, cur->bc_ag.pag->pag_agno, + rec_bno); return xrep_cow_mark_file_range(xc, fsbno, rec_len); } @@ -303,6 +323,99 @@ xrep_cow_find_bad( return 0; } +/* + * Find any part of the CoW fork mapping that isn't a single-owner CoW staging + * extent and mark the corresponding part of the file range in the bitmap. + */ +STATIC int +xrep_cow_find_bad_rt( + struct xrep_cow *xc) +{ + struct xfs_refcount_irec rc_low = { 0 }; + struct xfs_refcount_irec rc_high = { 0 }; + struct xfs_rmap_irec rm_low = { 0 }; + struct xfs_rmap_irec rm_high = { 0 }; + struct xfs_scrub *sc = xc->sc; + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + int error = 0; + + xc->irec_startbno = xfs_rtb_to_rgbno(sc->mp, xc->irec.br_startblock, + &rgno); + + rtg = xfs_rtgroup_get(sc->mp, rgno); + if (!rtg) + return -EFSCORRUPTED; + + if (xrep_is_rtmeta_ino(sc, rtg, sc->ip->i_ino)) { + xfs_rtgroup_put(rtg); + goto out_rtg; + } + + error = xrep_rtgroup_init(sc, rtg, &sc->sr, + XFS_RTGLOCK_RMAP | XFS_RTGLOCK_REFCOUNT); + if (error) + goto out_rtg; + + /* Mark any CoW fork extents that are shared. */ + rc_low.rc_startblock = xc->irec_startbno; + rc_high.rc_startblock = xc->irec_startbno + xc->irec.br_blockcount - 1; + rc_low.rc_domain = rc_high.rc_domain = XFS_REFC_DOMAIN_SHARED; + error = xfs_refcount_query_range(sc->sr.refc_cur, &rc_low, &rc_high, + xrep_cow_mark_shared_staging, xc); + if (error) + goto out_sr; + + /* Make sure there are CoW staging extents for the whole mapping. 
*/ + rc_low.rc_startblock = xc->irec_startbno; + rc_high.rc_startblock = xc->irec_startbno + xc->irec.br_blockcount - 1; + rc_low.rc_domain = rc_high.rc_domain = XFS_REFC_DOMAIN_COW; + xc->next_bno = xc->irec_startbno; + error = xfs_refcount_query_range(sc->sr.refc_cur, &rc_low, &rc_high, + xrep_cow_mark_missing_staging, xc); + if (error) + goto out_sr; + + if (xc->next_bno < xc->irec_startbno + xc->irec.br_blockcount) { + error = xrep_cow_mark_file_range(xc, + xfs_rgbno_to_rtb(sc->mp, rtg->rtg_rgno, + xc->next_bno), + xc->irec_startbno + xc->irec.br_blockcount - + xc->next_bno); + if (error) + goto out_sr; + } + + /* Mark any area that has an rmap that isn't a COW staging extent. */ + rm_low.rm_startblock = xc->irec_startbno; + memset(&rm_high, 0xFF, sizeof(rm_high)); + rm_high.rm_startblock = xc->irec_startbno + xc->irec.br_blockcount - 1; + error = xfs_rmap_query_range(sc->sr.rmap_cur, &rm_low, &rm_high, + xrep_cow_mark_missing_staging_rmap, xc); + if (error) + goto out_sr; + + /* + * If userspace is forcing us to rebuild the CoW fork or someone + * turned on the debugging knob, replace everything in the + * CoW fork and then scan for staging extents in the refcountbt. + */ + if ((sc->sm->sm_flags & XFS_SCRUB_IFLAG_FORCE_REBUILD) || + XFS_TEST_ERROR(false, sc->mp, XFS_ERRTAG_FORCE_SCRUB_REPAIR)) { + error = xrep_cow_mark_file_range(xc, xc->irec.br_startblock, + xc->irec.br_blockcount); + if (error) + goto out_rtg; + } + +out_sr: + xchk_rtgroup_btcur_free(&sc->sr); + xchk_rtgroup_free(sc, &sc->sr); +out_rtg: + xfs_rtgroup_put(rtg); + return error; +} + /* * Allocate a replacement CoW staging extent of up to the given number of * blocks, and fill out the mapping. The caller must set irec->br_blockcount. @@ -343,6 +456,45 @@ xrep_cow_alloc( return 0; } +/* + * Allocate a replacement rt CoW staging extent of up to the given number of + * blocks, and fill out the mapping. The caller must set irec->br_blockcount.
+ */ +STATIC int +xrep_cow_alloc_rt( + struct xfs_scrub *sc, + struct xfs_bmbt_irec *irec) +{ + xfs_rtxnum_t rtx = NULLRTEXTNO; + xfs_rtxlen_t rtxlen = 0; + xfs_rtblock_t rtbno; + xfs_extlen_t len; + int error; + + ASSERT(sc->mp->m_sb.sb_rextsize == 1); + + error = xfs_trans_reserve_more(sc->tp, 0, irec->br_blockcount); + if (error) + return error; + + xfs_rtbitmap_lock(sc->tp, sc->mp); + + error = xfs_rtallocate_extent(sc->tp, 0, 1, irec->br_blockcount, + &rtxlen, 0, 1, &rtx); + if (error) + return error; + if (rtx == NULLRTEXTNO) + return -ENOSPC; + + rtbno = xfs_rtx_to_rtb(sc->mp, rtx); + len = xfs_rtxlen_to_extlen(sc->mp, rtxlen); + xfs_refcount_alloc_cow_extent(sc->tp, true, rtbno, len); + + irec->br_startblock = rtbno; + irec->br_blockcount = len; + return 0; +} + /* * Look up the current CoW fork mapping so that we only allocate enough to * replace a single mapping. If we don't find a mapping that covers the start @@ -514,7 +666,10 @@ xrep_cow_replace_one( * Allocate a replacement extent. If we don't fill all the blocks, * shorten the quantity that will be deleted in this step. */ - error = xrep_cow_alloc(sc, &rep); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrep_cow_alloc_rt(sc, &rep); + else + error = xrep_cow_alloc(sc, &rep); if (error) return error; @@ -531,8 +686,12 @@ xrep_cow_replace_one( return error; /* Note the old CoW staging extents; we'll reap them all later. 
*/ - error = xfsb_bitmap_set(&xc->old_cowfork_fsblocks, old_startblock, - rep.br_blockcount); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrtb_bitmap_set(&xc->old_cowfork_rtblocks, + old_startblock, rep.br_blockcount); + else + error = xfsb_bitmap_set(&xc->old_cowfork_fsblocks, + old_startblock, rep.br_blockcount); if (error) return error; @@ -588,8 +747,12 @@ xrep_bmap_cow( if (!ifp) return 0; - /* realtime files aren't supported yet */ - if (XFS_IS_REALTIME_INODE(sc->ip)) + /* + * Realtime files with large extent sizes are not supported because + * we could encounter a CoW mapping that has been partially written + * out *and* requires replacement, and there's no solution to that. + */ + if (XFS_IS_REALTIME_INODE(sc->ip) && sc->mp->m_sb.sb_rextsize != 1) return -EOPNOTSUPP; /* @@ -610,7 +773,10 @@ xrep_bmap_cow( xc->sc = sc; xbitmap_init(&xc->bad_fileoffs); - xfsb_bitmap_init(&xc->old_cowfork_fsblocks); + if (XFS_IS_REALTIME_INODE(sc->ip)) + xrtb_bitmap_init(&xc->old_cowfork_rtblocks); + else + xfsb_bitmap_init(&xc->old_cowfork_fsblocks); for_each_xfs_iext(ifp, &icur, &xc->irec) { if (xchk_should_terminate(sc, &error)) @@ -633,7 +799,10 @@ xrep_bmap_cow( if (xfs_bmap_is_written_extent(&xc->irec)) continue; - error = xrep_cow_find_bad(xc); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrep_cow_find_bad_rt(xc); + else + error = xrep_cow_find_bad(xc); if (error) goto out_bitmap; } @@ -648,13 +817,20 @@ xrep_bmap_cow( * by the refcount btree, not the inode, so it is correct to treat them * like inode metadata.
*/ - error = xrep_reap_fsblocks(sc, &xc->old_cowfork_fsblocks, - &XFS_RMAP_OINFO_COW, XFS_AG_RESV_NONE); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrep_reap_rtblocks(sc, &xc->old_cowfork_rtblocks, + &XFS_RMAP_OINFO_COW); + else + error = xrep_reap_fsblocks(sc, &xc->old_cowfork_fsblocks, + &XFS_RMAP_OINFO_COW, XFS_AG_RESV_NONE); if (error) goto out_bitmap; out_bitmap: - xfsb_bitmap_destroy(&xc->old_cowfork_fsblocks); + if (XFS_IS_REALTIME_INODE(sc->ip)) + xrtb_bitmap_destroy(&xc->old_cowfork_rtblocks); + else + xfsb_bitmap_destroy(&xc->old_cowfork_fsblocks); xbitmap_destroy(&xc->bad_fileoffs); kmem_free(xc); return error; diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c index 77354bdb0511..b5b5963f6d99 100644 --- a/fs/xfs/scrub/reap.c +++ b/fs/xfs/scrub/reap.c @@ -33,6 +33,8 @@ #include "xfs_attr.h" #include "xfs_attr_remote.h" #include "xfs_defer.h" +#include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -676,6 +678,223 @@ xrep_reap_fsblocks( return 0; } +#ifdef CONFIG_XFS_RT +/* Dispose of a single rtgroup extent. */ +STATIC int +xreap_rgextent( + struct xreap_state *rs, + xfs_rgblock_t rgbno, + xfs_extlen_t *rglenp, + bool crosslinked) +{ + struct xfs_scrub *sc = rs->sc; + xfs_rtblock_t rtbno; + + /* + * The only caller so far is CoW fork repair, so we only know how to + * unlink or free CoW staging extents. + */ + if (rs->oinfo != &XFS_RMAP_OINFO_COW) { + ASSERT(rs->oinfo == &XFS_RMAP_OINFO_COW); + return -EFSCORRUPTED; + } + ASSERT(rs->resv == XFS_AG_RESV_NONE); + + rtbno = xfs_rgbno_to_rtb(sc->mp, sc->sr.rtg->rtg_rgno, rgbno); + + /* + * If there are other rmappings, this block is cross linked and must + * not be freed. Remove the forward and reverse mapping and move on. 
+ * + * XXX: XFS doesn't support detecting the case where a single-block + * metadata structure is crosslinked with a multi-block structure + * because the buffer cache doesn't detect aliasing problems, so we + * can't fix 100% of crosslinking problems (yet). The verifiers will + * blow up on writeout, the filesystem will shut down, and the admin + * gets to run xfs_repair. + */ + if (crosslinked) { + trace_xreap_dispose_unmap_rtextent(sc->sr.rtg, rgbno, *rglenp); + + xfs_refcount_free_cow_extent(sc->tp, true, rtbno, *rglenp); + rs->deferred++; + return 0; + } + + trace_xreap_dispose_free_rtextent(sc->sr.rtg, rgbno, *rglenp); + + /* + * The CoW staging extent is not crosslinked. Use deferred work items + * to remove the refcountbt records (which removes the rmap records) + * and free the extent. We're not worried about the system going down + * here because log recovery walks the refcount btree to clean out the + * CoW staging extents. + */ + xfs_refcount_free_cow_extent(sc->tp, true, rtbno, *rglenp); + xfs_free_extent_later(sc->tp, rtbno, *rglenp, NULL, + XFS_FREE_EXTENT_REALTIME | + XFS_FREE_EXTENT_SKIP_DISCARD); + rs->deferred++; + return 0; +} + +/* + * Figure out the longest run of blocks that we can dispose of with a single + * call. Cross-linked blocks should have their reverse mappings removed, but + * single-owner extents can be freed. Units are rt blocks, not rt extents. + */ +STATIC int +xreap_rgextent_select( + struct xreap_state *rs, + xfs_rgblock_t rgbno, + xfs_rgblock_t rgbno_next, + bool *crosslinked, + xfs_extlen_t *rglenp) +{ + struct xfs_scrub *sc = rs->sc; + struct xfs_btree_cur *cur; + xfs_rgblock_t bno = rgbno + 1; + xfs_extlen_t len = 1; + int error; + + /* + * Determine if there are any other rmap records covering the first + * block of this extent. If so, the block is crosslinked.
+ */ + cur = xfs_rtrmapbt_init_cursor(sc->mp, sc->tp, sc->sr.rtg, + sc->sr.rtg->rtg_rmapip); + error = xfs_rmap_has_other_keys(cur, rgbno, 1, rs->oinfo, + crosslinked); + if (error) + goto out_cur; + + /* + * Figure out how many of the subsequent blocks have the same crosslink + * status. + */ + while (bno < rgbno_next) { + bool also_crosslinked; + + error = xfs_rmap_has_other_keys(cur, bno, 1, rs->oinfo, + &also_crosslinked); + if (error) + goto out_cur; + + if (*crosslinked != also_crosslinked) + break; + + len++; + bno++; + } + + *rglenp = len; + trace_xreap_rgextent_select(sc->sr.rtg, rgbno, len, *crosslinked); +out_cur: + xfs_btree_del_cursor(cur, error); + return error; +} + +#define XREAP_RTGLOCK_ALL (XFS_RTGLOCK_BITMAP | \ + XFS_RTGLOCK_RMAP | \ + XFS_RTGLOCK_REFCOUNT) + +/* + * Break a rt file metadata extent into sub-extents by fate (crosslinked, not + * crosslinked), and dispose of each sub-extent separately. The extent must + * be aligned to a realtime extent. + */ +STATIC int +xreap_rtmeta_extent( + uint64_t rtbno, + uint64_t len, + void *priv) +{ + struct xreap_state *rs = priv; + struct xfs_scrub *sc = rs->sc; + xfs_rgnumber_t rgno; + xfs_rgblock_t rgbno = xfs_rtb_to_rgbno(sc->mp, rtbno, &rgno); + xfs_rgblock_t rgbno_next = rgbno + len; + int error = 0; + + ASSERT(sc->ip != NULL); + ASSERT(!sc->sr.rtg); + + /* + * We're reaping blocks after repairing file metadata, which means that + * we have to set up the rtgroup scrub context (sc->sr) ourselves.
+ */ + sc->sr.rtg = xfs_rtgroup_get(sc->mp, rgno); + if (!sc->sr.rtg) + return -EFSCORRUPTED; + + xfs_rtgroup_lock(NULL, sc->sr.rtg, XREAP_RTGLOCK_ALL); + + while (rgbno < rgbno_next) { + xfs_extlen_t rglen; + bool crosslinked; + + error = xreap_rgextent_select(rs, rgbno, rgbno_next, + &crosslinked, &rglen); + if (error) + goto out_unlock; + + error = xreap_rgextent(rs, rgbno, &rglen, crosslinked); + if (error) + goto out_unlock; + + if (xreap_want_defer_finish(rs)) { + error = xfs_defer_finish(&sc->tp); + if (error) + goto out_unlock; + xreap_defer_finish_reset(rs); + } else if (xreap_want_roll(rs)) { + error = xfs_trans_roll_inode(&sc->tp, sc->ip); + if (error) + goto out_unlock; + xreap_reset(rs); + } + + rgbno += rglen; + } + +out_unlock: + xfs_rtgroup_unlock(sc->sr.rtg, XREAP_RTGLOCK_ALL); + xfs_rtgroup_put(sc->sr.rtg); + sc->sr.rtg = NULL; + return error; +} + +/* + * Dispose of every block of every rt metadata extent in the bitmap. + * Do not use this to dispose of the mappings in an ondisk inode fork. + */ +int +xrep_reap_rtblocks( + struct xfs_scrub *sc, + struct xrtb_bitmap *bitmap, + const struct xfs_owner_info *oinfo) +{ + struct xreap_state rs = { + .sc = sc, + .oinfo = oinfo, + .resv = XFS_AG_RESV_NONE, + }; + int error; + + ASSERT(xfs_has_rmapbt(sc->mp)); + ASSERT(sc->ip != NULL); + + error = xrtb_bitmap_walk(bitmap, xreap_rtmeta_extent, &rs); + if (error) + return error; + + if (xreap_dirty(&rs)) + return xrep_defer_finish(sc); + + return 0; +} +#endif /* CONFIG_XFS_RT */ + /* * Metadata files are not supposed to share blocks with anything else. 
* If blocks are shared, we remove the reverse mapping (thus reducing the diff --git a/fs/xfs/scrub/reap.h b/fs/xfs/scrub/reap.h index cfaef544f659..bf025ec3501b 100644 --- a/fs/xfs/scrub/reap.h +++ b/fs/xfs/scrub/reap.h @@ -12,6 +12,13 @@ int xrep_reap_fsblocks(struct xfs_scrub *sc, struct xfsb_bitmap *bitmap, const struct xfs_owner_info *oinfo, enum xfs_ag_resv_type type); int xrep_reap_ifork(struct xfs_scrub *sc, struct xfs_inode *ip, int whichfork); +#ifdef CONFIG_XFS_RT +int xrep_reap_rtblocks(struct xfs_scrub *sc, struct xrtb_bitmap *bitmap, + const struct xfs_owner_info *oinfo); +#else +# define xrep_reap_rtblocks(...) (-EOPNOTSUPP) +#endif /* CONFIG_XFS_RT */ + /* Buffer cache scan context. */ struct xrep_bufscan { /* Disk address for the buffers we want to scan. */ diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index aa15aeffa724..e2b75c449046 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -41,6 +41,7 @@ struct xbitmap; struct xagb_bitmap; struct xrgb_bitmap; struct xfsb_bitmap; +struct xrtb_bitmap; int xrep_fix_freelist(struct xfs_scrub *sc, int alloc_flags); diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index d74bba391854..4d8e4b77cbbe 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -1414,6 +1414,41 @@ DEFINE_REPAIR_EXTENT_EVENT(xreap_agextent_binval); DEFINE_REPAIR_EXTENT_EVENT(xreap_bmapi_binval); DEFINE_REPAIR_EXTENT_EVENT(xrep_agfl_insert); +#ifdef CONFIG_XFS_RT +DECLARE_EVENT_CLASS(xrep_rtgroup_extent_class, + TP_PROTO(struct xfs_rtgroup *rtg, xfs_rgblock_t rgbno, + xfs_extlen_t len), + TP_ARGS(rtg, rgbno, len), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(xfs_rgnumber_t, rgno) + __field(xfs_rgblock_t, rgbno) + __field(xfs_extlen_t, len) + ), + TP_fast_assign( + __entry->dev = rtg->rtg_mount->m_super->s_dev; + __entry->rtdev = rtg->rtg_mount->m_rtdev_targp->bt_dev; + __entry->rgno = rtg->rtg_rgno; + __entry->rgbno = rgbno; + __entry->len = len; + ), + 
TP_printk("dev %d:%d rtdev %d:%d rgno 0x%x rgbno 0x%x fsbcount 0x%x", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->rgno, + __entry->rgbno, + __entry->len) +); +#define DEFINE_REPAIR_RTGROUP_EXTENT_EVENT(name) \ +DEFINE_EVENT(xrep_rtgroup_extent_class, name, \ + TP_PROTO(struct xfs_rtgroup *rtg, xfs_rgblock_t rgbno, \ + xfs_extlen_t len), \ + TP_ARGS(rtg, rgbno, len)) +DEFINE_REPAIR_RTGROUP_EXTENT_EVENT(xreap_dispose_unmap_rtextent); +DEFINE_REPAIR_RTGROUP_EXTENT_EVENT(xreap_dispose_free_rtextent); +#endif /* CONFIG_XFS_RT */ + DECLARE_EVENT_CLASS(xrep_reap_find_class, TP_PROTO(struct xfs_perag *pag, xfs_agblock_t agbno, xfs_extlen_t len, bool crosslinked), @@ -1447,6 +1482,43 @@ DEFINE_EVENT(xrep_reap_find_class, name, \ DEFINE_REPAIR_REAP_FIND_EVENT(xreap_agextent_select); DEFINE_REPAIR_REAP_FIND_EVENT(xreap_bmapi_select); +#ifdef CONFIG_XFS_RT +DECLARE_EVENT_CLASS(xrep_rtgroup_reap_find_class, + TP_PROTO(struct xfs_rtgroup *rtg, xfs_rgblock_t rgbno, xfs_extlen_t len, + bool crosslinked), + TP_ARGS(rtg, rgbno, len, crosslinked), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(xfs_rgnumber_t, rgno) + __field(xfs_rgblock_t, rgbno) + __field(xfs_extlen_t, len) + __field(bool, crosslinked) + ), + TP_fast_assign( + __entry->dev = rtg->rtg_mount->m_super->s_dev; + __entry->rtdev = rtg->rtg_mount->m_rtdev_targp->bt_dev; + __entry->rgno = rtg->rtg_rgno; + __entry->rgbno = rgbno; + __entry->len = len; + __entry->crosslinked = crosslinked; + ), + TP_printk("dev %d:%d rtdev %d:%d rgno 0x%x rgbno 0x%x fsbcount 0x%x crosslinked %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->rgno, + __entry->rgbno, + __entry->len, + __entry->crosslinked ? 
1 : 0) +); +#define DEFINE_REPAIR_RTGROUP_REAP_FIND_EVENT(name) \ +DEFINE_EVENT(xrep_rtgroup_reap_find_class, name, \ + TP_PROTO(struct xfs_rtgroup *rtg, xfs_rgblock_t rgbno, \ + xfs_extlen_t len, bool crosslinked), \ + TP_ARGS(rtg, rgbno, len, crosslinked)) +DEFINE_REPAIR_RTGROUP_REAP_FIND_EVENT(xreap_rgextent_select); +#endif /* CONFIG_XFS_RT */ + DECLARE_EVENT_CLASS(xrep_rmap_class, TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, xfs_agblock_t agbno, xfs_extlen_t len, From patchwork Fri Dec 30 22:18:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085550 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23D32C4332F for ; Sat, 31 Dec 2022 01:59:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236104AbiLaB7B (ORCPT ); Fri, 30 Dec 2022 20:59:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51160 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236108AbiLaB7A (ORCPT ); Fri, 30 Dec 2022 20:59:00 -0500 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E30E1C438 for ; Fri, 30 Dec 2022 17:58:59 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 9D413CE19E8 for ; Sat, 31 Dec 2022 01:58:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DB578C433EF; Sat, 31 Dec 2022 01:58:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451935; 
bh=oiyPkmpUWRjM2JmUtdTrVUb9hXtJ8ft02Xa/fpdN8CE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=QGPGfxSdLBAqchUKOu1vG7tuwghk2uML0zosvE7irORwdWg0sg97PWB6NjlaJQIzS YQLP4B1fw8Y5PCr0Q41gN5LfAFrkLfBNNrNmu+ZLAoGLbJmfmmYnsnUZdUWcOyj/gp MJDPab/4pPi5VBvo+ND5GouG7qeJA67FrSp3MfngjeG9iFNcpoHVeEu3xsR0KNCGTN eb9Zlra69aINyVmc5N+Cv5neX0TR2t/IkAO+KdLM4evU2ziSzZzc4TKxoJt9Q4sMRi C3tQQ3yzEe5Oy12yHSEPzvufiaB05q4IjMCEpmcmePJjW9W58sTRHHJ1H2uD48slcd Qjtr2ce1MqopA== Subject: [PATCH 42/42] xfs: enable realtime reflink From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:34 -0800 Message-ID: <167243871493.717073.7033882150807622781.stgit@magnolia> In-Reply-To: <167243870849.717073.203452386730176902.stgit@magnolia> References: <167243870849.717073.203452386730176902.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Enable reflink for realtime devices, but only when the realtime extent size is a single block. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_super.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 4abeff701093..a3a0011272e5 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1656,14 +1656,27 @@ xfs_fs_fill_super( "EXPERIMENTAL realtime allocation group feature in use. Use at your own risk!"); if (xfs_has_reflink(mp)) { - if (mp->m_sb.sb_rblocks) { + /* + * Reflink doesn't support rt extent sizes larger than a single + * block because we would have to perform unshare-around for + * rtext-unaligned write requests.
+ */ + if (xfs_has_realtime(mp) && mp->m_sb.sb_rextsize != 1) { xfs_alert(mp, - "reflink not compatible with realtime device!"); + "reflink not compatible with realtime extent size %u!", + mp->m_sb.sb_rextsize); error = -EINVAL; goto out_filestream_unmount; } - if (xfs_globals.always_cow) { + /* + * always-cow mode is not supported on filesystems with rt + * extent sizes larger than a single block: remap requests + * must be aligned to an rt extent, so unaligned writes would + * need write-around. + */ + if (xfs_globals.always_cow && + (!xfs_has_realtime(mp) || mp->m_sb.sb_rextsize == 1)) { xfs_info(mp, "using DEBUG-only always_cow mode."); mp->m_always_cow = true; }