From patchwork Fri Dec 30 22:18:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085466 Subject: [PATCH 01/38] xfs: prepare 
rmap btree cursor tracepoints for realtime From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:16 -0800 Message-ID: <167243869615.715303.9064037493733102205.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Rework the rmap btree cursor tracepoints in preparation to handle the realtime rmap btree cursor. Mostly this involves renaming the field to "rmapbno" and extracting the group number from the cursor when possible. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_trace.c | 18 ++++++++++++++++ fs/xfs/xfs_trace.h | 58 +++++++++++++++++++++++++++------------------------- 2 files changed, 48 insertions(+), 28 deletions(-) diff --git a/fs/xfs/xfs_trace.c b/fs/xfs/xfs_trace.c index d7ede15110e8..ae35868e0638 100644 --- a/fs/xfs/xfs_trace.c +++ b/fs/xfs/xfs_trace.c @@ -45,6 +45,24 @@ #include "xfs_rtgroup.h" #include "xfs_rmap.h" +static inline void +xfs_rmapbt_crack_agno_opdev( + struct xfs_btree_cur *cur, + xfs_agnumber_t *agno, + dev_t *opdev) +{ + if (cur->bc_flags & XFS_BTREE_IN_MEMORY) { + *agno = 0; + *opdev = xfbtree_target(cur->bc_mem.xfbtree)->bt_dev; + } else if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) { + *agno = cur->bc_ino.rtg->rtg_rgno; + *opdev = cur->bc_mp->m_rtdev_targp->bt_dev; + } else { + *agno = cur->bc_ag.pag->pag_agno; + *opdev = cur->bc_mp->m_super->s_dev; + } +} + /* * We include this last to have the helpers above available for the trace * event implementations. 
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index fd067e1e28db..6bf7c2aa8e9d 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -14,11 +14,15 @@ * ino: filesystem inode number * * agbno: per-AG block number in fs blocks + * rgbno: per-rtgroup block number in fs blocks * startblock: physical block number for file mappings. This is either a * segmented fsblock for data device mappings, or a rfsblock * for realtime device mappings * fsbcount: number of blocks in an extent, in fs blocks * + * rmapbno: physical block number for a reverse mapping. This is an agbno for + * per-AG rmap btrees or a rgbno for realtime rmap btrees. + * * daddr: physical block number in 512b blocks * bbcount: number of blocks in a physical extent, in 512b blocks * @@ -2836,13 +2840,14 @@ DEFINE_DEFER_PENDING_ITEM_EVENT(xfs_defer_finish_item); /* rmap tracepoints */ DECLARE_EVENT_CLASS(xfs_rmap_class, TP_PROTO(struct xfs_btree_cur *cur, - xfs_agblock_t agbno, xfs_extlen_t len, bool unwritten, + xfs_agblock_t rmapbno, xfs_extlen_t len, bool unwritten, const struct xfs_owner_info *oinfo), - TP_ARGS(cur, agbno, len, unwritten, oinfo), + TP_ARGS(cur, rmapbno, len, unwritten, oinfo), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, rmapbno) __field(xfs_extlen_t, len) __field(uint64_t, owner) __field(uint64_t, offset) @@ -2850,8 +2855,8 @@ DECLARE_EVENT_CLASS(xfs_rmap_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; - __entry->agbno = agbno; + xfs_rmapbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); + __entry->rmapbno = rmapbno; __entry->len = len; __entry->owner = oinfo->oi_owner; __entry->offset = oinfo->oi_offset; @@ -2859,10 +2864,11 @@ DECLARE_EVENT_CLASS(xfs_rmap_class, if (unwritten) __entry->flags |= XFS_RMAP_UNWRITTEN; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x fsbcount 0x%x owner 0x%llx fileoff 
0x%llx flags 0x%lx", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x rmapbno 0x%x fsbcount 0x%x owner 0x%llx fileoff 0x%llx flags 0x%lx", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, - __entry->agbno, + __entry->rmapbno, __entry->len, __entry->owner, __entry->offset, @@ -2871,9 +2877,9 @@ DECLARE_EVENT_CLASS(xfs_rmap_class, #define DEFINE_RMAP_EVENT(name) \ DEFINE_EVENT(xfs_rmap_class, name, \ TP_PROTO(struct xfs_btree_cur *cur, \ - xfs_agblock_t agbno, xfs_extlen_t len, bool unwritten, \ + xfs_agblock_t rmapbno, xfs_extlen_t len, bool unwritten, \ const struct xfs_owner_info *oinfo), \ - TP_ARGS(cur, agbno, len, unwritten, oinfo)) + TP_ARGS(cur, rmapbno, len, unwritten, oinfo)) /* btree cursor error/%ip tracepoint class */ DECLARE_EVENT_CLASS(xfs_btree_error_class, @@ -2932,40 +2938,35 @@ TRACE_EVENT(xfs_rmap_convert_state, TP_ARGS(cur, state, caller_ip), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) - __field(xfs_ino_t, ino) __field(int, state) __field(unsigned long, caller_ip) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) { - __entry->agno = 0; - __entry->ino = cur->bc_ino.ip->i_ino; - } else { - __entry->agno = cur->bc_ag.pag->pag_agno; - __entry->ino = 0; - } + xfs_rmapbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); __entry->state = state; __entry->caller_ip = caller_ip; ), - TP_printk("dev %d:%d agno 0x%x ino 0x%llx state %d caller %pS", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x state %d caller %pS", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, - __entry->ino, __entry->state, (char *)__entry->caller_ip) ); DECLARE_EVENT_CLASS(xfs_rmapbt_class, TP_PROTO(struct xfs_btree_cur *cur, - xfs_agblock_t agbno, xfs_extlen_t len, + xfs_agblock_t rmapbno, xfs_extlen_t len, uint64_t owner, uint64_t offset, unsigned int flags), - 
TP_ARGS(cur, agbno, len, owner, offset, flags), + TP_ARGS(cur, rmapbno, len, owner, offset, flags), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(xfs_agnumber_t, agno) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, rmapbno) __field(xfs_extlen_t, len) __field(uint64_t, owner) __field(uint64_t, offset) @@ -2973,17 +2974,18 @@ DECLARE_EVENT_CLASS(xfs_rmapbt_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; - __entry->agno = cur->bc_ag.pag->pag_agno; - __entry->agbno = agbno; + xfs_rmapbt_crack_agno_opdev(cur, &__entry->agno, &__entry->opdev); + __entry->rmapbno = rmapbno; __entry->len = len; __entry->owner = owner; __entry->offset = offset; __entry->flags = flags; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x fsbcount 0x%x owner 0x%llx fileoff 0x%llx flags 0x%x", + TP_printk("dev %d:%d opdev %d:%d agno 0x%x rmapbno 0x%x fsbcount 0x%x owner 0x%llx fileoff 0x%llx flags 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, - __entry->agbno, + __entry->rmapbno, __entry->len, __entry->owner, __entry->offset, @@ -2992,9 +2994,9 @@ DECLARE_EVENT_CLASS(xfs_rmapbt_class, #define DEFINE_RMAPBT_EVENT(name) \ DEFINE_EVENT(xfs_rmapbt_class, name, \ TP_PROTO(struct xfs_btree_cur *cur, \ - xfs_agblock_t agbno, xfs_extlen_t len, \ + xfs_agblock_t rmapbno, xfs_extlen_t len, \ uint64_t owner, uint64_t offset, unsigned int flags), \ - TP_ARGS(cur, agbno, len, owner, offset, flags)) + TP_ARGS(cur, rmapbno, len, owner, offset, flags)) TRACE_DEFINE_ENUM(XFS_RMAP_MAP); TRACE_DEFINE_ENUM(XFS_RMAP_MAP_SHARED); From patchwork Fri Dec 30 22:18:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085467 Subject: [PATCH 02/38] xfs: simplify the xfs_rmap_{alloc,free}_extent calling conventions From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:16 -0800 Message-ID: <167243869629.715303.14412443667827419096.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Simplify the calling conventions by allowing callers to pass a fsbno (xfs_fsblock_t) directly into these functions, since we're just going to set it in a struct anyway. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_refcount.c | 6 ++---- fs/xfs/libxfs/xfs_rmap.c | 12 +++++------- fs/xfs/libxfs/xfs_rmap.h | 8 ++++---- fs/xfs/scrub/alloc_repair.c | 10 +++++++--- 4 files changed, 18 insertions(+), 18 deletions(-) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 2721c6076712..20c12cb7b7de 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -1889,8 +1889,7 @@ xfs_refcount_alloc_cow_extent( __xfs_refcount_add(tp, XFS_REFCOUNT_ALLOC_COW, fsb, len); /* Add rmap entry */ - xfs_rmap_alloc_extent(tp, XFS_FSB_TO_AGNO(mp, fsb), - XFS_FSB_TO_AGBNO(mp, fsb), len, XFS_RMAP_OWN_COW); + xfs_rmap_alloc_extent(tp, fsb, len, XFS_RMAP_OWN_COW); } /* Forget a CoW staging event in the refcount btree. 
*/ @@ -1906,8 +1905,7 @@ xfs_refcount_free_cow_extent( return; /* Remove rmap entry */ - xfs_rmap_free_extent(tp, XFS_FSB_TO_AGNO(mp, fsb), - XFS_FSB_TO_AGBNO(mp, fsb), len, XFS_RMAP_OWN_COW); + xfs_rmap_free_extent(tp, fsb, len, XFS_RMAP_OWN_COW); __xfs_refcount_add(tp, XFS_REFCOUNT_FREE_COW, fsb, len); } diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 9ad3e5077f34..a2a863e0c7fb 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -526,7 +526,7 @@ xfs_rmap_free_check_owner( struct xfs_btree_cur *cur, uint64_t ltoff, struct xfs_rmap_irec *rec, - xfs_filblks_t len, + xfs_extlen_t len, uint64_t owner, uint64_t offset, unsigned int flags) @@ -2745,8 +2745,7 @@ xfs_rmap_convert_extent( void xfs_rmap_alloc_extent( struct xfs_trans *tp, - xfs_agnumber_t agno, - xfs_agblock_t bno, + xfs_fsblock_t fsbno, xfs_extlen_t len, uint64_t owner) { @@ -2755,7 +2754,7 @@ xfs_rmap_alloc_extent( if (!xfs_rmap_update_is_needed(tp->t_mountp, XFS_DATA_FORK)) return; - bmap.br_startblock = XFS_AGB_TO_FSB(tp->t_mountp, agno, bno); + bmap.br_startblock = fsbno; bmap.br_blockcount = len; bmap.br_startoff = 0; bmap.br_state = XFS_EXT_NORM; @@ -2767,8 +2766,7 @@ xfs_rmap_alloc_extent( void xfs_rmap_free_extent( struct xfs_trans *tp, - xfs_agnumber_t agno, - xfs_agblock_t bno, + xfs_fsblock_t fsbno, xfs_extlen_t len, uint64_t owner) { @@ -2777,7 +2775,7 @@ xfs_rmap_free_extent( if (!xfs_rmap_update_is_needed(tp->t_mountp, XFS_DATA_FORK)) return; - bmap.br_startblock = XFS_AGB_TO_FSB(tp->t_mountp, agno, bno); + bmap.br_startblock = fsbno; bmap.br_blockcount = len; bmap.br_startoff = 0; bmap.br_state = XFS_EXT_NORM; diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h index 36af4de506c7..54c969731cf4 100644 --- a/fs/xfs/libxfs/xfs_rmap.h +++ b/fs/xfs/libxfs/xfs_rmap.h @@ -187,10 +187,10 @@ void xfs_rmap_unmap_extent(struct xfs_trans *tp, struct xfs_inode *ip, void xfs_rmap_convert_extent(struct xfs_mount *mp, struct xfs_trans *tp, struct 
xfs_inode *ip, int whichfork, struct xfs_bmbt_irec *imap); -void xfs_rmap_alloc_extent(struct xfs_trans *tp, xfs_agnumber_t agno, - xfs_agblock_t bno, xfs_extlen_t len, uint64_t owner); -void xfs_rmap_free_extent(struct xfs_trans *tp, xfs_agnumber_t agno, - xfs_agblock_t bno, xfs_extlen_t len, uint64_t owner); +void xfs_rmap_alloc_extent(struct xfs_trans *tp, xfs_fsblock_t fsbno, + xfs_extlen_t len, uint64_t owner); +void xfs_rmap_free_extent(struct xfs_trans *tp, xfs_fsblock_t fsbno, + xfs_extlen_t len, uint64_t owner); void xfs_rmap_finish_one_cleanup(struct xfs_trans *tp, struct xfs_btree_cur *rcur, int error); diff --git a/fs/xfs/scrub/alloc_repair.c b/fs/xfs/scrub/alloc_repair.c index 1e06ffe26029..6506fc202571 100644 --- a/fs/xfs/scrub/alloc_repair.c +++ b/fs/xfs/scrub/alloc_repair.c @@ -524,9 +524,13 @@ xrep_abt_dispose_one( ASSERT(pag == resv->pag); /* Add a deferred rmap for each extent we used. */ - if (resv->used > 0) - xfs_rmap_alloc_extent(sc->tp, pag->pag_agno, resv->agbno, - resv->used, XFS_RMAP_OWN_AG); + if (resv->used > 0) { + xfs_fsblock_t fsbno; + + fsbno = XFS_AGB_TO_FSB(sc->mp, pag->pag_agno, resv->agbno); + xfs_rmap_alloc_extent(sc->tp, fsbno, resv->used, + XFS_RMAP_OWN_AG); + } /* * For each reserved btree block we didn't use, add it to the free From patchwork Fri Dec 30 22:18:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085468 Subject: [PATCH 03/38] xfs: introduce realtime rmap btree definitions From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:16 -0800 Message-ID: <167243869642.715303.17832303961247536911.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Add new realtime rmap btree definitions. The realtime rmap btree will be rooted from a hidden inode, but has its own shape and therefore needs to have most of its own separate types. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_btree.h | 1 + fs/xfs/libxfs/xfs_format.h | 7 +++++++ fs/xfs/libxfs/xfs_types.h | 5 +++-- fs/xfs/scrub/trace.h | 1 + fs/xfs/xfs_trace.h | 1 + 5 files changed, 13 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 125f45731a54..ddaad83d4ff9 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -64,6 +64,7 @@ union xfs_btree_rec { #define XFS_BTNUM_RMAP ((xfs_btnum_t)XFS_BTNUM_RMAPi) #define XFS_BTNUM_REFC ((xfs_btnum_t)XFS_BTNUM_REFCi) #define XFS_BTNUM_RCBAG ((xfs_btnum_t)XFS_BTNUM_RCBAGi) +#define XFS_BTNUM_RTRMAP ((xfs_btnum_t)XFS_BTNUM_RTRMAPi) struct xfs_btree_ops; uint32_t xfs_btree_magic(struct xfs_mount *mp, const struct xfs_btree_ops *ops); diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index e4f3b2c5c054..b2d4ef28a480 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1727,6 +1727,13 @@ typedef __be32 xfs_rmap_ptr_t; XFS_FIBT_BLOCK(mp) + 1 : \ XFS_IBT_BLOCK(mp) + 1) +/* + * Realtime Reverse mapping btree format definitions + * + * This is a btree for reverse mapping records for realtime volumes + */ +#define XFS_RTRMAP_CRC_MAGIC 0x4d415052 /* 'MAPR' */ + /* * Reference Count Btree format definitions * diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h index 
d37f8a7ce5f8..e6a4f4a7d009 100644 --- a/fs/xfs/libxfs/xfs_types.h +++ b/fs/xfs/libxfs/xfs_types.h @@ -126,7 +126,7 @@ typedef enum { typedef enum { XFS_BTNUM_BNOi, XFS_BTNUM_CNTi, XFS_BTNUM_RMAPi, XFS_BTNUM_BMAPi, XFS_BTNUM_INOi, XFS_BTNUM_FINOi, XFS_BTNUM_REFCi, XFS_BTNUM_RCBAGi, - XFS_BTNUM_MAX + XFS_BTNUM_RTRMAPi, XFS_BTNUM_MAX } xfs_btnum_t; #define XFS_BTNUM_STRINGS \ @@ -137,7 +137,8 @@ typedef enum { { XFS_BTNUM_INOi, "inobt" }, \ { XFS_BTNUM_FINOi, "finobt" }, \ { XFS_BTNUM_REFCi, "refcbt" }, \ - { XFS_BTNUM_RCBAGi, "rcbagbt" } + { XFS_BTNUM_RCBAGi, "rcbagbt" }, \ + { XFS_BTNUM_RTRMAPi, "rtrmapbt" } struct xfs_name { const unsigned char *name; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 9a51eb404fae..cf1635e00cb0 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -42,6 +42,7 @@ TRACE_DEFINE_ENUM(XFS_BTNUM_FINOi); TRACE_DEFINE_ENUM(XFS_BTNUM_RMAPi); TRACE_DEFINE_ENUM(XFS_BTNUM_REFCi); TRACE_DEFINE_ENUM(XFS_BTNUM_RCBAGi); +TRACE_DEFINE_ENUM(XFS_BTNUM_RTRMAPi); TRACE_DEFINE_ENUM(XFS_REFC_DOMAIN_SHARED); TRACE_DEFINE_ENUM(XFS_REFC_DOMAIN_COW); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 6bf7c2aa8e9d..390aa7a4afae 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -2555,6 +2555,7 @@ TRACE_DEFINE_ENUM(XFS_BTNUM_FINOi); TRACE_DEFINE_ENUM(XFS_BTNUM_RMAPi); TRACE_DEFINE_ENUM(XFS_BTNUM_REFCi); TRACE_DEFINE_ENUM(XFS_BTNUM_RCBAGi); +TRACE_DEFINE_ENUM(XFS_BTNUM_RTRMAPi); DECLARE_EVENT_CLASS(xfs_btree_cur_class, TP_PROTO(struct xfs_btree_cur *cur, int level, struct xfs_buf *bp), From patchwork Fri Dec 30 22:18:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085469 Subject: [PATCH 04/38] xfs: define the on-disk realtime rmap btree format From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:16 -0800 Message-ID: <167243869657.715303.4219727068304981370.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Start filling out the rtrmap btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate rmap btree blocks. This prepares the way for connecting the btree operations implementation. Signed-off-by: Darrick J. Wong --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_btree.c | 6 + fs/xfs/libxfs/xfs_format.h | 3 fs/xfs/libxfs/xfs_rtrmap_btree.c | 306 ++++++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrmap_btree.h | 83 ++++++++++ fs/xfs/libxfs/xfs_sb.c | 6 + fs/xfs/libxfs/xfs_shared.h | 2 fs/xfs/xfs_mount.c | 5 - fs/xfs/xfs_mount.h | 9 + fs/xfs/xfs_ondisk.h | 1 10 files changed, 420 insertions(+), 2 deletions(-) create mode 100644 fs/xfs/libxfs/xfs_rtrmap_btree.c create mode 100644 fs/xfs/libxfs/xfs_rtrmap_btree.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 4bf6d663272b..84934538bf52 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -47,6 +47,7 @@ xfs-y += $(addprefix libxfs/, \ xfs_rmap_btree.o \ xfs_refcount.o \ xfs_refcount_btree.o \ + xfs_rtrmap_btree.o \ xfs_sb.o \ xfs_swapext.o \ xfs_symlink_remote.o \ diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index c02748e16075..4f1f03b207d3 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -32,6 +32,7 @@ #include "scrub/xfbtree.h" #include "xfs_btree_mem.h" #include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" /* * Btree magic numbers. 
@@ -1377,6 +1378,7 @@ xfs_btree_set_refs( xfs_buf_set_ref(bp, XFS_BMAP_BTREE_REF); break; case XFS_BTNUM_RMAP: + case XFS_BTNUM_RTRMAP: xfs_buf_set_ref(bp, XFS_RMAP_BTREE_REF); break; case XFS_BTNUM_REFC: @@ -5537,6 +5539,9 @@ xfs_btree_init_cur_caches(void) if (error) goto err; error = xfs_refcountbt_init_cur_cache(); + if (error) + goto err; + error = xfs_rtrmapbt_init_cur_cache(); if (error) goto err; @@ -5555,6 +5560,7 @@ xfs_btree_destroy_cur_caches(void) xfs_bmbt_destroy_cur_cache(); xfs_rmapbt_destroy_cur_cache(); xfs_refcountbt_destroy_cur_cache(); + xfs_rtrmapbt_destroy_cur_cache(); } /* Move the btree cursor before the first record. */ diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index b2d4ef28a480..fb727e1e4072 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1734,6 +1734,9 @@ typedef __be32 xfs_rmap_ptr_t; */ #define XFS_RTRMAP_CRC_MAGIC 0x4d415052 /* 'MAPR' */ +/* inode-based btree pointer type */ +typedef __be64 xfs_rtrmap_ptr_t; + /* * Reference Count Btree format definitions * diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c new file mode 100644 index 000000000000..7f6ba2efdaf2 --- /dev/null +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -0,0 +1,306 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. 
Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_log_format.h" +#include "xfs_trans_resv.h" +#include "xfs_bit.h" +#include "xfs_sb.h" +#include "xfs_mount.h" +#include "xfs_defer.h" +#include "xfs_inode.h" +#include "xfs_trans.h" +#include "xfs_alloc.h" +#include "xfs_btree.h" +#include "xfs_btree_staging.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_trace.h" +#include "xfs_cksum.h" +#include "xfs_error.h" +#include "xfs_extent_busy.h" +#include "xfs_rtgroup.h" + +static struct kmem_cache *xfs_rtrmapbt_cur_cache; + +/* + * Realtime Reverse Map btree. + * + * This is a btree used to track the owner(s) of a given extent in the realtime + * device. See the comments in xfs_rmap_btree.c for more information. + * + * This tree is basically the same as the regular rmap btree except that it + * is rooted in an inode and does not live in free space. + */ + +static struct xfs_btree_cur * +xfs_rtrmapbt_dup_cursor( + struct xfs_btree_cur *cur) +{ + struct xfs_btree_cur *new; + + new = xfs_rtrmapbt_init_cursor(cur->bc_mp, cur->bc_tp, cur->bc_ino.rtg, + cur->bc_ino.ip); + + /* Copy the flags values since init cursor doesn't get them. 
*/ + new->bc_ino.flags = cur->bc_ino.flags; + + return new; +} + +static xfs_failaddr_t +xfs_rtrmapbt_verify( + struct xfs_buf *bp) +{ + struct xfs_mount *mp = bp->b_target->bt_mount; + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_failaddr_t fa; + int level; + + if (!xfs_verify_magic(bp, block->bb_magic)) + return __this_address; + + if (!xfs_has_rmapbt(mp)) + return __this_address; + fa = xfs_btree_lblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN); + if (fa) + return fa; + level = be16_to_cpu(block->bb_level); + if (level > mp->m_rtrmap_maxlevels) + return __this_address; + + return xfs_btree_lblock_verify(bp, mp->m_rtrmap_mxr[level != 0]); +} + +static void +xfs_rtrmapbt_read_verify( + struct xfs_buf *bp) +{ + xfs_failaddr_t fa; + + if (!xfs_btree_lblock_verify_crc(bp)) + xfs_verifier_error(bp, -EFSBADCRC, __this_address); + else { + fa = xfs_rtrmapbt_verify(bp); + if (fa) + xfs_verifier_error(bp, -EFSCORRUPTED, fa); + } + + if (bp->b_error) + trace_xfs_btree_corrupt(bp, _RET_IP_); +} + +static void +xfs_rtrmapbt_write_verify( + struct xfs_buf *bp) +{ + xfs_failaddr_t fa; + + fa = xfs_rtrmapbt_verify(bp); + if (fa) { + trace_xfs_btree_corrupt(bp, _RET_IP_); + xfs_verifier_error(bp, -EFSCORRUPTED, fa); + return; + } + xfs_btree_lblock_calc_crc(bp); + +} + +const struct xfs_buf_ops xfs_rtrmapbt_buf_ops = { + .name = "xfs_rtrmapbt", + .magic = { 0, cpu_to_be32(XFS_RTRMAP_CRC_MAGIC) }, + .verify_read = xfs_rtrmapbt_read_verify, + .verify_write = xfs_rtrmapbt_write_verify, + .verify_struct = xfs_rtrmapbt_verify, +}; + +const struct xfs_btree_ops xfs_rtrmapbt_ops = { + .rec_len = sizeof(struct xfs_rmap_rec), + .key_len = 2 * sizeof(struct xfs_rmap_key), + .geom_flags = XFS_BTREE_LONG_PTRS | XFS_BTREE_ROOT_IN_INODE | + XFS_BTREE_CRC_BLOCKS | XFS_BTREE_OVERLAPPING | + XFS_BTREE_IROOT_RECORDS, + + .dup_cursor = xfs_rtrmapbt_dup_cursor, + .buf_ops = &xfs_rtrmapbt_buf_ops, +}; + +/* Initialize a new rt rmap btree cursor. 
*/ +static struct xfs_btree_cur * +xfs_rtrmapbt_init_common( + struct xfs_mount *mp, + struct xfs_trans *tp, + struct xfs_rtgroup *rtg, + struct xfs_inode *ip) +{ + struct xfs_btree_cur *cur; + + ASSERT(xfs_isilocked(ip, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL)); + + cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_RTRMAP, + &xfs_rtrmapbt_ops, mp->m_rtrmap_maxlevels, + xfs_rtrmapbt_cur_cache); + cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_rmap_2); + + cur->bc_ino.ip = ip; + cur->bc_ino.allocated = 0; + cur->bc_ino.flags = 0; + + cur->bc_ino.rtg = xfs_rtgroup_bump(rtg); + return cur; +} + +/* Allocate a new rt rmap btree cursor. */ +struct xfs_btree_cur * +xfs_rtrmapbt_init_cursor( + struct xfs_mount *mp, + struct xfs_trans *tp, + struct xfs_rtgroup *rtg, + struct xfs_inode *ip) +{ + struct xfs_btree_cur *cur; + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + + cur = xfs_rtrmapbt_init_common(mp, tp, rtg, ip); + cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1; + cur->bc_ino.forksize = xfs_inode_fork_size(ip, XFS_DATA_FORK); + cur->bc_ino.whichfork = XFS_DATA_FORK; + return cur; +} + +/* Create a new rt reverse mapping btree cursor with a fake root for staging. */ +struct xfs_btree_cur * +xfs_rtrmapbt_stage_cursor( + struct xfs_mount *mp, + struct xfs_rtgroup *rtg, + struct xfs_inode *ip, + struct xbtree_ifakeroot *ifake) +{ + struct xfs_btree_cur *cur; + + cur = xfs_rtrmapbt_init_common(mp, NULL, rtg, ip); + cur->bc_nlevels = ifake->if_levels; + cur->bc_ino.forksize = ifake->if_fork_size; + cur->bc_ino.whichfork = -1; + xfs_btree_stage_ifakeroot(cur, ifake, NULL); + return cur; +} + +/* + * Install a new rt reverse mapping btree root. Caller is responsible for + * invalidating and freeing the old btree blocks. 
+ */ +void +xfs_rtrmapbt_commit_staged_btree( + struct xfs_btree_cur *cur, + struct xfs_trans *tp) +{ + struct xbtree_ifakeroot *ifake = cur->bc_ino.ifake; + struct xfs_ifork *ifp; + int flags = XFS_ILOG_CORE | XFS_ILOG_DBROOT; + + ASSERT(cur->bc_flags & XFS_BTREE_STAGING); + + /* + * Free any resources hanging off the real fork, then shallow-copy the + * staging fork's contents into the real fork to transfer everything + * we just built. + */ + ifp = xfs_ifork_ptr(cur->bc_ino.ip, XFS_DATA_FORK); + xfs_idestroy_fork(ifp); + memcpy(ifp, ifake->if_fork, sizeof(struct xfs_ifork)); + + xfs_trans_log_inode(tp, cur->bc_ino.ip, flags); + xfs_btree_commit_ifakeroot(cur, tp, XFS_DATA_FORK, &xfs_rtrmapbt_ops); +} + +/* Calculate number of records in a rt reverse mapping btree block. */ +static inline unsigned int +xfs_rtrmapbt_block_maxrecs( + unsigned int blocklen, + bool leaf) +{ + if (leaf) + return blocklen / sizeof(struct xfs_rmap_rec); + return blocklen / + (2 * sizeof(struct xfs_rmap_key) + sizeof(xfs_rtrmap_ptr_t)); +} + +/* + * Calculate number of records in an rt reverse mapping btree block. + */ +unsigned int +xfs_rtrmapbt_maxrecs( + struct xfs_mount *mp, + unsigned int blocklen, + bool leaf) +{ + blocklen -= XFS_RTRMAP_BLOCK_LEN; + return xfs_rtrmapbt_block_maxrecs(blocklen, leaf); +} + +/* Compute the max possible height for realtime reverse mapping btrees. */ +unsigned int +xfs_rtrmapbt_maxlevels_ondisk(void) +{ + unsigned int minrecs[2]; + unsigned int blocklen; + + blocklen = XFS_MIN_CRC_BLOCKSIZE - XFS_BTREE_LBLOCK_CRC_LEN; + + minrecs[0] = xfs_rtrmapbt_block_maxrecs(blocklen, true) / 2; + minrecs[1] = xfs_rtrmapbt_block_maxrecs(blocklen, false) / 2; + + /* We need at most one record for every block in an rt group. 
*/ + return xfs_btree_compute_maxlevels(minrecs, XFS_MAX_RGBLOCKS); +} + +int __init +xfs_rtrmapbt_init_cur_cache(void) +{ + xfs_rtrmapbt_cur_cache = kmem_cache_create("xfs_rtrmapbt_cur", + xfs_btree_cur_sizeof(xfs_rtrmapbt_maxlevels_ondisk()), + 0, 0, NULL); + + if (!xfs_rtrmapbt_cur_cache) + return -ENOMEM; + return 0; +} + +void +xfs_rtrmapbt_destroy_cur_cache(void) +{ + kmem_cache_destroy(xfs_rtrmapbt_cur_cache); + xfs_rtrmapbt_cur_cache = NULL; +} + +/* Compute the maximum height of an rt reverse mapping btree. */ +void +xfs_rtrmapbt_compute_maxlevels( + struct xfs_mount *mp) +{ + unsigned int d_maxlevels, r_maxlevels; + + if (!xfs_has_rtrmapbt(mp)) { + mp->m_rtrmap_maxlevels = 0; + return; + } + + /* + * The realtime rmapbt lives on the data device, which means that its + * maximum height is constrained by the size of the data device and + * the height required to store one rmap record for each block in an + * rt group. + */ + d_maxlevels = xfs_btree_space_to_height(mp->m_rtrmap_mnr, + mp->m_sb.sb_dblocks); + r_maxlevels = xfs_btree_compute_maxlevels(mp->m_rtrmap_mnr, + mp->m_sb.sb_rgblocks); + + /* Add one level to handle the inode root level. */ + mp->m_rtrmap_maxlevels = min(d_maxlevels, r_maxlevels) + 1; +} diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.h b/fs/xfs/libxfs/xfs_rtrmap_btree.h new file mode 100644 index 000000000000..7380c04e7705 --- /dev/null +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.h @@ -0,0 +1,83 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. 
Wong + */ +#ifndef __XFS_RTRMAP_BTREE_H__ +#define __XFS_RTRMAP_BTREE_H__ + +struct xfs_buf; +struct xfs_btree_cur; +struct xfs_mount; +struct xbtree_ifakeroot; +struct xfs_rtgroup; + +/* rmaps only exist on crc enabled filesystems */ +#define XFS_RTRMAP_BLOCK_LEN XFS_BTREE_LBLOCK_CRC_LEN + +struct xfs_btree_cur *xfs_rtrmapbt_init_cursor(struct xfs_mount *mp, + struct xfs_trans *tp, struct xfs_rtgroup *rtg, + struct xfs_inode *ip); +struct xfs_btree_cur *xfs_rtrmapbt_stage_cursor(struct xfs_mount *mp, + struct xfs_rtgroup *rtg, struct xfs_inode *ip, + struct xbtree_ifakeroot *ifake); +void xfs_rtrmapbt_commit_staged_btree(struct xfs_btree_cur *cur, + struct xfs_trans *tp); +unsigned int xfs_rtrmapbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen, + bool leaf); +void xfs_rtrmapbt_compute_maxlevels(struct xfs_mount *mp); + +/* + * Addresses of records, keys, and pointers within an incore rtrmapbt block. + * + * (note that some of these may appear unused, but they are used in userspace) + */ +static inline struct xfs_rmap_rec * +xfs_rtrmap_rec_addr( + struct xfs_btree_block *block, + unsigned int index) +{ + return (struct xfs_rmap_rec *) + ((char *)block + XFS_RTRMAP_BLOCK_LEN + + (index - 1) * sizeof(struct xfs_rmap_rec)); +} + +static inline struct xfs_rmap_key * +xfs_rtrmap_key_addr( + struct xfs_btree_block *block, + unsigned int index) +{ + return (struct xfs_rmap_key *) + ((char *)block + XFS_RTRMAP_BLOCK_LEN + + (index - 1) * 2 * sizeof(struct xfs_rmap_key)); +} + +static inline struct xfs_rmap_key * +xfs_rtrmap_high_key_addr( + struct xfs_btree_block *block, + unsigned int index) +{ + return (struct xfs_rmap_key *) + ((char *)block + XFS_RTRMAP_BLOCK_LEN + + sizeof(struct xfs_rmap_key) + + (index - 1) * 2 * sizeof(struct xfs_rmap_key)); +} + +static inline xfs_rtrmap_ptr_t * +xfs_rtrmap_ptr_addr( + struct xfs_btree_block *block, + unsigned int index, + unsigned int maxrecs) +{ + return (xfs_rtrmap_ptr_t *) + ((char *)block + XFS_RTRMAP_BLOCK_LEN + + 
maxrecs * 2 * sizeof(struct xfs_rmap_key) + + (index - 1) * sizeof(xfs_rtrmap_ptr_t)); +} + +unsigned int xfs_rtrmapbt_maxlevels_ondisk(void); + +int __init xfs_rtrmapbt_init_cur_cache(void); +void xfs_rtrmapbt_destroy_cur_cache(void); + +#endif /* __XFS_RTRMAP_BTREE_H__ */ diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 54f93e1b0f00..570919c223c9 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -27,6 +27,7 @@ #include "xfs_ag.h" #include "xfs_swapext.h" #include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" /* * Physical superblock buffer manipulations. Shared with libxfs in userspace. @@ -1064,6 +1065,11 @@ xfs_sb_mount_common( mp->m_rmap_mnr[0] = mp->m_rmap_mxr[0] / 2; mp->m_rmap_mnr[1] = mp->m_rmap_mxr[1] / 2; + mp->m_rtrmap_mxr[0] = xfs_rtrmapbt_maxrecs(mp, sbp->sb_blocksize, true); + mp->m_rtrmap_mxr[1] = xfs_rtrmapbt_maxrecs(mp, sbp->sb_blocksize, false); + mp->m_rtrmap_mnr[0] = mp->m_rtrmap_mxr[0] / 2; + mp->m_rtrmap_mnr[1] = mp->m_rtrmap_mxr[1] / 2; + mp->m_refc_mxr[0] = xfs_refcountbt_maxrecs(mp, sbp->sb_blocksize, true); mp->m_refc_mxr[1] = xfs_refcountbt_maxrecs(mp, sbp->sb_blocksize, false); mp->m_refc_mnr[0] = mp->m_refc_mxr[0] / 2; diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h index 62839fc87b50..31c577a94295 100644 --- a/fs/xfs/libxfs/xfs_shared.h +++ b/fs/xfs/libxfs/xfs_shared.h @@ -42,6 +42,7 @@ extern const struct xfs_buf_ops xfs_rtbitmap_buf_ops; extern const struct xfs_buf_ops xfs_rtsummary_buf_ops; extern const struct xfs_buf_ops xfs_rtbuf_ops; extern const struct xfs_buf_ops xfs_rtsb_buf_ops; +extern const struct xfs_buf_ops xfs_rtrmapbt_buf_ops; extern const struct xfs_buf_ops xfs_sb_buf_ops; extern const struct xfs_buf_ops xfs_sb_quiet_buf_ops; extern const struct xfs_buf_ops xfs_symlink_buf_ops; @@ -54,6 +55,7 @@ extern const struct xfs_btree_ops xfs_finobt_ops; extern const struct xfs_btree_ops xfs_bmbt_ops; extern const struct xfs_btree_ops xfs_refcountbt_ops; extern const 
struct xfs_btree_ops xfs_rmapbt_ops; +extern const struct xfs_btree_ops xfs_rtrmapbt_ops; /* log size calculation functions */ int xfs_log_calc_unit_res(struct xfs_mount *mp, int unit_bytes); diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index d94d44f40be4..1d2403b93f58 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -36,6 +36,7 @@ #include "xfs_ag.h" #include "xfs_imeta.h" #include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" static DEFINE_MUTEX(xfs_uuid_table_mutex); static int xfs_uuid_table_size; @@ -654,8 +655,7 @@ static inline void xfs_rtbtree_compute_maxlevels( struct xfs_mount *mp) { - /* This will be filled in later. */ - mp->m_rtbtree_maxlevels = 0; + mp->m_rtbtree_maxlevels = mp->m_rtrmap_maxlevels; } /* @@ -727,6 +727,7 @@ xfs_mountfs( xfs_bmap_compute_maxlevels(mp, XFS_ATTR_FORK); xfs_mount_setup_inode_geom(mp); xfs_rmapbt_compute_maxlevels(mp); + xfs_rtrmapbt_compute_maxlevels(mp); xfs_refcountbt_compute_maxlevels(mp); xfs_agbtree_compute_maxlevels(mp); diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index 55e6e30f9045..a565b1b1372a 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -131,11 +131,14 @@ typedef struct xfs_mount { uint m_bmap_dmnr[2]; /* min bmap btree records */ uint m_rmap_mxr[2]; /* max rmap btree records */ uint m_rmap_mnr[2]; /* min rmap btree records */ + uint m_rtrmap_mxr[2]; /* max rtrmap btree records */ + uint m_rtrmap_mnr[2]; /* min rtrmap btree records */ uint m_refc_mxr[2]; /* max refc btree records */ uint m_refc_mnr[2]; /* min refc btree records */ uint m_alloc_maxlevels; /* max alloc btree levels */ uint m_bm_maxlevels[2]; /* max bmap btree levels */ uint m_rmap_maxlevels; /* max rmap btree levels */ + uint m_rtrmap_maxlevels; /* max rtrmap btree level */ uint m_refc_maxlevels; /* max refcount btree level */ unsigned int m_agbtree_maxlevels; /* max level of all AG btrees */ unsigned int m_rtbtree_maxlevels; /* max level of all rt btrees */ @@ -359,6 +362,12 @@ 
__XFS_HAS_FEAT(large_extent_counts, NREXT64) __XFS_HAS_FEAT(metadir, METADIR) __XFS_HAS_FEAT(rtgroups, RTGROUPS) +static inline bool xfs_has_rtrmapbt(struct xfs_mount *mp) +{ + return xfs_has_rtgroups(mp) && xfs_has_realtime(mp) && + xfs_has_rmapbt(mp); +} + /* * Mount features * diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h index 17e541d35194..35d0695fbf57 100644 --- a/fs/xfs/xfs_ondisk.h +++ b/fs/xfs/xfs_ondisk.h @@ -77,6 +77,7 @@ xfs_check_ondisk_structs(void) XFS_CHECK_STRUCT_SIZE(union xfs_rtword_ondisk, 4); XFS_CHECK_STRUCT_SIZE(union xfs_suminfo_ondisk, 4); XFS_CHECK_STRUCT_SIZE(struct xfs_rtbuf_blkinfo, 48); + XFS_CHECK_STRUCT_SIZE(xfs_rtrmap_ptr_t, 8); /* * m68k has problems with xfs_attr_leaf_name_remote_t, but we pad it to From patchwork Fri Dec 30 22:18:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085470 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD509C4332F for ; Sat, 31 Dec 2022 01:38:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236108AbiLaBiK (ORCPT ); Fri, 30 Dec 2022 20:38:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236113AbiLaBiK (ORCPT ); Fri, 30 Dec 2022 20:38:10 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6CA8913DD9 for ; Fri, 30 Dec 2022 17:38:09 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 093A461CC6 
for ; Sat, 31 Dec 2022 01:38:09 +0000 (UTC) Subject: [PATCH 05/38] xfs: realtime rmap btree transaction reservations From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:16 -0800 Message-ID: <167243869672.715303.1143852770313398883.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Make sure that there's enough log reservation to handle mapping and unmapping realtime extents. We have to reserve enough space to handle a split in the rtrmapbt to add the record and a second split in the regular rmapbt to record the rtrmapbt split. Signed-off-by: Darrick J.
Wong --- fs/xfs/libxfs/xfs_swapext.c | 4 +++- fs/xfs/libxfs/xfs_trans_resv.c | 12 ++++++++++-- fs/xfs/libxfs/xfs_trans_space.h | 13 +++++++++++++ 3 files changed, 26 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_swapext.c b/fs/xfs/libxfs/xfs_swapext.c index 36f03b0bf4ed..9d2ad2a680f8 100644 --- a/fs/xfs/libxfs/xfs_swapext.c +++ b/fs/xfs/libxfs/xfs_swapext.c @@ -702,7 +702,9 @@ xfs_swapext_rmapbt_blocks( if (!xfs_has_rmapbt(mp)) return 0; if (XFS_IS_REALTIME_INODE(req->ip1)) - return 0; + return howmany_64(req->nr_exchanges, + XFS_MAX_CONTIG_RTRMAPS_PER_BLOCK(mp)) * + XFS_RTRMAPADD_SPACE_RES(mp); return howmany_64(req->nr_exchanges, XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp)) * diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 3435492b1658..52a4386a3d96 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -211,7 +211,9 @@ xfs_calc_inode_chunk_res( * Per-extent log reservation for the btree changes involved in freeing or * allocating a realtime extent. We have to be able to log as many rtbitmap * blocks as needed to mark inuse XFS_BMBT_MAX_EXTLEN blocks' worth of realtime - * extents, as well as the realtime summary block. + * extents, as well as the realtime summary block (t1). Realtime rmap btree + * operations happen in a second transaction, so factor in a couple of rtrmapbt + * splits (t2). 
*/ static unsigned int xfs_rtalloc_block_count( @@ -220,10 +222,16 @@ xfs_rtalloc_block_count( { unsigned int rtbmp_blocks; xfs_rtxlen_t rtxlen; + unsigned int t1, t2 = 0; rtxlen = xfs_extlen_to_rtxlen(mp, XFS_MAX_BMBT_EXTLEN); rtbmp_blocks = xfs_rtbitmap_blockcount(mp, rtxlen); - return (rtbmp_blocks + 1) * num_ops; + t1 = (rtbmp_blocks + 1) * num_ops; + + if (xfs_has_rmapbt(mp)) + t2 = num_ops * (2 * mp->m_rtrmap_maxlevels - 1); + + return max(t1, t2); } /* diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h index 9640fc232c14..8124893a035d 100644 --- a/fs/xfs/libxfs/xfs_trans_space.h +++ b/fs/xfs/libxfs/xfs_trans_space.h @@ -14,6 +14,19 @@ #define XFS_MAX_CONTIG_BMAPS_PER_BLOCK(mp) \ (((mp)->m_bmap_dmxr[0]) - ((mp)->m_bmap_dmnr[0])) +/* Worst case number of realtime rmaps that can be held in a block. */ +#define XFS_MAX_CONTIG_RTRMAPS_PER_BLOCK(mp) \ + (((mp)->m_rtrmap_mxr[0]) - ((mp)->m_rtrmap_mnr[0])) + +/* Adding one realtime rmap could split every level to the top of the tree. */ +#define XFS_RTRMAPADD_SPACE_RES(mp) ((mp)->m_rtrmap_maxlevels) + +/* Blocks we might need to add "b" realtime rmaps to a tree. */ +#define XFS_NRTRMAPADD_SPACE_RES(mp, b) \ + ((((b) + XFS_MAX_CONTIG_RTRMAPS_PER_BLOCK(mp) - 1) / \ + XFS_MAX_CONTIG_RTRMAPS_PER_BLOCK(mp)) * \ + XFS_RTRMAPADD_SPACE_RES(mp)) + /* Worst case number of rmaps that can be held in a block. */ #define XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) \ (((mp)->m_rmap_mxr[0]) - ((mp)->m_rmap_mnr[0])) From patchwork Fri Dec 30 22:18:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085471 Subject: [PATCH 06/38] xfs: add realtime rmap btree operations From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:16 -0800 Message-ID: <167243869686.715303.6967085140781527270.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Implement the generic btree operations needed to manipulate rtrmap btree blocks. This is different from the regular rmapbt in that we allocate space from the filesystem at large, and are neither constrained to the free space nor any particular AG. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_btree.c | 113 ++++++++++++++++ fs/xfs/libxfs/xfs_btree.h | 5 + fs/xfs/libxfs/xfs_imeta.c | 6 + fs/xfs/libxfs/xfs_rtrmap_btree.c | 271 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 395 insertions(+) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 4f1f03b207d3..fe742567a7dd 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -33,6 +33,10 @@ #include "xfs_btree_mem.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_bmap.h" +#include "xfs_rmap.h" +#include "xfs_quota.h" +#include "xfs_imeta.h" /* * Btree magic numbers. @@ -5589,3 +5593,112 @@ xfs_btree_goto_left_edge( return 0; } + +/* Allocate a block for an inode-rooted metadata btree. 
*/ +int +xfs_btree_alloc_imeta_block( + struct xfs_btree_cur *cur, + const union xfs_btree_ptr *start, + union xfs_btree_ptr *new, + int *stat) +{ + struct xfs_alloc_arg args = { + .mp = cur->bc_mp, + .tp = cur->bc_tp + }; + struct xfs_inode *ip = cur->bc_ino.ip; + struct xfs_trans *tp = cur->bc_tp; + int error; + + ASSERT(!XFS_NOT_DQATTACHED(cur->bc_mp, ip)); + + args.fsbno = tp->t_firstblock; + args.resv = XFS_AG_RESV_IMETA; + xfs_rmap_ino_bmbt_owner(&args.oinfo, ip->i_ino, cur->bc_ino.whichfork); + + if (args.fsbno == NULLFSBLOCK) { + args.fsbno = be64_to_cpu(start->l); + args.type = XFS_ALLOCTYPE_START_BNO; + /* + * Make sure there is sufficient room left in the AG to + * complete a full tree split for an extent insert. If + * we are converting the middle part of an extent then + * we may need space for two tree splits. + * + * We are relying on the caller to make the correct block + * reservation for this operation to succeed. If the + * reservation amount is insufficient then we may fail a + * block allocation here and corrupt the filesystem. + */ + args.minleft = tp->t_blk_res; + } else if (tp->t_flags & XFS_TRANS_LOWMODE) { + args.type = XFS_ALLOCTYPE_START_BNO; + } else { + args.type = XFS_ALLOCTYPE_NEAR_BNO; + } + + args.minlen = args.maxlen = args.prod = 1; + error = xfs_alloc_vextent(&args); + if (error) + goto error0; + + if (args.fsbno == NULLFSBLOCK && args.minleft) { + /* + * Could not find an AG with enough free space to satisfy + * a full btree split. Try again without minleft and if + * successful activate the lowspace algorithm. 
+ */ + args.fsbno = 0; + args.type = XFS_ALLOCTYPE_FIRST_AG; + args.minleft = 0; + error = xfs_alloc_vextent(&args); + if (error) + goto error0; + tp->t_flags |= XFS_TRANS_LOWMODE; + } + if (args.fsbno == NULLFSBLOCK) { + *stat = 0; + return 0; + } + ASSERT(args.len == 1); + + xfs_imeta_resv_alloc_extent(ip, &args); + cur->bc_ino.allocated++; + + new->l = cpu_to_be64(args.fsbno); + *stat = 1; + return 0; + + error0: + return error; +} + +/* Free a block from an inode-rooted metadata btree. */ +int +xfs_btree_free_imeta_block( + struct xfs_btree_cur *cur, + struct xfs_buf *bp) +{ + struct xfs_owner_info oinfo; + struct xfs_mount *mp = cur->bc_mp; + struct xfs_inode *ip = cur->bc_ino.ip; + struct xfs_trans *tp = cur->bc_tp; + struct xfs_perag *pag; + xfs_fsblock_t fsbno = XFS_DADDR_TO_FSB(mp, xfs_buf_daddr(bp)); + xfs_agnumber_t agno = XFS_FSB_TO_AGNO(mp, fsbno); + xfs_agblock_t agbno = XFS_FSB_TO_AGBNO(mp, fsbno); + int error; + + ASSERT(!XFS_NOT_DQATTACHED(mp, ip)); + + xfs_rmap_ino_bmbt_owner(&oinfo, ip->i_ino, cur->bc_ino.whichfork); + pag = xfs_perag_get(mp, agno); + error = __xfs_free_extent(tp, pag, agbno, 1, &oinfo, XFS_AG_RESV_IMETA, + false); + xfs_perag_put(pag); + if (error) + return error; + + xfs_imeta_resv_free_extent(ip, tp, 1); + return 0; +} diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index ddaad83d4ff9..5a733767649b 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -761,4 +761,9 @@ void xfs_btree_destroy_cur_caches(void); int xfs_btree_goto_left_edge(struct xfs_btree_cur *cur); +int xfs_btree_alloc_imeta_block(struct xfs_btree_cur *cur, + const union xfs_btree_ptr *start, union xfs_btree_ptr *newp, + int *stat); +int xfs_btree_free_imeta_block(struct xfs_btree_cur *cur, struct xfs_buf *bp); + #endif /* __XFS_BTREE_H__ */ diff --git a/fs/xfs/libxfs/xfs_imeta.c b/fs/xfs/libxfs/xfs_imeta.c index 5bfb1eabf21d..1065144911b3 100644 --- a/fs/xfs/libxfs/xfs_imeta.c +++ b/fs/xfs/libxfs/xfs_imeta.c @@ -1303,6 
+1303,9 @@ xfs_imeta_resv_alloc_extent( xfs_trans_mod_sb(args->tp, XFS_TRANS_SB_FDBLOCKS, -len); ip->i_nblocks += args->len; + xfs_trans_mod_dquot_byino(args->tp, ip, XFS_TRANS_DQ_BCOUNT, args->len); + + xfs_trans_log_inode(args->tp, ip, XFS_ILOG_CORE); } /* Free a block to the metadata file's reservation. */ @@ -1318,6 +1321,7 @@ xfs_imeta_resv_free_extent( trace_xfs_imeta_resv_free_extent(ip, len); ip->i_nblocks -= len; + xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, -len); /* * Add the freed blocks back into the inode's delalloc reservation @@ -1338,6 +1342,8 @@ xfs_imeta_resv_free_extent( */ if (len) xfs_trans_mod_sb(tp, XFS_TRANS_SB_FDBLOCKS, len); + + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); } /* Release a metadata file's space reservation. */ diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 7f6ba2efdaf2..551d575713db 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -18,12 +18,14 @@ #include "xfs_alloc.h" #include "xfs_btree.h" #include "xfs_btree_staging.h" +#include "xfs_rmap.h" #include "xfs_rtrmap_btree.h" #include "xfs_trace.h" #include "xfs_cksum.h" #include "xfs_error.h" #include "xfs_extent_busy.h" #include "xfs_rtgroup.h" +#include "xfs_bmap.h" static struct kmem_cache *xfs_rtrmapbt_cur_cache; @@ -52,6 +54,182 @@ xfs_rtrmapbt_dup_cursor( return new; } +STATIC int +xfs_rtrmapbt_get_minrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level == cur->bc_nlevels - 1) { + struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur); + + return xfs_rtrmapbt_maxrecs(cur->bc_mp, ifp->if_broot_bytes, + level == 0) / 2; + } + + return cur->bc_mp->m_rtrmap_mnr[level != 0]; +} + +STATIC int +xfs_rtrmapbt_get_maxrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level == cur->bc_nlevels - 1) { + struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur); + + return xfs_rtrmapbt_maxrecs(cur->bc_mp, ifp->if_broot_bytes, + level == 0); + } + + return cur->bc_mp->m_rtrmap_mxr[level != 0]; 
+} + +/* + * Convert the ondisk record's offset field into the ondisk key's offset field. + * Fork and bmbt are significant parts of the rmap record key, but written + * status is merely a record attribute. + */ +static inline __be64 ondisk_rec_offset_to_key(const union xfs_btree_rec *rec) +{ + return rec->rmap.rm_offset & ~cpu_to_be64(XFS_RMAP_OFF_UNWRITTEN); +} + +STATIC void +xfs_rtrmapbt_init_key_from_rec( + union xfs_btree_key *key, + const union xfs_btree_rec *rec) +{ + key->rmap.rm_startblock = rec->rmap.rm_startblock; + key->rmap.rm_owner = rec->rmap.rm_owner; + key->rmap.rm_offset = ondisk_rec_offset_to_key(rec); +} + +STATIC void +xfs_rtrmapbt_init_high_key_from_rec( + union xfs_btree_key *key, + const union xfs_btree_rec *rec) +{ + uint64_t off; + int adj; + + adj = be32_to_cpu(rec->rmap.rm_blockcount) - 1; + + key->rmap.rm_startblock = rec->rmap.rm_startblock; + be32_add_cpu(&key->rmap.rm_startblock, adj); + key->rmap.rm_owner = rec->rmap.rm_owner; + key->rmap.rm_offset = ondisk_rec_offset_to_key(rec); + if (XFS_RMAP_NON_INODE_OWNER(be64_to_cpu(rec->rmap.rm_owner)) || + XFS_RMAP_IS_BMBT_BLOCK(be64_to_cpu(rec->rmap.rm_offset))) + return; + off = be64_to_cpu(key->rmap.rm_offset); + off = (XFS_RMAP_OFF(off) + adj) | (off & ~XFS_RMAP_OFF_MASK); + key->rmap.rm_offset = cpu_to_be64(off); +} + +STATIC void +xfs_rtrmapbt_init_rec_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec) +{ + rec->rmap.rm_startblock = cpu_to_be32(cur->bc_rec.r.rm_startblock); + rec->rmap.rm_blockcount = cpu_to_be32(cur->bc_rec.r.rm_blockcount); + rec->rmap.rm_owner = cpu_to_be64(cur->bc_rec.r.rm_owner); + rec->rmap.rm_offset = cpu_to_be64( + xfs_rmap_irec_offset_pack(&cur->bc_rec.r)); +} + +STATIC void +xfs_rtrmapbt_init_ptr_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + ptr->l = 0; +} + +/* + * Mask the appropriate parts of the ondisk key field for a key comparison. 
+ * Fork and bmbt are significant parts of the rmap record key, but written + * status is merely a record attribute. + */ +static inline uint64_t offset_keymask(uint64_t offset) +{ + return offset & ~XFS_RMAP_OFF_UNWRITTEN; +} + +STATIC int64_t +xfs_rtrmapbt_key_diff( + struct xfs_btree_cur *cur, + const union xfs_btree_key *key) +{ + struct xfs_rmap_irec *rec = &cur->bc_rec.r; + const struct xfs_rmap_key *kp = &key->rmap; + __u64 x, y; + int64_t d; + + d = (int64_t)be32_to_cpu(kp->rm_startblock) - rec->rm_startblock; + if (d) + return d; + + x = be64_to_cpu(kp->rm_owner); + y = rec->rm_owner; + if (x > y) + return 1; + else if (y > x) + return -1; + + x = offset_keymask(be64_to_cpu(kp->rm_offset)); + y = offset_keymask(xfs_rmap_irec_offset_pack(rec)); + if (x > y) + return 1; + else if (y > x) + return -1; + return 0; +} + +STATIC int64_t +xfs_rtrmapbt_diff_two_keys( + struct xfs_btree_cur *cur, + const union xfs_btree_key *k1, + const union xfs_btree_key *k2, + const union xfs_btree_key *mask) +{ + const struct xfs_rmap_key *kp1 = &k1->rmap; + const struct xfs_rmap_key *kp2 = &k2->rmap; + int64_t d; + __u64 x, y; + + /* Doesn't make sense to mask off the physical space part */ + ASSERT(!mask || mask->rmap.rm_startblock); + + d = (int64_t)be32_to_cpu(kp1->rm_startblock) - + be32_to_cpu(kp2->rm_startblock); + if (d) + return d; + + if (!mask || mask->rmap.rm_owner) { + x = be64_to_cpu(kp1->rm_owner); + y = be64_to_cpu(kp2->rm_owner); + if (x > y) + return 1; + else if (y > x) + return -1; + } + + if (!mask || mask->rmap.rm_offset) { + /* Doesn't make sense to allow offset but not owner */ + ASSERT(!mask || mask->rmap.rm_owner); + + x = offset_keymask(be64_to_cpu(kp1->rm_offset)); + y = offset_keymask(be64_to_cpu(kp2->rm_offset)); + if (x > y) + return 1; + else if (y > x) + return -1; + } + + return 0; +} + static xfs_failaddr_t xfs_rtrmapbt_verify( struct xfs_buf *bp) @@ -118,6 +296,86 @@ const struct xfs_buf_ops xfs_rtrmapbt_buf_ops = { .verify_struct = 
xfs_rtrmapbt_verify, }; +STATIC int +xfs_rtrmapbt_keys_inorder( + struct xfs_btree_cur *cur, + const union xfs_btree_key *k1, + const union xfs_btree_key *k2) +{ + uint32_t x; + uint32_t y; + uint64_t a; + uint64_t b; + + x = be32_to_cpu(k1->rmap.rm_startblock); + y = be32_to_cpu(k2->rmap.rm_startblock); + if (x < y) + return 1; + else if (x > y) + return 0; + a = be64_to_cpu(k1->rmap.rm_owner); + b = be64_to_cpu(k2->rmap.rm_owner); + if (a < b) + return 1; + else if (a > b) + return 0; + a = offset_keymask(be64_to_cpu(k1->rmap.rm_offset)); + b = offset_keymask(be64_to_cpu(k2->rmap.rm_offset)); + if (a <= b) + return 1; + return 0; +} + +STATIC int +xfs_rtrmapbt_recs_inorder( + struct xfs_btree_cur *cur, + const union xfs_btree_rec *r1, + const union xfs_btree_rec *r2) +{ + uint32_t x; + uint32_t y; + uint64_t a; + uint64_t b; + + x = be32_to_cpu(r1->rmap.rm_startblock); + y = be32_to_cpu(r2->rmap.rm_startblock); + if (x < y) + return 1; + else if (x > y) + return 0; + a = be64_to_cpu(r1->rmap.rm_owner); + b = be64_to_cpu(r2->rmap.rm_owner); + if (a < b) + return 1; + else if (a > b) + return 0; + a = offset_keymask(be64_to_cpu(r1->rmap.rm_offset)); + b = offset_keymask(be64_to_cpu(r2->rmap.rm_offset)); + if (a <= b) + return 1; + return 0; +} + +STATIC enum xbtree_key_contig +xfs_rtrmapbt_keys_contiguous( + struct xfs_btree_cur *cur, + const union xfs_btree_key *key1, + const union xfs_btree_key *key2, + const union xfs_btree_key *mask) +{ + ASSERT(!mask || mask->rmap.rm_startblock); + + /* + * We only support checking contiguity of the physical space component. + * If any callers ever need more specificity than that, they'll have to + * implement it here. 
+ */ + ASSERT(!mask || (!mask->rmap.rm_owner && !mask->rmap.rm_offset)); + + return xbtree_key_contig(be32_to_cpu(key1->rmap.rm_startblock), + be32_to_cpu(key2->rmap.rm_startblock)); +} + const struct xfs_btree_ops xfs_rtrmapbt_ops = { .rec_len = sizeof(struct xfs_rmap_rec), .key_len = 2 * sizeof(struct xfs_rmap_key), @@ -126,7 +384,20 @@ const struct xfs_btree_ops xfs_rtrmapbt_ops = { XFS_BTREE_IROOT_RECORDS, .dup_cursor = xfs_rtrmapbt_dup_cursor, + .alloc_block = xfs_btree_alloc_imeta_block, + .free_block = xfs_btree_free_imeta_block, + .get_minrecs = xfs_rtrmapbt_get_minrecs, + .get_maxrecs = xfs_rtrmapbt_get_maxrecs, + .init_key_from_rec = xfs_rtrmapbt_init_key_from_rec, + .init_high_key_from_rec = xfs_rtrmapbt_init_high_key_from_rec, + .init_rec_from_cur = xfs_rtrmapbt_init_rec_from_cur, + .init_ptr_from_cur = xfs_rtrmapbt_init_ptr_from_cur, + .key_diff = xfs_rtrmapbt_key_diff, .buf_ops = &xfs_rtrmapbt_buf_ops, + .diff_two_keys = xfs_rtrmapbt_diff_two_keys, + .keys_inorder = xfs_rtrmapbt_keys_inorder, + .recs_inorder = xfs_rtrmapbt_recs_inorder, + .keys_contiguous = xfs_rtrmapbt_keys_contiguous, }; /* Initialize a new rt rmap btree cursor. */ From patchwork Fri Dec 30 22:18:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085472 Subject: [PATCH 07/38] xfs: prepare rmap functions to deal with rtrmapbt From: "Darrick J.
Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Fri, 30 Dec 2022 14:18:17 -0800
Message-ID: <167243869701.715303.7065506301518081631.stgit@magnolia>
In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia>
References: <167243869558.715303.13347105677486333748.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Prepare the high-level rmap functions to deal with the new realtime
rmapbt and its slightly different conventions.  Provide the ability
to talk to either rmapbt or rtrmapbt formats from the same high level
code.

Signed-off-by: Darrick J. Wong
---
 fs/xfs/libxfs/xfs_rmap.c |   66 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c
index a2a863e0c7fb..31194cc14c0b 100644
--- a/fs/xfs/libxfs/xfs_rmap.c
+++ b/fs/xfs/libxfs/xfs_rmap.c
@@ -24,6 +24,7 @@
 #include "xfs_inode.h"
 #include "xfs_ag.h"
 #include "xfs_health.h"
+#include "xfs_rtgroup.h"
 
 struct kmem_cache *xfs_rmap_intent_cache;
 
@@ -262,12 +263,73 @@ xfs_rmap_check_perag_irec(
 	return NULL;
 }
 
+static inline xfs_failaddr_t
+xfs_rmap_check_rtgroup_irec(
+	struct xfs_rtgroup *rtg,
+	const struct xfs_rmap_irec *irec)
+{
+	struct xfs_mount *mp = rtg->rtg_mount;
+	bool is_inode;
+	bool is_unwritten;
+	bool is_bmbt;
+	bool is_attr;
+
+	if (irec->rm_blockcount == 0)
+		return __this_address;
+
+	if (irec->rm_owner == XFS_RMAP_OWN_FS) {
+		if (irec->rm_startblock != 0)
+			return __this_address;
+		if (irec->rm_blockcount != mp->m_sb.sb_rextsize)
+			return __this_address;
+		if (irec->rm_offset != 0)
+			return __this_address;
+	} else {
+		if (!xfs_verify_rgbext(rtg, irec->rm_startblock,
+					irec->rm_blockcount))
+			return __this_address;
+	}
+
+	if (!(xfs_verify_ino(mp, irec->rm_owner) ||
+	      (irec->rm_owner <= XFS_RMAP_OWN_FS &&
+	       irec->rm_owner >= XFS_RMAP_OWN_MIN)))
+		return __this_address;
+
+	/* Check flags. */
+	is_inode = !XFS_RMAP_NON_INODE_OWNER(irec->rm_owner);
+	is_bmbt = irec->rm_flags & XFS_RMAP_BMBT_BLOCK;
+	is_attr = irec->rm_flags & XFS_RMAP_ATTR_FORK;
+	is_unwritten = irec->rm_flags & XFS_RMAP_UNWRITTEN;
+
+	if (!is_inode && irec->rm_owner != XFS_RMAP_OWN_FS)
+		return __this_address;
+
+	if (!is_inode && irec->rm_offset != 0)
+		return __this_address;
+
+	if (is_bmbt || is_attr)
+		return __this_address;
+
+	if (is_unwritten && !is_inode)
+		return __this_address;
+
+	/* Check for a valid fork offset, if applicable. */
+	if (is_inode &&
+	    !xfs_verify_fileext(mp, irec->rm_offset, irec->rm_blockcount))
+		return __this_address;
+
+	return NULL;
+}
+
 /* Simple checks for rmap records. */
 xfs_failaddr_t
 xfs_rmap_check_irec(
 	struct xfs_btree_cur *cur,
 	const struct xfs_rmap_irec *irec)
 {
+	if (cur->bc_btnum == XFS_BTNUM_RTRMAP)
+		return xfs_rmap_check_rtgroup_irec(cur->bc_ino.rtg, irec);
+
 	if (cur->bc_flags & XFS_BTREE_IN_MEMORY)
 		return xfs_rmap_check_perag_irec(cur->bc_mem.pag, irec);
 	return xfs_rmap_check_perag_irec(cur->bc_ag.pag, irec);
@@ -284,6 +346,10 @@ xfs_rmap_complain_bad_rec(
 	if (cur->bc_flags & XFS_BTREE_IN_MEMORY)
 		xfs_warn(mp,
  "In-Memory Reverse Mapping BTree record corruption detected at %pS!", fa);
+	else if (cur->bc_btnum == XFS_BTNUM_RTRMAP)
+		xfs_warn(mp,
+ "RT Reverse Mapping BTree record corruption in rtgroup %u detected at %pS!",
+				cur->bc_ino.rtg->rtg_rgno, fa);
 	else
 		xfs_warn(mp,
  "Reverse Mapping BTree record corruption in AG %d detected at %pS!",

From patchwork Fri Dec 30 22:18:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 13085473
Subject: [PATCH 08/38] xfs: add a realtime flag to the rmap update log redo items
From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:17 -0800 Message-ID: <167243869714.715303.6763477970340479268.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Extend the rmap update (RUI) log items with a new realtime flag that indicates that the updates apply against the realtime rmapbt. We'll wire up the actual rmap code later. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_defer.c | 1 + fs/xfs/libxfs/xfs_defer.h | 1 + fs/xfs/libxfs/xfs_log_format.h | 4 +++- fs/xfs/libxfs/xfs_refcount.c | 4 ++-- fs/xfs/libxfs/xfs_rmap.c | 38 ++++++++++++++++++++++++++++++++------ fs/xfs/libxfs/xfs_rmap.h | 10 +++++++--- fs/xfs/scrub/alloc_repair.c | 2 +- fs/xfs/xfs_rmap_item.c | 22 ++++++++++++++++++++++ fs/xfs/xfs_trace.h | 23 +++++++++++++++++------ 9 files changed, 86 insertions(+), 19 deletions(-) diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c index c0416bae880a..ce3bc5fe2bdc 100644 --- a/fs/xfs/libxfs/xfs_defer.c +++ b/fs/xfs/libxfs/xfs_defer.c @@ -187,6 +187,7 @@ static const struct xfs_defer_op_type *defer_op_types[] = { [XFS_DEFER_OPS_TYPE_BMAP] = &xfs_bmap_update_defer_type, [XFS_DEFER_OPS_TYPE_REFCOUNT] = &xfs_refcount_update_defer_type, [XFS_DEFER_OPS_TYPE_RMAP] = &xfs_rmap_update_defer_type, + [XFS_DEFER_OPS_TYPE_RMAP_RT] = &xfs_rmap_update_defer_type, [XFS_DEFER_OPS_TYPE_FREE] = &xfs_extent_free_defer_type, [XFS_DEFER_OPS_TYPE_FREE_RT] = &xfs_extent_free_defer_type, [XFS_DEFER_OPS_TYPE_AGFL_FREE] = &xfs_agfl_free_defer_type, diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h index 52198c7124c6..89c279185ce6 100644 --- a/fs/xfs/libxfs/xfs_defer.h +++ b/fs/xfs/libxfs/xfs_defer.h @@ -17,6 +17,7 @@ enum xfs_defer_ops_type { XFS_DEFER_OPS_TYPE_BMAP, 
XFS_DEFER_OPS_TYPE_REFCOUNT, XFS_DEFER_OPS_TYPE_RMAP, + XFS_DEFER_OPS_TYPE_RMAP_RT, XFS_DEFER_OPS_TYPE_FREE, XFS_DEFER_OPS_TYPE_AGFL_FREE, XFS_DEFER_OPS_TYPE_FREE_RT, diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h index f3c8257a7545..3a23282d6e6f 100644 --- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -746,11 +746,13 @@ struct xfs_map_extent { #define XFS_RMAP_EXTENT_ATTR_FORK (1U << 31) #define XFS_RMAP_EXTENT_BMBT_BLOCK (1U << 30) #define XFS_RMAP_EXTENT_UNWRITTEN (1U << 29) +#define XFS_RMAP_EXTENT_REALTIME (1U << 28) #define XFS_RMAP_EXTENT_FLAGS (XFS_RMAP_EXTENT_TYPE_MASK | \ XFS_RMAP_EXTENT_ATTR_FORK | \ XFS_RMAP_EXTENT_BMBT_BLOCK | \ - XFS_RMAP_EXTENT_UNWRITTEN) + XFS_RMAP_EXTENT_UNWRITTEN | \ + XFS_RMAP_EXTENT_REALTIME) /* * This is the structure used to lay out an rui log item in the diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 20c12cb7b7de..83f681fb49fb 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -1889,7 +1889,7 @@ xfs_refcount_alloc_cow_extent( __xfs_refcount_add(tp, XFS_REFCOUNT_ALLOC_COW, fsb, len); /* Add rmap entry */ - xfs_rmap_alloc_extent(tp, fsb, len, XFS_RMAP_OWN_COW); + xfs_rmap_alloc_extent(tp, false, fsb, len, XFS_RMAP_OWN_COW); } /* Forget a CoW staging event in the refcount btree. 
*/ @@ -1905,7 +1905,7 @@ xfs_refcount_free_cow_extent( return; /* Remove rmap entry */ - xfs_rmap_free_extent(tp, fsb, len, XFS_RMAP_OWN_COW); + xfs_rmap_free_extent(tp, false, fsb, len, XFS_RMAP_OWN_COW); __xfs_refcount_add(tp, XFS_REFCOUNT_FREE_COW, fsb, len); } diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 31194cc14c0b..1a3607082d12 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -2654,6 +2654,12 @@ xfs_rmap_finish_one( xfs_agblock_t bno; bool unwritten; + if (ri->ri_realtime) { + /* coming in a subsequent patch */ + ASSERT(0); + return -EFSCORRUPTED; + } + bno = XFS_FSB_TO_AGBNO(mp, ri->ri_bmap.br_startblock); trace_xfs_rmap_deferred(mp, ri); @@ -2726,10 +2732,12 @@ __xfs_rmap_add( struct xfs_trans *tp, enum xfs_rmap_intent_type type, uint64_t owner, + bool isrt, int whichfork, struct xfs_bmbt_irec *bmap) { struct xfs_rmap_intent *ri; + enum xfs_defer_ops_type optype; ri = kmem_cache_alloc(xfs_rmap_intent_cache, GFP_NOFS | __GFP_NOFAIL); INIT_LIST_HEAD(&ri->ri_list); @@ -2737,11 +2745,24 @@ __xfs_rmap_add( ri->ri_owner = owner; ri->ri_whichfork = whichfork; ri->ri_bmap = *bmap; + ri->ri_realtime = isrt; + + /* + * Deferred rmap updates for the realtime and data sections must use + * separate transactions to finish deferred work because updates to + * realtime metadata files can lock AGFs to allocate btree blocks and + * we don't want that mixing with the AGF locks taken to finish data + * section updates. + */ + if (isrt) + optype = XFS_DEFER_OPS_TYPE_RMAP_RT; + else + optype = XFS_DEFER_OPS_TYPE_RMAP; trace_xfs_rmap_defer(tp->t_mountp, ri); xfs_rmap_update_get_group(tp->t_mountp, ri); - xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_RMAP, &ri->ri_list); + xfs_defer_add(tp, optype, &ri->ri_list); } /* Map an extent into a file. 
*/ @@ -2753,6 +2774,7 @@ xfs_rmap_map_extent( struct xfs_bmbt_irec *PREV) { enum xfs_rmap_intent_type type = XFS_RMAP_MAP; + bool isrt = xfs_ifork_is_realtime(ip, whichfork); if (!xfs_rmap_update_is_needed(tp->t_mountp, whichfork)) return; @@ -2760,7 +2782,7 @@ xfs_rmap_map_extent( if (whichfork != XFS_ATTR_FORK && xfs_is_reflink_inode(ip)) type = XFS_RMAP_MAP_SHARED; - __xfs_rmap_add(tp, type, ip->i_ino, whichfork, PREV); + __xfs_rmap_add(tp, type, ip->i_ino, isrt, whichfork, PREV); } /* Unmap an extent out of a file. */ @@ -2772,6 +2794,7 @@ xfs_rmap_unmap_extent( struct xfs_bmbt_irec *PREV) { enum xfs_rmap_intent_type type = XFS_RMAP_UNMAP; + bool isrt = xfs_ifork_is_realtime(ip, whichfork); if (!xfs_rmap_update_is_needed(tp->t_mountp, whichfork)) return; @@ -2779,7 +2802,7 @@ xfs_rmap_unmap_extent( if (whichfork != XFS_ATTR_FORK && xfs_is_reflink_inode(ip)) type = XFS_RMAP_UNMAP_SHARED; - __xfs_rmap_add(tp, type, ip->i_ino, whichfork, PREV); + __xfs_rmap_add(tp, type, ip->i_ino, isrt, whichfork, PREV); } /* @@ -2797,6 +2820,7 @@ xfs_rmap_convert_extent( struct xfs_bmbt_irec *PREV) { enum xfs_rmap_intent_type type = XFS_RMAP_CONVERT; + bool isrt = xfs_ifork_is_realtime(ip, whichfork); if (!xfs_rmap_update_is_needed(mp, whichfork)) return; @@ -2804,13 +2828,14 @@ xfs_rmap_convert_extent( if (whichfork != XFS_ATTR_FORK && xfs_is_reflink_inode(ip)) type = XFS_RMAP_CONVERT_SHARED; - __xfs_rmap_add(tp, type, ip->i_ino, whichfork, PREV); + __xfs_rmap_add(tp, type, ip->i_ino, isrt, whichfork, PREV); } /* Schedule the creation of an rmap for non-file data. */ void xfs_rmap_alloc_extent( struct xfs_trans *tp, + bool isrt, xfs_fsblock_t fsbno, xfs_extlen_t len, uint64_t owner) @@ -2825,13 +2850,14 @@ xfs_rmap_alloc_extent( bmap.br_startoff = 0; bmap.br_state = XFS_EXT_NORM; - __xfs_rmap_add(tp, XFS_RMAP_ALLOC, owner, XFS_DATA_FORK, &bmap); + __xfs_rmap_add(tp, XFS_RMAP_ALLOC, owner, isrt, XFS_DATA_FORK, &bmap); } /* Schedule the deletion of an rmap for non-file data. 
*/ void xfs_rmap_free_extent( struct xfs_trans *tp, + bool isrt, xfs_fsblock_t fsbno, xfs_extlen_t len, uint64_t owner) @@ -2846,7 +2872,7 @@ xfs_rmap_free_extent( bmap.br_startoff = 0; bmap.br_state = XFS_EXT_NORM; - __xfs_rmap_add(tp, XFS_RMAP_FREE, owner, XFS_DATA_FORK, &bmap); + __xfs_rmap_add(tp, XFS_RMAP_FREE, owner, isrt, XFS_DATA_FORK, &bmap); } /* Compare rmap records. Returns -1 if a < b, 1 if a > b, and 0 if equal. */ diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h index 54c969731cf4..e98f37c39f2f 100644 --- a/fs/xfs/libxfs/xfs_rmap.h +++ b/fs/xfs/libxfs/xfs_rmap.h @@ -173,7 +173,11 @@ struct xfs_rmap_intent { int ri_whichfork; uint64_t ri_owner; struct xfs_bmbt_irec ri_bmap; - struct xfs_perag *ri_pag; + union { + struct xfs_perag *ri_pag; + struct xfs_rtgroup *ri_rtg; + }; + bool ri_realtime; }; void xfs_rmap_update_get_group(struct xfs_mount *mp, @@ -187,9 +191,9 @@ void xfs_rmap_unmap_extent(struct xfs_trans *tp, struct xfs_inode *ip, void xfs_rmap_convert_extent(struct xfs_mount *mp, struct xfs_trans *tp, struct xfs_inode *ip, int whichfork, struct xfs_bmbt_irec *imap); -void xfs_rmap_alloc_extent(struct xfs_trans *tp, xfs_fsblock_t fsbno, +void xfs_rmap_alloc_extent(struct xfs_trans *tp, bool isrt, xfs_fsblock_t fsbno, xfs_extlen_t len, uint64_t owner); -void xfs_rmap_free_extent(struct xfs_trans *tp, xfs_fsblock_t fsbno, +void xfs_rmap_free_extent(struct xfs_trans *tp, bool isrt, xfs_fsblock_t fsbno, xfs_extlen_t len, uint64_t owner); void xfs_rmap_finish_one_cleanup(struct xfs_trans *tp, diff --git a/fs/xfs/scrub/alloc_repair.c b/fs/xfs/scrub/alloc_repair.c index 6506fc202571..b695cd2b0a56 100644 --- a/fs/xfs/scrub/alloc_repair.c +++ b/fs/xfs/scrub/alloc_repair.c @@ -528,7 +528,7 @@ xrep_abt_dispose_one( xfs_fsblock_t fsbno; fsbno = XFS_AGB_TO_FSB(sc->mp, pag->pag_agno, resv->agbno); - xfs_rmap_alloc_extent(sc->tp, fsbno, resv->used, + xfs_rmap_alloc_extent(sc->tp, false, fsbno, resv->used, XFS_RMAP_OWN_AG); } diff --git 
a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index a84f7e0e91a3..5f04f55f5caa 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -21,6 +21,7 @@ #include "xfs_log_priv.h" #include "xfs_log_recover.h" #include "xfs_ag.h" +#include "xfs_rtgroup.h" struct kmem_cache *xfs_rui_cache; struct kmem_cache *xfs_rud_cache; @@ -284,6 +285,11 @@ xfs_rmap_update_diff_items( ra = container_of(a, struct xfs_rmap_intent, ri_list); rb = container_of(b, struct xfs_rmap_intent, ri_list); + ASSERT(ra->ri_realtime == rb->ri_realtime); + + if (ra->ri_realtime) + return ra->ri_rtg->rtg_rgno - rb->ri_rtg->rtg_rgno; + return ra->ri_pag->pag_agno - rb->ri_pag->pag_agno; } @@ -318,6 +324,8 @@ xfs_rmap_update_log_item( map->me_flags |= XFS_RMAP_EXTENT_UNWRITTEN; if (ri->ri_whichfork == XFS_ATTR_FORK) map->me_flags |= XFS_RMAP_EXTENT_ATTR_FORK; + if (ri->ri_realtime) + map->me_flags |= XFS_RMAP_EXTENT_REALTIME; switch (ri->ri_type) { case XFS_RMAP_MAP: map->me_flags |= XFS_RMAP_EXTENT_MAP; @@ -387,6 +395,14 @@ xfs_rmap_update_get_group( { xfs_agnumber_t agno; + if (ri->ri_realtime) { + xfs_rgnumber_t rgno; + + rgno = xfs_rtb_to_rgno(mp, ri->ri_bmap.br_startblock); + ri->ri_rtg = xfs_rtgroup_get(mp, rgno); + return; + } + agno = XFS_FSB_TO_AGNO(mp, ri->ri_bmap.br_startblock); ri->ri_pag = xfs_perag_get(mp, agno); xfs_perag_bump_intents(ri->ri_pag); @@ -397,6 +413,11 @@ static inline void xfs_rmap_update_put_group( struct xfs_rmap_intent *ri) { + if (ri->ri_realtime) { + xfs_rtgroup_put(ri->ri_rtg); + return; + } + xfs_perag_drop_intents(ri->ri_pag); xfs_perag_put(ri->ri_pag); } @@ -565,6 +586,7 @@ xfs_rui_item_recover( goto abort_error; } + fake.ri_realtime = !!(map->me_flags & XFS_RMAP_EXTENT_REALTIME); fake.ri_owner = map->me_owner; fake.ri_whichfork = (map->me_flags & XFS_RMAP_EXTENT_ATTR_FORK) ? 
XFS_ATTR_FORK : XFS_DATA_FORK; diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 390aa7a4afae..c02a58cbf15b 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3013,9 +3013,10 @@ DECLARE_EVENT_CLASS(xfs_rmap_deferred_class, TP_ARGS(mp, ri), TP_STRUCT__entry( __field(dev_t, dev) + __field(dev_t, opdev) __field(unsigned long long, owner) __field(xfs_agnumber_t, agno) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, rmapbno) __field(int, whichfork) __field(xfs_fileoff_t, l_loff) __field(xfs_filblks_t, l_len) @@ -3024,9 +3025,18 @@ DECLARE_EVENT_CLASS(xfs_rmap_deferred_class, ), TP_fast_assign( __entry->dev = mp->m_super->s_dev; - __entry->agno = XFS_FSB_TO_AGNO(mp, ri->ri_bmap.br_startblock); - __entry->agbno = XFS_FSB_TO_AGBNO(mp, - ri->ri_bmap.br_startblock); + if (ri->ri_realtime) { + __entry->opdev = mp->m_rtdev_targp->bt_dev; + __entry->rmapbno = xfs_rtb_to_rgbno(mp, + ri->ri_bmap.br_startblock, + &__entry->agno); + } else { + __entry->agno = XFS_FSB_TO_AGNO(mp, + ri->ri_bmap.br_startblock); + __entry->opdev = __entry->dev; + __entry->rmapbno = XFS_FSB_TO_AGBNO(mp, + ri->ri_bmap.br_startblock); + } __entry->owner = ri->ri_owner; __entry->whichfork = ri->ri_whichfork; __entry->l_loff = ri->ri_bmap.br_startoff; @@ -3034,11 +3044,12 @@ DECLARE_EVENT_CLASS(xfs_rmap_deferred_class, __entry->l_state = ri->ri_bmap.br_state; __entry->op = ri->ri_type; ), - TP_printk("dev %d:%d op %s agno 0x%x agbno 0x%x owner 0x%llx %s fileoff 0x%llx fsbcount 0x%llx state %d", + TP_printk("dev %d:%d op %s opdev %d:%d agno 0x%x rmapbno 0x%x owner 0x%llx %s fileoff 0x%llx fsbcount 0x%llx state %d", MAJOR(__entry->dev), MINOR(__entry->dev), __print_symbolic(__entry->op, XFS_RMAP_INTENT_STRINGS), + MAJOR(__entry->opdev), MINOR(__entry->opdev), __entry->agno, - __entry->agbno, + __entry->rmapbno, __entry->owner, __print_symbolic(__entry->whichfork, XFS_WHICHFORK_STRINGS), __entry->l_loff, From patchwork Fri Dec 30 22:18:17 2022 Content-Type: text/plain; 
 charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13085474
Subject: [PATCH 09/38] xfs: support recovering rmap intent items targetting realtime extents
From: "Darrick J.
Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Fri, 30 Dec 2022 14:18:17 -0800
Message-ID: <167243869728.715303.12164955598997789324.stgit@magnolia>
In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia>
References: <167243869558.715303.13347105677486333748.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Now that we have rmap on the realtime device, log recovery has to
support remapping extents on the realtime volume.  Make this work.

Signed-off-by: Darrick J. Wong
---
 fs/xfs/xfs_rmap_item.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c
index 5f04f55f5caa..a2949f818e0c 100644
--- a/fs/xfs/xfs_rmap_item.c
+++ b/fs/xfs/xfs_rmap_item.c
@@ -507,6 +507,9 @@ xfs_rui_validate_map(
 	if (!xfs_verify_fileext(mp, map->me_startoff, map->me_len))
 		return false;
 
+	if (map->me_flags & XFS_RMAP_EXTENT_REALTIME)
+		return xfs_verify_rtbext(mp, map->me_startblock, map->me_len);
+
 	return xfs_verify_fsbext(mp, map->me_startblock, map->me_len);
 }
 

From patchwork Fri Dec 30 22:18:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 13085475
Subject: [PATCH 10/38] xfs: add realtime rmap btree block detection to log recovery
From: "Darrick J.
Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Fri, 30 Dec 2022 14:18:17 -0800
Message-ID: <167243869742.715303.9448242223459976161.stgit@magnolia>
In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia>
References: <167243869558.715303.13347105677486333748.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Identify rtrmapbt blocks in the log correctly so that we can validate
them during log recovery.

Signed-off-by: Darrick J. Wong
---
 fs/xfs/xfs_buf_item_recover.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c
index b74d40f5beb1..496260c9d8cd 100644
--- a/fs/xfs/xfs_buf_item_recover.c
+++ b/fs/xfs/xfs_buf_item_recover.c
@@ -259,6 +259,9 @@ xlog_recover_validate_buf_type(
 		case XFS_BMAP_MAGIC:
 			bp->b_ops = &xfs_bmbt_buf_ops;
 			break;
+		case XFS_RTRMAP_CRC_MAGIC:
+			bp->b_ops = &xfs_rtrmapbt_buf_ops;
+			break;
 		case XFS_RMAP_CRC_MAGIC:
 			bp->b_ops = &xfs_rmapbt_buf_ops;
 			break;
@@ -768,6 +771,7 @@ xlog_recover_get_buf_lsn(
 		uuid = &btb->bb_u.s.bb_uuid;
 		break;
 	}
+	case XFS_RTRMAP_CRC_MAGIC:
 	case XFS_BMAP_CRC_MAGIC:
 	case XFS_BMAP_MAGIC: {
 		struct xfs_btree_block *btb = blk;

From patchwork Fri Dec 30 22:18:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 13085476
Subject: [PATCH 11/38] xfs: attach dquots to rt metadata files when starting quota
From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:17 -0800 Message-ID: <167243869755.715303.17695593088764549428.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Attach dquots to the realtime metadata files when starting up quotas, since the resources used by them are charged to the root dquot. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_mount.c | 4 +++- fs/xfs/xfs_qm.c | 20 +++++++++++++++++--- fs/xfs/xfs_qm_bhv.c | 2 +- fs/xfs/xfs_quota.h | 4 ++-- fs/xfs/xfs_rtalloc.c | 19 +++++++++++++++++++ fs/xfs/xfs_rtalloc.h | 3 +++ 6 files changed, 45 insertions(+), 7 deletions(-) diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 1d2403b93f58..2e64f18deabf 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -1007,7 +1007,9 @@ xfs_mountfs( ASSERT(mp->m_qflags == 0); mp->m_qflags = quotaflags; - xfs_qm_mount_quotas(mp); + error = xfs_qm_mount_quotas(mp); + if (error) + goto out_rtunmount; } /* diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index 905765eedcb0..63085d8b5ec1 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -29,6 +29,7 @@ #include "xfs_health.h" #include "xfs_imeta.h" #include "xfs_da_format.h" +#include "xfs_rtalloc.h" /* * The global quota manager. There is only one of these for the entire @@ -1486,7 +1487,7 @@ xfs_qm_quotacheck( * If we fail here, the mount will continue with quota turned off. We don't * need to inidicate success or failure at all. */ -void +int xfs_qm_mount_quotas( struct xfs_mount *mp) { @@ -1525,7 +1526,7 @@ xfs_qm_mount_quotas( error = xfs_qm_quotacheck(mp); if (error) { /* Quotacheck failed and disabled quotas. 
*/ - return; + return 0; } } /* @@ -1566,8 +1567,21 @@ xfs_qm_mount_quotas( if (error) { xfs_warn(mp, "Failed to initialize disk quotas."); - return; + return 0; } + + /* + * Attach dquots to realtime metadata files before we do anything that + * could alter the resource usage of rt metadata (log recovery, normal + * operation, etc). + */ + error = xfs_rtmount_dqattach(mp); + if (error) { + xfs_qm_unmount_quotas(mp); + return error; + } + + return 0; } /* diff --git a/fs/xfs/xfs_qm_bhv.c b/fs/xfs/xfs_qm_bhv.c index 271c1021c733..df569a839d3f 100644 --- a/fs/xfs/xfs_qm_bhv.c +++ b/fs/xfs/xfs_qm_bhv.c @@ -119,7 +119,7 @@ xfs_qm_newmount( * mounting, and get on with the boring life * without disk quotas. */ - xfs_qm_mount_quotas(mp); + return xfs_qm_mount_quotas(mp); } else { /* * Clear the quota flags, but remember them. This diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h index fe63489d91b2..0cb52d5be4aa 100644 --- a/fs/xfs/xfs_quota.h +++ b/fs/xfs/xfs_quota.h @@ -120,7 +120,7 @@ extern void xfs_qm_dqdetach(struct xfs_inode *); extern void xfs_qm_dqrele(struct xfs_dquot *); extern void xfs_qm_statvfs(struct xfs_inode *, struct kstatfs *); extern int xfs_qm_newmount(struct xfs_mount *, uint *, uint *); -extern void xfs_qm_mount_quotas(struct xfs_mount *); +int xfs_qm_mount_quotas(struct xfs_mount *mp); extern void xfs_qm_unmount(struct xfs_mount *); extern void xfs_qm_unmount_quotas(struct xfs_mount *); @@ -205,7 +205,7 @@ xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_dquot *udqp, #define xfs_qm_dqrele(d) do { (d) = (d); } while(0) #define xfs_qm_statvfs(ip, s) do { } while(0) #define xfs_qm_newmount(mp, a, b) (0) -#define xfs_qm_mount_quotas(mp) +#define xfs_qm_mount_quotas(mp) (0) #define xfs_qm_unmount(mp) #define xfs_qm_unmount_quotas(mp) #define xfs_inode_near_dquot_enforcement(ip, type) (false) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 7a94fb5b5a7f..82b729a86740 100644 --- a/fs/xfs/xfs_rtalloc.c +++ 
b/fs/xfs/xfs_rtalloc.c @@ -26,6 +26,7 @@ #include "xfs_imeta.h" #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" +#include "xfs_quota.h" /* * Realtime metadata files are not quite regular files because userspace can't @@ -1686,6 +1687,24 @@ xfs_rtmount_inodes( return error; } +/* Attach dquots for realtime metadata files. */ +int +xfs_rtmount_dqattach( + struct xfs_mount *mp) +{ + int error; + + error = xfs_qm_dqattach(mp->m_rbmip); + if (error) + return error; + + error = xfs_qm_dqattach(mp->m_rsumip); + if (error) + return error; + + return 0; +} + void xfs_rtunmount_inodes( struct xfs_mount *mp) diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h index 04931ab1bcac..873ebac239dd 100644 --- a/fs/xfs/xfs_rtalloc.h +++ b/fs/xfs/xfs_rtalloc.h @@ -46,6 +46,8 @@ void xfs_rtunmount_inodes( struct xfs_mount *mp); +int xfs_rtmount_dqattach(struct xfs_mount *mp); + /* * Get the bitmap and summary inodes into the mount structure * at mount time. @@ -104,6 +106,7 @@ xfs_rtmount_init( # define xfs_rtfile_convert_unwritten(ip, pos, len) (0) # define xfs_rt_resv_free(mp) ((void)0) # define xfs_rt_resv_init(mp) (0) +# define xfs_rtmount_dqattach(mp) (0) #endif /* CONFIG_XFS_RT */ #endif /* __XFS_RTALLOC_H__ */ From patchwork Fri Dec 30 22:18:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong"
X-Patchwork-Id: 13085477
Subject: [PATCH 12/38] xfs: add realtime reverse map inode to metadata directory
From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:17 -0800 Message-ID: <167243869769.715303.8418722142356010813.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Add a metadir path to select the realtime rmap btree inode and load it at mount time. The rtrmapbt inode will have a unique extent format code, which means that we also have to update the inode validation and flush routines to look for it. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_format.h | 6 ++- fs/xfs/libxfs/xfs_inode_buf.c | 6 +++ fs/xfs/libxfs/xfs_inode_fork.c | 9 ++++ fs/xfs/libxfs/xfs_rtgroup.h | 3 + fs/xfs/libxfs/xfs_rtrmap_btree.c | 33 ++++++++++++++++ fs/xfs/libxfs/xfs_rtrmap_btree.h | 4 ++ fs/xfs/xfs_inode.c | 19 +++++++++ fs/xfs/xfs_inode_item.c | 2 + fs/xfs/xfs_inode_item_recover.c | 1 fs/xfs/xfs_rtalloc.c | 79 ++++++++++++++++++++++++++++++++++++++ fs/xfs/xfs_trace.h | 1 11 files changed, 159 insertions(+), 4 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index fb727e1e4072..babe5d3fabb1 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1009,7 +1009,8 @@ enum xfs_dinode_fmt { XFS_DINODE_FMT_LOCAL, /* bulk data */ XFS_DINODE_FMT_EXTENTS, /* struct xfs_bmbt_rec */ XFS_DINODE_FMT_BTREE, /* struct xfs_bmdr_block */ - XFS_DINODE_FMT_UUID /* added long ago, but never used */ + XFS_DINODE_FMT_UUID, /* added long ago, but never used */ + XFS_DINODE_FMT_RMAP, /* reverse mapping btree */ }; #define XFS_INODE_FORMAT_STR \ @@ -1017,7 +1018,8 @@ enum xfs_dinode_fmt { { XFS_DINODE_FMT_LOCAL, "local" }, \ { XFS_DINODE_FMT_EXTENTS, "extent" }, \ { XFS_DINODE_FMT_BTREE, "btree" }, \ - { XFS_DINODE_FMT_UUID, "uuid" } + { XFS_DINODE_FMT_UUID, "uuid" }, \ + { 
XFS_DINODE_FMT_RMAP, "rmap" } /* * Max values for extnum and aextnum. diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 1fb11d0e7eba..9ac84be391b3 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -408,6 +408,12 @@ xfs_dinode_verify_fork( if (di_nextents > max_extents) return __this_address; break; + case XFS_DINODE_FMT_RMAP: + if (!xfs_has_rtrmapbt(mp)) + return __this_address; + if (!(dip->di_flags2 & cpu_to_be64(XFS_DIFLAG2_METADATA))) + return __this_address; + break; default: return __this_address; } diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index b844bfd94e9c..899428f96b94 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -259,6 +259,11 @@ xfs_iformat_data_fork( return xfs_iformat_extents(ip, dip, XFS_DATA_FORK); case XFS_DINODE_FMT_BTREE: return xfs_iformat_btree(ip, dip, XFS_DATA_FORK); + case XFS_DINODE_FMT_RMAP: + if (!xfs_has_rtrmapbt(ip->i_mount)) + return -EFSCORRUPTED; + ASSERT(0); /* to be implemented later */ + return -EFSCORRUPTED; default: xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip, sizeof(*dip), __this_address); @@ -639,6 +644,10 @@ xfs_iflush_fork( } break; + case XFS_DINODE_FMT_RMAP: + ASSERT(0); /* to be implemented later */ + break; + default: ASSERT(0); break; diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 3c9572677f79..1792a9ab3bbf 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -20,6 +20,9 @@ struct xfs_rtgroup { /* for rcu-safe freeing */ struct rcu_head rcu_head; + /* reverse mapping btree inode */ + struct xfs_inode *rtg_rmapip; + /* Number of blocks in this group */ xfs_rgblock_t rtg_blockcount; diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 551d575713db..754812eaff87 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -18,6 +18,7 @@ #include "xfs_alloc.h" 
#include "xfs_btree.h" #include "xfs_btree_staging.h" +#include "xfs_imeta.h" #include "xfs_rmap.h" #include "xfs_rtrmap_btree.h" #include "xfs_trace.h" @@ -475,6 +476,7 @@ xfs_rtrmapbt_commit_staged_btree( int flags = XFS_ILOG_CORE | XFS_ILOG_DBROOT; ASSERT(cur->bc_flags & XFS_BTREE_STAGING); + ASSERT(ifake->if_fork->if_format == XFS_DINODE_FMT_RMAP); /* * Free any resources hanging off the real fork, then shallow-copy the @@ -575,3 +577,34 @@ xfs_rtrmapbt_compute_maxlevels( /* Add one level to handle the inode root level. */ mp->m_rtrmap_maxlevels = min(d_maxlevels, r_maxlevels) + 1; } + +#define XFS_RTRMAP_NAMELEN 17 + +/* Create the metadata directory path for an rtrmap btree inode. */ +int +xfs_rtrmapbt_create_path( + struct xfs_mount *mp, + xfs_rgnumber_t rgno, + struct xfs_imeta_path **pathp) +{ + struct xfs_imeta_path *path; + char *fname; + int error; + + error = xfs_imeta_create_file_path(mp, 2, &path); + if (error) + return error; + + fname = kmalloc(XFS_RTRMAP_NAMELEN, GFP_KERNEL); + if (!fname) { + xfs_imeta_free_path(path); + return -ENOMEM; + } + + snprintf(fname, XFS_RTRMAP_NAMELEN, "%u.rmap", rgno); + path->im_path[0] = "realtime"; + path->im_path[1] = fname; + path->im_dynamicmask = 0x2; + *pathp = path; + return 0; +} diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.h b/fs/xfs/libxfs/xfs_rtrmap_btree.h index 7380c04e7705..26e2445f5d6c 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.h +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.h @@ -11,6 +11,7 @@ struct xfs_btree_cur; struct xfs_mount; struct xbtree_ifakeroot; struct xfs_rtgroup; +struct xfs_imeta_path; /* rmaps only exist on crc enabled filesystems */ #define XFS_RTRMAP_BLOCK_LEN XFS_BTREE_LBLOCK_CRC_LEN @@ -80,4 +81,7 @@ unsigned int xfs_rtrmapbt_maxlevels_ondisk(void); int __init xfs_rtrmapbt_init_cur_cache(void); void xfs_rtrmapbt_destroy_cur_cache(void); +int xfs_rtrmapbt_create_path(struct xfs_mount *mp, xfs_rgnumber_t rgno, + struct xfs_imeta_path **pathp); + #endif /* __XFS_RTRMAP_BTREE_H__ */ diff --git 
a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ab805df9db16..3b0c04b6bcdf 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2572,7 +2572,15 @@ xfs_iflush( __func__, ip->i_ino, be16_to_cpu(dip->di_magic), dip); goto flush_out; } - if (S_ISREG(VFS_I(ip)->i_mode)) { + if (ip->i_df.if_format == XFS_DINODE_FMT_RMAP) { + if (!S_ISREG(VFS_I(ip)->i_mode) || + !(ip->i_diflags2 & XFS_DIFLAG2_METADATA)) { + xfs_alert_tag(mp, XFS_PTAG_IFLUSH, + "%s: Bad rt rmapbt inode %Lu, ptr "PTR_FMT, + __func__, ip->i_ino, ip); + goto flush_out; + } + } else if (S_ISREG(VFS_I(ip)->i_mode)) { if (XFS_TEST_ERROR( ip->i_df.if_format != XFS_DINODE_FMT_EXTENTS && ip->i_df.if_format != XFS_DINODE_FMT_BTREE, @@ -2612,6 +2620,15 @@ xfs_iflush( goto flush_out; } + if (xfs_inode_has_attr_fork(ip)) { + if (ip->i_af.if_format == XFS_DINODE_FMT_RMAP) { + xfs_alert_tag(mp, XFS_PTAG_IFLUSH, + "%s: rt rmapbt in inode %Lu attr fork, ptr "PTR_FMT, + __func__, ip->i_ino, ip); + goto flush_out; + } + } + /* * Inode item log recovery for v2 inodes are dependent on the flushiter * count for correct sequencing. 
We bump the flush iteration count so diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index ca2941ab6cbc..b6e374744474 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -62,6 +62,7 @@ xfs_inode_item_data_fork_size( } break; case XFS_DINODE_FMT_BTREE: + case XFS_DINODE_FMT_RMAP: if ((iip->ili_fields & XFS_ILOG_DBROOT) && ip->i_df.if_broot_bytes > 0) { *nbytes += ip->i_df.if_broot_bytes; @@ -182,6 +183,7 @@ xfs_inode_item_format_data_fork( } break; case XFS_DINODE_FMT_BTREE: + case XFS_DINODE_FMT_RMAP: iip->ili_fields &= ~(XFS_ILOG_DDATA | XFS_ILOG_DEXT | XFS_ILOG_DEV); diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c index 0e5dba2343ea..3453a204d196 100644 --- a/fs/xfs/xfs_inode_item_recover.c +++ b/fs/xfs/xfs_inode_item_recover.c @@ -390,6 +390,7 @@ xlog_recover_inode_commit_pass2( if (unlikely(S_ISREG(ldip->di_mode))) { if ((ldip->di_format != XFS_DINODE_FMT_EXTENTS) && + (ldip->di_format != XFS_DINODE_FMT_RMAP) && (ldip->di_format != XFS_DINODE_FMT_BTREE)) { XFS_CORRUPTION_ERROR( "Bad log dinode data fork format for regular file", diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 82b729a86740..ba330265ab8a 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -27,6 +27,8 @@ #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" #include "xfs_quota.h" +#include "xfs_error.h" +#include "xfs_rtrmap_btree.h" /* * Realtime metadata files are not quite regular files because userspace can't @@ -37,6 +39,7 @@ */ static struct lock_class_key xfs_rbmip_key; static struct lock_class_key xfs_rsumip_key; +static struct lock_class_key xfs_rrmapip_key; /* * Read and return the summary information for a given extent size, @@ -1600,6 +1603,47 @@ __xfs_rt_iget( #define xfs_rt_iget(mp, ino, lockdep_key, ipp) \ __xfs_rt_iget((mp), (ino), (lockdep_key), #lockdep_key, (ipp)) +/* Load realtime rmap btree inode. 
*/ +STATIC int +xfs_rtmount_rmapbt( + struct xfs_rtgroup *rtg) +{ + struct xfs_mount *mp = rtg->rtg_mount; + struct xfs_imeta_path *path; + struct xfs_inode *ip; + xfs_ino_t ino; + int error; + + if (!xfs_has_rtrmapbt(mp)) + return 0; + + error = xfs_rtrmapbt_create_path(mp, rtg->rtg_rgno, &path); + if (error) + return error; + + error = xfs_imeta_lookup(mp, path, &ino); + if (error) + goto out_path; + + error = xfs_rt_iget(mp, ino, &xfs_rrmapip_key, &ip); + if (error) + goto out_path; + + if (XFS_IS_CORRUPT(mp, ip->i_df.if_format != XFS_DINODE_FMT_RMAP)) { + error = -EFSCORRUPTED; + goto out_rele; + } + + rtg->rtg_rmapip = ip; + ip = NULL; +out_rele: + if (ip) + xfs_imeta_irele(ip); +out_path: + xfs_imeta_free_path(path); + return error; +} + /* * Read in the bmbt of an rt metadata inode so that we never have to load them * at runtime. This enables the use of shared ILOCKs for rtbitmap scans. Use @@ -1638,7 +1682,7 @@ xfs_rtmount_iread_extents( * Get the bitmap and summary inodes and the summary cache into the mount * structure at mount time. 
*/ -int /* error */ +int xfs_rtmount_inodes( struct xfs_mount *mp) /* file system mount structure */ { @@ -1675,11 +1719,23 @@ xfs_rtmount_inodes( for_each_rtgroup(mp, rgno, rtg) { rtg->rtg_blockcount = xfs_rtgroup_block_count(mp, rtg->rtg_rgno); + + error = xfs_rtmount_rmapbt(rtg); + if (error) { + xfs_rtgroup_put(rtg); + goto out_rele_rtgroup; + } } xfs_alloc_rsum_cache(mp, sbp->sb_rbmblocks); return 0; +out_rele_rtgroup: + for_each_rtgroup(mp, rgno, rtg) { + if (rtg->rtg_rmapip) + xfs_imeta_irele(rtg->rtg_rmapip); + rtg->rtg_rmapip = NULL; + } out_rele_summary: xfs_imeta_irele(mp->m_rsumip); out_rele_bitmap: @@ -1692,6 +1748,8 @@ int xfs_rtmount_dqattach( struct xfs_mount *mp) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; int error; error = xfs_qm_dqattach(mp->m_rbmip); @@ -1702,6 +1760,16 @@ xfs_rtmount_dqattach( if (error) return error; + for_each_rtgroup(mp, rgno, rtg) { + if (rtg->rtg_rmapip) { + error = xfs_qm_dqattach(rtg->rtg_rmapip); + if (error) { + xfs_rtgroup_put(rtg); + return error; + } + } + } + return 0; } @@ -1709,7 +1777,16 @@ void xfs_rtunmount_inodes( struct xfs_mount *mp) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + kmem_free(mp->m_rsum_cache); + + for_each_rtgroup(mp, rgno, rtg) { + if (rtg->rtg_rmapip) + xfs_imeta_irele(rtg->rtg_rmapip); + rtg->rtg_rmapip = NULL; + } if (mp->m_rbmip) xfs_imeta_irele(mp->m_rbmip); if (mp->m_rsumip) diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index c02a58cbf15b..77f4acc1b923 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -2235,6 +2235,7 @@ TRACE_DEFINE_ENUM(XFS_DINODE_FMT_LOCAL); TRACE_DEFINE_ENUM(XFS_DINODE_FMT_EXTENTS); TRACE_DEFINE_ENUM(XFS_DINODE_FMT_BTREE); TRACE_DEFINE_ENUM(XFS_DINODE_FMT_UUID); +TRACE_DEFINE_ENUM(XFS_DINODE_FMT_RMAP); DECLARE_EVENT_CLASS(xfs_swap_extent_class, TP_PROTO(struct xfs_inode *ip, int which), From patchwork Fri Dec 30 22:18:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085478 Subject: [PATCH 13/38] xfs: add metadata reservations for realtime rmap btrees From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:17 -0800 Message-ID: <167243869783.715303.4496353824604001461.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Reserve some free blocks so that we will always have enough free blocks in the data volume to handle expansion of the realtime rmap btree. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rtrmap_btree.c | 39 ++++++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrmap_btree.h | 2 ++ fs/xfs/xfs_rtalloc.c | 21 +++++++++++++++++++- 3 files changed, 61 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 754812eaff87..c90017408574 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -608,3 +608,42 @@ xfs_rtrmapbt_create_path( *pathp = path; return 0; } + +/* Calculate the rtrmap btree size for some records. */ +static unsigned long long +xfs_rtrmapbt_calc_size( + struct xfs_mount *mp, + unsigned long long len) +{ + return xfs_btree_calc_size(mp->m_rtrmap_mnr, len); +} + +/* + * Calculate the maximum rmap btree size. + */ +static unsigned long long +xfs_rtrmapbt_max_size( + struct xfs_mount *mp, + xfs_rtblock_t rtblocks) +{ + /* Bail out if we're uninitialized, which can happen in mkfs. */ + if (mp->m_rtrmap_mxr[0] == 0) + return 0; + + return xfs_rtrmapbt_calc_size(mp, rtblocks); +} + +/* + * Figure out how many blocks to reserve and how many are used by this btree. + */ +xfs_filblks_t +xfs_rtrmapbt_calc_reserves( + struct xfs_mount *mp) +{ + if (!xfs_has_rtrmapbt(mp)) + return 0; + + /* 1/64th (~1.5%) of the space, and enough for 1 record per block. 
*/ + return max_t(xfs_filblks_t, mp->m_sb.sb_rgblocks >> 6, + xfs_rtrmapbt_max_size(mp, mp->m_sb.sb_rgblocks)); +} diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.h b/fs/xfs/libxfs/xfs_rtrmap_btree.h index 26e2445f5d6c..63e667d0d76d 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.h +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.h @@ -84,4 +84,6 @@ void xfs_rtrmapbt_destroy_cur_cache(void); int xfs_rtrmapbt_create_path(struct xfs_mount *mp, xfs_rgnumber_t rgno, struct xfs_imeta_path **pathp); +xfs_filblks_t xfs_rtrmapbt_calc_reserves(struct xfs_mount *mp); + #endif /* __XFS_RTRMAP_BTREE_H__ */ diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index ba330265ab8a..c3d27cb85c26 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1571,6 +1571,11 @@ void xfs_rt_resv_free( struct xfs_mount *mp) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + + for_each_rtgroup(mp, rgno, rtg) + xfs_imeta_resv_free_inode(rtg->rtg_rmapip); } /* Reserve space for rt metadata inodes' space expansion. */ @@ -1578,7 +1583,21 @@ int xfs_rt_resv_init( struct xfs_mount *mp) { - return 0; + struct xfs_rtgroup *rtg; + xfs_filblks_t ask; + xfs_rgnumber_t rgno; + int error = 0; + + for_each_rtgroup(mp, rgno, rtg) { + int err2; + + ask = xfs_rtrmapbt_calc_reserves(mp); + err2 = xfs_imeta_resv_init_inode(rtg->rtg_rmapip, ask); + if (err2 && !error) + error = err2; + } + + return error; } static inline int From patchwork Fri Dec 30 22:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085479 Subject: [PATCH 14/38] xfs: wire up a new inode fork type for the realtime rmap From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:18 -0800 Message-ID: <167243869797.715303.12563373811365682239.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Plumb in the pieces we need to embed the root of the realtime rmap btree in an inode's data fork, complete with new fork type and on-disk interpretation functions. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_format.h | 8 + fs/xfs/libxfs/xfs_inode_fork.c | 8 + fs/xfs/libxfs/xfs_rtrmap_btree.c | 220 ++++++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrmap_btree.h | 112 +++++++++++++++++++ fs/xfs/xfs_inode_item_recover.c | 32 +++++- fs/xfs/xfs_ondisk.h | 1 6 files changed, 375 insertions(+), 6 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index babe5d3fabb1..a2b8d8ee8afd 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1736,6 +1736,14 @@ typedef __be32 xfs_rmap_ptr_t; */ #define XFS_RTRMAP_CRC_MAGIC 0x4d415052 /* 'MAPR' */ +/* + * rtrmap root header, on-disk form only. 
+ */ +struct xfs_rtrmap_root { + __be16 bb_level; /* 0 is a leaf */ + __be16 bb_numrecs; /* current # of data records */ +}; + /* inode-based btree pointer type */ typedef __be64 xfs_rtrmap_ptr_t; diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index 899428f96b94..94979bed8f32 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -27,6 +27,7 @@ #include "xfs_errortag.h" #include "xfs_health.h" #include "xfs_symlink_remote.h" +#include "xfs_rtrmap_btree.h" struct kmem_cache *xfs_ifork_cache; @@ -262,8 +263,7 @@ xfs_iformat_data_fork( case XFS_DINODE_FMT_RMAP: if (!xfs_has_rtrmapbt(ip->i_mount)) return -EFSCORRUPTED; - ASSERT(0); /* to be implemented later */ - return -EFSCORRUPTED; + return xfs_iformat_rtrmap(ip, dip); default: xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip, sizeof(*dip), __this_address); @@ -645,7 +645,9 @@ xfs_iflush_fork( break; case XFS_DINODE_FMT_RMAP: - ASSERT(0); /* to be implemented later */ + ASSERT(whichfork == XFS_DATA_FORK); + if (iip->ili_fields & brootflag[whichfork]) + xfs_iflush_rtrmap(ip, dip); break; default: diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index c90017408574..a099f33f26ab 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -85,6 +85,39 @@ xfs_rtrmapbt_get_maxrecs( return cur->bc_mp->m_rtrmap_mxr[level != 0]; } +/* Calculate number of records in the ondisk realtime rmap btree inode root. */ +unsigned int +xfs_rtrmapbt_droot_maxrecs( + unsigned int blocklen, + bool leaf) +{ + blocklen -= sizeof(struct xfs_rtrmap_root); + + if (leaf) + return blocklen / sizeof(struct xfs_rmap_rec); + return blocklen / (2 * sizeof(struct xfs_rmap_key) + + sizeof(xfs_rtrmap_ptr_t)); +} + +/* + * Get the maximum records we could store in the on-disk format. 
+ * + * For non-root nodes this is equivalent to xfs_rtrmapbt_get_maxrecs, but + * for the root node this checks the available space in the dinode fork + * so that we can resize the in-memory buffer to match it. After a + * resize to the maximum size this function returns the same value + * as xfs_rtrmapbt_get_maxrecs for the root node, too. + */ +STATIC int +xfs_rtrmapbt_get_dmaxrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level != cur->bc_nlevels - 1) + return cur->bc_mp->m_rtrmap_mxr[level != 0]; + return xfs_rtrmapbt_droot_maxrecs(cur->bc_ino.forksize, level == 0); +} + /* * Convert the ondisk record's offset field into the ondisk key's offset field. * Fork and bmbt are significant parts of the rmap record key, but written @@ -377,6 +410,64 @@ xfs_rtrmapbt_keys_contiguous( be32_to_cpu(key2->rmap.rm_startblock)); } +/* Move the rtrmap btree root from one incore buffer to another. */ +static void +xfs_rtrmapbt_broot_move( + struct xfs_inode *ip, + int whichfork, + struct xfs_btree_block *dst_broot, + size_t dst_bytes, + struct xfs_btree_block *src_broot, + size_t src_bytes, + unsigned int level, + unsigned int numrecs) +{ + struct xfs_mount *mp = ip->i_mount; + void *dptr; + void *sptr; + + ASSERT(xfs_rtrmap_droot_space(src_broot) <= + xfs_inode_fork_size(ip, whichfork)); + + /* + * We always have to move the pointers because they are not butted + * against the btree block header. + */ + if (numrecs && level > 0) { + sptr = xfs_rtrmap_broot_ptr_addr(mp, src_broot, 1, src_bytes); + dptr = xfs_rtrmap_broot_ptr_addr(mp, dst_broot, 1, dst_bytes); + memmove(dptr, sptr, numrecs * sizeof(xfs_fsblock_t)); + } + + if (src_broot == dst_broot) + return; + + /* + * If the root is being totally relocated, we have to migrate the block + * header and the keys/records that come after it. 
+ */ + memcpy(dst_broot, src_broot, XFS_RTRMAP_BLOCK_LEN); + + if (!numrecs) + return; + + if (level == 0) { + sptr = xfs_rtrmap_rec_addr(src_broot, 1); + dptr = xfs_rtrmap_rec_addr(dst_broot, 1); + memcpy(dptr, sptr, numrecs * sizeof(struct xfs_rmap_rec)); + } else { + sptr = xfs_rtrmap_key_addr(src_broot, 1); + dptr = xfs_rtrmap_key_addr(dst_broot, 1); + memcpy(dptr, sptr, numrecs * 2 * sizeof(struct xfs_rmap_key)); + } +} + +static const struct xfs_ifork_broot_ops xfs_rtrmapbt_iroot_ops = { + .maxrecs = xfs_rtrmapbt_maxrecs, + .size = xfs_rtrmap_broot_space_calc, + .move = xfs_rtrmapbt_broot_move, +}; + const struct xfs_btree_ops xfs_rtrmapbt_ops = { .rec_len = sizeof(struct xfs_rmap_rec), .key_len = 2 * sizeof(struct xfs_rmap_key), @@ -389,6 +480,7 @@ const struct xfs_btree_ops xfs_rtrmapbt_ops = { .free_block = xfs_btree_free_imeta_block, .get_minrecs = xfs_rtrmapbt_get_minrecs, .get_maxrecs = xfs_rtrmapbt_get_maxrecs, + .get_dmaxrecs = xfs_rtrmapbt_get_dmaxrecs, .init_key_from_rec = xfs_rtrmapbt_init_key_from_rec, .init_high_key_from_rec = xfs_rtrmapbt_init_high_key_from_rec, .init_rec_from_cur = xfs_rtrmapbt_init_rec_from_cur, @@ -399,6 +491,7 @@ const struct xfs_btree_ops xfs_rtrmapbt_ops = { .keys_inorder = xfs_rtrmapbt_keys_inorder, .recs_inorder = xfs_rtrmapbt_recs_inorder, .keys_contiguous = xfs_rtrmapbt_keys_contiguous, + .iroot_ops = &xfs_rtrmapbt_iroot_ops, }; /* Initialize a new rt rmap btree cursor. */ @@ -647,3 +740,130 @@ xfs_rtrmapbt_calc_reserves( return max_t(xfs_filblks_t, mp->m_sb.sb_rgblocks >> 6, xfs_rtrmapbt_max_size(mp, mp->m_sb.sb_rgblocks)); } + +/* Convert on-disk form of btree root to in-memory form. 
*/ +STATIC void +xfs_rtrmapbt_from_disk( + struct xfs_inode *ip, + struct xfs_rtrmap_root *dblock, + unsigned int dblocklen, + struct xfs_btree_block *rblock) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_rmap_key *fkp; + __be64 *fpp; + struct xfs_rmap_key *tkp; + __be64 *tpp; + struct xfs_rmap_rec *frp; + struct xfs_rmap_rec *trp; + unsigned int rblocklen = xfs_rtrmap_broot_space(mp, dblock); + unsigned int numrecs; + unsigned int maxrecs; + + xfs_btree_init_block(mp, rblock, &xfs_rtrmapbt_ops, 0, 0, ip->i_ino); + + rblock->bb_level = dblock->bb_level; + rblock->bb_numrecs = dblock->bb_numrecs; + numrecs = be16_to_cpu(dblock->bb_numrecs); + + if (be16_to_cpu(rblock->bb_level) > 0) { + maxrecs = xfs_rtrmapbt_droot_maxrecs(dblocklen, false); + fkp = xfs_rtrmap_droot_key_addr(dblock, 1); + tkp = xfs_rtrmap_key_addr(rblock, 1); + fpp = xfs_rtrmap_droot_ptr_addr(dblock, 1, maxrecs); + tpp = xfs_rtrmap_broot_ptr_addr(mp, rblock, 1, rblocklen); + memcpy(tkp, fkp, 2 * sizeof(*fkp) * numrecs); + memcpy(tpp, fpp, sizeof(*fpp) * numrecs); + } else { + frp = xfs_rtrmap_droot_rec_addr(dblock, 1); + trp = xfs_rtrmap_rec_addr(rblock, 1); + memcpy(trp, frp, sizeof(*frp) * numrecs); + } +} + +/* Load a realtime reverse mapping btree root in from disk. 
*/ +int +xfs_iformat_rtrmap( + struct xfs_inode *ip, + struct xfs_dinode *dip) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + struct xfs_rtrmap_root *dfp = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + unsigned int numrecs; + unsigned int level; + int dsize; + + dsize = XFS_DFORK_SIZE(dip, mp, XFS_DATA_FORK); + numrecs = be16_to_cpu(dfp->bb_numrecs); + level = be16_to_cpu(dfp->bb_level); + + if (level > mp->m_rtrmap_maxlevels || + xfs_rtrmap_droot_space_calc(level, numrecs) > dsize) + return -EFSCORRUPTED; + + xfs_iroot_alloc(ip, XFS_DATA_FORK, + xfs_rtrmap_broot_space_calc(mp, level, numrecs)); + xfs_rtrmapbt_from_disk(ip, dfp, dsize, ifp->if_broot); + return 0; +} + +/* Convert in-memory form of btree root to on-disk form. */ +void +xfs_rtrmapbt_to_disk( + struct xfs_mount *mp, + struct xfs_btree_block *rblock, + unsigned int rblocklen, + struct xfs_rtrmap_root *dblock, + unsigned int dblocklen) +{ + struct xfs_rmap_key *fkp; + __be64 *fpp; + struct xfs_rmap_key *tkp; + __be64 *tpp; + struct xfs_rmap_rec *frp; + struct xfs_rmap_rec *trp; + unsigned int numrecs; + unsigned int maxrecs; + + ASSERT(rblock->bb_magic == cpu_to_be32(XFS_RTRMAP_CRC_MAGIC)); + ASSERT(uuid_equal(&rblock->bb_u.l.bb_uuid, &mp->m_sb.sb_meta_uuid)); + ASSERT(rblock->bb_u.l.bb_blkno == cpu_to_be64(XFS_BUF_DADDR_NULL)); + ASSERT(rblock->bb_u.l.bb_leftsib == cpu_to_be64(NULLFSBLOCK)); + ASSERT(rblock->bb_u.l.bb_rightsib == cpu_to_be64(NULLFSBLOCK)); + + dblock->bb_level = rblock->bb_level; + dblock->bb_numrecs = rblock->bb_numrecs; + numrecs = be16_to_cpu(rblock->bb_numrecs); + + if (be16_to_cpu(rblock->bb_level) > 0) { + maxrecs = xfs_rtrmapbt_droot_maxrecs(dblocklen, false); + fkp = xfs_rtrmap_key_addr(rblock, 1); + tkp = xfs_rtrmap_droot_key_addr(dblock, 1); + fpp = xfs_rtrmap_broot_ptr_addr(mp, rblock, 1, rblocklen); + tpp = xfs_rtrmap_droot_ptr_addr(dblock, 1, maxrecs); + memcpy(tkp, fkp, 2 * sizeof(*fkp) * numrecs); + memcpy(tpp, fpp, 
sizeof(*fpp) * numrecs); + } else { + frp = xfs_rtrmap_rec_addr(rblock, 1); + trp = xfs_rtrmap_droot_rec_addr(dblock, 1); + memcpy(trp, frp, sizeof(*frp) * numrecs); + } +} + +/* Flush a realtime reverse mapping btree root out to disk. */ +void +xfs_iflush_rtrmap( + struct xfs_inode *ip, + struct xfs_dinode *dip) +{ + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + struct xfs_rtrmap_root *dfp = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + + ASSERT(ifp->if_broot != NULL); + ASSERT(ifp->if_broot_bytes > 0); + ASSERT(xfs_rtrmap_droot_space(ifp->if_broot) <= + xfs_inode_fork_size(ip, XFS_DATA_FORK)); + xfs_rtrmapbt_to_disk(ip->i_mount, ifp->if_broot, ifp->if_broot_bytes, + dfp, XFS_DFORK_SIZE(dip, ip->i_mount, XFS_DATA_FORK)); +} diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.h b/fs/xfs/libxfs/xfs_rtrmap_btree.h index 63e667d0d76d..6917a31bfe0c 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.h +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.h @@ -27,6 +27,7 @@ void xfs_rtrmapbt_commit_staged_btree(struct xfs_btree_cur *cur, unsigned int xfs_rtrmapbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen, bool leaf); void xfs_rtrmapbt_compute_maxlevels(struct xfs_mount *mp); +unsigned int xfs_rtrmapbt_droot_maxrecs(unsigned int blocklen, bool leaf); /* * Addresses of records, keys, and pointers within an incore rtrmapbt block. @@ -86,4 +87,115 @@ int xfs_rtrmapbt_create_path(struct xfs_mount *mp, xfs_rgnumber_t rgno, xfs_filblks_t xfs_rtrmapbt_calc_reserves(struct xfs_mount *mp); +/* Addresses of key, pointers, and records within an ondisk rtrmapbt block. 
*/ + +static inline struct xfs_rmap_rec * +xfs_rtrmap_droot_rec_addr( + struct xfs_rtrmap_root *block, + unsigned int index) +{ + return (struct xfs_rmap_rec *) + ((char *)(block + 1) + + (index - 1) * sizeof(struct xfs_rmap_rec)); +} + +static inline struct xfs_rmap_key * +xfs_rtrmap_droot_key_addr( + struct xfs_rtrmap_root *block, + unsigned int index) +{ + return (struct xfs_rmap_key *) + ((char *)(block + 1) + + (index - 1) * 2 * sizeof(struct xfs_rmap_key)); +} + +static inline xfs_rtrmap_ptr_t * +xfs_rtrmap_droot_ptr_addr( + struct xfs_rtrmap_root *block, + unsigned int index, + unsigned int maxrecs) +{ + return (xfs_rtrmap_ptr_t *) + ((char *)(block + 1) + + maxrecs * 2 * sizeof(struct xfs_rmap_key) + + (index - 1) * sizeof(xfs_rtrmap_ptr_t)); +} + +/* + * Address of pointers within the incore btree root. + * + * These are to be used when we know the size of the block and + * we don't have a cursor. + */ +static inline xfs_rtrmap_ptr_t * +xfs_rtrmap_broot_ptr_addr( + struct xfs_mount *mp, + struct xfs_btree_block *bb, + unsigned int index, + unsigned int block_size) +{ + return xfs_rtrmap_ptr_addr(bb, index, + xfs_rtrmapbt_maxrecs(mp, block_size, false)); +} + +/* + * Compute the space required for the incore btree root containing the given + * number of records. + */ +static inline size_t +xfs_rtrmap_broot_space_calc( + struct xfs_mount *mp, + unsigned int level, + unsigned int nrecs) +{ + size_t sz = XFS_RTRMAP_BLOCK_LEN; + + if (level > 0) + return sz + nrecs * (2 * sizeof(struct xfs_rmap_key) + + sizeof(xfs_rtrmap_ptr_t)); + return sz + nrecs * sizeof(struct xfs_rmap_rec); +} + +/* + * Compute the space required for the incore btree root given the ondisk + * btree root block. + */ +static inline size_t +xfs_rtrmap_broot_space(struct xfs_mount *mp, struct xfs_rtrmap_root *bb) +{ + return xfs_rtrmap_broot_space_calc(mp, be16_to_cpu(bb->bb_level), + be16_to_cpu(bb->bb_numrecs)); +} + +/* Compute the space required for the ondisk root block. 
*/ +static inline size_t +xfs_rtrmap_droot_space_calc( + unsigned int level, + unsigned int nrecs) +{ + size_t sz = sizeof(struct xfs_rtrmap_root); + + if (level > 0) + return sz + nrecs * (2 * sizeof(struct xfs_rmap_key) + + sizeof(xfs_rtrmap_ptr_t)); + return sz + nrecs * sizeof(struct xfs_rmap_rec); +} + +/* + * Compute the space required for the ondisk root block given an incore root + * block. + */ +static inline size_t +xfs_rtrmap_droot_space(struct xfs_btree_block *bb) +{ + return xfs_rtrmap_droot_space_calc(be16_to_cpu(bb->bb_level), + be16_to_cpu(bb->bb_numrecs)); +} + +int xfs_iformat_rtrmap(struct xfs_inode *ip, struct xfs_dinode *dip); +void xfs_rtrmapbt_to_disk(struct xfs_mount *mp, struct xfs_btree_block *rblock, + unsigned int rblocklen, struct xfs_rtrmap_root *dblock, + unsigned int dblocklen); +void xfs_iflush_rtrmap(struct xfs_inode *ip, struct xfs_dinode *dip); + #endif /* __XFS_RTRMAP_BTREE_H__ */ diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c index 3453a204d196..4f1ed1f6a34d 100644 --- a/fs/xfs/xfs_inode_item_recover.c +++ b/fs/xfs/xfs_inode_item_recover.c @@ -22,6 +22,7 @@ #include "xfs_log_recover.h" #include "xfs_icache.h" #include "xfs_bmap_btree.h" +#include "xfs_rtrmap_btree.h" STATIC void xlog_recover_inode_ra_pass2( @@ -266,6 +267,31 @@ xlog_dinode_verify_extent_counts( return 0; } +static inline int +xlog_recover_inode_dbroot( + struct xfs_mount *mp, + void *src, + unsigned int len, + struct xfs_dinode *dip) +{ + void *dfork = XFS_DFORK_DPTR(dip); + unsigned int dsize = XFS_DFORK_DSIZE(dip, mp); + + switch (dip->di_format) { + case XFS_DINODE_FMT_BTREE: + xfs_bmbt_to_bmdr(mp, src, len, dfork, dsize); + break; + case XFS_DINODE_FMT_RMAP: + xfs_rtrmapbt_to_disk(mp, src, len, dfork, dsize); + break; + default: + ASSERT(0); + return -EFSCORRUPTED; + } + + return 0; +} + STATIC int xlog_recover_inode_commit_pass2( struct xlog *log, @@ -472,9 +498,9 @@ xlog_recover_inode_commit_pass2( break; case 
XFS_ILOG_DBROOT: - xfs_bmbt_to_bmdr(mp, (struct xfs_btree_block *)src, len, - (struct xfs_bmdr_block *)XFS_DFORK_DPTR(dip), - XFS_DFORK_DSIZE(dip, mp)); + error = xlog_recover_inode_dbroot(mp, src, len, dip); + if (error) + goto out_release; break; default: diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h index 35d0695fbf57..f24a08dd63e9 100644 --- a/fs/xfs/xfs_ondisk.h +++ b/fs/xfs/xfs_ondisk.h @@ -78,6 +78,7 @@ xfs_check_ondisk_structs(void) XFS_CHECK_STRUCT_SIZE(union xfs_suminfo_ondisk, 4); XFS_CHECK_STRUCT_SIZE(struct xfs_rtbuf_blkinfo, 48); XFS_CHECK_STRUCT_SIZE(xfs_rtrmap_ptr_t, 8); + XFS_CHECK_STRUCT_SIZE(struct xfs_rtrmap_root, 4); /* * m68k has problems with xfs_attr_leaf_name_remote_t, but we pad it to
From patchwork Fri Dec 30 22:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085480 Subject: [PATCH 15/38] xfs: use realtime EFI to free extents when realtime rmap is enabled From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:18 -0800 Message-ID: <167243869812.715303.10166820015058214253.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong When rmap is enabled, XFS expects a certain order of operations, which is: 1) remove the file mapping, 2) remove the reverse mapping, and then 3) free the blocks. xfs_bmap_del_extent_real tries to do 1 and 3 in the same transaction, which means that when rtrmap is enabled, we have to use realtime EFIs to maintain the expected order. Signed-off-by: Darrick J.
Wong --- fs/xfs/libxfs/xfs_bmap.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 2e93b018d150..8c683db35788 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -5094,7 +5094,6 @@ xfs_bmap_del_extent_real( { xfs_fsblock_t del_endblock=0; /* first block past del */ xfs_fileoff_t del_endoff; /* first offset past del */ - int do_fx; /* free extent at end of routine */ int error; /* error return value */ int flags = 0;/* inode logging flags */ struct xfs_bmbt_irec got; /* current extent entry */ @@ -5108,6 +5107,8 @@ xfs_bmap_del_extent_real( uint qfield; /* quota field to update */ uint32_t state = xfs_bmap_fork_to_state(whichfork); struct xfs_bmbt_irec old; + bool isrt = xfs_ifork_is_realtime(ip, whichfork); + bool want_free = !(bflags & XFS_BMAPI_REMAP); mp = ip->i_mount; XFS_STATS_INC(mp, xs_del_exlist); @@ -5138,17 +5139,24 @@ xfs_bmap_del_extent_real( return -ENOSPC; flags = XFS_ILOG_CORE; - if (xfs_ifork_is_realtime(ip, whichfork)) { - if (!(bflags & XFS_BMAPI_REMAP)) { + if (isrt) { + /* + * Historically, we did not use EFIs to free realtime extents. + * However, when reverse mapping is enabled, we must maintain + * the same order of operations as the data device, which is: + * Remove the file mapping, remove the reverse mapping, and + * then free the blocks. This means that we must delay the + * freeing until after we've scheduled the rmap update. + */ + if (want_free && !xfs_has_rtrmapbt(mp)) { error = xfs_rtfree_blocks(tp, del->br_startblock, del->br_blockcount); if (error) goto done; + want_free = false; } - do_fx = 0; qfield = XFS_TRANS_DQ_RTBCOUNT; } else { - do_fx = 1; qfield = XFS_TRANS_DQ_BCOUNT; } nblks = del->br_blockcount; @@ -5303,7 +5311,7 @@ xfs_bmap_del_extent_real( /* * If we need to, add to list of extents to delete. 
*/ - if (do_fx && !(bflags & XFS_BMAPI_REMAP)) { + if (want_free) { if (xfs_is_reflink_inode(ip) && whichfork == XFS_DATA_FORK) { xfs_refcount_decrease_extent(tp, del); } else { @@ -5312,6 +5320,8 @@ xfs_bmap_del_extent_real( if ((bflags & XFS_BMAPI_NODISCARD) || del->br_state == XFS_EXT_UNWRITTEN) efi_flags |= XFS_FREE_EXTENT_SKIP_DISCARD; + if (isrt) + efi_flags |= XFS_FREE_EXTENT_REALTIME; xfs_free_extent_later(tp, del->br_startblock, del->br_blockcount, NULL, efi_flags);
From patchwork Fri Dec 30 22:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085481 Subject: [PATCH 16/38] xfs: wire up rmap map and unmap to the realtime rmapbt From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:18 -0800 Message-ID: <167243869827.715303.11017000588580522165.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Connect the map and unmap reverse-mapping operations to the realtime rmapbt via the deferred operation callbacks. This enables us to perform rmap operations against the correct btree. Signed-off-by: Darrick J.
Wong --- fs/xfs/libxfs/xfs_rmap.c | 80 +++++++++++++++++++++++++++---------------- fs/xfs/libxfs/xfs_rtgroup.c | 9 +++++ fs/xfs/libxfs/xfs_rtgroup.h | 5 ++- 3 files changed, 63 insertions(+), 31 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 1a3607082d12..e3bff42d003d 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -25,6 +25,7 @@ #include "xfs_ag.h" #include "xfs_health.h" #include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" struct kmem_cache *xfs_rmap_intent_cache; @@ -2591,13 +2592,14 @@ xfs_rmap_finish_one_cleanup( struct xfs_btree_cur *rcur, int error) { - struct xfs_buf *agbp; + struct xfs_buf *agbp = NULL; if (rcur == NULL) return; - agbp = rcur->bc_ag.agbp; + if (rcur->bc_btnum == XFS_BTNUM_RMAP) + agbp = rcur->bc_ag.agbp; xfs_btree_del_cursor(rcur, error); - if (error) + if (error && agbp) xfs_trans_brelse(tp, agbp); } @@ -2633,6 +2635,17 @@ __xfs_rmap_finish_intent( } } +/* Does this btree cursor match the given group object? */ +static inline bool +xfs_rmap_is_wrong_cursor( + struct xfs_btree_cur *cur, + struct xfs_rmap_intent *ri) +{ + if (cur->bc_btnum == XFS_BTNUM_RTRMAP) + return cur->bc_ino.rtg != ri->ri_rtg; + return cur->bc_ag.pag != ri->ri_pag; +} + /* * Process one of the deferred rmap operations. We pass back the * btree cursor to maintain our lock on the rmapbt between calls. 
@@ -2646,24 +2659,24 @@ xfs_rmap_finish_one( struct xfs_rmap_intent *ri, struct xfs_btree_cur **pcur) { + struct xfs_owner_info oinfo; struct xfs_mount *mp = tp->t_mountp; struct xfs_btree_cur *rcur; struct xfs_buf *agbp = NULL; - int error = 0; - struct xfs_owner_info oinfo; xfs_agblock_t bno; bool unwritten; - - if (ri->ri_realtime) { - /* coming in a subsequent patch */ - ASSERT(0); - return -EFSCORRUPTED; - } - - bno = XFS_FSB_TO_AGBNO(mp, ri->ri_bmap.br_startblock); + int error = 0; trace_xfs_rmap_deferred(mp, ri); + if (ri->ri_realtime) { + xfs_rgnumber_t rgno; + + bno = xfs_rtb_to_rgbno(mp, ri->ri_bmap.br_startblock, &rgno); + } else { + bno = XFS_FSB_TO_AGBNO(mp, ri->ri_bmap.br_startblock); + } + if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_RMAP_FINISH_ONE)) return -EIO; @@ -2672,35 +2685,42 @@ xfs_rmap_finish_one( * the startblock, get one now. */ rcur = *pcur; - if (rcur != NULL && rcur->bc_ag.pag != ri->ri_pag) { + if (rcur != NULL && xfs_rmap_is_wrong_cursor(rcur, ri)) { xfs_rmap_finish_one_cleanup(tp, rcur, 0); rcur = NULL; *pcur = NULL; } if (rcur == NULL) { - /* - * Refresh the freelist before we start changing the - * rmapbt, because a shape change could cause us to - * allocate blocks. - */ - error = xfs_free_extent_fix_freelist(tp, ri->ri_pag, &agbp); - if (error) { - xfs_ag_mark_sick(ri->ri_pag, XFS_SICK_AG_AGFL); - return error; - } - if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) { - xfs_ag_mark_sick(ri->ri_pag, XFS_SICK_AG_AGFL); - return -EFSCORRUPTED; - } + if (ri->ri_realtime) { + xfs_rtgroup_lock(tp, ri->ri_rtg, XFS_RTGLOCK_RMAP); + rcur = xfs_rtrmapbt_init_cursor(mp, tp, ri->ri_rtg, + ri->ri_rtg->rtg_rmapip); + rcur->bc_ino.flags = 0; + } else { + /* + * Refresh the freelist before we start changing the + * rmapbt, because a shape change could cause us to + * allocate blocks. 
+ */ + error = xfs_free_extent_fix_freelist(tp, ri->ri_pag, + &agbp); + if (error) { + xfs_ag_mark_sick(ri->ri_pag, XFS_SICK_AG_AGFL); + return error; + } + if (XFS_IS_CORRUPT(tp->t_mountp, !agbp)) { + xfs_ag_mark_sick(ri->ri_pag, XFS_SICK_AG_AGFL); + return -EFSCORRUPTED; + } - rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, ri->ri_pag); + rcur = xfs_rmapbt_init_cursor(mp, tp, agbp, ri->ri_pag); + } } *pcur = rcur; xfs_rmap_ino_owner(&oinfo, ri->ri_owner, ri->ri_whichfork, ri->ri_bmap.br_startoff); unwritten = ri->ri_bmap.br_state == XFS_EXT_UNWRITTEN; - bno = XFS_FSB_TO_AGBNO(rcur->bc_mp, ri->ri_bmap.br_startblock); error = __xfs_rmap_finish_intent(rcur, ri->ri_type, bno, ri->ri_bmap.br_blockcount, &oinfo, unwritten); diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index 4d9e2c0f2fd3..d6b790741265 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -515,6 +515,12 @@ xfs_rtgroup_lock( xfs_rtbitmap_lock(tp, rtg->rtg_mount); else if (rtglock_flags & XFS_RTGLOCK_BITMAP_SHARED) xfs_rtbitmap_lock_shared(rtg->rtg_mount, XFS_RBMLOCK_BITMAP); + + if ((rtglock_flags & XFS_RTGLOCK_RMAP) && rtg->rtg_rmapip) { + xfs_ilock(rtg->rtg_rmapip, XFS_ILOCK_EXCL); + if (tp) + xfs_trans_ijoin(tp, rtg->rtg_rmapip, XFS_ILOCK_EXCL); + } } /* Unlock metadata inodes associated with this rt group. 
*/ @@ -527,6 +533,9 @@ xfs_rtgroup_unlock( ASSERT(!(rtglock_flags & XFS_RTGLOCK_BITMAP_SHARED) || !(rtglock_flags & XFS_RTGLOCK_BITMAP)); + if ((rtglock_flags & XFS_RTGLOCK_RMAP) && rtg->rtg_rmapip) + xfs_iunlock(rtg->rtg_rmapip, XFS_ILOCK_EXCL); + if (rtglock_flags & XFS_RTGLOCK_BITMAP) xfs_rtbitmap_unlock(rtg->rtg_mount); else if (rtglock_flags & XFS_RTGLOCK_BITMAP_SHARED) diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 1792a9ab3bbf..3230dd03d8f8 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -220,9 +220,12 @@ int xfs_rtgroup_init_secondary_super(struct xfs_mount *mp, xfs_rgnumber_t rgno, #define XFS_RTGLOCK_BITMAP (1U << 0) /* Lock the rt bitmap inode in shared mode */ #define XFS_RTGLOCK_BITMAP_SHARED (1U << 1) +/* Lock the rt rmap inode in exclusive mode */ +#define XFS_RTGLOCK_RMAP (1U << 2) #define XFS_RTGLOCK_ALL_FLAGS (XFS_RTGLOCK_BITMAP | \ - XFS_RTGLOCK_BITMAP_SHARED) + XFS_RTGLOCK_BITMAP_SHARED | \ + XFS_RTGLOCK_RMAP) void xfs_rtgroup_lock(struct xfs_trans *tp, struct xfs_rtgroup *rtg, unsigned int rtglock_flags); From patchwork Fri Dec 30 22:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085482 Subject: [PATCH 17/38] xfs: create routine to allocate and initialize a realtime rmap btree inode From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:18 -0800 Message-ID: <167243869841.715303.1721205201463684435.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Create a library routine to allocate and initialize an empty realtime rmapbt inode. We'll use this for mkfs and repair. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rtrmap_btree.c | 42 ++++++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrmap_btree.h | 5 +++++ 2 files changed, 47 insertions(+) diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index a099f33f26ab..9181fca2ba54 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -27,6 +27,7 @@ #include "xfs_extent_busy.h" #include "xfs_rtgroup.h" #include "xfs_bmap.h" +#include "xfs_imeta.h" static struct kmem_cache *xfs_rtrmapbt_cur_cache; @@ -867,3 +868,44 @@ xfs_iflush_rtrmap( xfs_rtrmapbt_to_disk(ip->i_mount, ifp->if_broot, ifp->if_broot_bytes, dfp, XFS_DFORK_SIZE(dip, ip->i_mount, XFS_DATA_FORK)); } + +/* + * Create a realtime rmap btree inode. + * + * Regardless of the return value, the caller must clean up @ic. If a new + * inode is returned through *ipp, the caller must finish setting up the incore + * inode and release it. 
+ */ +int +xfs_rtrmapbt_create( + struct xfs_trans **tpp, + struct xfs_imeta_path *path, + struct xfs_imeta_update *upd, + struct xfs_inode **ipp) +{ + struct xfs_mount *mp = (*tpp)->t_mountp; + struct xfs_ifork *ifp; + struct xfs_inode *ip; + int error; + + *ipp = NULL; + + error = xfs_imeta_create(tpp, path, S_IFREG, 0, &ip, upd); + if (error) + return error; + + ifp = &ip->i_df; + ifp->if_format = XFS_DINODE_FMT_RMAP; + ASSERT(ifp->if_broot_bytes == 0); + ASSERT(ifp->if_bytes == 0); + + /* Initialize the empty incore btree root. */ + xfs_iroot_alloc(ip, XFS_DATA_FORK, + xfs_rtrmap_broot_space_calc(mp, 0, 0)); + xfs_btree_init_block(mp, ifp->if_broot, &xfs_rtrmapbt_ops, 0, 0, + ip->i_ino); + xfs_trans_log_inode(*tpp, ip, XFS_ILOG_CORE | XFS_ILOG_DBROOT); + + *ipp = ip; + return 0; +} diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.h b/fs/xfs/libxfs/xfs_rtrmap_btree.h index 6917a31bfe0c..046a60816736 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.h +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.h @@ -198,4 +198,9 @@ void xfs_rtrmapbt_to_disk(struct xfs_mount *mp, struct xfs_btree_block *rblock, unsigned int dblocklen); void xfs_iflush_rtrmap(struct xfs_inode *ip, struct xfs_dinode *dip); +struct xfs_imeta_update; + +int xfs_rtrmapbt_create(struct xfs_trans **tpp, struct xfs_imeta_path *path, + struct xfs_imeta_update *ic, struct xfs_inode **ipp); + #endif /* __XFS_RTRMAP_BTREE_H__ */ From patchwork Fri Dec 30 22:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085483 Subject: [PATCH 18/38] xfs: rearrange xfs_fsmap.c a little bit From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:18 -0800 Message-ID: <167243869855.715303.10438659359522602528.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong The order of the functions in this file has gotten a little confusing over the years. Specifically, the two data device implementations (bnobt and rmapbt) could be adjacent in the source code instead of split in two by the logdev and rtdev fsmap implementations. We're about to add more functionality to this file, so rearrange things now. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_fsmap.c | 366 ++++++++++++++++++++++++++-------------------------- 1 file changed, 183 insertions(+), 183 deletions(-) diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index 71053f840ea4..dfd9e39ded6e 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -343,6 +343,21 @@ xfs_getfsmap_helper( return 0; } +/* Set rmap flags based on the getfsmap flags */ +static void +xfs_getfsmap_set_irec_flags( + struct xfs_rmap_irec *irec, + const struct xfs_fsmap *fmr) +{ + irec->rm_flags = 0; + if (fmr->fmr_flags & FMR_OF_ATTR_FORK) + irec->rm_flags |= XFS_RMAP_ATTR_FORK; + if (fmr->fmr_flags & FMR_OF_EXTENT_MAP) + irec->rm_flags |= XFS_RMAP_BMBT_BLOCK; + if (fmr->fmr_flags & FMR_OF_PREALLOC) + irec->rm_flags |= XFS_RMAP_UNWRITTEN; +} + /* Transform a rmapbt irec into a fsmap */ STATIC int xfs_getfsmap_datadev_helper( @@ -385,189 +400,6 @@ xfs_getfsmap_datadev_bnobt_helper( return xfs_getfsmap_helper(cur->bc_tp, info, &irec, rec_daddr); } -/* Set rmap flags based on the getfsmap flags */ -static void -xfs_getfsmap_set_irec_flags( - struct xfs_rmap_irec *irec, - const struct xfs_fsmap *fmr) -{ - irec->rm_flags = 0; - if (fmr->fmr_flags & FMR_OF_ATTR_FORK) - 
irec->rm_flags |= XFS_RMAP_ATTR_FORK; - if (fmr->fmr_flags & FMR_OF_EXTENT_MAP) - irec->rm_flags |= XFS_RMAP_BMBT_BLOCK; - if (fmr->fmr_flags & FMR_OF_PREALLOC) - irec->rm_flags |= XFS_RMAP_UNWRITTEN; -} - -/* Execute a getfsmap query against the log device. */ -STATIC int -xfs_getfsmap_logdev( - struct xfs_trans *tp, - const struct xfs_fsmap *keys, - struct xfs_getfsmap_info *info) -{ - struct xfs_mount *mp = tp->t_mountp; - struct xfs_rmap_irec rmap; - int error; - - /* Set up search keys */ - info->low.rm_startblock = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); - info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset); - error = xfs_fsmap_owner_to_rmap(&info->low, keys); - if (error) - return error; - info->low.rm_blockcount = 0; - xfs_getfsmap_set_irec_flags(&info->low, &keys[0]); - - error = xfs_fsmap_owner_to_rmap(&info->high, keys + 1); - if (error) - return error; - info->high.rm_startblock = -1U; - info->high.rm_owner = ULLONG_MAX; - info->high.rm_offset = ULLONG_MAX; - info->high.rm_blockcount = 0; - info->high.rm_flags = XFS_RMAP_KEY_FLAGS | XFS_RMAP_REC_FLAGS; - info->missing_owner = XFS_FMR_OWN_FREE; - - trace_xfs_fsmap_low_key(mp, info->dev, NULLAGNUMBER, &info->low); - trace_xfs_fsmap_high_key(mp, info->dev, NULLAGNUMBER, &info->high); - - if (keys[0].fmr_physical > 0) - return 0; - - /* Fabricate an rmap entry for the external log device. 
*/ - rmap.rm_startblock = 0; - rmap.rm_blockcount = mp->m_sb.sb_logblocks; - rmap.rm_owner = XFS_RMAP_OWN_LOG; - rmap.rm_offset = 0; - rmap.rm_flags = 0; - - return xfs_getfsmap_helper(tp, info, &rmap, 0); -} - -#ifdef CONFIG_XFS_RT -/* Transform a rtbitmap "record" into a fsmap */ -STATIC int -xfs_getfsmap_rtdev_rtbitmap_helper( - struct xfs_mount *mp, - struct xfs_trans *tp, - const struct xfs_rtalloc_rec *rec, - void *priv) -{ - struct xfs_getfsmap_info *info = priv; - struct xfs_rmap_irec irec; - xfs_daddr_t rec_daddr; - - irec.rm_startblock = xfs_rtx_to_rtb(mp, rec->ar_startext); - rec_daddr = XFS_FSB_TO_BB(mp, irec.rm_startblock); - irec.rm_blockcount = xfs_rtx_to_rtb(mp, rec->ar_extcount); - irec.rm_owner = XFS_RMAP_OWN_NULL; /* "free" */ - irec.rm_offset = 0; - irec.rm_flags = 0; - - return xfs_getfsmap_helper(tp, info, &irec, rec_daddr); -} - -/* Execute a getfsmap query against the realtime device. */ -STATIC int -__xfs_getfsmap_rtdev( - struct xfs_trans *tp, - const struct xfs_fsmap *keys, - int (*query_fn)(struct xfs_trans *, - struct xfs_getfsmap_info *), - struct xfs_getfsmap_info *info) -{ - struct xfs_mount *mp = tp->t_mountp; - xfs_fsblock_t start_fsb; - xfs_fsblock_t end_fsb; - uint64_t eofs; - int error = 0; - - eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_rblocks); - if (keys[0].fmr_physical >= eofs) - return 0; - start_fsb = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); - end_fsb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical)); - - /* Set up search keys */ - info->low.rm_startblock = start_fsb; - error = xfs_fsmap_owner_to_rmap(&info->low, &keys[0]); - if (error) - return error; - info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset); - info->low.rm_blockcount = 0; - xfs_getfsmap_set_irec_flags(&info->low, &keys[0]); - - info->high.rm_startblock = end_fsb; - error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]); - if (error) - return error; - info->high.rm_offset = XFS_BB_TO_FSBT(mp, keys[1].fmr_offset); - info->high.rm_blockcount = 0; - 
xfs_getfsmap_set_irec_flags(&info->high, &keys[1]); - - trace_xfs_fsmap_low_key(mp, info->dev, NULLAGNUMBER, &info->low); - trace_xfs_fsmap_high_key(mp, info->dev, NULLAGNUMBER, &info->high); - - return query_fn(tp, info); -} - -/* Actually query the realtime bitmap. */ -STATIC int -xfs_getfsmap_rtdev_rtbitmap_query( - struct xfs_trans *tp, - struct xfs_getfsmap_info *info) -{ - struct xfs_rtalloc_rec alow = { 0 }; - struct xfs_rtalloc_rec ahigh = { 0 }; - struct xfs_mount *mp = tp->t_mountp; - unsigned int mod; - int error; - - xfs_rtbitmap_lock_shared(mp, XFS_RBMLOCK_BITMAP); - - /* - * Set up query parameters to return free rtextents covering the range - * we want. - */ - alow.ar_startext = xfs_rtb_to_rtxt(mp, info->low.rm_startblock); - ahigh.ar_startext = xfs_rtb_to_rtx(mp, info->high.rm_startblock, &mod); - if (mod) - ahigh.ar_startext++; - error = xfs_rtalloc_query_range(mp, tp, &alow, &ahigh, - xfs_getfsmap_rtdev_rtbitmap_helper, info); - if (error) - goto err; - - /* - * Report any gaps at the end of the rtbitmap by simulating a null - * rmap starting at the block after the end of the query range. - */ - info->last = true; - ahigh.ar_startext = min(mp->m_sb.sb_rextents, ahigh.ar_startext); - - error = xfs_getfsmap_rtdev_rtbitmap_helper(mp, tp, &ahigh, info); - if (error) - goto err; -err: - xfs_rtbitmap_unlock_shared(mp, XFS_RBMLOCK_BITMAP); - return error; -} - -/* Execute a getfsmap query against the realtime device rtbitmap. */ -STATIC int -xfs_getfsmap_rtdev_rtbitmap( - struct xfs_trans *tp, - const struct xfs_fsmap *keys, - struct xfs_getfsmap_info *info) -{ - info->missing_owner = XFS_FMR_OWN_UNKNOWN; - return __xfs_getfsmap_rtdev(tp, keys, xfs_getfsmap_rtdev_rtbitmap_query, - info); -} -#endif /* CONFIG_XFS_RT */ - /* Execute a getfsmap query against the regular data device. 
*/ STATIC int __xfs_getfsmap_datadev( @@ -766,6 +598,174 @@ xfs_getfsmap_datadev_bnobt( xfs_getfsmap_datadev_bnobt_query, &akeys[0]); } +/* Execute a getfsmap query against the log device. */ +STATIC int +xfs_getfsmap_logdev( + struct xfs_trans *tp, + const struct xfs_fsmap *keys, + struct xfs_getfsmap_info *info) +{ + struct xfs_mount *mp = tp->t_mountp; + struct xfs_rmap_irec rmap; + int error; + + /* Set up search keys */ + info->low.rm_startblock = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); + info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset); + error = xfs_fsmap_owner_to_rmap(&info->low, keys); + if (error) + return error; + info->low.rm_blockcount = 0; + xfs_getfsmap_set_irec_flags(&info->low, &keys[0]); + + error = xfs_fsmap_owner_to_rmap(&info->high, keys + 1); + if (error) + return error; + info->high.rm_startblock = -1U; + info->high.rm_owner = ULLONG_MAX; + info->high.rm_offset = ULLONG_MAX; + info->high.rm_blockcount = 0; + info->high.rm_flags = XFS_RMAP_KEY_FLAGS | XFS_RMAP_REC_FLAGS; + info->missing_owner = XFS_FMR_OWN_FREE; + + trace_xfs_fsmap_low_key(mp, info->dev, NULLAGNUMBER, &info->low); + trace_xfs_fsmap_high_key(mp, info->dev, NULLAGNUMBER, &info->high); + + if (keys[0].fmr_physical > 0) + return 0; + + /* Fabricate an rmap entry for the external log device. 
*/ + rmap.rm_startblock = 0; + rmap.rm_blockcount = mp->m_sb.sb_logblocks; + rmap.rm_owner = XFS_RMAP_OWN_LOG; + rmap.rm_offset = 0; + rmap.rm_flags = 0; + + return xfs_getfsmap_helper(tp, info, &rmap, 0); +} + +#ifdef CONFIG_XFS_RT +/* Transform a rtbitmap "record" into a fsmap */ +STATIC int +xfs_getfsmap_rtdev_rtbitmap_helper( + struct xfs_mount *mp, + struct xfs_trans *tp, + const struct xfs_rtalloc_rec *rec, + void *priv) +{ + struct xfs_getfsmap_info *info = priv; + struct xfs_rmap_irec irec; + xfs_daddr_t rec_daddr; + + irec.rm_startblock = xfs_rtx_to_rtb(mp, rec->ar_startext); + rec_daddr = XFS_FSB_TO_BB(mp, irec.rm_startblock); + irec.rm_blockcount = xfs_rtx_to_rtb(mp, rec->ar_extcount); + irec.rm_owner = XFS_RMAP_OWN_NULL; /* "free" */ + irec.rm_offset = 0; + irec.rm_flags = 0; + + return xfs_getfsmap_helper(tp, info, &irec, rec_daddr); +} + +/* Execute a getfsmap query against the realtime device. */ +STATIC int +__xfs_getfsmap_rtdev( + struct xfs_trans *tp, + const struct xfs_fsmap *keys, + int (*query_fn)(struct xfs_trans *, + struct xfs_getfsmap_info *), + struct xfs_getfsmap_info *info) +{ + struct xfs_mount *mp = tp->t_mountp; + xfs_fsblock_t start_fsb; + xfs_fsblock_t end_fsb; + uint64_t eofs; + int error = 0; + + eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_rblocks); + if (keys[0].fmr_physical >= eofs) + return 0; + start_fsb = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); + end_fsb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical)); + + /* Set up search keys */ + info->low.rm_startblock = start_fsb; + error = xfs_fsmap_owner_to_rmap(&info->low, &keys[0]); + if (error) + return error; + info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset); + info->low.rm_blockcount = 0; + xfs_getfsmap_set_irec_flags(&info->low, &keys[0]); + + info->high.rm_startblock = end_fsb; + error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]); + if (error) + return error; + info->high.rm_offset = XFS_BB_TO_FSBT(mp, keys[1].fmr_offset); + info->high.rm_blockcount = 0; + 
xfs_getfsmap_set_irec_flags(&info->high, &keys[1]); + + trace_xfs_fsmap_low_key(mp, info->dev, NULLAGNUMBER, &info->low); + trace_xfs_fsmap_high_key(mp, info->dev, NULLAGNUMBER, &info->high); + + return query_fn(tp, info); +} + +/* Actually query the realtime bitmap. */ +STATIC int +xfs_getfsmap_rtdev_rtbitmap_query( + struct xfs_trans *tp, + struct xfs_getfsmap_info *info) +{ + struct xfs_rtalloc_rec alow = { 0 }; + struct xfs_rtalloc_rec ahigh = { 0 }; + struct xfs_mount *mp = tp->t_mountp; + unsigned int mod; + int error; + + xfs_rtbitmap_lock_shared(mp, XFS_RBMLOCK_BITMAP); + + /* + * Set up query parameters to return free rtextents covering the range + * we want. + */ + alow.ar_startext = xfs_rtb_to_rtxt(mp, info->low.rm_startblock); + ahigh.ar_startext = xfs_rtb_to_rtx(mp, info->high.rm_startblock, &mod); + if (mod) + ahigh.ar_startext++; + error = xfs_rtalloc_query_range(mp, tp, &alow, &ahigh, + xfs_getfsmap_rtdev_rtbitmap_helper, info); + if (error) + goto err; + + /* + * Report any gaps at the end of the rtbitmap by simulating a null + * rmap starting at the block after the end of the query range. + */ + info->last = true; + ahigh.ar_startext = min(mp->m_sb.sb_rextents, ahigh.ar_startext); + + error = xfs_getfsmap_rtdev_rtbitmap_helper(mp, tp, &ahigh, info); + if (error) + goto err; +err: + xfs_rtbitmap_unlock_shared(mp, XFS_RBMLOCK_BITMAP); + return error; +} + +/* Execute a getfsmap query against the realtime device rtbitmap. */ +STATIC int +xfs_getfsmap_rtdev_rtbitmap( + struct xfs_trans *tp, + const struct xfs_fsmap *keys, + struct xfs_getfsmap_info *info) +{ + info->missing_owner = XFS_FMR_OWN_UNKNOWN; + return __xfs_getfsmap_rtdev(tp, keys, xfs_getfsmap_rtdev_rtbitmap_query, + info); +} +#endif /* CONFIG_XFS_RT */ + /* Do we recognize the device? 
*/ STATIC bool xfs_getfsmap_is_valid_device( From patchwork Fri Dec 30 22:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085484 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52641C4332F for ; Sat, 31 Dec 2022 01:42:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235936AbiLaBmC (ORCPT ); Fri, 30 Dec 2022 20:42:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236137AbiLaBlw (ORCPT ); Fri, 30 Dec 2022 20:41:52 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E82AE0FD for ; Fri, 30 Dec 2022 17:41:51 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BC3BCB81DE0 for ; Sat, 31 Dec 2022 01:41:49 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 657BEC433EF; Sat, 31 Dec 2022 01:41:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672450908; bh=USNvMBtjPTUllBRHkJoJbXQR4bl90nrrwl2FLoTIG8M=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=h6y6YclHr5cmtpSpvZHXj77Su13aNBOhRWaRV8hbQuT2GKH5aFDdaF9AJ6AmCJ424 D1cqxhBNbfQygzG/gvQv0Qfdc2unGJF3PcWTqY3HzeetsM+ngtDK6eNSK5TBqV0e5G Gs2ob6qK1aTKZ86saR1iud6VMbc0zfNLgqgmjGNzjtEfiB6C9qq7t7tB91mTePQGMc c/Op7TnZkrIjSdn+6V1Sd5KyYhuqQ8kD9qMgz6Tr2hRYZpBqNKkGn+Zap96RRloGPz dKsykf+hLHzSxN1MVsVl+WzlI6TjJB0iPmRZY7veHNkPrP/SS3TD3m+59S5apgkYFn jInSNCb1/SVIg== Subject: 
[PATCH 19/38] xfs: wire up getfsmap to the realtime reverse mapping btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:18 -0800 Message-ID: <167243869870.715303.1767921115178790776.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Connect the getfsmap ioctl to the realtime rmapbt. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_fsmap.c | 261 ++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 212 insertions(+), 49 deletions(-) diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index dfd9e39ded6e..e330a7e55d1d 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -25,6 +25,8 @@ #include "xfs_alloc_btree.h" #include "xfs_rtbitmap.h" #include "xfs_ag.h" +#include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" /* Convert an xfs_fsmap to an fsmap. */ static void @@ -158,6 +160,7 @@ struct xfs_getfsmap_info { struct xfs_fsmap_head *head; struct fsmap *fsmap_recs; /* mapping records */ struct xfs_buf *agf_bp; /* AGF, for refcount queries */ + struct xfs_rtgroup *rtg; /* rt group info, if needed */ struct xfs_perag *pag; /* AG info, if applicable */ xfs_daddr_t next_daddr; /* next daddr we expect */ u64 missing_owner; /* owner of holes */ @@ -311,8 +314,14 @@ xfs_getfsmap_helper( if (info->head->fmh_entries >= info->head->fmh_count) return -ECANCELED; - trace_xfs_fsmap_mapping(mp, info->dev, - info->pag ? 
info->pag->pag_agno : NULLAGNUMBER, rec); + if (info->pag) + trace_xfs_fsmap_mapping(mp, info->dev, info->pag->pag_agno, + rec); + else if (info->rtg) + trace_xfs_fsmap_mapping(mp, info->dev, info->rtg->rtg_rgno, + rec); + else + trace_xfs_fsmap_mapping(mp, info->dev, NULLAGNUMBER, rec); fmr.fmr_device = info->dev; fmr.fmr_physical = rec_daddr; @@ -667,50 +676,6 @@ xfs_getfsmap_rtdev_rtbitmap_helper( return xfs_getfsmap_helper(tp, info, &irec, rec_daddr); } -/* Execute a getfsmap query against the realtime device. */ -STATIC int -__xfs_getfsmap_rtdev( - struct xfs_trans *tp, - const struct xfs_fsmap *keys, - int (*query_fn)(struct xfs_trans *, - struct xfs_getfsmap_info *), - struct xfs_getfsmap_info *info) -{ - struct xfs_mount *mp = tp->t_mountp; - xfs_fsblock_t start_fsb; - xfs_fsblock_t end_fsb; - uint64_t eofs; - int error = 0; - - eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_rblocks); - if (keys[0].fmr_physical >= eofs) - return 0; - start_fsb = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); - end_fsb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical)); - - /* Set up search keys */ - info->low.rm_startblock = start_fsb; - error = xfs_fsmap_owner_to_rmap(&info->low, &keys[0]); - if (error) - return error; - info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset); - info->low.rm_blockcount = 0; - xfs_getfsmap_set_irec_flags(&info->low, &keys[0]); - - info->high.rm_startblock = end_fsb; - error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]); - if (error) - return error; - info->high.rm_offset = XFS_BB_TO_FSBT(mp, keys[1].fmr_offset); - info->high.rm_blockcount = 0; - xfs_getfsmap_set_irec_flags(&info->high, &keys[1]); - - trace_xfs_fsmap_low_key(mp, info->dev, NULLAGNUMBER, &info->low); - trace_xfs_fsmap_high_key(mp, info->dev, NULLAGNUMBER, &info->high); - - return query_fn(tp, info); -} - /* Actually query the realtime bitmap. 
*/ STATIC int xfs_getfsmap_rtdev_rtbitmap_query( @@ -760,9 +725,203 @@ xfs_getfsmap_rtdev_rtbitmap( const struct xfs_fsmap *keys, struct xfs_getfsmap_info *info) { + struct xfs_mount *mp = tp->t_mountp; + xfs_fsblock_t start_fsb; + xfs_fsblock_t end_fsb; + uint64_t eofs; + int error = 0; + + eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_rblocks); + if (keys[0].fmr_physical >= eofs) + return 0; + start_fsb = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); + end_fsb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical)); + info->missing_owner = XFS_FMR_OWN_UNKNOWN; - return __xfs_getfsmap_rtdev(tp, keys, xfs_getfsmap_rtdev_rtbitmap_query, - info); + + /* Set up search keys */ + info->low.rm_startblock = start_fsb; + error = xfs_fsmap_owner_to_rmap(&info->low, &keys[0]); + if (error) + return error; + info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset); + info->low.rm_blockcount = 0; + xfs_getfsmap_set_irec_flags(&info->low, &keys[0]); + + info->high.rm_startblock = end_fsb; + error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]); + if (error) + return error; + info->high.rm_offset = XFS_BB_TO_FSBT(mp, keys[1].fmr_offset); + info->high.rm_blockcount = 0; + xfs_getfsmap_set_irec_flags(&info->high, &keys[1]); + + trace_xfs_fsmap_low_key(mp, info->dev, NULLAGNUMBER, &info->low); + trace_xfs_fsmap_high_key(mp, info->dev, NULLAGNUMBER, &info->high); + + return xfs_getfsmap_rtdev_rtbitmap_query(tp, info); +} + +/* Transform an absolute-startblock rmap (rtdev, logdev) into a fsmap */ +STATIC int +xfs_getfsmap_rtdev_helper( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xfs_mount *mp = cur->bc_mp; + struct xfs_getfsmap_info *info = priv; + xfs_rtblock_t rtbno; + xfs_daddr_t rec_daddr; + + rtbno = xfs_rgbno_to_rtb(mp, cur->bc_ino.rtg->rtg_rgno, + rec->rm_startblock); + rec_daddr = xfs_rtb_to_daddr(mp, rtbno); + + return xfs_getfsmap_helper(cur->bc_tp, info, rec, rec_daddr); +} + +/* Actually query the rtrmap btree.
*/ +STATIC int +xfs_getfsmap_rtdev_rmapbt_query( + struct xfs_trans *tp, + struct xfs_getfsmap_info *info, + struct xfs_btree_cur **curpp) +{ + struct xfs_mount *mp = tp->t_mountp; + + /* Report any gap at the end of the last rtgroup. */ + if (info->last) + return xfs_getfsmap_rtdev_helper(*curpp, &info->high, info); + + /* Query the rtrmapbt */ + xfs_rtgroup_lock(NULL, info->rtg, XFS_RTGLOCK_RMAP); + *curpp = xfs_rtrmapbt_init_cursor(mp, tp, info->rtg, + info->rtg->rtg_rmapip); + return xfs_rmap_query_range(*curpp, &info->low, &info->high, + xfs_getfsmap_rtdev_helper, info); +} + +/* Execute a getfsmap query against the realtime device rmapbt. */ +STATIC int +xfs_getfsmap_rtdev_rmapbt( + struct xfs_trans *tp, + const struct xfs_fsmap *keys, + struct xfs_getfsmap_info *info) +{ + struct xfs_mount *mp = tp->t_mountp; + struct xfs_rtgroup *rtg; + struct xfs_btree_cur *bt_cur = NULL; + xfs_fsblock_t start_fsb; + xfs_fsblock_t end_fsb; + xfs_rgnumber_t start_rg, end_rg; + uint64_t eofs; + int error = 0; + + eofs = XFS_FSB_TO_BB(mp, xfs_rtx_to_rtb(mp, mp->m_sb.sb_rextents)); + if (keys[0].fmr_physical >= eofs) + return 0; + start_fsb = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); + end_fsb = XFS_BB_TO_FSB(mp, min(eofs - 1, keys[1].fmr_physical)); + + info->missing_owner = XFS_FMR_OWN_FREE; + + /* + * Convert the fsmap low/high keys to rtgroup based keys. Initialize + * low to the fsmap low key and max out the high key to the end + * of the rtgroup. 
+ */ + info->low.rm_startblock = xfs_rtb_to_rgbno(mp, start_fsb, &start_rg); + info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset); + error = xfs_fsmap_owner_to_rmap(&info->low, &keys[0]); + if (error) + return error; + info->low.rm_blockcount = 0; + xfs_getfsmap_set_irec_flags(&info->low, &keys[0]); + + info->high.rm_startblock = -1U; + info->high.rm_owner = ULLONG_MAX; + info->high.rm_offset = ULLONG_MAX; + info->high.rm_blockcount = 0; + info->high.rm_flags = XFS_RMAP_KEY_FLAGS | XFS_RMAP_REC_FLAGS; + + end_rg = xfs_rtb_to_rgno(mp, end_fsb); + + for_each_rtgroup_range(mp, start_rg, end_rg, rtg) { + /* + * Set the rtgroup high key from the fsmap high key if this + * is the last rtgroup that we're querying. + */ + info->rtg = rtg; + if (rtg->rtg_rgno == end_rg) { + xfs_rgnumber_t junk; + + info->high.rm_startblock = xfs_rtb_to_rgbno(mp, + end_fsb, &junk); + info->high.rm_offset = XFS_BB_TO_FSBT(mp, + keys[1].fmr_offset); + error = xfs_fsmap_owner_to_rmap(&info->high, &keys[1]); + if (error) + break; + xfs_getfsmap_set_irec_flags(&info->high, &keys[1]); + } + + if (bt_cur) { + xfs_rtgroup_unlock(bt_cur->bc_ino.rtg, + XFS_RTGLOCK_RMAP); + xfs_btree_del_cursor(bt_cur, XFS_BTREE_NOERROR); + bt_cur = NULL; + } + + trace_xfs_fsmap_low_key(mp, info->dev, rtg->rtg_rgno, + &info->low); + trace_xfs_fsmap_high_key(mp, info->dev, rtg->rtg_rgno, + &info->high); + + error = xfs_getfsmap_rtdev_rmapbt_query(tp, info, &bt_cur); + if (error) + break; + + /* + * Set the rtgroup low key to the start of the rtgroup prior to + * moving on to the next rtgroup. + */ + if (rtg->rtg_rgno == start_rg) { + info->low.rm_startblock = 0; + info->low.rm_owner = 0; + info->low.rm_offset = 0; + info->low.rm_flags = 0; + } + + /* + * If this is the last rtgroup, report any gap at the end of it + * before we drop the reference to the rtgroup when the loop + * terminates.
+ */ + if (rtg->rtg_rgno == end_rg) { + info->last = true; + error = xfs_getfsmap_rtdev_rmapbt_query(tp, info, + &bt_cur); + if (error) + break; + } + info->rtg = NULL; + } + + if (bt_cur) { + xfs_rtgroup_unlock(bt_cur->bc_ino.rtg, XFS_RTGLOCK_RMAP); + xfs_btree_del_cursor(bt_cur, error < 0 ? XFS_BTREE_ERROR : + XFS_BTREE_NOERROR); + } + if (info->rtg) { + xfs_rtgroup_put(info->rtg); + info->rtg = NULL; + } else if (rtg) { + /* loop termination case */ + xfs_rtgroup_put(rtg); + } + + return error; } #endif /* CONFIG_XFS_RT */ @@ -881,7 +1040,10 @@ xfs_getfsmap( #ifdef CONFIG_XFS_RT if (mp->m_rtdev_targp) { handlers[2].dev = new_encode_dev(mp->m_rtdev_targp->bt_dev); - handlers[2].fn = xfs_getfsmap_rtdev_rtbitmap; + if (use_rmap) + handlers[2].fn = xfs_getfsmap_rtdev_rmapbt; + else + handlers[2].fn = xfs_getfsmap_rtdev_rtbitmap; } #endif /* CONFIG_XFS_RT */ @@ -959,6 +1121,7 @@ xfs_getfsmap( info.dev = handlers[i].dev; info.last = false; info.pag = NULL; + info.rtg = NULL; error = handlers[i].fn(tp, dkeys, &info); if (error) break; From patchwork Fri Dec 30 22:18:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085485 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EFCF7C4332F for ; Sat, 31 Dec 2022 01:42:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236006AbiLaBmH (ORCPT ); Fri, 30 Dec 2022 20:42:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236092AbiLaBmG (ORCPT ); Fri, 30 Dec 2022 20:42:06 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 218AD13DEA for ; Fri, 30 Dec 2022 17:42:05 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B297361C3A for ; Sat, 31 Dec 2022 01:42:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1915FC433EF; Sat, 31 Dec 2022 01:42:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672450924; bh=0rk92Ad0iwfxQ/oamTCC1FFnXI7tvagXjJ0kvhmUgmk=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=B+3tgr6bMZ50vi4UtHpcfP7jqoYjpslNCqlEnJHO+/gIdcF/S8EbXIZpkwcBzBotN V8DAfpD50XsT2OydukdLoDpTbqKlcypxNtORp2FqyVFh8Ygnu1SM5C9RBJYSxmXFFH R3lNCJJxZbv2JaqHMtyO5JKiouu1GzOww4pE5cP7TzsArssaf2OA46ZHdKPeDYybG6 o9JHkSWu2FW+SlwTVcgV5Knv23sElAc2mp3JaM2hjSYrLyZxlsrGpnEv8IYV2nYHCH VbQbwXtu7Dy6+LPn3VETWDYqtrKO8X1wk68sulHP76URN2obDz/DWXMlI5BoJyZml2 jJJ0lzU+1cmAQ== Subject: [PATCH 20/38] xfs: fix integer overflows in the fsmap rtbitmap backend From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:18 -0800 Message-ID: <167243869884.715303.13207234516175118417.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_fsmap.c | 54 +++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 40 insertions(+), 14 deletions(-) diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index e330a7e55d1d..b5e7ae77cab9 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -163,6 +163,8 @@ struct xfs_getfsmap_info { struct xfs_rtgroup *rtg; /* rt group info, if needed */ struct xfs_perag *pag; /* AG info, if applicable */ xfs_daddr_t next_daddr; /* next daddr we expect */ + /* daddr of low fsmap key when we're using the rtbitmap */ + xfs_daddr_t low_daddr; u64 missing_owner; /* owner of holes */ u32 dev; /* device id */ struct xfs_rmap_irec low; /* low rmap key */ @@ -240,16 +242,29 @@ xfs_getfsmap_format( xfs_fsmap_from_internal(rec, xfm); } +static inline bool +xfs_getfsmap_rec_before_start( + struct xfs_getfsmap_info *info, + const struct xfs_rmap_irec *rec, + xfs_daddr_t rec_daddr) +{ + if (info->low_daddr != -1ULL) + return rec_daddr < info->low_daddr; + return xfs_rmap_compare(rec, &info->low) < 0; +} + /* * Format a reverse mapping for getfsmap, having translated rm_startblock - * into the appropriate daddr units. + * into the appropriate daddr units. Pass in a nonzero @len_daddr if the + * length could be larger than rm_blockcount in struct xfs_rmap_irec. 
*/ STATIC int xfs_getfsmap_helper( struct xfs_trans *tp, struct xfs_getfsmap_info *info, const struct xfs_rmap_irec *rec, - xfs_daddr_t rec_daddr) + xfs_daddr_t rec_daddr, + xfs_daddr_t len_daddr) { struct xfs_fsmap fmr; struct xfs_mount *mp = tp->t_mountp; @@ -259,12 +274,15 @@ xfs_getfsmap_helper( if (fatal_signal_pending(current)) return -EINTR; + if (len_daddr == 0) + len_daddr = XFS_FSB_TO_BB(mp, rec->rm_blockcount); + /* * Filter out records that start before our startpoint, if the * caller requested that. */ - if (xfs_rmap_compare(rec, &info->low) < 0) { - rec_daddr += XFS_FSB_TO_BB(mp, rec->rm_blockcount); + if (xfs_getfsmap_rec_before_start(info, rec, rec_daddr)) { + rec_daddr += len_daddr; if (info->next_daddr < rec_daddr) info->next_daddr = rec_daddr; return 0; @@ -283,7 +301,7 @@ xfs_getfsmap_helper( info->head->fmh_entries++; - rec_daddr += XFS_FSB_TO_BB(mp, rec->rm_blockcount); + rec_daddr += len_daddr; if (info->next_daddr < rec_daddr) info->next_daddr = rec_daddr; return 0; @@ -329,7 +347,7 @@ xfs_getfsmap_helper( if (error) return error; fmr.fmr_offset = XFS_FSB_TO_BB(mp, rec->rm_offset); - fmr.fmr_length = XFS_FSB_TO_BB(mp, rec->rm_blockcount); + fmr.fmr_length = len_daddr; if (rec->rm_flags & XFS_RMAP_UNWRITTEN) fmr.fmr_flags |= FMR_OF_PREALLOC; if (rec->rm_flags & XFS_RMAP_ATTR_FORK) @@ -346,7 +364,7 @@ xfs_getfsmap_helper( xfs_getfsmap_format(mp, &fmr, info); out: - rec_daddr += XFS_FSB_TO_BB(mp, rec->rm_blockcount); + rec_daddr += len_daddr; if (info->next_daddr < rec_daddr) info->next_daddr = rec_daddr; return 0; @@ -382,7 +400,7 @@ xfs_getfsmap_datadev_helper( fsb = XFS_AGB_TO_FSB(mp, cur->bc_ag.pag->pag_agno, rec->rm_startblock); rec_daddr = XFS_FSB_TO_DADDR(mp, fsb); - return xfs_getfsmap_helper(cur->bc_tp, info, rec, rec_daddr); + return xfs_getfsmap_helper(cur->bc_tp, info, rec, rec_daddr, 0); } /* Transform a bnobt irec into a fsmap */ @@ -406,7 +424,7 @@ xfs_getfsmap_datadev_bnobt_helper( irec.rm_offset = 0; irec.rm_flags = 0; - return 
xfs_getfsmap_helper(cur->bc_tp, info, &irec, rec_daddr); + return xfs_getfsmap_helper(cur->bc_tp, info, &irec, rec_daddr, 0); } /* Execute a getfsmap query against the regular data device. */ @@ -650,7 +668,7 @@ xfs_getfsmap_logdev( rmap.rm_offset = 0; rmap.rm_flags = 0; - return xfs_getfsmap_helper(tp, info, &rmap, 0); + return xfs_getfsmap_helper(tp, info, &rmap, 0, 0); } #ifdef CONFIG_XFS_RT @@ -664,16 +682,22 @@ xfs_getfsmap_rtdev_rtbitmap_helper( { struct xfs_getfsmap_info *info = priv; struct xfs_rmap_irec irec; - xfs_daddr_t rec_daddr; + xfs_rtblock_t rtbno; + xfs_daddr_t rec_daddr, len_daddr; + + rtbno = xfs_rtx_to_rtb(mp, rec->ar_startext); + rec_daddr = XFS_FSB_TO_BB(mp, rtbno); + + rtbno = xfs_rtx_to_rtb(mp, rec->ar_extcount); + len_daddr = XFS_FSB_TO_BB(mp, rtbno); irec.rm_startblock = xfs_rtx_to_rtb(mp, rec->ar_startext); - rec_daddr = XFS_FSB_TO_BB(mp, irec.rm_startblock); irec.rm_blockcount = xfs_rtx_to_rtb(mp, rec->ar_extcount); irec.rm_owner = XFS_RMAP_OWN_NULL; /* "free" */ irec.rm_offset = 0; irec.rm_flags = 0; - return xfs_getfsmap_helper(tp, info, &irec, rec_daddr); + return xfs_getfsmap_helper(tp, info, &irec, rec_daddr, len_daddr); } /* Actually query the realtime bitmap. */ @@ -741,6 +765,7 @@ xfs_getfsmap_rtdev_rtbitmap( /* Set up search keys */ info->low.rm_startblock = start_fsb; + info->low_daddr = XFS_FSB_TO_BB(mp, start_fsb); error = xfs_fsmap_owner_to_rmap(&info->low, &keys[0]); if (error) return error; @@ -778,7 +803,7 @@ xfs_getfsmap_rtdev_helper( rec->rm_startblock); rec_daddr = xfs_rtb_to_daddr(mp, rtbno); - return xfs_getfsmap_helper(cur->bc_tp, info, rec, rec_daddr); + return xfs_getfsmap_helper(cur->bc_tp, info, rec, rec_daddr, 0); } /* Actually query the rtrmap btree. 
*/ @@ -1122,6 +1147,7 @@ xfs_getfsmap( info.last = false; info.pag = NULL; info.rtg = NULL; + info.low_daddr = -1ULL; error = handlers[i].fn(tp, dkeys, &info); if (error) break; From patchwork Fri Dec 30 22:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085486 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12DBDC4332F for ; Sat, 31 Dec 2022 01:42:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236037AbiLaBmX (ORCPT ); Fri, 30 Dec 2022 20:42:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236017AbiLaBmV (ORCPT ); Fri, 30 Dec 2022 20:42:21 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BBC28E0FD for ; Fri, 30 Dec 2022 17:42:20 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5C4DE61CBD for ; Sat, 31 Dec 2022 01:42:20 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B74C7C433D2; Sat, 31 Dec 2022 01:42:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672450939; bh=sIAEMUoLustXH+uNyTZJelTNB+aN+ftsasWHCS5e8wg=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=GJVaw3rNIGpkqO8ty9xVB0I/V9uUTS0h0Gj55fA2G41bLV+4byAhU1uPWxBWVz67v 2wGpy16oXkfC+Dgx7bmLrjqB4EoURdBZfVv5KeQhsw/+hG0eA6dhYzvAKcO4583HvI VtKnzzpTfmqP6cnANa75/w6i/Jkv7biQeuDUr5KacMVruAHXzDChijXl8VPeTKxeWk 
84qx9Z/rUoyuOxECtnO4C/V9pU4p0hJjZ4bHGxMmlkaj+Ur2yPzvWg5HKWknELXa8O Pio0d4VSbPeVGnlQE7B7uRwu92eXGzc0tayIfd125+6NRC2tZYDd59rOXPdW+bF365 vqfMLNvml7HqA== Subject: [PATCH 21/38] xfs: fix getfsmap reporting past the last rt extent From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:19 -0800 Message-ID: <167243869899.715303.2892260611946678252.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong The realtime section ends at the last rt extent. If the user configures the rt geometry with an extent size that is not an integer factor of the number of rt blocks, it's possible for there to be rt blocks past the end of the last rt extent. These tail blocks cannot ever be allocated and will cause corruption reports if the last extent coincides with the end of an rt bitmap block, so do not report them in the GETFSMAP output. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_fsmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index b5e7ae77cab9..efbcc4b1d850 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -755,7 +755,7 @@ xfs_getfsmap_rtdev_rtbitmap( uint64_t eofs; int error = 0; - eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_rblocks); + eofs = XFS_FSB_TO_BB(mp, xfs_rtx_to_rtb(mp, mp->m_sb.sb_rextents)); if (keys[0].fmr_physical >= eofs) return 0; start_fsb = XFS_BB_TO_FSBT(mp, keys[0].fmr_physical); From patchwork Fri Dec 30 22:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J.
Wong" X-Patchwork-Id: 13085487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3411C4332F for ; Sat, 31 Dec 2022 01:42:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235900AbiLaBmi (ORCPT ); Fri, 30 Dec 2022 20:42:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236017AbiLaBmh (ORCPT ); Fri, 30 Dec 2022 20:42:37 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A249F026 for ; Fri, 30 Dec 2022 17:42:36 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id EB3EB61C3A for ; Sat, 31 Dec 2022 01:42:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 572A3C433EF; Sat, 31 Dec 2022 01:42:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672450955; bh=xAuTIVosJ1K+CNexj+btz6+sNpBwD4lR1S39fHNvpvw=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=YD6Ul9qZSm1Zc1eQFyidglIlNow5p5EwKHWdUwBkz02vtaL9+Gf6Pu/aw8X6sNjsm yZWUIczSMKuzl/e8WGCUH3jIf3W2g/0FLIl3w4aEnqFWga6XezBySzD9F0P2Wfu42t gw1RogJ9Ld9dqYoUtVY4WGC4Fdfv7N5y5jpt/brI0KGon6lXokQrQ/z1ned0NtGmIk 41TkNZhQ4djvOy4YVzkfZdZaZZPVxXdwyg8v+JI1lGjCYZKT/E0E/a8pld9iBDTQzb jN41DiPKbxeq8rUl66or9QhJF+SfQ9Fbc4jBxW1y292RwfHljTL+D+y7XnF0g3mdMT PwE7p6SU06HwQ== Subject: [PATCH 22/38] xfs: check that the rtrmapbt maxlevels doesn't increase when growing fs From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:19 -0800 Message-ID: <167243869913.715303.244818263733598492.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong The size of filesystem transaction reservations depends on the maximum height (maxlevels) of the realtime btrees. Since we don't want a grow operation to increase the reservation size enough that we'll fail the minimum log size checks on the next mount, constrain growfs operations if they would cause an increase in those maxlevels. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_fsops.c | 12 ++++++++++ fs/xfs/xfs_rtalloc.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++- fs/xfs/xfs_rtalloc.h | 6 +++++ fs/xfs/xfs_trace.h | 21 +++++++++++++++++ 4 files changed, 101 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c index 9770916acd69..65b44ad8884e 100644 --- a/fs/xfs/xfs_fsops.c +++ b/fs/xfs/xfs_fsops.c @@ -23,6 +23,7 @@ #include "xfs_trace.h" #include "xfs_rtgroup.h" #include "xfs_rtalloc.h" +#include "xfs_rtrmap_btree.h" /* * Write new AG headers to disk. Non-transactional, but need to be @@ -115,6 +116,13 @@ xfs_growfs_data_private( xfs_buf_relse(bp); } + /* Make sure the new fs size won't cause problems with the log. */ + error = xfs_growfs_check_rtgeom(mp, nb, mp->m_sb.sb_rblocks, + mp->m_sb.sb_rextsize, mp->m_sb.sb_rextents, + mp->m_sb.sb_rbmblocks, mp->m_sb.sb_rextslog); + if (error) + return error; + nb_div = nb; nb_mod = do_div(nb_div, mp->m_sb.sb_agblocks); nagcount = nb_div + (nb_mod != 0); @@ -214,7 +222,11 @@ xfs_growfs_data_private( error = xfs_fs_reserve_ag_blocks(mp); if (error == -ENOSPC) error = 0; + + /* Compute new maxlevels for rt btrees. 
*/ + xfs_rtrmapbt_compute_maxlevels(mp); } + return error; out_trans_cancel: diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index c3d27cb85c26..7b7e22b36d48 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1049,6 +1049,57 @@ xfs_growfs_rt_init_primary( return 0; } +/* + * Check that changes to the realtime geometry won't affect the minimum + * log size, which would cause the fs to become unusable. + */ +int +xfs_growfs_check_rtgeom( + const struct xfs_mount *mp, + xfs_rfsblock_t dblocks, + xfs_rfsblock_t rblocks, + xfs_agblock_t rextsize, + xfs_rtblock_t rextents, + xfs_extlen_t rbmblocks, + uint8_t rextslog) +{ + struct xfs_mount *fake_mp; + int min_logfsbs; + + fake_mp = kmem_alloc(sizeof(struct xfs_mount), KM_MAYFAIL); + if (!fake_mp) + return -ENOMEM; + + /* + * Create a dummy xfs_mount with the new rt geometry, and compute the + * new minimum log size. This ensures that the log is big enough to + * handle the larger transactions that we could start sending. + */ + memcpy(fake_mp, mp, sizeof(struct xfs_mount)); + + fake_mp->m_sb.sb_dblocks = dblocks; + fake_mp->m_sb.sb_rblocks = rblocks; + fake_mp->m_sb.sb_rextents = rextents; + fake_mp->m_sb.sb_rextsize = rextsize; + fake_mp->m_sb.sb_rbmblocks = rbmblocks; + fake_mp->m_sb.sb_rextslog = rextslog; + if (rblocks > 0) + fake_mp->m_features |= XFS_FEAT_REALTIME; + + xfs_rtrmapbt_compute_maxlevels(fake_mp); + + xfs_trans_resv_calc(fake_mp, M_RES(fake_mp)); + min_logfsbs = xfs_log_calc_minimum_size(fake_mp); + trace_xfs_growfs_check_rtgeom(mp, min_logfsbs); + + kmem_free(fake_mp); + + if (mp->m_sb.sb_logblocks < min_logfsbs) + return -ENOSPC; + + return 0; +} + /* * Grow the realtime area of the filesystem. */ @@ -1139,6 +1190,12 @@ xfs_growfs_rt( if (nrsumblocks > (mp->m_sb.sb_logblocks >> 1)) return -EINVAL; + /* Make sure the new fs size won't cause problems with the log. 
*/ + error = xfs_growfs_check_rtgeom(mp, mp->m_sb.sb_dblocks, nrblocks, + in->extsize, nrextents, nrbmblocks, nrextslog); + if (error) + return error; + /* Allocate the new rt group structures */ if (xfs_has_rtgroups(mp)) { /* @@ -1313,8 +1370,12 @@ xfs_growfs_rt( rtg->rtg_blockcount = xfs_rtgroup_block_count(mp, rtg->rtg_rgno); - /* Ensure the mount RT feature flag is now set. */ + /* + * Ensure the mount RT feature flag is now set, and compute new + * maxlevels for rt btrees. + */ mp->m_features |= XFS_FEAT_REALTIME; + xfs_rtrmapbt_compute_maxlevels(mp); } if (error) goto out_free; diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h index 873ebac239dd..35737a09cdb9 100644 --- a/fs/xfs/xfs_rtalloc.h +++ b/fs/xfs/xfs_rtalloc.h @@ -84,6 +84,11 @@ xfs_growfs_rt( int xfs_rtalloc_reinit_frextents(struct xfs_mount *mp); int xfs_rtfile_convert_unwritten(struct xfs_inode *ip, loff_t pos, uint64_t len); + +int xfs_growfs_check_rtgeom(const struct xfs_mount *mp, xfs_rfsblock_t dblocks, + xfs_rfsblock_t rblocks, xfs_agblock_t rextsize, + xfs_rtblock_t rextents, xfs_extlen_t rbmblocks, + uint8_t rextslog); #else # define xfs_rtallocate_extent(t,b,min,max,l,f,p,rb) (-ENOSYS) # define xfs_rtpick_extent(m,t,l,rb) (-ENOSYS) @@ -107,6 +112,7 @@ xfs_rtmount_init( # define xfs_rt_resv_free(mp) ((void)0) # define xfs_rt_resv_init(mp) (0) # define xfs_rtmount_dqattach(mp) (0) +# define xfs_growfs_check_rtgeom(mp, d, r, rs, rx, rb, rl) (0) #endif /* CONFIG_XFS_RT */ #endif /* __XFS_RTALLOC_H__ */ diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 77f4acc1b923..d90e9183dfc7 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -5196,6 +5196,27 @@ DEFINE_IMETA_RESV_EVENT(xfs_imeta_resv_free_extent); DEFINE_IMETA_RESV_EVENT(xfs_imeta_resv_critical); DEFINE_INODE_ERROR_EVENT(xfs_imeta_resv_init_error); +#ifdef CONFIG_XFS_RT +TRACE_EVENT(xfs_growfs_check_rtgeom, + TP_PROTO(const struct xfs_mount *mp, unsigned int min_logfsbs), + TP_ARGS(mp, min_logfsbs), + 
TP_STRUCT__entry( + __field(dev_t, dev) + __field(unsigned int, logblocks) + __field(unsigned int, min_logfsbs) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->logblocks = mp->m_sb.sb_logblocks; + __entry->min_logfsbs = min_logfsbs; + ), + TP_printk("dev %d:%d logblocks %u min_logfsbs %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->logblocks, + __entry->min_logfsbs) +); +#endif /* CONFIG_XFS_RT */ + #endif /* _TRACE_XFS_H */ #undef TRACE_INCLUDE_PATH From patchwork Fri Dec 30 22:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085488 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCDEDC4332F for ; Sat, 31 Dec 2022 01:42:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236017AbiLaBmz (ORCPT ); Fri, 30 Dec 2022 20:42:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48342 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236028AbiLaBmy (ORCPT ); Fri, 30 Dec 2022 20:42:54 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8791E13E96 for ; Fri, 30 Dec 2022 17:42:53 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3A0EAB81DD1 for ; Sat, 31 Dec 2022 01:42:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F00EBC433D2; Sat, 31 Dec 2022 01:42:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672450971; 
bh=xLECfKQGAwCkmJsBlQnvzeWitnN0Zg3Tzud6Rdb/gso=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=jsY/41xJkIBeBTNtFCpdKi2tpG8eGgjuZ0lYLdqRXGAYgsPetLDi37zz6CwwJUT8H OtouXNqHO8uCryjA/z+9Rgk3HA0/KBQSLsrulKzN4OY+OM0R2T0P7CBjOxWIvjbBa8 H2voy5iHHnbr2ftGQS9IGOJPc+d/6pUCr8scppaBZsC+DGvTssGteN8hzkHFE5WRwM RxLpdMKir5CvbRuBE405QB44LWxk2yIcWE38p7uOYVb5Rc3kOrSpW9hRssguDCi/i/ UxwP/6algs3frTvuTVKCUd55HI5boFTW4PWokToZv4BBoTsdnyJVTn4wxm+BJ/Wh5E DU1SKYxQ00msw== Subject: [PATCH 23/38] xfs: add realtime rmap btree when adding rt volume From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:19 -0800 Message-ID: <167243869928.715303.4697254359584699445.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong If we're adding enough space to the realtime section to require the creation of new realtime groups, create the rt rmap btree inode before we start adding the space. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_rtalloc.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 98 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 7b7e22b36d48..45c388ad4c1f 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -28,6 +28,8 @@ #include "xfs_rtgroup.h" #include "xfs_quota.h" #include "xfs_error.h" +#include "xfs_btree.h" +#include "xfs_rmap.h" #include "xfs_rtrmap_btree.h" /* @@ -1049,6 +1051,87 @@ xfs_growfs_rt_init_primary( return 0; } +/* Add a metadata inode for a realtime rmap btree. 
*/ +static int +xfs_growfsrt_create_rtrmap( + struct xfs_rtgroup *rtg) +{ + struct xfs_mount *mp = rtg->rtg_mount; + struct xfs_imeta_update upd; + struct xfs_rmap_irec rmap = { + .rm_startblock = 0, + .rm_blockcount = mp->m_sb.sb_rextsize, + .rm_owner = XFS_RMAP_OWN_FS, + .rm_offset = 0, + .rm_flags = 0, + }; + struct xfs_btree_cur *cur; + struct xfs_imeta_path *path; + struct xfs_trans *tp; + struct xfs_inode *ip = NULL; + int error; + + if (!xfs_has_rtrmapbt(mp) || rtg->rtg_rmapip) + return 0; + + error = xfs_rtrmapbt_create_path(mp, rtg->rtg_rgno, &path); + if (error) + return error; + + error = xfs_imeta_ensure_dirpath(mp, path); + if (error) + goto out_path; + + error = xfs_imeta_start_update(mp, path, &upd); + if (error) + goto out_path; + + error = xfs_qm_dqattach(upd.dp); + if (error) + goto out_upd; + + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_imeta_create, + xfs_imeta_create_space_res(mp), 0, 0, &tp); + if (error) + goto out_end; + + error = xfs_rtrmapbt_create(&tp, path, &upd, &ip); + if (error) + goto out_cancel; + + lockdep_set_class(&ip->i_lock.mr_lock, &xfs_rrmapip_key); + + cur = xfs_rtrmapbt_init_cursor(mp, tp, rtg, ip); + error = xfs_rmap_map_raw(cur, &rmap); + xfs_btree_del_cursor(cur, error); + if (error) + goto out_cancel; + + error = xfs_trans_commit(tp); + if (error) + goto out_end; + + xfs_imeta_end_update(mp, &upd, error); + xfs_imeta_free_path(path); + xfs_finish_inode_setup(ip); + rtg->rtg_rmapip = ip; + return 0; + +out_cancel: + xfs_trans_cancel(tp); +out_end: + /* Have to finish setting up the inode to ensure it's deleted. */ + if (ip) { + xfs_finish_inode_setup(ip); + xfs_irele(ip); + } +out_upd: + xfs_imeta_end_update(mp, &upd, error); +out_path: + xfs_imeta_free_path(path); + return error; +} + /* * Check that changes to the realtime geometry won't affect the minimum * log size, which would cause the fs to become unusable. @@ -1155,7 +1238,9 @@ xfs_growfs_rt( return -EINVAL; /* Unsupported realtime features. 
*/ - if (xfs_has_rmapbt(mp) || xfs_has_reflink(mp) || xfs_has_quota(mp)) + if (!xfs_has_rtgroups(mp) && xfs_has_rmapbt(mp)) + return -EOPNOTSUPP; + if (xfs_has_reflink(mp) || xfs_has_quota(mp)) return -EOPNOTSUPP; nrblocks = in->newblocks; @@ -1278,10 +1363,21 @@ xfs_growfs_rt( nsbp->sb_rbmblocks); nmp->m_rsumsize = nrsumsize = XFS_FSB_TO_B(mp, nrsumblocks); - if (xfs_has_rtgroups(mp)) + if (xfs_has_rtgroups(mp)) { + xfs_rgnumber_t rgno = last_rgno; + nsbp->sb_rgcount = howmany_64(nsbp->sb_rblocks, nsbp->sb_rgblocks); + for_each_rtgroup_range(mp, rgno, nsbp->sb_rgcount, rtg) { + error = xfs_growfsrt_create_rtrmap(rtg); + if (error) { + xfs_rtgroup_put(rtg); + break; + } + } + } + /* * Start a transaction, get the log reservation. */ From patchwork Fri Dec 30 22:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085489 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5437C4332F for ; Sat, 31 Dec 2022 01:43:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236027AbiLaBnJ (ORCPT ); Fri, 30 Dec 2022 20:43:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48376 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235913AbiLaBnI (ORCPT ); Fri, 30 Dec 2022 20:43:08 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8285C13F7A for ; Fri, 30 Dec 2022 17:43:07 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 
2141961C3A for ; Sat, 31 Dec 2022 01:43:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7D8A9C433D2; Sat, 31 Dec 2022 01:43:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672450986; bh=I2HMR3AKOFHmhJbfSOWeo8c8fkkCe1q0lPlfDNrSPvc=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=VQiaZ7ZAHnNjjkrdiOnOiON2fOmZ3UEC/Tgas0TrhVx32nHeJPJroZnUSPtvnWZXk +1lFE7oHlvoPvdv0ee9Jd9+FLawfSgGSXAb8f/vP0qRAb+9MckQ/CyKQOY4U9Kh1x9 46UXS11pqW6+Sq3W0V5SxTeGp2havETVKYv2HpSRe7DbdD4ZtrToJT2EXf65n4ZUdz tv6Ad0IIb4Q+tP9Hd1ifxhN5OdSB/wHFMHggSLKNAzVGP6LrKhGIWDyVTqWkDrN3fR rg9D5xhO2S1VZCq5dx3meqBZBU6U21RiAJp+Q+s2q6HULUYvvMU1Eo5i6C4ue6T1vJ VraHc61pW5+Aw== Subject: [PATCH 24/38] xfs: report realtime rmap btree corruption errors to the health system From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:19 -0800 Message-ID: <167243869942.715303.8274942737380162651.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Whenever we encounter corrupt realtime rmap btree blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: Darrick J. 
Wong --- fs/xfs/libxfs/xfs_fs.h | 1 + fs/xfs/libxfs/xfs_health.h | 4 +++- fs/xfs/libxfs/xfs_inode_fork.c | 4 +++- fs/xfs/libxfs/xfs_rtrmap_btree.c | 5 ++++- fs/xfs/xfs_health.c | 4 ++++ fs/xfs/xfs_rtalloc.c | 1 + 6 files changed, 16 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h index 7e9d7d7bb40b..5c557d5ff13e 100644 --- a/fs/xfs/libxfs/xfs_fs.h +++ b/fs/xfs/libxfs/xfs_fs.h @@ -313,6 +313,7 @@ struct xfs_rtgroup_geometry { }; #define XFS_RTGROUP_GEOM_SICK_SUPER (1 << 0) /* superblock */ #define XFS_RTGROUP_GEOM_SICK_BITMAP (1 << 1) /* rtbitmap for this group */ +#define XFS_RTGROUP_GEOM_SICK_RMAPBT (1 << 2) /* reverse mappings */ /* * Structures for XFS_IOC_FSGROWFSDATA, XFS_IOC_FSGROWFSLOG & XFS_IOC_FSGROWFSRT diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h index 44137c4983fc..d5976f6b0de1 100644 --- a/fs/xfs/libxfs/xfs_health.h +++ b/fs/xfs/libxfs/xfs_health.h @@ -67,6 +67,7 @@ struct xfs_rtgroup; #define XFS_SICK_RT_BITMAP (1 << 0) /* realtime bitmap */ #define XFS_SICK_RT_SUMMARY (1 << 1) /* realtime summary */ #define XFS_SICK_RT_SUPER (1 << 2) /* rt group superblock */ +#define XFS_SICK_RT_RMAPBT (1 << 3) /* reverse mappings */ /* Observable health issues for AG metadata. 
*/ #define XFS_SICK_AG_SB (1 << 0) /* superblock */ @@ -104,7 +105,8 @@ struct xfs_rtgroup; #define XFS_SICK_RT_PRIMARY (XFS_SICK_RT_BITMAP | \ XFS_SICK_RT_SUMMARY | \ - XFS_SICK_RT_SUPER) + XFS_SICK_RT_SUPER | \ + XFS_SICK_RT_RMAPBT) #define XFS_SICK_AG_PRIMARY (XFS_SICK_AG_SB | \ XFS_SICK_AG_AGF | \ diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index 94979bed8f32..61926c07aad3 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -261,8 +261,10 @@ xfs_iformat_data_fork( case XFS_DINODE_FMT_BTREE: return xfs_iformat_btree(ip, dip, XFS_DATA_FORK); case XFS_DINODE_FMT_RMAP: - if (!xfs_has_rtrmapbt(ip->i_mount)) + if (!xfs_has_rtrmapbt(ip->i_mount)) { + xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE); return -EFSCORRUPTED; + } return xfs_iformat_rtrmap(ip, dip); default: xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 9181fca2ba54..2d8130b4c187 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -28,6 +28,7 @@ #include "xfs_rtgroup.h" #include "xfs_bmap.h" #include "xfs_imeta.h" +#include "xfs_health.h" static struct kmem_cache *xfs_rtrmapbt_cur_cache; @@ -800,8 +801,10 @@ xfs_iformat_rtrmap( level = be16_to_cpu(dfp->bb_level); if (level > mp->m_rtrmap_maxlevels || - xfs_rtrmap_droot_space_calc(level, numrecs) > dsize) + xfs_rtrmap_droot_space_calc(level, numrecs) > dsize) { + xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE); return -EFSCORRUPTED; + } xfs_iroot_alloc(ip, XFS_DATA_FORK, xfs_rtrmap_broot_space_calc(mp, level, numrecs)); diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c index 33f332ee8044..80cc735b52d1 100644 --- a/fs/xfs/xfs_health.c +++ b/fs/xfs/xfs_health.c @@ -531,6 +531,7 @@ xfs_ag_geom_health( static const struct ioctl_sick_map rtgroup_map[] = { { XFS_SICK_RT_SUPER, XFS_RTGROUP_GEOM_SICK_SUPER }, { XFS_SICK_RT_BITMAP, XFS_RTGROUP_GEOM_SICK_BITMAP }, + { 
XFS_SICK_RT_RMAPBT, XFS_RTGROUP_GEOM_SICK_RMAPBT }, { 0, 0 }, }; @@ -630,6 +631,9 @@ xfs_btree_mark_sick( case XFS_BTNUM_BMAP: xfs_bmap_mark_sick(cur->bc_ino.ip, cur->bc_ino.whichfork); return; + case XFS_BTNUM_RTRMAP: + xfs_rtgroup_mark_sick(cur->bc_ino.rtg, XFS_SICK_RT_RMAPBT); + return; case XFS_BTNUM_BNO: mask = XFS_SICK_AG_BNOBT; break; diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 45c388ad4c1f..0f31680284fb 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -1806,6 +1806,7 @@ xfs_rtmount_rmapbt( goto out_path; if (XFS_IS_CORRUPT(mp, ip->i_df.if_format != XFS_DINODE_FMT_RMAP)) { + xfs_rtgroup_mark_sick(rtg, XFS_SICK_RT_RMAPBT); error = -EFSCORRUPTED; goto out_rele; } From patchwork Fri Dec 30 22:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085490 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1543C4332F for ; Sat, 31 Dec 2022 01:43:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235913AbiLaBn0 (ORCPT ); Fri, 30 Dec 2022 20:43:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236028AbiLaBnZ (ORCPT ); Fri, 30 Dec 2022 20:43:25 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE9A2F026 for ; Fri, 30 Dec 2022 17:43:24 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 73483B81A16 for ; Sat, 31 Dec 2022 
01:43:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1435DC433D2; Sat, 31 Dec 2022 01:43:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451002; bh=fisbBsrgK44ty80auP9A+MP7Ecn5UDAoFQ8EQIKvKLk=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=Iq6ViSstJo5HpCiNS311tVPp3tKcjt7aTkAkEXmz7q4Nvdnvx9xfNajJqh14KgjKM rgYkLhvunmO6Q2r0vZRwpH5YsYh2LQk37bYZPFlyMaFPa16H15GKTnM3m+qdJI0POi +kp/9c6xFAvfPtxyA8fMpqE/eoy62lrFdLNpAplRl+pQwCka44wGYh6QIOXyPwJ6Pr /pUVvT71SeSsLuLHyCs9Kx2pSuIZVGxr5l8hvYe0POACP6uIpEToiBsYsr5R6A6oKN w34u0CRCmMKXHddsP1VpqHqtaFKhxJbJpMwQYjvLapgbFWeu1xL718waIhxSKFH7Jr 7vRR0BYtjl/vg== Subject: [PATCH 25/38] xfs: fix scrub tracepoints when inode-rooted btrees are involved From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:19 -0800 Message-ID: <167243869956.715303.2051988056260451085.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Fix a minor mistake in the scrub tracepoints that can manifest when inode-rooted btrees are enabled. The existing code worked fine for bmap btrees, but we should tighten the code up to be less sloppy. Signed-off-by: Darrick J.
Wong --- fs/xfs/scrub/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index cf1635e00cb0..3ffee717062d 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -477,7 +477,7 @@ TRACE_EVENT(xchk_ifork_btree_op_error, TP_fast_assign( xfs_fsblock_t fsbno = xchk_btree_cur_fsbno(cur, level); __entry->dev = sc->mp->m_super->s_dev; - __entry->ino = sc->ip->i_ino; + __entry->ino = cur->bc_ino.ip->i_ino; __entry->whichfork = cur->bc_ino.whichfork; __entry->type = sc->sm->sm_type; __entry->btnum = cur->bc_btnum; From patchwork Fri Dec 30 22:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085491 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E5B6C4332F for ; Sat, 31 Dec 2022 01:43:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236040AbiLaBnm (ORCPT ); Fri, 30 Dec 2022 20:43:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236028AbiLaBnj (ORCPT ); Fri, 30 Dec 2022 20:43:39 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF6AD13DEA for ; Fri, 30 Dec 2022 17:43:38 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3FC8B61CBE for ; Sat, 31 Dec 2022 01:43:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9E527C433D2; Sat, 31 Dec 2022 01:43:37 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451017; bh=vg+H1cB3jEm+cDRDsCE7xkHQQrfyawhMFAcKRjb4Rc0=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=GJ59vbvoTgxhR/M4ZuA6fUxaw7j+ZBXI+AnLW6tqv7eHLN2sqeJqMGg2D/4tW1XLI 4ed3DaMqUcFmKyntOjNRdRv4BKUAMVTzNC5WzIkdkD3PjLlDpvDKcADb5/ydAwF0ws N2eVHFIucy8rvOY1unsvF/KwskV3yyy3VazrauiRqqjfHTIyXPPuY3ZziiLFAmbNV+ aJe45j/kNUmjbPUrwsEu3anQcvfsZ0RbqzdX5qGx7fRIdC1MX2GAcs42dtdV7ZRMsa YR59mpZrapdfKUSXMe0FQFSURTIpkDbkXoPxLYwOTqkhIsBfZ+JuFNHaguYD+g4wmw eFEy+yUWb8X1g== Subject: [PATCH 26/38] xfs: allow queued realtime intents to drain before scrubbing From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:19 -0800 Message-ID: <167243869971.715303.13003561824394979973.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong When a writer thread executes a chain of log intent items for the realtime volume, the ILOCKs taken during each step are for each rt metadata file, not the entire rt volume itself. Although scrub takes all rt metadata ILOCKs, this isn't sufficient to guard against scrub checking the rt volume while that writer thread is in the middle of finishing a chain because there's no higher level locking primitive guarding the realtime volume. When there's a collision, cross-referencing between data structures (e.g. rtrmapbt and rtrefcountbt) yields false corruption events; if repair is running, this results in incorrect repairs, which is catastrophic. Fix this by adding to the mount structure the same drain that we use to protect scrub against concurrent AG updates, but this time for the realtime volume. Signed-off-by: Darrick J. 
Wong --- fs/xfs/libxfs/xfs_rtgroup.c | 3 ++ fs/xfs/libxfs/xfs_rtgroup.h | 9 +++++ fs/xfs/scrub/common.c | 76 ++++++++++++++++++++++++++++++++++++++++--- fs/xfs/scrub/rtbitmap.c | 3 ++ fs/xfs/xfs_bmap_item.c | 5 ++- fs/xfs/xfs_drain.c | 41 +++++++++++++++++++++++ fs/xfs/xfs_drain.h | 19 +++++++++++ fs/xfs/xfs_extfree_item.c | 2 + fs/xfs/xfs_mount.h | 1 + fs/xfs/xfs_rmap_item.c | 2 + fs/xfs/xfs_trace.h | 32 ++++++++++++++++++ 11 files changed, 186 insertions(+), 7 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index d6b790741265..e40806c84256 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -132,6 +132,8 @@ xfs_initialize_rtgroups( #ifdef __KERNEL__ /* Place kernel structure only init below this point. */ spin_lock_init(&rtg->rtg_state_lock); + xfs_drain_init(&rtg->rtg_intents); + #endif /* __KERNEL__ */ /* first new rtg is fully initialized */ @@ -183,6 +185,7 @@ xfs_free_rtgroups( spin_unlock(&mp->m_rtgroup_lock); ASSERT(rtg); XFS_IS_CORRUPT(rtg->rtg_mount, atomic_read(&rtg->rtg_ref) != 0); + xfs_drain_free(&rtg->rtg_intents); call_rcu(&rtg->rcu_head, __xfs_free_rtgroups); } diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 3230dd03d8f8..1d41a2cac34f 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -37,6 +37,15 @@ struct xfs_rtgroup { #ifdef __KERNEL__ /* -- kernel only structures below this line -- */ spinlock_t rtg_state_lock; + + /* + * We use xfs_drain to track the number of deferred log intent items + * that have been queued (but not yet processed) so that waiters (e.g. + * scrub) will not lock resources when other threads are in the middle + * of processing a chain of intent items only to find momentary + * inconsistencies. 
+ */ + struct xfs_drain rtg_intents; #endif /* __KERNEL__ */ }; diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c index b63b5c016841..bb1d9ca20374 100644 --- a/fs/xfs/scrub/common.c +++ b/fs/xfs/scrub/common.c @@ -757,12 +757,78 @@ xchk_rt_unlock( } #ifdef CONFIG_XFS_RT +/* Lock all the rt group metadata inode ILOCKs and wait for intents. */ +static int +xchk_rtgroup_lock( + struct xfs_scrub *sc, + struct xchk_rt *sr, + unsigned int rtglock_flags) +{ + int error = 0; + + ASSERT(sr->rtg != NULL); + + /* + * If we're /only/ locking the rtbitmap in shared mode, then we're + * obviously not trying to compare records in two metadata inodes. + * There's no need to drain intents here because the caller (most + * likely the rgsuper scanner) doesn't need that level of consistency. + */ + if (rtglock_flags == XFS_RTGLOCK_BITMAP_SHARED) { + xfs_rtgroup_lock(NULL, sr->rtg, rtglock_flags); + sr->rtlock_flags = rtglock_flags; + return 0; + } + + do { + if (xchk_should_terminate(sc, &error)) + return error; + + xfs_rtgroup_lock(NULL, sr->rtg, rtglock_flags); + + /* + * Decide if the rt group is quiet enough for all metadata to + * be consistent with each other. Regular file IO doesn't get + * to lock all the rt inodes at the same time, which means that + * there could be other threads in the middle of processing a + * chain of deferred ops. + * + * We just locked all the metadata inodes for this rt group; + * now take a look to see if there are any intents in progress. + * If there are, drop the rt group inode locks and wait for the + * intents to drain. Since we hold the rt group inode locks + * for the duration of the scrub, this is the only time we have + * to sample the intents counter; any threads increasing it + * after this point can't possibly be in the middle of a chain + * of rt metadata updates. + * + * Obviously, this should be slanted against scrub and in favor + * of runtime threads. 
+ */ + if (!xfs_rtgroup_intents_busy(sr->rtg)) { + sr->rtlock_flags = rtglock_flags; + return 0; + } + + xfs_rtgroup_unlock(sr->rtg, rtglock_flags); + + if (!(sc->flags & XCHK_FSHOOKS_DRAIN)) + return -ECHRNG; + error = xfs_rtgroup_drain_intents(sr->rtg); + if (error == -ERESTARTSYS) + error = -EINTR; + } while (!error); + + return error; +} + /* * For scrubbing a realtime group, grab all the in-core resources we'll need to * check the metadata, which means taking the ILOCK of the realtime group's - * metadata inodes. Callers must not join these inodes to the transaction with - * non-zero lockflags or concurrency problems will result. The @rtglock_flags - * argument takes XFS_RTGLOCK_* flags. + * metadata inodes and draining any running intent chains. Callers must not + * join these inodes to the transaction with non-zero lockflags or concurrency + * problems will result. The @rtglock_flags argument takes XFS_RTGLOCK_* + * flags. */ int xchk_rtgroup_init( @@ -778,9 +844,7 @@ xchk_rtgroup_init( if (!sr->rtg) return -ENOENT; - xfs_rtgroup_lock(NULL, sr->rtg, rtglock_flags); - sr->rtlock_flags = rtglock_flags; - return 0; + return xchk_rtgroup_lock(sc, sr, rtglock_flags); } /* diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c index d847773e5f66..a034f2d392f5 100644 --- a/fs/xfs/scrub/rtbitmap.c +++ b/fs/xfs/scrub/rtbitmap.c @@ -26,6 +26,9 @@ xchk_setup_rgbitmap( { int error; + if (xchk_need_fshook_drain(sc)) + xchk_fshooks_enable(sc, XCHK_FSHOOKS_DRAIN); + error = xchk_trans_alloc(sc, 0); if (error) return error; diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index 04eeae9aef79..e2e7e5f678e9 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -369,6 +369,7 @@ xfs_bmap_update_get_group( rgno = xfs_rtb_to_rgno(mp, bi->bi_bmap.br_startblock); bi->bi_rtg = xfs_rtgroup_get(mp, rgno); + xfs_rtgroup_bump_intents(bi->bi_rtg); } else { bi->bi_rtg = NULL; } @@ -395,8 +396,10 @@ xfs_bmap_update_put_group( struct xfs_bmap_intent *bi) { 
if (xfs_ifork_is_realtime(bi->bi_owner, bi->bi_whichfork)) { - if (xfs_has_rtgroups(bi->bi_owner->i_mount)) + if (xfs_has_rtgroups(bi->bi_owner->i_mount)) { + xfs_rtgroup_drop_intents(bi->bi_rtg); xfs_rtgroup_put(bi->bi_rtg); + } return; } diff --git a/fs/xfs/xfs_drain.c b/fs/xfs/xfs_drain.c index 9b463e1183f6..4fda4cd096fa 100644 --- a/fs/xfs/xfs_drain.c +++ b/fs/xfs/xfs_drain.c @@ -11,6 +11,7 @@ #include "xfs_mount.h" #include "xfs_ag.h" #include "xfs_trace.h" +#include "xfs_rtgroup.h" /* * Use a static key here to reduce the overhead of xfs_drain_drop. If the @@ -119,3 +120,43 @@ xfs_perag_intents_busy( { return xfs_drain_busy(&pag->pag_intents); } + +#ifdef CONFIG_XFS_RT +/* Add an item to the pending count. */ +void +xfs_rtgroup_bump_intents( + struct xfs_rtgroup *rtg) +{ + trace_xfs_rtgroup_bump_intents(rtg, __return_address); + xfs_drain_bump(&rtg->rtg_intents); +} + +/* Remove an item from the pending count. */ +void +xfs_rtgroup_drop_intents( + struct xfs_rtgroup *rtg) +{ + trace_xfs_rtgroup_drop_intents(rtg, __return_address); + xfs_drain_drop(&rtg->rtg_intents); +} + +/* + * Wait for the pending intent count for realtime metadata to hit zero. + * Callers must not hold any rt metadata inode locks. + */ +int +xfs_rtgroup_drain_intents( + struct xfs_rtgroup *rtg) +{ + trace_xfs_rtgroup_wait_intents(rtg, __return_address); + return xfs_drain_wait(&rtg->rtg_intents); +} + +/* Might someone else be processing intents for this rt group? 
*/ +bool +xfs_rtgroup_intents_busy( + struct xfs_rtgroup *rtg) +{ + return xfs_drain_busy(&rtg->rtg_intents); +} +#endif /* CONFIG_XFS_RT */ diff --git a/fs/xfs/xfs_drain.h b/fs/xfs/xfs_drain.h index a980df6d3508..478ffab95b0f 100644 --- a/fs/xfs/xfs_drain.h +++ b/fs/xfs/xfs_drain.h @@ -7,6 +7,7 @@ #define XFS_DRAIN_H_ struct xfs_perag; +struct xfs_rtgroup; #ifdef CONFIG_XFS_DRAIN_INTENTS /* @@ -60,12 +61,27 @@ void xfs_drain_wait_enable(void); * All functions that create work items must increment the intent counter as * soon as the item is added to the transaction and cannot drop the counter * until the item is finished or cancelled. + * + * The same principles apply to realtime groups because the rt metadata inode + * ILOCKs are not held across transaction rolls. */ void xfs_perag_bump_intents(struct xfs_perag *pag); void xfs_perag_drop_intents(struct xfs_perag *pag); int xfs_perag_drain_intents(struct xfs_perag *pag); bool xfs_perag_intents_busy(struct xfs_perag *pag); + +#ifdef CONFIG_XFS_RT +void xfs_rtgroup_bump_intents(struct xfs_rtgroup *rtg); +void xfs_rtgroup_drop_intents(struct xfs_rtgroup *rtg); + +int xfs_rtgroup_drain_intents(struct xfs_rtgroup *rtg); +bool xfs_rtgroup_intents_busy(struct xfs_rtgroup *rtg); +#else +static inline void xfs_rtgroup_bump_intents(struct xfs_rtgroup *rtg) { } +static inline void xfs_rtgroup_drop_intents(struct xfs_rtgroup *rtg) { } +#endif /* CONFIG_XFS_RT */ + #else struct xfs_drain { /* empty */ }; @@ -75,6 +91,9 @@ struct xfs_drain { /* empty */ }; static inline void xfs_perag_bump_intents(struct xfs_perag *pag) { } static inline void xfs_perag_drop_intents(struct xfs_perag *pag) { } +static inline void xfs_rtgroup_bump_intents(struct xfs_rtgroup *rtg) { } +static inline void xfs_rtgroup_drop_intents(struct xfs_rtgroup *rtg) { } + #endif /* CONFIG_XFS_DRAIN_INTENTS */ #endif /* XFS_DRAIN_H_ */ diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c index 42b89c9e996b..e2e888bc1b1c 100644 --- 
a/fs/xfs/xfs_extfree_item.c +++ b/fs/xfs/xfs_extfree_item.c @@ -491,6 +491,7 @@ xfs_extent_free_get_group( rgno = xfs_rtb_to_rgno(mp, xefi->xefi_startblock); xefi->xefi_rtg = xfs_rtgroup_get(mp, rgno); + xfs_rtgroup_bump_intents(xefi->xefi_rtg); return; } @@ -505,6 +506,7 @@ xfs_extent_free_put_group( struct xfs_extent_free_item *xefi) { if (xfs_efi_is_realtime(xefi)) { + xfs_rtgroup_drop_intents(xefi->xefi_rtg); xfs_rtgroup_put(xefi->xefi_rtg); return; } diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index a565b1b1372a..b1ffab4cb9cd 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -13,6 +13,7 @@ struct xfs_ail; struct xfs_quotainfo; struct xfs_da_geometry; struct xfs_perag; +struct xfs_rtgroup; /* dynamic preallocation free space thresholds, 5% down to 1% */ enum { diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index a2949f818e0c..a95783622adb 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -400,6 +400,7 @@ xfs_rmap_update_get_group( rgno = xfs_rtb_to_rgno(mp, ri->ri_bmap.br_startblock); ri->ri_rtg = xfs_rtgroup_get(mp, rgno); + xfs_rtgroup_bump_intents(ri->ri_rtg); return; } @@ -414,6 +415,7 @@ xfs_rmap_update_put_group( struct xfs_rmap_intent *ri) { if (ri->ri_realtime) { + xfs_rtgroup_drop_intents(ri->ri_rtg); xfs_rtgroup_put(ri->ri_rtg); return; } diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index d90e9183dfc7..a6de7b6e4afd 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -4872,6 +4872,38 @@ DEFINE_PERAG_INTENTS_EVENT(xfs_perag_bump_intents); DEFINE_PERAG_INTENTS_EVENT(xfs_perag_drop_intents); DEFINE_PERAG_INTENTS_EVENT(xfs_perag_wait_intents); +#ifdef CONFIG_XFS_RT +DECLARE_EVENT_CLASS(xfs_rtgroup_intents_class, + TP_PROTO(struct xfs_rtgroup *rtg, void *caller_ip), + TP_ARGS(rtg, caller_ip), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(long, nr_intents) + __field(void *, caller_ip) + ), + TP_fast_assign( + __entry->dev = rtg->rtg_mount->m_super->s_dev; + 
__entry->rtdev = rtg->rtg_mount->m_rtdev_targp->bt_dev; + __entry->nr_intents = atomic_read(&rtg->rtg_intents.dr_count); + __entry->caller_ip = caller_ip; + ), + TP_printk("dev %d:%d rtdev %d:%d intents %ld caller %pS", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->nr_intents, + __entry->caller_ip) +); + +#define DEFINE_RTGROUP_INTENTS_EVENT(name) \ +DEFINE_EVENT(xfs_rtgroup_intents_class, name, \ + TP_PROTO(struct xfs_rtgroup *rtg, void *caller_ip), \ + TP_ARGS(rtg, caller_ip)) +DEFINE_RTGROUP_INTENTS_EVENT(xfs_rtgroup_bump_intents); +DEFINE_RTGROUP_INTENTS_EVENT(xfs_rtgroup_drop_intents); +DEFINE_RTGROUP_INTENTS_EVENT(xfs_rtgroup_wait_intents); +#endif /* CONFIG_XFS_RT */ + #endif /* CONFIG_XFS_DRAIN_INTENTS */ TRACE_EVENT(xfs_swapext_overhead, From patchwork Fri Dec 30 22:18:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085492 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9256EC4332F for ; Sat, 31 Dec 2022 01:43:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231444AbiLaBn6 (ORCPT ); Fri, 30 Dec 2022 20:43:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48456 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236028AbiLaBn5 (ORCPT ); Fri, 30 Dec 2022 20:43:57 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 11D8613DEA for ; Fri, 30 Dec 2022 17:43:56 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) 
by ams.source.kernel.org (Postfix) with ESMTPS id 92018B81DB1 for ; Sat, 31 Dec 2022 01:43:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 369DCC433EF; Sat, 31 Dec 2022 01:43:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451033; bh=I2JQXZIINBapbLsEDh1XPPBAhPofVyZRlhQkDnLlgDk=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=TdlS3Xd1cetg9GE2KCIt36jnjht30xI1MbJyVJfq521wUuByLgqODKPwNeH1XJHXj zpdb8PWx4E85xEMkp2UAWeW3Nw75L/+bbeDMlE4i7s41FSxrja6NIRQADQEP/pQS6K DBxIrhJabvxc1iMzH0n2xTewb5GFF40FGM95XYdWz37oshpGU6ZIJ9OyeOkQ2P1+qk jxKkwrZjC8cHmoqQwBwWrs/SLRf1wai8c5E7xS0t/RnQqoN92pt4m8F0RWYFt92Vtx SPUn1yIjO7gPDCEfNWHVtFV8SgCd9s6+l7jYNsq4sBaCW6mmdGHUAOmcZ7seoEyZ5T n21bJRM4HNctw== Subject: [PATCH 27/38] xfs: scrub the realtime rmapbt From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:19 -0800 Message-ID: <167243869986.715303.615702315094213761.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Check the realtime reverse mapping btree against the rtbitmap, and modify the rtbitmap scrub to check against the rtrmapbt. Signed-off-by: Darrick J. 
Wong --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_fs.h | 3 - fs/xfs/scrub/bmap.c | 1 fs/xfs/scrub/bmap_repair.c | 1 fs/xfs/scrub/common.c | 78 +++++++++++++++++ fs/xfs/scrub/common.h | 11 ++ fs/xfs/scrub/health.c | 1 fs/xfs/scrub/inode.c | 10 +- fs/xfs/scrub/inode_repair.c | 7 +- fs/xfs/scrub/repair.c | 1 fs/xfs/scrub/rtrmap.c | 192 +++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/scrub.c | 9 ++ fs/xfs/scrub/scrub.h | 5 + fs/xfs/scrub/trace.h | 4 + 14 files changed, 312 insertions(+), 12 deletions(-) create mode 100644 fs/xfs/scrub/rtrmap.c diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 84934538bf52..1060ea739210 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -181,6 +181,7 @@ xfs-y += $(addprefix scrub/, \ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \ rgsuper.o \ rtbitmap.o \ + rtrmap.o \ rtsummary.o \ ) diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h index 5c557d5ff13e..8547ba85c550 100644 --- a/fs/xfs/libxfs/xfs_fs.h +++ b/fs/xfs/libxfs/xfs_fs.h @@ -744,9 +744,10 @@ struct xfs_scrub_metadata { #define XFS_SCRUB_TYPE_HEALTHY 27 /* everything checked out ok */ #define XFS_SCRUB_TYPE_RGSUPER 28 /* realtime superblock */ #define XFS_SCRUB_TYPE_RGBITMAP 29 /* realtime group bitmap */ +#define XFS_SCRUB_TYPE_RTRMAPBT 30 /* rtgroup reverse mapping btree */ /* Number of scrub subcommands. */ -#define XFS_SCRUB_TYPE_NR 30 +#define XFS_SCRUB_TYPE_NR 31 /* i: Repair this metadata. */ #define XFS_SCRUB_IFLAG_REPAIR (1u << 0) diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index b5b081d23ca2..0c79185daedf 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -825,6 +825,7 @@ xchk_bmap( case XFS_DINODE_FMT_UUID: case XFS_DINODE_FMT_DEV: case XFS_DINODE_FMT_LOCAL: + case XFS_DINODE_FMT_RMAP: /* No mappings to check. 
*/ if (whichfork == XFS_COW_FORK) xchk_fblock_set_corrupt(sc, whichfork, 0); diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c index 0ad0f27fd8ca..ca7df344581d 100644 --- a/fs/xfs/scrub/bmap_repair.c +++ b/fs/xfs/scrub/bmap_repair.c @@ -682,6 +682,7 @@ xrep_bmap_check_inputs( case XFS_DINODE_FMT_DEV: case XFS_DINODE_FMT_LOCAL: case XFS_DINODE_FMT_UUID: + case XFS_DINODE_FMT_RMAP: return -ECANCELED; case XFS_DINODE_FMT_EXTENTS: case XFS_DINODE_FMT_BTREE: diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c index bb1d9ca20374..fa8e0064c41d 100644 --- a/fs/xfs/scrub/common.c +++ b/fs/xfs/scrub/common.c @@ -35,6 +35,8 @@ #include "xfs_swapext.h" #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_bmap_util.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -837,6 +839,8 @@ xchk_rtgroup_init( struct xchk_rt *sr, unsigned int rtglock_flags) { + int error; + ASSERT(sr->rtg == NULL); ASSERT(sr->rtlock_flags == 0); @@ -844,7 +848,30 @@ xchk_rtgroup_init( if (!sr->rtg) return -ENOENT; - return xchk_rtgroup_lock(sc, sr, rtglock_flags); + error = xchk_rtgroup_lock(sc, sr, rtglock_flags); + if (error) + return error; + + if (xfs_has_rtrmapbt(sc->mp) && (rtglock_flags & XFS_RTGLOCK_RMAP)) + sr->rmap_cur = xfs_rtrmapbt_init_cursor(sc->mp, sc->tp, + sr->rtg, sr->rtg->rtg_rmapip); + + return 0; +} + +/* + * Free all the btree cursors and other incore data relating to the realtime + * group. This has to be done /before/ committing (or cancelling) the scrub + * transaction. + */ +void +xchk_rtgroup_btcur_free( + struct xchk_rt *sr) +{ + if (sr->rmap_cur) + xfs_btree_del_cursor(sr->rmap_cur, XFS_BTREE_ERROR); + + sr->rmap_cur = NULL; } /* @@ -932,6 +959,14 @@ xchk_setup_fs( return xchk_trans_alloc(sc, resblks); } +/* Set us up with a transaction and an empty context to repair rt metadata. 
*/ +int +xchk_setup_rt( + struct xfs_scrub *sc) +{ + return xchk_trans_alloc(sc, 0); +} + /* Set us up with AG headers and btree cursors. */ int xchk_setup_ag_btree( @@ -1490,3 +1525,44 @@ xchk_fshooks_enable( sc->flags |= scrub_fshooks; } + +/* Count the blocks used by a file, even if it's a metadata inode. */ +int +xchk_inode_count_blocks( + struct xfs_scrub *sc, + int whichfork, + xfs_extnum_t *nextents, + xfs_filblks_t *count) +{ + struct xfs_ifork *ifp = xfs_ifork_ptr(sc->ip, whichfork); + struct xfs_btree_cur *cur; + xfs_extlen_t btblocks; + int error; + + if (!ifp) { + *nextents = 0; + *count = 0; + return 0; + } + + switch (ifp->if_format) { + case XFS_DINODE_FMT_RMAP: + if (!sc->sr.rtg) { + ASSERT(0); + return -EFSCORRUPTED; + } + cur = xfs_rtrmapbt_init_cursor(sc->mp, sc->tp, sc->sr.rtg, + sc->ip); + error = xfs_btree_count_blocks(cur, &btblocks); + xfs_btree_del_cursor(cur, error); + if (error) + return error; + + *nextents = 0; + *count = btblocks - 1; + return 0; + default: + return xfs_bmap_count_blocks(sc->tp, sc->ip, whichfork, + nextents, count); + } +} diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h index e83e88b44e5b..9ca2fbaac72c 100644 --- a/fs/xfs/scrub/common.h +++ b/fs/xfs/scrub/common.h @@ -90,6 +90,7 @@ static inline int xchk_setup_nothing(struct xfs_scrub *sc) /* Setup functions */ int xchk_setup_agheader(struct xfs_scrub *sc); int xchk_setup_fs(struct xfs_scrub *sc); +int xchk_setup_rt(struct xfs_scrub *sc); int xchk_setup_ag_allocbt(struct xfs_scrub *sc); int xchk_setup_ag_iallocbt(struct xfs_scrub *sc); int xchk_setup_ag_rmapbt(struct xfs_scrub *sc); @@ -106,11 +107,13 @@ int xchk_setup_rtbitmap(struct xfs_scrub *sc); int xchk_setup_rtsummary(struct xfs_scrub *sc); int xchk_setup_rgsuperblock(struct xfs_scrub *sc); int xchk_setup_rgbitmap(struct xfs_scrub *sc); +int xchk_setup_rtrmapbt(struct xfs_scrub *sc); #else # define xchk_setup_rtbitmap xchk_setup_nothing # define xchk_setup_rtsummary xchk_setup_nothing # define 
xchk_setup_rgsuperblock xchk_setup_nothing # define xchk_setup_rgbitmap xchk_setup_nothing +# define xchk_setup_rtrmapbt xchk_setup_nothing #endif #ifdef CONFIG_XFS_QUOTA int xchk_ino_dqattach(struct xfs_scrub *sc); @@ -170,14 +173,17 @@ void xchk_rt_unlock(struct xfs_scrub *sc, struct xchk_rt *sr); #ifdef CONFIG_XFS_RT /* All the locks we need to check an rtgroup. */ -#define XCHK_RTGLOCK_ALL (XFS_RTGLOCK_BITMAP_SHARED) +#define XCHK_RTGLOCK_ALL (XFS_RTGLOCK_BITMAP_SHARED | \ + XFS_RTGLOCK_RMAP) int xchk_rtgroup_init(struct xfs_scrub *sc, xfs_rgnumber_t rgno, struct xchk_rt *sr, unsigned int rtglock_flags); void xchk_rtgroup_unlock(struct xfs_scrub *sc, struct xchk_rt *sr); +void xchk_rtgroup_btcur_free(struct xchk_rt *sr); void xchk_rtgroup_free(struct xfs_scrub *sc, struct xchk_rt *sr); #else # define xchk_rtgroup_init(sc, rgno, sr, lockflags) (-ENOSYS) +# define xchk_rtgroup_btcur_free(sr) ((void)0) # define xchk_rtgroup_free(sc, sr) ((void)0) #endif /* CONFIG_XFS_RT */ @@ -258,4 +264,7 @@ static inline bool xchk_need_fshook_drain(struct xfs_scrub *sc) void xchk_fshooks_enable(struct xfs_scrub *sc, unsigned int scrub_fshooks); +int xchk_inode_count_blocks(struct xfs_scrub *sc, int whichfork, + xfs_extnum_t *nextents, xfs_filblks_t *count); + #endif /* __XFS_SCRUB_COMMON_H__ */ diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c index a71d4d9087b2..061f6f73b666 100644 --- a/fs/xfs/scrub/health.c +++ b/fs/xfs/scrub/health.c @@ -113,6 +113,7 @@ static const struct xchk_health_map type_to_health_flag[XFS_SCRUB_TYPE_NR] = { [XFS_SCRUB_TYPE_QUOTACHECK] = { XHG_FS, XFS_SICK_FS_QUOTACHECK }, [XFS_SCRUB_TYPE_NLINKS] = { XHG_FS, XFS_SICK_FS_NLINKS }, [XFS_SCRUB_TYPE_RGSUPER] = { XHG_RTGROUP, XFS_SICK_RT_SUPER }, + [XFS_SCRUB_TYPE_RTRMAPBT] = { XHG_RTGROUP, XFS_SICK_RT_RMAPBT }, }; /* Return the health status mask for this scrub type. 
*/ diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index 4e534ec642e2..f2c60c3515e7 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -466,6 +466,10 @@ xchk_dinode( if (!S_ISREG(mode) && !S_ISDIR(mode)) xchk_ino_set_corrupt(sc, ino); break; + case XFS_DINODE_FMT_RMAP: + if (!S_ISREG(mode)) + xchk_ino_set_corrupt(sc, ino); + break; case XFS_DINODE_FMT_UUID: default: xchk_ino_set_corrupt(sc, ino); @@ -650,15 +654,13 @@ xchk_inode_xref_bmap( return; /* Walk all the extents to check nextents/naextents/nblocks. */ - error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_DATA_FORK, - &nextents, &count); + error = xchk_inode_count_blocks(sc, XFS_DATA_FORK, &nextents, &count); if (!xchk_should_check_xref(sc, &error, NULL)) return; if (nextents < xfs_dfork_data_extents(dip)) xchk_ino_xref_set_corrupt(sc, sc->ip->i_ino); - error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_ATTR_FORK, - &nextents, &acount); + error = xchk_inode_count_blocks(sc, XFS_ATTR_FORK, &nextents, &acount); if (!xchk_should_check_xref(sc, &error, NULL)) return; if (nextents != xfs_dfork_attr_extents(dip)) diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index 1efd606bf92c..a8d19d1e76e3 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -1275,8 +1275,7 @@ xrep_inode_blockcounts( trace_xrep_inode_blockcounts(sc); /* Set data fork counters from the data fork mappings. */ - error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_DATA_FORK, - &nextents, &count); + error = xchk_inode_count_blocks(sc, XFS_DATA_FORK, &nextents, &count); if (error) return error; if (xfs_has_reflink(sc->mp)) { @@ -1296,8 +1295,8 @@ xrep_inode_blockcounts( /* Set attr fork counters from the attr fork mappings. 
*/ ifp = xfs_ifork_ptr(sc->ip, XFS_ATTR_FORK); if (ifp) { - error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_ATTR_FORK, - &nextents, &acount); + error = xchk_inode_count_blocks(sc, XFS_ATTR_FORK, &nextents, + &acount); if (error) return error; if (count >= sc->mp->m_sb.sb_dblocks) diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 1652f633f692..eb0dda2df7af 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -56,6 +56,7 @@ xrep_attempt( trace_xrep_attempt(XFS_I(file_inode(sc->file)), sc->sm, error); xchk_ag_btcur_free(&sc->sa); + xchk_rtgroup_btcur_free(&sc->sr); /* Repair whatever's broken. */ ASSERT(sc->ops->repair); diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c new file mode 100644 index 000000000000..e60b454b39f3 --- /dev/null +++ b/fs/xfs/scrub/rtrmap.c @@ -0,0 +1,192 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_defer.h" +#include "xfs_btree.h" +#include "xfs_bit.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_sb.h" +#include "xfs_rmap.h" +#include "xfs_rmap_btree.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_inode.h" +#include "xfs_rtalloc.h" +#include "xfs_rtgroup.h" +#include "scrub/xfs_scrub.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/btree.h" +#include "scrub/trace.h" + +/* Set us up with the realtime metadata locked. 
*/ +int +xchk_setup_rtrmapbt( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct xfs_rtgroup *rtg; + int error = 0; + + if (xchk_need_fshook_drain(sc)) + xchk_fshooks_enable(sc, XCHK_FSHOOKS_DRAIN); + + rtg = xfs_rtgroup_get(mp, sc->sm->sm_agno); + if (!rtg) + return -ENOENT; + + error = xchk_setup_rt(sc); + if (error) + goto out_rtg; + + error = xchk_install_live_inode(sc, rtg->rtg_rmapip); + if (error) + goto out_rtg; + + error = xchk_ino_dqattach(sc); + if (error) + goto out_rtg; + + error = xchk_rtgroup_init(sc, rtg->rtg_rgno, &sc->sr, XCHK_RTGLOCK_ALL); +out_rtg: + xfs_rtgroup_put(rtg); + return error; +} + +/* Realtime reverse mapping. */ + +struct xchk_rtrmap { + /* + * The furthest-reaching of the rmapbt records that we've already + * processed. This enables us to detect overlapping records for space + * allocations that cannot be shared. + */ + struct xfs_rmap_irec overlap_rec; + + /* + * The previous rmapbt record, so that we can check for two records + * that could be one. + */ + struct xfs_rmap_irec prev_rec; +}; + +/* Flag failures for records that overlap but cannot. */ +STATIC void +xchk_rtrmapbt_check_overlapping( + struct xchk_btree *bs, + struct xchk_rtrmap *cr, + const struct xfs_rmap_irec *irec) +{ + xfs_rtblock_t pnext, inext; + + if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + + /* No previous record? */ + if (cr->overlap_rec.rm_blockcount == 0) + goto set_prev; + + /* Do overlap_rec and irec overlap? */ + pnext = cr->overlap_rec.rm_startblock + cr->overlap_rec.rm_blockcount; + if (pnext <= irec->rm_startblock) + goto set_prev; + + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + + /* Save whichever rmap record extends furthest. */ + inext = irec->rm_startblock + irec->rm_blockcount; + if (pnext > inext) + return; + +set_prev: + memcpy(&cr->overlap_rec, irec, sizeof(struct xfs_rmap_irec)); +} + +/* Decide if two reverse-mapping records can be merged. 
*/ +static inline bool +xchk_rtrmap_mergeable( + struct xchk_rtrmap *cr, + const struct xfs_rmap_irec *r2) +{ + const struct xfs_rmap_irec *r1 = &cr->prev_rec; + + /* Ignore if prev_rec is not yet initialized. */ + if (cr->prev_rec.rm_blockcount == 0) + return false; + + if (r1->rm_owner != r2->rm_owner) + return false; + if (r1->rm_startblock + r1->rm_blockcount != r2->rm_startblock) + return false; + if ((unsigned long long)r1->rm_blockcount + r2->rm_blockcount > + XFS_RMAP_LEN_MAX) + return false; + if (r1->rm_flags != r2->rm_flags) + return false; + return r1->rm_offset + r1->rm_blockcount == r2->rm_offset; +} + +/* Flag failures for records that could be merged. */ +STATIC void +xchk_rtrmapbt_check_mergeable( + struct xchk_btree *bs, + struct xchk_rtrmap *cr, + const struct xfs_rmap_irec *irec) +{ + if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + + if (xchk_rtrmap_mergeable(cr, irec)) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + + memcpy(&cr->prev_rec, irec, sizeof(struct xfs_rmap_irec)); +} + +/* Scrub a realtime rmapbt record. */ +STATIC int +xchk_rtrmapbt_rec( + struct xchk_btree *bs, + const union xfs_btree_rec *rec) +{ + struct xchk_rtrmap *cr = bs->private; + struct xfs_rmap_irec irec; + + if (xfs_rmap_btrec_to_irec(rec, &irec) != NULL || + xfs_rmap_check_irec(bs->cur, &irec) != NULL) { + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + return 0; + } + + if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return 0; + + xchk_rtrmapbt_check_mergeable(bs, cr, &irec); + xchk_rtrmapbt_check_overlapping(bs, cr, &irec); + return 0; +} + +/* Scrub the realtime rmap btree. 
*/ +int +xchk_rtrmapbt( + struct xfs_scrub *sc) +{ + struct xfs_owner_info oinfo; + struct xchk_rtrmap cr = { }; + int error; + + error = xchk_metadata_inode_forks(sc); + if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) + return error; + + xfs_rmap_ino_bmbt_owner(&oinfo, sc->sr.rtg->rtg_rmapip->i_ino, + XFS_DATA_FORK); + return xchk_btree(sc, sc->sr.rmap_cur, xchk_rtrmapbt_rec, &oinfo, &cr); +} diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 6066673953cb..c9b4899c8b6a 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -182,6 +182,8 @@ xchk_teardown( int error) { xchk_ag_free(sc, &sc->sa); + xchk_rtgroup_btcur_free(&sc->sr); + if (sc->tp) { if (error == 0 && (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR)) error = xfs_trans_commit(sc->tp); @@ -423,6 +425,13 @@ static const struct xchk_meta_ops meta_scrub_ops[] = { .has = xfs_has_rtgroups, .repair = xrep_notsupported, }, + [XFS_SCRUB_TYPE_RTRMAPBT] = { /* realtime group rmapbt */ + .type = ST_RTGROUP, + .setup = xchk_setup_rtrmapbt, + .scrub = xchk_rtrmapbt, + .has = xfs_has_rtrmapbt, + .repair = xrep_notsupported, + }, }; static int diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index 48114bda2f4a..fa75034b9051 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -78,6 +78,9 @@ struct xchk_rt { * if rtg != NULL. 
*/ unsigned int rtlock_flags; + + /* rtgroup btrees */ + struct xfs_btree_cur *rmap_cur; }; struct xfs_scrub { @@ -190,11 +193,13 @@ int xchk_rtbitmap(struct xfs_scrub *sc); int xchk_rtsummary(struct xfs_scrub *sc); int xchk_rgsuperblock(struct xfs_scrub *sc); int xchk_rgbitmap(struct xfs_scrub *sc); +int xchk_rtrmapbt(struct xfs_scrub *sc); #else # define xchk_rtbitmap xchk_nothing # define xchk_rtsummary xchk_nothing # define xchk_rgsuperblock xchk_nothing # define xchk_rgbitmap xchk_nothing +# define xchk_rtrmapbt xchk_nothing #endif #ifdef CONFIG_XFS_QUOTA int xchk_quota(struct xfs_scrub *sc); diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 3ffee717062d..844f49091b1d 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -77,6 +77,7 @@ TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_NLINKS); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_HEALTHY); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RGSUPER); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RGBITMAP); +TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RTRMAPBT); #define XFS_SCRUB_TYPE_STRINGS \ { XFS_SCRUB_TYPE_PROBE, "probe" }, \ @@ -108,7 +109,8 @@ TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RGBITMAP); { XFS_SCRUB_TYPE_NLINKS, "nlinks" }, \ { XFS_SCRUB_TYPE_HEALTHY, "healthy" }, \ { XFS_SCRUB_TYPE_RGSUPER, "rgsuper" }, \ - { XFS_SCRUB_TYPE_RGBITMAP, "rgbitmap" } + { XFS_SCRUB_TYPE_RGBITMAP, "rgbitmap" }, \ + { XFS_SCRUB_TYPE_RTRMAPBT, "rtrmapbt" } #define XFS_SCRUB_FLAG_STRINGS \ { XFS_SCRUB_IFLAG_REPAIR, "repair" }, \ From patchwork Fri Dec 30 22:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085493 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 12721C4332F for ; Sat, 31 Dec 2022 01:44:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236036AbiLaBoL (ORCPT ); Fri, 30 Dec 2022 20:44:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48474 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236028AbiLaBoK (ORCPT ); Fri, 30 Dec 2022 20:44:10 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D420E13DEA for ; Fri, 30 Dec 2022 17:44:09 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6FC1261C3A for ; Sat, 31 Dec 2022 01:44:09 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CC9E8C433D2; Sat, 31 Dec 2022 01:44:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451048; bh=xyAT8fRrcvBW//Zb3kpyyOz5rDYwvjEkvApzxSzjygM=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=vAimfpzmZWC3YlU4uxyL393TO6Cjk5Lu12GzIKLlbUqRwQL9YWlxP3CLOSsbJSWKJ GxFOP/KKteHGVvTivckHK1Frz+cCdicZXwF6ZA8+1oliqK5Af/Q7LXpAKXUmfBQGGB 8SfaPkp/6u58VxO/rHjuY2rNQ/wqfMY658XXdYI4lrytQra3VHQXL2KuwpsQAfTGm1 F2G2/Ab5+45k90o+G3AThG6ik/M301lsPgFrOl7Iom/6Qbfsj9f6iT42ogmmBLKNVE qDd8KbaopmC9QPjPSU25zUohfPCPMCIgsAitaqfrLz5OCT4jaHl7/Z0xfSeFvTDzWu u2AD/Bth8JS7g== Subject: [PATCH 28/38] xfs: cross-reference realtime bitmap to realtime rmapbt scrubber From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:20 -0800 Message-ID: <167243870001.715303.15509002147839052353.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong When we're checking the realtime rmap btree entries, cross-reference those entries with the realtime bitmap too. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/rtrmap.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c index e60b454b39f3..72fc47cc25f0 100644 --- a/fs/xfs/scrub/rtrmap.c +++ b/fs/xfs/scrub/rtrmap.c @@ -150,6 +150,23 @@ xchk_rtrmapbt_check_mergeable( memcpy(&cr->prev_rec, irec, sizeof(struct xfs_rmap_irec)); } +/* Cross-reference with other metadata. */ +STATIC void +xchk_rtrmapbt_xref( + struct xfs_scrub *sc, + struct xfs_rmap_irec *irec) +{ + xfs_rtblock_t rtbno; + + if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + + rtbno = xfs_rgbno_to_rtb(sc->mp, sc->sr.rtg->rtg_rgno, + irec->rm_startblock); + + xchk_xref_is_used_rt_space(sc, rtbno, irec->rm_blockcount); +} + /* Scrub a realtime rmapbt record. */ STATIC int xchk_rtrmapbt_rec( @@ -170,6 +187,7 @@ xchk_rtrmapbt_rec( xchk_rtrmapbt_check_mergeable(bs, cr, &irec); xchk_rtrmapbt_check_overlapping(bs, cr, &irec); + xchk_rtrmapbt_xref(bs->sc, &irec); return 0; } From patchwork Fri Dec 30 22:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085494 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79D95C4332F for ; Sat, 31 Dec 2022 01:44:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235902AbiLaBo3 (ORCPT ); Fri, 30 Dec 2022 20:44:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48578 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236049AbiLaBo2 (ORCPT ); Fri, 30 Dec 2022 20:44:28 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D411813F7A for ; Fri, 30 Dec 2022 17:44:26 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 95940B81A16 for ; Sat, 31 Dec 2022 01:44:25 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5E3FFC433D2; Sat, 31 Dec 2022 01:44:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451064; bh=Dr2/UOyjflVKUhrZy/m7E9v3S1q6V4PYZ6vHN5qtU/g=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=MjsoTZCOjLUzpoHvOjrha5XqucwaI4wVANoOiGTv49af59ueZOCBbGl3rbWQHcADA /apBGpfH9JnuEbCfT5z/UbEHxnQoCCI5nawx2sIk2IhIiQU3Hnr/xNDrvzO/V9aSgx s6SLHPe/4zuszBFhUbox2uOxtmF4eeiM9ok9VR/1xpWVbMoDkOPEjUJApTkApkvRHI LTj9t0Of/VLjd4rJRLdyq/opMuUZry8xbc8+Ipn7sxwR/2DNx5D7ZB6U2gTMpSmFBQ iILYRtVJ8Ujsr0pjfjgtWoLj5ji1RQPeyT+RVCyWYGfmpwQxIIchJto65bYnaT/Y8t wQkUd+xkTrr6g== Subject: [PATCH 29/38] xfs: cross-reference the realtime rmapbt From: "Darrick J. 
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:20 -0800 Message-ID: <167243870016.715303.9787144291938755463.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Teach the data fork and realtime bitmap scrubbers to cross-reference information with the realtime rmap btree. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/bmap.c | 67 +++++++++++++++++++++++++++++++-------- fs/xfs/scrub/rtbitmap.c | 80 +++++++++++++++++++++++++++++++++++++++++++++-- fs/xfs/scrub/rtrmap.c | 65 ++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/scrub.h | 9 +++++ 4 files changed, 202 insertions(+), 19 deletions(-) diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index 0c79185daedf..49fffe85dde6 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -19,6 +19,7 @@ #include "xfs_bmap_btree.h" #include "xfs_rmap.h" #include "xfs_rmap_btree.h" +#include "xfs_rtgroup.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/btree.h" @@ -127,15 +128,22 @@ static inline bool xchk_bmap_get_rmap( struct xchk_bmap_info *info, struct xfs_bmbt_irec *irec, - xfs_agblock_t agbno, + xfs_agblock_t bno, uint64_t owner, struct xfs_rmap_irec *rmap) { + struct xfs_btree_cur **curp = &info->sc->sa.rmap_cur; xfs_fileoff_t offset; unsigned int rflags = 0; int has_rmap; int error; + if (xfs_ifork_is_realtime(info->sc->ip, info->whichfork)) + curp = &info->sc->sr.rmap_cur; + + if (*curp == NULL) + return false; + if (info->whichfork == XFS_ATTR_FORK) rflags |= XFS_RMAP_ATTR_FORK; if (irec->br_state == XFS_EXT_UNWRITTEN) @@ -156,13 +164,13 @@ xchk_bmap_get_rmap( * range rmap lookup to make sure we get the correct owner/offset. 
*/ if (info->is_shared) { - error = xfs_rmap_lookup_le_range(info->sc->sa.rmap_cur, agbno, - owner, offset, rflags, rmap, &has_rmap); + error = xfs_rmap_lookup_le_range(*curp, bno, owner, offset, + rflags, rmap, &has_rmap); } else { - error = xfs_rmap_lookup_le(info->sc->sa.rmap_cur, agbno, - owner, offset, rflags, rmap, &has_rmap); + error = xfs_rmap_lookup_le(*curp, bno, owner, offset, + rflags, rmap, &has_rmap); } - if (!xchk_should_check_xref(info->sc, &error, &info->sc->sa.rmap_cur)) + if (!xchk_should_check_xref(info->sc, &error, curp)) return false; if (!has_rmap) @@ -218,13 +226,13 @@ STATIC void xchk_bmap_xref_rmap( struct xchk_bmap_info *info, struct xfs_bmbt_irec *irec, - xfs_agblock_t agbno) + xfs_agblock_t bno) { struct xfs_rmap_irec rmap; unsigned long long rmap_end; uint64_t owner; - if (!info->sc->sa.rmap_cur || xchk_skip_xref(info->sc->sm)) + if (xchk_skip_xref(info->sc->sm)) return; if (info->whichfork == XFS_COW_FORK) @@ -233,13 +241,12 @@ xchk_bmap_xref_rmap( owner = info->sc->ip->i_ino; /* Find the rmap record for this irec. */ - if (!xchk_bmap_get_rmap(info, irec, agbno, owner, &rmap)) + if (!xchk_bmap_get_rmap(info, irec, bno, owner, &rmap)) return; /* Check the rmap. */ rmap_end = (unsigned long long)rmap.rm_startblock + rmap.rm_blockcount; - if (rmap.rm_startblock > agbno || - agbno + irec->br_blockcount > rmap_end) + if (rmap.rm_startblock > bno || bno + irec->br_blockcount > rmap_end) xchk_fblock_xref_set_corrupt(info->sc, info->whichfork, irec->br_startoff); @@ -288,7 +295,7 @@ xchk_bmap_xref_rmap( * Skip this for CoW fork extents because the refcount btree (and not * the inode) is the ondisk owner for those extents. 
*/ - if (info->whichfork != XFS_COW_FORK && rmap.rm_startblock < agbno && + if (info->whichfork != XFS_COW_FORK && rmap.rm_startblock < bno && !xchk_bmap_has_prev(info, irec)) { xchk_fblock_xref_set_corrupt(info->sc, info->whichfork, irec->br_startoff); @@ -303,7 +310,7 @@ xchk_bmap_xref_rmap( */ rmap_end = (unsigned long long)rmap.rm_startblock + rmap.rm_blockcount; if (info->whichfork != XFS_COW_FORK && - rmap_end > agbno + irec->br_blockcount && + rmap_end > bno + irec->br_blockcount && !xchk_bmap_has_next(info, irec)) { xchk_fblock_xref_set_corrupt(info->sc, info->whichfork, irec->br_startoff); @@ -318,10 +325,40 @@ xchk_bmap_rt_iextent_xref( struct xchk_bmap_info *info, struct xfs_bmbt_irec *irec) { - xchk_rt_init(info->sc, &info->sc->sr, XCHK_RTLOCK_BITMAP_SHARED); + struct xfs_owner_info oinfo; + struct xfs_mount *mp = ip->i_mount; + xfs_rgnumber_t rgno; + xfs_rgblock_t rgbno; + int error; + + if (!xfs_has_rtrmapbt(mp)) { + xchk_rt_init(info->sc, &info->sc->sr, + XCHK_RTLOCK_BITMAP_SHARED); + xchk_xref_is_used_rt_space(info->sc, irec->br_startblock, + irec->br_blockcount); + xchk_rt_unlock(info->sc, &info->sc->sr); + return; + } + + rgbno = xfs_rtb_to_rgbno(mp, irec->br_startblock, &rgno); + error = xchk_rtgroup_init(info->sc, rgno, &info->sc->sr, + XCHK_RTGLOCK_ALL); + if (!xchk_fblock_process_error(info->sc, info->whichfork, + irec->br_startoff, &error)) + goto out_free; + xchk_xref_is_used_rt_space(info->sc, irec->br_startblock, irec->br_blockcount); - xchk_rt_unlock(info->sc, &info->sc->sr); + xchk_bmap_xref_rmap(info, irec, rgbno); + + xfs_rmap_ino_owner(&oinfo, info->sc->ip->i_ino, info->whichfork, + irec->br_startoff); + xchk_xref_is_only_rt_owned_by(info->sc, rgbno, irec->br_blockcount, + &oinfo); + +out_free: + xchk_rtgroup_btcur_free(&info->sc->sr); + xchk_rtgroup_free(info->sc, &info->sc->sr); } /* Cross-reference a single datadev extent record. 
*/ diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c index a034f2d392f5..eb150c40d33c 100644 --- a/fs/xfs/scrub/rtbitmap.c +++ b/fs/xfs/scrub/rtbitmap.c @@ -9,15 +9,19 @@ #include "xfs_format.h" #include "xfs_trans_resv.h" #include "xfs_mount.h" +#include "xfs_btree.h" #include "xfs_log_format.h" #include "xfs_trans.h" #include "xfs_rtbitmap.h" #include "xfs_inode.h" #include "xfs_bmap.h" #include "xfs_rtgroup.h" +#include "xfs_rmap.h" +#include "xfs_rtrmap_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/repair.h" +#include "scrub/btree.h" /* Set us up with the realtime group metadata locked. */ int @@ -77,6 +81,43 @@ xchk_setup_rtbitmap( /* Realtime bitmap. */ +struct xchk_rtbitmap { + struct xfs_scrub *sc; + + /* The next free rt block that we expect to see. */ + xfs_rtblock_t next_free_rtblock; +}; + +/* Cross-reference rtbitmap entries with other metadata. */ +STATIC void +xchk_rtbitmap_xref( + struct xchk_rtbitmap *rtb, + xfs_rtblock_t startblock, + xfs_rtblock_t blockcount) +{ + struct xfs_scrub *sc = rtb->sc; + xfs_rgnumber_t rgno; + xfs_rgblock_t rgbno; + + if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + if (!sc->sr.rmap_cur) + return; + + rgbno = xfs_rtb_to_rgbno(sc->mp, startblock, &rgno); + xchk_xref_has_no_rt_owner(sc, rgbno, blockcount); + + if (rtb->next_free_rtblock < startblock) { + xfs_rgblock_t next_rgbno; + + next_rgbno = xfs_rtb_to_rgbno(sc->mp, rtb->next_free_rtblock, + &rgno); + xchk_xref_has_rt_owner(sc, next_rgbno, rgbno - next_rgbno); + } + + rtb->next_free_rtblock = startblock + blockcount; +} + /* Scrub a free extent record from the realtime bitmap. 
*/ STATIC int xchk_rtbitmap_rec( @@ -85,8 +126,9 @@ xchk_rtbitmap_rec( const struct xfs_rtalloc_rec *rec, void *priv) { - struct xfs_scrub *sc = priv; - xfs_rtxnum_t startblock; + struct xchk_rtbitmap *rtb = priv; + struct xfs_scrub *sc = rtb->sc; + xfs_rtblock_t startblock; xfs_filblks_t blockcount; startblock = xfs_rtx_to_rtb(mp, rec->ar_startext); @@ -94,6 +136,12 @@ xchk_rtbitmap_rec( if (!xfs_verify_rtbext(mp, startblock, blockcount)) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, 0); + + xchk_rtbitmap_xref(rtb, startblock, blockcount); + + if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return -ECANCELED; + return 0; } @@ -138,8 +186,12 @@ xchk_rgbitmap( struct xfs_scrub *sc) { struct xfs_rtalloc_rec keys[2]; + struct xchk_rtbitmap rtb = { + .sc = sc, + }; struct xfs_rtgroup *rtg = sc->sr.rtg; xfs_rtblock_t rtbno; + xfs_rtblock_t last_rtbno; xfs_rgblock_t last_rgbno = rtg->rtg_blockcount - 1; int error; @@ -155,6 +207,7 @@ xchk_rgbitmap( * realtime group. */ rtbno = xfs_rgbno_to_rtb(sc->mp, rtg->rtg_rgno, 0); + rtb.next_free_rtblock = rtbno; keys[0].ar_startext = xfs_rtb_to_rtxt(sc->mp, rtbno); rtbno = xfs_rgbno_to_rtb(sc->mp, rtg->rtg_rgno, last_rgbno); @@ -162,10 +215,26 @@ xchk_rgbitmap( keys[0].ar_extcount = keys[1].ar_extcount = 0; error = xfs_rtalloc_query_range(sc->mp, sc->tp, &keys[0], &keys[1], - xchk_rtbitmap_rec, sc); + xchk_rtbitmap_rec, &rtb); if (!xchk_fblock_process_error(sc, XFS_DATA_FORK, 0, &error)) return error; + /* + * Check that there are rmappings for all rt extents between the end of + * the last free extent we saw and the last possible extent in the rt + * group.
+ */ + last_rtbno = xfs_rgbno_to_rtb(sc->mp, rtg->rtg_rgno, last_rgbno); + if (rtb.next_free_rtblock < last_rtbno) { + xfs_rgnumber_t rgno; + xfs_rgblock_t next_rgbno; + + next_rgbno = xfs_rtb_to_rgbno(sc->mp, rtb.next_free_rtblock, + &rgno); + xchk_xref_has_rt_owner(sc, next_rgbno, + last_rgbno - next_rgbno); + } + return 0; } @@ -174,6 +243,9 @@ int xchk_rtbitmap( struct xfs_scrub *sc) { + struct xchk_rtbitmap rtb = { + .sc = sc, + }; int error; /* Is the size of the rtbitmap correct? */ @@ -199,7 +271,7 @@ xchk_rtbitmap( if (xfs_has_rtgroups(sc->mp)) return 0; - error = xfs_rtalloc_query_all(sc->mp, sc->tp, xchk_rtbitmap_rec, sc); + error = xfs_rtalloc_query_all(sc->mp, sc->tp, xchk_rtbitmap_rec, &rtb); if (!xchk_fblock_process_error(sc, XFS_DATA_FORK, 0, &error)) return error; diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c index 72fc47cc25f0..e9ca9670f3af 100644 --- a/fs/xfs/scrub/rtrmap.c +++ b/fs/xfs/scrub/rtrmap.c @@ -208,3 +208,68 @@ xchk_rtrmapbt( XFS_DATA_FORK); return xchk_btree(sc, sc->sr.rmap_cur, xchk_rtrmapbt_rec, &oinfo, &cr); } + +/* xref check that the extent has no realtime reverse mapping at all */ +void +xchk_xref_has_no_rt_owner( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + enum xbtree_recpacking outcome; + int error; + + if (!sc->sr.rmap_cur || xchk_skip_xref(sc->sm)) + return; + + error = xfs_rmap_has_records(sc->sr.rmap_cur, bno, len, &outcome); + if (!xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur)) + return; + if (outcome != XBTREE_RECPACKING_EMPTY) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); +} + +/* xref check that the extent is completely mapped */ +void +xchk_xref_has_rt_owner( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + enum xbtree_recpacking outcome; + int error; + + if (!sc->sr.rmap_cur || xchk_skip_xref(sc->sm)) + return; + + error = xfs_rmap_has_records(sc->sr.rmap_cur, bno, len, &outcome); + if (!xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur)) + 
return; + if (outcome != XBTREE_RECPACKING_FULL) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); +} + +/* xref check that the extent is only owned by a given owner */ +void +xchk_xref_is_only_rt_owned_by( + struct xfs_scrub *sc, + xfs_agblock_t bno, + xfs_extlen_t len, + const struct xfs_owner_info *oinfo) +{ + struct xfs_rmap_matches res; + int error; + + if (!sc->sr.rmap_cur || xchk_skip_xref(sc->sm)) + return; + + error = xfs_rmap_count_owners(sc->sr.rmap_cur, bno, len, oinfo, &res); + if (!xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur)) + return; + if (res.matches != 1) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); + if (res.badno_matches) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); + if (res.nono_matches) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); +} diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index fa75034b9051..d47db84e6b7f 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -233,8 +233,17 @@ void xchk_xref_is_not_cow_staging(struct xfs_scrub *sc, xfs_agblock_t bno, #ifdef CONFIG_XFS_RT void xchk_xref_is_used_rt_space(struct xfs_scrub *sc, xfs_rtblock_t rtbno, xfs_extlen_t len); +void xchk_xref_has_no_rt_owner(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); +void xchk_xref_has_rt_owner(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); +void xchk_xref_is_only_rt_owned_by(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len, const struct xfs_owner_info *oinfo); #else # define xchk_xref_is_used_rt_space(sc, rtbno, len) do { } while (0) +# define xchk_xref_has_no_rt_owner(sc, rtbno, len) do { } while (0) +# define xchk_xref_has_rt_owner(sc, rtbno, len) do { } while (0) +# define xchk_xref_is_only_rt_owned_by(sc, bno, len, oinfo) do { } while (0) #endif #endif /* __XFS_SCRUB_SCRUB_H__ */ From patchwork Fri Dec 30 22:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085495 Subject: [PATCH 30/38] xfs: scan rt rmap when we're doing an intense rmap check of bmbt mappings From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:20 -0800 Message-ID: <167243870030.715303.10177350333030281769.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Teach the bmbt scrubber how to perform a comprehensive check that the rmapbt does not contain /any/ mappings that are not described by bmbt records when it's dealing with a realtime file. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/bmap.c | 60 +++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 53 insertions(+), 7 deletions(-) diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index 49fffe85dde6..8ce279ae9c95 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -20,6 +20,8 @@ #include "xfs_rmap.h" #include "xfs_rmap_btree.h" #include "xfs_rtgroup.h" +#include "xfs_rtalloc.h" +#include "xfs_rtrmap_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/btree.h" @@ -673,12 +675,20 @@ xchk_bmap_check_rmap( */ check_rec = *rec; while (have_map) { + xfs_fsblock_t startblock; + if (irec.br_startoff != check_rec.rm_offset) xchk_fblock_set_corrupt(sc, sbcri->whichfork, check_rec.rm_offset); - if (irec.br_startblock != XFS_AGB_TO_FSB(sc->mp, - cur->bc_ag.pag->pag_agno, - check_rec.rm_startblock)) + if (cur->bc_btnum == XFS_BTNUM_RMAP) + startblock = XFS_AGB_TO_FSB(sc->mp, + cur->bc_ag.pag->pag_agno, + check_rec.rm_startblock); + else + startblock = xfs_rgbno_to_rtb(sc->mp, + cur->bc_ino.rtg->rtg_rgno, + check_rec.rm_startblock); + if (irec.br_startblock != startblock) xchk_fblock_set_corrupt(sc, sbcri->whichfork, check_rec.rm_offset); if (irec.br_blockcount > check_rec.rm_blockcount) @@ -732,6 +742,30 @@ xchk_bmap_check_ag_rmaps( return error; } +/* Make sure each rt rmap has a corresponding 
bmbt entry. */ +STATIC int +xchk_bmap_check_rt_rmaps( + struct xfs_scrub *sc, + struct xfs_rtgroup *rtg) +{ + struct xchk_bmap_check_rmap_info sbcri; + struct xfs_btree_cur *cur; + int error; + + xfs_rtgroup_lock(NULL, rtg, XFS_RTGLOCK_RMAP); + cur = xfs_rtrmapbt_init_cursor(sc->mp, sc->tp, rtg, rtg->rtg_rmapip); + + sbcri.sc = sc; + sbcri.whichfork = XFS_DATA_FORK; + error = xfs_rmap_query_all(cur, xchk_bmap_check_rmap, &sbcri); + if (error == -ECANCELED) + error = 0; + + xfs_btree_del_cursor(cur, error); + xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_RMAP); + return error; +} + /* Make sure each rmap has a corresponding bmbt entry. */ STATIC int xchk_bmap_check_rmaps( @@ -749,10 +783,6 @@ xchk_bmap_check_rmaps( (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) return 0; - /* Don't support realtime rmap checks yet. */ - if (xfs_ifork_is_realtime(sc->ip, whichfork)) - return 0; - ASSERT(xfs_ifork_ptr(sc->ip, whichfork) != NULL); /* @@ -772,6 +802,22 @@ xchk_bmap_check_rmaps( (zero_size || ifp->if_nextents > 0)) return 0; + if (xfs_ifork_is_realtime(sc->ip, whichfork)) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + + for_each_rtgroup(sc->mp, rgno, rtg) { + error = xchk_bmap_check_rt_rmaps(sc, rtg); + if (error || + (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) { + xfs_rtgroup_put(rtg); + return error; + } + } + + return 0; + } + for_each_perag(sc->mp, agno, pag) { error = xchk_bmap_check_ag_rmaps(sc, whichfork, pag); if (error || From patchwork Fri Dec 30 22:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085496 Subject: [PATCH 31/38] xfs: walk the rt reverse mapping tree when rebuilding rmap From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:20 -0800 Message-ID: <167243870045.715303.13233266513206700894.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong When we're rebuilding the data device rmap, if we encounter an "rmap" format fork, we have to walk the (realtime) rmap btree inode to build the appropriate mappings. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/rmap_repair.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/fs/xfs/scrub/rmap_repair.c b/fs/xfs/scrub/rmap_repair.c index ed937e461bf8..86c5338a12b9 100644 --- a/fs/xfs/scrub/rmap_repair.c +++ b/fs/xfs/scrub/rmap_repair.c @@ -30,6 +30,8 @@ #include "xfs_refcount.h" #include "xfs_refcount_btree.h" #include "xfs_ag.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_rtgroup.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -496,6 +498,38 @@ xrep_rmap_scan_iext( return xrep_rmap_stash_accumulated(rf); } +static int +xrep_rmap_scan_rtrmapbt( + struct xrep_rmap_ifork *rf, + struct xfs_inode *ip) +{ + struct xfs_scrub *sc = rf->rr->sc; + struct xfs_btree_cur *cur; + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + int error; + + if (rf->whichfork != XFS_DATA_FORK) + return -EFSCORRUPTED; + + for_each_rtgroup(sc->mp, rgno, rtg) { + if (ip == rtg->rtg_rmapip) { + cur = xfs_rtrmapbt_init_cursor(sc->mp, sc->tp, rtg, ip); + error = xrep_rmap_scan_iroot_btree(rf, cur); + xfs_btree_del_cursor(cur, error); + xfs_rtgroup_put(rtg); + return error; + } + } + + /* + * We shouldn't find an rmap format inode that isn't associated with + * an rtgroup! + */ + ASSERT(0); + return -EFSCORRUPTED; +} + /* Find all the extents from a given AG in an inode fork. 
*/ STATIC int xrep_rmap_scan_ifork( @@ -525,6 +559,8 @@ xrep_rmap_scan_ifork( error = xrep_rmap_scan_bmbt(&rf, ip, &mappings_done); if (error || mappings_done) return error; + } else if (ifp->if_format == XFS_DINODE_FMT_RMAP) { + return xrep_rmap_scan_rtrmapbt(&rf, ip); } else if (ifp->if_format != XFS_DINODE_FMT_EXTENTS) { return 0; } From patchwork Fri Dec 30 22:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085497 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90183C4332F for ; Sat, 31 Dec 2022 01:45:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236043AbiLaBpP (ORCPT ); Fri, 30 Dec 2022 20:45:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236014AbiLaBpO (ORCPT ); Fri, 30 Dec 2022 20:45:14 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AD34113DEA for ; Fri, 30 Dec 2022 17:45:13 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3F73BB81DDA for ; Sat, 31 Dec 2022 01:45:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E45FAC433EF; Sat, 31 Dec 2022 01:45:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451111; bh=9b1x5bNhXEv2NOM3znWKwHKu+jSC3MWyRcXHH5B992U=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=aybQE2FWLOAdzqwfJKsgNYu6ihJE3pKA/XtZhAXj7YT6CWskM27jaFDVwS9YvFtin 
c2MG6AyEYz36HvkFl+fsJWYC05SHLvTeGtNQsSfmqwArvcFxSmGbsFH3rWEoAmEmD8 Cak4t/diTsLehvu4RMD37vY9mrYfPXzge41QlqZU0797vdDJRLoMRV+w6oV+iM/9NZ BU6ttEKpCGLi9tY/nCZmC6zwLFXbEwDsXhsMuduylviLtskKX8ZetqjV8qYrVf94UH s4Y+BgiqL8aOybbz5bKYLMtNsZ43w1tkydd8hSDc0L3vggrBaeBJ7UMwOySmmrmNpz JQk3AK7duz+ug== Subject: [PATCH 32/38] xfs: online repair of realtime file bmaps From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:20 -0800 Message-ID: <167243870059.715303.6185832700825166013.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Repair the block mappings of realtime files. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/bmap_repair.c | 127 +++++++++++++++++++++++++++++++++++++++++++- fs/xfs/scrub/common.c | 2 - fs/xfs/scrub/common.h | 3 + fs/xfs/scrub/repair.c | 93 ++++++++++++++++++++++++++++++++ fs/xfs/scrub/repair.h | 11 ++++ 5 files changed, 231 insertions(+), 5 deletions(-) diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c index ca7df344581d..77d601afbcfb 100644 --- a/fs/xfs/scrub/bmap_repair.c +++ b/fs/xfs/scrub/bmap_repair.c @@ -25,10 +25,12 @@ #include "xfs_bmap_btree.h" #include "xfs_rmap.h" #include "xfs_rmap_btree.h" +#include "xfs_rtrmap_btree.h" #include "xfs_refcount.h" #include "xfs_quota.h" #include "xfs_ialloc.h" #include "xfs_ag.h" +#include "xfs_rtgroup.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -313,6 +315,116 @@ xrep_bmap_scan_ag( return error; } +#ifdef CONFIG_XFS_RT +/* Check for any obvious errors or conflicts in the file mapping. 
*/ +STATIC int +xrep_bmap_check_rtfork_rmap( + struct xfs_scrub *sc, + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec) +{ + xfs_rtblock_t rtbno; + + /* xattr extents are never stored on realtime devices */ + if (rec->rm_flags & XFS_RMAP_ATTR_FORK) + return -EFSCORRUPTED; + + /* bmbt blocks are never stored on realtime devices */ + if (rec->rm_flags & XFS_RMAP_BMBT_BLOCK) + return -EFSCORRUPTED; + + /* Data extents for non-rt files are never stored on the rt device. */ + if (!XFS_IS_REALTIME_INODE(sc->ip)) + return -EFSCORRUPTED; + + /* Check the file offsets and physical extents. */ + if (!xfs_verify_fileext(sc->mp, rec->rm_offset, rec->rm_blockcount)) + return -EFSCORRUPTED; + + /* Check that this is within the rtgroup. */ + if (!xfs_verify_rgbext(cur->bc_ino.rtg, rec->rm_startblock, + rec->rm_blockcount)) + return -EFSCORRUPTED; + + /* Make sure this isn't free space. */ + rtbno = xfs_rgbno_to_rtb(sc->mp, cur->bc_ino.rtg->rtg_rgno, + rec->rm_startblock); + return xrep_require_rtext_inuse(sc, rtbno, rec->rm_blockcount); +} + +/* Record realtime extents that belong to this inode's fork. */ +STATIC int +xrep_bmap_walk_rtrmap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xrep_bmap *rb = priv; + xfs_rtblock_t rtbno; + int error = 0; + + if (xchk_should_terminate(rb->sc, &error)) + return error; + + /* Skip extents which are not owned by this inode and fork. */ + if (rec->rm_owner != rb->sc->ip->i_ino) + return 0; + + error = xrep_bmap_check_rtfork_rmap(rb->sc, cur, rec); + if (error) + return error; + + /* + * Record all blocks allocated to this file even if the extent isn't + * for the fork we're rebuilding so that we can reset di_nblocks later. + */ + rb->nblocks += rec->rm_blockcount; + + /* If this rmap isn't for the fork we want, we're done. 
*/ + if (rb->whichfork == XFS_DATA_FORK && + (rec->rm_flags & XFS_RMAP_ATTR_FORK)) + return 0; + if (rb->whichfork == XFS_ATTR_FORK && + !(rec->rm_flags & XFS_RMAP_ATTR_FORK)) + return 0; + + rtbno = xfs_rgbno_to_rtb(cur->bc_mp, cur->bc_ino.rtg->rtg_rgno, + rec->rm_startblock); + return xrep_bmap_from_rmap(rb, rec->rm_offset, rtbno, + rec->rm_blockcount, + rec->rm_flags & XFS_RMAP_UNWRITTEN); +} + +/* Scan the realtime reverse mappings to build the new extent map. */ +STATIC int +xrep_bmap_scan_rtgroup( + struct xrep_bmap *rb, + struct xfs_rtgroup *rtg) +{ + struct xfs_scrub *sc = rb->sc; + int error; + + if (xrep_is_rtmeta_ino(sc, rtg, sc->ip->i_ino)) + return 0; + + error = xrep_rtgroup_init(sc, rtg, &sc->sr, + XFS_RTGLOCK_RMAP | XFS_RTGLOCK_BITMAP_SHARED); + if (error) + return error; + + error = xfs_rmap_query_all(sc->sr.rmap_cur, xrep_bmap_walk_rtrmap, rb); + xchk_rtgroup_btcur_free(&sc->sr); + xchk_rtgroup_free(sc, &sc->sr); + return error; +} +#else +static inline int +xrep_bmap_scan_rtgroup(struct xrep_bmap *rb, struct xfs_rtgroup *rtg) +{ + return -EFSCORRUPTED; +} +#endif + /* Find the delalloc extents from the old incore extent tree. */ STATIC int xrep_bmap_find_delalloc( @@ -362,9 +474,20 @@ xrep_bmap_find_mappings( { struct xfs_scrub *sc = rb->sc; struct xfs_perag *pag; + struct xfs_rtgroup *rtg; xfs_agnumber_t agno; + xfs_rgnumber_t rgno; int error = 0; + /* Iterate the rtrmaps for extents. */ + for_each_rtgroup(sc->mp, rgno, rtg) { + error = xrep_bmap_scan_rtgroup(rb, rtg); + if (error) { + xfs_rtgroup_put(rtg); + return error; + } + } + /* Iterate the rmaps for extents. */ for_each_perag(sc->mp, agno, pag) { error = xrep_bmap_scan_ag(rb, pag); @@ -705,10 +828,6 @@ xrep_bmap_check_inputs( return -EINVAL; } - /* Don't know how to rebuild realtime data forks. 
*/ - if (XFS_IS_REALTIME_INODE(sc->ip)) - return -EOPNOTSUPP; - return 0; } diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c index fa8e0064c41d..18763d136ef5 100644 --- a/fs/xfs/scrub/common.c +++ b/fs/xfs/scrub/common.c @@ -760,7 +760,7 @@ xchk_rt_unlock( #ifdef CONFIG_XFS_RT /* Lock all the rt group metadata inode ILOCKs and wait for intents. */ -static int +int xchk_rtgroup_lock( struct xfs_scrub *sc, struct xchk_rt *sr, diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h index 9ca2fbaac72c..e135f792cfcc 100644 --- a/fs/xfs/scrub/common.h +++ b/fs/xfs/scrub/common.h @@ -181,10 +181,13 @@ int xchk_rtgroup_init(struct xfs_scrub *sc, xfs_rgnumber_t rgno, void xchk_rtgroup_unlock(struct xfs_scrub *sc, struct xchk_rt *sr); void xchk_rtgroup_btcur_free(struct xchk_rt *sr); void xchk_rtgroup_free(struct xfs_scrub *sc, struct xchk_rt *sr); +int xchk_rtgroup_lock(struct xfs_scrub *sc, struct xchk_rt *sr, + unsigned int rtglock_flags); #else # define xchk_rtgroup_init(sc, rgno, sr, lockflags) (-ENOSYS) # define xchk_rtgroup_btcur_free(sr) ((void)0) # define xchk_rtgroup_free(sc, sr) ((void)0) +# define xchk_rtgroup_lock(sc, sr, lockflags) (-ENOSYS) #endif /* CONFIG_XFS_RT */ int xchk_ag_read_headers(struct xfs_scrub *sc, xfs_agnumber_t agno, diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index eb0dda2df7af..18ce73dcdf3b 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -35,6 +35,9 @@ #include "xfs_da_btree.h" #include "xfs_attr.h" #include "xfs_dir2.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -937,6 +940,73 @@ xrep_ag_init( return 0; } +#ifdef CONFIG_XFS_RT +/* Initialize all the btree cursors for a RT repair. 
*/ +static void +xrep_rtgroup_btcur_init( + struct xfs_scrub *sc, + struct xchk_rt *sr) +{ + struct xfs_mount *mp = sc->mp; + + ASSERT(sr->rtg != NULL); + + if (sc->sm->sm_type != XFS_SCRUB_TYPE_RTRMAPBT && + (sr->rtlock_flags & XFS_RTGLOCK_RMAP) && + xfs_has_rtrmapbt(mp)) + sr->rmap_cur = xfs_rtrmapbt_init_cursor(mp, sc->tp, sr->rtg, + sr->rtg->rtg_rmapip); +} + +/* + * Given a reference to a rtgroup structure, lock rtgroup btree inodes and + * create btree cursors. Must only be called to repair a regular rt file. + */ +int +xrep_rtgroup_init( + struct xfs_scrub *sc, + struct xfs_rtgroup *rtg, + struct xchk_rt *sr, + unsigned int rtglock_flags) +{ + ASSERT(sr->rtg == NULL); + + xfs_rtgroup_lock(NULL, rtg, rtglock_flags); + sr->rtlock_flags = rtglock_flags; + + /* Grab our own reference to the rtgroup structure. */ + sr->rtg = xfs_rtgroup_bump(rtg); + xrep_rtgroup_btcur_init(sc, sr); + return 0; +} + +/* Ensure that all rt blocks in the given range are not marked free. */ +int +xrep_require_rtext_inuse( + struct xfs_scrub *sc, + xfs_rtblock_t rtbno, + xfs_filblks_t len) +{ + struct xfs_mount *mp = sc->mp; + xfs_rtxnum_t startrtx; + xfs_rtxnum_t endrtx; + bool is_free = false; + int error; + + startrtx = xfs_rtb_to_rtxt(mp, rtbno); + endrtx = xfs_rtb_to_rtxt(mp, rtbno + len - 1); + + error = xfs_rtalloc_extent_is_free(mp, sc->tp, startrtx, + endrtx - startrtx + 1, &is_free); + if (error) + return error; + if (is_free) + return -EFSCORRUPTED; + + return 0; +} +#endif /* CONFIG_XFS_RT */ + /* Reinitialize the per-AG block reservation for the AG we just fixed. */ int xrep_reset_perag_resv( @@ -1261,3 +1331,26 @@ xrep_dotdot_lookup( return ino; } + +/* Are we looking at a realtime metadata inode? */ +bool +xrep_is_rtmeta_ino( + struct xfs_scrub *sc, + struct xfs_rtgroup *rtg, + xfs_ino_t ino) +{ + /* + * All filesystems have rt bitmap and summary inodes, even if they + * don't have an rt section. 
+ */ + if (ino == sc->mp->m_rbmip->i_ino) + return true; + if (ino == sc->mp->m_rsumip->i_ino) + return true; + + /* Newer rt metadata files are not guaranteed to exist */ + if (rtg->rtg_rmapip && ino == rtg->rtg_rmapip->i_ino) + return true; + + return false; +} diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index 292e252efae3..c75081185c24 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -100,6 +100,17 @@ int xrep_setup_rtbitmap(struct xfs_scrub *sc, unsigned int *resblks); void xrep_ag_btcur_init(struct xfs_scrub *sc, struct xchk_ag *sa); int xrep_ag_init(struct xfs_scrub *sc, struct xfs_perag *pag, struct xchk_ag *sa); +#ifdef CONFIG_XFS_RT +int xrep_rtgroup_init(struct xfs_scrub *sc, struct xfs_rtgroup *rtg, + struct xchk_rt *sr, unsigned int rtglock_flags); +int xrep_require_rtext_inuse(struct xfs_scrub *sc, xfs_rtblock_t rtbno, + xfs_filblks_t len); +#else +# define xrep_rtgroup_init(sc, rtg, sr, lockflags) (-ENOSYS) +#endif /* CONFIG_XFS_RT */ + +bool xrep_is_rtmeta_ino(struct xfs_scrub *sc, struct xfs_rtgroup *rtg, + xfs_ino_t ino); /* Metadata revalidators */ From patchwork Fri Dec 30 22:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085498 Subject: [PATCH 33/38] xfs: repair inodes that have realtime extents From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:20 -0800 Message-ID: <167243870074.715303.5398086761063722797.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Plumb into the inode core repair code the ability to search for extents on realtime devices. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/inode_repair.c | 68 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 67 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index a8d19d1e76e3..8566282827f8 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -37,6 +37,8 @@ #include "xfs_log_priv.h" #include "xfs_symlink_remote.h" #include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -610,18 +612,77 @@ xrep_dinode_count_ag_rmaps( return error; } +/* Count extents and blocks for an inode given an rt rmap. */ +STATIC int +xrep_dinode_walk_rtrmap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xrep_inode *ri = priv; + int error = 0; + + if (xchk_should_terminate(ri->sc, &error)) + return error; + + /* We only care about this inode. */ + if (rec->rm_owner != ri->sc->sm->sm_ino) + return 0; + + if (rec->rm_flags & (XFS_RMAP_ATTR_FORK | XFS_RMAP_BMBT_BLOCK)) + return -EFSCORRUPTED; + + ri->rt_blocks += rec->rm_blockcount; + ri->rt_extents++; + return 0; +} + +/* Count extents and blocks for an inode from all realtime rmap data. 
*/ +STATIC int +xrep_dinode_count_rtgroup_rmaps( + struct xrep_inode *ri, + struct xfs_rtgroup *rtg) +{ + struct xfs_scrub *sc = ri->sc; + int error; + + if (!xfs_has_realtime(sc->mp) || + xrep_is_rtmeta_ino(sc, rtg, sc->sm->sm_ino)) + return 0; + + error = xrep_rtgroup_init(sc, rtg, &sc->sr, XFS_RTGLOCK_RMAP); + if (error) + return error; + + error = xfs_rmap_query_all(sc->sr.rmap_cur, xrep_dinode_walk_rtrmap, + ri); + xchk_rtgroup_btcur_free(&sc->sr); + xchk_rtgroup_free(sc, &sc->sr); + return error; +} + /* Count extents and blocks for a given inode from all rmap data. */ STATIC int xrep_dinode_count_rmaps( struct xrep_inode *ri) { struct xfs_perag *pag; + struct xfs_rtgroup *rtg; xfs_agnumber_t agno; + xfs_rgnumber_t rgno; int error; - if (!xfs_has_rmapbt(ri->sc->mp) || xfs_has_realtime(ri->sc->mp)) + if (!xfs_has_rmapbt(ri->sc->mp)) return -EOPNOTSUPP; + for_each_rtgroup(ri->sc->mp, rgno, rtg) { + error = xrep_dinode_count_rtgroup_rmaps(ri, rtg); + if (error) { + xfs_rtgroup_put(rtg); + return error; + } + } + for_each_perag(ri->sc->mp, agno, pag) { error = xrep_dinode_count_ag_rmaps(ri, pag); if (error) { @@ -917,6 +978,7 @@ xrep_dinode_ensure_forkoff( uint16_t mode) { struct xfs_bmdr_block *bmdr; + struct xfs_rtrmap_root *rmdr; struct xfs_scrub *sc = ri->sc; xfs_extnum_t attr_extents, data_extents; size_t bmdr_minsz = xfs_bmdr_space_calc(1); @@ -1023,6 +1085,10 @@ xrep_dinode_ensure_forkoff( bmdr = XFS_DFORK_PTR(dip, XFS_DATA_FORK); dfork_min = xfs_bmap_broot_space(sc->mp, bmdr); break; + case XFS_DINODE_FMT_RMAP: + rmdr = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + dfork_min = xfs_rtrmap_broot_space(sc->mp, rmdr); + break; default: dfork_min = 0; break; From patchwork Fri Dec 30 22:18:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085499 Subject: [PATCH 34/38] xfs: online repair of realtime bitmaps for a realtime group From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:20 -0800 Message-ID: <167243870088.715303.7468456853127661772.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong For a given rt group, regenerate the bitmap contents from the group's realtime rmap btree. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/repair.c | 2 fs/xfs/scrub/repair.h | 10 + fs/xfs/scrub/rtbitmap.c | 21 + fs/xfs/scrub/rtbitmap_repair.c | 692 +++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/rtsummary_repair.c | 3 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/tempfile.c | 15 + fs/xfs/scrub/tempswap.h | 2 fs/xfs/scrub/trace.c | 1 fs/xfs/scrub/trace.h | 149 ++++++++ 10 files changed, 885 insertions(+), 12 deletions(-) diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 18ce73dcdf3b..995b60f2d41e 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -942,7 +942,7 @@ xrep_ag_init( #ifdef CONFIG_XFS_RT /* Initialize all the btree cursors for a RT repair. 
*/ -static void +void xrep_rtgroup_btcur_init( struct xfs_scrub *sc, struct xchk_rt *sr) diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index c75081185c24..a0ed79506195 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -87,6 +87,7 @@ int xrep_setup_directory(struct xfs_scrub *sc); int xrep_setup_parent(struct xfs_scrub *sc); int xrep_setup_nlinks(struct xfs_scrub *sc); int xrep_setup_symlink(struct xfs_scrub *sc, unsigned int *resblks); +int xrep_setup_rgbitmap(struct xfs_scrub *sc, unsigned int *resblks); int xrep_xattr_reset_fork(struct xfs_scrub *sc); @@ -103,6 +104,7 @@ int xrep_ag_init(struct xfs_scrub *sc, struct xfs_perag *pag, #ifdef CONFIG_XFS_RT int xrep_rtgroup_init(struct xfs_scrub *sc, struct xfs_rtgroup *rtg, struct xchk_rt *sr, unsigned int rtglock_flags); +void xrep_rtgroup_btcur_init(struct xfs_scrub *sc, struct xchk_rt *sr); int xrep_require_rtext_inuse(struct xfs_scrub *sc, xfs_rtblock_t rtbno, xfs_filblks_t len); #else @@ -143,10 +145,12 @@ int xrep_symlink(struct xfs_scrub *sc); int xrep_rtbitmap(struct xfs_scrub *sc); int xrep_rtsummary(struct xfs_scrub *sc); int xrep_rgsuperblock(struct xfs_scrub *sc); +int xrep_rgbitmap(struct xfs_scrub *sc); #else # define xrep_rtbitmap xrep_notsupported # define xrep_rtsummary xrep_notsupported # define xrep_rgsuperblock xrep_notsupported +# define xrep_rgbitmap xrep_notsupported #endif /* CONFIG_XFS_RT */ #ifdef CONFIG_XFS_QUOTA @@ -235,6 +239,11 @@ static inline int xrep_setup_symlink(struct xfs_scrub *sc, unsigned int *x) return 0; } +static inline int xrep_setup_rgbitmap(struct xfs_scrub *sc, unsigned int *x) +{ + return 0; +} + #define xrep_revalidate_allocbt (NULL) #define xrep_revalidate_iallocbt (NULL) @@ -262,6 +271,7 @@ static inline int xrep_setup_symlink(struct xfs_scrub *sc, unsigned int *x) #define xrep_parent xrep_notsupported #define xrep_symlink xrep_notsupported #define xrep_rgsuperblock xrep_notsupported +#define xrep_rgbitmap xrep_notsupported #endif /* 
CONFIG_XFS_ONLINE_REPAIR */ diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c index eb150c40d33c..ca478fbd514e 100644 --- a/fs/xfs/scrub/rtbitmap.c +++ b/fs/xfs/scrub/rtbitmap.c @@ -22,18 +22,34 @@ #include "scrub/common.h" #include "scrub/repair.h" #include "scrub/btree.h" +#include "scrub/repair.h" /* Set us up with the realtime group metadata locked. */ int xchk_setup_rgbitmap( struct xfs_scrub *sc) { + unsigned int resblks = 0; + unsigned int rtglock_flags = XCHK_RTGLOCK_ALL; int error; if (xchk_need_fshook_drain(sc)) xchk_fshooks_enable(sc, XCHK_FSHOOKS_DRAIN); - error = xchk_trans_alloc(sc, 0); + if (xchk_could_repair(sc)) { + error = xrep_setup_rgbitmap(sc, &resblks); + if (error) + return error; + + /* + * We must hold rbmip with ILOCK_EXCL to use the extent swap + * at the end of the repair function. + */ + rtglock_flags &= ~XFS_RTGLOCK_BITMAP_SHARED; + rtglock_flags |= XFS_RTGLOCK_BITMAP; + } + + error = xchk_trans_alloc(sc, resblks); if (error) return error; @@ -45,8 +61,7 @@ xchk_setup_rgbitmap( if (error) return error; - return xchk_rtgroup_init(sc, sc->sm->sm_agno, &sc->sr, - XCHK_RTGLOCK_ALL); + return xchk_rtgroup_init(sc, sc->sm->sm_agno, &sc->sr, rtglock_flags); } /* Set us up with the realtime metadata locked. 
*/ diff --git a/fs/xfs/scrub/rtbitmap_repair.c b/fs/xfs/scrub/rtbitmap_repair.c index c88c49b03e86..0fa8942d14e7 100644 --- a/fs/xfs/scrub/rtbitmap_repair.c +++ b/fs/xfs/scrub/rtbitmap_repair.c @@ -12,15 +12,707 @@ #include "xfs_btree.h" #include "xfs_log_format.h" #include "xfs_trans.h" +#include "xfs_rtalloc.h" #include "xfs_inode.h" #include "xfs_bit.h" #include "xfs_bmap.h" #include "xfs_bmap_btree.h" +#include "xfs_rmap.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_swapext.h" +#include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" #include "scrub/repair.h" #include "scrub/xfile.h" +#include "scrub/tempfile.h" +#include "scrub/tempswap.h" +#include "scrub/reap.h" + +/* + * We use an xfile to construct new bitmap blocks for the portion of the + * rtbitmap file that we're replacing. Whereas the ondisk bitmap must be + * accessed through the buffer cache, the xfile bitmap supports direct + * word-level accesses. Therefore, we create a small abstraction for linear + * access. + */ +typedef unsigned long long xrep_wordoff_t; +typedef unsigned int xrep_wordcnt_t; + +struct xrep_rgbmp { + struct xfs_scrub *sc; + + /* file offset inside the rtbitmap where we start swapping */ + xfs_fileoff_t group_rbmoff; + + /* number of rtbitmap blocks for this group */ + xfs_filblks_t group_rbmlen; + + /* The next rtgroup block we expect to see during our rtrmapbt walk. */ + xfs_rgblock_t next_rgbno; + + /* rtword position of xfile as we write buffers to disk. */ + xrep_wordoff_t prep_wordoff; +}; + +/* Mask to round an rtx down to the nearest bitmap word. */ +#define XREP_RTBMP_WORDMASK ((1ULL << XFS_NBWORDLOG) - 1) + +/* Set up to repair the realtime bitmap for this group. 
*/ +int +xrep_setup_rgbitmap( + struct xfs_scrub *sc, + unsigned int *resblks) +{ + struct xfs_mount *mp = sc->mp; + unsigned long long blocks = 0; + unsigned long long rtbmp_words; + size_t bufsize = mp->m_sb.sb_blocksize; + int error; + + error = xrep_tempfile_create(sc, S_IFREG); + if (error) + return error; + + /* Create an xfile to hold our reconstructed bitmap. */ + rtbmp_words = xfs_rtbitmap_wordcount(mp, mp->m_sb.sb_rextents); + error = xfile_create(sc->mp, "rtbitmap", rtbmp_words << XFS_WORDLOG, + &sc->xfile); + if (error) + return error; + + bufsize = max(bufsize, sizeof(struct xrep_tempswap)); + + /* + * Allocate a memory buffer for faster creation of new bitmap + * blocks. + */ + sc->buf = kvmalloc(bufsize, XCHK_GFP_FLAGS); + if (!sc->buf) + return -ENOMEM; + + /* + * Reserve enough blocks to write out a completely new bitmap file, + * plus twice as many blocks as we would need if we can only allocate + * one block per data fork mapping. This should cover the + * preallocation of the temporary file and swapping the extent + * mappings. + * + * We cannot use xfs_swapext_estimate because we have not yet + * constructed the replacement bitmap and therefore do not know how + * many extents it will use. By the time we do, we will have a dirty + * transaction (which we cannot drop because we cannot drop the + * rtbitmap ILOCK) and cannot ask for more reservation. + */ + blocks = mp->m_sb.sb_rbmblocks; + blocks += xfs_bmbt_calc_size(mp, blocks) * 2; + if (blocks > UINT_MAX) + return -EOPNOTSUPP; + + *resblks += blocks; + + /* + * Grab support for atomic extent swapping before we allocate any + * transactions or grab ILOCKs. 
+ */ + return xrep_tempswap_grab_log_assist(sc); +} + +static inline xrep_wordoff_t +rtx_to_wordoff( + struct xfs_mount *mp, + xfs_rtxnum_t rtx) +{ + return rtx >> XFS_NBWORDLOG; +} + +static inline xrep_wordcnt_t +rtxlen_to_wordcnt( + xfs_rtxlen_t rtxlen) +{ + return rtxlen >> XFS_NBWORDLOG; +} + +/* Helper functions to record rtwords in an xfile. */ + +static inline int +xfbmp_load( + struct xrep_rgbmp *rb, + xrep_wordoff_t wordoff, + xfs_rtword_t *word) +{ + union xfs_rtword_ondisk urk; + int error; + + error = xfile_obj_load(rb->sc->xfile, &urk, + sizeof(union xfs_rtword_ondisk), + wordoff << XFS_WORDLOG); + if (error) + return error; + + *word = xfs_rtbitmap_getword(rb->sc->mp, &urk); + return 0; +} + +static inline int +xfbmp_store( + struct xrep_rgbmp *rb, + xrep_wordoff_t wordoff, + const xfs_rtword_t word) +{ + union xfs_rtword_ondisk urk; + + xfs_rtbitmap_setword(rb->sc->mp, &urk, word); + return xfile_obj_store(rb->sc->xfile, &urk, + sizeof(union xfs_rtword_ondisk), + wordoff << XFS_WORDLOG); +} + +static inline int +xfbmp_copyin( + struct xrep_rgbmp *rb, + xrep_wordoff_t wordoff, + const union xfs_rtword_ondisk *word, + xrep_wordcnt_t nr_words) +{ + return xfile_obj_store(rb->sc->xfile, word, nr_words << XFS_WORDLOG, + wordoff << XFS_WORDLOG); +} + +static inline int +xfbmp_copyout( + struct xrep_rgbmp *rb, + xrep_wordoff_t wordoff, + union xfs_rtword_ondisk *word, + xrep_wordcnt_t nr_words) +{ + return xfile_obj_load(rb->sc->xfile, word, nr_words << XFS_WORDLOG, + wordoff << XFS_WORDLOG); +} + +/* + * Preserve the portions of the rtbitmap block for the start of this rtgroup + * that map to the previous rtgroup. 
+ */ +STATIC int +xrep_rgbitmap_load_before( + struct xrep_rgbmp *rb) +{ + struct xfs_scrub *sc = rb->sc; + struct xfs_mount *mp = sc->mp; + struct xfs_rtgroup *rtg = sc->sr.rtg; + struct xfs_buf *bp; + xrep_wordoff_t wordoff; + xfs_rtblock_t group_rtbno; + xfs_rtxnum_t group_rtx, rbmoff_rtx; + xfs_rtword_t ondisk_word; + xfs_rtword_t xfile_word; + xfs_rtword_t mask; + xrep_wordcnt_t wordcnt; + int bit; + int error; + + /* + * Compute the file offset within the rtbitmap block that corresponds + * to the start of this group, and decide if we need to read blocks + * from the group before this one. + */ + group_rtbno = xfs_rgbno_to_rtb(mp, rtg->rtg_rgno, 0); + group_rtx = xfs_rtb_to_rtxt(mp, group_rtbno); + + rb->group_rbmoff = xfs_rtx_to_rbmblock(mp, group_rtx); + rbmoff_rtx = xfs_rbmblock_to_rtx(mp, rb->group_rbmoff); + rb->prep_wordoff = rtx_to_wordoff(mp, rbmoff_rtx); + + trace_xrep_rgbitmap_load(rtg, rb->group_rbmoff, rbmoff_rtx, + group_rtx - 1); + + if (rbmoff_rtx == group_rtx) + return 0; + + error = xfs_rtbuf_get(mp, sc->tp, rb->group_rbmoff, 0, &bp); + if (error) { + /* + * Reading the existing rbmblock failed, and we must deal with + * the part of the rtbitmap block that corresponds to the + * previous group. The most conservative option is to fill + * that part of the bitmap with zeroes so that it won't get + * allocated. The xfile contains zeroes already, so we can + * return. + */ + return 0; + } + + /* + * Copy full rtbitmap words into memory from the beginning of the + * ondisk block until we get to the word that corresponds to the start + * of this group. 
+ */ + wordoff = rtx_to_wordoff(mp, rbmoff_rtx); + wordcnt = rtxlen_to_wordcnt(group_rtx - rbmoff_rtx); + if (wordcnt > 0) { + union xfs_rtword_ondisk *p; + + p = xfs_rbmblock_wordptr(bp, 0); + error = xfbmp_copyin(rb, wordoff, p, wordcnt); + if (error) + goto out_rele; + + trace_xrep_rgbitmap_load_words(mp, rb->group_rbmoff, wordoff, + wordcnt); + wordoff += wordcnt; + } + + /* + * Compute the bit position of the first rtextent of this group. If + * the bit position is zero, we don't have to RMW a partial word and + * move to the next step. + */ + bit = group_rtx & XREP_RTBMP_WORDMASK; + if (bit == 0) + goto out_rele; + + /* + * Create a mask of the bits that we want to load from disk. These + * bits track space in a different rtgroup, which is why we must + * preserve them even as we replace parts of the bitmap. + */ + mask = ~((((xfs_rtword_t)1 << (XFS_NBWORD - bit)) - 1) << bit); + + error = xfbmp_load(rb, wordoff, &xfile_word); + if (error) + goto out_rele; + ondisk_word = xfs_rtbitmap_getword(mp, + xfs_rbmblock_wordptr(bp, wordcnt)); + + trace_xrep_rgbitmap_load_word(mp, wordoff, bit, ondisk_word, + xfile_word, mask); + + xfile_word &= ~mask; + xfile_word |= (ondisk_word & mask); + + error = xfbmp_store(rb, wordoff, xfile_word); + if (error) + goto out_rele; + +out_rele: + xfs_trans_brelse(sc->tp, bp); + return error; +} + +/* + * Preserve the portions of the rtbitmap block for the end of this rtgroup + * that map to the next rtgroup. 
+ */ +STATIC int +xrep_rgbitmap_load_after( + struct xrep_rgbmp *rb) +{ + struct xfs_scrub *sc = rb->sc; + struct xfs_mount *mp = rb->sc->mp; + struct xfs_rtgroup *rtg = rb->sc->sr.rtg; + struct xfs_buf *bp; + xrep_wordoff_t wordoff; + xfs_rtblock_t last_rtbno; + xfs_rtxnum_t last_group_rtx, last_rbmblock_rtx; + xfs_fileoff_t last_group_rbmoff; + xfs_rtword_t ondisk_word; + xfs_rtword_t xfile_word; + xfs_rtword_t mask; + xrep_wordcnt_t wordcnt; + unsigned int last_group_word; + int bit; + int error; + + last_rtbno = xfs_rgbno_to_rtb(mp, rtg->rtg_rgno, + rtg->rtg_blockcount - 1); + last_group_rtx = xfs_rtb_to_rtxt(mp, last_rtbno); + + last_group_rbmoff = xfs_rtx_to_rbmblock(mp, last_group_rtx); + rb->group_rbmlen = last_group_rbmoff - rb->group_rbmoff + 1; + last_rbmblock_rtx = xfs_rbmblock_to_rtx(mp, last_group_rbmoff + 1) - 1; + + trace_xrep_rgbitmap_load(rtg, last_group_rbmoff, last_group_rtx + 1, + last_rbmblock_rtx); + + if (last_rbmblock_rtx == last_group_rtx || + rtg->rtg_rgno == mp->m_sb.sb_rgcount - 1) + return 0; + + error = xfs_rtbuf_get(mp, sc->tp, last_group_rbmoff, 0, &bp); + if (error) { + /* + * Reading the existing rbmblock failed, and we must deal with + * the part of the rtbitmap block that corresponds to the + * next group. The most conservative option is to fill + * that part of the bitmap with zeroes so that it won't get + * allocated. The xfile contains zeroes already, so we can + * return. + */ + return 0; + } + + /* + * Compute the bit position of the first rtextent of the next group. + * If the bit position is zero, we don't have to RMW a partial word + * and move to the next step. + */ + wordoff = rtx_to_wordoff(mp, last_group_rtx); + bit = (last_group_rtx + 1) & XREP_RTBMP_WORDMASK; + if (bit == 0) + goto copy_words; + + /* + * Create a mask of the bits that we want to load from disk. These + * bits track space in a different rtgroup, which is why we must + * preserve them even as we replace parts of the bitmap. 
+ */ + mask = (((xfs_rtword_t)1 << (XFS_NBWORD - bit)) - 1) << bit; + + error = xfbmp_load(rb, wordoff, &xfile_word); + if (error) + goto out_rele; + last_group_word = xfs_rtx_to_rbmword(mp, last_group_rtx); + ondisk_word = xfs_rtbitmap_getword(mp, + xfs_rbmblock_wordptr(bp, last_group_word)); + + trace_xrep_rgbitmap_load_word(mp, wordoff, bit, ondisk_word, + xfile_word, mask); + + xfile_word &= ~mask; + xfile_word |= (ondisk_word & mask); + + error = xfbmp_store(rb, wordoff, xfile_word); + if (error) + goto out_rele; + +copy_words: + /* Copy as many full words as we can. */ + wordoff++; + wordcnt = rtxlen_to_wordcnt(last_rbmblock_rtx - last_group_rtx); + if (wordcnt > 0) { + union xfs_rtword_ondisk *p; + + p = xfs_rbmblock_wordptr(bp, mp->m_blockwsize - wordcnt); + error = xfbmp_copyin(rb, wordoff, p, wordcnt); + if (error) + goto out_rele; + + trace_xrep_rgbitmap_load_words(mp, last_group_rbmoff, wordoff, + wordcnt); + } + +out_rele: + xfs_trans_brelse(sc->tp, bp); + return error; +} + +/* Perform a logical OR operation on an rtword in the incore bitmap. */ +static int +xrep_rgbitmap_or( + struct xrep_rgbmp *rb, + xrep_wordoff_t wordoff, + xfs_rtword_t mask) +{ + xfs_rtword_t word; + int error; + + error = xfbmp_load(rb, wordoff, &word); + if (error) + return error; + + trace_xrep_rgbitmap_or(rb->sc->mp, wordoff, mask, word); + + return xfbmp_store(rb, wordoff, word | mask); +} + +/* + * Mark as free every rt extent between the next rt block we expected to see + * in the rtrmap records and the given rt block. 
+ */ +STATIC int +xrep_rgbitmap_mark_free( + struct xrep_rgbmp *rb, + xfs_rgblock_t rgbno) +{ + struct xfs_mount *mp = rb->sc->mp; + struct xfs_rtgroup *rtg = rb->sc->sr.rtg; + xfs_rtblock_t rtbno; + xfs_rtxnum_t startrtx; + xfs_rtxnum_t nextrtx; + xrep_wordoff_t wordoff, nextwordoff; + unsigned int bit; + unsigned int bufwsize; + xfs_extlen_t mod; + xfs_rtword_t mask; + int error; + + if (!xfs_verify_rgbext(rtg, rb->next_rgbno, rgbno - rb->next_rgbno)) + return -EFSCORRUPTED; + + /* + * Convert rt blocks to rt extents. The block range we find must be + * aligned to an rtextent boundary on both ends. + */ + rtbno = xfs_rgbno_to_rtb(mp, rtg->rtg_rgno, rb->next_rgbno); + startrtx = xfs_rtb_to_rtx(mp, rtbno, &mod); + if (mod) + return -EFSCORRUPTED; + + rtbno = xfs_rgbno_to_rtb(mp, rtg->rtg_rgno, rgbno - 1); + nextrtx = xfs_rtb_to_rtx(mp, rtbno, &mod) + 1; + if (mod != mp->m_sb.sb_rextsize - 1) + return -EFSCORRUPTED; + + trace_xrep_rgbitmap_record_free(mp, startrtx, nextrtx - 1); + + /* Set bits as needed to round startrtx up to the nearest word. */ + bit = startrtx & XREP_RTBMP_WORDMASK; + if (bit) { + xfs_rtblock_t len = nextrtx - startrtx; + unsigned int lastbit; + + lastbit = XFS_RTMIN(bit + len, XFS_NBWORD); + mask = (((xfs_rtword_t)1 << (lastbit - bit)) - 1) << bit; + + error = xrep_rgbitmap_or(rb, rtx_to_wordoff(mp, startrtx), mask); + if (error || lastbit - bit == len) + return error; + startrtx += XFS_NBWORD - bit; + } + + /* Set bits as needed to round nextrtx down to the nearest word. */ + bit = nextrtx & XREP_RTBMP_WORDMASK; + if (bit) { + mask = ((xfs_rtword_t)1 << bit) - 1; + + error = xrep_rgbitmap_or(rb, rtx_to_wordoff(mp, nextrtx), mask); + if (error || startrtx + bit == nextrtx) + return error; + nextrtx -= bit; + } + + trace_xrep_rgbitmap_record_free_bulk(mp, startrtx, nextrtx - 1); + + /* Set all the words in between, up to a whole fs block at once. 
*/ + wordoff = rtx_to_wordoff(mp, startrtx); + nextwordoff = rtx_to_wordoff(mp, nextrtx); + bufwsize = mp->m_sb.sb_blocksize >> XFS_WORDLOG; + + while (wordoff < nextwordoff) { + xrep_wordoff_t rem; + xrep_wordcnt_t wordcnt; + + wordcnt = min_t(xrep_wordcnt_t, nextwordoff - wordoff, + bufwsize); + + /* + * Try to keep us aligned to sc->buf to reduce the number of + * xfile writes. + */ + rem = wordoff & (bufwsize - 1); + if (rem) + wordcnt = min_t(xrep_wordcnt_t, wordcnt, + bufwsize - rem); + + error = xfbmp_copyin(rb, wordoff, rb->sc->buf, wordcnt); + if (error) + return error; + + wordoff += wordcnt; + } + + return 0; +} + +/* Set free space in the rtbitmap based on rtrmapbt records. */ +STATIC int +xrep_rgbitmap_walk_rtrmap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xrep_rgbmp *rb = priv; + int error = 0; + + if (xchk_should_terminate(rb->sc, &error)) + return error; + + if (rb->next_rgbno < rec->rm_startblock) { + error = xrep_rgbitmap_mark_free(rb, rec->rm_startblock); + if (error) + return error; + } + + rb->next_rgbno = max(rb->next_rgbno, + rec->rm_startblock + rec->rm_blockcount); + return 0; +} + +/* + * Walk the rtrmapbt to find all the gaps between records, and mark the gaps + * in the realtime bitmap that we're computing. + */ +STATIC int +xrep_rgbitmap_find_freespace( + struct xrep_rgbmp *rb) +{ + struct xfs_scrub *sc = rb->sc; + struct xfs_mount *mp = sc->mp; + struct xfs_rtgroup *rtg = sc->sr.rtg; + int error; + + /* Prepare a buffer of ones so that we can accelerate bulk setting. */ + memset(sc->buf, 0xFF, mp->m_sb.sb_blocksize); + + xrep_rtgroup_btcur_init(sc, &sc->sr); + error = xfs_rmap_query_all(sc->sr.rmap_cur, xrep_rgbitmap_walk_rtrmap, + rb); + if (error) + goto out; + + /* + * Mark as free every possible rt extent from the last one we saw to + * the end of the rt group. 
+ */ + if (rb->next_rgbno < rtg->rtg_blockcount) { + error = xrep_rgbitmap_mark_free(rb, rtg->rtg_blockcount); + if (error) + goto out; + } + +out: + xchk_rtgroup_btcur_free(&sc->sr); + return error; +} + +static int +xrep_rgbitmap_prep_buf( + struct xfs_scrub *sc, + struct xfs_buf *bp, + void *data) +{ + struct xrep_rgbmp *rb = data; + struct xfs_mount *mp = sc->mp; + int error; + + error = xfbmp_copyout(rb, rb->prep_wordoff, + xfs_rbmblock_wordptr(bp, 0), mp->m_blockwsize); + if (error) + return error; + + if (xfs_has_rtgroups(sc->mp)) { + struct xfs_rtbuf_blkinfo *hdr = bp->b_addr; + + hdr->rt_magic = cpu_to_be32(XFS_RTBITMAP_MAGIC); + hdr->rt_owner = cpu_to_be64(sc->ip->i_ino); + hdr->rt_blkno = cpu_to_be64(xfs_buf_daddr(bp)); + hdr->rt_lsn = 0; + uuid_copy(&hdr->rt_uuid, &sc->mp->m_sb.sb_meta_uuid); + bp->b_ops = &xfs_rtbitmap_buf_ops; + } else { + bp->b_ops = &xfs_rtbuf_ops; + } + + rb->prep_wordoff += mp->m_blockwsize; + xfs_trans_buf_set_type(sc->tp, bp, XFS_BLFT_RTBITMAP_BUF); + return 0; +} + +/* Repair the realtime bitmap for this rt group. */ +int +xrep_rgbitmap( + struct xfs_scrub *sc) +{ + struct xrep_rgbmp rb = { + .sc = sc, + .next_rgbno = 0, + }; + struct xrep_tempswap *ti = NULL; + int error; + + /* + * We require the realtime rmapbt (and atomic file updates) to rebuild + * anything. + */ + if (!xfs_has_rtrmapbt(sc->mp)) + return -EOPNOTSUPP; + + /* + * If the start or end of this rt group happens to be in the middle of + * an rtbitmap block, try to read in the parts of the bitmap that are + * from some other group. + */ + error = xrep_rgbitmap_load_before(&rb); + if (error) + return error; + error = xrep_rgbitmap_load_after(&rb); + if (error) + return error; + + /* + * Generate the new rtbitmap data. We don't need the rtbmp information + * once this call is finished. + */ + error = xrep_rgbitmap_find_freespace(&rb); + if (error) + return error; + + /* + * Try to take ILOCK_EXCL of the temporary file. 
We had better be the + * only ones holding onto this inode, but we can't block while holding + * the rtbitmap file's ILOCK_EXCL. + */ + while (!xrep_tempfile_ilock_nowait(sc)) { + if (xchk_should_terminate(sc, &error)) + return error; + delay(1); + } + + /* + * Make sure we have space allocated for the part of the bitmap + * file that corresponds to this group. + */ + xfs_trans_ijoin(sc->tp, sc->ip, 0); + xfs_trans_ijoin(sc->tp, sc->tempip, 0); + error = xrep_tempfile_prealloc(sc, rb.group_rbmoff, rb.group_rbmlen); + if (error) + return error; + + /* Last chance to abort before we start committing fixes. */ + if (xchk_should_terminate(sc, &error)) + return error; + + /* Copy the bitmap file that we generated. */ + error = xrep_tempfile_copyin(sc, rb.group_rbmoff, rb.group_rbmlen, + xrep_rgbitmap_prep_buf, &rb); + if (error) + return error; + error = xrep_tempfile_set_isize(sc, + XFS_FSB_TO_B(sc->mp, sc->mp->m_sb.sb_rbmblocks)); + if (error) + return error; + + /* + * Now swap the extents. We're done with the temporary buffer, so + * we can reuse it for the tempfile swapext information. + */ + ti = sc->buf; + error = xrep_tempswap_trans_reserve(sc, XFS_DATA_FORK, rb.group_rbmoff, + rb.group_rbmlen, ti); + if (error) + return error; + + error = xrep_tempswap_contents(sc, ti); + if (error) + return error; + ti = NULL; + + /* Free the old bitmap blocks if they are not in use. */ + return xrep_reap_ifork(sc, sc->tempip, XFS_DATA_FORK); +} /* Set up to repair the realtime bitmap file metadata. */ int diff --git a/fs/xfs/scrub/rtsummary_repair.c b/fs/xfs/scrub/rtsummary_repair.c index 0836c1e10504..cf160fbdc370 100644 --- a/fs/xfs/scrub/rtsummary_repair.c +++ b/fs/xfs/scrub/rtsummary_repair.c @@ -167,7 +167,8 @@ xrep_rtsummary( * so we can reuse it for the tempfile swapext information. 
*/ ti = sc->buf; - error = xrep_tempswap_trans_reserve(sc, XFS_DATA_FORK, ti); + error = xrep_tempswap_trans_reserve(sc, XFS_DATA_FORK, 0, rsumblocks, + ti); if (error) return error; diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index c9b4899c8b6a..7abd25b37c97 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -423,7 +423,7 @@ static const struct xchk_meta_ops meta_scrub_ops[] = { .setup = xchk_setup_rgbitmap, .scrub = xchk_rgbitmap, .has = xfs_has_rtgroups, - .repair = xrep_notsupported, + .repair = xrep_rgbitmap, }, [XFS_SCRUB_TYPE_RTRMAPBT] = { /* realtime group rmapbt */ .type = ST_RTGROUP, diff --git a/fs/xfs/scrub/tempfile.c b/fs/xfs/scrub/tempfile.c index 9ae556fa4b7a..a8ee84379af4 100644 --- a/fs/xfs/scrub/tempfile.c +++ b/fs/xfs/scrub/tempfile.c @@ -475,6 +475,8 @@ STATIC int xrep_tempswap_prep_request( struct xfs_scrub *sc, int whichfork, + xfs_fileoff_t off, + xfs_filblks_t len, struct xrep_tempswap *tx) { struct xfs_swapext_req *req = &tx->req; @@ -497,10 +499,10 @@ xrep_tempswap_prep_request( /* Swap all mappings in both forks. */ req->ip1 = sc->tempip; req->ip2 = sc->ip; - req->startoff1 = 0; - req->startoff2 = 0; + req->startoff1 = off; + req->startoff2 = off; req->whichfork = whichfork; - req->blockcount = XFS_MAX_FILEOFF; + req->blockcount = len; req->req_flags = XFS_SWAP_REQ_LOGGED; /* Always swap sizes when we're swapping data fork mappings. 
*/ @@ -653,6 +655,8 @@ int xrep_tempswap_trans_reserve( struct xfs_scrub *sc, int whichfork, + xfs_fileoff_t off, + xfs_filblks_t len, struct xrep_tempswap *tx) { int error; @@ -661,7 +665,7 @@ xrep_tempswap_trans_reserve( ASSERT(xfs_isilocked(sc->ip, XFS_ILOCK_EXCL)); ASSERT(xfs_isilocked(sc->tempip, XFS_ILOCK_EXCL)); - error = xrep_tempswap_prep_request(sc, whichfork, tx); + error = xrep_tempswap_prep_request(sc, whichfork, off, len, tx); if (error) return error; @@ -692,7 +696,8 @@ xrep_tempswap_trans_alloc( ASSERT(sc->tp == NULL); - error = xrep_tempswap_prep_request(sc, whichfork, tx); + error = xrep_tempswap_prep_request(sc, whichfork, 0, XFS_MAX_FILEOFF, + tx); if (error) return error; diff --git a/fs/xfs/scrub/tempswap.h b/fs/xfs/scrub/tempswap.h index bef8d2d2134d..a7cd96aa2fc7 100644 --- a/fs/xfs/scrub/tempswap.h +++ b/fs/xfs/scrub/tempswap.h @@ -13,7 +13,7 @@ struct xrep_tempswap { int xrep_tempswap_grab_log_assist(struct xfs_scrub *sc); int xrep_tempswap_trans_reserve(struct xfs_scrub *sc, int whichfork, - struct xrep_tempswap *ti); + xfs_fileoff_t off, xfs_filblks_t len, struct xrep_tempswap *ti); int xrep_tempswap_trans_alloc(struct xfs_scrub *sc, int whichfork, struct xrep_tempswap *ti); diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c index bb13f0a8e4cf..1bb868a54c06 100644 --- a/fs/xfs/scrub/trace.c +++ b/fs/xfs/scrub/trace.c @@ -20,6 +20,7 @@ #include "xfs_btree_mem.h" #include "xfs_rmap.h" #include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" #include "scrub/scrub.h" #include "scrub/xfile.h" #include "scrub/xfarray.h" diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 844f49091b1d..7d086ffce7e3 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2820,6 +2820,155 @@ TRACE_EVENT(xrep_iunlink_commit_bucket, __entry->agino) ); +#ifdef CONFIG_XFS_RT +DECLARE_EVENT_CLASS(xrep_rgbitmap_class, + TP_PROTO(struct xfs_mount *mp, xfs_rtxnum_t start, xfs_rtxnum_t end), + TP_ARGS(mp, start, end), + TP_STRUCT__entry( + 
__field(dev_t, dev) + __field(dev_t, rtdev) + __field(xfs_rtxnum_t, start) + __field(xfs_rtxnum_t, end) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->rtdev = mp->m_rtdev_targp->bt_dev; + __entry->start = start; + __entry->end = end; + ), + TP_printk("dev %d:%d rtdev %d:%d startrtx 0x%llx endrtx 0x%llx", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->start, + __entry->end) +); +#define DEFINE_REPAIR_RGBITMAP_EVENT(name) \ +DEFINE_EVENT(xrep_rgbitmap_class, name, \ + TP_PROTO(struct xfs_mount *mp, xfs_rtxnum_t start, \ + xfs_rtxnum_t end), \ + TP_ARGS(mp, start, end)) +DEFINE_REPAIR_RGBITMAP_EVENT(xrep_rgbitmap_record_free); +DEFINE_REPAIR_RGBITMAP_EVENT(xrep_rgbitmap_record_free_bulk); + +TRACE_EVENT(xrep_rgbitmap_or, + TP_PROTO(struct xfs_mount *mp, unsigned long long wordoff, + xfs_rtword_t mask, xfs_rtword_t word), + TP_ARGS(mp, wordoff, mask, word), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(unsigned long long, wordoff) + __field(unsigned int, mask) + __field(unsigned int, word) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->rtdev = mp->m_rtdev_targp->bt_dev; + __entry->wordoff = wordoff; + __entry->mask = mask; + __entry->word = word; + ), + TP_printk("dev %d:%d rtdev %d:%d wordoff 0x%llx mask 0x%x word 0x%x", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->wordoff, + __entry->mask, + __entry->word) +); + +TRACE_EVENT(xrep_rgbitmap_load, + TP_PROTO(struct xfs_rtgroup *rtg, xfs_fileoff_t rbmoff, + xfs_rtxnum_t rtx, xfs_rtxnum_t len), + TP_ARGS(rtg, rbmoff, rtx, len), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(xfs_rgnumber_t, rgno) + __field(xfs_fileoff_t, rbmoff) + __field(xfs_rtxnum_t, rtx) + __field(xfs_rtxnum_t, len) + ), + TP_fast_assign( + __entry->dev = rtg->rtg_mount->m_super->s_dev; + __entry->rtdev = 
rtg->rtg_mount->m_rtdev_targp->bt_dev; + __entry->rgno = rtg->rtg_rgno; + __entry->rbmoff = rbmoff; + __entry->rtx = rtx; + __entry->len = len; + ), + TP_printk("dev %d:%d rtdev %d:%d rgno 0x%x rbmoff 0x%llx rtx 0x%llx rtxcount 0x%llx", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->rgno, + __entry->rbmoff, + __entry->rtx, + __entry->len) +); + +TRACE_EVENT(xrep_rgbitmap_load_words, + TP_PROTO(struct xfs_mount *mp, xfs_fileoff_t rbmoff, + unsigned long long wordoff, unsigned int wordcnt), + TP_ARGS(mp, rbmoff, wordoff, wordcnt), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(xfs_fileoff_t, rbmoff) + __field(unsigned long long, wordoff) + __field(unsigned int, wordcnt) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->rtdev = mp->m_rtdev_targp->bt_dev; + __entry->rbmoff = rbmoff; + __entry->wordoff = wordoff; + __entry->wordcnt = wordcnt; + ), + TP_printk("dev %d:%d rtdev %d:%d rbmoff 0x%llx wordoff 0x%llx wordcnt 0x%x", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->rbmoff, + __entry->wordoff, + __entry->wordcnt) +); + +TRACE_EVENT(xrep_rgbitmap_load_word, + TP_PROTO(struct xfs_mount *mp, unsigned long long wordoff, + unsigned int bit, xfs_rtword_t ondisk_word, + xfs_rtword_t xfile_word, xfs_rtword_t word_mask), + TP_ARGS(mp, wordoff, bit, ondisk_word, xfile_word, word_mask), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(unsigned long long, wordoff) + __field(unsigned int, bit) + __field(xfs_rtword_t, ondisk_word) + __field(xfs_rtword_t, xfile_word) + __field(xfs_rtword_t, word_mask) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->rtdev = mp->m_rtdev_targp->bt_dev; + __entry->wordoff = wordoff; + __entry->bit = bit; + __entry->ondisk_word = ondisk_word; + __entry->xfile_word = xfile_word; + __entry->word_mask = word_mask; + ), + TP_printk("dev %d:%d 
rtdev %d:%d wordoff 0x%llx bit %u ondisk 0x%x(0x%x) inmem 0x%x(0x%x) result 0x%x mask 0x%x", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->wordoff, + __entry->bit, + __entry->ondisk_word, + __entry->ondisk_word & __entry->word_mask, + __entry->xfile_word, + __entry->xfile_word & ~__entry->word_mask, + (__entry->xfile_word & ~__entry->word_mask) | + (__entry->ondisk_word & __entry->word_mask), + __entry->word_mask) +); +#endif /* CONFIG_XFS_RT */ + #endif /* IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) */ From patchwork Fri Dec 30 22:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13085500 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7FF45C4332F for ; Sat, 31 Dec 2022 01:46:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235853AbiLaBqF (ORCPT ); Fri, 30 Dec 2022 20:46:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49146 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231435AbiLaBqD (ORCPT ); Fri, 30 Dec 2022 20:46:03 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5076112629 for ; Fri, 30 Dec 2022 17:46:00 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id DE012B81DDA for ; Sat, 31 Dec 2022 01:45:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 962C2C433D2; Sat, 31 Dec 2022 01:45:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=kernel.org; s=k20201202; t=1672451157; bh=C0ur5pppjwen99EI5aX8aSZuOOEtpC57eOu4mNCiauE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=WPKwABq2qvg3Xw6R/ghNX6B85eUr9wpk3K2PbbcQTn250zWWJdNGLl6NEG2tHWo4v mgFjBqGXnXIgmKEzPrHGyy5ShP6BE5Nm+i7MBCdyfCfksVY61OOFvCYb7p9bmCZWXh Kz9Mx2VVySnDEn5tS6dhFjOtidXsAuMulKadLrm0vF+f2OwyQHZF/76/a5j3neRauX G7LAVaI+oBHJhztjtO2V43DU13rRve4/Wl3rQFbmGtWyUfwBiPVWa0esEwhNvtI2JH Zvw8PrhfiSFSIaWJCXzdqukxj/Yu2bMnUtRDme3yuXDGnyRtsiXCKoeJnR4Eae4c5x keHIjQqnipf7g== Subject: [PATCH 35/38] xfs: online repair of the realtime rmap btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:21 -0800 Message-ID: <167243870103.715303.9270241850697635771.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Repair the realtime rmap btree while mounted. Signed-off-by: Darrick J. 
Wong --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_rmap.c | 2 fs/xfs/libxfs/xfs_rmap.h | 2 fs/xfs/libxfs/xfs_rtrmap_btree.c | 2 fs/xfs/libxfs/xfs_rtrmap_btree.h | 3 fs/xfs/scrub/bmap_repair.c | 3 fs/xfs/scrub/common.c | 5 fs/xfs/scrub/cow_repair.c | 2 fs/xfs/scrub/reap.c | 5 fs/xfs/scrub/reap.h | 2 fs/xfs/scrub/repair.c | 135 +++++++ fs/xfs/scrub/repair.h | 13 + fs/xfs/scrub/rtrmap.c | 7 fs/xfs/scrub/rtrmap_repair.c | 722 ++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/trace.h | 57 +++ 16 files changed, 954 insertions(+), 9 deletions(-) create mode 100644 fs/xfs/scrub/rtrmap_repair.c diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 1060ea739210..17c65dce6d26 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -221,6 +221,7 @@ xfs-y += $(addprefix scrub/, \ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \ rgsuper_repair.o \ rtbitmap_repair.o \ + rtrmap_repair.o \ rtsummary_repair.o \ ) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index e3bff42d003d..9c678e9fded5 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -264,7 +264,7 @@ xfs_rmap_check_perag_irec( return NULL; } -static inline xfs_failaddr_t +inline xfs_failaddr_t xfs_rmap_check_rtgroup_irec( struct xfs_rtgroup *rtg, const struct xfs_rmap_irec *irec) diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h index e98f37c39f2f..9d0aaa16f551 100644 --- a/fs/xfs/libxfs/xfs_rmap.h +++ b/fs/xfs/libxfs/xfs_rmap.h @@ -215,6 +215,8 @@ xfs_failaddr_t xfs_rmap_btrec_to_irec(const union xfs_btree_rec *rec, struct xfs_rmap_irec *irec); xfs_failaddr_t xfs_rmap_check_perag_irec(struct xfs_perag *pag, const struct xfs_rmap_irec *irec); +xfs_failaddr_t xfs_rmap_check_rtgroup_irec(struct xfs_rtgroup *rtg, + const struct xfs_rmap_irec *irec); xfs_failaddr_t xfs_rmap_check_irec(struct xfs_btree_cur *cur, const struct xfs_rmap_irec *irec); diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 2d8130b4c187..418173f6f3ca 
100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -705,7 +705,7 @@ xfs_rtrmapbt_create_path( } /* Calculate the rtrmap btree size for some records. */ -static unsigned long long +unsigned long long xfs_rtrmapbt_calc_size( struct xfs_mount *mp, unsigned long long len) diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.h b/fs/xfs/libxfs/xfs_rtrmap_btree.h index 046a60816736..1f0a6f9620e8 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.h +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.h @@ -203,4 +203,7 @@ struct xfs_imeta_update; int xfs_rtrmapbt_create(struct xfs_trans **tpp, struct xfs_imeta_path *path, struct xfs_imeta_update *ic, struct xfs_inode **ipp); +unsigned long long xfs_rtrmapbt_calc_size(struct xfs_mount *mp, + unsigned long long len); + #endif /* __XFS_RTRMAP_BTREE_H__ */ diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c index 77d601afbcfb..b8cdcba984f3 100644 --- a/fs/xfs/scrub/bmap_repair.c +++ b/fs/xfs/scrub/bmap_repair.c @@ -775,7 +775,8 @@ xrep_bmap_remove_old_tree( /* Free the old bmbt blocks if they're not in use. */ xfs_rmap_ino_bmbt_owner(&oinfo, sc->ip->i_ino, rb->whichfork); - return xrep_reap_fsblocks(sc, &rb->old_bmbt_blocks, &oinfo); + return xrep_reap_fsblocks(sc, &rb->old_bmbt_blocks, &oinfo, + XFS_AG_RESV_NONE); } /* Check for garbage inputs. Returns -ECANCELED if there's nothing to do. */ diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c index 18763d136ef5..c2c379aae770 100644 --- a/fs/xfs/scrub/common.c +++ b/fs/xfs/scrub/common.c @@ -964,7 +964,10 @@ int xchk_setup_rt( struct xfs_scrub *sc) { - return xchk_trans_alloc(sc, 0); + uint resblks; + + resblks = xrep_calc_rtgroup_resblks(sc); + return xchk_trans_alloc(sc, resblks); } /* Set us up with AG headers and btree cursors. 
*/ diff --git a/fs/xfs/scrub/cow_repair.c b/fs/xfs/scrub/cow_repair.c index d1b5915e1703..5292171e6a2b 100644 --- a/fs/xfs/scrub/cow_repair.c +++ b/fs/xfs/scrub/cow_repair.c @@ -649,7 +649,7 @@ xrep_bmap_cow( * like inode metadata. */ error = xrep_reap_fsblocks(sc, &xc->old_cowfork_fsblocks, - &XFS_RMAP_OINFO_COW); + &XFS_RMAP_OINFO_COW, XFS_AG_RESV_NONE); if (error) goto out_bitmap; diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c index 151afacab982..b0b29b1e139b 100644 --- a/fs/xfs/scrub/reap.c +++ b/fs/xfs/scrub/reap.c @@ -652,12 +652,13 @@ int xrep_reap_fsblocks( struct xfs_scrub *sc, struct xfsb_bitmap *bitmap, - const struct xfs_owner_info *oinfo) + const struct xfs_owner_info *oinfo, + enum xfs_ag_resv_type type) { struct xreap_state rs = { .sc = sc, .oinfo = oinfo, - .resv = XFS_AG_RESV_NONE, + .resv = type, }; int error; diff --git a/fs/xfs/scrub/reap.h b/fs/xfs/scrub/reap.h index 6606b119b9ec..cfaef544f659 100644 --- a/fs/xfs/scrub/reap.h +++ b/fs/xfs/scrub/reap.h @@ -9,7 +9,7 @@ int xrep_reap_agblocks(struct xfs_scrub *sc, struct xagb_bitmap *bitmap, const struct xfs_owner_info *oinfo, enum xfs_ag_resv_type type); int xrep_reap_fsblocks(struct xfs_scrub *sc, struct xfsb_bitmap *bitmap, - const struct xfs_owner_info *oinfo); + const struct xfs_owner_info *oinfo, enum xfs_ag_resv_type type); int xrep_reap_ifork(struct xfs_scrub *sc, struct xfs_inode *ip, int whichfork); /* Buffer cache scan context. 
*/ diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 995b60f2d41e..b76c01e9f540 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -38,6 +38,8 @@ #include "xfs_rtrmap_btree.h" #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" +#include "xfs_rtalloc.h" +#include "xfs_imeta.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -371,6 +373,39 @@ xrep_calc_ag_resblks( return max(max(bnobt_sz, inobt_sz), max(rmapbt_sz, refcbt_sz)); } +#ifdef CONFIG_XFS_RT +/* + * Figure out how many blocks to reserve for a rtgroup repair. We calculate + * the worst case estimate for the number of blocks we'd need to rebuild one of + * any type of per-rtgroup btree. + */ +xfs_extlen_t +xrep_calc_rtgroup_resblks( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct xfs_scrub_metadata *sm = sc->sm; + struct xfs_rtgroup *rtg; + xfs_extlen_t usedlen; + xfs_extlen_t rmapbt_sz = 0; + + if (!(sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR)) + return 0; + + rtg = xfs_rtgroup_get(mp, sm->sm_agno); + usedlen = rtg->rtg_blockcount; + xfs_rtgroup_put(rtg); + + if (xfs_has_rmapbt(mp)) + rmapbt_sz = xfs_rtrmapbt_calc_size(mp, usedlen); + + trace_xrep_calc_rtgroup_resblks_btsize(mp, sm->sm_agno, usedlen, + rmapbt_sz); + + return rmapbt_sz; +} +#endif /* CONFIG_XFS_RT */ + /* * Reconstructing per-AG Btrees * @@ -1354,3 +1389,103 @@ xrep_is_rtmeta_ino( return false; } + +/* Check the sanity of a rmap record for a metadata btree inode. */ +int +xrep_check_ino_btree_mapping( + struct xfs_scrub *sc, + const struct xfs_rmap_irec *rec) +{ + enum xbtree_recpacking outcome; + int error; + + /* + * Metadata btree inodes never have extended attributes, and all blocks + * should have the bmbt block flag set. + */ + if ((rec->rm_flags & XFS_RMAP_ATTR_FORK) || + !(rec->rm_flags & XFS_RMAP_BMBT_BLOCK)) + return -EFSCORRUPTED; + + /* Make sure the block is within the AG. 
*/ + if (!xfs_verify_agbext(sc->sa.pag, rec->rm_startblock, + rec->rm_blockcount)) + return -EFSCORRUPTED; + + /* Make sure this isn't free space. */ + error = xfs_alloc_has_records(sc->sa.bno_cur, rec->rm_startblock, + rec->rm_blockcount, &outcome); + if (error) + return error; + if (outcome != XBTREE_RECPACKING_EMPTY) + return -EFSCORRUPTED; + + return 0; +} + +/* + * Reset the block count of the inode being repaired, and adjust the dquot + * block usage to match. The inode must not have an xattr fork. + */ +void +xrep_inode_set_nblocks( + struct xfs_scrub *sc, + int64_t new_blocks) +{ + int64_t delta; + + delta = new_blocks - sc->ip->i_nblocks; + sc->ip->i_nblocks = new_blocks; + + xfs_trans_log_inode(sc->tp, sc->ip, XFS_ILOG_CORE); + if (delta != 0) + xfs_trans_mod_dquot_byino(sc->tp, sc->ip, XFS_TRANS_DQ_BCOUNT, + delta); +} + +/* Reset the block reservation for a metadata inode. */ +int +xrep_reset_imeta_reservation( + struct xfs_scrub *sc) +{ + struct xfs_inode *ip = sc->ip; + int64_t delta; + int error; + + delta = ip->i_nblocks + ip->i_delayed_blks - ip->i_meta_resv_asked; + if (delta == 0) + return 0; + + if (delta > 0) { + int64_t give_back; + + /* Too many blocks, free from the incore reservation. */ + give_back = min_t(uint64_t, delta, ip->i_delayed_blks); + if (give_back > 0) { + xfs_mod_delalloc(ip->i_mount, -give_back); + xfs_mod_fdblocks(ip->i_mount, give_back, true); + ip->i_delayed_blks -= give_back; + } + + return 0; + } + + /* Not enough reservation, try to add more. @delta is negative here. 
*/ + error = xfs_mod_fdblocks(sc->mp, delta, true); + while (error == -ENOSPC) { + delta++; + if (delta == 0) { + xfs_warn(sc->mp, +"Insufficient free space to reset space reservation for inode 0x%llx after repair.", + ip->i_ino); + return 0; + } + error = xfs_mod_fdblocks(sc->mp, delta, true); + } + if (error) + return error; + + xfs_mod_delalloc(sc->mp, -delta); + ip->i_delayed_blks += -delta; + return 0; +} diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index a0ed79506195..ff8605849a72 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -88,6 +88,7 @@ int xrep_setup_parent(struct xfs_scrub *sc); int xrep_setup_nlinks(struct xfs_scrub *sc); int xrep_setup_symlink(struct xfs_scrub *sc, unsigned int *resblks); int xrep_setup_rgbitmap(struct xfs_scrub *sc, unsigned int *resblks); +int xrep_setup_rtrmapbt(struct xfs_scrub *sc); int xrep_xattr_reset_fork(struct xfs_scrub *sc); @@ -107,12 +108,16 @@ int xrep_rtgroup_init(struct xfs_scrub *sc, struct xfs_rtgroup *rtg, void xrep_rtgroup_btcur_init(struct xfs_scrub *sc, struct xchk_rt *sr); int xrep_require_rtext_inuse(struct xfs_scrub *sc, xfs_rtblock_t rtbno, xfs_filblks_t len); +xfs_extlen_t xrep_calc_rtgroup_resblks(struct xfs_scrub *sc); #else # define xrep_rtgroup_init(sc, rtg, sr, lockflags) (-ENOSYS) +# define xrep_calc_rtgroup_resblks(sc) (0) #endif /* CONFIG_XFS_RT */ bool xrep_is_rtmeta_ino(struct xfs_scrub *sc, struct xfs_rtgroup *rtg, xfs_ino_t ino); +int xrep_check_ino_btree_mapping(struct xfs_scrub *sc, + const struct xfs_rmap_irec *rec); /* Metadata revalidators */ @@ -146,11 +151,13 @@ int xrep_rtbitmap(struct xfs_scrub *sc); int xrep_rtsummary(struct xfs_scrub *sc); int xrep_rgsuperblock(struct xfs_scrub *sc); int xrep_rgbitmap(struct xfs_scrub *sc); +int xrep_rtrmapbt(struct xfs_scrub *sc); #else # define xrep_rtbitmap xrep_notsupported # define xrep_rtsummary xrep_notsupported # define xrep_rgsuperblock xrep_notsupported # define xrep_rgbitmap xrep_notsupported +# define 
xrep_rtrmapbt xrep_notsupported #endif /* CONFIG_XFS_RT */ #ifdef CONFIG_XFS_QUOTA @@ -170,6 +177,8 @@ void xrep_trans_cancel_hook_dummy(void **cookiep, struct xfs_trans *tp); bool xrep_buf_verify_struct(struct xfs_buf *bp, const struct xfs_buf_ops *ops); xfs_ino_t xrep_dotdot_lookup(struct xfs_scrub *sc); +void xrep_inode_set_nblocks(struct xfs_scrub *sc, int64_t new_blocks); +int xrep_reset_imeta_reservation(struct xfs_scrub *sc); #else @@ -192,6 +201,8 @@ xrep_calc_ag_resblks( return 0; } +#define xrep_calc_rtgroup_resblks xrep_calc_ag_resblks + static inline int xrep_reset_perag_resv( struct xfs_scrub *sc) @@ -217,6 +228,7 @@ xrep_setup_nothing( #define xrep_setup_directory xrep_setup_nothing #define xrep_setup_parent xrep_setup_nothing #define xrep_setup_nlinks xrep_setup_nothing +#define xrep_setup_rtrmapbt xrep_setup_nothing #define xrep_setup_inode(sc, imap) ((void)0) @@ -272,6 +284,7 @@ static inline int xrep_setup_rgbitmap(struct xfs_scrub *sc, unsigned int *x) #define xrep_symlink xrep_notsupported #define xrep_rgsuperblock xrep_notsupported #define xrep_rgbitmap xrep_notsupported +#define xrep_rtrmapbt xrep_notsupported #endif /* CONFIG_XFS_ONLINE_REPAIR */ diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c index e9ca9670f3af..5442325a6982 100644 --- a/fs/xfs/scrub/rtrmap.c +++ b/fs/xfs/scrub/rtrmap.c @@ -26,6 +26,7 @@ #include "scrub/common.h" #include "scrub/btree.h" #include "scrub/trace.h" +#include "scrub/repair.h" /* Set us up with the realtime metadata locked. 
*/ int @@ -43,6 +44,12 @@ xchk_setup_rtrmapbt( if (!rtg) return -ENOENT; + if (xchk_could_repair(sc)) { + error = xrep_setup_rtrmapbt(sc); + if (error) + return error; + } + error = xchk_setup_rt(sc); if (error) goto out_rtg; diff --git a/fs/xfs/scrub/rtrmap_repair.c b/fs/xfs/scrub/rtrmap_repair.c new file mode 100644 index 000000000000..d856a4e46d6f --- /dev/null +++ b/fs/xfs/scrub/rtrmap_repair.c @@ -0,0 +1,722 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_defer.h" +#include "xfs_btree.h" +#include "xfs_btree_staging.h" +#include "xfs_bit.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_sb.h" +#include "xfs_alloc.h" +#include "xfs_rmap.h" +#include "xfs_rmap_btree.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_inode.h" +#include "xfs_icache.h" +#include "xfs_bmap.h" +#include "xfs_bmap_btree.h" +#include "xfs_quota.h" +#include "xfs_rtalloc.h" +#include "xfs_ag.h" +#include "xfs_rtgroup.h" +#include "scrub/xfs_scrub.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/btree.h" +#include "scrub/trace.h" +#include "scrub/repair.h" +#include "scrub/bitmap.h" +#include "scrub/xfile.h" +#include "scrub/xfarray.h" +#include "scrub/iscan.h" +#include "scrub/newbt.h" +#include "scrub/reap.h" + +/* + * Realtime Reverse Mapping Btree Repair + * ===================================== + * + * This isn't quite as difficult as repairing the rmap btree on the data + * device, since we only store the data fork extents of realtime files on the + * realtime device. We still have to freeze the filesystem and stop the + * background threads like we do for the rmap repair, but we only have to scan + * realtime inodes. 
+ * + * Collecting entries for the new realtime rmap btree is easy -- all we have + * to do is generate rtrmap entries from the data fork mappings of all realtime + * files in the filesystem. We then scan the rmap btrees of the data device + * looking for extents belonging to the old btree and note them in a bitmap. + * + * To rebuild the realtime rmap btree, we bulk-load the collected mappings into + * a new btree cursor and atomically swap that into the realtime inode. Then + * we can free the blocks from the old btree. + * + * We use the 'xrep_rtrmap' prefix for all the rmap functions. + */ + +/* Set us up to repair rt reverse mapping btrees. */ +int +xrep_setup_rtrmapbt( + struct xfs_scrub *sc) +{ + /* For now this is a placeholder until we land other pieces. */ + return 0; +} + +/* + * Packed rmap record. The UNWRITTEN flags are hidden in the upper bits of + * offset, just like the on-disk record. + */ +struct xrep_rtrmap_extent { + xfs_rgblock_t startblock; + xfs_extlen_t blockcount; + uint64_t owner; + uint64_t offset; +} __packed; + +/* Context for collecting rmaps */ +struct xrep_rtrmap { + /* new rtrmapbt information */ + struct xrep_newbt new_btree; + + /* rmap records generated from primary metadata */ + struct xfarray *rtrmap_records; + + struct xfs_scrub *sc; + + /* bitmap of old rtrmapbt blocks */ + struct xfsb_bitmap old_rtrmapbt_blocks; + + /* inode scan cursor */ + struct xchk_iscan iscan; + + /* get_records()'s position in the rtrmap record array. */ + xfarray_idx_t array_cur; +}; + +/* Make sure there's nothing funny about this mapping. */ +STATIC int +xrep_rtrmap_check_mapping( + struct xfs_scrub *sc, + const struct xfs_rmap_irec *rec) +{ + xfs_rtblock_t rtbno; + + if (xfs_rmap_check_rtgroup_irec(sc->sr.rtg, rec) != NULL) + return -EFSCORRUPTED; + + /* Make sure this isn't free space. 
*/ + rtbno = xfs_rgbno_to_rtb(sc->mp, sc->sr.rtg->rtg_rgno, + rec->rm_startblock); + return xrep_require_rtext_inuse(sc, rtbno, rec->rm_blockcount); +} + +/* Store a reverse-mapping record. */ +static inline int +xrep_rtrmap_stash( + struct xrep_rtrmap *rr, + xfs_rgblock_t startblock, + xfs_extlen_t blockcount, + uint64_t owner, + uint64_t offset, + unsigned int flags) +{ + struct xrep_rtrmap_extent rre = { + .startblock = startblock, + .blockcount = blockcount, + .owner = owner, + }; + struct xfs_rmap_irec rmap = { + .rm_startblock = startblock, + .rm_blockcount = blockcount, + .rm_owner = owner, + .rm_offset = offset, + .rm_flags = flags, + }; + struct xfs_scrub *sc = rr->sc; + int error = 0; + + if (xchk_should_terminate(sc, &error)) + return error; + + trace_xrep_rtrmap_found(sc->mp, &rmap); + + rre.offset = xfs_rmap_irec_offset_pack(&rmap); + return xfarray_append(rr->rtrmap_records, &rre); +} + +/* Finding all file and bmbt extents. */ + +/* Context for accumulating rmaps for an inode fork. */ +struct xrep_rtrmap_ifork { + /* + * Accumulate rmap data here to turn multiple adjacent bmaps into a + * single rmap. + */ + struct xfs_rmap_irec accum; + + struct xrep_rtrmap *rr; +}; + +/* Stash an rmap that we accumulated while walking an inode fork. */ +STATIC int +xrep_rtrmap_stash_accumulated( + struct xrep_rtrmap_ifork *rf) +{ + if (rf->accum.rm_blockcount == 0) + return 0; + + return xrep_rtrmap_stash(rf->rr, rf->accum.rm_startblock, + rf->accum.rm_blockcount, rf->accum.rm_owner, + rf->accum.rm_offset, rf->accum.rm_flags); +} + +/* Accumulate a bmbt record. 
*/ +STATIC int +xrep_rtrmap_visit_bmbt( + struct xfs_btree_cur *cur, + struct xfs_bmbt_irec *rec, + void *priv) +{ + struct xrep_rtrmap_ifork *rf = priv; + struct xfs_rmap_irec *accum = &rf->accum; + struct xfs_mount *mp = rf->rr->sc->mp; + xfs_rgnumber_t rgno; + xfs_rgblock_t rgbno; + unsigned int rmap_flags = 0; + int error; + + rgbno = xfs_rtb_to_rgbno(mp, rec->br_startblock, &rgno); + if (rgno != rf->rr->sc->sr.rtg->rtg_rgno) + return 0; + + if (rec->br_state == XFS_EXT_UNWRITTEN) + rmap_flags |= XFS_RMAP_UNWRITTEN; + + /* If this bmap is adjacent to the previous one, just add it. */ + if (accum->rm_blockcount > 0 && + rec->br_startoff == accum->rm_offset + accum->rm_blockcount && + rgbno == accum->rm_startblock + accum->rm_blockcount && + rmap_flags == accum->rm_flags) { + accum->rm_blockcount += rec->br_blockcount; + return 0; + } + + /* Otherwise stash the old rmap and start accumulating a new one. */ + error = xrep_rtrmap_stash_accumulated(rf); + if (error) + return error; + + accum->rm_startblock = rgbno; + accum->rm_blockcount = rec->br_blockcount; + accum->rm_offset = rec->br_startoff; + accum->rm_flags = rmap_flags; + return 0; +} + +/* + * Iterate the block mapping btree to collect rmap records for anything in this + * fork that maps to the rt volume. Sets @mappings_done to true if we've + * scanned the block mappings in this fork. + */ +STATIC int +xrep_rtrmap_scan_bmbt( + struct xrep_rtrmap_ifork *rf, + struct xfs_inode *ip, + bool *mappings_done) +{ + struct xrep_rtrmap *rr = rf->rr; + struct xfs_btree_cur *cur; + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + int error = 0; + + *mappings_done = false; + + /* + * If the incore extent cache is already loaded, we'll just use the + * incore extent scanner to record mappings. Don't bother walking the + * ondisk extent tree. + */ + if (!xfs_need_iread_extents(ifp)) + return 0; + + /* Accumulate all the mappings in the bmap btree. 
*/ + cur = xfs_bmbt_init_cursor(rr->sc->mp, rr->sc->tp, ip, XFS_DATA_FORK); + error = xfs_bmap_query_all(cur, xrep_rtrmap_visit_bmbt, rf); + xfs_btree_del_cursor(cur, error); + if (error) + return error; + + /* Stash any remaining accumulated rmaps and exit. */ + *mappings_done = true; + return xrep_rtrmap_stash_accumulated(rf); +} + +/* + * Iterate the in-core extent cache to collect rmap records for anything in + * this fork that maps to the rtgroup being repaired. + */ +STATIC int +xrep_rtrmap_scan_iext( + struct xrep_rtrmap_ifork *rf, + struct xfs_ifork *ifp) +{ + struct xfs_bmbt_irec rec; + struct xfs_iext_cursor icur; + int error; + + for_each_xfs_iext(ifp, &icur, &rec) { + if (isnullstartblock(rec.br_startblock)) + continue; + error = xrep_rtrmap_visit_bmbt(NULL, &rec, rf); + if (error) + return error; + } + + return xrep_rtrmap_stash_accumulated(rf); +} + +/* Find all the extents on the realtime device mapped by an inode fork. */ +STATIC int +xrep_rtrmap_scan_dfork( + struct xrep_rtrmap *rr, + struct xfs_inode *ip) +{ + struct xrep_rtrmap_ifork rf = { + .accum = { .rm_owner = ip->i_ino, }, + .rr = rr, + }; + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + int error = 0; + + if (ifp->if_format == XFS_DINODE_FMT_BTREE) { + bool mappings_done; + + /* + * Scan the bmbt for mappings. If the incore extent tree is + * loaded, we want to scan the cached mappings since that's + * faster when the extent counts are very high. + */ + error = xrep_rtrmap_scan_bmbt(&rf, ip, &mappings_done); + if (error || mappings_done) + return error; + } else if (ifp->if_format != XFS_DINODE_FMT_EXTENTS) { + /* realtime data forks should only be extents or btree */ + return -EFSCORRUPTED; + } + + /* Scan incore extent cache. */ + return xrep_rtrmap_scan_iext(&rf, ifp); +} + +/* Record reverse mappings for a file. */ +STATIC int +xrep_rtrmap_scan_inode( + struct xrep_rtrmap *rr, + struct xfs_inode *ip) +{ + unsigned int lock_mode; + int error = 0; + + /* Skip the rt rmap btree inode. 
*/ + if (rr->sc->ip == ip) + return 0; + + xfs_ilock(ip, XFS_IOLOCK_SHARED | XFS_MMAPLOCK_SHARED); + lock_mode = xfs_ilock_data_map_shared(ip); + + /* Check the data fork if it's on the realtime device. */ + if (XFS_IS_REALTIME_INODE(ip)) { + error = xrep_rtrmap_scan_dfork(rr, ip); + if (error) + goto out_unlock; + } + + xchk_iscan_mark_visited(&rr->iscan, ip); +out_unlock: + xfs_iunlock(ip, XFS_IOLOCK_SHARED | XFS_MMAPLOCK_SHARED | lock_mode); + return error; +} + +/* Record extents that belong to the realtime rmap inode. */ +STATIC int +xrep_rtrmap_walk_rmap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xrep_rtrmap *rr = priv; + struct xfs_mount *mp = cur->bc_mp; + xfs_fsblock_t fsbno; + int error = 0; + + if (xchk_should_terminate(rr->sc, &error)) + return error; + + /* Skip extents which are not owned by this inode and fork. */ + if (rec->rm_owner != rr->sc->ip->i_ino) + return 0; + + error = xrep_check_ino_btree_mapping(rr->sc, rec); + if (error) + return error; + + fsbno = XFS_AGB_TO_FSB(mp, cur->bc_ag.pag->pag_agno, + rec->rm_startblock); + + return xfsb_bitmap_set(&rr->old_rtrmapbt_blocks, fsbno, + rec->rm_blockcount); +} + +/* Scan one AG for reverse mappings for the realtime rmap btree. */ +STATIC int +xrep_rtrmap_scan_ag( + struct xrep_rtrmap *rr, + struct xfs_perag *pag) +{ + struct xfs_scrub *sc = rr->sc; + int error; + + error = xrep_ag_init(sc, pag, &sc->sa); + if (error) + return error; + + error = xfs_rmap_query_all(sc->sa.rmap_cur, xrep_rtrmap_walk_rmap, rr); + xchk_ag_free(sc, &sc->sa); + return error; +} + +STATIC int +xrep_rtrmap_find_super_rmaps( + struct xrep_rtrmap *rr) +{ + struct xfs_scrub *sc = rr->sc; + + /* Create a record for the rtgroup superblock. */ + return xrep_rtrmap_stash(rr, 0, sc->mp->m_sb.sb_rextsize, + XFS_RMAP_OWN_FS, 0, 0); +} + +/* Generate all the reverse-mappings for the realtime device. 
*/ +STATIC int +xrep_rtrmap_find_rmaps( + struct xrep_rtrmap *rr) +{ + struct xfs_scrub *sc = rr->sc; + struct xfs_perag *pag; + struct xfs_inode *ip; + xfs_agnumber_t agno; + int error; + + /* Generate rmaps for the rtgroup superblock */ + error = xrep_rtrmap_find_super_rmaps(rr); + if (error) + return error; + + /* + * Set up for a potentially lengthy filesystem scan by reducing our + * transaction resource usage for the duration. Specifically: + * + * Unlock the realtime metadata inodes and cancel the transaction to + * release the log grant space while we scan the filesystem. + * + * Create a new empty transaction to eliminate the possibility of the + * inode scan deadlocking on cyclical metadata. + * + * We pass the empty transaction to the file scanning function to avoid + * repeatedly cycling empty transactions. This can be done even though + * we take the IOLOCK to quiesce the file because empty transactions + * do not take sb_internal. + */ + xchk_trans_cancel(sc); + xchk_rtgroup_unlock(sc, &sc->sr); + error = xchk_trans_alloc_empty(sc); + if (error) + return error; + + while ((error = xchk_iscan_iter(sc, &rr->iscan, &ip)) == 1) { + error = xrep_rtrmap_scan_inode(rr, ip); + xchk_irele(sc, ip); + if (error) + break; + + if (xchk_should_terminate(sc, &error)) + break; + } + if (error) + return error; + + /* + * Switch out for a real transaction and lock the RT metadata in + * preparation for building a new tree. + */ + xchk_trans_cancel(sc); + error = xchk_setup_rt(sc); + if (error) + return error; + error = xchk_rtgroup_lock(sc, &sc->sr, XCHK_RTGLOCK_ALL); + if (error) + return error; + + /* Scan for old rtrmap blocks. */ + for_each_perag(sc->mp, agno, pag) { + error = xrep_rtrmap_scan_ag(rr, pag); + if (error) { + xfs_perag_put(pag); + return error; + } + } + + return 0; +} + +/* Building the new rtrmap btree. */ + +/* Retrieve rtrmapbt data for bulk load. 
*/ +STATIC int +xrep_rtrmap_get_records( + struct xfs_btree_cur *cur, + unsigned int idx, + struct xfs_btree_block *block, + unsigned int nr_wanted, + void *priv) +{ + struct xrep_rtrmap_extent rec; + struct xfs_rmap_irec *irec = &cur->bc_rec.r; + struct xrep_rtrmap *rr = priv; + union xfs_btree_rec *block_rec; + unsigned int loaded; + int error; + + for (loaded = 0; loaded < nr_wanted; loaded++, idx++) { + error = xfarray_load_next(rr->rtrmap_records, &rr->array_cur, + &rec); + if (error) + return error; + + irec->rm_startblock = rec.startblock; + irec->rm_blockcount = rec.blockcount; + irec->rm_owner = rec.owner; + + if (xfs_rmap_irec_offset_unpack(rec.offset, irec) != NULL) + return -EFSCORRUPTED; + + error = xrep_rtrmap_check_mapping(rr->sc, irec); + if (error) + return error; + + block_rec = xfs_btree_rec_addr(cur, idx, block); + cur->bc_ops->init_rec_from_cur(cur, block_rec); + } + + return loaded; +} + +/* Feed one of the new btree blocks to the bulk loader. */ +STATIC int +xrep_rtrmap_claim_block( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr, + void *priv) +{ + struct xrep_rtrmap *rr = priv; + + return xrep_newbt_claim_block(cur, &rr->new_btree, ptr); +} + +/* Figure out how much space we need to create the incore btree root block. */ +STATIC size_t +xrep_rtrmap_iroot_size( + struct xfs_btree_cur *cur, + unsigned int level, + unsigned int nr_this_level, + void *priv) +{ + return xfs_rtrmap_broot_space_calc(cur->bc_mp, level, nr_this_level); +} + +/* + * Use the collected rmap information to stage a new rmap btree. If this is + * successful we'll return with the new btree root information logged to the + * repair transaction but not yet committed. This implements the bulk-load + * step described in the overview comment at the top of this file. 
+ */ +STATIC int +xrep_rtrmap_build_new_tree( + struct xrep_rtrmap *rr) +{ + struct xfs_owner_info oinfo; + struct xfs_scrub *sc = rr->sc; + struct xfs_rtgroup *rtg = sc->sr.rtg; + struct xfs_btree_cur *rmap_cur; + uint64_t nr_records; + int error; + + /* + * Prepare to construct the new btree by reserving disk space for the + * new btree and setting up all the accounting information we'll need + * to root the new btree while it's under construction and before we + * attach it to the realtime rmapbt inode. + */ + xfs_rmap_ino_bmbt_owner(&oinfo, rtg->rtg_rmapip->i_ino, XFS_DATA_FORK); + error = xrep_newbt_init_inode(&rr->new_btree, sc, XFS_DATA_FORK, + &oinfo); + if (error) + return error; + rr->new_btree.bload.get_records = xrep_rtrmap_get_records; + rr->new_btree.bload.claim_block = xrep_rtrmap_claim_block; + rr->new_btree.bload.iroot_size = xrep_rtrmap_iroot_size; + + rmap_cur = xfs_rtrmapbt_stage_cursor(sc->mp, rtg, rtg->rtg_rmapip, + &rr->new_btree.ifake); + + nr_records = xfarray_length(rr->rtrmap_records); + + /* Compute how many blocks we'll need for the rmaps collected. */ + error = xfs_btree_bload_compute_geometry(rmap_cur, + &rr->new_btree.bload, nr_records); + if (error) + goto err_cur; + + /* Last chance to abort before we start committing fixes. */ + if (xchk_should_terminate(sc, &error)) + goto err_cur; + + /* + * Guess how many blocks we're going to need to rebuild an entire + * rtrmapbt from the number of extents we found, and pump up our + * transaction to have sufficient block reservation. We're allowed + * to exceed quota to repair inconsistent metadata, though this is + * unlikely. + */ + error = xfs_trans_reserve_more_inode(sc->tp, rtg->rtg_rmapip, + rr->new_btree.bload.nr_blocks, 0, true); + if (error) + goto err_cur; + + /* Reserve the space we'll need for the new btree. */ + error = xrep_newbt_alloc_blocks(&rr->new_btree, + rr->new_btree.bload.nr_blocks); + if (error) + goto err_cur; + + /* Add all observed rmap records. 
*/ + rr->new_btree.ifake.if_fork->if_format = XFS_DINODE_FMT_RMAP; + rr->array_cur = XFARRAY_CURSOR_INIT; + error = xfs_btree_bload(rmap_cur, &rr->new_btree.bload, rr); + if (error) + goto err_cur; + + /* + * Install the new rtrmap btree in the inode. After this point the old + * btree is no longer accessible, the new tree is live, and we can + * delete the cursor. + */ + xfs_rtrmapbt_commit_staged_btree(rmap_cur, sc->tp); + xrep_inode_set_nblocks(rr->sc, rr->new_btree.ifake.if_blocks); + xfs_btree_del_cursor(rmap_cur, 0); + + /* Dispose of any unused blocks and the accounting information. */ + error = xrep_newbt_commit(&rr->new_btree); + if (error) + return error; + + return xrep_roll_trans(sc); + +err_cur: + xfs_btree_del_cursor(rmap_cur, error); + xrep_newbt_cancel(&rr->new_btree); + return error; +} + +/* Reaping the old btree. */ + +/* Reap the old rtrmapbt blocks. */ +STATIC int +xrep_rtrmap_remove_old_tree( + struct xrep_rtrmap *rr) +{ + struct xfs_owner_info oinfo; + int error; + + /* + * Free all the extents that were allocated to the former rtrmapbt and + * aren't cross-linked with something else. + */ + xfs_rmap_ino_bmbt_owner(&oinfo, rr->sc->ip->i_ino, XFS_DATA_FORK); + error = xrep_reap_fsblocks(rr->sc, &rr->old_rtrmapbt_blocks, &oinfo, + XFS_AG_RESV_IMETA); + if (error) + return error; + + /* + * Ensure the proper reservation for the rtrmap inode so that we don't + * fail to expand the new btree. + */ + return xrep_reset_imeta_reservation(rr->sc); +} + +/* Repair the realtime rmap btree. */ +int +xrep_rtrmapbt( + struct xfs_scrub *sc) +{ + struct xrep_rtrmap *rr; + int error; + + /* Functionality is not yet complete. */ + return xrep_notsupported(sc); + + /* Make sure any problems with the fork are fixed. 
*/ + error = xrep_metadata_inode_forks(sc); + if (error) + return error; + + rr = kzalloc(sizeof(struct xrep_rtrmap), XCHK_GFP_FLAGS); + if (!rr) + return -ENOMEM; + rr->sc = sc; + + xfsb_bitmap_init(&rr->old_rtrmapbt_blocks); + + /* Set up some storage */ + error = xfarray_create(sc->mp, "rtrmap records", 0, + sizeof(struct xrep_rtrmap_extent), &rr->rtrmap_records); + if (error) + goto out_bitmap; + + /* Retry iget every tenth of a second for up to 30 seconds. */ + xchk_iscan_start(&rr->iscan, 30000, 100); + + /* Collect rmaps for realtime files. */ + error = xrep_rtrmap_find_rmaps(rr); + if (error) + goto out_records; + + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* Rebuild the rtrmap information. */ + error = xrep_rtrmap_build_new_tree(rr); + if (error) + goto out_records; + + /* Kill the old tree. */ + error = xrep_rtrmap_remove_old_tree(rr); + +out_records: + xchk_iscan_finish(&rr->iscan); + xfarray_destroy(rr->rtrmap_records); +out_bitmap: + xfsb_bitmap_destroy(&rr->old_rtrmapbt_blocks); + kfree(rr); + return error; +} diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 7abd25b37c97..ab7a36efab3b 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -430,7 +430,7 @@ static const struct xchk_meta_ops meta_scrub_ops[] = { .setup = xchk_setup_rtrmapbt, .scrub = xchk_rtrmapbt, .has = xfs_has_rtrmapbt, - .repair = xrep_notsupported, + .repair = xrep_rtrmapbt, }, }; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 7d086ffce7e3..654cbcbd99ea 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -1715,6 +1715,32 @@ TRACE_EVENT(xrep_calc_ag_resblks_btsize, __entry->rmapbt_sz, __entry->refcbt_sz) ) + +#ifdef CONFIG_XFS_RT +TRACE_EVENT(xrep_calc_rtgroup_resblks_btsize, + TP_PROTO(struct xfs_mount *mp, xfs_rgnumber_t rgno, + xfs_rgblock_t usedlen, xfs_rgblock_t rmapbt_sz), + TP_ARGS(mp, rgno, usedlen, rmapbt_sz), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_rgnumber_t, rgno) + __field(xfs_rgblock_t, usedlen) + 
__field(xfs_rgblock_t, rmapbt_sz) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->rgno = rgno; + __entry->usedlen = usedlen; + __entry->rmapbt_sz = rmapbt_sz; + ), + TP_printk("dev %d:%d rgno 0x%x usedlen %u rmapbt %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->rgno, + __entry->usedlen, + __entry->rmapbt_sz) +); +#endif /* CONFIG_XFS_RT */ + TRACE_EVENT(xrep_reset_counters, TP_PROTO(struct xfs_mount *mp, struct xchk_fscounters *fsc), TP_ARGS(mp, fsc), @@ -2967,6 +2993,37 @@ TRACE_EVENT(xrep_rgbitmap_load_word, (__entry->ondisk_word & __entry->word_mask), __entry->word_mask) ); + +TRACE_EVENT(xrep_rtrmap_found, + TP_PROTO(struct xfs_mount *mp, const struct xfs_rmap_irec *rec), + TP_ARGS(mp, rec), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(dev_t, rtdev) + __field(xfs_rgblock_t, rgbno) + __field(xfs_extlen_t, len) + __field(uint64_t, owner) + __field(uint64_t, offset) + __field(unsigned int, flags) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->rtdev = mp->m_rtdev_targp->bt_dev; + __entry->rgbno = rec->rm_startblock; + __entry->len = rec->rm_blockcount; + __entry->owner = rec->rm_owner; + __entry->offset = rec->rm_offset; + __entry->flags = rec->rm_flags; + ), + TP_printk("dev %d:%d rtdev %d:%d rgbno 0x%x fsbcount 0x%x owner 0x%llx fileoff 0x%llx flags 0x%x", + MAJOR(__entry->dev), MINOR(__entry->dev), + MAJOR(__entry->rtdev), MINOR(__entry->rtdev), + __entry->rgbno, + __entry->len, + __entry->owner, + __entry->offset, + __entry->flags) +); #endif /* CONFIG_XFS_RT */ #endif /* IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) */ From patchwork Fri Dec 30 22:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085501 Subject: [PATCH 36/38] xfs: create a shadow rmap btree during realtime rmap repair From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:21 -0800 Message-ID: <167243870117.715303.50488883348344202.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Create an in-memory btree of rmap records instead of an array. This enables us to do live record collection instead of freezing the fs. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_btree.c | 2 + fs/xfs/libxfs/xfs_btree.h | 1 fs/xfs/libxfs/xfs_rmap.c | 6 +- fs/xfs/libxfs/xfs_rtrmap_btree.c | 122 +++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrmap_btree.h | 9 ++ fs/xfs/scrub/rtrmap_repair.c | 150 +++++++++++++++++++++++++++----------- fs/xfs/scrub/xfbtree.c | 3 + 7 files changed, 248 insertions(+), 45 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index fe742567a7dd..377dc9b0a6e6 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -491,6 +491,8 @@ xfs_btree_del_cursor( if (cur->bc_flags & XFS_BTREE_IN_MEMORY) { if (cur->bc_mem.pag) xfs_perag_put(cur->bc_mem.pag); + if (cur->bc_mem.rtg) + xfs_rtgroup_put(cur->bc_mem.rtg); } kmem_cache_free(cur->bc_cache, cur); } diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 5a733767649b..20342ed62bf4 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -266,6 +266,7 @@ struct xfs_btree_cur_mem { struct xfbtree *xfbtree; struct xfs_buf *head_bp; struct xfs_perag *pag; + struct xfs_rtgroup *rtg; }; struct xfs_btree_level { diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 9c678e9fded5..06840fc31f02 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -328,8 +328,12 @@ xfs_rmap_check_irec( struct xfs_btree_cur *cur, const struct xfs_rmap_irec *irec) { - if 
(cur->bc_btnum == XFS_BTNUM_RTRMAP) + if (cur->bc_btnum == XFS_BTNUM_RTRMAP) { + if (cur->bc_flags & XFS_BTREE_IN_MEMORY) + return xfs_rmap_check_rtgroup_irec(cur->bc_mem.rtg, + irec); return xfs_rmap_check_rtgroup_irec(cur->bc_ino.rtg, irec); + } if (cur->bc_flags & XFS_BTREE_IN_MEMORY) return xfs_rmap_check_perag_irec(cur->bc_mem.pag, irec); diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 418173f6f3ca..878bfeed411f 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -29,6 +29,9 @@ #include "xfs_bmap.h" #include "xfs_imeta.h" #include "xfs_health.h" +#include "scrub/xfile.h" +#include "scrub/xfbtree.h" +#include "xfs_btree_mem.h" static struct kmem_cache *xfs_rtrmapbt_cur_cache; @@ -557,6 +560,125 @@ xfs_rtrmapbt_stage_cursor( return cur; } +#ifdef CONFIG_XFS_IN_MEMORY_BTREE +/* + * Validate an in-memory realtime rmap btree block. Callers are allowed to + * generate an in-memory btree even if the ondisk feature is not enabled. 
+ */ +static xfs_failaddr_t +xfs_rtrmapbt_mem_verify( + struct xfs_buf *bp) +{ + struct xfs_mount *mp = bp->b_mount; + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_failaddr_t fa; + unsigned int level; + + if (!xfs_verify_magic(bp, block->bb_magic)) + return __this_address; + + fa = xfs_btree_lblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN); + if (fa) + return fa; + + level = be16_to_cpu(block->bb_level); + if (xfs_has_rmapbt(mp)) { + if (level >= mp->m_rtrmap_maxlevels) + return __this_address; + } else { + if (level >= xfs_rtrmapbt_maxlevels_ondisk()) + return __this_address; + } + + return xfbtree_lblock_verify(bp, + xfs_rtrmapbt_maxrecs(mp, xfo_to_b(1), level == 0)); +} + +static void +xfs_rtrmapbt_mem_rw_verify( + struct xfs_buf *bp) +{ + xfs_failaddr_t fa = xfs_rtrmapbt_mem_verify(bp); + + if (fa) + xfs_verifier_error(bp, -EFSCORRUPTED, fa); +} + +/* skip crc checks on in-memory btrees to save time */ +static const struct xfs_buf_ops xfs_rtrmapbt_mem_buf_ops = { + .name = "xfs_rtrmapbt_mem", + .magic = { 0, cpu_to_be32(XFS_RTRMAP_CRC_MAGIC) }, + .verify_read = xfs_rtrmapbt_mem_rw_verify, + .verify_write = xfs_rtrmapbt_mem_rw_verify, + .verify_struct = xfs_rtrmapbt_mem_verify, +}; + +static const struct xfs_btree_ops xfs_rtrmapbt_mem_ops = { + .rec_len = sizeof(struct xfs_rmap_rec), + .key_len = 2 * sizeof(struct xfs_rmap_key), + .geom_flags = XFS_BTREE_CRC_BLOCKS | XFS_BTREE_OVERLAPPING | + XFS_BTREE_LONG_PTRS | XFS_BTREE_IN_MEMORY, + + .dup_cursor = xfbtree_dup_cursor, + .set_root = xfbtree_set_root, + .alloc_block = xfbtree_alloc_block, + .free_block = xfbtree_free_block, + .get_minrecs = xfbtree_get_minrecs, + .get_maxrecs = xfbtree_get_maxrecs, + .init_key_from_rec = xfs_rtrmapbt_init_key_from_rec, + .init_high_key_from_rec = xfs_rtrmapbt_init_high_key_from_rec, + .init_rec_from_cur = xfs_rtrmapbt_init_rec_from_cur, + .init_ptr_from_cur = xfbtree_init_ptr_from_cur, + .key_diff = xfs_rtrmapbt_key_diff, + .buf_ops = &xfs_rtrmapbt_mem_buf_ops, + 
.diff_two_keys = xfs_rtrmapbt_diff_two_keys, + .keys_inorder = xfs_rtrmapbt_keys_inorder, + .recs_inorder = xfs_rtrmapbt_recs_inorder, + .keys_contiguous = xfs_rtrmapbt_keys_contiguous, +}; + +/* Create a cursor for an in-memory btree. */ +struct xfs_btree_cur * +xfs_rtrmapbt_mem_cursor( + struct xfs_rtgroup *rtg, + struct xfs_trans *tp, + struct xfs_buf *head_bp, + struct xfbtree *xfbtree) +{ + struct xfs_btree_cur *cur; + struct xfs_mount *mp = rtg->rtg_mount; + + /* Overlapping btree; 2 keys per pointer. */ + cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_RTRMAP, + &xfs_rtrmapbt_mem_ops, mp->m_rtrmap_maxlevels, + xfs_rtrmapbt_cur_cache); + cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_rmap_2); + cur->bc_mem.xfbtree = xfbtree; + cur->bc_mem.head_bp = head_bp; + cur->bc_nlevels = xfs_btree_mem_head_nlevels(head_bp); + + cur->bc_mem.rtg = xfs_rtgroup_bump(rtg); + return cur; +} + +int +xfs_rtrmapbt_mem_create( + struct xfs_mount *mp, + xfs_rgnumber_t rgno, + struct xfs_buftarg *target, + struct xfbtree **xfbtreep) +{ + struct xfbtree_config cfg = { + .btree_ops = &xfs_rtrmapbt_mem_ops, + .target = target, + .flags = XFBTREE_DIRECT_MAP, + .owner = rgno, + }; + + return xfbtree_create(mp, &cfg, xfbtreep); +} +#endif /* CONFIG_XFS_IN_MEMORY_BTREE */ + /* * Install a new rt reverse mapping btree root. Caller is responsible for * invalidating and freeing the old btree blocks. 
diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.h b/fs/xfs/libxfs/xfs_rtrmap_btree.h index 1f0a6f9620e8..ff60a2ca945f 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.h +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.h @@ -206,4 +206,13 @@ int xfs_rtrmapbt_create(struct xfs_trans **tpp, struct xfs_imeta_path *path, unsigned long long xfs_rtrmapbt_calc_size(struct xfs_mount *mp, unsigned long long len); +#ifdef CONFIG_XFS_IN_MEMORY_BTREE +struct xfbtree; +struct xfs_btree_cur *xfs_rtrmapbt_mem_cursor(struct xfs_rtgroup *rtg, + struct xfs_trans *tp, struct xfs_buf *mhead_bp, + struct xfbtree *xfbtree); +int xfs_rtrmapbt_mem_create(struct xfs_mount *mp, xfs_rgnumber_t rgno, + struct xfs_buftarg *target, struct xfbtree **xfbtreep); +#endif /* CONFIG_XFS_IN_MEMORY_BTREE */ + #endif /* __XFS_RTRMAP_BTREE_H__ */ diff --git a/fs/xfs/scrub/rtrmap_repair.c b/fs/xfs/scrub/rtrmap_repair.c index d856a4e46d6f..5775efa67de6 100644 --- a/fs/xfs/scrub/rtrmap_repair.c +++ b/fs/xfs/scrub/rtrmap_repair.c @@ -12,6 +12,7 @@ #include "xfs_defer.h" #include "xfs_btree.h" #include "xfs_btree_staging.h" +#include "xfs_btree_mem.h" #include "xfs_bit.h" #include "xfs_log_format.h" #include "xfs_trans.h" @@ -40,6 +41,7 @@ #include "scrub/iscan.h" #include "scrub/newbt.h" #include "scrub/reap.h" +#include "scrub/xfbtree.h" /* * Realtime Reverse Mapping Btree Repair @@ -68,28 +70,16 @@ int xrep_setup_rtrmapbt( struct xfs_scrub *sc) { - /* For now this is a placeholder until we land other pieces. */ - return 0; + return xrep_setup_buftarg(sc, "rtrmapbt repair"); } -/* - * Packed rmap record. The UNWRITTEN flags are hidden in the upper bits of - * offset, just like the on-disk record. 
- */ -struct xrep_rtrmap_extent { - xfs_rgblock_t startblock; - xfs_extlen_t blockcount; - uint64_t owner; - uint64_t offset; -} __packed; - /* Context for collecting rmaps */ struct xrep_rtrmap { /* new rtrmapbt information */ struct xrep_newbt new_btree; /* rmap records generated from primary metadata */ - struct xfarray *rtrmap_records; + struct xfbtree *rtrmap_btree; struct xfs_scrub *sc; @@ -99,8 +89,11 @@ struct xrep_rtrmap { /* inode scan cursor */ struct xchk_iscan iscan; - /* get_records()'s position in the free space record array. */ - xfarray_idx_t array_cur; + /* in-memory btree cursor for the ->get_blocks walk */ + struct xfs_btree_cur *mcur; + + /* Number of records we're staging in the new btree. */ + uint64_t nr_records; }; /* Make sure there's nothing funny about this mapping. */ @@ -130,11 +123,6 @@ xrep_rtrmap_stash( uint64_t offset, unsigned int flags) { - struct xrep_rtrmap_extent rre = { - .startblock = startblock, - .blockcount = blockcount, - .owner = owner, - }; struct xfs_rmap_irec rmap = { .rm_startblock = startblock, .rm_blockcount = blockcount, @@ -143,6 +131,8 @@ xrep_rtrmap_stash( .rm_flags = flags, }; struct xfs_scrub *sc = rr->sc; + struct xfs_btree_cur *mcur; + struct xfs_buf *mhead_bp; int error = 0; if (xchk_should_terminate(sc, &error)) @@ -150,8 +140,23 @@ xrep_rtrmap_stash( trace_xrep_rtrmap_found(sc->mp, &rmap); - rre.offset = xfs_rmap_irec_offset_pack(&rmap); - return xfarray_append(rr->rtrmap_records, &rre); + /* Add entry to in-memory btree. */ + error = xfbtree_head_read_buf(rr->rtrmap_btree, sc->tp, &mhead_bp); + if (error) + return error; + + mcur = xfs_rtrmapbt_mem_cursor(sc->sr.rtg, sc->tp, mhead_bp, + rr->rtrmap_btree); + error = xfs_rmap_map_raw(mcur, &rmap); + xfs_btree_del_cursor(mcur, error); + if (error) + goto out_cancel; + + return xfbtree_trans_commit(rr->rtrmap_btree, sc->tp); + +out_cancel: + xfbtree_trans_cancel(rr->rtrmap_btree, sc->tp); + return error; } /* Finding all file and bmbt extents. 
*/ @@ -395,6 +400,24 @@ xrep_rtrmap_scan_ag( return error; } +/* Count and check all collected records. */ +STATIC int +xrep_rtrmap_check_record( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xrep_rtrmap *rr = priv; + int error; + + error = xrep_rtrmap_check_mapping(rr->sc, rec); + if (error) + return error; + + rr->nr_records++; + return 0; +} + STATIC int xrep_rtrmap_find_super_rmaps( struct xrep_rtrmap *rr) @@ -414,6 +437,8 @@ xrep_rtrmap_find_rmaps( struct xfs_scrub *sc = rr->sc; struct xfs_perag *pag; struct xfs_inode *ip; + struct xfs_buf *mhead_bp; + struct xfs_btree_cur *mcur; xfs_agnumber_t agno; int error; @@ -476,7 +501,25 @@ xrep_rtrmap_find_rmaps( } } - return 0; + /* + * Now that we have everything locked again, we need to count the + * number of rmap records stashed in the btree. This should reflect + * all actively-owned rt files in the filesystem. At the same time, + * check all our records before we start building a new btree, which + * requires the rtbitmap lock. + */ + error = xfbtree_head_read_buf(rr->rtrmap_btree, NULL, &mhead_bp); + if (error) + return error; + + mcur = xfs_rtrmapbt_mem_cursor(rr->sc->sr.rtg, NULL, mhead_bp, + rr->rtrmap_btree); + rr->nr_records = 0; + error = xfs_rmap_query_all(mcur, xrep_rtrmap_check_record, rr); + xfs_btree_del_cursor(mcur, error); + xfs_buf_relse(mhead_bp); + + return error; } /* Building the new rtrmap btree. 
*/ @@ -490,29 +533,25 @@ xrep_rtrmap_get_records( unsigned int nr_wanted, void *priv) { - struct xrep_rtrmap_extent rec; - struct xfs_rmap_irec *irec = &cur->bc_rec.r; struct xrep_rtrmap *rr = priv; union xfs_btree_rec *block_rec; unsigned int loaded; int error; for (loaded = 0; loaded < nr_wanted; loaded++, idx++) { - error = xfarray_load_next(rr->rtrmap_records, &rr->array_cur, - &rec); + int stat = 0; + + error = xfs_btree_increment(rr->mcur, 0, &stat); if (error) return error; - - irec->rm_startblock = rec.startblock; - irec->rm_blockcount = rec.blockcount; - irec->rm_owner = rec.owner; - - if (xfs_rmap_irec_offset_unpack(rec.offset, irec) != NULL) + if (!stat) return -EFSCORRUPTED; - error = xrep_rtrmap_check_mapping(rr->sc, irec); + error = xfs_rmap_get_rec(rr->mcur, &cur->bc_rec.r, &stat); if (error) return error; + if (!stat) + return -EFSCORRUPTED; block_rec = xfs_btree_rec_addr(cur, idx, block); cur->bc_ops->init_rec_from_cur(cur, block_rec); @@ -558,7 +597,7 @@ xrep_rtrmap_build_new_tree( struct xfs_scrub *sc = rr->sc; struct xfs_rtgroup *rtg = sc->sr.rtg; struct xfs_btree_cur *rmap_cur; - uint64_t nr_records; + struct xfs_buf *mhead_bp; int error; /* @@ -579,11 +618,9 @@ xrep_rtrmap_build_new_tree( rmap_cur = xfs_rtrmapbt_stage_cursor(sc->mp, rtg, rtg->rtg_rmapip, &rr->new_btree.ifake); - nr_records = xfarray_length(rr->rtrmap_records); - /* Compute how many blocks we'll need for the rmaps collected. */ error = xfs_btree_bload_compute_geometry(rmap_cur, - &rr->new_btree.bload, nr_records); + &rr->new_btree.bload, rr->nr_records); if (error) goto err_cur; @@ -609,12 +646,25 @@ xrep_rtrmap_build_new_tree( if (error) goto err_cur; + /* + * Create a cursor to the in-memory btree so that we can bulk load the + * new btree. 
+ */ + error = xfbtree_head_read_buf(rr->rtrmap_btree, NULL, &mhead_bp); + if (error) + goto err_cur; + + rr->mcur = xfs_rtrmapbt_mem_cursor(sc->sr.rtg, NULL, mhead_bp, + rr->rtrmap_btree); + error = xfs_btree_goto_left_edge(rr->mcur); + if (error) + goto err_mcur; + /* Add all observed rmap records. */ rr->new_btree.ifake.if_fork->if_format = XFS_DINODE_FMT_RMAP; - rr->array_cur = XFARRAY_CURSOR_INIT; error = xfs_btree_bload(rmap_cur, &rr->new_btree.bload, rr); if (error) - goto err_cur; + goto err_mcur; /* * Install the new rtrmap btree in the inode. After this point the old @@ -624,6 +674,15 @@ xrep_rtrmap_build_new_tree( xfs_rtrmapbt_commit_staged_btree(rmap_cur, sc->tp); xrep_inode_set_nblocks(rr->sc, rr->new_btree.ifake.if_blocks); xfs_btree_del_cursor(rmap_cur, 0); + xfs_btree_del_cursor(rr->mcur, 0); + rr->mcur = NULL; + xfs_buf_relse(mhead_bp); + + /* + * Now that we've written the new btree to disk, we don't need to keep + * updating the in-memory btree. Abort the scan to stop live updates. + */ + xchk_iscan_abort(&rr->iscan); /* Dispose of any unused blocks and the accounting information. 
*/ error = xrep_newbt_commit(&rr->new_btree); @@ -632,6 +691,9 @@ xrep_rtrmap_build_new_tree( return xrep_roll_trans(sc); +err_mcur: + xfs_btree_del_cursor(rr->mcur, error); + xfs_buf_relse(mhead_bp); err_cur: xfs_btree_del_cursor(rmap_cur, error); xrep_newbt_cancel(&rr->new_btree); @@ -689,8 +751,8 @@ xrep_rtrmapbt( xfsb_bitmap_init(&rr->old_rtrmapbt_blocks); /* Set up some storage */ - error = xfarray_create(sc->mp, "rtrmap records", 0, - sizeof(struct xrep_rtrmap_extent), &rr->rtrmap_records); + error = xfs_rtrmapbt_mem_create(sc->mp, sc->sr.rtg->rtg_rgno, + sc->xfile_buftarg, &rr->rtrmap_btree); if (error) goto out_bitmap; @@ -714,7 +776,7 @@ xrep_rtrmapbt( out_records: xchk_iscan_finish(&rr->iscan); - xfarray_destroy(rr->rtrmap_records); + xfbtree_destroy(rr->rtrmap_btree); out_bitmap: xfsb_bitmap_destroy(&rr->old_rtrmapbt_blocks); kfree(rr); diff --git a/fs/xfs/scrub/xfbtree.c b/fs/xfs/scrub/xfbtree.c index 55d530213d40..d803bb1d151a 100644 --- a/fs/xfs/scrub/xfbtree.c +++ b/fs/xfs/scrub/xfbtree.c @@ -17,6 +17,7 @@ #include "xfs_error.h" #include "xfs_btree_mem.h" #include "xfs_ag.h" +#include "xfs_rtgroup.h" #include "scrub/scrub.h" #include "scrub/xfile.h" #include "scrub/xfbtree.h" @@ -267,6 +268,8 @@ xfbtree_dup_cursor( if (cur->bc_mem.pag) ncur->bc_mem.pag = xfs_perag_bump(cur->bc_mem.pag); + if (cur->bc_mem.rtg) + ncur->bc_mem.rtg = xfs_rtgroup_bump(cur->bc_mem.rtg); return ncur; } From patchwork Fri Dec 30 22:18:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 13085502 Subject: [PATCH 37/38] xfs: hook live realtime rmap operations during a repair operation From: "Darrick J.
Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Fri, 30 Dec 2022 14:18:21 -0800 Message-ID: <167243870131.715303.4124505167326518034.stgit@magnolia> In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia> References: <167243869558.715303.13347105677486333748.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Hook the regular realtime rmap code when an rtrmapbt repair operation is running so that we can unlock the AGF buffer to scan the filesystem and keep the in-memory btree up to date during the scan. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_rmap.c | 39 ++++++++++-- fs/xfs/libxfs/xfs_rmap.h | 6 ++ fs/xfs/libxfs/xfs_rtgroup.c | 2 - fs/xfs/libxfs/xfs_rtgroup.h | 3 + fs/xfs/scrub/rtrmap_repair.c | 138 ++++++++++++++++++++++++++++++++++++++++-- fs/xfs/scrub/trace.h | 36 +++++++++++ 6 files changed, 211 insertions(+), 13 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 06840fc31f02..a533588a9b5b 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -906,6 +906,7 @@ static inline void xfs_rmap_update_hook( struct xfs_trans *tp, struct xfs_perag *pag, + struct xfs_rtgroup *rtg, enum xfs_rmap_intent_type op, xfs_agblock_t startblock, xfs_extlen_t blockcount, @@ -922,6 +923,8 @@ xfs_rmap_update_hook( if (pag) xfs_hooks_call(&pag->pag_rmap_update_hooks, op, &p); + else if (rtg) + xfs_hooks_call(&rtg->rtg_rmap_update_hooks, op, &p); } } @@ -942,8 +945,28 @@ xfs_rmap_hook_del( { xfs_hooks_del(&pag->pag_rmap_update_hooks, &hook->update_hook); } + +# ifdef CONFIG_XFS_RT +/* Call the specified function during a rt reverse mapping update. */ +int +xfs_rtrmap_hook_add( + struct xfs_rtgroup *rtg, + struct xfs_rmap_hook *hook) +{ + return xfs_hooks_add(&rtg->rtg_rmap_update_hooks, &hook->update_hook); +} + +/* Stop calling the specified function during a rt reverse mapping update. 
*/ +void +xfs_rtrmap_hook_del( + struct xfs_rtgroup *rtg, + struct xfs_rmap_hook *hook) +{ + xfs_hooks_del(&rtg->rtg_rmap_update_hooks, &hook->update_hook); +} +# endif /* CONFIG_XFS_RT */ #else -# define xfs_rmap_update_hook(t, p, o, s, b, u, oi) do { } while(0) +# define xfs_rmap_update_hook(t, p, r, o, s, b, u, oi) do { } while(0) #endif /* CONFIG_XFS_LIVE_HOOKS */ /* @@ -966,7 +989,8 @@ xfs_rmap_free( return 0; cur = xfs_rmapbt_init_cursor(mp, tp, agbp, pag); - xfs_rmap_update_hook(tp, pag, XFS_RMAP_UNMAP, bno, len, false, oinfo); + xfs_rmap_update_hook(tp, pag, NULL, XFS_RMAP_UNMAP, bno, len, false, + oinfo); error = xfs_rmap_unmap(cur, bno, len, false, oinfo); xfs_btree_del_cursor(cur, error); @@ -1210,7 +1234,8 @@ xfs_rmap_alloc( return 0; cur = xfs_rmapbt_init_cursor(mp, tp, agbp, pag); - xfs_rmap_update_hook(tp, pag, XFS_RMAP_MAP, bno, len, false, oinfo); + xfs_rmap_update_hook(tp, pag, NULL, XFS_RMAP_MAP, bno, len, false, + oinfo); error = xfs_rmap_map(cur, bno, len, false, oinfo); xfs_btree_del_cursor(cur, error); @@ -2731,8 +2756,12 @@ xfs_rmap_finish_one( if (error) return error; - xfs_rmap_update_hook(tp, ri->ri_pag, ri->ri_type, bno, - ri->ri_bmap.br_blockcount, unwritten, &oinfo); + if (ri->ri_realtime) + xfs_rmap_update_hook(tp, NULL, ri->ri_rtg, ri->ri_type, bno, + ri->ri_bmap.br_blockcount, unwritten, &oinfo); + else + xfs_rmap_update_hook(tp, ri->ri_pag, NULL, ri->ri_type, bno, + ri->ri_bmap.br_blockcount, unwritten, &oinfo); return 0; } diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h index 9d0aaa16f551..36d071b3b44c 100644 --- a/fs/xfs/libxfs/xfs_rmap.h +++ b/fs/xfs/libxfs/xfs_rmap.h @@ -279,6 +279,12 @@ void xfs_rmap_hook_enable(void); int xfs_rmap_hook_add(struct xfs_perag *pag, struct xfs_rmap_hook *hook); void xfs_rmap_hook_del(struct xfs_perag *pag, struct xfs_rmap_hook *hook); + +# ifdef CONFIG_XFS_RT +int xfs_rtrmap_hook_add(struct xfs_rtgroup *rtg, struct xfs_rmap_hook *hook); +void xfs_rtrmap_hook_del(struct 
xfs_rtgroup *rtg, struct xfs_rmap_hook *hook); +# endif /* CONFIG_XFS_RT */ + #endif #endif /* __XFS_RMAP_H__ */ diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index e40806c84256..bd878e65bc44 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -133,7 +133,7 @@ xfs_initialize_rtgroups( /* Place kernel structure only init below this point. */ spin_lock_init(&rtg->rtg_state_lock); xfs_drain_init(&rtg->rtg_intents); - + xfs_hooks_init(&rtg->rtg_rmap_update_hooks); #endif /* __KERNEL__ */ /* first new rtg is fully initialized */ diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 1d41a2cac34f..4e9b9098f2f2 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -46,6 +46,9 @@ struct xfs_rtgroup { * inconsistencies. */ struct xfs_drain rtg_intents; + + /* Hook to feed rt rmapbt updates to an active online repair. */ + struct xfs_hooks rtg_rmap_update_hooks; #endif /* __KERNEL__ */ }; diff --git a/fs/xfs/scrub/rtrmap_repair.c b/fs/xfs/scrub/rtrmap_repair.c index 5775efa67de6..e26847784d21 100644 --- a/fs/xfs/scrub/rtrmap_repair.c +++ b/fs/xfs/scrub/rtrmap_repair.c @@ -70,6 +70,8 @@ int xrep_setup_rtrmapbt( struct xfs_scrub *sc) { + xchk_fshooks_enable(sc, XCHK_FSHOOKS_RMAP); + return xrep_setup_buftarg(sc, "rtrmapbt repair"); } @@ -78,6 +80,9 @@ struct xrep_rtrmap { /* new rtrmapbt information */ struct xrep_newbt new_btree; + /* lock for the xfbtree and xfile */ + struct mutex lock; + /* rmap records generated from primary metadata */ struct xfbtree *rtrmap_btree; @@ -86,6 +91,9 @@ struct xrep_rtrmap { /* bitmap of old rtrmapbt blocks */ struct xfsb_bitmap old_rtrmapbt_blocks; + /* Hooks into rtrmap update code. 
*/ + struct xfs_rmap_hook hooks; + /* inode scan cursor */ struct xchk_iscan iscan; @@ -138,12 +146,16 @@ xrep_rtrmap_stash( if (xchk_should_terminate(sc, &error)) return error; + if (xchk_iscan_aborted(&rr->iscan)) + return -EFSCORRUPTED; + trace_xrep_rtrmap_found(sc->mp, &rmap); /* Add entry to in-memory btree. */ + mutex_lock(&rr->lock); error = xfbtree_head_read_buf(rr->rtrmap_btree, sc->tp, &mhead_bp); if (error) - return error; + goto out_abort; mcur = xfs_rtrmapbt_mem_cursor(sc->sr.rtg, sc->tp, mhead_bp, rr->rtrmap_btree); @@ -152,10 +164,18 @@ xrep_rtrmap_stash( if (error) goto out_cancel; - return xfbtree_trans_commit(rr->rtrmap_btree, sc->tp); + error = xfbtree_trans_commit(rr->rtrmap_btree, sc->tp); + if (error) + goto out_abort; + + mutex_unlock(&rr->lock); + return 0; out_cancel: xfbtree_trans_cancel(rr->rtrmap_btree, sc->tp); +out_abort: + xchk_iscan_abort(&rr->iscan); + mutex_unlock(&rr->lock); return error; } @@ -492,6 +512,13 @@ xrep_rtrmap_find_rmaps( if (error) return error; + /* + * If a hook failed to update the in-memory btree, we lack the data to + * continue the repair. + */ + if (xchk_iscan_aborted(&rr->iscan)) + return -EFSCORRUPTED; + /* Scan for old rtrmap blocks. */ for_each_perag(sc->mp, agno, pag) { error = xrep_rtrmap_scan_ag(rr, pag); @@ -727,6 +754,89 @@ xrep_rtrmap_remove_old_tree( return xrep_reset_imeta_reservation(rr->sc); } +static inline bool +xrep_rtrmapbt_want_live_update( + struct xchk_iscan *iscan, + const struct xfs_owner_info *oi) +{ + if (xchk_iscan_aborted(iscan)) + return false; + + /* + * We scanned the CoW staging extents before we started the iscan, so + * we need all the updates. + */ + if (XFS_RMAP_NON_INODE_OWNER(oi->oi_owner)) + return true; + + /* Ignore updates to files that the scanner hasn't visited yet. */ + return xchk_iscan_want_live_update(iscan, oi->oi_owner); +} + +/* + * Apply a rtrmapbt update from the regular filesystem into our shadow btree. 
+ * We're running from the thread that owns the rtrmap ILOCK and is generating
+ * the update, so we must be careful about which parts of the struct
+ * xrep_rtrmap we change.
+ */
+static int
+xrep_rtrmapbt_live_update(
+	struct xfs_hook			*update_hook,
+	unsigned long			action,
+	void				*data)
+{
+	struct xfs_rmap_update_params	*p = data;
+	struct xrep_rtrmap		*rr;
+	struct xfs_mount		*mp;
+	struct xfs_btree_cur		*mcur;
+	struct xfs_buf			*mhead_bp;
+	struct xfs_trans		*tp;
+	void				*txcookie;
+	int				error;
+
+	rr = container_of(update_hook, struct xrep_rtrmap, hooks.update_hook);
+	mp = rr->sc->mp;
+
+	if (!xrep_rtrmapbt_want_live_update(&rr->iscan, &p->oinfo))
+		goto out_unlock;
+
+	trace_xrep_rtrmap_live_update(mp, rr->sc->sr.rtg->rtg_rgno, action, p);
+
+	error = xrep_trans_alloc_hook_dummy(mp, &txcookie, &tp);
+	if (error)
+		goto out_abort;
+
+	mutex_lock(&rr->lock);
+	error = xfbtree_head_read_buf(rr->rtrmap_btree, tp, &mhead_bp);
+	if (error)
+		goto out_cancel;
+
+	mcur = xfs_rtrmapbt_mem_cursor(rr->sc->sr.rtg, tp, mhead_bp,
+			rr->rtrmap_btree);
+	error = __xfs_rmap_finish_intent(mcur, action, p->startblock,
+			p->blockcount, &p->oinfo, p->unwritten);
+	xfs_btree_del_cursor(mcur, error);
+	if (error)
+		goto out_cancel;
+
+	error = xfbtree_trans_commit(rr->rtrmap_btree, tp);
+	if (error)
+		goto out_cancel;
+
+	xrep_trans_cancel_hook_dummy(&txcookie, tp);
+	mutex_unlock(&rr->lock);
+	return NOTIFY_DONE;
+
+out_cancel:
+	xfbtree_trans_cancel(rr->rtrmap_btree, tp);
+	xrep_trans_cancel_hook_dummy(&txcookie, tp);
+	mutex_unlock(&rr->lock);
+out_abort:
+	xchk_iscan_abort(&rr->iscan);
+out_unlock:
+	return NOTIFY_DONE;
+}
+
 /* Repair the realtime rmap btree. */
 int
 xrep_rtrmapbt(
@@ -735,9 +845,6 @@ xrep_rtrmapbt(
 	struct xrep_rtrmap	*rr;
 	int			error;
 
-	/* Functionality is not yet complete. */
-	return xrep_notsupported(sc);
-
 	/* Make sure any problems with the fork are fixed. */
 	error = xrep_metadata_inode_forks(sc);
 	if (error)
@@ -748,6 +855,7 @@ xrep_rtrmapbt(
 		return -ENOMEM;
 	rr->sc = sc;
 
+	mutex_init(&rr->lock);
 	xfsb_bitmap_init(&rr->old_rtrmapbt_blocks);
 
 	/* Set up some storage */
@@ -759,26 +867,42 @@ xrep_rtrmapbt(
 	/* Retry iget every tenth of a second for up to 30 seconds. */
 	xchk_iscan_start(&rr->iscan, 30000, 100);
 
+	/*
+	 * Hook into live rtrmap operations so that we can update our in-memory
+	 * btree to reflect live changes on the filesystem.  Since we drop the
+	 * rtrmap ILOCK to scan all the inodes, we need this piece to avoid
+	 * installing a stale btree.
+	 */
+	ASSERT(sc->flags & XCHK_FSHOOKS_RMAP);
+	xfs_hook_setup(&rr->hooks.update_hook, xrep_rtrmapbt_live_update);
+	error = xfs_rtrmap_hook_add(sc->sr.rtg, &rr->hooks);
+	if (error)
+		goto out_records;
+
 	/* Collect rmaps for realtime files. */
 	error = xrep_rtrmap_find_rmaps(rr);
 	if (error)
-		goto out_records;
+		goto out_hook;
 
 	xfs_trans_ijoin(sc->tp, sc->ip, 0);
 
 	/* Rebuild the rtrmap information. */
 	error = xrep_rtrmap_build_new_tree(rr);
 	if (error)
-		goto out_records;
+		goto out_hook;
 
 	/* Kill the old tree. */
 	error = xrep_rtrmap_remove_old_tree(rr);
 
+out_hook:
+	xchk_iscan_abort(&rr->iscan);
+	xfs_rtrmap_hook_del(sc->sr.rtg, &rr->hooks);
 out_records:
 	xchk_iscan_finish(&rr->iscan);
 	xfbtree_destroy(rr->rtrmap_btree);
 out_bitmap:
 	xfsb_bitmap_destroy(&rr->old_rtrmapbt_blocks);
+	mutex_destroy(&rr->lock);
 	kfree(rr);
 	return error;
 }
diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h
index 654cbcbd99ea..4cf8180173ca 100644
--- a/fs/xfs/scrub/trace.h
+++ b/fs/xfs/scrub/trace.h
@@ -3024,6 +3024,42 @@ TRACE_EVENT(xrep_rtrmap_found,
 		  __entry->offset,
 		  __entry->flags)
 );
+
+TRACE_EVENT(xrep_rtrmap_live_update,
+	TP_PROTO(struct xfs_mount *mp, xfs_rgnumber_t rgno, unsigned int op,
+		 const struct xfs_rmap_update_params *p),
+	TP_ARGS(mp, rgno, op, p),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_rgnumber_t, rgno)
+		__field(unsigned int, op)
+		__field(xfs_rgblock_t, rgbno)
+		__field(xfs_extlen_t, len)
+		__field(uint64_t, owner)
+		__field(uint64_t, offset)
+		__field(unsigned int, flags)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->rgno = rgno;
+		__entry->op = op;
+		__entry->rgbno = p->startblock;
+		__entry->len = p->blockcount;
+		xfs_owner_info_unpack(&p->oinfo, &__entry->owner,
+				&__entry->offset, &__entry->flags);
+		if (p->unwritten)
+			__entry->flags |= XFS_RMAP_UNWRITTEN;
+	),
+	TP_printk("dev %d:%d rgno 0x%x op %s rgbno 0x%x fsbcount 0x%x owner 0x%llx fileoff 0x%llx flags 0x%x",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->rgno,
+		  __print_symbolic(__entry->op, XFS_RMAP_INTENT_STRINGS),
+		  __entry->rgbno,
+		  __entry->len,
+		  __entry->owner,
+		  __entry->offset,
+		  __entry->flags)
+);
 #endif /* CONFIG_XFS_RT */
 
 #endif /* IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) */

From patchwork Fri Dec 30 22:18:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 13085503
Subject: [PATCH 38/38] xfs: enable realtime rmap btree
From: "Darrick J.
Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Fri, 30 Dec 2022 14:18:21 -0800
Message-ID: <167243870146.715303.3590524752697246789.stgit@magnolia>
In-Reply-To: <167243869558.715303.13347105677486333748.stgit@magnolia>
References: <167243869558.715303.13347105677486333748.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Signed-off-by: Darrick J. Wong
---
 fs/xfs/xfs_super.c |    6 ------
 1 file changed, 6 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index e145de0bd562..4abeff701093 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1669,12 +1669,6 @@ xfs_fs_fill_super(
 		}
 	}
 
-	if (xfs_has_rmapbt(mp) && mp->m_sb.sb_rblocks) {
-		xfs_alert(mp,
-	"reverse mapping btree not compatible with realtime device!");
-		error = -EINVAL;
-		goto out_filestream_unmount;
-	}
 	if (xfs_has_large_extent_counts(mp))
 		xfs_warn(mp,