From patchwork Thu Dec 19 19:33:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915599 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B99C81990BA for ; Thu, 19 Dec 2024 19:33:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636798; cv=none; b=Lpa3M3lB4GQ/2rP0GV5TPIoqPdO1+rbMyjEs+NzIwMp06LBdsGBapRDTsIkX2x9E6xzr8on5QjX35qRyymfHCGs6JMZfEXhoZkfE0mpmQ9bie2sho6VXi81YwM9PWEbfYcl2BC7iYpNksShCg9sVa/9ydOAb1XJeGS2bKVNmqQU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636798; c=relaxed/simple; bh=QBA9XSPJrZgJGhXQmr4phvUEQdYz3U7OI3v/85PPfOE=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=PQdLzjnWSMU1cZVRomvD/4uWb19kU60fCq7m73j+Tm7UTH6M6QA2/c+ZoXnXSd/Xvv+CQuv6L0yWJg8tvdc5P4ubtdU+fdPYD/IrOhssE9wMWLeiP1j0Vl9ZQsMSTWNuqwPDZcmv3aUd0csv2CVokZTu+cYsMJDlSxAdrGQ546w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dRXto1Pg; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dRXto1Pg" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 58FB5C4CED0; Thu, 19 Dec 2024 19:33:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636798; bh=QBA9XSPJrZgJGhXQmr4phvUEQdYz3U7OI3v/85PPfOE=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=dRXto1PgmpE8HCzoALmka9PNQkk1yHmWHrnW2OIZfguRoFwmoSJDHIZ8uzZ5DB0FF XbyoPYLZqdEAohnKmptLVKqM//QyYXYlUVZyflRlAsI/iPJiRCQKjZOuJKsUfxb/LU ohWmRq75vs5MJM9hfDr7yDBYcJDRJzWXiU5K4lBrJfzmzIRDr47fBo+LkMyusya1xg NJ7dxyUVtp0X1QumLSRK8TGD2L6Rw7/sa98yOQbBugVrO06xJC64pB23qQCXJ3WWCJ jEvuaMMO9qSI90/aNkSrfzj3L8VycrLPWDhoe2xc7QCcnZ6YbTvVPUuhoi6znp3dwa dx5bwzdgbZuyw== Date: Thu, 19 Dec 2024 11:33:17 -0800 Subject: [PATCH 01/43] xfs: prepare refcount btree cursor tracepoints for realtime From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463580993.1572761.7571007019155789878.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Rework the refcount btree cursor tracepoints in preparation to handle the realtime refcount btree cursor. Mostly this involves renaming the field to "refcbno" and extracting the group number from the cursor when possible. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_refcount_item.c | 4 +- fs/xfs/xfs_trace.h | 111 +++++++++++++++++++++++++++----------------- 2 files changed, 70 insertions(+), 45 deletions(-) diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index bede1c96c33011..c807c4b90c44e3 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -328,9 +328,9 @@ xfs_refcount_defer_add( { struct xfs_mount *mp = tp->t_mountp; - trace_xfs_refcount_defer(mp, ri); - ri->ri_group = xfs_group_intent_get(mp, ri->ri_startblock, XG_TYPE_AG); + + trace_xfs_refcount_defer(mp, ri); xfs_defer_add(tp, &ri->ri_list, &xfs_refcount_update_defer_type); } diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 84cdc145e2d96a..4fe689410eb6ae 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3305,56 +3305,62 @@ TRACE_EVENT(xfs_ag_resv_init_error, /* refcount tracepoint classes */ DECLARE_EVENT_CLASS(xfs_refcount_class, - TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t agbno, + TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t gbno, xfs_extlen_t len), - TP_ARGS(cur, agbno, len), + TP_ARGS(cur, gbno, len), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, gbno) __field(xfs_extlen_t, len) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; + __entry->type = cur->bc_group->xg_type; __entry->agno = cur->bc_group->xg_gno; - __entry->agbno = agbno; + __entry->gbno = gbno; __entry->len = len; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x fsbcount 0x%x", + TP_printk("dev %d:%d %sno 0x%x gbno 0x%x fsbcount 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, - __entry->agbno, + __entry->gbno, __entry->len) ); #define DEFINE_REFCOUNT_EVENT(name) \ DEFINE_EVENT(xfs_refcount_class, name, \ - TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t agbno, \ + TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t gbno, \ xfs_extlen_t len), \ - TP_ARGS(cur, agbno, len)) + TP_ARGS(cur, gbno, len)) TRACE_DEFINE_ENUM(XFS_LOOKUP_EQi); TRACE_DEFINE_ENUM(XFS_LOOKUP_LEi); TRACE_DEFINE_ENUM(XFS_LOOKUP_GEi); TRACE_EVENT(xfs_refcount_lookup, - TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t agbno, + TP_PROTO(struct xfs_btree_cur *cur, xfs_agblock_t gbno, xfs_lookup_t dir), - TP_ARGS(cur, agbno, dir), + TP_ARGS(cur, gbno, dir), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, gbno) __field(xfs_lookup_t, dir) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; + __entry->type = cur->bc_group->xg_type; __entry->agno = cur->bc_group->xg_gno; - __entry->agbno = agbno; + __entry->gbno = gbno; __entry->dir = dir; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x cmp %s(%d)", + TP_printk("dev %d:%d %sno 0x%x gbno 0x%x cmp %s(%d)", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, - __entry->agbno, + __entry->gbno, __print_symbolic(__entry->dir, XFS_AG_BTREE_CMP_FORMAT_STR), __entry->dir) ) @@ -3365,6 +3371,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_extent_class, TP_ARGS(cur, irec), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, domain) __field(xfs_agblock_t, startblock) @@ -3373,14 +3380,16 @@ DECLARE_EVENT_CLASS(xfs_refcount_extent_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; + __entry->type = cur->bc_group->xg_type; __entry->agno = cur->bc_group->xg_gno; __entry->domain = irec->rc_domain; __entry->startblock = irec->rc_startblock; __entry->blockcount = irec->rc_blockcount; __entry->refcount = irec->rc_refcount; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u", + TP_printk("dev %d:%d %sno 0x%x dom %s gbno 0x%x fsbcount 0x%x refcount %u", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, __print_symbolic(__entry->domain, XFS_REFC_DOMAIN_STRINGS), __entry->startblock, @@ -3396,49 +3405,53 @@ DEFINE_EVENT(xfs_refcount_extent_class, name, \ /* single-rcext and an agbno tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_extent_at_class, TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *irec, - xfs_agblock_t agbno), - TP_ARGS(cur, irec, agbno), + xfs_agblock_t gbno), + TP_ARGS(cur, irec, gbno), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, domain) __field(xfs_agblock_t, startblock) __field(xfs_extlen_t, blockcount) __field(xfs_nlink_t, refcount) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, gbno) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; + __entry->type = cur->bc_group->xg_type; __entry->agno = cur->bc_group->xg_gno; __entry->domain = irec->rc_domain; __entry->startblock = irec->rc_startblock; __entry->blockcount = irec->rc_blockcount; __entry->refcount = irec->rc_refcount; - __entry->agbno = agbno; + __entry->gbno = gbno; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u @ agbno 0x%x", + TP_printk("dev %d:%d %sno 0x%x dom %s gbno 0x%x fsbcount 0x%x refcount %u @ gbno 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, __print_symbolic(__entry->domain, XFS_REFC_DOMAIN_STRINGS), __entry->startblock, __entry->blockcount, __entry->refcount, - __entry->agbno) + __entry->gbno) ) #define DEFINE_REFCOUNT_EXTENT_AT_EVENT(name) \ DEFINE_EVENT(xfs_refcount_extent_at_class, name, \ TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *irec, \ - xfs_agblock_t agbno), \ - TP_ARGS(cur, irec, agbno)) + xfs_agblock_t gbno), \ + TP_ARGS(cur, irec, gbno)) /* double-rcext tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_class, TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *i1, - struct xfs_refcount_irec *i2), + struct xfs_refcount_irec *i2), TP_ARGS(cur, i1, i2), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, i1_domain) __field(xfs_agblock_t, i1_startblock) @@ -3451,6 +3464,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; + __entry->type = cur->bc_group->xg_type; __entry->agno = cur->bc_group->xg_gno; __entry->i1_domain = i1->rc_domain; __entry->i1_startblock = i1->rc_startblock; @@ -3461,9 +3475,10 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_class, __entry->i2_blockcount = i2->rc_blockcount; __entry->i2_refcount = i2->rc_refcount; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u", + TP_printk("dev %d:%d %sno 0x%x dom %s gbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s gbno 0x%x fsbcount 0x%x refcount %u", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, __print_symbolic(__entry->i1_domain, XFS_REFC_DOMAIN_STRINGS), __entry->i1_startblock, @@ -3484,10 +3499,11 @@ DEFINE_EVENT(xfs_refcount_double_extent_class, name, \ /* double-rcext and an agbno tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *i1, - struct xfs_refcount_irec *i2, xfs_agblock_t agbno), - TP_ARGS(cur, i1, i2, agbno), + struct xfs_refcount_irec *i2, xfs_agblock_t gbno), + TP_ARGS(cur, i1, i2, gbno), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, i1_domain) __field(xfs_agblock_t, i1_startblock) @@ -3497,10 +3513,11 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, __field(xfs_agblock_t, i2_startblock) __field(xfs_extlen_t, i2_blockcount) __field(xfs_nlink_t, i2_refcount) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, gbno) ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; + __entry->type = cur->bc_group->xg_type; __entry->agno = cur->bc_group->xg_gno; __entry->i1_domain = i1->rc_domain; __entry->i1_startblock = i1->rc_startblock; @@ -3510,11 +3527,12 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, __entry->i2_startblock = i2->rc_startblock; __entry->i2_blockcount = i2->rc_blockcount; __entry->i2_refcount = i2->rc_refcount; - __entry->agbno = agbno; + __entry->gbno = gbno; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u @ agbno 0x%x", + TP_printk("dev %d:%d %sno 0x%x dom %s gbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s gbno 0x%x fsbcount 0x%x refcount %u @ gbno 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, __print_symbolic(__entry->i1_domain, XFS_REFC_DOMAIN_STRINGS), __entry->i1_startblock, @@ -3524,14 +3542,14 @@ DECLARE_EVENT_CLASS(xfs_refcount_double_extent_at_class, __entry->i2_startblock, __entry->i2_blockcount, __entry->i2_refcount, - __entry->agbno) + __entry->gbno) ) #define DEFINE_REFCOUNT_DOUBLE_EXTENT_AT_EVENT(name) \ DEFINE_EVENT(xfs_refcount_double_extent_at_class, name, \ TP_PROTO(struct xfs_btree_cur *cur, struct xfs_refcount_irec *i1, \ - struct xfs_refcount_irec *i2, xfs_agblock_t agbno), \ - TP_ARGS(cur, i1, i2, agbno)) + struct xfs_refcount_irec *i2, xfs_agblock_t gbno), \ + TP_ARGS(cur, i1, i2, gbno)) /* triple-rcext tracepoint class */ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, @@ -3540,6 +3558,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, TP_ARGS(cur, i1, i2, i3), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, i1_domain) __field(xfs_agblock_t, i1_startblock) @@ -3556,6 +3575,7 @@ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, ), TP_fast_assign( __entry->dev = cur->bc_mp->m_super->s_dev; + __entry->type = cur->bc_group->xg_type; __entry->agno = cur->bc_group->xg_gno; __entry->i1_domain = i1->rc_domain; __entry->i1_startblock = i1->rc_startblock; @@ -3570,10 +3590,11 @@ DECLARE_EVENT_CLASS(xfs_refcount_triple_extent_class, __entry->i3_blockcount = i3->rc_blockcount; __entry->i3_refcount = i3->rc_refcount; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u -- " - "dom %s agbno 0x%x fsbcount 0x%x refcount %u", + TP_printk("dev %d:%d %sno 0x%x dom %s gbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s gbno 0x%x fsbcount 0x%x refcount %u -- " + "dom %s gbno 0x%x fsbcount 0x%x refcount %u", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, __print_symbolic(__entry->i1_domain, XFS_REFC_DOMAIN_STRINGS), __entry->i1_startblock, @@ -3641,23 +3662,27 @@ DECLARE_EVENT_CLASS(xfs_refcount_deferred_class, TP_ARGS(mp, refc), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(int, op) - __field(xfs_agblock_t, agbno) + __field(xfs_agblock_t, gbno) __field(xfs_extlen_t, len) ), TP_fast_assign( __entry->dev = mp->m_super->s_dev; - __entry->agno = XFS_FSB_TO_AGNO(mp, refc->ri_startblock); + __entry->type = refc->ri_group->xg_type; + __entry->agno = refc->ri_group->xg_gno; __entry->op = refc->ri_type; - __entry->agbno = XFS_FSB_TO_AGBNO(mp, refc->ri_startblock); + __entry->gbno = xfs_fsb_to_gbno(mp, refc->ri_startblock, + refc->ri_group->xg_type); __entry->len = refc->ri_blockcount; ), - TP_printk("dev %d:%d op %s agno 0x%x agbno 0x%x fsbcount 0x%x", + TP_printk("dev %d:%d op %s %sno 0x%x gbno 0x%x fsbcount 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), __print_symbolic(__entry->op, XFS_REFCOUNT_INTENT_STRINGS), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, - __entry->agbno, + __entry->gbno, __entry->len) ); #define DEFINE_REFCOUNT_DEFERRED_EVENT(name) \ From patchwork Thu Dec 19 19:33:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915600 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 172041B043D for ; Thu, 19 Dec 2024 19:33:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636814; cv=none; b=NUqUk9oB+HQJKJZTpokX5Te0nurAn0DHVsQH51gFDCmOGWh6LjttSCU5ikvFblhV7/DB/xLDyYZ75I/zAq1cRG3lH0KsdgfZeufOJk6O5fximuYmcoXVRs0PyHUH3hToof1udEi890kCIgmDEAqC5UtVfEv7IwKFCUz3CmBEyoY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636814; c=relaxed/simple; bh=AkyaUdYSvZhnL1Y6wxhASs5P6tSY4/xfgLsDT9Gd9WU=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=SCALaWl2aeJTeB8EAtIuDRAqw5ZMEyiC93+PwA9vNypSuM+a+P0k25wWzoWI59Q7HwuwNpy2gPceix+A/bA6vUwMdZnYPfeOf9vfmvwr1DlQGz9CyayltKrF5gIoblde1qGyduCthpCeAYJxjvmQk8xc7IaaUjhQcTGXZgx1vSo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=NRXDpFyi; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="NRXDpFyi" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E95F1C4CECE; Thu, 19 Dec 2024 19:33:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636814; bh=AkyaUdYSvZhnL1Y6wxhASs5P6tSY4/xfgLsDT9Gd9WU=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=NRXDpFyi6L54crMtNXNN9YJQJuGurYosHcp9TsIA1yuMrU0PoLBGjQQif2OAcdD2Q 2+NdwFNMMWrnB2m8PCoOOaozQ4lx2OWo2QFdKsvW5EwYRrw4lr6YvviN1Qz4jNPEHI h4kA2glAqTUSMB3lSEXap/T29JPcI3qkKoou3wogtZ01GP0VPxqmy2+5ZTo3w8bthg DXJNv61xzYnre6RwZCQkqKFUS64PilIBAn4OBd9VWLMDxRmLzL6icu/mkqtRHiIu/u t1D6jGu8pyWLEDw1RYBzY71Ifo/m7GdkXRJ4vyePITD8uH1l/1FeW6/qbxZZsAF+iL 0OpYzwaxjsb4A== Date: Thu, 19 Dec 2024 11:33:33 -0800 Subject: [PATCH 02/43] xfs: namespace the maximum length/refcount symbols From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581010.1572761.11513189735808800599.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Actually namespace these variables properly, so that readers can tell that this is an XFS symbol, and that it's for the refcount functionality. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_format.h | 4 ++-- fs/xfs/libxfs/xfs_refcount.c | 18 +++++++++--------- fs/xfs/scrub/refcount.c | 2 +- fs/xfs/scrub/refcount_repair.c | 4 ++-- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index fba4e59aded4a0..16696bc3ff9445 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1790,8 +1790,8 @@ struct xfs_refcount_key { __be32 rc_startblock; /* starting block number */ }; -#define MAXREFCOUNT ((xfs_nlink_t)~0U) -#define MAXREFCEXTLEN ((xfs_extlen_t)~0U) +#define XFS_REFC_REFCOUNT_MAX ((xfs_nlink_t)~0U) +#define XFS_REFC_LEN_MAX ((xfs_extlen_t)~0U) /* btree pointer type */ typedef __be32 xfs_refcount_ptr_t; diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index bbb86dc9a25c7f..faace12fe2e383 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -128,7 +128,7 @@ xfs_refcount_check_irec( struct xfs_perag *pag, const struct xfs_refcount_irec *irec) { - if (irec->rc_blockcount == 0 || irec->rc_blockcount > MAXREFCEXTLEN) + if (irec->rc_blockcount == 0 || irec->rc_blockcount > XFS_REFC_LEN_MAX) return __this_address; if (!xfs_refcount_check_domain(irec)) @@ -138,7 +138,7 @@ xfs_refcount_check_irec( if (!xfs_verify_agbext(pag, irec->rc_startblock, irec->rc_blockcount)) return __this_address; - if (irec->rc_refcount == 0 || irec->rc_refcount > MAXREFCOUNT) + if (irec->rc_refcount == 0 || irec->rc_refcount > XFS_REFC_REFCOUNT_MAX) return __this_address; return NULL; @@ -853,9 +853,9 @@ xfs_refc_merge_refcount( const struct xfs_refcount_irec *irec, enum xfs_refc_adjust_op adjust) { - /* Once a record hits MAXREFCOUNT, it is pinned there forever */ - if (irec->rc_refcount == MAXREFCOUNT) - return MAXREFCOUNT; + /* Once a record hits XFS_REFC_REFCOUNT_MAX, it is pinned forever */ + if (irec->rc_refcount == XFS_REFC_REFCOUNT_MAX) + return XFS_REFC_REFCOUNT_MAX; return irec->rc_refcount + adjust; } @@ -898,7 +898,7 @@ xfs_refc_want_merge_center( * hence we need to catch u32 addition overflows here. */ ulen += cleft->rc_blockcount + right->rc_blockcount; - if (ulen >= MAXREFCEXTLEN) + if (ulen >= XFS_REFC_LEN_MAX) return false; *ulenp = ulen; @@ -933,7 +933,7 @@ xfs_refc_want_merge_left( * hence we need to catch u32 addition overflows here. */ ulen += cleft->rc_blockcount; - if (ulen >= MAXREFCEXTLEN) + if (ulen >= XFS_REFC_LEN_MAX) return false; return true; @@ -967,7 +967,7 @@ xfs_refc_want_merge_right( * hence we need to catch u32 addition overflows here. */ ulen += cright->rc_blockcount; - if (ulen >= MAXREFCEXTLEN) + if (ulen >= XFS_REFC_LEN_MAX) return false; return true; @@ -1196,7 +1196,7 @@ xfs_refcount_adjust_extents( * Adjust the reference count and either update the tree * (incr) or free the blocks (decr). */ - if (ext.rc_refcount == MAXREFCOUNT) + if (ext.rc_refcount == XFS_REFC_REFCOUNT_MAX) goto skip; ext.rc_refcount += adj; trace_xfs_refcount_modify_extent(cur, &ext); diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c index 1c5e45cc64190c..d465280230154f 100644 --- a/fs/xfs/scrub/refcount.c +++ b/fs/xfs/scrub/refcount.c @@ -421,7 +421,7 @@ xchk_refcount_mergeable( if (r1->rc_refcount != r2->rc_refcount) return false; if ((unsigned long long)r1->rc_blockcount + r2->rc_blockcount > - MAXREFCEXTLEN) + XFS_REFC_LEN_MAX) return false; return true; diff --git a/fs/xfs/scrub/refcount_repair.c b/fs/xfs/scrub/refcount_repair.c index 4e572b81c98669..1ee6d4aeb308f5 100644 --- a/fs/xfs/scrub/refcount_repair.c +++ b/fs/xfs/scrub/refcount_repair.c @@ -183,7 +183,7 @@ xrep_refc_stash( if (xchk_should_terminate(sc, &error)) return error; - irec.rc_refcount = min_t(uint64_t, MAXREFCOUNT, refcount); + irec.rc_refcount = min_t(uint64_t, XFS_REFC_REFCOUNT_MAX, refcount); error = xrep_refc_check_ext(rr->sc, &irec); if (error) @@ -422,7 +422,7 @@ xrep_refc_find_refcounts( /* * Set up a bag to store all the rmap records that we're tracking to * generate a reference count record. If the size of the bag exceeds - * MAXREFCOUNT, we clamp rc_refcount. + * XFS_REFC_REFCOUNT_MAX, we clamp rc_refcount. */ error = rcbag_init(sc->mp, sc->xmbtp, &rcstack); if (error) From patchwork Thu Dec 19 19:33:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915601 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0CC151B4220 for ; Thu, 19 Dec 2024 19:33:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636830; cv=none; b=bhZEZpkthnba979qXHUYm4/0r03sHOy26F7ykB4iN6PLcoFi4+3b37OE5fvU8wX98QYKRvNbPOpvM8Um8GSOo9V6ZNg7spYuUGJtgxAg467vvq9IIc/TcnFcwyYrP731q+ZJrXxniN6z9uznDFhB2mft7QvPiM1VLASkyBRK6FI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636830; c=relaxed/simple; bh=ntKNxXc1Vh6zrO1gAKc1xTo5u4cqpR0EMGGGvtYP4vo=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=nS6te8DgXy1YalyY6LEn+VkMsGFpIuUoKL1xNjQ5mZVxxtJK+7AeWiqBBPUbAuWnUvIZJiXhWeSMyVskpxnff8C2LRPbTegmTAm1OUccu8AAJ58d4m+ZM40+Mab93Lcpj6q+Co2/llfo+dAqA9NtAhmjMGy2PVmIwDzYIuLojjE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tc3OUEdC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tc3OUEdC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8AE25C4CED7; Thu, 19 Dec 2024 19:33:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636829; bh=ntKNxXc1Vh6zrO1gAKc1xTo5u4cqpR0EMGGGvtYP4vo=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=tc3OUEdC8pJDkXG4mQPgKXXRCB+rpWYa+x6IhOD4dHBnL5XeFgJvy1+zPal14Te0K XvwQGog8KadkGBZdiq0G8KnXrCtsz+hs6puQdyECGySthr96ufaqdMZgJZrcoX7Sdt 4hozCTFn2FxA4X7r5RqPWUSfVMeGwjAg6dBIg5zu0pNBXWRD07ffA7TTOlCp25f0ZZ L2o39CkhGLjmsXCMaJ72Uh0GkcTfneFLYbI2BkadBTHzTPv8YshlRECfx8EgV3yc9l h/RYem27oTmT9G6X7kSNansofF+NdY62S2bx8QS0lFsmYlVuG5QEQNjVsfeGjHyk8v d2U5fiT3u1VCw== Date: Thu, 19 Dec 2024 11:33:49 -0800 Subject: [PATCH 03/43] xfs: introduce realtime refcount btree ondisk definitions From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581027.1572761.5600201348262008752.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add the ondisk structure definitions for realtime refcount btrees. The realtime refcount btree will be rooted from a hidden inode so it needs to have a separate btree block magic and pointer format. Next, add everything needed to read, write and manipulate refcount btree blocks. This prepares the way for connecting the btree operations implementation, though the changes to actually root the rtrefcount btree in an inode come later. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_btree.c | 5 + fs/xfs/libxfs/xfs_format.h | 9 + fs/xfs/libxfs/xfs_ondisk.h | 1 fs/xfs/libxfs/xfs_rtrefcount_btree.c | 273 ++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 70 +++++++++ fs/xfs/libxfs/xfs_sb.c | 8 + fs/xfs/libxfs/xfs_shared.h | 7 + fs/xfs/xfs_mount.c | 5 - fs/xfs/xfs_mount.h | 9 + fs/xfs/xfs_stats.c | 3 fs/xfs/xfs_stats.h | 1 12 files changed, 390 insertions(+), 2 deletions(-) create mode 100644 fs/xfs/libxfs/xfs_rtrefcount_btree.c create mode 100644 fs/xfs/libxfs/xfs_rtrefcount_btree.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 338e10f81b7b71..b9356d01416e16 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -51,6 +51,7 @@ xfs-y += $(addprefix libxfs/, \ xfs_rmap_btree.o \ xfs_refcount.o \ xfs_refcount_btree.o \ + xfs_rtrefcount_btree.o \ xfs_rtrmap_btree.o \ xfs_sb.o \ xfs_symlink_remote.o \ diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 36ab06f8a3bc99..299ce7fd11b0a9 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -35,6 +35,7 @@ #include "xfs_rmap.h" #include "xfs_quota.h" #include "xfs_metafile.h" +#include "xfs_rtrefcount_btree.h" /* * Btree magic numbers. @@ -5533,6 +5534,9 @@ xfs_btree_init_cur_caches(void) if (error) goto err; error = xfs_rtrmapbt_init_cur_cache(); + if (error) + goto err; + error = xfs_rtrefcountbt_init_cur_cache(); if (error) goto err; @@ -5552,6 +5556,7 @@ xfs_btree_destroy_cur_caches(void) xfs_rmapbt_destroy_cur_cache(); xfs_refcountbt_destroy_cur_cache(); xfs_rtrmapbt_destroy_cur_cache(); + xfs_rtrefcountbt_destroy_cur_cache(); } /* Move the btree cursor before the first record. */ diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index 16696bc3ff9445..17f7c0d1aaa452 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1796,6 +1796,15 @@ struct xfs_refcount_key { /* btree pointer type */ typedef __be32 xfs_refcount_ptr_t; +/* + * Realtime Reference Count btree format definitions + * + * This is a btree for reference count records for realtime volumes + */ +#define XFS_RTREFC_CRC_MAGIC 0x52434e54 /* 'RCNT' */ + +/* inode-rooted btree pointer type */ +typedef __be64 xfs_rtrefcount_ptr_t; /* * BMAP Btree format definitions diff --git a/fs/xfs/libxfs/xfs_ondisk.h b/fs/xfs/libxfs/xfs_ondisk.h index 07e2f5fb3a94ae..efb035050c009c 100644 --- a/fs/xfs/libxfs/xfs_ondisk.h +++ b/fs/xfs/libxfs/xfs_ondisk.h @@ -85,6 +85,7 @@ xfs_check_ondisk_structs(void) XFS_CHECK_STRUCT_SIZE(struct xfs_rtbuf_blkinfo, 48); XFS_CHECK_STRUCT_SIZE(xfs_rtrmap_ptr_t, 8); XFS_CHECK_STRUCT_SIZE(struct xfs_rtrmap_root, 4); + XFS_CHECK_STRUCT_SIZE(xfs_rtrefcount_ptr_t, 8); /* * m68k has problems with struct xfs_attr_leaf_name_remote, but we pad diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c new file mode 100644 index 00000000000000..d07d3b1e78f730 --- /dev/null +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -0,0 +1,273 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_log_format.h" +#include "xfs_trans_resv.h" +#include "xfs_bit.h" +#include "xfs_sb.h" +#include "xfs_mount.h" +#include "xfs_defer.h" +#include "xfs_inode.h" +#include "xfs_trans.h" +#include "xfs_alloc.h" +#include "xfs_btree.h" +#include "xfs_btree_staging.h" +#include "xfs_rtrefcount_btree.h" +#include "xfs_trace.h" +#include "xfs_cksum.h" +#include "xfs_error.h" +#include "xfs_extent_busy.h" +#include "xfs_rtgroup.h" +#include "xfs_rtbitmap.h" + +static struct kmem_cache *xfs_rtrefcountbt_cur_cache; + +/* + * Realtime Reference Count btree. + * + * This is a btree used to track the owner(s) of a given extent in the realtime + * device. See the comments in xfs_refcount_btree.c for more information. + * + * This tree is basically the same as the regular refcount btree except that + * it's rooted in an inode. + */ + +static struct xfs_btree_cur * +xfs_rtrefcountbt_dup_cursor( + struct xfs_btree_cur *cur) +{ + return xfs_rtrefcountbt_init_cursor(cur->bc_tp, to_rtg(cur->bc_group)); +} + +static xfs_failaddr_t +xfs_rtrefcountbt_verify( + struct xfs_buf *bp) +{ + struct xfs_mount *mp = bp->b_target->bt_mount; + struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp); + xfs_failaddr_t fa; + int level; + + if (!xfs_verify_magic(bp, block->bb_magic)) + return __this_address; + + if (!xfs_has_reflink(mp)) + return __this_address; + fa = xfs_btree_fsblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN); + if (fa) + return fa; + level = be16_to_cpu(block->bb_level); + if (level > mp->m_rtrefc_maxlevels) + return __this_address; + + return xfs_btree_fsblock_verify(bp, mp->m_rtrefc_mxr[level != 0]); +} + +static void +xfs_rtrefcountbt_read_verify( + struct xfs_buf *bp) +{ + xfs_failaddr_t fa; + + if (!xfs_btree_fsblock_verify_crc(bp)) + xfs_verifier_error(bp, -EFSBADCRC, __this_address); + else { + fa = xfs_rtrefcountbt_verify(bp); + if (fa) + xfs_verifier_error(bp, -EFSCORRUPTED, fa); + } + + if (bp->b_error) + trace_xfs_btree_corrupt(bp, _RET_IP_); +} + +static void +xfs_rtrefcountbt_write_verify( + struct xfs_buf *bp) +{ + xfs_failaddr_t fa; + + fa = xfs_rtrefcountbt_verify(bp); + if (fa) { + trace_xfs_btree_corrupt(bp, _RET_IP_); + xfs_verifier_error(bp, -EFSCORRUPTED, fa); + return; + } + xfs_btree_fsblock_calc_crc(bp); + +} + +const struct xfs_buf_ops xfs_rtrefcountbt_buf_ops = { + .name = "xfs_rtrefcountbt", + .magic = { 0, cpu_to_be32(XFS_RTREFC_CRC_MAGIC) }, + .verify_read = xfs_rtrefcountbt_read_verify, + .verify_write = xfs_rtrefcountbt_write_verify, + .verify_struct = xfs_rtrefcountbt_verify, +}; + +const struct xfs_btree_ops xfs_rtrefcountbt_ops = { + .name = "rtrefcount", + .type = XFS_BTREE_TYPE_INODE, + .geom_flags = XFS_BTGEO_IROOT_RECORDS, + + .rec_len = sizeof(struct xfs_refcount_rec), + .key_len = sizeof(struct xfs_refcount_key), + .ptr_len = XFS_BTREE_LONG_PTR_LEN, + + .lru_refs = XFS_REFC_BTREE_REF, + .statoff = XFS_STATS_CALC_INDEX(xs_rtrefcbt_2), + + .dup_cursor = xfs_rtrefcountbt_dup_cursor, + .buf_ops = &xfs_rtrefcountbt_buf_ops, +}; + +/* Allocate a new rt refcount btree cursor. */ +struct xfs_btree_cur * +xfs_rtrefcountbt_init_cursor( + struct xfs_trans *tp, + struct xfs_rtgroup *rtg) +{ + struct xfs_inode *ip = NULL; + struct xfs_mount *mp = rtg_mount(rtg); + struct xfs_btree_cur *cur; + + return NULL; /* XXX */ + + xfs_assert_ilocked(ip, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL); + + cur = xfs_btree_alloc_cursor(mp, tp, &xfs_rtrefcountbt_ops, + mp->m_rtrefc_maxlevels, xfs_rtrefcountbt_cur_cache); + + cur->bc_ino.ip = ip; + cur->bc_refc.nr_ops = 0; + cur->bc_refc.shape_changes = 0; + cur->bc_group = xfs_group_hold(rtg_group(rtg)); + cur->bc_nlevels = be16_to_cpu(ip->i_df.if_broot->bb_level) + 1; + cur->bc_ino.forksize = xfs_inode_fork_size(ip, XFS_DATA_FORK); + cur->bc_ino.whichfork = XFS_DATA_FORK; + return cur; +} + +/* + * Install a new rt reverse mapping btree root. Caller is responsible for + * invalidating and freeing the old btree blocks. + */ +void +xfs_rtrefcountbt_commit_staged_btree( + struct xfs_btree_cur *cur, + struct xfs_trans *tp) +{ + struct xbtree_ifakeroot *ifake = cur->bc_ino.ifake; + struct xfs_ifork *ifp; + int flags = XFS_ILOG_CORE | XFS_ILOG_DBROOT; + + ASSERT(cur->bc_flags & XFS_BTREE_STAGING); + + /* + * Free any resources hanging off the real fork, then shallow-copy the + * staging fork's contents into the real fork to transfer everything + * we just built. + */ + ifp = xfs_ifork_ptr(cur->bc_ino.ip, XFS_DATA_FORK); + xfs_idestroy_fork(ifp); + memcpy(ifp, ifake->if_fork, sizeof(struct xfs_ifork)); + + cur->bc_ino.ip->i_projid = cur->bc_group->xg_gno; + xfs_trans_log_inode(tp, cur->bc_ino.ip, flags); + xfs_btree_commit_ifakeroot(cur, tp, XFS_DATA_FORK); +} + +/* Calculate number of records in a realtime refcount btree block. */ +static inline unsigned int +xfs_rtrefcountbt_block_maxrecs( + unsigned int blocklen, + bool leaf) +{ + + if (leaf) + return blocklen / sizeof(struct xfs_refcount_rec); + return blocklen / (sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); +} + +/* + * Calculate number of records in an refcount btree block. + */ +unsigned int +xfs_rtrefcountbt_maxrecs( + struct xfs_mount *mp, + unsigned int blocklen, + bool leaf) +{ + blocklen -= XFS_RTREFCOUNT_BLOCK_LEN; + return xfs_rtrefcountbt_block_maxrecs(blocklen, leaf); +} + +/* Compute the max possible height for realtime refcount btrees. */ +unsigned int +xfs_rtrefcountbt_maxlevels_ondisk(void) +{ + unsigned int minrecs[2]; + unsigned int blocklen; + + blocklen = XFS_MIN_CRC_BLOCKSIZE - XFS_BTREE_LBLOCK_CRC_LEN; + + minrecs[0] = xfs_rtrefcountbt_block_maxrecs(blocklen, true) / 2; + minrecs[1] = xfs_rtrefcountbt_block_maxrecs(blocklen, false) / 2; + + /* We need at most one record for every block in an rt group. */ + return xfs_btree_compute_maxlevels(minrecs, XFS_MAX_RGBLOCKS); +} + +int __init +xfs_rtrefcountbt_init_cur_cache(void) +{ + xfs_rtrefcountbt_cur_cache = kmem_cache_create("xfs_rtrefcountbt_cur", + xfs_btree_cur_sizeof( + xfs_rtrefcountbt_maxlevels_ondisk()), + 0, 0, NULL); + + if (!xfs_rtrefcountbt_cur_cache) + return -ENOMEM; + return 0; +} + +void +xfs_rtrefcountbt_destroy_cur_cache(void) +{ + kmem_cache_destroy(xfs_rtrefcountbt_cur_cache); + xfs_rtrefcountbt_cur_cache = NULL; +} + +/* Compute the maximum height of a realtime refcount btree. */ +void +xfs_rtrefcountbt_compute_maxlevels( + struct xfs_mount *mp) +{ + unsigned int d_maxlevels, r_maxlevels; + + if (!xfs_has_rtreflink(mp)) { + mp->m_rtrefc_maxlevels = 0; + return; + } + + /* + * The realtime refcountbt lives on the data device, which means that + * its maximum height is constrained by the size of the data device and + * the height required to store one refcount record for each rtextent + * in an rt group. + */ + d_maxlevels = xfs_btree_space_to_height(mp->m_rtrefc_mnr, + mp->m_sb.sb_dblocks); + r_maxlevels = xfs_btree_compute_maxlevels(mp->m_rtrefc_mnr, + mp->m_sb.sb_rgextents); + + /* Add one level to handle the inode root level. */ + mp->m_rtrefc_maxlevels = min(d_maxlevels, r_maxlevels) + 1; +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h new file mode 100644 index 00000000000000..b713b33818800c --- /dev/null +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -0,0 +1,70 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_RTREFCOUNT_BTREE_H__ +#define __XFS_RTREFCOUNT_BTREE_H__ + +struct xfs_buf; +struct xfs_btree_cur; +struct xfs_mount; +struct xbtree_ifakeroot; +struct xfs_rtgroup; + +/* refcounts only exist on crc enabled filesystems */ +#define XFS_RTREFCOUNT_BLOCK_LEN XFS_BTREE_LBLOCK_CRC_LEN + +struct xfs_btree_cur *xfs_rtrefcountbt_init_cursor(struct xfs_trans *tp, + struct xfs_rtgroup *rtg); +struct xfs_btree_cur *xfs_rtrefcountbt_stage_cursor(struct xfs_mount *mp, + struct xfs_rtgroup *rtg, struct xfs_inode *ip, + struct xbtree_ifakeroot *ifake); +void xfs_rtrefcountbt_commit_staged_btree(struct xfs_btree_cur *cur, + struct xfs_trans *tp); +unsigned int xfs_rtrefcountbt_maxrecs(struct xfs_mount *mp, + unsigned int blocklen, bool leaf); +void xfs_rtrefcountbt_compute_maxlevels(struct xfs_mount *mp); + +/* + * Addresses of records, keys, and pointers within an incore rtrefcountbt block. + * + * (note that some of these may appear unused, but they are used in userspace) + */ +static inline struct xfs_refcount_rec * +xfs_rtrefcount_rec_addr( + struct xfs_btree_block *block, + unsigned int index) +{ + return (struct xfs_refcount_rec *) + ((char *)block + XFS_RTREFCOUNT_BLOCK_LEN + + (index - 1) * sizeof(struct xfs_refcount_rec)); +} + +static inline struct xfs_refcount_key * +xfs_rtrefcount_key_addr( + struct xfs_btree_block *block, + unsigned int index) +{ + return (struct xfs_refcount_key *) + ((char *)block + XFS_RTREFCOUNT_BLOCK_LEN + + (index - 1) * sizeof(struct xfs_refcount_key)); +} + +static inline xfs_rtrefcount_ptr_t * +xfs_rtrefcount_ptr_addr( + struct xfs_btree_block *block, + unsigned int index, + unsigned int maxrecs) +{ + return (xfs_rtrefcount_ptr_t *) + ((char *)block + XFS_RTREFCOUNT_BLOCK_LEN + + maxrecs * sizeof(struct xfs_refcount_key) + + (index - 1) * sizeof(xfs_rtrefcount_ptr_t)); +} + +unsigned int xfs_rtrefcountbt_maxlevels_ondisk(void); +int __init xfs_rtrefcountbt_init_cur_cache(void); +void xfs_rtrefcountbt_destroy_cur_cache(void); + +#endif /* __XFS_RTREFCOUNT_BTREE_H__ */ diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index 83fb14b4074c8d..3dc5f5dba162d0 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -29,6 +29,7 @@ #include "xfs_exchrange.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" /* * Physical superblock buffer manipulations. Shared with libxfs in userspace. @@ -1226,6 +1227,13 @@ xfs_sb_mount_common( mp->m_refc_mnr[0] = mp->m_refc_mxr[0] / 2; mp->m_refc_mnr[1] = mp->m_refc_mxr[1] / 2; + mp->m_rtrefc_mxr[0] = xfs_rtrefcountbt_maxrecs(mp, sbp->sb_blocksize, + true); + mp->m_rtrefc_mxr[1] = xfs_rtrefcountbt_maxrecs(mp, sbp->sb_blocksize, + false); + mp->m_rtrefc_mnr[0] = mp->m_rtrefc_mxr[0] / 2; + mp->m_rtrefc_mnr[1] = mp->m_rtrefc_mxr[1] / 2; + mp->m_bsize = XFS_FSB_TO_BB(mp, 1); mp->m_alloc_set_aside = xfs_alloc_set_aside(mp); mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp); diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h index 960716c387cc2b..b1e0d9bc1f7d70 100644 --- a/fs/xfs/libxfs/xfs_shared.h +++ b/fs/xfs/libxfs/xfs_shared.h @@ -42,6 +42,7 @@ extern const struct xfs_buf_ops xfs_rtbitmap_buf_ops; extern const struct xfs_buf_ops xfs_rtsummary_buf_ops; extern const struct xfs_buf_ops xfs_rtbuf_ops; extern const struct xfs_buf_ops xfs_rtsb_buf_ops; +extern const struct xfs_buf_ops xfs_rtrefcountbt_buf_ops; extern const struct xfs_buf_ops xfs_rtrmapbt_buf_ops; extern const struct xfs_buf_ops xfs_sb_buf_ops; extern const struct xfs_buf_ops xfs_sb_quiet_buf_ops; @@ -58,6 +59,7 @@ extern const struct xfs_btree_ops xfs_rmapbt_ops; extern const struct xfs_btree_ops xfs_rmapbt_mem_ops; extern const struct xfs_btree_ops xfs_rtrmapbt_ops; extern const struct xfs_btree_ops xfs_rtrmapbt_mem_ops; +extern const struct xfs_btree_ops xfs_rtrefcountbt_ops; static inline bool xfs_btree_is_bno(const struct xfs_btree_ops *ops) { @@ -114,6 +116,11 @@ static inline bool xfs_btree_is_rtrmap(const struct xfs_btree_ops *ops) return ops == &xfs_rtrmapbt_ops; } +static inline bool xfs_btree_is_rtrefcount(const struct xfs_btree_ops *ops) +{ + return ops == &xfs_rtrefcountbt_ops; +} + /* log size calculation functions */ int xfs_log_calc_unit_res(struct xfs_mount *mp, int unit_bytes); int xfs_log_calc_minimum_size(struct xfs_mount *); diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 7b7d21b50d5409..477c5262cf9120 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -38,6 +38,7 @@ #include "xfs_metafile.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/stats.h" static DEFINE_MUTEX(xfs_uuid_table_mutex); @@ -656,7 +657,8 @@ static inline void xfs_rtbtree_compute_maxlevels( struct xfs_mount *mp) { - mp->m_rtbtree_maxlevels = mp->m_rtrmap_maxlevels; + mp->m_rtbtree_maxlevels = max(mp->m_rtrmap_maxlevels, + mp->m_rtrefc_maxlevels); } /* @@ -729,6 +731,7 @@ xfs_mountfs( xfs_rmapbt_compute_maxlevels(mp); xfs_rtrmapbt_compute_maxlevels(mp); xfs_refcountbt_compute_maxlevels(mp); + xfs_rtrefcountbt_compute_maxlevels(mp); xfs_agbtree_compute_maxlevels(mp); xfs_rtbtree_compute_maxlevels(mp); diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index 1bc95fb170db61..9a1516080e6361 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -162,11 +162,14 @@ typedef struct xfs_mount { uint m_rtrmap_mnr[2]; /* min rtrmap btree records */ uint m_refc_mxr[2]; /* max refc btree records */ uint m_refc_mnr[2]; /* min refc btree records */ + uint m_rtrefc_mxr[2]; /* max rtrefc btree records */ + uint m_rtrefc_mnr[2]; /* min rtrefc btree records */ uint m_alloc_maxlevels; /* max alloc btree levels */ uint m_bm_maxlevels[2]; /* max bmap btree levels */ uint m_rmap_maxlevels; /* max rmap btree levels */ uint m_rtrmap_maxlevels; /* max rtrmap btree level */ uint m_refc_maxlevels; /* max refcount btree level */ + uint m_rtrefc_maxlevels; /* max rtrefc btree level */ unsigned int m_agbtree_maxlevels; /* max level of all AG btrees */ unsigned int m_rtbtree_maxlevels; /* max level of all rt btrees */ xfs_extlen_t m_ag_prealloc_blocks; /* reserved ag blocks */ @@ -408,6 +411,12 @@ static inline bool xfs_has_rtrmapbt(struct xfs_mount *mp) xfs_has_rmapbt(mp); } +static inline bool xfs_has_rtreflink(struct xfs_mount *mp) +{ + return xfs_has_metadir(mp) && xfs_has_realtime(mp) && + xfs_has_reflink(mp); +} + /* * Some features are always on for v5 file systems, allow the compiler to * eliminiate dead code when building without v4 support. diff --git a/fs/xfs/xfs_stats.c b/fs/xfs/xfs_stats.c index b7f2988bc03bb7..35c7fb3ba32448 100644 --- a/fs/xfs/xfs_stats.c +++ b/fs/xfs/xfs_stats.c @@ -54,7 +54,8 @@ int xfs_stats_format(struct xfsstats __percpu *stats, char *buf) { "rmapbt_mem", xfsstats_offset(xs_rcbag_2) }, { "rcbagbt", xfsstats_offset(xs_rtrmap_2) }, { "rtrmapbt", xfsstats_offset(xs_rtrmap_mem_2)}, - { "rtrmapbt_mem", xfsstats_offset(xs_qm_dqreclaims)}, + { "rtrmapbt_mem", xfsstats_offset(xs_rtrefcbt_2) }, + { "rtrefcntbt", xfsstats_offset(xs_qm_dqreclaims)}, /* we print both series of quota information together */ { "qm", xfsstats_offset(xs_xstrat_bytes)}, }; diff --git a/fs/xfs/xfs_stats.h b/fs/xfs/xfs_stats.h index 9c47de5dff2dd6..15ba1abcf2536f 100644 --- a/fs/xfs/xfs_stats.h +++ b/fs/xfs/xfs_stats.h @@ -129,6 +129,7 @@ struct __xfsstats { uint32_t xs_rcbag_2[__XBTS_MAX]; uint32_t xs_rtrmap_2[__XBTS_MAX]; uint32_t xs_rtrmap_mem_2[__XBTS_MAX]; + uint32_t xs_rtrefcbt_2[__XBTS_MAX]; uint32_t xs_qm_dqreclaims; uint32_t xs_qm_dqreclaim_misses; uint32_t xs_qm_dquot_dups; From patchwork Thu Dec 19 19:34:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915604 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8A0621A0BF1 for ; Thu, 19 Dec 2024 19:34:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636845; cv=none; b=lqwJUYN08jKVt3I/SD39roUm2gapCEusXg0duXFlmpZO0OWBdUH02c+s3/i/0U6EJVkpT34XDiJNNxU8ICn2SUSGns5d256/hDFicH0W+O2qSJI7dJYJ07NLDMLzz7vzKKvpb/E7/8bOcJC1Mn8YQScuvidWChw4AKbkcXqMAts= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636845; c=relaxed/simple; bh=zGRCtcQCM5dlQdrk01yBhVtilUTnGumzI/f8YFuFJlI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=tnPV1TwjKvYFHSQDlOZAlldfp2pL2ZDldqtp7qxtYO94Ypsql+EALRpm2GCZJm5irumGNM62Nb4tULbFlojYTnbIBj+qGSpJCkAMBWtDK4so9w+NLKSHwlnJq8fJ9fpxyDfsy/u/+BFx6ep7Zy90f074ULtzRb8WHYCMdF3fZKE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rmicF6Qb; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rmicF6Qb" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 27912C4CECE; Thu, 19 Dec 2024 19:34:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636845; bh=zGRCtcQCM5dlQdrk01yBhVtilUTnGumzI/f8YFuFJlI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=rmicF6QbK6A3P2zmBm9EqVO+iGow1FEfe088W3mZ2lbKQ5VlDPbNHyo8h2IqaBSgh H+//UEJoxW/Q7QpZWjM8Mm9xRSM3zAp1wNl003+k4JNnIZXPlyzSNafqKkgTzW78Ym IIhMCyiT9uqEFztSEX50Ii5wh96D66GvJE5O0X+alzwtRWqae3RTECRGtIHcBRFh/l VdQ2Jy3xRRwffKJnZ+YTdiAUsK0fZmCZ0T5kFehzeruqo16TsRyMVlffz0ROTkZo7u LiQeCgZZKOflE2MxvZJX1BDqKWGf0CDCUiGV5WWHuxPAuKX1TWJc+MziGcQNEGniVt hqfKRlzuilkqw== Date: Thu, 19 Dec 2024 11:34:04 -0800 Subject: [PATCH 04/43] xfs: realtime refcount btree transaction reservations From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581046.1572761.4038146946641618516.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Make sure that there's enough log reservation to handle mapping and unmapping realtime extents. We have to reserve enough space to handle a split in the rtrefcountbt to add the record and a second split in the regular refcountbt to record the rtrefcountbt split. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_trans_resv.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index f3392eb2d7f41f..13d00c7166e178 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -92,6 +92,14 @@ xfs_refcountbt_block_count( return num_ops * (2 * mp->m_refc_maxlevels - 1); } +static unsigned int +xfs_rtrefcountbt_block_count( + struct xfs_mount *mp, + unsigned int num_ops) +{ + return num_ops * (2 * mp->m_rtrefc_maxlevels - 1); +} + /* * Logging inodes is really tricksy. They are logged in memory format, * which means that what we write into the log doesn't directly translate into @@ -259,10 +267,13 @@ xfs_rtalloc_block_count( * Compute the log reservation required to handle the refcount update * transaction. Refcount updates are always done via deferred log items. * - * This is calculated as: + * This is calculated as the max of: * Data device refcount updates (t1): * the agfs of the ags containing the blocks: nr_ops * sector size * the refcount btrees: nr_ops * 1 trees * (2 * max depth - 1) * block size + * Realtime refcount updates (t2); + * the rt refcount inode + * the rtrefcount btrees: nr_ops * 1 trees * (2 * max depth - 1) * block size */ static unsigned int xfs_calc_refcountbt_reservation( @@ -270,12 +281,20 @@ xfs_calc_refcountbt_reservation( unsigned int nr_ops) { unsigned int blksz = XFS_FSB_TO_B(mp, 1); + unsigned int t1, t2 = 0; if (!xfs_has_reflink(mp)) return 0; - return xfs_calc_buf_res(nr_ops, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_refcountbt_block_count(mp, nr_ops), blksz); + t1 = xfs_calc_buf_res(nr_ops, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(xfs_refcountbt_block_count(mp, nr_ops), blksz); + + if (xfs_has_realtime(mp)) + t2 = xfs_calc_inode_res(mp, 1) + + xfs_calc_buf_res(xfs_rtrefcountbt_block_count(mp, nr_ops), + blksz); + + return max(t1, t2); } /* From patchwork Thu Dec 19 19:34:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915605 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 02D301A0BF1 for ; Thu, 19 Dec 2024 19:34:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636861; cv=none; b=tkjY3Qpc3xamfQ/FVd+5RjAAKfc2a+LoUW146hHWut9T4KqFKy9/EcmGTT6EX3vkr11No0RCO7TpAaFUo4+uoojlIGx6EjKgmTcNnefaV7dE/ICjO25t5tpeerQXCUB5yf6l0mLPg42MiuPklp0rr1K1Yx4JqvQ52lYVjXalPK0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636861; c=relaxed/simple; bh=ZUdLw7o4FRGopF8Nar1v+WZzXiBfyMORl99rS/t5lsk=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=E2b9cVRrN4jAoJdtd+DtbO1YPrOp5rkvhxaqhcLb5QBomAkFVm1MH8mslEHr20GHsq/kqfN9wptfrtgpEERb5Z8kfFkjyWyAa2GYgk8uEbkmsJypDvM8nn8gHQpUsTBTP0UwyRapH3Erw7/1obrmbgkQFN2n3QZwsyMgnqBwbBw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=rncIGUJr; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="rncIGUJr" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CF78EC4CECE; Thu, 19 Dec 2024 19:34:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636860; bh=ZUdLw7o4FRGopF8Nar1v+WZzXiBfyMORl99rS/t5lsk=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=rncIGUJrErVPMUrzZX69cE6EE59RlBTrOsx9n1r2kWScvK5UUiu1Squrc43TASTLn 05OKk60puAlO2/vtqPE73OzSLD1hEPnoeuZzSjaVazMEfxppUCR3LT11E7mgS6L9IU 1m2ubsl5tX2xHNRBAk3G+ZFiNZa2Tmp7zV0FrDjuDZgV6dcRDF25gWAEV4kwkXSKy/ fAJ9iRgWcVzE2RKaQPmakFciGft6Dnwv8WGv4LhQcDEv7K3DhpIgcuQT5gDwyCLFsv WopZnaWFc9VfCiIWPUB6grLMChtCbGM8GvU5lh1WsjwKY9pMoH+bu0p2D/7638h+sI XF6coXsnhfLbQ== Date: Thu, 19 Dec 2024 11:34:20 -0800 Subject: [PATCH 05/43] xfs: add realtime refcount btree operations From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581063.1572761.13161907660221555602.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Implement the generic btree operations needed to manipulate rtrefcount btree blocks. This is different from the regular refcountbt in that we allocate space from the filesystem at large, and are neither constrained to the free space nor any particular AG. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_rtrefcount_btree.c | 148 ++++++++++++++++++++++++++++++++++ 1 file changed, 148 insertions(+) diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index d07d3b1e78f730..e30af941581651 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -19,6 +19,7 @@ #include "xfs_btree.h" #include "xfs_btree_staging.h" #include "xfs_rtrefcount_btree.h" +#include "xfs_refcount.h" #include "xfs_trace.h" #include "xfs_cksum.h" #include "xfs_error.h" @@ -45,6 +46,106 @@ xfs_rtrefcountbt_dup_cursor( return xfs_rtrefcountbt_init_cursor(cur->bc_tp, to_rtg(cur->bc_group)); } +STATIC int +xfs_rtrefcountbt_get_minrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level == cur->bc_nlevels - 1) { + struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur); + + return xfs_rtrefcountbt_maxrecs(cur->bc_mp, ifp->if_broot_bytes, + level == 0) / 2; + } + + return cur->bc_mp->m_rtrefc_mnr[level != 0]; +} + +STATIC int +xfs_rtrefcountbt_get_maxrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level == cur->bc_nlevels - 1) { + struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur); + + return xfs_rtrefcountbt_maxrecs(cur->bc_mp, ifp->if_broot_bytes, + level == 0); + } + + return cur->bc_mp->m_rtrefc_mxr[level != 0]; +} + +STATIC void +xfs_rtrefcountbt_init_key_from_rec( + union xfs_btree_key *key, + const union xfs_btree_rec *rec) +{ + key->refc.rc_startblock = rec->refc.rc_startblock; +} + +STATIC void +xfs_rtrefcountbt_init_high_key_from_rec( + union xfs_btree_key *key, + const union xfs_btree_rec *rec) +{ + __u32 x; + + x = be32_to_cpu(rec->refc.rc_startblock); + x += be32_to_cpu(rec->refc.rc_blockcount) - 1; + key->refc.rc_startblock = cpu_to_be32(x); +} + +STATIC void +xfs_rtrefcountbt_init_rec_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_rec *rec) +{ + const struct xfs_refcount_irec *irec = &cur->bc_rec.rc; + uint32_t start; + + start = xfs_refcount_encode_startblock(irec->rc_startblock, + irec->rc_domain); + rec->refc.rc_startblock = cpu_to_be32(start); + rec->refc.rc_blockcount = cpu_to_be32(cur->bc_rec.rc.rc_blockcount); + rec->refc.rc_refcount = cpu_to_be32(cur->bc_rec.rc.rc_refcount); +} + +STATIC void +xfs_rtrefcountbt_init_ptr_from_cur( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr) +{ + ptr->l = 0; +} + +STATIC int64_t +xfs_rtrefcountbt_key_diff( + struct xfs_btree_cur *cur, + const union xfs_btree_key *key) +{ + const struct xfs_refcount_key *kp = &key->refc; + const struct xfs_refcount_irec *irec = &cur->bc_rec.rc; + uint32_t start; + + start = xfs_refcount_encode_startblock(irec->rc_startblock, + irec->rc_domain); + return (int64_t)be32_to_cpu(kp->rc_startblock) - start; +} + +STATIC int64_t +xfs_rtrefcountbt_diff_two_keys( + struct xfs_btree_cur *cur, + const union xfs_btree_key *k1, + const union xfs_btree_key *k2, + const union xfs_btree_key *mask) +{ + ASSERT(!mask || mask->refc.rc_startblock); + + return (int64_t)be32_to_cpu(k1->refc.rc_startblock) - + be32_to_cpu(k2->refc.rc_startblock); +} + static xfs_failaddr_t xfs_rtrefcountbt_verify( struct xfs_buf *bp) @@ -111,6 +212,40 @@ const struct xfs_buf_ops xfs_rtrefcountbt_buf_ops = { .verify_struct = xfs_rtrefcountbt_verify, }; +STATIC int +xfs_rtrefcountbt_keys_inorder( + struct xfs_btree_cur *cur, + const union xfs_btree_key *k1, + const union xfs_btree_key *k2) +{ + return be32_to_cpu(k1->refc.rc_startblock) < + be32_to_cpu(k2->refc.rc_startblock); +} + +STATIC int +xfs_rtrefcountbt_recs_inorder( + struct xfs_btree_cur *cur, + const union xfs_btree_rec *r1, + const union xfs_btree_rec *r2) +{ + return be32_to_cpu(r1->refc.rc_startblock) + + be32_to_cpu(r1->refc.rc_blockcount) <= + be32_to_cpu(r2->refc.rc_startblock); +} + +STATIC enum xbtree_key_contig +xfs_rtrefcountbt_keys_contiguous( + struct xfs_btree_cur *cur, + const union xfs_btree_key *key1, + const union xfs_btree_key *key2, + const union xfs_btree_key *mask) +{ + ASSERT(!mask || mask->refc.rc_startblock); + + return xbtree_key_contig(be32_to_cpu(key1->refc.rc_startblock), + be32_to_cpu(key2->refc.rc_startblock)); +} + const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .name = "rtrefcount", .type = XFS_BTREE_TYPE_INODE, @@ -124,7 +259,20 @@ const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .statoff = XFS_STATS_CALC_INDEX(xs_rtrefcbt_2), .dup_cursor = xfs_rtrefcountbt_dup_cursor, + .alloc_block = xfs_btree_alloc_metafile_block, + .free_block = xfs_btree_free_metafile_block, + .get_minrecs = xfs_rtrefcountbt_get_minrecs, + .get_maxrecs = xfs_rtrefcountbt_get_maxrecs, + .init_key_from_rec = xfs_rtrefcountbt_init_key_from_rec, + .init_high_key_from_rec = xfs_rtrefcountbt_init_high_key_from_rec, + .init_rec_from_cur = xfs_rtrefcountbt_init_rec_from_cur, + .init_ptr_from_cur = xfs_rtrefcountbt_init_ptr_from_cur, + .key_diff = xfs_rtrefcountbt_key_diff, .buf_ops = &xfs_rtrefcountbt_buf_ops, + .diff_two_keys = xfs_rtrefcountbt_diff_two_keys, + .keys_inorder = xfs_rtrefcountbt_keys_inorder, + .recs_inorder = xfs_rtrefcountbt_recs_inorder, + .keys_contiguous = xfs_rtrefcountbt_keys_contiguous, }; /* Allocate a new rt refcount btree cursor. */ From patchwork Thu Dec 19 19:34:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915606 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A2808198A08 for ; Thu, 19 Dec 2024 19:34:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636876; cv=none; b=Z8mtoufUo6e4kzf2eex1k3SIv9YOh+HKV90wdxBNMg6x0+e7LcqBJcTF9e5NK5GiJ5b/WBhC0A/RVHF6yOu2s6ZbKtQpVfSmYWneITCmKk3qB8iX7+seAZNG3b69CZ3mx2yxqmERaUL8CWWHQiJy6/cwUUhaM0OHmK1RpNx/ocU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636876; c=relaxed/simple; bh=3Uxl+3GCJUguFPnIGcNEmiUh52/7Q6XB6vpQRA2QlwE=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Cw4rlWvhNTWo/lW4LObq3bDmqav8Qc15dPOGjWnMENjZywjjrRnY+SDUEV+8AqqFtu6CGC0yLjzOwY8htLY2uTZcnxZfPjiIZMj1zieAoL2SQHtvCJ57D/gwL3QN5v1TD+NK6gdEpTPQwryGtmw1nge6YQ8DcQqWjfZfNfxd/YI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AqNwPqTx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AqNwPqTx" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 84038C4CECE; Thu, 19 Dec 2024 19:34:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636876; bh=3Uxl+3GCJUguFPnIGcNEmiUh52/7Q6XB6vpQRA2QlwE=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=AqNwPqTxNFRA43upHYSuPI+uNiaPrfU3DjYUMn+K1e4umbBDao2nzhf3Tp7nDorXh IKW67I32Imj6IufdIcE0cEHGZUMEQEl7s0vGrfNVNenh8tCgksU+NDYTQU2R4AKawb Ra72eLW2W5zSXGNTExDZmiHu3kXkLmDa1riH8S6xbLW2Kh+924eIkuWChXIZ/Ak+Vr 4B0cRokFrcgQGgP2pfLnjfIFQMGBb/SRXX9seOBpiAy5l0oTzjDHdaxcuV9DZzEp6x s1vfLBmjecQQCqxXNb1fqzh3jHTEJjibgF6Sc/0MJIyTC6KGN8PIKLpAcyUMgJ3zHK Sutl2FVzWo7FQ== Date: Thu, 19 Dec 2024 11:34:35 -0800 Subject: [PATCH 06/43] xfs: prepare refcount functions to deal with rtrefcountbt From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581079.1572761.5307153579693544937.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Prepare the high-level refcount functions to deal with the new realtime refcountbt and its slightly different conventions. Provide the ability to talk to either refcountbt or rtrefcountbt formats from the same high level code. Note that we leave the _recover_cow_leftovers functions for a separate patch so that we can convert it all at once. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_refcount.c | 53 ++++++++++++++++++++++++++++++++++++------ fs/xfs/libxfs/xfs_refcount.h | 3 ++ 2 files changed, 48 insertions(+), 8 deletions(-) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index faace12fe2e383..9be40ac16c7d56 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -25,6 +25,7 @@ #include "xfs_ag.h" #include "xfs_health.h" #include "xfs_refcount_item.h" +#include "xfs_rtgroup.h" struct kmem_cache *xfs_refcount_intent_cache; @@ -144,6 +145,37 @@ xfs_refcount_check_irec( return NULL; } +xfs_failaddr_t +xfs_rtrefcount_check_irec( + struct xfs_rtgroup *rtg, + const struct xfs_refcount_irec *irec) +{ + if (irec->rc_blockcount == 0 || irec->rc_blockcount > XFS_REFC_LEN_MAX) + return __this_address; + + if (!xfs_refcount_check_domain(irec)) + return __this_address; + + /* check for valid extent range, including overflow */ + if (!xfs_verify_rgbext(rtg, irec->rc_startblock, irec->rc_blockcount)) + return __this_address; + + if (irec->rc_refcount == 0 || irec->rc_refcount > XFS_REFC_REFCOUNT_MAX) + return __this_address; + + return NULL; +} + +static inline xfs_failaddr_t +xfs_refcount_check_btrec( + struct xfs_btree_cur *cur, + const struct xfs_refcount_irec *irec) +{ + if (xfs_btree_is_rtrefcount(cur->bc_ops)) + return xfs_rtrefcount_check_irec(to_rtg(cur->bc_group), irec); + return xfs_refcount_check_irec(to_perag(cur->bc_group), irec); +} + static inline int xfs_refcount_complain_bad_rec( struct xfs_btree_cur *cur, @@ -152,9 +184,15 @@ xfs_refcount_complain_bad_rec( { struct xfs_mount *mp = cur->bc_mp; - xfs_warn(mp, + if (xfs_btree_is_rtrefcount(cur->bc_ops)) { + xfs_warn(mp, + "RT Refcount BTree record corruption in rtgroup %u detected at %pS!", + cur->bc_group->xg_gno, fa); + } else { + xfs_warn(mp, "Refcount BTree record corruption in AG %d detected at %pS!", cur->bc_group->xg_gno, fa); + } xfs_warn(mp, "Start block 0x%x, block count 0x%x, references 0x%x", irec->rc_startblock, irec->rc_blockcount, irec->rc_refcount); @@ -180,7 +218,7 @@ xfs_refcount_get_rec( return error; xfs_refcount_btrec_to_irec(rec, irec); - fa = xfs_refcount_check_irec(to_perag(cur->bc_group), irec); + fa = xfs_refcount_check_btrec(cur, irec); if (fa) return xfs_refcount_complain_bad_rec(cur, fa, irec); @@ -1065,7 +1103,7 @@ xfs_refcount_still_have_space( */ overhead = xfs_allocfree_block_count(cur->bc_mp, cur->bc_refc.shape_changes); - overhead += cur->bc_mp->m_refc_maxlevels; + overhead += cur->bc_maxlevels; overhead *= cur->bc_mp->m_sb.sb_blocksize; /* @@ -1117,7 +1155,7 @@ xfs_refcount_adjust_extents( if (error) goto out_error; if (!found_rec || ext.rc_domain != XFS_REFC_DOMAIN_SHARED) { - ext.rc_startblock = cur->bc_mp->m_sb.sb_agblocks; + ext.rc_startblock = xfs_group_max_blocks(cur->bc_group); ext.rc_blockcount = 0; ext.rc_refcount = 0; ext.rc_domain = XFS_REFC_DOMAIN_SHARED; @@ -1666,7 +1704,7 @@ xfs_refcount_adjust_cow_extents( goto out_error; } if (!found_rec) { - ext.rc_startblock = cur->bc_mp->m_sb.sb_agblocks; + ext.rc_startblock = xfs_group_max_blocks(cur->bc_group); ext.rc_blockcount = 0; ext.rc_refcount = 0; ext.rc_domain = XFS_REFC_DOMAIN_COW; @@ -1877,8 +1915,7 @@ xfs_refcount_recover_extent( INIT_LIST_HEAD(&rr->rr_list); xfs_refcount_btrec_to_irec(rec, &rr->rr_rrec); - if (xfs_refcount_check_irec(to_perag(cur->bc_group), &rr->rr_rrec) != - NULL || + if (xfs_refcount_check_btrec(cur, &rr->rr_rrec) != NULL || XFS_IS_CORRUPT(cur->bc_mp, rr->rr_rrec.rc_domain != XFS_REFC_DOMAIN_COW)) { xfs_btree_mark_sick(cur); @@ -2026,7 +2063,7 @@ xfs_refcount_query_range_helper( xfs_failaddr_t fa; xfs_refcount_btrec_to_irec(rec, &irec); - fa = xfs_refcount_check_irec(to_perag(cur->bc_group), &irec); + fa = xfs_refcount_check_btrec(cur, &irec); if (fa) return xfs_refcount_complain_bad_rec(cur, fa, &irec); diff --git a/fs/xfs/libxfs/xfs_refcount.h b/fs/xfs/libxfs/xfs_refcount.h index 62d78afcf1f3ff..9cd58d48716f83 100644 --- a/fs/xfs/libxfs/xfs_refcount.h +++ b/fs/xfs/libxfs/xfs_refcount.h @@ -12,6 +12,7 @@ struct xfs_perag; struct xfs_btree_cur; struct xfs_bmbt_irec; struct xfs_refcount_irec; +struct xfs_rtgroup; extern int xfs_refcount_lookup_le(struct xfs_btree_cur *cur, enum xfs_refc_domain domain, xfs_agblock_t bno, int *stat); @@ -120,6 +121,8 @@ extern void xfs_refcount_btrec_to_irec(const union xfs_btree_rec *rec, struct xfs_refcount_irec *irec); xfs_failaddr_t xfs_refcount_check_irec(struct xfs_perag *pag, const struct xfs_refcount_irec *irec); +xfs_failaddr_t xfs_rtrefcount_check_irec(struct xfs_rtgroup *rtg, + const struct xfs_refcount_irec *irec); extern int xfs_refcount_insert(struct xfs_btree_cur *cur, struct xfs_refcount_irec *irec, int *stat); From patchwork Thu Dec 19 19:34:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915607 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4AA581A071C for ; Thu, 19 Dec 2024 19:34:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636892; cv=none; b=Drq8IH9ONcmPD4QHioi1MnSgkWK7pEXvonAl6cLP6IQnfKmDNbaZ7jqemhmNwvABXhmFrjDRcE37rl9Md4CMupjcQ1/kkqHryQI6W1xG/bUW2/zp8Iy98sK6axZy8JxvjTSotZgW3GnZSf0TBrGQ8ztZXhOvIYLLZvRYZ+IZvs0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636892; c=relaxed/simple; bh=ztV5l6UVxmnII7zfboLNK4BuN/uycBXH59OsHBDil9s=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=A+2g2iQ5n8sFa91vLu7vLkKjDUz9aYyP8MMwu3FBjpFlARv3DPX9vrqViclkJPqf1B6TnfvlviAcF8DxyTzcFedSpnEYFVyjwbkC9o5wPnYVZi0V32A/80iAewK/+/+7rZhtne0pi5FeTFAZSLL4mQqjp3Tf4ZmUcZm2DbGw8qI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ukmBnM87; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ukmBnM87" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1C1A8C4CECE; Thu, 19 Dec 2024 19:34:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636892; bh=ztV5l6UVxmnII7zfboLNK4BuN/uycBXH59OsHBDil9s=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=ukmBnM87K7/pjFX94xHDrWriNvdQyEE1rw8ocmYuNuE5bmwdTmkrNdO2PRnTsmWWv Vb87od+MIoIJ6l1ezLbPxYrd5f7qdolinqYYB/zi9NkFdMOFl6DPPA4kGGewEWcJQp sYbpsXMvWlvhPFSjqGaoCy+7cz27BxKAS9kj4FZjj4eRFB0yQYedteupAfTAITAJUZ 2JQSv1hFUsLLjwfH2QwpU+ovkMhrSOUpsyoBGc1PiXxKSV6zcY3Qf5Sh1aZoK0+lsX GvsHLVmcPLsYL6AgGx115DU1LcB3Hn3CGfExeO2oe02JeYRI6EbLUXJbdFNcsUWbdL lIvgDc+x2VtTQ== Date: Thu, 19 Dec 2024 11:34:51 -0800 Subject: [PATCH 07/43] xfs: add a realtime flag to the refcount update log redo items From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581097.1572761.13008514207446264665.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Extend the refcount update (CUI) log items with a new realtime flag that indicates that the updates apply against the realtime refcountbt. We'll wire up the actual refcount code later. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_bmap.c | 10 +- fs/xfs/libxfs/xfs_defer.h | 1 fs/xfs/libxfs/xfs_log_format.h | 6 + fs/xfs/libxfs/xfs_log_recover.h | 2 fs/xfs/libxfs/xfs_refcount.c | 64 ++++++++--- fs/xfs/libxfs/xfs_refcount.h | 17 ++- fs/xfs/scrub/cow_repair.c | 2 fs/xfs/scrub/reap.c | 5 + fs/xfs/xfs_log_recover.c | 2 fs/xfs/xfs_refcount_item.c | 224 +++++++++++++++++++++++++++++++++++++-- fs/xfs/xfs_reflink.c | 19 ++- 11 files changed, 298 insertions(+), 54 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 02323936cc9b20..d63713630236e7 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4564,8 +4564,9 @@ xfs_bmapi_write( * the refcount btree for orphan recovery. */ if (whichfork == XFS_COW_FORK) - xfs_refcount_alloc_cow_extent(tp, bma.blkno, - bma.length); + xfs_refcount_alloc_cow_extent(tp, + XFS_IS_REALTIME_INODE(ip), + bma.blkno, bma.length); } /* Deal with the allocated space we found. */ @@ -4740,7 +4741,8 @@ xfs_bmapi_convert_one_delalloc( *seq = READ_ONCE(ifp->if_seq); if (whichfork == XFS_COW_FORK) - xfs_refcount_alloc_cow_extent(tp, bma.blkno, bma.length); + xfs_refcount_alloc_cow_extent(tp, XFS_IS_REALTIME_INODE(ip), + bma.blkno, bma.length); error = xfs_bmap_btree_to_extents(tp, ip, bma.cur, &bma.logflags, whichfork); @@ -5388,7 +5390,7 @@ xfs_bmap_del_extent_real( bool isrt = xfs_ifork_is_realtime(ip, whichfork); if (xfs_is_reflink_inode(ip) && whichfork == XFS_DATA_FORK) { - xfs_refcount_decrease_extent(tp, del); + xfs_refcount_decrease_extent(tp, isrt, del); } else if (isrt && !xfs_has_rtgroups(mp)) { error = xfs_bmap_free_rtblocks(tp, del); } else { diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h index 1e2477eaa5a844..9effd95ddcd4ee 100644 --- a/fs/xfs/libxfs/xfs_defer.h +++ b/fs/xfs/libxfs/xfs_defer.h @@ -68,6 +68,7 @@ struct xfs_defer_op_type { extern const struct xfs_defer_op_type xfs_bmap_update_defer_type; extern const struct xfs_defer_op_type xfs_refcount_update_defer_type; +extern const struct xfs_defer_op_type xfs_rtrefcount_update_defer_type; extern const struct xfs_defer_op_type xfs_rmap_update_defer_type; extern const struct xfs_defer_op_type xfs_rtrmap_update_defer_type; extern const struct xfs_defer_op_type xfs_extent_free_defer_type; diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h index a7e0e479454d3d..ec7157eaba5f68 100644 --- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -252,6 +252,8 @@ typedef struct xfs_trans_header { #define XFS_LI_EFD_RT 0x124b /* realtime extent free done */ #define XFS_LI_RUI_RT 0x124c /* realtime rmap update intent */ #define XFS_LI_RUD_RT 0x124d /* realtime rmap update done */ +#define XFS_LI_CUI_RT 0x124e /* realtime refcount update intent */ +#define XFS_LI_CUD_RT 0x124f /* realtime refcount update done */ #define XFS_LI_TYPE_DESC \ { XFS_LI_EFI, "XFS_LI_EFI" }, \ @@ -275,7 +277,9 @@ typedef struct xfs_trans_header { { XFS_LI_EFI_RT, "XFS_LI_EFI_RT" }, \ { XFS_LI_EFD_RT, "XFS_LI_EFD_RT" }, \ { XFS_LI_RUI_RT, "XFS_LI_RUI_RT" }, \ - { XFS_LI_RUD_RT, "XFS_LI_RUD_RT" } + { XFS_LI_RUD_RT, "XFS_LI_RUD_RT" }, \ + { XFS_LI_CUI_RT, "XFS_LI_CUI_RT" }, \ + { XFS_LI_CUD_RT, "XFS_LI_CUD_RT" } /* * Inode Log Item Format definitions. diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h index abc705aff26dfe..66c7916fb5cd5e 100644 --- a/fs/xfs/libxfs/xfs_log_recover.h +++ b/fs/xfs/libxfs/xfs_log_recover.h @@ -81,6 +81,8 @@ extern const struct xlog_recover_item_ops xlog_rtefi_item_ops; extern const struct xlog_recover_item_ops xlog_rtefd_item_ops; extern const struct xlog_recover_item_ops xlog_rtrui_item_ops; extern const struct xlog_recover_item_ops xlog_rtrud_item_ops; +extern const struct xlog_recover_item_ops xlog_rtcui_item_ops; +extern const struct xlog_recover_item_ops xlog_rtcud_item_ops; /* * Macros, structures, prototypes for internal log manager use. diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 9be40ac16c7d56..8007d15856252b 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -26,6 +26,7 @@ #include "xfs_health.h" #include "xfs_refcount_item.h" #include "xfs_rtgroup.h" +#include "xfs_rtalloc.h" struct kmem_cache *xfs_refcount_intent_cache; @@ -1123,6 +1124,22 @@ xfs_refcount_still_have_space( cur->bc_refc.nr_ops * XFS_REFCOUNT_ITEM_OVERHEAD; } +/* Schedule an extent free. */ +static int +xrefc_free_extent( + struct xfs_btree_cur *cur, + struct xfs_refcount_irec *rec) +{ + unsigned int flags = 0; + + if (xfs_btree_is_rtrefcount(cur->bc_ops)) + flags |= XFS_FREE_EXTENT_REALTIME; + + return xfs_free_extent_later(cur->bc_tp, + xfs_gbno_to_fsb(cur->bc_group, rec->rc_startblock), + rec->rc_blockcount, NULL, XFS_AG_RESV_NONE, flags); +} + /* * Adjust the refcounts of middle extents. At this point we should have * split extents that crossed the adjustment range; merged with adjacent @@ -1139,7 +1156,6 @@ xfs_refcount_adjust_extents( struct xfs_refcount_irec ext, tmp; int error; int found_rec, found_tmp; - xfs_fsblock_t fsbno; /* Merging did all the work already. */ if (*aglen == 0) @@ -1192,11 +1208,7 @@ xfs_refcount_adjust_extents( goto out_error; } } else { - fsbno = xfs_agbno_to_fsb(to_perag(cur->bc_group), - tmp.rc_startblock); - error = xfs_free_extent_later(cur->bc_tp, fsbno, - tmp.rc_blockcount, NULL, - XFS_AG_RESV_NONE, 0); + error = xrefc_free_extent(cur, &tmp); if (error) goto out_error; } @@ -1254,11 +1266,7 @@ xfs_refcount_adjust_extents( } goto advloop; } else { - fsbno = xfs_agbno_to_fsb(to_perag(cur->bc_group), - ext.rc_startblock); - error = xfs_free_extent_later(cur->bc_tp, fsbno, - ext.rc_blockcount, NULL, - XFS_AG_RESV_NONE, 0); + error = xrefc_free_extent(cur, &ext); if (error) goto out_error; } @@ -1454,6 +1462,20 @@ xfs_refcount_finish_one( return error; } +/* + * Process one of the deferred realtime refcount operations. We pass back the + * btree cursor to maintain our lock on the btree between calls. + */ +int +xfs_rtrefcount_finish_one( + struct xfs_trans *tp, + struct xfs_refcount_intent *ri, + struct xfs_btree_cur **pcur) +{ + ASSERT(0); + return -EFSCORRUPTED; +} + /* * Record a refcount intent for later processing. */ @@ -1461,6 +1483,7 @@ static void __xfs_refcount_add( struct xfs_trans *tp, enum xfs_refcount_intent_type type, + bool isrt, xfs_fsblock_t startblock, xfs_extlen_t blockcount) { @@ -1472,6 +1495,7 @@ __xfs_refcount_add( ri->ri_type = type; ri->ri_startblock = startblock; ri->ri_blockcount = blockcount; + ri->ri_realtime = isrt; xfs_refcount_defer_add(tp, ri); } @@ -1482,12 +1506,13 @@ __xfs_refcount_add( void xfs_refcount_increase_extent( struct xfs_trans *tp, + bool isrt, struct xfs_bmbt_irec *PREV) { if (!xfs_has_reflink(tp->t_mountp)) return; - __xfs_refcount_add(tp, XFS_REFCOUNT_INCREASE, PREV->br_startblock, + __xfs_refcount_add(tp, XFS_REFCOUNT_INCREASE, isrt, PREV->br_startblock, PREV->br_blockcount); } @@ -1497,12 +1522,13 @@ xfs_refcount_increase_extent( void xfs_refcount_decrease_extent( struct xfs_trans *tp, + bool isrt, struct xfs_bmbt_irec *PREV) { if (!xfs_has_reflink(tp->t_mountp)) return; - __xfs_refcount_add(tp, XFS_REFCOUNT_DECREASE, PREV->br_startblock, + __xfs_refcount_add(tp, XFS_REFCOUNT_DECREASE, isrt, PREV->br_startblock, PREV->br_blockcount); } @@ -1858,6 +1884,7 @@ __xfs_refcount_cow_free( void xfs_refcount_alloc_cow_extent( struct xfs_trans *tp, + bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len) { @@ -1866,16 +1893,17 @@ xfs_refcount_alloc_cow_extent( if (!xfs_has_reflink(mp)) return; - __xfs_refcount_add(tp, XFS_REFCOUNT_ALLOC_COW, fsb, len); + __xfs_refcount_add(tp, XFS_REFCOUNT_ALLOC_COW, isrt, fsb, len); /* Add rmap entry */ - xfs_rmap_alloc_extent(tp, false, fsb, len, XFS_RMAP_OWN_COW); + xfs_rmap_alloc_extent(tp, isrt, fsb, len, XFS_RMAP_OWN_COW); } /* Forget a CoW staging event in the refcount btree. */ void xfs_refcount_free_cow_extent( struct xfs_trans *tp, + bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len) { @@ -1885,8 +1913,8 @@ xfs_refcount_free_cow_extent( return; /* Remove rmap entry */ - xfs_rmap_free_extent(tp, false, fsb, len, XFS_RMAP_OWN_COW); - __xfs_refcount_add(tp, XFS_REFCOUNT_FREE_COW, fsb, len); + xfs_rmap_free_extent(tp, isrt, fsb, len, XFS_RMAP_OWN_COW); + __xfs_refcount_add(tp, XFS_REFCOUNT_FREE_COW, isrt, fsb, len); } struct xfs_refcount_recovery { @@ -1992,7 +2020,7 @@ xfs_refcount_recover_cow_leftovers( /* Free the orphan record */ fsb = xfs_agbno_to_fsb(pag, rr->rr_rrec.rc_startblock); - xfs_refcount_free_cow_extent(tp, fsb, + xfs_refcount_free_cow_extent(tp, false, fsb, rr->rr_rrec.rc_blockcount); /* Free the block. */ diff --git a/fs/xfs/libxfs/xfs_refcount.h b/fs/xfs/libxfs/xfs_refcount.h index 9cd58d48716f83..be11df25abcc5b 100644 --- a/fs/xfs/libxfs/xfs_refcount.h +++ b/fs/xfs/libxfs/xfs_refcount.h @@ -61,6 +61,7 @@ struct xfs_refcount_intent { enum xfs_refcount_intent_type ri_type; xfs_extlen_t ri_blockcount; xfs_fsblock_t ri_startblock; + bool ri_realtime; }; /* Check that the refcount is appropriate for the record domain. */ @@ -75,22 +76,24 @@ xfs_refcount_check_domain( return true; } -void xfs_refcount_increase_extent(struct xfs_trans *tp, +void xfs_refcount_increase_extent(struct xfs_trans *tp, bool isrt, struct xfs_bmbt_irec *irec); -void xfs_refcount_decrease_extent(struct xfs_trans *tp, +void xfs_refcount_decrease_extent(struct xfs_trans *tp, bool isrt, struct xfs_bmbt_irec *irec); -extern int xfs_refcount_finish_one(struct xfs_trans *tp, +int xfs_refcount_finish_one(struct xfs_trans *tp, + struct xfs_refcount_intent *ri, struct xfs_btree_cur **pcur); +int xfs_rtrefcount_finish_one(struct xfs_trans *tp, struct xfs_refcount_intent *ri, struct xfs_btree_cur **pcur); extern int xfs_refcount_find_shared(struct xfs_btree_cur *cur, xfs_agblock_t agbno, xfs_extlen_t aglen, xfs_agblock_t *fbno, xfs_extlen_t *flen, bool find_end_of_shared); -void xfs_refcount_alloc_cow_extent(struct xfs_trans *tp, xfs_fsblock_t fsb, - xfs_extlen_t len); -void xfs_refcount_free_cow_extent(struct xfs_trans *tp, xfs_fsblock_t fsb, - xfs_extlen_t len); +void xfs_refcount_alloc_cow_extent(struct xfs_trans *tp, bool isrt, + xfs_fsblock_t fsb, xfs_extlen_t len); +void xfs_refcount_free_cow_extent(struct xfs_trans *tp, bool isrt, + xfs_fsblock_t fsb, xfs_extlen_t len); extern int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp, struct xfs_perag *pag); diff --git a/fs/xfs/scrub/cow_repair.c b/fs/xfs/scrub/cow_repair.c index 5b6194cef3e5e3..ba695dd21f8b96 100644 --- a/fs/xfs/scrub/cow_repair.c +++ b/fs/xfs/scrub/cow_repair.c @@ -343,7 +343,7 @@ xrep_cow_alloc( if (args.fsbno == NULLFSBLOCK) return -ENOSPC; - xfs_refcount_alloc_cow_extent(sc->tp, args.fsbno, args.len); + xfs_refcount_alloc_cow_extent(sc->tp, false, args.fsbno, args.len); repl->fsbno = args.fsbno; repl->len = args.len; diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c index 4d7f1b82dc559d..534c7696e9a1a7 100644 --- a/fs/xfs/scrub/reap.c +++ b/fs/xfs/scrub/reap.c @@ -419,7 +419,8 @@ xreap_agextent_iter( * records from the refcountbt, which will remove the * rmap record as well. */ - xfs_refcount_free_cow_extent(sc->tp, fsbno, *aglenp); + xfs_refcount_free_cow_extent(sc->tp, false, fsbno, + *aglenp); return 0; } @@ -451,7 +452,7 @@ xreap_agextent_iter( if (rs->oinfo == &XFS_RMAP_OINFO_COW) { ASSERT(rs->resv == XFS_AG_RESV_NONE); - xfs_refcount_free_cow_extent(sc->tp, fsbno, *aglenp); + xfs_refcount_free_cow_extent(sc->tp, false, fsbno, *aglenp); error = xfs_free_extent_later(sc->tp, fsbno, *aglenp, NULL, rs->resv, XFS_FREE_EXTENT_SKIP_DISCARD); if (error) diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 5c95c97519c767..b3c27dbccce861 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -1822,6 +1822,8 @@ static const struct xlog_recover_item_ops *xlog_recover_item_ops[] = { &xlog_rtefd_item_ops, &xlog_rtrui_item_ops, &xlog_rtrud_item_ops, + &xlog_rtcui_item_ops, + &xlog_rtcud_item_ops, }; static const struct xlog_recover_item_ops * diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index c807c4b90c44e3..2086d40514d086 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -23,6 +23,7 @@ #include "xfs_ag.h" #include "xfs_btree.h" #include "xfs_trace.h" +#include "xfs_rtgroup.h" struct kmem_cache *xfs_cui_cache; struct kmem_cache *xfs_cud_cache; @@ -94,8 +95,9 @@ xfs_cui_item_format( ASSERT(atomic_read(&cuip->cui_next_extent) == cuip->cui_format.cui_nextents); + ASSERT(lip->li_type == XFS_LI_CUI || lip->li_type == XFS_LI_CUI_RT); - cuip->cui_format.cui_type = XFS_LI_CUI; + cuip->cui_format.cui_type = lip->li_type; cuip->cui_format.cui_size = 1; xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_CUI_FORMAT, &cuip->cui_format, @@ -138,12 +140,14 @@ xfs_cui_item_release( STATIC struct xfs_cui_log_item * xfs_cui_init( struct xfs_mount *mp, + unsigned short item_type, uint nextents) - { struct xfs_cui_log_item *cuip; ASSERT(nextents > 0); + ASSERT(item_type == XFS_LI_CUI || item_type == XFS_LI_CUI_RT); + if (nextents > XFS_CUI_MAX_FAST_EXTENTS) cuip = kzalloc(xfs_cui_log_item_sizeof(nextents), GFP_KERNEL | __GFP_NOFAIL); @@ -151,7 +155,7 @@ xfs_cui_init( cuip = kmem_cache_zalloc(xfs_cui_cache, GFP_KERNEL | __GFP_NOFAIL); - xfs_log_item_init(mp, &cuip->cui_item, XFS_LI_CUI, &xfs_cui_item_ops); + xfs_log_item_init(mp, &cuip->cui_item, item_type, &xfs_cui_item_ops); cuip->cui_format.cui_nextents = nextents; cuip->cui_format.cui_id = (uintptr_t)(void *)cuip; atomic_set(&cuip->cui_next_extent, 0); @@ -190,7 +194,9 @@ xfs_cud_item_format( struct xfs_cud_log_item *cudp = CUD_ITEM(lip); struct xfs_log_iovec *vecp = NULL; - cudp->cud_format.cud_type = XFS_LI_CUD; + ASSERT(lip->li_type == XFS_LI_CUD || lip->li_type == XFS_LI_CUD_RT); + + cudp->cud_format.cud_type = lip->li_type; cudp->cud_format.cud_size = 1; xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_CUD_FORMAT, &cudp->cud_format, @@ -234,6 +240,14 @@ static inline struct xfs_refcount_intent *ci_entry(const struct list_head *e) return list_entry(e, struct xfs_refcount_intent, ri_list); } +static inline bool +xfs_cui_item_isrt(const struct xfs_log_item *lip) +{ + ASSERT(lip->li_type == XFS_LI_CUI || lip->li_type == XFS_LI_CUI_RT); + + return lip->li_type == XFS_LI_CUI_RT; +} + /* Sort refcount intents by AG. */ static int xfs_refcount_update_diff_items( @@ -282,18 +296,20 @@ xfs_refcount_update_log_item( } static struct xfs_log_item * -xfs_refcount_update_create_intent( +__xfs_refcount_update_create_intent( struct xfs_trans *tp, struct list_head *items, unsigned int count, - bool sort) + bool sort, + unsigned short item_type) { struct xfs_mount *mp = tp->t_mountp; - struct xfs_cui_log_item *cuip = xfs_cui_init(mp, count); + struct xfs_cui_log_item *cuip; struct xfs_refcount_intent *ri; ASSERT(count > 0); + cuip = xfs_cui_init(mp, item_type, count); if (sort) list_sort(mp, items, xfs_refcount_update_diff_items); list_for_each_entry(ri, items, ri_list) @@ -301,6 +317,23 @@ xfs_refcount_update_create_intent( return &cuip->cui_item; } +static struct xfs_log_item * +xfs_refcount_update_create_intent( + struct xfs_trans *tp, + struct list_head *items, + unsigned int count, + bool sort) +{ + return __xfs_refcount_update_create_intent(tp, items, count, sort, + XFS_LI_CUI); +} + +static inline unsigned short +xfs_cud_type_from_cui(const struct xfs_cui_log_item *cuip) +{ + return xfs_cui_item_isrt(&cuip->cui_item) ? XFS_LI_CUD_RT : XFS_LI_CUD; +} + /* Get an CUD so we can process all the deferred refcount updates. */ static struct xfs_log_item * xfs_refcount_update_create_done( @@ -312,8 +345,8 @@ xfs_refcount_update_create_done( struct xfs_cud_log_item *cudp; cudp = kmem_cache_zalloc(xfs_cud_cache, GFP_KERNEL | __GFP_NOFAIL); - xfs_log_item_init(tp->t_mountp, &cudp->cud_item, XFS_LI_CUD, - &xfs_cud_item_ops); + xfs_log_item_init(tp->t_mountp, &cudp->cud_item, + xfs_cud_type_from_cui(cuip), &xfs_cud_item_ops); cudp->cud_cuip = cuip; cudp->cud_format.cud_cui_id = cuip->cui_format.cui_id; @@ -328,10 +361,20 @@ xfs_refcount_defer_add( { struct xfs_mount *mp = tp->t_mountp; - ri->ri_group = xfs_group_intent_get(mp, ri->ri_startblock, XG_TYPE_AG); + /* + * Deferred refcount updates for the realtime and data sections must + * use separate transactions to finish deferred work because updates to + * realtime metadata files can lock AGFs to allocate btree blocks and + * we don't want that mixing with the AGF locks taken to finish data + * section updates. + */ + ri->ri_group = xfs_group_intent_get(mp, ri->ri_startblock, + ri->ri_realtime ? XG_TYPE_RTG : XG_TYPE_AG); trace_xfs_refcount_defer(mp, ri); - xfs_defer_add(tp, &ri->ri_list, &xfs_refcount_update_defer_type); + xfs_defer_add(tp, &ri->ri_list, ri->ri_realtime ? + &xfs_rtrefcount_update_defer_type : + &xfs_refcount_update_defer_type); } /* Cancel a deferred refcount update. */ @@ -381,7 +424,7 @@ xfs_refcount_finish_one_cleanup( return; agbp = rcur->bc_ag.agbp; xfs_btree_del_cursor(rcur, error); - if (error) + if (error && agbp) xfs_trans_brelse(tp, agbp); } @@ -515,10 +558,13 @@ xfs_refcount_relog_intent( struct xfs_phys_extent *pmap; unsigned int count; + ASSERT(intent->li_type == XFS_LI_CUI || + intent->li_type == XFS_LI_CUI_RT); + count = CUI_ITEM(intent)->cui_format.cui_nextents; pmap = CUI_ITEM(intent)->cui_format.cui_extents; - cuip = xfs_cui_init(tp->t_mountp, count); + cuip = xfs_cui_init(tp->t_mountp, intent->li_type, count); memcpy(cuip->cui_format.cui_extents, pmap, count * sizeof(*pmap)); atomic_set(&cuip->cui_next_extent, count); @@ -538,6 +584,71 @@ const struct xfs_defer_op_type xfs_refcount_update_defer_type = { .relog_intent = xfs_refcount_relog_intent, }; +#ifdef CONFIG_XFS_RT +static struct xfs_log_item * +xfs_rtrefcount_update_create_intent( + struct xfs_trans *tp, + struct list_head *items, + unsigned int count, + bool sort) +{ + return __xfs_refcount_update_create_intent(tp, items, count, sort, + XFS_LI_CUI_RT); +} + +/* Process a deferred realtime refcount update. */ +STATIC int +xfs_rtrefcount_update_finish_item( + struct xfs_trans *tp, + struct xfs_log_item *done, + struct list_head *item, + struct xfs_btree_cur **state) +{ + struct xfs_refcount_intent *ri = ci_entry(item); + int error; + + error = xfs_rtrefcount_finish_one(tp, ri, state); + + /* Did we run out of reservation? Requeue what we didn't finish. */ + if (!error && ri->ri_blockcount > 0) { + ASSERT(ri->ri_type == XFS_REFCOUNT_INCREASE || + ri->ri_type == XFS_REFCOUNT_DECREASE); + return -EAGAIN; + } + + xfs_refcount_update_cancel_item(item); + return error; +} + +/* Clean up after calling xfs_rtrefcount_finish_one. */ +STATIC void +xfs_rtrefcount_finish_one_cleanup( + struct xfs_trans *tp, + struct xfs_btree_cur *rcur, + int error) +{ + if (rcur) + xfs_btree_del_cursor(rcur, error); +} + +const struct xfs_defer_op_type xfs_rtrefcount_update_defer_type = { + .name = "rtrefcount", + .max_items = XFS_CUI_MAX_FAST_EXTENTS, + .create_intent = xfs_rtrefcount_update_create_intent, + .abort_intent = xfs_refcount_update_abort_intent, + .create_done = xfs_refcount_update_create_done, + .finish_item = xfs_rtrefcount_update_finish_item, + .finish_cleanup = xfs_rtrefcount_finish_one_cleanup, + .cancel_item = xfs_refcount_update_cancel_item, + .recover_work = xfs_refcount_recover_work, + .relog_intent = xfs_refcount_relog_intent, +}; +#else +const struct xfs_defer_op_type xfs_rtrefcount_update_defer_type = { + .name = "rtrefcount", +}; +#endif /* CONFIG_XFS_RT */ + STATIC bool xfs_cui_item_match( struct xfs_log_item *lip, @@ -603,7 +714,7 @@ xlog_recover_cui_commit_pass2( return -EFSCORRUPTED; } - cuip = xfs_cui_init(mp, cui_formatp->cui_nextents); + cuip = xfs_cui_init(mp, ITEM_TYPE(item), cui_formatp->cui_nextents); xfs_cui_copy_format(&cuip->cui_format, cui_formatp); atomic_set(&cuip->cui_next_extent, cui_formatp->cui_nextents); @@ -617,6 +728,61 @@ const struct xlog_recover_item_ops xlog_cui_item_ops = { .commit_pass2 = xlog_recover_cui_commit_pass2, }; +#ifdef CONFIG_XFS_RT +STATIC int +xlog_recover_rtcui_commit_pass2( + struct xlog *log, + struct list_head *buffer_list, + struct xlog_recover_item *item, + xfs_lsn_t lsn) +{ + struct xfs_mount *mp = log->l_mp; + struct xfs_cui_log_item *cuip; + struct xfs_cui_log_format *cui_formatp; + size_t len; + + cui_formatp = item->ri_buf[0].i_addr; + + if (item->ri_buf[0].i_len < xfs_cui_log_format_sizeof(0)) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + item->ri_buf[0].i_addr, item->ri_buf[0].i_len); + return -EFSCORRUPTED; + } + + len = xfs_cui_log_format_sizeof(cui_formatp->cui_nextents); + if (item->ri_buf[0].i_len != len) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, + item->ri_buf[0].i_addr, item->ri_buf[0].i_len); + return -EFSCORRUPTED; + } + + cuip = xfs_cui_init(mp, ITEM_TYPE(item), cui_formatp->cui_nextents); + xfs_cui_copy_format(&cuip->cui_format, cui_formatp); + atomic_set(&cuip->cui_next_extent, cui_formatp->cui_nextents); + + xlog_recover_intent_item(log, &cuip->cui_item, lsn, + &xfs_rtrefcount_update_defer_type); + return 0; +} +#else +STATIC int +xlog_recover_rtcui_commit_pass2( + struct xlog *log, + struct list_head *buffer_list, + struct xlog_recover_item *item, + xfs_lsn_t lsn) +{ + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, log->l_mp, + item->ri_buf[0].i_addr, item->ri_buf[0].i_len); + return -EFSCORRUPTED; +} +#endif + +const struct xlog_recover_item_ops xlog_rtcui_item_ops = { + .item_type = XFS_LI_CUI_RT, + .commit_pass2 = xlog_recover_rtcui_commit_pass2, +}; + /* * This routine is called when an CUD format structure is found in a committed * transaction in the log. Its purpose is to cancel the corresponding CUI if it @@ -648,3 +814,33 @@ const struct xlog_recover_item_ops xlog_cud_item_ops = { .item_type = XFS_LI_CUD, .commit_pass2 = xlog_recover_cud_commit_pass2, }; + +#ifdef CONFIG_XFS_RT +STATIC int +xlog_recover_rtcud_commit_pass2( + struct xlog *log, + struct list_head *buffer_list, + struct xlog_recover_item *item, + xfs_lsn_t lsn) +{ + struct xfs_cud_log_format *cud_formatp; + + cud_formatp = item->ri_buf[0].i_addr; + if (item->ri_buf[0].i_len != sizeof(struct xfs_cud_log_format)) { + XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, log->l_mp, + item->ri_buf[0].i_addr, item->ri_buf[0].i_len); + return -EFSCORRUPTED; + } + + xlog_recover_release_intent(log, XFS_LI_CUI_RT, + cud_formatp->cud_cui_id); + return 0; +} +#else +# define xlog_recover_rtcud_commit_pass2 xlog_recover_rtcui_commit_pass2 +#endif + +const struct xlog_recover_item_ops xlog_rtcud_item_ops = { + .item_type = XFS_LI_CUD_RT, + .commit_pass2 = xlog_recover_rtcud_commit_pass2, +}; diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index b11769c009effc..02457e176c4252 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -571,6 +571,7 @@ xfs_reflink_cancel_cow_blocks( struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_COW_FORK); struct xfs_bmbt_irec got, del; struct xfs_iext_cursor icur; + bool isrt = XFS_IS_REALTIME_INODE(ip); int error = 0; if (!xfs_inode_has_cow_data(ip)) @@ -598,12 +599,13 @@ xfs_reflink_cancel_cow_blocks( ASSERT((*tpp)->t_highest_agno == NULLAGNUMBER); /* Free the CoW orphan record. */ - xfs_refcount_free_cow_extent(*tpp, del.br_startblock, - del.br_blockcount); + xfs_refcount_free_cow_extent(*tpp, isrt, + del.br_startblock, del.br_blockcount); error = xfs_free_extent_later(*tpp, del.br_startblock, del.br_blockcount, NULL, - XFS_AG_RESV_NONE, 0); + XFS_AG_RESV_NONE, + isrt ? XFS_FREE_EXTENT_REALTIME : 0); if (error) break; @@ -710,6 +712,7 @@ xfs_reflink_end_cow_extent( struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_COW_FORK); unsigned int resblks; int nmaps; + bool isrt = XFS_IS_REALTIME_INODE(ip); int error; resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); @@ -779,7 +782,7 @@ xfs_reflink_end_cow_extent( * or not), unmap the extent and drop its refcount. */ xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &data); - xfs_refcount_decrease_extent(tp, &data); + xfs_refcount_decrease_extent(tp, isrt, &data); xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, -data.br_blockcount); } else if (data.br_startblock == DELAYSTARTBLOCK) { @@ -799,7 +802,8 @@ xfs_reflink_end_cow_extent( } /* Free the CoW orphan record. */ - xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount); + xfs_refcount_free_cow_extent(tp, isrt, del.br_startblock, + del.br_blockcount); /* Map the new blocks into the data fork. */ xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &del); @@ -1135,6 +1139,7 @@ xfs_reflink_remap_extent( bool quota_reserved = true; bool smap_real; bool dmap_written = xfs_bmap_is_written_extent(dmap); + bool isrt = XFS_IS_REALTIME_INODE(ip); int iext_delta = 0; int nimaps; int error; @@ -1264,7 +1269,7 @@ xfs_reflink_remap_extent( * or not), unmap the extent and drop its refcount. */ xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &smap); - xfs_refcount_decrease_extent(tp, &smap); + xfs_refcount_decrease_extent(tp, isrt, &smap); qdelta -= smap.br_blockcount; } else if (smap.br_startblock == DELAYSTARTBLOCK) { int done; @@ -1287,7 +1292,7 @@ xfs_reflink_remap_extent( * its refcount and map it into the file. */ if (dmap_written) { - xfs_refcount_increase_extent(tp, dmap); + xfs_refcount_increase_extent(tp, isrt, dmap); xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, dmap); qdelta += dmap->br_blockcount; } From patchwork Thu Dec 19 19:35:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915608 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E59DB198A08 for ; Thu, 19 Dec 2024 19:35:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636908; cv=none; b=BkjDXT3pEP6PKC55BA6Ip1kwQlb3iXvMpLnqcxJ5hcm5igo9W2WWRa+L/0dxXNXfeXtw1YMGwJW3ZiqSVHI6QtMsokXfmZZ+WJEnMMYPUs1exOL6radxdDrYA13q58SLrf1p/H+j6bLQwHM9PC3S2war/6yOvpQJPS3wrviWNKc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636908; c=relaxed/simple; bh=HemWDq9qDFjcNe/eRbtiNBUhO/YD3edsPVAd2P2iXVw=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=p/pnK0dV70NS/aAmhemEFk49u/4Mx+vUq3SprKpEKD+Md4CWZuAmQ28cYr+wyczK6rZWXPgkEx3FjQr0q/5SsKKqSIeRvZftxnyEZM6rBCSeMHSL+1AGgvFlJw0bRfAwK/Lvy6CesWYFhibAFelAKy8YTtXCNmTBP49TLkFws+I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=K5ThQTGD; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="K5ThQTGD" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BC305C4CECE; Thu, 19 Dec 2024 19:35:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636907; bh=HemWDq9qDFjcNe/eRbtiNBUhO/YD3edsPVAd2P2iXVw=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=K5ThQTGDlp6Wea6t4SOEwKlyj4MXvdmJ4HvUQ0Y2l73SZUwU5At7tYwDNQMii4vzx cjNpQbX6hZtPcwnms9AECxdPiXTEdAMbuhkLBLGRJbC8BMQeu5sr/LzNEV+G1X2dAC QKcxZ2Waj/ak8gKEZkHIdYBqkNARxIHshFAN9HznBOXVm9W0XAqa68/16besPgtp94 x0LppX8N7kxQhclejo7dVy6+3+BD9B4jqY4iIZIdU9tUhnzOIYa+wkjU8sNUmm2pBj 9q+vHMtnwuiUI5KvuWczXg6XZWfrzIeFPHca7Wjqre+P2/jZyBcoTjtLvnWnOhB8lb xsMdby5ZL8omA== Date: Thu, 19 Dec 2024 11:35:07 -0800 Subject: [PATCH 08/43] xfs: support recovering refcount intent items targetting realtime extents From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581115.1572761.6866470540595917101.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Now that we have reflink on the realtime device, refcount intent items have to support remapping extents on the realtime volume. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_refcount_item.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index 2086d40514d086..fe2d7aab8554fc 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -440,6 +440,7 @@ xfs_refcount_update_abort_intent( static inline bool xfs_cui_validate_phys( struct xfs_mount *mp, + bool isrt, struct xfs_phys_extent *pmap) { if (!xfs_has_reflink(mp)) @@ -458,6 +459,9 @@ xfs_cui_validate_phys( return false; } + if (isrt) + return xfs_verify_rtbext(mp, pmap->pe_startblock, pmap->pe_len); + return xfs_verify_fsbext(mp, pmap->pe_startblock, pmap->pe_len); } @@ -465,6 +469,7 @@ static inline void xfs_cui_recover_work( struct xfs_mount *mp, struct xfs_defer_pending *dfp, + bool isrt, struct xfs_phys_extent *pmap) { struct xfs_refcount_intent *ri; @@ -475,7 +480,8 @@ xfs_cui_recover_work( ri->ri_startblock = pmap->pe_startblock; ri->ri_blockcount = pmap->pe_len; ri->ri_group = xfs_group_intent_get(mp, pmap->pe_startblock, - XG_TYPE_AG); + isrt ? XG_TYPE_RTG : XG_TYPE_AG); + ri->ri_realtime = isrt; xfs_defer_add_item(dfp, &ri->ri_list); } @@ -494,6 +500,7 @@ xfs_refcount_recover_work( struct xfs_cui_log_item *cuip = CUI_ITEM(lip); struct xfs_trans *tp; struct xfs_mount *mp = lip->li_log->l_mp; + bool isrt = xfs_cui_item_isrt(lip); int i; int error = 0; @@ -503,7 +510,7 @@ xfs_refcount_recover_work( * just toss the CUI. */ for (i = 0; i < cuip->cui_format.cui_nextents; i++) { - if (!xfs_cui_validate_phys(mp, + if (!xfs_cui_validate_phys(mp, isrt, &cuip->cui_format.cui_extents[i])) { XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, &cuip->cui_format, @@ -511,7 +518,8 @@ xfs_refcount_recover_work( return -EFSCORRUPTED; } - xfs_cui_recover_work(mp, dfp, &cuip->cui_format.cui_extents[i]); + xfs_cui_recover_work(mp, dfp, isrt, + &cuip->cui_format.cui_extents[i]); } /* From patchwork Thu Dec 19 19:35:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915609 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 80E2A1A0BF1 for ; Thu, 19 Dec 2024 19:35:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636923; cv=none; b=Ikvtl7vWABR9bLG8F5UNB1QWq96uaKWFdnTzrAVx8ne5szv9Gy40DwiAzRM2l1lwTKU860OPgA8PF5LF8AI8kwpO9ex1mp79AVC0XcGO+WvDQfN6Smt+eTpfy08O7R0KPxcoOxnsU3GABvh2RBycGwTRwuQo1/6XjkkZfrx54og= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636923; c=relaxed/simple; bh=btHwRF31cvl6HWAFraCteuWEteK+RPKJoDHa0kSJ2Is=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=DjcCbODHfOy7LgSzrCD7N4qicxFnc7T/JgGBMA28VOdp0yUJk/dpwzxW1Qjk/xGUiuMZc6Ca4n//ZQOwBSlK7bzTbG6A+GjTOnaSWKLyJP2k0dKf3+vA5qQY62qJjkTexv7mOwO80ZFKxLGqOQzR4pRRKS1UwYrQvhDB09oTTqo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Gudcw+vT; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Gudcw+vT" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 56BF5C4CECE; Thu, 19 Dec 2024 19:35:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636923; bh=btHwRF31cvl6HWAFraCteuWEteK+RPKJoDHa0kSJ2Is=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=Gudcw+vTNzqCVU4z0EnnrbKhlPmW++XEwaWaFIdNYzTbTAOetxZ8n2Cf0l/S1qmJJ bvId8DQvkHnVuG2dPgJBejS6DdwRrUgAsbTlgGWTja6MwQ8hg9zC+pyGWoZiKi0QDl gcVP05VsMetgey+NCTkSMiEUqnsZ5RVUWwABZxX6ml3AJUPA4OoRxWLMn379ckFYo/ ypbAycMWB7icqxp3p6SwtmnZJP1JE4WI7DLyQAVuKcdUnxFh6oqW/oBsHBSnSil3g3 3iWrVWsF8kPGaM1QMrPMTXSpAiAy7ChyanTOOgVpN3OSJfR4jaBNLJDF+lqjEEcENI /sUfdn+oohjMQ== Date: Thu, 19 Dec 2024 11:35:22 -0800 Subject: [PATCH 09/43] xfs: add realtime refcount btree block detection to log recovery From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581132.1572761.75332600078413039.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Identify rt refcount btree blocks in the log correctly so that we can validate them during log recovery. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_buf_item_recover.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c index 4f2e4ea29e1f57..b05d5b81f642da 100644 --- a/fs/xfs/xfs_buf_item_recover.c +++ b/fs/xfs/xfs_buf_item_recover.c @@ -271,6 +271,9 @@ xlog_recover_validate_buf_type( case XFS_REFC_CRC_MAGIC: bp->b_ops = &xfs_refcountbt_buf_ops; break; + case XFS_RTREFC_CRC_MAGIC: + bp->b_ops = &xfs_rtrefcountbt_buf_ops; + break; default: warnmsg = "Bad btree block magic!"; break; @@ -859,6 +862,7 @@ xlog_recover_get_buf_lsn( break; } case XFS_RTRMAP_CRC_MAGIC: + case XFS_RTREFC_CRC_MAGIC: case XFS_BMAP_CRC_MAGIC: case XFS_BMAP_MAGIC: { struct xfs_btree_block *btb = blk; From patchwork Thu Dec 19 19:35:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915610 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 26D951A071C for ; Thu, 19 Dec 2024 19:35:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636939; cv=none; b=IXiNnC0AYtmJKltsq8MtbF7cjdB1MIkDg25A7hw+muqnOI+9tJikZq+kohTvmvLhmKZnNNWpKXOc0yZG8c1qvzoqlOFDAN+cnq4ZWd/YycyOdYF7H/x/1SrAr6nlDmwDlvBN4B5hhO8dg/4ZNLF0QAiIrYpR5FJODMBeRofRQ/o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636939; c=relaxed/simple; bh=KUve35+d8Cyk0E4Mq8P15qOer39UE5rSBUt52ChzK0I=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ue/l7Moc6pVHOAwSQslc3ZjnU2gBXHoUjhY3N9lQ1Yyurj/1dDFj64ACq/nrl6RcZzMXE36d62zAzB/MGB5M0knZf5BV3FxR1Khk6HnAEvejHUI6IDJLy23WepE8JN6Z/M5rJdL5aBk5MfEmKdSlMMLfE5YGzmWJmD074TEmMDY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=U4XpSoh0; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="U4XpSoh0" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 00AFAC4CED0; Thu, 19 Dec 2024 19:35:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636939; bh=KUve35+d8Cyk0E4Mq8P15qOer39UE5rSBUt52ChzK0I=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=U4XpSoh0YKtMO01Ld5eezIWBJKomfScuuuh28Lqoa/1nEr1CbAUkXGec2auRdKEui /XBEh3xnh3OPrQPXRVHv0GOZYoUV09oQ/q+TDN/CgtNv9jLgZrOiiX1bBeCJcNsURh mTFCIqylF8y/l2/Yssp0qV1WQTW/hadtJk4WB0R2oB0MUyg6yas06cgF6vFS+8pgAo tth14kLtmTMH/TN7oVbYZQcOY8YyeT/Qo83ksAwbY6JansQu2vvxvA6DcP2PLrjB5B PWo746uqm3jvHQmOxL7KgVNhc9ezWSuNIC1l0sTmav2sXF1m8iP6hoiwmCxF0gYvMx zHRgpXHTX4/yw== Date: Thu, 19 Dec 2024 11:35:38 -0800 Subject: [PATCH 10/43] xfs: add realtime refcount btree inode to metadata directory From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581149.1572761.10672546631951013453.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add a metadir path to select the realtime refcount btree inode and load it at mount time. The rtrefcountbt inode will have a unique extent format code, which means that we also have to update the inode validation and flush routines to look for it. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_format.h | 4 +++- fs/xfs/libxfs/xfs_inode_buf.c | 5 +++++ fs/xfs/libxfs/xfs_inode_fork.c | 6 ++++++ fs/xfs/libxfs/xfs_rtgroup.c | 7 +++++++ fs/xfs/libxfs/xfs_rtgroup.h | 6 ++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.c | 6 +++--- 6 files changed, 30 insertions(+), 4 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index 17f7c0d1aaa452..b6828f92c131fb 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -858,6 +858,7 @@ enum xfs_metafile_type { XFS_METAFILE_RTBITMAP, /* rt bitmap */ XFS_METAFILE_RTSUMMARY, /* rt summary */ XFS_METAFILE_RTRMAP, /* rt rmap */ + XFS_METAFILE_RTREFCOUNT, /* rt refcount */ XFS_METAFILE_MAX } __packed; @@ -870,7 +871,8 @@ enum xfs_metafile_type { { XFS_METAFILE_PRJQUOTA, "prjquota" }, \ { XFS_METAFILE_RTBITMAP, "rtbitmap" }, \ { XFS_METAFILE_RTSUMMARY, "rtsummary" }, \ - { XFS_METAFILE_RTRMAP, "rtrmap" } + { XFS_METAFILE_RTRMAP, "rtrmap" }, \ + { XFS_METAFILE_RTREFCOUNT, "rtrefcount" } /* * On-disk inode structure. diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 17cb91b89fcaa1..65eec8f60376d3 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -456,6 +456,11 @@ xfs_dinode_verify_fork( if (!xfs_has_rmapbt(mp)) return __this_address; break; + case XFS_METAFILE_RTREFCOUNT: + /* same comment about growfs and rmap inodes applies */ + if (!xfs_has_reflink(mp)) + return __this_address; + break; default: return __this_address; } diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index d9b3c182cb400b..0c4bc12401a151 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -272,6 +272,9 @@ xfs_iformat_data_fork( switch (ip->i_metatype) { case XFS_METAFILE_RTRMAP: return xfs_iformat_rtrmap(ip, dip); + case XFS_METAFILE_RTREFCOUNT: + ASSERT(0); /* to be implemented later */ + return -EFSCORRUPTED; default: break; } @@ -620,6 +623,9 @@ xfs_iflush_fork( case XFS_METAFILE_RTRMAP: xfs_iflush_rtrmap(ip, dip); break; + case XFS_METAFILE_RTREFCOUNT: + ASSERT(0); /* to be implemented later */ + break; default: ASSERT(0); break; diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index b7ed2d27d54553..6aebe9f484901f 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -367,6 +367,13 @@ static const struct xfs_rtginode_ops xfs_rtginode_ops[XFS_RTGI_MAX] = { .enabled = xfs_has_rmapbt, .create = xfs_rtrmapbt_create, }, + [XFS_RTGI_REFCOUNT] = { + .name = "refcount", + .metafile_type = XFS_METAFILE_RTREFCOUNT, + .fmt_mask = 1U << XFS_DINODE_FMT_META_BTREE, + /* same comment about growfs and rmap inodes applies here */ + .enabled = xfs_has_reflink, + }, }; /* Return the shortname of this rtgroup inode. */ diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 733da7417c9cd7..385ea8e2f28b67 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -15,6 +15,7 @@ enum xfs_rtg_inodes { XFS_RTGI_BITMAP, /* allocation bitmap */ XFS_RTGI_SUMMARY, /* allocation summary */ XFS_RTGI_RMAP, /* rmap btree inode */ + XFS_RTGI_REFCOUNT, /* refcount btree inode */ XFS_RTGI_MAX, }; @@ -80,6 +81,11 @@ static inline struct xfs_inode *rtg_rmap(const struct xfs_rtgroup *rtg) return rtg->rtg_inodes[XFS_RTGI_RMAP]; } +static inline struct xfs_inode *rtg_refcount(const struct xfs_rtgroup *rtg) +{ + return rtg->rtg_inodes[XFS_RTGI_REFCOUNT]; +} + /* Passive rtgroup references */ static inline struct xfs_rtgroup * xfs_rtgroup_get( diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index e30af941581651..ebbeab112d1412 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -26,6 +26,7 @@ #include "xfs_extent_busy.h" #include "xfs_rtgroup.h" #include "xfs_rtbitmap.h" +#include "xfs_metafile.h" static struct kmem_cache *xfs_rtrefcountbt_cur_cache; @@ -281,12 +282,10 @@ xfs_rtrefcountbt_init_cursor( struct xfs_trans *tp, struct xfs_rtgroup *rtg) { - struct xfs_inode *ip = NULL; + struct xfs_inode *ip = rtg_refcount(rtg); struct xfs_mount *mp = rtg_mount(rtg); struct xfs_btree_cur *cur; - return NULL; /* XXX */ - xfs_assert_ilocked(ip, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL); cur = xfs_btree_alloc_cursor(mp, tp, &xfs_rtrefcountbt_ops, @@ -316,6 +315,7 @@ xfs_rtrefcountbt_commit_staged_btree( int flags = XFS_ILOG_CORE | XFS_ILOG_DBROOT; ASSERT(cur->bc_flags & XFS_BTREE_STAGING); + ASSERT(ifake->if_fork->if_format == XFS_DINODE_FMT_META_BTREE); /* * Free any resources hanging off the real fork, then shallow-copy the From patchwork Thu Dec 19 19:35:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915611 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B6C5A198A08 for ; Thu, 19 Dec 2024 19:35:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636954; cv=none; b=R0Wyl+9kWN+N+rNXRcNibZb+uKZHwdQP29poj/j0B1ORs6eJlyVN/+/ti7v0Umi6GQ7t9Ud9YOWYmEoduWOhw1GsZEEZV/nypjsJGqmOJk0iJdE7X3xs6aog5KxuBymbcRyhsTdm4txvoM2HqizuTMHxQklG8S1Z/mLMdf97eCY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636954; c=relaxed/simple; bh=VR4nj5DPed6O3Zz5wMbrlcYFHjKtHVAl/uMNbPPva2c=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=obvHQqR1eUh3nKk9EBYxI5EOhQNQRLUIcfQr5S7BaD92OZiuYJcnTUHbIaX/wFegV0xmLnyfVBV7tQjhqIobWu1RsquCqls9DTyl/lZij2eIkWpNdwIZk41blZy05znZtbAdFQYm7MT7lBS27QAJffQivtFIydIZVs76zB4Dd+M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=nyScOMio; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nyScOMio" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 95D23C4CECE; Thu, 19 Dec 2024 19:35:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636954; bh=VR4nj5DPed6O3Zz5wMbrlcYFHjKtHVAl/uMNbPPva2c=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=nyScOMioozKitUGMzUljWPdwrU9kPfcyhPa4h6+6MdJHDXISetGf0cvKYXRRc8lCc G33E0FR5+vaLy9J6vw22d5kxc4pp7+sl/9O1g2+cOh/AIvg86D7hxqouJQdSZ99O+V AT3fGirOoZl6/YO3nESU758O5jT+iVhIK23CpyMc8LlhK8qj59EfQuiACoqxKB0ZlG bBAqIBNSDTaBoniJAI6qCIdZhFEblQ3tNWCnuztT6tktjtOZEa/9dLzkul4ITUB1VY WgO4+WqpCeorHqGs0x+LaLpEpwV73ArlQT6K/FIc7iphFBURdQ+W849s2UsZnCMtQg g7IyVATGwu4uw== Date: Thu, 19 Dec 2024 11:35:54 -0800 Subject: [PATCH 11/43] xfs: add metadata reservations for realtime refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581166.1572761.8295913118582631382.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Reserve some free blocks so that we will always have enough free blocks in the data volume to handle expansion of the realtime refcount btree. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_rtrefcount_btree.c | 38 ++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 4 ++++ fs/xfs/xfs_rtalloc.c | 6 +++++ 3 files changed, 48 insertions(+) diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index ebbeab112d1412..ff72ed09e75f08 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -419,3 +419,41 @@ xfs_rtrefcountbt_compute_maxlevels( /* Add one level to handle the inode root level. */ mp->m_rtrefc_maxlevels = min(d_maxlevels, r_maxlevels) + 1; } + +/* Calculate the rtrefcount btree size for some records. */ +unsigned long long +xfs_rtrefcountbt_calc_size( + struct xfs_mount *mp, + unsigned long long len) +{ + return xfs_btree_calc_size(mp->m_rtrefc_mnr, len); +} + +/* + * Calculate the maximum refcount btree size. + */ +static unsigned long long +xfs_rtrefcountbt_max_size( + struct xfs_mount *mp, + xfs_rtblock_t rtblocks) +{ + /* Bail out if we're uninitialized, which can happen in mkfs. */ + if (mp->m_rtrefc_mxr[0] == 0) + return 0; + + return xfs_rtrefcountbt_calc_size(mp, rtblocks); +} + +/* + * Figure out how many blocks to reserve and how many are used by this btree. + * We need enough space to hold one record for every rt extent in the rtgroup. + */ +xfs_filblks_t +xfs_rtrefcountbt_calc_reserves( + struct xfs_mount *mp) +{ + if (!xfs_has_rtreflink(mp)) + return 0; + + return xfs_rtrefcountbt_max_size(mp, mp->m_sb.sb_rgextents); +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h index b713b33818800c..3cd44590c9304c 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.h +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -67,4 +67,8 @@ unsigned int xfs_rtrefcountbt_maxlevels_ondisk(void); int __init xfs_rtrefcountbt_init_cur_cache(void); void xfs_rtrefcountbt_destroy_cur_cache(void); +xfs_filblks_t xfs_rtrefcountbt_calc_reserves(struct xfs_mount *mp); +unsigned long long xfs_rtrefcountbt_calc_size(struct xfs_mount *mp, + unsigned long long len); + #endif /* __XFS_RTREFCOUNT_BTREE_H__ */ diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index a69967f9d88ead..294aa0739be311 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -31,6 +31,7 @@ #include "xfs_rtgroup.h" #include "xfs_error.h" #include "xfs_trace.h" +#include "xfs_rtrefcount_btree.h" /* * Return whether there are any free extents in the size range given @@ -1547,6 +1548,11 @@ xfs_rt_resv_init( err2 = xfs_metafile_resv_init(rtg_rmap(rtg), ask); if (err2 && !error) error = err2; + + ask = xfs_rtrefcountbt_calc_reserves(mp); + err2 = xfs_metafile_resv_init(rtg_refcount(rtg), ask); + if (err2 && !error) + error = err2; } return error; From patchwork Thu Dec 19 19:36:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915612 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 56134198A08 for ; Thu, 19 Dec 2024 19:36:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636970; cv=none; b=iUi7mt/H0LLRtBw4Vb2YZJgDV+esXK8ErSx1ung9TmtOmd0AfGj1KsxL0/uX1t7HdJYfihWCB3sLHxT1/9kw37LVEmjO940d90+pod3o8D89XiphXRTnMHVLQwvm4ZfqNPoskm0p4b+1fiUtjzkUBIzfkcqbPqyJoQZkspLtTH4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636970; c=relaxed/simple; bh=Ywxt0wAm6tq9rXW00OrX+n6FEIidY315QFv5pvVrp3o=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=dxB8TjZqRMXzvvczpBK+TRePUOFm630/9YJMxTj1d/TMLbOCrGdH2ykK6wSru+jbvOZZ6h0W09yP6CJre8vx1me8uu9X4IHxREh7vDIm8oxrdj3qbksWPBTCtl0qzg2wvGTluDzRXgh7gHV0YYXq0gHOJB91g9+G+jOqJugDIOM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hvam/seg; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hvam/seg" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D561C4CECE; Thu, 19 Dec 2024 19:36:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636970; bh=Ywxt0wAm6tq9rXW00OrX+n6FEIidY315QFv5pvVrp3o=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=hvam/segYgx0jck29rzBbopHbV0cKuDpY5mm499t9ljgZ9PxH5BXgaPnrlTGzxFij 2E/65ashmRWnoSDFwryob95tis4/RFODSb1i+pM1JMMl25k9FKlgEXoUhwdOOL/a47 JkJaDX5iEPf2Prz3wveMNRu8qaEuK68nS1l14Checzl6bkDJb3b6X9NGW9sTXUF05y j0aCyunwA9qLrSebp396mXN6IwfLI4w1bEpX2ywgvXHGQsmnu/fFtZUSXI3+Aie9kM wzHqXr+pRXbc1lnhVHXKk1744QGs2Na/4P1uuR/MKk7lCaJzqjURotLW50CmoKeVxv 4YkUFNj1QqFag== Date: Thu, 19 Dec 2024 11:36:09 -0800 Subject: [PATCH 12/43] xfs: wire up a new metafile type for the realtime refcount From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581183.1572761.8414212161734467306.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Plumb in the pieces we need to embed the root of the realtime refcount btree in an inode's data fork, complete with metafile type and on-disk interpretation functions. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_format.h | 8 + fs/xfs/libxfs/xfs_inode_fork.c | 6 - fs/xfs/libxfs/xfs_ondisk.h | 1 fs/xfs/libxfs/xfs_rtrefcount_btree.c | 264 ++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 112 ++++++++++++++ fs/xfs/xfs_inode_item_recover.c | 4 + 6 files changed, 392 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index b6828f92c131fb..b1007fb661ba73 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1805,6 +1805,14 @@ typedef __be32 xfs_refcount_ptr_t; */ #define XFS_RTREFC_CRC_MAGIC 0x52434e54 /* 'RCNT' */ +/* + * rt refcount root header, on-disk form only. + */ +struct xfs_rtrefcount_root { + __be16 bb_level; /* 0 is a leaf */ + __be16 bb_numrecs; /* current # of data records */ +}; + /* inode-rooted btree pointer type */ typedef __be64 xfs_rtrefcount_ptr_t; diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index 0c4bc12401a151..4f99b90add5526 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -28,6 +28,7 @@ #include "xfs_health.h" #include "xfs_symlink_remote.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" struct kmem_cache *xfs_ifork_cache; @@ -273,8 +274,7 @@ xfs_iformat_data_fork( case XFS_METAFILE_RTRMAP: return xfs_iformat_rtrmap(ip, dip); case XFS_METAFILE_RTREFCOUNT: - ASSERT(0); /* to be implemented later */ - return -EFSCORRUPTED; + return xfs_iformat_rtrefcount(ip, dip); default: break; } @@ -624,7 +624,7 @@ xfs_iflush_fork( xfs_iflush_rtrmap(ip, dip); break; case XFS_METAFILE_RTREFCOUNT: - ASSERT(0); /* to be implemented later */ + xfs_iflush_rtrefcount(ip, dip); break; default: ASSERT(0); diff --git a/fs/xfs/libxfs/xfs_ondisk.h b/fs/xfs/libxfs/xfs_ondisk.h index efb035050c009c..a85ecddaa48eed 100644 --- a/fs/xfs/libxfs/xfs_ondisk.h +++ b/fs/xfs/libxfs/xfs_ondisk.h @@ -86,6 +86,7 @@ xfs_check_ondisk_structs(void) XFS_CHECK_STRUCT_SIZE(xfs_rtrmap_ptr_t, 8); XFS_CHECK_STRUCT_SIZE(struct xfs_rtrmap_root, 4); XFS_CHECK_STRUCT_SIZE(xfs_rtrefcount_ptr_t, 8); + XFS_CHECK_STRUCT_SIZE(struct xfs_rtrefcount_root, 4); /* * m68k has problems with struct xfs_attr_leaf_name_remote, but we pad diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index ff72ed09e75f08..6a5bc7ea42fbe6 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -77,6 +77,41 @@ xfs_rtrefcountbt_get_maxrecs( return cur->bc_mp->m_rtrefc_mxr[level != 0]; } +/* + * Calculate number of records in a realtime refcount btree inode root. + */ +unsigned int +xfs_rtrefcountbt_droot_maxrecs( + unsigned int blocklen, + bool leaf) +{ + blocklen -= sizeof(struct xfs_rtrefcount_root); + + if (leaf) + return blocklen / sizeof(struct xfs_refcount_rec); + return blocklen / (2 * sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); +} + +/* + * Get the maximum records we could store in the on-disk format. + * + * For non-root nodes this is equivalent to xfs_rtrefcountbt_get_maxrecs, but + * for the root node this checks the available space in the dinode fork so that + * we can resize the in-memory buffer to match it. After a resize to the + * maximum size this function returns the same value as + * xfs_rtrefcountbt_get_maxrecs for the root node, too. + */ +STATIC int +xfs_rtrefcountbt_get_dmaxrecs( + struct xfs_btree_cur *cur, + int level) +{ + if (level != cur->bc_nlevels - 1) + return cur->bc_mp->m_rtrefc_mxr[level != 0]; + return xfs_rtrefcountbt_droot_maxrecs(cur->bc_ino.forksize, level == 0); +} + STATIC void xfs_rtrefcountbt_init_key_from_rec( union xfs_btree_key *key, @@ -247,6 +282,87 @@ xfs_rtrefcountbt_keys_contiguous( be32_to_cpu(key2->refc.rc_startblock)); } +static inline void +xfs_rtrefcountbt_move_ptrs( + struct xfs_mount *mp, + struct xfs_btree_block *broot, + short old_size, + size_t new_size, + unsigned int numrecs) +{ + void *dptr; + void *sptr; + + sptr = xfs_rtrefcount_broot_ptr_addr(mp, broot, 1, old_size); + dptr = xfs_rtrefcount_broot_ptr_addr(mp, broot, 1, new_size); + memmove(dptr, sptr, numrecs * sizeof(xfs_rtrefcount_ptr_t)); +} + +static struct xfs_btree_block * +xfs_rtrefcountbt_broot_realloc( + struct xfs_btree_cur *cur, + unsigned int new_numrecs) +{ + struct xfs_mount *mp = cur->bc_mp; + struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur); + struct xfs_btree_block *broot; + unsigned int new_size; + unsigned int old_size = ifp->if_broot_bytes; + const unsigned int level = cur->bc_nlevels - 1; + + new_size = xfs_rtrefcount_broot_space_calc(mp, level, new_numrecs); + + /* Handle the nop case quietly. */ + if (new_size == old_size) + return ifp->if_broot; + + if (new_size > old_size) { + unsigned int old_numrecs; + + /* + * If there wasn't any memory allocated before, just allocate + * it now and get out. + */ + if (old_size == 0) + return xfs_broot_realloc(ifp, new_size); + + /* + * If there is already an existing if_broot, then we need to + * realloc it and possibly move the node block pointers because + * those are not butted up against the btree block header. + */ + old_numrecs = xfs_rtrefcountbt_maxrecs(mp, old_size, level); + broot = xfs_broot_realloc(ifp, new_size); + if (level > 0) + xfs_rtrefcountbt_move_ptrs(mp, broot, old_size, + new_size, old_numrecs); + goto out_broot; + } + + /* + * We're reducing numrecs. If we're going all the way to zero, just + * free the block. + */ + ASSERT(ifp->if_broot != NULL && old_size > 0); + if (new_size == 0) + return xfs_broot_realloc(ifp, 0); + + /* + * Shrink the btree root by possibly moving the rtrmapbt pointers, + * since they are not butted up against the btree block header. Then + * reallocate broot. + */ + if (level > 0) + xfs_rtrefcountbt_move_ptrs(mp, ifp->if_broot, old_size, + new_size, new_numrecs); + broot = xfs_broot_realloc(ifp, new_size); + +out_broot: + ASSERT(xfs_rtrefcount_droot_space(broot) <= + xfs_inode_fork_size(cur->bc_ino.ip, cur->bc_ino.whichfork)); + return broot; +} + const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .name = "rtrefcount", .type = XFS_BTREE_TYPE_INODE, @@ -264,6 +380,7 @@ const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .free_block = xfs_btree_free_metafile_block, .get_minrecs = xfs_rtrefcountbt_get_minrecs, .get_maxrecs = xfs_rtrefcountbt_get_maxrecs, + .get_dmaxrecs = xfs_rtrefcountbt_get_dmaxrecs, .init_key_from_rec = xfs_rtrefcountbt_init_key_from_rec, .init_high_key_from_rec = xfs_rtrefcountbt_init_high_key_from_rec, .init_rec_from_cur = xfs_rtrefcountbt_init_rec_from_cur, @@ -274,6 +391,7 @@ const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .keys_inorder = xfs_rtrefcountbt_keys_inorder, .recs_inorder = xfs_rtrefcountbt_recs_inorder, .keys_contiguous = xfs_rtrefcountbt_keys_contiguous, + .broot_realloc = xfs_rtrefcountbt_broot_realloc, }; /* Allocate a new rt refcount btree cursor. */ @@ -457,3 +575,149 @@ xfs_rtrefcountbt_calc_reserves( return xfs_rtrefcountbt_max_size(mp, mp->m_sb.sb_rgextents); } + +/* + * Convert on-disk form of btree root to in-memory form. + */ +STATIC void +xfs_rtrefcountbt_from_disk( + struct xfs_inode *ip, + struct xfs_rtrefcount_root *dblock, + int dblocklen, + struct xfs_btree_block *rblock) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_refcount_key *fkp; + __be64 *fpp; + struct xfs_refcount_key *tkp; + __be64 *tpp; + struct xfs_refcount_rec *frp; + struct xfs_refcount_rec *trp; + unsigned int numrecs; + unsigned int maxrecs; + unsigned int rblocklen; + + rblocklen = xfs_rtrefcount_broot_space(mp, dblock); + + xfs_btree_init_block(mp, rblock, &xfs_rtrefcountbt_ops, 0, 0, + ip->i_ino); + + rblock->bb_level = dblock->bb_level; + rblock->bb_numrecs = dblock->bb_numrecs; + + if (be16_to_cpu(rblock->bb_level) > 0) { + maxrecs = xfs_rtrefcountbt_droot_maxrecs(dblocklen, false); + fkp = xfs_rtrefcount_droot_key_addr(dblock, 1); + tkp = xfs_rtrefcount_key_addr(rblock, 1); + fpp = xfs_rtrefcount_droot_ptr_addr(dblock, 1, maxrecs); + tpp = xfs_rtrefcount_broot_ptr_addr(mp, rblock, 1, rblocklen); + numrecs = be16_to_cpu(dblock->bb_numrecs); + memcpy(tkp, fkp, 2 * sizeof(*fkp) * numrecs); + memcpy(tpp, fpp, sizeof(*fpp) * numrecs); + } else { + frp = xfs_rtrefcount_droot_rec_addr(dblock, 1); + trp = xfs_rtrefcount_rec_addr(rblock, 1); + numrecs = be16_to_cpu(dblock->bb_numrecs); + memcpy(trp, frp, sizeof(*frp) * numrecs); + } +} + +/* Load a realtime reference count btree root in from disk. */ +int +xfs_iformat_rtrefcount( + struct xfs_inode *ip, + struct xfs_dinode *dip) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_rtrefcount_root *dfp = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + struct xfs_btree_block *broot; + unsigned int numrecs; + unsigned int level; + int dsize; + + /* + * growfs must create the rtrefcount inodes before adding a realtime + * volume to the filesystem, so we cannot use the rtrefcount predicate + * here. + */ + if (!xfs_has_reflink(ip->i_mount)) + return -EFSCORRUPTED; + + dsize = XFS_DFORK_SIZE(dip, mp, XFS_DATA_FORK); + numrecs = be16_to_cpu(dfp->bb_numrecs); + level = be16_to_cpu(dfp->bb_level); + + if (level > mp->m_rtrefc_maxlevels || + xfs_rtrefcount_droot_space_calc(level, numrecs) > dsize) + return -EFSCORRUPTED; + + broot = xfs_broot_alloc(xfs_ifork_ptr(ip, XFS_DATA_FORK), + xfs_rtrefcount_broot_space_calc(mp, level, numrecs)); + if (broot) + xfs_rtrefcountbt_from_disk(ip, dfp, dsize, broot); + return 0; +} + +/* + * Convert in-memory form of btree root to on-disk form. + */ +void +xfs_rtrefcountbt_to_disk( + struct xfs_mount *mp, + struct xfs_btree_block *rblock, + int rblocklen, + struct xfs_rtrefcount_root *dblock, + int dblocklen) +{ + struct xfs_refcount_key *fkp; + __be64 *fpp; + struct xfs_refcount_key *tkp; + __be64 *tpp; + struct xfs_refcount_rec *frp; + struct xfs_refcount_rec *trp; + unsigned int maxrecs; + unsigned int numrecs; + + ASSERT(rblock->bb_magic == cpu_to_be32(XFS_RTREFC_CRC_MAGIC)); + ASSERT(uuid_equal(&rblock->bb_u.l.bb_uuid, &mp->m_sb.sb_meta_uuid)); + ASSERT(rblock->bb_u.l.bb_blkno == cpu_to_be64(XFS_BUF_DADDR_NULL)); + ASSERT(rblock->bb_u.l.bb_leftsib == cpu_to_be64(NULLFSBLOCK)); + ASSERT(rblock->bb_u.l.bb_rightsib == cpu_to_be64(NULLFSBLOCK)); + + dblock->bb_level = rblock->bb_level; + dblock->bb_numrecs = rblock->bb_numrecs; + + if (be16_to_cpu(rblock->bb_level) > 0) { + maxrecs = xfs_rtrefcountbt_droot_maxrecs(dblocklen, false); + fkp = xfs_rtrefcount_key_addr(rblock, 1); + tkp = xfs_rtrefcount_droot_key_addr(dblock, 1); + fpp = xfs_rtrefcount_broot_ptr_addr(mp, rblock, 1, rblocklen); + tpp = xfs_rtrefcount_droot_ptr_addr(dblock, 1, maxrecs); + numrecs = be16_to_cpu(rblock->bb_numrecs); + memcpy(tkp, fkp, 2 * sizeof(*fkp) * numrecs); + memcpy(tpp, fpp, sizeof(*fpp) * numrecs); + } else { + frp = xfs_rtrefcount_rec_addr(rblock, 1); + trp = xfs_rtrefcount_droot_rec_addr(dblock, 1); + numrecs = be16_to_cpu(rblock->bb_numrecs); + memcpy(trp, frp, sizeof(*frp) * numrecs); + } +} + +/* Flush a realtime reference count btree root out to disk. */ +void +xfs_iflush_rtrefcount( + struct xfs_inode *ip, + struct xfs_dinode *dip) +{ + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + struct xfs_rtrefcount_root *dfp = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + + ASSERT(ifp->if_broot != NULL); + ASSERT(ifp->if_broot_bytes > 0); + ASSERT(xfs_rtrefcount_droot_space(ifp->if_broot) <= + xfs_inode_fork_size(ip, XFS_DATA_FORK)); + xfs_rtrefcountbt_to_disk(ip->i_mount, ifp->if_broot, + ifp->if_broot_bytes, dfp, + XFS_DFORK_SIZE(dip, ip->i_mount, XFS_DATA_FORK)); +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h index 3cd44590c9304c..e558a10c4744ad 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.h +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -25,6 +25,7 @@ void xfs_rtrefcountbt_commit_staged_btree(struct xfs_btree_cur *cur, unsigned int xfs_rtrefcountbt_maxrecs(struct xfs_mount *mp, unsigned int blocklen, bool leaf); void xfs_rtrefcountbt_compute_maxlevels(struct xfs_mount *mp); +unsigned int xfs_rtrefcountbt_droot_maxrecs(unsigned int blocklen, bool leaf); /* * Addresses of records, keys, and pointers within an incore rtrefcountbt block. @@ -71,4 +72,115 @@ xfs_filblks_t xfs_rtrefcountbt_calc_reserves(struct xfs_mount *mp); unsigned long long xfs_rtrefcountbt_calc_size(struct xfs_mount *mp, unsigned long long len); +/* Addresses of key, pointers, and records within an ondisk rtrefcount block. */ + +static inline struct xfs_refcount_rec * +xfs_rtrefcount_droot_rec_addr( + struct xfs_rtrefcount_root *block, + unsigned int index) +{ + return (struct xfs_refcount_rec *) + ((char *)(block + 1) + + (index - 1) * sizeof(struct xfs_refcount_rec)); +} + +static inline struct xfs_refcount_key * +xfs_rtrefcount_droot_key_addr( + struct xfs_rtrefcount_root *block, + unsigned int index) +{ + return (struct xfs_refcount_key *) + ((char *)(block + 1) + + (index - 1) * sizeof(struct xfs_refcount_key)); +} + +static inline xfs_rtrefcount_ptr_t * +xfs_rtrefcount_droot_ptr_addr( + struct xfs_rtrefcount_root *block, + unsigned int index, + unsigned int maxrecs) +{ + return (xfs_rtrefcount_ptr_t *) + ((char *)(block + 1) + + maxrecs * sizeof(struct xfs_refcount_key) + + (index - 1) * sizeof(xfs_rtrefcount_ptr_t)); +} + +/* + * Address of pointers within the incore btree root. + * + * These are to be used when we know the size of the block and + * we don't have a cursor. + */ +static inline xfs_rtrefcount_ptr_t * +xfs_rtrefcount_broot_ptr_addr( + struct xfs_mount *mp, + struct xfs_btree_block *bb, + unsigned int index, + unsigned int block_size) +{ + return xfs_rtrefcount_ptr_addr(bb, index, + xfs_rtrefcountbt_maxrecs(mp, block_size, false)); +} + +/* + * Compute the space required for the incore btree root containing the given + * number of records. + */ +static inline size_t +xfs_rtrefcount_broot_space_calc( + struct xfs_mount *mp, + unsigned int level, + unsigned int nrecs) +{ + size_t sz = XFS_RTREFCOUNT_BLOCK_LEN; + + if (level > 0) + return sz + nrecs * (sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); + return sz + nrecs * sizeof(struct xfs_refcount_rec); +} + +/* + * Compute the space required for the incore btree root given the ondisk + * btree root block. + */ +static inline size_t +xfs_rtrefcount_broot_space(struct xfs_mount *mp, struct xfs_rtrefcount_root *bb) +{ + return xfs_rtrefcount_broot_space_calc(mp, be16_to_cpu(bb->bb_level), + be16_to_cpu(bb->bb_numrecs)); +} + +/* Compute the space required for the ondisk root block. */ +static inline size_t +xfs_rtrefcount_droot_space_calc( + unsigned int level, + unsigned int nrecs) +{ + size_t sz = sizeof(struct xfs_rtrefcount_root); + + if (level > 0) + return sz + nrecs * (sizeof(struct xfs_refcount_key) + + sizeof(xfs_rtrefcount_ptr_t)); + return sz + nrecs * sizeof(struct xfs_refcount_rec); +} + +/* + * Compute the space required for the ondisk root block given an incore root + * block. + */ +static inline size_t +xfs_rtrefcount_droot_space(struct xfs_btree_block *bb) +{ + return xfs_rtrefcount_droot_space_calc(be16_to_cpu(bb->bb_level), + be16_to_cpu(bb->bb_numrecs)); +} + +int xfs_iformat_rtrefcount(struct xfs_inode *ip, struct xfs_dinode *dip); +void xfs_rtrefcountbt_to_disk(struct xfs_mount *mp, + struct xfs_btree_block *rblock, int rblocklen, + struct xfs_rtrefcount_root *dblock, int dblocklen); +void xfs_iflush_rtrefcount(struct xfs_inode *ip, struct xfs_dinode *dip); + #endif /* __XFS_RTREFCOUNT_BTREE_H__ */ diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c index 5de1d3563b7686..f3bfb814378c06 100644 --- a/fs/xfs/xfs_inode_item_recover.c +++ b/fs/xfs/xfs_inode_item_recover.c @@ -23,6 +23,7 @@ #include "xfs_icache.h" #include "xfs_bmap_btree.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" STATIC void xlog_recover_inode_ra_pass2( @@ -286,6 +287,9 @@ xlog_recover_inode_dbroot( case XFS_METAFILE_RTRMAP: xfs_rtrmapbt_to_disk(mp, src, len, dfork, dsize); return 0; + case XFS_METAFILE_RTREFCOUNT: + xfs_rtrefcountbt_to_disk(mp, src, len, dfork, dsize); + return 0; default: ASSERT(0); return -EFSCORRUPTED; From patchwork Thu Dec 19 19:36:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915613 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A1D7F19884C for ; Thu, 19 Dec 2024 19:36:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636986; cv=none; b=Uo3De9swXFtb+h/jOea+/o9TCKtRwlpBmdthM7z2+jGT7plShXuQ9JV2iSWaG46jzBdGXqa2zbRtPzhwyLs7VRKOI1qg/dOAsVU1QYVRKWrYBBW/T1XjPQ3w4h7N4faHlCRhdT3w9bQ7aWT+y5lpGNWnWJ/QSyJWy7tQgU/vtNs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734636986; c=relaxed/simple; bh=BHRbYOSOMrjEhH7hVLZtzQlLLm0zn+b3eJSp3UQQO2Y=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=aRZ2qS6ipLu4JVg7dITTz1FI6ND7WHOCuuRPyYREEQDGDUnvSV+znl6juwfnTaHAVtPrqcIOpIlqa00PkACu/ClM0l8RctwjsYiKFzBzQr1rN7j1jj0SV/UP6FsanLC5okpmgISSmq6N36ypXBeywl/0x+bc47RyDgJZINx2wY4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=GllalgpG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="GllalgpG" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D734DC4CECE; Thu, 19 Dec 2024 19:36:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734636985; bh=BHRbYOSOMrjEhH7hVLZtzQlLLm0zn+b3eJSp3UQQO2Y=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=GllalgpGUtIgOLu39fi5mky6z4T2SFUu4bmXMppO4+8WsY6cNaZA+Ra6Xntks1uAV YIKSTreFF01niGw9nCRtOS11U5ndY3+26pS9p+4gESfIoNNraDrYDOpshAIKENT05+ 53jrEzRqkpxw9xwUg047NMkmGZC/VdHJDHTxSdJsqV+Ct4L8ak8oGV7qmslD7BrEwF eqWEKTrv04uWvnTvUHgyTwsOUcprgNIFZbG8/561ZZLLzqjSG9qgp1sM/1+o8pLvXF Y97eu3FQXH50OYnEzgCDTNuGONGejgA0bxa3JgbVATCiSbd1/kCFX5AcaXF9WldN7s JqPAOXtzmLswg== Date: Thu, 19 Dec 2024 11:36:25 -0800 Subject: [PATCH 13/43] xfs: refactor xfs_reflink_find_shared From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581201.1572761.8940967949660201516.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Christoph Hellwig Move lookup of the perag structure from the callers into the helpers, and return the offset into the extent of the shared region instead of the block number that needs post-processing. This prepares the callsites for the creation of an rt-specific variant in the next patch. Signed-off-by: Christoph Hellwig Reviewed-by: "Darrick J. Wong" [djwong: port to the middle of the rtreflink series for cleanliness] Signed-off-by: "Darrick J. Wong" --- fs/xfs/xfs_reflink.c | 110 ++++++++++++++++++++++---------------------------- fs/xfs/xfs_reflink.h | 2 - 2 files changed, 50 insertions(+), 62 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 02457e176c4252..71b4743ffb7741 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -120,38 +120,46 @@ */ /* - * Given an AG extent, find the lowest-numbered run of shared blocks - * within that range and return the range in fbno/flen. If - * find_end_of_shared is true, return the longest contiguous extent of - * shared blocks. If there are no shared extents, fbno and flen will - * be set to NULLAGBLOCK and 0, respectively. + * Given a file mapping for the data device, find the lowest-numbered run of + * shared blocks within that mapping and return it in shared_offset/shared_len. + * The offset is relative to the start of irec. + * + * If find_end_of_shared is true, return the longest contiguous extent of shared + * blocks. If there are no shared extents, shared_offset and shared_len will be + * set to 0; */ static int xfs_reflink_find_shared( - struct xfs_perag *pag, + struct xfs_mount *mp, struct xfs_trans *tp, - xfs_agblock_t agbno, - xfs_extlen_t aglen, - xfs_agblock_t *fbno, - xfs_extlen_t *flen, + const struct xfs_bmbt_irec *irec, + xfs_extlen_t *shared_offset, + xfs_extlen_t *shared_len, bool find_end_of_shared) { struct xfs_buf *agbp; + struct xfs_perag *pag; struct xfs_btree_cur *cur; int error; + xfs_agblock_t orig_bno, found_bno; + + pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, irec->br_startblock)); + orig_bno = XFS_FSB_TO_AGBNO(mp, irec->br_startblock); error = xfs_alloc_read_agf(pag, tp, 0, &agbp); if (error) - return error; - - cur = xfs_refcountbt_init_cursor(pag_mount(pag), tp, agbp, pag); - - error = xfs_refcount_find_shared(cur, agbno, aglen, fbno, flen, - find_end_of_shared); + goto out; + cur = xfs_refcountbt_init_cursor(mp, tp, agbp, pag); + error = xfs_refcount_find_shared(cur, orig_bno, irec->br_blockcount, + &found_bno, shared_len, find_end_of_shared); xfs_btree_del_cursor(cur, error); - xfs_trans_brelse(tp, agbp); + + if (!error && *shared_len) + *shared_offset = found_bno - orig_bno; +out: + xfs_perag_put(pag); return error; } @@ -172,11 +180,7 @@ xfs_reflink_trim_around_shared( bool *shared) { struct xfs_mount *mp = ip->i_mount; - struct xfs_perag *pag; - xfs_agblock_t agbno; - xfs_extlen_t aglen; - xfs_agblock_t fbno; - xfs_extlen_t flen; + xfs_extlen_t shared_offset, shared_len; int error = 0; /* Holes, unwritten, and delalloc extents cannot be shared */ @@ -187,41 +191,33 @@ xfs_reflink_trim_around_shared( trace_xfs_reflink_trim_around_shared(ip, irec); - pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, irec->br_startblock)); - agbno = XFS_FSB_TO_AGBNO(mp, irec->br_startblock); - aglen = irec->br_blockcount; - - error = xfs_reflink_find_shared(pag, NULL, agbno, aglen, &fbno, &flen, - true); - xfs_perag_put(pag); + error = xfs_reflink_find_shared(mp, NULL, irec, &shared_offset, + &shared_len, true); if (error) return error; - *shared = false; - if (fbno == NULLAGBLOCK) { + if (!shared_len) { /* No shared blocks at all. */ - return 0; - } - - if (fbno == agbno) { + *shared = false; + } else if (!shared_offset) { /* - * The start of this extent is shared. Truncate the - * mapping at the end of the shared region so that a - * subsequent iteration starts at the start of the - * unshared region. + * The start of this mapping points to shared space. Truncate + * the mapping at the end of the shared region so that a + * subsequent iteration starts at the start of the unshared + * region. */ - irec->br_blockcount = flen; + irec->br_blockcount = shared_len; *shared = true; - return 0; + } else { + /* + * There's a shared region that doesn't start at the beginning + * of the mapping. Truncate the mapping at the start of the + * shared extent so that a subsequent iteration starts at the + * start of the shared region. + */ + irec->br_blockcount = shared_offset; + *shared = false; } - - /* - * There's a shared extent midway through this extent. - * Truncate the mapping at the start of the shared - * extent so that a subsequent iteration starts at the - * start of the shared region. - */ - irec->br_blockcount = fbno - agbno; return 0; } @@ -1552,27 +1548,19 @@ xfs_reflink_inode_has_shared_extents( *has_shared = false; found = xfs_iext_lookup_extent(ip, ifp, 0, &icur, &got); while (found) { - struct xfs_perag *pag; - xfs_agblock_t agbno; - xfs_extlen_t aglen; - xfs_agblock_t rbno; - xfs_extlen_t rlen; + xfs_extlen_t shared_offset, shared_len; if (isnullstartblock(got.br_startblock) || got.br_state != XFS_EXT_NORM) goto next; - pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, got.br_startblock)); - agbno = XFS_FSB_TO_AGBNO(mp, got.br_startblock); - aglen = got.br_blockcount; - error = xfs_reflink_find_shared(pag, tp, agbno, aglen, - &rbno, &rlen, false); - xfs_perag_put(pag); + error = xfs_reflink_find_shared(mp, tp, &got, &shared_offset, + &shared_len, false); if (error) return error; /* Is there still a shared block here? */ - if (rbno != NULLAGBLOCK) { + if (shared_len) { *has_shared = true; return 0; } diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 4a58e4533671c0..3bfd7ab9e1148a 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -25,7 +25,7 @@ xfs_can_free_cowblocks(struct xfs_inode *ip) return true; } -extern int xfs_reflink_trim_around_shared(struct xfs_inode *ip, +int xfs_reflink_trim_around_shared(struct xfs_inode *ip, struct xfs_bmbt_irec *irec, bool *shared); int xfs_bmap_trim_cow(struct xfs_inode *ip, struct xfs_bmbt_irec *imap, bool *shared); From patchwork Thu Dec 19 19:36:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915614 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F00E2198A08 for ; Thu, 19 Dec 2024 19:36:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637003; cv=none; b=NGbGM6RJrEUoGCdOsYH16a27LyWNrdhUnyDCBuzXmGXaQpLjaHjBm7GRDJNPIeQVTCfzJKAyx46nWFRB7aXxHBRCW3v72rLnCfuY1yu91ejmIwWfAZbBE1PY+rcBfpltzOEbp8ElUpLT2Xo3pbHod44HcJB1ZpAI7DkiYGPtW4s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637003; c=relaxed/simple; bh=zTVpRvT+rHb86D7FBiW0iCGoMbz3f1a+GOjr01N/j2Y=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=KiMy7WZoDOKzOTn93bFkqNio2knybjFr7V9O92uoteJtjlVmbNMWtkWBFgqF9ZK0xl2GYa55gabQlx278QM1CJ+PBhQqtpiW4GX2ekpkXwMpB7khq2lZB0J6qRXWMdzyrdbrxSeu1MkjYQWu+6mGBPqbjNeyxjXsn5Bj6txnyAQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=E8PBbwYv; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="E8PBbwYv" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 72FBAC4CECE; Thu, 19 Dec 2024 19:36:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637001; bh=zTVpRvT+rHb86D7FBiW0iCGoMbz3f1a+GOjr01N/j2Y=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=E8PBbwYvr24o3yHYD9WJ3UWxA7BUODD0PTB8xYVXUEl2C5MoG/pGUXNCYQi0ykzEh LtwhhtdTdDZH8KFVpeold3mxpYahbdPba79Oul7bSRK7sbBP53TK3Q4+6U9iIOGFNd ryvFooMOP2XqVupl9O1MnTwrVQ2RckodoxBP/8v4AExXNrJrdJPsUHrLWtMWPGUNJZ 2RFtNNskSGmBqv+jGFZmimGjzWsmNNt2z3lPvzs0X4aFt9y3Kh6VGf/tpRMx2x11RK Kcm3UqURtkblxfE/6ZooLn9cX4CCDmH8FxHd8lWo081AvHe+M33s+6cdOlfyLZrhyp kK7ct+IU8e0zQ== Date: Thu, 19 Dec 2024 11:36:41 -0800 Subject: [PATCH 14/43] xfs: wire up realtime refcount btree cursors From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581218.1572761.1204040223598223576.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Wire up realtime refcount btree cursors wherever they're needed throughout the code base. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_btree.h | 2 - fs/xfs/libxfs/xfs_refcount.c | 100 +++++++++++++++++++++++++++++++++++++++++- fs/xfs/libxfs/xfs_rtgroup.c | 9 ++++ fs/xfs/libxfs/xfs_rtgroup.h | 5 ++ fs/xfs/xfs_fsmap.c | 25 +++++------ fs/xfs/xfs_reflink.c | 66 ++++++++++++++++++++++++++-- 6 files changed, 187 insertions(+), 20 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index dbc047b2fb2cf5..355b304696e6c3 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -297,7 +297,7 @@ struct xfs_btree_cur struct { unsigned int nr_ops; /* # record updates */ unsigned int shape_changes; /* # of extent splits */ - } bc_refc; /* refcountbt */ + } bc_refc; /* refcountbt/rtrefcountbt */ }; /* Must be at the end of the struct! */ diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 8007d15856252b..11bff098db2dbb 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -27,6 +27,7 @@ #include "xfs_refcount_item.h" #include "xfs_rtgroup.h" #include "xfs_rtalloc.h" +#include "xfs_rtrefcount_btree.h" struct kmem_cache *xfs_refcount_intent_cache; @@ -1462,6 +1463,32 @@ xfs_refcount_finish_one( return error; } +/* + * Set up a continuation a deferred rtrefcount operation by updating the + * intent. Checks to make sure we're not going to run off the end of the + * rtgroup. + */ +static inline int +xfs_rtrefcount_continue_op( + struct xfs_btree_cur *cur, + struct xfs_refcount_intent *ri, + xfs_agblock_t new_agbno) +{ + struct xfs_mount *mp = cur->bc_mp; + struct xfs_rtgroup *rtg = to_rtg(ri->ri_group); + + if (XFS_IS_CORRUPT(mp, !xfs_verify_rgbext(rtg, new_agbno, + ri->ri_blockcount))) { + xfs_btree_mark_sick(cur); + return -EFSCORRUPTED; + } + + ri->ri_startblock = xfs_rgbno_to_rtb(rtg, new_agbno); + + ASSERT(xfs_verify_rtbext(mp, ri->ri_startblock, ri->ri_blockcount)); + return 0; +} + /* * Process one of the deferred realtime refcount operations. We pass back the * btree cursor to maintain our lock on the btree between calls. @@ -1472,8 +1499,77 @@ xfs_rtrefcount_finish_one( struct xfs_refcount_intent *ri, struct xfs_btree_cur **pcur) { - ASSERT(0); - return -EFSCORRUPTED; + struct xfs_mount *mp = tp->t_mountp; + struct xfs_rtgroup *rtg = to_rtg(ri->ri_group); + struct xfs_btree_cur *rcur = *pcur; + int error = 0; + xfs_rgblock_t bno; + unsigned long nr_ops = 0; + int shape_changes = 0; + + bno = xfs_rtb_to_rgbno(mp, ri->ri_startblock); + + trace_xfs_refcount_deferred(mp, ri); + + if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REFCOUNT_FINISH_ONE)) + return -EIO; + + /* + * If we haven't gotten a cursor or the cursor AG doesn't match + * the startblock, get one now. + */ + if (rcur != NULL && rcur->bc_group != ri->ri_group) { + nr_ops = rcur->bc_refc.nr_ops; + shape_changes = rcur->bc_refc.shape_changes; + xfs_btree_del_cursor(rcur, 0); + rcur = NULL; + *pcur = NULL; + } + if (rcur == NULL) { + xfs_rtgroup_lock(rtg, XFS_RTGLOCK_REFCOUNT); + xfs_rtgroup_trans_join(tp, rtg, XFS_RTGLOCK_REFCOUNT); + *pcur = rcur = xfs_rtrefcountbt_init_cursor(tp, rtg); + + rcur->bc_refc.nr_ops = nr_ops; + rcur->bc_refc.shape_changes = shape_changes; + } + + switch (ri->ri_type) { + case XFS_REFCOUNT_INCREASE: + error = xfs_refcount_adjust(rcur, &bno, &ri->ri_blockcount, + XFS_REFCOUNT_ADJUST_INCREASE); + if (error) + return error; + if (ri->ri_blockcount > 0) + error = xfs_rtrefcount_continue_op(rcur, ri, bno); + break; + case XFS_REFCOUNT_DECREASE: + error = xfs_refcount_adjust(rcur, &bno, &ri->ri_blockcount, + XFS_REFCOUNT_ADJUST_DECREASE); + if (error) + return error; + if (ri->ri_blockcount > 0) + error = xfs_rtrefcount_continue_op(rcur, ri, bno); + break; + case XFS_REFCOUNT_ALLOC_COW: + error = __xfs_refcount_cow_alloc(rcur, bno, ri->ri_blockcount); + if (error) + return error; + ri->ri_blockcount = 0; + break; + case XFS_REFCOUNT_FREE_COW: + error = __xfs_refcount_cow_free(rcur, bno, ri->ri_blockcount); + if (error) + return error; + ri->ri_blockcount = 0; + break; + default: + ASSERT(0); + return -EFSCORRUPTED; + } + if (!error && ri->ri_blockcount > 0) + trace_xfs_refcount_finish_one_leftover(mp, ri); + return error; } /* diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index 6aebe9f484901f..d5ecc2f6c5c202 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -206,6 +206,9 @@ xfs_rtgroup_lock( if ((rtglock_flags & XFS_RTGLOCK_RMAP) && rtg_rmap(rtg)) xfs_ilock(rtg_rmap(rtg), XFS_ILOCK_EXCL); + + if ((rtglock_flags & XFS_RTGLOCK_REFCOUNT) && rtg_refcount(rtg)) + xfs_ilock(rtg_refcount(rtg), XFS_ILOCK_EXCL); } /* Unlock metadata inodes associated with this rt group. */ @@ -218,6 +221,9 @@ xfs_rtgroup_unlock( ASSERT(!(rtglock_flags & XFS_RTGLOCK_BITMAP_SHARED) || !(rtglock_flags & XFS_RTGLOCK_BITMAP)); + if ((rtglock_flags & XFS_RTGLOCK_REFCOUNT) && rtg_refcount(rtg)) + xfs_iunlock(rtg_refcount(rtg), XFS_ILOCK_EXCL); + if ((rtglock_flags & XFS_RTGLOCK_RMAP) && rtg_rmap(rtg)) xfs_iunlock(rtg_rmap(rtg), XFS_ILOCK_EXCL); @@ -249,6 +255,9 @@ xfs_rtgroup_trans_join( if ((rtglock_flags & XFS_RTGLOCK_RMAP) && rtg_rmap(rtg)) xfs_trans_ijoin(tp, rtg_rmap(rtg), XFS_ILOCK_EXCL); + + if ((rtglock_flags & XFS_RTGLOCK_REFCOUNT) && rtg_refcount(rtg)) + xfs_trans_ijoin(tp, rtg_refcount(rtg), XFS_ILOCK_EXCL); } /* Retrieve rt group geometry. */ diff --git a/fs/xfs/libxfs/xfs_rtgroup.h b/fs/xfs/libxfs/xfs_rtgroup.h index 385ea8e2f28b67..de4eeb381fc9fc 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.h +++ b/fs/xfs/libxfs/xfs_rtgroup.h @@ -273,10 +273,13 @@ int xfs_update_last_rtgroup_size(struct xfs_mount *mp, #define XFS_RTGLOCK_BITMAP_SHARED (1U << 1) /* Lock the rt rmap inode in exclusive mode */ #define XFS_RTGLOCK_RMAP (1U << 2) +/* Lock the rt refcount inode in exclusive mode */ +#define XFS_RTGLOCK_REFCOUNT (1U << 3) #define XFS_RTGLOCK_ALL_FLAGS (XFS_RTGLOCK_BITMAP | \ XFS_RTGLOCK_BITMAP_SHARED | \ - XFS_RTGLOCK_RMAP) + XFS_RTGLOCK_RMAP | \ + XFS_RTGLOCK_REFCOUNT) void xfs_rtgroup_lock(struct xfs_rtgroup *rtg, unsigned int rtglock_flags); void xfs_rtgroup_unlock(struct xfs_rtgroup *rtg, unsigned int rtglock_flags); diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c index 3e3ef16f65a335..1dbd2d75f7ae3e 100644 --- a/fs/xfs/xfs_fsmap.c +++ b/fs/xfs/xfs_fsmap.c @@ -27,6 +27,7 @@ #include "xfs_ag.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" /* Convert an xfs_fsmap to an fsmap. */ static void @@ -212,21 +213,20 @@ xfs_getfsmap_is_shared( struct xfs_mount *mp = tp->t_mountp; struct xfs_btree_cur *cur; xfs_agblock_t fbno; - xfs_extlen_t flen; + xfs_extlen_t flen = 0; int error; *stat = false; - if (!xfs_has_reflink(mp)) - return 0; - /* rt files will have no perag structure */ - if (!info->group) + if (!xfs_has_reflink(mp) || !info->group) return 0; + if (info->group->xg_type == XG_TYPE_RTG) + cur = xfs_rtrefcountbt_init_cursor(tp, to_rtg(info->group)); + else + cur = xfs_refcountbt_init_cursor(mp, tp, info->agf_bp, + to_perag(info->group)); + /* Are there any shared blocks here? */ - flen = 0; - cur = xfs_refcountbt_init_cursor(mp, tp, info->agf_bp, - to_perag(info->group)); - error = xfs_refcount_find_shared(cur, frec->rec_key, XFS_BB_TO_FSBT(mp, frec->len_daddr), &fbno, &flen, false); @@ -863,7 +863,7 @@ xfs_getfsmap_rtdev_rmapbt_query( struct xfs_rtgroup *rtg = to_rtg(info->group); /* Query the rtrmapbt */ - xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP); + xfs_rtgroup_lock(rtg, XFS_RTGLOCK_RMAP | XFS_RTGLOCK_REFCOUNT); *curpp = xfs_rtrmapbt_init_cursor(tp, rtg); return xfs_rmap_query_range(*curpp, &info->low, &info->high, xfs_getfsmap_rtdev_rmapbt_helper, info); @@ -950,7 +950,8 @@ xfs_getfsmap_rtdev_rmapbt( if (bt_cur) { xfs_rtgroup_unlock(to_rtg(bt_cur->bc_group), - XFS_RTGLOCK_RMAP); + XFS_RTGLOCK_RMAP | + XFS_RTGLOCK_REFCOUNT); xfs_btree_del_cursor(bt_cur, XFS_BTREE_NOERROR); bt_cur = NULL; } @@ -988,7 +989,7 @@ xfs_getfsmap_rtdev_rmapbt( if (bt_cur) { xfs_rtgroup_unlock(to_rtg(bt_cur->bc_group), - XFS_RTGLOCK_RMAP); + XFS_RTGLOCK_RMAP | XFS_RTGLOCK_REFCOUNT); xfs_btree_del_cursor(bt_cur, error < 0 ? XFS_BTREE_ERROR : XFS_BTREE_NOERROR); } diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 71b4743ffb7741..66ce29101462cd 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -30,6 +30,9 @@ #include "xfs_ag.h" #include "xfs_ag_resv.h" #include "xfs_health.h" +#include "xfs_rtrefcount_btree.h" +#include "xfs_rtalloc.h" +#include "xfs_rtgroup.h" /* * Copy on Write of Shared Blocks @@ -163,6 +166,53 @@ xfs_reflink_find_shared( return error; } +/* + * Given a file mapping for the rt device, find the lowest-numbered run of + * shared blocks within that mapping and return it in shared_offset/shared_len. + * The offset is relative to the start of irec. + * + * If find_end_of_shared is true, return the longest contiguous extent of shared + * blocks. If there are no shared extents, shared_offset and shared_len will be + * set to 0; + */ +static int +xfs_reflink_find_rtshared( + struct xfs_mount *mp, + struct xfs_trans *tp, + const struct xfs_bmbt_irec *irec, + xfs_extlen_t *shared_offset, + xfs_extlen_t *shared_len, + bool find_end_of_shared) +{ + struct xfs_rtgroup *rtg; + struct xfs_btree_cur *cur; + xfs_rgblock_t orig_bno; + xfs_agblock_t found_bno; + int error; + + BUILD_BUG_ON(NULLRGBLOCK != NULLAGBLOCK); + + /* + * Note: this uses the not quite correct xfs_agblock_t type because + * xfs_refcount_find_shared is shared between the RT and data device + * refcount code. + */ + orig_bno = xfs_rtb_to_rgbno(mp, irec->br_startblock); + rtg = xfs_rtgroup_get(mp, xfs_rtb_to_rgno(mp, irec->br_startblock)); + + xfs_rtgroup_lock(rtg, XFS_RTGLOCK_REFCOUNT); + cur = xfs_rtrefcountbt_init_cursor(tp, rtg); + error = xfs_refcount_find_shared(cur, orig_bno, irec->br_blockcount, + &found_bno, shared_len, find_end_of_shared); + xfs_btree_del_cursor(cur, error); + xfs_rtgroup_unlock(rtg, XFS_RTGLOCK_REFCOUNT); + xfs_rtgroup_put(rtg); + + if (!error && *shared_len) + *shared_offset = found_bno - orig_bno; + return error; +} + /* * Trim the mapping to the next block where there's a change in the * shared/unshared status. More specifically, this means that we @@ -191,8 +241,12 @@ xfs_reflink_trim_around_shared( trace_xfs_reflink_trim_around_shared(ip, irec); - error = xfs_reflink_find_shared(mp, NULL, irec, &shared_offset, - &shared_len, true); + if (XFS_IS_REALTIME_INODE(ip)) + error = xfs_reflink_find_rtshared(mp, NULL, irec, + &shared_offset, &shared_len, true); + else + error = xfs_reflink_find_shared(mp, NULL, irec, + &shared_offset, &shared_len, true); if (error) return error; @@ -1554,8 +1608,12 @@ xfs_reflink_inode_has_shared_extents( got.br_state != XFS_EXT_NORM) goto next; - error = xfs_reflink_find_shared(mp, tp, &got, &shared_offset, - &shared_len, false); + if (XFS_IS_REALTIME_INODE(ip)) + error = xfs_reflink_find_rtshared(mp, tp, &got, + &shared_offset, &shared_len, false); + else + error = xfs_reflink_find_shared(mp, tp, &got, + &shared_offset, &shared_len, false); if (error) return error; From patchwork Thu Dec 19 19:36:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915615 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C53E19884C for ; Thu, 19 Dec 2024 19:36:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637017; cv=none; b=GUQo+VBb/Fz5gGwUlXvqeqj6Ocb+0Qi5FxJC4933J4vmohr1sRBGRtt0WtVA+ewyN8PnHD1KOBh7eFQPE5BJnU1RQpAKjqdn0yCP/x/DlaoV22nzbQwYKTlJ7OdNzYeZa7NDNJS05q9JqoMbEd+RLM6lBlD6vQxkdv7z8vEvK68= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637017; c=relaxed/simple; bh=MWUsJf5LAwVnADKP8zAqai6EORpjE/kmaFcm+GMTHOw=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Zr19pUjPiAZ6loIdSSNHedR9fbpXlrrn2r/0L8U0EbA/yDGvOMlqg8KASXdNKiHbHgb6yzwytqW+iItR3A2PsiTOih2HRsvEVSfdX568X67+T6BREu6WlYjwq3YEnF5Iz1lzV18x58LhVFm8wIaVgmv0JPLr0KGJyjVR30KybUA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=i13pWUG5; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="i13pWUG5" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 15D9DC4CECE; Thu, 19 Dec 2024 19:36:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637017; bh=MWUsJf5LAwVnADKP8zAqai6EORpjE/kmaFcm+GMTHOw=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=i13pWUG5z5xtzxR0QzvdwqkCG9RVYklO3MfoPS4jocy9wN1OLkQUZHwsbr9xUwM4N A+qDVynTdr/eItK/7rB1pnan6LoEm3IJDd9bsUMD68zy334BIn0ANa7KKexSDoBuuV 33lzdKgld2XVnHMjAVMGnuDZWb3u/MowyxlhvpSmOfCTvolf3olkObOR04w4cyNyEw 67JJkyyAKtwgug0dorgkmJgk1SRV9siTTzEIaTb7TsCrpm+OdEZ+sZpRZ8HjabuXL2 T3Q/x1LeetKq2REiSD2+1PO92XIV5lnC59xXo5G+YabPQ6MVXmgbYqXvtR+TXbo9G2 prTiQ3xnK5Row== Date: Thu, 19 Dec 2024 11:36:56 -0800 Subject: [PATCH 15/43] xfs: create routine to allocate and initialize a realtime refcount btree inode From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581236.1572761.8380322015327920196.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Create a library routine to allocate and initialize an empty realtime refcountbt inode. We'll use this for growfs, mkfs, and repair. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_rtgroup.c | 2 ++ fs/xfs/libxfs/xfs_rtrefcount_btree.c | 28 ++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_rtrefcount_btree.h | 3 +++ 3 files changed, 33 insertions(+) diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index d5ecc2f6c5c202..eab655a4a9ef5c 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -34,6 +34,7 @@ #include "xfs_metafile.h" #include "xfs_metadir.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" /* Find the first usable fsblock in this rtgroup. */ static inline uint32_t @@ -382,6 +383,7 @@ static const struct xfs_rtginode_ops xfs_rtginode_ops[XFS_RTGI_MAX] = { .fmt_mask = 1U << XFS_DINODE_FMT_META_BTREE, /* same comment about growfs and rmap inodes applies here */ .enabled = xfs_has_reflink, + .create = xfs_rtrefcountbt_create, }, }; diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index 6a5bc7ea42fbe6..151fb1ef7db126 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -721,3 +721,31 @@ xfs_iflush_rtrefcount( ifp->if_broot_bytes, dfp, XFS_DFORK_SIZE(dip, ip->i_mount, XFS_DATA_FORK)); } + +/* + * Create a realtime refcount btree inode. + */ +int +xfs_rtrefcountbt_create( + struct xfs_rtgroup *rtg, + struct xfs_inode *ip, + struct xfs_trans *tp, + bool init) +{ + struct xfs_ifork *ifp = xfs_ifork_ptr(ip, XFS_DATA_FORK); + struct xfs_mount *mp = ip->i_mount; + struct xfs_btree_block *broot; + + ifp->if_format = XFS_DINODE_FMT_META_BTREE; + ASSERT(ifp->if_broot_bytes == 0); + ASSERT(ifp->if_bytes == 0); + + /* Initialize the empty incore btree root. */ + broot = xfs_broot_realloc(ifp, + xfs_rtrefcount_broot_space_calc(mp, 0, 0)); + if (broot) + xfs_btree_init_block(mp, broot, &xfs_rtrefcountbt_ops, 0, 0, + ip->i_ino); + xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE | XFS_ILOG_DBROOT); + return 0; +} diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.h b/fs/xfs/libxfs/xfs_rtrefcount_btree.h index e558a10c4744ad..a99b7a8aec8659 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.h +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.h @@ -183,4 +183,7 @@ void xfs_rtrefcountbt_to_disk(struct xfs_mount *mp, struct xfs_rtrefcount_root *dblock, int dblocklen); void xfs_iflush_rtrefcount(struct xfs_inode *ip, struct xfs_dinode *dip); +int xfs_rtrefcountbt_create(struct xfs_rtgroup *rtg, struct xfs_inode *ip, + struct xfs_trans *tp, bool init); + #endif /* __XFS_RTREFCOUNT_BTREE_H__ */ From patchwork Thu Dec 19 19:37:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915616 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D28E41A08CC for ; Thu, 19 Dec 2024 19:37:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637032; cv=none; b=MZl5pR1NimFO0oROTJYwzcLkyz0nRFLPHuTAcipvklMI1AGNBMuvYo0+9INZ0+YkyetmPMNpqqbusACPWJEvTAJuEdSEpIn2FdzX+4zY0rPQxPy97RohSYxDR1Wo2LNDXjeuMfuqAwZqTwS3AdYw4EE9CH9MEDrnbEq9Jrr5HUY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637032; c=relaxed/simple; bh=Jtokb0PVehQconv58gAy6SaT+6s4nqyAG3KPe1xRR5A=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=RqJEdorJcSP8atQEG74Wla9JF9+EGvktL+73lbEJwGBHXqt8+xmjqbrHVjoP82x91tEJi8WU9P1DMbTBG52B5gy9z5naYCeFEJMWI8SpOEt/14kkcIYV5SRlFwa29PmOpE5ao5YlqOjd2GANOefVsIh2pWYadiVha+K+UekkUrA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=d/sKxw9q; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="d/sKxw9q" Received: by smtp.kernel.org (Postfix) with ESMTPSA id AE70EC4CECE; Thu, 19 Dec 2024 19:37:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637032; bh=Jtokb0PVehQconv58gAy6SaT+6s4nqyAG3KPe1xRR5A=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=d/sKxw9q0a7Ud6RCX1MyEiUxqvwbSya7bCrfQY4vhaXvd0vdSt2jMfDto5VF5YUg2 WB5tpUsi40RZUDvPr940izq7DjsAzzlb+sgAaNj3lTg0s9w8LaQ+YbfOFnGBismX/6 LNEWpkrnmGPLxhLwB7DMf3F/kvZlA14QeEZiE6d+HHbjTKNjtkFn8FkoTOhpu42uTF LkAr1GWEAS1viQemAboLoyImStcEA/wBP3zBof8sCHwMWLsXDwwv8r4rDfiDE19SoC qv37KqfJ3nBT1ayFYGNWRsU09vVu7qrT10Qo2Y/4JMMAoRrtCgTz2KfMShWdhAzZDp /01EfeC7DneUg== Date: Thu, 19 Dec 2024 11:37:12 -0800 Subject: [PATCH 16/43] xfs: update rmap to allow cow staging extents in the rt rmap From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581252.1572761.5940139620557154469.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Don't error out on CoW staging extent records when realtime reflink is enabled. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_rmap.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index f8415fd96cc2aa..3cdf50563fecb9 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -285,6 +285,13 @@ xfs_rtrmap_check_meta_irec( if (irec->rm_blockcount != mp->m_sb.sb_rextsize) return __this_address; return NULL; + case XFS_RMAP_OWN_COW: + if (!xfs_has_rtreflink(mp)) + return __this_address; + if (!xfs_verify_rgbext(rtg, irec->rm_startblock, + irec->rm_blockcount)) + return __this_address; + return NULL; default: return __this_address; } From patchwork Thu Dec 19 19:37:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915617 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C06D1A08CC for ; Thu, 19 Dec 2024 19:37:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637048; cv=none; b=SXU+iMSicbzdTWAwWAw10SPOvifOd1GmLD57jtna1/a4XRAdH1T+ABy5B344xX+tVtFp47PDx6NIJri8U4XO5YmmW36qmtaTLZYsvAy54krZqjgLWuN5UHskIfqvJgWI9upFh51bh9YcTDohqc76mwmjP2n+btQXz1Cy6HZfz24= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637048; c=relaxed/simple; bh=E33NpwMg35NquW8dRlbAcvuMrYYS5rxRlCKt3lzwikg=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=AnXbg2c415FCN8xEE0yGqQh/9ndbNHq18eqgI1ZfGXYD+T4r8VMKOAjdpWw2hpi0jJCGPlpe+Jn6TlC4NMNionHRjM5uNNjV2LSYHAIOklceYsFT1H+3DhWKBhBc0E2Mug+XRCWxAQdr4osJys+KLQ6PVVDcKfRhTVk0PwDWsrc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pmBcm/XZ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pmBcm/XZ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 49FC2C4CECE; Thu, 19 Dec 2024 19:37:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637048; bh=E33NpwMg35NquW8dRlbAcvuMrYYS5rxRlCKt3lzwikg=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=pmBcm/XZfx2Mcq0TCOABZBkR8rmaEA7XUE8jswKEjqan+BvjNkQIz4mTWf8vr1/2A iKGfPhjcxCZ/e/74XTcj7inLdctg9fsdZ/jCFLZ8fJPS0bYXY0Y5Errhu7kfBRn12F helxBjNToc7qA3mZR7hQqTvrkm/JrQ78d6H4E/Um4EZ21vqr4aHjnFBuubvOytOLup 4PQQPlO4g7GxXSnae9vZDNiTHT9i0KnCx1eqTRtuiG5nmw3KJh5O3/bD1LR0+zAGpb lT5WI4DvgZB22TqllxMQwgZmgqDrQYQtMs1ocXY8FjFhh+0yI2I/jmclFcQ9Sl1qO9 Ug2ozG9CW46tA== Date: Thu, 19 Dec 2024 11:37:27 -0800 Subject: [PATCH 17/43] xfs: compute rtrmap btree max levels when reflink enabled From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581270.1572761.16688976269157812588.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Compute the maximum possible height of the realtime rmap btree when reflink is enabled. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_rtrmap_btree.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rtrmap_btree.c b/fs/xfs/libxfs/xfs_rtrmap_btree.c index 3cb8f126b9ce16..04b9c76380adb0 100644 --- a/fs/xfs/libxfs/xfs_rtrmap_btree.c +++ b/fs/xfs/libxfs/xfs_rtrmap_btree.c @@ -718,6 +718,7 @@ xfs_rtrmapbt_maxrecs( unsigned int xfs_rtrmapbt_maxlevels_ondisk(void) { + unsigned long long max_dblocks; unsigned int minrecs[2]; unsigned int blocklen; @@ -726,8 +727,20 @@ xfs_rtrmapbt_maxlevels_ondisk(void) minrecs[0] = xfs_rtrmapbt_block_maxrecs(blocklen, true) / 2; minrecs[1] = xfs_rtrmapbt_block_maxrecs(blocklen, false) / 2; - /* We need at most one record for every block in an rt group. */ - return xfs_btree_compute_maxlevels(minrecs, XFS_MAX_RGBLOCKS); + /* + * Compute the asymptotic maxlevels for an rtrmapbt on any rtreflink fs. + * + * On a reflink filesystem, each block in an rtgroup can have up to + * 2^32 (per the refcount record format) owners, which means that + * theoretically we could face up to 2^64 rmap records. However, we're + * likely to run out of blocks in the data device long before that + * happens, which means that we must compute the max height based on + * what the btree will look like if it consumes almost all the blocks + * in the data device due to maximal sharing factor. + */ + max_dblocks = -1U; /* max ag count */ + max_dblocks *= XFS_MAX_CRC_AG_BLOCKS; + return xfs_btree_space_to_height(minrecs, max_dblocks); } int __init @@ -766,9 +779,20 @@ xfs_rtrmapbt_compute_maxlevels( * maximum height is constrained by the size of the data device and * the height required to store one rmap record for each block in an * rt group. + * + * On a reflink filesystem, each rt block can have up to 2^32 (per the + * refcount record format) owners, which means that theoretically we + * could face up to 2^64 rmap records. This makes the computation of + * maxlevels based on record count meaningless, so we only consider the + * size of the data device. */ d_maxlevels = xfs_btree_space_to_height(mp->m_rtrmap_mnr, mp->m_sb.sb_dblocks); + if (xfs_has_rtreflink(mp)) { + mp->m_rtrmap_maxlevels = d_maxlevels + 1; + return; + } + r_maxlevels = xfs_btree_compute_maxlevels(mp->m_rtrmap_mnr, mp->m_groups[XG_TYPE_RTG].blocks); From patchwork Thu Dec 19 19:37:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915618 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 661E219884C for ; Thu, 19 Dec 2024 19:37:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637064; cv=none; b=WluyoJnG3uPuc4r0lrx1PVjiT3MkDGVGhXG8bsp6F1dm/ktjnlIzylSLioP3J1faSJT80V9lx6nVqes+0zQ33TnLjx7fKrQWh16uukgi/ll6VrmDueIX3HEaA8dPeIDIxOUttbkIjlae7N7xr+E/i9A4NqnrIWpUy79JrW3WUxI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637064; c=relaxed/simple; bh=VqbvNoWjRQs46L5/Xu/EPUXMq6a+GB9slHXl2N51EB4=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=TsqfQ0mwAh//6Uo9IPuKdu1zc8XbqC7bFxGzxtaRiLh4LDrA/lHrJnc2c2wqQtLz3PUnMsOTRXhzAgXCSj40J1oR7k8jO4jASNmslunnYD8lnAq4WiZAmVGg8x2bjrnseB3ZzXMEGQooKO+Hh6g1FSFoZxPbWtnu2a48k3+EMSg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=jupXuzKq; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jupXuzKq" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D7C7FC4CECE; Thu, 19 Dec 2024 19:37:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637063; bh=VqbvNoWjRQs46L5/Xu/EPUXMq6a+GB9slHXl2N51EB4=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=jupXuzKq6n8f5FM2fdZdIlF9NzuYo3bSt7rv0mShVppmgEkMau6RdspyJAw53hePX ekLMaRcnKfvBgGEBjklDkHxjHZhUfWw5lL1WRmiIKmeotJO3m6P178wQbrjgbR18jT 2HKdnRiLHAIIdt9DKClsOS0xqstXuAG/ZNASbYdJq0S+aPt6mzzGQ6g9OlmupBYesM WiVkLgjxYPE/wOHsePTiTrduHZUvCp7VkaOUP8vPpsGqk3AbzDkqMAXtMYrPdafFj0 6J1YU4LqIYknWop2zaL2d+2a0jdQqxKUGDG2Htl+G5HzWJ322UBV5iZsCP6t2jMQHs +beF8ktWGmraA== Date: Thu, 19 Dec 2024 11:37:43 -0800 Subject: [PATCH 18/43] xfs: refactor reflink quota updates From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581287.1572761.10199818887802475152.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Hoist all quota updates for reflink into a helper function, since things are about to become more complicated. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_reflink.c | 37 ++++++++++++++++++++++++++++++++----- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 66ce29101462cd..29574b015fddc0 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -739,6 +739,35 @@ xfs_reflink_cancel_cow_range( return error; } +#ifdef CONFIG_XFS_QUOTA +/* + * Update quota accounting for a remapping operation. When we're remapping + * something from the CoW fork to the data fork, we must update the quota + * accounting for delayed allocations. For remapping from the data fork to the + * data fork, use regular block accounting. + */ +static inline void +xfs_reflink_update_quota( + struct xfs_trans *tp, + struct xfs_inode *ip, + bool is_cow, + int64_t blocks) +{ + unsigned int qflag; + + if (XFS_IS_REALTIME_INODE(ip)) { + qflag = is_cow ? XFS_TRANS_DQ_DELRTBCOUNT : + XFS_TRANS_DQ_RTBCOUNT; + } else { + qflag = is_cow ? XFS_TRANS_DQ_DELBCOUNT : + XFS_TRANS_DQ_BCOUNT; + } + xfs_trans_mod_dquot_byino(tp, ip, qflag, blocks); +} +#else +# define xfs_reflink_update_quota(tp, ip, is_cow, blocks) ((void)0) +#endif + /* * Remap part of the CoW fork into the data fork. * @@ -833,8 +862,7 @@ xfs_reflink_end_cow_extent( */ xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &data); xfs_refcount_decrease_extent(tp, isrt, &data); - xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, - -data.br_blockcount); + xfs_reflink_update_quota(tp, ip, false, -data.br_blockcount); } else if (data.br_startblock == DELAYSTARTBLOCK) { int done; @@ -859,8 +887,7 @@ xfs_reflink_end_cow_extent( xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &del); /* Charge this new data fork mapping to the on-disk quota. */ - xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT, - (long)del.br_blockcount); + xfs_reflink_update_quota(tp, ip, true, del.br_blockcount); /* Remove the mapping from the CoW fork. */ xfs_bmap_del_extent_cow(ip, &icur, &got, &del); @@ -1347,7 +1374,7 @@ xfs_reflink_remap_extent( qdelta += dmap->br_blockcount; } - xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, qdelta); + xfs_reflink_update_quota(tp, ip, false, qdelta); /* Update dest isize if needed. */ newlen = XFS_FSB_TO_B(mp, dmap->br_startoff + dmap->br_blockcount); From patchwork Thu Dec 19 19:37:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915619 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ADE7A19884C for ; Thu, 19 Dec 2024 19:37:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637079; cv=none; b=oBNNlS/q8U4yh4UBE9FEyubIYJAdWDKaNwi9KErexwHHx6YPOZLxaG8/8J/Pjk6GycoGPOKaxpJVdY+ijBUEew2FpslrC2YrUf40q3m09P6DWoOmTSFkdStk/XN8UB5UTlSKsEyFUsdbikJasAU9PWSB7qmf2CDCTPD36GQbE98= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637079; c=relaxed/simple; bh=krTqPSDbEL4Uf4dmqOY5JcZGYJBUioGdHSoLVkGzkCk=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=A7bM+lTyKYTc3+giTEXAqTY/dadK3u1uCmYRaj5sxBTk//Kq5GpIWXZGSZCGb6hBZ4/Gz8+l+aVc1pGh13/VLxYkY+TTIOg6/Z0H8HNFGHIL5QZUs6uJgsjRbWo0GyhbUKqlh2AFsx8fOl/t/PPoRNzp8u9wtaymqexzursNxzQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=lsoBpaKk; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="lsoBpaKk" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7C6E5C4CECE; Thu, 19 Dec 2024 19:37:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637079; bh=krTqPSDbEL4Uf4dmqOY5JcZGYJBUioGdHSoLVkGzkCk=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=lsoBpaKkF4y/UhmJWUdCyu3erMyBYVJH5QkHZDhVhPU2Wi5aMfZ/hPtRS3iDGkkM4 B4QwDghU+o63J/Ix5ZWeNWJmo3mF0vrPs7t7zfvAxqAiy3J1DpIB7gMpLrBSBbpzgv JtUUgH4SvqC2OAjfSdF0MECJ6OWC0OaV6cwx+XOV0FO467l5BWkkMFXzWMT3NZqEUN uKyrHzqDZtXSf6xc3+syZjuBLPH1BzZC6bYBY/6OMCttgXlGtNNNa0waMhx2LGLJn1 Ey/KXvXZELBlMiMrSOe8SanMwhTfdiUtKEqsENs1g6k4Wun0dQmzoE3Bg0TlC38YVo iM7PeA0SDOaxQ== Date: Thu, 19 Dec 2024 11:37:59 -0800 Subject: [PATCH 19/43] xfs: enable CoW for realtime data From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581304.1572761.16104892994654937567.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Update our write paths to support copy on write on the rt volume. This works in more or less the same way as it does on the data device, with the major exception that we never do delalloc on the rt volume. Because we consider unwritten CoW fork staging extents to be incore quota reservation, we update xfs_quota_reserve_blkres to support this case. Though xfs doesn't allow rt and quota together, the change is trivial and we shouldn't leave a logic bomb here. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_reflink.c | 36 ++++++++++++++++++++++++++++-------- 1 file changed, 28 insertions(+), 8 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 29574b015fddc0..24f545687b8730 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -439,20 +439,26 @@ xfs_reflink_fill_cow_hole( struct xfs_mount *mp = ip->i_mount; struct xfs_trans *tp; xfs_filblks_t resaligned; - xfs_extlen_t resblks; + unsigned int dblocks = 0, rblocks = 0; int nimaps; int error; bool found; resaligned = xfs_aligned_fsb_count(imap->br_startoff, imap->br_blockcount, xfs_get_cowextsz_hint(ip)); - resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); + if (XFS_IS_REALTIME_INODE(ip)) { + dblocks = XFS_DIOSTRAT_SPACE_RES(mp, 0); + rblocks = resaligned; + } else { + dblocks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned); + rblocks = 0; + } xfs_iunlock(ip, *lockmode); *lockmode = 0; - error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, resblks, 0, - false, &tp); + error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, dblocks, + rblocks, false, &tp); if (error) return error; @@ -1212,7 +1218,7 @@ xfs_reflink_remap_extent( struct xfs_trans *tp; xfs_off_t newlen; int64_t qdelta = 0; - unsigned int resblks; + unsigned int dblocks, rblocks, resblks; bool quota_reserved = true; bool smap_real; bool dmap_written = xfs_bmap_is_written_extent(dmap); @@ -1243,8 +1249,15 @@ xfs_reflink_remap_extent( * we're remapping. */ resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); + if (XFS_IS_REALTIME_INODE(ip)) { + dblocks = resblks; + rblocks = dmap->br_blockcount; + } else { + dblocks = resblks + dmap->br_blockcount; + rblocks = 0; + } error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, - resblks + dmap->br_blockcount, 0, false, &tp); + dblocks, rblocks, false, &tp); if (error == -EDQUOT || error == -ENOSPC) { quota_reserved = false; error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, @@ -1324,8 +1337,15 @@ xfs_reflink_remap_extent( * done. */ if (!quota_reserved && !smap_real && dmap_written) { - error = xfs_trans_reserve_quota_nblks(tp, ip, - dmap->br_blockcount, 0, false); + if (XFS_IS_REALTIME_INODE(ip)) { + dblocks = 0; + rblocks = dmap->br_blockcount; + } else { + dblocks = dmap->br_blockcount; + rblocks = 0; + } + error = xfs_trans_reserve_quota_nblks(tp, ip, dblocks, rblocks, + false); if (error) goto out_cancel; } From patchwork Thu Dec 19 19:38:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915620 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8DF661B4F04 for ; Thu, 19 Dec 2024 19:38:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637095; cv=none; b=Ok9ajkLGQtpuKyqnzaKtr7udgay42ptxj2S7JzJbSY2x6TyntndWZukU85ckLzdHVSsumx7jW30+dUv8wM+igwEqUDPI+nM4M4vuHVVd84ys0l4tHA/Oq/bdEI06/0c5eCxztsViqcK+UWL0r7TDBbHYPXhRecZu1BHITdpCq4Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637095; c=relaxed/simple; bh=GDfgkiZ3H0bF/1KyRQutzFdUD1zdUZFsF8rc7IlIdNc=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=t65wR9QiwJVNDf2K2XnA6CzkDa95TJjc11Ag0IZBYkBCxCf0OHZYL+kbz5l7O1UkBxd2RtuiE4u+0qrIkCrGj/igMbl77FeIN3A03IhPoY+Zhas/fuD5wXt2saxdP1+6CupeT6moz+QJpySPbGc4W9v4lJxwQ3alUMqoez2JhtE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Wv+fHRas; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Wv+fHRas" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1E49EC4CECE; Thu, 19 Dec 2024 19:38:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637095; bh=GDfgkiZ3H0bF/1KyRQutzFdUD1zdUZFsF8rc7IlIdNc=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=Wv+fHRasMk22AV9cpBCB5mqR5vS5IVX+/YsMGH7Eq2Dj3s0i/x1u5JJJaE0Jq47OF PPYrpXuFtmhJKqYkyzdYgaSOyxmMi3CsveuTAnnotpgHfMAvP7D34lI0Tc1Bu+Cz3w hz7hg0jPMo5PacZRR1RNnuXq5A4hCvqVwIgdNu+/rIuHiIYpoRQTU+6Td/YhIGiktU V8ltRAc4WKO7Sq6q5AVadmnLTIoodXX4jxyB1BcIKhGTx94GctCsABUFmLk5KnYhcS 7PdIgrUnV0xvIfTQELuWL8iogFC2Frd14Q0/neXh3slgfqPXarSy8x9nY5nBsvmIP/ RfKCL3tcjT9UQ== Date: Thu, 19 Dec 2024 11:38:14 -0800 Subject: [PATCH 20/43] xfs: enable sharing of realtime file blocks From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581321.1572761.3233283111774533052.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Update the remapping routines to be able to handle realtime files. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_reflink.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 24f545687b8730..78b47b2ac12453 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -33,6 +33,7 @@ #include "xfs_rtrefcount_btree.h" #include "xfs_rtalloc.h" #include "xfs_rtgroup.h" +#include "xfs_metafile.h" /* * Copy on Write of Shared Blocks @@ -1187,14 +1188,28 @@ xfs_reflink_update_dest( static int xfs_reflink_ag_has_free_space( struct xfs_mount *mp, - xfs_agnumber_t agno) + struct xfs_inode *ip, + xfs_fsblock_t fsb) { struct xfs_perag *pag; + xfs_agnumber_t agno; int error = 0; if (!xfs_has_rmapbt(mp)) return 0; + if (XFS_IS_REALTIME_INODE(ip)) { + struct xfs_rtgroup *rtg; + xfs_rgnumber_t rgno; + rgno = xfs_rtb_to_rgno(mp, fsb); + rtg = xfs_rtgroup_get(mp, rgno); + if (xfs_metafile_resv_critical(rtg_rmap(rtg))) + error = -ENOSPC; + xfs_rtgroup_put(rtg); + return error; + } + + agno = XFS_FSB_TO_AGNO(mp, fsb); pag = xfs_perag_get(mp, agno); if (xfs_ag_resv_critical(pag, XFS_AG_RESV_RMAPBT) || xfs_ag_resv_critical(pag, XFS_AG_RESV_METADATA)) @@ -1308,8 +1323,8 @@ xfs_reflink_remap_extent( /* No reflinking if the AG of the dest mapping is low on space. */ if (dmap_written) { - error = xfs_reflink_ag_has_free_space(mp, - XFS_FSB_TO_AGNO(mp, dmap->br_startblock)); + error = xfs_reflink_ag_has_free_space(mp, ip, + dmap->br_startblock); if (error) goto out_cancel; } @@ -1568,8 +1583,8 @@ xfs_reflink_remap_prep( /* Check file eligibility and prepare for block sharing. */ ret = -EINVAL; - /* Don't reflink realtime inodes */ - if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) + /* Can't reflink between data and rt volumes */ + if (XFS_IS_REALTIME_INODE(src) != XFS_IS_REALTIME_INODE(dest)) goto out_unlock; /* Don't share DAX file data with non-DAX file. */ From patchwork Thu Dec 19 19:38:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915621 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8DF719884C for ; Thu, 19 Dec 2024 19:38:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637110; cv=none; b=IcOIob0Wf9SjbPrgqJ3LQNh8d+L6gXWHA+7wA/2Cv823CtYz91A2f3fvbSnVp6lJ9+3nsLvJktRxUosmCYoD+yz/b05L0q3W68juDaWi/wD+mzHwzga6Ui14RskQnB733WU4UL1vFJanFzd+ovluzaGQUxbU8UH3ZZnnT0nwdTA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637110; c=relaxed/simple; bh=eKrvap0A7dc5lsWpWCkzOsqhx8kRWHDiYveWjQjWIKI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MRhiIy6a/KWEJzgz9jy73XESwSXt+/9dxrwmVgCql/80ynmCEUrFhVeL3cD8rb7KW3ZZ5bYITePbhCDKTkF970pc0tW5UY1d7/XY+nDqlJM6hbO5gNl/DAL+SNhhHGfu2c3tjKB8c5qEo7jRLUIAsFjmmsGnRB5X6rKWN5Fflmo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=XOg6+Fo7; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="XOg6+Fo7" Received: by smtp.kernel.org (Postfix) with ESMTPSA id AF803C4CECE; Thu, 19 Dec 2024 19:38:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637110; bh=eKrvap0A7dc5lsWpWCkzOsqhx8kRWHDiYveWjQjWIKI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=XOg6+Fo74WY/qIL+5wCYQ4Ho1rP+KuJIM6JpbJlxqwMR46VegHq1pXKgd6ZdOuIuk Kv9RZKPHFT4X5iS0cBUQGQZFbie5zGXtX6Q/YW3BM2ZwAKeDShADBtrjxzMzV7JyXQ w2trBDXSt+RZnAyvAN8jls+ouGQHxWx9bnw023ouPoEq8Lc8OYv53cBByOCuykP9Pz zO/AOj2DpXpNDrghEyPlxr5yF75lEjj7wY6nR9UHWGvum/2+0EcFAFr1clL2ygSDOD OB2vxfT9qkroc3Br/+Dy1MkeUX1GW24L3qv5bK69eNUaIEPux7/5GeMLq1aI+963Ik e1XyMKKpkrbTw== Date: Thu, 19 Dec 2024 11:38:30 -0800 Subject: [PATCH 21/43] xfs: allow inodes to have the realtime and reflink flags From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581337.1572761.5681483291096817147.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Now that we can share blocks between realtime files, allow this combination. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_inode_buf.c | 3 ++- fs/xfs/scrub/inode.c | 5 +++-- fs/xfs/scrub/inode_repair.c | 6 ------ fs/xfs/xfs_ioctl.c | 4 ---- 4 files changed, 5 insertions(+), 13 deletions(-) diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 65eec8f60376d3..4273d096fb0a9c 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -748,7 +748,8 @@ xfs_dinode_verify( return __this_address; /* don't let reflink and realtime mix */ - if ((flags2 & XFS_DIFLAG2_REFLINK) && (flags & XFS_DIFLAG_REALTIME)) + if ((flags2 & XFS_DIFLAG2_REFLINK) && (flags & XFS_DIFLAG_REALTIME) && + !xfs_has_rtreflink(mp)) return __this_address; /* COW extent size hint validation */ diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index 8e702121dc8699..c7bbc3f78e90b1 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -360,8 +360,9 @@ xchk_inode_flags2( if ((flags2 & XFS_DIFLAG2_REFLINK) && !S_ISREG(mode)) goto bad; - /* realtime and reflink make no sense, currently */ - if ((flags & XFS_DIFLAG_REALTIME) && (flags2 & XFS_DIFLAG2_REFLINK)) + /* realtime and reflink don't always go together */ + if ((flags & XFS_DIFLAG_REALTIME) && (flags2 & XFS_DIFLAG2_REFLINK) && + !xfs_has_rtreflink(mp)) goto bad; /* no bigtime iflag without the bigtime feature */ diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index d7e3f033b16073..938a18721f3697 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -564,8 +564,6 @@ xrep_dinode_flags( flags2 |= XFS_DIFLAG2_REFLINK; else flags2 &= ~(XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE); - if (flags & XFS_DIFLAG_REALTIME) - flags2 &= ~XFS_DIFLAG2_REFLINK; if (!xfs_has_bigtime(mp)) flags2 &= ~XFS_DIFLAG2_BIGTIME; if (!xfs_has_large_extent_counts(mp)) @@ -1790,10 +1788,6 @@ xrep_inode_flags( /* DAX only applies to files and dirs. */ if (!(S_ISREG(mode) || S_ISDIR(mode))) sc->ip->i_diflags2 &= ~XFS_DIFLAG2_DAX; - - /* No reflink files on the realtime device. */ - if (sc->ip->i_diflags & XFS_DIFLAG_REALTIME) - sc->ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK; } /* diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 0789c18aaa1871..4caf29cc59b9ef 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -541,10 +541,6 @@ xfs_ioctl_setattr_xflags( if (mp->m_sb.sb_rblocks == 0 || mp->m_sb.sb_rextsize == 0 || xfs_extlen_to_rtxmod(mp, ip->i_extsize)) return -EINVAL; - - /* Clear reflink if we are actually able to set the rt flag. */ - if (xfs_is_reflink_inode(ip)) - ip->i_diflags2 &= ~XFS_DIFLAG2_REFLINK; } /* diflags2 only valid for v3 inodes. */ From patchwork Thu Dec 19 19:38:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915622 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7E8FA1A2389 for ; Thu, 19 Dec 2024 19:38:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637126; cv=none; b=OOmzJdCD79i9fi6zdYY0gdAfBzSuvVdaJcT+5eNZqGMSvaUV+U8fRGkDl/7MDpcNEVNTVoWB5TEj7UbIcWkRKtlBDuxwslVVUQQVpnePlUN5sQ11jCw790nCREDShVgJbT536yISo/qkcJJNBzYh52Ii89SxSnB6q6pm7oGggWw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637126; c=relaxed/simple; bh=xe/1Wajs9Areptv5v7cPr2zm2W3m2yAjUqpYbLoI+9k=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=WoLxpp4mPAUohMZByfJOUbKcNQ/CnILSCoet+MRoP2uhLTb0qp11bPp/ickJ5EZT4V3XW0KYluw47ZCm4ObwB1cFz9I0hss2aKfADHm4WAZ1Ba50EJYOo7VTuBeylewsyzNvqldH8ZEZSIDEhL6geT0JmPiYm4Vz1wE6FtbUfbA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=VPisUuQz; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="VPisUuQz" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 54D43C4CECE; Thu, 19 Dec 2024 19:38:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637126; bh=xe/1Wajs9Areptv5v7cPr2zm2W3m2yAjUqpYbLoI+9k=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=VPisUuQzb9/FiHOfUn1pJoSV5U5dD3/5NxjqWDRASbhPM8SOwbRA370mcAWu1BOvm sbyFeEgBPb2JQD3YjjbkuZAYNWEst6IdKMynDdAzo5bTMiiGK+isKcjWwJfpRsNwAu xh3rJBFFtqA7GHGrJUDT4ZE8IcchPFROwcnIhBYPzlw/oJgGqGM5tMe2KF8TVzPENB iolBAqiG07lDPsH+1K6Fysk5fk7styErXPNxG9DHVql6/12w2WqMkXCGys9EPXLKqq npDzOgGxI3KbAepeWhInS1XSmX8iHRKf1qRUZKwJTIwJvlUHVfEzFFM32xvaHtcyVT 7kGr8MBDjjh5Q== Date: Thu, 19 Dec 2024 11:38:45 -0800 Subject: [PATCH 22/43] xfs: recover CoW leftovers in the realtime volume From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581355.1572761.6134851760147887895.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Scan the realtime refcount tree at mount time to get rid of leftover CoW staging extents. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_refcount.c | 47 +++++++++++++++++++++++++++++------------- fs/xfs/libxfs/xfs_refcount.h | 3 +-- fs/xfs/xfs_reflink.c | 15 +++++++++++-- 3 files changed, 46 insertions(+), 19 deletions(-) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index 11bff098db2dbb..cebe83f7842a1c 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -2054,12 +2054,13 @@ xfs_refcount_recover_extent( /* Find and remove leftover CoW reservations. */ int xfs_refcount_recover_cow_leftovers( - struct xfs_mount *mp, - struct xfs_perag *pag) + struct xfs_group *xg) { + struct xfs_mount *mp = xg->xg_mount; + bool isrt = xg->xg_type == XG_TYPE_RTG; struct xfs_trans *tp; struct xfs_btree_cur *cur; - struct xfs_buf *agbp; + struct xfs_buf *agbp = NULL; struct xfs_refcount_recovery *rr, *n; struct list_head debris; union xfs_btree_irec low = { @@ -2072,10 +2073,19 @@ xfs_refcount_recover_cow_leftovers( xfs_fsblock_t fsb; int error; - /* reflink filesystems mustn't have AGs larger than 2^31-1 blocks */ + /* reflink filesystems must not have groups larger than 2^31-1 blocks */ + BUILD_BUG_ON(XFS_MAX_RGBLOCKS >= XFS_REFC_COWFLAG); BUILD_BUG_ON(XFS_MAX_CRC_AG_BLOCKS >= XFS_REFC_COWFLAG); - if (mp->m_sb.sb_agblocks > XFS_MAX_CRC_AG_BLOCKS) - return -EOPNOTSUPP; + + if (isrt) { + if (!xfs_has_rtgroups(mp)) + return 0; + if (xfs_group_max_blocks(xg) >= XFS_MAX_RGBLOCKS) + return -EOPNOTSUPP; + } else { + if (xfs_group_max_blocks(xg) > XFS_MAX_CRC_AG_BLOCKS) + return -EOPNOTSUPP; + } INIT_LIST_HEAD(&debris); @@ -2093,16 +2103,24 @@ xfs_refcount_recover_cow_leftovers( if (error) return error; - error = xfs_alloc_read_agf(pag, tp, 0, &agbp); - if (error) - goto out_trans; - cur = xfs_refcountbt_init_cursor(mp, tp, agbp, pag); + if (isrt) { + xfs_rtgroup_lock(to_rtg(xg), XFS_RTGLOCK_REFCOUNT); + cur = xfs_rtrefcountbt_init_cursor(tp, to_rtg(xg)); + } else { + error = xfs_alloc_read_agf(to_perag(xg), tp, 0, &agbp); + if (error) + goto out_trans; + cur = xfs_refcountbt_init_cursor(mp, tp, agbp, to_perag(xg)); + } /* Find all the leftover CoW staging extents. */ error = xfs_btree_query_range(cur, &low, &high, xfs_refcount_recover_extent, &debris); xfs_btree_del_cursor(cur, error); - xfs_trans_brelse(tp, agbp); + if (agbp) + xfs_trans_brelse(tp, agbp); + else + xfs_rtgroup_unlock(to_rtg(xg), XFS_RTGLOCK_REFCOUNT); xfs_trans_cancel(tp); if (error) goto out_free; @@ -2115,14 +2133,15 @@ xfs_refcount_recover_cow_leftovers( goto out_free; /* Free the orphan record */ - fsb = xfs_agbno_to_fsb(pag, rr->rr_rrec.rc_startblock); - xfs_refcount_free_cow_extent(tp, false, fsb, + fsb = xfs_gbno_to_fsb(xg, rr->rr_rrec.rc_startblock); + xfs_refcount_free_cow_extent(tp, isrt, fsb, rr->rr_rrec.rc_blockcount); /* Free the block. */ error = xfs_free_extent_later(tp, fsb, rr->rr_rrec.rc_blockcount, NULL, - XFS_AG_RESV_NONE, 0); + XFS_AG_RESV_NONE, + isrt ? XFS_FREE_EXTENT_REALTIME : 0); if (error) goto out_trans; diff --git a/fs/xfs/libxfs/xfs_refcount.h b/fs/xfs/libxfs/xfs_refcount.h index be11df25abcc5b..f2e299a716a4ee 100644 --- a/fs/xfs/libxfs/xfs_refcount.h +++ b/fs/xfs/libxfs/xfs_refcount.h @@ -94,8 +94,7 @@ void xfs_refcount_alloc_cow_extent(struct xfs_trans *tp, bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len); void xfs_refcount_free_cow_extent(struct xfs_trans *tp, bool isrt, xfs_fsblock_t fsb, xfs_extlen_t len); -extern int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp, - struct xfs_perag *pag); +int xfs_refcount_recover_cow_leftovers(struct xfs_group *xg); /* * While we're adjusting the refcounts records of an extent, we have diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 78b47b2ac12453..d9b33e22c17669 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -983,20 +983,29 @@ xfs_reflink_recover_cow( struct xfs_mount *mp) { struct xfs_perag *pag = NULL; + struct xfs_rtgroup *rtg = NULL; int error = 0; if (!xfs_has_reflink(mp)) return 0; while ((pag = xfs_perag_next(mp, pag))) { - error = xfs_refcount_recover_cow_leftovers(mp, pag); + error = xfs_refcount_recover_cow_leftovers(pag_group(pag)); if (error) { xfs_perag_rele(pag); - break; + return error; } } - return error; + while ((rtg = xfs_rtgroup_next(mp, rtg))) { + error = xfs_refcount_recover_cow_leftovers(rtg_group(rtg)); + if (error) { + xfs_rtgroup_rele(rtg); + return error; + } + } + + return 0; } /* From patchwork Thu Dec 19 19:39:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915623 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C5921990BA for ; Thu, 19 Dec 2024 19:39:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637142; cv=none; b=uw1M+TxBIJZbwRJXH+eB1b+0kyawFlpEkmakOURLZ1n8kAQ9bTWXbEtK41IAnVZYRedN8WvQkQaBYIsyJSVVIb09vOngioD9O7WV0G6zTfbvsc0SHn1KxDUbfzpm/iGqbxhz+G3+GJTJmZNfkdxcUoqHflQ3fxL2NHFWDvilY74= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637142; c=relaxed/simple; bh=VRLOkXc1cCynwrHe5ZW4IxKjxfg05aSERfyWn2Adics=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=r6eDhdfeyB8iZXqy3uu8kPfQQLOMIVCMNrE5s+ljqPxJMKcn/S4EPOBL3yiUcnd5EqUzPCjIMQdTXlCE17fL8ZvN4ua1WablBpN7yJkkss9e6TxZcnsbfJQlhH51gqP0/iSSWDZYJxpbRwFIJxd8VfqPrj31tpUtb1oCrb0Y1sg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WWzL6skB; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WWzL6skB" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E6FE0C4CECE; Thu, 19 Dec 2024 19:39:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637142; bh=VRLOkXc1cCynwrHe5ZW4IxKjxfg05aSERfyWn2Adics=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=WWzL6skBeQIYrsfcph8N0JRr4dNNEHimYYZ7qhEDiQiitYEp5KfrHoPONFdlIVZqH rFBFs9g5J5s77OQ/1Uj8SvKeCaC6RAG65j+Q9T3GU+Gera+BnTNEn6K7h6hbERMORg KasD4YEeJ34u2TPWhFed+2Ej2K1PlEYU0MXwhnTCPxVKo/VYjG9oEGM4o/n5shv39I NeMuMlUqDyRm6NKKlfZwmNbgP4PakD7uCqWe4Gzu/UC6aZ6QhWLUOOPCvNN1rg1VQe triod1kpY9CAoxWCo7mAJqTFPlXLBCOGI8G1g7Xo3ByvFGUS+9zWXP50Eb+XQHldni WdF8lGtdiTOeQ== Date: Thu, 19 Dec 2024 11:39:01 -0800 Subject: [PATCH 23/43] xfs: fix xfs_get_extsz_hint behavior with realtime alwayscow files From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581372.1572761.7820351190559401263.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Currently, we (ab)use xfs_get_extsz_hint so that it always returns a nonzero value for realtime files. This apparently was done to disable delayed allocation for realtime files. However, once we enable realtime reflink, we can also turn on the alwayscow flag to force CoW writes to realtime files. In this case, the logic will incorrectly send the write through the delalloc write path. Fix this by adjusting the logic slightly. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_bmap.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index d63713630236e7..ae3d33d6076102 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -6500,9 +6500,8 @@ xfs_get_extsz_hint( * No point in aligning allocations if we need to COW to actually * write to them. */ - if (xfs_is_always_cow_inode(ip)) - return 0; - if ((ip->i_diflags & XFS_DIFLAG_EXTSIZE) && ip->i_extsize) + if (!xfs_is_always_cow_inode(ip) && + (ip->i_diflags & XFS_DIFLAG_EXTSIZE) && ip->i_extsize) return ip->i_extsize; if (XFS_IS_REALTIME_INODE(ip) && ip->i_mount->m_sb.sb_rextsize > 1) From patchwork Thu Dec 19 19:39:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915624 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 523891A42C4 for ; Thu, 19 Dec 2024 19:39:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637158; cv=none; b=GAfkN1OSxFDidYSrgz6mwFEvp4MAb8dbB0evCtcqCQ7AiBdJpZMTy443WW7BtuqO1TOooaPeR0jSJ4oAkfRbOGD/UbhD7ifK2YW9ZHo18uOj+jtFZpp5NUGecGEq+7yQgiM82VqpXt4SQnEuFcuqtZD2ssiJbsn6V8PMSeTw+8M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637158; c=relaxed/simple; bh=vJW0p/dPYF4Jy9k++3p70S6MVJeNt5JbsqDsVJRkWtg=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=pdrqcFTrIEX/VetDlE4d8sublh5bRCwxvqRCTw7taejW9oL0R67mbGGXm+MS6FuFUsDR9xFtZR/kJG+5UzQgETKERdp1y78/tBiJDl1z43qzgiCXZX6v51C3ai3N/QoD1TQsk+ebS5VSd2gyBqFlqr0QJD1delUI5eFLn75bzRo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=XEJf586t; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="XEJf586t" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C5B19C4CECE; Thu, 19 Dec 2024 19:39:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637157; bh=vJW0p/dPYF4Jy9k++3p70S6MVJeNt5JbsqDsVJRkWtg=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=XEJf586twpDr923dQBIeiU1VffC566kuxLB9mkIyWxLadvFWkfgkGTJRQBRlqjtmx ssKxI5CU2M2LiRw3iHFZ7jS/USAoPOQ+TL6WNFD7ZAOScLwvIYW51kdabYOhHwZs/G I8GGj25VXSu2eRSXuhYsGwKA0/mFtXwS6BFIIMecHoDLnjgxr1MpMbG8i0QCKIIaXB ZGik9nF9Cm0xF+w2NeRPDUjbflCkxHxKaiyTCePJLWicsXmpp5Hcq0qNiR81C8amUK 8ZxOUTMVAPI23kH91ys1Bpvb5Tb70qcm+crExgY1iOazqLVVSCp388hPHGGbC9d0zO 9xKzIISvPDWgw== Date: Thu, 19 Dec 2024 11:39:17 -0800 Subject: [PATCH 24/43] xfs: apply rt extent alignment constraints to CoW extsize hint From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581389.1572761.18087012287475661586.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong The copy-on-write extent size hint is subject to the same alignment constraints as the regular extent size hint. Since we're in the process of adding reflink (and therefore CoW) to the realtime device, we must apply the same scattered rextsize alignment validation strategies to both hints to deal with the possibility of rextsize changing. Therefore, fix the inode validator to perform rextsize alignment checks on regular realtime files, and to remove misaligned directory hints. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_inode_buf.c | 25 ++++++++++++++++++++----- fs/xfs/xfs_inode_item.c | 14 ++++++++++++++ fs/xfs/xfs_ioctl.c | 17 +++++++++++++++-- 3 files changed, 49 insertions(+), 7 deletions(-) diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 4273d096fb0a9c..f24fa628fecf1e 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -910,11 +910,29 @@ xfs_inode_validate_cowextsize( bool rt_flag; bool hint_flag; uint32_t cowextsize_bytes; + uint32_t blocksize_bytes; rt_flag = (flags & XFS_DIFLAG_REALTIME); hint_flag = (flags2 & XFS_DIFLAG2_COWEXTSIZE); cowextsize_bytes = XFS_FSB_TO_B(mp, cowextsize); + /* + * Similar to extent size hints, a directory can be configured to + * propagate realtime status and a CoW extent size hint to newly + * created files even if there is no realtime device, and the hints on + * disk can become misaligned if the sysadmin changes the rt extent + * size while adding the realtime device. + * + * Therefore, we can only enforce the rextsize alignment check against + * regular realtime files, and rely on callers to decide when alignment + * checks are appropriate, and fix things up as needed. + */ + + if (rt_flag) + blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize); + else + blocksize_bytes = mp->m_sb.sb_blocksize; + if (hint_flag && !xfs_has_reflink(mp)) return __this_address; @@ -928,16 +946,13 @@ xfs_inode_validate_cowextsize( if (mode && !hint_flag && cowextsize != 0) return __this_address; - if (hint_flag && rt_flag) - return __this_address; - - if (cowextsize_bytes % mp->m_sb.sb_blocksize) + if (cowextsize_bytes % blocksize_bytes) return __this_address; if (cowextsize > XFS_MAX_BMBT_EXTLEN) return __this_address; - if (cowextsize > mp->m_sb.sb_agblocks / 2) + if (!rt_flag && cowextsize > mp->m_sb.sb_agblocks / 2) return __this_address; return NULL; diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index a174f64b8bb250..70283c6419fd30 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -157,6 +157,20 @@ xfs_inode_item_precommit( if (flags & XFS_ILOG_IVERSION) flags = ((flags & ~XFS_ILOG_IVERSION) | XFS_ILOG_CORE); + /* + * Inode verifiers do not check that the CoW extent size hint is an + * integer multiple of the rt extent size on a directory with both + * rtinherit and cowextsize flags set. If we're logging a directory + * that is misconfigured in this way, clear the hint. + */ + if ((ip->i_diflags & XFS_DIFLAG_RTINHERIT) && + (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) && + xfs_extlen_to_rtxmod(ip->i_mount, ip->i_cowextsize) > 0) { + ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE; + ip->i_cowextsize = 0; + flags |= XFS_ILOG_CORE; + } + if (!iip->ili_item.li_buf) { struct xfs_buf *bp; int error; diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 4caf29cc59b9ef..726282e74d546d 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -469,8 +469,21 @@ xfs_fill_fsxattr( } } - if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) - fa->fsx_cowextsize = XFS_FSB_TO_B(mp, ip->i_cowextsize); + if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) { + /* + * Don't let a misaligned CoW extent size hint on a directory + * escape to userspace if it won't pass the setattr checks + * later. + */ + if ((ip->i_diflags & XFS_DIFLAG_RTINHERIT) && + ip->i_cowextsize % mp->m_sb.sb_rextsize > 0) { + fa->fsx_xflags &= ~FS_XFLAG_COWEXTSIZE; + fa->fsx_cowextsize = 0; + } else { + fa->fsx_cowextsize = XFS_FSB_TO_B(mp, ip->i_cowextsize); + } + } + fa->fsx_projid = ip->i_projid; if (ifp && !xfs_need_iread_extents(ifp)) fa->fsx_nextents = xfs_iext_count(ifp); From patchwork Thu Dec 19 19:39:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915625 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8DB0D1A4F09 for ; Thu, 19 Dec 2024 19:39:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637173; cv=none; b=r4LtkRWwHsvOuC+Wt/Qqb4HsJPCLX2nY7Hsqk18isCC4KvJA7hfkNUUk0pdJAflra/LqG+MqB1u+1eCzeu1A9c28rPTNzsojFnG0hTE2eNLd2VF+Fmg1/0dc460h8xIwi04vojqqrO17nyBFPcQG0ss0JkubsB6LuHP4uxJefJE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637173; c=relaxed/simple; bh=XfLvvQd0zYfgque//5sEmPpK2/NcGcBrEB9ZcfHi8eo=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=YOv7b12dIRKcPs3F/+RiVlVTiMfX9SvUp0N9o9mfB29iZHMl0ChvJ1vmXrkWAwaiON+yZ64YI0P20IhZk/5va8o92WzVSN3AUegjgANNKUeD6ZG7Byyx1qahZsxjyRw4/HfBcYzhAPpv7AWw5DXdfguZVP2tqxDNcPy8cBqD4cg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=OOCoA6Do; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OOCoA6Do" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 63EB7C4CECE; Thu, 19 Dec 2024 19:39:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637173; bh=XfLvvQd0zYfgque//5sEmPpK2/NcGcBrEB9ZcfHi8eo=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=OOCoA6Do16LC6tX/T+ipfXHaY19RRVdSW8gXAtiFtjnT91BLTDssxWdh1QP3By6Wv 8OXdxhaeeQ5rR9qL3IJ7kjuvRaCe39oaeSo3cHCmd9DGvoKM8/1QpSuUWVnLqCELzH IeR7L4TvsKOrF8wD18BpP+iWNkkd28JygVdglyFgu/dCllWpKLIGUFdPb3DUIA4eCJ WCfCcnL6bNB/CSP8n5T6E76rlOAaw9lEQCJTAEeg3RadZm90Zo+2mhXEe9sCYNbup7 f4fs5+EnDHoGN/zdc0yiGm2LLiynKScfactPTzn70D0HMI8UUvAaXokSmX3gCQk6Nz rOyJenXgJEjUw== Date: Thu, 19 Dec 2024 11:39:32 -0800 Subject: [PATCH 25/43] xfs: enable extent size hints for CoW operations From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581406.1572761.12345015352845968214.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Wire up the copy-on-write extent size hint for realtime files, and connect it to the rt allocator so that we avoid fragmentation on rt filesystems. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_bmap.c | 8 +++++++- fs/xfs/xfs_rtalloc.c | 5 ++++- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index ae3d33d6076102..40ad22fb808b95 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -6524,7 +6524,13 @@ xfs_get_cowextsz_hint( a = 0; if (ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) a = ip->i_cowextsize; - b = xfs_get_extsz_hint(ip); + if (XFS_IS_REALTIME_INODE(ip)) { + b = 0; + if (ip->i_diflags & XFS_DIFLAG_EXTSIZE) + b = ip->i_extsize; + } else { + b = xfs_get_extsz_hint(ip); + } a = max(a, b); if (a == 0) diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 294aa0739be311..f5a3d5f8c948d8 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -2021,7 +2021,10 @@ xfs_rtallocate_align( if (*noalign) { align = mp->m_sb.sb_rextsize; } else { - align = xfs_get_extsz_hint(ap->ip); + if (ap->flags & XFS_BMAPI_COWFORK) + align = xfs_get_cowextsz_hint(ap->ip); + else + align = xfs_get_extsz_hint(ap->ip); if (!align) align = 1; if (align == mp->m_sb.sb_rextsize) From patchwork Thu Dec 19 19:39:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915626 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32A491A9B49 for ; Thu, 19 Dec 2024 19:39:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637189; cv=none; b=uE8gjJUeMqlSMiGm4qtC9WFajutdIMDqQjdidTElzDalCFDGkC6XHuW88Jli507cCiHVxM2peIX/wwo9liumPfcC9HdiyX0Cdl4qm+T1o7U00GMIpIN5hsxOSQVicjFX4CLu30TBXurgf3kZk1syS/KFVg2eVe/q89vDqtil42g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637189; c=relaxed/simple; bh=l1We+YfAluaG5JJJbMdvAWFqQwoR7txgSEEFh3clbgI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=DBLFInOAc0II50dmJsBbiVtsMpOnpOjE84S1HN6jY+j5c2/V+dU3vaTQ74UmWaT+LDmR6H5HwdOq9VOUD9x1sGvAZjTY7A5XKK+bppoMocQoRxFLzwJzzpWbo8a75gWaWCAIYSxbQ5f6X2HuEIpsgqELntRq/UbEUOWilgW/Lc4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Xx+EPkq6; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Xx+EPkq6" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 103DAC4CECE; Thu, 19 Dec 2024 19:39:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637189; bh=l1We+YfAluaG5JJJbMdvAWFqQwoR7txgSEEFh3clbgI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=Xx+EPkq666p3zaETq9SHKqV9XkFceAmzOuCXXU4K5aDe0W7MGgqajVCKIpqaSAeu6 83Cih3/8gUVEqrhibET/8WksXElgrxm506NxqkHSB9VuEDrhvJJlssCqlZBpcCicpw zY11lgLgbPFr6jrJ+LpOmoHbA13JVXM3BJCWk8gWWVkmQcCYdP3wViLjYTZjpvm3Ic wMfjEwKjdL2lBFQ3Iz9eueiFjcnooMzF4NWBPW18P20zbUStv/IUHllWxABvhLPYLm PwKaPln6/fSXkHL3oP9kezGDLfA9ZONyq1hSZqfGkHpsN+2c0eIU6mWyC8WKmpRrJo fD6wUPBFZocUg== Date: Thu, 19 Dec 2024 11:39:48 -0800 Subject: [PATCH 26/43] xfs: check that the rtrefcount maxlevels doesn't increase when growing fs From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581423.1572761.8866411896316189478.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong The size of filesystem transaction reservations depends on the maximum height (maxlevels) of the realtime btrees. Since we don't want a grow operation to increase the reservation size enough that we'll fail the minimum log size checks on the next mount, constrain growfs operations if they would cause an increase in the rt refcount btree maxlevels. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_fsops.c | 2 ++ fs/xfs/xfs_rtalloc.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c index 9df5a09c0acd3b..455298503d0102 100644 --- a/fs/xfs/xfs_fsops.c +++ b/fs/xfs/xfs_fsops.c @@ -23,6 +23,7 @@ #include "xfs_trace.h" #include "xfs_rtalloc.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" /* * Write new AG headers to disk. Non-transactional, but need to be @@ -231,6 +232,7 @@ xfs_growfs_data_private( /* Compute new maxlevels for rt btrees. */ xfs_rtrmapbt_compute_maxlevels(mp); + xfs_rtrefcountbt_compute_maxlevels(mp); } return error; diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index f5a3d5f8c948d8..a5de5405800a22 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -995,6 +995,7 @@ xfs_growfs_rt_bmblock( */ mp->m_features |= XFS_FEAT_REALTIME; xfs_rtrmapbt_compute_maxlevels(mp); + xfs_rtrefcountbt_compute_maxlevels(mp); kfree(nmp); return 0; @@ -1178,6 +1179,7 @@ xfs_growfs_check_rtgeom( nmp->m_sb.sb_dblocks = dblocks; xfs_rtrmapbt_compute_maxlevels(nmp); + xfs_rtrefcountbt_compute_maxlevels(nmp); xfs_trans_resv_calc(nmp, M_RES(nmp)); /* From patchwork Thu Dec 19 19:40:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915627 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D7CAE1A42C4 for ; Thu, 19 Dec 2024 19:40:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637204; cv=none; b=smL4+AKryOfZQriqUHhGcLIpMZhcLiys9qim9jvUmx1oZwhrclqLIpp/33v25ZfcFxRfo8+bqw+8MGMyjxByE17kCS6SQkMFJZSDwTRLicgxxKwFS2sY2S9uOqm55AssBcgDKWQx6sAS5WjcFlyTVzq6j+xihLdmuAPm9KaEm/w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637204; c=relaxed/simple; bh=R4TKCQ/WbFiYOPkFciYR/JcNGU5i8CMzJ6tVWSqf5ZI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Xzd6AE4TYGAbTQbfjJLVi35kKD3Qy90wS9DPf4Y5qB/HsFZfiw6RwL9wo+LjIVYjBquX/b88BLFxbX/KJ/EwLH9i1cwZ/VZn2krAX5mRv4n0gzFmS/Mb7+4zyynP3TInWSckxHzs5tr9UgfmAdqKaXpjAAsnYQTasRaNdZx+Xb4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QmiW3OpQ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QmiW3OpQ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A9865C4CECE; Thu, 19 Dec 2024 19:40:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637204; bh=R4TKCQ/WbFiYOPkFciYR/JcNGU5i8CMzJ6tVWSqf5ZI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=QmiW3OpQ4Qjh0HOD90lVVFYPbB0XGi3yyieJjqTfETNcJ7CQ6g6CPHO69VG+5jhIZ K5cXJyp4qE1wIrvBaQ9sXX98JebPx39+0MzQSk0Ni0xy+6am4i2nxJm741qaijNKaw Xqpd2Yd/xbLvlQ/y9LtgcM3jNm8TQKuoCfLRbjAJ33YzxJHlzcspoz0sxL1aa5NS2L i0pA4CUvwEsFsZGpVoFpbdwRtVfHVDuMFO2W+0HJroiG4VIiud2iy9rKGJ4ioSh4gf X6JUvtA3P0Mc1SloHpSKUGKGA0gM1mH2U9xZbXvZJQbrCA0Fo1XSdhouXYw0l7qDPQ +gQz/b2tnAJ4A== Date: Thu, 19 Dec 2024 11:40:04 -0800 Subject: [PATCH 27/43] xfs: report realtime refcount btree corruption errors to the health system From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581440.1572761.11893001846505305956.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Whenever we encounter corrupt realtime refcount btree blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_fs.h | 1 + fs/xfs/libxfs/xfs_health.h | 4 +++- fs/xfs/libxfs/xfs_rtgroup.c | 1 + fs/xfs/libxfs/xfs_rtrefcount_btree.c | 10 ++++++++-- fs/xfs/xfs_health.c | 1 + 5 files changed, 14 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h index d42d3a5617e314..ea9e58a89d92d3 100644 --- a/fs/xfs/libxfs/xfs_fs.h +++ b/fs/xfs/libxfs/xfs_fs.h @@ -996,6 +996,7 @@ struct xfs_rtgroup_geometry { #define XFS_RTGROUP_GEOM_SICK_BITMAP (1U << 1) /* rtbitmap */ #define XFS_RTGROUP_GEOM_SICK_SUMMARY (1U << 2) /* rtsummary */ #define XFS_RTGROUP_GEOM_SICK_RMAPBT (1U << 3) /* reverse mappings */ +#define XFS_RTGROUP_GEOM_SICK_REFCNTBT (1U << 4) /* reference counts */ /* * ioctl commands that are used by Linux filesystems diff --git a/fs/xfs/libxfs/xfs_health.h b/fs/xfs/libxfs/xfs_health.h index 5c8a0aff6ba6e9..b31000f7190ce5 100644 --- a/fs/xfs/libxfs/xfs_health.h +++ b/fs/xfs/libxfs/xfs_health.h @@ -71,6 +71,7 @@ struct xfs_rtgroup; #define XFS_SICK_RG_BITMAP (1 << 1) /* rt group bitmap */ #define XFS_SICK_RG_SUMMARY (1 << 2) /* rt groups summary */ #define XFS_SICK_RG_RMAPBT (1 << 3) /* reverse mappings */ +#define XFS_SICK_RG_REFCNTBT (1 << 4) /* reference counts */ /* Observable health issues for AG metadata. */ #define XFS_SICK_AG_SB (1 << 0) /* superblock */ @@ -117,7 +118,8 @@ struct xfs_rtgroup; #define XFS_SICK_RG_PRIMARY (XFS_SICK_RG_SUPER | \ XFS_SICK_RG_BITMAP | \ XFS_SICK_RG_SUMMARY | \ - XFS_SICK_RG_RMAPBT) + XFS_SICK_RG_RMAPBT | \ + XFS_SICK_RG_REFCNTBT) #define XFS_SICK_AG_PRIMARY (XFS_SICK_AG_SB | \ XFS_SICK_AG_AGF | \ diff --git a/fs/xfs/libxfs/xfs_rtgroup.c b/fs/xfs/libxfs/xfs_rtgroup.c index eab655a4a9ef5c..a6468e591232c3 100644 --- a/fs/xfs/libxfs/xfs_rtgroup.c +++ b/fs/xfs/libxfs/xfs_rtgroup.c @@ -380,6 +380,7 @@ static const struct xfs_rtginode_ops xfs_rtginode_ops[XFS_RTGI_MAX] = { [XFS_RTGI_REFCOUNT] = { .name = "refcount", .metafile_type = XFS_METAFILE_RTREFCOUNT, + .sick = XFS_SICK_RG_REFCNTBT, .fmt_mask = 1U << XFS_DINODE_FMT_META_BTREE, /* same comment about growfs and rmap inodes applies here */ .enabled = xfs_has_reflink, diff --git a/fs/xfs/libxfs/xfs_rtrefcount_btree.c b/fs/xfs/libxfs/xfs_rtrefcount_btree.c index 151fb1ef7db126..3db5e7a4a94567 100644 --- a/fs/xfs/libxfs/xfs_rtrefcount_btree.c +++ b/fs/xfs/libxfs/xfs_rtrefcount_btree.c @@ -27,6 +27,7 @@ #include "xfs_rtgroup.h" #include "xfs_rtbitmap.h" #include "xfs_metafile.h" +#include "xfs_health.h" static struct kmem_cache *xfs_rtrefcountbt_cur_cache; @@ -374,6 +375,7 @@ const struct xfs_btree_ops xfs_rtrefcountbt_ops = { .lru_refs = XFS_REFC_BTREE_REF, .statoff = XFS_STATS_CALC_INDEX(xs_rtrefcbt_2), + .sick_mask = XFS_SICK_RG_REFCNTBT, .dup_cursor = xfs_rtrefcountbt_dup_cursor, .alloc_block = xfs_btree_alloc_metafile_block, @@ -640,16 +642,20 @@ xfs_iformat_rtrefcount( * volume to the filesystem, so we cannot use the rtrefcount predicate * here. */ - if (!xfs_has_reflink(ip->i_mount)) + if (!xfs_has_reflink(ip->i_mount)) { + xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE); return -EFSCORRUPTED; + } dsize = XFS_DFORK_SIZE(dip, mp, XFS_DATA_FORK); numrecs = be16_to_cpu(dfp->bb_numrecs); level = be16_to_cpu(dfp->bb_level); if (level > mp->m_rtrefc_maxlevels || - xfs_rtrefcount_droot_space_calc(level, numrecs) > dsize) + xfs_rtrefcount_droot_space_calc(level, numrecs) > dsize) { + xfs_inode_mark_sick(ip, XFS_SICK_INO_CORE); return -EFSCORRUPTED; + } broot = xfs_broot_alloc(xfs_ifork_ptr(ip, XFS_DATA_FORK), xfs_rtrefcount_broot_space_calc(mp, level, numrecs)); diff --git a/fs/xfs/xfs_health.c b/fs/xfs/xfs_health.c index d438c3c001c829..7c541fb373d5b2 100644 --- a/fs/xfs/xfs_health.c +++ b/fs/xfs/xfs_health.c @@ -448,6 +448,7 @@ static const struct ioctl_sick_map rtgroup_map[] = { { XFS_SICK_RG_BITMAP, XFS_RTGROUP_GEOM_SICK_BITMAP }, { XFS_SICK_RG_SUMMARY, XFS_RTGROUP_GEOM_SICK_SUMMARY }, { XFS_SICK_RG_RMAPBT, XFS_RTGROUP_GEOM_SICK_RMAPBT }, + { XFS_SICK_RG_REFCNTBT, XFS_RTGROUP_GEOM_SICK_REFCNTBT }, }; /* Fill out rtgroup geometry health info. */ From patchwork Thu Dec 19 19:40:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915628 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C0561A42C4 for ; Thu, 19 Dec 2024 19:40:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637220; cv=none; b=Z5EmfsXmd3hEneoRyh1VHWpTNcjvA3O0ivViBrInRPyKgrIRs349jH0C8yIkFRjgxm965/04bw5GZYJDd5Yh95BI6mnTsEHsSSS1GKYv5wLNg5BxJELqtUXCLB36mDBnHnQNdcHgqQkK+OEukn1Mcfod4M9h/lXxkWrLVctvehs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637220; c=relaxed/simple; bh=ie/7feb42D88hd5iImzxYQlM8AOc6SxRVledgGBRlHE=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=lKnAMO21CGFpCqIYv9rbdboKFjLcsdbjsAi1kfT1buSMN08QWHtoqg6uBiKlGtZqDSvXw860l+j7BbRVFaYyMaLReHD+IpH+UFca0CS1wn3w2Nll/jFgrk/SKXvKQwHlN7/n2y2QsNaIYjrbMTqpgbJF9CFmZWKRBxd3dCq8UMw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ofCtWT04; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ofCtWT04" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 51221C4CECE; Thu, 19 Dec 2024 19:40:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637220; bh=ie/7feb42D88hd5iImzxYQlM8AOc6SxRVledgGBRlHE=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=ofCtWT04dzdcH8sdMAcFgx7sorP+fsJk496uV/M/oLo0RLRuLp2MzrcF4Ew8zRU1W lvjdMadws7TuOX8XezTJI0cBaHNoBQKEbIhiUgUEaBJJNqJ55OULmmGiDN9NEvIt9W veQ0DGhVBOO3wKCDMs05ULcTvywQ9+nk6EgEUycQup2tDS8j/tr9ndhlhDrlZt8jxt hFSddseh0VckCx4c7jAtkQ7UryvBerhtakRnvgkqeh+9PRY5xQBKJFztKFTi7I8U87 XBrbhyfPEa4qpmLEvjeYrENspYEdaYk4PqpBheXxN4PnQcoflH2blTtlDEC1GYD9hQ 6zwaJg6NRIbRg== Date: Thu, 19 Dec 2024 11:40:19 -0800 Subject: [PATCH 28/43] xfs: scrub the realtime refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581458.1572761.9841543400134933628.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add code to scrub realtime refcount btrees. Similar to the refcount btree checking code for the data device, we walk the rmap btree for each refcount record to confirm that the reference counts are correct. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_fs.h | 3 fs/xfs/scrub/common.c | 10 + fs/xfs/scrub/common.h | 5 fs/xfs/scrub/health.c | 1 fs/xfs/scrub/rtrefcount.c | 487 +++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/scrub.c | 7 + fs/xfs/scrub/scrub.h | 3 fs/xfs/scrub/stats.c | 1 fs/xfs/scrub/trace.h | 4 10 files changed, 519 insertions(+), 3 deletions(-) create mode 100644 fs/xfs/scrub/rtrefcount.c diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index b9356d01416e16..9dd9921e53567c 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -195,6 +195,7 @@ xfs-$(CONFIG_XFS_ONLINE_SCRUB_STATS) += scrub/stats.o xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \ rgsuper.o \ rtbitmap.o \ + rtrefcount.o \ rtrmap.o \ rtsummary.o \ ) diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h index ea9e58a89d92d3..a4bd6a39c6ba71 100644 --- a/fs/xfs/libxfs/xfs_fs.h +++ b/fs/xfs/libxfs/xfs_fs.h @@ -738,9 +738,10 @@ struct xfs_scrub_metadata { #define XFS_SCRUB_TYPE_METAPATH 29 /* metadata directory tree paths */ #define XFS_SCRUB_TYPE_RGSUPER 30 /* realtime superblock */ #define XFS_SCRUB_TYPE_RTRMAPBT 31 /* rtgroup reverse mapping btree */ +#define XFS_SCRUB_TYPE_RTREFCBT 32 /* realtime reference count btree */ /* Number of scrub subcommands. */ -#define XFS_SCRUB_TYPE_NR 32 +#define XFS_SCRUB_TYPE_NR 33 /* * This special type code only applies to the vectored scrub implementation. diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c index 06cb61e6349827..28ad341df8eede 100644 --- a/fs/xfs/scrub/common.c +++ b/fs/xfs/scrub/common.c @@ -37,6 +37,7 @@ #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" #include "xfs_bmap_util.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -797,6 +798,9 @@ xchk_rtgroup_lock( if (xfs_has_rtrmapbt(sc->mp) && (rtglock_flags & XFS_RTGLOCK_RMAP)) sr->rmap_cur = xfs_rtrmapbt_init_cursor(sc->tp, sr->rtg); + if (xfs_has_rtreflink(sc->mp) && (rtglock_flags & XFS_RTGLOCK_REFCOUNT)) + sr->refc_cur = xfs_rtrefcountbt_init_cursor(sc->tp, sr->rtg); + return 0; } @@ -811,7 +815,10 @@ xchk_rtgroup_btcur_free( { if (sr->rmap_cur) xfs_btree_del_cursor(sr->rmap_cur, XFS_BTREE_ERROR); + if (sr->refc_cur) + xfs_btree_del_cursor(sr->refc_cur, XFS_BTREE_ERROR); + sr->refc_cur = NULL; sr->rmap_cur = NULL; } @@ -1687,6 +1694,9 @@ xchk_meta_btree_count_blocks( case XFS_METAFILE_RTRMAP: cur = xfs_rtrmapbt_init_cursor(sc->tp, sc->sr.rtg); break; + case XFS_METAFILE_RTREFCOUNT: + cur = xfs_rtrefcountbt_init_cursor(sc->tp, sc->sr.rtg); + break; default: ASSERT(0); return -EFSCORRUPTED; diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h index 50ac6cca18fe45..bdcd40f0ec742c 100644 --- a/fs/xfs/scrub/common.h +++ b/fs/xfs/scrub/common.h @@ -82,11 +82,13 @@ int xchk_setup_rtbitmap(struct xfs_scrub *sc); int xchk_setup_rtsummary(struct xfs_scrub *sc); int xchk_setup_rgsuperblock(struct xfs_scrub *sc); int xchk_setup_rtrmapbt(struct xfs_scrub *sc); +int xchk_setup_rtrefcountbt(struct xfs_scrub *sc); #else # define xchk_setup_rtbitmap xchk_setup_nothing # define xchk_setup_rtsummary xchk_setup_nothing # define xchk_setup_rgsuperblock xchk_setup_nothing # define xchk_setup_rtrmapbt xchk_setup_nothing +# define xchk_setup_rtrefcountbt xchk_setup_nothing #endif #ifdef CONFIG_XFS_QUOTA int xchk_ino_dqattach(struct xfs_scrub *sc); @@ -129,7 +131,8 @@ xchk_ag_init_existing( /* All the locks we need to check an rtgroup. */ #define XCHK_RTGLOCK_ALL (XFS_RTGLOCK_BITMAP | \ - XFS_RTGLOCK_RMAP) + XFS_RTGLOCK_RMAP | \ + XFS_RTGLOCK_REFCOUNT) int xchk_rtgroup_init(struct xfs_scrub *sc, xfs_rgnumber_t rgno, struct xchk_rt *sr); diff --git a/fs/xfs/scrub/health.c b/fs/xfs/scrub/health.c index bcc4244e3b55db..3c0f25098b69f0 100644 --- a/fs/xfs/scrub/health.c +++ b/fs/xfs/scrub/health.c @@ -115,6 +115,7 @@ static const struct xchk_health_map type_to_health_flag[XFS_SCRUB_TYPE_NR] = { [XFS_SCRUB_TYPE_METAPATH] = { XHG_FS, XFS_SICK_FS_METAPATH }, [XFS_SCRUB_TYPE_RGSUPER] = { XHG_RTGROUP, XFS_SICK_RG_SUPER }, [XFS_SCRUB_TYPE_RTRMAPBT] = { XHG_RTGROUP, XFS_SICK_RG_RMAPBT }, + [XFS_SCRUB_TYPE_RTREFCBT] = { XHG_RTGROUP, XFS_SICK_RG_REFCNTBT }, }; /* Return the health status mask for this scrub type. */ diff --git a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c new file mode 100644 index 00000000000000..92ae2e15ae9f79 --- /dev/null +++ b/fs/xfs/scrub/rtrefcount.c @@ -0,0 +1,487 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_btree.h" +#include "xfs_rmap.h" +#include "xfs_refcount.h" +#include "xfs_inode.h" +#include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" +#include "xfs_metafile.h" +#include "xfs_rtrefcount_btree.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/btree.h" + +/* Set us up with the realtime refcount metadata locked. */ +int +xchk_setup_rtrefcountbt( + struct xfs_scrub *sc) +{ + int error; + + if (xchk_need_intent_drain(sc)) + xchk_fsgates_enable(sc, XCHK_FSGATES_DRAIN); + + error = xchk_rtgroup_init(sc, sc->sm->sm_agno, &sc->sr); + if (error) + return error; + + error = xchk_setup_rt(sc); + if (error) + return error; + + error = xchk_install_live_inode(sc, rtg_refcount(sc->sr.rtg)); + if (error) + return error; + + return xchk_rtgroup_lock(sc, &sc->sr, XCHK_RTGLOCK_ALL); +} + +/* Realtime Reference count btree scrubber. */ + +/* + * Confirming Reference Counts via Reverse Mappings + * + * We want to count the reverse mappings overlapping a refcount record + * (bno, len, refcount), allowing for the possibility that some of the + * overlap may come from smaller adjoining reverse mappings, while some + * comes from single extents which overlap the range entirely. The + * outer loop is as follows: + * + * 1. For all reverse mappings overlapping the refcount extent, + * a. If a given rmap completely overlaps, mark it as seen. + * b. Otherwise, record the fragment (in agbno order) for later + * processing. + * + * Once we've seen all the rmaps, we know that for all blocks in the + * refcount record we want to find $refcount owners and we've already + * visited $seen extents that overlap all the blocks. Therefore, we + * need to find ($refcount - $seen) owners for every block in the + * extent; call that quantity $target_nr. Proceed as follows: + * + * 2. Pull the first $target_nr fragments from the list; all of them + * should start at or before the start of the extent. + * Call this subset of fragments the working set. + * 3. Until there are no more unprocessed fragments, + * a. Find the shortest fragments in the set and remove them. + * b. Note the block number of the end of these fragments. + * c. Pull the same number of fragments from the list. All of these + * fragments should start at the block number recorded in the + * previous step. + * d. Put those fragments in the set. + * 4. Check that there are $target_nr fragments remaining in the list, + * and that they all end at or beyond the end of the refcount extent. + * + * If the refcount is correct, all the check conditions in the algorithm + * should always hold true. If not, the refcount is incorrect. + */ +struct xchk_rtrefcnt_frag { + struct list_head list; + struct xfs_rmap_irec rm; +}; + +struct xchk_rtrefcnt_check { + struct xfs_scrub *sc; + struct list_head fragments; + + /* refcount extent we're examining */ + xfs_rgblock_t bno; + xfs_extlen_t len; + xfs_nlink_t refcount; + + /* number of owners seen */ + xfs_nlink_t seen; +}; + +/* + * Decide if the given rmap is large enough that we can redeem it + * towards refcount verification now, or if it's a fragment, in + * which case we'll hang onto it in the hopes that we'll later + * discover that we've collected exactly the correct number of + * fragments as the rtrefcountbt says we should have. + */ +STATIC int +xchk_rtrefcountbt_rmap_check( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xchk_rtrefcnt_check *refchk = priv; + struct xchk_rtrefcnt_frag *frag; + xfs_rgblock_t rm_last; + xfs_rgblock_t rc_last; + int error = 0; + + if (xchk_should_terminate(refchk->sc, &error)) + return error; + + rm_last = rec->rm_startblock + rec->rm_blockcount - 1; + rc_last = refchk->bno + refchk->len - 1; + + /* Confirm that a single-owner refc extent is a CoW stage. */ + if (refchk->refcount == 1 && rec->rm_owner != XFS_RMAP_OWN_COW) { + xchk_btree_xref_set_corrupt(refchk->sc, cur, 0); + return 0; + } + + if (rec->rm_startblock <= refchk->bno && rm_last >= rc_last) { + /* + * The rmap overlaps the refcount record, so we can confirm + * one refcount owner seen. + */ + refchk->seen++; + } else { + /* + * This rmap covers only part of the refcount record, so + * save the fragment for later processing. If the rmapbt + * is healthy each rmap_irec we see will be in agbno order + * so we don't need insertion sort here. + */ + frag = kmalloc(sizeof(struct xchk_rtrefcnt_frag), + XCHK_GFP_FLAGS); + if (!frag) + return -ENOMEM; + memcpy(&frag->rm, rec, sizeof(frag->rm)); + list_add_tail(&frag->list, &refchk->fragments); + } + + return 0; +} + +/* + * Given a bunch of rmap fragments, iterate through them, keeping + * a running tally of the refcount. If this ever deviates from + * what we expect (which is the rtrefcountbt's refcount minus the + * number of extents that totally covered the rtrefcountbt extent), + * we have a rtrefcountbt error. + */ +STATIC void +xchk_rtrefcountbt_process_rmap_fragments( + struct xchk_rtrefcnt_check *refchk) +{ + struct list_head worklist; + struct xchk_rtrefcnt_frag *frag; + struct xchk_rtrefcnt_frag *n; + xfs_rgblock_t bno; + xfs_rgblock_t rbno; + xfs_rgblock_t next_rbno; + xfs_nlink_t nr; + xfs_nlink_t target_nr; + + target_nr = refchk->refcount - refchk->seen; + if (target_nr == 0) + return; + + /* + * There are (refchk->rc.rc_refcount - refchk->nr refcount) + * references we haven't found yet. Pull that many off the + * fragment list and figure out where the smallest rmap ends + * (and therefore the next rmap should start). All the rmaps + * we pull off should start at or before the beginning of the + * refcount record's range. + */ + INIT_LIST_HEAD(&worklist); + rbno = NULLRGBLOCK; + + /* Make sure the fragments actually /are/ in bno order. */ + bno = 0; + list_for_each_entry(frag, &refchk->fragments, list) { + if (frag->rm.rm_startblock < bno) + goto done; + bno = frag->rm.rm_startblock; + } + + /* + * Find all the rmaps that start at or before the refc extent, + * and put them on the worklist. + */ + nr = 0; + list_for_each_entry_safe(frag, n, &refchk->fragments, list) { + if (frag->rm.rm_startblock > refchk->bno || nr > target_nr) + break; + bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; + if (bno < rbno) + rbno = bno; + list_move_tail(&frag->list, &worklist); + nr++; + } + + /* + * We should have found exactly $target_nr rmap fragments starting + * at or before the refcount extent. + */ + if (nr != target_nr) + goto done; + + while (!list_empty(&refchk->fragments)) { + /* Discard any fragments ending at rbno from the worklist. */ + nr = 0; + next_rbno = NULLRGBLOCK; + list_for_each_entry_safe(frag, n, &worklist, list) { + bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; + if (bno != rbno) { + if (bno < next_rbno) + next_rbno = bno; + continue; + } + list_del(&frag->list); + kfree(frag); + nr++; + } + + /* Try to add nr rmaps starting at rbno to the worklist. */ + list_for_each_entry_safe(frag, n, &refchk->fragments, list) { + bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; + if (frag->rm.rm_startblock != rbno) + goto done; + list_move_tail(&frag->list, &worklist); + if (next_rbno > bno) + next_rbno = bno; + nr--; + if (nr == 0) + break; + } + + /* + * If we get here and nr > 0, this means that we added fewer + * items to the worklist than we discarded because the fragment + * list ran out of items. Therefore, we cannot maintain the + * required refcount. Something is wrong, so we're done. + */ + if (nr) + goto done; + + rbno = next_rbno; + } + + /* + * Make sure the last extent we processed ends at or beyond + * the end of the refcount extent. + */ + if (rbno < refchk->bno + refchk->len) + goto done; + + /* Actually record us having seen the remaining refcount. */ + refchk->seen = refchk->refcount; +done: + /* Delete fragments and work list. */ + list_for_each_entry_safe(frag, n, &worklist, list) { + list_del(&frag->list); + kfree(frag); + } + list_for_each_entry_safe(frag, n, &refchk->fragments, list) { + list_del(&frag->list); + kfree(frag); + } +} + +/* Use the rmap entries covering this extent to verify the refcount. */ +STATIC void +xchk_rtrefcountbt_xref_rmap( + struct xfs_scrub *sc, + const struct xfs_refcount_irec *irec) +{ + struct xchk_rtrefcnt_check refchk = { + .sc = sc, + .bno = irec->rc_startblock, + .len = irec->rc_blockcount, + .refcount = irec->rc_refcount, + .seen = 0, + }; + struct xfs_rmap_irec low; + struct xfs_rmap_irec high; + struct xchk_rtrefcnt_frag *frag; + struct xchk_rtrefcnt_frag *n; + int error; + + if (!sc->sr.rmap_cur || xchk_skip_xref(sc->sm)) + return; + + /* Cross-reference with the rmapbt to confirm the refcount. */ + memset(&low, 0, sizeof(low)); + low.rm_startblock = irec->rc_startblock; + memset(&high, 0xFF, sizeof(high)); + high.rm_startblock = irec->rc_startblock + irec->rc_blockcount - 1; + + INIT_LIST_HEAD(&refchk.fragments); + error = xfs_rmap_query_range(sc->sr.rmap_cur, &low, &high, + xchk_rtrefcountbt_rmap_check, &refchk); + if (!xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur)) + goto out_free; + + xchk_rtrefcountbt_process_rmap_fragments(&refchk); + if (irec->rc_refcount != refchk.seen) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); + +out_free: + list_for_each_entry_safe(frag, n, &refchk.fragments, list) { + list_del(&frag->list); + kfree(frag); + } +} + +/* Cross-reference with the other btrees. */ +STATIC void +xchk_rtrefcountbt_xref( + struct xfs_scrub *sc, + const struct xfs_refcount_irec *irec) +{ + if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + + xchk_xref_is_used_rt_space(sc, + xfs_rgbno_to_rtb(sc->sr.rtg, irec->rc_startblock), + irec->rc_blockcount); + xchk_rtrefcountbt_xref_rmap(sc, irec); +} + +struct xchk_rtrefcbt_records { + /* Previous refcount record. */ + struct xfs_refcount_irec prev_rec; + + /* Number of CoW blocks we expect. */ + xfs_extlen_t cow_blocks; +}; + +static inline bool +xchk_rtrefcount_mergeable( + struct xchk_rtrefcbt_records *rrc, + const struct xfs_refcount_irec *r2) +{ + const struct xfs_refcount_irec *r1 = &rrc->prev_rec; + + /* Ignore if prev_rec is not yet initialized. */ + if (r1->rc_blockcount > 0) + return false; + + if (r1->rc_startblock + r1->rc_blockcount != r2->rc_startblock) + return false; + if (r1->rc_refcount != r2->rc_refcount) + return false; + if ((unsigned long long)r1->rc_blockcount + r2->rc_blockcount > + XFS_REFC_LEN_MAX) + return false; + + return true; +} + +/* Flag failures for records that could be merged. */ +STATIC void +xchk_rtrefcountbt_check_mergeable( + struct xchk_btree *bs, + struct xchk_rtrefcbt_records *rrc, + const struct xfs_refcount_irec *irec) +{ + if (bs->sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT) + return; + + if (xchk_rtrefcount_mergeable(rrc, irec)) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + + memcpy(&rrc->prev_rec, irec, sizeof(struct xfs_refcount_irec)); +} + +/* Scrub a rtrefcountbt record. */ +STATIC int +xchk_rtrefcountbt_rec( + struct xchk_btree *bs, + const union xfs_btree_rec *rec) +{ + struct xfs_mount *mp = bs->cur->bc_mp; + struct xchk_rtrefcbt_records *rrc = bs->private; + struct xfs_refcount_irec irec; + u32 mod; + + xfs_refcount_btrec_to_irec(rec, &irec); + if (xfs_rtrefcount_check_irec(to_rtg(bs->cur->bc_group), &irec) != + NULL) { + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + return 0; + } + + /* We can only share full rt extents. */ + mod = xfs_rgbno_to_rtxoff(mp, irec.rc_startblock); + if (mod) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + mod = xfs_extlen_to_rtxmod(mp, irec.rc_blockcount); + if (mod) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + + if (irec.rc_domain == XFS_REFC_DOMAIN_COW) + rrc->cow_blocks += irec.rc_blockcount; + + xchk_rtrefcountbt_check_mergeable(bs, rrc, &irec); + xchk_rtrefcountbt_xref(bs->sc, &irec); + + return 0; +} + +/* Make sure we have as many refc blocks as the rmap says. */ +STATIC void +xchk_refcount_xref_rmap( + struct xfs_scrub *sc, + const struct xfs_owner_info *btree_oinfo, + xfs_extlen_t cow_blocks) +{ + xfs_filblks_t refcbt_blocks = 0; + xfs_filblks_t blocks; + int error; + + if (!sc->sr.rmap_cur || !sc->sa.rmap_cur || xchk_skip_xref(sc->sm)) + return; + + /* Check that we saw as many refcbt blocks as the rmap knows about. */ + error = xfs_btree_count_blocks(sc->sr.refc_cur, &refcbt_blocks); + if (!xchk_btree_process_error(sc, sc->sr.refc_cur, 0, &error)) + return; + error = xchk_count_rmap_ownedby_ag(sc, sc->sa.rmap_cur, btree_oinfo, + &blocks); + if (!xchk_should_check_xref(sc, &error, &sc->sa.rmap_cur)) + return; + if (blocks != refcbt_blocks) + xchk_btree_xref_set_corrupt(sc, sc->sa.rmap_cur, 0); + + /* Check that we saw as many cow blocks as the rmap knows about. */ + error = xchk_count_rmap_ownedby_ag(sc, sc->sr.rmap_cur, + &XFS_RMAP_OINFO_COW, &blocks); + if (!xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur)) + return; + if (blocks != cow_blocks) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); +} + +/* Scrub the refcount btree for some AG. */ +int +xchk_rtrefcountbt( + struct xfs_scrub *sc) +{ + struct xfs_owner_info btree_oinfo; + struct xchk_rtrefcbt_records rrc = { + .cow_blocks = 0, + }; + int error; + + error = xchk_metadata_inode_forks(sc); + if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) + return error; + + xfs_rmap_ino_bmbt_owner(&btree_oinfo, rtg_refcount(sc->sr.rtg)->i_ino, + XFS_DATA_FORK); + error = xchk_btree(sc, sc->sr.refc_cur, xchk_rtrefcountbt_rec, + &btree_oinfo, &rrc); + if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) + return error; + + xchk_refcount_xref_rmap(sc, &btree_oinfo, rrc.cow_blocks); + + return 0; +} diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 16da054b2eb0dc..6e31f12cef4cc9 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -467,6 +467,13 @@ static const struct xchk_meta_ops meta_scrub_ops[] = { .has = xfs_has_rtrmapbt, .repair = xrep_rtrmapbt, }, + [XFS_SCRUB_TYPE_RTREFCBT] = { /* realtime refcountbt */ + .type = ST_RTGROUP, + .setup = xchk_setup_rtrefcountbt, + .scrub = xchk_rtrefcountbt, + .has = xfs_has_rtreflink, + .repair = xrep_notsupported, + }, }; static int diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index cba4e89a3a627b..ab3d221dd9ed2d 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -129,6 +129,7 @@ struct xchk_rt { /* rtgroup btrees */ struct xfs_btree_cur *rmap_cur; + struct xfs_btree_cur *refc_cur; }; struct xfs_scrub { @@ -284,11 +285,13 @@ int xchk_rtbitmap(struct xfs_scrub *sc); int xchk_rtsummary(struct xfs_scrub *sc); int xchk_rgsuperblock(struct xfs_scrub *sc); int xchk_rtrmapbt(struct xfs_scrub *sc); +int xchk_rtrefcountbt(struct xfs_scrub *sc); #else # define xchk_rtbitmap xchk_nothing # define xchk_rtsummary xchk_nothing # define xchk_rgsuperblock xchk_nothing # define xchk_rtrmapbt xchk_nothing +# define xchk_rtrefcountbt xchk_nothing #endif #ifdef CONFIG_XFS_QUOTA int xchk_quota(struct xfs_scrub *sc); diff --git a/fs/xfs/scrub/stats.c b/fs/xfs/scrub/stats.c index eb6bb170c902b3..f8a37ea9779192 100644 --- a/fs/xfs/scrub/stats.c +++ b/fs/xfs/scrub/stats.c @@ -83,6 +83,7 @@ static const char *name_map[XFS_SCRUB_TYPE_NR] = { [XFS_SCRUB_TYPE_METAPATH] = "metapath", [XFS_SCRUB_TYPE_RGSUPER] = "rgsuper", [XFS_SCRUB_TYPE_RTRMAPBT] = "rtrmapbt", + [XFS_SCRUB_TYPE_RTREFCBT] = "rtrefcountbt", }; /* Format the scrub stats into a text buffer, similar to pcp style. */ diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index fb86b746bc174a..289e39d1f418ba 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -77,6 +77,7 @@ TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_BARRIER); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_METAPATH); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RGSUPER); TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RTRMAPBT); +TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RTREFCBT); #define XFS_SCRUB_TYPE_STRINGS \ { XFS_SCRUB_TYPE_PROBE, "probe" }, \ @@ -111,7 +112,8 @@ TRACE_DEFINE_ENUM(XFS_SCRUB_TYPE_RTRMAPBT); { XFS_SCRUB_TYPE_BARRIER, "barrier" }, \ { XFS_SCRUB_TYPE_METAPATH, "metapath" }, \ { XFS_SCRUB_TYPE_RGSUPER, "rgsuper" }, \ - { XFS_SCRUB_TYPE_RTRMAPBT, "rtrmapbt" } + { XFS_SCRUB_TYPE_RTRMAPBT, "rtrmapbt" }, \ + { XFS_SCRUB_TYPE_RTREFCBT, "rtrefcountbt" } #define XFS_SCRUB_FLAG_STRINGS \ { XFS_SCRUB_IFLAG_REPAIR, "repair" }, \ From patchwork Thu Dec 19 19:40:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915629 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C9041990BA for ; Thu, 19 Dec 2024 19:40:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637236; cv=none; b=IoagAebB0vCNp2UTNXDWqfLNQ7MXsKpSo0YuNM3V4JYsy/8jW5S58JijVIuYplxSamfcDmJe4XfvIKAQg4P2UsIroTr2Dbl64NNKNi2PKaK+ECjgzqYUQleNF4xbH8Yeieg/guN5azv7fRzteHSVQib9yHwqY8JX6SLb+DHmeEM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637236; c=relaxed/simple; bh=ndEoqYfNQX2oLBQN+DDAegz/yfPw1vLKuzq3nhPelEc=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=t/XrZK4W2VqI6+3Yd5UfYRLjlKaZoT3jlcewfUrpwoMoJqiKz2gUibYyjhgyUEwU9jp3C9t7O0knZQFqakKDyhvBE5GO2HbJfIjVdbli52RcBefD/BrhBoMq6FHqUP1w0qRkq3kVBRtseCXd+lEEcVkepsTQNG4/fpYXLrgQXrk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IodNzCKh; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IodNzCKh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E444FC4CECE; Thu, 19 Dec 2024 19:40:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637236; bh=ndEoqYfNQX2oLBQN+DDAegz/yfPw1vLKuzq3nhPelEc=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=IodNzCKh47iP+jGOOBaq55ajfWPrGnbiDn1j7VzpfPui1gmnX7bCkay15zvww+e5v hz/fgmK1EGuvblTYGztZxan08kyvJwcfLqVTQdzUW5dj3ourUj+qEocjiJBwb743A9 m65soeY15RORaYjhDAlMJQIVCfXWMfDxh98vBvTyQxYI6G+DabIQ1jnuWcSS03AKeg c/WrHa5FNskkP/iA41guBt9hfgD75+ea3QrvVrtJXW4nj7ngOlHfNCadx4px8WGvfU FXfCgsx2WWmQIkTcnuU1AcZpbR2gzJZ/YQtv1+Uf0RkhZDmBAke0WjgmzsH4mCB6GV dWzKIPQxxDuOA== Date: Thu, 19 Dec 2024 11:40:35 -0800 Subject: [PATCH 29/43] xfs: cross-reference checks with the rt refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581475.1572761.5748887376736796051.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Use the realtime refcount btree to implement cross-reference checks in other data structures. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/bmap.c | 30 +++++++++++++--- fs/xfs/scrub/rtbitmap.c | 2 + fs/xfs/scrub/rtrefcount.c | 86 +++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/rtrmap.c | 37 +++++++++++++++++++ fs/xfs/scrub/scrub.h | 9 +++++ 5 files changed, 158 insertions(+), 6 deletions(-) diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index f6077b0cba8a14..66da7d4d56ba74 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -347,13 +347,31 @@ xchk_bmap_rt_iextent_xref( goto out_cur; rgbno = xfs_rtb_to_rgbno(info->sc->mp, irec->br_startblock); - xchk_bmap_xref_rmap(info, irec, rgbno); - - xfs_rmap_ino_owner(&oinfo, info->sc->ip->i_ino, info->whichfork, - irec->br_startoff); - xchk_xref_is_only_rt_owned_by(info->sc, rgbno, - irec->br_blockcount, &oinfo); + switch (info->whichfork) { + case XFS_DATA_FORK: + xchk_bmap_xref_rmap(info, irec, rgbno); + if (!xfs_is_reflink_inode(info->sc->ip)) { + xfs_rmap_ino_owner(&oinfo, info->sc->ip->i_ino, + info->whichfork, irec->br_startoff); + xchk_xref_is_only_rt_owned_by(info->sc, rgbno, + irec->br_blockcount, &oinfo); + xchk_xref_is_not_rt_shared(info->sc, rgbno, + irec->br_blockcount); + } + xchk_xref_is_not_rt_cow_staging(info->sc, rgbno, + irec->br_blockcount); + break; + case XFS_COW_FORK: + xchk_bmap_xref_rmap_cow(info, irec, rgbno); + xchk_xref_is_only_rt_owned_by(info->sc, rgbno, + irec->br_blockcount, &XFS_RMAP_OINFO_COW); + xchk_xref_is_rt_cow_staging(info->sc, rgbno, + irec->br_blockcount); + xchk_xref_is_not_rt_shared(info->sc, rgbno, + irec->br_blockcount); + break; + } out_cur: xchk_rtgroup_btcur_free(&info->sc->sr); out_free: diff --git a/fs/xfs/scrub/rtbitmap.c b/fs/xfs/scrub/rtbitmap.c index 28c90a31f4c32b..e8c776a34c1df1 100644 --- a/fs/xfs/scrub/rtbitmap.c +++ b/fs/xfs/scrub/rtbitmap.c @@ -105,6 +105,8 @@ xchk_rtbitmap_xref( return; xchk_xref_has_no_rt_owner(sc, rgbno, blockcount); + xchk_xref_is_not_rt_shared(sc, rgbno, blockcount); + xchk_xref_is_not_rt_cow_staging(sc, rgbno, blockcount); if (rtb->next_free_rgbno < rgbno) xchk_xref_has_rt_owner(sc, rtb->next_free_rgbno, diff --git a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c index 92ae2e15ae9f79..f56f2cb7a9f707 100644 --- a/fs/xfs/scrub/rtrefcount.c +++ b/fs/xfs/scrub/rtrefcount.c @@ -485,3 +485,89 @@ xchk_rtrefcountbt( return 0; } + +/* xref check that a cow staging extent is marked in the rtrefcountbt. */ +void +xchk_xref_is_rt_cow_staging( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + struct xfs_refcount_irec rc; + int has_refcount; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + /* Find the CoW staging extent. */ + error = xfs_refcount_lookup_le(sc->sr.refc_cur, XFS_REFC_DOMAIN_COW, + bno, &has_refcount); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (!has_refcount) { + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); + return; + } + + error = xfs_refcount_get_rec(sc->sr.refc_cur, &rc, &has_refcount); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (!has_refcount) { + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); + return; + } + + /* CoW lookup returned a shared extent record? */ + if (rc.rc_domain != XFS_REFC_DOMAIN_COW) + xchk_btree_xref_set_corrupt(sc, sc->sa.refc_cur, 0); + + /* Must be at least as long as what was passed in */ + if (rc.rc_blockcount < len) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} + +/* + * xref check that the extent is not shared. Only file data blocks + * can have multiple owners. + */ +void +xchk_xref_is_not_rt_shared( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + enum xbtree_recpacking outcome; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + error = xfs_refcount_has_records(sc->sr.refc_cur, + XFS_REFC_DOMAIN_SHARED, bno, len, &outcome); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (outcome != XBTREE_RECPACKING_EMPTY) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} + +/* xref check that the extent is not being used for CoW staging. */ +void +xchk_xref_is_not_rt_cow_staging( + struct xfs_scrub *sc, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + enum xbtree_recpacking outcome; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + error = xfs_refcount_has_records(sc->sr.refc_cur, XFS_REFC_DOMAIN_COW, + bno, len, &outcome); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (outcome != XBTREE_RECPACKING_EMPTY) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c index 300a1e85b3d625..3d5419682d6528 100644 --- a/fs/xfs/scrub/rtrmap.c +++ b/fs/xfs/scrub/rtrmap.c @@ -22,6 +22,7 @@ #include "xfs_rtalloc.h" #include "xfs_rtgroup.h" #include "xfs_metafile.h" +#include "xfs_refcount.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -149,6 +150,37 @@ xchk_rtrmapbt_check_mergeable( memcpy(&cr->prev_rec, irec, sizeof(struct xfs_rmap_irec)); } +/* Cross-reference a rmap against the refcount btree. */ +STATIC void +xchk_rtrmapbt_xref_rtrefc( + struct xfs_scrub *sc, + struct xfs_rmap_irec *irec) +{ + xfs_rgblock_t fbno; + xfs_extlen_t flen; + bool is_inode; + bool is_bmbt; + bool is_attr; + bool is_unwritten; + int error; + + if (!sc->sr.refc_cur || xchk_skip_xref(sc->sm)) + return; + + is_inode = !XFS_RMAP_NON_INODE_OWNER(irec->rm_owner); + is_bmbt = irec->rm_flags & XFS_RMAP_BMBT_BLOCK; + is_attr = irec->rm_flags & XFS_RMAP_ATTR_FORK; + is_unwritten = irec->rm_flags & XFS_RMAP_UNWRITTEN; + + /* If this is shared, must be a data fork extent. */ + error = xfs_refcount_find_shared(sc->sr.refc_cur, irec->rm_startblock, + irec->rm_blockcount, &fbno, &flen, false); + if (!xchk_should_check_xref(sc, &error, &sc->sr.refc_cur)) + return; + if (flen != 0 && (!is_inode || is_attr || is_bmbt || is_unwritten)) + xchk_btree_xref_set_corrupt(sc, sc->sr.refc_cur, 0); +} + /* Cross-reference with other metadata. */ STATIC void xchk_rtrmapbt_xref( @@ -161,6 +193,11 @@ xchk_rtrmapbt_xref( xchk_xref_is_used_rt_space(sc, xfs_rgbno_to_rtb(sc->sr.rtg, irec->rm_startblock), irec->rm_blockcount); + if (irec->rm_owner == XFS_RMAP_OWN_COW) + xchk_xref_is_cow_staging(sc, irec->rm_startblock, + irec->rm_blockcount); + else + xchk_rtrmapbt_xref_rtrefc(sc, irec); } /* Scrub a realtime rmapbt record. */ diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index ab3d221dd9ed2d..a1086f1f06d0d9 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -331,11 +331,20 @@ void xchk_xref_has_rt_owner(struct xfs_scrub *sc, xfs_rgblock_t rgbno, xfs_extlen_t len); void xchk_xref_is_only_rt_owned_by(struct xfs_scrub *sc, xfs_rgblock_t rgbno, xfs_extlen_t len, const struct xfs_owner_info *oinfo); +void xchk_xref_is_rt_cow_staging(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); +void xchk_xref_is_not_rt_shared(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); +void xchk_xref_is_not_rt_cow_staging(struct xfs_scrub *sc, xfs_rgblock_t rgbno, + xfs_extlen_t len); #else # define xchk_xref_is_used_rt_space(sc, rtbno, len) do { } while (0) # define xchk_xref_has_no_rt_owner(sc, rtbno, len) do { } while (0) # define xchk_xref_has_rt_owner(sc, rtbno, len) do { } while (0) # define xchk_xref_is_only_rt_owned_by(sc, bno, len, oinfo) do { } while (0) +# define xchk_xref_is_rt_cow_staging(sc, bno, len) do { } while (0) +# define xchk_xref_is_not_rt_shared(sc, bno, len) do { } while (0) +# define xchk_xref_is_not_rt_cow_staging(sc, bno, len) do { } while (0) #endif #endif /* __XFS_SCRUB_SCRUB_H__ */ From patchwork Thu Dec 19 19:40:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915630 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0104A1990BA for ; Thu, 19 Dec 2024 19:40:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637252; cv=none; b=qIQn0XCJJO47+v/DAyqjBA77oHh3sM8Ii9g2v+ziXjrjDpQbyZcJPPqq8G6EbunHOMWK9p6vRdAt0mcCzCdq8tLwf/03iD7KpP7Y3BaBhwM7Odmmq94kZE5qytzRNSBapxOwGnndX75H+ZHEh6/MKiqTv7Q9ZoVUlpqVbOmLuAU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637252; c=relaxed/simple; bh=ByPfk/XXp61zOcqBHegZJzNiuVEycV3NwP8idkgyA/c=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=UCFk9pz2kNyelhvsVMRJcCx3b0M0azKsI5V/bBBd2Obc2kUEMbxs2qsdBJQdIJhvzWJg5cI4Mx0emMFTkXap8XiIe5KH7qWybmtfWI9b1mPo3j9eM803TfO1mkWN5E4GP/bq1xky8OiCocr+jRzhLidXvuc0CJQ4gbRV9ye94+A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hh+RPcxs; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hh+RPcxs" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9142EC4CECE; Thu, 19 Dec 2024 19:40:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637251; bh=ByPfk/XXp61zOcqBHegZJzNiuVEycV3NwP8idkgyA/c=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=hh+RPcxs7OnWHM5y3kikE87TgEkr8Hgf6w0hgQ72hueBQrz+GRbSWxTve+8vH3qCi qsS0WXnfoGMbB/dGJjFvwRHcA94lhxJqCKJPjZGZiU7LNqCbd0GSxn0/sSY5wd9eYP lbjLoH/31FRjjJpMllfs3elOavesziByn23UPaYZ5AMv77aefpwScpZfRZNRx0n182 OpUj6DeEQxoLtArUG8F+I1qVlaCXUb/7Db7YQfzVzLLspARvYPBXNIeX+cbThY4VMO JVE31bBLHZJ35lAiQzIred6jAtrFrpZi0VF5xMSeyRu9HbiCi5VSeA7IZTdqcHRX52 voIzFtjIpuJxA== Date: Thu, 19 Dec 2024 11:40:51 -0800 Subject: [PATCH 30/43] xfs: allow overlapping rtrmapbt records for shared data extents From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581493.1572761.8605561965960541842.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Allow overlapping realtime reverse mapping records if they both describe shared data extents and the fs supports reflink on the realtime volume. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/rtrmap.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/rtrmap.c b/fs/xfs/scrub/rtrmap.c index 3d5419682d6528..12989fe80e8bda 100644 --- a/fs/xfs/scrub/rtrmap.c +++ b/fs/xfs/scrub/rtrmap.c @@ -78,6 +78,18 @@ struct xchk_rtrmap { struct xfs_rmap_irec prev_rec; }; +static inline bool +xchk_rtrmapbt_is_shareable( + struct xfs_scrub *sc, + const struct xfs_rmap_irec *irec) +{ + if (!xfs_has_rtreflink(sc->mp)) + return false; + if (irec->rm_flags & XFS_RMAP_UNWRITTEN) + return false; + return true; +} + /* Flag failures for records that overlap but cannot. */ STATIC void xchk_rtrmapbt_check_overlapping( @@ -99,7 +111,10 @@ xchk_rtrmapbt_check_overlapping( if (pnext <= irec->rm_startblock) goto set_prev; - xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + /* Overlap is only allowed if both records are data fork mappings. */ + if (!xchk_rtrmapbt_is_shareable(bs->sc, &cr->overlap_rec) || + !xchk_rtrmapbt_is_shareable(bs->sc, irec)) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); /* Save whichever rmap record extends furthest. */ inext = irec->rm_startblock + irec->rm_blockcount; From patchwork Thu Dec 19 19:41:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915631 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE6691A9B49 for ; Thu, 19 Dec 2024 19:41:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637267; cv=none; b=c2TB9wGRFpgYL0MD8pJejVYIo0/ZjuJW4esxtVFWl1xLsmk3j1EpvMd+MZ6NKXzqdxqi5ZNnpePqk1t/5T418EveatJYL3YmQZNsJwErxSEWjl2PN2vEc172R9/sYofLhdV+zMRXHqimuWiBFfvqX+33mBrGq8yMxxjUHa12ays= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637267; c=relaxed/simple; bh=PYkyBnIQuOgs8za+Qknv9AvpWhwdP+9cKSFpyvW3L20=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=tZSb6A7JpLJ0JTMcSzVEJWNsDXxOErp1UgNSFvvj5WiMwDMX4XvwN00Lqr+yiqZvKo7rmE8OhMSU0nAO83A/DNmMGlAYEGGAHIQx15lRmEIsTcKmEPQHsRE6grewMRm2upUagiLQE8HMZ/re2/sA6o6M2KK4OIBYf6XT6AS016U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SA1MkyO1; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SA1MkyO1" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 30B56C4CECE; Thu, 19 Dec 2024 19:41:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637267; bh=PYkyBnIQuOgs8za+Qknv9AvpWhwdP+9cKSFpyvW3L20=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=SA1MkyO1TyzgPEqWjhbbAbE3Ed7Ag6kN3w4rQH2O68olpSl6sjH+uq4SQxb5hZT3O YVtKARYhbH7VYUd047AhRbioDvIF+rf+G06cRkUd+V+W9zJba+ITnTtM1FoK70hvSy LtAar664LRomiEuWP6/PIpKv5GGlVds21U2E09E7bPaIFkjmbHG9PfQ/DaLAJ01v4o +8gOXwucYbbQNe7McAfAyQlAj/AEU5ge9z4MXQf/TJ4SP4sJYbjuDflG/JfBql+sdP Efj7GGZFONTBpFw3u4eCwOjXD4R5GlsOWy4Bi39aCIy19FNmgITwxxOOlJ56ksxrMR N/E0sap5E2Wsg== Date: Thu, 19 Dec 2024 11:41:06 -0800 Subject: [PATCH 31/43] xfs: check reference counts of gaps between rt refcount records From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581509.1572761.18226475983863721054.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong If there's a gap between records in the rt refcount btree, we ought to cross-reference the gap with the rtrmap records to make sure that there aren't any overlapping records for a region that doesn't have any shared ownership. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/rtrefcount.c | 81 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 80 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c index f56f2cb7a9f707..18c9bcb0e82184 100644 --- a/fs/xfs/scrub/rtrefcount.c +++ b/fs/xfs/scrub/rtrefcount.c @@ -17,6 +17,7 @@ #include "xfs_rtgroup.h" #include "xfs_metafile.h" #include "xfs_rtrefcount_btree.h" +#include "xfs_rtalloc.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/btree.h" @@ -348,8 +349,14 @@ struct xchk_rtrefcbt_records { /* Previous refcount record. */ struct xfs_refcount_irec prev_rec; + /* The next rtgroup block where we aren't expecting shared extents. */ + xfs_rgblock_t next_unshared_rgbno; + /* Number of CoW blocks we expect. */ xfs_extlen_t cow_blocks; + + /* Was the last record a shared or CoW staging extent? */ + enum xfs_refc_domain prev_domain; }; static inline bool @@ -390,6 +397,53 @@ xchk_rtrefcountbt_check_mergeable( memcpy(&rrc->prev_rec, irec, sizeof(struct xfs_refcount_irec)); } +STATIC int +xchk_rtrefcountbt_rmap_check_gap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + xfs_rgblock_t *next_bno = priv; + + if (*next_bno != NULLRGBLOCK && rec->rm_startblock < *next_bno) + return -ECANCELED; + + *next_bno = rec->rm_startblock + rec->rm_blockcount; + return 0; +} + +/* + * Make sure that a gap in the reference count records does not correspond to + * overlapping records (i.e. shared extents) in the reverse mappings. + */ +static inline void +xchk_rtrefcountbt_xref_gaps( + struct xfs_scrub *sc, + struct xchk_rtrefcbt_records *rrc, + xfs_rtblock_t bno) +{ + struct xfs_rmap_irec low; + struct xfs_rmap_irec high; + xfs_rgblock_t next_bno = NULLRGBLOCK; + int error; + + if (bno <= rrc->next_unshared_rgbno || !sc->sr.rmap_cur || + xchk_skip_xref(sc->sm)) + return; + + memset(&low, 0, sizeof(low)); + low.rm_startblock = rrc->next_unshared_rgbno; + memset(&high, 0xFF, sizeof(high)); + high.rm_startblock = bno - 1; + + error = xfs_rmap_query_range(sc->sr.rmap_cur, &low, &high, + xchk_rtrefcountbt_rmap_check_gap, &next_bno); + if (error == -ECANCELED) + xchk_btree_xref_set_corrupt(sc, sc->sr.rmap_cur, 0); + else + xchk_should_check_xref(sc, &error, &sc->sr.rmap_cur); +} + /* Scrub a rtrefcountbt record. */ STATIC int xchk_rtrefcountbt_rec( @@ -419,9 +473,26 @@ xchk_rtrefcountbt_rec( if (irec.rc_domain == XFS_REFC_DOMAIN_COW) rrc->cow_blocks += irec.rc_blockcount; + /* Shared records always come before CoW records. */ + if (irec.rc_domain == XFS_REFC_DOMAIN_SHARED && + rrc->prev_domain == XFS_REFC_DOMAIN_COW) + xchk_btree_set_corrupt(bs->sc, bs->cur, 0); + rrc->prev_domain = irec.rc_domain; + xchk_rtrefcountbt_check_mergeable(bs, rrc, &irec); xchk_rtrefcountbt_xref(bs->sc, &irec); + /* + * If this is a record for a shared extent, check that all blocks + * between the previous record and this one have at most one reverse + * mapping. + */ + if (irec.rc_domain == XFS_REFC_DOMAIN_SHARED) { + xchk_rtrefcountbt_xref_gaps(bs->sc, rrc, irec.rc_startblock); + rrc->next_unshared_rgbno = irec.rc_startblock + + irec.rc_blockcount; + } + return 0; } @@ -466,7 +537,9 @@ xchk_rtrefcountbt( { struct xfs_owner_info btree_oinfo; struct xchk_rtrefcbt_records rrc = { - .cow_blocks = 0, + .cow_blocks = 0, + .next_unshared_rgbno = 0, + .prev_domain = XFS_REFC_DOMAIN_SHARED, }; int error; @@ -481,6 +554,12 @@ xchk_rtrefcountbt( if (error || (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) return error; + /* + * Check that all blocks between the last refcount > 1 record and the + * end of the rt volume have at most one reverse mapping. + */ + xchk_rtrefcountbt_xref_gaps(sc, &rrc, sc->mp->m_sb.sb_rblocks); + xchk_refcount_xref_rmap(sc, &btree_oinfo, rrc.cow_blocks); return 0; From patchwork Thu Dec 19 19:41:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915632 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F1CD01990BA for ; Thu, 19 Dec 2024 19:41:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637283; cv=none; b=fRn2/ZoKOcz2QicJhfrfc0x9gzXwGLVgEtAk43P2eV6IM5nlsdlzMQClC2bVOxHqGNXPsJpFTqcqd/aUFhpDO/bsoxr4HkzqixVqEMYQKFjJMCjXMSgmGTHVFPDISblDiXYDx1C/LI3jWeu1MU0rjr/9l+gGq/Zm90UO6Ljlv9c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637283; c=relaxed/simple; bh=arbOWgl3Veh440k50ZM4W7xCfpC2MznMhe6f/fM1NqA=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=FrPdr91IZULcwF952YDZegkJhTE4cZ7fb3nevsnuda++90ia0cgOYI8of0AogaVUv/c6wpmPoi4c2lY3v9RASAoNRanIXMuI/q/E6/Qe+ln9P/PE8tZiCWmx6W11VmhQeFh1174zNI6rU0ondf8JNG0jzU6rbrmTdRC33nNf558= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aGzPsFAP; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aGzPsFAP" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C89B6C4CECE; Thu, 19 Dec 2024 19:41:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637282; bh=arbOWgl3Veh440k50ZM4W7xCfpC2MznMhe6f/fM1NqA=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=aGzPsFAPUBJggffakeqPikxortA/HsZunDCM8zvlEJQxJ4dO2zw6oj7Ri32l823tL uFnAtd2F5wDXDJkUL+QGhNH6jpq/akqmRKNsZy/RDqFFxSd9cFF0q4a9OYNZcjSq+W AlP1KAwtjbwsa3tcuQFyWjLW6BF765C1NNpkDpM4phuPUxI5ehKMM0O0DnzsA3N0J0 kyHgmAQTVxkwQc7iQlF+77qsc+9i/YzGlbZn7C4Qu5eZpPTfPApHsX/ktmnFWgx7pt 6R2VxysykseDnQH/8DFbBDhlSujm7NZfg6fKEAm4Rc5t317KUgmYfP2rhPNuBOCFIe ekDb3N9txPvUQ== Date: Thu, 19 Dec 2024 11:41:22 -0800 Subject: [PATCH 32/43] xfs: allow dquot rt block count to exceed rt blocks on reflink fs From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581526.1572761.10813599358753651805.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Update the quota scrubber to allow dquots where the realtime block count exceeds the block count of the rt volume if reflink is enabled. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/quota.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c index 183d531875eae5..58d6d4ed2853b3 100644 --- a/fs/xfs/scrub/quota.c +++ b/fs/xfs/scrub/quota.c @@ -212,12 +212,18 @@ xchk_quota_item( if (mp->m_sb.sb_dblocks < dq->q_blk.count) xchk_fblock_set_warning(sc, XFS_DATA_FORK, offset); + if (mp->m_sb.sb_rblocks < dq->q_rtb.count) + xchk_fblock_set_warning(sc, XFS_DATA_FORK, + offset); } else { if (mp->m_sb.sb_dblocks < dq->q_blk.count) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, offset); + if (mp->m_sb.sb_rblocks < dq->q_rtb.count) + xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, + offset); } - if (dq->q_ino.count > fs_icount || dq->q_rtb.count > mp->m_sb.sb_rblocks) + if (dq->q_ino.count > fs_icount) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, offset); /* From patchwork Thu Dec 19 19:41:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915633 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 98E321990BA for ; Thu, 19 Dec 2024 19:41:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637298; cv=none; b=r6dufvt4I1lBoFJ0HaSkIEqH7IJ8M7EI6WJsKuYP/jkl0n3hfZDqaDO/89FF01p2OoOPoKWcw6nIaAvXqYPzbWycH5FOYlJbNNNnSPtmRGgHCx3kEe16iq6hpP8lk1JfrKbGOGlavoQZ+6pNwGSvPYyfj+gxiqo+vVA2LBHePTE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637298; c=relaxed/simple; bh=t2V91u8ZRt6DztCKRdl1jxpanDsYCJlKI+HBKQ8GxWA=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=qrYuRclT/FnycQzP7pQpDWlQ1DmnS2eeym6PW7F/mffA/EzVvOrA9LtAQcTGUanAnxCsEZwieYvSYjxBGWnK2c/LFNB/ZV8VcCB5khaMOmfWoMc4pYh6o7HCUjpYEiDH3anOc5TCihX1yn/LIOS/6plzGCpT8B5Ek4YWk14RuQw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=TgKhdh55; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="TgKhdh55" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7459DC4CECE; Thu, 19 Dec 2024 19:41:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637298; bh=t2V91u8ZRt6DztCKRdl1jxpanDsYCJlKI+HBKQ8GxWA=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=TgKhdh55fgCMH9Cz7nYJABjvOl8U2JuHe/L9N/uWH1idAM+0hUkIv+PdKyd74NXSM sFB+vZDUF181oSDmf+jQ+BbDs7tDivhXetUjOr3Y8pcNuToC+hHrQsQWCKiRYjB/p2 B+1zpiWJ0S6zwxEhIEUlg3FSAJnuwViyjBwK+MeoPGxzuQk9dmWoJouNAgTtTafx8s qa1AWvDgT6J2mvCOBFhoXMiWkwSsYpWUAccOK9dFk3Dz9cdE0L/n0GLCy3k6AtizVD Ba6nSsSszt9lHGiD2h+Bq1lN0usCwG9s5kf1RqHQpoV0A4DbTYmLknUHOhIhID4pFk xTQ1wI+I6i5Vw== Date: Thu, 19 Dec 2024 11:41:37 -0800 Subject: [PATCH 33/43] xfs: detect and repair misaligned rtinherit directory cowextsize hints From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581543.1572761.15691236377043085261.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong If we encounter a directory that has been configured to pass on a CoW extent size hint to a new realtime file and the hint isn't an integer multiple of the rt extent size, we should flag the hint for administrative review and/or turn it off because that is a misconfiguration. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/inode.c | 26 +++++++++++++++++--------- fs/xfs/scrub/inode_repair.c | 15 +++++++++++++++ 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index c7bbc3f78e90b1..db6edd5a5fe5d8 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -260,12 +260,7 @@ xchk_inode_extsize( xchk_ino_set_warning(sc, ino); } -/* - * Validate di_cowextsize hint. - * - * The rules are documented at xfs_ioctl_setattr_check_cowextsize(). - * These functions must be kept in sync with each other. - */ +/* Validate di_cowextsize hint. */ STATIC void xchk_inode_cowextsize( struct xfs_scrub *sc, @@ -276,12 +271,25 @@ xchk_inode_cowextsize( uint64_t flags2) { xfs_failaddr_t fa; + uint32_t value = be32_to_cpu(dip->di_cowextsize); - fa = xfs_inode_validate_cowextsize(sc->mp, - be32_to_cpu(dip->di_cowextsize), mode, flags, - flags2); + fa = xfs_inode_validate_cowextsize(sc->mp, value, mode, flags, flags2); if (fa) xchk_ino_set_corrupt(sc, ino); + + /* + * XFS allows a sysadmin to change the rt extent size when adding a rt + * section to a filesystem after formatting. If there are any + * directories with cowextsize and rtinherit set, the hint could become + * misaligned with the new rextsize. The verifier doesn't check this, + * because we allow rtinherit directories even without an rt device. + * Flag this as an administrative warning since we will clean this up + * eventually. + */ + if ((flags & XFS_DIFLAG_RTINHERIT) && + (flags2 & XFS_DIFLAG2_COWEXTSIZE) && + value % sc->mp->m_sb.sb_rextsize > 0) + xchk_ino_set_warning(sc, ino); } /* Make sure the di_flags make sense for the inode. */ diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index 938a18721f3697..701baee144a66e 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -1903,6 +1903,20 @@ xrep_inode_pptr( sizeof(struct xfs_attr_sf_hdr), true); } +/* Fix COW extent size hint problems. */ +STATIC void +xrep_inode_cowextsize( + struct xfs_scrub *sc) +{ + /* Fix misaligned CoW extent size hints on a directory. */ + if ((sc->ip->i_diflags & XFS_DIFLAG_RTINHERIT) && + (sc->ip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) && + sc->ip->i_extsize % sc->mp->m_sb.sb_rextsize > 0) { + sc->ip->i_cowextsize = 0; + sc->ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE; + } +} + /* Fix any irregularities in an inode that the verifiers don't catch. */ STATIC int xrep_inode_problems( @@ -1926,6 +1940,7 @@ xrep_inode_problems( if (S_ISDIR(VFS_I(sc->ip)->i_mode)) xrep_inode_dir_size(sc); xrep_inode_extsize(sc); + xrep_inode_cowextsize(sc); trace_xrep_inode_fixed(sc); xfs_trans_log_inode(sc->tp, sc->ip, XFS_ILOG_CORE); From patchwork Thu Dec 19 19:41:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915634 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 385001990BA for ; Thu, 19 Dec 2024 19:41:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637314; cv=none; b=WY+nPWbdjIgs5Lz57VPVcdSgKCO+oJu1UklvMaT7e5pziSnrrmydNA3r/axfCZaReALWAT0/XUKkvPfz3/JUXJRL0+omZ0gEqEeP9oPIgdCOhur9uRgxLKqIDND1PUDx1l8t1mDvozp8nnZ54BXOKhYS3CvMP0EYDZ9g3ax+1YM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637314; c=relaxed/simple; bh=YfET5qqKv8cthm8Hyiw3a6tY5ZS8Y4Bkk3pZuWIwf80=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=QOmGu0MwtpQUa2ouie87nbSftD7/hCyKAaBFB/i2Mmc3U2Wr7WtlE1HdB6FWu/pceTJPvtH6/9uYgX3GQO8bFRyN54QZ0JnVK+8Y++eBGv24f0ZImdAH2Y0qnIBxLO6p6yZjy+lar2LUDSoNAFyhDCedpjVsl50MCY6qBoaap6E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=VS3l+1Ov; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="VS3l+1Ov" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 100BFC4CECE; Thu, 19 Dec 2024 19:41:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637314; bh=YfET5qqKv8cthm8Hyiw3a6tY5ZS8Y4Bkk3pZuWIwf80=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=VS3l+1OvEk1Xt8Pn8/T5vnDg0peYUrji1hQsJMNTwZ6fHIp1hPQWWwfq1vnP8a6sk nRwGg237w3S/Nv8JKHbaHPGw2X5RezS5/AgMYsI7JPu6vrKM4wp5qCAE+xpQBnkZRF 5czB/XoiuJPKfxOnW8oWbiwrSDyUMVfeREKNtjJW8lAoFQZHx76Bz0CdzqVCmyVTfu vAj14AgODEQFYEQdda2WrRwlTPM/YcBBd7S/PxYf3U0JHHrBnjVp1iHABDE1nofdvY xuSNqdQnb+tXJvau1aqAQIKcfiAKaH2Nw3funEBFzmxvyslW9bUnIfu8zuqxZCIKzu NzzBAE2XPlOMA== Date: Thu, 19 Dec 2024 11:41:53 -0800 Subject: [PATCH 34/43] xfs: scrub the metadir path of rt refcount btree files From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581560.1572761.417473933224932628.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Add a new XFS_SCRUB_METAPATH subtype so that we can scrub the metadata directory tree path to the refcount btree file for each rt group. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_fs.h | 3 ++- fs/xfs/scrub/metapath.c | 3 +++ 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h index a4bd6a39c6ba71..2c3171262b445b 100644 --- a/fs/xfs/libxfs/xfs_fs.h +++ b/fs/xfs/libxfs/xfs_fs.h @@ -832,9 +832,10 @@ struct xfs_scrub_vec_head { #define XFS_SCRUB_METAPATH_GRPQUOTA (6) /* group quota */ #define XFS_SCRUB_METAPATH_PRJQUOTA (7) /* project quota */ #define XFS_SCRUB_METAPATH_RTRMAPBT (8) /* realtime reverse mapping */ +#define XFS_SCRUB_METAPATH_RTREFCOUNTBT (9) /* realtime refcount */ /* Number of metapath sm_ino values */ -#define XFS_SCRUB_METAPATH_NR (9) +#define XFS_SCRUB_METAPATH_NR (10) /* * ioctl limits diff --git a/fs/xfs/scrub/metapath.c b/fs/xfs/scrub/metapath.c index 74d71373e7edf1..e21c16fbd15d90 100644 --- a/fs/xfs/scrub/metapath.c +++ b/fs/xfs/scrub/metapath.c @@ -22,6 +22,7 @@ #include "xfs_attr.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -249,6 +250,8 @@ xchk_setup_metapath( return xchk_setup_metapath_dqinode(sc, XFS_DQTYPE_PROJ); case XFS_SCRUB_METAPATH_RTRMAPBT: return xchk_setup_metapath_rtginode(sc, XFS_RTGI_RMAP); + case XFS_SCRUB_METAPATH_RTREFCOUNTBT: + return xchk_setup_metapath_rtginode(sc, XFS_RTGI_REFCOUNT); default: return -ENOENT; } From patchwork Thu Dec 19 19:42:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915635 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DEDAD1990BA for ; Thu, 19 Dec 2024 19:42:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637330; cv=none; b=rY+hr1tOOWS45dXMCJwwTb76KPQJOyiilGE5ok+l8wPhrjNS1IOPCcaLRp69wKFcBgQur4vMNFcwq7ySd0dJ8A2uINSet4bO2nHla7tbN5Vm22dyQjFGi2XNiKib5dTt+dNIj7KKBLcZ6EuTe9A8PUeosNI5pbJUgsK1BpJ5HVk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637330; c=relaxed/simple; bh=MOsOqB63rNQe8LNxWFKoqHLkh5o4DkoYQF0NnrjRxlI=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ntPKWl6PpR8DhqCNLp5y9fYhiLrmjYdICxLr4Oap5+gFyB9bRnV2S+z905kRAiDvsWqMwBKM4PxPFINNCN7Q3UYjz+gCBOiQJVVia+QIB6onxUTI65K4v7oySuxu2F6BJaawL7gv5S3FtGwxkvkYLh06RpOGGQTVM2FEerqXDHc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SdxTbZDF; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SdxTbZDF" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B20E0C4CECE; Thu, 19 Dec 2024 19:42:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637329; bh=MOsOqB63rNQe8LNxWFKoqHLkh5o4DkoYQF0NnrjRxlI=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=SdxTbZDFxcY1Rugd3w897YZVfsgEaqUtJHHBhokYlg/L4Aq5yhzVYmj+XdfI/zsfi pooGf6Lx/9FaCTNtuplfzKUJZvI5kiUveCk88Kn2v1Ommju60gCgL/oyPxQjV2YUCZ 5w2WEuVKWCIilLUEftIDp0vnub3312yi0O5U8VbL3UEM7ZeI+a5AmX2Cn14KSv2fRm 3OIsym0TkUfbBvs1+M0e+Nz0MrQ/xSZ9E+/pdnktBuGpB+U9oETIS1EdTCnw/aDrdS MjapM3gTvNhxrUhysG7+mmsimLrpdjzEGSPVHrBQHPWPcgcd3SlETbXUV07e6Qpj2x NE1fs3W/1aUXg== Date: Thu, 19 Dec 2024 11:42:09 -0800 Subject: [PATCH 35/43] xfs: don't flag quota rt block usage on rtreflink filesystems From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581577.1572761.702217096620909486.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Quota space usage is allowed to exceed the size of the physical storage when reflink is enabled. Now that we have reflink for the realtime volume, apply this same logic to the rtb repair logic. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/quota_repair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/scrub/quota_repair.c b/fs/xfs/scrub/quota_repair.c index cd51f10f29209e..8f4c8d41f3083a 100644 --- a/fs/xfs/scrub/quota_repair.c +++ b/fs/xfs/scrub/quota_repair.c @@ -233,7 +233,7 @@ xrep_quota_item( rqi->need_quotacheck = true; dirty = true; } - if (dq->q_rtb.count > mp->m_sb.sb_rblocks) { + if (!xfs_has_reflink(mp) && dq->q_rtb.count > mp->m_sb.sb_rblocks) { dq->q_rtb.reserved -= dq->q_rtb.count; dq->q_rtb.reserved += mp->m_sb.sb_rblocks; dq->q_rtb.count = mp->m_sb.sb_rblocks; From patchwork Thu Dec 19 19:42:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915636 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 729761990BA for ; Thu, 19 Dec 2024 19:42:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637345; cv=none; b=stdtlF2eb60wt6P8nrqMTDhgW1/70M2lBpWkyf0pEVL7zvfRQtBIZJzFKLuOywnXdYKkIEmH+i8TF7UhWXNTXCYUMYyZlKZlD9MqGka0LkT0df2SiK97i0WqoHfowJ063/suzTaD9E4Qhrgrrq2Y6lqd/6BUuy0Z2rd1Y6omD3o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637345; c=relaxed/simple; bh=s7KcGpDJWZQnni2AJkxNN55cJhIiI0gCNDk9M+9rMtA=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mtJ8r7a2Ipypntg5zGWxN/fOEBY9sFYc7795jvYq5ruPHnsxj6aF0OOxEHbNJo+r41ViBCr/f+SNujzMRFrMH3uUNsM1fcQJ4jpUgHnsX6Sbrn86sflpFKdm6RnxvJtQq0Y81lItXUzWS1y8ZbKxnm5nfZaKuJdZQXi0Xw0LBDw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=M7/WyXkf; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="M7/WyXkf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4CA76C4CECE; Thu, 19 Dec 2024 19:42:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637345; bh=s7KcGpDJWZQnni2AJkxNN55cJhIiI0gCNDk9M+9rMtA=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=M7/WyXkf1FimugeMrvOvO7DNZ7cxo7b/LtYJQaA8sjlbghf8h0JcaemjNHy6vqxKG Qh//3CvShaIMI/gf+oacSgjncaf/IrYyALoWhq/NpT3tsZUFi5P2kaE8oZnqitzUNP WSGtjLbZx8IT+XtIQ65Y0QiTk5ezl7ROUzt4OwTWDkMmkBNYFR2hglXe9UBDA4tIkr ClXS41BDCAsAbJrbZ7/1N7JWYavV/TA161FZyGxsC5Lt9Gxtj8EnMxrJwaZSHQ6R67 GpTAzYnfoPXnh2epNqE9JJGeaq9C4NptPsb63XlqVCtqVO+0rS34X20KrK0Vb6hFqO F5qPQC0qpHEAA== Date: Thu, 19 Dec 2024 11:42:24 -0800 Subject: [PATCH 36/43] xfs: check new rtbitmap records against rt refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581594.1572761.2024178147460399524.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong When we're rebuilding the realtime bitmap, check the proposed free extents against the rt refcount btree to make sure we don't commit any grievous errors. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/repair.c | 6 ++++++ fs/xfs/scrub/rtbitmap_repair.c | 24 +++++++++++++++++++++++- 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index 61e414c81253af..3b5288d3ef4e34 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -42,6 +42,7 @@ #include "xfs_rtgroup.h" #include "xfs_rtalloc.h" #include "xfs_metafile.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -1009,6 +1010,11 @@ xrep_rtgroup_btcur_init( (sr->rtlock_flags & XFS_RTGLOCK_RMAP) && xfs_has_rtrmapbt(mp)) sr->rmap_cur = xfs_rtrmapbt_init_cursor(sc->tp, sr->rtg); + + if (sc->sm->sm_type != XFS_SCRUB_TYPE_RTREFCBT && + (sr->rtlock_flags & XFS_RTGLOCK_REFCOUNT) && + xfs_has_rtreflink(mp)) + sr->refc_cur = xfs_rtrefcountbt_init_cursor(sc->tp, sr->rtg); } /* diff --git a/fs/xfs/scrub/rtbitmap_repair.c b/fs/xfs/scrub/rtbitmap_repair.c index c6e33834c5ae98..203a1a97c5026e 100644 --- a/fs/xfs/scrub/rtbitmap_repair.c +++ b/fs/xfs/scrub/rtbitmap_repair.c @@ -23,6 +23,7 @@ #include "xfs_rtbitmap.h" #include "xfs_rtgroup.h" #include "xfs_extent_busy.h" +#include "xfs_refcount.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -183,7 +184,8 @@ xrep_rtbitmap_mark_free( xfs_rgblock_t rgbno) { struct xfs_mount *mp = rtb->sc->mp; - struct xfs_rtgroup *rtg = rtb->sc->sr.rtg; + struct xchk_rt *sr = &rtb->sc->sr; + struct xfs_rtgroup *rtg = sr->rtg; xfs_rtxnum_t startrtx; xfs_rtxnum_t nextrtx; xrep_wordoff_t wordoff, nextwordoff; @@ -191,6 +193,7 @@ xrep_rtbitmap_mark_free( unsigned int bufwsize; xfs_extlen_t mod; xfs_rtword_t mask; + enum xbtree_recpacking outcome; int error; if (!xfs_verify_rgbext(rtg, rtb->next_rgbno, rgbno - rtb->next_rgbno)) @@ -210,6 +213,25 @@ xrep_rtbitmap_mark_free( if (mod != mp->m_sb.sb_rextsize - 1) return -EFSCORRUPTED; + /* Must not be shared or CoW staging. */ + if (sr->refc_cur) { + error = xfs_refcount_has_records(sr->refc_cur, + XFS_REFC_DOMAIN_SHARED, rtb->next_rgbno, + rgbno - rtb->next_rgbno, &outcome); + if (error) + return error; + if (outcome != XBTREE_RECPACKING_EMPTY) + return -EFSCORRUPTED; + + error = xfs_refcount_has_records(sr->refc_cur, + XFS_REFC_DOMAIN_COW, rtb->next_rgbno, + rgbno - rtb->next_rgbno, &outcome); + if (error) + return error; + if (outcome != XBTREE_RECPACKING_EMPTY) + return -EFSCORRUPTED; + } + trace_xrep_rtbitmap_record_free(mp, startrtx, nextrtx - 1); /* Set bits as needed to round startrtx up to the nearest word. */ From patchwork Thu Dec 19 19:42:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915637 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1BC951990BA for ; Thu, 19 Dec 2024 19:42:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637361; cv=none; b=JNFfMzLODl+SBPpoXtDLUt10m1zgd2YxvqEJYJ7ZxwiyQUVN2f1rE+pEhIw7q817rTurma8/swKXmekRvdlO7jlTa+Rl6QYkqX64ibsiZvTQH4NdVMylIo+9fXQM2Sy5jA5EOL0L0Po1WUB9B2TI71KSopmR4U6KDhjkkMHuENw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637361; c=relaxed/simple; bh=iI5tUM+wvRKl+6u+sdByVM+n9xG2zRFkrO8UBgUzFSM=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=m96qoUXVvXWDQB2LRvpT++hPA71++0ZMoE69otWwSS0PAeoe0o9JRlutO9EQ/McG0gcQBH0u59ZJpeESSAFxreCJ0D5u3aYbWSQXWvOrTF6n8XejIe/tquyU07H5j8sZRV4DOXcvisKjMjjftsaZQm8JSwtVLjc4PD+qry74KxM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=nkU7A+Cj; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nkU7A+Cj" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DE0B3C4CECE; Thu, 19 Dec 2024 19:42:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637360; bh=iI5tUM+wvRKl+6u+sdByVM+n9xG2zRFkrO8UBgUzFSM=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=nkU7A+CjJDAjznjgznYRt9EkOOsHaCcB+l0UnV0A138i32DYwSVOYzxLMz/h3m+dh sVGQ6XlnjzlZE4UzzreFb5+CUeTnLN1eRgqz4fipJ3Qb6E0BvkYPOArUqjOz3Kv0gD E6DYLsxFKQ21/bU3ont5fZX6IOdYNCDXY5dD0rwZSahKiwky7JJVcx8YFWaXlMa5eK NY142SfAFlXxXix0dMZqqg7+ha/cQQQgbB4ZKEqhLIMRTMn2BnR/JWjBhWkwqU8+dC Syl7NOmo1NS5Unq6c/I/cIugHTdc/38RBjtXCyY70YdOttuZFP+82Eu0KuXE6uWi4z gy8eb3ZOe62ow== Date: Thu, 19 Dec 2024 11:42:40 -0800 Subject: [PATCH 37/43] xfs: walk the rt reference count tree when rebuilding rmap From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581611.1572761.1500482126048341897.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong When we're rebuilding the data device rmap, if we encounter a "refcount" format fork, we have to walk the (realtime) refcount btree inode to build the appropriate mappings. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/rmap_repair.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/xfs/scrub/rmap_repair.c b/fs/xfs/scrub/rmap_repair.c index c2c7b76cc25ab8..f5f73078ffe29d 100644 --- a/fs/xfs/scrub/rmap_repair.c +++ b/fs/xfs/scrub/rmap_repair.c @@ -33,6 +33,7 @@ #include "xfs_ag.h" #include "xfs_rtrmap_btree.h" #include "xfs_rtgroup.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -519,6 +520,9 @@ xrep_rmap_scan_meta_btree( case XFS_METAFILE_RTRMAP: type = XFS_RTGI_RMAP; break; + case XFS_METAFILE_RTREFCOUNT: + type = XFS_RTGI_REFCOUNT; + break; default: ASSERT(0); return -EFSCORRUPTED; @@ -545,6 +549,9 @@ xrep_rmap_scan_meta_btree( case XFS_METAFILE_RTRMAP: cur = xfs_rtrmapbt_init_cursor(sc->tp, rtg); break; + case XFS_METAFILE_RTREFCOUNT: + cur = xfs_rtrefcountbt_init_cursor(sc->tp, rtg); + break; default: ASSERT(0); error = -EFSCORRUPTED; From patchwork Thu Dec 19 19:42:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915638 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0EA411990BA for ; Thu, 19 Dec 2024 19:42:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637377; cv=none; b=r8FBFn99uTE7w+ppyk8kXLL+qaFFuzYHTXS0//3mU75YX/s07qgWClBInP6P3BfY9jTYlKSa4+OrxQ1fJZtUrOEiZPgMNmWREPj/6BZ5W8nJ8mZRzWGM/Gwu1jBl5jGB1z1uUj3DUV80Ux3DqVkRtoOkPSpe2yxnNCgPaof1iLE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637377; c=relaxed/simple; bh=KrhceEEF88kH0EsL+4vhISSo4eFG5AeSgY7uNd/+kQU=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=BlajXpBwxvT3w+7UJ+i8X2+JLAO/5fBSJvLGrOsZp2bMljcCHNNyh2hwntzRmeWTkJpQBpojWDiD7THy+FjEst1CBFMIpB/9JKj1JodDxE4dW9UTJSssYdjkY2Rid+3QQsC3s5prdieHwBn7dhG+mK+NyuWCyhlo5XugL10yqSQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Y/XOROv0; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Y/XOROv0" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 82636C4CECE; Thu, 19 Dec 2024 19:42:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637376; bh=KrhceEEF88kH0EsL+4vhISSo4eFG5AeSgY7uNd/+kQU=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=Y/XOROv0HXR6JAB2OabEeqRReceuTDUyxBsI7REA2QIxEP6J39Zi0W5eJO3hrJ1ng 2Qy0IqMPBXmgsPM3rXwHl0T7idHarpzUijIqDNAKSTCvm6NH95dY91Z1Sw1ubHtOTj SneaTFVabLPi34lqq8XDmwLdYNoQX74kPXxAxOssZei2v4AeoNmr3+KUrd0HHdIksy 41ckalDEzgf5KIT4z1mzTfD/Ip2LtGGbtns2ATtCOC+Y1qSAMmT3vypysxcdrAtX0l YSaEVxlrdo51+lGwhOraydaSRej7TXfTn2Yb4y/yM+sggImENU3EZclzmwmFxqmxCv 92xm8NntM9GUg== Date: Thu, 19 Dec 2024 11:42:56 -0800 Subject: [PATCH 38/43] xfs: capture realtime CoW staging extents when rebuilding rt rmapbt From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581627.1572761.1416198499651301119.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Walk the realtime refcount btree to find the CoW staging extents when we're rebuilding the realtime rmap btree. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/repair.h | 1 fs/xfs/scrub/rgb_bitmap.h | 37 +++++++++++++++ fs/xfs/scrub/rtrmap_repair.c | 103 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 141 insertions(+) create mode 100644 fs/xfs/scrub/rgb_bitmap.h diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index ac5962732d269d..77343813205375 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -50,6 +50,7 @@ xrep_trans_commit( struct xbitmap; struct xagb_bitmap; +struct xrgb_bitmap; struct xfsb_bitmap; int xrep_fix_freelist(struct xfs_scrub *sc, int alloc_flags); diff --git a/fs/xfs/scrub/rgb_bitmap.h b/fs/xfs/scrub/rgb_bitmap.h new file mode 100644 index 00000000000000..4c3126b66dcb96 --- /dev/null +++ b/fs/xfs/scrub/rgb_bitmap.h @@ -0,0 +1,37 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2020-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_SCRUB_RGB_BITMAP_H__ +#define __XFS_SCRUB_RGB_BITMAP_H__ + +/* Bitmaps, but for type-checked for xfs_rgblock_t */ + +struct xrgb_bitmap { + struct xbitmap32 rgbitmap; +}; + +static inline void xrgb_bitmap_init(struct xrgb_bitmap *bitmap) +{ + xbitmap32_init(&bitmap->rgbitmap); +} + +static inline void xrgb_bitmap_destroy(struct xrgb_bitmap *bitmap) +{ + xbitmap32_destroy(&bitmap->rgbitmap); +} + +static inline int xrgb_bitmap_set(struct xrgb_bitmap *bitmap, + xfs_rgblock_t start, xfs_extlen_t len) +{ + return xbitmap32_set(&bitmap->rgbitmap, start, len); +} + +static inline int xrgb_bitmap_walk(struct xrgb_bitmap *bitmap, + xbitmap32_walk_fn fn, void *priv) +{ + return xbitmap32_walk(&bitmap->rgbitmap, fn, priv); +} + +#endif /* __XFS_SCRUB_RGB_BITMAP_H__ */ diff --git a/fs/xfs/scrub/rtrmap_repair.c b/fs/xfs/scrub/rtrmap_repair.c index 49de8bc2dd17f5..f2fdd7a9fc2483 100644 --- a/fs/xfs/scrub/rtrmap_repair.c +++ b/fs/xfs/scrub/rtrmap_repair.c @@ -30,6 +30,7 @@ #include "xfs_rtalloc.h" #include "xfs_ag.h" #include "xfs_rtgroup.h" +#include "xfs_refcount.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -38,6 +39,7 @@ #include "scrub/repair.h" #include "scrub/bitmap.h" #include "scrub/fsb_bitmap.h" +#include "scrub/rgb_bitmap.h" #include "scrub/xfile.h" #include "scrub/xfarray.h" #include "scrub/iscan.h" @@ -423,6 +425,100 @@ xrep_rtrmap_scan_ag( return error; } +struct xrep_rtrmap_stash_run { + struct xrep_rtrmap *rr; + uint64_t owner; +}; + +static int +xrep_rtrmap_stash_run( + uint32_t start, + uint32_t len, + void *priv) +{ + struct xrep_rtrmap_stash_run *rsr = priv; + struct xrep_rtrmap *rr = rsr->rr; + xfs_rgblock_t rgbno = start; + + return xrep_rtrmap_stash(rr, rgbno, len, rsr->owner, 0, 0); +} + +/* + * Emit rmaps for every extent of bits set in the bitmap. Caller must ensure + * that the ranges are in units of FS blocks. + */ +STATIC int +xrep_rtrmap_stash_bitmap( + struct xrep_rtrmap *rr, + struct xrgb_bitmap *bitmap, + const struct xfs_owner_info *oinfo) +{ + struct xrep_rtrmap_stash_run rsr = { + .rr = rr, + .owner = oinfo->oi_owner, + }; + + return xrgb_bitmap_walk(bitmap, xrep_rtrmap_stash_run, &rsr); +} + +/* Record a CoW staging extent. */ +STATIC int +xrep_rtrmap_walk_cowblocks( + struct xfs_btree_cur *cur, + const struct xfs_refcount_irec *irec, + void *priv) +{ + struct xrgb_bitmap *bitmap = priv; + + if (!xfs_refcount_check_domain(irec) || + irec->rc_domain != XFS_REFC_DOMAIN_COW) + return -EFSCORRUPTED; + + return xrgb_bitmap_set(bitmap, irec->rc_startblock, + irec->rc_blockcount); +} + +/* + * Collect rmaps for the blocks containing the refcount btree, and all CoW + * staging extents. + */ +STATIC int +xrep_rtrmap_find_refcount_rmaps( + struct xrep_rtrmap *rr) +{ + struct xrgb_bitmap cow_blocks; /* COWBIT */ + struct xfs_refcount_irec low = { + .rc_startblock = 0, + .rc_domain = XFS_REFC_DOMAIN_COW, + }; + struct xfs_refcount_irec high = { + .rc_startblock = -1U, + .rc_domain = XFS_REFC_DOMAIN_COW, + }; + struct xfs_scrub *sc = rr->sc; + int error; + + if (!xfs_has_rtreflink(sc->mp)) + return 0; + + xrgb_bitmap_init(&cow_blocks); + + /* Collect rmaps for CoW staging extents. */ + error = xfs_refcount_query_range(sc->sr.refc_cur, &low, &high, + xrep_rtrmap_walk_cowblocks, &cow_blocks); + if (error) + goto out_bitmap; + + /* Generate rmaps for everything. */ + error = xrep_rtrmap_stash_bitmap(rr, &cow_blocks, &XFS_RMAP_OINFO_COW); + if (error) + goto out_bitmap; + +out_bitmap: + xrgb_bitmap_destroy(&cow_blocks); + return error; +} + /* Count and check all collected records. */ STATIC int xrep_rtrmap_check_record( @@ -460,6 +556,13 @@ xrep_rtrmap_find_rmaps( return error; } + /* Find CoW staging extents. */ + xrep_rtgroup_btcur_init(sc, &sc->sr); + error = xrep_rtrmap_find_refcount_rmaps(rr); + xchk_rtgroup_btcur_free(&sc->sr); + if (error) + return error; + /* * Set up for a potentially lengthy filesystem scan by reducing our * transaction resource usage for the duration. Specifically: From patchwork Thu Dec 19 19:43:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915639 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A2E051990BA for ; Thu, 19 Dec 2024 19:43:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637392; cv=none; b=jPE1nQRqHjTfSVVol1/D5Rc/2tnDORk4J1MhuvHWLpLZyYDTwyaVuAClUlRavzdaiJbSzfu+7OZXvL+7zsAvb9fz3hdCq5/5SKnwExtA5jmhGLYQQgKkB0aACT1aRv3cYB0L76RSPJ6q0KTouuA0SO+QEdqpXgSWTHxy4Q8TxKI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637392; c=relaxed/simple; bh=EnpwTON1G1miU91XV5D8ICOybtdxw2PHyemvqDhVECk=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Cw8Lg+FbH1jfFfQ3nr80/UPiBuvbI4PpqIQ8F5aAPuICvhgMF/RRium34Hrcg5EvJoeVCTN0hg+fErhydZpKJIIvZS06I+cktCamzUsJrYiGHekTrazrhr2H2gmybBd1TVpWNZ+nHqjbWFRP00ZWX0I1Gy/+zh8lE4xjonJJKww= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=PaKA3+Kh; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="PaKA3+Kh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 24192C4CECE; Thu, 19 Dec 2024 19:43:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637392; bh=EnpwTON1G1miU91XV5D8ICOybtdxw2PHyemvqDhVECk=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=PaKA3+Kh7cA4i8eHXnMKvrE9fsN+mOB/tu4JGkSSl4m44gJsejT/jUZrucTPZfS6F hU0MNVOxKVVMUbLsyxI4koA1ghnKPCYA3utuh+UDycDaQrFnvIrHnoycbNbw2uLipN EVx719U9Gp0JewdWrkQksc4cIwBGogEL002KeFopO5zH7GwlxXMpvkAUCQnXGXdXTN Tqt8kdAbxduVutXKaEbBx3EgJyVF59o9ROnC8JyzeEZWPOGRHfZNg9nIKYA1xpvkjC 5l5aGqjh2AMt/k8B1WUzQRbUnGaJTXLi1kq2hxANtORIYnrzA8ju0RV3HmPrPvt718 NkjsStWfzc3rA== Date: Thu, 19 Dec 2024 11:43:11 -0800 Subject: [PATCH 39/43] xfs: online repair of the realtime refcount btree From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581644.1572761.15029168798580680389.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Port the data device's refcount btree repair code to the realtime refcount btree. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/Makefile | 1 fs/xfs/scrub/refcount_repair.c | 2 fs/xfs/scrub/repair.h | 5 fs/xfs/scrub/rtrefcount.c | 9 fs/xfs/scrub/rtrefcount_repair.c | 783 ++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/trace.h | 14 - 7 files changed, 809 insertions(+), 7 deletions(-) create mode 100644 fs/xfs/scrub/rtrefcount_repair.c diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 9dd9921e53567c..7afa51e414278e 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -236,6 +236,7 @@ xfs-y += $(addprefix scrub/, \ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \ rtbitmap_repair.o \ + rtrefcount_repair.o \ rtrmap_repair.o \ rtsummary_repair.o \ ) diff --git a/fs/xfs/scrub/refcount_repair.c b/fs/xfs/scrub/refcount_repair.c index 1ee6d4aeb308f5..9c8cb5332da042 100644 --- a/fs/xfs/scrub/refcount_repair.c +++ b/fs/xfs/scrub/refcount_repair.c @@ -189,7 +189,7 @@ xrep_refc_stash( if (error) return error; - trace_xrep_refc_found(sc->sa.pag, &irec); + trace_xrep_refc_found(pag_group(sc->sa.pag), &irec); return xfarray_append(rr->refcount_records, &irec); } diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index 77343813205375..8f8f18b48a449d 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -99,6 +99,7 @@ int xrep_setup_nlinks(struct xfs_scrub *sc); int xrep_setup_symlink(struct xfs_scrub *sc, unsigned int *resblks); int xrep_setup_dirtree(struct xfs_scrub *sc); int xrep_setup_rtrmapbt(struct xfs_scrub *sc); +int xrep_setup_rtrefcountbt(struct xfs_scrub *sc); /* Repair setup functions */ int xrep_setup_ag_allocbt(struct xfs_scrub *sc); @@ -158,11 +159,13 @@ int xrep_rtbitmap(struct xfs_scrub *sc); int xrep_rtsummary(struct xfs_scrub *sc); int xrep_rgsuperblock(struct xfs_scrub *sc); int xrep_rtrmapbt(struct xfs_scrub *sc); +int xrep_rtrefcountbt(struct xfs_scrub *sc); #else # define xrep_rtbitmap xrep_notsupported # define xrep_rtsummary xrep_notsupported # define xrep_rgsuperblock xrep_notsupported # define xrep_rtrmapbt xrep_notsupported +# define xrep_rtrefcountbt xrep_notsupported #endif /* CONFIG_XFS_RT */ #ifdef CONFIG_XFS_QUOTA @@ -236,6 +239,7 @@ xrep_setup_nothing( #define xrep_setup_dirtree xrep_setup_nothing #define xrep_setup_metapath xrep_setup_nothing #define xrep_setup_rtrmapbt xrep_setup_nothing +#define xrep_setup_rtrefcountbt xrep_setup_nothing #define xrep_setup_inode(sc, imap) ((void)0) @@ -274,6 +278,7 @@ static inline int xrep_setup_symlink(struct xfs_scrub *sc, unsigned int *x) #define xrep_metapath xrep_notsupported #define xrep_rgsuperblock xrep_notsupported #define xrep_rtrmapbt xrep_notsupported +#define xrep_rtrefcountbt xrep_notsupported #endif /* CONFIG_XFS_ONLINE_REPAIR */ diff --git a/fs/xfs/scrub/rtrefcount.c b/fs/xfs/scrub/rtrefcount.c index 18c9bcb0e82184..4c5dffc73641b7 100644 --- a/fs/xfs/scrub/rtrefcount.c +++ b/fs/xfs/scrub/rtrefcount.c @@ -7,8 +7,10 @@ #include "xfs_fs.h" #include "xfs_shared.h" #include "xfs_format.h" +#include "xfs_log_format.h" #include "xfs_trans_resv.h" #include "xfs_mount.h" +#include "xfs_trans.h" #include "xfs_btree.h" #include "xfs_rmap.h" #include "xfs_refcount.h" @@ -21,6 +23,7 @@ #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/btree.h" +#include "scrub/repair.h" /* Set us up with the realtime refcount metadata locked. */ int @@ -32,6 +35,12 @@ xchk_setup_rtrefcountbt( if (xchk_need_intent_drain(sc)) xchk_fsgates_enable(sc, XCHK_FSGATES_DRAIN); + if (xchk_could_repair(sc)) { + error = xrep_setup_rtrefcountbt(sc); + if (error) + return error; + } + error = xchk_rtgroup_init(sc, sc->sm->sm_agno, &sc->sr); if (error) return error; diff --git a/fs/xfs/scrub/rtrefcount_repair.c b/fs/xfs/scrub/rtrefcount_repair.c new file mode 100644 index 00000000000000..257cfb24beb428 --- /dev/null +++ b/fs/xfs/scrub/rtrefcount_repair.c @@ -0,0 +1,783 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_defer.h" +#include "xfs_btree.h" +#include "xfs_btree_staging.h" +#include "xfs_bit.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_sb.h" +#include "xfs_alloc.h" +#include "xfs_ialloc.h" +#include "xfs_rmap.h" +#include "xfs_rmap_btree.h" +#include "xfs_rtrmap_btree.h" +#include "xfs_refcount.h" +#include "xfs_rtrefcount_btree.h" +#include "xfs_error.h" +#include "xfs_health.h" +#include "xfs_inode.h" +#include "xfs_quota.h" +#include "xfs_rtalloc.h" +#include "xfs_ag.h" +#include "xfs_rtgroup.h" +#include "xfs_rtbitmap.h" +#include "scrub/xfs_scrub.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/btree.h" +#include "scrub/trace.h" +#include "scrub/repair.h" +#include "scrub/bitmap.h" +#include "scrub/fsb_bitmap.h" +#include "scrub/xfile.h" +#include "scrub/xfarray.h" +#include "scrub/newbt.h" +#include "scrub/reap.h" +#include "scrub/rcbag.h" + +/* + * Rebuilding the Reference Count Btree + * ==================================== + * + * This algorithm is "borrowed" from xfs_repair. Imagine the rmap + * entries as rectangles representing extents of physical blocks, and + * that the rectangles can be laid down to allow them to overlap each + * other; then we know that we must emit a refcnt btree entry wherever + * the amount of overlap changes, i.e. the emission stimulus is + * level-triggered: + * + * - --- + * -- ----- ---- --- ------ + * -- ---- ----------- ---- --------- + * -------------------------------- ----------- + * ^ ^ ^^ ^^ ^ ^^ ^^^ ^^^^ ^ ^^ ^ ^ ^ + * 2 1 23 21 3 43 234 2123 1 01 2 3 0 + * + * For our purposes, a rmap is a tuple (startblock, len, fileoff, owner). + * + * Note that in the actual refcnt btree we don't store the refcount < 2 + * cases because the bnobt tells us which blocks are free; single-use + * blocks aren't recorded in the bnobt or the refcntbt. If the rmapbt + * supports storing multiple entries covering a given block we could + * theoretically dispense with the refcntbt and simply count rmaps, but + * that's inefficient in the (hot) write path, so we'll take the cost of + * the extra tree to save time. Also there's no guarantee that rmap + * will be enabled. + * + * Given an array of rmaps sorted by physical block number, a starting + * physical block (sp), a bag to hold rmaps that cover sp, and the next + * physical block where the level changes (np), we can reconstruct the + * rt refcount btree as follows: + * + * While there are still unprocessed rmaps in the array, + * - Set sp to the physical block (pblk) of the next unprocessed rmap. + * - Add to the bag all rmaps in the array where startblock == sp. + * - Set np to the physical block where the bag size will change. This + * is the minimum of (the pblk of the next unprocessed rmap) and + * (startblock + len of each rmap in the bag). + * - Record the bag size as old_bag_size. + * + * - While the bag isn't empty, + * - Remove from the bag all rmaps where startblock + len == np. + * - Add to the bag all rmaps in the array where startblock == np. + * - If the bag size isn't old_bag_size, store the refcount entry + * (sp, np - sp, bag_size) in the refcnt btree. + * - If the bag is empty, break out of the inner loop. + * - Set old_bag_size to the bag size + * - Set sp = np. + * - Set np to the physical block where the bag size will change. + * This is the minimum of (the pblk of the next unprocessed rmap) + * and (startblock + len of each rmap in the bag). + * + * Like all the other repairers, we make a list of all the refcount + * records we need, then reinitialize the rt refcount btree root and + * insert all the records. + */ + +struct xrep_rtrefc { + /* refcount extents */ + struct xfarray *refcount_records; + + /* new refcountbt information */ + struct xrep_newbt new_btree; + + /* old refcountbt blocks */ + struct xfsb_bitmap old_rtrefcountbt_blocks; + + struct xfs_scrub *sc; + + /* get_records()'s position in the rt refcount record array. */ + xfarray_idx_t array_cur; + + /* # of refcountbt blocks */ + xfs_filblks_t btblocks; +}; + +/* Set us up to repair refcount btrees. */ +int +xrep_setup_rtrefcountbt( + struct xfs_scrub *sc) +{ + char *descr; + int error; + + descr = xchk_xfile_ag_descr(sc, "rmap record bag"); + error = xrep_setup_xfbtree(sc, descr); + kfree(descr); + return error; +} + +/* Check for any obvious conflicts with this shared/CoW staging extent. */ +STATIC int +xrep_rtrefc_check_ext( + struct xfs_scrub *sc, + const struct xfs_refcount_irec *rec) +{ + xfs_rgblock_t last; + + if (xfs_rtrefcount_check_irec(sc->sr.rtg, rec) != NULL) + return -EFSCORRUPTED; + + if (xfs_rgbno_to_rtxoff(sc->mp, rec->rc_startblock) != 0) + return -EFSCORRUPTED; + + last = rec->rc_startblock + rec->rc_blockcount - 1; + if (xfs_rgbno_to_rtxoff(sc->mp, last) != sc->mp->m_sb.sb_rextsize - 1) + return -EFSCORRUPTED; + + /* Make sure this isn't free space or misaligned. */ + return xrep_require_rtext_inuse(sc, rec->rc_startblock, + rec->rc_blockcount); +} + +/* Record a reference count extent. */ +STATIC int +xrep_rtrefc_stash( + struct xrep_rtrefc *rr, + enum xfs_refc_domain domain, + xfs_rgblock_t bno, + xfs_extlen_t len, + uint64_t refcount) +{ + struct xfs_refcount_irec irec = { + .rc_startblock = bno, + .rc_blockcount = len, + .rc_refcount = refcount, + .rc_domain = domain, + }; + int error = 0; + + if (xchk_should_terminate(rr->sc, &error)) + return error; + + irec.rc_refcount = min_t(uint64_t, XFS_REFC_REFCOUNT_MAX, refcount); + + error = xrep_rtrefc_check_ext(rr->sc, &irec); + if (error) + return error; + + trace_xrep_refc_found(rtg_group(rr->sc->sr.rtg), &irec); + + return xfarray_append(rr->refcount_records, &irec); +} + +/* Record a CoW staging extent. */ +STATIC int +xrep_rtrefc_stash_cow( + struct xrep_rtrefc *rr, + xfs_rgblock_t bno, + xfs_extlen_t len) +{ + return xrep_rtrefc_stash(rr, XFS_REFC_DOMAIN_COW, bno, len, 1); +} + +/* Decide if an rmap could describe a shared extent. */ +static inline bool +xrep_rtrefc_rmap_shareable( + const struct xfs_rmap_irec *rmap) +{ + /* rt metadata are never sharable */ + if (XFS_RMAP_NON_INODE_OWNER(rmap->rm_owner)) + return false; + + /* Unwritten file blocks are not shareable. */ + if (rmap->rm_flags & XFS_RMAP_UNWRITTEN) + return false; + + return true; +} + +/* Grab the next (abbreviated) rmap record from the rmapbt. */ +STATIC int +xrep_rtrefc_walk_rmaps( + struct xrep_rtrefc *rr, + struct xfs_rmap_irec *rmap, + bool *have_rec) +{ + struct xfs_btree_cur *cur = rr->sc->sr.rmap_cur; + struct xfs_mount *mp = cur->bc_mp; + int have_gt; + int error = 0; + + *have_rec = false; + + /* + * Loop through the remaining rmaps. Remember CoW staging + * extents and the refcountbt blocks from the old tree for later + * disposal. We can only share written data fork extents, so + * keep looping until we find an rmap for one. + */ + do { + if (xchk_should_terminate(rr->sc, &error)) + return error; + + error = xfs_btree_increment(cur, 0, &have_gt); + if (error) + return error; + if (!have_gt) + return 0; + + error = xfs_rmap_get_rec(cur, rmap, &have_gt); + if (error) + return error; + if (XFS_IS_CORRUPT(mp, !have_gt)) { + xfs_btree_mark_sick(cur); + return -EFSCORRUPTED; + } + + if (rmap->rm_owner == XFS_RMAP_OWN_COW) { + error = xrep_rtrefc_stash_cow(rr, rmap->rm_startblock, + rmap->rm_blockcount); + if (error) + return error; + } else if (xfs_is_sb_inum(mp, rmap->rm_owner) || + (rmap->rm_flags & (XFS_RMAP_ATTR_FORK | + XFS_RMAP_BMBT_BLOCK))) { + xfs_btree_mark_sick(cur); + return -EFSCORRUPTED; + } + } while (!xrep_rtrefc_rmap_shareable(rmap)); + + *have_rec = true; + return 0; +} + +static inline uint32_t +xrep_rtrefc_encode_startblock( + const struct xfs_refcount_irec *irec) +{ + uint32_t start; + + start = irec->rc_startblock & ~XFS_REFC_COWFLAG; + if (irec->rc_domain == XFS_REFC_DOMAIN_COW) + start |= XFS_REFC_COWFLAG; + + return start; +} + +/* + * Compare two refcount records. We want to sort in order of increasing block + * number. + */ +static int +xrep_rtrefc_extent_cmp( + const void *a, + const void *b) +{ + const struct xfs_refcount_irec *ap = a; + const struct xfs_refcount_irec *bp = b; + uint32_t sa, sb; + + sa = xrep_rtrefc_encode_startblock(ap); + sb = xrep_rtrefc_encode_startblock(bp); + + if (sa > sb) + return 1; + if (sa < sb) + return -1; + return 0; +} + +/* + * Sort the refcount extents by startblock or else the btree records will be in + * the wrong order. Make sure the records do not overlap in physical space. + */ +STATIC int +xrep_rtrefc_sort_records( + struct xrep_rtrefc *rr) +{ + struct xfs_refcount_irec irec; + xfarray_idx_t cur; + enum xfs_refc_domain dom = XFS_REFC_DOMAIN_SHARED; + xfs_rgblock_t next_rgbno = 0; + int error; + + error = xfarray_sort(rr->refcount_records, xrep_rtrefc_extent_cmp, + XFARRAY_SORT_KILLABLE); + if (error) + return error; + + foreach_xfarray_idx(rr->refcount_records, cur) { + if (xchk_should_terminate(rr->sc, &error)) + return error; + + error = xfarray_load(rr->refcount_records, cur, &irec); + if (error) + return error; + + if (dom == XFS_REFC_DOMAIN_SHARED && + irec.rc_domain == XFS_REFC_DOMAIN_COW) { + dom = irec.rc_domain; + next_rgbno = 0; + } + + if (dom != irec.rc_domain) + return -EFSCORRUPTED; + if (irec.rc_startblock < next_rgbno) + return -EFSCORRUPTED; + + next_rgbno = irec.rc_startblock + irec.rc_blockcount; + } + + return error; +} + +/* Record extents that belong to the realtime refcount inode. */ +STATIC int +xrep_rtrefc_walk_rmap( + struct xfs_btree_cur *cur, + const struct xfs_rmap_irec *rec, + void *priv) +{ + struct xrep_rtrefc *rr = priv; + int error = 0; + + if (xchk_should_terminate(rr->sc, &error)) + return error; + + /* Skip extents which are not owned by this inode and fork. */ + if (rec->rm_owner != rr->sc->ip->i_ino) + return 0; + + error = xrep_check_ino_btree_mapping(rr->sc, rec); + if (error) + return error; + + return xfsb_bitmap_set(&rr->old_rtrefcountbt_blocks, + xfs_gbno_to_fsb(cur->bc_group, rec->rm_startblock), + rec->rm_blockcount); +} + +/* + * Walk forward through the rmap btree to collect all rmaps starting at + * @bno in @rmap_bag. These represent the file(s) that share ownership of + * the current block. Upon return, the rmap cursor points to the last record + * satisfying the startblock constraint. + */ +static int +xrep_rtrefc_push_rmaps_at( + struct xrep_rtrefc *rr, + struct rcbag *rcstack, + xfs_rgblock_t bno, + struct xfs_rmap_irec *rmap, + bool *have) +{ + struct xfs_scrub *sc = rr->sc; + int have_gt; + int error; + + while (*have && rmap->rm_startblock == bno) { + error = rcbag_add(rcstack, rr->sc->tp, rmap); + if (error) + return error; + + error = xrep_rtrefc_walk_rmaps(rr, rmap, have); + if (error) + return error; + } + + error = xfs_btree_decrement(sc->sr.rmap_cur, 0, &have_gt); + if (error) + return error; + if (XFS_IS_CORRUPT(sc->mp, !have_gt)) { + xfs_btree_mark_sick(sc->sr.rmap_cur); + return -EFSCORRUPTED; + } + + return 0; +} + +/* Scan one AG for reverse mappings for the realtime refcount btree. */ +STATIC int +xrep_rtrefc_scan_ag( + struct xrep_rtrefc *rr, + struct xfs_perag *pag) +{ + struct xfs_scrub *sc = rr->sc; + int error; + + error = xrep_ag_init(sc, pag, &sc->sa); + if (error) + return error; + + error = xfs_rmap_query_all(sc->sa.rmap_cur, xrep_rtrefc_walk_rmap, rr); + xchk_ag_free(sc, &sc->sa); + return error; +} + +/* Iterate all the rmap records to generate reference count data. */ +STATIC int +xrep_rtrefc_find_refcounts( + struct xrep_rtrefc *rr) +{ + struct xfs_scrub *sc = rr->sc; + struct rcbag *rcstack; + struct xfs_perag *pag = NULL; + uint64_t old_stack_height; + xfs_rgblock_t sbno; + xfs_rgblock_t cbno; + xfs_rgblock_t nbno; + bool have; + int error; + + /* Scan for old rtrefc btree blocks. */ + while ((pag = xfs_perag_next(sc->mp, pag))) { + error = xrep_rtrefc_scan_ag(rr, pag); + if (error) { + xfs_perag_rele(pag); + return error; + } + } + + xrep_rtgroup_btcur_init(sc, &sc->sr); + + /* + * Set up a bag to store all the rmap records that we're tracking to + * generate a reference count record. If this exceeds + * XFS_REFC_REFCOUNT_MAX, we clamp rc_refcount. + */ + error = rcbag_init(sc->mp, sc->xmbtp, &rcstack); + if (error) + goto out_cur; + + /* Start the rtrmapbt cursor to the left of all records. */ + error = xfs_btree_goto_left_edge(sc->sr.rmap_cur); + if (error) + goto out_bag; + + /* Process reverse mappings into refcount data. */ + while (xfs_btree_has_more_records(sc->sr.rmap_cur)) { + struct xfs_rmap_irec rmap; + + /* Push all rmaps with pblk == sbno onto the stack */ + error = xrep_rtrefc_walk_rmaps(rr, &rmap, &have); + if (error) + goto out_bag; + if (!have) + break; + sbno = cbno = rmap.rm_startblock; + error = xrep_rtrefc_push_rmaps_at(rr, rcstack, sbno, &rmap, + &have); + if (error) + goto out_bag; + + /* Set nbno to the bno of the next refcount change */ + error = rcbag_next_edge(rcstack, sc->tp, &rmap, have, &nbno); + if (error) + goto out_bag; + + ASSERT(nbno > sbno); + old_stack_height = rcbag_count(rcstack); + + /* While stack isn't empty... */ + while (rcbag_count(rcstack) > 0) { + /* Pop all rmaps that end at nbno */ + error = rcbag_remove_ending_at(rcstack, sc->tp, nbno); + if (error) + goto out_bag; + + /* Push array items that start at nbno */ + error = xrep_rtrefc_walk_rmaps(rr, &rmap, &have); + if (error) + goto out_bag; + if (have) { + error = xrep_rtrefc_push_rmaps_at(rr, rcstack, + nbno, &rmap, &have); + if (error) + goto out_bag; + } + + /* Emit refcount if necessary */ + ASSERT(nbno > cbno); + if (rcbag_count(rcstack) != old_stack_height) { + if (old_stack_height > 1) { + error = xrep_rtrefc_stash(rr, + XFS_REFC_DOMAIN_SHARED, + cbno, nbno - cbno, + old_stack_height); + if (error) + goto out_bag; + } + cbno = nbno; + } + + /* Stack empty, go find the next rmap */ + if (rcbag_count(rcstack) == 0) + break; + old_stack_height = rcbag_count(rcstack); + sbno = nbno; + + /* Set nbno to the bno of the next refcount change */ + error = rcbag_next_edge(rcstack, sc->tp, &rmap, have, + &nbno); + if (error) + goto out_bag; + + ASSERT(nbno > sbno); + } + } + + ASSERT(rcbag_count(rcstack) == 0); +out_bag: + rcbag_free(&rcstack); +out_cur: + xchk_rtgroup_btcur_free(&sc->sr); + return error; +} + +/* Retrieve refcountbt data for bulk load. */ +STATIC int +xrep_rtrefc_get_records( + struct xfs_btree_cur *cur, + unsigned int idx, + struct xfs_btree_block *block, + unsigned int nr_wanted, + void *priv) +{ + struct xrep_rtrefc *rr = priv; + union xfs_btree_rec *block_rec; + unsigned int loaded; + int error; + + for (loaded = 0; loaded < nr_wanted; loaded++, idx++) { + error = xfarray_load(rr->refcount_records, rr->array_cur++, + &cur->bc_rec.rc); + if (error) + return error; + + block_rec = xfs_btree_rec_addr(cur, idx, block); + cur->bc_ops->init_rec_from_cur(cur, block_rec); + } + + return loaded; +} + +/* Feed one of the new btree blocks to the bulk loader. */ +STATIC int +xrep_rtrefc_claim_block( + struct xfs_btree_cur *cur, + union xfs_btree_ptr *ptr, + void *priv) +{ + struct xrep_rtrefc *rr = priv; + + return xrep_newbt_claim_block(cur, &rr->new_btree, ptr); +} + +/* Figure out how much space we need to create the incore btree root block. */ +STATIC size_t +xrep_rtrefc_iroot_size( + struct xfs_btree_cur *cur, + unsigned int level, + unsigned int nr_this_level, + void *priv) +{ + return xfs_rtrefcount_broot_space_calc(cur->bc_mp, level, + nr_this_level); +} + +/* + * Use the collected refcount information to stage a new rt refcount btree. If + * this is successful we'll return with the new btree root information logged + * to the repair transaction but not yet committed. + */ +STATIC int +xrep_rtrefc_build_new_tree( + struct xrep_rtrefc *rr) +{ + struct xfs_scrub *sc = rr->sc; + struct xfs_rtgroup *rtg = sc->sr.rtg; + struct xfs_btree_cur *refc_cur; + int error; + + error = xrep_rtrefc_sort_records(rr); + if (error) + return error; + + /* + * Prepare to construct the new btree by reserving disk space for the + * new btree and setting up all the accounting information we'll need + * to root the new btree while it's under construction and before we + * attach it to the realtime refcount inode. + */ + error = xrep_newbt_init_metadir_inode(&rr->new_btree, sc); + if (error) + return error; + + rr->new_btree.bload.get_records = xrep_rtrefc_get_records; + rr->new_btree.bload.claim_block = xrep_rtrefc_claim_block; + rr->new_btree.bload.iroot_size = xrep_rtrefc_iroot_size; + + refc_cur = xfs_rtrefcountbt_init_cursor(NULL, rtg); + xfs_btree_stage_ifakeroot(refc_cur, &rr->new_btree.ifake); + + /* Compute how many blocks we'll need. */ + error = xfs_btree_bload_compute_geometry(refc_cur, &rr->new_btree.bload, + xfarray_length(rr->refcount_records)); + if (error) + goto err_cur; + + /* Last chance to abort before we start committing fixes. */ + if (xchk_should_terminate(sc, &error)) + goto err_cur; + + /* + * Guess how many blocks we're going to need to rebuild an entire + * rtrefcountbt from the number of extents we found, and pump up our + * transaction to have sufficient block reservation. We're allowed + * to exceed quota to repair inconsistent metadata, though this is + * unlikely. + */ + error = xfs_trans_reserve_more_inode(sc->tp, rtg_refcount(rtg), + rr->new_btree.bload.nr_blocks, 0, true); + if (error) + goto err_cur; + + /* Reserve the space we'll need for the new btree. */ + error = xrep_newbt_alloc_blocks(&rr->new_btree, + rr->new_btree.bload.nr_blocks); + if (error) + goto err_cur; + + /* Add all observed refcount records. */ + rr->new_btree.ifake.if_fork->if_format = XFS_DINODE_FMT_META_BTREE; + rr->array_cur = XFARRAY_CURSOR_INIT; + error = xfs_btree_bload(refc_cur, &rr->new_btree.bload, rr); + if (error) + goto err_cur; + + /* + * Install the new rtrefc btree in the inode. After this point the old + * btree is no longer accessible, the new tree is live, and we can + * delete the cursor. + */ + xfs_rtrefcountbt_commit_staged_btree(refc_cur, sc->tp); + xrep_inode_set_nblocks(rr->sc, rr->new_btree.ifake.if_blocks); + xfs_btree_del_cursor(refc_cur, 0); + + /* Dispose of any unused blocks and the accounting information. */ + error = xrep_newbt_commit(&rr->new_btree); + if (error) + return error; + + return xrep_roll_trans(sc); +err_cur: + xfs_btree_del_cursor(refc_cur, error); + xrep_newbt_cancel(&rr->new_btree); + return error; +} + +/* + * Now that we've logged the roots of the new btrees, invalidate all of the + * old blocks and free them. + */ +STATIC int +xrep_rtrefc_remove_old_tree( + struct xrep_rtrefc *rr) +{ + int error; + + /* + * Free all the extents that were allocated to the former rtrefcountbt + * and aren't cross-linked with something else. + */ + error = xrep_reap_metadir_fsblocks(rr->sc, + &rr->old_rtrefcountbt_blocks); + if (error) + return error; + + /* + * Ensure the proper reservation for the rtrefcount inode so that we + * don't fail to expand the btree. + */ + return xrep_reset_metafile_resv(rr->sc); +} + +/* Rebuild the rt refcount btree. */ +int +xrep_rtrefcountbt( + struct xfs_scrub *sc) +{ + struct xrep_rtrefc *rr; + struct xfs_mount *mp = sc->mp; + char *descr; + int error; + + /* We require the rmapbt to rebuild anything. */ + if (!xfs_has_rtrmapbt(mp)) + return -EOPNOTSUPP; + + /* Make sure any problems with the fork are fixed. */ + error = xrep_metadata_inode_forks(sc); + if (error) + return error; + + rr = kzalloc(sizeof(struct xrep_rtrefc), XCHK_GFP_FLAGS); + if (!rr) + return -ENOMEM; + rr->sc = sc; + + /* Set up enough storage to handle one refcount record per rt extent. */ + descr = xchk_xfile_ag_descr(sc, "reference count records"); + error = xfarray_create(descr, mp->m_sb.sb_rextents, + sizeof(struct xfs_refcount_irec), + &rr->refcount_records); + kfree(descr); + if (error) + goto out_rr; + + /* Collect all reference counts. */ + xfsb_bitmap_init(&rr->old_rtrefcountbt_blocks); + error = xrep_rtrefc_find_refcounts(rr); + if (error) + goto out_bitmap; + + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* Rebuild the refcount information. */ + error = xrep_rtrefc_build_new_tree(rr); + if (error) + goto out_bitmap; + + /* Kill the old tree. */ + error = xrep_rtrefc_remove_old_tree(rr); + if (error) + goto out_bitmap; + +out_bitmap: + xfsb_bitmap_destroy(&rr->old_rtrefcountbt_blocks); + xfarray_destroy(rr->refcount_records); +out_rr: + kfree(rr); + return error; +} diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 6e31f12cef4cc9..7567dd5cad14f4 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -472,7 +472,7 @@ static const struct xchk_meta_ops meta_scrub_ops[] = { .setup = xchk_setup_rtrefcountbt, .scrub = xchk_rtrefcountbt, .has = xfs_has_rtreflink, - .repair = xrep_notsupported, + .repair = xrep_rtrefcountbt, }, }; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 289e39d1f418ba..56811862aa8226 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2116,29 +2116,33 @@ TRACE_EVENT(xrep_ibt_found, ) TRACE_EVENT(xrep_refc_found, - TP_PROTO(const struct xfs_perag *pag, + TP_PROTO(const struct xfs_group *xg, const struct xfs_refcount_irec *rec), - TP_ARGS(pag, rec), + TP_ARGS(xg, rec), TP_STRUCT__entry( __field(dev_t, dev) __field(xfs_agnumber_t, agno) __field(enum xfs_refc_domain, domain) + __field(enum xfs_group_type, type) __field(xfs_agblock_t, startblock) __field(xfs_extlen_t, blockcount) __field(xfs_nlink_t, refcount) ), TP_fast_assign( - __entry->dev = pag_mount(pag)->m_super->s_dev; - __entry->agno = pag_agno(pag); + __entry->dev = xg->xg_mount->m_super->s_dev; + __entry->agno = xg->xg_gno; + __entry->type = xg->xg_type; __entry->domain = rec->rc_domain; __entry->startblock = rec->rc_startblock; __entry->blockcount = rec->rc_blockcount; __entry->refcount = rec->rc_refcount; ), - TP_printk("dev %d:%d agno 0x%x dom %s agbno 0x%x fsbcount 0x%x refcount %u", + TP_printk("dev %d:%d %sno 0x%x dom %s %sbno 0x%x fsbcount 0x%x refcount %u", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, __print_symbolic(__entry->domain, XFS_REFC_DOMAIN_STRINGS), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->startblock, __entry->blockcount, __entry->refcount) From patchwork Thu Dec 19 19:43:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915640 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EF6161990BA for ; Thu, 19 Dec 2024 19:43:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637408; cv=none; b=Wavf0BEGpSZTtY+kuZ7j0TDslwaSZV8hdfEEon3sfd6psrJwrC7WHnaMtllKna1ZpacJ3bARPCHIkNroWCyTTXmf47xlkenHfTu0DcymYrvrVauXOyxPxgP6p9VPovb11d9A76mvlK7WnPW2uQYfZpCMTReFidpwru16EJ3uNdY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637408; c=relaxed/simple; bh=N6jhuOH6nAirVOJnHVL7sFwfhfG1pR5ZFrgNuoV274I=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=cxOyGjnCpwUWYK7wv3EpxtdqOA69wSSm7xw9d7W07s5w52Gp6IYs1uKfiy3Nl0Gk7fMm6bxLczwS/OB5Wz0Ylj4MIiZggCynSXvm3upEfKNZsNL8gk7jL9YAqBY2IP6IJCDCli7z3EGaR7tTiZRvFI42Ktt1okloInp1CjxJC70= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=udvIOesi; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="udvIOesi" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C4881C4CECE; Thu, 19 Dec 2024 19:43:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637407; bh=N6jhuOH6nAirVOJnHVL7sFwfhfG1pR5ZFrgNuoV274I=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=udvIOesivplqp0i7sAKbLDMhCtMvdOsFRxhXU/bm+/DJSIHoRujecPPvahoWkbt/w 0aT3b/WTb0Fs5JDl8kgim7WmLRayfQyUUm5bQNiq7SgF/nfInvvZ5Jf3g3aUr/FgQp GUNit2h2/w9p51Ml7Ko2uKhHavZkxB5FGfXUa599csgvt8ikDa8LPXN0umc3sPIPXW 4xs17bxKrvaULiznz8uV9OrtRfP7NvPb2MH4m5uYJXYaAw2AqUFgBYkGIMke7dBIgN ugxYbzg/utzhi3hOKvMfsUtYXmVoGeQJxxQBuOPFQIcDDADOkJ3ImaFtLn6QNGKId1 6RC5hFrYOb+wA== Date: Thu, 19 Dec 2024 11:43:27 -0800 Subject: [PATCH 40/43] xfs: repair inodes that have a refcount btree in the data fork From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581662.1572761.7121669588497325519.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Plumb knowledge of refcount btrees into the inode core repair code. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/inode_repair.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index 701baee144a66e..2f641b6d663eb2 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -40,6 +40,7 @@ #include "xfs_symlink_remote.h" #include "xfs_rtgroup.h" #include "xfs_rtrmap_btree.h" +#include "xfs_rtrefcount_btree.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -970,6 +971,34 @@ xrep_dinode_bad_rtrmapbt_fork( return false; } +/* Return true if this refcount-format ifork looks like garbage. */ +STATIC bool +xrep_dinode_bad_rtrefcountbt_fork( + struct xfs_scrub *sc, + struct xfs_dinode *dip, + unsigned int dfork_size) +{ + struct xfs_rtrefcount_root *dfp; + unsigned int nrecs; + unsigned int level; + + if (dfork_size < sizeof(struct xfs_rtrefcount_root)) + return true; + + dfp = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + nrecs = be16_to_cpu(dfp->bb_numrecs); + level = be16_to_cpu(dfp->bb_level); + + if (level > sc->mp->m_rtrefc_maxlevels) + return true; + if (xfs_rtrefcount_droot_space_calc(level, nrecs) > dfork_size) + return true; + if (level > 0 && nrecs == 0) + return true; + + return false; +} + /* Check a metadata-btree fork. */ STATIC bool xrep_dinode_bad_metabt_fork( @@ -984,6 +1013,8 @@ xrep_dinode_bad_metabt_fork( switch (be16_to_cpu(dip->di_metatype)) { case XFS_METAFILE_RTRMAP: return xrep_dinode_bad_rtrmapbt_fork(sc, dip, dfork_size); + case XFS_METAFILE_RTREFCOUNT: + return xrep_dinode_bad_rtrefcountbt_fork(sc, dip, dfork_size); default: return true; } @@ -1249,6 +1280,7 @@ xrep_dinode_ensure_forkoff( { struct xfs_bmdr_block *bmdr; struct xfs_rtrmap_root *rmdr; + struct xfs_rtrefcount_root *rcdr; struct xfs_scrub *sc = ri->sc; xfs_extnum_t attr_extents, data_extents; size_t bmdr_minsz = xfs_bmdr_space_calc(1); @@ -1361,6 +1393,10 @@ xrep_dinode_ensure_forkoff( rmdr = XFS_DFORK_PTR(dip, XFS_DATA_FORK); dfork_min = xfs_rtrmap_broot_space(sc->mp, rmdr); break; + case XFS_METAFILE_RTREFCOUNT: + rcdr = XFS_DFORK_PTR(dip, XFS_DATA_FORK); + dfork_min = xfs_rtrefcount_broot_space(sc->mp, rcdr); + break; default: dfork_min = 0; break; From patchwork Thu Dec 19 19:43:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915641 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EF6981990BA for ; Thu, 19 Dec 2024 19:43:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637424; cv=none; b=krdd2uixMpE++23jDLtyjuR/5hrZSiapR4fGfmTDQ2jVVJK/ooYCur6L3ubBq5SDjwZ8+WaUt8eB+DFjIsVklaYy4w/jmu2OLS+Pn9XqA7pZg7wIdftpK2vchUfcrRTVDL6v1cplnsNCs0SP13HrbozI+GRUfulc5DWIseVr7wA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637424; c=relaxed/simple; bh=mW4i29yb2BnpL8CGSPV1NoxbmzQce5kgH58agdHxmFs=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=HyyzgIJsCCP3Cvmr9mcDFfNeFP7XBle2JrpTKlPIVU4zSCOrOY1qfUyJ5ZfcUUGowqfdH1SfNRsvDuop7eEovWSO+s13ZeX8jTa9XkMI+ag+u9zzy6q0vNKLuy5E6pAcmm+WZ+D55L8XuLfe0y0IFJ1w1n1vqMmm8sv7pWl8A7U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MkR6KoEg; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MkR6KoEg" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 68E62C4CECE; Thu, 19 Dec 2024 19:43:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637423; bh=mW4i29yb2BnpL8CGSPV1NoxbmzQce5kgH58agdHxmFs=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=MkR6KoEg5SM3CgEBMI4N8gvOTp7Nd6bOgna6KkuVIF5kWfoMLyw3ch6n1t5J83Uau KWa4eWXdO1iVnJqlGREpDY5sVo6+I7Ejm8X7duC2QOWKjmj+mmdr/YXu8dhVx9Qj0+ 3ZUjsse64NDWRQD6hS2L/oa+VdWB9GWNImdBOcpvwEobAnDSusr9imspkt9MYf5qRg uFrPzEkLGpeFYnIY1/lMVAh7B9cnRreVH3HLcXKxSYRYKhP6FKYFQJjITh8/vNcSdv +k9XjcPdjWyZOzDxsWZOyqVf8SR7m7dvIwM4aJYVvE0oKwa3Zq1EvxYI4H5PuZYgtx HCP8TONhWlAzg== Date: Thu, 19 Dec 2024 11:43:42 -0800 Subject: [PATCH 41/43] xfs: check for shared rt extents when rebuilding rt file's data fork From: "Darrick J. Wong" To: djwong@kernel.org Cc: hch@lst.de, linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581679.1572761.14901019904343071334.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong When we're rebuilding the data fork of a realtime file, we need to cross-reference each mapping with the rt refcount btree to ensure that the reflink flag is set if there are any shared extents found. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/bmap_repair.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/fs/xfs/scrub/bmap_repair.c b/fs/xfs/scrub/bmap_repair.c index fd64bdf4e13887..1084213b8e9b88 100644 --- a/fs/xfs/scrub/bmap_repair.c +++ b/fs/xfs/scrub/bmap_repair.c @@ -101,14 +101,21 @@ xrep_bmap_discover_shared( xfs_filblks_t blockcount) { struct xfs_scrub *sc = rb->sc; + struct xfs_btree_cur *cur; xfs_agblock_t agbno; xfs_agblock_t fbno; xfs_extlen_t flen; int error; - agbno = XFS_FSB_TO_AGBNO(sc->mp, startblock); - error = xfs_refcount_find_shared(sc->sa.refc_cur, agbno, blockcount, - &fbno, &flen, false); + if (XFS_IS_REALTIME_INODE(sc->ip)) { + agbno = xfs_rtb_to_rgbno(sc->mp, startblock); + cur = sc->sr.refc_cur; + } else { + agbno = XFS_FSB_TO_AGBNO(sc->mp, startblock); + cur = sc->sa.refc_cur; + } + error = xfs_refcount_find_shared(cur, agbno, blockcount, &fbno, &flen, + false); if (error) return error; @@ -450,7 +457,9 @@ xrep_bmap_scan_rtgroup( return 0; error = xrep_rtgroup_init(sc, rtg, &sc->sr, - XFS_RTGLOCK_RMAP | XFS_RTGLOCK_BITMAP_SHARED); + XFS_RTGLOCK_RMAP | + XFS_RTGLOCK_REFCOUNT | + XFS_RTGLOCK_BITMAP_SHARED); if (error) return error; @@ -903,10 +912,6 @@ xrep_bmap_init_reflink_scan( if (whichfork != XFS_DATA_FORK) return RLS_IRRELEVANT; - /* cannot share realtime extents */ - if (XFS_IS_REALTIME_INODE(sc->ip)) - return RLS_IRRELEVANT; - return RLS_UNKNOWN; } From patchwork Thu Dec 19 19:43:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915642 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 444F31B9835 for ; Thu, 19 Dec 2024 19:43:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637439; cv=none; b=ZFMtWmDV0FarHbVokzUvyEKgn65baRzXe25f+A2yu3oPyIBmE7hYzSx6V+bH+LNKNJ+I3qzZb+51OeaCCqRA8k4zHN4aFp8xXbt57ngW/BwaSm7ruLcO4ZUInE2yvKPUDNZIKsz26zTU6gHspncyjFSDinoiWFfz3TfVuLI1a7A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637439; c=relaxed/simple; bh=B2NB6EXKeJr7HHYR4KftbEcGaO8qE8/8hLE+r+rsPSk=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=b7SVWBpIvq8domjDlLddUxP8kUt8MhlwnNxbJVD6WjLBoy8F3IwyiCbAJtnGxtYDLU3ejltbOPgTwQtfCRDiqjYSE9yQ+VEqe/N3dMKucbCVb1O0uckZPe6SVgmbJJssxf8fk2xG6lcF+rYfU4w1T3Gn0qxZnkfjQH7dZJx8xvQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=gKSE8Jh9; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="gKSE8Jh9" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0E7B2C4CED0; Thu, 19 Dec 2024 19:43:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637439; bh=B2NB6EXKeJr7HHYR4KftbEcGaO8qE8/8hLE+r+rsPSk=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=gKSE8Jh9gTEGJvjEe5Ox1C064zMMKay8xzAfMbUvCkkNbm4T2neuNTJqNjxzGfKGw FZBYbjpYoEi2dHqXLzAuMjm4q4o/qZZ+hmlbmlweNnTP6/b3Ic69f91vmfWLbc3iwH veMaD5IRAG8D5y8WOuYpVDf+Jyp/0YG/vMXm3uNaE0KVk3F6bmBcSvZF9au7ppfC0j j6aJsD6eZenCvORiJvQqbMAs+Fi6AsCNbrCzyDKNNunA4qib8iKK6YxzSbM6Dvi+Rv ScxCwgz0dIUaAA1OQZvL+mTct/WYun2vH0amVIAM96EpkWvaMcKGP4LOe+wbYPxUn4 DT2eqIcRQ9zOg== Date: Thu, 19 Dec 2024 11:43:58 -0800 Subject: [PATCH 42/43] xfs: fix CoW forks for realtime files From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581696.1572761.4870154368051370099.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Port the copy on write fork repair to realtime files. Signed-off-by: "Darrick J. Wong" --- fs/xfs/scrub/agheader_repair.c | 2 fs/xfs/scrub/cow_repair.c | 178 +++++++++++++++++++++++++++-- fs/xfs/scrub/reap.c | 242 +++++++++++++++++++++++++++++++++++++++- fs/xfs/scrub/reap.h | 7 + fs/xfs/scrub/repair.h | 1 fs/xfs/scrub/rtb_bitmap.h | 37 ++++++ fs/xfs/scrub/trace.h | 36 ++++-- fs/xfs/xfs_rtalloc.c | 4 - fs/xfs/xfs_rtalloc.h | 5 + 9 files changed, 470 insertions(+), 42 deletions(-) create mode 100644 fs/xfs/scrub/rtb_bitmap.h diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c index b45d2b32051a63..cd6f0223879f49 100644 --- a/fs/xfs/scrub/agheader_repair.c +++ b/fs/xfs/scrub/agheader_repair.c @@ -647,7 +647,7 @@ xrep_agfl_fill( xfs_agblock_t agbno = start; int error; - trace_xrep_agfl_insert(sc->sa.pag, agbno, len); + trace_xrep_agfl_insert(pag_group(sc->sa.pag), agbno, len); while (agbno < start + len && af->fl_off < af->flcount) af->agfl_bno[af->fl_off++] = cpu_to_be32(agbno++); diff --git a/fs/xfs/scrub/cow_repair.c b/fs/xfs/scrub/cow_repair.c index ba695dd21f8b96..38a246b8bf11c9 100644 --- a/fs/xfs/scrub/cow_repair.c +++ b/fs/xfs/scrub/cow_repair.c @@ -26,6 +26,9 @@ #include "xfs_errortag.h" #include "xfs_icache.h" #include "xfs_refcount_btree.h" +#include "xfs_rtalloc.h" +#include "xfs_rtbitmap.h" +#include "xfs_rtgroup.h" #include "scrub/xfs_scrub.h" #include "scrub/scrub.h" #include "scrub/common.h" @@ -34,6 +37,7 @@ #include "scrub/bitmap.h" #include "scrub/off_bitmap.h" #include "scrub/fsb_bitmap.h" +#include "scrub/rtb_bitmap.h" #include "scrub/reap.h" /* @@ -61,7 +65,10 @@ struct xrep_cow { struct xoff_bitmap bad_fileoffs; /* Bitmap of fsblocks that were removed from the CoW fork. */ - struct xfsb_bitmap old_cowfork_fsblocks; + union { + struct xfsb_bitmap old_cowfork_fsblocks; + struct xrtb_bitmap old_cowfork_rtblocks; + }; /* CoW fork mappings used to scan for bad CoW staging extents. */ struct xfs_bmbt_irec irec; @@ -145,8 +152,7 @@ xrep_cow_mark_shared_staging( xrep_cow_trim_refcount(xc, &rrec, rec); return xrep_cow_mark_file_range(xc, - xfs_agbno_to_fsb(to_perag(cur->bc_group), - rrec.rc_startblock), + xfs_gbno_to_fsb(cur->bc_group, rrec.rc_startblock), rrec.rc_blockcount); } @@ -177,9 +183,8 @@ xrep_cow_mark_missing_staging( if (xc->next_bno >= rrec.rc_startblock) goto next; - error = xrep_cow_mark_file_range(xc, - xfs_agbno_to_fsb(to_perag(cur->bc_group), xc->next_bno), + xfs_gbno_to_fsb(cur->bc_group, xc->next_bno), rrec.rc_startblock - xc->next_bno); if (error) return error; @@ -222,8 +227,7 @@ xrep_cow_mark_missing_staging_rmap( } return xrep_cow_mark_file_range(xc, - xfs_agbno_to_fsb(to_perag(cur->bc_group), rec_bno), - rec_len); + xfs_gbno_to_fsb(cur->bc_group, rec_bno), rec_len); } /* @@ -310,6 +314,92 @@ xrep_cow_find_bad( return 0; } +/* + * Find any part of the CoW fork mapping that isn't a single-owner CoW staging + * extent and mark the corresponding part of the file range in the bitmap. + */ +STATIC int +xrep_cow_find_bad_rt( + struct xrep_cow *xc) +{ + struct xfs_refcount_irec rc_low = { 0 }; + struct xfs_refcount_irec rc_high = { 0 }; + struct xfs_rmap_irec rm_low = { 0 }; + struct xfs_rmap_irec rm_high = { 0 }; + struct xfs_scrub *sc = xc->sc; + struct xfs_rtgroup *rtg; + int error = 0; + + xc->irec_startbno = xfs_rtb_to_rgbno(sc->mp, xc->irec.br_startblock); + + rtg = xfs_rtgroup_get(sc->mp, + xfs_rtb_to_rgno(sc->mp, xc->irec.br_startblock)); + if (!rtg) + return -EFSCORRUPTED; + + error = xrep_rtgroup_init(sc, rtg, &sc->sr, + XFS_RTGLOCK_RMAP | XFS_RTGLOCK_REFCOUNT); + if (error) + goto out_rtg; + + /* Mark any CoW fork extents that are shared. */ + rc_low.rc_startblock = xc->irec_startbno; + rc_high.rc_startblock = xc->irec_startbno + xc->irec.br_blockcount - 1; + rc_low.rc_domain = rc_high.rc_domain = XFS_REFC_DOMAIN_SHARED; + error = xfs_refcount_query_range(sc->sr.refc_cur, &rc_low, &rc_high, + xrep_cow_mark_shared_staging, xc); + if (error) + goto out_sr; + + /* Make sure there are CoW staging extents for the whole mapping. */ + rc_low.rc_startblock = xc->irec_startbno; + rc_high.rc_startblock = xc->irec_startbno + xc->irec.br_blockcount - 1; + rc_low.rc_domain = rc_high.rc_domain = XFS_REFC_DOMAIN_COW; + xc->next_bno = xc->irec_startbno; + error = xfs_refcount_query_range(sc->sr.refc_cur, &rc_low, &rc_high, + xrep_cow_mark_missing_staging, xc); + if (error) + goto out_sr; + + if (xc->next_bno < xc->irec_startbno + xc->irec.br_blockcount) { + error = xrep_cow_mark_file_range(xc, + xfs_rgbno_to_rtb(rtg, xc->next_bno), + xc->irec_startbno + xc->irec.br_blockcount - + xc->next_bno); + if (error) + goto out_sr; + } + + /* Mark any area has an rmap that isn't a COW staging extent. */ + rm_low.rm_startblock = xc->irec_startbno; + memset(&rm_high, 0xFF, sizeof(rm_high)); + rm_high.rm_startblock = xc->irec_startbno + xc->irec.br_blockcount - 1; + error = xfs_rmap_query_range(sc->sr.rmap_cur, &rm_low, &rm_high, + xrep_cow_mark_missing_staging_rmap, xc); + if (error) + goto out_sr; + + /* + * If userspace is forcing us to rebuild the CoW fork or someone + * turned on the debugging knob, replace everything in the + * CoW fork and then scan for staging extents in the refcountbt. + */ + if ((sc->sm->sm_flags & XFS_SCRUB_IFLAG_FORCE_REBUILD) || + XFS_TEST_ERROR(false, sc->mp, XFS_ERRTAG_FORCE_SCRUB_REPAIR)) { + error = xrep_cow_mark_file_range(xc, xc->irec.br_startblock, + xc->irec.br_blockcount); + if (error) + goto out_rtg; + } + +out_sr: + xchk_rtgroup_btcur_free(&sc->sr); + xchk_rtgroup_free(sc, &sc->sr); +out_rtg: + xfs_rtgroup_put(rtg); + return error; +} + /* * Allocate a replacement CoW staging extent of up to the given number of * blocks, and fill out the mapping. @@ -350,6 +440,32 @@ xrep_cow_alloc( return 0; } +/* + * Allocate a replacement rt CoW staging extent of up to the given number of + * blocks, and fill out the mapping. + */ +STATIC int +xrep_cow_alloc_rt( + struct xfs_scrub *sc, + xfs_extlen_t maxlen, + struct xrep_cow_extent *repl) +{ + xfs_rtxlen_t maxrtx = xfs_rtb_to_rtx(sc->mp, maxlen); + int error; + + error = xfs_trans_reserve_more(sc->tp, 0, maxrtx); + if (error) + return error; + + error = xfs_rtallocate_rtgs(sc->tp, NULLRTBLOCK, 1, maxrtx, 1, false, + false, &repl->fsbno, &repl->len); + if (error) + return error; + + xfs_refcount_alloc_cow_extent(sc->tp, true, repl->fsbno, repl->len); + return 0; +} + /* * Look up the current CoW fork mapping so that we only allocate enough to * replace a single mapping. If we don't find a mapping that covers the start @@ -467,7 +583,10 @@ xrep_cow_replace_range( */ alloc_len = min_t(xfs_fileoff_t, XFS_MAX_BMBT_EXTLEN, nextoff - startoff); - error = xrep_cow_alloc(sc, alloc_len, &repl); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrep_cow_alloc_rt(sc, alloc_len, &repl); + else + error = xrep_cow_alloc(sc, alloc_len, &repl); if (error) return error; @@ -483,8 +602,12 @@ xrep_cow_replace_range( return error; /* Note the old CoW staging extents; we'll reap them all later. */ - error = xfsb_bitmap_set(&xc->old_cowfork_fsblocks, got.br_startblock, - repl.len); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrtb_bitmap_set(&xc->old_cowfork_rtblocks, + got.br_startblock, repl.len); + else + error = xfsb_bitmap_set(&xc->old_cowfork_fsblocks, + got.br_startblock, repl.len); if (error) return error; @@ -540,8 +663,16 @@ xrep_bmap_cow( if (!ifp) return 0; - /* realtime files aren't supported yet */ - if (XFS_IS_REALTIME_INODE(sc->ip)) + /* + * Realtime files with large extent sizes are not supported because + * we could encounter an CoW mapping that has been partially written + * out *and* requires replacement, and there's no solution to that. + */ + if (xfs_inode_has_bigrtalloc(sc->ip)) + return -EOPNOTSUPP; + + /* Metadata inodes aren't supposed to have data on the rt volume. */ + if (xfs_is_metadir_inode(sc->ip) && XFS_IS_REALTIME_INODE(sc->ip)) return -EOPNOTSUPP; /* @@ -562,7 +693,10 @@ xrep_bmap_cow( xc->sc = sc; xoff_bitmap_init(&xc->bad_fileoffs); - xfsb_bitmap_init(&xc->old_cowfork_fsblocks); + if (XFS_IS_REALTIME_INODE(sc->ip)) + xrtb_bitmap_init(&xc->old_cowfork_rtblocks); + else + xfsb_bitmap_init(&xc->old_cowfork_fsblocks); for_each_xfs_iext(ifp, &icur, &xc->irec) { if (xchk_should_terminate(sc, &error)) @@ -585,7 +719,10 @@ xrep_bmap_cow( if (xfs_bmap_is_written_extent(&xc->irec)) continue; - error = xrep_cow_find_bad(xc); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrep_cow_find_bad_rt(xc); + else + error = xrep_cow_find_bad(xc); if (error) goto out_bitmap; } @@ -600,13 +737,20 @@ xrep_bmap_cow( * by the refcount btree, not the inode, so it is correct to treat them * like inode metadata. */ - error = xrep_reap_fsblocks(sc, &xc->old_cowfork_fsblocks, - &XFS_RMAP_OINFO_COW); + if (XFS_IS_REALTIME_INODE(sc->ip)) + error = xrep_reap_rtblocks(sc, &xc->old_cowfork_rtblocks, + &XFS_RMAP_OINFO_COW); + else + error = xrep_reap_fsblocks(sc, &xc->old_cowfork_fsblocks, + &XFS_RMAP_OINFO_COW); if (error) goto out_bitmap; out_bitmap: - xfsb_bitmap_destroy(&xc->old_cowfork_fsblocks); + if (XFS_IS_REALTIME_INODE(sc->ip)) + xrtb_bitmap_destroy(&xc->old_cowfork_rtblocks); + else + xfsb_bitmap_destroy(&xc->old_cowfork_fsblocks); xoff_bitmap_destroy(&xc->bad_fileoffs); kfree(xc); return error; diff --git a/fs/xfs/scrub/reap.c b/fs/xfs/scrub/reap.c index 534c7696e9a1a7..b32fb233cf8476 100644 --- a/fs/xfs/scrub/reap.c +++ b/fs/xfs/scrub/reap.c @@ -34,6 +34,8 @@ #include "xfs_attr_remote.h" #include "xfs_defer.h" #include "xfs_metafile.h" +#include "xfs_rtgroup.h" +#include "xfs_rtrmap_btree.h" #include "scrub/scrub.h" #include "scrub/common.h" #include "scrub/trace.h" @@ -41,6 +43,7 @@ #include "scrub/bitmap.h" #include "scrub/agb_bitmap.h" #include "scrub/fsb_bitmap.h" +#include "scrub/rtb_bitmap.h" #include "scrub/reap.h" /* @@ -311,7 +314,7 @@ xreap_agextent_binval( } out: - trace_xreap_agextent_binval(sc->sa.pag, agbno, *aglenp); + trace_xreap_agextent_binval(pag_group(sc->sa.pag), agbno, *aglenp); } /* @@ -370,7 +373,8 @@ xreap_agextent_select( out_found: *aglenp = len; - trace_xreap_agextent_select(sc->sa.pag, agbno, len, *crosslinked); + trace_xreap_agextent_select(pag_group(sc->sa.pag), agbno, len, + *crosslinked); out_cur: xfs_btree_del_cursor(cur, error); return error; @@ -409,7 +413,8 @@ xreap_agextent_iter( * to run xfs_repair. */ if (crosslinked) { - trace_xreap_dispose_unmap_extent(sc->sa.pag, agbno, *aglenp); + trace_xreap_dispose_unmap_extent(pag_group(sc->sa.pag), agbno, + *aglenp); rs->force_roll = true; @@ -428,7 +433,7 @@ xreap_agextent_iter( *aglenp, rs->oinfo); } - trace_xreap_dispose_free_extent(sc->sa.pag, agbno, *aglenp); + trace_xreap_dispose_free_extent(pag_group(sc->sa.pag), agbno, *aglenp); /* * Invalidate as many buffers as we can, starting at agbno. If this @@ -679,6 +684,225 @@ xrep_reap_fsblocks( return 0; } +#ifdef CONFIG_XFS_RT +/* + * Figure out the longest run of blocks that we can dispose of with a single + * call. Cross-linked blocks should have their reverse mappings removed, but + * single-owner extents can be freed. Units are rt blocks, not rt extents. + */ +STATIC int +xreap_rgextent_select( + struct xreap_state *rs, + xfs_rgblock_t rgbno, + xfs_rgblock_t rgbno_next, + bool *crosslinked, + xfs_extlen_t *rglenp) +{ + struct xfs_scrub *sc = rs->sc; + struct xfs_btree_cur *cur; + xfs_rgblock_t bno = rgbno + 1; + xfs_extlen_t len = 1; + int error; + + /* + * Determine if there are any other rmap records covering the first + * block of this extent. If so, the block is crosslinked. + */ + cur = xfs_rtrmapbt_init_cursor(sc->tp, sc->sr.rtg); + error = xfs_rmap_has_other_keys(cur, rgbno, 1, rs->oinfo, + crosslinked); + if (error) + goto out_cur; + + /* + * Figure out how many of the subsequent blocks have the same crosslink + * status. + */ + while (bno < rgbno_next) { + bool also_crosslinked; + + error = xfs_rmap_has_other_keys(cur, bno, 1, rs->oinfo, + &also_crosslinked); + if (error) + goto out_cur; + + if (*crosslinked != also_crosslinked) + break; + + len++; + bno++; + } + + *rglenp = len; + trace_xreap_agextent_select(rtg_group(sc->sr.rtg), rgbno, len, + *crosslinked); +out_cur: + xfs_btree_del_cursor(cur, error); + return error; +} + +/* + * Dispose of as much of the beginning of this rtgroup extent as possible. + * The number of blocks disposed of will be returned in @rglenp. + */ +STATIC int +xreap_rgextent_iter( + struct xreap_state *rs, + xfs_rgblock_t rgbno, + xfs_extlen_t *rglenp, + bool crosslinked) +{ + struct xfs_scrub *sc = rs->sc; + xfs_rtblock_t rtbno; + int error; + + /* + * The only caller so far is CoW fork repair, so we only know how to + * unlink or free CoW staging extents. Here we don't have to worry + * about invalidating buffers! + */ + if (rs->oinfo != &XFS_RMAP_OINFO_COW) { + ASSERT(rs->oinfo == &XFS_RMAP_OINFO_COW); + return -EFSCORRUPTED; + } + ASSERT(rs->resv == XFS_AG_RESV_NONE); + + rtbno = xfs_rgbno_to_rtb(sc->sr.rtg, rgbno); + + /* + * If there are other rmappings, this block is cross linked and must + * not be freed. Remove the forward and reverse mapping and move on. + */ + if (crosslinked) { + trace_xreap_dispose_unmap_extent(rtg_group(sc->sr.rtg), rgbno, + *rglenp); + + xfs_refcount_free_cow_extent(sc->tp, true, rtbno, *rglenp); + rs->deferred++; + return 0; + } + + trace_xreap_dispose_free_extent(rtg_group(sc->sr.rtg), rgbno, *rglenp); + + /* + * The CoW staging extent is not crosslinked. Use deferred work items + * to remove the refcountbt records (which removes the rmap records) + * and free the extent. We're not worried about the system going down + * here because log recovery walks the refcount btree to clean out the + * CoW staging extents. + */ + xfs_refcount_free_cow_extent(sc->tp, true, rtbno, *rglenp); + error = xfs_free_extent_later(sc->tp, rtbno, *rglenp, NULL, + rs->resv, + XFS_FREE_EXTENT_REALTIME | + XFS_FREE_EXTENT_SKIP_DISCARD); + if (error) + return error; + + rs->deferred++; + return 0; +} + +#define XREAP_RTGLOCK_ALL (XFS_RTGLOCK_BITMAP | \ + XFS_RTGLOCK_RMAP | \ + XFS_RTGLOCK_REFCOUNT) + +/* + * Break a rt file metadata extent into sub-extents by fate (crosslinked, not + * crosslinked), and dispose of each sub-extent separately. The extent must + * be aligned to a realtime extent. + */ +STATIC int +xreap_rtmeta_extent( + uint64_t rtbno, + uint64_t len, + void *priv) +{ + struct xreap_state *rs = priv; + struct xfs_scrub *sc = rs->sc; + xfs_rgblock_t rgbno = xfs_rtb_to_rgbno(sc->mp, rtbno); + xfs_rgblock_t rgbno_next = rgbno + len; + int error = 0; + + ASSERT(sc->ip != NULL); + ASSERT(!sc->sr.rtg); + + /* + * We're reaping blocks after repairing file metadata, which means that + * we have to init the xchk_ag structure ourselves. + */ + sc->sr.rtg = xfs_rtgroup_get(sc->mp, xfs_rtb_to_rgno(sc->mp, rtbno)); + if (!sc->sr.rtg) + return -EFSCORRUPTED; + + xfs_rtgroup_lock(sc->sr.rtg, XREAP_RTGLOCK_ALL); + + while (rgbno < rgbno_next) { + xfs_extlen_t rglen; + bool crosslinked; + + error = xreap_rgextent_select(rs, rgbno, rgbno_next, + &crosslinked, &rglen); + if (error) + goto out_unlock; + + error = xreap_rgextent_iter(rs, rgbno, &rglen, crosslinked); + if (error) + goto out_unlock; + + if (xreap_want_defer_finish(rs)) { + error = xfs_defer_finish(&sc->tp); + if (error) + goto out_unlock; + xreap_defer_finish_reset(rs); + } else if (xreap_want_roll(rs)) { + error = xfs_trans_roll_inode(&sc->tp, sc->ip); + if (error) + goto out_unlock; + xreap_reset(rs); + } + + rgbno += rglen; + } + +out_unlock: + xfs_rtgroup_unlock(sc->sr.rtg, XREAP_RTGLOCK_ALL); + xfs_rtgroup_put(sc->sr.rtg); + sc->sr.rtg = NULL; + return error; +} + +/* + * Dispose of every block of every rt metadata extent in the bitmap. + * Do not use this to dispose of the mappings in an ondisk inode fork. + */ +int +xrep_reap_rtblocks( + struct xfs_scrub *sc, + struct xrtb_bitmap *bitmap, + const struct xfs_owner_info *oinfo) +{ + struct xreap_state rs = { + .sc = sc, + .oinfo = oinfo, + .resv = XFS_AG_RESV_NONE, + }; + int error; + + ASSERT(xfs_has_rmapbt(sc->mp)); + ASSERT(sc->ip != NULL); + + error = xrtb_bitmap_walk(bitmap, xreap_rtmeta_extent, &rs); + if (error) + return error; + + if (xreap_dirty(&rs)) + return xrep_defer_finish(sc); + + return 0; +} +#endif /* CONFIG_XFS_RT */ + /* * Dispose of every block of an old metadata btree that used to be rooted in a * metadata directory file. @@ -771,7 +995,8 @@ xreap_bmapi_select( } imap->br_blockcount = len; - trace_xreap_bmapi_select(sc->sa.pag, agbno, len, *crosslinked); + trace_xreap_bmapi_select(pag_group(sc->sa.pag), agbno, len, + *crosslinked); out_cur: xfs_btree_del_cursor(cur, error); return error; @@ -910,7 +1135,8 @@ xreap_bmapi_binval( } out: - trace_xreap_bmapi_binval(sc->sa.pag, agbno, imap->br_blockcount); + trace_xreap_bmapi_binval(pag_group(sc->sa.pag), agbno, + imap->br_blockcount); return 0; } @@ -937,7 +1163,7 @@ xrep_reap_bmapi_iter( * anybody else who thinks they own the block, even though that * runs the risk of stale buffer warnings in the future. */ - trace_xreap_dispose_unmap_extent(sc->sa.pag, + trace_xreap_dispose_unmap_extent(pag_group(sc->sa.pag), XFS_FSB_TO_AGBNO(sc->mp, imap->br_startblock), imap->br_blockcount); @@ -960,7 +1186,7 @@ xrep_reap_bmapi_iter( * by a block starting before the first block of the extent but overlap * anyway. */ - trace_xreap_dispose_free_extent(sc->sa.pag, + trace_xreap_dispose_free_extent(pag_group(sc->sa.pag), XFS_FSB_TO_AGBNO(sc->mp, imap->br_startblock), imap->br_blockcount); diff --git a/fs/xfs/scrub/reap.h b/fs/xfs/scrub/reap.h index 70e5e6bbb8d38d..4c8f62701fb36b 100644 --- a/fs/xfs/scrub/reap.h +++ b/fs/xfs/scrub/reap.h @@ -17,6 +17,13 @@ int xrep_reap_ifork(struct xfs_scrub *sc, struct xfs_inode *ip, int whichfork); int xrep_reap_metadir_fsblocks(struct xfs_scrub *sc, struct xfsb_bitmap *bitmap); +#ifdef CONFIG_XFS_RT +int xrep_reap_rtblocks(struct xfs_scrub *sc, struct xrtb_bitmap *bitmap, + const struct xfs_owner_info *oinfo); +#else +# define xrep_reap_rtblocks(...) (-EOPNOTSUPP) +#endif /* CONFIG_XFS_RT */ + /* Buffer cache scan context. */ struct xrep_bufscan { /* Disk address for the buffers we want to scan. */ diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index 8f8f18b48a449d..823c00d1a50262 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -52,6 +52,7 @@ struct xbitmap; struct xagb_bitmap; struct xrgb_bitmap; struct xfsb_bitmap; +struct xrtb_bitmap; int xrep_fix_freelist(struct xfs_scrub *sc, int alloc_flags); diff --git a/fs/xfs/scrub/rtb_bitmap.h b/fs/xfs/scrub/rtb_bitmap.h new file mode 100644 index 00000000000000..1313ef605511ea --- /dev/null +++ b/fs/xfs/scrub/rtb_bitmap.h @@ -0,0 +1,37 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2022-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_SCRUB_RTB_BITMAP_H__ +#define __XFS_SCRUB_RTB_BITMAP_H__ + +/* Bitmaps, but for type-checked for xfs_rtblock_t */ + +struct xrtb_bitmap { + struct xbitmap64 rtbitmap; +}; + +static inline void xrtb_bitmap_init(struct xrtb_bitmap *bitmap) +{ + xbitmap64_init(&bitmap->rtbitmap); +} + +static inline void xrtb_bitmap_destroy(struct xrtb_bitmap *bitmap) +{ + xbitmap64_destroy(&bitmap->rtbitmap); +} + +static inline int xrtb_bitmap_set(struct xrtb_bitmap *bitmap, + xfs_rtblock_t start, xfs_filblks_t len) +{ + return xbitmap64_set(&bitmap->rtbitmap, start, len); +} + +static inline int xrtb_bitmap_walk(struct xrtb_bitmap *bitmap, + xbitmap64_walk_fn fn, void *priv) +{ + return xbitmap64_walk(&bitmap->rtbitmap, fn, priv); +} + +#endif /* __XFS_SCRUB_RTB_BITMAP_H__ */ diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 56811862aa8226..d7c4ced47c1567 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -1964,32 +1964,36 @@ DEFINE_XCHK_METAPATH_EVENT(xchk_metapath_lookup); #if IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) DECLARE_EVENT_CLASS(xrep_extent_class, - TP_PROTO(const struct xfs_perag *pag, xfs_agblock_t agbno, + TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno, xfs_extlen_t len), - TP_ARGS(pag, agbno, len), + TP_ARGS(xg, agbno, len), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(xfs_agblock_t, agbno) __field(xfs_extlen_t, len) ), TP_fast_assign( - __entry->dev = pag_mount(pag)->m_super->s_dev; - __entry->agno = pag_agno(pag); + __entry->dev = xg->xg_mount->m_super->s_dev; + __entry->type = xg->xg_type; + __entry->agno = xg->xg_gno; __entry->agbno = agbno; __entry->len = len; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x fsbcount 0x%x", + TP_printk("dev %d:%d %sno 0x%x %sbno 0x%x fsbcount 0x%x", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agbno, __entry->len) ); #define DEFINE_REPAIR_EXTENT_EVENT(name) \ DEFINE_EVENT(xrep_extent_class, name, \ - TP_PROTO(const struct xfs_perag *pag, xfs_agblock_t agbno, \ + TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno, \ xfs_extlen_t len), \ - TP_ARGS(pag, agbno, len)) + TP_ARGS(xg, agbno, len)) DEFINE_REPAIR_EXTENT_EVENT(xreap_dispose_unmap_extent); DEFINE_REPAIR_EXTENT_EVENT(xreap_dispose_free_extent); DEFINE_REPAIR_EXTENT_EVENT(xreap_agextent_binval); @@ -1997,35 +2001,39 @@ DEFINE_REPAIR_EXTENT_EVENT(xreap_bmapi_binval); DEFINE_REPAIR_EXTENT_EVENT(xrep_agfl_insert); DECLARE_EVENT_CLASS(xrep_reap_find_class, - TP_PROTO(const struct xfs_perag *pag, xfs_agblock_t agbno, + TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno, xfs_extlen_t len, bool crosslinked), - TP_ARGS(pag, agbno, len, crosslinked), + TP_ARGS(xg, agbno, len, crosslinked), TP_STRUCT__entry( __field(dev_t, dev) + __field(enum xfs_group_type, type) __field(xfs_agnumber_t, agno) __field(xfs_agblock_t, agbno) __field(xfs_extlen_t, len) __field(bool, crosslinked) ), TP_fast_assign( - __entry->dev = pag_mount(pag)->m_super->s_dev; - __entry->agno = pag_agno(pag); + __entry->dev = xg->xg_mount->m_super->s_dev; + __entry->type = xg->xg_type; + __entry->agno = xg->xg_gno; __entry->agbno = agbno; __entry->len = len; __entry->crosslinked = crosslinked; ), - TP_printk("dev %d:%d agno 0x%x agbno 0x%x fsbcount 0x%x crosslinked %d", + TP_printk("dev %d:%d %sno 0x%x %sbno 0x%x fsbcount 0x%x crosslinked %d", MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agno, + __print_symbolic(__entry->type, XG_TYPE_STRINGS), __entry->agbno, __entry->len, __entry->crosslinked ? 1 : 0) ); #define DEFINE_REPAIR_REAP_FIND_EVENT(name) \ DEFINE_EVENT(xrep_reap_find_class, name, \ - TP_PROTO(const struct xfs_perag *pag, xfs_agblock_t agbno, \ + TP_PROTO(const struct xfs_group *xg, xfs_agblock_t agbno, \ xfs_extlen_t len, bool crosslinked), \ - TP_ARGS(pag, agbno, len, crosslinked)) + TP_ARGS(xg, agbno, len, crosslinked)) DEFINE_REPAIR_REAP_FIND_EVENT(xreap_agextent_select); DEFINE_REPAIR_REAP_FIND_EVENT(xreap_bmapi_select); diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index a5de5405800a22..7135a6717a9e11 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -594,7 +594,7 @@ xfs_rtalloc_sumlevel( * specified. If we don't get maxlen then use prod to trim * the length, if given. The lengths are all in rtextents. */ -STATIC int +static int xfs_rtallocate_extent_size( struct xfs_rtalloc_args *args, xfs_rtxlen_t minlen, /* minimum length to allocate */ @@ -1958,7 +1958,7 @@ xfs_rtallocate_rtg( goto out_release; } -static int +int xfs_rtallocate_rtgs( struct xfs_trans *tp, xfs_fsblock_t bno_hint, diff --git a/fs/xfs/xfs_rtalloc.h b/fs/xfs/xfs_rtalloc.h index 9044f7226ab6fc..0d95b29092c9f3 100644 --- a/fs/xfs/xfs_rtalloc.h +++ b/fs/xfs/xfs_rtalloc.h @@ -77,4 +77,9 @@ xfs_growfs_check_rtgeom(const struct xfs_mount *mp, } #endif /* CONFIG_XFS_RT */ +int xfs_rtallocate_rtgs(struct xfs_trans *tp, xfs_fsblock_t bno_hint, + xfs_rtxlen_t minlen, xfs_rtxlen_t maxlen, xfs_rtxlen_t prod, + bool wasdel, bool initial_user_data, xfs_rtblock_t *bno, + xfs_extlen_t *blen); + #endif /* __XFS_RTALLOC_H__ */ From patchwork Thu Dec 19 19:44:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13915643 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 248A91A0AFE for ; Thu, 19 Dec 2024 19:44:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637455; cv=none; b=NRVjJY7Rg70FVRDmUEojql7S+9g22GK/EsiQZnI8cJe2h6zgLlz7qoAR9G0WYJ5lQ0GrNUxC6HhhM/9WU+mNjyzqPa0CfUSFwhHZ6Ndw94+DAkxZdAuRRYRZduwBwFGTwujfpsev66NBGGBzPWCXtin44F6ldq/IQGWxeGde4U8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734637455; c=relaxed/simple; bh=6oWpqMaBNhOXD42AotqdM+f4iX7mIWL02rCh8Vm3hrs=; h=Date:Subject:From:To:Cc:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ZfnpZfKyVIvNUpoSFnhQfPTQ4Z2X806irsJxAj2gFCB+FNHJWTwxtw2RJO7R3hC0qx22C+njOrUCM30ifUhpx5zcs57nxJ/uCl8BLSNm4MCnYWNVpRcsM7spsuZgAcNreqIMLucudDqieYROI+wJlna+Pp3LUMZ/EFxQoFxbdSM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Y6zFay8l; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Y6zFay8l" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7B55C4CECE; Thu, 19 Dec 2024 19:44:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734637454; bh=6oWpqMaBNhOXD42AotqdM+f4iX7mIWL02rCh8Vm3hrs=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=Y6zFay8luKlOOxVKwAt+lAwlXM4geZmBQEuGXcX/ojoLB0HvJ76pRyHR8uxHOWSSK 3qStagQSAr5BoF/tU2G9b2NqhcGX9Iber5eRZlJvcLJGnfMHrC6VnMF4cL0CtiX9gB znInVlp7lmzcjNU8/bRdiFDCcI1fIoo4XuaNkQR4jH5mfYhdEpjJGO5PRG+w7xUpFo ZT2D6gcawvRnOQGEci2bZ9FI+oPALfRHceRVHuUFRqvNZcoS87kV9UJoC3nmkxi7HO HB9ALYi23jb5vjb20KBVACqII2lqEToVLsabsutmkpylR5zYR+eY8NnaNU9EBTxu2z vnnYp5x0o3T6g== Date: Thu, 19 Dec 2024 11:44:14 -0800 Subject: [PATCH 43/43] xfs: enable realtime reflink From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org, hch@lst.de Message-ID: <173463581714.1572761.16587749308470053999.stgit@frogsfrogsfrogs> In-Reply-To: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> References: <173463580863.1572761.14930951818251914429.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Enable reflink for realtime devices, as long as the realtime allocation unit is a single fsblock. Signed-off-by: "Darrick J. Wong" Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_reflink.c | 25 +++++++++++++++++++++++++ fs/xfs/xfs_reflink.h | 2 ++ fs/xfs/xfs_rtalloc.c | 7 +++++-- fs/xfs/xfs_super.c | 6 ++++-- 4 files changed, 36 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index d9b33e22c17669..59f7fc16eb8093 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1822,3 +1822,28 @@ xfs_reflink_unshare( trace_xfs_reflink_unshare_error(ip, error, _RET_IP_); return error; } + +/* + * Can we use reflink with this realtime extent size? Note that we don't check + * for rblocks > 0 here because this can be called as part of attaching a new + * rt section. + */ +bool +xfs_reflink_supports_rextsize( + struct xfs_mount *mp, + unsigned int rextsize) +{ + /* reflink on the realtime device requires rtgroups */ + if (!xfs_has_rtgroups(mp)) + return false; + + /* + * Reflink doesn't support rt extent size larger than a single fsblock + * because we would have to perform CoW-around for unaligned write + * requests to guarantee that we always remap entire rt extents. + */ + if (rextsize != 1) + return false; + + return true; +} diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 3bfd7ab9e1148a..cc4e92278279b6 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -62,4 +62,6 @@ extern int xfs_reflink_remap_blocks(struct xfs_inode *src, loff_t pos_in, extern int xfs_reflink_update_dest(struct xfs_inode *dest, xfs_off_t newlen, xfs_extlen_t cowextsize, unsigned int remap_flags); +bool xfs_reflink_supports_rextsize(struct xfs_mount *mp, unsigned int rextsize); + #endif /* __XFS_REFLINK_H */ diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 7135a6717a9e11..d8e6d073d64dc9 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -32,6 +32,7 @@ #include "xfs_error.h" #include "xfs_trace.h" #include "xfs_rtrefcount_btree.h" +#include "xfs_reflink.h" /* * Return whether there are any free extents in the size range given @@ -1292,8 +1293,10 @@ xfs_growfs_rt( goto out_unlock; if (xfs_has_quota(mp)) goto out_unlock; - } - if (xfs_has_reflink(mp)) + if (xfs_has_reflink(mp)) + goto out_unlock; + } else if (xfs_has_reflink(mp) && + !xfs_reflink_supports_rextsize(mp, in->extsize)) goto out_unlock; error = xfs_sb_validate_fsb_count(&mp->m_sb, in->newblocks); diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index ecd5a9f444d862..7c3f996cd39e81 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1754,9 +1754,11 @@ xfs_fs_fill_super( xfs_warn_experimental(mp, XFS_EXPERIMENTAL_METADIR); if (xfs_has_reflink(mp)) { - if (mp->m_sb.sb_rblocks) { + if (xfs_has_realtime(mp) && + !xfs_reflink_supports_rextsize(mp, mp->m_sb.sb_rextsize)) { xfs_alert(mp, - "reflink not compatible with realtime device!"); + "reflink not compatible with realtime extent size %u!", + mp->m_sb.sb_rextsize); error = -EINVAL; goto out_filestream_unmount; }