diff mbox series

[05/10] xfs: move the zero records logic into xfs_bmap_broot_space_calc

Message ID 172480131591.2291268.4549323808410277633.stgit@frogsfrogsfrogs (mailing list archive)
State New
Headers show
Series [01/10] xfs: fix C++ compilation errors in xfs_fs.h | expand

Commit Message

Darrick J. Wong Aug. 27, 2024, 11:35 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

The bmap btree cannot ever have zero key/pointer records in an incore
btree block.  If the number of records drops to zero, that means we're
converting the fork to extents format and are trying to remove the tree.
This logic won't hold for the future realtime rmap btree, so move the
logic into the bmbt code.

This helps us remove a level of indentation in xfs_iroot_realloc because
we can handle the zero-size case in a single place instead of repeatedly
checking it.  We'll refactor further in the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap_btree.h |    7 +++++
 fs/xfs/libxfs/xfs_inode_fork.c |   56 ++++++++++++++--------------------------
 2 files changed, 27 insertions(+), 36 deletions(-)

Comments

Christoph Hellwig Aug. 28, 2024, 4:14 a.m. UTC | #1
On Tue, Aug 27, 2024 at 04:35:01PM -0700, Darrick J. Wong wrote:
> This helps us remove a level of indentation in xfs_iroot_realloc because
> we can handle the zero-size case in a single place instead of repeatedly
> checking it.  We'll refactor further in the next patch.

I think we can do the same cleanup in xfs_iroot_realloc without this
special case:

This:

> +	new_size = xfs_bmap_broot_space_calc(mp, new_max);
> +	if (new_size == 0) {
> +		kfree(ifp->if_broot);
> +		ifp->if_broot = NULL;
> +		ifp->if_broot_bytes = 0;
> +		return;

becomes:

	if (new_max == 0) {
		kfree(ifp->if_broot);
		ifp->if_broot = NULL;
		ifp->if_broot_bytes = 0;
		return;
	}
	new_size = xfs_bmap_broot_space_calc(mp, new_max);

But either ways is fine with me:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong Aug. 29, 2024, 1:15 a.m. UTC | #2
On Wed, Aug 28, 2024 at 06:14:24AM +0200, Christoph Hellwig wrote:
> On Tue, Aug 27, 2024 at 04:35:01PM -0700, Darrick J. Wong wrote:
> > This helps us remove a level of indentation in xfs_iroot_realloc because
> > we can handle the zero-size case in a single place instead of repeatedly
> > checking it.  We'll refactor further in the next patch.
> 
> I think we can do the same cleanup in xfs_iroot_realloc without this
> special case:
> 
> This:
> 
> > +	new_size = xfs_bmap_broot_space_calc(mp, new_max);
> > +	if (new_size == 0) {
> > +		kfree(ifp->if_broot);
> > +		ifp->if_broot = NULL;
> > +		ifp->if_broot_bytes = 0;
> > +		return;
> 
> becomes:
> 
> 	if (new_max == 0) {
> 		kfree(ifp->if_broot);
> 		ifp->if_broot = NULL;
> 		ifp->if_broot_bytes = 0;
> 		return;
> 	}
> 	new_size = xfs_bmap_broot_space_calc(mp, new_max);
> 
> But either ways is fine with me:

This got me thinking about the rtrmap and refcount btrees -- could we
save 72 bytes per inode when the btree is completely empty by returning
0 from xfs_{rtrmap,rtrefcount}_broot_space_calc?  The answer to
that was a bunch of null pointer dereferences because there's a fair
amount of code in the rtrmap/rtrefcount/btree code that assumes that
if_broot isn't null if you've created a btree cursor.

OTOH if you're really going to have 130000 zns rtgroups then maybe we
actually want this savings?  That's 9MB of memory that we don't have to
waste on an empty device -- for rtrmaps the savings are minimal because
eventually you'll write *something*; for rtrefcounts, this might be
meaningful because you format with reflink but don't necessarily use it
right away.

Thoughts?

--D

> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
Dave Chinner Aug. 29, 2024, 2:05 a.m. UTC | #3
On Wed, Aug 28, 2024 at 06:14:24AM +0200, Christoph Hellwig wrote:
> On Tue, Aug 27, 2024 at 04:35:01PM -0700, Darrick J. Wong wrote:
> > This helps us remove a level of indentation in xfs_iroot_realloc because
> > we can handle the zero-size case in a single place instead of repeatedly
> > checking it.  We'll refactor further in the next patch.
> 
> I think we can do the same cleanup in xfs_iroot_realloc without this
> special case:
> 
> This:
> 
> > +	new_size = xfs_bmap_broot_space_calc(mp, new_max);
> > +	if (new_size == 0) {
> > +		kfree(ifp->if_broot);
> > +		ifp->if_broot = NULL;
> > +		ifp->if_broot_bytes = 0;
> > +		return;
> 
> becomes:
> 
> 	if (new_max == 0) {
> 		kfree(ifp->if_broot);
> 		ifp->if_broot = NULL;
> 		ifp->if_broot_bytes = 0;
> 		return;
> 	}
> 	new_size = xfs_bmap_broot_space_calc(mp, new_max);

I kinda prefer this version; I noticed the code could be cleaned
up the way looking at some other patch earlier this morning...

-Dave.
Christoph Hellwig Aug. 29, 2024, 3:51 a.m. UTC | #4
On Wed, Aug 28, 2024 at 06:15:55PM -0700, Darrick J. Wong wrote:
> save 72 bytes per inode when the btree is completely empty by returning
> 0 from xfs_{rtrmap,rtrefcount}_broot_space_calc?  The answer to
> that was a bunch of null pointer dereferences because there's a fair
> amount of code in the rtrmap/rtrefcount/btree code that assumes that
> if_broot isn't null if you've created a btree cursor.
> 
> OTOH if you're really going to have 130000 zns rtgroups then maybe we
> actually want this savings?  That's 9MB of memory that we don't have to
> waste on an empty device -- for rtrmaps the savings are minimal because
> eventually you'll write *something*; for rtrefcounts, this might be
> meaningful because you format with reflink but don't necessarily use it
> right away.

Sounds kinda nice, but also painful.  If the abstraction works out nice
enough it might be worth it.
Darrick J. Wong Aug. 29, 2024, 10:34 p.m. UTC | #5
On Thu, Aug 29, 2024 at 12:05:06PM +1000, Dave Chinner wrote:
> On Wed, Aug 28, 2024 at 06:14:24AM +0200, Christoph Hellwig wrote:
> > On Tue, Aug 27, 2024 at 04:35:01PM -0700, Darrick J. Wong wrote:
> > > This helps us remove a level of indentation in xfs_iroot_realloc because
> > > we can handle the zero-size case in a single place instead of repeatedly
> > > checking it.  We'll refactor further in the next patch.
> > 
> > I think we can do the same cleanup in xfs_iroot_realloc without this
> > special case:
> > 
> > This:
> > 
> > > +	new_size = xfs_bmap_broot_space_calc(mp, new_max);
> > > +	if (new_size == 0) {
> > > +		kfree(ifp->if_broot);
> > > +		ifp->if_broot = NULL;
> > > +		ifp->if_broot_bytes = 0;
> > > +		return;
> > 
> > becomes:
> > 
> > 	if (new_max == 0) {
> > 		kfree(ifp->if_broot);
> > 		ifp->if_broot = NULL;
> > 		ifp->if_broot_bytes = 0;
> > 		return;
> > 	}
> > 	new_size = xfs_bmap_broot_space_calc(mp, new_max);
> 
> I kinda prefer this version; I noticed the code could be cleaned
> up the way looking at some other patch earlier this morning...

As I pointed out to Christoph in this thread already, this change won't
age well because the rt rmap and refcount btrees will want to create
incore btree root blocks with zero records and then create btree cursors
around that.  This refactoring series, incidentally, comes from the
rtrmap series.  Cursor initialization will try to access ifp->if_broot,
which results in null pointer deref whackamole all over xfs_btree.c if
we do that.

I'm working on a better solution than that, which might be pointing
if_broot to a zeroed out rodata xfs_btree_block object and amending
xfs_iroot_free not to delete anything that's not a heap pointer.

We don't need that here yet because the bmap btree code only sets
new_max==0 when it's tearing down the ondisk btree prior to converting
to FMT_EXTENTS, but I'd rather not change this patch now just to revert
it a month from now back to what I originally posted.

--D

> -Dave.
> 
> -- 
> Dave Chinner
> david@fromorbit.com
>
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_bmap_btree.h b/fs/xfs/libxfs/xfs_bmap_btree.h
index af47174fca084..b7842c3420f04 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.h
+++ b/fs/xfs/libxfs/xfs_bmap_btree.h
@@ -163,6 +163,13 @@  xfs_bmap_broot_space_calc(
 	struct xfs_mount	*mp,
 	unsigned int		nrecs)
 {
+	/*
+	 * If the bmbt root block is empty, we should be converting the fork
+	 * to extents format.  Hence, the size is zero.
+	 */
+	if (nrecs == 0)
+		return 0;
+
 	return xfs_bmbt_block_len(mp) + \
 	       (nrecs * (sizeof(struct xfs_bmbt_key) + sizeof(xfs_bmbt_ptr_t)));
 }
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 973e027e3d883..acb1e9cc45b76 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -447,48 +447,32 @@  xfs_iroot_realloc(
 	cur_max = xfs_bmbt_maxrecs(mp, ifp->if_broot_bytes, 0);
 	new_max = cur_max + rec_diff;
 	ASSERT(new_max >= 0);
-	if (new_max > 0)
-		new_size = xfs_bmap_broot_space_calc(mp, new_max);
-	else
-		new_size = 0;
-	if (new_size > 0) {
-		new_broot = kmalloc(new_size, GFP_KERNEL | __GFP_NOFAIL);
-		/*
-		 * First copy over the btree block header.
-		 */
-		memcpy(new_broot, ifp->if_broot,
-			xfs_bmbt_block_len(ip->i_mount));
-	} else {
-		new_broot = NULL;
+
+	new_size = xfs_bmap_broot_space_calc(mp, new_max);
+	if (new_size == 0) {
+		kfree(ifp->if_broot);
+		ifp->if_broot = NULL;
+		ifp->if_broot_bytes = 0;
+		return;
 	}
 
-	/*
-	 * Only copy the keys and pointers if there are any.
-	 */
-	if (new_max > 0) {
-		/*
-		 * First copy the keys.
-		 */
-		op = (char *)xfs_bmbt_key_addr(mp, ifp->if_broot, 1);
-		np = (char *)xfs_bmbt_key_addr(mp, new_broot, 1);
-		memcpy(np, op, new_max * (uint)sizeof(xfs_bmbt_key_t));
+	new_broot = kmalloc(new_size, GFP_KERNEL | __GFP_NOFAIL);
+	memcpy(new_broot, ifp->if_broot, xfs_bmbt_block_len(ip->i_mount));
+
+	op = (char *)xfs_bmbt_key_addr(mp, ifp->if_broot, 1);
+	np = (char *)xfs_bmbt_key_addr(mp, new_broot, 1);
+	memcpy(np, op, new_max * sizeof(xfs_bmbt_key_t));
+
+	op = (char *)xfs_bmap_broot_ptr_addr(mp, ifp->if_broot, 1,
+			ifp->if_broot_bytes);
+	np = (char *)xfs_bmap_broot_ptr_addr(mp, new_broot, 1, (int)new_size);
+	memcpy(np, op, new_max * sizeof(xfs_fsblock_t));
 
-		/*
-		 * Then copy the pointers.
-		 */
-		op = (char *)xfs_bmap_broot_ptr_addr(mp, ifp->if_broot, 1,
-						     ifp->if_broot_bytes);
-		np = (char *)xfs_bmap_broot_ptr_addr(mp, new_broot, 1,
-						     (int)new_size);
-		memcpy(np, op, new_max * (uint)sizeof(xfs_fsblock_t));
-	}
 	kfree(ifp->if_broot);
 	ifp->if_broot = new_broot;
 	ifp->if_broot_bytes = (int)new_size;
-	if (ifp->if_broot)
-		ASSERT(xfs_bmap_bmdr_space(ifp->if_broot) <=
-			xfs_inode_fork_size(ip, whichfork));
-	return;
+	ASSERT(xfs_bmap_bmdr_space(ifp->if_broot) <=
+	       xfs_inode_fork_size(ip, whichfork));
 }