diff mbox series

xfs: fix fsmap for internal zoned devices

Message ID 20250415003345.GF25675@frogsfrogsfrogs (mailing list archive)
State Accepted
Headers show
Series xfs: fix fsmap for internal zoned devices | expand

Commit Message

Darrick J. Wong April 15, 2025, 12:33 a.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

Filesystems with an internal zoned rt section use xfs_rtblock_t values
that are relative to the start of the data device.  When fsmap reports
on internal rt sections, it reports the space used by the data section
as "OWN_FS".

Unfortunately, the logic for resuming a query isn't quite right, so
xfs/273 fails because it stress-tests GETFSMAP with a single-record
buffer.  If we enter the "report fake space as OWN_FS" block with a
nonzero key[0].fmr_length, we should add that to key[0].fmr_physical
and recheck if we still need to emit the fake record.  We should /not/
just return 0 from the whole function because that prevents all rmap
record iteration.

If we don't enter that block, the resumption is still wrong.
keys[*].fmr_physical is a reflection of what we copied out to userspace
on a previous query, which means that it already accounts for rgstart.
It is not correct to add rtstart_daddr when computing start_rtb or
end_rtb, so stop that.

While we're at it, add a xfs_has_zoned to make it clear that this is a
zoned filesystem thing.

Fixes: e50ec7fac81aa2 ("xfs: enable fsmap reporting for internal RT devices")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 fs/xfs/xfs_fsmap.c |   51 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 18 deletions(-)

Comments

Christoph Hellwig April 16, 2025, 4:36 a.m. UTC | #1
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
Darrick J. Wong April 16, 2025, 4:10 p.m. UTC | #2
On Tue, Apr 15, 2025 at 09:36:14PM -0700, Christoph Hellwig wrote:
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Hey Carlos,

Can you pick up this bugfix for 6.15-fixes, please?

--D
Carlos Maiolino April 16, 2025, 4:26 p.m. UTC | #3
On Wed, Apr 16, 2025 at 09:10:57AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 15, 2025 at 09:36:14PM -0700, Christoph Hellwig wrote:
> > Looks good:
> >
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> Hey Carlos,
> 
> Can you pick up this bugfix for 6.15-fixes, please?

Already picked, it's in my queue, I'm pushing it tomorrow morning.


> 
> --D
Darrick J. Wong April 16, 2025, 4:27 p.m. UTC | #4
On Wed, Apr 16, 2025 at 06:26:03PM +0200, Carlos Maiolino wrote:
> On Wed, Apr 16, 2025 at 09:10:57AM -0700, Darrick J. Wong wrote:
> > On Tue, Apr 15, 2025 at 09:36:14PM -0700, Christoph Hellwig wrote:
> > > Looks good:
> > >
> > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > 
> > Hey Carlos,
> > 
> > Can you pick up this bugfix for 6.15-fixes, please?
> 
> Already picked, it's in my queue, I'm pushing it tomorrow morning.

Ah, ok.  Thanks!

--D

> 
> > 
> > --D
>
Carlos Maiolino April 17, 2025, 8:55 a.m. UTC | #5
On Mon, 14 Apr 2025 17:33:45 -0700, Darrick J. Wong wrote:
> Filesystems with an internal zoned rt section use xfs_rtblock_t values
> that are relative to the start of the data device.  When fsmap reports
> on internal rt sections, it reports the space used by the data section
> as "OWN_FS".
> 
> Unfortunately, the logic for resuming a query isn't quite right, so
> xfs/273 fails because it stress-tests GETFSMAP with a single-record
> buffer.  If we enter the "report fake space as OWN_FS" block with a
> nonzero key[0].fmr_length, we should add that to key[0].fmr_physical
> and recheck if we still need to emit the fake record.  We should /not/
> just return 0 from the whole function because that prevents all rmap
> record iteration.
> 
> [...]

Applied to for-next, thanks!

[1/1] xfs: fix fsmap for internal zoned devices
      commit: c6f1401b1d5fb3b83bd16ba8a0f89a8b1c805993

Best regards,
diff mbox series

Patch

diff --git a/fs/xfs/xfs_fsmap.c b/fs/xfs/xfs_fsmap.c
index a4bc1642fe5615..414b27a8645886 100644
--- a/fs/xfs/xfs_fsmap.c
+++ b/fs/xfs/xfs_fsmap.c
@@ -876,6 +876,7 @@  xfs_getfsmap_rtdev_rmapbt(
 	const struct xfs_fsmap		*keys,
 	struct xfs_getfsmap_info	*info)
 {
+	struct xfs_fsmap		key0 = *keys; /* struct copy */
 	struct xfs_mount		*mp = tp->t_mountp;
 	struct xfs_rtgroup		*rtg = NULL;
 	struct xfs_btree_cur		*bt_cur = NULL;
@@ -887,32 +888,46 @@  xfs_getfsmap_rtdev_rmapbt(
 	int				error = 0;
 
 	eofs = XFS_FSB_TO_BB(mp, mp->m_sb.sb_rtstart + mp->m_sb.sb_rblocks);
-	if (keys[0].fmr_physical >= eofs)
+	if (key0.fmr_physical >= eofs)
 		return 0;
 
+	/*
+	 * On zoned filesystems with an internal rt volume, the volume comes
+	 * immediately after the end of the data volume.  However, the
+	 * xfs_rtblock_t address space is relative to the start of the data
+	 * device, which means that the first @rtstart fsblocks do not actually
+	 * point anywhere.  If a fsmap query comes in with the low key starting
+	 * below @rtstart, report it as "owned by filesystem".
+	 */
 	rtstart_daddr = XFS_FSB_TO_BB(mp, mp->m_sb.sb_rtstart);
-	if (keys[0].fmr_physical < rtstart_daddr) {
+	if (xfs_has_zoned(mp) && key0.fmr_physical < rtstart_daddr) {
 		struct xfs_fsmap_irec		frec = {
 			.owner			= XFS_RMAP_OWN_FS,
 			.len_daddr		= rtstart_daddr,
 		};
 
-		/* Adjust the low key if we are continuing from where we left off. */
-		if (keys[0].fmr_length > 0) {
-			info->low_daddr = keys[0].fmr_physical + keys[0].fmr_length;
-			return 0;
+		/*
+		 * Adjust the start of the query range if we're picking up from
+		 * a previous round, and only emit the record if we haven't
+		 * already gone past.
+		 */
+		key0.fmr_physical += key0.fmr_length;
+		if (key0.fmr_physical < rtstart_daddr) {
+			error = xfs_getfsmap_helper(tp, info, &frec);
+			if (error)
+				return error;
+
+			key0.fmr_physical = rtstart_daddr;
 		}
 
-		/* Fabricate an rmap entry for space occupied by the data dev */
-		error = xfs_getfsmap_helper(tp, info, &frec);
-		if (error)
-			return error;
+		/* Zero the other fields to avoid further adjustments. */
+		key0.fmr_owner = 0;
+		key0.fmr_offset = 0;
+		key0.fmr_length = 0;
 	}
 
-	start_rtb = xfs_daddr_to_rtb(mp, rtstart_daddr + keys[0].fmr_physical);
-	end_rtb = xfs_daddr_to_rtb(mp, rtstart_daddr +
-			min(eofs - 1, keys[1].fmr_physical));
-
+	start_rtb = xfs_daddr_to_rtb(mp, key0.fmr_physical);
+	end_rtb = xfs_daddr_to_rtb(mp, min(eofs - 1, keys[1].fmr_physical));
 	info->missing_owner = XFS_FMR_OWN_FREE;
 
 	/*
@@ -920,12 +935,12 @@  xfs_getfsmap_rtdev_rmapbt(
 	 * low to the fsmap low key and max out the high key to the end
 	 * of the rtgroup.
 	 */
-	info->low.rm_offset = XFS_BB_TO_FSBT(mp, keys[0].fmr_offset);
-	error = xfs_fsmap_owner_to_rmap(&info->low, &keys[0]);
+	info->low.rm_offset = XFS_BB_TO_FSBT(mp, key0.fmr_offset);
+	error = xfs_fsmap_owner_to_rmap(&info->low, &key0);
 	if (error)
 		return error;
-	info->low.rm_blockcount = XFS_BB_TO_FSBT(mp, keys[0].fmr_length);
-	xfs_getfsmap_set_irec_flags(&info->low, &keys[0]);
+	info->low.rm_blockcount = XFS_BB_TO_FSBT(mp, key0.fmr_length);
+	xfs_getfsmap_set_irec_flags(&info->low, &key0);
 
 	/* Adjust the low key if we are continuing from where we left off. */
 	if (info->low.rm_blockcount == 0) {