diff mbox series

[34/43] xfs: disable reflink for zoned file systems

Message ID 20250206064511.2323878-35-hch@lst.de (mailing list archive)
State Not Applicable, archived
Headers show
Series [01/43] xfs: factor out a xfs_rt_check_size helper | expand

Commit Message

hch Feb. 6, 2025, 6:44 a.m. UTC
While the zoned on-disk format supports reflinks, the GC code currently
always unshares reflinks when moving blocks to new zones, thus making the
feature unusuable.  Disable reflinks until the GC code is refcount aware.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_super.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Darrick J. Wong Feb. 7, 2025, 4:31 a.m. UTC | #1
On Thu, Feb 06, 2025 at 07:44:50AM +0100, Christoph Hellwig wrote:
> While the zoned on-disk format supports reflinks, the GC code currently
> always unshares reflinks when moving blocks to new zones, thus making the
> feature unusuable.  Disable reflinks until the GC code is refcount aware.

It might be worth mentioning that I've been working on a refcount-aware
free space defragmenter that could be used for this kind of thing,
albeit with the usual problems of userspace can't really stop the
kernel from filling its brains and starving our defragc process.

It would be interesting to load up that adversarial thread timing
sched_ext thing that I hear was talked about at fosdem.

> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reluctantly,
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_super.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 34b0f5a80412..ceb1a855453e 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1824,6 +1824,13 @@ xfs_fs_fill_super(
>  			goto out_filestream_unmount;
>  		}
>  
> +		if (xfs_has_zoned(mp)) {
> +			xfs_alert(mp,
> +	"reflink not compatible with zoned RT device!");
> +			error = -EINVAL;
> +			goto out_filestream_unmount;
> +		}
> +
>  		if (xfs_globals.always_cow) {
>  			xfs_info(mp, "using DEBUG-only always_cow mode.");
>  			mp->m_always_cow = true;
> -- 
> 2.45.2
> 
>
hch Feb. 7, 2025, 4:54 a.m. UTC | #2
On Thu, Feb 06, 2025 at 08:31:23PM -0800, Darrick J. Wong wrote:
> On Thu, Feb 06, 2025 at 07:44:50AM +0100, Christoph Hellwig wrote:
> > While the zoned on-disk format supports reflinks, the GC code currently
> > always unshares reflinks when moving blocks to new zones, thus making the
> > feature unusuable.  Disable reflinks until the GC code is refcount aware.
> 
> It might be worth mentioning that I've been working on a refcount-aware
> free space defragmenter that could be used for this kind of thing,
> albeit with the usual problems of userspace can't really stop the
> kernel from filling its brains and starving our defragc process.
> 
> It would be interesting to load up that adversarial thread timing
> sched_ext thing that I hear was talked about at fosdem.

Not sure waht sched_ext thing and how it's relevant.

It's just that refcount awareness is a bit of work and not really
required for the initial use cases.  It might also have fun implications
for the metabtree reservations, but otherwise I don't think it's rocket
science.  Even more so with a pre-existing implementation to steal ideas
from.

Talking about stealing ideas - the in-kernel GC flow should mostly work
for a regular file system as well.  I don't think actually sharing much
code is going to be useful because the block reservations work so
differently, but I suspect doing it in-kernel using the same "check if
the mapping change at I/O completion time" scheme use in GC would work
very well for a freespace defragmenter on a regular file system.  And
doing it in the kernel will be more efficient at operation time and
probably also be a lot less code than doing it with a whole bunch of
userspace to kernel roundtrips.
Darrick J. Wong Feb. 7, 2025, 5:05 a.m. UTC | #3
On Fri, Feb 07, 2025 at 05:54:46AM +0100, Christoph Hellwig wrote:
> On Thu, Feb 06, 2025 at 08:31:23PM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 06, 2025 at 07:44:50AM +0100, Christoph Hellwig wrote:
> > > While the zoned on-disk format supports reflinks, the GC code currently
> > > always unshares reflinks when moving blocks to new zones, thus making the
> > > feature unusuable.  Disable reflinks until the GC code is refcount aware.
> > 
> > It might be worth mentioning that I've been working on a refcount-aware
> > free space defragmenter that could be used for this kind of thing,
> > albeit with the usual problems of userspace can't really stop the
> > kernel from filling its brains and starving our defragc process.
> > 
> > It would be interesting to load up that adversarial thread timing
> > sched_ext thing that I hear was talked about at fosdem.
> 
> Not sure waht sched_ext thing and how it's relevant.

https://lwn.net/Articles/1007689/

IOWs letting the wrong threads run ahead, to see if the author got the
locking right.

> It's just that refcount awareness is a bit of work and not really
> required for the initial use cases.  It might also have fun implications
> for the metabtree reservations, but otherwise I don't think it's rocket
> science.  Even more so with a pre-existing implementation to steal ideas
> from.
> 
> Talking about stealing ideas - the in-kernel GC flow should mostly work
> for a regular file system as well.  I don't think actually sharing much
> code is going to be useful because the block reservations work so
> differently, but I suspect doing it in-kernel using the same "check if
> the mapping change at I/O completion time" scheme use in GC would work
> very well for a freespace defragmenter on a regular file system.  And
> doing it in the kernel will be more efficient at operation time and
> probably also be a lot less code than doing it with a whole bunch of
> userspace to kernel roundtrips.

Admittedly there are some fun things that you can do in the kernel that
you can't do from userspace, like create a secret unlinked file that you
can map freespace into, and turn EFIs into a bmapi operation, and
reflink file blocks into without having to deal with eof blocks, and
turning on alwayscow for files you're actively trying to move around.
But that's a fun thing to think about now that I have far fewer patches
to juggle in my head, for which I am grateful. :)

--D
diff mbox series

Patch

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 34b0f5a80412..ceb1a855453e 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1824,6 +1824,13 @@  xfs_fs_fill_super(
 			goto out_filestream_unmount;
 		}
 
+		if (xfs_has_zoned(mp)) {
+			xfs_alert(mp,
+	"reflink not compatible with zoned RT device!");
+			error = -EINVAL;
+			goto out_filestream_unmount;
+		}
+
 		if (xfs_globals.always_cow) {
 			xfs_info(mp, "using DEBUG-only always_cow mode.");
 			mp->m_always_cow = true;