diff mbox series

[05/12] generic/562: handle ENOSPC while cloning gracefully

Message ID 173197064501.904310.1505759730439532159.stgit@frogsfrogsfrogs (mailing list archive)
State New
Headers show
Series [01/12] generic/757: fix various bugs in this test | expand

Commit Message

Darrick J. Wong Nov. 18, 2024, 11:02 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

This test creates a couple of patterned files on a tiny filesystem,
fragments the free space, clones one patterned file to the other, and
checks that the entire file was cloned.

However, this test doesn't work on a 64k fsblock filesystem because
we've used up all the free space reservation for the rmapbt, and that
causes the FICLONE to error out with ENOSPC partway through.  Hence we
need to detect the ENOSPC and _notrun the test.

That said, it turns out that XFS has been silently dropping error codes
if we managed to make some progress cloning extents.  That's ok if the
operation has REMAP_FILE_CAN_SHORTEN like copy_file_range does, but
FICLONE/FICLONERANGE do not permit partial results, so the dropped error
codes is actually an error.

Therefore, this testcase now becomes a regression test for the patch to
fix that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 tests/generic/562 |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Filipe Manana Nov. 19, 2024, 12:17 a.m. UTC | #1
On Mon, Nov 18, 2024 at 11:03 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> From: Darrick J. Wong <djwong@kernel.org>
>
> This test creates a couple of patterned files on a tiny filesystem,
> fragments the free space, clones one patterned file to the other, and
> checks that the entire file was cloned.
>
> However, this test doesn't work on a 64k fsblock filesystem because
> we've used up all the free space reservation for the rmapbt, and that
> causes the FICLONE to error out with ENOSPC partway through.  Hence we
> need to detect the ENOSPC and _notrun the test.
>
> That said, it turns out that XFS has been silently dropping error codes
> if we managed to make some progress cloning extents.  That's ok if the
> operation has REMAP_FILE_CAN_SHORTEN like copy_file_range does, but
> FICLONE/FICLONERANGE do not permit partial results, so the dropped error
> codes is actually an error.
>
> Therefore, this testcase now becomes a regression test for the patch to
> fix that.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  tests/generic/562 |   10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
>
> diff --git a/tests/generic/562 b/tests/generic/562
> index 91360c4154a6a2..62899945003513 100755
> --- a/tests/generic/562
> +++ b/tests/generic/562
> @@ -15,6 +15,9 @@ _begin_fstest auto clone punch
>  . ./common/filter
>  . ./common/reflink
>
> +test "$FSTYP" = "xfs" && \
> +       _fixed_by_kernel_commit XXXXXXXXXX "xfs: don't drop errno values when we fail to ficlone the entire range"
> +
>  _require_scratch_reflink
>  _require_test_program "punch-alternating"
>  _require_xfs_io_command "fpunch"
> @@ -48,8 +51,11 @@ while true; do
>  done
>
>  # Now clone file bar into file foo. This is supposed to succeed and not fail
> -# with ENOSPC for example.
> -_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full
> +# with ENOSPC for example.  However, XFS will sometimes run out of space.
> +_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full 2> $tmp.err
> +cat $tmp.err
> +grep -q 'No space left on device' $tmp.err && \
> +       _notrun "ran out of space while cloning"

This defeats the original purpose of the test, which was to verify
btrfs didn't fail with -ENOSPC (or any other error).

If XFS has an ENOSPC issue in some cases, and it's not fixable, why
not make it _notrun only if it's XFS that is being tested?

Thanks.


>
>  # Unmount and mount the filesystem again to verify the operation was durably
>  # persisted.
>
>
Darrick J. Wong Nov. 19, 2024, 12:31 a.m. UTC | #2
On Tue, Nov 19, 2024 at 12:17:56AM +0000, Filipe Manana wrote:
> On Mon, Nov 18, 2024 at 11:03 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > This test creates a couple of patterned files on a tiny filesystem,
> > fragments the free space, clones one patterned file to the other, and
> > checks that the entire file was cloned.
> >
> > However, this test doesn't work on a 64k fsblock filesystem because
> > we've used up all the free space reservation for the rmapbt, and that
> > causes the FICLONE to error out with ENOSPC partway through.  Hence we
> > need to detect the ENOSPC and _notrun the test.
> >
> > That said, it turns out that XFS has been silently dropping error codes
> > if we managed to make some progress cloning extents.  That's ok if the
> > operation has REMAP_FILE_CAN_SHORTEN like copy_file_range does, but
> > FICLONE/FICLONERANGE do not permit partial results, so the dropped error
> > codes is actually an error.
> >
> > Therefore, this testcase now becomes a regression test for the patch to
> > fix that.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  tests/generic/562 |   10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> >
> >
> > diff --git a/tests/generic/562 b/tests/generic/562
> > index 91360c4154a6a2..62899945003513 100755
> > --- a/tests/generic/562
> > +++ b/tests/generic/562
> > @@ -15,6 +15,9 @@ _begin_fstest auto clone punch
> >  . ./common/filter
> >  . ./common/reflink
> >
> > +test "$FSTYP" = "xfs" && \
> > +       _fixed_by_kernel_commit XXXXXXXXXX "xfs: don't drop errno values when we fail to ficlone the entire range"
> > +
> >  _require_scratch_reflink
> >  _require_test_program "punch-alternating"
> >  _require_xfs_io_command "fpunch"
> > @@ -48,8 +51,11 @@ while true; do
> >  done
> >
> >  # Now clone file bar into file foo. This is supposed to succeed and not fail
> > -# with ENOSPC for example.
> > -_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full
> > +# with ENOSPC for example.  However, XFS will sometimes run out of space.
> > +_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full 2> $tmp.err
> > +cat $tmp.err
> > +grep -q 'No space left on device' $tmp.err && \
> > +       _notrun "ran out of space while cloning"
> 
> This defeats the original purpose of the test, which was to verify
> btrfs didn't fail with -ENOSPC (or any other error).
> 
> If XFS has an ENOSPC issue in some cases, and it's not fixable, why
> not make it _notrun only if it's XFS that is being tested?

Ok, will do.  In the case of xfs, we don't let you share data if the
allocation group it's in is more than about 90% full.

--D

> Thanks.
> 
> 
> >
> >  # Unmount and mount the filesystem again to verify the operation was durably
> >  # persisted.
> >
> >
>
diff mbox series

Patch

diff --git a/tests/generic/562 b/tests/generic/562
index 91360c4154a6a2..62899945003513 100755
--- a/tests/generic/562
+++ b/tests/generic/562
@@ -15,6 +15,9 @@  _begin_fstest auto clone punch
 . ./common/filter
 . ./common/reflink
 
+test "$FSTYP" = "xfs" && \
+	_fixed_by_kernel_commit XXXXXXXXXX "xfs: don't drop errno values when we fail to ficlone the entire range"
+
 _require_scratch_reflink
 _require_test_program "punch-alternating"
 _require_xfs_io_command "fpunch"
@@ -48,8 +51,11 @@  while true; do
 done
 
 # Now clone file bar into file foo. This is supposed to succeed and not fail
-# with ENOSPC for example.
-_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full
+# with ENOSPC for example.  However, XFS will sometimes run out of space.
+_reflink $SCRATCH_MNT/bar $SCRATCH_MNT/foo >>$seqres.full 2> $tmp.err
+cat $tmp.err
+grep -q 'No space left on device' $tmp.err && \
+	_notrun "ran out of space while cloning"
 
 # Unmount and mount the filesystem again to verify the operation was durably
 # persisted.