diff mbox series

[v3,3/6] generic: copy_file_range swapfile test

Message ID 20190602124114.26810-4-amir73il@gmail.com (mailing list archive)
State New, archived
Headers show
Series fstests: copy_file_range() tests | expand

Commit Message

Amir Goldstein June 2, 2019, 12:41 p.m. UTC
This test case was split out of Dave Chinner's copy_file_range bounds
check test to reduce the requirements for running the bounds check.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 tests/generic/989     | 56 +++++++++++++++++++++++++++++++++++++++++++
 tests/generic/989.out |  4 ++++
 tests/generic/group   |  1 +
 3 files changed, 61 insertions(+)
 create mode 100755 tests/generic/989
 create mode 100644 tests/generic/989.out

Comments

Theodore Ts'o June 10, 2019, 3:58 a.m. UTC | #1
On Sun, Jun 02, 2019 at 03:41:11PM +0300, Amir Goldstein wrote:
> This test case was split out of Dave Chinner's copy_file_range bounds
> check test to reduce the requirements for running the bounds check.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

I've just updated to the latest fstests, where this has landed as
generic/554.  This test is failing on ext4, and should fail on all
file systems which do not support copy_file_range (ext4, nfsv3, etc.),
since xfs_io will fall back to emulating this via reading and writing
the file, and this causes a test failure because:

> +echo swap files return ETXTBUSY
> +_format_swapfile $testdir/swapfile 16m
> +swapon $testdir/swapfile
> +$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/file" $testdir/swapfile
> +$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/swapfile" $testdir/copy
> +swapoff $testdir/swapfile

Currently, the VFS doesn't prevent us from reading a swap file.
Perhaps it shouldn't, for security (theatre) reasons, but root can
read the raw block device anyway, so it's really kind of pointless.

I'm not sure what's the best way fix this, but I'm going to exclude
this test in my test appliance builds for now.

     	     	     	       	      	  - Ted
Amir Goldstein June 10, 2019, 6:37 a.m. UTC | #2
On Mon, Jun 10, 2019 at 6:58 AM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Sun, Jun 02, 2019 at 03:41:11PM +0300, Amir Goldstein wrote:
> > This test case was split out of Dave Chinner's copy_file_range bounds
> > check test to reduce the requirements for running the bounds check.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
> I've just updated to the latest fstests, where this has landed as
> generic/554.  This test is failing on ext4, and should fail on all
> file systems which do not support copy_file_range (ext4, nfsv3, etc.),
> since xfs_io will fall back to emulating this via reading and writing

Why do you think this is xfs_io fall back and not kernel fall back to
do_splice_direct()? Anyway, both cases allow read from swapfile
on upstream.

> the file, and this causes a test failure because:
>
> > +echo swap files return ETXTBUSY
> > +_format_swapfile $testdir/swapfile 16m
> > +swapon $testdir/swapfile
> > +$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/file" $testdir/swapfile
> > +$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/swapfile" $testdir/copy
> > +swapoff $testdir/swapfile
>
> Currently, the VFS doesn't prevent us from reading a swap file.
> Perhaps it shouldn't, for security (theatre) reasons, but root can
> read the raw block device anyway, so it's really kind of pointless.
>

Hmm, my intention with the copy_file_range() behavior was that
it mostly follows user copy limitations/semantics.
I guess preventing copy *from* swapfile doesn't make much sense
and we should relax this check in the new c_f_r bounds check in VFS.

> I'm not sure what's the best way fix this, but I'm going to exclude
> this test in my test appliance builds for now.
>

Trying to understand the desired flow of tests and fixes.
I agree that generic/554 failure may be a test/interface bug that
we should fix in a way that current upstream passes the test for
ext4. Unless there is objection, I will send a patch to fix the test
to only test copy *to* swapfile.

generic/553, OTOH, is expected to fail on upstream kernel.
Are you leaving 553 in appliance build in anticipation to upstream fix?
I guess the answer is in the ext4 IS_IMMUTABLE patch that you
posted and plan to push to upstream/stable sooner than VFS patches.

In any case, the VFS c_f_r patches are aiming for v5.3 and
I will make sure to promote them for stable as well.

Do you think that should there be a different policy w.r.t timing of
merging xfstests tests that fail on upstream kernel?

Thanks,
Amir.
Amir Goldstein June 10, 2019, 9:08 a.m. UTC | #3
On Mon, Jun 10, 2019 at 9:37 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Jun 10, 2019 at 6:58 AM Theodore Ts'o <tytso@mit.edu> wrote:
> >
> > On Sun, Jun 02, 2019 at 03:41:11PM +0300, Amir Goldstein wrote:
> > > This test case was split out of Dave Chinner's copy_file_range bounds
> > > check test to reduce the requirements for running the bounds check.
> > >
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >
> > I've just updated to the latest fstests, where this has landed as
> > generic/554.  This test is failing on ext4, and should fail on all
> > file systems which do not support copy_file_range (ext4, nfsv3, etc.),
> > since xfs_io will fall back to emulating this via reading and writing
>
> Why do you think this is xfs_io fall back and not kernel fall back to
> do_splice_direct()? Anyway, both cases allow read from swapfile
> on upstream.
>
> > the file, and this causes a test failure because:
> >
> > > +echo swap files return ETXTBUSY
> > > +_format_swapfile $testdir/swapfile 16m
> > > +swapon $testdir/swapfile
> > > +$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/file" $testdir/swapfile
> > > +$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/swapfile" $testdir/copy
> > > +swapoff $testdir/swapfile
> >
> > Currently, the VFS doesn't prevent us from reading a swap file.
> > Perhaps it shouldn't, for security (theatre) reasons, but root can
> > read the raw block device anyway, so it's really kind of pointless.
> >
>
> Hmm, my intention with the copy_file_range() behavior was that
> it mostly follows user copy limitations/semantics.
> I guess preventing copy *from* swapfile doesn't make much sense
> and we should relax this check in the new c_f_r bounds check in VFS.
>
> > I'm not sure what's the best way fix this, but I'm going to exclude
> > this test in my test appliance builds for now.
> >
>
> Trying to understand the desired flow of tests and fixes.
> I agree that generic/554 failure may be a test/interface bug that
> we should fix in a way that current upstream passes the test for
> ext4. Unless there is objection, I will send a patch to fix the test
> to only test copy *to* swapfile.

I made this change and test still failed on upstream ext4, because
kernel performs copy_file_range() to swapfile.
To me that seems like a real kernel bug, which is addressed by vfs
c_f_r patches, so I don't know if you should be excluding the test
from test appliance after all ?

Thanks,
Amir.

>
> generic/553, OTOH, is expected to fail on upstream kernel.
> Are you leaving 553 in appliance build in anticipation to upstream fix?
> I guess the answer is in the ext4 IS_IMMUTABLE patch that you
> posted and plan to push to upstream/stable sooner than VFS patches.
>
> In any case, the VFS c_f_r patches are aiming for v5.3 and
> I will make sure to promote them for stable as well.
>
> Do you think that should there be a different policy w.r.t timing of
> merging xfstests tests that fail on upstream kernel?
>
> Thanks,
> Amir.
Theodore Ts'o June 10, 2019, 1:31 p.m. UTC | #4
On Mon, Jun 10, 2019 at 09:37:32AM +0300, Amir Goldstein wrote:
>
>Why do you think thhis is xfs_io fall back and not kernel fall back to
>do_splice_direct()? Anyway, both cases allow read from swapfile
>on upstream.

Ah, I had assumed this was changed that was made because if you are
implementing copy_file_range in terms of some kind of reflink-like
mechanism, it becomes super-messy since you know have to break tons
and tons of COW sharing each time the kernel swaps to the swap file.

I didn't think we had (or maybe we did, and I missed it) a discussion
about whether reading from a swap file should be prohibited.
Personally, I think it's security theatre, and not worth the
effort/overhead, but whatever.... my main complaint was with the
unnecessary test failures with upstream kernels.

> Trying to understand the desired flow of tests and fixes. 
> I agree that generic/554 failure may be a test/interface bug that
> we should fix in a way that current upstream passes the test for
> ext4. Unless there is objection, I will send a patch to fix the test
> to only test copy *to* swapfile.
> 
> generic/553, OTOH, is expected to fail on upstream kernel.
> Are you leaving 553 in appliance build in anticipation to upstream fix?
> I guess the answer is in the ext4 IS_IMMUTABLE patch that you
> posted and plan to push to upstream/stable sooner than VFS patches.

So I find it kind of annoying when tests land before the fixes do
upstream.  I still have this in my global_exclude file:

# The proposed fix for generic/484, "locks: change POSIX lock
# ownership on execve when files_struct is displaced" would break NFS
# Jeff Layton and Eric Biederman have some ideas for how to address it
# but fixing it is non-trivial
generic/484

The generic/484 test landed in August 2018, and as far as I know, the
issue which it is testing for *still* hasn't been fixed upstream.
(There were issues raised with the proposed fix, and it looks like the
people looking at the kernel change have lost interest.)

The problem is that there are people who are trying to use xfstests to
look for failures, and unless they start digging into the kernel
archives from last year, they won't understand that generic/484 is a
known failing test, and it will get fixed....someday.

For people in the know, or for people who use my kvm-xfstests,
gce-xfstests, it's not a big problem, since I've already blacklisted
the test.  But not everyone (and in fact, probably most people don't)
use my front end scripts.

For generic/553, I have a fix in ext4 so it will clear the failure,
and that's fine, since I think we've all agreed in principle what the
correct fix will be, and presumably it will get fixed soon.  At that
point, I might revert the commit from ext4, and rely on the VFS to
catch the error, but the overhead of a few extra unlikely() tests
aren't that big.  But yeah, I did that mainly because unnecessary test
failures because doing an ext4-specific fix didn't have many
downsides, and one risk of adding tests to the global exclude file is
that I then have to remember to remove it from the global exclude file
when the issue is finally fixed upstream.

> Do you think that should there be a different policy w.r.t timing of
> merging xfstests tests that fail on upstream kernel?

That's my opinion, and generic/484 is the best argument for why we
should wait.  Other people may have other opinions though, and I have
a workaround, so I don't feel super-strong about it.  (generic/454 is
now the second test in my global exclude file.  :-)

At the very *least* there should be a comment in the test that fix is
pending, and might not be applied yet, with a URL to the mailing list
discussion.  That will save effort when months (years?) go by, and the
fix still hasn't landed the upstream kernel....

	       	       	      	      - Ted
Darrick J. Wong June 10, 2019, 4:06 p.m. UTC | #5
On Mon, Jun 10, 2019 at 09:31:31AM -0400, Theodore Ts'o wrote:
> On Mon, Jun 10, 2019 at 09:37:32AM +0300, Amir Goldstein wrote:
> >
> >Why do you think thhis is xfs_io fall back and not kernel fall back to
> >do_splice_direct()? Anyway, both cases allow read from swapfile
> >on upstream.
> 
> Ah, I had assumed this was changed that was made because if you are
> implementing copy_file_range in terms of some kind of reflink-like
> mechanism, it becomes super-messy since you know have to break tons
> and tons of COW sharing each time the kernel swaps to the swap file.
> 
> I didn't think we had (or maybe we did, and I missed it) a discussion
> about whether reading from a swap file should be prohibited.
> Personally, I think it's security theatre, and not worth the
> effort/overhead, but whatever.... my main complaint was with the
> unnecessary test failures with upstream kernels.
> 
> > Trying to understand the desired flow of tests and fixes. 
> > I agree that generic/554 failure may be a test/interface bug that
> > we should fix in a way that current upstream passes the test for
> > ext4. Unless there is objection, I will send a patch to fix the test
> > to only test copy *to* swapfile.
> > 
> > generic/553, OTOH, is expected to fail on upstream kernel.
> > Are you leaving 553 in appliance build in anticipation to upstream fix?
> > I guess the answer is in the ext4 IS_IMMUTABLE patch that you
> > posted and plan to push to upstream/stable sooner than VFS patches.
> 
> So I find it kind of annoying when tests land before the fixes do
> upstream.  I still have this in my global_exclude file:

Yeah, it's awkward for VFS fixes because on the one hand we don't want
to have multiyear regressions like generic/484, but OTOH stuffing tests
in before code goes upstream enables broader testing by the other fs
maintainers.

In any case, the fixes are in the copy-range-fixes branch which I'm
finally publishing...

> # The proposed fix for generic/484, "locks: change POSIX lock
> # ownership on execve when files_struct is displaced" would break NFS
> # Jeff Layton and Eric Biederman have some ideas for how to address it
> # but fixing it is non-trivial

Also, uh, can we remove this from the auto and quick groups for now?

--D

> generic/484
> 
> The generic/484 test landed in August 2018, and as far as I know, the
> issue which it is testing for *still* hasn't been fixed upstream.
> (There were issues raised with the proposed fix, and it looks like the
> people looking at the kernel change have lost interest.)
> 
> The problem is that there are people who are trying to use xfstests to
> look for failures, and unless they start digging into the kernel
> archives from last year, they won't understand that generic/484 is a
> known failing test, and it will get fixed....someday.
> 
> For people in the know, or for people who use my kvm-xfstests,
> gce-xfstests, it's not a big problem, since I've already blacklisted
> the test.  But not everyone (and in fact, probably most people don't)
> use my front end scripts.
> 
> For generic/553, I have a fix in ext4 so it will clear the failure,
> and that's fine, since I think we've all agreed in principle what the
> correct fix will be, and presumably it will get fixed soon.  At that
> point, I might revert the commit from ext4, and rely on the VFS to
> catch the error, but the overhead of a few extra unlikely() tests
> aren't that big.  But yeah, I did that mainly because unnecessary test
> failures because doing an ext4-specific fix didn't have many
> downsides, and one risk of adding tests to the global exclude file is
> that I then have to remember to remove it from the global exclude file
> when the issue is finally fixed upstream.
> 
> > Do you think that should there be a different policy w.r.t timing of
> > merging xfstests tests that fail on upstream kernel?
> 
> That's my opinion, and generic/484 is the best argument for why we
> should wait.  Other people may have other opinions though, and I have
> a workaround, so I don't feel super-strong about it.  (generic/454 is
> now the second test in my global exclude file.  :-)
> 
> At the very *least* there should be a comment in the test that fix is
> pending, and might not be applied yet, with a URL to the mailing list
> discussion.  That will save effort when months (years?) go by, and the
> fix still hasn't landed the upstream kernel....
> 
> 	       	       	      	      - Ted
Amir Goldstein June 10, 2019, 4:54 p.m. UTC | #6
On Mon, Jun 10, 2019 at 7:06 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Mon, Jun 10, 2019 at 09:31:31AM -0400, Theodore Ts'o wrote:
> > On Mon, Jun 10, 2019 at 09:37:32AM +0300, Amir Goldstein wrote:
> > >
> > >Why do you think thhis is xfs_io fall back and not kernel fall back to
> > >do_splice_direct()? Anyway, both cases allow read from swapfile
> > >on upstream.
> >
> > Ah, I had assumed this was changed that was made because if you are
> > implementing copy_file_range in terms of some kind of reflink-like
> > mechanism, it becomes super-messy since you know have to break tons
> > and tons of COW sharing each time the kernel swaps to the swap file.
> >
> > I didn't think we had (or maybe we did, and I missed it) a discussion
> > about whether reading from a swap file should be prohibited.
> > Personally, I think it's security theatre, and not worth the
> > effort/overhead, but whatever.... my main complaint was with the
> > unnecessary test failures with upstream kernels.
> >
> > > Trying to understand the desired flow of tests and fixes.
> > > I agree that generic/554 failure may be a test/interface bug that
> > > we should fix in a way that current upstream passes the test for
> > > ext4. Unless there is objection, I will send a patch to fix the test
> > > to only test copy *to* swapfile.
> > >
> > > generic/553, OTOH, is expected to fail on upstream kernel.
> > > Are you leaving 553 in appliance build in anticipation to upstream fix?
> > > I guess the answer is in the ext4 IS_IMMUTABLE patch that you
> > > posted and plan to push to upstream/stable sooner than VFS patches.
> >
> > So I find it kind of annoying when tests land before the fixes do
> > upstream.  I still have this in my global_exclude file:
>
> Yeah, it's awkward for VFS fixes because on the one hand we don't want
> to have multiyear regressions like generic/484, but OTOH stuffing tests
> in before code goes upstream enables broader testing by the other fs
> maintainers.

And to prove this point, Ted pointed out a test bug in 554, which also
affects the kernel and man pages fixes, so it was really worth it ;-)

>
> In any case, the fixes are in the copy-range-fixes branch which I'm
> finally publishing...
>
> > # The proposed fix for generic/484, "locks: change POSIX lock
> > # ownership on execve when files_struct is displaced" would break NFS
> > # Jeff Layton and Eric Biederman have some ideas for how to address it
> > # but fixing it is non-trivial
>
> Also, uh, can we remove this from the auto and quick groups for now?
>

I am not opposed to removing these test from auto,quick, although removing
from quick is a bit shady. I would like to mark them explicitly with group
known_issues, so that users can run ./check -g quick -x known_issues.
BTW, overlay/061 is also a known_issue that is going to be hard to fix.

But anyway, neither 553 nor 554 fall into that category.

Thanks,
Amir.
Darrick J. Wong June 10, 2019, 5:04 p.m. UTC | #7
On Mon, Jun 10, 2019 at 07:54:52PM +0300, Amir Goldstein wrote:
> On Mon, Jun 10, 2019 at 7:06 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > On Mon, Jun 10, 2019 at 09:31:31AM -0400, Theodore Ts'o wrote:
> > > On Mon, Jun 10, 2019 at 09:37:32AM +0300, Amir Goldstein wrote:
> > > >
> > > >Why do you think thhis is xfs_io fall back and not kernel fall back to
> > > >do_splice_direct()? Anyway, both cases allow read from swapfile
> > > >on upstream.
> > >
> > > Ah, I had assumed this was changed that was made because if you are
> > > implementing copy_file_range in terms of some kind of reflink-like
> > > mechanism, it becomes super-messy since you know have to break tons
> > > and tons of COW sharing each time the kernel swaps to the swap file.
> > >
> > > I didn't think we had (or maybe we did, and I missed it) a discussion
> > > about whether reading from a swap file should be prohibited.
> > > Personally, I think it's security theatre, and not worth the
> > > effort/overhead, but whatever.... my main complaint was with the
> > > unnecessary test failures with upstream kernels.
> > >
> > > > Trying to understand the desired flow of tests and fixes.
> > > > I agree that generic/554 failure may be a test/interface bug that
> > > > we should fix in a way that current upstream passes the test for
> > > > ext4. Unless there is objection, I will send a patch to fix the test
> > > > to only test copy *to* swapfile.
> > > >
> > > > generic/553, OTOH, is expected to fail on upstream kernel.
> > > > Are you leaving 553 in appliance build in anticipation to upstream fix?
> > > > I guess the answer is in the ext4 IS_IMMUTABLE patch that you
> > > > posted and plan to push to upstream/stable sooner than VFS patches.
> > >
> > > So I find it kind of annoying when tests land before the fixes do
> > > upstream.  I still have this in my global_exclude file:
> >
> > Yeah, it's awkward for VFS fixes because on the one hand we don't want
> > to have multiyear regressions like generic/484, but OTOH stuffing tests
> > in before code goes upstream enables broader testing by the other fs
> > maintainers.
> 
> And to prove this point, Ted pointed out a test bug in 554, which also
> affects the kernel and man pages fixes, so it was really worth it ;-)

:D

> >
> > In any case, the fixes are in the copy-range-fixes branch which I'm
> > finally publishing...
> >
> > > # The proposed fix for generic/484, "locks: change POSIX lock
> > > # ownership on execve when files_struct is displaced" would break NFS
> > > # Jeff Layton and Eric Biederman have some ideas for how to address it
> > > # but fixing it is non-trivial
> >
> > Also, uh, can we remove this from the auto and quick groups for now?
> >
> 
> I am not opposed to removing these test from auto,quick, although removing
> from quick is a bit shady. I would like to mark them explicitly with group
> known_issues, so that users can run ./check -g quick -x known_issues.
> BTW, overlay/061 is also a known_issue that is going to be hard to fix.
> 
> But anyway, neither 553 nor 554 fall into that category.

Sorry, I was unclear -- I was asking to remove g/484 from auto/quick.

--D

> Thanks,
> Amir.
Eryu Guan June 11, 2019, 2:12 a.m. UTC | #8
On Mon, Jun 10, 2019 at 09:31:31AM -0400, Theodore Ts'o wrote:
> On Mon, Jun 10, 2019 at 09:37:32AM +0300, Amir Goldstein wrote:
> >
> >Why do you think thhis is xfs_io fall back and not kernel fall back to
> >do_splice_direct()? Anyway, both cases allow read from swapfile
> >on upstream.
> 
> Ah, I had assumed this was changed that was made because if you are
> implementing copy_file_range in terms of some kind of reflink-like
> mechanism, it becomes super-messy since you know have to break tons
> and tons of COW sharing each time the kernel swaps to the swap file.
> 
> I didn't think we had (or maybe we did, and I missed it) a discussion
> about whether reading from a swap file should be prohibited.
> Personally, I think it's security theatre, and not worth the
> effort/overhead, but whatever.... my main complaint was with the
> unnecessary test failures with upstream kernels.
> 
> > Trying to understand the desired flow of tests and fixes. 
> > I agree that generic/554 failure may be a test/interface bug that
> > we should fix in a way that current upstream passes the test for
> > ext4. Unless there is objection, I will send a patch to fix the test
> > to only test copy *to* swapfile.
> > 
> > generic/553, OTOH, is expected to fail on upstream kernel.
> > Are you leaving 553 in appliance build in anticipation to upstream fix?
> > I guess the answer is in the ext4 IS_IMMUTABLE patch that you
> > posted and plan to push to upstream/stable sooner than VFS patches.
> 
> So I find it kind of annoying when tests land before the fixes do
> upstream.  I still have this in my global_exclude file:
> 
> # The proposed fix for generic/484, "locks: change POSIX lock
> # ownership on execve when files_struct is displaced" would break NFS
> # Jeff Layton and Eric Biederman have some ideas for how to address it
> # but fixing it is non-trivial
> generic/484
> 
> The generic/484 test landed in August 2018, and as far as I know, the
> issue which it is testing for *still* hasn't been fixed upstream.
> (There were issues raised with the proposed fix, and it looks like the
> people looking at the kernel change have lost interest.)

I usually push "known failing" tests only when there's a known & pending
fix which is expected to be merged into mainline kernel soon. And as
Darrick stated, "enables broader testing by the other fs maintainers."
and could bring broader attention of the failure.

But generic/484 is a bit unfortunate. It was in that exact situation
back then (or at least gave me the impression that the fix would be
merged soon), but apparaently things changed after test being applied..

> 
> The problem is that there are people who are trying to use xfstests to
> look for failures, and unless they start digging into the kernel
> archives from last year, they won't understand that generic/484 is a
> known failing test, and it will get fixed....someday.
> 
> For people in the know, or for people who use my kvm-xfstests,
> gce-xfstests, it's not a big problem, since I've already blacklisted
> the test.  But not everyone (and in fact, probably most people don't)
> use my front end scripts.
> 
> For generic/553, I have a fix in ext4 so it will clear the failure,
> and that's fine, since I think we've all agreed in principle what the
> correct fix will be, and presumably it will get fixed soon.  At that
> point, I might revert the commit from ext4, and rely on the VFS to
> catch the error, but the overhead of a few extra unlikely() tests
> aren't that big.  But yeah, I did that mainly because unnecessary test
> failures because doing an ext4-specific fix didn't have many
> downsides, and one risk of adding tests to the global exclude file is
> that I then have to remember to remove it from the global exclude file
> when the issue is finally fixed upstream.
> 
> > Do you think that should there be a different policy w.r.t timing of
> > merging xfstests tests that fail on upstream kernel?
> 
> That's my opinion, and generic/484 is the best argument for why we
> should wait.  Other people may have other opinions though, and I have
> a workaround, so I don't feel super-strong about it.  (generic/454 is
> now the second test in my global exclude file.  :-)

I don't see generic/454 failing with ext4 (I'm testing with default
mkfs/mount options, kernel is 5.2-rc2). But IMHO, I think generic/454 is
different, it's not a targeted regression test, it's kind of generic
test that should work for all filesystems.

> 
> At the very *least* there should be a comment in the test that fix is
> pending, and might not be applied yet, with a URL to the mailing list
> discussion.  That will save effort when months (years?) go by, and the
> fix still hasn't landed the upstream kernel....

Agreed, I've been making sure there's a comment referring to the fix or
pending fix (e.g. only commit summary no hash ID) for such targeted
regression tests.

Thanks,
Eryu
Eryu Guan June 11, 2019, 2:13 a.m. UTC | #9
On Mon, Jun 10, 2019 at 09:06:16AM -0700, Darrick J. Wong wrote:
> On Mon, Jun 10, 2019 at 09:31:31AM -0400, Theodore Ts'o wrote:
> > On Mon, Jun 10, 2019 at 09:37:32AM +0300, Amir Goldstein wrote:
> > >
> > >Why do you think thhis is xfs_io fall back and not kernel fall back to
> > >do_splice_direct()? Anyway, both cases allow read from swapfile
> > >on upstream.
> > 
> > Ah, I had assumed this was changed that was made because if you are
> > implementing copy_file_range in terms of some kind of reflink-like
> > mechanism, it becomes super-messy since you know have to break tons
> > and tons of COW sharing each time the kernel swaps to the swap file.
> > 
> > I didn't think we had (or maybe we did, and I missed it) a discussion
> > about whether reading from a swap file should be prohibited.
> > Personally, I think it's security theatre, and not worth the
> > effort/overhead, but whatever.... my main complaint was with the
> > unnecessary test failures with upstream kernels.
> > 
> > > Trying to understand the desired flow of tests and fixes. 
> > > I agree that generic/554 failure may be a test/interface bug that
> > > we should fix in a way that current upstream passes the test for
> > > ext4. Unless there is objection, I will send a patch to fix the test
> > > to only test copy *to* swapfile.
> > > 
> > > generic/553, OTOH, is expected to fail on upstream kernel.
> > > Are you leaving 553 in appliance build in anticipation to upstream fix?
> > > I guess the answer is in the ext4 IS_IMMUTABLE patch that you
> > > posted and plan to push to upstream/stable sooner than VFS patches.
> > 
> > So I find it kind of annoying when tests land before the fixes do
> > upstream.  I still have this in my global_exclude file:
> 
> Yeah, it's awkward for VFS fixes because on the one hand we don't want
> to have multiyear regressions like generic/484, but OTOH stuffing tests
> in before code goes upstream enables broader testing by the other fs
> maintainers.
> 
> In any case, the fixes are in the copy-range-fixes branch which I'm
> finally publishing...
> 
> > # The proposed fix for generic/484, "locks: change POSIX lock
> > # ownership on execve when files_struct is displaced" would break NFS
> > # Jeff Layton and Eric Biederman have some ideas for how to address it
> > # but fixing it is non-trivial
> 
> Also, uh, can we remove this from the auto and quick groups for now?

I'm fine with that :)

Thanks,
Eryu
Eryu Guan June 11, 2019, 2:36 a.m. UTC | #10
On Tue, Jun 11, 2019 at 10:12:22AM +0800, Eryu Guan wrote:
[snip]
> > 
> > > Do you think that should there be a different policy w.r.t timing of
> > > merging xfstests tests that fail on upstream kernel?
> > 
> > That's my opinion, and generic/484 is the best argument for why we
> > should wait.  Other people may have other opinions though, and I have
> > a workaround, so I don't feel super-strong about it.  (generic/454 is
> > now the second test in my global exclude file.  :-)
> 
> I don't see generic/454 failing with ext4 (I'm testing with default
> mkfs/mount options, kernel is 5.2-rc2). But IMHO, I think generic/454 is

Oh, I see, I think you meant generic/554 not generic/454 (thanks Darrick
for pointing that out :)

> different, it's not a targeted regression test, it's kind of generic
> test that should work for all filesystems.
> 
> > 
> > At the very *least* there should be a comment in the test that fix is
> > pending, and might not be applied yet, with a URL to the mailing list
> > discussion.  That will save effort when months (years?) go by, and the
> > fix still hasn't landed the upstream kernel....
> 
> Agreed, I've been making sure there's a comment referring to the fix or
> pending fix (e.g. only commit summary no hash ID) for such targeted
> regression tests.

And I took generic/55[34] as generic tests not targeted regression test.
But looks like it's better to reference the fixes anyway.

Amir, would you mind adding such references to generic/55[34] as well?

Thanks,
Eryu
diff mbox series

Patch

diff --git a/tests/generic/989 b/tests/generic/989
new file mode 100755
index 00000000..27c10296
--- /dev/null
+++ b/tests/generic/989
@@ -0,0 +1,56 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2018 Red Hat, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 989
+#
+# Check that we cannot copy_file_range() to/from a swapfile
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 7 15
+
+_cleanup()
+{
+	cd /
+	rm -rf $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_os Linux
+_supported_fs generic
+
+rm -f $seqres.full
+
+_require_scratch
+_require_xfs_io_command "copy_range"
+_require_scratch_swapfile
+
+_scratch_mkfs 2>&1 >> $seqres.full
+_scratch_mount
+
+testdir=$SCRATCH_MNT
+
+rm -f $seqres.full
+
+$XFS_IO_PROG -f -c "pwrite -S 0x61 0 128k" $testdir/file >> $seqres.full 2>&1
+
+echo swap files return ETXTBUSY
+_format_swapfile $testdir/swapfile 16m
+swapon $testdir/swapfile
+$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/file" $testdir/swapfile
+$XFS_IO_PROG -f -c "copy_range -l 32k $testdir/swapfile" $testdir/copy
+swapoff $testdir/swapfile
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/989.out b/tests/generic/989.out
new file mode 100644
index 00000000..32da0ce9
--- /dev/null
+++ b/tests/generic/989.out
@@ -0,0 +1,4 @@ 
+QA output created by 989
+swap files return ETXTBUSY
+copy_range: Text file busy
+copy_range: Text file busy
diff --git a/tests/generic/group b/tests/generic/group
index 20b95c14..4c100781 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -551,3 +551,4 @@ 
 546 auto quick clone enospc log
 547 auto quick log
 988 auto quick copy_range
+989 auto quick copy_range swap