[00/10] netfs, ceph, nfs, cachefiles: Miscellaneous fixes/changes

Message ID 20241213135013.2964079-1-dhowells@redhat.com (mailing list archive)

Message

David Howells Dec. 13, 2024, 1:50 p.m. UTC
Hi Christian,

Here are some miscellaneous fixes and changes for netfslib and the ceph and
nfs filesystems:

 (1) Ignore silly-rename files from afs and nfs when building the header
     archive in a kernel build.

 (2) netfs: Fix the way read result collection applies results to folios
     when each folio is being read by multiple subrequests and the results
     come out of order.

 (3) netfs: Fix ENOMEM handling in buffered reads.

 (4) nfs: Fix an oops in nfs_netfs_init_request() when copying to the cache.

 (5) cachefiles: Parse the "secctx" command immediately to get the correct
     error rather than leaving it to the "bind" command.

 (6) netfs: Remove a redundant smp_rmb().  This isn't a bug per se and
     could be deferred.

 (7) netfs: Fix missing barriers by using clear_and_wake_up_bit().

 (8) netfs: Work around recursion in read retry by failing and abandoning
     the retried subrequest if no I/O is performed.

     [!] NOTE: This only works around the recursion problem if the
         recursion keeps returning no data.  If the server manages, say, to
         repeatedly return a single byte of data faster than the retry
         algorithm can complete, it will still recurse and the stack
         overrun may still occur.  Actually fixing this requires quite an
         intrusive change which will hopefully make the next merge window.

 (9) netfs: Fix the clearance of a folio_queue when unlocking the page if
     we're going to want to subsequently send the queue for copying to the
     cache (if, for example, we're using ceph).

(10) netfs: Fix the lack of cancellation of copy-to-cache when the cache
     for a file is temporarily disabled (for example when a DIO write is
     done to the file).  This patch and (9) fix hangs with ceph.
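
To illustrate item (1): silly-rename leftovers are the hidden files a client
creates when a file is unlinked while still open; NFS names them .nfsXXXX and
AFS names them .__afsXXXX.  A minimal shell sketch of skipping them when
gathering files is shown here.  list_headers is a hypothetical helper for
illustration only and is not the change made to kernel/gen_kheaders.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of kernel/gen_kheaders.sh): list regular
# files under directory $1, skipping silly-rename leftovers.
#   .nfs*   - prefix used by the NFS client for silly-renamed files
#   .__afs* - prefix used by the AFS client for silly-renamed files
list_headers() {
    find "$1" -type f ! -name '.nfs*' ! -name '.__afs*' | sort
}
```

Called as list_headers include, this would print every regular file under
include/ except those whose names start with .nfs or .__afs.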

With these patches, I can run xfstest -g quick to completion on ceph with a
local cache.

The patches can also be found here with a bonus cifs patch:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=netfs-fixes

Thanks,
David

David Howells (8):
  kheaders: Ignore silly-rename files
  netfs: Fix non-contiguous donation between completed reads
  netfs: Fix enomem handling in buffered reads
  nfs: Fix oops in nfs_netfs_init_request() when copying to cache
  netfs: Fix missing barriers by using clear_and_wake_up_bit()
  netfs: Work around recursion by abandoning retry if nothing read
  netfs: Fix ceph copy to cache on write-begin
  netfs: Fix the (non-)cancellation of copy when cache is temporarily
    disabled

Max Kellermann (1):
  cachefiles: Parse the "secctx" immediately

Zilin Guan (1):
  netfs: Remove redundant use of smp_rmb()

 fs/9p/vfs_addr.c         |  6 +++++-
 fs/afs/write.c           |  5 ++++-
 fs/cachefiles/daemon.c   | 14 +++++++-------
 fs/cachefiles/internal.h |  3 ++-
 fs/cachefiles/security.c |  6 +++---
 fs/netfs/buffered_read.c | 28 ++++++++++++++++------------
 fs/netfs/direct_write.c  |  1 -
 fs/netfs/read_collect.c  | 33 +++++++++++++++++++--------------
 fs/netfs/read_pgpriv2.c  |  4 ++++
 fs/netfs/read_retry.c    |  6 ++++--
 fs/netfs/write_collect.c | 14 +++++---------
 fs/netfs/write_issue.c   |  2 ++
 fs/nfs/fscache.c         |  9 ++++++++-
 fs/smb/client/cifssmb.c  | 13 +++++++++----
 fs/smb/client/smb2pdu.c  |  9 ++++++---
 include/linux/netfs.h    |  6 +++---
 kernel/gen_kheaders.sh   |  1 +
 17 files changed, 98 insertions(+), 62 deletions(-)

Comments

David Howells Dec. 13, 2024, 2:04 p.m. UTC | #1
David Howells <dhowells@redhat.com> wrote:

> With these patches, I can run xfstest -g quick to completion on ceph with a
> local cache.

I should qualify that.  The thing completes and doesn't hang, but I get 6
failures:

    Failures: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732

Though these don't appear to have anything to do with netfslib (see attached).
Two are cases where the mount is busy and the rest seem to be due to
id-mapped mounts and/or user namespaces.
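
That split can be checked mechanically against the .out.bad files.  The
following is a rough sketch only: classify_failure is a hypothetical helper,
and the patterns are simply the message fragments that appear in the attached
output.

```shell
#!/bin/sh
# Hypothetical helper: classify one xfstests .out.bad file ($1) by the kind
# of message it contains.  Patterns are taken from the attached output.
classify_failure() {
    if grep -q 'mount error 16' "$1"; then
        echo mount-busy
    elif grep -q -E 'idmapped-mounts\.c|vfstest\.c' "$1"; then
        echo idmapped-or-userns
    else
        echo other
    fi
}
```

Running it over each failed test's .out.bad file in the results directory
should sort the failures into those two groups, with anything unexpected
reported as "other".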

The xfstest local.config file looks something like:

    export FSTYP=ceph
    export TEST_DEV=<ipaddr>:/test
    export TEST_DIR=/xfstest.test
    TEST_FS_MOUNT_OPTS='-o name=admin,mds_namespace=test,fs=test,fsc'
    export SCRATCH_DEV=<ipaddr>:/scratch
    export SCRATCH_MNT=/xfstest.scratch
    export MOUNT_OPTIONS='-o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch'

David
---
# ./check -E .exclude generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
FSTYP         -- ceph
PLATFORM      -- Linux/x86_64 andromeda 6.13.0-rc2-build3+ #5311 SMP Fri Dec 13 09:03:34 GMT 2024
MKFS_OPTIONS  -- <ipaddr>:/scratch
MOUNT_OPTIONS -- -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.scratch

generic/604 2s ... [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/604.out.bad)
    --- tests/generic/604.out   2024-09-12 12:36:14.187441830 +0100
    +++ /root/xfstests-dev/results//generic/604.out.bad 2024-12-13 13:18:51.910900871 +0000
    @@ -1,2 +1,4 @@
     QA output created by 604
    -Silence is golden
    +mount error 16 = Device or resource busy
    +mount -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.scratch failed
    +(see /root/xfstests-dev/results//generic/604.full for details)
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/604.out /root/xfstests-dev/results//generic/604.out.bad'  to see the entire diff)
generic/633       [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/633.out.bad)
    --- tests/generic/633.out   2024-09-12 12:36:14.187441830 +0100
    +++ /root/xfstests-dev/results//generic/633.out.bad 2024-12-13 13:18:55.958979531 +0000
    @@ -1,2 +1,4 @@
     QA output created by 633
     Silence is golden
    +idmapped-mounts.c: 307: tcore_create_in_userns - Input/output error - failure: open file
    +vfstest.c: 2418: run_test - Success - failure: create operations in user namespace
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/633.out /root/xfstests-dev/results//generic/633.out.bad'  to see the entire diff)
generic/645       [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/645.out.bad)
    --- tests/generic/645.out   2024-09-12 12:36:14.191441810 +0100
    +++ /root/xfstests-dev/results//generic/645.out.bad 2024-12-13 13:19:25.526908024 +0000
    @@ -1,2 +1,4 @@
     QA output created by 645
     Silence is golden
    +idmapped-mounts.c: 6671: nested_userns - Invalid argument - failure: sys_mount_setattr
    +vfstest.c: 2418: run_test - Invalid argument - failure: test that nested user namespaces behave correctly when attached to idmapped mounts
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/645.out /root/xfstests-dev/results//generic/645.out.bad'  to see the entire diff)
generic/696       - output mismatch (see /root/xfstests-dev/results//generic/696.out.bad)
    --- tests/generic/696.out   2024-09-12 12:36:14.195441791 +0100
    +++ /root/xfstests-dev/results//generic/696.out.bad 2024-12-13 13:19:30.254804087 +0000
    @@ -1,2 +1,6 @@
     QA output created by 696
    +idmapped-mounts.c: 7763: setgid_create_umask_idmapped - Input/output error - failure: create
    +vfstest.c: 2418: run_test - Success - failure: create operations by using umask in directories with setgid bit set on idmapped mount
    +idmapped-mounts.c: 7763: setgid_create_umask_idmapped - Input/output error - failure: create
    +vfstest.c: 2418: run_test - Success - failure: create operations by using umask in directories with setgid bit set on idmapped mount
     Silence is golden
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/696.out /root/xfstests-dev/results//generic/696.out.bad'  to see the entire diff)

HINT: You _MAY_ be missing kernel fix:
      ac6800e279a2 fs: Add missing umask strip in vfs_tmpfile 1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers

generic/697       - output mismatch (see /root/xfstests-dev/results//generic/697.out.bad)
    --- tests/generic/697.out   2024-09-12 12:36:14.195441791 +0100
    +++ /root/xfstests-dev/results//generic/697.out.bad 2024-12-13 13:19:31.749225548 +0000
    @@ -1,2 +1,4 @@
     QA output created by 697
    +idmapped-mounts.c: 8218: setgid_create_acl_idmapped - Input/output error - failure: create
    +vfstest.c: 2418: run_test - Success - failure: create operations by using acl in directories with setgid bit set on idmapped mount
     Silence is golden
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/697.out /root/xfstests-dev/results//generic/697.out.bad'  to see the entire diff)

HINT: You _MAY_ be missing kernel fix:
      1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers

generic/732 1s ... [failed, exit status 1]- output mismatch (see /root/xfstests-dev/results//generic/732.out.bad)
    --- tests/generic/732.out   2024-09-12 12:36:14.195441791 +0100
    +++ /root/xfstests-dev/results//generic/732.out.bad 2024-12-13 13:19:34.482858235 +0000
    @@ -1,2 +1,5 @@
     QA output created by 732
     Silence is golden
    +mount error 16 = Device or resource busy
    +mount -o name=admin,mds_namespace=scratch,fs=scratch,fsc=scratch -o context=system_u:object_r:root_t:s0 <ipaddr>:/scratch /xfstest.test/mountpoint2-732 failed
    +(see /root/xfstests-dev/results//generic/732.full for details)
    ...
    (Run 'diff -u /root/xfstests-dev/tests/generic/732.out /root/xfstests-dev/results//generic/732.out.bad'  to see the entire diff)
Ran: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
Failures: generic/604 generic/633 generic/645 generic/696 generic/697 generic/732
Failed 6 of 6 tests

Alex Markuze Dec. 18, 2024, 3:10 p.m. UTC | #2
Hey David.
Thanks for the find. I've seen your mail, but it was a busy week.
If you can, please open a bug at https://tracker.ceph.com/ and assign it to me.
