mbox series

[v10,0/8] fuse: full atomic open and atomic-open-revalidate

Message ID 20231023183035.11035-1-bschubert@ddn.com (mailing list archive)
Headers show
Series fuse: full atomic open and atomic-open-revalidate | expand

Message

Bernd Schubert Oct. 23, 2023, 6:30 p.m. UTC
In FUSE, as of now, uncached lookups are expensive over the wire.
E.g additional latencies and stressing (meta data) servers from
thousands of clients. With atomic-open lookup before open
can be avoided.

Here is the link to performance numbers
https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/

Here is the libfuse pull request
https://github.com/libfuse/libfuse/pull/813

The patches are passing passthrough_hp xfstests (libfuse part applied),
although we had to introduce umount retries into xfstests, as recent
kernels/xfstests fail umount in some tests with
EBUSY - independent of atomic open. (Although outstanding for v7)

I'm especially interested in Al's and Christians opinion about the
atomic open dentry revalidation in v7. If the vfs changes are
acceptable, would it be possible to also look at the other patches
and their vfs/dcache interaction? I __hope__ I got it right and I hope
the vfs changes are acceptable.

v10:
    - Updates to Amirs suggestions (WARN_ONCE and graceful error code
      path instead of BUG_ON) in lookup_open
    - After discussion with Amir what patch 3 and 4 are about, change them
      into atomic-open-revalidate with (3)  and without (4) O_CREAT,
      instead of atomic-open-revalidate + optimization.
    - Fix opening of symlinks, thanks a lot Yuan for testing an reporting!
      To make reviews easier this is two fold:
          - In patch 2/8 it just falls back to create open (and expects
            server side not to hold a reference), in 8/8 the fallback and
            additional lookup is avoided (server needs to have an inode
            reference).
    - Fix smatch errors
        - lookup_open() error: uninitialized symbol 'error'.
        - atomic_revalidate_open() warn: variable dereferenced before
          check 'got_write'
        - error: 'switched_entry' dereferencing possible ERR_PTR()
    - rebase to 6.6, use ATTR_TIMEOUT()
v9:
    - Followed Miklos suggestion and added another patch to further
      optimize atomic revalidate/open, which avoids dentry
      acquire/release and also avoids double call into ->d_revalidate
    - Updates following Miklos' review
    - Dropped a temporary comment in patch 2/7 (accidental leftover)

v8: - Another slight indentation fix in _fuse_atomic_open
    - Fix compilation error in patch 4 (fuse atomic revalidate)
    - Remove LOOKUP_ATOMIC_REVALIDATE
    - Switch from DCACHE_ATOMIC_OPEN flag to return value and
      and introduce an enum for d_revalidate return values.
    - checkpatch fixes

v7: - Indentation and style fixes for _fuse_atomic_open.
    - Remodel atomic open to avoid races with parallel lookup, similar
      to NFS commit c94c09535c4debcc439f55b5b6d9ebe57bd4665a and what
      is done in _nfs4_open_and_get_state()
      A WARN_ONCE() and fallback is added to ensure operation is on
      negative dentries only.
    - Error handling is done via the fallback fuse_create_open()
      to reduce complexity and code duplication.
    - Remove entry cache invalidation on ENOENT in the atomic-open
      patch, as atomic-open so far operates on negative dentries only.
    - Remove fuse_advise_use_readdirplus() in _fuse_atomic_open
      (Thanks Miklos)
    - Add forgotten free_ext_value() (Thanks Miklos).
    - Declare struct fuse_inode per condition as the value needs to
      be retrieved anyway per condition.
    - Added atomic open-revalidation and required vfs changes
    - Added myself (Bernd) as Co-developed-by to Dharmendras patches, as
      I did substantial modifications.
    - More as reminder for myself, so far these tests below are
      done manually or with custom scripts, I think we need xfstests
      for these.

        With updated libfuse /scratch/dest is mounted by:
        passthrough_hp -o allow_other --foreground --debug-fuse /scratch/source /scratch/dest

        1) Test atomic open (file create) and negative dentry open

            rm -f /scratch/source/file # ensure file does not exist
            mount /scratch/dest  # overlay of /scratch source
            echo a > /scratch/dest/file # non-existing file
            umount and mount /scratch/test (clear cache)
            cat /scratch/dest/file
            rm -f /scratch/dest/file

        2) Test dir open

            mkdir /scratch/source/dir
            mount /scratch/dest  # overlay of /scratch source
            cat /scratch/source/dir
            rmdir /scratch/source/dir

        3)  Test revalidate without file change

            mount /scratch/dest
            echo "a" > /scratch/dest/file
            echo "b" >> /scratch/dest/file
            echo "c" >> /scratch/dest/file
            cat /scratch/dest/file
            rm -f /scratch/dest/file

        4)  Test revalidate by underlying file change

            mount /scratch/dest
            echo "a" > /scratch/dest/file
            cat /scratch/dest/file
            rm -f /scratch/source/file # unknown to dest mount
            str="b"
            echo "${str}" > /scratch/source/file
            reval=$(cat /scratch/dest/file)
            if [ "$str" != "reval" ]; then
                echo "String mismatch after revalidation"
                exit 1
            fi
            rm -f /scratch/dest/file

        5) Test revalidate by underlying file change, but with
           O_CREATE included. Tests dentry creation by the atomic
           revalidate

            mount /scratch/dest
            echo "a" >> /scratch/dest/file
            rm -f /scratch/source/file
            echo "b" > /scratch/source/file

            # revalidate includes O_CREATE
            echo "c" >> /scratch/dest/file

        6) Repeat above tests, but with additional "--nocache"
           passthrough_hp option

v6: Addressed Miklos comments and rewrote atomic open into its own
    function. Also dropped for now is the revalidate optimization, we
    have the code/patch, but it needs more testing. Also easier to
    first agree on atomic open and then to land the next optimization.

v5: Addressed comments

v3: Addressed comments

v4: Addressed all comments and refactored the code into 3 separate patches
    respectively for Atomic create, Atomic open, optimizing lookup in
    d_revalidate().

v3: handle review comments

v2: fixed a memory leak in <fuse_atomic_open_common>

Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dharmendra Singh <dsingh@ddn.com>
Cc: Horst Birthelmer <hbirthelmer@ddn.com>
Cc: Yuan Yao <yuanyaogoog@chromium.org>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: linux-fsdevel@vger.kernel.org

Bernd Schubert (7):
  fuse: rename fuse_create_open
  fuse: introduce atomic open
  [RFC] Allow atomic_open() on positive dentry (w/o O_CREAT)
  fuse: Revalidate positive entries in fuse_atomic_open
  fuse: Return D_REVALIDATE_ATOMIC for cached dentries
  fuse: Avoid code duplication in atomic open
  fuse atomic open: No fallback for symlinks, just call finish_no_open

Miklos Szeredi (1):
  [RFC] Allow atomic_open() on positive dentry (O_CREAT)

 fs/fuse/dir.c             | 395 +++++++++++++++++++++++++++++++++++++-
 fs/fuse/fuse_i.h          |   6 +
 fs/namei.c                |  77 +++++++-
 include/linux/namei.h     |   7 +
 include/uapi/linux/fuse.h |   3 +
 5 files changed, 474 insertions(+), 14 deletions(-)