Message ID | 20230920173445.3943581-1-bschubert@ddn.com (mailing list archive) |
---|---|
Headers | show |
Series | fuse: full atomic open and atomic-open-revalidate | expand |
On Thu, Sep 21, 2023 at 9:31 AM Bernd Schubert <bschubert@ddn.com> wrote: > > In FUSE, as of now, uncached lookups are expensive over the wire. > E.g additional latencies and stressing (meta data) servers from > thousands of clients. With atomic-open lookup before open > can be avoided. > > Here is the link to performance numbers > https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/ > > Here is the libfuse pull request > https://github.com/libfuse/libfuse/pull/813 > > The patches are passing passthrough_hp xfstests (libfuse part applied), > although we had to introduce umount retries into xfstests, as recent > kernels/xfstests fail umount in some tests with > EBUSY - independent of atomic open. (Although outstanding for v7) Hi Bernd! I was using xfstests to test passthrough_hp (for FUSE kernel passthrough). FYI, I have made some improvements to the mount helper in libfuse [1] to support remount, which helps pass a few tests. So far, I have all the tests in group -g quick.rw pass with the baseline passthrough_hp (over xfs). Do you have a baseline for the entire quick/auto group to share with me? Can you share the patch that you are using to avoid the EBUSY errors? Note that Chritian has suggested a method to use inotify IN_UNMOUNT event to wait for sb shutdown in fstests [2]. Thanks, Amir. [1] https://github.com/amir73il/libfuse/commits/fuse-backing-fd [2] https://lore.kernel.org/linux-fsdevel/20230908-verflachen-neudefinition-4da649d673a9@brauner/
On 9/21/23 11:33, Amir Goldstein wrote: > On Thu, Sep 21, 2023 at 9:31 AM Bernd Schubert <bschubert@ddn.com> wrote: >> >> In FUSE, as of now, uncached lookups are expensive over the wire. >> E.g additional latencies and stressing (meta data) servers from >> thousands of clients. With atomic-open lookup before open >> can be avoided. >> >> Here is the link to performance numbers >> https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/ >> >> Here is the libfuse pull request >> https://github.com/libfuse/libfuse/pull/813 >> >> The patches are passing passthrough_hp xfstests (libfuse part applied), >> although we had to introduce umount retries into xfstests, as recent >> kernels/xfstests fail umount in some tests with >> EBUSY - independent of atomic open. (Although outstanding for v7) > > Hi Bernd! > > I was using xfstests to test passthrough_hp (for FUSE kernel passthrough). > FYI, I have made some improvements to the mount helper > in libfuse [1] to support remount, which helps pass a few tests. Thanks, just asked there to send it separate to upstream. > > So far, I have all the tests in group -g quick.rw pass with the baseline > passthrough_hp (over xfs). > > Do you have a baseline for the entire quick/auto group to share with me? Please find my results attached. I have opened a libfuse issue for generic/477, (open_by_handle_at tests) but I'm not sure if this is passthrough_hp only (it trusts the passed node id, without checking if there is an inode object for it). Possibly fuse.ko passes an invalide node id - this is something for a rainy weekend (or so) to investigate... > Can you share the patch that you are using to avoid the EBUSY errors? The simple version to avoid _most_ of EBUSY is this diff --git a/common/rc b/common/rc index 741579af..a40fca3b 100644 --- a/common/rc +++ b/common/rc @@ -305,6 +305,7 @@ _scratch_mount_idmapped() _scratch_unmount() { + sync case "$FSTYP" in overlay) _overlay_scratch_unmount The better version is this https://github.com/kdave/xfstests/commit/33a15af07bb044e2773a83df1c7e0a0df280a4b7 > > Note that Chritian has suggested a method to use inotify > IN_UNMOUNT event to wait for sb shutdown in fstests [2]. Thanks, I had seen the discussion. Although I (silently) wondered if something like MNT_BLOCk as umount2 flag wouldn't be easier. > > Thanks, > Amir. > > [1] https://github.com/amir73il/libfuse/commits/fuse-backing-fd > [2] https://lore.kernel.org/linux-fsdevel/20230908-verflachen-neudefinition-4da649d673a9@brauner/ Thanks, Bernd
On Thu, Sep 21, 2023 at 3:00 PM Bernd Schubert <bschubert@ddn.com> wrote: > > On 9/21/23 11:33, Amir Goldstein wrote: > > On Thu, Sep 21, 2023 at 9:31 AM Bernd Schubert <bschubert@ddn.com> wrote: > >> > >> In FUSE, as of now, uncached lookups are expensive over the wire. > >> E.g additional latencies and stressing (meta data) servers from > >> thousands of clients. With atomic-open lookup before open > >> can be avoided. > >> > >> Here is the link to performance numbers > >> https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/ > >> > >> Here is the libfuse pull request > >> https://github.com/libfuse/libfuse/pull/813 > >> > >> The patches are passing passthrough_hp xfstests (libfuse part applied), > >> although we had to introduce umount retries into xfstests, as recent > >> kernels/xfstests fail umount in some tests with > >> EBUSY - independent of atomic open. (Although outstanding for v7) > > > > Hi Bernd! > > > > I was using xfstests to test passthrough_hp (for FUSE kernel passthrough). > > FYI, I have made some improvements to the mount helper > > in libfuse [1] to support remount, which helps pass a few tests. > > Thanks, just asked there to send it separate to upstream. > > > > > So far, I have all the tests in group -g quick.rw pass with the baseline > > passthrough_hp (over xfs). > > > > Do you have a baseline for the entire quick/auto group to share with me? > > Please find my results attached. Not too bad. 3 more tests can pass with my mount helper fix for remount ;) > I have opened a libfuse issue for generic/477, > (open_by_handle_at tests) but I'm not sure if this is passthrough_hp only (it > trusts the passed node id, without checking if there is an inode object for it). > Possibly fuse.ko passes an invalide node id - this is something for a rainy > weekend (or so) to investigate... Stale file handles after mount cycle are expected. FUSE is not equipped to handle this correctly. NFS clients may even get access to the wrong inode after FUSE restart/reexport, if FUSE is exported with the same NFS fsid. See this discussion [3] about how this could be solved hackishly with existing FUSE protocol (for fs that know how to open by ino) and about the LOOKUP_HANDLE protocol command that is needed to solve this in a generic way. > > > > Can you share the patch that you are using to avoid the EBUSY errors? > > > The simple version to avoid _most_ of EBUSY is this > > > diff --git a/common/rc b/common/rc > index 741579af..a40fca3b 100644 > --- a/common/rc > +++ b/common/rc > @@ -305,6 +305,7 @@ _scratch_mount_idmapped() > > _scratch_unmount() > { > + sync > case "$FSTYP" in > overlay) > _overlay_scratch_unmount > > > > The better version is this > https://github.com/kdave/xfstests/commit/33a15af07bb044e2773a83df1c7e0a0df280a4b7 > > > > > Note that Chritian has suggested a method to use inotify > > IN_UNMOUNT event to wait for sb shutdown in fstests [2]. > > Thanks, I had seen the discussion. Although I (silently) wondered if something > like MNT_BLOCk as umount2 flag wouldn't be easier. > You'd better keep wondering silently unless you want to upset Christian ;) Thanks, Amir. > > [1] https://github.com/amir73il/libfuse/commits/fuse-backing-fd > > [2] https://lore.kernel.org/linux-fsdevel/20230908-verflachen-neudefinition-4da649d673a9@brauner/ [3] https://lore.kernel.org/linux-fsdevel/CAOQ4uxg_FV8U833qVkgPaAWJ4MNcnGoy9Gci41bmak4_ROSc3g@mail.gmail.com/
On 9/21/23 16:24, Amir Goldstein wrote: > On Thu, Sep 21, 2023 at 3:00 PM Bernd Schubert <bschubert@ddn.com> wrote: >> >> On 9/21/23 11:33, Amir Goldstein wrote: >>> On Thu, Sep 21, 2023 at 9:31 AM Bernd Schubert <bschubert@ddn.com> wrote: >>>> >>>> In FUSE, as of now, uncached lookups are expensive over the wire. >>>> E.g additional latencies and stressing (meta data) servers from >>>> thousands of clients. With atomic-open lookup before open >>>> can be avoided. >>>> >>>> Here is the link to performance numbers >>>> https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/ >>>> >>>> Here is the libfuse pull request >>>> https://github.com/libfuse/libfuse/pull/813 >>>> >>>> The patches are passing passthrough_hp xfstests (libfuse part applied), >>>> although we had to introduce umount retries into xfstests, as recent >>>> kernels/xfstests fail umount in some tests with >>>> EBUSY - independent of atomic open. (Although outstanding for v7) >>> >>> Hi Bernd! >>> >>> I was using xfstests to test passthrough_hp (for FUSE kernel passthrough). >>> FYI, I have made some improvements to the mount helper >>> in libfuse [1] to support remount, which helps pass a few tests. >> >> Thanks, just asked there to send it separate to upstream. >> >>> >>> So far, I have all the tests in group -g quick.rw pass with the baseline >>> passthrough_hp (over xfs). >>> >>> Do you have a baseline for the entire quick/auto group to share with me? >> >> Please find my results attached. > > Not too bad. > 3 more tests can pass with my mount helper fix for remount ;) > >> I have opened a libfuse issue for generic/477, >> (open_by_handle_at tests) but I'm not sure if this is passthrough_hp only (it >> trusts the passed node id, without checking if there is an inode object for it). >> Possibly fuse.ko passes an invalide node id - this is something for a rainy >> weekend (or so) to investigate... > > Stale file handles after mount cycle are expected. > FUSE is not equipped to handle this correctly. I know and I don't have a problem with that. Issue is that the test triggers a heap buffer overflow, see the ASAN report here https://github.com/libfuse/libfuse/issues/838 A possible reason might be an invalid node id by open_by_handle_at, or lookup/release is not right. As I said, will investigate once I have a free minute. > > NFS clients may even get access to the wrong inode > after FUSE restart/reexport, if FUSE is exported with the same > NFS fsid. > > See this discussion [3] about how this could be solved hackishly > with existing FUSE protocol (for fs that know how to open by ino) > and about the LOOKUP_HANDLE protocol command that is > needed to solve this in a generic way. I will read through it later. I would prefer adding support up to MAX_HANDLE_SZ - our file systems typically exceed 64 bit inode sizes. Without having it read, I would just expose exportfs methods to userspace (which might be the LOOKUP_HANDLE protocol). > >> >> >>> Can you share the patch that you are using to avoid the EBUSY errors? >> >> >> The simple version to avoid _most_ of EBUSY is this >> >> >> diff --git a/common/rc b/common/rc >> index 741579af..a40fca3b 100644 >> --- a/common/rc >> +++ b/common/rc >> @@ -305,6 +305,7 @@ _scratch_mount_idmapped() >> >> _scratch_unmount() >> { >> + sync >> case "$FSTYP" in >> overlay) >> _overlay_scratch_unmount >> >> >> >> The better version is this >> https://github.com/kdave/xfstests/commit/33a15af07bb044e2773a83df1c7e0a0df280a4b7 >> >>> >>> Note that Chritian has suggested a method to use inotify >>> IN_UNMOUNT event to wait for sb shutdown in fstests [2]. >> >> Thanks, I had seen the discussion. Although I (silently) wondered if something >> like MNT_BLOCk as umount2 flag wouldn't be easier. >> > > You'd better keep wondering silently unless you want to upset Christian ;) Ouch, Christian is in CC, inotify is fine ;) Thanks, Bernd
On Thu, Sep 21, 2023 at 5:24 PM Amir Goldstein <amir73il@gmail.com> wrote: > > On Thu, Sep 21, 2023 at 3:00 PM Bernd Schubert <bschubert@ddn.com> wrote: > > > > On 9/21/23 11:33, Amir Goldstein wrote: > > > On Thu, Sep 21, 2023 at 9:31 AM Bernd Schubert <bschubert@ddn.com> wrote: > > >> > > >> In FUSE, as of now, uncached lookups are expensive over the wire. > > >> E.g additional latencies and stressing (meta data) servers from > > >> thousands of clients. With atomic-open lookup before open > > >> can be avoided. > > >> > > >> Here is the link to performance numbers > > >> https://lore.kernel.org/linux-fsdevel/20220322121212.5087-1-dharamhans87@gmail.com/ > > >> > > >> Here is the libfuse pull request > > >> https://github.com/libfuse/libfuse/pull/813 > > >> > > >> The patches are passing passthrough_hp xfstests (libfuse part applied), > > >> although we had to introduce umount retries into xfstests, as recent > > >> kernels/xfstests fail umount in some tests with > > >> EBUSY - independent of atomic open. (Although outstanding for v7) > > > > > > Hi Bernd! > > > > > > I was using xfstests to test passthrough_hp (for FUSE kernel passthrough). > > > FYI, I have made some improvements to the mount helper > > > in libfuse [1] to support remount, which helps pass a few tests. > > > > Thanks, just asked there to send it separate to upstream. Now upstream. Thanks for your help! > > > > > > > > So far, I have all the tests in group -g quick.rw pass with the baseline > > > passthrough_hp (over xfs). > > > > > > Do you have a baseline for the entire quick/auto group to share with me? > > > > Please find my results attached. > > Not too bad. > 3 more tests can pass with my mount helper fix for remount ;) > FYI, here is a wdiff of my -g auto passthough_hp test run compared to yours: [-unpatched-6.5-]{+upatched-6.6-rc3+} Failures: generic/003 [-generic/020-] {+generic/099+} generic/184 generic/192 generic/263 [-generic/294 generic/306-] {+generic/317 generic/318 generic/319 generic/375+} generic/401 {+generic/423+} generic/426 [-generic/427-] generic/434 [-generic/452-] {+generic/444+} generic/467 [-generic/468-] generic/477 [-generic/478-] generic/617 {+generic/532+} generic/631 generic/633 generic/683 [-generic/688-] Some of my {+NEW+} failures are because I have POSIX_ACL support enabled in Kconfig, so the same tests are [not run] in your results. I suspect that several permission related tests that PASS for you and FAIL for me may also be because of enabled POSIX_ACL. I was also running passthouhg_hp with -odefault_permissions, but AFAIK this did not change the fstests results. > > > > > > > Can you share the patch that you are using to avoid the EBUSY errors? > > > > > > The simple version to avoid _most_ of EBUSY is this > > You know, I am testing passthrough_hp with kernel 6.6-rc3 and I did not encounter any EBUST errors. Maybe there is some relevant vfs fix in 6.6-rc3, because you were testing 6.5? Or maybe it's because my test VM has only 2 cpus. Thanks, Amir.