Message ID | 20240124113042.44300-1-jefflexu@linux.alibaba.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | fuse: add support for explicit export disabling | expand |
On Wed, 24 Jan 2024 at 12:30, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > > open_by_handle_at(2) can fail with -ESTALE with a valid handle returned > by a previous name_to_handle_at(2) for evicted fuse inodes, which is > especially common when entry_valid_timeout is 0, e.g. when the fuse > daemon is in "cache=none" mode. > > The time sequence is like: > > name_to_handle_at(2) # succeed > evict fuse inode > open_by_handle_at(2) # fail > > The root cause is that, with 0 entry_valid_timeout, the dput() called in > name_to_handle_at(2) will trigger iput -> evict(), which will send > FUSE_FORGET to the daemon. The following open_by_handle_at(2) will send > a new FUSE_LOOKUP request upon inode cache miss since the previous inode > eviction. Then the fuse daemon may fail the FUSE_LOOKUP request with > -ENOENT as the cached metadata of the requested inode has already been > cleaned up during the previous FUSE_FORGET. The returned -ENOENT is > treated as -ESTALE when open_by_handle_at(2) returns. > > This confuses the application somehow, as open_by_handle_at(2) fails > when the previous name_to_handle_at(2) succeeds. The returned errno is > also confusing as the requested file is not deleted and already there. > It is reasonable to fail name_to_handle_at(2) early in this case, after > which the application can fallback to open(2) to access files. > > Since this issue typically appears when entry_valid_timeout is 0 which > is configured by the fuse daemon, the fuse daemon is the right person to > explicitly disable the export when required. > > Also considering FUSE_EXPORT_SUPPORT actually indicates the support for > lookups of "." and "..", and there are existing fuse daemons supporting > export without FUSE_EXPORT_SUPPORT set, for compatibility, we add a new > INIT flag for such purpose. This looks good overall. > > Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> > --- > RFC: https://lore.kernel.org/all/20240123093701.94166-1-jefflexu@linux.alibaba.com/ > --- > fs/fuse/inode.c | 11 ++++++++++- > include/uapi/linux/fuse.h | 2 ++ > 2 files changed, 12 insertions(+), 1 deletion(-) > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > index 2a6d44f91729..851940c0e930 100644 > --- a/fs/fuse/inode.c > +++ b/fs/fuse/inode.c > @@ -1110,6 +1110,11 @@ static struct dentry *fuse_get_parent(struct dentry *child) > return parent; > } > > +/* only for fid encoding; no support for file handle */ > +static const struct export_operations fuse_fid_operations = { Nit: I'd call this fuse_no_export_operations (or something else that emphasizes the fact that this is only for encoding and not for full export support). Thanks, Miklos
On 1/24/24 8:16 PM, Miklos Szeredi wrote: > On Wed, 24 Jan 2024 at 12:30, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: >> >> open_by_handle_at(2) can fail with -ESTALE with a valid handle returned >> by a previous name_to_handle_at(2) for evicted fuse inodes, which is >> especially common when entry_valid_timeout is 0, e.g. when the fuse >> daemon is in "cache=none" mode. >> >> The time sequence is like: >> >> name_to_handle_at(2) # succeed >> evict fuse inode >> open_by_handle_at(2) # fail >> >> The root cause is that, with 0 entry_valid_timeout, the dput() called in >> name_to_handle_at(2) will trigger iput -> evict(), which will send >> FUSE_FORGET to the daemon. The following open_by_handle_at(2) will send >> a new FUSE_LOOKUP request upon inode cache miss since the previous inode >> eviction. Then the fuse daemon may fail the FUSE_LOOKUP request with >> -ENOENT as the cached metadata of the requested inode has already been >> cleaned up during the previous FUSE_FORGET. The returned -ENOENT is >> treated as -ESTALE when open_by_handle_at(2) returns. >> >> This confuses the application somehow, as open_by_handle_at(2) fails >> when the previous name_to_handle_at(2) succeeds. The returned errno is >> also confusing as the requested file is not deleted and already there. >> It is reasonable to fail name_to_handle_at(2) early in this case, after >> which the application can fallback to open(2) to access files. >> >> Since this issue typically appears when entry_valid_timeout is 0 which >> is configured by the fuse daemon, the fuse daemon is the right person to >> explicitly disable the export when required. >> >> Also considering FUSE_EXPORT_SUPPORT actually indicates the support for >> lookups of "." and "..", and there are existing fuse daemons supporting >> export without FUSE_EXPORT_SUPPORT set, for compatibility, we add a new >> INIT flag for such purpose. > > This looks good overall. > >> >> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> >> --- >> RFC: https://lore.kernel.org/all/20240123093701.94166-1-jefflexu@linux.alibaba.com/ >> --- >> fs/fuse/inode.c | 11 ++++++++++- >> include/uapi/linux/fuse.h | 2 ++ >> 2 files changed, 12 insertions(+), 1 deletion(-) >> >> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c >> index 2a6d44f91729..851940c0e930 100644 >> --- a/fs/fuse/inode.c >> +++ b/fs/fuse/inode.c >> @@ -1110,6 +1110,11 @@ static struct dentry *fuse_get_parent(struct dentry *child) >> return parent; >> } >> >> +/* only for fid encoding; no support for file handle */ >> +static const struct export_operations fuse_fid_operations = { > > Nit: I'd call this fuse_no_export_operations (or something else that > emphasizes the fact that this is only for encoding and not for full > export support). OK I will rename it to fuse_no_export_operations. By the way do I need to bump and update the minor version of FUSE protocol?
On Wed, 24 Jan 2024 at 13:50, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > OK I will rename it to fuse_no_export_operations. > > By the way do I need to bump and update the minor version of FUSE protocol? It's not strictly necessary since the feature is negotiated with the FUSE_NO_EXPORT_SUPPORT flag. Despite that we do usually bump the minor version once per kernel release if there were any changes to the API. Thanks, Miklos
On 1/24/24 9:04 PM, Miklos Szeredi wrote: > On Wed, 24 Jan 2024 at 13:50, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > >> OK I will rename it to fuse_no_export_operations. >> >> By the way do I need to bump and update the minor version of FUSE protocol? > > It's not strictly necessary since the feature is negotiated with the > FUSE_NO_EXPORT_SUPPORT flag. > > Despite that we do usually bump the minor version once per kernel > release if there were any changes to the API. Got it. Many thanks.
On Wed, Jan 24, 2024 at 2:17 PM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Wed, 24 Jan 2024 at 12:30, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > > > > open_by_handle_at(2) can fail with -ESTALE with a valid handle returned > > by a previous name_to_handle_at(2) for evicted fuse inodes, which is > > especially common when entry_valid_timeout is 0, e.g. when the fuse > > daemon is in "cache=none" mode. > > > > The time sequence is like: > > > > name_to_handle_at(2) # succeed > > evict fuse inode > > open_by_handle_at(2) # fail > > > > The root cause is that, with 0 entry_valid_timeout, the dput() called in > > name_to_handle_at(2) will trigger iput -> evict(), which will send > > FUSE_FORGET to the daemon. The following open_by_handle_at(2) will send > > a new FUSE_LOOKUP request upon inode cache miss since the previous inode > > eviction. Then the fuse daemon may fail the FUSE_LOOKUP request with > > -ENOENT as the cached metadata of the requested inode has already been > > cleaned up during the previous FUSE_FORGET. The returned -ENOENT is > > treated as -ESTALE when open_by_handle_at(2) returns. > > > > This confuses the application somehow, as open_by_handle_at(2) fails > > when the previous name_to_handle_at(2) succeeds. The returned errno is > > also confusing as the requested file is not deleted and already there. > > It is reasonable to fail name_to_handle_at(2) early in this case, after > > which the application can fallback to open(2) to access files. > > > > Since this issue typically appears when entry_valid_timeout is 0 which > > is configured by the fuse daemon, the fuse daemon is the right person to > > explicitly disable the export when required. > > > > Also considering FUSE_EXPORT_SUPPORT actually indicates the support for > > lookups of "." and "..", and there are existing fuse daemons supporting > > export without FUSE_EXPORT_SUPPORT set, for compatibility, we add a new > > INIT flag for such purpose. > > This looks good overall. > > > > > Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> > > --- > > RFC: https://lore.kernel.org/all/20240123093701.94166-1-jefflexu@linux.alibaba.com/ > > --- > > fs/fuse/inode.c | 11 ++++++++++- > > include/uapi/linux/fuse.h | 2 ++ > > 2 files changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > > index 2a6d44f91729..851940c0e930 100644 > > --- a/fs/fuse/inode.c > > +++ b/fs/fuse/inode.c > > @@ -1110,6 +1110,11 @@ static struct dentry *fuse_get_parent(struct dentry *child) > > return parent; > > } > > > > +/* only for fid encoding; no support for file handle */ > > +static const struct export_operations fuse_fid_operations = { > > Nit: I'd call this fuse_no_export_operations (or something else that > emphasizes the fact that this is only for encoding and not for full > export support). Not that I really care what the name is, but overlayfs already has ovl_export_fid_operations and the name aspires from AT_HANDLE_FID, which is already documented in man pages. How about fuse_export_fid_operations? Thanks, Amir.
On Wed, 24 Jan 2024 at 15:11, Amir Goldstein <amir73il@gmail.com> wrote: > > On Wed, Jan 24, 2024 at 2:17 PM Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > On Wed, 24 Jan 2024 at 12:30, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > > > > > > open_by_handle_at(2) can fail with -ESTALE with a valid handle returned > > > by a previous name_to_handle_at(2) for evicted fuse inodes, which is > > > especially common when entry_valid_timeout is 0, e.g. when the fuse > > > daemon is in "cache=none" mode. > > > > > > The time sequence is like: > > > > > > name_to_handle_at(2) # succeed > > > evict fuse inode > > > open_by_handle_at(2) # fail > > > > > > The root cause is that, with 0 entry_valid_timeout, the dput() called in > > > name_to_handle_at(2) will trigger iput -> evict(), which will send > > > FUSE_FORGET to the daemon. The following open_by_handle_at(2) will send > > > a new FUSE_LOOKUP request upon inode cache miss since the previous inode > > > eviction. Then the fuse daemon may fail the FUSE_LOOKUP request with > > > -ENOENT as the cached metadata of the requested inode has already been > > > cleaned up during the previous FUSE_FORGET. The returned -ENOENT is > > > treated as -ESTALE when open_by_handle_at(2) returns. > > > > > > This confuses the application somehow, as open_by_handle_at(2) fails > > > when the previous name_to_handle_at(2) succeeds. The returned errno is > > > also confusing as the requested file is not deleted and already there. > > > It is reasonable to fail name_to_handle_at(2) early in this case, after > > > which the application can fallback to open(2) to access files. > > > > > > Since this issue typically appears when entry_valid_timeout is 0 which > > > is configured by the fuse daemon, the fuse daemon is the right person to > > > explicitly disable the export when required. > > > > > > Also considering FUSE_EXPORT_SUPPORT actually indicates the support for > > > lookups of "." and "..", and there are existing fuse daemons supporting > > > export without FUSE_EXPORT_SUPPORT set, for compatibility, we add a new > > > INIT flag for such purpose. > > > > This looks good overall. > > > > > > > > Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> > > > --- > > > RFC: https://lore.kernel.org/all/20240123093701.94166-1-jefflexu@linux.alibaba.com/ > > > --- > > > fs/fuse/inode.c | 11 ++++++++++- > > > include/uapi/linux/fuse.h | 2 ++ > > > 2 files changed, 12 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > > > index 2a6d44f91729..851940c0e930 100644 > > > --- a/fs/fuse/inode.c > > > +++ b/fs/fuse/inode.c > > > @@ -1110,6 +1110,11 @@ static struct dentry *fuse_get_parent(struct dentry *child) > > > return parent; > > > } > > > > > > +/* only for fid encoding; no support for file handle */ > > > +static const struct export_operations fuse_fid_operations = { > > > > Nit: I'd call this fuse_no_export_operations (or something else that > > emphasizes the fact that this is only for encoding and not for full > > export support). > > Not that I really care what the name is, but overlayfs already has > ovl_export_fid_operations and the name aspires from AT_HANDLE_FID, > which is already documented in man pages. > > How about fuse_export_fid_operations? Okay, let's be consistent with overlayfs naming. Thanks, Miklos
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 2a6d44f91729..851940c0e930 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1110,6 +1110,11 @@ static struct dentry *fuse_get_parent(struct dentry *child) return parent; } +/* only for fid encoding; no support for file handle */ +static const struct export_operations fuse_fid_operations = { + .encode_fh = fuse_encode_fh, +}; + static const struct export_operations fuse_export_operations = { .fh_to_dentry = fuse_fh_to_dentry, .fh_to_parent = fuse_fh_to_parent, @@ -1284,6 +1289,8 @@ static void process_init_reply(struct fuse_mount *fm, struct fuse_args *args, fc->create_supp_group = 1; if (flags & FUSE_DIRECT_IO_ALLOW_MMAP) fc->direct_io_allow_mmap = 1; + if (flags & FUSE_NO_EXPORT_SUPPORT) + fm->sb->s_export_op = &fuse_fid_operations; } else { ra_pages = fc->max_read / PAGE_SIZE; fc->no_lock = 1; @@ -1330,7 +1337,8 @@ void fuse_send_init(struct fuse_mount *fm) FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA | FUSE_HANDLE_KILLPRIV_V2 | FUSE_SETXATTR_EXT | FUSE_INIT_EXT | FUSE_SECURITY_CTX | FUSE_CREATE_SUPP_GROUP | - FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_ALLOW_MMAP; + FUSE_HAS_EXPIRE_ONLY | FUSE_DIRECT_IO_ALLOW_MMAP | + FUSE_NO_EXPORT_SUPPORT; #ifdef CONFIG_FUSE_DAX if (fm->fc->dax) flags |= FUSE_MAP_ALIGNMENT; @@ -1527,6 +1535,7 @@ static int fuse_fill_super_submount(struct super_block *sb, sb->s_bdi = bdi_get(parent_sb->s_bdi); sb->s_xattr = parent_sb->s_xattr; + sb->s_export_op = parent_sb->s_export_op; sb->s_time_gran = parent_sb->s_time_gran; sb->s_blocksize = parent_sb->s_blocksize; sb->s_blocksize_bits = parent_sb->s_blocksize_bits; diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index e7418d15fe39..8274c0cd5d61 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -410,6 +410,7 @@ struct fuse_file_lock { * symlink and mknod (single group that matches parent) * FUSE_HAS_EXPIRE_ONLY: kernel supports expiry-only entry invalidation * FUSE_DIRECT_IO_ALLOW_MMAP: allow shared mmap in FOPEN_DIRECT_IO mode. + * FUSE_NO_EXPORT_SUPPORT: explicitly disable export support */ #define FUSE_ASYNC_READ (1 << 0) #define FUSE_POSIX_LOCKS (1 << 1) @@ -449,6 +450,7 @@ struct fuse_file_lock { #define FUSE_CREATE_SUPP_GROUP (1ULL << 34) #define FUSE_HAS_EXPIRE_ONLY (1ULL << 35) #define FUSE_DIRECT_IO_ALLOW_MMAP (1ULL << 36) +#define FUSE_NO_EXPORT_SUPPORT (1ULL << 37) /* Obsolete alias for FUSE_DIRECT_IO_ALLOW_MMAP */ #define FUSE_DIRECT_IO_RELAX FUSE_DIRECT_IO_ALLOW_MMAP
open_by_handle_at(2) can fail with -ESTALE with a valid handle returned by a previous name_to_handle_at(2) for evicted fuse inodes, which is especially common when entry_valid_timeout is 0, e.g. when the fuse daemon is in "cache=none" mode. The time sequence is like: name_to_handle_at(2) # succeed evict fuse inode open_by_handle_at(2) # fail The root cause is that, with 0 entry_valid_timeout, the dput() called in name_to_handle_at(2) will trigger iput -> evict(), which will send FUSE_FORGET to the daemon. The following open_by_handle_at(2) will send a new FUSE_LOOKUP request upon inode cache miss since the previous inode eviction. Then the fuse daemon may fail the FUSE_LOOKUP request with -ENOENT as the cached metadata of the requested inode has already been cleaned up during the previous FUSE_FORGET. The returned -ENOENT is treated as -ESTALE when open_by_handle_at(2) returns. This confuses the application somehow, as open_by_handle_at(2) fails when the previous name_to_handle_at(2) succeeds. The returned errno is also confusing as the requested file is not deleted and already there. It is reasonable to fail name_to_handle_at(2) early in this case, after which the application can fallback to open(2) to access files. Since this issue typically appears when entry_valid_timeout is 0 which is configured by the fuse daemon, the fuse daemon is the right person to explicitly disable the export when required. Also considering FUSE_EXPORT_SUPPORT actually indicates the support for lookups of "." and "..", and there are existing fuse daemons supporting export without FUSE_EXPORT_SUPPORT set, for compatibility, we add a new INIT flag for such purpose. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> --- RFC: https://lore.kernel.org/all/20240123093701.94166-1-jefflexu@linux.alibaba.com/ --- fs/fuse/inode.c | 11 ++++++++++- include/uapi/linux/fuse.h | 2 ++ 2 files changed, 12 insertions(+), 1 deletion(-)