diff mbox series

[1/2] fuse: replace fuse_queue_forget with fuse_force_forget if error

Message ID 20240726083752.302301-2-yangyun50@huawei.com (mailing list archive)
State New
Headers show
Series fuse: add no forget support | expand

Commit Message

yangyun July 26, 2024, 8:37 a.m. UTC
Most usecases for 'fuse_queue_forget' in the code are about reverting
the lookup count when error happens, except 'fuse_evict_inode' and
'fuse_cleanup_submount_lookup'. Even if there are no errors, it
still needs alloc 'struct fuse_forget_link'. It is useless, which
contributes to performance degradation and code mess to some extent.

'fuse_force_forget' does not need allocate 'struct fuse_forget_link'in
advance, and is only used by readdirplus before this patch for the reason
that we do not know how many 'fuse_forget_link' structures will be
allocated when error happens.

Signed-off-by: yangyun <yangyun50@huawei.com>
---
 fs/fuse/dev.c     | 19 +++++++++++++++
 fs/fuse/dir.c     | 59 +++++++++++------------------------------------
 fs/fuse/fuse_i.h  |  2 ++
 fs/fuse/readdir.c | 29 +++++------------------
 4 files changed, 40 insertions(+), 69 deletions(-)

Comments

Josef Bacik July 26, 2024, 3:39 p.m. UTC | #1
On Fri, Jul 26, 2024 at 04:37:51PM +0800, yangyun wrote:
> Most usecases for 'fuse_queue_forget' in the code are about reverting
> the lookup count when error happens, except 'fuse_evict_inode' and
> 'fuse_cleanup_submount_lookup'. Even if there are no errors, it
> still needs alloc 'struct fuse_forget_link'. It is useless, which
> contributes to performance degradation and code mess to some extent.
> 
> 'fuse_force_forget' does not need allocate 'struct fuse_forget_link'in
> advance, and is only used by readdirplus before this patch for the reason
> that we do not know how many 'fuse_forget_link' structures will be
> allocated when error happens.
> 
> Signed-off-by: yangyun <yangyun50@huawei.com>

Forcing file systems to have their forget suddenly be synchronous in a lot of
cases is going to be a perf regression for them.

In some of these cases a synchronous forget is probably ok, as you say a lot of
them are error cases.  However d_revalidate() isn't.  That's us trying to figure
out if what we have in cache matches the file systems view of the inode, and if
it doesn't we're going to do a re-lookup, so we don't necessarily care for a
synchronous forget in this case.  Think of an NFS fuse client where the file got
renamed on the backend and now we're telling the kernel this is the inode we
have.  Forcing us to do a synchronous response now is going to be much more
performance impacting than it was pre-this patch.

A better approach would be to make the allocation optional based on the
->no_forget flag.  Thanks,

Josef
yangyun July 27, 2024, 10:05 a.m. UTC | #2
Hi Josef,

Thanks for the comment.


On Fri, Jul 26, 2024 at 11:39:08AM -0400, Josef Bacik wrote:
> On Fri, Jul 26, 2024 at 04:37:51PM +0800, yangyun wrote:
> > Most usecases for 'fuse_queue_forget' in the code are about reverting
> > the lookup count when error happens, except 'fuse_evict_inode' and
> > 'fuse_cleanup_submount_lookup'. Even if there are no errors, it
> > still needs alloc 'struct fuse_forget_link'. It is useless, which
> > contributes to performance degradation and code mess to some extent.
> > 
> > 'fuse_force_forget' does not need allocate 'struct fuse_forget_link'in
> > advance, and is only used by readdirplus before this patch for the reason
> > that we do not know how many 'fuse_forget_link' structures will be
> > allocated when error happens.
> > 
> > Signed-off-by: yangyun <yangyun50@huawei.com>
> 
> Forcing file systems to have their forget suddenly be synchronous in a lot of
> cases is going to be a perf regression for them.

Sorry for that I didn't notice that 'fuse_force_forget' is synchronous. Actually,
I'm not on purpose to make forget be synchronous, just want to reuse the already 
known function 'fuse_force_forget' to avoid useless memory alloc in some cases.

And thank you for the reminder regarding the performance impact of synchronization.

> 
> In some of these cases a synchronous forget is probably ok, as you say a lot of
> them are error cases.  However d_revalidate() isn't.  That's us trying to figure
> out if what we have in cache matches the file systems view of the inode, and if
> it doesn't we're going to do a re-lookup, so we don't necessarily care for a
> synchronous forget in this case.  Think of an NFS fuse client where the file got
> renamed on the backend and now we're telling the kernel this is the inode we
> have.  Forcing us to do a synchronous response now is going to be much more
> performance impacting than it was pre-this patch.

Yeah, this is definitely a performance impacting case. Thank for your adivce.

> 
> A better approach would be to make the allocation optional based on the
> ->no_forget flag.  Thanks,

The reason that I don't make the allocataion optional is that even the no_forget flag
is disabled, there are still many useless memory allocations if no errors. And The code
is also a bit messy because of those allocations.

Since forget is not necessarily synchronous (In my opinion, the pre-this patch use of 
synchronous 'fuse_force_forget' is an error case and also not necessarily synchronous), 
what about just changing the 'fuse_force_forget' to be asynchronous?

By this way, all the forget requests are asynchronous (less impact on performance) and 
we doesn't need to allocate useless memory in advance. 

> 
> Josef

Thank you once again for your time and advice.
Miklos Szeredi Aug. 22, 2024, 3:26 p.m. UTC | #3
On Sat, 27 Jul 2024 at 12:06, yangyun <yangyun50@huawei.com> wrote:
> Since forget is not necessarily synchronous (In my opinion, the pre-this patch use of
> synchronous 'fuse_force_forget' is an error case and also not necessarily synchronous),
> what about just changing the 'fuse_force_forget' to be asynchronous?

Even less impact would be to move the allocation inside
fuse_force_forget (make it GFP_NOFAIL) and still use the
fuse_queue_forget() function to send the forget as e.g. virtiofs
handles them differently from regular requests.

Thanks,
Miklos
yangyun Aug. 23, 2024, 11:58 a.m. UTC | #4
On Thu, Aug 22, 2024 at 05:26:01PM +0200, Miklos Szeredi wrote:
> On Sat, 27 Jul 2024 at 12:06, yangyun <yangyun50@huawei.com> wrote:
> > Since forget is not necessarily synchronous (In my opinion, the pre-this patch use of
> > synchronous 'fuse_force_forget' is an error case and also not necessarily synchronous),
> > what about just changing the 'fuse_force_forget' to be asynchronous?
> 
> Even less impact would be to move the allocation inside
> fuse_force_forget (make it GFP_NOFAIL) and still use the
> fuse_queue_forget() function to send the forget as e.g. virtiofs
> handles them differently from regular requests.

fuse_force_forget uses the fuse_simple_request with args.force=true and it does not need allocation outside originally, so it is strange for what you said "move the allocation inside fuse_force_forge".

I think what you mean is moving the allocation inside fuse_queue_forget, not fuse_force_forget. This can make sense.

Thanks for your advice. I will update this patch.

> 
> Thanks,
> Miklos
diff mbox series

Patch

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 9eb191b5c4de..932356833b0d 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -252,6 +252,25 @@  void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget,
 	}
 }
 
+void fuse_force_forget(struct fuse_mount *fm, u64 nodeid)
+{
+	struct fuse_forget_in inarg;
+	FUSE_ARGS(args);
+
+	memset(&inarg, 0, sizeof(inarg));
+	inarg.nlookup = 1;
+	args.opcode = FUSE_FORGET;
+	args.nodeid = nodeid;
+	args.in_numargs = 1;
+	args.in_args[0].size = sizeof(inarg);
+	args.in_args[0].value = &inarg;
+	args.force = true;
+	args.noreply = true;
+
+	fuse_simple_request(fm, &args);
+	/* ignore errors */
+}
+
 static void flush_bg_queue(struct fuse_conn *fc)
 {
 	struct fuse_iqueue *fiq = &fc->iq;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 2b0d4781f394..6bfb3a128658 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -207,7 +207,6 @@  static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 		 (flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) {
 		struct fuse_entry_out outarg;
 		FUSE_ARGS(args);
-		struct fuse_forget_link *forget;
 		u64 attr_version;
 
 		/* For negative dentries, always do a fresh lookup */
@@ -220,11 +219,6 @@  static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 
 		fm = get_fuse_mount(inode);
 
-		forget = fuse_alloc_forget();
-		ret = -ENOMEM;
-		if (!forget)
-			goto out;
-
 		attr_version = fuse_get_attr_version(fm->fc);
 
 		parent = dget_parent(entry);
@@ -239,15 +233,13 @@  static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
 			fi = get_fuse_inode(inode);
 			if (outarg.nodeid != get_node_id(inode) ||
 			    (bool) IS_AUTOMOUNT(inode) != (bool) (outarg.attr.flags & FUSE_ATTR_SUBMOUNT)) {
-				fuse_queue_forget(fm->fc, forget,
-						  outarg.nodeid, 1);
+				fuse_force_forget(fm, outarg.nodeid);
 				goto invalid;
 			}
 			spin_lock(&fi->lock);
 			fi->nlookup++;
 			spin_unlock(&fi->lock);
 		}
-		kfree(forget);
 		if (ret == -ENOMEM || ret == -EINTR)
 			goto out;
 		if (ret || fuse_invalid_attr(&outarg.attr) ||
@@ -365,7 +357,6 @@  int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name
 {
 	struct fuse_mount *fm = get_fuse_mount_super(sb);
 	FUSE_ARGS(args);
-	struct fuse_forget_link *forget;
 	u64 attr_version;
 	int err;
 
@@ -374,23 +365,17 @@  int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name
 	if (name->len > FUSE_NAME_MAX)
 		goto out;
 
-
-	forget = fuse_alloc_forget();
-	err = -ENOMEM;
-	if (!forget)
-		goto out;
-
 	attr_version = fuse_get_attr_version(fm->fc);
 
 	fuse_lookup_init(fm->fc, &args, nodeid, name, outarg);
 	err = fuse_simple_request(fm, &args);
 	/* Zero nodeid is same as -ENOENT, but with valid timeout */
 	if (err || !outarg->nodeid)
-		goto out_put_forget;
+		goto out;
 
 	err = -EIO;
 	if (fuse_invalid_attr(&outarg->attr))
-		goto out_put_forget;
+		goto out;
 	if (outarg->nodeid == FUSE_ROOT_ID && outarg->generation != 0) {
 		pr_warn_once("root generation should be zero\n");
 		outarg->generation = 0;
@@ -401,13 +386,11 @@  int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name
 			   attr_version);
 	err = -ENOMEM;
 	if (!*inode) {
-		fuse_queue_forget(fm->fc, forget, outarg->nodeid, 1);
+		fuse_force_forget(fm, outarg->nodeid);
 		goto out;
 	}
 	err = 0;
 
- out_put_forget:
-	kfree(forget);
  out:
 	return err;
 }
@@ -617,7 +600,6 @@  static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	struct inode *inode;
 	struct fuse_mount *fm = get_fuse_mount(dir);
 	FUSE_ARGS(args);
-	struct fuse_forget_link *forget;
 	struct fuse_create_in inarg;
 	struct fuse_open_out *outopenp;
 	struct fuse_entry_out outentry;
@@ -628,15 +610,10 @@  static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	/* Userspace expects S_IFREG in create mode */
 	BUG_ON((mode & S_IFMT) != S_IFREG);
 
-	forget = fuse_alloc_forget();
-	err = -ENOMEM;
-	if (!forget)
-		goto out_err;
-
 	err = -ENOMEM;
 	ff = fuse_file_alloc(fm, true);
 	if (!ff)
-		goto out_put_forget_req;
+		goto out_err;
 
 	if (!fm->fc->dont_mask)
 		mode &= ~current_umask();
@@ -670,7 +647,7 @@  static int fuse_create_open(struct inode *dir, struct dentry *entry,
 
 	err = get_create_ext(&args, dir, entry, mode);
 	if (err)
-		goto out_put_forget_req;
+		goto out_err;
 
 	err = fuse_simple_request(fm, &args);
 	free_ext_value(&args);
@@ -690,11 +667,10 @@  static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	if (!inode) {
 		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
 		fuse_sync_release(NULL, ff, flags);
-		fuse_queue_forget(fm->fc, forget, outentry.nodeid, 1);
+		fuse_force_forget(fm, outentry.nodeid);
 		err = -ENOMEM;
 		goto out_err;
 	}
-	kfree(forget);
 	d_instantiate(entry, inode);
 	fuse_change_entry_timeout(entry, &outentry);
 	fuse_dir_changed(dir);
@@ -716,8 +692,6 @@  static int fuse_create_open(struct inode *dir, struct dentry *entry,
 
 out_free_ff:
 	fuse_file_free(ff);
-out_put_forget_req:
-	kfree(forget);
 out_err:
 	return err;
 }
@@ -782,15 +756,10 @@  static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args,
 	struct inode *inode;
 	struct dentry *d;
 	int err;
-	struct fuse_forget_link *forget;
 
 	if (fuse_is_bad(dir))
 		return -EIO;
 
-	forget = fuse_alloc_forget();
-	if (!forget)
-		return -ENOMEM;
-
 	memset(&outarg, 0, sizeof(outarg));
 	args->nodeid = get_node_id(dir);
 	args->out_numargs = 1;
@@ -800,28 +769,27 @@  static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args,
 	if (args->opcode != FUSE_LINK) {
 		err = get_create_ext(args, dir, entry, mode);
 		if (err)
-			goto out_put_forget_req;
+			goto out_err;
 	}
 
 	err = fuse_simple_request(fm, args);
 	free_ext_value(args);
 	if (err)
-		goto out_put_forget_req;
+		goto out_err;
 
 	err = -EIO;
 	if (invalid_nodeid(outarg.nodeid) || fuse_invalid_attr(&outarg.attr))
-		goto out_put_forget_req;
+		goto out_err;
 
 	if ((outarg.attr.mode ^ mode) & S_IFMT)
-		goto out_put_forget_req;
+		goto out_err;
 
 	inode = fuse_iget(dir->i_sb, outarg.nodeid, outarg.generation,
 			  &outarg.attr, ATTR_TIMEOUT(&outarg), 0);
 	if (!inode) {
-		fuse_queue_forget(fm->fc, forget, outarg.nodeid, 1);
+		fuse_force_forget(fm, outarg.nodeid);
 		return -ENOMEM;
 	}
-	kfree(forget);
 
 	d_drop(entry);
 	d = d_splice_alias(inode, entry);
@@ -837,10 +805,9 @@  static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args,
 	fuse_dir_changed(dir);
 	return 0;
 
- out_put_forget_req:
+ out_err:
 	if (err == -EEXIST)
 		fuse_invalidate_entry(entry);
-	kfree(forget);
 	return err;
 }
 
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index f23919610313..b9a5b8ec0de5 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1051,6 +1051,8 @@  int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name
 void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget,
 		       u64 nodeid, u64 nlookup);
 
+void fuse_force_forget(struct fuse_mount *fm, u64 nodeid);
+
 struct fuse_forget_link *fuse_alloc_forget(void);
 
 struct fuse_forget_link *fuse_dequeue_forget(struct fuse_iqueue *fiq,
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index 0377b6dc24c8..39f01ac31f7c 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -262,27 +262,6 @@  static int fuse_direntplus_link(struct file *file,
 	return 0;
 }
 
-static void fuse_force_forget(struct file *file, u64 nodeid)
-{
-	struct inode *inode = file_inode(file);
-	struct fuse_mount *fm = get_fuse_mount(inode);
-	struct fuse_forget_in inarg;
-	FUSE_ARGS(args);
-
-	memset(&inarg, 0, sizeof(inarg));
-	inarg.nlookup = 1;
-	args.opcode = FUSE_FORGET;
-	args.nodeid = nodeid;
-	args.in_numargs = 1;
-	args.in_args[0].size = sizeof(inarg);
-	args.in_args[0].value = &inarg;
-	args.force = true;
-	args.noreply = true;
-
-	fuse_simple_request(fm, &args);
-	/* ignore errors */
-}
-
 static int parse_dirplusfile(char *buf, size_t nbytes, struct file *file,
 			     struct dir_context *ctx, u64 attr_version)
 {
@@ -320,8 +299,12 @@  static int parse_dirplusfile(char *buf, size_t nbytes, struct file *file,
 		nbytes -= reclen;
 
 		ret = fuse_direntplus_link(file, direntplus, attr_version);
-		if (ret)
-			fuse_force_forget(file, direntplus->entry_out.nodeid);
+		if (ret) {
+			struct inode *inode = file_inode(file);
+			struct fuse_mount *fm = get_fuse_mount(inode);
+
+			fuse_force_forget(fm, direntplus->entry_out.nodeid);
+		}
 	}
 
 	return 0;