
[3/4] ovl: only lock readdir for accessing the cache

Message ID 20240307110217.203064-3-mszeredi@redhat.com (mailing list archive)
Series: [1/4] ovl: use refcount_t in readdir

Commit Message

Miklos Szeredi March 7, 2024, 11:02 a.m. UTC
The only reason parallel readdirs cannot run on the same inode is shared
access to the readdir cache.

Move lock/unlock to only protect the cache.  Exception is the refcount
which now uses atomic ops.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/readdir.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

Comments

Amir Goldstein March 7, 2024, 1:11 p.m. UTC | #1
On Thu, Mar 7, 2024 at 1:02 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> The only reason parallel readdirs cannot run on the same inode is shared
> access to the readdir cache.

I did not see a cover letter, so I am assuming that the reason for this change
is to improve concurrent readdir.

If I am reading this correctly, users can only iterate pure real dirs in
parallel, but not merged and impure dirs. Right?

Is there a reason why a specific cached readdir version cannot be iterated
in parallel?

>
> Move lock/unlock to only protect the cache.  Exception is the refcount
> which now uses atomic ops.
>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/readdir.c | 34 ++++++++++++++++++++--------------
>  1 file changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index edee9f86f469..b98e0d17f40e 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -245,8 +245,10 @@ static void ovl_cache_put(struct ovl_dir_file *od, struct inode *inode)
>         struct ovl_dir_cache *cache = od->cache;
>
>         if (refcount_dec_and_test(&cache->refcount)) {

What is stopping ovl_cache_get() from being called now, finding a valid
cache, incrementing its refcount, and using it while it is being freed?

Do we need refcount_inc_not_zero() in ovl_cache_get()?
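For what it's worth, the lookup-vs-free window is easy to model outside the kernel. A minimal sketch, with plain C11 atomics standing in for the kernel's refcount_t (the function bodies here are userspace approximations, not the kernel implementations):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's refcount_inc_not_zero():
 * take a reference only if the count has not already dropped to
 * zero, i.e. the object is not in the middle of being freed. */
static bool refcount_inc_not_zero(atomic_int *r)
{
	int old = atomic_load(r);

	do {
		if (old == 0)	/* free in progress: do not resurrect */
			return false;
	} while (!atomic_compare_exchange_weak(r, &old, old + 1));
	return true;
}

/* Kernel-style refcount_dec_and_test(): true only for the final put. */
static bool refcount_dec_and_test(atomic_int *r)
{
	return atomic_fetch_sub(r, 1) == 1;
}
```

With a plain increment, a lookup racing against the final refcount_dec_and_test() could bump 0 back to 1 and hand out a cache that kfree() is about to destroy; the _not_zero variant makes that lookup fail instead.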

> +               ovl_inode_lock(inode);
>                 if (ovl_dir_cache(inode) == cache)
>                         ovl_set_dir_cache(inode, NULL);
> +               ovl_inode_unlock(inode);
>
>                 ovl_cache_free(&cache->entries);
>                 kfree(cache);

P.S. A guard for ovl_inode_lock() would have been useful in this patch set,
but it's up to you if you want to define one and use it.

Thanks,
Amir.
Miklos Szeredi March 7, 2024, 2:09 p.m. UTC | #2
On Thu, 7 Mar 2024 at 14:11, Amir Goldstein <amir73il@gmail.com> wrote:

> I did not see a cover letter, so I am assuming that the reason for this change
> is to improve concurrent readdir.

That's a nice to have, but the real reason was just to get rid of the FIXME.

> If I am reading this correctly, users can only iterate pure real dirs in
> parallel, but not merged and impure dirs. Right?

Right.

> Is there a reason why a specific cached readdir version cannot be iterated
> in parallel?

It could, but it would take more thought (ovl_cache_update() may
modify a cache entry).

>
> >
> > Move lock/unlock to only protect the cache.  Exception is the refcount
> > which now uses atomic ops.
> >
> > Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> > ---
> >  fs/overlayfs/readdir.c | 34 ++++++++++++++++++++--------------
> >  1 file changed, 20 insertions(+), 14 deletions(-)
> >
> > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> > index edee9f86f469..b98e0d17f40e 100644
> > --- a/fs/overlayfs/readdir.c
> > +++ b/fs/overlayfs/readdir.c
> > @@ -245,8 +245,10 @@ static void ovl_cache_put(struct ovl_dir_file *od, struct inode *inode)
> >         struct ovl_dir_cache *cache = od->cache;
> >
> >         if (refcount_dec_and_test(&cache->refcount)) {
>
> What is stopping ovl_cache_get() from being called now, finding a valid
> cache, incrementing its refcount, and using it while it is being freed?
>
> Do we need refcount_inc_not_zero() in ovl_cache_get()?

Yes.  But it would still be racy (winning ovl_cache_get() would set
oi->cache, then losing ovl_cache_put() would reset it).  It would be a
harmless race, but I find it ugly, so I'll just move the locking
outside of the refcount_dec_and_test().  It's not a
performance-sensitive path.
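A sketch of the shape that would take, as a userspace model: a pthread mutex stands in for ovl_inode_lock(), and the struct names are invented for illustration, not taken from fs/overlayfs:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct dir_cache { atomic_int refcount; };

struct inode_model {
	pthread_mutex_t lock;		/* stands in for ovl_inode_lock() */
	struct dir_cache *cache;	/* stands in for oi->cache */
};

/* The refcount drop happens with the lock held, so a concurrent lookup
 * under the same lock either bumps the count before the final put or
 * finds the cache pointer already NULL -- the window is closed without
 * needing refcount_inc_not_zero() on the lookup side. */
static void cache_put(struct inode_model *inode, struct dir_cache *cache)
{
	bool last;

	pthread_mutex_lock(&inode->lock);
	last = atomic_fetch_sub(&cache->refcount, 1) == 1;
	if (last && inode->cache == cache)
		inode->cache = NULL;
	pthread_mutex_unlock(&inode->lock);

	if (last)
		free(cache);
}
```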


>
> > +               ovl_inode_lock(inode);
> >                 if (ovl_dir_cache(inode) == cache)
> >                         ovl_set_dir_cache(inode, NULL);
> > +               ovl_inode_unlock(inode);
> >
> >                 ovl_cache_free(&cache->entries);
> >                 kfree(cache);
>
> P.S. A guard for ovl_inode_lock() would have been useful in this patch set,
> but it's up to you if you want to define one and use it.

Will look into it.

Thanks for the review.

Miklos
Miklos Szeredi March 7, 2024, 4:13 p.m. UTC | #3
On Thu, 7 Mar 2024 at 15:09, Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 7 Mar 2024 at 14:11, Amir Goldstein <amir73il@gmail.com> wrote:

> > P.S. A guard for ovl_inode_lock() would have been useful in this patch set,
> > but it's up to you if you want to define one and use it.

I like the concept of guards, though documentation and examples are
lacking and the API is not trivial to understand at first sight.
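For the ovl_inode_lock() case the in-kernel definition would likely be a single line on top of <linux/cleanup.h>, something like DEFINE_GUARD(ovl_inode, struct inode *, ovl_inode_lock(_T), ovl_inode_unlock(_T)), used as guard(ovl_inode)(inode);. As a userspace sketch of what that macro boils down to (the compiler's cleanup attribute; the lock here is a stub counter and all names are illustrative):

```c
#include <assert.h>

static int lock_depth;	/* stub: counts how deep the "lock" is held */

struct inode { int dummy; };

static void ovl_inode_lock(struct inode *inode)   { (void)inode; lock_depth++; }
static void ovl_inode_unlock(struct inode *inode) { (void)inode; lock_depth--; }

/* Roughly what DEFINE_GUARD() expands to: a tiny "class" whose
 * destructor drops the lock when the guard variable leaves scope. */
typedef struct { struct inode *lock; } guard_ovl_inode_t;

static inline guard_ovl_inode_t guard_ovl_inode_ctor(struct inode *inode)
{
	ovl_inode_lock(inode);
	return (guard_ovl_inode_t){ .lock = inode };
}

static inline void guard_ovl_inode_dtor(guard_ovl_inode_t *g)
{
	ovl_inode_unlock(g->lock);
}

/* One guard per scope in this simplified sketch. */
#define guard_ovl_inode(inode) \
	guard_ovl_inode_t __attribute__((cleanup(guard_ovl_inode_dtor))) \
		__g = guard_ovl_inode_ctor(inode)

int locked_section(struct inode *inode)
{
	guard_ovl_inode(inode);	/* lock taken here ... */
	return lock_depth;	/* ... and dropped on every return path */
}
```

The appeal is exactly the error paths in this patch: the unlock on the IS_ERR() branch in ovl_iterate_real() would happen automatically instead of by hand.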

For overlayfs I'd start with ovl_override_creds(), since that is used
much more extensively than ovl_inode_lock().

Thanks,
Miklos
Amir Goldstein March 7, 2024, 5:31 p.m. UTC | #4
On Thu, Mar 7, 2024 at 6:13 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 7 Mar 2024 at 15:09, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Thu, 7 Mar 2024 at 14:11, Amir Goldstein <amir73il@gmail.com> wrote:
>
> > > P.S. A guard for ovl_inode_lock() would have been useful in this patch set,
> > > but it's up to you if you want to define one and use it.
>
> I like the concept of guards, though documentation and examples are
> lacking and the API is not trivial to understand at first sight.
>
> For overlayfs I'd start with ovl_override_creds(), since that is used
> much more extensively than ovl_inode_lock().
>

OK, let's wait for this to land first:
https://lore.kernel.org/linux-unionfs/20240216051640.197378-1-vinicius.gomes@intel.com/

As I wrote in the review of v2,
I'd rather that Christian review and pick up the non-overlayfs bits,
which he suggested, and only after that will I review the overlayfs
patch.

Thanks,
Amir.
Christian Brauner March 11, 2024, 1:52 p.m. UTC | #5
On Thu, Mar 07, 2024 at 07:31:35PM +0200, Amir Goldstein wrote:
> On Thu, Mar 7, 2024 at 6:13 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Thu, 7 Mar 2024 at 15:09, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Thu, 7 Mar 2024 at 14:11, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > > > P.S. A guard for ovl_inode_lock() would have been useful in this patch set,
> > > > but it's up to you if you want to define one and use it.
> >
> > I like the concept of guards, though documentation and examples are
> > lacking and the API is not trivial to understand at first sight.
> >
> > For overlayfs I'd start with ovl_override_creds(), since that is used
> > much more extensively than ovl_inode_lock().
> >
> 
> OK. let's wait for this to land first:
> https://lore.kernel.org/linux-unionfs/20240216051640.197378-1-vinicius.gomes@intel.com/
> 
> As I wrote in the review of v2,
> I'd rather that Christian will review and pick up the non-overlayfs bits,
> which head suggested and only after that will I review the overlayfs
> patch.

On it. Had been on my queue but didn't get around to it. I wanted to
play with this a bit.

Patch

diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index edee9f86f469..b98e0d17f40e 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -245,8 +245,10 @@  static void ovl_cache_put(struct ovl_dir_file *od, struct inode *inode)
 	struct ovl_dir_cache *cache = od->cache;
 
 	if (refcount_dec_and_test(&cache->refcount)) {
+		ovl_inode_lock(inode);
 		if (ovl_dir_cache(inode) == cache)
 			ovl_set_dir_cache(inode, NULL);
+		ovl_inode_unlock(inode);
 
 		ovl_cache_free(&cache->entries);
 		kfree(cache);
@@ -733,12 +735,18 @@  static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 	}
 
 	if (ovl_is_impure_dir(file)) {
+		ovl_inode_lock(file_inode(file));
 		rdt.cache = ovl_cache_get_impure(&file->f_path);
-		if (IS_ERR(rdt.cache))
+		if (IS_ERR(rdt.cache)) {
+			ovl_inode_unlock(file_inode(file));
 			return PTR_ERR(rdt.cache);
+		}
 	}
 
 	err = iterate_dir(od->realfile, &rdt.ctx);
+
+	if (rdt.cache)
+		ovl_inode_unlock(file_inode(file));
 	ctx->pos = rdt.ctx.pos;
 
 	return err;
@@ -758,7 +766,6 @@  static int ovl_iterate(struct file *file, struct dir_context *ctx)
 	if (!ctx->pos)
 		ovl_dir_reset(file);
 
-	ovl_inode_lock(file_inode(file));
 	if (od->is_real) {
 		/*
 		 * If parent is merge, then need to adjust d_ino for '..', if
@@ -773,9 +780,10 @@  static int ovl_iterate(struct file *file, struct dir_context *ctx)
 		} else {
 			err = iterate_dir(od->realfile, ctx);
 		}
-		goto out;
+		goto out_revert;
 	}
 
+	ovl_inode_lock(file_inode(file));
 	if (!od->cache) {
 		struct ovl_dir_cache *cache;
 
@@ -808,6 +816,7 @@  static int ovl_iterate(struct file *file, struct dir_context *ctx)
 	err = 0;
 out:
 	ovl_inode_unlock(file_inode(file));
+out_revert:
 	revert_creds(old_cred);
 	return err;
 }
@@ -817,7 +826,6 @@  static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin)
 	loff_t res;
 	struct ovl_dir_file *od = file->private_data;
 
-	ovl_inode_lock(file_inode(file));
 	if (!file->f_pos)
 		ovl_dir_reset(file);
 
@@ -834,21 +842,22 @@  static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin)
 		case SEEK_SET:
 			break;
 		default:
-			goto out_unlock;
+			goto out;
 		}
 		if (offset < 0)
-			goto out_unlock;
+			goto out;
 
 		if (offset != file->f_pos) {
 			file->f_pos = offset;
-			if (od->cache)
+			if (od->cache) {
+				ovl_inode_lock(file_inode(file));
 				ovl_seek_cursor(od, offset);
+				ovl_inode_unlock(file_inode(file));
+			}
 		}
 		res = offset;
 	}
-out_unlock:
-	ovl_inode_unlock(file_inode(file));
-
+out:
 	return res;
 }
 
@@ -930,11 +939,8 @@  static int ovl_dir_release(struct inode *inode, struct file *file)
 {
 	struct ovl_dir_file *od = file->private_data;
 
-	if (od->cache) {
-		ovl_inode_lock(inode);
+	if (od->cache)
 		ovl_cache_put(od, inode);
-		ovl_inode_unlock(inode);
-	}
 	fput(od->realfile);
 	if (od->upperfile)
 		fput(od->upperfile);