Message ID | 20240307110217.203064-3-mszeredi@redhat.com (mailing list archive) |
---|---|
State | New |
Series | [1/4] ovl: use refcount_t in readdir |
On Thu, Mar 7, 2024 at 1:02 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> The only reason parallel readdirs cannot run on the same inode is shared
> access to the readdir cache.

I did not see a cover letter, so I am assuming that the reason for this
change is to improve concurrent readdir.

If I am reading this correctly, users can only iterate pure real dirs in
parallel, but not merged and impure dirs. Right?

Is there a reason why a specific cached readdir version cannot be
iterated in parallel?

> Move lock/unlock to only protect the cache. Exception is the refcount
> which now uses atomic ops.
>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/readdir.c | 34 ++++++++++++++++++++--------------
>  1 file changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index edee9f86f469..b98e0d17f40e 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -245,8 +245,10 @@ static void ovl_cache_put(struct ovl_dir_file *od, struct inode *inode)
>  	struct ovl_dir_cache *cache = od->cache;
>
>  	if (refcount_dec_and_test(&cache->refcount)) {

What is stopping ovl_cache_get() from being called now, finding a valid
cache, incrementing its refcount and using it while it is being freed?

Do we need refcount_inc_not_zero() in ovl_cache_get()?

> +		ovl_inode_lock(inode);
>  		if (ovl_dir_cache(inode) == cache)
>  			ovl_set_dir_cache(inode, NULL);
> +		ovl_inode_unlock(inode);
>
>  		ovl_cache_free(&cache->entries);
>  		kfree(cache);

P.S. A guard for ovl_inode_lock() would have been useful in this patch
set, but it's up to you if you want to define one and use it.

Thanks,
Amir.
On Thu, 7 Mar 2024 at 14:11, Amir Goldstein <amir73il@gmail.com> wrote:

> I did not see a cover letter, so I am assuming that the reason for this
> change is to improve concurrent readdir.

That's a nice-to-have, but the real reason was just to get rid of the FIXME.

> If I am reading this correctly, users can only iterate pure real dirs in
> parallel, but not merged and impure dirs. Right?

Right.

> Is there a reason why a specific cached readdir version cannot be
> iterated in parallel?

It could, but it would take more thought (ovl_cache_update() may modify
a cache entry).

> What is stopping ovl_cache_get() from being called now, finding a valid
> cache, incrementing its refcount and using it while it is being freed?
>
> Do we need refcount_inc_not_zero() in ovl_cache_get()?

Yes. But it would still be racy (the winning ovl_cache_get() would set
oi->cache, then the losing ovl_cache_put() would reset it).

It would be a harmless race, but I find it ugly, so I'll just move the
locking outside of the refcount_dec_and_test(). It's not a performance
sensitive path.

> P.S. A guard for ovl_inode_lock() would have been useful in this patch
> set, but it's up to you if you want to define one and use it.

Will look into it.

Thanks for the review.

Miklos
On Thu, 7 Mar 2024 at 15:09, Miklos Szeredi <miklos@szeredi.hu> wrote:

> On Thu, 7 Mar 2024 at 14:11, Amir Goldstein <amir73il@gmail.com> wrote:
>
> > P.S. A guard for ovl_inode_lock() would have been useful in this patch
> > set, but it's up to you if you want to define one and use it.

I like the concept of guards, though documentation and examples are
lacking and the API is not trivial to understand at first sight.

For overlayfs I'd start with ovl_override_creds(), since that is used
much more extensively than ovl_inode_lock().

Thanks,
Miklos
On Thu, Mar 7, 2024 at 6:13 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Thu, 7 Mar 2024 at 15:09, Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Thu, 7 Mar 2024 at 14:11, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > > P.S. A guard for ovl_inode_lock() would have been useful in this patch
> > > set, but it's up to you if you want to define one and use it.
>
> I like the concept of guards, though documentation and examples are
> lacking and the API is not trivial to understand at first sight.
>
> For overlayfs I'd start with ovl_override_creds(), since that is used
> much more extensively than ovl_inode_lock().

OK. Let's wait for this to land first:
https://lore.kernel.org/linux-unionfs/20240216051640.197378-1-vinicius.gomes@intel.com/

As I wrote in the review of v2, I'd rather that Christian review and
pick up the non-overlayfs bits, which he suggested, and only after that
will I review the overlayfs patch.

Thanks,
Amir.
On Thu, Mar 07, 2024 at 07:31:35PM +0200, Amir Goldstein wrote:
> On Thu, Mar 7, 2024 at 6:13 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > I like the concept of guards, though documentation and examples are
> > lacking and the API is not trivial to understand at first sight.
> >
> > For overlayfs I'd start with ovl_override_creds(), since that is used
> > much more extensively than ovl_inode_lock().
>
> OK. Let's wait for this to land first:
> https://lore.kernel.org/linux-unionfs/20240216051640.197378-1-vinicius.gomes@intel.com/
>
> As I wrote in the review of v2, I'd rather that Christian review and
> pick up the non-overlayfs bits, which he suggested, and only after that
> will I review the overlayfs patch.

On it. It had been on my queue, but I didn't get around to it; I wanted
to play with this a bit.
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index edee9f86f469..b98e0d17f40e 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -245,8 +245,10 @@ static void ovl_cache_put(struct ovl_dir_file *od, struct inode *inode)
 	struct ovl_dir_cache *cache = od->cache;
 
 	if (refcount_dec_and_test(&cache->refcount)) {
+		ovl_inode_lock(inode);
 		if (ovl_dir_cache(inode) == cache)
 			ovl_set_dir_cache(inode, NULL);
+		ovl_inode_unlock(inode);
 
 		ovl_cache_free(&cache->entries);
 		kfree(cache);
@@ -733,12 +735,18 @@ static int ovl_iterate_real(struct file *file, struct dir_context *ctx)
 	}
 
 	if (ovl_is_impure_dir(file)) {
+		ovl_inode_lock(file_inode(file));
 		rdt.cache = ovl_cache_get_impure(&file->f_path);
-		if (IS_ERR(rdt.cache))
+		if (IS_ERR(rdt.cache)) {
+			ovl_inode_unlock(file_inode(file));
 			return PTR_ERR(rdt.cache);
+		}
 	}
 
 	err = iterate_dir(od->realfile, &rdt.ctx);
+
+	if (rdt.cache)
+		ovl_inode_unlock(file_inode(file));
 	ctx->pos = rdt.ctx.pos;
 
 	return err;
@@ -758,7 +766,6 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 	if (!ctx->pos)
 		ovl_dir_reset(file);
 
-	ovl_inode_lock(file_inode(file));
 	if (od->is_real) {
 		/*
 		 * If parent is merge, then need to adjust d_ino for '..', if
@@ -773,9 +780,10 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 		} else {
 			err = iterate_dir(od->realfile, ctx);
 		}
-		goto out;
+		goto out_revert;
 	}
 
+	ovl_inode_lock(file_inode(file));
 	if (!od->cache) {
 		struct ovl_dir_cache *cache;
 
@@ -808,6 +816,7 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx)
 	err = 0;
 out:
 	ovl_inode_unlock(file_inode(file));
+out_revert:
 	revert_creds(old_cred);
 	return err;
 }
@@ -817,7 +826,6 @@ static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin)
 	loff_t res;
 	struct ovl_dir_file *od = file->private_data;
 
-	ovl_inode_lock(file_inode(file));
 	if (!file->f_pos)
 		ovl_dir_reset(file);
 
@@ -834,21 +842,22 @@ static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin)
 		case SEEK_SET:
 			break;
 		default:
-			goto out_unlock;
+			goto out;
 		}
 
 		if (offset < 0)
-			goto out_unlock;
+			goto out;
 
 		if (offset != file->f_pos) {
 			file->f_pos = offset;
-			if (od->cache)
+			if (od->cache) {
+				ovl_inode_lock(file_inode(file));
 				ovl_seek_cursor(od, offset);
+				ovl_inode_unlock(file_inode(file));
+			}
 		}
 		res = offset;
 	}
-out_unlock:
-	ovl_inode_unlock(file_inode(file));
-
+out:
 	return res;
 }
@@ -930,11 +939,8 @@ static int ovl_dir_release(struct inode *inode, struct file *file)
 	struct ovl_dir_file *od = file->private_data;
 
-	if (od->cache) {
-		ovl_inode_lock(inode);
+	if (od->cache)
 		ovl_cache_put(od, inode);
-		ovl_inode_unlock(inode);
-	}
 
 	fput(od->realfile);
 	if (od->upperfile)
 		fput(od->upperfile);
The only reason parallel readdirs cannot run on the same inode is shared
access to the readdir cache.

Move lock/unlock to only protect the cache. Exception is the refcount,
which now uses atomic ops.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/readdir.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)