diff mbox series

[v3,2/4] kernfs: use VFS negative dentry caching

Message ID 161793090597.10062.4954029445418116308.stgit@mickey.themaw.net (mailing list archive)
State New, archived
Headers show
Series kernfs: proposed locking and concurrency improvement | expand

Commit Message

Ian Kent April 9, 2021, 1:15 a.m. UTC
If there are many lookups for non-existent paths these negative lookups
can lead to a lot of overhead during path walks.

The VFS allows dentries to be created as negative and hashed, and caches
them so they can be used to reduce the fairly high overhead alloc/free
cycle that occurs during these lookups.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/kernfs/dir.c |   55 +++++++++++++++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 22 deletions(-)

Comments

Al Viro April 9, 2021, 1:35 a.m. UTC | #1
On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote:
> +		parent = kernfs_dentry_node(dentry->d_parent);
> +		if (parent) {
> +			const void *ns = NULL;
> +
> +			if (kernfs_ns_enabled(parent))
> +				ns = kernfs_info(dentry->d_parent->d_sb)->ns;

	For any dentry d, we have d->d_parent->d_sb == d->d_sb.  All the time.
If you ever run into the case where that would not be true, you've found
a critical bug.

> +			kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
> +			if (kn)
> +				goto out_bad;
> +		}

Umm...  What's to prevent a race with successful rename(2)?  IOW, what's
there to stabilize ->d_parent and ->d_name while we are in that function?
Ian Kent April 9, 2021, 8:26 a.m. UTC | #2
On Fri, 2021-04-09 at 01:35 +0000, Al Viro wrote:
> On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote:
> > +		parent = kernfs_dentry_node(dentry->d_parent);
> > +		if (parent) {
> > +			const void *ns = NULL;
> > +
> > +			if (kernfs_ns_enabled(parent))
> > +				ns = kernfs_info(dentry->d_parent-
> > >d_sb)->ns;
> 
> 	For any dentry d, we have d->d_parent->d_sb == d->d_sb.  All
> the time.
> If you ever run into the case where that would not be true, you've
> found
> a critical bug.

Right, yes.

> 
> > +			kn = kernfs_find_ns(parent, dentry-
> > >d_name.name, ns);
> > +			if (kn)
> > +				goto out_bad;
> > +		}
> 
> Umm...  What's to prevent a race with successful rename(2)?  IOW,
> what's
> there to stabilize ->d_parent and ->d_name while we are in that
> function?

Indeed, glad you looked at this.

Now I'm wondering how kerfs_iop_rename() protects itself from
concurrent kernfs_rename_ns() ...
Ian Kent April 9, 2021, 9:34 a.m. UTC | #3
On Fri, 2021-04-09 at 16:26 +0800, Ian Kent wrote:
> On Fri, 2021-04-09 at 01:35 +0000, Al Viro wrote:
> > On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote:
> > > +		parent = kernfs_dentry_node(dentry->d_parent);
> > > +		if (parent) {
> > > +			const void *ns = NULL;
> > > +
> > > +			if (kernfs_ns_enabled(parent))
> > > +				ns = kernfs_info(dentry->d_parent-
> > > > d_sb)->ns;
> > 
> > 	For any dentry d, we have d->d_parent->d_sb == d->d_sb.  All
> > the time.
> > If you ever run into the case where that would not be true, you've
> > found
> > a critical bug.
> 
> Right, yes.
> 
> > > +			kn = kernfs_find_ns(parent, dentry-
> > > > d_name.name, ns);
> > > +			if (kn)
> > > +				goto out_bad;
> > > +		}
> > 
> > Umm...  What's to prevent a race with successful rename(2)?  IOW,
> > what's
> > there to stabilize ->d_parent and ->d_name while we are in that
> > function?
> 
> Indeed, glad you looked at this.
> 
> Now I'm wondering how kerfs_iop_rename() protects itself from
> concurrent kernfs_rename_ns() ... 

As I thought ... I haven't done an exhaustive search but I can't find
any file system that doesn't call back into kernfs from
kernfs_syscall_ops (if provided at kernfs root creation).

I don't see anything that uses kernfs that defines a .rename() op
but if there was one it would be expected to call back into kernfs
at which point it would block on kernfs_mutex (kernfs_rwsem) until
it's released.

So I don't think there can be changes in this case due to the lock
taken just above the code your questioning.

I need to think a bit about whether the dentry being negative (ie.
not having kernfs node) could allow bad things to happen ...

Or am I misunderstanding the race your pointing out here?

Ian
diff mbox series

Patch

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 4c69e2af82dac..edfeee1bf38ec 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1037,12 +1037,33 @@  static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
 
-	/* Always perform fresh lookup for negatives */
-	if (d_really_is_negative(dentry))
-		goto out_bad_unlocked;
+	mutex_lock(&kernfs_mutex);
 
 	kn = kernfs_dentry_node(dentry);
-	mutex_lock(&kernfs_mutex);
+
+	/* Negative hashed dentry? */
+	if (!kn) {
+		struct kernfs_node *parent;
+
+		/* If the kernfs node can be found this is a stale negative
+		 * hashed dentry so it must be discarded and the lookup redone.
+		 */
+		parent = kernfs_dentry_node(dentry->d_parent);
+		if (parent) {
+			const void *ns = NULL;
+
+			if (kernfs_ns_enabled(parent))
+				ns = kernfs_info(dentry->d_parent->d_sb)->ns;
+			kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
+			if (kn)
+				goto out_bad;
+		}
+
+		/* The kernfs node doesn't exist, leave the dentry negative
+		 * and return success.
+		 */
+		goto out;
+	}
 
 	/* The kernfs node has been deactivated */
 	if (!kernfs_active_read(kn))
@@ -1060,12 +1081,11 @@  static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 	if (kn->parent && kernfs_ns_enabled(kn->parent) &&
 	    kernfs_info(dentry->d_sb)->ns != kn->ns)
 		goto out_bad;
-
+out:
 	mutex_unlock(&kernfs_mutex);
 	return 1;
 out_bad:
 	mutex_unlock(&kernfs_mutex);
-out_bad_unlocked:
 	return 0;
 }
 
@@ -1080,33 +1100,24 @@  static struct dentry *kernfs_iop_lookup(struct inode *dir,
 	struct dentry *ret;
 	struct kernfs_node *parent = dir->i_private;
 	struct kernfs_node *kn;
-	struct inode *inode;
+	struct inode *inode = NULL;
 	const void *ns = NULL;
 
 	mutex_lock(&kernfs_mutex);
-
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dir->i_sb)->ns;
 
 	kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
-
-	/* no such entry */
-	if (!kn || !kernfs_active(kn)) {
-		ret = NULL;
-		goto out_unlock;
-	}
-
 	/* attach dentry and inode */
-	inode = kernfs_get_inode(dir->i_sb, kn);
-	if (!inode) {
-		ret = ERR_PTR(-ENOMEM);
-		goto out_unlock;
+	if (kn && kernfs_active(kn)) {
+		inode = kernfs_get_inode(dir->i_sb, kn);
+		if (!inode)
+			inode = ERR_PTR(-ENOMEM);
 	}
-
-	/* instantiate and hash dentry */
+	/* instantiate and hash (possibly negative) dentry */
 	ret = d_splice_alias(inode, dentry);
- out_unlock:
 	mutex_unlock(&kernfs_mutex);
+
 	return ret;
 }