diff mbox

Q. hlist_bl_add_head_rcu() in d_alloc_parallel()

Message ID 20287.1466644756@jrobl (mailing list archive)
State New, archived
Headers show

Commit Message

J. R. Okajima June 23, 2016, 1:19 a.m. UTC
Al Viro:
> That check is definitely bogus and I'm completely at loss as to WTF is it
> doing there.  Thanks for catching that; this kind of idiotic braino can
> escape notice when rereading the code again and again, unfortunately ;-/
>
> Fixed, will push to Linus tonight or tomorrow.

Thank you too for fixing.
I've confirmed the patch was merged and it passed my local tests
too. Happy.
I have another and relatively less important suggestion. Since the two
groups tests in the loop are very similar, how about extract them as a
new single static function?
Do you think it will invite a performance down?


J. R. Okajima


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Al Viro June 23, 2016, 2:58 a.m. UTC | #1
[Linus and George Cc'd since it's close to the area affected by hash
rework and friends]

On Thu, Jun 23, 2016 at 10:19:16AM +0900, J. R. Okajima wrote:
> 
> Al Viro:
> > That check is definitely bogus and I'm completely at loss as to WTF is it
> > doing there.  Thanks for catching that; this kind of idiotic braino can
> > escape notice when rereading the code again and again, unfortunately ;-/
> >
> > Fixed, will push to Linus tonight or tomorrow.
> 
> Thank you too for fixing.
> I've confirmed the patch was merged and it passed my local tests
> too. Happy.
> I have another and relatively less important suggestion. Since the two
> groups tests in the loop are very similar, how about extract them as a
> new single static function?
> Do you think it will invite a performance down?

We do have too many duplicates, especially if you count the related but not
identical ones as well.

1) __d_lookup():
	check hash
	grab d_lock to stabilize the rest
	check parent
	check if it's hashed
	check name/length
2) d_alloc_parallel(), search in in-lookup hash:
	hash/parent/name stable for everything in in-lookup hash, need no locks
	check hash
	check parent
	check name/length
3) d_alloc_parallel(), check after waiting:
	d_lock already held
	check hash
	check parent
	check if it's hashed
	check name/length
4) d_exact_alias():
	check hash
	check parent
	check name/length (simplified since at the moment it's only used for
filesystems that do not have ->d_compare()).

FWIW, now that I look at d_exact_alias()... the comment in there needs
an update; the code is correct, but only because we don't call
that thing for directories.  Which means that d_splice_alias()-induced
moves do not affect us, leaving only d_move(), which is always called
with parent locked exclusive.

Hell knows...  Order of checks can be delicate; out of those cases, (1) and (2)
are on the hottest paths.  We certainly can do this:

static always_inline bool d_same_name(const struct dentry *dentry,
				      const struct dentry *parent,
				      const struct qstr *name)
{
	if (parent->d_flags & DCACHE_OP_COMPARE) {
		int tlen = dentry->d_name.len;
		const char *tname = dentry->d_name.name;
		if (parent->d_op->d_compare(parent, dentry, tlen, tname, name))
			return false;
	} else {
		if (dentry->d_name.len != qstr->len)
			return false;
		if (dentry_cmp(dentry, qstr->name, qstr->len))
			return false;
	}
	return true;
}

then switch all four to this.  That would reduce the amount of boilerplate;
I would hesitate to replace always_inline with inline, since we don't want
(1) and (2) to become uninlined, but maybe even that doesn't matter all that
much - most of rejects will happen without looking at the names.  It really
needs profiling...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds June 24, 2016, 5:57 a.m. UTC | #2
On Wed, Jun 22, 2016 at 7:58 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Hell knows...  Order of checks can be delicate; out of those cases, (1) and (2)
> are on the hottest paths.

Actuially, as long as we leave the actual RCU paths alone, none of the
lookup paths are all that hot in any profile I've seen recently.

So I'm certainly ok with using that d_same_name() helper for the four
cases you mention (__d_lookup, d_alloc_parallel*2, d_exact_alias).

In fact, if you add a "unlikely()" for the DCACHE_OP_COMPARE test, you
might even improve on code generation.

The non-RCU case basically never shows up in profiles (it used to,
with symlinks, but you fixed that case), and if it ever does I suspect
that the fix will be to make sure we don't fall out of rcu.

So don't worry too much about __d_lookup() being a hot case, I don't
think it is.

(Of course, if you have a load that shows me wrong, let's look at it
by all means. Maybe the loads I have used have been bad)

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro June 25, 2016, 10:54 p.m. UTC | #3
On Thu, Jun 23, 2016 at 10:57:09PM -0700, Linus Torvalds wrote:

> The non-RCU case basically never shows up in profiles (it used to,
> with symlinks, but you fixed that case), and if it ever does I suspect
> that the fix will be to make sure we don't fall out of rcu.
> 
> So don't worry too much about __d_lookup() being a hot case, I don't
> think it is.
> 
> (Of course, if you have a load that shows me wrong, let's look at it
> by all means. Maybe the loads I have used have been bad)

BTW, speaking of that area - is there any reason why dentry_cmp()
isn't simply
	return dentry_string_cmp(lockless_dereference(dentry->d_name.name),
				 ct, tcount);
other than "it predates lockless_dereference()"?

	The only difference is s/ACCESS_ONCE/READ_ONCE/, AFAICS -
lockless_dereference() uses the latter these days.  And that shouldn't
be a problem; as the matter of fact, *all* remaining ACCESS_ONCE in
fs/dcache.c look like they should become READ_ONCE.  Objections?

Al, digging through the barrier-related issues with dentries and peeling the
paint off the walls in process...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds June 26, 2016, 1:25 a.m. UTC | #4
On Sat, Jun 25, 2016 at 3:54 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> BTW, speaking of that area - is there any reason why dentry_cmp()
> isn't simply
>         return dentry_string_cmp(lockless_dereference(dentry->d_name.name),
>                                  ct, tcount);
> other than "it predates lockless_dereference()"?

No, the only important thing is that it's a read-once, and has the
smp_read_barrier_depends().

So lockless_derefence() looks like the right thing to do, it is just
that - as you suspected - the code predates that concept.

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro June 29, 2016, 8:17 a.m. UTC | #5
On Sat, Jun 25, 2016 at 06:25:58PM -0700, Linus Torvalds wrote:
> On Sat, Jun 25, 2016 at 3:54 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > BTW, speaking of that area - is there any reason why dentry_cmp()
> > isn't simply
> >         return dentry_string_cmp(lockless_dereference(dentry->d_name.name),
> >                                  ct, tcount);
> > other than "it predates lockless_dereference()"?
> 
> No, the only important thing is that it's a read-once, and has the
> smp_read_barrier_depends().
> 
> So lockless_derefence() looks like the right thing to do, it is just
> that - as you suspected - the code predates that concept.

... and as the matter of fact, patch to that effect (only cosmetically different
from what I'd cooked) had been posted by He Kuang back in March.  At least
I'd spotted that before pushing mine out...

The patch posted back then applied, with slightly edited commit message.  Apologies
for missing it...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
He Kuang June 29, 2016, 9:22 a.m. UTC | #6
在 2016/6/29 16:17, Al Viro 写道:
> On Sat, Jun 25, 2016 at 06:25:58PM -0700, Linus Torvalds wrote:
>> On Sat, Jun 25, 2016 at 3:54 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> BTW, speaking of that area - is there any reason why dentry_cmp()
>>> isn't simply
>>>          return dentry_string_cmp(lockless_dereference(dentry->d_name.name),
>>>                                   ct, tcount);
>>> other than "it predates lockless_dereference()"?
>> No, the only important thing is that it's a read-once, and has the
>> smp_read_barrier_depends().
>>
>> So lockless_derefence() looks like the right thing to do, it is just
>> that - as you suspected - the code predates that concept.
> ... and as the matter of fact, patch to that effect (only cosmetically different
> from what I'd cooked) had been posted by He Kuang back in March.  At least
> I'd spotted that before pushing mine out...
>
> The patch posted back then applied, with slightly edited commit message.  Apologies
> for missing it...
Its okay and really appreciate you remembering, I've forgotten that 
patch if not mentioned. :-)

Thank you.


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/dcache.c b/fs/dcache.c
index 76afffd..3a9ebbc 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -2462,14 +2462,42 @@  static void d_wait_lookup(struct dentry *dentry)
 	}
 }
 
+/* tests for d_alloc_parallel() */
+static bool dap_test(struct dentry *dentry, const struct qstr *name,
+		     struct dentry *parent, bool do_unhashed_test)
+{
+	bool ret;
+
+	ret = false;
+	if (dentry->d_name.hash != name->hash)
+		goto out;
+	if (dentry->d_parent != parent)
+		goto out;
+	if (do_unhashed_test && d_unhashed(dentry))
+		goto out;
+	if (parent->d_flags & DCACHE_OP_COMPARE) {
+		int tlen = dentry->d_name.len;
+		const char *tname = dentry->d_name.name;
+		if (parent->d_op->d_compare(parent, dentry, tlen, tname, name))
+			goto out;
+	} else {
+		unsigned int len = name->len;
+		if (dentry->d_name.len != len)
+			goto out;
+		if (dentry_cmp(dentry, name->name, len))
+			goto out;
+	}
+	ret = true;
+
+out:
+	return ret;
+}
+
 struct dentry *d_alloc_parallel(struct dentry *parent,
 				const struct qstr *name,
 				wait_queue_head_t *wq)
 {
-	unsigned int len = name->len;
-	unsigned int hash = name->hash;
-	const unsigned char *str = name->name;
-	struct hlist_bl_head *b = in_lookup_hash(parent, hash);
+	struct hlist_bl_head *b = in_lookup_hash(parent, name->hash);
 	struct hlist_bl_node *node;
 	struct dentry *new = d_alloc(parent, name);
 	struct dentry *dentry;
@@ -2515,21 +2543,8 @@  retry:
 	 * we encounter.
 	 */
 	hlist_bl_for_each_entry(dentry, node, b, d_u.d_in_lookup_hash) {
-		if (dentry->d_name.hash != hash)
-			continue;
-		if (dentry->d_parent != parent)
+		if (!dap_test(dentry, name, parent, /*do_unhashed_test*/false))
 			continue;
-		if (parent->d_flags & DCACHE_OP_COMPARE) {
-			int tlen = dentry->d_name.len;
-			const char *tname = dentry->d_name.name;
-			if (parent->d_op->d_compare(parent, dentry, tlen, tname, name))
-				continue;
-		} else {
-			if (dentry->d_name.len != len)
-				continue;
-			if (dentry_cmp(dentry, str, len))
-				continue;
-		}
 		hlist_bl_unlock(b);
 		/* now we can try to grab a reference */
 		if (!lockref_get_not_dead(&dentry->d_lockref)) {
@@ -2550,23 +2565,9 @@  retry:
 		 * d_lookup() would've found anyway.  If it is, just return it;
 		 * otherwise we really have to repeat the whole thing.
 		 */
-		if (unlikely(dentry->d_name.hash != hash))
-			goto mismatch;
-		if (unlikely(dentry->d_parent != parent))
+		if (unlikely(!dap_test(dentry, name, parent,
+				       /*do_unhashed_test*/true)))
 			goto mismatch;
-		if (unlikely(d_unhashed(dentry)))
-			goto mismatch;
-		if (parent->d_flags & DCACHE_OP_COMPARE) {
-			int tlen = dentry->d_name.len;
-			const char *tname = dentry->d_name.name;
-			if (parent->d_op->d_compare(parent, dentry, tlen, tname, name))
-				goto mismatch;
-		} else {
-			if (unlikely(dentry->d_name.len != len))
-				goto mismatch;
-			if (unlikely(dentry_cmp(dentry, str, len)))
-				goto mismatch;
-		}
 		/* OK, it *is* a hashed match; return it */
 		spin_unlock(&dentry->d_lock);
 		dput(new);