
[v7,5/6] lib/dlock-list: Enable faster lookup with hashing

Message ID 1507229008-20569-6-git-send-email-longman@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Waiman Long Oct. 5, 2017, 6:43 p.m. UTC
Insertion and deletion are relatively cheap and mostly contention-free
for a dlock-list. Lookup, on the other hand, can be rather costly
because all the lists in a dlock-list have to be iterated.

Currently, dlock-list insertion is based on the CPU that the task is
running on, so a given object can be inserted into any one of the
lists depending on what the current CPU is.

This patch provides an alternative way of list selection. The caller
can provide an object context which will be hashed to one of the lists
in a dlock-list. The object can then be added to that particular
list. Lookup can then be done by iterating the elements of that one
list only, instead of all the lists in the dlock-list.

The new APIs are:

struct dlock_list_head *dlock_list_hash(struct dlock_list_heads *, void *);
void dlock_list_add(struct dlock_list_node *, struct dlock_list_head *);

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/dlock-list.h |  9 +++++++++
 lib/dlock-list.c           | 49 +++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 51 insertions(+), 7 deletions(-)

Comments

Davidlohr Bueso Oct. 9, 2017, 1:08 p.m. UTC | #1
On Thu, 05 Oct 2017, Waiman Long wrote:

>Insertion and deletion are relatively cheap and mostly contention-free
>for a dlock-list. Lookup, on the other hand, can be rather costly
>because all the lists in a dlock-list have to be iterated.
>
>Currently, dlock-list insertion is based on the CPU that the task is
>running on, so a given object can be inserted into any one of the
>lists depending on what the current CPU is.
>
>This patch provides an alternative way of list selection. The caller
>can provide an object context which will be hashed to one of the lists
>in a dlock-list. The object can then be added to that particular
>list. Lookup can then be done by iterating the elements of that one
>list only, instead of all the lists in the dlock-list.

Unless I'm misusing the API, I could not find a standard way of
iterating a _particular_ list head (the one dlock_list_hash()
returned). This is because the iterators always want all the heads.

Also, in my particular epoll case I'd need the head->lock _not_ to
be dropped after the iteration, and therefore this is pretty ad hoc.
Currently we do:

dlist_for_each_entry() {
	// acquire head->lock for each list
}
// no locks held
dlist_add()

I'm thinking perhaps we could have a dlist_check_add() which takes a
callback to decide whether we want to add the node. The function could
acquire the head->lock and not release it until the very end.

Thanks,
Davidlohr
Waiman Long Oct. 9, 2017, 2:16 p.m. UTC | #2
On 10/09/2017 09:08 AM, Davidlohr Bueso wrote:
> On Thu, 05 Oct 2017, Waiman Long wrote:
>
>> Insertion and deletion are relatively cheap and mostly contention-free
>> for a dlock-list. Lookup, on the other hand, can be rather costly
>> because all the lists in a dlock-list have to be iterated.
>>
>> Currently, dlock-list insertion is based on the CPU that the task is
>> running on, so a given object can be inserted into any one of the
>> lists depending on what the current CPU is.
>>
>> This patch provides an alternative way of list selection. The caller
>> can provide an object context which will be hashed to one of the lists
>> in a dlock-list. The object can then be added to that particular
>> list. Lookup can then be done by iterating the elements of that one
>> list only, instead of all the lists in the dlock-list.
>
> Unless I'm misusing the API, I could not find a standard way of
> iterating a _particular_ list head (the one dlock_list_hash()
> returned). This is because the iterators always want all the heads.
>
> Also, in my particular epoll case I'd need the head->lock _not_ to
> be dropped after the iteration, and therefore this is pretty ad hoc.
> Currently we do:
>
> dlist_for_each_entry() {
>     // acquire head->lock for each list
> }
> // no locks held
> dlist_add()
>
> I'm thinking perhaps we could have a dlist_check_add() which takes a
> callback to decide whether we want to add the node. The function could
> acquire the head->lock and not release it until the very end.

With dlock_list_hash(), the dlock-list degenerates into a pool of
lists from which one is chosen by hashing. So the regular list
iteration macros like list_for_each_entry() can then be used. Of
course, you have to do the lock and unlock operations explicitly.

I could also encapsulate it a bit with an inlined function like

dlock_list_single_iter_init(iter, dlist, head, flags)

It could set up the iterator to iterate only one list. The flags can
indicate holding the lock after the iteration; in that case,
dlock_list_unlock(iter) will have to be called to do the unlock. I
could add a patch to do that if you prefer that route.

Cheers,
Longman
Davidlohr Bueso Oct. 9, 2017, 4:03 p.m. UTC | #3
On Mon, 09 Oct 2017, Waiman Long wrote:

>On 10/09/2017 09:08 AM, Davidlohr Bueso wrote:
>> On Thu, 05 Oct 2017, Waiman Long wrote:
>>
>>> This patch provides an alternative way of list selection. The caller
>>> can provide an object context which will be hashed to one of the lists
>>> in a dlock-list. The object can then be added to that particular
>>> list. Lookup can then be done by iterating the elements of that one
>>> list only, instead of all the lists in the dlock-list.
>>
>> Unless I'm misusing the API, I could not find a standard way of
>> iterating a _particular_ list head (the one dlock_list_hash()
>> returned). This is because the iterators always want all the heads.
>>
>> Also, in my particular epoll case I'd need the head->lock _not_ to
>> be dropped after the iteration, and therefore this is pretty ad hoc.
>> Currently we do:
>>
>> dlist_for_each_entry() {
>>     // acquire head->lock for each list
>> }
>> // no locks held
>> dlist_add()
>>
>> I'm thinking perhaps we could have a dlist_check_add() which takes a
>> callback to decide whether we want to add the node. The function could
>> acquire the head->lock and not release it until the very end.
>
>With dlock_list_hash(), the dlock-list degenerates into a pool of
>lists from which one is chosen by hashing. So the regular list
>iteration macros like list_for_each_entry() can then be used. Of
>course, you have to do the lock and unlock operations explicitly.

Right, which seemed rather asymmetric and fragile, albeit ad hoc to
epoll. In particular, not having the iter structure makes users deal
with the spinlock directly, and there goes irq awareness out the window.

>I could also encapsulate it a bit with an inlined function like

Probably not worth it, unless some other use cases come up. This just
bulks up the API even more.

Thanks,
Davidlohr
Waiman Long Oct. 9, 2017, 4:11 p.m. UTC | #4
On 10/09/2017 12:03 PM, Davidlohr Bueso wrote:
> On Mon, 09 Oct 2017, Waiman Long wrote:
>
>> On 10/09/2017 09:08 AM, Davidlohr Bueso wrote:
>>> On Thu, 05 Oct 2017, Waiman Long wrote:
>>>
>>>> This patch provides an alternative way of list selection. The caller
>>>> can provide an object context which will be hashed to one of the lists
>>>> in a dlock-list. The object can then be added to that particular
>>>> list. Lookup can then be done by iterating the elements of that one
>>>> list only, instead of all the lists in the dlock-list.
>>>
>>> Unless I'm misusing the API, I could not find a standard way of
>>> iterating a _particular_ list head (the one dlock_list_hash()
>>> returned). This is because the iterators always want all the heads.
>>>
>>> Also, in my particular epoll case I'd need the head->lock _not_ to
>>> be dropped after the iteration, and therefore this is pretty ad hoc.
>>> Currently we do:
>>>
>>> dlist_for_each_entry() {
>>>     // acquire head->lock for each list
>>> }
>>> // no locks held
>>> dlist_add()
>>>
>>> I'm thinking perhaps we could have a dlist_check_add() which takes a
>>> callback to decide whether we want to add the node. The function could
>>> acquire the head->lock and not release it until the very end.
>>
>> With dlock_list_hash(), the dlock-list degenerates into a pool of
>> lists from which one is chosen by hashing. So the regular list
>> iteration macros like list_for_each_entry() can then be used. Of
>> course, you have to do the lock and unlock operations explicitly.
>
> Right, which seemed rather asymmetric and fragile, albeit ad hoc to
> epoll. In particular, not having the iter structure makes users deal
> with the spinlock directly, and there goes irq awareness out the window.
>

Right. We don't need the irq safety support in this case.

>> I could also encapsulate it a bit with an inlined function like
>
> Probably not worth it, unless some other use cases come up. This just
> bulks up the API even more.

I am thinking of adding only one more init API, so it is not a
significant bulking-up of the API at all.

Cheers,
Longman

Patch

diff --git a/include/linux/dlock-list.h b/include/linux/dlock-list.h
index 7940e524..16474ae 100644
--- a/include/linux/dlock-list.h
+++ b/include/linux/dlock-list.h
@@ -121,6 +121,15 @@  extern void dlock_lists_add(struct dlock_list_node *node,
 extern void dlock_lists_del(struct dlock_list_node *node);
 
 /*
+ * Instead of individual list mapping by CPU number, it can be based on
+ * a given context to speed up lookup performance.
+ */
+extern struct dlock_list_head *dlock_list_hash(struct dlock_list_heads *dlist,
+					       void *context);
+extern void dlock_list_add(struct dlock_list_node *node,
+			   struct dlock_list_head *head);
+
+/*
  * Find the first entry of the next available list.
  */
 extern struct dlock_list_node *
diff --git a/lib/dlock-list.c b/lib/dlock-list.c
index a045fd7..8cd0876 100644
--- a/lib/dlock-list.c
+++ b/lib/dlock-list.c
@@ -20,6 +20,7 @@ 
 #include <linux/lockdep.h>
 #include <linux/slab.h>
 #include <linux/cpumask.h>
+#include <linux/jhash.h>
 
 /*
  * The distributed and locked list is a distributed set of lists each of
@@ -163,6 +164,46 @@  bool dlock_lists_empty(struct dlock_list_heads *dlist)
 }
 
 /**
+ * dlock_list_hash - Hash the given context to a particular list
+ * @dlist: The dlock list
+ * @ctx  : The context for hashing
+ */
+struct dlock_list_head *dlock_list_hash(struct dlock_list_heads *dlist,
+					void *ctx)
+{
+	unsigned long val = (unsigned long)ctx;
+	u32 hash;
+
+	if (unlikely(!nr_dlock_lists)) {
+		WARN_ON_ONCE(1);
+		return &dlist->heads[0];
+	}
+	if (val < nr_dlock_lists)
+		hash = val;
+	else
+		hash = jhash2((u32 *)&ctx, sizeof(ctx)/sizeof(u32), 0)
+				% nr_dlock_lists;
+	return &dlist->heads[hash];
+}
+
+/**
+ * dlock_list_add - Add a node to a particular head of dlock list
+ * @node: The node to be added
+ * @head: The dlock list head where the node is to be added
+ */
+void dlock_list_add(struct dlock_list_node *node,
+		    struct dlock_list_head *head)
+{
+	/*
+	 * There is no need to disable preemption
+	 */
+	spin_lock(&head->lock);
+	node->head = head;
+	list_add(&node->list, &head->list);
+	spin_unlock(&head->lock);
+}
+
+/**
  * dlock_lists_add - Adds a node to the given dlock list
  * @node : The node to be added
  * @dlist: The dlock list where the node is to be added
@@ -175,13 +216,7 @@  void dlock_lists_add(struct dlock_list_node *node,
 {
 	struct dlock_list_head *head = &dlist->heads[this_cpu_read(cpu2idx)];
 
-	/*
-	 * There is no need to disable preemption
-	 */
-	spin_lock(&head->lock);
-	node->head = head;
-	list_add(&node->list, &head->list);
-	spin_unlock(&head->lock);
+	dlock_list_add(node, head);
 }
 
 /**