From patchwork Mon Jul 11 17:32:12 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long <Waiman.Long@hpe.com>
X-Patchwork-Id: 9223875
From: Waiman Long <Waiman.Long@hpe.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.com>,
	Jeff Layton <jlayton@poochiereds.net>,
	"J. Bruce Fields" <bfields@fieldses.org>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>,
	Andi Kleen <andi@firstfloor.org>, Dave Chinner <dchinner@redhat.com>,
	Boqun Feng <boqun.feng@gmail.com>, Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>, Waiman Long <Waiman.Long@hpe.com>
Subject: [RFC PATCH v2 7/7] lib/dlock-list: Use the per-subnode APIs for
	managing lists
Date: Mon, 11 Jul 2016 13:32:12 -0400
Message-Id: <1468258332-61537-8-git-send-email-Waiman.Long@hpe.com>
X-Mailer: git-send-email 1.7.1
In-Reply-To: <1468258332-61537-1-git-send-email-Waiman.Long@hpe.com>
References: <1468258332-61537-1-git-send-email-Waiman.Long@hpe.com>
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-fsdevel.vger.kernel.org>
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

This patch converts dlock-list from the per-cpu APIs to the per-subnode
APIs for managing the distributed lists. As a result, the number of
lists that dlock_list_iterate() has to walk is reduced by at least
half, making the iteration a bit faster.

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 include/linux/dlock-list.h |   81 +++++++++++++++++++++----------------------
 lib/dlock-list.c           |   19 +++++-----
 2 files changed, 50 insertions(+), 50 deletions(-)
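
Note for reviewers (not part of the patch): the effect described in the
changelog can be illustrated with a small userspace model. This is only
a sketch under stated assumptions, not the kernel implementation. The
names NR_CPUS, CPUS_PER_SUBNODE, cpu_to_subnode(), model_add() and
model_iterate() are hypothetical, the grouping of two consecutive CPUs
per subnode is an assumption made for illustration, and the per-list
spinlocks are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical topology: 8 CPUs, 2 CPUs sharing one subnode. */
#define NR_CPUS			8
#define CPUS_PER_SUBNODE	2
#define NR_SUBNODES	((NR_CPUS + CPUS_PER_SUBNODE - 1) / CPUS_PER_SUBNODE)

struct node {
	struct node *next;
	int value;
};

/* One list head per subnode; the kernel version also carries a spinlock. */
struct bucket {
	struct node *first;
};

/* Assumed mapping: consecutive CPUs are grouped into the same subnode. */
static int cpu_to_subnode(int cpu)
{
	return cpu / CPUS_PER_SUBNODE;
}

/* Add a node to the list of the subnode that the given CPU belongs to. */
static void model_add(struct bucket *buckets, int cpu, struct node *n)
{
	struct bucket *b;

	assert(cpu >= 0 && cpu < NR_CPUS);
	b = &buckets[cpu_to_subnode(cpu)];
	n->next = b->first;
	b->first = n;
}

/*
 * Walk all the lists as if they were one consolidated list.
 * Returns the number of entries seen and stores the number of list
 * heads that had to be visited in *heads_visited.
 */
static int model_iterate(const struct bucket *buckets, int nr_buckets,
			 int *heads_visited)
{
	const struct node *n;
	int entries = 0;
	int i;

	*heads_visited = 0;
	for (i = 0; i < nr_buckets; i++) {
		(*heads_visited)++;
		for (n = buckets[i].first; n; n = n->next)
			entries++;
	}
	return entries;
}
```

With one node added from each CPU, model_iterate() still sees all
NR_CPUS entries but visits only NR_SUBNODES list heads, which is half
of NR_CPUS in this assumed topology. A larger CPUS_PER_SUBNODE shrinks
the number of heads further, at the cost of more CPUs contending on
each list lock.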

diff --git a/include/linux/dlock-list.h b/include/linux/dlock-list.h
index a8e1fd2..01667fc 100644
--- a/include/linux/dlock-list.h
+++ b/include/linux/dlock-list.h
@@ -20,12 +20,12 @@
 
 #include <linux/spinlock.h>
 #include <linux/list.h>
-#include <linux/percpu.h>
+#include <linux/persubnode.h>
 
 /*
  * include/linux/dlock-list.h
  *
- * A distributed (per-cpu) set of lists each of which is protected by its
+ * A distributed (per-subnode) set of lists each of which is protected by its
  * own spinlock, but acts like a single consolidated list to the callers.
  *
  * The dlock_list_head structure contains the spinlock, the other
@@ -45,19 +45,19 @@ struct dlock_list_head {
 	}
 
 /*
- * Per-cpu list iteration state
+ * Per-subnode list iteration state
  */
 struct dlock_list_state {
-	int			 cpu;
+	int			 snid;	/* Subnode ID */
 	spinlock_t		*lock;
-	struct list_head	*head;	/* List head of current per-cpu list */
+	struct list_head	*head;	/* List head of current per-subnode list */
 	struct dlock_list_node	*curr;
 	struct dlock_list_node	*next;
 };
 
 #define DLOCK_LIST_STATE_INIT()			\
 	{					\
-		.cpu  = -1,			\
+		.snid = -1,			\
 		.lock = NULL,			\
 		.head = NULL,			\
 		.curr = NULL,			\
@@ -69,7 +69,7 @@ struct dlock_list_state {
 
 static inline void init_dlock_list_state(struct dlock_list_state *state)
 {
-	state->cpu  = -1;
+	state->snid = -1;
 	state->lock = NULL;
 	state->head = NULL;
 	state->curr = NULL;
@@ -83,12 +83,12 @@ static inline void init_dlock_list_state(struct dlock_list_state *state)
 #endif
 
 /*
- * Next per-cpu list entry
+ * Next per-subnode list entry
  */
 #define dlock_list_next_entry(pos, member) list_next_entry(pos, member.list)
 
 /*
- * Per-cpu node data structure
+ * Per-subnode node data structure
  */
 struct dlock_list_node {
 	struct list_head list;
@@ -109,50 +109,50 @@ static inline void init_dlock_list_node(struct dlock_list_node *node)
 }
 
 static inline void
-free_dlock_list_head(struct dlock_list_head __percpu **pdlock_head)
+free_dlock_list_head(struct dlock_list_head __persubnode **pdlock_head)
 {
-	free_percpu(*pdlock_head);
+	free_persubnode(*pdlock_head);
 	*pdlock_head = NULL;
 }
 
 /*
- * Check if all the per-cpu lists are empty
+ * Check if all the per-subnode lists are empty
  */
-static inline bool dlock_list_empty(struct dlock_list_head __percpu *dlock_head)
+static inline bool dlock_list_empty(struct dlock_list_head __persubnode *dlock_head)
 {
-	int cpu;
+	int snid;
 
-	for_each_possible_cpu(cpu)
-		if (!list_empty(&per_cpu_ptr(dlock_head, cpu)->list))
+	for_each_subnode(snid)
+		if (!list_empty(&per_subnode_ptr(dlock_head, snid)->list))
 			return false;
 	return true;
 }
 
 /*
- * Helper function to find the first entry of the next per-cpu list
- * It works somewhat like for_each_possible_cpu(cpu).
+ * Helper function to find the first entry of the next per-subnode list
+ * It works somewhat like for_each_subnode(snid).
  *
  * Return: true if the entry is found, false if all the lists exhausted
  */
 static __always_inline bool
-__dlock_list_next_cpu(struct dlock_list_head __percpu *head,
+__dlock_list_next_subnode(struct dlock_list_head __persubnode *head,
 		      struct dlock_list_state *state)
 {
 	if (state->lock)
 		spin_unlock(state->lock);
-next_cpu:
+next_subnode:
 	/*
-	 * for_each_possible_cpu(cpu)
+	 * for_each_subnode(snid)
 	 */
-	state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
-	if (state->cpu >= nr_cpu_ids)
-		return false;	/* All the per-cpu lists iterated */
+	state->snid = cpumask_next(state->snid, subnode_mask);
+	if (state->snid >= nr_subnode_ids)
+		return false;	/* All the per-subnode lists iterated */
 
-	state->head = &per_cpu_ptr(head, state->cpu)->list;
+	state->head = &per_subnode_ptr(head, state->snid)->list;
 	if (list_empty(state->head))
-		goto next_cpu;
+		goto next_subnode;
 
-	state->lock = &per_cpu_ptr(head, state->cpu)->lock;
+	state->lock = &per_subnode_ptr(head, state->snid)->lock;
 	spin_lock(state->lock);
 	/*
 	 * There is a slight chance that the list may become empty just
@@ -161,7 +161,7 @@ next_cpu:
 	 */
 	if (list_empty(state->head)) {
 		spin_unlock(state->lock);
-		goto next_cpu;
+		goto next_subnode;
 	}
 	state->curr = list_entry(state->head->next,
 				 struct dlock_list_node, list);
@@ -169,11 +169,11 @@ next_cpu:
 }
 
 /*
- * Iterate to the next entry of the group of per-cpu lists
+ * Iterate to the next entry of the group of per-subnode lists
  *
  * Return: true if the next entry is found, false if all the entries iterated
  */
-static inline bool dlock_list_iterate(struct dlock_list_head __percpu *head,
+static inline bool dlock_list_iterate(struct dlock_list_head __persubnode *head,
 				      struct dlock_list_state *state)
 {
 	/*
@@ -184,10 +184,10 @@ static inline bool dlock_list_iterate(struct dlock_list_head __percpu *head,
 
 	if (!state->curr || (&state->curr->list == state->head)) {
 		/*
-		 * The current per-cpu list has been exhausted, try the next
-		 * per-cpu list.
+		 * The current per-subnode list has been exhausted, try the next
+		 * per-subnode list.
 		 */
-		if (!__dlock_list_next_cpu(head, state))
+		if (!__dlock_list_next_subnode(head, state))
 			return false;
 	}
 
@@ -196,13 +196,13 @@ static inline bool dlock_list_iterate(struct dlock_list_head __percpu *head,
 }
 
 /*
- * Iterate to the next entry of the group of per-cpu lists and safe
+ * Iterate to the next entry of the group of per-subnode lists and safe
  * against removal of list_entry
  *
  * Return: true if the next entry is found, false if all the entries iterated
  */
 static inline bool
-dlock_list_iterate_safe(struct dlock_list_head __percpu *head,
+dlock_list_iterate_safe(struct dlock_list_head __persubnode *head,
 			struct dlock_list_state *state)
 {
 	/*
@@ -215,10 +215,10 @@ dlock_list_iterate_safe(struct dlock_list_head __percpu *head,
 
 	if (!state->curr || (&state->curr->list == state->head)) {
 		/*
-		 * The current per-cpu list has been exhausted, try the next
-		 * per-cpu list.
+		 * The current per-subnode list has been exhausted, try the next
+		 * per-subnode list.
 		 */
-		if (!__dlock_list_next_cpu(head, state))
+		if (!__dlock_list_next_subnode(head, state))
 			return false;
 		state->next = list_next_entry(state->curr, list);
 	}
@@ -228,8 +228,7 @@ dlock_list_iterate_safe(struct dlock_list_head __percpu *head,
 }
 
 extern void dlock_list_add(struct dlock_list_node *node,
-			   struct dlock_list_head __percpu *head);
+			   struct dlock_list_head __persubnode *head);
 extern void dlock_list_del(struct dlock_list_node *node);
-extern int  init_dlock_list_head(struct dlock_list_head __percpu **pdlock_head);
-
+extern int  init_dlock_list_head(struct dlock_list_head __persubnode **pdlock_head);
 #endif /* __LINUX_DLOCK_LIST_H */
diff --git a/lib/dlock-list.c b/lib/dlock-list.c
index e1a1930..05bbf45 100644
--- a/lib/dlock-list.c
+++ b/lib/dlock-list.c
@@ -25,20 +25,21 @@
 static struct lock_class_key dlock_list_key;
 
 /*
- * Initialize the per-cpu list head
+ * Initialize the per-subnode list head
  */
-int init_dlock_list_head(struct dlock_list_head __percpu **pdlock_head)
+int init_dlock_list_head(struct dlock_list_head __persubnode **pdlock_head)
 {
 	struct dlock_list_head *dlock_head;
-	int cpu;
+	int snid;
 
-	dlock_head = alloc_percpu(struct dlock_list_head);
+	dlock_head = alloc_persubnode(struct dlock_list_head);
 	if (!dlock_head)
 		return -ENOMEM;
 
-	for_each_possible_cpu(cpu) {
-		struct dlock_list_head *head = per_cpu_ptr(dlock_head, cpu);
+	for_each_subnode(snid) {
+		struct dlock_list_head *head;
 
+		head = per_subnode_ptr(dlock_head, snid);
 		INIT_LIST_HEAD(&head->list);
 		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
 		lockdep_set_class(&head->lock, &dlock_list_key);
@@ -54,19 +55,19 @@ int init_dlock_list_head(struct dlock_list_head __percpu **pdlock_head)
  * So we still need to use a lock to protect the content of the list.
  */
 void dlock_list_add(struct dlock_list_node *node,
-		    struct dlock_list_head __percpu *head)
+		    struct dlock_list_head __persubnode *head)
 {
 	struct dlock_list_head *myhead;
 
 	/*
 	 * Disable preemption to make sure that CPU won't gets changed.
 	 */
-	myhead = get_cpu_ptr(head);
+	myhead = get_subnode_ptr(head);
 	spin_lock(&myhead->lock);
 	node->lockptr = &myhead->lock;
 	list_add(&node->list, &myhead->list);
 	spin_unlock(&myhead->lock);
-	put_cpu_ptr(head);
+	put_subnode_ptr(head);
 }
 
 /*