[V2,1/3] blk-mq: allocate blk_mq_tags and requests in correct node

Message ID: a52afb3e09691267f2240331ef0b9962f793703d.1485971427.git.shli@fb.com
State New, archived

Commit Message

Shaohua Li Feb. 1, 2017, 5:53 p.m. UTC
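The blk_mq_tags and requests of a given hardware queue are mostly used by
specific CPUs, which might not be in the same NUMA node as the disk. For
example, with an NVMe card in node 0, half of the hardware queues will be
used by node 0 and the other half by node 1. Allocate the tags and requests
on the node of the CPUs mapped to the hardware queue, falling back to
set->numa_node when no node is found.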

Signed-off-by: Shaohua Li <shli@fb.com>
---
 block/blk-mq.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)
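
For readers following along, the node selection this patch adds to
blk_mq_alloc_rq_map() and blk_mq_alloc_rqs() can be modelled in userspace
roughly as below. This is only a sketch under stated assumptions, not the
kernel code: sketch_hw_queue_to_node(), sketch_alloc_node(), SKETCH_NO_NODE
and the cpu_node[] table are hypothetical stand-ins, and the sketch assumes
that blk_mq_hw_queue_to_node() returns the node of a CPU mapped to the given
hardware queue, or NUMA_NO_NODE when no CPU maps to it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's NUMA_NO_NODE. */
#define SKETCH_NO_NODE (-1)

/*
 * Hypothetical model of the queue-to-node lookup: mq_map[cpu] is the
 * hardware queue index a CPU submits to, cpu_node[cpu] is that CPU's
 * NUMA node. Returns the node of the first CPU mapped to hctx_idx, or
 * SKETCH_NO_NODE if no CPU uses that queue.
 */
static int sketch_hw_queue_to_node(const unsigned int *mq_map,
                                   const int *cpu_node, size_t nr_cpus,
                                   unsigned int hctx_idx)
{
    for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
        if (mq_map[cpu] == hctx_idx)
            return cpu_node[cpu];
    }
    return SKETCH_NO_NODE;
}

/*
 * Mirrors the fallback in the patch: prefer the node of the CPUs that
 * use the queue, otherwise fall back to the tag set's node (set_node
 * plays the role of set->numa_node).
 */
static int sketch_alloc_node(const unsigned int *mq_map,
                             const int *cpu_node, size_t nr_cpus,
                             unsigned int hctx_idx, int set_node)
{
    int node = sketch_hw_queue_to_node(mq_map, cpu_node, nr_cpus, hctx_idx);

    if (node == SKETCH_NO_NODE)
        node = set_node;
    return node;
}
```

With four CPUs split across two nodes and one hardware queue per node, queue
0 resolves to node 0 and queue 1 to node 1, while a queue that no CPU maps to
falls back to the node passed as set_node.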

Comments

Christoph Hellwig Feb. 1, 2017, 6:11 p.m. UTC | #1
Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Jens Axboe Feb. 1, 2017, 7:09 p.m. UTC | #2
On 02/01/2017 09:53 AM, Shaohua Li wrote:
> blk_mq_tags/requests of specific hardware queue are mostly used in
> specific cpus, which might not be in the same numa node as disk. For
> example, a nvme card is in node 0. half hardware queue will be used by
> node 0, the other node 1.

All three patches look good to me. Bjorn, to avoid complications, if
you can review/ack patch #2, then I will queue it up through the block
tree for 4.11.
Jens Axboe Feb. 24, 2017, 10:23 p.m. UTC | #3
On 02/01/2017 12:09 PM, Jens Axboe wrote:
> On 02/01/2017 09:53 AM, Shaohua Li wrote:
>> blk_mq_tags/requests of specific hardware queue are mostly used in
>> specific cpus, which might not be in the same numa node as disk. For
>> example, a nvme card is in node 0. half hardware queue will be used by
>> node 0, the other node 1.
> 
> All three patches look good to me. Bjorn, to avoid complications, if
> you can review/ack patch #2, then I will queue it up through the block
> tree for 4.11.

Bjorn, ping. You were CC'ed on the original patch three weeks ago.
Jens Axboe Feb. 25, 2017, 2:54 a.m. UTC | #4
On 02/01/2017 10:53 AM, Shaohua Li wrote:
> blk_mq_tags/requests of specific hardware queue are mostly used in
> specific cpus, which might not be in the same numa node as disk. For
> example, a nvme card is in node 0. half hardware queue will be used by
> node 0, the other node 1.

Applied 1-3 for this series, thanks Shaohua!

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 48df5fd..888077c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1666,16 +1666,20 @@  struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 					unsigned int reserved_tags)
 {
 	struct blk_mq_tags *tags;
+	int node;
+
+	node = blk_mq_hw_queue_to_node(set->mq_map, hctx_idx);
+	if (node == NUMA_NO_NODE)
+		node = set->numa_node;
 
-	tags = blk_mq_init_tags(nr_tags, reserved_tags,
-				set->numa_node,
+	tags = blk_mq_init_tags(nr_tags, reserved_tags, node,
 				BLK_MQ_FLAG_TO_ALLOC_POLICY(set->flags));
 	if (!tags)
 		return NULL;
 
 	tags->rqs = kzalloc_node(nr_tags * sizeof(struct request *),
 				 GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
-				 set->numa_node);
+				 node);
 	if (!tags->rqs) {
 		blk_mq_free_tags(tags);
 		return NULL;
@@ -1683,7 +1687,7 @@  struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
 
 	tags->static_rqs = kzalloc_node(nr_tags * sizeof(struct request *),
 				 GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
-				 set->numa_node);
+				 node);
 	if (!tags->static_rqs) {
 		kfree(tags->rqs);
 		blk_mq_free_tags(tags);
@@ -1703,6 +1707,11 @@  int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 {
 	unsigned int i, j, entries_per_page, max_order = 4;
 	size_t rq_size, left;
+	int node;
+
+	node = blk_mq_hw_queue_to_node(set->mq_map, hctx_idx);
+	if (node == NUMA_NO_NODE)
+		node = set->numa_node;
 
 	INIT_LIST_HEAD(&tags->page_list);
 
@@ -1724,7 +1733,7 @@  int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 			this_order--;
 
 		do {
-			page = alloc_pages_node(set->numa_node,
+			page = alloc_pages_node(node,
 				GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO,
 				this_order);
 			if (page)
@@ -1757,7 +1766,7 @@  int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 			if (set->ops->init_request) {
 				if (set->ops->init_request(set->driver_data,
 						rq, hctx_idx, i,
-						set->numa_node)) {
+						node)) {
 					tags->static_rqs[i] = NULL;
 					goto fail;
 				}