
block: kyber: make kyber more friendly with merging

Message ID 7c93b85f-6a02-bdcf-5e92-a9533500df20@kernel.dk (mailing list archive)
State New, archived

Commit Message

Jens Axboe May 22, 2018, 5:46 p.m. UTC
On 5/22/18 10:20 AM, Jens Axboe wrote:
> On 5/22/18 10:17 AM, Holger Hoffstätte wrote:
>> On 05/22/18 16:48, Jianchao Wang wrote:
>>> Currently, kyber is very unfriendly to merging. kyber depends on
>>> the ctx rq_list to do merging, but most of the time it will not
>>> leave any requests in the ctx rq_list. This is because even if the
>>> tokens of one domain are used up, kyber will still try to dispatch
>>> requests from the other domains and flush the rq_list in the process.
>>>
>>> To improve this, we set up a kyber_ctx_queue (kcq), which is similar
>>> to a ctx but has a separate rq_list per domain, and build the same
>>> mapping between kcq and khd as between ctx and hctx. Then we can
>>> merge, insert and dispatch for each domain separately. If the tokens
>>> of one domain are used up, the requests can be left in that domain's
>>> rq_list and possibly be merged with following IO.
>>>
>>> The following are my test results on a machine with 8 cores and an
>>> NVMe card (INTEL SSDPEKKR128G7):
>>>
>>> fio size=256m ioengine=libaio iodepth=64 direct=1 numjobs=8
>>> seq/random
>>> +--------+----------+-----------+------------+-----------------+---------+
>>> | patch? | bw(MB/s) |   iops    | slat(usec) |   clat(usec)    |  merge  |
>>> +--------+----------+-----------+------------+-----------------+---------+
>>> |  w/o   | 606/612  | 151k/153k | 6.89/7.03  | 3349.21/3305.40 |   0/0   |
>>> +--------+----------+-----------+------------+-----------------+---------+
>>> |  w/    | 1083/616 | 277k/154k | 4.93/6.95  | 1830.62/3279.95 | 223k/3k |
>>> +--------+----------+-----------+------------+-----------------+---------+
>>> When numjobs is set to 16, the bandwidth and IOPS reach 1662MB/s and
>>> 425k on my platform.
>>>
>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
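
For illustration, a minimal sketch of the per-ctx, per-domain software queue
described in the commit message above; the struct layout and field names here
are assumptions for illustration only, since the kyber patch itself is not
reproduced in this message:

/*
 * Hypothetical kyber_ctx_queue (kcq): one per blk_mq_ctx, holding a
 * pending-request list per kyber scheduling domain so that requests can
 * be left queued (and later merged) when a domain's tokens are exhausted.
 * Names are illustrative, not taken from the actual patch.
 */
struct kyber_ctx_queue {
	/* Serializes merge/insert against dispatch for this kcq. */
	spinlock_t lock;
	/* One pending-request list per scheduling domain. */
	struct list_head rq_list[KYBER_NUM_DOMAINS];
} ____cacheline_aligned_in_smp;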
>>
>> <snip>
>>
>> This looks great but prevents kyber from being built as a module,
>> which is AFAIK supposed to work (and works now):
>>
>> ..
>>   CC [M]  block/kyber-iosched.o
>>   Building modules, stage 2.
>>   MODPOST 313 modules
>> ERROR: "bio_attempt_back_merge" [block/kyber-iosched.ko] undefined!
>> ERROR: "bio_attempt_front_merge" [block/kyber-iosched.ko] undefined!
>> ERROR: "bio_attempt_discard_merge" [block/kyber-iosched.ko] undefined!
>> ERROR: "blk_try_merge" [block/kyber-iosched.ko] undefined!
>> ERROR: "blk_rq_merge_ok" [block/kyber-iosched.ko] undefined!
>> ..
>>
>> It does build fine when compiled in, obviously. :)
> 
> It's basically duplicating the contents of blk_attempt_plug_merge().
> I would suggest abstracting out the list loop and merge check
> into a helper that could then be called from both kyber and the
> plug merge function.

See attached, prep patch and yours rebased on top of it.

Comments

Holger Hoffstätte May 22, 2018, 6:32 p.m. UTC | #1
On 05/22/18 19:46, Jens Axboe wrote:
> On 5/22/18 10:20 AM, Jens Axboe wrote:
>> On 5/22/18 10:17 AM, Holger Hoffstätte wrote:
>>> On 05/22/18 16:48, Jianchao Wang wrote:
>>>> Currently, kyber is very unfriendly to merging. kyber depends on
>>>> the ctx rq_list to do merging, but most of the time it will not
>>>> leave any requests in the ctx rq_list. This is because even if the
>>>> tokens of one domain are used up, kyber will still try to dispatch
>>>> requests from the other domains and flush the rq_list in the process.
>>>>
>>>> To improve this, we set up a kyber_ctx_queue (kcq), which is similar
>>>> to a ctx but has a separate rq_list per domain, and build the same
>>>> mapping between kcq and khd as between ctx and hctx. Then we can
>>>> merge, insert and dispatch for each domain separately. If the tokens
>>>> of one domain are used up, the requests can be left in that domain's
>>>> rq_list and possibly be merged with following IO.
>>>>
>>>> The following are my test results on a machine with 8 cores and an
>>>> NVMe card (INTEL SSDPEKKR128G7):
>>>>
>>>> fio size=256m ioengine=libaio iodepth=64 direct=1 numjobs=8
>>>> seq/random
>>>> +--------+----------+-----------+------------+-----------------+---------+
>>>> | patch? | bw(MB/s) |   iops    | slat(usec) |   clat(usec)    |  merge  |
>>>> +--------+----------+-----------+------------+-----------------+---------+
>>>> |  w/o   | 606/612  | 151k/153k | 6.89/7.03  | 3349.21/3305.40 |   0/0   |
>>>> +--------+----------+-----------+------------+-----------------+---------+
>>>> |  w/    | 1083/616 | 277k/154k | 4.93/6.95  | 1830.62/3279.95 | 223k/3k |
>>>> +--------+----------+-----------+------------+-----------------+---------+
>>>> When numjobs is set to 16, the bandwidth and IOPS reach 1662MB/s and
>>>> 425k on my platform.
>>>>
>>>> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
>>>
>>> <snip>
>>>
>>> This looks great but prevents kyber from being built as a module,
>>> which is AFAIK supposed to work (and works now):
>>>
>>> ..
>>>   CC [M]  block/kyber-iosched.o
>>>   Building modules, stage 2.
>>>   MODPOST 313 modules
>>> ERROR: "bio_attempt_back_merge" [block/kyber-iosched.ko] undefined!
>>> ERROR: "bio_attempt_front_merge" [block/kyber-iosched.ko] undefined!
>>> ERROR: "bio_attempt_discard_merge" [block/kyber-iosched.ko] undefined!
>>> ERROR: "blk_try_merge" [block/kyber-iosched.ko] undefined!
>>> ERROR: "blk_rq_merge_ok" [block/kyber-iosched.ko] undefined!
>>> ..
>>>
>>> It does build fine when compiled in, obviously. :)
>>
>> It's basically duplicating the contents of blk_attempt_plug_merge().
>> I would suggest abstracting out the list loop and merge check
>> into a helper that could then be called from both kyber and the
>> plug merge function.
> 
> See attached, prep patch and yours rebased on top of it.

That was quick. :)

Applies smoothly on top of my 4.16++ tree, now builds correctly as a
module and is reproducibly (slightly) faster even on my pedestrian
SATA SSDs, now on par with or occasionally even faster than mq-deadline.
What's not to like? So:

Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>

cheers,
Holger
jianchao.wang May 23, 2018, 1:59 a.m. UTC | #2
Hi Jens and Holger

Thanks for your kind response.
That's really appreciated.

I will post the next version based on Jens' patch.

Thanks
Jianchao

On 05/23/2018 02:32 AM, Holger Hoffstätte wrote:
>>>> This looks great but prevents kyber from being built as a module,
>>>> which is AFAIK supposed to work (and works now):
>>>>
>>>> ..
>>>>   CC [M]  block/kyber-iosched.o
>>>>   Building modules, stage 2.
>>>>   MODPOST 313 modules
>>>> ERROR: "bio_attempt_back_merge" [block/kyber-iosched.ko] undefined!
>>>> ERROR: "bio_attempt_front_merge" [block/kyber-iosched.ko] undefined!
>>>> ERROR: "bio_attempt_discard_merge" [block/kyber-iosched.ko] undefined!
>>>> ERROR: "blk_try_merge" [block/kyber-iosched.ko] undefined!
>>>> ERROR: "blk_rq_merge_ok" [block/kyber-iosched.ko] undefined!
>>>> ..
>>>>
>>>> It does build fine when compiled in, obviously. :)
>>> It's basically duplicating the contents of blk_attempt_plug_merge().
>>> I would suggest abstracting out the list loop and merge check
>>> into a helper that could then be called from both kyber and the
>>> plug merge function.
>> See attached, prep patch and yours rebased on top of it.
> That was quick. :)
> 
> Applies smoothly on top of my 4.16++ tree, now builds correctly as a
> module and is reproducibly (slightly) faster even on my pedestrian
> SATA SSDs, now on par with or occasionally even faster than mq-deadline.
> What's not to like? So:
> 
> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>

Patch

From 82d8018412c12407fbfb134c7be5410ba5a76084 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 22 May 2018 11:41:46 -0600
Subject: [PATCH 1/2] blk-mq: abstract out blk-mq-sched rq list iteration bio
 merge helper

No functional changes in this patch, just a prep patch for utilizing
this in an IO scheduler.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq-sched.c   | 34 ++++++++++++++++++++++++----------
 include/linux/blk-mq.h |  3 ++-
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index 25c14c58385c..b0f2c2a40a0c 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -268,19 +268,16 @@  bool blk_mq_sched_try_merge(struct request_queue *q, struct bio *bio,
 EXPORT_SYMBOL_GPL(blk_mq_sched_try_merge);
 
 /*
- * Reverse check our software queue for entries that we could potentially
- * merge with. Currently includes a hand-wavy stop count of 8, to not spend
- * too much time checking for merges.
+ * Iterate list of requests and see if we can merge this bio with any
+ * of them.
  */
-static bool blk_mq_attempt_merge(struct request_queue *q,
-				 struct blk_mq_ctx *ctx, struct bio *bio)
+bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
+			   struct bio *bio)
 {
 	struct request *rq;
 	int checked = 8;
 
-	lockdep_assert_held(&ctx->lock);
-
-	list_for_each_entry_reverse(rq, &ctx->rq_list, queuelist) {
+	list_for_each_entry_reverse(rq, list, queuelist) {
 		bool merged = false;
 
 		if (!checked--)
@@ -305,13 +302,30 @@  static bool blk_mq_attempt_merge(struct request_queue *q,
 			continue;
 		}
 
-		if (merged)
-			ctx->rq_merged++;
 		return merged;
 	}
 
 	return false;
 }
+EXPORT_SYMBOL_GPL(blk_mq_bio_list_merge);
+
+/*
+ * Reverse check our software queue for entries that we could potentially
+ * merge with. Currently includes a hand-wavy stop count of 8, to not spend
+ * too much time checking for merges.
+ */
+static bool blk_mq_attempt_merge(struct request_queue *q,
+				 struct blk_mq_ctx *ctx, struct bio *bio)
+{
+	lockdep_assert_held(&ctx->lock);
+
+	if (blk_mq_bio_list_merge(q, &ctx->rq_list, bio)) {
+		ctx->rq_merged++;
+		return true;
+	}
+
+	return false;
+}
 
 bool __blk_mq_sched_bio_merge(struct request_queue *q, struct bio *bio)
 {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index ebc34a5686dc..fb355173f3c7 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -259,7 +259,8 @@  void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 void blk_mq_kick_requeue_list(struct request_queue *q);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
 void blk_mq_complete_request(struct request *rq);
-
+bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
+			   struct bio *bio);
 bool blk_mq_queue_stopped(struct request_queue *q);
 void blk_mq_stop_hw_queue(struct blk_mq_hw_ctx *hctx);
 void blk_mq_start_hw_queue(struct blk_mq_hw_ctx *hctx);
-- 
2.7.4
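
As a usage note, below is a rough sketch of how an IO scheduler's ->bio_merge
hook could drive the exported blk_mq_bio_list_merge() helper against a
per-domain request list. Only the prep patch is attached here, so the
kyber_ctx_queue layout, the khd->kcqs mapping and the use of
kyber_sched_domain() are assumptions for illustration, not the actual kyber
change:

/*
 * Illustrative only: scan just the software-queue list of the scheduling
 * domain this bio maps to, using the helper exported by the prep patch.
 */
static bool kyber_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio)
{
	struct kyber_hctx_data *khd = hctx->sched_data;
	struct blk_mq_ctx *ctx = blk_mq_get_ctx(hctx->queue);
	/* Assumed per-ctx, per-domain queue; see the sketch earlier in the thread. */
	struct kyber_ctx_queue *kcq = &khd->kcqs[ctx->index_hw];
	unsigned int sched_domain = kyber_sched_domain(bio->bi_opf);
	bool merged;

	/* Only requests of the same domain can be merge candidates. */
	spin_lock(&kcq->lock);
	merged = blk_mq_bio_list_merge(hctx->queue, &kcq->rq_list[sched_domain], bio);
	spin_unlock(&kcq->lock);
	blk_mq_put_ctx(ctx);

	return merged;
}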