[v1] block, bfq: set default slice_idle to zero for SSDs

Message ID: 1573656800-14815-1-git-send-email-ppvk@codeaurora.org (mailing list archive)
State: New, archived
Series: [v1] block, bfq: set default slice_idle to zero for SSDs

Commit Message

Pradeep P V K Nov. 13, 2019, 2:53 p.m. UTC
With the default slice idle time of 8 ms, we have seen a few
time-bounded applications (sensors) on the v4.19 kernel getting timed
out during multimedia tests (audio and video playback, etc.), leading
to reboots and crashes. The timeout configured for these applications
(sensors) is 20 seconds.

In crash dumps, we have seen a few synchronous requests from sensors
and other applications sitting in their bfq_queues for more than
12-20 seconds.

Idling, in anticipation of future nearby I/O requests and while
waiting for completion of already submitted requests, delays choosing
the next bfq_queue and its scheduling, thereby affecting some
time-bounded applications.

After setting slice_idle to zero, we did not see any crashes during
our 72 hours of testing, and it also increased I/O throughput.

The following fio benchmark results were taken on a local SSD:

Random read results on the v4.19 kernel:

 Idling   iops    avg-lat(us)    stddev       bw
 ----------------------------------------------------
 On       4136    1189.07        17221.65    16.9MB/s
 Off      7246     670.11        1054.76     29.7MB/s

    fio --name=temp --size=5G --time_based --ioengine=sync \
	--randrepeat=0 --direct=1 --invalidate=1 --verify=0 \
	--verify_fatal=0 --rw=randread --blocksize=4k \
	--group_reporting=1 --directory=/data --runtime=10 \
	--iodepth=64 --numjobs=5
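
Since the change keys the new default off the non-rotational queue
flag, it may help to first confirm how the kernel classifies the test
device (a minimal sketch; sdX below is only a placeholder):

    cat /sys/block/sdX/queue/rotational   # 0 = non-rotational (SSD), 1 = rotational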

The following code changes were made based on [1], [2] and [3].

[1] https://lkml.org/lkml/2018/11/1/1285
[2] Commit 41c0126b3f22 ("block: Make CFQ default to IOPS mode on
    SSDs")
[3] Commit 0bb979472a74 ("cfq-iosched: fix the setting of IOPS mode on
    SSDs")

Signed-off-by: Pradeep P V K <ppvk@codeaurora.org>
---
 Documentation/block/bfq-iosched.rst |  7 ++++---
 block/bfq-iosched.c                 | 13 +++++++++++++
 block/elevator.c                    |  2 ++
 include/linux/elevator.h            |  1 +
 4 files changed, 20 insertions(+), 3 deletions(-)

Comments

Paolo Valente Nov. 18, 2019, 2:18 p.m. UTC | #1
> Il giorno 13 nov 2019, alle ore 15:53, Pradeep P V K <ppvk@codeaurora.org> ha scritto:
> 
> With the default slice idle time of 8 ms, we have seen a few
> time-bounded applications (sensors) on the v4.19 kernel getting timed
> out during multimedia tests (audio and video playback, etc.), leading
> to reboots and crashes. The timeout configured for these applications
> (sensors) is 20 seconds.
> 
> In crash dumps, we have seen a few synchronous requests from sensors
> and other applications sitting in their bfq_queues for more than
> 12-20 seconds.
> 
> Idling, in anticipation of future nearby I/O requests and while
> waiting for completion of already submitted requests, delays choosing
> the next bfq_queue and its scheduling, thereby affecting some
> time-bounded applications.
> 
> After setting slice_idle to zero,

As written in the comments that your patch modifies, idling is
essential for controlling I/O. So, unfortunately, your change is
unacceptable.

I would recommend that you analyze carefully why this anomaly occurs
with a non-zero slice_idle. I'd be willing to help with that.

Alternatively, if you don't want to waste your time finding out the
cause of this problem, then just set slice_idle to 0 manually for your
application.
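
For reference, a minimal sketch of doing that per device via sysfs
(sdX is only a placeholder, and BFQ must be the active scheduler;
slice_idle is in milliseconds, so 0 disables idling):

    cat /sys/block/sdX/queue/scheduler               # confirm [bfq] is selected
    cat /sys/block/sdX/queue/iosched/slice_idle      # current value (default 8)
    echo 0 > /sys/block/sdX/queue/iosched/slice_idle # disable idling

This keeps the tuning local to one device and boot, instead of
changing the scheduler's default for everyone.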

> we did not see any crashes during
> our 72 hours of testing, and it also increased I/O throughput.
> 
> The following fio benchmark results were taken on a local SSD:
> 
> Random read results on the v4.19 kernel:
> 
> Idling   iops    avg-lat(us)    stddev       bw
> ----------------------------------------------------
> On       4136    1189.07        17221.65    16.9MB/s
> Off      7246     670.11        1054.76     29.7MB/s
> 

This is anomalous too.  Probably the kernel you are using lacks
commits made after 4.19 (around one hundred commits IIRC).

Thanks,
Paolo
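
One way to gauge what a v4.19-based kernel might be missing is to
count the BFQ changes in a mainline tree since that release (a rough
sketch, assuming a mainline git checkout):

    git log --oneline v4.19.. -- block/bfq-iosched.c | wc -l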

Patch

diff --git a/Documentation/block/bfq-iosched.rst b/Documentation/block/bfq-iosched.rst
index 0d237d4..244f4ca 100644
--- a/Documentation/block/bfq-iosched.rst
+++ b/Documentation/block/bfq-iosched.rst
@@ -329,9 +329,10 @@  slice_idle
 
 This parameter specifies how long BFQ should idle for next I/O
 request, when certain sync BFQ queues become empty. By default
-slice_idle is a non-zero value. Idling has a double purpose: boosting
-throughput and making sure that the desired throughput distribution is
-respected (see the description of how BFQ works, and, if needed, the
+slice_idle is a non-zero value for rotational devices.
+Idling has a double purpose: boosting throughput and making
+sure that the desired throughput distribution is respected
+(see the description of how BFQ works, and, if needed, the
 papers referred there).
 
 As for throughput, idling can be very helpful on highly seeky media
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0319d63..9c994d1 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6514,6 +6514,18 @@  static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	return -ENOMEM;
 }
 
+static void bfq_registered_queue(struct request_queue *q)
+{
+	struct elevator_queue *e = q->elevator;
+	struct bfq_data *bfqd = e->elevator_data;
+
+	/*
+	 * Default to IOPS mode with no idling for SSDs
+	 */
+	if (blk_queue_nonrot(q))
+		bfqd->bfq_slice_idle = 0;
+}
+
 static void bfq_slab_kill(void)
 {
 	kmem_cache_destroy(bfq_pool);
@@ -6761,6 +6773,7 @@  static ssize_t bfq_low_latency_store(struct elevator_queue *e,
 		.init_hctx		= bfq_init_hctx,
 		.init_sched		= bfq_init_queue,
 		.exit_sched		= bfq_exit_queue,
+		.elevator_registered_fn = bfq_registered_queue,
 	},
 
 	.icq_size =		sizeof(struct bfq_io_cq),
diff --git a/block/elevator.c b/block/elevator.c
index 076ba73..b882d25 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -504,6 +504,8 @@  int elv_register_queue(struct request_queue *q, bool uevent)
 			kobject_uevent(&e->kobj, KOBJ_ADD);
 
 		e->registered = 1;
+		if (e->type->ops.elevator_registered_fn)
+			e->type->ops.elevator_registered_fn(q);
 	}
 	return error;
 }
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 901bda3..23dcc35 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -50,6 +50,7 @@  struct elevator_mq_ops {
 	struct request *(*next_request)(struct request_queue *, struct request *);
 	void (*init_icq)(struct io_cq *);
 	void (*exit_icq)(struct io_cq *);
+	void (*elevator_registered_fn)(struct request_queue *q);
 };
 
 #define ELV_NAME_MAX	(16)
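
With this patch applied, the new default can be checked on a
non-rotational device roughly as follows (sdX is only a placeholder;
bfq must be built in or loaded):

    cat /sys/block/sdX/queue/rotational          # expect 0 for an SSD
    echo bfq > /sys/block/sdX/queue/scheduler    # select BFQ for the device
    cat /sys/block/sdX/queue/iosched/slice_idle  # expect 0 with this patch, 8 without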