diff mbox series

[v2,3/6] block: Introduce a request queue flag for pipelining zoned writes

Message ID 20220623232603.3751912-4-bvanassche@acm.org (mailing list archive)
State New, archived
Headers show
Series Improve zoned storage write performance | expand

Commit Message

Bart Van Assche June 23, 2022, 11:26 p.m. UTC
Writes in sequential write required zones must happen at the write
pointer. Even if the submitter of the write commands (e.g. a filesystem)
submits writes for sequential write required zones in order, the block
layer or the storage controller may reorder these write commands.

The zone locking mechanism in the mq-deadline I/O scheduler serializes
write commands for sequential zones. Some but not all storage controllers
require this serialization. Introduce a new flag such that block drivers
can request pipelining of writes for sequential write required zones.

An example of a storage controller standard that requires write
serialization is AHCI (Advanced Host Controller Interface). Submitting
commands to AHCI controllers happens by writing a bit pattern into a
register. Each set bit corresponds to an active command. This mechanism
does not preserve command ordering information.

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/blkdev.h | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Damien Le Moal June 24, 2022, 12:19 a.m. UTC | #1
On 6/24/22 08:26, Bart Van Assche wrote:
> Writes in sequential write required zones must happen at the write
> pointer. Even if the submitter of the write commands (e.g. a filesystem)
> submits writes for sequential write required zones in order, the block
> layer or the storage controller may reorder these write commands.
> 
> The zone locking mechanism in the mq-deadline I/O scheduler serializes
> write commands for sequential zones. Some but not all storage controllers
> require this serialization. Introduce a new flag such that block drivers
> can request pipelining of writes for sequential write required zones.
> 
> An example of a storage controller standard that requires write
> serialization is AHCI (Advanced Host Controller Interface). Submitting
> commands to AHCI controllers happens by writing a bit pattern into a
> register. Each set bit corresponds to an active command. This mechanism
> does not preserve command ordering information.
> 
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  include/linux/blkdev.h | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 2904100d2485..fcaa06b9c65a 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -581,6 +581,8 @@ struct request_queue {
>  #define QUEUE_FLAG_HCTX_ACTIVE	28	/* at least one blk-mq hctx is active */
>  #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
>  #define QUEUE_FLAG_SQ_SCHED     30	/* single queue style io dispatch */
> +/* Writes for sequential write required zones may be pipelined. */
> +#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 31
>  
>  #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
>  				 (1 << QUEUE_FLAG_SAME_COMP) |		\
> @@ -624,6 +626,11 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
>  #define blk_queue_nowait(q)	test_bit(QUEUE_FLAG_NOWAIT, &(q)->queue_flags)
>  #define blk_queue_sq_sched(q)	test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
>  
> +static inline bool blk_queue_pipeline_zoned_writes(struct request_queue *q)
> +{
> +	return test_bit(QUEUE_FLAG_PIPELINE_ZONED_WRITES, &(q)->queue_flags);
> +}
> +

Since this flag will be set by an LLD to indicate that the LLD can handle
zoned write commands in order, I would suggest a different name. Something
like: QUEUE_FLAG_ORDERED_ZONED_WRITES ? And well, if the LLD says it can
do that for zoned writes, it likely means that it would be the same for
any command, so the flag could be generalized and named
QUEUE_FLAG_ORDERED_CMD or something like that.

>  extern void blk_set_pm_only(struct request_queue *q);
>  extern void blk_clear_pm_only(struct request_queue *q);
>
Bart Van Assche June 24, 2022, 4:29 p.m. UTC | #2
On 6/23/22 17:19, Damien Le Moal wrote:
> On 6/24/22 08:26, Bart Van Assche wrote:
>> +static inline bool blk_queue_pipeline_zoned_writes(struct request_queue *q)
>> +{
>> +	return test_bit(QUEUE_FLAG_PIPELINE_ZONED_WRITES, &(q)->queue_flags);
>> +}
>> +
> 
> Since this flag will be set by an LLD to indicate that the LLD can handle
> zoned write commands in order, I would suggest a different name. Something
> like: QUEUE_FLAG_ORDERED_ZONED_WRITES ? And well, if the LLD says it can
> do that for zoned writes, it likely means that it would be the same for
> any command, so the flag could be generalized and named
> QUEUE_FLAG_ORDERED_CMD or something like that.

Zoned writes should always be submitted in order by the software that 
generates the zoned writes so I think the name 
QUEUE_FLAG_ORDERED_ZONED_WRITES would cause confusion. I'm concerned it 
would make other kernel developers wonder whether it is OK for e.g. 
filesystems to submit zoned writes out of order, which is not the case. 
I think the word "pipelining" has a well-established meaning in computer 
science and also that in this context it reflects the intent correctly. 
See also https://en.wikipedia.org/wiki/Pipeline_(computing).

Thanks,

Bart.
diff mbox series

Patch

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2904100d2485..fcaa06b9c65a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -581,6 +581,8 @@  struct request_queue {
 #define QUEUE_FLAG_HCTX_ACTIVE	28	/* at least one blk-mq hctx is active */
 #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
 #define QUEUE_FLAG_SQ_SCHED     30	/* single queue style io dispatch */
+/* Writes for sequential write required zones may be pipelined. */
+#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 31
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP) |		\
@@ -624,6 +626,11 @@  bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_nowait(q)	test_bit(QUEUE_FLAG_NOWAIT, &(q)->queue_flags)
 #define blk_queue_sq_sched(q)	test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
 
+static inline bool blk_queue_pipeline_zoned_writes(struct request_queue *q)
+{
+	return test_bit(QUEUE_FLAG_PIPELINE_ZONED_WRITES, &(q)->queue_flags);
+}
+
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);