diff mbox series

[v2,1/5] block: Introduce a request queue flag for pipelining zoned writes

Message ID 20230710180210.1582299-2-bvanassche@acm.org (mailing list archive)
State New, archived
Series Enable zoned write pipelining for UFS devices

Commit Message

Bart Van Assche July 10, 2023, 6:01 p.m. UTC
Writes in sequential write required zones must happen at the write
pointer. Even if the submitter of the write commands (e.g. a filesystem)
submits writes for sequential write required zones in order, the block
layer or the storage controller may reorder these write commands.

The zone locking mechanism in the mq-deadline I/O scheduler serializes
write commands for sequential zones. Some but not all storage controllers
require this serialization. Introduce a new flag such that block drivers
can request pipelining of writes for sequential write required zones.

An example of a storage controller standard that requires write
serialization is AHCI (Advanced Host Controller Interface). Submitting
commands to AHCI controllers happens by writing a bit pattern into a
register. Each set bit corresponds to an active command. This mechanism
does not preserve command ordering information.

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/blkdev.h | 7 +++++++
 1 file changed, 7 insertions(+)
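
The AHCI argument in the commit message can be illustrated with a small user-space model. This is only a sketch, not AHCI driver code: `model_issue()` and `model_orders_indistinguishable()` are hypothetical names, and the 32-slot register width is an assumption made for the illustration. It shows that when commands are issued by setting bits in a register, two different submission orders leave the register in the same state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space model, not AHCI driver code: each command slot is
 * one bit in a 32-bit issue register, as the commit message describes.
 */
#define MODEL_NR_SLOTS 32u

/* Set the bit for every slot in slots[], in the order given. */
static uint32_t model_issue(const unsigned int *slots, size_t n)
{
	uint32_t reg = 0;

	for (size_t i = 0; i < n; i++) {
		assert(slots[i] < MODEL_NR_SLOTS);
		reg |= UINT32_C(1) << slots[i];
	}
	return reg;
}

/* Return 1 if two submission orders leave the register in the same state. */
static int model_orders_indistinguishable(const unsigned int *a,
					  const unsigned int *b, size_t n)
{
	return model_issue(a, n) == model_issue(b, n);
}
```

Issuing slots 1 then 3, or 3 then 1, yields the same register value in this model, so the register contents alone cannot tell the controller which write was submitted first.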

Comments

Damien Le Moal July 18, 2023, 6:34 a.m. UTC | #1
On 7/11/23 03:01, Bart Van Assche wrote:
> Writes in sequential write required zones must happen at the write
> pointer. Even if the submitter of the write commands (e.g. a filesystem)
> submits writes for sequential write required zones in order, the block
> layer or the storage controller may reorder these write commands.
> 
> The zone locking mechanism in the mq-deadline I/O scheduler serializes
> write commands for sequential zones. Some but not all storage controllers
> require this serialization. Introduce a new flag such that block drivers
> can request pipelining of writes for sequential write required zones.
> 
> An example of a storage controller standard that requires write
> serialization is AHCI (Advanced Host Controller Interface). Submitting
> commands to AHCI controllers happens by writing a bit pattern into a
> register. Each set bit corresponds to an active command. This mechanism
> does not preserve command ordering information.
> 
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  include/linux/blkdev.h | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index ed44a997f629..805012c5a6ab 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -534,6 +534,8 @@ struct request_queue {
>  #define QUEUE_FLAG_NONROT	6	/* non-rotational device (SSD) */
>  #define QUEUE_FLAG_VIRT		QUEUE_FLAG_NONROT /* paravirt device */
>  #define QUEUE_FLAG_IO_STAT	7	/* do disk/partitions IO accounting */
> +/* Writes for sequential write required zones may be pipelined. */
> +#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 8

I am not a big fan of this name as "pipeline" does not necessarily imply "high
queue depth write". What about simply calling this
QUEUE_FLAG_NO_ZONE_WRITE_LOCK, indicating that there is no need to write-lock
zones ?

>  #define QUEUE_FLAG_NOXMERGES	9	/* No extended merges */
>  #define QUEUE_FLAG_ADD_RANDOM	10	/* Contributes to random pool */
>  #define QUEUE_FLAG_SYNCHRONOUS	11	/* always completes in submit context */
> @@ -596,6 +598,11 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
>  #define blk_queue_skip_tagset_quiesce(q) \
>  	test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags)
>  
> +static inline bool blk_queue_pipeline_zoned_writes(struct request_queue *q)
> +{
> +	return test_bit(QUEUE_FLAG_PIPELINE_ZONED_WRITES, &q->queue_flags);
> +}
> +
>  extern void blk_set_pm_only(struct request_queue *q);
>  extern void blk_clear_pm_only(struct request_queue *q);
>
Bart Van Assche July 18, 2023, 10:37 p.m. UTC | #2
On 7/17/23 23:34, Damien Le Moal wrote:
> On 7/11/23 03:01, Bart Van Assche wrote:
>> +/* Writes for sequential write required zones may be pipelined. */
>> +#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 8
> 
> I am not a big fan of this name as "pipeline" does not necessarily imply "high
> queue depth write". What about simply calling this
> QUEUE_FLAG_NO_ZONE_WRITE_LOCK, indicating that there is no need to write-lock
> zones ?

Hi Damien,

I'm not a big fan of names with negative words like "no" or "not" embedded.
Isn't pipelining standard computer science terminology? See also
https://en.wikipedia.org/wiki/Pipeline_(computing).

Thanks,

Bart.
Damien Le Moal July 19, 2023, 9:58 a.m. UTC | #3
On 7/19/23 07:37, Bart Van Assche wrote:
> On 7/17/23 23:34, Damien Le Moal wrote:
>> On 7/11/23 03:01, Bart Van Assche wrote:
>>> +/* Writes for sequential write required zones may be pipelined. */
>>> +#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 8
>>
>> I am not a big fan of this name as "pipeline" does not necessarily imply "high
>> queue depth write". What about simply calling this
>> QUEUE_FLAG_NO_ZONE_WRITE_LOCK, indicating that there is no need to write-lock
>> zones ?
> 
> Hi Damien,
> 
> I'm not a big fan of names with negative words like "no" or "not" embedded.
> Isn't pipelining standard computer science terminology? See also
> https://en.wikipedia.org/wiki/Pipeline_(computing).

Sure, pipeline is a well known term. But I do not think it is synonymous
with "high queue depth write" :) A "pipeline" for zoned write may still
operate at write qd=1 per zone...

Given that the default is using zone write locking, I would prefer a
flag name that indicates a change to this default. What about something
like QUEUE_FLAG_UNRESTRICTED_ZONE_WRITE ?

But that is beside the point as I still have reservations on this
approach anyway. See my reply to patch 4.

Patch

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ed44a997f629..805012c5a6ab 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -534,6 +534,8 @@  struct request_queue {
 #define QUEUE_FLAG_NONROT	6	/* non-rotational device (SSD) */
 #define QUEUE_FLAG_VIRT		QUEUE_FLAG_NONROT /* paravirt device */
 #define QUEUE_FLAG_IO_STAT	7	/* do disk/partitions IO accounting */
+/* Writes for sequential write required zones may be pipelined. */
+#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 8
 #define QUEUE_FLAG_NOXMERGES	9	/* No extended merges */
 #define QUEUE_FLAG_ADD_RANDOM	10	/* Contributes to random pool */
 #define QUEUE_FLAG_SYNCHRONOUS	11	/* always completes in submit context */
@@ -596,6 +598,11 @@  bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_skip_tagset_quiesce(q) \
 	test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags)
 
+static inline bool blk_queue_pipeline_zoned_writes(struct request_queue *q)
+{
+	return test_bit(QUEUE_FLAG_PIPELINE_ZONED_WRITES, &q->queue_flags);
+}
+
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
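
For readers who want to experiment outside the kernel, the flag and its helper can be modelled in plain C. This is a sketch under stated assumptions: `struct model_request_queue` and the `model_*` functions are hypothetical stand-ins for `struct request_queue`, `test_bit()` and `blk_queue_flag_set()`, and they are not atomic. The consumer, `model_needs_zone_write_lock()`, is a guess at how an I/O scheduler might use the flag; this patch only adds the flag and the test helper.

```c
#include <stdbool.h>

/* Bit number taken from the patch; everything else below is a stand-in. */
#define QUEUE_FLAG_PIPELINE_ZONED_WRITES 8

/* Hypothetical stand-in for the kernel's struct request_queue. */
struct model_request_queue {
	unsigned long queue_flags;
};

/* Plain-C replacement for the kernel's test_bit() on a single word. */
static bool model_test_bit(unsigned int nr, const unsigned long *word)
{
	return (*word >> nr) & 1UL;
}

/* Non-atomic stand-in for what a driver would do with blk_queue_flag_set(). */
static void model_set_flag(unsigned int nr, struct model_request_queue *q)
{
	q->queue_flags |= 1UL << nr;
}

/* Mirrors blk_queue_pipeline_zoned_writes() from the patch. */
static bool model_pipeline_zoned_writes(const struct model_request_queue *q)
{
	return model_test_bit(QUEUE_FLAG_PIPELINE_ZONED_WRITES, &q->queue_flags);
}

/*
 * Assumption: a scheduler keeps zone write locking (the default) unless the
 * driver has set the flag. This patch does not show that consumer.
 */
static bool model_needs_zone_write_lock(const struct model_request_queue *q)
{
	return !model_pipeline_zoned_writes(q);
}
```

In this model a freshly zeroed queue keeps zone write locking, and only a driver that explicitly sets bit 8 opts out of it, which matches the default that the review comments refer to.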