Message ID | 20230804154821.3232094-2-bvanassche@acm.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Improve performance for zoned UFS devices | expand |
On 8/4/23 9:47?AM, Bart Van Assche wrote: > Writes in sequential write required zones must happen at the write > pointer. Even if the submitter of the write commands (e.g. a filesystem) > submits writes for sequential write required zones in order, the block > layer or the storage controller may reorder these write commands. > > The zone locking mechanism in the mq-deadline I/O scheduler serializes > write commands for sequential zones. Some but not all storage controllers > require this serialization. Introduce a new request queue flag to allow > block drivers to indicate that they preserve the order of write commands > and thus do not require serialization of writes per zone. Looking at how this is used, why not call it QUEUE_FLAG_ZONE_WRITE_LOCK instead? That'd make the code easier to immediately grok, rather than deal with double negations.
On 8/8/23 14:19, Jens Axboe wrote: > On 8/4/23 9:47?AM, Bart Van Assche wrote: >> Writes in sequential write required zones must happen at the write >> pointer. Even if the submitter of the write commands (e.g. a filesystem) >> submits writes for sequential write required zones in order, the block >> layer or the storage controller may reorder these write commands. >> >> The zone locking mechanism in the mq-deadline I/O scheduler serializes >> write commands for sequential zones. Some but not all storage controllers >> require this serialization. Introduce a new request queue flag to allow >> block drivers to indicate that they preserve the order of write commands >> and thus do not require serialization of writes per zone. > > Looking at how this is used, why not call it QUEUE_FLAG_ZONE_WRITE_LOCK > instead? That'd make the code easier to immediately grok, rather than > deal with double negations. Hi Jens, Do I understand correctly that you want me to set the QUEUE_FLAG_ZONE_WRITE_LOCK flag for all request queues by adding it to QUEUE_FLAG_MQ_DEFAULT and also that the UFS driver should clear the QUEUE_FLAG_ZONE_WRITE_LOCK flag? Thanks, Bart.
On 8/8/23 3:46?PM, Bart Van Assche wrote: > On 8/8/23 14:19, Jens Axboe wrote: >> On 8/4/23 9:47?AM, Bart Van Assche wrote: >>> Writes in sequential write required zones must happen at the write >>> pointer. Even if the submitter of the write commands (e.g. a filesystem) >>> submits writes for sequential write required zones in order, the block >>> layer or the storage controller may reorder these write commands. >>> >>> The zone locking mechanism in the mq-deadline I/O scheduler serializes >>> write commands for sequential zones. Some but not all storage controllers >>> require this serialization. Introduce a new request queue flag to allow >>> block drivers to indicate that they preserve the order of write commands >>> and thus do not require serialization of writes per zone. >> >> Looking at how this is used, why not call it QUEUE_FLAG_ZONE_WRITE_LOCK >> instead? That'd make the code easier to immediately grok, rather than >> deal with double negations. > > Hi Jens, > > Do I understand correctly that you want me to set the > QUEUE_FLAG_ZONE_WRITE_LOCK flag for all request queues by adding it to > QUEUE_FLAG_MQ_DEFAULT and also that the UFS driver should clear the > QUEUE_FLAG_ZONE_WRITE_LOCK flag? I don't think setting that flag by default makes a lot of sense, if the device in question isn't zoned. Maybe have variants of BLK_ZONED_* which has a locked and unlocked variant for each where it applies? Perhaps have the lock flag be common between them so you can check them in the same way? That'd keep the fact that it's zoned and if it needs locking in the same spot, rather than scatter them in two spots.
On 8/8/23 15:27, Jens Axboe wrote: > I don't think setting that flag by default makes a lot of sense, if the > device in question isn't zoned. Maybe have variants of BLK_ZONED_* which > has a locked and unlocked variant for each where it applies? Perhaps > have the lock flag be common between them so you can check them in the > same way? That'd keep the fact that it's zoned and if it needs locking > in the same spot, rather than scatter them in two spots. Hi Jens, That's an interesting suggestion but there is a complication: the zone type is set by different code than the code that decides whether or not locking is required. For SCSI devices the zone type is set from inside drivers/scsi/sd.c or drivers/scsi/sd_zbc.c while the locking requirements come from the SCSI LLD (drivers/ufs/core/ufshcd.c). Additionally, the code for converting between model type and model string in scsi_debug.c would look weird if BLK_ZONED_* is split into locked and unlocked variants. See also the zbc_model_strs_*[] arrays and the sdeb_zbc_model_str() function. It should be possible to move the flag that indicates whether or not zone locking is required into struct queue_limits next to "enum blk_zoned_model zoned;". If nobody objects I will select this alternative. Thanks, Bart.
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 2f5371b8482c..b6eb988f81f3 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -534,6 +534,12 @@ struct request_queue { #define QUEUE_FLAG_NONROT 6 /* non-rotational device (SSD) */ #define QUEUE_FLAG_VIRT QUEUE_FLAG_NONROT /* paravirt device */ #define QUEUE_FLAG_IO_STAT 7 /* do disk/partitions IO accounting */ +/* + * The device supports not using the zone write locking mechanism to serialize + * write operations (REQ_OP_WRITE, REQ_OP_WRITE_ZEROES) issued to a sequential + * write required zone (BLK_ZONE_TYPE_SEQWRITE_REQ). + */ +#define QUEUE_FLAG_NO_ZONE_WRITE_LOCK 8 #define QUEUE_FLAG_NOXMERGES 9 /* No extended merges */ #define QUEUE_FLAG_ADD_RANDOM 10 /* Contributes to random pool */ #define QUEUE_FLAG_SYNCHRONOUS 11 /* always completes in submit context */ @@ -597,6 +603,11 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q); #define blk_queue_skip_tagset_quiesce(q) \ test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags) +static inline bool blk_queue_no_zone_write_lock(struct request_queue *q) +{ + return test_bit(QUEUE_FLAG_NO_ZONE_WRITE_LOCK, &q->queue_flags); +} + extern void blk_set_pm_only(struct request_queue *q); extern void blk_clear_pm_only(struct request_queue *q);