Message ID | 20231023215638.3405959-11-bvanassche@acm.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Improve write performance for zoned UFS devices​ | expand |
On 10/24/23 06:54, Bart Van Assche wrote: > If zoned writes (REQ_OP_WRITE) for a sequential write required zone have > a starting LBA that differs from the write pointer, e.g. because zoned > writes have been reordered, then the storage device will respond with an > UNALIGNED WRITE COMMAND error. Send commands that failed with an > unaligned write error to the SCSI error handler if zone write locking is > disabled. The SCSI error handler will sort SCSI commands per LBA before > resubmitting these. > > If zone write locking is disabled, increase the number of retries for > write commands sent to a sequential zone to the maximum number of > outstanding commands because in the worst case the number of times > reordered zoned writes have to be retried is (number of outstanding > writes per sequential zone) - 1. > > Reviewed-by: Damien Le Moal <dlemoal@kernel.org> > Cc: Martin K. Petersen <martin.petersen@oracle.com> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Ming Lei <ming.lei@redhat.com> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/scsi/scsi_error.c | 16 ++++++++++++++++ > drivers/scsi/scsi_lib.c | 1 + > drivers/scsi/sd.c | 6 ++++++ > include/scsi/scsi.h | 1 + > 4 files changed, 24 insertions(+) > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index 8b1eb637ffa8..9a54856fa03b 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -699,6 +699,22 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd) > fallthrough; > > case ILLEGAL_REQUEST: > + /* > + * Unaligned write command. This may indicate that zoned writes > + * have been received by the device in the wrong order. If zone > + * write locking is disabled, retry after all pending commands > + * have completed. > + */ > + if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 && > + !req->q->limits.use_zone_write_lock && > + blk_rq_is_seq_zoned_write(req) && > + scmd->retries <= scmd->allowed) { > + sdev_printk(KERN_INFO, scmd->device, > + "Retrying unaligned write at LBA %#llx.\n", > + scsi_get_lba(scmd)); KERN_INFO ? Did you perhaps mean KERN_DEBUG ? An info message for this will be way too noisy. > + return NEEDS_DELAYED_RETRY; > + } > + > if (sshdr.asc == 0x20 || /* Invalid command operation code */ > sshdr.asc == 0x21 || /* Logical block address out of range */ > sshdr.asc == 0x22 || /* Invalid function */ > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index c2f647a7c1b0..33a34693c8a2 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -1443,6 +1443,7 @@ static void scsi_complete(struct request *rq) > case ADD_TO_MLQUEUE: > scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY); > break; > + case NEEDS_DELAYED_RETRY: > default: > scsi_eh_scmd_add(cmd); > break; > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index 82abc721b543..4e6b77f5854f 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -1197,6 +1197,12 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd) > cmd->transfersize = sdp->sector_size; > cmd->underflow = nr_blocks << 9; > cmd->allowed = sdkp->max_retries; > + /* > + * Increase the number of allowed retries for zoned writes if zone > + * write locking is disabled. > + */ > + if (!rq->q->limits.use_zone_write_lock && blk_rq_is_seq_zoned_write(rq)) > + cmd->allowed += rq->q->nr_requests; > cmd->sdb.length = nr_blocks * sdp->sector_size; > > SCSI_LOG_HLQUEUE(1, > diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h > index ec093594ba53..6600db046227 100644 > --- a/include/scsi/scsi.h > +++ b/include/scsi/scsi.h > @@ -93,6 +93,7 @@ static inline int scsi_status_is_check_condition(int status) > * Internal return values. > */ > enum scsi_disposition { > + NEEDS_DELAYED_RETRY = 0x2000, > NEEDS_RETRY = 0x2001, > SUCCESS = 0x2002, > FAILED = 0x2003,
On 10/23/23 17:13, Damien Le Moal wrote: > On 10/24/23 06:54, Bart Van Assche wrote: >> case ILLEGAL_REQUEST: >> + /* >> + * Unaligned write command. This may indicate that zoned writes >> + * have been received by the device in the wrong order. If zone >> + * write locking is disabled, retry after all pending commands >> + * have completed. >> + */ >> + if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 && >> + !req->q->limits.use_zone_write_lock && >> + blk_rq_is_seq_zoned_write(req) && >> + scmd->retries <= scmd->allowed) { >> + sdev_printk(KERN_INFO, scmd->device, >> + "Retrying unaligned write at LBA %#llx.\n", >> + scsi_get_lba(scmd)); > > KERN_INFO ? Did you perhaps mean KERN_DEBUG ? An info message for this will be > way too noisy. Hi Damien, Are you sure that KERN_INFO will be too noisy? On our test setups we see this message less than once a day. Anyway, I will change the severity level. Bart.
On 10/25/23 02:22, Bart Van Assche wrote: > On 10/23/23 17:13, Damien Le Moal wrote: >> On 10/24/23 06:54, Bart Van Assche wrote: >>> case ILLEGAL_REQUEST: >>> + /* >>> + * Unaligned write command. This may indicate that zoned writes >>> + * have been received by the device in the wrong order. If zone >>> + * write locking is disabled, retry after all pending commands >>> + * have completed. >>> + */ >>> + if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 && >>> + !req->q->limits.use_zone_write_lock && >>> + blk_rq_is_seq_zoned_write(req) && >>> + scmd->retries <= scmd->allowed) { >>> + sdev_printk(KERN_INFO, scmd->device, >>> + "Retrying unaligned write at LBA %#llx.\n", >>> + scsi_get_lba(scmd)); >> >> KERN_INFO ? Did you perhaps mean KERN_DEBUG ? An info message for this will be >> way too noisy. > > Hi Damien, > > Are you sure that KERN_INFO will be too noisy? On our test setups we see > this message less than once a day. Anyway, I will change the severity level. I am not sure. But better safe than sorry :) So given that we should not scare the user with errors that are not errors (as the next tries will succeed), we should be silent and log a message only if the retry count is exhausted and we still see a failure. > > Bart. >
On 10/25/23 00:25, Damien Le Moal wrote: > On 10/25/23 02:22, Bart Van Assche wrote: >> On 10/23/23 17:13, Damien Le Moal wrote: >>> On 10/24/23 06:54, Bart Van Assche wrote: >>>> case ILLEGAL_REQUEST: >>>> + /* >>>> + * Unaligned write command. This may indicate that zoned writes >>>> + * have been received by the device in the wrong order. If zone >>>> + * write locking is disabled, retry after all pending commands >>>> + * have completed. >>>> + */ >>>> + if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 && >>>> + !req->q->limits.use_zone_write_lock && >>>> + blk_rq_is_seq_zoned_write(req) && >>>> + scmd->retries <= scmd->allowed) { >>>> + sdev_printk(KERN_INFO, scmd->device, >>>> + "Retrying unaligned write at LBA %#llx.\n", >>>> + scsi_get_lba(scmd)); >>> >>> KERN_INFO ? Did you perhaps mean KERN_DEBUG ? An info message for this will be >>> way too noisy. >> >> Hi Damien, >> >> Are you sure that KERN_INFO will be too noisy? On our test setups we see >> this message less than once a day. Anyway, I will change the severity level. > > I am not sure. But better safe than sorry :) > > So given that we should not scare the user with errors that are not errors (as > the next tries will succeed), we should be silent and log a message only if the > retry count is exhausted and we still see a failure. Hi Damien, I will wrap SCSI_LOG_ERROR_RECOVERY(1, ...) around the above sdev_printk() call. As you probably know SCSI logging is disabled by default. Thanks, Bart.
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 8b1eb637ffa8..9a54856fa03b 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -699,6 +699,22 @@ enum scsi_disposition scsi_check_sense(struct scsi_cmnd *scmd) fallthrough; case ILLEGAL_REQUEST: + /* + * Unaligned write command. This may indicate that zoned writes + * have been received by the device in the wrong order. If zone + * write locking is disabled, retry after all pending commands + * have completed. + */ + if (sshdr.asc == 0x21 && sshdr.ascq == 0x04 && + !req->q->limits.use_zone_write_lock && + blk_rq_is_seq_zoned_write(req) && + scmd->retries <= scmd->allowed) { + sdev_printk(KERN_INFO, scmd->device, + "Retrying unaligned write at LBA %#llx.\n", + scsi_get_lba(scmd)); + return NEEDS_DELAYED_RETRY; + } + if (sshdr.asc == 0x20 || /* Invalid command operation code */ sshdr.asc == 0x21 || /* Logical block address out of range */ sshdr.asc == 0x22 || /* Invalid function */ diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index c2f647a7c1b0..33a34693c8a2 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1443,6 +1443,7 @@ static void scsi_complete(struct request *rq) case ADD_TO_MLQUEUE: scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY); break; + case NEEDS_DELAYED_RETRY: default: scsi_eh_scmd_add(cmd); break; diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 82abc721b543..4e6b77f5854f 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -1197,6 +1197,12 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd) cmd->transfersize = sdp->sector_size; cmd->underflow = nr_blocks << 9; cmd->allowed = sdkp->max_retries; + /* + * Increase the number of allowed retries for zoned writes if zone + * write locking is disabled. + */ + if (!rq->q->limits.use_zone_write_lock && blk_rq_is_seq_zoned_write(rq)) + cmd->allowed += rq->q->nr_requests; cmd->sdb.length = nr_blocks * sdp->sector_size; SCSI_LOG_HLQUEUE(1, diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h index ec093594ba53..6600db046227 100644 --- a/include/scsi/scsi.h +++ b/include/scsi/scsi.h @@ -93,6 +93,7 @@ static inline int scsi_status_is_check_condition(int status) * Internal return values. */ enum scsi_disposition { + NEEDS_DELAYED_RETRY = 0x2000, NEEDS_RETRY = 0x2001, SUCCESS = 0x2002, FAILED = 0x2003,