Message ID | 1477279216-11256-1-git-send-email-krisman@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Deferred, archived |
Headers | show |
On 01:20 Mon 24 Oct , Gabriel Krisman Bertazi wrote: > James, > > I fixed the things you pointed out on the previous review. As > discussed, I didn't change the code to reuse the request yet. We can > follow up on that later. > > Thanks, > > >8 > > Usually, re-sending the SCSI command is enough to recover from a Unit > Attention (UA). This adds a generic retry code to the SCSI command path > in case of an UA, before giving up and returning the error condition to > the caller. > > I added the UA verification into scsi_execute instead of > scsi_execute_req because there are at least a few callers that invoke > scsi_execute directly and would benefit from the internal UA retry. > Also, I didn't use scsi_normalize_sense to not duplicate functionality > with scsi_execute_req_flags. Instead, scsi_execute uses a small helper > function that verifies only the UA condition directly from the raw sense > buffer. If this design is not OK, I can refactor to use > scsi_normalize_sense. > > This prevents us from duplicating the retry code in at least a few > places. In particular, it fixes an issue found in some IBM enclosures, > in which the device may return an Unit Attention during probe, breaking > the bind with the ses module: > > scsi 1:0:7:0: Failed to get diagnostic page 0x8000002 > scsi 1:0:7:0: Failed to bind enclosure -19 > > Link: https://patchwork.kernel.org/patch/9336763/ > Suggested-by: Brian King <brking@linux.vnet.ibm.com> > Suggested-by: James Bottomley <jejb@linux.vnet.ibm.com> > Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> > --- > drivers/scsi/scsi_lib.c | 24 ++++++++++++++++++++++++ > 1 file changed, 24 insertions(+) > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index c71344aebdbb..4f6a91d6ee34 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -163,6 +163,16 @@ void scsi_queue_insert(struct scsi_cmnd *cmd, int reason) > { > __scsi_queue_insert(cmd, reason, 1); > } > + > +static inline bool scsi_sense_unit_attention(const char *sense) > +{ > + int resp = sense[0] & 0x7f; > + > + return ((resp & 0x70) && > + ((resp >= 0x72 && (sense[1] & 0xf) == UNIT_ATTENTION) || > + (resp < 0x72 && (sense[2] & 0xf) == UNIT_ATTENTION))); > +} > + Just some minor nit-pick. In other places where we check for valid sense (SCSI_SENSE_VALID(), scsi_sense_valid()) we always check for whether the upper nibble really only contains 0x70 ((resp & 0x70) == 0x70). I also find it strange that we now have 3 different and independent functions/places that check for valid sense. Beste Grüße / Best regards, - Benjamin Block > /** > * scsi_execute - insert request and wait for the result > * @sdev: scsi device > @@ -187,7 +197,14 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, > struct request *req; > int write = (data_direction == DMA_TO_DEVICE); > int ret = DRIVER_ERROR << 24; > + unsigned char sense_buf[SCSI_SENSE_BUFFERSIZE]; > > + if (!sense) { > + sense = sense_buf; > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > + } > + > + retry: > req = blk_get_request(sdev->request_queue, write, __GFP_RECLAIM); > if (IS_ERR(req)) > return ret; > @@ -210,6 +227,13 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, > */ > blk_execute_rq(req->q, NULL, req, 1); > > + if (scsi_sense_unit_attention(sense) && req->retries > 0) { > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > + retries = req->retries - 1; > + blk_put_request(req); > + goto retry; > + } > + > /* > * Some devices (USB mass-storage in particular) may transfer > * garbage data together with a residue indicating that the data > -- > 2.7.4
On 10/23/2016 08:20 PM, Gabriel Krisman Bertazi wrote: > + > /** > * scsi_execute - insert request and wait for the result > * @sdev: scsi device > @@ -187,7 +197,14 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, > struct request *req; > int write = (data_direction == DMA_TO_DEVICE); > int ret = DRIVER_ERROR << 24; > + unsigned char sense_buf[SCSI_SENSE_BUFFERSIZE]; > > + if (!sense) { > + sense = sense_buf; > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > + } > + > + retry: > req = blk_get_request(sdev->request_queue, write, __GFP_RECLAIM); > if (IS_ERR(req)) > return ret; > @@ -210,6 +227,13 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, > */ > blk_execute_rq(req->q, NULL, req, 1); > > + if (scsi_sense_unit_attention(sense) && req->retries > 0) { > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > + retries = req->retries - 1; > + blk_put_request(req); > + goto retry; > + } > + > /* > * Some devices (USB mass-storage in particular) may transfer > * garbage data together with a residue indicating that the data From scsi_io_completion(): if (sense_valid && !sense_deferred) { switch (sshdr.sense_key) { case UNIT_ATTENTION: if (cmd->device->removable) { cmd->device->changed = 1; action = ACTION_FAIL; } else { action = ACTION_RETRY; } [ ... ] Why do you think new retry code is needed in scsi_execute()? Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 15:16 -0700, Bart Van Assche wrote: > On 10/23/2016 08:20 PM, Gabriel Krisman Bertazi wrote: > > + > > /** > > * scsi_execute - insert request and wait for the result > > * @sdev: scsi device > > @@ -187,7 +197,14 @@ int scsi_execute(struct scsi_device *sdev, > > const unsigned char *cmd, > > struct request *req; > > int write = (data_direction == DMA_TO_DEVICE); > > int ret = DRIVER_ERROR << 24; > > + unsigned char sense_buf[SCSI_SENSE_BUFFERSIZE]; > > > > + if (!sense) { > > + sense = sense_buf; > > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > > + } > > + > > + retry: > > req = blk_get_request(sdev->request_queue, write, > > __GFP_RECLAIM); > > if (IS_ERR(req)) > > return ret; > > @@ -210,6 +227,13 @@ int scsi_execute(struct scsi_device *sdev, > > const unsigned char *cmd, > > */ > > blk_execute_rq(req->q, NULL, req, 1); > > > > + if (scsi_sense_unit_attention(sense) && req->retries > 0) > > { > > + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); > > + retries = req->retries - 1; > > + blk_put_request(req); > > + goto retry; > > + } > > + > > /* > > * Some devices (USB mass-storage in particular) may > > transfer > > * garbage data together with a residue indicating that > > the data > > From scsi_io_completion(): > > if (sense_valid && !sense_deferred) { > switch (sshdr.sense_key) { > case UNIT_ATTENTION: > if (cmd->device->removable) { > cmd->device->changed = 1; > action = ACTION_FAIL; > } else { > action = ACTION_RETRY; > } > [ ... ] > > Why do you think new retry code is needed in scsi_execute()? Because scsi_execute uses REQ_BLOCK_PC which is completed before you get to that code. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2016-10-25 at 15:23 -0700, James Bottomley wrote: > Because scsi_execute uses REQ_BLOCK_PC which is completed before you > get to that code. Hello James, Do you perhaps mean that scsi_io_completion() returns early for REQ_TYPE_BLOCK_PC requests? Can you clarify this further? Anyway, currently the following functions interpret the SCSI sense buffer: * scsi_io_completion() in scsi_lib.c. * scsi_mode_sense() in scsi_lib.c. * scsi_test_unit_ready_flags() in scsi_lib.c. * scsi_probe_lun() in scsi_scan.c. * scsi_report_lun_scan() in scsi_scan.c. * ioctl_internal_command() in scsi_ioctl.c. * sg_rq_end_io() in sg.c. * scsi_check_sense() in scsi_error.c. * spi_execute() in scsi_transport_spi.c. Are you sure we should add sense code interpretation code in a tenth function in the SCSI core? Thanks, Bart.
On Tue, 2016-10-25 at 23:18 +0000, Bart Van Assche wrote: > On Tue, 2016-10-25 at 15:23 -0700, James Bottomley wrote: > > Because scsi_execute uses REQ_BLOCK_PC which is completed before > > you get to that code. > > Hello James, > > Do you perhaps mean that scsi_io_completion() returns early for > REQ_TYPE_BLOCK_PC requests? Can you clarify this further? I'm not sure how much simpler I can make it. How about: the first if block gives a non zero value in error causing scsi_end_request to signal an immediate return? > Anyway, currently the following functions interpret the SCSI sense > buffer: > * scsi_io_completion() in scsi_lib.c. > * scsi_mode_sense() in scsi_lib.c. > * scsi_test_unit_ready_flags() in scsi_lib.c. > * scsi_probe_lun() in scsi_scan.c. > * scsi_report_lun_scan() in scsi_scan.c. > * ioctl_internal_command() in scsi_ioctl.c. > * sg_rq_end_io() in sg.c. > * scsi_check_sense() in scsi_error.c. > * spi_execute() in scsi_transport_spi.c. > > Are you sure we should add sense code interpretation code in a tenth > function in the SCSI core? In the absence of a better proposal, yes. I originally looked into better BLOCK_PC error handling in scsi_io_completion, but that has some knock on problems, so it seems best to leave it alone. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/25/2016 04:50 PM, James Bottomley wrote: > On Tue, 2016-10-25 at 23:18 +0000, Bart Van Assche wrote: >> Anyway, currently the following functions interpret the SCSI sense >> buffer: >> * scsi_io_completion() in scsi_lib.c. >> * scsi_mode_sense() in scsi_lib.c. >> * scsi_test_unit_ready_flags() in scsi_lib.c. >> * scsi_probe_lun() in scsi_scan.c. >> * scsi_report_lun_scan() in scsi_scan.c. >> * ioctl_internal_command() in scsi_ioctl.c. >> * sg_rq_end_io() in sg.c. >> * scsi_check_sense() in scsi_error.c. >> * spi_execute() in scsi_transport_spi.c. >> >> Are you sure we should add sense code interpretation code in a tenth >> function in the SCSI core? > > In the absence of a better proposal, yes. I originally looked into > better BLOCK_PC error handling in scsi_io_completion, but that has some > knock on problems, so it seems best to leave it alone. Can you elaborate on this? Since the sense buffer is available in scsi_io_completion() and since that function already calls scsi_command_normalize_sense() this function seems like a good candidate to me to add retry logic for REQ_TYPE_BLOCK_PC requests. Thanks, Bart. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2016-10-26 at 08:42 -0700, Bart Van Assche wrote: > On 10/25/2016 04:50 PM, James Bottomley wrote: > > On Tue, 2016-10-25 at 23:18 +0000, Bart Van Assche wrote: > > > Anyway, currently the following functions interpret the SCSI > > > sense buffer: > > > * scsi_io_completion() in scsi_lib.c. > > > * scsi_mode_sense() in scsi_lib.c. > > > * scsi_test_unit_ready_flags() in scsi_lib.c. > > > * scsi_probe_lun() in scsi_scan.c. > > > * scsi_report_lun_scan() in scsi_scan.c. > > > * ioctl_internal_command() in scsi_ioctl.c. > > > * sg_rq_end_io() in sg.c. > > > * scsi_check_sense() in scsi_error.c. > > > * spi_execute() in scsi_transport_spi.c. > > > > > > Are you sure we should add sense code interpretation code in a > > > tenth function in the SCSI core? > > > > In the absence of a better proposal, yes. I originally looked into > > better BLOCK_PC error handling in scsi_io_completion, but that has > > some knock on problems, so it seems best to leave it alone. > > Can you elaborate on this? Since the sense buffer is available in > scsi_io_completion() and since that function already calls > scsi_command_normalize_sense() this function seems like a good > candidate to me to add retry logic for REQ_TYPE_BLOCK_PC requests. UAs are used to signal AENs to userspace control processes that use BLOCK_PC, like CD changers, burners and scanners. If we eat UAs inside scsi_io_completion(), we could potentially break any userspace control system that relies on AENs. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2016-10-26 at 08:52 -0700, James Bottomley wrote: > On Wed, 2016-10-26 at 08:42 -0700, Bart Van Assche wrote: > > Can you elaborate on this? Since the sense buffer is available in > > scsi_io_completion() and since that function already calls > > scsi_command_normalize_sense() this function seems like a good > > candidate to me to add retry logic for REQ_TYPE_BLOCK_PC requests. > > UAs are used to signal AENs to userspace control processes that use > BLOCK_PC, like CD changers, burners and scanners. If we eat UAs > inside scsi_io_completion(), we could potentially break any userspace > control system that relies on AENs. Ignoring the SCSI error handler, the call sequence for reporting UAs to user space is as follows: scsi_softirq_done() -> scsi_decide_disposition() -> scsi_check_sense() -> scsi_report_sense(). This means that scsi_report_sense() is called before scsi_io_completion() is called. I think this means that what scsi_io_completion() decides cannot affect UA reporting to user space? Bart.
On 10/26/2016 11:15 AM, Bart Van Assche wrote: > On Wed, 2016-10-26 at 08:52 -0700, James Bottomley wrote: >> On Wed, 2016-10-26 at 08:42 -0700, Bart Van Assche wrote: >>> Can you elaborate on this? Since the sense buffer is available in >>> scsi_io_completion() and since that function already calls >>> scsi_command_normalize_sense() this function seems like a good >>> candidate to me to add retry logic for REQ_TYPE_BLOCK_PC requests. >> >> UAs are used to signal AENs to userspace control processes that use >> BLOCK_PC, like CD changers, burners and scanners. If we eat UAs >> inside scsi_io_completion(), we could potentially break any userspace >> control system that relies on AENs. > > Ignoring the SCSI error handler, the call sequence for reporting UAs to > user space is as follows: scsi_softirq_done() -> > scsi_decide_disposition() -> scsi_check_sense() -> scsi_report_sense(). > This means that scsi_report_sense() is called before > scsi_io_completion() is called. I think this means that what > scsi_io_completion() decides cannot affect UA reporting to user space? I think what would break if we just started eating UAs in scsi_io_completion for BLOCK_PC is applications that are building BLOCK_PC requests and then checking the sense data in the completed command for a UA. I think this is what James was referring to. If we wanted to collapse some of this retry logic, it seems like we might need a way to differentiate between different types of BLOCK_PC requests. A new cmd_flag on struct request, or more generally, a way for scsi or any user of struct request to pass driver specific data along with the request. This is something I've wanted for ipr, which I've sort of worked around currently. -Brian
On 10/26/2016 07:38 PM, Brian King wrote: > On 10/26/2016 11:15 AM, Bart Van Assche wrote: >> On Wed, 2016-10-26 at 08:52 -0700, James Bottomley wrote: >>> On Wed, 2016-10-26 at 08:42 -0700, Bart Van Assche wrote: >>>> Can you elaborate on this? Since the sense buffer is available in >>>> scsi_io_completion() and since that function already calls >>>> scsi_command_normalize_sense() this function seems like a good >>>> candidate to me to add retry logic for REQ_TYPE_BLOCK_PC requests. >>> >>> UAs are used to signal AENs to userspace control processes that use >>> BLOCK_PC, like CD changers, burners and scanners. If we eat UAs >>> inside scsi_io_completion(), we could potentially break any userspace >>> control system that relies on AENs. >> >> Ignoring the SCSI error handler, the call sequence for reporting UAs to >> user space is as follows: scsi_softirq_done() -> >> scsi_decide_disposition() -> scsi_check_sense() -> scsi_report_sense(). >> This means that scsi_report_sense() is called before >> scsi_io_completion() is called. I think this means that what >> scsi_io_completion() decides cannot affect UA reporting to user space? > > I think what would break if we just started eating UAs in scsi_io_completion > for BLOCK_PC is applications that are building BLOCK_PC requests and then > checking the sense data in the completed command for a UA. I think this is what > James was referring to. If we wanted to collapse some of this retry logic, > it seems like we might need a way to differentiate between different types of BLOCK_PC > requests. A new cmd_flag on struct request, or more generally, a way for scsi > or any user of struct request to pass driver specific data along with the request. > This is something I've wanted for ipr, which I've sort of worked around currently. > I'm fully with you here. BLOCK_PC is currently used indiscriminately for all non-filesystem commands, ie for commands where the raw cdb is passed in via req->special. As such, is has a dual meaning: - A pre-filled CDB - do not evaluate the sense code If we could split this up in having one flag for a pre-filled CDB and another one for evaluating sense code we should be able to resolve this situation. Cheers, Hannes
On Thu, Oct 27, 2016 at 11:00:56AM +0200, Hannes Reinecke wrote: > BLOCK_PC is currently used indiscriminately for all non-filesystem > commands, ie for commands where the raw cdb is passed in via req->special. > > As such, is has a dual meaning: > - A pre-filled CDB > - do not evaluate the sense code > > If we could split this up in having one flag for a pre-filled CDB and > another one for evaluating sense code we should be able to resolve this > situation. That sounds fine to me. I just wish we could keep that flag inside SCSI somehow, which isn't really doable with the current BLOCK_PC passthrough architecture. I have some work in progress to fix that up but it will take a while to land. If Jens is ok with it we can add a cmd_flags flag for now, though. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index c71344aebdbb..4f6a91d6ee34 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -163,6 +163,16 @@ void scsi_queue_insert(struct scsi_cmnd *cmd, int reason) { __scsi_queue_insert(cmd, reason, 1); } + +static inline bool scsi_sense_unit_attention(const char *sense) +{ + int resp = sense[0] & 0x7f; + + return ((resp & 0x70) && + ((resp >= 0x72 && (sense[1] & 0xf) == UNIT_ATTENTION) || + (resp < 0x72 && (sense[2] & 0xf) == UNIT_ATTENTION))); +} + /** * scsi_execute - insert request and wait for the result * @sdev: scsi device @@ -187,7 +197,14 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, struct request *req; int write = (data_direction == DMA_TO_DEVICE); int ret = DRIVER_ERROR << 24; + unsigned char sense_buf[SCSI_SENSE_BUFFERSIZE]; + if (!sense) { + sense = sense_buf; + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); + } + + retry: req = blk_get_request(sdev->request_queue, write, __GFP_RECLAIM); if (IS_ERR(req)) return ret; @@ -210,6 +227,13 @@ int scsi_execute(struct scsi_device *sdev, const unsigned char *cmd, */ blk_execute_rq(req->q, NULL, req, 1); + if (scsi_sense_unit_attention(sense) && req->retries > 0) { + memset(sense, 0, SCSI_SENSE_BUFFERSIZE); + retries = req->retries - 1; + blk_put_request(req); + goto retry; + } + /* * Some devices (USB mass-storage in particular) may transfer * garbage data together with a residue indicating that the data