Message ID | 1457440568-13084-4-git-send-email-ygardi@codeaurora.org (mailing list archive) |
---|---|
State | Changes Requested, archived |
On 03/08/2016 01:35 PM, Yaniv Gardi wrote:
> A race condition exists between request requeueing and scsi layer
> error handling:
> When UFS driver queuecommand returns a busy status for a request,
> it will be requeued and its tag will be freed and set to -1.
> At the same time it is possible that the request will timeout and
> scsi layer will start error handling for it. The scsi layer reuses
> the request and its tag to send error related commands to the device,
> however its tag is no longer valid.
> As this request was never really sent to the device, there is no
> point to start error handling with the device.
> Implement the scsi error handling timeout callback and bypass SCSI
> error handling for request that were not actually sent to the device.
> For such requests simply reset the block layer timer. Otherwise, let
> SCSI layer perform the usual error handling.
>
> Reviewed-by: Dolev Raviv <draviv@codeaurora.org>
> Signed-off-by: Gilad Broner <gbroner@codeaurora.org>
> Signed-off-by: Yaniv Gardi <ygardi@codeaurora.org>
>
> ---
> drivers/scsi/ufs/ufshcd.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
Having a timeout handler is always a good idea, even though this
doesn't do anything here.
Are we sure that the requests will return eventually?
Does the UFS spec provide for a command abort?

Cheers,

Hannes
On 03/08/2016 02:01 PM, Hannes Reinecke wrote:
> Having a timeout handler is always a good idea, even though this
> doesn't do anything here.
> Are we sure that the requests will return eventually?
> Does the UFS spec provide for a command abort?
>
In fact, looking at the UFS spec there _is_ a command abort.
I would recommend implementing a task management request UPIU with
type 'ABORT TASK' here for any task found to be pending.
In the end, you might run into a _valid_ timeout, at which point you
really want to abort the command...

Cheers,

Hannes
> Having a timeout handler is always a good idea, even though this
> doesn't do anything here.
> Are we sure that the requests will return eventually?
> Does the UFS spec provide for a command abort?
>

I'm sorry, but I believe you are wrong in this case.
This timeout handler does exactly what we intend it to do, and it has
already been tested and verified to fix the race condition I explained
a few threads back.

If the SCSI command was dispatched to UFS and sent, let the usual SCSI
error handling deal with it (the return value is BLK_EH_NOT_HANDLED).
But if the SCSI command was not actually dispatched to the UFS driver,
return BLK_EH_RESET_TIMER and reset the timer, so that we don't get an
>>unjustified<< timeout for a command that was never dispatched.

Also, I will paste the race-condition scenario again, if anyone is
interested:

----------
I will describe a race condition that happened to us a while ago and
that was quite difficult to understand and fix.
So, this patch is not about the "busy" status returned to the scsi
dispatch routine. It's about the abort triggered after 30 seconds.

Imagine a request being queued and sent to the scsi layer, and then to
the ufs driver.
A timer, initialized to 30 seconds, starts ticking.
But the request is never sent to the ufs device, as queuecommand()
returns with "SCSI_MLQUEUE_HOST_BUSY" (which is normal behavior).

So now the request should be re-queued, and its timer should be reset.
(REMEMBER THIS POINT, let's call it "POINT A")

BUT, a context switch happens before it's actually re-queued, and the
CPU moves on to other tasks, doing other things for 30 seconds.
Yes, it sounds crazy, but it did happen.

NOW, the timeout handler is invoked, and the scsi_abort() routine starts
executing (since 30 seconds passed with no completion).
So far, so good. But hey, another context switch happens right at the
beginning of the scsi_abort() routine, before anything useful happens.
(this is "POINT B")

So now the context goes back to "POINT A", to the
blk_requeue_request() routine, which calls:
blk_delete_timer(rq); (which does nothing because the timer already expired)
and then calls:
blk_queue_end_tag()
which places "-1" in the tag field of the request, marking the request
as "not tagged yet".

However, a context switch happens again, and we are back in the
scsi_abort() routine ("POINT B"), which now needs to abort this very
request. But hey, what it sees in the "tag" field is "-1", which is
obviously wrong.

This patch fixes this very rare race condition:
1. upon timeout, blk_rq_timed_out() is called
2. it then calls rq_timed_out_fn(), which eventually calls the new
   callback introduced in this patch: "ufshcd_eh_timed_out()"
3. this routine returns the right flag:
   BLK_EH_NOT_HANDLED or BLK_EH_RESET_TIMER.
4. blk_rq_timed_out() checks the returned value:
   in case of BLK_EH_NOT_HANDLED, the timeout is handled normally,
   meaning scsi_abort() is called;
   in case of BLK_EH_RESET_TIMER, it starts a new timer, and scsi_abort()
   is never called.

Hope that helps.

regards,
Yaniv

> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke Teamlead Storage & Networking
> hare@suse.de +49 911 74053 688
> SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
> HRB 21284 (AG Nürnberg)
>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
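The interleaving described in the message above can be replayed in a small single-threaded userspace model. This is only a sketch: every name below (`model_request`, `model_replay_race`, the `MODEL_EH_*` values and both callbacks) is hypothetical and stands in for the block-layer and SCSI behaviour the message describes, not for the kernel code itself. It assumes the timeout decision is taken before the requeue path clears the tag, which is the ordering given in the scenario.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum model_eh_return { MODEL_EH_NOT_HANDLED, MODEL_EH_RESET_TIMER };

struct model_request {
	int tag;           /* -1 means "not tagged", as after a requeue */
	bool dispatched;   /* true once the driver really issued the command */
	int timer_rearmed; /* times the timeout timer was restarted */
	int abort_calls;   /* times error handling was entered */
	int abort_bad_tag; /* aborts that found tag == -1 */
};

typedef enum model_eh_return (*model_timed_out_fn)(const struct model_request *);

/* Without a driver callback: every timeout escalates to error handling. */
static enum model_eh_return model_always_escalate(const struct model_request *rq)
{
	(void)rq;
	return MODEL_EH_NOT_HANDLED;
}

/* With the callback from the thread: escalate only if the command was sent. */
static enum model_eh_return model_reset_if_unsent(const struct model_request *rq)
{
	return rq->dispatched ? MODEL_EH_NOT_HANDLED : MODEL_EH_RESET_TIMER;
}

/* Replays the race for a request that was never dispatched. */
static void model_replay_race(struct model_request *rq, model_timed_out_fn fn)
{
	enum model_eh_return decision;

	/* The scenario only concerns requests that never reached the device. */
	assert(!rq->dispatched);

	/* The timeout fires first; the callback decides what happens next. */
	decision = fn(rq);

	/* "POINT A" resumes: the requeue path clears the tag. */
	rq->tag = -1;

	/* "POINT B" resumes: act on the decision taken above. */
	if (decision == MODEL_EH_RESET_TIMER) {
		rq->timer_rearmed++;
		return;
	}
	rq->abort_calls++;
	if (rq->tag < 0)
		rq->abort_bad_tag++;
}
```

Replaying the race with `model_always_escalate` ends with an abort that sees tag -1, while `model_reset_if_unsent` only restarts the timer and never enters the abort path. The model has no real concurrency, so it illustrates the ordering, not the locking.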
> In fact, looking at the UFS spec there _is_ a command abort.
> I would recommend implementing a task management request UPIU with
> type 'ABORT TASK' here for any task found to be pending.
> In the end, you might run into a _valid_ timeout, at which point you
> really want to abort the command...
>

But this is not what we'd like to achieve.
We don't want to abort a task that was not even dispatched to the UFS driver.
In those cases we need to re-queue the request and reset the timer.

Hannes, I appreciate your time, but I really don't understand why you
insist on coming up with suggestions when we already implemented one that
is working. Moreover, your solution doesn't fix the race condition, which
is the reason for this patch.
As I don't have HW to test anything at the moment, I think it's better to
stick with this solution, which fixes the BUG and was also verified and
tested.

I'd really appreciate your approval of this patch, but, as already said, I
cannot implement anything else as I can't test it, and also, your
suggestion will NOT fix the race condition.

I think we shouldn't block the entire 17-patch series because of this
patch. Not to mention that this patch is a BUG fix, so it must be included.

thanks,
Yaniv
On 03/08/2016 08:58 PM, ygardi@codeaurora.org wrote:
>> In fact, looking at the UFS spec there _is_ a command abort.
>> I would recommend implementing a task management request UPIU with
>> type 'ABORT TASK' here for any task found to be pending.
>> In the end, you might run into a _valid_ timeout, at which point you
>> really want to abort the command...
>>
>
> but this is not what we'd like to achieve.
> we don't want to abort a task that was not even dispatched to the UFS driver.
> in those cases we need to re-queue the request and reset the timer.
>
Fully understood.

> Hannes, i appreciate your time, but I really don't understand why you
> insist on coming up with suggestions, when we already implemented one that
> is working. more over, your solution doesn't fix the race condition which is the
> reason for this patch.
> as i don't have HW to test anything at the moment, I think it's better to
> stick with this solution that also fix the BUG and also was verified and
> tested.
>
Ah. Didn't know that. I was under the impression that you _had_ the
hardware available. If not then of course it's not easy to verify anything.

So, all things considered:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index de7280c..3400ceb 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4568,6 +4568,41 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
 	ufshcd_probe_hba(hba);
 }
 
+static enum blk_eh_timer_return ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
+{
+	unsigned long flags;
+	struct Scsi_Host *host;
+	struct ufs_hba *hba;
+	int index;
+	bool found = false;
+
+	if (!scmd || !scmd->device || !scmd->device->host)
+		return BLK_EH_NOT_HANDLED;
+
+	host = scmd->device->host;
+	hba = shost_priv(host);
+	if (!hba)
+		return BLK_EH_NOT_HANDLED;
+
+	spin_lock_irqsave(host->host_lock, flags);
+
+	for_each_set_bit(index, &hba->outstanding_reqs, hba->nutrs) {
+		if (hba->lrb[index].cmd == scmd) {
+			found = true;
+			break;
+		}
+	}
+
+	spin_unlock_irqrestore(host->host_lock, flags);
+
+	/*
+	 * Bypass SCSI error handling and reset the block layer timer if this
+	 * SCSI command was not actually dispatched to UFS driver, otherwise
+	 * let SCSI layer handle the error as usual.
+	 */
+	return found ? BLK_EH_NOT_HANDLED : BLK_EH_RESET_TIMER;
+}
+
 static struct scsi_host_template ufshcd_driver_template = {
 	.module = THIS_MODULE,
 	.name = UFSHCD,
@@ -4580,6 +4615,7 @@ static struct scsi_host_template ufshcd_driver_template = {
 	.eh_abort_handler = ufshcd_abort,
 	.eh_device_reset_handler = ufshcd_eh_device_reset_handler,
 	.eh_host_reset_handler = ufshcd_eh_host_reset_handler,
+	.eh_timed_out = ufshcd_eh_timed_out,
 	.this_id = -1,
 	.sg_tablesize = SG_ALL,
 	.cmd_per_lun = UFSHCD_CMD_PER_LUN,
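
For readers without a kernel tree at hand, the lookup performed by ufshcd_eh_timed_out() can be approximated in plain userspace C. The sketch below is an assumption-laden model, not the driver: `model_hba`, `model_cmd`, `model_issue`, `model_eh_timed_out` and the `MODEL_*` constants are hypothetical names, the bitmap is a single `unsigned long` limited to 32 slots, and the host lock taken in the patch is left out because the model is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types used in the patch. */
enum model_eh_return { MODEL_EH_NOT_HANDLED, MODEL_EH_RESET_TIMER };

#define MODEL_NUTRS 32u /* number of transfer request slots in this model */

struct model_cmd {
	int id;
};

struct model_hba {
	unsigned long outstanding_reqs;         /* bit i set: slot i was issued */
	struct model_cmd *lrb_cmd[MODEL_NUTRS]; /* command owning each slot */
};

/* Record that cmd was really handed to the (modelled) device in a slot. */
static void model_issue(struct model_hba *hba, unsigned int slot,
			struct model_cmd *cmd)
{
	assert(slot < MODEL_NUTRS);
	hba->lrb_cmd[slot] = cmd;
	hba->outstanding_reqs |= 1UL << slot;
}

/*
 * Same decision as the patch: if the timed-out command owns an outstanding
 * slot it was dispatched, so let normal error handling run; otherwise it
 * never reached the device and only the timer should be restarted.
 */
static enum model_eh_return model_eh_timed_out(const struct model_hba *hba,
					       const struct model_cmd *cmd)
{
	unsigned int i;

	if (!hba || !cmd)
		return MODEL_EH_NOT_HANDLED;

	for (i = 0; i < MODEL_NUTRS; i++) {
		if (!(hba->outstanding_reqs & (1UL << i)))
			continue; /* slot not in flight, skip it */
		if (hba->lrb_cmd[i] == cmd)
			return MODEL_EH_NOT_HANDLED;
	}
	return MODEL_EH_RESET_TIMER;
}
```

A command registered through `model_issue()` yields `MODEL_EH_NOT_HANDLED`, while a command that was only requeued and never issued yields `MODEL_EH_RESET_TIMER`. Checking the outstanding bit before comparing pointers matters: a stale pointer left in a completed slot must not count as a dispatched command.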