Message ID | 20230823205727.505681-1-bvanassche@acm.org (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | RDMA/srp: Do not call scsi_done() from srp_abort() | expand |
On Aug 23, 2023 / 13:57, Bart Van Assche wrote: > After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler > callback, it performs one of the following actions: > * Call scsi_queue_insert(). > * Call scsi_finish_command(). > * Call scsi_eh_scmd_add(). > Hence, SCSI abort handlers must not call scsi_done(). Otherwise all > the above actions would trigger a use-after-free. Hence remove the > scsi_done() call from srp_abort(). Keep the srp_free_req() call > before returning SUCCESS because we may not see the command again if > SUCCESS is returned. > > Cc: Bob Pearson <rpearsonhpe@gmail.com> > Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> > Fixes: d8536670916a ("IB/srp: Avoid having aborted requests hang") > Signed-off-by: Bart Van Assche <bvanassche@acm.org> Thanks Bart. Do you think this patch fixes the hangs at blktests srp/002 and srp/011? I tried this patch and still see the hang at srp/002, but the hang at srp/011 looks disappearing.
On 8/23/23 19:46, Shinichiro Kawasaki wrote: > On Aug 23, 2023 / 13:57, Bart Van Assche wrote: >> After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler >> callback, it performs one of the following actions: >> * Call scsi_queue_insert(). >> * Call scsi_finish_command(). >> * Call scsi_eh_scmd_add(). >> Hence, SCSI abort handlers must not call scsi_done(). Otherwise all >> the above actions would trigger a use-after-free. Hence remove the >> scsi_done() call from srp_abort(). Keep the srp_free_req() call >> before returning SUCCESS because we may not see the command again if >> SUCCESS is returned. >> >> Cc: Bob Pearson <rpearsonhpe@gmail.com> >> Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> >> Fixes: d8536670916a ("IB/srp: Avoid having aborted requests hang") >> Signed-off-by: Bart Van Assche <bvanassche@acm.org> > > Thanks Bart. Do you think this patch fixes the hangs at blktests srp/002 and > srp/011? I tried this patch and still see the hang at srp/002, but the hang at > srp/011 looks disappearing. I have not yet tried to verify whether the above patch affects any blktests. The above patch is a patch that I developed a few weeks ago but that had not yet been published on the linux-rdma mailing list. Thanks, Bart.
On Aug 24, 2023 / 07:38, Bart Van Assche wrote: > On 8/23/23 19:46, Shinichiro Kawasaki wrote: > > On Aug 23, 2023 / 13:57, Bart Van Assche wrote: > > > After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler > > > callback, it performs one of the following actions: > > > * Call scsi_queue_insert(). > > > * Call scsi_finish_command(). > > > * Call scsi_eh_scmd_add(). > > > Hence, SCSI abort handlers must not call scsi_done(). Otherwise all > > > the above actions would trigger a use-after-free. Hence remove the > > > scsi_done() call from srp_abort(). Keep the srp_free_req() call > > > before returning SUCCESS because we may not see the command again if > > > SUCCESS is returned. > > > > > > Cc: Bob Pearson <rpearsonhpe@gmail.com> > > > Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> > > > Fixes: d8536670916a ("IB/srp: Avoid having aborted requests hang") > > > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > > > > Thanks Bart. Do you think this patch fixes the hangs at blktests srp/002 and > > srp/011? I tried this patch and still see the hang at srp/002, but the hang at > > srp/011 looks disappearing. > > I have not yet tried to verify whether the above patch affects any blktests. > The above patch is a patch that I developed a few weeks ago but that had not > yet been published on the linux-rdma mailing list. Thanks for the explanation. I do not observe regression with this patch on my blktests test runs. It looks good that it avoids the srp/011 hang, but not sure why it avoids it.
On Wed, 23 Aug 2023 13:57:27 -0700, Bart Van Assche wrote: > After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler > callback, it performs one of the following actions: > * Call scsi_queue_insert(). > * Call scsi_finish_command(). > * Call scsi_eh_scmd_add(). > Hence, SCSI abort handlers must not call scsi_done(). Otherwise all > the above actions would trigger a use-after-free. Hence remove the > scsi_done() call from srp_abort(). Keep the srp_free_req() call > before returning SUCCESS because we may not see the command again if > SUCCESS is returned. > > [...] Applied, thanks! [1/1] RDMA/srp: Do not call scsi_done() from srp_abort() https://git.kernel.org/rdma/rdma/c/e193b7955dfad6 Best regards,
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 0e513a7e5ac8..be003fa0168c 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -2788,7 +2788,6 @@ static int srp_abort(struct scsi_cmnd *scmnd) u32 tag; u16 ch_idx; struct srp_rdma_ch *ch; - int ret; shost_printk(KERN_ERR, target->scsi_host, "SRP abort called\n"); @@ -2802,19 +2801,14 @@ static int srp_abort(struct scsi_cmnd *scmnd) shost_printk(KERN_ERR, target->scsi_host, "Sending SRP abort for tag %#x\n", tag); if (srp_send_tsk_mgmt(ch, tag, scmnd->device->lun, - SRP_TSK_ABORT_TASK, NULL) == 0) - ret = SUCCESS; - else if (target->rport->state == SRP_RPORT_LOST) - ret = FAST_IO_FAIL; - else - ret = FAILED; - if (ret == SUCCESS) { + SRP_TSK_ABORT_TASK, NULL) == 0) { srp_free_req(ch, req, scmnd, 0); - scmnd->result = DID_ABORT << 16; - scsi_done(scmnd); + return SUCCESS; } + if (target->rport->state == SRP_RPORT_LOST) + return FAST_IO_FAIL; - return ret; + return FAILED; } static int srp_reset_device(struct scsi_cmnd *scmnd)
After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler callback, it performs one of the following actions: * Call scsi_queue_insert(). * Call scsi_finish_command(). * Call scsi_eh_scmd_add(). Hence, SCSI abort handlers must not call scsi_done(). Otherwise all the above actions would trigger a use-after-free. Hence remove the scsi_done() call from srp_abort(). Keep the srp_free_req() call before returning SUCCESS because we may not see the command again if SUCCESS is returned. Cc: Bob Pearson <rpearsonhpe@gmail.com> Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: d8536670916a ("IB/srp: Avoid having aborted requests hang") Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/infiniband/ulp/srp/ib_srp.c | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-)