Message ID | a68ad043-26a1-d3d8-2009-504ba4230e0f@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, Jun 21, 2018 at 04:22:22PM +0800, jianchao.wang wrote:
> > Trace, please. With the latest kernel. I'm not saying that there
> > is nothing to fix, but the mode of never completing once timeout
> > requests as currently done is SCSI is clearly broken.
> >
>
> I didn't find the existing method to simulate this.
> So I modified the scsi-debug as following patch as install it as following:
> modprobe scsi-debug delay=-1 ndelay=-1
> Both 4.17-rc1 and 4.18-rc1 with this patch set could survive from the test.

What tree is this against? I can't apply it to either current Linus'
tree or 4.17 for that matter.

Also I'm not sure this blk_abort_request call is representative
of the real world. Drivers do drain their queues before calling
it in general, e.g. take a look at ata_eh_set_pending for the
probably most common user.
Hi Christoph

Sorry for the delayed response.

On 06/22/2018 11:10 PM, Christoph Hellwig wrote:
> On Thu, Jun 21, 2018 at 04:22:22PM +0800, jianchao.wang wrote:
>>> Trace, please. With the latest kernel. I'm not saying that there
>>> is nothing to fix, but the mode of never completing once timeout
>>> requests as currently done is SCSI is clearly broken.
>>>
>>
>> I didn't find the existing method to simulate this.
>> So I modified the scsi-debug as following patch as install it as following:
>> modprobe scsi-debug delay=-1 ndelay=-1
>> Both 4.17-rc1 and 4.18-rc1 with this patch set could survive from the test.
>
> What tree is this against? I can't apply it to either current Linus'
> tree or 4.17 for that matter.

I made the patch against 4.18-rc1.

> Also I'm not sure this blk_abort_request call is representative
> of the real world. Drivers do drain their queues before calling
> it in general, e.g. take a look at ata_eh_set_pending for the
> probably most common user.
>

The blk_abort_request call here forces the request to time out, to
simulate the scenario where the timeout path and the io completion path
run concurrently.
It is hard for me to trigger this scenario in the real world, so I made
this patch, which may look bad. What I want is to trigger the io
completion and timeout paths concurrently.

Thanks
Jianchao
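The race described above, where the timeout path and the completion path both try to finish the same request, can be pictured with a small userspace sketch. This is only an illustration under stated assumptions: the names fake_request, fake_request_start, fake_request_claim and the three states are hypothetical, and the compare-and-exchange shown here is a generic way to let exactly one path win, not a description of how the block layer handles it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request states for illustration; not the kernel's definitions. */
enum req_state { REQ_IN_FLIGHT, REQ_COMPLETED, REQ_TIMED_OUT };

/* Hypothetical stand-in for a request that two paths may race on. */
struct fake_request {
	_Atomic int state;
};

/* Mark the request as issued to the (imaginary) device. */
static void fake_request_start(struct fake_request *rq)
{
	assert(rq != NULL);
	atomic_store(&rq->state, REQ_IN_FLIGHT);
}

/*
 * Try to move the request from REQ_IN_FLIGHT to new_state.
 * Returns true only for the caller that performed the transition, so
 * when the completion and timeout paths race, exactly one of them wins.
 */
static bool fake_request_claim(struct fake_request *rq, int new_state)
{
	int expected = REQ_IN_FLIGHT;

	assert(rq != NULL);
	return atomic_compare_exchange_strong(&rq->state, &expected, new_state);
}
```

In this sketch the timeout path would call fake_request_claim(rq, REQ_TIMED_OUT) and the completion path fake_request_claim(rq, REQ_COMPLETED); whichever call returns false knows the other path already owns the request and must not touch it further.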
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 24d7496..f278e6c 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -4323,6 +4323,8 @@ static void setup_inject(struct sdebug_queue *sqp,
 	sqcp->inj_host_busy = !!(SDEBUG_OPT_HOST_BUSY & sdebug_opts);
 }
 
+static atomic_t g_abort_counter;
+
 /* Complete the processing of the thread that queued a SCSI command to this
  * driver. It either completes the command by calling cmnd_done() or
  * schedules a hr timer or work queue then returns 0. Returns
@@ -4459,6 +4461,11 @@ static int schedule_resp(struct scsi_cmnd *cmnd, struct sdebug_dev_info *devip,
 		sd_dp->issuing_cpu = raw_smp_processor_id();
 		sd_dp->defer_t = SDEB_DEFER_WQ;
 		schedule_work(&sd_dp->ew.work);
+		atomic_inc(&g_abort_counter);
+		if (atomic_read(&g_abort_counter)%2000 == 0) {
+			blk_abort_request(cmnd->request);
+			trace_printk("abort request tag %d\n", cmnd->request->tag);
+		}
 	}
 	if (unlikely((SDEBUG_OPT_Q_NOISE & sdebug_opts) &&
 		     (scsi_result == device_qfull_result)))
@@ -5844,6 +5851,7 @@ static int sdebug_driver_probe(struct device *dev)
 	struct Scsi_Host *hpnt;
 	int hprot;
 
+	atomic_set(&g_abort_counter, 0);
 	sdbg_host = to_sdebug_host(dev);
 
 	sdebug_driver_template.can_queue = sdebug_max_queue;
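The patch aborts every 2000th deferred command by calling atomic_inc() and then atomic_read() on g_abort_counter. With several CPUs queueing commands, another increment can land between those two calls, so a multiple of 2000 may be observed twice or skipped. For a fault-injection hack that is probably harmless, but the every-Nth check can be made exact by using the value returned by a single atomic increment. Below is a minimal userspace sketch of that idea using C11 atomics; inject_counter and should_inject are hypothetical names, and in kernel code the corresponding primitive would be atomic_inc_return().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the patch's g_abort_counter. */
static atomic_uint inject_counter;

/*
 * Returns true on every period-th call across all threads.
 * The increment and the read are one atomic operation, so each call
 * sees a unique counter value and no multiple of period is hit twice
 * or missed.
 */
static bool should_inject(unsigned int period)
{
	unsigned int n;

	assert(period != 0);
	/* atomic_fetch_add returns the value before the increment. */
	n = atomic_fetch_add(&inject_counter, 1) + 1;
	return n % period == 0;
}
```

With this shape, the injection site would reduce to a single test such as if (should_inject(2000)) followed by the abort call.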