| Message ID | 1529500964-28429-6-git-send-email-jianchao.w.wang@oracle.com (mailing list archive) |
|---|---|
| State | Changes Requested |
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 73a97fc..2a161f6 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		nvme_warn_reset(dev, csts);
>  		nvme_dev_disable(dev, false);
>  		nvme_reset_ctrl(&dev->ctrl);
> +		__blk_mq_complete_request(req);
>  		return BLK_EH_DONE;
>  	}
>
> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		dev_warn(dev->ctrl.device,
>  			 "I/O %d QID %d timeout, completion polled\n",
>  			 req->tag, nvmeq->qid);
> +		/*
> +		 * nvme_end_request will invoke blk_mq_complete_request,
> +		 * it will do nothing for this timed out request.
> +		 */
> +		__blk_mq_complete_request(req);

And this clearly is bogus.  We want to iterate over the tagset
and cancel all requests, not do that manually here.

That was the whole point of the original change.
Hi Christoph

Thanks for your kind response.

On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 73a97fc..2a161f6 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		nvme_warn_reset(dev, csts);
>>  		nvme_dev_disable(dev, false);
>>  		nvme_reset_ctrl(&dev->ctrl);
>> +		__blk_mq_complete_request(req);
>>  		return BLK_EH_DONE;
>>  	}
>>
>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		dev_warn(dev->ctrl.device,
>>  			 "I/O %d QID %d timeout, completion polled\n",
>>  			 req->tag, nvmeq->qid);
>> +		/*
>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>> +		 * it will do nothing for this timed out request.
>> +		 */
>> +		__blk_mq_complete_request(req);
>
> And this clearly is bogus.  We want to iterate over the tagset
> and cancel all requests, not do that manually here.
>
> That was the whole point of the original change.
>

For nvme-pci, we indeed have an issue: when nvme_reset_work->nvme_dev_disable
returns, the timeout path may still be running, and the nvme_dev_disable
invoked by the timeout path will race with nvme_reset_work.
However, the hole is still there right now w/o my changes, just narrower.

Thanks
Jianchao
> Hi Christoph
>
> Thanks for your kind response.
>
> On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>> index 73a97fc..2a161f6 100644
>>> --- a/drivers/nvme/host/pci.c
>>> +++ b/drivers/nvme/host/pci.c
>>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>  		nvme_warn_reset(dev, csts);
>>>  		nvme_dev_disable(dev, false);
>>>  		nvme_reset_ctrl(&dev->ctrl);
>>> +		__blk_mq_complete_request(req);
>>>  		return BLK_EH_DONE;
>>>  	}
>>>
>>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>  		dev_warn(dev->ctrl.device,
>>>  			 "I/O %d QID %d timeout, completion polled\n",
>>>  			 req->tag, nvmeq->qid);
>>> +		/*
>>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>>> +		 * it will do nothing for this timed out request.
>>> +		 */
>>> +		__blk_mq_complete_request(req);
>>
>> And this clearly is bogus.  We want to iterate over the tagset
>> and cancel all requests, not do that manually here.
>>
>> That was the whole point of the original change.
>>
>
> For nvme-pci, we indeed have an issue: when nvme_reset_work->nvme_dev_disable
> returns, the timeout path may still be running, and the nvme_dev_disable
> invoked by the timeout path will race with nvme_reset_work.
> However, the hole is still there right now w/o my changes, just narrower.

Given the amount of fixes (and fixes of fixes) we had in the timeout
handler, maybe it'd be a good idea to step back and take another look?

Won't it be better to avoid disabling the device and return
BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?
On 06/25/2018 02:07 AM, Sagi Grimberg wrote:
>
>> Hi Christoph
>>
>> Thanks for your kind response.
>>
>> On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>>> index 73a97fc..2a161f6 100644
>>>> --- a/drivers/nvme/host/pci.c
>>>> +++ b/drivers/nvme/host/pci.c
>>>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>>  		nvme_warn_reset(dev, csts);
>>>>  		nvme_dev_disable(dev, false);
>>>>  		nvme_reset_ctrl(&dev->ctrl);
>>>> +		__blk_mq_complete_request(req);
>>>>  		return BLK_EH_DONE;
>>>>  	}
>>>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>>  		dev_warn(dev->ctrl.device,
>>>>  			 "I/O %d QID %d timeout, completion polled\n",
>>>>  			 req->tag, nvmeq->qid);
>>>> +		/*
>>>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>>>> +		 * it will do nothing for this timed out request.
>>>> +		 */
>>>> +		__blk_mq_complete_request(req);
>>>
>>> And this clearly is bogus.  We want to iterate over the tagset
>>> and cancel all requests, not do that manually here.
>>>
>>> That was the whole point of the original change.
>>>
>>
>> For nvme-pci, we indeed have an issue: when nvme_reset_work->nvme_dev_disable
>> returns, the timeout path may still be running, and the nvme_dev_disable
>> invoked by the timeout path will race with nvme_reset_work.
>> However, the hole is still there right now w/o my changes, just narrower.
>
> Given the amount of fixes (and fixes of fixes) we had in the timeout
> handler, maybe it'd be a good idea to step back and take another look?
>
> Won't it be better to avoid disabling the device and return
> BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?
>

Yes, that would be an ideal status for nvme-pci.
But we have to depend on the timeout handler to handle the timed out
requests from nvme_reset_work.

Thanks
Jianchao
>> Given the amount of fixes (and fixes of fixes) we had in the timeout
>> handler, maybe it'd be a good idea to step back and take another look?
>>
>> Won't it be better to avoid disabling the device and return
>> BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?
>>
>
> Yes, that would be an ideal status for nvme-pci.
> But we have to depend on the timeout handler to handle the timed out
> requests from nvme_reset_work.

Yes, so if the timeout handler is invoked from the reset, it should
simply set a proper failed status and return BLK_EH_DONE.
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 73a97fc..2a161f6 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_warn_reset(dev, csts);
 		nvme_dev_disable(dev, false);
 		nvme_reset_ctrl(&dev->ctrl);
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	}

@@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		dev_warn(dev->ctrl.device,
 			 "I/O %d QID %d timeout, completion polled\n",
 			 req->tag, nvmeq->qid);
+		/*
+		 * nvme_end_request will invoke blk_mq_complete_request,
+		 * it will do nothing for this timed out request.
+		 */
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	}

@@ -1230,6 +1236,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 			 req->tag, nvmeq->qid);
 		nvme_dev_disable(dev, false);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	default:
 		break;
@@ -1248,6 +1255,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 		nvme_reset_ctrl(&dev->ctrl);

 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
+		__blk_mq_complete_request(req);
 		return BLK_EH_DONE;
 	}

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c9424da..287eecd 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1617,6 +1617,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)

 	/* fail with DNR on cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
+	__blk_mq_complete_request(rq);

 	return BLK_EH_DONE;
 }
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index d8d91f0..627fbb0 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -148,6 +148,7 @@ nvme_loop_timeout(struct request *rq, bool reserved)

 	/* fail with DNR on admin cmd timeout */
 	nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
+	__blk_mq_complete_request(rq);

 	return BLK_EH_DONE;
 }
To regain the capability of preventing the normal completion path from
entering a timed out request, blk_mq_mark_rq_complete is introduced into
blk_mq_complete_request. We therefore have to use __blk_mq_complete_request
in the timeout path to complete a timed out request. nvme_cancel_request
uses blk_mq_complete_request, which does nothing for a timed out request,
so add __blk_mq_complete_request before returning BLK_EH_DONE.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/nvme/host/pci.c    | 8 ++++++++
 drivers/nvme/host/rdma.c   | 1 +
 drivers/nvme/target/loop.c | 1 +
 3 files changed, 10 insertions(+)