[5/5] nvme: use __blk_mq_complete_request in timeout path

Message ID	1529500964-28429-6-git-send-email-jianchao.w.wang@oracle.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-block-owner@kernel.org> From: Jianchao Wang <jianchao.w.wang@oracle.com> To: axboe@kernel.dk, hch@lst.de, martin.petersen@oracle.com, keith.busch@intel.com, josef@toxicpanda.com, ulf.hansson@linaro.org Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 5/5] nvme: use __blk_mq_complete_request in timeout path Date: Wed, 20 Jun 2018 21:22:44 +0800 Message-Id: <1529500964-28429-6-git-send-email-jianchao.w.wang@oracle.com> In-Reply-To: <1529500964-28429-1-git-send-email-jianchao.w.wang@oracle.com> References: <1529500964-28429-1-git-send-email-jianchao.w.wang@oracle.com> Sender: linux-block-owner@vger.kernel.org Precedence: bulk

jianchao.wang June 20, 2018, 1:22 p.m. UTC

To regain the capability to prevent normal completion path from
entering a timeout request, blk_mq_mark_rq_complete is introduced
in blk_mq_complete_request. Have to use __blk_mq_complete_request
in timeout path to complete a timeout request.

nvme_cancel_request uses blk_mq_complete_request, it won't do
anything on the timeout request, so add __blk_mq_complete_request
before return BLK_EH_DONE.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/nvme/host/pci.c    | 8 ++++++++
 drivers/nvme/host/rdma.c   | 1 +
 drivers/nvme/target/loop.c | 1 +
 3 files changed, 10 insertions(+)

Christoph Hellwig June 20, 2018, 2:39 p.m. UTC | #1

> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 73a97fc..2a161f6 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		nvme_warn_reset(dev, csts);
>  		nvme_dev_disable(dev, false);
>  		nvme_reset_ctrl(&dev->ctrl);
> +		__blk_mq_complete_request(req);
>  		return BLK_EH_DONE;
>  	}
>  
> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  		dev_warn(dev->ctrl.device,
>  			 "I/O %d QID %d timeout, completion polled\n",
>  			 req->tag, nvmeq->qid);
> +		/*
> +		 * nvme_end_request will invoke blk_mq_complete_request,
> +		 * it will do nothing for this timed out request.
> +		 */
> +		__blk_mq_complete_request(req);

And this clearly is bogus.  We want to iterate over the tagetset
and cancel all requests, not do that manually here.

That was the whole point of the original change.

jianchao.wang June 21, 2018, 2:09 a.m. UTC | #2

Hi Christoph

Thanks for your kindly response.

On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 73a97fc..2a161f6 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		nvme_warn_reset(dev, csts);
>>  		nvme_dev_disable(dev, false);
>>  		nvme_reset_ctrl(&dev->ctrl);
>> +		__blk_mq_complete_request(req);
>>  		return BLK_EH_DONE;
>>  	}
>>  
>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>  		dev_warn(dev->ctrl.device,
>>  			 "I/O %d QID %d timeout, completion polled\n",
>>  			 req->tag, nvmeq->qid);
>> +		/*
>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>> +		 * it will do nothing for this timed out request.
>> +		 */
>> +		__blk_mq_complete_request(req);
> 
> And this clearly is bogus.  We want to iterate over the tagetset
> and cancel all requests, not do that manually here.
> 
> That was the whole point of the original change.
> 

For nvme-pci, we indeed have an issue that when nvme_reset_work->nvme_dev_disable returns, timeout path maybe still
running and the nvme_dev_disable invoked by timeout path will race with the nvme_reset_work.
However, the hole is still there right now w/o my changes, but just narrower. 

Thanks
Jianchao

Sagi Grimberg June 24, 2018, 6:07 p.m. UTC | #3

> Hi Christoph
> 
> Thanks for your kindly response.
> 
> On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>> index 73a97fc..2a161f6 100644
>>> --- a/drivers/nvme/host/pci.c
>>> +++ b/drivers/nvme/host/pci.c
>>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>   		nvme_warn_reset(dev, csts);
>>>   		nvme_dev_disable(dev, false);
>>>   		nvme_reset_ctrl(&dev->ctrl);
>>> +		__blk_mq_complete_request(req);
>>>   		return BLK_EH_DONE;
>>>   	}
>>>   
>>> @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>   		dev_warn(dev->ctrl.device,
>>>   			 "I/O %d QID %d timeout, completion polled\n",
>>>   			 req->tag, nvmeq->qid);
>>> +		/*
>>> +		 * nvme_end_request will invoke blk_mq_complete_request,
>>> +		 * it will do nothing for this timed out request.
>>> +		 */
>>> +		__blk_mq_complete_request(req);
>>
>> And this clearly is bogus.  We want to iterate over the tagetset
>> and cancel all requests, not do that manually here.
>>
>> That was the whole point of the original change.
>>
> 
> For nvme-pci, we indeed have an issue that when nvme_reset_work->nvme_dev_disable returns, timeout path maybe still
> running and the nvme_dev_disable invoked by timeout path will race with the nvme_reset_work.
> However, the hole is still there right now w/o my changes, but just narrower.

Given the amount of fixes (and fixes of fixes) we had in the timeout 
handler, maybe it'd be a good idea to step back and take a another look?

Won't it be better to avoid disabling the device and return
BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?

jianchao.wang June 25, 2018, 1:40 a.m. UTC | #4

On 06/25/2018 02:07 AM, Sagi Grimberg wrote:
> 
>> Hi Christoph
>>
>> Thanks for your kindly response.
>>
>> On 06/20/2018 10:39 PM, Christoph Hellwig wrote:
>>>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>>>> index 73a97fc..2a161f6 100644
>>>> --- a/drivers/nvme/host/pci.c
>>>> +++ b/drivers/nvme/host/pci.c
>>>> @@ -1203,6 +1203,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>>           nvme_warn_reset(dev, csts);
>>>>           nvme_dev_disable(dev, false);
>>>>           nvme_reset_ctrl(&dev->ctrl);
>>>> +        __blk_mq_complete_request(req);
>>>>           return BLK_EH_DONE;
>>>>       }
>>>>   @@ -1213,6 +1214,11 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>>>>           dev_warn(dev->ctrl.device,
>>>>                "I/O %d QID %d timeout, completion polled\n",
>>>>                req->tag, nvmeq->qid);
>>>> +        /*
>>>> +         * nvme_end_request will invoke blk_mq_complete_request,
>>>> +         * it will do nothing for this timed out request.
>>>> +         */
>>>> +        __blk_mq_complete_request(req);
>>>
>>> And this clearly is bogus.  We want to iterate over the tagetset
>>> and cancel all requests, not do that manually here.
>>>
>>> That was the whole point of the original change.
>>>
>>
>> For nvme-pci, we indeed have an issue that when nvme_reset_work->nvme_dev_disable returns, timeout path maybe still
>> running and the nvme_dev_disable invoked by timeout path will race with the nvme_reset_work.
>> However, the hole is still there right now w/o my changes, but just narrower.
> 
> Given the amount of fixes (and fixes of fixes) we had in the timeout handler, maybe it'd be a good idea to step back and take a another look?
> 
> Won't it be better to avoid disabling the device and return
> BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?
> 

Yes, that would be an ideal status for nvme-pci.
But we have to depend on the timeout handler to handle the timed out request from nvme_reset_work.

Thanks
Jianchao

Sagi Grimberg June 25, 2018, 6:51 p.m. UTC | #5

>> Given the amount of fixes (and fixes of fixes) we had in the timeout handler, maybe it'd be a good idea to step back and take a another look?
>>
>> Won't it be better to avoid disabling the device and return
>> BLK_EH_RESET_TIMER if we are not aborting in the timeout handler?
>>
> 
> Yes, that would be an ideal status for nvme-pci.
> But we have to depend on the timeout handler to handle the timed out request from nvme_reset_work.

Yes, so if the timeout handler is invoked from the reset, it should
simply set a proper failed status and return BLK_EH_DONE.

[5/5] nvme: use __blk_mq_complete_request in timeout path

Commit Message

Comments

Patch