[3/5] nvme: don't abort completed request in nvme_cancel_request

Message ID 20190722053954.25423-4-ming.lei@redhat.com (mailing list archive)
State New, archived
Series blk-mq: wait until completed req's complete fn is run

Commit Message

Ming Lei July 22, 2019, 5:39 a.m. UTC
Before aborting in-flight requests, all IO queues have been shut down.
However, a request's completion fn may not have finished yet because it
may be scheduled to run via IPI.

So don't abort a request that is already marked as completed; otherwise
we may abort a request that completed normally.

Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/core.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Bart Van Assche July 22, 2019, 3:27 p.m. UTC | #1
On 7/21/19 10:39 PM, Ming Lei wrote:
> Before aborting in-flight requests, all IO queues have been shutdown.
> However, request's completion fn may not be done yet because it may
> be scheduled to run via IPI.
> 
> So don't abort one request if it is marked as completed, otherwise
> we may abort one normal completed request.
> 
> Cc: Max Gurtovoy <maxg@mellanox.com>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Keith Busch <keith.busch@intel.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   drivers/nvme/host/core.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index cc09b81fc7f4..cb8007cce4d1 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -285,6 +285,10 @@ EXPORT_SYMBOL_GPL(nvme_complete_rq);
>   
>   bool nvme_cancel_request(struct request *req, void *data, bool reserved)
>   {
> +	/* don't abort one completed request */
> +	if (blk_mq_request_completed(req))
> +		return;
> +
>   	dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
>   				"Cancelling I/O %d", req->tag);

Something I probably already asked before: what prevents
nvme_cancel_request() from being executed concurrently with the
completion handler of the same request?

Thanks,

Bart.
Keith Busch July 22, 2019, 11:22 p.m. UTC | #2
On Mon, Jul 22, 2019 at 9:27 AM Bart Van Assche <bvanassche@acm.org> wrote:
> On 7/21/19 10:39 PM, Ming Lei wrote:
> > Before aborting in-flight requests, all IO queues have been shutdown.
> > However, request's completion fn may not be done yet because it may
> > be scheduled to run via IPI.
> >
> > So don't abort one request if it is marked as completed, otherwise
> > we may abort one normal completed request.
> >
> > Cc: Max Gurtovoy <maxg@mellanox.com>
> > Cc: Sagi Grimberg <sagi@grimberg.me>
> > Cc: Keith Busch <keith.busch@intel.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   drivers/nvme/host/core.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index cc09b81fc7f4..cb8007cce4d1 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -285,6 +285,10 @@ EXPORT_SYMBOL_GPL(nvme_complete_rq);
> >
> >   bool nvme_cancel_request(struct request *req, void *data, bool reserved)
> >   {
> > +     /* don't abort one completed request */
> > +     if (blk_mq_request_completed(req))
> > +             return;
> > +
> >       dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
> >                               "Cancelling I/O %d", req->tag);
>
> Something I probably already asked before: what prevents that
> nvme_cancel_request() is executed concurrently with the completion
> handler of the same request?

At least for pci, we've shut down the queues and their interrupts prior
to tagset iteration, so we can't concurrently execute a natural
completion for in-flight requests while cancelling them.
Sagi Grimberg July 23, 2019, 12:07 a.m. UTC | #3
>> On 7/21/19 10:39 PM, Ming Lei wrote:
>>> Before aborting in-flight requests, all IO queues have been shutdown.
>>> However, request's completion fn may not be done yet because it may
>>> be scheduled to run via IPI.
>>>
>>> So don't abort one request if it is marked as completed, otherwise
>>> we may abort one normal completed request.
>>>
>>> Cc: Max Gurtovoy <maxg@mellanox.com>
>>> Cc: Sagi Grimberg <sagi@grimberg.me>
>>> Cc: Keith Busch <keith.busch@intel.com>
>>> Cc: Christoph Hellwig <hch@lst.de>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>    drivers/nvme/host/core.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index cc09b81fc7f4..cb8007cce4d1 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -285,6 +285,10 @@ EXPORT_SYMBOL_GPL(nvme_complete_rq);
>>>
>>>    bool nvme_cancel_request(struct request *req, void *data, bool reserved)
>>>    {
>>> +     /* don't abort one completed request */
>>> +     if (blk_mq_request_completed(req))
>>> +             return;
>>> +
>>>        dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
>>>                                "Cancelling I/O %d", req->tag);
>>
>> Something I probably already asked before: what prevents that
>> nvme_cancel_request() is executed concurrently with the completion
>> handler of the same request?
> 
> At least for pci, we've shutdown the queues and their interrupts prior
> to tagset iteration, so we can't concurrently execute a natural
> completion for in-flight requests while cancelling them.

Same for tcp and rdma.
Ming Lei July 23, 2019, 1:08 a.m. UTC | #4
On Mon, Jul 22, 2019 at 08:27:32AM -0700, Bart Van Assche wrote:
> On 7/21/19 10:39 PM, Ming Lei wrote:
> > Before aborting in-flight requests, all IO queues have been shutdown.
> > However, request's completion fn may not be done yet because it may
> > be scheduled to run via IPI.
> > 
> > So don't abort one request if it is marked as completed, otherwise
> > we may abort one normal completed request.
> > 
> > Cc: Max Gurtovoy <maxg@mellanox.com>
> > Cc: Sagi Grimberg <sagi@grimberg.me>
> > Cc: Keith Busch <keith.busch@intel.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > ---
> >   drivers/nvme/host/core.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > index cc09b81fc7f4..cb8007cce4d1 100644
> > --- a/drivers/nvme/host/core.c
> > +++ b/drivers/nvme/host/core.c
> > @@ -285,6 +285,10 @@ EXPORT_SYMBOL_GPL(nvme_complete_rq);
> >   bool nvme_cancel_request(struct request *req, void *data, bool reserved)
> >   {
> > +	/* don't abort one completed request */
> > +	if (blk_mq_request_completed(req))
> > +		return;
> > +
> >   	dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
> >   				"Cancelling I/O %d", req->tag);
> 
> Something I probably already asked before: what prevents that
> nvme_cancel_request() is executed concurrently with the completion handler
> of the same request?

The commit log did mention the point:

	Before aborting in-flight requests, all IO queues have been shutdown.

which implies that there is no concurrent normal completion.

Thanks,
Ming
Bart Van Assche July 23, 2019, 7:22 p.m. UTC | #5
On 7/22/19 6:08 PM, Ming Lei wrote:
> On Mon, Jul 22, 2019 at 08:27:32AM -0700, Bart Van Assche wrote:
>> On 7/21/19 10:39 PM, Ming Lei wrote:
>>> Before aborting in-flight requests, all IO queues have been shutdown.
>>> However, request's completion fn may not be done yet because it may
>>> be scheduled to run via IPI.
>>>
>>> So don't abort one request if it is marked as completed, otherwise
>>> we may abort one normal completed request.
>>>
>>> Cc: Max Gurtovoy <maxg@mellanox.com>
>>> Cc: Sagi Grimberg <sagi@grimberg.me>
>>> Cc: Keith Busch <keith.busch@intel.com>
>>> Cc: Christoph Hellwig <hch@lst.de>
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>> ---
>>>    drivers/nvme/host/core.c | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>>> index cc09b81fc7f4..cb8007cce4d1 100644
>>> --- a/drivers/nvme/host/core.c
>>> +++ b/drivers/nvme/host/core.c
>>> @@ -285,6 +285,10 @@ EXPORT_SYMBOL_GPL(nvme_complete_rq);
>>>    bool nvme_cancel_request(struct request *req, void *data, bool reserved)
>>>    {
>>> +	/* don't abort one completed request */
>>> +	if (blk_mq_request_completed(req))
>>> +		return;
>>> +
>>>    	dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
>>>    				"Cancelling I/O %d", req->tag);
>>
>> Something I probably already asked before: what prevents that
>> nvme_cancel_request() is executed concurrently with the completion handler
>> of the same request?
> 
> The commit log did mention the point:
> 
> 	Before aborting in-flight requests, all IO queues have been shutdown.
> 
> which implies that no concurrent normal completion.

How about adding that explanation as a comment above
nvme_cancel_request()? That would make the explanation much easier to
find than having to search through commit logs.

Thanks,

Bart.
Sagi Grimberg July 23, 2019, 8:27 p.m. UTC | #6
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

Patch

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index cc09b81fc7f4..cb8007cce4d1 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -285,6 +285,10 @@  EXPORT_SYMBOL_GPL(nvme_complete_rq);
 
 bool nvme_cancel_request(struct request *req, void *data, bool reserved)
 {
+	/* don't abort one completed request */
+	if (blk_mq_request_completed(req))
+		return true;
+
 	dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
 				"Cancelling I/O %d", req->tag);
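
For readers who want to see the intended behaviour outside the kernel, here is a minimal user-space sketch of the check this patch adds. It is a model only, not the driver code: every model_* name and the "aborted" field are hypothetical stand-ins introduced for illustration. model_request_completed() plays the role of blk_mq_request_completed(), and the loop in model_cancel_all() stands in for the tagset iteration. The sketch assumes, as the commit message states, that the queues are already shut down, so no completion runs concurrently with the walk.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins; these are not the kernel's types. */
enum model_rq_state { MODEL_RQ_IN_FLIGHT, MODEL_RQ_COMPLETE };

struct model_request {
	int tag;
	enum model_rq_state state;
	bool aborted;	/* set only when the cancel path aborts the request */
};

/* Plays the role of blk_mq_request_completed() in this model. */
static bool model_request_completed(const struct model_request *req)
{
	return req->state == MODEL_RQ_COMPLETE;
}

/*
 * Models the patched cancel callback. It returns bool, so the early
 * exit must return a value; true means "keep iterating".
 */
static bool model_cancel_request(struct model_request *req)
{
	/* don't abort a request that is already marked as completed */
	if (model_request_completed(req))
		return true;

	req->aborted = true;
	req->state = MODEL_RQ_COMPLETE;
	return true;
}

/* Stand-in for the tagset walk; returns how many requests were aborted. */
static int model_cancel_all(struct model_request *reqs, int n)
{
	int aborted = 0;

	for (int i = 0; i < n; i++) {
		if (!model_cancel_request(&reqs[i]))
			break;
		if (reqs[i].aborted)
			aborted++;
	}
	return aborted;
}
```

With one in-flight and one already-completed request, model_cancel_all() aborts only the in-flight one and leaves the completed request untouched, which is the outcome the commit message asks for.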