Message ID | 20190722053954.25423-4-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | blk-mq: wait until completed req's complete fn is run |
On 7/21/19 10:39 PM, Ming Lei wrote:
> Before aborting in-flight requests, all IO queues have been shutdown.
> However, request's completion fn may not be done yet because it may
> be scheduled to run via IPI.
>
> So don't abort one request if it is marked as completed, otherwise
> we may abort one normal completed request.

Something I probably already asked before: what prevents that
nvme_cancel_request() is executed concurrently with the completion
handler of the same request?

Thanks,

Bart.
On Mon, Jul 22, 2019 at 9:27 AM Bart Van Assche <bvanassche@acm.org> wrote:
> Something I probably already asked before: what prevents that
> nvme_cancel_request() is executed concurrently with the completion
> handler of the same request?

At least for pci, we've shutdown the queues and their interrupts prior
to tagset iteration, so we can't concurrently execute a natural
completion for in-flight requests while cancelling them.
>> Something I probably already asked before: what prevents that
>> nvme_cancel_request() is executed concurrently with the completion
>> handler of the same request?
>
> At least for pci, we've shutdown the queues and their interrupts prior
> to tagset iteration, so we can't concurrently execute a natural
> completion for in-flight requests while cancelling them.

Same for tcp and rdma.
On Mon, Jul 22, 2019 at 08:27:32AM -0700, Bart Van Assche wrote:
> Something I probably already asked before: what prevents that
> nvme_cancel_request() is executed concurrently with the completion handler
> of the same request?

The commit log did mention the point:

	Before aborting in-flight requests, all IO queues have been shutdown.

which implies that no concurrent normal completion.

Thanks,
Ming
On 7/22/19 6:08 PM, Ming Lei wrote:
> On Mon, Jul 22, 2019 at 08:27:32AM -0700, Bart Van Assche wrote:
>> Something I probably already asked before: what prevents that
>> nvme_cancel_request() is executed concurrently with the completion handler
>> of the same request?
>
> The commit log did mention the point:
>
> 	Before aborting in-flight requests, all IO queues have been shutdown.
>
> which implies that no concurrent normal completion.

How about adding that explanation as a comment above
nvme_cancel_request()? That would make that explanation much easier to
find compared to having to search through commit logs.

Thanks,

Bart.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index cc09b81fc7f4..cb8007cce4d1 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -285,6 +285,10 @@ EXPORT_SYMBOL_GPL(nvme_complete_rq);
 
 bool nvme_cancel_request(struct request *req, void *data, bool reserved)
 {
+	/* don't abort one completed request */
+	if (blk_mq_request_completed(req))
+		return true;
+
 	dev_dbg_ratelimited(((struct nvme_ctrl *) data)->device,
 				"Cancelling I/O %d", req->tag);
Before aborting in-flight requests, all IO queues have been shutdown.
However, request's completion fn may not be done yet because it may
be scheduled to run via IPI.

So don't abort one request if it is marked as completed, otherwise
we may abort one normal completed request.

Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>

---
 drivers/nvme/host/core.c | 4 ++++
 1 file changed, 4 insertions(+)
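
For readers outside the kernel tree, the check described in the commit message can be modelled in user space. This is only a rough sketch, not the driver code: `struct model_request`, the `MODEL_*` constants and every `model_*` function are hypothetical stand-ins invented for illustration, and the model is single-threaded. As in the thread, it assumes all queues are already shut down, so no new normal completion can start while the cancel iteration runs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for request states; not the kernel's definitions. */
enum model_state { MODEL_IN_FLIGHT, MODEL_COMPLETE };

#define MODEL_STATUS_OK      0
#define MODEL_STATUS_ABORTED 1

struct model_request {
	_Atomic int state;	/* one of enum model_state */
	int tag;
	int status;		/* MODEL_STATUS_OK unless the request was aborted */
};

/* Put a request into the in-flight state with a successful status. */
void model_request_init(struct model_request *req, int tag)
{
	atomic_init(&req->state, MODEL_IN_FLIGHT);
	req->tag = tag;
	req->status = MODEL_STATUS_OK;
}

/* Plays the role of blk_mq_request_completed() in this model. */
bool model_request_completed(struct model_request *req)
{
	return atomic_load(&req->state) == MODEL_COMPLETE;
}

/*
 * Normal completion: the request is marked complete right away, while its
 * completion fn may still be pending (in the kernel, scheduled via IPI).
 */
void model_mark_completed(struct model_request *req)
{
	atomic_store(&req->state, MODEL_COMPLETE);
}

/*
 * Plays the role of nvme_cancel_request(). Returning true tells the
 * (hypothetical) iterator to carry on with the next request.
 */
bool model_cancel_request(struct model_request *req)
{
	/* don't abort one completed request */
	if (model_request_completed(req))
		return true;

	req->status = MODEL_STATUS_ABORTED;
	model_mark_completed(req);
	return true;
}
```

Calling `model_cancel_request()` on a request that was already marked complete leaves its status untouched, while a request still in flight ends up aborted and marked complete. The early return also yields `true` because the function is declared `bool`.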