
[RESEND] nvme: explicitly use normal NVMe error handling when appropriate

Message ID 20200813144811.GA5452@redhat.com (mailing list archive)
State Superseded, archived
Delegated to: Mike Snitzer
Series [RESEND] nvme: explicitly use normal NVMe error handling when appropriate

Commit Message

Mike Snitzer Aug. 13, 2020, 2:48 p.m. UTC
Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
status"), among other things, fixed NVME_SC_CMD_INTERRUPTED error
handling by changing multipathing's nvme_failover_req() to short-circuit
path failover and then fall back to NVMe's normal error handling (which
takes care of NVME_SC_CMD_INTERRUPTED).

This detour through native NVMe multipathing code is unwelcome because
it prevents NVMe core from handling NVME_SC_CMD_INTERRUPTED independently
of any multipathing concerns.

Introduce nvme_status_needs_local_error_handling() to prioritize
non-failover retry, when appropriate, in terms of normal NVMe error
handling.  nvme_status_needs_local_error_handling() will naturally evolve
to include handling of any other errors that normal error handling must
be used for.

nvme_failover_req()'s ability to fall back to normal NVMe error handling
has been preserved because it may be useful for future NVME_SC values that
nvme_status_needs_local_error_handling() hasn't been trained for yet.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/nvme/host/core.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

Meneghini, John Aug. 13, 2020, 3:29 p.m. UTC | #1
Mike, 

I don't see your patch at:

https://patchwork.kernel.org/project/linux-block/list/?submitter=1643

Can you please upload this patch there?

Thanks,

/John

Christoph Hellwig Aug. 13, 2020, 3:36 p.m. UTC | #2
On Thu, Aug 13, 2020 at 10:48:11AM -0400, Mike Snitzer wrote:
> Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
> status"), among other things, fixed NVME_SC_CMD_INTERRUPTED error
> handling by changing multipathing's nvme_failover_req() to short-circuit
> path failover and then fall back to NVMe's normal error handling (which
> takes care of NVME_SC_CMD_INTERRUPTED).
> 
> This detour through native NVMe multipathing code is unwelcome because
> it prevents NVMe core from handling NVME_SC_CMD_INTERRUPTED independently
> of any multipathing concerns.
> 
> Introduce nvme_status_needs_local_error_handling() to prioritize
> non-failover retry, when appropriate, in terms of normal NVMe error
> handling.  nvme_status_needs_local_error_handling() will naturally evolve
> to include handling of any other errors that normal error handling must
> be used for.
> 
> nvme_failover_req()'s ability to fall back to normal NVMe error handling
> has been preserved because it may be useful for future NVME_SC values that
> nvme_status_needs_local_error_handling() hasn't been trained for yet.
> 
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>

I don't see how this would change anything.  nvme_failover_req simply
returns false for NVME_SC_CMD_INTERRUPTED, so your change is a no-op.

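For context, the fallback Christoph refers to is the default arm of
nvme_failover_req() in drivers/nvme/host/multipath.c (quoted in full later
in this thread):

        default:
                /* This was a non-ANA error so follow the normal error path. */
                return false;

Any status the switch has not been trained for, NVME_SC_CMD_INTERRUPTED
included, already falls back to the normal completion path in
nvme_complete_rq().
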
Mike Snitzer Aug. 13, 2020, 3:43 p.m. UTC | #3
Maybe because I didn't cc linux-block?
The only way that I know to "upload this patch there" is to have cc'd
linux-block.

But the patch is in dm-devel's patchwork here:
https://patchwork.kernel.org/patch/11712563/

Is that sufficient for your needs?

Thanks,
Mike

On Thu, Aug 13 2020 at 11:29am -0400,
Meneghini, John <John.Meneghini@netapp.com> wrote:

> Mike, 
> 
> I don't see your patch at:
> 
> https://patchwork.kernel.org/project/linux-block/list/?submitter=1643
> 
> Can you please upload this patch there?
> 
> Thanks,
> 
> /John

Meneghini, John Aug. 13, 2020, 3:59 p.m. UTC | #4
That works, Mike.

Unfortunately, I have to work with the corporate MS Office email handler (which sucks), and downloading the .patch file works much better for me.

Thanks,

/John

On 8/13/20, 11:43 AM, "Mike Snitzer" <snitzer@redhat.com> wrote:

    Maybe because I didn't cc linux-block?
    Only way that I know to "upload this patch there" is to have cc'd
    linux-block.

    But the patch is in dm-devel's patchwork here:
    https://patchwork.kernel.org/patch/11712563/

    Is that sufficient for your needs?

    Thanks,
    Mike

Mike Snitzer Aug. 13, 2020, 5:47 p.m. UTC | #5
On Thu, Aug 13 2020 at 11:36am -0400,
Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Aug 13, 2020 at 10:48:11AM -0400, Mike Snitzer wrote:
> > Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
> > status"), among other things, fixed NVME_SC_CMD_INTERRUPTED error
> > handling by changing multipathing's nvme_failover_req() to short-circuit
> > path failover and then fallback to NVMe's normal error handling (which
> > takes care of NVME_SC_CMD_INTERRUPTED).
> > 
> > This detour through native NVMe multipathing code is unwelcome because
> > it prevents NVMe core from handling NVME_SC_CMD_INTERRUPTED independent
> > of any multipathing concerns.
> > 
> > Introduce nvme_status_needs_local_error_handling() to prioritize
> > non-failover retry, when appropriate, in terms of normal NVMe error
> > handling.  nvme_status_needs_local_error_handling() will naturely evolve
> > to include handling of any other errors that normal error handling must
> > be used for.
> > 
> > nvme_failover_req()'s ability to fallback to normal NVMe error handling
> > has been preserved because it may be useful for future NVME_SC that
> > nvme_status_needs_local_error_handling() hasn't been trained for yet.
> > 
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> 
> I don't see how this would change anything.  nvme_failover_req simply
> retuns false for NVME_SC_CMD_INTERRUPTED, so your change is a no-op.

This is just a tweak to improve the high-level fault tree of core NVMe
error handling.  No functional change, but for such basic errors,
avoiding entering nvme_failover_req is meaningful on a code flow level.
It makes the code that handles errors needing local retry clearer by
being more structured, less circuitous.

It allows NVMe core's handling of such errors to be more explicit and live
in core.c rather than multipath.c -- so things like ACRE handling can be
made explicitly part of core and not nested under nvme_failover_req's
relatively obscure failsafe that returns false for anything it doesn't
care about.

Mike

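For context, the ACRE handling Mike mentions is the Command Retry Delay
logic in nvme_retry_req(), whose body the diffs in this thread elide.  A
rough sketch of the v5.8-era implementation, for orientation only (see
drivers/nvme/host/core.c for the authoritative version):

static void nvme_retry_req(struct request *req)
{
        struct nvme_ns *ns = req->q->queuedata;
        unsigned long delay = 0;
        u16 crd;

        /* The CRD (Command Retry Delay) index sits in bits 12:11 of the status. */
        crd = (nvme_req(req)->status & NVME_SC_CRD) >> 11;
        if (ns && crd)
                /* crdt[] holds the controller's advertised delays, in 100ms units. */
                delay = ns->ctrl->crdt[crd - 1] * 100;

        nvme_req(req)->retries++;
        blk_mq_delay_kick_requeue_list(req->q, delay);
}
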
Christoph Hellwig Aug. 13, 2020, 6:43 p.m. UTC | #6
On Thu, Aug 13, 2020 at 01:47:04PM -0400, Mike Snitzer wrote:
> This is just a tweak to improve the high-level fault tree of core NVMe
> error handling.  No functional change, but for such basic errors,
> avoiding entering nvme_failover_req is meaningful on a code flow level.
> It makes the code that handles errors needing local retry clearer by
> being more structured, less circuitous.
> 
> It allows NVMe core's handling of such errors to be more explicit and live
> in core.c rather than multipath.c -- so things like ACRE handling can be
> made explicitly part of core and not nested under nvme_failover_req's
> relatively obscure failsafe that returns false for anything it doesn't
> care about.

If we're going that way I'd rather do something like the (untested)
patch below that adds a disposition, with a function that decides it,
and then just switches on it:

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 88cff309d8e4f0..a740320f0d4ee7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -241,17 +241,6 @@ static blk_status_t nvme_error_status(u16 status)
 	}
 }
 
-static inline bool nvme_req_needs_retry(struct request *req)
-{
-	if (blk_noretry_request(req))
-		return false;
-	if (nvme_req(req)->status & NVME_SC_DNR)
-		return false;
-	if (nvme_req(req)->retries >= nvme_max_retries)
-		return false;
-	return true;
-}
-
 static void nvme_retry_req(struct request *req)
 {
 	struct nvme_ns *ns = req->q->queuedata;
@@ -268,33 +257,75 @@ static void nvme_retry_req(struct request *req)
 	blk_mq_delay_kick_requeue_list(req->q, delay);
 }
 
-void nvme_complete_rq(struct request *req)
+enum nvme_disposition {
+	COMPLETE,
+	RETRY,
+	REDIRECT_ANA,
+	REDIRECT_TMP,
+};
+
+static inline enum nvme_disposition nvme_req_disposition(struct request *req)
+{
+	if (likely(nvme_req(req)->status == 0))
+		return COMPLETE;
+
+	if (blk_noretry_request(req) ||
+	    (nvme_req(req)->status & NVME_SC_DNR) ||
+	    nvme_req(req)->retries >= nvme_max_retries)
+		return COMPLETE;
+
+	if (req->cmd_flags & REQ_NVME_MPATH) {
+		switch (nvme_req(req)->status & 0x7ff) {
+		case NVME_SC_ANA_TRANSITION:
+		case NVME_SC_ANA_INACCESSIBLE:
+		case NVME_SC_ANA_PERSISTENT_LOSS:
+			return REDIRECT_ANA;
+		case NVME_SC_HOST_PATH_ERROR:
+		case NVME_SC_HOST_ABORTED_CMD:
+			return REDIRECT_TMP;
+		}
+	}
+
+	if (blk_queue_dying(req->q))
+		return COMPLETE;
+	return RETRY;
+}
+
+static inline void nvme_complete_req(struct request *req)
 {
 	blk_status_t status = nvme_error_status(nvme_req(req)->status);
 
-	trace_nvme_complete_rq(req);
+	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
+	    req_op(req) == REQ_OP_ZONE_APPEND)
+		req->__sector = nvme_lba_to_sect(req->q->queuedata,
+			le64_to_cpu(nvme_req(req)->result.u64));
+
+	nvme_trace_bio_complete(req, status);
+	blk_mq_end_request(req, status);
+}
 
+void nvme_complete_rq(struct request *req)
+{
+	trace_nvme_complete_rq(req);
 	nvme_cleanup_cmd(req);
 
 	if (nvme_req(req)->ctrl->kas)
 		nvme_req(req)->ctrl->comp_seen = true;
 
-	if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
-		if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
-			return;
-
-		if (!blk_queue_dying(req->q)) {
-			nvme_retry_req(req);
-			return;
-		}
-	} else if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
-		   req_op(req) == REQ_OP_ZONE_APPEND) {
-		req->__sector = nvme_lba_to_sect(req->q->queuedata,
-			le64_to_cpu(nvme_req(req)->result.u64));
+	switch (nvme_req_disposition(req)) {
+	case COMPLETE:
+		nvme_complete_req(req);
+		return;
+	case RETRY:
+		nvme_retry_req(req);
+		return;
+	case REDIRECT_ANA:
+		nvme_failover_req(req, true);
+		return;
+	case REDIRECT_TMP:
+		nvme_failover_req(req, false);
+		return;
 	}
-
-	nvme_trace_bio_complete(req, status);
-	blk_mq_end_request(req, status);
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);
 
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 3ded54d2c9c6ad..0c22b2c88687a2 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -65,51 +65,32 @@ void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 	}
 }
 
-bool nvme_failover_req(struct request *req)
+void nvme_failover_req(struct request *req, bool is_ana_status)
 {
 	struct nvme_ns *ns = req->q->queuedata;
-	u16 status = nvme_req(req)->status;
 	unsigned long flags;
 
-	switch (status & 0x7ff) {
-	case NVME_SC_ANA_TRANSITION:
-	case NVME_SC_ANA_INACCESSIBLE:
-	case NVME_SC_ANA_PERSISTENT_LOSS:
-		/*
-		 * If we got back an ANA error we know the controller is alive,
-		 * but not ready to serve this namespaces.  The spec suggests
-		 * we should update our general state here, but due to the fact
-		 * that the admin and I/O queues are not serialized that is
-		 * fundamentally racy.  So instead just clear the current path,
-		 * mark the the path as pending and kick of a re-read of the ANA
-		 * log page ASAP.
-		 */
-		nvme_mpath_clear_current_path(ns);
-		if (ns->ctrl->ana_log_buf) {
-			set_bit(NVME_NS_ANA_PENDING, &ns->flags);
-			queue_work(nvme_wq, &ns->ctrl->ana_work);
-		}
-		break;
-	case NVME_SC_HOST_PATH_ERROR:
-	case NVME_SC_HOST_ABORTED_CMD:
-		/*
-		 * Temporary transport disruption in talking to the controller.
-		 * Try to send on a new path.
-		 */
-		nvme_mpath_clear_current_path(ns);
-		break;
-	default:
-		/* This was a non-ANA error so follow the normal error path. */
-		return false;
+	nvme_mpath_clear_current_path(ns);
+
+	/*
+	 * If we got back an ANA error we know the controller is alive, but not
+	 * ready to serve this namespaces.  The spec suggests we should update
+	 * our general state here, but due to the fact that the admin and I/O
+	 * queues are not serialized that is fundamentally racy.  So instead
+	 * just clear the current path, mark the the path as pending and kick
+	 * of a re-read of the ANA log page ASAP.
+	 */
+	if (is_ana_status && ns->ctrl->ana_log_buf) {
+		set_bit(NVME_NS_ANA_PENDING, &ns->flags);
+		queue_work(nvme_wq, &ns->ctrl->ana_work);
 	}
 
 	spin_lock_irqsave(&ns->head->requeue_lock, flags);
 	blk_steal_bios(&ns->head->requeue_list, req);
 	spin_unlock_irqrestore(&ns->head->requeue_lock, flags);
-	blk_mq_end_request(req, 0);
 
+	blk_mq_end_request(req, 0);
 	kblockd_schedule_work(&ns->head->requeue_work);
-	return true;
 }
 
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ebb8c3ed388554..aeff1c491ac2ef 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -629,7 +629,7 @@ void nvme_mpath_wait_freeze(struct nvme_subsystem *subsys);
 void nvme_mpath_start_freeze(struct nvme_subsystem *subsys);
 void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 			struct nvme_ctrl *ctrl, int *flags);
-bool nvme_failover_req(struct request *req);
+void nvme_failover_req(struct request *req, bool is_ana_status);
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl);
 int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head);
 void nvme_mpath_add_disk(struct nvme_ns *ns, struct nvme_id_ns *id);
@@ -688,9 +688,8 @@ static inline void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
 	sprintf(disk_name, "nvme%dn%d", ctrl->instance, ns->head->instance);
 }
 
-static inline bool nvme_failover_req(struct request *req)
+static inline void nvme_failover_req(struct request *req, bool is_ana_status)
 {
-	return false;
 }
 static inline void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
 {

Mike Snitzer Aug. 13, 2020, 7:03 p.m. UTC | #7
On Thu, Aug 13 2020 at  2:43pm -0400,
Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Aug 13, 2020 at 01:47:04PM -0400, Mike Snitzer wrote:
> > This is just a tweak to improve the high-level fault tree of core NVMe
> > error handling.  No functional change, but for such basic errors,
> > avoiding entering nvme_failover_req is meaningful on a code flow level.
> > It makes the code that handles errors needing local retry clearer by
> > being more structured, less circuitous.
> > 
> > It allows NVMe core's handling of such errors to be more explicit and live
> > in core.c rather than multipath.c -- so things like ACRE handling can be
> > made explicitly part of core and not nested under nvme_failover_req's
> > relatively obscure failsafe that returns false for anything it doesn't
> > care about.
> 
> If we're going that way I'd rather do something like the (untested)
> patch below that adds a dispostion function with a function that
> decides it and then just switches on it:

YES!  That is such a huge improvement (certainly on a code clarity
level).  I haven't reviewed or tested the relative performance or
function of before vs after (will do) but I really like this approach.

Thanks,
Mike

Meneghini, John Aug. 14, 2020, 3:23 a.m. UTC | #8
On 8/13/20, 10:48 AM, "Mike Snitzer" <snitzer@redhat.com> wrote:

    Commit 764e9332098c0 ("nvme-multipath: do not reset on unknown
    status"), among other things, fixed NVME_SC_CMD_INTERRUPTED error
    handling by changing multipathing's nvme_failover_req() to short-circuit
    path failover and then fall back to NVMe's normal error handling (which
    takes care of NVME_SC_CMD_INTERRUPTED).

    This detour through native NVMe multipathing code is unwelcome because
    it prevents NVMe core from handling NVME_SC_CMD_INTERRUPTED independently
    of any multipathing concerns.

    Introduce nvme_status_needs_local_error_handling() to prioritize
    non-failover retry, when appropriate, in terms of normal NVMe error
    handling.  nvme_status_needs_local_error_handling() will naturally evolve
    to include handling of any other errors that normal error handling must
    be used for.

How is this any better than blk_path_error()? 

    nvme_failover_req()'s ability to fall back to normal NVMe error handling
    has been preserved because it may be useful for future NVME_SC values that
    nvme_status_needs_local_error_handling() hasn't been trained for yet.

    Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    ---
     drivers/nvme/host/core.c | 16 ++++++++++++++--
     1 file changed, 14 insertions(+), 2 deletions(-)

    diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
    index 88cff309d8e4..be749b690af7 100644
    --- a/drivers/nvme/host/core.c
    +++ b/drivers/nvme/host/core.c
    @@ -252,6 +252,16 @@ static inline bool nvme_req_needs_retry(struct request *req)
            return true;
     }

    +static inline bool nvme_status_needs_local_error_handling(u16 status)
    +{
    +       switch (status & 0x7ff) {
    +       case NVME_SC_CMD_INTERRUPTED:
    +               return true;
    +       default:
    +               return false;
    +       }
    +}

I assume that what you mean by nvme_status_needs_local_error_handling() is that you want the nvme core
driver to handle the command retry.

If this is true, I don't think this function will ever work correctly because, as discussed, whether or
not the command needs to be retried has nothing to do with the NVMe Status Code field itself; it
has to do with the DNR bit, the CRD field, and the Status Code Type field - in that order.

    +
     static void nvme_retry_req(struct request *req)
     {
            struct nvme_ns *ns = req->q->queuedata;
    @@ -270,7 +280,8 @@ static void nvme_retry_req(struct request *req)

     void nvme_complete_rq(struct request *req)
     {
    -       blk_status_t status = nvme_error_status(nvme_req(req)->status);
    +       u16 nvme_status = nvme_req(req)->status;
    +       blk_status_t status = nvme_error_status(nvme_status);

            trace_nvme_complete_rq(req);

    @@ -280,7 +291,8 @@ void nvme_complete_rq(struct request *req)
                    nvme_req(req)->ctrl->comp_seen = true;

            if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
    -               if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
    +               if (!nvme_status_needs_local_error_handling(nvme_status) &&

This defeats the nvme-multipath logic by inserting a second evaluation of the NVMe Status Code into the retry logic.

This is basically another version of blk_path_error().

In fact, in your case REQ_NVME_MPATH is probably not set, so I don't see what difference this would make at all.

/John

    +                   (req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
                            return;

                    if (!blk_queue_dying(req->q)) {
    --
    2.18.0



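For context on the fields John cites: the kernel-side u16 status is the
NVMe CQE status field shifted right by one, so the retry-relevant bits sit
above the 0x7ff mask the patch tests.  A sketch of the layout, with masks
as in include/linux/nvme.h of this era (the helper below is hypothetical,
for discussion only):

/*
 * u16 nvme_req(req)->status layout (CQE status >> 1):
 *   bits  7:0   SC  - Status Code            \  covered by status & 0x7ff
 *   bits 10:8   SCT - Status Code Type       /
 *   bits 12:11  CRD - Command Retry Delay    (NVME_SC_CRD,  0x1800)
 *   bit  13     M   - More                   (NVME_SC_MORE, 0x2000)
 *   bit  14     DNR - Do Not Retry           (NVME_SC_DNR,  0x4000)
 */
static bool nvme_cmd_retryable(u16 status)
{
        /* Evaluate in John's order: DNR first... */
        if (status & NVME_SC_DNR)
                return false;
        /* ...then CRD decides how long to wait, and SCT/SC decide policy. */
        return true;
}

blk_path_error(), by contrast, operates on the translated blk_status_t
rather than on these NVMe-level bits.
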
Meneghini, John Aug. 14, 2020, 4:26 a.m. UTC | #9
On 8/13/20, 2:44 PM, "Christoph Hellwig" <hch@infradead.org> wrote:

    On Thu, Aug 13, 2020 at 01:47:04PM -0400, Mike Snitzer wrote:
    > This is just a tweak to improve the high-level fault tree of core NVMe
    > error handling.  No functional change, but for such basic errors,
    > avoiding entering nvme_failover_req is meaningful on a code flow level.
    > It makes the code that handles errors needing local retry clearer by
    > being more structured, less circuitous.
  
I don't understand how entering nvme_failover_req() is circuitous. 

This code path is only taken if REQ_NVME_MPATH is set, which - unless I am mistaken - will not
be set in the case that you care about.

    > It allows NVMe core's handling of such errors to be more explicit and live
    > in core.c rather than multipath.c -- so things like ACRE handling can be
    > made explicitly part of core and not nested under nvme_failover_req's
    > relatively obscure failsafe that returns false for anything it doesn't
    > care about.

The ACRE handling is already explicitly a part of the core.  I don't understand what
you are after here, Mike.  Are you saying that you don't want the ACRE code to run
when REQ_NVME_MPATH is clear?

    If we're going that way I'd rather do something like the (untested)
    patch below that adds a disposition, with a function that decides it,
    and then just switches on it:

Christoph, it looks like you've moved a lot of stuff around here, with no actual
functional change... but it's really hard for me to tell.  Please be sure to cc me if this
becomes a real patch.

How does your patch solve the problem of making dm-multipath work with command retries?

Mike, do you want the nvme-core driver to retry commands on the same path, with CRD, for the dm-multipath
use case... or are you looking for a different treatment of REQ_FAILFAST_DEV... or what? 

Maybe I'm not seeing it.

/John

Sagi Grimberg Aug. 14, 2020, 6:53 a.m. UTC | #10
> +static inline enum nvme_disposition nvme_req_disposition(struct request *req)
> +{
> +	if (likely(nvme_req(req)->status == 0))
> +		return COMPLETE;
> +
> +	if (blk_noretry_request(req) ||
> +	    (nvme_req(req)->status & NVME_SC_DNR) ||
> +	    nvme_req(req)->retries >= nvme_max_retries)
> +		return COMPLETE;
> +
> +	if (req->cmd_flags & REQ_NVME_MPATH) {
> +		switch (nvme_req(req)->status & 0x7ff) {
> +		case NVME_SC_ANA_TRANSITION:
> +		case NVME_SC_ANA_INACCESSIBLE:
> +		case NVME_SC_ANA_PERSISTENT_LOSS:
> +			return REDIRECT_ANA;
> +		case NVME_SC_HOST_PATH_ERROR:
> +		case NVME_SC_HOST_ABORTED_CMD:
> +			return REDIRECT_TMP;
> +		}
> +	}
> +
> +	if (blk_queue_dying(req->q))
> +		return COMPLETE;
> +	return RETRY;
> +}
> +
> +static inline void nvme_complete_req(struct request *req)
>   {
>   	blk_status_t status = nvme_error_status(nvme_req(req)->status);
>   
> -	trace_nvme_complete_rq(req);
> +	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
> +	    req_op(req) == REQ_OP_ZONE_APPEND)
> +		req->__sector = nvme_lba_to_sect(req->q->queuedata,
> +			le64_to_cpu(nvme_req(req)->result.u64));
> +
> +	nvme_trace_bio_complete(req, status);
> +	blk_mq_end_request(req, status);
> +}
>   
> +void nvme_complete_rq(struct request *req)
> +{
> +	trace_nvme_complete_rq(req);
>   	nvme_cleanup_cmd(req);
>   
>   	if (nvme_req(req)->ctrl->kas)
>   		nvme_req(req)->ctrl->comp_seen = true;
>   
> -	if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
> -		if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
> -			return;
> -
> -		if (!blk_queue_dying(req->q)) {
> -			nvme_retry_req(req);
> -			return;
> -		}
> -	} else if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
> -		   req_op(req) == REQ_OP_ZONE_APPEND) {
> -		req->__sector = nvme_lba_to_sect(req->q->queuedata,
> -			le64_to_cpu(nvme_req(req)->result.u64));
> +	switch (nvme_req_disposition(req)) {
> +	case COMPLETE:
> +		nvme_complete_req(req);

nvme_complete_rq calling nvme_complete_req... Maybe call it
__nvme_complete_rq instead?

> +		return;
> +	case RETRY:
> +		nvme_retry_req(req);
> +		return;
> +	case REDIRECT_ANA:
> +		nvme_failover_req(req, true);
> +		return;
> +	case REDIRECT_TMP:
> +		nvme_failover_req(req, false);
> +		return;
>   	}
> -
> -	nvme_trace_bio_complete(req, status);
> -	blk_mq_end_request(req, status);
>   }
>   EXPORT_SYMBOL_GPL(nvme_complete_rq);
>   
> diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
> index 3ded54d2c9c6ad..0c22b2c88687a2 100644
> --- a/drivers/nvme/host/multipath.c
> +++ b/drivers/nvme/host/multipath.c
> @@ -65,51 +65,32 @@ void nvme_set_disk_name(char *disk_name, struct nvme_ns *ns,
>   	}
>   }
>   
> -bool nvme_failover_req(struct request *req)
> +void nvme_failover_req(struct request *req, bool is_ana_status)
>   {
>   	struct nvme_ns *ns = req->q->queuedata;
> -	u16 status = nvme_req(req)->status;
>   	unsigned long flags;
>   
> -	switch (status & 0x7ff) {
> -	case NVME_SC_ANA_TRANSITION:
> -	case NVME_SC_ANA_INACCESSIBLE:
> -	case NVME_SC_ANA_PERSISTENT_LOSS:
> -		/*
> -		 * If we got back an ANA error we know the controller is alive,
> -		 * but not ready to serve this namespaces.  The spec suggests
> -		 * we should update our general state here, but due to the fact
> -		 * that the admin and I/O queues are not serialized that is
> -		 * fundamentally racy.  So instead just clear the current path,
> -		 * mark the the path as pending and kick of a re-read of the ANA
> -		 * log page ASAP.
> -		 */
> -		nvme_mpath_clear_current_path(ns);
> -		if (ns->ctrl->ana_log_buf) {
> -			set_bit(NVME_NS_ANA_PENDING, &ns->flags);
> -			queue_work(nvme_wq, &ns->ctrl->ana_work);
> -		}
> -		break;
> -	case NVME_SC_HOST_PATH_ERROR:
> -	case NVME_SC_HOST_ABORTED_CMD:
> -		/*
> -		 * Temporary transport disruption in talking to the controller.
> -		 * Try to send on a new path.
> -		 */
> -		nvme_mpath_clear_current_path(ns);
> -		break;
> -	default:
> -		/* This was a non-ANA error so follow the normal error path. */
> -		return false;
> +	nvme_mpath_clear_current_path(ns);
> +
> +	/*
> +	 * If we got back an ANA error we know the controller is alive, but not
> +	 * ready to serve this namespaces.  The spec suggests we should update
> +	 * our general state here, but due to the fact that the admin and I/O
> +	 * queues are not serialized that is fundamentally racy.  So instead
> +	 * just clear the current path, mark the the path as pending and kick
> +	 * of a re-read of the ANA log page ASAP.
> +	 */
> +	if (is_ana_status && ns->ctrl->ana_log_buf) {

Maybe call nvme_req_disposition again locally here to not carry
the is_ana_status. But not a biggie.

Overall this looks good, I think.

Christoph Hellwig Aug. 14, 2020, 6:55 a.m. UTC | #11
On Thu, Aug 13, 2020 at 11:53:14PM -0700, Sagi Grimberg wrote:
> > +	switch (nvme_req_disposition(req)) {
> > +	case COMPLETE:
> > +		nvme_complete_req(req);
> 
> nvme_complete_rq calling nvme_complete_req... Maybe call it
> __nvme_complete_rq instead?

That's what I had first, but it felt so strangely out of place next
to the other nvme_*_req calls.

Maybe nvme_complete_rq needs to be renamed - what about nvme_req_done?

> Maybe call nvme_req_disposition again locally here to not carry
> the is_ana_status. But not a biggy..

That means it would have to become non-static in scope, limiting
inlining possibilities, etc.

Sagi Grimberg Aug. 14, 2020, 7:02 a.m. UTC | #12
>>> +	switch (nvme_req_disposition(req)) {
>>> +	case COMPLETE:
>>> +		nvme_complete_req(req);
>>
>> nvme_complete_rq calling nvme_complete_req... Maybe call it
>> __nvme_complete_rq instead?
> 
> That's what I had first, but it felt so strangely out of place next
> to the other nvme_*_req calls.
> 
> Maybe nvme_complete_rq needs to be renamed - what about nvme_req_done?

I'd suggest calling nvme_complete_rq nvme_end_rq, because it really calls
blk_mq_end_request. That could be confused with nvme_end_request, which
calls blk_mq_complete_request... Maybe we need some naming improvements
throughout.

>> Maybe call nvme_req_disposition again locally here to not carry
>> the is_ana_status. But not a biggie.
> 
> That means it would have to become non-static in scope, limiting
> inlining possibilities, etc.

I see.


Patch

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 88cff309d8e4..be749b690af7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -252,6 +252,16 @@  static inline bool nvme_req_needs_retry(struct request *req)
 	return true;
 }
 
+static inline bool nvme_status_needs_local_error_handling(u16 status)
+{
+	switch (status & 0x7ff) {
+	case NVME_SC_CMD_INTERRUPTED:
+		return true;
+	default:
+		return false;
+	}
+}
+
 static void nvme_retry_req(struct request *req)
 {
 	struct nvme_ns *ns = req->q->queuedata;
@@ -270,7 +280,8 @@  static void nvme_retry_req(struct request *req)
 
 void nvme_complete_rq(struct request *req)
 {
-	blk_status_t status = nvme_error_status(nvme_req(req)->status);
+	u16 nvme_status = nvme_req(req)->status;
+	blk_status_t status = nvme_error_status(nvme_status);
 
 	trace_nvme_complete_rq(req);
 
@@ -280,7 +291,8 @@  void nvme_complete_rq(struct request *req)
 		nvme_req(req)->ctrl->comp_seen = true;
 
 	if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
-		if ((req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
+		if (!nvme_status_needs_local_error_handling(nvme_status) &&
+		    (req->cmd_flags & REQ_NVME_MPATH) && nvme_failover_req(req))
 			return;
 
 		if (!blk_queue_dying(req->q)) {