[v1,1/1] nvme: complete directly for hctx with only one ctx mapping

Message ID 20230530024120.17196-1-powen.kao@mediatek.com (mailing list archive)
State New, archived
Series [v1,1/1] nvme: complete directly for hctx with only one ctx mapping

Commit Message

Po-Wen Kao May 30, 2023, 2:41 a.m. UTC
From: Ed Tsai <ed.tsai@mediatek.com>

Refer to commit f168420c62e7 ("blk-mq: don't redirect completion for
hctx withs only one ctx mapping"): when nvme applies a 1:1 mapping of
hctx and ctx, there is no remote request to redirect.

For UFS, however, the submission and completion queues can be
asymmetric (e.g. multiple SQs sharing one CQ), so a 1:1 mapping of hctx
and ctx does not guarantee that a request completes on the submitting
CPU. In that situation, keeping this condition in the block layer can
violate QUEUE_FLAG_SAME_FORCE, so move the check back into nvme.
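
For reference, the block-layer check that QUEUE_FLAG_SAME_FORCE drives
looks roughly like this (a trimmed paraphrase of
blk_mq_complete_need_ipi(); details vary across kernel versions):

	static inline bool blk_mq_complete_need_ipi(struct request *rq)
	{
		int cpu = raw_smp_processor_id();

		if (!IS_ENABLED(CONFIG_SMP) ||
		    !test_bit(QUEUE_FLAG_SAME_COMP, &rq->q->queue_flags))
			return false;

		/*
		 * Same CPU, or same cache domain without SAME_FORCE:
		 * complete locally.
		 */
		if (cpu == rq->mq_ctx->cpu ||
		    (!test_bit(QUEUE_FLAG_SAME_FORCE, &rq->q->queue_flags) &&
		     cpus_share_cache(cpu, rq->mq_ctx->cpu)))
			return false;

		/* Don't try to IPI an offline CPU. */
		return cpu_online(rq->mq_ctx->cpu);
	}

With SAME_FORCE set, any completion arriving on a CPU other than the
submitting one must be redirected; an unconditional nr_ctx == 1 early
return in blk_mq_complete_request_remote() skips that redirection.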

Signed-off-by: Ed Tsai <ed.tsai@mediatek.com>
Signed-off-by: Po-Wen Kao <powen.kao@mediatek.com>
---
 block/blk-mq.c           | 8 +++-----
 drivers/nvme/host/nvme.h | 4 ++++
 2 files changed, 7 insertions(+), 5 deletions(-)

Comments

Stanley Jhu May 30, 2023, 2:49 a.m. UTC | #1
On Tue, May 30, 2023 at 10:45 AM Po-Wen Kao <powen.kao@mediatek.com> wrote:
>
> From: Ed Tsai <ed.tsai@mediatek.com>
>
> Refer to commit f168420c62e7 ("blk-mq: don't redirect completion for
> hctx withs only one ctx mapping"): when nvme applies a 1:1 mapping of
> hctx and ctx, there is no remote request to redirect.
>
> For UFS, however, the submission and completion queues can be
> asymmetric (e.g. multiple SQs sharing one CQ), so a 1:1 mapping of
> hctx and ctx does not guarantee that a request completes on the
> submitting CPU. In that situation, keeping this condition in the block
> layer can violate QUEUE_FLAG_SAME_FORCE, so move the check back into
> nvme.
>
> Signed-off-by: Ed Tsai <ed.tsai@mediatek.com>
> Signed-off-by: Po-Wen Kao <powen.kao@mediatek.com>

Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Keith Busch May 30, 2023, 5:45 p.m. UTC | #2
On Tue, May 30, 2023 at 10:41:19AM +0800, Po-Wen Kao wrote:
> ---
>  block/blk-mq.c           | 8 +++-----
>  drivers/nvme/host/nvme.h | 4 ++++
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 1749f5890606..b60c78f5ad46 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1181,12 +1181,10 @@ bool blk_mq_complete_request_remote(struct request *rq)
>  	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
>  
>  	/*
> -	 * For request which hctx has only one ctx mapping,
> -	 * or a polled request, always complete locally,
> -	 * it's pointless to redirect the completion.
> +	 * For a polled request, always complete locally, it's pointless
> +	 * to redirect the completion.
>  	 */
> -	if (rq->mq_hctx->nr_ctx == 1 ||
> -		rq->cmd_flags & REQ_POLLED)
> +	if (rq->cmd_flags & REQ_POLLED)
>  		return false;
>  
>  	if (blk_mq_complete_need_ipi(rq)) {
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 7cf8e44d135e..acc9b1ce071d 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -702,6 +702,10 @@ static inline bool nvme_try_complete_req(struct request *req, __le16 status,
>  	nvme_should_fail(req);
>  	if (unlikely(blk_should_fake_timeout(req->q)))
>  		return true;
> +	if (likely(req->mq_hctx->nr_ctx == 1)) {
> +		WRITE_ONCE(req->state, MQ_RQ_COMPLETE);
> +		return false;
> +	}

I don't think we want low level drivers directly messing with blk-mq
request state.

Is the early nr_ctx check optimisation really worth it? Would the
following work for your use case?

---
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f6dad0886a2fa..a2d65bb127e29 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1176,7 +1176,8 @@ bool blk_mq_complete_request_remote(struct request *rq)
         * or a polled request, always complete locally,
         * it's pointless to redirect the completion.
         */
-       if (rq->mq_hctx->nr_ctx == 1 ||
+       if ((rq->mq_hctx->nr_ctx == 1 &&
+            rq->mq_ctx->cpu == raw_smp_processor_id()) ||
                rq->cmd_flags & REQ_POLLED)
                return false;
--
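
Taken together, the resulting check at the top of
blk_mq_complete_request_remote() would read roughly as follows (a
sketch assembled from the hunk above, not verified against any tree):

	/*
	 * Complete locally when the hctx has a single ctx mapping and we
	 * are already on the submitting CPU, or when the request is
	 * polled; redirecting the completion is pointless in both cases.
	 */
	if ((rq->mq_hctx->nr_ctx == 1 &&
	     rq->mq_ctx->cpu == raw_smp_processor_id()) ||
	    rq->cmd_flags & REQ_POLLED)
		return false;

The extra raw_smp_processor_id() comparison limits the shortcut to
completions that already run on the submitting CPU, so a driver whose
completions arrive elsewhere still falls through to
blk_mq_complete_need_ipi() and honors QUEUE_FLAG_SAME_FORCE.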
Ed Tsai (蔡宗軒) May 31, 2023, 1:14 a.m. UTC | #3
On Tue, 2023-05-30 at 11:45 -0600, Keith Busch wrote:
>  On Tue, May 30, 2023 at 10:41:19AM +0800, Po-Wen Kao wrote:
> > ---
> >  block/blk-mq.c           | 8 +++-----
> >  drivers/nvme/host/nvme.h | 4 ++++
> >  2 files changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 1749f5890606..b60c78f5ad46 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1181,12 +1181,10 @@ bool blk_mq_complete_request_remote(struct request *rq)
> >  	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
> >  
> >  	/*
> > -	 * For request which hctx has only one ctx mapping,
> > -	 * or a polled request, always complete locally,
> > -	 * it's pointless to redirect the completion.
> > +	 * For a polled request, always complete locally, it's pointless
> > +	 * to redirect the completion.
> >  	 */
> > -	if (rq->mq_hctx->nr_ctx == 1 ||
> > -		rq->cmd_flags & REQ_POLLED)
> > +	if (rq->cmd_flags & REQ_POLLED)
> >  		return false;
> >  
> >  	if (blk_mq_complete_need_ipi(rq)) {
> > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> > index 7cf8e44d135e..acc9b1ce071d 100644
> > --- a/drivers/nvme/host/nvme.h
> > +++ b/drivers/nvme/host/nvme.h
> > @@ -702,6 +702,10 @@ static inline bool nvme_try_complete_req(struct request *req, __le16 status,
> >  	nvme_should_fail(req);
> >  	if (unlikely(blk_should_fake_timeout(req->q)))
> >  		return true;
> > +	if (likely(req->mq_hctx->nr_ctx == 1)) {
> > +		WRITE_ONCE(req->state, MQ_RQ_COMPLETE);
> > +		return false;
> > +	}
> 
> I don't think we want low level drivers directly messing with blk-mq
> request state.
> 
> Is the early nr_ctx check optimisation really worth it? Would the
> following work for your use case?

Referring to the original discussion:

https://lore.kernel.org/lkml/1663432858-99743-1-git-send-email-liusong@linux.alibaba.com/

This seems to be exactly what nvme wants to optimize, so I put it back
into nvme. Otherwise, we could simply remove the nr_ctx check from the
block layer, because the submission and completion queues can be
asymmetric in a low-level driver.
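
Concretely, with UFS MCQ a topology like the following is possible
(hypothetical numbers, for illustration only):

	SQ0 (CPU0) --\
	SQ1 (CPU1) ---+--> CQ0 (IRQ affine to CPU2)
	SQ2 (CPU2) --/

A request submitted on CPU0 then raises its completion interrupt on
CPU2, so an unconditional nr_ctx == 1 local completion would run on the
wrong CPU and break the QUEUE_FLAG_SAME_FORCE guarantee.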
Ed Tsai (蔡宗軒) May 31, 2023, 1:32 a.m. UTC | #4
On Tue, 2023-05-30 at 11:45 -0600, Keith Busch wrote:
>  On Tue, May 30, 2023 at 10:41:19AM +0800, Po-Wen Kao wrote:
> > ---
> >  block/blk-mq.c           | 8 +++-----
> >  drivers/nvme/host/nvme.h | 4 ++++
> >  2 files changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index 1749f5890606..b60c78f5ad46 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1181,12 +1181,10 @@ bool blk_mq_complete_request_remote(struct request *rq)
> >  	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
> >  
> >  	/*
> > -	 * For request which hctx has only one ctx mapping,
> > -	 * or a polled request, always complete locally,
> > -	 * it's pointless to redirect the completion.
> > +	 * For a polled request, always complete locally, it's pointless
> > +	 * to redirect the completion.
> >  	 */
> > -	if (rq->mq_hctx->nr_ctx == 1 ||
> > -		rq->cmd_flags & REQ_POLLED)
> > +	if (rq->cmd_flags & REQ_POLLED)
> >  		return false;
> >  
> >  	if (blk_mq_complete_need_ipi(rq)) {
> > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> > index 7cf8e44d135e..acc9b1ce071d 100644
> > --- a/drivers/nvme/host/nvme.h
> > +++ b/drivers/nvme/host/nvme.h
> > @@ -702,6 +702,10 @@ static inline bool nvme_try_complete_req(struct request *req, __le16 status,
> >  	nvme_should_fail(req);
> >  	if (unlikely(blk_should_fake_timeout(req->q)))
> >  		return true;
> > +	if (likely(req->mq_hctx->nr_ctx == 1)) {
> > +		WRITE_ONCE(req->state, MQ_RQ_COMPLETE);
> > +		return false;
> > +	}
> 
> I don't think we want low level drivers directly messing with blk-mq
> request state.
> 
> Is the early nr_ctx check optimisation really worth it? Would the
> following work for your use case?
> 
> ---
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f6dad0886a2fa..a2d65bb127e29 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1176,7 +1176,8 @@ bool blk_mq_complete_request_remote(struct request *rq)
>          * or a polled request, always complete locally,
>          * it's pointless to redirect the completion.
>          */
> -       if (rq->mq_hctx->nr_ctx == 1 ||
> +       if ((rq->mq_hctx->nr_ctx == 1 &&
> +            rq->mq_ctx->cpu == raw_smp_processor_id()) ||
>                 rq->cmd_flags & REQ_POLLED)
>                 return false;
> --

Sorry, I missed this part.
It looks good to me and I will update it later.

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1749f5890606..b60c78f5ad46 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1181,12 +1181,10 @@ bool blk_mq_complete_request_remote(struct request *rq)
 	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
 
 	/*
-	 * For request which hctx has only one ctx mapping,
-	 * or a polled request, always complete locally,
-	 * it's pointless to redirect the completion.
+	 * For a polled request, always complete locally, it's pointless
+	 * to redirect the completion.
 	 */
-	if (rq->mq_hctx->nr_ctx == 1 ||
-		rq->cmd_flags & REQ_POLLED)
+	if (rq->cmd_flags & REQ_POLLED)
 		return false;
 
 	if (blk_mq_complete_need_ipi(rq)) {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 7cf8e44d135e..acc9b1ce071d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -702,6 +702,10 @@ static inline bool nvme_try_complete_req(struct request *req, __le16 status,
 	nvme_should_fail(req);
 	if (unlikely(blk_should_fake_timeout(req->q)))
 		return true;
+	if (likely(req->mq_hctx->nr_ctx == 1)) {
+		WRITE_ONCE(req->state, MQ_RQ_COMPLETE);
+		return false;
+	}
 	return blk_mq_complete_request_remote(req);
 }
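
For context, a sketch of how the PCIe driver consumes this return value
(paraphrased from the CQE handling path in drivers/nvme/host/pci.c; the
exact upstream call site also involves completion batching):

	/*
	 * When nvme_try_complete_req() returns false, the driver
	 * completes the request itself on the current CPU. That is why
	 * the nr_ctx == 1 early return above has to set MQ_RQ_COMPLETE
	 * itself, a state transition normally owned by
	 * blk_mq_complete_request_remote(), which is the layering
	 * concern raised in comment #2.
	 */
	if (!nvme_try_complete_req(req, cqe->status, cqe->result))
		nvme_pci_complete_rq(req);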