[1/6] nvme: don't disable local ints for polled queue

Message ID 20181110151317.3813-2-axboe@kernel.dk (mailing list archive)
State: New, archived
Series: Various block optimizations

Commit Message

Jens Axboe Nov. 10, 2018, 3:13 p.m. UTC
A polled queue doesn't trigger interrupts, so it's always safe
to grab the queue lock without disabling interrupts.

Cc: Keith Busch <keith.busch@intel.com>
Cc: linux-nvme@lists.infradead.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)
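
For context on why disabling interrupts matters at all: a plain
spin_lock() on a lock that an interrupt handler also takes can deadlock
if the irq fires on the same CPU while the lock is held, because the
handler then spins on a lock its own CPU owns.  spin_lock_irq() avoids
that by masking local interrupts first.  A polled queue has no irq
handler touching cq_lock, which is what makes the plain lock safe here.
A minimal generic sketch of the hazard (hypothetical names, not driver
code):

	#include <linux/interrupt.h>
	#include <linux/spinlock.h>

	static DEFINE_SPINLOCK(demo_lock);	/* also taken from hard-irq context */

	static irqreturn_t demo_irq(int irq, void *data)
	{
		spin_lock(&demo_lock);	/* spins forever if this CPU already holds it */
		/* ... process completions ... */
		spin_unlock(&demo_lock);
		return IRQ_HANDLED;
	}

	static void demo_process(void)
	{
		/* BUG: must be spin_lock_irq() if demo_irq() can fire here */
		spin_lock(&demo_lock);
		/* ... critical section ... */
		spin_unlock(&demo_lock);
	}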

Comments

Christoph Hellwig Nov. 14, 2018, 8:43 a.m. UTC | #1
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 6aa86dfcb32c..a6e3fbddfadf 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1061,15 +1061,26 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
>  
>  static int __nvme_poll(struct nvme_queue *nvmeq, unsigned int tag)
>  {
> +	unsigned long flags = 0; /* gcc 7.x fail */
>  	u16 start, end;
> -	bool found;
> +	bool found, has_irq;
>  
>  	if (!nvme_cqe_pending(nvmeq))
>  		return 0;
>  
> -	spin_lock_irq(&nvmeq->cq_lock);
> +	/*
> +	 * Polled queue doesn't have an IRQ, no need to disable ints
> +	 */
> +	has_irq = !nvmeq->polled;
> +	if (has_irq)
> +		local_irq_save(flags);
> +
> +	spin_lock(&nvmeq->cq_lock);
>  	found = nvme_process_cq(nvmeq, &start, &end, tag);
> -	spin_unlock_irq(&nvmeq->cq_lock);
> +	spin_unlock(&nvmeq->cq_lock);
> +
> +	if (has_irq)
> +		local_irq_restore(flags);

Eww, this just looks ugly.  Given that we are micro-optimizing here,
I'd rather just use a different ->poll implementation, and thus a
different blk_mq_ops, if we have separate poll code.  Then we could
leave the existing __nvme_poll as-is and have a new nvme_poll_noirq,
something like this:

static int nvme_poll_noirq(struct blk_mq_hw_ctx *hctx, unsigned int tag)
{
	struct nvme_queue *nvmeq = hctx->driver_data;
	u16 start, end;
	bool found;

	if (!nvme_cqe_pending(nvmeq))
		return 0;

	spin_lock(&nvmeq->cq_lock);
	found = nvme_process_cq(nvmeq, &start, &end, tag);
	spin_unlock(&nvmeq->cq_lock);

	nvme_complete_cqes(nvmeq, start, end);
	return found;
}

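For reference, wiring that up would mean a second blk_mq_ops table that
differs only in ->poll, registered for the tagset when separate poll
queues are in use.  A rough sketch, reusing the callbacks already in
pci.c (the _noirq table itself is hypothetical at this point):

	static const struct blk_mq_ops nvme_mq_ops = {
		.queue_rq	= nvme_queue_rq,
		.complete	= nvme_pci_complete_rq,
		.init_hctx	= nvme_init_hctx,
		.init_request	= nvme_init_request,
		.map_queues	= nvme_pci_map_queues,
		.timeout	= nvme_timeout,
		.poll		= nvme_poll,
	};

	static const struct blk_mq_ops nvme_mq_poll_noirq_ops = {
		.queue_rq	= nvme_queue_rq,
		.complete	= nvme_pci_complete_rq,
		.init_hctx	= nvme_init_hctx,
		.init_request	= nvme_init_request,
		.map_queues	= nvme_pci_map_queues,
		.timeout	= nvme_timeout,
		.poll		= nvme_poll_noirq,	/* the only difference */
	};
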
And while we are at it: I think that with separate queues set up for
polling, the irq-driven queues might not even need to take the CQ
lock.  Which might be an argument for only allowing polling if we
have the separate queues, just to keep everything simple.
Max Gurtovoy Nov. 14, 2018, 10:18 a.m. UTC | #2
On 11/10/2018 5:13 PM, Jens Axboe wrote:
> A polled queue doesn't trigger interrupts, so it's always safe
> to grab the queue lock without disabling interrupts.


Jens,

Can you share the performance benefit of this change?



> Cc: Keith Busch <keith.busch@intel.com>
> Cc: linux-nvme@lists.infradead.org
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/nvme/host/pci.c | 17 ++++++++++++++---
>   1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 6aa86dfcb32c..a6e3fbddfadf 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1061,15 +1061,26 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
>   
>   static int __nvme_poll(struct nvme_queue *nvmeq, unsigned int tag)
>   {
> +	unsigned long flags = 0; /* gcc 7.x fail */
>   	u16 start, end;
> -	bool found;
> +	bool found, has_irq;
>   
>   	if (!nvme_cqe_pending(nvmeq))
>   		return 0;
>   
> -	spin_lock_irq(&nvmeq->cq_lock);
> +	/*
> +	 * Polled queue doesn't have an IRQ, no need to disable ints
> +	 */
> +	has_irq = !nvmeq->polled;
> +	if (has_irq)
> +		local_irq_save(flags);
> +
> +	spin_lock(&nvmeq->cq_lock);
>   	found = nvme_process_cq(nvmeq, &start, &end, tag);
> -	spin_unlock_irq(&nvmeq->cq_lock);
> +	spin_unlock(&nvmeq->cq_lock);
> +
> +	if (has_irq)
> +		local_irq_restore(flags);
>   
>   	nvme_complete_cqes(nvmeq, start, end);
>   	return found;
Jens Axboe Nov. 14, 2018, 3:31 p.m. UTC | #3
On 11/14/18 1:43 AM, Christoph Hellwig wrote:
>> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
>> index 6aa86dfcb32c..a6e3fbddfadf 100644
>> --- a/drivers/nvme/host/pci.c
>> +++ b/drivers/nvme/host/pci.c
>> @@ -1061,15 +1061,26 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
>>  
>>  static int __nvme_poll(struct nvme_queue *nvmeq, unsigned int tag)
>>  {
>> +	unsigned long flags = 0; /* gcc 7.x fail */
>>  	u16 start, end;
>> -	bool found;
>> +	bool found, has_irq;
>>  
>>  	if (!nvme_cqe_pending(nvmeq))
>>  		return 0;
>>  
>> -	spin_lock_irq(&nvmeq->cq_lock);
>> +	/*
>> +	 * Polled queue doesn't have an IRQ, no need to disable ints
>> +	 */
>> +	has_irq = !nvmeq->polled;
>> +	if (has_irq)
>> +		local_irq_save(flags);
>> +
>> +	spin_lock(&nvmeq->cq_lock);
>>  	found = nvme_process_cq(nvmeq, &start, &end, tag);
>> -	spin_unlock_irq(&nvmeq->cq_lock);
>> +	spin_unlock(&nvmeq->cq_lock);
>> +
>> +	if (has_irq)
>> +		local_irq_restore(flags);
> 
> Eww, this just looks ugly.  Given that we are micro-optimizing here,
> I'd rather just use a different ->poll implementation, and thus a
> different blk_mq_ops, if we have separate poll code.  Then we could
> leave the existing __nvme_poll as-is and have a new nvme_poll_noirq,
> something like this:

Yeah good point, gets rid of those branches too. I'll make that change.

> And while we are at it: I think that with separate queues set up for
> polling, the irq-driven queues might not even need to take the CQ
> lock.  Which might be an argument for only allowing polling if we
> have the separate queues, just to keep everything simple.

We can definitely move in that direction, I'll take a look at what
is feasible.
Keith Busch Nov. 14, 2018, 3:37 p.m. UTC | #4
On Wed, Nov 14, 2018 at 12:43:22AM -0800, Christoph Hellwig wrote:
> static int nvme_poll_noirq(struct blk_mq_hw_ctx *hctx, unsigned int tag)
> {
> 	struct nvme_queue *nvmeq = hctx->driver_data;
> 	u16 start, end;
> 	bool found;
>
> 	if (!nvme_cqe_pending(nvmeq))
> 		return 0;
>
> 	spin_lock(&nvmeq->cq_lock);
> 	found = nvme_process_cq(nvmeq, &start, &end, tag);
> 	spin_unlock(&nvmeq->cq_lock);
>
> 	nvme_complete_cqes(nvmeq, start, end);
> 	return found;
> }
> 
> And while we are at it: I think that with separate queues set up for
> polling, the irq-driven queues might not even need to take the CQ
> lock.  Which might be an argument for only allowing polling if we
> have the separate queues, just to keep everything simple.

That's a pretty cool observation. We still poll interrupt-driven queues
in the timeout path as a sanity check (it really has helped in debugging
timeout issues), but we can temporarily disable the cq's irq and be
lockless.
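
A sketch of that idea, assuming the queue's interrupt vector can simply
be masked around the poll (function name and exact placement
hypothetical; the helpers are the ones used in the patch above):

	static bool nvme_poll_irqoff(struct nvme_queue *nvmeq, unsigned int tag)
	{
		struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
		u16 start, end;
		bool found;

		/*
		 * With the vector masked the irq handler cannot run, so
		 * nothing else touches this CQ and cq_lock can be skipped.
		 */
		disable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
		found = nvme_process_cq(nvmeq, &start, &end, tag);
		enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));

		nvme_complete_cqes(nvmeq, start, end);
		return found;
	}

Since disable_irq() waits for a running handler to finish, this also
closes the race with an nvme_irq() already in flight on another CPU.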

Patch

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6aa86dfcb32c..a6e3fbddfadf 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1061,15 +1061,26 @@  static irqreturn_t nvme_irq_check(int irq, void *data)
 
 static int __nvme_poll(struct nvme_queue *nvmeq, unsigned int tag)
 {
+	unsigned long flags = 0; /* gcc 7.x fail */
 	u16 start, end;
-	bool found;
+	bool found, has_irq;
 
 	if (!nvme_cqe_pending(nvmeq))
 		return 0;
 
-	spin_lock_irq(&nvmeq->cq_lock);
+	/*
+	 * Polled queue doesn't have an IRQ, no need to disable ints
+	 */
+	has_irq = !nvmeq->polled;
+	if (has_irq)
+		local_irq_save(flags);
+
+	spin_lock(&nvmeq->cq_lock);
 	found = nvme_process_cq(nvmeq, &start, &end, tag);
-	spin_unlock_irq(&nvmeq->cq_lock);
+	spin_unlock(&nvmeq->cq_lock);
+
+	if (has_irq)
+		local_irq_restore(flags);
 
 	nvme_complete_cqes(nvmeq, start, end);
 	return found;