diff mbox

[RFC] nvme: always return IRQ_HANDLED

Message ID 20170817202945.GB21397@localhost.localdomain (mailing list archive)
State New, archived
Headers show

Commit Message

Keith Busch Aug. 17, 2017, 8:29 p.m. UTC
On Thu, Aug 17, 2017 at 02:17:08PM -0600, Jens Axboe wrote:
> On 08/17/2017 02:15 PM, Keith Busch wrote:
> > On Thu, Aug 17, 2017 at 01:32:20PM -0600, Jens Axboe wrote:
> >> We currently have an issue with nvme when polling is used. Just
> >> ran some testing on 4.13-rc5, and it's trivial to trigger an IRQ
> >> disable ala:
> >>
> >> [   52.412851] irq 77: nobody cared (try booting with the "irqpoll" option)
> >> [   52.415310] irq 70: nobody cared (try booting with the "irqpoll" option)
> >>
> >> when running a few processes polling. The reason is pretty obvious - if
> >> we're effective at polling, the triggered IRQ will never find any
> >> events. If this happens enough times in a row, the kernel disables our
> >> IRQ since we keep returning IRQ_NONE.
> > 
> > If you're seeing IRQ_NONE returned, the NVMe driver didn't poll any
> > completions since the last time nvme_irq was called. The cqe_seen on
> > polled compeletions is sticky until the IRQ handler is run, in which
> > case it returns IRQ_HANDLED even when no completions were handled during
> > that interrupt.
> > 
> > The only way it should be able to return IRQ_NONE is if no completions
> > were observed (polled or otherwise) since the last time the IRQ handler
> > was called.
> 
> The polling do not update the cqe_seen. So it's possible that every time
> the IRQ handler does trigger, there are no entries found. Maybe a better
> or simpler fix would be to have the polling set cqe_seen to true, and
> leave the clearing to the interrupt handler as is done now.

Oops, that looks like a mistake. I don't think we want to suppress
spurious interrupt detection, though. How about this patch to set cq_seen
on polling like we used to have?

Fixes: 920d13a884 ("nvme-pci: factor out the cqe reading mechanics from __nvme_process_cq")

---
--

Comments

Jens Axboe Aug. 17, 2017, 8:25 p.m. UTC | #1
On 08/17/2017 02:29 PM, Keith Busch wrote:
> On Thu, Aug 17, 2017 at 02:17:08PM -0600, Jens Axboe wrote:
>> On 08/17/2017 02:15 PM, Keith Busch wrote:
>>> On Thu, Aug 17, 2017 at 01:32:20PM -0600, Jens Axboe wrote:
>>>> We currently have an issue with nvme when polling is used. Just
>>>> ran some testing on 4.13-rc5, and it's trivial to trigger an IRQ
>>>> disable ala:
>>>>
>>>> [   52.412851] irq 77: nobody cared (try booting with the "irqpoll" option)
>>>> [   52.415310] irq 70: nobody cared (try booting with the "irqpoll" option)
>>>>
>>>> when running a few processes polling. The reason is pretty obvious - if
>>>> we're effective at polling, the triggered IRQ will never find any
>>>> events. If this happens enough times in a row, the kernel disables our
>>>> IRQ since we keep returning IRQ_NONE.
>>>
>>> If you're seeing IRQ_NONE returned, the NVMe driver didn't poll any
>>> completions since the last time nvme_irq was called. The cqe_seen on
>>> polled compeletions is sticky until the IRQ handler is run, in which
>>> case it returns IRQ_HANDLED even when no completions were handled during
>>> that interrupt.
>>>
>>> The only way it should be able to return IRQ_NONE is if no completions
>>> were observed (polled or otherwise) since the last time the IRQ handler
>>> was called.
>>
>> The polling do not update the cqe_seen. So it's possible that every time
>> the IRQ handler does trigger, there are no entries found. Maybe a better
>> or simpler fix would be to have the polling set cqe_seen to true, and
>> leave the clearing to the interrupt handler as is done now.
> 
> Oops, that looks like a mistake. I don't think we want to suppress
> spurious interrupt detection, though. How about this patch to set cq_seen
> on polling like we used to have?

Yes, that will work. We need to get this into 4.13.
Christoph Hellwig Aug. 18, 2017, 7:14 a.m. UTC | #2
On Thu, Aug 17, 2017 at 02:25:27PM -0600, Jens Axboe wrote:
> Yes, that will work. We need to get this into 4.13.

I'll pick it up for the pull request I'm going to send you today.
diff mbox

Patch

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ea5eb4c..e9886e1 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -795,6 +795,7 @@  static inline void nvme_handle_cqe(struct nvme_queue *nvmeq,
 		return;
 	}
 
+	nvmeq->cqe_seen = 1;
 	req = blk_mq_tag_to_rq(*nvmeq->tags, cqe->command_id);
 	nvme_end_request(req, cqe->status, cqe->result);
 }
@@ -824,10 +825,8 @@  static void nvme_process_cq(struct nvme_queue *nvmeq)
 		consumed++;
 	}
 
-	if (consumed) {
+	if (consumed)
 		nvme_ring_cq_doorbell(nvmeq);
-		nvmeq->cqe_seen = 1;
-	}
 }
 
 static irqreturn_t nvme_irq(int irq, void *data)