Message ID | 1579767991-103898-1-git-send-email-liudongdong3@huawei.com (mailing list archive) |
---|---|
State | Mainlined, archived |
Commit | d95f20c4f07020ebc605f3b46af4b6db9eb5fc99 |
Delegated to: | Bjorn Helgaas |
Series | PCI/AER: Fix the uninitialized aer_fifo |
On Thu, Jan 23, 2020 at 04:26:31PM +0800, Dongdong Liu wrote:
> Current code do not call INIT_KFIFO() to init aer_fifo. This will lead to
> kfifo_put() sometimes return 0. This means the fifo was full. In fact, it
> is not.

It's definitely a problem that we don't call INIT_KFIFO(). But I'm
curious about why this would only be a problem "sometimes". The kfifo
is allocated with devm_kzalloc(), so it should be zero-filled and I
would think it would fail consistently, every time. But I guess not?

> It is easy to reproduce the problem by using aer_inject.

I assume maybe you mean "aer-inject" (not "aer_inject"), from
https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git/ ?
At least, that's what's mentioned in Documentation/PCI/pcieaer-howto.rst.

> aer_inject -s :82:00.0 multiple-corr-nonfatal
> The content of multiple-corr-nonfatal file is as below.
> AER
> COR RCVR
> HL 0 1 2 3
> AER
> UNCOR POISON_TLP
> HL 4 5 6 7
>
> Fixes: 27c1ce8bbed7 ("PCI/AER: Use kfifo for tracking events instead of reimplementing it")
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  drivers/pci/pcie/aer.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 1ca86f2..4a818b0 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1445,6 +1445,7 @@ static int aer_probe(struct pcie_device *dev)
>  		return -ENOMEM;
>
>  	rpc->rpd = port;
> +	INIT_KFIFO(rpc->aer_fifo);
>  	set_service_data(dev, rpc);
>
>  	status = devm_request_threaded_irq(device, dev->irq, aer_irq, aer_isr,
> --
> 1.9.1
>
Hi Bjorn,

Many thanks for your review. It is the Spring Festival holiday here, so my
reply is late.

On 2020/1/24 6:25 AM, Bjorn Helgaas wrote:
> On Thu, Jan 23, 2020 at 04:26:31PM +0800, Dongdong Liu wrote:
>> Current code do not call INIT_KFIFO() to init aer_fifo. This will lead to
>> kfifo_put() sometimes return 0. This means the fifo was full. In fact, it
>> is not.
>
> It's definitely a problem that we don't call INIT_KFIFO(). But I'm
> curious about why this would only be a problem "sometimes". The kfifo
> is allocated with devm_kzalloc(), so it should be zero-filled and I
> would think it would fail consistently, every time. But I guess not?

Yes, it would fail consistently, every time, once the problem had appeared.
But after running echo 15 > /proc/sys/kernel/printk,
"aer_inject -s 82:00.0 multiple-corr-nonfatal" executes correctly.
I think this is related to the timing of the calls to kfifo_put() and
kfifo_get().

case 1: kfifo_put()--->kfifo_get()--->kfifo_put()
        // the fifo will not be full
case 2: kfifo_put()--->kfifo_put()--->kfifo_get()
        // the fifo will be full at the second call to kfifo_put()

>
>> It is easy to reproduce the problem by using aer_inject.
>
> I assume maybe you mean "aer-inject" (not "aer_inject"), from
> https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git/ ?
> At least, that's what's mentioned in Documentation/PCI/pcieaer-howto.rst.

Yes, you are right, I mean aer-inject.

Thanks,
Dongdong

>
>> aer_inject -s :82:00.0 multiple-corr-nonfatal
>> The content of multiple-corr-nonfatal file is as below.
>> AER
>> COR RCVR
>> HL 0 1 2 3
>> AER
>> UNCOR POISON_TLP
>> HL 4 5 6 7
>>
>> Fixes: 27c1ce8bbed7 ("PCI/AER: Use kfifo for tracking events instead of reimplementing it")
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  drivers/pci/pcie/aer.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index 1ca86f2..4a818b0 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -1445,6 +1445,7 @@ static int aer_probe(struct pcie_device *dev)
>>  		return -ENOMEM;
>>
>>  	rpc->rpd = port;
>> +	INIT_KFIFO(rpc->aer_fifo);
>>  	set_service_data(dev, rpc);
>>
>>  	status = devm_request_threaded_irq(device, dev->irq, aer_irq, aer_isr,
>> --
>> 1.9.1
>>
On Thu, Jan 23, 2020 at 04:26:31PM +0800, Dongdong Liu wrote:
> Current code do not call INIT_KFIFO() to init aer_fifo. This will lead to
> kfifo_put() sometimes return 0. This means the fifo was full. In fact, it
> is not. It is easy to reproduce the problem by using aer_inject.
> aer_inject -s :82:00.0 multiple-corr-nonfatal
> The content of multiple-corr-nonfatal file is as below.
> AER
> COR RCVR
> HL 0 1 2 3
> AER
> UNCOR POISON_TLP
> HL 4 5 6 7
>
> Fixes: 27c1ce8bbed7 ("PCI/AER: Use kfifo for tracking events instead of reimplementing it")
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>

Applied to pci/aer for v5.6, thanks! I tweaked the commit log for
s/aer_inject/aer-inject/

> ---
>  drivers/pci/pcie/aer.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 1ca86f2..4a818b0 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1445,6 +1445,7 @@ static int aer_probe(struct pcie_device *dev)
>  		return -ENOMEM;
>
>  	rpc->rpd = port;
> +	INIT_KFIFO(rpc->aer_fifo);
>  	set_service_data(dev, rpc);
>
>  	status = devm_request_threaded_irq(device, dev->irq, aer_irq, aer_isr,
> --
> 1.9.1
>
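The two call orderings described in the thread (case 1 and case 2) can be reproduced with a small user-space model. This is only a sketch, not the kernel's kfifo code: it assumes the fullness check compares the element count (in - out) against a mask that the initializer sets to size minus one, so a zero-filled, never-initialized fifo has a mask of 0 and behaves like a one-element queue. All names here (model_fifo, model_put, MODEL_FIFO_SIZE, and so on) are hypothetical, and the size of 128 is illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_FIFO_SIZE 128 /* illustrative size; must be a power of two */

/* Hypothetical stand-in for a kfifo: two counters, a mask, and storage. */
struct model_fifo {
	unsigned int in;   /* number of elements ever written */
	unsigned int out;  /* number of elements ever read */
	unsigned int mask; /* size - 1 after init; 0 if never initialized */
	int data[MODEL_FIFO_SIZE];
};

/* Plays the role assumed for INIT_KFIFO(): reset counters, set the mask. */
static void model_init(struct model_fifo *f)
{
	assert((MODEL_FIFO_SIZE & (MODEL_FIFO_SIZE - 1)) == 0);
	f->in = 0;
	f->out = 0;
	f->mask = MODEL_FIFO_SIZE - 1;
}

/* Full when the number of queued elements exceeds the mask. */
static bool model_is_full(const struct model_fifo *f)
{
	return (f->in - f->out) > f->mask;
}

/* Returns 1 on success, 0 if the fifo reports full. */
static int model_put(struct model_fifo *f, int val)
{
	if (model_is_full(f))
		return 0;
	f->data[f->in & f->mask] = val;
	f->in++;
	return 1;
}

/* Returns 1 on success, 0 if the fifo is empty. */
static int model_get(struct model_fifo *f, int *val)
{
	if (f->in == f->out)
		return 0;
	*val = f->data[f->out & f->mask];
	f->out++;
	return 1;
}

/* Case 1: put, get, put. Returns the result of the final put. */
static int case1_put_get_put(bool initialized)
{
	struct model_fifo f;
	int v;

	memset(&f, 0, sizeof(f)); /* stands in for the zeroing allocation */
	if (initialized)
		model_init(&f);
	model_put(&f, 1);
	model_get(&f, &v);
	return model_put(&f, 2);
}

/* Case 2: put, put, get. Returns the result of the second put. */
static int case2_put_put_get(bool initialized)
{
	struct model_fifo f;
	int v, second;

	memset(&f, 0, sizeof(f));
	if (initialized)
		model_init(&f);
	model_put(&f, 1);
	second = model_put(&f, 2);
	model_get(&f, &v);
	return second;
}
```

Calling case2_put_put_get(false) from a main() returns 0 in this model, while case1_put_get_put(false) returns 1, which would explain why the failure depends on whether the consumer drains the fifo between two events. With the initializer applied, both orderings succeed.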
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 1ca86f2..4a818b0 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1445,6 +1445,7 @@ static int aer_probe(struct pcie_device *dev)
 		return -ENOMEM;

 	rpc->rpd = port;
+	INIT_KFIFO(rpc->aer_fifo);
 	set_service_data(dev, rpc);

 	status = devm_request_threaded_irq(device, dev->irq, aer_irq, aer_isr,
Current code do not call INIT_KFIFO() to init aer_fifo. This will lead to
kfifo_put() sometimes return 0. This means the fifo was full. In fact, it
is not. It is easy to reproduce the problem by using aer_inject.
aer_inject -s :82:00.0 multiple-corr-nonfatal
The content of multiple-corr-nonfatal file is as below.
AER
COR RCVR
HL 0 1 2 3
AER
UNCOR POISON_TLP
HL 4 5 6 7

Fixes: 27c1ce8bbed7 ("PCI/AER: Use kfifo for tracking events instead of reimplementing it")
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/pcie/aer.c | 1 +
 1 file changed, 1 insertion(+)