Message ID | 20180117052206.7703-5-keith.busch@intel.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
On Tue, Jan 16, 2018 at 10:22:04PM -0700, Keith Busch wrote: > A DPC enabled device suppresses ERR_(NON)FATAL messages, preventing the > AER handler from reporting error details. If the DPC trigger reason says > the downstream port detected the error, this patch has the DPC driver > collect the AER uncorrectable status for logging, then clears the status. > > Signed-off-by: Keith Busch <keith.busch@intel.com> > --- > drivers/pci/pcie/pcie-dpc.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c > index d1fbd83cd240..85350b00f251 100644 > --- a/drivers/pci/pcie/pcie-dpc.c > +++ b/drivers/pci/pcie/pcie-dpc.c > @@ -44,6 +44,7 @@ struct dpc_dev { > int cap_pos; > bool rp; > u32 rp_pio_status; > + struct aer_err_info info; > }; > > static const char * const rp_pio_error_string[] = { > @@ -112,6 +113,12 @@ static void interrupt_event_handler(struct work_struct *work) > struct pci_bus *parent = pdev->subordinate; > u16 ctl; > > + if (dpc->info.severity == AER_NONFATAL) { > + if (aer_get_device_error_info(pdev, &dpc->info)) { > + aer_print_error(pdev, &dpc->info); > + pci_cleanup_aer_uncorrect_error_status(pdev); > + } > + } > pci_lock_rescan_remove(); > list_for_each_entry_safe_reverse(dev, temp, &parent->devices, > bus_list) { > @@ -298,6 +305,11 @@ static irqreturn_t dpc_irq(int irq, void *context) > > schedule_work(&dpc->work); > > + if (reason == 0) > + dpc->info.severity = AER_NONFATAL; > + else > + dpc->info.severity = AER_CORRECTABLE; This looks wrong because we call schedule_work() before we set dpc->info.severity. I think that's racy because interrupt_event_handler() might read dpc->info.severity before dpc_irq() sets it. That raises the question of how we pass information from dpc_irq() to interrupt_event_handler(). I think interrupt_event_handler() is a work item because it removes devices and might need to sleep. That makes sense. But we pass dpc->info and dpc->rp_pio_status from dpc_irq() to interrupt_event_handler() via the single struct dpc_dev. That doesn't make sense in general because dpc_irq() may be invoked several times for each invocation of interrupt_event_handler(). I think the general purpose solution for passing information from interrupt context to process context would be a queue. It looks like we side-step that issue by disabling interrupts in dpc_irq() and re-enabling them in interrupt_event_handler(). That probably does work, but it's strange and requires extra config accesses. The fact that we clear PCI_EXP_DPC_STATUS_INTERRUPT at the end of interrupt_event_handler() should be enough -- sec 6.2.10.1 says we should only get a new MSI when the logical AND of things including PCI_EXP_DPC_STATUS_INTERRUPT transitions from FALSE to TRUE. I think all the log registers (Error Source ID and RP PIO Logs) are latched when DPC Trigger Status is set, so I think all the dmesg logging and dpc_process_rp_pio_error() stuff *could* be done in interrupt_event_handler(). Then it would be together with the PCI_EXP_DPC_STATUS write that clears DPC Trigger Status (and clears the latched log registers) and we wouldn't have the question about what belongs on dpc_irq() and what belongs in interrupt_event_handler(). After we clear PCI_EXP_DPC_STATUS_TRIGGER, we're supposed to "honor timing requirements ... with respect to the first Configuration Read following a Conventional Reset" (PCIe r4.0, sec 7.9.15.4). Where does that happen? > return IRQ_HANDLED; > } > > -- > 2.13.6 >
On Thu, Jan 25, 2018 at 01:55:12PM -0600, Bjorn Helgaas wrote: > On Tue, Jan 16, 2018 at 10:22:04PM -0700, Keith Busch wrote: > > > > schedule_work(&dpc->work); > > > > + if (reason == 0) > > + dpc->info.severity = AER_NONFATAL; > > + else > > + dpc->info.severity = AER_CORRECTABLE; > > This looks wrong because we call schedule_work() before we set > dpc->info.severity. I think that's racy because > interrupt_event_handler() might read dpc->info.severity before > dpc_irq() sets it. You're right, my mistake. > That raises the question of how we pass information from dpc_irq() to > interrupt_event_handler(). I think interrupt_event_handler() is a > work item because it removes devices and might need to sleep. That > makes sense. > > But we pass dpc->info and dpc->rp_pio_status from dpc_irq() to > interrupt_event_handler() via the single struct dpc_dev. That doesn't > make sense in general because dpc_irq() may be invoked several times > for each invocation of interrupt_event_handler(). I think the general > purpose solution for passing information from interrupt context to > process context would be a queue. > > It looks like we side-step that issue by disabling interrupts in > dpc_irq() and re-enabling them in interrupt_event_handler(). That > probably does work, but it's strange and requires extra config > accesses. The fact that we clear PCI_EXP_DPC_STATUS_INTERRUPT at the > end of interrupt_event_handler() should be enough -- sec 6.2.10.1 says > we should only get a new MSI when the logical AND of things including > PCI_EXP_DPC_STATUS_INTERRUPT transitions from FALSE to TRUE. Yes, that's true. We could get the same interrupt for non-DPC related events, and the interrupt config setting was something Alex proposed to make sure we don't note the same events multiple times. Using these config registers seems a bit overkill, though. > I think all the log registers (Error Source ID and RP PIO Logs) are > latched when DPC Trigger Status is set, so I think all the dmesg > logging and dpc_process_rp_pio_error() stuff *could* be done in > interrupt_event_handler(). Then it would be together with the > PCI_EXP_DPC_STATUS write that clears DPC Trigger Status (and clears > the latched log registers) and we wouldn't have the question about > what belongs on dpc_irq() and what belongs in > interrupt_event_handler(). That sounds great. If we defer all logging and containment handling to the work-queue, that should even fix up the issues the interrupt disable/enabling was trying to solve. That doesn't even sound difficult to code. Let me send you a proposal instead of this series, maybe by tomorrow. > After we clear PCI_EXP_DPC_STATUS_TRIGGER, we're supposed to "honor > timing requirements ... with respect to the first Configuration Read > following a Conventional Reset" (PCIe r4.0, sec 7.9.15.4). Where does > that happen? That is referring to reading the downstream port, and that is not handled by the DPC driver. The expectation is that the link is down on a DPC event, and the Link Up is handled by the pciehp driver, which should be honoring those timing requirements.
On Thu, Jan 25, 2018 at 02:15:21PM -0700, Keith Busch wrote: > On Thu, Jan 25, 2018 at 01:55:12PM -0600, Bjorn Helgaas wrote: > > > After we clear PCI_EXP_DPC_STATUS_TRIGGER, we're supposed to "honor > > timing requirements ... with respect to the first Configuration Read > > following a Conventional Reset" (PCIe r4.0, sec 7.9.15.4). Where does > > that happen? > > That is referring to reading the downstream port, and that is not > handled by the DPC driver. The expectation is that the link is down on a > DPC event, and the Link Up is handled by the pciehp driver, which > should be honoring those timing requirements. Sorry, I didn't mean "reading the downstream port". You can config read that anytime. I meant issuing a config read to the device connected downstream the contained port.
diff --git a/drivers/pci/pcie/pcie-dpc.c b/drivers/pci/pcie/pcie-dpc.c index d1fbd83cd240..85350b00f251 100644 --- a/drivers/pci/pcie/pcie-dpc.c +++ b/drivers/pci/pcie/pcie-dpc.c @@ -44,6 +44,7 @@ struct dpc_dev { int cap_pos; bool rp; u32 rp_pio_status; + struct aer_err_info info; }; static const char * const rp_pio_error_string[] = { @@ -112,6 +113,12 @@ static void interrupt_event_handler(struct work_struct *work) struct pci_bus *parent = pdev->subordinate; u16 ctl; + if (dpc->info.severity == AER_NONFATAL) { + if (aer_get_device_error_info(pdev, &dpc->info)) { + aer_print_error(pdev, &dpc->info); + pci_cleanup_aer_uncorrect_error_status(pdev); + } + } pci_lock_rescan_remove(); list_for_each_entry_safe_reverse(dev, temp, &parent->devices, bus_list) { @@ -298,6 +305,11 @@ static irqreturn_t dpc_irq(int irq, void *context) schedule_work(&dpc->work); + if (reason == 0) + dpc->info.severity = AER_NONFATAL; + else + dpc->info.severity = AER_CORRECTABLE; + return IRQ_HANDLED; }
A DPC enabled device suppresses ERR_(NON)FATAL messages, preventing the AER handler from reporting error details. If the DPC trigger reason says the downstream port detected the error, this patch has the DPC driver collect the AER uncorrectable status for logging, then clears the status. Signed-off-by: Keith Busch <keith.busch@intel.com> --- drivers/pci/pcie/pcie-dpc.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)