Message ID | 20180831212639.10196-11-keith.busch@intel.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | PCI, error handling and hot plug | expand |
On Fri, Aug 31, 2018 at 03:26:33PM -0600, Keith Busch wrote: > --- a/drivers/pci/pcie/portdrv_pci.c > +++ b/drivers/pci/pcie/portdrv_pci.c > @@ -139,13 +139,45 @@ static void pcie_portdrv_remove(struct pci_dev *dev) > pcie_port_device_remove(dev); > } > > +static int detected_iter(struct device *device, void *data) > +{ > + struct pci_dev *pdev = data; > + struct pcie_port_service_driver *driver; > + > + if (device->bus == &pcie_port_bus_type && device->driver) { > + driver = to_service_driver(device->driver); > + if (driver && driver->error_detected) > + driver->error_detected(pdev); > + } > + return 0; > +} You're passing a pci_dev to the ->error_detected callback and this forces the callback to laboriously find its own pcie port service device, as visible in patch [13/16], where pciehp_error_detected() contains: struct device *device = pcie_port_find_device(dev, PCIE_PORT_SERVICE_HP); if (!device) return; pcie_dev = to_pcie_device(device); This seems backwards, the callbacks should be passed a pcie_device instead of a pci_dev. Same for the ->slot_reset callback iterator added in this patch. Thanks, Lukas
On Sun, Sep 02, 2018 at 12:16:41PM +0200, Lukas Wunner wrote: > On Fri, Aug 31, 2018 at 03:26:33PM -0600, Keith Busch wrote: > > --- a/drivers/pci/pcie/portdrv_pci.c > > +++ b/drivers/pci/pcie/portdrv_pci.c > > @@ -139,13 +139,45 @@ static void pcie_portdrv_remove(struct pci_dev *dev) > > pcie_port_device_remove(dev); > > } > > > > +static int detected_iter(struct device *device, void *data) > > +{ > > + struct pci_dev *pdev = data; > > + struct pcie_port_service_driver *driver; > > + > > + if (device->bus == &pcie_port_bus_type && device->driver) { > > + driver = to_service_driver(device->driver); > > + if (driver && driver->error_detected) > > + driver->error_detected(pdev); > > + } > > + return 0; > > +} > > You're passing a pci_dev to the ->error_detected callback and this > forces the callback to laboriously find its own pcie port service device, > as visible in patch [13/16], where pciehp_error_detected() contains: > > struct device *device = pcie_port_find_device(dev, PCIE_PORT_SERVICE_HP); > if (!device) > return; > pcie_dev = to_pcie_device(device); > > This seems backwards, the callbacks should be passed a pcie_device > instead of a pci_dev. > > Same for the ->slot_reset callback iterator added in this patch. I agree that is better. I was only trying to stick with the existing pattern, but I can change that as a prep patch preceeding adding these error callbacks, no problem.
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h index d59afa42fc14..70b770dd83f1 100644 --- a/drivers/pci/pcie/portdrv.h +++ b/drivers/pci/pcie/portdrv.h @@ -53,6 +53,12 @@ struct pcie_port_service_driver { int (*resume_noirq) (struct pcie_device *dev); int (*resume) (struct pcie_device *dev); + /* Device driver notification of error handling */ + void (*error_detected)(struct pci_dev *dev); + + /* Device driver notification of slot reset */ + void (*slot_reset)(struct pci_dev *dev); + /* Device driver may resume normal operations */ void (*error_resume)(struct pci_dev *dev); diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c index eef22dc29140..b1deab941f68 100644 --- a/drivers/pci/pcie/portdrv_pci.c +++ b/drivers/pci/pcie/portdrv_pci.c @@ -139,13 +139,45 @@ static void pcie_portdrv_remove(struct pci_dev *dev) pcie_port_device_remove(dev); } +static int detected_iter(struct device *device, void *data) +{ + struct pci_dev *pdev = data; + struct pcie_port_service_driver *driver; + + if (device->bus == &pcie_port_bus_type && device->driver) { + driver = to_service_driver(device->driver); + if (driver && driver->error_detected) + driver->error_detected(pdev); + } + return 0; +} + static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev, enum pci_channel_state error) { - /* Root Port has no impact. Always recovers. */ + device_for_each_child(&dev->dev, dev, detected_iter); return PCI_ERS_RESULT_CAN_RECOVER; } +static int slot_reset_iter(struct device *device, void *data) +{ + struct pci_dev *pdev = data; + struct pcie_port_service_driver *driver; + + if (device->bus == &pcie_port_bus_type && device->driver) { + driver = to_service_driver(device->driver); + if (driver && driver->slot_reset) + driver->slot_reset(pdev); + } + return 0; +} + +static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev) +{ + device_for_each_child(&dev->dev, dev, slot_reset_iter); + return PCI_ERS_RESULT_RECOVERED; +} + static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev) { return PCI_ERS_RESULT_RECOVERED; @@ -185,6 +217,7 @@ static const struct pci_device_id port_pci_ids[] = { { static const struct pci_error_handlers pcie_portdrv_err_handler = { .error_detected = pcie_portdrv_error_detected, + .slot_reset = pcie_portdrv_slot_reset, .mmio_enabled = pcie_portdrv_mmio_enabled, .resume = pcie_portdrv_err_resume, };
The error notification walks the whole bus, and some ports may need to do something to prepare for error handling and recover after slot resets too. This patch chains the error notification to the port services that register this callback. And while we're here, remove the misleading comment referring to AER root ports in the generic port services driver, as the port may be involved with other services. Signed-off-by: Keith Busch <keith.busch@intel.com> --- drivers/pci/pcie/portdrv.h | 6 ++++++ drivers/pci/pcie/portdrv_pci.c | 35 ++++++++++++++++++++++++++++++++++- 2 files changed, 40 insertions(+), 1 deletion(-)