Message ID | 20231018171713.1883517-12-rrichter@amd.com |
---|---|
State | Accepted |
Commit | f05fd10d138d8ba795fcd72ace99e9c8eb7e777e |
Headers | show |
Series | cxl/pci: Add support for RCH RAS error handling | expand |
Robert Richter wrote: > From: Terry Bowman <terry.bowman@amd.com> > > Restricted CXL host (RCH) downstream port AER information is not currently > logged while in the error state. One problem preventing the error logging > is the AER and RAS registers are not accessible. The CXL driver requires > changes to find RCH downstream port AER and RAS registers for purpose of > error logging. > > RCH downstream ports are not enumerated during a PCI bus scan and are > instead discovered using system firmware, ACPI in this case.[1] The > downstream port is implemented as a Root Complex Register Block (RCRB). > The RCRB is a 4k memory block containing PCIe registers based on the PCIe > root port.[2] The RCRB includes AER extended capability registers used for > reporting errors. Note, the RCH's AER Capability is located in the RCRB > memory space instead of PCI configuration space, thus its register access > is different. Existing kernel PCIe AER functions can not be used to manage > the downstream port AER capabilities and RAS registers because the port was > not enumerated during PCI scan and the registers are not PCI config > accessible. > > Discover RCH downstream port AER extended capability registers. Use MMIO > accesses to search for extended AER capability in RCRB register space. > > [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy > [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB > > Co-developed-by: Robert Richter <rrichter@amd.com> > Signed-off-by: Terry Bowman <terry.bowman@amd.com> > Signed-off-by: Robert Richter <rrichter@amd.com> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Reviewed-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/cxl/core/core.h | 1 + > drivers/cxl/core/pci.c | 6 ++++++ > drivers/cxl/core/regs.c | 36 ++++++++++++++++++++++++++++++++++++ > 3 files changed, 43 insertions(+) > > diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h > index 45e7e044cf4a..f470ef5c0a6a 100644 > --- a/drivers/cxl/core/core.h > +++ b/drivers/cxl/core/core.h > @@ -73,6 +73,7 @@ struct cxl_rcrb_info; > resource_size_t __rcrb_to_component(struct device *dev, > struct cxl_rcrb_info *ri, > enum cxl_rcrb which); > +u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb); > > extern struct rw_semaphore cxl_dpa_rwsem; > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 7c3fbf9815e9..cbccc222bb91 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -722,6 +722,12 @@ static bool cxl_report_and_clear(struct cxl_dev_state *cxlds) > > void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport) > { > + struct device *dport_dev = dport->dport_dev; > + struct pci_host_bridge *host_bridge; > + > + host_bridge = to_pci_host_bridge(dport_dev); > + if (host_bridge->native_cxl_error) > + dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); Nothing in this function has a compile dependency on CONFIG_PCIEAER_CXL the enumeration can be done unconditionally, it's the usage that needs to be gated. I believe I had commented on this before, given the late hour I will need to fix this up post -rc1.
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 45e7e044cf4a..f470ef5c0a6a 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -73,6 +73,7 @@ struct cxl_rcrb_info; resource_size_t __rcrb_to_component(struct device *dev, struct cxl_rcrb_info *ri, enum cxl_rcrb which); +u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb); extern struct rw_semaphore cxl_dpa_rwsem; diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index 7c3fbf9815e9..cbccc222bb91 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -722,6 +722,12 @@ static bool cxl_report_and_clear(struct cxl_dev_state *cxlds) void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport) { + struct device *dport_dev = dport->dport_dev; + struct pci_host_bridge *host_bridge; + + host_bridge = to_pci_host_bridge(dport_dev); + if (host_bridge->native_cxl_error) + dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); } EXPORT_SYMBOL_NS_GPL(cxl_setup_parent_dport, CXL); diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c index e0fbe964f6f0..9111ceef1127 100644 --- a/drivers/cxl/core/regs.c +++ b/drivers/cxl/core/regs.c @@ -470,6 +470,42 @@ int cxl_setup_regs(struct cxl_register_map *map) } EXPORT_SYMBOL_NS_GPL(cxl_setup_regs, CXL); +u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb) +{ + void __iomem *addr; + u16 offset = 0; + u32 cap_hdr; + + if (WARN_ON_ONCE(rcrb == CXL_RESOURCE_NONE)) + return 0; + + if (!request_mem_region(rcrb, SZ_4K, dev_name(dev))) + return 0; + + addr = ioremap(rcrb, SZ_4K); + if (!addr) + goto out; + + cap_hdr = readl(addr + offset); + while (PCI_EXT_CAP_ID(cap_hdr) != PCI_EXT_CAP_ID_ERR) { + offset = PCI_EXT_CAP_NEXT(cap_hdr); + + /* Offset 0 terminates capability list. */ + if (!offset) + break; + cap_hdr = readl(addr + offset); + } + + if (offset) + dev_dbg(dev, "found AER extended capability (0x%x)\n", offset); + + iounmap(addr); +out: + release_mem_region(rcrb, SZ_4K); + + return offset; +} + resource_size_t __rcrb_to_component(struct device *dev, struct cxl_rcrb_info *ri, enum cxl_rcrb which) {