Message ID | 20230523232214.55282-18-terry.bowman@amd.com |
---|---|
State | Superseded |
Headers | show |
Series | cxl/pci: Add support for RCH RAS error handling | expand |
On Tue, 23 May 2023 18:22:08 -0500 Terry Bowman <terry.bowman@amd.com> wrote: > Restricted CXL host (RCH) downstream port AER information is not currently > logged while in the error state. One problem preventing the error logging > is the AER and RAS registers are not accessible. The CXL driver requires > changes to find RCH downstream port AER and RAS registers for purpose of > error logging. > > RCH downstream ports are not enumerated during a PCI bus scan and are > instead discovered using system firmware, ACPI in this case.[1] The > downstream port is implemented as a Root Complex Register Block (RCRB). > The RCRB is a 4k memory block containing PCIe registers based on the PCIe > root port.[2] The RCRB includes AER extended capability registers used for > reporting errors. Note, the RCH's AER Capability is located in the RCRB > memory space instead of PCI configuration space, thus its register access > is different. Existing kernel PCIe AER functions can not be used to manage > the downstream port AER capabilities and RAS registers because the port was > not enumerated during PCI scan and the registers are not PCI config > accessible. > > Discover RCH downstream port AER extended capability registers. Use MMIO > accesses to search for extended AER capability in RCRB register space. > > [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy > [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB > > Co-developed-by: Robert Richter <rrichter@amd.com> > Signed-off-by: Robert Richter <rrichter@amd.com> > Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
On Tue, 23 May 2023 18:22:08 -0500 Terry Bowman <terry.bowman@amd.com> wrote: > Restricted CXL host (RCH) downstream port AER information is not currently > logged while in the error state. One problem preventing the error logging > is the AER and RAS registers are not accessible. The CXL driver requires > changes to find RCH downstream port AER and RAS registers for purpose of > error logging. > > RCH downstream ports are not enumerated during a PCI bus scan and are > instead discovered using system firmware, ACPI in this case.[1] The > downstream port is implemented as a Root Complex Register Block (RCRB). > The RCRB is a 4k memory block containing PCIe registers based on the PCIe > root port.[2] The RCRB includes AER extended capability registers used for > reporting errors. Note, the RCH's AER Capability is located in the RCRB > memory space instead of PCI configuration space, thus its register access > is different. Existing kernel PCIe AER functions can not be used to manage > the downstream port AER capabilities and RAS registers because the port was > not enumerated during PCI scan and the registers are not PCI config > accessible. > > Discover RCH downstream port AER extended capability registers. Use MMIO > accesses to search for extended AER capability in RCRB register space. > > [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy > [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB > > Co-developed-by: Robert Richter <rrichter@amd.com> > Signed-off-by: Robert Richter <rrichter@amd.com> > Signed-off-by: Terry Bowman <terry.bowman@amd.com> Seems reasonable. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c index 7e56ddf509c0..045abc11add8 100644 --- a/drivers/cxl/core/regs.c +++ b/drivers/cxl/core/regs.c @@ -404,6 +404,54 @@ int cxl_setup_regs(struct cxl_register_map *map) } EXPORT_SYMBOL_NS_GPL(cxl_setup_regs, CXL); +static void __iomem *cxl_map_reg(struct device *dev, resource_size_t addr, + resource_size_t length) +{ + struct resource *res; + + if (WARN_ON_ONCE(addr == CXL_RESOURCE_NONE)) + return NULL; + + res = request_mem_region(addr, length, dev_name(dev)); + if (!res) + return NULL; + + return ioremap(addr, length); +} + +static void cxl_unmap_reg(void __iomem *base, resource_size_t addr, + resource_size_t length) +{ + iounmap(base); + release_mem_region(addr, length); +} + +static u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb) +{ + void __iomem *addr; + u16 offset = 0; + u32 cap_hdr; + + addr = cxl_map_reg(dev, rcrb, SZ_4K); + if (!addr) + return 0; + + cap_hdr = readl(addr + offset); + while (PCI_EXT_CAP_ID(cap_hdr) != PCI_EXT_CAP_ID_ERR) { + offset = PCI_EXT_CAP_NEXT(cap_hdr); + if (!offset) + break; + cap_hdr = readl(addr + offset); + } + + if (offset) + dev_dbg(dev, "found AER extended capability (0x%x)\n", offset); + + cxl_unmap_reg(addr, rcrb, SZ_4K); + + return offset; +} + resource_size_t cxl_probe_rcrb(struct device *dev, resource_size_t rcrb, struct cxl_rcrb_info *ri, enum cxl_rcrb which) { @@ -467,6 +515,9 @@ resource_size_t cxl_probe_rcrb(struct device *dev, resource_size_t rcrb, if (!IS_ALIGNED(component_reg_phys, CXL_COMPONENT_REG_BLOCK_SIZE)) return CXL_RESOURCE_NONE; + if (ri) + ri->aer_cap = cxl_rcrb_to_aer(dev, ri->base); + return component_reg_phys; } EXPORT_SYMBOL_NS_GPL(cxl_probe_rcrb, CXL);