Message ID | 20230607221651.2454764-20-terry.bowman@amd.com |
---|---|
State | Superseded |
Series | cxl/pci: Add support for RCH RAS error handling |
Terry Bowman wrote:
> Restricted CXL host (RCH) downstream port AER information is not currently
> logged while in the error state. One problem preventing the error logging
> is the AER and RAS registers are not accessible. The CXL driver requires
> changes to find RCH downstream port AER and RAS registers for purpose of
> error logging.
>
> RCH downstream ports are not enumerated during a PCI bus scan and are
> instead discovered using system firmware, ACPI in this case.[1] The
> downstream port is implemented as a Root Complex Register Block (RCRB).
> The RCRB is a 4k memory block containing PCIe registers based on the PCIe
> root port.[2] The RCRB includes AER extended capability registers used for
> reporting errors. Note, the RCH's AER Capability is located in the RCRB
> memory space instead of PCI configuration space, thus its register access
> is different. Existing kernel PCIe AER functions can not be used to manage
> the downstream port AER capabilities and RAS registers because the port was
> not enumerated during PCI scan and the registers are not PCI config
> accessible.
>
> Discover RCH downstream port AER extended capability registers. Use MMIO
> accesses to search for extended AER capability in RCRB register space.
>
> [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy
> [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB
>
> Co-developed-by: Robert Richter <rrichter@amd.com>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/cxl/core/regs.c | 51 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
>
> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
> index ba2b1763042c..dd6c3c898cff 100644
> --- a/drivers/cxl/core/regs.c
> +++ b/drivers/cxl/core/regs.c
> @@ -408,6 +408,54 @@ int cxl_setup_regs(struct cxl_register_map *map)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_setup_regs, CXL);
>
> +static void __iomem *cxl_map_reg(struct device *dev, resource_size_t addr,
> +				 resource_size_t length)
> +{
> +	struct resource *res;
> +
> +	if (WARN_ON_ONCE(addr == CXL_RESOURCE_NONE))
> +		return NULL;
> +
> +	res = request_mem_region(addr, length, dev_name(dev));
> +	if (!res)
> +		return NULL;
> +
> +	return ioremap(addr, length);
> +}
> +
> +static void cxl_unmap_reg(void __iomem *base, resource_size_t addr,
> +			  resource_size_t length)
> +{
> +	iounmap(base);
> +	release_mem_region(addr, length);
> +}

Why redo the {request,release}_mem_region() and ioremap() vs handling
this inside of the existing mapping of the RCRB in this function?
Hi Dan,

I added a response inline below.

On 6/9/23 22:09, Dan Williams wrote:
> Terry Bowman wrote:
>> Restricted CXL host (RCH) downstream port AER information is not currently
>> logged while in the error state. One problem preventing the error logging
>> is the AER and RAS registers are not accessible. The CXL driver requires
>> changes to find RCH downstream port AER and RAS registers for purpose of
>> error logging.
>>
>> RCH downstream ports are not enumerated during a PCI bus scan and are
>> instead discovered using system firmware, ACPI in this case.[1] The
>> downstream port is implemented as a Root Complex Register Block (RCRB).
>> The RCRB is a 4k memory block containing PCIe registers based on the PCIe
>> root port.[2] The RCRB includes AER extended capability registers used for
>> reporting errors. Note, the RCH's AER Capability is located in the RCRB
>> memory space instead of PCI configuration space, thus its register access
>> is different. Existing kernel PCIe AER functions can not be used to manage
>> the downstream port AER capabilities and RAS registers because the port was
>> not enumerated during PCI scan and the registers are not PCI config
>> accessible.
>>
>> Discover RCH downstream port AER extended capability registers. Use MMIO
>> accesses to search for extended AER capability in RCRB register space.
>>
>> [1] CXL 3.0 Spec, 9.11.2 - System Firmware View of CXL 1.1 Hierarchy
>> [2] CXL 3.0 Spec, 8.2.1.1 - RCH Downstream Port RCRB
>>
>> Co-developed-by: Robert Richter <rrichter@amd.com>
>> Signed-off-by: Robert Richter <rrichter@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> ---
>>  drivers/cxl/core/regs.c | 51 +++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 51 insertions(+)
>>
>> diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
>> index ba2b1763042c..dd6c3c898cff 100644
>> --- a/drivers/cxl/core/regs.c
>> +++ b/drivers/cxl/core/regs.c
>> @@ -408,6 +408,54 @@ int cxl_setup_regs(struct cxl_register_map *map)
>>  }
>>  EXPORT_SYMBOL_NS_GPL(cxl_setup_regs, CXL);
>>
>> +static void __iomem *cxl_map_reg(struct device *dev, resource_size_t addr,
>> +				 resource_size_t length)
>> +{
>> +	struct resource *res;
>> +
>> +	if (WARN_ON_ONCE(addr == CXL_RESOURCE_NONE))
>> +		return NULL;
>> +
>> +	res = request_mem_region(addr, length, dev_name(dev));
>> +	if (!res)
>> +		return NULL;
>> +
>> +	return ioremap(addr, length);
>> +}
>> +
>> +static void cxl_unmap_reg(void __iomem *base, resource_size_t addr,
>> +			  resource_size_t length)
>> +{
>> +	iounmap(base);
>> +	release_mem_region(addr, length);
>> +}
>
> Why redo the {request,release}_mem_region() and ioremap() vs handling
> this inside of the existing mapping of the RCRB in this function?

The intention was to follow the same pattern as existing {request,release}
functions but doesn't make much sense with only one user in this case. I'll
fold the {request,release} logic into cxl_rcrb_to_aer().

Regards,
Terry
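Editor's sketch of the resource handling discussed in this exchange. It is a userspace model only, not the code from any later revision of the patch. Every `fake_*` helper, `probe_offset()`, and `rcrb_to_aer_folded()` is a hypothetical stand-in invented for illustration; in the kernel the corresponding calls would be `request_mem_region()`, `ioremap()`, `iounmap()`, and `release_mem_region()`. Folding the steps into one function makes it easy to unwind with a single label, which also covers a case the posted `cxl_map_reg()` appears to miss: if `ioremap()` fails after the region was requested, the helper returns NULL without releasing the region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of the RCRB and its bookkeeping. */
static uint32_t fake_rcrb[1024];	/* 4K block, as in the patch */
static int regions_held;		/* outstanding region reservations */
static int mappings_held;		/* outstanding mappings */
static bool fail_next_map;		/* knob: make the next mapping fail */

/* Stand-in for request_mem_region(): fails if already claimed. */
static bool fake_request_region(void)
{
	if (regions_held)
		return false;
	regions_held++;
	return true;
}

/* Stand-in for release_mem_region(). */
static void fake_release_region(void)
{
	assert(regions_held > 0);
	regions_held--;
}

/* Stand-in for ioremap(): returns NULL when told to fail. */
static uint32_t *fake_ioremap(void)
{
	if (fail_next_map) {
		fail_next_map = false;
		return NULL;
	}
	mappings_held++;
	return fake_rcrb;
}

/* Stand-in for iounmap(). */
static void fake_iounmap(void)
{
	assert(mappings_held > 0);
	mappings_held--;
}

/* Placeholder for the capability walk: reads a 16-bit value from word 0. */
static uint16_t probe_offset(const uint32_t *base)
{
	return (uint16_t)(base[0] & 0xffff);
}

/* Request, map, probe, then unwind in reverse order on every path. */
static uint16_t rcrb_to_aer_folded(void)
{
	uint16_t offset = 0;
	uint32_t *base;

	if (!fake_request_region())
		return 0;

	base = fake_ioremap();
	if (!base)
		goto out;	/* mapping failed: still drop the region */

	offset = probe_offset(base);
	fake_iounmap();
out:
	fake_release_region();
	return offset;
}
```

With this shape, both the success path and the mapping-failure path leave `regions_held` and `mappings_held` at zero, which is the property a folded kernel version would also want.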
diff --git a/drivers/cxl/core/regs.c b/drivers/cxl/core/regs.c
index ba2b1763042c..dd6c3c898cff 100644
--- a/drivers/cxl/core/regs.c
+++ b/drivers/cxl/core/regs.c
@@ -408,6 +408,54 @@ int cxl_setup_regs(struct cxl_register_map *map)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_setup_regs, CXL);
 
+static void __iomem *cxl_map_reg(struct device *dev, resource_size_t addr,
+				 resource_size_t length)
+{
+	struct resource *res;
+
+	if (WARN_ON_ONCE(addr == CXL_RESOURCE_NONE))
+		return NULL;
+
+	res = request_mem_region(addr, length, dev_name(dev));
+	if (!res)
+		return NULL;
+
+	return ioremap(addr, length);
+}
+
+static void cxl_unmap_reg(void __iomem *base, resource_size_t addr,
+			  resource_size_t length)
+{
+	iounmap(base);
+	release_mem_region(addr, length);
+}
+
+static u16 cxl_rcrb_to_aer(struct device *dev, resource_size_t rcrb)
+{
+	void __iomem *addr;
+	u16 offset = 0;
+	u32 cap_hdr;
+
+	addr = cxl_map_reg(dev, rcrb, SZ_4K);
+	if (!addr)
+		return 0;
+
+	cap_hdr = readl(addr + offset);
+	while (PCI_EXT_CAP_ID(cap_hdr) != PCI_EXT_CAP_ID_ERR) {
+		offset = PCI_EXT_CAP_NEXT(cap_hdr);
+		if (!offset)
+			break;
+		cap_hdr = readl(addr + offset);
+	}
+
+	if (offset)
+		dev_dbg(dev, "found AER extended capability (0x%x)\n", offset);
+
+	cxl_unmap_reg(addr, rcrb, SZ_4K);
+
+	return offset;
+}
+
 resource_size_t cxl_probe_rcrb(struct device *dev, resource_size_t rcrb,
 				struct cxl_rcrb_info *ri, enum cxl_rcrb which)
 {
@@ -471,6 +519,9 @@ resource_size_t cxl_probe_rcrb(struct device *dev, resource_size_t rcrb,
 	if (!IS_ALIGNED(component_reg_phys, CXL_COMPONENT_REG_BLOCK_SIZE))
 		return CXL_RESOURCE_NONE;
 
+	if (ri)
+		ri->aer_cap = cxl_rcrb_to_aer(dev, ri->base);
+
 	return component_reg_phys;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_probe_rcrb, CXL);
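Editor's sketch of the capability walk that `cxl_rcrb_to_aer()` performs, modeled in userspace over a plain 4K buffer so it can be compiled and stepped through outside the kernel. The bit layout of `EXT_CAP_ID()` and `EXT_CAP_NEXT()` follows the kernel's `PCI_EXT_CAP_ID()` and `PCI_EXT_CAP_NEXT()` macros; the names `rcrb_read32()`, `rcrb_write_hdr()`, and `find_aer_cap()` are hypothetical, and the iteration budget is an added assumption that the posted patch does not have.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same bit layout as the kernel's PCI_EXT_CAP_* helpers. */
#define EXT_CAP_ID(hdr)		((hdr) & 0x0000ffff)
#define EXT_CAP_NEXT(hdr)	(((hdr) >> 20) & 0xffc)
#define EXT_CAP_ID_AER		0x0001	/* value of PCI_EXT_CAP_ID_ERR */
#define RCRB_SIZE		4096

/* Hypothetical stand-in for readl() on an ioremap()ed RCRB. */
static uint32_t rcrb_read32(const uint8_t *rcrb, uint16_t offset)
{
	uint32_t val;

	memcpy(&val, rcrb + offset, sizeof(val));
	return val;
}

/* Test helper: place a capability header (version 1) in the buffer. */
static void rcrb_write_hdr(uint8_t *rcrb, uint16_t offset,
			   uint16_t id, uint16_t next)
{
	uint32_t hdr = (uint32_t)id | (1u << 16) | ((uint32_t)next << 20);

	assert(offset <= RCRB_SIZE - sizeof(hdr));
	memcpy(rcrb + offset, &hdr, sizeof(hdr));
}

/*
 * Follow the next pointers from offset 0 until the AER capability ID is
 * seen. Returns the capability offset, or 0 if the list ends first.
 * The budget is an assumed guard against a malformed, looping list.
 */
static uint16_t find_aer_cap(const uint8_t *rcrb)
{
	int budget = RCRB_SIZE / 8;
	uint16_t offset = 0;
	uint32_t hdr = rcrb_read32(rcrb, offset);

	while (EXT_CAP_ID(hdr) != EXT_CAP_ID_AER) {
		offset = EXT_CAP_NEXT(hdr);
		if (!offset)
			break;
		if (--budget == 0) {
			offset = 0;
			break;
		}
		hdr = rcrb_read32(rcrb, offset);
	}

	return offset;
}
```

As in the patch, a return value of 0 doubles as "not found", so an AER header located at offset 0 itself would be reported as absent.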