Message ID | 20230411180302.2678736-6-terry.bowman@amd.com |
---|---|
State | Superseded |
Series | cxl/pci: Add support for RCH RAS error handling |
On Tue, Apr 11, 2023 at 01:03:01PM -0500, Terry Bowman wrote:
> From: Robert Richter <rrichter@amd.com>
>
> In Restricted CXL Device (RCD) mode a CXL device is exposed as an
> RCiEP, but CXL downstream and upstream ports are not enumerated and
> not visible in the PCIe hierarchy. Protocol and link errors are sent
> to an RCEC.
>
> Restricted CXL host (RCH) downstream port-detected errors are signaled
> as internal AER errors, either Uncorrectable Internal Error (UIE) or
> Corrected Internal Errors (CIE). The error source is the id of the
> RCEC. A CXL handler must then inspect the error status in various CXL
> registers residing in the dport's component register space (CXL RAS
> cap) or the dport's RCRB (AER ext cap). [1]
>
> Errors showing up in the RCEC's error handler must be handled and
> connected to the CXL subsystem. Implement this by forwarding the error
> to all CXL devices below the RCEC. Since the entire CXL device is
> controlled only using PCIe Configuration Space of device 0, Function
> 0,

Capitalize "device" and "Function" the same way (also appears in
comment below).

> only pass it there [2]. These devices have the Memory Device class
> code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver
> can implement the handler. In addition to errors directed to the CXL
> endpoint device, the handler must also inspect the CXL downstream
> port's CXL RAS and PCIe AER external capabilities that is connected to

"AER external capabilities" -- is that referring to the "AER
*Extended* capability"? If so, we usually don't bother including the
"extended" part because it's usually not relevant. But if you intended
"external", I don't know what it means.

> the device.
>
> Since CXL downstream port errors are signaled using internal errors,
> the handler requires those errors to be unmasked. This is subject of a
> follow-on patch.
>
> The reason for choosing this implementation is that a CXL RCEC device
> is bound to the AER port driver, but the driver does not allow it to
> register a custom specific handler to support CXL. Connecting the RCEC
> hard-wired with a CXL handler does not work, as the CXL subsystem
> might not be present all the time. The alternative to add an
> implementation to the portdrv to allow the registration of a custom
> RCEC error handler isn't worth doing it as CXL would be its only user.
> Instead, just check for an CXL RCEC and pass it down to the connected
> CXL device's error handler. With this approach the code can entirely
> be implemented in the PCIe AER driver and is independent of the CXL
> subsystem. The CXL driver only provides the handler.

Can you make this more concrete with an example topology so we can
work through how this all works? Correct me when I go off the rails
here:

The current code uses pcie_walk_rcec() in this path, which basically
searches below a Root Port or RCEC for devices that have an AER error
status bit set, add them to the e_info[] list, and call
handle_error_source() for each one:

  aer_isr_one_error
    # get e_src from aer_fifo
    find_source_device(e_src)
      pcie_walk_rcec(find_device_iter)
        find_device_iter
          is_error_source
            # read PCI_ERR_COR_STATUS or PCI_ERR_UNCOR_STATUS
          if (error-source)
            add_error_device
              # add device to e_info[] list
    # now call handle_error_source for everything in e_info[]
    aer_process_err_devices
      for (i = 0; i < e_info->err_dev_num; i++)
        handle_error_source

IIUC, this patch basically says that an RCEC should have an AER error
status bit (UIE or CIE) set, but the devices "below" the RCEC will
not, so they won't get added to e_info[].

So we insert cxl_handle_error() in handle_error_source(), where it
gets called for the RCEC, and then it uses pcie_walk_rcec() again to
forcibly call handle_error_source() for *every* device "below" the
RCEC (even though they don't have AER error status bits set).
Then handle_error_source() ultimately calls the CXL driver err_handler
entry points (.cor_error_detected(), .error_detected(), etc), which
can look at the CXL-specific error status in the CXL RAS or RCRB or
whatever.

So this basically looks like a workaround for the fact that the AER
code only calls handle_error_source() when it finds AER error status,
and CXL doesn't *set* that AER error status. There's not that much
code here, but it seems like a quite a bit of complexity in an area
that is already pretty complicated.

Here's another idea: the ACPI GHES code (ghes_handle_aer()) basically
receives a packet of error status from firmware and queues it for
recovery via pcie_do_recovery(). What if you had a CXL module that
knew how to look for the CXL error status, package it up similarly,
and queue it via aer_recover_queue()?

> [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
> [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Cc: "Oliver O'Halloran" <oohall@gmail.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/pcie/Kconfig |  8 ++++++
>  drivers/pci/pcie/aer.c   | 61 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
>
> diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> index 228652a59f27..b0dbd864d3a3 100644
> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -49,6 +49,14 @@ config PCIEAER_INJECT
>  	  gotten from:
>  	     https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/
>
> +config PCIEAER_CXL
> +	bool "PCI Express CXL RAS support"
> +	default y
> +	depends on PCIEAER && CXL_PCI
> +	help
> +	  This enables CXL error handling for Restricted CXL Hosts
> +	  (RCHs).
> +
>  #
>  # PCI Express ECRC
>  #
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 7a25b62d9e01..171a08fd8ebd 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent,
>  	return true;
>  }
>
> +#ifdef CONFIG_PCIEAER_CXL
> +
> +static bool is_cxl_mem_dev(struct pci_dev *dev)
> +{
> +	/*
> +	 * A CXL device is controlled only using PCIe Configuration
> +	 * Space of device 0, Function 0.
> +	 */
> +	if (dev->devfn != PCI_DEVFN(0, 0))
> +		return false;
> +
> +	/* Right now there is only a CXL.mem driver */
> +	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> +		return false;
> +
> +	return true;
> +}
> +
> +static bool is_internal_error(struct aer_err_info *info)
> +{
> +	if (info->severity == AER_CORRECTABLE)
> +		return info->status & PCI_ERR_COR_INTERNAL;
> +
> +	return info->status & PCI_ERR_UNC_INTN;
> +}
> +
> +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info);
> +
> +static int cxl_handle_error_iter(struct pci_dev *dev, void *data)
> +{
> +	struct aer_err_info *e_info = (struct aer_err_info *)data;
> +
> +	if (!is_cxl_mem_dev(dev))
> +		return 0;
> +
> +	/* pci_dev_put() in handle_error_source() */
> +	dev = pci_dev_get(dev);
> +	if (dev)
> +		handle_error_source(dev, e_info);
> +
> +	return 0;
> +}
> +
> +static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info)
> +{
> +	/*
> +	 * CXL downstream port errors are signaled as RCEC internal
> +	 * errors. Forward them to all CXL devices below the RCEC.
> +	 */
> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC &&
> +	    is_internal_error(info))
> +		pcie_walk_rcec(dev, cxl_handle_error_iter, info);
> +}
> +
> +#else
> +static inline void cxl_handle_error(struct pci_dev *dev,
> +				    struct aer_err_info *info) { }
> +#endif
> +
>  /**
>   * handle_error_source - handle logging error into an event log
>   * @dev: pointer to pci_dev data structure of error source device
> @@ -957,6 +1016,8 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
>  {
>  	int aer = dev->aer_cap;
>
> +	cxl_handle_error(dev, info);
> +
>  	if (info->severity == AER_CORRECTABLE) {
>  		/*
>  		 * Correctable error does not need software intervention.
> --
> 2.34.1
>
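The two-stage flow discussed in this review can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel code: `struct model_dev`, `collect_error_sources()` and `forward_rcec_error()` are hypothetical names, and only the three constants are taken from the kernel headers. Stage 1 mirrors the idea that only devices with an AER status bit set are picked up; stage 2 mirrors the patch's idea of forwarding an RCEC internal error to the CXL memory devices associated with it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in the kernel's pci_regs.h and aer.h. */
#define PCI_ERR_COR_INTERNAL 0x00004000u /* Corrected Internal Error status */
#define PCI_ERR_UNC_INTN     0x00400000u /* Uncorrectable Internal Error status */
#define AER_CORRECTABLE      2

/* Hypothetical, simplified stand-in for struct pci_dev. */
struct model_dev {
	bool is_rcec;         /* Root Complex Event Collector? */
	bool is_cxl_mem_fn0;  /* CXL memory device at device 0, function 0? */
	uint32_t aer_status;  /* AER status bits set on this device */
	int handler_calls;    /* times the driver's error handler ran */
};

/* Stage 1: count only devices that have a matching AER status bit set. */
static size_t collect_error_sources(const struct model_dev *devs, size_t n,
				    uint32_t mask)
{
	size_t found = 0;

	for (size_t i = 0; i < n; i++)
		if (devs[i].aer_status & mask)
			found++;
	return found;
}

/* Stage 2: on an RCEC internal error, forward to every CXL mem device. */
static void forward_rcec_error(const struct model_dev *rcec,
			       struct model_dev *devs, size_t n, int severity)
{
	uint32_t bit = severity == AER_CORRECTABLE ? PCI_ERR_COR_INTERNAL
						   : PCI_ERR_UNC_INTN;

	assert(rcec != NULL);
	if (!rcec->is_rcec || !(rcec->aer_status & bit))
		return;

	for (size_t i = 0; i < n; i++)
		if (devs[i].is_cxl_mem_fn0)
			devs[i].handler_calls++;
}
```

In this model a CXL memory device with no AER status of its own is never found by stage 1, which is the gap the forwarding step in stage 2 is meant to close.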
Bjorn, thanks for your detailed review.

On 12.04.23 17:02:33, Bjorn Helgaas wrote:
> On Tue, Apr 11, 2023 at 01:03:01PM -0500, Terry Bowman wrote:
> > From: Robert Richter <rrichter@amd.com>
> >
> > In Restricted CXL Device (RCD) mode a CXL device is exposed as an
> > RCiEP, but CXL downstream and upstream ports are not enumerated and
> > not visible in the PCIe hierarchy. Protocol and link errors are sent
> > to an RCEC.
> >
> > Restricted CXL host (RCH) downstream port-detected errors are signaled
> > as internal AER errors, either Uncorrectable Internal Error (UIE) or
> > Corrected Internal Errors (CIE). The error source is the id of the
> > RCEC. A CXL handler must then inspect the error status in various CXL
> > registers residing in the dport's component register space (CXL RAS
> > cap) or the dport's RCRB (AER ext cap). [1]
> >
> > Errors showing up in the RCEC's error handler must be handled and
> > connected to the CXL subsystem. Implement this by forwarding the error
> > to all CXL devices below the RCEC. Since the entire CXL device is
> > controlled only using PCIe Configuration Space of device 0, Function
> > 0,
>
> Capitalize "device" and "Function" the same way (also appears in
> comment below).

Changed that.

>
> > only pass it there [2]. These devices have the Memory Device class
> > code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver
> > can implement the handler. In addition to errors directed to the CXL
> > endpoint device, the handler must also inspect the CXL downstream
> > port's CXL RAS and PCIe AER external capabilities that is connected to
>
> "AER external capabilities" -- is that referring to the "AER
> *Extended* capability"? If so, we usually don't bother including the
> "extended" part because it's usually not relevant. But if you intended
> "external", I don't know what it means.

Right, "extended" is meant here, but I will drop it to also fit with
the 'CXL RAS capability'.

>
> > the device.
> >
> > Since CXL downstream port errors are signaled using internal errors,
> > the handler requires those errors to be unmasked. This is subject of a
> > follow-on patch.
> >
> > The reason for choosing this implementation is that a CXL RCEC device
> > is bound to the AER port driver, but the driver does not allow it to
> > register a custom specific handler to support CXL. Connecting the RCEC
> > hard-wired with a CXL handler does not work, as the CXL subsystem
> > might not be present all the time. The alternative to add an
> > implementation to the portdrv to allow the registration of a custom
> > RCEC error handler isn't worth doing it as CXL would be its only user.
> > Instead, just check for an CXL RCEC and pass it down to the connected
> > CXL device's error handler. With this approach the code can entirely
> > be implemented in the PCIe AER driver and is independent of the CXL
> > subsystem. The CXL driver only provides the handler.
>
> Can you make this more concrete with an example topology so we can
> work through how this all works?
> Correct me when I go off the rails
> here:

Let's assume just a simple CXL RCH topology:

[ASCII topology diagram, flattened in this archive; the boxes and their
contents are listed below.]

  PCI hierarchy:

    ACPI0016 (Host bridge, CXL host)
      - CEDT
      - RCRB base
    RCiEP (Endpoint, CXL dev), linked by a dotted line to the RCEC
      - BDF
      - PCIe AER
      - CXL dvsec (v2: reg loc)
      - Comp regs
      - CXL RAS
    RCEC
      - BDF

  CXL hierarchy:

    CXL root port
      - dport RCRB
      - PCIe AER
      - Comp regs
      - CXL RAS
    CXL endpoint (v1: RCRB)
      - uport RCRB
      - Comp regs
      - CXL RAS

Dport detected errors are reported using PCIe AER and CXL RAS caps in
the dport's RCRB.

Uport detected errors are reported using RCiEP's PCIe AER cap and
either the uport's RCRB RAS cap or the RAS cap of the comp regs
located using CXL DVSEC register locator.

In all cases the RCEC is used with either the RCEC (dport errors) or
the RCiEP (uport errors) error source id (BDF: bus, dev, func).

>
> The current code uses pcie_walk_rcec() in this path, which basically
> searches below a Root Port or RCEC for devices that have an AER error
> status bit set, add them to the e_info[] list, and call
> handle_error_source() for each one:

For reference, this series adds support to handle RCH downstream
port-detected errors as described in CXL 3.0, 12.2.1.1.

This flow looks correct to me, see comments inline.

>
>   aer_isr_one_error
>     # get e_src from aer_fifo
>     find_source_device(e_src)

e_src is the RCEC.

>       pcie_walk_rcec(find_device_iter)
>         find_device_iter
>           is_error_source
>             # read PCI_ERR_COR_STATUS or PCI_ERR_UNCOR_STATUS

It is an internal error (CIE or UIE).

>           if (error-source)

An early version of the spec did not require the RCEC as an error
source. But this case is not handled with this series.

>             add_error_device
>               # add device to e_info[] list
>     # now call handle_error_source for everything in e_info[]
>     aer_process_err_devices
>       for (i = 0; i < e_info->err_dev_num; i++)
>         handle_error_source

handle_error_source() is called with the RCEC as pci_dev.

>
> IIUC, this patch basically says that an RCEC should have an AER error
> status bit (UIE or CIE) set, but the devices "below" the RCEC will
> not, so they won't get added to e_info[].

An internal error of the RCEC indicates a CXL dport error.

>
> So we insert cxl_handle_error() in handle_error_source(), where it
> gets called for the RCEC, and then it uses pcie_walk_rcec() again to
> forcibly call handle_error_source() for *every* device "below" the
> RCEC (even though they don't have AER error status bits set).

The CXL device contains the links to the dport's caps. Also, there can
be multiple RCs with CXL devs connected to it. So we must search for
all CXL devices now, determine the corresponding dport and inspect
both, PCIe AER and CXL RAS caps.

>
> Then handle_error_source() ultimately calls the CXL driver err_handler
> entry points (.cor_error_detected(), .error_detected(), etc), which
> can look at the CXL-specific error status in the CXL RAS or RCRB or
> whatever.

The AER driver (portdrv) does not have the knowledge of CXL
internals. Thus the approach is to pass dport errors to the cxl_mem
driver to handle it there in addition to cxl mem dev errors.

>
> So this basically looks like a workaround for the fact that the AER
> code only calls handle_error_source() when it finds AER error status,
> and CXL doesn't *set* that AER error status. There's not that much
> code here, but it seems like a quite a bit of complexity in an area
> that is already pretty complicated.
>
> Here's another idea: the ACPI GHES code (ghes_handle_aer()) basically
> receives a packet of error status from firmware and queues it for
> recovery via pcie_do_recovery(). What if you had a CXL module that
> knew how to look for the CXL error status, package it up similarly,
> and queue it via aer_recover_queue()?

The CXL module knows how and where to look for errors, but it does not
receive interrupts (for dport errors). The interrupts land in the
portdrv (the RCEC's pci driver) and the CXL module must be notified by
the portdrv. But the portdrv (AER driver) does not know the CXL module,
nor is it always present (e.g. CXL bus must be enumerated first etc.).

aer_recover_queue() is interesting to report AER errors that have been
retrieved outside the PCIe hierarchy, in particular the dport AER cap
in the RCRB (see patch #4). We could collect all the data and just
send it to aer_recover_queue(). I think aer_recover_work_func() must
be extended to also handle corrected errors, otherwise the function is
already almost the same as handle_error_source().

But first, RCEC error notifications (RCEC AER interrupts) must be sent
to the CXL driver to look into the dport's RCRB.
-Robert
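The queue-based alternative discussed above (collect the error data, queue it, and handle it later, including corrected errors) can be sketched as a userspace model. This is a sketch under stated assumptions only: `struct err_record`, `struct err_queue`, `err_queue_push()`, `err_queue_drain()` and `QUEUE_DEPTH` are hypothetical and do not reflect how aer_recover_queue() or aer_recover_work_func() are implemented in the kernel. Only the severity values follow the kernel's aer.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Severity values as defined in the kernel's aer.h. */
#define AER_NONFATAL    0
#define AER_FATAL       1
#define AER_CORRECTABLE 2
#define QUEUE_DEPTH     8 /* hypothetical depth, must be a power of two */

/* Hypothetical record a CXL module might assemble from a dport's RCRB. */
struct err_record {
	uint8_t bus;
	uint8_t devfn;
	int severity;
	uint32_t status;
};

struct err_queue {
	struct err_record slots[QUEUE_DEPTH];
	unsigned int head, tail;           /* head: next write, tail: next read */
	unsigned int corrected, recovered; /* counters standing in for real work */
};

/* Producer side: returns false when the queue is full (record is dropped). */
static bool err_queue_push(struct err_queue *q, const struct err_record *rec)
{
	assert(q != NULL && rec != NULL);
	if (q->head - q->tail == QUEUE_DEPTH)
		return false;
	q->slots[q->head++ % QUEUE_DEPTH] = *rec;
	return true;
}

/* Worker side: drain the queue, treating correctable errors separately. */
static void err_queue_drain(struct err_queue *q)
{
	assert(q != NULL);
	while (q->tail != q->head) {
		const struct err_record *rec = &q->slots[q->tail++ % QUEUE_DEPTH];

		if (rec->severity == AER_CORRECTABLE)
			q->corrected++; /* log only, no recovery step */
		else
			q->recovered++; /* a real driver would start recovery here */
	}
}
```

The separate branch for `AER_CORRECTABLE` corresponds to the remark in the reply that the deferred worker would have to be extended to also handle corrected errors.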
On Tue, 11 Apr 2023 13:03:01 -0500
Terry Bowman <terry.bowman@amd.com> wrote:

> From: Robert Richter <rrichter@amd.com>
>
> In Restricted CXL Device (RCD) mode a CXL device is exposed as an
> RCiEP, but CXL downstream and upstream ports are not enumerated and
> not visible in the PCIe hierarchy. Protocol and link errors are sent
> to an RCEC.
>
> Restricted CXL host (RCH) downstream port-detected errors are signaled
> as internal AER errors, either Uncorrectable Internal Error (UIE) or
> Corrected Internal Errors (CIE). The error source is the id of the
> RCEC. A CXL handler must then inspect the error status in various CXL
> registers residing in the dport's component register space (CXL RAS
> cap) or the dport's RCRB (AER ext cap). [1]
>
> Errors showing up in the RCEC's error handler must be handled and
> connected to the CXL subsystem. Implement this by forwarding the error
> to all CXL devices below the RCEC. Since the entire CXL device is
> controlled only using PCIe Configuration Space of device 0, Function
> 0, only pass it there [2]. These devices have the Memory Device class
> code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver
> can implement the handler.

This comment implies only class code compliant drivers. Sure we don't
have drivers for anything else yet, but we should try to avoid saying
there won't be any (which I think above implies).

You have a comment in the code, but maybe relaxing the description above
to "currently support devices have..."

> In addition to errors directed to the CXL
> endpoint device, the handler must also inspect the CXL downstream
> port's CXL RAS and PCIe AER external capabilities that is connected to
> the device.
>
> Since CXL downstream port errors are signaled using internal errors,
> the handler requires those errors to be unmasked. This is subject of a
> follow-on patch.
>
> The reason for choosing this implementation is that a CXL RCEC device
> is bound to the AER port driver, but the driver does not allow it to
> register a custom specific handler to support CXL. Connecting the RCEC
> hard-wired with a CXL handler does not work, as the CXL subsystem
> might not be present all the time. The alternative to add an
> implementation to the portdrv to allow the registration of a custom
> RCEC error handler isn't worth doing it as CXL would be its only user.
> Instead, just check for an CXL RCEC and pass it down to the connected
> CXL device's error handler. With this approach the code can entirely
> be implemented in the PCIe AER driver and is independent of the CXL
> subsystem. The CXL driver only provides the handler.
>
> [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
> [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Robert Richter <rrichter@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Cc: "Oliver O'Halloran" <oohall@gmail.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-pci@vger.kernel.org

Generally looks good to me. A few trivial comments inline.

> ---
>  drivers/pci/pcie/Kconfig |  8 ++++++
>  drivers/pci/pcie/aer.c   | 61 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
>
> diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> index 228652a59f27..b0dbd864d3a3 100644
> --- a/drivers/pci/pcie/Kconfig
> +++ b/drivers/pci/pcie/Kconfig
> @@ -49,6 +49,14 @@ config PCIEAER_INJECT
>  	  gotten from:
>  	     https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/
>
> +config PCIEAER_CXL
> +	bool "PCI Express CXL RAS support"

Description makes this sound too general. I'd mentioned restricted
hosts even in the menu option title.
> +	default y
> +	depends on PCIEAER && CXL_PCI
> +	help
> +	  This enables CXL error handling for Restricted CXL Hosts
> +	  (RCHs).

Spec term is probably fine in the title, but in the help I'd
expand it as per the CXL 3.0 glossary to include
"CXL Host that is operating in RCD mode."
It might otherwise surprise people that this matters on their shiny
new CXL X.0 host (because they found an old CXL 1.1 card in a box
and decided to plug it in)

Do we actually need this protection at all? It's a tiny amount of code
and I can't see anything immediately that requires the CXL_PCI dependency
other than it's a bit pointless if that isn't here.

> +
>  #
>  # PCI Express ECRC
>  #
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 7a25b62d9e01..171a08fd8ebd 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent,
>  	return true;
>  }
>
> +#ifdef CONFIG_PCIEAER_CXL
> +
> +static bool is_cxl_mem_dev(struct pci_dev *dev)
> +{
> +	/*
> +	 * A CXL device is controlled only using PCIe Configuration
> +	 * Space of device 0, Function 0.

That's not true in general. Definitely true that CXL protocol
error reporting is controlled only using this Devfn, but
more generally there could be other stuff in later functions.
So perhaps make the comment more specific.
> +	 */
> +	if (dev->devfn != PCI_DEVFN(0, 0))
> +		return false;
> +
> +	/* Right now there is only a CXL.mem driver */
> +	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> +		return false;
> +
> +	return true;
> +}
> +
> +static bool is_internal_error(struct aer_err_info *info)
> +{
> +	if (info->severity == AER_CORRECTABLE)
> +		return info->status & PCI_ERR_COR_INTERNAL;
> +
> +	return info->status & PCI_ERR_UNC_INTN;
> +}
> +
> +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info);
> +
> +static int cxl_handle_error_iter(struct pci_dev *dev, void *data)
> +{
> +	struct aer_err_info *e_info = (struct aer_err_info *)data;
> +
> +	if (!is_cxl_mem_dev(dev))
> +		return 0;
> +
> +	/* pci_dev_put() in handle_error_source() */
> +	dev = pci_dev_get(dev);
> +	if (dev)
> +		handle_error_source(dev, e_info);
> +
> +	return 0;
> +}
> +
> +static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info)
> +{
> +	/*
> +	 * CXL downstream port errors are signaled as RCEC internal

Make this comment more specific (to RCH I think).

> +	 * errors. Forward them to all CXL devices below the RCEC.
> +	 */
> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC &&
> +	    is_internal_error(info))
> +		pcie_walk_rcec(dev, cxl_handle_error_iter, info);
> +}
> +
> +#else
> +static inline void cxl_handle_error(struct pci_dev *dev,
> +				    struct aer_err_info *info) { }
> +#endif
> +
>  /**
>   * handle_error_source - handle logging error into an event log
>   * @dev: pointer to pci_dev data structure of error source device
> @@ -957,6 +1016,8 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
>  {
>  	int aer = dev->aer_cap;
>
> +	cxl_handle_error(dev, info);
> +
>  	if (info->severity == AER_CORRECTABLE) {
>  		/*
>  		 * Correctable error does not need software intervention.
On 14.04.23 13:19:50, Jonathan Cameron wrote:
> On Tue, 11 Apr 2023 13:03:01 -0500
> Terry Bowman <terry.bowman@amd.com> wrote:
>
> > From: Robert Richter <rrichter@amd.com>
> >
> > In Restricted CXL Device (RCD) mode a CXL device is exposed as an
> > RCiEP, but CXL downstream and upstream ports are not enumerated and
> > not visible in the PCIe hierarchy. Protocol and link errors are sent
> > to an RCEC.
> >
> > Restricted CXL host (RCH) downstream port-detected errors are signaled
> > as internal AER errors, either Uncorrectable Internal Error (UIE) or
> > Corrected Internal Errors (CIE). The error source is the id of the
> > RCEC. A CXL handler must then inspect the error status in various CXL
> > registers residing in the dport's component register space (CXL RAS
> > cap) or the dport's RCRB (AER ext cap). [1]
> >
> > Errors showing up in the RCEC's error handler must be handled and
> > connected to the CXL subsystem. Implement this by forwarding the error
> > to all CXL devices below the RCEC. Since the entire CXL device is
> > controlled only using PCIe Configuration Space of device 0, Function
> > 0, only pass it there [2]. These devices have the Memory Device class
> > code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver
> > can implement the handler.
>
> This comment implies only class code compliant drivers. Sure we don't
> have drivers for anything else yet, but we should try to avoid saying
> there won't be any (which I think above implies).
>
> You have a comment in the code, but maybe relaxing the description above
> to "currently support devices have..."

It is used here to identify CXL memory devices and limit the
enablement to those. The spec requires this to be set for CXL mem devs
(see cxl 3.0, 8.1.12.2). There could be other CXL devices (e.g. cache),
but other drivers are not yet implemented. That is what I am referring
to. The check makes sure there is actually a driver with a handler for
it (cxl_pci).
>
> > In addition to errors directed to the CXL
> > endpoint device, the handler must also inspect the CXL downstream
> > port's CXL RAS and PCIe AER external capabilities that is connected to
> > the device.
> >
> > Since CXL downstream port errors are signaled using internal errors,
> > the handler requires those errors to be unmasked. This is subject of a
> > follow-on patch.
> >
> > The reason for choosing this implementation is that a CXL RCEC device
> > is bound to the AER port driver, but the driver does not allow it to
> > register a custom specific handler to support CXL. Connecting the RCEC
> > hard-wired with a CXL handler does not work, as the CXL subsystem
> > might not be present all the time. The alternative to add an
> > implementation to the portdrv to allow the registration of a custom
> > RCEC error handler isn't worth doing it as CXL would be its only user.
> > Instead, just check for an CXL RCEC and pass it down to the connected
> > CXL device's error handler. With this approach the code can entirely
> > be implemented in the PCIe AER driver and is independent of the CXL
> > subsystem. The CXL driver only provides the handler.
> >
> > [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
> > [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
> >
> > Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> > Cc: "Oliver O'Halloran" <oohall@gmail.com>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com>
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux-pci@vger.kernel.org
>
> Generally looks good to me. A few trivial comments inline.
>
> > ---
> >  drivers/pci/pcie/Kconfig |  8 ++++++
> >  drivers/pci/pcie/aer.c   | 61 ++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 69 insertions(+)
> >
> > diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
> > index 228652a59f27..b0dbd864d3a3 100644
> > --- a/drivers/pci/pcie/Kconfig
> > +++ b/drivers/pci/pcie/Kconfig
> > @@ -49,6 +49,14 @@ config PCIEAER_INJECT
> >  	  gotten from:
> >  	     https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/
> >
> > +config PCIEAER_CXL
> > +	bool "PCI Express CXL RAS support"
>
> Description makes this sound too general. I'd mentioned restricted
> hosts even in the menu option title.
>
>
> > +	default y
> > +	depends on PCIEAER && CXL_PCI
> > +	help
> > +	  This enables CXL error handling for Restricted CXL Hosts
> > +	  (RCHs).
>
> Spec term is probably fine in the title, but in the help I'd
> expand it as per the CXL 3.0 glossary to include
> "CXL Host that is operating in RCD mode."
> It might otherwise surprise people that this matters on their shiny
> new CXL X.0 host (because they found an old CXL 1.1 card in a box
> and decided to plug it in)
>
> Do we actually need this protection at all? It's a tiny amount of code
> and I can't see anything immediately that requires the CXL_PCI dependency
> other than it's a bit pointless if that isn't here.
>
> > +
> >  #
> >  # PCI Express ECRC
> >  #
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 7a25b62d9e01..171a08fd8ebd 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent,
> >  	return true;
> >  }
> >
> > +#ifdef CONFIG_PCIEAER_CXL
> > +
> > +static bool is_cxl_mem_dev(struct pci_dev *dev)
> > +{
> > +	/*
> > +	 * A CXL device is controlled only using PCIe Configuration
> > +	 * Space of device 0, Function 0.
>
> That's not true in general.
Definitely true that CXL protocol > error reporting is controlled only using this Devfn, but > more generally there could be other stuff in later functions. > So perhaps make the comment more specific. I actually mean CXL device in RCD mode here (seen as RCiEP in the PCI hierarchy). The spec says (cxl 3.0, 8.1.3): """ In either case [(RCD and non-RCD)], the capability, status, and control fields in Device 0, Function 0 DVSEC control the CXL functionality of the entire device. """ So dev 0, func 0 must contain a CXL PCIe DVSEC. Thus it is a CXL device and able to handle CXL AER errors. The limitation to the first device prevents the handler from being run multiple times for the same event. > > > + */ > > + if (dev->devfn != PCI_DEVFN(0, 0)) > > + return false; > > + > > + /* Right now there is only a CXL.mem driver */ > > + if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL) > > + return false; > > + > > + return true; > > +} > > + > > +static bool is_internal_error(struct aer_err_info *info) > > +{ > > + if (info->severity == AER_CORRECTABLE) > > + return info->status & PCI_ERR_COR_INTERNAL; > > + > > + return info->status & PCI_ERR_UNC_INTN; > > +} > > + > > +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info); > > + > > +static int cxl_handle_error_iter(struct pci_dev *dev, void *data) > > +{ > > + struct aer_err_info *e_info = (struct aer_err_info *)data; > > + > > + if (!is_cxl_mem_dev(dev)) > > + return 0; > > + > > + /* pci_dev_put() in handle_error_source() */ > > + dev = pci_dev_get(dev); > > + if (dev) > > + handle_error_source(dev, e_info); > > + > > + return 0; > > +} > > + > > +static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info) > > +{ > > + /* > > + * CXL downstream port errors are signaled as RCEC internal > > Make this comment more specific (to RCH I think). Right, same here, this is restricted mode only. Thanks for review. -Robert > > > + * errors. Forward them to all CXL devices below the RCEC. 
> > + */ > > + if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC && > > + is_internal_error(info)) > > + pcie_walk_rcec(dev, cxl_handle_error_iter, info); > > +} > > + > > +#else > > +static inline void cxl_handle_error(struct pci_dev *dev, > > + struct aer_err_info *info) { } > > +#endif > > + > > /** > > * handle_error_source - handle logging error into an event log > > * @dev: pointer to pci_dev data structure of error source device > > @@ -957,6 +1016,8 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info) > > { > > int aer = dev->aer_cap; > > > > + cxl_handle_error(dev, info); > > + > > if (info->severity == AER_CORRECTABLE) { > > /* > > * Correctable error does not need software intervention. >
On Thu, Apr 13, 2023 at 01:40:52PM +0200, Robert Richter wrote: > On 12.04.23 17:02:33, Bjorn Helgaas wrote: > > On Tue, Apr 11, 2023 at 01:03:01PM -0500, Terry Bowman wrote: > > > From: Robert Richter <rrichter@amd.com> > ... > Let's assume just a simple CXL RCH topology: > > PCI hierarchy: > > ----------------- > | ACPI0016 |-------------- Host bridge (CXL host) > | - CEDT | | > -----------| - RCRB base | | > | ----------------- : > | | > | | > | ------------------- --------- > | | RCiEP |.....| RCEC | Endpoint (CXL dev) > | --------| - BDF | | - BDF | > | | | - PCIe AER | --------- > | | | - CXL dvsec | > | | | (v2: reg loc) | > | | | - Comp regs | > | | | - CXL RAS | > | | ------------------- > : : > > CXL hierarchy: > > : : > : ------------------ | > | | CXL root port |<------------ > | | | > |--------->| - dport RCRB |<------------ > | | - PCIe AER | | > | | - Comp regs | | > | | - CXL RAS | | > | ------------------ | > | : | > | | ------------------ | > | ------->| CXL endpoint |------------- > | | (v1: RCRB) | > ---------->| - uport RCRB | > | - Comp regs | > | - CXL RAS | > ------------------ > > Dport detected errors are reported using PCIe AER and CXL RAS caps in > the dports RCRB. > > Uport detected errors are reported using RCiEP's PCIe AER cap and > either the uport's RCRB RAS cap or the RAS cap of the comp regs > located using CXL DVSEC register locator. > > In all cases the RCEC is used with either the RCEC (dport errors) or > the RCiEP (uport errors) error source id (BDF: bus, dev, func). I'm mostly interested in the PCI entities involved because that's all aer.c can deal with. For the above, I think the PCI core only knows about these: 00:00.0 RCEC with AER, RCEC EA includes 00:01.0 00:01.0 RCiEP with AER aer_irq() would handle AER interrupts from 00:00.0. cxl_handle_error() would be called for 00:00.0 and would call handle_error_source() for everything below it (only 00:01.0 here). 
> > The current code uses pcie_walk_rcec() in this path, which basically > > searches below a Root Port or RCEC for devices that have an AER error > > status bit set, add them to the e_info[] list, and call > > handle_error_source() for each one: > > For reference, this series adds support to handle RCH downstream > port-detected errors as described in CXL 3.0, 12.2.1.1. > > This flow looks correct to me, see comments inline. We seem to be on the same page here, so I'll trim it out. > ... > > So we insert cxl_handle_error() in handle_error_source(), where it > > gets called for the RCEC, and then it uses pcie_walk_rcec() again to > > forcibly call handle_error_source() for *every* device "below" the > > RCEC (even though they don't have AER error status bits set). > > The CXL device contains the links to the dport's caps. Also, there can > be multiple RCs with CXL devs connected to it. So we must search for > all CXL devices now, determine the corresponding dport and inspect > both, PCIe AER and CXL RAS caps. > > > Then handle_error_source() ultimately calls the CXL driver err_handler > > entry points (.cor_error_detected(), .error_detected(), etc), which > > can look at the CXL-specific error status in the CXL RAS or RCRB or > > whatever. > > The AER driver (portdrv) does not have the knowledge of CXL internals. > Thus the approach is to pass dport errors to the cxl_mem driver to > handle it there in addition to cxl mem dev errors. > > > So this basically looks like a workaround for the fact that the AER > > code only calls handle_error_source() when it finds AER error status, > > and CXL doesn't *set* that AER error status. There's not that much > > code here, but it seems like a quite a bit of complexity in an area > > that is already pretty complicated. 
My main point here (correct me if I got this wrong) is that:

  - A RCEC generates an AER interrupt

  - find_source_device() searches all devices below the RCEC and
    builds a list of everything for which to call handle_error_source()

  - cxl_handle_error() *again* looks at all devices below the same
    RCEC and calls handle_error_source() for each one

So the main difference here is that the existing flow only calls
handle_error_source() when it finds an error logged in an AER status
register, while the new CXL flow calls handle_error_source() for
*every* device below the RCEC.

I think it's OK to do that, but the almost recursive structure and the
unusual reference counting make the overall AER flow much harder to
understand.

What if we changed is_error_source() to add every CXL.mem device it
finds to the e_info[] list, which I think could nicely encapsulate the
idea that "CXL devices have error state we don't know how to interpret
here"?  Would the existing loop in aer_process_err_devices() then do
what you need?

> > Here's another idea: the ACPI GHES code (ghes_handle_aer()) basically
> > receives a packet of error status from firmware and queues it for
> > recovery via pcie_do_recovery().  What if you had a CXL module that
> > knew how to look for the CXL error status, package it up similarly,
> > and queue it via aer_recover_queue()?
>
> ...
> But first, RCEC error notifications (RCEC AER interrupts) must be sent
> to the CXL driver to look into the dport's RCRB.

Right.  I think it could be solvable to have aer_irq() call or wake a
CXL interface that has been registered.  But maybe changing
is_error_source() would be simpler.

Bjorn
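For illustration only, the is_error_source() idea above could be modeled in userspace roughly as follows. This is a sketch under stated assumptions, not the aer.c code: struct mock_dev, struct mock_err_list and collect_sources() are hypothetical stand-ins, the real is_error_source() also compares against the Error Source ID, and the list size of 5 is assumed to match AER_MAX_MULTI_ERR_DEVICES in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match the size of the kernel's e_info device list. */
#define AER_MAX_MULTI_ERR_DEVICES 5

/* Hypothetical stand-in for the bits of struct pci_dev that matter here. */
struct mock_dev {
	bool aer_status_set;	/* stands in for reading the AER status register */
	bool cxl_mem;		/* stands in for is_cxl_mem_dev() */
};

/* Hypothetical stand-in for the device list in struct aer_err_info. */
struct mock_err_list {
	const struct mock_dev *dev[AER_MAX_MULTI_ERR_DEVICES];
	int error_dev_num;
};

/*
 * Proposed rule: a device qualifies if it logged an AER error, or if it
 * is a CXL.mem device whose error state the AER core cannot interpret.
 */
static bool is_error_source(const struct mock_dev *dev)
{
	return dev->aer_status_set || dev->cxl_mem;
}

/* Fill the list; return how many qualifying devices did not fit. */
static int collect_sources(const struct mock_dev *devs, size_t n,
			   struct mock_err_list *list)
{
	int dropped = 0;

	list->error_dev_num = 0;
	for (size_t i = 0; i < n; i++) {
		if (!is_error_source(&devs[i]))
			continue;
		if (list->error_dev_num < AER_MAX_MULTI_ERR_DEVICES)
			list->dev[list->error_dev_num++] = &devs[i];
		else
			dropped++;
	}
	return dropped;
}
```

With this shape a single loop over the list would reach both kinds of device, at the cost that the fixed-size list can overflow when many CXL.mem devices sit below one RCEC.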
On Fri, 14 Apr 2023 16:35:05 +0200 Robert Richter <rrichter@amd.com> wrote: > On 14.04.23 13:19:50, Jonathan Cameron wrote: > > On Tue, 11 Apr 2023 13:03:01 -0500 > > Terry Bowman <terry.bowman@amd.com> wrote: > > > > > From: Robert Richter <rrichter@amd.com> > > > > > > In Restricted CXL Device (RCD) mode a CXL device is exposed as an > > > RCiEP, but CXL downstream and upstream ports are not enumerated and > > > not visible in the PCIe hierarchy. Protocol and link errors are sent > > > to an RCEC. > > > > > > Restricted CXL host (RCH) downstream port-detected errors are signaled > > > as internal AER errors, either Uncorrectable Internal Error (UIE) or > > > Corrected Internal Errors (CIE). The error source is the id of the > > > RCEC. A CXL handler must then inspect the error status in various CXL > > > registers residing in the dport's component register space (CXL RAS > > > cap) or the dport's RCRB (AER ext cap). [1] > > > > > > Errors showing up in the RCEC's error handler must be handled and > > > connected to the CXL subsystem. Implement this by forwarding the error > > > to all CXL devices below the RCEC. Since the entire CXL device is > > > controlled only using PCIe Configuration Space of device 0, Function > > > 0, only pass it there [2]. These devices have the Memory Device class > > > code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver > > > can implement the handler. > > > > This comment implies only class code compliant drivers. Sure we don't > > have drivers for anything else yet, but we should try to avoid saying > > there won't be any (which I think above implies). > > > > You have a comment in the code, but maybe relaxing the description above > > to "currently support devices have..." > > It is used here to identify CXL memory devices and limit the > enablement to those. The spec requires this to be set for CXL mem devs > (see cxl 3.0, 8.1.12.2). > > There could be other CXL devices (e.g. 
cache), but other drivers are > not yet implemented. That is what I am referring to. The check makes > sure there is actually a driver with a handler for it (cxl_pci). Understood on intent. My worry is that the above can be read as a statement on hardware restrictions, rather than on what software currently implements. Meh. Minor point so I don't care that much! Unlikely anyone will read the patch description after it merges anyway ;) > > > > > > In addition to errors directed to the CXL > > > endpoint device, the handler must also inspect the CXL downstream > > > port's CXL RAS and PCIe AER external capabilities that is connected to > > > the device. > > > > > > Since CXL downstream port errors are signaled using internal errors, > > > the handler requires those errors to be unmasked. This is subject of a > > > follow-on patch. > > > > > > The reason for choosing this implementation is that a CXL RCEC device > > > is bound to the AER port driver, but the driver does not allow it to > > > register a custom specific handler to support CXL. Connecting the RCEC > > > hard-wired with a CXL handler does not work, as the CXL subsystem > > > might not be present all the time. The alternative to add an > > > implementation to the portdrv to allow the registration of a custom > > > RCEC error handler isn't worth doing it as CXL would be its only user. > > > Instead, just check for an CXL RCEC and pass it down to the connected > > > CXL device's error handler. With this approach the code can entirely > > > be implemented in the PCIe AER driver and is independent of the CXL > > > subsystem. The CXL driver only provides the handler.
> > > > > > [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors > > > [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices > > > > > > Co-developed-by: Terry Bowman <terry.bowman@amd.com> > > > Signed-off-by: Robert Richter <rrichter@amd.com> > > > Signed-off-by: Terry Bowman <terry.bowman@amd.com> > > > Cc: "Oliver O'Halloran" <oohall@gmail.com> > > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > > Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com> > > > Cc: linuxppc-dev@lists.ozlabs.org > > > Cc: linux-pci@vger.kernel.org > > > > Generally looks good to me. A few trivial comments inline. > > > > > --- > > > drivers/pci/pcie/Kconfig | 8 ++++++ > > > drivers/pci/pcie/aer.c | 61 ++++++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 69 insertions(+) > > > > > > diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig > > > index 228652a59f27..b0dbd864d3a3 100644 > > > --- a/drivers/pci/pcie/Kconfig > > > +++ b/drivers/pci/pcie/Kconfig > > > @@ -49,6 +49,14 @@ config PCIEAER_INJECT > > > gotten from: > > > https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/ > > > > > > +config PCIEAER_CXL > > > + bool "PCI Express CXL RAS support" > > > > Description makes this sound too general. I'd mentioned restricted > > hosts even in the menu option title. > > > > > > > + default y > > > + depends on PCIEAER && CXL_PCI > > > + help > > > + This enables CXL error handling for Restricted CXL Hosts > > > + (RCHs). > > > > Spec term is probably fine in the title, but in the help I'd > > expand it as per the CXL 3.0 glossary to include > > "CXL Host that is operating in RCD mode." > > It might otherwise surprise people that this matters on their shiny > > new CXL X.0 host (because they found an old CXL 1.1 card in a box > > and decided to plug it in) > > > > Do we actually need this protection at all? 
It's a tiny amount of code > > and I can't see anything immediately that requires the CXL_PCI dependency > > other than it's a bit pointless if that isn't here. > > > > > + > > > # > > > # PCI Express ECRC > > > # > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > > > index 7a25b62d9e01..171a08fd8ebd 100644 > > > --- a/drivers/pci/pcie/aer.c > > > +++ b/drivers/pci/pcie/aer.c > > > @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent, > > > return true; > > > } > > > > > > +#ifdef CONFIG_PCIEAER_CXL > > > + > > > +static bool is_cxl_mem_dev(struct pci_dev *dev) > > > +{ > > > + /* > > > + * A CXL device is controlled only using PCIe Configuration > > > + * Space of device 0, Function 0. > > > > That's not true in general. Definitely true that CXL protocol > > error reporting is controlled only using this Devfn, but > > more generally there could be other stuff in later functions. > > So perhaps make the comment more specific. > > I actually mean CXL device in RCD mode here (seen as RCiEP in the PCI > hierarchy). > > The spec says (cxl 3.0, 8.1.3): > > """ > In either case [(RCD and non-RCD)], the capability, status, and > control fields in Device 0, Function 0 DVSEC control the CXL > functionality of the entire device. > """ > > So dev 0, func 0 must contain a CXL PCIe DVSEC. Thus it is a CXL > device and able to handle CXL AER errors. The limitation to the first > device prevents the handler from being run multiple times for the same > event. Fine with limitation. Text says "device is controlled only using". That is true for what you are controlling here, but other aspects of the device are controlled via whatever interface they like. Perhaps just quote the specification as you have done in your reply. Then it is clear that we mean just these registers. 
> > > > > > > + */ > > > + if (dev->devfn != PCI_DEVFN(0, 0)) > > > + return false; > > > + > > > + /* Right now there is only a CXL.mem driver */ > > > + if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL) > > > + return false; > > > + > > > + return true; > > > +} > > > + > > > +static bool is_internal_error(struct aer_err_info *info) > > > +{ > > > + if (info->severity == AER_CORRECTABLE) > > > + return info->status & PCI_ERR_COR_INTERNAL; > > > + > > > + return info->status & PCI_ERR_UNC_INTN; > > > +} > > > + > > > +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info); > > > + > > > +static int cxl_handle_error_iter(struct pci_dev *dev, void *data) > > > +{ > > > + struct aer_err_info *e_info = (struct aer_err_info *)data; > > > + > > > + if (!is_cxl_mem_dev(dev)) > > > + return 0; > > > + > > > + /* pci_dev_put() in handle_error_source() */ > > > + dev = pci_dev_get(dev); > > > + if (dev) > > > + handle_error_source(dev, e_info); > > > + > > > + return 0; > > > +} > > > + > > > +static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info) > > > +{ > > > + /* > > > + * CXL downstream port errors are signaled as RCEC internal > > > > Make this comment more specific (to RCH I think). > > Right, same here, this is restricted mode only. > > Thanks for review. > > -Robert > > > > > > > + * errors. Forward them to all CXL devices below the RCEC. 
> > > + */ > > > + if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC && > > > + is_internal_error(info)) > > > + pcie_walk_rcec(dev, cxl_handle_error_iter, info); > > > +} > > > + > > > +#else > > > +static inline void cxl_handle_error(struct pci_dev *dev, > > > + struct aer_err_info *info) { } > > > +#endif > > > + > > > /** > > > * handle_error_source - handle logging error into an event log > > > * @dev: pointer to pci_dev data structure of error source device > > > @@ -957,6 +1016,8 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info) > > > { > > > int aer = dev->aer_cap; > > > > > > + cxl_handle_error(dev, info); > > > + > > > if (info->severity == AER_CORRECTABLE) { > > > /* > > > * Correctable error does not need software intervention. > >
Hi Jonathan, On 17.04.23 17:54:31, Jonathan Cameron wrote: > On Fri, 14 Apr 2023 16:35:05 +0200 > Robert Richter <rrichter@amd.com> wrote: > > > On 14.04.23 13:19:50, Jonathan Cameron wrote: > > > On Tue, 11 Apr 2023 13:03:01 -0500 > > > Terry Bowman <terry.bowman@amd.com> wrote: > > > > > > > From: Robert Richter <rrichter@amd.com> > > > > > > > > In Restricted CXL Device (RCD) mode a CXL device is exposed as an > > > > RCiEP, but CXL downstream and upstream ports are not enumerated and > > > > not visible in the PCIe hierarchy. Protocol and link errors are sent > > > > to an RCEC. > > > > > > > > Restricted CXL host (RCH) downstream port-detected errors are signaled > > > > as internal AER errors, either Uncorrectable Internal Error (UIE) or > > > > Corrected Internal Errors (CIE). The error source is the id of the > > > > RCEC. A CXL handler must then inspect the error status in various CXL > > > > registers residing in the dport's component register space (CXL RAS > > > > cap) or the dport's RCRB (AER ext cap). [1] > > > > > > > > Errors showing up in the RCEC's error handler must be handled and > > > > connected to the CXL subsystem. Implement this by forwarding the error > > > > to all CXL devices below the RCEC. Since the entire CXL device is > > > > controlled only using PCIe Configuration Space of device 0, Function > > > > 0, only pass it there [2]. These devices have the Memory Device class > > > > code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver > > > > can implement the handler. > > > > > > This comment implies only class code compliant drivers. Sure we don't > > > have drivers for anything else yet, but we should try to avoid saying > > > there won't be any (which I think above implies). > > > > > > You have a comment in the code, but maybe relaxing the description above > > > to "currently support devices have..." > > > > It is used here to identify CXL memory devices and limit the > > enablement to those. 
The spec requires this to be set for CXL mem devs > > (see cxl 3.0, 8.1.12.2). > > > > There could be other CXL devices (e.g. cache), but other drivers are > > not yet implemented. That is what I am referring to. The check makes > > sure there is actually a driver with a handler for it (cxl_pci). > > Understood on intent. My worry is that the above can be read as a > statement on hardware restrictions, rather than on what software currently > implements. Meh. Minor point so I don't care that much! > Unlikely anyone will read the patch description after it merges anyway ;) I have updated the description ... > > > > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > > > > index 7a25b62d9e01..171a08fd8ebd 100644 > > > > --- a/drivers/pci/pcie/aer.c > > > > +++ b/drivers/pci/pcie/aer.c > > > > @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent, > > > > return true; > > > > } > > > > > > > > +#ifdef CONFIG_PCIEAER_CXL > > > > + > > > > +static bool is_cxl_mem_dev(struct pci_dev *dev) > > > > +{ > > > > + /* > > > > + * A CXL device is controlled only using PCIe Configuration > > > > + * Space of device 0, Function 0. > > > > > > That's not true in general. Definitely true that CXL protocol > > > error reporting is controlled only using this Devfn, but > > > more generally there could be other stuff in later functions. > > > So perhaps make the comment more specific. > > > > I actually mean CXL device in RCD mode here (seen as RCiEP in the PCI > > hierarchy). > > > > The spec says (cxl 3.0, 8.1.3): > > > > """ > > In either case [(RCD and non-RCD)], the capability, status, and > > control fields in Device 0, Function 0 DVSEC control the CXL > > functionality of the entire device. > > > """ > > > > So dev 0, func 0 must contain a CXL PCIe DVSEC. Thus it is a CXL > > device and able to handle CXL AER errors. The limitation to the first > > device prevents the handler from being run multiple times for the same > > event.
> > Fine with limitation. Text says "device is controlled only using". > That is true for what you are controlling here, but other aspects of the > device are controlled via whatever interface they like. > > Perhaps just quote the specification as you have done in your reply. Then it > is clear that we mean just these registers. ... and comments. Thanks, -Robert
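As a reading aid for the check discussed above, here is a small userspace sketch of what is_cxl_mem_dev() tests. It assumes the usual devfn encoding (5-bit device, 3-bit function) and the class code 0502h quoted in the patch description; struct mock_pci_dev is a hypothetical stand-in for the two struct pci_dev fields the function reads, not a kernel type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same encoding as the kernel's PCI_DEVFN(): 5-bit device, 3-bit function. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Base class 05h (memory controller), sub-class 02h (CXL). */
#define PCI_CLASS_MEMORY_CXL	0x0502

/* Hypothetical stand-in for the fields of struct pci_dev used below. */
struct mock_pci_dev {
	unsigned int devfn;	/* encoded device/function number */
	uint32_t class;		/* 24 bits: base class, sub-class, prog-if */
};

static bool is_cxl_mem_dev(const struct mock_pci_dev *dev)
{
	/* Only Device 0, Function 0 carries the DVSEC quoted from the spec. */
	if (dev->devfn != PCI_DEVFN(0, 0))
		return false;

	/* Drop the prog-if byte and compare base class plus sub-class. */
	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
		return false;

	return true;
}
```

Function 1 of the same device, or a function 0 with a different class code, is skipped, which is what keeps the handler from running more than once per event.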
On 14.04.23 16:32:54, Bjorn Helgaas wrote: > On Thu, Apr 13, 2023 at 01:40:52PM +0200, Robert Richter wrote: > > On 12.04.23 17:02:33, Bjorn Helgaas wrote: > > > On Tue, Apr 11, 2023 at 01:03:01PM -0500, Terry Bowman wrote: > > > > From: Robert Richter <rrichter@amd.com> > > > ... > > Let's assume just a simple CXL RCH topology: > > > > PCI hierarchy: > > > > ----------------- > > | ACPI0016 |-------------- Host bridge (CXL host) > > | - CEDT | | > > -----------| - RCRB base | | > > | ----------------- : > > | | > > | | > > | ------------------- --------- > > | | RCiEP |.....| RCEC | Endpoint (CXL dev) > > | --------| - BDF | | - BDF | > > | | | - PCIe AER | --------- > > | | | - CXL dvsec | > > | | | (v2: reg loc) | > > | | | - Comp regs | > > | | | - CXL RAS | > > | | ------------------- > > : : > > > > CXL hierarchy: > > > > : : > > : ------------------ | > > | | CXL root port |<------------ > > | | | > > |--------->| - dport RCRB |<------------ > > | | - PCIe AER | | > > | | - Comp regs | | > > | | - CXL RAS | | > > | ------------------ | > > | : | > > | | ------------------ | > > | ------->| CXL endpoint |------------- > > | | (v1: RCRB) | > > ---------->| - uport RCRB | > > | - Comp regs | > > | - CXL RAS | > > ------------------ > > > > Dport detected errors are reported using PCIe AER and CXL RAS caps in > > the dports RCRB. > > > > Uport detected errors are reported using RCiEP's PCIe AER cap and > > either the uport's RCRB RAS cap or the RAS cap of the comp regs > > located using CXL DVSEC register locator. > > > > In all cases the RCEC is used with either the RCEC (dport errors) or > > the RCiEP (uport errors) error source id (BDF: bus, dev, func). > > I'm mostly interested in the PCI entities involved because that's all > aer.c can deal with. For the above, I think the PCI core only knows > about these: > > 00:00.0 RCEC with AER, RCEC EA includes 00:01.0 > 00:01.0 RCiEP with AER > > aer_irq() would handle AER interrupts from 00:00.0. 
> cxl_handle_error() would be called for 00:00.0 and would call > handle_error_source() for everything below it (only 00:01.0 here). > > > > The current code uses pcie_walk_rcec() in this path, which basically > > > searches below a Root Port or RCEC for devices that have an AER error > > > status bit set, add them to the e_info[] list, and call > > > handle_error_source() for each one: > > > > For reference, this series adds support to handle RCH downstream > > port-detected errors as described in CXL 3.0, 12.2.1.1. > > > > This flow looks correct to me, see comments inline. > > We seem to be on the same page here, so I'll trim it out. > > > ... > > > So we insert cxl_handle_error() in handle_error_source(), where it > > > gets called for the RCEC, and then it uses pcie_walk_rcec() again to > > > forcibly call handle_error_source() for *every* device "below" the > > > RCEC (even though they don't have AER error status bits set). > > > > The CXL device contains the links to the dport's caps. Also, there can > > be multiple RCs with CXL devs connected to it. So we must search for > > all CXL devices now, determine the corresponding dport and inspect > > both, PCIe AER and CXL RAS caps. > > > > > Then handle_error_source() ultimately calls the CXL driver err_handler > > > entry points (.cor_error_detected(), .error_detected(), etc), which > > > can look at the CXL-specific error status in the CXL RAS or RCRB or > > > whatever. > > > > The AER driver (portdrv) does not have the knowledge of CXL internals. > > Thus the approach is to pass dport errors to the cxl_mem driver to > > handle it there in addition to cxl mem dev errors. > > > > > So this basically looks like a workaround for the fact that the AER > > > code only calls handle_error_source() when it finds AER error status, > > > and CXL doesn't *set* that AER error status. There's not that much > > > code here, but it seems like a quite a bit of complexity in an area > > > that is already pretty complicated. 
> > My main point here (correct me if I got this wrong) is that: > > - A RCEC generates an AER interrupt > > - find_source_device() searches all devices below the RCEC and > builds a list everything for which to call handle_error_source() find_source_device() does not walk the RCEC if the error source is the RCEC itself (note that find_device_iter() is called for the root/rcec device first and exits early then). > > - cxl_handle_error() *again* looks at all devices below the same > RCEC and calls handle_error_source() for each one > > So the main difference here is that the existing flow only calls > handle_error_source() when it finds an error logged in an AER status > register, while the new CXL flow calls handle_error_source() for > *every* device below the RCEC. That is limited as much as possible: * The RCEC walk to handle CXL dport errors is done only in case of internal errors, for an RCEC only (not a port) (check in cxl_handle_error()). * Internal errors are only enabled for RCECs connected to CXL devices (handles_cxl_errors()). * The handler is only called if it is a CXL memory device (class code set and zero devfn) (check in cxl_handle_error_iter()). An optimization I see here is to convert some runtime checks to cached values determined during device enumeration (CXL device list, RCEC is associated with CXL devices). Some sort of RCEC-to-CXL-dev association, similar to rcec->rcec_ea. > > I think it's OK to do that, but the almost recursive structure and the > unusual reference counting make the overall AER flow much harder to > understand. > > What if we changed is_error_source() to add every CXL.mem device it > finds to the e_info[] list, which I think could nicely encapsulate the > idea that "CXL devices have error state we don't know how to interpret > here"? Would the existing loop in aer_process_err_devices() then do > what you need? I did not want to mix this with devices determined by the Error Source Identification Register. 
A CXL device may not be the source of the error, and treating it as one
may cause unwanted side effects. We would then also have to touch
AER_MAX_MULTI_ERR_DEVICES and the way the dev list is implemented, as
the maximum number of devices is unclear.

>
> > > Here's another idea: the ACPI GHES code (ghes_handle_aer()) basically
> > > receives a packet of error status from firmware and queues it for
> > > recovery via pcie_do_recovery().  What if you had a CXL module that
> > > knew how to look for the CXL error status, package it up similarly,
> > > and queue it via aer_recover_queue()?
> >
> > ...
> > But first, RCEC error notifications (RCEC AER interrupts) must be sent
> > to the CXL driver to look into the dport's RCRB.
>
> Right.  I think it could be solvable to have aer_irq() call or wake a
> CXL interface that has been registered.  But maybe changing
> is_error_source() would be simpler.

I am going to see if is_error_source() can be used to also find CXL
devices. But my main concern remains mixing CXL devices with the
devices actually identified by the Error Source ID.

Thanks,

-Robert
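To make the first of the limits listed above concrete, here is a userspace sketch of the gate that decides whether the RCEC walk happens at all. The constants are written from memory of the kernel headers and should be treated as assumptions; struct mock_aer_info, enum mock_severity and needs_rcec_walk() are hypothetical names for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as remembered from the kernel headers; treat as assumptions. */
#define PCI_EXP_TYPE_ROOT_PORT	0x4
#define PCI_EXP_TYPE_RC_EC	0xa		/* Root Complex Event Collector */
#define PCI_ERR_UNC_INTN	0x00400000	/* Uncorrectable Internal Error */
#define PCI_ERR_COR_INTERNAL	0x00004000	/* Corrected Internal Error */

enum mock_severity { MOCK_NONFATAL, MOCK_FATAL, MOCK_CORRECTABLE };

/* Hypothetical stand-in for the fields of struct aer_err_info used here. */
struct mock_aer_info {
	enum mock_severity severity;
	uint32_t status;
};

/* Pick the status bit that matches the severity, as the patch does. */
static bool is_internal_error(const struct mock_aer_info *info)
{
	if (info->severity == MOCK_CORRECTABLE)
		return info->status & PCI_ERR_COR_INTERNAL;

	return info->status & PCI_ERR_UNC_INTN;
}

/* Walk only for an RCEC (not a port) that reports an internal error. */
static bool needs_rcec_walk(int pcie_type, const struct mock_aer_info *info)
{
	return pcie_type == PCI_EXP_TYPE_RC_EC && is_internal_error(info);
}
```

A Root Port reporting the same status bit, or an RCEC reporting any other error, never triggers the walk in this model.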
Terry Bowman wrote: > From: Robert Richter <rrichter@amd.com> > > In Restricted CXL Device (RCD) mode a CXL device is exposed as an > RCiEP, but CXL downstream and upstream ports are not enumerated and > not visible in the PCIe hierarchy. Protocol and link errors are sent > to an RCEC. > > Restricted CXL host (RCH) downstream port-detected errors are signaled > as internal AER errors, either Uncorrectable Internal Error (UIE) or > Corrected Internal Errors (CIE). The error source is the id of the > RCEC. A CXL handler must then inspect the error status in various CXL > registers residing in the dport's component register space (CXL RAS > cap) or the dport's RCRB (AER ext cap). [1] > > Errors showing up in the RCEC's error handler must be handled and > connected to the CXL subsystem. Implement this by forwarding the error > to all CXL devices below the RCEC. Since the entire CXL device is > controlled only using PCIe Configuration Space of device 0, Function > 0, only pass it there [2]. These devices have the Memory Device class > code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver > can implement the handler. In addition to errors directed to the CXL > endpoint device, the handler must also inspect the CXL downstream > port's CXL RAS and PCIe AER external capabilities that is connected to > the device. > > Since CXL downstream port errors are signaled using internal errors, > the handler requires those errors to be unmasked. This is subject of a > follow-on patch. > > The reason for choosing this implementation is that a CXL RCEC device > is bound to the AER port driver, but the driver does not allow it to > register a custom specific handler to support CXL. Connecting the RCEC > hard-wired with a CXL handler does not work, as the CXL subsystem > might not be present all the time. The alternative to add an > implementation to the portdrv to allow the registration of a custom > RCEC error handler isn't worth doing it as CXL would be its only user. 
> Instead, just check for an CXL RCEC and pass it down to the connected > CXL device's error handler. With this approach the code can entirely > be implemented in the PCIe AER driver and is independent of the CXL > subsystem. The CXL driver only provides the handler. > > [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors > [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices > > Co-developed-by: Terry Bowman <terry.bowman@amd.com> > Signed-off-by: Robert Richter <rrichter@amd.com> > Signed-off-by: Terry Bowman <terry.bowman@amd.com> > Cc: "Oliver O'Halloran" <oohall@gmail.com> > Cc: Bjorn Helgaas <bhelgaas@google.com> > Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com> > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-pci@vger.kernel.org > --- > drivers/pci/pcie/Kconfig | 8 ++++++ > drivers/pci/pcie/aer.c | 61 ++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 69 insertions(+) > > diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig > index 228652a59f27..b0dbd864d3a3 100644 > --- a/drivers/pci/pcie/Kconfig > +++ b/drivers/pci/pcie/Kconfig > @@ -49,6 +49,14 @@ config PCIEAER_INJECT > gotten from: > https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/ > > +config PCIEAER_CXL > + bool "PCI Express CXL RAS support" > + default y > + depends on PCIEAER && CXL_PCI > + help > + This enables CXL error handling for Restricted CXL Hosts > + (RCHs). > + > # > # PCI Express ECRC > # > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c > index 7a25b62d9e01..171a08fd8ebd 100644 > --- a/drivers/pci/pcie/aer.c > +++ b/drivers/pci/pcie/aer.c > @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent, > return true; > } > > +#ifdef CONFIG_PCIEAER_CXL > + > +static bool is_cxl_mem_dev(struct pci_dev *dev) > +{ > + /* > + * A CXL device is controlled only using PCIe Configuration > + * Space of device 0, Function 0. 
> +	 */
> +	if (dev->devfn != PCI_DEVFN(0, 0))
> +		return false;
> +
> +	/* Right now there is only a CXL.mem driver */
> +	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> +		return false;
> +
> +	return true;
> +}

This part feels broken because most of the errors of concern here are CXL
link generic, and that can involve CXL.cache and CXL.mem errors on
devices that are not PCI_CLASS_MEMORY_CXL. This situation feels like it
wants formal acknowledgement in 'struct pci_dev' that CXL links ride on
top of PCIe links.

If it were not for RCRBs then the PCI core could just do:

    dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
                                      CXL_DVSEC_FLEXBUS_PORT);

...at bus scan time to identify devices with active CXL links. RCRBs
unfortunately make it so the link presence cannot be detected until a
CXL driver is loaded to read that DVSEC out of MMIO space.

However, I still think that looks like a CXL aware driver registering a
'struct cxl_link' (for lack of a better name) object with a
corresponding PCI device. That link can indicate whether this is an RCH
topology and whether it needs to do the RCEC walk, and that registration
event can flag the RCEC as having CXL link duties to attend to on AER
events.

I suspect 'struct cxl_link' can also be used if/when we get to
incorporating CXL Reset into PCI reset handling.

> +
> +static bool is_internal_error(struct aer_err_info *info)
> +{
> +	if (info->severity == AER_CORRECTABLE)
> +		return info->status & PCI_ERR_COR_INTERNAL;
> +
> +	return info->status & PCI_ERR_UNC_INTN;
> +}
> +
> +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info);
> +
> +static int cxl_handle_error_iter(struct pci_dev *dev, void *data)
> +{
> +	struct aer_err_info *e_info = (struct aer_err_info *)data;
> +
> +	if (!is_cxl_mem_dev(dev))
> +		return 0;

I assume this also needs to reference the RDPAS if present?
CXL 3.0 9.17.1.5 RCEC Downstream Port Association Structure (RDPAS)

> +
> +	/* pci_dev_put() in handle_error_source() */
> +	dev = pci_dev_get(dev);
> +	if (dev)
> +		handle_error_source(dev, e_info);

I went looking but missed where handle_error_source() synchronizes
against driver ->remove()?

> +
> +	return 0;
> +}
> +
> +static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info)

Naming suggestion...

Given that the VH topology does not require this scanning and
association step, let's call this cxl_rch_handle_error() to make it clear
this is only here to undo the awkwardness of CXL 1.1 platforms hiding
registers from typical PCI scanning. A reference to:

    CXL 3.0 9.11.8 CXL Devices Attached to an RCH

...might be useful to a future reader that wonders why the CXL RCH case
is so complicated from an AER perspective.
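Editorial illustration of the class-code concern raised above. This is a minimal userspace sketch, not kernel code: `struct mock_pci_dev`, `mock_is_cxl_mem_dev()` and `filter_self_check()` are hypothetical names, and the example class codes are illustrative assumptions. Only the two checks themselves are taken from the quoted is_cxl_mem_dev().

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in the kernel's PCI headers. */
#define PCI_CLASS_MEMORY_CXL	0x0502
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical stand-in for the two struct pci_dev fields the filter reads. */
struct mock_pci_dev {
	unsigned int devfn;	/* encoded device/function number */
	unsigned int class;	/* 24 bits: base class, sub-class, prog-if */
};

/* The same two checks as is_cxl_mem_dev() in the quoted patch. */
static bool mock_is_cxl_mem_dev(const struct mock_pci_dev *dev)
{
	if (dev->devfn != PCI_DEVFN(0, 0))
		return false;

	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
		return false;

	return true;
}

/* Exercises the filter with three illustrative devices. */
static void filter_self_check(void)
{
	/* Assumed class code of a CXL memory device at function 0. */
	struct mock_pci_dev mem = { .devfn = PCI_DEVFN(0, 0), .class = 0x050210 };
	/* Same class code, but function 1. */
	struct mock_pci_dev fn1 = { .devfn = PCI_DEVFN(0, 1), .class = 0x050210 };
	/* Assumed accelerator-style class code at function 0. */
	struct mock_pci_dev acc = { .devfn = PCI_DEVFN(0, 0), .class = 0x120000 };

	assert(mock_is_cxl_mem_dev(&mem));
	assert(!mock_is_cxl_mem_dev(&fn1));
	/* Skipped by the filter even if it sits behind a CXL link. */
	assert(!mock_is_cxl_mem_dev(&acc));
}
```

Under these assumptions, the third device is never offered the error, which is the gap the review comment points at: link-level errors are not specific to devices carrying the memory class code.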
Dan, thanks for review, see comments inline. On 17.04.23 18:01:41, Dan Williams wrote: > Terry Bowman wrote: > > From: Robert Richter <rrichter@amd.com> > > > > In Restricted CXL Device (RCD) mode a CXL device is exposed as an > > RCiEP, but CXL downstream and upstream ports are not enumerated and > > not visible in the PCIe hierarchy. Protocol and link errors are sent > > to an RCEC. > > > > Restricted CXL host (RCH) downstream port-detected errors are signaled > > as internal AER errors, either Uncorrectable Internal Error (UIE) or > > Corrected Internal Errors (CIE). The error source is the id of the > > RCEC. A CXL handler must then inspect the error status in various CXL > > registers residing in the dport's component register space (CXL RAS > > cap) or the dport's RCRB (AER ext cap). [1] > > > > Errors showing up in the RCEC's error handler must be handled and > > connected to the CXL subsystem. Implement this by forwarding the error > > to all CXL devices below the RCEC. Since the entire CXL device is > > controlled only using PCIe Configuration Space of device 0, Function > > 0, only pass it there [2]. These devices have the Memory Device class > > code set (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver > > can implement the handler. In addition to errors directed to the CXL > > endpoint device, the handler must also inspect the CXL downstream > > port's CXL RAS and PCIe AER external capabilities that is connected to > > the device. > > > > Since CXL downstream port errors are signaled using internal errors, > > the handler requires those errors to be unmasked. This is subject of a > > follow-on patch. > > > > The reason for choosing this implementation is that a CXL RCEC device > > is bound to the AER port driver, but the driver does not allow it to > > register a custom specific handler to support CXL. Connecting the RCEC > > hard-wired with a CXL handler does not work, as the CXL subsystem > > might not be present all the time. 
The alternative to add an > > implementation to the portdrv to allow the registration of a custom > > RCEC error handler isn't worth doing it as CXL would be its only user. > > Instead, just check for an CXL RCEC and pass it down to the connected > > CXL device's error handler. With this approach the code can entirely > > be implemented in the PCIe AER driver and is independent of the CXL > > subsystem. The CXL driver only provides the handler. > > > > [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors > > [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices > > > > Co-developed-by: Terry Bowman <terry.bowman@amd.com> > > Signed-off-by: Robert Richter <rrichter@amd.com> > > Signed-off-by: Terry Bowman <terry.bowman@amd.com> > > Cc: "Oliver O'Halloran" <oohall@gmail.com> > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > Cc: Mahesh J Salgaonkar <mahesh@linux.ibm.com> > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: linux-pci@vger.kernel.org > > --- > > drivers/pci/pcie/Kconfig | 8 ++++++ > > drivers/pci/pcie/aer.c | 61 ++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 69 insertions(+) > > > > diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig > > index 228652a59f27..b0dbd864d3a3 100644 > > --- a/drivers/pci/pcie/Kconfig > > +++ b/drivers/pci/pcie/Kconfig > > @@ -49,6 +49,14 @@ config PCIEAER_INJECT > > gotten from: > > https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/ > > > > +config PCIEAER_CXL > > + bool "PCI Express CXL RAS support" > > + default y > > + depends on PCIEAER && CXL_PCI > > + help > > + This enables CXL error handling for Restricted CXL Hosts > > + (RCHs). 
> > +
> >  #
> >  # PCI Express ECRC
> >  #
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 7a25b62d9e01..171a08fd8ebd 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent,
> >  	return true;
> >  }
> >
> > +#ifdef CONFIG_PCIEAER_CXL
> > +
> > +static bool is_cxl_mem_dev(struct pci_dev *dev)
> > +{
> > +	/*
> > +	 * A CXL device is controlled only using PCIe Configuration
> > +	 * Space of device 0, Function 0.
> > +	 */
> > +	if (dev->devfn != PCI_DEVFN(0, 0))
> > +		return false;
> > +
> > +	/* Right now there is only a CXL.mem driver */
> > +	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> > +		return false;
> > +
> > +	return true;
> > +}
>
> This part feels broken because most of the errors of concern here are CXL
> link generic, and that can involve CXL.cache and CXL.mem errors on
> devices that are not PCI_CLASS_MEMORY_CXL. This situation feels like it
> wants formal acknowledgement in 'struct pci_dev' that CXL links ride on
> top of PCIe links.

There is already rcec->rcec_ea that holds the RCEC-to-endpoint
association. Determining if the RCiEP is a CXL dev is a small check,
which is exactly what is_cxl_mem_dev() is for. I don't see a benefit in
holding the same information in an additional cxl_link structure. And
as you also said below, for RCRB handling a CXL driver is needed, which
is why is_cxl_mem_dev() with the class check is used below.

> If it were not for RCRBs then the PCI core could just do:
>
>     dvsec = pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
>                                       CXL_DVSEC_FLEXBUS_PORT);
>
> ...at bus scan time to identify devices with active CXL links. RCRBs
> unfortunately make it so the link presence cannot be detected until a
> CXL driver is loaded to read that DVSEC out of MMIO space.

In a VH topology those errors can be handled directly in a PCI driver
for CXL ports; if the portdrv handles that, the check could be useful.
But this is not the subject of this patch series.

> However, I still think that looks like a CXL aware driver registering a
> 'struct cxl_link' (for lack of a better name) object with a
> corresponding PCI device. That link can indicate whether this is an RCH
> topology and whether it needs to do the RCEC walk, and that registration
> event can flag the RCEC as having CXL link duties to attend to on AER
> events.

For CXL awareness of the AER driver the simple checks from above could
be used, either called directly for the pci_dev (VH mode), or by walking
the RCEC. IMO, a 'struct cxl_link' and a function to register it are not
really needed here.

> I suspect 'struct cxl_link' can also be used if/when we get to
> incorporating CXL Reset into PCI reset handling.
>
> > +
> > +static bool is_internal_error(struct aer_err_info *info)
> > +{
> > +	if (info->severity == AER_CORRECTABLE)
> > +		return info->status & PCI_ERR_COR_INTERNAL;
> > +
> > +	return info->status & PCI_ERR_UNC_INTN;
> > +}
> > +
> > +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info);
> > +
> > +static int cxl_handle_error_iter(struct pci_dev *dev, void *data)
> > +{
> > +	struct aer_err_info *e_info = (struct aer_err_info *)data;
> > +
> > +	if (!is_cxl_mem_dev(dev))
> > +		return 0;
>
> I assume this also needs to reference the RDPAS if present?

That is the subject of a follow-on patch. Here I see why you may need a
struct cxl_link. But that list must not reside in the pci_dev; instead,
a CXL aware driver can look up a self-maintained list of RDPAS mappings
(RCEC-to-Downstream Port associations) to decide whether to look up the
dport's AER and RAS capabilities.

> CXL 3.0 9.17.1.5 RCEC Downstream Port Association Structure (RDPAS)
>
> > +
> > +	/* pci_dev_put() in handle_error_source() */
> > +	dev = pci_dev_get(dev);
> > +	if (dev)
> > +		handle_error_source(dev, e_info);
>
> I went looking but missed where handle_error_source() synchronizes
> against driver ->remove()?
Right, the device_lock() is missing in handle_error_source() while
accessing pdrv and calling the handler. Will send a fix.

> > +
> > +	return 0;
> > +}
> > +
> > +static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info)
>
> Naming suggestion...
>
> Given that the VH topology does not require this scanning and
> association step, let's call this cxl_rch_handle_error() to make it clear
> this is only here to undo the awkwardness of CXL 1.1 platforms hiding
> registers from typical PCI scanning. A reference to:
>
>     CXL 3.0 9.11.8 CXL Devices Attached to an RCH
>
> ...might be useful to a future reader that wonders why the CXL RCH case
> is so complicated from an AER perspective.

Ok.

Thanks,

-Robert
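Editorial illustration of the locking point acknowledged above. This is a userspace model of the ordering only, under stated assumptions: every `mock_*` name is hypothetical, and the lock is a plain counter that records whether it is held, so it provides no real mutual exclusion. It is not the fix that was later posted.

```c
#include <assert.h>
#include <stddef.h>

struct mock_dev;

/* Hypothetical driver with a single error callback. */
struct mock_driver {
	void (*cor_error_detected)(struct mock_dev *dev);
};

struct mock_dev {
	int lock_depth;			/* stand-in for the device lock */
	struct mock_driver *driver;	/* NULL once the driver is unbound */
	int handled;			/* handler invocations */
	int handled_unlocked;		/* invocations made without the lock */
};

static void mock_device_lock(struct mock_dev *dev)   { dev->lock_depth++; }
static void mock_device_unlock(struct mock_dev *dev) { dev->lock_depth--; }

/* Records whether it was called with the lock held. */
static void mock_cor_handler(struct mock_dev *dev)
{
	dev->handled++;
	if (dev->lock_depth == 0)
		dev->handled_unlocked++;
}

/*
 * Take the lock, read the driver binding under it, call the handler,
 * then release. A concurrent unbind that takes the same lock could
 * then not run between the check and the call.
 */
static void mock_handle_error_source(struct mock_dev *dev)
{
	struct mock_driver *drv;

	mock_device_lock(dev);
	drv = dev->driver;
	if (drv && drv->cor_error_detected)
		drv->cor_error_detected(dev);
	mock_device_unlock(dev);
}

/* Bound device: handler runs once, under the lock. Unbound: it is skipped. */
static void lock_self_check(void)
{
	struct mock_driver drv = { .cor_error_detected = mock_cor_handler };
	struct mock_dev bound = { .driver = &drv };
	struct mock_dev unbound = { .driver = NULL };

	mock_handle_error_source(&bound);
	assert(bound.handled == 1 && bound.handled_unlocked == 0);
	assert(bound.lock_depth == 0);

	mock_handle_error_source(&unbound);
	assert(unbound.handled == 0);
}
```

The design point is that the binding is read and used inside one critical section, rather than being sampled first and dereferenced later.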
Bjorn, On 18.04.23 00:00:58, Robert Richter wrote: > On 14.04.23 16:32:54, Bjorn Helgaas wrote: > > On Thu, Apr 13, 2023 at 01:40:52PM +0200, Robert Richter wrote: > > > On 12.04.23 17:02:33, Bjorn Helgaas wrote: > > > > On Tue, Apr 11, 2023 at 01:03:01PM -0500, Terry Bowman wrote: > > I'm mostly interested in the PCI entities involved because that's all > > aer.c can deal with. For the above, I think the PCI core only knows > > about these: > > > > 00:00.0 RCEC with AER, RCEC EA includes 00:01.0 > > 00:01.0 RCiEP with AER > > > > aer_irq() would handle AER interrupts from 00:00.0. > > cxl_handle_error() would be called for 00:00.0 and would call > > handle_error_source() for everything below it (only 00:01.0 here). > > > > > > The current code uses pcie_walk_rcec() in this path, which basically > > > > searches below a Root Port or RCEC for devices that have an AER error > > > > status bit set, add them to the e_info[] list, and call > > > > handle_error_source() for each one: > > > > > > For reference, this series adds support to handle RCH downstream > > > port-detected errors as described in CXL 3.0, 12.2.1.1. > > > > > > This flow looks correct to me, see comments inline. > > > > We seem to be on the same page here, so I'll trim it out. > > > > > ... > > > > So we insert cxl_handle_error() in handle_error_source(), where it > > > > gets called for the RCEC, and then it uses pcie_walk_rcec() again to > > > > forcibly call handle_error_source() for *every* device "below" the > > > > RCEC (even though they don't have AER error status bits set). > > > > > > The CXL device contains the links to the dport's caps. Also, there can > > > be multiple RCs with CXL devs connected to it. So we must search for > > > all CXL devices now, determine the corresponding dport and inspect > > > both, PCIe AER and CXL RAS caps. 
> > > > > > > Then handle_error_source() ultimately calls the CXL driver err_handler > > > > entry points (.cor_error_detected(), .error_detected(), etc), which > > > > can look at the CXL-specific error status in the CXL RAS or RCRB or > > > > whatever. > > > > > > The AER driver (portdrv) does not have the knowledge of CXL internals. > > > Thus the approach is to pass dport errors to the cxl_mem driver to > > > handle it there in addition to cxl mem dev errors. > > > > > > > So this basically looks like a workaround for the fact that the AER > > > > code only calls handle_error_source() when it finds AER error status, > > > > and CXL doesn't *set* that AER error status. There's not that much > > > > code here, but it seems like a quite a bit of complexity in an area > > > > that is already pretty complicated. > > > > My main point here (correct me if I got this wrong) is that: > > > > - A RCEC generates an AER interrupt > > > > - find_source_device() searches all devices below the RCEC and > > builds a list everything for which to call handle_error_source() > > find_source_device() does not walk the RCEC if the error source is the > RCEC itself (note that find_device_iter() is called for the root/rcec > device first and exits early then). > > > > > - cxl_handle_error() *again* looks at all devices below the same > > RCEC and calls handle_error_source() for each one > > > > So the main difference here is that the existing flow only calls > > handle_error_source() when it finds an error logged in an AER status > > register, while the new CXL flow calls handle_error_source() for > > *every* device below the RCEC. > > That is limited as much as possible: > > * The RCEC walk to handle CXL dport errors is done only in case of > internal errors, for an RCEC only (not a port) (check in > cxl_handle_error()). > > * Internal errors are only enabled for RCECs connected to CXL devices > (handles_cxl_errors()). 
> > * The handler is only called if it is a CXL memory device (class code > set and zero devfn) (check in cxl_handle_error_iter()). > > An optimization I see here is to convert some runtime checks to cached > values determined during device enumeration (CXL device list, RCEC is > associated with CXL devices). Some sort of RCEC-to-CXL-dev > association, similar to rcec->rcec_ea. > > > > > I think it's OK to do that, but the almost recursive structure and the > > unusual reference counting make the overall AER flow much harder to > > understand. > > > > What if we changed is_error_source() to add every CXL.mem device it > > finds to the e_info[] list, which I think could nicely encapsulate the > > idea that "CXL devices have error state we don't know how to interpret > > here"? Would the existing loop in aer_process_err_devices() then do > > what you need? > > I did not want to mix this with devices determined by the Error Source > Identification Register. CXL device may not be the error source of an > error which may cause some unwanted side-effects. We must also touch > AER_MAX_MULTI_ERR_DEVICES then and how the dev list is implemented as > the max number of devices is unclear. > > > > > > > Here's another idea: the ACPI GHES code (ghes_handle_aer()) basically > > > > receives a packet of error status from firmware and queues it for > > > > recovery via pcie_do_recovery(). What if you had a CXL module that > > > > knew how to look for the CXL error status, package it up similarly, > > > > and queue it via aer_recover_queue()? > > > > > > ... > > > But first, RCEC error notifications (RCEC AER interrupts) must be sent > > > to the CXL driver to look into the dport's RCRB. > > > > Right. I think it could be solvable to have aer_irq() call or wake a > > CXL interface that has been registered. But maybe changing > > is_error_source() would be simpler. > > I am going to see if is_error_source() can be used to also find CXL > devices. 
But my main concern here is to mix CXL devices with actual
> devices identified by the Error Source ID.

I have looked into reusing is_error_source() and modifying
find_source_device() to also add CXL devices (the RCiEPs) to the dev
list in e_info. The problem I see is that at AER level it is unknown
whether an error happened or not. The downstream port AER capability
also does not reside in a PCI config space header and thus is not
directly bound to a pci_dev. That means the endpoint's AER capability
in pci_dev is not the one we need; instead, a CXL aware driver must look
up the RCRB which contains the AER capability. Additionally, the CXL RAS
cap must be inspected by that driver.

Assuming we add the RCiEP to the dev list, the CXL endpoint will be
processed by aer_get_device_error_info(), aer_print_error() and
handle_error_source(). This is done for the endpoint device even if
the source is the dport. Also, we need to check the error status of
both capabilities' registers first. This will cause error reports and
status checks for devices that are not the error source.

That said, I think the best option is still to delegate the error down
to a CXL handler and do the error status check, reporting and handling
of the CXL specifics there.

I see your point that esp. the pci_dev's refcount handling needs to be
improved. I will address that along with the other review comments in
the next version of this patch series. Let's then revisit this
discussion here?

Thanks,

-Robert
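Editorial illustration of the delegation argument above: since the AER core cannot tell whether a downstream port actually logged anything, a CXL-aware handler would first check both status sources and report only when one is set. This is a minimal userspace sketch under stated assumptions. `struct mock_dport_status`, the `MOCK_ORIGIN_*` flags and both functions are hypothetical, and the status words are plain integers rather than real register reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what a CXL-aware driver would read. */
struct mock_dport_status {
	uint32_t aer_status;	/* dport AER status, located via the RCRB */
	uint32_t ras_status;	/* CXL RAS capability status */
};

/* Hypothetical flags naming which capability logged an error. */
enum {
	MOCK_ORIGIN_NONE = 0,
	MOCK_ORIGIN_AER  = 1 << 0,
	MOCK_ORIGIN_RAS  = 1 << 1,
};

/* Check both capabilities; either one may carry the error. */
static unsigned int mock_dport_error_origin(const struct mock_dport_status *st)
{
	unsigned int origin = MOCK_ORIGIN_NONE;

	if (st->aer_status)
		origin |= MOCK_ORIGIN_AER;
	if (st->ras_status)
		origin |= MOCK_ORIGIN_RAS;

	return origin;
}

/* Report only for a dport that logged something in either capability. */
static bool mock_should_report(const struct mock_dport_status *st)
{
	return mock_dport_error_origin(st) != MOCK_ORIGIN_NONE;
}

/* A clean dport stays silent; a RAS-only error is still reported. */
static void origin_self_check(void)
{
	struct mock_dport_status clean = { 0, 0 };
	struct mock_dport_status ras_only = { 0, 0x1 };

	assert(!mock_should_report(&clean));
	assert(mock_dport_error_origin(&ras_only) == MOCK_ORIGIN_RAS);
}
```

In this model a device that is walked but has nothing logged produces no report, which is the side effect the message above wants to avoid by keeping the status check out of the generic AER path.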
diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig index 228652a59f27..b0dbd864d3a3 100644 --- a/drivers/pci/pcie/Kconfig +++ b/drivers/pci/pcie/Kconfig @@ -49,6 +49,14 @@ config PCIEAER_INJECT gotten from: https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/ +config PCIEAER_CXL + bool "PCI Express CXL RAS support" + default y + depends on PCIEAER && CXL_PCI + help + This enables CXL error handling for Restricted CXL Hosts + (RCHs). + # # PCI Express ECRC # diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c index 7a25b62d9e01..171a08fd8ebd 100644 --- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -946,6 +946,65 @@ static bool find_source_device(struct pci_dev *parent, return true; } +#ifdef CONFIG_PCIEAER_CXL + +static bool is_cxl_mem_dev(struct pci_dev *dev) +{ + /* + * A CXL device is controlled only using PCIe Configuration + * Space of device 0, Function 0. + */ + if (dev->devfn != PCI_DEVFN(0, 0)) + return false; + + /* Right now there is only a CXL.mem driver */ + if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL) + return false; + + return true; +} + +static bool is_internal_error(struct aer_err_info *info) +{ + if (info->severity == AER_CORRECTABLE) + return info->status & PCI_ERR_COR_INTERNAL; + + return info->status & PCI_ERR_UNC_INTN; +} + +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info); + +static int cxl_handle_error_iter(struct pci_dev *dev, void *data) +{ + struct aer_err_info *e_info = (struct aer_err_info *)data; + + if (!is_cxl_mem_dev(dev)) + return 0; + + /* pci_dev_put() in handle_error_source() */ + dev = pci_dev_get(dev); + if (dev) + handle_error_source(dev, e_info); + + return 0; +} + +static void cxl_handle_error(struct pci_dev *dev, struct aer_err_info *info) +{ + /* + * CXL downstream port errors are signaled as RCEC internal + * errors. Forward them to all CXL devices below the RCEC. 
+ */ + if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC && + is_internal_error(info)) + pcie_walk_rcec(dev, cxl_handle_error_iter, info); +} + +#else +static inline void cxl_handle_error(struct pci_dev *dev, + struct aer_err_info *info) { } +#endif + /** * handle_error_source - handle logging error into an event log * @dev: pointer to pci_dev data structure of error source device @@ -957,6 +1016,8 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info) { int aer = dev->aer_cap; + cxl_handle_error(dev, info); + if (info->severity == AER_CORRECTABLE) { /* * Correctable error does not need software intervention.