diff mbox series

[v2,4/5] cxl/pci: Forward RCH downstream port-detected errors to the CXL.mem dev handler

Message ID 20230323213808.398039-5-terry.bowman@amd.com
State Superseded
Headers show
Series cxl/pci: Add support for RCH RAS error handling | expand

Commit Message

Bowman, Terry March 23, 2023, 9:38 p.m. UTC
From: Robert Richter <rrichter@amd.com>

In RCD mode a CXL device (RCD) is exposed as an RCiEP, but CXL
downstream and upstream ports are not enumerated and not visible in
the PCIe hierarchy. Protocol and link errors are sent to an RCEC.

Now, RCH downstream port-detected errors are signaled as internal AER
errors (UIE/CIE) with the RCEC's source ID. A CXL handler must then
inspect the error status in various CXL registers residing in the
dport's component register space (CXL RAS cap) or the dport's RCRB
(AER ext cap). [1]

This patch connects errors showing up in the RCEC's error handler with
the CXL subsystem. Implement this by forwarding the error to all CXL
devices below the RCEC. Since the entire CXL device is controlled only
using PCIe Configuration Space of device 0, Function 0, only pass it
there [2]. These devices have the Memory Device class code set
(PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver can
implement the handler. The CXL device driver is then responsible to
enable error reporting in the RCEC's AER cap (esp. CIE and UIE) and to
inspect the dport's CXL registers in addition (CXL RAS cap and AER ext
cap).

The reason for choosing this implementation is that a CXL RCEC device
is bound to the AER port driver, but the driver does not allow it to
register a custom specific handler to support CXL. Connecting the RCEC
hard-wired with a CXL handler does not work, as the CXL subsystem
might not be present all the time. The alternative to add an
implementation to the portdrv to allow the registration of a custom
RCEC error handler isn't worth doing it as CXL would be its only user.
Instead, just check for an CXL RCEC and pass it down to the connected
CXL device's error handler.

[1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
[2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices

Co-developed-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/pcie/aer.c | 45 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

Comments

Bowman, Terry March 23, 2023, 10:27 p.m. UTC | #1
Adding PCI reviewers.

https://lore.kernel.org/all/20230323213808.398039-1-terry.bowman@amd.com/

On 3/23/23 16:38, Terry Bowman wrote:
> |errors (UIE/CIE) with the RCEC's source ID. A CXL handler must then|
Bjorn Helgaas March 24, 2023, 10:36 p.m. UTC | #2
I'd call this a "PCI/AER: ..." patch since that's where all the
changes are.

On Thu, Mar 23, 2023 at 04:38:07PM -0500, Terry Bowman wrote:
> From: Robert Richter <rrichter@amd.com>
> 
> In RCD mode a CXL device (RCD) is exposed as an RCiEP, but CXL
> downstream and upstream ports are not enumerated and not visible in
> the PCIe hierarchy. Protocol and link errors are sent to an RCEC.

"RCD" isn't a common term in drivers/pci; can you expand it once here?

> Now, RCH downstream port-detected errors are signaled as internal AER
> errors (UIE/CIE) with the RCEC's source ID. A CXL handler must then

Similarly, "UIE" and "CIE" are new to drivers/pci; can you expand them
before using?  I assume Uncorrectable Internal Error (UIE) and
Corrected Internal Error (CIE)?  (Annoying that the PCIe spec uses
"Correctable" in general, but "Corrected" for Internal Errors.)

> inspect the error status in various CXL registers residing in the
> dport's component register space (CXL RAS cap) or the dport's RCRB
> (AER ext cap). [1]
> 
> This patch connects errors showing up in the RCEC's error handler with

"Connect errors ..." (we already know this text applies to *this
patch*).

> the CXL subsystem. Implement this by forwarding the error to all CXL
> devices below the RCEC. Since the entire CXL device is controlled only
> using PCIe Configuration Space of device 0, Function 0, only pass it
> there [2]. These devices have the Memory Device class code set
> (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver can
> implement the handler.

> The CXL device driver is then responsible to
> enable error reporting in the RCEC's AER cap

I don't know exactly what you mean by "error reporting in the RCEC's
AER cap", but IIUC, for non-Root Port devices, generation of ERR_COR/
ERR_NONFATAL/ERR_FATAL messages is controlled by the Device Control
register and should already be enabled by pci_aer_init().

Maybe you mean setting AER mask/severity specifically for Internal
Errors?  I'm hoping to get as much of AER management as we can in the
PCI core and out of drivers, so maybe we need a new PCI interface to
do that.

In any event, I assume this sort of configuration would be an
enumeration-time thing, while *this* patch is a run-time thing, so
maybe this information belongs with a different patch?

> (esp. CIE and UIE) and to
> inspect the dport's CXL registers in addition (CXL RAS cap and AER ext
> cap).
> 
> The reason for choosing this implementation is that a CXL RCEC device
> is bound to the AER port driver, but the driver does not allow it to
> register a custom specific handler to support CXL. Connecting the RCEC
> hard-wired with a CXL handler does not work, as the CXL subsystem
> might not be present all the time. The alternative to add an
> implementation to the portdrv to allow the registration of a custom
> RCEC error handler isn't worth doing it as CXL would be its only user.
> Instead, just check for an CXL RCEC and pass it down to the connected
> CXL device's error handler.
> 
> [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
> [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
> 
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Robert Richter <rrichter@amd.com>

Since you're sending this patch (Terry) your Signed-off-by should be
last.

> Cc: "Oliver O'Halloran" <oohall@gmail.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/pcie/aer.c | 45 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
> 
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 7f0f52d094a4..d250a4caa85a 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -943,6 +943,49 @@ static bool find_source_device(struct pci_dev *parent,
>  	return true;
>  }
>  
> +#if IS_ENABLED(CONFIG_CXL_PCI)
> +
> +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info);
> +
> +static int handle_cxl_error_iter(struct pci_dev *dev, void *data)
> +{
> +	struct aer_err_info *e_info = (struct aer_err_info *)data;
> +

Thanks for explaining the :00.0 in the commit log.  I think a one-line
comment here would be useful too so future readers don't have to dig
out the commit to understand.

> +	if (dev->devfn != PCI_DEVFN(0, 0))
> +		return 0;
> +
> +	/* Right now there is only a CXL.mem driver */
> +	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> +		return 0;
> +
> +	/* pci_dev_put() in handle_error_source() */
> +	dev = pci_dev_get(dev);

I don't see why you need this.  Didn't we get here via this path?

  aer_isr
    aer_isr_one_error
      find_source_device
        find_device_iter
          if (is_error_source())
            add_error_device
              pci_dev_get          <-- existing pci_dev_get()
      aer_process_err_devices
        for (i = 0; i < e_info->error_dev_num && e_info->dev[i]; i++)
          if (aer_get_device_error_info(e_info->dev[i], e_info))
            handle_error_source
  +           handle_cxl_error
              pci_dev_put(dev)     <-- existing pci_dev_put()

So it looks like we wouldn't call handle_error_source() unless we had
a valid e_info->dev[i], which has already had pci_dev_get() called on
it.

Oh, I think I see ... handle_cxl_error() itself was called because an
RCEC reported an error on behalf of a CXL RCiEP (?), and then you use
pcie_walk_rcec() to look through all the associated RCiEPs, and
recursively call handle_error_source(), and we haven't acquired a
reference to those RCiEPs.  Right?

But I thought the CXL things were not enumerated (first paragraph of
commit log)?  But obviously these RCiEPs must be enumerated as PCI
devices or pcie_walk_rcec() and pci_dev_get() wouldn't work.

I haven't worked all the way through this, but I thought Sean Kelley's
and Qiuxu Zhuo's work was along the same line and might cover this,
e.g.,

  a175102b0a82 ("PCI/ERR: Recover from RCEC AER errors")
  579086225502 ("PCI/ERR: Recover from RCiEP AER errors")
  af113553d961 ("PCI/AER: Add pcie_walk_rcec() to RCEC AER handling")

But I guess maybe it's not quite the same case?

If you *do* need this, I know pci_dev_get(NULL) is a no-op, but since
you're testing for NULL anyway, I'd put it inside the "if" body.

> +	if (dev)
> +		handle_error_source(dev, e_info);
> +
> +	return 0;
> +}
> +
> +static bool is_internal_error(struct aer_err_info *info)
> +{
> +	if (info->severity == AER_CORRECTABLE)
> +		return info->status & PCI_ERR_COR_INTERNAL;
> +
> +	return info->status & PCI_ERR_UNC_INTN;
> +}
> +
> +static void handle_cxl_error(struct pci_dev *dev, struct aer_err_info *info)
> +{
> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC &&
> +	    is_internal_error(info))

What's unique about Internal Errors?  I'm trying to figure out why you
wouldn't do this for *all* CXL errors.

> +		pcie_walk_rcec(dev, handle_cxl_error_iter, info);
> +}
> +
> +#else
> +static inline void handle_cxl_error(struct pci_dev *dev,
> +				    struct aer_err_info *info) { }
> +#endif
> +
>  /**
>   * handle_error_source - handle logging error into an event log
>   * @dev: pointer to pci_dev data structure of error source device
> @@ -954,6 +997,8 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
>  {
>  	int aer = dev->aer_cap;
>  
> +	handle_cxl_error(dev, info);
> +
>  	if (info->severity == AER_CORRECTABLE) {
>  		/*
>  		 * Correctable error does not need software intervention.
> -- 
> 2.34.1
>
Robert Richter March 27, 2023, 9:51 p.m. UTC | #3
Bjorn,

thanks for you review.

On 24.03.23 17:36:56, Bjorn Helgaas wrote:
> I'd call this a "PCI/AER: ..." patch since that's where all the
> changes are.
> 
> On Thu, Mar 23, 2023 at 04:38:07PM -0500, Terry Bowman wrote:
> > From: Robert Richter <rrichter@amd.com>
> > 
> > In RCD mode a CXL device (RCD) is exposed as an RCiEP, but CXL
> > downstream and upstream ports are not enumerated and not visible in
> > the PCIe hierarchy. Protocol and link errors are sent to an RCEC.
> 
> "RCD" isn't a common term in drivers/pci; can you expand it once here?
> 
> > Now, RCH downstream port-detected errors are signaled as internal AER
> > errors (UIE/CIE) with the RCEC's source ID. A CXL handler must then
> 
> Similarly, "UIE" and "CIE" are new to drivers/pci; can you expand them
> before using?  I assume Uncorrectable Internal Error (UIE) and
> Corrected Internal Error (CIE)?  (Annoying that the PCIe spec uses
> "Correctable" in general, but "Corrected" for Internal Errors.)
> 
> > inspect the error status in various CXL registers residing in the
> > dport's component register space (CXL RAS cap) or the dport's RCRB
> > (AER ext cap). [1]
> > 
> > This patch connects errors showing up in the RCEC's error handler with
> 
> "Connect errors ..." (we already know this text applies to *this
> patch*).

I will update subject and description.

> 
> > the CXL subsystem. Implement this by forwarding the error to all CXL
> > devices below the RCEC. Since the entire CXL device is controlled only
> > using PCIe Configuration Space of device 0, Function 0, only pass it
> > there [2]. These devices have the Memory Device class code set
> > (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver can
> > implement the handler.
> 
> > The CXL device driver is then responsible to
> > enable error reporting in the RCEC's AER cap
> 
> I don't know exactly what you mean by "error reporting in the RCEC's
> AER cap", but IIUC, for non-Root Port devices, generation of ERR_COR/
> ERR_NONFATAL/ERR_FATAL messages is controlled by the Device Control
> register and should already be enabled by pci_aer_init().
> 
> Maybe you mean setting AER mask/severity specifically for Internal
> Errors?  I'm hoping to get as much of AER management as we can in the

Richt, this is implemented in patch #5 in function
rcec_enable_aer_ints().

> PCI core and out of drivers, so maybe we need a new PCI interface to
> do that.
> 
> In any event, I assume this sort of configuration would be an
> enumeration-time thing, while *this* patch is a run-time thing, so
> maybe this information belongs with a different patch?

Do you mean once a Restricted CXL host (RCH) is detected, the internal
errors should be enabled in the device mask, all this done during
device enumeration? But wouldn't interrupts being enabled then before
the CXL device is ready?

> 
> > (esp. CIE and UIE) and to
> > inspect the dport's CXL registers in addition (CXL RAS cap and AER ext
> > cap).
> > 
> > The reason for choosing this implementation is that a CXL RCEC device
> > is bound to the AER port driver, but the driver does not allow it to
> > register a custom specific handler to support CXL. Connecting the RCEC
> > hard-wired with a CXL handler does not work, as the CXL subsystem
> > might not be present all the time. The alternative to add an
> > implementation to the portdrv to allow the registration of a custom
> > RCEC error handler isn't worth doing it as CXL would be its only user.
> > Instead, just check for an CXL RCEC and pass it down to the connected
> > CXL device's error handler.
> > 
> > [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
> > [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
> > 
> > Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> > Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> 
> Since you're sending this patch (Terry) your Signed-off-by should be
> last.
> 
> > Cc: "Oliver O'Halloran" <oohall@gmail.com>
> > Cc: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux-pci@vger.kernel.org
> > ---
> >  drivers/pci/pcie/aer.c | 45 ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 45 insertions(+)
> > 
> > diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> > index 7f0f52d094a4..d250a4caa85a 100644
> > --- a/drivers/pci/pcie/aer.c
> > +++ b/drivers/pci/pcie/aer.c
> > @@ -943,6 +943,49 @@ static bool find_source_device(struct pci_dev *parent,
> >  	return true;
> >  }
> >  
> > +#if IS_ENABLED(CONFIG_CXL_PCI)
> > +
> > +static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info);
> > +
> > +static int handle_cxl_error_iter(struct pci_dev *dev, void *data)
> > +{
> > +	struct aer_err_info *e_info = (struct aer_err_info *)data;
> > +
> 
> Thanks for explaining the :00.0 in the commit log.  I think a one-line
> comment here would be useful too so future readers don't have to dig
> out the commit to understand.

Sure, will add a comment.

> 
> > +	if (dev->devfn != PCI_DEVFN(0, 0))
> > +		return 0;
> > +
> > +	/* Right now there is only a CXL.mem driver */
> > +	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
> > +		return 0;
> > +
> > +	/* pci_dev_put() in handle_error_source() */
> > +	dev = pci_dev_get(dev);

This dev is the endpoint's device handle.

> 
> I don't see why you need this.  Didn't we get here via this path?
> 
>   aer_isr
>     aer_isr_one_error
>       find_source_device
>         find_device_iter
>           if (is_error_source())
>             add_error_device
>               pci_dev_get          <-- existing pci_dev_get()

This is the RCEC's device handle. Note that downstream port errors
have the RCEC's error source id (the RCEC’s Bus, Device, and Function
Numbers, see CXL 3.0, 12.2.1.1). We pass this error to the CXL
endpoint's driver (RCD, the RCiEP) now. The iterator goes through all
endpoints connected to the RCEC.

>       aer_process_err_devices
>         for (i = 0; i < e_info->error_dev_num && e_info->dev[i]; i++)
>           if (aer_get_device_error_info(e_info->dev[i], e_info))
>             handle_error_source
>   +           handle_cxl_error

For the endpoint we further call here now:

                  pcie_walk_rcec(dev, handle_cxl_error_iter, info)
                    handle_cxl_error_iter(endpoint_dev)
		      pci_dev_get(endpoint_dev)
                      handle_error_source(endpoint_dev)
                        pci_dev_put(endpoint_dev)

>               pci_dev_put(dev)     <-- existing pci_dev_put()
> 
> So it looks like we wouldn't call handle_error_source() unless we had
> a valid e_info->dev[i], which has already had pci_dev_get() called on
> it.
> 
> Oh, I think I see ... handle_cxl_error() itself was called because an
> RCEC reported an error on behalf of a CXL RCiEP (?), and then you use
> pcie_walk_rcec() to look through all the associated RCiEPs, and
> recursively call handle_error_source(), and we haven't acquired a
> reference to those RCiEPs.  Right?

Yes, exacly.

> 
> But I thought the CXL things were not enumerated (first paragraph of
> commit log)?  But obviously these RCiEPs must be enumerated as PCI
> devices or pcie_walk_rcec() and pci_dev_get() wouldn't work.

The dport and uport are not enumerated, there are two RCRB ranges for
them. But errors are reported using the PCI hierarchy, using the RCEC
and the RCiEP. Once an internal error arrives with the RCEC as an
error source, the AER err cap in the RCRB must be checked.

> 
> I haven't worked all the way through this, but I thought Sean Kelley's
> and Qiuxu Zhuo's work was along the same line and might cover this,
> e.g.,
> 
>   a175102b0a82 ("PCI/ERR: Recover from RCEC AER errors")
>   579086225502 ("PCI/ERR: Recover from RCiEP AER errors")
>   af113553d961 ("PCI/AER: Add pcie_walk_rcec() to RCEC AER handling")
> 
> But I guess maybe it's not quite the same case?

Actually, we use this code to handle errors that are reported to the
RCEC and only implement here the CXL specifics. That is, checking if
the RCEC receives something from a CXL downstream port and forwarding
that to a CXL handler (this patch). The handler then checks the AER
err cap in the RCRB of all CXL downstream ports associated to the RCEC
(not visible in the PCI hierarchy), but discovered through the :00.0
RCiEP (patch #5).

> 
> If you *do* need this, I know pci_dev_get(NULL) is a no-op, but since
> you're testing for NULL anyway, I'd put it inside the "if" body.
> 
> > +	if (dev)
> > +		handle_error_source(dev, e_info);

The check is more for the (theoretical) case where we cannot increment
the refcount.

> > +
> > +	return 0;
> > +}
> > +
> > +static bool is_internal_error(struct aer_err_info *info)
> > +{
> > +	if (info->severity == AER_CORRECTABLE)
> > +		return info->status & PCI_ERR_COR_INTERNAL;
> > +
> > +	return info->status & PCI_ERR_UNC_INTN;
> > +}
> > +
> > +static void handle_cxl_error(struct pci_dev *dev, struct aer_err_info *info)
> > +{
> > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC &&
> > +	    is_internal_error(info))
> 
> What's unique about Internal Errors?  I'm trying to figure out why you
> wouldn't do this for *all* CXL errors.

Per CXL specification downstream port errors are signaled using
internal errors. All other errors would be device specific, we cannot
handle that in a generic CXL driver. Note these are only errors
reported to the RCEC with the RCEC as a source, the RCiEP's errors are
handled using standard reporting already, see aer_isr_one_error() and
further calling pdrv->err_handler->cor_error_detected() or
pdrv->err_handler->error_detected().

-Robert

> 
> > +		pcie_walk_rcec(dev, handle_cxl_error_iter, info);
> > +}
> > +
> > +#else
> > +static inline void handle_cxl_error(struct pci_dev *dev,
> > +				    struct aer_err_info *info) { }
> > +#endif
> > +
> >  /**
> >   * handle_error_source - handle logging error into an event log
> >   * @dev: pointer to pci_dev data structure of error source device
> > @@ -954,6 +997,8 @@ static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
> >  {
> >  	int aer = dev->aer_cap;
> >  
> > +	handle_cxl_error(dev, info);
> > +
> >  	if (info->severity == AER_CORRECTABLE) {
> >  		/*
> >  		 * Correctable error does not need software intervention.
> > -- 
> > 2.34.1
> >
Bowman, Terry March 28, 2023, 1:41 p.m. UTC | #4
Hi Bjorn,

On 3/24/23 17:36, Bjorn Helgaas wrote:
> I'd call this a "PCI/AER: ..." patch since that's where all the
> changes are.
> 
> On Thu, Mar 23, 2023 at 04:38:07PM -0500, Terry Bowman wrote:
>> From: Robert Richter <rrichter@amd.com>
>>
>> In RCD mode a CXL device (RCD) is exposed as an RCiEP, but CXL
>> downstream and upstream ports are not enumerated and not visible in
>> the PCIe hierarchy. Protocol and link errors are sent to an RCEC.
> 
> "RCD" isn't a common term in drivers/pci; can you expand it once here?
> 
>> Now, RCH downstream port-detected errors are signaled as internal AER
>> errors (UIE/CIE) with the RCEC's source ID. A CXL handler must then
> 
> Similarly, "UIE" and "CIE" are new to drivers/pci; can you expand them
> before using?  I assume Uncorrectable Internal Error (UIE) and
> Corrected Internal Error (CIE)?  (Annoying that the PCIe spec uses
> "Correctable" in general, but "Corrected" for Internal Errors.)
> 
>> inspect the error status in various CXL registers residing in the
>> dport's component register space (CXL RAS cap) or the dport's RCRB
>> (AER ext cap). [1]
>>
>> This patch connects errors showing up in the RCEC's error handler with
> 
> "Connect errors ..." (we already know this text applies to *this
> patch*).
> 
>> the CXL subsystem. Implement this by forwarding the error to all CXL
>> devices below the RCEC. Since the entire CXL device is controlled only
>> using PCIe Configuration Space of device 0, Function 0, only pass it
>> there [2]. These devices have the Memory Device class code set
>> (PCI_CLASS_MEMORY_CXL, 502h) and the existing cxl_pci driver can
>> implement the handler.
> 
>> The CXL device driver is then responsible to
>> enable error reporting in the RCEC's AER cap
> 
> I don't know exactly what you mean by "error reporting in the RCEC's
> AER cap", but IIUC, for non-Root Port devices, generation of ERR_COR/
> ERR_NONFATAL/ERR_FATAL messages is controlled by the Device Control
> register and should already be enabled by pci_aer_init().
> 
> Maybe you mean setting AER mask/severity specifically for Internal
> Errors?  I'm hoping to get as much of AER management as we can in the
> PCI core and out of drivers, so maybe we need a new PCI interface to
> do that.
> 
> In any event, I assume this sort of configuration would be an
> enumeration-time thing, while *this* patch is a run-time thing, so
> maybe this information belongs with a different patch?
> 
>> (esp. CIE and UIE) and to
>> inspect the dport's CXL registers in addition (CXL RAS cap and AER ext
>> cap).
>>
>> The reason for choosing this implementation is that a CXL RCEC device
>> is bound to the AER port driver, but the driver does not allow it to
>> register a custom specific handler to support CXL. Connecting the RCEC
>> hard-wired with a CXL handler does not work, as the CXL subsystem
>> might not be present all the time. The alternative to add an
>> implementation to the portdrv to allow the registration of a custom
>> RCEC error handler isn't worth doing it as CXL would be its only user.
>> Instead, just check for an CXL RCEC and pass it down to the connected
>> CXL device's error handler.
>>
>> [1] CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected Errors
>> [2] CXL 3.0 spec, 8.1.3 PCIe DVSEC for CXL Devices
>>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Robert Richter <rrichter@amd.com>
> 
> Since you're sending this patch (Terry) your Signed-off-by should be
> last.
> 

I'll move my Signed-off-by to the last.

Regards,
Terry
Bjorn Helgaas March 28, 2023, 5:21 p.m. UTC | #5
[+cc linux-pci, more error handling folks; beginning of thread at
https://lore.kernel.org/all/20230323213808.398039-1-terry.bowman@amd.com/]

On Mon, Mar 27, 2023 at 11:51:39PM +0200, Robert Richter wrote:
> On 24.03.23 17:36:56, Bjorn Helgaas wrote:

> > > The CXL device driver is then responsible to
> > > enable error reporting in the RCEC's AER cap
> > 
> > I don't know exactly what you mean by "error reporting in the RCEC's
> > AER cap", but IIUC, for non-Root Port devices, generation of ERR_COR/
> > ERR_NONFATAL/ERR_FATAL messages is controlled by the Device Control
> > register and should already be enabled by pci_aer_init().
> > 
> > Maybe you mean setting AER mask/severity specifically for Internal
> > Errors?  I'm hoping to get as much of AER management as we can in the
> 
> Richt, this is implemented in patch #5 in function
> rcec_enable_aer_ints().

I think we should add a PCI core interface for this so we can enforce
the AER ownership question (all the crud like pcie_aer_is_native()) in
one place.

> > PCI core and out of drivers, so maybe we need a new PCI interface to
> > do that.
> > 
> > In any event, I assume this sort of configuration would be an
> > enumeration-time thing, while *this* patch is a run-time thing, so
> > maybe this information belongs with a different patch?
> 
> Do you mean once a Restricted CXL host (RCH) is detected, the internal
> errors should be enabled in the device mask, all this done during
> device enumeration? But wouldn't interrupts being enabled then before
> the CXL device is ready?

I'm not sure what you mean by "before the CXL device is ready."  What
makes a CXL device ready, and how do we know when it is ready?

pci_aer_init() turns on PCI_EXP_DEVCTL_CERE, PCI_EXP_DEVCTL_FERE, etc
as soon as we enumerate the device, before any driver claims the
device.  I'm wondering whether we can do this PCI_ERR_COR_INTERNAL and
PCI_ERR_UNC_INTN fiddling around the same time?

> > I haven't worked all the way through this, but I thought Sean Kelley's
> > and Qiuxu Zhuo's work was along the same line and might cover this,
> > e.g.,
> > 
> >   a175102b0a82 ("PCI/ERR: Recover from RCEC AER errors")
> >   579086225502 ("PCI/ERR: Recover from RCiEP AER errors")
> >   af113553d961 ("PCI/AER: Add pcie_walk_rcec() to RCEC AER handling")
> > 
> > But I guess maybe it's not quite the same case?
> 
> Actually, we use this code to handle errors that are reported to the
> RCEC and only implement here the CXL specifics. That is, checking if
> the RCEC receives something from a CXL downstream port and forwarding
> that to a CXL handler (this patch). The handler then checks the AER
> err cap in the RCRB of all CXL downstream ports associated to the RCEC
> (not visible in the PCI hierarchy), but discovered through the :00.0
> RCiEP (patch #5).

There are two calls to pcie_walk_rcec():

  1) The existing one in find_source_device()
  2) The one you add in handle_cxl_error()

Does the call in handle_cxl_error() look at devices that the existing
call in find_source_device() does not?  I'm trying to understand why
we need both calls.

> > > +static bool is_internal_error(struct aer_err_info *info)
> > > +{
> > > +	if (info->severity == AER_CORRECTABLE)
> > > +		return info->status & PCI_ERR_COR_INTERNAL;
> > > +
> > > +	return info->status & PCI_ERR_UNC_INTN;
> > > +}
> > > +
> > > +static void handle_cxl_error(struct pci_dev *dev, struct aer_err_info *info)
> > > +{
> > > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC &&
> > > +	    is_internal_error(info))
> > 
> > What's unique about Internal Errors?  I'm trying to figure out why you
> > wouldn't do this for *all* CXL errors.
> 
> Per CXL specification downstream port errors are signaled using
> internal errors. 

Maybe a spec reference here to explain is_internal_error()?  Is the
point of the check to *exclude* non-internal errors?  Or is basically
documentation that there shouldn't ever *be* any non-internal errors?
I guess the latter wouldn't make sense because at this point we don't
know whether this is a CXL hierarchy.

> All other errors would be device specific, we cannot
> handle that in a generic CXL driver.

I'm missing the point here.  We don't have any device-specific error
handling in aer.c; it only connects the generic *reporting* mechanism
(AER log registers and Root Port interrupts) to the drivers that do
the device-specific things via err_handler hooks.  I assume we want a
similar model for CXL.

Bjorn
Robert Richter March 29, 2023, 3:59 p.m. UTC | #6
On 28.03.23 12:21:04, Bjorn Helgaas wrote:
> [+cc linux-pci, more error handling folks; beginning of thread at
> https://lore.kernel.org/all/20230323213808.398039-1-terry.bowman@amd.com/]
> 
> On Mon, Mar 27, 2023 at 11:51:39PM +0200, Robert Richter wrote:
> > On 24.03.23 17:36:56, Bjorn Helgaas wrote:
> 
> > > > The CXL device driver is then responsible to
> > > > enable error reporting in the RCEC's AER cap
> > > 
> > > I don't know exactly what you mean by "error reporting in the RCEC's
> > > AER cap", but IIUC, for non-Root Port devices, generation of ERR_COR/
> > > ERR_NONFATAL/ERR_FATAL messages is controlled by the Device Control
> > > register and should already be enabled by pci_aer_init().
> > > 
> > > Maybe you mean setting AER mask/severity specifically for Internal
> > > Errors?  I'm hoping to get as much of AER management as we can in the
> > 
> > Richt, this is implemented in patch #5 in function
> > rcec_enable_aer_ints().
> 
> I think we should add a PCI core interface for this so we can enforce
> the AER ownership question (all the crud like pcie_aer_is_native()) in
> one place.

Do you mean, code around functions rcec_enable_aer_ints() should be
moved to aer.c and the cxl handler then just assumes it is enabled
already? That looks feasible.

> 
> > > PCI core and out of drivers, so maybe we need a new PCI interface to
> > > do that.
> > > 
> > > In any event, I assume this sort of configuration would be an
> > > enumeration-time thing, while *this* patch is a run-time thing, so
> > > maybe this information belongs with a different patch?
> > 
> > Do you mean once a Restricted CXL host (RCH) is detected, the internal
> > errors should be enabled in the device mask, all this done during
> > device enumeration? But wouldn't interrupts being enabled then before
> > the CXL device is ready?
> 
> I'm not sure what you mean by "before the CXL device is ready."  What
> makes a CXL device ready, and how do we know when it is ready?

The cxl_pci driver must be bound to a device which then further
creates a CXL mem dev. With that binding we can determine the
connected CXL dports from the cxl endpoints (which are seen as PCIe
endpoints) to inspect the CXL RAS caps (in the CXL component reg
space) and the PCIe AER caps (in the RCRB of the dport).

> 
> pci_aer_init() turns on PCI_EXP_DEVCTL_CERE, PCI_EXP_DEVCTL_FERE, etc
> as soon as we enumerate the device, before any driver claims the
> device.  I'm wondering whether we can do this PCI_ERR_COR_INTERNAL and
> PCI_ERR_UNC_INTN fiddling around the same time?

Yes, if the CXL device is not yet bound, there is no handler attached
and AER errors are only handled on a PCI level. Though, we need to
make sure the status is cleared.

> 
> > > I haven't worked all the way through this, but I thought Sean Kelley's
> > > and Qiuxu Zhuo's work was along the same line and might cover this,
> > > e.g.,
> > > 
> > >   a175102b0a82 ("PCI/ERR: Recover from RCEC AER errors")
> > >   579086225502 ("PCI/ERR: Recover from RCiEP AER errors")
> > >   af113553d961 ("PCI/AER: Add pcie_walk_rcec() to RCEC AER handling")
> > > 
> > > But I guess maybe it's not quite the same case?
> > 
> > Actually, we use this code to handle errors that are reported to the
> > RCEC and only implement here the CXL specifics. That is, checking if
> > the RCEC receives something from a CXL downstream port and forwarding
> > that to a CXL handler (this patch). The handler then checks the AER
> > err cap in the RCRB of all CXL downstream ports associated to the RCEC
> > (not visible in the PCI hierarchy), but discovered through the :00.0
> > RCiEP (patch #5).
> 
> There are two calls to pcie_walk_rcec():
> 
>   1) The existing one in find_source_device()
>   2) The one you add in handle_cxl_error()
> 
> Does the call in handle_cxl_error() look at devices that the existing
> call in find_source_device() does not?  I'm trying to understand why
> we need both calls.

In case of a dport error, e_info will only contain the RCEC's id after
running find_source_device(). Thus, only the RCEC's handler would be
called. The portdrv is already bound to the device and currently
doesn't have a handler attached.

As described, due to cross dependencies between cxl and the portdrv,
instead of implementing a handler in the portdrv, we decided to
forward errors to the CXL endpoint driver and handle it there. So now,
in handle_cxl_error(), we check if the error source is an RCEC
attached to a CXL bus and we forward everything directly to the CXL
endpoint handler. pcie_walk_rcec() is used for that.

> 
> > > > +static bool is_internal_error(struct aer_err_info *info)
> > > > +{
> > > > +	if (info->severity == AER_CORRECTABLE)
> > > > +		return info->status & PCI_ERR_COR_INTERNAL;
> > > > +
> > > > +	return info->status & PCI_ERR_UNC_INTN;
> > > > +}
> > > > +
> > > > +static void handle_cxl_error(struct pci_dev *dev, struct aer_err_info *info)
> > > > +{
> > > > +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC &&
> > > > +	    is_internal_error(info))
> > > 
> > > What's unique about Internal Errors?  I'm trying to figure out why you
> > > wouldn't do this for *all* CXL errors.
> > 
> > Per CXL specification downstream port errors are signaled using
> > internal errors. 
> 
> Maybe a spec reference here to explain is_internal_error()?  Is the
> point of the check to *exclude* non-internal errors?  Or is basically
> documentation that there shouldn't ever *be* any non-internal errors?
> I guess the latter wouldn't make sense because at this point we don't
> know whether this is a CXL hierarchy.

It is described in CXL 3.0 spec, 12.2.1.1 RCH Downstream Port-detected
Errors.

We do not handle errors other than internal ones, this is what the
check is for. In theory, an RCEC could also throw other kind of
errors. But, as per spec, once internal error are received from the
RCEC, the CXL dports need to be inspected.

> 
> > All other errors would be device specific, we cannot
> > handle that in a generic CXL driver.
> 
> I'm missing the point here.  We don't have any device-specific error
> handling in aer.c; it only connects the generic *reporting* mechanism
> (AER log registers and Root Port interrupts) to the drivers that do
> the device-specific things via err_handler hooks.  I assume we want a
> similar model for CXL.

With device specific I mean implementation defined and not described
in a specification. The CXL handler is sort of generic as it is
(solely) implementing the CXL spec. Hope that makes sense.

Thanks,

-Robert
diff mbox series

Patch

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 7f0f52d094a4..d250a4caa85a 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -943,6 +943,49 @@  static bool find_source_device(struct pci_dev *parent,
 	return true;
 }
 
+#if IS_ENABLED(CONFIG_CXL_PCI)
+
+static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info);
+
+static int handle_cxl_error_iter(struct pci_dev *dev, void *data)
+{
+	struct aer_err_info *e_info = (struct aer_err_info *)data;
+
+	if (dev->devfn != PCI_DEVFN(0, 0))
+		return 0;
+
+	/* Right now there is only a CXL.mem driver */
+	if ((dev->class >> 8) != PCI_CLASS_MEMORY_CXL)
+		return 0;
+
+	/* pci_dev_put() in handle_error_source() */
+	dev = pci_dev_get(dev);
+	if (dev)
+		handle_error_source(dev, e_info);
+
+	return 0;
+}
+
+static bool is_internal_error(struct aer_err_info *info)
+{
+	if (info->severity == AER_CORRECTABLE)
+		return info->status & PCI_ERR_COR_INTERNAL;
+
+	return info->status & PCI_ERR_UNC_INTN;
+}
+
+static void handle_cxl_error(struct pci_dev *dev, struct aer_err_info *info)
+{
+	if (pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC &&
+	    is_internal_error(info))
+		pcie_walk_rcec(dev, handle_cxl_error_iter, info);
+}
+
+#else
+static inline void handle_cxl_error(struct pci_dev *dev,
+				    struct aer_err_info *info) { }
+#endif
+
 /**
  * handle_error_source - handle logging error into an event log
  * @dev: pointer to pci_dev data structure of error source device
@@ -954,6 +997,8 @@  static void handle_error_source(struct pci_dev *dev, struct aer_err_info *info)
 {
 	int aer = dev->aer_cap;
 
+	handle_cxl_error(dev, info);
+
 	if (info->severity == AER_CORRECTABLE) {
 		/*
 		 * Correctable error does not need software intervention.