[v5,01/16] PCI/AER: Introduce 'struct cxl_err_handlers' and add to 'struct pci_driver'

Message ID 20250107143852.3692571-2-terry.bowman@amd.com
State New
Series Enable CXL PCIe port protocol error handling and logging

Commit Message

Bowman, Terry Jan. 7, 2025, 2:38 p.m. UTC
CXL.io provides protocol error handling on top of PCIe protocol error
handling, but CXL.io and PCIe have different handling requirements
for uncorrectable errors (UCE).

The PCIe AER service driver may attempt to recover PCIe devices after a
UCE. Recovery is not used for CXL.io because of the potential for
corruption of what can be system memory.

Create 'struct cxl_error_handlers' and add a pci_driver::cxl_err_handler
pointer to it, similar to 'struct pci_error_handlers' and
pci_driver::err_handler. The new structure provides callbacks for
correctable and uncorrectable CXL.io error handling.

The CXL error handlers will be used in future patches adding CXL PCIe
Port Protocol Error handling.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
---
 include/linux/pci.h | 10 ++++++++++
 1 file changed, 10 insertions(+)
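
For illustration only, here is a minimal userspace sketch of how a driver
might populate the new pci_driver::cxl_err_handler pointer. It is not part
of this patch. The reduced 'struct pci_driver', every demo_* name and the
demo_dispatch() helper are hypothetical stand-ins. This patch does not
define what the bool returned by error_detected() means, so treating 'true'
as "the caller should not attempt recovery" is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

struct pci_dev;	/* opaque stand-in for the kernel's struct pci_dev */

/* Mirrors the structure added to include/linux/pci.h by this patch */
struct cxl_error_handlers {
	/* CXL bus error detected on this device */
	bool (*error_detected)(struct pci_dev *dev);

	/* Allow device driver to record more details of a correctable error */
	void (*cor_error_detected)(struct pci_dev *dev);
};

/* Reduced stand-in for struct pci_driver: only the fields used here */
struct pci_driver {
	const char *name;
	const struct cxl_error_handlers *cxl_err_handler;
};

/* Hypothetical per-driver counters, used only to observe the callbacks */
static int demo_cor_count;
static int demo_uce_count;

static bool demo_cxl_error_detected(struct pci_dev *dev)
{
	(void)dev;
	demo_uce_count++;
	/* Assumption: true tells the caller not to attempt recovery */
	return true;
}

static void demo_cxl_cor_error_detected(struct pci_dev *dev)
{
	(void)dev;
	demo_cor_count++;
}

static const struct cxl_error_handlers demo_cxl_err_handlers = {
	.error_detected		= demo_cxl_error_detected,
	.cor_error_detected	= demo_cxl_cor_error_detected,
};

static struct pci_driver demo_driver = {
	.name			= "demo_cxl",
	.cxl_err_handler	= &demo_cxl_err_handlers,
};

/*
 * Hypothetical dispatch helper. Calls the matching callback if the driver
 * provides one and returns the uncorrectable handler's result, or false
 * when no handler is registered or the error is correctable.
 */
static bool demo_dispatch(const struct pci_driver *drv, struct pci_dev *dev,
			  bool correctable)
{
	const struct cxl_error_handlers *h = drv ? drv->cxl_err_handler : NULL;

	if (!h)
		return false;

	if (correctable) {
		if (h->cor_error_detected)
			h->cor_error_detected(dev);
		return false;
	}

	return h->error_detected ? h->error_detected(dev) : false;
}
```

Keeping the CXL callbacks in their own structure, instead of extending
'struct pci_error_handlers', lets a caller choose the CXL.io path without
entering the PCIe recovery sequence. Both callbacks are checked for NULL
in the sketch so that a driver can supply only one of them.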

Comments

Ira Weiny Jan. 13, 2025, 11:45 p.m. UTC | #1
Terry Bowman wrote:
> CXL.io provides protocol error handling on top of PCIe Protocol Error
> handling. But, CXL.io and PCIe have different handling requirements
> for uncorrectable errors (UCE).
> 
> The PCIe AER service driver may attempt recovering PCIe devices with
> UCE while recovery is not used for CXL.io. Recovery is not used in the
> CXL.io case because of potential corruption on what can be system memory.
> 
> Create pci_driver::cxl_err_handlers structure similar to
> pci_driver::error_handler. Create handlers for correctable and
> uncorrectable CXL.io error handling.
> 
> The CXL error handlers will be used in future patches adding CXL PCIe
> Port Protocol Error handling.
> 
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Fan Ni <fan.ni@samsung.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

[snip]

Patch

diff --git a/include/linux/pci.h b/include/linux/pci.h
index db9b47ce3eef..e2e36f11205c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -882,6 +882,14 @@  struct pci_error_handlers {
 	void (*cor_error_detected)(struct pci_dev *dev);
 };
 
+/* Compute Express Link (CXL) bus error event callbacks */
+struct cxl_error_handlers {
+	/* CXL bus error detected on this device */
+	bool (*error_detected)(struct pci_dev *dev);
+
+	/* Allow device driver to record more details of a correctable error */
+	void (*cor_error_detected)(struct pci_dev *dev);
+};
 
 struct module;
 
@@ -927,6 +935,7 @@  struct module;
  * @sriov_get_vf_total_msix: PF driver callback to get the total number of
  *              MSI-X vectors available for distribution to the VFs.
  * @err_handler: See Documentation/PCI/pci-error-recovery.rst
+ * @cxl_err_handler: Compute Express Link specific error handlers.
  * @groups:	Sysfs attribute groups.
  * @dev_groups: Attributes attached to the device that will be
  *              created once it is bound to the driver.
@@ -952,6 +961,7 @@  struct pci_driver {
 	int  (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */
 	u32  (*sriov_get_vf_total_msix)(struct pci_dev *pf);
 	const struct pci_error_handlers *err_handler;
+	const struct cxl_error_handlers *cxl_err_handler;
 	const struct attribute_group **groups;
 	const struct attribute_group **dev_groups;
 	struct device_driver	driver;