diff mbox series

[v13,3/4] cxl/core: Add CXL EINJ debugfs files

Message ID 20240220221146.399209-4-Benjamin.Cheatham@amd.com (mailing list archive)
State Handled Elsewhere, archived
Headers show
Series cxl, EINJ: Update EINJ for CXL error types | expand

Commit Message

Ben Cheatham Feb. 20, 2024, 10:11 p.m. UTC
Export CXL helper functions in the einj_cxl module for getting/injecting
available CXL protocol error types to sysfs under kernel/debug/cxl.

The kernel/debug/cxl/einj_types file will print the available CXL
protocol errors in the same format as the available_error_types
file provided by the einj_cxl module. The
kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the
error_type and error_inject files provided by the EINJ module, i.e.:
writing an error type into $dport_dev/einj_inject will inject said error
type into the CXL dport represented by $dport_dev.

Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
---
 Documentation/ABI/testing/debugfs-cxl | 30 ++++++++++++++++++++
 drivers/cxl/Kconfig                   |  1 +
 drivers/cxl/core/port.c               | 41 +++++++++++++++++++++++++++
 3 files changed, 72 insertions(+)

Comments

Dan Williams Feb. 21, 2024, 5:48 p.m. UTC | #1
Ben Cheatham wrote:
> Export CXL helper functions in the einj_cxl module for getting/injecting
> available CXL protocol error types to sysfs under kernel/debug/cxl.
> 
> The kernel/debug/cxl/einj_types file will print the available CXL
> protocol errors in the same format as the available_error_types
> file provided by the einj_cxl module. The
> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the
> error_type and error_inject files provided by the EINJ module, i.e.:
> writing an error type into $dport_dev/einj_inject will inject said error
> type into the CXL dport represented by $dport_dev.
> 
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
> ---
>  Documentation/ABI/testing/debugfs-cxl | 30 ++++++++++++++++++++
>  drivers/cxl/Kconfig                   |  1 +
>  drivers/cxl/core/port.c               | 41 +++++++++++++++++++++++++++
>  3 files changed, 72 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
> index fe61d372e3fa..4c0f62f881ca 100644
> --- a/Documentation/ABI/testing/debugfs-cxl
> +++ b/Documentation/ABI/testing/debugfs-cxl
> @@ -33,3 +33,33 @@ Description:
>  		device cannot clear poison from the address, -ENXIO is returned.
>  		The clear_poison attribute is only visible for devices
>  		supporting the capability.
> +
> +What:		/sys/kernel/debug/cxl/einj_types
> +Date:		January, 2024
> +KernelVersion:	v6.9
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RO) Prints the CXL protocol error types made available by
> +		the platform in the format "0x<error number>	<error type>".
> +		The possible error types are (as of ACPI v6.5):
> +			0x1000	CXL.cache Protocol Correctable
> +			0x2000	CXL.cache Protocol Uncorrectable non-fatal
> +			0x4000	CXL.cache Protocol Uncorrectable fatal
> +			0x8000	CXL.mem Protocol Correctable
> +			0x10000	CXL.mem Protocol Uncorrectable non-fatal
> +			0x20000	CXL.mem Protocol Uncorrectable fatal
> +
> +		The <error number> can be written to einj_inject to inject
> +		<error type> into a chosen dport.
> +
> +What:		/sys/kernel/debug/cxl/$dport_dev/einj_inject
> +Date:		January, 2024
> +KernelVersion:	v6.9
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(WO) Writing an integer to this file injects the corresponding
> +		CXL protocol error into $dport_dev ($dport_dev will be a device
> +		name from /sys/bus/pci/devices). The integer to type mapping for
> +		injection can be found by reading from einj_types. If the dport
> +		was enumerated in RCH mode, a CXL 1.1 error is injected, otherwise
> +		a CXL 2.0 error is injected.
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 67998dbd1d46..c86ae4c65c03 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -2,6 +2,7 @@
>  menuconfig CXL_BUS
>  	tristate "CXL (Compute Express Link) Devices Support"
>  	depends on PCI
> +	depends on ACPI_APEI_EINJ_CXL || !ACPI_APEI_EINJ_CXL

This statement is always true "x || !x"

I mentioned in the other patch that ACPI_APEI_EINJ_CXL needs to have a
dependency on CXL_BUS.
Ben Cheatham Feb. 21, 2024, 8:27 p.m. UTC | #2
On 2/21/24 11:48 AM, Dan Williams wrote:
> Ben Cheatham wrote:
>> Export CXL helper functions in the einj_cxl module for getting/injecting
>> available CXL protocol error types to sysfs under kernel/debug/cxl.
>>
>> The kernel/debug/cxl/einj_types file will print the available CXL
>> protocol errors in the same format as the available_error_types
>> file provided by the einj_cxl module. The
>> kernel/debug/cxl/$dport_dev/einj_inject is functionally the same as the
>> error_type and error_inject files provided by the EINJ module, i.e.:
>> writing an error type into $dport_dev/einj_inject will inject said error
>> type into the CXL dport represented by $dport_dev.
>>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
>> ---
>>  Documentation/ABI/testing/debugfs-cxl | 30 ++++++++++++++++++++
>>  drivers/cxl/Kconfig                   |  1 +
>>  drivers/cxl/core/port.c               | 41 +++++++++++++++++++++++++++
>>  3 files changed, 72 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
>> index fe61d372e3fa..4c0f62f881ca 100644
>> --- a/Documentation/ABI/testing/debugfs-cxl
>> +++ b/Documentation/ABI/testing/debugfs-cxl
>> @@ -33,3 +33,33 @@ Description:
>>  		device cannot clear poison from the address, -ENXIO is returned.
>>  		The clear_poison attribute is only visible for devices
>>  		supporting the capability.
>> +
>> +What:		/sys/kernel/debug/cxl/einj_types
>> +Date:		January, 2024
>> +KernelVersion:	v6.9
>> +Contact:	linux-cxl@vger.kernel.org
>> +Description:
>> +		(RO) Prints the CXL protocol error types made available by
>> +		the platform in the format "0x<error number>	<error type>".
>> +		The possible error types are (as of ACPI v6.5):
>> +			0x1000	CXL.cache Protocol Correctable
>> +			0x2000	CXL.cache Protocol Uncorrectable non-fatal
>> +			0x4000	CXL.cache Protocol Uncorrectable fatal
>> +			0x8000	CXL.mem Protocol Correctable
>> +			0x10000	CXL.mem Protocol Uncorrectable non-fatal
>> +			0x20000	CXL.mem Protocol Uncorrectable fatal
>> +
>> +		The <error number> can be written to einj_inject to inject
>> +		<error type> into a chosen dport.
>> +
>> +What:		/sys/kernel/debug/cxl/$dport_dev/einj_inject
>> +Date:		January, 2024
>> +KernelVersion:	v6.9
>> +Contact:	linux-cxl@vger.kernel.org
>> +Description:
>> +		(WO) Writing an integer to this file injects the corresponding
>> +		CXL protocol error into $dport_dev ($dport_dev will be a device
>> +		name from /sys/bus/pci/devices). The integer to type mapping for
>> +		injection can be found by reading from einj_types. If the dport
>> +		was enumerated in RCH mode, a CXL 1.1 error is injected, otherwise
>> +		a CXL 2.0 error is injected.
>> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
>> index 67998dbd1d46..c86ae4c65c03 100644
>> --- a/drivers/cxl/Kconfig
>> +++ b/drivers/cxl/Kconfig
>> @@ -2,6 +2,7 @@
>>  menuconfig CXL_BUS
>>  	tristate "CXL (Compute Express Link) Devices Support"
>>  	depends on PCI
>> +	depends on ACPI_APEI_EINJ_CXL || !ACPI_APEI_EINJ_CXL
> 
> This statement is always true "x || !x"

Yeah, I lifted this from the Kconfig documentation (https://www.kernel.org/doc/html/next/kbuild/kconfig-language.html#kconfig-hints).
It allows CXL_BUS to built if ACPI_APEI_EINJ_CXL=n, but restricts CXL_BUS >= ACPI_APEI_EINJ_CXL otherwise.
Now that I think about it, it seems like this just flipped the dependency I had in v12 the other way.

> 
> I mentioned in the other patch that ACPI_APEI_EINJ_CXL needs to have a
> dependency on CXL_BUS.

Yep I agree, I was just having issues and was able to get it working this way.
I'll change this back to what it was.

Thanks,
Ben
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
index fe61d372e3fa..4c0f62f881ca 100644
--- a/Documentation/ABI/testing/debugfs-cxl
+++ b/Documentation/ABI/testing/debugfs-cxl
@@ -33,3 +33,33 @@  Description:
 		device cannot clear poison from the address, -ENXIO is returned.
 		The clear_poison attribute is only visible for devices
 		supporting the capability.
+
+What:		/sys/kernel/debug/cxl/einj_types
+Date:		January, 2024
+KernelVersion:	v6.9
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) Prints the CXL protocol error types made available by
+		the platform in the format "0x<error number>	<error type>".
+		The possible error types are (as of ACPI v6.5):
+			0x1000	CXL.cache Protocol Correctable
+			0x2000	CXL.cache Protocol Uncorrectable non-fatal
+			0x4000	CXL.cache Protocol Uncorrectable fatal
+			0x8000	CXL.mem Protocol Correctable
+			0x10000	CXL.mem Protocol Uncorrectable non-fatal
+			0x20000	CXL.mem Protocol Uncorrectable fatal
+
+		The <error number> can be written to einj_inject to inject
+		<error type> into a chosen dport.
+
+What:		/sys/kernel/debug/cxl/$dport_dev/einj_inject
+Date:		January, 2024
+KernelVersion:	v6.9
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) Writing an integer to this file injects the corresponding
+		CXL protocol error into $dport_dev ($dport_dev will be a device
+		name from /sys/bus/pci/devices). The integer to type mapping for
+		injection can be found by reading from einj_types. If the dport
+		was enumerated in RCH mode, a CXL 1.1 error is injected, otherwise
+		a CXL 2.0 error is injected.
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
index 67998dbd1d46..c86ae4c65c03 100644
--- a/drivers/cxl/Kconfig
+++ b/drivers/cxl/Kconfig
@@ -2,6 +2,7 @@ 
 menuconfig CXL_BUS
 	tristate "CXL (Compute Express Link) Devices Support"
 	depends on PCI
+	depends on ACPI_APEI_EINJ_CXL || !ACPI_APEI_EINJ_CXL
 	select FW_LOADER
 	select FW_UPLOAD
 	select PCI_DOE
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index e59d9d37aa65..3b579eef4d23 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -3,6 +3,7 @@ 
 #include <linux/platform_device.h>
 #include <linux/memregion.h>
 #include <linux/workqueue.h>
+#include <linux/einj-cxl.h>
 #include <linux/debugfs.h>
 #include <linux/device.h>
 #include <linux/module.h>
@@ -793,6 +794,39 @@  static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport,
 	return rc;
 }
 
+DEFINE_SHOW_ATTRIBUTE(einj_cxl_available_error_type);
+
+static int cxl_einj_inject(void *data, u64 type)
+{
+	struct cxl_dport *dport = data;
+
+	if (dport->rch)
+		return einj_cxl_inject_rch_error(dport->rcrb.base, type);
+
+	return einj_cxl_inject_error(to_pci_dev(dport->dport_dev), type);
+}
+DEFINE_DEBUGFS_ATTRIBUTE(cxl_einj_inject_fops, NULL, cxl_einj_inject, "%llx\n");
+
+static void cxl_debugfs_create_dport_dir(struct cxl_dport *dport)
+{
+	struct dentry *dir;
+
+	if (!einj_cxl_is_initialized())
+		return;
+
+	/*
+	 * dport_dev needs to be a PCIe port for CXL 2.0+ ports because
+	 * EINJ expects a dport SBDF to be specified for 2.0 error injection.
+	 */
+	if (!dport->rch && !dev_is_pci(dport->dport_dev))
+		return;
+
+	dir = cxl_debugfs_create_dir(dev_name(dport->dport_dev));
+
+	debugfs_create_file("einj_inject", 0200, dir, dport,
+			    &cxl_einj_inject_fops);
+}
+
 static struct cxl_port *__devm_cxl_add_port(struct device *host,
 					    struct device *uport_dev,
 					    resource_size_t component_reg_phys,
@@ -1149,6 +1183,8 @@  __devm_cxl_add_dport(struct cxl_port *port, struct device *dport_dev,
 	if (dev_is_pci(dport_dev))
 		dport->link_latency = cxl_pci_get_latency(to_pci_dev(dport_dev));
 
+	cxl_debugfs_create_dport_dir(dport);
+
 	return dport;
 }
 
@@ -2221,6 +2257,11 @@  static __init int cxl_core_init(void)
 
 	cxl_debugfs = debugfs_create_dir("cxl", NULL);
 
+	if (einj_cxl_is_initialized()) {
+		debugfs_create_file("einj_types", 0400, cxl_debugfs, NULL,
+				    &einj_cxl_available_error_type_fops);
+	}
+
 	cxl_mbox_init();
 
 	rc = cxl_memdev_init();