[v12,3/3] EINJ, Documentation: Update EINJ kernel doc

Message ID 20240214200709.777166-4-Benjamin.Cheatham@amd.com
State Superseded
Series cxl, EINJ: Update EINJ for CXL error types

Commit Message

Ben Cheatham Feb. 14, 2024, 8:07 p.m. UTC
Update EINJ kernel document to include how to inject CXL protocol error
types, build the kernel to include CXL error types, and give an example
injection.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
---
 .../firmware-guide/acpi/apei/einj.rst         | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Davidlohr Bueso Feb. 20, 2024, 7:02 p.m. UTC | #1
On Wed, 14 Feb 2024, Ben Cheatham wrote:

>Update EINJ kernel document to include how to inject CXL protocol error
>types, build the kernel to include CXL error types, and give an example
>injection.
>
>Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>

Would vote for folding into 2/3, but otherwise looks good with a minor
suggestion.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

>---
> .../firmware-guide/acpi/apei/einj.rst         | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
>diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>index d6b61d22f525..f179adf7b61c 100644
>--- a/Documentation/firmware-guide/acpi/apei/einj.rst
>+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>@@ -181,6 +181,25 @@ You should see something like this in dmesg::
>   [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
>   [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
>
>+CXL error types are supported from ACPI 6.5 onwards. These error types
						     ^ and target a CXL Port

>+are not available in the legacy interface at /sys/kernel/debug/apei/einj,
>+and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
>+called "einj_type" that is analogous to available_error_type under debug/apei/einj.
>+There is also an "einj_inject" file in each $dport_dev directory under debug/cxl
>+that will inject a given error into the dport represented by $dport_dev.
>+For example, to inject a CXL.mem protocol correctable error into
>+$dport_dev=0000:e0:01.1::
>+
>+    # cd /sys/kernel/debug/cxl/
>+    # cat einj_type                 # See which error can be injected
>+	0x00008000  CXL.mem Protocol Correctable
>+	0x00010000  CXL.mem Protocol Uncorrectable non-fatal
>+	0x00020000  CXL.mem Protocol Uncorrectable fatal
>+    # cd 0000:e0:01.1               # Navigate to dport to inject into
>+    # echo 0x8000 > einj_inject     # Inject error
>+
>+To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
>+
> Special notes for injection into SGX enclaves:
>
> There may be a separate BIOS setup option to enable SGX injection.
>--
>2.34.1
>
Ben Cheatham Feb. 20, 2024, 7:59 p.m. UTC | #2
Thanks for taking a look David!

On 2/20/24 1:02 PM, Davidlohr Bueso wrote:
> On Wed, 14 Feb 2024, Ben Cheatham wrote:
> 
>> Update EINJ kernel document to include how to inject CXL protocol error
>> types, build the kernel to include CXL error types, and give an example
>> injection.
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com>
> 
> Would vote for folding into 2/3, but otherwise looks good with a minor
> suggestion.
> 

I would, but I think 2/3 is already pretty large and this is more digestible to me. I've also reworked a large portion of
that patch for v13 so it's probably better to keep it smaller.

> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
> 
>> ---
>> .../firmware-guide/acpi/apei/einj.rst         | 19 +++++++++++++++++++
>> 1 file changed, 19 insertions(+)
>>
>> diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
>> index d6b61d22f525..f179adf7b61c 100644
>> --- a/Documentation/firmware-guide/acpi/apei/einj.rst
>> +++ b/Documentation/firmware-guide/acpi/apei/einj.rst
>> @@ -181,6 +181,25 @@ You should see something like this in dmesg::
>>   [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
>>   [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
>>
>> +CXL error types are supported from ACPI 6.5 onwards. These error types
>                              ^ and target a CXL Port
> 

Will add.

Thanks,
Ben

>> +are not available in the legacy interface at /sys/kernel/debug/apei/einj,
>> +and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
>> +called "einj_type" that is analogous to available_error_type under debug/apei/einj.
>> +There is also an "einj_inject" file in each $dport_dev directory under debug/cxl
>> +that will inject a given error into the dport represented by $dport_dev.
>> +For example, to inject a CXL.mem protocol correctable error into
>> +$dport_dev=0000:e0:01.1::
>> +
>> +    # cd /sys/kernel/debug/cxl/
>> +    # cat einj_type                 # See which error can be injected
>> +    0x00008000  CXL.mem Protocol Correctable
>> +    0x00010000  CXL.mem Protocol Uncorrectable non-fatal
>> +    0x00020000  CXL.mem Protocol Uncorrectable fatal
>> +    # cd 0000:e0:01.1               # Navigate to dport to inject into
>> +    # echo 0x8000 > einj_inject     # Inject error
>> +
>> +To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
>> +
>> Special notes for injection into SGX enclaves:
>>
>> There may be a separate BIOS setup option to enable SGX injection.
>> -- 
>> 2.34.1
>>

Patch

diff --git a/Documentation/firmware-guide/acpi/apei/einj.rst b/Documentation/firmware-guide/acpi/apei/einj.rst
index d6b61d22f525..f179adf7b61c 100644
--- a/Documentation/firmware-guide/acpi/apei/einj.rst
+++ b/Documentation/firmware-guide/acpi/apei/einj.rst
@@ -181,6 +181,25 @@  You should see something like this in dmesg::
   [22715.834759] EDAC sbridge MC3: PROCESSOR 0:306e7 TIME 1422553404 SOCKET 0 APIC 0
   [22716.616173] EDAC MC3: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x12345 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
 
+CXL error types are supported from ACPI 6.5 onwards. These error types
+are not available in the legacy interface at /sys/kernel/debug/apei/einj,
+and are instead at /sys/kernel/debug/cxl/. There is a file under debug/cxl
+called "einj_type" that is analogous to available_error_type under debug/apei/einj.
+There is also an "einj_inject" file in each $dport_dev directory under debug/cxl
+that will inject a given error into the dport represented by $dport_dev.
+For example, to inject a CXL.mem protocol correctable error into
+$dport_dev=0000:e0:01.1::
+
+    # cd /sys/kernel/debug/cxl/
+    # cat einj_type                 # See which error can be injected
+    0x00008000  CXL.mem Protocol Correctable
+    0x00010000  CXL.mem Protocol Uncorrectable non-fatal
+    0x00020000  CXL.mem Protocol Uncorrectable fatal
+    # cd 0000:e0:01.1               # Navigate to dport to inject into
+    # echo 0x8000 > einj_inject     # Inject error
+
+To use CXL error types, ``CONFIG_CXL_EINJ`` will need to be enabled.
+
 Special notes for injection into SGX enclaves:
 
 There may be a separate BIOS setup option to enable SGX injection.
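
For readers who want to script the sequence documented in the hunk above, here
is a minimal sketch in POSIX shell. It assumes the v12 file names from this
patch (einj_type, and an einj_inject file in each dport directory under
/sys/kernel/debug/cxl), debugfs mounted at /sys/kernel/debug, and a root shell.
The CXL_DEBUGFS variable and the cxl_einj_inject function are hypothetical
names introduced for illustration; they are not part of the patch or the kernel.

```shell
#!/bin/sh
# Sketch only. CXL_DEBUGFS and cxl_einj_inject are hypothetical names; the
# debugfs layout is the one described in this v12 patch and may differ in
# other versions of the series.
CXL_DEBUGFS="${CXL_DEBUGFS:-/sys/kernel/debug/cxl}"

# cxl_einj_inject DPORT TYPE
#   DPORT: dport directory name under $CXL_DEBUGFS, e.g. 0000:e0:01.1
#   TYPE:  error type value, e.g. 0x8000
cxl_einj_inject() {
    dport="$1"
    etype="$2"

    # einj_type prints values as 0x%08x in the patch's example, so normalize
    # the requested value to that form before looking it up.
    want=$(printf '0x%08x' "$etype") || return 2

    # Refuse to write a type that einj_type does not list.
    if ! grep -q "^[[:space:]]*${want}[[:space:]]" "$CXL_DEBUGFS/einj_type"; then
        echo "error type $want not listed in $CXL_DEBUGFS/einj_type" >&2
        return 1
    fi

    # Same write as "echo 0x8000 > einj_inject" in the documented example.
    echo "$etype" > "$CXL_DEBUGFS/$dport/einj_inject"
}
```

Sourced into a root shell, `cxl_einj_inject 0000:e0:01.1 0x8000` would perform
the same injection as the documented example. Checking einj_type first means an
unlisted type is rejected before anything is written to einj_inject.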