
[RFC,5/5] cxl/core: add poison injection event handler

Message ID 20240209115417.724638-8-ruansy.fnst@fujitsu.com
State New, archived
Series [RFC,1/5] cxl/core: correct length of DPA field masks

Commit Message

Shiyang Ruan Feb. 9, 2024, 11:54 a.m. UTC
Currently the driver only traces CXL events, so poison injection on a CXL
memdev is silent.  The OS needs to be notified so that it can handle the
poisoned range in time.  Per the CXL spec, a device error event can be
signaled through either the FW-First or the OS-First method.

So, add a poison event handler for the OS-First method:
  - qemu:
    - the CXL device reports a POISON event to the OS via MSI by sending
      a GMER after injecting a poison record
  - CXL driver
    a. read the POISON event through GMER;   <-- this patch
    b. get POISON list;
    c. translate DPA to HPA;
    d. construct an mce instance, then call mce_log() to queue this mce
       instance;

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c | 42 ++++++++++++++++++++++++++++-------------
 drivers/cxl/cxlmem.h    |  8 ++++----
 drivers/cxl/pci.c       |  4 ++--
 3 files changed, 35 insertions(+), 19 deletions(-)

Comments

Dan Williams Feb. 10, 2024, 6:54 a.m. UTC | #1
Shiyang Ruan wrote:
> Currently driver only trace cxl events, poison injection on cxl memdev
> is silent.  OS needs to be notified then it could handle poison range
> in time.  Per CXL spec, the device error event could be signaled through
> FW-First and OS-First methods.
> 
> So, add poison event handler in OS-First method:
>   - qemu:
>     - CXL device report POISON event to OS by MSI by sending GMER after
>       injecting a poison record

QEMU details do not belong in a kernel changelog. It is ok for an RFC,
but my hope is that this can be tested on hardware after being proven on
QEMU.

>   - CXL driver
>     a. read the POISON event through GMER;   <-- this patch
>     b. get POISON list;
>     c. translate DPA to HPA;
>     d. construct a mce instance, then call mce_log() to queue this mce
>        instance;

It is not clear to me why the kernel should proactively fire machine
check notifications on injection. The changelog needs to make clear why
the kernel should do this, and the consequences of not doing it.

For CPU consumed poison the machine check event will already fire. For
background discovery of poison, that should translate to a
memory_failure() notification with the MF_ACTION_REQUIRED flag cleared.
Userspace, like rasdaemon, can then make a page offline decision.
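
As an illustration of the distinction drawn above, here is a minimal
userspace sketch of the flag choice. It is a model under stated
assumptions, not kernel code: enum poison_source, mf_flags_for(),
driver_should_notify() and the MODEL_MF_ACTION_REQUIRED value are
hypothetical stand-ins. In the kernel the real flag is
MF_ACTION_REQUIRED and the notification goes through memory_failure().

```c
#include <stdbool.h>

/* Stand-in for the kernel's MF_ACTION_REQUIRED; the value is illustrative. */
#define MODEL_MF_ACTION_REQUIRED (1 << 1)

/* Hypothetical classification of how the poison was discovered. */
enum poison_source {
	POISON_CPU_CONSUMED,	/* a CPU access hit the poison; a machine check fires */
	POISON_BACKGROUND,	/* found via an event record or the poison list */
};

/* Flags a handler would pass along with the affected pfn. */
static int mf_flags_for(enum poison_source src)
{
	return src == POISON_CPU_CONSUMED ? MODEL_MF_ACTION_REQUIRED : 0;
}

/* Only background discovery needs the driver to raise a notification itself. */
static bool driver_should_notify(enum poison_source src)
{
	return src == POISON_BACKGROUND;
}
```

With the flag cleared, nothing has to be killed immediately, which
leaves the page offline decision to userspace.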
Jonathan Cameron Feb. 13, 2024, 4:51 p.m. UTC | #2
> +
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt)
> +{
> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>  		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> +		/* handle poison event */
> +		if (type == CXL_EVENT_TYPE_FAIL)
> +			cxl_event_handle_poison(cxlmd, &evt->gen_media); 

I'm not 100% convinced this is necessarily poison causing.  Also
the text tells us we should see 'an appropriate event'.
The DRAM one seems likely to be chosen by some vendors.

The fatal check maybe makes it a little more likely (maybe though
I'm not sure anything says a device must log it to the failure log)
but it might be Memory Event Type 1, which is the host tried to
access an invalid address.  Sure poison might be returned to that
error but what would the main kernel memory handling do with it?
Something is very wrong but it's not corrupted device memory.  TE state
violations are in there as well. Sure poison is returned on reads (I
think - haven't checked).

If the aim here is to say 'maybe there is poison, better check the
poison list', then that is reasonable, but we should ensure things
like timer expiry are definitely ruled out and rename the function
to make it clear it might not find poison.

Jonathan
Shiyang Ruan March 15, 2024, 2:29 a.m. UTC | #3
On 2024/2/14 0:51, Jonathan Cameron wrote:
> 
>> +
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt)
>> +{
>> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
>> -	else if (event_type == CXL_CPER_EVENT_DRAM)
>> +		/* handle poison event */
>> +		if (type == CXL_EVENT_TYPE_FAIL)
>> +			cxl_event_handle_poison(cxlmd, &evt->gen_media);
> 
> I'm not 100% convinced this is necessarily poison causing.  Also
> the text tells us we should see 'an appropriate event'.
> DRAM one seems likely to be chosen by some vendors.

I think it's right to use the DRAM Event Record for a volatile memdev,
but should poison on a persistent memdev use the DRAM Event Record too?
Its 'Physical Address' field has the 'Volatile' bit as well, the same as
in the General Media Event Record.  I am a bit confused about this.
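
For reference, a minimal userspace sketch of how the 'Physical Address'
field can be split into an address and the 'Volatile' flag. The names
(MODEL_DPA_FLAGS_MASK, MODEL_DPA_VOLATILE, decode_phys_addr()) are
hypothetical, and the layout is an assumption: flags in the low six
bits, with bit 0 as 'Volatile'. The sketch also assumes the field has
already been converted from little-endian to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the low six bits carry flags, bit 0 is 'Volatile'. */
#define MODEL_DPA_FLAGS_MASK	UINT64_C(0x3f)
#define MODEL_DPA_VOLATILE	UINT64_C(0x01)

struct decoded_dpa {
	uint64_t dpa;		/* address with the flag bits cleared */
	bool is_volatile;	/* true if the 'Volatile' bit was set */
};

/* Split a host-endian 'Physical Address' field into address and flag. */
static struct decoded_dpa decode_phys_addr(uint64_t field)
{
	struct decoded_dpa d = {
		.dpa = field & ~MODEL_DPA_FLAGS_MASK,
		.is_volatile = (field & MODEL_DPA_VOLATILE) != 0,
	};
	return d;
}
```

Because the same decoding applies to both record types, the 'Volatile'
bit on its own does not say which record type a device should use.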

> 
> The fatal check maybe makes it a little more likely (maybe though
> I'm not sure anything says a device must log it to the failure log)
> but it might be Memory Event Type 1, which is the host tried to
> access an invalid address.  Sure poison might be returned to that
> error but what would the main kernel memory handling do with it?
> Something is very wrong
> but it's not corrupted device memory.  TE state violations are in there
> as well. Sure poison is returned on reads (I think - haven't checked).
> 
> IF the aim here is to say 'maybe there is poison, better check the
> poison list'. Then that is reasonable but we should ensure things
> like timer expiry are definitely ruled out and rename the function
> to make it clear it might not find poison.

I forgot to distinguish the 'Transaction Type' here. Host Inject Poison
is 04h. Other types should also have their own handling methods.
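
A minimal userspace sketch of such a check, assuming a trimmed-down
record. struct gen_media_model, MODEL_TRANS_HOST_INJECT_POISON and
record_is_injected_poison() are hypothetical names for illustration;
only the 04h value comes from the discussion above.

```c
#include <stdbool.h>
#include <stdint.h>

/* 'Host Inject Poison' transaction type, 04h as stated above. */
#define MODEL_TRANS_HOST_INJECT_POISON	0x04

/* Hypothetical, trimmed-down view of a General Media Event Record. */
struct gen_media_model {
	uint64_t phys_addr;		/* host-endian 'Physical Address' field */
	uint8_t transaction_type;	/* 'Transaction Type' field */
};

/*
 * Return true only for records that report host-injected poison.
 * The severity of the log the record arrived in is deliberately
 * not consulted here.
 */
static bool record_is_injected_poison(const struct gen_media_model *rec)
{
	return rec->transaction_type == MODEL_TRANS_HOST_INJECT_POISON;
}
```

A caller would then query the poison list only when this returns true,
and route other transaction types to their own handlers.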


--
Thanks,
Ruan.

> 
> Jonathan
Jonathan Cameron April 5, 2024, 5:35 p.m. UTC | #4
On Fri, 15 Mar 2024 10:29:07 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> On 2024/2/14 0:51, Jonathan Cameron wrote:
> >   
> >> +
> >> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> >> +			     enum cxl_event_log_type type,
> >> +			     enum cxl_event_type event_type,
> >> +			     const uuid_t *uuid, union cxl_event *evt)
> >> +{
> >> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
> >>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> >> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> >> +		/* handle poison event */
> >> +		if (type == CXL_EVENT_TYPE_FAIL)
> >> +			cxl_event_handle_poison(cxlmd, &evt->gen_media);  
> > 
> > I'm not 100% convinced this is necessarily poison causing.  Also
> > the text tells us we should see 'an appropriate event'.
> > DRAM one seems likely to be chosen by some vendors.  
> 
> I think it's right to use DRAM Event Record for volatile-memdev, but 
> should poison on a persistent-memdev also use DRAM Event Record too? 
> Though its 'Physical Address' field has the 'Volatile' bit too, which is 
> same as General Media Event Record.  I am a bit confused about this.

That is indeed 'novel' in a DRAM device, but maybe it could be battery
backed and have a path to, say, a flash device that isn't visible to CXL
and from which the DRAM is refilled on power restore?

Anyhow, it doesn't make sense for persistent memory that doesn't
correspond to all the other stuff in the DRAM event.
> 
> > 
> > The fatal check maybe makes it a little more likely (maybe though
> > I'm not sure anything says a device must log it to the failure log)
> > but it might be Memory Event Type 1, which is the host tried to
> > access an invalid address.  Sure poison might be returned to that
> > error but what would the main kernel memory handling do with it?
> > Something is very wrong
> > but it's not corrupted device memory.  TE state violations are in there
> > as well. Sure poison is returned on reads (I think - haven't checked).
> > 
> > IF the aim here is to say 'maybe there is poison, better check the
> > poison list'. Then that is reasonable but we should ensure things
> > like timer expiry are definitely ruled out and rename the function
> > to make it clear it might not find poison.  
> 
> I forgot to distinguish the 'Transaction Type' here. Host Inject Poison 
> is 0x04h. And other types should also have their specific handle method.

Yes. If you can use the transaction type, that solves this issue, I think.
> 
> 
> --
> Thanks,
> Ruan.
> 
> > 
> > Jonathan

Patch

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index e1c67159acc4..fa65a98ada16 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -838,25 +838,41 @@  int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt)
+static void cxl_event_handle_poison(struct cxl_memdev *cxlmd,
+				    struct cxl_event_gen_media *rec)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	u64 phys_addr = rec->phys_addr & CXL_DPA_MASK, len;
+
+	if (rec->phys_addr & CXL_DPA_VOLATILE)
+		len = resource_size(&cxlmd->cxlds->ram_res) - phys_addr;
+	else
+		len = resource_size(&cxlmd->cxlds->dpa_res) - phys_addr;
+
+	cxl_mem_get_poison(cxlmd, phys_addr, len, NULL, true);
+}
+
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt)
+{
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
 		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+		/* handle poison event */
+		if (type == CXL_EVENT_TYPE_FAIL)
+			cxl_event_handle_poison(cxlmd, &evt->gen_media);
+	} else if (event_type == CXL_CPER_EVENT_DRAM)
 		trace_cxl_dram(cxlmd, type, &evt->dram);
 	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
 	else
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
 }
-EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
 
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-				     enum cxl_event_log_type type,
-				     struct cxl_event_record_raw *record)
+static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
+				      enum cxl_event_log_type type,
+				      struct cxl_event_record_raw *record)
 {
 	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
 	const uuid_t *uuid = &record->id;
@@ -868,7 +884,7 @@  static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 
-	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
+	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
 }
 
 static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -979,8 +995,8 @@  static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			__cxl_event_trace_record(cxlmd, type,
-						 &payload->records[i]);
+			__cxl_event_handle_record(cxlmd, type,
+						  &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f0877f055f53..1e9e3b9c11d1 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -824,10 +824,10 @@  void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt);
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 233e7c42c161..29a5e641decd 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1003,8 +1003,8 @@  static void cxl_cper_event_call(enum cxl_event_type ev_type,
 	hdr_flags = get_unaligned_le24(rec->event.generic.hdr.flags);
 	log_type = FIELD_GET(CXL_EVENT_HDR_FLAGS_REC_SEVERITY, hdr_flags);
 
-	cxl_event_trace_record(cxlds->cxlmd, log_type, ev_type,
-			       &uuid_null, &rec->event);
+	cxl_event_handle_record(cxlds->cxlmd, log_type, ev_type,
+				&uuid_null, &rec->event);
 }
 
 static int __init cxl_pci_driver_init(void)