diff mbox series

[RFC,v2,3/6] cxl/core: add report option for cxl_mem_get_poison()

Message ID 20240329063614.362763-4-ruansy.fnst@fujitsu.com
State New
Headers show
Series cxl: add poison event handler | expand

Commit Message

Shiyang Ruan March 29, 2024, 6:36 a.m. UTC
The GMER only has "Physical Address" field, no such one indicates length.
So, when a poison event is received, we could use GET_POISON_LIST command
to get the poison list.  Now driver has cxl_mem_get_poison(), so
reuse it and add a parameter 'bool report', report poison record to MCE
if set true.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c   | 8 ++++++--
 drivers/cxl/core/memdev.c | 4 ++--
 drivers/cxl/core/region.c | 8 ++++----
 drivers/cxl/cxlmem.h      | 2 +-
 4 files changed, 13 insertions(+), 9 deletions(-)

Comments

Dan Williams March 30, 2024, 1:50 a.m. UTC | #1
Shiyang Ruan wrote:
> The GMER only has "Physical Address" field, no such one indicates length.
> So, when a poison event is received, we could use GET_POISON_LIST command
> to get the poison list.  Now driver has cxl_mem_get_poison(), so
> reuse it and add a parameter 'bool report', report poison record to MCE
> if set true.

I am not sure I agree with the rationale here because there is no
correlation between the event being signaled and the current state of
the poison list. It also establishes race between multiple GMER events,
i.e. imagine the hardware sends 4 GMER events to communicate a 256B
poison discovery event. Does the driver need logic to support GMER event
2, 3, and 4 if it already say all 256B of poison after processing GMER
event 1?

I think the best the driver can do is assume at least 64B of poison
per-event and depend on multiple notifications to handle larger poison
lengths.

Otherwise, the poison list is really only useful for pre-populating
pages to offline after a reboot, i.e. to catch the kernel up with the
state of poison pages after a reboot.
Shiyang Ruan April 3, 2024, 2:56 p.m. UTC | #2
在 2024/3/30 9:50, Dan Williams 写道:
> Shiyang Ruan wrote:
>> The GMER only has "Physical Address" field, no such one indicates length.
>> So, when a poison event is received, we could use GET_POISON_LIST command
>> to get the poison list.  Now driver has cxl_mem_get_poison(), so
>> reuse it and add a parameter 'bool report', report poison record to MCE
>> if set true.
> 
> I am not sure I agree with the rationale here because there is no
> correlation between the event being signaled and the current state of
> the poison list. It also establishes race between multiple GMER events,
> i.e. imagine the hardware sends 4 GMER events to communicate a 256B
> poison discovery event. Does the driver need logic to support GMER event
> 2, 3, and 4 if it already say all 256B of poison after processing GMER
> event 1?

Yes, I didn't thought about that.

> 
> I think the best the driver can do is assume at least 64B of poison
> per-event and depend on multiple notifications to handle larger poison
> lengths.

Agree.  This also makes things easier.

And for qemu, I'm thinking of making a patch to limit the length of a 
poison record when injecting.  The length should between 64B to 4KiB per 
GMER. And emit many GMERs if length > 4KiB.

> 
> Otherwise, the poison list is really only useful for pre-populating
> pages to offline after a reboot, i.e. to catch the kernel up with the
> state of poison pages after a reboot.

Got it.


--
Thanks,
Ruan.
Jonathan Cameron April 4, 2024, 1:46 p.m. UTC | #3
On Wed, 3 Apr 2024 22:56:58 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> 在 2024/3/30 9:50, Dan Williams 写道:
> > Shiyang Ruan wrote:  
> >> The GMER only has "Physical Address" field, no such one indicates length.
> >> So, when a poison event is received, we could use GET_POISON_LIST command
> >> to get the poison list.  Now driver has cxl_mem_get_poison(), so
> >> reuse it and add a parameter 'bool report', report poison record to MCE
> >> if set true.  
> > 
> > I am not sure I agree with the rationale here because there is no
> > correlation between the event being signaled and the current state of
> > the poison list. It also establishes race between multiple GMER events,
> > i.e. imagine the hardware sends 4 GMER events to communicate a 256B
> > poison discovery event. Does the driver need logic to support GMER event
> > 2, 3, and 4 if it already say all 256B of poison after processing GMER
> > event 1?  
> 
> Yes, I didn't thought about that.
> 
> > 
> > I think the best the driver can do is assume at least 64B of poison
> > per-event and depend on multiple notifications to handle larger poison
> > lengths.  
> 
> Agree.  This also makes things easier.
> 
> And for qemu, I'm thinking of making a patch to limit the length of a 
> poison record when injecting.  The length should between 64B to 4KiB per 
> GMER. And emit many GMERs if length > 4KiB.

I'm not keen on such a restriction in QEMU.
QEMU is injecting lengths allowed by the specification.  That facility is
useful for testing the kernel and the QEMU modeling should not be based
on what the kernel supports.

When you said this I wondered if we had a clever implementation that fused
entries in the list, but we don't (I thought about doing so a long time
ago but seems I never bothered :)  So if you are using QEMU for testing
and you don't want to exceed the kernel supported poison lengths, don't
inject poison that big.

Jonathan

> 
> > 
> > Otherwise, the poison list is really only useful for pre-populating
> > pages to offline after a reboot, i.e. to catch the kernel up with the
> > state of poison pages after a reboot.  
> 
> Got it.
> 
> 
> --
> Thanks,
> Ruan.
diff mbox series

Patch

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 31b1b8711256..19b46fb06ed6 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1309,7 +1309,7 @@  void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 EXPORT_SYMBOL_NS_GPL(cxl_mem_report_poison, CXL);
 
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr)
+		       struct cxl_region *cxlr, bool report)
 {
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_poison_out *po;
@@ -1340,10 +1340,14 @@  int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		if (rc)
 			break;
 
-		for (int i = 0; i < le16_to_cpu(po->count); i++)
+		for (int i = 0; i < le16_to_cpu(po->count); i++) {
 			trace_cxl_poison(cxlmd, cxlr, &po->record[i],
 					 po->flags, po->overflow_ts,
 					 CXL_POISON_TRACE_LIST);
+			if (report)
+				cxl_mem_report_poison(cxlmd, cxlr,
+						      &po->record[i]);
+		}
 
 		/* Protect against an uncleared _FLAG_MORE */
 		nr_records = nr_records + le16_to_cpu(po->count);
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index d4e259f3a7e9..e976141ca4a9 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -200,14 +200,14 @@  static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
 	if (resource_size(&cxlds->pmem_res)) {
 		offset = cxlds->pmem_res.start;
 		length = resource_size(&cxlds->pmem_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc)
 			return rc;
 	}
 	if (resource_size(&cxlds->ram_res)) {
 		offset = cxlds->ram_res.start;
 		length = resource_size(&cxlds->ram_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		/*
 		 * Invalid Physical Address is not an error for
 		 * volatile addresses. Device support is optional.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 5c186e0a39b9..e83c46cb4dea 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2585,7 +2585,7 @@  static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 	if (ctx->mode == CXL_DECODER_RAM) {
 		offset = ctx->offset;
 		length = resource_size(&cxlds->ram_res) - offset;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT)
 			rc = 0;
 		if (rc)
@@ -2603,7 +2603,7 @@  static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 		return 0;
 	}
 
-	return cxl_mem_get_poison(cxlmd, offset, length, NULL);
+	return cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 }
 
 static int poison_by_decoder(struct device *dev, void *arg)
@@ -2637,7 +2637,7 @@  static int poison_by_decoder(struct device *dev, void *arg)
 	if (cxled->skip) {
 		offset = cxled->dpa_res->start - cxled->skip;
 		length = cxled->skip;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 			rc = 0;
 		if (rc)
@@ -2646,7 +2646,7 @@  static int poison_by_decoder(struct device *dev, void *arg)
 
 	offset = cxled->dpa_res->start;
 	length = cxled->dpa_res->end - offset + 1;
-	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region);
+	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region, false);
 	if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 		rc = 0;
 	if (rc)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 82f80eb381fb..1f03130b9d6a 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -832,7 +832,7 @@  void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 			   struct cxl_region *cxlr,
 			   struct cxl_poison_record *poison);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr);
+		       struct cxl_region *cxlr, bool report);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);