[RFC,v2,0/6] cxl: add poison event handler

Message ID	20240329063614.362763-1-ruansy.fnst@fujitsu.com
Headers	show Received: from esa8.hc1455-7.c3s2.iphmx.com (esa8.hc1455-7.c3s2.iphmx.com [139.138.61.253]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 06C3B8BFC for <linux-cxl@vger.kernel.org>; Fri, 29 Mar 2024 06:36:26 +0000 (UTC) From: Shiyang Ruan <ruansy.fnst@fujitsu.com> To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org Cc: Jonathan.Cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net, ira.weiny@intel.com Subject: [RFC PATCH v2 0/6] cxl: add poison event handler Date: Fri, 29 Mar 2024 14:36:08 +0800 Message-Id: <20240329063614.362763-1-ruansy.fnst@fujitsu.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	cxl: add poison event handler \| expand [RFC,v2,0/6] cxl: add poison event handler [RFC,v2,1/6] cxl/core: correct length of DPA field masks [RFC,v2,2/6] cxl/core: introduce cxl_mem_report_poison() [RFC,v2,3/6] cxl/core: add report option for cxl_mem_get_poison() [RFC,v2,4/6] cxl/core: report poison when injecting from debugfs [RFC,v2,5/6] cxl: add definition for transaction types [RFC,v2,6/6] cxl/core: add poison injection event handler

Message ID

20240329063614.362763-1-ruansy.fnst@fujitsu.com

Headers

From: Shiyang Ruan <ruansy.fnst@fujitsu.com>
To: qemu-devel@nongnu.org,
	linux-cxl@vger.kernel.org
Cc: Jonathan.Cameron@huawei.com,
	dan.j.williams@intel.com,
	dave@stgolabs.net,
	ira.weiny@intel.com
Subject: [RFC PATCH v2 0/6] cxl: add poison event handler
Date: Fri, 29 Mar 2024 14:36:08 +0800
Message-Id: <20240329063614.362763-1-ruansy.fnst@fujitsu.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

cxl: add poison event handler | expand

Message

Shiyang Ruan March 29, 2024, 6:36 a.m. UTC

Changes:
RFCv1 -> RFCv2:
1. update commit message of PATCH 1
2. use memory_failure_queue() instead of MCE
3. also report poison in debugfs when injecting poison
4. correct DPA->HPA logic:
     find memdev's endpoint decoder to find the region it belongs to
5. distinguish transaction_type of GMER, only handle POISON related
     event for now


Currently driver only traces cxl events, poison injection (for both vmem
and pmem type) on cxl memdev is silent.  OS needs to be notified then it
could handle poison range in time.  Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.

So, add poison event handler in OS-First method:
  - qemu:
    - CXL device report POISON event to OS by MSI by sending GMER after
      injecting a poison record
  - CXL driver                                  <-- this patchset
    a. parse the POISON event from GMER;
    b. retrieve POISON list from memdev;
    c. translate poisoned DPA to HPA;
    d. enqueue poisoned PFN to memory_failure's work queue;


Shiyang Ruan (6):
  cxl/core: correct length of DPA field masks
  cxl/core: introduce cxl_mem_report_poison()
  cxl/core: add report option for cxl_mem_get_poison()
  cxl/core: report poison when injecting from debugfs
  cxl: add definition for transaction_type
  cxl/core: add poison injection event handler

 drivers/cxl/core/mbox.c   | 126 +++++++++++++++++++++++++++++++++-----
 drivers/cxl/core/memdev.c |   5 +-
 drivers/cxl/core/region.c |   8 +--
 drivers/cxl/core/trace.h  |   6 +-
 drivers/cxl/cxlmem.h      |  13 ++--
 include/linux/cxl-event.h |  17 ++++-
 6 files changed, 144 insertions(+), 31 deletions(-)

Comments

Alison Schofield March 29, 2024, 5:43 p.m. UTC | #1

On Fri, Mar 29, 2024 at 02:36:08PM +0800, Shiyang Ruan wrote:
> Changes:
> RFCv1 -> RFCv2:
> 1. update commit message of PATCH 1
> 2. use memory_failure_queue() instead of MCE
> 3. also report poison in debugfs when injecting poison
> 4. correct DPA->HPA logic:
>      find memdev's endpoint decoder to find the region it belongs to
> 5. distinguish transaction_type of GMER, only handle POISON related
>      event for now
> 
> 
> Currently driver only traces cxl events, poison injection (for both vmem
> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
> could handle poison range in time.  Per CXL spec, the device error event
> could be signaled through FW-First and OS-First methods.
> 
> So, add poison event handler in OS-First method:
>   - qemu:
>     - CXL device report POISON event to OS by MSI by sending GMER after
>       injecting a poison record
>   - CXL driver                                  <-- this patchset
>     a. parse the POISON event from GMER;
>     b. retrieve POISON list from memdev;
>     c. translate poisoned DPA to HPA;
>     d. enqueue poisoned PFN to memory_failure's work queue;

Hi,

Yesterday I posted code adding the HPAs to cxl_general_media & dram
events[1], so as I review this patchset today it's fresh in my mind.

Can we integrate this into the trace_ path directly:

1) On any GMER/poison, trigger a new poison list read

BTW - I'm not sure where to trigger that because we want to keep all
the locking in place and read by endpoints like is done now. It may
not be safe to sneak in a direct call to cxl_mem_get_poison()
as is done in this patch set.

2) Teach the poison list read trace event handler to call
memory_failure_queue().

Upon receipt of that new poison list, call memory_failture_queue()
on *any* poison in a mapped space. Is that OK?  Can we call
memory_failure_queue() on any and every poison report that is in
HPA space regardless of whether it first came to us through a GMER?
I'm actually wondering if that is going to be the next ask anyway -
ie report all poison.

I'll comment a bit more on individual patches.

--Alison

[1] https://lore.kernel.org/linux-cxl/cover.1711598777.git.alison.schofield@intel.com/

> 
> 
> Shiyang Ruan (6):
>   cxl/core: correct length of DPA field masks
>   cxl/core: introduce cxl_mem_report_poison()
>   cxl/core: add report option for cxl_mem_get_poison()
>   cxl/core: report poison when injecting from debugfs
>   cxl: add definition for transaction_type
>   cxl/core: add poison injection event handler
> 
>  drivers/cxl/core/mbox.c   | 126 +++++++++++++++++++++++++++++++++-----
>  drivers/cxl/core/memdev.c |   5 +-
>  drivers/cxl/core/region.c |   8 +--
>  drivers/cxl/core/trace.h  |   6 +-
>  drivers/cxl/cxlmem.h      |  13 ++--
>  include/linux/cxl-event.h |  17 ++++-
>  6 files changed, 144 insertions(+), 31 deletions(-)
> 
> -- 
> 2.34.1
> >

Dan Williams March 29, 2024, 6:22 p.m. UTC | #2

Alison Schofield wrote:
[..]
> Upon receipt of that new poison list, call memory_failture_queue()
> on *any* poison in a mapped space. Is that OK?  Can we call
> memory_failure_queue() on any and every poison report that is in
> HPA space regardless of whether it first came to us through a GMER?
> I'm actually wondering if that is going to be the next ask anyway -
> ie report all poison.

memory_failure_queue() should be called on poison creation events. Leave
the MF_ACTION_REQUIRED flag not set so that memory_failure() performs
"action optional" handling.  So I would expect memory_failure_queue()
notification for GMER events, but not on poison list events.

Alison Schofield March 29, 2024, 7:38 p.m. UTC | #3

On Fri, Mar 29, 2024 at 11:22:32AM -0700, Dan Williams wrote:
> Alison Schofield wrote:
> [..]
> > Upon receipt of that new poison list, call memory_failture_queue()
> > on *any* poison in a mapped space. Is that OK?  Can we call
> > memory_failure_queue() on any and every poison report that is in
> > HPA space regardless of whether it first came to us through a GMER?
> > I'm actually wondering if that is going to be the next ask anyway -
> > ie report all poison.
> 
> memory_failure_queue() should be called on poison creation events. Leave
> the MF_ACTION_REQUIRED flag not set so that memory_failure() performs
> "action optional" handling.  So I would expect memory_failure_queue()
> notification for GMER events, but not on poison list events.

Seems I totally missed the point of this patch set.
Is it's only purpose to make sure that poison that is injected gets
reported to memory_failure?

So this single patch only:
1. Poison inject leads to this GMER/CXL_EVENT_TRANSACTION_INJECT_POISON 
2. Driver sees GMER/CXL_EVENT_TRANSACTION_INJECT_POISON and reads poison
list to get accurate length.
3. Driver reports that to memory_failure_queue()

Still expect there's some code sharing opportunities and I still wonder
about what is next in this area.

--Alison

Dan Williams March 29, 2024, 8:56 p.m. UTC | #4

Alison Schofield wrote:
> On Fri, Mar 29, 2024 at 11:22:32AM -0700, Dan Williams wrote:
> > Alison Schofield wrote:
> > [..]
> > > Upon receipt of that new poison list, call memory_failture_queue()
> > > on *any* poison in a mapped space. Is that OK?  Can we call
> > > memory_failure_queue() on any and every poison report that is in
> > > HPA space regardless of whether it first came to us through a GMER?
> > > I'm actually wondering if that is going to be the next ask anyway -
> > > ie report all poison.
> > 
> > memory_failure_queue() should be called on poison creation events. Leave
> > the MF_ACTION_REQUIRED flag not set so that memory_failure() performs
> > "action optional" handling.  So I would expect memory_failure_queue()
> > notification for GMER events, but not on poison list events.
> 
> Seems I totally missed the point of this patch set.
> Is it's only purpose to make sure that poison that is injected gets
> reported to memory_failure?

Clarify terms, "poison injection" to me is a debug-only event to test
that poison handling is working, "poison creation" is an event where new
poison was encountered by CPU consumption, deposited by a
DMA-with-poison transaction, or discovered by a background scrub
operation.

> 
> So this single patch only:
> 1. Poison inject leads to this GMER/CXL_EVENT_TRANSACTION_INJECT_POISON 

Inject is a special case. Likely this should copy the PMEM legacy where
notifying memory_failure() on injected poison is optional:

"ndctl inject-error --no-notify"

> 2. Driver sees GMER/CXL_EVENT_TRANSACTION_INJECT_POISON and reads poison
> list to get accurate length.

Again, inject is the least interesting for the common case, production
kernels care about "Media ECC Error" and "Scrub Media ECC Error"
regardless of transaction type.

> 3. Driver reports that to memory_failure_queue()
> 
> Still expect there's some code sharing opportunities and I still wonder
> about what is next in this area.

One area this needs to be careful is in unifying the OS-first and
FW-first paths. In the FW-first case the platform can trigger
memory_failure() along with the GMER by just posting a memory failure
CPER record. So there is a risk that things get doubly reported if the
GMER handling code blindly triggers memory_failure(). Might be benign,
but probably best avoided.