[v4,0/5] CXL Poison List Retrieval & Tracing

Message ID cover.1671135967.git.alison.schofield@intel.com

Alison Schofield Dec. 15, 2022, 9:17 p.m. UTC
From: Alison Schofield <alison.schofield@intel.com>

Changes in v4:
- Rebase on cxl/preview
- Squash 2 mock patches into 1 mock patch
- Apply Jonathan's Reviewed-by tags on Patches 1,2,4,5
- Don't return an error on failure to read volatile range poison (Jonathan)
- Use strong types in trace event arguments supplying dev_names (Dan)
- Pass the media-error record structure to trace event. (Steve, Ira)
- Re-order Patches 1 & 2 to make the change above work
- Use a poison state struct to hold buffer, lock (and max_mer) (Dan)
- Allocate the poison list payload buffer once (Dan)
- Request poison length in multiples of 64 bytes per CXL Spec
- Test for enabled when storing the Identify command's max_mer
- Use get_unaligned_le24() on poison max_mer (Jonathan)
- Use decimal values for size (rsvd[20]) (Dan)
- cxl_test: mock with a valid DPA address
- s/includes/'consists of' (Jonathan)
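
For readers unfamiliar with two of the items above, here is a minimal
user-space sketch of the arithmetic involved. It is not the code from
this series: le24_to_cpu() and poison_len_to_units() are hypothetical
names used only for illustration. The first mirrors what the kernel's
get_unaligned_le24() computes for a 3-byte little-endian field such as
max_mer; the second assumes the requested length is expressed in units
of 64 bytes and rounds a byte count up to that unit.

```c
#include <stdint.h>

/* Assumed unit size for poison list lengths: 64 bytes. */
#define POISON_LEN_UNIT 64u

/*
 * Hypothetical helper: decode a 24-bit little-endian value from three
 * possibly unaligned bytes, one byte at a time.
 */
static uint32_t le24_to_cpu(const uint8_t p[3])
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/*
 * Hypothetical helper: convert a length in bytes to a count of 64-byte
 * units, rounding up so that a partial unit is still covered.
 */
static uint64_t poison_len_to_units(uint64_t len_bytes)
{
	return (len_bytes + POISON_LEN_UNIT - 1) / POISON_LEN_UNIT;
}
```

Reading the field byte by byte avoids any alignment or host-endianness
assumption, which is the reason to prefer an unaligned accessor over a
cast of the raw payload pointer.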

Link to v3:
https://lore.kernel.org/linux-cxl/cover.1668115235.git.alison.schofield@intel.com/

Add support for retrieving device poison lists and storing the returned
error records as kernel trace events.

The handling of the poison list is guided by the CXL 3.0 Specification,
Section 8.2.9.8.4.1 [1].

Example, triggered by memdev:
$ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0

Example, triggered by region:
$ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
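
A consumer of these events might want to pull the numeric fields out of
a formatted line. The following is a small user-space sketch, assuming
only the key=value layout visible in the example output above;
trace_field_hex() is a hypothetical helper and not part of this series
or of any tracing library.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "key=<hex>" in a formatted cxl_poison line
 * and store the value in *out. Returns 0 on success, -1 if the key is
 * absent or its value is not a hex number.
 */
static int trace_field_hex(const char *line, const char *key, uint64_t *out)
{
	const char *p = line;
	size_t klen = strlen(key);

	while ((p = strstr(p, key)) != NULL) {
		/* Accept only a whole field name: preceded by a space
		 * (or at line start) and followed directly by '='. */
		if ((p == line || p[-1] == ' ') && p[klen] == '=')
			return sscanf(p + klen + 1, "%" SCNx64, out) == 1 ? 0 : -1;
		p += klen;
	}
	return -1;
}
```

The whole-name check keeps a lookup of "region" from matching inside
"region_uuid". For the first example line, looking up "length" yields
0x40 and "dpa" yields 0x0.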

[1]: https://www.computeexpresslink.org/download-the-specification

Alison Schofield (5):
  cxl/mbox: Add GET_POISON_LIST mailbox command
  cxl/trace: Add TRACE support for CXL media-error records
  cxl/memdev: Add trigger_poison_list sysfs attribute
  cxl/region: Add trigger_poison_list sysfs attribute
  tools/testing/cxl: Mock support for Get Poison List

 Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
 drivers/cxl/core/mbox.c                 | 79 +++++++++++++++++++++++
 drivers/cxl/core/memdev.c               | 45 ++++++++++++++
 drivers/cxl/core/region.c               | 33 ++++++++++
 drivers/cxl/core/trace.h                | 83 +++++++++++++++++++++++++
 drivers/cxl/cxlmem.h                    | 69 +++++++++++++++++++-
 drivers/cxl/pci.c                       |  4 ++
 tools/testing/cxl/test/mem.c            | 42 +++++++++++++
 8 files changed, 382 insertions(+), 1 deletion(-)


base-commit: a6591693d912a1cb88cc5a6d91a7b583481d3a84