mbox series

[V2,0/4] rasdaemon: Add support for the CXL error events

Message ID 20230124165733.1452-1-shiju.jose@huawei.com (mailing list archive)
Headers show
Series rasdaemon: Add support for the CXL error events | expand

Message

Shiju Jose Jan. 24, 2023, 4:57 p.m. UTC
From: Shiju Jose <shiju.jose@huawei.com>

Log and record the following CXL errors reported through the kernel
trace events. CXL poison errors, CXL AER uncorrectable errors and CXL AER
correctable errors.

Note: The default poll method in the rasdaemon to receive the trace events
didn't work in the QEMU. Thus instead used the pthread way for
testing the CXL error events.
To do so, please make following change in the ras-events.c
<change start ...>
/* rc = read_ras_event_all_cpus(data, cpus); */
rc = -255;
< ...change end >
/* Poll doesn't work on this kernel. Fallback to pthread way */
if (rc == -255) {
...

Shiju Jose (4):
  rasdaemon: Move definition for BIT and BIT_ULL to a common file
  rasdaemon: Add support for the CXL poison events
  rasdaemon: Add support for the CXL AER uncorrectable errors
  rasdaemon: Add support for the CXL AER correctable errors

Changes:
RFC V1 -> V2
1. Rename uuid to region_uuid in the log and SQLite DB.
2. Rebase to the latest rasdaemon code.
3. Modify to match the name changes of interface structures and
   functions in the latest libtraceevent-dev, use in the rasdaemon.   

 Makefile.am                |   7 +-
 configure.ac               |  11 ++
 ras-cxl-handler.c          | 351 +++++++++++++++++++++++++++++++++++++
 ras-cxl-handler.h          |  32 ++++
 ras-events.c               |  33 ++++
 ras-events.h               |   3 +
 ras-non-standard-handler.h |   3 -
 ras-record.c               | 203 +++++++++++++++++++++
 ras-record.h               |  49 ++++++
 ras-report.c               | 219 +++++++++++++++++++++++
 ras-report.h               |   6 +
 11 files changed, 913 insertions(+), 4 deletions(-)
 create mode 100644 ras-cxl-handler.c
 create mode 100644 ras-cxl-handler.h