[RESEND,V3,0/4] rasdaemon: Add support for the CXL error events

Message ID 20230202181846.692-1-shiju.jose@huawei.com

Shiju Jose Feb. 2, 2023, 6:18 p.m. UTC
From: Shiju Jose <shiju.jose@huawei.com>

Log and record the following CXL errors reported through the kernel
trace events: CXL poison errors, CXL AER uncorrectable errors, and CXL AER
correctable errors.

Note1: This V3 patch set is being resent due to email delivery issues
with some of the recipients.

Note2: The default poll-and-read method that rasdaemon uses to receive
the trace events does not work due to a commit in the kernel trace system.
The pthread method was therefore used instead for testing the CXL error
events. To do the same, please make the following change in ras-events.c:
<change start ...>
/* rc = read_ras_event_all_cpus(data, cpus); */
rc = -255;
< ...change end >
/* Poll doesn't work on this kernel. Fallback to pthread way */
if (rc == -255) {
...
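
For readers unfamiliar with the fallback described in Note2, the control
flow can be sketched roughly as below. This is only an illustration under
stated assumptions, not the rasdaemon code: read_all_cpus_with_poll(),
per_cpu_reader(), read_events() and NUM_CPUS are hypothetical names, and
only the -255 sentinel is taken from the snippet above.

```c
#include <assert.h>
#include <pthread.h>

#define POLL_UNSUPPORTED (-255) /* sentinel value taken from the snippet above */
#define NUM_CPUS 4              /* hypothetical fixed CPU count for this sketch */

/* Records which per-CPU workers ran; read only after the threads are joined. */
static int events_read[NUM_CPUS];

/* Hypothetical stand-in for the poll-based reader. It always reports that
 * poll is unusable, mirroring the forced "rc = -255" used for testing. */
static int read_all_cpus_with_poll(void)
{
	return POLL_UNSUPPORTED;
}

/* Hypothetical per-CPU worker. A real worker would block reading that
 * CPU's trace buffer; this one only marks its slot. */
static void *per_cpu_reader(void *arg)
{
	int cpu = *(int *)arg;

	assert(cpu >= 0 && cpu < NUM_CPUS);
	events_read[cpu] = 1;
	return NULL;
}

/* Try the poll path first; if it reports the sentinel, fall back to one
 * thread per CPU. Returns the poll result, or the number of threads run. */
static int read_events(void)
{
	pthread_t threads[NUM_CPUS];
	int cpu_ids[NUM_CPUS];
	int started = 0;
	int rc = read_all_cpus_with_poll();

	if (rc != POLL_UNSUPPORTED)
		return rc;

	for (int i = 0; i < NUM_CPUS; i++) {
		cpu_ids[i] = i;
		if (pthread_create(&threads[i], NULL, per_cpu_reader,
				   &cpu_ids[i]) != 0)
			break;
		started++;
	}

	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return started;
}
```

In this sketch a call to read_events() always takes the thread path,
because the stand-in poll reader always returns the sentinel.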

Shiju Jose (4):
  rasdaemon: Move definition for BIT and BIT_ULL to a common file
  rasdaemon: Add support for the CXL poison events
  rasdaemon: Add support for the CXL AER uncorrectable errors
  rasdaemon: Add support for the CXL AER correctable errors
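
As background for the first patch, bit-mask helpers of this kind are
usually defined along the following lines. The macro bodies are an
assumption based on the common Linux-kernel-style definitions; the cover
letter does not show them, and set_bit_ull() and flag_is_set() are
hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions, following the common kernel-style form. */
#define BIT(nr)     (1UL << (nr))
#define BIT_ULL(nr) (1ULL << (nr))

/* Hypothetical helper: set bit 'nr' in a 64-bit flags word. */
static uint64_t set_bit_ull(uint64_t flags, unsigned int nr)
{
	assert(nr < 64); /* shifting by 64 or more is undefined behaviour */
	return flags | BIT_ULL(nr);
}

/* Hypothetical helper: test whether any bit of 'mask' is set in 'flags'. */
static int flag_is_set(uint64_t flags, uint64_t mask)
{
	return (flags & mask) != 0;
}
```

BIT_ULL is the safer choice for bit positions of 32 and above, since
unsigned long may be only 32 bits wide on some platforms.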

Changes:
RFC V2 -> V3
1. Fix for the comments from Dave Jiang.

RFC V1 -> V2
1. Rename uuid to region_uuid in the log and SQLite DB.
2. Rebase to the latest rasdaemon code.
3. Modify to match the name changes of interface structures and
   functions in the latest libtraceevent-dev, which is used in the rasdaemon.

 Makefile.am                |   7 +-
 configure.ac               |  11 ++
 ras-cxl-handler.c          | 378 +++++++++++++++++++++++++++++++++++++
 ras-cxl-handler.h          |  32 ++++
 ras-events.c               |  33 ++++
 ras-events.h               |   3 +
 ras-non-standard-handler.h |   3 -
 ras-record.c               | 203 ++++++++++++++++++++
 ras-record.h               |  49 +++++
 ras-report.c               | 219 +++++++++++++++++++++
 ras-report.h               |   6 +
 11 files changed, 940 insertions(+), 4 deletions(-)
 create mode 100644 ras-cxl-handler.c
 create mode 100644 ras-cxl-handler.h