From: Shiju Jose
Subject: [PATCH V5 0/4] rasdaemon: Add support for the CXL error events
Date: Fri, 17 Mar 2023 14:23:47 +0000
Message-ID: <20230317142351.1234-1-shiju.jose@huawei.com>
X-Patchwork-Id: 13179102
X-Mailing-List: linux-cxl@vger.kernel.org

Log and record the following CXL errors reported
through the kernel trace events: CXL poison errors, CXL AER uncorrectable
errors, and CXL AER correctable errors.

Shiju Jose (4):
  rasdaemon: Move definition for BIT and BIT_ULL to a common file
  rasdaemon: Add support for the CXL poison events
  rasdaemon: Add support for the CXL AER uncorrectable errors
  rasdaemon: Add support for the CXL AER correctable errors

Changes:
V4 -> V5
1. Add logging of the device serial number for the CXL AER uncorrectable
   and CXL AER correctable errors.
2. Rebased.
3. Updated the patch descriptions.

V3 -> V4
1. Modifications for the changes in the kernel patches
   a) https://lore.kernel.org/lkml/cover.1675983077.git.alison.schofield@intel.com/
   b) https://lore.kernel.org/linux-cxl/63e5ed38d77d9_138fbc2947a@iweiny-mobl.notmuch/T/#t

V2 -> V3
1. Fix for the comments from Dave Jiang.

RFC V1 -> V2
1. Rename uuid to region_uuid in the log and SQLite DB.
2. Rebase to the latest rasdaemon code.
3. Modify to match the name changes of interface structures and
   functions in the latest libtraceevent-dev, used in rasdaemon.

 Makefile.am                |   7 +-
 configure.ac               |  11 +
 ras-cxl-handler.c          | 410 +++++++++++++++++++++++++++++++++++++
 ras-cxl-handler.h          |  32 +++
 ras-events.c               |  33 +++
 ras-events.h               |   3 +
 ras-non-standard-handler.h |   3 -
 ras-record.c               | 213 +++++++++++++++++++
 ras-record.h               |  54 +++++
 ras-report.c               | 229 +++++++++++++++++++
 ras-report.h               |   6 +
 11 files changed, 997 insertions(+), 4 deletions(-)
 create mode 100644 ras-cxl-handler.c
 create mode 100644 ras-cxl-handler.h