From patchwork Thu Mar 21 02:53:15 2024
X-Patchwork-Submitter: Ruidong Tian
X-Patchwork-Id: 13598338
From: Ruidong Tian
To: catalin.marinas@arm.com, will@kernel.org, lpieralisi@kernel.org,
    guohanjun@huawei.com, sudeep.holla@arm.com, xueshuai@linux.alibaba.com,
    baolin.wang@linux.alibaba.com, linux-kernel@vger.kernel.org,
    linux-acpi@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    rafael@kernel.org, lenb@kernel.org, tony.luck@intel.com, bp@alien8.de,
    linux-edac@vger.kernel.org
Cc: tianruidond@linux.alibaba.com, Ruidong Tian
Subject: [PATCH v2 0/2] ARM Error Source Table V1 Support
Date: Thu, 21 Mar 2024 10:53:15 +0800
Message-Id: <20240321025317.114621-1-tianruidong@linux.alibaba.com>

This series adds support for the ARM Error Source Table (AEST) based on
the 1.1 version of ACPI for the Armv8 RAS Extensions [0].

The Arm Error Source Table (AEST) enables kernel-first handling of errors
in a system that supports the Armv8 RAS extensions.

In kernel-first mode, the kernel controls almost all RAS configuration,
including the CE threshold and interrupt enable/disable. A hardware error
triggers a RAS interrupt to the kernel. In irq context, the kernel scans
all AEST nodes to find the node that reported the error and then processes
the RAS error. The kernel acts as follows for the different error types:
  - CE, DE: use a workqueue to log the hardware error.
  - UER, UEO: call memory_failure.
  - UC, UEU: panic.

I have tested this series on the PTG Yitian710 SoC. Both corrected and
uncorrected errors were tested to verify the non-fatal vs fatal scenarios.

Future work:
1. Add CE storm mitigation.
2. Support AEST V2.

This series is based on Tyler Baicar's patches [1], for which no v2 has
been sent to the mailing list yet.

Changes from the original patch:
1. Add a genpool to collect all AEST errors, and log them in a workqueue
   rather than in irq context.
2. Use a single aest_proc function for both the system register interface
   and the MMIO interface.
3. Restructure some structures and functions to make them clearer.
4. Address all review comments from the mailing list discussion of
   Tyler Baicar's series.

Changes from V1:
https://lore.kernel.org/all/20240304111517.33001-1-tianruidong@linux.alibaba.com/
1. Marc Zyngier
  - Use readq/writeq_relaxed instead of readq/writeq for MMIO addresses.
  - Add sync for system register operations.
  - Use the irq_is_percpu_devid() helper to identify a per-CPU interrupt.
  - Other fixes.
2. Set the RAS CE threshold in the AEST driver.
3. Enable the RAS interrupt explicitly in the driver.
4. UER and UEO trigger memory_failure rather than panic.
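
For reviewers who want the severity policy at a glance, the mapping from
error type to action described above can be sketched in plain C as follows.
This is only an illustration and not code from the series: the enum names
and the aest_action_for() helper are hypothetical, and the actual driver
may structure the decision differently.

```c
#include <stdio.h>

/* Hypothetical severity codes, named after the error types listed above. */
enum aest_sev { SEV_CE, SEV_DE, SEV_UEO, SEV_UER, SEV_UEU, SEV_UC };

/* Hypothetical actions; the names used by the real driver may differ. */
enum aest_action { ACT_LOG_DEFERRED, ACT_MEMORY_FAILURE, ACT_PANIC };

/* Map an error severity to the action the cover letter describes. */
static enum aest_action aest_action_for(enum aest_sev sev)
{
	switch (sev) {
	case SEV_CE:
	case SEV_DE:
		/* Corrected/deferred: log later from a workqueue. */
		return ACT_LOG_DEFERRED;
	case SEV_UER:
	case SEV_UEO:
		/* Recoverable/restartable: hand off to memory_failure. */
		return ACT_MEMORY_FAILURE;
	case SEV_UC:
	case SEV_UEU:
	default:
		/* Uncontainable/unrecoverable (or unknown): panic. */
		return ACT_PANIC;
	}
}

/* Human-readable name for an action, useful when printing the policy. */
static const char *aest_action_name(enum aest_action act)
{
	switch (act) {
	case ACT_LOG_DEFERRED:
		return "log in workqueue";
	case ACT_MEMORY_FAILURE:
		return "memory_failure";
	default:
		return "panic";
	}
}
```

For example, aest_action_name(aest_action_for(SEV_UER)) yields
"memory_failure". Treating unknown severities as fatal is an assumption of
this sketch, not something stated in the series.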

[0]: https://developer.arm.com/documentation/den0085/0101/
[1]: https://lore.kernel.org/all/20211124170708.3874-1-baicar@os.amperecomputing.com/

Tyler Baicar (2):
  ACPI/AEST: Initial AEST driver
  trace, ras: add ARM RAS extension trace event

 MAINTAINERS                  |  11 +
 arch/arm64/include/asm/ras.h |  71 +++
 drivers/acpi/arm64/Kconfig   |  10 +
 drivers/acpi/arm64/Makefile  |   1 +
 drivers/acpi/arm64/aest.c    | 839 +++++++++++++++++++++++++++++++++++
 include/linux/acpi_aest.h    |  92 ++++
 include/linux/cpuhotplug.h   |   1 +
 include/ras/ras_event.h      |  55 +++
 8 files changed, 1080 insertions(+)
 create mode 100644 arch/arm64/include/asm/ras.h
 create mode 100644 drivers/acpi/arm64/aest.c
 create mode 100644 include/linux/acpi_aest.h