From patchwork Fri Jan 3 15:07:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= X-Patchwork-Id: 11316943 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 10707139A for ; Fri, 3 Jan 2020 15:07:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D8CD322314 for ; Fri, 3 Jan 2020 15:07:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.de header.i=@amazon.de header.b="g/bqTZWT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727746AbgACPHn (ORCPT ); Fri, 3 Jan 2020 10:07:43 -0500 Received: from smtp-fw-6002.amazon.com ([52.95.49.90]:20597 "EHLO smtp-fw-6002.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727701AbgACPHn (ORCPT ); Fri, 3 Jan 2020 10:07:43 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1578064063; x=1609600063; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TInWHT1nyTPvbMS5z8zVcpKZsV2aYQwc/ahyoW7Tc+8=; b=g/bqTZWTnyFqcxFNxZ/WAaymmgnD+GtXtuhDOcXq6Dh99n3TsZuWOI2R aUy5d5PS0liRSfkC0nnwCOlHN3BpjjxUZHWGnPQ0HPa5cFXuANvJFxuYn 25EC9Qu0XQI7+WIN8HmIEoLeFFiLA3fNAjvNJKVzd5y+jy1Wy+7hMx1RT c=; IronPort-SDR: falGAzFI29AzgGAlTF6HJ8cdlr8CGF9r3fVDeGzcWyZcigigBzOOwnCcgoqK0nG/hMVSKii53Z +XFDLuKDNR7w== X-IronPort-AV: E=Sophos;i="5.69,391,1571702400"; d="scan'208";a="9979313" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2c-87a10be6.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-6002.iad6.amazon.com with ESMTP; 03 Jan 2020 15:07:40 +0000 Received: from u7588a65da6b65f.ant.amazon.com (pdx2-ws-svc-lb17-vlan3.amazon.com 
[10.247.140.70]) by email-inbound-relay-2c-87a10be6.us-west-2.amazon.com (Postfix) with ESMTPS id 79E2EA1D03; Fri, 3 Jan 2020 15:07:39 +0000 (UTC) Received: from u7588a65da6b65f.ant.amazon.com (localhost [127.0.0.1]) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTPS id 003F7bZo020445 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 3 Jan 2020 16:07:37 +0100 Received: (from jschoenh@localhost) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Submit) id 003F7bT0020444; Fri, 3 Jan 2020 16:07:37 +0100 From: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= To: Borislav Petkov Cc: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= , Yazen Ghannam , linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, Tony Luck , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org Subject: [PATCH v2 1/6] x86/mce: Take action on UCNA/Deferred errors again Date: Fri, 3 Jan 2020 16:07:17 +0100 Message-Id: <20200103150722.20313-2-jschoenh@amazon.de> X-Mailer: git-send-email 2.22.0.3.gb49bb57c8208.dirty In-Reply-To: <20200103150722.20313-1-jschoenh@amazon.de> References: <20200103150722.20313-1-jschoenh@amazon.de> MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org Commit fa92c5869426 ("x86, mce: Support memory error recovery for both UCNA and Deferred error in machine_check_poll") added handling of UCNA and Deferred errors by adding them to the ring for SRAO errors. Later, commit fd4cf79fcc4b ("x86/mce: Remove the MCE ring for Action Optional errors") switched storage from the SRAO ring to the unified pool that is still in use today. In order to only act on the intended errors, a filter for MCE_AO_SEVERITY is used -- effectively removing handling of UCNA/Deferred errors again. Extend the severity filter to include UCNA/Deferred errors again. Also, generalize the naming of the notifier from SRAO to UC to capture the extended scope. 
Note that this change may cause a message like the following to appear, as the same address may be reported as SRAO and as UCNA: Memory failure: 0x5fe3284: already hardware poisoned Technically, this is a return to previous behavior. Fixes: fd4cf79fcc4b ("x86/mce: Remove the MCE ring for Action Optional errors") Signed-off-by: Jan H. Schönherr Acked-by: Tony Luck --- Changes v1->v2: - rename notifier from SRAO to UC (as requested by Tony) - extend commit message (per remark from Tony) - don't mention Linux versions (per remark from Boris) There was some discussion on v1 about whether the SRAO/UC notifier does the right thing or not. While it seems to be correct as is for Intel (per Tony), there were some concerns for AMD (per Yazen). Hence, there is a new patch 5 in this series, which disables the notifier on AMD. --- arch/x86/include/asm/mce.h | 2 +- arch/x86/kernel/cpu/mce/core.c | 31 ++++++++++++++++--------------- 2 files changed, 17 insertions(+), 16 deletions(-) diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h index dc2d4b206ab7..c8ff6f6750ef 100644 --- a/arch/x86/include/asm/mce.h +++ b/arch/x86/include/asm/mce.h @@ -144,7 +144,7 @@ struct mce_log_buffer { enum mce_notifier_prios { MCE_PRIO_FIRST = INT_MAX, - MCE_PRIO_SRAO = INT_MAX - 1, + MCE_PRIO_UC = INT_MAX - 1, MCE_PRIO_EXTLOG = INT_MAX - 2, MCE_PRIO_NFIT = INT_MAX - 3, MCE_PRIO_EDAC = INT_MAX - 4, diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 8994fe7751a4..16134ce587fd 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -156,10 +156,8 @@ void mce_log(struct mce *m) } EXPORT_SYMBOL_GPL(mce_log); -static struct notifier_block mce_srao_nb; - /* - * We run the default notifier if we have only the SRAO, the first and the + * We run the default notifier if we have only the UC, the first and the * default notifier registered. I.e., the mandatory NUM_DEFAULT_NOTIFIERS * notifiers registered on the chain. 
*/ @@ -580,26 +578,29 @@ static struct notifier_block first_nb = { .priority = MCE_PRIO_FIRST, }; -static int srao_decode_notifier(struct notifier_block *nb, unsigned long val, - void *data) +static int uc_decode_notifier(struct notifier_block *nb, unsigned long val, + void *data) { struct mce *mce = (struct mce *)data; unsigned long pfn; - if (!mce) + if (!mce || !mce_usable_address(mce)) return NOTIFY_DONE; - if (mce_usable_address(mce) && (mce->severity == MCE_AO_SEVERITY)) { - pfn = mce->addr >> PAGE_SHIFT; - if (!memory_failure(pfn, 0)) - set_mce_nospec(pfn); - } + if (mce->severity != MCE_AO_SEVERITY && + mce->severity != MCE_DEFERRED_SEVERITY) + return NOTIFY_DONE; + + pfn = mce->addr >> PAGE_SHIFT; + if (!memory_failure(pfn, 0)) + set_mce_nospec(pfn); return NOTIFY_OK; } -static struct notifier_block mce_srao_nb = { - .notifier_call = srao_decode_notifier, - .priority = MCE_PRIO_SRAO, + +static struct notifier_block mce_uc_nb = { + .notifier_call = uc_decode_notifier, + .priority = MCE_PRIO_UC, }; static int mce_default_notifier(struct notifier_block *nb, unsigned long val, @@ -1970,7 +1971,7 @@ int __init mcheck_init(void) { mcheck_intel_therm_init(); mce_register_decode_chain(&first_nb); - mce_register_decode_chain(&mce_srao_nb); + mce_register_decode_chain(&mce_uc_nb); mce_register_decode_chain(&mce_default_nb); mcheck_vendor_init_severity(); From patchwork Fri Jan 3 15:07:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= X-Patchwork-Id: 11316949 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 01205109A for ; Fri, 3 Jan 2020 15:08:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D314622314 for ; Fri, 3 Jan 2020 15:08:03 +0000 (UTC) Authentication-Results: 
mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.de header.i=@amazon.de header.b="lmzDQfpR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727791AbgACPID (ORCPT ); Fri, 3 Jan 2020 10:08:03 -0500 Received: from smtp-fw-2101.amazon.com ([72.21.196.25]:64088 "EHLO smtp-fw-2101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727701AbgACPID (ORCPT ); Fri, 3 Jan 2020 10:08:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1578064082; x=1609600082; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ilxhrqiz2sA0xCthbk4rbmbS2FVPt/6iJofX5fmxfxs=; b=lmzDQfpRvRrdUhojVEjvWcTJUKrDmhwiXVzihyVNvH4ai1QPD+mI4yNU nn3ZPyDR4i0EAXDJkEKJtvWoZqOF6wt8Cz2Nmbpjbizo9rKt619zkVMkZ naSDiElI+S1WkYHOOxna8apGlbNlqYQyDh3u7bVHso7u+UJYhPARRW0CB s=; IronPort-SDR: Oa/ojuSblhpUdRZMEVQrq0/DmPhELYjRcS+BsILcI1wiMfgpmZPicI7pxuGVx0EHyidjvUARu4 tDS7RvKJ+U9w== X-IronPort-AV: E=Sophos;i="5.69,391,1571702400"; d="scan'208";a="10900535" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-1d-9ec21598.us-east-1.amazon.com) ([10.43.8.2]) by smtp-border-fw-out-2101.iad2.amazon.com with ESMTP; 03 Jan 2020 15:07:44 +0000 Received: from u7588a65da6b65f.ant.amazon.com (iad7-ws-svc-lb50-vlan3.amazon.com [10.0.93.214]) by email-inbound-relay-1d-9ec21598.us-east-1.amazon.com (Postfix) with ESMTPS id 737AFA1D0E; Fri, 3 Jan 2020 15:07:39 +0000 (UTC) Received: from u7588a65da6b65f.ant.amazon.com (localhost [127.0.0.1]) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTPS id 003F7bDS020457 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 3 Jan 2020 16:07:37 +0100 Received: (from jschoenh@localhost) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Submit) id 003F7bQ5020456; Fri, 3 Jan 2020 16:07:37 +0100 From: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= To: Borislav Petkov Cc: 
=?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= , Yazen Ghannam , linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, Tony Luck , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org Subject: [PATCH v2 2/6] x86/mce: Make mce=nobootlog work again Date: Fri, 3 Jan 2020 16:07:18 +0100 Message-Id: <20200103150722.20313-3-jschoenh@amazon.de> X-Mailer: git-send-email 2.22.0.3.gb49bb57c8208.dirty In-Reply-To: <20200103150722.20313-1-jschoenh@amazon.de> References: <20200103150722.20313-1-jschoenh@amazon.de> MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org Since commit 8b38937b7ab5 ("x86/mce: Do not enter deferred errors into the generic pool twice") the mce=nobootlog option has become mostly ineffective (after being only slightly ineffective before), as the code is taking actions on MCEs left over from boot when they have a usable address. Move the check for MCP_DONTLOG a bit outward to make it effective again. Also, since commit 011d82611172 ("RAS: Add a Corrected Errors Collector") the two branches of the remaining "if" at the bottom of machine_check_poll() do the same. Unify them. Signed-off-by: Jan H. 
Schönherr --- Changes v1->v2: - remove an indentation level in favor of a goto (requested by Boris) - don't mention Linux version (per remark from Boris) --- arch/x86/kernel/cpu/mce/core.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 16134ce587fd..0ccd6cf3402d 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -750,26 +750,22 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b) log_it: error_seen = true; - mce_read_aux(&m, i); + if (flags & MCP_DONTLOG) + goto clear_it; + mce_read_aux(&m, i); m.severity = mce_severity(&m, mca_cfg.tolerant, NULL, false); - /* * Don't get the IP here because it's unlikely to * have anything to do with the actual error location. */ - if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce) - mce_log(&m); - else if (mce_usable_address(&m)) { - /* - * Although we skipped logging this, we still want - * to take action. Add to the pool so the registered - * notifiers will see it. - */ - if (!mce_gen_pool_add(&m)) - mce_schedule_work(); - } + if (mca_cfg.dont_log_ce && !mce_usable_address(&m)) + goto clear_it; + + mce_log(&m); + +clear_it: /* * Clear state for this bank. 
*/ From patchwork Fri Jan 3 15:07:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= X-Patchwork-Id: 11316947 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3C6CA109A for ; Fri, 3 Jan 2020 15:07:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1AE3A22522 for ; Fri, 3 Jan 2020 15:07:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.de header.i=@amazon.de header.b="YtyCxLXe" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727930AbgACPHo (ORCPT ); Fri, 3 Jan 2020 10:07:44 -0500 Received: from smtp-fw-4101.amazon.com ([72.21.198.25]:1991 "EHLO smtp-fw-4101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727769AbgACPHo (ORCPT ); Fri, 3 Jan 2020 10:07:44 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1578064063; x=1609600063; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tJ+5iDQ/bAjdGRg85z5wISvP/YmqlU1qWWJuDNu53GA=; b=YtyCxLXeOAiEb+hLTmuEwTvpqskGdlWjUoe+ooyHH2vs3w0W9n6tZpfI LBfT0+krAqOBdOi7TM9yeQnquyW6bGxGD+ExR9lvN2cG7RJYie7mLUW9L j4GTi89qJBCIpHQhCYsHp4PXTU4YVPsRVKzlNttDE6LBYi/aEQlWjZPAz 4=; IronPort-SDR: 6JCejOW49GdUl02089QUnN3QGIvR5rbC69HjFv2Pmg3ANnTfSbTR7m2evmt/aRx0vrcWtiVBBE 2eAjLLQPTp6g== X-IronPort-AV: E=Sophos;i="5.69,390,1571702400"; d="scan'208";a="10829612" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-2b-baacba05.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-4101.iad4.amazon.com with ESMTP; 03 Jan 2020 15:07:41 +0000 Received: from u7588a65da6b65f.ant.amazon.com (pdx2-ws-svc-lb17-vlan3.amazon.com 
[10.247.140.70]) by email-inbound-relay-2b-baacba05.us-west-2.amazon.com (Postfix) with ESMTPS id ABF44A22F7; Fri, 3 Jan 2020 15:07:39 +0000 (UTC) Received: from u7588a65da6b65f.ant.amazon.com (localhost [127.0.0.1]) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTPS id 003F7cnN020465 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 3 Jan 2020 16:07:38 +0100 Received: (from jschoenh@localhost) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Submit) id 003F7cj2020464; Fri, 3 Jan 2020 16:07:38 +0100 From: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= To: Borislav Petkov Cc: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= , Yazen Ghannam , linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, Tony Luck , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org Subject: [PATCH v2 3/6] x86/mce: Fix use of uninitialized MCE message string Date: Fri, 3 Jan 2020 16:07:19 +0100 Message-Id: <20200103150722.20313-4-jschoenh@amazon.de> X-Mailer: git-send-email 2.22.0.3.gb49bb57c8208.dirty In-Reply-To: <20200103150722.20313-1-jschoenh@amazon.de> References: <20200103150722.20313-1-jschoenh@amazon.de> MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org The function mce_severity() is not required to update its msg argument. In fact, mce_severity_amd() does not, which makes mce_no_way_out() return uninitialized data, which may be used later for printing. Assuming that implementations of mce_severity() either always or never update the msg argument (which is currently the case), it is sufficient to initialize the temporary variable in mce_no_way_out(). While at it, avoid printing a useless "Unknown". Signed-off-by: Jan H. 
Schönherr --- Changes v1->v2: - simplify fix by assuming that mce_severity() either always or never updates the msg argument -- as opposed to mce_severity() having the freedom to decide on a case by case basis (requested by Boris); - stop printing "Unknown" (requested by Boris). --- arch/x86/kernel/cpu/mce/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 0ccd6cf3402d..1d91ce956772 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -790,7 +790,7 @@ EXPORT_SYMBOL_GPL(machine_check_poll); static int mce_no_way_out(struct mce *m, char **msg, unsigned long *validp, struct pt_regs *regs) { - char *tmp; + char *tmp = *msg; int i; for (i = 0; i < this_cpu_read(mce_num_banks); i++) { @@ -1209,8 +1209,8 @@ void do_machine_check(struct pt_regs *regs, long error_code) DECLARE_BITMAP(toclear, MAX_NR_BANKS); struct mca_config *cfg = &mca_cfg; int cpu = smp_processor_id(); - char *msg = "Unknown"; struct mce m, *final; + char *msg = NULL; int worst = 0; /* From patchwork Fri Jan 3 15:07:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= X-Patchwork-Id: 11316945 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 028E7109A for ; Fri, 3 Jan 2020 15:07:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D2EBC22522 for ; Fri, 3 Jan 2020 15:07:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.de header.i=@amazon.de header.b="Fk6TLzpd" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727769AbgACPHp (ORCPT ); Fri, 3 Jan 2020 10:07:45 -0500 Received: from smtp-fw-4101.amazon.com 
([72.21.198.25]:2000 "EHLO smtp-fw-4101.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727701AbgACPHp (ORCPT ); Fri, 3 Jan 2020 10:07:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1578064064; x=1609600064; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iHsaaNSb6QuBd5deQsg49o2tayu5boXDGXFCOP8fQzk=; b=Fk6TLzpdu7D3QWekW6/8JbQ6qiy+I3Y6wm0Z5KU9X3bJN7+e5Bvmpzd4 gvv5igoUd90w7nxGRwyacv26BSllz8JX2AjHwqJ0DEc/y9g7tVWmbyD5Z LFpdKN/GMUSxnKnQS/kzBNSEOxovQBzH29rqGT2mqzaVKb6gHcC3hF5qe A=; IronPort-SDR: dzSxum0Eg4ym7IdAnqI5ApmAWoDS4hbW1dA4MiC2R/lbhYUFwc/P4ZlINqPgjvjg++vg67PfRM Hl3Us7UKpoPg== X-IronPort-AV: E=Sophos;i="5.69,390,1571702400"; d="scan'208";a="10829622" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-1e-27fb8269.us-east-1.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-4101.iad4.amazon.com with ESMTP; 03 Jan 2020 15:07:44 +0000 Received: from u7588a65da6b65f.ant.amazon.com (iad7-ws-svc-lb50-vlan3.amazon.com [10.0.93.214]) by email-inbound-relay-1e-27fb8269.us-east-1.amazon.com (Postfix) with ESMTPS id A32F0A1C46; Fri, 3 Jan 2020 15:07:40 +0000 (UTC) Received: from u7588a65da6b65f.ant.amazon.com (localhost [127.0.0.1]) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTPS id 003F7coI020473 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 3 Jan 2020 16:07:38 +0100 Received: (from jschoenh@localhost) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Submit) id 003F7cqx020472; Fri, 3 Jan 2020 16:07:38 +0100 From: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= To: Borislav Petkov Cc: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= , Yazen Ghannam , linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, Tony Luck , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org Subject: [PATCH v2 4/6] x86/mce: Allow a variable number of internal MCE decode notifiers Date: Fri, 3 Jan 2020 16:07:20 +0100 Message-Id: <20200103150722.20313-5-jschoenh@amazon.de> X-Mailer: git-send-email 2.22.0.3.gb49bb57c8208.dirty In-Reply-To: <20200103150722.20313-1-jschoenh@amazon.de> References: <20200103150722.20313-1-jschoenh@amazon.de> MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org Get rid of the compile time constant of internal (or mandatory) MCE decode notifiers in preparation for future changes. Instead, distinguish explicitly between internal and external MCE decode notifiers. Signed-off-by: Jan H. Schönherr --- New in v2, preparation for patches 5 and 6. --- arch/x86/kernel/cpu/mce/core.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 1d91ce956772..d48deb127071 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -157,21 +157,24 @@ void mce_log(struct mce *m) EXPORT_SYMBOL_GPL(mce_log); /* - * We run the default notifier if we have only the UC, the first and the - * default notifier registered. I.e., the mandatory NUM_DEFAULT_NOTIFIERS + * We run the default notifier as long as we have no external * notifiers registered on the chain. 
*/ -#define NUM_DEFAULT_NOTIFIERS 3 static atomic_t num_notifiers; -void mce_register_decode_chain(struct notifier_block *nb) +static void mce_register_decode_chain_internal(struct notifier_block *nb) { if (WARN_ON(nb->priority > MCE_PRIO_MCELOG && nb->priority < MCE_PRIO_EDAC)) return; + blocking_notifier_chain_register(&x86_mce_decoder_chain, nb); +} + +void mce_register_decode_chain(struct notifier_block *nb) +{ atomic_inc(&num_notifiers); - blocking_notifier_chain_register(&x86_mce_decoder_chain, nb); + mce_register_decode_chain_internal(nb); } EXPORT_SYMBOL_GPL(mce_register_decode_chain); @@ -611,7 +614,7 @@ static int mce_default_notifier(struct notifier_block *nb, unsigned long val, if (!m) return NOTIFY_DONE; - if (atomic_read(&num_notifiers) > NUM_DEFAULT_NOTIFIERS) + if (atomic_read(&num_notifiers)) return NOTIFY_DONE; __print_mce(m); @@ -1966,9 +1969,9 @@ __setup("mce", mcheck_enable); int __init mcheck_init(void) { mcheck_intel_therm_init(); - mce_register_decode_chain(&first_nb); - mce_register_decode_chain(&mce_uc_nb); - mce_register_decode_chain(&mce_default_nb); + mce_register_decode_chain_internal(&first_nb); + mce_register_decode_chain_internal(&mce_uc_nb); + mce_register_decode_chain_internal(&mce_default_nb); mcheck_vendor_init_severity(); INIT_WORK(&mce_work, mce_gen_pool_process); From patchwork Fri Jan 3 15:07:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= X-Patchwork-Id: 11316953 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CBB38139A for ; Fri, 3 Jan 2020 15:08:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AA3DD22314 for ; Fri, 3 Jan 2020 15:08:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) 
header.d=amazon.de header.i=@amazon.de header.b="DQ3+qIP1" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727980AbgACPIE (ORCPT ); Fri, 3 Jan 2020 10:08:04 -0500 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:28273 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727701AbgACPIE (ORCPT ); Fri, 3 Jan 2020 10:08:04 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1578064084; x=1609600084; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4sqTgTN+u6x0cMWwPxL9+fdn8jhOiyRZfXoDVoHTe7Q=; b=DQ3+qIP1M75gapuwNM3ThdLG0tng8bAt7LT1AhnFBlmyVo9XJsqeRMh+ pJaQ1P0hp87PVaW+F/GJPs8cpgjdtnMIa+3zPtRJeJjpr2daYungaqy90 WZZ+xY2HeJ3qrkgCWxfNcMkgCCBcX+QJ+q0ZC0NGSliMpMlNYi/UI1FBo U=; IronPort-SDR: vKEXyCsVAAGBvB15UoH4WLyjHsp29/C13vcFVXRdggYw9zMoCecdzC/wyAVP6B2n+SHy8eF6v4 gzHRlF1vWvkA== X-IronPort-AV: E=Sophos;i="5.69,390,1571702400"; d="scan'208";a="16652117" Received: from sea32-co-svc-lb4-vlan3.sea.corp.amazon.com (HELO email-inbound-relay-2a-f14f4a47.us-west-2.amazon.com) ([10.47.23.38]) by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP; 03 Jan 2020 15:07:41 +0000 Received: from u7588a65da6b65f.ant.amazon.com (pdx2-ws-svc-lb17-vlan3.amazon.com [10.247.140.70]) by email-inbound-relay-2a-f14f4a47.us-west-2.amazon.com (Postfix) with ESMTPS id E0760A18E5; Fri, 3 Jan 2020 15:07:39 +0000 (UTC) Received: from u7588a65da6b65f.ant.amazon.com (localhost [127.0.0.1]) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTPS id 003F7cMf020481 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 3 Jan 2020 16:07:38 +0100 Received: (from jschoenh@localhost) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Submit) id 003F7ckP020480; Fri, 3 Jan 2020 16:07:38 +0100 From: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= To: Borislav Petkov Cc: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= 
, Yazen Ghannam , linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, Tony Luck , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , x86@kernel.org Subject: [PATCH v2 5/6] x86/mce: Do not take action on SRAO/Deferred errors on AMD for now Date: Fri, 3 Jan 2020 16:07:21 +0100 Message-Id: <20200103150722.20313-6-jschoenh@amazon.de> X-Mailer: git-send-email 2.22.0.3.gb49bb57c8208.dirty In-Reply-To: <20200103150722.20313-1-jschoenh@amazon.de> References: <20200103150722.20313-1-jschoenh@amazon.de> MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org Per Yazen Ghannam we should not use the UC notifier for the time being on AMD. Reported-by: Yazen Ghannam Signed-off-by: Jan H. Schönherr --- New in v2. This is due to a remark from Yazen on v1 that we should handle neither SRAO nor Deferred errors in that handler. An alternative implementation would do the architecture "if" directly within uc_decode_notifier(), in which case we could decide to not apply patch 4. 
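For illustration, the alternative with the vendor check inside the notifier could look roughly like the userspace model below. This is only a sketch under assumptions: struct mce_model, boot_vendor, poison_page(), the enum values and the page shift of 12 are hypothetical stand-ins and not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel definitions; values are illustrative. */
enum { NOTIFY_DONE = 0, NOTIFY_OK = 1 };
enum vendor { VENDOR_INTEL, VENDOR_AMD };
enum severity { SEV_NO, SEV_DEFERRED, SEV_AO, SEV_AR };

#define MODEL_PAGE_SHIFT 12	/* assumed 4 KiB pages */

struct mce_model {
	enum severity severity;
	bool usable_addr;		/* models mce_usable_address() */
	unsigned long long addr;
};

static enum vendor boot_vendor = VENDOR_INTEL;
static unsigned long last_poisoned_pfn;
static int poison_calls;

/* Stand-in for memory_failure() + set_mce_nospec(); only records the pfn. */
static void poison_page(unsigned long pfn)
{
	last_poisoned_pfn = pfn;
	poison_calls++;
}

/* Same filter order as uc_decode_notifier(), plus a vendor check up front. */
static int uc_decode_model(const struct mce_model *m)
{
	if (boot_vendor == VENDOR_AMD)
		return NOTIFY_DONE;

	if (!m || !m->usable_addr)
		return NOTIFY_DONE;

	if (m->severity != SEV_AO && m->severity != SEV_DEFERRED)
		return NOTIFY_DONE;

	poison_page((unsigned long)(m->addr >> MODEL_PAGE_SHIFT));
	return NOTIFY_OK;
}
```

With this variant the notifier stays registered on every vendor and simply declines to act on AMD, so the registration code would not need to know about vendors.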
--- arch/x86/kernel/cpu/mce/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index d48deb127071..d8fe5b048ee7 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -1970,7 +1970,8 @@ int __init mcheck_init(void) { mcheck_intel_therm_init(); mce_register_decode_chain_internal(&first_nb); - mce_register_decode_chain_internal(&mce_uc_nb); + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) + mce_register_decode_chain_internal(&mce_uc_nb); mce_register_decode_chain_internal(&mce_default_nb); mcheck_vendor_init_severity(); From patchwork Fri Jan 3 15:07:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= X-Patchwork-Id: 11316955 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B5424139A for ; Fri, 3 Jan 2020 15:08:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 935E5222C3 for ; Fri, 3 Jan 2020 15:08:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.de header.i=@amazon.de header.b="DMyYrFTZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727975AbgACPIE (ORCPT ); Fri, 3 Jan 2020 10:08:04 -0500 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:28273 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727742AbgACPID (ORCPT ); Fri, 3 Jan 2020 10:08:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.de; i=@amazon.de; q=dns/txt; s=amazon201209; t=1578064083; x=1609600083; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Z3tkOD4FG1FNGsCTSm+doFuNCbFR2kiicI330Grt1Ls=; 
b=DMyYrFTZ5N/znj46NzoqHqlQK+IVJ0E8Raq272GnZlp89a182pe1LFel azIxYGzVQrwNF8FSlKGsk0DmkW3rHNQfBCMo4s0aegB/Fd3Fg/q2G8/SB q1xBpF1ihrSGtMSkvU+6WVoMgpQ2yvF3a07ngUqk0RF3oI7T22kiccDGQ 0=; IronPort-SDR: j/1aZfZMIEify9CS7bxtKFpVmv3DbjadmiJ6p8Keovlx4Hcznlj640wtQ3rbdsm0sb1sgXa8tG rdIoBRWwoGYQ== X-IronPort-AV: E=Sophos;i="5.69,390,1571702400"; d="scan'208";a="16652123" Received: from sea32-co-svc-lb4-vlan3.sea.corp.amazon.com (HELO email-inbound-relay-1a-7d76a15f.us-east-1.amazon.com) ([10.47.23.38]) by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP; 03 Jan 2020 15:07:44 +0000 Received: from u7588a65da6b65f.ant.amazon.com (iad7-ws-svc-lb50-vlan2.amazon.com [10.0.93.210]) by email-inbound-relay-1a-7d76a15f.us-east-1.amazon.com (Postfix) with ESMTPS id B9D05A2861; Fri, 3 Jan 2020 15:07:40 +0000 (UTC) Received: from u7588a65da6b65f.ant.amazon.com (localhost [127.0.0.1]) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Debian-3) with ESMTPS id 003F7cx3020489 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 3 Jan 2020 16:07:38 +0100 Received: (from jschoenh@localhost) by u7588a65da6b65f.ant.amazon.com (8.15.2/8.15.2/Submit) id 003F7cHx020488; Fri, 3 Jan 2020 16:07:38 +0100 From: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= To: Borislav Petkov Cc: =?utf-8?q?Jan_H=2E_Sch=C3=B6nherr?= , Yazen Ghannam , linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org, Tony Luck , Thomas Gleixner , Ingo Molnar , "H. 
Peter Anvin" , x86@kernel.org Subject: [PATCH v2 6/6] x86/mce: Dynamically register default MCE handler Date: Fri, 3 Jan 2020 16:07:22 +0100 Message-Id: <20200103150722.20313-7-jschoenh@amazon.de> X-Mailer: git-send-email 2.22.0.3.gb49bb57c8208.dirty In-Reply-To: <20200103150722.20313-1-jschoenh@amazon.de> References: <20200103150722.20313-1-jschoenh@amazon.de> MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org The default MCE handler takes action when no external MCE handler is registered. Instead of checking for external handlers within the default MCE handler, only register the default MCE handler when there are no external handlers in the first place. Signed-off-by: Jan H. Schönherr Reported-by: kbuild test robot --- New in v2. This is something that became possible due to patch 4. I'm not entirely happy with it. On the one hand, I'm wondering whether there's a nicer way to handle (de-)registration races. On the other hand, I'm starting to question the whole logic to "only print the MCE if nothing else in the kernel has a handler registered". Is that really how it should be? For example, there are handlers that filter for a specific subset of MCEs. If one of those is registered, we're losing all information for MCEs that don't match. A possible solution to the latter would be to have a "handled" or "printed" flag within "struct mce" and print the MCE based on that in the default handler. What do you think? 
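To make the "handled" flag idea a bit more concrete, here is a rough userspace model. It is a sketch only: the handled field, struct mce_model, the handler names and the bank number are all hypothetical, and nothing like this exists in struct mce today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mce_model {
	int bank;
	bool handled;	/* hypothetical flag, not a field of the real struct mce */
};

typedef void (*mce_handler_t)(struct mce_model *m);

static int printed_count;	/* counts how often the default handler "printed" */

/* Models an external handler that filters for a subset of MCEs (bank 4 only). */
static void bank4_handler(struct mce_model *m)
{
	if (m->bank != 4)
		return;

	m->handled = true;
}

/* Models the default handler: print only what nobody else has handled. */
static void default_handler(struct mce_model *m)
{
	if (m->handled)
		return;

	printed_count++;
	m->handled = true;
}

/* Calls the handlers in order, like a notifier chain sorted by priority. */
static void run_chain(struct mce_model *m, const mce_handler_t *chain, size_t n)
{
	size_t i;

	assert(m != NULL);
	for (i = 0; i < n; i++)
		chain[i](m);
}
```

In this model the default handler is always on the chain, and an MCE that no filtering handler claims is still printed, which is the information that gets lost with the current logic.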
--- arch/x86/kernel/cpu/mce/core.c | 90 ++++++++++++++++++++-------------- 1 file changed, 52 insertions(+), 38 deletions(-) diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index d8fe5b048ee7..3b6e37c5252f 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -156,36 +156,6 @@ void mce_log(struct mce *m) } EXPORT_SYMBOL_GPL(mce_log); -/* - * We run the default notifier as long as we have no external - * notifiers registered on the chain. - */ -static atomic_t num_notifiers; - -static void mce_register_decode_chain_internal(struct notifier_block *nb) -{ - if (WARN_ON(nb->priority > MCE_PRIO_MCELOG && nb->priority < MCE_PRIO_EDAC)) - return; - - blocking_notifier_chain_register(&x86_mce_decoder_chain, nb); -} - -void mce_register_decode_chain(struct notifier_block *nb) -{ - atomic_inc(&num_notifiers); - - mce_register_decode_chain_internal(nb); -} -EXPORT_SYMBOL_GPL(mce_register_decode_chain); - -void mce_unregister_decode_chain(struct notifier_block *nb) -{ - atomic_dec(&num_notifiers); - - blocking_notifier_chain_unregister(&x86_mce_decoder_chain, nb); -} -EXPORT_SYMBOL_GPL(mce_unregister_decode_chain); - static inline u32 ctl_reg(int bank) { return MSR_IA32_MCx_CTL(bank); @@ -606,18 +576,19 @@ static struct notifier_block mce_uc_nb = { .priority = MCE_PRIO_UC, }; +/* + * We run the default notifier as long as we have no external + * notifiers registered on the chain. 
+ */ +static atomic_t num_notifiers; + static int mce_default_notifier(struct notifier_block *nb, unsigned long val, void *data) { struct mce *m = (struct mce *)data; - if (!m) - return NOTIFY_DONE; - - if (atomic_read(&num_notifiers)) - return NOTIFY_DONE; - - __print_mce(m); + if (m) + __print_mce(m); return NOTIFY_DONE; } @@ -628,6 +599,49 @@ static struct notifier_block mce_default_nb = { .priority = MCE_PRIO_LOWEST, }; +static void update_default_notifier_registration(void) +{ + bool has_notifiers = !!atomic_read(&num_notifiers); + +retry: + if (has_notifiers) + blocking_notifier_chain_unregister(&x86_mce_decoder_chain, + &mce_default_nb); + else + blocking_notifier_chain_cond_register(&x86_mce_decoder_chain, + &mce_default_nb); + + if (has_notifiers != !!atomic_read(&num_notifiers)) { + has_notifiers = !has_notifiers; + goto retry; + } +} + +static void mce_register_decode_chain_internal(struct notifier_block *nb) +{ + if (WARN_ON(nb->priority > MCE_PRIO_MCELOG && + nb->priority < MCE_PRIO_EDAC)) + return; + + blocking_notifier_chain_register(&x86_mce_decoder_chain, nb); +} + +void mce_register_decode_chain(struct notifier_block *nb) +{ + atomic_inc(&num_notifiers); + mce_register_decode_chain_internal(nb); + update_default_notifier_registration(); +} +EXPORT_SYMBOL_GPL(mce_register_decode_chain); + +void mce_unregister_decode_chain(struct notifier_block *nb) +{ + atomic_dec(&num_notifiers); + update_default_notifier_registration(); + blocking_notifier_chain_unregister(&x86_mce_decoder_chain, nb); +} +EXPORT_SYMBOL_GPL(mce_unregister_decode_chain); + /* * Read ADDR and MISC registers. */ @@ -1972,7 +1986,7 @@ int __init mcheck_init(void) mce_register_decode_chain_internal(&first_nb); if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) mce_register_decode_chain_internal(&mce_uc_nb); - mce_register_decode_chain_internal(&mce_default_nb); + update_default_notifier_registration(); mcheck_vendor_init_severity(); INIT_WORK(&mce_work, mce_gen_pool_process);
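As a rough illustration of the re-check loop in update_default_notifier_registration(), the following userspace model uses C11 atomics. It is a sketch under assumptions: num_external and default_registered are hypothetical stand-ins for the counter and for the default notifier being on the chain, and the blocking notifier chain with its own locking is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int num_external;		/* models num_notifiers */
static atomic_bool default_registered;	/* models "mce_default_nb is on the chain" */

/*
 * Apply the registration state that matches the counter, then re-read the
 * counter. If it changed sides (zero vs. non-zero) in the meantime, flip the
 * decision and apply it again until both agree.
 */
static void update_default_registration(void)
{
	bool has_external = atomic_load(&num_external) != 0;

	for (;;) {
		bool now;

		atomic_store(&default_registered, !has_external);

		now = atomic_load(&num_external) != 0;
		if (now == has_external)
			break;

		has_external = now;
	}
}

static void register_external(void)
{
	atomic_fetch_add(&num_external, 1);
	update_default_registration();
}

static void unregister_external(void)
{
	atomic_fetch_sub(&num_external, 1);
	update_default_registration();
}
```

A single-threaded run never takes the retry path; the loop only matters when a concurrent register or unregister changes the counter between the update and the re-check.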