From patchwork Sun Dec 29 17:23:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 11312319 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C5421138C for ; Sun, 29 Dec 2019 18:08:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9EE3120722 for ; Sun, 29 Dec 2019 18:08:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642884; bh=8Y3bpAeQFk9okhggOyjo4H4BlZ0WWo2yo6wIlH7nPkY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=MGTEbY+ktav/Jzlj26GSULAmcttAaiIqA2IVDm7/91o65Xo7MPoa/+TjWqLdChxHy pRrx9Xlc76z4T9clW3cfMvfqBFqJFuV8gskVxgllrbr2lJZw7jxB3fLzlCMAIlATi3 ni/3rl9sW/4RUjmTUINY5TaGFfzSERuOnht7vgvU= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731297AbfL2Rrh (ORCPT ); Sun, 29 Dec 2019 12:47:37 -0500 Received: from mail.kernel.org ([198.145.29.99]:58330 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731287AbfL2Rrg (ORCPT ); Sun, 29 Dec 2019 12:47:36 -0500 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 3E111208C4; Sun, 29 Dec 2019 17:47:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577641655; bh=8Y3bpAeQFk9okhggOyjo4H4BlZ0WWo2yo6wIlH7nPkY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YvchG8yZHvX3iRc5YSK7w4EqOfWTbXhsEjwdJiX2ZCYLadJId5vltHj20HjE+RrPG 0MWFC6C4V2zFdIueFAHMJ1Ne8TuKrVJjN57r2sSylnlDbPGg8BC4ufnYhQ2syHDVl+ L0ZIinRGZanin+N9o5k/cgiJXUFHI3QlUN3CQ9Cc= From: Greg Kroah-Hartman To: 
linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Benjamin Berg, Borislav Petkov, Hans de Goede, Christian Kellner, "H. Peter Anvin", Ingo Molnar, linux-edac, Peter Zijlstra, Srinivas Pandruvada, Thomas Gleixner, Tony Luck, x86-ml, Sasha Levin
Subject: [PATCH 5.4 157/434] x86/mce: Lower throttling MCE messages priority to warning
Date: Sun, 29 Dec 2019 18:23:30 +0100
Message-Id: <20191229172712.191934842@linuxfoundation.org>
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20191229172702.393141737@linuxfoundation.org>
References: <20191229172702.393141737@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Sender: linux-edac-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-edac@vger.kernel.org

From: Benjamin Berg

[ Upstream commit 9c3bafaa1fd88e4dd2dba3735a1f1abb0f2c7bb7 ]

On modern CPUs it is quite normal that the temperature limits are
reached and the CPU is throttled. In fact, often the thermal design is
not sufficient to cool the CPU at full load and limits can quickly be
reached when a burst in load happens. This will even happen with
technologies like RAPL limiting the long term power consumption of
the package.

Also, these limits are "softer", as Srinivas explains:

  "CPU temperature doesn't have to hit max(TjMax) to get these warnings.
  OEMs ha[ve] an ability to program a threshold where a thermal interrupt
  can be generated. In some systems the offset is 20C+ (Read only value).

  In recent systems, there is another offset on top of it which can be
  programmed by OS, once some agent can adjust power limits dynamically.
  By default this is set to low by the firmware, which I guess the
  prime motivation of Benjamin to submit the patch."

So these messages do not usually indicate a hardware issue (e.g.
insufficient cooling). Log them as warnings to avoid confusion about
their severity.

 [ bp: Massage commit message. ]

Signed-off-by: Benjamin Berg
Signed-off-by: Borislav Petkov
Reviewed-by: Hans de Goede
Tested-by: Christian Kellner
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: linux-edac
Cc: Peter Zijlstra
Cc: Srinivas Pandruvada
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: x86-ml
Link: https://lkml.kernel.org/r/20191009155424.249277-1-bberg@redhat.com
Signed-off-by: Sasha Levin
---
 arch/x86/kernel/cpu/mce/therm_throt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mce/therm_throt.c b/arch/x86/kernel/cpu/mce/therm_throt.c
index 6e2becf547c5..bc441d68d060 100644
--- a/arch/x86/kernel/cpu/mce/therm_throt.c
+++ b/arch/x86/kernel/cpu/mce/therm_throt.c
@@ -188,7 +188,7 @@ static void therm_throt_process(bool new_event, int event, int level)
 	/* if we just entered the thermal event */
 	if (new_event) {
 		if (event == THERMAL_THROTTLING_EVENT)
-			pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
+			pr_warn("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
 				this_cpu,
 				level == CORE_LEVEL ? 
"Core" : "Package", state->count); From patchwork Sun Dec 29 17:24:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 11312295 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9A8341395 for ; Sun, 29 Dec 2019 17:50:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6F52F207FD for ; Sun, 29 Dec 2019 17:50:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577641859; bh=pDa/f9AcJMh7khYqN1ssSqjeFxVXOoWorA1MKXmZNW8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=awdFRbQ9XzVZt0myOf97+1sKpKbb7nDaAJRDuWS4msNOlGypSpenS9KCMmstq/eDs ZxGfOvXmrcKO/U/nwszvsVe/xOm3F/NQbayYacflx2xSfeiGfbxSOi1fEsjf/WK9iu Y0PUVhd3YhBVlyyRFVLONmb5zrcCo1zk3viQsl7k= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731995AbfL2Ru6 (ORCPT ); Sun, 29 Dec 2019 12:50:58 -0500 Received: from mail.kernel.org ([198.145.29.99]:35786 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731972AbfL2Ru6 (ORCPT ); Sun, 29 Dec 2019 12:50:58 -0500 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D333C207FD; Sun, 29 Dec 2019 17:50:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577641857; bh=pDa/f9AcJMh7khYqN1ssSqjeFxVXOoWorA1MKXmZNW8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LWsomwJ5Ew1zjEatVArLVzPUAKra/XyxPWlocSV1gRTUGOSDdaBAE0dkY2G9aqjtF ZJxXcmob/fpo+jdrRhP3ukpPc3PLC4lIWWZ9tZkJLmZTva6Uj7nhY7LGFVAyGeP4xe uK4366L58/3dOZvjgtt5YR1Ixtrtqc3uQrO3jWRA= From: Greg 
Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Yazen Ghannam , Borislav Petkov , "linux-edac@vger.kernel.org" , James Morse , Mauro Carvalho Chehab , Robert Richter , Tony Luck , Sasha Levin Subject: [PATCH 5.4 187/434] EDAC/amd64: Set grain per DIMM Date: Sun, 29 Dec 2019 18:24:00 +0100 Message-Id: <20191229172714.243284613@linuxfoundation.org> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20191229172702.393141737@linuxfoundation.org> References: <20191229172702.393141737@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org From: Yazen Ghannam [ Upstream commit 466503d6b1b33be46ab87c6090f0ade6c6011cbc ] The following commit introduced a warning on error reports without a non-zero grain value. 3724ace582d9 ("EDAC/mc: Fix grain_bits calculation") The amd64_edac_mod module does not provide a value, so the warning will be given on the first reported memory error. Set the grain per DIMM to cacheline size (64 bytes). This is the current recommendation. 
Fixes: 3724ace582d9 ("EDAC/mc: Fix grain_bits calculation") Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "linux-edac@vger.kernel.org" Cc: James Morse Cc: Mauro Carvalho Chehab Cc: Robert Richter Cc: Tony Luck Link: https://lkml.kernel.org/r/20191022203448.13962-7-Yazen.Ghannam@amd.com Signed-off-by: Sasha Levin --- drivers/edac/amd64_edac.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index c1d4536ae466..cc5e56d752c8 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -2936,6 +2936,7 @@ static int init_csrows_df(struct mem_ctl_info *mci) dimm->mtype = pvt->dram_type; dimm->edac_mode = edac_mode; dimm->dtype = dev_type; + dimm->grain = 64; } } @@ -3012,6 +3013,7 @@ static int init_csrows(struct mem_ctl_info *mci) dimm = csrow->channels[j]->dimm; dimm->mtype = pvt->dram_type; dimm->edac_mode = edac_mode; + dimm->grain = 64; } } From patchwork Sun Dec 29 17:25:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 11312315 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BC363138C for ; Sun, 29 Dec 2019 18:02:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9A545207FF for ; Sun, 29 Dec 2019 18:02:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642575; bh=UZCOrNCSYzLLHQQCAcwBP86yz0fqplT9ITOoRdzBppw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=IXBfO02UCuDzfFuPXn7BfN0ECZym57xWeNCsmDt120oFvAPJSG5EqbVZYNDL+Sj8M IYn7NoLc0FLeFnZVYMEqNICQ8HJnv6u3w+3sA7g/zzHaD2FG2tCYatyQ4IRh5MikOm P88fW7dJnHD+YTyaV4B5qdjCZK39HCmvyqix7Wco= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387725AbfL2SCz 
(ORCPT ); Sun, 29 Dec 2019 13:02:55 -0500 Received: from mail.kernel.org ([198.145.29.99]:42288 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732674AbfL2RyZ (ORCPT ); Sun, 29 Dec 2019 12:54:25 -0500 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 98E4821744; Sun, 29 Dec 2019 17:54:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642065; bh=UZCOrNCSYzLLHQQCAcwBP86yz0fqplT9ITOoRdzBppw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ySzmk82yG41ouG4TD8ri0a+QryqBBZINeo9cWZBuF1zj09jAEFwqFUoKCXMbwElKe A7c5M1EX50Bi+8yM01ho9eeNVYLhj7sTTc/ku/DS/PSb0KsVjhtkLmZ+sR+/pbYTxM Gz95lwDQ+niMT1dTFQ1CQc95U4VYEBxi79MpMg4A= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, James Morse , Robert Richter , Borislav Petkov , Mauro Carvalho Chehab , "linux-edac@vger.kernel.org" , Tony Luck , Sasha Levin Subject: [PATCH 5.4 284/434] EDAC/ghes: Fix grain calculation Date: Sun, 29 Dec 2019 18:25:37 +0100 Message-Id: <20191229172720.814013410@linuxfoundation.org> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20191229172702.393141737@linuxfoundation.org> References: <20191229172702.393141737@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org From: Robert Richter [ Upstream commit 7088e29e0423d3195e09079b4f849ec4837e5a75 ] The current code to convert a physical address mask to a grain (defined as granularity in bytes) is: e->grain = ~(mem_err->physical_addr_mask & ~PAGE_MASK); This is broken in several ways: 1) It calculates to wrong grain values. E.g., a physical address mask of ~0xfff should give a grain of 0x1000. 
Without considering PAGE_MASK, there is an off-by-one (0xfff instead of
0x1000). Things are worse when also filtering it with ~PAGE_MASK: the
result is a grain with the upper bits set. In the example it even
evaluates to ~0.

2) The grain does not depend on the kernel's page size. The page size
only matters when unmapping memory in memory_failure(). Smaller grains
are wrongly rounded up to the page size; on architectures with a
configurable page size (e.g. arm64) this could even round up to the
bigger page size of the hypervisor.

Fix this with:

  e->grain = ~mem_err->physical_addr_mask + 1;

The grain_bits are defined as:

  grain = 1 << grain_bits;

Change the grain_bits calculation accordingly. It is now the same
formula as in edac_mc.c, so the code can be unified.

The value in ->physical_addr_mask coming from firmware is assumed to be
contiguous, but this is not sanity-checked. However, if the mask is
non-contiguous, the conversion to grain_bits effectively rounds the
grain up to a power of 2.
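
The arithmetic above can be checked in user space. The following is only a
minimal sketch, not kernel code: demo_fls64() is a stand-in for the kernel's
fls_long(), DEMO_PAGE_MASK assumes 4 KiB pages, and all demo_* names are
hypothetical. A zero grain is silently treated as 1 here, where the patch
additionally warns once.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages, so the mask clears the low 12 bits. */
#define DEMO_PAGE_MASK (~(uint64_t)0xfff)

/*
 * User-space stand-in for fls_long(): 1-based index of the most
 * significant set bit, or 0 if no bit is set.
 */
static inline unsigned int demo_fls64(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Old conversion: yields a value with the upper bits set. */
static inline uint64_t demo_grain_old(uint64_t mask)
{
	return ~(mask & ~DEMO_PAGE_MASK);
}

/* New conversion: negate the (assumed contiguous) mask. */
static inline uint64_t demo_grain_new(uint64_t mask)
{
	return ~mask + 1;
}

/* Smallest grain_bits with (1 << grain_bits) >= grain; zero is treated as 1. */
static inline unsigned int demo_grain_bits(uint64_t grain)
{
	if (!grain)
		grain = 1;
	return demo_fls64(grain - 1);
}

/* Worked example from the text: a mask of ~0xfff must give a 4 KiB grain. */
static inline void demo_grain_check(void)
{
	uint64_t mask = ~(uint64_t)0xfff;

	assert(demo_grain_old(mask) == UINT64_MAX);	/* ~0, as described */
	assert(demo_grain_new(mask) == 0x1000);
	assert(demo_grain_bits(0x1000) == 12);		/* 1 << 12 == 0x1000 */
	/* Non-contiguous mask: a grain of 0xb00 is rounded up to 1 << 12. */
	assert(demo_grain_bits(demo_grain_new(~(uint64_t)0xaff)) == 12);
}
```

Calling demo_grain_check() from a small main() exercises both formulas on the
example mask. Subtracting 1 before taking the highest set bit is what makes an
exact power of two map to its own exponent instead of the next one.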
Suggested-by: James Morse Signed-off-by: Robert Richter Signed-off-by: Borislav Petkov Reviewed-by: Mauro Carvalho Chehab Cc: "linux-edac@vger.kernel.org" Cc: Tony Luck Link: https://lkml.kernel.org/r/20191106093239.25517-11-rrichter@marvell.com Signed-off-by: Sasha Levin --- drivers/edac/ghes_edac.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c index 296e714bf553..523dd56a798c 100644 --- a/drivers/edac/ghes_edac.c +++ b/drivers/edac/ghes_edac.c @@ -231,6 +231,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err) /* Cleans the error report buffer */ memset(e, 0, sizeof (*e)); e->error_count = 1; + e->grain = 1; strcpy(e->label, "unknown label"); e->msg = pvt->msg; e->other_detail = pvt->other_detail; @@ -326,7 +327,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err) /* Error grain */ if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK) - e->grain = ~(mem_err->physical_addr_mask & ~PAGE_MASK); + e->grain = ~mem_err->physical_addr_mask + 1; /* Memory error location, mapped on e->location */ p = e->location; @@ -442,8 +443,13 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err) if (p > pvt->other_detail) *(p - 1) = '\0'; + /* Sanity-check driver-supplied grain value. 
*/ + if (WARN_ON_ONCE(!e->grain)) + e->grain = 1; + + grain_bits = fls_long(e->grain - 1); + /* Generate the trace event */ - grain_bits = fls_long(e->grain); snprintf(pvt->detail_location, sizeof(pvt->detail_location), "APEI location: %s %s", e->location, e->other_detail); trace_mc_event(type, e->msg, e->label, e->error_count, From patchwork Sun Dec 29 17:27:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 11312311 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2BB2E14B7 for ; Sun, 29 Dec 2019 17:59:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 09C52222C4 for ; Sun, 29 Dec 2019 17:59:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642366; bh=jjSDoosDC2f5Hfu0htQLEoNtBfuF1F+entclPo6rXTY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=wPKvm4J7yeyjBKVA1Z4XsqE4Uc+vXpfC1eOzK1OBXG8PtgDvuWmNExq+J/I3Ri2af SwD4cK6TlQO2u7jxb04BGb0JVFGNJdFRCFbJeH/dmmqYQGF9OVmOjfLhXKPoTvAOZ5 mAWvgmhNZF5gYdCpts6pGqUJNxZsvm1/kPpHHQG0= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387616AbfL2R6t (ORCPT ); Sun, 29 Dec 2019 12:58:49 -0500 Received: from mail.kernel.org ([198.145.29.99]:50078 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387607AbfL2R6t (ORCPT ); Sun, 29 Dec 2019 12:58:49 -0500 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B0830206A4; Sun, 29 Dec 2019 17:58:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642328; 
bh=jjSDoosDC2f5Hfu0htQLEoNtBfuF1F+entclPo6rXTY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nq1e6kNOEd5IQw5yVhIOezwGAFrHWFTR/rKe8DxTFW2iZ/rfmp4mMFj9GFf+RYxGJ 6Ag7Or4SbvkzbeDHWpTcftxiCbS6VUfqWO/KtJWV2Lj4zEZVSpJnXiF4tgFrPlt41t wqqhZmdh74F9ZePBOgGevCaxNCGAUyDEJBcEfH7U= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Konstantin Khlebnikov , Borislav Petkov , Yazen Ghannam , "H. Peter Anvin" , Ingo Molnar , linux-edac , Thomas Gleixner , Tony Luck , x86-ml Subject: [PATCH 5.4 421/434] x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure() Date: Sun, 29 Dec 2019 18:27:54 +0100 Message-Id: <20191229172731.058586083@linuxfoundation.org> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20191229172702.393141737@linuxfoundation.org> References: <20191229172702.393141737@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org From: Konstantin Khlebnikov commit 246ff09f89e54fdf740a8d496176c86743db3ec7 upstream. ... because interrupts are disabled that early and sending IPIs can deadlock: BUG: sleeping function called from invalid context at kernel/sched/completion.c:99 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1 no locks held by swapper/1/0. irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [] copy_process+0x8b9/0x1ca0 softirqs last enabled at (0): [] copy_process+0x8b9/0x1ca0 softirqs last disabled at (0): [<0000000000000000>] 0x0 Preemption disabled at: [] start_secondary+0x3b/0x190 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.0-rc2+ #1 Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018 Call Trace: dump_stack ___might_sleep.cold.92 wait_for_completion ? generic_exec_single rdmsr_safe_on_cpu ? 
wrmsr_on_cpus mce_amd_feature_init mcheck_cpu_init identify_cpu identify_secondary_cpu smp_store_cpu_info start_secondary secondary_startup_64 The function smca_configure() is called only on the current CPU anyway, therefore replace rdmsr_safe_on_cpu() with atomic rdmsr_safe() and avoid the IPI. [ bp: Update commit message. ] Signed-off-by: Konstantin Khlebnikov Signed-off-by: Borislav Petkov Reviewed-by: Yazen Ghannam Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: linux-edac Cc: Cc: Thomas Gleixner Cc: Tony Luck Cc: x86-ml Link: https://lkml.kernel.org/r/157252708836.3876.4604398213417262402.stgit@buzz Signed-off-by: Greg Kroah-Hartman --- arch/x86/kernel/cpu/mce/amd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -269,7 +269,7 @@ static void smca_configure(unsigned int if (smca_banks[bank].hwid) return; - if (rdmsr_safe_on_cpu(cpu, MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) { + if (rdmsr_safe(MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) { pr_warn("Failed to read MCA_IPID for bank %d\n", bank); return; } From patchwork Sun Dec 29 17:27:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 11312309 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5AD4714B7 for ; Sun, 29 Dec 2019 17:59:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3A49724655 for ; Sun, 29 Dec 2019 17:59:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642364; bh=SXZtoj1N8So8wM9gNNkosrmKnhPrW8PSoe/zswMBiOs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=RJfBEq8WDeQRKPmThVh3tnlxBVt/la76nsPDUgd6LDeWoUPKFAuOTVfLUfjfEMzVK 
TBOnbxBMBZAAWgxEWC4Q91pmDjHHykp7U+w9lwPMmEsjO7UG0o8RB129yKPBGDYVg9 D/zOKvSjamz5e6g7Kyz/NRo5glkLKk6+ablT7KkU= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387626AbfL2R6v (ORCPT ); Sun, 29 Dec 2019 12:58:51 -0500 Received: from mail.kernel.org ([198.145.29.99]:50152 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387621AbfL2R6v (ORCPT ); Sun, 29 Dec 2019 12:58:51 -0500 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 24EB32253D; Sun, 29 Dec 2019 17:58:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642330; bh=SXZtoj1N8So8wM9gNNkosrmKnhPrW8PSoe/zswMBiOs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fUS2LcbauC8lHEfOOkRKXTV0zBih+U3dhbX3vDklueQBixsooLpfyC9ppwmpV8TIf VeMpx/E+Tl76ENu8X7wD+4E+boJ/NUMJkzA8GxWTGHntWbzG2cE9wSd0PLv0XiPIk4 6CmSTFtE/o+XWgHpWk4n8ruMjjLm0AYVxr+hs0QY= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Yazen Ghannam , Borislav Petkov , "H. Peter Anvin" , Ingo Molnar , linux-edac , Thomas Gleixner , Tony Luck , x86-ml Subject: [PATCH 5.4 422/434] x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[] Date: Sun, 29 Dec 2019 18:27:55 +0100 Message-Id: <20191229172731.152674776@linuxfoundation.org> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20191229172702.393141737@linuxfoundation.org> References: <20191229172702.393141737@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org From: Yazen Ghannam commit 966af20929ac24360ba3fac5533eb2ab003747da upstream. Each logical CPU in Scalable MCA systems controls a unique set of MCA banks in the system. 
These banks are not shared between CPUs. The bank types and ordering
will be the same across CPUs on currently available systems.

However, some CPUs may see a bank as Reserved/Read-as-Zero (RAZ) while
other CPUs do not. In this case, the bank seen as Reserved on one CPU is
assumed to be the same type as the bank seen as a known type on another
CPU.

In general, this occurs when the hardware represented by the MCA bank
is disabled, e.g. disabled memory controllers on certain models, etc.
The MCA bank is disabled in the hardware, so there is no possibility of
getting an MCA/MCE from it even if it is assumed to have a known type.

For example:

Full system:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          UMC
	 2    |         CS          |          CS

System with hardware disabled:
	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         UMC         |          RAZ
	 2    |         CS          |          CS

For this reason, there is a single, global struct smca_banks[] that is
initialized at boot time. This array is initialized on each CPU as it
comes online. However, the array will not be updated if an entry already
exists.

This works as expected when the first CPU (usually CPU0) has all
possible MCA banks enabled. But if the first CPU has a subset, then it
will save a "Reserved" type in smca_banks[]. Successive CPUs will then
not be able to update smca_banks[] even if they encounter a known bank
type.

This may result in unexpected behavior. Depending on the system
configuration, a user may observe issues enumerating the MCA
thresholding sysfs interface. The issues may be as trivial as sysfs
entries not being available, or as severe as system hangs.

For example:

	Bank  |  Type seen on CPU0  |  Type seen on CPU1
	------------------------------------------------
	 0    |         LS          |          LS
	 1    |         RAZ         |          UMC
	 2    |         CS          |          CS

Extend the smca_banks[] entry check to return if the entry is a non-reserved type. 
Otherwise, continue so that CPUs that encounter a known bank type can update smca_banks[]. Fixes: 68627a697c19 ("x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type") Signed-off-by: Yazen Ghannam Signed-off-by: Borislav Petkov Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: linux-edac Cc: Cc: Thomas Gleixner Cc: Tony Luck Cc: x86-ml Link: https://lkml.kernel.org/r/20191121141508.141273-1-Yazen.Ghannam@amd.com Signed-off-by: Greg Kroah-Hartman --- arch/x86/kernel/cpu/mce/amd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -266,7 +266,7 @@ static void smca_configure(unsigned int smca_set_misc_banks_map(bank, cpu); /* Return early if this bank was already initialized. */ - if (smca_banks[bank].hwid) + if (smca_banks[bank].hwid && smca_banks[bank].hwid->hwid_mcatype != 0) return; if (rdmsr_safe(MSR_AMD64_SMCA_MCx_IPID(bank), &low, &high)) { From patchwork Sun Dec 29 17:27:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Greg Kroah-Hartman X-Patchwork-Id: 11312307 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 342511395 for ; Sun, 29 Dec 2019 17:58:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1077B227BF for ; Sun, 29 Dec 2019 17:58:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642335; bh=+hfMZu4utFr+JkBt/yrr4yw7uYIzzkrDyjpekVgkpE8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=gWlO4q5s5E4XPfl5+oPHAVU9yeU3Jo+lWLuPSYPzSSFft9rNHy0bGqxdMzfWEV/gK mowtT1N6SVpDL50NDiHgukJDnpuqrnvye5lmfaze1peYxsk9SvIsrUFId2lg6y1c59 sOpbK5ipAwl0RWRC0BlPsci03PB2JL0ij9lbDa5Q= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387634AbfL2R6y (ORCPT ); 
Sun, 29 Dec 2019 12:58:54 -0500 Received: from mail.kernel.org ([198.145.29.99]:50236 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733207AbfL2R6x (ORCPT ); Sun, 29 Dec 2019 12:58:53 -0500 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 65F3C24650; Sun, 29 Dec 2019 17:58:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577642332; bh=+hfMZu4utFr+JkBt/yrr4yw7uYIzzkrDyjpekVgkpE8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Q8MHASBiK2BZZ6ZFQH0hl7Ur0xFYQMoG/U0GbG0As7jIXOjpENML/dWdQYq8VxJwi ayDbHoNOcOco1ONufFNHHbJxJflmY15W+LdzAUIVFa+ZxbAxFl7ZzymBFybusn5MmB oqP79XZ1+Yr4v3TOjSHzxi+xd++q3kl2offkfDzc= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, " =?utf-8?q?Jan_H_=2E__Sch=C3=B6nherr?= " , Borislav Petkov , Tony Luck , "H. Peter Anvin" , Ingo Molnar , linux-edac , Thomas Gleixner , x86-ml , Yazen Ghannam Subject: [PATCH 5.4 423/434] x86/mce: Fix possibly incorrect severity calculation on AMD Date: Sun, 29 Dec 2019 18:27:56 +0100 Message-Id: <20191229172731.249449127@linuxfoundation.org> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20191229172702.393141737@linuxfoundation.org> References: <20191229172702.393141737@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: linux-edac-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-edac@vger.kernel.org From: Jan H. Schönherr commit a3a57ddad061acc90bef39635caf2b2330ce8f21 upstream. The function mce_severity_amd_smca() requires m->bank to be initialized for correct operation. Fix the one case, where mce_severity() is called without doing so. 
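
To illustrate why the ordering matters, here is a minimal user-space sketch.
It is not the kernel code: struct demo_mce, demo_severity() and the rule that
only bank 1 is graded as fatal are hypothetical simplifications, chosen so
that a grading function which reads the bank number gives a different answer
depending on whether the bank was recorded before or after the call.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_BANKS		3
#define DEMO_KEEP_SEVERITY	0
#define DEMO_PANIC_SEVERITY	1

/* Hypothetical, simplified record; not the kernel's struct mce. */
struct demo_mce {
	uint64_t status;
	unsigned int bank;
};

/* Hypothetical bank-dependent grading: only errors in bank 1 are fatal. */
static inline int demo_severity(const struct demo_mce *m)
{
	return (m->status && m->bank == 1) ? DEMO_PANIC_SEVERITY
					   : DEMO_KEEP_SEVERITY;
}

/* Old ordering: the bank is recorded only after grading has already run. */
static inline int demo_no_way_out_old(struct demo_mce *m, const uint64_t *status)
{
	unsigned int i;

	for (i = 0; i < DEMO_NUM_BANKS; i++) {
		m->status = status[i];
		if (demo_severity(m) >= DEMO_PANIC_SEVERITY) {
			m->bank = i;
			return 1;
		}
	}
	return 0;
}

/* Fixed ordering: the bank is recorded before grading reads it. */
static inline int demo_no_way_out_new(struct demo_mce *m, const uint64_t *status)
{
	unsigned int i;

	for (i = 0; i < DEMO_NUM_BANKS; i++) {
		m->status = status[i];
		m->bank = i;
		if (demo_severity(m) >= DEMO_PANIC_SEVERITY)
			return 1;
	}
	return 0;
}

/* One error pending in bank 1; the record starts with a stale bank of 0. */
static inline void demo_bank_check(void)
{
	const uint64_t status[DEMO_NUM_BANKS] = { 0, 0x1, 0 };
	struct demo_mce m = { .status = 0, .bank = 0 };

	assert(demo_no_way_out_old(&m, status) == 0);	/* fatal error missed */
	m.bank = 0;
	assert(demo_no_way_out_new(&m, status) == 1);	/* fatal error found */
	assert(m.bank == 1);
}
```

In the old ordering the grading step only ever sees the stale bank number, so
in this sketch the fatal error in bank 1 goes unnoticed.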
Fixes: 6bda529ec42e ("x86/mce: Grade uncorrected errors for SMCA-enabled systems")
Fixes: d28af26faa0b ("x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()")
Signed-off-by: Jan H. Schönherr
Signed-off-by: Borislav Petkov
Reviewed-by: Tony Luck
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: linux-edac
Cc:
Cc: Thomas Gleixner
Cc: x86-ml
Cc: Yazen Ghannam
Link: https://lkml.kernel.org/r/20191210000733.17979-4-jschoenh@amazon.de
Signed-off-by: Greg Kroah-Hartman
---
 arch/x86/kernel/cpu/mce/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -814,8 +814,8 @@ static int mce_no_way_out(struct mce *m,
 		if (quirk_no_way_out)
 			quirk_no_way_out(i, m, regs);
 
+		m->bank = i;
 		if (mce_severity(m, mca_cfg.tolerant, &tmp, true) >= MCE_PANIC_SEVERITY) {
-			m->bank = i;
 			mce_read_aux(m, i);
 			*msg = tmp;
 			return 1;