[2/6] x86/mce: Make mce=nobootlog work again

Message ID 20191210000733.17979-3-jschoenh@amazon.de (mailing list archive)
State New, archived
Series x86/mce: Various fixes and cleanups for MCE handling

Commit Message

Jan H. Schönherr Dec. 10, 2019, 12:07 a.m. UTC
Since Linux 4.5 commit 8b38937b7ab5 ("x86/mce: Do not enter deferred
errors into the generic pool twice") the mce=nobootlog option has become
mostly ineffective (after being only slightly ineffective before), as
the code is taking actions on MCEs left over from boot when they have a
usable address.

Move the check for MCP_DONTLOG a bit outward to make it effective again.

Also, since Linux 4.12 commit 011d82611172 ("RAS: Add a Corrected Errors
Collector") the two branches of the remaining "if" at the bottom of
machine_check_poll() do the same. Unify them.

Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
---
 arch/x86/kernel/cpu/mce/core.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)
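
The behavioural change described above is easier to see as a truth table. What follows is a minimal user-space sketch, not kernel code. The names old_tail(), new_tail() and enum action are hypothetical and model only the decision taken at the bottom of machine_check_poll() before and after this patch. The MCP_DONTLOG bit value is illustrative, and the two booleans stand in for mca_cfg.dont_log_ce and mce_usable_address(&m).

```c
#include <stdbool.h>

/* Illustrative bit value; only "set" versus "clear" matters here. */
#define MCP_DONTLOG (1u << 2)

/* Hypothetical labels for what the tail of the polling loop does. */
enum action {
	ACT_NONE,	/* bank is only cleared */
	ACT_LOG,	/* mce_log(&m) */
	ACT_POOL	/* mce_gen_pool_add(&m) + mce_schedule_work() */
};

/* Decision logic before the patch. */
enum action old_tail(unsigned int flags, bool dont_log_ce, bool usable_addr)
{
	if (!(flags & MCP_DONTLOG) && !dont_log_ce)
		return ACT_LOG;
	else if (usable_addr)
		return ACT_POOL;	/* reached even when MCP_DONTLOG is set */
	return ACT_NONE;
}

/* Decision logic after the patch: MCP_DONTLOG gates everything. */
enum action new_tail(unsigned int flags, bool dont_log_ce, bool usable_addr)
{
	if (!(flags & MCP_DONTLOG)) {
		if (!dont_log_ce || usable_addr)
			return ACT_LOG;
	}
	return ACT_NONE;
}
```

With MCP_DONTLOG set and a usable address, old_tail() still returns ACT_POOL, whereas new_tail() returns ACT_NONE. That is the leftover-from-boot case the commit message refers to. Without MCP_DONTLOG the two versions differ only in ACT_POOL versus ACT_LOG, which the commit message states have done the same thing since commit 011d82611172.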

Comments

Borislav Petkov Dec. 16, 2019, 5:15 p.m. UTC | #1
On Tue, Dec 10, 2019 at 01:07:29AM +0100, Jan H. Schönherr wrote:
> Since Linux 4.5 commit 8b38937b7ab5 ("x86/mce: Do not enter deferred

You don't have to go figure out the kernel version each time you quote
a commit - most people should be able to do git describe or git tag
--contains :)

> errors into the generic pool twice") the mce=nobootlog option has become
> mostly ineffective (after being only slightly ineffective before), as
> the code is taking actions on MCEs left over from boot when they have a
> usable address.
> 
> Move the check for MCP_DONTLOG a bit outward to make it effective again.
> 
> Also, since Linux 4.12 commit 011d82611172 ("RAS: Add a Corrected Errors
> Collector") the two branches of the remaining "if" at the bottom of
> machine_check_poll() do the same. Unify them.
> 
> Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
> ---
>  arch/x86/kernel/cpu/mce/core.c | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index d5a8b99f7ba3..81ab25d5357a 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -760,24 +760,17 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
>  log_it:
>  		error_seen = true;
>  
> -		mce_read_aux(&m, i);
> -
> -		m.severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
> -
> -		/*
> -		 * Don't get the IP here because it's unlikely to
> -		 * have anything to do with the actual error location.
> -		 */
> -		if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
> -			mce_log(&m);
> -		else if (mce_usable_address(&m)) {
> +		if (!(flags & MCP_DONTLOG)) {

I hate that double-negation logic we have in the code. :-\

	if (! ... DONT...

Can you pls flip the logic here?

	if (flags & MCP_DONTLOG)
		goto clear_bank;

	/* logging code */

clear_bank:
	mce_wrmsrl(msr_ops.status(i), 0);

This way you'll save an indentation level too. Something like this (I
took your patch and mangled it):

---
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5f42f25bac8f..2b43caaba70d 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -763,29 +763,20 @@ bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 log_it:
 		error_seen = true;
 
-		mce_read_aux(&m, i);
+		if (flags & MCP_DONTLOG)
+			goto clear_bank;
 
+		mce_read_aux(&m, i);
 		m.severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
 
 		/*
-		 * Don't get the IP here because it's unlikely to
-		 * have anything to do with the actual error location.
+		 * Don't get the IP here because it's unlikely to have anything
+		 * to do with the actual error location.
 		 */
-		if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
+		if (!mca_cfg.dont_log_ce || mce_usable_address(&m))
 			mce_log(&m);
-		else if (mce_usable_address(&m)) {
-			/*
-			 * Although we skipped logging this, we still want
-			 * to take action. Add to the pool so the registered
-			 * notifiers will see it.
-			 */
-			if (!mce_gen_pool_add(&m))
-				mce_schedule_work();
-		}
 
-		/*
-		 * Clear state for this bank.
-		 */
+clear_bank:
 		mce_wrmsrl(msr_ops.status(i), 0);
 	}
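
The flipped check suggested in this reply should behave the same as the nested "if" from the original patch. The following is a minimal user-space sketch of that equivalence, not kernel code. struct trace, nested_form(), goto_form() and same_trace() are hypothetical names that only count which steps of one loop iteration would run. The MCP_DONTLOG bit value is illustrative, and the booleans stand in for mca_cfg.dont_log_ce and mce_usable_address(&m).

```c
#include <stdbool.h>

/* Illustrative bit value; only "set" versus "clear" matters here. */
#define MCP_DONTLOG (1u << 2)

/* Hypothetical record of which steps ran for one bank. */
struct trace {
	int read_aux;	/* mce_read_aux() + mce_severity() */
	int logged;	/* mce_log() */
	int cleared;	/* mce_wrmsrl(msr_ops.status(i), 0) */
};

/* Shape of the original patch: logging nested under a negated test. */
struct trace nested_form(unsigned int flags, bool dont_log_ce, bool usable_addr)
{
	struct trace t = { 0, 0, 0 };

	if (!(flags & MCP_DONTLOG)) {
		t.read_aux++;
		if (!dont_log_ce || usable_addr)
			t.logged++;
	}

	t.cleared++;
	return t;
}

/* Shape suggested in the review: early exit to the clearing step. */
struct trace goto_form(unsigned int flags, bool dont_log_ce, bool usable_addr)
{
	struct trace t = { 0, 0, 0 };

	if (flags & MCP_DONTLOG)
		goto clear_bank;

	t.read_aux++;
	if (!dont_log_ce || usable_addr)
		t.logged++;

clear_bank:
	t.cleared++;
	return t;
}

/* True when both forms ran exactly the same steps. */
bool same_trace(struct trace a, struct trace b)
{
	return a.read_aux == b.read_aux && a.logged == b.logged &&
	       a.cleared == b.cleared;
}
```

In this model both forms clear the bank exactly once for every input and skip the aux read and logging whenever MCP_DONTLOG is set. The goto variant simply drops one indentation level and avoids the double negation.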

Patch

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d5a8b99f7ba3..81ab25d5357a 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -760,24 +760,17 @@  bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b)
 log_it:
 		error_seen = true;
 
-		mce_read_aux(&m, i);
-
-		m.severity = mce_severity(&m, mca_cfg.tolerant, NULL, false);
-
-		/*
-		 * Don't get the IP here because it's unlikely to
-		 * have anything to do with the actual error location.
-		 */
-		if (!(flags & MCP_DONTLOG) && !mca_cfg.dont_log_ce)
-			mce_log(&m);
-		else if (mce_usable_address(&m)) {
+		if (!(flags & MCP_DONTLOG)) {
+			mce_read_aux(&m, i);
+			m.severity = mce_severity(&m, mca_cfg.tolerant, NULL,
+						  false);
 			/*
-			 * Although we skipped logging this, we still want
-			 * to take action. Add to the pool so the registered
-			 * notifiers will see it.
+			 * Don't get the IP here because it's unlikely to
+			 * have anything to do with the actual error location.
 			 */
-			if (!mce_gen_pool_add(&m))
-				mce_schedule_work();
+
+			if (!mca_cfg.dont_log_ce || mce_usable_address(&m))
+				mce_log(&m);
 		}
 
 		/*