[RFC] x86, mce: change the mce notifier to 'blocking' from 'atomic'

Message ID 20170413113159.rc32ebiswn64nzrr@pd.tnic (mailing list archive)
State New, archived

Commit Message

Borislav Petkov April 13, 2017, 11:31 a.m. UTC
On Thu, Apr 13, 2017 at 12:29:25AM +0200, Borislav Petkov wrote:
> On Wed, Apr 12, 2017 at 03:26:19PM -0700, Luck, Tony wrote:
> > We can futz with that and have them specify which chain (or both)
> > that they want to be added to.
> 
> Well, I didn't want the atomic chain to be a notifier because we can
> keep it simple and non-blocking. Only the process context one will be.
> 
> So the question is, do we even have a use case for outside consumers
> hanging on the atomic chain? Because if not, we're good to go.

Ok, new day, new patch.

Below is what we could do: we don't call the notifier at all on the
atomic path but only print the MCEs. We do log them and if the machine
survives, we process them accordingly. This is only a fix for upstream
so that the current issue at hand is addressed.

For later, we'd need to split the paths in:

critical_print_mce()

or somesuch which immediately dumps the MCE to dmesg, and

mce_log()

which does the slow path of logging MCEs and calling the blocking
notifier.
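
A minimal userspace sketch of that split, not kernel code. It assumes a
fixed-size pending queue and plain function-pointer consumers; struct
mce_rec, register_cb(), process_pending() and count_consumer() are
hypothetical names used only for illustration, and critical_print_mce()
here is just a stand-in for the function proposed above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct mce. */
struct mce_rec {
	int bank;
	unsigned long long status;
};

#define NOTIFY_DONE 0
#define NOTIFY_STOP 1

/* Hypothetical consumer type; the kernel uses struct notifier_block. */
typedef int (*mce_cb_t)(const struct mce_rec *m);

#define MAX_CBS     8
#define MAX_PENDING 16

static mce_cb_t callbacks[MAX_CBS];
static size_t nr_callbacks;

static struct mce_rec pending[MAX_PENDING];
static size_t nr_pending;

/* Counts how often the sample consumer below has run. */
static int consumer_calls;

/* Sample slow-path consumer: only counts the records it sees. */
int count_consumer(const struct mce_rec *m)
{
	(void)m;
	consumer_calls++;
	return NOTIFY_DONE;
}

/* Register a slow-path consumer. Returns 0 on success, -1 if full. */
int register_cb(mce_cb_t cb)
{
	if (nr_callbacks == MAX_CBS)
		return -1;
	callbacks[nr_callbacks++] = cb;
	return 0;
}

/*
 * Critical path: dump the raw record right away, then queue it.
 * No consumer runs here, so nothing on this path can sleep.
 * Returns 0 if queued, -1 if the record was printed but the queue is full.
 */
int critical_print_mce(const struct mce_rec *m)
{
	printf("MCE: bank %d status 0x%llx\n", m->bank, m->status);
	if (nr_pending == MAX_PENDING)
		return -1;
	pending[nr_pending++] = *m;
	return 0;
}

/*
 * Slow path: meant to run later, where sleeping is allowed. Drains the
 * queue and offers each record to the consumers in registration order
 * until one returns NOTIFY_STOP. Returns the number of records drained.
 */
size_t process_pending(void)
{
	size_t i, j, done = nr_pending;

	for (i = 0; i < nr_pending; i++) {
		for (j = 0; j < nr_callbacks; j++) {
			if (callbacks[j](&pending[i]) == NOTIFY_STOP)
				break;
		}
	}
	nr_pending = 0;
	return done;
}
```

The point of the sketch is only the ordering: the record is printed before
any consumer can run, and consumers see it later from the drain step.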

Now, I'd want to have decoding of the MCE on the critical path too so
I have to think about how to do that nicely. Maybe move the decoding
bits which are the same between Intel and AMD in mce.c and have some
vendor-specific, fast calls. We'll see. Btw, this is something Ingo has
been mentioning for a while.

Anyway, here's just the urgent fix for now.

Thanks.

---
From: Vishal Verma <vishal.l.verma@intel.com>
Date: Tue, 11 Apr 2017 16:44:57 -0600
Subject: [PATCH] x86/mce: Make the MCE notifier a blocking one

The NFIT MCE handler callback (for handling media errors on NVDIMMs)
takes a mutex to add the location of a memory error to a list. But since
the notifier call chain for machine checks (x86_mce_decoder_chain) is
atomic, we get a lockdep splat like:

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
  in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
  [..]
  Call Trace:
   dump_stack
   ___might_sleep
   __might_sleep
   mutex_lock_nested
   ? __lock_acquire
   nfit_handle_mce
   notifier_call_chain
   atomic_notifier_call_chain
   ? atomic_notifier_call_chain
   mce_gen_pool_process

Convert the notifier to a blocking one which gets to run only in process
context.
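
For illustration only, a userspace sketch of why the splat above fires. It
assumes a single global flag in place of the kernel's real context
tracking; every sketch_* name, in_atomic_ctx, sleep_violations and
mutex_taking_handler() are hypothetical stand-ins, not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NOTIFY_DONE 0

/* Hypothetical stand-in for "we are in atomic context". */
static bool in_atomic_ctx;
/* How many times a sleeping primitive was reached in atomic context. */
static int sleep_violations;

/* Rough analogue of might_sleep(): complain when sleeping is forbidden. */
static void sketch_might_sleep(const char *where)
{
	if (in_atomic_ctx) {
		sleep_violations++;
		fprintf(stderr,
			"BUG: sleeping function called from invalid context at %s\n",
			where);
	}
}

/* Stand-in for mutex_lock(): it may sleep, so it checks the context. */
static void sketch_mutex_lock(void)
{
	sketch_might_sleep("sketch_mutex_lock");
}

static void sketch_mutex_unlock(void)
{
}

/* Stand-in for a consumer that takes a mutex, as nfit_handle_mce() does. */
int mutex_taking_handler(void *data)
{
	(void)data;
	sketch_mutex_lock();
	/* ... add the error location to a list ... */
	sketch_mutex_unlock();
	return NOTIFY_DONE;
}

typedef int (*handler_t)(void *data);

/* Atomic-style call: the callback runs with sleeping forbidden. */
int sketch_atomic_call(handler_t fn, void *data)
{
	int ret;

	in_atomic_ctx = true;
	ret = fn(data);
	in_atomic_ctx = false;
	return ret;
}

/* Blocking-style call: the callback runs where sleeping is allowed. */
int sketch_blocking_call(handler_t fn, void *data)
{
	return fn(data);
}
```

Calling mutex_taking_handler() through sketch_blocking_call() leaves
sleep_violations untouched, while the same handler called through
sketch_atomic_call() bumps it once and prints the warning.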

Boris: remove the notifier call in atomic context in print_mce(). For
now, let's print the MCE on the atomic path so that we can make sure it
goes out. We still log it for process context later.

Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce-genpool.c  |  2 +-
 arch/x86/kernel/cpu/mcheck/mce-internal.h |  2 +-
 arch/x86/kernel/cpu/mcheck/mce.c          | 18 ++++--------------
 3 files changed, 6 insertions(+), 16 deletions(-)

Comments

Borislav Petkov April 13, 2017, 12:12 p.m. UTC | #1
On Thu, Apr 13, 2017 at 01:31:59PM +0200, Borislav Petkov wrote:
> @@ -321,18 +321,8 @@ static void __print_mce(struct mce *m)
>  
>  static void print_mce(struct mce *m)
>  {
> -	int ret = 0;
> -
>  	__print_mce(m);
> -
> -	/*
> -	 * Print out human-readable details about the MCE error,
> -	 * (if the CPU has an implementation for that)
> -	 */
> -	ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
> -	if (ret == NOTIFY_STOP)
> -		return;
> -
> +	mce_log(m);

Actually, we don't need that call here because do_machine_check()
already does it.

Luck, Tony April 18, 2017, 4:28 p.m. UTC | #2
On Thu, Apr 13, 2017 at 02:12:16PM +0200, Borislav Petkov wrote:
> On Thu, Apr 13, 2017 at 01:31:59PM +0200, Borislav Petkov wrote:
> > @@ -321,18 +321,8 @@ static void __print_mce(struct mce *m)
> >  
> >  static void print_mce(struct mce *m)
> >  {
> > -	int ret = 0;
> > -
> >  	__print_mce(m);
> > -
> > -	/*
> > -	 * Print out human-readable details about the MCE error,
> > -	 * (if the CPU has an implementation for that)
> > -	 */
> > -	ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
> > -	if (ret == NOTIFY_STOP)
> > -		return;
> > -
> > +	mce_log(m);
> 
> Actually, we don't need that call here because do_machine_check()
> already does it.

Yes. Don't add mce_log(m) here. We've already done it.

With this change:

Acked-by: Tony Luck <tony.luck@intel.com>

-Tony

Verma, Vishal L April 21, 2017, 9:39 p.m. UTC | #3
On Thu, 2017-04-13 at 13:31 +0200, Borislav Petkov wrote:
> On Thu, Apr 13, 2017 at 12:29:25AM +0200, Borislav Petkov wrote:
> > On Wed, Apr 12, 2017 at 03:26:19PM -0700, Luck, Tony wrote:
> > > We can futz with that and have them specify which chain (or both)
> > > that they want to be added to.
> >
> > Well, I didn't want the atomic chain to be a notifier because we can
> > keep it simple and non-blocking. Only the process context one will be.
> >
> > So the question is, do we even have a use case for outside consumers
> > hanging on the atomic chain? Because if not, we're good to go.
>
> Ok, new day, new patch.
>
> Below is what we could do: we don't call the notifier at all on the
> atomic path but only print the MCEs. We do log them and if the machine
> survives, we process them accordingly. This is only a fix for upstream
> so that the current issue at hand is addressed.
>
> For later, we'd need to split the paths in:
>
> critical_print_mce()
>
> or somesuch which immediately dumps the MCE to dmesg, and
>
> mce_log()
>
> which does the slow path of logging MCEs and calling the blocking
> notifier.
>
> Now, I'd want to have decoding of the MCE on the critical path too so
> I have to think about how to do that nicely. Maybe move the decoding
> bits which are the same between Intel and AMD in mce.c and have some
> vendor-specific, fast calls. We'll see. Btw, this is something Ingo has
> been mentioning for a while.
>
> Anyway, here's just the urgent fix for now.
>
> Thanks.
>
> ---
> From: Vishal Verma <vishal.l.verma@intel.com>
> Date: Tue, 11 Apr 2017 16:44:57 -0600
> Subject: [PATCH] x86/mce: Make the MCE notifier a blocking one
>
> The NFIT MCE handler callback (for handling media errors on NVDIMMs)
> takes a mutex to add the location of a memory error to a list. But since
> the notifier call chain for machine checks (x86_mce_decoder_chain) is
> atomic, we get a lockdep splat like:
>
>   BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
>   in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
>   [..]
>   Call Trace:
>    dump_stack
>    ___might_sleep
>    __might_sleep
>    mutex_lock_nested
>    ? __lock_acquire
>    nfit_handle_mce
>    notifier_call_chain
>    atomic_notifier_call_chain
>    ? atomic_notifier_call_chain
>    mce_gen_pool_process
>
> Convert the notifier to a blocking one which gets to run only in process
> context.
>
> Boris: remove the notifier call in atomic context in print_mce(). For
> now, let's print the MCE on the atomic path so that we can make sure it
> goes out. We still log it for process context later.
>
> Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: linux-edac <linux-edac@vger.kernel.org>
> Cc: x86-ml <x86@kernel.org>
> Cc: <stable@vger.kernel.org>
> Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
> Fixes: 6839a6d96f4e ("nfit: do an ARS scrub on hitting a latent media error")
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  arch/x86/kernel/cpu/mcheck/mce-genpool.c  |  2 +-
>  arch/x86/kernel/cpu/mcheck/mce-internal.h |  2 +-
>  arch/x86/kernel/cpu/mcheck/mce.c          | 18 ++++--------------
>  3 files changed, 6 insertions(+), 16 deletions(-)
>

I noticed this patch was picked up in tip, in ras/urgent, but didn't see
a pull request for 4.11 - was this the intention? Or will it just be
added for 4.12?

	-Vishal
Patch

diff --git a/arch/x86/kernel/cpu/mcheck/mce-genpool.c b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
index 1e5a50c11d3c..217cd4449bc9 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-genpool.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-genpool.c
@@ -85,7 +85,7 @@  void mce_gen_pool_process(struct work_struct *__unused)
 	head = llist_reverse_order(head);
 	llist_for_each_entry_safe(node, tmp, head, llnode) {
 		mce = &node->mce;
-		atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, mce);
+		blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, mce);
 		gen_pool_free(mce_evt_pool, (unsigned long)node, sizeof(*node));
 	}
 }
diff --git a/arch/x86/kernel/cpu/mcheck/mce-internal.h b/arch/x86/kernel/cpu/mcheck/mce-internal.h
index 903043e6a62b..19592ba1a320 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-internal.h
+++ b/arch/x86/kernel/cpu/mcheck/mce-internal.h
@@ -13,7 +13,7 @@  enum severity_level {
 	MCE_PANIC_SEVERITY,
 };
 
-extern struct atomic_notifier_head x86_mce_decoder_chain;
+extern struct blocking_notifier_head x86_mce_decoder_chain;
 
 #define ATTR_LEN		16
 #define INITIAL_CHECK_INTERVAL	5 * 60 /* 5 minutes */
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 5accfbdee3f0..8e470735b16b 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -123,7 +123,7 @@  static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
  * CPU/chipset specific EDAC code can register a notifier call here to print
  * MCE errors in a human-readable form.
  */
-ATOMIC_NOTIFIER_HEAD(x86_mce_decoder_chain);
+BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);
 
 /* Do initial initialization of a struct mce */
 void mce_setup(struct mce *m)
@@ -220,7 +220,7 @@  void mce_register_decode_chain(struct notifier_block *nb)
 
 	WARN_ON(nb->priority > MCE_PRIO_LOWEST && nb->priority < MCE_PRIO_EDAC);
 
-	atomic_notifier_chain_register(&x86_mce_decoder_chain, nb);
+	blocking_notifier_chain_register(&x86_mce_decoder_chain, nb);
 }
 EXPORT_SYMBOL_GPL(mce_register_decode_chain);
 
@@ -228,7 +228,7 @@  void mce_unregister_decode_chain(struct notifier_block *nb)
 {
 	atomic_dec(&num_notifiers);
 
-	atomic_notifier_chain_unregister(&x86_mce_decoder_chain, nb);
+	blocking_notifier_chain_unregister(&x86_mce_decoder_chain, nb);
 }
 EXPORT_SYMBOL_GPL(mce_unregister_decode_chain);
 
@@ -321,18 +321,8 @@  static void __print_mce(struct mce *m)
 
 static void print_mce(struct mce *m)
 {
-	int ret = 0;
-
 	__print_mce(m);
-
-	/*
-	 * Print out human-readable details about the MCE error,
-	 * (if the CPU has an implementation for that)
-	 */
-	ret = atomic_notifier_call_chain(&x86_mce_decoder_chain, 0, m);
-	if (ret == NOTIFY_STOP)
-		return;
-
+	mce_log(m);
 	pr_emerg_ratelimited(HW_ERR "Run the above through 'mcelog --ascii'\n");
 }