diff mbox

[v2,1/2] mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC

Message ID 20130625174536.5226.66421.stgit@localhost.localdomain (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Naveen N. Rao June 25, 2013, 5:46 p.m. UTC
Tony, Boris,
Can you please see if the comments in the below patch include the details you
were expecting?

Thanks,
Naveen

--
Add comments to clarify usage of the various bitfields in the MCA subsystem

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
---
 arch/x86/kernel/cpu/mcheck/mce.c       |    5 ++++-
 arch/x86/kernel/cpu/mcheck/mce_intel.c |    9 +++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Borislav Petkov June 25, 2013, 5:53 p.m. UTC | #1
On Tue, Jun 25, 2013 at 11:16:32PM +0530, Naveen N. Rao wrote:
> Tony, Boris,
> Can you please see if the comments in the below patch include the details you
> were expecting?
> 
> Thanks,
> Naveen
> 
> --
> Add comments to clarify usage of the various bitfields in the MCA subsystem
> 
> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> ---
>  arch/x86/kernel/cpu/mcheck/mce.c       |    5 ++++-
>  arch/x86/kernel/cpu/mcheck/mce_intel.c |    9 +++++++++
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> index 9239504..bf49cdb 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -89,7 +89,10 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_chrdev_wait);
>  static DEFINE_PER_CPU(struct mce, mces_seen);
>  static int			cpu_missing;
>  
> -/* MCA banks polled by the period polling timer for corrected events */
> +/*
> + * MCA banks polled by the period polling timer for corrected events.
> + * With Intel CMCI, this only has MCA banks which do not support CMCI (if any).
> + */
>  DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
>  	[0 ... BITS_TO_LONGS(MAX_NR_BANKS)-1] = ~0UL
>  };
> diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> index ae1697c..2816f53 100644
> --- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
> @@ -24,6 +24,15 @@
>   * Also supports reliable discovery of shared banks.
>   */
>  
> +/*
> + * Indicates MCA banks controlled by the current cpu for CMCI. Note that this
> + * can change when a cpu is offlined or brought online since some MCA banks
> + * are shared across cpus. When a cpu is offlined, cmci_clear() disables CMCI
> + * on all banks owned by the cpu and clears this bitfield. At this point,
> + * cmci_rediscover() kicks in and a different cpu may end up taking
> + * ownership of some of the shared MCA banks that were previously owned
> + * by the offlined cpu.
> + */
>  static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);

Yep, it looks ok to me.

Acked-by: Borislav Petkov <bp@suse.de>

Thanks.
Tony Luck June 25, 2013, 5:55 p.m. UTC | #2
+/*
+ * Indicates MCA banks controlled by the current cpu for CMCI. Note that this
+ * can change when a cpu is offlined or brought online since some MCA banks
+ * are shared across cpus. When a cpu is offlined, cmci_clear() disables CMCI
+ * on all banks owned by the cpu and clears this bitfield. At this point,
+ * cmci_rediscover() kicks in and a different cpu may end up taking
+ * ownership of some of the shared MCA banks that were previously owned
+ * by the offlined cpu.
+ */
 static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);

Maybe an extra sentence or two at the beginning to say *why* we need this.
E.g.

/*
 * CMCI can be delivered to multiple cpus that share a machine check bank
 * so we need to designate a single cpu to process errors logged in each bank
 * in the interrupt handler (otherwise we would have many races and potential
 * double reporting of the same error.
 */
...

-Tony
diff mbox

Patch

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 9239504..bf49cdb 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -89,7 +89,10 @@  static DECLARE_WAIT_QUEUE_HEAD(mce_chrdev_wait);
 static DEFINE_PER_CPU(struct mce, mces_seen);
 static int			cpu_missing;
 
-/* MCA banks polled by the period polling timer for corrected events */
+/*
+ * MCA banks polled by the period polling timer for corrected events.
+ * With Intel CMCI, this only has MCA banks which do not support CMCI (if any).
+ */
 DEFINE_PER_CPU(mce_banks_t, mce_poll_banks) = {
 	[0 ... BITS_TO_LONGS(MAX_NR_BANKS)-1] = ~0UL
 };
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel.c b/arch/x86/kernel/cpu/mcheck/mce_intel.c
index ae1697c..2816f53 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel.c
@@ -24,6 +24,15 @@ 
  * Also supports reliable discovery of shared banks.
  */
 
+/*
+ * Indicates MCA banks controlled by the current cpu for CMCI. Note that this
+ * can change when a cpu is offlined or brought online since some MCA banks
+ * are shared across cpus. When a cpu is offlined, cmci_clear() disables CMCI
+ * on all banks owned by the cpu and clears this bitfield. At this point,
+ * cmci_rediscover() kicks in and a different cpu may end up taking
+ * ownership of some of the shared MCA banks that were previously owned
+ * by the offlined cpu.
+ */
 static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);
 
 /*