diff mbox

[v2,08/12] x86/vmce: enable injecting LMCE to guest on Intel host

Message ID 20170317064614.23539-9-haozhong.zhang@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Haozhong Zhang March 17, 2017, 6:46 a.m. UTC
Inject LMCE to guest if the host MCE is LMCE and the affected vcpu is
known. Otherwise, broadcast MCE to all vcpus on Intel host.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>

Changes in v2:
 * Add comment for a check in mc_memerr_dhandler().
 * Add an ASSERT for vmce_vcpuid in fill_vmsr_data().
 * Combine two if branches about "broadcast".
---
 xen/arch/x86/cpu/mcheck/mcaction.c | 26 ++++++++++++++++++++------
 xen/arch/x86/cpu/mcheck/vmce.c     | 11 ++++++++++-
 xen/arch/x86/cpu/mcheck/vmce.h     |  2 +-
 3 files changed, 31 insertions(+), 8 deletions(-)

Comments

Jan Beulich March 20, 2017, 4:25 p.m. UTC | #1
>>> On 17.03.17 at 07:46, <haozhong.zhang@intel.com> wrote:
> @@ -88,18 +89,31 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
>                      goto vmce_failed;
>                  }
>  
> -                if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ||
> -                    global->mc_vcpuid == XEN_MC_VCPUID_INVALID)
> +                mc_vcpuid = global->mc_vcpuid;
> +                if (mc_vcpuid == XEN_MC_VCPUID_INVALID ||
> +                    (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> +                     (!(global->mc_gstatus & MCG_STATUS_LMCE) ||
> +                      !(d->vcpu[mc_vcpuid]->arch.vmce.lmce_enabled) ||
> +                      /*
> +                       * The following check serves for MCE injection
> +                       * test, i.e. xen-mceinj. xen-mceinj may specify
> +                       * the target domain (i.e. bank->mc_domid) and
> +                       * target CPU, but it's hard for xen-mceinj to
> +                       * ensure, when Xen prepares the actual
> +                       * injection in this function, vCPU currently
> +                       * running on the target CPU belongs to the
> +                       * target domain. If such inconsistency does
> +                       * happen, fallback to broadcast.
> +                       */
> +                      global->mc_domid != bank->mc_domid)))

Thinking about this another time, I don't think we want hackery
like this for a test utility. Instead I think the test utility wants to
pin the vCPU on the pCPU it wants to deliver the LMCE on.

Jan
Haozhong Zhang March 22, 2017, 9:19 a.m. UTC | #2
On 03/20/17 10:25 -0600, Jan Beulich wrote:
> >>> On 17.03.17 at 07:46, <haozhong.zhang@intel.com> wrote:
> > @@ -88,18 +89,31 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
> >                      goto vmce_failed;
> >                  }
> >  
> > -                if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ||
> > -                    global->mc_vcpuid == XEN_MC_VCPUID_INVALID)
> > +                mc_vcpuid = global->mc_vcpuid;
> > +                if (mc_vcpuid == XEN_MC_VCPUID_INVALID ||
> > +                    (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> > +                     (!(global->mc_gstatus & MCG_STATUS_LMCE) ||
> > +                      !(d->vcpu[mc_vcpuid]->arch.vmce.lmce_enabled) ||
> > +                      /*
> > +                       * The following check serves for MCE injection
> > +                       * test, i.e. xen-mceinj. xen-mceinj may specify
> > +                       * the target domain (i.e. bank->mc_domid) and
> > +                       * target CPU, but it's hard for xen-mceinj to
> > +                       * ensure, when Xen prepares the actual
> > +                       * injection in this function, vCPU currently
> > +                       * running on the target CPU belongs to the
> > +                       * target domain. If such inconsistency does
> > +                       * happen, fallback to broadcast.
> > +                       */
> > +                      global->mc_domid != bank->mc_domid)))
> 
> Thinking about this another time, I don't think we want hackery
> like this for a test utility. Instead I think the test utility wants to
> pin the vCPU on the pCPU it wants to deliver the LMCE on.
> 

I agree we should not introduce hackery only for test purpose.

However, after thinking twice, I think we still need this check, but
it should be lift to the outmost, i.e.
    if (mc_vcpuid == XEN_MC_VCPUID_INVALID ||
        global->mc_domid != bank->mc_domid ||             <== here
        (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
         (!(global->mc_gstatus & MCG_STATUS_LMCE) ||
          !(d->vcpu[mc_vcpuid]->arch.vmce.lmce_enabled))

MC# might not happen immediately at the moment that, e.g., the bad
memory cell is accessed, so the current domain id and vcpu id recorded
in global->mc_{domid, vcpuid} by mca_init_global() are probably not
precise (e.g. the domain accessed the bad memory was scheduled out,
and MC# comes while another domain is running). If such imprecision
does happen when handling Intel LMCE or AMD MCE, we cannot figure out
in mc_memerr_dhandler() (though it's not called in the current AMD MCE
handling, it intended to be the common code) the exact vcpu that
is affected.

To be worse, if the imprecise global->mc_vcpuid (whose value is in
variable mc_vcpuid) is larger than the maximum vcpu id of the affected
domain (indicated by variable 'd'), the check
    !(d->vcpu[mc_vcpuid]->arch.vmce.lmce_enabled)
is definitely wrong.


Haozhong
diff mbox

Patch

diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index ca17d22..f245356 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -44,6 +44,7 @@  mc_memerr_dhandler(struct mca_binfo *binfo,
     unsigned long mfn, gfn;
     uint32_t status;
     int vmce_vcpuid;
+    uint16_t mc_vcpuid;
 
     if (!mc_check_addr(bank->mc_status, bank->mc_misc, MC_ADDR_PHYSICAL)) {
         dprintk(XENLOG_WARNING,
@@ -88,18 +89,31 @@  mc_memerr_dhandler(struct mca_binfo *binfo,
                     goto vmce_failed;
                 }
 
-                if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL ||
-                    global->mc_vcpuid == XEN_MC_VCPUID_INVALID)
+                mc_vcpuid = global->mc_vcpuid;
+                if (mc_vcpuid == XEN_MC_VCPUID_INVALID ||
+                    (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+                     (!(global->mc_gstatus & MCG_STATUS_LMCE) ||
+                      !(d->vcpu[mc_vcpuid]->arch.vmce.lmce_enabled) ||
+                      /*
+                       * The following check serves for MCE injection
+                       * test, i.e. xen-mceinj. xen-mceinj may specify
+                       * the target domain (i.e. bank->mc_domid) and
+                       * target CPU, but it's hard for xen-mceinj to
+                       * ensure, when Xen prepares the actual
+                       * injection in this function, vCPU currently
+                       * running on the target CPU belongs to the
+                       * target domain. If such inconsistency does
+                       * happen, fallback to broadcast.
+                       */
+                      global->mc_domid != bank->mc_domid)))
                     vmce_vcpuid = VMCE_INJECT_BROADCAST;
                 else
-                    vmce_vcpuid = global->mc_vcpuid;
+                    vmce_vcpuid = mc_vcpuid;
 
                 bank->mc_addr = gfn << PAGE_SHIFT |
                   (bank->mc_addr & (PAGE_SIZE -1 ));
                 /* TODO: support injecting LMCE */
-                if (fill_vmsr_data(bank, d,
-                                   global->mc_gstatus & ~MCG_STATUS_LMCE,
-                                   vmce_vcpuid == VMCE_INJECT_BROADCAST))
+                if (fill_vmsr_data(bank, d, global->mc_gstatus, vmce_vcpuid))
                 {
                     mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
                       "failed\n", bank->mc_domid);
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index c396d07..994a50e 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -464,14 +464,23 @@  static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status,
 }
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-                   uint64_t gstatus, bool broadcast)
+                   uint64_t gstatus, int vmce_vcpuid)
 {
     struct vcpu *v = d->vcpu[0];
+    bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST);
     int ret, err;
 
     if ( mc_bank->mc_domid == DOMID_INVALID )
         return -EINVAL;
 
+    if ( broadcast )
+        gstatus &= ~MCG_STATUS_LMCE;
+    else if ( gstatus & MCG_STATUS_LMCE )
+    {
+        ASSERT(vmce_vcpuid >=0 && vmce_vcpuid < d->max_vcpus);
+        v = d->vcpu[vmce_vcpuid];
+    }
+
     /*
      * vMCE with the actual error information is injected to vCPU0,
      * and, if broadcast is required, we choose to inject less severe
diff --git a/xen/arch/x86/cpu/mcheck/vmce.h b/xen/arch/x86/cpu/mcheck/vmce.h
index 74f6381..2797e00 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.h
+++ b/xen/arch/x86/cpu/mcheck/vmce.h
@@ -17,7 +17,7 @@  int vmce_amd_rdmsr(const struct vcpu *, uint32_t msr, uint64_t *val);
 int vmce_amd_wrmsr(struct vcpu *, uint32_t msr, uint64_t val);
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-                   uint64_t gstatus, bool broadcast);
+                   uint64_t gstatus, int vmce_vcpuid);
 
 #define VMCE_INJECT_BROADCAST (-1)
 int inject_vmce(struct domain *d, int vcpu);