x86/svm: Drop support for AMD's Lightweight Profiling

Message ID 1558347216-19179-1-git-send-email-andrew.cooper3@citrix.com (mailing list archive)
State New, archived

Commit Message

Andrew Cooper May 20, 2019, 10:13 a.m. UTC
Lightweight Profiling was introduced in Bulldozer (Fam15h), but was dropped
from Zen (Fam17h) processors.  Furthermore, LWP was dropped from Fam15/16 CPUs
when IBPB for Spectre v2 was introduced in microcode, owing to LWP not being
used in practice.

As a result, CPUs which are operating within specification (i.e. with up to
date microcode) no longer have this feature, and therefore are not using it.

Drop support from Xen.  The main motivation here is to remove unnecessary
complexity from CPUID handling, but it also tidies up the SVM code nicely.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Brian Woods <brian.woods@amd.com>
---
 xen/arch/x86/cpuid.c                        | 22 +--------
 xen/arch/x86/hvm/svm/svm.c                  | 77 -----------------------------
 xen/arch/x86/hvm/svm/vmcb.c                 |  5 --
 xen/arch/x86/msr.c                          |  4 ++
 xen/arch/x86/xstate.c                       |  3 +-
 xen/include/asm-x86/cpufeature.h            |  1 -
 xen/include/asm-x86/hvm/svm/vmcb.h          |  4 --
 xen/include/asm-x86/xstate.h                |  3 +-
 xen/include/public/arch-x86/cpufeatureset.h |  2 +-
 9 files changed, 8 insertions(+), 113 deletions(-)

Comments

Jan Beulich May 20, 2019, 11:40 a.m. UTC | #1
>>> On 20.05.19 at 12:13, <andrew.cooper3@citrix.com> wrote:
> Lightweight Profiling was introduced in Bulldozer (Fam15h), but was dropped
> from Zen (Fam17h) processors.  Furthermore, LWP was dropped from Fam15/16 CPUs
> when IBPB for Spectre v2 was introduced in microcode, owing to LWP not being
> used in practice.
> 
> As a result, CPUs which are operating within specification (i.e. with up to
> date microcode) no longer have this feature, and therefore are not using it.
> 
> Drop support from Xen.  The main motivation here is to remove unnecessary
> complexity from CPUID handling, but it also tidies up the SVM code nicely.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -176,7 +176,7 @@ XEN_CPUFEATURE(IBS,           3*32+10) /*   Instruction Based Sampling */
>  XEN_CPUFEATURE(XOP,           3*32+11) /*A  extended AVX instructions */
>  XEN_CPUFEATURE(SKINIT,        3*32+12) /*   SKINIT/STGI instructions */
>  XEN_CPUFEATURE(WDT,           3*32+13) /*   Watchdog timer */
> -XEN_CPUFEATURE(LWP,           3*32+15) /*S  Light Weight Profiling */
> +XEN_CPUFEATURE(LWP,           3*32+15) /*   Light Weight Profiling */

Strictly speaking this is not permitted (see the other thread on
this being part of the public interface). But of course strictly
speaking it was also not permitted for AMD to remove the
feature in a ucode update (I wonder btw whether the insns
indeed #UD now on Fam15/16).

Jan

Andrew Cooper May 20, 2019, 12:30 p.m. UTC | #2
On 20/05/2019 12:40, Jan Beulich wrote:
>>>> On 20.05.19 at 12:13, <andrew.cooper3@citrix.com> wrote:
>> Lightweight Profiling was introduced in Bulldozer (Fam15h), but was dropped
>> from Zen (Fam17h) processors.  Furthermore, LWP was dropped from Fam15/16 CPUs
>> when IBPB for Spectre v2 was introduced in microcode, owing to LWP not being
>> used in practice.
>>
>> As a result, CPUs which are operating within specification (i.e. with up to
>> date microcode) no longer have this feature, and therefore are not using it.
>>
>> Drop support from Xen.  The main motivation here is to remove unnecessary
>> complexity from CPUID handling, but it also tidies up the SVM code nicely.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks.

>
>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>> @@ -176,7 +176,7 @@ XEN_CPUFEATURE(IBS,           3*32+10) /*   Instruction Based Sampling */
>>  XEN_CPUFEATURE(XOP,           3*32+11) /*A  extended AVX instructions */
>>  XEN_CPUFEATURE(SKINIT,        3*32+12) /*   SKINIT/STGI instructions */
>>  XEN_CPUFEATURE(WDT,           3*32+13) /*   Watchdog timer */
>> -XEN_CPUFEATURE(LWP,           3*32+15) /*S  Light Weight Profiling */
>> +XEN_CPUFEATURE(LWP,           3*32+15) /*   Light Weight Profiling */
> Strictly speaking this is not permitted (see the other thread on
> this being part of the public interface). But of course strictly
> speaking it was also not permitted for AMD to remove the
> feature in a ucode update (I wonder btw whether the insns
> indeed #UD now on Fam15/16).

There is nothing "not permitted" about it.  AMD can do whatever they
want in microcode, as can we with our feature support.

It is certainly in their (and our) interest to not break backwards
compatibility, which is why there should be a very good reason to
regress the customer experience.

In this case, LWP had already been dropped from Zen because it wasn't
used in practice, and then suddenly Spectre/Meltdown came along and
urgently needed a fix.  AMD didn't drop LWP lightly, and would have
avoided doing so if it was possible.

However, given a choice between fixing Spectre and retaining support
for a feature which isn't used, dropping LWP was the correct decision,
however uncomfortable it was to make.

As for #UD, I haven't tried but I'd expect so.  All the instructions
were already specified to #UD anyway when XCR0.LWP was clear.
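
As a rough illustration of the #UD rule just mentioned (a sketch only, not
Xen code, and not tested on Fam15/16 hardware): the helper name below is
hypothetical, and the only facts assumed are that LWP is XCR0 bit 62
(X86_XCR0_LWP in the patch) and that LWP instructions fault while that bit
is clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XCR0 bit 62 is the LWP state component (X86_XCR0_LWP in the patch). */
#define XCR0_LWP (1ULL << 62)

static_assert(XCR0_LWP == 0x4000000000000000ULL, "LWP is XCR0 bit 62");

/*
 * Hypothetical helper, not part of Xen: models the rule that LWP
 * instructions raise #UD whenever XCR0.LWP is clear.
 */
static bool lwp_insn_would_ud(uint64_t xcr0)
{
    return (xcr0 & XCR0_LWP) == 0;
}
```

Since the patch stops adding X86_XCR0_LWP to the xstates a guest policy
advertises, a guest would presumably never be able to set that bit, so under
this model the instructions always take the #UD path.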

~Andrew

Woods, Brian May 20, 2019, 6:43 p.m. UTC | #3
On Mon, May 20, 2019 at 11:13:36AM +0100, Andrew Cooper wrote:
> Lightweight Profiling was introduced in Bulldozer (Fam15h), but was dropped
> from Zen (Fam17h) processors.  Furthermore, LWP was dropped from Fam15/16 CPUs
> when IBPB for Spectre v2 was introduced in microcode, owing to LWP not being
> used in practice.
> 
> As a result, CPUs which are operating within specification (i.e. with up to
> date microcode) no longer have this feature, and therefore are not using it.
> 
> Drop support from Xen.  The main motivation here is to remove unnecessary
> complexity from CPUID handling, but it also tidies up the SVM code nicely.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>

I've confirmed with HW engineers that it's going away.

Patch

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index 6f59325..666fbbb 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -163,14 +163,6 @@  static void recalculate_xstate(struct cpuid_policy *p)
                           xstate_sizes[X86_XCR0_PKRU_POS]);
     }
 
-    if ( p->extd.lwp )
-    {
-        xstates |= X86_XCR0_LWP;
-        xstate_size = max(xstate_size,
-                          xstate_offsets[X86_XCR0_LWP_POS] +
-                          xstate_sizes[X86_XCR0_LWP_POS]);
-    }
-
     p->xstate.max_size  =  xstate_size;
     p->xstate.xcr0_low  =  xstates & ~XSTATE_XSAVES_ONLY;
     p->xstate.xcr0_high = (xstates & ~XSTATE_XSAVES_ONLY) >> 32;
@@ -265,8 +257,7 @@  static void recalculate_misc(struct cpuid_policy *p)
         zero_leaves(p->extd.raw, 0xb, 0x18);
 
         p->extd.raw[0x1b] = EMPTY_LEAF; /* IBS - not supported. */
-
-        p->extd.raw[0x1c].a = 0; /* LWP.a entirely dynamic. */
+        p->extd.raw[0x1c] = EMPTY_LEAF; /* LWP - not supported. */
         break;
     }
 }
@@ -581,11 +572,6 @@  void recalculate_cpuid_policy(struct domain *d)
 
     if ( !p->extd.page1gb )
         p->extd.raw[0x19] = EMPTY_LEAF;
-
-    if ( p->extd.lwp )
-        p->extd.raw[0x1c].d &= max->extd.raw[0x1c].d;
-    else
-        p->extd.raw[0x1c] = EMPTY_LEAF;
 }
 
 int init_domain_cpuid_policy(struct domain *d)
@@ -972,12 +958,6 @@  void guest_cpuid(const struct vcpu *v, uint32_t leaf,
                 res->d |= cpufeat_mask(X86_FEATURE_MTRR);
         }
         break;
-
-    case 0x8000001c:
-        if ( (v->arch.xcr0 & X86_XCR0_LWP) && cpu_has_svm )
-            /* Turn on available bit and other features specified in lwp_cfg. */
-            res->a = (res->d & v->arch.hvm.svm.guest_lwp_cfg) | 1;
-        break;
     }
 }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 0beb31b..9f26493 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -939,72 +939,6 @@  static void svm_init_hypercall_page(struct domain *d, void *hypercall_page)
     *(u16 *)(hypercall_page + (__HYPERVISOR_iret * 32)) = 0x0b0f; /* ud2 */
 }
 
-static void svm_lwp_interrupt(struct cpu_user_regs *regs)
-{
-    struct vcpu *curr = current;
-
-    ack_APIC_irq();
-    vlapic_set_irq(
-        vcpu_vlapic(curr),
-        (curr->arch.hvm.svm.guest_lwp_cfg >> 40) & 0xff,
-        0);
-}
-
-static inline void svm_lwp_save(struct vcpu *v)
-{
-    /* Don't mess up with other guests. Disable LWP for next VCPU. */
-    if ( v->arch.hvm.svm.guest_lwp_cfg )
-    {
-        wrmsrl(MSR_AMD64_LWP_CFG, 0x0);
-        wrmsrl(MSR_AMD64_LWP_CBADDR, 0x0);
-    }
-}
-
-static inline void svm_lwp_load(struct vcpu *v)
-{
-    /* Only LWP_CFG is reloaded. LWP_CBADDR will be reloaded via xrstor. */
-   if ( v->arch.hvm.svm.guest_lwp_cfg )
-       wrmsrl(MSR_AMD64_LWP_CFG, v->arch.hvm.svm.cpu_lwp_cfg);
-}
-
-/* Update LWP_CFG MSR (0xc0000105). Return -1 if error; otherwise returns 0. */
-static int svm_update_lwp_cfg(struct vcpu *v, uint64_t msr_content)
-{
-    uint32_t msr_low;
-    static uint8_t lwp_intr_vector;
-
-    if ( xsave_enabled(v) && cpu_has_lwp )
-    {
-        msr_low = (uint32_t)msr_content;
-        
-        /* generate #GP if guest tries to turn on unsupported features. */
-        if ( msr_low & ~v->domain->arch.cpuid->extd.raw[0x1c].d )
-            return -1;
-
-        v->arch.hvm.svm.guest_lwp_cfg = msr_content;
-
-        /* setup interrupt handler if needed */
-        if ( (msr_content & 0x80000000) && ((msr_content >> 40) & 0xff) )
-        {
-            alloc_direct_apic_vector(&lwp_intr_vector, svm_lwp_interrupt);
-            v->arch.hvm.svm.cpu_lwp_cfg = (msr_content & 0xffff00ffffffffffULL)
-                | ((uint64_t)lwp_intr_vector << 40);
-        }
-        else
-        {
-            /* otherwise disable it */
-            v->arch.hvm.svm.cpu_lwp_cfg = msr_content & 0xffff00ff7fffffffULL;
-        }
-        
-        wrmsrl(MSR_AMD64_LWP_CFG, v->arch.hvm.svm.cpu_lwp_cfg);
-
-        /* track nonalzy state if LWP_CFG is non-zero. */
-        v->arch.nonlazy_xstate_used = !!(msr_content);
-    }
-
-    return 0;
-}
-
 static inline void svm_tsc_ratio_save(struct vcpu *v)
 {
     /* Other vcpus might not have vtsc enabled. So disable TSC_RATIO here. */
@@ -1034,7 +968,6 @@  static void svm_ctxt_switch_from(struct vcpu *v)
         svm_fpu_leave(v);
 
     svm_save_dr(v);
-    svm_lwp_save(v);
     svm_tsc_ratio_save(v);
 
     svm_sync_vmcb(v, vmcb_needs_vmload);
@@ -1066,7 +999,6 @@  static void svm_ctxt_switch_to(struct vcpu *v)
 
     svm_vmsave_pa(per_cpu(host_vmcb, cpu));
     vmcb->cleanbits.bytes = 0;
-    svm_lwp_load(v);
     svm_tsc_ratio_load(v);
 
     if ( cpu_has_msr_tsc_aux )
@@ -2002,10 +1934,6 @@  static int svm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         *msr_content = vmcb_get_lastinttoip(vmcb);
         break;
 
-    case MSR_AMD64_LWP_CFG:
-        *msr_content = v->arch.hvm.svm.guest_lwp_cfg;
-        break;
-
     case MSR_K7_PERFCTR0:
     case MSR_K7_PERFCTR1:
     case MSR_K7_PERFCTR2:
@@ -2177,11 +2105,6 @@  static int svm_msr_write_intercept(unsigned int msr, uint64_t msr_content)
         vmcb_set_lastinttoip(vmcb, msr_content);
         break;
 
-    case MSR_AMD64_LWP_CFG:
-        if ( svm_update_lwp_cfg(v, msr_content) < 0 )
-            goto gpf;
-        break;
-
     case MSR_K7_PERFCTR0:
     case MSR_K7_PERFCTR1:
     case MSR_K7_PERFCTR2:
diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 9d1c5bf..71ee710 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -100,11 +100,6 @@  static int construct_vmcb(struct vcpu *v)
     svm_disable_intercept_for_msr(v, MSR_STAR);
     svm_disable_intercept_for_msr(v, MSR_SYSCALL_MASK);
 
-    /* LWP_CBADDR MSR is saved and restored by FPU code. So SVM doesn't need to
-     * intercept it. */
-    if ( cpu_has_lwp )
-        svm_disable_intercept_for_msr(v, MSR_AMD64_LWP_CBADDR);
-
     vmcb->_msrpm_base_pa = virt_to_maddr(svm->msrpm);
     vmcb->_iopm_base_pa = __pa(v->domain->arch.hvm.io_bitmap);
 
diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 883b57b..5a2ef78 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -132,6 +132,8 @@  int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_FLUSH_CMD:
         /* Write-only */
     case MSR_TSX_FORCE_ABORT:
+    case MSR_AMD64_LWP_CFG:
+    case MSR_AMD64_LWP_CBADDR:
         /* Not offered to guests. */
         goto gp_fault;
 
@@ -272,6 +274,8 @@  int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_ARCH_CAPABILITIES:
         /* Read-only */
     case MSR_TSX_FORCE_ABORT:
+    case MSR_AMD64_LWP_CFG:
+    case MSR_AMD64_LWP_CBADDR:
         /* Not offered to guests. */
         goto gp_fault;
 
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 858d1a6..3da609a 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -725,8 +725,7 @@  int handle_xsetbv(u32 index, u64 new_bv)
     curr->arch.xcr0 = new_bv;
     curr->arch.xcr0_accum |= new_bv;
 
-    /* LWP sets nonlazy_xstate_used independently. */
-    if ( new_bv & (XSTATE_NONLAZY & ~X86_XCR0_LWP) )
+    if ( new_bv & XSTATE_NONLAZY )
         curr->arch.nonlazy_xstate_used = 1;
 
     mask &= curr->fpu_dirtied ? ~XSTATE_FP_SSE : XSTATE_NONLAZY;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 745801f..e4f0343 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -78,7 +78,6 @@ 
 #define cpu_has_svm             boot_cpu_has(X86_FEATURE_SVM)
 #define cpu_has_sse4a           boot_cpu_has(X86_FEATURE_SSE4A)
 #define cpu_has_xop             boot_cpu_has(X86_FEATURE_XOP)
-#define cpu_has_lwp             boot_cpu_has(X86_FEATURE_LWP)
 #define cpu_has_fma4            boot_cpu_has(X86_FEATURE_FMA4)
 #define cpu_has_tbm             boot_cpu_has(X86_FEATURE_TBM)
 
diff --git a/xen/include/asm-x86/hvm/svm/vmcb.h b/xen/include/asm-x86/hvm/svm/vmcb.h
index 7017705..5c71028 100644
--- a/xen/include/asm-x86/hvm/svm/vmcb.h
+++ b/xen/include/asm-x86/hvm/svm/vmcb.h
@@ -534,10 +534,6 @@  struct svm_vcpu {
     uint64_t guest_sysenter_cs;
     uint64_t guest_sysenter_esp;
     uint64_t guest_sysenter_eip;
-
-    /* AMD lightweight profiling MSR */
-    uint64_t guest_lwp_cfg;      /* guest version */
-    uint64_t cpu_lwp_cfg;        /* CPU version */
 };
 
 struct vmcb_struct *alloc_vmcb(void);
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 47f602b..7ab0bdd 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -35,8 +35,7 @@  extern uint32_t mxcsr_mask;
                         XSTATE_NONLAZY)
 
 #define XSTATE_ALL     (~(1ULL << 63))
-#define XSTATE_NONLAZY (X86_XCR0_LWP | X86_XCR0_BNDREGS | X86_XCR0_BNDCSR | \
-                        X86_XCR0_PKRU)
+#define XSTATE_NONLAZY (X86_XCR0_BNDREGS | X86_XCR0_BNDCSR | X86_XCR0_PKRU)
 #define XSTATE_LAZY    (XSTATE_ALL & ~XSTATE_NONLAZY)
 #define XSTATE_XSAVES_ONLY         0
 #define XSTATE_COMPACTION_ENABLED  (1ULL << 63)
diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h
index 55231d4..727f482 100644
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -176,7 +176,7 @@  XEN_CPUFEATURE(IBS,           3*32+10) /*   Instruction Based Sampling */
 XEN_CPUFEATURE(XOP,           3*32+11) /*A  extended AVX instructions */
 XEN_CPUFEATURE(SKINIT,        3*32+12) /*   SKINIT/STGI instructions */
 XEN_CPUFEATURE(WDT,           3*32+13) /*   Watchdog timer */
-XEN_CPUFEATURE(LWP,           3*32+15) /*S  Light Weight Profiling */
+XEN_CPUFEATURE(LWP,           3*32+15) /*   Light Weight Profiling */
 XEN_CPUFEATURE(FMA4,          3*32+16) /*A  4 operands MAC instructions */
 XEN_CPUFEATURE(NODEID_MSR,    3*32+19) /*   NodeId MSR */
 XEN_CPUFEATURE(TBM,           3*32+21) /*A  trailing bit manipulations */
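
A note on the xstate.c and xstate.h hunks: removing X86_XCR0_LWP from
XSTATE_NONLAZY and dropping the "& ~X86_XCR0_LWP" in handle_xsetbv() appear
to cancel out, so the XSETBV path should behave as before. The sketch below
checks that under stated assumptions: the macro and function names are
shortened stand-ins for illustration, and the bit positions are the
architectural XCR0 ones (BNDREGS 3, BNDCSR 4, PKRU 9, LWP 62).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural XCR0 bit positions; names shortened for this sketch. */
#define XCR0_BNDREGS (1ULL << 3)
#define XCR0_BNDCSR  (1ULL << 4)
#define XCR0_PKRU    (1ULL << 9)
#define XCR0_LWP     (1ULL << 62)

/* XSTATE_NONLAZY before and after the patch. */
#define NONLAZY_OLD (XCR0_LWP | XCR0_BNDREGS | XCR0_BNDCSR | XCR0_PKRU)
#define NONLAZY_NEW (XCR0_BNDREGS | XCR0_BNDCSR | XCR0_PKRU)

/* The mask tested in handle_xsetbv() is the same either way. */
static_assert((NONLAZY_OLD & ~XCR0_LWP) == NONLAZY_NEW,
              "old and new XSETBV masks must match");

/* Old condition: LWP was masked out because it was tracked separately. */
static bool sets_nonlazy_old(uint64_t new_bv)
{
    return (new_bv & (NONLAZY_OLD & ~XCR0_LWP)) != 0;
}

/* New condition: LWP is simply no longer part of the mask. */
static bool sets_nonlazy_new(uint64_t new_bv)
{
    return (new_bv & NONLAZY_NEW) != 0;
}
```

The only behavioural difference is then the removed svm_update_lwp_cfg()
path, which used to set nonlazy_xstate_used when the guest wrote a non-zero
LWP_CFG.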