diff mbox series

[v11,14/16] KVM: x86/vmx: Flip Arch LBREn bit on guest state change

Message ID 20220506033305.5135-15-weijiang.yang@intel.com (mailing list archive)
State New, archived
Headers show
Series Introduce Architectural LBR for vPMU | expand

Commit Message

Yang, Weijiang May 6, 2022, 3:33 a.m. UTC
Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have their
values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
LBRs." So clear Arch LBREn bit on #SMI and restore it on RSM manully, also
clear the bit when guest does warm reset.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Liang, Kan May 6, 2022, 3:08 p.m. UTC | #1
On 5/5/2022 11:33 PM, Yang Weijiang wrote:
> Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
> on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have their
> values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
> LBRs." So clear Arch LBREn bit on #SMI and restore it on RSM manully, also
> clear the bit when guest does warm reset.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>

> ---
>   arch/x86/kvm/vmx/vmx.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6d6ee9cf82f5..b38f58868905 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4593,6 +4593,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>   	if (!init_event) {
>   		if (static_cpu_has(X86_FEATURE_ARCH_LBR))
>   			vmcs_write64(GUEST_IA32_LBR_CTL, 0);
> +	} else {
> +		flip_arch_lbr_ctl(vcpu, false);
>   	}
>   }
>   
> @@ -7704,6 +7706,7 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
>   	vmx->nested.smm.vmxon = vmx->nested.vmxon;
>   	vmx->nested.vmxon = false;
>   	vmx_clear_hlt(vcpu);
> +	flip_arch_lbr_ctl(vcpu, false);
>   	return 0;
>   }
>   
> @@ -7725,6 +7728,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
>   		vmx->nested.nested_run_pending = 1;
>   		vmx->nested.smm.guest_mode = false;
>   	}
> +	flip_arch_lbr_ctl(vcpu, true);
>   	return 0;
>   }
>
Paolo Bonzini May 10, 2022, 3:51 p.m. UTC | #2
On 5/6/22 05:33, Yang Weijiang wrote:
> Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
> on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have their
> values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
> LBRs." So clear Arch LBREn bit on #SMI and restore it on RSM manully, also
> clear the bit when guest does warm reset.
> 
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>   arch/x86/kvm/vmx/vmx.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6d6ee9cf82f5..b38f58868905 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4593,6 +4593,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>   	if (!init_event) {
>   		if (static_cpu_has(X86_FEATURE_ARCH_LBR))
>   			vmcs_write64(GUEST_IA32_LBR_CTL, 0);
> +	} else {
> +		flip_arch_lbr_ctl(vcpu, false);
>   	}
>   }
>   
> @@ -7704,6 +7706,7 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
>   	vmx->nested.smm.vmxon = vmx->nested.vmxon;
>   	vmx->nested.vmxon = false;
>   	vmx_clear_hlt(vcpu);
> +	flip_arch_lbr_ctl(vcpu, false);
>   	return 0;
>   }
>   
> @@ -7725,6 +7728,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
>   		vmx->nested.nested_run_pending = 1;
>   		vmx->nested.smm.guest_mode = false;
>   	}
> +	flip_arch_lbr_ctl(vcpu, true);
>   	return 0;
>   }
>   

This is incorrect, you are not saving/restoring the actual value of
LBREn (which is "lbr_desc->event != NULL").  Therefore, a migration
while in SMM would lose the value of LBREn = true.

Instead of using flip_arch_lbr_ctl, SMM should save the value of the MSR 
in kvm_x86_ops->enter_smm, and restore it in kvm_x86_ops->leave_smm 
(feel free to do it only if guest_cpuid_has(vcpu, X86_FEATURE_LM), i.e. 
the 32-bit case can be ignored).

Paolo
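
For illustration, a minimal sketch of that suggestion, reusing the
put_smstate()/GET_SMSTATE() helpers that a later revision of this patch
also uses; the 0x7f10 save state area offset is the one proposed further
down in the thread and is only assumed here to be free:

static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
{
	/* ... existing nested/SMM bookkeeping kept as-is ... */

	if (static_cpu_has(X86_FEATURE_ARCH_LBR) &&
	    guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
		u64 ctl = vmcs_read64(GUEST_IA32_LBR_CTL);

		/* Stash the whole MSR, then clear LBREn for the SMM handler. */
		put_smstate(u64, smstate, 0x7f10, ctl);
		vmcs_write64(GUEST_IA32_LBR_CTL, ctl & ~ARCH_LBR_CTL_LBREN);
	}
	return 0;
}

static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
{
	/* ... existing nested/SMM bookkeeping kept as-is ... */

	if (static_cpu_has(X86_FEATURE_ARCH_LBR) &&
	    guest_cpuid_has(vcpu, X86_FEATURE_LM))
		vmcs_write64(GUEST_IA32_LBR_CTL,
			     GET_SMSTATE(u64, smstate, 0x7f10));
	return 0;
}

Because the save state area is written to guest SMRAM and therefore
migrated with the guest, the stashed value survives a migration that
happens while the vCPU is in SMM, which the flip_arch_lbr_ctl() approach
above cannot guarantee.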
Yang, Weijiang May 11, 2022, 7:43 a.m. UTC | #3
On 5/10/2022 11:51 PM, Paolo Bonzini wrote:
> On 5/6/22 05:33, Yang Weijiang wrote:
>> Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
>> on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have their
>> values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
>> LBRs." So clear Arch LBREn bit on #SMI and restore it on RSM manully, also
>> clear the bit when guest does warm reset.
>>
>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>> ---
>>    arch/x86/kvm/vmx/vmx.c | 4 ++++
>>    1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 6d6ee9cf82f5..b38f58868905 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -4593,6 +4593,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>>    	if (!init_event) {
>>    		if (static_cpu_has(X86_FEATURE_ARCH_LBR))
>>    			vmcs_write64(GUEST_IA32_LBR_CTL, 0);
>> +	} else {
>> +		flip_arch_lbr_ctl(vcpu, false);
>>    	}
>>    }
>>    
>> @@ -7704,6 +7706,7 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
>>    	vmx->nested.smm.vmxon = vmx->nested.vmxon;
>>    	vmx->nested.vmxon = false;
>>    	vmx_clear_hlt(vcpu);
>> +	flip_arch_lbr_ctl(vcpu, false);
>>    	return 0;
>>    }
>>    
>> @@ -7725,6 +7728,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
>>    		vmx->nested.nested_run_pending = 1;
>>    		vmx->nested.smm.guest_mode = false;
>>    	}
>> +	flip_arch_lbr_ctl(vcpu, true);
>>    	return 0;
>>    }
>>    
> This is incorrect, you are not saving/restoring the actual value of
> LBREn (which is "lbr_desc->event != NULL").  Therefore, a migration
> while in SMM would lose the value of LBREn = true.
>
> Instead of using flip_arch_lbr_ctl, SMM should save the value of the MSR
> in kvm_x86_ops->enter_smm, and restore it in kvm_x86_ops->leave_smm
> (feel free to do it only if guest_cpuid_has(vcpu, X86_FEATURE_LM), i.e.
> the 32-bit case can be ignored).

In the case of a migration while in SMM, I assume kvm_x86_ops->enter_smm()
is called on the source side and kvm_x86_ops->leave_smm() is called at the
destination; should the saved LBREn value be transferred to the destination
too? The destination could rely on that bit to defer setting the LBREn bit
in the VMCS until kvm_x86_ops->leave_smm() is called. Is that acceptable?
Thanks!

>   
>
> Paolo
Yang, Weijiang May 12, 2022, 6:44 a.m. UTC | #4
On 5/10/2022 11:51 PM, Paolo Bonzini wrote:
> On 5/6/22 05:33, Yang Weijiang wrote:
>> Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
>> on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have their
>> values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
>> LBRs." So clear Arch LBREn bit on #SMI and restore it on RSM manully, also
>> clear the bit when guest does warm reset.
>>
>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>> ---
>>    arch/x86/kvm/vmx/vmx.c | 4 ++++
>>    1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 6d6ee9cf82f5..b38f58868905 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -4593,6 +4593,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>>    	if (!init_event) {
>>    		if (static_cpu_has(X86_FEATURE_ARCH_LBR))
>>    			vmcs_write64(GUEST_IA32_LBR_CTL, 0);
>> +	} else {
>> +		flip_arch_lbr_ctl(vcpu, false);
>>    	}
>>    }
>>    
>> @@ -7704,6 +7706,7 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
>>    	vmx->nested.smm.vmxon = vmx->nested.vmxon;
>>    	vmx->nested.vmxon = false;
>>    	vmx_clear_hlt(vcpu);
>> +	flip_arch_lbr_ctl(vcpu, false);
>>    	return 0;
>>    }
>>    
>> @@ -7725,6 +7728,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
>>    		vmx->nested.nested_run_pending = 1;
>>    		vmx->nested.smm.guest_mode = false;
>>    	}
>> +	flip_arch_lbr_ctl(vcpu, true);
>>    	return 0;
>>    }
>>    
> This is incorrect, you are not saving/restoring the actual value of
> LBREn (which is "lbr_desc->event != NULL").  Therefore, a migration
> while in SMM would lose the value of LBREn = true.
>
> Instead of using flip_arch_lbr_ctl, SMM should save the value of the MSR
> in kvm_x86_ops->enter_smm, and restore it in kvm_x86_ops->leave_smm
> (feel free to do it only if guest_cpuid_has(vcpu, X86_FEATURE_LM), i.e.
> the 32-bit case can be ignored).

Hi, Paolo,

I refactored this patch as below to incorporate your suggestion above,
could you kindly check it? If it's OK, I'll refresh the series as v12,
thanks!

======================================================================

 From dad3abc7fe96022dd3dcee8f958960bbd4f68b95 Mon Sep 17 00:00:00 2001
From: Yang Weijiang <weijiang.yang@intel.com>
Date: Thu, 5 Aug 2021 20:48:39 +0800
Subject: [PATCH] KVM: x86/vmx: Flip Arch LBREn bit on guest state change

Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have their
values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
LBRs." Use a reserved bit(63) of the MSR to hide LBREn bit on #SMI and
restore it to LBREn on RSM manully, also clear the bit when guest does
warm reset.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
  arch/x86/kvm/vmx/pmu_intel.c | 16 +++++++++++++---
  arch/x86/kvm/vmx/vmx.c       | 24 ++++++++++++++++++++++++
  arch/x86/kvm/vmx/vmx.h       |  1 +
  3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 038fdb788ccd..652601ad99ea 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -373,6 +373,8 @@ static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
      return (depth == pmu->kvm_arch_lbr_depth);
  }

+#define ARCH_LBR_IN_SMM    BIT(63)
+
  static bool arch_lbr_ctl_is_valid(struct kvm_vcpu *vcpu, u64 ctl)
  {
      struct kvm_cpuid_entry2 *entry;
@@ -380,7 +382,7 @@ static bool arch_lbr_ctl_is_valid(struct kvm_vcpu *vcpu, u64 ctl)
      if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
          return false;

-    if (ctl & ~KVM_ARCH_LBR_CTL_MASK)
+    if (ctl & ~(KVM_ARCH_LBR_CTL_MASK | ARCH_LBR_IN_SMM))
          goto warn;

      entry = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
@@ -425,6 +427,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
          return 0;
      case MSR_ARCH_LBR_CTL:
          msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
+        if (to_vmx(vcpu)->lbr_in_smm) {
+            msr_info->data |= ARCH_LBR_CTL_LBREN;
+            msr_info->data |= ARCH_LBR_IN_SMM;
+        }
          return 0;
      default:
          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
@@ -501,11 +507,15 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
          if (!arch_lbr_ctl_is_valid(vcpu, data))
              break;

-        vmcs_write64(GUEST_IA32_LBR_CTL, data);
-
          if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
              (data & ARCH_LBR_CTL_LBREN))
              intel_pmu_create_guest_lbr_event(vcpu);
+
+        if (data & ARCH_LBR_IN_SMM) {
+            data &= ~ARCH_LBR_CTL_LBREN;
+            data &= ~ARCH_LBR_IN_SMM;
+        }
+        vmcs_write64(GUEST_IA32_LBR_CTL, data);
          return 0;
      default:
          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6d6ee9cf82f5..eadad24a68e6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4543,6 +4543,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)

      vmx->rmode.vm86_active = 0;
      vmx->spec_ctrl = 0;
+    vmx->lbr_in_smm = false;

      vmx->msr_ia32_umwait_control = 0;

@@ -4593,6 +4594,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
      if (!init_event) {
          if (static_cpu_has(X86_FEATURE_ARCH_LBR))
              vmcs_write64(GUEST_IA32_LBR_CTL, 0);
+    } else {
+        flip_arch_lbr_ctl(vcpu, false);
      }
  }

@@ -7695,6 +7698,8 @@ static int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)

  static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
  {
+    struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+    struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
      struct vcpu_vmx *vmx = to_vmx(vcpu);

      vmx->nested.smm.guest_mode = is_guest_mode(vcpu);
@@ -7704,12 +7709,21 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
      vmx->nested.smm.vmxon = vmx->nested.vmxon;
      vmx->nested.vmxon = false;
      vmx_clear_hlt(vcpu);
+
+    if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) &&
+        test_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use) &&
+        lbr_desc->event)
+        vmx->lbr_in_smm = true;
+
      return 0;
  }

  static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
  {
+    struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+    struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
      struct vcpu_vmx *vmx = to_vmx(vcpu);
+
      int ret;

      if (vmx->nested.smm.vmxon) {
@@ -7725,6 +7739,16 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
          vmx->nested.nested_run_pending = 1;
          vmx->nested.smm.guest_mode = false;
      }
+
+    if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) &&
+        test_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use) &&
+        lbr_desc->event && vmx->lbr_in_smm) {
+        u64 ctl = vmcs_read64(GUEST_IA32_LBR_CTL);
+
+        vmcs_write64(GUEST_IA32_LBR_CTL, ctl | ARCH_LBR_CTL_LBREN);
+        vmx->lbr_in_smm = false;
+    }
+
      return 0;
  }

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index b98c7e96697a..a227fe8bf288 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -351,6 +351,7 @@ struct vcpu_vmx {

      struct pt_desc pt_desc;
      struct lbr_desc lbr_desc;
+    bool lbr_in_smm;

      /* Save desired MSR intercept (read: pass-through) state */
  #define MAX_POSSIBLE_PASSTHROUGH_MSRS    15
Paolo Bonzini May 12, 2022, 1:18 p.m. UTC | #5
On 5/11/22 09:43, Yang, Weijiang wrote:
>>
>> Instead of using flip_arch_lbr_ctl, SMM should save the value of the MSR
>> in kvm_x86_ops->enter_smm, and restore it in kvm_x86_ops->leave_smm
>> (feel free to do it only if guest_cpuid_has(vcpu, X86_FEATURE_LM), i.e.
>> the 32-bit case can be ignored).
> 
> In the case of a migration while in SMM, I assume kvm_x86_ops->enter_smm()
> is called on the source side and kvm_x86_ops->leave_smm() is called at the
> destination; should the saved LBREn value be transferred to the destination
> too? The destination could rely on that bit to defer setting the LBREn bit in

Hi, it's transferred automatically if the MSR is saved in the SMM save 
state area.  Both enter_smm and leave_smm can access the save state area.

The enter_smm callback is called after saving "normal" state, and it has 
to save the state + clear the bit; likewise, the leave_smm callback is 
called before restoring "normal" state and will restore the old value of
the MSR.

Thanks,

Paolo

> the VMCS until kvm_x86_ops->leave_smm() is called. Is that acceptable? Thanks!
Yang, Weijiang May 12, 2022, 2:38 p.m. UTC | #6
On 5/12/2022 9:18 PM, Paolo Bonzini wrote:
> On 5/11/22 09:43, Yang, Weijiang wrote:
>>> Instead of using flip_arch_lbr_ctl, SMM should save the value of the MSR
>>> in kvm_x86_ops->enter_smm, and restore it in kvm_x86_ops->leave_smm
>>> (feel free to do it only if guest_cpuid_has(vcpu, X86_FEATURE_LM), i.e.
>>> the 32-bit case can be ignored).
>> In the case of a migration while in SMM, I assume kvm_x86_ops->enter_smm()
>> is called on the source side and kvm_x86_ops->leave_smm() is called at the
>> destination; should the saved LBREn value be transferred to the destination
>> too? The destination could rely on that bit to defer setting the LBREn bit in
> Hi, it's transferred automatically if the MSR is saved in the SMM save
> state area.  Both enter_smm and leave_smm can access the save state area.
>
> The enter_smm callback is called after saving "normal" state, and it has
> to save the state + clear the bit; likewise, the leave_smm callback is
> called before restoring "normal" state and will restore the old value of
> the MSR.

Got it, thanks!

But there's no such slot for MSR_ARCH_LBR_CTL in SMRAM; do you still suggest
using this mechanism to implement the LBREn clear/restore logic?

> Thanks,
>
> Paolo
>
>> the VMCS until kvm_x86_ops->leave_smm() is called. Is that acceptable? Thanks!
Yang, Weijiang May 13, 2022, 4:02 a.m. UTC | #7
On 5/12/2022 9:18 PM, Paolo Bonzini wrote:
> On 5/11/22 09:43, Yang, Weijiang wrote:
>>> Instead of using flip_arch_lbr_ctl, SMM should save the value of the MSR
>>> in kvm_x86_ops->enter_smm, and restore it in kvm_x86_ops->leave_smm
>>> (feel free to do it only if guest_cpuid_has(vcpu, X86_FEATURE_LM), i.e.
>>> the 32-bit case can be ignored).
>> In the case of a migration while in SMM, I assume kvm_x86_ops->enter_smm()
>> is called on the source side and kvm_x86_ops->leave_smm() is called at the
>> destination; should the saved LBREn value be transferred to the destination
>> too? The destination could rely on that bit to defer setting the LBREn bit in
> Hi, it's transferred automatically if the MSR is saved in the SMM save
> state area.  Both enter_smm and leave_smm can access the save state area.
>
> The enter_smm callback is called after saving "normal" state, and it has
> to save the state + clear the bit; likewise, the leave_smm callback is
> called before restoring "normal" state and will restore the old value of
> the MSR.

Hi, I modified this patch to incorporate your suggestion above; see the
patch below.

I added a few more things to ease migration handling in SMM because:
1) QEMU checks LBREn before transferring the Arch LBR MSRs. 2) The perf
event is created when LBREn is being set.

Two things are not certain: 1) IA32_LBR_CTL doesn't have a corresponding
slot in SMRAM, so I'm not sure whether we should rely on SMRAM to transfer
the MSR. 2) I chose 0x7f10 as the offset (CET takes 0x7f08) for storage;
could you double-check whether it's free or already used?

Thanks a lot!

====================================================================

 From ceba1527fd87cdc789b38fce454058fca6582b0a Mon Sep 17 00:00:00 2001
From: Yang Weijiang <weijiang.yang@intel.com>
Date: Thu, 5 Aug 2021 20:48:39 +0800
Subject: [PATCH] KVM: x86/vmx: Flip Arch LBREn bit on guest state change

Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have their
values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
LBRs." Given migration in SMM, use a reserved bit(63) of the MSR to mirror
LBREn bit, it facilitates Arch LBR specific handling during live migration
because user space will check LBREn to decide whether it's necessary to
migrate the Arch LBR related MSRs. When the mirrored bit and LBREn bit are
both set, it means Arch LBR was active in SMM, so only create perf event
and defer the LBREn bit restoring to leave_smm callback.
Also clear LBREn at warm reset.

Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
  arch/x86/kvm/vmx/pmu_intel.c | 16 +++++++++++++---
  arch/x86/kvm/vmx/vmx.c       | 29 +++++++++++++++++++++++++++++
  arch/x86/kvm/vmx/vmx.h       |  1 +
  3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 038fdb788ccd..652601ad99ea 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -373,6 +373,8 @@ static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
      return (depth == pmu->kvm_arch_lbr_depth);
  }

+#define ARCH_LBR_IN_SMM    BIT(63)
+
  static bool arch_lbr_ctl_is_valid(struct kvm_vcpu *vcpu, u64 ctl)
  {
      struct kvm_cpuid_entry2 *entry;
@@ -380,7 +382,7 @@ static bool arch_lbr_ctl_is_valid(struct kvm_vcpu *vcpu, u64 ctl)
      if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
          return false;

-    if (ctl & ~KVM_ARCH_LBR_CTL_MASK)
+    if (ctl & ~(KVM_ARCH_LBR_CTL_MASK | ARCH_LBR_IN_SMM))
          goto warn;

      entry = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
@@ -425,6 +427,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
          return 0;
      case MSR_ARCH_LBR_CTL:
          msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
+        if (to_vmx(vcpu)->lbr_in_smm) {
+            msr_info->data |= ARCH_LBR_CTL_LBREN;
+            msr_info->data |= ARCH_LBR_IN_SMM;
+        }
          return 0;
      default:
          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
@@ -501,11 +507,15 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
          if (!arch_lbr_ctl_is_valid(vcpu, data))
              break;

-        vmcs_write64(GUEST_IA32_LBR_CTL, data);
-
          if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
              (data & ARCH_LBR_CTL_LBREN))
              intel_pmu_create_guest_lbr_event(vcpu);
+
+        if (data & ARCH_LBR_IN_SMM) {
+            data &= ~ARCH_LBR_CTL_LBREN;
+            data &= ~ARCH_LBR_IN_SMM;
+        }
+        vmcs_write64(GUEST_IA32_LBR_CTL, data);
          return 0;
      default:
          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6d6ee9cf82f5..f754b9400151 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4543,6 +4543,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)

      vmx->rmode.vm86_active = 0;
      vmx->spec_ctrl = 0;
+    vmx->lbr_in_smm = false;

      vmx->msr_ia32_umwait_control = 0;

@@ -4593,6 +4594,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
      if (!init_event) {
          if (static_cpu_has(X86_FEATURE_ARCH_LBR))
              vmcs_write64(GUEST_IA32_LBR_CTL, 0);
+    } else {
+        flip_arch_lbr_ctl(vcpu, false);
      }
  }

@@ -7695,6 +7698,8 @@ static int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)

  static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
  {
+    struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+    struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
      struct vcpu_vmx *vmx = to_vmx(vcpu);

      vmx->nested.smm.guest_mode = is_guest_mode(vcpu);
@@ -7704,12 +7709,26 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
      vmx->nested.smm.vmxon = vmx->nested.vmxon;
      vmx->nested.vmxon = false;
      vmx_clear_hlt(vcpu);
+
+    if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) &&
+        test_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use) &&
+        lbr_desc->event && guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
+        u64 ctl = vmcs_read64(GUEST_IA32_LBR_CTL);
+
+        put_smstate(u64, smstate, 0x7f10, ctl);
+        vmcs_write64(GUEST_IA32_LBR_CTL, ctl & ~ARCH_LBR_CTL_LBREN);
+        vmx->lbr_in_smm = true;
+    }
+
      return 0;
  }

  static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
  {
+    struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+    struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
      struct vcpu_vmx *vmx = to_vmx(vcpu);
+
      int ret;

      if (vmx->nested.smm.vmxon) {
@@ -7725,6 +7744,16 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
          vmx->nested.nested_run_pending = 1;
          vmx->nested.smm.guest_mode = false;
      }
+
+    if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) &&
+        test_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use) &&
+        lbr_desc->event && guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
+        u64 ctl = GET_SMSTATE(u64, smstate, 0x7f10);
+
+        vmcs_write64(GUEST_IA32_LBR_CTL, ctl | ARCH_LBR_CTL_LBREN);
+        vmx->lbr_in_smm = false;
+    }
+
      return 0;
  }

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index b98c7e96697a..a227fe8bf288 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -351,6 +351,7 @@ struct vcpu_vmx {

      struct pt_desc pt_desc;
      struct lbr_desc lbr_desc;
+    bool lbr_in_smm;

      /* Save desired MSR intercept (read: pass-through) state */
  #define MAX_POSSIBLE_PASSTHROUGH_MSRS    15
Yang, Weijiang May 17, 2022, 8:56 a.m. UTC | #8
On 5/13/2022 12:02 PM, Yang, Weijiang wrote:
>
> On 5/12/2022 9:18 PM, Paolo Bonzini wrote:
>> On 5/11/22 09:43, Yang, Weijiang wrote:
>>>> Instead of using flip_arch_lbr_ctl, SMM should save the value of the MSR
>>>> in kvm_x86_ops->enter_smm, and restore it in kvm_x86_ops->leave_smm
>>>> (feel free to do it only if guest_cpuid_has(vcpu, X86_FEATURE_LM), i.e.
>>>> the 32-bit case can be ignored).
>>> In the case of a migration while in SMM, I assume kvm_x86_ops->enter_smm()
>>> is called on the source side and kvm_x86_ops->leave_smm() is called at the
>>> destination; should the saved LBREn value be transferred to the destination
>>> too? The destination could rely on that bit to defer setting the LBREn bit in
>> Hi, it's transferred automatically if the MSR is saved in the SMM save
>> state area.  Both enter_smm and leave_smm can access the save state 
>> area.
>>
>> The enter_smm callback is called after saving "normal" state, and it has
>> to save the state + clear the bit; likewise, the leave_smm callback is
>> called before restoring "normal" state and will restore the old value of
>> the MSR.
>
> Hi, I modified this patch to incorporate your suggestion above; see the
> patch below.
>
> I added a few more things to ease migration handling in SMM because:
> 1) QEMU checks LBREn before transferring the Arch LBR MSRs. 2) The perf
> event is created when LBREn is being set.
>
> Two things are not certain: 1) IA32_LBR_CTL doesn't have a corresponding
> slot in SMRAM, so I'm not sure whether we should rely on SMRAM to transfer
> the MSR. 2) I chose 0x7f10 as the offset (CET takes 0x7f08) for storage;
> could you double-check whether it's free or already used?
>
> Thanks a lot!

Hi, Paolo,

I found there are some rebase conflicts between this series and your kvm
queue branch due to the PEBS patches; I can re-post a new version based on
your queue branch if necessary.

Waiting for your comments on this patch...

>
> ====================================================================
>
> From ceba1527fd87cdc789b38fce454058fca6582b0a Mon Sep 17 00:00:00 2001
> From: Yang Weijiang <weijiang.yang@intel.com>
> Date: Thu, 5 Aug 2021 20:48:39 +0800
> Subject: [PATCH] KVM: x86/vmx: Flip Arch LBREn bit on guest state change
>
> Per spec:"IA32_LBR_CTL.LBREn is saved and cleared on #SMI, and restored
> on RSM. On a warm reset, all LBR MSRs, including IA32_LBR_DEPTH, have 
> their
> values preserved. However, IA32_LBR_CTL.LBREn is cleared to 0, disabling
> LBRs." Given migration in SMM, use a reserved bit(63) of the MSR to 
> mirror
> LBREn bit, it facilitates Arch LBR specific handling during live 
> migration
> because user space will check LBREn to decide whether it's necessary to
> migrate the Arch LBR related MSRs. When the mirrored bit and LBREn bit 
> are
> both set, it means Arch LBR was active in SMM, so only create perf event
> and defer the LBREn bit restoring to leave_smm callback.
> Also clear LBREn at warm reset.
>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> ---
>  arch/x86/kvm/vmx/pmu_intel.c | 16 +++++++++++++---
>  arch/x86/kvm/vmx/vmx.c       | 29 +++++++++++++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.h       |  1 +
>  3 files changed, 43 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 038fdb788ccd..652601ad99ea 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -373,6 +373,8 @@ static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
>      return (depth == pmu->kvm_arch_lbr_depth);
>  }
>
> +#define ARCH_LBR_IN_SMM    BIT(63)
> +
>  static bool arch_lbr_ctl_is_valid(struct kvm_vcpu *vcpu, u64 ctl)
>  {
>      struct kvm_cpuid_entry2 *entry;
> @@ -380,7 +382,7 @@ static bool arch_lbr_ctl_is_valid(struct kvm_vcpu *vcpu, u64 ctl)
>      if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
>          return false;
>
> -    if (ctl & ~KVM_ARCH_LBR_CTL_MASK)
> +    if (ctl & ~(KVM_ARCH_LBR_CTL_MASK | ARCH_LBR_IN_SMM))
>          goto warn;
>
>      entry = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
> @@ -425,6 +427,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>          return 0;
>      case MSR_ARCH_LBR_CTL:
>          msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
> +        if (to_vmx(vcpu)->lbr_in_smm) {
> +            msr_info->data |= ARCH_LBR_CTL_LBREN;
> +            msr_info->data |= ARCH_LBR_IN_SMM;
> +        }
>          return 0;
>      default:
>          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> @@ -501,11 +507,15 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>          if (!arch_lbr_ctl_is_valid(vcpu, data))
>              break;
>
> -        vmcs_write64(GUEST_IA32_LBR_CTL, data);
> -
>          if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
>              (data & ARCH_LBR_CTL_LBREN))
>              intel_pmu_create_guest_lbr_event(vcpu);
> +
> +        if (data & ARCH_LBR_IN_SMM) {
> +            data &= ~ARCH_LBR_CTL_LBREN;
> +            data &= ~ARCH_LBR_IN_SMM;
> +        }
> +        vmcs_write64(GUEST_IA32_LBR_CTL, data);
>          return 0;
>      default:
>          if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 6d6ee9cf82f5..f754b9400151 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4543,6 +4543,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>
>      vmx->rmode.vm86_active = 0;
>      vmx->spec_ctrl = 0;
> +    vmx->lbr_in_smm = false;
>
>      vmx->msr_ia32_umwait_control = 0;
>
> @@ -4593,6 +4594,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>      if (!init_event) {
>          if (static_cpu_has(X86_FEATURE_ARCH_LBR))
>              vmcs_write64(GUEST_IA32_LBR_CTL, 0);
> +    } else {
> +        flip_arch_lbr_ctl(vcpu, false);
>      }
>  }
>
> @@ -7695,6 +7698,8 @@ static int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
>
>  static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
>  {
> +    struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
> +    struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>      struct vcpu_vmx *vmx = to_vmx(vcpu);
>
>      vmx->nested.smm.guest_mode = is_guest_mode(vcpu);
> @@ -7704,12 +7709,26 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
>      vmx->nested.smm.vmxon = vmx->nested.vmxon;
>      vmx->nested.vmxon = false;
>      vmx_clear_hlt(vcpu);
> +
> +    if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) &&
> +        test_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use) &&
> +        lbr_desc->event && guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
> +        u64 ctl = vmcs_read64(GUEST_IA32_LBR_CTL);
> +
> +        put_smstate(u64, smstate, 0x7f10, ctl);
> +        vmcs_write64(GUEST_IA32_LBR_CTL, ctl & ~ARCH_LBR_CTL_LBREN);
> +        vmx->lbr_in_smm = true;
> +    }
> +
>      return 0;
>  }
>
>  static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
>  {
> +    struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
> +    struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>      struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
>      int ret;
>
>      if (vmx->nested.smm.vmxon) {
> @@ -7725,6 +7744,16 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
>          vmx->nested.nested_run_pending = 1;
>          vmx->nested.smm.guest_mode = false;
>      }
> +
> +    if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) &&
> +        test_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use) &&
> +        lbr_desc->event && guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
> +        u64 ctl = GET_SMSTATE(u64, smstate, 0x7f10);
> +
> +        vmcs_write64(GUEST_IA32_LBR_CTL, ctl | ARCH_LBR_CTL_LBREN);
> +        vmx->lbr_in_smm = false;
> +    }
> +
>      return 0;
>  }
>
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index b98c7e96697a..a227fe8bf288 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -351,6 +351,7 @@ struct vcpu_vmx {
>
>      struct pt_desc pt_desc;
>      struct lbr_desc lbr_desc;
> +    bool lbr_in_smm;
>
>      /* Save desired MSR intercept (read: pass-through) state */
>  #define MAX_POSSIBLE_PASSTHROUGH_MSRS    15
Paolo Bonzini May 17, 2022, 9:01 a.m. UTC | #9
On 5/17/22 10:56, Yang, Weijiang wrote:
>> I added a few more things to ease migration handling in SMM because:
>> 1) QEMU checks LBREn before transferring the Arch LBR MSRs.

I think it should always transfer them instead?  There's time to post a 
fixup patch.

>> 2) The perf event is created when LBREn is being set.  Two things are
>> not certain: 1) IA32_LBR_CTL doesn't have a corresponding slot in SMRAM,
>> so I'm not sure whether we should rely on SMRAM to transfer the MSR.
>> 2) I chose 0x7f10 as the offset (CET takes 0x7f08) for storage; could
>> you double-check whether it's free or already used?

0x7f10 sounds good.

> Hi, Paolo,
> 
> I found there are some rebase conflicts between this series and your kvm
> queue branch due to the PEBS patches; I can re-post a new version based on
> your queue branch if necessary.

Yes, please.

> Waiting for your comments on this patch...

I already commented that using bit 63 is not good, didn't I?

Paolo
Yang, Weijiang May 17, 2022, 11:31 a.m. UTC | #10
On 5/17/2022 5:01 PM, Paolo Bonzini wrote:
> On 5/17/22 10:56, Yang, Weijiang wrote:
>>> I added a few more things to ease migration handling in SMM because:
>>> 1) QEMU checks LBREn before transferring the Arch LBR MSRs.
> I think it should always transfer them instead?  There's time to post a
> fixup patch.
OK, I'll send a fix patch.
>
>>> 2) The perf event is created when LBREn is being set.  Two things are
>>> not certain: 1) IA32_LBR_CTL doesn't have a corresponding slot in SMRAM,
>>> so I'm not sure whether we should rely on SMRAM to transfer the MSR.
>>> 2) I chose 0x7f10 as the offset (CET takes 0x7f08) for storage; could
>>> you double-check whether it's free or already used?
> 0x7f10 sounds good.
>
>> Hi, Paolo,
>>
>> I found there are some rebase conflicts between this series and your kvm
>> queue branch due to the PEBS patches; I can re-post a new version based on
>> your queue branch if necessary.
> Yes, please.
Sure, I'll post v12 soon.
>
>> Waiting for your comments on this patch...
> I already commented that using bit 63 is not good, didn't I?
Clear :-D, thanks!
>
> Paolo
>
diff mbox series

Patch

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6d6ee9cf82f5..b38f58868905 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4593,6 +4593,8 @@  static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	if (!init_event) {
 		if (static_cpu_has(X86_FEATURE_ARCH_LBR))
 			vmcs_write64(GUEST_IA32_LBR_CTL, 0);
+	} else {
+		flip_arch_lbr_ctl(vcpu, false);
 	}
 }
 
@@ -7704,6 +7706,7 @@  static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 	vmx->nested.smm.vmxon = vmx->nested.vmxon;
 	vmx->nested.vmxon = false;
 	vmx_clear_hlt(vcpu);
+	flip_arch_lbr_ctl(vcpu, false);
 	return 0;
 }
 
@@ -7725,6 +7728,7 @@  static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 		vmx->nested.nested_run_pending = 1;
 		vmx->nested.smm.guest_mode = false;
 	}
+	flip_arch_lbr_ctl(vcpu, true);
 	return 0;
 }