[RESEND,v3,03/11] KVM: x86/pmu: Protect kvm->arch.pmu_event_filter with SRCU

Message ID 20220518132512.37864-4-likexu@tencent.com (mailing list archive)
State New, archived
Series KVM: x86/pmu: More refactoring to get rid of PERF_TYPE_HARDWAR

Commit Message

Like Xu May 18, 2022, 1:25 p.m. UTC
From: Like Xu <likexu@tencent.com>

Similar to "kvm->arch.msr_filter", KVM should guarantee that vCPUs will
see either the previous filter or the new filter when user space calls
KVM_SET_PMU_EVENT_FILTER ioctl with the vCPU running so that guest
pmu events with identical settings in both the old and new filter have
deterministic behavior.

Fixes: 66bb8a065f5a ("KVM: x86: PMU Event Filter")
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/pmu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
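
To make the guarantee in the commit message concrete, here is a minimal
userspace sketch, not kernel code. It assumes a hypothetical toy_filter
structure and a C11 atomic pointer standing in for
kvm->arch.pmu_event_filter; the names toy_set_filter and toy_event_allowed
are made up for illustration. The reader loads the pointer once and works
only from that snapshot, so it sees either the old filter or the new one in
full. The sketch does not model the SRCU grace period that the kernel waits
for before freeing the old filter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_pmu_event_filter; not the kernel layout. */
struct toy_filter {
	bool allow_listed;	/* true: allow-list, false: deny-list */
	size_t nevents;
	uint64_t events[4];
};

/* Hypothetical stand-in for kvm->arch.pmu_event_filter (starts out NULL). */
static _Atomic(struct toy_filter *) current_filter;

/*
 * Writer side: publish a fully initialised filter with one atomic exchange.
 * Returns the previous filter so the caller can release it once no reader
 * can still hold it (the kernel waits for an SRCU grace period at this
 * point; this sketch leaves that out).
 */
static struct toy_filter *toy_set_filter(struct toy_filter *new_filter)
{
	return atomic_exchange_explicit(&current_filter, new_filter,
					memory_order_acq_rel);
}

/* Reader side: load the pointer once and use only that snapshot. */
static bool toy_event_allowed(uint64_t key)
{
	struct toy_filter *f = atomic_load_explicit(&current_filter,
						    memory_order_acquire);
	size_t i;

	if (!f)
		return true;	/* no filter installed: everything is allowed */

	assert(f->nevents <= 4);
	for (i = 0; i < f->nevents; i++)
		if (f->events[i] == key)
			return f->allow_listed;

	/* Unlisted events get the opposite verdict of listed ones. */
	return !f->allow_listed;
}
```

A caller would install a filter with toy_set_filter(&f) and query it with
toy_event_allowed(key). With no filter installed every event is allowed,
which mirrors the "if (!filter) goto out;" path in the patch.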

Comments

Paolo Bonzini May 20, 2022, 12:51 p.m. UTC | #1
On 5/18/22 15:25, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Similar to "kvm->arch.msr_filter", KVM should guarantee that vCPUs will
> see either the previous filter or the new filter when user space calls
> KVM_SET_PMU_EVENT_FILTER ioctl with the vCPU running so that guest
> pmu events with identical settings in both the old and new filter have
> deterministic behavior.
> 
> Fixes: 66bb8a065f5a ("KVM: x86: PMU Event Filter")
> Signed-off-by: Like Xu <likexu@tencent.com>
> Reviewed-by: Wanpeng Li <wanpengli@tencent.com>

Please always include the call trace where SRCU is not taken.  The ones 
I reconstructed always end up at a place inside srcu_read_lock/unlock:

reprogram_gp_counter/reprogram_fixed_counter
   amd_pmu_set_msr
    kvm_set_msr_common
     svm_set_msr
      __kvm_set_msr
      kvm_set_msr_ignored_check
       kvm_set_msr_with_filter
        kvm_emulate_wrmsr**
        emulator_set_msr_with_filter**
       kvm_set_msr
        emulator_set_msr**
       do_set_msr
        __msr_io
         msr_io
          ioctl(KVM_SET_MSRS)**
   intel_pmu_set_msr
    kvm_set_msr_common
     vmx_set_msr (see svm_set_msr)
   reprogram_counter
    global_ctrl_changed
     intel_pmu_set_msr (see above)
    kvm_pmu_handle_event
     vcpu_enter_guest**
    kvm_pmu_incr_counter
     kvm_pmu_trigger_event
      nested_vmx_run**
      kvm_skip_emulated_instruction**
      x86_emulate_instruction**
   reprogram_fixed_counters
    intel_pmu_set_msr (see above)

Paolo

>   arch/x86/kvm/pmu.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index f189512207db..24624654e476 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -246,8 +246,9 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
>   	struct kvm *kvm = pmc->vcpu->kvm;
>   	bool allow_event = true;
>   	__u64 key;
> -	int idx;
> +	int idx, srcu_idx;
>   
> +	srcu_idx = srcu_read_lock(&kvm->srcu);
>   	filter = srcu_dereference(kvm->arch.pmu_event_filter, &kvm->srcu);
>   	if (!filter)
>   		goto out;
> @@ -270,6 +271,7 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
>   	}
>   
>   out:
> +	srcu_read_unlock(&kvm->srcu, srcu_idx);
>   	return allow_event;
>   }
>

Jim Mattson May 20, 2022, 1 p.m. UTC | #2
On Fri, May 20, 2022 at 5:51 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 5/18/22 15:25, Like Xu wrote:
> > From: Like Xu <likexu@tencent.com>
> >
> > Similar to "kvm->arch.msr_filter", KVM should guarantee that vCPUs will
> > see either the previous filter or the new filter when user space calls
> > KVM_SET_PMU_EVENT_FILTER ioctl with the vCPU running so that guest
> > pmu events with identical settings in both the old and new filter have
> > deterministic behavior.
> >
> > Fixes: 66bb8a065f5a ("KVM: x86: PMU Event Filter")
> > Signed-off-by: Like Xu <likexu@tencent.com>
> > Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
>
> Please always include the call trace where SRCU is not taken.  The ones
> I reconstructed always end up at a place inside srcu_read_lock/unlock:
>
> reprogram_gp_counter/reprogram_fixed_counter
>    amd_pmu_set_msr
>     kvm_set_msr_common
>      svm_set_msr
>       __kvm_set_msr
>       kvm_set_msr_ignored_check
>        kvm_set_msr_with_filter
>         kvm_emulate_wrmsr**
>         emulator_set_msr_with_filter**
>        kvm_set_msr
>         emulator_set_msr**
>        do_set_msr
>         __msr_io
>          msr_io
>           ioctl(KVM_SET_MSRS)**
>    intel_pmu_set_msr
>     kvm_set_msr_common
>      vmx_set_msr (see svm_set_msr)
>    reprogram_counter
>     global_ctrl_changed
>      intel_pmu_set_msr (see above)
>     kvm_pmu_handle_event
>      vcpu_enter_guest**
>     kvm_pmu_incr_counter
>      kvm_pmu_trigger_event
>       nested_vmx_run**
>       kvm_skip_emulated_instruction**
>       x86_emulate_instruction**
>    reprogram_fixed_counters
>     intel_pmu_set_msr (see above)
>
> Paolo

I agree with Paolo that existing usage is covered by
srcu_read_lock/unlock, but (a) it's not easy to confirm this, and (b)
this is very fragile.

Whichever way we decide to go, the userspace MSR filter and the PMU
event filter should adopt the same approach.
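
One way to picture the fragility concern is the toy model below, which is
only an illustration and not how SRCU is implemented. It assumes a
hypothetical nesting counter (toy_readers) in place of SRCU's real per-CPU
bookkeeping, and all toy_* names are made up. Because SRCU read-side
critical sections may nest, a helper that takes the read lock itself is
protected whether or not its caller already holds it, which is the approach
the patch takes in check_pmu_event_filter().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical nesting depth; stands in for SRCU's real reader accounting. */
static int toy_readers;

static int toy_read_lock(void)
{
	toy_readers++;
	return 0;	/* real srcu_read_lock() returns an index for unlock */
}

static void toy_read_unlock(int idx)
{
	(void)idx;
	assert(toy_readers > 0);
	toy_readers--;
}

/* Helper that does not rely on its callers: it takes the lock itself. */
static bool toy_check_filter(void)
{
	int idx = toy_read_lock();
	bool in_read_section = toy_readers > 0;	/* always true here */

	toy_read_unlock(idx);
	return in_read_section;
}

/* Caller that already holds the read lock (the nested case). */
static bool toy_locked_caller(void)
{
	int idx = toy_read_lock();
	bool ok = toy_check_filter();

	toy_read_unlock(idx);
	return ok;
}

/* Caller that does not hold the read lock; the helper is still protected. */
static bool toy_unlocked_caller(void)
{
	return toy_check_filter();
}
```

Both callers end up inside a read-side section while the filter is
examined, and the counter returns to zero afterwards. The cost of this
design is a redundant lock/unlock pair on paths where the caller already
holds the lock.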

Patch

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index f189512207db..24624654e476 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -246,8 +246,9 @@  static bool check_pmu_event_filter(struct kvm_pmc *pmc)
 	struct kvm *kvm = pmc->vcpu->kvm;
 	bool allow_event = true;
 	__u64 key;
-	int idx;
+	int idx, srcu_idx;
 
+	srcu_idx = srcu_read_lock(&kvm->srcu);
 	filter = srcu_dereference(kvm->arch.pmu_event_filter, &kvm->srcu);
 	if (!filter)
 		goto out;
@@ -270,6 +271,7 @@  static bool check_pmu_event_filter(struct kvm_pmc *pmc)
 	}
 
 out:
+	srcu_read_unlock(&kvm->srcu, srcu_idx);
 	return allow_event;
 }