KVM: vPMU: Use atomic bit operations for global_status

Message ID 20230911061147.409152-1-mizhang@google.com (mailing list archive)
State New, archived
Series KVM: vPMU: Use atomic bit operations for global_status

Commit Message

Mingwei Zhang Sept. 11, 2023, 6:11 a.m. UTC
Use atomic bit operations for pmu->global_status because it may suffer from
race conditions between emulated overflow in KVM vPMU and PEBS overflow in
host PMI handler.

Fixes: f331601c65ad ("KVM: x86/pmu: Don't generate PEBS records for emulated instructions")
Signed-off-by: Mingwei Zhang <mizhang@google.com>
---
 arch/x86/kvm/pmu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)


base-commit: e2013f46ee2e721567783557c301e5c91d0b74ff

Comments

Sean Christopherson Sept. 11, 2023, 3:01 p.m. UTC | #1
On Mon, Sep 11, 2023, Mingwei Zhang wrote:
> Use atomic bit operations for pmu->global_status because it may suffer from
> race conditions between emulated overflow in KVM vPMU and PEBS overflow in
> host PMI handler.

Only if the host PMI occurs on a different pCPU, and if that can happen don't we
have a much larger problem?

> Fixes: f331601c65ad ("KVM: x86/pmu: Don't generate PEBS records for emulated instructions")
> Signed-off-by: Mingwei Zhang <mizhang@google.com>
> ---
>  arch/x86/kvm/pmu.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index edb89b51b383..00b48f25afdb 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -117,11 +117,11 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
>  			skip_pmi = true;
>  		} else {
>  			/* Indicate PEBS overflow PMI to guest. */
> -			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> -						      (unsigned long *)&pmu->global_status);
> +			skip_pmi = test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> +						    (unsigned long *)&pmu->global_status);
>  		}
>  	} else {
> -		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
> +		set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
>  	}
>  
>  	if (!pmc->intr || skip_pmi)
> 
> base-commit: e2013f46ee2e721567783557c301e5c91d0b74ff
> -- 
> 2.42.0.283.g2d96d420d3-goog
>

Mingwei Zhang Sept. 11, 2023, 6 p.m. UTC | #2
On Mon, Sep 11, 2023 at 8:01 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Sep 11, 2023, Mingwei Zhang wrote:
> > Use atomic bit operations for pmu->global_status because it may suffer from
> > race conditions between emulated overflow in KVM vPMU and PEBS overflow in
> > host PMI handler.
>
> Only if the host PMI occurs on a different pCPU, and if that can happen don't we
> have a much larger problem?

Why on different pCPU?  For vPMU, I think there is always contention
between the vCPU thread and the host PMI handler running on the same
pCPU, no?

So, in that case, anything that __kvm_perf_overflow(..., in_pmi=true)
touches on struct kvm_pmu will potentially race with the functions
like reprogram_counter() -> __kvm_perf_overflow(..., in_pmi=false).

-Mingwei
>
> > Fixes: f331601c65ad ("KVM: x86/pmu: Don't generate PEBS records for emulated instructions")
> > Signed-off-by: Mingwei Zhang <mizhang@google.com>
> > ---
> >  arch/x86/kvm/pmu.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> > index edb89b51b383..00b48f25afdb 100644
> > --- a/arch/x86/kvm/pmu.c
> > +++ b/arch/x86/kvm/pmu.c
> > @@ -117,11 +117,11 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
> >                       skip_pmi = true;
> >               } else {
> >                       /* Indicate PEBS overflow PMI to guest. */
> > -                     skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> > -                                                   (unsigned long *)&pmu->global_status);
> > +                     skip_pmi = test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
> > +                                                 (unsigned long *)&pmu->global_status);
> >               }
> >       } else {
> > -             __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
> > +             set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
> >       }
> >
> >       if (!pmc->intr || skip_pmi)
> >
> > base-commit: e2013f46ee2e721567783557c301e5c91d0b74ff
> > --
> > 2.42.0.283.g2d96d420d3-goog
> >

Sean Christopherson Sept. 11, 2023, 6:09 p.m. UTC | #3
On Mon, Sep 11, 2023, Mingwei Zhang wrote:
> On Mon, Sep 11, 2023 at 8:01 AM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Mon, Sep 11, 2023, Mingwei Zhang wrote:
> > > Use atomic bit operations for pmu->global_status because it may suffer from
> > > race conditions between emulated overflow in KVM vPMU and PEBS overflow in
> > > host PMI handler.
> >
> > Only if the host PMI occurs on a different pCPU, and if that can happen don't we
> > have a much larger problem?
> 
> Why on different pCPU?  For vPMU, I think there is always contention
> between the vCPU thread and the host PMI handler running on the same
> pCPU, no?

A non-atomic instruction can't be interrupted by an NMI, or any other event, so
I don't see how switching to atomic operations fixes anything unless the NMI comes
in on a different pCPU.
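
To make the distinction concrete, here is a minimal userspace sketch of the two flavors of bit helper the patch swaps between. The sketch_* names are hypothetical stand-ins, not the kernel's implementation. The plain C versions are also not guaranteed to compile to a single instruction, so the "cannot be interrupted by an NMI" property applies only when the read-modify-write really is one instruction (as far as I know, the x86 kernel helpers use inline assembly for that).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for the kernel bit helpers.
 * Names and types are assumptions made for illustration only.
 */

/* Like __set_bit(): a plain read-modify-write with no cross-CPU atomicity. */
static inline void sketch_nonatomic_set_bit(unsigned int nr, uint64_t *word)
{
	*word |= (uint64_t)1 << nr;
}

/* Like __test_and_set_bit(): returns the old bit value, not atomic. */
static inline bool sketch_nonatomic_test_and_set_bit(unsigned int nr,
						     uint64_t *word)
{
	uint64_t mask = (uint64_t)1 << nr;
	bool old = (*word & mask) != 0;

	*word |= mask;
	return old;
}

/* Like set_bit(): an atomic read-modify-write, safe against other CPUs. */
static inline void sketch_atomic_set_bit(unsigned int nr,
					 _Atomic uint64_t *word)
{
	atomic_fetch_or(word, (uint64_t)1 << nr);
}

/* Like test_and_set_bit(): atomic, returns the old bit value. */
static inline bool sketch_atomic_test_and_set_bit(unsigned int nr,
						  _Atomic uint64_t *word)
{
	uint64_t mask = (uint64_t)1 << nr;

	return (atomic_fetch_or(word, mask) & mask) != 0;
}
```

Both flavors produce the same result when only one CPU ever touches the word. The atomic versions pay for a locked operation that only matters if a second CPU can modify the same word concurrently.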

Mingwei Zhang Sept. 11, 2023, 11:42 p.m. UTC | #4
On Mon, Sep 11, 2023 at 11:09 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Sep 11, 2023, Mingwei Zhang wrote:
> > On Mon, Sep 11, 2023 at 8:01 AM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Mon, Sep 11, 2023, Mingwei Zhang wrote:
> > > > Use atomic bit operations for pmu->global_status because it may suffer from
> > > > race conditions between emulated overflow in KVM vPMU and PEBS overflow in
> > > > host PMI handler.
> > >
> > > Only if the host PMI occurs on a different pCPU, and if that can happen don't we
> > > have a much larger problem?
> >
> > Why on different pCPU?  For vPMU, I think there is always contention
> > between the vCPU thread and the host PMI handler running on the same
> > pCPU, no?
>
> A non-atomic instruction can't be interrupted by an NMI, or any other event, so
> I don't see how switching to atomic operations fixes anything unless the NMI comes
> in on a different pCPU.

You are right. I realize that. The race condition has to happen
concurrently from two different pCPUs. This happens to
pmu->reprogram_nmi but not pmu->global_status. The critical stuff we
care about should be re-entrancy issues for __kvm_perf_overflow() and
some state maintenance issues like avoiding duplicate NMI injection.

That concludes that we don't need this change.

Thanks
-Mingwei
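
The conclusion above can be illustrated with a small deterministic model. This is a sketch under stated assumptions, not kernel code: the function names are hypothetical, and the "same pCPU" case assumes the bit update is a single instruction, so an NMI handler runs either entirely before it or entirely after it. The "different pCPU" case models a second writer landing between the load and the store of a non-atomic update.

```c
#include <stdint.h>

/*
 * Hypothetical model of two writers updating one status word.
 * Nothing here is taken from the KVM sources.
 */

/*
 * Different pCPUs, non-atomic update: CPU B's write lands between
 * CPU A's load and store, so CPU A's store overwrites B's bit.
 */
static uint64_t sketch_cross_cpu_lost_update(uint64_t start,
					     unsigned int a_bit,
					     unsigned int b_bit)
{
	uint64_t word = start;
	uint64_t a_tmp = word;			/* CPU A: load */

	word |= (uint64_t)1 << b_bit;		/* CPU B: full update */
	a_tmp |= (uint64_t)1 << a_bit;		/* CPU A: modify stale copy */
	word = a_tmp;				/* CPU A: store, B's bit is lost */
	return word;
}

/*
 * Same pCPU, single-instruction update: the NMI handler cannot split
 * the instruction, so it runs wholly before or wholly after it.
 */
static uint64_t sketch_same_cpu_nmi(uint64_t start, unsigned int a_bit,
				    unsigned int nmi_bit, int nmi_first)
{
	uint64_t word = start;

	if (nmi_first)
		word |= (uint64_t)1 << nmi_bit;	/* NMI before the update */
	word |= (uint64_t)1 << a_bit;		/* the one-instruction update */
	if (!nmi_first)
		word |= (uint64_t)1 << nmi_bit;	/* NMI after the update */
	return word;
}
```

In the same-pCPU model both orderings leave both bits set, which is why atomic operations add nothing there. Only the cross-pCPU interleaving loses an update.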
Patch

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index edb89b51b383..00b48f25afdb 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -117,11 +117,11 @@  static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 			skip_pmi = true;
 		} else {
 			/* Indicate PEBS overflow PMI to guest. */
-			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
-						      (unsigned long *)&pmu->global_status);
+			skip_pmi = test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
+						    (unsigned long *)&pmu->global_status);
 		}
 	} else {
-		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+		set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
 	}
 
 	if (!pmc->intr || skip_pmi)