
[2/3] KVM: x86: Remove use of apicv_update_lock when toggling guest debug state

Message ID dc6cf3403e29c0296926e3bd8f0d4e87b67f4600.1738595289.git.naveen@kernel.org (mailing list archive)
State New
Series KVM: x86: Address performance degradation due to APICv inhibits

Commit Message

Naveen N Rao Feb. 3, 2025, 5:03 p.m. UTC
apicv_update_lock is not required when querying the guest debug state
of all the vcpus. Remove its usage, and switch to the
kvm_set_or_clear_apicv_inhibit() helper to simplify the code.

Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org>
---
 arch/x86/kvm/x86.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

Comments

Maxim Levitsky Feb. 4, 2025, 2 a.m. UTC | #1
On Mon, 2025-02-03 at 22:33 +0530, Naveen N Rao (AMD) wrote:
> apicv_update_lock is not required when querying the guest debug state
> of all the vcpus. Remove its usage, and switch to the
> kvm_set_or_clear_apicv_inhibit() helper to simplify the code.

It might be worth mentioning that the lock is not needed because
kvm_vcpu_ioctl(), from which this function is called, takes 'vcpu->mutex',
and thus concurrent execution of this function is not really possible.
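
For reference, this is roughly the shape of that path in
virt/kvm/kvm_main.c (an abbreviated sketch, not the exact upstream source):

	static long kvm_vcpu_ioctl(struct file *filp,
				   unsigned int ioctl, unsigned long arg)
	{
		struct kvm_vcpu *vcpu = filp->private_data;
		long r;

		/*
		 * Serializes ioctls against this one vCPU only; ioctls
		 * issued against other vCPUs can still run concurrently.
		 */
		if (mutex_lock_killable(&vcpu->mutex))
			return -EINTR;

		r = kvm_arch_vcpu_ioctl(filp, ioctl, arg); /* dispatch (simplified) */

		mutex_unlock(&vcpu->mutex);
		return r;
	}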

Besides this:

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky

> 
> Signed-off-by: Naveen N Rao (AMD) <naveen@kernel.org>
> ---
>  arch/x86/kvm/x86.c | 9 ++-------
>  1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b2d9a16fd4d3..11235e91ae90 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12038,19 +12038,14 @@ static void kvm_arch_vcpu_guestdbg_update_apicv_inhibit(struct kvm *kvm)
>  	struct kvm_vcpu *vcpu;
>  	unsigned long i;
>  
> -	if (!enable_apicv)
> -		return;
> -
> -	down_write(&kvm->arch.apicv_update_lock);
> -
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		if (vcpu->guest_debug & KVM_GUESTDBG_BLOCKIRQ) {
>  			set = true;
>  			break;
>  		}
>  	}
> -	__kvm_set_or_clear_apicv_inhibit(kvm, APICV_INHIBIT_REASON_BLOCKIRQ, set);
> -	up_write(&kvm->arch.apicv_update_lock);
> +
> +	kvm_set_or_clear_apicv_inhibit(kvm, APICV_INHIBIT_REASON_BLOCKIRQ, set);
>  }
>  
>  int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
Naveen N Rao Feb. 4, 2025, 1:10 p.m. UTC | #2
On Mon, Feb 03, 2025 at 09:00:05PM -0500, Maxim Levitsky wrote:
> On Mon, 2025-02-03 at 22:33 +0530, Naveen N Rao (AMD) wrote:
> > apicv_update_lock is not required when querying the guest debug state
> > of all the vcpus. Remove its usage, and switch to the
> > kvm_set_or_clear_apicv_inhibit() helper to simplify the code.
> 
> It might be worth mentioning that the lock is not needed because
> kvm_vcpu_ioctl(), from which this function is called, takes 'vcpu->mutex',
> and thus concurrent execution of this function is not really possible.
> 
> Besides this:
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Sure, thanks for the review!


- Naveen
Naveen N Rao Feb. 4, 2025, 2:25 p.m. UTC | #3
On Mon, Feb 03, 2025 at 09:00:05PM -0500, Maxim Levitsky wrote:
> On Mon, 2025-02-03 at 22:33 +0530, Naveen N Rao (AMD) wrote:
> > apicv_update_lock is not required when querying the guest debug state
> > of all the vcpus. Remove its usage, and switch to the
> > kvm_set_or_clear_apicv_inhibit() helper to simplify the code.
> 
> It might be worth mentioning that the lock is not needed because
> kvm_vcpu_ioctl(), from which this function is called, takes 'vcpu->mutex',
> and thus concurrent execution of this function is not really possible.

Looking at this again, that looks to be a vcpu-specific lock, so I guess 
it is possible for multiple vcpus to run this concurrently?

In reality, this looks to be coming in from a vcpu ioctl from userspace, 
so this is probably not being invoked concurrently today.

Regardless, I wonder if moving this to a per-vcpu inhibit might be a 
better way to address this.


Thanks,
Naveen
Sean Christopherson Feb. 4, 2025, 5:51 p.m. UTC | #4
On Tue, Feb 04, 2025, Naveen N Rao wrote:
> On Mon, Feb 03, 2025 at 09:00:05PM -0500, Maxim Levitsky wrote:
> > On Mon, 2025-02-03 at 22:33 +0530, Naveen N Rao (AMD) wrote:
> > > apicv_update_lock is not required when querying the guest debug state
> > > of all the vcpus. Remove its usage, and switch to the
> > > kvm_set_or_clear_apicv_inhibit() helper to simplify the code.
> > 
> > It might be worth mentioning that the lock is not needed because
> > kvm_vcpu_ioctl(), from which this function is called, takes 'vcpu->mutex',
> > and thus concurrent execution of this function is not really possible.
> 
> Looking at this again, that looks to be a vcpu-specific lock, so I guess 
> it is possible for multiple vcpus to run this concurrently?

Correct.

> In reality, this looks to be coming in from a vcpu ioctl from userspace, 
> so this is probably not being invoked concurrently today.
> 
> Regardless, I wonder if moving this to a per-vcpu inhibit might be a 
> better way to address this.

No, this is a slow path.
Paolo Bonzini Feb. 4, 2025, 5:58 p.m. UTC | #5
On 2/4/25 18:51, Sean Christopherson wrote:
> On Tue, Feb 04, 2025, Naveen N Rao wrote:
>> On Mon, Feb 03, 2025 at 09:00:05PM -0500, Maxim Levitsky wrote:
>>> On Mon, 2025-02-03 at 22:33 +0530, Naveen N Rao (AMD) wrote:
>>>> apicv_update_lock is not required when querying the guest debug state
>>>> of all the vcpus. Remove its usage, and switch to the
>>>> kvm_set_or_clear_apicv_inhibit() helper to simplify the code.
>>>
>>> It might be worth mentioning that the lock is not needed because
>>> kvm_vcpu_ioctl(), from which this function is called, takes 'vcpu->mutex',
>>> and thus concurrent execution of this function is not really possible.
>>
>> Looking at this again, that looks to be a vcpu-specific lock, so I guess
>> it is possible for multiple vcpus to run this concurrently?
> 
> Correct.

And this patch is incorrect. Because there is a store and many loads, 
you have the typical race when two vCPUs set blockirq at the same time

	vcpu 0				vcpu 1
	---------------			--------------
	set vcpu0->guest_debug
					clear vcpu1->guest_debug
	read vcpu0->guest_debug
	read vcpu1->guest_debug	
	set inhibit
					read stale vcpu0->guest_debug
					read vcpu1->guest_debug
					clear inhibit
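
Note that the helper itself takes apicv_update_lock only around the final
update, so the scan that computes 'set' runs with no lock held at all.
Roughly, abbreviated from arch/x86/kvm/x86.c:

	void kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
					    enum kvm_apicv_inhibit reason, bool set)
	{
		if (!enable_apicv)
			return;

		/* The lock covers only the inhibit update itself, not the
		 * caller's scan of guest_debug that computed 'set'. */
		down_write(&kvm->arch.apicv_update_lock);
		__kvm_set_or_clear_apicv_inhibit(kvm, reason, set);
		up_write(&kvm->arch.apicv_update_lock);
	}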

But since this is really a slow path, why even bother optimizing it?

Paolo
Maxim Levitsky Feb. 4, 2025, 7:42 p.m. UTC | #6
On Tue, 2025-02-04 at 18:58 +0100, Paolo Bonzini wrote:
> On 2/4/25 18:51, Sean Christopherson wrote:
> > On Tue, Feb 04, 2025, Naveen N Rao wrote:
> > > On Mon, Feb 03, 2025 at 09:00:05PM -0500, Maxim Levitsky wrote:
> > > > On Mon, 2025-02-03 at 22:33 +0530, Naveen N Rao (AMD) wrote:
> > > > > apicv_update_lock is not required when querying the guest debug state
> > > > > of all the vcpus. Remove its usage, and switch to the
> > > > > kvm_set_or_clear_apicv_inhibit() helper to simplify the code.
> > > > 
> > > > It might be worth mentioning that the lock is not needed because
> > > > kvm_vcpu_ioctl(), from which this function is called, takes 'vcpu->mutex',
> > > > and thus concurrent execution of this function is not really possible.
> > > 
> > > Looking at this again, that looks to be a vcpu-specific lock, so I guess
> > > it is possible for multiple vcpus to run this concurrently?
> > 
> > Correct.
> 
> And this patch is incorrect. Because there is a store and many loads, 
> you have the typical race when two vCPUs set blockirq at the same time
> 
> 	vcpu 0				vcpu 1
> 	---------------			--------------
> 	set vcpu0->guest_debug
> 					clear vcpu1->guest_debug
> 	read vcpu0->guest_debug
> 	read vcpu1->guest_debug	
> 	set inhibit
> 					read stale vcpu0->guest_debug
> 					read vcpu1->guest_debug
> 					clear inhibit
> 
> But since this is really a slow path, why even bother optimizing it?
> 
> Paolo
> 


Paolo, you are absolutely right! The vcpu mutex only prevents concurrent ioctls
on the same vcpu, not on different vcpus, and without locking this patch of
course isn't going to work. The per-vcpu mutex is not something I know well,
and I was only recently made aware of it, so I mixed things up.

So yes, some kind of lock is needed here.

Best regards,
     Maxim Levitsky
Naveen N Rao Feb. 5, 2025, 11:13 a.m. UTC | #7
On Tue, Feb 04, 2025 at 09:51:22AM -0800, Sean Christopherson wrote:
> On Tue, Feb 04, 2025, Naveen N Rao wrote:
> > On Mon, Feb 03, 2025 at 09:00:05PM -0500, Maxim Levitsky wrote:
> > > On Mon, 2025-02-03 at 22:33 +0530, Naveen N Rao (AMD) wrote:
> > > > apicv_update_lock is not required when querying the guest debug state
> > > > of all the vcpus. Remove its usage, and switch to the
> > > > kvm_set_or_clear_apicv_inhibit() helper to simplify the code.
> > > 
> > > It might be worth mentioning that the lock is not needed because
> > > kvm_vcpu_ioctl(), from which this function is called, takes 'vcpu->mutex',
> > > and thus concurrent execution of this function is not really possible.
> > 
> > Looking at this again, that looks to be a vcpu-specific lock, so I guess 
> > it is possible for multiple vcpus to run this concurrently?
> 
> Correct.
> 
> > In reality, this looks to be coming in from a vcpu ioctl from userspace, 
> > so this is probably not being invoked concurrently today.
> > 
> > Regardless, I wonder if moving this to a per-vcpu inhibit might be a 
> > better way to address this.
> 
> No, this is a slow path.

My comment was more from the point of view of correctness rather than
performance (with the goal of removing the use of apicv_update_lock) --
similar to the issue with IRQWIN needing to maintain per-vcpu state. My
naive understanding of Maxim's mail was that we would introduce a per-vcpu
inhibit field to maintain per-vcpu inhibit state, but not actually
inhibit AVIC on a per-vcpu basis :)
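
Something along these lines, perhaps (a purely hypothetical sketch: the
per-vcpu flag and the contributor count are made up for illustration, and
the 0 <-> 1 transition plus the inhibit update would still need to be done
atomically to be fully race-free):

	/*
	 * Hypothetical: each vCPU tracks its own BLOCKIRQ contribution, and
	 * a VM-wide count of contributors drives the single inhibit bit, so
	 * no vCPU ever has to scan the other vCPUs' guest_debug state.
	 */
	static void vcpu_update_blockirq_inhibit(struct kvm_vcpu *vcpu, bool set)
	{
		struct kvm *kvm = vcpu->kvm;
		int nr;

		if (vcpu->blockirq_inhibit == set)	/* made-up per-vCPU flag */
			return;
		vcpu->blockirq_inhibit = set;

		nr = set ? atomic_inc_return(&kvm->arch.nr_blockirq_vcpus)
			 : atomic_dec_return(&kvm->arch.nr_blockirq_vcpus);

		/* Only 0 <-> 1 transitions change the VM-wide inhibit. */
		if ((set && nr == 1) || (!set && nr == 0))
			kvm_set_or_clear_apicv_inhibit(kvm,
					APICV_INHIBIT_REASON_BLOCKIRQ, set);
	}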


Thanks,
Naveen

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b2d9a16fd4d3..11235e91ae90 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12038,19 +12038,14 @@  static void kvm_arch_vcpu_guestdbg_update_apicv_inhibit(struct kvm *kvm)
 	struct kvm_vcpu *vcpu;
 	unsigned long i;
 
-	if (!enable_apicv)
-		return;
-
-	down_write(&kvm->arch.apicv_update_lock);
-
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		if (vcpu->guest_debug & KVM_GUESTDBG_BLOCKIRQ) {
 			set = true;
 			break;
 		}
 	}
-	__kvm_set_or_clear_apicv_inhibit(kvm, APICV_INHIBIT_REASON_BLOCKIRQ, set);
-	up_write(&kvm->arch.apicv_update_lock);
+
+	kvm_set_or_clear_apicv_inhibit(kvm, APICV_INHIBIT_REASON_BLOCKIRQ, set);
 }
 
 int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,