diff mbox series

[1/2] KVM: x86: reduce pvclock_gtod_sync_lock critical sections

Message ID 20210330165958.3094759-2-pbonzini@redhat.com (mailing list archive)
State New, archived
Headers show
Series KVM: x86: fix lockdep splat due to Xen runstate update | expand

Commit Message

Paolo Bonzini March 30, 2021, 4:59 p.m. UTC
There is no need to include changes to vcpu->requests into
the pvclock_gtod_sync_lock critical section.  The changes to
the shared data structures (in pvclock_update_vm_gtod_copy)
already occur under the lock.

Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/x86.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

Comments

Wanpeng Li March 31, 2021, 1:41 a.m. UTC | #1
On Wed, 31 Mar 2021 at 01:02, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> There is no need to include changes to vcpu->requests into
> the pvclock_gtod_sync_lock critical section.  The changes to
> the shared data structures (in pvclock_update_vm_gtod_copy)
> already occur under the lock.
>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Wanpeng Li <wanpengli@tencent.com>

> ---
>  arch/x86/kvm/x86.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fe806e894212..0a83eff40b43 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2562,10 +2562,12 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>
>         kvm_hv_invalidate_tsc_page(kvm);
>
> -       spin_lock(&ka->pvclock_gtod_sync_lock);
>         kvm_make_mclock_inprogress_request(kvm);
> +
>         /* no guest entries from this point */
> +       spin_lock(&ka->pvclock_gtod_sync_lock);
>         pvclock_update_vm_gtod_copy(kvm);
> +       spin_unlock(&ka->pvclock_gtod_sync_lock);
>
>         kvm_for_each_vcpu(i, vcpu, kvm)
>                 kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
> @@ -2573,8 +2575,6 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>         /* guest entries allowed */
>         kvm_for_each_vcpu(i, vcpu, kvm)
>                 kvm_clear_request(KVM_REQ_MCLOCK_INPROGRESS, vcpu);
> -
> -       spin_unlock(&ka->pvclock_gtod_sync_lock);
>  #endif
>  }
>
> @@ -7740,16 +7740,14 @@ static void kvm_hyperv_tsc_notifier(void)
>                 struct kvm_arch *ka = &kvm->arch;
>
>                 spin_lock(&ka->pvclock_gtod_sync_lock);
> -
>                 pvclock_update_vm_gtod_copy(kvm);
> +               spin_unlock(&ka->pvclock_gtod_sync_lock);
>
>                 kvm_for_each_vcpu(cpu, vcpu, kvm)
>                         kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
>
>                 kvm_for_each_vcpu(cpu, vcpu, kvm)
>                         kvm_clear_request(KVM_REQ_MCLOCK_INPROGRESS, vcpu);
> -
> -               spin_unlock(&ka->pvclock_gtod_sync_lock);
>         }
>         mutex_unlock(&kvm_lock);
>  }
> --
> 2.26.2
>
>
Marcelo Tosatti April 7, 2021, 5:40 p.m. UTC | #2
On Tue, Mar 30, 2021 at 12:59:57PM -0400, Paolo Bonzini wrote:
> There is no need to include changes to vcpu->requests into
> the pvclock_gtod_sync_lock critical section.  The changes to
> the shared data structures (in pvclock_update_vm_gtod_copy)
> already occur under the lock.
> 
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fe806e894212..0a83eff40b43 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2562,10 +2562,12 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>  
>  	kvm_hv_invalidate_tsc_page(kvm);
>  
> -	spin_lock(&ka->pvclock_gtod_sync_lock);
>  	kvm_make_mclock_inprogress_request(kvm);
> +

Might be good to serialize against two kvm_gen_update_masterclock
callers? Otherwise one caller could clear KVM_REQ_MCLOCK_INPROGRESS,
while the other is still at pvclock_update_vm_gtod_copy().

Otherwise, looks good.
Paolo Bonzini April 8, 2021, 8:15 a.m. UTC | #3
On 07/04/21 19:40, Marcelo Tosatti wrote:
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index fe806e894212..0a83eff40b43 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2562,10 +2562,12 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
>>   
>>   	kvm_hv_invalidate_tsc_page(kvm);
>>   
>> -	spin_lock(&ka->pvclock_gtod_sync_lock);
>>   	kvm_make_mclock_inprogress_request(kvm);
>> +
> Might be good to serialize against two kvm_gen_update_masterclock
> callers? Otherwise one caller could clear KVM_REQ_MCLOCK_INPROGRESS,
> while the other is still at pvclock_update_vm_gtod_copy().

Makes sense, but this stuff has always seemed unnecessarily complicated 
to me.

KVM_REQ_MCLOCK_INPROGRESS is only needed to kick running vCPUs out of 
the execution loop; clearing it in kvm_gen_update_masterclock is 
unnecessary, because KVM_REQ_CLOCK_UPDATE takes pvclock_gtod_sync_lock 
too and thus will already wait for pvclock_update_vm_gtod_copy to end.

I think it's possible to use a seqcount in KVM_REQ_CLOCK_UPDATE instead 
of KVM_REQ_MCLOCK_INPROGRESS.  Both cause the vCPUs to spin. I'll take a 
look.

Paolo
Marcelo Tosatti April 8, 2021, noon UTC | #4
Hi Paolo,

On Thu, Apr 08, 2021 at 10:15:16AM +0200, Paolo Bonzini wrote:
> On 07/04/21 19:40, Marcelo Tosatti wrote:
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index fe806e894212..0a83eff40b43 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -2562,10 +2562,12 @@ static void kvm_gen_update_masterclock(struct kvm *kvm)
> > >   	kvm_hv_invalidate_tsc_page(kvm);
> > > -	spin_lock(&ka->pvclock_gtod_sync_lock);
> > >   	kvm_make_mclock_inprogress_request(kvm);
> > > +
> > Might be good to serialize against two kvm_gen_update_masterclock
> > callers? Otherwise one caller could clear KVM_REQ_MCLOCK_INPROGRESS,
> > while the other is still at pvclock_update_vm_gtod_copy().
> 
> Makes sense, but this stuff has always seemed unnecessarily complicated to
> me.
>
> KVM_REQ_MCLOCK_INPROGRESS is only needed to kick running vCPUs out of the
> execution loop; 

We do not want vcpus with different system_timestamp/tsc_timestamp
pair:

 * To avoid that problem, do not allow visibility of distinct
 * system_timestamp/tsc_timestamp values simultaneously: use a master
 * copy of host monotonic time values. Update that master copy
 * in lockstep.

So KVM_REQ_MCLOCK_INPROGRESS also ensures that no vcpu enters 
guest mode (via vcpu->requests check before VM-entry) with a 
different system_timestamp/tsc_timestamp pair.

> clearing it in kvm_gen_update_masterclock is unnecessary,
> because KVM_REQ_CLOCK_UPDATE takes pvclock_gtod_sync_lock too and thus will
> already wait for pvclock_update_vm_gtod_copy to end.
> 
> I think it's possible to use a seqcount in KVM_REQ_CLOCK_UPDATE instead of
> KVM_REQ_MCLOCK_INPROGRESS.  Both cause the vCPUs to spin. I'll take a look.
> 
> Paolo
Paolo Bonzini April 8, 2021, 12:25 p.m. UTC | #5
On 08/04/21 14:00, Marcelo Tosatti wrote:
>>
>> KVM_REQ_MCLOCK_INPROGRESS is only needed to kick running vCPUs out of the
>> execution loop;
> We do not want vcpus with different system_timestamp/tsc_timestamp
> pair:
> 
>   * To avoid that problem, do not allow visibility of distinct
>   * system_timestamp/tsc_timestamp values simultaneously: use a master
>   * copy of host monotonic time values. Update that master copy
>   * in lockstep.
> 
> So KVM_REQ_MCLOCK_INPROGRESS also ensures that no vcpu enters
> guest mode (via vcpu->requests check before VM-entry) with a
> different system_timestamp/tsc_timestamp pair.

Yes this is what KVM_REQ_MCLOCK_INPROGRESS does, but it does not have to 
be done that way.  All you really need is the IPI with KVM_REQUEST_WAIT, 
which ensures that updates happen after the vCPUs have exited guest 
mode.  You don't need to loop on vcpu->requests for example, because 
kvm_guest_time_update could just spin on pvclock_gtod_sync_lock until 
pvclock_update_vm_gtod_copy is done.

So this morning I tried protecting the kvm->arch fields for kvmclock 
using a seqcount, which is nice also because get_kvmclock_ns() does not 
have to bounce the cacheline of pvclock_gtod_sync_lock anymore.  I'll 
post it tomorrow or next week.

Paolo
diff mbox series

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe806e894212..0a83eff40b43 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2562,10 +2562,12 @@  static void kvm_gen_update_masterclock(struct kvm *kvm)
 
 	kvm_hv_invalidate_tsc_page(kvm);
 
-	spin_lock(&ka->pvclock_gtod_sync_lock);
 	kvm_make_mclock_inprogress_request(kvm);
+
 	/* no guest entries from this point */
+	spin_lock(&ka->pvclock_gtod_sync_lock);
 	pvclock_update_vm_gtod_copy(kvm);
+	spin_unlock(&ka->pvclock_gtod_sync_lock);
 
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
@@ -2573,8 +2575,6 @@  static void kvm_gen_update_masterclock(struct kvm *kvm)
 	/* guest entries allowed */
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_clear_request(KVM_REQ_MCLOCK_INPROGRESS, vcpu);
-
-	spin_unlock(&ka->pvclock_gtod_sync_lock);
 #endif
 }
 
@@ -7740,16 +7740,14 @@  static void kvm_hyperv_tsc_notifier(void)
 		struct kvm_arch *ka = &kvm->arch;
 
 		spin_lock(&ka->pvclock_gtod_sync_lock);
-
 		pvclock_update_vm_gtod_copy(kvm);
+		spin_unlock(&ka->pvclock_gtod_sync_lock);
 
 		kvm_for_each_vcpu(cpu, vcpu, kvm)
 			kvm_make_request(KVM_REQ_CLOCK_UPDATE, vcpu);
 
 		kvm_for_each_vcpu(cpu, vcpu, kvm)
 			kvm_clear_request(KVM_REQ_MCLOCK_INPROGRESS, vcpu);
-
-		spin_unlock(&ka->pvclock_gtod_sync_lock);
 	}
 	mutex_unlock(&kvm_lock);
 }