[PATCHv2,1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set

Message ID 1509644731-2249-1-git-send-email-eduval@amazon.com (mailing list archive)
State New, archived

Commit Message

Eduardo Valentin Nov. 2, 2017, 5:45 p.m. UTC
Currently, the existing qspinlock implementation will fall back to
test-and-set if the hypervisor has not set the PV_UNHALT flag.

This patch gives the opportunity to guest kernels to select
between test-and-set and the regular queue fair lock implementation
based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
flag is not set, the code will still fall back to test-and-set,
but when the PV_DEDICATED flag is set, the code will use
the regular queue spinlock implementation.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Jan H. Schoenherr <jschoenh@amazon.de>
Cc: Anthony Liguori <aliguori@amazon.com>
Suggested-by: Matt Wilson <msw@amazon.com>
Signed-off-by: Eduardo Valentin <eduval@amazon.com>
---
V2:
 - rebase on top of tip/master

 Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
 arch/x86/include/asm/qspinlock.h     | 4 ++++
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 3 files changed, 11 insertions(+)

Comments

Paolo Bonzini Nov. 2, 2017, 5:56 p.m. UTC | #1
On 02/11/2017 18:45, Eduardo Valentin wrote:
> Currently, the existing qspinlock implementation will fall back to
> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> 
> This patch gives the opportunity to guest kernels to select
> between test-and-set and the regular queue fair lock implementation
> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> flag is not set, the code will still fall back to test-and-set,
> but when the PV_DEDICATED flag is set, the code will use
> the regular queue spinlock implementation.

Have you seen Waiman's series that lets you specify this on the guest
command line instead?  Would this be acceptable for your use case?

(In other words, is there a difference for you between making the host
vs. guest administrator toggle the feature?  "@amazon.com" means you are
the host admin, how would you use it?)

Thanks,

Paolo

> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: kvm@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> Cc: Anthony Liguori <aliguori@amazon.com>
> Suggested-by: Matt Wilson <msw@amazon.com>
> Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> ---
> V2:
>  - rebase on top of tip/master
> 
>  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
>  arch/x86/include/asm/qspinlock.h     | 4 ++++
>  arch/x86/include/uapi/asm/kvm_para.h | 1 +
>  3 files changed, 11 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 3c65feb..117066a 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
>                                     ||       || before enabling paravirtualized
>                                     ||       || spinlock support.
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> +                                   ||       || to determine if they run on
> +                                   ||       || dedicated vCPUs, allowing opti-
> +                                   ||       || mizations such as usage of
> +                                   ||       || qspinlocks.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> index 308dfd0..3751898 100644
> --- a/arch/x86/include/asm/qspinlock.h
> +++ b/arch/x86/include/asm/qspinlock.h
> @@ -2,6 +2,8 @@
>  #define _ASM_X86_QSPINLOCK_H
>  
>  #include <linux/jump_label.h>
> +#include <linux/kvm_para.h>
> +
>  #include <asm/cpufeature.h>
>  #include <asm-generic/qspinlock_types.h>
>  #include <asm/paravirt.h>
> @@ -57,6 +59,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
>  	if (!static_branch_likely(&virt_spin_lock_key))
>  		return false;
>  
> +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> +		return false;
>  	/*
>  	 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
>  	 * back to a Test-and-Set spinlock, because fair locks have
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index a965e5b0..d151300 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -24,6 +24,7 @@
>  #define KVM_FEATURE_STEAL_TIME		5
>  #define KVM_FEATURE_PV_EOI		6
>  #define KVM_FEATURE_PV_UNHALT		7
> +#define KVM_FEATURE_PV_DEDICATED	8
>  
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
>
Eduardo Valentin Nov. 2, 2017, 6:08 p.m. UTC | #2
On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
> On 02/11/2017 18:45, Eduardo Valentin wrote:
> > Currently, the existing qspinlock implementation will fall back to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > 
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queue fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> 
> Have you seen Waiman's series that lets you specify this on the guest
> command line instead?  Would this be acceptable for your use case?
> 

No, can you please share a link to it? Is it already merged to tip/master?

> (In other words, is there a difference for you between making the host
> vs. guest administrator toggle the feature?  "@amazon.com" means you are
> the host admin, how would you use it?)
> 

The way I see it, this is a flag set by the host side so that the guest adapts accordingly.

If the admin on the guest side wants to ignore what the host is flagging, that is a different story.

> Thanks,
> 
> Paolo
> 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Waiman Long <longman@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Suggested-by: Matt Wilson <msw@amazon.com>
> > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> > ---
> > V2:
> >  - rebase on top of tip/master
> > 
> >  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
> >  arch/x86/include/asm/qspinlock.h     | 4 ++++
> >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> >  3 files changed, 11 insertions(+)
> > 
> > diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> > index 3c65feb..117066a 100644
> > --- a/Documentation/virtual/kvm/cpuid.txt
> > +++ b/Documentation/virtual/kvm/cpuid.txt
> > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
> >                                     ||       || before enabling paravirtualized
> >                                     ||       || spinlock support.
> >  ------------------------------------------------------------------------------
> > +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> > +                                   ||       || to determine if they run on
> > +                                   ||       || dedicated vCPUs, allowing opti-
> > +                                   ||       || mizations such as usage of
> > +                                   ||       || qspinlocks.
> > +------------------------------------------------------------------------------
> >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
> >                                     ||       || per-cpu warps are expected in
> >                                     ||       || kvmclock.
> > diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> > index 308dfd0..3751898 100644
> > --- a/arch/x86/include/asm/qspinlock.h
> > +++ b/arch/x86/include/asm/qspinlock.h
> > @@ -2,6 +2,8 @@
> >  #define _ASM_X86_QSPINLOCK_H
> >  
> >  #include <linux/jump_label.h>
> > +#include <linux/kvm_para.h>
> > +
> >  #include <asm/cpufeature.h>
> >  #include <asm-generic/qspinlock_types.h>
> >  #include <asm/paravirt.h>
> > @@ -57,6 +59,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> >  	if (!static_branch_likely(&virt_spin_lock_key))
> >  		return false;
> >  
> > +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +		return false;
> >  	/*
> >  	 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> >  	 * back to a Test-and-Set spinlock, because fair locks have
> > diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> > index a965e5b0..d151300 100644
> > --- a/arch/x86/include/uapi/asm/kvm_para.h
> > +++ b/arch/x86/include/uapi/asm/kvm_para.h
> > @@ -24,6 +24,7 @@
> >  #define KVM_FEATURE_STEAL_TIME		5
> >  #define KVM_FEATURE_PV_EOI		6
> >  #define KVM_FEATURE_PV_UNHALT		7
> > +#define KVM_FEATURE_PV_DEDICATED	8
> >  
> >  /* The last 8 bits are used to indicate how to interpret the flags field
> >   * in pvclock structure. If no bits are set, all flags are ignored.
> > 
> 
>
Waiman Long Nov. 2, 2017, 6:12 p.m. UTC | #3
On 11/02/2017 02:08 PM, Eduardo Valentin wrote:
> On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
>> On 02/11/2017 18:45, Eduardo Valentin wrote:
>>> Currently, the existing qspinlock implementation will fall back to
>>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
>>>
>>> This patch gives the opportunity to guest kernels to select
>>> between test-and-set and the regular queue fair lock implementation
>>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
>>> flag is not set, the code will still fall back to test-and-set,
>>> but when the PV_DEDICATED flag is set, the code will use
>>> the regular queue spinlock implementation.
>> Have you seen Waiman's series that lets you specify this on the guest
>> command line instead?  Would this be acceptable for your use case?
>>
> No, can you please share a link to it? is it already merged to tip/master?

See https://lkml.org/lkml/2017/11/1/655

Cheers,
Longman
Paolo Bonzini Nov. 2, 2017, 6:24 p.m. UTC | #4
On 02/11/2017 19:08, Eduardo Valentin wrote:
> On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
>> On 02/11/2017 18:45, Eduardo Valentin wrote:
>>> Currently, the existing qspinlock implementation will fall back to
>>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
>>>
>>> This patch gives the opportunity to guest kernels to select
>>> between test-and-set and the regular queue fair lock implementation
>>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
>>> flag is not set, the code will still fall back to test-and-set,
>>> but when the PV_DEDICATED flag is set, the code will use
>>> the regular queue spinlock implementation.
>>
>> Have you seen Waiman's series that lets you specify this on the guest
>> command line instead?  Would this be acceptable for your use case?
> 
> No, can you please share a link to it? is it already merged to tip/master?

[PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
https://lkml.org/lkml/2017/11/1/655

>> (In other words, is there a difference for you between making the host
>> vs. guest administrator toggle the feature?  "@amazon.com" means you are
>> the host admin, how would you use it?)
> 
> The way I think of this is this is a flag set by host side so the
> guest adapts accordingly.
> 
> If the admin in guest side wants to ignore what the host is
> flagging, that is a different story.

Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
such as "configuration hints", rather than properly a feature.

Paolo
Eduardo Valentin Nov. 2, 2017, 6:27 p.m. UTC | #5
Longman,

On Thu, Nov 02, 2017 at 02:12:13PM -0400, Waiman Long wrote:
> On 11/02/2017 02:08 PM, Eduardo Valentin wrote:
> > On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
> >> On 02/11/2017 18:45, Eduardo Valentin wrote:
> >>> Currently, the existing qspinlock implementation will fall back to
> >>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >>>
> >>> This patch gives the opportunity to guest kernels to select
> >>> between test-and-set and the regular queue fair lock implementation
> >>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> >>> flag is not set, the code will still fall back to test-and-set,
> >>> but when the PV_DEDICATED flag is set, the code will use
> >>> the regular queue spinlock implementation.
> >> Have you seen Waiman's series that lets you specify this on the guest
> >> command line instead?  Would this be acceptable for your use case?
> >>
> > No, can you please share a link to it? is it already merged to tip/master?
> 
> See https://lkml.org/lkml/2017/11/1/655

Oh I see, thanks. I think that patch would help, but I believe the series and this patch are complementary.

Paolo, back to your question, I think this patch still makes sense in combination with Waiman's series
for the following case:

+ * If this argument is not specified, the kernel will automatically choose
+ * an appropriate one depending on X86_FEATURE_HYPERVISOR and hypervisor
+ * specific settings.
+ */

In this case, the hypervisor can still flag PV_DEDICATED and the guest would not pick
test-and-set when that scenario is desirable.

> 
> Cheers,
> Longman
>
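
The precedence discussed above (an explicit guest-side choice wins, otherwise the host's PV_DEDICATED flag decides) can be sketched roughly as follows. This is only an illustration: `enum guest_choice` and `guest_uses_tas()` are hypothetical names, they are not the interface from Waiman's series, and PV_UNHALT is left out by assuming a hypervisor that does not advertise it.

```c
#include <stdbool.h>

/* Hypothetical guest-side setting; NOT the parameter from Waiman's series. */
enum guest_choice {
	CHOICE_AUTO,	/* nothing specified: follow the hypervisor's hint */
	CHOICE_QUEUED,	/* guest admin forces the fair queued spinlock */
	CHOICE_TAS,	/* guest admin forces test-and-set */
};

/*
 * Returns true when the guest would end up on the test-and-set path.
 * Assumes the hypervisor does not advertise PV_UNHALT.
 */
static bool guest_uses_tas(enum guest_choice choice, bool host_pv_dedicated)
{
	switch (choice) {
	case CHOICE_QUEUED:
		return false;
	case CHOICE_TAS:
		return true;
	case CHOICE_AUTO:
	default:
		/* No explicit choice: dedicated vCPUs avoid test-and-set. */
		return !host_pv_dedicated;
	}
}
```

With `CHOICE_AUTO`, the result depends only on the host flag, which is the case this patch is meant to cover.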
Eduardo Valentin Nov. 2, 2017, 6:43 p.m. UTC | #6
On Thu, Nov 02, 2017 at 07:24:16PM +0100, Paolo Bonzini wrote:
> On 02/11/2017 19:08, Eduardo Valentin wrote:
> > On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
> >> On 02/11/2017 18:45, Eduardo Valentin wrote:
> >>> Currently, the existing qspinlock implementation will fall back to
> >>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >>>
> >>> This patch gives the opportunity to guest kernels to select
> >>> between test-and-set and the regular queue fair lock implementation
> >>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> >>> flag is not set, the code will still fall back to test-and-set,
> >>> but when the PV_DEDICATED flag is set, the code will use
> >>> the regular queue spinlock implementation.
> >>
> >> Have you seen Waiman's series that lets you specify this on the guest
> >> command line instead?  Would this be acceptable for your use case?
> > 
> > No, can you please share a link to it? is it already merged to tip/master?
> 
> [PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
> https://lkml.org/lkml/2017/11/1/655
> 
> >> (In other words, is there a difference for you between making the host
> >> vs. guest administrator toggle the feature?  "@amazon.com" means you are
> >> the host admin, how would you use it?)
> > 
> > The way I think of this is this is a flag set by host side so the
> > guest adapts accordingly.
> > 
> > If the admin in guest side wants to ignore what the host is
> > flagging, that is a different story.
> 
> Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
> such as "configuration hints", rather than properly a feature.

Oh OK, you don't think this starts to deviate from the feature concept.
But would the PV_UNHALT also go to "configuration hints" bucket?

Another way to see this is we have three locking feature options to select from,
so we need at least two bits here.

> 
> Paolo
Paolo Bonzini Nov. 3, 2017, 10:09 a.m. UTC | #7
On 02/11/2017 19:43, Eduardo Valentin wrote:
> On Thu, Nov 02, 2017 at 07:24:16PM +0100, Paolo Bonzini wrote:
>> On 02/11/2017 19:08, Eduardo Valentin wrote:
>>> On Thu, Nov 02, 2017 at 06:56:46PM +0100, Paolo Bonzini wrote:
>>>> On 02/11/2017 18:45, Eduardo Valentin wrote:
>>>>> Currently, the existing qspinlock implementation will fall back to
>>>>> test-and-set if the hypervisor has not set the PV_UNHALT flag.
>>>>>
>>>>> This patch gives the opportunity to guest kernels to select
>>>>> between test-and-set and the regular queue fair lock implementation
>>>>> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
>>>>> flag is not set, the code will still fall back to test-and-set,
>>>>> but when the PV_DEDICATED flag is set, the code will use
>>>>> the regular queue spinlock implementation.
>>>>
>>>> Have you seen Waiman's series that lets you specify this on the guest
>>>> command line instead?  Would this be acceptable for your use case?
>>>
>>> No, can you please share a link to it? is it already merged to tip/master?
>>
>> [PATCH-tip v2 0/2] x86/paravirt: Enable users to choose PV lock type
>> https://lkml.org/lkml/2017/11/1/655
>>
>>>> (In other words, is there a difference for you between making the host
>>>> vs. guest administrator toggle the feature?  "@amazon.com" means you are
>>>> the host admin, how would you use it?)
>>>
>>> The way I think of this is this is a flag set by host side so the
>>> guest adapts accordingly.
>>>
>>> If the admin in guest side wants to ignore what the host is
>>> flagging, that is a different story.
>>
>> Okay, this makes sense.  But perhaps it should be a separate CPUID leaf,
>> such as "configuration hints", rather than properly a feature.
> 
> Oh OK, you don't think this starts to deviate from the feature concept.
> But would the PV_UNHALT also go to "configuration hints" bucket?

PV_UNHALT says whether the pvqspinlock API is available, PV_DEDICATED
says whether it should be used.

> Another way to see this is we have three locking feature options to select from,
> so we need at least two bits here.

PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
PV_DEDICATED = 0, PV_UNHALT = 0: default is tas

What do you think?

Paolo
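
The three-row table in the message above can be written as a small decision function. This is a minimal sketch of the proposal only, not kernel code: `enum lock_default` and `pick_default_lock()` are names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not kernel identifiers. */
enum lock_default {
	LOCK_QSPINLOCK,		/* native fair queued spinlock */
	LOCK_PVQSPINLOCK,	/* paravirtualized queued spinlock */
	LOCK_TAS,		/* unfair test-and-set fallback */
};

/* Encodes the proposed defaults; PV_DEDICATED takes priority over PV_UNHALT. */
static enum lock_default pick_default_lock(bool pv_dedicated, bool pv_unhalt)
{
	if (pv_dedicated)
		return LOCK_QSPINLOCK;	/* PV_UNHALT = anything */
	if (pv_unhalt)
		return LOCK_PVQSPINLOCK;
	return LOCK_TAS;
}
```

Checking PV_DEDICATED first is what makes the "PV_UNHALT = anything" row work: the second flag is only consulted when the vCPUs are not dedicated.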

Patch

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..117066a 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,12 @@  KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
                                    ||       || before enabling paravirtualized
                                    ||       || spinlock support.
 ------------------------------------------------------------------------------
+KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
+                                   ||       || to determine if they run on
+                                   ||       || dedicated vCPUs, allowing opti-
+                                   ||       || mizations such as usage of
+                                   ||       || qspinlocks.
+------------------------------------------------------------------------------
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
                                    ||       || per-cpu warps are expected in
                                    ||       || kvmclock.
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index 308dfd0..3751898 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -2,6 +2,8 @@ 
 #define _ASM_X86_QSPINLOCK_H
 
 #include <linux/jump_label.h>
+#include <linux/kvm_para.h>
+
 #include <asm/cpufeature.h>
 #include <asm-generic/qspinlock_types.h>
 #include <asm/paravirt.h>
@@ -57,6 +59,8 @@  static inline bool virt_spin_lock(struct qspinlock *lock)
 	if (!static_branch_likely(&virt_spin_lock_key))
 		return false;
 
+	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+		return false;
 	/*
 	 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
 	 * back to a Test-and-Set spinlock, because fair locks have
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index a965e5b0..d151300 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@ 
 #define KVM_FEATURE_STEAL_TIME		5
 #define KVM_FEATURE_PV_EOI		6
 #define KVM_FEATURE_PV_UNHALT		7
+#define KVM_FEATURE_PV_DEDICATED	8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
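
For readers who want to follow the control flow outside the kernel, the early exits that the patched virt_spin_lock() takes can be approximated in user space as below. This is a sketch under stated assumptions: `has_feature()` is a hypothetical stand-in for kvm_para_has_feature() that takes the feature word as a parameter instead of reading it from the hypervisor, and `uses_test_and_set()` models only the two checks visible in the hunk, not the locking itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define KVM_FEATURE_PV_UNHALT		7
#define KVM_FEATURE_PV_DEDICATED	8	/* bit number proposed by this patch */

/* Hypothetical stand-in for kvm_para_has_feature(): tests one bit of a
 * feature word supplied by the caller. */
static bool has_feature(uint32_t features, unsigned int bit)
{
	return (features >> bit) & 1u;
}

/*
 * Models the early exits of the patched virt_spin_lock(): returns true
 * when the test-and-set path would be taken, false when the caller
 * continues with the regular queued spinlock.
 */
static bool uses_test_and_set(bool virt_spin_lock_key_enabled, uint32_t features)
{
	if (!virt_spin_lock_key_enabled)
		return false;
	if (has_feature(features, KVM_FEATURE_PV_DEDICATED))
		return false;
	return true;
}
```

A feature word with bit 8 set makes `uses_test_and_set()` return false even when the static key is enabled, which is the behavior change the patch introduces.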