Message ID | 20171031170254.GA12738@u40b0340c692b58f6553c.ant.amazon.com (mailing list archive) |
---|---|
State | New, archived |
2017-10-31 10:02-0700, Eduardo Valentin:
> Hello Radim,
>
> On Tue, Oct 24, 2017 at 01:18:59PM +0200, Radim Krčmář wrote:
> > 2017-10-23 17:44-0700, Eduardo Valentin:
> > > Currently, the existing qspinlock implementation will fall back to
> > > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >
> > Where have you detected the main source of overhead with pinned VCPUs?
> > Makes me wonder if we couldn't improve general PV_UNHALT,
>
> This is essentially for cases of non-overcommitted vCPUs in which we want
> the instance vCPUs to run uninterrupted as much as possible. Here by disabling
> the PV_UNHALT, we avoid the accounting needed to properly do the PV_UNHALT
> hypercall, as the lock holder won't be preempted anyway for the 1:1 pin case.

Right, I would expect that the scenario should very rarely go into the
halt/kick path -- is SPIN_THRESHOLD too low?

We could also try abolishing the SPIN_THRESHOLD completely and only use
vcpu_is_preempted() and state of the previous lock holder to enter the
halt/kick path.

(The drawback is that vcpu_is_preempted() currently gets set even when
dropping into userspace.)

> > > This patch gives the opportunity to guest kernels to select
> > > between test-and-set and the regular queue fair lock implementation
> > > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > > flag is not set, the code will still fall back to test-and-set,
> > > but when the PV_DEDICATED flag is set, the code will use
> > > the regular queue spinlock implementation.
> >
> > Some flag makes sense and we do want to make sure that userspaces don't
> > enable it in pass-through-cpuid mode.
>
> Did you mean something like:
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..8ceb503 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -211,7 +211,8 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
>  	}
>  	for (i = 0; i < cpuid->nent; i++) {
>  		vcpu->arch.cpuid_entries[i].function = cpuid_entries[i].function;
> -		vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax;
> +		vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax &
> +			~KVM_FEATURE_PV_DEDICATED;
>  		vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
>  		vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
>  		vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;
>
>
> But I do not see any other KVM_FEATURE_* being enforced (e.g. PV_UNHALT).
> Do you mind elaborating a bit here?

Sorry, nothing is needed.  I somehow thought that we need to expose this
to the userspace through CPUID, but KVM just needs to consider the flag
as reserved.
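
For readers following the thread, the lock selection described in the quoted patch text can be sketched as a small decision function. This is only an illustration, not the code from the patch: the bit positions `FEAT_PV_UNHALT` and `FEAT_PV_DEDICATED`, the `lock_mode` enum, and `select_lock_mode()` are hypothetical names made up here. The thread also does not say what happens when both flags are set, so the sketch arbitrarily lets PV_UNHALT win in that case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only (not the KVM ABI). */
#define FEAT_PV_UNHALT    (UINT32_C(1) << 0)
#define FEAT_PV_DEDICATED (UINT32_C(1) << 1)

enum lock_mode {
	LOCK_TEST_AND_SET,   /* unfair fallback used when PV_UNHALT is absent */
	LOCK_NATIVE_QUEUED,  /* regular queued spinlock, no halt/kick path */
	LOCK_PV_QUEUED,      /* paravirt queued spinlock with halt/kick */
};

/*
 * Pick a lock implementation from the advertised feature bits, following
 * the behaviour described in the patch text:
 *   - PV_UNHALT set:                paravirt queued spinlock
 *   - only PV_DEDICATED set:        regular queued spinlock
 *   - neither set:                  test-and-set
 * Assumption: PV_UNHALT takes precedence if both bits are set.
 */
static enum lock_mode select_lock_mode(uint32_t features)
{
	if (features & FEAT_PV_UNHALT)
		return LOCK_PV_QUEUED;
	if (features & FEAT_PV_DEDICATED)
		return LOCK_NATIVE_QUEUED;
	return LOCK_TEST_AND_SET;
}

/* Usage example: the three cases named in the patch description. */
static void check_lock_selection(void)
{
	assert(select_lock_mode(0) == LOCK_TEST_AND_SET);
	assert(select_lock_mode(FEAT_PV_DEDICATED) == LOCK_NATIVE_QUEUED);
	assert(select_lock_mode(FEAT_PV_UNHALT) == LOCK_PV_QUEUED);
}
```

The point of the dedicated case is visible in the middle branch: a guest with 1:1 pinned vCPUs keeps the fairness of the queued lock without paying for the halt/kick bookkeeping.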
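
Radim's alternative earlier in the thread, dropping SPIN_THRESHOLD and entering the halt/kick path only when the lock holder is known to be preempted, can be contrasted with the threshold approach in a rough sketch. Everything here is hypothetical and simplified: `wait_by_threshold()`, `wait_by_preemption()`, the `wait_action` enum, and the value of `SPIN_THRESHOLD_EXAMPLE` are illustrative names and numbers, and the `holder_preempted` argument merely stands in for whatever `vcpu_is_preempted()` would report for the holder.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative spin budget; the real tunable may differ. */
#define SPIN_THRESHOLD_EXAMPLE (1u << 15)

enum wait_action {
	WAIT_SPIN,  /* keep spinning on the lock */
	WAIT_HALT,  /* give up the CPU and wait to be kicked */
};

/* Threshold policy: halt once a fixed number of spins has been used up. */
static enum wait_action wait_by_threshold(unsigned int spins,
					  unsigned int threshold)
{
	return spins >= threshold ? WAIT_HALT : WAIT_SPIN;
}

/*
 * Preemption policy: halt only when the lock holder is reported as
 * preempted, regardless of how long the waiter has been spinning.
 */
static enum wait_action wait_by_preemption(bool holder_preempted)
{
	return holder_preempted ? WAIT_HALT : WAIT_SPIN;
}

/* Usage example: a long wait on a holder that is still running. */
static void check_wait_policies(void)
{
	/* The threshold policy halts after the budget is exhausted... */
	assert(wait_by_threshold(SPIN_THRESHOLD_EXAMPLE,
				 SPIN_THRESHOLD_EXAMPLE) == WAIT_HALT);
	/* ...while the preemption policy keeps spinning on a running holder. */
	assert(wait_by_preemption(false) == WAIT_SPIN);
}
```

Under the second policy a pinned, never-preempted holder would never push waiters into the halt/kick path. The drawback Radim notes still applies: if the preempted indication is also set when a vCPU drops into userspace, `holder_preempted` can be true even though the holder was not descheduled by the host.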