[v2,2/3] KVM: X86: Implement PV sched yield hypercall

Message ID: 1559004795-19927-3-git-send-email-wanpengli@tencent.com
State: New, archived
Series: KVM: Yield to IPI target if necessary

Commit Message

Wanpeng Li May 28, 2019, 12:53 a.m. UTC
From: Wanpeng Li <wanpengli@tencent.com>

The target vCPUs are in the runnable state after vcpu_kick and are
suitable as yield targets. This patch implements the sched yield hypercall.

A 17% performance increase in the ebizzy benchmark can be observed in an
over-subscribed environment (w/ kvm-pv-tlb disabled, testing TLB flush
call-function IPI-many, since call-function is not easy to trigger
from a userspace workload).

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/x86.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

Comments

Christian Borntraeger May 28, 2019, 9:11 a.m. UTC | #1
On 28.05.19 02:53, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> The target vCPUs are in the runnable state after vcpu_kick and are
> suitable as yield targets. This patch implements the sched yield hypercall.
>
> A 17% performance increase in the ebizzy benchmark can be observed in an
> over-subscribed environment (w/ kvm-pv-tlb disabled, testing TLB flush
> call-function IPI-many, since call-function is not easy to trigger
> from a userspace workload).
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>

FWIW, we do have a similar interface in s390.

See arch/s390/kvm/diag.c  __diag_time_slice_end_directed for our implementation.
> ---
>  arch/x86/kvm/x86.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e7e57de..2ceef51 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7172,6 +7172,26 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
>  	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
>  }
> 
> +void kvm_sched_yield(struct kvm *kvm, u64 dest_id)
> +{
> +	struct kvm_vcpu *target;
> +	struct kvm_apic_map *map;
> +
> +	rcu_read_lock();
> +	map = rcu_dereference(kvm->arch.apic_map);
> +
> +	if (unlikely(!map))
> +		goto out;
> +
> +	if (map->phys_map[dest_id]->vcpu) {
> +		target = map->phys_map[dest_id]->vcpu;
> +		kvm_vcpu_yield_to(target);
> +	}
> +
> +out:
> +	rcu_read_unlock();
> +}
> +
>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long nr, a0, a1, a2, a3, ret;
> @@ -7218,6 +7238,10 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  	case KVM_HC_SEND_IPI:
>  		ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
>  		break;
> +	case KVM_HC_SCHED_YIELD:
> +		kvm_sched_yield(vcpu->kvm, a0);
> +		ret = 0;
> +		break;
>  	default:
>  		ret = -KVM_ENOSYS;
>  		break;
>

Wanpeng Li May 28, 2019, 9:57 a.m. UTC | #2
On Tue, 28 May 2019 at 17:12, Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
>
> On 28.05.19 02:53, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > The target vCPUs are in the runnable state after vcpu_kick and are
> > suitable as yield targets. This patch implements the sched yield hypercall.
> >
> > A 17% performance increase in the ebizzy benchmark can be observed in an
> > over-subscribed environment (w/ kvm-pv-tlb disabled, testing TLB flush
> > call-function IPI-many, since call-function is not easy to trigger
> > from a userspace workload).
> >
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
>
> FWIW, we do have a similar interface in s390.
>
> See arch/s390/kvm/diag.c  __diag_time_slice_end_directed for our implementation.

Good to know this. :)

Regards,
Wanpeng Li

Liran Alon May 29, 2019, 12:28 p.m. UTC | #3
> On 28 May 2019, at 3:53, Wanpeng Li <kernellwp@gmail.com> wrote:
> 
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> The target vCPUs are in the runnable state after vcpu_kick and are
> suitable as yield targets. This patch implements the sched yield hypercall.
>
> A 17% performance increase in the ebizzy benchmark can be observed in an
> over-subscribed environment (w/ kvm-pv-tlb disabled, testing TLB flush
> call-function IPI-many, since call-function is not easy to trigger
> from a userspace workload).
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
> arch/x86/kvm/x86.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e7e57de..2ceef51 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7172,6 +7172,26 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
> 	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
> }
> 
> +void kvm_sched_yield(struct kvm *kvm, u64 dest_id)
> +{
> +	struct kvm_vcpu *target;
> +	struct kvm_apic_map *map;
> +
> +	rcu_read_lock();
> +	map = rcu_dereference(kvm->arch.apic_map);
> +
> +	if (unlikely(!map))
> +		goto out;
> +

We should have a bounds-check here on “dest_id”.

-Liran

> +	if (map->phys_map[dest_id]->vcpu) {
> +		target = map->phys_map[dest_id]->vcpu;
> +		kvm_vcpu_yield_to(target);
> +	}
> +
> +out:
> +	rcu_read_unlock();
> +}
> +
> int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> {
> 	unsigned long nr, a0, a1, a2, a3, ret;
> @@ -7218,6 +7238,10 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
> 	case KVM_HC_SEND_IPI:
> 		ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
> 		break;
> +	case KVM_HC_SCHED_YIELD:
> +		kvm_sched_yield(vcpu->kvm, a0);
> +		ret = 0;
> +		break;
> 	default:
> 		ret = -KVM_ENOSYS;
> 		break;
> -- 
> 2.7.4
>

Wanpeng Li May 30, 2019, 1:09 a.m. UTC | #4
On Wed, 29 May 2019 at 20:28, Liran Alon <liran.alon@oracle.com> wrote:
>
>
>
> > On 28 May 2019, at 3:53, Wanpeng Li <kernellwp@gmail.com> wrote:
> >
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > The target vCPUs are in the runnable state after vcpu_kick and are
> > suitable as yield targets. This patch implements the sched yield hypercall.
> >
> > A 17% performance increase in the ebizzy benchmark can be observed in an
> > over-subscribed environment (w/ kvm-pv-tlb disabled, testing TLB flush
> > call-function IPI-many, since call-function is not easy to trigger
> > from a userspace workload).
> >
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> > ---
> > arch/x86/kvm/x86.c | 24 ++++++++++++++++++++++++
> > 1 file changed, 24 insertions(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index e7e57de..2ceef51 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -7172,6 +7172,26 @@ void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
> >       kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
> > }
> >
> > +void kvm_sched_yield(struct kvm *kvm, u64 dest_id)
> > +{
> > +     struct kvm_vcpu *target;
> > +     struct kvm_apic_map *map;
> > +
> > +     rcu_read_lock();
> > +     map = rcu_dereference(kvm->arch.apic_map);
> > +
> > +     if (unlikely(!map))
> > +             goto out;
> > +
>
> We should have a bounds-check here on “dest_id”.

Yeah, I'll fix it in v3.

Regards,
Wanpeng Li
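
Note on the bounds check raised in #3: the guest controls dest_id (it arrives as hypercall argument a0), so indexing phys_map[] with it unchecked can read past the end of the array. The v2 code also reads ->vcpu from the phys_map[] slot without first checking that the slot itself is non-NULL. The following is a minimal standalone sketch of a guarded lookup, not the v3 code. The model_* types, the fixed array size, and the max_apic_id field used as the upper bound are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct model_vcpu {
	int id;
};

struct model_lapic {
	struct model_vcpu *vcpu;
};

struct model_apic_map {
	uint32_t max_apic_id;            /* assumed: highest valid index */
	struct model_lapic *phys_map[4]; /* fixed size for this sketch only */
};

/*
 * Return the vCPU registered for dest_id, or NULL when there is no map,
 * dest_id is out of range, or the slot is empty.
 */
struct model_vcpu *lookup_yield_target(const struct model_apic_map *map,
				       uint64_t dest_id)
{
	if (!map)
		return NULL;
	/* Bounds check first: dest_id is untrusted guest input. */
	if (dest_id > map->max_apic_id)
		return NULL;
	/* The slot may be unpopulated; check it before reading ->vcpu. */
	if (!map->phys_map[dest_id])
		return NULL;
	return map->phys_map[dest_id]->vcpu;
}
```

In the patch, a lookup of this shape would sit between the rcu_dereference() of the map and the call to kvm_vcpu_yield_to(), with a NULL result simply skipping the yield.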

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e7e57de..2ceef51 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7172,6 +7172,26 @@  void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
 	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
 }
 
+void kvm_sched_yield(struct kvm *kvm, u64 dest_id)
+{
+	struct kvm_vcpu *target;
+	struct kvm_apic_map *map;
+
+	rcu_read_lock();
+	map = rcu_dereference(kvm->arch.apic_map);
+
+	if (unlikely(!map))
+		goto out;
+
+	if (map->phys_map[dest_id]->vcpu) {
+		target = map->phys_map[dest_id]->vcpu;
+		kvm_vcpu_yield_to(target);
+	}
+
+out:
+	rcu_read_unlock();
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -7218,6 +7238,10 @@  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 	case KVM_HC_SEND_IPI:
 		ret = kvm_pv_send_ipi(vcpu->kvm, a0, a1, a2, a3, op_64_bit);
 		break;
+	case KVM_HC_SCHED_YIELD:
+		kvm_sched_yield(vcpu->kvm, a0);
+		ret = 0;
+		break;
 	default:
 		ret = -KVM_ENOSYS;
 		break;
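
Note on the return value: in the hunk above, the KVM_HC_SCHED_YIELD case sets ret = 0 unconditionally, so the guest cannot tell from the result whether a yield actually took place, and the call behaves as a best-effort hint. The standalone model below sketches that dispatch shape. The model_* names, the numeric constants, and the counter used to observe the request are assumptions of this sketch, not kernel definitions.

```c
#include <stdint.h>

/* Illustrative constants; the numeric values are assumptions of this sketch. */
#define MODEL_HC_SCHED_YIELD 11
#define MODEL_ENOSYS         1000

/* Records yield requests so the effect of a hypercall can be observed. */
struct model_host {
	int yield_requests;
	uint64_t last_dest_id;
};

/* Stand-in for kvm_sched_yield(): only records the request. */
void model_sched_yield(struct model_host *host, uint64_t dest_id)
{
	host->yield_requests++;
	host->last_dest_id = dest_id;
}

/* Mirrors the switch in the patch: a yield request always returns 0. */
long model_emulate_hypercall(struct model_host *host, unsigned long nr,
			     unsigned long a0)
{
	long ret;

	switch (nr) {
	case MODEL_HC_SCHED_YIELD:
		model_sched_yield(host, a0);
		ret = 0;
		break;
	default:
		/* Unknown hypercall numbers are rejected. */
		ret = -MODEL_ENOSYS;
		break;
	}
	return ret;
}
```

A caller written against this model would treat 0 as "request accepted" rather than "target is now running", and fall back to its normal wait path either way.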