[2/2] KVM: LAPIC: micro-optimize fixed mode ipi delivery

Message ID 1573283135-5502-2-git-send-email-wanpengli@tencent.com
State New
Series
  • [1/2] KVM: X86: Single target IPI fastpath

Commit Message

Wanpeng Li Nov. 9, 2019, 7:05 a.m. UTC
From: Wanpeng Li <wanpengli@tencent.com>

After disabling mwait/halt/pause vmexits, IPIs such as RESCHEDULE_VECTOR
and CALL_FUNCTION_SINGLE_VECTOR are among the main remaining causes of
vmexits observed in production environments, and PV IPIs can't optimize
them. This patch is a follow-up to commit 0e6d242eccdb (KVM: LAPIC:
Micro optimize IPI latency); it removes redundant logic that runs before
a fixed mode IPI is delivered in the fast path.

- Broadcast handling has to take the slow path anyway, so the delivery
  mode fixup can be deferred until just before the slow path.
- With APICv, the hypervisor no longer intercepts self-IPIs, and
  machines that support APICv are common now. In addition,
  kvm_apic_map_get_dest_lapic() can handle the self-IPI, so there is no
  need for a shortcut in the non-APICv case.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
 arch/x86/kvm/irq_comm.c | 6 +++---
 arch/x86/kvm/lapic.c    | 5 -----
 2 files changed, 3 insertions(+), 8 deletions(-)
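
The reordering described in the first bullet can be sketched outside the kernel as follows. This is a minimal userspace model, not the KVM code: struct irq_model, deliver_fast(), deliver() and the fixups_run counter are hypothetical names, and the model assumes, as the commit message states, that broadcasts are never handled by the fast path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's delivery mode constants. */
enum delivery_mode { DM_FIXED, DM_LOWEST };

/* Hypothetical, heavily simplified model of struct kvm_lapic_irq. */
struct irq_model {
	int dest_mode;                    /* 0 = physical, 1 = logical */
	unsigned int dest_id;             /* 0xff = physical broadcast */
	enum delivery_mode delivery_mode;
};

static int fixups_run; /* how many times the broadcast fixup executed */

/* Model of the fast path: broadcasts are declined and fall through. */
static bool deliver_fast(const struct irq_model *irq, int *r)
{
	if (irq->dest_mode == 0 && irq->dest_id == 0xff)
		return false;
	*r = 1; /* pretend exactly one vCPU accepted the interrupt */
	return true;
}

/* Model of the reordered slow-path entry: the fast path is tried first. */
static int deliver(struct irq_model *irq)
{
	int r = -1;

	if (deliver_fast(irq, &r))
		return r;

	/* Only reached when the fast path declined the interrupt. */
	if (irq->dest_mode == 0 && irq->dest_id == 0xff &&
	    irq->delivery_mode == DM_LOWEST) {
		irq->delivery_mode = DM_FIXED;
		fixups_run++;
	}
	return 0; /* actual slow path delivery is elided in this model */
}

/* Usage: a unicast IPI skips the fixup check, a broadcast still gets it. */
void example(void)
{
	struct irq_model unicast = { 0, 3, DM_FIXED };
	struct irq_model bcast = { 0, 0xff, DM_LOWEST };

	assert(deliver(&unicast) == 1 && fixups_run == 0);
	assert(deliver(&bcast) == 0 && fixups_run == 1);
	assert(bcast.delivery_mode == DM_FIXED);
}
```

In this model a unicast interrupt returns from deliver() without evaluating the broadcast condition at all, which is the saving the patch is after; the broadcast case behaves as before.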

Comments

Paolo Bonzini Nov. 11, 2019, 9:59 p.m. UTC | #1
On 09/11/19 08:05, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> After disabling mwait/halt/pause vmexits, IPIs such as RESCHEDULE_VECTOR
> and CALL_FUNCTION_SINGLE_VECTOR are among the main remaining causes of
> vmexits observed in production environments, and PV IPIs can't optimize
> them. This patch is a follow-up to commit 0e6d242eccdb (KVM: LAPIC:
> Micro optimize IPI latency); it removes redundant logic that runs before
> a fixed mode IPI is delivered in the fast path.
>
> - Broadcast handling has to take the slow path anyway, so the delivery
>   mode fixup can be deferred until just before the slow path.

I agree with this part, but is the cost of the irq->shorthand check
really measurable?

Paolo

> - With APICv, the hypervisor no longer intercepts self-IPIs, and
>   machines that support APICv are common now. In addition,
>   kvm_apic_map_get_dest_lapic() can handle the self-IPI, so there is no
>   need for a shortcut in the non-APICv case.
> 
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
>  arch/x86/kvm/irq_comm.c | 6 +++---
>  arch/x86/kvm/lapic.c    | 5 -----
>  2 files changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
> index 8ecd48d..aa88156 100644
> --- a/arch/x86/kvm/irq_comm.c
> +++ b/arch/x86/kvm/irq_comm.c
> @@ -52,15 +52,15 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
>  	unsigned long dest_vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)];
>  	unsigned int dest_vcpus = 0;
>  
> +	if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map))
> +		return r;
> +
>  	if (irq->dest_mode == 0 && irq->dest_id == 0xff &&
>  			kvm_lowest_prio_delivery(irq)) {
>  		printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n");
>  		irq->delivery_mode = APIC_DM_FIXED;
>  	}
>  
> -	if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map))
> -		return r;
> -
>  	memset(dest_vcpu_bitmap, 0, sizeof(dest_vcpu_bitmap));
>  
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index b29d00b..ea936fa 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -951,11 +951,6 @@ bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
>  
>  	*r = -1;
>  
> -	if (irq->shorthand == APIC_DEST_SELF) {
> -		*r = kvm_apic_set_irq(src->vcpu, irq, dest_map);
> -		return true;
> -	}
> -
>  	rcu_read_lock();
>  	map = rcu_dereference(kvm->arch.apic_map);
>  
>
Wanpeng Li Nov. 12, 2019, 1:34 a.m. UTC | #2
On Tue, 12 Nov 2019 at 05:59, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 09/11/19 08:05, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > After disabling mwait/halt/pause vmexits, IPIs such as RESCHEDULE_VECTOR
> > and CALL_FUNCTION_SINGLE_VECTOR are among the main remaining causes of
> > vmexits observed in production environments, and PV IPIs can't optimize
> > them. This patch is a follow-up to commit 0e6d242eccdb (KVM: LAPIC:
> > Micro optimize IPI latency); it removes redundant logic that runs before
> > a fixed mode IPI is delivered in the fast path.
> >
> > - Broadcast handling has to take the slow path anyway, so the delivery
> >   mode fixup can be deferred until just before the slow path.
>
> I agree with this part, but is the cost of the irq->shorthand check
> really measurable?

I can drop the second part for v2.

    Wanpeng
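
For readers following the second part under discussion, the claim that the destination lookup can cover a self-IPI without a dedicated early return can be illustrated with a small userspace model. Everything here is a hypothetical sketch: lookup_dest() only mimics the role the commit message attributes to kvm_apic_map_get_dest_lapic(), and the types and names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ICR destination shorthands. */
enum shorthand { SH_NONE, SH_SELF, SH_ALL_INC_SELF, SH_ALL_BUT_SELF };

/* Hypothetical, heavily simplified model of a local APIC. */
struct lapic_model {
	unsigned int id;
	int pending; /* number of interrupts queued on this APIC */
};

/* Model of the destination lookup: it resolves "self" to the sender. */
static bool lookup_dest(struct lapic_model *src, enum shorthand sh,
			struct lapic_model *map, size_t n,
			unsigned int dest_id, struct lapic_model **dst)
{
	if (sh == SH_SELF && src) {
		*dst = src;
		return true;
	}
	if (sh != SH_NONE)
		return false; /* other shorthands are left to the slow path */
	if (dest_id >= n)
		return false;
	*dst = &map[dest_id];
	return true;
}

/* Fast path model without a dedicated self-IPI early return. */
static bool deliver_fast_no_shortcut(struct lapic_model *src,
				     enum shorthand sh,
				     struct lapic_model *map, size_t n,
				     unsigned int dest_id)
{
	struct lapic_model *dst;

	if (!lookup_dest(src, sh, map, n, dest_id, &dst))
		return false;
	dst->pending++;
	return true;
}

/* Usage: the self-IPI lands on the sender via the ordinary lookup. */
void self_ipi_example(void)
{
	struct lapic_model map[2] = { { 0, 0 }, { 1, 0 } };

	assert(deliver_fast_no_shortcut(&map[1], SH_SELF, map, 2, 0));
	assert(map[1].pending == 1 && map[0].pending == 0);
	assert(!deliver_fast_no_shortcut(&map[0], SH_ALL_BUT_SELF, map, 2, 0));
}
```

The model says nothing about cost. Whether skipping the explicit check is measurable is exactly the open question raised in the review.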

Patch

diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 8ecd48d..aa88156 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -52,15 +52,15 @@  int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 	unsigned long dest_vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)];
 	unsigned int dest_vcpus = 0;
 
+	if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map))
+		return r;
+
 	if (irq->dest_mode == 0 && irq->dest_id == 0xff &&
 			kvm_lowest_prio_delivery(irq)) {
 		printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n");
 		irq->delivery_mode = APIC_DM_FIXED;
 	}
 
-	if (kvm_irq_delivery_to_apic_fast(kvm, src, irq, &r, dest_map))
-		return r;
-
 	memset(dest_vcpu_bitmap, 0, sizeof(dest_vcpu_bitmap));
 
 	kvm_for_each_vcpu(i, vcpu, kvm) {
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b29d00b..ea936fa 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -951,11 +951,6 @@  bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
 
 	*r = -1;
 
-	if (irq->shorthand == APIC_DEST_SELF) {
-		*r = kvm_apic_set_irq(src->vcpu, irq, dest_map);
-		return true;
-	}
-
 	rcu_read_lock();
 	map = rcu_dereference(kvm->arch.apic_map);