From patchwork Thu May 19 13:27:06 2016
X-Patchwork-Submitter: Wanpeng Li
X-Patchwork-Id: 9127617
From: Wanpeng Li
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Wanpeng Li, Paolo Bonzini, Radim Krčmář, David Matlack, Christian Borntraeger
Subject: [PATCH v2] KVM: halt-polling: poll if emulated lapic timer will fire soon
Date: Thu, 19 May 2016 21:27:06 +0800
Message-Id: <1463664426-2991-1-git-send-email-wanpeng.li@hotmail.com>
From: Wanpeng Li

If an emulated lapic timer will fire soon (within 10us, the base of dynamic
halt-polling; the lower end of message-passing workload latency, e.g. TCP_RR,
has a poll time of less than 10us), we can treat the halt as a short one and
poll until the timer fires. The expiry callback apic_timer_fn() will set
KVM_REQ_PENDING_TIMER, and this flag is checked during the busy poll. This
avoids the context-switch overhead and the latency of waking up the vCPU.
iperf TCP gets a ~6% bandwidth improvement.

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: David Matlack
Cc: Christian Borntraeger
Signed-off-by: Wanpeng Li
---
v1 -> v2:
 * add a return statement for non-x86 archs
 * capture the never-expire case for x86 (hrtimer is not started)

(An illustrative sketch of the resulting poll decision is appended after the
diff.)

 arch/arm/include/asm/kvm_host.h     |  4 ++++
 arch/arm64/include/asm/kvm_host.h   |  4 ++++
 arch/mips/include/asm/kvm_host.h    |  4 ++++
 arch/powerpc/include/asm/kvm_host.h |  4 ++++
 arch/s390/include/asm/kvm_host.h    |  4 ++++
 arch/x86/kvm/lapic.c                | 11 +++++++++++
 arch/x86/kvm/lapic.h                |  1 +
 arch/x86/kvm/x86.c                  |  5 +++++
 include/linux/kvm_host.h            |  1 +
 virt/kvm/kvm_main.c                 | 14 ++++++++++----
 10 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 4cd8732..a5fd858 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -284,6 +284,10 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 static inline void kvm_arm_init_debug(void) {}
 static inline void kvm_arm_setup_debug(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d49399d..94e227a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -359,6 +359,10 @@ static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 void kvm_arm_init_debug(void);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 9a37a10..456bc42 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -813,6 +813,10 @@ static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
 #endif /* __MIPS_KVM_HOST_H__ */
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index ec35af3..5986c79 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -729,5 +729,9 @@ static inline void kvm_arch_exit(void) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 37b9017..bdb01a1 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -696,6 +696,10 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 		struct kvm_memory_slot *slot) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
+static inline u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return -1ULL;
+}
 
 void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index bbb5b28..cfeeac3 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -256,6 +256,17 @@ static inline int apic_lvtt_tscdeadline(struct kvm_lapic *apic)
 	return apic->lapic_timer.timer_mode == APIC_LVT_TIMER_TSCDEADLINE;
 }
 
+u64 apic_get_timer_expire(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+	struct hrtimer *timer = &apic->lapic_timer.timer;
+
+	if (!hrtimer_active(timer))
+		return -1ULL;
+	else
+		return ktime_to_ns(hrtimer_get_remaining(timer));
+}
+
 static inline int apic_lvt_nmi_mode(u32 lvt_val)
 {
 	return (lvt_val & (APIC_MODE_MASK | APIC_LVT_MASKED)) == APIC_DM_NMI;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 891c6da..ee4da6c 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -212,4 +212,5 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
 			struct kvm_vcpu **dest_vcpu);
 int kvm_vector_to_index(u32 vector, u32 dest_vcpus,
 			const unsigned long *bitmap, u32 bitmap_size);
+u64 apic_get_timer_expire(struct kvm_vcpu *vcpu);
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a8c7ca3..9b5ad99 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7623,6 +7623,11 @@ bool kvm_vcpu_compatible(struct kvm_vcpu *vcpu)
 struct static_key kvm_no_apic_vcpu __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_no_apic_vcpu);
 
+u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu)
+{
+	return apic_get_timer_expire(vcpu);
+}
+
 int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 {
 	struct page *page;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b1fa8f1..14d6c23 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -663,6 +663,7 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target);
 void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
+u64 kvm_arch_timer_remaining(struct kvm_vcpu *vcpu);
 
 void kvm_flush_remote_tlbs(struct kvm *kvm);
 void kvm_reload_remote_mmus(struct kvm *kvm);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dd4ac9d..e4bb30b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -78,6 +78,9 @@ module_param(halt_poll_ns_grow, uint, S_IRUGO | S_IWUSR);
 static unsigned int halt_poll_ns_shrink;
 module_param(halt_poll_ns_shrink, uint, S_IRUGO | S_IWUSR);
 
+/* lower-end of message passing workload latency TCP_RR's poll time < 10us */
+static unsigned int halt_poll_ns_base = 10000;
+
 /*
  * Ordering of locks:
  *
@@ -1966,7 +1969,7 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 	grow = READ_ONCE(halt_poll_ns_grow);
 	/* 10us base */
 	if (val == 0 && grow)
-		val = 10000;
+		val = halt_poll_ns_base;
 	else
 		val *= grow;
 
@@ -2014,12 +2017,15 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	ktime_t start, cur;
 	DECLARE_SWAITQUEUE(wait);
 	bool waited = false;
-	u64 block_ns;
+	u64 block_ns, delta, remaining;
 
+	remaining = kvm_arch_timer_remaining(vcpu);
 	start = cur = ktime_get();
-	if (vcpu->halt_poll_ns) {
-		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
+	if (vcpu->halt_poll_ns || remaining < halt_poll_ns_base) {
+		ktime_t stop;
+		delta = vcpu->halt_poll_ns ? vcpu->halt_poll_ns : remaining;
+		stop = ktime_add_ns(ktime_get(), delta);
 
 		++vcpu->stat.halt_attempted_poll;
 		do {
 			/*
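
For illustration only (not part of the patch): a condensed, comment-annotated
sketch of the poll decision kvm_vcpu_block() makes after this change. It
restates the virt/kvm/kvm_main.c hunks above using the same names; the
trailing "..." stands for the unchanged remainder of the polling loop.

	/* -1ULL when no emulated lapic timer is armed, see apic_get_timer_expire() */
	remaining = kvm_arch_timer_remaining(vcpu);

	if (vcpu->halt_poll_ns || remaining < halt_poll_ns_base) {
		/*
		 * Busy-poll either because dynamic halt-polling already chose
		 * a poll window, or because the emulated lapic timer will fire
		 * within the 10us base window; poll for at most 'delta' ns.
		 */
		delta = vcpu->halt_poll_ns ? vcpu->halt_poll_ns : remaining;
		stop = ktime_add_ns(ktime_get(), delta);
		/*
		 * While polling, apic_timer_fn() sets KVM_REQ_PENDING_TIMER
		 * when the timer fires; the poll loop's completion check sees
		 * the flag, so the vCPU avoids a schedule-out and wake-up.
		 */
		...
	}

When no timer is armed, kvm_arch_timer_remaining() returns -1ULL, so the new
condition on its own never triggers polling (the never-expire case handled in
v2).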