From patchwork Fri Aug 18 14:11:28 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Wanpeng Li <kernellwp@gmail.com>
X-Patchwork-Id: 9909101
Return-Path: <kvm-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	2019760386 for <patchwork-kvm@patchwork.kernel.org>;
	Fri, 18 Aug 2017 14:11:48 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 13C1328CD5
	for <patchwork-kvm@patchwork.kernel.org>;
	Fri, 18 Aug 2017 14:11:48 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 0897628CD8; Fri, 18 Aug 2017 14:11:48 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-6.5 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_HI,
	RCVD_IN_SORBS_SPAM autolearn=unavailable version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8EE6228CD5
	for <patchwork-kvm@patchwork.kernel.org>;
	Fri, 18 Aug 2017 14:11:47 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753163AbdHROLh (ORCPT
	<rfc822;patchwork-kvm@patchwork.kernel.org>);
	Fri, 18 Aug 2017 10:11:37 -0400
Received: from mail-pg0-f65.google.com ([74.125.83.65]:34122 "EHLO
	mail-pg0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751446AbdHROLd (ORCPT <rfc822; kvm@vger.kernel.org>);
	Fri, 18 Aug 2017 10:11:33 -0400
Received: by mail-pg0-f65.google.com with SMTP id y192so15257446pgd.1;
	Fri, 18 Aug 2017 07:11:33 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gmail.com; s=20161025;
	h=from:to:cc:subject:date:message-id:mime-version
	:content-transfer-encoding;
	bh=WNF/2zplc7NOhwRNyP/68gl+nq5Qr7L4XTrlPZSa6dE=;
	b=YxLSqfq5ct2xhcmyL88xwLPAB3guVE14+OenofZEXtKe2xyMimkkA3gncsoiYvpuOm
	wHOGxss7L2Jwxz+W/Gl56yi8lGhRQiwHdVleyszzEO7YMzTl+5OJ0lvNNACdJuauvcud
	baIERevhtV0e/292WSL3eyfxgAofUXHfl6ZtHwA5IvttqiG+b78jJC+iBAKXoakrkuXp
	UFKD6ljnkua7b59vrDl8Xhk3gJ02Bpj0cI8lQJ+oJMNlmRx9Lw3rOeAAMRYAFtTkc19+
	KNrnlik+fsS9hRmjVKBXVd8F+WXeP7YYeqaAyBPvzx0LxOJlCY6hRhloGjeKoxdvx0qy
	YfXw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version
	:content-transfer-encoding;
	bh=WNF/2zplc7NOhwRNyP/68gl+nq5Qr7L4XTrlPZSa6dE=;
	b=SnQpqyg0FQeppUsASZuOKXJxwjpELwDo/JHtB8BgiMDYQxOkDyBENUlnm8uzIruL1r
	bRTEOjA+jtVgZYRD0DMqfyVqgqcsyRCLmSfMdnm+XuZyTF3EtM+QWXkdo9QMKsIlOzw7
	HmWoPTfl75mv7NAwi0TUSrXdMOIUjFywh7cbHdXDhPOBsqxaSuUztWAehixCX6s6P37l
	sxJLbmf+ruCGe7reXHu42DdmBKylIvEIJ9FbS2KxO2Q6QtiyF0rP+96JDV1AKphdk/u0
	3qbrZERbs5cYmtkUb7YAHFISk50kRzAFfrg7pv4PAfZykGB97mESUHbzAcyO2lPXebHP
	Dl0A==
X-Gm-Message-State: AHYfb5ibRmKUx+Iu+AVC0sGhA24iVoU1V6mSrPr4E5WlNCpZ622h6zLE
	WzwNXXW8LmAkAOyf
X-Received: by 10.84.231.130 with SMTP id g2mr9909153plk.342.1503065492877;
	Fri, 18 Aug 2017 07:11:32 -0700 (PDT)
Received: from localhost ([223.72.80.220]) by smtp.gmail.com with ESMTPSA id
	d28sm14042146pfb.139.2017.08.18.07.11.31
	(version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Fri, 18 Aug 2017 07:11:32 -0700 (PDT)
From: Wanpeng Li <kernellwp@gmail.com>
X-Google-Original-From: Wanpeng Li <wanpeng.li@hotmail.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	=?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= <rkrcmar@redhat.com>,
	Wanpeng Li <wanpeng.li@hotmail.com>
Subject: [PATCH v2] KVM: nVMX: Fix trying to cancel vmlaunch/vmresume
Date: Fri, 18 Aug 2017 07:11:28 -0700
Message-Id: <1503065488-108870-1-git-send-email-wanpeng.li@hotmail.com>
X-Mailer: git-send-email 2.7.4
MIME-Version: 1.0
Sender: kvm-owner@vger.kernel.org
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Wanpeng Li <wanpeng.li@hotmail.com>

------------[ cut here ]------------
WARNING: CPU: 7 PID: 3861 at /home/kernel/ssd/kvm/arch/x86/kvm//vmx.c:11299 nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
CPU: 7 PID: 3861 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0-rc4+ #11
RIP: 0010:nested_vmx_vmexit+0x176e/0x1980 [kvm_intel]
Call Trace:
 ? kvm_multiple_exception+0x149/0x170 [kvm]
 ? handle_emulation_failure+0x79/0x230 [kvm]
 ? load_vmcs12_host_state+0xa80/0xa80 [kvm_intel]
 ? check_chain_key+0x137/0x1e0
 ? reexecute_instruction.part.168+0x130/0x130 [kvm]
 nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 ? nested_vmx_inject_exception_vmexit+0xb7/0x100 [kvm_intel]
 vmx_queue_exception+0x197/0x300 [kvm_intel]
 kvm_arch_vcpu_ioctl_run+0x1b0c/0x2c90 [kvm]
 ? kvm_arch_vcpu_runnable+0x220/0x220 [kvm]
 ? preempt_count_sub+0x18/0xc0
 ? restart_apic_timer+0x17d/0x300 [kvm]
 ? kvm_lapic_restart_hv_timer+0x37/0x50 [kvm]
 ? kvm_arch_vcpu_load+0x1d8/0x350 [kvm]
 kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_vcpu_ioctl+0x4e4/0x910 [kvm]
 ? kvm_dev_ioctl+0xbe0/0xbe0 [kvm]

The "nested_run_pending" flag can override the decision of which guest should
run next, L1 or L2. nested_run_pending=1 means that we *must* run L2 next, not
L1. This is necessary in particular when L1 did a VMLAUNCH of L2 and therefore
expects L2 to be run (and perhaps be injected with an event it specified, etc.).
nested_run_pending is especially intended to avoid switching to L1 at the
injection decision point.

The warning above is hit in the queue exception path, where a nested VM exit
to L1 is attempted while nested_run_pending is set. This patch fixes it by
requesting an immediate VM exit from L2 and keeping the exception for L1
pending for a subsequent nested VM exit.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/svm.c              |  6 ++++--
 arch/x86/kvm/vmx.c              | 15 ++++++++++-----
 arch/x86/kvm/x86.c              |  4 ++--
 4 files changed, 17 insertions(+), 10 deletions(-)
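
For reviewers, the decision order that vmx_queue_exception() follows after
this patch can be sketched as a small standalone model. This is only an
illustration, not kernel code: struct model_vcpu, enum model_outcome and
model_queue_exception() are hypothetical names, and the l1_intercepts field
stands in for the result of nested_vmx_check_exception().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state vmx_queue_exception() consults. */
struct model_vcpu {
	bool reinject;           /* exception is being re-injected */
	bool guest_mode;         /* vCPU is currently in L2 (is_guest_mode) */
	bool nested_run_pending; /* L1 did VMLAUNCH/VMRESUME, L2 must run next */
	bool l1_intercepts;      /* assumed result of nested_vmx_check_exception() */
};

/* What the model decided to do with the queued exception. */
enum model_outcome {
	MODEL_INJECTED,      /* written to the VM-entry fields for the current guest */
	MODEL_NESTED_VMEXIT, /* reflected to L1 through a nested VM exit */
	MODEL_DEFERRED,      /* left pending; caller is told to retry later */
};

/*
 * Mirrors the order of checks in the patched function: a pending nested
 * run is tested first, so no nested VM exit is attempted before L2 has
 * actually been entered.
 */
static int model_queue_exception(const struct model_vcpu *v,
				 enum model_outcome *out)
{
	if (!v->reinject && v->guest_mode) {
		if (v->nested_run_pending) {
			*out = MODEL_DEFERRED;
			return -EBUSY;
		}
		if (v->l1_intercepts) {
			*out = MODEL_NESTED_VMEXIT;
			return 0;
		}
	}

	*out = MODEL_INJECTED;
	return 0;
}
```

In the model, -EBUSY corresponds to the case the commit message describes:
the exception stays pending for L1 and inject_pending_event() passes the
nonzero value up to its caller. The caller's handling of that value is not
part of this diff, so it is left out of the sketch.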

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6db0ed9..e1e6f00 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -962,7 +962,7 @@ struct kvm_x86_ops {
 				unsigned char *hypercall_addr);
 	void (*set_irq)(struct kvm_vcpu *vcpu);
 	void (*set_nmi)(struct kvm_vcpu *vcpu);
-	void (*queue_exception)(struct kvm_vcpu *vcpu);
+	int (*queue_exception)(struct kvm_vcpu *vcpu);
 	void (*cancel_injection)(struct kvm_vcpu *vcpu);
 	int (*interrupt_allowed)(struct kvm_vcpu *vcpu);
 	int (*nmi_allowed)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c21b49b..bee1937 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -632,7 +632,7 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 	svm_set_interrupt_shadow(vcpu, 0);
 }
 
-static void svm_queue_exception(struct kvm_vcpu *vcpu)
+static int svm_queue_exception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	unsigned nr = vcpu->arch.exception.nr;
@@ -646,7 +646,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
 	 */
 	if (!reinject &&
 	    nested_svm_check_exception(svm, nr, has_error_code, error_code))
-		return;
+		return 0;
 
 	if (nr == BP_VECTOR && !static_cpu_has(X86_FEATURE_NRIPS)) {
 		unsigned long rip, old_rip = kvm_rip_read(&svm->vcpu);
@@ -669,6 +669,8 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
 		| (has_error_code ? SVM_EVTINJ_VALID_ERR : 0)
 		| SVM_EVTINJ_TYPE_EXEPT;
 	svm->vmcb->control.event_inj_err = error_code;
+
+	return 0;
 }
 
 static void svm_init_erratum_383(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e398946..cd0ab5d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2500,7 +2500,7 @@ static int nested_vmx_check_exception(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static void vmx_queue_exception(struct kvm_vcpu *vcpu)
+static int vmx_queue_exception(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	unsigned nr = vcpu->arch.exception.nr;
@@ -2509,9 +2509,12 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
 	u32 error_code = vcpu->arch.exception.error_code;
 	u32 intr_info = nr | INTR_INFO_VALID_MASK;
 
-	if (!reinject && is_guest_mode(vcpu) &&
-	    nested_vmx_check_exception(vcpu))
-		return;
+	if (!reinject && is_guest_mode(vcpu)) {
+		if (vmx->nested.nested_run_pending)
+			return -EBUSY;
+		if (nested_vmx_check_exception(vcpu))
+			return 0;
+	}
 
 	if (has_error_code) {
 		vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
@@ -2524,7 +2527,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
 			inc_eip = vcpu->arch.event_exit_inst_len;
 		if (kvm_inject_realmode_interrupt(vcpu, nr, inc_eip) != EMULATE_DONE)
 			kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
-		return;
+		return 0;
 	}
 
 	if (kvm_exception_is_soft(nr)) {
@@ -2535,6 +2538,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
 		intr_info |= INTR_TYPE_HARD_EXCEPTION;
 
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
+
+	return 0;
 }
 
 static bool vmx_rdtscp_supported(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 008a0b1..8e53de2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6356,8 +6356,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
 			kvm_update_dr7(vcpu);
 		}
 
-		kvm_x86_ops->queue_exception(vcpu);
-		return 0;
+		r = kvm_x86_ops->queue_exception(vcpu);
+		return r;
 	}
 
 	if (vcpu->arch.nmi_injected) {