From patchwork Wed Aug 2 10:48:23 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Wanpeng Li
X-Patchwork-Id: 9876613
From: Wanpeng Li
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li
Subject: [PATCH v3] KVM: nVMX: Fix attempting to emulate "Acknowledge interrupt on exit" when there is no interrupt which L1 requires to inject to L2
Date: Wed, 2 Aug 2017 03:48:23 -0700
Message-Id: <1501670903-3368-1-git-send-email-wanpeng.li@hotmail.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: kvm@vger.kernel.org

From: Wanpeng Li

------------[ cut here ]------------
WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70
[kvm_intel]
CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
Call Trace:
 vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
 ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
 kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
 ? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
 ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
 kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
 ? __fget+0xfc/0x210
 do_vfs_ioctl+0xa4/0x6a0
 ? __fget+0x11d/0x210
 SyS_ioctl+0x79/0x90
 do_syscall_64+0x8f/0x750
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 entry_SYSCALL64_slow_path+0x25/0x25

This can be reproduced by booting the L1 guest with the 'noapic' grub
parameter, which tells the kernel not to make use of any IOAPICs that
may be present in the system.

The external_intr variable in vmx_check_nested_events() is actually the
req_int_win variable passed from vcpu_enter_guest(), which means that
L0's userspace requests an irq window. I observed that the condition
(!kvm_cpu_has_interrupt(vcpu) && L0's userspace requests an irq window)
is true in this scenario, so there is no interrupt which L1 requires to
inject to L2, and we should not attempt to emulate "Acknowledge
interrupt on exit" for the irq window request.

The SDM says that with acknowledge interrupt on exit, bit 31 of the
VM-exit interrupt information (valid interrupt) is always set to 1 on
EXIT_REASON_EXTERNAL_INTERRUPT. We don't want to break hypervisors
expecting an interrupt in that case, so we should do a userspace VM
exit when the window is open and then inject the userspace interrupt
with a VM exit.

Cc: Paolo Bonzini
Cc: Radim Krčmář
Signed-off-by: Wanpeng Li
---
v2 -> v3:
 * request an irq window and don't nested vmexit
v1 -> v2:
 * update patch description
 * check nested_exit_intr_ack_set() first

 arch/x86/kvm/vmx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 80b20e8..9ef2ec3 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10761,7 +10761,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 		return 0;
 	}
 
-	if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
+	if ((kvm_cpu_has_interrupt(vcpu) ||
+	    (external_intr && !nested_exit_intr_ack_set(vcpu))) &&
 	    nested_exit_on_intr(vcpu)) {
 		if (vmx->nested.nested_run_pending)
 			return -EBUSY;
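
For readers who want to see the effect of the changed condition in
isolation, here is a minimal standalone sketch. It is not kernel code:
struct model_vcpu, its fields, and the three functions are hypothetical
names used only for illustration. Each boolean field stands in for the
result of the kernel helper named in its comment. The sketch models only
the boolean condition from the hunk above, under the assumption that the
three helpers can be treated as independent flags.

```c
#include <stdbool.h>

/* Hypothetical model of the state the real check consults. */
struct model_vcpu {
	bool has_interrupt; /* stands in for kvm_cpu_has_interrupt(vcpu) */
	bool exit_on_intr;  /* stands in for nested_exit_on_intr(vcpu) */
	bool intr_ack_set;  /* stands in for nested_exit_intr_ack_set(vcpu) */
};

/* Condition as it was before the patch. */
bool nested_exit_old(const struct model_vcpu *v, bool external_intr)
{
	return (v->has_interrupt || external_intr) && v->exit_on_intr;
}

/* Condition as it is after the patch. */
bool nested_exit_new(const struct model_vcpu *v, bool external_intr)
{
	return (v->has_interrupt ||
		(external_intr && !v->intr_ack_set)) && v->exit_on_intr;
}

/*
 * The reported case: no interrupt pending, userspace requests an irq
 * window, and L1 has "Acknowledge interrupt on exit" set.
 * Returns true when the old condition fires and the new one does not.
 */
bool reported_case_differs(void)
{
	struct model_vcpu v = {
		.has_interrupt = false,
		.exit_on_intr = true,
		.intr_ack_set = true,
	};

	return nested_exit_old(&v, true) && !nested_exit_new(&v, true);
}
```

In this model the two conditions differ only when has_interrupt is
false, external_intr is true, and intr_ack_set is true, which is the
case described in the commit message. When an interrupt is actually
pending, or when L1 does not use "Acknowledge interrupt on exit", both
versions give the same answer.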