From patchwork Tue Aug  1 19:59:00 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Radim Krčmář
X-Patchwork-Id: 9875401
Date: Tue, 1 Aug 2017 21:59:00 +0200
From: Radim Krčmář
To: Wanpeng Li
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Wanpeng Li
Subject: Re: [PATCH v2] KVM: nVMX: Fix attempting to emulate "Acknowledge interrupt on exit" when there is no interrupt which L1 requires to inject to L2
Message-ID: <20170801195859.GB1437@flask>
References: <1501554327-3608-1-git-send-email-wanpeng.li@hotmail.com>
In-Reply-To: <1501554327-3608-1-git-send-email-wanpeng.li@hotmail.com>
X-Mailing-List: kvm@vger.kernel.org

2017-07-31 19:25-0700, Wanpeng Li:
> From: Wanpeng Li
>
> ------------[ cut here ]------------
> WARNING: CPU: 5 PID: 2288 at arch/x86/kvm/vmx.c:11124 nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
> CPU: 5 PID: 2288 Comm: qemu-system-x86 Not tainted 4.13.0-rc2+ #7
> RIP: 0010:nested_vmx_vmexit+0xd64/0xd70 [kvm_intel]
> Call Trace:
>  vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
>  ? vmx_check_nested_events+0x131/0x1f0 [kvm_intel]
>  kvm_arch_vcpu_ioctl_run+0x5dd/0x1be0 [kvm]
>  ? vmx_vcpu_load+0x1be/0x220 [kvm_intel]
>  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
>  kvm_vcpu_ioctl+0x340/0x700 [kvm]
>  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
>  ? __fget+0xfc/0x210
>  do_vfs_ioctl+0xa4/0x6a0
>  ? __fget+0x11d/0x210
>  SyS_ioctl+0x79/0x90
>  do_syscall_64+0x8f/0x750
>  ? trace_hardirqs_on_thunk+0x1a/0x1c
>  entry_SYSCALL64_slow_path+0x25/0x25
>
> This can be reproduced by booting the L1 guest with the 'noapic' grub
> parameter, which tells the kernel not to make use of any IOAPICs that
> may be present in the system.
>
> Actually the external_intr variable in nested_vmx_vmexit() is the
> req_int_win variable passed from vcpu_enter_guest(), which means that
> L0's userspace requests an irq window. I observed that the scenario
> (!kvm_cpu_has_interrupt(vcpu) && L0's userspace requests an irq window)
> is true, so there is no interrupt which L1 requires to inject to L2; we
> should not attempt to emulate "Acknowledge interrupt on exit" for the
> irq window requirement in this scenario.
>
> This patch fixes it by not attempting to emulate "Acknowledge interrupt
> on exit" if there is no L1 requirement to inject an interrupt to L2.
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Signed-off-by: Wanpeng Li
> ---
> v1 -> v2:
>  * update patch description
>  * check nested_exit_intr_ack_set() first
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> @@ -11118,8 +11118,9 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>
>  	vmx_switch_vmcs(vcpu, &vmx->vmcs01);
>
> -	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> -	    && nested_exit_intr_ack_set(vcpu)) {
> +	if (nested_exit_intr_ack_set(vcpu) &&
> +		exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> +		kvm_cpu_has_interrupt(vcpu)) {

This would work as a solution, but I don't think it's the correct
behavior. The SDM says that with acknowledge interrupt on exit, bit 31 of
the VM-exit interrupt information (valid interrupt) is always set to 1
on EXIT_REASON_EXTERNAL_INTERRUPT. We don't want to break hypervisors
expecting an interrupt in that case, so we should do a userspace VM exit
when the window is open and then inject the userspace interrupt with a
VM exit.

The simplest thing that came to my mind is to:

(It doesn't prevent malicious userspace from hitting the WARN, though.)

Thanks.

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 39a6222bf968..9ad0c882c4f5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -10687,7 +10687,8 @@ static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
 		return 0;
 	}
 
-	if ((kvm_cpu_has_interrupt(vcpu) || external_intr) &&
+	if ((kvm_cpu_has_interrupt(vcpu) ||
+	     (external_intr && !nested_exit_intr_ack_set(vcpu))) &&
 	    nested_exit_on_intr(vcpu)) {
 		if (vmx->nested.nested_run_pending)
 			return -EBUSY;

but I think it could break more ...

actually, why was the window closed?
kvm_vcpu_ready_for_interrupt_injection() checks vmx_interrupt_allowed()
in order to decide need for the window, but vmx_check_nested_events()
doesn't care about that at all, so the window might just appear closed.

Would the following hunk help too?

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 39a6222bf968..7e6caa9c225d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5567,8 +5567,10 @@ static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
 
 static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
 {
-	return (!to_vmx(vcpu)->nested.nested_run_pending &&
-		vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
+	if (is_guest_mode(vcpu))
+		return !to_vmx(vcpu)->nested.nested_run_pending;
+
+	return vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF &&
 		!(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
 			(GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
 }
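
To make the two hunks above easier to compare, here is a minimal
userspace sketch of the predicates before and after the change. It is
not kernel code: struct vcpu_model and all four function names are
hypothetical, and each boolean field just stands in for the KVM helper
or VMCS read named in its comment. It only illustrates the boolean
logic under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened vCPU state; the real code queries
 * struct kvm_vcpu and the VMCS instead of plain booleans. */
struct vcpu_model {
	bool has_interrupt;        /* kvm_cpu_has_interrupt(vcpu) */
	bool exit_on_intr;         /* nested_exit_on_intr(vcpu) */
	bool intr_ack_set;         /* nested_exit_intr_ack_set(vcpu) */
	bool guest_mode;           /* is_guest_mode(vcpu) */
	bool nested_run_pending;   /* vmx->nested.nested_run_pending */
	bool rflags_if;            /* GUEST_RFLAGS & X86_EFLAGS_IF */
	bool sti_or_mov_ss_shadow; /* STI / MOV SS interruptibility bits */
};

/* Condition in vmx_check_nested_events() before the first hunk. */
static bool wants_intr_exit_old(const struct vcpu_model *v, bool external_intr)
{
	return (v->has_interrupt || external_intr) && v->exit_on_intr;
}

/* Same condition after the first hunk: a bare window request no longer
 * triggers the nested exit when L1 asked for acknowledge-on-exit. */
static bool wants_intr_exit_new(const struct vcpu_model *v, bool external_intr)
{
	return (v->has_interrupt || (external_intr && !v->intr_ack_set)) &&
	       v->exit_on_intr;
}

/* vmx_interrupt_allowed() before the second hunk. */
static bool interrupt_allowed_old(const struct vcpu_model *v)
{
	return (!v->nested_run_pending && v->rflags_if) &&
	       !v->sti_or_mov_ss_shadow;
}

/* vmx_interrupt_allowed() after the second hunk: in guest mode only
 * nested_run_pending matters; L2's RFLAGS.IF is ignored. */
static bool interrupt_allowed_new(const struct vcpu_model *v)
{
	if (v->guest_mode)
		return !v->nested_run_pending;

	return v->rflags_if && !v->sti_or_mov_ss_shadow;
}

/* Walks the two cases where the old and new predicates disagree. */
static void model_self_check(void)
{
	/* Window requested, nothing pending, acknowledge-on-exit set. */
	struct vcpu_model v = { .exit_on_intr = true, .intr_ack_set = true };

	assert(wants_intr_exit_old(&v, true));
	assert(!wants_intr_exit_new(&v, true));

	/* L2 running with interrupts disabled, no nested run pending. */
	v.guest_mode = true;
	assert(!interrupt_allowed_old(&v));
	assert(interrupt_allowed_new(&v));
}
```

Calling model_self_check() from a small main() exercises both
differences: with the first hunk a window request alone no longer
forces the nested exit, and with the second hunk the window counts as
open in guest mode regardless of L2's RFLAGS.IF.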