From patchwork Tue Aug 5 04:42:24 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wanpeng Li X-Patchwork-Id: 4675941 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 7B9E69F373 for ; Tue, 5 Aug 2014 04:43:05 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 7BB812015A for ; Tue, 5 Aug 2014 04:43:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 79F3A20158 for ; Tue, 5 Aug 2014 04:43:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753194AbaHEEmm (ORCPT ); Tue, 5 Aug 2014 00:42:42 -0400 Received: from mga11.intel.com ([192.55.52.93]:6818 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753167AbaHEEmk (ORCPT ); Tue, 5 Aug 2014 00:42:40 -0400 Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga102.fm.intel.com with ESMTP; 04 Aug 2014 21:42:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.01,802,1400050800"; d="scan'208";a="571781468" Received: from wanpengl-mobl.ccr.corp.intel.com (HELO localhost) ([10.238.130.35]) by fmsmga001.fm.intel.com with ESMTP; 04 Aug 2014 21:42:37 -0700 From: Wanpeng Li To: Paolo Bonzini , Jan Kiszka Cc: Marcelo Tosatti , Gleb Natapov , Bandan Das , Zhang Yang , Davidlohr Bueso , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Wanpeng Li Subject: [PATCH v2 2/2] KVM: nVMX: fix acknowledge interrupt on exit when APICv is in use Date: Tue, 5 Aug 2014 12:42:24 +0800 Message-Id: <1407213744-24794-2-git-send-email-wanpeng.li@linux.intel.com> X-Mailer: git-send-email 1.7.9.5 In-Reply-To: <1407213744-24794-1-git-send-email-wanpeng.li@linux.intel.com> References: <1407213744-24794-1-git-send-email-wanpeng.li@linux.intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Spam-Status: No, score=-7.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP After commit 77b0f5d (KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to), "Acknowledge interrupt on exit" behavior can be emulated. To do so, KVM will ask the APIC for the interrupt vector if during a nested vmexit if VM_EXIT_ACK_INTR_ON_EXIT is set. With APICv, kvm_get_apic_interrupt would return -1 and give the following WARNING: Call Trace: [] dump_stack+0x49/0x5e [] warn_slowpath_common+0x7c/0x96 [] ? nested_vmx_vmexit+0xa4/0x233 [kvm_intel] [] warn_slowpath_null+0x15/0x17 [] nested_vmx_vmexit+0xa4/0x233 [kvm_intel] [] ? nested_vmx_exit_handled+0x6a/0x39e [kvm_intel] [] ? kvm_apic_has_interrupt+0x80/0xd5 [kvm] [] vmx_check_nested_events+0xc3/0xd3 [kvm_intel] [] inject_pending_event+0xd0/0x16e [kvm] [] vcpu_enter_guest+0x319/0x704 [kvm] If enabling APIC-v, all interrupts to L1 are delivered through APIC-v. But when L2 is running, external interrupt will casue L1 vmexit with reason external interrupt. Then L1 will pick up the interrupt through vmcs12. when L1 ack the interrupt, since the APIC-v is enabled when L1 is running, so APIC-v hardware still will do vEOI updating. The problem is that the interrupt is delivered not through APIC-v hardware, this means SVI/RVI/vPPR are not setting, but hardware required them when doing vEOI updating. The solution is that, when L1 tried to pick up the interrupt from vmcs12, then hypervisor will help to update the SVI/RVI/vPPR to make sure the following vEOI updating and vPPR updating corrently. Also, since interrupt is delivered through vmcs12, so APIC-v hardware will not cleare vIRR and hypervisor need to clear it before L1 running. Suggested-by: Paolo Bonzini Suggested-by: "Zhang, Yang Z" Tested-by: Liu, RongrongX Signed-off-by: Wanpeng Li --- v1 -> v2: * reusing kvm_get_apic_interrupt here (by modifying kvm_cpu_get_interrupt, apic_set_isr and apic_clear_irr) arch/x86/kvm/irq.c | 2 +- arch/x86/kvm/lapic.c | 52 +++++++++++++++++++++++++++++++++++++++------------- 2 files changed, 40 insertions(+), 14 deletions(-) diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index bd0da43..a1ec6a5 100644 --- a/arch/x86/kvm/irq.c +++ b/arch/x86/kvm/irq.c @@ -108,7 +108,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v) vector = kvm_cpu_get_extint(v); - if (kvm_apic_vid_enabled(v->kvm) || vector != -1) + if (vector != -1) return vector; /* PIC */ return kvm_get_apic_interrupt(v); /* APIC */ diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 3855103..08e8a89 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -352,25 +352,46 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic) static inline void apic_clear_irr(int vec, struct kvm_lapic *apic) { - apic->irr_pending = false; + struct kvm_vcpu *vcpu; + + vcpu = apic->vcpu; + apic_clear_vector(vec, apic->regs + APIC_IRR); - if (apic_search_irr(apic) != -1) - apic->irr_pending = true; + if (unlikely(kvm_apic_vid_enabled(vcpu->kvm))) + /* try to update RVI */ + kvm_make_request(KVM_REQ_EVENT, vcpu); + else { + vec = apic_search_irr(apic); + apic->irr_pending = (vec != -1); + } } static inline void apic_set_isr(int vec, struct kvm_lapic *apic) { - /* Note that we never get here with APIC virtualization enabled. */ + struct kvm_vcpu *vcpu; + + if (__apic_test_and_set_vector(vec, apic->regs + APIC_ISR)) + return; + + vcpu = apic->vcpu; - if (!__apic_test_and_set_vector(vec, apic->regs + APIC_ISR)) - ++apic->isr_count; - BUG_ON(apic->isr_count > MAX_APIC_VECTOR); /* - * ISR (in service register) bit is set when injecting an interrupt. - * The highest vector is injected. Thus the latest bit set matches - * the highest bit in ISR. + * With APIC virtualization enabled, all caching is disabled + * because the processor can modify ISR under the hood. Instead + * just set SVI. */ - apic->highest_isr_cache = vec; + if (unlikely(kvm_apic_vid_enabled(vcpu->kvm))) + kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec); + else { + ++apic->isr_count; + BUG_ON(apic->isr_count > MAX_APIC_VECTOR); + /* + * ISR (in service register) bit is set when injecting an interrupt. + * The highest vector is injected. Thus the latest bit set matches + * the highest bit in ISR. + */ + apic->highest_isr_cache = vec; + } } static inline int apic_find_highest_isr(struct kvm_lapic *apic) @@ -1627,11 +1648,16 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu) int vector = kvm_apic_has_interrupt(vcpu); struct kvm_lapic *apic = vcpu->arch.apic; - /* Note that we never get here with APIC virtualization enabled. */ - if (vector == -1) return -1; + /* + * We get here even with APIC virtualization enabled, if doing + * nested virtualization and L1 runs with the "acknowledge interrupt + * on exit" mode. Then we cannot inject the interrupt via RVI, + * because the process would deliver it through the IDT. + */ + apic_set_isr(vector, apic); apic_update_ppr(apic); apic_clear_irr(vector, apic);