From patchwork Mon Dec 19 16:17:12 2016
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 9480523
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection
Date: Mon, 19 Dec 2016 17:17:12 +0100
Message-Id: <1482164232-130035-7-git-send-email-pbonzini@redhat.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1482164232-130035-1-git-send-email-pbonzini@redhat.com>
References: <1482164232-130035-1-git-send-email-pbonzini@redhat.com>

Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU
is blocked", 2015-09-18) the posted interrupt descriptor is checked
unconditionally for PIR.ON.  Therefore we don't need KVM_REQ_EVENT to
trigger the scan and, if NMIs or SMIs are not involved, we can avoid
the complicated event injection path.

Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been
there since APICv was introduced.

However, without the KVM_REQ_EVENT safety net KVM needs to be much more
careful about races between vmx_deliver_posted_interrupt and
vcpu_enter_guest.  First, the IPI for posted interrupts may be issued
between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts.
If that happens, kvm_vcpu_trigger_posted_interrupt returns true, but
the smp_kvm_posted_intr_ipi handler doesn't do anything about it: the
IPI is taken in root mode, where it has no effect.  The guest is
entered with PIR.ON, and the interrupt is only delivered to the guest
on the next vmentry (if any).  To fix this, disable interrupts before
setting vcpu->mode.  This ensures that the IPI is delayed until the
guest enters non-root mode; it is then trapped by the processor,
causing the interrupt to be injected.

Second, the IPI may be issued between kvm_x86_ops->hwapic_irr_update(vcpu,
kvm_lapic_find_highest_irr(vcpu)) and vcpu->mode = IN_GUEST_MODE.  In
this case, kvm_vcpu_kick is called but it (correctly) doesn't do
anything because it sees vcpu->mode == OUTSIDE_GUEST_MODE.  Again, the
guest is entered with PIR.ON but no posted interrupt IPI is pending;
this time the fix is to move the RVI update after vcpu->mode is set to
IN_GUEST_MODE.

Both issues were previously masked by the liberal use of KVM_REQ_EVENT:
in both race scenarios KVM_REQ_EVENT would cancel guest entry,
resulting in another vmentry which would inject the interrupt.

This saves about 300 cycles on the self_ipi_* tests of vmexit.flat.
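As an illustration of the pairing described above (this is only a
user-space model, not part of the patch): the sender's "set PIR.ON,
then check vcpu->mode" and the vCPU's "set vcpu->mode, then check
PIR.ON" form the classic store-buffering pattern, and the full barrier
implied by pi_test_and_set_on is what guarantees that at least one side
sees the other.  A minimal sketch with C11 atomics, where pir_on, mode,
ipi_sent and irr_synced are hypothetical stand-ins for PIR.ON,
vcpu->mode, the posted-interrupt IPI and sync_pir_to_irr:

/*
 * Illustration only -- NOT kernel code.  pir_on/mode/ipi_sent/irr_synced
 * are stand-ins for PIR.ON, vcpu->mode, the posted-interrupt IPI and
 * kvm_x86_ops->sync_pir_to_irr.  Build with: cc -std=c11 -pthread sb.c
 */
#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

enum { OUTSIDE_GUEST_MODE, IN_GUEST_MODE };

static atomic_int pir_on;      /* "PIR.ON": a posted interrupt is pending */
static atomic_int mode;        /* "vcpu->mode" */
static atomic_int ipi_sent;    /* sender saw IN_GUEST_MODE, sent the IPI  */
static atomic_int irr_synced;  /* vCPU saw PIR.ON, synced PIR to the IRR  */

/* models vmx_deliver_posted_interrupt() */
static int sender(void *arg)
{
	atomic_store(&pir_on, 1);              /* pi_test_and_set_on(): full barrier */
	if (atomic_load(&mode) == IN_GUEST_MODE)
		atomic_store(&ipi_sent, 1);    /* kvm_vcpu_trigger_posted_interrupt() */
	return 0;
}

/* models vcpu_enter_guest() after this patch (IRQs already disabled) */
static int vcpu(void *arg)
{
	atomic_store(&mode, IN_GUEST_MODE);
	if (atomic_load(&pir_on))
		atomic_store(&irr_synced, 1);  /* kvm_x86_ops->sync_pir_to_irr() */
	return 0;
}

int main(void)
{
	thrd_t a, b;

	thrd_create(&a, sender, NULL);
	thrd_create(&b, vcpu, NULL);
	thrd_join(a, NULL);
	thrd_join(b, NULL);

	/*
	 * With seq_cst ordering at least one of the two flags is always
	 * set, so the interrupt is never lost.
	 */
	printf("ipi_sent=%d irr_synced=%d\n",
	       atomic_load(&ipi_sent), atomic_load(&irr_synced));
	return 0;
}

Whatever the interleaving, at least one of ipi_sent or irr_synced ends
up set; this is why sync_pir_to_irr can safely run after setting
IN_GUEST_MODE instead of relying on KVM_REQ_EVENT.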
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/lapic.c | 11 ++++-------
 arch/x86/kvm/vmx.c   |  8 +++++---
 arch/x86/kvm/x86.c   | 44 +++++++++++++++++++++++++-------------------
 3 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index f644dd1dbe71..5ea94b622e88 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -385,12 +385,8 @@ int __kvm_apic_update_irr(u32 *pir, void *regs)
 int kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
-	int max_irr;
 
-	max_irr = __kvm_apic_update_irr(pir, apic->regs);
-
-	kvm_make_request(KVM_REQ_EVENT, vcpu);
-	return max_irr;
+	return __kvm_apic_update_irr(pir, apic->regs);
 }
 EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
 
@@ -423,9 +419,10 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 
 	vcpu = apic->vcpu;
 	if (unlikely(vcpu->arch.apicv_active)) {
-		/* try to update RVI */
+		/* need to update RVI */
 		apic_clear_vector(vec, apic->regs + APIC_IRR);
-		kvm_make_request(KVM_REQ_EVENT, vcpu);
+		kvm_x86_ops->hwapic_irr_update(vcpu,
+				apic_find_highest_irr(apic));
 	} else {
 		apic->irr_pending = false;
 		apic_clear_vector(vec, apic->regs + APIC_IRR);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 27e40b180242..3dd4fad35a3e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5062,9 +5062,11 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
 	if (pi_test_and_set_pir(vector, &vmx->pi_desc))
 		return;
 
-	r = pi_test_and_set_on(&vmx->pi_desc);
-	kvm_make_request(KVM_REQ_EVENT, vcpu);
-	if (r || !kvm_vcpu_trigger_posted_interrupt(vcpu))
+	/* If a previous notification has sent the IPI, nothing to do. */
+	if (pi_test_and_set_on(&vmx->pi_desc))
+		return;
+
+	if (!kvm_vcpu_trigger_posted_interrupt(vcpu))
 		kvm_vcpu_kick(vcpu);
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c666414adc1d..725473ba6dd3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6710,19 +6710,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		kvm_hv_process_stimers(vcpu);
 	}
 
-	/*
-	 * KVM_REQ_EVENT is not set when posted interrupts are set by
-	 * VT-d hardware, so we have to update RVI unconditionally.
-	 */
-	if (kvm_lapic_enabled(vcpu)) {
-		/*
-		 * Update architecture specific hints for APIC
-		 * virtual interrupt delivery.
-		 */
-		if (kvm_x86_ops->sync_pir_to_irr)
-			kvm_x86_ops->sync_pir_to_irr(vcpu);
-	}
-
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
 		++vcpu->stat.req_event;
 		kvm_apic_accept_events(vcpu);
@@ -6767,20 +6754,39 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	kvm_x86_ops->prepare_guest_switch(vcpu);
 	if (vcpu->fpu_active)
 		kvm_load_guest_fpu(vcpu);
+
+	/*
+	 * Disable IRQs before setting IN_GUEST_MODE.  Posted interrupt
+	 * IPIs are then delayed until after guest entry, which ensures
+	 * that they result in virtual interrupt delivery.
+	 */
+	local_irq_disable();
 	vcpu->mode = IN_GUEST_MODE;
 
 	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
 
 	/*
-	 * We should set ->mode before check ->requests,
-	 * Please see the comment in kvm_make_all_cpus_request.
-	 * This also orders the write to mode from any reads
-	 * to the page tables done while the VCPU is running.
-	 * Please see the comment in kvm_flush_remote_tlbs.
+	 * 1) We should set ->mode before checking ->requests.  Please see
+	 * the comment in kvm_make_all_cpus_request.
+	 *
+	 * 2) For APICv, we should set ->mode before checking PIR.ON.  This
+	 * pairs with the memory barrier implicit in pi_test_and_set_on
+	 * (see vmx_deliver_posted_interrupt).
+	 *
+	 * 3) This also orders the write to mode from any reads to the page
+	 * tables done while the VCPU is running.  Please see the comment
+	 * in kvm_flush_remote_tlbs.
 	 */
 	smp_mb__after_srcu_read_unlock();
 
-	local_irq_disable();
+	if (kvm_lapic_enabled(vcpu)) {
+		/*
+		 * This handles the case where a posted interrupt was
+		 * notified with kvm_vcpu_kick.
+		 */
+		if (kvm_x86_ops->sync_pir_to_irr)
+			kvm_x86_ops->sync_pir_to_irr(vcpu);
+	}
 
 	if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
 	    || need_resched() || signal_pending(current)) {