From patchwork Mon May 11 16:47:45 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11541319
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 1/8] Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously"
Date: Mon, 11 May 2020 18:47:45 +0200
Message-Id: <20200511164752.2158645-2-vkuznets@redhat.com>
In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com>
References: <20200511164752.2158645-1-vkuznets@redhat.com>

Commit 9a6e7c39810e ("KVM: async_pf: Fix #DF due to inject "Page not
Present" and "Page Ready" exceptions simultaneously") added protection
against a 'page ready' notification arriving before the corresponding
'page not present' one is delivered. This situation seems to be
impossible since commit 2a266f23550b ("KVM MMU: check pending exception
before injecting APF") which added a 'vcpu->arch.exception.pending'
check to kvm_can_do_async_pf().
On x86, kvm_arch_async_page_present() has only one call site: the
kvm_check_async_pf_completion() loop, and we only enter the loop when
kvm_arch_can_inject_async_page_present(vcpu) returns true, which, when
the async PF MSR is enabled, translates into kvm_can_do_async_pf().

There is also one problem with the cancellation mechanism: we don't
seem to check that the 'page not present' notification we're canceling
matches the 'page ready' notification, so in theory we may erroneously
drop two valid events.

Revert the commit.

Reviewed-by: Gavin Shan
Signed-off-by: Vitaly Kuznetsov
---
 arch/x86/kvm/x86.c | 23 +----------------------
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c5835f9cb9ad..edd4a6415b92 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10359,13 +10359,6 @@ static int apf_put_user(struct kvm_vcpu *vcpu, u32 val)
                   sizeof(val));
 }
 
-static int apf_get_user(struct kvm_vcpu *vcpu, u32 *val)
-{
-
-    return kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, val,
-                     sizeof(u32));
-}
-
 static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu)
 {
     if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu))
@@ -10430,7 +10423,6 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                  struct kvm_async_pf *work)
 {
     struct x86_exception fault;
-    u32 val;
 
     if (work->wakeup_all)
         work->arch.token = ~0; /* broadcast wakeup */
@@ -10439,19 +10431,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa);
 
     if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED &&
-        !apf_get_user(vcpu, &val)) {
-        if (val == KVM_PV_REASON_PAGE_NOT_PRESENT &&
-            vcpu->arch.exception.pending &&
-            vcpu->arch.exception.nr == PF_VECTOR &&
-            !apf_put_user(vcpu, 0)) {
-            vcpu->arch.exception.injected = false;
-            vcpu->arch.exception.pending = false;
-            vcpu->arch.exception.nr = 0;
-            vcpu->arch.exception.has_error_code = false;
-            vcpu->arch.exception.error_code = 0;
-            vcpu->arch.exception.has_payload = false;
-            vcpu->arch.exception.payload = 0;
-        } else if (!apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
+        !apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
         fault.vector = PF_VECTOR;
         fault.error_code_valid = true;
         fault.error_code = 0;
@@ -10459,7 +10439,6 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
         fault.address = work->arch.token;
         fault.async_page_fault = true;
         kvm_inject_page_fault(vcpu, &fault);
-    }
     }
     vcpu->arch.apf.halted = false;
     vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;

From patchwork Mon May 11 16:47:46 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11541321
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 2/8] KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info
Date: Mon, 11 May 2020 18:47:46 +0200
Message-Id: <20200511164752.2158645-3-vkuznets@redhat.com>
In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com>
References: <20200511164752.2158645-1-vkuznets@redhat.com>

Currently, the APF mechanism relies on the #PF abuse where the token is
being passed through CR2. If we switch to using interrupts to deliver
'page ready' notifications we need a different way to pass the data.
Extend the existing 'struct kvm_vcpu_pv_apf_data' with token information
for 'page ready' notifications.

The newly introduced apf_put_user_ready() temporarily puts both reason
and token information; this will be changed to put the token only when
we switch to interrupt based notifications.
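For illustration, the shared-slot layout and the packing done by
apf_put_user_ready() boil down to the following standalone sketch (not
part of the patch; uapi __u32/__u8 types swapped for stdint so it builds
on its own):

  #include <stdint.h>

  /* The APF slot shared with the host; the host touches the first 64
   * bytes, 'enabled' is guest-only. */
  struct apf_data_sketch {
      uint32_t reason;          /* bytes 0-3: 1 = not present, 2 = ready */
      uint32_t pageready_token; /* bytes 4-7: new in this patch */
      uint8_t  pad[56];
      uint32_t enabled;
  };

  /* apf_put_user_ready() currently writes both fields with one 8-byte
   * store: low 32 bits = KVM_PV_REASON_PAGE_READY (2), high 32 bits =
   * the token. */
  static inline uint64_t pack_page_ready(uint32_t token)
  {
      return (uint64_t)token << 32 | 2 /* KVM_PV_REASON_PAGE_READY */;
  }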
Signed-off-by: Vitaly Kuznetsov
---
 arch/x86/include/uapi/asm/kvm_para.h | 3 ++-
 arch/x86/kvm/x86.c | 17 +++++++++++++----
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 2a8e0b6b9805..e3602a1de136 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -113,7 +113,8 @@ struct kvm_mmu_op_release_pt {
 
 struct kvm_vcpu_pv_apf_data {
     __u32 reason;
-    __u8 pad[60];
+    __u32 pageready_token;
+    __u8 pad[56];
     __u32 enabled;
 };
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index edd4a6415b92..28868cc16e4d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2662,7 +2662,7 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
     }
 
     if (kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.apf.data, gpa,
-                      sizeof(u32)))
+                      sizeof(u64)))
         return 1;
 
     vcpu->arch.apf.send_user_only = !(data & KVM_ASYNC_PF_SEND_ALWAYS);
@@ -10352,8 +10352,17 @@ static void kvm_del_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
     }
 }
 
-static int apf_put_user(struct kvm_vcpu *vcpu, u32 val)
+static inline int apf_put_user_notpresent(struct kvm_vcpu *vcpu)
 {
+    u32 reason = KVM_PV_REASON_PAGE_NOT_PRESENT;
+
+    return kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, &reason,
+                      sizeof(reason));
+}
+
+static inline int apf_put_user_ready(struct kvm_vcpu *vcpu, u32 token)
+{
+    u64 val = (u64)token << 32 | KVM_PV_REASON_PAGE_READY;
 
     return kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, &val,
                       sizeof(val));
@@ -10398,7 +10407,7 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
     kvm_add_async_pf_gfn(vcpu, work->arch.gfn);
 
     if (kvm_can_deliver_async_pf(vcpu) &&
-        !apf_put_user(vcpu, KVM_PV_REASON_PAGE_NOT_PRESENT)) {
+        !apf_put_user_notpresent(vcpu)) {
         fault.vector = PF_VECTOR;
         fault.error_code_valid = true;
         fault.error_code = 0;
@@ -10431,7 +10440,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa);
 
     if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED &&
-        !apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
+        !apf_put_user_ready(vcpu, work->arch.token)) {
         fault.vector = PF_VECTOR;
         fault.error_code_valid = true;
         fault.error_code = 0;

From patchwork Mon May 11 16:47:47 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11541323
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 3/8] KVM: introduce kvm_read_guest_offset_cached()
Date: Mon, 11 May 2020 18:47:47 +0200
Message-Id: <20200511164752.2158645-4-vkuznets@redhat.com>
In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com>
References: <20200511164752.2158645-1-vkuznets@redhat.com>

We already have kvm_write_guest_offset_cached(); introduce its read
analogue, kvm_read_guest_offset_cached().
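As a usage sketch (hypothetical caller, mirroring what patch 4 of this
series does with the APF token), reading a single field out of a
previously initialized gfn_to_hva_cache looks like:

  /* Read one 4-byte field at a fixed offset inside a region that was
   * registered earlier with kvm_gfn_to_hva_cache_init(). */
  u32 token;
  unsigned int offset = offsetof(struct kvm_vcpu_pv_apf_data,
                                 pageready_token);

  if (kvm_read_guest_offset_cached(vcpu->kvm, &vcpu->arch.apf.data,
                                   &token, offset, sizeof(token)))
      return false; /* cache/memslot unavailable; treat slot as busy */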
Signed-off-by: Vitaly Kuznetsov
---
 include/linux/kvm_host.h | 3 +++
 virt/kvm/kvm_main.c | 19 ++++++++++++++-----
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 01276e3d01b9..235109a23b8c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -733,6 +733,9 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len);
 int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
               void *data, unsigned long len);
+int kvm_read_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+                 void *data, unsigned int offset,
+                 unsigned long len);
 int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data,
              int offset, int len);
 int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 74bdb7bf3295..c425be6e6d98 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2503,13 +2503,15 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 }
 EXPORT_SYMBOL_GPL(kvm_write_guest_cached);
 
-int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
-              void *data, unsigned long len)
+int kvm_read_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+                 void *data, unsigned int offset,
+                 unsigned long len)
 {
     struct kvm_memslots *slots = kvm_memslots(kvm);
     int r;
+    gpa_t gpa = ghc->gpa + offset;
 
-    BUG_ON(len > ghc->len);
+    BUG_ON(len + offset > ghc->len);
 
     if (slots->generation != ghc->generation) {
         if (__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len))
@@ -2520,14 +2522,21 @@ int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
         return -EFAULT;
 
     if (unlikely(!ghc->memslot))
-        return kvm_read_guest(kvm, ghc->gpa, data, len);
+        return kvm_read_guest(kvm, gpa, data, len);
 
-    r = __copy_from_user(data, (void __user *)ghc->hva, len);
+    r = __copy_from_user(data, (void __user *)ghc->hva + offset, len);
     if (r)
         return -EFAULT;
 
     return 0;
 }
+EXPORT_SYMBOL_GPL(kvm_read_guest_offset_cached);
+
+int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+              void *data, unsigned long len)
+{
+    return kvm_read_guest_offset_cached(kvm, ghc, data, 0, len);
+}
 EXPORT_SYMBOL_GPL(kvm_read_guest_cached);
 
 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len)

From patchwork Mon May 11 16:47:48 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11541325
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 4/8] KVM: x86: interrupt based APF page-ready event delivery
Date: Mon, 11 May 2020 18:47:48 +0200
Message-Id: <20200511164752.2158645-5-vkuznets@redhat.com>
In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com>
References: <20200511164752.2158645-1-vkuznets@redhat.com>

Concerns were expressed around APF delivery via a synthetic #PF
exception, as in some cases such delivery may collide with a real page
fault. For type 2 ('page ready') notifications we can easily switch to
using an interrupt instead. Introduce the new MSR_KVM_ASYNC_PF_INT
mechanism and deprecate the legacy one.

One notable difference between the two mechanisms is that an interrupt
may not get handled immediately, so whenever we want to deliver the next
event (regardless of its type) we must be sure the guest has read and
cleared the previous event in the slot.

Signed-off-by: Vitaly Kuznetsov
---
 Documentation/virt/kvm/msr.rst | 91 +++++++++++++++++---------
 arch/x86/include/asm/kvm_host.h | 4 +-
 arch/x86/include/uapi/asm/kvm_para.h | 6 ++
 arch/x86/kvm/x86.c | 95 ++++++++++++++++++++--------
 4 files changed, 140 insertions(+), 56 deletions(-)

diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index 33892036672d..f988a36f226a 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -190,35 +190,54 @@ MSR_KVM_ASYNC_PF_EN: 0x4b564d02
 data:
-    Bits 63-6 hold 64-byte aligned physical address of a
-    64 byte memory area which must be in guest RAM and must be
-    zeroed. Bits 5-3 are reserved and should be zero. Bit 0 is 1
-    when asynchronous page faults are enabled on the vcpu 0 when
-    disabled. Bit 1 is 1 if asynchronous page faults can be injected
-    when vcpu is in cpl == 0. Bit 2 is 1 if asynchronous page faults
-    are delivered to L1 as #PF vmexits. Bit 2 can be set only if
-    KVM_FEATURE_ASYNC_PF_VMEXIT is present in CPUID.
-
-    First 4 byte of 64 byte memory location will be written to by
-    the hypervisor at the time of asynchronous page fault (APF)
-    injection to indicate type of asynchronous page fault. Value
-    of 1 means that the page referred to by the page fault is not
-    present. Value 2 means that the page is now available. Disabling
-    interrupt inhibits APFs. Guest must not enable interrupt
-    before the reason is read, or it may be overwritten by another
-    APF. Since APF uses the same exception vector as regular page
-    fault guest must reset the reason to 0 before it does
-    something that can generate normal page fault. If during page
-    fault APF reason is 0 it means that this is regular page
-    fault.
-
-    During delivery of type 1 APF cr2 contains a token that will
-    be used to notify a guest when missing page becomes
-    available. When page becomes available type 2 APF is sent with
-    cr2 set to the token associated with the page. There is special
-    kind of token 0xffffffff which tells vcpu that it should wake
-    up all processes waiting for APFs and no individual type 2 APFs
-    will be sent.
+    Asynchronous page fault (APF) control MSR.
+
+    Bits 63-6 hold 64-byte aligned physical address of a 64 byte memory
+    area which must be in guest RAM and must be zeroed. This memory is
+    expected to hold a copy of the following structure::
+
+      struct kvm_vcpu_pv_apf_data {
+          __u32 reason;
+          __u32 pageready_token;
+          __u8 pad[56];
+          __u32 enabled;
+      };
+
+    Bits 5-4 of the MSR are reserved and should be zero. Bit 0 is set to 1
+    when asynchronous page faults are enabled on the vcpu, 0 when disabled.
+    Bit 1 is 1 if asynchronous page faults can be injected when vcpu is in
+    cpl == 0. Bit 2 is 1 if asynchronous page faults are delivered to L1 as
+    #PF vmexits. Bit 2 can be set only if KVM_FEATURE_ASYNC_PF_VMEXIT is
+    present in CPUID. Bit 3 enables interrupt based delivery of type 2
+    (page present) events.
+
+    First 4 bytes of the 64 byte memory location ('reason') will be
+    written to by the hypervisor at the time of APF type 1 (page not
+    present) injection. The only possible values are '0' and '1'. Type 1
+    events are currently always delivered as synthetic #PF exception.
+    During delivery of type 1 APF the CR2 register contains a token that
+    will be used to notify the guest when the missing page becomes
+    available. Guest is supposed to write '0' to the location when it is
+    done handling type 1 event so the next one can be delivered.
+
+    Note, since APF type 1 uses the same exception vector as regular page
+    fault, guest must reset the reason to '0' before it does something
+    that can generate normal page fault. If during a page fault APF
+    reason is '0' it means that this is regular page fault.
+
+    Bytes 5-8 of the 64 byte memory location ('pageready_token') will be
+    written to by the hypervisor at the time of type 2 (page ready) event
+    injection. The content of these bytes is a token which was previously
+    delivered as type 1 event. The event indicates the page is now
+    available. Guest is supposed to write '0' to the location when it is
+    done handling type 2 event so the next one can be delivered.
+    MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for type 2
+    APF delivery needs to be written to before enabling APF mechanism in
+    MSR_KVM_ASYNC_PF_EN.
+
+    Note, previously, type 2 (page present) events were delivered via the
+    same #PF exception as type 1 (page not present) events but this is
+    now deprecated. If bit 3 (interrupt based delivery) is not set APF
+    events are not delivered.
 
     If APF is disabled while there are outstanding APFs, they will not
     be delivered.
 
@@ -319,3 +338,17 @@ data:
     KVM guests can request the host not to poll on HLT, for example if
     they are performing polling themselves.
+
+MSR_KVM_ASYNC_PF_INT:
+    0x4b564d06
+
+data:
+    Second asynchronous page fault (APF) control MSR.
+
+    Bits 0-7: APIC vector for delivery of type 2 APF events (page ready
+    notifications).
+    Bits 8-63: Reserved
+
+    Interrupt vector for asynchronous page ready notifications delivery.
+    The vector has to be set up before asynchronous page fault mechanism
+    is enabled in MSR_KVM_ASYNC_PF_EN.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 42a2d0d3984a..4a72ba26a532 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -763,7 +763,9 @@ struct kvm_vcpu_arch {
         bool halted;
         gfn_t gfns[roundup_pow_of_two(ASYNC_PF_PER_VCPU)];
         struct gfn_to_hva_cache data;
-        u64 msr_val;
+        u64 msr_en_val; /* MSR_KVM_ASYNC_PF_EN */
+        u64 msr_int_val; /* MSR_KVM_ASYNC_PF_INT */
+        u16 vec;
         u32 id;
         bool send_user_only;
         u32 host_apf_reason;
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index e3602a1de136..c55ffa2c40b6 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -50,6 +50,7 @@
 #define MSR_KVM_STEAL_TIME  0x4b564d03
 #define MSR_KVM_PV_EOI_EN      0x4b564d04
 #define MSR_KVM_POLL_CONTROL   0x4b564d05
+#define MSR_KVM_ASYNC_PF_INT   0x4b564d06
 
 struct kvm_steal_time {
     __u64 steal;
@@ -81,6 +82,11 @@ struct kvm_clock_pairing {
 #define KVM_ASYNC_PF_ENABLED            (1 << 0)
 #define KVM_ASYNC_PF_SEND_ALWAYS        (1 << 1)
 #define KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT  (1 << 2)
+#define KVM_ASYNC_PF_DELIVERY_AS_INT        (1 << 3)
+
+/* MSR_KVM_ASYNC_PF_INT */
+#define KVM_ASYNC_PF_VEC_MASK           GENMASK(7, 0)
+
 
 /* Operations for KVM_HC_MMU_OP */
 #define KVM_MMU_OP_WRITE_PTE            1
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 28868cc16e4d..d1e9b072d279 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1243,7 +1243,7 @@ static const u32 emulated_msrs_all[] = {
     HV_X64_MSR_TSC_EMULATION_STATUS,
 
     MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
-    MSR_KVM_PV_EOI_EN,
+    MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT,
 
     MSR_IA32_TSC_ADJUST,
     MSR_IA32_TSCDEADLINE,
@@ -2645,17 +2645,24 @@ static int xen_hvm_config(struct kvm_vcpu *vcpu, u64 data)
     return r;
 }
 
+static inline bool kvm_pv_async_pf_enabled(struct kvm_vcpu *vcpu)
+{
+    u64 mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT;
+
+    return (vcpu->arch.apf.msr_en_val & mask) == mask;
+}
+
 static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
 {
     gpa_t gpa = data & ~0x3f;
 
-    /* Bits 3:5 are reserved, Should be zero */
-    if (data & 0x38)
+    /* Bits 4:5 are reserved, Should be zero */
+    if (data & 0x30)
         return 1;
 
-    vcpu->arch.apf.msr_val = data;
+    vcpu->arch.apf.msr_en_val = data;
 
-    if (!(data & KVM_ASYNC_PF_ENABLED)) {
+    if (!kvm_pv_async_pf_enabled(vcpu)) {
         kvm_clear_async_pf_completion_queue(vcpu);
         kvm_async_pf_hash_reset(vcpu);
         return 0;
@@ -2667,7 +2674,25 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
 
     vcpu->arch.apf.send_user_only = !(data & KVM_ASYNC_PF_SEND_ALWAYS);
     vcpu->arch.apf.delivery_as_pf_vmexit = data & KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT;
+
     kvm_async_pf_wakeup_all(vcpu);
+
+    return 0;
+}
+
+static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data)
+{
+    /* Bits 8-63 are reserved */
+    if (data >> 8)
+        return 1;
+
+    if (!lapic_in_kernel(vcpu))
+        return 1;
+
+    vcpu->arch.apf.msr_int_val = data;
+
+    vcpu->arch.apf.vec = data & KVM_ASYNC_PF_VEC_MASK;
+
     return 0;
 }
 
@@ -2883,6 +2908,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         if (kvm_pv_enable_async_pf(vcpu, data))
             return 1;
         break;
+    case MSR_KVM_ASYNC_PF_INT:
+        if (kvm_pv_enable_async_pf_int(vcpu, data))
+            return 1;
+        break;
     case MSR_KVM_STEAL_TIME:
 
         if (unlikely(!sched_info_on()))
@@ -3157,7 +3186,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         msr_info->data = vcpu->arch.time;
         break;
     case MSR_KVM_ASYNC_PF_EN:
-        msr_info->data = vcpu->arch.apf.msr_val;
+        msr_info->data = vcpu->arch.apf.msr_en_val;
+        break;
+    case MSR_KVM_ASYNC_PF_INT:
+        msr_info->data = vcpu->arch.apf.msr_int_val;
         break;
     case MSR_KVM_STEAL_TIME:
         msr_info->data = vcpu->arch.st.msr_val;
@@ -9500,7 +9532,8 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
     vcpu->arch.cr2 = 0;
 
     kvm_make_request(KVM_REQ_EVENT, vcpu);
-    vcpu->arch.apf.msr_val = 0;
+    vcpu->arch.apf.msr_en_val = 0;
+    vcpu->arch.apf.msr_int_val = 0;
     vcpu->arch.st.msr_val = 0;
 
     kvmclock_reset(vcpu);
@@ -10362,10 +10395,24 @@ static inline int apf_put_user_notpresent(struct kvm_vcpu *vcpu)
 
 static inline int apf_put_user_ready(struct kvm_vcpu *vcpu, u32 token)
 {
-    u64 val = (u64)token << 32 | KVM_PV_REASON_PAGE_READY;
+    unsigned int offset = offsetof(struct kvm_vcpu_pv_apf_data,
+                       pageready_token);
 
-    return kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, &val,
-                      sizeof(val));
+    return kvm_write_guest_offset_cached(vcpu->kvm, &vcpu->arch.apf.data,
+                         &token, offset, sizeof(token));
+}
+
+static inline bool apf_pageready_slot_free(struct kvm_vcpu *vcpu)
+{
+    unsigned int offset = offsetof(struct kvm_vcpu_pv_apf_data,
+                       pageready_token);
+    u32 val;
+
+    if (kvm_read_guest_offset_cached(vcpu->kvm, &vcpu->arch.apf.data,
+                     &val, offset, sizeof(val)))
+        return false;
+
+    return !val;
 }
 
 static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu)
@@ -10373,9 +10420,8 @@ static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu)
     if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu))
         return false;
 
-    if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED) ||
-        (vcpu->arch.apf.send_user_only &&
-         kvm_x86_ops.get_cpl(vcpu) == 0))
+    if (!kvm_pv_async_pf_enabled(vcpu) ||
+        (vcpu->arch.apf.send_user_only && kvm_x86_ops.get_cpl(vcpu) == 0))
         return false;
 
     return true;
@@ -10431,7 +10477,10 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
 void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                  struct kvm_async_pf *work)
 {
-    struct x86_exception fault;
+    struct kvm_lapic_irq irq = {
+        .delivery_mode = APIC_DM_FIXED,
+        .vector = vcpu->arch.apf.vec
+    };
 
     if (work->wakeup_all)
         work->arch.token = ~0; /* broadcast wakeup */
@@ -10439,26 +10488,20 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     kvm_del_async_pf_gfn(vcpu, work->arch.gfn);
     trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa);
 
-    if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED &&
-        !apf_put_user_ready(vcpu, work->arch.token)) {
-        fault.vector = PF_VECTOR;
-        fault.error_code_valid = true;
-        fault.error_code = 0;
-        fault.nested_page_fault = false;
-        fault.address = work->arch.token;
-        fault.async_page_fault = true;
-        kvm_inject_page_fault(vcpu, &fault);
-    }
+    if (kvm_pv_async_pf_enabled(vcpu) &&
+        !apf_put_user_ready(vcpu, work->arch.token))
+        kvm_apic_set_irq(vcpu, &irq, NULL);
+
     vcpu->arch.apf.halted = false;
     vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 }
 
 bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
 {
-    if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED))
+    if (!kvm_pv_async_pf_enabled(vcpu))
         return true;
     else
-        return kvm_can_do_async_pf(vcpu);
+        return apf_pageready_slot_free(vcpu);
 }
 
 void kvm_arch_start_assignment(struct kvm *kvm)

From patchwork Mon May 11 16:47:49 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11541327
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 5/8] KVM: x86: acknowledgment mechanism for async pf page ready notifications
Date: Mon, 11 May 2020 18:47:49 +0200
Message-Id: <20200511164752.2158645-6-vkuznets@redhat.com>
In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com>
References: <20200511164752.2158645-1-vkuznets@redhat.com>

If two 'page ready' notifications happen back to back, the second one
is not delivered and the only mechanism we currently have is the
kvm_check_async_pf_completion() check in the vcpu_run() loop.
The check will only be performed with the next vmexit, when it happens,
and in some cases it may take a while. With interrupt based 'page ready'
notification delivery the situation is even worse: unlike exceptions,
interrupts are not handled immediately, so we must check if the slot is
empty. This is slow and unnecessary. Introduce a dedicated
MSR_KVM_ASYNC_PF_ACK MSR to communicate the fact that the slot is free
and the host should check its notification queue. Mandate using it for
interrupt based type 2 APF event delivery.

As kvm_check_async_pf_completion() is going away from vcpu_run() we need
a way to communicate the fact that the vcpu->async_pf.done queue has
transitioned from empty to non-empty state. Introduce
kvm_arch_async_page_present_queued() and KVM_REQ_APF_READY to do the
job.

Signed-off-by: Vitaly Kuznetsov
---
 Documentation/virt/kvm/msr.rst | 18 +++++++++++++---
 arch/s390/include/asm/kvm_host.h | 2 ++
 arch/x86/include/asm/kvm_host.h | 3 +++
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 arch/x86/kvm/x86.c | 26 ++++++++++++++++++++++----
 virt/kvm/async_pf.c | 10 ++++++++++
 6 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index f988a36f226a..3c7afbcd243f 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -230,9 +230,10 @@ data:
     delivered as type 1 event. The event indicates the page is now
     available. Guest is supposed to write '0' to the location when it is
-    done handling type 2 event so the next one can be delivered.
-    MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for type 2
-    APF delivery needs to be written to before enabling APF mechanism in
-    MSR_KVM_ASYNC_PF_EN.
+    done handling type 2 event so the next one can be delivered. It is
+    also supposed to write '1' to MSR_KVM_ASYNC_PF_ACK every time after
+    clearing the location; this forces KVM to re-scan its queue and
+    deliver the next pending notification.
 
     Note, previously, type 2 (page present) events were delivered via the
     same #PF exception as type 1 (page not present) events but this is
@@ -352,3 +353,14 @@ data:
     Interrupt vector for asynchronous page ready notifications delivery.
     The vector has to be set up before asynchronous page fault mechanism
     is enabled in MSR_KVM_ASYNC_PF_EN.
+
+MSR_KVM_ASYNC_PF_ACK:
+    0x4b564d07
+
+data:
+    Asynchronous page fault (APF) acknowledgment.
+
+    When the guest is done processing type 2 APF event and the
+    'pageready_token' field in 'struct kvm_vcpu_pv_apf_data' is cleared
+    it is supposed to write '1' to bit 0 of the MSR; this causes the
+    host to re-scan its queue and check if there are more notifications
+    pending.
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index d6bcd34f3ec3..c9c5b8b71175 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -982,6 +982,8 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
 void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                  struct kvm_async_pf *work);
 
+static inline void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu) {}
+
 void kvm_arch_crypto_clear_masks(struct kvm *kvm);
 void kvm_arch_crypto_set_masks(struct kvm *kvm, unsigned long *apm,
                    unsigned long *aqm, unsigned long *adm);
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4a72ba26a532..0eeeb98ed02a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -83,6 +83,7 @@
 #define KVM_REQ_GET_VMCS12_PAGES    KVM_ARCH_REQ(24)
 #define KVM_REQ_APICV_UPDATE \
     KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_APF_READY       KVM_ARCH_REQ(26)
 
 #define CR0_RESERVED_BITS                                               \
     (~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -771,6 +772,7 @@ struct kvm_vcpu_arch {
         u32 host_apf_reason;
         unsigned long nested_apf_token;
         bool delivery_as_pf_vmexit;
+        bool pageready_pending;
     } apf;
 
     /* OSVW MSRs (AMD only) */
@@ -1643,6 +1645,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                  struct kvm_async_pf *work);
 void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
                    struct kvm_async_pf *work);
+void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu);
 bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
 extern bool kvm_find_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn);
 
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index c55ffa2c40b6..941fdf4569a4 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -51,6 +51,7 @@
 #define MSR_KVM_PV_EOI_EN      0x4b564d04
 #define MSR_KVM_POLL_CONTROL   0x4b564d05
 #define MSR_KVM_ASYNC_PF_INT   0x4b564d06
+#define MSR_KVM_ASYNC_PF_ACK   0x4b564d07
 
 struct kvm_steal_time {
     __u64 steal;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d1e9b072d279..d21adc23a99f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1243,7 +1243,7 @@ static const u32 emulated_msrs_all[] = {
     HV_X64_MSR_TSC_EMULATION_STATUS,
 
     MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
-    MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT,
+    MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT, MSR_KVM_ASYNC_PF_ACK,
 
     MSR_IA32_TSC_ADJUST,
     MSR_IA32_TSCDEADLINE,
@@ -2912,6 +2912,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         if (kvm_pv_enable_async_pf_int(vcpu, data))
             return 1;
         break;
+    case MSR_KVM_ASYNC_PF_ACK:
+        if (data & 0x1) {
+            vcpu->arch.apf.pageready_pending = false;
+            kvm_check_async_pf_completion(vcpu);
+        }
+        break;
     case MSR_KVM_STEAL_TIME:
 
         if (unlikely(!sched_info_on()))
@@ -3191,6 +3197,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
     case MSR_KVM_ASYNC_PF_INT:
         msr_info->data = vcpu->arch.apf.msr_int_val;
         break;
+    case MSR_KVM_ASYNC_PF_ACK:
+        msr_info->data = 0;
+        break;
     case MSR_KVM_STEAL_TIME:
         msr_info->data = vcpu->arch.st.msr_val;
         break;
@@ -8336,6 +8345,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
             kvm_hv_process_stimers(vcpu);
         if (kvm_check_request(KVM_REQ_APICV_UPDATE, vcpu))
             kvm_vcpu_update_apicv(vcpu);
+        if (kvm_check_request(KVM_REQ_APF_READY, vcpu))
+            kvm_check_async_pf_completion(vcpu);
     }
 
     if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
@@ -8610,8 +8621,6 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
             break;
         }
 
-        kvm_check_async_pf_completion(vcpu);
-
         if (signal_pending(current)) {
             r = -EINTR;
             vcpu->run->exit_reason = KVM_EXIT_INTR;
@@ -10489,13 +10498,22 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa);
 
     if (kvm_pv_async_pf_enabled(vcpu) &&
-        !apf_put_user_ready(vcpu, work->arch.token))
+        !apf_put_user_ready(vcpu, work->arch.token)) {
+        vcpu->arch.apf.pageready_pending = true;
         kvm_apic_set_irq(vcpu, &irq, NULL);
+    }
 
     vcpu->arch.apf.halted = false;
     vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 }
 
+void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu)
+{
+    kvm_make_request(KVM_REQ_APF_READY, vcpu);
+    if (!vcpu->arch.apf.pageready_pending)
+        kvm_vcpu_kick(vcpu);
+}
+
 bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
 {
     if (!kvm_pv_async_pf_enabled(vcpu))
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 15e5b037f92d..746d95cf98ad 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -51,6 +51,7 @@ static void async_pf_execute(struct work_struct *work)
     unsigned long addr = apf->addr;
     gpa_t cr2_or_gpa = apf->cr2_or_gpa;
     int locked = 1;
+    bool first;
 
     might_sleep();
 
@@ -69,10 +70,14 @@ static void async_pf_execute(struct work_struct *work)
         kvm_arch_async_page_present(vcpu, apf);
 
     spin_lock(&vcpu->async_pf.lock);
+    first = list_empty(&vcpu->async_pf.done);
     list_add_tail(&apf->link, &vcpu->async_pf.done);
     apf->vcpu = NULL;
     spin_unlock(&vcpu->async_pf.lock);
 
+    if (!IS_ENABLED(CONFIG_KVM_ASYNC_PF_SYNC) && first)
+        kvm_arch_async_page_present_queued(vcpu);
+
     /*
      * apf may be freed by kvm_check_async_pf_completion() after
      * this point
@@ -202,6 +207,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu)
 {
     struct kvm_async_pf *work;
+    bool first;
 
     if (!list_empty_careful(&vcpu->async_pf.done))
         return 0;
@@ -214,9 +220,13 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu)
     INIT_LIST_HEAD(&work->queue); /* for list_del to work */
 
     spin_lock(&vcpu->async_pf.lock);
+    first = list_empty(&vcpu->async_pf.done);
     list_add_tail(&work->link, &vcpu->async_pf.done);
     spin_unlock(&vcpu->async_pf.lock);
 
+    if (!IS_ENABLED(CONFIG_KVM_ASYNC_PF_SYNC) && first)
+        kvm_arch_async_page_present_queued(vcpu);
+
     vcpu->async_pf.queued++;
     return 0;
 }
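Taken together with the interrupt delivery from patch 4, the guest side
of this handshake is expected to look roughly as follows (illustrative
sketch only; the real guest implementation arrives in patch 7, and
'apf_reason' stands for the per-CPU struct kvm_vcpu_pv_apf_data):

  /* Illustrative 'page ready' interrupt handler. */
  void apf_pageready_intr(void)
  {
      u32 token = __this_cpu_read(apf_reason.pageready_token); /* host-written */

      kvm_async_pf_task_wake(token);                   /* resume stalled task(s) */
      __this_cpu_write(apf_reason.pageready_token, 0); /* free the slot... */
      wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1);                 /* ...and ask the host to
                                                          re-scan its queue */
  }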

From patchwork Mon May 11 16:47:50 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11541329
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 6/8] KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT
Date: Mon, 11 May 2020 18:47:50 +0200
Message-Id: <20200511164752.2158645-7-vkuznets@redhat.com>
In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com>
References: <20200511164752.2158645-1-vkuznets@redhat.com>

Introduce a new capability to indicate that KVM supports interrupt based
delivery of type 2 APF events ('page ready' notifications). This
includes support for both MSR_KVM_ASYNC_PF_INT and MSR_KVM_ASYNC_PF_ACK.

Signed-off-by: Vitaly Kuznetsov
---
 Documentation/virt/kvm/cpuid.rst | 6 ++++++
 Documentation/virt/kvm/msr.rst | 9 ++++++---
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 arch/x86/kvm/cpuid.c | 3 ++-
 arch/x86/kvm/x86.c | 1 +
 include/uapi/linux/kvm.h | 1 +
 6 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
index 01b081f6e7ea..98cbd257ac33 100644
--- a/Documentation/virt/kvm/cpuid.rst
+++ b/Documentation/virt/kvm/cpuid.rst
@@ -86,6 +86,12 @@ KVM_FEATURE_PV_SCHED_YIELD        13      guest checks this feature bit
                                           before using paravirtualized
                                           sched yield.
 
+KVM_FEATURE_ASYNC_PF_INT          14      guest checks this feature bit
+                                          before using the second async
+                                          pf control msr 0x4b564d06 and
+                                          async pf acknowledgment msr
+                                          0x4b564d07.
+
 KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24      host will warn if no guest-side
                                           per-cpu warps are expeced in
                                           kvmclock
diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index 3c7afbcd243f..0e1ec0412714 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -209,7 +209,8 @@ data:
     cpl == 0. Bit 2 is 1 if asynchronous page faults are delivered to L1 as
     #PF vmexits. Bit 2 can be set only if KVM_FEATURE_ASYNC_PF_VMEXIT is
     present in CPUID. Bit 3 enables interrupt based delivery of type 2
-    (page present) events.
+    (page present) events. Bit 3 can be set only if KVM_FEATURE_ASYNC_PF_INT
+    is present in CPUID.
 
     First 4 bytes of the 64 byte memory location ('reason') will be
     written to by the hypervisor at the time of APF type 1 (page not
     present) injection.
@@ -352,7 +353,8 @@ data:
     Interrupt vector for asynchronous page ready notifications delivery.
     The vector has to be set up before asynchronous page fault mechanism
-    is enabled in MSR_KVM_ASYNC_PF_EN.
+    is enabled in MSR_KVM_ASYNC_PF_EN. The MSR is only available if
+    KVM_FEATURE_ASYNC_PF_INT is present in CPUID.
 
 MSR_KVM_ASYNC_PF_ACK:
     0x4b564d07
@@ -363,4 +365,5 @@ data:
     'pageready_token' field in 'struct kvm_vcpu_pv_apf_data' is cleared
     it is supposed to write '1' to bit 0 of the MSR; this causes the
     host to re-scan its queue and check if there are more notifications
-    pending.
+    pending. The MSR is only available if KVM_FEATURE_ASYNC_PF_INT is
+    present in CPUID.
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 941fdf4569a4..921a55de5004 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -31,6 +31,7 @@
 #define KVM_FEATURE_PV_SEND_IPI 11
 #define KVM_FEATURE_POLL_CONTROL    12
 #define KVM_FEATURE_PV_SCHED_YIELD  13
+#define KVM_FEATURE_ASYNC_PF_INT    14
 
 #define KVM_HINTS_REALTIME      0
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 901cd1fdecd9..790fe4988001 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -712,7 +712,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
                  (1 << KVM_FEATURE_ASYNC_PF_VMEXIT) |
                  (1 << KVM_FEATURE_PV_SEND_IPI) |
                  (1 << KVM_FEATURE_POLL_CONTROL) |
-                 (1 << KVM_FEATURE_PV_SCHED_YIELD);
+                 (1 << KVM_FEATURE_PV_SCHED_YIELD) |
+                 (1 << KVM_FEATURE_ASYNC_PF_INT);
 
         if (sched_info_on())
             entry->eax |= (1 << KVM_FEATURE_STEAL_TIME);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d21adc23a99f..ad826d922368 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3412,6 +3412,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
     case KVM_CAP_X86_ROBUST_SINGLESTEP:
     case KVM_CAP_XSAVE:
     case KVM_CAP_ASYNC_PF:
+    case KVM_CAP_ASYNC_PF_INT:
     case KVM_CAP_GET_TSC_KHZ:
     case KVM_CAP_KVMCLOCK_CTRL:
     case KVM_CAP_READONLY_MEM:
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 428c7dde6b4b..15012f78a691 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1017,6 +1017,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_VCPU_RESETS 179
 #define KVM_CAP_S390_PROTECTED 180
 #define KVM_CAP_PPC_SECURE_GUEST 181
+#define KVM_CAP_ASYNC_PF_INT 182
 
 #ifdef KVM_CAP_IRQ_ROUTING
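A guest consuming this capability simply gates use of the new MSRs on
the feature bit; a minimal sketch (this mirrors the check patch 7 adds
to the guest side):

  /* kvm_para_has_feature() reads the KVM paravirt CPUID leaf. */
  if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT)) {
      /* MSR_KVM_ASYNC_PF_INT and MSR_KVM_ASYNC_PF_ACK may be used. */
  }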
"EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1730825AbgEKQsv (ORCPT ); Mon, 11 May 2020 12:48:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1589215729; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mwdo0THIXAe5OCi/G0iGHM3nS/e1+x0NxdiSJYJ9Fq0=; b=awVW/a1lofStJmFQuLo01fuoDO9INOaEGmys9em+StnkBUHZS51C4kkl7urw0De8CoxfrG 5nIKsQOhmKCghOR1giKUF2IBbmE1NDZVNWvv56NfFDfNIpRjivWBz6GEbA7kvGGqaEu7TO gaowJDJtESNQVvw1xEPZ2wRJeXx8OAE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-147-I0XVmSwtOCygBZwlu5RC9A-1; Mon, 11 May 2020 12:48:48 -0400 X-MC-Unique: I0XVmSwtOCygBZwlu5RC9A-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0AB05846269; Mon, 11 May 2020 16:48:46 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.195.255]) by smtp.corp.redhat.com (Postfix) with ESMTP id 40752341FD; Mon, 11 May 2020 16:48:42 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, x86@kernel.org Cc: Paolo Bonzini , Andy Lutomirski , Thomas Gleixner , Borislav Petkov , "H. Peter Anvin" , Wanpeng Li , Sean Christopherson , Jim Mattson , Vivek Goyal , Gavin Shan , Peter Zijlstra , linux-kernel@vger.kernel.org Subject: [PATCH 7/8] KVM: x86: Switch KVM guest to using interrupts for page ready APF delivery Date: Mon, 11 May 2020 18:47:51 +0200 Message-Id: <20200511164752.2158645-8-vkuznets@redhat.com> In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com> References: <20200511164752.2158645-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM now supports using interrupt for type 2 APF event delivery (page ready notifications) and legacy mechanism was deprecated. Switch KVM guests to the new one. 
Signed-off-by: Vitaly Kuznetsov
Reported-by: kbuild test robot
---
 arch/x86/entry/entry_32.S | 5 ++++
 arch/x86/entry/entry_64.S | 5 ++++
 arch/x86/include/asm/hardirq.h | 3 +++
 arch/x86/include/asm/irq_vectors.h | 6 ++++-
 arch/x86/include/asm/kvm_para.h | 6 +++++
 arch/x86/kernel/irq.c | 9 +++++++
 arch/x86/kernel/kvm.c | 42 +++++++++++++++++++++++-------
 7 files changed, 65 insertions(+), 11 deletions(-)

diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index b67bae7091d7..d574dadcb2a1 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -1475,6 +1475,11 @@ BUILD_INTERRUPT3(hv_stimer0_callback_vector, HYPERV_STIMER0_VECTOR,
 
 #endif /* CONFIG_HYPERV */
 
+#ifdef CONFIG_KVM_GUEST
+BUILD_INTERRUPT3(kvm_async_pf_vector, KVM_ASYNC_PF_VECTOR,
+         kvm_async_pf_intr)
+#endif
+
 SYM_CODE_START(page_fault)
     ASM_CLAC
     pushl   $do_page_fault
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0e9504fabe52..6f127c1a6547 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1190,6 +1190,11 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
     acrn_hv_callback_vector acrn_hv_vector_handler
 #endif
 
+#ifdef CONFIG_KVM_GUEST
+apicinterrupt3 KVM_ASYNC_PF_VECTOR \
+    kvm_async_pf_vector kvm_async_pf_intr
+#endif
+
 idtentry debug          do_debug        has_error_code=0    paranoid=1 shift_ist=IST_INDEX_DB ist_offset=DB_STACK_OFFSET
 idtentry int3           do_int3         has_error_code=0    create_gap=1
 idtentry stack_segment      do_stack_segment    has_error_code=1
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 07533795b8d2..be0fbb15ad7f 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -44,6 +44,9 @@ typedef struct {
     unsigned int irq_hv_reenlightenment_count;
     unsigned int hyperv_stimer0_count;
 #endif
+#ifdef CONFIG_KVM_GUEST
+    unsigned int kvm_async_pf_pageready_count;
+#endif
 } ____cacheline_aligned irq_cpustat_t;
 
 DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 889f8b1b5b7f..8879a9ecd908 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -104,7 +104,11 @@
 #define HYPERV_STIMER0_VECTOR      0xed
 #endif
 
-#define LOCAL_TIMER_VECTOR     0xec
+#ifdef CONFIG_KVM_GUEST
+#define KVM_ASYNC_PF_VECTOR        0xec
+#endif
+
+#define LOCAL_TIMER_VECTOR     0xeb
 
 #define NR_VECTORS           256
 
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 9b4df6eaa11a..fde4f21607f9 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -4,6 +4,7 @@
 
 #include <asm/processor.h>
 #include <asm/alternative.h>
+#include <asm/irq_vectors.h>
 #include <uapi/asm/kvm_para.h>
 
 extern void kvmclock_init(void);
@@ -93,6 +94,11 @@ void kvm_async_pf_task_wake(u32 token);
 u32 kvm_read_and_reset_pf_reason(void);
 extern void kvm_disable_steal_time(void);
 void do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
+extern void kvm_async_pf_vector(void);
+#ifdef CONFIG_TRACING
+#define trace_kvm_async_pf_vector kvm_async_pf_vector
+#endif
+__visible void __irq_entry kvm_async_pf_intr(struct pt_regs *regs);
 
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
 void __init kvm_spinlock_init(void);
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index c7965ff429c5..a4c2f25ad74d 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -159,6 +159,15 @@ int arch_show_interrupts(struct seq_file *p, int prec)
                    irq_stats(j)->hyperv_stimer0_count);
         seq_puts(p, "  Hyper-V stimer0 interrupts\n");
     }
+#endif
+#ifdef CONFIG_KVM_GUEST
(test_bit(KVM_ASYNC_PF_VECTOR, system_vectors)) { + seq_printf(p, "%*s: ", prec, "APF"); + for_each_online_cpu(j) + seq_printf(p, "%10u ", + irq_stats(j)->kvm_async_pf_pageready_count); + seq_puts(p, " KVM async PF page ready interrupts\n"); + } #endif seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count)); #if defined(CONFIG_X86_IO_APIC) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 6efe0410fb72..b6453d227604 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -245,23 +245,40 @@ NOKPROBE_SYMBOL(kvm_read_and_reset_pf_reason); dotraplinkage void do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address) { - switch (kvm_read_and_reset_pf_reason()) { - default: - do_page_fault(regs, error_code, address); - break; + u32 reason = kvm_read_and_reset_pf_reason(); + + switch (reason) { case KVM_PV_REASON_PAGE_NOT_PRESENT: /* page is swapped out by the host. */ kvm_async_pf_task_wait((u32)address, !user_mode(regs)); break; - case KVM_PV_REASON_PAGE_READY: - rcu_irq_enter(); - kvm_async_pf_task_wake((u32)address); - rcu_irq_exit(); + case 0: + do_page_fault(regs, error_code, address); break; + default: + WARN_ONCE(1, "Unexpected async PF reason: %x\n", reason); } } NOKPROBE_SYMBOL(do_async_page_fault); +__visible void __irq_entry kvm_async_pf_intr(struct pt_regs *regs) +{ + u32 token, reason; + + entering_ack_irq(); + + inc_irq_stat(kvm_async_pf_pageready_count); + + if (__this_cpu_read(apf_reason.enabled)) { + token = __this_cpu_read(apf_reason.pageready_token); + kvm_async_pf_task_wake(token); + __this_cpu_write(apf_reason.pageready_token, 0); + wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1); + } + + exiting_irq(); +} + static void __init paravirt_ops_setup(void) { pv_info.name = "KVM"; @@ -305,17 +322,19 @@ static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val) static void kvm_guest_cpu_init(void) { - if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) { + if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT) && kvmapf) { u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason)); #ifdef CONFIG_PREEMPTION pa |= KVM_ASYNC_PF_SEND_ALWAYS; #endif - pa |= KVM_ASYNC_PF_ENABLED; + pa |= KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT; if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_VMEXIT)) pa |= KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT; + wrmsrl(MSR_KVM_ASYNC_PF_INT, KVM_ASYNC_PF_VECTOR); + wrmsrl(MSR_KVM_ASYNC_PF_EN, pa); __this_cpu_write(apf_reason.enabled, 1); printk(KERN_INFO"KVM setup async PF for cpu %d\n", @@ -649,6 +668,9 @@ static void __init kvm_guest_init(void) if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) apic_set_eoi_write(kvm_guest_apic_eoi_write); + if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT)) + alloc_intr_gate(KVM_ASYNC_PF_VECTOR, kvm_async_pf_vector); + #ifdef CONFIG_SMP smp_ops.smp_prepare_cpus = kvm_smp_prepare_cpus; smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu; From patchwork Mon May 11 16:47:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 11541333 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9BAB314C0 for ; Mon, 11 May 2020 16:48:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 83A2420736 for ; Mon, 11 May 2020 16:48:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) 
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov,
 "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson,
 Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH 8/8] KVM: x86: drop KVM_PV_REASON_PAGE_READY case from
 kvm_handle_page_fault()
Date: Mon, 11 May 2020 18:47:52 +0200
Message-Id: <20200511164752.2158645-9-vkuznets@redhat.com>
In-Reply-To: <20200511164752.2158645-1-vkuznets@redhat.com>
References: <20200511164752.2158645-1-vkuznets@redhat.com>

KVM guest code in Linux enables APF only when KVM_FEATURE_ASYNC_PF_INT is
supported; this means we will never see KVM_PV_REASON_PAGE_READY when
handling a page fault vmexit in KVM.

While at it, make sure we only follow the genuine page fault path when the
APF reason is zero. If we happen to see something else, the underlying
hypervisor is misbehaving; add a WARN_ON_ONCE() to catch that.
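With the change, the reason handling in kvm_handle_page_fault() takes the
following shape; this is a condensed sketch of the diff below, with the
event reinjection and return-value handling elided (all names come from the
diff itself):

	switch (vcpu->arch.apf.host_apf_reason) {
	case 0:
		/* Not an APF notification: handle as a genuine #PF. */
		trace_kvm_page_fault(fault_address, error_code);
		break;
	case KVM_PV_REASON_PAGE_NOT_PRESENT:
		/* Page is unavailable on the host; suspend the task. */
		vcpu->arch.apf.host_apf_reason = 0;
		kvm_async_pf_task_wait(fault_address, 0);
		break;
	default:
		/*
		 * 'Page ready' can no longer arrive via a #PF vmexit;
		 * anything else here means the hypervisor misbehaves.
		 */
		WARN_ON_ONCE(1);
	}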
Signed-off-by: Vitaly Kuznetsov
---
 arch/x86/kvm/mmu/mmu.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8071952e9cf2..5a9fca908ca9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4187,7 +4187,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 	vcpu->arch.l1tf_flush_l1d = true;
 	switch (vcpu->arch.apf.host_apf_reason) {
-	default:
+	case 0:
 		trace_kvm_page_fault(fault_address, error_code);
 
 		if (kvm_event_needs_reinjection(vcpu))
@@ -4201,12 +4201,8 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 		kvm_async_pf_task_wait(fault_address, 0);
 		local_irq_enable();
 		break;
-	case KVM_PV_REASON_PAGE_READY:
-		vcpu->arch.apf.host_apf_reason = 0;
-		local_irq_disable();
-		kvm_async_pf_task_wake(fault_address);
-		local_irq_enable();
-		break;
+	default:
+		WARN_ON_ONCE(1);
 	}
 	return r;
 }