From patchwork Mon May 25 14:41:16 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11568917
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH v2 01/10] Revert "KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously"
Date: Mon, 25 May 2020 16:41:16 +0200
Message-Id: <20200525144125.143875-2-vkuznets@redhat.com>
In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com>
References: <20200525144125.143875-1-vkuznets@redhat.com>

Commit 9a6e7c39810e ("KVM: async_pf: Fix #DF due to inject "Page not
Present" and "Page Ready" exceptions simultaneously") added protection
against a 'page ready' notification arriving before the corresponding
'page not present' notification is delivered. This situation has been
impossible since commit 2a266f23550b ("KVM MMU: check pending exception
before injecting APF"), which added a 'vcpu->arch.exception.pending'
check to kvm_can_do_async_pf().
On x86, kvm_arch_async_page_present() has a single call site, the
kvm_check_async_pf_completion() loop, and we only enter that loop when
kvm_arch_can_inject_async_page_present(vcpu) returns true, which, when
the async PF MSR is enabled, translates into kvm_can_do_async_pf().

There is also a problem with the cancellation mechanism itself: we never
check that the 'page not present' notification being cancelled matches
the 'page ready' notification, so in theory we may erroneously drop two
valid events.

Revert the commit.

Reviewed-by: Gavin Shan
Signed-off-by: Vitaly Kuznetsov
---
 arch/x86/kvm/x86.c | 23 +----------------------
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c17e6eb9ad43..8e0bbe4f0d5b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10364,13 +10364,6 @@ static int apf_put_user(struct kvm_vcpu *vcpu, u32 val)
                sizeof(val));
 }
 
-static int apf_get_user(struct kvm_vcpu *vcpu, u32 *val)
-{
-
-    return kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, val,
-                 sizeof(u32));
-}
-
 static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu)
 {
     if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu))
@@ -10435,7 +10428,6 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                  struct kvm_async_pf *work)
 {
     struct x86_exception fault;
-    u32 val;
 
     if (work->wakeup_all)
         work->arch.token = ~0; /* broadcast wakeup */
@@ -10444,19 +10436,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa);
 
     if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED &&
-        !apf_get_user(vcpu, &val)) {
-        if (val == KVM_PV_REASON_PAGE_NOT_PRESENT &&
-            vcpu->arch.exception.pending &&
-            vcpu->arch.exception.nr == PF_VECTOR &&
-            !apf_put_user(vcpu, 0)) {
-            vcpu->arch.exception.injected = false;
-            vcpu->arch.exception.pending = false;
-            vcpu->arch.exception.nr = 0;
-            vcpu->arch.exception.has_error_code = false;
-            vcpu->arch.exception.error_code = 0;
-            vcpu->arch.exception.has_payload = false;
-            vcpu->arch.exception.payload = 0;
-        } else if (!apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
+        !apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
         fault.vector = PF_VECTOR;
         fault.error_code_valid = true;
         fault.error_code = 0;
@@ -10464,7 +10444,6 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
         fault.address = work->arch.token;
         fault.async_page_fault = true;
         kvm_inject_page_fault(vcpu, &fault);
-    }
     }
 
     vcpu->arch.apf.halted = false;
     vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
From patchwork Mon May 25 14:41:17 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11568919
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH v2 02/10] KVM: x86: extend struct kvm_vcpu_pv_apf_data with token info
Date: Mon, 25 May 2020 16:41:17 +0200
Message-Id: <20200525144125.143875-3-vkuznets@redhat.com>
In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com>
References: <20200525144125.143875-1-vkuznets@redhat.com>

Currently, the APF mechanism relies on #PF abuse: the token is passed
through CR2. If we switch to using interrupts to deliver 'page ready'
notifications, we need a different way to pass the data.

Extend the existing 'struct kvm_vcpu_pv_apf_data' with token information
for 'page ready' notifications.

While at it, rename 'reason' to 'flags'. This doesn't change the
semantics, as we only have reasons '1' and '2' and these can be treated
as bit flags, but KVM_PV_REASON_PAGE_READY is going away with interrupt
based delivery, making the 'reason' name misleading.

The newly introduced apf_put_user_ready() temporarily writes both flags
and token information; it will be changed to write the token only when
we switch to interrupt based notifications.

Signed-off-by: Vitaly Kuznetsov
---
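[Editor's note: a minimal sketch of the new layout and of how the interim
apf_put_user_ready() packs both fields into one 64-bit write. This is an
illustration derived from the hunks below, not part of the patch; the
helper name pack_page_ready() is hypothetical.]

    #include <stdint.h>

    /* Mirrors the extended layout: 'reason' becomes 'flags' and a
     * 32-bit 'token' is carved out of the padding, so the offsets of
     * 'pad' and 'enabled' are unchanged. */
    struct kvm_vcpu_pv_apf_data {
        uint32_t flags;
        uint32_t token;   /* 'page ready' notifications only */
        uint8_t  pad[56];
        uint32_t enabled;
    };

    /* On little-endian x86, a single 64-bit write at offset 0 puts
     * KVM_PV_REASON_PAGE_READY (2) into 'flags' (low 32 bits) and the
     * token into 'token' (high 32 bits), back to back. */
    static inline uint64_t pack_page_ready(uint32_t token)
    {
        return (uint64_t)token << 32 | 2 /* KVM_PV_REASON_PAGE_READY */;
    }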
 arch/x86/include/asm/kvm_host.h      |  2 +-
 arch/x86/include/asm/kvm_para.h      |  4 ++--
 arch/x86/include/uapi/asm/kvm_para.h |  5 +++--
 arch/x86/kernel/kvm.c                | 16 ++++++++--------
 arch/x86/kvm/mmu/mmu.c               |  6 +++---
 arch/x86/kvm/svm/nested.c            |  2 +-
 arch/x86/kvm/svm/svm.c               |  3 ++-
 arch/x86/kvm/vmx/nested.c            |  2 +-
 arch/x86/kvm/vmx/vmx.c               |  5 +++--
 arch/x86/kvm/x86.c                   | 17 +++++++++++++----
 10 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0a6b35353fc7..c195f63c1086 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -767,7 +767,7 @@ struct kvm_vcpu_arch {
         u64 msr_val;
         u32 id;
         bool send_user_only;
-        u32 host_apf_reason;
+        u32 host_apf_flags;
         unsigned long nested_apf_token;
         bool delivery_as_pf_vmexit;
     } apf;
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 9b4df6eaa11a..2a3102fee189 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -90,7 +90,7 @@ unsigned int kvm_arch_para_features(void);
 unsigned int kvm_arch_para_hints(void);
 void kvm_async_pf_task_wait(u32 token, int interrupt_kernel);
 void kvm_async_pf_task_wake(u32 token);
-u32 kvm_read_and_reset_pf_reason(void);
+u32 kvm_read_and_reset_apf_flags(void);
 extern void kvm_disable_steal_time(void);
 void do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address);
@@ -121,7 +121,7 @@ static inline unsigned int kvm_arch_para_hints(void)
     return 0;
 }
 
-static inline u32 kvm_read_and_reset_pf_reason(void)
+static inline u32 kvm_read_and_reset_apf_flags(void)
 {
     return 0;
 }
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 2a8e0b6b9805..d1cd5c0f431a 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -112,8 +112,9 @@ struct kvm_mmu_op_release_pt {
 #define KVM_PV_REASON_PAGE_READY 2
 
 struct kvm_vcpu_pv_apf_data {
-    __u32 reason;
-    __u8 pad[60];
+    __u32 flags;
+    __u32 token; /* Used for page ready notification only */
+    __u8 pad[56];
     __u32 enabled;
 };
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 6efe0410fb72..340df5dab30d 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -228,24 +228,24 @@ void kvm_async_pf_task_wake(u32 token)
 }
 EXPORT_SYMBOL_GPL(kvm_async_pf_task_wake);
 
-u32 kvm_read_and_reset_pf_reason(void)
+u32 kvm_read_and_reset_apf_flags(void)
 {
-    u32 reason = 0;
+    u32 flags = 0;
 
     if (__this_cpu_read(apf_reason.enabled)) {
-        reason = __this_cpu_read(apf_reason.reason);
-        __this_cpu_write(apf_reason.reason, 0);
+        flags = __this_cpu_read(apf_reason.flags);
+        __this_cpu_write(apf_reason.flags, 0);
     }
 
-    return reason;
+    return flags;
 }
-EXPORT_SYMBOL_GPL(kvm_read_and_reset_pf_reason);
-NOKPROBE_SYMBOL(kvm_read_and_reset_pf_reason);
+EXPORT_SYMBOL_GPL(kvm_read_and_reset_apf_flags);
+NOKPROBE_SYMBOL(kvm_read_and_reset_apf_flags);
 
 dotraplinkage void
 do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address)
 {
-    switch (kvm_read_and_reset_pf_reason()) {
+    switch (kvm_read_and_reset_apf_flags()) {
     default:
         do_page_fault(regs, error_code, address);
         break;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8071952e9cf2..7fa5253237b2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4186,7 +4186,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 #endif
 
     vcpu->arch.l1tf_flush_l1d = true;
-    switch (vcpu->arch.apf.host_apf_reason) {
+    switch (vcpu->arch.apf.host_apf_flags) {
     default:
         trace_kvm_page_fault(fault_address, error_code);
 
@@ -4196,13 +4196,13 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
             insn_len);
         break;
     case KVM_PV_REASON_PAGE_NOT_PRESENT:
-        vcpu->arch.apf.host_apf_reason = 0;
+        vcpu->arch.apf.host_apf_flags = 0;
         local_irq_disable();
         kvm_async_pf_task_wait(fault_address, 0);
         local_irq_enable();
         break;
     case KVM_PV_REASON_PAGE_READY:
-        vcpu->arch.apf.host_apf_reason = 0;
+        vcpu->arch.apf.host_apf_flags = 0;
         local_irq_disable();
         kvm_async_pf_task_wake(fault_address);
         local_irq_enable();
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 9a2a62e5afeb..c04adbbdbd20 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -835,7 +835,7 @@ int nested_svm_exit_special(struct vcpu_svm *svm)
         break;
     case SVM_EXIT_EXCP_BASE + PF_VECTOR:
         /* When we're shadowing, trap PFs, but not async PF */
-        if (!npt_enabled && svm->vcpu.arch.apf.host_apf_reason == 0)
+        if (!npt_enabled && svm->vcpu.arch.apf.host_apf_flags == 0)
             return NESTED_EXIT_HOST;
         break;
     default:
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a862c768fd54..70d136e9e075 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3399,7 +3399,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 
     /* if exit due to PF check for async PF */
     if (svm->vmcb->control.exit_code == SVM_EXIT_EXCP_BASE + PF_VECTOR)
-        svm->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
+        svm->vcpu.arch.apf.host_apf_flags =
+            kvm_read_and_reset_apf_flags();
 
     if (npt_enabled) {
         vcpu->arch.regs_avail &= ~(1 << VCPU_EXREG_PDPTR);
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e44f33c82332..c82d59070a9c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5582,7 +5582,7 @@ bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason)
         if (is_nmi(intr_info))
             return false;
         else if (is_page_fault(intr_info))
-            return !vmx->vcpu.arch.apf.host_apf_reason && enable_ept;
+            return !vmx->vcpu.arch.apf.host_apf_flags && enable_ept;
         else if (is_debug(intr_info) &&
              vcpu->guest_debug &
              (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 89c766fad889..4a82319353f2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4662,7 +4662,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
     if (is_page_fault(intr_info)) {
         cr2 = vmcs_readl(EXIT_QUALIFICATION);
         /* EPT won't cause page fault directly */
-        WARN_ON_ONCE(!vcpu->arch.apf.host_apf_reason && enable_ept);
+        WARN_ON_ONCE(!vcpu->arch.apf.host_apf_flags && enable_ept);
         return kvm_handle_page_fault(vcpu, error_code, cr2, NULL, 0);
     }
 
@@ -6234,7 +6234,8 @@ static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx)
 
     /* if exit due to PF check for async PF */
     if (is_page_fault(vmx->exit_intr_info)) {
-        vmx->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
+        vmx->vcpu.arch.apf.host_apf_flags =
+            kvm_read_and_reset_apf_flags();
     /* Handle machine checks before interrupts are enabled */
     } else if (is_machine_check(vmx->exit_intr_info)) {
         kvm_machine_check();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8e0bbe4f0d5b..a3102406102c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2654,7 +2654,7 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
     }
 
     if (kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.apf.data, gpa,
-                      sizeof(u32)))
+                      sizeof(u64)))
         return 1;
 
     vcpu->arch.apf.send_user_only = !(data & KVM_ASYNC_PF_SEND_ALWAYS);
@@ -10357,8 +10357,17 @@ static void kvm_del_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn)
     }
 }
 
-static int apf_put_user(struct kvm_vcpu *vcpu, u32 val)
+static inline int apf_put_user_notpresent(struct kvm_vcpu *vcpu)
 {
+    u32 reason = KVM_PV_REASON_PAGE_NOT_PRESENT;
+
+    return kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, &reason,
+                      sizeof(reason));
+}
+
+static inline int apf_put_user_ready(struct kvm_vcpu *vcpu, u32 token)
+{
+    u64 val = (u64)token << 32 | KVM_PV_REASON_PAGE_READY;
 
     return kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, &val,
                       sizeof(val));
@@ -10403,7 +10412,7 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
     kvm_add_async_pf_gfn(vcpu, work->arch.gfn);
 
     if (kvm_can_deliver_async_pf(vcpu) &&
-        !apf_put_user(vcpu, KVM_PV_REASON_PAGE_NOT_PRESENT)) {
+        !apf_put_user_notpresent(vcpu)) {
         fault.vector = PF_VECTOR;
         fault.error_code_valid = true;
         fault.error_code = 0;
@@ -10436,7 +10445,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa);
 
     if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED &&
-        !apf_put_user(vcpu, KVM_PV_REASON_PAGE_READY)) {
+        !apf_put_user_ready(vcpu, work->arch.token)) {
         fault.vector = PF_VECTOR;
         fault.error_code_valid = true;
         fault.error_code = 0;

From patchwork Mon May 25 14:41:18 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11568923
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH v2 03/10] KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present()
Date: Mon, 25 May 2020 16:41:18 +0200
Message-Id: <20200525144125.143875-4-vkuznets@redhat.com>
In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com>
References: <20200525144125.143875-1-vkuznets@redhat.com>

An innocent reader of the following x86 KVM code:

  bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
  {
      if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED))
          return true;
  ...

may get very confused: if the APF mechanism is not enabled, why do we
report that we 'can inject async page present'? In reality, upon
injection kvm_arch_async_page_present() checks the same condition again
and, in case APF is disabled, just drops the item. This is fine, as a
guest which deliberately disabled APF doesn't expect to get any APF
notifications.

Rename kvm_arch_can_inject_async_page_present() to
kvm_arch_can_dequeue_async_page_present() to make it clear what we are
actually checking: whether the item can be dequeued (meaning either
injected or just dropped).

On s390, kvm_arch_can_inject_async_page_present() always returns 'true',
so the rename doesn't matter much there.

Signed-off-by: Vitaly Kuznetsov
---
 arch/s390/include/asm/kvm_host.h | 2 +-
 arch/s390/kvm/kvm-s390.c         | 2 +-
 arch/x86/include/asm/kvm_host.h  | 2 +-
 arch/x86/kvm/x86.c               | 2 +-
 virt/kvm/async_pf.c              | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index d6bcd34f3ec3..5ba9968c3436 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -971,7 +971,7 @@ struct kvm_arch_async_pf {
     unsigned long pfault_token;
 };
 
-bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
+bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu);
 
 void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
                    struct kvm_async_pf *work);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d05bb040fd42..64128a336c24 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3944,7 +3944,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
     /* s390 will always inject the page directly */
 }
 
-bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
+bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu)
 {
     /*
      * s390 will always inject the page directly,
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c195f63c1086..459dc824b10b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1641,7 +1641,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                    struct kvm_async_pf *work);
 void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
                    struct kvm_async_pf *work);
-bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu);
+bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu);
 
 extern bool kvm_find_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 int kvm_skip_emulated_instruction(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a3102406102c..bfa7e07fd6fd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10458,7 +10458,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 }
 
-bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
+bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu)
 {
     if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED))
         return true;
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 15e5b037f92d..8a8770c7c889 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -135,7 +135,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
     struct kvm_async_pf *work;
 
     while (!list_empty_careful(&vcpu->async_pf.done) &&
-           kvm_arch_can_inject_async_page_present(vcpu)) {
+           kvm_arch_can_dequeue_async_page_present(vcpu)) {
         spin_lock(&vcpu->async_pf.lock);
         work = list_first_entry(&vcpu->async_pf.done, typeof(*work),
                     link);

From patchwork Mon May 25 14:41:19 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11568921
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH v2 04/10] KVM: introduce kvm_read_guest_offset_cached()
Date: Mon, 25 May 2020 16:41:19 +0200
Message-Id: <20200525144125.143875-5-vkuznets@redhat.com>
In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com>
References: <20200525144125.143875-1-vkuznets@redhat.com>

We already have kvm_write_guest_offset_cached(); introduce its read
analogue.

Signed-off-by: Vitaly Kuznetsov
---
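[Editor's note: a hedged sketch of how the new helper is meant to be
called. It mirrors the later use in patch 5; the wrapper name
apf_read_token() is hypothetical, while the signature and field offset
are taken from the hunks below.]

    /* Read only the 'token' field of struct kvm_vcpu_pv_apf_data through
     * an already-initialized gfn_to_hva_cache, without re-reading the
     * whole structure. Returns 0 on success, non-zero on failure. */
    static int apf_read_token(struct kvm_vcpu *vcpu, u32 *token)
    {
        unsigned int offset = offsetof(struct kvm_vcpu_pv_apf_data, token);

        return kvm_read_guest_offset_cached(vcpu->kvm, &vcpu->arch.apf.data,
                                            token, offset, sizeof(*token));
    }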
Peter Anvin" , Wanpeng Li , Sean Christopherson , Jim Mattson , Vivek Goyal , Gavin Shan , Peter Zijlstra , linux-kernel@vger.kernel.org Subject: [PATCH v2 04/10] KVM: introduce kvm_read_guest_offset_cached() Date: Mon, 25 May 2020 16:41:19 +0200 Message-Id: <20200525144125.143875-5-vkuznets@redhat.com> In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com> References: <20200525144125.143875-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org We already have kvm_write_guest_offset_cached(), introduce read analogue. Signed-off-by: Vitaly Kuznetsov --- include/linux/kvm_host.h | 3 +++ virt/kvm/kvm_main.c | 19 ++++++++++++++----- 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 131cc1527d68..8020f7ab7409 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -733,6 +733,9 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset, int kvm_read_guest(struct kvm *kvm, gpa_t gpa, void *data, unsigned long len); int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, void *data, unsigned long len); +int kvm_read_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, + void *data, unsigned int offset, + unsigned long len); int kvm_write_guest_page(struct kvm *kvm, gfn_t gfn, const void *data, int offset, int len); int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data, diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 731c1e517716..7686819687da 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2511,13 +2511,15 @@ int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, } EXPORT_SYMBOL_GPL(kvm_write_guest_cached); -int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, - void *data, unsigned long len) +int kvm_read_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, + void *data, unsigned int offset, + unsigned long len) { struct kvm_memslots *slots = kvm_memslots(kvm); int r; + gpa_t gpa = ghc->gpa + offset; - BUG_ON(len > ghc->len); + BUG_ON(len + offset > ghc->len); if (slots->generation != ghc->generation) { if (__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len)) @@ -2528,14 +2530,21 @@ int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, return -EFAULT; if (unlikely(!ghc->memslot)) - return kvm_read_guest(kvm, ghc->gpa, data, len); + return kvm_read_guest(kvm, gpa, data, len); - r = __copy_from_user(data, (void __user *)ghc->hva, len); + r = __copy_from_user(data, (void __user *)ghc->hva + offset, len); if (r) return -EFAULT; return 0; } +EXPORT_SYMBOL_GPL(kvm_read_guest_offset_cached); + +int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc, + void *data, unsigned long len) +{ + return kvm_read_guest_offset_cached(kvm, ghc, data, 0, len); +} EXPORT_SYMBOL_GPL(kvm_read_guest_cached); int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len) From patchwork Mon May 25 14:41:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 11568925 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 956D8912 for ; Mon, 25 May 2020 14:42:00 +0000 (UTC) Received: from vger.kernel.org 
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH v2 05/10] KVM: x86: interrupt based APF 'page ready' event delivery
Date: Mon, 25 May 2020 16:41:20 +0200
Message-Id: <20200525144125.143875-6-vkuznets@redhat.com>
In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com>
References: <20200525144125.143875-1-vkuznets@redhat.com>

Concerns were expressed about APF delivery via synthetic #PF exception,
as in some cases such delivery may collide with a real page fault. For
'page ready' notifications we can easily switch to using an interrupt
instead. Introduce a new MSR_KVM_ASYNC_PF_INT mechanism and deprecate
the legacy one.

One notable difference between the two mechanisms is that an interrupt
may not get handled immediately, so whenever we would like to deliver
the next event (regardless of its type) we must be sure the guest has
read and cleared the previous event in the slot.

While at it, get rid of the 'type 1/type 2' names for APF events in the
documentation, as they cause confusion. Use 'page not present' and
'page ready' everywhere instead.

Signed-off-by: Vitaly Kuznetsov
---
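[Editor's note: a hedged sketch of the guest-side enable sequence this
patch introduces. MSR numbers and bit names come from the hunks below;
the function itself is illustrative, not code from this series.]

    /* Illustrative only: enable interrupt based 'page ready' delivery.
     * The interrupt vector must be configured before APF is enabled in
     * MSR_KVM_ASYNC_PF_EN, otherwise vector 0 could get injected. */
    static void __init enable_apf_int(u64 apf_area_gpa, u8 vector)
    {
        /* Bits 0-7 of MSR_KVM_ASYNC_PF_INT: APIC vector for 'page ready' */
        wrmsrl(MSR_KVM_ASYNC_PF_INT, vector);

        /* 64-byte aligned area in guest RAM, plus enable bits:
         * bit 0 = APF enabled, bit 3 = 'page ready' via interrupt */
        wrmsrl(MSR_KVM_ASYNC_PF_EN, apf_area_gpa | KVM_ASYNC_PF_ENABLED |
                                    KVM_ASYNC_PF_DELIVERY_AS_INT);
    }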
 Documentation/virt/kvm/msr.rst       | 102 +++++++++++++++++++--------
 arch/x86/include/asm/kvm_host.h      |   4 +-
 arch/x86/include/uapi/asm/kvm_para.h |  12 +++-
 arch/x86/kvm/x86.c                   |  93 +++++++++++++++++------
 4 files changed, 152 insertions(+), 59 deletions(-)

diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index 33892036672d..be08df12f31a 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -190,41 +190,68 @@ MSR_KVM_ASYNC_PF_EN:
 	0x4b564d02
 
 data:
-	Bits 63-6 hold 64-byte aligned physical address of a
-	64 byte memory area which must be in guest RAM and must be
-	zeroed. Bits 5-3 are reserved and should be zero. Bit 0 is 1
-	when asynchronous page faults are enabled on the vcpu 0 when
-	disabled. Bit 1 is 1 if asynchronous page faults can be injected
-	when vcpu is in cpl == 0. Bit 2 is 1 if asynchronous page faults
-	are delivered to L1 as #PF vmexits.  Bit 2 can be set only if
-	KVM_FEATURE_ASYNC_PF_VMEXIT is present in CPUID.
-
-	First 4 byte of 64 byte memory location will be written to by
-	the hypervisor at the time of asynchronous page fault (APF)
-	injection to indicate type of asynchronous page fault. Value
-	of 1 means that the page referred to by the page fault is not
-	present. Value 2 means that the page is now available. Disabling
-	interrupt inhibits APFs. Guest must not enable interrupt
-	before the reason is read, or it may be overwritten by another
-	APF. Since APF uses the same exception vector as regular page
-	fault guest must reset the reason to 0 before it does
-	something that can generate normal page fault.  If during page
-	fault APF reason is 0 it means that this is regular page
-	fault.
-
-	During delivery of type 1 APF cr2 contains a token that will
-	be used to notify a guest when missing page becomes
-	available. When page becomes available type 2 APF is sent with
-	cr2 set to the token associated with the page. There is special
-	kind of token 0xffffffff which tells vcpu that it should wake
-	up all processes waiting for APFs and no individual type 2 APFs
-	will be sent.
+	Asynchronous page fault (APF) control MSR.
+
+	Bits 63-6 hold 64-byte aligned physical address of a 64 byte memory area
+	which must be in guest RAM and must be zeroed. This memory is expected
+	to hold a copy of the following structure::
+
+	  struct kvm_vcpu_pv_apf_data {
+		/* Used for 'page not present' events delivered via #PF */
+		__u32 flags;
+
+		/* Used for 'page ready' events delivered via interrupt notification */
+		__u32 token;
+
+		__u8 pad[56];
+		__u32 enabled;
+	  };
+
+	Bits 5-4 of the MSR are reserved and should be zero. Bit 0 is set to 1
+	when asynchronous page faults are enabled on the vcpu, 0 when disabled.
+	Bit 1 is 1 if asynchronous page faults can be injected when vcpu is in
+	cpl == 0. Bit 2 is 1 if asynchronous page faults are delivered to L1 as
+	#PF vmexits. Bit 2 can be set only if KVM_FEATURE_ASYNC_PF_VMEXIT is
+	present in CPUID. Bit 3 enables interrupt based delivery of 'page ready'
+	events.
+
+	'Page not present' events are currently always delivered as synthetic
+	#PF exception. During delivery of these events APF CR2 register contains
+	a token that will be used to notify the guest when the missing page
+	becomes available. Also, to make it possible to distinguish between real
+	#PF and APF, the first 4 bytes of the 64 byte memory location ('flags')
+	will be written to by the hypervisor at the time of injection.
+	Only the first bit of 'flags' is currently supported; when set, it
+	indicates that the guest is dealing with an asynchronous 'page not
+	present' event. If during a page fault APF 'flags' is '0' it means that
+	this is a regular page fault. The guest is supposed to clear 'flags'
+	when it is done handling the #PF exception so the next event can be
+	delivered.
+
+	Note, since APF 'page not present' events use the same exception vector
+	as regular page fault, the guest must reset 'flags' to '0' before it
+	does something that can generate a normal page fault.
+
+	Bytes 4-7 of the 64 byte memory location ('token') will be written to by
+	the hypervisor at the time of APF 'page ready' event injection. The
+	content of these bytes is a token which was previously delivered as a
+	'page not present' event. The event indicates the page is now available.
+	The guest is supposed to write '0' to 'token' when it is done handling
+	the 'page ready' event so the next one can be delivered.
+
+	Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page
+	ready' APF delivery needs to be written to before enabling the APF
+	mechanism in MSR_KVM_ASYNC_PF_EN or interrupt #0 can get injected.
+
+	Note, previously, 'page ready' events were delivered via the same #PF
+	exception as 'page not present' events but this is now deprecated. If
+	bit 3 (interrupt based delivery) is not set APF events are not delivered.
 
 	If APF is disabled while there are outstanding APFs, they will
 	not be delivered.
 
-	Currently type 2 APF will be always delivered on the same vcpu as
-	type 1 was, but guest should not rely on that.
+	Currently 'page ready' APF events will be always delivered on the
+	same vcpu as 'page not present' event was, but guest should not rely
+	on that.
 
 MSR_KVM_STEAL_TIME:
 	0x4b564d03
@@ -319,3 +346,16 @@ data:
 
 	KVM guests can request the host not to poll on HLT, for example if
 	they are performing polling themselves.
+
+MSR_KVM_ASYNC_PF_INT:
+	0x4b564d06
+
+data:
+	Second asynchronous page fault (APF) control MSR.
+
+	Bits 0-7: APIC vector for delivery of 'page ready' APF events.
+	Bits 8-63: Reserved
+
+	Interrupt vector for asynchronous 'page ready' notifications delivery.
+	The vector has to be set up before the asynchronous page fault
+	mechanism is enabled in MSR_KVM_ASYNC_PF_EN.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 459dc824b10b..c2a70e25a1f3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -764,7 +764,9 @@ struct kvm_vcpu_arch {
         bool halted;
         gfn_t gfns[roundup_pow_of_two(ASYNC_PF_PER_VCPU)];
         struct gfn_to_hva_cache data;
-        u64 msr_val;
+        u64 msr_en_val; /* MSR_KVM_ASYNC_PF_EN */
+        u64 msr_int_val; /* MSR_KVM_ASYNC_PF_INT */
+        u16 vec;
         u32 id;
         bool send_user_only;
         u32 host_apf_flags;
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index d1cd5c0f431a..1d37d616b1fc 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -50,6 +50,7 @@
 #define MSR_KVM_STEAL_TIME  0x4b564d03
 #define MSR_KVM_PV_EOI_EN      0x4b564d04
 #define MSR_KVM_POLL_CONTROL	0x4b564d05
+#define MSR_KVM_ASYNC_PF_INT	0x4b564d06
 
 struct kvm_steal_time {
     __u64 steal;
@@ -81,6 +82,11 @@ struct kvm_clock_pairing {
 #define KVM_ASYNC_PF_ENABLED			(1 << 0)
 #define KVM_ASYNC_PF_SEND_ALWAYS		(1 << 1)
 #define KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT	(1 << 2)
+#define KVM_ASYNC_PF_DELIVERY_AS_INT		(1 << 3)
+
+/* MSR_KVM_ASYNC_PF_INT */
+#define KVM_ASYNC_PF_VEC_MASK			GENMASK(7, 0)
+
 
 /* Operations for KVM_HC_MMU_OP */
 #define KVM_MMU_OP_WRITE_PTE            1
@@ -112,8 +118,12 @@ struct kvm_mmu_op_release_pt {
 #define KVM_PV_REASON_PAGE_READY 2
 
 struct kvm_vcpu_pv_apf_data {
+    /* Used for 'page not present' events delivered via #PF */
     __u32 flags;
-    __u32 token; /* Used for page ready notification only */
+
+    /* Used for 'page ready' events delivered via interrupt notification */
+    __u32 token;
+
     __u8 pad[56];
     __u32 enabled;
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bfa7e07fd6fd..99fd347849b2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1235,7 +1235,7 @@ static const u32 emulated_msrs_all[] = {
     HV_X64_MSR_TSC_EMULATION_STATUS,
 
     MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
-    MSR_KVM_PV_EOI_EN,
+    MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT,
 
     MSR_IA32_TSC_ADJUST,
     MSR_IA32_TSCDEADLINE,
@@ -2637,17 +2637,24 @@ static int xen_hvm_config(struct kvm_vcpu *vcpu, u64 data)
     return r;
 }
 
+static inline bool kvm_pv_async_pf_enabled(struct kvm_vcpu *vcpu)
+{
+    u64 mask = KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT;
+
+    return (vcpu->arch.apf.msr_en_val & mask) == mask;
+}
+
 static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
 {
     gpa_t gpa = data & ~0x3f;
 
-    /* Bits 3:5 are reserved, Should be zero */
-    if (data & 0x38)
+    /* Bits 4:5 are reserved, Should be zero */
+    if (data & 0x30)
         return 1;
 
-    vcpu->arch.apf.msr_val = data;
+    vcpu->arch.apf.msr_en_val = data;
 
-    if (!(data & KVM_ASYNC_PF_ENABLED)) {
+    if (!kvm_pv_async_pf_enabled(vcpu)) {
         kvm_clear_async_pf_completion_queue(vcpu);
         kvm_async_pf_hash_reset(vcpu);
         return 0;
@@ -2659,7 +2666,25 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data)
     vcpu->arch.apf.send_user_only = !(data & KVM_ASYNC_PF_SEND_ALWAYS);
     vcpu->arch.apf.delivery_as_pf_vmexit = data & KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT;
 
+    kvm_async_pf_wakeup_all(vcpu);
+
+    return 0;
+}
+
+static int kvm_pv_enable_async_pf_int(struct kvm_vcpu *vcpu, u64 data)
+{
+    /* Bits 8-63 are reserved */
+    if (data >> 8)
+        return 1;
+
+    if (!lapic_in_kernel(vcpu))
+        return 1;
+
+    vcpu->arch.apf.msr_int_val = data;
+
+    vcpu->arch.apf.vec = data & KVM_ASYNC_PF_VEC_MASK;
+
     return 0;
 }
 
@@ -2875,6 +2900,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         if (kvm_pv_enable_async_pf(vcpu, data))
             return 1;
         break;
MSR_KVM_ASYNC_PF_INT: + if (kvm_pv_enable_async_pf_int(vcpu, data)) + return 1; + break; case MSR_KVM_STEAL_TIME: if (unlikely(!sched_info_on())) @@ -3149,7 +3178,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = vcpu->arch.time; break; case MSR_KVM_ASYNC_PF_EN: - msr_info->data = vcpu->arch.apf.msr_val; + msr_info->data = vcpu->arch.apf.msr_en_val; + break; + case MSR_KVM_ASYNC_PF_INT: + msr_info->data = vcpu->arch.apf.msr_int_val; break; case MSR_KVM_STEAL_TIME: msr_info->data = vcpu->arch.st.msr_val; @@ -9502,7 +9534,8 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vcpu->arch.cr2 = 0; kvm_make_request(KVM_REQ_EVENT, vcpu); - vcpu->arch.apf.msr_val = 0; + vcpu->arch.apf.msr_en_val = 0; + vcpu->arch.apf.msr_int_val = 0; vcpu->arch.st.msr_val = 0; kvmclock_reset(vcpu); @@ -10367,10 +10400,22 @@ static inline int apf_put_user_notpresent(struct kvm_vcpu *vcpu) static inline int apf_put_user_ready(struct kvm_vcpu *vcpu, u32 token) { - u64 val = (u64)token << 32 | KVM_PV_REASON_PAGE_READY; + unsigned int offset = offsetof(struct kvm_vcpu_pv_apf_data, token); - return kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.apf.data, &val, - sizeof(val)); + return kvm_write_guest_offset_cached(vcpu->kvm, &vcpu->arch.apf.data, + &token, offset, sizeof(token)); +} + +static inline bool apf_pageready_slot_free(struct kvm_vcpu *vcpu) +{ + unsigned int offset = offsetof(struct kvm_vcpu_pv_apf_data, token); + u32 val; + + if (kvm_read_guest_offset_cached(vcpu->kvm, &vcpu->arch.apf.data, + &val, offset, sizeof(val))) + return false; + + return !val; } static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu) @@ -10378,9 +10423,8 @@ static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu) if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu)) return false; - if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED) || - (vcpu->arch.apf.send_user_only && - kvm_x86_ops.get_cpl(vcpu) == 0)) + if (!kvm_pv_async_pf_enabled(vcpu) || + (vcpu->arch.apf.send_user_only && kvm_x86_ops.get_cpl(vcpu) == 0)) return false; return true; @@ -10436,7 +10480,10 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu, void kvm_arch_async_page_present(struct kvm_vcpu *vcpu, struct kvm_async_pf *work) { - struct x86_exception fault; + struct kvm_lapic_irq irq = { + .delivery_mode = APIC_DM_FIXED, + .vector = vcpu->arch.apf.vec + }; if (work->wakeup_all) work->arch.token = ~0; /* broadcast wakeup */ @@ -10444,26 +10491,20 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu, kvm_del_async_pf_gfn(vcpu, work->arch.gfn); trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa); - if (vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED && - !apf_put_user_ready(vcpu, work->arch.token)) { - fault.vector = PF_VECTOR; - fault.error_code_valid = true; - fault.error_code = 0; - fault.nested_page_fault = false; - fault.address = work->arch.token; - fault.async_page_fault = true; - kvm_inject_page_fault(vcpu, &fault); - } + if (kvm_pv_async_pf_enabled(vcpu) && + !apf_put_user_ready(vcpu, work->arch.token)) + kvm_apic_set_irq(vcpu, &irq, NULL); + vcpu->arch.apf.halted = false; vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; } bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu) { - if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED)) + if (!kvm_pv_async_pf_enabled(vcpu)) return true; else - return kvm_can_do_async_pf(vcpu); + return apf_pageready_slot_free(vcpu); } void kvm_arch_start_assignment(struct kvm *kvm) From patchwork Mon May 25 14:41:21 2020 
From patchwork Mon May 25 14:41:21 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11568927
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH v2 06/10] KVM: x86: acknowledgment mechanism for async pf page ready notifications
Date: Mon, 25 May 2020 16:41:21 +0200
Message-Id: <20200525144125.143875-7-vkuznets@redhat.com>
In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com>
References: <20200525144125.143875-1-vkuznets@redhat.com>

If two 'page ready' notifications happen back to back, the second one is
not delivered and the only recovery mechanism we currently have is the
kvm_check_async_pf_completion() check in the vcpu_run() loop. The check
will only be performed with the next vmexit, whenever it happens, and in
some cases that may take a while. With interrupt based 'page ready'
notification delivery the situation is even worse: unlike exceptions,
interrupts are not handled immediately, so we must check if the slot is
empty. This is slow and unnecessary.

Introduce a dedicated MSR_KVM_ASYNC_PF_ACK MSR to communicate the fact
that the slot is free and the host should check its notification queue.
Mandate using it for interrupt based 'page ready' APF event delivery.

As kvm_check_async_pf_completion() is going away from the vcpu_run()
loop, we need a way to communicate the fact that the vcpu->async_pf.done
queue has transitioned from empty to non-empty. Introduce
kvm_arch_async_page_present_queued() and KVM_REQ_APF_READY to do the job.

Signed-off-by: Vitaly Kuznetsov
---
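[Editor's note: a hedged sketch of the guest-side acknowledgment protocol
this patch mandates. The handler name and the per-cpu 'apf_data' variable
are illustrative assumptions, not code from this series.]

    /* Illustrative guest-side 'page ready' interrupt handler. */
    static void apf_page_ready_interrupt(void)
    {
        u32 token = READ_ONCE(apf_data.token);

        /* wake tasks waiting on this token */
        kvm_async_pf_task_wake(token);

        WRITE_ONCE(apf_data.token, 0);      /* mark the slot free... */
        wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1);    /* ...and ask the host to
                                               re-scan its queue */
    }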
 Documentation/virt/kvm/msr.rst       | 16 +++++++++++++++-
 arch/s390/include/asm/kvm_host.h     |  2 ++
 arch/x86/include/asm/kvm_host.h      |  3 +++
 arch/x86/include/uapi/asm/kvm_para.h |  1 +
 arch/x86/kvm/x86.c                   | 26 ++++++++++++++++++++++----
 virt/kvm/async_pf.c                  | 10 ++++++++++
 6 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst
index be08df12f31a..8ea3fbcc67fd 100644
--- a/Documentation/virt/kvm/msr.rst
+++ b/Documentation/virt/kvm/msr.rst
@@ -236,7 +236,10 @@ data:
 	of these bytes is a token which was previously delivered as a
 	'page not present' event. The event indicates the page is now available.
 	The guest is supposed to write '0' to 'token' when it is done handling
-	the 'page ready' event so the next one can be delivered.
+	the 'page ready' event so the next one can be delivered. It is also
+	supposed to write '1' to MSR_KVM_ASYNC_PF_ACK every time after clearing
+	the location; this forces KVM to re-scan its queue and deliver the next
+	pending notification.
 
 	Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page
 	ready' APF delivery needs to be written to before enabling the APF
@@ -359,3 +362,14 @@ data:
 	Interrupt vector for asynchronous 'page ready' notifications delivery.
 	The vector has to be set up before the asynchronous page fault
 	mechanism is enabled in MSR_KVM_ASYNC_PF_EN.
+
+MSR_KVM_ASYNC_PF_ACK:
+	0x4b564d07
+
+data:
+	Asynchronous page fault (APF) acknowledgment.
+
+	When the guest is done processing a 'page ready' APF event and the
+	'token' field in 'struct kvm_vcpu_pv_apf_data' is cleared, it is
+	supposed to write '1' to bit 0 of the MSR; this causes the host to
+	re-scan its queue and check if there are more notifications pending.
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 5ba9968c3436..bb1ede017b7e 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -982,6 +982,8 @@ void kvm_arch_async_page_not_present(struct kvm_vcpu *vcpu,
 void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                  struct kvm_async_pf *work);
 
+static inline void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu) {}
+
 void kvm_arch_crypto_clear_masks(struct kvm *kvm);
 void kvm_arch_crypto_set_masks(struct kvm *kvm, unsigned long *apm,
                    unsigned long *aqm, unsigned long *adm);
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c2a70e25a1f3..356c02bfa587 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -83,6 +83,7 @@
 #define KVM_REQ_GET_VMCS12_PAGES	KVM_ARCH_REQ(24)
 #define KVM_REQ_APICV_UPDATE \
 	KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
+#define KVM_REQ_APF_READY		KVM_ARCH_REQ(26)
 
 #define CR0_RESERVED_BITS                                               \
 	(~(unsigned long)(X86_CR0_PE | X86_CR0_MP | X86_CR0_EM | X86_CR0_TS \
@@ -772,6 +773,7 @@ struct kvm_vcpu_arch {
         u32 host_apf_flags;
         unsigned long nested_apf_token;
         bool delivery_as_pf_vmexit;
+        bool pageready_pending;
     } apf;
 
     /* OSVW MSRs (AMD only) */
@@ -1643,6 +1645,7 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
                  struct kvm_async_pf *work);
 void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
                    struct kvm_async_pf *work);
+void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu);
 bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu);
 
 extern bool kvm_find_async_pf_gfn(struct kvm_vcpu *vcpu, gfn_t gfn);
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 1d37d616b1fc..7ac20df80ba8 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -51,6 +51,7 @@
 #define MSR_KVM_PV_EOI_EN      0x4b564d04
 #define MSR_KVM_POLL_CONTROL	0x4b564d05
 #define MSR_KVM_ASYNC_PF_INT	0x4b564d06
+#define MSR_KVM_ASYNC_PF_ACK	0x4b564d07
 
 struct kvm_steal_time {
     __u64 steal;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 99fd347849b2..ffe1497b7beb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1235,7 +1235,7 @@ static const u32 emulated_msrs_all[] = {
     HV_X64_MSR_TSC_EMULATION_STATUS,
 
     MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
-    MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT,
+    MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT, MSR_KVM_ASYNC_PF_ACK,
 
     MSR_IA32_TSC_ADJUST,
     MSR_IA32_TSCDEADLINE,
@@ -2904,6 +2904,12 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
         if (kvm_pv_enable_async_pf_int(vcpu, data))
             return 1;
         break;
+    case MSR_KVM_ASYNC_PF_ACK:
+        if (data & 0x1) {
+            vcpu->arch.apf.pageready_pending = false;
+            kvm_check_async_pf_completion(vcpu);
+        }
+        break;
     case MSR_KVM_STEAL_TIME:
 
         if (unlikely(!sched_info_on()))
@@ -3183,6 +3189,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
     case MSR_KVM_ASYNC_PF_INT:
         msr_info->data = vcpu->arch.apf.msr_int_val;
         break;
+    case MSR_KVM_ASYNC_PF_ACK:
+        msr_info->data = 0;
+        break;
     case MSR_KVM_STEAL_TIME:
         msr_info->data = vcpu->arch.st.msr_val;
         break;
@@ -8340,6 +8349,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
             kvm_hv_process_stimers(vcpu);
         if (kvm_check_request(KVM_REQ_APICV_UPDATE, vcpu))
             kvm_vcpu_update_apicv(vcpu);
+        if (kvm_check_request(KVM_REQ_APF_READY, vcpu))
+            kvm_check_async_pf_completion(vcpu);
     }
 
     if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
@@ -8613,8
+8624,6 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
             break;
         }
 
-        kvm_check_async_pf_completion(vcpu);
-
         if (signal_pending(current)) {
             r = -EINTR;
             vcpu->run->exit_reason = KVM_EXIT_INTR;
@@ -10492,13 +10501,22 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
     trace_kvm_async_pf_ready(work->arch.token, work->cr2_or_gpa);
 
     if (kvm_pv_async_pf_enabled(vcpu) &&
-        !apf_put_user_ready(vcpu, work->arch.token))
+        !apf_put_user_ready(vcpu, work->arch.token)) {
+        vcpu->arch.apf.pageready_pending = true;
         kvm_apic_set_irq(vcpu, &irq, NULL);
+    }
 
     vcpu->arch.apf.halted = false;
     vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 }
 
+void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu)
+{
+    kvm_make_request(KVM_REQ_APF_READY, vcpu);
+    if (!vcpu->arch.apf.pageready_pending)
+        kvm_vcpu_kick(vcpu);
+}
+
 bool kvm_arch_can_dequeue_async_page_present(struct kvm_vcpu *vcpu)
 {
     if (!kvm_pv_async_pf_enabled(vcpu))
diff --git a/virt/kvm/async_pf.c b/virt/kvm/async_pf.c
index 8a8770c7c889..5660bbc831bf 100644
--- a/virt/kvm/async_pf.c
+++ b/virt/kvm/async_pf.c
@@ -51,6 +51,7 @@ static void async_pf_execute(struct work_struct *work)
     unsigned long addr = apf->addr;
     gpa_t cr2_or_gpa = apf->cr2_or_gpa;
     int locked = 1;
+    bool first;
 
     might_sleep();
 
@@ -69,10 +70,14 @@ static void async_pf_execute(struct work_struct *work)
         kvm_arch_async_page_present(vcpu, apf);
 
     spin_lock(&vcpu->async_pf.lock);
+    first = list_empty(&vcpu->async_pf.done);
     list_add_tail(&apf->link, &vcpu->async_pf.done);
     apf->vcpu = NULL;
     spin_unlock(&vcpu->async_pf.lock);
 
+    if (!IS_ENABLED(CONFIG_KVM_ASYNC_PF_SYNC) && first)
+        kvm_arch_async_page_present_queued(vcpu);
+
     /*
      * apf may be freed by kvm_check_async_pf_completion() after
      * this point
@@ -202,6 +207,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu)
 {
     struct kvm_async_pf *work;
+    bool first;
 
     if (!list_empty_careful(&vcpu->async_pf.done))
         return 0;
@@ -214,9 +220,13 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu)
     INIT_LIST_HEAD(&work->queue); /* for list_del to work */
 
     spin_lock(&vcpu->async_pf.lock);
+    first = list_empty(&vcpu->async_pf.done);
     list_add_tail(&work->link, &vcpu->async_pf.done);
     spin_unlock(&vcpu->async_pf.lock);
 
+    if (!IS_ENABLED(CONFIG_KVM_ASYNC_PF_SYNC) && first)
+        kvm_arch_async_page_present_queued(vcpu);
+
     vcpu->async_pf.queued++;
     return 0;
 }

From patchwork Mon May 25 14:41:22 2020
X-Patchwork-Submitter: Vitaly Kuznetsov
X-Patchwork-Id: 11568935
From: Vitaly Kuznetsov
To: kvm@vger.kernel.org, x86@kernel.org
Cc: Paolo Bonzini, Andy Lutomirski, Thomas Gleixner, Borislav Petkov, "H. Peter Anvin", Wanpeng Li, Sean Christopherson, Jim Mattson, Vivek Goyal, Gavin Shan, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: [PATCH v2 07/10] KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT
Date: Mon, 25 May 2020 16:41:22 +0200
Message-Id: <20200525144125.143875-8-vkuznets@redhat.com>
In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com>
References: <20200525144125.143875-1-vkuznets@redhat.com>

Introduce a new capability to indicate that KVM supports interrupt based
delivery of 'page ready' APF events. This includes support for both
MSR_KVM_ASYNC_PF_INT and MSR_KVM_ASYNC_PF_ACK.

Signed-off-by: Vitaly Kuznetsov
---
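[Editor's note: a hedged sketch of how a guest would gate use of the new
MSRs on this feature bit. kvm_para_available() and kvm_para_has_feature()
are the kernel's existing paravirt helpers; the wrapper itself is an
illustrative assumption.]

    #include <linux/kvm_para.h>

    /* Only touch MSR_KVM_ASYNC_PF_INT/ACK when the hypervisor announces
     * KVM_FEATURE_ASYNC_PF_INT in the KVM CPUID feature leaf. */
    static bool apf_int_supported(void)
    {
        return kvm_para_available() &&
               kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT);
    }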
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gcWPQcy0dLcM/idycFzBWKrR20stKK5D971BAOD+nRg=; b=Y09MEEnVC7am+z62gP2Ih0JysJ3BzgIjtgOM1anF75uvDvbDDdKtzPimv9iM7Oj+MXNCPz PX80QUL++wQxmXbKFhouLN58wercMYy9gHeCwJUotG5BmO9RJPZHonRnPT6KHQwGB21m+g qInqzZ3rpKPQRCVcCDbqDnzCTbPvmAg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-45-QbeNpbpcO82crC9H4IItqA-1; Mon, 25 May 2020 10:42:00 -0400 X-MC-Unique: QbeNpbpcO82crC9H4IItqA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D186118FE86C; Mon, 25 May 2020 14:41:58 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.194.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id B86B95D9C5; Mon, 25 May 2020 14:41:55 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, x86@kernel.org Cc: Paolo Bonzini , Andy Lutomirski , Thomas Gleixner , Borislav Petkov , "H. Peter Anvin" , Wanpeng Li , Sean Christopherson , Jim Mattson , Vivek Goyal , Gavin Shan , Peter Zijlstra , linux-kernel@vger.kernel.org Subject: [PATCH v2 07/10] KVM: x86: announce KVM_FEATURE_ASYNC_PF_INT Date: Mon, 25 May 2020 16:41:22 +0200 Message-Id: <20200525144125.143875-8-vkuznets@redhat.com> In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com> References: <20200525144125.143875-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Introduce new capability to indicate that KVM supports interrupt based delivery of 'page ready' APF events. This includes support for both MSR_KVM_ASYNC_PF_INT and MSR_KVM_ASYNC_PF_ACK. Signed-off-by: Vitaly Kuznetsov --- Documentation/virt/kvm/cpuid.rst | 6 ++++++ Documentation/virt/kvm/msr.rst | 12 ++++++++---- arch/x86/include/uapi/asm/kvm_para.h | 1 + arch/x86/kvm/cpuid.c | 3 ++- arch/x86/kvm/x86.c | 1 + include/uapi/linux/kvm.h | 1 + 6 files changed, 19 insertions(+), 5 deletions(-) diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst index 01b081f6e7ea..98cbd257ac33 100644 --- a/Documentation/virt/kvm/cpuid.rst +++ b/Documentation/virt/kvm/cpuid.rst @@ -86,6 +86,12 @@ KVM_FEATURE_PV_SCHED_YIELD 13 guest checks this feature bit before using paravirtualized sched yield. +KVM_FEATURE_ASYNC_PF_INT 14 guest checks this feature bit + before using the second async + pf control msr 0x4b564d06 and + async pf acknowledgment msr + 0x4b564d07. + KVM_FEATURE_CLOCSOURCE_STABLE_BIT 24 host will warn if no guest-side per-cpu warps are expeced in kvmclock diff --git a/Documentation/virt/kvm/msr.rst b/Documentation/virt/kvm/msr.rst index 8ea3fbcc67fd..d409cc26193f 100644 --- a/Documentation/virt/kvm/msr.rst +++ b/Documentation/virt/kvm/msr.rst @@ -213,7 +213,8 @@ data: cpl == 0. Bit 2 is 1 if asynchronous page faults are delivered to L1 as #PF vmexits. Bit 2 can be set only if KVM_FEATURE_ASYNC_PF_VMEXIT is present in CPUID. Bit 3 enables interrupt based delivery of 'page ready' - events. + events. Bit 3 can only be set if KVM_FEATURE_ASYNC_PF_INT is present in + CPUID. 'Page not present' events are currently always delivered as synthetic #PF exception. 
During delivery of these events APF CR2 register contains @@ -243,7 +244,8 @@ data: Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page ready' APF delivery needs to be written to before enabling APF mechanism - in MSR_KVM_ASYNC_PF_EN or interrupt #0 can get injected. + in MSR_KVM_ASYNC_PF_EN or interrupt #0 can get injected. The MSR is + available if KVM_FEATURE_ASYNC_PF_INT is present in CPUID. Note, previously, 'page ready' events were delivered via the same #PF exception as 'page not present' events but this is now deprecated. If @@ -361,7 +363,8 @@ data: Interrupt vector for asynchnonous 'page ready' notifications delivery. The vector has to be set up before asynchronous page fault mechanism - is enabled in MSR_KVM_ASYNC_PF_EN. + is enabled in MSR_KVM_ASYNC_PF_EN. The MSR is only available if + KVM_FEATURE_ASYNC_PF_INT is present in CPUID. MSR_KVM_ASYNC_PF_ACK: 0x4b564d07 @@ -372,4 +375,5 @@ data: When the guest is done processing 'page ready' APF event and 'token' field in 'struct kvm_vcpu_pv_apf_data' is cleared it is supposed to write '1' to bit 0 of the MSR, this caused the host to re-scan its queue - and check if there are more notifications pending. + and check if there are more notifications pending. The MSR is available + if KVM_FEATURE_ASYNC_PF_INT is present in CPUID. diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h index 7ac20df80ba8..812e9b4c1114 100644 --- a/arch/x86/include/uapi/asm/kvm_para.h +++ b/arch/x86/include/uapi/asm/kvm_para.h @@ -31,6 +31,7 @@ #define KVM_FEATURE_PV_SEND_IPI 11 #define KVM_FEATURE_POLL_CONTROL 12 #define KVM_FEATURE_PV_SCHED_YIELD 13 +#define KVM_FEATURE_ASYNC_PF_INT 14 #define KVM_HINTS_REALTIME 0 diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 901cd1fdecd9..790fe4988001 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -712,7 +712,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function) (1 << KVM_FEATURE_ASYNC_PF_VMEXIT) | (1 << KVM_FEATURE_PV_SEND_IPI) | (1 << KVM_FEATURE_POLL_CONTROL) | - (1 << KVM_FEATURE_PV_SCHED_YIELD); + (1 << KVM_FEATURE_PV_SCHED_YIELD) | + (1 << KVM_FEATURE_ASYNC_PF_INT); if (sched_info_on()) entry->eax |= (1 << KVM_FEATURE_STEAL_TIME); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ffe1497b7beb..cc1bf6cfc5e0 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3404,6 +3404,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_X86_ROBUST_SINGLESTEP: case KVM_CAP_XSAVE: case KVM_CAP_ASYNC_PF: + case KVM_CAP_ASYNC_PF_INT: case KVM_CAP_GET_TSC_KHZ: case KVM_CAP_KVMCLOCK_CTRL: case KVM_CAP_READONLY_MEM: diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 428c7dde6b4b..15012f78a691 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1017,6 +1017,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_S390_VCPU_RESETS 179 #define KVM_CAP_S390_PROTECTED 180 #define KVM_CAP_PPC_SECURE_GUEST 181 +#define KVM_CAP_ASYNC_PF_INT 182 #ifdef KVM_CAP_IRQ_ROUTING From patchwork Mon May 25 14:41:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 11568933 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F71D912 for ; Mon, 25 May 2020 14:42:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by 
mail.kernel.org (Postfix) with ESMTP id 82D2620888 for ; Mon, 25 May 2020 14:42:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="HKegPIbc" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404058AbgEYOmJ (ORCPT ); Mon, 25 May 2020 10:42:09 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:25852 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2403960AbgEYOmI (ORCPT ); Mon, 25 May 2020 10:42:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1590417726; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Pwnif6cyZVcBZZP8bYm94v9LRSFVFmkl9LmDv62yx7k=; b=HKegPIbcDg8iD45kO4iO/IvBUWYYIXe97ik15zjmZapgZgLCuEjoxF4vQP1skyzTQuObmK Xac4OyBzV464++rrBgFZusqSFBy4Bo7IvH3jXgpkFuSKhobpu1wMbVnFgWOH06IhiDMMBt Rp6yunuV1baG/tkuEP/0UfkJBQCr0jo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-451-UxW-vGIwNSeq8vzTzBXULA-1; Mon, 25 May 2020 10:42:04 -0400 X-MC-Unique: UxW-vGIwNSeq8vzTzBXULA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B1BF8460; Mon, 25 May 2020 14:42:02 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.194.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 305CF5D9C5; Mon, 25 May 2020 14:41:59 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, x86@kernel.org Cc: Paolo Bonzini , Andy Lutomirski , Thomas Gleixner , Borislav Petkov , "H. Peter Anvin" , Wanpeng Li , Sean Christopherson , Jim Mattson , Vivek Goyal , Gavin Shan , Peter Zijlstra , linux-kernel@vger.kernel.org Subject: [PATCH v2 08/10] KVM: x86: Switch KVM guest to using interrupts for page ready APF delivery Date: Mon, 25 May 2020 16:41:23 +0200 Message-Id: <20200525144125.143875-9-vkuznets@redhat.com> In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com> References: <20200525144125.143875-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM now supports using an interrupt for 'page ready' APF event delivery, and the legacy mechanism has been deprecated. Switch KVM guests to the new one.
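In practice, per-CPU enablement in the guest boils down to two MSR writes in a fixed order. Below is a minimal sketch of what the kvm_guest_cpu_init() change in this patch does; the function name kvm_apf_enable_this_cpu() is illustrative only, and error handling plus the optional KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT bit are omitted:

    /*
     * Sketch: enable interrupt based 'page ready' delivery on the
     * current CPU. Per the MSR documentation, MSR_KVM_ASYNC_PF_INT
     * has to be written before APF is enabled in MSR_KVM_ASYNC_PF_EN,
     * otherwise interrupt #0 could get injected.
     */
    static void kvm_apf_enable_this_cpu(void)
    {
            u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason));

            if (!kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT))
                    return;

            /* Set the 'page ready' notification vector first... */
            wrmsrl(MSR_KVM_ASYNC_PF_INT, KVM_ASYNC_PF_VECTOR);

            /* ...then enable APF with interrupt based delivery. */
            wrmsrl(MSR_KVM_ASYNC_PF_EN,
                   pa | KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT);
    }

The ordering is the only subtle point here; everything else is the same per-CPU shared-area setup the legacy mechanism already required.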
Signed-off-by: Vitaly Kuznetsov --- arch/x86/entry/entry_32.S | 5 ++++ arch/x86/entry/entry_64.S | 5 ++++ arch/x86/include/asm/hardirq.h | 3 ++ arch/x86/include/asm/irq_vectors.h | 6 +++- arch/x86/include/asm/kvm_para.h | 6 ++++ arch/x86/kernel/irq.c | 9 ++++++ arch/x86/kernel/kvm.c | 45 ++++++++++++++++++++++-------- 7 files changed, 66 insertions(+), 13 deletions(-) diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S index b67bae7091d7..d574dadcb2a1 100644 --- a/arch/x86/entry/entry_32.S +++ b/arch/x86/entry/entry_32.S @@ -1475,6 +1475,11 @@ BUILD_INTERRUPT3(hv_stimer0_callback_vector, HYPERV_STIMER0_VECTOR, #endif /* CONFIG_HYPERV */ +#ifdef CONFIG_KVM_GUEST +BUILD_INTERRUPT3(kvm_async_pf_vector, KVM_ASYNC_PF_VECTOR, + kvm_async_pf_intr) +#endif + SYM_CODE_START(page_fault) ASM_CLAC pushl $do_page_fault diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index 3063aa9090f9..138f5c5aca2e 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -1190,6 +1190,11 @@ apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \ acrn_hv_callback_vector acrn_hv_vector_handler #endif +#ifdef CONFIG_KVM_GUEST +apicinterrupt3 KVM_ASYNC_PF_VECTOR \ + kvm_async_pf_vector kvm_async_pf_intr +#endif + idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=IST_INDEX_DB ist_offset=DB_STACK_OFFSET idtentry int3 do_int3 has_error_code=0 create_gap=1 idtentry stack_segment do_stack_segment has_error_code=1 diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h index 07533795b8d2..be0fbb15ad7f 100644 --- a/arch/x86/include/asm/hardirq.h +++ b/arch/x86/include/asm/hardirq.h @@ -44,6 +44,9 @@ typedef struct { unsigned int irq_hv_reenlightenment_count; unsigned int hyperv_stimer0_count; #endif +#ifdef CONFIG_KVM_GUEST + unsigned int kvm_async_pf_pageready_count; +#endif } ____cacheline_aligned irq_cpustat_t; DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat); diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h index 889f8b1b5b7f..8879a9ecd908 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -104,7 +104,11 @@ #define HYPERV_STIMER0_VECTOR 0xed #endif -#define LOCAL_TIMER_VECTOR 0xec +#ifdef CONFIG_KVM_GUEST +#define KVM_ASYNC_PF_VECTOR 0xec +#endif + +#define LOCAL_TIMER_VECTOR 0xeb #define NR_VECTORS 256 diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h index 2a3102fee189..a075cd8fa5c7 100644 --- a/arch/x86/include/asm/kvm_para.h +++ b/arch/x86/include/asm/kvm_para.h @@ -4,6 +4,7 @@ #include #include +#include #include extern void kvmclock_init(void); @@ -93,6 +94,11 @@ void kvm_async_pf_task_wake(u32 token); u32 kvm_read_and_reset_apf_flags(void); extern void kvm_disable_steal_time(void); void do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address); +extern __visible void kvm_async_pf_vector(void); +#ifdef CONFIG_TRACING +#define trace_kvm_async_pf_vector kvm_async_pf_vector +#endif +__visible void __irq_entry kvm_async_pf_intr(struct pt_regs *regs); #ifdef CONFIG_PARAVIRT_SPINLOCKS void __init kvm_spinlock_init(void); diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c index c7965ff429c5..a4c2f25ad74d 100644 --- a/arch/x86/kernel/irq.c +++ b/arch/x86/kernel/irq.c @@ -159,6 +159,15 @@ int arch_show_interrupts(struct seq_file *p, int prec) irq_stats(j)->hyperv_stimer0_count); seq_puts(p, " Hyper-V stimer0 interrupts\n"); } +#endif +#ifdef CONFIG_KVM_GUEST + if (test_bit(KVM_ASYNC_PF_VECTOR, 
system_vectors)) { + seq_printf(p, "%*s: ", prec, "APF"); + for_each_online_cpu(j) + seq_printf(p, "%10u ", + irq_stats(j)->kvm_async_pf_pageready_count); + seq_puts(p, " KVM async PF page ready interrupts\n"); + } #endif seq_printf(p, "%*s: %10u\n", prec, "ERR", atomic_read(&irq_err_count)); #if defined(CONFIG_X86_IO_APIC) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 340df5dab30d..79730eaef1e1 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -245,23 +245,39 @@ NOKPROBE_SYMBOL(kvm_read_and_reset_apf_flags); dotraplinkage void do_async_page_fault(struct pt_regs *regs, unsigned long error_code, unsigned long address) { - switch (kvm_read_and_reset_apf_flags()) { - default: + u32 flags = kvm_read_and_reset_apf_flags(); + + if (!flags) { + /* This is a normal page fault */ do_page_fault(regs, error_code, address); - break; - case KVM_PV_REASON_PAGE_NOT_PRESENT: + return; + } else if (flags & KVM_PV_REASON_PAGE_NOT_PRESENT) { /* page is swapped out by the host. */ kvm_async_pf_task_wait((u32)address, !user_mode(regs)); - break; - case KVM_PV_REASON_PAGE_READY: - rcu_irq_enter(); - kvm_async_pf_task_wake((u32)address); - rcu_irq_exit(); - break; + } else { + WARN_ONCE(1, "Unexpected async PF flags: %x\n", flags); } } NOKPROBE_SYMBOL(do_async_page_fault); +__visible void __irq_entry kvm_async_pf_intr(struct pt_regs *regs) +{ + u32 token; + + entering_ack_irq(); + + inc_irq_stat(kvm_async_pf_pageready_count); + + if (__this_cpu_read(apf_reason.enabled)) { + token = __this_cpu_read(apf_reason.token); + kvm_async_pf_task_wake(token); + __this_cpu_write(apf_reason.token, 0); + wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1); + } + + exiting_irq(); +} + static void __init paravirt_ops_setup(void) { pv_info.name = "KVM"; @@ -305,17 +321,19 @@ static notrace void kvm_guest_apic_eoi_write(u32 reg, u32 val) static void kvm_guest_cpu_init(void) { - if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF) && kvmapf) { + if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT) && kvmapf) { u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason)); #ifdef CONFIG_PREEMPTION pa |= KVM_ASYNC_PF_SEND_ALWAYS; #endif - pa |= KVM_ASYNC_PF_ENABLED; + pa |= KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT; if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_VMEXIT)) pa |= KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT; + wrmsrl(MSR_KVM_ASYNC_PF_INT, KVM_ASYNC_PF_VECTOR); + wrmsrl(MSR_KVM_ASYNC_PF_EN, pa); __this_cpu_write(apf_reason.enabled, 1); printk(KERN_INFO"KVM setup async PF for cpu %d\n", @@ -649,6 +667,9 @@ static void __init kvm_guest_init(void) if (kvm_para_has_feature(KVM_FEATURE_PV_EOI)) apic_set_eoi_write(kvm_guest_apic_eoi_write); + if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT)) + alloc_intr_gate(KVM_ASYNC_PF_VECTOR, kvm_async_pf_vector); + #ifdef CONFIG_SMP smp_ops.smp_prepare_cpus = kvm_smp_prepare_cpus; smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu; From patchwork Mon May 25 14:41:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 11568931 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 367D613B4 for ; Mon, 25 May 2020 14:42:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1DC592087D for ; Mon, 25 May 2020 14:42:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com 
header.i=@redhat.com header.b="GbIjdqLs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404081AbgEYOm3 (ORCPT ); Mon, 25 May 2020 10:42:29 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:57878 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2404061AbgEYOmN (ORCPT ); Mon, 25 May 2020 10:42:13 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1590417732; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=E5DiB+MWHZo7YMaEZxoZmOa2F6h0GpMlsmbKYOyDmac=; b=GbIjdqLsa0qZLJgCw9vDS3uIywgak6B1oJQO16NWEH5CJbMsHTgnnm9u0H4D+tPNQ4aOBB pPMkRo40dbimsi0yeyAzYVR5PyVg7LLbEoRfZSL4sZejB8WIGpqRtieyFjlQBg6I6IfQEj X2D+ukq1+K/w/Ggt7Mpxpqy59zUgag4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-238-emavy1G_PZuaGhGn5c5tGA-1; Mon, 25 May 2020 10:42:08 -0400 X-MC-Unique: emavy1G_PZuaGhGn5c5tGA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B08091005510; Mon, 25 May 2020 14:42:06 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.194.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2F0865D9C5; Mon, 25 May 2020 14:42:02 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, x86@kernel.org Cc: Paolo Bonzini , Andy Lutomirski , Thomas Gleixner , Borislav Petkov , "H. Peter Anvin" , Wanpeng Li , Sean Christopherson , Jim Mattson , Vivek Goyal , Gavin Shan , Peter Zijlstra , linux-kernel@vger.kernel.org Subject: [PATCH v2 09/10] KVM: x86: drop KVM_PV_REASON_PAGE_READY case from kvm_handle_page_fault() Date: Mon, 25 May 2020 16:41:24 +0200 Message-Id: <20200525144125.143875-10-vkuznets@redhat.com> In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com> References: <20200525144125.143875-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM guest code in Linux enables APF only when KVM_FEATURE_ASYNC_PF_INT is supported; this means we will never see KVM_PV_REASON_PAGE_READY when handling a page fault vmexit in KVM. While at it, make sure we only follow the genuine page fault path when the APF reason is zero. If we happen to see something else, the underlying hypervisor is misbehaving. Add a WARN_ONCE() to catch that. Signed-off-by: Vitaly Kuznetsov --- arch/x86/kvm/mmu/mmu.c | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 7fa5253237b2..1bb7a8c6c28e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4178,6 +4178,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code, u64 fault_address, char *insn, int insn_len) { int r = 1; + u32 flags = vcpu->arch.apf.host_apf_flags; #ifndef CONFIG_X86_64 /* A 64-bit CR2 should be impossible on 32-bit KVM.
*/ @@ -4186,28 +4187,22 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code, #endif vcpu->arch.l1tf_flush_l1d = true; - switch (vcpu->arch.apf.host_apf_flags) { - default: + if (!flags) { trace_kvm_page_fault(fault_address, error_code); if (kvm_event_needs_reinjection(vcpu)) kvm_mmu_unprotect_page_virt(vcpu, fault_address); r = kvm_mmu_page_fault(vcpu, fault_address, error_code, insn, insn_len); - break; - case KVM_PV_REASON_PAGE_NOT_PRESENT: + } else if (flags & KVM_PV_REASON_PAGE_NOT_PRESENT) { vcpu->arch.apf.host_apf_flags = 0; local_irq_disable(); kvm_async_pf_task_wait(fault_address, 0); local_irq_enable(); - break; - case KVM_PV_REASON_PAGE_READY: - vcpu->arch.apf.host_apf_flags = 0; - local_irq_disable(); - kvm_async_pf_task_wake(fault_address); - local_irq_enable(); - break; + } else { + WARN_ONCE(1, "Unexpected host async PF flags: %x\n", flags); } + return r; } EXPORT_SYMBOL_GPL(kvm_handle_page_fault); From patchwork Mon May 25 14:41:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 11568929 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D7A1C13B4 for ; Mon, 25 May 2020 14:42:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C059C20899 for ; Mon, 25 May 2020 14:42:25 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="MzZdV5Tj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404101AbgEYOmW (ORCPT ); Mon, 25 May 2020 10:42:22 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:23511 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2404079AbgEYOmP (ORCPT ); Mon, 25 May 2020 10:42:15 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1590417733; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=f3A0gdFX+KFXgCRhKQMtbqW7JDhQLi/iXO4FB0FTFtY=; b=MzZdV5Tj3vgO5bWhhbyS/zQbuTmyEKHhoVixbp/TvuP417DQaHZNxY9jqz0tIuMDYFQMES zZpABnnsX1tr644BZEQRR6z5qBO2fDphLIxCCkbDMju5sD8ZsInMyOOcnfZZ6hzga/PeY3 t/jbPTsulNX9T2BHdn1VJ/EDYkPTdK8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-493-2YcZ597PNl2IAPJmLJ1YNg-1; Mon, 25 May 2020 10:42:12 -0400 X-MC-Unique: 2YcZ597PNl2IAPJmLJ1YNg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5DFD2107ACCD; Mon, 25 May 2020 14:42:10 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.194.114]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1443F5D9C5; Mon, 25 May 2020 14:42:06 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, x86@kernel.org Cc: Paolo Bonzini , Andy Lutomirski , Thomas Gleixner , Borislav Petkov , "H. 
Peter Anvin" , Wanpeng Li , Sean Christopherson , Jim Mattson , Vivek Goyal , Gavin Shan , Peter Zijlstra , linux-kernel@vger.kernel.org Subject: [PATCH v2 10/10] KVM: x86: deprecate KVM_ASYNC_PF_SEND_ALWAYS Date: Mon, 25 May 2020 16:41:25 +0200 Message-Id: <20200525144125.143875-11-vkuznets@redhat.com> In-Reply-To: <20200525144125.143875-1-vkuznets@redhat.com> References: <20200525144125.143875-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Concerns were expressed around APF events delivery when CPU is not in user mode (KVM_ASYNC_PF_SEND_ALWAYS), e.g. https://lore.kernel.org/kvm/ed71d0967113a35f670a9625a058b8e6e0b2f104.1583547991.git.luto@kernel.org/ 'Page ready' events are already free from '#PF abuse' but 'page not ready' notifications still go through #PF (to be changed in the future). Make the probability of running into issues when APF collides with regular #PF lower by deprecating KVM_ASYNC_PF_SEND_ALWAYS. The feature doesn't seem to be important enough for any particular workload to notice the difference. Suggested-by: Andy Lutomirski Signed-off-by: Vitaly Kuznetsov --- arch/x86/include/asm/kvm_host.h | 1 - arch/x86/include/uapi/asm/kvm_para.h | 2 +- arch/x86/kernel/kvm.c | 3 --- arch/x86/kvm/x86.c | 13 +++++++++---- 4 files changed, 10 insertions(+), 9 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 356c02bfa587..f491214b7667 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -769,7 +769,6 @@ struct kvm_vcpu_arch { u64 msr_int_val; /* MSR_KVM_ASYNC_PF_INT */ u16 vec; u32 id; - bool send_user_only; u32 host_apf_flags; unsigned long nested_apf_token; bool delivery_as_pf_vmexit; diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h index 812e9b4c1114..3cae0faac2b8 100644 --- a/arch/x86/include/uapi/asm/kvm_para.h +++ b/arch/x86/include/uapi/asm/kvm_para.h @@ -82,7 +82,7 @@ struct kvm_clock_pairing { #define KVM_MAX_MMU_OP_BATCH 32 #define KVM_ASYNC_PF_ENABLED (1 << 0) -#define KVM_ASYNC_PF_SEND_ALWAYS (1 << 1) +#define KVM_ASYNC_PF_SEND_ALWAYS (1 << 1) /* deprecated */ #define KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT (1 << 2) #define KVM_ASYNC_PF_DELIVERY_AS_INT (1 << 3) diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 79730eaef1e1..add123302122 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -324,9 +324,6 @@ static void kvm_guest_cpu_init(void) if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_INT) && kvmapf) { u64 pa = slow_virt_to_phys(this_cpu_ptr(&apf_reason)); -#ifdef CONFIG_PREEMPTION - pa |= KVM_ASYNC_PF_SEND_ALWAYS; -#endif pa |= KVM_ASYNC_PF_ENABLED | KVM_ASYNC_PF_DELIVERY_AS_INT; if (kvm_para_has_feature(KVM_FEATURE_ASYNC_PF_VMEXIT)) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cc1bf6cfc5e0..8222133bf5be 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2648,7 +2648,10 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data) { gpa_t gpa = data & ~0x3f; - /* Bits 4:5 are reserved, Should be zero */ + /* + * Bits 4:5 are reserved and should be zero. Bit 1 + * (KVM_ASYNC_PF_SEND_ALWAYS) was deprecated and is being ignored. 
+ */ if (data & 0x30) return 1; @@ -2664,7 +2667,6 @@ static int kvm_pv_enable_async_pf(struct kvm_vcpu *vcpu, u64 data) sizeof(u64))) return 1; - vcpu->arch.apf.send_user_only = !(data & KVM_ASYNC_PF_SEND_ALWAYS); vcpu->arch.apf.delivery_as_pf_vmexit = data & KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT; kvm_async_pf_wakeup_all(vcpu); @@ -10433,8 +10435,11 @@ static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu) if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu)) return false; - if (!kvm_pv_async_pf_enabled(vcpu) || - (vcpu->arch.apf.send_user_only && kvm_x86_ops.get_cpl(vcpu) == 0)) + /* + * 'Page not present' APF events are only delivered when CPU is in + * user mode and APF mechanism is enabled. + */ + if (!kvm_pv_async_pf_enabled(vcpu) || kvm_x86_ops.get_cpl(vcpu) == 0) return false; return true;
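To recap where the series lands, the guest side of interrupt based 'page ready' delivery is a regular interrupt handler plus an ACK MSR write. The sketch below condenses kvm_async_pf_intr() from patch 08 and annotates it with the host-side reactions implemented earlier in the series; it is a summary of the code above, not an alternative implementation:

    __visible void __irq_entry kvm_async_pf_intr(struct pt_regs *regs)
    {
            u32 token;

            entering_ack_irq();
            inc_irq_stat(kvm_async_pf_pageready_count);

            if (__this_cpu_read(apf_reason.enabled)) {
                    /* The host stored the token before raising the interrupt. */
                    token = __this_cpu_read(apf_reason.token);
                    kvm_async_pf_task_wake(token);
                    __this_cpu_write(apf_reason.token, 0);
                    /*
                     * Writing '1' to bit 0 of MSR_KVM_ASYNC_PF_ACK clears
                     * pageready_pending on the host and makes it re-scan its
                     * 'done' queue (kvm_check_async_pf_completion()), so the
                     * next 'page ready' notification, if any, can be sent.
                     */
                    wrmsrl(MSR_KVM_ASYNC_PF_ACK, 1);
            }

            exiting_irq();
    }

The idea behind the explicit ACK is to keep 'page ready' events serialized per vCPU: the host does not need to overwrite a token the guest has not consumed yet, and the MSR write gives it a natural point to deliver the next queued event.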