From patchwork Mon Feb 7 15:28:18 2022
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH 01/30] KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case
Date: Mon, 7 Feb 2022 17:28:18 +0200
Message-Id: <20220207152847.836777-2-mlevitsk@redhat.com>
In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com>
References: <20220207152847.836777-1-mlevitsk@redhat.com>

Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner , stable@vger.kernel.org" , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" When the guest doesn't enable paging, and NPT/EPT is disabled, we use guest't paging CR3's as KVM's shadow paging pointer and we are technically in direct mode as if we were to use NPT/EPT. In direct mode we create SPTEs with user mode permissions because usually in the direct mode the NPT/EPT doesn't need to restrict access based on guest CPL (there are MBE/GMET extenstions for that but KVM doesn't use them). In this special "use guest paging as direct" mode however, and if CR4.SMAP/CR4.SMEP are enabled, that will make the CPU fault on each access and KVM will enter endless loop of page faults. Since page protection doesn't have any meaning in !PG case, just don't passthrough these bits. The fix is the same as was done for VMX in commit: commit 656ec4a4928a ("KVM: VMX: fix SMEP and SMAP without EPT") This fixes the boot of windows 10 without NPT for good. (Without this patch, BSP boots, but APs were stuck in endless loop of page faults, causing the VM boot with 1 CPU) Signed-off-by: Maxim Levitsky Cc: stable@vger.kernel.org --- arch/x86/kvm/svm/svm.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 975be872cd1a3..995c203a62fd9 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1596,6 +1596,7 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_svm *svm = to_svm(vcpu); u64 hcr0 = cr0; + bool old_paging = is_paging(vcpu); #ifdef CONFIG_X86_64 if (vcpu->arch.efer & EFER_LME && !vcpu->arch.guest_state_protected) { @@ -1612,8 +1613,11 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) #endif vcpu->arch.cr0 = cr0; - if (!npt_enabled) + if (!npt_enabled) { hcr0 |= X86_CR0_PG | X86_CR0_WP; + if (old_paging != is_paging(vcpu)) + svm_set_cr4(vcpu, kvm_read_cr4(vcpu)); + } /* * re-enable caching here because the QEMU bios @@ -1657,8 +1661,12 @@ void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) svm_flush_tlb_current(vcpu); vcpu->arch.cr4 = cr4; - if (!npt_enabled) + if (!npt_enabled) { cr4 |= X86_CR4_PAE; + + if (!is_paging(vcpu)) + cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE); + } cr4 |= host_cr4_mce; to_svm(vcpu)->vmcb->save.cr4 = cr4; vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR); From patchwork Mon Feb 7 15:28:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737455 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8F7FEC433F5 for ; Mon, 7 Feb 2022 15:29:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C37D510E87A; Mon, 7 Feb 2022 15:29:22 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id DCD7910E6B8 for ; Mon, 7 Feb 2022 15:29:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247759; 
From patchwork Mon Feb 7 15:28:19 2022
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH 02/30] KVM: x86: nSVM: fix potential NULL dereference on nested migration
Date: Mon, 7 Feb 2022 17:28:19 +0200
Message-Id: <20220207152847.836777-3-mlevitsk@redhat.com>
In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com>
References: <20220207152847.836777-1-mlevitsk@redhat.com>

It turns out that due to review feedback and/or rebases I accidentally
moved the call to nested_svm_load_cr3 too early, before the NPT is
enabled, which is very wrong to do: KVM can't even access guest memory
at that point, as nested NPT is needed for that, and of course it won't
initialize the walk_mmu, which is the main issue the patch was
addressing.

Fix this for real.
Fixes: 232f75d3b4b5 ("KVM: nSVM: call nested_svm_load_cr3 on nested state load")
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky
---
 arch/x86/kvm/svm/nested.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 1218b5a342fc8..39d280e7e80ef 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1457,18 +1457,6 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	    !__nested_vmcb_check_save(vcpu, &save_cached))
 		goto out_free;
 
-	/*
-	 * While the nested guest CR3 is already checked and set by
-	 * KVM_SET_SREGS, it was set when nested state was yet loaded,
-	 * thus MMU might not be initialized correctly.
-	 * Set it again to fix this.
-	 */
-
-	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
-				  nested_npt_enabled(svm), false);
-	if (WARN_ON_ONCE(ret))
-		goto out_free;
-
 	/*
 	 * All checks done, we can enter guest mode. Userspace provides
@@ -1494,6 +1482,20 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	svm_switch_vmcb(svm, &svm->nested.vmcb02);
 	nested_vmcb02_prepare_control(svm);
 
+	/*
+	 * While the nested guest CR3 is already checked and set by
+	 * KVM_SET_SREGS, it was set when nested state was yet loaded,
+	 * thus MMU might not be initialized correctly.
+	 * Set it again to fix this.
+	 */
+
+	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
+				  nested_npt_enabled(svm), false);
+	if (WARN_ON_ONCE(ret))
+		goto out_free;
+
+	kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
 
 	ret = 0;
 out_free:
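The invariant behind the move can be summarized in a short sketch; the
helper names below are illustrative stand-ins for svm_switch_vmcb,
nested_vmcb02_prepare_control and nested_svm_load_cr3, not the real
KVM call graph:

	struct vcpu;	/* opaque stand-in for struct kvm_vcpu */

	void switch_to_vmcb02(struct vcpu *v);		/* switch to the nested VMCB   */
	void prepare_nested_control(struct vcpu *v);	/* sets up nested NPT/walk_mmu */
	int  load_guest_cr3(struct vcpu *v);		/* accesses guest memory       */

	/* The ordering the fix restores: build the nested MMU context first,
	 * load CR3 through it second; the reverse order goes through an
	 * uninitialized walk_mmu. */
	int restore_nested_state(struct vcpu *v)
	{
		switch_to_vmcb02(v);
		prepare_nested_control(v);
		return load_guest_cr3(v);	/* safe only after the calls above */
	}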
From patchwork Mon Feb 7 15:28:20 2022
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH 03/30] KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state
Date: Mon, 7 Feb 2022 17:28:20 +0200
Message-Id: <20220207152847.836777-4-mlevitsk@redhat.com>
In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com>
References: <20220207152847.836777-1-mlevitsk@redhat.com>

While restoring the SMM state usually makes KVM enter the nested guest,
and thus a different vmcb (vmcb02 vs. vmcb01), KVM should still mark
vmcb01 as dirty, since hardware can in theory cache multiple vmcbs.

Failure to do so, combined with not setting nested_run_pending (which
is fixed in the next patch), might make KVM re-enter vmcb01, which was
just exited from, with a completely different set of guest state
registers (SMM vs. non-SMM) and without the proper dirty bits set. This
results in the CPU reusing a stale IDTR pointer, which leads to a guest
shutdown on any interrupt.

On real hardware this usually doesn't happen, but when running nested,
L0's KVM does check and honour a few clean bits, causing this issue to
happen.

This patch fixes the boot of a Hyper-V and SMM enabled Windows VM
running nested on KVM.
Signed-off-by: Maxim Levitsky
Cc: stable@vger.kernel.org
---
 arch/x86/kvm/svm/svm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 995c203a62fd9..3f1d11e652123 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4267,6 +4267,8 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 	 * Enter the nested guest now
 	 */
 
+	vmcb_mark_all_dirty(svm->vmcb01.ptr);
+
 	vmcb12 = map.hva;
 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
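For context, a minimal sketch of the clean-bits contract this relies
on, assuming a simplified control area (it mirrors what KVM's
vmcb_mark_all_dirty does; this is not the full vmcb_control_area
layout):

	#include <linux/types.h>

	struct vmcb_ctl_sketch {
		u32 clean;	/* bit N set => the CPU may reuse its cached copy
				 * of VMCB field group N instead of re-reading it */
	};

	/* Clearing every clean bit forces the CPU to reload all guest state,
	 * the IDTR included, on the next VMRUN of this VMCB. */
	static inline void mark_all_dirty(struct vmcb_ctl_sketch *ctl)
	{
		ctl->clean = 0;
	}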
Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner , stable@vger.kernel.org" , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" While RSM induced VM entries are not full VM entries, they still need to be followed by actual VM entry to complete it, unlike setting the nested state. This patch fixes boot of hyperv and SMM enabled windows VM running nested on KVM, which fail due to this issue combined with lack of dirty bit setting. Signed-off-by: Maxim Levitsky Cc: stable@vger.kernel.org --- arch/x86/kvm/svm/svm.c | 5 +++++ arch/x86/kvm/vmx/vmx.c | 1 + 2 files changed, 6 insertions(+) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 3f1d11e652123..71bfa52121622 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4274,6 +4274,11 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false); + if (ret) + goto unmap_save; + + svm->nested.nested_run_pending = 1; + unmap_save: kvm_vcpu_unmap(vcpu, &map_save, true); unmap_map: diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 8ac5a6fa77203..fc9c4eca90a78 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7659,6 +7659,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) if (ret) return ret; + vmx->nested.nested_run_pending = 1; vmx->nested.smm.guest_mode = false; } return 0; From patchwork Mon Feb 7 15:28:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737458 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C8EEAC433EF for ; Mon, 7 Feb 2022 15:30:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E438B10EC21; Mon, 7 Feb 2022 15:30:03 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 23C0810EBFD for ; Mon, 7 Feb 2022 15:30:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247801; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qMnOM1ahvRbyv4h4oZCdQyrFX9rMZxmp7xDBGeC6Xow=; b=YFwRUehB3MFfzCtqr6Uh0iluK+fek54SL0DdpTC26FBL3WaT2l/J550rWyXeXpnhlSIHyS qMaa5Ji6glc1CXEFQBFjbfQV9Xhm1nWA4ya0zmK+3nPtucy/JFZfZvUWJr0/ZnwNAxKN/C 70JE1jiQWOfXOlm8hGCjjI/QJe88XN8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by 
From patchwork Mon Feb 7 15:28:22 2022
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH 05/30] KVM: x86: nSVM: expose clean bit support to the guest
Date: Mon, 7 Feb 2022 17:28:22 +0200
Message-Id: <20220207152847.836777-6-mlevitsk@redhat.com>
In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com>
References: <20220207152847.836777-1-mlevitsk@redhat.com>

KVM already honours a few clean bits, thus it makes sense to let the
nested guest know about it.

Note that KVM also doesn't check if the hardware supports clean bits,
and therefore nested KVM was already setting clean bits and L0's KVM
was already honouring them.
Signed-off-by: Maxim Levitsky
---
 arch/x86/kvm/svm/svm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 71bfa52121622..8013be9edf27c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4663,6 +4663,7 @@ static __init void svm_set_cpu_caps(void)
 	/* CPUID 0x80000001 and 0x8000000A (SVM features) */
 	if (nested) {
 		kvm_cpu_cap_set(X86_FEATURE_SVM);
+		kvm_cpu_cap_set(X86_FEATURE_VMCBCLEAN);
 
 		if (nrips)
 			kvm_cpu_cap_set(X86_FEATURE_NRIPS);
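For reference, a nested hypervisor would detect the newly exposed
capability through CPUID leaf 0x8000000A (VMCB Clean Bits is EDX bit 5
per the AMD APM); a small guest-side sketch using GCC/Clang's cpuid.h:

	#include <cpuid.h>
	#include <stdio.h>

	int main(void)
	{
		unsigned int eax, ebx, ecx, edx;

		/* Leaf 0x8000000A reports SVM features; EDX bit 5 = VMCB Clean Bits. */
		if (!__get_cpuid(0x8000000A, &eax, &ebx, &ecx, &edx))
			return 1;	/* leaf not available */

		printf("VMCB clean bits %ssupported\n", (edx & (1u << 5)) ? "" : "not ");
		return 0;
	}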
Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Use a dummy unused vmexit reason to mark the 'VM exit' that is happening when we exit to handle SMM, which is not a real VM exit. This makes it a bit easier to read the KVM trace, and avoids other potential problems. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 8013be9edf27c..9a4e299ed5673 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4194,7 +4194,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, char *smstate) svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP]; svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP]; - ret = nested_svm_vmexit(svm); + ret = nested_svm_simple_vmexit(svm, SVM_EXIT_SW); if (ret) return ret; From patchwork Mon Feb 7 15:28:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737460 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 25662C433F5 for ; Mon, 7 Feb 2022 15:31:00 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EAF4510EBFD; Mon, 7 Feb 2022 15:30:58 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id B331310EDD9 for ; Mon, 7 Feb 2022 15:30:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247855; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZPXfeiFExpOXhTUBTnQ3rPqBk4UBybQ6CD98nOBVkzg=; b=GhjA4FGs2UE/t1ThDOwONDKvXEva1mQBkfEYQnw/l1cdpkD/2z6v2CRnON8mF5fprowSUG 5uBtuuvVQcLb03I5OhN46wRq0gQ8URxzBN2M/Y+ZunIfagf3A7ASBb7sUzgbWcT34v/Elf 2DC30BQh3JD9d1nH4VPywY/HELTRmX8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-387-qDpOo9wfOwuvnAJHTaZYeA-1; Mon, 07 Feb 2022 10:30:52 -0500 X-MC-Unique: qDpOo9wfOwuvnAJHTaZYeA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 347AA184C5E8; Mon, 7 Feb 2022 15:30:49 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) 
From patchwork Mon Feb 7 15:28:24 2022
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH 07/30] KVM: x86: nSVM: deal with L1 hypervisor that intercepts interrupts but lets L2 control them
Date: Mon, 7 Feb 2022 17:28:24 +0200
Message-Id: <20220207152847.836777-8-mlevitsk@redhat.com>
In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com>
References: <20220207152847.836777-1-mlevitsk@redhat.com>

Fix a corner case in which the L1 hypervisor intercepts interrupts
(INTERCEPT_INTR) and either doesn't set virtual interrupt masking
(V_INTR_MASKING) or enters a nested guest with EFLAGS.IF disabled prior
to the entry.

In this case, despite the fact that L1 intercepts the interrupts, KVM
still needs to set up an interrupt window to wait before injecting the
INTR vmexit. Currently, KVM instead enters an endless loop of
'req_immediate_exit'.

Exactly the same issue also happens for SMIs and NMIs; fix those as
well.

Note that on VMX this case is impossible, as there is only the 'vmexit
on external interrupts' execution control, which is either set, in
which case both the host's and the guest's EFLAGS.IF are ignored, or
not set, in which case no VM exits are delivered.

Signed-off-by: Maxim Levitsky
---
 arch/x86/kvm/svm/svm.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9a4e299ed5673..22e614008cf59 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3372,11 +3372,13 @@ static int svm_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 	if (svm->nested.nested_run_pending)
 		return -EBUSY;
 
+	if (svm_nmi_blocked(vcpu))
+		return 0;
+
 	/* An NMI must not be injected into L2 if it's supposed to VM-Exit. */
 	if (for_injection && is_guest_mode(vcpu) && nested_exit_on_nmi(svm))
 		return -EBUSY;
-
-	return !svm_nmi_blocked(vcpu);
+	return 1;
 }
 
 static bool svm_get_nmi_mask(struct kvm_vcpu *vcpu)
@@ -3428,9 +3430,13 @@ bool svm_interrupt_blocked(struct kvm_vcpu *vcpu)
 static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+
 	if (svm->nested.nested_run_pending)
 		return -EBUSY;
 
+	if (svm_interrupt_blocked(vcpu))
+		return 0;
+
 	/*
 	 * An IRQ must not be injected into L2 if it's supposed to VM-Exit,
 	 * e.g. if the IRQ arrived asynchronously after checking nested events.
@@ -3438,7 +3444,7 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 	if (for_injection && is_guest_mode(vcpu) && nested_exit_on_intr(svm))
 		return -EBUSY;
 
-	return !svm_interrupt_blocked(vcpu);
+	return 1;
 }
 
 static void svm_enable_irq_window(struct kvm_vcpu *vcpu)
@@ -4169,11 +4175,14 @@ static int svm_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
 	if (svm->nested.nested_run_pending)
 		return -EBUSY;
 
+	if (svm_smi_blocked(vcpu))
+		return 0;
+
 	/* An SMI must not be injected into L2 if it's supposed to VM-Exit. */
 	if (for_injection && is_guest_mode(vcpu) && nested_exit_on_smi(svm))
 		return -EBUSY;
 
-	return !svm_smi_blocked(vcpu);
+	return 1;
 }
 
 static int svm_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
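The reordering above preserves a three-state contract shared by all
the *_allowed() callbacks; a sketch of that contract with illustrative
predicate names (the real ones are svm_nmi_blocked,
nested_exit_on_nmi, and so on):

	#include <linux/types.h>
	#include <linux/errno.h>

	bool vm_entry_pending(void);		/* nested_run_pending      */
	bool event_blocked(void);		/* e.g. svm_nmi_blocked()  */
	bool injection_would_vmexit(void);	/* L1 intercepts the event */

	/* -EBUSY: retry later; 0: open an event window first (even when L1
	 * intercepts the event, which is what avoids the req_immediate_exit
	 * livelock); 1: inject now. */
	int event_allowed(void)
	{
		if (vm_entry_pending())
			return -EBUSY;
		if (event_blocked())
			return 0;
		if (injection_would_vmexit())
			return -EBUSY;
		return 1;
	}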
From patchwork Mon Feb 7 15:28:25 2022
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH 08/30] KVM: x86: lapic: don't touch irr_pending in kvm_apic_update_apicv when inhibiting it
Date: Mon, 7 Feb 2022 17:28:25 +0200
Message-Id: <20220207152847.836777-9-mlevitsk@redhat.com>
In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com>
References: <20220207152847.836777-1-mlevitsk@redhat.com>

kvm_apic_update_apicv is called while AVIC is still active, thus IRR
bits can be set by the CPU after it is called, without causing
irr_pending to be set to true. Also, the logic in avic_kick_target_vcpu
doesn't expect a race with this function, so to keep things simple,
just keep irr_pending set to true and let the next interrupt injection
into the guest clear it.

Signed-off-by: Maxim Levitsky
---
 arch/x86/kvm/lapic.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0da7d0960fcb5..dd4e2888c244b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2307,7 +2307,12 @@ void kvm_apic_update_apicv(struct kvm_vcpu *vcpu)
 		apic->irr_pending = true;
 		apic->isr_count = 1;
 	} else {
-		apic->irr_pending = (apic_search_irr(apic) != -1);
+		/*
+		 * Don't clear irr_pending, searching the IRR can race with
+		 * updates from the CPU as APICv is still active from hardware's
+		 * perspective. The flag will be cleared as appropriate when
+		 * KVM injects the interrupt.
+		 */
 		apic->isr_count = count_vectors(apic->regs + APIC_ISR);
 	}
 }
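A sketch of the "producers may set, only the consumer clears"
discipline the new comment describes (an illustrative APIC model, not
KVM's lapic code):

	#include <linux/types.h>

	struct apic_sketch {
		bool irr_pending;	/* hint: the IRR may be non-empty */
		/* ... IRR bitmap, updated concurrently by hardware while
		 * AVIC is still active ... */
	};

	int search_irr(struct apic_sketch *a);	/* -1 if the IRR is empty */

	/* Injection path: the one place allowed to clear the hint, since the
	 * authoritative IRR scan and the clear happen here, not racing with
	 * kvm_apic_update_apicv(). */
	static int consume_pending(struct apic_sketch *a)
	{
		int vec = search_irr(a);

		if (vec < 0)
			a->irr_pending = false;

		return vec;
	}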
Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" asm/svm.h is the correct place for all values that are defined in the SVM spec, and that includes AVIC. Also add some values from the spec that were not defined before and will be soon useful. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/msr-index.h | 1 + arch/x86/include/asm/svm.h | 36 ++++++++++++++++++++++++++++++++ arch/x86/kvm/svm/avic.c | 22 +------------------ arch/x86/kvm/svm/svm.h | 11 ---------- 4 files changed, 38 insertions(+), 32 deletions(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 01e2650b95859..552ff8a5ea023 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -476,6 +476,7 @@ #define MSR_AMD64_ICIBSEXTDCTL 0xc001103c #define MSR_AMD64_IBSOPDATA4 0xc001103d #define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */ +#define MSR_AMD64_SVM_AVIC_DOORBELL 0xc001011b #define MSR_AMD64_VM_PAGE_FLUSH 0xc001011e #define MSR_AMD64_SEV_ES_GHCB 0xc0010130 #define MSR_AMD64_SEV 0xc0010131 diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index b00dbc5fac2b2..bb2fb78523cee 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -220,6 +220,42 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define SVM_NESTED_CTL_SEV_ENABLE BIT(1) #define SVM_NESTED_CTL_SEV_ES_ENABLE BIT(2) + +/* AVIC */ +#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF) +#define AVIC_LOGICAL_ID_ENTRY_VALID_BIT 31 +#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31) + +#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK (0xFFULL) +#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK (0xFFFFFFFFFFULL << 12) +#define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK (1ULL << 62) +#define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK (1ULL << 63) +#define AVIC_PHYSICAL_ID_TABLE_SIZE_MASK (0xFF) + +#define AVIC_DOORBELL_PHYSICAL_ID_MASK (0xFF) + +#define AVIC_UNACCEL_ACCESS_WRITE_MASK 1 +#define AVIC_UNACCEL_ACCESS_OFFSET_MASK 0xFF0 +#define AVIC_UNACCEL_ACCESS_VECTOR_MASK 0xFFFFFFFF + +enum avic_ipi_failure_cause { + AVIC_IPI_FAILURE_INVALID_INT_TYPE, + AVIC_IPI_FAILURE_TARGET_NOT_RUNNING, + AVIC_IPI_FAILURE_INVALID_TARGET, + AVIC_IPI_FAILURE_INVALID_BACKING_PAGE, +}; + + +/* + * 0xff is broadcast, so the max index allowed for physical APIC ID + * table is 0xfe. APIC IDs above 0xff are reserved. + */ +#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff + +#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) +#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL + + struct vmcb_seg { u16 selector; u16 attrib; diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 99f907ec5aa8f..fabfc337e1c35 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -27,20 +27,6 @@ #include "irq.h" #include "svm.h" -#define SVM_AVIC_DOORBELL 0xc001011b - -#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) - -/* - * 0xff is broadcast, so the max index allowed for physical APIC ID - * table is 0xfe. APIC IDs above 0xff are reserved. 
- */
-#define AVIC_MAX_PHYSICAL_ID_COUNT	255
-
-#define AVIC_UNACCEL_ACCESS_WRITE_MASK		1
-#define AVIC_UNACCEL_ACCESS_OFFSET_MASK		0xFF0
-#define AVIC_UNACCEL_ACCESS_VECTOR_MASK	0xFFFFFFFF
-
 /* AVIC GATAG is encoded using VM and VCPU IDs */
 #define AVIC_VCPU_ID_BITS		8
 #define AVIC_VCPU_ID_MASK		((1 << AVIC_VCPU_ID_BITS) - 1)
@@ -73,12 +59,6 @@ struct amd_svm_iommu_ir {
 	void *data;	/* Storing pointer to struct amd_ir_data */
 };
 
-enum avic_ipi_failure_cause {
-	AVIC_IPI_FAILURE_INVALID_INT_TYPE,
-	AVIC_IPI_FAILURE_TARGET_NOT_RUNNING,
-	AVIC_IPI_FAILURE_INVALID_TARGET,
-	AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
-};
 
 /* Note:
  * This function is called from IOMMU driver to notify
@@ -702,7 +682,7 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
 	 * one is harmless).
 	 */
 	if (cpu != get_cpu())
-		wrmsrl(SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
+		wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
 	put_cpu();
 } else {
 	/*
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 852b12aee03d7..6343558982c73 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -555,17 +555,6 @@ extern struct kvm_x86_nested_ops svm_nested_ops;
 
 /* avic.c */
 
-#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK	(0xFF)
-#define AVIC_LOGICAL_ID_ENTRY_VALID_BIT			31
-#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK		(1 << 31)
-
-#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK	(0xFFULL)
-#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK	(0xFFFFFFFFFFULL << 12)
-#define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK		(1ULL << 62)
-#define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK		(1ULL << 63)
-
-#define VMCB_AVIC_APIC_BAR_MASK		0xFFFFFFFFFF000ULL
-
 int avic_ga_log_notifier(u32 ga_tag);
 void avic_vm_destroy(struct kvm *kvm);
 int avic_vm_init(struct kvm *kvm);
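As a usage illustration, a physical APIC ID table entry can now be
composed from the masks that live in asm/svm.h; a sketch with
illustrative parameter names (KVM sets these fields up in
avic_init_backing_page and avic_vcpu_load):

	#include <linux/types.h>
	#include <asm/svm.h>

	/* Compose a physical APIC ID table entry from a backing-page address
	 * and a host APIC ID, and mark it valid. */
	static u64 avic_physid_entry(u64 backing_page_pa, u32 host_apic_id)
	{
		u64 entry;

		entry  = backing_page_pa & AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK;
		entry |= host_apic_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
		entry |= AVIC_PHYSICAL_ID_ENTRY_VALID_MASK;

		return entry;
	}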
From patchwork Mon Feb 7 15:28:27 2022
From: Maxim Levitsky
To: kvm@vger.kernel.org
Subject: [PATCH 10/30] KVM: x86: SVM: fix race between interrupt delivery and AVIC inhibition
Date: Mon, 7 Feb 2022 17:28:27 +0200
Message-Id: <20220207152847.836777-11-mlevitsk@redhat.com>
In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com>
References: <20220207152847.836777-1-mlevitsk@redhat.com>

If svm_deliver_avic_intr is called just after the target vCPU's AVIC
got inhibited, it might read a stale value of vcpu->arch.apicv_active,
which can lead to the target vCPU not noticing the interrupt.

To fix this, use load-acquire/store-release so that, if the target vCPU
is IN_GUEST_MODE, we're guaranteed to see a previous disabling of the
AVIC. If AVIC has been disabled in the meanwhile, proceed with the
KVM_REQ_EVENT-based delivery.

All this complicated logic is actually exactly how we can handle an
incomplete IPI vmexit; the only difference lies in who sets IRR,
whether KVM or the processor. An incomplete IPI vmexit also has the
same races as svm_deliver_avic_intr, therefore handle it the same way
in avic_kick_target_vcpus as well.

Co-developed-by: Paolo Bonzini
Signed-off-by: Paolo Bonzini
Signed-off-by: Maxim Levitsky
---
 arch/x86/kvm/svm/avic.c | 73 ++++++++++++++---------------------------
 arch/x86/kvm/svm/svm.c  | 65 ++++++++++++++++++++++++++++--------
 arch/x86/kvm/svm/svm.h  |  3 ++
 arch/x86/kvm/x86.c      |  4 ++-
 4 files changed, 82 insertions(+), 63 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index fabfc337e1c35..4c2d622b3b9f0 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -269,6 +269,24 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+
+void avic_ring_doorbell(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Note, the vCPU could get migrated to a different pCPU at any
+	 * point, which could result in signalling the wrong/previous
+	 * pCPU. But if that happens the vCPU is guaranteed to do a
+	 * VMRUN (after being migrated) and thus will process pending
+	 * interrupts, i.e. a doorbell is not needed (and the spurious
+	 * one is harmless).
+	 */
+	int cpu = READ_ONCE(vcpu->cpu);
+
+	if (cpu != get_cpu())
+		wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
+	put_cpu();
+}
+
 static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
 				   u32 icrl, u32 icrh)
 {
@@ -284,8 +302,13 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
 					GET_APIC_DEST_FIELD(icrh),
-					icrl & APIC_DEST_MASK))
-			kvm_vcpu_wake_up(vcpu);
+					icrl & APIC_DEST_MASK)) {
+			vcpu->arch.apic->irr_pending = true;
+			svm_complete_interrupt_delivery(vcpu,
+							icrl & APIC_MODE_MASK,
+							icrl & APIC_INT_LEVELTRIG,
+							icrl & APIC_VECTOR_MASK);
+		}
 	}
 }
 
@@ -649,52 +672,6 @@ void avic_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 	return;
 }
 
-int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
-{
-	if (!vcpu->arch.apicv_active)
-		return -1;
-
-	kvm_lapic_set_irr(vec, vcpu->arch.apic);
-
-	/*
-	 * Pairs with the smp_mb_*() after setting vcpu->guest_mode in
-	 * vcpu_enter_guest() to ensure the write to the vIRR is ordered before
-	 * the read of guest_mode, which guarantees that either VMRUN will see
-	 * and process the new vIRR entry, or that the below code will signal
-	 * the doorbell if the vCPU is already running in the guest.
-	 */
-	smp_mb__after_atomic();
-
-	/*
-	 * Signal the doorbell to tell hardware to inject the IRQ if the vCPU
-	 * is in the guest. If the vCPU is not in the guest, hardware will
-	 * automatically process AVIC interrupts at VMRUN.
-	 */
-	if (vcpu->mode == IN_GUEST_MODE) {
-		int cpu = READ_ONCE(vcpu->cpu);
-
-		/*
-		 * Note, the vCPU could get migrated to a different pCPU at any
-		 * point, which could result in signalling the wrong/previous
-		 * pCPU. But if that happens the vCPU is guaranteed to do a
-		 * VMRUN (after being migrated) and thus will process pending
-		 * interrupts, i.e. a doorbell is not needed (and the spurious
-		 * one is harmless).
-		 */
-		if (cpu != get_cpu())
-			wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
-		put_cpu();
-	} else {
-		/*
-		 * Wake the vCPU if it was blocking. KVM will then detect the
-		 * pending IRQ when checking if the vCPU has a wake event.
-		 */
-		kvm_vcpu_wake_up(vcpu);
-	}
-
-	return 0;
-}
-
 bool avic_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
 {
 	return false;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 22e614008cf59..18d4d87e12e15 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3310,20 +3310,6 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu)
 		SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_INTR;
 }
 
-static void svm_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
-				  int trig_mode, int vector)
-{
-	struct kvm_vcpu *vcpu = apic->vcpu;
-
-	if (svm_deliver_avic_intr(vcpu, vector)) {
-		kvm_lapic_set_irr(vector, apic);
-		kvm_make_request(KVM_REQ_EVENT, vcpu);
-		kvm_vcpu_kick(vcpu);
-	} else {
-		trace_kvm_apicv_accept_irq(vcpu->vcpu_id, delivery_mode,
-					   trig_mode, vector);
-	}
-}
 
 static void svm_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 {
@@ -4142,6 +4128,57 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu,
 	return ret;
 }
 
+void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
+				     int trig_mode, int vec)
+{
+	/*
+	 * vcpu->arch.apicv_active must be read after vcpu->mode.
+	 * Pairs with smp_store_release in vcpu_enter_guest.
+ */ + bool in_guest_mode = (smp_load_acquire(&vcpu->mode) == IN_GUEST_MODE); + + if (!READ_ONCE(vcpu->arch.apicv_active)) { + /* + * Manually signal the event + */ + kvm_make_request(KVM_REQ_EVENT, vcpu); + kvm_vcpu_kick(vcpu); + return; + } + + trace_kvm_apicv_accept_irq(vcpu->vcpu_id, delivery_mode, trig_mode, vec); + + if (in_guest_mode) + /* + * Signal the doorbell to tell hardware to inject the IRQ if the vCPU + * is in the guest. If the vCPU is not in the guest, hardware will + * automatically process AVIC interrupts at VMRUN. + */ + avic_ring_doorbell(vcpu); + else + /* + * Wake the vCPU if it was blocking. KVM will then detect the + * pending IRQ when checking if the vCPU has a wake event. + */ + kvm_vcpu_wake_up(vcpu); +} + +static void svm_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, + int trig_mode, int vec) +{ + kvm_lapic_set_irr(vec, apic); + + /* + * Pairs with the smp_mb_*() after setting vcpu->guest_mode in + * vcpu_enter_guest() to ensure the write to the vIRR is ordered before + * the read of guest_mode, which guarantees that either VMRUN will see + * and process the new vIRR entry, or that the below code will signal + * the doorbell if the vCPU is already running in the guest. + */ + smp_mb__after_atomic(); + svm_complete_interrupt_delivery(apic->vcpu, delivery_mode, trig_mode, vec); +} + static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu) { } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 6343558982c73..83f9f95eced3e 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -488,6 +488,8 @@ void svm_set_gif(struct vcpu_svm *svm, bool value); int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code); void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr, int read, int write); +void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode, + int trig_mode, int vec); /* nested.c */ @@ -577,6 +579,7 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq, bool set); void avic_vcpu_blocking(struct kvm_vcpu *vcpu); void avic_vcpu_unblocking(struct kvm_vcpu *vcpu); +void avic_ring_doorbell(struct kvm_vcpu *vcpu); /* sev.c */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6f69f3e3635e2..8cb5390f75efe 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10039,7 +10039,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) * result in virtual interrupt delivery. */ local_irq_disable(); - vcpu->mode = IN_GUEST_MODE; + + /* Store vcpu->apicv_active before vcpu->mode. 
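+ * Pairs with the smp_load_acquire() of vcpu->mode in svm_complete_interrupt_delivery().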
*/ + smp_store_release(&vcpu->mode, IN_GUEST_MODE); srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx); From patchwork Mon Feb 7 15:28:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737464 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4E034C433FE for ; Mon, 7 Feb 2022 15:31:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 20DCC10F21D; Mon, 7 Feb 2022 15:31:33 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id B1997112456 for ; Mon, 7 Feb 2022 15:31:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247890; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7uKkJYp30pF3ErxkQlRiirr2VHEvcBthwFChel8Mbe4=; b=RJojYyB1ZqueEeszdYeyqcnRQnZT++waVBKz2M5bmslZy/O/0vlORg/ArwuTA1WiDSYSVd 3P9YWgpknNU2ZsfRkU2oGtRohl0/1zg9YXF78XzUAvQUG6ylx5+cG+ftk3cz6bMBwmYSVH MeFTTpuonGGWR0BCf6d1VEtILwLqKjU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-631-oNiW8xocPHS43XMSda0FrA-1; Mon, 07 Feb 2022 10:31:27 -0500 X-MC-Unique: oNiW8xocPHS43XMSda0FrA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4DDCF18B9EC3; Mon, 7 Feb 2022 15:31:23 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0398584A2C; Mon, 7 Feb 2022 15:31:14 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 11/30] KVM: x86: SVM: use vmcb01 in avic_init_vmcb Date: Mon, 7 Feb 2022 17:28:28 +0200 Message-Id: <20220207152847.836777-12-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. 
Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Out of precation use vmcb01 when enabling host AVIC. No functional change intended. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/avic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 4c2d622b3b9f0..c6072245f7fbb 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -167,7 +167,7 @@ int avic_vm_init(struct kvm *kvm) void avic_init_vmcb(struct vcpu_svm *svm) { - struct vmcb *vmcb = svm->vmcb; + struct vmcb *vmcb = svm->vmcb01.ptr; struct kvm_svm *kvm_svm = to_kvm_svm(svm->vcpu.kvm); phys_addr_t bpa = __sme_set(page_to_phys(svm->avic_backing_page)); phys_addr_t lpa = __sme_set(page_to_phys(kvm_svm->avic_logical_id_table_page)); From patchwork Mon Feb 7 15:28:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E5E1FC433F5 for ; Mon, 7 Feb 2022 15:31:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0B9E1112459; Mon, 7 Feb 2022 15:31:43 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id D9C79112459 for ; Mon, 7 Feb 2022 15:31:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247900; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WfBR/k98L4+bC6Zh+Zl9tYY0D7k4ei3pM6o4QxmereM=; b=XiqYuG4TU/np+uEYWJOnhdZKI677yNYjJL5oo3NAhUcmG0M9cSKaEyOO3so8+7ryEYNJBX 3e0VY0NeDpA0ND1JWny9CPqY4yjQQSjNjpmT/7x6s4ZX/3HDT6x4xP+tZFvKmIltu4PIeY VgMrdr7/RX293akli/fWzvqn6NGFBbM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-85-dCkIUO6NMZKdvlNTztAonA-1; Mon, 07 Feb 2022 10:31:37 -0500 X-MC-Unique: dCkIUO6NMZKdvlNTztAonA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 761201006AA6; Mon, 7 Feb 2022 15:31:31 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id BC7835E495; Mon, 7 Feb 2022 15:31:23 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 12/30] KVM: x86: SVM: allow AVIC to co-exist with a nested guest running Date: Mon, 7 Feb 2022 17:28:29 +0200 Message-Id: <20220207152847.836777-13-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 
Inhibit the AVIC of a vCPU that is running nested, for the duration of the nested run, so that all interrupts arriving from both its vCPU siblings and from KVM are delivered using normal IPIs and cause that vCPU to vmexit. Note that unlike normal AVIC inhibition, there is no need to update the AVIC mmio memslot, because the nested guest uses its own set of paging tables. That also means that AVIC doesn't need to be inhibited VM wide. Note that, in theory, when a nested guest doesn't intercept physical interrupts we could continue using AVIC to deliver them to it, but we don't bother doing so for now. Besides, when nested AVIC is implemented, the nested guest will likely use it, which would prevent this optimization from being used anyway (the real AVIC can't be used to support both L1 and L2 at the same time). Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 8 +++++++- arch/x86/kvm/svm/avic.c | 7 ++++++- arch/x86/kvm/svm/nested.c | 15 ++++++++++----- arch/x86/kvm/svm/svm.c | 31 +++++++++++++++++++----------- arch/x86/kvm/svm/svm.h | 1 + arch/x86/kvm/x86.c | 18 +++++++++++++++-- 7 files changed, 61 insertions(+), 20 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 9e37dc3d88636..c0d8f351dcbc0 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -125,6 +125,7 @@ KVM_X86_OP_NULL(migrate_timers) KVM_X86_OP(msr_filter_changed) KVM_X86_OP_NULL(complete_emulated_msr) KVM_X86_OP(vcpu_deliver_sipi_vector) +KVM_X86_OP_NULL(vcpu_has_apicv_inhibit_condition); #undef KVM_X86_OP #undef KVM_X86_OP_NULL diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c371ee7e45f78..256539c0481c5 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1039,7 +1039,6 @@ struct kvm_x86_msr_filter { #define APICV_INHIBIT_REASON_DISABLE 0 #define APICV_INHIBIT_REASON_HYPERV 1 -#define APICV_INHIBIT_REASON_NESTED 2 #define APICV_INHIBIT_REASON_IRQWIN 3 #define APICV_INHIBIT_REASON_PIT_REINJ 4 #define APICV_INHIBIT_REASON_X2APIC 5 @@ -1494,6 +1493,12 @@ struct kvm_x86_ops { int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err); void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector); + + /* + * Returns true if for some reason APICv (e.g. guest mode) + * must be inhibited on this vCPU + */ + bool (*vcpu_has_apicv_inhibit_condition)(struct kvm_vcpu *vcpu); }; struct kvm_x86_nested_ops { @@ -1784,6 +1789,7 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva, bool kvm_apicv_activated(struct kvm
*kvm); void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu); +bool vcpu_has_apicv_inhibit_condition(struct kvm_vcpu *vcpu); void kvm_request_apicv_update(struct kvm *kvm, bool activate, unsigned long bit); diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index c6072245f7fbb..8f23e7d239097 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -677,6 +677,12 @@ bool avic_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu) return false; } +bool avic_has_vcpu_inhibit_condition(struct kvm_vcpu *vcpu) +{ + return is_guest_mode(vcpu); +} + + static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi) { unsigned long flags; @@ -888,7 +894,6 @@ bool avic_check_apicv_inhibit_reasons(ulong bit) ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) | BIT(APICV_INHIBIT_REASON_ABSENT) | BIT(APICV_INHIBIT_REASON_HYPERV) | - BIT(APICV_INHIBIT_REASON_NESTED) | BIT(APICV_INHIBIT_REASON_IRQWIN) | BIT(APICV_INHIBIT_REASON_PIT_REINJ) | BIT(APICV_INHIBIT_REASON_X2APIC) | diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 39d280e7e80ef..ac9159b0618c7 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -551,11 +551,6 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes. */ - /* - * Also covers avic_vapic_bar, avic_backing_page, avic_logical_id, - * avic_physical_id. - */ - WARN_ON(kvm_apicv_activated(svm->vcpu.kvm)); /* Copied from vmcb01. msrpm_base can be overwritten later. */ svm->vmcb->control.nested_ctl = svm->vmcb01.ptr->control.nested_ctl; @@ -659,6 +654,9 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, svm_set_gif(svm, true); + if (kvm_vcpu_apicv_active(vcpu)) + kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu); + return 0; } @@ -923,6 +921,13 @@ int nested_svm_vmexit(struct vcpu_svm *svm) if (unlikely(svm->vmcb->save.rflags & X86_EFLAGS_TF)) kvm_queue_exception(&(svm->vcpu), DB_VECTOR); + /* + * Un-inhibit the AVIC right away, so that other vCPUs can start + * to benefit from VM-exit less IPI right away + */ + if (kvm_apicv_activated(vcpu->kvm)) + kvm_vcpu_update_apicv(vcpu); + return 0; } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 18d4d87e12e15..85035324ed762 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1392,7 +1392,8 @@ static void svm_set_vintr(struct vcpu_svm *svm) /* * The following fields are ignored when AVIC is enabled */ - WARN_ON(kvm_apicv_activated(svm->vcpu.kvm)); + if (!is_guest_mode(&svm->vcpu)) + WARN_ON(kvm_apicv_activated(svm->vcpu.kvm)); svm_set_intercept(svm, INTERCEPT_VINTR); @@ -2898,10 +2899,16 @@ static int interrupt_window_interception(struct kvm_vcpu *vcpu) svm_clear_vintr(to_svm(vcpu)); /* - * For AVIC, the only reason to end up here is ExtINTs. + * If not running nested, for AVIC, the only reason to end up here is ExtINTs. * In this case AVIC was temporarily disabled for * requesting the IRQ window and we have to re-enable it. + * + * If running nested, still uninhibit the AVIC in case irq window + * was requested when it was not running nested. + * All vCPUs which run nested will have their AVIC still + * inhibited due to AVIC inhibition override for that. */ + kvm_request_apicv_update(vcpu->kvm, true, APICV_INHIBIT_REASON_IRQWIN); ++vcpu->stat.irq_window_exits; @@ -3451,8 +3458,16 @@ static void svm_enable_irq_window(struct kvm_vcpu *vcpu) * unless we have pending ExtINT since it cannot be injected * via AVIC. 
In such case, we need to temporarily disable AVIC, * and fallback to injecting IRQ via V_IRQ. + * + * If running nested, this vCPU will use separate page tables + * which don't have L1's AVIC mapped, and the AVIC is + * already inhibited thus there is no need for global + * AVIC inhibition. */ - kvm_request_apicv_update(vcpu->kvm, false, APICV_INHIBIT_REASON_IRQWIN); + + if (!is_guest_mode(vcpu)) + kvm_request_apicv_update(vcpu->kvm, false, APICV_INHIBIT_REASON_IRQWIN); + svm_set_vintr(svm); } } @@ -3927,14 +3942,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) if (guest_cpuid_has(vcpu, X86_FEATURE_X2APIC)) kvm_request_apicv_update(vcpu->kvm, false, APICV_INHIBIT_REASON_X2APIC); - - /* - * Currently, AVIC does not work with nested virtualization. - * So, we disable AVIC when cpuid for SVM is set in the L1 guest. - */ - if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM)) - kvm_request_apicv_update(vcpu->kvm, false, - APICV_INHIBIT_REASON_NESTED); } init_vmcb_after_set_cpuid(vcpu); } @@ -4657,6 +4664,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .complete_emulated_msr = svm_complete_emulated_msr, .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector, + .vcpu_has_apicv_inhibit_condition = avic_has_vcpu_inhibit_condition, }; /* @@ -4840,6 +4848,7 @@ static __init int svm_hardware_setup(void) } else { svm_x86_ops.vcpu_blocking = NULL; svm_x86_ops.vcpu_unblocking = NULL; + svm_x86_ops.vcpu_has_apicv_inhibit_condition = NULL; } if (vls) { diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 83f9f95eced3e..c02903641d13d 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -580,6 +580,7 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq, void avic_vcpu_blocking(struct kvm_vcpu *vcpu); void avic_vcpu_unblocking(struct kvm_vcpu *vcpu); void avic_ring_doorbell(struct kvm_vcpu *vcpu); +bool avic_has_vcpu_inhibit_condition(struct kvm_vcpu *vcpu); /* sev.c */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8cb5390f75efe..63d84c373e465 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9697,6 +9697,14 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm) kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC); } +bool vcpu_has_apicv_inhibit_condition(struct kvm_vcpu *vcpu) +{ + if (kvm_x86_ops.vcpu_has_apicv_inhibit_condition) + return static_call(kvm_x86_vcpu_has_apicv_inhibit_condition)(vcpu); + else + return false; +} + void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu) { bool activate; @@ -9706,7 +9714,9 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu) down_read(&vcpu->kvm->arch.apicv_update_lock); - activate = kvm_apicv_activated(vcpu->kvm); + activate = kvm_apicv_activated(vcpu->kvm) && + !vcpu_has_apicv_inhibit_condition(vcpu); + if (vcpu->arch.apicv_active == activate) goto out; @@ -10110,7 +10120,11 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) * per-VM state, and responsing vCPUs must wait for the update * to complete before servicing KVM_REQ_APICV_UPDATE. 
*/ - WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu)); + if (vcpu_has_apicv_inhibit_condition(vcpu)) + WARN_ON(kvm_vcpu_apicv_active(vcpu)); + else + WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu)); + exit_fastpath = static_call(kvm_x86_vcpu_run)(vcpu); if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST)) From patchwork Mon Feb 7 15:28:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737466 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E0565C433EF for ; Mon, 7 Feb 2022 15:31:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0075B11245A; Mon, 7 Feb 2022 15:31:47 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id B7AB111245A for ; Mon, 7 Feb 2022 15:31:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247905; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0HEZBYTVo91JCiV+9jLxhzD11zRa4eS0IgQUtngVtbM=; b=T/rVZQbsTXQu71Ttes2v5dCfEcOL/EDNHDaQOnjz+U5S3QZkIr1ujMJX2A78agN13dYS9/ OXus4iF8gsdWDqesfb1c4tg9DWvb4BvG8fmUVX5IWzXtlf3yNMtE+Ig8eiZfFZkZycuQF6 6TQuF+/v1ggk5IjGa/NFNzPQk8jzdzc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-594-fI_o2JAcPgqP5JSEjuEhuA-1; Mon, 07 Feb 2022 10:31:42 -0500 X-MC-Unique: fI_o2JAcPgqP5JSEjuEhuA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8DD381091DB8; Mon, 7 Feb 2022 15:31:39 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id E432C7D728; Mon, 7 Feb 2022 15:31:31 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 13/30] KVM: x86: lapic: don't allow to change APIC ID when apic acceleration is enabled Date: Mon, 7 Feb 2022 17:28:30 +0200 Message-Id: <20220207152847.836777-14-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. 
Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" No normal guest has any reason to change physical APIC IDs, and allowing this introduces bugs into APIC acceleration code. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/lapic.c | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index dd4e2888c244b..7ff695cab27b2 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2002,10 +2002,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) switch (reg) { case APIC_ID: /* Local APIC ID */ - if (!apic_x2apic_mode(apic)) - kvm_apic_set_xapic_id(apic, val >> 24); - else + if (apic_x2apic_mode(apic)) { ret = 1; + break; + } + /* + * Don't allow setting APIC ID with any APIC acceleration + * enabled to avoid unexpected issues + */ + if (enable_apicv && ((val >> 24) != apic->vcpu->vcpu_id)) { + kvm_vm_bugged(apic->vcpu->kvm); + break; + } + + kvm_apic_set_xapic_id(apic, val >> 24); break; case APIC_TASKPRI: @@ -2572,10 +2582,16 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu) static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s, bool set) { - if (apic_x2apic_mode(vcpu->arch.apic)) { - u32 *id = (u32 *)(s->regs + APIC_ID); - u32 *ldr = (u32 *)(s->regs + APIC_LDR); + u32 *id = (u32 *)(s->regs + APIC_ID); + u32 *ldr = (u32 *)(s->regs + APIC_LDR); + if (!apic_x2apic_mode(vcpu->arch.apic)) { + /* Don't allow setting APIC ID with any APIC acceleration + * enabled to avoid unexpected issues + */ + if (enable_apicv && (*id >> 24) != vcpu->vcpu_id) + return -EINVAL; + } else { if (vcpu->kvm->arch.x2apic_format) { if (*id != vcpu->vcpu_id) return -EINVAL; From patchwork Mon Feb 7 15:28:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737467 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CD358C433F5 for ; Mon, 7 Feb 2022 15:31:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D3ED1112466; Mon, 7 Feb 2022 15:31:56 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0395C112460 for ; Mon, 7 Feb 2022 15:31:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247915; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 14/30] KVM: x86: lapic: don't allow to change local apic id when using older x2apic api Date: Mon, 7 Feb 2022 17:28:31 +0200 Message-Id: <20220207152847.836777-15-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 KVM allowed userspace to set a non-boot APIC ID by loading the APIC state when the older, non-x2APIC 32-bit APIC ID userspace API was in use. Don't allow this anymore.
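For reference, a minimal sketch of the two encodings that the fixup below reconciles (illustration only; the helper names are hypothetical, this is not kernel code): in xAPIC mode, and in the older 32-bit userspace API, the APIC ID is stored in bits 31:24 of the APIC_ID register value, while the x2APIC format passes the full 32-bit ID directly.

#include <stdint.h>

/* xAPIC / older 32-bit API: the APIC ID lives in bits 31:24. */
static inline uint32_t xapic_reg_to_id(uint32_t reg)
{
	return reg >> 24;
}

static inline uint32_t xapic_id_to_reg(uint32_t id)
{
	return id << 24;
}

/* x2APIC format: the register value is the full 32-bit APIC ID. */
static inline uint32_t x2apic_reg_to_id(uint32_t reg)
{
	return reg;
}

With the change below, the ID from the older format is first normalized (shifted right by 24) on set, then required to equal vcpu_id in both formats, and only shifted back on get.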
Signed-off-by: Maxim Levitsky --- arch/x86/kvm/lapic.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 7ff695cab27b2..aeddd68d31181 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2592,15 +2592,15 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu, if (enable_apicv && (*id >> 24) != vcpu->vcpu_id) return -EINVAL; } else { - if (vcpu->kvm->arch.x2apic_format) { - if (*id != vcpu->vcpu_id) - return -EINVAL; - } else { - if (set) - *id >>= 24; - else - *id <<= 24; - } + + if (!vcpu->kvm->arch.x2apic_format && set) + *id >>= 24; + + if (*id != vcpu->vcpu_id) + return -EINVAL; + + if (!vcpu->kvm->arch.x2apic_format && !set) + *id <<= 24; /* In x2APIC mode, the LDR is fixed and based on the id */ if (set) From patchwork Mon Feb 7 15:28:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737468 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2415BC433F5 for ; Mon, 7 Feb 2022 15:32:06 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9BC23112455; Mon, 7 Feb 2022 15:32:04 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id B124A11245B for ; Mon, 7 Feb 2022 15:32:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247921; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WZW9E72jh7L2Mkny6o+vpEWPJ8Jc7Yzvnii4XswRCco=; b=Tc811LZZ0oOaQYtD4QZionNuUCX21hH+nnylq97Ric0nGpII+Vly6j9gxYRMDHmel2/wwF xEaWYEkle/ESwWD0LgN2QGWKF01T2AUPHScJntAabUh3ZqWUSxdD7B4LDdCx3czcJx/x0C 35YRiEa1WvcI1+sA6uWEGu787erxGlA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-610-ZRBrq-jbMYiOvI2cZjgfxQ-1; Mon, 07 Feb 2022 10:31:58 -0500 X-MC-Unique: ZRBrq-jbMYiOvI2cZjgfxQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9CC96101F7AC; Mon, 7 Feb 2022 15:31:55 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0CB4E5E495; Mon, 7 Feb 2022 15:31:47 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 15/30] KVM: x86: SVM: remove avic's broken code that updated APIC ID Date: Mon, 7 Feb 2022 17:28:32 +0200 Message-Id: <20220207152847.836777-16-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 
Now that KVM no longer allows changing the APIC ID when AVIC is enabled, remove the buggy AVIC code that tried to handle such a change. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/avic.c | 35 ----------------------------------- 1 file changed, 35 deletions(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 8f23e7d239097..768252b3dfee6 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -440,35 +440,6 @@ static int avic_handle_ldr_update(struct kvm_vcpu *vcpu) return ret; } -static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu) -{ - u64 *old, *new; - struct vcpu_svm *svm = to_svm(vcpu); - u32 id = kvm_xapic_id(vcpu->arch.apic); - - if (vcpu->vcpu_id == id) - return 0; - - old = avic_get_physical_id_entry(vcpu, vcpu->vcpu_id); - new = avic_get_physical_id_entry(vcpu, id); - if (!new || !old) - return 1; - - /* We need to move physical_id_entry to new offset */ - *new = *old; - *old = 0ULL; - to_svm(vcpu)->avic_physical_id_cache = new; - - /* - * Also update the guest physical APIC ID in the logical - * APIC ID table entry if already setup the LDR.
- */ - if (svm->ldr_reg) - avic_handle_ldr_update(vcpu); - - return 0; -} - static void avic_handle_dfr_update(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -488,10 +459,6 @@ static int avic_unaccel_trap_write(struct vcpu_svm *svm) AVIC_UNACCEL_ACCESS_OFFSET_MASK; switch (offset) { - case APIC_ID: - if (avic_handle_apic_id_update(&svm->vcpu)) - return 0; - break; case APIC_LDR: if (avic_handle_ldr_update(&svm->vcpu)) return 0; @@ -584,8 +551,6 @@ int avic_init_vcpu(struct vcpu_svm *svm) void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu) { - if (avic_handle_apic_id_update(vcpu) != 0) - return; avic_handle_dfr_update(vcpu); avic_handle_ldr_update(vcpu); } From patchwork Mon Feb 7 15:28:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DD01EC433F5 for ; Mon, 7 Feb 2022 15:32:11 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 090B210EC6D; Mon, 7 Feb 2022 15:32:11 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 18C5810EC6D for ; Mon, 7 Feb 2022 15:32:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247929; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JIz72i323mNLi2DkTwdrDXZhWqpT5EG0EiEDO8eaza0=; b=YVAMnAG4Zi2k5dW0r52A1CkxDOajbXaa8p/ZpAVPWhT9fZmVMorzVZJHKUH1N1YBXQBc6T aflTMZnetUPIC4QSsY5sRCqNrCfvaHcx/WLz2BBKtwbDU2LdtalYseSSKHVcB/UjBPwIOx /DpPkDrQvaU8i9H3DfXVihqDCthoCMs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-52-xQCvYTM7OYOhVZ8PDv0x5Q-1; Mon, 07 Feb 2022 10:32:07 -0500 X-MC-Unique: xQCvYTM7OYOhVZ8PDv0x5Q-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9BDDF1091DA2; Mon, 7 Feb 2022 15:32:04 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 18C485E495; Mon, 7 Feb 2022 15:31:55 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 16/30] KVM: x86: SVM: allow to force AVIC to be enabled Date: Mon, 7 Feb 2022 17:28:33 +0200 Message-Id: <20220207152847.836777-17-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 
Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Apparently, on some systems AVIC is disabled in CPUID but is still usable. Allow the user to override CPUID if they are willing to take the risk. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/svm.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 85035324ed762..b88ca7f07a0fc 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -202,6 +202,9 @@ module_param(tsc_scaling, int, 0444); static bool avic; module_param(avic, bool, 0444); +static bool force_avic; +module_param_unsafe(force_avic, bool, 0444); + bool __read_mostly dump_invalid_vmcb; module_param(dump_invalid_vmcb, bool, 0644); @@ -4839,10 +4842,14 @@ static __init int svm_hardware_setup(void) nrips = false; } - enable_apicv = avic = avic && npt_enabled && boot_cpu_has(X86_FEATURE_AVIC); + enable_apicv = avic = avic && npt_enabled && (boot_cpu_has(X86_FEATURE_AVIC) || force_avic); if (enable_apicv) { - pr_info("AVIC enabled\n"); + if (!boot_cpu_has(X86_FEATURE_AVIC)) { + pr_warn("AVIC is not supported in CPUID but force enabled\n"); + pr_warn("Your system might crash and burn\n"); + } else + pr_info("AVIC enabled\n"); amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier); } else { From patchwork Mon Feb 7 15:28:34 2022
From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 17/30] KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set Date: Mon, 7 Feb 2022 17:28:34 +0200 Message-Id: <20220207152847.836777-18-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 It makes more sense to print the new SPTE value rather than the old one: the tracepoint reads the SPTE through sptep, so it has to be called after mmu_spte_update() has written the new value.
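A minimal, self-contained sketch of the ordering issue (illustration only; plain C stand-ins for the real KVM helpers and tracepoint):

#include <stdint.h>
#include <stdio.h>

/* Stand-in for trace_kvm_mmu_set_spte(), which reads the SPTE via a pointer. */
static void trace_set_spte(const uint64_t *sptep)
{
	printf("set_spte: %llx\n", (unsigned long long)*sptep);
}

int main(void)
{
	uint64_t spte = 0x123;	/* old shadow PTE value */

	/* Calling trace_set_spte(&spte) here would log the stale 0x123. */
	spte = 0x456;		/* the mmu_spte_update() analogue */
	trace_set_spte(&spte);	/* after the update: logs the new 0x456 */
	return 0;
}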
Signed-off-by: Maxim Levitsky --- arch/x86/kvm/mmu/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 296f8723f9ae9..43c7abdd6b70f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2708,8 +2708,8 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, if (*sptep == spte) { ret = RET_PF_SPURIOUS; } else { - trace_kvm_mmu_set_spte(level, gfn, sptep); flush |= mmu_spte_update(sptep, spte); + trace_kvm_mmu_set_spte(level, gfn, sptep); } if (wrprot) { From patchwork Mon Feb 7 15:28:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 078EEC43217 for ; Mon, 7 Feb 2022 15:32:33 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D98E411245E; Mon, 7 Feb 2022 15:32:30 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id BFC9011245F for ; Mon, 7 Feb 2022 15:32:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247948; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PA0ZlhnyV8+URLprs6c44r6sDD4SwJewRmeM2u7HZ1o=; b=DsPp4RRzQPzbfvOz7PJq/nWknH4r9Ucpy3SqHCIMR9Oqw3uM54rfxpKa3t6y//l0IB7JfQ 3kXy4hDw3LQO/Ch0HS2z/YORewFcbHAEnpeJjE/ezZlTrAg7sX1Bu2GjrJntjzcHEys22n FO0ze33ugxwxCeFin/C+LbAZOc9WQjU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-324-9nEFvhYxO4W6nDGlvtXIyA-1; Mon, 07 Feb 2022 10:32:25 -0500 X-MC-Unique: 9nEFvhYxO4W6nDGlvtXIyA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id DE53B2F25; Mon, 7 Feb 2022 15:32:20 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3863E84FF5; Mon, 7 Feb 2022 15:32:12 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 18/30] KVM: x86: mmu: add strict mmu mode Date: Mon, 7 Feb 2022 17:28:35 +0200 Message-Id: <20220207152847.836777-19-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. 
Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add an (mostly debug) option to force KVM's shadow mmu to never have unsync pages. This is useful in some cases to debug it. It is also useful for some legacy guest OSes which don't flush TLBs correctly, and thus don't work on modern CPUs which have speculative MMUs. Using this option together with legacy paging (npt/ept=0) allows to correctly simulate such old MMU while still getting most of the benefits of the virtualization. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/mmu/mmu.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 43c7abdd6b70f..fa2da6990703f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -91,6 +91,10 @@ __MODULE_PARM_TYPE(nx_huge_pages_recovery_period_ms, "uint"); static bool __read_mostly force_flush_and_sync_on_reuse; module_param_named(flush_on_reuse, force_flush_and_sync_on_reuse, bool, 0644); + +bool strict_mmu; +module_param(strict_mmu, bool, 0644); + /* * When setting this variable to true it enables Two-Dimensional-Paging * where the hardware walks 2 page tables: @@ -2703,7 +2707,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, } wrprot = make_spte(vcpu, sp, slot, pte_access, gfn, pfn, *sptep, prefetch, - true, host_writable, &spte); + !strict_mmu, host_writable, &spte); if (*sptep == spte) { ret = RET_PF_SPURIOUS; @@ -5139,6 +5143,11 @@ static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa, */ static bool detect_write_flooding(struct kvm_mmu_page *sp) { + /* + * When using non speculating MMU, use a bit higher threshold + * for write flood detection + */ + int threshold = strict_mmu ? 10 : 3; /* * Skip write-flooding detected for the sp whose level is 1, because * it can become unsync, then the guest page is not write-protected. 
@@ -5147,7 +5156,7 @@ static bool detect_write_flooding(struct kvm_mmu_page *sp) return false; atomic_inc(&sp->write_flooding_count); - return atomic_read(&sp->write_flooding_count) >= 3; + return atomic_read(&sp->write_flooding_count) >= threshold; } /* From patchwork Mon Feb 7 15:28:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737472 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A8764C433F5 for ; Mon, 7 Feb 2022 15:32:40 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 88CB2112462; Mon, 7 Feb 2022 15:32:39 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id E5F1511245F for ; Mon, 7 Feb 2022 15:32:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247957; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mXBdRvvYHg1PpTVknSy/2IjwX4+D+xql9YO4/ocE+w4=; b=Gn2Ki2rrKB71LPdkUix2cXM5HrrAey3POK24nrwvZF1wag1jd7WmQdWKT0Q6d2alWhdWPS 1H02UBJWcgY6LRbietHIS6eNvJqpUZuFUfWLbcuwL91xYcJsMw2MnAnjMDxJZyo94APt2F O62RJAIOb3iNPrglF9vbMPsBLFz++c0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-345-D1huxYWjMUCet-mvp-t50w-1; Mon, 07 Feb 2022 10:32:34 -0500 X-MC-Unique: D1huxYWjMUCet-mvp-t50w-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 75F5B85EE6F; Mon, 7 Feb 2022 15:32:29 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 587057D723; Mon, 7 Feb 2022 15:32:21 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 19/30] KVM: x86: mmu: add gfn_in_memslot helper Date: Mon, 7 Feb 2022 17:28:36 +0200 Message-Id: <20220207152847.836777-20-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. 
Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This is a tiny refactoring, and can be useful to check if a GPA/GFN is within a memslot a bit more cleanly. Signed-off-by: Maxim Levitsky --- include/linux/kvm_host.h | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index b3810976a27f8..483681c6e322e 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1564,6 +1564,13 @@ int kvm_request_irq_source_id(struct kvm *kvm); void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id); bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args); + +static inline bool gfn_in_memslot(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return (gfn >= slot->base_gfn && gfn < slot->base_gfn + slot->npages); +} + + /* * Returns a pointer to the memslot if it contains gfn. * Otherwise returns NULL. @@ -1574,12 +1581,13 @@ try_get_memslot(struct kvm_memory_slot *slot, gfn_t gfn) if (!slot) return NULL; - if (gfn >= slot->base_gfn && gfn < slot->base_gfn + slot->npages) + if (gfn_in_memslot(slot, gfn)) return slot; else return NULL; } + /* * Returns a pointer to the memslot that contains gfn. Otherwise returns NULL. 
* From patchwork Mon Feb 7 15:28:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C2C32C433F5 for ; Mon, 7 Feb 2022 15:32:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E513611245F; Mon, 7 Feb 2022 15:32:46 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id DA568112464 for ; Mon, 7 Feb 2022 15:32:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247965; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XmdC6IIj3xv5TnY6QoReJ3NyIf+jyplwQ94vtfIKROs=; b=eNGr2SIGA1dyfURS0128cD7Ggw8Ab/Itw/q0pmvmrrAbnXH2SwCUHomhgDENDuby7NzUlc UOLP74g1/w/vTzFIKLNPrE6UmzE5Fm8mC/xUFNlq/Tahe/TkbDbuUsLoVIan/n5IzMNvjA VV4YCgx5i0ce/cp9HK3dQjVgRW1/hMY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-320-w2DonqQvM02Z3iHkUzgbxA-1; Mon, 07 Feb 2022 10:32:41 -0500 X-MC-Unique: w2DonqQvM02Z3iHkUzgbxA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 241811091DA0; Mon, 7 Feb 2022 15:32:38 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id CE40284FFB; Mon, 7 Feb 2022 15:32:29 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 20/30] KVM: x86: mmu: allow to enable write tracking externally Date: Mon, 7 Feb 2022 17:28:37 +0200 Message-Id: <20220207152847.836777-21-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. 
Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This will be used to enable write tracking from nested AVIC code and can also be used to enable write tracking in GVT-g module when it actually uses it as opposed to always enabling it, when the module is compiled in the kernel. No functional change intended. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/kvm_page_track.h | 1 + arch/x86/kvm/mmu.h | 8 +++++--- arch/x86/kvm/mmu/mmu.c | 16 +++++++++------- arch/x86/kvm/mmu/page_track.c | 10 ++++++++-- 5 files changed, 24 insertions(+), 13 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 256539c0481c5..428ab1cc7dd34 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1225,7 +1225,7 @@ struct kvm_arch { * is used as one input when determining whether certain memslot * related allocations are necessary. */ - bool shadow_root_allocated; + bool mmu_page_tracking_enabled; #if IS_ENABLED(CONFIG_HYPERV) hpa_t hv_root_tdp; diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index eb186bc57f6a9..955a5ae07b10e 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -50,6 +50,7 @@ int kvm_page_track_init(struct kvm *kvm); void kvm_page_track_cleanup(struct kvm *kvm); bool kvm_page_track_write_tracking_enabled(struct kvm *kvm); +int kvm_page_track_write_tracking_enable(struct kvm *kvm); int kvm_page_track_write_tracking_alloc(struct kvm_memory_slot *slot); void kvm_page_track_free_memslot(struct kvm_memory_slot *slot); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 51faa2c76ca5f..48cc042f17466 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -267,7 +267,7 @@ int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); int kvm_mmu_post_init_vm(struct kvm *kvm); void kvm_mmu_pre_destroy_vm(struct kvm *kvm); -static inline bool kvm_shadow_root_allocated(struct kvm *kvm) +static inline bool mmu_page_tracking_enabled(struct kvm *kvm) { /* * Read shadow_root_allocated before related pointers. Hence, threads @@ -275,9 +275,11 @@ static inline bool kvm_shadow_root_allocated(struct kvm *kvm) * see the pointers. Pairs with smp_store_release in * mmu_first_shadow_root_alloc. 
*/ - return smp_load_acquire(&kvm->arch.shadow_root_allocated); + return smp_load_acquire(&kvm->arch.mmu_page_tracking_enabled); } +int mmu_enable_write_tracking(struct kvm *kvm); + #ifdef CONFIG_X86_64 static inline bool is_tdp_mmu_enabled(struct kvm *kvm) { return kvm->arch.tdp_mmu_enabled; } #else @@ -286,7 +288,7 @@ static inline bool is_tdp_mmu_enabled(struct kvm *kvm) { return false; } static inline bool kvm_memslots_have_rmaps(struct kvm *kvm) { - return !is_tdp_mmu_enabled(kvm) || kvm_shadow_root_allocated(kvm); + return !is_tdp_mmu_enabled(kvm) || mmu_page_tracking_enabled(kvm); } static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index fa2da6990703f..431e02ba73690 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3384,7 +3384,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) return r; } -static int mmu_first_shadow_root_alloc(struct kvm *kvm) +int mmu_enable_write_tracking(struct kvm *kvm) { struct kvm_memslots *slots; struct kvm_memory_slot *slot; @@ -3394,21 +3394,20 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm) * Check if this is the first shadow root being allocated before * taking the lock. */ - if (kvm_shadow_root_allocated(kvm)) + if (mmu_page_tracking_enabled(kvm)) return 0; mutex_lock(&kvm->slots_arch_lock); /* Recheck, under the lock, whether this is the first shadow root. */ - if (kvm_shadow_root_allocated(kvm)) + if (mmu_page_tracking_enabled(kvm)) goto out_unlock; /* * Check if anything actually needs to be allocated, e.g. all metadata * will be allocated upfront if TDP is disabled. */ - if (kvm_memslots_have_rmaps(kvm) && - kvm_page_track_write_tracking_enabled(kvm)) + if (kvm_memslots_have_rmaps(kvm) && mmu_page_tracking_enabled(kvm)) goto out_success; for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { @@ -3438,7 +3437,7 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm) * all the related pointers are set. 
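As background for the acquire/release pairing referenced in the comment above, a hedged sketch of the double-checked enable pattern that mmu_enable_write_tracking() follows; allocate_tracking_metadata() is a hypothetical stand-in for the real per-memslot allocations:

#include <linux/kvm_host.h>

/*
 * Writers fully allocate the metadata, then publish the flag with
 * store-release; lock-free readers check the flag with load-acquire,
 * which guarantees they also observe the allocations.
 */
static bool tracking_enabled(struct kvm *kvm)
{
	return smp_load_acquire(&kvm->arch.mmu_page_tracking_enabled);
}

static int tracking_enable(struct kvm *kvm)
{
	int r = 0;

	if (tracking_enabled(kvm))		/* fast path, no lock */
		return 0;

	mutex_lock(&kvm->slots_arch_lock);
	if (tracking_enabled(kvm))		/* recheck under the lock */
		goto out_unlock;

	r = allocate_tracking_metadata(kvm);	/* hypothetical helper */
	if (r)
		goto out_unlock;

	/* Publish only after every allocation above is complete. */
	smp_store_release(&kvm->arch.mmu_page_tracking_enabled, true);
out_unlock:
	mutex_unlock(&kvm->slots_arch_lock);
	return r;
}

The point of the store-release is that any reader that sees the flag set via load-acquire is also guaranteed to see the fully initialized metadata, without having to take slots_arch_lock itself.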
*/ out_success: - smp_store_release(&kvm->arch.shadow_root_allocated, true); + smp_store_release(&kvm->arch.mmu_page_tracking_enabled, true); out_unlock: mutex_unlock(&kvm->slots_arch_lock); @@ -3475,7 +3474,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) } } - r = mmu_first_shadow_root_alloc(vcpu->kvm); + r = mmu_enable_write_tracking(vcpu->kvm); if (r) return r; @@ -5712,6 +5711,9 @@ void kvm_mmu_init_vm(struct kvm *kvm) node->track_write = kvm_mmu_pte_write; node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); + + if (IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) || !tdp_enabled) + mmu_enable_write_tracking(kvm); } void kvm_mmu_uninit_vm(struct kvm *kvm) diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 68eb1fb548b61..ce5735909e74c 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -21,10 +21,16 @@ bool kvm_page_track_write_tracking_enabled(struct kvm *kvm) { - return IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) || - !tdp_enabled || kvm_shadow_root_allocated(kvm); + return mmu_page_tracking_enabled(kvm); } +int kvm_page_track_write_tracking_enable(struct kvm *kvm) +{ + return mmu_enable_write_tracking(kvm); +} +EXPORT_SYMBOL_GPL(kvm_page_track_write_tracking_enable); + + void kvm_page_track_free_memslot(struct kvm_memory_slot *slot) { int i; From patchwork Mon Feb 7 15:28:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737474 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1FAC1C433FE for ; Mon, 7 Feb 2022 15:32:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EEB0911246B; Mon, 7 Feb 2022 15:32:54 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5ABAB112469 for ; Mon, 7 Feb 2022 15:32:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247972; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w984BAq0R1oM4sAFKCgHPFGUA43pzvjZKViTrxIXmwo=; b=GBrildwA5IKeIfaJrmmvBzBw2uJ3v1XICKmaKbwQMBe0SHGFlZPHKMh717LzaxcotK06Lg cTZqNML6qfeo9sn13kJEx9X7ZPQdQDBtbi/CFfv6maszCiHPiz/sigSKjS/brUawvGR5HH wdAhtoaJfVj7tOF2nIK03Mx0dLjsDK0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-48-n_weAD-0NLG-GcN03me7-g-1; Mon, 07 Feb 2022 10:32:49 -0500 X-MC-Unique: n_weAD-0NLG-GcN03me7-g-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 43C3B15720; Mon, 7 Feb 2022 15:32:46 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) 
by smtp.corp.redhat.com (Postfix) with ESMTP id 936F484FFB; Mon, 7 Feb 2022 15:32:38 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 21/30] x86: KVMGT: use kvm_page_track_write_tracking_enable Date: Mon, 7 Feb 2022 17:28:38 +0200 Message-Id: <20220207152847.836777-22-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This allows enabling write tracking only when KVMGT is actually used, so it carries no penalty otherwise. Tested by booting a VM with a kvmgt mdev device. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/Kconfig | 3 --- arch/x86/kvm/mmu/mmu.c | 2 +- drivers/gpu/drm/i915/Kconfig | 1 - drivers/gpu/drm/i915/gvt/kvmgt.c | 5 +++++ 4 files changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index ebc8ce9ec9173..169f8833cd0d1 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -132,7 +132,4 @@ config KVM_MMU_AUDIT This option adds a R/W kVM module parameter 'mmu_audit', which allows auditing of KVM MMU events at runtime.
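The Kconfig and KVMGT hunks continue below; before them, a hedged sketch of how an external consumer such as KVMGT calls the API added in the previous patch at guest-init time (my_guest_init() is a hypothetical caller, not code from this series):

#include <linux/kvm_host.h>
#include <asm/kvm_page_track.h>

static int my_guest_init(struct kvm *kvm)
{
	int ret;

	/*
	 * May allocate rmaps and gfn-track arrays for every memslot,
	 * so it can fail; propagate the error instead of ignoring it.
	 */
	ret = kvm_page_track_write_tracking_enable(kvm);
	if (ret)
		return ret;

	/* ... register a kvm_page_track_notifier_node here ... */
	return 0;
}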
-config KVM_EXTERNAL_WRITE_TRACKING - bool - endif # VIRTUALIZATION diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 431e02ba73690..e4e2fc8e7d7a5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5712,7 +5712,7 @@ void kvm_mmu_init_vm(struct kvm *kvm) node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); - if (IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) || !tdp_enabled) + if (!tdp_enabled) mmu_enable_write_tracking(kvm); } diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 84b6fc70cbf52..bf041b26ffec3 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -126,7 +126,6 @@ config DRM_I915_GVT_KVMGT depends on DRM_I915_GVT depends on KVM depends on VFIO_MDEV - select KVM_EXTERNAL_WRITE_TRACKING default n help Choose this option if you want to enable KVMGT support for diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 20b82fb036f8c..64ced3c2bc550 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1916,6 +1916,7 @@ static int kvmgt_guest_init(struct mdev_device *mdev) struct intel_vgpu *vgpu; struct kvmgt_vdev *vdev; struct kvm *kvm; + int ret; vgpu = mdev_get_drvdata(mdev); if (handle_valid(vgpu->handle)) @@ -1931,6 +1932,10 @@ static int kvmgt_guest_init(struct mdev_device *mdev) if (__kvmgt_vgpu_exist(vgpu, kvm)) return -EEXIST; + ret = kvm_page_track_write_tracking_enable(kvm); + if (ret) + return ret; + info = vzalloc(sizeof(struct kvmgt_guest_info)); if (!info) return -ENOMEM; From patchwork Mon Feb 7 15:28:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737475 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6DC24C433F5 for ; Mon, 7 Feb 2022 15:33:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5A54E11246C; Mon, 7 Feb 2022 15:33:03 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8686111246D for ; Mon, 7 Feb 2022 15:33:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247980; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=aV5jkqTeSUUXBrzTfKSBHIk3+2wrmrIK57QNIliC8wc=; b=ZScf/EftTbPUv7gCkOELlfEk+Vvw/TlfDjWhVWfIi2CWUyFKTBQlDA+s0q3RMBinKEgg0W h5iLmN8ZCdGKsE96mQFg6agZhrdwBjnovRpMuKFNt2ThzzhGElx87HthI5CrotHzThstcx GfbXPNOsMc3kOagBWhXjxf6S96YW4EQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-466-MrUjQzLENKyrcmMpkHhahg-1; Mon, 07 Feb 2022 10:32:57 -0500 X-MC-Unique: MrUjQzLENKyrcmMpkHhahg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher 
AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 491671923B9A; Mon, 7 Feb 2022 15:32:54 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id B1C7384FFB; Mon, 7 Feb 2022 15:32:46 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 22/30] KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running Date: Mon, 7 Feb 2022 17:28:39 +0200 Message-Id: <20220207152847.836777-23-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" When L2 is running without LBR virtualization, we should ensure that L1's LBR msrs continue to update as usual. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 11 +++++ arch/x86/kvm/svm/svm.c | 98 +++++++++++++++++++++++++++++++-------- arch/x86/kvm/svm/svm.h | 2 + 3 files changed, 92 insertions(+), 19 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index ac9159b0618c7..9f7bc7db08dd3 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -535,6 +535,9 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_ACTIVE_LOW; vmcb_mark_dirty(svm->vmcb, VMCB_DR); } + + if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) + svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); } static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) @@ -587,6 +590,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.event_inj = svm->nested.ctl.event_inj; svm->vmcb->control.event_inj_err = svm->nested.ctl.event_inj_err; + svm->vmcb->control.virt_ext = svm->vmcb01.ptr->control.virt_ext & + LBR_CTL_ENABLE_MASK; + nested_svm_transition_tlb_flush(vcpu); /* Enter Guest-Mode */ @@ -852,6 +858,11 @@ int nested_svm_vmexit(struct vcpu_svm *svm) svm_switch_vmcb(svm, &svm->vmcb01); + if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { + svm_copy_lbrs(svm->nested.vmcb02.ptr, svm->vmcb); + svm_update_lbrv(vcpu); + } + /* * On vmexit the GIF is set to false and * no event can be injected in L1. 
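Before the svm.c hunk below, a condensed restatement of the invariant this patch establishes; svm_lbr_vmcb() is a hypothetical name, the real code open-codes this selection inside svm_get_lbr_msr():

/*
 * The live LBR msrs are kept in whichever VMCB currently has LBR
 * virtualization enabled; while it is disabled they stay in vmcb01,
 * so nested entries and exits do not have to copy them.
 */
static struct vmcb *svm_lbr_vmcb(struct vcpu_svm *svm)
{
	return (svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) ?
		svm->vmcb : svm->vmcb01.ptr;
}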
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index b88ca7f07a0fc..294e016f575a8 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -805,6 +805,17 @@ static void init_msrpm_offsets(void) } } +void svm_copy_lbrs(struct vmcb *from_vmcb, struct vmcb *to_vmcb) +{ + to_vmcb->save.dbgctl = from_vmcb->save.dbgctl; + to_vmcb->save.br_from = from_vmcb->save.br_from; + to_vmcb->save.br_to = from_vmcb->save.br_to; + to_vmcb->save.last_excp_from = from_vmcb->save.last_excp_from; + to_vmcb->save.last_excp_to = from_vmcb->save.last_excp_to; + + vmcb_mark_dirty(to_vmcb, VMCB_LBR); +} + static void svm_enable_lbrv(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -814,6 +825,10 @@ static void svm_enable_lbrv(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 1, 1); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1); + + /* Move the LBR msrs to the vmcb02 so that the guest can see them. */ + if (is_guest_mode(vcpu)) + svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); } static void svm_disable_lbrv(struct kvm_vcpu *vcpu) @@ -825,6 +840,63 @@ static void svm_disable_lbrv(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 0, 0); + + /* + * Move the LBR msrs back to the vmcb01 to avoid copying them + * on nested guest entries. + */ + if (is_guest_mode(vcpu)) + svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr); +} + +static int svm_get_lbr_msr(struct vcpu_svm *svm, u32 index) +{ + /* + * If the LBR virtualization is disabled, the LBR msrs are always + * kept in the vmcb01 to avoid copying them on nested guest entries. + * + * If nested, and the LBR virtualization is enabled/disabled, the msrs + * are moved between the vmcb01 and vmcb02 as needed. + */ + struct vmcb *vmcb = + (svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) ? + svm->vmcb : svm->vmcb01.ptr; + + switch (index) { + case MSR_IA32_DEBUGCTLMSR: + return vmcb->save.dbgctl; + case MSR_IA32_LASTBRANCHFROMIP: + return vmcb->save.br_from; + case MSR_IA32_LASTBRANCHTOIP: + return vmcb->save.br_to; + case MSR_IA32_LASTINTFROMIP: + return vmcb->save.last_excp_from; + case MSR_IA32_LASTINTTOIP: + return vmcb->save.last_excp_to; + default: + KVM_BUG(false, svm->vcpu.kvm, + "%s: Unknown MSR 0x%x", __func__, index); + return 0; + } +} + +void svm_update_lbrv(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + bool enable_lbrv = svm_get_lbr_msr(svm, MSR_IA32_DEBUGCTLMSR) & + DEBUGCTLMSR_LBR; + + bool current_enable_lbrv = !!(svm->vmcb->control.virt_ext & + LBR_CTL_ENABLE_MASK); + + if (enable_lbrv == current_enable_lbrv) + return; + + if (enable_lbrv) + svm_enable_lbrv(vcpu); + else + svm_disable_lbrv(vcpu); } void disable_nmi_singlestep(struct vcpu_svm *svm) @@ -2591,25 +2663,12 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_TSC_AUX: msr_info->data = svm->tsc_aux; break; - /* - * Nobody will change the following 5 values in the VMCB so we can - * safely return them on rdmsr. They will always be 0 until LBRV is - * implemented. 
- */ case MSR_IA32_DEBUGCTLMSR: - msr_info->data = svm->vmcb->save.dbgctl; - break; case MSR_IA32_LASTBRANCHFROMIP: - msr_info->data = svm->vmcb->save.br_from; - break; case MSR_IA32_LASTBRANCHTOIP: - msr_info->data = svm->vmcb->save.br_to; - break; case MSR_IA32_LASTINTFROMIP: - msr_info->data = svm->vmcb->save.last_excp_from; - break; case MSR_IA32_LASTINTTOIP: - msr_info->data = svm->vmcb->save.last_excp_to; + msr_info->data = svm_get_lbr_msr(svm, msr_info->index); break; case MSR_VM_HSAVE_PA: msr_info->data = svm->nested.hsave_msr; @@ -2840,12 +2899,13 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) if (data & DEBUGCTL_RESERVED_BITS) return 1; - svm->vmcb->save.dbgctl = data; - vmcb_mark_dirty(svm->vmcb, VMCB_LBR); - if (data & (1ULL<<0)) - svm_enable_lbrv(vcpu); + if (svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) + svm->vmcb->save.dbgctl = data; else - svm_disable_lbrv(vcpu); + svm->vmcb01.ptr->save.dbgctl = data; + + svm_update_lbrv(vcpu); + break; case MSR_VM_HSAVE_PA: /* diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index c02903641d13d..b83e06d5d942a 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -476,6 +476,8 @@ u32 svm_msrpm_offset(u32 msr); u32 *svm_vcpu_alloc_msrpm(void); void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm); void svm_vcpu_free_msrpm(u32 *msrpm); +void svm_copy_lbrs(struct vmcb *from_vmcb, struct vmcb *to_vmcb); +void svm_update_lbrv(struct kvm_vcpu *vcpu); int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer); void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); From patchwork Mon Feb 7 15:28:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737476 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 163A1C433F5 for ; Mon, 7 Feb 2022 15:33:14 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 42747112457; Mon, 7 Feb 2022 15:33:13 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id C40EA11246E for ; Mon, 7 Feb 2022 15:33:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247990; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MrKkOBVSy/FtXXgDcr9UR7sqO5t4KzWH4q1ibusmu34=; b=dwDHZu2TTE7cW6Tn+QBoHe8f4U0WHo7Ko6l6ZotdRYg4NQRk/hTDMtD965ucWuR/qbaswy hFh/hbQVQwHS6Op6hqRQ1X6ewZzyxJECZ5RL/cAODO9eRxHYiIIGDuveggHbNBOZOuABhE Yi5V2LbKzR3olQ6qMj4t9MadhMAhCBY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-163-1KgoXbKiN3iIM0qk6VNGyA-1; Mon, 07 Feb 2022 10:33:05 -0500 X-MC-Unique: 1KgoXbKiN3iIM0qk6VNGyA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No 
client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6F3CC85EE62; Mon, 7 Feb 2022 15:33:02 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id B773A84D18; Mon, 7 Feb 2022 15:32:54 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 23/30] KVM: x86: nSVM: implement nested LBR virtualization Date: Mon, 7 Feb 2022 17:28:40 +0200 Message-Id: <20220207152847.836777-24-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This was tested with kvm-unit-test that was developed for this purpose. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 21 +++++++++++++++++++-- arch/x86/kvm/svm/svm.c | 8 ++++++++ arch/x86/kvm/svm/svm.h | 1 + 3 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 9f7bc7db08dd3..4a228a76b27d7 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -536,8 +536,19 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 vmcb_mark_dirty(svm->vmcb, VMCB_DR); } - if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) + if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + + /* Copy LBR related registers from vmcb12, + * but make sure that we only pick LBR enable bit from the guest. 
+ */ + + svm_copy_lbrs(vmcb12, svm->vmcb); + svm->vmcb->save.dbgctl &= LBR_CTL_ENABLE_MASK; + svm_update_lbrv(&svm->vcpu); + + } else if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) { svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); + } } static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) @@ -592,6 +603,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.virt_ext = svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK; + if (svm->lbrv_enabled) + svm->vmcb->control.virt_ext |= + (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK); nested_svm_transition_tlb_flush(vcpu); @@ -858,7 +872,10 @@ int nested_svm_vmexit(struct vcpu_svm *svm) svm_switch_vmcb(svm, &svm->vmcb01); - if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { + if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + svm_copy_lbrs(svm->nested.vmcb02.ptr, vmcb12); + svm_update_lbrv(vcpu); + } else if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { svm_copy_lbrs(svm->nested.vmcb02.ptr, svm->vmcb); svm_update_lbrv(vcpu); } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 294e016f575a8..76aa6054d9db2 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -890,6 +890,10 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu) bool current_enable_lbrv = !!(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK); + if (unlikely(is_guest_mode(vcpu) && svm->lbrv_enabled)) + if (unlikely(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK)) + enable_lbrv = true; + if (enable_lbrv == current_enable_lbrv) return; @@ -3987,6 +3991,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) guest_cpuid_has(vcpu, X86_FEATURE_NRIPS); svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR); + svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV); svm_recalc_instruction_intercepts(vcpu, svm); @@ -4791,6 +4796,9 @@ static __init void svm_set_cpu_caps(void) if (tsc_scaling) kvm_cpu_cap_set(X86_FEATURE_TSCRATEMSR); + if (lbrv) + kvm_cpu_cap_set(X86_FEATURE_LBRV); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index b83e06d5d942a..0012ba5affcba 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -220,6 +220,7 @@ struct vcpu_svm { /* cached guest cpuid flags for faster access */ bool nrips_enabled : 1; bool tsc_scaling_enabled : 1; + bool lbrv_enabled : 1; u32 ldr_reg; u32 dfr_reg; From patchwork Mon Feb 7 15:28:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737477 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 931BBC433F5 for ; Mon, 7 Feb 2022 15:33:20 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9EE80112456; Mon, 7 Feb 2022 15:33:19 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9721411245B for ; Mon, 7 Feb 2022 15:33:17 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644247996; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oU3s7CnCjp2O1VaC3lW5tk3dKh1EHix/EeWBDaZUEPI=; b=aUdNRnmHHWlssHuxT0FdQdTkZFafbWdSvl+PboRpqrXAmadZMcfVTbxGxoh+lo4dLc5fE8 ss/vIctoXr/wa7QIFRu/E7a5YceY0pdGhWkjeMj3BPnUxxXVIGzbU/KA8LIlVTy756Zwv6 bKSuoCySOdYpS64TU/w2SlxJEiaSsR4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-274-1jisPzjKOBq4fw-RaZabSA-1; Mon, 07 Feb 2022 10:33:13 -0500 X-MC-Unique: 1jisPzjKOBq4fw-RaZabSA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8226985EE6D; Mon, 7 Feb 2022 15:33:10 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id DDA055E495; Mon, 7 Feb 2022 15:33:02 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 24/30] KVM: x86: nSVM: implement nested VMLOAD/VMSAVE Date: Mon, 7 Feb 2022 17:28:41 +0200 Message-Id: <20220207152847.836777-25-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This was tested by booting L1,L2,L3 (all Linux) and checking that no VMLOAD/VMSAVE vmexits happened. 
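As context for the nested_vmcb_needs_vls_intercept() predicate in the diff that follows, a summary (my reading of the patch plus the APM, so treat it as a sketch) of when L0 can leave VMLOAD/VMSAVE unintercepted while L2 runs:

/*
 * All three must hold for L0 to leave VMLOAD/VMSAVE unintercepted
 * while L2 runs; otherwise the intercept stays set and KVM emulates
 * the instruction:
 *
 *  1. Virtual VMLOAD/VMSAVE is enabled for this vCPU, i.e. the
 *     feature is exposed to L1 (svm->v_vmload_vmsave_enabled);
 *  2. L1 runs L2 with nested paging (nested_npt_enabled()); the
 *     hardware feature is only architected with NP enabled, since
 *     the VMCB address is then a guest physical address;
 *  3. L1 itself set VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK in vmcb12.
 */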
Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 35 +++++++++++++++++++++++++++++------ arch/x86/kvm/svm/svm.c | 7 +++++++ arch/x86/kvm/svm/svm.h | 8 +++++++- 3 files changed, 43 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 4a228a76b27d7..bdcb23c76e89e 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -120,6 +120,20 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu) vcpu->arch.walk_mmu = &vcpu->arch.root_mmu; } +static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm) +{ + if (!svm->v_vmload_vmsave_enabled) + return true; + + if (!nested_npt_enabled(svm)) + return true; + + if (!(svm->nested.ctl.virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)) + return true; + + return false; +} + void recalc_intercepts(struct vcpu_svm *svm) { struct vmcb_control_area *c, *h; @@ -161,8 +175,17 @@ void recalc_intercepts(struct vcpu_svm *svm) if (!intercept_smi) vmcb_clr_intercept(c, INTERCEPT_SMI); - vmcb_set_intercept(c, INTERCEPT_VMLOAD); - vmcb_set_intercept(c, INTERCEPT_VMSAVE); + if (nested_vmcb_needs_vls_intercept(svm)) { + /* + * If the virtual VMLOAD/VMSAVE is not enabled for the L2, + * we must intercept these instructions to correctly + * emulate them in case L1 doesn't intercept them. + */ + vmcb_set_intercept(c, INTERCEPT_VMLOAD); + vmcb_set_intercept(c, INTERCEPT_VMSAVE); + } else { + WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)); + } } static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm) @@ -426,10 +449,7 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm, vmcb12->control.exit_int_info = exit_int_info; } -static inline bool nested_npt_enabled(struct vcpu_svm *svm) -{ - return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; -} + static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu) { @@ -607,6 +627,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.virt_ext |= (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK); + if (!nested_vmcb_needs_vls_intercept(svm)) + svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + nested_svm_transition_tlb_flush(vcpu); /* Enter Guest-Mode */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 76aa6054d9db2..0f068da098d9f 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1051,6 +1051,8 @@ static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_EIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_ESP, 0, 0); + + svm->v_vmload_vmsave_enabled = false; } else { /* * If hardware supports Virtual VMLOAD VMSAVE then enable it @@ -3993,6 +3995,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR); svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV); + svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD); + svm_recalc_instruction_intercepts(vcpu, svm); /* For sev guests, the memory encryption bit is not reserved in CR3. 
*/ @@ -4799,6 +4803,9 @@ static __init void svm_set_cpu_caps(void) if (lbrv) kvm_cpu_cap_set(X86_FEATURE_LBRV); + if (vls) + kvm_cpu_cap_set(X86_FEATURE_V_VMSAVE_VMLOAD); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 0012ba5affcba..e8ffd458a5575 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -217,10 +217,11 @@ struct vcpu_svm { unsigned int3_injected; unsigned long int3_rip; - /* cached guest cpuid flags for faster access */ + /* optional nested SVM features that are enabled for this guest */ bool nrips_enabled : 1; bool tsc_scaling_enabled : 1; bool lbrv_enabled : 1; + bool v_vmload_vmsave_enabled : 1; u32 ldr_reg; u32 dfr_reg; @@ -468,6 +469,11 @@ static inline bool gif_set(struct vcpu_svm *svm) return !!(svm->vcpu.arch.hflags & HF_GIF_MASK); } +static inline bool nested_npt_enabled(struct vcpu_svm *svm) +{ + return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; +} + /* svm.c */ #define MSR_INVALID 0xffffffffU From patchwork Mon Feb 7 15:28:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737478 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 56C75C433FE for ; Mon, 7 Feb 2022 15:33:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8099311245D; Mon, 7 Feb 2022 15:33:28 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id E1CF5112471 for ; Mon, 7 Feb 2022 15:33:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644248006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uQ1uiq30ca/wtmdEmBkE1pSZd4AYgwDXu3pWHPnS4Sg=; b=IQuQ7LxSUDpnTu8Zq7n8xah6HlQia29WSjPAB+UGkP6dbeYlzmgYcjtTcI+zwMyXn0nnSk IL438QvNMjIJ3SSqUDhgjJUohvEPecqaevpOzHPU/x6RokdaGCTQksH1QZN3cFgXLOsyiF S4OxG1azhLD0dmQ41M+kKG1KRQmoC9I= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-451-pbAlwA32MgSSiL-D6nM2NA-1; Mon, 07 Feb 2022 10:33:22 -0500 X-MC-Unique: pbAlwA32MgSSiL-D6nM2NA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 491011091DA0; Mon, 7 Feb 2022 15:33:19 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id EFF6E5E495; Mon, 7 Feb 2022 15:33:10 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 25/30] KVM: x86: nSVM: support PAUSE filter threshold and count when cpu_pm=on Date: Mon, 7 Feb 2022 17:28:42 +0200 Message-Id: 
<20220207152847.836777-26-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Allow L1 to use these settings if L0 disables PAUSE interception (AKA cpu_pm=on) Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 6 ++++++ arch/x86/kvm/svm/svm.c | 17 +++++++++++++++++ arch/x86/kvm/svm/svm.h | 2 ++ 3 files changed, 25 insertions(+) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index bdcb23c76e89e..601d38ae05cc6 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -630,6 +630,12 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) if (!nested_vmcb_needs_vls_intercept(svm)) svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + if (svm->pause_filter_enabled) + svm->vmcb->control.pause_filter_count = svm->nested.ctl.pause_filter_count; + + if (svm->pause_threshold_enabled) + svm->vmcb->control.pause_filter_thresh = svm->nested.ctl.pause_filter_thresh; + nested_svm_transition_tlb_flush(vcpu); /* Enter Guest-Mode */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 0f068da098d9f..e49043807ec44 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3997,6 +3997,17 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD); + if (kvm_pause_in_guest(vcpu->kvm)) { + svm->pause_filter_enabled = pause_filter_count > 0 && + guest_cpuid_has(vcpu, X86_FEATURE_PAUSEFILTER); + + svm->pause_threshold_enabled = pause_filter_thresh > 0 && + guest_cpuid_has(vcpu, X86_FEATURE_PFTHRESHOLD); + } else { + svm->pause_filter_enabled = false; + svm->pause_threshold_enabled = false; + } + svm_recalc_instruction_intercepts(vcpu, svm); /* For sev guests, the memory encryption bit is not reserved in CR3. 
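For reference, a paraphrase of the hardware PAUSE-filter semantics being passed through here (approximate; the APM is authoritative):

/*
 * pause_filter_count: a spin-loop detector; PAUSE only causes a
 *   #VMEXIT once this many PAUSEs occur in quick succession.
 * pause_filter_thresh: if more than this many cycles elapse between
 *   two PAUSEs, the internal counter is re-armed to the count value.
 *
 * With cpu_pm=on, L0 never intercepts PAUSE, so L1's vmcb12 values
 * can be programmed directly into the running VMCB for L2, as the
 * nested.c hunk above does.
 */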
*/ @@ -4806,6 +4817,12 @@ static __init void svm_set_cpu_caps(void) if (vls) kvm_cpu_cap_set(X86_FEATURE_V_VMSAVE_VMLOAD); + if (pause_filter_count) + kvm_cpu_cap_set(X86_FEATURE_PAUSEFILTER); + + if (pause_filter_thresh) + kvm_cpu_cap_set(X86_FEATURE_PFTHRESHOLD); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index e8ffd458a5575..297ec57f9941c 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -222,6 +222,8 @@ struct vcpu_svm { bool tsc_scaling_enabled : 1; bool lbrv_enabled : 1; bool v_vmload_vmsave_enabled : 1; + bool pause_filter_enabled : 1; + bool pause_threshold_enabled : 1; u32 ldr_reg; u32 dfr_reg; From patchwork Mon Feb 7 15:28:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737479 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CDACFC433F5 for ; Mon, 7 Feb 2022 15:33:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D014E112467; Mon, 7 Feb 2022 15:33:44 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4CBCF112464 for ; Mon, 7 Feb 2022 15:33:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644248022; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0YZKFPfPAlzbRhVzmbaqcoTuZ3BdIzpFOausE6A/ZGI=; b=SfSt4Sm3UYkFFFQtFWFcvTvyOUxWk4fDL2tNoe+zFPiK3mm0ZjA50OH44O97rCVNxkEHHp /l/9RtxTNkk1nwqK4/wx1hQ2SrXwrT5S0pi4ZHCMKdxXUeVsqRzTCJUBfi3jxTgFeA8T1w I0VX1s+tYhJw8I1+uv2Vbiaz+8QJlPo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-301-z8blMLWqNNKtRwtuJROlbw-1; Mon, 07 Feb 2022 10:33:41 -0500 X-MC-Unique: z8blMLWqNNKtRwtuJROlbw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 190331006AA9; Mon, 7 Feb 2022 15:33:38 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id B75755E495; Mon, 7 Feb 2022 15:33:19 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 26/30] KVM: x86: nSVM: implement nested vGIF Date: Mon, 7 Feb 2022 17:28:43 +0200 Message-Id: <20220207152847.836777-27-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" In case L1 enables vGIF for L2, the L2 cannot affect L1's GIF, regardless of STGI/CLGI intercepts, and since VM entry enables GIF, this means that L1's GIF is always 1 while L2 is running. Thus in this case leave L1's vGIF in vmcb01, while letting L2 control the vGIF thus implementing nested vGIF. Also allow KVM to toggle L1's GIF during nested entry/exit by always using vmcb01. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 17 +++++++++++++---- arch/x86/kvm/svm/svm.c | 5 +++++ arch/x86/kvm/svm/svm.h | 25 +++++++++++++++++++++---- 3 files changed, 39 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 601d38ae05cc6..a426d4d3dcd82 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -408,6 +408,10 @@ void nested_sync_control_from_vmcb02(struct vcpu_svm *svm) */ mask &= ~V_IRQ_MASK; } + + if (nested_vgif_enabled(svm)) + mask |= V_GIF_MASK; + svm->nested.ctl.int_ctl &= ~mask; svm->nested.ctl.int_ctl |= svm->vmcb->control.int_ctl & mask; } @@ -573,10 +577,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) { - const u32 int_ctl_vmcb01_bits = - V_INTR_MASKING_MASK | V_GIF_MASK | V_GIF_ENABLE_MASK; - - const u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK; + u32 int_ctl_vmcb01_bits = V_INTR_MASKING_MASK; + u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK; struct kvm_vcpu *vcpu = &svm->vcpu; @@ -586,6 +588,13 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) */ + + + if (svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK)) + int_ctl_vmcb12_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK); + else + int_ctl_vmcb01_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK); + /* Copied from vmcb01. msrpm_base can be overwritten later. */ svm->vmcb->control.nested_ctl = svm->vmcb01.ptr->control.nested_ctl; svm->vmcb->control.iopm_base_pa = svm->vmcb01.ptr->control.iopm_base_pa; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e49043807ec44..1cf682d1553cc 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4008,6 +4008,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->pause_threshold_enabled = false; } + svm->vgif_enabled = vgif && guest_cpuid_has(vcpu, X86_FEATURE_VGIF); + svm_recalc_instruction_intercepts(vcpu, svm); /* For sev guests, the memory encryption bit is not reserved in CR3. 
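To summarize where GIF is tracked after this change: svm_gif_vmcb() below is a hypothetical helper; the real patch open-codes the same selection in vgif_enabled(), enable_gif(), disable_gif() and gif_set() in the svm.h hunk that follows:

/*
 * With nested vGIF in use (L1 enabled V_GIF_ENABLE for L2):
 *   vmcb02 int_ctl V_GIF - L2's GIF, controlled by the guest;
 *   vmcb01 int_ctl V_GIF - L1's GIF, toggled by KVM on entry/exit.
 * Otherwise the current VMCB's V_GIF is used, or hflags HF_GIF_MASK
 * when vGIF is unavailable.
 */
static inline struct vmcb *svm_gif_vmcb(struct vcpu_svm *svm)
{
	return nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb;
}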
*/ @@ -4823,6 +4825,9 @@ static __init void svm_set_cpu_caps(void) if (pause_filter_thresh) kvm_cpu_cap_set(X86_FEATURE_PFTHRESHOLD); + if (vgif) + kvm_cpu_cap_set(X86_FEATURE_VGIF); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 297ec57f9941c..73cc9d3e784bd 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -224,6 +224,7 @@ struct vcpu_svm { bool v_vmload_vmsave_enabled : 1; bool pause_filter_enabled : 1; bool pause_threshold_enabled : 1; + bool vgif_enabled : 1; u32 ldr_reg; u32 dfr_reg; @@ -442,31 +443,47 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit) return vmcb_is_intercept(&svm->vmcb->control, bit); } +static bool nested_vgif_enabled(struct vcpu_svm *svm) +{ + if (!is_guest_mode(&svm->vcpu) || !svm->vgif_enabled) + return false; + return svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK; +} + static inline bool vgif_enabled(struct vcpu_svm *svm) { - return !!(svm->vmcb->control.int_ctl & V_GIF_ENABLE_MASK); + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + + return !!(vmcb->control.int_ctl & V_GIF_ENABLE_MASK); } static inline void enable_gif(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - svm->vmcb->control.int_ctl |= V_GIF_MASK; + vmcb->control.int_ctl |= V_GIF_MASK; else svm->vcpu.arch.hflags |= HF_GIF_MASK; } static inline void disable_gif(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - svm->vmcb->control.int_ctl &= ~V_GIF_MASK; + vmcb->control.int_ctl &= ~V_GIF_MASK; else svm->vcpu.arch.hflags &= ~HF_GIF_MASK; + } static inline bool gif_set(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? 
svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - return !!(svm->vmcb->control.int_ctl & V_GIF_MASK); + return !!(vmcb->control.int_ctl & V_GIF_MASK); else return !!(svm->vcpu.arch.hflags & HF_GIF_MASK); } From patchwork Mon Feb 7 15:28:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737480 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 24498C433FE for ; Mon, 7 Feb 2022 15:34:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 08D7010E450; Mon, 7 Feb 2022 15:34:20 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id CDB65112464 for ; Mon, 7 Feb 2022 15:34:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644248057; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0Xxoic2Sn/qFA/mhRCvHnS08xjpesyCCqOxwVNKsehI=; b=U3zMK3SQfj2XiOlmHaGPD7y3qcOBC0nDpA3Ix0iTaSgEvPr3SBdPNG7UQoqwvAxQSrj90n KoNFEVqxAhDi77XTJOzRXPGgFxOwoE2tdaF0l2kKeHXHR9PTox+dn7cMkuGAuqYfNVDItX e/xlVZeJLrFYXaw0j53sqFgWzqBqJSM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-648-liy7FMWTPFmy0kaoiLc8fg-1; Mon, 07 Feb 2022 10:34:14 -0500 X-MC-Unique: liy7FMWTPFmy0kaoiLc8fg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C90F3363A4; Mon, 7 Feb 2022 15:34:11 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8809184FF5; Mon, 7 Feb 2022 15:33:38 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 27/30] KVM: x86: add force_intercept_exceptions_mask Date: Mon, 7 Feb 2022 17:28:44 +0200 Message-Id: <20220207152847.836777-28-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. 
Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner , Borislav Petkov " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This parameter will be used by VMX and SVM code to force interception of a set of exceptions, given by a bitmask for guest debug and/or kvm debug. This is based on an idea first shown here: https://patchwork.kernel.org/project/kvm/patch/20160301192822.GD22677@pd.tnic/ CC: Borislav Petkov Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 7 +++++++ arch/x86/kvm/x86.c | 9 +++++++++ arch/x86/kvm/x86.h | 5 +++++ 3 files changed, 21 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 428ab1cc7dd34..fa498612839a0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1168,6 +1168,13 @@ struct kvm_arch { struct kvm_pmu_event_filter __rcu *pmu_event_filter; struct task_struct *nx_lpage_recovery_thread; + /* + * Bitmask of exceptions that KVM will intercept + * and forward to the guest, even if that is not needed + * for normal operation. Debug feature. + */ + u32 force_intercept_exceptions_bitmask; + #ifdef CONFIG_X86_64 /* * Whether the TDP MMU is enabled for this VM. This contains a diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 63d84c373e465..202c34697852f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -193,6 +193,13 @@ module_param(enable_pmu, bool, 0444); bool __read_mostly eager_page_split = true; module_param(eager_page_split, bool, 0644); +/* + * force_intercept_exceptions_mask is a writable param and its value + * is snapshotted when a VM is created + */ +static uint force_intercept_exceptions_mask; +module_param(force_intercept_exceptions_mask, uint, S_IRUGO | S_IWUSR); + /* * Restoring the host value for MSRs that are only consumed when running in * usermode, e.g. 
SYSCALL MSRs and TSC_AUX, can be deferred until the CPU @@ -11646,6 +11653,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags); kvm->arch.guest_can_read_msr_platform_info = true; + kvm->arch.force_intercept_exceptions_bitmask = force_intercept_exceptions_mask; #if IS_ENABLED(CONFIG_HYPERV) spin_lock_init(&kvm->arch.hv_root_tdp_lock); @@ -12886,6 +12894,7 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size, } EXPORT_SYMBOL_GPL(kvm_sev_es_string_io); + EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index e9b303b21f173..34f96f483c7e5 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -91,6 +91,11 @@ static inline bool kvm_exception_is_soft(unsigned int nr) return (nr == BP_VECTOR) || (nr == OF_VECTOR); } +static inline bool kvm_is_exception_force_intercepted(struct kvm *kvm, int exception) +{ + return kvm->arch.force_intercept_exceptions_bitmask & BIT(exception); +} + static inline bool is_protmode(struct kvm_vcpu *vcpu) { return kvm_read_cr0_bits(vcpu, X86_CR0_PE); From patchwork Mon Feb 7 15:28:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737532 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7A9EDC433F5 for ; Mon, 7 Feb 2022 15:34:29 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 09525112469; Mon, 7 Feb 2022 15:34:28 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5B362112465 for ; Mon, 7 Feb 2022 15:34:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644248066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iV+sT7eBaIHlnSrQTGcmgC2l+6R2K6ll4jPmbZAHXmI=; b=V1hHtNrnux+ZZqwXJMmjXlUkXv7/v+PQTuvxbL9f4EOpZMJK2M+o/l94SLoE57wRzd3dCg BvvoLMydOt2Np+Wu2nWqgzfb9GwWcsFzxQzs+i3UjNygbQ8YJU++bFHN+vneBQRqXY7eEz NUtV6qMuXFt9Ak6YKztsChsVznqhl1w= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-118-mZGXc3z_N4u2XQtm2vkTmA-1; Mon, 07 Feb 2022 10:34:23 -0500 X-MC-Unique: mZGXc3z_N4u2XQtm2vkTmA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4FF91363A6; Mon, 7 Feb 2022 15:34:20 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4221B5E495; Mon, 7 Feb 2022 15:34:12 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 28/30] KVM: SVM: implement 
force_intercept_exceptions_mask Date: Mon, 7 Feb 2022 17:28:45 +0200 Message-Id: <20220207152847.836777-29-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Note that #TS interception is only done once: SVM doesn't provide the error code needed to re-inject #TS, so after the first intercepted #TS the intercept is disabled and the guest re-executes the instruction (see gen_exc_interception() below). Also, exception interception is not enabled for SEV guests. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 2 + arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/svm/svm.c | 92 ++++++++++++++++++++++++++++++++- arch/x86/kvm/svm/svm.h | 5 +- arch/x86/kvm/svm/svm_onhyperv.c | 1 + arch/x86/kvm/x86.c | 5 +- 6 files changed, 101 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index fa498612839a0..446ee29e6cc99 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1750,6 +1750,8 @@ int kvm_emulate_rdpmc(struct kvm_vcpu *vcpu); void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr); void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code); void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long payload); +void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr, + u32 error_code, unsigned long payload); void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr); void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code); void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault); diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index bf6e96011dfed..d462b4808e893 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -32,6 +32,7 @@ #define MC_VECTOR 18 #define XM_VECTOR 19 #define VE_VECTOR 20 +#define CP_VECTOR 21 /* Select x86 specific features in */ #define __KVM_HAVE_PIT diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 1cf682d1553cc..afa4116ea938c 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -245,6 +245,8 @@ static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000}; #define MSRS_RANGE_SIZE 2048 #define MSRS_IN_RANGE (MSRS_RANGE_SIZE * 8 / 2) +static int svm_handle_invalid_exit(struct kvm_vcpu *vcpu, u64 exit_code); + u32 svm_msrpm_offset(u32 msr) { u32 offset; @@ -1035,6 +1037,16 @@ static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu, } } +static void svm_init_force_exceptions_intercepts(struct kvm_vcpu *vcpu) +{ + int exc; + struct vcpu_svm *svm = to_svm(vcpu); + + for (exc = 0; exc < 32; ++exc) + if
(kvm_is_exception_force_intercepted(vcpu->kvm, exc)) + set_exception_intercept(svm, exc); +} + static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -1235,6 +1247,8 @@ static void __svm_vcpu_reset(struct kvm_vcpu *vcpu) if (sev_es_guest(vcpu->kvm)) sev_es_vcpu_reset(svm); + else + svm_init_force_exceptions_intercepts(vcpu); } static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) @@ -1865,6 +1879,19 @@ static int pf_interception(struct kvm_vcpu *vcpu) u64 fault_address = svm->vmcb->control.exit_info_2; u64 error_code = svm->vmcb->control.exit_info_1; + + if (kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR)) { + if (npt_enabled && !vcpu->arch.apf.host_apf_flags) { + /* If the #PF was only intercepted for debugging, inject + it directly to the guest, since KVM's MMU code + is not ready to deal with such page faults. + */ + kvm_queue_exception_e_p(vcpu, PF_VECTOR, + error_code, fault_address); + return 1; + } + } + return kvm_handle_page_fault(vcpu, error_code, fault_address, static_cpu_has(X86_FEATURE_DECODEASSISTS) ? svm->vmcb->control.insn_bytes : NULL, @@ -1940,6 +1967,46 @@ static int ac_interception(struct kvm_vcpu *vcpu) return 1; } +static int gen_exc_interception(struct kvm_vcpu *vcpu) +{ + /* + * Generic exception intercept handler, which forwards a guest + * exception as-is to the guest, for exceptions that don't have a + * dedicated intercept handler. + * + * Used only for the 'force_intercept_exceptions_mask' KVM debug feature. + */ + struct vcpu_svm *svm = to_svm(vcpu); + int exc = svm->vmcb->control.exit_code - SVM_EXIT_EXCP_BASE; + + if (!kvm_is_exception_force_intercepted(vcpu->kvm, exc)) + return svm_handle_invalid_exit(vcpu, svm->vmcb->control.exit_code); + + if (x86_exception_has_error_code(exc)) { + + if (exc == TS_VECTOR) { + /* + * SVM doesn't provide us with an error code to be able to + * re-inject the #TS exception, so just disable its + * intercept, and let the guest re-execute the instruction. + */ + vmcb_clr_intercept(&svm->vmcb01.ptr->control, + INTERCEPT_EXCEPTION_OFFSET + TS_VECTOR); + recalc_intercepts(svm); + return 1; + } else if (exc == DF_VECTOR) { + /* + * SVM doesn't provide the error code on #DF either, + * but it is always 0. + */ + svm->vmcb->control.exit_info_1 = 0; + } + kvm_queue_exception_e(vcpu, exc, svm->vmcb->control.exit_info_1); + } else + kvm_queue_exception(vcpu, exc); + return 1; +} + static bool is_erratum_383(void) { int err, i; @@ -3050,13 +3117,34 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = { [SVM_EXIT_WRITE_DR5] = dr_interception, [SVM_EXIT_WRITE_DR6] = dr_interception, [SVM_EXIT_WRITE_DR7] = dr_interception, + + /* 0 */ + [SVM_EXIT_EXCP_BASE + DE_VECTOR] = gen_exc_interception, [SVM_EXIT_EXCP_BASE + DB_VECTOR] = db_interception, + /* 2 - NMI */ [SVM_EXIT_EXCP_BASE + BP_VECTOR] = bp_interception, + [SVM_EXIT_EXCP_BASE + OF_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + BR_VECTOR] = gen_exc_interception, [SVM_EXIT_EXCP_BASE + UD_VECTOR] = ud_interception, + [SVM_EXIT_EXCP_BASE + NM_VECTOR] = gen_exc_interception, + /* 8 */ + [SVM_EXIT_EXCP_BASE + DF_VECTOR] = gen_exc_interception, + /* 9 is reserved */ + [SVM_EXIT_EXCP_BASE + TS_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + NP_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + SS_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + GP_VECTOR] = gp_interception, [SVM_EXIT_EXCP_BASE + PF_VECTOR] = pf_interception, - [SVM_EXIT_EXCP_BASE + MC_VECTOR] = mc_interception, + /* 15 is reserved */ + /* 16 */ + [SVM_EXIT_EXCP_BASE + MF_VECTOR] = gen_exc_interception, [SVM_EXIT_EXCP_BASE + AC_VECTOR] = ac_interception, + [SVM_EXIT_EXCP_BASE + MC_VECTOR] = mc_interception, + [SVM_EXIT_EXCP_BASE + XM_VECTOR] = gen_exc_interception, + /* 20 - #VE - reserved on AMD */ + [SVM_EXIT_EXCP_BASE + CP_VECTOR] = gen_exc_interception, + /* TODO: exceptions 22-31 */ + [SVM_EXIT_INTR] = intr_interception, [SVM_EXIT_NMI] = nmi_interception, [SVM_EXIT_SMI] = smi_interception, diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 73cc9d3e784bd..dd3671d77258b 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -415,8 +415,11 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit) struct vmcb *vmcb = svm->vmcb01.ptr; WARN_ON_ONCE(bit >= 32); - vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit); + if (kvm_is_exception_force_intercepted(svm->vcpu.kvm, bit)) + return; + + vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit); recalc_intercepts(svm); } diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c index 98aa981c04ec5..81be254e757b2 100644 --- a/arch/x86/kvm/svm/svm_onhyperv.c +++ b/arch/x86/kvm/svm/svm_onhyperv.c @@ -8,6 +8,7 @@ #include +#include "x86.h" #include "svm.h" #include "svm_ops.h" diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 202c34697852f..0ee2fbb068b17 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -706,12 +706,13 @@ void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, } EXPORT_SYMBOL_GPL(kvm_queue_exception_p); -static void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr, - u32 error_code, unsigned long payload) +void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr, + u32 error_code, unsigned long payload) { kvm_multiple_exception(vcpu, nr, true, error_code, true, payload, false); } +EXPORT_SYMBOL_GPL(kvm_queue_exception_e_p); int
kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err) { From patchwork Mon Feb 7 15:28:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5D0DEC433F5 for ; Mon, 7 Feb 2022 15:34:39 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4FE88112464; Mon, 7 Feb 2022 15:34:37 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id BBE77112464 for ; Mon, 7 Feb 2022 15:34:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644248074; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PdxSxSRMwi46phQmQDHMqytXBvrQerh2Lq0bkWEDmYc=; b=fNX7exg+iaPecSN6WsW8tMdGQ0+xoLvC26FOkL5Q/+DeCchywvnCtjcjWAfz6WJa7iG/Zw B9kW+ld1XpL5tYLqoXu/3qLXEfebaupnsYu8vEOvg72FvCotToG6zLSqS6mu0CUItPdBoi lIvGEJhLIpXXJ+AH9xJDLk40TjcZKgw= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-98--O5g3A7GP8GQvV1SlAQjnQ-1; Mon, 07 Feb 2022 10:34:31 -0500 X-MC-Unique: -O5g3A7GP8GQvV1SlAQjnQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C162286A8A0; Mon, 7 Feb 2022 15:34:28 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id BE5015E495; Mon, 7 Feb 2022 15:34:20 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 29/30] KVM: VMX: implement force_intercept_exceptions_mask Date: Mon, 7 Feb 2022 17:28:46 +0200 Message-Id: <20220207152847.836777-30-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. 
Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" All exceptions are supported. Some bugs might remain in regard to KVM own interception of #PF but since this is strictly debug feature this should be OK. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 8 +++++++ arch/x86/kvm/vmx/vmcs.h | 6 +++++ arch/x86/kvm/vmx/vmx.c | 47 +++++++++++++++++++++++++++++++++------ 3 files changed, 54 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index c73e4d938ddc3..e89b32b1d9efb 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -5902,6 +5902,14 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu, switch ((u16)exit_reason.basic) { case EXIT_REASON_EXCEPTION_NMI: intr_info = vmx_get_intr_info(vcpu); + + if (is_exception(intr_info)) { + int ex_no = intr_info & INTR_INFO_VECTOR_MASK; + + if (kvm_is_exception_force_intercepted(vcpu->kvm, ex_no)) + return true; + } + if (is_nmi(intr_info)) return true; else if (is_page_fault(intr_info)) diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h index e325c290a8162..d5aac5abe5cdd 100644 --- a/arch/x86/kvm/vmx/vmcs.h +++ b/arch/x86/kvm/vmx/vmcs.h @@ -94,6 +94,12 @@ static inline bool is_exception_n(u32 intr_info, u8 vector) return is_intr_type_n(intr_info, INTR_TYPE_HARD_EXCEPTION, vector); } +static inline bool is_exception(u32 intr_info) +{ + return is_intr_type(intr_info, INTR_TYPE_HARD_EXCEPTION) || + is_intr_type(intr_info, INTR_TYPE_SOFT_EXCEPTION); +} + static inline bool is_debug(u32 intr_info) { return is_exception_n(intr_info, DB_VECTOR); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index fc9c4eca90a78..aec2b962707a0 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -719,6 +719,7 @@ static u32 vmx_read_guest_seg_ar(struct vcpu_vmx *vmx, unsigned seg) void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) { u32 eb; + int exc; eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) | (1u << DB_VECTOR) | (1u << AC_VECTOR); @@ -749,7 +750,8 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) else { int mask = 0, match = 0; - if (enable_ept && (eb & (1u << PF_VECTOR))) { + if (enable_ept && (eb & (1u << PF_VECTOR)) && + !kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR)) { /* * If EPT is enabled, #PF is currently only intercepted * if MAXPHYADDR is smaller on the guest than on the @@ -772,6 +774,10 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) if (vcpu->arch.xfd_no_write_intercept) eb |= (1u << NM_VECTOR); + for (exc = 0 ; exc < 32 ; ++exc) + if (kvm_is_exception_force_intercepted(vcpu->kvm, exc) && exc != NMI_VECTOR) + eb |= (1u << exc); + vmcs_write32(EXCEPTION_BITMAP, eb); } @@ -4867,18 +4873,23 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE); if (!vmx->rmode.vm86_active && is_gp_fault(intr_info)) { - WARN_ON_ONCE(!enable_vmware_backdoor); - /* * VMware backdoor emulation on #GP interception only handles * IN{S}, OUT{S}, and RDPMC, none of which generate a non-zero * error code on #GP. 
*/ - if (error_code) { + + if (enable_vmware_backdoor && !error_code) + return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP); + + if (!kvm_is_exception_force_intercepted(vcpu->kvm, GP_VECTOR)) + WARN_ON_ONCE(!enable_vmware_backdoor); + + if (intr_info & INTR_INFO_DELIVER_CODE_MASK) kvm_queue_exception_e(vcpu, GP_VECTOR, error_code); - return 1; - } - return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP); + else + kvm_queue_exception(vcpu, GP_VECTOR); + return 1; } /* @@ -4887,6 +4898,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) * See the comments in vmx_handle_exit. */ if ((vect_info & VECTORING_INFO_VALID_MASK) && + !kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR) && !(is_page_fault(intr_info) && !(error_code & PFERR_RSVD_MASK))) { vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_SIMUL_EX; @@ -4901,10 +4913,23 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) if (is_page_fault(intr_info)) { cr2 = vmx_get_exit_qual(vcpu); if (enable_ept && !vcpu->arch.apf.host_apf_flags) { + /* + * If we force-intercept #PF and the page fault + * is due to a reason we don't intercept, + * reflect it to the guest. + */ + if (kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR) && + (!allow_smaller_maxphyaddr || + !(error_code & PFERR_PRESENT_MASK) || + (error_code & PFERR_RSVD_MASK))) { + kvm_queue_exception_e_p(vcpu, PF_VECTOR, error_code, cr2); + return 1; + } /* * EPT will cause page fault only if we need to * detect illegal GPAs. */ + WARN_ON_ONCE(!allow_smaller_maxphyaddr); kvm_fixup_and_inject_pf_error(vcpu, cr2, error_code); return 1; @@ -4983,6 +5008,14 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) return 1; fallthrough; default: + if (kvm_is_exception_force_intercepted(vcpu->kvm, ex_no)) { + if (intr_info & INTR_INFO_DELIVER_CODE_MASK) + kvm_queue_exception_e(vcpu, ex_no, error_code); + else + kvm_queue_exception(vcpu, ex_no); + break; + } + kvm_run->exit_reason = KVM_EXIT_EXCEPTION; kvm_run->ex.exception = ex_no; kvm_run->ex.error_code = error_code; From patchwork Mon Feb 7 15:28:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737534 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E37ACC433EF for ; Mon, 7 Feb 2022 15:34:46 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 09A6F112472; Mon, 7 Feb 2022 15:34:46 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6C472112465 for ; Mon, 7 Feb 2022 15:34:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644248083; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5WIO5n6+3jqzKnCirxzx86dJnJZyXVHjiDs7Tj6xNnY=; b=DYrFdqE5DqHnWsAGMpYpUuLAD2Xxv0AUvjP50LloLQfxwgkChU5MI7FVSb+wvmPD+ewp6V NYRJJp6FVfhVkDv2HRrpxrUGN/fHHW1wTQSFK43JTnvhyYOgwmfBrzBqBcItRXhbkAdjVp
2v82nyxtKC3VeStcby714gXqGO1S2CI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-232-JpLtitjQOiWSR0ljzuCs7g-1; Mon, 07 Feb 2022 10:34:40 -0500 X-MC-Unique: JpLtitjQOiWSR0ljzuCs7g-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1648B85EE60; Mon, 7 Feb 2022 15:34:37 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 475ED5E495; Mon, 7 Feb 2022 15:34:29 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Subject: [PATCH 30/30] KVM: x86: get rid of KVM_REQ_GET_NESTED_STATE_PAGES Date: Mon, 7 Feb 2022 17:28:47 +0200 Message-Id: <20220207152847.836777-31-mlevitsk@redhat.com> In-Reply-To: <20220207152847.836777-1-mlevitsk@redhat.com> References: <20220207152847.836777-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Dave Hansen , Wanpeng Li , David Airlie , "Chang S. Bae" , "maintainer:X86 ARCHITECTURE 32-BIT AND 64-BIT" , "open list:X86 ARCHITECTURE 32-BIT AND 64-BIT" , Maxim Levitsky , Tony Luck , "open list:DRM DRIVERS" , Brijesh Singh , Rodrigo Vivi , Paolo Bonzini , Vitaly Kuznetsov , Jim Mattson , "open list:INTEL GVT-g DRIVERS Intel GPU Virtualization" , "open list:INTEL DRM DRIVERS excluding Poulsbo, Moorestow..., Joerg Roedel , Borislav Petkov , Daniel Vetter , \"H. Peter Anvin\" , Ingo Molnar , Sean Christopherson , Joonas Lahtinen , Pawan Gupta , Thomas Gleixner " , Zhi Wang , Kan Liang Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" As it turned out, this request isn't really needed, and it complicates nested migration. In theory this patch can break userspace, if userspace relies on updating KVM's memslots after setting nested state, but there is little reason for it to rely on this. Moreover, this behavior is undocumented, and there is a good chance that no userspace relies on it, so just try to remove this code.
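For illustration, here is a minimal sketch (not part of the patch) of the VM restore ordering this change assumes, using the existing KVM ioctls; vm_fd, vcpu_fd, mem, mem_size and the previously saved 'state' blob are placeholders, and error handling plus the usual <linux/kvm.h> and <sys/ioctl.h> includes are omitted:

	/* 1. Recreate the guest memory slots first. */
	struct kvm_userspace_memory_region region = {
		.slot = 0,
		.guest_phys_addr = 0,
		.memory_size = mem_size,
		.userspace_addr = (__u64)mem,
	};
	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);

	/*
	 * 2. Restore nested state. With this patch, KVM may access guest
	 * memory (e.g. the MSR bitmap, or map the eVMCS) right here,
	 * instead of deferring that work to the next KVM_RUN via
	 * KVM_REQ_GET_NESTED_STATE_PAGES.
	 */
	ioctl(vcpu_fd, KVM_SET_NESTED_STATE, state);

	/* 3. Only then resume the guest. */
	ioctl(vcpu_fd, KVM_RUN, 0);

Userspace that creates its memslots only after KVM_SET_NESTED_STATE is the hypothetical case the paragraph above is concerned with.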
Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 5 +- arch/x86/kvm/hyperv.c | 4 ++ arch/x86/kvm/svm/nested.c | 50 ++++------------- arch/x86/kvm/svm/svm.c | 2 +- arch/x86/kvm/svm/svm.h | 2 +- arch/x86/kvm/vmx/nested.c | 99 +++++++++------------------------ arch/x86/kvm/x86.c | 6 -- 7 files changed, 45 insertions(+), 123 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 446ee29e6cc99..fc2d5628ad930 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -92,7 +92,6 @@ #define KVM_REQ_HV_EXIT KVM_ARCH_REQ(21) #define KVM_REQ_HV_STIMER KVM_ARCH_REQ(22) #define KVM_REQ_LOAD_EOI_EXITMAP KVM_ARCH_REQ(23) -#define KVM_REQ_GET_NESTED_STATE_PAGES KVM_ARCH_REQ(24) #define KVM_REQ_APICV_UPDATE \ KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) #define KVM_REQ_TLB_FLUSH_CURRENT KVM_ARCH_REQ(26) @@ -1519,12 +1518,14 @@ struct kvm_x86_nested_ops { int (*set_state)(struct kvm_vcpu *vcpu, struct kvm_nested_state __user *user_kvm_nested_state, struct kvm_nested_state *kvm_state); - bool (*get_nested_state_pages)(struct kvm_vcpu *vcpu); int (*write_log_dirty)(struct kvm_vcpu *vcpu, gpa_t l2_gpa); int (*enable_evmcs)(struct kvm_vcpu *vcpu, uint16_t *vmcs_version); uint16_t (*get_evmcs_version)(struct kvm_vcpu *vcpu); + + bool (*get_evmcs_page)(struct kvm_vcpu *vcpu); + }; struct kvm_x86_init_ops { diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index dac41784f2b87..d297d102c0910 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1497,6 +1497,10 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host) gfn_to_gpa(gfn) | KVM_MSR_ENABLED, sizeof(struct hv_vp_assist_page))) return 1; + + if (host && kvm_x86_ops.nested_ops->get_evmcs_page) + if (!kvm_x86_ops.nested_ops->get_evmcs_page(vcpu)) + return 1; break; } case HV_X64_MSR_EOI: diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index a426d4d3dcd82..ac813ad83d784 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -670,7 +670,7 @@ static void nested_svm_copy_common_state(struct vmcb *from_vmcb, struct vmcb *to } int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, - struct vmcb *vmcb12, bool from_vmrun) + struct vmcb *vmcb12) { struct vcpu_svm *svm = to_svm(vcpu); int ret; @@ -700,15 +700,13 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, nested_vmcb02_prepare_save(svm, vmcb12); ret = nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3, - nested_npt_enabled(svm), from_vmrun); + nested_npt_enabled(svm), true); if (ret) return ret; if (!npt_enabled) vcpu->arch.mmu->inject_page_fault = svm_inject_page_fault_nested; - if (!from_vmrun) - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); svm_set_gif(svm, true); @@ -779,7 +777,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) svm->nested.nested_run_pending = 1; - if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, true)) + if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12)) goto out_exit_err; if (nested_svm_vmrun_msrpm(svm)) @@ -863,8 +861,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm) svm->nested.vmcb12_gpa = 0; WARN_ON_ONCE(svm->nested.nested_run_pending); - kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); - /* in case we halted in L2 */ svm->vcpu.arch.mp_state = KVM_MP_STATE_RUNNABLE; @@ -1069,8 +1065,6 @@ void svm_leave_nested(struct kvm_vcpu *vcpu) nested_svm_uninit_mmu_context(vcpu); vmcb_mark_all_dirty(svm->vmcb); } - - 
kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); } static int nested_svm_exit_handled_msr(struct vcpu_svm *svm) @@ -1562,53 +1556,31 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, */ ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3, - nested_npt_enabled(svm), false); + nested_npt_enabled(svm), !vcpu->arch.pdptrs_from_userspace); if (WARN_ON_ONCE(ret)) goto out_free; - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); - ret = 0; -out_free: - kfree(save); - kfree(ctl); - - return ret; -} - -static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu) -{ - struct vcpu_svm *svm = to_svm(vcpu); - - if (WARN_ON(!is_guest_mode(vcpu))) - return true; - - if (!vcpu->arch.pdptrs_from_userspace && - !nested_npt_enabled(svm) && is_pae_paging(vcpu)) - /* - * Reload the guest's PDPTRs since after a migration - * the guest CR3 might be restored prior to setting the nested - * state which can lead to a load of wrong PDPTRs. - */ - if (CC(!load_pdptrs(vcpu, vcpu->arch.cr3))) - return false; - if (!nested_svm_vmrun_msrpm(svm)) { vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION; vcpu->run->internal.ndata = 0; - return false; + goto out_free; } - return true; + ret = 0; +out_free: + kfree(save); + kfree(ctl); + return ret; } + struct kvm_x86_nested_ops svm_nested_ops = { .leave_nested = svm_leave_nested, .check_events = svm_check_nested_events, .triple_fault = nested_svm_triple_fault, - .get_nested_state_pages = svm_get_nested_state_pages, .get_state = svm_get_nested_state, .set_state = svm_set_nested_state, }; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index afa4116ea938c..6d6421e0cadcd 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4498,7 +4498,7 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) vmcb12 = map.hva; nested_copy_vmcb_control_to_cache(svm, &vmcb12->control); nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); - ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false); + ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12); if (ret) goto unmap_save; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index dd3671d77258b..e2eb91851e922 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -551,7 +551,7 @@ static inline bool nested_exit_on_nmi(struct vcpu_svm *svm) } int enter_svm_guest_mode(struct kvm_vcpu *vcpu, - u64 vmcb_gpa, struct vmcb *vmcb12, bool from_vmrun); + u64 vmcb_gpa, struct vmcb *vmcb12); void svm_leave_nested(struct kvm_vcpu *vcpu); void svm_free_nested(struct vcpu_svm *svm); int svm_allocate_nested(struct vcpu_svm *svm); diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index e89b32b1d9efb..19331f742662d 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -294,8 +294,6 @@ static void free_nested(struct kvm_vcpu *vcpu) if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon) return; - kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); - vmx->nested.vmxon = false; vmx->nested.smm.vmxon = false; vmx->nested.vmxon_ptr = INVALID_GPA; @@ -2593,7 +2591,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, /* Shadow page tables on either EPT or shadow page tables. 
*/ if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12), - from_vmentry, entry_failure_code)) + from_vmentry || !vcpu->arch.pdptrs_from_userspace, + entry_failure_code)) return -EINVAL; /* @@ -3125,7 +3124,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) return 0; } -static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) +bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -3161,18 +3160,6 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu) struct page *page; u64 hpa; - if (!vcpu->arch.pdptrs_from_userspace && - !nested_cpu_has_ept(vmcs12) && is_pae_paging(vcpu)) { - /* - * Reload the guest's PDPTRs since after a migration - * the guest CR3 might be restored prior to setting the nested - * state which can lead to a load of wrong PDPTRs. - */ - if (CC(!load_pdptrs(vcpu, vcpu->arch.cr3))) - return false; - } - - if (nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) { /* * Translate L1 physical address to host physical @@ -3254,25 +3241,6 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu) return true; } -static bool vmx_get_nested_state_pages(struct kvm_vcpu *vcpu) -{ - if (!nested_get_evmcs_page(vcpu)) { - pr_debug_ratelimited("%s: enlightened vmptrld failed\n", - __func__); - vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; - vcpu->run->internal.suberror = - KVM_INTERNAL_ERROR_EMULATION; - vcpu->run->internal.ndata = 0; - - return false; - } - - if (is_guest_mode(vcpu) && !nested_get_vmcs12_pages(vcpu)) - return false; - - return true; -} - static int nested_vmx_write_pml_buffer(struct kvm_vcpu *vcpu, gpa_t gpa) { struct vmcs12 *vmcs12; @@ -3402,12 +3370,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, prepare_vmcs02_early(vmx, &vmx->vmcs01, vmcs12); - if (from_vmentry) { - if (unlikely(!nested_get_vmcs12_pages(vcpu))) { - vmx_switch_vmcs(vcpu, &vmx->vmcs01); - return NVMX_VMENTRY_KVM_INTERNAL_ERROR; - } + if (unlikely(!nested_get_vmcs12_pages(vcpu))) { + vmx_switch_vmcs(vcpu, &vmx->vmcs01); + return NVMX_VMENTRY_KVM_INTERNAL_ERROR; + } + if (from_vmentry) { if (nested_vmx_check_vmentry_hw(vcpu)) { vmx_switch_vmcs(vcpu, &vmx->vmcs01); return NVMX_VMENTRY_VMFAIL; @@ -3429,24 +3397,14 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, goto vmentry_fail_vmexit_guest_mode; } - if (from_vmentry) { - failed_index = nested_vmx_load_msr(vcpu, - vmcs12->vm_entry_msr_load_addr, - vmcs12->vm_entry_msr_load_count); - if (failed_index) { - exit_reason.basic = EXIT_REASON_MSR_LOAD_FAIL; - vmcs12->exit_qualification = failed_index; - goto vmentry_fail_vmexit_guest_mode; - } - } else { - /* - * The MMU is not initialized to point at the right entities yet and - * "get pages" would need to read data from the guest (i.e. we will - * need to perform gpa to hpa translation). Request a call - * to nested_get_vmcs12_pages before the next VM-entry. The MSRs - * have already been set at vmentry time and should not be reset. - */ - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); + + failed_index = nested_vmx_load_msr(vcpu, + vmcs12->vm_entry_msr_load_addr, + vmcs12->vm_entry_msr_load_count); + if (failed_index) { + exit_reason.basic = EXIT_REASON_MSR_LOAD_FAIL; + vmcs12->exit_qualification = failed_index; + goto vmentry_fail_vmexit_guest_mode; } /* @@ -4516,16 +4474,6 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, /* Similarly, triple faults in L2 should never escape. 
*/ WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)); - if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) { - /* - * KVM_REQ_GET_NESTED_STATE_PAGES is also used to map - * Enlightened VMCS after migration and we still need to - * do that when something is forcing L2->L1 exit prior to - * the first L2 run. - */ - (void)nested_get_evmcs_page(vcpu); - } - /* Service pending TLB flush requests for L2 before switching to L1. */ kvm_service_local_tlb_flush_requests(vcpu); @@ -6382,14 +6330,17 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu, set_current_vmptr(vmx, kvm_state->hdr.vmx.vmcs12_pa); } else if (kvm_state->flags & KVM_STATE_NESTED_EVMCS) { + + vmx->nested.hv_evmcs_vmptr = EVMPTR_MAP_PENDING; + /* * nested_vmx_handle_enlightened_vmptrld() cannot be called - * directly from here as HV_X64_MSR_VP_ASSIST_PAGE may not be - * restored yet. EVMCS will be mapped from - * nested_get_vmcs12_pages(). + * directly from here if HV_X64_MSR_VP_ASSIST_PAGE is not + * restored yet. EVMCS will be mapped when it is. */ - vmx->nested.hv_evmcs_vmptr = EVMPTR_MAP_PENDING; - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); + if (kvm_hv_assist_page_enabled(vcpu)) + nested_get_evmcs_page(vcpu); + } else { return -EINVAL; } @@ -6811,8 +6762,8 @@ struct kvm_x86_nested_ops vmx_nested_ops = { .triple_fault = nested_vmx_triple_fault, .get_state = vmx_get_nested_state, .set_state = vmx_set_nested_state, - .get_nested_state_pages = vmx_get_nested_state_pages, .write_log_dirty = nested_vmx_write_pml_buffer, .enable_evmcs = nested_enable_evmcs, .get_evmcs_version = nested_get_evmcs_version, + .get_evmcs_page = nested_get_evmcs_page, }; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0ee2fbb068b17..48dd01fd7a1ec 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9897,12 +9897,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) r = -EIO; goto out; } - if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) { - if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) { - r = 0; - goto out; - } - } if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu)) kvm_mmu_unload(vcpu); if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))