From patchwork Mon Feb 7 15:54:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737592 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDAD3C35278 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442422AbiBGQEe (ORCPT ); Mon, 7 Feb 2022 11:04:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1378835AbiBGPzM (ORCPT ); Mon, 7 Feb 2022 10:55:12 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 151D8C0401CF for ; Mon, 7 Feb 2022 07:55:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249311; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EQ3ltiA1zD8GMi+gBPOuvalM+7bzpftE5TYI4/s1Kfw=; b=asbIsE73BWKAuEWfASPwQhe8QdYbJJ++4CYIWqnGS6bT6nEUyJ6yQWuG2+L7CQIYntP2xC tWhVPACuqoFxbe/HxL9fGSIP1H+lOsUYO0NYX6atHVE1o3+lFD9HPNsjQVB6mDq+eWryKT gK/rwuH+DKekm0S/lGuxFo5m5UQ61LE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-361-6tc-WsGFNY-HT2UZY1tsFQ-1; Mon, 07 Feb 2022 10:55:08 -0500 X-MC-Unique: 6tc-WsGFNY-HT2UZY1tsFQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D8E0593923; Mon, 7 Feb 2022 15:55:04 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4DE7E7DE5F; Mon, 7 Feb 2022 15:54:57 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky , stable@vger.kernel.org Subject: [PATCH RESEND 01/30] KVM: x86: SVM: don't passthrough SMAP/SMEP/PKE bits in !NPT && !gCR0.PG case Date: Mon, 7 Feb 2022 17:54:18 +0200 Message-Id: <20220207155447.840194-2-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When the guest doesn't enable paging, and NPT/EPT is disabled, we use guest't paging CR3's as KVM's shadow paging pointer and we are technically in direct mode as if we were to use NPT/EPT. In direct mode we create SPTEs with user mode permissions because usually in the direct mode the NPT/EPT doesn't need to restrict access based on guest CPL (there are MBE/GMET extenstions for that but KVM doesn't use them). In this special "use guest paging as direct" mode however, and if CR4.SMAP/CR4.SMEP are enabled, that will make the CPU fault on each access and KVM will enter endless loop of page faults. Since page protection doesn't have any meaning in !PG case, just don't passthrough these bits. The fix is the same as was done for VMX in commit: commit 656ec4a4928a ("KVM: VMX: fix SMEP and SMAP without EPT") This fixes the boot of windows 10 without NPT for good. (Without this patch, BSP boots, but APs were stuck in endless loop of page faults, causing the VM boot with 1 CPU) Signed-off-by: Maxim Levitsky Cc: stable@vger.kernel.org --- arch/x86/kvm/svm/svm.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 975be872cd1a3..995c203a62fd9 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1596,6 +1596,7 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_svm *svm = to_svm(vcpu); u64 hcr0 = cr0; + bool old_paging = is_paging(vcpu); #ifdef CONFIG_X86_64 if (vcpu->arch.efer & EFER_LME && !vcpu->arch.guest_state_protected) { @@ -1612,8 +1613,11 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) #endif vcpu->arch.cr0 = cr0; - if (!npt_enabled) + if (!npt_enabled) { hcr0 |= X86_CR0_PG | X86_CR0_WP; + if (old_paging != is_paging(vcpu)) + svm_set_cr4(vcpu, kvm_read_cr4(vcpu)); + } /* * re-enable caching here because the QEMU bios @@ -1657,8 +1661,12 @@ void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) svm_flush_tlb_current(vcpu); vcpu->arch.cr4 = cr4; - if (!npt_enabled) + if (!npt_enabled) { cr4 |= X86_CR4_PAE; + + if (!is_paging(vcpu)) + cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE); + } cr4 |= host_cr4_mce; to_svm(vcpu)->vmcb->save.cr4 = cr4; vmcb_mark_dirty(to_svm(vcpu)->vmcb, VMCB_CR); From patchwork Mon Feb 7 15:54:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737580 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3A6CC3526E for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1392300AbiBGQEQ (ORCPT ); Mon, 7 Feb 2022 11:04:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36784 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385423AbiBGPzU (ORCPT ); Mon, 7 Feb 2022 10:55:20 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 46E2EC0401D2 for ; Mon, 7 Feb 2022 07:55:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249319; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZWkaPZigKR8ZVYYpYKUmXEITXI5Ito7CWFDFszKolT0=; b=FoU8VjHyA+VSai7IIWulvkhJQjrpuuPMquS9VTYvhuUenzPoTfK4I6jigdXuoh3qz15aGv og9ZRJaErQA1JH5Voh+7ke904RmlWh5KMeg7KkyO+9mAKXBztt58y7TWsGnaS3s8Gw71s2 sAusFqEUzeFoDplWE0J8E/mMQsHWbFM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-663-UJZl6NgZNt-ieIzpmnWFHQ-1; Mon, 07 Feb 2022 10:55:16 -0500 X-MC-Unique: UJZl6NgZNt-ieIzpmnWFHQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 064471091DA2; Mon, 7 Feb 2022 15:55:13 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 51F757DE38; Mon, 7 Feb 2022 15:55:05 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky , stable@vger.kernel.org Subject: [PATCH RESEND 02/30] KVM: x86: nSVM: fix potential NULL derefernce on nested migration Date: Mon, 7 Feb 2022 17:54:19 +0200 Message-Id: <20220207155447.840194-3-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Turns out that due to review feedback and/or rebases I accidentally moved the call to nested_svm_load_cr3 to be too early, before the NPT is enabled, which is very wrong to do. KVM can't even access guest memory at that point as nested NPT is needed for that, and of course it won't initialize the walk_mmu, which is main issue the patch was addressing. Fix this for real. Fixes: 232f75d3b4b5 ("KVM: nSVM: call nested_svm_load_cr3 on nested state load") Cc: stable@vger.kernel.org Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 1218b5a342fc8..39d280e7e80ef 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1457,18 +1457,6 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, !__nested_vmcb_check_save(vcpu, &save_cached)) goto out_free; - /* - * While the nested guest CR3 is already checked and set by - * KVM_SET_SREGS, it was set when nested state was yet loaded, - * thus MMU might not be initialized correctly. - * Set it again to fix this. - */ - - ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3, - nested_npt_enabled(svm), false); - if (WARN_ON_ONCE(ret)) - goto out_free; - /* * All checks done, we can enter guest mode. Userspace provides @@ -1494,6 +1482,20 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, svm_switch_vmcb(svm, &svm->nested.vmcb02); nested_vmcb02_prepare_control(svm); + + /* + * While the nested guest CR3 is already checked and set by + * KVM_SET_SREGS, it was set when nested state was yet loaded, + * thus MMU might not be initialized correctly. + * Set it again to fix this. + */ + + ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3, + nested_npt_enabled(svm), false); + if (WARN_ON_ONCE(ret)) + goto out_free; + + kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); ret = 0; out_free: From patchwork Mon Feb 7 15:54:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737573 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8537BC4167B for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1390014AbiBGQEK (ORCPT ); Mon, 7 Feb 2022 11:04:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385525AbiBGPz2 (ORCPT ); Mon, 7 Feb 2022 10:55:28 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 27A79C0401D2 for ; Mon, 7 Feb 2022 07:55:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249327; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=I4R9ip16BNHzOCMyjNW2i8e6oOdWXHYHkd2se90F0XI=; b=aTFIDjBGvyTm7QnthiKk2ty4gTdKKVeKETUgbLd0/aTP472YeOzXC+pezqLI2R8f9rSWaG X3i6FNg9ezuoBrx95zZRwMe+0Y28CXzHNbbBsmHe0YNJyZB/eUjIe1155NpniudS3uR5RX OoL030AKv1lo4lnanr/vsVyiIsl/f28= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-661-wKwVbDHBNS-GatSourBQzg-1; Mon, 07 Feb 2022 10:55:24 -0500 X-MC-Unique: wKwVbDHBNS-GatSourBQzg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E010293922; Mon, 7 Feb 2022 15:55:20 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 739AF7DE56; Mon, 7 Feb 2022 15:55:13 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky , stable@vger.kernel.org Subject: [PATCH RESEND 03/30] KVM: x86: nSVM: mark vmcb01 as dirty when restoring SMM saved state Date: Mon, 7 Feb 2022 17:54:20 +0200 Message-Id: <20220207155447.840194-4-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org While usually, restoring the smm state makes the KVM enter the nested guest thus a different vmcb (vmcb02 vs vmcb01), KVM should still mark it as dirty, since hardware can in theory cache multiple vmcbs. Failure to do so, combined with lack of setting the nested_run_pending (which is fixed in the next patch), might make KVM re-enter vmcb01, which was just exited from, with completely different set of guest state registers (SMM vs non SMM) and without proper dirty bits set, which results in the CPU reusing stale IDTR pointer which leads to a guest shutdown on any interrupt. On the real hardware this usually doesn't happen, but when running nested, L0's KVM does check and honour few dirty bits, causing this issue to happen. This patch fixes boot of hyperv and SMM enabled windows VM running nested on KVM. Signed-off-by: Maxim Levitsky Cc: stable@vger.kernel.org --- arch/x86/kvm/svm/svm.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 995c203a62fd9..3f1d11e652123 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4267,6 +4267,8 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) * Enter the nested guest now */ + vmcb_mark_all_dirty(svm->vmcb01.ptr); + vmcb12 = map.hva; nested_copy_vmcb_control_to_cache(svm, &vmcb12->control); nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); From patchwork Mon Feb 7 15:54:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737586 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BCA99C47082 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442431AbiBGQEf (ORCPT ); Mon, 7 Feb 2022 11:04:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1381154AbiBGPzh (ORCPT ); Mon, 7 Feb 2022 10:55:37 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id A05C6C0401CF for ; Mon, 7 Feb 2022 07:55:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249335; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=va2Nt/crWYOdjbL3kQd/XU5O6mMNd0LbnQBpM4oG1KI=; b=d8pbV2fMdertdfphG6AP7qKTwaJTu9j7UiwC8i6L9X9SM6eEbGHTUB8N9jYlLsQPlD25UD lv8T11tQmsC/ljgDlrakyUVKOwjYtzN039MbxJ5TumgLUzJJlMy+IB6JIOd3AYINzKBkPK Z6Amphf+rV/ehvNmnAUK6ai2LZFH4mI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-646-q1yrGdk1OXO_2Bn2pDEwxA-1; Mon, 07 Feb 2022 10:55:32 -0500 X-MC-Unique: q1yrGdk1OXO_2Bn2pDEwxA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4057F80F04E; Mon, 7 Feb 2022 15:55:29 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5AA4B7DE38; Mon, 7 Feb 2022 15:55:21 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky , stable@vger.kernel.org Subject: [PATCH RESEND 04/30] KVM: x86: nSVM/nVMX: set nested_run_pending on VM entry which is a result of RSM Date: Mon, 7 Feb 2022 17:54:21 +0200 Message-Id: <20220207155447.840194-5-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org While RSM induced VM entries are not full VM entries, they still need to be followed by actual VM entry to complete it, unlike setting the nested state. This patch fixes boot of hyperv and SMM enabled windows VM running nested on KVM, which fail due to this issue combined with lack of dirty bit setting. Signed-off-by: Maxim Levitsky Cc: stable@vger.kernel.org --- arch/x86/kvm/svm/svm.c | 5 +++++ arch/x86/kvm/vmx/vmx.c | 1 + 2 files changed, 6 insertions(+) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 3f1d11e652123..71bfa52121622 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4274,6 +4274,11 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false); + if (ret) + goto unmap_save; + + svm->nested.nested_run_pending = 1; + unmap_save: kvm_vcpu_unmap(vcpu, &map_save, true); unmap_map: diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 8ac5a6fa77203..fc9c4eca90a78 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7659,6 +7659,7 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) if (ret) return ret; + vmx->nested.nested_run_pending = 1; vmx->nested.smm.guest_mode = false; } return 0; From patchwork Mon Feb 7 15:54:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737582 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3AE9BC35274 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442064AbiBGQE1 (ORCPT ); Mon, 7 Feb 2022 11:04:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385554AbiBGPzo (ORCPT ); Mon, 7 Feb 2022 10:55:44 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id F2689C0401D1 for ; Mon, 7 Feb 2022 07:55:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249343; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qMnOM1ahvRbyv4h4oZCdQyrFX9rMZxmp7xDBGeC6Xow=; b=FIciMeprzVJHP7CPEDQRJMEc8mugP827ZWkziakM7CNQQmaDkfyUt2APhVDllpLOWuFXVv b92kzmGBmKqeTZVw06Cceuc1eNT0v+F8V9+zxspLEfvVP2YauLBxfCh7SFVihBQFY/iA7R Ox8nBpgMrsssVW9TWERmcB5ZPcbpGFs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-263-6Dh-KuYGPjOE8ZRERIhVbA-1; Mon, 07 Feb 2022 10:55:40 -0500 X-MC-Unique: 6Dh-KuYGPjOE8ZRERIhVbA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id ED04693920; Mon, 7 Feb 2022 15:55:36 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id ADBA87DE38; Mon, 7 Feb 2022 15:55:29 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 05/30] KVM: x86: nSVM: expose clean bit support to the guest Date: Mon, 7 Feb 2022 17:54:22 +0200 Message-Id: <20220207155447.840194-6-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM already honours few clean bits thus it makes sense to let the nested guest know about it. Note that KVM also doesn't check if the hardware supports clean bits, and therefore nested KVM was already setting clean bits and L0 KVM was already honouring them. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/svm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 71bfa52121622..8013be9edf27c 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4663,6 +4663,7 @@ static __init void svm_set_cpu_caps(void) /* CPUID 0x80000001 and 0x8000000A (SVM features) */ if (nested) { kvm_cpu_cap_set(X86_FEATURE_SVM); + kvm_cpu_cap_set(X86_FEATURE_VMCBCLEAN); if (nrips) kvm_cpu_cap_set(X86_FEATURE_NRIPS); From patchwork Mon Feb 7 15:54:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737579 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CB98C35272 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1392478AbiBGQET (ORCPT ); Mon, 7 Feb 2022 11:04:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37160 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385584AbiBGPzw (ORCPT ); Mon, 7 Feb 2022 10:55:52 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 43941C0401CC for ; Mon, 7 Feb 2022 07:55:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249351; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EAX/iPSDnKifJ1ul8GYSz/iW0O/HgQPeT+89Chaumh4=; b=LC569S7T+X3ePsCLgWs9TDNGxFCbz30nhzCdOX1ueMlsvDiwadEb+6ZaT8FT9NNjIjm9Ea gmYEXSb8SMFuoCgnB25X7JYroT+kHae3NOQFnmhkMtO9DtUKF6Pj+QyDsVsGyKXuDaKeyl nStSc1gIyDgHuQawSuVdtamLIAl3bWg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-311-jqEL4Il4OMi6dTRpCOZybg-1; Mon, 07 Feb 2022 10:55:48 -0500 X-MC-Unique: jqEL4Il4OMi6dTRpCOZybg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1FA26100C675; Mon, 7 Feb 2022 15:55:45 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 495177DE38; Mon, 7 Feb 2022 15:55:37 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 06/30] KVM: x86: mark syntethic SMM vmexit as SVM_EXIT_SW Date: Mon, 7 Feb 2022 17:54:23 +0200 Message-Id: <20220207155447.840194-7-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use a dummy unused vmexit reason to mark the 'VM exit' that is happening when we exit to handle SMM, which is not a real VM exit. This makes it a bit easier to read the KVM trace, and avoids other potential problems. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 8013be9edf27c..9a4e299ed5673 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4194,7 +4194,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, char *smstate) svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP]; svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP]; - ret = nested_svm_vmexit(svm); + ret = nested_svm_simple_vmexit(svm, SVM_EXIT_SW); if (ret) return ret; From patchwork Mon Feb 7 15:54:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737570 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1F580C433FE for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1388849AbiBGQED (ORCPT ); Mon, 7 Feb 2022 11:04:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37236 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385587AbiBGP4H (ORCPT ); Mon, 7 Feb 2022 10:56:07 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 3C590C0401CE for ; Mon, 7 Feb 2022 07:56:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249366; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZPXfeiFExpOXhTUBTnQ3rPqBk4UBybQ6CD98nOBVkzg=; b=RfI2Fu+bTDO2/YfkWELOFGBts+ncPuBV4cHm4LtalIHBw01if7+81tSqB6Hzo+CqnSZX5H 66uTKsN41B9niixMV79ozIDuA8qxxsfN+YDSD8Nd2UhPSU6mcKggJioSZwPPLfKZSpTSYg tAKI667anbA/nXh/j67SzONmKEzbrT4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-434-NgEBCeIEN32whUkd4taQkg-1; Mon, 07 Feb 2022 10:56:03 -0500 X-MC-Unique: NgEBCeIEN32whUkd4taQkg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0DFD31091DA2; Mon, 7 Feb 2022 15:56:00 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8E9757DE38; Mon, 7 Feb 2022 15:55:45 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 07/30] KVM: x86: nSVM: deal with L1 hypervisor that intercepts interrupts but lets L2 control them Date: Mon, 7 Feb 2022 17:54:24 +0200 Message-Id: <20220207155447.840194-8-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Fix a corner case in which the L1 hypervisor intercepts interrupts (INTERCEPT_INTR) and either doesn't set virtual interrupt masking (V_INTR_MASKING) or enters a nested guest with EFLAGS.IF disabled prior to the entry. In this case, despite the fact that L1 intercepts the interrupts, KVM still needs to set up an interrupt window to wait before injecting the INTR vmexit. Currently the KVM instead enters an endless loop of 'req_immediate_exit'. Exactly the same issue also happens for SMIs and NMI. Fix this as well. Note that on VMX, this case is impossible as there is only 'vmexit on external interrupts' execution control which either set, in which case both host and guest's EFLAGS.IF are ignored, or not set, in which case no VMexits are delivered. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/svm.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 9a4e299ed5673..22e614008cf59 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3372,11 +3372,13 @@ static int svm_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection) if (svm->nested.nested_run_pending) return -EBUSY; + if (svm_nmi_blocked(vcpu)) + return 0; + /* An NMI must not be injected into L2 if it's supposed to VM-Exit. */ if (for_injection && is_guest_mode(vcpu) && nested_exit_on_nmi(svm)) return -EBUSY; - - return !svm_nmi_blocked(vcpu); + return 1; } static bool svm_get_nmi_mask(struct kvm_vcpu *vcpu) @@ -3428,9 +3430,13 @@ bool svm_interrupt_blocked(struct kvm_vcpu *vcpu) static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection) { struct vcpu_svm *svm = to_svm(vcpu); + if (svm->nested.nested_run_pending) return -EBUSY; + if (svm_interrupt_blocked(vcpu)) + return 0; + /* * An IRQ must not be injected into L2 if it's supposed to VM-Exit, * e.g. if the IRQ arrived asynchronously after checking nested events. @@ -3438,7 +3444,7 @@ static int svm_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection) if (for_injection && is_guest_mode(vcpu) && nested_exit_on_intr(svm)) return -EBUSY; - return !svm_interrupt_blocked(vcpu); + return 1; } static void svm_enable_irq_window(struct kvm_vcpu *vcpu) @@ -4169,11 +4175,14 @@ static int svm_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection) if (svm->nested.nested_run_pending) return -EBUSY; + if (svm_smi_blocked(vcpu)) + return 0; + /* An SMI must not be injected into L2 if it's supposed to VM-Exit. */ if (for_injection && is_guest_mode(vcpu) && nested_exit_on_smi(svm)) return -EBUSY; - return !svm_smi_blocked(vcpu); + return 1; } static int svm_enter_smm(struct kvm_vcpu *vcpu, char *smstate) From patchwork Mon Feb 7 15:54:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737591 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FFE4C35275 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442211AbiBGQEc (ORCPT ); Mon, 7 Feb 2022 11:04:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37284 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385593AbiBGP4T (ORCPT ); Mon, 7 Feb 2022 10:56:19 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 85F5FC0401CE for ; Mon, 7 Feb 2022 07:56:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249377; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hIkOyOBrYC66as9Hh5WBjO5DbzrEJzCI4Y98UxmVlRg=; b=Eil31yxaf80Lu+0BSGUgbnuKsQ0iEMZmts58qlRwAMtDk2zJfcfyYfwZablXSDcEXwOVzN IrMt9NENTiKc6LmsUmrdGxI5kQkPzRMOTsH6Qu3658vCafhfphqjkrMVdSGeOBmG90uO3d aYlqP/vuT+MLS22QwCRXNWcnPlIMCWo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-639-oRBj8DFEPRSHtMKGb7doUw-1; Mon, 07 Feb 2022 10:56:14 -0500 X-MC-Unique: oRBj8DFEPRSHtMKGb7doUw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 22080100C662; Mon, 7 Feb 2022 15:56:11 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7C77482765; Mon, 7 Feb 2022 15:56:00 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 08/30] KVM: x86: lapic: don't touch irr_pending in kvm_apic_update_apicv when inhibiting it Date: Mon, 7 Feb 2022 17:54:25 +0200 Message-Id: <20220207155447.840194-9-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org kvm_apic_update_apicv is called when AVIC is still active, thus IRR bits can be set by the CPU after it is called, and don't cause the irr_pending to be set to true. Also logic in avic_kick_target_vcpu doesn't expect a race with this function so to make it simple, just keep irr_pending set to true and let the next interrupt injection to the guest clear it. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/lapic.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 0da7d0960fcb5..dd4e2888c244b 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2307,7 +2307,12 @@ void kvm_apic_update_apicv(struct kvm_vcpu *vcpu) apic->irr_pending = true; apic->isr_count = 1; } else { - apic->irr_pending = (apic_search_irr(apic) != -1); + /* + * Don't clear irr_pending, searching the IRR can race with + * updates from the CPU as APICv is still active from hardware's + * perspective. The flag will be cleared as appropriate when + * KVM injects the interrupt. + */ apic->isr_count = count_vectors(apic->regs + APIC_ISR); } } From patchwork Mon Feb 7 15:54:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737572 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 607A2C4321E for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1389441AbiBGQEH (ORCPT ); Mon, 7 Feb 2022 11:04:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232506AbiBGP40 (ORCPT ); Mon, 7 Feb 2022 10:56:26 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id A3602C0401CF for ; Mon, 7 Feb 2022 07:56:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249384; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qdwpmbef/Bh6XlwaL5pVGukYEI2IsJrzG/O+p5FnU2s=; b=DE/BHsSiEucXgigebfg+eeNh1VVPmYVyk0rJSTOOZYWrwdCyWnmSPtWfKuuh3iOLLjTdt2 we7cOTEbI8neYUIEYMObDrCwv8FcMjk37OY8FhY9H7a/+mLYjpeWQoW6gdygHfPgCaOXjB 6dl8+8H7NB/HWYH54JrRCpqD2YH8xcA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-372-oPbGccWgM-qCFCmjnGmEAg-1; Mon, 07 Feb 2022 10:56:21 -0500 X-MC-Unique: oPbGccWgM-qCFCmjnGmEAg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C37231091DA5; Mon, 7 Feb 2022 15:56:18 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 90B227DE4D; Mon, 7 Feb 2022 15:56:11 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 09/30] KVM: x86: SVM: move avic definitions from AMD's spec to svm.h Date: Mon, 7 Feb 2022 17:54:26 +0200 Message-Id: <20220207155447.840194-10-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org asm/svm.h is the correct place for all values that are defined in the SVM spec, and that includes AVIC. Also add some values from the spec that were not defined before and will be soon useful. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/msr-index.h | 1 + arch/x86/include/asm/svm.h | 36 ++++++++++++++++++++++++++++++++ arch/x86/kvm/svm/avic.c | 22 +------------------ arch/x86/kvm/svm/svm.h | 11 ---------- 4 files changed, 38 insertions(+), 32 deletions(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 01e2650b95859..552ff8a5ea023 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -476,6 +476,7 @@ #define MSR_AMD64_ICIBSEXTDCTL 0xc001103c #define MSR_AMD64_IBSOPDATA4 0xc001103d #define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */ +#define MSR_AMD64_SVM_AVIC_DOORBELL 0xc001011b #define MSR_AMD64_VM_PAGE_FLUSH 0xc001011e #define MSR_AMD64_SEV_ES_GHCB 0xc0010130 #define MSR_AMD64_SEV 0xc0010131 diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index b00dbc5fac2b2..bb2fb78523cee 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -220,6 +220,42 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define SVM_NESTED_CTL_SEV_ENABLE BIT(1) #define SVM_NESTED_CTL_SEV_ES_ENABLE BIT(2) + +/* AVIC */ +#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF) +#define AVIC_LOGICAL_ID_ENTRY_VALID_BIT 31 +#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31) + +#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK (0xFFULL) +#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK (0xFFFFFFFFFFULL << 12) +#define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK (1ULL << 62) +#define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK (1ULL << 63) +#define AVIC_PHYSICAL_ID_TABLE_SIZE_MASK (0xFF) + +#define AVIC_DOORBELL_PHYSICAL_ID_MASK (0xFF) + +#define AVIC_UNACCEL_ACCESS_WRITE_MASK 1 +#define AVIC_UNACCEL_ACCESS_OFFSET_MASK 0xFF0 +#define AVIC_UNACCEL_ACCESS_VECTOR_MASK 0xFFFFFFFF + +enum avic_ipi_failure_cause { + AVIC_IPI_FAILURE_INVALID_INT_TYPE, + AVIC_IPI_FAILURE_TARGET_NOT_RUNNING, + AVIC_IPI_FAILURE_INVALID_TARGET, + AVIC_IPI_FAILURE_INVALID_BACKING_PAGE, +}; + + +/* + * 0xff is broadcast, so the max index allowed for physical APIC ID + * table is 0xfe. APIC IDs above 0xff are reserved. + */ +#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff + +#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) +#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL + + struct vmcb_seg { u16 selector; u16 attrib; diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 99f907ec5aa8f..fabfc337e1c35 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -27,20 +27,6 @@ #include "irq.h" #include "svm.h" -#define SVM_AVIC_DOORBELL 0xc001011b - -#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) - -/* - * 0xff is broadcast, so the max index allowed for physical APIC ID - * table is 0xfe. APIC IDs above 0xff are reserved. - */ -#define AVIC_MAX_PHYSICAL_ID_COUNT 255 - -#define AVIC_UNACCEL_ACCESS_WRITE_MASK 1 -#define AVIC_UNACCEL_ACCESS_OFFSET_MASK 0xFF0 -#define AVIC_UNACCEL_ACCESS_VECTOR_MASK 0xFFFFFFFF - /* AVIC GATAG is encoded using VM and VCPU IDs */ #define AVIC_VCPU_ID_BITS 8 #define AVIC_VCPU_ID_MASK ((1 << AVIC_VCPU_ID_BITS) - 1) @@ -73,12 +59,6 @@ struct amd_svm_iommu_ir { void *data; /* Storing pointer to struct amd_ir_data */ }; -enum avic_ipi_failure_cause { - AVIC_IPI_FAILURE_INVALID_INT_TYPE, - AVIC_IPI_FAILURE_TARGET_NOT_RUNNING, - AVIC_IPI_FAILURE_INVALID_TARGET, - AVIC_IPI_FAILURE_INVALID_BACKING_PAGE, -}; /* Note: * This function is called from IOMMU driver to notify @@ -702,7 +682,7 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec) * one is harmless). */ if (cpu != get_cpu()) - wrmsrl(SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu)); + wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu)); put_cpu(); } else { /* diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 852b12aee03d7..6343558982c73 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -555,17 +555,6 @@ extern struct kvm_x86_nested_ops svm_nested_ops; /* avic.c */ -#define AVIC_LOGICAL_ID_ENTRY_GUEST_PHYSICAL_ID_MASK (0xFF) -#define AVIC_LOGICAL_ID_ENTRY_VALID_BIT 31 -#define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31) - -#define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK (0xFFULL) -#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK (0xFFFFFFFFFFULL << 12) -#define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK (1ULL << 62) -#define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK (1ULL << 63) - -#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL - int avic_ga_log_notifier(u32 ga_tag); void avic_vm_destroy(struct kvm *kvm); int avic_vm_init(struct kvm *kvm); From patchwork Mon Feb 7 15:54:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737571 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41C82C43219 for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345656AbiBGQEF (ORCPT ); Mon, 7 Feb 2022 11:04:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240978AbiBGP4g (ORCPT ); Mon, 7 Feb 2022 10:56:36 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 48D82C0401CC for ; Mon, 7 Feb 2022 07:56:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249394; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+qBv9qwSEqvx96/HtSvc9qvyeLhvKy4iFAhj6kK93B8=; b=gD1YCfJwTlwCgdL0p2ncinNwn5IgWrInfNbjWhmMME23MMGUPRtURfKD7e3UlAdzXZTelJ aUU34DXmcwWoKsBnzqT3SUGjVzLH151ZISba1W6sxR/+2jGJXbSYR7a04SUZt8wOR40NoT Vkc6u0GYc27t+KZZx2+opgCcFE2ZNtE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-663-WnQs0KM8Phi6NpwLCF8Bkw-1; Mon, 07 Feb 2022 10:56:33 -0500 X-MC-Unique: WnQs0KM8Phi6NpwLCF8Bkw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2A9F2192D786; Mon, 7 Feb 2022 15:56:30 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3D44B7DE57; Mon, 7 Feb 2022 15:56:19 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 10/30] KVM: x86: SVM: fix race between interrupt delivery and AVIC inhibition Date: Mon, 7 Feb 2022 17:54:27 +0200 Message-Id: <20220207155447.840194-11-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If svm_deliver_avic_intr is called just after the target vcpu's AVIC got inhibited, it might read a stale value of vcpu->arch.apicv_active which can lead to the target vCPU not noticing the interrupt. To fix this use load-acquire/store-release so that, if the target vCPU is IN_GUEST_MODE, we're guaranteed to see a previous disabling of the AVIC. If AVIC has been disabled in the meanwhile, proceed with the KVM_REQ_EVENT-based delivery. All this complicated logic is actually exactly how we can handle an incomplete IPI vmexit; the only difference lies in who sets IRR, whether KVM or the processor. Also incomplete IPI vmexit also has the same races as svm_deliver_avic_intr. Therefore use the avic_kick_target_vcpu there as well. Co-developed-by: Paolo Bonzini Signed-off-by: Paolo Bonzini Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/avic.c | 73 ++++++++++++++--------------------------- arch/x86/kvm/svm/svm.c | 65 ++++++++++++++++++++++++++++-------- arch/x86/kvm/svm/svm.h | 3 ++ arch/x86/kvm/x86.c | 4 ++- 4 files changed, 82 insertions(+), 63 deletions(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index fabfc337e1c35..4c2d622b3b9f0 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -269,6 +269,24 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu) return 0; } + +void avic_ring_doorbell(struct kvm_vcpu *vcpu) +{ + /* + * Note, the vCPU could get migrated to a different pCPU at any + * point, which could result in signalling the wrong/previous + * pCPU. But if that happens the vCPU is guaranteed to do a + * VMRUN (after being migrated) and thus will process pending + * interrupts, i.e. a doorbell is not needed (and the spurious + * one is harmless). + */ + int cpu = READ_ONCE(vcpu->cpu); + + if (cpu != get_cpu()) + wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu)); + put_cpu(); +} + static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source, u32 icrl, u32 icrh) { @@ -284,8 +302,13 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source, kvm_for_each_vcpu(i, vcpu, kvm) { if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK, GET_APIC_DEST_FIELD(icrh), - icrl & APIC_DEST_MASK)) - kvm_vcpu_wake_up(vcpu); + icrl & APIC_DEST_MASK)) { + vcpu->arch.apic->irr_pending = true; + svm_complete_interrupt_delivery(vcpu, + icrl & APIC_MODE_MASK, + icrl & APIC_INT_LEVELTRIG, + icrl & APIC_VECTOR_MASK); + } } } @@ -649,52 +672,6 @@ void avic_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap) return; } -int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec) -{ - if (!vcpu->arch.apicv_active) - return -1; - - kvm_lapic_set_irr(vec, vcpu->arch.apic); - - /* - * Pairs with the smp_mb_*() after setting vcpu->guest_mode in - * vcpu_enter_guest() to ensure the write to the vIRR is ordered before - * the read of guest_mode, which guarantees that either VMRUN will see - * and process the new vIRR entry, or that the below code will signal - * the doorbell if the vCPU is already running in the guest. - */ - smp_mb__after_atomic(); - - /* - * Signal the doorbell to tell hardware to inject the IRQ if the vCPU - * is in the guest. If the vCPU is not in the guest, hardware will - * automatically process AVIC interrupts at VMRUN. - */ - if (vcpu->mode == IN_GUEST_MODE) { - int cpu = READ_ONCE(vcpu->cpu); - - /* - * Note, the vCPU could get migrated to a different pCPU at any - * point, which could result in signalling the wrong/previous - * pCPU. But if that happens the vCPU is guaranteed to do a - * VMRUN (after being migrated) and thus will process pending - * interrupts, i.e. a doorbell is not needed (and the spurious - * one is harmless). - */ - if (cpu != get_cpu()) - wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu)); - put_cpu(); - } else { - /* - * Wake the vCPU if it was blocking. KVM will then detect the - * pending IRQ when checking if the vCPU has a wake event. - */ - kvm_vcpu_wake_up(vcpu); - } - - return 0; -} - bool avic_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu) { return false; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 22e614008cf59..18d4d87e12e15 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3310,20 +3310,6 @@ static void svm_inject_irq(struct kvm_vcpu *vcpu) SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_INTR; } -static void svm_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, - int trig_mode, int vector) -{ - struct kvm_vcpu *vcpu = apic->vcpu; - - if (svm_deliver_avic_intr(vcpu, vector)) { - kvm_lapic_set_irr(vector, apic); - kvm_make_request(KVM_REQ_EVENT, vcpu); - kvm_vcpu_kick(vcpu); - } else { - trace_kvm_apicv_accept_irq(vcpu->vcpu_id, delivery_mode, - trig_mode, vector); - } -} static void svm_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr) { @@ -4142,6 +4128,57 @@ static int svm_check_intercept(struct kvm_vcpu *vcpu, return ret; } +void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode, + int trig_mode, int vec) +{ + /* + * vcpu->arch.apicv_active must be read after vcpu->mode. + * Pairs with smp_store_release in vcpu_enter_guest. + */ + bool in_guest_mode = (smp_load_acquire(&vcpu->mode) == IN_GUEST_MODE); + + if (!READ_ONCE(vcpu->arch.apicv_active)) { + /* + * Manually signal the event + */ + kvm_make_request(KVM_REQ_EVENT, vcpu); + kvm_vcpu_kick(vcpu); + return; + } + + trace_kvm_apicv_accept_irq(vcpu->vcpu_id, delivery_mode, trig_mode, vec); + + if (in_guest_mode) + /* + * Signal the doorbell to tell hardware to inject the IRQ if the vCPU + * is in the guest. If the vCPU is not in the guest, hardware will + * automatically process AVIC interrupts at VMRUN. + */ + avic_ring_doorbell(vcpu); + else + /* + * Wake the vCPU if it was blocking. KVM will then detect the + * pending IRQ when checking if the vCPU has a wake event. + */ + kvm_vcpu_wake_up(vcpu); +} + +static void svm_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, + int trig_mode, int vec) +{ + kvm_lapic_set_irr(vec, apic); + + /* + * Pairs with the smp_mb_*() after setting vcpu->guest_mode in + * vcpu_enter_guest() to ensure the write to the vIRR is ordered before + * the read of guest_mode, which guarantees that either VMRUN will see + * and process the new vIRR entry, or that the below code will signal + * the doorbell if the vCPU is already running in the guest. + */ + smp_mb__after_atomic(); + svm_complete_interrupt_delivery(apic->vcpu, delivery_mode, trig_mode, vec); +} + static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu) { } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 6343558982c73..83f9f95eced3e 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -488,6 +488,8 @@ void svm_set_gif(struct vcpu_svm *svm, bool value); int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code); void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr, int read, int write); +void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode, + int trig_mode, int vec); /* nested.c */ @@ -577,6 +579,7 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq, bool set); void avic_vcpu_blocking(struct kvm_vcpu *vcpu); void avic_vcpu_unblocking(struct kvm_vcpu *vcpu); +void avic_ring_doorbell(struct kvm_vcpu *vcpu); /* sev.c */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6f69f3e3635e2..8cb5390f75efe 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10039,7 +10039,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) * result in virtual interrupt delivery. */ local_irq_disable(); - vcpu->mode = IN_GUEST_MODE; + + /* Store vcpu->apicv_active before vcpu->mode. */ + smp_store_release(&vcpu->mode, IN_GUEST_MODE); srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx); From patchwork Mon Feb 7 15:54:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737575 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4873C4167E for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1391836AbiBGQEN (ORCPT ); Mon, 7 Feb 2022 11:04:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37564 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344470AbiBGP4p (ORCPT ); Mon, 7 Feb 2022 10:56:45 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 2553DC0401D8 for ; Mon, 7 Feb 2022 07:56:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249404; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7uKkJYp30pF3ErxkQlRiirr2VHEvcBthwFChel8Mbe4=; b=Tb3ODZGX0HbJpDH1Zv12UbFmt5k4APu8S3mStqZDc/94WNW+0cpPYQvVWSnGW+lRimCzw3 /8mCYVQgTnEGjllfH2xqY1wOKAYcwYWgNcLg9mLJBQn0hvZxjU+bpYGKUHj/KUf+GlSJ7X 3xVhjqco7hDVyVyjCL2KMvJuoi0CBWo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-139-qpyNhtRvOWyRLxQ07D8F3g-1; Mon, 07 Feb 2022 10:56:40 -0500 X-MC-Unique: qpyNhtRvOWyRLxQ07D8F3g-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id DA995874980; Mon, 7 Feb 2022 15:56:37 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 991677DE38; Mon, 7 Feb 2022 15:56:30 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 11/30] KVM: x86: SVM: use vmcb01 in avic_init_vmcb Date: Mon, 7 Feb 2022 17:54:28 +0200 Message-Id: <20220207155447.840194-12-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Out of precation use vmcb01 when enabling host AVIC. No functional change intended. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/avic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 4c2d622b3b9f0..c6072245f7fbb 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -167,7 +167,7 @@ int avic_vm_init(struct kvm *kvm) void avic_init_vmcb(struct vcpu_svm *svm) { - struct vmcb *vmcb = svm->vmcb; + struct vmcb *vmcb = svm->vmcb01.ptr; struct kvm_svm *kvm_svm = to_kvm_svm(svm->vcpu.kvm); phys_addr_t bpa = __sme_set(page_to_phys(svm->avic_backing_page)); phys_addr_t lpa = __sme_set(page_to_phys(kvm_svm->avic_logical_id_table_page)); From patchwork Mon Feb 7 15:54:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737567 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09056C4332F for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1388241AbiBGQEB (ORCPT ); Mon, 7 Feb 2022 11:04:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37814 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232334AbiBGP5T (ORCPT ); Mon, 7 Feb 2022 10:57:19 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id C45ABC0401CE for ; Mon, 7 Feb 2022 07:57:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249437; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WfBR/k98L4+bC6Zh+Zl9tYY0D7k4ei3pM6o4QxmereM=; b=QxhU25eTSX7+wMtQN1IjPQiSRqUqtJpL6qNTfD3YUoTXx6eFOlZth/DDrHkqyu6+ncIovb 4ieLwbMU56zHv6GvadUCwE8GotML3bFGp1UJp8vfRedeRTwew4Gxz3FUBmtkKuiE+c0T/s EsquTN2LKtCajlTe886vHNnE08erc9g= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-626-HWEGEVyzP7-uSFNtYigGXA-1; Mon, 07 Feb 2022 10:57:14 -0500 X-MC-Unique: HWEGEVyzP7-uSFNtYigGXA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D99BC100C660; Mon, 7 Feb 2022 15:57:11 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 540707DE38; Mon, 7 Feb 2022 15:56:38 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 12/30] KVM: x86: SVM: allow AVIC to co-exist with a nested guest running Date: Mon, 7 Feb 2022 17:54:29 +0200 Message-Id: <20220207155447.840194-13-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Inhibit the AVIC of the vCPU that is running nested for the duration of the nested run, so that all interrupts arriving from both its vCPU siblings and from KVM are delivered using normal IPIs and cause that vCPU to vmexit. Note that unlike normal AVIC inhibition, there is no need to update the AVIC mmio memslot, because the nested guest uses its own set of paging tables. That also means that AVIC doesn't need to be inhibited VM wide. Note that in the theory when a nested guest doesn't intercept physical interrupts, we could continue using AVIC to deliver them to it but don't bother doing so for now. Plus when nested AVIC is implemented, the nested guest will likely use it, which will not allow this optimization to be used (can't use real AVIC to support both L1 and L2 at the same time) Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 8 +++++++- arch/x86/kvm/svm/avic.c | 7 ++++++- arch/x86/kvm/svm/nested.c | 15 ++++++++++----- arch/x86/kvm/svm/svm.c | 31 +++++++++++++++++++----------- arch/x86/kvm/svm/svm.h | 1 + arch/x86/kvm/x86.c | 18 +++++++++++++++-- 7 files changed, 61 insertions(+), 20 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 9e37dc3d88636..c0d8f351dcbc0 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -125,6 +125,7 @@ KVM_X86_OP_NULL(migrate_timers) KVM_X86_OP(msr_filter_changed) KVM_X86_OP_NULL(complete_emulated_msr) KVM_X86_OP(vcpu_deliver_sipi_vector) +KVM_X86_OP_NULL(vcpu_has_apicv_inhibit_condition); #undef KVM_X86_OP #undef KVM_X86_OP_NULL diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c371ee7e45f78..256539c0481c5 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1039,7 +1039,6 @@ struct kvm_x86_msr_filter { #define APICV_INHIBIT_REASON_DISABLE 0 #define APICV_INHIBIT_REASON_HYPERV 1 -#define APICV_INHIBIT_REASON_NESTED 2 #define APICV_INHIBIT_REASON_IRQWIN 3 #define APICV_INHIBIT_REASON_PIT_REINJ 4 #define APICV_INHIBIT_REASON_X2APIC 5 @@ -1494,6 +1493,12 @@ struct kvm_x86_ops { int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err); void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector); + + /* + * Returns true if for some reason APICv (e.g guest mode) + * must be inhibited on this vCPU + */ + bool (*vcpu_has_apicv_inhibit_condition)(struct kvm_vcpu *vcpu); }; struct kvm_x86_nested_ops { @@ -1784,6 +1789,7 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva, bool kvm_apicv_activated(struct kvm *kvm); void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu); +bool vcpu_has_apicv_inhibit_condition(struct kvm_vcpu *vcpu); void kvm_request_apicv_update(struct kvm *kvm, bool activate, unsigned long bit); diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index c6072245f7fbb..8f23e7d239097 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -677,6 +677,12 @@ bool avic_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu) return false; } +bool avic_has_vcpu_inhibit_condition(struct kvm_vcpu *vcpu) +{ + return is_guest_mode(vcpu); +} + + static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi) { unsigned long flags; @@ -888,7 +894,6 @@ bool avic_check_apicv_inhibit_reasons(ulong bit) ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) | BIT(APICV_INHIBIT_REASON_ABSENT) | BIT(APICV_INHIBIT_REASON_HYPERV) | - BIT(APICV_INHIBIT_REASON_NESTED) | BIT(APICV_INHIBIT_REASON_IRQWIN) | BIT(APICV_INHIBIT_REASON_PIT_REINJ) | BIT(APICV_INHIBIT_REASON_X2APIC) | diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 39d280e7e80ef..ac9159b0618c7 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -551,11 +551,6 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) * exit_int_info, exit_int_info_err, next_rip, insn_len, insn_bytes. */ - /* - * Also covers avic_vapic_bar, avic_backing_page, avic_logical_id, - * avic_physical_id. - */ - WARN_ON(kvm_apicv_activated(svm->vcpu.kvm)); /* Copied from vmcb01. msrpm_base can be overwritten later. */ svm->vmcb->control.nested_ctl = svm->vmcb01.ptr->control.nested_ctl; @@ -659,6 +654,9 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, svm_set_gif(svm, true); + if (kvm_vcpu_apicv_active(vcpu)) + kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu); + return 0; } @@ -923,6 +921,13 @@ int nested_svm_vmexit(struct vcpu_svm *svm) if (unlikely(svm->vmcb->save.rflags & X86_EFLAGS_TF)) kvm_queue_exception(&(svm->vcpu), DB_VECTOR); + /* + * Un-inhibit the AVIC right away, so that other vCPUs can start + * to benefit from VM-exit less IPI right away + */ + if (kvm_apicv_activated(vcpu->kvm)) + kvm_vcpu_update_apicv(vcpu); + return 0; } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 18d4d87e12e15..85035324ed762 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1392,7 +1392,8 @@ static void svm_set_vintr(struct vcpu_svm *svm) /* * The following fields are ignored when AVIC is enabled */ - WARN_ON(kvm_apicv_activated(svm->vcpu.kvm)); + if (!is_guest_mode(&svm->vcpu)) + WARN_ON(kvm_apicv_activated(svm->vcpu.kvm)); svm_set_intercept(svm, INTERCEPT_VINTR); @@ -2898,10 +2899,16 @@ static int interrupt_window_interception(struct kvm_vcpu *vcpu) svm_clear_vintr(to_svm(vcpu)); /* - * For AVIC, the only reason to end up here is ExtINTs. + * If not running nested, for AVIC, the only reason to end up here is ExtINTs. * In this case AVIC was temporarily disabled for * requesting the IRQ window and we have to re-enable it. + * + * If running nested, still uninhibit the AVIC in case irq window + * was requested when it was not running nested. + * All vCPUs which run nested will have their AVIC still + * inhibited due to AVIC inhibition override for that. */ + kvm_request_apicv_update(vcpu->kvm, true, APICV_INHIBIT_REASON_IRQWIN); ++vcpu->stat.irq_window_exits; @@ -3451,8 +3458,16 @@ static void svm_enable_irq_window(struct kvm_vcpu *vcpu) * unless we have pending ExtINT since it cannot be injected * via AVIC. In such case, we need to temporarily disable AVIC, * and fallback to injecting IRQ via V_IRQ. + * + * If running nested, this vCPU will use separate page tables + * which don't have L1's AVIC mapped, and the AVIC is + * already inhibited thus there is no need for global + * AVIC inhibition. */ - kvm_request_apicv_update(vcpu->kvm, false, APICV_INHIBIT_REASON_IRQWIN); + + if (!is_guest_mode(vcpu)) + kvm_request_apicv_update(vcpu->kvm, false, APICV_INHIBIT_REASON_IRQWIN); + svm_set_vintr(svm); } } @@ -3927,14 +3942,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) if (guest_cpuid_has(vcpu, X86_FEATURE_X2APIC)) kvm_request_apicv_update(vcpu->kvm, false, APICV_INHIBIT_REASON_X2APIC); - - /* - * Currently, AVIC does not work with nested virtualization. - * So, we disable AVIC when cpuid for SVM is set in the L1 guest. - */ - if (nested && guest_cpuid_has(vcpu, X86_FEATURE_SVM)) - kvm_request_apicv_update(vcpu->kvm, false, - APICV_INHIBIT_REASON_NESTED); } init_vmcb_after_set_cpuid(vcpu); } @@ -4657,6 +4664,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = { .complete_emulated_msr = svm_complete_emulated_msr, .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector, + .vcpu_has_apicv_inhibit_condition = avic_has_vcpu_inhibit_condition, }; /* @@ -4840,6 +4848,7 @@ static __init int svm_hardware_setup(void) } else { svm_x86_ops.vcpu_blocking = NULL; svm_x86_ops.vcpu_unblocking = NULL; + svm_x86_ops.vcpu_has_apicv_inhibit_condition = NULL; } if (vls) { diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 83f9f95eced3e..c02903641d13d 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -580,6 +580,7 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq, void avic_vcpu_blocking(struct kvm_vcpu *vcpu); void avic_vcpu_unblocking(struct kvm_vcpu *vcpu); void avic_ring_doorbell(struct kvm_vcpu *vcpu); +bool avic_has_vcpu_inhibit_condition(struct kvm_vcpu *vcpu); /* sev.c */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8cb5390f75efe..63d84c373e465 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9697,6 +9697,14 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm) kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC); } +bool vcpu_has_apicv_inhibit_condition(struct kvm_vcpu *vcpu) +{ + if (kvm_x86_ops.vcpu_has_apicv_inhibit_condition) + return static_call(kvm_x86_vcpu_has_apicv_inhibit_condition)(vcpu); + else + return false; +} + void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu) { bool activate; @@ -9706,7 +9714,9 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu) down_read(&vcpu->kvm->arch.apicv_update_lock); - activate = kvm_apicv_activated(vcpu->kvm); + activate = kvm_apicv_activated(vcpu->kvm) && + !vcpu_has_apicv_inhibit_condition(vcpu); + if (vcpu->arch.apicv_active == activate) goto out; @@ -10110,7 +10120,11 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) * per-VM state, and responsing vCPUs must wait for the update * to complete before servicing KVM_REQ_APICV_UPDATE. */ - WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu)); + if (vcpu_has_apicv_inhibit_condition(vcpu)) + WARN_ON(kvm_vcpu_apicv_active(vcpu)); + else + WARN_ON_ONCE(kvm_apicv_activated(vcpu->kvm) != kvm_vcpu_apicv_active(vcpu)); + exit_fastpath = static_call(kvm_x86_vcpu_run)(vcpu); if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST)) From patchwork Mon Feb 7 15:54:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737577 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C15AEC3526C for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1392194AbiBGQEP (ORCPT ); Mon, 7 Feb 2022 11:04:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37962 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236669AbiBGP5q (ORCPT ); Mon, 7 Feb 2022 10:57:46 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id A5487C0401CE for ; Mon, 7 Feb 2022 07:57:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249464; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0HEZBYTVo91JCiV+9jLxhzD11zRa4eS0IgQUtngVtbM=; b=LWCoEBoNchRtbSZs7/DWehC9jZQMzm10VVlf3sUmhtm1YHqyvqf/Vs9H9LFGlcS6sSp9sS dPPFGs5ia2+edlQmROidhDfflbaCtyi5sUAXaBJoA85Nc/xUdWzDatl8JbKU5XXCHgE9EL tIZ3ChJQS3LRuDDNVoCBPCNhOQsApU8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-307-IDusY3QONkeS_JBFG2cWcw-1; Mon, 07 Feb 2022 10:57:41 -0500 X-MC-Unique: IDusY3QONkeS_JBFG2cWcw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6A4C4192D787; Mon, 7 Feb 2022 15:57:38 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 54A807DE38; Mon, 7 Feb 2022 15:57:12 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 13/30] KVM: x86: lapic: don't allow to change APIC ID when apic acceleration is enabled Date: Mon, 7 Feb 2022 17:54:30 +0200 Message-Id: <20220207155447.840194-14-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org No normal guest has any reason to change physical APIC IDs, and allowing this introduces bugs into APIC acceleration code. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/lapic.c | 28 ++++++++++++++++++++++------ 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index dd4e2888c244b..7ff695cab27b2 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2002,10 +2002,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val) switch (reg) { case APIC_ID: /* Local APIC ID */ - if (!apic_x2apic_mode(apic)) - kvm_apic_set_xapic_id(apic, val >> 24); - else + if (apic_x2apic_mode(apic)) { ret = 1; + break; + } + /* + * Don't allow setting APIC ID with any APIC acceleration + * enabled to avoid unexpected issues + */ + if (enable_apicv && ((val >> 24) != apic->vcpu->vcpu_id)) { + kvm_vm_bugged(apic->vcpu->kvm); + break; + } + + kvm_apic_set_xapic_id(apic, val >> 24); break; case APIC_TASKPRI: @@ -2572,10 +2582,16 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu) static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s, bool set) { - if (apic_x2apic_mode(vcpu->arch.apic)) { - u32 *id = (u32 *)(s->regs + APIC_ID); - u32 *ldr = (u32 *)(s->regs + APIC_LDR); + u32 *id = (u32 *)(s->regs + APIC_ID); + u32 *ldr = (u32 *)(s->regs + APIC_LDR); + if (!apic_x2apic_mode(vcpu->arch.apic)) { + /* Don't allow setting APIC ID with any APIC acceleration + * enabled to avoid unexpected issues + */ + if (enable_apicv && (*id >> 24) != vcpu->vcpu_id) + return -EINVAL; + } else { if (vcpu->kvm->arch.x2apic_format) { if (*id != vcpu->vcpu_id) return -EINVAL; From patchwork Mon Feb 7 15:54:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1ADCDC35273 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1392556AbiBGQEU (ORCPT ); Mon, 7 Feb 2022 11:04:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38048 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237278AbiBGP55 (ORCPT ); Mon, 7 Feb 2022 10:57:57 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 2AD0CC0401CC for ; Mon, 7 Feb 2022 07:57:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249476; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qZTGmWS8KYVLiOi++oL+/LS4m4MbwSz3cj1OP+bvfJo=; b=axatlc6oDssSSIRpaEQsMENvZjHg1qvxK9rYGEdybw173TQXZ2DuQebhGHEHdfWkKq+T9X bpKfQYGWFKMue75i7XIwGdf03ok9x5n+aOw8dfbkEUpaVyL880/4KCCW/Y4yLBTJFX5GOD +8Gj64Pjbi4Kh98Ujr/uVHV9rfws1q4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-56-Ma6qPu0sO2uPY4BMS1-Obw-1; Mon, 07 Feb 2022 10:57:51 -0500 X-MC-Unique: Ma6qPu0sO2uPY4BMS1-Obw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 339E61800D50; Mon, 7 Feb 2022 15:57:48 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id D7C2B7DE57; Mon, 7 Feb 2022 15:57:38 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 14/30] KVM: x86: lapic: don't allow to change local apic id when using older x2apic api Date: Mon, 7 Feb 2022 17:54:31 +0200 Message-Id: <20220207155447.840194-15-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KVM allowed to set non boot apic id via setting apic state if using older non x2apic 32 bit apic id userspace api. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/lapic.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 7ff695cab27b2..aeddd68d31181 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2592,15 +2592,15 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu, if (enable_apicv && (*id >> 24) != vcpu->vcpu_id) return -EINVAL; } else { - if (vcpu->kvm->arch.x2apic_format) { - if (*id != vcpu->vcpu_id) - return -EINVAL; - } else { - if (set) - *id >>= 24; - else - *id <<= 24; - } + + if (!vcpu->kvm->arch.x2apic_format && set) + *id >>= 24; + + if (*id != vcpu->vcpu_id) + return -EINVAL; + + if (!vcpu->kvm->arch.x2apic_format && !set) + *id <<= 24; /* In x2APIC mode, the LDR is fixed and based on the id */ if (set) From patchwork Mon Feb 7 15:54:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737574 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76DDDC4167D for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1389533AbiBGQEI (ORCPT ); Mon, 7 Feb 2022 11:04:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38230 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238566AbiBGP6i (ORCPT ); Mon, 7 Feb 2022 10:58:38 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 4BE58C0401CE for ; Mon, 7 Feb 2022 07:58:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249517; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WZW9E72jh7L2Mkny6o+vpEWPJ8Jc7Yzvnii4XswRCco=; b=TImUIBmmKlnPFJb+y34X9rn2bqcRQFZMmwrvA/e7hANyr7cupt1Zh6PDcqFa1YUQPS23Sl EQqCIbEf4qq7AGXFEML7X0Fs4xRT+TNTF6nA+7k67y4/cntvdY28VEoCb+T6BtJ3dKysDC NFDUGitD720N1IW8Wh4XWxf+6NZ/NSg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-102-ItZjqKbCO7yGGgBJ6tjfpA-1; Mon, 07 Feb 2022 10:58:34 -0500 X-MC-Unique: ItZjqKbCO7yGGgBJ6tjfpA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 293E6192FDB5; Mon, 7 Feb 2022 15:58:31 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id A20C67DE38; Mon, 7 Feb 2022 15:57:48 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 15/30] KVM: x86: SVM: remove avic's broken code that updated APIC ID Date: Mon, 7 Feb 2022 17:54:32 +0200 Message-Id: <20220207155447.840194-16-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Now that KVM doesn't allow to change APIC ID in case AVIC is enabled, remove buggy AVIC code that tried to do so. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/avic.c | 35 ----------------------------------- 1 file changed, 35 deletions(-) diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index 8f23e7d239097..768252b3dfee6 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -440,35 +440,6 @@ static int avic_handle_ldr_update(struct kvm_vcpu *vcpu) return ret; } -static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu) -{ - u64 *old, *new; - struct vcpu_svm *svm = to_svm(vcpu); - u32 id = kvm_xapic_id(vcpu->arch.apic); - - if (vcpu->vcpu_id == id) - return 0; - - old = avic_get_physical_id_entry(vcpu, vcpu->vcpu_id); - new = avic_get_physical_id_entry(vcpu, id); - if (!new || !old) - return 1; - - /* We need to move physical_id_entry to new offset */ - *new = *old; - *old = 0ULL; - to_svm(vcpu)->avic_physical_id_cache = new; - - /* - * Also update the guest physical APIC ID in the logical - * APIC ID table entry if already setup the LDR. - */ - if (svm->ldr_reg) - avic_handle_ldr_update(vcpu); - - return 0; -} - static void avic_handle_dfr_update(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -488,10 +459,6 @@ static int avic_unaccel_trap_write(struct vcpu_svm *svm) AVIC_UNACCEL_ACCESS_OFFSET_MASK; switch (offset) { - case APIC_ID: - if (avic_handle_apic_id_update(&svm->vcpu)) - return 0; - break; case APIC_LDR: if (avic_handle_ldr_update(&svm->vcpu)) return 0; @@ -584,8 +551,6 @@ int avic_init_vcpu(struct vcpu_svm *svm) void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu) { - if (avic_handle_apic_id_update(vcpu) != 0) - return; avic_handle_dfr_update(vcpu); avic_handle_ldr_update(vcpu); } From patchwork Mon Feb 7 15:54:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737576 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95ACCC41535 for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1390514AbiBGQEM (ORCPT ); Mon, 7 Feb 2022 11:04:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240006AbiBGP6r (ORCPT ); Mon, 7 Feb 2022 10:58:47 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 303D1C0401CE for ; Mon, 7 Feb 2022 07:58:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249526; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JIz72i323mNLi2DkTwdrDXZhWqpT5EG0EiEDO8eaza0=; b=Q2UMtfD/dqUHrm6Uq45HfDH27v2JGBDZjBGgqTuUdUYIn39LHUOtsCooloR5DDW1Xn+ZiA Jrx1QaTEBwxC5M/0i6JZ5M3A2QZ5vpHu73PH+KjvRQ3eDO/cVtJ+pKgkGuFSnGDKqp7Byq D7cpXjC06KpvQbpR5HuNcRjyz4Gopm4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-624-1uywd_D1N1C3NLAYKqlggA-1; Mon, 07 Feb 2022 10:58:43 -0500 X-MC-Unique: 1uywd_D1N1C3NLAYKqlggA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E95F4814503; Mon, 7 Feb 2022 15:58:39 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 896937DE5C; Mon, 7 Feb 2022 15:58:31 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 16/30] KVM: x86: SVM: allow to force AVIC to be enabled Date: Mon, 7 Feb 2022 17:54:33 +0200 Message-Id: <20220207155447.840194-17-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Apparently on some systems AVIC is disabled in CPUID but still usable. Allow the user to override the CPUID if the user is willing to take the risk. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/svm.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 85035324ed762..b88ca7f07a0fc 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -202,6 +202,9 @@ module_param(tsc_scaling, int, 0444); static bool avic; module_param(avic, bool, 0444); +static bool force_avic; +module_param_unsafe(force_avic, bool, 0444); + bool __read_mostly dump_invalid_vmcb; module_param(dump_invalid_vmcb, bool, 0644); @@ -4839,10 +4842,14 @@ static __init int svm_hardware_setup(void) nrips = false; } - enable_apicv = avic = avic && npt_enabled && boot_cpu_has(X86_FEATURE_AVIC); + enable_apicv = avic = avic && npt_enabled && (boot_cpu_has(X86_FEATURE_AVIC) || force_avic); if (enable_apicv) { - pr_info("AVIC enabled\n"); + if (!boot_cpu_has(X86_FEATURE_AVIC)) { + pr_warn("AVIC is not supported in CPUID but force enabled"); + pr_warn("Your system might crash and burn"); + } else + pr_info("AVIC enabled\n"); amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier); } else { From patchwork Mon Feb 7 15:54:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737568 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EED2BC433F5 for ; Mon, 7 Feb 2022 16:05:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1388211AbiBGQD7 (ORCPT ); Mon, 7 Feb 2022 11:03:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38342 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240343AbiBGP64 (ORCPT ); Mon, 7 Feb 2022 10:58:56 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id B5987C0401CC for ; Mon, 7 Feb 2022 07:58:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249534; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ihOxmZAbTpzWMc2NwiHn3Gv0b20etgSCi1QvjMhiZ/g=; b=P/Iqul7J6eAZ6biGoYUxKm+F3khAX+sf8TAIVUebXC0f5oOgBnZUV+GA35ShRi7nbpisij nU20lkFQ3nIUvDBJYNcxzemE4cAj6xnLvKuJoJJlQd3EmP/InM5U5PPKsvpThqNYqDq340 tmAxSMcZt1d8/Mrx07EMo8tj1h1nOq8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-65-skuDRrX2O5modn52iyUwQQ-1; Mon, 07 Feb 2022 10:58:51 -0500 X-MC-Unique: skuDRrX2O5modn52iyUwQQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8ED8D84B9D9; Mon, 7 Feb 2022 15:58:47 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id B597C92303; Mon, 7 Feb 2022 15:58:39 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 17/30] KVM: x86: mmu: trace kvm_mmu_set_spte after the new SPTE was set Date: Mon, 7 Feb 2022 17:54:34 +0200 Message-Id: <20220207155447.840194-18-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org It makes more sense to print new SPTE value than the old value. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/mmu/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 296f8723f9ae9..43c7abdd6b70f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2708,8 +2708,8 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, if (*sptep == spte) { ret = RET_PF_SPURIOUS; } else { - trace_kvm_mmu_set_spte(level, gfn, sptep); flush |= mmu_spte_update(sptep, spte); + trace_kvm_mmu_set_spte(level, gfn, sptep); } if (wrprot) { From patchwork Mon Feb 7 15:54:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737584 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D394EC4707A for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1392243AbiBGQEQ (ORCPT ); Mon, 7 Feb 2022 11:04:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242977AbiBGP7N (ORCPT ); Mon, 7 Feb 2022 10:59:13 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id D85D5C0401CE for ; Mon, 7 Feb 2022 07:59:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249552; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PA0ZlhnyV8+URLprs6c44r6sDD4SwJewRmeM2u7HZ1o=; b=jMPFZ/gt5tFdRYiobreHN4+W4V4Y6O0LMq1QmlWHrXenrJTzEkx4r7NNva81pFaKSxonzL BXjrhG0rhxCxDGS5fJ+wX8diE0qqq/oU3BWFblCVphn8ZfwP5ZPVuVMr0tJaTN1rvIqa9W m9DOXhKj0wZu4YrreQlSnaLv0Rmwi6M= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-534-CJQeKoKLORqhqrfN0EJQcg-1; Mon, 07 Feb 2022 10:59:08 -0500 X-MC-Unique: CJQeKoKLORqhqrfN0EJQcg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 1522F3E743; Mon, 7 Feb 2022 15:58:56 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 923C89652D; Mon, 7 Feb 2022 15:58:47 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 18/30] KVM: x86: mmu: add strict mmu mode Date: Mon, 7 Feb 2022 17:54:35 +0200 Message-Id: <20220207155447.840194-19-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add an (mostly debug) option to force KVM's shadow mmu to never have unsync pages. This is useful in some cases to debug it. It is also useful for some legacy guest OSes which don't flush TLBs correctly, and thus don't work on modern CPUs which have speculative MMUs. Using this option together with legacy paging (npt/ept=0) allows to correctly simulate such old MMU while still getting most of the benefits of the virtualization. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/mmu/mmu.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 43c7abdd6b70f..fa2da6990703f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -91,6 +91,10 @@ __MODULE_PARM_TYPE(nx_huge_pages_recovery_period_ms, "uint"); static bool __read_mostly force_flush_and_sync_on_reuse; module_param_named(flush_on_reuse, force_flush_and_sync_on_reuse, bool, 0644); + +bool strict_mmu; +module_param(strict_mmu, bool, 0644); + /* * When setting this variable to true it enables Two-Dimensional-Paging * where the hardware walks 2 page tables: @@ -2703,7 +2707,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, } wrprot = make_spte(vcpu, sp, slot, pte_access, gfn, pfn, *sptep, prefetch, - true, host_writable, &spte); + !strict_mmu, host_writable, &spte); if (*sptep == spte) { ret = RET_PF_SPURIOUS; @@ -5139,6 +5143,11 @@ static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa, */ static bool detect_write_flooding(struct kvm_mmu_page *sp) { + /* + * When using non speculating MMU, use a bit higher threshold + * for write flood detection + */ + int threshold = strict_mmu ? 10 : 3; /* * Skip write-flooding detected for the sp whose level is 1, because * it can become unsync, then the guest page is not write-protected. @@ -5147,7 +5156,7 @@ static bool detect_write_flooding(struct kvm_mmu_page *sp) return false; atomic_inc(&sp->write_flooding_count); - return atomic_read(&sp->write_flooding_count) >= 3; + return atomic_read(&sp->write_flooding_count) >= threshold; } /* From patchwork Mon Feb 7 15:54:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737588 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD418C47084 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442440AbiBGQEg (ORCPT ); Mon, 7 Feb 2022 11:04:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243917AbiBGP7V (ORCPT ); Mon, 7 Feb 2022 10:59:21 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 50AD7C0401CE for ; Mon, 7 Feb 2022 07:59:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249560; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mXBdRvvYHg1PpTVknSy/2IjwX4+D+xql9YO4/ocE+w4=; b=RlXos7GjLEqjUQPEskvJ3FjX/lxEoEllk36dAvNU5zRsZv1P9z3Z0bJsuImOxdjEOOd4X4 b1xLoH4DaKTkp/vag6tYhvEi9FKqotuFjILrAT6MUv0XownxsfIBbLOSkCQolY8AThN5Xr 437xZMg23ZupgHKvpf2Y8d2uTcXqPHk= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-512-sVC76GOdNYa7KW-r-Qp2Hg-1; Mon, 07 Feb 2022 10:59:17 -0500 X-MC-Unique: sVC76GOdNYa7KW-r-Qp2Hg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id BDF88192CC6B; Mon, 7 Feb 2022 15:59:03 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 60561B1881; Mon, 7 Feb 2022 15:58:55 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 19/30] KVM: x86: mmu: add gfn_in_memslot helper Date: Mon, 7 Feb 2022 17:54:36 +0200 Message-Id: <20220207155447.840194-20-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This is a tiny refactoring, and can be useful to check if a GPA/GFN is within a memslot a bit more cleanly. Signed-off-by: Maxim Levitsky --- include/linux/kvm_host.h | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index b3810976a27f8..483681c6e322e 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1564,6 +1564,13 @@ int kvm_request_irq_source_id(struct kvm *kvm); void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id); bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args); + +static inline bool gfn_in_memslot(struct kvm_memory_slot *slot, gfn_t gfn) +{ + return (gfn >= slot->base_gfn && gfn < slot->base_gfn + slot->npages); +} + + /* * Returns a pointer to the memslot if it contains gfn. * Otherwise returns NULL. @@ -1574,12 +1581,13 @@ try_get_memslot(struct kvm_memory_slot *slot, gfn_t gfn) if (!slot) return NULL; - if (gfn >= slot->base_gfn && gfn < slot->base_gfn + slot->npages) + if (gfn_in_memslot(slot, gfn)) return slot; else return NULL; } + /* * Returns a pointer to the memslot that contains gfn. Otherwise returns NULL. * From patchwork Mon Feb 7 15:54:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737565 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1DA3C433F5 for ; Mon, 7 Feb 2022 16:03:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235334AbiBGQC6 (ORCPT ); Mon, 7 Feb 2022 11:02:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38542 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245429AbiBGP7Z (ORCPT ); Mon, 7 Feb 2022 10:59:25 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id C2949C0401CC for ; Mon, 7 Feb 2022 07:59:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249563; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XmdC6IIj3xv5TnY6QoReJ3NyIf+jyplwQ94vtfIKROs=; b=MN2LqPHCRg/gr/aTBSQWdJ0yTnM0DyB8L9BpjPAcx8R9X/NCzuXHYPZmOvR5uein2ilaqO eQwH/RJcy0B0A5NL7O8ztrxtuFDWaGsKjRpTZY5p/vUrCFb8+Hi5TDc0X37jRqv2Rgzf7B /eaPe7dNgU4gDgcLSaG0PwngIy9rUyc= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-649-Te0PGfX_MCe-jM6NN5EXXQ-1; Mon, 07 Feb 2022 10:59:20 -0500 X-MC-Unique: Te0PGfX_MCe-jM6NN5EXXQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9477D83DD36; Mon, 7 Feb 2022 15:59:11 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 347C97DE4D; Mon, 7 Feb 2022 15:59:03 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 20/30] KVM: x86: mmu: allow to enable write tracking externally Date: Mon, 7 Feb 2022 17:54:37 +0200 Message-Id: <20220207155447.840194-21-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This will be used to enable write tracking from nested AVIC code and can also be used to enable write tracking in GVT-g module when it actually uses it as opposed to always enabling it, when the module is compiled in the kernel. No functional change intended. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/kvm_page_track.h | 1 + arch/x86/kvm/mmu.h | 8 +++++--- arch/x86/kvm/mmu/mmu.c | 16 +++++++++------- arch/x86/kvm/mmu/page_track.c | 10 ++++++++-- 5 files changed, 24 insertions(+), 13 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 256539c0481c5..428ab1cc7dd34 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1225,7 +1225,7 @@ struct kvm_arch { * is used as one input when determining whether certain memslot * related allocations are necessary. */ - bool shadow_root_allocated; + bool mmu_page_tracking_enabled; #if IS_ENABLED(CONFIG_HYPERV) hpa_t hv_root_tdp; diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index eb186bc57f6a9..955a5ae07b10e 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -50,6 +50,7 @@ int kvm_page_track_init(struct kvm *kvm); void kvm_page_track_cleanup(struct kvm *kvm); bool kvm_page_track_write_tracking_enabled(struct kvm *kvm); +int kvm_page_track_write_tracking_enable(struct kvm *kvm); int kvm_page_track_write_tracking_alloc(struct kvm_memory_slot *slot); void kvm_page_track_free_memslot(struct kvm_memory_slot *slot); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 51faa2c76ca5f..48cc042f17466 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -267,7 +267,7 @@ int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); int kvm_mmu_post_init_vm(struct kvm *kvm); void kvm_mmu_pre_destroy_vm(struct kvm *kvm); -static inline bool kvm_shadow_root_allocated(struct kvm *kvm) +static inline bool mmu_page_tracking_enabled(struct kvm *kvm) { /* * Read shadow_root_allocated before related pointers. Hence, threads @@ -275,9 +275,11 @@ static inline bool kvm_shadow_root_allocated(struct kvm *kvm) * see the pointers. Pairs with smp_store_release in * mmu_first_shadow_root_alloc. */ - return smp_load_acquire(&kvm->arch.shadow_root_allocated); + return smp_load_acquire(&kvm->arch.mmu_page_tracking_enabled); } +int mmu_enable_write_tracking(struct kvm *kvm); + #ifdef CONFIG_X86_64 static inline bool is_tdp_mmu_enabled(struct kvm *kvm) { return kvm->arch.tdp_mmu_enabled; } #else @@ -286,7 +288,7 @@ static inline bool is_tdp_mmu_enabled(struct kvm *kvm) { return false; } static inline bool kvm_memslots_have_rmaps(struct kvm *kvm) { - return !is_tdp_mmu_enabled(kvm) || kvm_shadow_root_allocated(kvm); + return !is_tdp_mmu_enabled(kvm) || mmu_page_tracking_enabled(kvm); } static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index fa2da6990703f..431e02ba73690 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3384,7 +3384,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) return r; } -static int mmu_first_shadow_root_alloc(struct kvm *kvm) +int mmu_enable_write_tracking(struct kvm *kvm) { struct kvm_memslots *slots; struct kvm_memory_slot *slot; @@ -3394,21 +3394,20 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm) * Check if this is the first shadow root being allocated before * taking the lock. */ - if (kvm_shadow_root_allocated(kvm)) + if (mmu_page_tracking_enabled(kvm)) return 0; mutex_lock(&kvm->slots_arch_lock); /* Recheck, under the lock, whether this is the first shadow root. */ - if (kvm_shadow_root_allocated(kvm)) + if (mmu_page_tracking_enabled(kvm)) goto out_unlock; /* * Check if anything actually needs to be allocated, e.g. all metadata * will be allocated upfront if TDP is disabled. */ - if (kvm_memslots_have_rmaps(kvm) && - kvm_page_track_write_tracking_enabled(kvm)) + if (kvm_memslots_have_rmaps(kvm) && mmu_page_tracking_enabled(kvm)) goto out_success; for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { @@ -3438,7 +3437,7 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm) * all the related pointers are set. */ out_success: - smp_store_release(&kvm->arch.shadow_root_allocated, true); + smp_store_release(&kvm->arch.mmu_page_tracking_enabled, true); out_unlock: mutex_unlock(&kvm->slots_arch_lock); @@ -3475,7 +3474,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) } } - r = mmu_first_shadow_root_alloc(vcpu->kvm); + r = mmu_enable_write_tracking(vcpu->kvm); if (r) return r; @@ -5712,6 +5711,9 @@ void kvm_mmu_init_vm(struct kvm *kvm) node->track_write = kvm_mmu_pte_write; node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); + + if (IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) || !tdp_enabled) + mmu_enable_write_tracking(kvm); } void kvm_mmu_uninit_vm(struct kvm *kvm) diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 68eb1fb548b61..ce5735909e74c 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -21,10 +21,16 @@ bool kvm_page_track_write_tracking_enabled(struct kvm *kvm) { - return IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) || - !tdp_enabled || kvm_shadow_root_allocated(kvm); + return mmu_page_tracking_enabled(kvm); } +int kvm_page_track_write_tracking_enable(struct kvm *kvm) +{ + return mmu_enable_write_tracking(kvm); +} +EXPORT_SYMBOL_GPL(kvm_page_track_write_tracking_enable); + + void kvm_page_track_free_memslot(struct kvm_memory_slot *slot) { int i; From patchwork Mon Feb 7 15:54:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737593 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EC700C3527A for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442604AbiBGQEi (ORCPT ); Mon, 7 Feb 2022 11:04:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38636 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344116AbiBGP7e (ORCPT ); Mon, 7 Feb 2022 10:59:34 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id A5D89C0401D3 for ; Mon, 7 Feb 2022 07:59:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249571; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w984BAq0R1oM4sAFKCgHPFGUA43pzvjZKViTrxIXmwo=; b=IJWwb1AufLCleVD3JQj1MiERfTFc2BS98u1YmsncVywy0/pKNftqX4xLzpRd2ENtua3jTj VPqjjRztpS81uWPVnlN24oi4Mnm7Xka99ICAOVc3+ilO018C/nRvkLsnfeI8OgqkY4DjnD AO0VmjnR8hcQbOYagdKwvlQ/wRj6nyQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-163-O5YhTrTkOROtuwBe0TEI_A-1; Mon, 07 Feb 2022 10:59:25 -0500 X-MC-Unique: O5YhTrTkOROtuwBe0TEI_A-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 74A9B1800D50; Mon, 7 Feb 2022 15:59:19 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0DC207DE4D; Mon, 7 Feb 2022 15:59:11 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 21/30] x86: KVMGT: use kvm_page_track_write_tracking_enable Date: Mon, 7 Feb 2022 17:54:38 +0200 Message-Id: <20220207155447.840194-22-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This allows to enable the write tracking only when KVMGT is actually used and doesn't carry any penalty otherwise. Tested by booting a VM with a kvmgt mdev device. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/Kconfig | 3 --- arch/x86/kvm/mmu/mmu.c | 2 +- drivers/gpu/drm/i915/Kconfig | 1 - drivers/gpu/drm/i915/gvt/kvmgt.c | 5 +++++ 4 files changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index ebc8ce9ec9173..169f8833cd0d1 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -132,7 +132,4 @@ config KVM_MMU_AUDIT This option adds a R/W kVM module parameter 'mmu_audit', which allows auditing of KVM MMU events at runtime. -config KVM_EXTERNAL_WRITE_TRACKING - bool - endif # VIRTUALIZATION diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 431e02ba73690..e4e2fc8e7d7a5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5712,7 +5712,7 @@ void kvm_mmu_init_vm(struct kvm *kvm) node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); - if (IS_ENABLED(CONFIG_KVM_EXTERNAL_WRITE_TRACKING) || !tdp_enabled) + if (!tdp_enabled) mmu_enable_write_tracking(kvm); } diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 84b6fc70cbf52..bf041b26ffec3 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -126,7 +126,6 @@ config DRM_I915_GVT_KVMGT depends on DRM_I915_GVT depends on KVM depends on VFIO_MDEV - select KVM_EXTERNAL_WRITE_TRACKING default n help Choose this option if you want to enable KVMGT support for diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 20b82fb036f8c..64ced3c2bc550 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1916,6 +1916,7 @@ static int kvmgt_guest_init(struct mdev_device *mdev) struct intel_vgpu *vgpu; struct kvmgt_vdev *vdev; struct kvm *kvm; + int ret; vgpu = mdev_get_drvdata(mdev); if (handle_valid(vgpu->handle)) @@ -1931,6 +1932,10 @@ static int kvmgt_guest_init(struct mdev_device *mdev) if (__kvmgt_vgpu_exist(vgpu, kvm)) return -EEXIST; + ret = kvm_page_track_write_tracking_enable(kvm); + if (ret) + return ret; + info = vzalloc(sizeof(struct kvmgt_guest_info)); if (!info) return -ENOMEM; From patchwork Mon Feb 7 15:54:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737594 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0BC5EC433EF for ; Mon, 7 Feb 2022 16:05:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442644AbiBGQEj (ORCPT ); Mon, 7 Feb 2022 11:04:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38638 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344131AbiBGP7e (ORCPT ); Mon, 7 Feb 2022 10:59:34 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 226E2C0401CE for ; Mon, 7 Feb 2022 07:59:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249573; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=aV5jkqTeSUUXBrzTfKSBHIk3+2wrmrIK57QNIliC8wc=; b=PjTc8H8nQ9WpI7gaXPVwUVFqm3AqCzBkj+I46KP2WvRrrK94j0jXd56NYhJ24bRwogTCS+ xvVKyj1HUR+XN2YuRTDKynhwjomakuQ5OQb2emeChqnkcm59/zacvu26cYt2SUxhJf0MSH b9p0K9sEUlLMilgkI7zYLbcvryHvO8A= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-339-TDQxQXxnNtaAvtvuuWWbkg-1; Mon, 07 Feb 2022 10:59:30 -0500 X-MC-Unique: TDQxQXxnNtaAvtvuuWWbkg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id EB5C11842144; Mon, 7 Feb 2022 15:59:26 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id C192E7DE38; Mon, 7 Feb 2022 15:59:19 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 22/30] KVM: x86: nSVM: correctly virtualize LBR msrs when L2 is running Date: Mon, 7 Feb 2022 17:54:39 +0200 Message-Id: <20220207155447.840194-23-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When L2 is running without LBR virtualization, we should ensure that L1's LBR msrs continue to update as usual. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 11 +++++ arch/x86/kvm/svm/svm.c | 98 +++++++++++++++++++++++++++++++-------- arch/x86/kvm/svm/svm.h | 2 + 3 files changed, 92 insertions(+), 19 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index ac9159b0618c7..9f7bc7db08dd3 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -535,6 +535,9 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 svm->vcpu.arch.dr6 = svm->nested.save.dr6 | DR6_ACTIVE_LOW; vmcb_mark_dirty(svm->vmcb, VMCB_DR); } + + if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) + svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); } static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) @@ -587,6 +590,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.event_inj = svm->nested.ctl.event_inj; svm->vmcb->control.event_inj_err = svm->nested.ctl.event_inj_err; + svm->vmcb->control.virt_ext = svm->vmcb01.ptr->control.virt_ext & + LBR_CTL_ENABLE_MASK; + nested_svm_transition_tlb_flush(vcpu); /* Enter Guest-Mode */ @@ -852,6 +858,11 @@ int nested_svm_vmexit(struct vcpu_svm *svm) svm_switch_vmcb(svm, &svm->vmcb01); + if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { + svm_copy_lbrs(svm->nested.vmcb02.ptr, svm->vmcb); + svm_update_lbrv(vcpu); + } + /* * On vmexit the GIF is set to false and * no event can be injected in L1. diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index b88ca7f07a0fc..294e016f575a8 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -805,6 +805,17 @@ static void init_msrpm_offsets(void) } } +void svm_copy_lbrs(struct vmcb *from_vmcb, struct vmcb *to_vmcb) +{ + to_vmcb->save.dbgctl = from_vmcb->save.dbgctl; + to_vmcb->save.br_from = from_vmcb->save.br_from; + to_vmcb->save.br_to = from_vmcb->save.br_to; + to_vmcb->save.last_excp_from = from_vmcb->save.last_excp_from; + to_vmcb->save.last_excp_to = from_vmcb->save.last_excp_to; + + vmcb_mark_dirty(to_vmcb, VMCB_LBR); +} + static void svm_enable_lbrv(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -814,6 +825,10 @@ static void svm_enable_lbrv(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 1, 1); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1); + + /* Move the LBR msrs to the vmcb02 so that the guest can see them. */ + if (is_guest_mode(vcpu)) + svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); } static void svm_disable_lbrv(struct kvm_vcpu *vcpu) @@ -825,6 +840,63 @@ static void svm_disable_lbrv(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 0, 0); + + /* + * Move the LBR msrs back to the vmcb01 to avoid copying them + * on nested guest entries. + */ + if (is_guest_mode(vcpu)) + svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr); +} + +static int svm_get_lbr_msr(struct vcpu_svm *svm, u32 index) +{ + /* + * If the LBR virtualization is disabled, the LBR msrs are always + * kept in the vmcb01 to avoid copying them on nested guest entries. + * + * If nested, and the LBR virtualization is enabled/disabled, the msrs + * are moved between the vmcb01 and vmcb02 as needed. + */ + struct vmcb *vmcb = + (svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) ? + svm->vmcb : svm->vmcb01.ptr; + + switch (index) { + case MSR_IA32_DEBUGCTLMSR: + return vmcb->save.dbgctl; + case MSR_IA32_LASTBRANCHFROMIP: + return vmcb->save.br_from; + case MSR_IA32_LASTBRANCHTOIP: + return vmcb->save.br_to; + case MSR_IA32_LASTINTFROMIP: + return vmcb->save.last_excp_from; + case MSR_IA32_LASTINTTOIP: + return vmcb->save.last_excp_to; + default: + KVM_BUG(false, svm->vcpu.kvm, + "%s: Unknown MSR 0x%x", __func__, index); + return 0; + } +} + +void svm_update_lbrv(struct kvm_vcpu *vcpu) +{ + struct vcpu_svm *svm = to_svm(vcpu); + + bool enable_lbrv = svm_get_lbr_msr(svm, MSR_IA32_DEBUGCTLMSR) & + DEBUGCTLMSR_LBR; + + bool current_enable_lbrv = !!(svm->vmcb->control.virt_ext & + LBR_CTL_ENABLE_MASK); + + if (enable_lbrv == current_enable_lbrv) + return; + + if (enable_lbrv) + svm_enable_lbrv(vcpu); + else + svm_disable_lbrv(vcpu); } void disable_nmi_singlestep(struct vcpu_svm *svm) @@ -2591,25 +2663,12 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) case MSR_TSC_AUX: msr_info->data = svm->tsc_aux; break; - /* - * Nobody will change the following 5 values in the VMCB so we can - * safely return them on rdmsr. They will always be 0 until LBRV is - * implemented. - */ case MSR_IA32_DEBUGCTLMSR: - msr_info->data = svm->vmcb->save.dbgctl; - break; case MSR_IA32_LASTBRANCHFROMIP: - msr_info->data = svm->vmcb->save.br_from; - break; case MSR_IA32_LASTBRANCHTOIP: - msr_info->data = svm->vmcb->save.br_to; - break; case MSR_IA32_LASTINTFROMIP: - msr_info->data = svm->vmcb->save.last_excp_from; - break; case MSR_IA32_LASTINTTOIP: - msr_info->data = svm->vmcb->save.last_excp_to; + msr_info->data = svm_get_lbr_msr(svm, msr_info->index); break; case MSR_VM_HSAVE_PA: msr_info->data = svm->nested.hsave_msr; @@ -2840,12 +2899,13 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) if (data & DEBUGCTL_RESERVED_BITS) return 1; - svm->vmcb->save.dbgctl = data; - vmcb_mark_dirty(svm->vmcb, VMCB_LBR); - if (data & (1ULL<<0)) - svm_enable_lbrv(vcpu); + if (svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) + svm->vmcb->save.dbgctl = data; else - svm_disable_lbrv(vcpu); + svm->vmcb01.ptr->save.dbgctl = data; + + svm_update_lbrv(vcpu); + break; case MSR_VM_HSAVE_PA: /* diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index c02903641d13d..b83e06d5d942a 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -476,6 +476,8 @@ u32 svm_msrpm_offset(u32 msr); u32 *svm_vcpu_alloc_msrpm(void); void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm); void svm_vcpu_free_msrpm(u32 *msrpm); +void svm_copy_lbrs(struct vmcb *from_vmcb, struct vmcb *to_vmcb); +void svm_update_lbrv(struct kvm_vcpu *vcpu); int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer); void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); From patchwork Mon Feb 7 15:54:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737569 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50333C43217 for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1389324AbiBGQEG (ORCPT ); Mon, 7 Feb 2022 11:04:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38730 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345530AbiBGP7l (ORCPT ); Mon, 7 Feb 2022 10:59:41 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 19F48C0401CF for ; Mon, 7 Feb 2022 07:59:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249580; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MrKkOBVSy/FtXXgDcr9UR7sqO5t4KzWH4q1ibusmu34=; b=ANyMyKzYJ5y/ormvnOJTV0mK1+ZW0LeEVEUecftUNlWESDx0/jVMR0qUBVHMv75YG+hGtz wp3rCzNC9kQ/t8Ys6BN5zyGQJSgmDzkWXxw31I4W6IvmCThfEPj7bJnbaPV3AvypRzzcuO xNIkbU10CWNU0bvipw9IMIFYCFEjG/I= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-10--fg2iXI_MC6_5NetEWXWmA-1; Mon, 07 Feb 2022 10:59:37 -0500 X-MC-Unique: -fg2iXI_MC6_5NetEWXWmA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5927C81433E; Mon, 7 Feb 2022 15:59:34 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4A2097DE5C; Mon, 7 Feb 2022 15:59:27 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 23/30] KVM: x86: nSVM: implement nested LBR virtualization Date: Mon, 7 Feb 2022 17:54:40 +0200 Message-Id: <20220207155447.840194-24-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This was tested with kvm-unit-test that was developed for this purpose. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 21 +++++++++++++++++++-- arch/x86/kvm/svm/svm.c | 8 ++++++++ arch/x86/kvm/svm/svm.h | 1 + 3 files changed, 28 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 9f7bc7db08dd3..4a228a76b27d7 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -536,8 +536,19 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 vmcb_mark_dirty(svm->vmcb, VMCB_DR); } - if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) + if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + + /* Copy LBR related registers from vmcb12, + * but make sure that we only pick LBR enable bit from the guest. + */ + + svm_copy_lbrs(vmcb12, svm->vmcb); + svm->vmcb->save.dbgctl &= LBR_CTL_ENABLE_MASK; + svm_update_lbrv(&svm->vcpu); + + } else if (unlikely(svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK)) { svm_copy_lbrs(svm->vmcb01.ptr, svm->vmcb); + } } static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) @@ -592,6 +603,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.virt_ext = svm->vmcb01.ptr->control.virt_ext & LBR_CTL_ENABLE_MASK; + if (svm->lbrv_enabled) + svm->vmcb->control.virt_ext |= + (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK); nested_svm_transition_tlb_flush(vcpu); @@ -858,7 +872,10 @@ int nested_svm_vmexit(struct vcpu_svm *svm) svm_switch_vmcb(svm, &svm->vmcb01); - if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { + if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + svm_copy_lbrs(svm->nested.vmcb02.ptr, vmcb12); + svm_update_lbrv(vcpu); + } else if (unlikely(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK)) { svm_copy_lbrs(svm->nested.vmcb02.ptr, svm->vmcb); svm_update_lbrv(vcpu); } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 294e016f575a8..76aa6054d9db2 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -890,6 +890,10 @@ void svm_update_lbrv(struct kvm_vcpu *vcpu) bool current_enable_lbrv = !!(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK); + if (unlikely(is_guest_mode(vcpu) && svm->lbrv_enabled)) + if (unlikely(svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK)) + enable_lbrv = true; + if (enable_lbrv == current_enable_lbrv) return; @@ -3987,6 +3991,7 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) guest_cpuid_has(vcpu, X86_FEATURE_NRIPS); svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR); + svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV); svm_recalc_instruction_intercepts(vcpu, svm); @@ -4791,6 +4796,9 @@ static __init void svm_set_cpu_caps(void) if (tsc_scaling) kvm_cpu_cap_set(X86_FEATURE_TSCRATEMSR); + if (lbrv) + kvm_cpu_cap_set(X86_FEATURE_LBRV); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index b83e06d5d942a..0012ba5affcba 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -220,6 +220,7 @@ struct vcpu_svm { /* cached guest cpuid flags for faster access */ bool nrips_enabled : 1; bool tsc_scaling_enabled : 1; + bool lbrv_enabled : 1; u32 ldr_reg; u32 dfr_reg; From patchwork Mon Feb 7 15:54:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737590 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADF43C35276 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442411AbiBGQEd (ORCPT ); Mon, 7 Feb 2022 11:04:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1378178AbiBGP76 (ORCPT ); Mon, 7 Feb 2022 10:59:58 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 219F9C0401CE for ; Mon, 7 Feb 2022 07:59:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249597; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oU3s7CnCjp2O1VaC3lW5tk3dKh1EHix/EeWBDaZUEPI=; b=Hq3ku4E8WMfwOKfLn7zLujJ0u6zAcYzQ1YEmApnxnDWbeohVVknB3jTScnY/SEQ0LWEDYB kLGHx/AdJcTmZd9BKu1jbeqT2M49Tn6ZL8p+7KRjF2GAiwX0SWa3kwjgwZExfiyi2q2Fpo N+uD5dpog6XUrDbAEMlouNEhEdZHjpY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-17-kR5m2bS0MdWEc_u9IHssKg-1; Mon, 07 Feb 2022 10:59:54 -0500 X-MC-Unique: kR5m2bS0MdWEc_u9IHssKg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 955F7100D680; Mon, 7 Feb 2022 15:59:49 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id C6E917DE38; Mon, 7 Feb 2022 15:59:34 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 24/30] KVM: x86: nSVM: implement nested VMLOAD/VMSAVE Date: Mon, 7 Feb 2022 17:54:41 +0200 Message-Id: <20220207155447.840194-25-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This was tested by booting L1,L2,L3 (all Linux) and checking that no VMLOAD/VMSAVE vmexits happened. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 35 +++++++++++++++++++++++++++++------ arch/x86/kvm/svm/svm.c | 7 +++++++ arch/x86/kvm/svm/svm.h | 8 +++++++- 3 files changed, 43 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 4a228a76b27d7..bdcb23c76e89e 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -120,6 +120,20 @@ static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu) vcpu->arch.walk_mmu = &vcpu->arch.root_mmu; } +static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm) +{ + if (!svm->v_vmload_vmsave_enabled) + return true; + + if (!nested_npt_enabled(svm)) + return true; + + if (!(svm->nested.ctl.virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)) + return true; + + return false; +} + void recalc_intercepts(struct vcpu_svm *svm) { struct vmcb_control_area *c, *h; @@ -161,8 +175,17 @@ void recalc_intercepts(struct vcpu_svm *svm) if (!intercept_smi) vmcb_clr_intercept(c, INTERCEPT_SMI); - vmcb_set_intercept(c, INTERCEPT_VMLOAD); - vmcb_set_intercept(c, INTERCEPT_VMSAVE); + if (nested_vmcb_needs_vls_intercept(svm)) { + /* + * If the virtual VMLOAD/VMSAVE is not enabled for the L2, + * we must intercept these instructions to correctly + * emulate them in case L1 doesn't intercept them. + */ + vmcb_set_intercept(c, INTERCEPT_VMLOAD); + vmcb_set_intercept(c, INTERCEPT_VMSAVE); + } else { + WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)); + } } static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm) @@ -426,10 +449,7 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm, vmcb12->control.exit_int_info = exit_int_info; } -static inline bool nested_npt_enabled(struct vcpu_svm *svm) -{ - return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; -} + static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu) { @@ -607,6 +627,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) svm->vmcb->control.virt_ext |= (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK); + if (!nested_vmcb_needs_vls_intercept(svm)) + svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + nested_svm_transition_tlb_flush(vcpu); /* Enter Guest-Mode */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 76aa6054d9db2..0f068da098d9f 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1051,6 +1051,8 @@ static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu) set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_EIP, 0, 0); set_msr_interception(vcpu, svm->msrpm, MSR_IA32_SYSENTER_ESP, 0, 0); + + svm->v_vmload_vmsave_enabled = false; } else { /* * If hardware supports Virtual VMLOAD VMSAVE then enable it @@ -3993,6 +3995,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->tsc_scaling_enabled = tsc_scaling && guest_cpuid_has(vcpu, X86_FEATURE_TSCRATEMSR); svm->lbrv_enabled = lbrv && guest_cpuid_has(vcpu, X86_FEATURE_LBRV); + svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD); + svm_recalc_instruction_intercepts(vcpu, svm); /* For sev guests, the memory encryption bit is not reserved in CR3. */ @@ -4799,6 +4803,9 @@ static __init void svm_set_cpu_caps(void) if (lbrv) kvm_cpu_cap_set(X86_FEATURE_LBRV); + if (vls) + kvm_cpu_cap_set(X86_FEATURE_V_VMSAVE_VMLOAD); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 0012ba5affcba..e8ffd458a5575 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -217,10 +217,11 @@ struct vcpu_svm { unsigned int3_injected; unsigned long int3_rip; - /* cached guest cpuid flags for faster access */ + /* optional nested SVM features that are enabled for this guest */ bool nrips_enabled : 1; bool tsc_scaling_enabled : 1; bool lbrv_enabled : 1; + bool v_vmload_vmsave_enabled : 1; u32 ldr_reg; u32 dfr_reg; @@ -468,6 +469,11 @@ static inline bool gif_set(struct vcpu_svm *svm) return !!(svm->vcpu.arch.hflags & HF_GIF_MASK); } +static inline bool nested_npt_enabled(struct vcpu_svm *svm) +{ + return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; +} + /* svm.c */ #define MSR_INVALID 0xffffffffU From patchwork Mon Feb 7 15:54:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F75AC47080 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442165AbiBGQEa (ORCPT ); Mon, 7 Feb 2022 11:04:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1383313AbiBGQAc (ORCPT ); Mon, 7 Feb 2022 11:00:32 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id CC87EC0401CE for ; Mon, 7 Feb 2022 08:00:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249630; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uQ1uiq30ca/wtmdEmBkE1pSZd4AYgwDXu3pWHPnS4Sg=; b=DKthNpp6ft1+OIFATQB7lU0WK3Ozn7127Q9Lq2dw+7FBOMODuyN1tvKpXl6boZh7AH8BRC 2nX5CH+shdmUjJUW14J2fbNSeNFwUUy4NdnGJb8+xns/o5OIgygxHll6IbYMmYqUZiZIOm M3OmBN4+7sT7LkjCA9bOFqVIK9iFR0Q= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-669-go2mTwCUMJOBBfV8StMKZA-1; Mon, 07 Feb 2022 11:00:27 -0500 X-MC-Unique: go2mTwCUMJOBBfV8StMKZA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6588C8144E1; Mon, 7 Feb 2022 16:00:12 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0FC027DE38; Mon, 7 Feb 2022 15:59:49 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 25/30] KVM: x86: nSVM: support PAUSE filter threshold and count when cpu_pm=on Date: Mon, 7 Feb 2022 17:54:42 +0200 Message-Id: <20220207155447.840194-26-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Allow L1 to use these settings if L0 disables PAUSE interception (AKA cpu_pm=on) Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 6 ++++++ arch/x86/kvm/svm/svm.c | 17 +++++++++++++++++ arch/x86/kvm/svm/svm.h | 2 ++ 3 files changed, 25 insertions(+) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index bdcb23c76e89e..601d38ae05cc6 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -630,6 +630,12 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) if (!nested_vmcb_needs_vls_intercept(svm)) svm->vmcb->control.virt_ext |= VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + if (svm->pause_filter_enabled) + svm->vmcb->control.pause_filter_count = svm->nested.ctl.pause_filter_count; + + if (svm->pause_threshold_enabled) + svm->vmcb->control.pause_filter_thresh = svm->nested.ctl.pause_filter_thresh; + nested_svm_transition_tlb_flush(vcpu); /* Enter Guest-Mode */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 0f068da098d9f..e49043807ec44 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -3997,6 +3997,17 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->v_vmload_vmsave_enabled = vls && guest_cpuid_has(vcpu, X86_FEATURE_V_VMSAVE_VMLOAD); + if (kvm_pause_in_guest(vcpu->kvm)) { + svm->pause_filter_enabled = pause_filter_count > 0 && + guest_cpuid_has(vcpu, X86_FEATURE_PAUSEFILTER); + + svm->pause_threshold_enabled = pause_filter_thresh > 0 && + guest_cpuid_has(vcpu, X86_FEATURE_PFTHRESHOLD); + } else { + svm->pause_filter_enabled = false; + svm->pause_threshold_enabled = false; + } + svm_recalc_instruction_intercepts(vcpu, svm); /* For sev guests, the memory encryption bit is not reserved in CR3. */ @@ -4806,6 +4817,12 @@ static __init void svm_set_cpu_caps(void) if (vls) kvm_cpu_cap_set(X86_FEATURE_V_VMSAVE_VMLOAD); + if (pause_filter_count) + kvm_cpu_cap_set(X86_FEATURE_PAUSEFILTER); + + if (pause_filter_thresh) + kvm_cpu_cap_set(X86_FEATURE_PFTHRESHOLD); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index e8ffd458a5575..297ec57f9941c 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -222,6 +222,8 @@ struct vcpu_svm { bool tsc_scaling_enabled : 1; bool lbrv_enabled : 1; bool v_vmload_vmsave_enabled : 1; + bool pause_filter_enabled : 1; + bool pause_threshold_enabled : 1; u32 ldr_reg; u32 dfr_reg; From patchwork Mon Feb 7 15:54:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737578 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3879C46467 for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1391940AbiBGQEO (ORCPT ); Mon, 7 Feb 2022 11:04:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385583AbiBGQA5 (ORCPT ); Mon, 7 Feb 2022 11:00:57 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id AAA53C0401CE for ; Mon, 7 Feb 2022 08:00:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249655; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0YZKFPfPAlzbRhVzmbaqcoTuZ3BdIzpFOausE6A/ZGI=; b=XSUntx453+VZjjGXsn173/BO/GCCZiiUU3gLH1dZQFQ7ihYdGfQtIc1/FLlKr/7Bv7fSmJ t2jch6QnC7Kn4GkiWmsY2OpqEmuuiXwwnxToBm8GcHbD41VmzqPyd5aQFuS6lYFHqVJEwI QZbnIGF4fr82bqZ4NEZwe9RwYMx8fGQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-227-DItzqBf_Nx6qyRMfwq5ONQ-1; Mon, 07 Feb 2022 11:00:52 -0500 X-MC-Unique: DItzqBf_Nx6qyRMfwq5ONQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 527A8100C611; Mon, 7 Feb 2022 16:00:48 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id D44AD7DE38; Mon, 7 Feb 2022 16:00:12 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 26/30] KVM: x86: nSVM: implement nested vGIF Date: Mon, 7 Feb 2022 17:54:43 +0200 Message-Id: <20220207155447.840194-27-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In case L1 enables vGIF for L2, the L2 cannot affect L1's GIF, regardless of STGI/CLGI intercepts, and since VM entry enables GIF, this means that L1's GIF is always 1 while L2 is running. Thus in this case leave L1's vGIF in vmcb01, while letting L2 control the vGIF thus implementing nested vGIF. Also allow KVM to toggle L1's GIF during nested entry/exit by always using vmcb01. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/svm/nested.c | 17 +++++++++++++---- arch/x86/kvm/svm/svm.c | 5 +++++ arch/x86/kvm/svm/svm.h | 25 +++++++++++++++++++++---- 3 files changed, 39 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 601d38ae05cc6..a426d4d3dcd82 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -408,6 +408,10 @@ void nested_sync_control_from_vmcb02(struct vcpu_svm *svm) */ mask &= ~V_IRQ_MASK; } + + if (nested_vgif_enabled(svm)) + mask |= V_GIF_MASK; + svm->nested.ctl.int_ctl &= ~mask; svm->nested.ctl.int_ctl |= svm->vmcb->control.int_ctl & mask; } @@ -573,10 +577,8 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *vmcb12 static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) { - const u32 int_ctl_vmcb01_bits = - V_INTR_MASKING_MASK | V_GIF_MASK | V_GIF_ENABLE_MASK; - - const u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK; + u32 int_ctl_vmcb01_bits = V_INTR_MASKING_MASK; + u32 int_ctl_vmcb12_bits = V_TPR_MASK | V_IRQ_INJECTION_BITS_MASK; struct kvm_vcpu *vcpu = &svm->vcpu; @@ -586,6 +588,13 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm) */ + + + if (svm->vgif_enabled && (svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK)) + int_ctl_vmcb12_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK); + else + int_ctl_vmcb01_bits |= (V_GIF_MASK | V_GIF_ENABLE_MASK); + /* Copied from vmcb01. msrpm_base can be overwritten later. */ svm->vmcb->control.nested_ctl = svm->vmcb01.ptr->control.nested_ctl; svm->vmcb->control.iopm_base_pa = svm->vmcb01.ptr->control.iopm_base_pa; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e49043807ec44..1cf682d1553cc 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4008,6 +4008,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) svm->pause_threshold_enabled = false; } + svm->vgif_enabled = vgif && guest_cpuid_has(vcpu, X86_FEATURE_VGIF); + svm_recalc_instruction_intercepts(vcpu, svm); /* For sev guests, the memory encryption bit is not reserved in CR3. */ @@ -4823,6 +4825,9 @@ static __init void svm_set_cpu_caps(void) if (pause_filter_thresh) kvm_cpu_cap_set(X86_FEATURE_PFTHRESHOLD); + if (vgif) + kvm_cpu_cap_set(X86_FEATURE_VGIF); + /* Nested VM can receive #VMEXIT instead of triggering #GP */ kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK); } diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 297ec57f9941c..73cc9d3e784bd 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -224,6 +224,7 @@ struct vcpu_svm { bool v_vmload_vmsave_enabled : 1; bool pause_filter_enabled : 1; bool pause_threshold_enabled : 1; + bool vgif_enabled : 1; u32 ldr_reg; u32 dfr_reg; @@ -442,31 +443,47 @@ static inline bool svm_is_intercept(struct vcpu_svm *svm, int bit) return vmcb_is_intercept(&svm->vmcb->control, bit); } +static bool nested_vgif_enabled(struct vcpu_svm *svm) +{ + if (!is_guest_mode(&svm->vcpu) || !svm->vgif_enabled) + return false; + return svm->nested.ctl.int_ctl & V_GIF_ENABLE_MASK; +} + static inline bool vgif_enabled(struct vcpu_svm *svm) { - return !!(svm->vmcb->control.int_ctl & V_GIF_ENABLE_MASK); + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + + return !!(vmcb->control.int_ctl & V_GIF_ENABLE_MASK); } static inline void enable_gif(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - svm->vmcb->control.int_ctl |= V_GIF_MASK; + vmcb->control.int_ctl |= V_GIF_MASK; else svm->vcpu.arch.hflags |= HF_GIF_MASK; } static inline void disable_gif(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - svm->vmcb->control.int_ctl &= ~V_GIF_MASK; + vmcb->control.int_ctl &= ~V_GIF_MASK; else svm->vcpu.arch.hflags &= ~HF_GIF_MASK; + } static inline bool gif_set(struct vcpu_svm *svm) { + struct vmcb *vmcb = nested_vgif_enabled(svm) ? svm->vmcb01.ptr : svm->vmcb; + if (vgif_enabled(svm)) - return !!(svm->vmcb->control.int_ctl & V_GIF_MASK); + return !!(vmcb->control.int_ctl & V_GIF_MASK); else return !!(svm->vcpu.arch.hflags & HF_GIF_MASK); } From patchwork Mon Feb 7 15:54:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737589 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2CBCAC4707F for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1441806AbiBGQEX (ORCPT ); Mon, 7 Feb 2022 11:04:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1386880AbiBGQBg (ORCPT ); Mon, 7 Feb 2022 11:01:36 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id EECC6C0401D1 for ; Mon, 7 Feb 2022 08:01:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249695; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0Xxoic2Sn/qFA/mhRCvHnS08xjpesyCCqOxwVNKsehI=; b=FG1ebdiDdUhq14BQ2zFh/f8/Yu1nQ8HZIHoEBUOjqYd1ZzScYGv+9980tZJQbsfyu2p+Rg cVhcTTcgzRGtSjcswY09+N3/fIXsYg79HdtRXCUN8Qy3A15s1PWk+Pub6xQJph7BTawTy4 YY8ToidN36MPW6sXC2JPw3ezG65vUm8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-488-U0zzcjwdOgqudp1ofBwlyw-1; Mon, 07 Feb 2022 11:01:31 -0500 X-MC-Unique: U0zzcjwdOgqudp1ofBwlyw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 49ED4814243; Mon, 7 Feb 2022 16:01:28 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id C07CA83796; Mon, 7 Feb 2022 16:00:48 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky , Borislav Petkov Subject: [PATCH RESEND 27/30] KVM: x86: add force_intercept_exceptions_mask Date: Mon, 7 Feb 2022 17:54:44 +0200 Message-Id: <20220207155447.840194-28-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This parameter will be used by VMX and SVM code to force interception of a set of exceptions, given by a bitmask for guest debug and/or kvm debug. This is based on an idea first shown here: https://patchwork.kernel.org/project/kvm/patch/20160301192822.GD22677@pd.tnic/ CC: Borislav Petkov Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 7 +++++++ arch/x86/kvm/x86.c | 9 +++++++++ arch/x86/kvm/x86.h | 5 +++++ 3 files changed, 21 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 428ab1cc7dd34..fa498612839a0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1168,6 +1168,13 @@ struct kvm_arch { struct kvm_pmu_event_filter __rcu *pmu_event_filter; struct task_struct *nx_lpage_recovery_thread; + /* + * Bitmask of exceptions that KVM will intercept + * and forward to the guest, even if that is not needed + * for normal operation. Debug feature. + */ + u32 force_intercept_exceptions_bitmask; + #ifdef CONFIG_X86_64 /* * Whether the TDP MMU is enabled for this VM. This contains a diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 63d84c373e465..202c34697852f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -193,6 +193,13 @@ module_param(enable_pmu, bool, 0444); bool __read_mostly eager_page_split = true; module_param(eager_page_split, bool, 0644); +/* + * force_intercept_exceptions_mask is a writable param and its value + * is snapshotted when a VM is created + */ +static uint force_intercept_exceptions_mask; +module_param(force_intercept_exceptions_mask, uint, S_IRUGO | S_IWUSR); + /* * Restoring the host value for MSRs that are only consumed when running in * usermode, e.g. SYSCALL MSRs and TSC_AUX, can be deferred until the CPU @@ -11646,6 +11653,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags); kvm->arch.guest_can_read_msr_platform_info = true; + kvm->arch.force_intercept_exceptions_bitmask = force_intercept_exceptions_mask; #if IS_ENABLED(CONFIG_HYPERV) spin_lock_init(&kvm->arch.hv_root_tdp_lock); @@ -12886,6 +12894,7 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size, } EXPORT_SYMBOL_GPL(kvm_sev_es_string_io); + EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index e9b303b21f173..34f96f483c7e5 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -91,6 +91,11 @@ static inline bool kvm_exception_is_soft(unsigned int nr) return (nr == BP_VECTOR) || (nr == OF_VECTOR); } +static inline bool kvm_is_exception_force_intercepted(struct kvm *kvm, int exception) +{ + return kvm->arch.force_intercept_exceptions_bitmask & BIT(exception); +} + static inline bool is_protmode(struct kvm_vcpu *vcpu) { return kvm_read_cr0_bits(vcpu, X86_CR0_PE); From patchwork Mon Feb 7 15:54:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1FE7C4707E for ; Mon, 7 Feb 2022 16:05:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1392415AbiBGQES (ORCPT ); Mon, 7 Feb 2022 11:04:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40714 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1351739AbiBGQB6 (ORCPT ); Mon, 7 Feb 2022 11:01:58 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 69C4DC0401D5 for ; Mon, 7 Feb 2022 08:01:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249716; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iV+sT7eBaIHlnSrQTGcmgC2l+6R2K6ll4jPmbZAHXmI=; b=gcI7kwW+aczBX1abYRClLqe6YF8XVZujFcvH9dtUj+D+4rE6Vn+vFwDDT0LZQwsn9bSUD6 TniJP3c7HFRdG6Iox77zgbr/qN0Nya10LqQ9lVYLZONxGNr0S7vamXfxSweq4NmF7hYAic QFemImNC2eJ2hrkAPoQBoV2qbsNGqAw= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-197-Vy5h3DWeNXi5I40VW26cAA-1; Mon, 07 Feb 2022 11:01:52 -0500 X-MC-Unique: Vy5h3DWeNXi5I40VW26cAA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0C38346860; Mon, 7 Feb 2022 16:01:48 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id B7A2B7DE38; Mon, 7 Feb 2022 16:01:28 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 28/30] KVM: SVM: implement force_intercept_exceptions_mask Date: Mon, 7 Feb 2022 17:54:45 +0200 Message-Id: <20220207155447.840194-29-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently #TS interception is only done once. Also exception interception is not enabled for SEV guests. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 2 + arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/svm/svm.c | 92 ++++++++++++++++++++++++++++++++- arch/x86/kvm/svm/svm.h | 5 +- arch/x86/kvm/svm/svm_onhyperv.c | 1 + arch/x86/kvm/x86.c | 5 +- 6 files changed, 101 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index fa498612839a0..446ee29e6cc99 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1750,6 +1750,8 @@ int kvm_emulate_rdpmc(struct kvm_vcpu *vcpu); void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr); void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code); void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long payload); +void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr, + u32 error_code, unsigned long payload); void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr); void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code); void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault); diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index bf6e96011dfed..d462b4808e893 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -32,6 +32,7 @@ #define MC_VECTOR 18 #define XM_VECTOR 19 #define VE_VECTOR 20 +#define CP_VECTOR 21 /* Select x86 specific features in */ #define __KVM_HAVE_PIT diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 1cf682d1553cc..afa4116ea938c 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -245,6 +245,8 @@ static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000}; #define MSRS_RANGE_SIZE 2048 #define MSRS_IN_RANGE (MSRS_RANGE_SIZE * 8 / 2) +static int svm_handle_invalid_exit(struct kvm_vcpu *vcpu, u64 exit_code); + u32 svm_msrpm_offset(u32 msr) { u32 offset; @@ -1035,6 +1037,16 @@ static void svm_recalc_instruction_intercepts(struct kvm_vcpu *vcpu, } } +static void svm_init_force_exceptions_intercepts(struct kvm_vcpu *vcpu) +{ + int exc; + struct vcpu_svm *svm = to_svm(vcpu); + + for (exc = 0 ; exc < 32 ; ++exc) + if (kvm_is_exception_force_intercepted(vcpu->kvm, exc)) + set_exception_intercept(svm, exc); +} + static inline void init_vmcb_after_set_cpuid(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); @@ -1235,6 +1247,8 @@ static void __svm_vcpu_reset(struct kvm_vcpu *vcpu) if (sev_es_guest(vcpu->kvm)) sev_es_vcpu_reset(svm); + else + svm_init_force_exceptions_intercepts(vcpu); } static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) @@ -1865,6 +1879,19 @@ static int pf_interception(struct kvm_vcpu *vcpu) u64 fault_address = svm->vmcb->control.exit_info_2; u64 error_code = svm->vmcb->control.exit_info_1; + + if (kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR)) { + if (npt_enabled && !vcpu->arch.apf.host_apf_flags) { + /* If the #PF was only intercepted for debug, inject + * it directly to the guest, since the kvm's mmu code + * is not ready to deal with such page faults. + */ + kvm_queue_exception_e_p(vcpu, PF_VECTOR, + error_code, fault_address); + return 1; + } + } + return kvm_handle_page_fault(vcpu, error_code, fault_address, static_cpu_has(X86_FEATURE_DECODEASSISTS) ? svm->vmcb->control.insn_bytes : NULL, @@ -1940,6 +1967,46 @@ static int ac_interception(struct kvm_vcpu *vcpu) return 1; } +static int gen_exc_interception(struct kvm_vcpu *vcpu) +{ + /* + * Generic exception intercept handler which forwards a guest exception + * as-is to the guest. + * For exceptions that don't have a special intercept handler. + * + * Used only for 'force_intercept_exceptions_mask' KVM debug feature. + */ + struct vcpu_svm *svm = to_svm(vcpu); + int exc = svm->vmcb->control.exit_code - SVM_EXIT_EXCP_BASE; + + if (!kvm_is_exception_force_intercepted(vcpu->kvm, exc)) + return svm_handle_invalid_exit(vcpu, svm->vmcb->control.exit_code); + + if (x86_exception_has_error_code(exc)) { + + if (exc == TS_VECTOR) { + /* + * SVM doesn't provide us with an error code to be able to + * re-inject the #TS exception, so just disable its + * intercept, and let the guest re-execute the instruction. + */ + vmcb_clr_intercept(&svm->vmcb01.ptr->control, + INTERCEPT_EXCEPTION_OFFSET + TS_VECTOR); + recalc_intercepts(svm); + return 1; + } else if (exc == DF_VECTOR) { + /* + * SVM doesn't provide the error code on #DF either + * but it is always 0 + */ + svm->vmcb->control.exit_info_1 = 0; + } + kvm_queue_exception_e(vcpu, exc, svm->vmcb->control.exit_info_1); + } else + kvm_queue_exception(vcpu, exc); + return 1; +} + static bool is_erratum_383(void) { int err, i; @@ -3050,13 +3117,34 @@ static int (*const svm_exit_handlers[])(struct kvm_vcpu *vcpu) = { [SVM_EXIT_WRITE_DR5] = dr_interception, [SVM_EXIT_WRITE_DR6] = dr_interception, [SVM_EXIT_WRITE_DR7] = dr_interception, + + /* 0 */ + [SVM_EXIT_EXCP_BASE + DE_VECTOR] = gen_exc_interception, [SVM_EXIT_EXCP_BASE + DB_VECTOR] = db_interception, + /* NMI*/ [SVM_EXIT_EXCP_BASE + BP_VECTOR] = bp_interception, + [SVM_EXIT_EXCP_BASE + OF_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + BR_VECTOR] = gen_exc_interception, [SVM_EXIT_EXCP_BASE + UD_VECTOR] = ud_interception, + [SVM_EXIT_EXCP_BASE + NM_VECTOR] = gen_exc_interception, + /* 8*/ + [SVM_EXIT_EXCP_BASE + DF_VECTOR] = gen_exc_interception, + /* 9 is reserved*/ + [SVM_EXIT_EXCP_BASE + TS_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + NP_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + SS_VECTOR] = gen_exc_interception, + [SVM_EXIT_EXCP_BASE + GP_VECTOR] = gp_interception, [SVM_EXIT_EXCP_BASE + PF_VECTOR] = pf_interception, - [SVM_EXIT_EXCP_BASE + MC_VECTOR] = mc_interception, + /* 15 is reserved*/ + /* 16 */ + [SVM_EXIT_EXCP_BASE + MF_VECTOR] = gen_exc_interception, [SVM_EXIT_EXCP_BASE + AC_VECTOR] = ac_interception, - [SVM_EXIT_EXCP_BASE + GP_VECTOR] = gp_interception, + [SVM_EXIT_EXCP_BASE + MC_VECTOR] = mc_interception, + [SVM_EXIT_EXCP_BASE + XM_VECTOR] = gen_exc_interception, + /* 20 - #VE - reserved on AMD*/ + [SVM_EXIT_EXCP_BASE + CP_VECTOR] = gen_exc_interception, + /* TODO: exceptions 22-31 */ + [SVM_EXIT_INTR] = intr_interception, [SVM_EXIT_NMI] = nmi_interception, [SVM_EXIT_SMI] = smi_interception, diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 73cc9d3e784bd..dd3671d77258b 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -415,8 +415,11 @@ static inline void clr_exception_intercept(struct vcpu_svm *svm, u32 bit) struct vmcb *vmcb = svm->vmcb01.ptr; WARN_ON_ONCE(bit >= 32); - vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit); + if (kvm_is_exception_force_intercepted(svm->vcpu.kvm, bit)) + return; + + vmcb_clr_intercept(&vmcb->control, INTERCEPT_EXCEPTION_OFFSET + bit); recalc_intercepts(svm); } diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c index 98aa981c04ec5..81be254e757b2 100644 --- a/arch/x86/kvm/svm/svm_onhyperv.c +++ b/arch/x86/kvm/svm/svm_onhyperv.c @@ -8,6 +8,7 @@ #include +#include "x86.h" #include "svm.h" #include "svm_ops.h" diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 202c34697852f..0ee2fbb068b17 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -706,12 +706,13 @@ void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, } EXPORT_SYMBOL_GPL(kvm_queue_exception_p); -static void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr, - u32 error_code, unsigned long payload) +void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr, + u32 error_code, unsigned long payload) { kvm_multiple_exception(vcpu, nr, true, error_code, true, payload, false); } +EXPORT_SYMBOL_GPL(kvm_queue_exception_e_p); int kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err) { From patchwork Mon Feb 7 15:54:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737587 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9EA7CC47081 for ; Mon, 7 Feb 2022 16:05:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442248AbiBGQEc (ORCPT ); Mon, 7 Feb 2022 11:04:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41072 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245655AbiBGQCa (ORCPT ); Mon, 7 Feb 2022 11:02:30 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 93D3AC0401CF for ; Mon, 7 Feb 2022 08:02:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249748; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PdxSxSRMwi46phQmQDHMqytXBvrQerh2Lq0bkWEDmYc=; b=WQGlxGP55SFLgBZO6hFtMSxQ9EWsmzJIs5I/AkvFRijH9izLBE+HTXHZhoGbE4YJ1oY1OM DFmQJv9l078s4/fQhWWTY5LekVM3Bz32eQ4ANNN8M7PaHVoyWm9jKFSH480GbS4e/zh/1Y TEgtE0bKfDjUPsDQW+b8ZMg92aGC0+w= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-580-y1zHmLZzOZ6GXX4QHvir3w-1; Mon, 07 Feb 2022 11:02:25 -0500 X-MC-Unique: y1zHmLZzOZ6GXX4QHvir3w-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5BAF3839A44; Mon, 7 Feb 2022 16:02:22 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id 79D757DE38; Mon, 7 Feb 2022 16:01:48 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 29/30] KVM: VMX: implement force_intercept_exceptions_mask Date: Mon, 7 Feb 2022 17:54:46 +0200 Message-Id: <20220207155447.840194-30-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org All exceptions are supported. Some bugs might remain in regard to KVM own interception of #PF but since this is strictly debug feature this should be OK. Signed-off-by: Maxim Levitsky --- arch/x86/kvm/vmx/nested.c | 8 +++++++ arch/x86/kvm/vmx/vmcs.h | 6 +++++ arch/x86/kvm/vmx/vmx.c | 47 +++++++++++++++++++++++++++++++++------ 3 files changed, 54 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index c73e4d938ddc3..e89b32b1d9efb 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -5902,6 +5902,14 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu, switch ((u16)exit_reason.basic) { case EXIT_REASON_EXCEPTION_NMI: intr_info = vmx_get_intr_info(vcpu); + + if (is_exception(intr_info)) { + int ex_no = intr_info & INTR_INFO_VECTOR_MASK; + + if (kvm_is_exception_force_intercepted(vcpu->kvm, ex_no)) + return true; + } + if (is_nmi(intr_info)) return true; else if (is_page_fault(intr_info)) diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h index e325c290a8162..d5aac5abe5cdd 100644 --- a/arch/x86/kvm/vmx/vmcs.h +++ b/arch/x86/kvm/vmx/vmcs.h @@ -94,6 +94,12 @@ static inline bool is_exception_n(u32 intr_info, u8 vector) return is_intr_type_n(intr_info, INTR_TYPE_HARD_EXCEPTION, vector); } +static inline bool is_exception(u32 intr_info) +{ + return is_intr_type(intr_info, INTR_TYPE_HARD_EXCEPTION) || + is_intr_type(intr_info, INTR_TYPE_SOFT_EXCEPTION); +} + static inline bool is_debug(u32 intr_info) { return is_exception_n(intr_info, DB_VECTOR); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index fc9c4eca90a78..aec2b962707a0 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -719,6 +719,7 @@ static u32 vmx_read_guest_seg_ar(struct vcpu_vmx *vmx, unsigned seg) void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) { u32 eb; + int exc; eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) | (1u << DB_VECTOR) | (1u << AC_VECTOR); @@ -749,7 +750,8 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) else { int mask = 0, match = 0; - if (enable_ept && (eb & (1u << PF_VECTOR))) { + if (enable_ept && (eb & (1u << PF_VECTOR)) && + !kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR)) { /* * If EPT is enabled, #PF is currently only intercepted * if MAXPHYADDR is smaller on the guest than on the @@ -772,6 +774,10 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) if (vcpu->arch.xfd_no_write_intercept) eb |= (1u << NM_VECTOR); + for (exc = 0 ; exc < 32 ; ++exc) + if (kvm_is_exception_force_intercepted(vcpu->kvm, exc) && exc != NMI_VECTOR) + eb |= (1u << exc); + vmcs_write32(EXCEPTION_BITMAP, eb); } @@ -4867,18 +4873,23 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE); if (!vmx->rmode.vm86_active && is_gp_fault(intr_info)) { - WARN_ON_ONCE(!enable_vmware_backdoor); - /* * VMware backdoor emulation on #GP interception only handles * IN{S}, OUT{S}, and RDPMC, none of which generate a non-zero * error code on #GP. */ - if (error_code) { + + if (enable_vmware_backdoor && !error_code) + return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP); + + if (!kvm_is_exception_force_intercepted(vcpu->kvm, GP_VECTOR)) + WARN_ON_ONCE(!enable_vmware_backdoor); + + if (intr_info & INTR_INFO_DELIVER_CODE_MASK) kvm_queue_exception_e(vcpu, GP_VECTOR, error_code); - return 1; - } - return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP); + else + kvm_queue_exception(vcpu, GP_VECTOR); + return 1; } /* @@ -4887,6 +4898,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) * See the comments in vmx_handle_exit. */ if ((vect_info & VECTORING_INFO_VALID_MASK) && + !kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR) && !(is_page_fault(intr_info) && !(error_code & PFERR_RSVD_MASK))) { vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_SIMUL_EX; @@ -4901,10 +4913,23 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) if (is_page_fault(intr_info)) { cr2 = vmx_get_exit_qual(vcpu); if (enable_ept && !vcpu->arch.apf.host_apf_flags) { + /* + * If we force intercept #PF and the page fault + * is due to the reason which we don't intercept, + * reflect it to the guest. + */ + if (kvm_is_exception_force_intercepted(vcpu->kvm, PF_VECTOR) && + (!allow_smaller_maxphyaddr || + !(error_code & PFERR_PRESENT_MASK) || + (error_code & PFERR_RSVD_MASK))) { + kvm_queue_exception_e_p(vcpu, PF_VECTOR, error_code, cr2); + return 1; + } /* * EPT will cause page fault only if we need to * detect illegal GPAs. */ + WARN_ON_ONCE(!allow_smaller_maxphyaddr); kvm_fixup_and_inject_pf_error(vcpu, cr2, error_code); return 1; @@ -4983,6 +5008,14 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) return 1; fallthrough; default: + if (kvm_is_exception_force_intercepted(vcpu->kvm, ex_no)) { + if (intr_info & INTR_INFO_DELIVER_CODE_MASK) + kvm_queue_exception_e(vcpu, ex_no, error_code); + else + kvm_queue_exception(vcpu, ex_no); + break; + } + kvm_run->exit_reason = KVM_EXIT_EXCEPTION; kvm_run->ex.exception = ex_no; kvm_run->ex.error_code = error_code; From patchwork Mon Feb 7 15:54:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Maxim Levitsky X-Patchwork-Id: 12737601 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 84B76C433EF for ; Mon, 7 Feb 2022 16:17:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241382AbiBGQQv (ORCPT ); Mon, 7 Feb 2022 11:16:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41592 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348696AbiBGQDA (ORCPT ); Mon, 7 Feb 2022 11:03:00 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id E91C5C0401D3 for ; Mon, 7 Feb 2022 08:02:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644249778; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5WIO5n6+3jqzKnCirxzx86dJnJZyXVHjiDs7Tj6xNnY=; b=OBZ3btoJKhxTb6cZtOVtshcPNzmTpWFi1sfsYhUxyZwff+jC+uufFzwav/Z9DkLMd6mtLq E5Altt/53vVCGEuC3XZmX0zdmCJv8NZUF3iuGsL+B+aDy5MKrYL5aO4HvZdZenBiQ8FD9I GjZdBQcu3v5piFSZs+rxHW22q+uoK6Y= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-10-vZy_U4-_P6ay5ymHJ7brZQ-1; Mon, 07 Feb 2022 11:02:56 -0500 X-MC-Unique: vZy_U4-_P6ay5ymHJ7brZQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7F40219251A0; Mon, 7 Feb 2022 16:02:53 +0000 (UTC) Received: from localhost.localdomain (unknown [10.40.192.15]) by smtp.corp.redhat.com (Postfix) with ESMTP id CCE5D8276B; Mon, 7 Feb 2022 16:02:22 +0000 (UTC) From: Maxim Levitsky To: kvm@vger.kernel.org Cc: Tony Luck , "Chang S. Bae" , Thomas Gleixner , Wanpeng Li , Ingo Molnar , Vitaly Kuznetsov , Pawan Gupta , Dave Hansen , Paolo Bonzini , linux-kernel@vger.kernel.org, Rodrigo Vivi , "H. Peter Anvin" , intel-gvt-dev@lists.freedesktop.org, Joonas Lahtinen , Joerg Roedel , Sean Christopherson , David Airlie , Zhi Wang , Brijesh Singh , Jim Mattson , x86@kernel.org, Daniel Vetter , Borislav Petkov , Zhenyu Wang , Kan Liang , Jani Nikula , Maxim Levitsky Subject: [PATCH RESEND 30/30] KVM: x86: get rid of KVM_REQ_GET_NESTED_STATE_PAGES Date: Mon, 7 Feb 2022 17:54:47 +0200 Message-Id: <20220207155447.840194-31-mlevitsk@redhat.com> In-Reply-To: <20220207155447.840194-1-mlevitsk@redhat.com> References: <20220207155447.840194-1-mlevitsk@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org As it turned out this request isn't really needed, and it complicates the nested migration. In theory this patch can break userspace if userspace relies on updating KVM's memslots after setting nested state but there is little reason for it to rely on this. However this is undocumented and there is a good chance that no userspace relies on this, thus just try to remove this code. Signed-off-by: Maxim Levitsky --- arch/x86/include/asm/kvm_host.h | 5 +- arch/x86/kvm/hyperv.c | 4 ++ arch/x86/kvm/svm/nested.c | 50 ++++------------- arch/x86/kvm/svm/svm.c | 2 +- arch/x86/kvm/svm/svm.h | 2 +- arch/x86/kvm/vmx/nested.c | 99 +++++++++------------------------ arch/x86/kvm/x86.c | 6 -- 7 files changed, 45 insertions(+), 123 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 446ee29e6cc99..fc2d5628ad930 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -92,7 +92,6 @@ #define KVM_REQ_HV_EXIT KVM_ARCH_REQ(21) #define KVM_REQ_HV_STIMER KVM_ARCH_REQ(22) #define KVM_REQ_LOAD_EOI_EXITMAP KVM_ARCH_REQ(23) -#define KVM_REQ_GET_NESTED_STATE_PAGES KVM_ARCH_REQ(24) #define KVM_REQ_APICV_UPDATE \ KVM_ARCH_REQ_FLAGS(25, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) #define KVM_REQ_TLB_FLUSH_CURRENT KVM_ARCH_REQ(26) @@ -1519,12 +1518,14 @@ struct kvm_x86_nested_ops { int (*set_state)(struct kvm_vcpu *vcpu, struct kvm_nested_state __user *user_kvm_nested_state, struct kvm_nested_state *kvm_state); - bool (*get_nested_state_pages)(struct kvm_vcpu *vcpu); int (*write_log_dirty)(struct kvm_vcpu *vcpu, gpa_t l2_gpa); int (*enable_evmcs)(struct kvm_vcpu *vcpu, uint16_t *vmcs_version); uint16_t (*get_evmcs_version)(struct kvm_vcpu *vcpu); + + bool (*get_evmcs_page)(struct kvm_vcpu *vcpu); + }; struct kvm_x86_init_ops { diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index dac41784f2b87..d297d102c0910 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -1497,6 +1497,10 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host) gfn_to_gpa(gfn) | KVM_MSR_ENABLED, sizeof(struct hv_vp_assist_page))) return 1; + + if (host && kvm_x86_ops.nested_ops->get_evmcs_page) + if (!kvm_x86_ops.nested_ops->get_evmcs_page(vcpu)) + return 1; break; } case HV_X64_MSR_EOI: diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index a426d4d3dcd82..ac813ad83d784 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -670,7 +670,7 @@ static void nested_svm_copy_common_state(struct vmcb *from_vmcb, struct vmcb *to } int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, - struct vmcb *vmcb12, bool from_vmrun) + struct vmcb *vmcb12) { struct vcpu_svm *svm = to_svm(vcpu); int ret; @@ -700,15 +700,13 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, nested_vmcb02_prepare_save(svm, vmcb12); ret = nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3, - nested_npt_enabled(svm), from_vmrun); + nested_npt_enabled(svm), true); if (ret) return ret; if (!npt_enabled) vcpu->arch.mmu->inject_page_fault = svm_inject_page_fault_nested; - if (!from_vmrun) - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); svm_set_gif(svm, true); @@ -779,7 +777,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) svm->nested.nested_run_pending = 1; - if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, true)) + if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12)) goto out_exit_err; if (nested_svm_vmrun_msrpm(svm)) @@ -863,8 +861,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm) svm->nested.vmcb12_gpa = 0; WARN_ON_ONCE(svm->nested.nested_run_pending); - kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); - /* in case we halted in L2 */ svm->vcpu.arch.mp_state = KVM_MP_STATE_RUNNABLE; @@ -1069,8 +1065,6 @@ void svm_leave_nested(struct kvm_vcpu *vcpu) nested_svm_uninit_mmu_context(vcpu); vmcb_mark_all_dirty(svm->vmcb); } - - kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); } static int nested_svm_exit_handled_msr(struct vcpu_svm *svm) @@ -1562,53 +1556,31 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, */ ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3, - nested_npt_enabled(svm), false); + nested_npt_enabled(svm), !vcpu->arch.pdptrs_from_userspace); if (WARN_ON_ONCE(ret)) goto out_free; - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); - ret = 0; -out_free: - kfree(save); - kfree(ctl); - - return ret; -} - -static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu) -{ - struct vcpu_svm *svm = to_svm(vcpu); - - if (WARN_ON(!is_guest_mode(vcpu))) - return true; - - if (!vcpu->arch.pdptrs_from_userspace && - !nested_npt_enabled(svm) && is_pae_paging(vcpu)) - /* - * Reload the guest's PDPTRs since after a migration - * the guest CR3 might be restored prior to setting the nested - * state which can lead to a load of wrong PDPTRs. - */ - if (CC(!load_pdptrs(vcpu, vcpu->arch.cr3))) - return false; - if (!nested_svm_vmrun_msrpm(svm)) { vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION; vcpu->run->internal.ndata = 0; - return false; + goto out_free; } - return true; + ret = 0; +out_free: + kfree(save); + kfree(ctl); + return ret; } + struct kvm_x86_nested_ops svm_nested_ops = { .leave_nested = svm_leave_nested, .check_events = svm_check_nested_events, .triple_fault = nested_svm_triple_fault, - .get_nested_state_pages = svm_get_nested_state_pages, .get_state = svm_get_nested_state, .set_state = svm_set_nested_state, }; diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index afa4116ea938c..6d6421e0cadcd 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4498,7 +4498,7 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) vmcb12 = map.hva; nested_copy_vmcb_control_to_cache(svm, &vmcb12->control); nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); - ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, false); + ret = enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12); if (ret) goto unmap_save; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index dd3671d77258b..e2eb91851e922 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -551,7 +551,7 @@ static inline bool nested_exit_on_nmi(struct vcpu_svm *svm) } int enter_svm_guest_mode(struct kvm_vcpu *vcpu, - u64 vmcb_gpa, struct vmcb *vmcb12, bool from_vmrun); + u64 vmcb_gpa, struct vmcb *vmcb12); void svm_leave_nested(struct kvm_vcpu *vcpu); void svm_free_nested(struct vcpu_svm *svm); int svm_allocate_nested(struct vcpu_svm *svm); diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index e89b32b1d9efb..19331f742662d 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -294,8 +294,6 @@ static void free_nested(struct kvm_vcpu *vcpu) if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon) return; - kvm_clear_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); - vmx->nested.vmxon = false; vmx->nested.smm.vmxon = false; vmx->nested.vmxon_ptr = INVALID_GPA; @@ -2593,7 +2591,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, /* Shadow page tables on either EPT or shadow page tables. */ if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12), - from_vmentry, entry_failure_code)) + from_vmentry || !vcpu->arch.pdptrs_from_userspace, + entry_failure_code)) return -EINVAL; /* @@ -3125,7 +3124,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu) return 0; } -static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) +bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -3161,18 +3160,6 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu) struct page *page; u64 hpa; - if (!vcpu->arch.pdptrs_from_userspace && - !nested_cpu_has_ept(vmcs12) && is_pae_paging(vcpu)) { - /* - * Reload the guest's PDPTRs since after a migration - * the guest CR3 might be restored prior to setting the nested - * state which can lead to a load of wrong PDPTRs. - */ - if (CC(!load_pdptrs(vcpu, vcpu->arch.cr3))) - return false; - } - - if (nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) { /* * Translate L1 physical address to host physical @@ -3254,25 +3241,6 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu) return true; } -static bool vmx_get_nested_state_pages(struct kvm_vcpu *vcpu) -{ - if (!nested_get_evmcs_page(vcpu)) { - pr_debug_ratelimited("%s: enlightened vmptrld failed\n", - __func__); - vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; - vcpu->run->internal.suberror = - KVM_INTERNAL_ERROR_EMULATION; - vcpu->run->internal.ndata = 0; - - return false; - } - - if (is_guest_mode(vcpu) && !nested_get_vmcs12_pages(vcpu)) - return false; - - return true; -} - static int nested_vmx_write_pml_buffer(struct kvm_vcpu *vcpu, gpa_t gpa) { struct vmcs12 *vmcs12; @@ -3402,12 +3370,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, prepare_vmcs02_early(vmx, &vmx->vmcs01, vmcs12); - if (from_vmentry) { - if (unlikely(!nested_get_vmcs12_pages(vcpu))) { - vmx_switch_vmcs(vcpu, &vmx->vmcs01); - return NVMX_VMENTRY_KVM_INTERNAL_ERROR; - } + if (unlikely(!nested_get_vmcs12_pages(vcpu))) { + vmx_switch_vmcs(vcpu, &vmx->vmcs01); + return NVMX_VMENTRY_KVM_INTERNAL_ERROR; + } + if (from_vmentry) { if (nested_vmx_check_vmentry_hw(vcpu)) { vmx_switch_vmcs(vcpu, &vmx->vmcs01); return NVMX_VMENTRY_VMFAIL; @@ -3429,24 +3397,14 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu, goto vmentry_fail_vmexit_guest_mode; } - if (from_vmentry) { - failed_index = nested_vmx_load_msr(vcpu, - vmcs12->vm_entry_msr_load_addr, - vmcs12->vm_entry_msr_load_count); - if (failed_index) { - exit_reason.basic = EXIT_REASON_MSR_LOAD_FAIL; - vmcs12->exit_qualification = failed_index; - goto vmentry_fail_vmexit_guest_mode; - } - } else { - /* - * The MMU is not initialized to point at the right entities yet and - * "get pages" would need to read data from the guest (i.e. we will - * need to perform gpa to hpa translation). Request a call - * to nested_get_vmcs12_pages before the next VM-entry. The MSRs - * have already been set at vmentry time and should not be reset. - */ - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); + + failed_index = nested_vmx_load_msr(vcpu, + vmcs12->vm_entry_msr_load_addr, + vmcs12->vm_entry_msr_load_count); + if (failed_index) { + exit_reason.basic = EXIT_REASON_MSR_LOAD_FAIL; + vmcs12->exit_qualification = failed_index; + goto vmentry_fail_vmexit_guest_mode; } /* @@ -4516,16 +4474,6 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason, /* Similarly, triple faults in L2 should never escape. */ WARN_ON_ONCE(kvm_check_request(KVM_REQ_TRIPLE_FAULT, vcpu)); - if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) { - /* - * KVM_REQ_GET_NESTED_STATE_PAGES is also used to map - * Enlightened VMCS after migration and we still need to - * do that when something is forcing L2->L1 exit prior to - * the first L2 run. - */ - (void)nested_get_evmcs_page(vcpu); - } - /* Service pending TLB flush requests for L2 before switching to L1. */ kvm_service_local_tlb_flush_requests(vcpu); @@ -6382,14 +6330,17 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu, set_current_vmptr(vmx, kvm_state->hdr.vmx.vmcs12_pa); } else if (kvm_state->flags & KVM_STATE_NESTED_EVMCS) { + + vmx->nested.hv_evmcs_vmptr = EVMPTR_MAP_PENDING; + /* * nested_vmx_handle_enlightened_vmptrld() cannot be called - * directly from here as HV_X64_MSR_VP_ASSIST_PAGE may not be - * restored yet. EVMCS will be mapped from - * nested_get_vmcs12_pages(). + * directly from here if HV_X64_MSR_VP_ASSIST_PAGE is not + * restored yet. EVMCS will be mapped when it is. */ - vmx->nested.hv_evmcs_vmptr = EVMPTR_MAP_PENDING; - kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu); + if (kvm_hv_assist_page_enabled(vcpu)) + nested_get_evmcs_page(vcpu); + } else { return -EINVAL; } @@ -6811,8 +6762,8 @@ struct kvm_x86_nested_ops vmx_nested_ops = { .triple_fault = nested_vmx_triple_fault, .get_state = vmx_get_nested_state, .set_state = vmx_set_nested_state, - .get_nested_state_pages = vmx_get_nested_state_pages, .write_log_dirty = nested_vmx_write_pml_buffer, .enable_evmcs = nested_enable_evmcs, .get_evmcs_version = nested_get_evmcs_version, + .get_evmcs_page = nested_get_evmcs_page, }; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 0ee2fbb068b17..48dd01fd7a1ec 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9897,12 +9897,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) r = -EIO; goto out; } - if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) { - if (unlikely(!kvm_x86_ops.nested_ops->get_nested_state_pages(vcpu))) { - r = 0; - goto out; - } - } if (kvm_check_request(KVM_REQ_MMU_RELOAD, vcpu)) kvm_mmu_unload(vcpu); if (kvm_check_request(KVM_REQ_MIGRATE_TIMER, vcpu))