From patchwork Wed Feb 9 17:00:09 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740494
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com, stable@vger.kernel.org
Subject: [PATCH 01/12] KVM: x86: host-initiated EFER.LME write affects the MMU
Date: Wed, 9 Feb 2022 12:00:09 -0500
Message-Id: <20220209170020.1775368-2-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

While the guest runs, EFER.LME cannot change unless CR0.PG is clear,
and therefore EFER.NX is the only bit that can affect the MMU role.
However, set_efer accepts a host-initiated change to EFER.LME even with
CR0.PG=1.  In that case, the MMU has to be reset.

Fixes: 11988499e62b ("KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini
Reviewed-by: Sean Christopherson
---
 arch/x86/kvm/mmu.h | 1 +
 arch/x86/kvm/x86.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 51faa2c76ca5..a5a50cfeffff 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -48,6 +48,7 @@
 			X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE)
 
 #define KVM_MMU_CR0_ROLE_BITS (X86_CR0_PG | X86_CR0_WP)
+#define KVM_MMU_EFER_ROLE_BITS (EFER_LME | EFER_NX)
 
 static __always_inline u64 rsvd_bits(int s, int e)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9a9006226501..5e1298aef9e2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1647,7 +1647,7 @@ static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	}
 
 	/* Update reserved bits */
-	if ((efer ^ old_efer) & EFER_NX)
+	if ((efer ^ old_efer) & KVM_MMU_EFER_ROLE_BITS)
 		kvm_mmu_reset_context(vcpu);
 
 	return 0;
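Aside (illustration, not part of the series): the MMU role changes here
because the paging mode itself changes.  A minimal sketch in C of how the
mode depends on CR0.PG, CR4.PAE and EFER.LME; the enum and helper names
are invented for this sketch, only the register bits come from the patch:

/* Illustrative only: why EFER.LME is a role bit when CR0.PG=1. */
enum paging_mode { PM_NONE, PM_32BIT, PM_PAE, PM_64BIT };

static enum paging_mode paging_mode(u64 cr0, u64 cr4, u64 efer)
{
	if (!(cr0 & X86_CR0_PG))
		return PM_NONE;		/* EFER.LME is latent while paging is off */
	if (efer & EFER_LME)
		return PM_64BIT;	/* flipping LME under CR0.PG=1 switches mode */
	return (cr4 & X86_CR4_PAE) ? PM_PAE : PM_32BIT;
}

A guest cannot flip LME while PM_64BIT/PM_PAE/PM_32BIT is active, but a
host-initiated MSR write can, which is the case the patch handles.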
From patchwork Wed Feb 9 17:00:10 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740495
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 02/12] KVM: MMU: move MMU role accessors to header
Date: Wed, 9 Feb 2022 12:00:10 -0500
Message-Id: <20220209170020.1775368-3-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

We will use is_cr0_pg in x86.c to check whether a page fault can be
delivered, so move the accessors from mmu.c to mmu.h.

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu.h     | 21 +++++++++++++++++++++
 arch/x86/kvm/mmu/mmu.c | 21 ---------------------
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index a5a50cfeffff..b9d06a218b2c 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -65,6 +65,27 @@ static __always_inline u64 rsvd_bits(int s, int e)
 	return ((2ULL << (e - s)) - 1) << s;
 }
 
+/*
+ * The MMU itself (with a valid role) is the single source of truth for the
+ * MMU.  Do not use the regs used to build the MMU/role, nor the vCPU.  The
+ * regs don't account for dependencies, e.g. clearing CR4 bits if CR0.PG=1,
+ * and the vCPU may be incorrect/irrelevant.
+ */
+#define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)			\
+static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
+{									\
+	return !!(mmu->mmu_role. base_or_ext . reg##_##name);		\
+}
+BUILD_MMU_ROLE_ACCESSOR(ext, cr0, pg);
+BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
+BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
+BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pae);
+BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smep);
+BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
+BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
+BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
+BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
+
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask,
 				u64 access_mask);
 void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 296f8723f9ae..e0c0f0bc2e8b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -223,27 +223,6 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(cr4, la57, X86_CR4_LA57);
 BUILD_MMU_ROLE_REGS_ACCESSOR(efer, nx, EFER_NX);
 BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA);
 
-/*
- * The MMU itself (with a valid role) is the single source of truth for the
- * MMU.  Do not use the regs used to build the MMU/role, nor the vCPU.  The
- * regs don't account for dependencies, e.g. clearing CR4 bits if CR0.PG=1,
- * and the vCPU may be incorrect/irrelevant.
- */
-#define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name)			\
-static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \
-{									\
-	return !!(mmu->mmu_role. base_or_ext . reg##_##name);		\
-}
-BUILD_MMU_ROLE_ACCESSOR(ext, cr0, pg);
-BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pae);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smep);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, smap);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pke);
-BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57);
-BUILD_MMU_ROLE_ACCESSOR(base, efer, nx);
-
 static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu_role_regs regs = {
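Aside (mechanical expansion, shown only to make the generated API
visible): the first invocation of the macro above expands to roughly the
following function, which is what patch 3 will call from x86.c.

/* BUILD_MMU_ROLE_ACCESSOR(ext, cr0, pg) generates: */
static inline bool __maybe_unused is_cr0_pg(struct kvm_mmu *mmu)
{
	return !!(mmu->mmu_role.ext.cr0_pg);
}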
From patchwork Wed Feb 9 17:00:11 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740504
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 03/12] KVM: x86: do not deliver asynchronous page faults if CR0.PG=0
Date: Wed, 9 Feb 2022 12:00:11 -0500
Message-Id: <20220209170020.1775368-4-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

Enabling async page faults is nonsensical if paging is disabled, but it
is allowed because CR0.PG=0 does not clear the async page fault MSR.
Just ignore them and only use the artificial halt state, similar to what
happens in guest mode if async #PF vmexits are disabled.

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5e1298aef9e2..98aca0f2af12 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12272,7 +12272,9 @@ static inline bool apf_pageready_slot_free(struct kvm_vcpu *vcpu)
 
 static bool kvm_can_deliver_async_pf(struct kvm_vcpu *vcpu)
 {
-	if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu))
+	if (is_guest_mode(vcpu)
+	    ? !vcpu->arch.apf.delivery_as_pf_vmexit
+	    : !is_cr0_pg(vcpu->arch.mmu))
 		return false;
 
 	if (!kvm_pv_async_pf_enabled(vcpu) ||
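Aside (illustration only): the new ternary condition is equivalent to
the following if/else form, which may be easier to read; the identifiers
are the ones used in the patch.

	if (is_guest_mode(vcpu)) {
		/* L2: deliver #PF only if async #PF vmexits are enabled. */
		if (!vcpu->arch.apf.delivery_as_pf_vmexit)
			return false;
	} else {
		/* L1: a page fault cannot be injected while CR0.PG=0. */
		if (!is_cr0_pg(vcpu->arch.mmu))
			return false;
	}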
From patchwork Wed Feb 9 17:00:12 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740493
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 04/12] KVM: MMU: WARN if PAE roots linger after kvm_mmu_unload
Date: Wed, 9 Feb 2022 12:00:12 -0500
Message-Id: <20220209170020.1775368-5-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e0c0f0bc2e8b..7b5765ced928 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5065,12 +5065,21 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	return r;
 }
 
+static void __kvm_mmu_unload(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
+{
+	int i;
+	kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOTS_ALL);
+	WARN_ON(VALID_PAGE(mmu->root_hpa));
+	if (mmu->pae_root) {
+		for (i = 0; i < 4; ++i)
+			WARN_ON(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
+	}
+}
+
 void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 {
-	kvm_mmu_free_roots(vcpu, &vcpu->arch.root_mmu, KVM_MMU_ROOTS_ALL);
-	WARN_ON(VALID_PAGE(vcpu->arch.root_mmu.root_hpa));
-	kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
-	WARN_ON(VALID_PAGE(vcpu->arch.guest_mmu.root_hpa));
+	__kvm_mmu_unload(vcpu, &vcpu->arch.root_mmu);
+	__kvm_mmu_unload(vcpu, &vcpu->arch.guest_mmu);
 }
 
 static bool need_remote_flush(u64 old, u64 new)
From patchwork Wed Feb 9 17:00:13 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740496
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 05/12] KVM: MMU: avoid NULL-pointer dereference on page freeing bugs
Date: Wed, 9 Feb 2022 12:00:13 -0500
Message-Id: <20220209170020.1775368-6-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

If kvm_mmu_free_roots encounters a PAE page table where a 64-bit page
table is expected, the result is a NULL pointer dereference.  Instead
just WARN and exit.

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7b5765ced928..d0f2077bd798 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3201,6 +3201,8 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
 		return;
 
 	sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK);
+	if (WARN_ON(!sp))
+		return;
 
 	if (is_tdp_mmu_page(sp))
 		kvm_tdp_mmu_put_root(kvm, sp, false);
From patchwork Wed Feb 9 17:00:14 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740497
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 06/12] KVM: MMU: rename kvm_mmu_reload
Date: Wed, 9 Feb 2022 12:00:14 -0500
Message-Id: <20220209170020.1775368-7-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

The name of kvm_mmu_reload is very confusing for two reasons: first,
KVM_REQ_MMU_RELOAD actually does not call it; second, it only does
anything if there is no valid root.  Rename it to
kvm_mmu_ensure_valid_root, which matches the actual behavior better.

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu.h | 2 +-
 arch/x86/kvm/x86.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index b9d06a218b2c..c9f1c2162ade 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -104,7 +104,7 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu);
 void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu);
 void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
 
-static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
+static inline int kvm_mmu_ensure_valid_root(struct kvm_vcpu *vcpu)
 {
 	if (likely(vcpu->arch.mmu->root_hpa != INVALID_PAGE))
 		return 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 98aca0f2af12..2685fb62807e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9976,7 +9976,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	r = kvm_mmu_reload(vcpu);
+	r = kvm_mmu_ensure_valid_root(vcpu);
 	if (unlikely(r)) {
 		goto cancel_injection;
 	}
@@ -12164,7 +12164,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 	    work->wakeup_all)
 		return;
 
-	r = kvm_mmu_reload(vcpu);
+	r = kvm_mmu_ensure_valid_root(vcpu);
 	if (unlikely(r))
 		return;
From patchwork Wed Feb 9 17:00:15 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740501
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 07/12] KVM: x86: use struct kvm_mmu_root_info for mmu->root
Date: Wed, 9 Feb 2022 12:00:15 -0500
Message-Id: <20220209170020.1775368-8-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

The root_hpa and root_pgd fields form essentially a struct
kvm_mmu_root_info.  Use the struct to have more consistency between
mmu->root and mmu->prev_roots.

The patch is entirely search-and-replace except for cached_root_available,
which does not need a temporary struct kvm_mmu_root_info anymore.

Signed-off-by: Paolo Bonzini
Reviewed-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |  3 +-
 arch/x86/kvm/mmu.h              |  4 +-
 arch/x86/kvm/mmu/mmu.c          | 69 +++++++++++++++------------------
 arch/x86/kvm/mmu/mmu_audit.c    |  4 +-
 arch/x86/kvm/mmu/paging_tmpl.h  |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 arch/x86/kvm/mmu/tdp_mmu.h      |  2 +-
 arch/x86/kvm/vmx/nested.c       |  2 +-
 arch/x86/kvm/vmx/vmx.c          |  2 +-
 arch/x86/kvm/x86.c              |  2 +-
 10 files changed, 42 insertions(+), 50 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a0d2925b6651..6da9a460e584 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -432,8 +432,7 @@ struct kvm_mmu {
 	int (*sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp);
 	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
-	hpa_t root_hpa;
-	gpa_t root_pgd;
+	struct kvm_mmu_root_info root;
 	union kvm_mmu_role mmu_role;
 	u8 root_level;
 	u8 shadow_root_level;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index c9f1c2162ade..f896c438c8ee 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -106,7 +106,7 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
 
 static inline int kvm_mmu_ensure_valid_root(struct kvm_vcpu *vcpu)
 {
-	if (likely(vcpu->arch.mmu->root_hpa != INVALID_PAGE))
+	if (likely(vcpu->arch.mmu->root.hpa != INVALID_PAGE))
 		return 0;
 
 	return kvm_mmu_load(vcpu);
@@ -128,7 +128,7 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
 
 static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
 {
-	u64 root_hpa = vcpu->arch.mmu->root_hpa;
+	u64 root_hpa = vcpu->arch.mmu->root.hpa;
 
 	if (!VALID_PAGE(root_hpa))
 		return;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d0f2077bd798..3c3f597ea00d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2141,7 +2141,7 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterato
 		 * prev_root is currently only used for 64-bit hosts. So only
 		 * the active root_hpa is valid here.
 		 */
-		BUG_ON(root != vcpu->arch.mmu->root_hpa);
+		BUG_ON(root != vcpu->arch.mmu->root.hpa);
 
 		iterator->shadow_addr
 			= vcpu->arch.mmu->pae_root[(addr >> 30) & 3];
@@ -2155,7 +2155,7 @@ static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator,
 			     struct kvm_vcpu *vcpu, u64 addr)
 {
-	shadow_walk_init_using_root(iterator, vcpu, vcpu->arch.mmu->root_hpa,
+	shadow_walk_init_using_root(iterator, vcpu, vcpu->arch.mmu->root.hpa,
 				    addr);
 }
@@ -3224,7 +3224,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	BUILD_BUG_ON(KVM_MMU_NUM_PREV_ROOTS >= BITS_PER_LONG);
 
 	/* Before acquiring the MMU lock, see if we need to do any real work. */
-	if (!(free_active_root && VALID_PAGE(mmu->root_hpa))) {
+	if (!(free_active_root && VALID_PAGE(mmu->root.hpa))) {
 		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 			if ((roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) &&
 			    VALID_PAGE(mmu->prev_roots[i].hpa))
@@ -3244,7 +3244,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	if (free_active_root) {
 		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
 		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
-			mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
+			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
 		} else if (mmu->pae_root) {
 			for (i = 0; i < 4; ++i) {
 				if (!IS_VALID_PAE_ROOT(mmu->pae_root[i]))
@@ -3255,8 +3255,8 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 				mmu->pae_root[i] = INVALID_PAE_ROOT;
 			}
 		}
-		mmu->root_hpa = INVALID_PAGE;
-		mmu->root_pgd = 0;
+		mmu->root.hpa = INVALID_PAGE;
+		mmu->root.pgd = 0;
 	}
 
 	kvm_mmu_commit_zap_page(kvm, &invalid_list);
@@ -3329,10 +3329,10 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 	if (is_tdp_mmu_enabled(vcpu->kvm)) {
 		root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu);
-		mmu->root_hpa = root;
+		mmu->root.hpa = root;
 	} else if (shadow_root_level >= PT64_ROOT_4LEVEL) {
 		root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level, true);
-		mmu->root_hpa = root;
+		mmu->root.hpa = root;
 	} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
 		if (WARN_ON_ONCE(!mmu->pae_root)) {
 			r = -EIO;
@@ -3347,15 +3347,15 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 			mmu->pae_root[i] = root | PT_PRESENT_MASK |
 					   shadow_me_mask;
 		}
-		mmu->root_hpa = __pa(mmu->pae_root);
+		mmu->root.hpa = __pa(mmu->pae_root);
 	} else {
 		WARN_ONCE(1, "Bad TDP root level = %d\n", shadow_root_level);
 		r = -EIO;
 		goto out_unlock;
 	}
 
-	/* root_pgd is ignored for direct MMUs. */
-	mmu->root_pgd = 0;
+	/* root.pgd is ignored for direct MMUs. */
+	mmu->root.pgd = 0;
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
 	return r;
@@ -3468,7 +3468,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	if (mmu->root_level >= PT64_ROOT_4LEVEL) {
 		root = mmu_alloc_root(vcpu, root_gfn, 0,
 				      mmu->shadow_root_level, false);
-		mmu->root_hpa = root;
+		mmu->root.hpa = root;
 		goto set_root_pgd;
 	}
@@ -3518,14 +3518,14 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	}
 
 	if (mmu->shadow_root_level == PT64_ROOT_5LEVEL)
-		mmu->root_hpa = __pa(mmu->pml5_root);
+		mmu->root.hpa = __pa(mmu->pml5_root);
 	else if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
-		mmu->root_hpa = __pa(mmu->pml4_root);
+		mmu->root.hpa = __pa(mmu->pml4_root);
 	else
-		mmu->root_hpa = __pa(mmu->pae_root);
+		mmu->root.hpa = __pa(mmu->pae_root);
 
 set_root_pgd:
-	mmu->root_pgd = root_pgd;
+	mmu->root.pgd = root_pgd;
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
@@ -3638,13 +3638,13 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.mmu->direct_map)
 		return;
 
-	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
+	if (!VALID_PAGE(vcpu->arch.mmu->root.hpa))
 		return;
 
 	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
 
 	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
-		hpa_t root = vcpu->arch.mmu->root_hpa;
+		hpa_t root = vcpu->arch.mmu->root.hpa;
 		sp = to_shadow_page(root);
 
 		if (!is_unsync_root(root))
@@ -3935,7 +3935,7 @@ static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
 				struct kvm_page_fault *fault, int mmu_seq)
 {
-	struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root_hpa);
+	struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root.hpa);
 
 	/* Special roots, e.g. pae_root, are not backed by shadow pages. */
 	if (sp && is_obsolete_sp(vcpu->kvm, sp))
@@ -4092,34 +4092,27 @@ static inline bool is_root_usable(struct kvm_mmu_root_info *root, gpa_t pgd,
 /*
  * Find out if a previously cached root matching the new pgd/role is available.
  * The current root is also inserted into the cache.
- * If a matching root was found, it is assigned to kvm_mmu->root_hpa and true is
+ * If a matching root was found, it is assigned to kvm_mmu->root.hpa and true is
  * returned.
- * Otherwise, the LRU root from the cache is assigned to kvm_mmu->root_hpa and
+ * Otherwise, the LRU root from the cache is assigned to kvm_mmu->root.hpa and
  * false is returned. This root should now be freed by the caller.
  */
 static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 				  union kvm_mmu_page_role new_role)
 {
 	uint i;
-	struct kvm_mmu_root_info root;
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 
-	root.pgd = mmu->root_pgd;
-	root.hpa = mmu->root_hpa;
-
-	if (is_root_usable(&root, new_pgd, new_role))
+	if (is_root_usable(&mmu->root, new_pgd, new_role))
 		return true;
 
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
-		swap(root, mmu->prev_roots[i]);
+		swap(mmu->root, mmu->prev_roots[i]);
 
-		if (is_root_usable(&root, new_pgd, new_role))
+		if (is_root_usable(&mmu->root, new_pgd, new_role))
 			break;
 	}
 
-	mmu->root_hpa = root.hpa;
-	mmu->root_pgd = root.pgd;
-
 	return i < KVM_MMU_NUM_PREV_ROOTS;
 }
@@ -4175,7 +4168,7 @@ static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 	 */
 	if (!new_role.direct)
 		__clear_sp_write_flooding_count(
-				to_shadow_page(vcpu->arch.mmu->root_hpa));
+				to_shadow_page(vcpu->arch.mmu->root.hpa));
 }
 
 void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
@@ -5071,7 +5064,7 @@ static void __kvm_mmu_unload(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 	int i;
 	kvm_mmu_free_roots(vcpu, mmu, KVM_MMU_ROOTS_ALL);
-	WARN_ON(VALID_PAGE(mmu->root_hpa));
+	WARN_ON(VALID_PAGE(mmu->root.hpa));
 	if (mmu->pae_root) {
 		for (i = 0; i < 4; ++i)
 			WARN_ON(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
@@ -5266,7 +5259,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
 	int r, emulation_type = EMULTYPE_PF;
 	bool direct = vcpu->arch.mmu->direct_map;
 
-	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
+	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root.hpa)))
 		return RET_PF_RETRY;
 
 	r = RET_PF_INVALID;
@@ -5338,7 +5331,7 @@ void kvm_mmu_invalidate_gva(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 		return;
 
 	if (root_hpa == INVALID_PAGE) {
-		mmu->invlpg(vcpu, gva, mmu->root_hpa);
+		mmu->invlpg(vcpu, gva, mmu->root.hpa);
 
 		/*
 		 * INVLPG is required to invalidate any global mappings for the VA,
@@ -5374,7 +5367,7 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
 	uint i;
 
 	if (pcid == kvm_get_active_pcid(vcpu)) {
-		mmu->invlpg(vcpu, gva, mmu->root_hpa);
+		mmu->invlpg(vcpu, gva, mmu->root.hpa);
 		tlb_flush = true;
 	}
@@ -5487,8 +5480,8 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 	struct page *page;
 	int i;
 
-	mmu->root_hpa = INVALID_PAGE;
-	mmu->root_pgd = 0;
+	mmu->root.hpa = INVALID_PAGE;
+	mmu->root.pgd = 0;
 
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 		mmu->prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
diff --git a/arch/x86/kvm/mmu/mmu_audit.c b/arch/x86/kvm/mmu/mmu_audit.c
index f31fdb874f1f..3e5d62a25350 100644
--- a/arch/x86/kvm/mmu/mmu_audit.c
+++ b/arch/x86/kvm/mmu/mmu_audit.c
@@ -56,11 +56,11 @@ static void mmu_spte_walk(struct kvm_vcpu *vcpu, inspect_spte_fn fn)
 	int i;
 	struct kvm_mmu_page *sp;
 
-	if (!VALID_PAGE(vcpu->arch.mmu->root_hpa))
+	if (!VALID_PAGE(vcpu->arch.mmu->root.hpa))
 		return;
 
 	if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
-		hpa_t root = vcpu->arch.mmu->root_hpa;
+		hpa_t root = vcpu->arch.mmu->root.hpa;
 
 		sp = to_shadow_page(root);
 		__mmu_spte_walk(vcpu, sp, fn, vcpu->arch.mmu->root_level);
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 5b5bdac97c7b..346f3bad3cb9 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -668,7 +668,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 	if (FNAME(gpte_changed)(vcpu, gw, top_level))
 		goto out_gpte_changed;
 
-	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa)))
+	if (WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root.hpa)))
 		goto out_gpte_changed;
 
 	for (shadow_walk_init(&it, vcpu, fault->addr);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 8def8f810cb0..debf08212f12 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -657,7 +657,7 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
 		else
 
 #define tdp_mmu_for_each_pte(_iter, _mmu, _start, _end)		\
-	for_each_tdp_pte(_iter, to_shadow_page(_mmu->root_hpa), _start, _end)
+	for_each_tdp_pte(_iter, to_shadow_page(_mmu->root.hpa), _start, _end)
 
 /*
  * Yield if the MMU lock is contended or this thread needs to return control
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index 3f987785702a..57c73d8f76ce 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -95,7 +95,7 @@ static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return sp->tdp_mmu
 static inline bool is_tdp_mmu(struct kvm_mmu *mmu)
 {
 	struct kvm_mmu_page *sp;
-	hpa_t hpa = mmu->root_hpa;
+	hpa_t hpa = mmu->root.hpa;
 
 	if (WARN_ON(!VALID_PAGE(hpa)))
 		return false;
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c73e4d938ddc..29289ecca223 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5466,7 +5466,7 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 			VMXERR_INVALID_OPERAND_TO_INVEPT_INVVPID);
 
 		roots_to_free = 0;
-		if (nested_ept_root_matches(mmu->root_hpa, mmu->root_pgd,
+		if (nested_ept_root_matches(mmu->root.hpa, mmu->root.pgd,
 					    operand.eptp))
 			roots_to_free |= KVM_MMU_ROOT_CURRENT;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 70e7f00362bc..5542a2b536e0 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2957,7 +2957,7 @@ static inline int vmx_get_current_vpid(struct kvm_vcpu *vcpu)
 static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
-	u64 root_hpa = mmu->root_hpa;
+	u64 root_hpa = mmu->root.hpa;
 
 	/* No flush required if the current context is invalid. */
 	if (!VALID_PAGE(root_hpa))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2685fb62807e..0d3646535cc5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -762,7 +762,7 @@ bool kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu,
 	if ((fault->error_code & PFERR_PRESENT_MASK) &&
 	    !(fault->error_code & PFERR_RSVD_MASK))
 		kvm_mmu_invalidate_gva(vcpu, fault_mmu, fault->address,
-				       fault_mmu->root_hpa);
+				       fault_mmu->root.hpa);
 
 	fault_mmu->inject_page_fault(vcpu, fault);
 
 	return fault->nested_page_fault;
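Aside (for reference, paraphrased from kvm_host.h rather than quoted from
the patch): the struct being adopted above simply pairs the two fields
that were previously kept separate; the comments are an editorial gloss,
not kernel comments.

struct kvm_mmu_root_info {
	gpa_t pgd;	/* guest PGD (CR3 or EPTP) that the root translates */
	hpa_t hpa;	/* host physical address of the root page table */
};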
From patchwork Wed Feb 9 17:00:16 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740498
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 08/12] KVM: MMU: do not consult levels when freeing roots
Date: Wed, 9 Feb 2022 12:00:16 -0500
Message-Id: <20220209170020.1775368-9-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

Right now, PGD caching requires a complicated dance of first computing
the MMU role and passing it to __kvm_mmu_new_pgd, and then separately
calling kvm_init_mmu.  Part of this is due to kvm_mmu_free_roots using
mmu->root_level and mmu->shadow_root_level to distinguish whether the
page table uses a single root or 4 PAE roots.  Because kvm_init_mmu can
overwrite mmu->root_level, kvm_mmu_free_roots must be called before
kvm_init_mmu.

However, even after kvm_init_mmu there is a way to detect whether the
page table has a single root or four, because the pae_root does not have
an associated struct kvm_mmu_page.

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3c3f597ea00d..95d0fa0bb876 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3219,12 +3219,15 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 	struct kvm *kvm = vcpu->kvm;
 	int i;
 	LIST_HEAD(invalid_list);
-	bool free_active_root = roots_to_free & KVM_MMU_ROOT_CURRENT;
+	bool free_active_root;
 
 	BUILD_BUG_ON(KVM_MMU_NUM_PREV_ROOTS >= BITS_PER_LONG);
 
 	/* Before acquiring the MMU lock, see if we need to do any real work. */
-	if (!(free_active_root && VALID_PAGE(mmu->root.hpa))) {
+	free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT)
+		&& VALID_PAGE(mmu->root.hpa);
+
+	if (!free_active_root) {
 		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 			if ((roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) &&
 			    VALID_PAGE(mmu->prev_roots[i].hpa))
@@ -3242,8 +3245,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 					   &invalid_list);
 
 	if (free_active_root) {
-		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
-		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
+		if (to_shadow_page(mmu->root.hpa)) {
 			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
 		} else if (mmu->pae_root) {
 			for (i = 0; i < 4; ++i) {
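Aside (sketch, not a quote from the patch): the new discriminator works
because special roots such as pae_root are not backed by a struct
kvm_mmu_page, so to_shadow_page() distinguishes the two cases even after
kvm_init_mmu has overwritten the level fields:

	if (to_shadow_page(mmu->root.hpa)) {
		/* single shadow-page-backed root: free it directly */
	} else if (mmu->pae_root) {
		/* four PAE roots, stored in mmu->pae_root[0..3] */
	}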
From patchwork Wed Feb 9 17:00:17 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740503
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 09/12] KVM: MMU: look for a cached PGD when going from 32-bit to 64-bit
Date: Wed, 9 Feb 2022 12:00:17 -0500
Message-Id: <20220209170020.1775368-10-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>

Right now, PGD caching avoids placing a PAE root in the cache by using
the old value of mmu->root_level and mmu->shadow_root_level; it does not
look for a cached PGD if the old root is a PAE one, and then frees it
using kvm_mmu_free_roots.

Change the logic instead to free the uncacheable root early.  This way,
__kvm_mmu_new_pgd is able to look up the cache when going from 32-bit to
64-bit (if there is a hit, the invalid root becomes the least recently
used).  An example of this is nested virtualization with shadow paging,
when a 64-bit L1 runs a 32-bit L2.

As a side effect (which is actually the reason why this patch was
written), PGD caching does not use the old value of mmu->root_level and
mmu->shadow_root_level anymore.

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 71 ++++++++++++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 95d0fa0bb876..f61208ccce43 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4087,20 +4087,20 @@ static inline bool is_root_usable(struct kvm_mmu_root_info *root, gpa_t pgd,
 				  union kvm_mmu_page_role role)
 {
 	return (role.direct || pgd == root->pgd) &&
-	       VALID_PAGE(root->hpa) && to_shadow_page(root->hpa) &&
+	       VALID_PAGE(root->hpa) &&
 	       role.word == to_shadow_page(root->hpa)->role.word;
 }
 
 /*
  * Find out if a previously cached root matching the new pgd/role is available.
- * The current root is also inserted into the cache.
- * If a matching root was found, it is assigned to kvm_mmu->root.hpa and true is
- * returned.
- * Otherwise, the LRU root from the cache is assigned to kvm_mmu->root.hpa and
- * false is returned. This root should now be freed by the caller.
+ * If a matching root is found, it is assigned to kvm_mmu->root and
+ * true is returned.
+ * If no match is found, the current root becomes the MRU of the cache
+ * if valid (thus evicting the LRU root), kvm_mmu->root is left invalid,
+ * and false is returned.
  */
-static bool cached_root_available(struct kvm_vcpu *vcpu, gpa_t new_pgd,
-				  union kvm_mmu_page_role new_role)
+static bool cached_root_find_and_promote(struct kvm_vcpu *vcpu, gpa_t new_pgd,
+					 union kvm_mmu_page_role new_role)
 {
 	uint i;
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
@@ -4109,13 +4109,48 @@ static bool cached_root_find_and_promote(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 		return true;
 
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) {
+		/*
+		 * The swaps end up rotating the cache like this:
+		 *   C   0 1 2 3   (on entry to the function)
+		 *   0   C 1 2 3
+		 *   1   C 0 2 3
+		 *   2   C 0 1 3
+		 *   3   C 0 1 2   (on exit from the loop)
+		 */
 		swap(mmu->root, mmu->prev_roots[i]);
-
 		if (is_root_usable(&mmu->root, new_pgd, new_role))
-			break;
+			return true;
 	}
 
-	return i < KVM_MMU_NUM_PREV_ROOTS;
+	kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);
+	return false;
+}
+
+/*
+ * Find out if a previously cached root matching the new pgd/role is available.
+ * If a matching root is found, it is assigned to kvm_mmu->root and true
+ * is returned. The current, invalid root goes to the bottom of the cache.
+ * If no match is found, kvm_mmu->root is left invalid and false is returned.
+ */
+static bool cached_root_find_and_replace(struct kvm_vcpu *vcpu, gpa_t new_pgd,
+					 union kvm_mmu_page_role new_role)
+{
+	uint i;
+	struct kvm_mmu *mmu = vcpu->arch.mmu;
+
+	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
+		if (is_root_usable(&mmu->prev_roots[i], new_pgd, new_role))
+			goto hit;
+
+	return false;
+
+hit:
+	swap(mmu->root, mmu->prev_roots[i]);
+	/* Bubble up the remaining roots. */
+	for (; i < KVM_MMU_NUM_PREV_ROOTS - 1; i++)
+		mmu->prev_roots[i] = mmu->prev_roots[i + 1];
+	mmu->prev_roots[i].hpa = INVALID_PAGE;
+	return true;
 }
 
 static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
@@ -4124,22 +4159,24 @@
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 
 	/*
-	 * For now, limit the fast switch to 64-bit hosts+VMs in order to avoid
+	 * For now, limit the caching to 64-bit hosts+VMs in order to avoid
 	 * having to deal with PDPTEs. We may add support for 32-bit hosts/VMs
 	 * later if necessary.
 	 */
-	if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
-	    mmu->root_level >= PT64_ROOT_4LEVEL)
-		return cached_root_available(vcpu, new_pgd, new_role);
+	if (VALID_PAGE(mmu->root.hpa) && !to_shadow_page(mmu->root.hpa))
+		kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);
 
-	return false;
+	if (VALID_PAGE(mmu->root.hpa))
+		return cached_root_find_and_promote(vcpu, new_pgd, new_role);
+	else
+		return cached_root_find_and_replace(vcpu, new_pgd, new_role);
 }
 
 static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
 			      union kvm_mmu_page_role new_role)
 {
 	if (!fast_pgd_switch(vcpu, new_pgd, new_role)) {
-		kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, KVM_MMU_ROOT_CURRENT);
+		/* kvm_mmu_ensure_valid_pgd will set up a new root.  */
 		return;
 	}
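Aside (a standalone demonstration of the swap trick only, not KVM code):
the rotation described in the comment added by the patch can be checked
with a small C program; the array stands in for prev_roots and the names
are invented for the demo.

#include <stdio.h>

#define N 4	/* stands in for KVM_MMU_NUM_PREV_ROOTS */

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

int main(void)
{
	int cur = 'C', cache[N] = { '0', '1', '2', '3' };
	int i;

	/* Miss on every entry, so the loop runs to completion. */
	for (i = 0; i < N; i++)
		swap_int(&cur, &cache[i]);	/* KVM stops here on is_root_usable() */

	/* Prints "3 C 0 1 2", matching the last line of the patch comment. */
	printf("%c %c %c %c %c\n", cur, cache[0], cache[1], cache[2], cache[3]);
	return 0;
}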
Signed-off-by: Paolo Bonzini
Reviewed-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c    | 37 +++++++++++++++++--------------------
 arch/x86/kvm/svm/nested.c |  6 +++---
 arch/x86/kvm/vmx/nested.c |  6 +++---
 3 files changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f61208ccce43..df9e0a43513c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4882,9 +4882,8 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
 
         new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
 
-        __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
-
         shadow_mmu_init_context(vcpu, context, &regs, new_role);
+        __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
 
@@ -4922,27 +4921,25 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
                 kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
                                                    execonly, level);
 
-        __kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
-
-        if (new_role.as_u64 == context->mmu_role.as_u64)
-                return;
-
-        context->mmu_role.as_u64 = new_role.as_u64;
+        if (new_role.as_u64 != context->mmu_role.as_u64) {
+                context->mmu_role.as_u64 = new_role.as_u64;
 
-        context->shadow_root_level = level;
+                context->shadow_root_level = level;
 
-        context->ept_ad = accessed_dirty;
-        context->page_fault = ept_page_fault;
-        context->gva_to_gpa = ept_gva_to_gpa;
-        context->sync_page = ept_sync_page;
-        context->invlpg = ept_invlpg;
-        context->root_level = level;
-        context->direct_map = false;
+                context->ept_ad = accessed_dirty;
+                context->page_fault = ept_page_fault;
+                context->gva_to_gpa = ept_gva_to_gpa;
+                context->sync_page = ept_sync_page;
+                context->invlpg = ept_invlpg;
+                context->root_level = level;
+                context->direct_map = false;
+                update_permission_bitmask(context, true);
+                context->pkru_mask = 0;
+                reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
+                reset_ept_shadow_zero_bits_mask(context, execonly);
+        }
 
-        update_permission_bitmask(context, true);
-        context->pkru_mask = 0;
-        reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
-        reset_ept_shadow_zero_bits_mask(context, execonly);
+        __kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index f284e61451c8..96bab464967f 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -492,14 +492,14 @@ static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
             CC(!load_pdptrs(vcpu, cr3)))
                 return -EINVAL;
 
-        if (!nested_npt)
-                kvm_mmu_new_pgd(vcpu, cr3);
-
         vcpu->arch.cr3 = cr3;
 
         /* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
         kvm_init_mmu(vcpu);
 
+        if (!nested_npt)
+                kvm_mmu_new_pgd(vcpu, cr3);
+
         return 0;
 }

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 29289ecca223..abfcd71f787f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1126,15 +1126,15 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
                 return -EINVAL;
         }
 
-        if (!nested_ept)
-                kvm_mmu_new_pgd(vcpu, cr3);
-
         vcpu->arch.cr3 = cr3;
         kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
 
         /* Re-initialize the MMU, e.g. to pick up CR4 MMU role changes. */
         kvm_init_mmu(vcpu);
 
+        if (!nested_ept)
+                kvm_mmu_new_pgd(vcpu, cr3);
+
         return 0;
 }
From patchwork Wed Feb 9 17:00:19 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740499
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 11/12] KVM: MMU: remove kvm_mmu_calc_root_page_role
Date: Wed, 9 Feb 2022 12:00:19 -0500
Message-Id: <20220209170020.1775368-12-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>
References: <20220209170020.1775368-1-pbonzini@redhat.com>

Since the guest PGD is now loaded after the MMU has been set up
completely, the desired role for a cache hit is simply the current
mmu_role.  There is no need to compute it again, so __kvm_mmu_new_pgd
can be folded into kvm_mmu_new_pgd.

For the !tdp_enabled case, it would also have been possible to use the
role that is already in vcpu->arch.mmu.
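In miniature, the fold replaces a recompute-and-pass-down wrapper with a
direct read of the state that initialization already stored. A toy sketch
with invented names, not kernel code:

#include <stdio.h>

struct toy_mmu { unsigned int role; };

/* After the fold: the entry point reads the already-final role itself. */
static void toy_new_pgd(struct toy_mmu *mmu, unsigned long pgd)
{
        unsigned int role = mmu->role;  /* was: recomputed by a helper */

        printf("pgd=%#lx role=%u\n", pgd, role);
}

int main(void)
{
        struct toy_mmu mmu = { .role = 7 };

        toy_new_pgd(&mmu, 0x2000);      /* valid because init ran first */
        return 0;
}

The diff below applies the same move: kvm_mmu_new_pgd picks up
mmu_role.base from vcpu->arch.mmu instead of calling the now-removed
kvm_mmu_calc_root_page_role.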
Signed-off-by: Paolo Bonzini
Reviewed-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c | 29 ++++------------------------
 1 file changed, 4 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index df9e0a43513c..38b40ddcaad7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -190,8 +190,6 @@ struct kmem_cache *mmu_page_header_cache;
 static struct percpu_counter kvm_total_used_mmu_pages;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
-static union kvm_mmu_page_role
-kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu);
 
 struct kvm_mmu_role_regs {
         const unsigned long cr0;
@@ -4172,9 +4170,9 @@ static bool fast_pgd_switch(struct kvm_vcpu *vcpu, gpa_t new_pgd,
                 return cached_root_find_and_replace(vcpu, new_pgd, new_role);
 }
 
-static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
-                              union kvm_mmu_page_role new_role)
+void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
 {
+        union kvm_mmu_page_role new_role = vcpu->arch.mmu->mmu_role.base;
         if (!fast_pgd_switch(vcpu, new_pgd, new_role)) {
                 /* kvm_mmu_ensure_valid_pgd will set up a new root.  */
                 return;
@@ -4209,11 +4207,6 @@ static void __kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd,
                 __clear_sp_write_flooding_count(
                                 to_shadow_page(vcpu->arch.mmu->root.hpa));
 }
-
-void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd)
-{
-        __kvm_mmu_new_pgd(vcpu, new_pgd, kvm_mmu_calc_root_page_role(vcpu));
-}
 EXPORT_SYMBOL_GPL(kvm_mmu_new_pgd);
 
 static unsigned long get_cr3(struct kvm_vcpu *vcpu)
@@ -4883,7 +4876,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
         new_role = kvm_calc_shadow_npt_root_page_role(vcpu, &regs);
 
         shadow_mmu_init_context(vcpu, context, &regs, new_role);
-        __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
+        kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
@@ -4939,7 +4932,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
                 reset_ept_shadow_zero_bits_mask(context, execonly);
         }
 
-        __kvm_mmu_new_pgd(vcpu, new_eptp, new_role.base);
+        kvm_mmu_new_pgd(vcpu, new_eptp);
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu);
@@ -5024,20 +5017,6 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_init_mmu);
 
-static union kvm_mmu_page_role
-kvm_mmu_calc_root_page_role(struct kvm_vcpu *vcpu)
-{
-        struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
-        union kvm_mmu_role role;
-
-        if (tdp_enabled)
-                role = kvm_calc_tdp_mmu_root_page_role(vcpu, &regs, true);
-        else
-                role = kvm_calc_shadow_mmu_root_page_role(vcpu, &regs, true);
-
-        return role.base;
-}
-
 void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
         /*

From patchwork Wed Feb 9 17:00:20 2022
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 12740502
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: vkuznets@redhat.com, mlevitsk@redhat.com, dmatlack@google.com, seanjc@google.com
Subject: [PATCH 12/12] KVM: x86: do not unload MMU roots on all role changes
Date: Wed, 9 Feb 2022 12:00:20 -0500
Message-Id: <20220209170020.1775368-13-pbonzini@redhat.com>
In-Reply-To: <20220209170020.1775368-1-pbonzini@redhat.com>
References: <20220209170020.1775368-1-pbonzini@redhat.com>

kvm_mmu_reset_context is called on all role changes and right now it
calls kvm_mmu_unload.  With the legacy MMU this is a relatively cheap
operation; the previous PGDs remain in the hash table and are picked up
immediately on the next page fault.  With the TDP MMU, however, the
roots are thrown away for good and a full rebuild of the page tables is
necessary, which is many times more expensive.

Fortunately, throwing away the roots is not necessary except when the
manual says a TLB flush is required:

- changing CR0.PG from 1 to 0 (because it flushes the TLB according to
  the x86 architecture specification)

- changing CPUID (which changes the interpretation of page tables in
  ways not reflected by the role)

- changing CR4.SMEP from 0 to 1 (not doing so actually breaks access.c!)

Except for these cases, once the MMU has updated the CPU/MMU roles and
metadata, it is enough to force-reload the current value of CR3.  KVM
will look up the cached roots for an entry with the right role and PGD,
and create a new root only if the cache misses.

Measuring with vmexit.flat from kvm-unit-tests shows the following
improvement:

             TDP      legacy    shadow
   before    46754    5096      5150
   after     4879     4875      5006

These numbers are with very small page tables.  The impact is much
larger when running as an L1 hypervisor, however, because the new page
tables cause extra work for L0 to shadow them.
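The three rules above can be read as a small decision procedure. The sketch
below encodes them in standalone C; the helper names and the cpuid_changed
flag are invented for illustration, and only the PG and SMEP bit positions
follow the x86 register definitions.

#include <stdbool.h>
#include <stdio.h>

#define PG      (1u << 31)      /* CR0.PG */
#define SMEP    (1u << 20)      /* CR4.SMEP */

/* Roots must be freed only when the architecture demands a TLB flush. */
static bool must_unload_roots(unsigned int old_cr0, unsigned int new_cr0,
                              bool cpuid_changed)
{
        /* CR0.PG 1 -> 0 flushes the TLB architecturally. */
        if ((old_cr0 & PG) && !(new_cr0 & PG))
                return true;
        /* CPUID changes reinterpret page tables outside the role. */
        if (cpuid_changed)
                return true;
        return false;
}

/* CR4.SMEP 0 -> 1 needs a guest TLB flush, but cached roots can stay. */
static bool needs_guest_tlb_flush(unsigned int old_cr4, unsigned int new_cr4)
{
        return !(old_cr4 & SMEP) && (new_cr4 & SMEP);
}

int main(void)
{
        printf("PG 1->0 unloads roots: %d\n", must_unload_roots(PG, 0, false));
        printf("SMEP 0->1 flushes TLB: %d\n", needs_guest_tlb_flush(0, SMEP));
        return 0;
}

Every other role change falls through to the cheap path: re-init the MMU
and force-reload CR3, letting the root cache absorb the switch.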
Reported-by: Brad Spengler
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c |  7 ++++---
 arch/x86/kvm/x86.c     | 27 ++++++++++++++++++---------
 2 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 38b40ddcaad7..dbd4e98ba426 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5020,8 +5020,8 @@ EXPORT_SYMBOL_GPL(kvm_init_mmu);
 void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
         /*
-         * Invalidate all MMU roles to force them to reinitialize as CPUID
-         * information is factored into reserved bit calculations.
+         * Invalidate all MMU roles and roots to force them to reinitialize,
+         * as CPUID information is factored into reserved bit calculations.
          *
          * Correctly handling multiple vCPU models (with respect to paging and
          * physical address properties) in a single VM would require tracking
@@ -5034,6 +5034,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
         vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
         vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
         vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
+        kvm_mmu_unload(vcpu);
         kvm_mmu_reset_context(vcpu);
 
         /*
@@ -5045,8 +5046,8 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
 {
-        kvm_mmu_unload(vcpu);
         kvm_init_mmu(vcpu);
+        kvm_mmu_new_pgd(vcpu, vcpu->arch.cr3);
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_reset_context);

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0d3646535cc5..97c4f5fc291f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -873,8 +873,12 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned lon
                 kvm_async_pf_hash_reset(vcpu);
         }
 
-        if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS)
+        if ((cr0 ^ old_cr0) & KVM_MMU_CR0_ROLE_BITS) {
+                /* Flush the TLB if CR0.PG is changed 1 -> 0. */
+                if ((old_cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PG))
+                        kvm_mmu_unload(vcpu);
                 kvm_mmu_reset_context(vcpu);
+        }
 
         if (((cr0 ^ old_cr0) & X86_CR0_CD) &&
             kvm_arch_has_noncoherent_dma(vcpu->kvm) &&
@@ -1067,15 +1071,18 @@ void kvm_post_set_cr4(struct kvm_vcpu *vcpu, unsigned long old_cr4, unsigned lon
          * free them all.  KVM_REQ_MMU_RELOAD is fit for the both cases; it
          * is slow, but changing CR4.PCIDE is a rare case.
          *
-         * If CR4.PGE is changed, the guest TLB must be flushed.
+         * Setting SMEP also needs to flush the TLB, in addition to resetting
+         * the MMU.
          *
-         * Note: resetting MMU is a superset of KVM_REQ_MMU_RELOAD and
-         * KVM_REQ_MMU_RELOAD is a superset of KVM_REQ_TLB_FLUSH_GUEST, hence
-         * the usage of "else if".
+         * If CR4.PGE is changed, the guest TLB must be flushed.  Because
+         * the shadow MMU ignores global pages, this bit is not part of
+         * KVM_MMU_CR4_ROLE_BITS.
          */
-        if ((cr4 ^ old_cr4) & KVM_MMU_CR4_ROLE_BITS)
+        if ((cr4 ^ old_cr4) & KVM_MMU_CR4_ROLE_BITS) {
                 kvm_mmu_reset_context(vcpu);
-        else if ((cr4 ^ old_cr4) & X86_CR4_PCIDE)
+                if ((cr4 & X86_CR4_SMEP) && !(old_cr4 & X86_CR4_SMEP))
+                        kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
+        } else if ((cr4 ^ old_cr4) & X86_CR4_PCIDE)
                 kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
         else if ((cr4 ^ old_cr4) & X86_CR4_PGE)
                 kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST, vcpu);
@@ -11329,8 +11336,10 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
          * paging related bits are ignored if paging is disabled, i.e. CR0.WP,
          * CR4, and EFER changes are all irrelevant if CR0.PG was '0'.
          */
-        if (old_cr0 & X86_CR0_PG)
-                kvm_mmu_reset_context(vcpu);
+        if (old_cr0 & X86_CR0_PG) {
+                kvm_mmu_unload(vcpu);
+                kvm_init_mmu(vcpu);
+        }
 
         /*
          * Intel's SDM states that all TLB entries are flushed on INIT. AMD's