From patchwork Sat May 21 13:16:49 2022
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 01/12] KVM: X86/MMU: Verify PDPTE for nested NPT in PAE paging mode when page fault
Date: Sat, 21 May 2022 21:16:49 +0800
Message-Id: <20220521131700.3661-2-jiangshanlai@gmail.com>
From: Lai Jiangshan

When nested NPT is enabled and L1 is using PAE paging, mmu->get_pdptrs(),
which is nested_svm_get_tdp_pdptr(), reads the guest NPT's PDPTE from
memory unconditionally on each call.

The guest PAE root page is not write-protected.

The mmu->get_pdptrs() in FNAME(walk_addr_generic) might therefore return
a value different from previous calls, or different from the return value
of mmu->get_pdptrs() in mmu_alloc_shadow_roots().

If the return value of mmu->get_pdptrs() is not verified to be unchanged,
FNAME(fetch) may install the spte in a wrong sp or link a sp to a wrong
parent, since FNAME(gpte_changed) can't detect this kind of change.

Verify the return value of mmu->get_pdptrs() (only the gfn in it needs
to be checked) and do kvm_mmu_free_roots() like load_pdptrs() if the gfn
doesn't match.

Do the verification unconditionally whenever the guest is using PAE
paging, no matter whether it is nested NPT or not, to avoid complicated
code.

Commit e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE from
guest memory") fixes the same problem for the non-nested case by caching
the PDPTEs, which is also how hardware caches them.  Under SVM, however,
when the processor is in guest mode with PAE enabled, the guest PDPTE
entries are not cached or validated at this point, but instead are loaded
and checked on demand in the normal course of address translation, just
like page directory and page table entries.  Any reserved bit violations
are detected at the point of use and result in a page-fault (#PF)
exception rather than a general-protection (#GP) exception.  So caching
cannot fix the problem for shadowed nested NPT with a 32bit L1.
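To illustrate the failure mode, here is a minimal user-space sketch
(simplified masks and invented helper names, not KVM code) of how the
faulting address selects a PDPTE and how a cached PDPTE whose gfn no
longer matches what the guest walker saw would be caught:

#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins, not KVM's real masks. */
#define PAGE_SHIFT	12
#define GFN_MASK	0x000ffffffffff000ull	/* bits 51:12 */

/* In PAE paging, bits 31:30 of the address pick one of 4 PDPTEs. */
static unsigned int pdpte_index(uint64_t addr)
{
	return (addr >> 30) & 3;
}

static uint64_t pdpte_gfn(uint64_t pdpte)
{
	return (pdpte & GFN_MASK) >> PAGE_SHIFT;
}

int main(void)
{
	uint64_t fault_addr = 0x80001234;			/* selects PDPTE[2] */
	uint64_t root_pdpte = (0x2000ull << PAGE_SHIFT) | 1;	/* cached: gfn 0x2000 */
	uint64_t walked_gfn = 0x1000;		/* gfn the guest walker just read */

	printf("PDPTE index: %u\n", pdpte_index(fault_addr));
	if (pdpte_gfn(root_pdpte) != walked_gfn)
		printf("gfn mismatch: free the current root and retry\n");
	return 0;
}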
Fixes: e4e517b4be01 ("KVM: MMU: Do not unconditionally read PDPTE from guest memory")
Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/paging_tmpl.h | 39 ++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index db80f7ccaa4e..6e3df84e8455 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -870,6 +870,44 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (is_page_fault_stale(vcpu, fault, mmu_seq))
 		goto out_unlock;
 
+	/*
+	 * When nested NPT is enabled and L1 is using PAE paging,
+	 * mmu->get_pdptrs(), which is nested_svm_get_tdp_pdptr(), reads
+	 * the guest NPT's PDPTE from memory unconditionally on each call.
+	 *
+	 * The guest PAE root page is not write-protected.
+	 *
+	 * The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get a
+	 * value different from previous calls or different from the
+	 * return value of mmu->get_pdptrs() in mmu_alloc_shadow_roots().
+	 *
+	 * If the return value of mmu->get_pdptrs() is not verified to be
+	 * unchanged, FNAME(fetch) may install the spte in a wrong sp or
+	 * link a sp to a wrong parent, since FNAME(gpte_changed) can't
+	 * detect this kind of change.
+	 *
+	 * Verify the return value of mmu->get_pdptrs() (only the gfn in
+	 * it needs to be checked) and do kvm_mmu_free_roots() like
+	 * load_pdptrs() if the gfn doesn't match.
+	 *
+	 * Do the verification unconditionally when the guest is using PAE
+	 * paging, no matter whether it is nested NPT or not, to avoid
+	 * complicated code.
+	 */
+	if (vcpu->arch.mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) {
+		u64 pdpte = vcpu->arch.mmu->pae_root[(fault->addr >> 30) & 3];
+		struct kvm_mmu_page *sp = NULL;
+
+		if (IS_VALID_PAE_ROOT(pdpte))
+			sp = to_shadow_page(pdpte & PT64_BASE_ADDR_MASK);
+
+		if (!sp || walker.table_gfn[PT32E_ROOT_LEVEL - 2] != sp->gfn) {
+			write_unlock(&vcpu->kvm->mmu_lock);
+			kvm_mmu_free_roots(vcpu->kvm, vcpu->arch.mmu,
+					   KVM_MMU_ROOT_CURRENT);
+			goto release_clean;
+		}
+	}
+
 	r = make_mmu_pages_available(vcpu);
 	if (r)
 		goto out_unlock;
@@ -877,6 +915,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 
 out_unlock:
 	write_unlock(&vcpu->kvm->mmu_lock);
+release_clean:
 	kvm_release_pfn_clean(fault->pfn);
 	return r;
 }

From patchwork Sat May 21 13:16:50 2022
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 02/12] KVM: X86/MMU: Add using_local_root_page()
Date: Sat, 21 May 2022 21:16:50 +0800
Message-Id: <20220521131700.3661-3-jiangshanlai@gmail.com>

From: Lai Jiangshan

In some cases, local root pages are used for the MMU.  The check
to_shadow_page(mmu->root.hpa) is often used to tell whether local root
pages are in use.

Add using_local_root_page() to directly check whether local root pages
are used, or need to be used, even when mmu->root.hpa is not yet set.

Prepare for making to_shadow_page(mmu->root.hpa) return non-NULL via
using local shadow [root] pages.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index efe5a3dca1e0..624b6d2473f7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1690,6 +1690,39 @@ static void drop_parent_pte(struct kvm_mmu_page *sp,
 	mmu_spte_clear_no_track(parent_pte);
 }
 
+/*
+ * KVM uses the VCPU's local root page (vcpu->mmu->pae_root) when either the
+ * shadow pagetable is using PAE paging or the host is shadowing nested NPT
+ * for a 32bit L1 hypervisor.
+ *
+ * It includes these cases:
+ *	nonpaging when !tdp_enabled (direct paging)
+ *	shadow paging for 32 bit guest when !tdp_enabled (shadow paging)
+ *	NPT in 32bit host (not shadowing nested NPT) (direct paging)
+ *	shadow nested NPT for 32bit L1 hypervisor in 32bit host (shadow paging)
+ *	shadow nested NPT for 32bit L1 hypervisor in 64bit host (shadow paging)
+ *
+ * For the first four cases, mmu->root_role.level is PT32E_ROOT_LEVEL and the
+ * shadow pagetable is using PAE paging.
+ *
+ * For the last case, the condition is
+ *	mmu->root_role.level > PT32E_ROOT_LEVEL &&
+ *	!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL
+ * and if this condition is true, it must be the last case.
+ *
+ * With the two conditions combined, the checking condition is:
+ *	mmu->root_role.level == PT32E_ROOT_LEVEL ||
+ *	(!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL)
+ *
+ * (There is no "mmu->root_role.level > PT32E_ROOT_LEVEL" here, because it is
+ * already ensured that mmu->root_role.level >= PT32E_ROOT_LEVEL.)
+ */
+static bool using_local_root_page(struct kvm_mmu *mmu)
+{
+	return mmu->root_role.level == PT32E_ROOT_LEVEL ||
+	       (!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL);
+}
+
 static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct)
 {
 	struct kvm_mmu_page *sp;
@@ -4252,10 +4285,11 @@ static bool fast_pgd_switch(struct kvm *kvm, struct kvm_mmu *mmu,
 {
 	/*
 	 * For now, limit the caching to 64-bit hosts+VMs in order to avoid
-	 * having to deal with PDPTEs. We may add support for 32-bit hosts/VMs
-	 * later if necessary.
+	 * having to deal with PDPTEs. Local roots can not be put into
+	 * mmu->prev_roots[] because mmu->pae_root can not be shared for
+	 * different roots at the same time.
 	 */
-	if (VALID_PAGE(mmu->root.hpa) && !to_shadow_page(mmu->root.hpa))
+	if (unlikely(using_local_root_page(mmu)))
 		kvm_mmu_free_roots(kvm, mmu, KVM_MMU_ROOT_CURRENT);
 
 	if (VALID_PAGE(mmu->root.hpa))

From patchwork Sat May 21 13:16:51 2022
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 03/12] KVM: X86/MMU: Reduce a check in using_local_root_page() for common cases
Date: Sat, 21 May 2022 21:16:51 +0800
Message-Id: <20220521131700.3661-4-jiangshanlai@gmail.com>
From: Lai Jiangshan

In most cases, mmu->root_role.direct is true and mmu->root_role.level is
not PT32E_ROOT_LEVEL, which means using_local_root_page() often ends up
evaluating all three tests; that is not good in fast paths.

Morph the conditions in using_local_root_page() into an equivalent form
to reduce a check.
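The claimed equivalence can also be checked mechanically.  The following
stand-alone program (not part of the patch; level ranges and the !direct
precondition are taken from the comment in the diff below) brute-forces
all reachable combinations:

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define PT32E_ROOT_LEVEL 3

/* original form from patch 2 */
static bool cond_v1(bool direct, int root_level, int cpu_level)
{
	return root_level == PT32E_ROOT_LEVEL ||
	       (!direct && cpu_level <= PT32E_ROOT_LEVEL);
}

/* morphed form from this patch */
static bool cond_v2(bool direct, int root_level, int cpu_level)
{
	if (direct)
		return root_level == PT32E_ROOT_LEVEL;
	else
		return cpu_level <= PT32E_ROOT_LEVEL;
}

int main(void)
{
	/*
	 * Shadow root levels 3..5, guest levels 2..5.  For !direct, a
	 * PT32E_ROOT_LEVEL shadow root implies a guest level at most
	 * PT32E_ROOT_LEVEL (the precondition the comment relies on), so
	 * the combinations it excludes are skipped.
	 */
	for (int direct = 0; direct <= 1; direct++)
		for (int root = 3; root <= 5; root++)
			for (int cpu = 2; cpu <= 5; cpu++) {
				if (!direct && root == 3 && cpu > 3)
					continue;
				assert(cond_v1(direct, root, cpu) ==
				       cond_v2(direct, root, cpu));
			}
	puts("equivalent over all reachable combinations");
	return 0;
}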
Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c | 45 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 624b6d2473f7..240ebe589caf 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1716,11 +1716,52 @@ static void drop_parent_pte(struct kvm_mmu_page *sp,
  *
  * (There is no "mmu->root_role.level > PT32E_ROOT_LEVEL" here, because it is
  * already ensured that mmu->root_role.level >= PT32E_ROOT_LEVEL)
+ *
+ * But mmu->root_role.direct is normally true and mmu->root_role.level is
+ * normally not PT32E_ROOT_LEVEL.  To reduce a check for the fast path of
+ * fast_pgd_switch() in the normal case, mmu->root_role.direct is checked
+ * first.
+ *
+ * The return value is:
+ *	mmu->root_role.level == PT32E_ROOT_LEVEL ||
+ *	(!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL)
+ * =>
+ *	(mmu->root_role.direct && mmu->root_role.level == PT32E_ROOT_LEVEL) ||
+ *	(!mmu->root_role.direct && mmu->root_role.level == PT32E_ROOT_LEVEL) ||
+ *	(!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL)
+ * =>
+ *	(mmu->root_role.direct && mmu->root_role.level == PT32E_ROOT_LEVEL) ||
+ *	(!mmu->root_role.direct &&
+ *	 (mmu->root_role.level == PT32E_ROOT_LEVEL ||
+ *	  mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL))
+ * => (for !direct, mmu->root_role.level == PT32E_ROOT_LEVEL implies
+ *     mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL)
+ * =>
+ *	(mmu->root_role.direct && mmu->root_role.level == PT32E_ROOT_LEVEL) ||
+ *	(!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL)
+ *
+ * In other words:
+ *
+ * For the first and third cases, it is
+ *	mmu->root_role.direct && mmu->root_role.level == PT32E_ROOT_LEVEL
+ * and if this condition is true, it must be one of the two cases.
+ *
+ * For the 2nd, 4th and 5th cases, it is
+ *	!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL
+ * and if this condition is true, it must be one of the three cases although
+ * it is not so intuitive.  It can be split into:
+ *	mmu->root_role.level == PT32E_ROOT_LEVEL &&
+ *	(!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL)
+ * which is for the 2nd and 4th cases, and
+ *	mmu->root_role.level > PT32E_ROOT_LEVEL &&
+ *	!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL
+ * which is the last case.
  */
 static bool using_local_root_page(struct kvm_mmu *mmu)
 {
-	return mmu->root_role.level == PT32E_ROOT_LEVEL ||
-	       (!mmu->root_role.direct && mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL);
+	if (mmu->root_role.direct)
+		return mmu->root_role.level == PT32E_ROOT_LEVEL;
+	else
+		return mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL;
 }
 
 static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct)

From patchwork Sat May 21 13:16:52 2022
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 04/12] KVM: X86/MMU: Add local shadow pages
Date: Sat, 21 May 2022 21:16:52 +0800
Message-Id: <20220521131700.3661-5-jiangshanlai@gmail.com>

From: Lai Jiangshan

Local shadow pages are shadow pages that hold PDPTEs for a 32bit guest,
or higher-level shadow pages whose children are local shadow pages, used
when shadowing nested NPT for a 32bit L1 in a 64bit L0.

The current code uses mmu->pae_root, mmu->pml4_root, and mmu->pml5_root
to set up the local root pages.  The initialization code is complex, and
the root pages are not associated with a struct kvm_mmu_page, which makes
the code even more complex.

Add kvm_mmu_alloc_local_shadow_page() and mmu_free_local_root_page() to
allocate and free local shadow pages, and prepare for using local shadow
pages to replace the current logic, sharing most of the logic with
non-local shadow pages.

The code is not activated yet, since using_local_root_page() is false at
the place where the new code is inserted.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c | 109 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 108 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 240ebe589caf..c941a5931bc3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1764,6 +1764,76 @@ static bool using_local_root_page(struct kvm_mmu *mmu)
 	return mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL;
 }
 
+/*
+ * Local shadow pages are shadow pages to hold PDPTEs for 32bit guest or
+ * higher level shadow pages having children local shadow pages when
+ * shadowing nested NPT for 32bit L1 in 64 bit L0.
+ *
+ * Local shadow pages are often local shadow root pages (or local root pages
+ * for short) except when shadowing nested NPT for 32bit L1 in 64 bit L0,
+ * which has 2 or 3 levels of local shadow pages on top of non-local shadow
+ * pages.
+ *
+ * Local shadow pages are locally allocated.  If the local shadow page's
+ * level is PT32E_ROOT_LEVEL, it will use the preallocated mmu->pae_root for
+ * its sp->spt, because sp->spt may need to be put in the 32 bits CR3 (even
+ * on x86_64) or decrypted.  Using the preallocated one to handle these
+ * requirements makes the allocation simpler.
+ *
+ * Local shadow pages are only visible to the local VCPU except through
+ * sp->parent_ptes rmap from their children, so they are not in
+ * kvm->arch.active_mmu_pages nor in the hash.
+ *
+ * And they are neither accounted nor write-protected since they don't
+ * shadow a guest page table.
+ *
+ * Because of the above, local shadow pages can not be freed nor zapped like
+ * non-local shadow pages.  They are freed directly when the local root page
+ * is freed, see mmu_free_local_root_page().
+ *
+ * A local root page can not be put on mmu->prev_roots because the comparison
+ * must use PDPTEs instead of CR3 and mmu->pae_root can not be shared for
+ * multiple local root pages.
+ *
+ * Except for the above limitations, all the other abilities are the same as
+ * for other shadow pages, like link, parent rmap, sync, unsync etc.
+ *
+ * Local shadow pages can be obsoleted in a little different way than
+ * the non-local shadow pages.
When the obsoleting process is done, all the + * obsoleted non-local shadow pages are unlinked from the local shadow pages + * by the help of the sp->parent_ptes rmap and the local shadow pages become + * theoretically valid again except sp->mmu_valid_gen may be still outdated. + * If there is no other event to cause a VCPU to free the local root page and + * the VCPU is being preempted by the host during two obsoleting processes, + * sp->mmu_valid_gen might become valid again and the VCPU can reuse it when + * the VCPU is back. It is different from the non-local shadow pages which + * are always freed after obsoleted. + */ +static struct kvm_mmu_page * +kvm_mmu_alloc_local_shadow_page(struct kvm_vcpu *vcpu, union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + + sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); + sp->gfn = 0; + sp->role = role; + /* + * Use the preallocated mmu->pae_root when the shadow page's + * level is PT32E_ROOT_LEVEL which may need to be put in the 32 bits + * CR3 (even in x86_64) or decrypted. The preallocated one is prepared + * for the requirements. + */ + if (role.level == PT32E_ROOT_LEVEL && + !WARN_ON_ONCE(!vcpu->arch.mmu->pae_root)) + sp->spt = vcpu->arch.mmu->pae_root; + else + sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + /* sp->gfns is not used for local shadow page */ + set_page_private(virt_to_page(sp->spt), (unsigned long)sp); + sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen; + + return sp; +} + static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct) { struct kvm_mmu_page *sp; @@ -2121,6 +2191,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, if (level <= vcpu->arch.mmu->cpu_role.base.level) role.passthrough = 0; + if (unlikely(level >= PT32E_ROOT_LEVEL && using_local_root_page(vcpu->arch.mmu))) + return kvm_mmu_alloc_local_shadow_page(vcpu, role); + sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; for_each_valid_sp(vcpu->kvm, sp, sp_list) { if (sp->gfn != gfn) { @@ -3351,6 +3424,37 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa, *root_hpa = INVALID_PAGE; } +static void mmu_free_local_root_page(struct kvm *kvm, struct kvm_mmu *mmu) +{ + u64 spte = mmu->root.hpa; + struct kvm_mmu_page *sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK); + int i; + + /* Free level 5 or 4 roots for shadow NPT for 32 bit L1 */ + while (sp->role.level > PT32E_ROOT_LEVEL) + { + spte = sp->spt[0]; + mmu_page_zap_pte(kvm, sp, sp->spt + 0, NULL); + free_page((unsigned long)sp->spt); + kmem_cache_free(mmu_page_header_cache, sp); + if (!is_shadow_present_pte(spte)) + return; + sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK); + } + + if (WARN_ON_ONCE(sp->role.level != PT32E_ROOT_LEVEL)) + return; + + /* Disconnect PAE root from the 4 PAE page directories */ + for (i = 0; i < 4; i++) + mmu_page_zap_pte(kvm, sp, sp->spt + i, NULL); + + if (sp->spt != mmu->pae_root) + free_page((unsigned long)sp->spt); + + kmem_cache_free(mmu_page_header_cache, sp); +} + /* roots_to_free must be some combination of the KVM_MMU_ROOT_* flags */ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu, ulong roots_to_free) @@ -3384,7 +3488,10 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu, if (free_active_root) { if (to_shadow_page(mmu->root.hpa)) { - mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); + if (using_local_root_page(mmu)) + mmu_free_local_root_page(kvm, mmu); + else + mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); 
 		} else if (mmu->pae_root) {
 			for (i = 0; i < 4; ++i) {
 				if (!IS_VALID_PAE_ROOT(mmu->pae_root[i]))

From patchwork Sat May 21 13:16:53 2022
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 05/12] KVM: X86/MMU: Link PAE root pagetable with its children
Date: Sat, 21 May 2022 21:16:53 +0800
Message-Id: <20220521131700.3661-6-jiangshanlai@gmail.com>
From: Lai Jiangshan

When local shadow pages are activated, link_shadow_page() might link a
local shadow page, which is the PAE root for PAE paging, with its
children.

Add make_pae_pdpte() to handle it.

The code is not activated yet, since local shadow pages are not
activated yet.
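As background for the BUILD_BUG_ON in the diff below: the only ignored
bits in a PDPTE are 11:9, so KVM's software "MMU present" marker must sit
there.  A user-space sketch of the same invariant (assuming bit 11 for
the marker and omitting shadow_me_value; these are stand-ins, not KVM's
definitions):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GENMASK(h, l) (((~0ull) << (l)) & (~0ull >> (63 - (h))))
#define PT_PRESENT_MASK (1ull << 0)
/* assumed: the software-present marker lives in ignored bit 11 */
#define SPTE_MMU_PRESENT_MASK (1ull << 11)

static uint64_t make_pae_pdpte(uint64_t child_pt_pa)
{
	/* mirror of the patch's BUILD_BUG_ON: the software-present
	 * marker must fall inside the PDPTE's ignored bits 11:9 */
	assert(GENMASK(11, 9) & SPTE_MMU_PRESENT_MASK);
	return child_pt_pa | PT_PRESENT_MASK | SPTE_MMU_PRESENT_MASK;
}

int main(void)
{
	uint64_t pdpte = make_pae_pdpte(0x12345000ull);

	printf("pdpte = %#llx\n", (unsigned long long)pdpte);
	return 0;
}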
Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c  | 6 +++++-
 arch/x86/kvm/mmu/spte.c | 7 +++++++
 arch/x86/kvm/mmu/spte.h | 1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c941a5931bc3..e1a059dd9621 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2340,7 +2340,11 @@ static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep,
 
 	BUILD_BUG_ON(VMX_EPT_WRITABLE_MASK != PT_WRITABLE_MASK);
 
-	spte = make_nonleaf_spte(sp->spt, sp_ad_disabled(sp));
+	if (unlikely(sp->role.level == PT32_ROOT_LEVEL &&
+		     vcpu->arch.mmu->root_role.level == PT32E_ROOT_LEVEL))
+		spte = make_pae_pdpte(sp->spt);
+	else
+		spte = make_nonleaf_spte(sp->spt, sp_ad_disabled(sp));
 
 	mmu_spte_set(sptep, spte);
 
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index b5960bbde7f7..5c31fa1d2b61 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -279,6 +279,13 @@ u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index)
 	return child_spte;
 }
 
+u64 make_pae_pdpte(u64 *child_pt)
+{
+	/* The only ignored bits in the PDPTE are 11:9. */
+	BUILD_BUG_ON(!(GENMASK(11, 9) & SPTE_MMU_PRESENT_MASK));
+	return __pa(child_pt) | PT_PRESENT_MASK | SPTE_MMU_PRESENT_MASK |
+	       shadow_me_value;
+}
 
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled)
 {
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 0127bb6e3c7d..2408ba1361d5 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -426,6 +426,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	       u64 old_spte, bool prefetch, bool can_unsync,
 	       bool host_writable, u64 *new_spte);
 u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index);
+u64 make_pae_pdpte(u64 *child_pt);
 u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled);
 u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access);
 u64 mark_spte_for_access_track(u64 spte);

From patchwork Sat May 21 13:16:54 2022
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 06/12] KVM: X86/MMU: Activate local shadow pages and remove old logic
Date: Sat, 21 May 2022 21:16:54 +0800
Message-Id: <20220521131700.3661-7-jiangshanlai@gmail.com>

From: Lai Jiangshan

Activate local shadow pages by allocating them in
mmu_alloc_direct_roots() and mmu_alloc_shadow_roots().

Make shadow walks start from the topmost shadow page even if it is a
local shadow page, so that it can be walked like a normal root and
shadowed PDPTEs can be made and installed on demand.

Walking from the topmost page means FNAME(fetch) has to visit high-level
local shadow pages and allocate local shadow pages when shadowing NPT for
a 32bit L1 in a 64bit host, so change FNAME(fetch) and
FNAME(walk_addr_generic) to handle this in the affected code.

Sync from the topmost page in kvm_mmu_sync_roots(), which simplifies the
code.

Now all the root pages, and every pagetable pointed to by a present spte
in struct kvm_mmu, are associated with a struct kvm_mmu_page, and
to_shadow_page() is guaranteed to return non-NULL.

The affected cases are those where using_local_root_page() returns true.
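For intuition only, a toy walk (invented names, uniform 9-bit indexes,
not KVM code) of what "walking from the topmost shadow page" means once
every level, including a local root, is an ordinary table:

#include <stdio.h>

struct toy_sp {
	int level;			/* 1 = leaf page table */
	struct toy_sp *child[512];
};

/* Walk downward from the topmost table; every level uses the same
 * index math (the 4-entry PAE top level is glossed over here). */
static void walk_from_root(struct toy_sp *root, unsigned long addr)
{
	for (struct toy_sp *sp = root; sp; ) {
		int idx = (addr >> (12 + 9 * (sp->level - 1))) & 511;

		printf("level %d, index %d\n", sp->level, idx);
		if (sp->level == 1)
			break;
		sp = sp->child[idx];
	}
}

int main(void)
{
	struct toy_sp l1 = { .level = 1 };
	struct toy_sp l2 = { .level = 2 };
	struct toy_sp l3 = { .level = 3 };
	struct toy_sp l4 = { .level = 4 };

	l4.child[0] = &l3;
	l3.child[0] = &l2;
	l2.child[0] = &l1;
	walk_from_root(&l4, 0x1234);
	return 0;
}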
Signed-off-by: Lai Jiangshan --- arch/x86/kvm/mmu/mmu.c | 174 +++------------------------------ arch/x86/kvm/mmu/paging_tmpl.h | 18 +++- 2 files changed, 31 insertions(+), 161 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index e1a059dd9621..684a0221aa4c 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1691,9 +1691,9 @@ static void drop_parent_pte(struct kvm_mmu_page *sp, } /* - * KVM uses the VCPU's local root page (vcpu->mmu->pae_root) when either the - * shadow pagetable is using PAE paging or the host is shadowing nested NPT for - * 32bit L1 hypervisor. + * KVM uses the VCPU's local root page (kvm_mmu_alloc_local_shadow_page()) when + * either the shadow pagetable is using PAE paging or the host is shadowing + * nested NPT for 32bit L1 hypervisor. * * It includes cases: * nonpaging when !tdp_enabled (direct paging) @@ -2277,26 +2277,6 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato iterator->addr = addr; iterator->shadow_addr = root; iterator->level = vcpu->arch.mmu->root_role.level; - - if (iterator->level >= PT64_ROOT_4LEVEL && - vcpu->arch.mmu->cpu_role.base.level < PT64_ROOT_4LEVEL && - !vcpu->arch.mmu->root_role.direct) - iterator->level = PT32E_ROOT_LEVEL; - - if (iterator->level == PT32E_ROOT_LEVEL) { - /* - * prev_root is currently only used for 64-bit hosts. So only - * the active root_hpa is valid here. - */ - BUG_ON(root != vcpu->arch.mmu->root.hpa); - - iterator->shadow_addr - = vcpu->arch.mmu->pae_root[(addr >> 30) & 3]; - iterator->shadow_addr &= PT64_BASE_ADDR_MASK; - --iterator->level; - if (!iterator->shadow_addr) - iterator->level = 0; - } } static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator, @@ -3491,21 +3471,10 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu, &invalid_list); if (free_active_root) { - if (to_shadow_page(mmu->root.hpa)) { - if (using_local_root_page(mmu)) - mmu_free_local_root_page(kvm, mmu); - else - mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); - } else if (mmu->pae_root) { - for (i = 0; i < 4; ++i) { - if (!IS_VALID_PAE_ROOT(mmu->pae_root[i])) - continue; - - mmu_free_root_page(kvm, &mmu->pae_root[i], - &invalid_list); - mmu->pae_root[i] = INVALID_PAE_ROOT; - } - } + if (using_local_root_page(mmu)) + mmu_free_local_root_page(kvm, mmu); + else + mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); mmu->root.hpa = INVALID_PAGE; mmu->root.pgd = 0; } @@ -3570,7 +3539,6 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) struct kvm_mmu *mmu = vcpu->arch.mmu; u8 shadow_root_level = mmu->root_role.level; hpa_t root; - unsigned i; int r; write_lock(&vcpu->kvm->mmu_lock); @@ -3581,24 +3549,9 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) if (is_tdp_mmu_enabled(vcpu->kvm)) { root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu); mmu->root.hpa = root; - } else if (shadow_root_level >= PT64_ROOT_4LEVEL) { + } else if (shadow_root_level >= PT32E_ROOT_LEVEL) { root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level, true); mmu->root.hpa = root; - } else if (shadow_root_level == PT32E_ROOT_LEVEL) { - if (WARN_ON_ONCE(!mmu->pae_root)) { - r = -EIO; - goto out_unlock; - } - - for (i = 0; i < 4; ++i) { - WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i])); - - root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT), - i << 30, PT32_ROOT_LEVEL, true); - mmu->pae_root[i] = root | PT_PRESENT_MASK | - shadow_me_mask; - } - mmu->root.hpa = __pa(mmu->pae_root); } else { WARN_ONCE(1, "Bad TDP root level = %d\n", shadow_root_level); r = -EIO; 
@@ -3676,10 +3629,8 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm) static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) { struct kvm_mmu *mmu = vcpu->arch.mmu; - u64 pdptrs[4], pm_mask; gfn_t root_gfn, root_pgd; hpa_t root; - unsigned i; int r; root_pgd = mmu->get_guest_pgd(vcpu); @@ -3688,21 +3639,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) if (mmu_check_root(vcpu, root_gfn)) return 1; - /* - * On SVM, reading PDPTRs might access guest memory, which might fault - * and thus might sleep. Grab the PDPTRs before acquiring mmu_lock. - */ - if (mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) { - for (i = 0; i < 4; ++i) { - pdptrs[i] = mmu->get_pdptr(vcpu, i); - if (!(pdptrs[i] & PT_PRESENT_MASK)) - continue; - - if (mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT)) - return 1; - } - } - r = mmu_first_shadow_root_alloc(vcpu->kvm); if (r) return r; @@ -3712,70 +3648,9 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) if (r < 0) goto out_unlock; - /* - * Do we shadow a long mode page table? If so we need to - * write-protect the guests page table root. - */ - if (mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) { - root = mmu_alloc_root(vcpu, root_gfn, 0, - mmu->root_role.level, false); - mmu->root.hpa = root; - goto set_root_pgd; - } - - if (WARN_ON_ONCE(!mmu->pae_root)) { - r = -EIO; - goto out_unlock; - } - - /* - * We shadow a 32 bit page table. This may be a legacy 2-level - * or a PAE 3-level page table. In either case we need to be aware that - * the shadow page table may be a PAE or a long mode page table. - */ - pm_mask = PT_PRESENT_MASK | shadow_me_value; - if (mmu->root_role.level >= PT64_ROOT_4LEVEL) { - pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK; - - if (WARN_ON_ONCE(!mmu->pml4_root)) { - r = -EIO; - goto out_unlock; - } - mmu->pml4_root[0] = __pa(mmu->pae_root) | pm_mask; - - if (mmu->root_role.level == PT64_ROOT_5LEVEL) { - if (WARN_ON_ONCE(!mmu->pml5_root)) { - r = -EIO; - goto out_unlock; - } - mmu->pml5_root[0] = __pa(mmu->pml4_root) | pm_mask; - } - } - - for (i = 0; i < 4; ++i) { - WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i])); - - if (mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) { - if (!(pdptrs[i] & PT_PRESENT_MASK)) { - mmu->pae_root[i] = INVALID_PAE_ROOT; - continue; - } - root_gfn = pdptrs[i] >> PAGE_SHIFT; - } - - root = mmu_alloc_root(vcpu, root_gfn, i << 30, - PT32_ROOT_LEVEL, false); - mmu->pae_root[i] = root | pm_mask; - } - - if (mmu->root_role.level == PT64_ROOT_5LEVEL) - mmu->root.hpa = __pa(mmu->pml5_root); - else if (mmu->root_role.level == PT64_ROOT_4LEVEL) - mmu->root.hpa = __pa(mmu->pml4_root); - else - mmu->root.hpa = __pa(mmu->pae_root); - -set_root_pgd: + root = mmu_alloc_root(vcpu, root_gfn, 0, + mmu->root_role.level, false); + mmu->root.hpa = root; mmu->root.pgd = root_pgd; out_unlock: write_unlock(&vcpu->kvm->mmu_lock); @@ -3892,8 +3767,7 @@ static bool is_unsync_root(hpa_t root) void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) { - int i; - struct kvm_mmu_page *sp; + hpa_t root = vcpu->arch.mmu->root.hpa; if (vcpu->arch.mmu->root_role.direct) return; @@ -3903,31 +3777,11 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY); - if (vcpu->arch.mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL) { - hpa_t root = vcpu->arch.mmu->root.hpa; - sp = to_shadow_page(root); - - if (!is_unsync_root(root)) - return; - - write_lock(&vcpu->kvm->mmu_lock); - mmu_sync_children(vcpu, sp, true); - write_unlock(&vcpu->kvm->mmu_lock); + if (!is_unsync_root(root)) return; - } 
 	write_lock(&vcpu->kvm->mmu_lock);
-
-	for (i = 0; i < 4; ++i) {
-		hpa_t root = vcpu->arch.mmu->pae_root[i];
-
-		if (IS_VALID_PAE_ROOT(root)) {
-			root &= PT64_BASE_ADDR_MASK;
-			sp = to_shadow_page(root);
-			mmu_sync_children(vcpu, sp, true);
-		}
-	}
-
+	mmu_sync_children(vcpu, to_shadow_page(root), true);
 	write_unlock(&vcpu->kvm->mmu_lock);
 }
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 6e3df84e8455..cd6032e1947c 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -316,6 +316,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	u16 errcode = 0;
 	gpa_t real_gpa;
 	gfn_t gfn;
+	int i;
 
 	trace_kvm_mmu_pagetable_walk(addr, access);
 retry_walk:
@@ -323,6 +324,20 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	pte = mmu->get_guest_pgd(vcpu);
 	have_ad = PT_HAVE_ACCESSED_DIRTY(mmu);
 
+	/*
+	 * Initialize the guest walker with default values.  These values will
+	 * be used in cases where KVM shadows a guest page table structure
+	 * with more levels than the guest has.  For example, KVM shadows
+	 * 3-level nested NPT for 32 bit L1 with 5-level NPT paging.
+	 *
+	 * Note, the gfn is technically ignored for these local shadow pages,
+	 * but it's more consistent to always pass 0 to kvm_mmu_get_page().
+	 */
+	for (i = PT32_ROOT_LEVEL; i < PT_MAX_FULL_LEVELS; i++) {
+		walker->table_gfn[i] = 0;
+		walker->pt_access[i] = ACC_ALL;
+	}
+
 #if PTTYPE == 64
 	walk_nx_mask = 1ULL << PT64_NX_SHIFT;
 	if (walker->level == PT32E_ROOT_LEVEL) {
@@ -675,7 +690,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 		 * Verify that the gpte in the page we've just write
 		 * protected is still there.
 		 */
-		if (FNAME(gpte_changed)(vcpu, gw, it.level - 1))
+		if (it.level - 1 < top_level &&
+		    FNAME(gpte_changed)(vcpu, gw, it.level - 1))
 			goto out_gpte_changed;
 
 		if (sp)

From patchwork Sat May 21 13:16:55 2022
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 07/12] KVM: X86/MMU: Remove the check of the return value of to_shadow_page()
Date: Sat, 21 May 2022 21:16:55 +0800
Message-Id: <20220521131700.3661-8-jiangshanlai@gmail.com>

From: Lai Jiangshan

Remove the check of the return value of to_shadow_page() in
mmu_free_root_page(), kvm_mmu_free_guest_mode_roots(), is_unsync_root()
and is_tdp_mmu() because it can no longer return NULL.

Remove the check of the return value of to_shadow_page() in
is_page_fault_stale() and is_obsolete_root() because it can not return
NULL there either, and obsoleting a local shadow page is already handled
in a different way.  When the obsoleting process is done, all the
obsoleted non-local shadow pages are already unlinked from the local
shadow pages with the help of the parent rmap from the children, and the
local shadow pages become theoretically valid again.  A local shadow page
can be freed if is_obsolete_sp() returns true, or be reused if
is_obsolete_sp() becomes false.
Reviewed-by: David Matlack Signed-off-by: Lai Jiangshan --- arch/x86/kvm/mmu/mmu.c | 44 +++----------------------------------- arch/x86/kvm/mmu/tdp_mmu.h | 7 +----- 2 files changed, 4 insertions(+), 47 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 684a0221aa4c..90b715eefe6a 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3397,8 +3397,6 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa, return; sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK); - if (WARN_ON(!sp)) - return; if (is_tdp_mmu_page(sp)) kvm_tdp_mmu_put_root(kvm, sp, false); @@ -3501,8 +3499,7 @@ void kvm_mmu_free_guest_mode_roots(struct kvm *kvm, struct kvm_mmu *mmu) if (!VALID_PAGE(root_hpa)) continue; - if (!to_shadow_page(root_hpa) || - to_shadow_page(root_hpa)->role.guest_mode) + if (to_shadow_page(root_hpa)->role.guest_mode) roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i); } @@ -3752,13 +3749,6 @@ static bool is_unsync_root(hpa_t root) smp_rmb(); sp = to_shadow_page(root); - /* - * PAE roots (somewhat arbitrarily) aren't backed by shadow pages, the - * PDPTEs for a given PAE root need to be synchronized individually. - */ - if (WARN_ON_ONCE(!sp)) - return false; - if (sp->unsync || sp->unsync_children) return true; @@ -4068,21 +4058,7 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) static bool is_page_fault_stale(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, int mmu_seq) { - struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root.hpa); - - /* Special roots, e.g. pae_root, are not backed by shadow pages. */ - if (sp && is_obsolete_sp(vcpu->kvm, sp)) - return true; - - /* - * Roots without an associated shadow page are considered invalid if - * there is a pending request to free obsolete roots. The request is - * only a hint that the current root _may_ be obsolete and needs to be - * reloaded, e.g. if the guest frees a PGD that KVM is tracking as a - * previous root, then __kvm_mmu_prepare_zap_page() signals all vCPUs - * to reload even if no vCPU is actively using the root. - */ - if (!sp && kvm_test_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu)) + if (is_obsolete_sp(vcpu->kvm, to_shadow_page(vcpu->arch.mmu->root.hpa))) return true; return fault->slot && @@ -5190,24 +5166,10 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu) static bool is_obsolete_root(struct kvm *kvm, hpa_t root_hpa) { - struct kvm_mmu_page *sp; - if (!VALID_PAGE(root_hpa)) return false; - /* - * When freeing obsolete roots, treat roots as obsolete if they don't - * have an associated shadow page. This does mean KVM will get false - * positives and free roots that don't strictly need to be freed, but - * such false positives are relatively rare: - * - * (a) only PAE paging and nested NPT has roots without shadow pages - * (b) remote reloads due to a memslot update obsoletes _all_ roots - * (c) KVM doesn't track previous roots for PAE paging, and the guest - * is unlikely to zap an in-use PGD. 
- */ - sp = to_shadow_page(root_hpa); - return !sp || is_obsolete_sp(kvm, sp); + return is_obsolete_sp(kvm, to_shadow_page(root_hpa)); } static void __kvm_mmu_free_obsolete_roots(struct kvm *kvm, struct kvm_mmu *mmu) diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index c163f7cc23ca..5779a2a7161e 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -78,13 +78,8 @@ static inline bool is_tdp_mmu(struct kvm_mmu *mmu) if (WARN_ON(!VALID_PAGE(hpa))) return false; - /* - * A NULL shadow page is legal when shadowing a non-paging guest with - * PAE paging, as the MMU will be direct with root_hpa pointing at the - * pae_root page, not a shadow page. - */ sp = to_shadow_page(hpa); - return sp && is_tdp_mmu_page(sp) && sp->root_count; + return is_tdp_mmu_page(sp) && sp->root_count; } #else static inline int kvm_mmu_init_tdp_mmu(struct kvm *kvm) { return 0; }
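With the NULL checks gone, is_page_fault_stale() and is_obsolete_root() reduce to a bare is_obsolete_sp() on the root's shadow page. For reference, that predicate is essentially a generation check; a rough sketch, paraphrased from mmu.c of this era (not part of the patch):

	static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
	{
		if (sp->role.invalid)
			return true;

		/* TDP MMU pages do not use the shadow MMU's generation number. */
		return !sp->tdp_mmu_page &&
		       unlikely(sp->mmu_valid_gen != kvm->arch.mmu_valid_gen);
	}

A root is therefore obsolete exactly when it was created under an older mmu_valid_gen or was explicitly invalidated, which is the semantics the commit message relies on for freeing or reusing local shadow pages.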
From patchwork Sat May 21 13:16:56 2022

From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 08/12] KVM: X86/MMU: Allocate mmu->pae_root for PAE paging on-demand
Date: Sat, 21 May 2022 21:16:56 +0800
Message-Id: <20220521131700.3661-9-jiangshanlai@gmail.com>
In-Reply-To: <20220521131700.3661-1-jiangshanlai@gmail.com>
References: <20220521131700.3661-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

mmu->pae_root for non-PAE paging is allocated on demand, but mmu->pae_root for PAE paging is allocated early, when the struct kvm_mmu is created. Simplify the code by allocating mmu->pae_root for PAE paging on demand as well.

Signed-off-by: Lai Jiangshan
---
arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 101 +++++++++++++------------------- arch/x86/kvm/x86.c | 4 +- 3 files changed, 44 insertions(+), 63 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 9cdc5bbd721f..fb9751dfc1a7 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1615,7 +1615,7 @@ int kvm_mmu_vendor_module_init(void); void kvm_mmu_vendor_module_exit(void); void kvm_mmu_destroy(struct kvm_vcpu *vcpu); -int kvm_mmu_create(struct kvm_vcpu *vcpu); +void kvm_mmu_create(struct kvm_vcpu *vcpu); int kvm_mmu_init_vm(struct kvm *kvm); void kvm_mmu_uninit_vm(struct kvm *kvm); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 90b715eefe6a..63c2b2c6122c 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -668,6 +668,41 @@ static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu) } } +static int mmu_alloc_pae_root(struct kvm_vcpu *vcpu) +{ + struct page *page; + + if (vcpu->arch.mmu->root_role.level != PT32E_ROOT_LEVEL) + return 0; + if (vcpu->arch.mmu->pae_root) + return 0; + + /* + * Allocate a page to hold the four PDPTEs for PAE paging when emulating + * 32-bit mode. CR3 is only 32 bits even on x86_64 in this case. + * Therefore we need to allocate the PDP table in the first 4GB of + * memory, which happens to fit the DMA32 zone. + */ + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_DMA32); + if (!page) + return -ENOMEM; + vcpu->arch.mmu->pae_root = page_address(page); + + /* + * CR3 is only 32 bits when PAE paging is used, thus it's impossible to + * get the CPU to treat the PDPTEs as encrypted. Decrypt the page so + * that KVM's writes and the CPU's reads get along. Note, this is + * only necessary when using shadow paging, as 64-bit NPT can get at + * the C-bit even when shadowing 32-bit NPT, and SME isn't supported + * by 32-bit kernels (when KVM itself uses 32-bit NPT).
+ */ + if (!tdp_enabled) + set_memory_decrypted((unsigned long)vcpu->arch.mmu->pae_root, 1); + else + WARN_ON_ONCE(shadow_me_value); + return 0; +} + static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect) { int r; @@ -5127,6 +5162,9 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->root_role.direct); if (r) goto out; + r = mmu_alloc_pae_root(vcpu); + if (r) + return r; r = mmu_alloc_special_roots(vcpu); if (r) goto out; @@ -5591,63 +5629,18 @@ static void free_mmu_pages(struct kvm_mmu *mmu) free_page((unsigned long)mmu->pml5_root); } -static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu) +static void __kvm_mmu_create(struct kvm_mmu *mmu) { - struct page *page; int i; mmu->root.hpa = INVALID_PAGE; mmu->root.pgd = 0; for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) mmu->prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID; - - /* vcpu->arch.guest_mmu isn't used when !tdp_enabled. */ - if (!tdp_enabled && mmu == &vcpu->arch.guest_mmu) - return 0; - - /* - * When using PAE paging, the four PDPTEs are treated as 'root' pages, - * while the PDP table is a per-vCPU construct that's allocated at MMU - * creation. When emulating 32-bit mode, cr3 is only 32 bits even on - * x86_64. Therefore we need to allocate the PDP table in the first - * 4GB of memory, which happens to fit the DMA32 zone. TDP paging - * generally doesn't use PAE paging and can skip allocating the PDP - * table. The main exception, handled here, is SVM's 32-bit NPT. The - * other exception is for shadowing L1's 32-bit or PAE NPT on 64-bit - * KVM; that horror is handled on-demand by mmu_alloc_special_roots(). - */ - if (tdp_enabled && kvm_mmu_get_tdp_level(vcpu) > PT32E_ROOT_LEVEL) - return 0; - - page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_DMA32); - if (!page) - return -ENOMEM; - - mmu->pae_root = page_address(page); - - /* - * CR3 is only 32 bits when PAE paging is used, thus it's impossible to - * get the CPU to treat the PDPTEs as encrypted. Decrypt the page so - * that KVM's writes and the CPU's reads get along. Note, this is - * only necessary when using shadow paging, as 64-bit NPT can get at - * the C-bit even when shadowing 32-bit NPT, and SME isn't supported - * by 32-bit kernels (when KVM itself uses 32-bit NPT). 
- */ - if (!tdp_enabled) - set_memory_decrypted((unsigned long)mmu->pae_root, 1); - else - WARN_ON_ONCE(shadow_me_value); - - for (i = 0; i < 4; ++i) - mmu->pae_root[i] = INVALID_PAE_ROOT; - - return 0; } -int kvm_mmu_create(struct kvm_vcpu *vcpu) +void kvm_mmu_create(struct kvm_vcpu *vcpu) { - int ret; - vcpu->arch.mmu_pte_list_desc_cache.kmem_cache = pte_list_desc_cache; vcpu->arch.mmu_pte_list_desc_cache.gfp_zero = __GFP_ZERO; @@ -5659,18 +5652,8 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) vcpu->arch.mmu = &vcpu->arch.root_mmu; vcpu->arch.walk_mmu = &vcpu->arch.root_mmu; - ret = __kvm_mmu_create(vcpu, &vcpu->arch.guest_mmu); - if (ret) - return ret; - - ret = __kvm_mmu_create(vcpu, &vcpu->arch.root_mmu); - if (ret) - goto fail_allocate_root; - - return ret; - fail_allocate_root: - free_mmu_pages(&vcpu->arch.guest_mmu); - return ret; + __kvm_mmu_create(&vcpu->arch.guest_mmu); + __kvm_mmu_create(&vcpu->arch.root_mmu); } #define BATCH_ZAP_PAGES 10 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 04812eaaf61b..064aecb188dc 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11285,9 +11285,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) else vcpu->arch.mp_state = KVM_MP_STATE_UNINITIALIZED; - r = kvm_mmu_create(vcpu); - if (r < 0) - return r; + kvm_mmu_create(vcpu); if (irqchip_in_kernel(vcpu->kvm)) { r = kvm_create_lapic(vcpu, lapic_timer_advance_ns);
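One subtlety worth noting: set_memory_decrypted() changes the page's encryption attribute in the direct map, so the teardown path has to restore it before the page goes back to the allocator. A minimal sketch of the required pairing, assuming the free_mmu_pages() form used later in this series:

	/* allocation side, shadow paging only (!tdp_enabled) */
	page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_DMA32);
	if (!page)
		return -ENOMEM;
	mmu->pae_root = page_address(page);
	set_memory_decrypted((unsigned long)mmu->pae_root, 1);

	/* teardown side: undo the attribute change before freeing */
	if (!tdp_enabled && mmu->pae_root)
		set_memory_encrypted((unsigned long)mmu->pae_root, 1);
	free_page((unsigned long)mmu->pae_root);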
From patchwork Sat May 21 13:16:57 2022

From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 09/12] KVM: X86/MMU: Move the verifying of NPT's PDPTE in FNAME(fetch)
Date: Sat, 21 May 2022 21:16:57 +0800
Message-Id: <20220521131700.3661-10-jiangshanlai@gmail.com>
In-Reply-To: <20220521131700.3661-1-jiangshanlai@gmail.com>
References: <20220521131700.3661-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

FNAME(page_fault) verifies the PDPTE for nested NPT in PAE paging mode because nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE from memory unconditionally on each call. The verification is complicated, and it works only as long as mmu->pae_root is always used when the guest uses PAE paging.

Move the verification code into FNAME(fetch) and simplify it: the local shadow page is used there, so it can be walked in FNAME(fetch) and unlinked from its children via drop_spte(). This also allows mmu->pae_root NOT to be used when the root is not required to be put in a 32-bit CR3.

Signed-off-by: Lai Jiangshan
---
arch/x86/kvm/mmu/paging_tmpl.h | 72 ++++++++++++++++------------------ 1 file changed, 33 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index cd6032e1947c..67c419bce1e5 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -659,6 +659,39 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, clear_sp_write_flooding_count(it.sptep); drop_large_spte(vcpu, it.sptep); + /* + * When nested NPT is enabled and L1 is PAE paging, + * mmu->get_pdptrs(), which is nested_svm_get_tdp_pdptr(), reads + * the guest NPT's PDPTE from memory unconditionally for each + * call. + * + * The guest PAE root page is not write-protected. + * + * The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get + * a value different from previous calls or different from the + * return value of mmu->get_pdptrs() in mmu_alloc_shadow_roots(). + * + * It will cause the following code to install the spte in a wrong + * sp or link a sp to a wrong parent if the return value of + * mmu->get_pdptrs() is not verified unchanged, since + * FNAME(gpte_changed) can't check this kind of change. + * + * Verify the return value of mmu->get_pdptrs() (only the gfn + * in it needs to be checked) and drop the spte if the gfn isn't + * matched. + * + * Do the verifying unconditionally when the guest is PAE + * paging, no matter whether it is nested NPT or not, to avoid + * complicated code.
+ */ + if (vcpu->arch.mmu->cpu_role.base.level == PT32E_ROOT_LEVEL && + it.level == PT32E_ROOT_LEVEL && + is_shadow_present_pte(*it.sptep)) { + sp = to_shadow_page(*it.sptep & PT64_BASE_ADDR_MASK); + if (gw->table_gfn[it.level - 2] != sp->gfn) + drop_spte(vcpu->kvm, it.sptep); + } + sp = NULL; if (!is_shadow_present_pte(*it.sptep)) { table_gfn = gw->table_gfn[it.level - 2]; @@ -886,44 +919,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault if (is_page_fault_stale(vcpu, fault, mmu_seq)) goto out_unlock; - /* - * When nested NPT enabled and L1 is PAE paging, mmu->get_pdptrs() - * which is nested_svm_get_tdp_pdptr() reads the guest NPT's PDPTE - * from memory unconditionally for each call. - * - * The guest PAE root page is not write-protected. - * - * The mmu->get_pdptrs() in FNAME(walk_addr_generic) might get a value - * different from previous calls or different from the return value of - * mmu->get_pdptrs() in mmu_alloc_shadow_roots(). - * - * It will cause FNAME(fetch) installs the spte in a wrong sp or links - * a sp to a wrong parent if the return value of mmu->get_pdptrs() - * is not verified unchanged since FNAME(gpte_changed) can't check - * this kind of change. - * - * Verify the return value of mmu->get_pdptrs() (only the gfn in it - * needs to be checked) and do kvm_mmu_free_roots() like load_pdptr() - * if the gfn isn't matched. - * - * Do the verifying unconditionally when the guest is PAE paging no - * matter whether it is nested NPT or not to avoid complicated code. - */ - if (vcpu->arch.mmu->cpu_role.base.level == PT32E_ROOT_LEVEL) { - u64 pdpte = vcpu->arch.mmu->pae_root[(fault->addr >> 30) & 3]; - struct kvm_mmu_page *sp = NULL; - - if (IS_VALID_PAE_ROOT(pdpte)) - sp = to_shadow_page(pdpte & PT64_BASE_ADDR_MASK); - - if (!sp || walker.table_gfn[PT32E_ROOT_LEVEL - 2] != sp->gfn) { - write_unlock(&vcpu->kvm->mmu_lock); - kvm_mmu_free_roots(vcpu->kvm, vcpu->arch.mmu, - KVM_MMU_ROOT_CURRENT); - goto release_clean; - } - } - r = make_mmu_pages_available(vcpu); if (r) goto out_unlock; @@ -931,7 +926,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault out_unlock: write_unlock(&vcpu->kvm->mmu_lock); -release_clean: kvm_release_pfn_clean(fault->pfn); return r; }
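The reason FNAME(gpte_changed) cannot catch a racing PDPTE update is that it only re-reads the gptes the walk recorded in gw->ptes[], while for a PAE guest the PDPTE is fetched separately through mmu->get_pdptrs() and never lands in gw->ptes[]. A condensed sketch of the existing check, omitting the 4K prefetch path (paraphrased, not part of the patch):

	static bool FNAME(gpte_changed)(struct kvm_vcpu *vcpu,
					struct guest_walker *gw, int level)
	{
		pt_element_t curr_pte;
		int r;

		/* Re-read the guest PTE and compare with what the walk saw. */
		r = kvm_vcpu_read_guest_atomic(vcpu, gw->pte_gpa[level - 1],
					       &curr_pte, sizeof(curr_pte));
		return r || curr_pte != gw->ptes[level - 1];
	}

Since the PDPTE escapes this check, the gfn comparison added in FNAME(fetch) above is the only place a stale PDPTE-level link can be detected and dropped.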
From patchwork Sat May 21 13:16:58 2022

From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 10/12] KVM: X86/MMU: Remove unused INVALID_PAE_ROOT and IS_VALID_PAE_ROOT
Date: Sat, 21 May 2022 21:16:58 +0800
Message-Id: <20220521131700.3661-11-jiangshanlai@gmail.com>
In-Reply-To: <20220521131700.3661-1-jiangshanlai@gmail.com>
References: <20220521131700.3661-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

They are unused now that invalid PAE roots are represented by 0ull, like other zero sptes, and checked via is_shadow_present_pte().

Signed-off-by: Lai Jiangshan
Reviewed-by: Sean Christopherson
---
arch/x86/kvm/mmu/mmu_internal.h | 10 ---------- 1 file changed, 10 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index bd2a26897b97..4feb1ac2742c 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -20,16 +20,6 @@ extern bool dbg; #define MMU_WARN_ON(x) do { } while (0) #endif -/* - * Unlike regular MMU roots, PAE "roots", a.k.a. PDPTEs/PDPTRs, have a PRESENT - * bit, and thus are guaranteed to be non-zero when valid. And, when a guest - * PDPTR is !PRESENT, its corresponding PAE root cannot be set to INVALID_PAGE, - * as the CPU would treat that as PRESENT PDPTR with reserved bits set. Use - * '0' instead of INVALID_PAGE to indicate an invalid PAE root.
- */ -#define INVALID_PAE_ROOT 0 -#define IS_VALID_PAE_ROOT(x) (!!(x)) - typedef u64 __rcu *tdp_ptep_t; struct kvm_mmu_page {
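Once PAE roots are ordinary shadow pages, a cleared PDPTE-level slot is simply 0ull and the generic presence check covers it. A sketch of the idiom that replaces IS_VALID_PAE_ROOT(), assuming the usual SPTE_MMU_PRESENT_MASK semantics; sptep here is a hypothetical pointer to a PDPTE-level spte:

	struct kvm_mmu_page *sp = NULL;
	u64 spte = *sptep;

	/*
	 * A cleared slot is 0ull, and 0 never carries SPTE_MMU_PRESENT_MASK,
	 * so is_shadow_present_pte() treats it as "nothing linked here".
	 */
	if (is_shadow_present_pte(spte))
		sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);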
From patchwork Sat May 21 13:16:59 2022

From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 11/12] KVM: X86/MMU: Don't use mmu->pae_root when shadowing PAE NPT in 64-bit host
Date: Sat, 21 May 2022 21:16:59 +0800
Message-Id: <20220521131700.3661-12-jiangshanlai@gmail.com>
In-Reply-To: <20220521131700.3661-1-jiangshanlai@gmail.com>
References: <20220521131700.3661-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Allocate the tables when allocating the local shadow page. Use the preallocated mmu->pae_root only when the shadow page table itself is PT32E_ROOT_LEVEL; when a 64-bit host shadows a 32-bit L1's PAE NPT, the shadow root is an ordinary cache-allocated page.

Signed-off-by: Lai Jiangshan
---
arch/x86/kvm/mmu/mmu.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 63c2b2c6122c..73e6a8e1e1a9 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1809,10 +1809,12 @@ static bool using_local_root_page(struct kvm_mmu *mmu) * 2 or 3 levels of local shadow pages on top of non-local shadow pages. * * Local shadow pages are locally allocated. If the local shadow page's level - * is PT32E_ROOT_LEVEL, it will use the preallocated mmu->pae_root for its - * sp->spt. Because sp->spt may need to be put in the 32 bits CR3 (even in - * x86_64) or decrypted. Using the preallocated one to handle these - * requirements makes the allocation simpler. + * is PT32E_ROOT_LEVEL, and it is not shadowing nested NPT for 32-bit L1 in + * 64-bit L0 (that is, when the shadow page table's level is PT32E_ROOT_LEVEL), + * it will use the preallocated mmu->pae_root for its sp->spt, because sp->spt + * needs to be put in the 32-bit CR3 (even in a 64-bit host) or decrypted. Using + * the preallocated one to handle these requirements makes the allocation + * simpler. * * Local shadow pages are only visible to local VCPU except through * sp->parent_ptes rmap from their children, so they are not in the @@ -1852,13 +1854,12 @@ kvm_mmu_alloc_local_shadow_page(struct kvm_vcpu *vcpu, union kvm_mmu_page_role r sp->gfn = 0; sp->role = role; /* - * Use the preallocated mmu->pae_root when the shadow page's - * level is PT32E_ROOT_LEVEL which may need to be put in the 32 bits + * Use the preallocated mmu->pae_root when the shadow pagetable's + * level is PT32E_ROOT_LEVEL, which needs to be put in the 32 bits * CR3 (even in x86_64) or decrypted. The preallocated one is prepared * for the requirements.
*/ - if (role.level == PT32E_ROOT_LEVEL && - !WARN_ON_ONCE(!vcpu->arch.mmu->pae_root)) + if (vcpu->arch.mmu->root_role.level == PT32E_ROOT_LEVEL) sp->spt = vcpu->arch.mmu->pae_root; else sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);
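The distinction being drawn here is between the guest's paging level (cpu_role.base.level) and the level of the shadow page table KVM builds (root_role.level): only the latter determines whether the root page may end up in a 32-bit CR3 or need decryption. A distilled sketch of the resulting decision, with the rationale as comments (names as in the patch):

	/*
	 * root_role.level is the shadow page table's own level. Only a
	 * PT32E_ROOT_LEVEL *shadow* root can be loaded into a 32-bit CR3
	 * (or need to be decrypted), so only then is the preallocated
	 * low-memory mmu->pae_root required. When a 64-bit host shadows
	 * a 32-bit L1's NPT, the shadow root is 4- or 5-level and an
	 * ordinary cache-allocated page suffices.
	 */
	if (vcpu->arch.mmu->root_role.level == PT32E_ROOT_LEVEL)
		sp->spt = vcpu->arch.mmu->pae_root;
	else
		sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);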
From patchwork Sat May 21 13:17:00 2022

From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Vitaly Kuznetsov, Maxim Levitsky, David Matlack, Lai Jiangshan
Subject: [PATCH V3 12/12] KVM: X86/MMU: Remove mmu_alloc_special_roots()
Date: Sat, 21 May 2022 21:17:00 +0800
Message-Id: <20220521131700.3661-13-jiangshanlai@gmail.com>
In-Reply-To: <20220521131700.3661-1-jiangshanlai@gmail.com>
References: <20220521131700.3661-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

mmu_alloc_special_roots() allocates mmu->pae_root for non-PAE paging (e.g. for shadowing 32-bit NPT on a 64-bit host) as well as mmu->pml4_root and mmu->pml5_root. But mmu->pml4_root and mmu->pml5_root are no longer used, and neither is mmu->pae_root for non-PAE paging.

So remove mmu_alloc_special_roots(), mmu->pml4_root and mmu->pml5_root.

Signed-off-by: Lai Jiangshan
---
arch/x86/include/asm/kvm_host.h | 3 -- arch/x86/kvm/mmu/mmu.c | 77 --------------------------------- 2 files changed, 80 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index fb9751dfc1a7..ec44e6c3d5ea 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -458,9 +458,6 @@ struct kvm_mmu { u8 permissions[16]; u64 *pae_root; - u64 *pml4_root; - u64 *pml5_root; - /* * check zero bits on shadow page table entries, these * bits include not only hardware reserved bits but also diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 73e6a8e1e1a9..b8eed217314d 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3691,78 +3691,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) return r; } -static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu) -{ - struct kvm_mmu *mmu = vcpu->arch.mmu; - bool need_pml5 = mmu->root_role.level > PT64_ROOT_4LEVEL; - u64 *pml5_root = NULL; - u64 *pml4_root = NULL; - u64 *pae_root; - - /* - * When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP - * tables are allocated and initialized at root creation as there is no - * equivalent level in the guest's NPT to shadow. Allocate the tables - * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare. - */ - if (mmu->root_role.direct || - mmu->cpu_role.base.level >= PT64_ROOT_4LEVEL || - mmu->root_role.level < PT64_ROOT_4LEVEL) - return 0; - - /* - * NPT, the only paging mode that uses this horror, uses a fixed number - * of levels for the shadow page tables, e.g. all MMUs are 4-level or - * all MMus are 5-level. Thus, this can safely require that pml5_root - * is allocated if the other roots are valid and pml5 is needed, as any - * prior MMU would also have required pml5. - */ - if (mmu->pae_root && mmu->pml4_root && (!need_pml5 || mmu->pml5_root)) - return 0; - - /* - * The special roots should always be allocated in concert. Yell and - * bail if KVM ends up in a state where only one of the roots is valid. - */ - if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->pml4_root || - (need_pml5 && mmu->pml5_root))) - return -EIO; - - /* - * Unlike 32-bit NPT, the PDP table doesn't need to be in low mem, and - * doesn't need to be decrypted.
- */ - pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); - if (!pae_root) - return -ENOMEM; - -#ifdef CONFIG_X86_64 - pml4_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); - if (!pml4_root) - goto err_pml4; - - if (need_pml5) { - pml5_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); - if (!pml5_root) - goto err_pml5; - } -#endif - - mmu->pae_root = pae_root; - mmu->pml4_root = pml4_root; - mmu->pml5_root = pml5_root; - - return 0; - -#ifdef CONFIG_X86_64 -err_pml5: - free_page((unsigned long)pml4_root); -err_pml4: - free_page((unsigned long)pae_root); - return -ENOMEM; -#endif -} - static bool is_unsync_root(hpa_t root) { struct kvm_mmu_page *sp; @@ -5166,9 +5094,6 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) r = mmu_alloc_pae_root(vcpu); if (r) return r; - r = mmu_alloc_special_roots(vcpu); - if (r) - goto out; if (vcpu->arch.mmu->root_role.direct) r = mmu_alloc_direct_roots(vcpu); else @@ -5626,8 +5551,6 @@ static void free_mmu_pages(struct kvm_mmu *mmu) if (!tdp_enabled && mmu->pae_root) set_memory_encrypted((unsigned long)mmu->pae_root, 1); free_page((unsigned long)mmu->pae_root); - free_page((unsigned long)mmu->pml4_root); - free_page((unsigned long)mmu->pml5_root); } static void __kvm_mmu_create(struct kvm_mmu *mmu)