From patchwork Wed Apr 20 13:25:59 2022
X-Patchwork-Submitter: Lai Jiangshan
X-Patchwork-Id: 12820246
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Lai Jiangshan, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 1/7] KVM: X86/MMU: Add using_special_root_page()
Date: Wed, 20 Apr 2022 21:25:59 +0800
Message-Id: <20220420132605.3813-2-jiangshanlai@gmail.com>
In-Reply-To: <20220420132605.3813-1-jiangshanlai@gmail.com>
References: <20220420132605.3813-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

In some cases, special roots are used in the MMU.  The code often uses
to_shadow_page(mmu->root.hpa) to check whether a special root is in use.

Add using_special_root_page() to directly check whether a special root is
used, or needs to be used, even when mmu->root.hpa is not yet set.

Prepare for making to_shadow_page(mmu->root.hpa) return non-NULL by using
special shadow pages.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d14cb6f99cb1..6461e499d305 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1714,6 +1714,14 @@ static void drop_parent_pte(struct kvm_mmu_page *sp,
 	mmu_spte_clear_no_track(parent_pte);
 }
 
+static bool using_special_root_page(struct kvm_mmu *mmu)
+{
+	if (mmu->direct_map)
+		return mmu->shadow_root_level == PT32E_ROOT_LEVEL;
+	else
+		return mmu->root_level <= PT32E_ROOT_LEVEL;
+}
+
 static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct)
 {
 	struct kvm_mmu_page *sp;
@@ -4201,10 +4209,10 @@ static bool fast_pgd_switch(struct kvm *kvm, struct kvm_mmu *mmu,
 {
 	/*
 	 * For now, limit the caching to 64-bit hosts+VMs in order to avoid
-	 * having to deal with PDPTEs. We may add support for 32-bit hosts/VMs
-	 * later if necessary.
+	 * having to deal with PDPTEs. Special roots can not be put into
+	 * mmu->prev_root[].
 	 */
-	if (VALID_PAGE(mmu->root.hpa) && !to_shadow_page(mmu->root.hpa))
+	if (VALID_PAGE(mmu->root.hpa) && using_special_root_page(mmu))
 		kvm_mmu_free_roots(kvm, mmu, KVM_MMU_ROOT_CURRENT);
 
 	if (VALID_PAGE(mmu->root.hpa))

From patchwork Wed Apr 20 13:26:00 2022
X-Patchwork-Submitter: Lai Jiangshan
X-Patchwork-Id: 12820247
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Lai Jiangshan, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 2/7] KVM: X86/MMU: Add special shadow pages
Date: Wed, 20 Apr 2022 21:26:00 +0800
Message-Id: <20220420132605.3813-3-jiangshanlai@gmail.com>
In-Reply-To: <20220420132605.3813-1-jiangshanlai@gmail.com>
References: <20220420132605.3813-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Special pages are pages that hold PDPTEs for a 32-bit guest, or higher-level
pages linked to a special page when shadowing NPT.

The current code uses mmu->pae_root, mmu->pml4_root, and mmu->pml5_root to
set up the special roots.  The initialization code is complex, and the roots
are not associated with a struct kvm_mmu_page, which makes the code even more
complex.

Add kvm_mmu_alloc_special_page() and mmu_free_special_root_page() to allocate
and free special shadow pages, preparing to replace the current logic with
special shadow pages that share most of the logic with normal shadow pages.

The code is not activated yet, since using_special_root_page() is false at
the places where it is inserted.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c | 91 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 90 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6461e499d305..f6eee1a2b1d6 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1722,6 +1722,58 @@ static bool using_special_root_page(struct kvm_mmu *mmu)
 	return mmu->root_level <= PT32E_ROOT_LEVEL;
 }
 
+/*
+ * Special pages are pages that hold PAE PDPTEs for a 32-bit guest, or higher
+ * level pages linked to a special page when shadowing NPT.
+ *
+ * Special pages are specially allocated.  If sp->spt needs to be 32bit, it
+ * will use the preallocated mmu->pae_root.
+ *
+ * Special pages are only visible to the local VCPU except through the rmap
+ * from their children, so they are neither in kvm->arch.active_mmu_pages nor
+ * in the hash.
+ *
+ * And they are neither accounted nor write-protected since they have no gfn
+ * associated.
+ *
+ * Because of the above, special pages can not be freed nor zapped like normal
+ * shadow pages.  They are freed directly when the special root is freed, see
+ * mmu_free_special_root_page().
+ *
+ * A special root page can not be put on mmu->prev_roots because the comparison
+ * must use PDPTEs instead of CR3, and mmu->pae_root can not be shared for
+ * multiple root pages.
+ *
+ * Except for the above limitations, all the other abilities are the same as
+ * for other shadow pages: link, rmap, sync, unsync etc.
+ *
+ * Special pages can be obsoleted but might possibly be reused later.  When
+ * the obsoleting process is done, all the obsoleted shadow pages are unlinked
+ * from the special pages with the help of the rmap of the children and the
+ * special pages become theoretically valid again.  If there is no other event
+ * to cause a VCPU to free the root and the VCPU is preempted by the host
+ * during two obsoleting processes, the VCPU can reuse its special pages when
+ * it is back.
+ */ +static struct kvm_mmu_page *kvm_mmu_alloc_special_page(struct kvm_vcpu *vcpu, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + + sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); + sp->gfn = 0; + sp->role = role; + if (role.level == PT32E_ROOT_LEVEL && + vcpu->arch.mmu->shadow_root_level == PT32E_ROOT_LEVEL) + sp->spt = vcpu->arch.mmu->pae_root; + else + sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + /* sp->gfns is not used for special sp */ + set_page_private(virt_to_page(sp->spt), (unsigned long)sp); + sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen; + + return sp; +} + static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct) { struct kvm_mmu_page *sp; @@ -2081,6 +2133,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, if (level <= vcpu->arch.mmu->root_level) role.passthrough = 0; + if (unlikely(level >= PT32E_ROOT_LEVEL && using_special_root_page(vcpu->arch.mmu))) + return kvm_mmu_alloc_special_page(vcpu, role); + sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; for_each_valid_sp(vcpu->kvm, sp, sp_list) { if (sp->gfn != gfn) { @@ -3250,6 +3305,37 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa, *root_hpa = INVALID_PAGE; } +static void mmu_free_special_root_page(struct kvm *kvm, struct kvm_mmu *mmu) +{ + u64 spte = mmu->root.hpa; + struct kvm_mmu_page *sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK); + int i; + + /* Free level 5 or 4 roots for shadow NPT for 32 bit L1 */ + while (sp->role.level > PT32E_ROOT_LEVEL) + { + spte = sp->spt[0]; + mmu_page_zap_pte(kvm, sp, sp->spt + 0, NULL); + free_page((unsigned long)sp->spt); + kmem_cache_free(mmu_page_header_cache, sp); + if (!is_shadow_present_pte(spte)) + return; + sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK); + } + + if (WARN_ON_ONCE(sp->role.level != PT32E_ROOT_LEVEL)) + return; + + /* Free PAE roots */ + for (i = 0; i < 4; i++) + mmu_page_zap_pte(kvm, sp, sp->spt + i, NULL); + + if (sp->spt != mmu->pae_root) + free_page((unsigned long)sp->spt); + + kmem_cache_free(mmu_page_header_cache, sp); +} + /* roots_to_free must be some combination of the KVM_MMU_ROOT_* flags */ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu, ulong roots_to_free) @@ -3283,7 +3369,10 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu, if (free_active_root) { if (to_shadow_page(mmu->root.hpa)) { - mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); + if (using_special_root_page(mmu)) + mmu_free_special_root_page(kvm, mmu); + else + mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); } else if (mmu->pae_root) { for (i = 0; i < 4; ++i) { if (!IS_VALID_PAE_ROOT(mmu->pae_root[i])) From patchwork Wed Apr 20 13:26:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lai Jiangshan X-Patchwork-Id: 12820248 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F71BC433F5 for ; Wed, 20 Apr 2022 13:25:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1378978AbiDTN20 (ORCPT ); Wed, 20 Apr 2022 09:28:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45506 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379013AbiDTN2X (ORCPT ); Wed, 20 Apr 2022 
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Lai Jiangshan, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 3/7] KVM: X86/MMU: Link PAE root pagetable with its children
Date: Wed, 20 Apr 2022 21:26:01 +0800
Message-Id: <20220420132605.3813-4-jiangshanlai@gmail.com>
In-Reply-To: <20220420132605.3813-1-jiangshanlai@gmail.com>
References: <20220420132605.3813-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

When special shadow pages are activated, link_shadow_page() might link a
special shadow page, which is the PAE root for PAE paging, with its children.

Add make_pae_pdpte() to handle this case.

The code is not activated since special shadow pages are not activated yet.
Signed-off-by: Lai Jiangshan --- arch/x86/kvm/mmu/mmu.c | 6 +++++- arch/x86/kvm/mmu/spte.c | 7 +++++++ arch/x86/kvm/mmu/spte.h | 1 + 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index f6eee1a2b1d6..eefe1528f91e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2280,7 +2280,11 @@ static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, BUILD_BUG_ON(VMX_EPT_WRITABLE_MASK != PT_WRITABLE_MASK); - spte = make_nonleaf_spte(sp->spt, sp_ad_disabled(sp)); + if (unlikely(sp->role.level == PT32_ROOT_LEVEL && + vcpu->arch.mmu->shadow_root_level == PT32E_ROOT_LEVEL)) + spte = make_pae_pdpte(sp->spt); + else + spte = make_nonleaf_spte(sp->spt, sp_ad_disabled(sp)); mmu_spte_set(sptep, spte); diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c index 4739b53c9734..0d3aedd2f0c7 100644 --- a/arch/x86/kvm/mmu/spte.c +++ b/arch/x86/kvm/mmu/spte.c @@ -250,6 +250,13 @@ u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index) return child_spte; } +u64 make_pae_pdpte(u64 *child_pt) +{ + /* The only ignore bits in PDPTE are 11:9. */ + BUILD_BUG_ON(!(GENMASK(11,9) & SPTE_MMU_PRESENT_MASK)); + return __pa(child_pt) | PT_PRESENT_MASK | SPTE_MMU_PRESENT_MASK | + shadow_me_mask; +} u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled) { diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index 73f12615416f..b2d14b3f9ff6 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -416,6 +416,7 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, u64 old_spte, bool prefetch, bool can_unsync, bool host_writable, u64 *new_spte); u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index); +u64 make_pae_pdpte(u64 *child_pt); u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled); u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access); u64 mark_spte_for_access_track(u64 spte); From patchwork Wed Apr 20 13:26:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lai Jiangshan X-Patchwork-Id: 12820249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 631ACC433EF for ; Wed, 20 Apr 2022 13:25:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379023AbiDTN2d (ORCPT ); Wed, 20 Apr 2022 09:28:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379021AbiDTN23 (ORCPT ); Wed, 20 Apr 2022 09:28:29 -0400 Received: from mail-pj1-x1034.google.com (mail-pj1-x1034.google.com [IPv6:2607:f8b0:4864:20::1034]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 49B073A5D0; Wed, 20 Apr 2022 06:25:43 -0700 (PDT) Received: by mail-pj1-x1034.google.com with SMTP id mp16-20020a17090b191000b001cb5efbcab6so4937297pjb.4; Wed, 20 Apr 2022 06:25:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0isWlJxIzUtALUF2/VnBGm4wzlo8yeoKRaWXLxTXRxA=; b=bwNHlQ3In26IQuqJzrt18tp+VHTyO9poaxqAqFIVIsrVq8oHqq3uDA+Bx6AfQWUpr/ U/8EqNXBUseVcl5EV5Ovx0iBPYj02mfgP3ezgc0yVuUhb/3YBRvbD8/gTU4COYVi5art 
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Lai Jiangshan, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 4/7] KVM: X86/MMU: Activate special shadow pages and remove old logic
Date: Wed, 20 Apr 2022 21:26:02 +0800
Message-Id: <20220420132605.3813-5-jiangshanlai@gmail.com>
In-Reply-To: <20220420132605.3813-1-jiangshanlai@gmail.com>
References: <20220420132605.3813-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Activate special shadow pages by allocating them in mmu_alloc_direct_roots()
and mmu_alloc_shadow_roots().

Make shadow walks start from the topmost shadow page even when it is a
special shadow page, so that it can be walked like a normal root and shadowed
PDPTEs can be made and installed on demand.

Walking from the topmost page requires FNAME(fetch) to visit high-level
special shadow pages and to allocate special shadow pages when shadowing NPT
for a 32-bit L1 on a 64-bit host, so change FNAME(fetch) and
FNAME(walk_addr_generic) to handle this in the affected code.

Do the sync from the topmost page in kvm_mmu_sync_roots(), which simplifies
the code.

Now all the root pages and the pagetables pointed to by a present spte in
struct kvm_mmu are associated with a struct kvm_mmu_page, and
to_shadow_page() is guaranteed to be non-NULL.

The affected cases are those where using_special_root_page() returns true.
Signed-off-by: Lai Jiangshan --- arch/x86/kvm/mmu/mmu.c | 168 +++------------------------------ arch/x86/kvm/mmu/paging_tmpl.h | 15 ++- 2 files changed, 25 insertions(+), 158 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index eefe1528f91e..3b34a6912081 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2217,26 +2217,6 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato iterator->addr = addr; iterator->shadow_addr = root; iterator->level = vcpu->arch.mmu->shadow_root_level; - - if (iterator->level >= PT64_ROOT_4LEVEL && - vcpu->arch.mmu->root_level < PT64_ROOT_4LEVEL && - !vcpu->arch.mmu->direct_map) - iterator->level = PT32E_ROOT_LEVEL; - - if (iterator->level == PT32E_ROOT_LEVEL) { - /* - * prev_root is currently only used for 64-bit hosts. So only - * the active root_hpa is valid here. - */ - BUG_ON(root != vcpu->arch.mmu->root.hpa); - - iterator->shadow_addr - = vcpu->arch.mmu->pae_root[(addr >> 30) & 3]; - iterator->shadow_addr &= PT64_BASE_ADDR_MASK; - --iterator->level; - if (!iterator->shadow_addr) - iterator->level = 0; - } } static void shadow_walk_init(struct kvm_shadow_walk_iterator *iterator, @@ -3372,21 +3352,10 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu, &invalid_list); if (free_active_root) { - if (to_shadow_page(mmu->root.hpa)) { - if (using_special_root_page(mmu)) - mmu_free_special_root_page(kvm, mmu); - else - mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); - } else if (mmu->pae_root) { - for (i = 0; i < 4; ++i) { - if (!IS_VALID_PAE_ROOT(mmu->pae_root[i])) - continue; - - mmu_free_root_page(kvm, &mmu->pae_root[i], - &invalid_list); - mmu->pae_root[i] = INVALID_PAE_ROOT; - } - } + if (using_special_root_page(mmu)) + mmu_free_special_root_page(kvm, mmu); + else + mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list); mmu->root.hpa = INVALID_PAGE; mmu->root.pgd = 0; } @@ -3451,7 +3420,6 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) struct kvm_mmu *mmu = vcpu->arch.mmu; u8 shadow_root_level = mmu->shadow_root_level; hpa_t root; - unsigned i; int r; write_lock(&vcpu->kvm->mmu_lock); @@ -3462,24 +3430,9 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) if (is_tdp_mmu_enabled(vcpu->kvm)) { root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu); mmu->root.hpa = root; - } else if (shadow_root_level >= PT64_ROOT_4LEVEL) { + } else if (shadow_root_level >= PT32E_ROOT_LEVEL) { root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level, true); mmu->root.hpa = root; - } else if (shadow_root_level == PT32E_ROOT_LEVEL) { - if (WARN_ON_ONCE(!mmu->pae_root)) { - r = -EIO; - goto out_unlock; - } - - for (i = 0; i < 4; ++i) { - WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i])); - - root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT), - i << 30, PT32_ROOT_LEVEL, true); - mmu->pae_root[i] = root | PT_PRESENT_MASK | - shadow_me_mask; - } - mmu->root.hpa = __pa(mmu->pae_root); } else { WARN_ONCE(1, "Bad TDP root level = %d\n", shadow_root_level); r = -EIO; @@ -3557,10 +3510,8 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm) static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) { struct kvm_mmu *mmu = vcpu->arch.mmu; - u64 pdptrs[4], pm_mask; gfn_t root_gfn, root_pgd; hpa_t root; - unsigned i; int r; root_pgd = mmu->get_guest_pgd(vcpu); @@ -3569,21 +3520,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) if (mmu_check_root(vcpu, root_gfn)) return 1; - /* - * On SVM, reading PDPTRs might access guest memory, which might fault - * and thus might sleep. 
Grab the PDPTRs before acquiring mmu_lock. - */ - if (mmu->root_level == PT32E_ROOT_LEVEL) { - for (i = 0; i < 4; ++i) { - pdptrs[i] = mmu->get_pdptr(vcpu, i); - if (!(pdptrs[i] & PT_PRESENT_MASK)) - continue; - - if (mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT)) - return 1; - } - } - r = mmu_first_shadow_root_alloc(vcpu->kvm); if (r) return r; @@ -3593,70 +3529,9 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) if (r < 0) goto out_unlock; - /* - * Do we shadow a long mode page table? If so we need to - * write-protect the guests page table root. - */ - if (mmu->root_level >= PT64_ROOT_4LEVEL) { - root = mmu_alloc_root(vcpu, root_gfn, 0, - mmu->shadow_root_level, false); - mmu->root.hpa = root; - goto set_root_pgd; - } - - if (WARN_ON_ONCE(!mmu->pae_root)) { - r = -EIO; - goto out_unlock; - } - - /* - * We shadow a 32 bit page table. This may be a legacy 2-level - * or a PAE 3-level page table. In either case we need to be aware that - * the shadow page table may be a PAE or a long mode page table. - */ - pm_mask = PT_PRESENT_MASK | shadow_me_mask; - if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL) { - pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK; - - if (WARN_ON_ONCE(!mmu->pml4_root)) { - r = -EIO; - goto out_unlock; - } - mmu->pml4_root[0] = __pa(mmu->pae_root) | pm_mask; - - if (mmu->shadow_root_level == PT64_ROOT_5LEVEL) { - if (WARN_ON_ONCE(!mmu->pml5_root)) { - r = -EIO; - goto out_unlock; - } - mmu->pml5_root[0] = __pa(mmu->pml4_root) | pm_mask; - } - } - - for (i = 0; i < 4; ++i) { - WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i])); - - if (mmu->root_level == PT32E_ROOT_LEVEL) { - if (!(pdptrs[i] & PT_PRESENT_MASK)) { - mmu->pae_root[i] = INVALID_PAE_ROOT; - continue; - } - root_gfn = pdptrs[i] >> PAGE_SHIFT; - } - - root = mmu_alloc_root(vcpu, root_gfn, i << 30, - PT32_ROOT_LEVEL, false); - mmu->pae_root[i] = root | pm_mask; - } - - if (mmu->shadow_root_level == PT64_ROOT_5LEVEL) - mmu->root.hpa = __pa(mmu->pml5_root); - else if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) - mmu->root.hpa = __pa(mmu->pml4_root); - else - mmu->root.hpa = __pa(mmu->pae_root); - -set_root_pgd: + root = mmu_alloc_root(vcpu, root_gfn, 0, + mmu->shadow_root_level, false); + mmu->root.hpa = root; mmu->root.pgd = root_pgd; out_unlock: write_unlock(&vcpu->kvm->mmu_lock); @@ -3772,8 +3647,7 @@ static bool is_unsync_root(hpa_t root) void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) { - int i; - struct kvm_mmu_page *sp; + hpa_t root = vcpu->arch.mmu->root.hpa; if (vcpu->arch.mmu->direct_map) return; @@ -3783,31 +3657,11 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY); - if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) { - hpa_t root = vcpu->arch.mmu->root.hpa; - sp = to_shadow_page(root); - - if (!is_unsync_root(root)) - return; - - write_lock(&vcpu->kvm->mmu_lock); - mmu_sync_children(vcpu, sp, true); - write_unlock(&vcpu->kvm->mmu_lock); + if (!is_unsync_root(root)) return; - } write_lock(&vcpu->kvm->mmu_lock); - - for (i = 0; i < 4; ++i) { - hpa_t root = vcpu->arch.mmu->pae_root[i]; - - if (IS_VALID_PAE_ROOT(root)) { - root &= PT64_BASE_ADDR_MASK; - sp = to_shadow_page(root); - mmu_sync_children(vcpu, sp, true); - } - } - + mmu_sync_children(vcpu, to_shadow_page(root), true); write_unlock(&vcpu->kvm->mmu_lock); } diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index c1b975fb85a2..677029b15709 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -365,6 +365,18 @@ static int 
FNAME(walk_addr_generic)(struct guest_walker *walker, pte = mmu->get_guest_pgd(vcpu); have_ad = PT_HAVE_ACCESSED_DIRTY(mmu); + /* + * FNAME(fetch) might pass these values to allocate special shadow + * page. Although the gfn is not used at the end, it is better not + * to pass an uninitialized value to kvm_mmu_get_page(). + */ + walker->table_gfn[4] = 0; + walker->pt_access[4] = ACC_ALL; + walker->table_gfn[3] = 0; + walker->pt_access[3] = ACC_ALL; + walker->table_gfn[2] = 0; + walker->pt_access[2] = ACC_ALL; + #if PTTYPE == 64 walk_nx_mask = 1ULL << PT64_NX_SHIFT; if (walker->level == PT32E_ROOT_LEVEL) { @@ -710,7 +722,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, * Verify that the gpte in the page we've just write * protected is still there. */ - if (FNAME(gpte_changed)(vcpu, gw, it.level - 1)) + if (it.level - 1 < top_level && + FNAME(gpte_changed)(vcpu, gw, it.level - 1)) goto out_gpte_changed; if (sp) From patchwork Wed Apr 20 13:26:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lai Jiangshan X-Patchwork-Id: 12820250 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7F4AC4321E for ; Wed, 20 Apr 2022 13:26:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379038AbiDTN2o (ORCPT ); Wed, 20 Apr 2022 09:28:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46238 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379032AbiDTN2g (ORCPT ); Wed, 20 Apr 2022 09:28:36 -0400 Received: from mail-pg1-x533.google.com (mail-pg1-x533.google.com [IPv6:2607:f8b0:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E174F11A36; Wed, 20 Apr 2022 06:25:49 -0700 (PDT) Received: by mail-pg1-x533.google.com with SMTP id bg9so1604742pgb.9; Wed, 20 Apr 2022 06:25:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6LkvNab/6tjOJLVxEQOpqPQJoB0oTUl274FVeJL6vXA=; b=DorGCDAlIME225NVrrVUdFe8hl/FJoxit0vf/SUZTlVJQ096P6UpSwyZwsw/dTyUEa Uo038b/+rHeHFZlhzABlPJr0+jFGm+lW8E5YRG4JwKl9B7bDk3sirniuZY0gjnJqbRh4 emAKGdruA57bwR+VWQRwcZ7GpI1x6eU93GoAdZRK6YFApFIxhQ1YLobYbz0iafOdxJcp hKU0kbGSE4dZL4EEt8FaL1/Orgk1xMe3JUV9IdvghTAY7B98GNtr0b3EFaBAQ1ot4F5F 8S++ZnqFOFyulmaGa9t8sm6xp6NUzz1WNo0wwMF/VJu6lV0aqRwrTcWeSrvcCgGpOdBR NUQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6LkvNab/6tjOJLVxEQOpqPQJoB0oTUl274FVeJL6vXA=; b=cFABkfaVNs3zXY8NLa3d4/JBFQFV+RuncnDF8OSdkL6+ivLuXN/vVtRpTKVOo0V8mM LPQRWV2G4G3QktXtRYWTr/xF1PreIJN2Ovek6FshdF9q3K8XFa4poMtey+w6T7dgwDhN cNS1TKS7Jy93ddXnh98eCHt/NHOdLmCyoUzWN5BTc7vuPHqig3kmI1iWZUAWI2tHhFK2 kTiTow5U2Qmrws0H5YuaN0vkDH3C/2tKCSV517hx4fpCaZbq0twruixURC7l9nQ0yH6+ Q+VTOXQ443LWcvA08DhV4Fxq8C/8lCroWct5OlrOi9Fzuz+jP0EWr1l4ZzCs7D9aAcZj l55Q== X-Gm-Message-State: AOAM532lMqRdb2W9793fSrJl/aOh7xVAs/Mqdnc6a8+NtCqyHXQ4S8mX C46sNBfh0iguOKorL2NH/MjhOSUNh9A= X-Google-Smtp-Source: ABdhPJxH753vNUNaiRfbkVu9zw8nveSjuhbUIU8CFyc/KgFf6Qzj99r6dwCkttJRsoKDV0SGKwswiQ== X-Received: by 2002:a62:5343:0:b0:4f7:baad:5c22 with SMTP id 
From: Lai Jiangshan
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini, Sean Christopherson
Cc: Lai Jiangshan, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH 5/7] KVM: X86/MMU: Remove the check of the return value of to_shadow_page()
Date: Wed, 20 Apr 2022 21:26:03 +0800
Message-Id: <20220420132605.3813-6-jiangshanlai@gmail.com>
In-Reply-To: <20220420132605.3813-1-jiangshanlai@gmail.com>
References: <20220420132605.3813-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Remove the check of the return value of to_shadow_page() in
mmu_free_root_page(), kvm_mmu_free_guest_mode_roots(), is_unsync_root() and
is_tdp_mmu() because it can no longer return NULL.

Remove the check of the return value of to_shadow_page() in
is_page_fault_stale() and is_obsolete_root() because it can no longer return
NULL and because obsoleting of special shadow pages is already handled in a
different way.  When the obsoleting process is done, all the obsoleted shadow
pages are already unlinked from the special pages with the help of the rmap
of the children, and the special pages become theoretically valid again.  A
special shadow page can be freed if is_obsolete_sp() returns true, or reused
if it returns false.

Signed-off-by: Lai Jiangshan
---
 arch/x86/kvm/mmu/mmu.c     | 44 +++-----------------------------------
 arch/x86/kvm/mmu/tdp_mmu.h |  7 +-----
 2 files changed, 4 insertions(+), 47 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3b34a6912081..72a1af35e331 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3278,8 +3278,6 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
 		return;
 
 	sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK);
-	if (WARN_ON(!sp))
-		return;
 
 	if (is_tdp_mmu_page(sp))
 		kvm_tdp_mmu_put_root(kvm, sp, false);
@@ -3382,8 +3380,7 @@ void kvm_mmu_free_guest_mode_roots(struct kvm *kvm, struct kvm_mmu *mmu)
 		if (!VALID_PAGE(root_hpa))
 			continue;
 
-		if (!to_shadow_page(root_hpa) ||
-		    to_shadow_page(root_hpa)->role.guest_mode)
+		if (to_shadow_page(root_hpa)->role.guest_mode)
 			roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
 	}
 
@@ -3632,13 +3629,6 @@ static bool is_unsync_root(hpa_t root)
 	smp_rmb();
 	sp = to_shadow_page(root);
 
-	/*
-	 * PAE roots (somewhat arbitrarily) aren't backed by shadow pages, the
-	 * PDPTEs for a given PAE root need to be synchronized individually.
-	 */
-	if (WARN_ON_ONCE(!sp))
-		return false;
-
 	if (sp->unsync || sp->unsync_children)
 		return true;
@@ -3934,21 +3924,7 @@ static bool kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 static bool is_page_fault_stale(struct kvm_vcpu *vcpu,
 				struct kvm_page_fault *fault, int mmu_seq)
 {
-	struct kvm_mmu_page *sp = to_shadow_page(vcpu->arch.mmu->root.hpa);
-
-	/* Special roots, e.g. pae_root, are not backed by shadow pages.
*/ - if (sp && is_obsolete_sp(vcpu->kvm, sp)) - return true; - - /* - * Roots without an associated shadow page are considered invalid if - * there is a pending request to free obsolete roots. The request is - * only a hint that the current root _may_ be obsolete and needs to be - * reloaded, e.g. if the guest frees a PGD that KVM is tracking as a - * previous root, then __kvm_mmu_prepare_zap_page() signals all vCPUs - * to reload even if no vCPU is actively using the root. - */ - if (!sp && kvm_test_request(KVM_REQ_MMU_FREE_OBSOLETE_ROOTS, vcpu)) + if (is_obsolete_sp(vcpu->kvm, to_shadow_page(vcpu->arch.mmu->root.hpa))) return true; return fault->slot && @@ -5099,24 +5075,10 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu) static bool is_obsolete_root(struct kvm *kvm, hpa_t root_hpa) { - struct kvm_mmu_page *sp; - if (!VALID_PAGE(root_hpa)) return false; - /* - * When freeing obsolete roots, treat roots as obsolete if they don't - * have an associated shadow page. This does mean KVM will get false - * positives and free roots that don't strictly need to be freed, but - * such false positives are relatively rare: - * - * (a) only PAE paging and nested NPT has roots without shadow pages - * (b) remote reloads due to a memslot update obsoletes _all_ roots - * (c) KVM doesn't track previous roots for PAE paging, and the guest - * is unlikely to zap an in-use PGD. - */ - sp = to_shadow_page(root_hpa); - return !sp || is_obsolete_sp(kvm, sp); + return is_obsolete_sp(kvm, to_shadow_page(root_hpa)); } static void __kvm_mmu_free_obsolete_roots(struct kvm *kvm, struct kvm_mmu *mmu) diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index 5e5ef2576c81..4f70cb1b46df 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -84,13 +84,8 @@ static inline bool is_tdp_mmu(struct kvm_mmu *mmu) if (WARN_ON(!VALID_PAGE(hpa))) return false; - /* - * A NULL shadow page is legal when shadowing a non-paging guest with - * PAE paging, as the MMU will be direct with root_hpa pointing at the - * pae_root page, not a shadow page. 
- */ sp = to_shadow_page(hpa); - return sp && is_tdp_mmu_page(sp) && sp->root_count; + return is_tdp_mmu_page(sp) && sp->root_count; } #else static inline bool kvm_mmu_init_tdp_mmu(struct kvm *kvm) { return false; } From patchwork Wed Apr 20 13:26:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lai Jiangshan X-Patchwork-Id: 12820251 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 677E0C433EF for ; Wed, 20 Apr 2022 13:26:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379054AbiDTN2q (ORCPT ); Wed, 20 Apr 2022 09:28:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46324 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379039AbiDTN2m (ORCPT ); Wed, 20 Apr 2022 09:28:42 -0400 Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 07F5D3A187; Wed, 20 Apr 2022 06:25:56 -0700 (PDT) Received: by mail-pf1-x42c.google.com with SMTP id p8so1914724pfh.8; Wed, 20 Apr 2022 06:25:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=jjSP1EY6GE15ryex7YKFqGxc3rcSfRHmQrniXn4FjUc=; b=oxqxZbWgxFnLN2naCgMlw72VN2dZ8fNFqgDsgoz14rgp5AkETywM4CqumR97t1lY+D 3wPmWxKvPumh1YOlTnyDWTsaYfhW4YHTwhAmiOW2sWz5ni3FWvlnte1dgQv0efKYroyB D575woexiYfgqBhNLg+MHL+1TC9HsAmTzyo+TMGjjlaO0DKwSdh/Y1XMA3tVAisCcIZo k8tIkemMiE6euldW9QGnAv1tCL0OFCmYDBszr0Q6BVIvckw7nbaavngx8dC+wpynCMlU oI2Qt4A5css2nzXBjLC/XTwd+PPYgYxI79oJQzac+Ni+zGeqP4QOI8UHjckSLk7KqVhQ CbtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jjSP1EY6GE15ryex7YKFqGxc3rcSfRHmQrniXn4FjUc=; b=GsJCpuFNgQGvmHKQsGmr6HuNpDq8ehSNg2lfZKqZnxC+gizF/6TLsRkBAc6Wr3XA/6 1qlsgLGwCVl8yp7vhzrA9mwwmsMqwuhaXVOADeffOc1vBshzPpvRY7FMYUl2A/6y5QCM 2SxVWqyi15pqEHLOR7b8AVtCcrHCA5L6L7p3/h1dfI41ApDRcs3K6d4bjDyhR0k1ACjT PmdeLbmsoy/xJM55UhWJENqkun3UoWqATJ65uUlaDQ8ctJR+QQdb2pUHZxsK5a1/0Nus O+OMZN+bADNdTCLu10GHVLuQx8e5z6drlL7PDxhLqZ/AmjVq0LTTtviHaZfFvwNIfSg0 OnMg== X-Gm-Message-State: AOAM531T65rbMG8DstiP5/R+Nv4NtpkFbnVzh9utRQdBfrqtQJYinYMI jd5YE2IgEJhXO/VkE9LchBT+gWqbouQ= X-Google-Smtp-Source: ABdhPJwhFmtlDYofvkK6rrlWWHpnRtw4Vxe368lehSgXLDlKKHkTlAn5aqSevVHHFdMs5fRn1UKE3g== X-Received: by 2002:a05:6a00:1893:b0:50a:9b07:20bd with SMTP id x19-20020a056a00189300b0050a9b0720bdmr7228985pfh.66.1650461155230; Wed, 20 Apr 2022 06:25:55 -0700 (PDT) Received: from localhost ([47.251.4.198]) by smtp.gmail.com with ESMTPSA id r8-20020a17090a0ac800b001c9e35d3a3asm19297925pje.24.2022.04.20.06.25.54 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 Apr 2022 06:25:55 -0700 (PDT) From: Lai Jiangshan To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini , Sean Christopherson Cc: Lai Jiangshan , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH 6/7] KVM: X86/MMU: Allocate mmu->pae_root for PAE paging on-demand Date: Wed, 20 Apr 2022 21:26:04 +0800 Message-Id: <20220420132605.3813-7-jiangshanlai@gmail.com> X-Mailer: git-send-email 2.19.1.6.gb485710b In-Reply-To: <20220420132605.3813-1-jiangshanlai@gmail.com> References: <20220420132605.3813-1-jiangshanlai@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lai Jiangshan mmu->pae_root for non-PAE paging is allocated on-demand, but mmu->pae_root for PAE paging is allocated early when struct kvm_mmu is being created. Simplify the code to allocate mmu->pae_root for PAE paging and make it on-demand. Signed-off-by: Lai Jiangshan --- arch/x86/kvm/mmu/mmu.c | 99 ++++++++++++++------------------- arch/x86/kvm/mmu/mmu_internal.h | 10 ---- 2 files changed, 42 insertions(+), 67 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 72a1af35e331..2f590779ee39 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -694,6 +694,41 @@ static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu) } } +static int mmu_alloc_pae_root(struct kvm_vcpu *vcpu) +{ + struct page *page; + + if (vcpu->arch.mmu->shadow_root_level != PT32E_ROOT_LEVEL) + return 0; + if (vcpu->arch.mmu->pae_root) + return 0; + + /* + * Allocate a page to hold the four PDPTEs for PAE paging when emulating + * 32-bit mode. CR3 is only 32 bits even on x86_64 in this case. + * Therefore we need to allocate the PDP table in the first 4GB of + * memory, which happens to fit the DMA32 zone. + */ + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_DMA32); + if (!page) + return -ENOMEM; + vcpu->arch.mmu->pae_root = page_address(page); + + /* + * CR3 is only 32 bits when PAE paging is used, thus it's impossible to + * get the CPU to treat the PDPTEs as encrypted. Decrypt the page so + * that KVM's writes and the CPU's reads get along. Note, this is + * only necessary when using shadow paging, as 64-bit NPT can get at + * the C-bit even when shadowing 32-bit NPT, and SME isn't supported + * by 32-bit kernels (when KVM itself uses 32-bit NPT). + */ + if (!tdp_enabled) + set_memory_decrypted((unsigned long)vcpu->arch.mmu->pae_root, 1); + else + WARN_ON_ONCE(shadow_me_mask); + return 0; +} + static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect) { int r; @@ -5036,6 +5071,9 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->direct_map); if (r) goto out; + r = mmu_alloc_pae_root(vcpu); + if (r) + return r; r = mmu_alloc_special_roots(vcpu); if (r) goto out; @@ -5500,63 +5538,18 @@ static void free_mmu_pages(struct kvm_mmu *mmu) free_page((unsigned long)mmu->pml5_root); } -static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu) +static void __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu) { - struct page *page; int i; mmu->root.hpa = INVALID_PAGE; mmu->root.pgd = 0; for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) mmu->prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID; - - /* vcpu->arch.guest_mmu isn't used when !tdp_enabled. */ - if (!tdp_enabled && mmu == &vcpu->arch.guest_mmu) - return 0; - - /* - * When using PAE paging, the four PDPTEs are treated as 'root' pages, - * while the PDP table is a per-vCPU construct that's allocated at MMU - * creation. When emulating 32-bit mode, cr3 is only 32 bits even on - * x86_64. Therefore we need to allocate the PDP table in the first - * 4GB of memory, which happens to fit the DMA32 zone. 
TDP paging - * generally doesn't use PAE paging and can skip allocating the PDP - * table. The main exception, handled here, is SVM's 32-bit NPT. The - * other exception is for shadowing L1's 32-bit or PAE NPT on 64-bit - * KVM; that horror is handled on-demand by mmu_alloc_special_roots(). - */ - if (tdp_enabled && kvm_mmu_get_tdp_level(vcpu) > PT32E_ROOT_LEVEL) - return 0; - - page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_DMA32); - if (!page) - return -ENOMEM; - - mmu->pae_root = page_address(page); - - /* - * CR3 is only 32 bits when PAE paging is used, thus it's impossible to - * get the CPU to treat the PDPTEs as encrypted. Decrypt the page so - * that KVM's writes and the CPU's reads get along. Note, this is - * only necessary when using shadow paging, as 64-bit NPT can get at - * the C-bit even when shadowing 32-bit NPT, and SME isn't supported - * by 32-bit kernels (when KVM itself uses 32-bit NPT). - */ - if (!tdp_enabled) - set_memory_decrypted((unsigned long)mmu->pae_root, 1); - else - WARN_ON_ONCE(shadow_me_mask); - - for (i = 0; i < 4; ++i) - mmu->pae_root[i] = INVALID_PAE_ROOT; - - return 0; } int kvm_mmu_create(struct kvm_vcpu *vcpu) { - int ret; - vcpu->arch.mmu_pte_list_desc_cache.kmem_cache = pte_list_desc_cache; vcpu->arch.mmu_pte_list_desc_cache.gfp_zero = __GFP_ZERO; @@ -5568,18 +5561,10 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) vcpu->arch.mmu = &vcpu->arch.root_mmu; vcpu->arch.walk_mmu = &vcpu->arch.root_mmu; - ret = __kvm_mmu_create(vcpu, &vcpu->arch.guest_mmu); - if (ret) - return ret; - - ret = __kvm_mmu_create(vcpu, &vcpu->arch.root_mmu); - if (ret) - goto fail_allocate_root; + __kvm_mmu_create(vcpu, &vcpu->arch.guest_mmu); + __kvm_mmu_create(vcpu, &vcpu->arch.root_mmu); - return ret; - fail_allocate_root: - free_mmu_pages(&vcpu->arch.guest_mmu); - return ret; + return 0; } #define BATCH_ZAP_PAGES 10 diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 1bff453f7cbe..d5673a42680f 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -20,16 +20,6 @@ extern bool dbg; #define MMU_WARN_ON(x) do { } while (0) #endif -/* - * Unlike regular MMU roots, PAE "roots", a.k.a. PDPTEs/PDPTRs, have a PRESENT - * bit, and thus are guaranteed to be non-zero when valid. And, when a guest - * PDPTR is !PRESENT, its corresponding PAE root cannot be set to INVALID_PAGE, - * as the CPU would treat that as PRESENT PDPTR with reserved bits set. Use - * '0' instead of INVALID_PAGE to indicate an invalid PAE root. 
- */ -#define INVALID_PAE_ROOT 0 -#define IS_VALID_PAE_ROOT(x) (!!(x)) - typedef u64 __rcu *tdp_ptep_t; struct kvm_mmu_page { From patchwork Wed Apr 20 13:26:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lai Jiangshan X-Patchwork-Id: 12820252 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FFE8C433F5 for ; Wed, 20 Apr 2022 13:26:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379068AbiDTN2x (ORCPT ); Wed, 20 Apr 2022 09:28:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46420 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379059AbiDTN2t (ORCPT ); Wed, 20 Apr 2022 09:28:49 -0400 Received: from mail-pl1-x62d.google.com (mail-pl1-x62d.google.com [IPv6:2607:f8b0:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EB3E842A00; Wed, 20 Apr 2022 06:26:01 -0700 (PDT) Received: by mail-pl1-x62d.google.com with SMTP id j8so1747558pll.11; Wed, 20 Apr 2022 06:26:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Hj08weOnMiNuvuHVFNgvhLkhm3HK1Sro404l1Lgvx5w=; b=dQ+WiKwx0u/NmwBwydj2yJ1x8OVhFo/mHdn0LkOYsir5MmvEhpCLhJm/Yzyes8fgwu qQjJ65wunCxFnKvDDMmJfnTTGYHbHXrUzUKzBwmYme61y0syvzoFbZGXHSBMAXSgtFP9 2Q2Ad6VtfEmcpqBVTlFXKeSxCxfEt4hTXShrSkJfrCCcGv0PF0esZ8fOcR6xxgXx54wE QtYKsX0irPO8Sl3teQ7Fet+7UQiVW2A7Ed1rd/NZFAg7tickWu7hPWAxnOU/Twe7J4W0 PftUVvquEjFyvq75OqeD45O/THW+Ihrgkw0NLqIOoAvyS1yy6REzdUMqTHI+aHIFdo3H 4q1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Hj08weOnMiNuvuHVFNgvhLkhm3HK1Sro404l1Lgvx5w=; b=s0OopmQllmra15MOtWImzk+T3Hc0ZIM8OEwhLfCp4yw/879gwxrGllFg5u1dELSQBJ sLedOzmlbTjxl9MOBfUSayE7sYgSOXEhlyrVX5To6Pv4RXBPc70ZDn/D+jvQock/M5cm LMC1k8BwK8UXRdkHQeOMoHGcU6vhdsY5ZzHKAJGNIZpOxzWqLhs7t7qCIQp1D8t6QLDo RDK5K1kdu5cMy5kU5jHGpOTqzDMJwu7MueXZBA+gWg/ZCy6YpFRLfLBf8DQff/hR99Q8 jc8TVIyFJeZWcDv+tkrxRiFBYyl7ZKnjn4bFi7GrtWfVTk51DMPVGf2xuvF9q75AFxmy C+dQ== X-Gm-Message-State: AOAM533NFxiY3X3uu6Y3MV+1W0McT5rD0x7ks8RPZATmeyqR2gwZ5gqs VRw6x//OAc2O2xAhTYY+4iMgMcBDsRg= X-Google-Smtp-Source: ABdhPJy486mRkFHpISnRUJKz6ZAlNggZCTjfunDQffEZynbc8jWDKiMf6Zrhc5uO/NzhUIZj1gKGtQ== X-Received: by 2002:a17:902:8b8a:b0:158:fbd0:45ab with SMTP id ay10-20020a1709028b8a00b00158fbd045abmr14527557plb.110.1650461161319; Wed, 20 Apr 2022 06:26:01 -0700 (PDT) Received: from localhost ([47.251.4.198]) by smtp.gmail.com with ESMTPSA id f187-20020a6251c4000000b005058e59604csm20451115pfb.217.2022.04.20.06.26.00 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Wed, 20 Apr 2022 06:26:01 -0700 (PDT) From: Lai Jiangshan To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Paolo Bonzini , Sean Christopherson Cc: Lai Jiangshan , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH 7/7] KVM: X86/MMU: Remove mmu_alloc_special_roots() Date: Wed, 20 Apr 2022 21:26:05 +0800 Message-Id: <20220420132605.3813-8-jiangshanlai@gmail.com> X-Mailer: git-send-email 2.19.1.6.gb485710b In-Reply-To: <20220420132605.3813-1-jiangshanlai@gmail.com> References: <20220420132605.3813-1-jiangshanlai@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Lai Jiangshan mmu_alloc_special_roots() allocates mmu->pae_root for non-PAE paging (as for shadowing 32bit NPT on 64 bit host) and mmu->pml4_root and mmu->pml5_root. But mmu->pml4_root and mmu->pml5_root is not used, neither mmu->pae_root for non-PAE paging. So remove mmu_alloc_special_roots(), mmu->pml4_root and mmu->pml5_root. Signed-off-by: Lai Jiangshan --- arch/x86/include/asm/kvm_host.h | 3 -- arch/x86/kvm/mmu/mmu.c | 76 --------------------------------- 2 files changed, 79 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index d4f8f4784d87..8bfebe509c09 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -462,9 +462,6 @@ struct kvm_mmu { u32 pkru_mask; u64 *pae_root; - u64 *pml4_root; - u64 *pml5_root; - /* * check zero bits on shadow page table entries, these * bits include not only hardware reserved bits but also diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 2f590779ee39..b16255c00c5a 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3571,77 +3571,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu) return r; } -static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu) -{ - struct kvm_mmu *mmu = vcpu->arch.mmu; - bool need_pml5 = mmu->shadow_root_level > PT64_ROOT_4LEVEL; - u64 *pml5_root = NULL; - u64 *pml4_root = NULL; - u64 *pae_root; - - /* - * When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP - * tables are allocated and initialized at root creation as there is no - * equivalent level in the guest's NPT to shadow. Allocate the tables - * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare. - */ - if (mmu->direct_map || mmu->root_level >= PT64_ROOT_4LEVEL || - mmu->shadow_root_level < PT64_ROOT_4LEVEL) - return 0; - - /* - * NPT, the only paging mode that uses this horror, uses a fixed number - * of levels for the shadow page tables, e.g. all MMUs are 4-level or - * all MMus are 5-level. Thus, this can safely require that pml5_root - * is allocated if the other roots are valid and pml5 is needed, as any - * prior MMU would also have required pml5. - */ - if (mmu->pae_root && mmu->pml4_root && (!need_pml5 || mmu->pml5_root)) - return 0; - - /* - * The special roots should always be allocated in concert. Yell and - * bail if KVM ends up in a state where only one of the roots is valid. - */ - if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->pml4_root || - (need_pml5 && mmu->pml5_root))) - return -EIO; - - /* - * Unlike 32-bit NPT, the PDP table doesn't need to be in low mem, and - * doesn't need to be decrypted. 
- */ - pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); - if (!pae_root) - return -ENOMEM; - -#ifdef CONFIG_X86_64 - pml4_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); - if (!pml4_root) - goto err_pml4; - - if (need_pml5) { - pml5_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT); - if (!pml5_root) - goto err_pml5; - } -#endif - - mmu->pae_root = pae_root; - mmu->pml4_root = pml4_root; - mmu->pml5_root = pml5_root; - - return 0; - -#ifdef CONFIG_X86_64 -err_pml5: - free_page((unsigned long)pml4_root); -err_pml4: - free_page((unsigned long)pae_root); - return -ENOMEM; -#endif -} - static bool is_unsync_root(hpa_t root) { struct kvm_mmu_page *sp; @@ -5074,9 +5003,6 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) r = mmu_alloc_pae_root(vcpu); if (r) return r; - r = mmu_alloc_special_roots(vcpu); - if (r) - goto out; if (vcpu->arch.mmu->direct_map) r = mmu_alloc_direct_roots(vcpu); else @@ -5534,8 +5460,6 @@ static void free_mmu_pages(struct kvm_mmu *mmu) if (!tdp_enabled && mmu->pae_root) set_memory_encrypted((unsigned long)mmu->pae_root, 1); free_page((unsigned long)mmu->pae_root); - free_page((unsigned long)mmu->pml4_root); - free_page((unsigned long)mmu->pml5_root); } static void __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)