From patchwork Sat Feb 18 03:23:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F776C6379F for ; Sat, 18 Feb 2023 03:23:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229656AbjBRDXV (ORCPT ); Fri, 17 Feb 2023 22:23:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33130 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229506AbjBRDXU (ORCPT ); Fri, 17 Feb 2023 22:23:20 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3FEAB83D6 for ; Fri, 17 Feb 2023 19:23:19 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id 127-20020a251885000000b0092aabd4fa90so2399604yby.18 for ; Fri, 17 Feb 2023 19:23:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=G8W7vR0PRQmPGFMPIXJ++IapBHBRtIMd4Chx9W9B5Ps=; b=jfshQ85qJucHKlc9NtInF4jRlhylfSuyLpB0sjsaublJRAg8aCadTARhFFiJRw1CtZ brVcjj52MiE0ZT3l4SduCr0zSuPGVfnqTPiukrZlRVA4eALEupuFrrYCnq5vUQYl2Yow NrtMlFuEZThW3dk+ytyBqPvXroS0hxtB/1aIyqry1xz/e+/dV2bvZ4HQ6JABHdUssUcS 4XaS2Ua584WH8LqzqavcXtevKQVKiyK9i980P+IzC4FY6iRe10ITilbboKKulq8MZAZg 7PbRpdikVvWIM8uwUTJwvIRa98Awv4Y+ZuGx7+w25P1x/BTHXwQaaOk8Kz2u4GMQwOS7 ZP5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=G8W7vR0PRQmPGFMPIXJ++IapBHBRtIMd4Chx9W9B5Ps=; b=HX+N23/Zx0LHfrezId7pgN4rQw6g7HKsHVIypglutCrFAitglDofF830CRcK7YymZN eJSWLgZehJT8w0eO9mTStghFqqe+MZa1hyXBNP/VerwkqF0vJB7S43dVzEi4fAUkIXJV 03egq3daNTdIwc71CFsjIBSGBUu21By5u552Nn9+QCgj0bLZzlyh9ghLFS7ryvcbFpaS LBMY52F0WJKDQ8FWffn2R51Q7yTeisJ4cF+jVV4QcIlGuhhkMZInNiJubmBxZaFqnPje pw5Z063slwZZ7roZ21tpnZXZ1zD84i2mOfOaNNEW/JCb4c6cfnnQctLaTh17Vp98ULWn j8Vw== X-Gm-Message-State: AO0yUKXwkTxxe2Sxw1IJkOxpoJA85xsjyYaldSWdFPGCQu1o5zgojGRb wYNj8p0jywe1RHOIOlYGP7N9EBJJc3kEnA== X-Google-Smtp-Source: AK7set8xVKfNyBAGZlVz+zxNxGP3IALOxKRKuRoHBiLyPDSeaWu4wCUI5EWWUpiE3JolGW6iyDb4YSaev4n/Mg== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a81:6a03:0:b0:533:9485:c828 with SMTP id f3-20020a816a03000000b005339485c828mr831393ywc.512.1676690598494; Fri, 17 Feb 2023 19:23:18 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:03 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-2-ricarkol@google.com> Subject: [PATCH v4 01/12] KVM: arm64: Add KVM_PGTABLE_WALK ctx->flags for skipping BBM and CMO From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, 
alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add two flags to kvm_pgtable_visit_ctx, KVM_PGTABLE_WALK_SKIP_BBM and KVM_PGTABLE_WALK_SKIP_CMO, to indicate that the walk should not perform break-before-make (BBM) nor cache maintenance operations (CMO). This will be used by a future commit to create unlinked tables that are not accessible to the HW page-table walker. This is safe because these unlinked tables are not visible to the HW page-table walker. Signed-off-by: Ricardo Koller --- arch/arm64/include/asm/kvm_pgtable.h | 18 ++++++++++++++++++ arch/arm64/kvm/hyp/pgtable.c | 27 ++++++++++++++++----------- 2 files changed, 34 insertions(+), 11 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 63f81b27a4e3..252b651f743d 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -188,12 +188,20 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end, * children. * @KVM_PGTABLE_WALK_SHARED: Indicates the page-tables may be shared * with other software walkers. + * @KVM_PGTABLE_WALK_SKIP_BBM: Visit and update table entries + * without Break-before-make + * requirements. + * @KVM_PGTABLE_WALK_SKIP_CMO: Visit and update table entries + * without Cache maintenance + * operations required. */ enum kvm_pgtable_walk_flags { KVM_PGTABLE_WALK_LEAF = BIT(0), KVM_PGTABLE_WALK_TABLE_PRE = BIT(1), KVM_PGTABLE_WALK_TABLE_POST = BIT(2), KVM_PGTABLE_WALK_SHARED = BIT(3), + KVM_PGTABLE_WALK_SKIP_BBM = BIT(4), + KVM_PGTABLE_WALK_SKIP_CMO = BIT(5), }; struct kvm_pgtable_visit_ctx { @@ -215,6 +223,16 @@ static inline bool kvm_pgtable_walk_shared(const struct kvm_pgtable_visit_ctx *c return ctx->flags & KVM_PGTABLE_WALK_SHARED; } +static inline bool kvm_pgtable_walk_skip_bbm(const struct kvm_pgtable_visit_ctx *ctx) +{ + return ctx->flags & KVM_PGTABLE_WALK_SKIP_BBM; +} + +static inline bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx) +{ + return ctx->flags & KVM_PGTABLE_WALK_SKIP_CMO; +} + /** * struct kvm_pgtable_walker - Hook into a page-table walk. * @cb: Callback function to invoke during the walk. diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index b11cf2c618a6..e093e222daf3 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -717,14 +717,17 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx, if (!stage2_try_set_pte(ctx, KVM_INVALID_PTE_LOCKED)) return false; - /* - * Perform the appropriate TLB invalidation based on the evicted pte - * value (if any). - */ - if (kvm_pte_table(ctx->old, ctx->level)) - kvm_call_hyp(__kvm_tlb_flush_vmid, mmu); - else if (kvm_pte_valid(ctx->old)) - kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr, ctx->level); + if (!kvm_pgtable_walk_skip_bbm(ctx)) { + /* + * Perform the appropriate TLB invalidation based on the + * evicted pte value (if any). 
+ */ + if (kvm_pte_table(ctx->old, ctx->level)) + kvm_call_hyp(__kvm_tlb_flush_vmid, mmu); + else if (kvm_pte_valid(ctx->old)) + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, + ctx->addr, ctx->level); + } if (stage2_pte_is_counted(ctx->old)) mm_ops->put_page(ctx->ptep); @@ -808,11 +811,13 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx, return -EAGAIN; /* Perform CMOs before installation of the guest stage-2 PTE */ - if (mm_ops->dcache_clean_inval_poc && stage2_pte_cacheable(pgt, new)) + if (!kvm_pgtable_walk_skip_cmo(ctx) && mm_ops->dcache_clean_inval_poc && + stage2_pte_cacheable(pgt, new)) mm_ops->dcache_clean_inval_poc(kvm_pte_follow(new, mm_ops), - granule); + granule); - if (mm_ops->icache_inval_pou && stage2_pte_executable(new)) + if (!kvm_pgtable_walk_skip_cmo(ctx) && mm_ops->icache_inval_pou && + stage2_pte_executable(new)) mm_ops->icache_inval_pou(kvm_pte_follow(new, mm_ops), granule); stage2_make_pte(ctx, new); From patchwork Sat Feb 18 03:23:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145470 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87C3CC636D6 for ; Sat, 18 Feb 2023 03:23:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229673AbjBRDXX (ORCPT ); Fri, 17 Feb 2023 22:23:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229506AbjBRDXW (ORCPT ); Fri, 17 Feb 2023 22:23:22 -0500 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 220EF298E4 for ; Fri, 17 Feb 2023 19:23:21 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5366c22f138so15241517b3.10 for ; Fri, 17 Feb 2023 19:23:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=PtIyUzFZ+POnJVZohdUvw1RC8Y306B94OJRPN02aY5c=; b=QsgbrZlEi/GSuCuA7VqSJwr0EbqZk15xghg0MEeMJz3ukp2oWwmJioHUunXOyKO6Vy lpC8ZwHBaRHAlzyrAxuiQGsrk5liGTE6g45MndH31A4qfkLWojBd1aCZQuD6DMXlA806 h3mA8K9YeW4vLINy6ISLvQ1st9ABmnhZv0Dgvc9fJ8+6x5/wKK5WNeVhUSipjFsnjxoo f/8NDCIqgjDNL8JoWf4XXHHj+AHFZRiwlMgbnA48sS3pYzzPzBw6ak4ptnXmhvwnaYP/ hln4dEo8MUye8l5HtH99LKsDpapy4hYhivyckGcwKWA4vxESOa8JrvfBOcbiE775x75Y Gn6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=PtIyUzFZ+POnJVZohdUvw1RC8Y306B94OJRPN02aY5c=; b=vfoZTSneuBF2dKdxGVewwmw1ChQ3DxAX5t31kycNVljEA3LfcMKZtvkrD4KzzN29F1 U3Jlsi7OLd1WapkW9YAYRzkc/q2vkQA8LXpLGqPuCHGwONinMlmd510SjAx/1wV7lAdr p9ZU4wRwjo3URJ8Ro43M94Rcc/VhuGoOrcVc9KQV0fAabUoWuC4Sxk51INWumjs2esy1 rllgYXmfJK1RxaB5DBKbTmVi+V8VLSiuAfhfwTyREfhX/02jK9av2VZz4PI5+ws2oqmr K5RmLWWFwD34uYP2ZlXP4zTBfQ4Ww3aA9/7aD3BrkyaCYFVc88lKZtmTTXYVtjOb4QFb tCmQ== X-Gm-Message-State: AO0yUKVreEMzXJSgpVh6D4ltvFWuIccMRz0tT+0a/wyDLVw+YqzCA6vL 8yz4bRPWKrZTYAwOFsdSu31kpn6CNB66bA== X-Google-Smtp-Source: 
AK7set+V3zRtYJ8+vOkQY169szCDURd19+1oKMsz/bWzZPDgcECJmnpTwzBdBirzbaOdf3cXwjZcIhLZ1Qbe6Q== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a05:6902:10c:b0:997:c919:4484 with SMTP id o12-20020a056902010c00b00997c9194484mr49779ybh.6.1676690600280; Fri, 17 Feb 2023 19:23:20 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:04 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-3-ricarkol@google.com> Subject: [PATCH v4 02/12] KVM: arm64: Rename free_unlinked to free_removed From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller , Oliver Upton Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Normalize on referring to tables outside of an active paging structure as 'unlinked'. A subsequent change to KVM will add support for building page tables that are not part of an active paging structure. The existing 'removed_table' terminology is quite clunky when applied in this context. No functional change intended. Signed-off-by: Ricardo Koller Reviewed-by: Oliver Upton --- arch/arm64/include/asm/kvm_pgtable.h | 8 ++++---- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 6 +++--- arch/arm64/kvm/hyp/pgtable.c | 6 +++--- arch/arm64/kvm/mmu.c | 10 +++++----- 4 files changed, 15 insertions(+), 15 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 252b651f743d..dcd3aafd3e6c 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -99,7 +99,7 @@ static inline bool kvm_level_supports_block_mapping(u32 level) * allocation is physically contiguous. * @free_pages_exact: Free an exact number of memory pages previously * allocated by zalloc_pages_exact. - * @free_removed_table: Free a removed paging structure by unlinking and + * @free_unlinked_table: Free an unlinked paging structure by unlinking and * dropping references. * @get_page: Increment the refcount on a page. * @put_page: Decrement the refcount on a page. When the @@ -119,7 +119,7 @@ struct kvm_pgtable_mm_ops { void* (*zalloc_page)(void *arg); void* (*zalloc_pages_exact)(size_t size); void (*free_pages_exact)(void *addr, size_t size); - void (*free_removed_table)(void *addr, u32 level); + void (*free_unlinked_table)(void *addr, u32 level); void (*get_page)(void *addr); void (*put_page)(void *addr); int (*page_count)(void *addr); @@ -450,7 +450,7 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt); /** - * kvm_pgtable_stage2_free_removed() - Free a removed stage-2 paging structure. + * kvm_pgtable_stage2_free_unlinked() - Free an unlinked stage-2 paging structure. * @mm_ops: Memory management callbacks. * @pgtable: Unlinked stage-2 paging structure to be freed. * @level: Level of the stage-2 paging structure to be freed. 
@@ -458,7 +458,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt); * The page-table is assumed to be unreachable by any hardware walkers prior to * freeing and therefore no TLB invalidation is performed. */ -void kvm_pgtable_stage2_free_removed(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level); +void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level); /** * kvm_pgtable_stage2_map() - Install a mapping in a guest stage-2 page-table. diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 552653fa18be..b030170d803b 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -91,9 +91,9 @@ static void host_s2_put_page(void *addr) hyp_put_page(&host_s2_pool, addr); } -static void host_s2_free_removed_table(void *addr, u32 level) +static void host_s2_free_unlinked_table(void *addr, u32 level) { - kvm_pgtable_stage2_free_removed(&host_mmu.mm_ops, addr, level); + kvm_pgtable_stage2_free_unlinked(&host_mmu.mm_ops, addr, level); } static int prepare_s2_pool(void *pgt_pool_base) @@ -110,7 +110,7 @@ static int prepare_s2_pool(void *pgt_pool_base) host_mmu.mm_ops = (struct kvm_pgtable_mm_ops) { .zalloc_pages_exact = host_s2_zalloc_pages_exact, .zalloc_page = host_s2_zalloc_page, - .free_removed_table = host_s2_free_removed_table, + .free_unlinked_table = host_s2_free_unlinked_table, .phys_to_virt = hyp_phys_to_virt, .virt_to_phys = hyp_virt_to_phys, .page_count = hyp_page_count, diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index e093e222daf3..0a5ef9288371 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -841,7 +841,7 @@ static int stage2_map_walk_table_pre(const struct kvm_pgtable_visit_ctx *ctx, if (ret) return ret; - mm_ops->free_removed_table(childp, ctx->level); + mm_ops->free_unlinked_table(childp, ctx->level); return 0; } @@ -886,7 +886,7 @@ static int stage2_map_walk_leaf(const struct kvm_pgtable_visit_ctx *ctx, * The TABLE_PRE callback runs for table entries on the way down, looking * for table entries which we could conceivably replace with a block entry * for this mapping. If it finds one it replaces the entry and calls - * kvm_pgtable_mm_ops::free_removed_table() to tear down the detached table. + * kvm_pgtable_mm_ops::free_unlinked_table() to tear down the detached table. * * Otherwise, the LEAF callback performs the mapping at the existing leaves * instead. 
@@ -1250,7 +1250,7 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt) pgt->pgd = NULL; } -void kvm_pgtable_stage2_free_removed(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level) +void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level) { kvm_pteref_t ptep = (kvm_pteref_t)pgtable; struct kvm_pgtable_walker walker = { diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index a3ee3b605c9b..9bd3c2cfb476 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -130,21 +130,21 @@ static void kvm_s2_free_pages_exact(void *virt, size_t size) static struct kvm_pgtable_mm_ops kvm_s2_mm_ops; -static void stage2_free_removed_table_rcu_cb(struct rcu_head *head) +static void stage2_free_unlinked_table_rcu_cb(struct rcu_head *head) { struct page *page = container_of(head, struct page, rcu_head); void *pgtable = page_to_virt(page); u32 level = page_private(page); - kvm_pgtable_stage2_free_removed(&kvm_s2_mm_ops, pgtable, level); + kvm_pgtable_stage2_free_unlinked(&kvm_s2_mm_ops, pgtable, level); } -static void stage2_free_removed_table(void *addr, u32 level) +static void stage2_free_unlinked_table(void *addr, u32 level) { struct page *page = virt_to_page(addr); set_page_private(page, (unsigned long)level); - call_rcu(&page->rcu_head, stage2_free_removed_table_rcu_cb); + call_rcu(&page->rcu_head, stage2_free_unlinked_table_rcu_cb); } static void kvm_host_get_page(void *addr) @@ -681,7 +681,7 @@ static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = { .zalloc_page = stage2_memcache_zalloc_page, .zalloc_pages_exact = kvm_s2_zalloc_pages_exact, .free_pages_exact = kvm_s2_free_pages_exact, - .free_removed_table = stage2_free_removed_table, + .free_unlinked_table = stage2_free_unlinked_table, .get_page = kvm_host_get_page, .put_page = kvm_s2_put_page, .page_count = kvm_host_page_count, From patchwork Sat Feb 18 03:23:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00AF7C6379F for ; Sat, 18 Feb 2023 03:23:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229679AbjBRDXZ (ORCPT ); Fri, 17 Feb 2023 22:23:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33240 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229506AbjBRDXX (ORCPT ); Fri, 17 Feb 2023 22:23:23 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A77F21A951 for ; Fri, 17 Feb 2023 19:23:22 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id b15-20020a252e4f000000b008ee1c76c25dso2407024ybn.11 for ; Fri, 17 Feb 2023 19:23:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=IPKuIrgtmaUvnSQsSTqJdSngfJOw0G5AsmYHjfQDA+w=; b=Df1ZtaZZDKkMQcinUaB82zEycAOToO5IWYefHKGUwkjpNvaM1eFEGTH2JKXaM97RQt sx1LAS0HYD0+iSqWZcKnhakihEd2vR85fH8b49cYd0v+AA9ePg1fKRrR0gpqHBVux4PU VkvWLwg+lL6aTRSvNvchziRnyznLen1sTUgeyF1d+m/BgCbUGc5YcNc4+c0ajIH4O7AT 
MHr8N9LGGxyxeUIQRRZH46vZzLEyQ4UknI8YAMpZIZfh3ULGNzMcUtJ9JYqS0eQcqm6W 9UB5p+qLsKYq7yvsplYerlWZjYcdw4Jt1oO2R/4z6Mog8AVlsFoE/pTbqsqySH1rwXMK YWPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=IPKuIrgtmaUvnSQsSTqJdSngfJOw0G5AsmYHjfQDA+w=; b=ABKkVEaxoSCZcsFYPd9V3zbBqcBpmC+2l/9O0hfeT7iU3+G8HpmTsvYUd2ezpmKH8m TQNgBmREeYLTGBURDhnJlKArZE8pY+uMstB/FKaHcCAw8z4IDlT59hMHMbiee3I3JjEg /TMQucP3f0E/1ZkTDrQdWi1iL8odpNCcZZ+YhkfhZiRW92nWVtVlZ0wIA9oY0MqWIU3+ 5WOz3C9eUUvbHfjtqh/pAHoUfFhNMBFIUxkktnbMbGhPz5jyito5dW5weNiOof/3B6vy 3p+T6EXjMtbsD2o+gEpqyf0P5MHmQhwGYnjXBj+D3WSLSyi+w9WD87k7NIr0h6TOpTIc MhJg== X-Gm-Message-State: AO0yUKVOeon+I8s59B5A1dthtFy+jEQaO1J6Y5wwv0UecdmwF8GqY3b0 j/Z2Ip2Ljd9TYk51z1gtytwUUeAKNuOZRQ== X-Google-Smtp-Source: AK7set+LpHmmkgDl+/xDHOC/PKsWo6xoTJCTFIX3caU3Y29wflPl4vB0beoWr8mP2hw10HMmU41BpbJMuaJfwA== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a81:7b02:0:b0:52e:d589:c893 with SMTP id w2-20020a817b02000000b0052ed589c893mr1411708ywc.457.1676690601872; Fri, 17 Feb 2023 19:23:21 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:05 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-4-ricarkol@google.com> Subject: [PATCH v4 03/12] KVM: arm64: Add helper for creating unlinked stage2 subtrees From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a stage2 helper, kvm_pgtable_stage2_create_unlinked(), for creating unlinked tables (which is the opposite of kvm_pgtable_stage2_free_unlinked()). Creating an unlinked table is useful for splitting PMD and PUD blocks into subtrees of PAGE_SIZE PTEs. For example, a PUD can be split into PAGE_SIZE PTEs by first creating a fully populated tree, and then use it to replace the PUD in a single step. This will be used in a subsequent commit for eager huge-page splitting (a dirty-logging optimization). No functional change intended. This new function will be used in a subsequent commit. Signed-off-by: Ricardo Koller --- arch/arm64/include/asm/kvm_pgtable.h | 28 +++++++++++++++++ arch/arm64/kvm/hyp/pgtable.c | 46 ++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index dcd3aafd3e6c..b8cde914cca9 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -460,6 +460,34 @@ void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt); */ void kvm_pgtable_stage2_free_unlinked(struct kvm_pgtable_mm_ops *mm_ops, void *pgtable, u32 level); +/** + * kvm_pgtable_stage2_create_unlinked() - Create an unlinked stage-2 paging structure. + * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*(). + * @phys: Physical address of the memory to map. 
+ * @level: Starting level of the stage-2 paging structure to be created. + * @prot: Permissions and attributes for the mapping. + * @mc: Cache of pre-allocated and zeroed memory from which to allocate + * page-table pages. + * @force_pte: Force mappings to PAGE_SIZE granularity. + * + * Create an unlinked page-table tree under @new. If @force_pte is + * true or @level is 2 (the PMD level), then the tree is mapped up to + * the PAGE_SIZE leaf PTE; the tree is mapped up one level otherwise. + * This new page-table tree is not reachable (i.e., it is unlinked) + * from the root pgd and it's therefore unreachableby the hardware + * page-table walker. No TLB invalidation or CMOs are performed. + * + * If device attributes are not explicitly requested in @prot, then the + * mapping will be normal, cacheable. + * + * Return: The fully populated (unlinked) stage-2 paging structure, or + * an ERR_PTR(error) on failure. + */ +kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt, + u64 phys, u32 level, + enum kvm_pgtable_prot prot, + void *mc, bool force_pte); + /** * kvm_pgtable_stage2_map() - Install a mapping in a guest stage-2 page-table. * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init*(). diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 0a5ef9288371..80f2965ab0fe 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -1181,6 +1181,52 @@ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size) return kvm_pgtable_walk(pgt, addr, size, &walker); } +kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt, + u64 phys, u32 level, + enum kvm_pgtable_prot prot, + void *mc, bool force_pte) +{ + struct stage2_map_data map_data = { + .phys = phys, + .mmu = pgt->mmu, + .memcache = mc, + .force_pte = force_pte, + }; + struct kvm_pgtable_walker walker = { + .cb = stage2_map_walker, + .flags = KVM_PGTABLE_WALK_LEAF | + KVM_PGTABLE_WALK_SKIP_BBM | + KVM_PGTABLE_WALK_SKIP_CMO, + .arg = &map_data, + }; + /* .addr (the IPA) is irrelevant for a removed table */ + struct kvm_pgtable_walk_data data = { + .walker = &walker, + .addr = 0, + .end = kvm_granule_size(level), + }; + struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops; + kvm_pte_t *pgtable; + int ret; + + ret = stage2_set_prot_attr(pgt, prot, &map_data.attr); + if (ret) + return ERR_PTR(ret); + + pgtable = mm_ops->zalloc_page(mc); + if (!pgtable) + return ERR_PTR(-ENOMEM); + + ret = __kvm_pgtable_walk(&data, mm_ops, (kvm_pteref_t)pgtable, + level + 1); + if (ret) { + kvm_pgtable_stage2_free_unlinked(mm_ops, pgtable, level); + mm_ops->put_page(pgtable); + return ERR_PTR(ret); + } + + return pgtable; +} int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, struct kvm_pgtable_mm_ops *mm_ops, From patchwork Sat Feb 18 03:23:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145472 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24D8FC05027 for ; Sat, 18 Feb 2023 03:23:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229724AbjBRDX2 (ORCPT ); Fri, 17 Feb 2023 22:23:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33300 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S229687AbjBRDXZ (ORCPT ); Fri, 17 Feb 2023 22:23:25 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 35EA61422D for ; Fri, 17 Feb 2023 19:23:24 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id o14-20020a25810e000000b0095d2ada3d26so2133497ybk.5 for ; Fri, 17 Feb 2023 19:23:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=sWl8r2vGcb6fsDM+UiT1aLTOtpAnLts9h5IrkrEYugA=; b=Kltg1Qd9h6AL9AG4oBeoxQOUxL3J93hRT0ksFS8TFi6GLUvjhAFfQGq0+N7V+8WCSA JOpkgEt2B5PaAPuMND3GpXYtuUAUO2NZUbBIZFZ0lJ1Ie4Cld/XG2Pj8t5mIgh6mCr3V CBubLCCONIwwzGdlZX3T+sitdQF8j8Jwdp7QBe4mkhAEruK3CQhc+64EtgovaLxVK9kx z6gapwP+KjR3Bo+jkj8wDTs9UDNaJ74Ny279ufBfDMNDqUbRuU2WDtItroIJPM6XpZfU q6ojQHoXiiq7YcrqA2un1ibuBTYsvluXHoP4RzK5wzDCS1Qz6WK/aVyyBvMZFMcl1uBI vZpQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=sWl8r2vGcb6fsDM+UiT1aLTOtpAnLts9h5IrkrEYugA=; b=oWbZGmWaeBshtscn7NsjamgvlmDzCV/ewLKk1BJ64i64doeeMOeACmS4j0H6pcYi9r HRumLaCmOLF8dcYQXdtoo81uzeWjY8NGxurvXuQtdkIghVoHRBAcalA03iuY0H54l2Ip JfQY6WEacn5CoVVdpCdUDtkviZvbY+SOrJKPlqr6MXyHygv6kidF0nnljrSUwKGHE3uE VZpJsOQr9yHMfZCN+N0ACtj/u6woXqZOWu9PaK7XFSMg6gz1x/51vM4poSZbT+Bwywrj C7s5UeQ5c8uCQdMdTAchR/Zx+T1edEE6WYBBcHwAaZCpsY/TADFTDTKDDQbjgQH4Y79H 8sbw== X-Gm-Message-State: AO0yUKWnvw3RnEI25UK9SiG6ccklKk8IGTfaqaBj7L43fHanJd69osrb UcESlQDTRPtUqYiq00ZDnDMAPZlE3DeWsg== X-Google-Smtp-Source: AK7set+b3N/Tt0sdPcR8K3f0xGAIaPQ87xNO0k/s0lLwPfYU1eTcY3bTamWct8qMgH8g6KiEtD25b1QgPkuB/Q== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a05:6902:10e:b0:95d:6b4f:a73a with SMTP id o14-20020a056902010e00b0095d6b4fa73amr28018ybh.8.1676690603463; Fri, 17 Feb 2023 19:23:23 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:06 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-5-ricarkol@google.com> Subject: [PATCH v4 04/12] KVM: arm64: Add kvm_pgtable_stage2_split() From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a new stage2 function, kvm_pgtable_stage2_split(), for splitting a range of huge pages. This will be used for eager-splitting huge pages into PAGE_SIZE pages. The goal is to avoid having to split huge pages on write-protection faults, and instead use this function to do it ahead of time for large ranges (e.g., all guest memory in 1G chunks at a time). No functional change intended. This new function will be used in a subsequent commit. 
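As an illustrative aside (not part of this patch, and assuming a 4KB granule): splitting one 1GB block down to PAGE_SIZE PTEs needs one page of PMDs plus one page of PTEs for each of its 512 2MB regions, i.e. 513 page-table pages, while a 2MB block needs a single page of PTEs. The stand-alone C sketch below only restates that arithmetic; the names are hypothetical.

#include <stdio.h>

#define PTRS_PER_PTE	512	/* 4KB granule: 512 eight-byte entries per table */

/* Pages of page-table memory needed to split one block at @level down to PTEs. */
static int pages_to_split_block(int level)
{
	switch (level) {
	case 1: return PTRS_PER_PTE + 1;	/* 1GB block: 1 page of PMDs + 512 pages of PTEs */
	case 2: return 1;			/* 2MB block: 1 page of PTEs */
	default: return 0;			/* already mapped at PAGE_SIZE */
	}
}

int main(void)
{
	printf("1GB block -> %d pages, 2MB block -> %d pages\n",
	       pages_to_split_block(1), pages_to_split_block(2));
	return 0;
}
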
Signed-off-by: Ricardo Koller --- arch/arm64/include/asm/kvm_pgtable.h | 30 +++++++ arch/arm64/kvm/hyp/pgtable.c | 113 +++++++++++++++++++++++++++ 2 files changed, 143 insertions(+) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index b8cde914cca9..6908109ac11e 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -657,6 +657,36 @@ bool kvm_pgtable_stage2_is_young(struct kvm_pgtable *pgt, u64 addr); */ int kvm_pgtable_stage2_flush(struct kvm_pgtable *pgt, u64 addr, u64 size); +/** + * kvm_pgtable_stage2_split() - Split a range of huge pages into leaf PTEs pointing + * to PAGE_SIZE guest pages. + * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init(). + * @addr: Intermediate physical address from which to split. + * @size: Size of the range. + * @mc: Cache of pre-allocated and zeroed memory from which to allocate + * page-table pages. + * @mc_capacity: Number of pages in @mc. + * + * @addr and the end (@addr + @size) are effectively aligned down and up to + * the top level huge-page block size. This is an example using 1GB + * huge-pages and 4KB granules. + * + * [---input range---] + * : : + * [--1G block pte--][--1G block pte--][--1G block pte--][--1G block pte--] + * : : + * [--2MB--][--2MB--][--2MB--][--2MB--] + * : : + * [ ][ ][:][ ][ ][ ][ ][ ][:][ ][ ][ ] + * : : + * + * Return: 0 on success, negative error code on failure. Note that + * kvm_pgtable_stage2_split() is best effort: it tries to break as many + * blocks in the input range as allowed by @mc_capacity. + */ +int kvm_pgtable_stage2_split(struct kvm_pgtable *pgt, u64 addr, u64 size, + void *mc, u64 mc_capacity); + /** * kvm_pgtable_walk() - Walk a page-table. * @pgt: Page-table structure initialised by kvm_pgtable_*_init(). diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 80f2965ab0fe..9f1c8fdd9330 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -1228,6 +1228,119 @@ kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt, return pgtable; } +struct stage2_split_data { + struct kvm_s2_mmu *mmu; + void *memcache; + u64 mc_capacity; +}; + +/* + * Get the number of page-tables needed to replace a block with a + * fully populated tree, up to the PTE level, at particular level. + */ +static inline int stage2_block_get_nr_page_tables(u32 level) +{ + if (WARN_ON_ONCE(level < KVM_PGTABLE_MIN_BLOCK_LEVEL || + level >= KVM_PGTABLE_MAX_LEVELS)) + return -EINVAL; + + switch (level) { + case 1: + return PTRS_PER_PTE + 1; + case 2: + return 1; + case 3: + return 0; + default: + return -EINVAL; + }; +} + +static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx, + enum kvm_pgtable_walk_flags visit) +{ + struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops; + struct stage2_split_data *data = ctx->arg; + kvm_pte_t pte = ctx->old, new, *childp; + enum kvm_pgtable_prot prot; + void *mc = data->memcache; + u32 level = ctx->level; + bool force_pte; + int nr_pages; + u64 phys; + + /* No huge-pages exist at the last level */ + if (level == KVM_PGTABLE_MAX_LEVELS - 1) + return 0; + + /* We only split valid block mappings */ + if (!kvm_pte_valid(pte)) + return 0; + + nr_pages = stage2_block_get_nr_page_tables(level); + if (nr_pages < 0) + return nr_pages; + + if (data->mc_capacity >= nr_pages) { + /* Build a tree mapped down to the PTE granularity. */ + force_pte = true; + } else { + /* + * Don't force PTEs. 
This requires a single page of PMDs at the + * PUD level, or a single page of PTEs at the PMD level. If we + * are at the PUD level, the PTEs will be created recursively. + */ + force_pte = false; + nr_pages = 1; + } + + if (data->mc_capacity < nr_pages) + return -ENOMEM; + + phys = kvm_pte_to_phys(pte); + prot = kvm_pgtable_stage2_pte_prot(pte); + + childp = kvm_pgtable_stage2_create_unlinked(data->mmu->pgt, phys, + level, prot, mc, force_pte); + if (IS_ERR(childp)) + return PTR_ERR(childp); + + if (!stage2_try_break_pte(ctx, data->mmu)) { + kvm_pgtable_stage2_free_unlinked(mm_ops, childp, level); + mm_ops->put_page(childp); + return -EAGAIN; + } + + /* + * Note, the contents of the page table are guaranteed to be made + * visible before the new PTE is assigned because stage2_make_pte() + * writes the PTE using smp_store_release(). + */ + new = kvm_init_table_pte(childp, mm_ops); + stage2_make_pte(ctx, new); + dsb(ishst); + data->mc_capacity -= nr_pages; + return 0; +} + +int kvm_pgtable_stage2_split(struct kvm_pgtable *pgt, u64 addr, u64 size, + void *mc, u64 mc_capacity) +{ + struct stage2_split_data split_data = { + .mmu = pgt->mmu, + .memcache = mc, + .mc_capacity = mc_capacity, + }; + + struct kvm_pgtable_walker walker = { + .cb = stage2_split_walker, + .flags = KVM_PGTABLE_WALK_LEAF, + .arg = &split_data, + }; + + return kvm_pgtable_walk(pgt, addr, size, &walker); +} + int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, struct kvm_pgtable_mm_ops *mm_ops, enum kvm_pgtable_stage2_flags flags, From patchwork Sat Feb 18 03:23:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE8CFC6379F for ; Sat, 18 Feb 2023 03:23:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229689AbjBRDX3 (ORCPT ); Fri, 17 Feb 2023 22:23:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229695AbjBRDX0 (ORCPT ); Fri, 17 Feb 2023 22:23:26 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95FBE68E6D for ; Fri, 17 Feb 2023 19:23:25 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id 75-20020a250b4e000000b0090f2c84a6a4so2318501ybl.13 for ; Fri, 17 Feb 2023 19:23:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=rbgzVORnpmf4LGOu3cnJ0X83jNhf+dzdrTmSVTaKZV8=; b=QU9GCG/O+l0wHwkEw7Ymi6rY8tvAR9gYzS+a0e7KHxTCwE9tCaLLE5yn4YuHRMW0a9 AXNc+zYd7tK9t37xcq0XGoxCo0BpLW7Do1lzXdj7HGIK6PQ72G0TwBlkWlqcjV3fVe8c zMUUQzayZ+XPtkqC53KBFClAJXqeYZP2o7n2KpBf0wJT/JeqBO0UwzZyBhy6spQhkHpj xwfkP0MiRzQMqks27Pf26e7sJl/yM+LjQiE+i23JsqeKg8lmLkg7OpI854Rr56x67Les 44QUWIKc4+tFbnyZdsOzy7rPgOP5nbICXn30c5Z7/00AmgzZhvQiTrrN2/K4ARKRCfnU IAYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; 
bh=rbgzVORnpmf4LGOu3cnJ0X83jNhf+dzdrTmSVTaKZV8=; b=nejcQzrmA2cAL1RpoCZjID7nbHllaNc1NS2oagoghAkTUtEvBIA4BTs5f1mewJSZYE QlopxGbpPnqHTzfQLxdiaAxHXROv8jgtQK+pZPI8/3FC03Y4v94GUX9jnb/VKR/cAoRk 2AGEhEpfPkCdPD07k88BZKbRizXLEeKc4Ro0pd05lCeTW6TkZJeRxusE1Y3WuXEKOMRs 6GnNz6ZvoW+blU+6hA9LLK3JIM7rjb8+cp/gdJlL+3yeMEodPhGRF0NjckGcwdd7V/ap d+ibWNscUU18pIwOfg0GiN0LAs4GKJY/mqyU79F0BAnD4zjUNfKqOenhcSUCqH9/QiWx OtNQ== X-Gm-Message-State: AO0yUKW+qOZafuAU16dASCMbY5w1sE6SrAge5fUj/za+zf4qnYMv7iRy RTZba2OBHf8ZsJN4Td3xVVc9x2H6s8d5KA== X-Google-Smtp-Source: AK7set8GJdorUdMgVoGdwXGY2XY+L5DQ5JduEeXIylQUvnCnu1ULOjCB0q94gwRBydtPqZgOcrFvDlc9CzCMjA== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a25:97c2:0:b0:9a0:1d7b:707b with SMTP id j2-20020a2597c2000000b009a01d7b707bmr11456ybo.4.1676690604853; Fri, 17 Feb 2023 19:23:24 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:07 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-6-ricarkol@google.com> Subject: [PATCH v4 05/12] KVM: arm64: Refactor kvm_arch_commit_memory_region() From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Refactor kvm_arch_commit_memory_region() as a preparation for a future commit to look cleaner and more understandable. Also, it looks more like its x86 counterpart (in kvm_mmu_slot_apply_flags()). No functional change intended. Signed-off-by: Ricardo Koller --- arch/arm64/kvm/mmu.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 9bd3c2cfb476..d2c5e6992459 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1761,20 +1761,27 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, const struct kvm_memory_slot *new, enum kvm_mr_change change) { + bool log_dirty_pages = new && new->flags & KVM_MEM_LOG_DIRTY_PAGES; + /* * At this point memslot has been committed and there is an * allocated dirty_bitmap[], dirty pages will be tracked while the * memory slot is write protected. */ - if (change != KVM_MR_DELETE && new->flags & KVM_MEM_LOG_DIRTY_PAGES) { + if (log_dirty_pages) { + + if (change == KVM_MR_DELETE) + return; + /* * If we're with initial-all-set, we don't need to write * protect any pages because they're all reported as dirty. * Huge pages and normal pages will be write protect gradually. 
*/ - if (!kvm_dirty_log_manual_protect_and_init_set(kvm)) { - kvm_mmu_wp_memory_region(kvm, new->id); - } + if (kvm_dirty_log_manual_protect_and_init_set(kvm)) + return; + + kvm_mmu_wp_memory_region(kvm, new->id); } } From patchwork Sat Feb 18 03:23:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145474 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1C3CC64ED6 for ; Sat, 18 Feb 2023 03:23:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229553AbjBRDXb (ORCPT ); Fri, 17 Feb 2023 22:23:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229728AbjBRDX2 (ORCPT ); Fri, 17 Feb 2023 22:23:28 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E7F3A63BCC for ; Fri, 17 Feb 2023 19:23:26 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id o14-20020a25810e000000b0095d2ada3d26so2133566ybk.5 for ; Fri, 17 Feb 2023 19:23:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=GBUZ1MvoHSzPOlntwcFWNJ7NDg9kNgKK6wVJ4q5z3So=; b=Pg7rxK6++ZYeD8KtaidJr0Wlm3ceogsYfwldGjTjvNz1HwY9odkMb5eIjO74WA6uF6 q0aVxvuXklbDuKZF6u60OW8A97f/zWkg/cmBUzNHuOvkxwXhjall/kirQzn7IKrGS1Ib ztKTTjRT7Rgo/vYTI4A2YLDV9FjX0PeD2ZJMdXZwfXCcvam5YDHgtYFp0UA7JuQ8wq8k uYubfSdMLjySUAx4uKzh9UrLkIaN/vhK163zYJ8tUTKOgfDIniJFB2x8/SlYbcFJkNwy TQSKB/MxzkMyL9YJ6Won8+r4yWNQsLLj/UMUAVjUN6hRgwDwZ3w+G1fqBHAJYcj+gz1G aCSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=GBUZ1MvoHSzPOlntwcFWNJ7NDg9kNgKK6wVJ4q5z3So=; b=F7vuOC0BaTh3+TjZKHWz97bqJPAfZ3L2bLtohKEwSm7lx64njeCpk/rYvWS8fGddms /s91LcPvg12pxNKTHzClTJVG0mGh6WIOf3bvqbRbstO1YG1XYaGTrpG/fms6fLJowWkZ 8rLIegMZHY+VroQb6yFQ3b3ZL3Q03VWLMuVkxj4DVhKunQ2QdkQ25MTGJXL4MnG3hpps ZGXwLzNHcWtv8JZTXDfGMC9MoLSSofn0OPpSyTyJnPT6Hl0VRR5cU6xAt6RjwDTANYB3 /cc30Wjwne5Nj4o/ZOAVamF0GUXsaEoMH3GMMP3JoEB9727iw1XIfKJEGY0Vlb4ySrfG HUOA== X-Gm-Message-State: AO0yUKVPaVTQJDDOvAp0HccPGz2wd4U/9GeitPDRyKX1xfonI2pDTFMc CiETwRdUIYrgK9p3LgKQU6yj0qUzZYdXOw== X-Google-Smtp-Source: AK7set988k9/1hvcxfifUIiwdm6LLPOaqIcWDkdvnEu0U/xKw1bqb0HuinJzSTR6CuyoTY6y5/u56D+2UWFkGQ== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a25:9341:0:b0:8bb:dfe8:a33b with SMTP id g1-20020a259341000000b008bbdfe8a33bmr216446ybo.9.1676690606409; Fri, 17 Feb 2023 19:23:26 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:08 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-7-ricarkol@google.com> Subject: [PATCH v4 06/12] KVM: arm64: Add kvm_uninit_stage2_mmu() From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, 
yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add kvm_uninit_stage2_mmu() and move kvm_free_stage2_pgd() into it. A future commit will add some more things to do inside of kvm_uninit_stage2_mmu(). No functional change intended. Signed-off-by: Ricardo Koller --- arch/arm64/include/asm/kvm_mmu.h | 1 + arch/arm64/kvm/mmu.c | 7 ++++++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index e4a7e6369499..058f3ae5bc26 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -167,6 +167,7 @@ void free_hyp_pgds(void); void stage2_unmap_vm(struct kvm *kvm); int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type); +void kvm_uninit_stage2_mmu(struct kvm *kvm); void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu); int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, phys_addr_t pa, unsigned long size, bool writable); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index d2c5e6992459..812633a75e74 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -766,6 +766,11 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t return err; } +void kvm_uninit_stage2_mmu(struct kvm *kvm) +{ + kvm_free_stage2_pgd(&kvm->arch.mmu); +} + static void stage2_unmap_memslot(struct kvm *kvm, struct kvm_memory_slot *memslot) { @@ -1855,7 +1860,7 @@ void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) void kvm_arch_flush_shadow_all(struct kvm *kvm) { - kvm_free_stage2_pgd(&kvm->arch.mmu); + kvm_uninit_stage2_mmu(kvm); } void kvm_arch_flush_shadow_memslot(struct kvm *kvm, From patchwork Sat Feb 18 03:23:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145475 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78B13C05027 for ; Sat, 18 Feb 2023 03:23:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229605AbjBRDXc (ORCPT ); Fri, 17 Feb 2023 22:23:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33482 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229623AbjBRDX3 (ORCPT ); Fri, 17 Feb 2023 22:23:29 -0500 Received: from mail-pf1-x449.google.com (mail-pf1-x449.google.com [IPv6:2607:f8b0:4864:20::449]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C13082C66D for ; Fri, 17 Feb 2023 19:23:28 -0800 (PST) Received: by mail-pf1-x449.google.com with SMTP id x25-20020aa793b9000000b005a8ad1228d4so1355815pff.10 for ; Fri, 17 Feb 2023 19:23:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1676690608; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=V+h39w+M5y7yM65vTe8AIma457bKn0boXmudJkmUO64=; b=OOGQDAozfJPxU4cPh+eNsnFORVxhBseQ4po+PMVYCaaHrMa27a9eZwz2HjcuIdNgVh 
slYOm4TsOONQCaqTOCU64Vyqjc6wA23716ZiGacc/GuZx9D+Ze+sZX3iXZOiGOJPfzWw DEoYq/LLYfSROStk+2ZwCOK08p6EUsCQQomIY0RjEmXUx1ogiu0aDnkPstrbqkFIAsIO MmTUzMho1a89tjVA6XSHJPzknQsAy9d3/oybmTkcrgybl1jryNlrmOncv1u6qAqxxR1C WAcxMZX5PvwwVrEDyVwVNABO4a3hoCllU8GjbsW9Wy5fF9tXx1SQaV4ccNARUsNP0pKc brWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1676690608; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=V+h39w+M5y7yM65vTe8AIma457bKn0boXmudJkmUO64=; b=nAQj68cc5bv9oVjg21Q9nCCLp4Bkv4J2NCIguVSalMWB/oSSqyV8xduYaRysrgikP7 Y+tAFg51vVIDd5xX/JKoumSh08TzD5loOnuHULdzi6maBooM0SkPx3lV7BXo01MvGKcA pRNdbadBrmYp9liXdxmwlINvRQIdQ6o9Bkl38GhNZBYk6ivFUyOio64QN2M8yqIU3cSa VCmHzVQqOJ5S8Wc7KJUkPcnGvzeTSLuI2cXKe5cyH52FxbM+o0AAU4SLxb2ZiGUprU04 u3KmP/OjCJxTvliH5fatDDXZB6ewagWY19rIbSTH/M7u8Zfpfp+/RXpGdZV6jUOcZo3s I0zA== X-Gm-Message-State: AO0yUKUIfH9Wd2y/AZmNoqtQHRxZIr6yC9Jm9hayhVJW7MfZEVZqv42v 20rX2IrOMMb844jCB+dwiagki/RkPQC2vg== X-Google-Smtp-Source: AK7set+NggMflM6BNmRBHhGqrUcGVQByOp77BaqX2/+UoO+KkK/O0C3um/TDX+sbYIPuNVsMZ/613iCYuJMYfA== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a17:902:7207:b0:19c:2b30:f22e with SMTP id ba7-20020a170902720700b0019c2b30f22emr354082plb.11.1676690608082; Fri, 17 Feb 2023 19:23:28 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:09 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-8-ricarkol@google.com> Subject: [PATCH v4 07/12] KVM: arm64: Export kvm_are_all_memslots_empty() From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Export kvm_are_all_memslots_empty(). This will be used by a future commit when checking before setting a capability. No functional change intended. 
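For context, a lightly trimmed sketch of the intended call pattern (the real user arrives later in this series, in patch 08): the chunk-size capability handler checks under slots_lock that no memslot exists yet before accepting a new value.

	case KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE:
		mutex_lock(&kvm->slots_lock);
		/* Only allow changing the chunk size before any memslot is created. */
		if (!kvm_are_all_memslots_empty(kvm)) {
			r = -EINVAL;
		} else {
			r = 0;
			kvm->arch.mmu.split_page_chunk_size = cap->args[0];
		}
		mutex_unlock(&kvm->slots_lock);
		break;
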
Signed-off-by: Ricardo Koller --- include/linux/kvm_host.h | 2 ++ virt/kvm/kvm_main.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 4f26b244f6d0..8c5530e03a78 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -991,6 +991,8 @@ static inline bool kvm_memslots_empty(struct kvm_memslots *slots) return RB_EMPTY_ROOT(&slots->gfn_tree); } +bool kvm_are_all_memslots_empty(struct kvm *kvm); + #define kvm_for_each_memslot(memslot, bkt, slots) \ hash_for_each(slots->id_hash, bkt, memslot, id_node[slots->node_idx]) \ if (WARN_ON_ONCE(!memslot->npages)) { \ diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 9c60384b5ae0..3940d2467e1b 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4604,7 +4604,7 @@ int __attribute__((weak)) kvm_vm_ioctl_enable_cap(struct kvm *kvm, return -EINVAL; } -static bool kvm_are_all_memslots_empty(struct kvm *kvm) +bool kvm_are_all_memslots_empty(struct kvm *kvm) { int i; From patchwork Sat Feb 18 03:23:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145476 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73E42C6379F for ; Sat, 18 Feb 2023 03:23:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229728AbjBRDXe (ORCPT ); Fri, 17 Feb 2023 22:23:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229708AbjBRDXc (ORCPT ); Fri, 17 Feb 2023 22:23:32 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90ABF6C018 for ; Fri, 17 Feb 2023 19:23:30 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id g63-20020a25db42000000b00889c54916f2so2061241ybf.14 for ; Fri, 17 Feb 2023 19:23:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=gbSTlN/R7sFLt7NEbeV1LfkNJ13li3QgpciXxHucdPA=; b=E0v+pTVBD+PSap3EzWQTe4YgZHdD3KpfYSMjGDEHoHtzSZA1YJl3ulfRCks+5Owyt5 K6bRvtd7t4remMKuC1QA4CbInOm8Az9V70RCAmA4TTxkqVnGM3kryG4h/xiYALh/WZn+ w+A0HqPccdEyRgRY0A5xCezTI1SSIYjvDzBSVfaRn+YCUIOK2xRRIlz88uDHe+G0p7n4 lkeU7LLh50RJJ1qAuNVL5TfJuMwNIJNl501QI1ZVV13ldXugMj40GTYVUYC4IUMSBUjB fqGcKmqbuvYciJjKCMa/1ST96oCjo+imza9ssRGIstb5qvD397xkVtqaceDQKchyWNkn 82qw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gbSTlN/R7sFLt7NEbeV1LfkNJ13li3QgpciXxHucdPA=; b=zzKNLgMeNARGtnkgYm/5Utw7M3V14zX6ygLR5l0J/cyW2AEBFF/+jCAqYMLhB+vDEI 5objTAzaxuel21GFQoVNASthrS/D684FpIzsOPUu0nwkt47TFiNmaa73WMgqEekethC3 6axV2bdKiTRVnW3fYuBpXtHfHK+E1+jnJeVUZRSs6oTQHNRyCn7nbTQ4gyZQwwvxPax8 HOWdmIovZpK96cTwLR3BVvZB7eMAZgdhfz7i0bzip20T5as3OeMZu1k/VSPTMt10bPvI ntWYkCyRkDzURkia5EhEASZj036sPlHT7Um9VwoKmqrng/ZxNvhAfTg8WNxBm7mhCQJL g3EQ== X-Gm-Message-State: AO0yUKUjpeV6Trga2/T2Cf0PpCulaHGnw7QWMhZ9PSFnAxKHyx7GG9bC JplnyGfFKgzzKp9M84Zgg7WDSxJWY91h4Q== 
X-Google-Smtp-Source: AK7set+v9+2nJ8bLXXnFa4tQDFP6zCdWRXjm1KjqbKbtydcgLmUPwO9K33R4Fjrw+bmLxEo8QVvkvA80BWj83w== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a81:4703:0:b0:527:3e8:5a94 with SMTP id u3-20020a814703000000b0052703e85a94mr1004218ywa.68.1676690609825; Fri, 17 Feb 2023 19:23:29 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:10 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-9-ricarkol@google.com> Subject: [PATCH v4 08/12] KVM: arm64: Add KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller , Oliver Upton Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add a capability for userspace to specify the eager split chunk size. The chunk size specifies how many pages to break at a time, using a single allocation. Bigger the chunk size, more pages need to be allocated ahead of time. Suggested-by: Oliver Upton Signed-off-by: Ricardo Koller --- Documentation/virt/kvm/api.rst | 26 ++++++++++++++++++++++++++ arch/arm64/include/asm/kvm_host.h | 19 +++++++++++++++++++ arch/arm64/kvm/arm.c | 22 ++++++++++++++++++++++ arch/arm64/kvm/mmu.c | 3 +++ include/uapi/linux/kvm.h | 1 + 5 files changed, 71 insertions(+) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 9807b05a1b57..a9332e331cce 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -8284,6 +8284,32 @@ structure. When getting the Modified Change Topology Report value, the attr->addr must point to a byte where the value will be stored or retrieved from. +8.40 KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE +--------------------------------------- + +:Capability: KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE +:Architectures: arm64 +:Type: vm +:Parameters: arg[0] is the new chunk size. +:Returns: 0 on success, -EINVAL if any memslot has been created. + +This capability sets the chunk size used in Eager Page Splitting. + +Eager Page Splitting improves the performance of dirty-logging (used +in live migrations) when guest memory is backed by huge-pages. This +optimization is enabled by default on arm64. It avoids splitting +huge-pages (into PAGE_SIZE pages) on fault, by doing it eagerly when +enabling dirty logging (with the KVM_MEM_LOG_DIRTY_PAGES flag for a +memory region), or when using KVM_CLEAR_DIRTY_LOG. + +The chunk size specifies how many pages to break at a time, using a +single allocation for each chunk. Bigger the chunk size, more pages +need to be allocated ahead of time. A good heuristic is to pick the +size of the huge-pages as the chunk size. + +If the chunk size (arg[0]) is zero, then no eager page splitting is +performed. The default value PMD size (e.g., 2M when PAGE_SIZE is 4K). + 9. 
Known KVM API problems ========================= diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 35a159d131b5..1445cbf6295e 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -153,6 +153,25 @@ struct kvm_s2_mmu { /* The last vcpu id that ran on each physical CPU */ int __percpu *last_vcpu_ran; +#define KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT PMD_SIZE + /* + * Memory cache used to split + * KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE worth of huge pages. It + * is used to allocate stage2 page tables while splitting huge + * pages. Note that the choice of EAGER_PAGE_SPLIT_CHUNK_SIZE + * influences both the capacity of the split page cache, and + * how often KVM reschedules. Be wary of raising CHUNK_SIZE + * too high. + * + * A good heuristic to pick CHUNK_SIZE is that it should be + * the size of the huge-pages backing guest memory. If not + * known, the PMD size (usually 2M) is a good guess. + * + * Protected by kvm->slots_lock. + */ + struct kvm_mmu_memory_cache split_page_cache; + uint64_t split_page_chunk_size; + struct kvm_arch *arch; }; diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 9c5573bc4614..c80617ced599 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -101,6 +101,22 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, r = 0; set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags); break; + case KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE: + mutex_lock(&kvm->lock); + mutex_lock(&kvm->slots_lock); + /* + * To keep things simple, allow changing the chunk + * size only if there are no memslots created. + */ + if (!kvm_are_all_memslots_empty(kvm)) { + r = -EINVAL; + } else { + r = 0; + kvm->arch.mmu.split_page_chunk_size = cap->args[0]; + } + mutex_unlock(&kvm->slots_lock); + mutex_unlock(&kvm->lock); + break; default: r = -EINVAL; break; @@ -298,6 +314,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_ARM_PTRAUTH_GENERIC: r = system_has_full_ptr_auth(); break; + case KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE: + if (kvm) + r = kvm->arch.mmu.split_page_chunk_size; + else + r = KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT; + break; default: r = 0; } diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 812633a75e74..e2ada6588017 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -755,6 +755,9 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t for_each_possible_cpu(cpu) *per_cpu_ptr(mmu->last_vcpu_ran, cpu) = -1; + mmu->split_page_cache.gfp_zero = __GFP_ZERO; + mmu->split_page_chunk_size = KVM_ARM_EAGER_SPLIT_CHUNK_SIZE_DEFAULT; + mmu->pgt = pgt; mmu->pgd_phys = __pa(pgt->pgd); return 0; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 55155e262646..02e05f7918e2 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225 +#define KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE 226 #ifdef KVM_CAP_IRQ_ROUTING From patchwork Sat Feb 18 03:23:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145477 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP 
id 8B1B2C636D6 for ; Sat, 18 Feb 2023 03:23:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229711AbjBRDXg (ORCPT ); Fri, 17 Feb 2023 22:23:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33800 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229729AbjBRDXf (ORCPT ); Fri, 17 Feb 2023 22:23:35 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E5A36CA20 for ; Fri, 17 Feb 2023 19:23:32 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id q10-20020a056902150a00b0091b90b20cd9so2055360ybu.6 for ; Fri, 17 Feb 2023 19:23:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=+JtjRuHvJ4lJ74wAOo4i3rggWTQ8ctkUgc2yQnu50oA=; b=GJJlmnb0khkAFXuKmC3y/VTiEJJhY5DUzDBpMYQYuFm2JF0q/8JMtK1tVZTrqMQbgb MgeH6HQjUGhenQ65VfhQqFnXD+/aJiDAiGZtTcSDmNqdfJTxCjbIPsTLJrYHEnIlLUOe Pfqh4ne1n9D5QRNqiszdVRGYWmnZe/I2rMJAerBKPeGVomuYg2xcot3bQBIBB9EinHAJ zWIoTH4mfEf/q7dS9b7z9E3FTGXmBf14v3bc70JY+t6CrSJDuhxZm73T1OiLjpGGmGG2 CCH2oaZVOoQKsBS3dvH+taX5iNU8cA0Kl/9CkWbia1qbxf16aS6lhdFdY9P12Xb1tBRL oBSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=+JtjRuHvJ4lJ74wAOo4i3rggWTQ8ctkUgc2yQnu50oA=; b=BdebysKDWEo35xtpd7DrXkmgANN3U5K09bk7yTT5WVnRLrdA1Of5udH0xJRJvmXagu Ugj0w3NKqn4wHlWRTwdI4IwirGRj8OwhdECLx+H3Buf8BRHVo2O0fDNokK2l6fIP9gvG MMOsUDbiL2ELT4MgXxdGPaugTBg5HuPZwMS0Qhy0KBRX5oIOD/PiHYjEyHldwwKqLoLy eEthQwx3Ti72AzYsmh1dzaL0qBuJyIp3jg8mwkxICndStXy0ZEKZCTtdVnCNZ9vgm/X1 gnD2RZ26SxlLbM0O/SH+gucmIWZEj8B7VZ+miwFX+YmEyBa5+LPUfYnHp5lrwvSt52Hu oLsQ== X-Gm-Message-State: AO0yUKVPFvanSoJH3172Y3nzHCdR8BdA2UP6kAxi75J2sBmrAJAkady8 dUtJDtlKkzepv8QolkgQjvF/FNX8HkGfow== X-Google-Smtp-Source: AK7set8q81wH/uR+rJRPsM/n0wIjOLfaUQqZn+iLY2XLl5XqmsWVHNibRIv5e8qryy3EQ+UrB7EN8K4696WxIA== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a81:678a:0:b0:534:71c4:9aed with SMTP id b132-20020a81678a000000b0053471c49aedmr434402ywc.339.1676690611495; Fri, 17 Feb 2023 19:23:31 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:11 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-10-ricarkol@google.com> Subject: [PATCH v4 09/12] KVM: arm64: Split huge pages when dirty logging is enabled From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Split huge pages eagerly when enabling dirty logging. The goal is to avoid doing it while faulting on write-protected pages, which negatively impacts guest performance. 
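As a rough illustration of how userspace is expected to drive this (this sketch is not part of the patch: the helper names and error handling are made up, and KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE assumes uapi headers that already carry the definition added in the previous patch), the chunk size would be chosen right after VM creation, before any memslot exists, and eager splitting is then triggered by enabling dirty logging on a memslot:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Hypothetical helper: pick the eager-split chunk size on the VM fd.
 * Must run before any memslot is created, otherwise KVM returns -EINVAL. */
static int set_eager_split_chunk_size(int vm_fd, __u64 chunk_bytes)
{
	struct kvm_enable_cap cap = {
		.cap  = KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE,
		.args = { chunk_bytes },	/* e.g. 2M to match PMD-sized huge pages */
	};

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}

/* Hypothetical helper: re-register a memslot with dirty logging enabled,
 * which is the operation that makes KVM split the slot's huge pages eagerly. */
static int enable_dirty_logging(int vm_fd, struct kvm_userspace_memory_region *region)
{
	region->flags |= KVM_MEM_LOG_DIRTY_PAGES;
	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, region);
}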
A memslot marked for dirty logging is split in 1GB pieces at a time. This is in order to release the mmu_lock and give other kernel threads the opportunity to run, and also in order to allocate enough pages to split a 1GB range worth of huge pages (or a single 1GB huge page). Note that these page allocations can fail, so eager page splitting is best-effort. This is not a correctness issue though, as huge pages can still be split on write-faults. The benefits of eager page splitting are the same as in x86, added with commit a3fe5dbda0a4 ("KVM: x86/mmu: Split huge pages mapped by the TDP MMU when dirty logging is enabled"). For example, when running dirty_log_perf_test with 64 virtual CPUs (Ampere Altra), 1GB per vCPU, 50% reads, and 2MB HugeTLB memory, the time it takes vCPUs to access all of their memory after dirty logging is enabled decreased by 44% from 2.58s to 1.42s. Signed-off-by: Ricardo Koller --- arch/arm64/kvm/mmu.c | 118 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 116 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index e2ada6588017..20458251c85e 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -31,14 +31,21 @@ static phys_addr_t hyp_idmap_vector; static unsigned long io_map_base; -static phys_addr_t stage2_range_addr_end(phys_addr_t addr, phys_addr_t end) +static phys_addr_t __stage2_range_addr_end(phys_addr_t addr, phys_addr_t end, + phys_addr_t size) { - phys_addr_t size = kvm_granule_size(KVM_PGTABLE_MIN_BLOCK_LEVEL); phys_addr_t boundary = ALIGN_DOWN(addr + size, size); return (boundary - 1 < end - 1) ? boundary : end; } +static phys_addr_t stage2_range_addr_end(phys_addr_t addr, phys_addr_t end) +{ + phys_addr_t size = kvm_granule_size(KVM_PGTABLE_MIN_BLOCK_LEVEL); + + return __stage2_range_addr_end(addr, end, size); +} + /* * Release kvm_mmu_lock periodically if the memory region is large. Otherwise, * we may see kernel panics with CONFIG_DETECT_HUNG_TASK, @@ -71,6 +78,77 @@ static int stage2_apply_range(struct kvm *kvm, phys_addr_t addr, return ret; } +static bool need_topup_split_page_cache_or_resched(struct kvm *kvm, uint64_t min) +{ + struct kvm_mmu_memory_cache *cache; + + if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) + return true; + + cache = &kvm->arch.mmu.split_page_cache; + return kvm_mmu_memory_cache_nr_free_objects(cache) < min; +} + +/* + * Get the maximum number of page-tables needed to split a range of + * blocks into PAGE_SIZE PTEs. It assumes the range is already mapped + * at the PMD level, or at the PUD level if allowed. + */ +static int kvm_mmu_split_nr_page_tables(u64 range) +{ + int n = 0; + + if (KVM_PGTABLE_MIN_BLOCK_LEVEL < 2) + n += DIV_ROUND_UP_ULL(range, PUD_SIZE); + n += DIV_ROUND_UP_ULL(range, PMD_SIZE); + return n; +} + +static int kvm_mmu_split_huge_pages(struct kvm *kvm, phys_addr_t addr, + phys_addr_t end) +{ + struct kvm_mmu_memory_cache *cache; + struct kvm_pgtable *pgt; + int ret; + u64 next; + u64 chunk_size = kvm->arch.mmu.split_page_chunk_size; + int cache_capacity = kvm_mmu_split_nr_page_tables(chunk_size); + + if (chunk_size == 0) + return 0; + + lockdep_assert_held_write(&kvm->mmu_lock); + + cache = &kvm->arch.mmu.split_page_cache; + + do { + if (need_topup_split_page_cache_or_resched(kvm, + cache_capacity)) { + write_unlock(&kvm->mmu_lock); + cond_resched(); + /* Eager page splitting is best-effort. 
*/ + ret = __kvm_mmu_topup_memory_cache(cache, + cache_capacity, + cache_capacity); + write_lock(&kvm->mmu_lock); + if (ret) + break; + } + + pgt = kvm->arch.mmu.pgt; + if (!pgt) + return -EINVAL; + + next = __stage2_range_addr_end(addr, end, chunk_size); + ret = kvm_pgtable_stage2_split(pgt, addr, next - addr, + cache, cache_capacity); + if (ret) + break; + } while (addr = next, addr != end); + + return ret; +} + #define stage2_apply_range_resched(kvm, addr, end, fn) \ stage2_apply_range(kvm, addr, end, fn, true) @@ -772,6 +850,7 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t void kvm_uninit_stage2_mmu(struct kvm *kvm) { kvm_free_stage2_pgd(&kvm->arch.mmu); + kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache); } static void stage2_unmap_memslot(struct kvm *kvm, @@ -999,6 +1078,31 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, stage2_wp_range(&kvm->arch.mmu, start, end); } +/** + * kvm_mmu_split_memory_region() - split the stage 2 blocks into PAGE_SIZE + * pages for memory slot + * @kvm: The KVM pointer + * @slot: The memory slot to split + * + * Acquires kvm->mmu_lock. Called with kvm->slots_lock mutex acquired, + * serializing operations for VM memory regions. + */ +static void kvm_mmu_split_memory_region(struct kvm *kvm, int slot) +{ + struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memory_slot *memslot = id_to_memslot(slots, slot); + phys_addr_t start, end; + + lockdep_assert_held(&kvm->slots_lock); + + start = memslot->base_gfn << PAGE_SHIFT; + end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT; + + write_lock(&kvm->mmu_lock); + kvm_mmu_split_huge_pages(kvm, start, end); + write_unlock(&kvm->mmu_lock); +} + /* * kvm_arch_mmu_enable_log_dirty_pt_masked - enable dirty logging for selected * dirty pages. @@ -1790,6 +1894,16 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, return; kvm_mmu_wp_memory_region(kvm, new->id); + kvm_mmu_split_memory_region(kvm, new->id); + } else { + /* + * Free any leftovers from the eager page splitting cache. Do + * this when deleting, moving, disabling dirty logging, or + * creating the memslot (a nop). Doing it for deletes makes + * sure we don't leak memory, and there's no need to keep the + * cache around for any of the other cases. 
+ */ + kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache); } } From patchwork Sat Feb 18 03:23:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145478 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE986C636D6 for ; Sat, 18 Feb 2023 03:23:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229760AbjBRDXi (ORCPT ); Fri, 17 Feb 2023 22:23:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33904 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229729AbjBRDXh (ORCPT ); Fri, 17 Feb 2023 22:23:37 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5BF9C6C013 for ; Fri, 17 Feb 2023 19:23:33 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-53655de27a1so31656637b3.14 for ; Fri, 17 Feb 2023 19:23:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=/GR/sjDUZPfmGPpftS1VEvzIDAWFei9nVb6vABCcmnI=; b=Uima7vSx6WcxB5PWVawuvbz+4MoRtGvtZQP4OQDuV9tgWsk6GRiRwjIl/itkeG0Eec emdJVgV5GwiQnHcOZECCtS5iAuy2/3zhdhpSbd88fLDdY7mdI+2r0NU/+rdxQtFuBy/v aZPxmd66dsIslltCZIqnXV6WqBeb+BPr1x5f2vg7RDw1yMVx33NwqVVjxuzJr1Dg1qWJ H6ET2PUkRWQXljpkrYL4k2S0BFbUiNdav8tfRLpu589RNmb1qZsQLbvPFrv6PLJZFYfW OKMxl7KEJip8+d37UEQt/j4sVs7tQXhGW8gMIvlknR9eFj9srW+0Fa94INKT0D58sEOg xYYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/GR/sjDUZPfmGPpftS1VEvzIDAWFei9nVb6vABCcmnI=; b=F+8fmr4o0ttGoRvApnWkDkVSChDAbbofj6nU1tL+dmTEb+HZUv2T8VDD/NoPKIY6tT TrIycf3dYXvWXvJHoJqhn0az/WuAAHddG66WVXBU8qs4JclrAaDwLG3J0Vu9HWLhKgS/ pNr9zEsbUvPHshyPnDQUeUh+2O3HwUJ5tA6sbfGO46tGUFCjoxwLaD88jSTBT8GuV2UG vTMCC6Ctzly1ccdq+n/4IJODmOd2irBXhevkmFitlO2jjyeUT7wqBqgoKmn3B4c5m4LU olBIu9NuLTLq7Yy9ZKdrG1JUnATTCSFSMinViYFvxOaSJZoZmWGMvd26P7Gh06u3DDeG gn+A== X-Gm-Message-State: AO0yUKVCUOmOuMiHdXW+MqwndKxY7or27oz4RHPGdRm2focpwwS6tJnN bj39qfrVLSBh/O+tJ5ZNgX3bIUhRsulAbg== X-Google-Smtp-Source: AK7set/t1+3Hbkl9+VqIteNsWUqU8jbePr3G530RDyJGa8FC/0DW+/y3mO0WfD0yJ+Ly2FqDWjfl9LWu2tBLpg== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a81:bf47:0:b0:52f:3088:8aaa with SMTP id s7-20020a81bf47000000b0052f30888aaamr1560161ywk.345.1676690613052; Fri, 17 Feb 2023 19:23:33 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:12 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-11-ricarkol@google.com> Subject: [PATCH v4 10/12] KVM: arm64: Open-code kvm_mmu_write_protect_pt_masked() From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, 
andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Move the functionality of kvm_mmu_write_protect_pt_masked() into its caller, kvm_arch_mmu_enable_log_dirty_pt_masked(). This will be used in a subsequent commit in order to share some of the code in kvm_arch_mmu_enable_log_dirty_pt_masked(). No functional change intended. Signed-off-by: Ricardo Koller --- arch/arm64/kvm/mmu.c | 42 +++++++++++++++--------------------------- 1 file changed, 15 insertions(+), 27 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 20458251c85e..8e9d612dda00 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1056,28 +1056,6 @@ static void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot) kvm_flush_remote_tlbs(kvm); } -/** - * kvm_mmu_write_protect_pt_masked() - write protect dirty pages - * @kvm: The KVM pointer - * @slot: The memory slot associated with mask - * @gfn_offset: The gfn offset in memory slot - * @mask: The mask of dirty pages at offset 'gfn_offset' in this memory - * slot to be write protected - * - * Walks bits set in mask write protects the associated pte's. Caller must - * acquire kvm_mmu_lock. - */ -static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm, - struct kvm_memory_slot *slot, - gfn_t gfn_offset, unsigned long mask) -{ - phys_addr_t base_gfn = slot->base_gfn + gfn_offset; - phys_addr_t start = (base_gfn + __ffs(mask)) << PAGE_SHIFT; - phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT; - - stage2_wp_range(&kvm->arch.mmu, start, end); -} - /** * kvm_mmu_split_memory_region() - split the stage 2 blocks into PAGE_SIZE * pages for memory slot @@ -1104,17 +1082,27 @@ static void kvm_mmu_split_memory_region(struct kvm *kvm, int slot) } /* - * kvm_arch_mmu_enable_log_dirty_pt_masked - enable dirty logging for selected - * dirty pages. + * kvm_arch_mmu_enable_log_dirty_pt_masked() - enable dirty logging for selected pages. + * @kvm: The KVM pointer + * @slot: The memory slot associated with mask + * @gfn_offset: The gfn offset in memory slot + * @mask: The mask of pages at offset 'gfn_offset' in this memory + * slot to enable dirty logging on * - * It calls kvm_mmu_write_protect_pt_masked to write protect selected pages to - * enable dirty logging for them. + * Writes protect selected pages to enable dirty logging for them. Caller must + * acquire kvm->mmu_lock. 
*/ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn_offset, unsigned long mask) { - kvm_mmu_write_protect_pt_masked(kvm, slot, gfn_offset, mask); + phys_addr_t base_gfn = slot->base_gfn + gfn_offset; + phys_addr_t start = (base_gfn + __ffs(mask)) << PAGE_SHIFT; + phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT; + + lockdep_assert_held_write(&kvm->mmu_lock); + + stage2_wp_range(&kvm->arch.mmu, start, end); } static void kvm_send_hwpoison_signal(unsigned long address, short lsb) From patchwork Sat Feb 18 03:23:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145479 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01146C636D6 for ; Sat, 18 Feb 2023 03:23:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229602AbjBRDXq (ORCPT ); Fri, 17 Feb 2023 22:23:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34146 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229708AbjBRDXn (ORCPT ); Fri, 17 Feb 2023 22:23:43 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE51A6D789 for ; Fri, 17 Feb 2023 19:23:35 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-5365a29685dso22396987b3.12 for ; Fri, 17 Feb 2023 19:23:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=s/7R5ej6h5zuscnhDtO9neYKca8wJ2+Nh25+D9oE7OQ=; b=QWh2YpfYuo1Bfp/DWb7P1qu9AZTfi4z3+0VDnUgv9G8S/FKTe0KwQIo2XnOwqOUNCD Qqb3gg7L7eUd+HW1qxOdJkLGa6RO3/wMiSQX+crjekFgmKhpMEQRy1MAr+BTBEN6OE9A FILlJ+jDXycDLFwBrqWce2U3kZpZaqhkkNk/i5VeDLcM805jSPv2S2rG/juJMzCAgEHz eRzynL7TEmbH4KK21ecLpK9lZqwXv6fQCPFymTtDku1V3MfY61ABXV9i9FgltsxwyFrc WVLkuuWWiqbsdmOmVs/TSlZIY4Bp+SARwyaFujoZEbICpOdM+WpWG6Z9XMKjJEFBUaxS Czfw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=s/7R5ej6h5zuscnhDtO9neYKca8wJ2+Nh25+D9oE7OQ=; b=4tOuj0AUHd0SGxBuJqnpmTEhJfcweK2h4GbYTmcOcVnhhYOft8iF9mLs5dv4pVazcC crs3s1PYr3pbGlKXfFOtc9JulpbHDQ13Rxw9WN6tU9fyhCCMkAt5FCYBRmusLw9S/qB5 OpkyKPJzMVe8pyC1pV2uKCHuGQHTBfrpK5Z5jVCzy51nT48/B7hSbQiAbj3F0jLTMDBz Dnb3bR9lpxS/2AkafVBnYQ0/epn6n3k0lyqYBPKfUHsAnI0bfig5HVwO9xEs8WNoOk7q ir/1U7zJuZjFpcMJf2IVp1KZTpZLOhehcyNves4lZPD9+/16K7Hu5/RZrU9s2mOkJlAV FCfg== X-Gm-Message-State: AO0yUKWFiXLGrgdTr1W9fh1FhdMxYcRBzhNAk5reiKl7nupAZ6Wb+5w0 oKkfg1CnoecrbgDOjHToxOr7GxY/9EzUhg== X-Google-Smtp-Source: AK7set/aW9Y72FZsVzf91MozQGe+Q5u8KbHZsA4913mkF0aisI+PQiLGbO+5m+8JXKTzoIvJ3HdXMG8LAlFMCg== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a25:9e92:0:b0:900:c3fd:a093 with SMTP id p18-20020a259e92000000b00900c3fda093mr897185ybq.684.1676690614628; Fri, 17 Feb 2023 19:23:34 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:13 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 
References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-12-ricarkol@google.com> Subject: [PATCH v4 11/12] KVM: arm64: Split huge pages during KVM_CLEAR_DIRTY_LOG From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This is the arm64 counterpart of commit cb00a70bd4b7 ("KVM: x86/mmu: Split huge pages mapped by the TDP MMU during KVM_CLEAR_DIRTY_LOG"), which has the benefit of splitting the cost of splitting a memslot across multiple ioctls. Split huge pages on the range specified via KVM_CLEAR_DIRTY_LOG, and do not split when enabling dirty logging if KVM_DIRTY_LOG_INITIALLY_SET is set. Signed-off-by: Ricardo Koller --- arch/arm64/kvm/mmu.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 8e9d612dda00..5dae0e6a697f 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1089,8 +1089,8 @@ static void kvm_mmu_split_memory_region(struct kvm *kvm, int slot) * @mask: The mask of pages at offset 'gfn_offset' in this memory * slot to enable dirty logging on - * Writes protect selected pages to enable dirty logging for them. Caller must - * acquire kvm->mmu_lock. + * Splits selected pages to PAGE_SIZE and then write-protects them to enable + * dirty logging for them. Caller must acquire kvm->mmu_lock. */ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, struct kvm_memory_slot *slot, @@ -1103,6 +1103,13 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, lockdep_assert_held_write(&kvm->mmu_lock); stage2_wp_range(&kvm->arch.mmu, start, end); + + /* + * If initially-all-set mode is not set, then huge-pages were already + * split when enabling dirty logging: no need to do it again. + */ + if (kvm_dirty_log_manual_protect_and_init_set(kvm)) + kvm_mmu_split_huge_pages(kvm, start, end); } static void kvm_send_hwpoison_signal(unsigned long address, short lsb) @@ -1889,7 +1896,9 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, * this when deleting, moving, disabling dirty logging, or * creating the memslot (a nop). Doing it for deletes makes * sure we don't leak memory, and there's no need to keep the - * cache around for any of the other cases. + * cache around for any of the other cases. Keeping the cache + * is useful for successive KVM_CLEAR_DIRTY_LOG calls, which is + * not handled in this function.
*/ kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache); } From patchwork Sat Feb 18 03:23:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ricardo Koller X-Patchwork-Id: 13145480 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 216D0C636D6 for ; Sat, 18 Feb 2023 03:23:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229751AbjBRDXu (ORCPT ); Fri, 17 Feb 2023 22:23:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229708AbjBRDXt (ORCPT ); Fri, 17 Feb 2023 22:23:49 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0CE1F1422D for ; Fri, 17 Feb 2023 19:23:37 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id y33-20020a25ad21000000b00953ffdfbe1aso2522226ybi.23 for ; Fri, 17 Feb 2023 19:23:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=FEDRhyMFCECWJJMCXfuOIikv5npXJq0fQihBBev/TIA=; b=TsL/rEFHgzF8PB3fl/52k7sY+V2wtYkQo2G2NtQUkj8mG/mR9M+X0ARF+JByYpTdf4 rHdC2lPdoW/iHfrfnJNxzkRdZIdWxB6961+rR6gt1FTOrfRJpYyf98KtklSvEkZ50EIz ph9DAW1DudYqTZWpbcdhs6HC2O6GaqY8PVyYAaURCKqLEcTGYo5kpLhBeKKj/7Qsj23I gva/Kvmt2p08y3rhs63O7QRd2QRBHM+UxE+EuxQ467soHSxuN8WFfemk7TepF2XHWVcS EEuwmz+/YmtTMbmz4w6ddwctx3k/7DWxI/Jxa00WvgYBdkQhQMW7qcUorb7WbfY30iVz Q/rw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=FEDRhyMFCECWJJMCXfuOIikv5npXJq0fQihBBev/TIA=; b=HxEuru/sYgkAAn7/UMaDjwt0MYyD67/MBS1RKdnMqc/7t1KTSEZcfGeKPqVqD6nm+G XN27xi8krzmRX9W9TU4JZOVgMQd9MQwjZXaED7jDrXH3rSJ9h/fLIX5knKjeB+Fqsrpf hwJvl/FpVDLQDvkDs9cUpHTU0tWgUy610DbkloCDifHLGLtAVOd9FixCHqRiZQPfnvo7 mjoyMZ67hsNN8v1CW6aR7TSHwz1BZPM3RqtzFds5wPygGZhxDOcA8p6JuDGXjssCNA6C hOaLrBMFBmA8xcj5iLpy7HeWMESN5eaUiupi8dfQuTaWK/H6XMW36aUcPs74WgwY8v5D o4Qw== X-Gm-Message-State: AO0yUKUHtw/GsaoPmAN/GqBcqKrXXT/3ln5gYg9UOJHTMjOHJu5kadd/ iAl1RzKeCGhmxYbtzxsVKvf65MVKhFEndw== X-Google-Smtp-Source: AK7set+8ZqBbGv/5+KpNRrBzr4O7c0V0Q3qngND/S2IUittkfFuLPIrXR022lFKFcWFdml6nj4vqj2wsIvmbTw== X-Received: from ricarkol4.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:1248]) (user=ricarkol job=sendgmr) by 2002:a81:b660:0:b0:534:515:e472 with SMTP id h32-20020a81b660000000b005340515e472mr69437ywk.4.1676690616159; Fri, 17 Feb 2023 19:23:36 -0800 (PST) Date: Sat, 18 Feb 2023 03:23:14 +0000 In-Reply-To: <20230218032314.635829-1-ricarkol@google.com> Mime-Version: 1.0 References: <20230218032314.635829-1-ricarkol@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218032314.635829-13-ricarkol@google.com> Subject: [PATCH v4 12/12] KVM: arm64: Use local TLBI on permission relaxation From: Ricardo Koller To: pbonzini@redhat.com, maz@kernel.org, oupton@google.com, yuzenghui@huawei.com, dmatlack@google.com Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, qperret@google.com, catalin.marinas@arm.com, 
andrew.jones@linux.dev, seanjc@google.com, alexandru.elisei@arm.com, suzuki.poulose@arm.com, eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com, rananta@google.com, bgardon@google.com, ricarkol@gmail.com, Ricardo Koller Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Marc Zyngier Broadcasted TLB invalidations (TLBI) are usually less performant than their local variant. In particular, we observed some implementations that take milliseconds to complete parallel broadcasted TLBIs. It's safe to use local, non-shareable TLBIs when relaxing permissions on a PTE in the KVM case for a couple of reasons. First, according to the ARM Arm (DDI 0487H.a D5-4913), permission relaxation does not need break-before-make. Second, KVM does not set the VTTBR_EL2.CnP bit, so each PE has its own TLB entry for the same page. KVM could tolerate that when doing permission relaxation (i.e., not having changes broadcasted to all PEs). Signed-off-by: Marc Zyngier Signed-off-by: Ricardo Koller --- arch/arm64/include/asm/kvm_asm.h | 4 +++ arch/arm64/kvm/hyp/nvhe/hyp-main.c | 10 ++++++ arch/arm64/kvm/hyp/nvhe/tlb.c | 54 ++++++++++++++++++++++++++++++ arch/arm64/kvm/hyp/pgtable.c | 2 +- arch/arm64/kvm/hyp/vhe/tlb.c | 32 ++++++++++++++++++ 5 files changed, 101 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 43c3bc0f9544..bb17b2ead4c7 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -68,6 +68,7 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___kvm_vcpu_run, __KVM_HOST_SMCCC_FUNC___kvm_flush_vm_context, __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_ipa, + __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_ipa_nsh, __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid, __KVM_HOST_SMCCC_FUNC___kvm_flush_cpu_context, __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff, @@ -225,6 +226,9 @@ extern void __kvm_flush_vm_context(void); extern void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu); extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa, int level); +extern void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu, + phys_addr_t ipa, + int level); extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu); extern void __kvm_timer_set_cntvoff(u64 cntvoff); diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 728e01d4536b..c6bf1e49ca93 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -125,6 +125,15 @@ static void handle___kvm_tlb_flush_vmid_ipa(struct kvm_cpu_context *host_ctxt) __kvm_tlb_flush_vmid_ipa(kern_hyp_va(mmu), ipa, level); } +static void handle___kvm_tlb_flush_vmid_ipa_nsh(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(struct kvm_s2_mmu *, mmu, host_ctxt, 1); + DECLARE_REG(phys_addr_t, ipa, host_ctxt, 2); + DECLARE_REG(int, level, host_ctxt, 3); + + __kvm_tlb_flush_vmid_ipa_nsh(kern_hyp_va(mmu), ipa, level); +} + static void handle___kvm_tlb_flush_vmid(struct kvm_cpu_context *host_ctxt) { DECLARE_REG(struct kvm_s2_mmu *, mmu, host_ctxt, 1); @@ -315,6 +324,7 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__kvm_vcpu_run), HANDLE_FUNC(__kvm_flush_vm_context), HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa), + HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa_nsh), HANDLE_FUNC(__kvm_tlb_flush_vmid), HANDLE_FUNC(__kvm_flush_cpu_context), HANDLE_FUNC(__kvm_timer_set_cntvoff), diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c index d296d617f589..ef2b70587f93 ---
a/arch/arm64/kvm/hyp/nvhe/tlb.c +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c @@ -109,6 +109,60 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, __tlb_switch_to_host(&cxt); } +void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu, + phys_addr_t ipa, int level) +{ + struct tlb_inv_context cxt; + + dsb(nshst); + + /* Switch to requested VMID */ + __tlb_switch_to_guest(mmu, &cxt); + + /* + * We could do so much better if we had the VA as well. + * Instead, we invalidate Stage-2 for this IPA, and the + * whole of Stage-1. Weep... + */ + ipa >>= 12; + __tlbi_level(ipas2e1, ipa, level); + + /* + * We have to ensure completion of the invalidation at Stage-2, + * since a table walk on another CPU could refill a TLB with a + * complete (S1 + S2) walk based on the old Stage-2 mapping if + * the Stage-1 invalidation happened first. + */ + dsb(nsh); + __tlbi(vmalle1); + dsb(nsh); + isb(); + + /* + * If the host is running at EL1 and we have a VPIPT I-cache, + * then we must perform I-cache maintenance at EL2 in order for + * it to have an effect on the guest. Since the guest cannot hit + * I-cache lines allocated with a different VMID, we don't need + * to worry about junk out of guest reset (we nuke the I-cache on + * VMID rollover), but we do need to be careful when remapping + * executable pages for the same guest. This can happen when KSM + * takes a CoW fault on an executable page, copies the page into + * a page that was previously mapped in the guest and then needs + * to invalidate the guest view of the I-cache for that page + * from EL1. To solve this, we invalidate the entire I-cache when + * unmapping a page from a guest if we have a VPIPT I-cache but + * the host is running at EL1. As above, we could do better if + * we had the VA. + * + * The moral of this story is: if you have a VPIPT I-cache, then + * you should be running with VHE enabled. + */ + if (icache_is_vpipt()) + icache_inval_all_pou(); + + __tlb_switch_to_host(&cxt); +} + void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) { struct tlb_inv_context cxt; diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 9f1c8fdd9330..399e62a8c453 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -1148,7 +1148,7 @@ int kvm_pgtable_stage2_relax_perms(struct kvm_pgtable *pgt, u64 addr, ret = stage2_update_leaf_attrs(pgt, addr, 1, set, clr, NULL, &level, KVM_PGTABLE_WALK_SHARED); if (!ret) - kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, pgt->mmu, addr, level); + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa_nsh, pgt->mmu, addr, level); return ret; } diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c index 24cef9b87f9e..e69da550cdc5 100644 --- a/arch/arm64/kvm/hyp/vhe/tlb.c +++ b/arch/arm64/kvm/hyp/vhe/tlb.c @@ -111,6 +111,38 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, __tlb_switch_to_host(&cxt); } +void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu, + phys_addr_t ipa, int level) +{ + struct tlb_inv_context cxt; + + dsb(nshst); + + /* Switch to requested VMID */ + __tlb_switch_to_guest(mmu, &cxt); + + /* + * We could do so much better if we had the VA as well. + * Instead, we invalidate Stage-2 for this IPA, and the + * whole of Stage-1. Weep... + */ + ipa >>= 12; + __tlbi_level(ipas2e1, ipa, level); + + /* + * We have to ensure completion of the invalidation at Stage-2, + * since a table walk on another CPU could refill a TLB with a + * complete (S1 + S2) walk based on the old Stage-2 mapping if + * the Stage-1 invalidation happened first. 
+ */ + dsb(nsh); + __tlbi(vmalle1); + dsb(nsh); + isb(); + + __tlb_switch_to_host(&cxt); +} + void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu) { struct tlb_inv_context cxt;