From patchwork Mon Oct 17 11:51:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008666 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2300C4332F for ; Mon, 17 Oct 2022 11:52:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230102AbiJQLw3 (ORCPT ); Mon, 17 Oct 2022 07:52:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229816AbiJQLw2 (ORCPT ); Mon, 17 Oct 2022 07:52:28 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C1AE15812 for ; Mon, 17 Oct 2022 04:52:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8D33B610A2 for ; Mon, 17 Oct 2022 11:52:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EB5B4C4347C; Mon, 17 Oct 2022 11:52:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007546; bh=+fQLKdCRMyVXv9zLE1eBeiQI5NRaL8bTwzFAc5SEhgA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VVyzoKkgxXVLlKW+5D+YXWMw+eO+YwNFEq7bybQ7FVGUq2S/jHJs665RCaw9C+QgX r4+DbDzpZjjl2HhbMAybZMG65B+CUDVXyoSkNNgmfxWGS4lBHa5lccnxJAb0kkkhOa y8VHQHGYnX6xcJpxPCk2+andsIWNmOMa/FJalQ8y6vs31J8CuUNTU7hanuXSRXjV1z 085HAXdIo6MGb/GsV05ARQVSanCmitQAHSEiMGYZ80MC0bMboFdw+p++B2JLJbTItj ptaT5o52dhY/FrYbvVfYdrWYg/4UXiFydICg6B7IVNLS8nNFCMeGBewh6MNxyT7QBt wbB8swDwtyhyA== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 01/25] KVM: arm64: Move hyp refcount manipulation helpers to common header file Date: Mon, 17 Oct 2022 12:51:45 +0100 Message-Id: <20221017115209.2099-2-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret We will soon need to manipulate 'struct hyp_page' refcounts from outside page_alloc.c, so move the helpers to a common header file to allow them to be reused easily. Reviewed-by: Oliver Upton Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon Reviewed-by: Philippe Mathieu-Daudé --- arch/arm64/kvm/hyp/include/nvhe/memory.h | 22 ++++++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/page_alloc.c | 19 ------------------- 2 files changed, 22 insertions(+), 19 deletions(-) diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h index 592b7edb3edb..9422900e5c6a 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/memory.h +++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h @@ -38,6 +38,10 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr) #define hyp_page_to_virt(page) __hyp_va(hyp_page_to_phys(page)) #define hyp_page_to_pool(page) (((struct hyp_page *)page)->pool) +/* + * Refcounting for 'struct hyp_page'. + * hyp_pool::lock must be held if atomic access to the refcount is required. + */ static inline int hyp_page_count(void *addr) { struct hyp_page *p = hyp_virt_to_page(addr); @@ -45,4 +49,22 @@ static inline int hyp_page_count(void *addr) return p->refcount; } +static inline void hyp_page_ref_inc(struct hyp_page *p) +{ + BUG_ON(p->refcount == USHRT_MAX); + p->refcount++; +} + +static inline int hyp_page_ref_dec_and_test(struct hyp_page *p) +{ + BUG_ON(!p->refcount); + p->refcount--; + return (p->refcount == 0); +} + +static inline void hyp_set_page_refcounted(struct hyp_page *p) +{ + BUG_ON(p->refcount); + p->refcount = 1; +} #endif /* __KVM_HYP_MEMORY_H */ diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c index d40f0b30b534..1ded09fc9b10 100644 --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c @@ -144,25 +144,6 @@ static struct hyp_page *__hyp_extract_page(struct hyp_pool *pool, return p; } -static inline void hyp_page_ref_inc(struct hyp_page *p) -{ - BUG_ON(p->refcount == USHRT_MAX); - p->refcount++; -} - -static inline int hyp_page_ref_dec_and_test(struct hyp_page *p) -{ - BUG_ON(!p->refcount); - p->refcount--; - return (p->refcount == 0); -} - -static inline void hyp_set_page_refcounted(struct hyp_page *p) -{ - BUG_ON(p->refcount); - p->refcount = 1; -} - static void __hyp_put_page(struct hyp_pool *pool, struct hyp_page *p) { if (hyp_page_ref_dec_and_test(p)) From patchwork Mon Oct 17 11:51:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008667 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A97B9C433FE for ; Mon, 17 Oct 2022 11:52:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230096AbiJQLwc (ORCPT ); Mon, 17 Oct 2022 07:52:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59532 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230055AbiJQLwb (ORCPT ); Mon, 17 Oct 2022 07:52:31 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97EA622BF8 for ; Mon, 17 Oct 2022 04:52:30 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 18DDB610A0 for ; Mon, 17 Oct 2022 11:52:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 764C5C433C1; Mon, 17 Oct 2022 11:52:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007549; bh=yEAREzQrE4OzDm8tjXcid5kmoNA03bQzioyQnganDwc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GqExlozjMaYbuK8eyaZW+XPhKqnpqREY2VKOWn5Y7y8NkBhIHcu7mzLy+kUbuE/39 yWhwDISRKO4M3IQoySONehxdAQXplYhseMpq8GIgn6tPYfRdErLqa1PUnOqsFsmHyf XgS2IlPNYcaWQhJwlaGzaAdXGfEKloct3YPVDK3UuOrpeQvHqo+wJZfxvnGpGUeSFg yNVZVmbQIy26CpIl+BlVEyehAuVobLtnDhmKp70gSUAjQdTGpRRilTZHEUfiZoFbb6 1Raq5IddRJ6+sWWR8CrzIJyEk7oUwYuV79OCA7Pt2FXCQ++nB7cX78Af3h8GDR+Gd3 RCeRiWgxJBMuw== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 02/25] KVM: arm64: Allow attaching of non-coalescable pages to a hyp pool Date: Mon, 17 Oct 2022 12:51:46 +0100 Message-Id: <20221017115209.2099-3-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret All the contiguous pages used to initialize a 'struct hyp_pool' are considered coalescable, which means that the hyp page allocator will actively try to merge them with their buddies on the hyp_put_page() path. However, using hyp_put_page() on a page that is not part of the inital memory range given to a hyp_pool() is currently unsupported. In order to allow dynamically extending hyp pools at run-time, add a check to __hyp_attach_page() to allow inserting 'external' pages into the free-list of order 0. This will be necessary to allow lazy donation of pages from the host to the hypervisor when allocating guest stage-2 page-table pages at EL2. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/kvm/hyp/nvhe/page_alloc.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c index 1ded09fc9b10..0d15227aced8 100644 --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c @@ -93,11 +93,15 @@ static inline struct hyp_page *node_to_page(struct list_head *node) static void __hyp_attach_page(struct hyp_pool *pool, struct hyp_page *p) { + phys_addr_t phys = hyp_page_to_phys(p); unsigned short order = p->order; struct hyp_page *buddy; memset(hyp_page_to_virt(p), 0, PAGE_SIZE << p->order); + if (phys < pool->range_start || phys >= pool->range_end) + goto insert; + /* * Only the first struct hyp_page of a high-order page (otherwise known * as the 'head') should have p->order set. The non-head pages should @@ -116,6 +120,7 @@ static void __hyp_attach_page(struct hyp_pool *pool, p = min(p, buddy); } +insert: /* Mark the new head, and insert it */ p->order = order; page_add_to_list(p, &pool->free_area[order]); From patchwork Mon Oct 17 11:51:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008668 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75E8DC433FE for ; Mon, 17 Oct 2022 11:52:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230197AbiJQLwh (ORCPT ); Mon, 17 Oct 2022 07:52:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59628 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230117AbiJQLwg (ORCPT ); Mon, 17 Oct 2022 07:52:36 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF52622BFD for ; Mon, 17 Oct 2022 04:52:34 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 93838610A2 for ; Mon, 17 Oct 2022 11:52:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 02BB7C433D6; Mon, 17 Oct 2022 11:52:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007553; bh=+VciJv3yxgpw+I6hALj/oevzvF0dM7MgXiRJ3qzU4eg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CzbnIljzRSEgO0mAk+MPQlkDkCJjkU4Uw4Rugv+YY4OTD46DleQgom3dsH6MlLKj5 Yt5oSAmOBMa144W/ZazS8p6s4kUEth19IUCnARcpeux7afehToyGe74quA2fGcwKMi SbzrlkW8Tj66H6cAIlcS1GRpLViCrbWhaICvu7vQwgmiBrHdsQl3icvxot1rhyvubi fzB7Dx8jvGze6SGmQ2YB/bNgxNLBxon/P28hZq4nC1z45VnwtKLJg/4k3wpYwm1Ioo EFAQkSwAEqA8y21ZYCOygeMJshRL7FMYC3TEeFSRn5BpE+RWsHiyfnmXpT+LHEHYYI KpDiW43RXmcNQ== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 03/25] KVM: arm64: Back the hypervisor 'struct hyp_page' array for all memory Date: Mon, 17 Oct 2022 12:51:47 +0100 Message-Id: <20221017115209.2099-4-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret The EL2 'vmemmap' array in nVHE Protected mode is currently very sparse: only memory pages owned by the hypervisor itself have a matching 'struct hyp_page'. However, as the size of this struct has been reduced significantly since its introduction, it appears that we can now afford to back the vmemmap for all of memory. Having an easily accessible 'struct hyp_page' for every physical page in memory provides the hypervisor with a simple mechanism to store metadata (e.g. a refcount) that wouldn't otherwise fit in the very limited number of software bits available in the host stage-2 page-table entries. This will be used in subsequent patches when pinning host memory pages for use by the hypervisor at EL2. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_pkvm.h | 26 +++++++++++++++++++++++ arch/arm64/kvm/hyp/include/nvhe/mm.h | 14 +------------ arch/arm64/kvm/hyp/nvhe/mm.c | 31 ++++++++++++++++++++++++---- arch/arm64/kvm/hyp/nvhe/page_alloc.c | 4 +--- arch/arm64/kvm/hyp/nvhe/setup.c | 7 +++---- arch/arm64/kvm/pkvm.c | 18 ++-------------- 6 files changed, 60 insertions(+), 40 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h index 9f4ad2a8df59..8f7b8a2314bb 100644 --- a/arch/arm64/include/asm/kvm_pkvm.h +++ b/arch/arm64/include/asm/kvm_pkvm.h @@ -14,6 +14,32 @@ extern struct memblock_region kvm_nvhe_sym(hyp_memory)[]; extern unsigned int kvm_nvhe_sym(hyp_memblock_nr); +static inline unsigned long +hyp_vmemmap_memblock_size(struct memblock_region *reg, size_t vmemmap_entry_size) +{ + unsigned long nr_pages = reg->size >> PAGE_SHIFT; + unsigned long start, end; + + start = (reg->base >> PAGE_SHIFT) * vmemmap_entry_size; + end = start + nr_pages * vmemmap_entry_size; + start = ALIGN_DOWN(start, PAGE_SIZE); + end = ALIGN(end, PAGE_SIZE); + + return end - start; +} + +static inline unsigned long hyp_vmemmap_pages(size_t vmemmap_entry_size) +{ + unsigned long res = 0, i; + + for (i = 0; i < kvm_nvhe_sym(hyp_memblock_nr); i++) { + res += hyp_vmemmap_memblock_size(&kvm_nvhe_sym(hyp_memory)[i], + vmemmap_entry_size); + } + + return res >> PAGE_SHIFT; +} + static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages) { unsigned long total = 0, i; diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h index 42d8eb9bfe72..b2ee6d5df55b 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mm.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h @@ -15,7 +15,7 @@ extern hyp_spinlock_t pkvm_pgd_lock; int hyp_create_idmap(u32 hyp_va_bits); int hyp_map_vectors(void); -int hyp_back_vmemmap(phys_addr_t phys, unsigned long size, phys_addr_t back); +int hyp_back_vmemmap(phys_addr_t back); int pkvm_cpu_set_vector(enum arm64_hyp_spectre_vector slot); int pkvm_create_mappings(void *from, void *to, enum kvm_pgtable_prot prot); int pkvm_create_mappings_locked(void *from, void *to, enum kvm_pgtable_prot prot); @@ -24,16 +24,4 @@ int __pkvm_create_private_mapping(phys_addr_t phys, size_t size, unsigned long *haddr); int pkvm_alloc_private_va_range(size_t size, unsigned long *haddr); -static inline void hyp_vmemmap_range(phys_addr_t phys, unsigned long size, - unsigned long *start, unsigned long *end) -{ - unsigned long nr_pages = size >> PAGE_SHIFT; - struct hyp_page *p = hyp_phys_to_page(phys); - - *start = (unsigned long)p; - *end = *start + nr_pages * sizeof(struct hyp_page); - *start = ALIGN_DOWN(*start, PAGE_SIZE); - *end = ALIGN(*end, PAGE_SIZE); -} - #endif /* __KVM_HYP_MM_H */ diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c index 96193cb31a39..d3a3b47181de 100644 --- a/arch/arm64/kvm/hyp/nvhe/mm.c +++ b/arch/arm64/kvm/hyp/nvhe/mm.c @@ -129,13 +129,36 @@ int pkvm_create_mappings(void *from, void *to, enum kvm_pgtable_prot prot) return ret; } -int hyp_back_vmemmap(phys_addr_t phys, unsigned long size, phys_addr_t back) +int hyp_back_vmemmap(phys_addr_t back) { - unsigned long start, end; + unsigned long i, start, size, end = 0; + int ret; - hyp_vmemmap_range(phys, size, &start, &end); + for (i = 0; i < hyp_memblock_nr; i++) { + start = hyp_memory[i].base; + start = ALIGN_DOWN((u64)hyp_phys_to_page(start), PAGE_SIZE); + /* + * The begining of the hyp_vmemmap region for the current + * memblock may already be backed by the page backing the end + * the previous region, so avoid mapping it twice. + */ + start = max(start, end); + + end = hyp_memory[i].base + hyp_memory[i].size; + end = PAGE_ALIGN((u64)hyp_phys_to_page(end)); + if (start >= end) + continue; + + size = end - start; + ret = __pkvm_create_mappings(start, size, back, PAGE_HYP); + if (ret) + return ret; + + memset(hyp_phys_to_virt(back), 0, size); + back += size; + } - return __pkvm_create_mappings(start, end - start, back, PAGE_HYP); + return 0; } static void *__hyp_bp_vect_base; diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c index 0d15227aced8..7804da89e55d 100644 --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c @@ -235,10 +235,8 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages, /* Init the vmemmap portion */ p = hyp_phys_to_page(phys); - for (i = 0; i < nr_pages; i++) { - p[i].order = 0; + for (i = 0; i < nr_pages; i++) hyp_set_page_refcounted(&p[i]); - } /* Attach the unused pages to the buddy tree */ for (i = reserved_pages; i < nr_pages; i++) diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index e8d4ea2fcfa0..579eb4f73476 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -31,12 +31,11 @@ static struct hyp_pool hpool; static int divide_memory_pool(void *virt, unsigned long size) { - unsigned long vstart, vend, nr_pages; + unsigned long nr_pages; hyp_early_alloc_init(virt, size); - hyp_vmemmap_range(__hyp_pa(virt), size, &vstart, &vend); - nr_pages = (vend - vstart) >> PAGE_SHIFT; + nr_pages = hyp_vmemmap_pages(sizeof(struct hyp_page)); vmemmap_base = hyp_early_alloc_contig(nr_pages); if (!vmemmap_base) return -ENOMEM; @@ -78,7 +77,7 @@ static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size, if (ret) return ret; - ret = hyp_back_vmemmap(phys, size, hyp_virt_to_phys(vmemmap_base)); + ret = hyp_back_vmemmap(hyp_virt_to_phys(vmemmap_base)); if (ret) return ret; diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c index ebecb7c045f4..34229425b25d 100644 --- a/arch/arm64/kvm/pkvm.c +++ b/arch/arm64/kvm/pkvm.c @@ -53,7 +53,7 @@ static int __init register_memblock_regions(void) void __init kvm_hyp_reserve(void) { - u64 nr_pages, prev, hyp_mem_pages = 0; + u64 hyp_mem_pages = 0; int ret; if (!is_hyp_mode_available() || is_kernel_in_hyp_mode()) @@ -71,21 +71,7 @@ void __init kvm_hyp_reserve(void) hyp_mem_pages += hyp_s1_pgtable_pages(); hyp_mem_pages += host_s2_pgtable_pages(); - - /* - * The hyp_vmemmap needs to be backed by pages, but these pages - * themselves need to be present in the vmemmap, so compute the number - * of pages needed by looking for a fixed point. - */ - nr_pages = 0; - do { - prev = nr_pages; - nr_pages = hyp_mem_pages + prev; - nr_pages = DIV_ROUND_UP(nr_pages * STRUCT_HYP_PAGE_SIZE, - PAGE_SIZE); - nr_pages += __hyp_pgtable_max_pages(nr_pages); - } while (nr_pages != prev); - hyp_mem_pages += nr_pages; + hyp_mem_pages += hyp_vmemmap_pages(STRUCT_HYP_PAGE_SIZE); /* * Try to allocate a PMD-aligned region to reduce TLB pressure once From patchwork Mon Oct 17 11:51:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008669 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10405C43217 for ; Mon, 17 Oct 2022 11:52:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230170AbiJQLwk (ORCPT ); Mon, 17 Oct 2022 07:52:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59694 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230118AbiJQLwi (ORCPT ); Mon, 17 Oct 2022 07:52:38 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 878E221269 for ; Mon, 17 Oct 2022 04:52:37 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 228A36108F for ; Mon, 17 Oct 2022 11:52:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81290C43470; Mon, 17 Oct 2022 11:52:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007556; bh=XLQqaxaFoJ1w3bEHfz1Nz6Iwt6VUcQunxS4ndTgHiAM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OtwOuGSPZNKY1wGsGfTC3CdnUYjAJ9W9W9jQejtnn1OqdAFely9JX6+eODDmVgjo8 eDZK0WBd84y8+B+E+7yKsphgi8eoMsloZ5yJ/MxOrWBmaMIm+EE0tdUHtVju342P6z XftVm65y1W9qjkUUqI8E8iehET+o6mLXNtPQeU/rgY9htflfcYjq6giijFLOHb0JjU XsyTlMz8d3+dcUY+DieJvk75+ahwqL7PZzn8YpCGJ7+YWFCEcX/Xg/RDij7CRwDRDj Fvm5iidxbCWz7XJrKHMiaeRKwlaTwIJnPVE1djDNYRR7za2ex8pi/gpz4C2o0kXHyY GCtWO5FgIQu0w== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 04/25] KVM: arm64: Fix-up hyp stage-1 refcounts for all pages mapped at EL2 Date: Mon, 17 Oct 2022 12:51:48 +0100 Message-Id: <20221017115209.2099-5-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret In order to allow unmapping arbitrary memory pages from the hypervisor stage-1 page-table, fix-up the initial refcount for pages that have been mapped before the 'vmemmap' array was up and running so that it accurately accounts for all existing hypervisor mappings. This is achieved by traversing the entire hypervisor stage-1 page-table during initialisation of EL2 and updating the corresponding 'struct hyp_page' for each valid mapping. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/kvm/hyp/nvhe/setup.c | 62 +++++++++++++++++++++++---------- 1 file changed, 43 insertions(+), 19 deletions(-) diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 579eb4f73476..8f2726d7e201 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -185,12 +185,11 @@ static void hpool_put_page(void *addr) hyp_put_page(&hpool, addr); } -static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level, - kvm_pte_t *ptep, - enum kvm_pgtable_walk_flags flag, - void * const arg) +static int fix_host_ownership_walker(u64 addr, u64 end, u32 level, + kvm_pte_t *ptep, + enum kvm_pgtable_walk_flags flag, + void * const arg) { - struct kvm_pgtable_mm_ops *mm_ops = arg; enum kvm_pgtable_prot prot; enum pkvm_page_state state; kvm_pte_t pte = *ptep; @@ -199,15 +198,6 @@ static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level, if (!kvm_pte_valid(pte)) return 0; - /* - * Fix-up the refcount for the page-table pages as the early allocator - * was unable to access the hyp_vmemmap and so the buddy allocator has - * initialised the refcount to '1'. - */ - mm_ops->get_page(ptep); - if (flag != KVM_PGTABLE_WALK_LEAF) - return 0; - if (level != (KVM_PGTABLE_MAX_LEVELS - 1)) return -EINVAL; @@ -236,12 +226,30 @@ static int finalize_host_mappings_walker(u64 addr, u64 end, u32 level, return host_stage2_idmap_locked(phys, PAGE_SIZE, prot); } -static int finalize_host_mappings(void) +static int fix_hyp_pgtable_refcnt_walker(u64 addr, u64 end, u32 level, + kvm_pte_t *ptep, + enum kvm_pgtable_walk_flags flag, + void * const arg) +{ + struct kvm_pgtable_mm_ops *mm_ops = arg; + kvm_pte_t pte = *ptep; + + /* + * Fix-up the refcount for the page-table pages as the early allocator + * was unable to access the hyp_vmemmap and so the buddy allocator has + * initialised the refcount to '1'. + */ + if (kvm_pte_valid(pte)) + mm_ops->get_page(ptep); + + return 0; +} + +static int fix_host_ownership(void) { struct kvm_pgtable_walker walker = { - .cb = finalize_host_mappings_walker, - .flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST, - .arg = pkvm_pgtable.mm_ops, + .cb = fix_host_ownership_walker, + .flags = KVM_PGTABLE_WALK_LEAF, }; int i, ret; @@ -257,6 +265,18 @@ static int finalize_host_mappings(void) return 0; } +static int fix_hyp_pgtable_refcnt(void) +{ + struct kvm_pgtable_walker walker = { + .cb = fix_hyp_pgtable_refcnt_walker, + .flags = KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST, + .arg = pkvm_pgtable.mm_ops, + }; + + return kvm_pgtable_walk(&pkvm_pgtable, 0, BIT(pkvm_pgtable.ia_bits), + &walker); +} + void __noreturn __pkvm_init_finalise(void) { struct kvm_host_data *host_data = this_cpu_ptr(&kvm_host_data); @@ -286,7 +306,11 @@ void __noreturn __pkvm_init_finalise(void) }; pkvm_pgtable.mm_ops = &pkvm_pgtable_mm_ops; - ret = finalize_host_mappings(); + ret = fix_host_ownership(); + if (ret) + goto out; + + ret = fix_hyp_pgtable_refcnt(); if (ret) goto out; From patchwork Mon Oct 17 11:51:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008670 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29B9EC4332F for ; Mon, 17 Oct 2022 11:52:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230249AbiJQLws (ORCPT ); Mon, 17 Oct 2022 07:52:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59894 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230212AbiJQLwo (ORCPT ); Mon, 17 Oct 2022 07:52:44 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 133BF2A424 for ; Mon, 17 Oct 2022 04:52:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4AAF6B81630 for ; Mon, 17 Oct 2022 11:52:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0A840C433C1; Mon, 17 Oct 2022 11:52:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007560; bh=kHeroY+azM9EScaIHR1Ce0DRrtBPY5P4+YZiouhzfmQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MmAGrk53K9rvObuRQeomaf+bY1yqTxdOXubyWFF30dHhWV8Mnbu5hOz21+Ul/51zk xCEj7XYAxbNsJmIbRlRJuO1qr+pdkQiZ222ZZamg2b7ZoLq1cH04b24vBn0g+3jhZu olfMUKbMlQJWlZSOizjnOWg5l0VZPZ6L6AWXJWuozE0W4kuEdf6fodyW/uXaeVJ3Zz pD9tbkf+AP5UdovLWg1hEfPa46mLHRI1X51TW7bRhAu/o95b75/RWCB4a4RqEds4k7 tNfXrxhEdpYvhVL7Vqe7p2YPVF81W+UfN/PrLLqOup5aPf998D1a8EQo32iURjf24f xNg7z5Elm9/4w== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 05/25] KVM: arm64: Unify identifiers used to distinguish host and hypervisor Date: Mon, 17 Oct 2022 12:51:49 +0100 Message-Id: <20221017115209.2099-6-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The 'pkvm_component_id' enum type provides constants to refer to the host and the hypervisor, yet this information is duplicated by the 'pkvm_hyp_id' constant. Remove the definition of 'pkvm_hyp_id' and move the 'pkvm_component_id' type definition to 'mem_protect.h' so that it can be used outside of the memory protection code, for example when initialising the owner for hypervisor-owned pages. Tested-by: Vincent Donnefort Signed-off-by: Will Deacon Reviewed-by: Philippe Mathieu-Daudé --- arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 6 +++++- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 8 -------- arch/arm64/kvm/hyp/nvhe/setup.c | 2 +- 3 files changed, 6 insertions(+), 10 deletions(-) diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index 80e99836eac7..f5705a1e972f 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -51,7 +51,11 @@ struct host_kvm { }; extern struct host_kvm host_kvm; -extern const u8 pkvm_hyp_id; +/* This corresponds to page-table locking order */ +enum pkvm_component_id { + PKVM_ID_HOST, + PKVM_ID_HYP, +}; int __pkvm_prot_finalize(void); int __pkvm_host_share_hyp(u64 pfn); diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 1e78acf9662e..ff86f5bd230f 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -26,8 +26,6 @@ struct host_kvm host_kvm; static struct hyp_pool host_s2_pool; -const u8 pkvm_hyp_id = 1; - static void host_lock_component(void) { hyp_spin_lock(&host_kvm.lock); @@ -380,12 +378,6 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt) BUG_ON(ret && ret != -EAGAIN); } -/* This corresponds to locking order */ -enum pkvm_component_id { - PKVM_ID_HOST, - PKVM_ID_HYP, -}; - struct pkvm_mem_transition { u64 nr_pages; diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 8f2726d7e201..0312c9c74a5a 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -212,7 +212,7 @@ static int fix_host_ownership_walker(u64 addr, u64 end, u32 level, state = pkvm_getstate(kvm_pgtable_hyp_pte_prot(pte)); switch (state) { case PKVM_PAGE_OWNED: - return host_stage2_set_owner_locked(phys, PAGE_SIZE, pkvm_hyp_id); + return host_stage2_set_owner_locked(phys, PAGE_SIZE, PKVM_ID_HYP); case PKVM_PAGE_SHARED_OWNED: prot = pkvm_mkstate(PKVM_HOST_MEM_PROT, PKVM_PAGE_SHARED_BORROWED); break; From patchwork Mon Oct 17 11:51:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008671 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D71C7C4332F for ; Mon, 17 Oct 2022 11:53:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230140AbiJQLw7 (ORCPT ); Mon, 17 Oct 2022 07:52:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59902 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230272AbiJQLwx (ORCPT ); Mon, 17 Oct 2022 07:52:53 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 043C321269 for ; Mon, 17 Oct 2022 04:52:46 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0399EB812A1 for ; Mon, 17 Oct 2022 11:52:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8B4B1C433D7; Mon, 17 Oct 2022 11:52:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007563; bh=wHXJvKczYaZBPZiAQZnXAnjt2K3HIwXbQYMdzt9dcY4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KAbTgA5rwEvECI46G5+IEo8SWUJnY7wmAz2G+hrdJh/M4zZMOHrkzcfIRjYE+wcmC 2q4qoPpQjy932/cFwYl2d06efrqDhaEd+WCgM1hIfpuLx/YAbYcgIfyRRhbAkx7Os9 TEy+EtgdsZzI82XLB6fmNlkVPOjhl21yxSye/yEDD4a1krmm+VBKG99KZrvzoSAExp XhdBMjGHaBhDLl3z6PCctozD5OSkQ2GmXxYoWKOzDy+2CYTUQYQPr98R89W3XJonRs t2C/aYmy1BI2XQajEufm23BkWyMwVQlzFKDlm3THzw5Xuhg5p8MU+/i8KHlDE/pUjb 5g9+mJBIbEoww== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 06/25] KVM: arm64: Implement do_donate() helper for donating memory Date: Mon, 17 Oct 2022 12:51:50 +0100 Message-Id: <20221017115209.2099-7-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret Transferring ownership information of a memory region from one component to another can be achieved using a "donate" operation, which results in the previous owner losing access to the underlying pages entirely and the new owner having exclusive access to the page. Implement a do_donate() helper, along the same lines as do_{un,}share, and provide this functionality for the host-{to,from}-hyp cases as this will later be used to donate/reclaim memory pages to store VM metadata at EL2. In a similar manner to the sharing transitions, permission checks are performed by the hypervisor to ensure that the component initiating the transition really is the owner of the page and also that the completer does not currently have a page mapped at the target address. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 2 + arch/arm64/kvm/hyp/nvhe/mem_protect.c | 239 ++++++++++++++++++ 2 files changed, 241 insertions(+) diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index f5705a1e972f..c87b19b2d468 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -60,6 +60,8 @@ enum pkvm_component_id { int __pkvm_prot_finalize(void); int __pkvm_host_share_hyp(u64 pfn); int __pkvm_host_unshare_hyp(u64 pfn); +int __pkvm_host_donate_hyp(u64 pfn, u64 nr_pages); +int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages); bool addr_is_memory(phys_addr_t phys); int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot); diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index ff86f5bd230f..c30402737548 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -391,6 +391,9 @@ struct pkvm_mem_transition { /* Address in the completer's address space */ u64 completer_addr; } host; + struct { + u64 completer_addr; + } hyp; }; } initiator; @@ -404,6 +407,10 @@ struct pkvm_mem_share { const enum kvm_pgtable_prot completer_prot; }; +struct pkvm_mem_donation { + const struct pkvm_mem_transition tx; +}; + struct check_walk_data { enum pkvm_page_state desired; enum pkvm_page_state (*get_page_state)(kvm_pte_t pte); @@ -503,6 +510,46 @@ static int host_initiate_unshare(u64 *completer_addr, return __host_set_page_state_range(addr, size, PKVM_PAGE_OWNED); } +static int host_initiate_donation(u64 *completer_addr, + const struct pkvm_mem_transition *tx) +{ + u8 owner_id = tx->completer.id; + u64 size = tx->nr_pages * PAGE_SIZE; + + *completer_addr = tx->initiator.host.completer_addr; + return host_stage2_set_owner_locked(tx->initiator.addr, size, owner_id); +} + +static bool __host_ack_skip_pgtable_check(const struct pkvm_mem_transition *tx) +{ + return !(IS_ENABLED(CONFIG_NVHE_EL2_DEBUG) || + tx->initiator.id != PKVM_ID_HYP); +} + +static int __host_ack_transition(u64 addr, const struct pkvm_mem_transition *tx, + enum pkvm_page_state state) +{ + u64 size = tx->nr_pages * PAGE_SIZE; + + if (__host_ack_skip_pgtable_check(tx)) + return 0; + + return __host_check_page_state_range(addr, size, state); +} + +static int host_ack_donation(u64 addr, const struct pkvm_mem_transition *tx) +{ + return __host_ack_transition(addr, tx, PKVM_NOPAGE); +} + +static int host_complete_donation(u64 addr, const struct pkvm_mem_transition *tx) +{ + u64 size = tx->nr_pages * PAGE_SIZE; + u8 host_id = tx->completer.id; + + return host_stage2_set_owner_locked(addr, size, host_id); +} + static enum pkvm_page_state hyp_get_page_state(kvm_pte_t pte) { if (!kvm_pte_valid(pte)) @@ -523,6 +570,27 @@ static int __hyp_check_page_state_range(u64 addr, u64 size, return check_page_state_range(&pkvm_pgtable, addr, size, &d); } +static int hyp_request_donation(u64 *completer_addr, + const struct pkvm_mem_transition *tx) +{ + u64 size = tx->nr_pages * PAGE_SIZE; + u64 addr = tx->initiator.addr; + + *completer_addr = tx->initiator.hyp.completer_addr; + return __hyp_check_page_state_range(addr, size, PKVM_PAGE_OWNED); +} + +static int hyp_initiate_donation(u64 *completer_addr, + const struct pkvm_mem_transition *tx) +{ + u64 size = tx->nr_pages * PAGE_SIZE; + int ret; + + *completer_addr = tx->initiator.hyp.completer_addr; + ret = kvm_pgtable_hyp_unmap(&pkvm_pgtable, tx->initiator.addr, size); + return (ret != size) ? -EFAULT : 0; +} + static bool __hyp_ack_skip_pgtable_check(const struct pkvm_mem_transition *tx) { return !(IS_ENABLED(CONFIG_NVHE_EL2_DEBUG) || @@ -554,6 +622,16 @@ static int hyp_ack_unshare(u64 addr, const struct pkvm_mem_transition *tx) PKVM_PAGE_SHARED_BORROWED); } +static int hyp_ack_donation(u64 addr, const struct pkvm_mem_transition *tx) +{ + u64 size = tx->nr_pages * PAGE_SIZE; + + if (__hyp_ack_skip_pgtable_check(tx)) + return 0; + + return __hyp_check_page_state_range(addr, size, PKVM_NOPAGE); +} + static int hyp_complete_share(u64 addr, const struct pkvm_mem_transition *tx, enum kvm_pgtable_prot perms) { @@ -572,6 +650,15 @@ static int hyp_complete_unshare(u64 addr, const struct pkvm_mem_transition *tx) return (ret != size) ? -EFAULT : 0; } +static int hyp_complete_donation(u64 addr, + const struct pkvm_mem_transition *tx) +{ + void *start = (void *)addr, *end = start + (tx->nr_pages * PAGE_SIZE); + enum kvm_pgtable_prot prot = pkvm_mkstate(PAGE_HYP, PKVM_PAGE_OWNED); + + return pkvm_create_mappings_locked(start, end, prot); +} + static int check_share(struct pkvm_mem_share *share) { const struct pkvm_mem_transition *tx = &share->tx; @@ -724,6 +811,94 @@ static int do_unshare(struct pkvm_mem_share *share) return WARN_ON(__do_unshare(share)); } +static int check_donation(struct pkvm_mem_donation *donation) +{ + const struct pkvm_mem_transition *tx = &donation->tx; + u64 completer_addr; + int ret; + + switch (tx->initiator.id) { + case PKVM_ID_HOST: + ret = host_request_owned_transition(&completer_addr, tx); + break; + case PKVM_ID_HYP: + ret = hyp_request_donation(&completer_addr, tx); + break; + default: + ret = -EINVAL; + } + + if (ret) + return ret; + + switch (tx->completer.id){ + case PKVM_ID_HOST: + ret = host_ack_donation(completer_addr, tx); + break; + case PKVM_ID_HYP: + ret = hyp_ack_donation(completer_addr, tx); + break; + default: + ret = -EINVAL; + } + + return ret; +} + +static int __do_donate(struct pkvm_mem_donation *donation) +{ + const struct pkvm_mem_transition *tx = &donation->tx; + u64 completer_addr; + int ret; + + switch (tx->initiator.id) { + case PKVM_ID_HOST: + ret = host_initiate_donation(&completer_addr, tx); + break; + case PKVM_ID_HYP: + ret = hyp_initiate_donation(&completer_addr, tx); + break; + default: + ret = -EINVAL; + } + + if (ret) + return ret; + + switch (tx->completer.id){ + case PKVM_ID_HOST: + ret = host_complete_donation(completer_addr, tx); + break; + case PKVM_ID_HYP: + ret = hyp_complete_donation(completer_addr, tx); + break; + default: + ret = -EINVAL; + } + + return ret; +} + +/* + * do_donate(): + * + * The page owner transfers ownership to another component, losing access + * as a consequence. + * + * Initiator: OWNED => NOPAGE + * Completer: NOPAGE => OWNED + */ +static int do_donate(struct pkvm_mem_donation *donation) +{ + int ret; + + ret = check_donation(donation); + if (ret) + return ret; + + return WARN_ON(__do_donate(donation)); +} + int __pkvm_host_share_hyp(u64 pfn) { int ret; @@ -789,3 +964,67 @@ int __pkvm_host_unshare_hyp(u64 pfn) return ret; } + +int __pkvm_host_donate_hyp(u64 pfn, u64 nr_pages) +{ + int ret; + u64 host_addr = hyp_pfn_to_phys(pfn); + u64 hyp_addr = (u64)__hyp_va(host_addr); + struct pkvm_mem_donation donation = { + .tx = { + .nr_pages = nr_pages, + .initiator = { + .id = PKVM_ID_HOST, + .addr = host_addr, + .host = { + .completer_addr = hyp_addr, + }, + }, + .completer = { + .id = PKVM_ID_HYP, + }, + }, + }; + + host_lock_component(); + hyp_lock_component(); + + ret = do_donate(&donation); + + hyp_unlock_component(); + host_unlock_component(); + + return ret; +} + +int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages) +{ + int ret; + u64 host_addr = hyp_pfn_to_phys(pfn); + u64 hyp_addr = (u64)__hyp_va(host_addr); + struct pkvm_mem_donation donation = { + .tx = { + .nr_pages = nr_pages, + .initiator = { + .id = PKVM_ID_HYP, + .addr = hyp_addr, + .hyp = { + .completer_addr = host_addr, + }, + }, + .completer = { + .id = PKVM_ID_HOST, + }, + }, + }; + + host_lock_component(); + hyp_lock_component(); + + ret = do_donate(&donation); + + hyp_unlock_component(); + host_unlock_component(); + + return ret; +} From patchwork Mon Oct 17 11:51:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008672 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2F85C433FE for ; Mon, 17 Oct 2022 11:53:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230255AbiJQLxC (ORCPT ); Mon, 17 Oct 2022 07:53:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60076 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230216AbiJQLwz (ORCPT ); Mon, 17 Oct 2022 07:52:55 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C9B7580A6 for ; Mon, 17 Oct 2022 04:52:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 759A6B812A8 for ; Mon, 17 Oct 2022 11:52:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3E06AC433C1; Mon, 17 Oct 2022 11:52:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007567; bh=jCaLq7to3/AuF6vKjq7Qx6bD9iG0ZWRGazDqhkaJ4I4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uaZCh2dF42xC996Jl1/+QdGYrF4UWLVy30p1+GZ9W5ybbpzJp31QBnOFNEJqvaDKr DHjIOyBDT1yZ8y40qqMoFNYrWpogNpD+0za6G92OJKRFlqqxI/LBMJWrgoUK8bkYOB zsGhXRec9/MqFtmBfH1xPLoDvDML1Sb6dGGhKvV4AxOhnsMJtnh4vf1MOH5wmLouTL ZZSeI9FX43vOdSrk9jKxUAGJsTLw4VDlmB50+59t5Q47t96BPdxM+Ayg07siNuo/y4 m5O5cwfM3UVT27rJbhTScpJHNo2ZKEg4RK1rjyrxAU07radKibv+2yQPRL7o252IjB uM5jBnOvDDWMQ== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 07/25] KVM: arm64: Prevent the donation of no-map pages Date: Mon, 17 Oct 2022 12:51:51 +0100 Message-Id: <20221017115209.2099-8-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret Memory regions marked as "no-map" in the host device-tree routinely include TrustZone carevouts and DMA pools. Although donating such pages to the hypervisor may not breach confidentiality, it could be used to corrupt its state in uncontrollable ways. To prevent this, let's block host-initiated memory transitions targeting "no-map" pages altogether in nVHE protected mode as there should be no valid reason to do this in current operation. Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list of memblock regions, so we can easily check for the presence of the MEMBLOCK_NOMAP flag on a region containing pages being donated from the host. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon Reviewed-by: Philippe Mathieu-Daudé --- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index c30402737548..a7156fd13bc8 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -193,7 +193,7 @@ struct kvm_mem_range { u64 end; }; -static bool find_mem_range(phys_addr_t addr, struct kvm_mem_range *range) +static struct memblock_region *find_mem_range(phys_addr_t addr, struct kvm_mem_range *range) { int cur, left = 0, right = hyp_memblock_nr; struct memblock_region *reg; @@ -216,18 +216,28 @@ static bool find_mem_range(phys_addr_t addr, struct kvm_mem_range *range) } else { range->start = reg->base; range->end = end; - return true; + return reg; } } - return false; + return NULL; } bool addr_is_memory(phys_addr_t phys) { struct kvm_mem_range range; - return find_mem_range(phys, &range); + return !!find_mem_range(phys, &range); +} + +static bool addr_is_allowed_memory(phys_addr_t phys) +{ + struct memblock_region *reg; + struct kvm_mem_range range; + + reg = find_mem_range(phys, &range); + + return reg && !(reg->flags & MEMBLOCK_NOMAP); } static bool is_in_mem_range(u64 addr, struct kvm_mem_range *range) @@ -346,7 +356,7 @@ static bool host_stage2_force_pte_cb(u64 addr, u64 end, enum kvm_pgtable_prot pr static int host_stage2_idmap(u64 addr) { struct kvm_mem_range range; - bool is_memory = find_mem_range(addr, &range); + bool is_memory = !!find_mem_range(addr, &range); enum kvm_pgtable_prot prot; int ret; @@ -424,7 +434,7 @@ static int __check_page_state_visitor(u64 addr, u64 end, u32 level, struct check_walk_data *d = arg; kvm_pte_t pte = *ptep; - if (kvm_pte_valid(pte) && !addr_is_memory(kvm_pte_to_phys(pte))) + if (kvm_pte_valid(pte) && !addr_is_allowed_memory(kvm_pte_to_phys(pte))) return -EINVAL; return d->get_page_state(pte) == d->desired ? 0 : -EPERM; From patchwork Mon Oct 17 11:51:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008673 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0A83C4332F for ; Mon, 17 Oct 2022 11:53:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230233AbiJQLxE (ORCPT ); Mon, 17 Oct 2022 07:53:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59894 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230206AbiJQLw6 (ORCPT ); Mon, 17 Oct 2022 07:52:58 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0924A15812 for ; Mon, 17 Oct 2022 04:52:54 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 070A0B81632 for ; Mon, 17 Oct 2022 11:52:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BDB68C433D6; Mon, 17 Oct 2022 11:52:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007570; bh=VxPja3d2GHqoIhuBEgJwpkqFA2a5hFZGKu7/PjMAaYM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qp0WdwQQRsqAmA01Sjb3WrTNM85uukSDjCLBxj3CiXfSLquqlJzNIgmEvNP/kEs5O lu86ygN/TIC+Oylx/CpYSJi7nngToQwSWIoaTTXDwushaQZ6538l0+SfTrgG+ow2VX 3Mq6xllimzJmB8Piej2z/7pseRUz/OscZgCVSOsyqKjNgUvTtc8Cj1j0U/Y1v/KZLX lahjTGsLgwhmsgk2hanytH0Q/guAmeSfHda9Xb6YhYZAykzMQUh2hNoHDFwDEyMhLE Za6xVN3s/NBs4ZT1db+iHdpDDBJdwvP9MSDwIn9PCwwI7hJzTK+Lo0VXMsA0e5k5MY kIezNor4QgEIw== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 08/25] KVM: arm64: Add helpers to pin memory shared with the hypervisor at EL2 Date: Mon, 17 Oct 2022 12:51:52 +0100 Message-Id: <20221017115209.2099-9-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret Add helpers allowing the hypervisor to check whether a range of pages are currently shared by the host, and 'pin' them if so by blocking host unshare operations until the memory has been unpinned. This will allow the hypervisor to take references on host-provided data-structures (e.g. 'struct kvm') with the guarantee that these pages will remain in a stable state until the hypervisor decides to release them, for example during guest teardown. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 3 ++ arch/arm64/kvm/hyp/include/nvhe/memory.h | 7 ++- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 48 +++++++++++++++++++ 3 files changed, 57 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index c87b19b2d468..998bf165af71 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -69,6 +69,9 @@ int host_stage2_set_owner_locked(phys_addr_t addr, u64 size, u8 owner_id); int kvm_host_prepare_stage2(void *pgt_pool_base); void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt); +int hyp_pin_shared_mem(void *from, void *to); +void hyp_unpin_shared_mem(void *from, void *to); + static __always_inline void __load_host_stage2(void) { if (static_branch_likely(&kvm_protected_mode_initialized)) diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h index 9422900e5c6a..ab205c4d6774 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/memory.h +++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h @@ -55,10 +55,15 @@ static inline void hyp_page_ref_inc(struct hyp_page *p) p->refcount++; } -static inline int hyp_page_ref_dec_and_test(struct hyp_page *p) +static inline void hyp_page_ref_dec(struct hyp_page *p) { BUG_ON(!p->refcount); p->refcount--; +} + +static inline int hyp_page_ref_dec_and_test(struct hyp_page *p) +{ + hyp_page_ref_dec(p); return (p->refcount == 0); } diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index a7156fd13bc8..1262dbae7f06 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -625,6 +625,9 @@ static int hyp_ack_unshare(u64 addr, const struct pkvm_mem_transition *tx) { u64 size = tx->nr_pages * PAGE_SIZE; + if (tx->initiator.id == PKVM_ID_HOST && hyp_page_count((void *)addr)) + return -EBUSY; + if (__hyp_ack_skip_pgtable_check(tx)) return 0; @@ -1038,3 +1041,48 @@ int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages) return ret; } + +int hyp_pin_shared_mem(void *from, void *to) +{ + u64 cur, start = ALIGN_DOWN((u64)from, PAGE_SIZE); + u64 end = PAGE_ALIGN((u64)to); + u64 size = end - start; + int ret; + + host_lock_component(); + hyp_lock_component(); + + ret = __host_check_page_state_range(__hyp_pa(start), size, + PKVM_PAGE_SHARED_OWNED); + if (ret) + goto unlock; + + ret = __hyp_check_page_state_range(start, size, + PKVM_PAGE_SHARED_BORROWED); + if (ret) + goto unlock; + + for (cur = start; cur < end; cur += PAGE_SIZE) + hyp_page_ref_inc(hyp_virt_to_page(cur)); + +unlock: + hyp_unlock_component(); + host_unlock_component(); + + return ret; +} + +void hyp_unpin_shared_mem(void *from, void *to) +{ + u64 cur, start = ALIGN_DOWN((u64)from, PAGE_SIZE); + u64 end = PAGE_ALIGN((u64)to); + + host_lock_component(); + hyp_lock_component(); + + for (cur = start; cur < end; cur += PAGE_SIZE) + hyp_page_ref_dec(hyp_virt_to_page(cur)); + + hyp_unlock_component(); + host_unlock_component(); +} From patchwork Mon Oct 17 11:51:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008674 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DBF9C433FE for ; Mon, 17 Oct 2022 11:53:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230282AbiJQLxF (ORCPT ); Mon, 17 Oct 2022 07:53:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230263AbiJQLw6 (ORCPT ); Mon, 17 Oct 2022 07:52:58 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4CC881570E for ; Mon, 17 Oct 2022 04:52:55 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D823D6108F for ; Mon, 17 Oct 2022 11:52:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 46B2AC4347C; Mon, 17 Oct 2022 11:52:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007574; bh=8f0p130EJ4EPgnQoABkRw1xEzbZ5piYUHc7zA8Cjng4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=njjAERXRRVn2n0XAb9kvgVw45DF4V7i3juqdjfan5ZZTzWOGrsuSQrQQfFcUbNUVn DXhc1SwRb7l9UofUcZ2gZ6RE7uM+wFzM8cZJ6xronCMSBJw2tQ8o+Fdv3SYrkZ1YU/ sSlKW76kS52sNpC2gW1t5sA2DnOmEycOrZ2jxo1y/z8uKWlBdWSQKJRyBXaU8PCo/F Okz1zRIQYhR38oAABLD/3dHKl6/NnXCoTxIBNm7wRrUHn5Lf7Lke7YeIQjfe28R8fC Tx46VWk+db9taIiwNmrqAayGRXsXI1uX4YpZNTN57QHu+SCD0v1XO+/4nSA7LGba4c JFFKfvgfg3i/Q== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 09/25] KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h Date: Mon, 17 Oct 2022 12:51:53 +0100 Message-Id: <20221017115209.2099-10-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org nvhe/mem_protect.h refers to __load_stage2() in the definition of __load_host_stage2() but doesn't include the relevant header. Include asm/kvm_mmu.h in nvhe/mem_protect.h so that users of the latter don't have to do this themselves. Tested-by: Vincent Donnefort Signed-off-by: Will Deacon Reviewed-by: Philippe Mathieu-Daudé --- arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index 998bf165af71..3bea816296dc 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -8,6 +8,7 @@ #define __KVM_NVHE_MEM_PROTECT__ #include #include +#include #include #include #include From patchwork Mon Oct 17 11:51:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008675 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0827C4332F for ; Mon, 17 Oct 2022 11:53:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230267AbiJQLxI (ORCPT ); Mon, 17 Oct 2022 07:53:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230268AbiJQLxE (ORCPT ); Mon, 17 Oct 2022 07:53:04 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 79FEA4F1A2 for ; Mon, 17 Oct 2022 04:53:00 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0EF0CB81630 for ; Mon, 17 Oct 2022 11:52:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C5939C43470; Mon, 17 Oct 2022 11:52:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007577; bh=8SOmXtI+0us+qdlvL7fSrftel8tnl6QXbcToLbVYnW0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HtV2hgxAyir7vaWS0ZGWfCioTh33iTvaaq8VeijwzGvKlDDob+C0fawZnINEXEqCy ZVjJvPooLPNGdVjoJFPzPE+Xiiy0cdZLIZpaIeok6f9Ry5vz9KeFQfMHD2cBUK+mC5 drvWNCLMamDdpur32YUj3OB65+2g1OKjZsccjmNphgKU1XiLST4TcNynwDG59vrIOz XJilUmskXbNkliWYtgGKRVknD20/Iv0Q9J+In6xmdCN48oAt4x3csXiUs7Qh+mrOz/ ugRVEy65lV+2X4TELFdE68GwYVMdfgRoBZjeLl3rQsxttQ9OkCADH6zgdlxXAyqpIV G4RoyOuKXIagw== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 10/25] KVM: arm64: Add hyp_spinlock_t static initializer Date: Mon, 17 Oct 2022 12:51:54 +0100 Message-Id: <20221017115209.2099-11-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Fuad Tabba Introduce a static initializer macro for 'hyp_spinlock_t' so that it is straightforward to instantiate global locks at EL2. This will be later utilised for locking the VM table in the hypervisor. Tested-by: Vincent Donnefort Signed-off-by: Fuad Tabba Signed-off-by: Will Deacon Reviewed-by: Philippe Mathieu-Daudé --- arch/arm64/kvm/hyp/include/nvhe/spinlock.h | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kvm/hyp/include/nvhe/spinlock.h b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h index 4652fd04bdbe..7c7ea8c55405 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/spinlock.h +++ b/arch/arm64/kvm/hyp/include/nvhe/spinlock.h @@ -28,9 +28,17 @@ typedef union hyp_spinlock { }; } hyp_spinlock_t; +#define __HYP_SPIN_LOCK_INITIALIZER \ + { .__val = 0 } + +#define __HYP_SPIN_LOCK_UNLOCKED \ + ((hyp_spinlock_t) __HYP_SPIN_LOCK_INITIALIZER) + +#define DEFINE_HYP_SPINLOCK(x) hyp_spinlock_t x = __HYP_SPIN_LOCK_UNLOCKED + #define hyp_spin_lock_init(l) \ do { \ - *(l) = (hyp_spinlock_t){ .__val = 0 }; \ + *(l) = __HYP_SPIN_LOCK_UNLOCKED; \ } while (0) static inline void hyp_spin_lock(hyp_spinlock_t *lock) From patchwork Mon Oct 17 11:51:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008676 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9E83C433FE for ; Mon, 17 Oct 2022 11:53:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230243AbiJQLxJ (ORCPT ); Mon, 17 Oct 2022 07:53:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59970 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230253AbiJQLxF (ORCPT ); Mon, 17 Oct 2022 07:53:05 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1CD1221269 for ; Mon, 17 Oct 2022 04:53:04 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 9E547B812A8 for ; Mon, 17 Oct 2022 11:53:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4FDF2C433B5; Mon, 17 Oct 2022 11:52:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007581; bh=IiAv853HFgz7MJ8suf0He9JlzRXJfENcNmObSxYEdtY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bGP2zCRmgFVDxaApy4TDSR5mA8BeStDqd7oi9BpZc61KxdfHfQ86NqY5cJOcDf51K dpHN+sr0pcYWvq6M9CK3jtegRhHlY+G9sv2RGUQUR+cnNWtCYrNxG248XGkFUIoXj5 8fKCtd8ye+L9Ur+J3/MVsmGyf4LCOeksDZ5gBjmInnqs+pgOcubb0eS3ZQ31TMiKXG m6Ip6tPXCRBmPGKJyFls0pxRuJblW2VPkDNBGAwBZNvo6nhvr9V84XwTg63EG0feJ1 etwychIYN3FOqgxZHVjp63TlfgqADbQLMMv9LEwdzFv0WDbGJPHCBnlH8WYq3jZ6oH gpBN2iJ3yXEWQ== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 11/25] KVM: arm64: Rename 'host_kvm' to 'host_mmu' Date: Mon, 17 Oct 2022 12:51:55 +0100 Message-Id: <20221017115209.2099-12-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In preparation for introducing VM and vCPU state at EL2, rename the existing 'struct host_kvm' and its singleton 'host_kvm' instance to 'host_mmu' so as to avoid confusion between the structure tracking the host stage-2 MMU state and the host instance of a 'struct kvm' for a protected guest. Tested-by: Vincent Donnefort Signed-off-by: Will Deacon Reviewed-by: Philippe Mathieu-Daudé --- arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 6 +-- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 46 +++++++++---------- 2 files changed, 26 insertions(+), 26 deletions(-) diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index 3bea816296dc..0a6d3e7f2a43 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -44,13 +44,13 @@ static inline enum pkvm_page_state pkvm_getstate(enum kvm_pgtable_prot prot) return prot & PKVM_PAGE_STATE_PROT_MASK; } -struct host_kvm { +struct host_mmu { struct kvm_arch arch; struct kvm_pgtable pgt; struct kvm_pgtable_mm_ops mm_ops; hyp_spinlock_t lock; }; -extern struct host_kvm host_kvm; +extern struct host_mmu host_mmu; /* This corresponds to page-table locking order */ enum pkvm_component_id { @@ -76,7 +76,7 @@ void hyp_unpin_shared_mem(void *from, void *to); static __always_inline void __load_host_stage2(void) { if (static_branch_likely(&kvm_protected_mode_initialized)) - __load_stage2(&host_kvm.arch.mmu, &host_kvm.arch); + __load_stage2(&host_mmu.arch.mmu, &host_mmu.arch); else write_sysreg(0, vttbr_el2); } diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 1262dbae7f06..bec8306c2392 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -22,18 +22,18 @@ #define KVM_HOST_S2_FLAGS (KVM_PGTABLE_S2_NOFWB | KVM_PGTABLE_S2_IDMAP) extern unsigned long hyp_nr_cpus; -struct host_kvm host_kvm; +struct host_mmu host_mmu; static struct hyp_pool host_s2_pool; static void host_lock_component(void) { - hyp_spin_lock(&host_kvm.lock); + hyp_spin_lock(&host_mmu.lock); } static void host_unlock_component(void) { - hyp_spin_unlock(&host_kvm.lock); + hyp_spin_unlock(&host_mmu.lock); } static void hyp_lock_component(void) @@ -88,7 +88,7 @@ static int prepare_s2_pool(void *pgt_pool_base) if (ret) return ret; - host_kvm.mm_ops = (struct kvm_pgtable_mm_ops) { + host_mmu.mm_ops = (struct kvm_pgtable_mm_ops) { .zalloc_pages_exact = host_s2_zalloc_pages_exact, .zalloc_page = host_s2_zalloc_page, .phys_to_virt = hyp_phys_to_virt, @@ -109,7 +109,7 @@ static void prepare_host_vtcr(void) parange = kvm_get_parange(id_aa64mmfr0_el1_sys_val); phys_shift = id_aa64mmfr0_parange_to_phys_shift(parange); - host_kvm.arch.vtcr = kvm_get_vtcr(id_aa64mmfr0_el1_sys_val, + host_mmu.arch.vtcr = kvm_get_vtcr(id_aa64mmfr0_el1_sys_val, id_aa64mmfr1_el1_sys_val, phys_shift); } @@ -117,25 +117,25 @@ static bool host_stage2_force_pte_cb(u64 addr, u64 end, enum kvm_pgtable_prot pr int kvm_host_prepare_stage2(void *pgt_pool_base) { - struct kvm_s2_mmu *mmu = &host_kvm.arch.mmu; + struct kvm_s2_mmu *mmu = &host_mmu.arch.mmu; int ret; prepare_host_vtcr(); - hyp_spin_lock_init(&host_kvm.lock); - mmu->arch = &host_kvm.arch; + hyp_spin_lock_init(&host_mmu.lock); + mmu->arch = &host_mmu.arch; ret = prepare_s2_pool(pgt_pool_base); if (ret) return ret; - ret = __kvm_pgtable_stage2_init(&host_kvm.pgt, mmu, - &host_kvm.mm_ops, KVM_HOST_S2_FLAGS, + ret = __kvm_pgtable_stage2_init(&host_mmu.pgt, mmu, + &host_mmu.mm_ops, KVM_HOST_S2_FLAGS, host_stage2_force_pte_cb); if (ret) return ret; - mmu->pgd_phys = __hyp_pa(host_kvm.pgt.pgd); - mmu->pgt = &host_kvm.pgt; + mmu->pgd_phys = __hyp_pa(host_mmu.pgt.pgd); + mmu->pgt = &host_mmu.pgt; atomic64_set(&mmu->vmid.id, 0); return 0; @@ -143,19 +143,19 @@ int kvm_host_prepare_stage2(void *pgt_pool_base) int __pkvm_prot_finalize(void) { - struct kvm_s2_mmu *mmu = &host_kvm.arch.mmu; + struct kvm_s2_mmu *mmu = &host_mmu.arch.mmu; struct kvm_nvhe_init_params *params = this_cpu_ptr(&kvm_init_params); if (params->hcr_el2 & HCR_VM) return -EPERM; params->vttbr = kvm_get_vttbr(mmu); - params->vtcr = host_kvm.arch.vtcr; + params->vtcr = host_mmu.arch.vtcr; params->hcr_el2 |= HCR_VM; kvm_flush_dcache_to_poc(params, sizeof(*params)); write_sysreg(params->hcr_el2, hcr_el2); - __load_stage2(&host_kvm.arch.mmu, &host_kvm.arch); + __load_stage2(&host_mmu.arch.mmu, &host_mmu.arch); /* * Make sure to have an ISB before the TLB maintenance below but only @@ -173,7 +173,7 @@ int __pkvm_prot_finalize(void) static int host_stage2_unmap_dev_all(void) { - struct kvm_pgtable *pgt = &host_kvm.pgt; + struct kvm_pgtable *pgt = &host_mmu.pgt; struct memblock_region *reg; u64 addr = 0; int i, ret; @@ -258,7 +258,7 @@ static bool range_is_memory(u64 start, u64 end) static inline int __host_stage2_idmap(u64 start, u64 end, enum kvm_pgtable_prot prot) { - return kvm_pgtable_stage2_map(&host_kvm.pgt, start, end - start, start, + return kvm_pgtable_stage2_map(&host_mmu.pgt, start, end - start, start, prot, &host_s2_pool); } @@ -271,7 +271,7 @@ static inline int __host_stage2_idmap(u64 start, u64 end, #define host_stage2_try(fn, ...) \ ({ \ int __ret; \ - hyp_assert_lock_held(&host_kvm.lock); \ + hyp_assert_lock_held(&host_mmu.lock); \ __ret = fn(__VA_ARGS__); \ if (__ret == -ENOMEM) { \ __ret = host_stage2_unmap_dev_all(); \ @@ -294,8 +294,8 @@ static int host_stage2_adjust_range(u64 addr, struct kvm_mem_range *range) u32 level; int ret; - hyp_assert_lock_held(&host_kvm.lock); - ret = kvm_pgtable_get_leaf(&host_kvm.pgt, addr, &pte, &level); + hyp_assert_lock_held(&host_mmu.lock); + ret = kvm_pgtable_get_leaf(&host_mmu.pgt, addr, &pte, &level); if (ret) return ret; @@ -327,7 +327,7 @@ int host_stage2_idmap_locked(phys_addr_t addr, u64 size, int host_stage2_set_owner_locked(phys_addr_t addr, u64 size, u8 owner_id) { - return host_stage2_try(kvm_pgtable_stage2_set_owner, &host_kvm.pgt, + return host_stage2_try(kvm_pgtable_stage2_set_owner, &host_mmu.pgt, addr, size, &host_s2_pool, owner_id); } @@ -468,8 +468,8 @@ static int __host_check_page_state_range(u64 addr, u64 size, .get_page_state = host_get_page_state, }; - hyp_assert_lock_held(&host_kvm.lock); - return check_page_state_range(&host_kvm.pgt, addr, size, &d); + hyp_assert_lock_held(&host_mmu.lock); + return check_page_state_range(&host_mmu.pgt, addr, size, &d); } static int __host_set_page_state_range(u64 addr, u64 size, From patchwork Mon Oct 17 11:51:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 186E8C4332F for ; Mon, 17 Oct 2022 11:53:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230212AbiJQLxM (ORCPT ); Mon, 17 Oct 2022 07:53:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230263AbiJQLxL (ORCPT ); Mon, 17 Oct 2022 07:53:11 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F256120F64 for ; Mon, 17 Oct 2022 04:53:07 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 56617B81632 for ; Mon, 17 Oct 2022 11:53:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D2044C4347C; Mon, 17 Oct 2022 11:53:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007585; bh=z4lR4FzrXtXbz9V5TplRaskZ+b3uujoqlK/pnPb3FZo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=BUAeK/WkfJJc/gMhc0tGy1tFyhCN5GhRlve40aQgpVdZNXqgRqUU/t5Ff5Z6DROOE ayOxp08AcjeHpD2aBQ3aF2R9ZYY1OXZ93s1bCA5f5a+PYevqk1pp0cnwWMLV8E705G EabaRM3brbhmeA91z/RgFJ1mtkfl2dpxKa46ciVrnyXDO5zCcG8WEpFVizRONaqf/B jRKsodwYTBr7hRk+TGLciNrG7hQFz25uRiLE3CHFgEZHtwuLurFkH6URixWpfK+J/R MzNanUiPrL4YBJzPzJz72dpLJIgk44KkxyXM3WJFyuFFaZECrKKAU35X++fsfiDeu0 HnOt7m0jJmKxA== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 12/25] KVM: arm64: Add infrastructure to create and track pKVM instances at EL2 Date: Mon, 17 Oct 2022 12:51:56 +0100 Message-Id: <20221017115209.2099-13-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Fuad Tabba Introduce a global table (and lock) to track pKVM instances at EL2, and provide hypercalls that can be used by the untrusted host to create and destroy pKVM VMs and their vCPUs. pKVM VM/vCPU state is directly accessible only by the trusted hypervisor (EL2). Each pKVM VM is directly associated with an untrusted host KVM instance, and is referenced by the host using an opaque handle. Future patches will provide hypercalls to allow the host to initialize/set/get pKVM VM/vCPU state using the opaque handle. Tested-by: Vincent Donnefort Signed-off-by: Fuad Tabba Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_asm.h | 3 + arch/arm64/include/asm/kvm_host.h | 8 + arch/arm64/include/asm/kvm_pgtable.h | 8 + arch/arm64/include/asm/kvm_pkvm.h | 8 + arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 3 + arch/arm64/kvm/hyp/include/nvhe/pkvm.h | 64 +++ arch/arm64/kvm/hyp/nvhe/hyp-main.c | 31 ++ arch/arm64/kvm/hyp/nvhe/mem_protect.c | 14 + arch/arm64/kvm/hyp/nvhe/pkvm.c | 389 ++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/setup.c | 8 + arch/arm64/kvm/hyp/pgtable.c | 9 + arch/arm64/kvm/pkvm.c | 1 + 12 files changed, 546 insertions(+) create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 53035763e48e..de52ba775d48 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -76,6 +76,9 @@ enum __kvm_host_smccc_func { __KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs, __KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs, __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_init_traps, + __KVM_HOST_SMCCC_FUNC___pkvm_init_vm, + __KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu, + __KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm, }; #define DECLARE_KVM_VHE_SYM(sym) extern char sym[] diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 45e2136322ba..d3dd7ab9c79e 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -115,6 +115,8 @@ struct kvm_smccc_features { unsigned long vendor_hyp_bmap; }; +typedef unsigned int pkvm_handle_t; + struct kvm_arch { struct kvm_s2_mmu mmu; @@ -166,6 +168,12 @@ struct kvm_arch { /* Hypercall features firmware registers' descriptor */ struct kvm_smccc_features smccc_feat; + + /* + * For an untrusted host VM, 'pkvm_handle' is used to lookup + * the associated pKVM instance in the hypervisor. + */ + pkvm_handle_t pkvm_handle; }; struct kvm_vcpu_fault_info { diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 1b098bd4cd37..4f6d79fe4352 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -288,6 +288,14 @@ u64 kvm_pgtable_hyp_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size); */ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift); +/** + * kvm_pgtable_stage2_pgd_size() - Helper to compute size of a stage-2 PGD + * @vtcr: Content of the VTCR register. + * + * Return: the size (in bytes) of the stage-2 PGD + */ +size_t kvm_pgtable_stage2_pgd_size(u64 vtcr); + /** * __kvm_pgtable_stage2_init() - Initialise a guest stage-2 page-table. * @pgt: Uninitialised page-table structure to initialise. diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h index 8f7b8a2314bb..e22856cf4207 100644 --- a/arch/arm64/include/asm/kvm_pkvm.h +++ b/arch/arm64/include/asm/kvm_pkvm.h @@ -9,6 +9,9 @@ #include #include +/* Maximum number of protected VMs that can be created. */ +#define KVM_MAX_PVMS 255 + #define HYP_MEMBLOCK_REGIONS 128 extern struct memblock_region kvm_nvhe_sym(hyp_memory)[]; @@ -40,6 +43,11 @@ static inline unsigned long hyp_vmemmap_pages(size_t vmemmap_entry_size) return res >> PAGE_SHIFT; } +static inline unsigned long hyp_vm_table_pages(void) +{ + return PAGE_ALIGN(KVM_MAX_PVMS * sizeof(void *)) >> PAGE_SHIFT; +} + static inline unsigned long __hyp_pgtable_max_pages(unsigned long nr_pages) { unsigned long total = 0, i; diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index 0a6d3e7f2a43..ce9a796a85ee 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -11,6 +11,7 @@ #include #include #include +#include #include /* @@ -68,10 +69,12 @@ bool addr_is_memory(phys_addr_t phys); int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot prot); int host_stage2_set_owner_locked(phys_addr_t addr, u64 size, u8 owner_id); int kvm_host_prepare_stage2(void *pgt_pool_base); +int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd); void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt); int hyp_pin_shared_mem(void *from, void *to); void hyp_unpin_shared_mem(void *from, void *to); +void reclaim_guest_pages(struct pkvm_hyp_vm *vm); static __always_inline void __load_host_stage2(void) { diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h new file mode 100644 index 000000000000..d176399dbfb5 --- /dev/null +++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h @@ -0,0 +1,64 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2021 Google LLC + * Author: Fuad Tabba + */ + +#ifndef __ARM64_KVM_NVHE_PKVM_H__ +#define __ARM64_KVM_NVHE_PKVM_H__ + +#include + +/* + * Holds the relevant data for maintaining the vcpu state completely at hyp. + */ +struct pkvm_hyp_vcpu { + struct kvm_vcpu vcpu; + + /* Backpointer to the host's (untrusted) vCPU instance. */ + struct kvm_vcpu *host_vcpu; +}; + +/* + * Holds the relevant data for running a protected vm. + */ +struct pkvm_hyp_vm { + struct kvm kvm; + + /* Backpointer to the host's (untrusted) KVM instance. */ + struct kvm *host_kvm; + + /* + * Total amount of memory donated by the host for maintaining + * this 'struct pkvm_hyp_vm' in the hypervisor. + */ + size_t donated_memory_size; + + /* The guest's stage-2 page-table managed by the hypervisor. */ + struct kvm_pgtable pgt; + + /* + * The number of vcpus initialized and ready to run. + * Modifying this is protected by 'vm_table_lock'. + */ + unsigned int nr_vcpus; + + /* Array of the hyp vCPU structures for this VM. */ + struct pkvm_hyp_vcpu *vcpus[]; +}; + +static inline struct pkvm_hyp_vm * +pkvm_hyp_vcpu_to_hyp_vm(struct pkvm_hyp_vcpu *hyp_vcpu) +{ + return container_of(hyp_vcpu->vcpu.kvm, struct pkvm_hyp_vm, kvm); +} + +void pkvm_hyp_vm_table_init(void *tbl); + +int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva, + unsigned long pgd_hva); +int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu, + unsigned long vcpu_hva); +int __pkvm_teardown_vm(pkvm_handle_t handle); + +#endif /* __ARM64_KVM_NVHE_PKVM_H__ */ diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index 3cea4b6ac23e..b5f3fcfe9135 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -15,6 +15,7 @@ #include #include +#include #include DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params); @@ -191,6 +192,33 @@ static void handle___pkvm_vcpu_init_traps(struct kvm_cpu_context *host_ctxt) __pkvm_vcpu_init_traps(kern_hyp_va(vcpu)); } +static void handle___pkvm_init_vm(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(struct kvm *, host_kvm, host_ctxt, 1); + DECLARE_REG(unsigned long, vm_hva, host_ctxt, 2); + DECLARE_REG(unsigned long, pgd_hva, host_ctxt, 3); + + host_kvm = kern_hyp_va(host_kvm); + cpu_reg(host_ctxt, 1) = __pkvm_init_vm(host_kvm, vm_hva, pgd_hva); +} + +static void handle___pkvm_init_vcpu(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(pkvm_handle_t, handle, host_ctxt, 1); + DECLARE_REG(struct kvm_vcpu *, host_vcpu, host_ctxt, 2); + DECLARE_REG(unsigned long, vcpu_hva, host_ctxt, 3); + + host_vcpu = kern_hyp_va(host_vcpu); + cpu_reg(host_ctxt, 1) = __pkvm_init_vcpu(handle, host_vcpu, vcpu_hva); +} + +static void handle___pkvm_teardown_vm(struct kvm_cpu_context *host_ctxt) +{ + DECLARE_REG(pkvm_handle_t, handle, host_ctxt, 1); + + cpu_reg(host_ctxt, 1) = __pkvm_teardown_vm(handle); +} + typedef void (*hcall_t)(struct kvm_cpu_context *); #define HANDLE_FUNC(x) [__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x @@ -220,6 +248,9 @@ static const hcall_t host_hcall[] = { HANDLE_FUNC(__vgic_v3_save_aprs), HANDLE_FUNC(__vgic_v3_restore_aprs), HANDLE_FUNC(__pkvm_vcpu_init_traps), + HANDLE_FUNC(__pkvm_init_vm), + HANDLE_FUNC(__pkvm_init_vcpu), + HANDLE_FUNC(__pkvm_teardown_vm), }; static void handle_host_hcall(struct kvm_cpu_context *host_ctxt) diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index bec8306c2392..2ef6aaa21ba5 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -141,6 +141,20 @@ int kvm_host_prepare_stage2(void *pgt_pool_base) return 0; } +int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd) +{ + vm->pgt.pgd = pgd; + return 0; +} + +void reclaim_guest_pages(struct pkvm_hyp_vm *vm) +{ + unsigned long nr_pages; + + nr_pages = kvm_pgtable_stage2_pgd_size(vm->kvm.arch.vtcr) >> PAGE_SHIFT; + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(vm->pgt.pgd), nr_pages)); +} + int __pkvm_prot_finalize(void) { struct kvm_s2_mmu *mmu = &host_mmu.arch.mmu; diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index 85d3b7ae720f..7e6f7d2f7713 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -7,6 +7,9 @@ #include #include #include +#include +#include +#include #include /* @@ -183,3 +186,389 @@ void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu) pvm_init_traps_aa64mmfr0(vcpu); pvm_init_traps_aa64mmfr1(vcpu); } + +/* + * Start the VM table handle at the offset defined instead of at 0. + * Mainly for sanity checking and debugging. + */ +#define HANDLE_OFFSET 0x1000 + +static unsigned int vm_handle_to_idx(pkvm_handle_t handle) +{ + return handle - HANDLE_OFFSET; +} + +static pkvm_handle_t idx_to_vm_handle(unsigned int idx) +{ + return idx + HANDLE_OFFSET; +} + +/* + * Spinlock for protecting state related to the VM table. Protects writes + * to 'vm_table' and 'nr_table_entries' as well as reads and writes to + * 'last_hyp_vcpu_lookup'. + */ +static DEFINE_HYP_SPINLOCK(vm_table_lock); + +/* + * The table of VM entries for protected VMs in hyp. + * Allocated at hyp initialization and setup. + */ +static struct pkvm_hyp_vm **vm_table; + +void pkvm_hyp_vm_table_init(void *tbl) +{ + WARN_ON(vm_table); + vm_table = tbl; +} + +/* + * Return the hyp vm structure corresponding to the handle. + */ +static struct pkvm_hyp_vm *get_vm_by_handle(pkvm_handle_t handle) +{ + unsigned int idx = vm_handle_to_idx(handle); + + if (unlikely(idx >= KVM_MAX_PVMS)) + return NULL; + + return vm_table[idx]; +} + +static void unpin_host_vcpu(struct kvm_vcpu *host_vcpu) +{ + if (host_vcpu) + hyp_unpin_shared_mem(host_vcpu, host_vcpu + 1); +} + +static void unpin_host_vcpus(struct pkvm_hyp_vcpu *hyp_vcpus[], + unsigned int nr_vcpus) +{ + int i; + + for (i = 0; i < nr_vcpus; i++) + unpin_host_vcpu(hyp_vcpus[i]->host_vcpu); +} + +static void init_pkvm_hyp_vm(struct kvm *host_kvm, struct pkvm_hyp_vm *hyp_vm, + unsigned int nr_vcpus) +{ + hyp_vm->host_kvm = host_kvm; + hyp_vm->kvm.created_vcpus = nr_vcpus; + hyp_vm->kvm.arch.vtcr = host_mmu.arch.vtcr; +} + +static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu, + struct pkvm_hyp_vm *hyp_vm, + struct kvm_vcpu *host_vcpu, + unsigned int vcpu_idx) +{ + int ret = 0; + + if (hyp_pin_shared_mem(host_vcpu, host_vcpu + 1)) + return -EBUSY; + + if (host_vcpu->vcpu_idx != vcpu_idx) { + ret = -EINVAL; + goto done; + } + + hyp_vcpu->host_vcpu = host_vcpu; + + hyp_vcpu->vcpu.kvm = &hyp_vm->kvm; + hyp_vcpu->vcpu.vcpu_id = READ_ONCE(host_vcpu->vcpu_id); + hyp_vcpu->vcpu.vcpu_idx = vcpu_idx; + + hyp_vcpu->vcpu.arch.hw_mmu = &hyp_vm->kvm.arch.mmu; +done: + if (ret) + unpin_host_vcpu(host_vcpu); + return ret; +} + +static int find_free_vm_table_entry(struct kvm *host_kvm) +{ + int i, ret = -ENOMEM; + + for (i = 0; i < KVM_MAX_PVMS; ++i) { + struct pkvm_hyp_vm *vm = vm_table[i]; + + if (!vm) { + if (ret < 0) + ret = i; + continue; + } + + if (unlikely(vm->host_kvm == host_kvm)) { + ret = -EEXIST; + break; + } + } + + return ret; +} + +/* + * Allocate a VM table entry and insert a pointer to the new vm. + * + * Return a unique handle to the protected VM on success, + * negative error code on failure. + */ +static pkvm_handle_t insert_vm_table_entry(struct kvm *host_kvm, + struct pkvm_hyp_vm *hyp_vm, + size_t donated_memory_size) +{ + struct kvm_s2_mmu *mmu = &hyp_vm->kvm.arch.mmu; + int idx; + + hyp_assert_lock_held(&vm_table_lock); + + /* + * Initializing protected state might have failed, yet a malicious + * host could trigger this function. Thus, ensure that 'vm_table' + * exists. + */ + if (unlikely(!vm_table)) + return -EINVAL; + + idx = find_free_vm_table_entry(host_kvm); + if (idx < 0) + return idx; + + hyp_vm->kvm.arch.pkvm_handle = idx_to_vm_handle(idx); + hyp_vm->donated_memory_size = donated_memory_size; + + /* VMID 0 is reserved for the host */ + atomic64_set(&mmu->vmid.id, idx + 1); + + mmu->arch = &hyp_vm->kvm.arch; + mmu->pgt = &hyp_vm->pgt; + + vm_table[idx] = hyp_vm; + return hyp_vm->kvm.arch.pkvm_handle; +} + +/* + * Deallocate and remove the VM table entry corresponding to the handle. + */ +static void remove_vm_table_entry(pkvm_handle_t handle) +{ + hyp_assert_lock_held(&vm_table_lock); + vm_table[vm_handle_to_idx(handle)] = NULL; +} + +static size_t pkvm_get_hyp_vm_size(unsigned int nr_vcpus) +{ + return size_add(sizeof(struct pkvm_hyp_vm), + size_mul(sizeof(struct pkvm_hyp_vcpu *), nr_vcpus)); +} + +static void *map_donated_memory_noclear(unsigned long host_va, size_t size) +{ + void *va = (void *)kern_hyp_va(host_va); + + if (!PAGE_ALIGNED(va)) + return NULL; + + if (__pkvm_host_donate_hyp(hyp_virt_to_pfn(va), + PAGE_ALIGN(size) >> PAGE_SHIFT)) + return NULL; + + return va; +} + +static void *map_donated_memory(unsigned long host_va, size_t size) +{ + void *va = map_donated_memory_noclear(host_va, size); + + if (va) + memset(va, 0, size); + + return va; +} + +static void __unmap_donated_memory(void *va, size_t size) +{ + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(va), + PAGE_ALIGN(size) >> PAGE_SHIFT)); +} + +static void unmap_donated_memory(void *va, size_t size) +{ + if (!va) + return; + + memset(va, 0, size); + __unmap_donated_memory(va, size); +} + +static void unmap_donated_memory_noclear(void *va, size_t size) +{ + if (!va) + return; + + __unmap_donated_memory(va, size); +} + +/* + * Initialize the hypervisor copy of the protected VM state using the + * memory donated by the host. + * + * Unmaps the donated memory from the host at stage 2. + * + * kvm: A pointer to the host's struct kvm. + * vm_hva: The host va of the area being donated for the VM state. + * Must be page aligned. + * pgd_hva: The host va of the area being donated for the stage-2 PGD for + * the VM. Must be page aligned. Its size is implied by the VM's + * VTCR. + * + * Return a unique handle to the protected VM on success, + * negative error code on failure. + */ +int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva, + unsigned long pgd_hva) +{ + struct pkvm_hyp_vm *hyp_vm = NULL; + size_t vm_size, pgd_size; + unsigned int nr_vcpus; + void *pgd = NULL; + int ret; + + ret = hyp_pin_shared_mem(host_kvm, host_kvm + 1); + if (ret) + return ret; + + nr_vcpus = READ_ONCE(host_kvm->created_vcpus); + if (nr_vcpus < 1) { + ret = -EINVAL; + goto err_unpin_kvm; + } + + vm_size = pkvm_get_hyp_vm_size(nr_vcpus); + pgd_size = kvm_pgtable_stage2_pgd_size(host_mmu.arch.vtcr); + + ret = -ENOMEM; + + hyp_vm = map_donated_memory(vm_hva, vm_size); + if (!hyp_vm) + goto err_remove_mappings; + + pgd = map_donated_memory_noclear(pgd_hva, pgd_size); + if (!pgd) + goto err_remove_mappings; + + init_pkvm_hyp_vm(host_kvm, hyp_vm, nr_vcpus); + + hyp_spin_lock(&vm_table_lock); + ret = insert_vm_table_entry(host_kvm, hyp_vm, vm_size); + if (ret < 0) + goto err_unlock; + + ret = kvm_guest_prepare_stage2(hyp_vm, pgd); + if (ret) + goto err_remove_vm_table_entry; + hyp_spin_unlock(&vm_table_lock); + + return hyp_vm->kvm.arch.pkvm_handle; + +err_remove_vm_table_entry: + remove_vm_table_entry(hyp_vm->kvm.arch.pkvm_handle); +err_unlock: + hyp_spin_unlock(&vm_table_lock); +err_remove_mappings: + unmap_donated_memory(hyp_vm, vm_size); + unmap_donated_memory(pgd, pgd_size); +err_unpin_kvm: + hyp_unpin_shared_mem(host_kvm, host_kvm + 1); + return ret; +} + +/* + * Initialize the hypervisor copy of the protected vCPU state using the + * memory donated by the host. + * + * handle: The handle for the protected vm. + * host_vcpu: A pointer to the corresponding host vcpu. + * vcpu_hva: The host va of the area being donated for the vcpu state. + * Must be page aligned. The size of the area must be equal to + * the page-aligned size of 'struct pkvm_hyp_vcpu'. + * Return 0 on success, negative error code on failure. + */ +int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu, + unsigned long vcpu_hva) +{ + struct pkvm_hyp_vcpu *hyp_vcpu; + struct pkvm_hyp_vm *hyp_vm; + unsigned int idx; + int ret; + + hyp_vcpu = map_donated_memory(vcpu_hva, sizeof(*hyp_vcpu)); + if (!hyp_vcpu) + return -ENOMEM; + + hyp_spin_lock(&vm_table_lock); + + hyp_vm = get_vm_by_handle(handle); + if (!hyp_vm) { + ret = -ENOENT; + goto unlock; + } + + idx = hyp_vm->nr_vcpus; + if (idx >= hyp_vm->kvm.created_vcpus) { + ret = -EINVAL; + goto unlock; + } + + ret = init_pkvm_hyp_vcpu(hyp_vcpu, hyp_vm, host_vcpu, idx); + if (ret) + goto unlock; + + hyp_vm->vcpus[idx] = hyp_vcpu; + hyp_vm->nr_vcpus++; +unlock: + hyp_spin_unlock(&vm_table_lock); + + if (ret) + unmap_donated_memory(hyp_vcpu, sizeof(*hyp_vcpu)); + + return ret; +} + +int __pkvm_teardown_vm(pkvm_handle_t handle) +{ + struct pkvm_hyp_vm *hyp_vm; + int err; + + hyp_spin_lock(&vm_table_lock); + hyp_vm = get_vm_by_handle(handle); + if (!hyp_vm) { + err = -ENOENT; + goto err_unlock; + } + + if (WARN_ON(hyp_page_count(hyp_vm))) { + err = -EBUSY; + goto err_unlock; + } + + /* Ensure the VMID is clean before it can be reallocated */ + __kvm_tlb_flush_vmid(&hyp_vm->kvm.arch.mmu); + remove_vm_table_entry(handle); + hyp_spin_unlock(&vm_table_lock); + + /* Reclaim guest pages (including page-table pages) */ + reclaim_guest_pages(hyp_vm); + unpin_host_vcpus(hyp_vm->vcpus, hyp_vm->nr_vcpus); + + /* Push the metadata pages to the teardown memcache */ + hyp_unpin_shared_mem(hyp_vm->host_kvm, hyp_vm->host_kvm + 1); + + unmap_donated_memory(hyp_vm, hyp_vm->donated_memory_size); + return 0; + +err_unlock: + hyp_spin_unlock(&vm_table_lock); + return err; +} diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 0312c9c74a5a..2be72fbe7279 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -16,6 +16,7 @@ #include #include #include +#include #include unsigned long hyp_nr_cpus; @@ -24,6 +25,7 @@ unsigned long hyp_nr_cpus; (unsigned long)__per_cpu_start) static void *vmemmap_base; +static void *vm_table_base; static void *hyp_pgt_base; static void *host_s2_pgt_base; static struct kvm_pgtable_mm_ops pkvm_pgtable_mm_ops; @@ -40,6 +42,11 @@ static int divide_memory_pool(void *virt, unsigned long size) if (!vmemmap_base) return -ENOMEM; + nr_pages = hyp_vm_table_pages(); + vm_table_base = hyp_early_alloc_contig(nr_pages); + if (!vm_table_base) + return -ENOMEM; + nr_pages = hyp_s1_pgtable_pages(); hyp_pgt_base = hyp_early_alloc_contig(nr_pages); if (!hyp_pgt_base) @@ -314,6 +321,7 @@ void __noreturn __pkvm_init_finalise(void) if (ret) goto out; + pkvm_hyp_vm_table_init(vm_table_base); out: /* * We tail-called to here from handle___pkvm_init() and will not return, diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index cdf8e76b0be1..a1a27f88a312 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -1200,6 +1200,15 @@ int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu, return 0; } +size_t kvm_pgtable_stage2_pgd_size(u64 vtcr) +{ + u32 ia_bits = VTCR_EL2_IPA(vtcr); + u32 sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, vtcr); + u32 start_level = VTCR_EL2_TGRAN_SL0_BASE - sl0; + + return kvm_pgd_pages(ia_bits, start_level) * PAGE_SIZE; +} + static int stage2_free_walker(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, enum kvm_pgtable_walk_flags flag, void * const arg) diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c index 34229425b25d..71493136e59c 100644 --- a/arch/arm64/kvm/pkvm.c +++ b/arch/arm64/kvm/pkvm.c @@ -71,6 +71,7 @@ void __init kvm_hyp_reserve(void) hyp_mem_pages += hyp_s1_pgtable_pages(); hyp_mem_pages += host_s2_pgtable_pages(); + hyp_mem_pages += hyp_vm_table_pages(); hyp_mem_pages += hyp_vmemmap_pages(STRUCT_HYP_PAGE_SIZE); /* From patchwork Mon Oct 17 11:51:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008678 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82B0BC433FE for ; Mon, 17 Oct 2022 11:53:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230036AbiJQLxS (ORCPT ); Mon, 17 Oct 2022 07:53:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60714 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230251AbiJQLxQ (ORCPT ); Mon, 17 Oct 2022 07:53:16 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A32D580B8 for ; Mon, 17 Oct 2022 04:53:11 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id B9AB6B81633 for ; Mon, 17 Oct 2022 11:53:09 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 85F84C433D6; Mon, 17 Oct 2022 11:53:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007588; bh=ajl+QKd98/u6HGhG22+h7NXI7CO6S41FxnMzYXJiD5c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kZkRlncZsdWcBNt6oWGYjbC2oXrI1F9aJ6MGoS59XYWhzk8fudfxy0hXLlDKIgftl 8wXQnPxPB01I9tr+xOVxMHC94+WqIAR+lGtX99vIvXLVipLPWACvryLwv0FnG6gK4D U2R4dmBFNqzadRkuivuGDifwv733+ZXYj50kCyo/duJ03FTztET8iln73beq3b1b0j ByaeEjhu0ePRJu6DHfegLIhgwFjY7TshV+8zCJJXI1RJeSZ0tnLBBqBAdM+Sl51LNK ZORsBoxc1br+5FTpu/F5jTQrXx8HrOcix5FHzClKtaoGYxlvl9706Il5/Ujj0C/KUI XCSYgFxT97CUA== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 13/25] KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1 Date: Mon, 17 Oct 2022 12:51:57 +0100 Message-Id: <20221017115209.2099-14-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Fuad Tabba With the pKVM hypervisor at EL2 now offering hypercalls to the host for creating and destroying VM and vCPU structures, plumb these in to the existing arm64 KVM backend to ensure that the hypervisor data structures are allocated and initialised on first vCPU run for a pKVM guest. In the host, 'struct kvm_protected_vm' is introduced to hold the handle of the pKVM VM instance as well as to track references to the memory donated to the hypervisor so that it can be freed back to the host allocator following VM teardown. The stage-2 page-table, hypervisor VM and vCPU structures are allocated separately so as to avoid the need for a large physically-contiguous allocation in the host at run-time. Tested-by: Vincent Donnefort Signed-off-by: Fuad Tabba Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_host.h | 15 +++- arch/arm64/include/asm/kvm_pkvm.h | 4 + arch/arm64/kvm/arm.c | 14 +++ arch/arm64/kvm/hyp/hyp-constants.c | 3 + arch/arm64/kvm/hyp/nvhe/pkvm.c | 15 +++- arch/arm64/kvm/pkvm.c | 138 +++++++++++++++++++++++++++++ 6 files changed, 183 insertions(+), 6 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index d3dd7ab9c79e..f09b422ba1fe 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -117,6 +117,17 @@ struct kvm_smccc_features { typedef unsigned int pkvm_handle_t; +struct kvm_protected_vm { + pkvm_handle_t handle; + struct mutex vm_lock; + + struct { + void *pgd; + void *vm; + void *vcpus[KVM_MAX_VCPUS]; + } hyp_donations; +}; + struct kvm_arch { struct kvm_s2_mmu mmu; @@ -170,10 +181,10 @@ struct kvm_arch { struct kvm_smccc_features smccc_feat; /* - * For an untrusted host VM, 'pkvm_handle' is used to lookup + * For an untrusted host VM, 'pkvm.handle' is used to lookup * the associated pKVM instance in the hypervisor. */ - pkvm_handle_t pkvm_handle; + struct kvm_protected_vm pkvm; }; struct kvm_vcpu_fault_info { diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h index e22856cf4207..8340798b2289 100644 --- a/arch/arm64/include/asm/kvm_pkvm.h +++ b/arch/arm64/include/asm/kvm_pkvm.h @@ -14,6 +14,10 @@ #define HYP_MEMBLOCK_REGIONS 128 +int pkvm_init_host_vm(struct kvm *kvm); +int pkvm_create_hyp_vm(struct kvm *kvm); +void pkvm_destroy_hyp_vm(struct kvm *kvm); + extern struct memblock_region kvm_nvhe_sym(hyp_memory)[]; extern unsigned int kvm_nvhe_sym(hyp_memblock_nr); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 94d33e296e10..30d6fc5d3a93 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include @@ -150,6 +151,10 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) if (ret) goto out_free_stage2_pgd; + ret = pkvm_init_host_vm(kvm); + if (ret) + goto out_free_stage2_pgd; + if (!zalloc_cpumask_var(&kvm->arch.supported_cpus, GFP_KERNEL)) { ret = -ENOMEM; goto out_free_stage2_pgd; @@ -187,6 +192,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kvm_vgic_destroy(kvm); + if (is_protected_kvm_enabled()) + pkvm_destroy_hyp_vm(kvm); + kvm_destroy_vcpus(kvm); kvm_unshare_hyp(kvm, kvm + 1); @@ -569,6 +577,12 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) if (ret) return ret; + if (is_protected_kvm_enabled()) { + ret = pkvm_create_hyp_vm(kvm); + if (ret) + return ret; + } + if (!irqchip_in_kernel(kvm)) { /* * Tell the rest of the code that there are userspace irqchip diff --git a/arch/arm64/kvm/hyp/hyp-constants.c b/arch/arm64/kvm/hyp/hyp-constants.c index b3742a6691e8..b257a3b4bfc5 100644 --- a/arch/arm64/kvm/hyp/hyp-constants.c +++ b/arch/arm64/kvm/hyp/hyp-constants.c @@ -2,9 +2,12 @@ #include #include +#include int main(void) { DEFINE(STRUCT_HYP_PAGE_SIZE, sizeof(struct hyp_page)); + DEFINE(PKVM_HYP_VM_SIZE, sizeof(struct pkvm_hyp_vm)); + DEFINE(PKVM_HYP_VCPU_SIZE, sizeof(struct pkvm_hyp_vcpu)); return 0; } diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index 7e6f7d2f7713..5970fd1adbfb 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -335,7 +335,7 @@ static pkvm_handle_t insert_vm_table_entry(struct kvm *host_kvm, if (idx < 0) return idx; - hyp_vm->kvm.arch.pkvm_handle = idx_to_vm_handle(idx); + hyp_vm->kvm.arch.pkvm.handle = idx_to_vm_handle(idx); hyp_vm->donated_memory_size = donated_memory_size; /* VMID 0 is reserved for the host */ @@ -345,7 +345,7 @@ static pkvm_handle_t insert_vm_table_entry(struct kvm *host_kvm, mmu->pgt = &hyp_vm->pgt; vm_table[idx] = hyp_vm; - return hyp_vm->kvm.arch.pkvm_handle; + return hyp_vm->kvm.arch.pkvm.handle; } /* @@ -470,10 +470,10 @@ int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva, goto err_remove_vm_table_entry; hyp_spin_unlock(&vm_table_lock); - return hyp_vm->kvm.arch.pkvm_handle; + return hyp_vm->kvm.arch.pkvm.handle; err_remove_vm_table_entry: - remove_vm_table_entry(hyp_vm->kvm.arch.pkvm_handle); + remove_vm_table_entry(hyp_vm->kvm.arch.pkvm.handle); err_unlock: hyp_spin_unlock(&vm_table_lock); err_remove_mappings: @@ -539,6 +539,7 @@ int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu, int __pkvm_teardown_vm(pkvm_handle_t handle) { struct pkvm_hyp_vm *hyp_vm; + unsigned int idx; int err; hyp_spin_lock(&vm_table_lock); @@ -565,6 +566,12 @@ int __pkvm_teardown_vm(pkvm_handle_t handle) /* Push the metadata pages to the teardown memcache */ hyp_unpin_shared_mem(hyp_vm->host_kvm, hyp_vm->host_kvm + 1); + for (idx = 0; idx < hyp_vm->nr_vcpus; ++idx) { + struct pkvm_hyp_vcpu *hyp_vcpu = hyp_vm->vcpus[idx]; + + unmap_donated_memory(hyp_vcpu, sizeof(*hyp_vcpu)); + } + unmap_donated_memory(hyp_vm, hyp_vm->donated_memory_size); return 0; diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c index 71493136e59c..754632a608e3 100644 --- a/arch/arm64/kvm/pkvm.c +++ b/arch/arm64/kvm/pkvm.c @@ -6,6 +6,7 @@ #include #include +#include #include #include @@ -94,3 +95,140 @@ void __init kvm_hyp_reserve(void) kvm_info("Reserved %lld MiB at 0x%llx\n", hyp_mem_size >> 20, hyp_mem_base); } + +/* + * Allocates and donates memory for hypervisor VM structs at EL2. + * + * Allocates space for the VM state, which includes the hyp vm as well as + * the hyp vcpus. + * + * Stores an opaque handler in the kvm struct for future reference. + * + * Return 0 on success, negative error code on failure. + */ +static int __pkvm_create_hyp_vm(struct kvm *host_kvm) +{ + size_t pgd_sz, hyp_vm_sz, hyp_vcpu_sz; + struct kvm_vcpu *host_vcpu; + pkvm_handle_t handle; + void *pgd, *hyp_vm; + unsigned long idx; + int ret; + + if (host_kvm->created_vcpus < 1) + return -EINVAL; + + pgd_sz = kvm_pgtable_stage2_pgd_size(host_kvm->arch.vtcr); + + /* + * The PGD pages will be reclaimed using a hyp_memcache which implies + * page granularity. So, use alloc_pages_exact() to get individual + * refcounts. + */ + pgd = alloc_pages_exact(pgd_sz, GFP_KERNEL_ACCOUNT); + if (!pgd) + return -ENOMEM; + + /* Allocate memory to donate to hyp for vm and vcpu pointers. */ + hyp_vm_sz = PAGE_ALIGN(size_add(PKVM_HYP_VM_SIZE, + size_mul(sizeof(void *), + host_kvm->created_vcpus))); + hyp_vm = alloc_pages_exact(hyp_vm_sz, GFP_KERNEL_ACCOUNT); + if (!hyp_vm) { + ret = -ENOMEM; + goto free_pgd; + } + + /* Donate the VM memory to hyp and let hyp initialize it. */ + ret = kvm_call_hyp_nvhe(__pkvm_init_vm, host_kvm, hyp_vm, pgd); + if (ret < 0) + goto free_vm; + + handle = ret; + + host_kvm->arch.pkvm.handle = handle; + host_kvm->arch.pkvm.hyp_donations.pgd = pgd; + host_kvm->arch.pkvm.hyp_donations.vm = hyp_vm; + + /* Donate memory for the vcpus at hyp and initialize it. */ + hyp_vcpu_sz = PAGE_ALIGN(PKVM_HYP_VCPU_SIZE); + kvm_for_each_vcpu(idx, host_vcpu, host_kvm) { + void *hyp_vcpu; + + /* Indexing of the vcpus to be sequential starting at 0. */ + if (WARN_ON(host_vcpu->vcpu_idx != idx)) { + ret = -EINVAL; + goto destroy_vm; + } + + hyp_vcpu = alloc_pages_exact(hyp_vcpu_sz, GFP_KERNEL_ACCOUNT); + if (!hyp_vcpu) { + ret = -ENOMEM; + goto destroy_vm; + } + + host_kvm->arch.pkvm.hyp_donations.vcpus[idx] = hyp_vcpu; + + ret = kvm_call_hyp_nvhe(__pkvm_init_vcpu, handle, host_vcpu, + hyp_vcpu); + if (ret) + goto destroy_vm; + } + + return 0; + +destroy_vm: + pkvm_destroy_hyp_vm(host_kvm); + return ret; +free_vm: + free_pages_exact(hyp_vm, hyp_vm_sz); +free_pgd: + free_pages_exact(pgd, pgd_sz); + return ret; +} + +int pkvm_create_hyp_vm(struct kvm *host_kvm) +{ + int ret = 0; + + mutex_lock(&host_kvm->arch.pkvm.vm_lock); + if (!host_kvm->arch.pkvm.handle) + ret = __pkvm_create_hyp_vm(host_kvm); + mutex_unlock(&host_kvm->arch.pkvm.vm_lock); + + return ret; +} + +void pkvm_destroy_hyp_vm(struct kvm *host_kvm) +{ + unsigned long idx, nr_vcpus = host_kvm->created_vcpus; + size_t pgd_sz, hyp_vm_sz; + + if (host_kvm->arch.pkvm.handle) + WARN_ON(kvm_call_hyp_nvhe(__pkvm_teardown_vm, + host_kvm->arch.pkvm.handle)); + + host_kvm->arch.pkvm.handle = 0; + + for (idx = 0; idx < nr_vcpus; ++idx) { + void *hyp_vcpu = host_kvm->arch.pkvm.hyp_donations.vcpus[idx]; + + if (!hyp_vcpu) + break; + + free_pages_exact(hyp_vcpu, PAGE_ALIGN(PKVM_HYP_VCPU_SIZE)); + } + + hyp_vm_sz = PAGE_ALIGN(size_add(PKVM_HYP_VM_SIZE, + size_mul(sizeof(void *), nr_vcpus))); + pgd_sz = kvm_pgtable_stage2_pgd_size(host_kvm->arch.vtcr); + + free_pages_exact(host_kvm->arch.pkvm.hyp_donations.vm, hyp_vm_sz); + free_pages_exact(host_kvm->arch.pkvm.hyp_donations.pgd, pgd_sz); +} + +int pkvm_init_host_vm(struct kvm *host_kvm) +{ + mutex_init(&host_kvm->arch.pkvm.vm_lock); + return 0; +} From patchwork Mon Oct 17 11:51:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008680 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C0FCC4332F for ; Mon, 17 Oct 2022 11:53:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229796AbiJQLxs (ORCPT ); Mon, 17 Oct 2022 07:53:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32844 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230295AbiJQLx3 (ORCPT ); Mon, 17 Oct 2022 07:53:29 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5497F4E852 for ; Mon, 17 Oct 2022 04:53:15 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 61B2FB81639 for ; Mon, 17 Oct 2022 11:53:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0E6C8C43142; Mon, 17 Oct 2022 11:53:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007592; bh=Inzs8ZJdYCkUwGAUADJMBeCAylz6WPUUbJixhOI4Z58=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KSpYZWdvgclllzObUJV229VzuMnzALc8oeO2wR+IbXe9l0GZaXKmdIXQdVux/W8Bh 4hTV5dfC6vMdSjCpdEoDaoXPuxbUuaojB2/a/5PYJ+y8ndY/q5hm2Zuvevfm1wHf3A 4s5VFXwVRhVv9Wv/p0AUzKXHEo/JDFhhp3cgYrAas+qlBTeXA71lGnfJsYTT4Fhh3K /8cfgitBByGG761x08xVSG8MkbHO5m1CiO8A6ki+VpoBynQH4jjOdcLs44XNojI81i +3WKz6FxmQLjFogVHKRxWwq/MaKnmJb0EcModctD3DWhnbjm1ZGFnTqu69SM8pUHBv +fBs64bG/16cA== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 14/25] KVM: arm64: Add per-cpu fixmap infrastructure at EL2 Date: Mon, 17 Oct 2022 12:51:58 +0100 Message-Id: <20221017115209.2099-15-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret Mapping pages in a guest page-table from within the pKVM hypervisor at EL2 may require cache maintenance to ensure that the initialised page contents is visible even to non-cacheable (e.g. MMU-off) accesses from the guest. In preparation for performing this maintenance at EL2, introduce a per-vCPU fixmap which allows the pKVM hypervisor to map guest pages temporarily into its stage-1 page-table for the purposes of cache maintenance and, in future, poisoning on the reclaim path. The use of a fixmap avoids the need for memory allocation or locking on the map() path. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_pgtable.h | 14 +++ arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 2 + arch/arm64/kvm/hyp/include/nvhe/mm.h | 4 + arch/arm64/kvm/hyp/nvhe/mem_protect.c | 1 - arch/arm64/kvm/hyp/nvhe/mm.c | 94 +++++++++++++++++++ arch/arm64/kvm/hyp/nvhe/setup.c | 4 + arch/arm64/kvm/hyp/pgtable.c | 12 --- 7 files changed, 118 insertions(+), 13 deletions(-) diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h index 4f6d79fe4352..b2a886c9e78d 100644 --- a/arch/arm64/include/asm/kvm_pgtable.h +++ b/arch/arm64/include/asm/kvm_pgtable.h @@ -30,6 +30,8 @@ typedef u64 kvm_pte_t; #define KVM_PTE_ADDR_MASK GENMASK(47, PAGE_SHIFT) #define KVM_PTE_ADDR_51_48 GENMASK(15, 12) +#define KVM_PHYS_INVALID (-1ULL) + static inline bool kvm_pte_valid(kvm_pte_t pte) { return pte & KVM_PTE_VALID; @@ -45,6 +47,18 @@ static inline u64 kvm_pte_to_phys(kvm_pte_t pte) return pa; } +static inline kvm_pte_t kvm_phys_to_pte(u64 pa) +{ + kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK; + + if (PAGE_SHIFT == 16) { + pa &= GENMASK(51, 48); + pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); + } + + return pte; +} + static inline u64 kvm_granule_shift(u32 level) { /* Assumes KVM_PGTABLE_MAX_LEVELS is 4 */ diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index ce9a796a85ee..ef31a1872c93 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -59,6 +59,8 @@ enum pkvm_component_id { PKVM_ID_HYP, }; +extern unsigned long hyp_nr_cpus; + int __pkvm_prot_finalize(void); int __pkvm_host_share_hyp(u64 pfn); int __pkvm_host_unshare_hyp(u64 pfn); diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h b/arch/arm64/kvm/hyp/include/nvhe/mm.h index b2ee6d5df55b..d5ec972b5c1e 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mm.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h @@ -13,6 +13,10 @@ extern struct kvm_pgtable pkvm_pgtable; extern hyp_spinlock_t pkvm_pgd_lock; +int hyp_create_pcpu_fixmap(void); +void *hyp_fixmap_map(phys_addr_t phys); +void hyp_fixmap_unmap(void); + int hyp_create_idmap(u32 hyp_va_bits); int hyp_map_vectors(void); int hyp_back_vmemmap(phys_addr_t back); diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 2ef6aaa21ba5..1c38451050e5 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -21,7 +21,6 @@ #define KVM_HOST_S2_FLAGS (KVM_PGTABLE_S2_NOFWB | KVM_PGTABLE_S2_IDMAP) -extern unsigned long hyp_nr_cpus; struct host_mmu host_mmu; static struct hyp_pool host_s2_pool; diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c index d3a3b47181de..b77215630d5c 100644 --- a/arch/arm64/kvm/hyp/nvhe/mm.c +++ b/arch/arm64/kvm/hyp/nvhe/mm.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -25,6 +26,12 @@ unsigned int hyp_memblock_nr; static u64 __io_map_base; +struct hyp_fixmap_slot { + u64 addr; + kvm_pte_t *ptep; +}; +static DEFINE_PER_CPU(struct hyp_fixmap_slot, fixmap_slots); + static int __pkvm_create_mappings(unsigned long start, unsigned long size, unsigned long phys, enum kvm_pgtable_prot prot) { @@ -212,6 +219,93 @@ int hyp_map_vectors(void) return 0; } +void *hyp_fixmap_map(phys_addr_t phys) +{ + struct hyp_fixmap_slot *slot = this_cpu_ptr(&fixmap_slots); + kvm_pte_t pte, *ptep = slot->ptep; + + pte = *ptep; + pte &= ~kvm_phys_to_pte(KVM_PHYS_INVALID); + pte |= kvm_phys_to_pte(phys) | KVM_PTE_VALID; + WRITE_ONCE(*ptep, pte); + dsb(nshst); + + return (void *)slot->addr; +} + +static void fixmap_clear_slot(struct hyp_fixmap_slot *slot) +{ + kvm_pte_t *ptep = slot->ptep; + u64 addr = slot->addr; + + WRITE_ONCE(*ptep, *ptep & ~KVM_PTE_VALID); + dsb(nshst); + __tlbi_level(vale2, __TLBI_VADDR(addr, 0), (KVM_PGTABLE_MAX_LEVELS - 1)); + dsb(nsh); + isb(); +} + +void hyp_fixmap_unmap(void) +{ + fixmap_clear_slot(this_cpu_ptr(&fixmap_slots)); +} + +static int __create_fixmap_slot_cb(u64 addr, u64 end, u32 level, kvm_pte_t *ptep, + enum kvm_pgtable_walk_flags flag, + void * const arg) +{ + struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)arg); + + if (!kvm_pte_valid(*ptep) || level != KVM_PGTABLE_MAX_LEVELS - 1) + return -EINVAL; + + slot->addr = addr; + slot->ptep = ptep; + + /* + * Clear the PTE, but keep the page-table page refcount elevated to + * prevent it from ever being freed. This lets us manipulate the PTEs + * by hand safely without ever needing to allocate memory. + */ + fixmap_clear_slot(slot); + + return 0; +} + +static int create_fixmap_slot(u64 addr, u64 cpu) +{ + struct kvm_pgtable_walker walker = { + .cb = __create_fixmap_slot_cb, + .flags = KVM_PGTABLE_WALK_LEAF, + .arg = (void *)cpu, + }; + + return kvm_pgtable_walk(&pkvm_pgtable, addr, PAGE_SIZE, &walker); +} + +int hyp_create_pcpu_fixmap(void) +{ + unsigned long addr, i; + int ret; + + for (i = 0; i < hyp_nr_cpus; i++) { + ret = pkvm_alloc_private_va_range(PAGE_SIZE, &addr); + if (ret) + return ret; + + ret = kvm_pgtable_hyp_map(&pkvm_pgtable, addr, PAGE_SIZE, + __hyp_pa(__hyp_bss_start), PAGE_HYP); + if (ret) + return ret; + + ret = create_fixmap_slot(addr, i); + if (ret) + return ret; + } + + return 0; +} + int hyp_create_idmap(u32 hyp_va_bits) { unsigned long start, end; diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 2be72fbe7279..0f69c1393416 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -321,6 +321,10 @@ void __noreturn __pkvm_init_finalise(void) if (ret) goto out; + ret = hyp_create_pcpu_fixmap(); + if (ret) + goto out; + pkvm_hyp_vm_table_init(vm_table_base); out: /* diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index a1a27f88a312..2bcb2d5903ba 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -57,8 +57,6 @@ struct kvm_pgtable_walk_data { u64 end; }; -#define KVM_PHYS_INVALID (-1ULL) - static bool kvm_phys_is_valid(u64 phys) { return phys < BIT(id_aa64mmfr0_parange_to_phys_shift(ID_AA64MMFR0_EL1_PARANGE_MAX)); @@ -122,16 +120,6 @@ static bool kvm_pte_table(kvm_pte_t pte, u32 level) return FIELD_GET(KVM_PTE_TYPE, pte) == KVM_PTE_TYPE_TABLE; } -static kvm_pte_t kvm_phys_to_pte(u64 pa) -{ - kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK; - - if (PAGE_SHIFT == 16) - pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); - - return pte; -} - static kvm_pte_t *kvm_pte_follow(kvm_pte_t pte, struct kvm_pgtable_mm_ops *mm_ops) { return mm_ops->phys_to_virt(kvm_pte_to_phys(pte)); From patchwork Mon Oct 17 11:51:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008681 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 097BFC433FE for ; Mon, 17 Oct 2022 11:53:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230253AbiJQLxt (ORCPT ); Mon, 17 Oct 2022 07:53:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32852 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229739AbiJQLx3 (ORCPT ); Mon, 17 Oct 2022 07:53:29 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C12C85925A for ; Mon, 17 Oct 2022 04:53:18 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E3099B81632 for ; Mon, 17 Oct 2022 11:53:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 89E11C433C1; Mon, 17 Oct 2022 11:53:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007595; bh=8MrW/qLGdmTas2m8Su33Q0MiR8j5vDsbuxe/SRO/NHI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HC9l9SOlqCT1zh2ZPmSsOc9VY2WSS1E0LM4rUc1YqB7n4PlEC1C7V53ZmqzhaAxfa p9vBU1lEh9wjkEi6uHyM5q7I8QDT43W2tYU/UDPhc1sSpYjzaiGnbH3WSbJUFMg4B7 RgHdMgnLFtYt5Yf/oSX2jYM4FB7GFhOr6t2MK4uq+sdEQtAqHMkXQxLZQRJdGCgAaO /jpXTZL45feWda67v3lnLVaybsgMKfV6pLO8TZ5QrFz8BQkZ0XnyQp7iKTWUAz5I+D MsgZBsPYObVZxvm5eslA0DKDrqafBR5CrRemyMvL07BGlEIixSAu8A4Qd7MRI3yXCr UNfr4d5xCWrfg== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 15/25] KVM: arm64: Initialise hypervisor copies of host symbols unconditionally Date: Mon, 17 Oct 2022 12:51:59 +0100 Message-Id: <20221017115209.2099-16-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The nVHE object at EL2 maintains its own copies of some host variables so that, when pKVM is enabled, the host cannot directly modify the hypervisor state. When running in normal nVHE mode, however, these variables are still mirrored at EL2 but are not initialised. Initialise the hypervisor symbols from the host copies regardless of pKVM, ensuring that any reference to this data at EL2 with normal nVHE will return a sensibly initialised value. Tested-by: Vincent Donnefort Signed-off-by: Will Deacon Reviewed-by: Philippe Mathieu-Daudé --- arch/arm64/kvm/arm.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 30d6fc5d3a93..584626e11797 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1884,11 +1884,8 @@ static int do_pkvm_init(u32 hyp_va_bits) return ret; } -static int kvm_hyp_init_protection(u32 hyp_va_bits) +static void kvm_hyp_init_symbols(void) { - void *addr = phys_to_virt(hyp_mem_base); - int ret; - kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1); kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1); kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1); @@ -1897,6 +1894,12 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits) kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1); +} + +static int kvm_hyp_init_protection(u32 hyp_va_bits) +{ + void *addr = phys_to_virt(hyp_mem_base); + int ret; ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP); if (ret) @@ -2071,6 +2074,8 @@ static int init_hyp_mode(void) cpu_prepare_hyp_mode(cpu); } + kvm_hyp_init_symbols(); + if (is_protected_kvm_enabled()) { init_cpu_logical_map(); @@ -2078,9 +2083,7 @@ static int init_hyp_mode(void) err = -ENODEV; goto out_err; } - } - if (is_protected_kvm_enabled()) { err = kvm_hyp_init_protection(hyp_va_bits); if (err) { kvm_err("Failed to init hyp memory protection\n"); From patchwork Mon Oct 17 11:52:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 345D3C433FE for ; Mon, 17 Oct 2022 11:53:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230292AbiJQLx1 (ORCPT ); Mon, 17 Oct 2022 07:53:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60952 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229739AbiJQLxZ (ORCPT ); Mon, 17 Oct 2022 07:53:25 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0B7059724 for ; Mon, 17 Oct 2022 04:53:20 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id A5D10610A3 for ; Mon, 17 Oct 2022 11:53:19 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 13AEFC4347C; Mon, 17 Oct 2022 11:53:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007599; bh=M/4MPZmrF3Xl70G5UyY6515GYKhKwEoLZXAGqLSMA34=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=u/CdCPgQGnwuaTuxakgdzbxNj0v8eLVsfYLdmxmp8SaiMouEUGMKu0TD6a+VEW9j5 0v4EW5h/Y0q+j1suho8GB+wrXnmvveoFyCPLvFXebDx/Wr/AvbF2g2IpP5pvTQtAYM hiLacCF8eOeObUETT1kgy4/gN7+5OKrXcAdq9B+/wiVqmq3QaIlA9vovK9iBfOZJkY iHRJTXBFdb049K+xqf4EH0YcfAykIliqLrcgw+rmplneBkXIKF+HIz/egIECbXmnAa nJzkOZr2eEKn3tHq6socY5JFcXONaYJguy0qr+E9gY7lfnb8vnSs/iRcQ/KJ6nDfP6 1rENzzTadFvvw== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 16/25] KVM: arm64: Provide I-cache invalidation by virtual address at EL2 Date: Mon, 17 Oct 2022 12:52:00 +0100 Message-Id: <20221017115209.2099-17-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In preparation for handling cache maintenance of guest pages from within the pKVM hypervisor at EL2, introduce an EL2 copy of icache_inval_pou() which will later be plumbed into the stage-2 page-table cache maintenance callbacks, ensuring that the initial contents of pages mapped as executable into the guest stage-2 page-table is visible to the instruction fetcher. Tested-by: Vincent Donnefort Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_hyp.h | 1 + arch/arm64/kernel/image-vars.h | 3 --- arch/arm64/kvm/arm.c | 1 + arch/arm64/kvm/hyp/nvhe/cache.S | 11 +++++++++++ arch/arm64/kvm/hyp/nvhe/pkvm.c | 3 +++ 5 files changed, 16 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index aa7fa2a08f06..fd99cf09972d 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -123,4 +123,5 @@ extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val); extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val); extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val); +extern unsigned long kvm_nvhe_sym(__icache_flags); #endif /* __ARM64_KVM_HYP_H__ */ diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index 8151412653de..7f4e43bfaade 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -71,9 +71,6 @@ KVM_NVHE_ALIAS(nvhe_hyp_panic_handler); /* Vectors installed by hyp-init on reset HVC. */ KVM_NVHE_ALIAS(__hyp_stub_vectors); -/* Kernel symbol used by icache_is_vpipt(). */ -KVM_NVHE_ALIAS(__icache_flags); - /* VMID bits set by the KVM VMID allocator */ KVM_NVHE_ALIAS(kvm_arm_vmid_bits); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 584626e11797..d99e93e6ddf7 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1894,6 +1894,7 @@ static void kvm_hyp_init_symbols(void) kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1); + kvm_nvhe_sym(__icache_flags) = __icache_flags; } static int kvm_hyp_init_protection(u32 hyp_va_bits) diff --git a/arch/arm64/kvm/hyp/nvhe/cache.S b/arch/arm64/kvm/hyp/nvhe/cache.S index 0c367eb5f4e2..85936c17ae40 100644 --- a/arch/arm64/kvm/hyp/nvhe/cache.S +++ b/arch/arm64/kvm/hyp/nvhe/cache.S @@ -12,3 +12,14 @@ SYM_FUNC_START(__pi_dcache_clean_inval_poc) ret SYM_FUNC_END(__pi_dcache_clean_inval_poc) SYM_FUNC_ALIAS(dcache_clean_inval_poc, __pi_dcache_clean_inval_poc) + +SYM_FUNC_START(__pi_icache_inval_pou) +alternative_if ARM64_HAS_CACHE_DIC + isb + ret +alternative_else_nop_endif + + invalidate_icache_by_line x0, x1, x2, x3 + ret +SYM_FUNC_END(__pi_icache_inval_pou) +SYM_FUNC_ALIAS(icache_inval_pou, __pi_icache_inval_pou) diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index 5970fd1adbfb..2156ce546d90 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -12,6 +12,9 @@ #include #include +/* Used by icache_is_vpipt(). */ +unsigned long __icache_flags; + /* * Set trap register values based on features in ID_AA64PFR0. */ From patchwork Mon Oct 17 11:52:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72D7EC4332F for ; Mon, 17 Oct 2022 11:53:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229968AbiJQLxv (ORCPT ); Mon, 17 Oct 2022 07:53:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230371AbiJQLxj (ORCPT ); Mon, 17 Oct 2022 07:53:39 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 93C505A151 for ; Mon, 17 Oct 2022 04:53:24 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2C3A56109E for ; Mon, 17 Oct 2022 11:53:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8FEDFC433C1; Mon, 17 Oct 2022 11:53:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007602; bh=ZLSfew7ixQDK0MJ4/GsqTNTelBXSBHLYEIoBcTsc+iQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=u3D+8xEC4Suh5OAuAlSLjYJHM0Xx5mk8LnF9KSZCGQtRiynnSBXuCNf8CZyq7SsVI yQQ1oT1KeFSL6PLzvnJ27TA2iOndG2FVf15qrbRi8ADFmfjLSg/UfdZ+6yzqpJr5Wc 839H217eITPtH7Cqq2+QmuDCh/R3PGSPJ0Kdh+OzIZ2TW38qlCnQe1rodKxBQFvazL tktOWSkpJdA+Hvt8Euk2BH5HK16V6ldr6xf0BvKoBgyRcgO8BpqEQiQWMxVkqXAMO2 DMp7LfBNDdeuCf8easeHDmY46Hzfu/+lbRrMqNqgGAs7Zism6UvxK943aBL72ZYTNb 6U+r1BBoNXXEg== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 17/25] KVM: arm64: Add generic hyp_memcache helpers Date: Mon, 17 Oct 2022 12:52:01 +0100 Message-Id: <20221017115209.2099-18-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret The host at EL1 and the pKVM hypervisor at EL2 will soon need to exchange memory pages dynamically for creating and destroying VM state. Indeed, the hypervisor will rely on the host to donate memory pages it can use to create guest stage-2 page-tables and to store VM and vCPU metadata. In order to ease this process, introduce a 'struct hyp_memcache' which is essentially a linked list of available pages, indexed by physical addresses so that it can be passed meaningfully between the different virtual address spaces configured at EL1 and EL2. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_host.h | 57 +++++++++++++++++++ arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 2 + arch/arm64/kvm/hyp/nvhe/mm.c | 33 +++++++++++ arch/arm64/kvm/mmu.c | 26 +++++++++ 4 files changed, 118 insertions(+) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index f09b422ba1fe..2ec268cc2031 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -73,6 +73,63 @@ u32 __attribute_const__ kvm_target_cpu(void); int kvm_reset_vcpu(struct kvm_vcpu *vcpu); void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu); +struct kvm_hyp_memcache { + phys_addr_t head; + unsigned long nr_pages; +}; + +static inline void push_hyp_memcache(struct kvm_hyp_memcache *mc, + phys_addr_t *p, + phys_addr_t (*to_pa)(void *virt)) +{ + *p = mc->head; + mc->head = to_pa(p); + mc->nr_pages++; +} + +static inline void *pop_hyp_memcache(struct kvm_hyp_memcache *mc, + void *(*to_va)(phys_addr_t phys)) +{ + phys_addr_t *p = to_va(mc->head); + + if (!mc->nr_pages) + return NULL; + + mc->head = *p; + mc->nr_pages--; + + return p; +} + +static inline int __topup_hyp_memcache(struct kvm_hyp_memcache *mc, + unsigned long min_pages, + void *(*alloc_fn)(void *arg), + phys_addr_t (*to_pa)(void *virt), + void *arg) +{ + while (mc->nr_pages < min_pages) { + phys_addr_t *p = alloc_fn(arg); + + if (!p) + return -ENOMEM; + push_hyp_memcache(mc, p, to_pa); + } + + return 0; +} + +static inline void __free_hyp_memcache(struct kvm_hyp_memcache *mc, + void (*free_fn)(void *virt, void *arg), + void *(*to_va)(phys_addr_t phys), + void *arg) +{ + while (mc->nr_pages) + free_fn(pop_hyp_memcache(mc, to_va), arg); +} + +void free_hyp_memcache(struct kvm_hyp_memcache *mc); +int topup_hyp_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages); + struct kvm_vmid { atomic64_t id; }; diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index ef31a1872c93..420b87e755a4 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -77,6 +77,8 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt); int hyp_pin_shared_mem(void *from, void *to); void hyp_unpin_shared_mem(void *from, void *to); void reclaim_guest_pages(struct pkvm_hyp_vm *vm); +int refill_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages, + struct kvm_hyp_memcache *host_mc); static __always_inline void __load_host_stage2(void) { diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c index b77215630d5c..c73f87f13e29 100644 --- a/arch/arm64/kvm/hyp/nvhe/mm.c +++ b/arch/arm64/kvm/hyp/nvhe/mm.c @@ -330,3 +330,36 @@ int hyp_create_idmap(u32 hyp_va_bits) return __pkvm_create_mappings(start, end - start, start, PAGE_HYP_EXEC); } + +static void *admit_host_page(void *arg) +{ + struct kvm_hyp_memcache *host_mc = arg; + + if (!host_mc->nr_pages) + return NULL; + + /* + * The host still owns the pages in its memcache, so we need to go + * through a full host-to-hyp donation cycle to change it. Fortunately, + * __pkvm_host_donate_hyp() takes care of races for us, so if it + * succeeds we're good to go. + */ + if (__pkvm_host_donate_hyp(hyp_phys_to_pfn(host_mc->head), 1)) + return NULL; + + return pop_hyp_memcache(host_mc, hyp_phys_to_virt); +} + +/* Refill our local memcache by poping pages from the one provided by the host. */ +int refill_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages, + struct kvm_hyp_memcache *host_mc) +{ + struct kvm_hyp_memcache tmp = *host_mc; + int ret; + + ret = __topup_hyp_memcache(mc, min_pages, admit_host_page, + hyp_virt_to_phys, &tmp); + *host_mc = tmp; + + return ret; +} diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 34c5feed9dc1..8b52566d1cb9 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -800,6 +800,32 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu) } } +static void hyp_mc_free_fn(void *addr, void *unused) +{ + free_page((unsigned long)addr); +} + +static void *hyp_mc_alloc_fn(void *unused) +{ + return (void *)__get_free_page(GFP_KERNEL_ACCOUNT); +} + +void free_hyp_memcache(struct kvm_hyp_memcache *mc) +{ + if (is_protected_kvm_enabled()) + __free_hyp_memcache(mc, hyp_mc_free_fn, + kvm_host_va, NULL); +} + +int topup_hyp_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages) +{ + if (!is_protected_kvm_enabled()) + return 0; + + return __topup_hyp_memcache(mc, min_pages, hyp_mc_alloc_fn, + kvm_host_pa, NULL); +} + /** * kvm_phys_addr_ioremap - map a device range to guest IPA * From patchwork Mon Oct 17 11:52:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008684 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A1B3C43219 for ; Mon, 17 Oct 2022 11:53:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230308AbiJQLxw (ORCPT ); Mon, 17 Oct 2022 07:53:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33024 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230424AbiJQLxn (ORCPT ); Mon, 17 Oct 2022 07:53:43 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3336D4F18C for ; Mon, 17 Oct 2022 04:53:30 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 616A2B812A1 for ; Mon, 17 Oct 2022 11:53:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 16964C433D7; Mon, 17 Oct 2022 11:53:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007606; bh=KB1YnU8GUx5aSSC2hnyONAfr/+E8XnKyRJarXcq08IA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MUDOIfiGAXRhuYzXCXLV6lGHQmNsRx+1JsZcw7NCW3LFJxtGywiQhOStt0rmZeTku hgArWwzOKpuJVZGFBnneB15Su4jb6kya7Z5nLHzAQeMALLPsQkPPpqIsi3zVK5FABc cO9LmknFz3O7mlU0iiQeU0853Sr/vt1VULMtFx7YuIlmXvdu+mwY0f0R5ivnURxRsq xckCb/L4aknSeTeSvBGbaDgSWKP6SWfLoqGXf7S8Hk+842eSKSwQa8FAyMuzLlHAsi HtpKytJ7PQIaxTd5DpxSDs/AVTyGdwln3z8b5NwywshDZ1a9qBpaph49aquy8jx5GB i5A9aHNsyuTCQ== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 18/25] KVM: arm64: Consolidate stage-2 initialisation into a single function Date: Mon, 17 Oct 2022 12:52:02 +0100 Message-Id: <20221017115209.2099-19-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret The initialisation of guest stage-2 page-tables is currently split across two functions: kvm_init_stage2_mmu() and kvm_arm_setup_stage2(). That is presumably for historical reasons as kvm_arm_setup_stage2() originates from the (now defunct) KVM port for 32-bit Arm. Simplify this code path by merging both functions into one, taking care to map the 'struct kvm' into the hypervisor stage-1 early on in order to simplify the failure path. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_arm.h | 2 +- arch/arm64/include/asm/kvm_host.h | 2 -- arch/arm64/include/asm/kvm_mmu.h | 2 +- arch/arm64/kvm/arm.c | 27 +++++++++++++-------------- arch/arm64/kvm/mmu.c | 27 ++++++++++++++++++++++++++- arch/arm64/kvm/reset.c | 29 ----------------------------- 6 files changed, 41 insertions(+), 48 deletions(-) diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h index 8aa8492dafc0..89e63585dae4 100644 --- a/arch/arm64/include/asm/kvm_arm.h +++ b/arch/arm64/include/asm/kvm_arm.h @@ -135,7 +135,7 @@ * 40 bits wide (T0SZ = 24). Systems with a PARange smaller than 40 bits are * not known to exist and will break with this configuration. * - * The VTCR_EL2 is configured per VM and is initialised in kvm_arm_setup_stage2(). + * The VTCR_EL2 is configured per VM and is initialised in kvm_init_stage2_mmu. * * Note that when using 4K pages, we concatenate two first level page tables * together. With 16K pages, we concatenate 16 first level page tables. diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 2ec268cc2031..3fa973237b90 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -991,8 +991,6 @@ int kvm_set_ipa_limit(void); #define __KVM_HAVE_ARCH_VM_ALLOC struct kvm *kvm_arch_alloc_vm(void); -int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type); - static inline bool kvm_vm_is_protected(struct kvm *kvm) { return false; diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 7784081088e7..e4a7e6369499 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -166,7 +166,7 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size, void free_hyp_pgds(void); void stage2_unmap_vm(struct kvm *kvm); -int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu); +int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type); void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu); int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, phys_addr_t pa, unsigned long size, bool writable); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index d99e93e6ddf7..f78eefa02f6b 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -139,28 +139,24 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { int ret; - ret = kvm_arm_setup_stage2(kvm, type); - if (ret) - return ret; - - ret = kvm_init_stage2_mmu(kvm, &kvm->arch.mmu); - if (ret) - return ret; - ret = kvm_share_hyp(kvm, kvm + 1); if (ret) - goto out_free_stage2_pgd; + return ret; ret = pkvm_init_host_vm(kvm); if (ret) - goto out_free_stage2_pgd; + goto err_unshare_kvm; if (!zalloc_cpumask_var(&kvm->arch.supported_cpus, GFP_KERNEL)) { ret = -ENOMEM; - goto out_free_stage2_pgd; + goto err_unshare_kvm; } cpumask_copy(kvm->arch.supported_cpus, cpu_possible_mask); + ret = kvm_init_stage2_mmu(kvm, &kvm->arch.mmu, type); + if (ret) + goto err_free_cpumask; + kvm_vgic_early_init(kvm); /* The maximum number of VCPUs is limited by the host's GIC model */ @@ -169,9 +165,12 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) set_default_spectre(kvm); kvm_arm_init_hypercalls(kvm); - return ret; -out_free_stage2_pgd: - kvm_free_stage2_pgd(&kvm->arch.mmu); + return 0; + +err_free_cpumask: + free_cpumask_var(kvm->arch.supported_cpus); +err_unshare_kvm: + kvm_unshare_hyp(kvm, kvm + 1); return ret; } diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 8b52566d1cb9..43761d31f763 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -668,15 +668,40 @@ static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = { * kvm_init_stage2_mmu - Initialise a S2 MMU structure * @kvm: The pointer to the KVM structure * @mmu: The pointer to the s2 MMU structure + * @type: The machine type of the virtual machine * * Allocates only the stage-2 HW PGD level table(s). * Note we don't need locking here as this is only called when the VM is * created, which can only be done once. */ -int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu) +int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type) { + u32 kvm_ipa_limit = get_kvm_ipa_limit(); int cpu, err; struct kvm_pgtable *pgt; + u64 mmfr0, mmfr1; + u32 phys_shift; + + if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK) + return -EINVAL; + + phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type); + if (phys_shift) { + if (phys_shift > kvm_ipa_limit || + phys_shift < ARM64_MIN_PARANGE_BITS) + return -EINVAL; + } else { + phys_shift = KVM_PHYS_SHIFT; + if (phys_shift > kvm_ipa_limit) { + pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n", + current->comm); + return -EINVAL; + } + } + + mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); + mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); + kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift); if (mmu->pgt != NULL) { kvm_err("kvm_arch already initialized?\n"); diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 5ae18472205a..e0267f672b8a 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -395,32 +395,3 @@ int kvm_set_ipa_limit(void) return 0; } - -int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type) -{ - u64 mmfr0, mmfr1; - u32 phys_shift; - - if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK) - return -EINVAL; - - phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type); - if (phys_shift) { - if (phys_shift > kvm_ipa_limit || - phys_shift < ARM64_MIN_PARANGE_BITS) - return -EINVAL; - } else { - phys_shift = KVM_PHYS_SHIFT; - if (phys_shift > kvm_ipa_limit) { - pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n", - current->comm); - return -EINVAL; - } - } - - mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1); - mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); - kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift); - - return 0; -} From patchwork Mon Oct 17 11:52:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008682 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10A56C4321E for ; Mon, 17 Oct 2022 11:53:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230298AbiJQLxu (ORCPT ); Mon, 17 Oct 2022 07:53:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33026 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230387AbiJQLxj (ORCPT ); Mon, 17 Oct 2022 07:53:39 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D784C53D39 for ; Mon, 17 Oct 2022 04:53:31 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 394ED6104D for ; Mon, 17 Oct 2022 11:53:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9CFA8C43470; Mon, 17 Oct 2022 11:53:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007609; bh=T6Dlfxy8O91K1JKpLz6sE+xW8rmJQg8AX8DmdyH/bmM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HFZSmLBTIzDFrQCIsjVoVF/2Zb0nUvR4aBSqK72+1FgH58TGW+TMjtP7Xy/+9TliZ haISRuchdcwFixDQQusp3b4Pb+eFPwqXUhr2bTBYBoiebGOzi8eOTOxYTzG1yn33O+ llZPVM8K/Km/sM1V/8v7j2nO4qfH9yTkvqzEeBSLPLhH1Qhyrqr7v166baxjfC5Lko xhbTa1rRAi2Xm3YXJRGqKcQ7nYCN6xJAkEeY0f3O12a7Afbz4xXJuG+0qLtFQFR/F1 6dqoNJTlRhFCzmhrUuKLPYXP2n984QxmSF14TLgYrrxY6GiguN5BPwOzaeY1FUP5l/ fNIOcCHb0XyRQ== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 19/25] KVM: arm64: Instantiate guest stage-2 page-tables at EL2 Date: Mon, 17 Oct 2022 12:52:03 +0100 Message-Id: <20221017115209.2099-20-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret Extend the initialisation of guest data structures within the pKVM hypervisor at EL2 so that we instantiate a memory pool and a full 'struct kvm_s2_mmu' structure for each VM, with a stage-2 page-table entirely independent from the one managed by the host at EL1. The 'struct kvm_pgtable_mm_ops' used by the page-table code is populated with a set of callbacks that can manage guest pages in the hypervisor without any direct intervention from the host, allocating page-table pages from the provided pool and returning these to the host on VM teardown. To keep things simple, the stage-2 MMU for the guest is configured identically to the host stage-2 in the VTCR register and so the IPA size of the guest must match the PA size of the host. For now, the new page-table is unused as there is no way for the host to map anything into it. Yet. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/kvm/hyp/include/nvhe/pkvm.h | 6 ++ arch/arm64/kvm/hyp/nvhe/mem_protect.c | 125 ++++++++++++++++++++++++- arch/arm64/kvm/mmu.c | 4 +- 3 files changed, 132 insertions(+), 3 deletions(-) diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h index d176399dbfb5..5d456438445c 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h +++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h @@ -9,6 +9,9 @@ #include +#include +#include + /* * Holds the relevant data for maintaining the vcpu state completely at hyp. */ @@ -36,6 +39,9 @@ struct pkvm_hyp_vm { /* The guest's stage-2 page-table managed by the hypervisor. */ struct kvm_pgtable pgt; + struct kvm_pgtable_mm_ops mm_ops; + struct hyp_pool pool; + hyp_spinlock_t lock; /* * The number of vcpus initialized and ready to run. diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 1c38451050e5..27b16a6b85bb 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -25,6 +25,21 @@ struct host_mmu host_mmu; static struct hyp_pool host_s2_pool; +static DEFINE_PER_CPU(struct pkvm_hyp_vm *, __current_vm); +#define current_vm (*this_cpu_ptr(&__current_vm)) + +static void guest_lock_component(struct pkvm_hyp_vm *vm) +{ + hyp_spin_lock(&vm->lock); + current_vm = vm; +} + +static void guest_unlock_component(struct pkvm_hyp_vm *vm) +{ + current_vm = NULL; + hyp_spin_unlock(&vm->lock); +} + static void host_lock_component(void) { hyp_spin_lock(&host_mmu.lock); @@ -140,18 +155,124 @@ int kvm_host_prepare_stage2(void *pgt_pool_base) return 0; } +static bool guest_stage2_force_pte_cb(u64 addr, u64 end, + enum kvm_pgtable_prot prot) +{ + return true; +} + +static void *guest_s2_zalloc_pages_exact(size_t size) +{ + void *addr = hyp_alloc_pages(¤t_vm->pool, get_order(size)); + + WARN_ON(size != (PAGE_SIZE << get_order(size))); + hyp_split_page(hyp_virt_to_page(addr)); + + return addr; +} + +static void guest_s2_free_pages_exact(void *addr, unsigned long size) +{ + u8 order = get_order(size); + unsigned int i; + + for (i = 0; i < (1 << order); i++) + hyp_put_page(¤t_vm->pool, addr + (i * PAGE_SIZE)); +} + +static void *guest_s2_zalloc_page(void *mc) +{ + struct hyp_page *p; + void *addr; + + addr = hyp_alloc_pages(¤t_vm->pool, 0); + if (addr) + return addr; + + addr = pop_hyp_memcache(mc, hyp_phys_to_virt); + if (!addr) + return addr; + + memset(addr, 0, PAGE_SIZE); + p = hyp_virt_to_page(addr); + memset(p, 0, sizeof(*p)); + p->refcount = 1; + + return addr; +} + +static void guest_s2_get_page(void *addr) +{ + hyp_get_page(¤t_vm->pool, addr); +} + +static void guest_s2_put_page(void *addr) +{ + hyp_put_page(¤t_vm->pool, addr); +} + +static void clean_dcache_guest_page(void *va, size_t size) +{ + __clean_dcache_guest_page(hyp_fixmap_map(__hyp_pa(va)), size); + hyp_fixmap_unmap(); +} + +static void invalidate_icache_guest_page(void *va, size_t size) +{ + __invalidate_icache_guest_page(hyp_fixmap_map(__hyp_pa(va)), size); + hyp_fixmap_unmap(); +} + int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd) { - vm->pgt.pgd = pgd; + struct kvm_s2_mmu *mmu = &vm->kvm.arch.mmu; + unsigned long nr_pages; + int ret; + + nr_pages = kvm_pgtable_stage2_pgd_size(vm->kvm.arch.vtcr) >> PAGE_SHIFT; + ret = hyp_pool_init(&vm->pool, hyp_virt_to_pfn(pgd), nr_pages, 0); + if (ret) + return ret; + + hyp_spin_lock_init(&vm->lock); + vm->mm_ops = (struct kvm_pgtable_mm_ops) { + .zalloc_pages_exact = guest_s2_zalloc_pages_exact, + .free_pages_exact = guest_s2_free_pages_exact, + .zalloc_page = guest_s2_zalloc_page, + .phys_to_virt = hyp_phys_to_virt, + .virt_to_phys = hyp_virt_to_phys, + .page_count = hyp_page_count, + .get_page = guest_s2_get_page, + .put_page = guest_s2_put_page, + .dcache_clean_inval_poc = clean_dcache_guest_page, + .icache_inval_pou = invalidate_icache_guest_page, + }; + + guest_lock_component(vm); + ret = __kvm_pgtable_stage2_init(mmu->pgt, mmu, &vm->mm_ops, 0, + guest_stage2_force_pte_cb); + guest_unlock_component(vm); + if (ret) + return ret; + + vm->kvm.arch.mmu.pgd_phys = __hyp_pa(vm->pgt.pgd); + return 0; } void reclaim_guest_pages(struct pkvm_hyp_vm *vm) { + void *pgd = vm->pgt.pgd; unsigned long nr_pages; nr_pages = kvm_pgtable_stage2_pgd_size(vm->kvm.arch.vtcr) >> PAGE_SHIFT; - WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(vm->pgt.pgd), nr_pages)); + + guest_lock_component(vm); + kvm_pgtable_stage2_destroy(&vm->pgt); + vm->kvm.arch.mmu.pgd_phys = 0ULL; + guest_unlock_component(vm); + + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(pgd), nr_pages)); } int __pkvm_prot_finalize(void) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 43761d31f763..301fc7275062 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -686,7 +686,9 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t return -EINVAL; phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type); - if (phys_shift) { + if (is_protected_kvm_enabled()) { + phys_shift = kvm_ipa_limit; + } else if (phys_shift) { if (phys_shift > kvm_ipa_limit || phys_shift < ARM64_MIN_PARANGE_BITS) return -EINVAL; From patchwork Mon Oct 17 11:52:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008685 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B967C433FE for ; Mon, 17 Oct 2022 11:53:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230283AbiJQLxy (ORCPT ); Mon, 17 Oct 2022 07:53:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230447AbiJQLxo (ORCPT ); Mon, 17 Oct 2022 07:53:44 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1381C58160 for ; Mon, 17 Oct 2022 04:53:34 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AA1DA610A2 for ; Mon, 17 Oct 2022 11:53:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 206B2C433C1; Mon, 17 Oct 2022 11:53:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007613; bh=cVjnXG+VZIsd85yZ1IzAApVTTfptVQKo/qo+eCzveO8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=N3jH6L4mt/kLaZZKR5Gm9fA7OmaEYX3aJwpAjZ58ZS3gRoCAwfFCxwZOZ5akqF1qR rvzFz9i14hs8NP74z8v5ogClaPSKJUSXTYG7duoYh8pEuj6XM+5U5WcYuA7BTAtvaG i7PfZOxmU/ZYiDuGyMVVsC3P8dsgEfLtOW4d/aiYcvUvVJ2FeXXFLduUetzmXOQoGG dZJ6fHCOXld0G3Y6/KKxOXHHeDGs7IlzLIOXMbCKz0JspatsEP/EUHQDykX/cZ7o+E D/lWf1stTijB8oq2uv+B/DdjuyNPGX0zm5I568zGFvbdmmFZq2R7Dh+APBrERyzkQ4 xIz8XDzdG4rNw== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 20/25] KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache Date: Mon, 17 Oct 2022 12:52:04 +0100 Message-Id: <20221017115209.2099-21-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret Rather than relying on the host to free the previously-donated pKVM hypervisor VM pages explicitly on teardown, introduce a dedicated teardown memcache which allows the host to reclaim guest memory resources without having to keep track of all of the allocations made by the pKVM hypervisor at EL2. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_host.h | 7 +---- arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 2 +- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 17 ++++++---- arch/arm64/kvm/hyp/nvhe/pkvm.c | 20 ++++++++++-- arch/arm64/kvm/pkvm.c | 31 ++++--------------- 5 files changed, 36 insertions(+), 41 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3fa973237b90..22a07823b0a9 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -177,12 +177,7 @@ typedef unsigned int pkvm_handle_t; struct kvm_protected_vm { pkvm_handle_t handle; struct mutex vm_lock; - - struct { - void *pgd; - void *vm; - void *vcpus[KVM_MAX_VCPUS]; - } hyp_donations; + struct kvm_hyp_memcache teardown_mc; }; struct kvm_arch { diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h index 420b87e755a4..b7bdbe63deed 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h @@ -76,7 +76,7 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt); int hyp_pin_shared_mem(void *from, void *to); void hyp_unpin_shared_mem(void *from, void *to); -void reclaim_guest_pages(struct pkvm_hyp_vm *vm); +void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc); int refill_memcache(struct kvm_hyp_memcache *mc, unsigned long min_pages, struct kvm_hyp_memcache *host_mc); diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 27b16a6b85bb..ffa56a89acdb 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -260,19 +260,24 @@ int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd) return 0; } -void reclaim_guest_pages(struct pkvm_hyp_vm *vm) +void reclaim_guest_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc) { - void *pgd = vm->pgt.pgd; - unsigned long nr_pages; - - nr_pages = kvm_pgtable_stage2_pgd_size(vm->kvm.arch.vtcr) >> PAGE_SHIFT; + void *addr; + /* Dump all pgtable pages in the hyp_pool */ guest_lock_component(vm); kvm_pgtable_stage2_destroy(&vm->pgt); vm->kvm.arch.mmu.pgd_phys = 0ULL; guest_unlock_component(vm); - WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(pgd), nr_pages)); + /* Drain the hyp_pool into the memcache */ + addr = hyp_alloc_pages(&vm->pool, 0); + while (addr) { + memset(hyp_virt_to_page(addr), 0, sizeof(struct hyp_page)); + push_hyp_memcache(mc, addr, hyp_virt_to_phys); + WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(addr), 1)); + addr = hyp_alloc_pages(&vm->pool, 0); + } } int __pkvm_prot_finalize(void) diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index 2156ce546d90..f5f300cc5864 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -539,8 +539,21 @@ int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu, return ret; } +static void +teardown_donated_memory(struct kvm_hyp_memcache *mc, void *addr, size_t size) +{ + size = PAGE_ALIGN(size); + memset(addr, 0, size); + + for (void *start = addr; start < addr + size; start += PAGE_SIZE) + push_hyp_memcache(mc, start, hyp_virt_to_phys); + + unmap_donated_memory_noclear(addr, size); +} + int __pkvm_teardown_vm(pkvm_handle_t handle) { + struct kvm_hyp_memcache *mc; struct pkvm_hyp_vm *hyp_vm; unsigned int idx; int err; @@ -563,7 +576,8 @@ int __pkvm_teardown_vm(pkvm_handle_t handle) hyp_spin_unlock(&vm_table_lock); /* Reclaim guest pages (including page-table pages) */ - reclaim_guest_pages(hyp_vm); + mc = &hyp_vm->host_kvm->arch.pkvm.teardown_mc; + reclaim_guest_pages(hyp_vm, mc); unpin_host_vcpus(hyp_vm->vcpus, hyp_vm->nr_vcpus); /* Push the metadata pages to the teardown memcache */ @@ -572,10 +586,10 @@ int __pkvm_teardown_vm(pkvm_handle_t handle) for (idx = 0; idx < hyp_vm->nr_vcpus; ++idx) { struct pkvm_hyp_vcpu *hyp_vcpu = hyp_vm->vcpus[idx]; - unmap_donated_memory(hyp_vcpu, sizeof(*hyp_vcpu)); + teardown_donated_memory(mc, hyp_vcpu, sizeof(*hyp_vcpu)); } - unmap_donated_memory(hyp_vm, hyp_vm->donated_memory_size); + teardown_donated_memory(mc, hyp_vm, hyp_vm->donated_memory_size); return 0; err_unlock: diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c index 754632a608e3..a9953db08592 100644 --- a/arch/arm64/kvm/pkvm.c +++ b/arch/arm64/kvm/pkvm.c @@ -147,8 +147,6 @@ static int __pkvm_create_hyp_vm(struct kvm *host_kvm) handle = ret; host_kvm->arch.pkvm.handle = handle; - host_kvm->arch.pkvm.hyp_donations.pgd = pgd; - host_kvm->arch.pkvm.hyp_donations.vm = hyp_vm; /* Donate memory for the vcpus at hyp and initialize it. */ hyp_vcpu_sz = PAGE_ALIGN(PKVM_HYP_VCPU_SIZE); @@ -167,12 +165,12 @@ static int __pkvm_create_hyp_vm(struct kvm *host_kvm) goto destroy_vm; } - host_kvm->arch.pkvm.hyp_donations.vcpus[idx] = hyp_vcpu; - ret = kvm_call_hyp_nvhe(__pkvm_init_vcpu, handle, host_vcpu, hyp_vcpu); - if (ret) + if (ret) { + free_pages_exact(hyp_vcpu, hyp_vcpu_sz); goto destroy_vm; + } } return 0; @@ -201,30 +199,13 @@ int pkvm_create_hyp_vm(struct kvm *host_kvm) void pkvm_destroy_hyp_vm(struct kvm *host_kvm) { - unsigned long idx, nr_vcpus = host_kvm->created_vcpus; - size_t pgd_sz, hyp_vm_sz; - - if (host_kvm->arch.pkvm.handle) + if (host_kvm->arch.pkvm.handle) { WARN_ON(kvm_call_hyp_nvhe(__pkvm_teardown_vm, host_kvm->arch.pkvm.handle)); - - host_kvm->arch.pkvm.handle = 0; - - for (idx = 0; idx < nr_vcpus; ++idx) { - void *hyp_vcpu = host_kvm->arch.pkvm.hyp_donations.vcpus[idx]; - - if (!hyp_vcpu) - break; - - free_pages_exact(hyp_vcpu, PAGE_ALIGN(PKVM_HYP_VCPU_SIZE)); } - hyp_vm_sz = PAGE_ALIGN(size_add(PKVM_HYP_VM_SIZE, - size_mul(sizeof(void *), nr_vcpus))); - pgd_sz = kvm_pgtable_stage2_pgd_size(host_kvm->arch.vtcr); - - free_pages_exact(host_kvm->arch.pkvm.hyp_donations.vm, hyp_vm_sz); - free_pages_exact(host_kvm->arch.pkvm.hyp_donations.pgd, pgd_sz); + host_kvm->arch.pkvm.handle = 0; + free_hyp_memcache(&host_kvm->arch.pkvm.teardown_mc); } int pkvm_init_host_vm(struct kvm *host_kvm) From patchwork Mon Oct 17 11:52:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008686 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DDC5C4332F for ; Mon, 17 Oct 2022 11:53:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230316AbiJQLxz (ORCPT ); Mon, 17 Oct 2022 07:53:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230304AbiJQLxw (ORCPT ); Mon, 17 Oct 2022 07:53:52 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE01C4E852 for ; Mon, 17 Oct 2022 04:53:40 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E5F8CB81636 for ; Mon, 17 Oct 2022 11:53:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 979C4C4347C; Mon, 17 Oct 2022 11:53:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007616; bh=pvaYXug0hNpmDYywZvDH12QrW2BGInmPiwQ9xyFQG5o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SkDNMkBs1NWVbqfWmLW97LfNvau2M2pNl4pWfHYRDbmrMfQM2FXqyj3+5p8vrhBBn X8969TqLvtLlhY/C81+GG8tRyZ1x9OE5dImsRfvsP9Ni9xO/YtXw4IYqkRxewWgDzH 4WRiee41/rOJm0IXLATdgNIy4s5kRowA6Za2xGIE0XQsrLRk2c/5hznjrIizcUIl4m Pm3gfMSvXG4lCJ80dHUkaqSC1wfFNI1oJniC5E1smHTrWhC6MoTwIshURMrOwIv91M ycJm7CPLRW5H+ZQXcaV3PdxPGPavaKFIDRoO+afF16WtTRMD+EEF2+EPZ5vRvbJrVE 42bBIghdDL26w== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 21/25] KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host Date: Mon, 17 Oct 2022 12:52:05 +0100 Message-Id: <20221017115209.2099-22-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret When pKVM is enabled, the hypervisor at EL2 does not trust the host at EL1 and must therefore prevent it from having unrestricted access to internal hypervisor state. The 'kvm_arm_hyp_percpu_base' array holds the offsets for hypervisor per-cpu allocations, so move this this into the nVHE code where it cannot be modified by the untrusted host at EL1. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_asm.h | 4 ++-- arch/arm64/kernel/image-vars.h | 3 --- arch/arm64/kvm/arm.c | 9 ++++----- arch/arm64/kvm/hyp/nvhe/hyp-smp.c | 2 ++ 4 files changed, 8 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index de52ba775d48..43c3bc0f9544 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -109,7 +109,7 @@ enum __kvm_host_smccc_func { #define per_cpu_ptr_nvhe_sym(sym, cpu) \ ({ \ unsigned long base, off; \ - base = kvm_arm_hyp_percpu_base[cpu]; \ + base = kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu]; \ off = (unsigned long)&CHOOSE_NVHE_SYM(sym) - \ (unsigned long)&CHOOSE_NVHE_SYM(__per_cpu_start); \ base ? (typeof(CHOOSE_NVHE_SYM(sym))*)(base + off) : NULL; \ @@ -214,7 +214,7 @@ DECLARE_KVM_HYP_SYM(__kvm_hyp_vector); #define __kvm_hyp_init CHOOSE_NVHE_SYM(__kvm_hyp_init) #define __kvm_hyp_vector CHOOSE_HYP_SYM(__kvm_hyp_vector) -extern unsigned long kvm_arm_hyp_percpu_base[NR_CPUS]; +extern unsigned long kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[]; DECLARE_KVM_NVHE_SYM(__per_cpu_start); DECLARE_KVM_NVHE_SYM(__per_cpu_end); diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index 7f4e43bfaade..ae8f37f4aa8c 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -89,9 +89,6 @@ KVM_NVHE_ALIAS(gic_nonsecure_priorities); KVM_NVHE_ALIAS(__start___kvm_ex_table); KVM_NVHE_ALIAS(__stop___kvm_ex_table); -/* Array containing bases of nVHE per-CPU memory regions. */ -KVM_NVHE_ALIAS(kvm_arm_hyp_percpu_base); - /* PMU available static key */ #ifdef CONFIG_HW_PERF_EVENTS KVM_NVHE_ALIAS(kvm_arm_pmu_available); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index f78eefa02f6b..25467f24803d 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -51,7 +51,6 @@ DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized); DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector); DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page); -unsigned long kvm_arm_hyp_percpu_base[NR_CPUS]; DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params); static bool vgic_present; @@ -1857,13 +1856,13 @@ static void teardown_hyp_mode(void) free_hyp_pgds(); for_each_possible_cpu(cpu) { free_page(per_cpu(kvm_arm_hyp_stack_page, cpu)); - free_pages(kvm_arm_hyp_percpu_base[cpu], nvhe_percpu_order()); + free_pages(kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu], nvhe_percpu_order()); } } static int do_pkvm_init(u32 hyp_va_bits) { - void *per_cpu_base = kvm_ksym_ref(kvm_arm_hyp_percpu_base); + void *per_cpu_base = kvm_ksym_ref(kvm_nvhe_sym(kvm_arm_hyp_percpu_base)); int ret; preempt_disable(); @@ -1967,7 +1966,7 @@ static int init_hyp_mode(void) page_addr = page_address(page); memcpy(page_addr, CHOOSE_NVHE_SYM(__per_cpu_start), nvhe_percpu_size()); - kvm_arm_hyp_percpu_base[cpu] = (unsigned long)page_addr; + kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu] = (unsigned long)page_addr; } /* @@ -2060,7 +2059,7 @@ static int init_hyp_mode(void) } for_each_possible_cpu(cpu) { - char *percpu_begin = (char *)kvm_arm_hyp_percpu_base[cpu]; + char *percpu_begin = (char *)kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu]; char *percpu_end = percpu_begin + nvhe_percpu_size(); /* Map Hyp percpu pages */ diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c index 9f54833af400..04d194583f1e 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-smp.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-smp.c @@ -23,6 +23,8 @@ u64 cpu_logical_map(unsigned int cpu) return hyp_cpu_logical_map[cpu]; } +unsigned long __ro_after_init kvm_arm_hyp_percpu_base[NR_CPUS]; + unsigned long __hyp_per_cpu_offset(unsigned int cpu) { unsigned long *cpu_base_array; From patchwork Mon Oct 17 11:52:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008687 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D65D0C43219 for ; Mon, 17 Oct 2022 11:53:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230337AbiJQLx4 (ORCPT ); Mon, 17 Oct 2022 07:53:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230311AbiJQLxx (ORCPT ); Mon, 17 Oct 2022 07:53:53 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E5A7E53D39 for ; Mon, 17 Oct 2022 04:53:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 64231B81639 for ; Mon, 17 Oct 2022 11:53:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 194EEC433B5; Mon, 17 Oct 2022 11:53:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007620; bh=JkBWkPIuSIXe9eQIz3zlM+OZss3C50u/x/WKjBuAZA4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=hJY/44EbJqi2edZBxPtRVP+qqaSOASnhmgTqtaEnSFfMaRQR3Xjt4JML3efHQEhEo ZaA0q0tC5hwUQF3g4+nzF0tAVFQSZqWj/d0lkXr2GcR7W1hh6udqBXQ+S8qdBUyOuM hpjGLuFdAs4iKaYd4IjPFmfrr68J/vTxZ7MsNVQOlavnPKna8LzfwltrPvImuE0cG/ LIPof/259t6jrlLRUKYvx4+VHbK2tAF7TPIQb8gKkHJw44HgNfZEwmZEF2XFuCTGe2 4aXy6IWT7OgqutZA82DWSk3r0+MeksIsGjIXCGckqMyRVc5jCdeJY3pwvO90gm5gxD CsLQaVas1j/yg== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 22/25] KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2 Date: Mon, 17 Oct 2022 12:52:06 +0100 Message-Id: <20221017115209.2099-23-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Sharing 'kvm_arm_vmid_bits' between EL1 and EL2 allows the host to modify the variable arbitrarily, potentially leading to all sorts of shenanians as this is used to configure the VTTBR register for the guest stage-2. In preparation for unmapping host sections entirely from EL2, maintain a copy of 'kvm_arm_vmid_bits' in the pKVM hypervisor and initialise it from the host value while it is still trusted. Tested-by: Vincent Donnefort Signed-off-by: Will Deacon --- arch/arm64/include/asm/kvm_hyp.h | 2 ++ arch/arm64/kernel/image-vars.h | 3 --- arch/arm64/kvm/arm.c | 1 + arch/arm64/kvm/hyp/nvhe/pkvm.c | 3 +++ 4 files changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h index fd99cf09972d..6797eafe7890 100644 --- a/arch/arm64/include/asm/kvm_hyp.h +++ b/arch/arm64/include/asm/kvm_hyp.h @@ -124,4 +124,6 @@ extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val); extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val); extern unsigned long kvm_nvhe_sym(__icache_flags); +extern unsigned int kvm_nvhe_sym(kvm_arm_vmid_bits); + #endif /* __ARM64_KVM_HYP_H__ */ diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index ae8f37f4aa8c..31ad75da4d58 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -71,9 +71,6 @@ KVM_NVHE_ALIAS(nvhe_hyp_panic_handler); /* Vectors installed by hyp-init on reset HVC. */ KVM_NVHE_ALIAS(__hyp_stub_vectors); -/* VMID bits set by the KVM VMID allocator */ -KVM_NVHE_ALIAS(kvm_arm_vmid_bits); - /* Static keys which are set if a vGIC trap should be handled in hyp. */ KVM_NVHE_ALIAS(vgic_v2_cpuif_trap); KVM_NVHE_ALIAS(vgic_v3_cpuif_trap); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 25467f24803d..1d4b8122d010 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1893,6 +1893,7 @@ static void kvm_hyp_init_symbols(void) kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1); kvm_nvhe_sym(__icache_flags) = __icache_flags; + kvm_nvhe_sym(kvm_arm_vmid_bits) = kvm_arm_vmid_bits; } static int kvm_hyp_init_protection(u32 hyp_va_bits) diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index f5f300cc5864..826335ccad9d 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -15,6 +15,9 @@ /* Used by icache_is_vpipt(). */ unsigned long __icache_flags; +/* Used by kvm_get_vttbr(). */ +unsigned int kvm_arm_vmid_bits; + /* * Set trap register values based on features in ID_AA64PFR0. */ From patchwork Mon Oct 17 11:52:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008688 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDB22C4332F for ; Mon, 17 Oct 2022 11:53:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230317AbiJQLx6 (ORCPT ); Mon, 17 Oct 2022 07:53:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33470 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230331AbiJQLxz (ORCPT ); Mon, 17 Oct 2022 07:53:55 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 11AAD4F18C for ; Mon, 17 Oct 2022 04:53:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2A2216108F for ; Mon, 17 Oct 2022 11:53:44 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8F93FC433C1; Mon, 17 Oct 2022 11:53:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007623; bh=TlmuoizYiy7ocLRUX9ITRdoDGCG+RyLE/kBvsyVSA4Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=E/YX1u6Evb6DoaMiPuFbzBItnTR1d6+0VWVw7kjpQ3djj1CLRxeg+3t7HtnPnrBG7 iq49leJqB+ctfPDoqYlFnj6CoIa2wIUNvbOQLDc0us0mUouz7LcXMLGXSs3SINJ5Gz AnB2XJI1CHnSxSIF1gPu6YVBldxTNa3XFPygIe/SNFlQSPYGrXxWqTTkRp3tiJLg6n xTBK3CKp1h7tMBLlgm3FLCSv7Ajo/eqwzFGctpDIqbnLpNquB8T0yOoe6RxwfAPoCO 0rpuPVlY1ETIYSs73iqTNTDgpPfbnM5r2QZ4aicP1Frp7Z3CWtJDHwq1XKFGRsBN/H WPo1Ws9uz713A== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 23/25] KVM: arm64: Explicitly map 'kvm_vgic_global_state' at EL2 Date: Mon, 17 Oct 2022 12:52:07 +0100 Message-Id: <20221017115209.2099-24-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret The pkvm hypervisor at EL2 may need to read the 'kvm_vgic_global_state' variable from the host, for example when saving and restoring the state of the virtual GIC. Explicitly map 'kvm_vgic_global_state' in the stage-1 page-table of the pKVM hypervisor rather than relying on mapping all of the host '.rodata' section. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/kvm/hyp/nvhe/setup.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 0f69c1393416..5a371ab236db 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -161,6 +161,11 @@ static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size, if (ret) return ret; + ret = pkvm_create_mappings(&kvm_vgic_global_state, + &kvm_vgic_global_state + 1, prot); + if (ret) + return ret; + return 0; } From patchwork Mon Oct 17 11:52:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008689 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3E26C43217 for ; Mon, 17 Oct 2022 11:54:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230346AbiJQLyA (ORCPT ); Mon, 17 Oct 2022 07:54:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230320AbiJQLx6 (ORCPT ); Mon, 17 Oct 2022 07:53:58 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A9DFD58513 for ; Mon, 17 Oct 2022 04:53:49 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 54AD3B81633 for ; Mon, 17 Oct 2022 11:53:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 12BDAC4347C; Mon, 17 Oct 2022 11:53:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007627; bh=o1+D7GlEbIo/66tqzr55voUAuV+8d8HNm+7m2mJHTnU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZEvz1dN9V2J6DfWwaHcINmGNk5KfC+DrGecS/CYfMFt3bqlb0z/5Gu4lSnVKrjvLR H+pq4EHAvciAwcJl8/5q0a3v/7vsiJweWJzs1FpIZu0dG+Pbh9xeATxkEIO2suI4wk 3MnGnzPp0BYBQt2he5snpb62IhURuZt7VIvWmQn12j1+1Aj4rdWnTMLTVnD4yu6gfJ ASqmT6v4l8QDNir6qvaiUj2logQs2EkcsyuwVnDkzpFkkOIZvcIXFbdAg7EUJrL0rO 87/VupeAHUljJt5U8ZkRnCXVOfEABvq163z88euFMFVOqPo+KzvJh1hQEAia8fCt72 wyuK82vvGkgog== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 24/25] KVM: arm64: Don't unnecessarily map host kernel sections at EL2 Date: Mon, 17 Oct 2022 12:52:08 +0100 Message-Id: <20221017115209.2099-25-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Quentin Perret We no longer need to map the host's '.rodata' and '.bss' sections in the stage-1 page-table of the pKVM hypervisor at EL2, so remove those mappings and avoid creating any future dependencies at EL2 on host-controlled data structures. Tested-by: Vincent Donnefort Signed-off-by: Quentin Perret Signed-off-by: Will Deacon --- arch/arm64/kernel/image-vars.h | 6 ------ arch/arm64/kvm/hyp/nvhe/setup.c | 14 +++----------- 2 files changed, 3 insertions(+), 17 deletions(-) diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h index 31ad75da4d58..e3f88b5836a2 100644 --- a/arch/arm64/kernel/image-vars.h +++ b/arch/arm64/kernel/image-vars.h @@ -102,12 +102,6 @@ KVM_NVHE_ALIAS_HYP(__memcpy, __pi_memcpy); KVM_NVHE_ALIAS_HYP(__memset, __pi_memset); #endif -/* Kernel memory sections */ -KVM_NVHE_ALIAS(__start_rodata); -KVM_NVHE_ALIAS(__end_rodata); -KVM_NVHE_ALIAS(__bss_start); -KVM_NVHE_ALIAS(__bss_stop); - /* Hyp memory sections */ KVM_NVHE_ALIAS(__hyp_idmap_text_start); KVM_NVHE_ALIAS(__hyp_idmap_text_end); diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index 5a371ab236db..5cdf3fb09bb4 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -144,23 +144,15 @@ static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size, } /* - * Map the host's .bss and .rodata sections RO in the hypervisor, but - * transfer the ownership from the host to the hypervisor itself to - * make sure it can't be donated or shared with another entity. + * Map the host sections RO in the hypervisor, but transfer the + * ownership from the host to the hypervisor itself to make sure they + * can't be donated or shared with another entity. * * The ownership transition requires matching changes in the host * stage-2. This will be done later (see finalize_host_mappings()) once * the hyp_vmemmap is addressable. */ prot = pkvm_mkstate(PAGE_HYP_RO, PKVM_PAGE_SHARED_OWNED); - ret = pkvm_create_mappings(__start_rodata, __end_rodata, prot); - if (ret) - return ret; - - ret = pkvm_create_mappings(__hyp_bss_end, __bss_stop, prot); - if (ret) - return ret; - ret = pkvm_create_mappings(&kvm_vgic_global_state, &kvm_vgic_global_state + 1, prot); if (ret) From patchwork Mon Oct 17 11:52:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 13008690 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8605C4332F for ; Mon, 17 Oct 2022 11:54:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230360AbiJQLyC (ORCPT ); Mon, 17 Oct 2022 07:54:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33618 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230331AbiJQLyA (ORCPT ); Mon, 17 Oct 2022 07:54:00 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8F9485853C for ; Mon, 17 Oct 2022 04:53:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1FCB46104D for ; Mon, 17 Oct 2022 11:53:51 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 886E4C43470; Mon, 17 Oct 2022 11:53:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666007630; bh=iCQvJqIBHlz/qoj4u7oHbyxitmwvOTi7uXENaLoM508=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gJ/DVoZfyMXiVOq9U7MLmdr8QbBTj3mtty6Asgmr1KjgjjKgeSRGt0ztGd95QZRVO AuX+gtuosOou7Eah0dunPfhj/dUxfUTbQwjH+ZhzpAcdZQ3ws86ugDRoAD4VQB6EIF BrLVeBWy3x5c1jZqtRvCNXIK0zEEECAUBkt6H4eMNjuM1B+lvtGxm0tE+T7y7h2RRX NbIhKnheDQo9que48glYbsVejHbYfOs1Ok8mUYsojt0vpvIWU3qwBylFbV3fXOFV85 aIzxdbshA4njc2msJro0KbGPBtimcqylpywDLvi6mbfJ0xOVkKeYdY9BYid7Tpzujq lMMI1SO1X7AFQ== From: Will Deacon To: kvmarm@lists.linux.dev Cc: Will Deacon , Sean Christopherson , Vincent Donnefort , Alexandru Elisei , Catalin Marinas , James Morse , Chao Peng , Quentin Perret , Suzuki K Poulose , Mark Rutland , Fuad Tabba , Oliver Upton , Marc Zyngier , kernel-team@android.com, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v4 25/25] KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run() Date: Mon, 17 Oct 2022 12:52:09 +0100 Message-Id: <20221017115209.2099-26-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20221017115209.2099-1-will@kernel.org> References: <20221017115209.2099-1-will@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org As a stepping stone towards deprivileging the host's access to the guest's vCPU structures, introduce some naive flush/sync routines to copy most of the host vCPU into the hyp vCPU on vCPU run and back again on return to EL1. This allows us to run using the pKVM hyp structures when KVM is initialised in protected mode. Tested-by: Vincent Donnefort Signed-off-by: Will Deacon --- arch/arm64/kvm/hyp/include/nvhe/pkvm.h | 4 ++ arch/arm64/kvm/hyp/nvhe/hyp-main.c | 79 +++++++++++++++++++++++++- arch/arm64/kvm/hyp/nvhe/pkvm.c | 28 +++++++++ 3 files changed, 109 insertions(+), 2 deletions(-) diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h index 5d456438445c..38424b98ed84 100644 --- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h +++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h @@ -67,4 +67,8 @@ int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu, unsigned long vcpu_hva); int __pkvm_teardown_vm(pkvm_handle_t handle); +struct pkvm_hyp_vcpu *pkvm_load_hyp_vcpu(pkvm_handle_t handle, + unsigned int vcpu_idx); +void pkvm_put_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu); + #endif /* __ARM64_KVM_NVHE_PKVM_H__ */ diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c index b5f3fcfe9135..728e01d4536b 100644 --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c @@ -22,11 +22,86 @@ DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params); void __kvm_hyp_host_forward_smc(struct kvm_cpu_context *host_ctxt); +static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu) +{ + struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu; + + hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt; + + hyp_vcpu->vcpu.arch.sve_state = kern_hyp_va(host_vcpu->arch.sve_state); + hyp_vcpu->vcpu.arch.sve_max_vl = host_vcpu->arch.sve_max_vl; + + hyp_vcpu->vcpu.arch.hw_mmu = host_vcpu->arch.hw_mmu; + + hyp_vcpu->vcpu.arch.hcr_el2 = host_vcpu->arch.hcr_el2; + hyp_vcpu->vcpu.arch.mdcr_el2 = host_vcpu->arch.mdcr_el2; + hyp_vcpu->vcpu.arch.cptr_el2 = host_vcpu->arch.cptr_el2; + + hyp_vcpu->vcpu.arch.iflags = host_vcpu->arch.iflags; + hyp_vcpu->vcpu.arch.fp_state = host_vcpu->arch.fp_state; + + hyp_vcpu->vcpu.arch.debug_ptr = kern_hyp_va(host_vcpu->arch.debug_ptr); + hyp_vcpu->vcpu.arch.host_fpsimd_state = host_vcpu->arch.host_fpsimd_state; + + hyp_vcpu->vcpu.arch.vsesr_el2 = host_vcpu->arch.vsesr_el2; + + hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3 = host_vcpu->arch.vgic_cpu.vgic_v3; +} + +static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu) +{ + struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu; + struct vgic_v3_cpu_if *hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3; + struct vgic_v3_cpu_if *host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3; + unsigned int i; + + host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt; + + host_vcpu->arch.hcr_el2 = hyp_vcpu->vcpu.arch.hcr_el2; + host_vcpu->arch.cptr_el2 = hyp_vcpu->vcpu.arch.cptr_el2; + + host_vcpu->arch.fault = hyp_vcpu->vcpu.arch.fault; + + host_vcpu->arch.iflags = hyp_vcpu->vcpu.arch.iflags; + host_vcpu->arch.fp_state = hyp_vcpu->vcpu.arch.fp_state; + + host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr; + for (i = 0; i < hyp_cpu_if->used_lrs; ++i) + host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i]; +} + static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt) { - DECLARE_REG(struct kvm_vcpu *, vcpu, host_ctxt, 1); + DECLARE_REG(struct kvm_vcpu *, host_vcpu, host_ctxt, 1); + int ret; + + host_vcpu = kern_hyp_va(host_vcpu); + + if (unlikely(is_protected_kvm_enabled())) { + struct pkvm_hyp_vcpu *hyp_vcpu; + struct kvm *host_kvm; + + host_kvm = kern_hyp_va(host_vcpu->kvm); + hyp_vcpu = pkvm_load_hyp_vcpu(host_kvm->arch.pkvm.handle, + host_vcpu->vcpu_idx); + if (!hyp_vcpu) { + ret = -EINVAL; + goto out; + } + + flush_hyp_vcpu(hyp_vcpu); + + ret = __kvm_vcpu_run(&hyp_vcpu->vcpu); + + sync_hyp_vcpu(hyp_vcpu); + pkvm_put_hyp_vcpu(hyp_vcpu); + } else { + /* The host is fully trusted, run its vCPU directly. */ + ret = __kvm_vcpu_run(host_vcpu); + } - cpu_reg(host_ctxt, 1) = __kvm_vcpu_run(kern_hyp_va(vcpu)); +out: + cpu_reg(host_ctxt, 1) = ret; } static void handle___kvm_adjust_pc(struct kvm_cpu_context *host_ctxt) diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c index 826335ccad9d..d4d2ed07a481 100644 --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c @@ -241,6 +241,33 @@ static struct pkvm_hyp_vm *get_vm_by_handle(pkvm_handle_t handle) return vm_table[idx]; } +struct pkvm_hyp_vcpu *pkvm_load_hyp_vcpu(pkvm_handle_t handle, + unsigned int vcpu_idx) +{ + struct pkvm_hyp_vcpu *hyp_vcpu = NULL; + struct pkvm_hyp_vm *hyp_vm; + + hyp_spin_lock(&vm_table_lock); + hyp_vm = get_vm_by_handle(handle); + if (!hyp_vm || hyp_vm->nr_vcpus <= vcpu_idx) + goto unlock; + + hyp_vcpu = hyp_vm->vcpus[vcpu_idx]; + hyp_page_ref_inc(hyp_virt_to_page(hyp_vm)); +unlock: + hyp_spin_unlock(&vm_table_lock); + return hyp_vcpu; +} + +void pkvm_put_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu) +{ + struct pkvm_hyp_vm *hyp_vm = pkvm_hyp_vcpu_to_hyp_vm(hyp_vcpu); + + hyp_spin_lock(&vm_table_lock); + hyp_page_ref_dec(hyp_virt_to_page(hyp_vm)); + hyp_spin_unlock(&vm_table_lock); +} + static void unpin_host_vcpu(struct kvm_vcpu *host_vcpu) { if (host_vcpu) @@ -286,6 +313,7 @@ static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu, hyp_vcpu->vcpu.vcpu_idx = vcpu_idx; hyp_vcpu->vcpu.arch.hw_mmu = &hyp_vm->kvm.arch.mmu; + hyp_vcpu->vcpu.arch.cflags = READ_ONCE(host_vcpu->arch.cflags); done: if (ret) unpin_host_vcpu(host_vcpu);