From patchwork Wed Aug 21 15:38:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Steven Price X-Patchwork-Id: 13771797 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 6AE2E1B7910; Wed, 21 Aug 2024 15:39:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724254801; cv=none; b=LcY9sj2/ocXS1cdtaP7o68m+av02ZdFpJbq5CjyBo2EgyhlWLFSHa1v7uRy6ikPbUzBipie23folpMnkWewEb7veR0cSerZgrVnANMtfjYe8biCHdOq48QjpqNlKB3xtmrNNRBkahULRJyjD6tuSo6QKjKZ6rbUgrMK9GqnCkgw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724254801; c=relaxed/simple; bh=St2NCskBR5WRbnMU4B/Z8M4yPanMOEUnKhJIE2Zu99E=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=kB5S+giGQQT760ou5ZdtyU4Y/3aKPvazdUNCI2BsjQJ5Wasm2PRJStDk0/DAbCp1Q3o4R3NPn2Au5Lo3MGqoF967mGIxcEpIVsUJaosrWwI6Lq4M1uSh+OkViI7j4mL4YE0sRZJjlAYgHHqEr44rkhBIuof0rwjReufptjzSCbc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 26800DA7; Wed, 21 Aug 2024 08:40:25 -0700 (PDT) Received: from e122027.cambridge.arm.com (e122027.cambridge.arm.com [10.1.37.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CDC9D3F73B; Wed, 21 Aug 2024 08:39:55 -0700 (PDT) From: Steven Price To: kvm@vger.kernel.org, kvmarm@lists.linux.dev Cc: Steven Price , Catalin Marinas , Marc Zyngier , Will Deacon , James Morse , Oliver Upton , Suzuki K Poulose , Zenghui Yu , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Joey Gouly , Alexandru Elisei , Christoffer Dall , Fuad Tabba , linux-coco@lists.linux.dev, Ganapatrao Kulkarni , Gavin Shan , Shanker Donthineni , Alper Gun Subject: [PATCH v4 14/43] arm64: RME: Allocate/free RECs to match vCPUs Date: Wed, 21 Aug 2024 16:38:15 +0100 Message-Id: <20240821153844.60084-15-steven.price@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240821153844.60084-1-steven.price@arm.com> References: <20240821153844.60084-1-steven.price@arm.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 The RMM maintains a data structure known as the Realm Execution Context (or REC). It is similar to struct kvm_vcpu and tracks the state of the virtual CPUs. KVM must delegate memory and request the structures are created when vCPUs are created, and suitably tear down on destruction. RECs must also be supplied with addition pages - auxiliary (or AUX) granules - for storing the larger registers state (e.g. for SVE). The number of AUX granules for a REC depends on the parameters with which the Realm was created - the RMM makes this information available via the RMI_REC_AUX_COUNT call performed after creating the Realm Descriptor (RD). Note that only some of register state for the REC can be set by KVM, the rest is defined by the RMM (zeroed). The register state then cannot be changed by KVM after the REC is created (except when the guest explicitly requests this e.g. by performing a PSCI call). See Realm Management Monitor specification (DEN0137) for more information: https://developer.arm.com/documentation/den0137/ Signed-off-by: Steven Price --- Changes since v2: * Free rec->run earlier in kvm_destroy_realm() and adapt to previous patches. --- arch/arm64/include/asm/kvm_emulate.h | 2 + arch/arm64/include/asm/kvm_host.h | 3 + arch/arm64/include/asm/kvm_rme.h | 18 ++++ arch/arm64/kvm/arm.c | 2 + arch/arm64/kvm/reset.c | 11 ++ arch/arm64/kvm/rme.c | 155 +++++++++++++++++++++++++++ 6 files changed, 191 insertions(+) diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index 5edcfb1b6c68..7430c77574e3 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -712,6 +712,8 @@ static inline bool kvm_realm_is_created(struct kvm *kvm) static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu) { + if (static_branch_unlikely(&kvm_rme_is_available)) + return vcpu->arch.rec.mpidr != INVALID_HWID; return false; } diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index e36be05b97f8..27e4ae382b38 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -755,6 +755,9 @@ struct kvm_vcpu_arch { /* Per-vcpu CCSIDR override or NULL */ u32 *ccsidr; + + /* Realm meta data */ + struct realm_rec rec; }; /* diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h index e5704859a6e5..3a3aaf5d591c 100644 --- a/arch/arm64/include/asm/kvm_rme.h +++ b/arch/arm64/include/asm/kvm_rme.h @@ -6,6 +6,7 @@ #ifndef __ASM_KVM_RME_H #define __ASM_KVM_RME_H +#include #include /** @@ -70,6 +71,21 @@ struct realm { unsigned int ia_bits; }; +/** + * struct realm_rec - Additional per VCPU data for a Realm + * + * @mpidr: MPIDR (Multiprocessor Affinity Register) value to identify this VCPU + * @rec_page: Kernel VA of the RMM's private page for this REC + * @aux_pages: Additional pages private to the RMM for this REC + * @run: Kernel VA of the RmiRecRun structure shared with the RMM + */ +struct realm_rec { + unsigned long mpidr; + void *rec_page; + struct page *aux_pages[REC_PARAMS_AUX_GRANULES]; + struct rec_run *run; +}; + void kvm_init_rme(void); u32 kvm_realm_ipa_limit(void); @@ -77,6 +93,8 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap); int kvm_init_realm_vm(struct kvm *kvm); void kvm_destroy_realm(struct kvm *kvm); void kvm_realm_destroy_rtts(struct kvm *kvm, u32 ia_bits); +int kvm_create_rec(struct kvm_vcpu *vcpu); +void kvm_destroy_rec(struct kvm_vcpu *vcpu); #define RME_RTT_BLOCK_LEVEL 2 #define RME_RTT_MAX_LEVEL 3 diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index fb1ed1f44561..89432ccee389 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -522,6 +522,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) /* Force users to call KVM_ARM_VCPU_INIT */ vcpu_clear_flag(vcpu, VCPU_INITIALIZED); + vcpu->arch.rec.mpidr = INVALID_HWID; + vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO; /* Set up the timer */ diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index 0b0ae5ae7bc2..845b1ece47d4 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -137,6 +137,11 @@ int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature) return -EPERM; return kvm_vcpu_finalize_sve(vcpu); + case KVM_ARM_VCPU_REC: + if (!kvm_is_realm(vcpu->kvm)) + return -EINVAL; + + return kvm_create_rec(vcpu); } return -EINVAL; @@ -147,6 +152,11 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu) if (vcpu_has_sve(vcpu) && !kvm_arm_vcpu_sve_finalized(vcpu)) return false; + if (kvm_is_realm(vcpu->kvm) && + !(vcpu_is_rec(vcpu) && + READ_ONCE(vcpu->kvm->arch.realm.state) == REALM_STATE_ACTIVE)) + return false; + return true; } @@ -159,6 +169,7 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu) kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu)); kfree(sve_state); kfree(vcpu->arch.ccsidr); + kvm_destroy_rec(vcpu); } static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c index 7db405d2b2b2..6f0ced6e0cc1 100644 --- a/arch/arm64/kvm/rme.c +++ b/arch/arm64/kvm/rme.c @@ -422,6 +422,161 @@ void kvm_destroy_realm(struct kvm *kvm) kvm_free_stage2_pgd(&kvm->arch.mmu); } +static void free_rec_aux(struct page **aux_pages, + unsigned int num_aux) +{ + unsigned int i; + + for (i = 0; i < num_aux; i++) { + phys_addr_t aux_page_phys = page_to_phys(aux_pages[i]); + + /* If the undelegate fails then leak the page */ + if (WARN_ON(rmi_granule_undelegate(aux_page_phys))) + continue; + + __free_page(aux_pages[i]); + } +} + +static int alloc_rec_aux(struct page **aux_pages, + u64 *aux_phys_pages, + unsigned int num_aux) +{ + int ret; + unsigned int i; + + for (i = 0; i < num_aux; i++) { + struct page *aux_page; + phys_addr_t aux_page_phys; + + aux_page = alloc_page(GFP_KERNEL); + if (!aux_page) { + ret = -ENOMEM; + goto out_err; + } + aux_page_phys = page_to_phys(aux_page); + if (rmi_granule_delegate(aux_page_phys)) { + __free_page(aux_page); + ret = -ENXIO; + goto out_err; + } + aux_pages[i] = aux_page; + aux_phys_pages[i] = aux_page_phys; + } + + return 0; +out_err: + free_rec_aux(aux_pages, i); + return ret; +} + +int kvm_create_rec(struct kvm_vcpu *vcpu) +{ + struct user_pt_regs *vcpu_regs = vcpu_gp_regs(vcpu); + unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu); + struct realm *realm = &vcpu->kvm->arch.realm; + struct realm_rec *rec = &vcpu->arch.rec; + unsigned long rec_page_phys; + struct rec_params *params; + int r, i; + + if (kvm_realm_state(vcpu->kvm) != REALM_STATE_NEW) + return -ENOENT; + + /* + * The RMM will report PSCI v1.0 to Realms and the KVM_ARM_VCPU_PSCI_0_2 + * flag covers v0.2 and onwards. + */ + if (!vcpu_has_feature(vcpu, KVM_ARM_VCPU_PSCI_0_2)) + return -EINVAL; + + BUILD_BUG_ON(sizeof(*params) > PAGE_SIZE); + BUILD_BUG_ON(sizeof(*rec->run) > PAGE_SIZE); + + params = (struct rec_params *)get_zeroed_page(GFP_KERNEL); + rec->rec_page = (void *)__get_free_page(GFP_KERNEL); + rec->run = (void *)get_zeroed_page(GFP_KERNEL); + if (!params || !rec->rec_page || !rec->run) { + r = -ENOMEM; + goto out_free_pages; + } + + for (i = 0; i < ARRAY_SIZE(params->gprs); i++) + params->gprs[i] = vcpu_regs->regs[i]; + + params->pc = vcpu_regs->pc; + + if (vcpu->vcpu_id == 0) + params->flags |= REC_PARAMS_FLAG_RUNNABLE; + + rec_page_phys = virt_to_phys(rec->rec_page); + + if (rmi_granule_delegate(rec_page_phys)) { + r = -ENXIO; + goto out_free_pages; + } + + r = alloc_rec_aux(rec->aux_pages, params->aux, realm->num_aux); + if (r) + goto out_undelegate_rmm_rec; + + params->num_rec_aux = realm->num_aux; + params->mpidr = mpidr; + + if (rmi_rec_create(virt_to_phys(realm->rd), + rec_page_phys, + virt_to_phys(params))) { + r = -ENXIO; + goto out_free_rec_aux; + } + + rec->mpidr = mpidr; + + free_page((unsigned long)params); + return 0; + +out_free_rec_aux: + free_rec_aux(rec->aux_pages, realm->num_aux); +out_undelegate_rmm_rec: + if (WARN_ON(rmi_granule_undelegate(rec_page_phys))) + rec->rec_page = NULL; +out_free_pages: + free_page((unsigned long)rec->run); + free_page((unsigned long)rec->rec_page); + free_page((unsigned long)params); + return r; +} + +void kvm_destroy_rec(struct kvm_vcpu *vcpu) +{ + struct realm *realm = &vcpu->kvm->arch.realm; + struct realm_rec *rec = &vcpu->arch.rec; + unsigned long rec_page_phys; + + if (!vcpu_is_rec(vcpu)) + return; + + free_page((unsigned long)rec->run); + + rec_page_phys = virt_to_phys(rec->rec_page); + + /* + * The REC and any AUX pages cannot be reclaimed until the REC is + * destroyed. So if the REC destroy fails then the REC page and any AUX + * pages will be leaked. + */ + if (WARN_ON(rmi_rec_destroy(rec_page_phys))) + return; + + free_rec_aux(rec->aux_pages, realm->num_aux); + + /* If the undelegate fails then leak the REC page */ + if (WARN_ON(rmi_granule_undelegate(rec_page_phys))) + return; + + free_page((unsigned long)rec->rec_page); +} + int kvm_init_realm_vm(struct kvm *kvm) { struct realm_params *params;