[v8,35/69] KVM: arm64: nv: Support multiple nested Stage-2 mmu structures

Message ID	20230131092504.2880505-36-maz@kernel.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@vger.kernel.org> From: Marc Zyngier <maz@kernel.org> To: kvmarm@lists.linux.dev, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org Cc: Alexandru Elisei <alexandru.elisei@arm.com>, Andre Przywara <andre.przywara@arm.com>, Chase Conklin <chase.conklin@arm.com>, Christoffer Dall <christoffer.dall@arm.com>, Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>, Jintack Lim <jintack@cs.columbia.edu>, Russell King <rmk+kernel@armlinux.org.uk>, James Morse <james.morse@arm.com>, Suzuki K Poulose <suzuki.poulose@arm.com>, Oliver Upton <oliver.upton@linux.dev>, Zenghui Yu <yuzenghui@huawei.com> Subject: [PATCH v8 35/69] KVM: arm64: nv: Support multiple nested Stage-2 mmu structures Date: Tue, 31 Jan 2023 09:24:30 +0000 Message-Id: <20230131092504.2880505-36-maz@kernel.org> In-Reply-To: <20230131092504.2880505-1-maz@kernel.org> References: <20230131092504.2880505-1-maz@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	KVM: arm64: ARMv8.3/8.4 Nested Virtualization support \| expand [v8,00/69] KVM: arm64: ARMv8.3/8.4 Nested Virtualization support [v8,01/69] arm64: Add ARM64_HAS_NESTED_VIRT cpufeature [v8,02/69] KVM: arm64: Use the S2 MMU context to iterate over S2 table [v8,03/69] KVM: arm64: nv: Introduce nested virtualization VCPU feature [v8,04/69] KVM: arm64: nv: Reset VCPU to EL2 registers if VCPU nested virt is set [v8,05/69] KVM: arm64: nv: Allow userspace to set PSR_MODE_EL2x [v8,06/69] KVM: arm64: nv: Add EL2 system registers to vcpu context [v8,07/69] KVM: arm64: nv: Add nested virt VCPU primitives for vEL2 VCPU state [v8,08/69] KVM: arm64: nv: Handle HCR_EL2.NV system register traps [v8,09/69] KVM: arm64: nv: Reset VMPIDR_EL2 and VPIDR_EL2 to sane values [v8,10/69] KVM: arm64: nv: Support virtual EL2 exceptions [v8,11/69] KVM: arm64: nv: Inject HVC exceptions to the virtual EL2 [v8,12/69] KVM: arm64: nv: Handle trapped ERET from virtual EL2 [v8,13/69] KVM: arm64: nv: Add non-VHE-EL2->EL1 translation helpers [v8,14/69] KVM: arm64: nv: Handle virtual EL2 registers in vcpu_read/write_sys_reg() [v8,15/69] KVM: arm64: nv: Handle SPSR_EL2 specially [v8,16/69] KVM: arm64: nv: Handle HCR_EL2.E2H specially [v8,17/69] KVM: arm64: nv: Save/Restore vEL2 sysregs [v8,18/69] KVM: arm64: nv: Emulate PSTATE.M for a guest hypervisor [v8,19/69] KVM: arm64: nv: Trap EL1 VM register accesses in virtual EL2 [v8,20/69] KVM: arm64: nv: Add accessors for SPSR_EL1, ELR_EL1 and VBAR_EL1 from virtual EL2 [v8,21/69] KVM: arm64: nv: Trap CPACR_EL1 access in virtual EL2 [v8,22/69] KVM: arm64: nv: Handle PSCI call via smc from the guest [v8,23/69] KVM: arm64: nv: Respect virtual HCR_EL2.TWX setting [v8,24/69] KVM: arm64: nv: Respect virtual CPTR_EL2.{TFP,FPEN} settings [v8,25/69] KVM: arm64: nv: Respect the virtual HCR_EL2.NV bit setting [v8,26/69] KVM: arm64: nv: Respect virtual HCR_EL2.TVM and TRVM settings [v8,27/69] KVM: arm64: nv: Respect the virtual HCR_EL2.NV1 bit setting [v8,28/69] KVM: arm64: nv: Allow a sysreg to be hidden from userspace only [v8,29/69] KVM: arm64: nv: Emulate EL12 register accesses from the virtual EL2 [v8,30/69] KVM: arm64: nv: Forward debug traps to the nested guest [v8,31/69] KVM: arm64: nv: Configure HCR_EL2 for nested virtualization [v8,32/69] KVM: arm64: nv: Only toggle cache for virtual EL2 when SCTLR_EL2 changes [v8,33/69] KVM: arm64: nv: Filter out unsupported features from ID regs [v8,34/69] KVM: arm64: nv: Hide RAS from nested guests [v8,35/69] KVM: arm64: nv: Support multiple nested Stage-2 mmu structures [v8,36/69] KVM: arm64: nv: Implement nested Stage-2 page table walk logic [v8,37/69] KVM: arm64: nv: Handle shadow stage 2 page faults [v8,38/69] KVM: arm64: nv: Restrict S2 RD/WR permissions to match the guest's [v8,39/69] KVM: arm64: nv: Unmap/flush shadow stage 2 page tables [v8,40/69] KVM: arm64: nv: Set a handler for the system instruction traps [v8,41/69] KVM: arm64: nv: Trap and emulate AT instructions from virtual EL2 [v8,42/69] KVM: arm64: nv: Trap and emulate TLBI instructions from virtual EL2 [v8,43/69] KVM: arm64: nv: Fold guest's HCR_EL2 configuration into the host's [v8,44/69] KVM: arm64: nv: arch_timer: Support hyp timer emulation [v8,45/69] KVM: arm64: nv: Add handling of EL2-specific timer registers [v8,46/69] KVM: arm64: nv: Load timer before the GIC [v8,47/69] KVM: arm64: nv: Nested GICv3 Support [v8,48/69] KVM: arm64: nv: Don't load the GICv4 context on entering a nested guest [v8,49/69] KVM: arm64: nv: vgic: Emulate the HW bit in software [v8,50/69] KVM: arm64: nv: vgic: Allow userland to set VGIC maintenance IRQ [v8,51/69] KVM: arm64: nv: Implement maintenance interrupt forwarding [v8,52/69] KVM: arm64: nv: Deal with broken VGIC on maintenance interrupt delivery [v8,53/69] KVM: arm64: nv: Add nested GICv3 tracepoints [v8,54/69] KVM: arm64: nv: Allow userspace to request KVM_ARM_VCPU_NESTED_VIRT [v8,55/69] KVM: arm64: nv: Add handling of FEAT_TTL TLB invalidation [v8,56/69] KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information [v8,57/69] KVM: arm64: nv: Tag shadow S2 entries with nested level [v8,58/69] KVM: arm64: nv: Add include containing the VNCR_EL2 offsets [v8,59/69] KVM: arm64: nv: Map VNCR-capable registers to a separate page [v8,60/69] KVM: arm64: nv: Move nested vgic state into the sysreg file [v8,61/69] KVM: arm64: Add FEAT_NV2 cpu feature [v8,62/69] KVM: arm64: nv: Sync nested timer state with FEAT_NV2 [v8,63/69] KVM: arm64: nv: Publish emulated timer interrupt state in the in-memory state [v8,64/69] KVM: arm64: nv: Allocate VNCR page when required [v8,65/69] KVM: arm64: nv: Enable ARMv8.4-NV support [v8,66/69] KVM: arm64: nv: Fast-track 'InHost' exception returns [v8,67/69] KVM: arm64: nv: Fast-track EL1 TLBIs for VHE guests [v8,68/69] KVM: arm64: nv: Use FEAT_ECV to trap access to EL0 timers [v8,69/69] KVM: arm64: nv: Accelerate EL0 timer read accesses when FEAT_ECV is on

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index a1892a8f6032..a348c93b1094 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -159,8 +159,35 @@ struct kvm_s2_mmu { int __percpu *last_vcpu_ran; struct kvm_arch *arch; + + /* + * For a shadow stage-2 MMU, the virtual vttbr used by the + * host to parse the guest S2. + * This either contains: + * - the virtual VTTBR programmed by the guest hypervisor with + * CnP cleared + * - The value 1 (VMID=0, BADDR=0, CnP=1) if invalid + */ + u64 vttbr; + + /* + * true when this represents a nested context where virtual + * HCR_EL2.VM == 1 + */ + bool nested_stage2_enabled; + + /* + * 0: Nobody is currently using this, check vttbr for validity + * >0: Somebody is actively using this. + */ + atomic_t refcnt; }; +static inline bool kvm_s2_mmu_valid(struct kvm_s2_mmu *mmu) +{ + return !(mmu->vttbr & 1); +} + struct kvm_arch_memory_slot { }; @@ -187,6 +214,14 @@ struct kvm_protected_vm { struct kvm_arch { struct kvm_s2_mmu mmu; + /* + * Stage 2 paging state for VMs with nested S2 using a virtual + * VMID. + */ + struct kvm_s2_mmu *nested_mmus; + size_t nested_mmus_size; + int nested_mmus_next; + /* VTCR_EL2 value for this VM */ u64 vtcr; diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h index 083cc47dca08..0cb5ad7b0852 100644 --- a/arch/arm64/include/asm/kvm_mmu.h +++ b/arch/arm64/include/asm/kvm_mmu.h @@ -117,6 +117,7 @@ alternative_cb_end #include <asm/mmu_context.h> #include <asm/kvm_emulate.h> #include <asm/kvm_host.h> +#include <asm/kvm_nested.h> void kvm_update_va_mask(struct alt_instr *alt, __le32 *origptr, __le32 *updptr, int nr_inst); @@ -166,6 +167,7 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size, void **haddr); void __init free_hyp_pgds(void); +void kvm_unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size); void stage2_unmap_vm(struct kvm *kvm); int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type); void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu); @@ -307,5 +309,12 @@ static inline struct kvm *kvm_s2_mmu_to_kvm(struct kvm_s2_mmu *mmu) { return container_of(mmu->arch, struct kvm, arch); } + +static inline u64 get_vmid(u64 vttbr) +{ + return (vttbr & VTTBR_VMID_MASK(kvm_get_vmid_bits())) >> + VTTBR_VMID_SHIFT; +} + #endif /* __ASSEMBLY__ */ #endif /* __ARM64_KVM_MMU_H__ */ diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h index f2820c82e956..e3bcb351aae1 100644 --- a/arch/arm64/include/asm/kvm_nested.h +++ b/arch/arm64/include/asm/kvm_nested.h @@ -59,6 +59,13 @@ static inline u64 translate_ttbr0_el2_to_ttbr0_el1(u64 ttbr0) return ttbr0 & ~GENMASK_ULL(63, 48); } +extern void kvm_init_nested(struct kvm *kvm); +extern int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu); +extern void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu); +extern struct kvm_s2_mmu *lookup_s2_mmu(struct kvm *kvm, u64 vttbr, u64 hcr); +extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu); +extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu); + extern bool __forward_traps(struct kvm_vcpu *vcpu, unsigned int reg, u64 control_bit); extern bool forward_traps(struct kvm_vcpu *vcpu, u64 control_bit); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 4e1943e3aa42..92d56959cb27 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -36,9 +36,10 @@ #include <asm/virt.h> #include <asm/kvm_arm.h> #include <asm/kvm_asm.h> +#include <asm/kvm_emulate.h> #include <asm/kvm_mmu.h> +#include <asm/kvm_nested.h> #include <asm/kvm_pkvm.h> -#include <asm/kvm_emulate.h> #include <asm/sections.h> #include <kvm/arm_hypercalls.h> @@ -128,6 +129,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { int ret; + kvm_init_nested(kvm); + ret = kvm_share_hyp(kvm, kvm + 1); if (ret) return ret; @@ -387,6 +390,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu) struct kvm_s2_mmu *mmu; int *last_ran; + if (vcpu_has_nv(vcpu)) + kvm_vcpu_load_hw_mmu(vcpu); + mmu = vcpu->arch.hw_mmu; last_ran = this_cpu_ptr(mmu->last_vcpu_ran); @@ -437,9 +443,12 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu) kvm_timer_vcpu_put(vcpu); kvm_vgic_put(vcpu); kvm_vcpu_pmu_restore_host(vcpu); + if (vcpu_has_nv(vcpu)) + kvm_vcpu_put_hw_mmu(vcpu); kvm_arm_vmid_clear_active(); vcpu_clear_on_unsupported_cpu(vcpu); + vcpu->cpu = -1; } @@ -1169,8 +1178,13 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, vcpu->arch.target = phys_target; + /* Prepare for nested if required */ + ret = kvm_vcpu_init_nested(vcpu); + /* Now we know what it is, we can reset it. */ - ret = kvm_reset_vcpu(vcpu); + if (!ret) + ret = kvm_reset_vcpu(vcpu); + if (ret) { vcpu->arch.target = -1; bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 217a511eb271..9637049f74ae 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -217,7 +217,7 @@ static void invalidate_icache_guest_page(void *va, size_t size) * does. */ /** - * unmap_stage2_range -- Clear stage2 page table entries to unmap a range + * __unmap_stage2_range -- Clear stage2 page table entries to unmap a range * @mmu: The KVM stage-2 MMU pointer * @start: The intermediate physical base address of the range to unmap * @size: The size of the area to unmap @@ -240,7 +240,7 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 may_block)); } -static void unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size) +void kvm_unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size) { __unmap_stage2_range(mmu, start, size, true); } @@ -692,21 +692,9 @@ static struct kvm_pgtable_mm_ops kvm_s2_mm_ops = { .icache_inval_pou = invalidate_icache_guest_page, }; -/** - * kvm_init_stage2_mmu - Initialise a S2 MMU structure - * @kvm: The pointer to the KVM structure - * @mmu: The pointer to the s2 MMU structure - * @type: The machine type of the virtual machine - * - * Allocates only the stage-2 HW PGD level table(s). - * Note we don't need locking here as this is only called when the VM is - * created, which can only be done once. - */ -int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type) +static int kvm_init_ipa_range(struct kvm *kvm, unsigned long type) { u32 kvm_ipa_limit = get_kvm_ipa_limit(); - int cpu, err; - struct kvm_pgtable *pgt; u64 mmfr0, mmfr1; u32 phys_shift; @@ -733,7 +721,53 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1); kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift); + return 0; +} + +/** + * kvm_init_stage2_mmu - Initialise a S2 MMU structure + * @kvm: The pointer to the KVM structure + * @mmu: The pointer to the s2 MMU structure + * @type: The machine type of the virtual machine + * + * Allocates only the stage-2 HW PGD level table(s). + * Note we don't need locking here as this is only called in two cases: + * + * - when the VM is created, which can't race against anything + * + * - when secondary kvm_s2_mmu structures are initialised for NV + * guests, and the caller must hold kvm->lock as this is called on a + * per-vcpu basis. + */ +int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type) +{ + int cpu, err; + struct kvm_pgtable *pgt; + + /* + * We only initialise the IPA range on the canonical MMU, so + * the type is meaningless in all other situations. + */ + if (&kvm->arch.mmu == mmu) { + err = kvm_init_ipa_range(kvm, type); + if (err) + return err; + } + + /* + * If we already have our page tables in place, and that the + * MMU context is the canonical one, we have a bug somewhere, + * as this is only supposed to ever happen once per VM. + * + * Otherwise, we're building nested page tables, and that's + * probably because userspace called KVM_ARM_VCPU_INIT more + * than once on the same vcpu. Since that's actually legal, + * don't kick a fuss and leave gracefully. + */ if (mmu->pgt != NULL) { + if (&kvm->arch.mmu != mmu) + return 0; + kvm_err("kvm_arch already initialized?\n"); return -EINVAL; } @@ -758,6 +792,10 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t mmu->pgt = pgt; mmu->pgd_phys = __pa(pgt->pgd); + + if (&kvm->arch.mmu != mmu) + kvm_init_nested_s2_mmu(mmu); + return 0; out_destroy_pgtable: @@ -803,7 +841,7 @@ static void stage2_unmap_memslot(struct kvm *kvm, if (!(vma->vm_flags & VM_PFNMAP)) { gpa_t gpa = addr + (vm_start - memslot->userspace_addr); - unmap_stage2_range(&kvm->arch.mmu, gpa, vm_end - vm_start); + kvm_unmap_stage2_range(&kvm->arch.mmu, gpa, vm_end - vm_start); } hva = vm_end; } while (hva < reg_end); @@ -1846,11 +1884,6 @@ void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) { } -void kvm_arch_flush_shadow_all(struct kvm *kvm) -{ - kvm_free_stage2_pgd(&kvm->arch.mmu); -} - void kvm_arch_flush_shadow_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) { @@ -1858,7 +1891,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm, phys_addr_t size = slot->npages << PAGE_SHIFT; write_lock(&kvm->mmu_lock); - unmap_stage2_range(&kvm->arch.mmu, gpa, size); + kvm_unmap_stage2_range(&kvm->arch.mmu, gpa, size); write_unlock(&kvm->mmu_lock); } diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c index f7ec27c27a4f..5514116429af 100644 --- a/arch/arm64/kvm/nested.c +++ b/arch/arm64/kvm/nested.c @@ -7,7 +7,9 @@ #include <linux/kvm.h> #include <linux/kvm_host.h> +#include <asm/kvm_arm.h> #include <asm/kvm_emulate.h> +#include <asm/kvm_mmu.h> #include <asm/kvm_nested.h> #include <asm/sysreg.h> @@ -16,6 +18,200 @@ /* Protection against the sysreg repainting madness... */ #define NV_FTR(r, f) ID_AA64##r##_EL1_##f +void kvm_init_nested(struct kvm *kvm) +{ + kvm->arch.nested_mmus = NULL; + kvm->arch.nested_mmus_size = 0; +} + +int kvm_vcpu_init_nested(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + struct kvm_s2_mmu *tmp; + int num_mmus; + int ret = -ENOMEM; + + if (!test_bit(KVM_ARM_VCPU_HAS_EL2, vcpu->arch.features)) + return 0; + + if (!cpus_have_final_cap(ARM64_HAS_NESTED_VIRT)) + return -EINVAL; + + mutex_lock(&kvm->lock); + + /* + * Let's treat memory allocation failures as benign: If we fail to + * allocate anything, return an error and keep the allocated array + * alive. Userspace may try to recover by intializing the vcpu + * again, and there is no reason to affect the whole VM for this. + */ + num_mmus = atomic_read(&kvm->online_vcpus) * 2; + tmp = krealloc(kvm->arch.nested_mmus, + num_mmus * sizeof(*kvm->arch.nested_mmus), + GFP_KERNEL_ACCOUNT | __GFP_ZERO); + if (tmp) { + /* + * If we went through a realocation, adjust the MMU + * back-pointers in the previously initialised + * pg_table structures. + */ + if (kvm->arch.nested_mmus != tmp) { + int i; + + for (i = 0; i < num_mmus - 2; i++) + tmp[i].pgt->mmu = &tmp[i]; + } + + if (kvm_init_stage2_mmu(kvm, &tmp[num_mmus - 1], 0) || + kvm_init_stage2_mmu(kvm, &tmp[num_mmus - 2], 0)) { + kvm_free_stage2_pgd(&tmp[num_mmus - 1]); + kvm_free_stage2_pgd(&tmp[num_mmus - 2]); + } else { + kvm->arch.nested_mmus_size = num_mmus; + ret = 0; + } + + kvm->arch.nested_mmus = tmp; + } + + mutex_unlock(&kvm->lock); + return ret; +} + +/* Must be called with kvm->mmu_lock held */ +struct kvm_s2_mmu *lookup_s2_mmu(struct kvm *kvm, u64 vttbr, u64 hcr) +{ + bool nested_stage2_enabled = hcr & HCR_VM; + int i; + + /* Don't consider the CnP bit for the vttbr match */ + vttbr = vttbr & ~VTTBR_CNP_BIT; + + /* + * Two possibilities when looking up a S2 MMU context: + * + * - either S2 is enabled in the guest, and we need a context that + * is S2-enabled and matches the full VTTBR (VMID+BADDR), which + * makes it safe from a TLB conflict perspective (a broken guest + * won't be able to generate them), + * + * - or S2 is disabled, and we need a context that is S2-disabled + * and matches the VMID only, as all TLBs are tagged by VMID even + * if S2 translation is disabled. + */ + for (i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + if (!kvm_s2_mmu_valid(mmu)) + continue; + + if (nested_stage2_enabled && + mmu->nested_stage2_enabled && + vttbr == mmu->vttbr) + return mmu; + + if (!nested_stage2_enabled && + !mmu->nested_stage2_enabled && + get_vmid(vttbr) == get_vmid(mmu->vttbr)) + return mmu; + } + return NULL; +} + +/* Must be called with kvm->mmu_lock held */ +static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + u64 vttbr = vcpu_read_sys_reg(vcpu, VTTBR_EL2); + u64 hcr= vcpu_read_sys_reg(vcpu, HCR_EL2); + struct kvm_s2_mmu *s2_mmu; + int i; + + s2_mmu = lookup_s2_mmu(kvm, vttbr, hcr); + if (s2_mmu) + goto out; + + /* + * Make sure we don't always search from the same point, or we + * will always reuse a potentially active context, leaving + * free contexts unused. + */ + for (i = kvm->arch.nested_mmus_next; + i < (kvm->arch.nested_mmus_size + kvm->arch.nested_mmus_next); + i++) { + s2_mmu = &kvm->arch.nested_mmus[i % kvm->arch.nested_mmus_size]; + + if (atomic_read(&s2_mmu->refcnt) == 0) + break; + } + BUG_ON(atomic_read(&s2_mmu->refcnt)); /* We have struct MMUs to spare */ + + /* Set the scene for the next search */ + kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size; + + if (kvm_s2_mmu_valid(s2_mmu)) { + /* Clear the old state */ + kvm_unmap_stage2_range(s2_mmu, 0, kvm_phys_size(kvm)); + if (atomic64_read(&s2_mmu->vmid.id)) + kvm_call_hyp(__kvm_tlb_flush_vmid, s2_mmu); + } + + /* + * The virtual VMID (modulo CnP) will be used as a key when matching + * an existing kvm_s2_mmu. + */ + s2_mmu->vttbr = vttbr & ~VTTBR_CNP_BIT; + s2_mmu->nested_stage2_enabled = hcr & HCR_VM; + +out: + atomic_inc(&s2_mmu->refcnt); + return s2_mmu; +} + +void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu) +{ + mmu->vttbr = 1; + mmu->nested_stage2_enabled = false; + atomic_set(&mmu->refcnt, 0); +} + +void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu) +{ + if (is_hyp_ctxt(vcpu)) { + vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu; + } else { + write_lock(&vcpu->kvm->mmu_lock); + vcpu->arch.hw_mmu = get_s2_mmu_nested(vcpu); + write_unlock(&vcpu->kvm->mmu_lock); + } +} + +void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu) +{ + if (vcpu->arch.hw_mmu != &vcpu->kvm->arch.mmu) { + atomic_dec(&vcpu->arch.hw_mmu->refcnt); + vcpu->arch.hw_mmu = NULL; + } +} + +void kvm_arch_flush_shadow_all(struct kvm *kvm) +{ + int i; + + for (i = 0; i < kvm->arch.nested_mmus_size; i++) { + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i]; + + WARN_ON(atomic_read(&mmu->refcnt)); + + if (!atomic_read(&mmu->refcnt)) + kvm_free_stage2_pgd(mmu); + } + kfree(kvm->arch.nested_mmus); + kvm->arch.nested_mmus = NULL; + kvm->arch.nested_mmus_size = 0; + kvm_free_stage2_pgd(&kvm->arch.mmu); +} + /* * Our emulated CPU doesn't support all the possible features. For the * sake of simplicity (and probably mental sanity), wipe out a number

[v8,35/69] KVM: arm64: nv: Support multiple nested Stage-2 mmu structures

Commit Message

Patch