Message ID: 20250214150258.464798-4-tabba@google.com
State:      New, archived
Series:     KVM: arm64: Fix initializing HCRX_EL2 and other traps in pKVM
On Fri, Feb 14, 2025 at 03:02:58PM +0000, Fuad Tabba wrote:
> Instead of creating and initializing _all_ hyp vcpus in pKVM when
> the first host vcpu runs for the first time, initialize _each_
> hyp vcpu in conjunction with its corresponding host vcpu.
>
> Some of the host vcpu state (e.g., system register and trap
> values) is not initialized until the first time the host vcpu is
> run. Therefore, initializing a hyp vcpu before its corresponding
> host vcpu has run for the first time might not see the complete
> host state of these vcpus.
>
> Additionally, this behavior is in line with non-protected modes.
>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h      |  2 +
>  arch/arm64/include/asm/kvm_pkvm.h      |  1 +
>  arch/arm64/kvm/arm.c                   |  4 ++
>  arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  6 ---
>  arch/arm64/kvm/hyp/nvhe/pkvm.c         | 54 +++++++++++++++-----------
>  arch/arm64/kvm/pkvm.c                  | 28 ++++++-------
>  6 files changed, 53 insertions(+), 42 deletions(-)

[...]

>  static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu,
>  			      struct pkvm_hyp_vm *hyp_vm,
> -			      struct kvm_vcpu *host_vcpu,
> -			      unsigned int vcpu_idx)
> +			      struct kvm_vcpu *host_vcpu)
>  {
>  	int ret = 0;
>
>  	if (hyp_pin_shared_mem(host_vcpu, host_vcpu + 1))
>  		return -EBUSY;
>
> -	if (host_vcpu->vcpu_idx != vcpu_idx) {
> -		ret = -EINVAL;
> -		goto done;
> -	}
> -
>  	hyp_vcpu->host_vcpu = host_vcpu;
>
>  	hyp_vcpu->vcpu.kvm = &hyp_vm->kvm;
>  	hyp_vcpu->vcpu.vcpu_id = READ_ONCE(host_vcpu->vcpu_id);
> -	hyp_vcpu->vcpu.vcpu_idx = vcpu_idx;
> +	hyp_vcpu->vcpu.vcpu_idx = READ_ONCE(host_vcpu->vcpu_idx);
>
>  	hyp_vcpu->vcpu.arch.hw_mmu = &hyp_vm->kvm.arch.mmu;
>  	hyp_vcpu->vcpu.arch.cflags = READ_ONCE(host_vcpu->arch.cflags);
> @@ -687,27 +689,28 @@ int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu,
>  		goto unlock;
>  	}
>
> -	idx = hyp_vm->nr_vcpus;
> +	ret = init_pkvm_hyp_vcpu(hyp_vcpu, hyp_vm, host_vcpu);
> +	if (ret)
> +		goto unlock;
> +
> +	idx = hyp_vcpu->vcpu.vcpu_idx;
>  	if (idx >= hyp_vm->kvm.created_vcpus) {
>  		ret = -EINVAL;
>  		goto unlock;
>  	}
>
> -	ret = init_pkvm_hyp_vcpu(hyp_vcpu, hyp_vm, host_vcpu, idx);
> -	if (ret)
> +	if (hyp_vm->vcpus[idx]) {
> +		ret = -EINVAL;
>  		goto unlock;
> +	}

I'm not sure how much we care at EL2, but it looks like there's a
potential spectre gadget here given that 'idx' is now untrusted.
Perhaps chuck something like:

	idx = array_index_nospec(idx, hyp_vm->kvm.created_vcpus);

before indexing into 'hyp_vm->vcpus[]'?

Either way:

Acked-by: Will Deacon <will@kernel.org>

Will
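[For reference, a minimal sketch of how the suggested clamp could slot
into __pkvm_init_vcpu(). This merely applies Will's suggestion to the
hunk quoted above and is not part of the posted patch;
array_index_nospec() comes from <linux/nospec.h>:

	idx = hyp_vcpu->vcpu.vcpu_idx;
	if (idx >= hyp_vm->kvm.created_vcpus) {
		ret = -EINVAL;
		goto unlock;
	}

	/*
	 * Clamp the host-controlled index under speculation so that a
	 * mispredicted bounds check cannot be used to load beyond the
	 * end of hyp_vm->vcpus[].
	 */
	idx = array_index_nospec(idx, hyp_vm->kvm.created_vcpus);

	if (hyp_vm->vcpus[idx]) {
		ret = -EINVAL;
		goto unlock;
	}
]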
Hi Will,

On Mon, 17 Feb 2025 at 15:30, Will Deacon <will@kernel.org> wrote:
>
> On Fri, Feb 14, 2025 at 03:02:58PM +0000, Fuad Tabba wrote:
>
> [...]
>
> I'm not sure how much we care at EL2, but it looks like there's a
> potential spectre gadget here given that 'idx' is now untrusted.
> Perhaps chuck something like:
>
> 	idx = array_index_nospec(idx, hyp_vm->kvm.created_vcpus);
>
> before indexing into 'hyp_vm->vcpus[]'?

I'll add that when I respin.

> Either way:
>
> Acked-by: Will Deacon <will@kernel.org>

Thanks,
/fuad

> Will
Hi Will,

On Mon, 17 Feb 2025 at 15:41, Fuad Tabba <tabba@google.com> wrote:
>
> On Mon, 17 Feb 2025 at 15:30, Will Deacon <will@kernel.org> wrote:
> >
> > [...]
> >
> > I'm not sure how much we care at EL2, but it looks like there's a
> > potential spectre gadget here given that 'idx' is now untrusted.
> > Perhaps chuck something like:
> >
> > 	idx = array_index_nospec(idx, hyp_vm->kvm.created_vcpus);
> >
> > before indexing into 'hyp_vm->vcpus[]'?
>
> I'll add that when I respin.

pKVM is riddled with these kinds of accesses (e.g.,
get_vm_by_handle()). It might be better if I address this in a
separate series.

Cheers,
/fuad

> > Either way:
> >
> > Acked-by: Will Deacon <will@kernel.org>
>
> Thanks,
> /fuad
>
> > Will
On Mon, Feb 17, 2025 at 03:56:53PM +0000, Fuad Tabba wrote:
> On Mon, 17 Feb 2025 at 15:41, Fuad Tabba <tabba@google.com> wrote:
> > On Mon, 17 Feb 2025 at 15:30, Will Deacon <will@kernel.org> wrote:
> > >
> > > [...]
> > >
> > > I'm not sure how much we care at EL2, but it looks like there's a
> > > potential spectre gadget here given that 'idx' is now untrusted.
> > > Perhaps chuck something like:
> > >
> > > 	idx = array_index_nospec(idx, hyp_vm->kvm.created_vcpus);
> > >
> > > before indexing into 'hyp_vm->vcpus[]'?
> >
> > I'll add that when I respin.
>
> pKVM is riddled with these kinds of accesses (e.g.,
> get_vm_by_handle()). It might be better if I address this in a
> separate series.

Yeah, makes sense to me. Like I said, I'm not entirely sure how much we
care about spectre at EL2, given that it's using a separate translation
regime, but having a series to fix up some of the potential issues might
provide a useful basis for discussion.

Will
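[As for the other accesses Fuad mentions: get_vm_by_handle() follows
the same bounds-check-then-index pattern, so a follow-up hardening
series would presumably apply the same clamp there. A hypothetical
sketch, assuming the lookup keeps its current shape, where vm_table[]
is bounded by KVM_MAX_PVMS and the handle is mapped to an index by
vm_handle_to_idx():

	static struct pkvm_hyp_vm *get_vm_by_handle(pkvm_handle_t handle)
	{
		unsigned int idx = vm_handle_to_idx(handle);

		if (unlikely(idx >= KVM_MAX_PVMS))
			return NULL;

		/* Clamp the handle-derived index under speculation. */
		idx = array_index_nospec(idx, KVM_MAX_PVMS);

		return vm_table[idx];
	}
]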
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7cfa024de4e3..d35ce3c860fc 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -881,6 +881,8 @@ struct kvm_vcpu_arch {
 #define VCPU_INITIALIZED	__vcpu_single_flag(cflags, BIT(0))
 /* SVE config completed */
 #define VCPU_SVE_FINALIZED	__vcpu_single_flag(cflags, BIT(1))
+/* pKVM VCPU setup completed */
+#define VCPU_PKVM_FINALIZED	__vcpu_single_flag(cflags, BIT(2))
 
 /* Exception pending */
 #define PENDING_EXCEPTION	__vcpu_single_flag(iflags, BIT(0))
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index eb65f12e81d9..abd693ce5b93 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -19,6 +19,7 @@
 int pkvm_init_host_vm(struct kvm *kvm);
 int pkvm_create_hyp_vm(struct kvm *kvm);
 void pkvm_destroy_hyp_vm(struct kvm *kvm);
+int pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu);
 
 /*
  * This functions as an allow-list of protected VM capabilities.
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 071a7d75be68..e4661c2f0423 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -833,6 +833,10 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 		ret = pkvm_create_hyp_vm(kvm);
 		if (ret)
 			return ret;
+
+		ret = pkvm_create_hyp_vcpu(vcpu);
+		if (ret)
+			return ret;
 	}
 
 	mutex_lock(&kvm->arch.config_lock);
diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
index e42bf68c8848..ce31d3b73603 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
@@ -43,12 +43,6 @@ struct pkvm_hyp_vm {
 	struct hyp_pool pool;
 	hyp_spinlock_t lock;
 
-	/*
-	 * The number of vcpus initialized and ready to run.
-	 * Modifying this is protected by 'vm_table_lock'.
-	 */
-	unsigned int nr_vcpus;
-
 	/* Array of the hyp vCPU structures for this VM. */
 	struct pkvm_hyp_vcpu *vcpus[];
 };
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 668ebec27f1b..20abd46e03b8 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -285,10 +285,12 @@ struct pkvm_hyp_vcpu *pkvm_load_hyp_vcpu(pkvm_handle_t handle,
 	hyp_spin_lock(&vm_table_lock);
 	hyp_vm = get_vm_by_handle(handle);
-	if (!hyp_vm || hyp_vm->nr_vcpus <= vcpu_idx)
+	if (!hyp_vm || hyp_vm->kvm.created_vcpus <= vcpu_idx)
 		goto unlock;
 
 	hyp_vcpu = hyp_vm->vcpus[vcpu_idx];
+	if (!hyp_vcpu)
+		goto unlock;
 
 	/* Ensure vcpu isn't loaded on more than one cpu simultaneously.
 	 */
 	if (unlikely(hyp_vcpu->loaded_hyp_vcpu)) {
@@ -407,8 +409,14 @@ static void unpin_host_vcpus(struct pkvm_hyp_vcpu *hyp_vcpus[],
 {
 	int i;
 
-	for (i = 0; i < nr_vcpus; i++)
-		unpin_host_vcpu(hyp_vcpus[i]->host_vcpu);
+	for (i = 0; i < nr_vcpus; i++) {
+		struct pkvm_hyp_vcpu *hyp_vcpu = hyp_vcpus[i];
+
+		if (!hyp_vcpu)
+			continue;
+
+		unpin_host_vcpu(hyp_vcpu->host_vcpu);
+	}
 }
 
 static void init_pkvm_hyp_vm(struct kvm *host_kvm, struct pkvm_hyp_vm *hyp_vm,
@@ -432,24 +440,18 @@ static void pkvm_vcpu_init_sve(struct pkvm_hyp_vcpu *hyp_vcpu, struct kvm_vcpu *
 static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu,
 			      struct pkvm_hyp_vm *hyp_vm,
-			      struct kvm_vcpu *host_vcpu,
-			      unsigned int vcpu_idx)
+			      struct kvm_vcpu *host_vcpu)
 {
 	int ret = 0;
 
 	if (hyp_pin_shared_mem(host_vcpu, host_vcpu + 1))
 		return -EBUSY;
 
-	if (host_vcpu->vcpu_idx != vcpu_idx) {
-		ret = -EINVAL;
-		goto done;
-	}
-
 	hyp_vcpu->host_vcpu = host_vcpu;
 
 	hyp_vcpu->vcpu.kvm = &hyp_vm->kvm;
 	hyp_vcpu->vcpu.vcpu_id = READ_ONCE(host_vcpu->vcpu_id);
-	hyp_vcpu->vcpu.vcpu_idx = vcpu_idx;
+	hyp_vcpu->vcpu.vcpu_idx = READ_ONCE(host_vcpu->vcpu_idx);
 
 	hyp_vcpu->vcpu.arch.hw_mmu = &hyp_vm->kvm.arch.mmu;
 	hyp_vcpu->vcpu.arch.cflags = READ_ONCE(host_vcpu->arch.cflags);
@@ -687,27 +689,28 @@ int __pkvm_init_vcpu(pkvm_handle_t handle, struct kvm_vcpu *host_vcpu,
 		goto unlock;
 	}
 
-	idx = hyp_vm->nr_vcpus;
+	ret = init_pkvm_hyp_vcpu(hyp_vcpu, hyp_vm, host_vcpu);
+	if (ret)
+		goto unlock;
+
+	idx = hyp_vcpu->vcpu.vcpu_idx;
 	if (idx >= hyp_vm->kvm.created_vcpus) {
 		ret = -EINVAL;
 		goto unlock;
 	}
 
-	ret = init_pkvm_hyp_vcpu(hyp_vcpu, hyp_vm, host_vcpu, idx);
-	if (ret)
+	if (hyp_vm->vcpus[idx]) {
+		ret = -EINVAL;
 		goto unlock;
+	}
 
 	hyp_vm->vcpus[idx] = hyp_vcpu;
-	hyp_vm->nr_vcpus++;
 unlock:
 	hyp_spin_unlock(&vm_table_lock);
 
-	if (ret) {
+	if (ret)
 		unmap_donated_memory(hyp_vcpu, sizeof(*hyp_vcpu));
-		return ret;
-	}
-
-	return 0;
+	return ret;
 }
 
 static void
@@ -753,12 +756,17 @@ int __pkvm_teardown_vm(pkvm_handle_t handle)
 	/* Reclaim guest pages (including page-table pages) */
 	mc = &host_kvm->arch.pkvm.teardown_mc;
 	reclaim_guest_pages(hyp_vm, mc);
-	unpin_host_vcpus(hyp_vm->vcpus, hyp_vm->nr_vcpus);
+	unpin_host_vcpus(hyp_vm->vcpus, hyp_vm->kvm.created_vcpus);
 
 	/* Push the metadata pages to the teardown memcache */
-	for (idx = 0; idx < hyp_vm->nr_vcpus; ++idx) {
+	for (idx = 0; idx < hyp_vm->kvm.created_vcpus; ++idx) {
 		struct pkvm_hyp_vcpu *hyp_vcpu = hyp_vm->vcpus[idx];
-		struct kvm_hyp_memcache *vcpu_mc = &hyp_vcpu->vcpu.arch.pkvm_memcache;
+		struct kvm_hyp_memcache *vcpu_mc;
+
+		if (!hyp_vcpu)
+			continue;
+
+		vcpu_mc = &hyp_vcpu->vcpu.arch.pkvm_memcache;
 
 		while (vcpu_mc->nr_pages) {
 			void *addr = pop_hyp_memcache(vcpu_mc, hyp_phys_to_virt);
diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index 06665cf221bf..f2a09dc50676 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -125,7 +125,9 @@ static int __pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu)
 		return -ENOMEM;
 
 	ret = kvm_call_hyp_nvhe(__pkvm_init_vcpu, handle, vcpu, hyp_vcpu);
-	if (ret)
+	if (!ret)
+		vcpu_set_flag(vcpu, VCPU_PKVM_FINALIZED);
+	else
 		free_pages_exact(hyp_vcpu, hyp_vcpu_sz);
 
 	return ret;
@@ -144,9 +146,7 @@ static int __pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu)
 static int __pkvm_create_hyp_vm(struct kvm *host_kvm)
 {
 	size_t pgd_sz, hyp_vm_sz;
-	struct kvm_vcpu *host_vcpu;
 	void *pgd, *hyp_vm;
-	unsigned long idx;
 	int ret;
 
 	if (host_kvm->created_vcpus < 1)
@@ -180,17 +180,7 @@ static int __pkvm_create_hyp_vm(struct kvm *host_kvm)
 
 	WRITE_ONCE(host_kvm->arch.pkvm.handle, ret);
 
-	kvm_for_each_vcpu(idx, host_vcpu, host_kvm) {
-		ret = __pkvm_create_hyp_vcpu(host_vcpu);
-		if (ret)
-			goto destroy_vm;
-	}
-
 	return 0;
-
-destroy_vm:
-	__pkvm_destroy_hyp_vm(host_kvm);
-	return ret;
 free_vm:
 	free_pages_exact(hyp_vm, hyp_vm_sz);
 free_pgd:
@@ -210,6 +200,18 @@ int pkvm_create_hyp_vm(struct kvm *host_kvm)
 	return ret;
 }
 
+int pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu)
+{
+	int ret = 0;
+
+	mutex_lock(&vcpu->kvm->arch.config_lock);
+	if (!vcpu_get_flag(vcpu, VCPU_PKVM_FINALIZED))
+		ret = __pkvm_create_hyp_vcpu(vcpu);
+	mutex_unlock(&vcpu->kvm->arch.config_lock);
+
+	return ret;
+}
+
 void pkvm_destroy_hyp_vm(struct kvm *host_kvm)
 {
 	mutex_lock(&host_kvm->arch.config_lock);
Instead of creating and initializing _all_ hyp vcpus in pKVM when
the first host vcpu runs for the first time, initialize _each_
hyp vcpu in conjunction with its corresponding host vcpu.

Some of the host vcpu state (e.g., system register and trap
values) is not initialized until the first time the host vcpu is
run. Therefore, initializing a hyp vcpu before its corresponding
host vcpu has run for the first time might not see the complete
host state of these vcpus.

Additionally, this behavior is in line with non-protected modes.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h      |  2 +
 arch/arm64/include/asm/kvm_pkvm.h      |  1 +
 arch/arm64/kvm/arm.c                   |  4 ++
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  6 ---
 arch/arm64/kvm/hyp/nvhe/pkvm.c         | 54 +++++++++++++++-----------
 arch/arm64/kvm/pkvm.c                  | 28 ++++++-------
 6 files changed, 53 insertions(+), 42 deletions(-)
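[For orientation, the run-time flow this patch produces can be
summarized as follows; a condensed sketch of the call chain distilled
from the diff above, not literal kernel code:

	/*
	 * First KVM_RUN on each host vcpu, protected mode:
	 *
	 *   kvm_arch_vcpu_run_pid_change()
	 *     -> pkvm_create_hyp_vm(kvm)      hyp VM created once per VM
	 *     -> pkvm_create_hyp_vcpu(vcpu)   hyp vcpu created lazily, per
	 *                                     vcpu, under config_lock, only
	 *                                     if VCPU_PKVM_FINALIZED is unset
	 *        -> __pkvm_create_hyp_vcpu()  donates memory to EL2, then
	 *           -> __pkvm_init_vcpu()     EL2 init; on success the host
	 *                                     sets VCPU_PKVM_FINALIZED
	 */
]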