Message ID | 20221110190259.26861-1-will@kernel.org (mailing list archive) |
---|---|
Headers | show |
Series | KVM: arm64: Introduce pKVM hyp VM and vCPU state at EL2 | expand |
On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon <will@kernel.org> wrote: > > Hi all, > > This is version six of the pKVM EL2 state series, extending the pKVM > hypervisor code so that it can dynamically instantiate and manage VM > data structures without the host being able to access them directly. > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2 > page-table for the MMU. The pages used to hold the hypervisor structures > are returned to the host when the VM is destroyed. > > Previous versions are archived at: > > Mega-patch: https://lore.kernel.org/kvmarm/20220519134204.5379-1-will@kernel.org/ > v2: https://lore.kernel.org/all/20220630135747.26983-1-will@kernel.org/ > v3: https://lore.kernel.org/kvmarm/20220914083500.5118-1-will@kernel.org/ > v4: https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/ > v5: https://lore.kernel.org/r/20221020133827.5541-1-will@kernel.org > > The changes since v5 include: > > * Fix teardown ordering so that the host 'kvm' structure remains pins > while the memcache is being filled. > > * Fixed a kerneldoc typo. > > * Included a patch from Oliver to rework the 'pkvm_mem_transition' > structure and it's handling of the completer address. > > * Tweaked some commit messages and added new R-b tags. > > As before, the final patch is RFC since it illustrates a very naive use > of the new hypervisor structures and subsequent changes will improve on > this once we have the guest private memory story sorted out. > > Oliver: I'm pretty sure we're going to need to revert your completer > address cleanup as soon as we have guest-host sharing. We want to keep > the 'pkvm_mem_transition' structure 'const', but we will only know the > host address (PA) after walking the guest stage-2 and so we're going to > want to track that separately. Anyway, I've included it here at the end > so Marc can decide what he wants to do! Thanks, I guess... :-/ If this patch is going to be reverted, I'd rather not take it (without guest/host sharing, we don't have much of a hypervisor). I guess we can always revisit this in the light of the avalanche of pKVM patches that will follow... Cheers, M.
On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote: > This is version six of the pKVM EL2 state series, extending the pKVM > hypervisor code so that it can dynamically instantiate and manage VM > data structures without the host being able to access them directly. > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2 > page-table for the MMU. The pages used to hold the hypervisor structures > are returned to the host when the VM is destroyed. > > [...] As for Oliver's series, I've tentatively applied this to -next. I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll change my mind. Anyway, there was an interesting number of conflicts between the two series, which I tried to resolve as well as I could, but it is likely I broke something (although it compiles, so it must be perfect). Please have a look and shout if/when you spot something. [01/26] KVM: arm64: Move hyp refcount manipulation helpers to common header file commit: 0f4f7ae10ee4e6403659b2d9ddf05424eecde45b [02/26] KVM: arm64: Allow attaching of non-coalescable pages to a hyp pool commit: 72a5bc0f153ce8ca80e9abbd1d9adec7d586915a [03/26] KVM: arm64: Back the hypervisor 'struct hyp_page' array for all memory commit: 8e6bcc3a4502a0d8d065466efd888b6b59b85789 [04/26] KVM: arm64: Fix-up hyp stage-1 refcounts for all pages mapped at EL2 commit: 0d16d12eb26ef85602ef8a678d94825a66772774 [05/26] KVM: arm64: Unify identifiers used to distinguish host and hypervisor commit: 33bc332d4061e95db55594893c4f80105b1dd813 [06/26] KVM: arm64: Implement do_donate() helper for donating memory commit: 1ed5c24c26f48ff61dc5d97c655769821f36a622 [07/26] KVM: arm64: Prevent the donation of no-map pages commit: 43c1ff8b75011bc3e3e923adf31ba815864a2494 [08/26] KVM: arm64: Add helpers to pin memory shared with the hypervisor at EL2 commit: 9926cfce8dcb880255f30ab9ac930add787e1ead [09/26] KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h commit: 4d968b12e6bbe4440f4f220c41d779e02df8af1a [10/26] KVM: arm64: Add hyp_spinlock_t static initializer commit: 1c80002e3264552d8b9c0e162e09aa4087403716 [11/26] KVM: arm64: Rename 'host_kvm' to 'host_mmu' commit: 5304002dc3754a5663d75c977bfa2d9e3c08906d [12/26] KVM: arm64: Add infrastructure to create and track pKVM instances at EL2 commit: a1ec5c70d3f63d8a143fb83cd7f53bd8ff2f72c8 [13/26] KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1 commit: 9d0c063a4d1d10ef8e6288899b8524413e40cfa0 [14/26] KVM: arm64: Add per-cpu fixmap infrastructure at EL2 commit: aa6948f82f0b7060fbbac21911dc7996b144ba3c [15/26] KVM: arm64: Initialise hypervisor copies of host symbols unconditionally commit: 6c165223e9a6384aa1e934b90f2650e71adb972a [16/26] KVM: arm64: Provide I-cache invalidation by virtual address at EL2 commit: 13e248aab73d2f1c27b458ef09d38b44f3e5bf2e [17/26] KVM: arm64: Add generic hyp_memcache helpers commit: 717a7eebac106a5cc5d5493f8eef9cf4ae6edf19 [18/26] KVM: arm64: Consolidate stage-2 initialisation into a single function commit: 315775ff7c6de497dd07c3f6eff499fb538783eb [19/26] KVM: arm64: Instantiate guest stage-2 page-tables at EL2 commit: 60dfe093ec13b056856c672e1daa35134be38283 [20/26] KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache commit: f41dff4efb918db68923a826e966ca62c7c8e929 [21/26] KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host commit: fe41a7f8c0ee3ee2f682f8c28c7e1c5ff2be8a79 [22/26] KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2 commit: 73f38ef2ae531b180685173e0923225551434fcb [23/26] KVM: arm64: Explicitly map 'kvm_vgic_global_state' at EL2 commit: 27eb26bfff5d358d42911d04bbecc62e659ec32b [24/26] KVM: arm64: Don't unnecessarily map host kernel sections at EL2 commit: 169cd0f8238f2598b85d2db2e15828e8f8da18e5 [25/26] KVM: arm64: Clean out the odd handling of completer_addr (no commit info) [26/26] KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run() commit: be66e67f175096f283c9d5614c4991fc9e7ed975 Cheers, M.
On Fri, Nov 11, 2022 at 04:54:14PM +0000, Marc Zyngier wrote: > On Thu, 10 Nov 2022 19:02:33 +0000, > Will Deacon <will@kernel.org> wrote: > > > > Hi all, > > > > This is version six of the pKVM EL2 state series, extending the pKVM > > hypervisor code so that it can dynamically instantiate and manage VM > > data structures without the host being able to access them directly. > > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2 > > page-table for the MMU. The pages used to hold the hypervisor structures > > are returned to the host when the VM is destroyed. > > > > Previous versions are archived at: > > > > Mega-patch: https://lore.kernel.org/kvmarm/20220519134204.5379-1-will@kernel.org/ > > v2: https://lore.kernel.org/all/20220630135747.26983-1-will@kernel.org/ > > v3: https://lore.kernel.org/kvmarm/20220914083500.5118-1-will@kernel.org/ > > v4: https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/ > > v5: https://lore.kernel.org/r/20221020133827.5541-1-will@kernel.org > > > > The changes since v5 include: > > > > * Fix teardown ordering so that the host 'kvm' structure remains pins > > while the memcache is being filled. > > > > * Fixed a kerneldoc typo. > > > > * Included a patch from Oliver to rework the 'pkvm_mem_transition' > > structure and it's handling of the completer address. > > > > * Tweaked some commit messages and added new R-b tags. > > > > As before, the final patch is RFC since it illustrates a very naive use > > of the new hypervisor structures and subsequent changes will improve on > > this once we have the guest private memory story sorted out. > > > > Oliver: I'm pretty sure we're going to need to revert your completer > > address cleanup as soon as we have guest-host sharing. We want to keep > > the 'pkvm_mem_transition' structure 'const', but we will only know the > > host address (PA) after walking the guest stage-2 and so we're going to > > want to track that separately. Anyway, I've included it here at the end > > so Marc can decide what he wants to do! > > Thanks, I guess... :-/ > > If this patch is going to be reverted, I'd rather not take it (without > guest/host sharing, we don't have much of a hypervisor). +1, I'm more than happy being told my patch doesn't work :) Having said that, if there are parts of the design that I've whined about that are intentional then please educate me. Some things haven't been quite as obvious, but I know you folks have been working on this feature for a while. I probably need to give the full patch-bomb another read to get all the context too. -- Thanks, Oliver
On Fri, Nov 11, 2022 at 07:06:14PM +0000, Marc Zyngier wrote: > On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote: > > This is version six of the pKVM EL2 state series, extending the pKVM > > hypervisor code so that it can dynamically instantiate and manage VM > > data structures without the host being able to access them directly. > > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2 > > page-table for the MMU. The pages used to hold the hypervisor structures > > are returned to the host when the VM is destroyed. > > > > [...] > > As for Oliver's series, I've tentatively applied this to -next. > I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll > change my mind. > > Anyway, there was an interesting number of conflicts between the two > series, which I tried to resolve as well as I could, but it is likely > I broke something (although it compiles, so it must be perfect). > > Please have a look and shout if/when you spot something. Here is where you and I diverged on the conflict resolution, neither amounts to a whole lot but feel free to squash in. Hoping that Will + co can test the pKVM side of this. diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c index f2c4672697c2..318298eb3d6b 100644 --- a/arch/arm64/kvm/hyp/nvhe/mm.c +++ b/arch/arm64/kvm/hyp/nvhe/mm.c @@ -265,7 +265,7 @@ static int __create_fixmap_slot_cb(const struct kvm_pgtable_visit_ctx *ctx, { struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)ctx->arg); - if (!kvm_pte_valid(*ctx->ptep) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1) + if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1) return -EINVAL; slot->addr = ctx->addr; diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c index b47d969ae4d3..110f04627785 100644 --- a/arch/arm64/kvm/hyp/nvhe/setup.c +++ b/arch/arm64/kvm/hyp/nvhe/setup.c @@ -190,7 +190,7 @@ static void hpool_put_page(void *addr) } static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx, - enum kvm_pgtable_walk_flags visit) + enum kvm_pgtable_walk_flags visit) { enum kvm_pgtable_prot prot; enum pkvm_page_state state; -- Thanks, Oliver
On Fri, 11 Nov 2022 20:08:46 +0000, Oliver Upton <oliver.upton@linux.dev> wrote: > > On Fri, Nov 11, 2022 at 07:06:14PM +0000, Marc Zyngier wrote: > > On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote: > > > This is version six of the pKVM EL2 state series, extending the pKVM > > > hypervisor code so that it can dynamically instantiate and manage VM > > > data structures without the host being able to access them directly. > > > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2 > > > page-table for the MMU. The pages used to hold the hypervisor structures > > > are returned to the host when the VM is destroyed. > > > > > > [...] > > > > As for Oliver's series, I've tentatively applied this to -next. > > I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll > > change my mind. > > > > Anyway, there was an interesting number of conflicts between the two > > series, which I tried to resolve as well as I could, but it is likely > > I broke something (although it compiles, so it must be perfect). > > > > Please have a look and shout if/when you spot something. > > Here is where you and I diverged on the conflict resolution, neither > amounts to a whole lot but feel free to squash in. Hoping that Will + co > can test the pKVM side of this. > > diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c > index f2c4672697c2..318298eb3d6b 100644 > --- a/arch/arm64/kvm/hyp/nvhe/mm.c > +++ b/arch/arm64/kvm/hyp/nvhe/mm.c > @@ -265,7 +265,7 @@ static int __create_fixmap_slot_cb(const struct kvm_pgtable_visit_ctx *ctx, > { > struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)ctx->arg); > > - if (!kvm_pte_valid(*ctx->ptep) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1) > + if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1) > return -EINVAL; > > slot->addr = ctx->addr; > diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c > index b47d969ae4d3..110f04627785 100644 > --- a/arch/arm64/kvm/hyp/nvhe/setup.c > +++ b/arch/arm64/kvm/hyp/nvhe/setup.c > @@ -190,7 +190,7 @@ static void hpool_put_page(void *addr) > } > > static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx, > - enum kvm_pgtable_walk_flags visit) > + enum kvm_pgtable_walk_flags visit) > { > enum kvm_pgtable_prot prot; > enum pkvm_page_state state; > Thanks. I've folded that in the resolution and regenerated the -next branch after taking another patch from Gavin in the dirty-ring series. I've managed to test the result on both VHE and nVHE this morning, and the wheels are still attached. I don't have a good setup for pKVM at the moment though, and it is likely that the rest of the monster series will need some rework. M.
Hey Oliver, On Fri, Nov 11, 2022 at 07:42:46PM +0000, Oliver Upton wrote: > On Fri, Nov 11, 2022 at 04:54:14PM +0000, Marc Zyngier wrote: > > On Thu, 10 Nov 2022 19:02:33 +0000, > > Will Deacon <will@kernel.org> wrote: > > > > > > Hi all, > > > > > > This is version six of the pKVM EL2 state series, extending the pKVM > > > hypervisor code so that it can dynamically instantiate and manage VM > > > data structures without the host being able to access them directly. > > > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2 > > > page-table for the MMU. The pages used to hold the hypervisor structures > > > are returned to the host when the VM is destroyed. > > > > > > Previous versions are archived at: > > > > > > Mega-patch: https://lore.kernel.org/kvmarm/20220519134204.5379-1-will@kernel.org/ > > > v2: https://lore.kernel.org/all/20220630135747.26983-1-will@kernel.org/ > > > v3: https://lore.kernel.org/kvmarm/20220914083500.5118-1-will@kernel.org/ > > > v4: https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/ > > > v5: https://lore.kernel.org/r/20221020133827.5541-1-will@kernel.org > > > > > > The changes since v5 include: > > > > > > * Fix teardown ordering so that the host 'kvm' structure remains pins > > > while the memcache is being filled. > > > > > > * Fixed a kerneldoc typo. > > > > > > * Included a patch from Oliver to rework the 'pkvm_mem_transition' > > > structure and it's handling of the completer address. > > > > > > * Tweaked some commit messages and added new R-b tags. > > > > > > As before, the final patch is RFC since it illustrates a very naive use > > > of the new hypervisor structures and subsequent changes will improve on > > > this once we have the guest private memory story sorted out. > > > > > > Oliver: I'm pretty sure we're going to need to revert your completer > > > address cleanup as soon as we have guest-host sharing. We want to keep > > > the 'pkvm_mem_transition' structure 'const', but we will only know the > > > host address (PA) after walking the guest stage-2 and so we're going to > > > want to track that separately. Anyway, I've included it here at the end > > > so Marc can decide what he wants to do! > > > > Thanks, I guess... :-/ > > > > If this patch is going to be reverted, I'd rather not take it (without > > guest/host sharing, we don't have much of a hypervisor). > > +1, I'm more than happy being told my patch doesn't work :) > > Having said that, if there are parts of the design that I've whined > about that are intentional then please educate me. Some things haven't > been quite as obvious, but I know you folks have been working on this > feature for a while. Oh sure, I replied on your patches previously: https://lore.kernel.org/r/20221110104215.GA26282@willie-the-truck But here's some more detail... If a guest issues a SHARE hypercall to share a page with the host, then we'll end up in a situation where we have the guest as the initiator and the host as the completer of the share operation. At the point at which we populate the initial (const) 'pkvm_mem_transition' structure, all we will have in our hand is the guest IPA of the page being shared. We can't determine the host (completer) address from this without first walking the guest stage-2 page-table, which happens as part of the guest initiate_share code, so that's why the completer address is decoupled from the rest of the structure -- essentially, it's determine by the initiator after it performs its check. Please do shout if there's something else you're not sure about or if the above is unclear. > I probably need to give the full patch-bomb another read to get all the > context too. We'll probably drop another one of those once 6.2 is out, although we're going to need the guest private memory story to be resolved before we can progress much there, I think. Will
On Fri, Nov 11, 2022 at 07:06:14PM +0000, Marc Zyngier wrote: > On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote: > > This is version six of the pKVM EL2 state series, extending the pKVM > > hypervisor code so that it can dynamically instantiate and manage VM > > data structures without the host being able to access them directly. > > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2 > > page-table for the MMU. The pages used to hold the hypervisor structures > > are returned to the host when the VM is destroyed. > > > > [...] > > As for Oliver's series, I've tentatively applied this to -next. > I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll > change my mind. > > Anyway, there was an interesting number of conflicts between the two > series, which I tried to resolve as well as I could, but it is likely > I broke something (although it compiles, so it must be perfect). > > Please have a look and shout if/when you spot something. Cheers, the resolution looks good to me and I've played around booting non-protected guests under pKVM without any issues so far. Will