mbox series

[v6,00/26] KVM: arm64: Introduce pKVM hyp VM and vCPU state at EL2

Message ID 20221110190259.26861-1-will@kernel.org (mailing list archive)
Headers show
Series KVM: arm64: Introduce pKVM hyp VM and vCPU state at EL2 | expand

Message

Will Deacon Nov. 10, 2022, 7:02 p.m. UTC
Hi all,

This is version six of the pKVM EL2 state series, extending the pKVM
hypervisor code so that it can dynamically instantiate and manage VM
data structures without the host being able to access them directly.
These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
page-table for the MMU. The pages used to hold the hypervisor structures
are returned to the host when the VM is destroyed.

Previous versions are archived at:

  Mega-patch: https://lore.kernel.org/kvmarm/20220519134204.5379-1-will@kernel.org/
  v2: https://lore.kernel.org/all/20220630135747.26983-1-will@kernel.org/
  v3: https://lore.kernel.org/kvmarm/20220914083500.5118-1-will@kernel.org/
  v4: https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/
  v5: https://lore.kernel.org/r/20221020133827.5541-1-will@kernel.org

The changes since v5 include:

  * Fix teardown ordering so that the host 'kvm' structure remains pins
    while the memcache is being filled.

  * Fixed a kerneldoc typo.

  * Included a patch from Oliver to rework the 'pkvm_mem_transition'
    structure and it's handling of the completer address.

  * Tweaked some commit messages and added new R-b tags.

As before, the final patch is RFC since it illustrates a very naive use
of the new hypervisor structures and subsequent changes will improve on
this once we have the guest private memory story sorted out.

Oliver: I'm pretty sure we're going to need to revert your completer
address cleanup as soon as we have guest-host sharing. We want to keep
the 'pkvm_mem_transition' structure 'const', but we will only know the
host address (PA) after walking the guest stage-2 and so we're going to
want to track that separately. Anyway, I've included it here at the end
so Marc can decide what he wants to do!

Cheers,

Will, Quentin and Fuad

Cc: Sean Christopherson <seanjc@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: James Morse <james.morse@arm.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Marc Zyngier <maz@kernel.org>

Cc: kernel-team@android.com
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.linux.dev
Cc: linux-arm-kernel@lists.infradead.org


--->8

Fuad Tabba (3):
  KVM: arm64: Add hyp_spinlock_t static initializer
  KVM: arm64: Add infrastructure to create and track pKVM instances at
    EL2
  KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from
    EL1

Oliver Upton (1):
  KVM: arm64: Clean out the odd handling of completer_addr

Quentin Perret (14):
  KVM: arm64: Move hyp refcount manipulation helpers to common header
    file
  KVM: arm64: Allow attaching of non-coalescable pages to a hyp pool
  KVM: arm64: Back the hypervisor 'struct hyp_page' array for all memory
  KVM: arm64: Fix-up hyp stage-1 refcounts for all pages mapped at EL2
  KVM: arm64: Prevent the donation of no-map pages
  KVM: arm64: Add helpers to pin memory shared with the hypervisor at
    EL2
  KVM: arm64: Add per-cpu fixmap infrastructure at EL2
  KVM: arm64: Add generic hyp_memcache helpers
  KVM: arm64: Consolidate stage-2 initialisation into a single function
  KVM: arm64: Instantiate guest stage-2 page-tables at EL2
  KVM: arm64: Return guest memory from EL2 via dedicated teardown
    memcache
  KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host
  KVM: arm64: Explicitly map 'kvm_vgic_global_state' at EL2
  KVM: arm64: Don't unnecessarily map host kernel sections at EL2

Will Deacon (8):
  KVM: arm64: Unify identifiers used to distinguish host and hypervisor
  KVM: arm64: Implement do_donate() helper for donating memory
  KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h
  KVM: arm64: Rename 'host_kvm' to 'host_mmu'
  KVM: arm64: Initialise hypervisor copies of host symbols
    unconditionally
  KVM: arm64: Provide I-cache invalidation by virtual address at EL2
  KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
  KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run()

 arch/arm64/include/asm/kvm_arm.h              |   2 +-
 arch/arm64/include/asm/kvm_asm.h              |   7 +-
 arch/arm64/include/asm/kvm_host.h             |  72 ++-
 arch/arm64/include/asm/kvm_hyp.h              |   3 +
 arch/arm64/include/asm/kvm_mmu.h              |   2 +-
 arch/arm64/include/asm/kvm_pgtable.h          |  22 +
 arch/arm64/include/asm/kvm_pkvm.h             |  38 ++
 arch/arm64/kernel/image-vars.h                |  15 -
 arch/arm64/kvm/arm.c                          |  61 +-
 arch/arm64/kvm/hyp/hyp-constants.c            |   3 +
 arch/arm64/kvm/hyp/include/nvhe/mem_protect.h |  25 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h      |  27 +
 arch/arm64/kvm/hyp/include/nvhe/mm.h          |  18 +-
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h        |  68 +++
 arch/arm64/kvm/hyp/include/nvhe/spinlock.h    |  10 +-
 arch/arm64/kvm/hyp/nvhe/cache.S               |  11 +
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            | 110 +++-
 arch/arm64/kvm/hyp/nvhe/hyp-smp.c             |   2 +
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         | 558 +++++++++++++++---
 arch/arm64/kvm/hyp/nvhe/mm.c                  | 168 +++++-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          |  29 +-
 arch/arm64/kvm/hyp/nvhe/pkvm.c                | 436 ++++++++++++++
 arch/arm64/kvm/hyp/nvhe/setup.c               |  96 ++-
 arch/arm64/kvm/hyp/pgtable.c                  |  21 +-
 arch/arm64/kvm/mmu.c                          |  55 +-
 arch/arm64/kvm/pkvm.c                         | 138 ++++-
 arch/arm64/kvm/reset.c                        |  29 -
 27 files changed, 1765 insertions(+), 261 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h

Comments

Marc Zyngier Nov. 11, 2022, 4:54 p.m. UTC | #1
On Thu, 10 Nov 2022 19:02:33 +0000,
Will Deacon <will@kernel.org> wrote:
> 
> Hi all,
> 
> This is version six of the pKVM EL2 state series, extending the pKVM
> hypervisor code so that it can dynamically instantiate and manage VM
> data structures without the host being able to access them directly.
> These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
> page-table for the MMU. The pages used to hold the hypervisor structures
> are returned to the host when the VM is destroyed.
> 
> Previous versions are archived at:
> 
>   Mega-patch: https://lore.kernel.org/kvmarm/20220519134204.5379-1-will@kernel.org/
>   v2: https://lore.kernel.org/all/20220630135747.26983-1-will@kernel.org/
>   v3: https://lore.kernel.org/kvmarm/20220914083500.5118-1-will@kernel.org/
>   v4: https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/
>   v5: https://lore.kernel.org/r/20221020133827.5541-1-will@kernel.org
> 
> The changes since v5 include:
> 
>   * Fix teardown ordering so that the host 'kvm' structure remains pins
>     while the memcache is being filled.
> 
>   * Fixed a kerneldoc typo.
> 
>   * Included a patch from Oliver to rework the 'pkvm_mem_transition'
>     structure and it's handling of the completer address.
> 
>   * Tweaked some commit messages and added new R-b tags.
> 
> As before, the final patch is RFC since it illustrates a very naive use
> of the new hypervisor structures and subsequent changes will improve on
> this once we have the guest private memory story sorted out.
> 
> Oliver: I'm pretty sure we're going to need to revert your completer
> address cleanup as soon as we have guest-host sharing. We want to keep
> the 'pkvm_mem_transition' structure 'const', but we will only know the
> host address (PA) after walking the guest stage-2 and so we're going to
> want to track that separately. Anyway, I've included it here at the end
> so Marc can decide what he wants to do!

Thanks, I guess... :-/

If this patch is going to be reverted, I'd rather not take it (without
guest/host sharing, we don't have much of a hypervisor).

I guess we can always revisit this in the light of the avalanche of
pKVM patches that will follow...

Cheers,

	M.
Marc Zyngier Nov. 11, 2022, 7:06 p.m. UTC | #2
On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote:
> This is version six of the pKVM EL2 state series, extending the pKVM
> hypervisor code so that it can dynamically instantiate and manage VM
> data structures without the host being able to access them directly.
> These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
> page-table for the MMU. The pages used to hold the hypervisor structures
> are returned to the host when the VM is destroyed.
> 
> [...]

As for Oliver's series, I've tentatively applied this to -next.
I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll
change my mind.

Anyway, there was an interesting number of conflicts between the two
series, which I tried to resolve as well as I could, but it is likely
I broke something (although it compiles, so it must be perfect).

Please have a look and shout if/when you spot something.

[01/26] KVM: arm64: Move hyp refcount manipulation helpers to common header file
        commit: 0f4f7ae10ee4e6403659b2d9ddf05424eecde45b
[02/26] KVM: arm64: Allow attaching of non-coalescable pages to a hyp pool
        commit: 72a5bc0f153ce8ca80e9abbd1d9adec7d586915a
[03/26] KVM: arm64: Back the hypervisor 'struct hyp_page' array for all memory
        commit: 8e6bcc3a4502a0d8d065466efd888b6b59b85789
[04/26] KVM: arm64: Fix-up hyp stage-1 refcounts for all pages mapped at EL2
        commit: 0d16d12eb26ef85602ef8a678d94825a66772774
[05/26] KVM: arm64: Unify identifiers used to distinguish host and hypervisor
        commit: 33bc332d4061e95db55594893c4f80105b1dd813
[06/26] KVM: arm64: Implement do_donate() helper for donating memory
        commit: 1ed5c24c26f48ff61dc5d97c655769821f36a622
[07/26] KVM: arm64: Prevent the donation of no-map pages
        commit: 43c1ff8b75011bc3e3e923adf31ba815864a2494
[08/26] KVM: arm64: Add helpers to pin memory shared with the hypervisor at EL2
        commit: 9926cfce8dcb880255f30ab9ac930add787e1ead
[09/26] KVM: arm64: Include asm/kvm_mmu.h in nvhe/mem_protect.h
        commit: 4d968b12e6bbe4440f4f220c41d779e02df8af1a
[10/26] KVM: arm64: Add hyp_spinlock_t static initializer
        commit: 1c80002e3264552d8b9c0e162e09aa4087403716
[11/26] KVM: arm64: Rename 'host_kvm' to 'host_mmu'
        commit: 5304002dc3754a5663d75c977bfa2d9e3c08906d
[12/26] KVM: arm64: Add infrastructure to create and track pKVM instances at EL2
        commit: a1ec5c70d3f63d8a143fb83cd7f53bd8ff2f72c8
[13/26] KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1
        commit: 9d0c063a4d1d10ef8e6288899b8524413e40cfa0
[14/26] KVM: arm64: Add per-cpu fixmap infrastructure at EL2
        commit: aa6948f82f0b7060fbbac21911dc7996b144ba3c
[15/26] KVM: arm64: Initialise hypervisor copies of host symbols unconditionally
        commit: 6c165223e9a6384aa1e934b90f2650e71adb972a
[16/26] KVM: arm64: Provide I-cache invalidation by virtual address at EL2
        commit: 13e248aab73d2f1c27b458ef09d38b44f3e5bf2e
[17/26] KVM: arm64: Add generic hyp_memcache helpers
        commit: 717a7eebac106a5cc5d5493f8eef9cf4ae6edf19
[18/26] KVM: arm64: Consolidate stage-2 initialisation into a single function
        commit: 315775ff7c6de497dd07c3f6eff499fb538783eb
[19/26] KVM: arm64: Instantiate guest stage-2 page-tables at EL2
        commit: 60dfe093ec13b056856c672e1daa35134be38283
[20/26] KVM: arm64: Return guest memory from EL2 via dedicated teardown memcache
        commit: f41dff4efb918db68923a826e966ca62c7c8e929
[21/26] KVM: arm64: Unmap 'kvm_arm_hyp_percpu_base' from the host
        commit: fe41a7f8c0ee3ee2f682f8c28c7e1c5ff2be8a79
[22/26] KVM: arm64: Maintain a copy of 'kvm_arm_vmid_bits' at EL2
        commit: 73f38ef2ae531b180685173e0923225551434fcb
[23/26] KVM: arm64: Explicitly map 'kvm_vgic_global_state' at EL2
        commit: 27eb26bfff5d358d42911d04bbecc62e659ec32b
[24/26] KVM: arm64: Don't unnecessarily map host kernel sections at EL2
        commit: 169cd0f8238f2598b85d2db2e15828e8f8da18e5
[25/26] KVM: arm64: Clean out the odd handling of completer_addr
        (no commit info)
[26/26] KVM: arm64: Use the pKVM hyp vCPU structure in handle___kvm_vcpu_run()
        commit: be66e67f175096f283c9d5614c4991fc9e7ed975

Cheers,

	M.
Oliver Upton Nov. 11, 2022, 7:42 p.m. UTC | #3
On Fri, Nov 11, 2022 at 04:54:14PM +0000, Marc Zyngier wrote:
> On Thu, 10 Nov 2022 19:02:33 +0000,
> Will Deacon <will@kernel.org> wrote:
> > 
> > Hi all,
> > 
> > This is version six of the pKVM EL2 state series, extending the pKVM
> > hypervisor code so that it can dynamically instantiate and manage VM
> > data structures without the host being able to access them directly.
> > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
> > page-table for the MMU. The pages used to hold the hypervisor structures
> > are returned to the host when the VM is destroyed.
> > 
> > Previous versions are archived at:
> > 
> >   Mega-patch: https://lore.kernel.org/kvmarm/20220519134204.5379-1-will@kernel.org/
> >   v2: https://lore.kernel.org/all/20220630135747.26983-1-will@kernel.org/
> >   v3: https://lore.kernel.org/kvmarm/20220914083500.5118-1-will@kernel.org/
> >   v4: https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/
> >   v5: https://lore.kernel.org/r/20221020133827.5541-1-will@kernel.org
> > 
> > The changes since v5 include:
> > 
> >   * Fix teardown ordering so that the host 'kvm' structure remains pins
> >     while the memcache is being filled.
> > 
> >   * Fixed a kerneldoc typo.
> > 
> >   * Included a patch from Oliver to rework the 'pkvm_mem_transition'
> >     structure and it's handling of the completer address.
> > 
> >   * Tweaked some commit messages and added new R-b tags.
> > 
> > As before, the final patch is RFC since it illustrates a very naive use
> > of the new hypervisor structures and subsequent changes will improve on
> > this once we have the guest private memory story sorted out.
> > 
> > Oliver: I'm pretty sure we're going to need to revert your completer
> > address cleanup as soon as we have guest-host sharing. We want to keep
> > the 'pkvm_mem_transition' structure 'const', but we will only know the
> > host address (PA) after walking the guest stage-2 and so we're going to
> > want to track that separately. Anyway, I've included it here at the end
> > so Marc can decide what he wants to do!
> 
> Thanks, I guess... :-/
> 
> If this patch is going to be reverted, I'd rather not take it (without
> guest/host sharing, we don't have much of a hypervisor).

+1, I'm more than happy being told my patch doesn't work :)

Having said that, if there are parts of the design that I've whined
about that are intentional then please educate me. Some things haven't
been quite as obvious, but I know you folks have been working on this
feature for a while.

I probably need to give the full patch-bomb another read to get all the
context too.

--
Thanks,
Oliver
Oliver Upton Nov. 11, 2022, 8:08 p.m. UTC | #4
On Fri, Nov 11, 2022 at 07:06:14PM +0000, Marc Zyngier wrote:
> On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote:
> > This is version six of the pKVM EL2 state series, extending the pKVM
> > hypervisor code so that it can dynamically instantiate and manage VM
> > data structures without the host being able to access them directly.
> > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
> > page-table for the MMU. The pages used to hold the hypervisor structures
> > are returned to the host when the VM is destroyed.
> > 
> > [...]
> 
> As for Oliver's series, I've tentatively applied this to -next.
> I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll
> change my mind.
> 
> Anyway, there was an interesting number of conflicts between the two
> series, which I tried to resolve as well as I could, but it is likely
> I broke something (although it compiles, so it must be perfect).
> 
> Please have a look and shout if/when you spot something.

Here is where you and I diverged on the conflict resolution, neither
amounts to a whole lot but feel free to squash in. Hoping that Will + co
can test the pKVM side of this.

diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
index f2c4672697c2..318298eb3d6b 100644
--- a/arch/arm64/kvm/hyp/nvhe/mm.c
+++ b/arch/arm64/kvm/hyp/nvhe/mm.c
@@ -265,7 +265,7 @@ static int __create_fixmap_slot_cb(const struct kvm_pgtable_visit_ctx *ctx,
 {
 	struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)ctx->arg);
 
-	if (!kvm_pte_valid(*ctx->ptep) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1)
+	if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1)
 		return -EINVAL;
 
 	slot->addr = ctx->addr;
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index b47d969ae4d3..110f04627785 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -190,7 +190,7 @@ static void hpool_put_page(void *addr)
 }
 
 static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx,
-					 enum kvm_pgtable_walk_flags visit)
+				     enum kvm_pgtable_walk_flags visit)
 {
 	enum kvm_pgtable_prot prot;
 	enum pkvm_page_state state;

--
Thanks,
Oliver
Marc Zyngier Nov. 12, 2022, 11:34 a.m. UTC | #5
On Fri, 11 Nov 2022 20:08:46 +0000,
Oliver Upton <oliver.upton@linux.dev> wrote:
> 
> On Fri, Nov 11, 2022 at 07:06:14PM +0000, Marc Zyngier wrote:
> > On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote:
> > > This is version six of the pKVM EL2 state series, extending the pKVM
> > > hypervisor code so that it can dynamically instantiate and manage VM
> > > data structures without the host being able to access them directly.
> > > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
> > > page-table for the MMU. The pages used to hold the hypervisor structures
> > > are returned to the host when the VM is destroyed.
> > > 
> > > [...]
> > 
> > As for Oliver's series, I've tentatively applied this to -next.
> > I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll
> > change my mind.
> > 
> > Anyway, there was an interesting number of conflicts between the two
> > series, which I tried to resolve as well as I could, but it is likely
> > I broke something (although it compiles, so it must be perfect).
> > 
> > Please have a look and shout if/when you spot something.
> 
> Here is where you and I diverged on the conflict resolution, neither
> amounts to a whole lot but feel free to squash in. Hoping that Will + co
> can test the pKVM side of this.
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/mm.c b/arch/arm64/kvm/hyp/nvhe/mm.c
> index f2c4672697c2..318298eb3d6b 100644
> --- a/arch/arm64/kvm/hyp/nvhe/mm.c
> +++ b/arch/arm64/kvm/hyp/nvhe/mm.c
> @@ -265,7 +265,7 @@ static int __create_fixmap_slot_cb(const struct kvm_pgtable_visit_ctx *ctx,
>  {
>  	struct hyp_fixmap_slot *slot = per_cpu_ptr(&fixmap_slots, (u64)ctx->arg);
>  
> -	if (!kvm_pte_valid(*ctx->ptep) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1)
> +	if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MAX_LEVELS - 1)
>  		return -EINVAL;
>  
>  	slot->addr = ctx->addr;
> diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
> index b47d969ae4d3..110f04627785 100644
> --- a/arch/arm64/kvm/hyp/nvhe/setup.c
> +++ b/arch/arm64/kvm/hyp/nvhe/setup.c
> @@ -190,7 +190,7 @@ static void hpool_put_page(void *addr)
>  }
>  
>  static int fix_host_ownership_walker(const struct kvm_pgtable_visit_ctx *ctx,
> -					 enum kvm_pgtable_walk_flags visit)
> +				     enum kvm_pgtable_walk_flags visit)
>  {
>  	enum kvm_pgtable_prot prot;
>  	enum pkvm_page_state state;
> 

Thanks. I've folded that in the resolution and regenerated the -next
branch after taking another patch from Gavin in the dirty-ring series.

I've managed to test the result on both VHE and nVHE this morning, and
the wheels are still attached. I don't have a good setup for pKVM at
the moment though, and it is likely that the rest of the monster
series will need some rework.

	M.
Will Deacon Nov. 14, 2022, 6:19 p.m. UTC | #6
Hey Oliver,

On Fri, Nov 11, 2022 at 07:42:46PM +0000, Oliver Upton wrote:
> On Fri, Nov 11, 2022 at 04:54:14PM +0000, Marc Zyngier wrote:
> > On Thu, 10 Nov 2022 19:02:33 +0000,
> > Will Deacon <will@kernel.org> wrote:
> > > 
> > > Hi all,
> > > 
> > > This is version six of the pKVM EL2 state series, extending the pKVM
> > > hypervisor code so that it can dynamically instantiate and manage VM
> > > data structures without the host being able to access them directly.
> > > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
> > > page-table for the MMU. The pages used to hold the hypervisor structures
> > > are returned to the host when the VM is destroyed.
> > > 
> > > Previous versions are archived at:
> > > 
> > >   Mega-patch: https://lore.kernel.org/kvmarm/20220519134204.5379-1-will@kernel.org/
> > >   v2: https://lore.kernel.org/all/20220630135747.26983-1-will@kernel.org/
> > >   v3: https://lore.kernel.org/kvmarm/20220914083500.5118-1-will@kernel.org/
> > >   v4: https://lore.kernel.org/kvm/20221017115209.2099-1-will@kernel.org/
> > >   v5: https://lore.kernel.org/r/20221020133827.5541-1-will@kernel.org
> > > 
> > > The changes since v5 include:
> > > 
> > >   * Fix teardown ordering so that the host 'kvm' structure remains pins
> > >     while the memcache is being filled.
> > > 
> > >   * Fixed a kerneldoc typo.
> > > 
> > >   * Included a patch from Oliver to rework the 'pkvm_mem_transition'
> > >     structure and it's handling of the completer address.
> > > 
> > >   * Tweaked some commit messages and added new R-b tags.
> > > 
> > > As before, the final patch is RFC since it illustrates a very naive use
> > > of the new hypervisor structures and subsequent changes will improve on
> > > this once we have the guest private memory story sorted out.
> > > 
> > > Oliver: I'm pretty sure we're going to need to revert your completer
> > > address cleanup as soon as we have guest-host sharing. We want to keep
> > > the 'pkvm_mem_transition' structure 'const', but we will only know the
> > > host address (PA) after walking the guest stage-2 and so we're going to
> > > want to track that separately. Anyway, I've included it here at the end
> > > so Marc can decide what he wants to do!
> > 
> > Thanks, I guess... :-/
> > 
> > If this patch is going to be reverted, I'd rather not take it (without
> > guest/host sharing, we don't have much of a hypervisor).
> 
> +1, I'm more than happy being told my patch doesn't work :)
> 
> Having said that, if there are parts of the design that I've whined
> about that are intentional then please educate me. Some things haven't
> been quite as obvious, but I know you folks have been working on this
> feature for a while.

Oh sure, I replied on your patches previously:

https://lore.kernel.org/r/20221110104215.GA26282@willie-the-truck

But here's some more detail...

If a guest issues a SHARE hypercall to share a page with the host, then
we'll end up in a situation where we have the guest as the initiator and
the host as the completer of the share operation. At the point at which
we populate the initial (const) 'pkvm_mem_transition' structure, all we
will have in our hand is the guest IPA of the page being shared. We can't
determine the host (completer) address from this without first walking the
guest stage-2 page-table, which happens as part of the guest initiate_share
code, so that's why the completer address is decoupled from the rest of the
structure -- essentially, it's determine by the initiator after it performs
its check.

Please do shout if there's something else you're not sure about or if the
above is unclear.

> I probably need to give the full patch-bomb another read to get all the
> context too.

We'll probably drop another one of those once 6.2 is out, although we're
going to need the guest private memory story to be resolved before we can
progress much there, I think.

Will
Will Deacon Nov. 14, 2022, 7:30 p.m. UTC | #7
On Fri, Nov 11, 2022 at 07:06:14PM +0000, Marc Zyngier wrote:
> On Thu, 10 Nov 2022 19:02:33 +0000, Will Deacon wrote:
> > This is version six of the pKVM EL2 state series, extending the pKVM
> > hypervisor code so that it can dynamically instantiate and manage VM
> > data structures without the host being able to access them directly.
> > These structures consist of a hyp VM, a set of hyp vCPUs and the stage-2
> > page-table for the MMU. The pages used to hold the hypervisor structures
> > are returned to the host when the VM is destroyed.
> > 
> > [...]
> 
> As for Oliver's series, I've tentatively applied this to -next.
> I've dropped Oliver's patch for now, but kept the RFC one. Maybe I'll
> change my mind.
> 
> Anyway, there was an interesting number of conflicts between the two
> series, which I tried to resolve as well as I could, but it is likely
> I broke something (although it compiles, so it must be perfect).
> 
> Please have a look and shout if/when you spot something.

Cheers, the resolution looks good to me and I've played around booting
non-protected guests under pKVM without any issues so far.

Will