diff mbox series

[4/6] Add GHCB allocations and helpers

Message ID 20240409133959.2888018-5-pgonda@google.com (mailing list archive)
State New, archived
Headers show
Series Add initial GHCB support for SEV-ES selftests | expand

Commit Message

Peter Gonda April 9, 2024, 1:39 p.m. UTC
Add GHCB management functionality similar to the ucall management.
Allows for selftest vCPUs to acquire GHCBs for their usage.

Cc: Vishal Annapurve <vannapurve@google.com>
Cc: Ackerley Tng <ackerleytng@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Carlos Bilbao <carlos.bilbao@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: kvm@vger.kernel.org
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Peter Gonda <pgonda@google.com>
---
 .../selftests/kvm/include/x86_64/sev.h        |  2 +
 .../selftests/kvm/lib/x86_64/processor.c      |  8 ++
 tools/testing/selftests/kvm/lib/x86_64/sev.c  | 77 +++++++++++++++++++
 3 files changed, 87 insertions(+)

Comments

Sean Christopherson April 24, 2024, 12:58 a.m. UTC | #1
On Tue, Apr 09, 2024, Peter Gonda wrote:
> Add GHCB management functionality similar to the ucall management.
> Allows for selftest vCPUs to acquire GHCBs for their usage.

Do we actually need a dedicated pool of GHCBs?  The conundrum with ucalls is that
we have no place in the guest to store the pointer on all architectures.  Or rather,
we were too lazy to find one. :-)

But for SEV-ES, we have MSR_AMD64_SEV_ES_GHCB, and any test that clobbers that
obviously can't use ucalls anyways.  Argh, but we can't get a value in that MSR
from the host.

Hmm, that seems like a KVM flaw.  KVM should advertise its supported GHCB protocol
to *userspace*, but userspace should control the actual information exposed to
the guest.

Oof, and doesn't SNP support effectively *require* version 2?  I.e. shouldn't
the KVM patch[*] that adds support for the AP reset MSR protocol bump the version?
The GHCB spec very cleary states that that's v2+.

And what if userspace wants to advertise v2 to an SEV-ES guest?  KVM shouldn't
switch *all* SEV-ES guests to v2, without a way back.  And the GHCB spec clearly
states that some of the features are in scope for SEV-ES, e.g.

  In addition to hypervisor feature advertisement, version 2 provides:

  SEV-ES enhancements:
     o GHCB Format Version 2:
        ▪ The addition of the XSS MSR value (if supported) when CPUID 0xD is
          requested.
        ▪ The shared area specified in the GHCB SW_SCRATCH field must reside in the
          GHCB SharedBuffer area of the GHCB.
     o MSR protocol support for AP reset hold.

So I'm pretty sure KVM needs to let userspace set the initial value for
MSR_AMD64_SEV_ES_GHCB.  I suppose we could do that indirectly, e.g. through a
capability.  Hrm, but even if userspace can set the initial value, KVM would need
to parse the min/max version so that KVM knows what *it* should support, which
means that throwing in a GPA for selftests would screw things up.

Blech.  Why do CPU folks insist on using ascending version numbers to bundle
features?

Anyways, back to selftests.  Since we apparently can't stuff MSR_AMD64_SEV_ES_GHCB
from host userspace, what if we instead use a trampoline?  Instead having
vcpu_arch_set_entry_point() point directly at guest_code, point it at a trampoline
for SEV-ES guests, and then have the trampoline set MSR_AMD64_SEV_ES_GHCB to
the vCPU-specific GHCB before invoking guest_code().

Then we just need a register to stuff the GHCB into.  Ah, and the actual guest 
entry point.  GPRs are already part of selftest's "ABI", since they're set by
vcpu_args_set().  And this is all 64-bit only, so we can use r10+.

Ugh, the complication is that the trampoline would need to save/restore RAX, RCX,
and RDX in order to preserve the values from vcpu_args_set(), but that's just
annoying, not hard.  And I think it'd be less painful overall than
having to create a GHCB pool?

In rough pseudo-asm, something like this?

static void vcpu_sev_es_guest_trampoline(void)
{
	asm volatile(<save rax, rcx, rdx>
		     "mov %%r15d, %%eax\n\t"
		     "shr %%r15, $32\n\t"
		     "mov %%r15d, %%eax\n\t"
		     "mov $MSR_AMD64_SEV_ES_GHCB, %%ecx\n\t"
		     <restore rax, rcx, rdx>
		     "jmp %%r14")
}

[*] https://lore.kernel.org/all/20240421180122.1650812-3-michael.roth@amd.com
Sean Christopherson April 24, 2024, 2:39 p.m. UTC | #2
On Tue, Apr 23, 2024, Sean Christopherson wrote:
> On Tue, Apr 09, 2024, Peter Gonda wrote:
> > Add GHCB management functionality similar to the ucall management.
> > Allows for selftest vCPUs to acquire GHCBs for their usage.
> 
> Do we actually need a dedicated pool of GHCBs?  The conundrum with ucalls is that
> we have no place in the guest to store the pointer on all architectures.  Or rather,
> we were too lazy to find one. :-)
> 
> But for SEV-ES, we have MSR_AMD64_SEV_ES_GHCB, and any test that clobbers that
> obviously can't use ucalls anyways.  Argh, but we can't get a value in that MSR
> from the host.

...

> Anyways, back to selftests.  Since we apparently can't stuff MSR_AMD64_SEV_ES_GHCB
> from host userspace, what if we instead use a trampoline?  Instead having
> vcpu_arch_set_entry_point() point directly at guest_code, point it at a trampoline
> for SEV-ES guests, and then have the trampoline set MSR_AMD64_SEV_ES_GHCB to
> the vCPU-specific GHCB before invoking guest_code().
> 
> Then we just need a register to stuff the GHCB into.  Ah, and the actual guest 
> entry point.  GPRs are already part of selftest's "ABI", since they're set by
> vcpu_args_set().  And this is all 64-bit only, so we can use r10+.
> 
> Ugh, the complication is that the trampoline would need to save/restore RAX, RCX,
> and RDX in order to preserve the values from vcpu_args_set(), but that's just
> annoying, not hard.  And I think it'd be less painful overall than
> having to create a GHCB pool?
> 
> In rough pseudo-asm, something like this?
> 
> static void vcpu_sev_es_guest_trampoline(void)
> {
> 	asm volatile(<save rax, rcx, rdx>
> 		     "mov %%r15d, %%eax\n\t"
> 		     "shr %%r15, $32\n\t"
> 		     "mov %%r15d, %%eax\n\t"
> 		     "mov $MSR_AMD64_SEV_ES_GHCB, %%ecx\n\t"
> 		     <restore rax, rcx, rdx>
> 		     "jmp %%r14")
> }

Scratch using inline asm, it needs to be a proper asm subroutine, as it's possible
the compiler could clobber GPRs before emitting the asm.  But writing actual
assembly code is probably a good thing.

And we need assembly for TDX selftests, which forces vCPUs to start at the RESET
vector[*].  Rather than add a TDX specific td_boot.S, we can add a common-ish
entry.S to hold all of the CoCo entry points that need to be in assembly.

Then I think we'll eventually end up with something like:

diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index fd94a1bd82c9..03818d3c4669 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -597,7 +597,12 @@ void vcpu_arch_set_entry_point(struct kvm_vcpu *vcpu, void *guest_code)
        struct kvm_regs regs;
 
        vcpu_regs_get(vcpu, &regs);
-       regs.rip = (unsigned long) guest_code;
+       if (<is sev-es guest>)
+               regs.r14 = guest_code;
+       else if (<is tdx guest>)
+               <guest_code gets shoved somewhere else>
+       else
+               regs.rip = (unsigned long) guest_code;
        vcpu_regs_set(vcpu, &regs);
 }
 
@@ -635,6 +640,10 @@ struct kvm_vcpu *vm_arch_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id)
        vcpu_regs_get(vcpu, &regs);
        regs.rflags = regs.rflags | 0x2;
        regs.rsp = stack_vaddr;
+       if (<is sev-es guest>) {
+               regs.rip = vcpu_sev_es_guest_trampoline;
+               regs.r15 = <allocate and map ghcb>();
+       }
        vcpu_regs_set(vcpu, &regs);
 
        /* Setup the MP state */
---

[*] https://lore.kernel.org/all/20231212204647.2170650-6-sagis@google.com
Sean Christopherson April 24, 2024, 8:13 p.m. UTC | #3
On Tue, Apr 23, 2024, Sean Christopherson wrote:
> Oof, and doesn't SNP support effectively *require* version 2?  I.e. shouldn't
> the KVM patch[*] that adds support for the AP reset MSR protocol bump the version?
> The GHCB spec very cleary states that that's v2+.

Ah, the SNP series does bump the max version one patch later, presumably after all
mandatory features for v2 are implemented.
diff mbox series

Patch

diff --git a/tools/testing/selftests/kvm/include/x86_64/sev.h b/tools/testing/selftests/kvm/include/x86_64/sev.h
index 8a1bf88474c9..bfd481707f67 100644
--- a/tools/testing/selftests/kvm/include/x86_64/sev.h
+++ b/tools/testing/selftests/kvm/include/x86_64/sev.h
@@ -27,6 +27,8 @@  enum sev_guest_state {
 
 #define GHCB_MSR_TERM_REQ	0x100
 
+int ghcb_nr_pages_required(uint64_t page_size);
+
 void sev_vm_launch(struct kvm_vm *vm, uint32_t policy);
 void sev_vm_launch_measure(struct kvm_vm *vm, uint8_t *measurement);
 void sev_vm_launch_finish(struct kvm_vm *vm);
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index 49288fe10cd3..fd94a1bd82c9 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -584,6 +584,14 @@  void kvm_arch_vm_post_create(struct kvm_vm *vm)
 		sev_es_vm_init(vm);
 }
 
+int kvm_arch_vm_additional_pages_required(struct vm_shape shape, uint64_t page_size)
+{
+	if (shape.subtype == VM_SUBTYPE_SEV_ES)
+		return  ghcb_nr_pages_required(page_size);
+
+	return 0;
+}
+
 void vcpu_arch_set_entry_point(struct kvm_vcpu *vcpu, void *guest_code)
 {
 	struct kvm_regs regs;
diff --git a/tools/testing/selftests/kvm/lib/x86_64/sev.c b/tools/testing/selftests/kvm/lib/x86_64/sev.c
index e248d3364b9c..27ae1d3b1355 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/sev.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/sev.c
@@ -4,6 +4,80 @@ 
 #include <stdbool.h>
 
 #include "sev.h"
+#include "linux/bitmap.h"
+#include "svm.h"
+#include "svm_util.h"
+
+struct ghcb_entry {
+	struct ghcb ghcb;
+
+	/* Guest physical address of this GHCB. */
+	void *gpa;
+
+	/* Host virtual address of this struct. */
+	struct ghcb_entry *hva;
+};
+
+struct ghcb_header {
+	struct ghcb_entry ghcbs[KVM_MAX_VCPUS];
+	DECLARE_BITMAP(in_use, KVM_MAX_VCPUS);
+};
+
+static struct ghcb_header *ghcb_pool;
+
+int ghcb_nr_pages_required(uint64_t page_size)
+{
+	return align_up(sizeof(struct ghcb_header), page_size) / page_size;
+}
+
+void ghcb_init(struct kvm_vm *vm)
+{
+	struct ghcb_header *hdr;
+	struct ghcb_entry *entry;
+	vm_vaddr_t vaddr;
+	int i;
+
+	vaddr = vm_vaddr_alloc_shared(vm, sizeof(*hdr), KVM_UTIL_MIN_VADDR,
+				      MEM_REGION_DATA);
+	hdr = (struct ghcb_header *)addr_gva2hva(vm, vaddr);
+	memset(hdr, 0, sizeof(*hdr));
+
+	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
+		entry = &hdr->ghcbs[i];
+		entry->hva = entry;
+		entry->gpa = addr_hva2gpa(vm, &entry->ghcb);
+	}
+
+	write_guest_global(vm, ghcb_pool, (struct ghcb_header *)vaddr);
+}
+
+static struct ghcb_entry *ghcb_alloc(void)
+{
+	return &ghcb_pool->ghcbs[0];
+	struct ghcb_entry *entry;
+	int i;
+
+	if (!ghcb_pool)
+		goto ucall_failed;
+
+	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
+		if (!test_and_set_bit(i, ghcb_pool->in_use)) {
+			entry = &ghcb_pool->ghcbs[i];
+			memset(&entry->ghcb, 0, sizeof(entry->ghcb));
+			return entry;
+		}
+	}
+
+ucall_failed:
+	return NULL;
+}
+
+static void ghcb_free(struct ghcb_entry *entry)
+{
+	/* Beware, here be pointer arithmetic.  */
+	clear_bit(entry - ghcb_pool->ghcbs, ghcb_pool->in_use);
+}
+
 
 /*
  * sparsebit_next_clear() can return 0 if [x, 2**64-1] are all set, and the
@@ -44,6 +118,9 @@  void sev_vm_launch(struct kvm_vm *vm, uint32_t policy)
 	struct kvm_sev_guest_status status;
 	int ctr;
 
+	if (policy & SEV_POLICY_ES)
+		ghcb_init(vm);
+
 	vm_sev_ioctl(vm, KVM_SEV_LAUNCH_START, &launch_start);
 	vm_sev_ioctl(vm, KVM_SEV_GUEST_STATUS, &status);