diff mbox series

[v12,18/29] KVM: SEV: Use a VMSA physical address variable for populating VMCB

Message ID 20240329225835.400662-19-michael.roth@amd.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu
Headers show
Series Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support | expand

Commit Message

Michael Roth March 29, 2024, 10:58 p.m. UTC
From: Tom Lendacky <thomas.lendacky@amd.com>

In preparation to support SEV-SNP AP Creation, use a variable that holds
the VMSA physical address rather than converting the virtual address.
This will allow SEV-SNP AP Creation to set the new physical address that
will be used should the vCPU reset path be taken.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c | 3 +--
 arch/x86/kvm/svm/svm.c | 9 ++++++++-
 arch/x86/kvm/svm/svm.h | 1 +
 3 files changed, 10 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini March 30, 2024, 9:01 p.m. UTC | #1
On 3/29/24 23:58, Michael Roth wrote:
> From: Tom Lendacky<thomas.lendacky@amd.com>
> 
> In preparation to support SEV-SNP AP Creation, use a variable that holds
> the VMSA physical address rather than converting the virtual address.
> This will allow SEV-SNP AP Creation to set the new physical address that
> will be used should the vCPU reset path be taken.
> 
> Signed-off-by: Tom Lendacky<thomas.lendacky@amd.com>
> Signed-off-by: Ashish Kalra<ashish.kalra@amd.com>
> Signed-off-by: Michael Roth<michael.roth@amd.com>
> ---

I'll get back to this one after Easter, but it looks like Sean had some 
objections at https://lore.kernel.org/lkml/ZeCqnq7dLcJI41O9@google.com/.

Paolo
Paolo Bonzini April 16, 2024, 11:53 a.m. UTC | #2
On Sat, Mar 30, 2024 at 10:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 3/29/24 23:58, Michael Roth wrote:
> > From: Tom Lendacky<thomas.lendacky@amd.com>
> >
> > In preparation to support SEV-SNP AP Creation, use a variable that holds
> > the VMSA physical address rather than converting the virtual address.
> > This will allow SEV-SNP AP Creation to set the new physical address that
> > will be used should the vCPU reset path be taken.
> >
> > Signed-off-by: Tom Lendacky<thomas.lendacky@amd.com>
> > Signed-off-by: Ashish Kalra<ashish.kalra@amd.com>
> > Signed-off-by: Michael Roth<michael.roth@amd.com>
> > ---
>
> I'll get back to this one after Easter, but it looks like Sean had some
> objections at https://lore.kernel.org/lkml/ZeCqnq7dLcJI41O9@google.com/.

So IIUC the gist of the solution here would be to replace

   /* Use the new VMSA */
   svm->sev_es.vmsa_pa = pfn_to_hpa(pfn);
   svm->vmcb->control.vmsa_pa = svm->sev_es.vmsa_pa;

with something like

   /* Use the new VMSA */
   __free_page(virt_to_page(svm->sev_es.vmsa));
   svm->sev_es.vmsa = pfn_to_kaddr(pfn);
   svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);

and wrap the __free_page() in sev_free_vcpu() with "if
(!svm->sev_es.snp_ap_create)".

This should remove the need for svm->sev_es.vmsa_pa. It is always
equal to svm->vmcb->control.vmsa_pa anyway.

Also, it's possible to remove

   /*
    * gmem pages aren't currently migratable, but if this ever
    * changes then care should be taken to ensure
    * svm->sev_es.vmsa_pa is pinned through some other means.
    */
   kvm_release_pfn_clean(pfn);

if sev_free_vcpu() does

   if (svm->sev_es.snp_ap_create) {
     __free_page(virt_to_page(svm->sev_es.vmsa));
   } else {
     put_page(virt_to_page(svm->sev_es.vmsa));
   }

and while at it, please reverse the polarity of snp_ap_create and
rename it to snp_ap_created.

Paolo
Tom Lendacky April 16, 2024, 2:25 p.m. UTC | #3
On 4/16/24 06:53, Paolo Bonzini wrote:
> On Sat, Mar 30, 2024 at 10:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 3/29/24 23:58, Michael Roth wrote:
>>> From: Tom Lendacky<thomas.lendacky@amd.com>
>>>
>>> In preparation to support SEV-SNP AP Creation, use a variable that holds
>>> the VMSA physical address rather than converting the virtual address.
>>> This will allow SEV-SNP AP Creation to set the new physical address that
>>> will be used should the vCPU reset path be taken.
>>>
>>> Signed-off-by: Tom Lendacky<thomas.lendacky@amd.com>
>>> Signed-off-by: Ashish Kalra<ashish.kalra@amd.com>
>>> Signed-off-by: Michael Roth<michael.roth@amd.com>
>>> ---
>>
>> I'll get back to this one after Easter, but it looks like Sean had some
>> objections at https://lore.kernel.org/lkml/ZeCqnq7dLcJI41O9@google.com/.
> 

Note that AP create is called multiple times per vCPU under OVMF with 
and added call by the kernel when booting the APs.

> So IIUC the gist of the solution here would be to replace
> 
>     /* Use the new VMSA */
>     svm->sev_es.vmsa_pa = pfn_to_hpa(pfn);
>     svm->vmcb->control.vmsa_pa = svm->sev_es.vmsa_pa;
> 
> with something like
> 
>     /* Use the new VMSA */
>     __free_page(virt_to_page(svm->sev_es.vmsa));

This should only be called for the page that KVM allocated during vCPU 
creation. After that, the VMSA page from an AP create is a guest page 
and shouldn't be freed by KVM.

>     svm->sev_es.vmsa = pfn_to_kaddr(pfn);
>     svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
> 
> and wrap the __free_page() in sev_free_vcpu() with "if
> (!svm->sev_es.snp_ap_create)".
> 
> This should remove the need for svm->sev_es.vmsa_pa. It is always
> equal to svm->vmcb->control.vmsa_pa anyway.

Yeah, a little bit of multiple VMPL support worked its way in there 
where the VMSA per VMPL level is maintained.

But I believe that Sean wants a separate KVM object per VMPL level, so 
that would disappear anyway (Joerg and I want to get on the PUCK 
schedule to talk about multi-VMPL level support soon.)

> 
> Also, it's possible to remove
> 
>     /*
>      * gmem pages aren't currently migratable, but if this ever
>      * changes then care should be taken to ensure
>      * svm->sev_es.vmsa_pa is pinned through some other means.
>      */
>     kvm_release_pfn_clean(pfn);

Removing this here will cause any previous guest VMSA page(s) to remain 
pinned, that's the reason for unpinning here. OVMF re-uses the VMSA, but 
that isn't a requirement for a firmware, and the kernel will create a 
new VMSA page.

> 
> if sev_free_vcpu() does
> 
>     if (svm->sev_es.snp_ap_create) {
>       __free_page(virt_to_page(svm->sev_es.vmsa));
>     } else {
>       put_page(virt_to_page(svm->sev_es.vmsa));
>     }
> 
> and while at it, please reverse the polarity of snp_ap_create and
> rename it to snp_ap_created.

The snp_ap_create flag gets cleared once the new VMSA is put in place, 
it doesn't remain. So the flag usage will have to be altered in order 
for this function to work properly.

Thanks,
Tom

> 
> Paolo
>
Paolo Bonzini April 16, 2024, 5 p.m. UTC | #4
On Tue, Apr 16, 2024 at 4:25 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 4/16/24 06:53, Paolo Bonzini wrote:
> > On Sat, Mar 30, 2024 at 10:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>
> >> On 3/29/24 23:58, Michael Roth wrote:
> >>> From: Tom Lendacky<thomas.lendacky@amd.com>
> >>>
> >>> In preparation to support SEV-SNP AP Creation, use a variable that holds
> >>> the VMSA physical address rather than converting the virtual address.
> >>> This will allow SEV-SNP AP Creation to set the new physical address that
> >>> will be used should the vCPU reset path be taken.
> >>>
> >>> Signed-off-by: Tom Lendacky<thomas.lendacky@amd.com>
> >>> Signed-off-by: Ashish Kalra<ashish.kalra@amd.com>
> >>> Signed-off-by: Michael Roth<michael.roth@amd.com>
> >>> ---
> >>
> >> I'll get back to this one after Easter, but it looks like Sean had some
> >> objections at https://lore.kernel.org/lkml/ZeCqnq7dLcJI41O9@google.com/.
> >
>
> Note that AP create is called multiple times per vCPU under OVMF with
> and added call by the kernel when booting the APs.

Oooh, I somehow thought that

+ target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE;
+ target_svm->sev_es.snp_ap_create = true;

was in svm_create_vcpu().

So there should be separate "snp_ap_waiting_for_reset" and
"snp_has_guest_vmsa" flags. The latter is set once in
__sev_snp_update_protected_guest_state and is what governs whether the
VMSA page was allocated or just refcounted.

> But I believe that Sean wants a separate KVM object per VMPL level, so
> that would disappear anyway (Joerg and I want to get on the PUCK
> schedule to talk about multi-VMPL level support soon.)

Yes, agreed on both counts.

> >     /*
> >      * gmem pages aren't currently migratable, but if this ever
> >      * changes then care should be taken to ensure
> >      * svm->sev_es.vmsa_pa is pinned through some other means.
> >      */
> >     kvm_release_pfn_clean(pfn);
>
> Removing this here will cause any previous guest VMSA page(s) to remain
> pinned, that's the reason for unpinning here. OVMF re-uses the VMSA, but
> that isn't a requirement for a firmware, and the kernel will create a
> new VMSA page.

Yes, and once you understand that I was thinking of a set-once flag
"snp_has_guest_vmsa" it should all make a lot more sense.

Paolo
Michael Roth April 17, 2024, 8:57 p.m. UTC | #5
On Tue, Apr 16, 2024 at 01:53:24PM +0200, Paolo Bonzini wrote:
> On Sat, Mar 30, 2024 at 10:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On 3/29/24 23:58, Michael Roth wrote:
> > > From: Tom Lendacky<thomas.lendacky@amd.com>
> > >
> > > In preparation to support SEV-SNP AP Creation, use a variable that holds
> > > the VMSA physical address rather than converting the virtual address.
> > > This will allow SEV-SNP AP Creation to set the new physical address that
> > > will be used should the vCPU reset path be taken.
> > >
> > > Signed-off-by: Tom Lendacky<thomas.lendacky@amd.com>
> > > Signed-off-by: Ashish Kalra<ashish.kalra@amd.com>
> > > Signed-off-by: Michael Roth<michael.roth@amd.com>
> > > ---
> >
> > I'll get back to this one after Easter, but it looks like Sean had some
> > objections at https://lore.kernel.org/lkml/ZeCqnq7dLcJI41O9@google.com/.
> 
> So IIUC the gist of the solution here would be to replace
> 
>    /* Use the new VMSA */
>    svm->sev_es.vmsa_pa = pfn_to_hpa(pfn);
>    svm->vmcb->control.vmsa_pa = svm->sev_es.vmsa_pa;
> 
> with something like
> 
>    /* Use the new VMSA */
>    __free_page(virt_to_page(svm->sev_es.vmsa));

One downside to free'ing VMSA at this point is there are a number of
additional cleanup routines like wbinvd_on_all_cpus() and in sev_free_vcpu()
which will need to be called before we are able to safely free the page back
to the system.

It would be simple to wrap all that up in an sev_free_vmsa() helper and
also call it here rather than defer it, but from a performance
perspective it would be nice to defer it to shutdown path.


>    svm->sev_es.vmsa = pfn_to_kaddr(pfn);
>    svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);

It turns out sev_es_init_vmcb() always ends up setting control.vmsa_pa
again using the new vmsa stored in sev_es.vmsa before the AP re-enters the
guest:

  svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);

If we modify that code to instead do:

  if (!svm->sev_es.snp_has_guest_vmsa)
    svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
      
Then it will instead continue to use the control.vmsa_pa set here in
__sev_snp_update_protected_guest_state(), in which case svm->sev_es.vmsa
will only ever be used to store the initial VMSA that was allocated by KVM.
Given that...

> 
> and wrap the __free_page() in sev_free_vcpu() with "if
> (!svm->sev_es.snp_ap_create)".

If we take the deferred approach above, then no checks are needed here
and the KVM-allocated VMSA is cleaned up the same way it is handled for
SEV-ES. SNP never needs to piggy-back off of sev_es.vmsa to pass around
VMSAs that reside in guest memory.

I can still rework things to free KVM-allocated VMSA immediately here if
you prefer but for now I have things implemented as above to keep
SEV-ES/SNP handling similar and avoid performance penalty during guest
boot. I've pushed the revised AP creation patch here for reference:

  https://github.com/mdroth/linux/commit/5a7e76231a7629ba62f8b0bba8039d93d3595ecb

Thanks for the suggestions, this all looks a good bit cleaner either way.

-Mike

> 
> This should remove the need for svm->sev_es.vmsa_pa. It is always
> equal to svm->vmcb->control.vmsa_pa anyway.
> 
> Also, it's possible to remove
> 
>    /*
>     * gmem pages aren't currently migratable, but if this ever
>     * changes then care should be taken to ensure
>     * svm->sev_es.vmsa_pa is pinned through some other means.
>     */
>    kvm_release_pfn_clean(pfn);
> 
> if sev_free_vcpu() does
> 
>    if (svm->sev_es.snp_ap_create) {
>      __free_page(virt_to_page(svm->sev_es.vmsa));
>    } else {
>      put_page(virt_to_page(svm->sev_es.vmsa));
>    }
> 
> and while at it, please reverse the polarity of snp_ap_create and
> rename it to snp_ap_created.
> 
> Paolo
>
diff mbox series

Patch

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index a0a88471f9ab..ce1c727bad23 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -3780,8 +3780,7 @@  static void sev_es_init_vmcb(struct vcpu_svm *svm)
 	 * the VMSA will be NULL if this vCPU is the destination for intrahost
 	 * migration, and will be copied later.
 	 */
-	if (svm->sev_es.vmsa)
-		svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
+	svm->vmcb->control.vmsa_pa = svm->sev_es.vmsa_pa;
 
 	/* Can't intercept CR register access, HV can't modify CR registers */
 	svm_clr_intercept(svm, INTERCEPT_CR0_READ);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 648a05ca53fc..e036a8927717 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1451,9 +1451,16 @@  static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 	svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT);
 	svm_switch_vmcb(svm, &svm->vmcb01);
 
-	if (vmsa_page)
+	if (vmsa_page) {
 		svm->sev_es.vmsa = page_address(vmsa_page);
 
+		/*
+		 * Do not include the encryption mask on the VMSA physical
+		 * address since hardware will access it using the guest key.
+		 */
+		svm->sev_es.vmsa_pa = __pa(svm->sev_es.vmsa);
+	}
+
 	svm->guest_state_loaded = false;
 
 	return 0;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index c0675ff2d8a2..8cce3315b46c 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -199,6 +199,7 @@  struct vcpu_sev_es_state {
 	struct ghcb *ghcb;
 	u8 valid_bitmap[16];
 	struct kvm_host_map ghcb_map;
+	hpa_t vmsa_pa;
 	bool received_first_sipi;
 	unsigned int ap_reset_hold_type;