[RFC,v7,62/64] x86/sev: Add KVM commands for instance certs

Message ID	20221214194056.161492-63-michael.roth@amd.com (mailing list archive)
State	New, archived
Series	Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support

Michael Roth Dec. 14, 2022, 7:40 p.m. UTC

From: Dionna Glaze <dionnaglaze@google.com>

The /dev/sev device has the ability to store host-wide certificates for
the key used by the AMD-SP for SEV-SNP attestation report signing,
but for hosts that want to specify additional certificates that are
specific to the image launched in a VM, a different way is needed to
communicate those certificates.

This patch adds two new KVM ioctl commands: KVM_SEV_SNP_{GET,SET}_CERTS

The certificates that are set with this command are expected to follow
the same format as the host certificates, but that format is opaque
to the kernel.

The new behavior for custom certificates is that the extended guest
request command will now return the overridden certificates if they
were installed for the instance. The error condition for a too small
data buffer is changed to return the overridden certificate data size
if there is an overridden certificate set installed.

Setting a 0 length certificate returns the system state to only return
the host certificates on an extended guest request.
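
As a rough illustration of the selection rule described above (a userspace sketch, not the kernel code; demo_cert_view and demo_pick_certs are hypothetical names), an instance blob with a nonzero length takes precedence, and a zero length falls back to the host blob:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of whichever certificate blob gets returned. */
struct demo_cert_view {
	const unsigned char *data;
	size_t len;
};

/*
 * Pick the blob to hand back on an extended guest request:
 * instance certs if any are installed, otherwise the host-wide certs.
 */
static struct demo_cert_view demo_pick_certs(const unsigned char *host,
					     size_t host_len,
					     const unsigned char *inst,
					     size_t inst_len)
{
	struct demo_cert_view v;

	assert(host != NULL || host_len == 0);

	if (inst_len) {
		v.data = inst;
		v.len = inst_len;
	} else {
		v.data = host;
		v.len = host_len;
	}
	return v;
}
```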

We also increase SEV_FW_BLOB_MAX_SIZE by another 4K page to allow
space for an extra certificate.

Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/kvm/svm/sev.c   | 111 ++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/svm/svm.h   |   1 +
 include/linux/psp-sev.h  |   2 +-
 include/uapi/linux/kvm.h |  12 +++++
 4 files changed, 123 insertions(+), 3 deletions(-)

Dov Murik Dec. 22, 2022, 2:57 p.m. UTC | #1

Hi Dionna, Mike,

On 14/12/2022 21:40, Michael Roth wrote:
> From: Dionna Glaze <dionnaglaze@google.com>
> 
> The /dev/sev device has the ability to store host-wide certificates for
> the key used by the AMD-SP for SEV-SNP attestation report signing,
> but for hosts that want to specify additional certificates that are
> specific to the image launched in a VM, a different way is needed to
> communicate those certificates.
> 
> This patch adds two new KVM ioctl commands: KVM_SEV_SNP_{GET,SET}_CERTS
> 
> The certificates that are set with this command are expected to follow
> the same format as the host certificates, but that format is opaque
> to the kernel.
> 
> The new behavior for custom certificates is that the extended guest
> request command will now return the overridden certificates if they
> were installed for the instance. The error condition for a too small
> data buffer is changed to return the overridden certificate data size
> if there is an overridden certificate set installed.
> 
> Setting a 0 length certificate returns the system state to only return
> the host certificates on an extended guest request.
> 
> We also increase the SEV_FW_BLOB_MAX_SIZE another 4K page to allow
> space for an extra certificate.
> 
> Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> 
> Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  arch/x86/kvm/svm/sev.c   | 111 ++++++++++++++++++++++++++++++++++++++-
>  arch/x86/kvm/svm/svm.h   |   1 +
>  include/linux/psp-sev.h  |   2 +-
>  include/uapi/linux/kvm.h |  12 +++++
>  4 files changed, 123 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 4de952d1d446..d0e58cffd1ed 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2081,6 +2081,7 @@ static void *snp_context_create(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  		goto e_free;
>  
>  	sev->snp_certs_data = certs_data;
> +	sev->snp_certs_len = 0;
>  
>  	return context;
>  
> @@ -2364,6 +2365,86 @@ static int snp_launch_finish(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	return ret;
>  }
>  
> +static int snp_get_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct kvm_sev_snp_get_certs params;
> +
> +	if (!sev_snp_guest(kvm))
> +		return -ENOTTY;
> +
> +	if (!sev->snp_context)
> +		return -EINVAL;
> +
> +	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +			   sizeof(params)))
> +		return -EFAULT;
> +
> +	/* No instance certs set. */
> +	if (!sev->snp_certs_len)
> +		return -ENOENT;
> +
> +	if (params.certs_len < sev->snp_certs_len) {
> +		/* Output buffer too small. Return the required size. */
> +		params.certs_len = sev->snp_certs_len;
> +
> +		if (copy_to_user((void __user *)(uintptr_t)argp->data, &params,
> +				 sizeof(params)))
> +			return -EFAULT;
> +
> +		return -EINVAL;
> +	}
> +
> +	if (copy_to_user((void __user *)(uintptr_t)params.certs_uaddr,
> +			 sev->snp_certs_data, sev->snp_certs_len))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	unsigned long length = SEV_FW_BLOB_MAX_SIZE;
> +	void *to_certs = sev->snp_certs_data;
> +	struct kvm_sev_snp_set_certs params;
> +
> +	if (!sev_snp_guest(kvm))
> +		return -ENOTTY;
> +
> +	if (!sev->snp_context)
> +		return -EINVAL;
> +
> +	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data,
> +			   sizeof(params)))
> +		return -EFAULT;
> +
> +	if (params.certs_len > SEV_FW_BLOB_MAX_SIZE)
> +		return -EINVAL;
> +
> +	/*
> +	 * Setting a length of 0 is the same as "uninstalling" instance-
> +	 * specific certificates.
> +	 */
> +	if (params.certs_len == 0) {
> +		sev->snp_certs_len = 0;
> +		return 0;
> +	}
> +
> +	/* Page-align the length */
> +	length = (params.certs_len + PAGE_SIZE - 1) & PAGE_MASK;
> +
> +	if (copy_from_user(to_certs,
> +			   (void __user *)(uintptr_t)params.certs_uaddr,
> +			   params.certs_len)) {
> +		return -EFAULT;
> +	}
> +
> +	sev->snp_certs_len = length;

Here we set the length to the page-aligned value, but we copy only
params.certs_len bytes.  If there are two subsequent
snp_set_instance_certs() calls where the second one has a shorter
length, we might "keep" some leftover bytes from the first call.

Consider:
1. snp_set_instance_certs(certs_uaddr points to "AAA...", certs_len=8192)
2. snp_set_instance_certs(certs_uaddr points to "BBB...", certs_len=4097)

If I understand correctly, on the second call we'll copy 4097 "BBB..."
bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
1) & PAGE_MASK which will be 8192.

Later when fetching the certs (for the extended report or in
snp_get_instance_certs()) the user will get a buffer of 8192 bytes
filled with 4097 BBBs and 4095 leftover AAAs.
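
The scenario can be reproduced outside the kernel with a small model. This is only a sketch under assumptions: memcpy() stands in for copy_from_user(), and the bug_* names and DEMO_* macros are hypothetical, with a 4K page size and the pre-patch 16KB limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))
#define DEMO_BLOB_MAX  0x4000UL	/* 16KB, the pre-patch SEV_FW_BLOB_MAX_SIZE */

/* Hypothetical stand-in for the per-VM cert state in struct kvm_sev_info. */
struct bug_certs {
	unsigned char data[DEMO_BLOB_MAX];
	size_t len;
};

/* Round a byte count up to the next page boundary, as the patch does. */
static size_t bug_page_align(size_t len)
{
	return (len + DEMO_PAGE_SIZE - 1) & DEMO_PAGE_MASK;
}

/* Same order of operations as the posted code: copy len, record aligned len. */
static int bug_set_certs(struct bug_certs *c, const unsigned char *src,
			 size_t len)
{
	assert(c != NULL);

	if (len > DEMO_BLOB_MAX)
		return -1;
	if (len == 0) {
		c->len = 0;
		return 0;
	}
	memcpy(c->data, src, len);	/* bytes past len keep their old value */
	c->len = bug_page_align(len);
	return 0;
}
```

Setting 8192 bytes of 'A' and then 4097 bytes of 'B' leaves the recorded length at 8192, with offsets 4097 through 8191 still holding 'A'.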

Maybe zero sev->snp_certs_data entirely before writing to it?

Related question (not only for this patch) regarding snp_certs_data
(host or per-instance): why is its size page-aligned at all? Why is it
limited to 16KB or 20KB? If I understand correctly, for SNP, this buffer
is never sent to the PSP.

> +
> +	return 0;
> +}
> +
>  int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
>  {
>  	struct kvm_sev_cmd sev_cmd;

[...]

> diff --git a/include/linux/psp-sev.h b/include/linux/psp-sev.h
> index a1e6624540f3..970a9de0ed20 100644
> --- a/include/linux/psp-sev.h
> +++ b/include/linux/psp-sev.h
> @@ -22,7 +22,7 @@
>  #define __psp_pa(x)	__pa(x)
>  #endif
>  
> -#define SEV_FW_BLOB_MAX_SIZE	0x4000	/* 16KB */
> +#define SEV_FW_BLOB_MAX_SIZE	0x5000	/* 20KB */
>  

This has effects in drivers/crypto/ccp/sev-dev.c (for example in
alloc_snp_host_map).  Is that OK?


-Dov

>  /**
>   * SEV platform state
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 61b1e26ced01..48bcc59cf86b 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1949,6 +1949,8 @@ enum sev_cmd_id {
>  	KVM_SEV_SNP_LAUNCH_START,
>  	KVM_SEV_SNP_LAUNCH_UPDATE,
>  	KVM_SEV_SNP_LAUNCH_FINISH,
> +	KVM_SEV_SNP_GET_CERTS,
> +	KVM_SEV_SNP_SET_CERTS,
>  
>  	KVM_SEV_NR_MAX,
>  };
> @@ -2096,6 +2098,16 @@ struct kvm_sev_snp_launch_finish {
>  	__u8 pad[6];
>  };
>  
> +struct kvm_sev_snp_get_certs {
> +	__u64 certs_uaddr;
> +	__u64 certs_len;
> +};
> +
> +struct kvm_sev_snp_set_certs {
> +	__u64 certs_uaddr;
> +	__u64 certs_len;
> +};
> +
>  #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
>  #define KVM_DEV_ASSIGN_PCI_2_3		(1 << 1)
>  #define KVM_DEV_ASSIGN_MASK_INTX	(1 << 2)

Dionna Glaze Jan. 9, 2023, 4:55 p.m. UTC | #2

> > +
> > +static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
> > +{
> [...]
>
> Here we set the length to the page-aligned value, but we copy only
> params.certs_len bytes.  If there are two subsequent
> snp_set_instance_certs() calls where the second one has a shorter
> length, we might "keep" some leftover bytes from the first call.
>
> Consider:
> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>
> If I understand correctly, on the second call we'll copy 4097 "BBB..."
> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
> 1) & PAGE_MASK which will be 8192.
>
> Later when fetching the certs (for the extended report or in
> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
> filled with 4097 BBBs and 4095 leftover AAAs.
>
> Maybe zero sev->snp_certs_data entirely before writing to it?
>

Yes, I agree it should be zeroed, at least if the previous length is
greater than the new length. Good catch.
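
One possible shape for that fix, sketched as a userspace model rather than a kernel patch: clear the padding between the copied bytes and the page-aligned length, so nothing from an earlier call is visible within the recorded length. The fix_* names and DEMO_* macros are hypothetical, and memcpy() stands in for copy_from_user().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096UL
#define DEMO_BLOB_MAX  0x4000UL	/* 16KB, page-aligned upper bound */

/* Hypothetical stand-in for the per-VM cert state. */
struct fix_certs {
	unsigned char data[DEMO_BLOB_MAX];
	size_t len;
};

static int fix_set_certs(struct fix_certs *c, const unsigned char *src,
			 size_t len)
{
	size_t aligned;

	assert(c != NULL);

	if (len > DEMO_BLOB_MAX)
		return -1;
	if (len == 0) {
		c->len = 0;	/* "uninstall" the instance certs */
		return 0;
	}

	/* aligned never exceeds DEMO_BLOB_MAX because the bound is page-aligned */
	aligned = (len + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);

	memcpy(c->data, src, len);
	/* Zero the tail so stale bytes never fall inside the reported length. */
	memset(c->data + len, 0, aligned - len);
	c->len = aligned;
	return 0;
}
```

Zeroing the whole buffer up front, as suggested earlier in the thread, would work as well; clearing only the padding touches less memory per call.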


> Related question (not only for this patch) regarding snp_certs_data
> (host or per-instance): why is its size page-aligned at all? why is it
> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
> is never sent to the PSP.
>

The buffer is meant to be copied into the guest driver following the
GHCB extended guest request protocol. The data to copy back are
expected to be in 4K page granularity.

> [...]
> >
> > -#define SEV_FW_BLOB_MAX_SIZE 0x4000  /* 16KB */
> > +#define SEV_FW_BLOB_MAX_SIZE 0x5000  /* 20KB */
> >
>
> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
> alloc_snp_host_map).  Is that OK?
>

No, this was a mistake of mine because I was using a bloated data
encoding that needed 5 pages for the GUID table plus 4 small
certificates. I've since fixed that in our user space code.
We shouldn't change this size and instead wait for a better size
negotiation protocol between the guest and host to avoid this awkward
hard-coding.

Tom Lendacky Jan. 9, 2023, 10:27 p.m. UTC | #3

On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>> +
>>> +static int snp_set_instance_certs(struct kvm *kvm, struct kvm_sev_cmd *argp)
>>> +{
>> [...]
>>
>> Here we set the length to the page-aligned value, but we copy only
>> params.certs_len bytes.  If there are two subsequent
>> snp_set_instance_certs() calls where the second one has a shorter
>> length, we might "keep" some leftover bytes from the first call.
>>
>> Consider:
>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>>
>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
>> 1) & PAGE_MASK which will be 8192.
>>
>> Later when fetching the certs (for the extended report or in
>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>> filled with 4097 BBBs and 4095 leftover AAAs.
>>
>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>
> 
> Yes, I agree it should be zeroed, at least if the previous length is
> greater than the new length. Good catch.
> 
> 
>> Related question (not only for this patch) regarding snp_certs_data
>> (host or per-instance): why is its size page-aligned at all? why is it
>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
>> is never sent to the PSP.
>>
> 
> The buffer is meant to be copied into the guest driver following the
> GHCB extended guest request protocol. The data to copy back are
> expected to be in 4K page granularity.

I don't think the data has to be in 4K page granularity. Why do you think 
it does?

Thanks,
Tom


Dov Murik Jan. 10, 2023, 7:10 a.m. UTC | #4

Hi Tom,

On 10/01/2023 0:27, Tom Lendacky wrote:
> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>> +
>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>> kvm_sev_cmd *argp)
>>>> +{
>>> [...]
>>>
>>> Here we set the length to the page-aligned value, but we copy only
>>> params.certs_len bytes.  If there are two subsequent
>>> snp_set_instance_certs() calls where the second one has a shorter
>>> length, we might "keep" some leftover bytes from the first call.
>>>
>>> Consider:
>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>>>
>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
>>> 1) & PAGE_MASK which will be 8192.
>>>
>>> Later when fetching the certs (for the extended report or in
>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>
>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>
>>
>> Yes, I agree it should be zeroed, at least if the previous length is
>> greater than the new length. Good catch.
>>
>>
>>> Related question (not only for this patch) regarding snp_certs_data
>>> (host or per-instance): why is its size page-aligned at all? why is it
>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
>>> is never sent to the PSP.
>>>
>>
>> The buffer is meant to be copied into the guest driver following the
>> GHCB extended guest request protocol. The data to copy back are
>> expected to be in 4K page granularity.
> 
> I don't think the data has to be in 4K page granularity. Why do you
> think it does?
> 

I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
Block Standardization (July 2022), page 37.  The table says:

--------------

NAE Event: SNP Extended Guest Request

Notes:

RAX will have the guest physical address of the page(s) to hold returned
data

RBX
State to Hypervisor: will contain the number of guest contiguous
pages supplied to hold returned data
State from Hypervisor: on error will contain the number of guest
contiguous pages required to hold the data to be returned

...

The request page, response page and data page(s) must be assigned to the
hypervisor (shared).

--------------


According to this spec, it looks like the sizes are communicated as a
number of pages in RBX.  So the data should start at a 4KB alignment
(this is verified in snp_handle_ext_guest_request()) and its length
should be 4KB-aligned, as Dionna noted.

I see no reason (in the spec or in the kernel code) for the data length
to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE), but I might be missing some
flow because Dionna ran into this limit.


-Dov




Tom Lendacky Jan. 10, 2023, 3:10 p.m. UTC | #5

On 1/10/23 01:10, Dov Murik wrote:
> Hi Tom,
> 
> On 10/01/2023 0:27, Tom Lendacky wrote:
>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>> +
>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>> kvm_sev_cmd *argp)
>>>>> +{
>>>> [...]
>>>>
>>>> Here we set the length to the page-aligned value, but we copy only
>>>> params.certs_len bytes.  If there are two subsequent
>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>> length, we might "keep" some leftover bytes from the first call.
>>>>
>>>> Consider:
>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>>>>
>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
>>>> 1) & PAGE_MASK which will be 8192.
>>>>
>>>> Later when fetching the certs (for the extended report or in
>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>
>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>
>>>
>>> Yes, I agree it should be zeroed, at least if the previous length is
>>> greater than the new length. Good catch.
>>>
>>>
>>>> Related question (not only for this patch) regarding snp_certs_data
>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
>>>> is never sent to the PSP.
>>>>
>>>
>>> The buffer is meant to be copied into the guest driver following the
>>> GHCB extended guest request protocol. The data to copy back are
>>> expected to be in 4K page granularity.
>>
>> I don't think the data has to be in 4K page granularity. Why do you
>> think it does?
>>
> 
> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
> Block Standardization (July 2022), page 37.  The table says:
> 
> --------------
> 
> NAE Event: SNP Extended Guest Request
> 
> Notes:
> 
> RAX will have the guest physical address of the page(s) to hold returned
> data
> 
> RBX
> State to Hypervisor: will contain the number of guest contiguous
> pages supplied to hold returned data
> State from Hypervisor: on error will contain the number of guest
> contiguous pages required to hold the data to be returned
> 
> ...
> 
> The request page, response page and data page(s) must be assigned to the
> hypervisor (shared).
> 
> --------------
> 
> 
> According to this spec, it looks like the sizes are communicated as
> number of pages in RBX.  So the data should start at a 4KB alignment
> (this is verified in snp_handle_ext_guest_request()) and its length
> should be 4KB-aligned, as Dionna noted.

That only indicates how many pages are required to hold the data, but the 
hypervisor only has to copy however much data is present. If the data is 
20 bytes, then you only have to copy 20 bytes. If the user supplied 0 for 
the number of pages, then the code returns 1 in RBX to indicate that one 
page is required to hold the 20 bytes.
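
To make the distinction concrete, here is a small model of that negotiation. It is a sketch under assumptions, not the KVM handler: demo_pages_for and demo_ext_request are hypothetical names, and the npages argument plays the role of RBX in both directions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096UL

/* Number of 4K pages needed to hold len bytes (0 bytes needs 0 pages). */
static unsigned long demo_pages_for(size_t len)
{
	return (len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/*
 * Hypothetical hypervisor-side model. On entry *npages is the number of
 * pages the guest supplied. If that is too few, *npages is updated to the
 * required count and -1 is returned. Otherwise exactly certs_len bytes
 * are copied (not a whole number of pages) and 0 is returned.
 */
static int demo_ext_request(const unsigned char *certs, size_t certs_len,
			    unsigned char *guest_buf, unsigned long *npages)
{
	assert(npages != NULL);

	if (*npages < demo_pages_for(certs_len)) {
		*npages = demo_pages_for(certs_len);
		return -1;
	}
	if (certs_len)
		memcpy(guest_buf, certs, certs_len);
	return 0;
}
```

With 20 bytes of data and zero pages supplied, the first call fails and reports one page; the retry with one page copies only the 20 bytes and leaves the rest of the guest page untouched.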

> 
> I see no reason (in the spec and in the kernel code) for the data length
> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
> flow because Dionna ran into this limit.

Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way 
to keep the memory usage controlled because data is coming from userspace 
and it isn't expected that the data would be larger than that.

I'm not sure if that was there from the start or was added as a result
of a review comment. I'm not sure what the best approach is.

Thanks,
Tom


Peter Gonda Jan. 10, 2023, 3:23 p.m. UTC | #6

On Tue, Jan 10, 2023 at 8:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 1/10/23 01:10, Dov Murik wrote:
> > Hi Tom,
> >
> > On 10/01/2023 0:27, Tom Lendacky wrote:
> >> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
> >>>>> +
> >>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
> >>>>> kvm_sev_cmd *argp)
> >>>>> +{
> >>>> [...]
> >>>>
> >>>> Here we set the length to the page-aligned value, but we copy only
> >>>> params.certs_len bytes.  If there are two subsequent
> >>>> snp_set_instance_certs() calls where the second one has a shorter
> >>>> length, we might "keep" some leftover bytes from the first call.
> >>>>
> >>>> Consider:
> >>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
> >>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
> >>>>
> >>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
> >>>> bytes into the to_certs buffer, but length will be (4097 + PAGE_SIZE -
> >>>> 1) & PAGE_MASK which will be 8192.
> >>>>
> >>>> Later when fetching the certs (for the extended report or in
> >>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
> >>>> filled with 4097 BBBs and 4095 leftover AAAs.
> >>>>
> >>>> Maybe zero sev->snp_certs_data entirely before writing to it?
> >>>>
> >>>
> >>> Yes, I agree it should be zeroed, at least if the previous length is
> >>> greater than the new length. Good catch.
> >>>
> >>>
> >>>> Related question (not only for this patch) regarding snp_certs_data
> >>>> (host or per-instance): why is its size page-aligned at all? why is it
> >>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
> >>>> is never sent to the PSP.
> >>>>
> >>>
> >>> The buffer is meant to be copied into the guest driver following the
> >>> GHCB extended guest request protocol. The data to copy back are
> >>> expected to be in 4K page granularity.
> >>
> >> I don't think the data has to be in 4K page granularity. Why do you
> >> think it does?
> >>
> >
> > I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
> > Block Standardization (July 2022), page 37.  The table says:
> >
> > --------------
> >
> > NAE Event: SNP Extended Guest Request
> >
> > Notes:
> >
> > RAX will have the guest physical address of the page(s) to hold returned
> > data
> >
> > RBX
> > State to Hypervisor: will contain the number of guest contiguous
> > pages supplied to hold returned data
> > State from Hypervisor: on error will contain the number of guest
> > contiguous pages required to hold the data to be returned
> >
> > ...
> >
> > The request page, response page and data page(s) must be assigned to the
> > hypervisor (shared).
> >
> > --------------
> >
> >
> > According to this spec, it looks like the sizes are communicated as
> > number of pages in RBX.  So the data should start at a 4KB alignment
> > (this is verified in snp_handle_ext_guest_request()) and its length
> > should be 4KB-aligned, as Dionna noted.
>
> That only indicates how many pages are required to hold the data, but the
> hypervisor only has to copy however much data is present. If the data is
> 20 bytes, then you only have to copy 20 bytes. If the user supplied 0 for
> the number of pages, then the code returns 1 in RBX to indicate that one
> page is required to hold the 20 bytes.
>
> >
> > I see no reason (in the spec and in the kernel code) for the data length
> > to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
> > flow because Dionna ran into this limit.
>
> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
> to keep the memory usage controlled because data is coming from userspace
> and it isn't expected that the data would be larger than that.
>
> I'm not sure if that was in from the start or as a result of a review
> comment. Not sure what the best approach is.

This also came up recently in the guest driver changes:
SEV_FW_BLOB_MAX_SIZE is used in the guest driver code as the max cert
length. We discussed increasing the limit there after fixing the IV
reuse issue.

Maybe we could introduce SEV_CERT_BLOB_MAX_SIZE here to make it clearer
that there is no firmware-based limit? Then we could switch the guest
driver to use that too. Dionna confirmed 4 pages is enough for our
current use case. Dov, would you recommend something larger to start?
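
A minimal sketch of what a separate limit could look like, assuming the 4-page value mentioned above. SEV_CERT_BLOB_MAX_SIZE is only a proposed name at this point, and demo_cert_len_ok is a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Proposed name from this thread; the value is an assumption (4 pages).
 * It is a host software policy, not a firmware-imposed limit.
 */
#define SEV_CERT_BLOB_MAX_SIZE 0x4000UL

/* Keep the limit a whole number of 4K pages so page rounding cannot exceed it. */
_Static_assert(SEV_CERT_BLOB_MAX_SIZE % 4096UL == 0,
	       "cert blob limit must be page-aligned");

/* Returns nonzero if a userspace-supplied cert length is acceptable. */
static int demo_cert_len_ok(size_t certs_len)
{
	return certs_len <= SEV_CERT_BLOB_MAX_SIZE;
}
```

Keeping the cert limit in its own macro would let it change without affecting the firmware blob users in drivers/crypto/ccp/sev-dev.c.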

>
> Thanks,
> Tom
>
> >
> >
> > -Dov
> >
> >
> >
> >> Thanks,
> >> Tom
> >>
> >>>
> >>>> [...]
> >>>>>
> >>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000  /* 16KB */
> >>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000  /* 20KB */
> >>>>>
> >>>>
> > >>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
> > >>>> alloc_snp_host_map).  Is that OK?
> >>>>
> >>>
> >>> No, this was a mistake of mine because I was using a bloated data
> >>> encoding that needed 5 pages for the GUID table plus 4 small
> >>> certificates. I've since fixed that in our user space code.
> >>> We shouldn't change this size and instead wait for a better size
> >>> negotiation protocol between the guest and host to avoid this awkward
> >>> hard-coding.
> >>>
> >>>

Dov Murik Jan. 11, 2023, 6 a.m. UTC | #7

On 10/01/2023 17:10, Tom Lendacky wrote:
> On 1/10/23 01:10, Dov Murik wrote:
>> Hi Tom,
>>
>> On 10/01/2023 0:27, Tom Lendacky wrote:
>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>>> +
>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>>> kvm_sev_cmd *argp)
>>>>>> +{
>>>>> [...]
>>>>>
>>>>> Here we set the length to the page-aligned value, but we copy only
>>>>> params.cert_len bytes.  If there are two subsequent
>>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>>> length, we might "keep" some leftover bytes from the first call.
>>>>>
>>>>> Consider:
>>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...",
>>>>> certs_len=8192)
>>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...",
>>>>> certs_len=4097)
>>>>>
>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>>> bytes into the to_certs buffer, but length will be (4096 + PAGE_SIZE -
>>>>> 1) & PAGE_MASK which will be 8192.
>>>>>
>>>>> Later when fetching the certs (for the extended report or in
>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>>
>>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>>
>>>>
>>>> Yes, I agree it should be zeroed, at least if the previous length is
>>>> greater than the new length. Good catch.
>>>>
>>>>
>>>>> Related question (not only for this patch) regarding snp_certs_data
>>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this
>>>>> buffer
>>>>> is never sent to the PSP.
>>>>>
>>>>
>>>> The buffer is meant to be copied into the guest driver following the
>>>> GHCB extended guest request protocol. The data to copy back are
>>>> expected to be in 4K page granularity.
>>>
>>> I don't think the data has to be in 4K page granularity. Why do you
>>> think it does?
>>>
>>
>> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
>> Block Standardization (July 2022), page 37.  The table says:
>>
>> --------------
>>
>> NAE Event: SNP Extended Guest Request
>>
>> Notes:
>>
>> RAX will have the guest physical address of the page(s) to hold returned
>> data
>>
>> RBX
>> State to Hypervisor: will contain the number of guest contiguous
>> pages supplied to hold returned data
>> State from Hypervisor: on error will contain the number of guest
>> contiguous pages required to hold the data to be returned
>>
>> ...
>>
>> The request page, response page and data page(s) must be assigned to the
>> hypervisor (shared).
>>
>> --------------
>>
>>
>> According to this spec, it looks like the sizes are communicated as
>> number of pages in RBX.  So the data should start at a 4KB alignment
>> (this is verified in snp_handle_ext_guest_request()) and its length
>> should be 4KB-aligned, as Dionna noted.
> 
> That only indicates how many pages are required to hold the data, but
> the hypervisor only has to copy however much data is present. If the
> data is 20 bytes, then you only have to copy 20 bytes. If the user
> supplied 0 for the number of pages, then the code returns 1 in RBX to
> indicate that one page is required to hold the 20 bytes.
> 


Maybe it should only copy 20 bytes, but the current implementation copies
whole 4KB pages:


        if (sev->snp_certs_len)
                data_npages = sev->snp_certs_len >> PAGE_SHIFT;
        ...
        ...
        /* Copy the certificate blob in the guest memory */
        if (data_npages &&
            kvm_write_guest(kvm, data_gpa, sev->snp_certs_data, data_npages << PAGE_SHIFT))
                rc = SEV_RET_INVALID_ADDRESS;


(Elsewhere we ensure that sev->snp_certs_len is page-aligned, so the assignment
to data_npages is in fact correct even though it looks off-by-one. As an aside,
it may be better to use a DIV_ROUND_UP macro wherever we calculate the number
of needed pages.)
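
To illustrate the aside above, here is a minimal userspace sketch (not the
kernel code) comparing the two ways of computing the page count. PAGE_SHIFT
and DIV_ROUND_UP mirror the usual kernel definitions; the function names
npages_shift() and npages_round_up() are made up for this example.

```c
#include <stddef.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((size_t)1 << PAGE_SHIFT)
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Truncating shift: only correct if len is already a multiple of PAGE_SIZE. */
size_t npages_shift(size_t len)
{
	return len >> PAGE_SHIFT;
}

/* Rounding division: gives the number of pages needed for any len. */
size_t npages_round_up(size_t len)
{
	return DIV_ROUND_UP(len, PAGE_SIZE);
}
```

For a page-aligned length such as 8192 both functions agree; for an
unaligned length such as 20 or 4097 the shift drops the partial page, which
is why the shift is only safe while the length is aligned elsewhere.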

Also -- how does the guest know they got only 20 bytes and not 4096? Do they have
to read all the 'struct cert_table' entries at the beginning of the received data?

-Dov


>>
>> I see no reason (in the spec and in the kernel code) for the data length
>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
>> flow because Dionna ran into this limit.
> 
> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
> to keep the memory usage controlled because data is coming from
> userspace and it isn't expected that the data would be larger than that.
> 
> I'm not sure if that was in from the start or as a result of a review
> comment. Not sure what the best approach is.
> 
> Thanks,
> Tom
> 
>>
>>
>> -Dov
>>
>>
>>
>>> Thanks,
>>> Tom
>>>
>>>>
>>>>> [...]
>>>>>>
>>>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000  /* 16KB */
>>>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000  /* 20KB */
>>>>>>
>>>>>
>>>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
>>>>> alloc_snp_host_map).  Is that OK?
>>>>>
>>>>
>>>> No, this was a mistake of mine because I was using a bloated data
>>>> encoding that needed 5 pages for the GUID table plus 4 small
>>>> certificates. I've since fixed that in our user space code.
>>>> We shouldn't change this size and instead wait for a better size
>>>> negotiation protocol between the guest and host to avoid this awkward
>>>> hard-coding.
>>>>
>>>>

Dov Murik Jan. 11, 2023, 7:26 a.m. UTC | #8

Hi Peter,

On 10/01/2023 17:23, Peter Gonda wrote:
> On Tue, Jan 10, 2023 at 8:10 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 1/10/23 01:10, Dov Murik wrote:
>>> Hi Tom,
>>>
>>> On 10/01/2023 0:27, Tom Lendacky wrote:
>>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>>>> +
>>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>>>> kvm_sev_cmd *argp)
>>>>>>> +{
>>>>>> [...]
>>>>>>
>>>>>> Here we set the length to the page-aligned value, but we copy only
>>>>>> params.cert_len bytes.  If there are two subsequent
>>>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>>>> length, we might "keep" some leftover bytes from the first call.
>>>>>>
>>>>>> Consider:
>>>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...", certs_len=8192)
>>>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...", certs_len=4097)
>>>>>>
>>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>>>> bytes into the to_certs buffer, but length will be (4096 + PAGE_SIZE -
>>>>>> 1) & PAGE_MASK which will be 8192.
>>>>>>
>>>>>> Later when fetching the certs (for the extended report or in
>>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>>>
>>>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>>>
>>>>>
>>>>> Yes, I agree it should be zeroed, at least if the previous length is
>>>>> greater than the new length. Good catch.
>>>>>
>>>>>
>>>>>> Related question (not only for this patch) regarding snp_certs_data
>>>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this buffer
>>>>>> is never sent to the PSP.
>>>>>>
>>>>>
>>>>> The buffer is meant to be copied into the guest driver following the
>>>>> GHCB extended guest request protocol. The data to copy back are
>>>>> expected to be in 4K page granularity.
>>>>
>>>> I don't think the data has to be in 4K page granularity. Why do you
>>>> think it does?
>>>>
>>>
>>> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
>>> Block Standardization (July 2022), page 37.  The table says:
>>>
>>> --------------
>>>
>>> NAE Event: SNP Extended Guest Request
>>>
>>> Notes:
>>>
>>> RAX will have the guest physical address of the page(s) to hold returned
>>> data
>>>
>>> RBX
>>> State to Hypervisor: will contain the number of guest contiguous
>>> pages supplied to hold returned data
>>> State from Hypervisor: on error will contain the number of guest
>>> contiguous pages required to hold the data to be returned
>>>
>>> ...
>>>
>>> The request page, response page and data page(s) must be assigned to the
>>> hypervisor (shared).
>>>
>>> --------------
>>>
>>>
>>> According to this spec, it looks like the sizes are communicated as
>>> number of pages in RBX.  So the data should start at a 4KB alignment
>>> (this is verified in snp_handle_ext_guest_request()) and its length
>>> should be 4KB-aligned, as Dionna noted.
>>
>> That only indicates how many pages are required to hold the data, but the
>> hypervisor only has to copy however much data is present. If the data is
>> 20 bytes, then you only have to copy 20 bytes. If the user supplied 0 for
>> the number of pages, then the code returns 1 in RBX to indicate that one
>> page is required to hold the 20 bytes.
>>
>>>
>>> I see no reason (in the spec and in the kernel code) for the data length
>>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
>>> flow because Dionna ran into this limit.
>>
>> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
>> to keep the memory usage controlled because data is coming from userspace
>> and it isn't expected that the data would be larger than that.
>>
>> I'm not sure if that was in from the start or as a result of a review
>> comment. Not sure what the best approach is.
> 
> This also came up recently in the guest driver changes:
> SEV_FW_BLOB_MAX_SIZE is used in the guest driver code as the max cert
> length. We discussed increasing the limit there after fixing the IV
> reuse issue.

I see it now.

(Joerg, maybe we should add F:drivers/virt/coco/ to the MAINTAINERS list
so that patches there are hopefully sent to linux-coco?)


> 
> Maybe we could introduce SEV_CERT_BLOB_MAX_SIZE here to make it clearer
> that there is no firmware-based limit? Then we could switch the guest
> driver to use that too. Dionna confirmed 4 pages is enough for our
> current use case. Dov, would you recommend something larger to start?
> 

Introducing a new constant sounds good to me (and using the same constant
in the guest driver).

I think 4 pages are OK; I also don't see real harm in increasing this
limit to 1 MB (if the host+guest agree to pass more stuff there, besides
certificates).  But maybe that's just abusing this channel, and for
other data we should use other mechanisms (like vsock).

-Dov

Tom Lendacky Jan. 11, 2023, 2:32 p.m. UTC | #9

On 1/11/23 00:00, Dov Murik wrote:
> 
> 
> On 10/01/2023 17:10, Tom Lendacky wrote:
>> On 1/10/23 01:10, Dov Murik wrote:
>>> Hi Tom,
>>>
>>> On 10/01/2023 0:27, Tom Lendacky wrote:
>>>> On 1/9/23 10:55, Dionna Amalie Glaze wrote:
>>>>>>> +
>>>>>>> +static int snp_set_instance_certs(struct kvm *kvm, struct
>>>>>>> kvm_sev_cmd *argp)
>>>>>>> +{
>>>>>> [...]
>>>>>>
>>>>>> Here we set the length to the page-aligned value, but we copy only
>>>>>> params.cert_len bytes.  If there are two subsequent
>>>>>> snp_set_instance_certs() calls where the second one has a shorter
>>>>>> length, we might "keep" some leftover bytes from the first call.
>>>>>>
>>>>>> Consider:
>>>>>> 1. snp_set_instance_certs(certs_addr point to "AAA...",
>>>>>> certs_len=8192)
>>>>>> 2. snp_set_instance_certs(certs_addr point to "BBB...",
>>>>>> certs_len=4097)
>>>>>>
>>>>>> If I understand correctly, on the second call we'll copy 4097 "BBB..."
>>>>>> bytes into the to_certs buffer, but length will be (4096 + PAGE_SIZE -
>>>>>> 1) & PAGE_MASK which will be 8192.
>>>>>>
>>>>>> Later when fetching the certs (for the extended report or in
>>>>>> snp_get_instance_certs()) the user will get a buffer of 8192 bytes
>>>>>> filled with 4097 BBBs and 4095 leftover AAAs.
>>>>>>
>>>>>> Maybe zero sev->snp_certs_data entirely before writing to it?
>>>>>>
>>>>>
>>>>> Yes, I agree it should be zeroed, at least if the previous length is
>>>>> greater than the new length. Good catch.
>>>>>
>>>>>
>>>>>> Related question (not only for this patch) regarding snp_certs_data
>>>>>> (host or per-instance): why is its size page-aligned at all? why is it
>>>>>> limited by 16KB or 20KB? If I understand correctly, for SNP, this
>>>>>> buffer
>>>>>> is never sent to the PSP.
>>>>>>
>>>>>
>>>>> The buffer is meant to be copied into the guest driver following the
>>>>> GHCB extended guest request protocol. The data to copy back are
>>>>> expected to be in 4K page granularity.
>>>>
>>>> I don't think the data has to be in 4K page granularity. Why do you
>>>> think it does?
>>>>
>>>
>>> I looked at AMD publication 56421 SEV-ES Guest-Hypervisor Communication
>>> Block Standardization (July 2022), page 37.  The table says:
>>>
>>> --------------
>>>
>>> NAE Event: SNP Extended Guest Request
>>>
>>> Notes:
>>>
>>> RAX will have the guest physical address of the page(s) to hold returned
>>> data
>>>
>>> RBX
>>> State to Hypervisor: will contain the number of guest contiguous
>>> pages supplied to hold returned data
>>> State from Hypervisor: on error will contain the number of guest
>>> contiguous pages required to hold the data to be returned
>>>
>>> ...
>>>
>>> The request page, response page and data page(s) must be assigned to the
>>> hypervisor (shared).
>>>
>>> --------------
>>>
>>>
>>> According to this spec, it looks like the sizes are communicated as
>>> number of pages in RBX.  So the data should start at a 4KB alignment
>>> (this is verified in snp_handle_ext_guest_request()) and its length
>>> should be 4KB-aligned, as Dionna noted.
>>
>> That only indicates how many pages are required to hold the data, but
>> the hypervisor only has to copy however much data is present. If the
>> data is 20 bytes, then you only have to copy 20 bytes. If the user
>> supplied 0 for the number of pages, then the code returns 1 in RBX to
>> indicate that one page is required to hold the 20 bytes.
>>
> 
> 
> Maybe it should only copy 20 bytes, but the current implementation copies
> whole 4KB pages:
> 
> 
>          if (sev->snp_certs_len)
>                  data_npages = sev->snp_certs_len >> PAGE_SHIFT;
>          ...
>          ...
>          /* Copy the certificate blob in the guest memory */
>          if (data_npages &&
>              kvm_write_guest(kvm, data_gpa, sev->snp_certs_data, data_npages << PAGE_SHIFT))
>                  rc = SEV_RET_INVALID_ADDRESS;
> 
> 
> (Elsewhere we ensure that sev->snp_certs_len is page-aligned, so the assignment
> to data_npages is in fact correct even though it looks off-by-one. As an aside,
> it may be better to use a DIV_ROUND_UP macro wherever we calculate the number
> of needed pages.)

Hmmm... yeah, not sure why it was implemented that way. I guess it can
always be changed later if desired.

> 
> Also -- how does the guest know they got only 20 bytes and not 4096? Do they have
> to read all the 'struct cert_table' entries at the beginning of the received data?

Yes, they should walk the cert table entries.
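
A rough sketch of such a walk, for illustration only. It assumes the table
layout as I understand it from the GHCB spec: 24-byte entries (16-byte GUID,
32-bit offset, 32-bit length, both relative to the start of the data),
terminated by an all-zero entry. The struct and function names here are
hypothetical and not taken from the guest driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed entry layout; 24 bytes, no padding. */
struct cert_table_entry {
	uint8_t  guid[16];
	uint32_t offset;
	uint32_t length;
};

/*
 * Return the number of bytes of buf actually used by the table and the
 * certificates it describes, or 0 if no terminator is found or an entry
 * points outside the buffer.
 */
size_t cert_blob_used_len(const uint8_t *buf, size_t buf_len)
{
	static const uint8_t zero_guid[16];
	uint64_t max_end = 0;
	size_t pos;

	for (pos = 0; pos + sizeof(struct cert_table_entry) <= buf_len;
	     pos += sizeof(struct cert_table_entry)) {
		struct cert_table_entry e;
		uint64_t end;

		memcpy(&e, buf + pos, sizeof(e));

		/* An all-zero entry terminates the table. */
		if (!memcmp(e.guid, zero_guid, sizeof(zero_guid)) &&
		    !e.offset && !e.length) {
			uint64_t table_end = pos + sizeof(e);

			return (size_t)(max_end > table_end ? max_end : table_end);
		}

		/* Reject entries that point past the end of the buffer. */
		end = (uint64_t)e.offset + e.length;
		if (end > buf_len)
			return 0;
		if (end > max_end)
			max_end = end;
	}

	return 0; /* ran off the buffer without seeing a terminator */
}
```

With one entry describing 20 bytes at offset 48 followed by the terminator,
this reports 68 bytes used even if the guest received a full 4096-byte page.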

Thanks,
Tom

> 
> -Dov
> 
> 
>>>
>>> I see no reason (in the spec and in the kernel code) for the data length
>>> to be limited to 16KB (SEV_FW_BLOB_MAX_SIZE) but I might be missing some
>>> flow because Dionna ran into this limit.
>>
>> Correct, there is no limit. I believe that SEV_FW_BLOB_MAX_SIZE is a way
>> to keep the memory usage controlled because data is coming from
>> userspace and it isn't expected that the data would be larger than that.
>>
>> I'm not sure if that was in from the start or as a result of a review
>> comment. Not sure what the best approach is.
>>
>> Thanks,
>> Tom
>>
>>>
>>>
>>> -Dov
>>>
>>>
>>>
>>>> Thanks,
>>>> Tom
>>>>
>>>>>
>>>>>> [...]
>>>>>>>
>>>>>>> -#define SEV_FW_BLOB_MAX_SIZE 0x4000  /* 16KB */
>>>>>>> +#define SEV_FW_BLOB_MAX_SIZE 0x5000  /* 20KB */
>>>>>>>
>>>>>>
>>>>>> This has effects in drivers/crypto/ccp/sev-dev.c (for example in
>>>>>> alloc_snp_host_map).  Is that OK?
>>>>>>
>>>>>
>>>>> No, this was a mistake of mine because I was using a bloated data
>>>>> encoding that needed 5 pages for the GUID table plus 4 small
>>>>> certificates. I've since fixed that in our user space code.
>>>>> We shouldn't change this size and instead wait for a better size
>>>>> negotiation protocol between the guest and host to avoid this awkward
>>>>> hard-coding.
>>>>>
>>>>>

Dionna Glaze Jan. 19, 2023, 6:49 p.m. UTC | #10

> +
> +       /* Page-align the length */
> +       length = (params.certs_len + PAGE_SIZE - 1) & PAGE_MASK;
> +

I believe Ashish wanted this to be PAGE_ALIGN(params.certs_len)

Kalra, Ashish Jan. 19, 2023, 10:18 p.m. UTC | #11

Hello Dionna,

Do you also have other updates to this patch with regard to review
comments from Dov?

Thanks,
Ashish

On 1/19/2023 12:49 PM, Dionna Amalie Glaze wrote:
>> +
>> +       /* Page-align the length */
>> +       length = (params.certs_len + PAGE_SIZE - 1) & PAGE_MASK;
>> +
> 
> I believe Ashish wanted this to be PAGE_ALIGN(params.certs_len)
>

Dionna Glaze Jan. 20, 2023, 1:40 a.m. UTC | #12

On Thu, Jan 19, 2023 at 2:18 PM Kalra, Ashish <ashish.kalra@amd.com> wrote:
>
> Hello Dionna,
>
> Do you also have other updates to this patch with regard to review
> comments from Dov?
>

Apart from the PAGE_ALIGN change, the whole discussion appears to call
only for adding the following immediately before the copy_from_user of
certs_uaddr in the snp_set_instance_certs function:

/* The size could shrink and leave garbage at the end. */
memset(sev->snp_certs_data, 0, SEV_FW_BLOB_MAX_SIZE);

I don't believe there is an off-by-one with the page shifting for the
number of pages because snp_certs_len is already rounded up to a
multiple of the page size. Any other change wrt the way the blob size is
decided between the guest and host should come later.
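
For illustration, a small userspace model of that fix (not the actual KVM
code). The store is zeroed before each copy so a shorter blob cannot leave
stale bytes from a longer one, and the length is rounded up with
PAGE_ALIGN. struct certs_store, certs_store_set() and CERT_BLOB_MAX_SIZE
are hypothetical stand-ins for the per-instance certs buffer,
snp_set_instance_certs() and SEV_FW_BLOB_MAX_SIZE; memcpy() stands in for
copy_from_user().

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
#define CERT_BLOB_MAX_SIZE 0x4000UL /* 16KB, stand-in for SEV_FW_BLOB_MAX_SIZE */

struct certs_store {
	unsigned char data[CERT_BLOB_MAX_SIZE];
	size_t len; /* page-aligned length exposed to readers */
};

/* Replace the stored blob; returns 0 on success, -1 if src_len is too big. */
int certs_store_set(struct certs_store *s, const void *src, size_t src_len)
{
	if (src_len > CERT_BLOB_MAX_SIZE)
		return -1;

	/* The size could shrink and leave garbage at the end. */
	memset(s->data, 0, sizeof(s->data));
	memcpy(s->data, src, src_len);

	/* Page-align the length */
	s->len = PAGE_ALIGN(src_len);
	return 0;
}
```

Replaying Dov's scenario (8192 bytes of 'A', then 4097 bytes of 'B') leaves
a length of 8192 with zeros after the 4097th byte rather than leftover 'A's.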
