[V4] virt: sev: Prevent IV reuse in SNP guest driver

Message ID	20221103152318.88354-1-pgonda@google.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> Date: Thu, 3 Nov 2022 08:23:18 -0700 Message-Id: <20221103152318.88354-1-pgonda@google.com> Mime-Version: 1.0 Subject: [PATCH V4] virt: sev: Prevent IV reuse in SNP guest driver From: Peter Gonda <pgonda@google.com> To: thomas.lendacky@amd.com Cc: Peter Gonda <pgonda@google.com>, Dionna Glaze <dionnaglaze@google.com>, Borislav Petkov <bp@suse.de>, Michael Roth <michael.roth@amd.com>, Haowen Bai <baihaowen@meizu.com>, Yang Yingliang <yangyingliang@huawei.com>, Marc Orr <marcorr@google.com>, David Rientjes <rientjes@google.com>, Ashish Kalra <Ashish.Kalra@amd.com>, linux-kernel@vger.kernel.org, kvm@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	[V4] virt: sev: Prevent IV reuse in SNP guest driver \| expand [V4] virt: sev: Prevent IV reuse in SNP guest driver

Peter Gonda Nov. 3, 2022, 3:23 p.m. UTC

The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
communicate securely with each other. The IV to this scheme is a
sequence number that both the ASP and the guest track. Currently this
sequence number in a guest request must exactly match the sequence
number tracked by the ASP. This means that if the guest sees an error
from the host during a request it can only retry that exact request or
disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
reuse see:
https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf

To handle userspace querying the cert_data length handle_guest_request()
now: saves the number of pages required by the host, retries the request
without requesting the extended data, then returns the number of pages
required.

Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
Signed-off-by: Peter Gonda <pgonda@google.com>
Reported-by: Peter Gonda <pgonda@google.com>
Cc: Dionna Glaze <dionnaglaze@google.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Michael Roth <michael.roth@amd.com>
Cc: Haowen Bai <baihaowen@meizu.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Marc Orr <marcorr@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Ashish Kalra <Ashish.Kalra@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org
---
Tested by placing each of the guest requests: attestation quote,
extended attestation quote, and get key. Then tested the extended
attestation quote certificate length querying.

V4
 * As suggested by Dionna moved the extended request retry logic into
   the driver.
 * Due to big change in patch dropped any reviewed-by tags.

---
 drivers/virt/coco/sev-guest/sev-guest.c | 70 +++++++++++++++++++------
 1 file changed, 53 insertions(+), 17 deletions(-)

Tom Lendacky Nov. 4, 2022, 5:39 p.m. UTC | #1

On 11/3/22 10:23, Peter Gonda wrote:
> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
> communicate securely with each other. The IV to this scheme is a
> sequence number that both the ASP and the guest track. Currently this
> sequence number in a guest request must exactly match the sequence
> number tracked by the ASP. This means that if the guest sees an error
> from the host during a request it can only retry that exact request or
> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> reuse see:
> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
> 
> To handle userspace querying the cert_data length handle_guest_request()
> now: saves the number of pages required by the host, retries the request
> without requesting the extended data, then returns the number of pages
> required.
> 
> Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
> Signed-off-by: Peter Gonda <pgonda@google.com>
> Reported-by: Peter Gonda <pgonda@google.com>
> Cc: Dionna Glaze <dionnaglaze@google.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Michael Roth <michael.roth@amd.com>
> Cc: Haowen Bai <baihaowen@meizu.com>
> Cc: Yang Yingliang <yangyingliang@huawei.com>
> Cc: Marc Orr <marcorr@google.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Ashish Kalra <Ashish.Kalra@amd.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: kvm@vger.kernel.org

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>

> ---
> Tested by placing each of the guest requests: attestation quote,
> extended attestation quote, and get key. Then tested the extended
> attestation quote certificate length querying.
> 
> V4
>   * As suggested by Dionna moved the extended request retry logic into
>     the driver.
>   * Due to big change in patch dropped any reviewed-by tags.
> 
> ---

Borislav Petkov Nov. 11, 2022, 4:46 p.m. UTC | #2

On Thu, Nov 03, 2022 at 08:23:18AM -0700, Peter Gonda wrote:
> The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to

ASP?

That must be the AMD Secure Processor or so but pls write it out.

> communicate securely with each other. The IV to this scheme is a
> sequence number that both the ASP and the guest track. Currently this
> sequence number in a guest request must exactly match the sequence
> number tracked by the ASP. This means that if the guest sees an error
> from the host during a request it can only retry that exact request or
> disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> reuse see:
> https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf

Right, how stable will that link be?

IOW, perhaps quote the paper name and authors so that people can find it
on their own.

> To handle userspace querying the cert_data length handle_guest_request()
> now: saves the number of pages required by the host, retries the request

This needs to sound like this:

"In order to address this, save the number of pages ..."

IOW, as the docs say:

"Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
to do frotz", as if you are giving orders to the codebase to change
its behaviour."

> without requesting the extended data, then returns the number of pages
> required.
> 
> Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")

I'm guessing this needs to go to stable?

> Signed-off-by: Peter Gonda <pgonda@google.com>
> Reported-by: Peter Gonda <pgonda@google.com>
> Cc: Dionna Glaze <dionnaglaze@google.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Michael Roth <michael.roth@amd.com>
> Cc: Haowen Bai <baihaowen@meizu.com>
> Cc: Yang Yingliang <yangyingliang@huawei.com>
> Cc: Marc Orr <marcorr@google.com>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Ashish Kalra <Ashish.Kalra@amd.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: kvm@vger.kernel.org
> ---
> Tested by placing each of the guest requests: attestation quote,
> extended attestation quote, and get key. Then tested the extended
> attestation quote certificate length querying.
> 
> V4
>  * As suggested by Dionna moved the extended request retry logic into
>    the driver.
>  * Due to big change in patch dropped any reviewed-by tags.
> 
> ---
>  drivers/virt/coco/sev-guest/sev-guest.c | 70 +++++++++++++++++++------
>  1 file changed, 53 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> index f422f9c58ba79..7dd6337ebdd5b 100644
> --- a/drivers/virt/coco/sev-guest/sev-guest.c
> +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> @@ -41,7 +41,7 @@ struct snp_guest_dev {
>  	struct device *dev;
>  	struct miscdevice misc;
>  
> -	void *certs_data;
> +	u8 (*certs_data)[SEV_FW_BLOB_MAX_SIZE];
>  	struct snp_guest_crypto *crypto;
>  	struct snp_guest_msg *request, *response;
>  	struct snp_secrets_page_layout *layout;
> @@ -67,8 +67,27 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
>  	return true;
>  }
>  
> +/*
> + * If we receive an error from the host or ASP we have two options. We can

Please use passive voice in your commit message: no "we" or "I", etc,
and describe your changes in imperative mood.

Bottom line is: personal pronouns are ambiguous in text, especially with
so many parties/companies/etc developing the kernel so let's avoid them
please.

> + * either retry the exact same encrypted request or we can discontinue using the
> + * VMPCK.
> + *
> + * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
> + * encrypt the requests. The IV for this scheme is the sequence number. GCM
> + * cannot tolerate IV reuse.
> + *
> + * The ASP FW v1.51 only increments the sequence numbers on a successful
> + * guest<->ASP back and forth and only accepts messages at its exact sequence
> + * number.
> + *
> + * So if we were to reuse the sequence number the encryption scheme is
> + * vulnerable. If we encrypt the sequence number for a fresh IV the ASP will
> + * reject our request.
> + */
>  static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
>  {
> +	dev_alert(snp_dev->dev, "Disabling vmpck_id: %d to prevent IV reuse.\n",
> +		  vmpck_id);
>  	memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
>  	snp_dev->vmpck = NULL;
>  }
> @@ -323,32 +342,49 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
>  
>  	/* Call firmware to process the request */
>  	rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> +
> +	/*
> +	 * If the extended guest request fails due to having to small of a

"... too small... "

> +	 * certificate data buffer retry the same guest request without the
> +	 * extended data request.
> +	 */
> +	if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> +	    err == SNP_GUEST_REQ_INVALID_LEN) {
> +		const unsigned int certs_npages = snp_dev->input.data_npages;
> +
> +		exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> +		rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> +
> +		err = SNP_GUEST_REQ_INVALID_LEN;

Huh, why are we overwriting err here?

> +		snp_dev->input.data_npages = certs_npages;
> +	}
> +
>  	if (fw_err)
>  		*fw_err = err;
>  
> -	if (rc)
> -		return rc;
> +	if (rc) {
> +		dev_alert(snp_dev->dev,
> +			  "Detected error from ASP request. rc: %d, fw_err: %llu\n",
> +			  rc, *fw_err);
> +		goto disable_vmpck;
> +	}
>  
> -	/*
> -	 * The verify_and_dec_payload() will fail only if the hypervisor is
> -	 * actively modifying the message header or corrupting the encrypted payload.
> -	 * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
> -	 * the key cannot be used for any communication. The key is disabled to ensure
> -	 * that AES-GCM does not use the same IV while encrypting the request payload.
> -	 */
>  	rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
>  	if (rc) {
>  		dev_alert(snp_dev->dev,
> -			  "Detected unexpected decode failure, disabling the vmpck_id %d\n",
> -			  vmpck_id);
> -		snp_disable_vmpck(snp_dev);
> -		return rc;
> +			  "Detected unexpected decode failure from ASP. rc: %d\n",
> +			  rc);
> +		goto disable_vmpck;
>  	}
>  
>  	/* Increment to new message sequence after payload decryption was successful. */
>  	snp_inc_msg_seqno(snp_dev);
>  
>  	return 0;
> +
> +disable_vmpck:
> +	snp_disable_vmpck(snp_dev);
> +	return rc;
>  }
>  
>  static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> @@ -676,7 +712,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
>  	if (!snp_dev->response)
>  		goto e_free_request;
>  
> -	snp_dev->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
> +	snp_dev->certs_data = alloc_shared_pages(dev, sizeof(*snp_dev->certs_data));

What's that change for?

I went searching for that ->certs_data only ot realize that it is an
array of size of SEV_FW_BLOB_MAX_SIZE elems.

Thx.

Peter Gonda Nov. 14, 2022, 9:11 p.m. UTC | #3

On Fri, Nov 11, 2022 at 9:46 AM Borislav Petkov <bp@suse.de> wrote:
>
> On Thu, Nov 03, 2022 at 08:23:18AM -0700, Peter Gonda wrote:
> > The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>
> ASP?
>
> That must be the AMD Secure Processor or so but pls write it out.
>
> > communicate securely with each other. The IV to this scheme is a
> > sequence number that both the ASP and the guest track. Currently this
> > sequence number in a guest request must exactly match the sequence
> > number tracked by the ASP. This means that if the guest sees an error
> > from the host during a request it can only retry that exact request or
> > disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> > reuse see:
> > https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>
> Right, how stable will that link be?
>
> IOW, perhaps quote the paper name and authors so that people can find it
> on their own.
>
> > To handle userspace querying the cert_data length handle_guest_request()
> > now: saves the number of pages required by the host, retries the request
>
> This needs to sound like this:
>
> "In order to address this, save the number of pages ..."
>
> IOW, as the docs say:
>
> "Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
> instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
> to do frotz", as if you are giving orders to the codebase to change
> its behaviour."

Thanks for detailed review. Working on cleaning up the text for a V5.

>
> > without requesting the extended data, then returns the number of pages
> > required.
> >
> > Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
>
> I'm guessing this needs to go to stable?
>
> > Signed-off-by: Peter Gonda <pgonda@google.com>
> > Reported-by: Peter Gonda <pgonda@google.com>
> > Cc: Dionna Glaze <dionnaglaze@google.com>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Tom Lendacky <thomas.lendacky@amd.com>
> > Cc: Michael Roth <michael.roth@amd.com>
> > Cc: Haowen Bai <baihaowen@meizu.com>
> > Cc: Yang Yingliang <yangyingliang@huawei.com>
> > Cc: Marc Orr <marcorr@google.com>
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Ashish Kalra <Ashish.Kalra@amd.com>
> > Cc: linux-kernel@vger.kernel.org
> > Cc: kvm@vger.kernel.org
> > ---
> > Tested by placing each of the guest requests: attestation quote,
> > extended attestation quote, and get key. Then tested the extended
> > attestation quote certificate length querying.
> >
> > V4
> >  * As suggested by Dionna moved the extended request retry logic into
> >    the driver.
> >  * Due to big change in patch dropped any reviewed-by tags.
> >
> > ---
> >  drivers/virt/coco/sev-guest/sev-guest.c | 70 +++++++++++++++++++------
> >  1 file changed, 53 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> > index f422f9c58ba79..7dd6337ebdd5b 100644
> > --- a/drivers/virt/coco/sev-guest/sev-guest.c
> > +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> > @@ -41,7 +41,7 @@ struct snp_guest_dev {
> >       struct device *dev;
> >       struct miscdevice misc;
> >
> > -     void *certs_data;
> > +     u8 (*certs_data)[SEV_FW_BLOB_MAX_SIZE];
> >       struct snp_guest_crypto *crypto;
> >       struct snp_guest_msg *request, *response;
> >       struct snp_secrets_page_layout *layout;
> > @@ -67,8 +67,27 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
> >       return true;
> >  }
> >
> > +/*
> > + * If we receive an error from the host or ASP we have two options. We can
>
> Please use passive voice in your commit message: no "we" or "I", etc,
> and describe your changes in imperative mood.
>
> Bottom line is: personal pronouns are ambiguous in text, especially with
> so many parties/companies/etc developing the kernel so let's avoid them
> please.
>
> > + * either retry the exact same encrypted request or we can discontinue using the
> > + * VMPCK.
> > + *
> > + * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
> > + * encrypt the requests. The IV for this scheme is the sequence number. GCM
> > + * cannot tolerate IV reuse.
> > + *
> > + * The ASP FW v1.51 only increments the sequence numbers on a successful
> > + * guest<->ASP back and forth and only accepts messages at its exact sequence
> > + * number.
> > + *
> > + * So if we were to reuse the sequence number the encryption scheme is
> > + * vulnerable. If we encrypt the sequence number for a fresh IV the ASP will
> > + * reject our request.
> > + */
> >  static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
> >  {
> > +     dev_alert(snp_dev->dev, "Disabling vmpck_id: %d to prevent IV reuse.\n",
> > +               vmpck_id);
> >       memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
> >       snp_dev->vmpck = NULL;
> >  }
> > @@ -323,32 +342,49 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
> >
> >       /* Call firmware to process the request */
> >       rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> > +
> > +     /*
> > +      * If the extended guest request fails due to having to small of a
>
> "... too small... "
>
> > +      * certificate data buffer retry the same guest request without the
> > +      * extended data request.
> > +      */
> > +     if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> > +         err == SNP_GUEST_REQ_INVALID_LEN) {
> > +             const unsigned int certs_npages = snp_dev->input.data_npages;
> > +
> > +             exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> > +             rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> > +
> > +             err = SNP_GUEST_REQ_INVALID_LEN;
>
> Huh, why are we overwriting err here?
>
> > +             snp_dev->input.data_npages = certs_npages;
> > +     }
> > +
> >       if (fw_err)
> >               *fw_err = err;
> >
> > -     if (rc)
> > -             return rc;
> > +     if (rc) {
> > +             dev_alert(snp_dev->dev,
> > +                       "Detected error from ASP request. rc: %d, fw_err: %llu\n",
> > +                       rc, *fw_err);
> > +             goto disable_vmpck;
> > +     }
> >
> > -     /*
> > -      * The verify_and_dec_payload() will fail only if the hypervisor is
> > -      * actively modifying the message header or corrupting the encrypted payload.
> > -      * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
> > -      * the key cannot be used for any communication. The key is disabled to ensure
> > -      * that AES-GCM does not use the same IV while encrypting the request payload.
> > -      */
> >       rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
> >       if (rc) {
> >               dev_alert(snp_dev->dev,
> > -                       "Detected unexpected decode failure, disabling the vmpck_id %d\n",
> > -                       vmpck_id);
> > -             snp_disable_vmpck(snp_dev);
> > -             return rc;
> > +                       "Detected unexpected decode failure from ASP. rc: %d\n",
> > +                       rc);
> > +             goto disable_vmpck;
> >       }
> >
> >       /* Increment to new message sequence after payload decryption was successful. */
> >       snp_inc_msg_seqno(snp_dev);
> >
> >       return 0;
> > +
> > +disable_vmpck:
> > +     snp_disable_vmpck(snp_dev);
> > +     return rc;
> >  }
> >
> >  static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> > @@ -676,7 +712,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> >       if (!snp_dev->response)
> >               goto e_free_request;
> >
> > -     snp_dev->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
> > +     snp_dev->certs_data = alloc_shared_pages(dev, sizeof(*snp_dev->certs_data));
>
> What's that change for?
>
> I went searching for that ->certs_data only ot realize that it is an
> array of size of SEV_FW_BLOB_MAX_SIZE elems.

Do you want this change reverted? I liked the extra readability of the
sizeof(*snp_dev->certs_data) but its unnecessary for this change.

>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> SUSE Software Solutions Germany GmbH
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Martje Boudien Moerman
> (HRB 36809, AG Nürnberg)

Borislav Petkov Nov. 15, 2022, 9:36 p.m. UTC | #4

On Mon, Nov 14, 2022 at 02:11:48PM -0700, Peter Gonda wrote:
> Thanks for detailed review. Working on cleaning up the text for a V5.

Yeah, and I have some questions in my reply which you haven't
addressed...

> > > @@ -676,7 +712,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> > >       if (!snp_dev->response)
> > >               goto e_free_request;
> > >
> > > -     snp_dev->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
> > > +     snp_dev->certs_data = alloc_shared_pages(dev, sizeof(*snp_dev->certs_data));
> >
> > What's that change for?
> >
> > I went searching for that ->certs_data only ot realize that it is an
> > array of size of SEV_FW_BLOB_MAX_SIZE elems.
> 
> Do you want this change reverted? I liked the extra readability of the
> sizeof(*snp_dev->certs_data) but its unnecessary for this change.

Really?

I think using a define which is a SIZE define is better. Especially if
you look at

	sizeof(*snp_dev->certs_data)

and wonder what type ->certs_data is.

And it's not like it'll go out of sync since it is an array of size,
well, SEV_FW_BLOB_MAX_SIZE.

:-)

Thx.

Peter Gonda Nov. 15, 2022, 9:47 p.m. UTC | #5

On Fri, Nov 11, 2022 at 9:46 AM Borislav Petkov <bp@suse.de> wrote:
>
> On Thu, Nov 03, 2022 at 08:23:18AM -0700, Peter Gonda wrote:
> > The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
>
> ASP?
>
> That must be the AMD Secure Processor or so but pls write it out.

Yes I'll update to write out AMD Secure Processor (ASP). So that the
acronym is clear through in each comment block.

>
> > communicate securely with each other. The IV to this scheme is a
> > sequence number that both the ASP and the guest track. Currently this
> > sequence number in a guest request must exactly match the sequence
> > number tracked by the ASP. This means that if the guest sees an error
> > from the host during a request it can only retry that exact request or
> > disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
> > reuse see:
> > https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf
>
> Right, how stable will that link be?
>
> IOW, perhaps quote the paper name and authors so that people can find it
> on their own.

I will update to the title + authors.

>
> > To handle userspace querying the cert_data length handle_guest_request()
> > now: saves the number of pages required by the host, retries the request
>
> This needs to sound like this:
>
> "In order to address this, save the number of pages ..."
>
> IOW, as the docs say:
>
> "Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
> instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
> to do frotz", as if you are giving orders to the codebase to change
> its behaviour."

Thanks I have updated the comments and description to this style for
the next revision.

>
> > without requesting the extended data, then returns the number of pages
> > required.
> >
> > Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
>
> I'm guessing this needs to go to stable?

Yes this should go to stable.

>
> > Signed-off-by: Peter Gonda <pgonda@google.com>
> > Reported-by: Peter Gonda <pgonda@google.com>
> > Cc: Dionna Glaze <dionnaglaze@google.com>
> > Cc: Borislav Petkov <bp@suse.de>
> > Cc: Tom Lendacky <thomas.lendacky@amd.com>
> > Cc: Michael Roth <michael.roth@amd.com>
> > Cc: Haowen Bai <baihaowen@meizu.com>
> > Cc: Yang Yingliang <yangyingliang@huawei.com>
> > Cc: Marc Orr <marcorr@google.com>
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Ashish Kalra <Ashish.Kalra@amd.com>
> > Cc: linux-kernel@vger.kernel.org
> > Cc: kvm@vger.kernel.org
> > ---
> > Tested by placing each of the guest requests: attestation quote,
> > extended attestation quote, and get key. Then tested the extended
> > attestation quote certificate length querying.
> >
> > V4
> >  * As suggested by Dionna moved the extended request retry logic into
> >    the driver.
> >  * Due to big change in patch dropped any reviewed-by tags.
> >
> > ---
> >  drivers/virt/coco/sev-guest/sev-guest.c | 70 +++++++++++++++++++------
> >  1 file changed, 53 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c
> > index f422f9c58ba79..7dd6337ebdd5b 100644
> > --- a/drivers/virt/coco/sev-guest/sev-guest.c
> > +++ b/drivers/virt/coco/sev-guest/sev-guest.c
> > @@ -41,7 +41,7 @@ struct snp_guest_dev {
> >       struct device *dev;
> >       struct miscdevice misc;
> >
> > -     void *certs_data;
> > +     u8 (*certs_data)[SEV_FW_BLOB_MAX_SIZE];
> >       struct snp_guest_crypto *crypto;
> >       struct snp_guest_msg *request, *response;
> >       struct snp_secrets_page_layout *layout;
> > @@ -67,8 +67,27 @@ static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
> >       return true;
> >  }
> >
> > +/*
> > + * If we receive an error from the host or ASP we have two options. We can
>
> Please use passive voice in your commit message: no "we" or "I", etc,
> and describe your changes in imperative mood.
>
> Bottom line is: personal pronouns are ambiguous in text, especially with
> so many parties/companies/etc developing the kernel so let's avoid them
> please.

I have removed the pronouns for the next revision.

>
> > + * either retry the exact same encrypted request or we can discontinue using the
> > + * VMPCK.
> > + *
> > + * This is because in the current encryption scheme GHCB v2 uses AES-GCM to
> > + * encrypt the requests. The IV for this scheme is the sequence number. GCM
> > + * cannot tolerate IV reuse.
> > + *
> > + * The ASP FW v1.51 only increments the sequence numbers on a successful
> > + * guest<->ASP back and forth and only accepts messages at its exact sequence
> > + * number.
> > + *
> > + * So if we were to reuse the sequence number the encryption scheme is
> > + * vulnerable. If we encrypt the sequence number for a fresh IV the ASP will
> > + * reject our request.
> > + */
> >  static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
> >  {
> > +     dev_alert(snp_dev->dev, "Disabling vmpck_id: %d to prevent IV reuse.\n",
> > +               vmpck_id);
> >       memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
> >       snp_dev->vmpck = NULL;
> >  }
> > @@ -323,32 +342,49 @@ static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, in
> >
> >       /* Call firmware to process the request */
> >       rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> > +
> > +     /*
> > +      * If the extended guest request fails due to having to small of a
>
> "... too small... "
>
> > +      * certificate data buffer retry the same guest request without the
> > +      * extended data request.
> > +      */
> > +     if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> > +         err == SNP_GUEST_REQ_INVALID_LEN) {
> > +             const unsigned int certs_npages = snp_dev->input.data_npages;
> > +
> > +             exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> > +             rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> > +
> > +             err = SNP_GUEST_REQ_INVALID_LEN;
>
> Huh, why are we overwriting err here?

I have added a comment for the next revision.

We are overwriting err here so that userspace is alerted that they
supplied a buffer too small.

>
> > +             snp_dev->input.data_npages = certs_npages;
> > +     }
> > +
> >       if (fw_err)
> >               *fw_err = err;
> >
> > -     if (rc)
> > -             return rc;
> > +     if (rc) {
> > +             dev_alert(snp_dev->dev,
> > +                       "Detected error from ASP request. rc: %d, fw_err: %llu\n",
> > +                       rc, *fw_err);
> > +             goto disable_vmpck;
> > +     }
> >
> > -     /*
> > -      * The verify_and_dec_payload() will fail only if the hypervisor is
> > -      * actively modifying the message header or corrupting the encrypted payload.
> > -      * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
> > -      * the key cannot be used for any communication. The key is disabled to ensure
> > -      * that AES-GCM does not use the same IV while encrypting the request payload.
> > -      */
> >       rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
> >       if (rc) {
> >               dev_alert(snp_dev->dev,
> > -                       "Detected unexpected decode failure, disabling the vmpck_id %d\n",
> > -                       vmpck_id);
> > -             snp_disable_vmpck(snp_dev);
> > -             return rc;
> > +                       "Detected unexpected decode failure from ASP. rc: %d\n",
> > +                       rc);
> > +             goto disable_vmpck;
> >       }
> >
> >       /* Increment to new message sequence after payload decryption was successful. */
> >       snp_inc_msg_seqno(snp_dev);
> >
> >       return 0;
> > +
> > +disable_vmpck:
> > +     snp_disable_vmpck(snp_dev);
> > +     return rc;
> >  }
> >
> >  static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
> > @@ -676,7 +712,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> >       if (!snp_dev->response)
> >               goto e_free_request;
> >
> > -     snp_dev->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
> > +     snp_dev->certs_data = alloc_shared_pages(dev, sizeof(*snp_dev->certs_data));
>
> What's that change for?
>
> I went searching for that ->certs_data only ot realize that it is an
> array of size of SEV_FW_BLOB_MAX_SIZE elems.
>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> SUSE Software Solutions Germany GmbH
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Martje Boudien Moerman
> (HRB 36809, AG Nürnberg)

Peter Gonda Nov. 15, 2022, 9:48 p.m. UTC | #6

On Tue, Nov 15, 2022 at 2:36 PM Borislav Petkov <bp@suse.de> wrote:
>
> On Mon, Nov 14, 2022 at 02:11:48PM -0700, Peter Gonda wrote:
> > Thanks for detailed review. Working on cleaning up the text for a V5.
>
> Yeah, and I have some questions in my reply which you haven't
> addressed...

Ah I was just applying answers to those in the comments/commit description.

I have replied back to your first post.

>
> > > > @@ -676,7 +712,7 @@ static int __init sev_guest_probe(struct platform_device *pdev)
> > > >       if (!snp_dev->response)
> > > >               goto e_free_request;
> > > >
> > > > -     snp_dev->certs_data = alloc_shared_pages(dev, SEV_FW_BLOB_MAX_SIZE);
> > > > +     snp_dev->certs_data = alloc_shared_pages(dev, sizeof(*snp_dev->certs_data));
> > >
> > > What's that change for?
> > >
> > > I went searching for that ->certs_data only ot realize that it is an
> > > array of size of SEV_FW_BLOB_MAX_SIZE elems.
> >
> > Do you want this change reverted? I liked the extra readability of the
> > sizeof(*snp_dev->certs_data) but its unnecessary for this change.
>
> Really?
>
> I think using a define which is a SIZE define is better. Especially if
> you look at
>
>         sizeof(*snp_dev->certs_data)
>
> and wonder what type ->certs_data is.
>
> And it's not like it'll go out of sync since it is an array of size,
> well, SEV_FW_BLOB_MAX_SIZE.
>

OK! I'll revert this part of the change.

Borislav Petkov Nov. 16, 2022, 12:20 p.m. UTC | #7

On Tue, Nov 15, 2022 at 02:47:31PM -0700, Peter Gonda wrote:
> > > +      * certificate data buffer retry the same guest request without the
> > > +      * extended data request.
> > > +      */
> > > +     if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> > > +         err == SNP_GUEST_REQ_INVALID_LEN) {
> > > +             const unsigned int certs_npages = snp_dev->input.data_npages;
> > > +
> > > +             exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> > > +             rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> > > +
> > > +             err = SNP_GUEST_REQ_INVALID_LEN;
> >
> > Huh, why are we overwriting err here?
> 
> I have added a comment for the next revision.
> 
> We are overwriting err here so that userspace is alerted that they
> supplied a buffer too small.

Sure but you're not checking rc either. What if that reissue fails for
whatever other reason? -EIO for example...

Peter Gonda Nov. 16, 2022, 4:23 p.m. UTC | #8

On Wed, Nov 16, 2022 at 5:20 AM Borislav Petkov <bp@suse.de> wrote:
>
> On Tue, Nov 15, 2022 at 02:47:31PM -0700, Peter Gonda wrote:
> > > > +      * certificate data buffer retry the same guest request without the
> > > > +      * extended data request.
> > > > +      */
> > > > +     if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> > > > +         err == SNP_GUEST_REQ_INVALID_LEN) {
> > > > +             const unsigned int certs_npages = snp_dev->input.data_npages;
> > > > +
> > > > +             exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> > > > +             rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> > > > +
> > > > +             err = SNP_GUEST_REQ_INVALID_LEN;
> > >
> > > Huh, why are we overwriting err here?
> >
> > I have added a comment for the next revision.
> >
> > We are overwriting err here so that userspace is alerted that they
> > supplied a buffer too small.
>
> Sure but you're not checking rc either. What if that reissue fails for
> whatever other reason? -EIO for example...

If we get any error here we have to wipe the VMPCK here so I thought
this always override @err was OK.

I can update this to only override @err if after the secondary
SVM_VMGEXIT_GUEST_REQUEST rc and err are OK. Thoughts?

>
> --
> Regards/Gruss,
>     Boris.
>
> SUSE Software Solutions Germany GmbH
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Martje Boudien Moerman
> (HRB 36809, AG Nürnberg)

Tom Lendacky Nov. 16, 2022, 4:58 p.m. UTC | #9

On 11/16/22 10:23, Peter Gonda wrote:
> On Wed, Nov 16, 2022 at 5:20 AM Borislav Petkov <bp@suse.de> wrote:
>>
>> On Tue, Nov 15, 2022 at 02:47:31PM -0700, Peter Gonda wrote:
>>>>> +      * certificate data buffer retry the same guest request without the
>>>>> +      * extended data request.
>>>>> +      */
>>>>> +     if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
>>>>> +         err == SNP_GUEST_REQ_INVALID_LEN) {
>>>>> +             const unsigned int certs_npages = snp_dev->input.data_npages;
>>>>> +
>>>>> +             exit_code = SVM_VMGEXIT_GUEST_REQUEST;
>>>>> +             rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
>>>>> +
>>>>> +             err = SNP_GUEST_REQ_INVALID_LEN;
>>>>
>>>> Huh, why are we overwriting err here?
>>>
>>> I have added a comment for the next revision.
>>>
>>> We are overwriting err here so that userspace is alerted that they
>>> supplied a buffer too small.
>>
>> Sure but you're not checking rc either. What if that reissue fails for
>> whatever other reason? -EIO for example...
> 
> If we get any error here we have to wipe the VMPCK here so I thought

More accurate to say that you will wipe the VMPCK, since the value of rc 
is checked a bit further down in the code and the -EIO (or other non-zero) 
will be result in a call to snp_disable_vmpck() and rc being propagated 
back to the user as an ioctl() return code.

Might be worth a comment above that second snp_issue_guest_request() 
explaining that.

> this always override @err was OK.
> 
> I can update this to only override @err if after the secondary
> SVM_VMGEXIT_GUEST_REQUEST rc and err are OK. Thoughts?

I think it's ok to set it no matter what, but I don't have a strong 
opinion either way.

Thanks,
Tom

> 
>>
>> --
>> Regards/Gruss,
>>      Boris.
>>
>> SUSE Software Solutions Germany GmbH
>> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Martje Boudien Moerman
>> (HRB 36809, AG Nürnberg)

Peter Gonda Nov. 16, 2022, 5:10 p.m. UTC | #10

On Wed, Nov 16, 2022 at 9:58 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 11/16/22 10:23, Peter Gonda wrote:
> > On Wed, Nov 16, 2022 at 5:20 AM Borislav Petkov <bp@suse.de> wrote:
> >>
> >> On Tue, Nov 15, 2022 at 02:47:31PM -0700, Peter Gonda wrote:
> >>>>> +      * certificate data buffer retry the same guest request without the
> >>>>> +      * extended data request.
> >>>>> +      */
> >>>>> +     if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
> >>>>> +         err == SNP_GUEST_REQ_INVALID_LEN) {
> >>>>> +             const unsigned int certs_npages = snp_dev->input.data_npages;
> >>>>> +
> >>>>> +             exit_code = SVM_VMGEXIT_GUEST_REQUEST;
> >>>>> +             rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
> >>>>> +
> >>>>> +             err = SNP_GUEST_REQ_INVALID_LEN;
> >>>>
> >>>> Huh, why are we overwriting err here?
> >>>
> >>> I have added a comment for the next revision.
> >>>
> >>> We are overwriting err here so that userspace is alerted that they
> >>> supplied a buffer too small.
> >>
> >> Sure but you're not checking rc either. What if that reissue fails for
> >> whatever other reason? -EIO for example...
> >
> > If we get any error here we have to wipe the VMPCK here so I thought
>
> More accurate to say that you will wipe the VMPCK, since the value of rc
> is checked a bit further down in the code and the -EIO (or other non-zero)
> will be result in a call to snp_disable_vmpck() and rc being propagated
> back to the user as an ioctl() return code.
>
> Might be worth a comment above that second snp_issue_guest_request()
> explaining that.

I'll add a comment above the second snp_issue_guest_request(), good idea thanks.

I think another comment above the first snp_issue_guest_request()
could help too. Saying once we call this function we either need to
increment the sequence number or wipe the VMPCK to ensure the
encryption scheme is safe.

>
> > this always override @err was OK.
> >
> > I can update this to only override @err if after the secondary
> > SVM_VMGEXIT_GUEST_REQUEST rc and err are OK. Thoughts?
>
> I think it's ok to set it no matter what, but I don't have a strong
> opinion either way.
>
> Thanks,
> Tom
>
> >
> >>
> >> --
> >> Regards/Gruss,
> >>      Boris.
> >>
> >> SUSE Software Solutions Germany GmbH
> >> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Martje Boudien Moerman
> >> (HRB 36809, AG Nürnberg)

Borislav Petkov Nov. 16, 2022, 5:28 p.m. UTC | #11

On Wed, Nov 16, 2022 at 10:10:58AM -0700, Peter Gonda wrote:
> I think another comment above the first snp_issue_guest_request()
> could help too. Saying once we call this function we either need to
> increment the sequence number or wipe the VMPCK to ensure the
> encryption scheme is safe.

And make that explicit pls:

        /*
         * If the extended guest request fails due to having to small of a
         * certificate data buffer retry the same guest request without the
         * extended data request...

	 ... in order to not have to reuse the IV.


I have to admit, the flow in that function is still not optimal but I
haven't stared at it long enough to have a better idea...

Thx.

Peter Gonda Nov. 16, 2022, 5:57 p.m. UTC | #12

On Wed, Nov 16, 2022 at 10:28 AM Borislav Petkov <bp@suse.de> wrote:
>
> On Wed, Nov 16, 2022 at 10:10:58AM -0700, Peter Gonda wrote:
> > I think another comment above the first snp_issue_guest_request()
> > could help too. Saying once we call this function we either need to
> > increment the sequence number or wipe the VMPCK to ensure the
> > encryption scheme is safe.
>
> And make that explicit pls:
>
>         /*
>          * If the extended guest request fails due to having to small of a
>          * certificate data buffer retry the same guest request without the
>          * extended data request...
>
>          ... in order to not have to reuse the IV.
>
>
> I have to admit, the flow in that function is still not optimal but I
> haven't stared at it long enough to have a better idea...

Thanks for all the feedback Tom and Boris. I've sent out a V5. I hope
I've gotten the grammar correct in these comments.

[V4] virt: sev: Prevent IV reuse in SNP guest driver

Commit Message

Comments

Patch