diff mbox series

[RFC,RFT,1/4] hv: Leak pages if set_memory_encrypted() fails

Message ID 20240222021006.2279329-2-rick.p.edgecombe@intel.com (mailing list archive)
State RFC
Headers show
Series Handle set_memory_XXcrypted() errors in hyperv | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 940 this patch: 940
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 957 this patch: 957
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 957 this patch: 957
netdev/checkpatch warning WARNING: line length of 88 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Edgecombe, Rick P Feb. 22, 2024, 2:10 a.m. UTC
On TDX it is possible for the untrusted host to cause
set_memory_encrypted() or set_memory_decrypted() to fail such that an
error is returned and the resulting memory is shared. Callers need to take
care to handle these errors to avoid returning decrypted (shared) memory to
the page allocator, which could lead to functional or security issues.

Hyperv could free decrypted/shared pages if set_memory_encrypted() fails.
Leak the pages if this happens.

Only compile tested.

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: linux-hyperv@vger.kernel.org
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 drivers/hv/connection.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

Comments

Wei Liu March 1, 2024, 9:26 a.m. UTC | #1
On Wed, Feb 21, 2024 at 06:10:03PM -0800, Rick Edgecombe wrote:
> On TDX it is possible for the untrusted host to cause
> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take
> care to handle these errors to avoid returning decrypted (shared) memory to
> the page allocator, which could lead to functional or security issues.
> 
> Hyperv could free decrypted/shared pages if set_memory_encrypted() fails.

"Hyper-V" throughout.

> Leak the pages if this happens.
> 
> Only compile tested.
> 
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Dexuan Cui <decui@microsoft.com>
> Cc: linux-hyperv@vger.kernel.org
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  drivers/hv/connection.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 3cabeeabb1ca..e39493421bbb 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -315,6 +315,7 @@ int vmbus_connect(void)
>  
>  void vmbus_disconnect(void)
>  {
> +	int ret;
>  	/*
>  	 * First send the unload request to the host.
>  	 */
> @@ -337,11 +338,13 @@ void vmbus_disconnect(void)
>  		vmbus_connection.int_page = NULL;
>  	}
>  
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
> +	ret = set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> +	ret |= set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
>  
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	if (!ret) {
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	}

This silently leaks the pages if set_memory_encrypted() fails.  I think
there should print some warning or error messages here.

Thanks,
Wei.

>  	vmbus_connection.monitor_pages[0] = NULL;
>  	vmbus_connection.monitor_pages[1] = NULL;
>  }
> -- 
> 2.34.1
>
Michael Kelley March 1, 2024, 7 p.m. UTC | #2
From: Rick Edgecombe <rick.p.edgecombe@intel.com> Sent: Wednesday, February 21, 2024 6:10 PM
> 

Historically, the preferred Subject prefix for changes to connection.c has
been "Drivers: hv: vmbus:", not just "hv:".  Sometimes that preference
isn't followed, but most of the time it is.

> On TDX it is possible for the untrusted host to cause

I'd argue that this is for CoCo VMs in general, not just TDX.  I don't know
all the failure modes for SEV-SNP, but the code paths you are changing
are run in both TDX and SEV-SNP CoCo VMs.

> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take
> care to handle these errors to avoid returning decrypted (shared) memory to
> the page allocator, which could lead to functional or security issues.
> 
> Hyperv could free decrypted/shared pages if set_memory_encrypted() fails.

It's not Hyper-V doing the freeing.  Maybe say "VMBus code could
free ...."

> Leak the pages if this happens.
> 
> Only compile tested.
> 
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Dexuan Cui <decui@microsoft.com>
> Cc: linux-hyperv@vger.kernel.org
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  drivers/hv/connection.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 3cabeeabb1ca..e39493421bbb 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -315,6 +315,7 @@ int vmbus_connect(void)
> 
>  void vmbus_disconnect(void)
>  {
> +	int ret;
>  	/*
>  	 * First send the unload request to the host.
>  	 */
> @@ -337,11 +338,13 @@ void vmbus_disconnect(void)
>  		vmbus_connection.int_page = NULL;
>  	}
> 
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
> +	ret = set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> +	ret |= set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
> 
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	if (!ret) {
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	}

Of course, this will leak the memory for both pages if only one of the
set_memory_encrypted() calls fails, but I'm OK with that.  It doesn't
seem worth the additional complexity to treat each page separately.

>  	vmbus_connection.monitor_pages[0] = NULL;
>  	vmbus_connection.monitor_pages[1] = NULL;
>  }
> --
> 2.34.1
Edgecombe, Rick P March 1, 2024, 7:12 p.m. UTC | #3
On Fri, 2024-03-01 at 09:26 +0000, Wei Liu wrote:
> > Hyperv could free decrypted/shared pages if set_memory_encrypted()
> > fails.
> 
> "Hyper-V" throughout.

Ok.

> 
> > Leak the pages if this happens.
> > 
> > Only compile tested.
> > 
> > Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> > Cc: Haiyang Zhang <haiyangz@microsoft.com>
> > Cc: Wei Liu <wei.liu@kernel.org>
> > Cc: Dexuan Cui <decui@microsoft.com>
> > Cc: linux-hyperv@vger.kernel.org
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > ---
> >   drivers/hv/connection.c | 11 +++++++----
> >   1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> > index 3cabeeabb1ca..e39493421bbb 100644
> > --- a/drivers/hv/connection.c
> > +++ b/drivers/hv/connection.c
> > @@ -315,6 +315,7 @@ int vmbus_connect(void)
> >   
> >   void vmbus_disconnect(void)
> >   {
> > +       int ret;
> >         /*
> >          * First send the unload request to the host.
> >          */
> > @@ -337,11 +338,13 @@ void vmbus_disconnect(void)
> >                 vmbus_connection.int_page = NULL;
> >         }
> >   
> > -       set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[0], 1);
> > -       set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[1], 1);
> > +       ret = set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[0], 1);
> > +       ret |= set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[1], 1);
> >   
> > -       hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> > -       hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> > +       if (!ret) {
> > +               hv_free_hyperv_page(vmbus_connection.monitor_pages[
> > 0]);
> > +               hv_free_hyperv_page(vmbus_connection.monitor_pages[
> > 1]);
> > +       }
> 
> This silently leaks the pages if set_memory_encrypted() fails.  I
> think
> there should print some warning or error messages here.

Another patch will warn in CPA for the particular case:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/mm

Do we want a warning in the caller too? It is more robust to other
types of failures in the future I guess.
Edgecombe, Rick P March 1, 2024, 7:13 p.m. UTC | #4
On Fri, 2024-03-01 at 19:00 +0000, Michael Kelley wrote:
> From: Rick Edgecombe <rick.p.edgecombe@intel.com> Sent: Wednesday,
> February 21, 2024 6:10 PM
> > 
> 
> Historically, the preferred Subject prefix for changes to
> connection.c has
> been "Drivers: hv: vmbus:", not just "hv:".  Sometimes that
> preference
> isn't followed, but most of the time it is.

Ok, I can update it.

> 
> > On TDX it is possible for the untrusted host to cause
> 
> I'd argue that this is for CoCo VMs in general, not just TDX.  I
> don't know
> all the failure modes for SEV-SNP, but the code paths you are
> changing
> are run in both TDX and SEV-SNP CoCo VMs.

On SEV-SNP the host can cause the call to fail too was my
understanding. But in Linux, that side panics and never gets to the
point of being able to free the shared memory. So it's not TDX
architecture specific, it's just how Linux handles it on the different
sids. For TDX the suggestion was to avoid panicing because it is
possible to handle in SW, as Linux usually tries it's best to do.

> 
> > set_memory_encrypted() or set_memory_decrypted() to fail such that
> > an
> > error is returned and the resulting memory is shared. Callers need
> > to take
> > care to handle these errors to avoid returning decrypted (shared)
> > memory to
> > the page allocator, which could lead to functional or security
> > issues.
> > 
> > Hyperv could free decrypted/shared pages if set_memory_encrypted()
> > fails.
> 
> It's not Hyper-V doing the freeing.  Maybe say "VMBus code could
> free ...."

Ok.
Michael Kelley March 1, 2024, 8:21 p.m. UTC | #5
From: Edgecombe, Rick P <rick.p.edgecombe@intel.com> Sent: Friday, March 1, 2024 11:13 AM
> >
> > > On TDX it is possible for the untrusted host to cause
> >
> > I'd argue that this is for CoCo VMs in general, not just TDX.  I don't know
> > all the failure modes for SEV-SNP, but the code paths you are changing
> > are run in both TDX and SEV-SNP CoCo VMs.
> 
> On SEV-SNP the host can cause the call to fail too was my
> understanding. But in Linux, that side panics and never gets to the
> point of being able to free the shared memory. So it's not TDX
> architecture specific, it's just how Linux handles it on the different
> sids. For TDX the suggestion was to avoid panicing because it is
> possible to handle in SW, as Linux usually tries it's best to do.
> 

The Hyper-V case can actually be a third path when a paravisor
is being used.  In that case, for both TDX and SEV-SNP, the
hypervisor callbacks in __set_memory_enc_pgtable() go
to Hyper-V specific functions that talk to the paravisor. Those
callbacks never panic. After a failure, either at the paravisor
level or in the paravisor talking to the hypervisor/VMM, the
decrypted/encrypted state of the memory isn't known.  So
leaking the memory is still the right thing to do, and your
patch set is good. But in the Hyper-V with paravisor case,
the leaking is applicable more broadly than just TDX.

The text in the commit message isn't something that I'll
go to the mat over.  But I wanted to offer the slightly broader
perspective. 

Michael
Edgecombe, Rick P March 1, 2024, 8:47 p.m. UTC | #6
On Fri, 2024-03-01 at 20:21 +0000, Michael Kelley wrote:
> 
> The Hyper-V case can actually be a third path when a paravisor
> is being used.  In that case, for both TDX and SEV-SNP, the
> hypervisor callbacks in __set_memory_enc_pgtable() go
> to Hyper-V specific functions that talk to the paravisor. Those
> callbacks never panic. After a failure, either at the paravisor
> level or in the paravisor talking to the hypervisor/VMM, the
> decrypted/encrypted state of the memory isn't known.  So
> leaking the memory is still the right thing to do, and your
> patch set is good. But in the Hyper-V with paravisor case,
> the leaking is applicable more broadly than just TDX.
> 
> The text in the commit message isn't something that I'll
> go to the mat over.  But I wanted to offer the slightly broader
> perspective.

Oh, interesting. I think I missed it because it only has a special
enc_status_change_finish(). But yea. It does sound like the text you
suggested is more accurate. I'll update it.
diff mbox series

Patch

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 3cabeeabb1ca..e39493421bbb 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -315,6 +315,7 @@  int vmbus_connect(void)
 
 void vmbus_disconnect(void)
 {
+	int ret;
 	/*
 	 * First send the unload request to the host.
 	 */
@@ -337,11 +338,13 @@  void vmbus_disconnect(void)
 		vmbus_connection.int_page = NULL;
 	}
 
-	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
-	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
+	ret = set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
+	ret |= set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
 
-	hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
-	hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
+	if (!ret) {
+		hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
+		hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
+	}
 	vmbus_connection.monitor_pages[0] = NULL;
 	vmbus_connection.monitor_pages[1] = NULL;
 }