diff mbox series

[RFC,2/4] x86/sgx: Set dirty bit after modifying page contents

Message ID b0596095e09ee4a14a3802c5ce91372a0dc21220.1651171455.git.reinette.chatre@intel.com (mailing list archive)
State New, archived
Headers show
Series [RFC,1/4] x86/sgx: Do not free backing memory on ENCLS[ELDU] failure | expand

Commit Message

Reinette Chatre April 28, 2022, 8:11 p.m. UTC
Recent commit 08999b2489b4 ("x86/sgx: Free backing memory
after faulting the enclave page") expanded __sgx_encl_eldu()
to clear an enclave page's PCMD (Paging Crypto MetaData)
from the PCMD page in the backing store after the enclave
page is restored to the enclave.

Since the PCMD page in the backing store is modified the page
should be set as dirty when releasing the reference to
ensure the modified data is retained.

Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/sgx/encl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dave Hansen April 28, 2022, 9:40 p.m. UTC | #1
On 4/28/22 13:11, Reinette Chatre wrote:
> 
> Since the PCMD page in the backing store is modified the page
> should be set as dirty when releasing the reference to
> ensure the modified data is retained.
> 
> Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/encl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index e5d2661800ac..e03f124ce772 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -98,7 +98,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>  	kunmap_atomic(pcmd_page);
>  	kunmap_atomic((void *)(unsigned long)pginfo.contents);
>  
> -	sgx_encl_put_backing(&b, false);
> +	sgx_encl_put_backing(&b, true);

I think you're on the right track here.  The concept is right.  The
memset() wrote fresh data into the b.pcmd page.  But, if it were clean
swap cache, it can be discarded again and the memset() might be lost.

I *think* all that would do in the end is leave us with a PCMD page that
will never be truncated because the page has non-zero PCMD data that
will never be used, the result of the thrown-away memset().

But, I'd rather this be done more directly and closer to the actual
dirtying of the page.  Perhaps:

+	set_page_dirty(b.pcmd);
        memset(pcmd_page + b.pcmd_offset, 0, sizeof(struct sgx_pcmd));


I don't think the "b.contents" page needs the same treatment because its
contents are always discarded in this path while some of the PCMD page's
contents need to be preserved.
Reinette Chatre April 28, 2022, 10:41 p.m. UTC | #2
Hi Dave,

On 4/28/2022 2:40 PM, Dave Hansen wrote:
> On 4/28/22 13:11, Reinette Chatre wrote:
>>
>> Since the PCMD page in the backing store is modified the page
>> should be set as dirty when releasing the reference to
>> ensure the modified data is retained.
>>
>> Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>>  arch/x86/kernel/cpu/sgx/encl.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
>> index e5d2661800ac..e03f124ce772 100644
>> --- a/arch/x86/kernel/cpu/sgx/encl.c
>> +++ b/arch/x86/kernel/cpu/sgx/encl.c
>> @@ -98,7 +98,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>>  	kunmap_atomic(pcmd_page);
>>  	kunmap_atomic((void *)(unsigned long)pginfo.contents);
>>  
>> -	sgx_encl_put_backing(&b, false);
>> +	sgx_encl_put_backing(&b, true);
> 
> I think you're on the right track here.  The concept is right.  The
> memset() wrote fresh data into the b.pcmd page.  But, if it were clean
> swap cache, it can be discarded again and the memset() might be lost.
> 
> I *think* all that would do in the end is leave us with a PCMD page that
> will never be truncated because the page has non-zero PCMD data that
> will never be used, the result of the thrown-away memset().

Thank you very much for the analysis. This explains why this change
did not impact the issue I am chasing.

> 
> But, I'd rather this be done more directly and closer to the actual
> dirtying of the page.  Perhaps:
> 
> +	set_page_dirty(b.pcmd);
>         memset(pcmd_page + b.pcmd_offset, 0, sizeof(struct sgx_pcmd));
> 
> 

Will do (with comments to explain why this line was extracted from
the subsequent sgx_encl_put_backing() call). 


> I don't think the "b.contents" page needs the same treatment because its
> contents are always discarded in this path while some of the PCMD page's
> contents need to be preserved.

That's right. The consistent symmetrical API of get()/put() was appealing
to me.

Thank you very much.

Reinette
Jarkko Sakkinen May 6, 2022, 10:27 p.m. UTC | #3
On Thu, 2022-04-28 at 13:11 -0700, Reinette Chatre wrote:
> Recent commit 08999b2489b4 ("x86/sgx: Free backing memory
> after faulting the enclave page") expanded __sgx_encl_eldu()
> to clear an enclave page's PCMD (Paging Crypto MetaData)
> from the PCMD page in the backing store after the enclave
> page is restored to the enclave.
> 
> Since the PCMD page in the backing store is modified the page
> should be set as dirty when releasing the reference to
> ensure the modified data is retained.
> 
> Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/encl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index e5d2661800ac..e03f124ce772 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -98,7 +98,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>         kunmap_atomic(pcmd_page);
>         kunmap_atomic((void *)(unsigned long)pginfo.contents);
>  
> -       sgx_encl_put_backing(&b, false);
> +       sgx_encl_put_backing(&b, true);
>  
>         sgx_encl_truncate_backing_page(encl, page_index);
>  

So, the implementation is:

void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write)
{
	if (do_write) {
		set_page_dirty(backing->pcmd);
		set_page_dirty(backing->contents);
	}

	put_page(backing->pcmd);
	put_page(backing->contents);
}

And we only want to set dirty for PCMD part, right?

Thus, perhaps we should fix it instead by:

	set_page_dirty(backing->pcmd);
	put_page(backing->pcmd):
	put_page(backing->contents);
	
?	

I would not mind getting rid of that function anyway. It's kind
of bad wrapping IMHO.
	
BR, Jarkko
Reinette Chatre May 6, 2022, 10:40 p.m. UTC | #4
Hi Jarkko,

On 5/6/2022 3:27 PM, Jarkko Sakkinen wrote:
> On Thu, 2022-04-28 at 13:11 -0700, Reinette Chatre wrote:
>> Recent commit 08999b2489b4 ("x86/sgx: Free backing memory
>> after faulting the enclave page") expanded __sgx_encl_eldu()
>> to clear an enclave page's PCMD (Paging Crypto MetaData)
>> from the PCMD page in the backing store after the enclave
>> page is restored to the enclave.
>>
>> Since the PCMD page in the backing store is modified the page
>> should be set as dirty when releasing the reference to
>> ensure the modified data is retained.
>>
>> Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>>  arch/x86/kernel/cpu/sgx/encl.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
>> index e5d2661800ac..e03f124ce772 100644
>> --- a/arch/x86/kernel/cpu/sgx/encl.c
>> +++ b/arch/x86/kernel/cpu/sgx/encl.c
>> @@ -98,7 +98,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
>>         kunmap_atomic(pcmd_page);
>>         kunmap_atomic((void *)(unsigned long)pginfo.contents);
>>  
>> -       sgx_encl_put_backing(&b, false);
>> +       sgx_encl_put_backing(&b, true);
>>  
>>         sgx_encl_truncate_backing_page(encl, page_index);
>>  
> 
> So, the implementation is:
> 
> void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write)
> {
> 	if (do_write) {
> 		set_page_dirty(backing->pcmd);
> 		set_page_dirty(backing->contents);
> 	}
> 
> 	put_page(backing->pcmd);
> 	put_page(backing->contents);
> }
> 
> And we only want to set dirty for PCMD part, right?
> 
> Thus, perhaps we should fix it instead by:
> 
> 	set_page_dirty(backing->pcmd);
> 	put_page(backing->pcmd):
> 	put_page(backing->contents);
> 	
> ?	
> 
> I would not mind getting rid of that function anyway. It's kind
> of bad wrapping IMHO.

Could we rather just change sgx_encl_put_backing() to be:

void sgx_encl_put_backing(struct sgx_backing *backing)
{
	put_page(backing->pcmd);
	put_page(backing->contents);
}

Two reasons:

1) Instead of getting rid of sgx_encl_put_backing() we can keep
   it so that its name reflects its work and its usage remains
   symmetrical to sgx_encl_get_backing() that will stay and
   the code thus continues to be clear on the page references.

2) By moving the set_page_dirty() out of that function the
   set_page_dirty() can be called _before_ writing data to
   the page, which is the right thing to do per:
   https://lore.kernel.org/linux-sgx/c057af3d-b7fb-34cd-0d75-989fca0e67fe@intel.com/

Reinette
Jarkko Sakkinen May 7, 2022, 6:01 p.m. UTC | #5
On Fri, May 06, 2022 at 03:40:26PM -0700, Reinette Chatre wrote:
> Hi Jarkko,
> 
> On 5/6/2022 3:27 PM, Jarkko Sakkinen wrote:
> > On Thu, 2022-04-28 at 13:11 -0700, Reinette Chatre wrote:
> >> Recent commit 08999b2489b4 ("x86/sgx: Free backing memory
> >> after faulting the enclave page") expanded __sgx_encl_eldu()
> >> to clear an enclave page's PCMD (Paging Crypto MetaData)
> >> from the PCMD page in the backing store after the enclave
> >> page is restored to the enclave.
> >>
> >> Since the PCMD page in the backing store is modified the page
> >> should be set as dirty when releasing the reference to
> >> ensure the modified data is retained.
> >>
> >> Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page")
> >> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> >> ---
> >>  arch/x86/kernel/cpu/sgx/encl.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> >> index e5d2661800ac..e03f124ce772 100644
> >> --- a/arch/x86/kernel/cpu/sgx/encl.c
> >> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> >> @@ -98,7 +98,7 @@ static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
> >>         kunmap_atomic(pcmd_page);
> >>         kunmap_atomic((void *)(unsigned long)pginfo.contents);
> >>  
> >> -       sgx_encl_put_backing(&b, false);
> >> +       sgx_encl_put_backing(&b, true);
> >>  
> >>         sgx_encl_truncate_backing_page(encl, page_index);
> >>  
> > 
> > So, the implementation is:
> > 
> > void sgx_encl_put_backing(struct sgx_backing *backing, bool do_write)
> > {
> > 	if (do_write) {
> > 		set_page_dirty(backing->pcmd);
> > 		set_page_dirty(backing->contents);
> > 	}
> > 
> > 	put_page(backing->pcmd);
> > 	put_page(backing->contents);
> > }
> > 
> > And we only want to set dirty for PCMD part, right?
> > 
> > Thus, perhaps we should fix it instead by:
> > 
> > 	set_page_dirty(backing->pcmd);
> > 	put_page(backing->pcmd):
> > 	put_page(backing->contents);
> > 	
> > ?	
> > 
> > I would not mind getting rid of that function anyway. It's kind
> > of bad wrapping IMHO.
> 
> Could we rather just change sgx_encl_put_backing() to be:
> 
> void sgx_encl_put_backing(struct sgx_backing *backing)
> {
> 	put_page(backing->pcmd);
> 	put_page(backing->contents);
> }
> 
> Two reasons:
> 
> 1) Instead of getting rid of sgx_encl_put_backing() we can keep
>    it so that its name reflects its work and its usage remains
>    symmetrical to sgx_encl_get_backing() that will stay and
>    the code thus continues to be clear on the page references.
> 
> 2) By moving the set_page_dirty() out of that function the
>    set_page_dirty() can be called _before_ writing data to
>    the page, which is the right thing to do per:
>    https://lore.kernel.org/linux-sgx/c057af3d-b7fb-34cd-0d75-989fca0e67fe@intel.com/


Yes, this does make sense to me. Important bit was to refactor
set_page_dirty() calls out but for sure put_page()'s can still
be wrapped.

As I responded to Dave just few moments ago, one option would be
*possibly* to do the truncation part in ksgxd but I hope we hope
we don't have to go to that because it would mean also extra
book-keeping...

Took a bit of effort but I finally got this reproduced with my
laptop with this CPU:

Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz

That's why I've had some radio silence in the meanwhile. In other
words, I should be now able to also to test the fix locally.

PS. I'm sorry if I've double-replied to this. Was not sure if
my first response left yesterday, and could not find it from
lore (and this does contain anyway stuff that was not incldued
into that).

BR, Jarkko
diff mbox series

Patch

diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index e5d2661800ac..e03f124ce772 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -98,7 +98,7 @@  static int __sgx_encl_eldu(struct sgx_encl_page *encl_page,
 	kunmap_atomic(pcmd_page);
 	kunmap_atomic((void *)(unsigned long)pginfo.contents);
 
-	sgx_encl_put_backing(&b, false);
+	sgx_encl_put_backing(&b, true);
 
 	sgx_encl_truncate_backing_page(encl, page_index);