KVM: SVM: snp_alloc_firmware_pages: memory leak

Message ID 20250214035932.3414337-1-aik@amd.com (mailing list archive)
State Changes Requested
Delegated to: Herbert Xu

Commit Message

Alexey Kardashevskiy Feb. 14, 2025, 3:59 a.m. UTC
If rmp_mark_pages_firmware() fails, __snp_alloc_firmware_pages() returns
NULL without freeing the page(s) it allocated. Free them on that error path.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
---
 drivers/crypto/ccp/sev-dev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Tom Lendacky Feb. 14, 2025, 2:53 p.m. UTC | #1
On 2/13/25 21:59, Alexey Kardashevskiy wrote:
> Failure to rmpupdate leads to page(s) leak, fix that.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> ---
>  drivers/crypto/ccp/sev-dev.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index 2e87ca0e292a..0b5f8ab657c5 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
> @@ -443,8 +443,10 @@ static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order)
>  		return page;
>  
>  	paddr = __pa((unsigned long)page_address(page));
> -	if (rmp_mark_pages_firmware(paddr, npages, false))
> +	if (rmp_mark_pages_firmware(paddr, npages, false)) {
> +		__free_pages(page, order);

I'm not sure we can do this. On error, rmp_mark_pages_firmware() attempts
to clean up and restore any pages that were marked firmware. But
snp_reclaim_pages() will leak pages that it can't restore and we don't
pass back any info to the caller of rmp_mark_pages_firmware() to let it
know what pages are truly available to free.
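
The concern above can be illustrated with a small user-space model. This is
only a sketch under stated assumptions: every name in it (struct model,
model_reclaim(), model_mark_firmware(), model_unsafe_pages()) is hypothetical,
and it follows the behaviour as described in this message rather than the
actual sev-dev.c code. The point it shows is that the caller gets the same
error code whether or not the rollback succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES 4

enum model_state { MODEL_SHARED, MODEL_FIRMWARE };

/* Hypothetical stand-in for the RMP state of one allocation. */
struct model {
	enum model_state state[MODEL_NPAGES];
	bool leaked[MODEL_NPAGES];
	int mark_fails_at;	/* index whose firmware transition fails, or -1 */
	int reclaim_fails_at;	/* index whose reclaim fails, or -1 */
};

/* Roll back pages [0, n); on the first failure, leak the rest of the range. */
static int model_reclaim(struct model *m, size_t n)
{
	assert(n <= MODEL_NPAGES);

	for (size_t i = 0; i < n; i++) {
		if ((int)i == m->reclaim_fails_at) {
			for (size_t j = i; j < n; j++)
				m->leaked[j] = true;
			return -1;
		}
		m->state[i] = MODEL_SHARED;
	}
	return 0;
}

/*
 * Mark every page firmware-owned. On failure, attempt a rollback and
 * return only an error code; the rollback result is not reported.
 */
static int model_mark_firmware(struct model *m)
{
	for (size_t i = 0; i < MODEL_NPAGES; i++) {
		if ((int)i == m->mark_fails_at) {
			model_reclaim(m, i);
			return -1;
		}
		m->state[i] = MODEL_FIRMWARE;
	}
	return 0;
}

/* Count pages that would be unsafe to hand back to the page allocator. */
static size_t model_unsafe_pages(const struct model *m)
{
	size_t n = 0;

	for (size_t i = 0; i < MODEL_NPAGES; i++)
		if (m->state[i] == MODEL_FIRMWARE || m->leaked[i])
			n++;
	return n;
}
```

With mark_fails_at = 3, the model returns -1 both when the rollback succeeds
(reclaim_fails_at = -1, no unsafe pages) and when it does not
(reclaim_fails_at = 1, two unsafe pages), so a caller that frees the whole
allocation on error cannot tell the two cases apart.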

Thanks,
Tom

>  		return NULL;
> +	}
>  
>  	return page;
>  }
Alexey Kardashevskiy Feb. 18, 2025, 1:24 a.m. UTC | #2
On 15/2/25 01:53, Tom Lendacky wrote:
> On 2/13/25 21:59, Alexey Kardashevskiy wrote:
>> Failure to rmpupdate leads to page(s) leak, fix that.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
>> ---
>>   drivers/crypto/ccp/sev-dev.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
>> index 2e87ca0e292a..0b5f8ab657c5 100644
>> --- a/drivers/crypto/ccp/sev-dev.c
>> +++ b/drivers/crypto/ccp/sev-dev.c
>> @@ -443,8 +443,10 @@ static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order)
>>   		return page;
>>   
>>   	paddr = __pa((unsigned long)page_address(page));
>> -	if (rmp_mark_pages_firmware(paddr, npages, false))
>> +	if (rmp_mark_pages_firmware(paddr, npages, false)) {
>> +		__free_pages(page, order);
> 
> I'm not sure we can do this. On error, rmp_mark_pages_firmware() attempts
> to cleanup and restore any pages that were marked firmware. But
> snp_reclaim_pages() will leak pages that it can't restore and we don't
> pass back any info to the caller of rmp_mark_pages_firmware() to let it
> know what pages are truly available to free.

Oh, right. But there is snp_leaked_pages_list, which
__snp_alloc_firmware_pages() could look at.

Or just replace __free_pages() above with:

snp_leak_pages(__page_to_pfn(page), 1 << order)

so that the memory leak at least leaves traces in dmesg?
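
As a rough illustration of that second option, here is a user-space sketch of
an allocation path that records the leak instead of freeing. It is not kernel
code: stub_mark_pages_firmware(), stub_leak_pages(), stub_mark_result,
stub_leaked_pages and sketch_alloc_firmware_pages() are hypothetical stand-ins,
calloc() replaces the page allocator, and the stderr message only mimics a
dmesg trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define STUB_PAGE_SIZE 4096u

static unsigned long stub_leaked_pages;	/* running total of recorded leaks */
static int stub_mark_result;		/* 0 = success, nonzero = failure */

/* Stand-in for the RMP update; the result is controlled by the caller. */
static int stub_mark_pages_firmware(void *buf, unsigned int npages)
{
	(void)buf;
	(void)npages;
	return stub_mark_result;
}

/* Stand-in for leak accounting: count the pages and log, never free. */
static void stub_leak_pages(void *buf, unsigned int npages)
{
	(void)buf;	/* intentionally not freed: its state is unknown */
	stub_leaked_pages += npages;
	fprintf(stderr, "leaking %u page(s)\n", npages);
}

static void *sketch_alloc_firmware_pages(unsigned int order)
{
	unsigned int npages;
	void *buf;

	assert(order < 8);	/* keep the sketch's allocations small */
	npages = 1u << order;

	buf = calloc(npages, STUB_PAGE_SIZE);
	if (!buf)
		return NULL;

	if (stub_mark_pages_firmware(buf, npages)) {
		/* Record the leak rather than freeing possibly unsafe pages. */
		stub_leak_pages(buf, npages);
		return NULL;
	}

	return buf;
}
```

The trade-off in this sketch is that pages which were in fact restored are
also written off, in exchange for never returning a page in the wrong state
to the allocator.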


> 
> Thanks,
> Tom
> 
>>   		return NULL;
>> +	}
>>   
>>   	return page;
>>   }
Tom Lendacky Feb. 18, 2025, 2:42 p.m. UTC | #3
On 2/17/25 19:24, Alexey Kardashevskiy wrote:
> On 15/2/25 01:53, Tom Lendacky wrote:
>> On 2/13/25 21:59, Alexey Kardashevskiy wrote:
>>> Failure to rmpupdate leads to page(s) leak, fix that.
>>>
>>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
>>> ---
>>>   drivers/crypto/ccp/sev-dev.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
>>> index 2e87ca0e292a..0b5f8ab657c5 100644
>>> --- a/drivers/crypto/ccp/sev-dev.c
>>> +++ b/drivers/crypto/ccp/sev-dev.c
>>> @@ -443,8 +443,10 @@ static struct page
>>> *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order)
>>>           return page;
>>>         paddr = __pa((unsigned long)page_address(page));
>>> -    if (rmp_mark_pages_firmware(paddr, npages, false))
>>> +    if (rmp_mark_pages_firmware(paddr, npages, false)) {
>>> +        __free_pages(page, order);
>>
>> I'm not sure we can do this. On error, rmp_mark_pages_firmware() attempts
>> to cleanup and restore any pages that were marked firmware. But
>> snp_reclaim_pages() will leak pages that it can't restore and we don't
>> pass back any info to the caller of rmp_mark_pages_firmware() to let it
>> know what pages are truly available to free.
> 
> oh right. But there is snp_leaked_pages_list which
> __snp_alloc_firmware_pages() could look at.
> 
> Or just replace __free_pages() above with:
> 
> snp_leak_pages(__page_to_pfn(page), 1 << order)
> 
> so memory leak leaves traces in dmesg, at least?

I haven't looked too closely at the error path, but it might make sense
to have rmp_mark_pages_firmware() leak all the pages vs trying to do any
cleanup. Also, have snp_reclaim_pages() leak all the pages on any single
reclaim page error, because it looks like, in general, the pages are
never freed if any single page fails to be reclaimed, yet only the page
that failed and the pages after it get leaked via snp_leak_pages().
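
To make the accounting difference concrete, here is a minimal user-space
sketch comparing the two policies. The names (enum sketch_policy,
sketch_reclaim()) are hypothetical and only model the behaviour described
above: one policy records the failed page and those after it, the other
records the whole range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy switch for the sketch; not a kernel flag. */
enum sketch_policy { SKETCH_LEAK_TAIL, SKETCH_LEAK_ALL };

/*
 * Try to reclaim npages pages; the page at index fails_at cannot be
 * reclaimed (-1 means every page succeeds). Returns the number of pages
 * recorded as leaked under the given policy.
 */
static size_t sketch_reclaim(size_t npages, int fails_at,
			     enum sketch_policy policy)
{
	assert(fails_at < (int)npages);

	for (size_t i = 0; i < npages; i++) {
		if ((int)i != fails_at)
			continue;	/* this page was reclaimed */

		/* First failure: stop and decide how much to record. */
		return policy == SKETCH_LEAK_ALL ? npages : npages - i;
	}

	return 0;
}
```

For an 8-page range failing at index 5, the tail policy records 3 pages and
the leak-all policy records 8. If the caller never frees the allocation after
a failure, the remaining 5 pages under the tail policy are lost without being
accounted for.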

Except in sev_ioctl_do_snp_platform_status(), where __free_pages() is
called if rmp_mark_pages_firmware() fails, which doesn't seem right.

And I'm not sure what to do if rmp_mark_pages_firmware() fails in
snp_prep_cmd_buf(), since __sev_do_cmd_locked() is using pre-allocated
buffers. But it looks like if snp_prep_cmd_buf() fails or
snp_reclaim_cmd_buf() fails, then the buffer usage indicator is never
released and commands will just fail at some point...  But those buffers
are allocated using devm_get_free_pages(), so nothing good would happen
if the ccp module is unloaded and those pages are freed in the wrong state.

Thanks,
Tom

> 
> 
>>
>> Thanks,
>> Tom
>>
>>>           return NULL;
>>> +    }
>>>         return page;
>>>   }
>
Patch

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 2e87ca0e292a..0b5f8ab657c5 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -443,8 +443,10 @@  static struct page *__snp_alloc_firmware_pages(gfp_t gfp_mask, int order)
 		return page;
 
 	paddr = __pa((unsigned long)page_address(page));
-	if (rmp_mark_pages_firmware(paddr, npages, false))
+	if (rmp_mark_pages_firmware(paddr, npages, false)) {
+		__free_pages(page, order);
 		return NULL;
+	}
 
 	return page;
 }