diff mbox series

[v2,1/3] kasan: Avoid sleepable page allocation from atomic context

Message ID 2d9f4ac4528701b59d511a379a60107fa608ad30.1744128123.git.agordeev@linux.ibm.com (mailing list archive)
State New
Headers show
Series mm: Fix apply_to_pte_range() vs lazy MMU mode | expand

Commit Message

Alexander Gordeev April 8, 2025, 4:07 p.m. UTC
apply_to_page_range() enters lazy MMU mode and then invokes
kasan_populate_vmalloc_pte() callback on each page table walk
iteration. The lazy MMU mode may only be entered only under
protection of the page table lock. However, the callback can
go into sleep when trying to allocate a single page.

Change __get_free_page() allocation mode from GFP_KERNEL to
GFP_ATOMIC to avoid scheduling out while in atomic context.

Cc: stable@vger.kernel.org
Fixes: 3c5c3cfb9ef4 ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
---
 mm/kasan/shadow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Andrey Ryabinin April 9, 2025, 2:10 p.m. UTC | #1
On 4/8/25 6:07 PM, Alexander Gordeev wrote:
> apply_to_page_range() enters lazy MMU mode and then invokes
> kasan_populate_vmalloc_pte() callback on each page table walk
> iteration. The lazy MMU mode may only be entered only under
> protection of the page table lock. However, the callback can
> go into sleep when trying to allocate a single page.
> 
> Change __get_free_page() allocation mode from GFP_KERNEL to
> GFP_ATOMIC to avoid scheduling out while in atomic context.
> 
> Cc: stable@vger.kernel.org
> Fixes: 3c5c3cfb9ef4 ("kasan: support backing vmalloc space with real shadow memory")
> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
> ---
>  mm/kasan/shadow.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index 88d1c9dcb507..edfa77959474 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
>  	if (likely(!pte_none(ptep_get(ptep))))
>  		return 0;
>  
> -	page = __get_free_page(GFP_KERNEL);
> +	page = __get_free_page(GFP_ATOMIC);
>  	if (!page)
>  		return -ENOMEM;
>  

I think a better way to fix this would be moving out allocation from atomic context. Allocate page prior
to apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte().

Whenever kasan_populate_vmalloc_pte() will require additional page we could bail out with -EAGAIN,
and allocate another one.
Alexander Gordeev April 9, 2025, 2:25 p.m. UTC | #2
On Wed, Apr 09, 2025 at 04:10:58PM +0200, Andrey Ryabinin wrote:

Hi Andrey,

> > @@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
> >  	if (likely(!pte_none(ptep_get(ptep))))
> >  		return 0;
> >  
> > -	page = __get_free_page(GFP_KERNEL);
> > +	page = __get_free_page(GFP_ATOMIC);
> >  	if (!page)
> >  		return -ENOMEM;
> >  
> 
> I think a better way to fix this would be moving out allocation from atomic context. Allocate page prior
> to apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte().

I think the page address could be passed as the parameter to kasan_populate_vmalloc_pte().

> Whenever kasan_populate_vmalloc_pte() will require additional page we could bail out with -EAGAIN,
> and allocate another one.

When would it be needed? kasan_populate_vmalloc_pte() handles just one page.

Thanks!
Andrey Ryabinin April 9, 2025, 2:56 p.m. UTC | #3
On 4/9/25 4:25 PM, Alexander Gordeev wrote:
> On Wed, Apr 09, 2025 at 04:10:58PM +0200, Andrey Ryabinin wrote:
> 
> Hi Andrey,
> 
>>> @@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
>>>  	if (likely(!pte_none(ptep_get(ptep))))
>>>  		return 0;
>>>  
>>> -	page = __get_free_page(GFP_KERNEL);
>>> +	page = __get_free_page(GFP_ATOMIC);
>>>  	if (!page)
>>>  		return -ENOMEM;
>>>  
>>
>> I think a better way to fix this would be moving out allocation from atomic context. Allocate page prior
>> to apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte().
> 
> I think the page address could be passed as the parameter to kasan_populate_vmalloc_pte().

We'll need to pass it as 'struct page **page' or maybe as pointer to some struct, e.g.:
struct page_data {
 struct page *page;
};


So, the kasan_populate_vmalloc_pte() would do something like this:

kasan_populate_vmalloc_pte() {
	if (!pte_none)
		return 0;
	if (!page_data->page)
		return -EAGAIN;

	//use page to set pte

        //NULLify pointer so that next kasan_populate_vmalloc_pte() will bail
	// out to allocate new page
	page_data->page = NULL; 
}

And it might be good idea to add 'last_addr' to page_data, so that we know where we stopped
so that the next apply_to_page_range() call could continue, instead of starting from the beginning. 


> 
>> Whenever kasan_populate_vmalloc_pte() will require additional page we could bail out with -EAGAIN,
>> and allocate another one.
> 
> When would it be needed? kasan_populate_vmalloc_pte() handles just one page.
> 

apply_to_page_range() goes over range of addresses and calls kasan_populate_vmalloc_pte()
multiple times (each time with different 'addr' but the same '*unused' arg). Things will go wrong
if you'll use same page multiple times for different addresses.


> Thanks!
Alexander Gordeev April 10, 2025, 3:18 p.m. UTC | #4
On Wed, Apr 09, 2025 at 04:56:29PM +0200, Andrey Ryabinin wrote:

Hi Andrey,

...
> >>> -	page = __get_free_page(GFP_KERNEL);
> >>> +	page = __get_free_page(GFP_ATOMIC);
> >>>  	if (!page)
> >> I think a better way to fix this would be moving out allocation from atomic context. Allocate page prior
> >> to apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte().
> > I think the page address could be passed as the parameter to kasan_populate_vmalloc_pte().
> 
> We'll need to pass it as 'struct page **page' or maybe as pointer to some struct, e.g.:
> struct page_data {
>  struct page *page;
> };
...

Thanks for the hint! I will try to implement that, but will likely start
in two weeks, after I am back from vacation.

Not sure wether this version needs to be dropped.

Thanks!
diff mbox series

Patch

diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 88d1c9dcb507..edfa77959474 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -301,7 +301,7 @@  static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 	if (likely(!pte_none(ptep_get(ptep))))
 		return 0;
 
-	page = __get_free_page(GFP_KERNEL);
+	page = __get_free_page(GFP_ATOMIC);
 	if (!page)
 		return -ENOMEM;