Message ID: 2d9f4ac4528701b59d511a379a60107fa608ad30.1744128123.git.agordeev@linux.ibm.com
State: New
Series: mm: Fix apply_to_pte_range() vs lazy MMU mode
On 4/8/25 6:07 PM, Alexander Gordeev wrote:
> apply_to_page_range() enters lazy MMU mode and then invokes the
> kasan_populate_vmalloc_pte() callback on each page table walk
> iteration. The lazy MMU mode may only be entered under protection
> of the page table lock. However, the callback can sleep when
> trying to allocate a single page.
>
> Change the __get_free_page() allocation mode from GFP_KERNEL to
> GFP_ATOMIC to avoid scheduling out while in atomic context.
>
> Cc: stable@vger.kernel.org
> Fixes: 3c5c3cfb9ef4 ("kasan: support backing vmalloc space with real shadow memory")
> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
> ---
>  mm/kasan/shadow.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> index 88d1c9dcb507..edfa77959474 100644
> --- a/mm/kasan/shadow.c
> +++ b/mm/kasan/shadow.c
> @@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
>  	if (likely(!pte_none(ptep_get(ptep))))
>  		return 0;
>
> -	page = __get_free_page(GFP_KERNEL);
> +	page = __get_free_page(GFP_ATOMIC);
>  	if (!page)
>  		return -ENOMEM;
>

I think a better way to fix this would be to move the allocation out of atomic context: allocate the page prior to the apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte(). Whenever kasan_populate_vmalloc_pte() requires an additional page, it could bail out with -EAGAIN and the caller could allocate another one.
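For context, a simplified sketch of the code path under discussion, assuming only what the patch description above states (apply_to_page_range() brackets the walk with lazy MMU enter/leave); this is not verbatim kernel source, and the shadow-page installation is elided:

	/*
	 * Sketch only: the callback runs inside the lazy MMU section that
	 * apply_to_page_range() opens around the page table walk, yet
	 * __get_free_page(GFP_KERNEL) may sleep under memory pressure.
	 */
	static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
					      void *unused)
	{
		unsigned long page;

		if (likely(!pte_none(ptep_get(ptep))))
			return 0;

		page = __get_free_page(GFP_KERNEL);	/* may sleep */
		if (!page)
			return -ENOMEM;

		/* ... install 'page' as the shadow page backing 'addr' ... */
		return 0;
	}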
On Wed, Apr 09, 2025 at 04:10:58PM +0200, Andrey Ryabinin wrote:

Hi Andrey,

> > @@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
> >  	if (likely(!pte_none(ptep_get(ptep))))
> >  		return 0;
> >
> > -	page = __get_free_page(GFP_KERNEL);
> > +	page = __get_free_page(GFP_ATOMIC);
> >  	if (!page)
> >  		return -ENOMEM;
> >
>
> I think a better way to fix this would be to move the allocation out of atomic context: allocate the page prior
> to the apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte().

I think the page address could be passed as a parameter to kasan_populate_vmalloc_pte().

> Whenever kasan_populate_vmalloc_pte() requires an additional page, it could bail out with -EAGAIN
> and the caller could allocate another one.

When would that be needed? kasan_populate_vmalloc_pte() handles just one page.

Thanks!
On 4/9/25 4:25 PM, Alexander Gordeev wrote:
> On Wed, Apr 09, 2025 at 04:10:58PM +0200, Andrey Ryabinin wrote:
>
> Hi Andrey,
>
>>> @@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
>>>  	if (likely(!pte_none(ptep_get(ptep))))
>>>  		return 0;
>>>
>>> -	page = __get_free_page(GFP_KERNEL);
>>> +	page = __get_free_page(GFP_ATOMIC);
>>>  	if (!page)
>>>  		return -ENOMEM;
>>>
>>
>> I think a better way to fix this would be to move the allocation out of atomic context: allocate the page prior
>> to the apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte().
>
> I think the page address could be passed as a parameter to kasan_populate_vmalloc_pte().

We'll need to pass it as 'struct page **page' or maybe as a pointer to some struct, e.g.:

	struct page_data {
		struct page *page;
	};

So kasan_populate_vmalloc_pte() would do something like this:

	kasan_populate_vmalloc_pte()
	{
		if (!pte_none)
			return 0;

		if (!page_data->page)
			return -EAGAIN;

		// use the page to set the pte

		// NULLify the pointer so that the next kasan_populate_vmalloc_pte()
		// will bail out to allocate a new page
		page_data->page = NULL;
	}

It might also be a good idea to add 'last_addr' to page_data, so that we know where we stopped and the next apply_to_page_range() call can continue from there instead of starting from the beginning.

>
>> Whenever kasan_populate_vmalloc_pte() requires an additional page, it could bail out with -EAGAIN
>> and the caller could allocate another one.
>
> When would that be needed? kasan_populate_vmalloc_pte() handles just one page.
>

apply_to_page_range() goes over a range of addresses and calls kasan_populate_vmalloc_pte() multiple times (each time with a different 'addr' but the same '*unused' arg). Things will go wrong if you use the same page multiple times for different addresses.

> Thanks!
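A slightly fuller sketch of what that callback could look like, purely as an illustration: the struct and field names and the -EAGAIN protocol follow the pseudocode above, while the locking and pte construction are assumptions rather than anything from an actual patch. The shadow page is assumed to be allocated and initialized by the caller.

	struct page_data {
		struct page *page;	/* pre-allocated shadow page, NULL once consumed */
		unsigned long last_addr;	/* address at which the walk bailed out */
	};

	static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
					      void *data)
	{
		struct page_data *pd = data;
		pte_t pte;

		if (likely(!pte_none(ptep_get(ptep))))
			return 0;

		/* No spare page left: remember where we are and let the caller refill. */
		if (!pd->page) {
			pd->last_addr = addr;
			return -EAGAIN;
		}

		pte = pfn_pte(page_to_pfn(pd->page), PAGE_KERNEL);

		spin_lock(&init_mm.page_table_lock);
		if (likely(pte_none(ptep_get(ptep)))) {
			set_pte_at(&init_mm, addr, ptep, pte);
			/* Consume the page so the next invocation bails out for a new one. */
			pd->page = NULL;
		}
		spin_unlock(&init_mm.page_table_lock);

		return 0;
	}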
On Wed, Apr 09, 2025 at 04:56:29PM +0200, Andrey Ryabinin wrote:

Hi Andrey,

...
> >>> -	page = __get_free_page(GFP_KERNEL);
> >>> +	page = __get_free_page(GFP_ATOMIC);
> >>>  	if (!page)
> >> I think a better way to fix this would be to move the allocation out of atomic context: allocate the page prior
> >> to the apply_to_page_range() call and pass it down to kasan_populate_vmalloc_pte().
> > I think the page address could be passed as a parameter to kasan_populate_vmalloc_pte().
>
> We'll need to pass it as 'struct page **page' or maybe as a pointer to some struct, e.g.:
> 	struct page_data {
> 		struct page *page;
> 	};
...

Thanks for the hint! I will try to implement that, but will likely start in two weeks, after I am back from vacation. Not sure whether this version needs to be dropped.

Thanks!
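The caller side of such a scheme might then look roughly like the following; again only a sketch built on the same assumptions (struct page_data, the -EAGAIN protocol, a 'last_addr' resume point), not the patch that was eventually written, and with the upstream early-return checks omitted:

	int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
	{
		unsigned long shadow_start, shadow_end;
		struct page_data pd = { .page = NULL, .last_addr = 0 };
		int ret;

		shadow_start = (unsigned long)kasan_mem_to_shadow((void *)addr);
		shadow_start = ALIGN_DOWN(shadow_start, PAGE_SIZE);
		shadow_end = (unsigned long)kasan_mem_to_shadow((void *)addr + size);
		shadow_end = ALIGN(shadow_end, PAGE_SIZE);

		pd.last_addr = shadow_start;
		do {
			/* Sleepable allocation, done outside of apply_to_page_range(). */
			if (!pd.page) {
				pd.page = alloc_page(GFP_KERNEL);
				if (!pd.page)
					return -ENOMEM;
				memset(page_address(pd.page), KASAN_VMALLOC_INVALID, PAGE_SIZE);
			}

			/* Resume the walk where the previous attempt bailed out. */
			ret = apply_to_page_range(&init_mm, pd.last_addr,
						  shadow_end - pd.last_addr,
						  kasan_populate_vmalloc_pte, &pd);
		} while (ret == -EAGAIN);

		/* Free the last pre-allocated page if the walk never needed it. */
		if (pd.page)
			__free_page(pd.page);
		return ret;
	}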
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 88d1c9dcb507..edfa77959474 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -301,7 +301,7 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 	if (likely(!pte_none(ptep_get(ptep))))
 		return 0;
 
-	page = __get_free_page(GFP_KERNEL);
+	page = __get_free_page(GFP_ATOMIC);
 	if (!page)
 		return -ENOMEM;
apply_to_page_range() enters lazy MMU mode and then invokes the
kasan_populate_vmalloc_pte() callback on each page table walk
iteration. The lazy MMU mode may only be entered under protection
of the page table lock. However, the callback can sleep when
trying to allocate a single page.

Change the __get_free_page() allocation mode from GFP_KERNEL to
GFP_ATOMIC to avoid scheduling out while in atomic context.

Cc: stable@vger.kernel.org
Fixes: 3c5c3cfb9ef4 ("kasan: support backing vmalloc space with real shadow memory")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
---
 mm/kasan/shadow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)