arm64: define __alloc_zeroed_user_highpage

Message ID 20200312155920.50067-1-glider@google.com (mailing list archive)
State Mainlined
Commit c17a290f7e7e59d24b4507736b7b40b0eb5f8f1f
Series arm64: define __alloc_zeroed_user_highpage

Commit Message

Alexander Potapenko March 12, 2020, 3:59 p.m. UTC
When running the kernel with init_on_alloc=1, calling the default
implementation of __alloc_zeroed_user_highpage() from include/linux/highmem.h
leads to double initialization of the allocated page (first by the page
allocator, then by clear_user_page()).
Calling alloc_page_vma() with __GFP_ZERO, as on e.g. x86, is enough to
ensure the user page is zeroed only once.

Signed-off-by: Alexander Potapenko <glider@google.com>
---
 arch/arm64/include/asm/page.h | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Mark Rutland March 12, 2020, 4:49 p.m. UTC | #1
On Thu, Mar 12, 2020 at 04:59:20PM +0100, glider@google.com wrote:
> When running the kernel with init_on_alloc=1, calling the default
> implementation of __alloc_zeroed_user_highpage() from include/linux/highmem.h
> leads to double initialization of the allocated page (first by the page
> allocator, then by clear_user_page()).
> Calling alloc_page_vma() with __GFP_ZERO, as on e.g. x86, is enough to
> ensure the user page is zeroed only once.

Just to check, is there a functional issue beyond the redundant zeroing,
or is this just a performance issue?

On architectures with real highmem, does GFP_HIGHUSER prevent the
allocator from zeroing the page in this case, or is the architecture
prevented from allocating from highmem?

This feels like something we should be able to fix in the generic
implementation of __alloc_zeroed_user_highpage(), with an additional
check to see if init_on_alloc is in use.

Thanks,
Mark.

> 
> Signed-off-by: Alexander Potapenko <glider@google.com>
> ---
>  arch/arm64/include/asm/page.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index d39ddb258a049..75d6cd23a6790 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -21,6 +21,10 @@ extern void __cpu_copy_user_page(void *to, const void *from,
>  extern void copy_page(void *to, const void *from);
>  extern void clear_page(void *to);
>  
> +#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
> +	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
> +#define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
> +
>  #define clear_user_page(addr,vaddr,pg)  __cpu_clear_user_page(addr, vaddr)
>  #define copy_user_page(to,from,vaddr,pg) __cpu_copy_user_page(to, from, vaddr)
>  
> -- 
> 2.25.1.481.gfbce0eb801-goog
>
Alexander Potapenko March 12, 2020, 7:59 p.m. UTC | #2
On Thu, Mar 12, 2020 at 5:49 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Thu, Mar 12, 2020 at 04:59:20PM +0100, glider@google.com wrote:
> > When running the kernel with init_on_alloc=1, calling the default
> > implementation of __alloc_zeroed_user_highpage() from include/linux/highmem.h
> > leads to double initialization of the allocated page (first by the page
> > allocator, then by clear_user_page()).
> > Calling alloc_page_vma() with __GFP_ZERO, as on e.g. x86, is enough to
> > ensure the user page is zeroed only once.
>
> Just to check, is there a functional issue beyond the redundant zeroing,
> or is this just a performance issue?

This is just a performance issue that only manifests when running the
kernel with init_on_alloc=1.

> On architectures with real highmem, does GFP_HIGHUSER prevent the
> allocator from zeroing the page in this case, or is the architecture
> prevented from allocating from highmem?

I was hoping one of the ARM maintainers could answer this question. My
understanding was that __GFP_ZERO should be sufficient, but there's
probably something I'm missing.



> Thanks,
> Mark.
>
> >
> > Signed-off-by: Alexander Potapenko <glider@google.com>
> > ---
> >  arch/arm64/include/asm/page.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> > index d39ddb258a049..75d6cd23a6790 100644
> > --- a/arch/arm64/include/asm/page.h
> > +++ b/arch/arm64/include/asm/page.h
> > @@ -21,6 +21,10 @@ extern void __cpu_copy_user_page(void *to, const void *from,
> >  extern void copy_page(void *to, const void *from);
> >  extern void clear_page(void *to);
> >
> > +#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
> > +     alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
> > +#define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
> > +
> >  #define clear_user_page(addr,vaddr,pg)  __cpu_clear_user_page(addr, vaddr)
> >  #define copy_user_page(to,from,vaddr,pg) __cpu_copy_user_page(to, from, vaddr)
> >
> > --
> > 2.25.1.481.gfbce0eb801-goog
> >
Catalin Marinas March 13, 2020, 3:03 p.m. UTC | #3
On Thu, Mar 12, 2020 at 08:59:28PM +0100, Alexander Potapenko wrote:
> On Thu, Mar 12, 2020 at 5:49 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Thu, Mar 12, 2020 at 04:59:20PM +0100, glider@google.com wrote:
> > > When running the kernel with init_on_alloc=1, calling the default
> > > implementation of __alloc_zeroed_user_highpage() from include/linux/highmem.h
> > > leads to double initialization of the allocated page (first by the page
> > > allocator, then by clear_user_page()).
> > > Calling alloc_page_vma() with __GFP_ZERO, as on e.g. x86, is enough to
> > > ensure the user page is zeroed only once.
> >
> > Just to check, is there a functional issue beyond the redundant zeroing,
> > or is this just a performance issue?
> 
> This is just a performance issue that only manifests when running the
> kernel with init_on_alloc=1.
> 
> > On architectures with real highmem, does GFP_HIGHUSER prevent the
> > allocator from zeroing the page in this case, or is the architecture
> > prevented from allocating from highmem?
> 
> I was hoping one of the ARM maintainers could answer this question. My
> understanding was that __GFP_ZERO should be sufficient, but there's
> probably something I'm missing.

On architectures with aliasing D-cache (whether it's VIVT or aliasing
VIPT), clear_user_highpage() ensures that the correct alias, as seen by
the user, is cleared (see the arm32 v6_clear_user_highpage_aliasing() as
an example). The clear_highpage() call as done by page_alloc.c does not
have the user address information, so it can only clear the kernel
alias.

On arm64 we don't have such an issue, so we can optimise this case as per
your patch. We may change this function later with MTE if we allow tags
other than 0 on the first allocation of anonymous pages.
Catalin Marinas March 17, 2020, 6:37 p.m. UTC | #4
On Thu, Mar 12, 2020 at 04:59:20PM +0100, glider@google.com wrote:
> When running the kernel with init_on_alloc=1, calling the default
> implementation of __alloc_zeroed_user_highpage() from include/linux/highmem.h
> leads to double-initialization of the allocated page (first by the page
> allocator, then by clear_user_page().
> Calling alloc_page_vma() with __GFP_ZERO, similarly to e.g. x86, seems
> to be enough to ensure the user page is zeroed only once.
> 
> Signed-off-by: Alexander Potapenko <glider@google.com>

I queued this for 5.7. Thanks.

Patch

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index d39ddb258a049..75d6cd23a6790 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -21,6 +21,10 @@  extern void __cpu_copy_user_page(void *to, const void *from,
 extern void copy_page(void *to, const void *from);
 extern void clear_page(void *to);
 
+#define __alloc_zeroed_user_highpage(movableflags, vma, vaddr) \
+	alloc_page_vma(GFP_HIGHUSER | __GFP_ZERO | movableflags, vma, vaddr)
+#define __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
+
 #define clear_user_page(addr,vaddr,pg)  __cpu_clear_user_page(addr, vaddr)
 #define copy_user_page(to,from,vaddr,pg) __cpu_copy_user_page(to, from, vaddr)