diff mbox series

[RFC,06/14] fork: zero vmap stack using clear_page() instead of memset()

Message ID 20240311164638.2015063-7-pasha.tatashin@soleen.com (mailing list archive)
State New
Headers show
Series Dynamic Kernel Stacks | expand

Commit Message

Pasha Tatashin March 11, 2024, 4:46 p.m. UTC
In preporation for dynamic kernel stacks do not zero the whole span of
the stack, but instead only the pages that are part of the vm_area.

This is because with dynamic stacks we might have only partially
populated stacks.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
---
 kernel/fork.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Nikolay Borisov March 12, 2024, 7:15 a.m. UTC | #1
On 11.03.24 г. 18:46 ч., Pasha Tatashin wrote:
> In preporation for dynamic kernel stacks do not zero the whole span of
> the stack, but instead only the pages that are part of the vm_area.
> 
> This is because with dynamic stacks we might have only partially
> populated stacks.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---
>   kernel/fork.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 6a2f2c85e09f..41e0baee79d2 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -263,8 +263,8 @@ static int memcg_charge_kernel_stack(struct vm_struct *vm)
>   static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>   {
>   	struct vm_struct *vm_area;
> +	int i, j, nr_pages;
>   	void *stack;
> -	int i;
>   
>   	for (i = 0; i < NR_CACHED_STACKS; i++) {
>   		vm_area = this_cpu_xchg(cached_stacks[i], NULL);
> @@ -282,7 +282,9 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>   		stack = kasan_reset_tag(vm_area->addr);
>   
>   		/* Clear stale pointers from reused stack. */
> -		memset(stack, 0, THREAD_SIZE);
> +		nr_pages = vm_area->nr_pages;
> +		for (j = 0; j < nr_pages; j++)
> +			clear_page(page_address(vm_area->pages[j]));

Can't this be memset(stack, 0, nr_pages*PAGE_SIZE) ?
>   
>   		tsk->stack_vm_area = vm_area;
>   		tsk->stack = stack;
Pasha Tatashin March 12, 2024, 4:53 p.m. UTC | #2
On Tue, Mar 12, 2024 at 3:16 AM Nikolay Borisov <nik.borisov@suse.com> wrote:
>
>
>
> On 11.03.24 г. 18:46 ч., Pasha Tatashin wrote:
> > In preporation for dynamic kernel stacks do not zero the whole span of
> > the stack, but instead only the pages that are part of the vm_area.
> >
> > This is because with dynamic stacks we might have only partially
> > populated stacks.
> >
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > ---
> >   kernel/fork.c | 6 ++++--
> >   1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 6a2f2c85e09f..41e0baee79d2 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -263,8 +263,8 @@ static int memcg_charge_kernel_stack(struct vm_struct *vm)
> >   static int alloc_thread_stack_node(struct task_struct *tsk, int node)
> >   {
> >       struct vm_struct *vm_area;
> > +     int i, j, nr_pages;
> >       void *stack;
> > -     int i;
> >
> >       for (i = 0; i < NR_CACHED_STACKS; i++) {
> >               vm_area = this_cpu_xchg(cached_stacks[i], NULL);
> > @@ -282,7 +282,9 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
> >               stack = kasan_reset_tag(vm_area->addr);
> >
> >               /* Clear stale pointers from reused stack. */
> > -             memset(stack, 0, THREAD_SIZE);
> > +             nr_pages = vm_area->nr_pages;
> > +             for (j = 0; j < nr_pages; j++)
> > +                     clear_page(page_address(vm_area->pages[j]));
>
> Can't this be memset(stack, 0, nr_pages*PAGE_SIZE) ?

No, we can't, because the pages can be physically discontiguous.

Pasha
Christophe Leroy March 14, 2024, 7:55 a.m. UTC | #3
Le 12/03/2024 à 17:53, Pasha Tatashin a écrit :
> On Tue, Mar 12, 2024 at 3:16 AM Nikolay Borisov <nik.borisov@suse.com> wrote:
>>
>>
>>
>> On 11.03.24 г. 18:46 ч., Pasha Tatashin wrote:
>>> In preporation for dynamic kernel stacks do not zero the whole span of
>>> the stack, but instead only the pages that are part of the vm_area.
>>>
>>> This is because with dynamic stacks we might have only partially
>>> populated stacks.
>>>
>>> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
>>> ---
>>>    kernel/fork.c | 6 ++++--
>>>    1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/fork.c b/kernel/fork.c
>>> index 6a2f2c85e09f..41e0baee79d2 100644
>>> --- a/kernel/fork.c
>>> +++ b/kernel/fork.c
>>> @@ -263,8 +263,8 @@ static int memcg_charge_kernel_stack(struct vm_struct *vm)
>>>    static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>>>    {
>>>        struct vm_struct *vm_area;
>>> +     int i, j, nr_pages;
>>>        void *stack;
>>> -     int i;
>>>
>>>        for (i = 0; i < NR_CACHED_STACKS; i++) {
>>>                vm_area = this_cpu_xchg(cached_stacks[i], NULL);
>>> @@ -282,7 +282,9 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>>>                stack = kasan_reset_tag(vm_area->addr);
>>>
>>>                /* Clear stale pointers from reused stack. */
>>> -             memset(stack, 0, THREAD_SIZE);
>>> +             nr_pages = vm_area->nr_pages;
>>> +             for (j = 0; j < nr_pages; j++)
>>> +                     clear_page(page_address(vm_area->pages[j]));
>>
>> Can't this be memset(stack, 0, nr_pages*PAGE_SIZE) ?
> 
> No, we can't, because the pages can be physically discontiguous.
> 

But the pages were already physically discontiguous before your change, 
what's the difference ?

It doesn't matter that the pages are physically discontiguous as far as 
they are virtually contiguous, which should still be the case here for a 
stack.

Nevertheless, from powerpc point of view I'm happy with clear_page() 
which is more optimised than memset(0)

Christophe
Pasha Tatashin March 14, 2024, 1:52 p.m. UTC | #4
> But the pages were already physically discontiguous before your change,
> what's the difference ?

Pages were not physically contiguous before my change. They were
allocated with __vmalloc_node_range() which allocates sparse pages and
maps them to a virtually contiguous span of memory within
[VMALLOC_START, VMALLOC_END) range.

> It doesn't matter that the pages are physically discontiguous as far as
> they are virtually contiguous, which should still be the case here for a
> stack.

This patch is a preparation patch for the "dynamic kernel stack"
feature, in the description it says:
This is because with dynamic stacks we might have only partially
populated stacks.

We could compute the populated part of the stack, and determine its
start and end mapped VA range by using vm_area->pages[] and
vm_area->nr_pages, but that would make code a little uglier especially
becuase we would need to take into the account if stack is growing up
or down.. Therefore, using clear_page()  is simpler and should be fast
enough.

Thanks,
Pasha
Christophe JAILLET March 17, 2024, 2:48 p.m. UTC | #5
Le 11/03/2024 à 17:46, Pasha Tatashin a écrit :
> In preporation for dynamic kernel stacks do not zero the whole span of

Nit: preparation

> the stack, but instead only the pages that are part of the vm_area.
> 
> This is because with dynamic stacks we might have only partially
> populated stacks.
> 
> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> ---
>   kernel/fork.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)

...
Pasha Tatashin March 17, 2024, 3:15 p.m. UTC | #6
On Sun, Mar 17, 2024 at 10:48 AM Christophe JAILLET
<christophe.jaillet@wanadoo.fr> wrote:
>
> Le 11/03/2024 à 17:46, Pasha Tatashin a écrit :
> > In preporation for dynamic kernel stacks do not zero the whole span of
>
> Nit: preparation

Thank you,
Pasha

>
> > the stack, but instead only the pages that are part of the vm_area.
> >
> > This is because with dynamic stacks we might have only partially
> > populated stacks.
> >
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > ---
> >   kernel/fork.c | 6 ++++--
> >   1 file changed, 4 insertions(+), 2 deletions(-)
>
> ...
>
diff mbox series

Patch

diff --git a/kernel/fork.c b/kernel/fork.c
index 6a2f2c85e09f..41e0baee79d2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -263,8 +263,8 @@  static int memcg_charge_kernel_stack(struct vm_struct *vm)
 static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 	struct vm_struct *vm_area;
+	int i, j, nr_pages;
 	void *stack;
-	int i;
 
 	for (i = 0; i < NR_CACHED_STACKS; i++) {
 		vm_area = this_cpu_xchg(cached_stacks[i], NULL);
@@ -282,7 +282,9 @@  static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 		stack = kasan_reset_tag(vm_area->addr);
 
 		/* Clear stale pointers from reused stack. */
-		memset(stack, 0, THREAD_SIZE);
+		nr_pages = vm_area->nr_pages;
+		for (j = 0; j < nr_pages; j++)
+			clear_page(page_address(vm_area->pages[j]));
 
 		tsk->stack_vm_area = vm_area;
 		tsk->stack = stack;