diff mbox series

[v2] memcg: Fix memcg_kmem_bypass() for remote memcg charging

Message ID 3a721f62-5a66-8bc5-247b-5c8b7c51c555@huawei.com (mailing list archive)
State New, archived
Headers show
Series [v2] memcg: Fix memcg_kmem_bypass() for remote memcg charging | expand

Commit Message

Zefan Li May 13, 2020, 11:47 a.m. UTC
While trying to use remote memcg charging in an out-of-tree kernel module
I found it's not working, because the current thread is a workqueue thread.

As we will probably encounter this issue in the future as the users of
memalloc_use_memcg() grow, it's better we fix it now.

Signed-off-by: Zefan Li <lizefan@huawei.com>
---

v2: add a comment as sugguested by Michal. and add changelog to explain why
upstream kernel needs this fix.

---

 mm/memcontrol.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Shakeel Butt May 13, 2020, 12:22 p.m. UTC | #1
On Wed, May 13, 2020 at 4:47 AM Zefan Li <lizefan@huawei.com> wrote:
>
> While trying to use remote memcg charging in an out-of-tree kernel module
> I found it's not working, because the current thread is a workqueue thread.
>
> As we will probably encounter this issue in the future as the users of
> memalloc_use_memcg() grow, it's better we fix it now.
>
> Signed-off-by: Zefan Li <lizefan@huawei.com>
> ---
>
> v2: add a comment as sugguested by Michal. and add changelog to explain why
> upstream kernel needs this fix.
>
> ---
>
>  mm/memcontrol.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a3b97f1..43a12ed 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2802,6 +2802,9 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>
>  static inline bool memcg_kmem_bypass(void)
>  {
> +       /* Allow remote memcg charging in kthread contexts. */
> +       if (unlikely(current->active_memcg))
> +               return false;

What about __GFP_ACCOUNT allocations in the interrupt context?

e.g.

memalloc_use_memcg(memcg);
--->interrupt
--->alloc_page(GFP_KERNEL_ACCOUNT) in interrupt context.
alloc_page(GFP_KERNEL_ACCOUNT);
memalloc_unuse_memcg();


>         if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
>                 return true;
>         return false;
> --
> 2.7.4
>
Johannes Weiner May 13, 2020, 1:05 p.m. UTC | #2
On Wed, May 13, 2020 at 07:47:49PM +0800, Zefan Li wrote:
> While trying to use remote memcg charging in an out-of-tree kernel module
> I found it's not working, because the current thread is a workqueue thread.
> 
> As we will probably encounter this issue in the future as the users of
> memalloc_use_memcg() grow, it's better we fix it now.
> 
> Signed-off-by: Zefan Li <lizefan@huawei.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Roman Gushchin May 13, 2020, 4:11 p.m. UTC | #3
On Wed, May 13, 2020 at 07:47:49PM +0800, Zefan Li wrote:
> While trying to use remote memcg charging in an out-of-tree kernel module
> I found it's not working, because the current thread is a workqueue thread.
> 
> As we will probably encounter this issue in the future as the users of
> memalloc_use_memcg() grow, it's better we fix it now.
> 
> Signed-off-by: Zefan Li <lizefan@huawei.com>
> ---
> 
> v2: add a comment as sugguested by Michal. and add changelog to explain why
> upstream kernel needs this fix.
> 
> ---
> 
>  mm/memcontrol.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index a3b97f1..43a12ed 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2802,6 +2802,9 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>  
>  static inline bool memcg_kmem_bypass(void)
>  {
> +	/* Allow remote memcg charging in kthread contexts. */
> +	if (unlikely(current->active_memcg))
> +		return false;
>  	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
>  		return true;

Shakeel is right about interrupts. How about something like this?

static inline bool memcg_kmem_bypass(void)
{
	if (in_interrupt())
		return true;

	if ((!current->mm || current->flags & PF_KTHREAD) && !current->active_memcg)
		return true;

	return false;
}

Thanks!
Zefan Li May 14, 2020, 1:16 a.m. UTC | #4
On 2020/5/14 0:11, Roman Gushchin wrote:
> On Wed, May 13, 2020 at 07:47:49PM +0800, Zefan Li wrote:
>> While trying to use remote memcg charging in an out-of-tree kernel module
>> I found it's not working, because the current thread is a workqueue thread.
>>
>> As we will probably encounter this issue in the future as the users of
>> memalloc_use_memcg() grow, it's better we fix it now.
>>
>> Signed-off-by: Zefan Li <lizefan@huawei.com>
>> ---
>>
>> v2: add a comment as sugguested by Michal. and add changelog to explain why
>> upstream kernel needs this fix.
>>
>> ---
>>
>>  mm/memcontrol.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index a3b97f1..43a12ed 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -2802,6 +2802,9 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>>  
>>  static inline bool memcg_kmem_bypass(void)
>>  {
>> +	/* Allow remote memcg charging in kthread contexts. */
>> +	if (unlikely(current->active_memcg))
>> +		return false;
>>  	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
>>  		return true;
> 
> Shakeel is right about interrupts. How about something like this?
> 
> static inline bool memcg_kmem_bypass(void)
> {
> 	if (in_interrupt())
> 		return true;
> 
> 	if ((!current->mm || current->flags & PF_KTHREAD) && !current->active_memcg)
> 		return true;
> 
> 	return false;
> }
> 

I thought the user should ensure not do this, but now I think it makes sense to just bypass
the interrupt case.
Roman Gushchin May 14, 2020, 10:52 p.m. UTC | #5
On Thu, May 14, 2020 at 09:16:29AM +0800, Zefan Li wrote:
> On 2020/5/14 0:11, Roman Gushchin wrote:
> > On Wed, May 13, 2020 at 07:47:49PM +0800, Zefan Li wrote:
> >> While trying to use remote memcg charging in an out-of-tree kernel module
> >> I found it's not working, because the current thread is a workqueue thread.
> >>
> >> As we will probably encounter this issue in the future as the users of
> >> memalloc_use_memcg() grow, it's better we fix it now.
> >>
> >> Signed-off-by: Zefan Li <lizefan@huawei.com>
> >> ---
> >>
> >> v2: add a comment as sugguested by Michal. and add changelog to explain why
> >> upstream kernel needs this fix.
> >>
> >> ---
> >>
> >>  mm/memcontrol.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> >> index a3b97f1..43a12ed 100644
> >> --- a/mm/memcontrol.c
> >> +++ b/mm/memcontrol.c
> >> @@ -2802,6 +2802,9 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
> >>  
> >>  static inline bool memcg_kmem_bypass(void)
> >>  {
> >> +	/* Allow remote memcg charging in kthread contexts. */
> >> +	if (unlikely(current->active_memcg))
> >> +		return false;
> >>  	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
> >>  		return true;
> > 
> > Shakeel is right about interrupts. How about something like this?
> > 
> > static inline bool memcg_kmem_bypass(void)
> > {
> > 	if (in_interrupt())
> > 		return true;
> > 
> > 	if ((!current->mm || current->flags & PF_KTHREAD) && !current->active_memcg)
> > 		return true;
> > 
> > 	return false;
> > }
> > 
> 
> I thought the user should ensure not do this, but now I think it makes sense to just bypass
> the interrupt case.

I think now it's mostly a legacy of the opt-out kernel memory accounting.
Actually we can relax this requirement by forcibly overcommit the memory cgroup
if the allocation is happening from the irq context, and punish it afterwards.
Idk how much we wanna this, hopefully nobody is allocating large non-temporarily
objects from an irq.

Will you send a v3?

Thanks!
Michal Hocko May 15, 2020, 6:56 a.m. UTC | #6
On Thu 14-05-20 15:52:59, Roman Gushchin wrote:
> On Thu, May 14, 2020 at 09:16:29AM +0800, Zefan Li wrote:
> > On 2020/5/14 0:11, Roman Gushchin wrote:
> > > On Wed, May 13, 2020 at 07:47:49PM +0800, Zefan Li wrote:
> > >> While trying to use remote memcg charging in an out-of-tree kernel module
> > >> I found it's not working, because the current thread is a workqueue thread.
> > >>
> > >> As we will probably encounter this issue in the future as the users of
> > >> memalloc_use_memcg() grow, it's better we fix it now.
> > >>
> > >> Signed-off-by: Zefan Li <lizefan@huawei.com>
> > >> ---
> > >>
> > >> v2: add a comment as sugguested by Michal. and add changelog to explain why
> > >> upstream kernel needs this fix.
> > >>
> > >> ---
> > >>
> > >>  mm/memcontrol.c | 3 +++
> > >>  1 file changed, 3 insertions(+)
> > >>
> > >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > >> index a3b97f1..43a12ed 100644
> > >> --- a/mm/memcontrol.c
> > >> +++ b/mm/memcontrol.c
> > >> @@ -2802,6 +2802,9 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
> > >>  
> > >>  static inline bool memcg_kmem_bypass(void)
> > >>  {
> > >> +	/* Allow remote memcg charging in kthread contexts. */
> > >> +	if (unlikely(current->active_memcg))
> > >> +		return false;
> > >>  	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
> > >>  		return true;
> > > 
> > > Shakeel is right about interrupts. How about something like this?
> > > 
> > > static inline bool memcg_kmem_bypass(void)
> > > {
> > > 	if (in_interrupt())
> > > 		return true;
> > > 
> > > 	if ((!current->mm || current->flags & PF_KTHREAD) && !current->active_memcg)
> > > 		return true;
> > > 
> > > 	return false;
> > > }
> > > 
> > 
> > I thought the user should ensure not do this, but now I think it makes sense to just bypass
> > the interrupt case.
> 
> I think now it's mostly a legacy of the opt-out kernel memory accounting.
> Actually we can relax this requirement by forcibly overcommit the memory cgroup
> if the allocation is happening from the irq context, and punish it afterwards.
> Idk how much we wanna this, hopefully nobody is allocating large non-temporarily
> objects from an irq.

I do not think we want to pretend that remote charging from the IRQ
context is supported. Why don't we simply WARN_ON(in_interrupt()) there?

> 
> Will you send a v3?
> 
> Thanks!
Zefan Li May 15, 2020, 8:20 a.m. UTC | #7
On 2020/5/15 14:56, Michal Hocko wrote:
> On Thu 14-05-20 15:52:59, Roman Gushchin wrote:
>> On Thu, May 14, 2020 at 09:16:29AM +0800, Zefan Li wrote:
>>> On 2020/5/14 0:11, Roman Gushchin wrote:
>>>> On Wed, May 13, 2020 at 07:47:49PM +0800, Zefan Li wrote:
>>>>> While trying to use remote memcg charging in an out-of-tree kernel module
>>>>> I found it's not working, because the current thread is a workqueue thread.
>>>>>
>>>>> As we will probably encounter this issue in the future as the users of
>>>>> memalloc_use_memcg() grow, it's better we fix it now.
>>>>>
>>>>> Signed-off-by: Zefan Li <lizefan@huawei.com>
>>>>> ---
>>>>>
>>>>> v2: add a comment as sugguested by Michal. and add changelog to explain why
>>>>> upstream kernel needs this fix.
>>>>>
>>>>> ---
>>>>>
>>>>>  mm/memcontrol.c | 3 +++
>>>>>  1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>>>> index a3b97f1..43a12ed 100644
>>>>> --- a/mm/memcontrol.c
>>>>> +++ b/mm/memcontrol.c
>>>>> @@ -2802,6 +2802,9 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>>>>>  
>>>>>  static inline bool memcg_kmem_bypass(void)
>>>>>  {
>>>>> +	/* Allow remote memcg charging in kthread contexts. */
>>>>> +	if (unlikely(current->active_memcg))
>>>>> +		return false;
>>>>>  	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
>>>>>  		return true;
>>>>
>>>> Shakeel is right about interrupts. How about something like this?
>>>>
>>>> static inline bool memcg_kmem_bypass(void)
>>>> {
>>>> 	if (in_interrupt())
>>>> 		return true;
>>>>
>>>> 	if ((!current->mm || current->flags & PF_KTHREAD) && !current->active_memcg)
>>>> 		return true;
>>>>
>>>> 	return false;
>>>> }
>>>>
>>>
>>> I thought the user should ensure not do this, but now I think it makes sense to just bypass
>>> the interrupt case.
>>
>> I think now it's mostly a legacy of the opt-out kernel memory accounting.
>> Actually we can relax this requirement by forcibly overcommit the memory cgroup
>> if the allocation is happening from the irq context, and punish it afterwards.
>> Idk how much we wanna this, hopefully nobody is allocating large non-temporarily
>> objects from an irq.
> 
> I do not think we want to pretend that remote charging from the IRQ
> context is supported. Why don't we simply WARN_ON(in_interrupt()) there?
> 

How about:

static inline bool memcg_kmem_bypass(void)
{
        if (in_interrupt()) {
                WARN_ON(current->active_memcg);
                return true;
        }

        /* Allow remote memcg charging in kthread contexts. */
        if ((!current->mm || (current->flags & PF_KTHREAD)) && !current->active_memcg)
                return true;
        return false;
}
Michal Hocko May 15, 2020, 8:34 a.m. UTC | #8
On Fri 15-05-20 16:20:04, Li Zefan wrote:
> On 2020/5/15 14:56, Michal Hocko wrote:
> > On Thu 14-05-20 15:52:59, Roman Gushchin wrote:
[...]
> >>> I thought the user should ensure not do this, but now I think it makes sense to just bypass
> >>> the interrupt case.
> >>
> >> I think now it's mostly a legacy of the opt-out kernel memory accounting.
> >> Actually we can relax this requirement by forcibly overcommit the memory cgroup
> >> if the allocation is happening from the irq context, and punish it afterwards.
> >> Idk how much we wanna this, hopefully nobody is allocating large non-temporarily
> >> objects from an irq.
> > 
> > I do not think we want to pretend that remote charging from the IRQ
> > context is supported. Why don't we simply WARN_ON(in_interrupt()) there?
> > 
> 
> How about:
> 
> static inline bool memcg_kmem_bypass(void)
> {
>         if (in_interrupt()) {
>                 WARN_ON(current->active_memcg);
>                 return true;
>         }

Why not simply 

	if (WARN_ON_ONCE(in_interrupt())
		return true;

the idea is that we want to catch any __GFP_ACCOUNT user from IRQ
context because this just doesn't work and we do not plan to support it
for now and foreseeable future. If this is reduced only to active_memcg
then we are only getting a partial picture.
		
> 
>         /* Allow remote memcg charging in kthread contexts. */
>         if ((!current->mm || (current->flags & PF_KTHREAD)) && !current->active_memcg)
>                 return true;
>         return false;
> }
Shakeel Butt May 15, 2020, 4:22 p.m. UTC | #9
On Fri, May 15, 2020 at 1:35 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Fri 15-05-20 16:20:04, Li Zefan wrote:
> > On 2020/5/15 14:56, Michal Hocko wrote:
> > > On Thu 14-05-20 15:52:59, Roman Gushchin wrote:
> [...]
> > >>> I thought the user should ensure not do this, but now I think it makes sense to just bypass
> > >>> the interrupt case.
> > >>
> > >> I think now it's mostly a legacy of the opt-out kernel memory accounting.
> > >> Actually we can relax this requirement by forcibly overcommit the memory cgroup
> > >> if the allocation is happening from the irq context, and punish it afterwards.
> > >> Idk how much we wanna this, hopefully nobody is allocating large non-temporarily
> > >> objects from an irq.
> > >
> > > I do not think we want to pretend that remote charging from the IRQ
> > > context is supported. Why don't we simply WARN_ON(in_interrupt()) there?
> > >
> >
> > How about:
> >
> > static inline bool memcg_kmem_bypass(void)
> > {
> >         if (in_interrupt()) {
> >                 WARN_ON(current->active_memcg);
> >                 return true;
> >         }
>
> Why not simply
>
>         if (WARN_ON_ONCE(in_interrupt())
>                 return true;
>
> the idea is that we want to catch any __GFP_ACCOUNT user from IRQ
> context because this just doesn't work and we do not plan to support it
> for now and foreseeable future. If this is reduced only to active_memcg
> then we are only getting a partial picture.
>

There are SLAB_ACCOUNT kmem caches which do allocations in IRQ context
(see sk_prot_alloc()), so, either we make charging work in IRQ or no
warnings at all.
Roman Gushchin May 15, 2020, 5:31 p.m. UTC | #10
On Fri, May 15, 2020 at 09:22:25AM -0700, Shakeel Butt wrote:
> On Fri, May 15, 2020 at 1:35 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Fri 15-05-20 16:20:04, Li Zefan wrote:
> > > On 2020/5/15 14:56, Michal Hocko wrote:
> > > > On Thu 14-05-20 15:52:59, Roman Gushchin wrote:
> > [...]
> > > >>> I thought the user should ensure not do this, but now I think it makes sense to just bypass
> > > >>> the interrupt case.
> > > >>
> > > >> I think now it's mostly a legacy of the opt-out kernel memory accounting.
> > > >> Actually we can relax this requirement by forcibly overcommit the memory cgroup
> > > >> if the allocation is happening from the irq context, and punish it afterwards.
> > > >> Idk how much we wanna this, hopefully nobody is allocating large non-temporarily
> > > >> objects from an irq.
> > > >
> > > > I do not think we want to pretend that remote charging from the IRQ
> > > > context is supported. Why don't we simply WARN_ON(in_interrupt()) there?
> > > >
> > >
> > > How about:
> > >
> > > static inline bool memcg_kmem_bypass(void)
> > > {
> > >         if (in_interrupt()) {
> > >                 WARN_ON(current->active_memcg);
> > >                 return true;
> > >         }
> >
> > Why not simply
> >
> >         if (WARN_ON_ONCE(in_interrupt())
> >                 return true;
> >
> > the idea is that we want to catch any __GFP_ACCOUNT user from IRQ
> > context because this just doesn't work and we do not plan to support it
> > for now and foreseeable future.

Actually, why not?
It should be fairly simple, especially after the rework of the slab controller.

> If this is reduced only to active_memcg
> > then we are only getting a partial picture.
> >
> 
> There are SLAB_ACCOUNT kmem caches which do allocations in IRQ context
> (see sk_prot_alloc()), so, either we make charging work in IRQ or no
> warnings at all.

I agree. Actually, there is nothing wrong to warn about, it's just a limitation
of the current accounting implementation.

Thanks!
Michal Hocko May 18, 2020, 9:13 a.m. UTC | #11
On Fri 15-05-20 09:22:25, Shakeel Butt wrote:
> On Fri, May 15, 2020 at 1:35 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Fri 15-05-20 16:20:04, Li Zefan wrote:
> > > On 2020/5/15 14:56, Michal Hocko wrote:
> > > > On Thu 14-05-20 15:52:59, Roman Gushchin wrote:
> > [...]
> > > >>> I thought the user should ensure not do this, but now I think it makes sense to just bypass
> > > >>> the interrupt case.
> > > >>
> > > >> I think now it's mostly a legacy of the opt-out kernel memory accounting.
> > > >> Actually we can relax this requirement by forcibly overcommit the memory cgroup
> > > >> if the allocation is happening from the irq context, and punish it afterwards.
> > > >> Idk how much we wanna this, hopefully nobody is allocating large non-temporarily
> > > >> objects from an irq.
> > > >
> > > > I do not think we want to pretend that remote charging from the IRQ
> > > > context is supported. Why don't we simply WARN_ON(in_interrupt()) there?
> > > >
> > >
> > > How about:
> > >
> > > static inline bool memcg_kmem_bypass(void)
> > > {
> > >         if (in_interrupt()) {
> > >                 WARN_ON(current->active_memcg);
> > >                 return true;
> > >         }
> >
> > Why not simply
> >
> >         if (WARN_ON_ONCE(in_interrupt())
> >                 return true;
> >
> > the idea is that we want to catch any __GFP_ACCOUNT user from IRQ
> > context because this just doesn't work and we do not plan to support it
> > for now and foreseeable future. If this is reduced only to active_memcg
> > then we are only getting a partial picture.
> >
> 
> There are SLAB_ACCOUNT kmem caches which do allocations in IRQ context
> (see sk_prot_alloc()), so, either we make charging work in IRQ or no
> warnings at all.

OK, I see. I wasn't aware that those caches are used from IRQ context.
diff mbox series

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a3b97f1..43a12ed 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2802,6 +2802,9 @@  static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
 
 static inline bool memcg_kmem_bypass(void)
 {
+	/* Allow remote memcg charging in kthread contexts. */
+	if (unlikely(current->active_memcg))
+		return false;
 	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
 		return true;
 	return false;