[v2] kmemleak: survive in a low-memory situation

Message ID	20190102180619.12392-1-cai@lca.pw (mailing list archive)
State	New, archived
Headers	show Return-Path: <owner-linux-mm@kvack.org> Received-SPF: pass (google.com: domain of cai@lca.pw designates 209.85.220.65 as permitted sender) client-ip=209.85.220.65; From: Qian Cai <cai@lca.pw> To: catalin.marinas@arm.com Cc: akpm@linux-foundation.org, cl@linux.com, penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Qian Cai <cai@lca.pw> Subject: [PATCH v2] kmemleak: survive in a low-memory situation Date: Wed, 2 Jan 2019 13:06:19 -0500 Message-Id: <20190102180619.12392-1-cai@lca.pw> In-Reply-To: <20190102165931.GB6584@arrakis.emea.arm.com> References: <20190102165931.GB6584@arrakis.emea.arm.com> Sender: owner-linux-mm@kvack.org Precedence: bulk
Series	[v2] kmemleak: survive in a low-memory situation \| expand [v2] kmemleak: survive in a low-memory situation

Qian Cai Jan. 2, 2019, 6:06 p.m. UTC

Kmemleak could quickly fail to allocate an object structure and then
disable itself in a low-memory situation. For example, running a mmap()
workload triggering swapping and OOM [1].

Kmemleak allocation could fail even though the trackig object is
succeeded. Hence, it could still try to start a direct reclaim if it is
not executed in an atomic context (spinlock, irq-handler etc), or a
high-priority allocation in an atomic context as a last-ditch effort.

[1]
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c

Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: remove the needless checking for NULL objects in slab_post_alloc_hook()
    pointed out by Catalin.

 mm/kmemleak.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Michal Hocko Jan. 3, 2019, 9:32 a.m. UTC | #1

On Wed 02-01-19 13:06:19, Qian Cai wrote:
[...]
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index f9d9dc250428..9e1aa3b7df75 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>  	struct rb_node **link, *rb_parent;
>  
>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> +#ifdef CONFIG_PREEMPT_COUNT
> +	if (!object) {
> +		/* last-ditch effort in a low-memory situation */
> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
> +			gfp = GFP_ATOMIC;
> +		else
> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
> +		object = kmem_cache_alloc(object_cache, gfp);
> +	}
> +#endif

I do not get it. How can this possibly help when gfp_kmemleak_mask()
adds __GFP_NOFAIL modifier to the given gfp mask? Or is this not the
case anymore in some tree?

Qian Cai Jan. 3, 2019, 4:51 p.m. UTC | #2

On 1/3/19 4:32 AM, Michal Hocko wrote:
> On Wed 02-01-19 13:06:19, Qian Cai wrote:
> [...]
>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>> index f9d9dc250428..9e1aa3b7df75 100644
>> --- a/mm/kmemleak.c
>> +++ b/mm/kmemleak.c
>> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>>  	struct rb_node **link, *rb_parent;
>>  
>>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>> +#ifdef CONFIG_PREEMPT_COUNT
>> +	if (!object) {
>> +		/* last-ditch effort in a low-memory situation */
>> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
>> +			gfp = GFP_ATOMIC;
>> +		else
>> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
>> +		object = kmem_cache_alloc(object_cache, gfp);
>> +	}
>> +#endif
> 
> I do not get it. How can this possibly help when gfp_kmemleak_mask()
> adds __GFP_NOFAIL modifier to the given gfp mask? Or is this not the
> case anymore in some tree?

Well, __GFP_NOFAIL can still fail easily without __GFP_DIRECT_RECLAIM in a
low-memory situation.

__alloc_pages_slowpath():

/* Caller is not willing to reclaim, we can't balance anything */
if (!can_direct_reclaim)
	goto nopage;

nopage:

/*
 * All existing users of the __GFP_NOFAIL are blockable, so
 * warn of any new users that actually require GFP_NOWAIT
 */
if (WARN_ON_ONCE(!can_direct_reclaim))
	goto fail;

Michal Hocko Jan. 3, 2019, 5:07 p.m. UTC | #3

On Thu 03-01-19 11:51:57, Qian Cai wrote:
> On 1/3/19 4:32 AM, Michal Hocko wrote:
> > On Wed 02-01-19 13:06:19, Qian Cai wrote:
> > [...]
> >> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> >> index f9d9dc250428..9e1aa3b7df75 100644
> >> --- a/mm/kmemleak.c
> >> +++ b/mm/kmemleak.c
> >> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
> >>  	struct rb_node **link, *rb_parent;
> >>  
> >>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> >> +#ifdef CONFIG_PREEMPT_COUNT
> >> +	if (!object) {
> >> +		/* last-ditch effort in a low-memory situation */
> >> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
> >> +			gfp = GFP_ATOMIC;
> >> +		else
> >> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
> >> +		object = kmem_cache_alloc(object_cache, gfp);
> >> +	}
> >> +#endif
> > 
> > I do not get it. How can this possibly help when gfp_kmemleak_mask()
> > adds __GFP_NOFAIL modifier to the given gfp mask? Or is this not the
> > case anymore in some tree?
> 
> Well, __GFP_NOFAIL can still fail easily without __GFP_DIRECT_RECLAIM in a
> low-memory situation.

OK, I guess I understand now. So the issue is that a (general) atomic
allocation will provide its gfp mask down to kmemleak and you are
trying/hoping that if the allocation is no from an atomic context then
you can fortify it by using a sleepable allocation for the kmemleak
metadata or giving it access to memory reserves for atomic allocations.

I think this is still fragile because most atomic allocations are for a
good reason. As I've said earlier the current implementation which
abuses __GFP_NOFAIL is fra from great and we have discussed some
alternatives. Not sure whan came out of it.

I will not object to this workaround but I strongly believe that
kmemleak should rethink the metadata allocation strategy to be really
robust.

Catalin Marinas Jan. 7, 2019, 10:43 a.m. UTC | #4

On Thu, Jan 03, 2019 at 06:07:35PM +0100, Michal Hocko wrote:
> > > On Wed 02-01-19 13:06:19, Qian Cai wrote:
> > > [...]
> > >> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > >> index f9d9dc250428..9e1aa3b7df75 100644
> > >> --- a/mm/kmemleak.c
> > >> +++ b/mm/kmemleak.c
> > >> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
> > >>  	struct rb_node **link, *rb_parent;
> > >>  
> > >>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> > >> +#ifdef CONFIG_PREEMPT_COUNT
> > >> +	if (!object) {
> > >> +		/* last-ditch effort in a low-memory situation */
> > >> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
> > >> +			gfp = GFP_ATOMIC;
> > >> +		else
> > >> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
> > >> +		object = kmem_cache_alloc(object_cache, gfp);
> > >> +	}
> > >> +#endif
[...]
> I will not object to this workaround but I strongly believe that
> kmemleak should rethink the metadata allocation strategy to be really
> robust.

This would be nice indeed and it was discussed last year. I just haven't
got around to trying anything yet:

https://marc.info/?l=linux-mm&m=152812489819532

Qian Cai Jan. 8, 2019, 2:06 a.m. UTC | #5

On 1/7/19 5:43 AM, Catalin Marinas wrote:
> On Thu, Jan 03, 2019 at 06:07:35PM +0100, Michal Hocko wrote:
>>>> On Wed 02-01-19 13:06:19, Qian Cai wrote:
>>>> [...]
>>>>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>>>>> index f9d9dc250428..9e1aa3b7df75 100644
>>>>> --- a/mm/kmemleak.c
>>>>> +++ b/mm/kmemleak.c
>>>>> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>>>>>  	struct rb_node **link, *rb_parent;
>>>>>  
>>>>>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>>>>> +#ifdef CONFIG_PREEMPT_COUNT
>>>>> +	if (!object) {
>>>>> +		/* last-ditch effort in a low-memory situation */
>>>>> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
>>>>> +			gfp = GFP_ATOMIC;
>>>>> +		else
>>>>> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
>>>>> +		object = kmem_cache_alloc(object_cache, gfp);
>>>>> +	}
>>>>> +#endif
> [...]
>> I will not object to this workaround but I strongly believe that
>> kmemleak should rethink the metadata allocation strategy to be really
>> robust.
> 
> This would be nice indeed and it was discussed last year. I just haven't
> got around to trying anything yet:
> 
> https://marc.info/?l=linux-mm&m=152812489819532
> 

It could be helpful to apply this 10-line patch first if has no fundamental
issue, as it survives probably 50 times running LTP oom* workloads without a
single kmemleak allocation failure.

Of course, if someone is going to embed kmemleak metadata into slab objects
itself soon, this workaround is not needed.

Qian Cai Jan. 8, 2019, 3:49 a.m. UTC | #6

On 1/7/19 9:06 PM, Qian Cai wrote:
> 
> 
> On 1/7/19 5:43 AM, Catalin Marinas wrote:
>> On Thu, Jan 03, 2019 at 06:07:35PM +0100, Michal Hocko wrote:
>>>>> On Wed 02-01-19 13:06:19, Qian Cai wrote:
>>>>> [...]
>>>>>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>>>>>> index f9d9dc250428..9e1aa3b7df75 100644
>>>>>> --- a/mm/kmemleak.c
>>>>>> +++ b/mm/kmemleak.c
>>>>>> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>>>>>>  	struct rb_node **link, *rb_parent;
>>>>>>  
>>>>>>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>>>>>> +#ifdef CONFIG_PREEMPT_COUNT
>>>>>> +	if (!object) {
>>>>>> +		/* last-ditch effort in a low-memory situation */
>>>>>> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
>>>>>> +			gfp = GFP_ATOMIC;
>>>>>> +		else
>>>>>> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
>>>>>> +		object = kmem_cache_alloc(object_cache, gfp);
>>>>>> +	}
>>>>>> +#endif
>> [...]
>>> I will not object to this workaround but I strongly believe that
>>> kmemleak should rethink the metadata allocation strategy to be really
>>> robust.
>>
>> This would be nice indeed and it was discussed last year. I just haven't
>> got around to trying anything yet:
>>
>> https://marc.info/?l=linux-mm&m=152812489819532
>>
> 
> It could be helpful to apply this 10-line patch first if has no fundamental
> issue, as it survives probably 50 times running LTP oom* workloads without a
> single kmemleak allocation failure.
> 
> Of course, if someone is going to embed kmemleak metadata into slab objects
> itself soon, this workaround is not needed.
> 

Well, it is really hard to tell even if someone get eventually redesign kmemleak
to embed the metadata into slab objects alone would survive LTP oom* workloads,
because it seems still use separate metadata for non-slab objects where kmemleak
allocation could fail like it right now and disable itself again.

[v2] kmemleak: survive in a low-memory situation

Commit Message

Comments

Patch