Message ID | 20190102180619.12392-1-cai@lca.pw (mailing list archive)
State      | New, archived
Series     | [v2] kmemleak: survive in a low-memory situation
On Wed 02-01-19 13:06:19, Qian Cai wrote:
[...]
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index f9d9dc250428..9e1aa3b7df75 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>  	struct rb_node **link, *rb_parent;
>
>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> +#ifdef CONFIG_PREEMPT_COUNT
> +	if (!object) {
> +		/* last-ditch effort in a low-memory situation */
> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
> +			gfp = GFP_ATOMIC;
> +		else
> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
> +		object = kmem_cache_alloc(object_cache, gfp);
> +	}
> +#endif

I do not get it. How can this possibly help when gfp_kmemleak_mask()
adds __GFP_NOFAIL modifier to the given gfp mask? Or is this not the
case anymore in some tree?
On 1/3/19 4:32 AM, Michal Hocko wrote:
> On Wed 02-01-19 13:06:19, Qian Cai wrote:
> [...]
>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>> index f9d9dc250428..9e1aa3b7df75 100644
>> --- a/mm/kmemleak.c
>> +++ b/mm/kmemleak.c
>> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>>  	struct rb_node **link, *rb_parent;
>>
>>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>> +#ifdef CONFIG_PREEMPT_COUNT
>> +	if (!object) {
>> +		/* last-ditch effort in a low-memory situation */
>> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
>> +			gfp = GFP_ATOMIC;
>> +		else
>> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
>> +		object = kmem_cache_alloc(object_cache, gfp);
>> +	}
>> +#endif
>
> I do not get it. How can this possibly help when gfp_kmemleak_mask()
> adds __GFP_NOFAIL modifier to the given gfp mask? Or is this not the
> case anymore in some tree?

Well, __GFP_NOFAIL can still fail easily without __GFP_DIRECT_RECLAIM in
a low-memory situation.

__alloc_pages_slowpath():

	/* Caller is not willing to reclaim, we can't balance anything */
	if (!can_direct_reclaim)
		goto nopage;

nopage:
	/*
	 * All existing users of the __GFP_NOFAIL are blockable, so
	 * warn of any new users that actually require GFP_NOWAIT
	 */
	if (WARN_ON_ONCE(!can_direct_reclaim))
		goto fail;
On Thu 03-01-19 11:51:57, Qian Cai wrote:
> On 1/3/19 4:32 AM, Michal Hocko wrote:
> > On Wed 02-01-19 13:06:19, Qian Cai wrote:
> > [...]
> >> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> >> index f9d9dc250428..9e1aa3b7df75 100644
> >> --- a/mm/kmemleak.c
> >> +++ b/mm/kmemleak.c
> >> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
> >>  	struct rb_node **link, *rb_parent;
> >>
> >>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> >> +#ifdef CONFIG_PREEMPT_COUNT
> >> +	if (!object) {
> >> +		/* last-ditch effort in a low-memory situation */
> >> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
> >> +			gfp = GFP_ATOMIC;
> >> +		else
> >> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
> >> +		object = kmem_cache_alloc(object_cache, gfp);
> >> +	}
> >> +#endif
> >
> > I do not get it. How can this possibly help when gfp_kmemleak_mask()
> > adds __GFP_NOFAIL modifier to the given gfp mask? Or is this not the
> > case anymore in some tree?
>
> Well, __GFP_NOFAIL can still fail easily without __GFP_DIRECT_RECLAIM in
> a low-memory situation.

OK, I guess I understand now. So the issue is that a (general) atomic
allocation will pass its gfp mask down to kmemleak, and you are
trying/hoping that, if the allocation is not from an atomic context, you
can fortify it by using a sleepable allocation for the kmemleak
metadata, or otherwise by giving it access to the memory reserves for
atomic allocations. I think this is still fragile, because most atomic
allocations are atomic for a good reason.

As I have said earlier, the current implementation, which abuses
__GFP_NOFAIL, is far from great, and we have discussed some
alternatives. I am not sure what came out of it. I will not object to
this workaround, but I strongly believe that kmemleak should rethink the
metadata allocation strategy to be really robust.
On Thu, Jan 03, 2019 at 06:07:35PM +0100, Michal Hocko wrote:
> > > On Wed 02-01-19 13:06:19, Qian Cai wrote:
> > > [...]
> > >> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > >> index f9d9dc250428..9e1aa3b7df75 100644
> > >> --- a/mm/kmemleak.c
> > >> +++ b/mm/kmemleak.c
> > >> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
> > >>  	struct rb_node **link, *rb_parent;
> > >>
> > >>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> > >> +#ifdef CONFIG_PREEMPT_COUNT
> > >> +	if (!object) {
> > >> +		/* last-ditch effort in a low-memory situation */
> > >> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
> > >> +			gfp = GFP_ATOMIC;
> > >> +		else
> > >> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
> > >> +		object = kmem_cache_alloc(object_cache, gfp);
> > >> +	}
> > >> +#endif
[...]
> I will not object to this workaround but I strongly believe that
> kmemleak should rethink the metadata allocation strategy to be really
> robust.

This would be nice indeed and it was discussed last year. I just haven't
got around to trying anything yet:

https://marc.info/?l=linux-mm&m=152812489819532
On 1/7/19 5:43 AM, Catalin Marinas wrote:
> On Thu, Jan 03, 2019 at 06:07:35PM +0100, Michal Hocko wrote:
>>>> On Wed 02-01-19 13:06:19, Qian Cai wrote:
>>>> [...]
>>>>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>>>>> index f9d9dc250428..9e1aa3b7df75 100644
>>>>> --- a/mm/kmemleak.c
>>>>> +++ b/mm/kmemleak.c
>>>>> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>>>>>  	struct rb_node **link, *rb_parent;
>>>>>
>>>>>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>>>>> +#ifdef CONFIG_PREEMPT_COUNT
>>>>> +	if (!object) {
>>>>> +		/* last-ditch effort in a low-memory situation */
>>>>> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
>>>>> +			gfp = GFP_ATOMIC;
>>>>> +		else
>>>>> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
>>>>> +		object = kmem_cache_alloc(object_cache, gfp);
>>>>> +	}
>>>>> +#endif
> [...]
>> I will not object to this workaround but I strongly believe that
>> kmemleak should rethink the metadata allocation strategy to be really
>> robust.
>
> This would be nice indeed and it was discussed last year. I just haven't
> got around to trying anything yet:
>
> https://marc.info/?l=linux-mm&m=152812489819532

It could be helpful to apply this 10-line patch first if it has no
fundamental issue, as it survives around 50 runs of the LTP oom*
workloads without a single kmemleak allocation failure.

Of course, if someone is going to embed the kmemleak metadata into the
slab objects themselves soon, this workaround is not needed.
On 1/7/19 9:06 PM, Qian Cai wrote:
> On 1/7/19 5:43 AM, Catalin Marinas wrote:
>> On Thu, Jan 03, 2019 at 06:07:35PM +0100, Michal Hocko wrote:
>>>>> On Wed 02-01-19 13:06:19, Qian Cai wrote:
>>>>> [...]
>>>>>> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
>>>>>> index f9d9dc250428..9e1aa3b7df75 100644
>>>>>> --- a/mm/kmemleak.c
>>>>>> +++ b/mm/kmemleak.c
>>>>>> @@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>>>>>>  	struct rb_node **link, *rb_parent;
>>>>>>
>>>>>>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>>>>>> +#ifdef CONFIG_PREEMPT_COUNT
>>>>>> +	if (!object) {
>>>>>> +		/* last-ditch effort in a low-memory situation */
>>>>>> +		if (irqs_disabled() || is_idle_task(current) || in_atomic())
>>>>>> +			gfp = GFP_ATOMIC;
>>>>>> +		else
>>>>>> +			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
>>>>>> +		object = kmem_cache_alloc(object_cache, gfp);
>>>>>> +	}
>>>>>> +#endif
>> [...]
>>> I will not object to this workaround but I strongly believe that
>>> kmemleak should rethink the metadata allocation strategy to be really
>>> robust.
>>
>> This would be nice indeed and it was discussed last year. I just haven't
>> got around to trying anything yet:
>>
>> https://marc.info/?l=linux-mm&m=152812489819532
>
> It could be helpful to apply this 10-line patch first if has no fundamental
> issue, as it survives probably 50 times running LTP oom* workloads without a
> single kmemleak allocation failure.
>
> Of course, if someone is going to embed kmemleak metadata into slab objects
> itself soon, this workaround is not needed.

Well, it is really hard to tell whether eventually redesigning kmemleak
to embed the metadata into the slab objects alone would survive the LTP
oom* workloads, because it would seemingly still use separately
allocated metadata for non-slab objects, where the kmemleak allocation
could fail just as it does now and disable kmemleak again.
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index f9d9dc250428..9e1aa3b7df75 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -576,6 +576,16 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
 	struct rb_node **link, *rb_parent;
 
 	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
+#ifdef CONFIG_PREEMPT_COUNT
+	if (!object) {
+		/* last-ditch effort in a low-memory situation */
+		if (irqs_disabled() || is_idle_task(current) || in_atomic())
+			gfp = GFP_ATOMIC;
+		else
+			gfp = gfp_kmemleak_mask(gfp) | __GFP_DIRECT_RECLAIM;
+		object = kmem_cache_alloc(object_cache, gfp);
+	}
+#endif
 	if (!object) {
 		pr_warn("Cannot allocate a kmemleak_object structure\n");
 		kmemleak_disable();
Kmemleak can quickly fail to allocate an object structure and then
disable itself in a low-memory situation, for example, when running a
mmap() workload that triggers swapping and OOM [1]. The kmemleak
metadata allocation can fail even though the allocation of the tracked
object itself succeeded. Hence, as a last-ditch effort, kmemleak can
still try a direct reclaim if it is not running in an atomic context
(spinlock, irq handler, etc.), or a high-priority GFP_ATOMIC allocation
if it is in an atomic context.

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c

Signed-off-by: Qian Cai <cai@lca.pw>
---

v2: remove the needless checking for NULL objects in
    slab_post_alloc_hook() pointed out by Catalin.

 mm/kmemleak.c | 10 ++++++++++
 1 file changed, 10 insertions(+)