Message ID | 1562964544-59519-1-git-send-email-yang.shi@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm: page_alloc: document kmemleak's non-blockable __GFP_NOFAIL case | expand |
On Sat, 13 Jul 2019, Yang Shi wrote: > When running ltp's oom test with kmemleak enabled, the below warning was > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is > passed in: > > WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50 > Modules linked in: loop dax_pmem dax_pmem_core > ip_tables x_tables xfs virtio_net net_failover virtio_blk failover > ata_generic virtio_pci virtio_ring virtio libata > CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 > RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50 > ... > kmemleak_alloc+0x4e/0xb0 > kmem_cache_alloc+0x2a7/0x3e0 > ? __kmalloc+0x1d6/0x470 > ? ___might_sleep+0x9c/0x170 > ? mempool_alloc+0x2b0/0x2b0 > mempool_alloc_slab+0x2d/0x40 > mempool_alloc+0x118/0x2b0 > ? __kasan_check_read+0x11/0x20 > ? mempool_resize+0x390/0x390 > ? lock_downgrade+0x3c0/0x3c0 > bio_alloc_bioset+0x19d/0x350 > ? __swap_duplicate+0x161/0x240 > ? bvec_alloc+0x1b0/0x1b0 > ? do_raw_spin_unlock+0xa8/0x140 > ? _raw_spin_unlock+0x27/0x40 > get_swap_bio+0x80/0x230 > ? __x64_sys_madvise+0x50/0x50 > ? end_swap_bio_read+0x310/0x310 > ? __kasan_check_read+0x11/0x20 > ? check_chain_key+0x24e/0x300 > ? bdev_write_page+0x55/0x130 > __swap_writepage+0x5ff/0xb20 > > The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has > __GFP_NOFAIL set all the time due to commit > d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist > with fault injection"). > It only clears __GFP_DIRECT_RECLAIM provisionally to see if the allocation would immediately succeed before falling back to the elements in the mempool. If that fails, and the mempool is empty, mempool_alloc() attempts the allocation with __GFP_DIRECT_RECLAIM. So for the problem described here, I think what we really want is this: diff --git a/mm/mempool.c b/mm/mempool.c --- a/mm/mempool.c +++ b/mm/mempool.c @@ -386,7 +386,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */ gfp_mask |= __GFP_NOWARN; /* failures are OK */ - gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO); + gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO|__GFP_NOFAIL); repeat_alloc: But bio_alloc_bioset() plays with gfp_mask itself: are we sure that it isn't the one clearing __GFP_DIRECT_RECLAIM itself before falling back to saved_gfp? In other words do we also want this? diff --git a/block/bio.c b/block/bio.c --- a/block/bio.c +++ b/block/bio.c @@ -462,16 +462,16 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs, * We solve this, and guarantee forward progress, with a rescuer * workqueue per bio_set. If we go to allocate and there are * bios on current->bio_list, we first try the allocation - * without __GFP_DIRECT_RECLAIM; if that fails, we punt those - * bios we would be blocking to the rescuer workqueue before - * we retry with the original gfp_flags. + * without __GFP_DIRECT_RECLAIM or __GFP_NOFAIL; if that fails, + * we punt those bios we would be blocking to the rescuer + * workqueue before we retry with the original gfp_flags. */ - if (current->bio_list && (!bio_list_empty(¤t->bio_list[0]) || !bio_list_empty(¤t->bio_list[1])) && bs->rescue_workqueue) - gfp_mask &= ~__GFP_DIRECT_RECLAIM; + gfp_mask &= ~(__GFP_DIRECT_RECLAIM | + __GFP_NOFAIL); p = mempool_alloc(&bs->bio_pool, gfp_mask); if (!p && gfp_mask != saved_gfp) {
On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote: > When running ltp's oom test with kmemleak enabled, the below warning was > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is > passed in: There are lots of places where kmemleak will call kmalloc with __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which is how I know about it). It needs to be fixed to allow its internal allocations to fail and return failure of the original allocation as a consequence.
On 7/13/19 12:39 PM, David Rientjes wrote: > On Sat, 13 Jul 2019, Yang Shi wrote: > >> When running ltp's oom test with kmemleak enabled, the below warning was >> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is >> passed in: >> >> WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50 >> Modules linked in: loop dax_pmem dax_pmem_core >> ip_tables x_tables xfs virtio_net net_failover virtio_blk failover >> ata_generic virtio_pci virtio_ring virtio libata >> CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 >> RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50 >> ... >> kmemleak_alloc+0x4e/0xb0 >> kmem_cache_alloc+0x2a7/0x3e0 >> ? __kmalloc+0x1d6/0x470 >> ? ___might_sleep+0x9c/0x170 >> ? mempool_alloc+0x2b0/0x2b0 >> mempool_alloc_slab+0x2d/0x40 >> mempool_alloc+0x118/0x2b0 >> ? __kasan_check_read+0x11/0x20 >> ? mempool_resize+0x390/0x390 >> ? lock_downgrade+0x3c0/0x3c0 >> bio_alloc_bioset+0x19d/0x350 >> ? __swap_duplicate+0x161/0x240 >> ? bvec_alloc+0x1b0/0x1b0 >> ? do_raw_spin_unlock+0xa8/0x140 >> ? _raw_spin_unlock+0x27/0x40 >> get_swap_bio+0x80/0x230 >> ? __x64_sys_madvise+0x50/0x50 >> ? end_swap_bio_read+0x310/0x310 >> ? __kasan_check_read+0x11/0x20 >> ? check_chain_key+0x24e/0x300 >> ? bdev_write_page+0x55/0x130 >> __swap_writepage+0x5ff/0xb20 >> >> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has >> __GFP_NOFAIL set all the time due to commit >> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist >> with fault injection"). >> > It only clears __GFP_DIRECT_RECLAIM provisionally to see if the allocation > would immediately succeed before falling back to the elements in the > mempool. If that fails, and the mempool is empty, mempool_alloc() > attempts the allocation with __GFP_DIRECT_RECLAIM. So for the problem > described here, I think what we really want is this: > > diff --git a/mm/mempool.c b/mm/mempool.c > --- a/mm/mempool.c > +++ b/mm/mempool.c > @@ -386,7 +386,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) > gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */ > gfp_mask |= __GFP_NOWARN; /* failures are OK */ > > - gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO); > + gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO|__GFP_NOFAIL); > > repeat_alloc: > > But bio_alloc_bioset() plays with gfp_mask itself: are we sure that it > isn't the one clearing __GFP_DIRECT_RECLAIM itself before falling back to > saved_gfp? > > In other words do we also want this? > > diff --git a/block/bio.c b/block/bio.c > --- a/block/bio.c > +++ b/block/bio.c > @@ -462,16 +462,16 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs, > * We solve this, and guarantee forward progress, with a rescuer > * workqueue per bio_set. If we go to allocate and there are > * bios on current->bio_list, we first try the allocation > - * without __GFP_DIRECT_RECLAIM; if that fails, we punt those > - * bios we would be blocking to the rescuer workqueue before > - * we retry with the original gfp_flags. > + * without __GFP_DIRECT_RECLAIM or __GFP_NOFAIL; if that fails, > + * we punt those bios we would be blocking to the rescuer > + * workqueue before we retry with the original gfp_flags. > */ > - > if (current->bio_list && > (!bio_list_empty(¤t->bio_list[0]) || > !bio_list_empty(¤t->bio_list[1])) && > bs->rescue_workqueue) > - gfp_mask &= ~__GFP_DIRECT_RECLAIM; > + gfp_mask &= ~(__GFP_DIRECT_RECLAIM | > + __GFP_NOFAIL); > > p = mempool_alloc(&bs->bio_pool, gfp_mask); > if (!p && gfp_mask != saved_gfp) { I don't think it will make any difference by removing __GFP_NOFAIL outside kmemleak. The problem is the commit d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist with fault injection") makes __GFP_NOFAIL is set for kmemleak always in order to turn off fault-injection for kmemleak. As long as kmemleak is called in ~__GFP_DIRECT_RECLAIM path, the warning might be hit. And since kmemleak is just a debugging tool, so IMHO I don't think this is worth fixing, so I came up with the patch to document it.
On 7/13/19 2:25 PM, Matthew Wilcox wrote: > On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote: >> When running ltp's oom test with kmemleak enabled, the below warning was >> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is >> passed in: > There are lots of places where kmemleak will call kmalloc with > __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which > is how I know about it). It needs to be fixed to allow its internal > allocations to fail and return failure of the original allocation as > a consequence. Do you mean kmemleak internal allocation? It would fail even though __GFP_NOFAIL is passed in if GFP_NOWAIT is specified. Currently buddy allocator will not retry if the allocation is non-blockable.
On Sun, Jul 14, 2019 at 08:47:07PM -0700, Yang Shi wrote: > > > On 7/13/19 2:25 PM, Matthew Wilcox wrote: > > On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote: > > > When running ltp's oom test with kmemleak enabled, the below warning was > > > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is > > > passed in: > > There are lots of places where kmemleak will call kmalloc with > > __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which > > is how I know about it). It needs to be fixed to allow its internal > > allocations to fail and return failure of the original allocation as > > a consequence. > > Do you mean kmemleak internal allocation? It would fail even though > __GFP_NOFAIL is passed in if GFP_NOWAIT is specified. Currently buddy > allocator will not retry if the allocation is non-blockable. Actually it sets off a warning. Which is the right response from the core mm code because specifying __GFP_NOFAIL and __GFP_NOWAIT makes no sense.
On Sat 13-07-19 04:49:04, Yang Shi wrote: > When running ltp's oom test with kmemleak enabled, the below warning was > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is > passed in: kmemleak is broken and this is a long term issue. I thought that Catalin had something to address this. While this patch only adds a comment and discourages future changes of the warning which is fine and probably something that we should do, kmemleak really should be fixed sooner than later.
On Sat 13-07-19 12:39:16, David Rientjes wrote: > On Sat, 13 Jul 2019, Yang Shi wrote: > > > When running ltp's oom test with kmemleak enabled, the below warning was > > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is > > passed in: > > > > WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50 > > Modules linked in: loop dax_pmem dax_pmem_core > > ip_tables x_tables xfs virtio_net net_failover virtio_blk failover > > ata_generic virtio_pci virtio_ring virtio libata > > CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 > > RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50 > > ... > > kmemleak_alloc+0x4e/0xb0 > > kmem_cache_alloc+0x2a7/0x3e0 > > ? __kmalloc+0x1d6/0x470 > > ? ___might_sleep+0x9c/0x170 > > ? mempool_alloc+0x2b0/0x2b0 > > mempool_alloc_slab+0x2d/0x40 > > mempool_alloc+0x118/0x2b0 > > ? __kasan_check_read+0x11/0x20 > > ? mempool_resize+0x390/0x390 > > ? lock_downgrade+0x3c0/0x3c0 > > bio_alloc_bioset+0x19d/0x350 > > ? __swap_duplicate+0x161/0x240 > > ? bvec_alloc+0x1b0/0x1b0 > > ? do_raw_spin_unlock+0xa8/0x140 > > ? _raw_spin_unlock+0x27/0x40 > > get_swap_bio+0x80/0x230 > > ? __x64_sys_madvise+0x50/0x50 > > ? end_swap_bio_read+0x310/0x310 > > ? __kasan_check_read+0x11/0x20 > > ? check_chain_key+0x24e/0x300 > > ? bdev_write_page+0x55/0x130 > > __swap_writepage+0x5ff/0xb20 > > > > The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has > > __GFP_NOFAIL set all the time due to commit > > d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist > > with fault injection"). > > > > It only clears __GFP_DIRECT_RECLAIM provisionally to see if the allocation > would immediately succeed before falling back to the elements in the > mempool. If that fails, and the mempool is empty, mempool_alloc() > attempts the allocation with __GFP_DIRECT_RECLAIM. So for the problem > described here, I think what we really want is this: > > diff --git a/mm/mempool.c b/mm/mempool.c > --- a/mm/mempool.c > +++ b/mm/mempool.c > @@ -386,7 +386,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask) > gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */ > gfp_mask |= __GFP_NOWARN; /* failures are OK */ > > - gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO); > + gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO|__GFP_NOFAIL); > > repeat_alloc: No, I do not think we should make mempool allocator more complex for something that is an implementation problem the kmemleak.
On 15 Jul 2019, at 08:17, Michal Hocko <mhocko@kernel.org> wrote: > On Sat 13-07-19 04:49:04, Yang Shi wrote: >> When running ltp's oom test with kmemleak enabled, the below warning was >> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is >> passed in: > > kmemleak is broken and this is a long term issue. I thought that > Catalin had something to address this. What needs to be done in the short term is revert commit d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term the solution is to embed kmemleak metadata into the slab so that we don’t have the situation where the primary slab allocation success but the kmemleak metadata fails. I’m on holiday for one more week with just a phone to reply from but feel free to revert the above commit. I’ll follow up with a better solution. Catalin
On Mon, 2019-07-15 at 10:01 -0500, Catalin Marinas wrote: > On 15 Jul 2019, at 08:17, Michal Hocko <mhocko@kernel.org> wrote: > > On Sat 13-07-19 04:49:04, Yang Shi wrote: > > > When running ltp's oom test with kmemleak enabled, the below warning was > > > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is > > > passed in: > > > > kmemleak is broken and this is a long term issue. I thought that > > Catalin had something to address this. > > What needs to be done in the short term is revert commit > d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term the solution is to embed > kmemleak metadata into the slab so that we don’t have the situation where the > primary slab allocation success but the kmemleak metadata fails. > > I’m on holiday for one more week with just a phone to reply from but feel free > to revert the above commit. I’ll follow up with a better solution. Well, the reverting will only make the situation worst for the kmemleak under memory pressure. In the meantime, if someone wants to push for the mempool solution with tunable pool sizes along with the reverting, that could be an improvement. https://lore.kernel.org/linux-mm/20190328145917.GC10283@arrakis.emea.arm.com/
On 7/15/19 8:18 AM, Qian Cai wrote: > On Mon, 2019-07-15 at 10:01 -0500, Catalin Marinas wrote: >> On 15 Jul 2019, at 08:17, Michal Hocko <mhocko@kernel.org> wrote: >>> On Sat 13-07-19 04:49:04, Yang Shi wrote: >>>> When running ltp's oom test with kmemleak enabled, the below warning was >>>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is >>>> passed in: >>> kmemleak is broken and this is a long term issue. I thought that >>> Catalin had something to address this. >> What needs to be done in the short term is revert commit >> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term the solution is to embed >> kmemleak metadata into the slab so that we don’t have the situation where the >> primary slab allocation success but the kmemleak metadata fails. >> >> I’m on holiday for one more week with just a phone to reply from but feel free >> to revert the above commit. I’ll follow up with a better solution. > Well, the reverting will only make the situation worst for the kmemleak under > memory pressure. In the meantime, if someone wants to push for the mempool I think this is expected by reverting that commit since kmemleak metadata could fail. But, it could fail too even though that commit is not reverted if the context is non-blockable. > solution with tunable pool sizes along with the reverting, that could be an > improvement. > > https://lore.kernel.org/linux-mm/20190328145917.GC10283@arrakis.emea.arm.com/
On 7/15/19 6:06 AM, Matthew Wilcox wrote: > On Sun, Jul 14, 2019 at 08:47:07PM -0700, Yang Shi wrote: >> >> On 7/13/19 2:25 PM, Matthew Wilcox wrote: >>> On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote: >>>> When running ltp's oom test with kmemleak enabled, the below warning was >>>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is >>>> passed in: >>> There are lots of places where kmemleak will call kmalloc with >>> __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which >>> is how I know about it). It needs to be fixed to allow its internal >>> allocations to fail and return failure of the original allocation as >>> a consequence. >> Do you mean kmemleak internal allocation? It would fail even though >> __GFP_NOFAIL is passed in if GFP_NOWAIT is specified. Currently buddy >> allocator will not retry if the allocation is non-blockable. > Actually it sets off a warning. Which is the right response from the > core mm code because specifying __GFP_NOFAIL and __GFP_NOWAIT makes no > sense. Yes, this is what I meant. Kmemleak did a trick to fool fault-injection by passing in __GFP_NOFAIL, but it doesn't make sense for non-blockable allocation.
On 7/15/19 8:01 AM, Catalin Marinas wrote: > On 15 Jul 2019, at 08:17, Michal Hocko <mhocko@kernel.org> wrote: >> On Sat 13-07-19 04:49:04, Yang Shi wrote: >>> When running ltp's oom test with kmemleak enabled, the below warning was >>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is >>> passed in: >> kmemleak is broken and this is a long term issue. I thought that >> Catalin had something to address this. > What needs to be done in the short term is revert commit d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term the solution is to embed kmemleak metadata into the slab so that we don’t have the situation where the primary slab allocation success but the kmemleak metadata fails. > > I’m on holiday for one more week with just a phone to reply from but feel free to revert the above commit. I’ll follow up with a better solution. Thanks, I'm going to submit a new patch to revert that commit. Yang > > Catalin
On Sat, 13 Jul 2019 04:49:04 +0800 Yang Shi <yang.shi@linux.alibaba.com> wrote: > When running ltp's oom test with kmemleak enabled, the below warning was > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is > passed in: > > ... > > The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has > __GFP_NOFAIL set all the time due to commit > d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist > with fault injection"). > > The fault-injection would not try to fail slab or page allocation if > __GFP_NOFAIL is used and that commit tries to turn off fault injection > for kmemleak allocation. Although __GFP_NOFAIL doesn't guarantee no > failure for all the cases (i.e. non-blockable allocation may fail), it > still makes sense to the most cases. Kmemleak is also a debugging tool, > so it sounds not worth changing the behavior. > > It also meaks sense to keep the warning, so just document the special > case in the comment. > > ... > > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -4531,8 +4531,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > */ > if (gfp_mask & __GFP_NOFAIL) { > /* > - * All existing users of the __GFP_NOFAIL are blockable, so warn > - * of any new users that actually require GFP_NOWAIT > + * The users of the __GFP_NOFAIL are expected be blockable, > + * and this is true for the most cases except for kmemleak. > + * The kmemleak pass in __GFP_NOFAIL to skip fault injection, > + * however kmemleak may allocate object at some non-blockable > + * context to trigger this warning. > + * > + * Keep this warning since it is still useful for the most > + * normal cases. > */ Comment has rather a lot of typos. I'd normally fix them but I think I'll duck this patch until the kmemleak situation is addressed, so we can add a kmemleakless long-term comment, if desired.
On 7/24/19 7:48 PM, Andrew Morton wrote: > On Sat, 13 Jul 2019 04:49:04 +0800 Yang Shi <yang.shi@linux.alibaba.com> wrote: > >> When running ltp's oom test with kmemleak enabled, the below warning was >> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is >> passed in: >> >> ... >> >> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has >> __GFP_NOFAIL set all the time due to commit >> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist >> with fault injection"). >> >> The fault-injection would not try to fail slab or page allocation if >> __GFP_NOFAIL is used and that commit tries to turn off fault injection >> for kmemleak allocation. Although __GFP_NOFAIL doesn't guarantee no >> failure for all the cases (i.e. non-blockable allocation may fail), it >> still makes sense to the most cases. Kmemleak is also a debugging tool, >> so it sounds not worth changing the behavior. >> >> It also meaks sense to keep the warning, so just document the special >> case in the comment. >> >> ... >> >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -4531,8 +4531,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) >> */ >> if (gfp_mask & __GFP_NOFAIL) { >> /* >> - * All existing users of the __GFP_NOFAIL are blockable, so warn >> - * of any new users that actually require GFP_NOWAIT >> + * The users of the __GFP_NOFAIL are expected be blockable, >> + * and this is true for the most cases except for kmemleak. >> + * The kmemleak pass in __GFP_NOFAIL to skip fault injection, >> + * however kmemleak may allocate object at some non-blockable >> + * context to trigger this warning. >> + * >> + * Keep this warning since it is still useful for the most >> + * normal cases. >> */ > Comment has rather a lot of typos. I'd normally fix them but I think > I'll duck this patch until the kmemleak situation is addressed, so we > can add a kmemleakless long-term comment, if desired. Actually, this has been replaced by reverting the problematic commit. And, the patch has been in -mm tree. Please see: revert-kmemleak-allow-to-coexist-with-fault-injection.patch I think we would like to have this merged in 5.3-rc1 or rc2?
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index d66bc8a..cac6efb 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4531,8 +4531,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) */ if (gfp_mask & __GFP_NOFAIL) { /* - * All existing users of the __GFP_NOFAIL are blockable, so warn - * of any new users that actually require GFP_NOWAIT + * The users of the __GFP_NOFAIL are expected be blockable, + * and this is true for the most cases except for kmemleak. + * The kmemleak pass in __GFP_NOFAIL to skip fault injection, + * however kmemleak may allocate object at some non-blockable + * context to trigger this warning. + * + * Keep this warning since it is still useful for the most + * normal cases. */ if (WARN_ON_ONCE(!can_direct_reclaim)) goto fail;
When running ltp's oom test with kmemleak enabled, the below warning was triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is passed in: WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50 Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50 ... kmemleak_alloc+0x4e/0xb0 kmem_cache_alloc+0x2a7/0x3e0 ? __kmalloc+0x1d6/0x470 ? ___might_sleep+0x9c/0x170 ? mempool_alloc+0x2b0/0x2b0 mempool_alloc_slab+0x2d/0x40 mempool_alloc+0x118/0x2b0 ? __kasan_check_read+0x11/0x20 ? mempool_resize+0x390/0x390 ? lock_downgrade+0x3c0/0x3c0 bio_alloc_bioset+0x19d/0x350 ? __swap_duplicate+0x161/0x240 ? bvec_alloc+0x1b0/0x1b0 ? do_raw_spin_unlock+0xa8/0x140 ? _raw_spin_unlock+0x27/0x40 get_swap_bio+0x80/0x230 ? __x64_sys_madvise+0x50/0x50 ? end_swap_bio_read+0x310/0x310 ? __kasan_check_read+0x11/0x20 ? check_chain_key+0x24e/0x300 ? bdev_write_page+0x55/0x130 __swap_writepage+0x5ff/0xb20 The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has __GFP_NOFAIL set all the time due to commit d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist with fault injection"). The fault-injection would not try to fail slab or page allocation if __GFP_NOFAIL is used and that commit tries to turn off fault injection for kmemleak allocation. Although __GFP_NOFAIL doesn't guarantee no failure for all the cases (i.e. non-blockable allocation may fail), it still makes sense to the most cases. Kmemleak is also a debugging tool, so it sounds not worth changing the behavior. It also meaks sense to keep the warning, so just document the special case in the comment. Cc: Michal Hocko <mhocko@suse.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com> --- mm/page_alloc.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)