Message ID | 20220505070105.1835745-1-42.hyeyoo@gmail.com (mailing list archive) |
---|---|
State | New |
Series | mm/kfence: reset PG_slab and memcg_data before freeing __kfence_pool |
On Thu, 5 May 2022 at 09:01, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> When kfence fails to initialize kfence pool, it frees the pool.
> But it does not reset PG_slab flag and memcg_data of struct page.
>
> Below is a BUG because of this. Let's fix it by resetting PG_slab
> and memcg_data before free.
>
> [ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
> [ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
> [ 0.089150] memcg:ffffffff94a475d1
> [ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
> [ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
> [ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
> [ 0.089152] page dumped because: page still charged to cgroup
> [ 0.089153] Modules linked in:
> [ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
> [ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> [ 0.089154] Call Trace:
> [ 0.089155] <TASK>
> [ 0.089155] dump_stack_lvl+0x49/0x5f
> [ 0.089157] dump_stack+0x10/0x12
> [ 0.089158] bad_page.cold+0x63/0x94
> [ 0.089159] check_free_page_bad+0x66/0x70
> [ 0.089160] __free_pages_ok+0x423/0x530
> [ 0.089161] __free_pages_core+0x8e/0xa0
> [ 0.089162] memblock_free_pages+0x10/0x12
> [ 0.089164] memblock_free_late+0x8f/0xb9
> [ 0.089165] kfence_init+0x68/0x92
> [ 0.089166] start_kernel+0x789/0x992
> [ 0.089167] x86_64_start_reservations+0x24/0x26
> [ 0.089168] x86_64_start_kernel+0xa9/0xaf
> [ 0.089170] secondary_startup_64_no_verify+0xd5/0xdb
> [ 0.089171] </TASK>

This is probably:

Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")

> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> ---
>  mm/kfence/core.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index a203747ad2c0..2ab3d473321e 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -642,6 +642,13 @@ static bool __init kfence_init_pool_early(void)
>  	 * fails for the first page, and therefore expect addr==__kfence_pool in
>  	 * most failure cases.
>  	 */
> +	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
> +		struct page *page;
> +
> +		page = virt_to_page(p);

#ifdef CONFIG_MEMCG

> +		page->memcg_data = 0;

#endif

> +		__ClearPageSlab(page);

We're now using __folio_set_slab(), so I'm guessing this should be
__folio_clear_slab()?

> +	}
>  	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
>  	__kfence_pool = NULL;
>  	return false;
> --
> 2.32.0
>
On Thu, 5 May 2022 at 09:12, Marco Elver <elver@google.com> wrote:
>
> On Thu, 5 May 2022 at 09:01, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> >
> > When kfence fails to initialize kfence pool, it frees the pool.
> > But it does not reset PG_slab flag and memcg_data of struct page.
> >
> > Below is a BUG because of this. Let's fix it by resetting PG_slab
> > and memcg_data before free.
> >
> > [ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
> > [ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
> > [ 0.089150] memcg:ffffffff94a475d1
> > [ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
> > [ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
> > [ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
> > [ 0.089152] page dumped because: page still charged to cgroup
> > [ 0.089153] Modules linked in:
> > [ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
> > [ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> > [ 0.089154] Call Trace:
> > [ 0.089155] <TASK>
> > [ 0.089155] dump_stack_lvl+0x49/0x5f
> > [ 0.089157] dump_stack+0x10/0x12
> > [ 0.089158] bad_page.cold+0x63/0x94
> > [ 0.089159] check_free_page_bad+0x66/0x70
> > [ 0.089160] __free_pages_ok+0x423/0x530
> > [ 0.089161] __free_pages_core+0x8e/0xa0
> > [ 0.089162] memblock_free_pages+0x10/0x12
> > [ 0.089164] memblock_free_late+0x8f/0xb9
> > [ 0.089165] kfence_init+0x68/0x92
> > [ 0.089166] start_kernel+0x789/0x992
> > [ 0.089167] x86_64_start_reservations+0x24/0x26
> > [ 0.089168] x86_64_start_kernel+0xa9/0xaf
> > [ 0.089170] secondary_startup_64_no_verify+0xd5/0xdb
> > [ 0.089171] </TASK>
>
> This is probably:
>
> Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")

Hmm, looking closer at the above BUG, I think it's

Fixes: 8f0b36497303 ("mm: kfence: fix objcgs vector allocation")

?

> > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > ---
> >  mm/kfence/core.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> > index a203747ad2c0..2ab3d473321e 100644
> > --- a/mm/kfence/core.c
> > +++ b/mm/kfence/core.c
> > @@ -642,6 +642,13 @@ static bool __init kfence_init_pool_early(void)
> >  	 * fails for the first page, and therefore expect addr==__kfence_pool in
> >  	 * most failure cases.
> >  	 */
> > +	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
> > +		struct page *page;
> > +
> > +		page = virt_to_page(p);
>
> #ifdef CONFIG_MEMCG
>
> > +		page->memcg_data = 0;
>
> #endif
>
> > +		__ClearPageSlab(page);
>
> We're now using __folio_set_slab(), so I'm guessing this should be
> __folio_clear_slab()?
>
> > +	}
> >  	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
> >  	__kfence_pool = NULL;
> >  	return false;
> > --
> > 2.32.0
> >
On Thu, May 05, 2022 at 09:19:31AM +0200, Marco Elver wrote:
> On Thu, 5 May 2022 at 09:12, Marco Elver <elver@google.com> wrote:
> >
> > On Thu, 5 May 2022 at 09:01, Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
> > >
> > > When kfence fails to initialize kfence pool, it frees the pool.
> > > But it does not reset PG_slab flag and memcg_data of struct page.
> > >
> > > Below is a BUG because of this. Let's fix it by resetting PG_slab
> > > and memcg_data before free.
> > >
> > > [ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
> > > [ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
> > > [ 0.089150] memcg:ffffffff94a475d1
> > > [ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
> > > [ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
> > > [ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
> > > [ 0.089152] page dumped because: page still charged to cgroup
> > > [ 0.089153] Modules linked in:
> > > [ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
> > > [ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> > > [ 0.089154] Call Trace:
> > > [ 0.089155] <TASK>
> > > [ 0.089155] dump_stack_lvl+0x49/0x5f
> > > [ 0.089157] dump_stack+0x10/0x12
> > > [ 0.089158] bad_page.cold+0x63/0x94
> > > [ 0.089159] check_free_page_bad+0x66/0x70
> > > [ 0.089160] __free_pages_ok+0x423/0x530
> > > [ 0.089161] __free_pages_core+0x8e/0xa0
> > > [ 0.089162] memblock_free_pages+0x10/0x12
> > > [ 0.089164] memblock_free_late+0x8f/0xb9
> > > [ 0.089165] kfence_init+0x68/0x92
> > > [ 0.089166] start_kernel+0x789/0x992
> > > [ 0.089167] x86_64_start_reservations+0x24/0x26
> > > [ 0.089168] x86_64_start_kernel+0xa9/0xaf
> > > [ 0.089170] secondary_startup_64_no_verify+0xd5/0xdb
> > > [ 0.089171] </TASK>
> >
> > This is probably:
> >
> > Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
>
> Hmm, looking closer at the above BUG, I think it's
>
> Fixes: 8f0b36497303 ("mm: kfence: fix objcgs vector allocation")
>
> ?

Marco, Thanks for comments.
I think it fixes both because not clearing PG_slab also invokes a BUG.

> > > Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > > ---
> > >  mm/kfence/core.c | 7 +++++++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> > > index a203747ad2c0..2ab3d473321e 100644
> > > --- a/mm/kfence/core.c
> > > +++ b/mm/kfence/core.c
> > > @@ -642,6 +642,13 @@ static bool __init kfence_init_pool_early(void)
> > >  	 * fails for the first page, and therefore expect addr==__kfence_pool in
> > >  	 * most failure cases.
> > >  	 */
> > > +	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
> > > +		struct page *page;
> > > +
> > > +		page = virt_to_page(p);
> >
> > #ifdef CONFIG_MEMCG
> >
> > > +		page->memcg_data = 0;
> >
> > #endif
> >
> > > +		__ClearPageSlab(page);
> >

Ah, thanks! Will do in v2.

> > We're now using __folio_set_slab(), so I'm guessing this should be
> > __folio_clear_slab()?

Right. Will do in v2.

> >
> > > +	}
> > >  	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
> > >  	__kfence_pool = NULL;
> > >  	return false;
> > > --
> > > 2.32.0
> > >
Hi Hyeonggon,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on hnaz-mm/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Hyeonggon-Yoo/mm-kfence-reset-PG_slab-and-memcg_data-before-freeing-__kfence_pool/20220505-150237
base:   https://github.com/hnaz/linux-mm master
config: s390-randconfig-r044-20220505 (https://download.01.org/0day-ci/archive/20220505/202205051852.M26PcwNj-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 11.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/ad166fcbcd464ea0251165580e1ea0152531fe56
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Hyeonggon-Yoo/mm-kfence-reset-PG_slab-and-memcg_data-before-freeing-__kfence_pool/20220505-150237
        git checkout ad166fcbcd464ea0251165580e1ea0152531fe56
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.3.0 make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash mm/kfence/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   mm/kfence/core.c: In function 'kfence_init_pool_early':
>> mm/kfence/core.c:634:21: error: 'struct page' has no member named 'memcg_data'
     634 |                 page->memcg_data = 0;
         |                     ^~

vim +634 mm/kfence/core.c

   610
   611	static bool __init kfence_init_pool_early(void)
   612	{
   613		unsigned long addr;
   614
   615		if (!__kfence_pool)
   616			return false;
   617
   618		addr = kfence_init_pool();
   619
   620		if (!addr)
   621			return true;
   622
   623		/*
   624		 * Only release unprotected pages, and do not try to go back and change
   625		 * page attributes due to risk of failing to do so as well. If changing
   626		 * page attributes for some pages fails, it is very likely that it also
   627		 * fails for the first page, and therefore expect addr==__kfence_pool in
   628		 * most failure cases.
   629		 */
   630		for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
   631			struct page *page;
   632
   633			page = virt_to_page(p);
 > 634			page->memcg_data = 0;
   635			__ClearPageSlab(page);
   636		}
   637		memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
   638		__kfence_pool = NULL;
   639		return false;
   640	}
   641
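The build error in the robot report comes from touching a struct member that only exists under a config option, which is what the `#ifdef CONFIG_MEMCG` suggestion earlier in the thread addresses. The following is a minimal userspace sketch of that guard pattern, not kernel code: `MODEL_MEMCG`, `struct model_page` and `model_clear_memcg()` are hypothetical stand-ins for `CONFIG_MEMCG`, `struct page` and the reset done in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for CONFIG_MEMCG. Uncomment to model a build
 * where the memcg_data member exists.
 */
/* #define MODEL_MEMCG 1 */

/* Simplified model of struct page: memcg_data is only present if enabled. */
struct model_page {
	unsigned long flags;
#ifdef MODEL_MEMCG
	unsigned long memcg_data;
#endif
};

/*
 * Clear the memcg association if the member exists. Without the guard,
 * a build with MODEL_MEMCG undefined fails with "no member named
 * 'memcg_data'", mirroring the error in the report.
 */
static void model_clear_memcg(struct model_page *page)
{
	assert(page != NULL);
#ifdef MODEL_MEMCG
	page->memcg_data = 0;
#else
	(void)page; /* nothing to reset in this configuration */
#endif
}
```

Wrapping the access in a helper keeps the `#ifdef` out of the loop body; whether v2 of the patch takes that shape is not stated in this thread.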
diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index a203747ad2c0..2ab3d473321e 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -642,6 +642,13 @@ static bool __init kfence_init_pool_early(void)
 	 * fails for the first page, and therefore expect addr==__kfence_pool in
 	 * most failure cases.
 	 */
+	for (char *p = (char *)addr; p < __kfence_pool + KFENCE_POOL_SIZE; p += PAGE_SIZE) {
+		struct page *page;
+
+		page = virt_to_page(p);
+		page->memcg_data = 0;
+		__ClearPageSlab(page);
+	}
 	memblock_free_late(__pa(addr), KFENCE_POOL_SIZE - (addr - (unsigned long)__kfence_pool));
 	__kfence_pool = NULL;
 	return false;
When kfence fails to initialize kfence pool, it frees the pool.
But it does not reset PG_slab flag and memcg_data of struct page.

Below is a BUG because of this. Let's fix it by resetting PG_slab
and memcg_data before free.

[ 0.089149] BUG: Bad page state in process swapper/0 pfn:3d8e06
[ 0.089149] page:ffffea46cf638180 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3d8e06
[ 0.089150] memcg:ffffffff94a475d1
[ 0.089150] flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff)
[ 0.089151] raw: 0017ffffc0000200 ffffea46cf638188 ffffea46cf638188 0000000000000000
[ 0.089152] raw: 0000000000000000 0000000000000000 00000000ffffffff ffffffff94a475d1
[ 0.089152] page dumped because: page still charged to cgroup
[ 0.089153] Modules linked in:
[ 0.089153] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B W 5.18.0-rc1+ #965
[ 0.089154] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[ 0.089154] Call Trace:
[ 0.089155] <TASK>
[ 0.089155] dump_stack_lvl+0x49/0x5f
[ 0.089157] dump_stack+0x10/0x12
[ 0.089158] bad_page.cold+0x63/0x94
[ 0.089159] check_free_page_bad+0x66/0x70
[ 0.089160] __free_pages_ok+0x423/0x530
[ 0.089161] __free_pages_core+0x8e/0xa0
[ 0.089162] memblock_free_pages+0x10/0x12
[ 0.089164] memblock_free_late+0x8f/0xb9
[ 0.089165] kfence_init+0x68/0x92
[ 0.089166] start_kernel+0x789/0x992
[ 0.089167] x86_64_start_reservations+0x24/0x26
[ 0.089168] x86_64_start_kernel+0xa9/0xaf
[ 0.089170] secondary_startup_64_no_verify+0xd5/0xdb
[ 0.089171] </TASK>

Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/kfence/core.c | 7 +++++++
 1 file changed, 7 insertions(+)
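To illustrate why resetting both fields before the free avoids the "Bad page state" report, here is a small userspace model. It is a sketch under stated assumptions, not the kernel's `check_free_page_bad()` logic: `struct model_page`, `MODEL_PG_SLAB` and the `model_*` functions are hypothetical names, and the slab bit value 0x200 is taken from the low bits of the `flags:` line in the dump above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: slab bit modeled as 0x200, as seen in the dumped flags. */
#define MODEL_PG_SLAB 0x200UL

/* Simplified model of the two struct page fields the patch resets. */
struct model_page {
	unsigned long flags;
	unsigned long memcg_data;
};

/*
 * Model of the free-time sanity check: a page that still carries the
 * slab flag or a memcg association is reported as being in a bad state.
 */
static bool model_free_check(const struct model_page *page)
{
	return !(page->flags & MODEL_PG_SLAB) && page->memcg_data == 0;
}

/* Model of what the patch does for each page before freeing it. */
static void model_reset_for_free(struct model_page *page)
{
	page->memcg_data = 0;
	page->flags &= ~MODEL_PG_SLAB;
}

/*
 * Reset every page in [pages, pages + n) and return how many of them
 * would pass the free-time check afterwards.
 */
static size_t model_release_range(struct model_page *pages, size_t n)
{
	size_t released = 0;

	assert(pages != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		model_reset_for_free(&pages[i]);
		if (model_free_check(&pages[i]))
			released++;
	}
	return released;
}
```

In this model a page with either stale field fails `model_free_check()`, which matches the remark in the thread that leaving PG_slab set also triggers a BUG, independently of `memcg_data`.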