diff mbox series

mm: skip checking poison pattern for page_to_nid()

Message ID 20181219015732.26179-1-cai@lca.pw (mailing list archive)
State New, archived
Headers show
Series mm: skip checking poison pattern for page_to_nid() | expand

Commit Message

Qian Cai Dec. 19, 2018, 1:57 a.m. UTC
Kernel panic with page_owner=on

CONFIG_DEBUG_VM_PGFLAGS=y
PAGE_OWNER=y
NODE_NOT_IN_PAGE_FLAGS=n

This is due to f165b378bbd (mm: uninitialized struct page poisoning
sanity checking) shoots itself in the foot.

[   11.917212] page:ffffea0004200000 is uninitialized and poisoned
[   11.917220] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
ffffffffffffffff
[   11.921745] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff
ffffffffffffffff
[   11.924523] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
[   11.926498] page_owner info is not active (free page?)
[   12.329560] kernel BUG at include/linux/mm.h:990!
[   12.337632] RIP: 0010:init_page_owner+0x486/0x520

At first,

start_kernel
  setup_arch
    pagetable_init
      paging_init
        sparse_init
          sparse_init_nid
            sparse_buffer_init
              memblock_virt_alloc_try_nid_raw

It poisons all the allocated pages there.

memset(ptr, PAGE_POISON_PATTERN, size)

Later,

page_ext_init
  invoke_init_callbacks
    init_section_page_ext
      init_page_owner
        init_early_allocated_pages
          init_zones_in_node
            init_pages_in_zone
              lookup_page_ext
                page_to_nid
                  PF_POISONED_CHECK <--- panic here.

This because all allocated pages are not initialized until later.

init_pages_in_zone
  __set_page_owner_handle

Fixes: f165b378bbd (mm: uninitialized struct page poisoning
sanity checking)
Signed-off-by: Qian Cai <cai@lca.pw>
---
 include/linux/mm.h | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

Comments

Michal Hocko Dec. 19, 2018, 10:20 a.m. UTC | #1
On Tue 18-12-18 20:57:32, Qian Cai wrote:
[...]
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5411de93a363..f083f366ea90 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -985,9 +985,7 @@ extern int page_to_nid(const struct page *page);
>  #else
>  static inline int page_to_nid(const struct page *page)
>  {
> -	struct page *p = (struct page *)page;
> -
> -	return (PF_POISONED_CHECK(p)->flags >> NODES_PGSHIFT) & NODES_MASK;
> +	return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
>  }
>  #endif

I didn't get to think about a proper fix but this is clearly worng. If
the page is still poisoned then flags are clearly bogus and the node you
get is a garbage as well. Have you actually tested this patch?
Qian Cai Dec. 19, 2018, 12:46 p.m. UTC | #2
On 12/19/18 5:20 AM, Michal Hocko wrote:
> On Tue 18-12-18 20:57:32, Qian Cai wrote:
> [...]
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 5411de93a363..f083f366ea90 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -985,9 +985,7 @@ extern int page_to_nid(const struct page *page);
>>  #else
>>  static inline int page_to_nid(const struct page *page)
>>  {
>> -	struct page *p = (struct page *)page;
>> -
>> -	return (PF_POISONED_CHECK(p)->flags >> NODES_PGSHIFT) & NODES_MASK;
>> +	return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
>>  }
>>  #endif
> 
> I didn't get to think about a proper fix but this is clearly worng. If
> the page is still poisoned then flags are clearly bogus and the node you
> get is a garbage as well. Have you actually tested this patch?
> 

Yes, I did notice that after running for a while triggering some UBSAN
out-of-bounds access warnings. I am still trying to figure out how those
uninitialized page flags survived though after

mm_init
  mem_init
    memblock_free_all
      init_single_page()
diff mbox series

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5411de93a363..f083f366ea90 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -985,9 +985,7 @@  extern int page_to_nid(const struct page *page);
 #else
 static inline int page_to_nid(const struct page *page)
 {
-	struct page *p = (struct page *)page;
-
-	return (PF_POISONED_CHECK(p)->flags >> NODES_PGSHIFT) & NODES_MASK;
+	return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
 }
 #endif