Message ID | 1570732366-16426-1-git-send-email-cai@lca.pw (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm/page_owner: fix a crash after memory offline | expand |
On 10.10.19 20:32, Qian Cai wrote: > The linux-next series "mm/memory_hotplug: Shrink zones before removing > memory" [1] seems make a crash easier to reproduce while reading > /proc/pagetypeinfo after offlining a memory section. Fix it by using > pfn_to_online_page() in the PFN walker. Can you please rephrase the subject+description to describe the actual problem and drop the reference to the series? E.g., similar to my recent patches: "mm/page_owner: Don't access uninitialized memmaps when reading /proc/pagetypeinfo Uninitialized memmaps contain garbage and in the worst case trigger kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get touched. For example, when not onlining a memory block that is spanned by a zone and reading /proc/pagetypeinfo, we can trigger a kernel BUG: ... " However, you also have to justify why it is okay to no longer consider ZONE_DEVICE (I think walk_zones_in_node() will skip ZONE_DEVICE due to assert_populated == true and ZONE_DEVICE will never be populated, Therefore, we will never end in this code path with ZONE_DEVICE). > > [1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/ > > page:ffffea0021200000 is uninitialized and poisoned > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) > There is not page extension available. > ------------[ cut here ]------------ > kernel BUG at include/linux/mm.h:1107! > RIP: 0010:pagetypeinfo_showmixedcount_print+0x4fb/0x680 > Call Trace: > walk_zones_in_node+0x3a/0xc0 > pagetypeinfo_show+0x260/0x2c0 > seq_read+0x27e/0x710 > proc_reg_read+0x12e/0x190 > __vfs_read+0x50/0xa0 > vfs_read+0xcb/0x1e0 > ksys_read+0xc6/0x160 > __x64_sys_read+0x43/0x50 > do_syscall_64+0xcc/0xaec > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > Signed-off-by: Qian Cai <cai@lca.pw> > --- > mm/page_owner.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/mm/page_owner.c b/mm/page_owner.c > index dee931184788..03a6b19b3cdd 100644 > --- a/mm/page_owner.c > +++ b/mm/page_owner.c > @@ -296,11 +296,10 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m, > pageblock_mt = get_pageblock_migratetype(page); > What about the pfn_valid() in the outermost loop? You can skip over the whole pageblock if the first page is not online. > for (; pfn < block_end_pfn; pfn++) { > - if (!pfn_valid_within(pfn)) > + page = pfn_to_online_page(pfn); > + if (!page) > continue; > > - page = pfn_to_page(pfn); > - > if (page_zone(page) != zone) > continue; > >
On Thu, 2019-10-10 at 20:55 +0200, David Hildenbrand wrote: > On 10.10.19 20:32, Qian Cai wrote: > > The linux-next series "mm/memory_hotplug: Shrink zones before removing > > memory" [1] seems make a crash easier to reproduce while reading > > /proc/pagetypeinfo after offlining a memory section. Fix it by using > > pfn_to_online_page() in the PFN walker. > > Can you please rephrase the subject+description to describe the actual > problem and drop the reference to the series? I'd figure it is better for you to post this as you are on the top of this whole mess. What do you think? > > E.g., similar to my recent patches: > > "mm/page_owner: Don't access uninitialized memmaps when reading > /proc/pagetypeinfo > > Uninitialized memmaps contain garbage and in the worst case trigger > kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get > touched. > > For example, when not onlining a memory block that is spanned by a zone > and reading /proc/pagetypeinfo, we can trigger a kernel BUG: ... > " > > However, you also have to justify why it is okay to no longer consider > ZONE_DEVICE (I think walk_zones_in_node() will skip ZONE_DEVICE due to > assert_populated == true and ZONE_DEVICE will never be populated, > Therefore, we will never end in this code path with ZONE_DEVICE). > > > > > > [1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/ > > > > page:ffffea0021200000 is uninitialized and poisoned > > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff > > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff > > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) > > There is not page extension available. > > ------------[ cut here ]------------ > > kernel BUG at include/linux/mm.h:1107! > > RIP: 0010:pagetypeinfo_showmixedcount_print+0x4fb/0x680 > > Call Trace: > > walk_zones_in_node+0x3a/0xc0 > > pagetypeinfo_show+0x260/0x2c0 > > seq_read+0x27e/0x710 > > proc_reg_read+0x12e/0x190 > > __vfs_read+0x50/0xa0 > > vfs_read+0xcb/0x1e0 > > ksys_read+0xc6/0x160 > > __x64_sys_read+0x43/0x50 > > do_syscall_64+0xcc/0xaec > > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > > > Signed-off-by: Qian Cai <cai@lca.pw> > > --- > > mm/page_owner.c | 5 ++--- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/mm/page_owner.c b/mm/page_owner.c > > index dee931184788..03a6b19b3cdd 100644 > > --- a/mm/page_owner.c > > +++ b/mm/page_owner.c > > @@ -296,11 +296,10 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m, > > pageblock_mt = get_pageblock_migratetype(page); > > > > What about the pfn_valid() in the outermost loop? You can skip over the > whole pageblock if the first page is not online. > > > for (; pfn < block_end_pfn; pfn++) { > > - if (!pfn_valid_within(pfn)) > > + page = pfn_to_online_page(pfn); > > + if (!page) > > continue; > > > > - page = pfn_to_page(pfn); > > - > > if (page_zone(page) != zone) > > continue; > > > > > >
On 10.10.19 21:12, Qian Cai wrote: > On Thu, 2019-10-10 at 20:55 +0200, David Hildenbrand wrote: >> On 10.10.19 20:32, Qian Cai wrote: >>> The linux-next series "mm/memory_hotplug: Shrink zones before removing >>> memory" [1] seems make a crash easier to reproduce while reading >>> /proc/pagetypeinfo after offlining a memory section. Fix it by using >>> pfn_to_online_page() in the PFN walker. >> >> Can you please rephrase the subject+description to describe the actual >> problem and drop the reference to the series? > > I'd figure it is better for you to post this as you are on the top of this whole > mess. What do you think? You mean, I found the key to an unlimited amount of undetected BUGs, so I should fix them? ;) In case you don't have time to explore all the dirty details, I can take it from here. Just let me know.
> On Oct 11, 2019, at 5:57 AM, David Hildenbrand <david@redhat.com> wrote: > > You mean, I found the key to an unlimited amount of undetected BUGs, so I should fix them? ;) No, it is more of for the sake of consistency, and the fact that all those bugs are only starting to show up only recently after applied to your series, so you will probably explain for a better changelog why that fact is not important. > > In case you don't have time to explore all the dirty details, I can take it from here. Just let me know. Indeed, I will appreciate that if I don’t need to dig all those dirty details again myself.
diff --git a/mm/page_owner.c b/mm/page_owner.c index dee931184788..03a6b19b3cdd 100644 --- a/mm/page_owner.c +++ b/mm/page_owner.c @@ -296,11 +296,10 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m, pageblock_mt = get_pageblock_migratetype(page); for (; pfn < block_end_pfn; pfn++) { - if (!pfn_valid_within(pfn)) + page = pfn_to_online_page(pfn); + if (!page) continue; - page = pfn_to_page(pfn); - if (page_zone(page) != zone) continue;
The linux-next series "mm/memory_hotplug: Shrink zones before removing memory" [1] seems make a crash easier to reproduce while reading /proc/pagetypeinfo after offlining a memory section. Fix it by using pfn_to_online_page() in the PFN walker. [1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/ page:ffffea0021200000 is uninitialized and poisoned raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) There is not page extension available. ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1107! RIP: 0010:pagetypeinfo_showmixedcount_print+0x4fb/0x680 Call Trace: walk_zones_in_node+0x3a/0xc0 pagetypeinfo_show+0x260/0x2c0 seq_read+0x27e/0x710 proc_reg_read+0x12e/0x190 __vfs_read+0x50/0xa0 vfs_read+0xcb/0x1e0 ksys_read+0xc6/0x160 __x64_sys_read+0x43/0x50 do_syscall_64+0xcc/0xaec entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Qian Cai <cai@lca.pw> --- mm/page_owner.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)