diff mbox series

[v2,3/3] mm: hwpoison: dump page for unhandlable page

Message ID 20210819054116.266126-3-shy828301@gmail.com (mailing list archive)
State New
Headers show
Series [v2,1/3] mm: hwpoison: don't drop slab caches for offlining non-LRU page | expand

Commit Message

Yang Shi Aug. 19, 2021, 5:41 a.m. UTC
Currently just very simple message is shown for unhandlable page, e.g.
non-LRU page, like:
soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()

It is not very helpful for further debug, calling dump_page() could show
more useful information.

Calling dump_page() in get_any_page() in order to not duplicate the call
in a couple of different places.  It may be called with pcp disabled and
holding memory hotplug lock, it should be not a big deal since hwpoison
handler is not called very often.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/memory-failure.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

HORIGUCHI NAOYA(堀口 直也) Aug. 20, 2021, 6:48 a.m. UTC | #1
On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> Currently just very simple message is shown for unhandlable page, e.g.
> non-LRU page, like:
> soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> 
> It is not very helpful for further debug, calling dump_page() could show
> more useful information.
> 
> Calling dump_page() in get_any_page() in order to not duplicate the call
> in a couple of different places.  It may be called with pcp disabled and
> holding memory hotplug lock, it should be not a big deal since hwpoison
> handler is not called very often.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  mm/memory-failure.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 7cfa134b1370..60df8fcd0444 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1228,6 +1228,9 @@ static int get_any_page(struct page *p, unsigned long flags)
>  		ret = -EIO;
>  	}
>  out:
> +	if (ret == -EIO)
> +		dump_page(p, "hwpoison: unhandlable page");
> +

I feel that 4 callers of get_hwpoison_page() are in the different context,
so it might be better to consider them separately to add dump_page() or not.
soft_offline_page() still prints out "%s: %#lx: unknown page type: %lx (%pGp)"
message, which might be duplicate so this printk() may be dropped.
In memory_failure_hugetlb() and memory_failure(), we can call dump_page() after
action_result().  unpoison_memory() doesn't need dump_page() at all because
it's related to already hwpoisoned page.

Thanks,
Naoya Horiguchi
Yang Shi Aug. 20, 2021, 6:40 p.m. UTC | #2
On Thu, Aug 19, 2021 at 11:48 PM HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@nec.com> wrote:
>
> On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> > Currently just very simple message is shown for unhandlable page, e.g.
> > non-LRU page, like:
> > soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> >
> > It is not very helpful for further debug, calling dump_page() could show
> > more useful information.
> >
> > Calling dump_page() in get_any_page() in order to not duplicate the call
> > in a couple of different places.  It may be called with pcp disabled and
> > holding memory hotplug lock, it should be not a big deal since hwpoison
> > handler is not called very often.
> >
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/memory-failure.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 7cfa134b1370..60df8fcd0444 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1228,6 +1228,9 @@ static int get_any_page(struct page *p, unsigned long flags)
> >               ret = -EIO;
> >       }
> >  out:
> > +     if (ret == -EIO)
> > +             dump_page(p, "hwpoison: unhandlable page");
> > +
>
> I feel that 4 callers of get_hwpoison_page() are in the different context,
> so it might be better to consider them separately to add dump_page() or not.
> soft_offline_page() still prints out "%s: %#lx: unknown page type: %lx (%pGp)"

No strong opinion to keep or remove it.

> message, which might be duplicate so this printk() may be dropped.
> In memory_failure_hugetlb() and memory_failure(), we can call dump_page() after
> action_result().  unpoison_memory() doesn't need dump_page() at all because
> it's related to already hwpoisoned page.

I don't have a strong opinion either to have the dump_page() called
either before action or after action, it just moves around the dumped
page information around that printk.

For unpoison_memory(), I think it is harmless to have dump_page()
called, right? If get_hwpoison_page() can't return -EIO, then the
dump_page() won't be called at all, if it is possible then this is
exactly why we call dump_page() to help debug.

So IMHO calling dump_page() in get_any_page when -EIO is returned
could work for all the cases well and avoid duplicating the call.

>
> Thanks,
> Naoya Horiguchi
HORIGUCHI NAOYA(堀口 直也) Aug. 23, 2021, 5:05 a.m. UTC | #3
On Fri, Aug 20, 2021 at 11:40:24AM -0700, Yang Shi wrote:
> On Thu, Aug 19, 2021 at 11:48 PM HORIGUCHI NAOYA(堀口 直也)
> <naoya.horiguchi@nec.com> wrote:
> >
> > On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> > > Currently just very simple message is shown for unhandlable page, e.g.
> > > non-LRU page, like:
> > > soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> > >
> > > It is not very helpful for further debug, calling dump_page() could show
> > > more useful information.
> > >
> > > Calling dump_page() in get_any_page() in order to not duplicate the call
> > > in a couple of different places.  It may be called with pcp disabled and
> > > holding memory hotplug lock, it should be not a big deal since hwpoison
> > > handler is not called very often.
> > >
> > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > Cc: Oscar Salvador <osalvador@suse.de>
> > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > ---
> > >  mm/memory-failure.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > index 7cfa134b1370..60df8fcd0444 100644
> > > --- a/mm/memory-failure.c
> > > +++ b/mm/memory-failure.c
> > > @@ -1228,6 +1228,9 @@ static int get_any_page(struct page *p, unsigned long flags)
> > >               ret = -EIO;
> > >       }
> > >  out:
> > > +     if (ret == -EIO)
> > > +             dump_page(p, "hwpoison: unhandlable page");
> > > +
> >
> > I feel that 4 callers of get_hwpoison_page() are in the different context,
> > so it might be better to consider them separately to add dump_page() or not.
> > soft_offline_page() still prints out "%s: %#lx: unknown page type: %lx (%pGp)"
> 
> No strong opinion to keep or remove it.

Reading the explanation below, I think that calling dump_page() in the
original place is fine.  So let's remove "else if (ret == 0)" block in
soft_offline_page().

> 
> > message, which might be duplicate so this printk() may be dropped.
> > In memory_failure_hugetlb() and memory_failure(), we can call dump_page() after
> > action_result().  unpoison_memory() doesn't need dump_page() at all because
> > it's related to already hwpoisoned page.
> 
> I don't have a strong opinion either to have the dump_page() called
> either before action or after action, it just moves around the dumped
> page information around that printk.
> 
> For unpoison_memory(), I think it is harmless to have dump_page()
> called, right? If get_hwpoison_page() can't return -EIO, then the
> dump_page() won't be called at all, if it is possible then this is
> exactly why we call dump_page() to help debug.
> 
> So IMHO calling dump_page() in get_any_page when -EIO is returned
> could work for all the cases well and avoid duplicating the call.

Fair enough. So could you repost 3/3 with the above change in soft_offline_page()?

Thanks,
Naoya Horiguchi
Yang Shi Aug. 23, 2021, 5:47 p.m. UTC | #4
On Sun, Aug 22, 2021 at 10:05 PM HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@nec.com> wrote:
>
> On Fri, Aug 20, 2021 at 11:40:24AM -0700, Yang Shi wrote:
> > On Thu, Aug 19, 2021 at 11:48 PM HORIGUCHI NAOYA(堀口 直也)
> > <naoya.horiguchi@nec.com> wrote:
> > >
> > > On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> > > > Currently just very simple message is shown for unhandlable page, e.g.
> > > > non-LRU page, like:
> > > > soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> > > >
> > > > It is not very helpful for further debug, calling dump_page() could show
> > > > more useful information.
> > > >
> > > > Calling dump_page() in get_any_page() in order to not duplicate the call
> > > > in a couple of different places.  It may be called with pcp disabled and
> > > > holding memory hotplug lock, it should be not a big deal since hwpoison
> > > > handler is not called very often.
> > > >
> > > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > > Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > > Cc: Oscar Salvador <osalvador@suse.de>
> > > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > > ---
> > > >  mm/memory-failure.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > > index 7cfa134b1370..60df8fcd0444 100644
> > > > --- a/mm/memory-failure.c
> > > > +++ b/mm/memory-failure.c
> > > > @@ -1228,6 +1228,9 @@ static int get_any_page(struct page *p, unsigned long flags)
> > > >               ret = -EIO;
> > > >       }
> > > >  out:
> > > > +     if (ret == -EIO)
> > > > +             dump_page(p, "hwpoison: unhandlable page");
> > > > +
> > >
> > > I feel that 4 callers of get_hwpoison_page() are in the different context,
> > > so it might be better to consider them separately to add dump_page() or not.
> > > soft_offline_page() still prints out "%s: %#lx: unknown page type: %lx (%pGp)"
> >
> > No strong opinion to keep or remove it.
>
> Reading the explanation below, I think that calling dump_page() in the
> original place is fine.  So let's remove "else if (ret == 0)" block in
> soft_offline_page().

The "else if (ret == 0)" block is used to handle free page IIUC. I'm
supposed you mean the "else if (ret == -EIO)" block which just calls
printk.

>
> >
> > > message, which might be duplicate so this printk() may be dropped.
> > > In memory_failure_hugetlb() and memory_failure(), we can call dump_page() after
> > > action_result().  unpoison_memory() doesn't need dump_page() at all because
> > > it's related to already hwpoisoned page.
> >
> > I don't have a strong opinion either to have the dump_page() called
> > either before action or after action, it just moves around the dumped
> > page information around that printk.
> >
> > For unpoison_memory(), I think it is harmless to have dump_page()
> > called, right? If get_hwpoison_page() can't return -EIO, then the
> > dump_page() won't be called at all, if it is possible then this is
> > exactly why we call dump_page() to help debug.
> >
> > So IMHO calling dump_page() in get_any_page when -EIO is returned
> > could work for all the cases well and avoid duplicating the call.
>
> Fair enough. So could you repost 3/3 with the above change in soft_offline_page()?
>
> Thanks,
> Naoya Horiguchi
HORIGUCHI NAOYA(堀口 直也) Aug. 23, 2021, 11:17 p.m. UTC | #5
On Mon, Aug 23, 2021 at 10:47:03AM -0700, Yang Shi wrote:
> On Sun, Aug 22, 2021 at 10:05 PM HORIGUCHI NAOYA(堀口 直也)
> <naoya.horiguchi@nec.com> wrote:
> >
> > On Fri, Aug 20, 2021 at 11:40:24AM -0700, Yang Shi wrote:
> > > On Thu, Aug 19, 2021 at 11:48 PM HORIGUCHI NAOYA(堀口 直也)
> > > <naoya.horiguchi@nec.com> wrote:
> > > >
> > > > On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> > > > > Currently just very simple message is shown for unhandlable page, e.g.
> > > > > non-LRU page, like:
> > > > > soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> > > > >
> > > > > It is not very helpful for further debug, calling dump_page() could show
> > > > > more useful information.
> > > > >
> > > > > Calling dump_page() in get_any_page() in order to not duplicate the call
> > > > > in a couple of different places.  It may be called with pcp disabled and
> > > > > holding memory hotplug lock, it should be not a big deal since hwpoison
> > > > > handler is not called very often.
> > > > >
> > > > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > > > Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > > > Cc: Oscar Salvador <osalvador@suse.de>
> > > > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > > > ---
> > > > >  mm/memory-failure.c | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > > > index 7cfa134b1370..60df8fcd0444 100644
> > > > > --- a/mm/memory-failure.c
> > > > > +++ b/mm/memory-failure.c
> > > > > @@ -1228,6 +1228,9 @@ static int get_any_page(struct page *p, unsigned long flags)
> > > > >               ret = -EIO;
> > > > >       }
> > > > >  out:
> > > > > +     if (ret == -EIO)
> > > > > +             dump_page(p, "hwpoison: unhandlable page");
> > > > > +
> > > >
> > > > I feel that 4 callers of get_hwpoison_page() are in the different context,
> > > > so it might be better to consider them separately to add dump_page() or not.
> > > > soft_offline_page() still prints out "%s: %#lx: unknown page type: %lx (%pGp)"
> > >
> > > No strong opinion to keep or remove it.
> >
> > Reading the explanation below, I think that calling dump_page() in the
> > original place is fine.  So let's remove "else if (ret == 0)" block in
> > soft_offline_page().
> 
> The "else if (ret == 0)" block is used to handle free page IIUC. I'm
> supposed you mean the "else if (ret == -EIO)" block which just calls
> printk.

Sorry, you're right. I miss-copied the line.

- Naoya Horiguchi
Luck, Tony Sept. 22, 2021, 7:37 p.m. UTC | #6
On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> Currently just very simple message is shown for unhandlable page, e.g.
> non-LRU page, like:
> soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> 
> It is not very helpful for further debug, calling dump_page() could show
> more useful information.

Looks like your code already caught something. An error injection
test may have injected into a shared library. Though I'm not sure that
the refcount/mapcount in the dump agrees with that diagnosis from the
author of this test.

Here's what appeared on the console:

[ 4817.622254] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4817.630520] page:000000003ab9dca4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xcef2747
[ 4817.638651] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4817.646860] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
[ 4818.025515] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.033689] raw: 0057ffffc0801000 ffd400033bc9d1c8 ffd400033bc9d1c8 0000000000000000
[ 4818.272435] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.280640] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 4818.280658] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.313606] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.321804] page dumped because: hwpoison: unhandlable page
[ 4818.564802] mce: Uncorrected hardware memory error in user-access at cef2747000
[ 4818.573043] Memory failure: 0xcef2747: recovery action for unknown page: Ignored
[ 4818.595837] Memory failure: 0xcef2747: already hardware poisoned
[ 4818.603245] Memory failure: 0xcef2747: Sending SIGBUS to multichase:67460 due to hardware memory corruption
[ 4818.614297] Memory failure: 0xcef2747: already hardware poisoned

-Tony
Yang Shi Sept. 22, 2021, 7:58 p.m. UTC | #7
On Wed, Sep 22, 2021 at 12:37 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> > Currently just very simple message is shown for unhandlable page, e.g.
> > non-LRU page, like:
> > soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> >
> > It is not very helpful for further debug, calling dump_page() could show
> > more useful information.
>
> Looks like your code already caught something. An error injection
> test may have injected into a shared library. Though I'm not sure that
> the refcount/mapcount in the dump agrees with that diagnosis from the
> author of this test.

The messages from dump_page() are (unwind them from mce logs):

[ 4817.630520] page:000000003ab9dca4 refcount:1 mapcount:0
mapping:0000000000000000 index:0x0 pfn:0xcef2747
[ 4817.646860] flags:
0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
[ 4818.033689] raw: 0057ffffc0801000 ffd400033bc9d1c8 ffd400033bc9d1c8
0000000000000000
[ 4818.280640] raw: 0000000000000000 0000000000000000 00000001ffffffff
0000000000000000

The page flags tell it is a "reserved" page and mapping is NULL. It
doesn't seem like a user page or movable page, so hwpoision can't
handle it so that the messages are dumped.

>
> Here's what appeared on the console:
>
> [ 4817.622254] mce: Uncorrected hardware memory error in user-access at cef2747000
> [ 4817.630520] page:000000003ab9dca4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xcef2747
> [ 4817.638651] mce: Uncorrected hardware memory error in user-access at cef2747000
> [ 4817.646860] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
> [ 4818.025515] mce: Uncorrected hardware memory error in user-access at cef2747000
> [ 4818.033689] raw: 0057ffffc0801000 ffd400033bc9d1c8 ffd400033bc9d1c8 0000000000000000
> [ 4818.272435] mce: Uncorrected hardware memory error in user-access at cef2747000
> [ 4818.280640] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> [ 4818.280658] mce: Uncorrected hardware memory error in user-access at cef2747000
> [ 4818.313606] mce: Uncorrected hardware memory error in user-access at cef2747000
> [ 4818.321804] page dumped because: hwpoison: unhandlable page
> [ 4818.564802] mce: Uncorrected hardware memory error in user-access at cef2747000
> [ 4818.573043] Memory failure: 0xcef2747: recovery action for unknown page: Ignored
> [ 4818.595837] Memory failure: 0xcef2747: already hardware poisoned
> [ 4818.603245] Memory failure: 0xcef2747: Sending SIGBUS to multichase:67460 due to hardware memory corruption
> [ 4818.614297] Memory failure: 0xcef2747: already hardware poisoned
>
> -Tony
Yang Shi Sept. 22, 2021, 8:37 p.m. UTC | #8
On Wed, Sep 22, 2021 at 12:58 PM Yang Shi <shy828301@gmail.com> wrote:
>
> On Wed, Sep 22, 2021 at 12:37 PM Luck, Tony <tony.luck@intel.com> wrote:
> >
> > On Wed, Aug 18, 2021 at 10:41:16PM -0700, Yang Shi wrote:
> > > Currently just very simple message is shown for unhandlable page, e.g.
> > > non-LRU page, like:
> > > soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> > >
> > > It is not very helpful for further debug, calling dump_page() could show
> > > more useful information.
> >
> > Looks like your code already caught something. An error injection
> > test may have injected into a shared library. Though I'm not sure that
> > the refcount/mapcount in the dump agrees with that diagnosis from the
> > author of this test.
>
> The messages from dump_page() are (unwind them from mce logs):
>
> [ 4817.630520] page:000000003ab9dca4 refcount:1 mapcount:0
> mapping:0000000000000000 index:0x0 pfn:0xcef2747
> [ 4817.646860] flags:
> 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
> [ 4818.033689] raw: 0057ffffc0801000 ffd400033bc9d1c8 ffd400033bc9d1c8
> 0000000000000000
> [ 4818.280640] raw: 0000000000000000 0000000000000000 00000001ffffffff
> 0000000000000000

Missed one line from the dump:

[ 4818.321804] page dumped because: hwpoison: unhandlable page

Anyway dump_page() is just called when unhandlable page is met.

>
> The page flags tell it is a "reserved" page and mapping is NULL. It
> doesn't seem like a user page or movable page, so hwpoision can't
> handle it so that the messages are dumped.
>
> >
> > Here's what appeared on the console:
> >
> > [ 4817.622254] mce: Uncorrected hardware memory error in user-access at cef2747000
> > [ 4817.630520] page:000000003ab9dca4 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xcef2747
> > [ 4817.638651] mce: Uncorrected hardware memory error in user-access at cef2747000
> > [ 4817.646860] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
> > [ 4818.025515] mce: Uncorrected hardware memory error in user-access at cef2747000
> > [ 4818.033689] raw: 0057ffffc0801000 ffd400033bc9d1c8 ffd400033bc9d1c8 0000000000000000
> > [ 4818.272435] mce: Uncorrected hardware memory error in user-access at cef2747000
> > [ 4818.280640] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> > [ 4818.280658] mce: Uncorrected hardware memory error in user-access at cef2747000
> > [ 4818.313606] mce: Uncorrected hardware memory error in user-access at cef2747000
> > [ 4818.321804] page dumped because: hwpoison: unhandlable page
> > [ 4818.564802] mce: Uncorrected hardware memory error in user-access at cef2747000
> > [ 4818.573043] Memory failure: 0xcef2747: recovery action for unknown page: Ignored
> > [ 4818.595837] Memory failure: 0xcef2747: already hardware poisoned
> > [ 4818.603245] Memory failure: 0xcef2747: Sending SIGBUS to multichase:67460 due to hardware memory corruption
> > [ 4818.614297] Memory failure: 0xcef2747: already hardware poisoned
> >
> > -Tony
diff mbox series

Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 7cfa134b1370..60df8fcd0444 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1228,6 +1228,9 @@  static int get_any_page(struct page *p, unsigned long flags)
 		ret = -EIO;
 	}
 out:
+	if (ret == -EIO)
+		dump_page(p, "hwpoison: unhandlable page");
+
 	return ret;
 }