Message ID | 20220212213740.423efcea@imladris.surriel.com (mailing list archive)
---|---
State | New
Series | [v2] mm: clean up hwpoison page cache page in fault path
> Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault path

At first scan I thought this was a code cleanup.

I think I'll do s/clean up/invalidate/.

On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@surriel.com> wrote:

> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page.

Is this correct behaviour?

> This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
>
> This is particularly embarrassing when the page was offlined due to
> having too many corrected memory errors. Now we are killing tasks
> due to them trying to access memory that probably isn't even corrupted.
>
> This problem can be avoided by invalidating the page from the page
> fault handler, which already has a branch for dealing with these
> kinds of pages. With this patch we simply pretend the page fault
> was successful if the page was invalidated, return to userspace,
> incur another page fault, read in the file from disk (to a new
> memory page), and then everything works again.

Is this worth a cc:stable?
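The retry behaviour the changelog promises can be modelled in a few lines. This is a hedged userspace sketch, not kernel code: `cache_slot` and `fault_once` are hypothetical names standing in for a page cache entry and one pass through the fault handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a page cache slot that may hold a clean but
 * hwpoisoned copy of file data. */
struct cache_slot {
    bool present;
    bool poisoned;
};

/*
 * One fault attempt. Returns true when the access was satisfied.
 * A poisoned clean page is dropped from the cache (the
 * invalidate_inode_page() analogue) so the next attempt reads a
 * fresh copy from disk instead of the task being killed.
 */
static bool fault_once(struct cache_slot *slot)
{
    if (!slot->present) {
        /* read the file from disk into a new, healthy page */
        slot->present = true;
        slot->poisoned = false;
        return true;
    }
    if (slot->poisoned) {
        slot->present = false;  /* invalidate; caller simply refaults */
        return false;
    }
    return true;                /* healthy cached page: done */
}
```

With the patch applied, the second fault succeeds against a freshly read page; without it, every attempt would end in a SIGBUS and a respawn.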
On Mon, 2022-02-14 at 15:24 -0800, Andrew Morton wrote:
> > Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault
> > path
>
> At first scan I thought this was a code cleanup.
>
> I think I'll do s/clean up/invalidate/.

OK, that sounds good.

> On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@surriel.com>
> wrote:
>
> > Sometimes the page offlining code can leave behind a hwpoisoned
> > clean page cache page.
>
> Is this correct behaviour?

It is not desirable, and the soft page offlining code tries to
invalidate the page, but I don't think overhauling the way we lock and
refcount page cache pages just to make offlining them more reliable
would be worthwhile, when we already have a branch in the page fault
handler to deal with these pages, anyway.

> > [...]
>
> Is this worth a cc:stable?

Maybe. I don't know how far back this issue goes...
On Mon, Feb 14, 2022 at 08:37:26PM -0500, Rik van Riel wrote:
> On Mon, 2022-02-14 at 15:24 -0800, Andrew Morton wrote:
> > [...]
> > Is this correct behaviour?
>
> It is not desirable, and the soft page offlining code
> tries to invalidate the page, but I don't think overhauling
> the way we lock and refcount page cache pages just to make
> offlining them more reliable would be worthwhile, when we
> already have a branch in the page fault handler to deal with
> these pages, anyway.

I don't have any idea how this kind of page is left in the page cache
after page offlining, but I agree with the suggested change.

> > [...]
> > Is this worth a cc:stable?
>
> Maybe. I don't know how far back this issue goes...

This issue should be orthogonal to recent changes on hwpoison, and the
base code targeted by this patch is unchanged since 2016 (4.10-rc1), so
this patch is simply applicable to most of the maintained stable trees
(maybe except 4.9.z).

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

Thanks,
Naoya Horiguchi
On 13.02.22 03:37, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.

Hi Rik,

am I correct that you are only talking about soft offlining as
triggered from mm/memory-failure.c, not page offlining as in memory
offlining (mm/memory_hotplug.c)?

Maybe you can clarify that in the patch description; it made me nervous
for a second :)
On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.

Hi Rik,

Do you know how that exactly happens? We should not really be leaving
anything behind, and the soft-offline (not hard) code works with the
premise of only poisoning a page in case it was contained, so I am
wondering what is going on here.

In-use page cache pages are migrated away and the actual page is
contained, and for clean ones we already do the invalidate_inode_page()
and then contain the page in case we succeed.

One scenario I can imagine this can happen is if by the time we call
page_handle_poison(), someone has taken another refcount on the page,
and the put_page() does not really free it, but I am not sure that
can happen.
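The intended soft-offline flow for page cache pages described above can be sketched as a small decision function. This is a userspace model under stated assumptions, not mm/memory-failure.c: the kernel helpers (invalidate_inode_page(), migration, page_handle_poison()) are replaced by boolean outcomes supplied by the caller, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Model of how soft offline is supposed to handle a page cache page:
 * a clean page is invalidated and then contained; an in-use/dirty one
 * is migrated away first; if neither works, the page must not be
 * poisoned while still usable.
 */
static const char *soft_offline_page_cache(bool clean,
                                           bool invalidate_ok,
                                           bool migrate_ok)
{
    if (clean && invalidate_ok)
        /* dropped from the cache, then contained (page_handle_poison) */
        return "invalidated";
    if (migrate_ok)
        /* data moved to a healthy page before containment */
        return "migrated";
    /* no containment happened; the page stays in service */
    return "not contained";
}
```

The bug under discussion is that a page somehow ends up both hwpoisoned and still sitting in the page cache, i.e. a state this model says should never be reached.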
On Tue, 2022-02-15 at 12:22 +0100, David Hildenbrand wrote:
> On 13.02.22 03:37, Rik van Riel wrote:
> > [...]
>
> Hi Rik,
>
> am I correct that you are only talking about soft offlining as
> triggered from mm/memory-failure.c, not page offlining as in memory
> offlining (mm/memory_hotplug.c)?

That is correct in that I am talking only about memory-failure.c;
however, the code in memory-failure.c has both hard and soft offlining
modes, and I suppose this patch covers both?
On Tue, 2022-02-15 at 13:51 +0100, Oscar Salvador wrote:
> On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> > [...]
>
> Hi Rik,
>
> Do you know how that exactly happens? We should not be really leaving
> anything behind, and soft-offline (not hard) code works with the
> premise of only poisoning a page in case it was contained, so I am
> wondering what is going on here.
>
> In-use pagecache pages are migrated away, and the actual page is
> contained, and for clean ones, we already do the
> invalidate_inode_page() and then contain it in case we succeed.

I do not know the exact failure case, since I have never caught a
system in the act of leaking one of these pages. I just know I have
seen this issue on systems where the "soft_offline: %#lx: invalidated\n"
printk was the only offline method leaving any message in the kernel
log.

However, there are a few code paths through the soft offlining code
that don't seem to have any printks, so I am not sure exactly where
things went wrong. I only really found the aftermath, and tested this
patch by loading it as a kernel live patch module on some of those
systems.
On 15.02.22 16:01, Rik van Riel wrote:
> On Tue, 2022-02-15 at 12:22 +0100, David Hildenbrand wrote:
> > [...]
>
> That is correct in that I am talking only about memory-failure.c,
> however the code in memory-failure.c has both hard and soft
> offlining modes, and I suppose this patch covers both?

Right, for hwpoison handling there is hard and soft offlining of
pages... maybe "hwpoison page offlining" would be clearer, not sure.

Thanks for clarifying!
On 2022/2/15 20:51, Oscar Salvador wrote:
> On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> > [...]
>
> Hi Rik,
>
> Do you know how that exactly happens? We should not be really leaving
> anything behind, and soft-offline (not hard) code works with the
> premise of only poisoning a page in case it was contained, so I am
> wondering what is going on here.
>
> In-use pagecache pages are migrated away, and the actual page is
> contained, and for clean ones, we already do the
> invalidate_inode_page() and then contain it in case we succeed.

IIUC, this could not happen when soft-offlining a page cache page:
they're either invalidated or migrated away before we set PageHWPoison.
I think this may happen on a clean page cache page when it's isolated,
so it's !PageLRU, and identify_page_state() treats it as me_unknown
because it's not reserved, slab, swapcache, and so on (see error_states
for details). Or am I missing anything?

Thanks.

> One scenario I can imagine this can happen is if by the time we call
> page_handle_poison(), someone has taken another refcount on the page,
> and the put_page() does not really free it, but I am not sure that
> can happen.
On Wed, Feb 16, 2022 at 10:13:14AM +0800, Miaohe Lin wrote:
> IIUC, this could not happen when soft-offlining a pagecache page.
> They're either invalidated or migrated away and then we set
> PageHWPoison. [...]

But the path you are talking about is when we do have a non-recoverable
error, so the memory_failure() path. AFAIU, Rik talks about pages with
corrected errors, and that is soft_offline().
On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
>
> This is particularly embarrassing when the page was offlined due to
> having too many corrected memory errors. Now we are killing tasks
> due to them trying to access memory that probably isn't even corrupted.
>
> This problem can be avoided by invalidating the page from the page
> fault handler, which already has a branch for dealing with these
> kinds of pages. With this patch we simply pretend the page fault
> was successful if the page was invalidated, return to userspace,
> incur another page fault, read in the file from disk (to a new
> memory page), and then everything works again.
>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Although I would really love to understand how we got there, it fixes
the problem, so:

Reviewed-by: Oscar Salvador <osalvador@suse.de>
On 2022/2/21 20:54, Oscar Salvador wrote:
> On Wed, Feb 16, 2022 at 10:13:14AM +0800, Miaohe Lin wrote:
> > [...]
>
> But the path you are talking about is when we do have a
> non-recoverable error, so memory_failure() path.
> AFAIU, Rik talks about pages with corrected errors, and that is
> soft_offline().

Oh, yes, Rik talks about pages with corrected errors. My mistake.
Then I really want to understand how we got there too. :)

Thanks.
diff --git a/mm/memory.c b/mm/memory.c
index c125c4969913..55270ea2a7c7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3871,11 +3871,16 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf->page))) {
-		if (ret & VM_FAULT_LOCKED)
+		vm_fault_t poisonret = VM_FAULT_HWPOISON;
+		if (ret & VM_FAULT_LOCKED) {
+			/* Retry if a clean page was removed from the cache. */
+			if (invalidate_inode_page(vmf->page))
+				poisonret = 0;
 			unlock_page(vmf->page);
+		}
 		put_page(vmf->page);
 		vmf->page = NULL;
-		return VM_FAULT_HWPOISON;
+		return poisonret;
 	}
 
 	if (unlikely(!(ret & VM_FAULT_LOCKED)))
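Stripped of the kernel machinery, the patched branch reduces to a small piece of decision logic. The sketch below is a userspace model with stand-in values: the VM_FAULT_* constants are arbitrary here, and the invalidation outcome is passed in rather than computed by invalidate_inode_page().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's vm_fault_t bits (values are arbitrary). */
#define VM_FAULT_LOCKED   0x01
#define VM_FAULT_HWPOISON 0x02

/*
 * Mirrors the decision logic of the patched branch in __do_fault():
 * only when the page is locked may we try to drop a clean copy from
 * the page cache; on success we report 0 so the task retries the
 * fault, otherwise we keep returning VM_FAULT_HWPOISON (SIGBUS).
 */
static int hwpoison_fault_result(int ret, bool invalidated)
{
    int poisonret = VM_FAULT_HWPOISON;

    if (ret & VM_FAULT_LOCKED) {
        if (invalidated)
            poisonret = 0;  /* pretend success; userspace refaults */
        /* unlock_page(vmf->page) happens here in the real code */
    }
    /* put_page(vmf->page); vmf->page = NULL; in the real code */
    return poisonret;
}
```

Note that an unlocked page is never invalidated: without VM_FAULT_LOCKED the fault handler cannot safely touch the page cache entry, so the old kill-the-task behaviour is kept for that case.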