Message ID | 20220212213740.423efcea@imladris.surriel.com (mailing list archive)
---|---
State | New
Series | [v2] mm: clean up hwpoison page cache page in fault path
> Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault path

At first scan I thought this was a code cleanup.

I think I'll do s/clean up/invalidate/.

On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@surriel.com> wrote:

> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page.

Is this correct behaviour?

> This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
>
> This is particularly embarrassing when the page was offlined due to
> having too many corrected memory errors. Now we are killing tasks
> due to them trying to access memory that probably isn't even corrupted.
>
> This problem can be avoided by invalidating the page from the page
> fault handler, which already has a branch for dealing with these
> kinds of pages. With this patch we simply pretend the page fault
> was successful if the page was invalidated, return to userspace,
> incur another page fault, read in the file from disk (to a new
> memory page), and then everything works again.

Is this worth a cc:stable?
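The retry behaviour the changelog promises can be modelled in a few lines. This is a hedged userspace sketch, not kernel code: `cache_slot` and `fault_once` are hypothetical names standing in for a page cache entry and one pass through the fault handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a page cache slot that may hold a clean but
 * hwpoisoned copy of file data. */
struct cache_slot {
    bool present;
    bool poisoned;
};

/*
 * One fault attempt. Returns true when the access was satisfied.
 * A poisoned clean page is dropped from the cache (the
 * invalidate_inode_page() analogue) so the next attempt reads a
 * fresh copy from disk instead of the task being killed.
 */
static bool fault_once(struct cache_slot *slot)
{
    if (!slot->present) {
        /* read the file from disk into a new, healthy page */
        slot->present = true;
        slot->poisoned = false;
        return true;
    }
    if (slot->poisoned) {
        slot->present = false;  /* invalidate; caller simply refaults */
        return false;
    }
    return true;                /* healthy cached page: done */
}
```

With the patch applied, the second fault succeeds against a freshly read page; without it, every attempt would end in a SIGBUS and a respawn.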
On Mon, 2022-02-14 at 15:24 -0800, Andrew Morton wrote:
> > Subject: [PATCH v2] mm: clean up hwpoison page cache page in fault
> > path
>
> At first scan I thought this was a code cleanup.
>
> I think I'll do s/clean up/invalidate/.

OK, that sounds good.

> On Sat, 12 Feb 2022 21:37:40 -0500 Rik van Riel <riel@surriel.com>
> wrote:
>
> > Sometimes the page offlining code can leave behind a hwpoisoned
> > clean page cache page.
>
> Is this correct behaviour?

It is not desirable, and the soft page offlining code tries to
invalidate the page, but I don't think overhauling the way we lock and
refcount page cache pages just to make offlining them more reliable
would be worthwhile, when we already have a branch in the page fault
handler to deal with these pages, anyway.

> > [...]
>
> Is this worth a cc:stable?

Maybe. I don't know how far back this issue goes...
On Mon, Feb 14, 2022 at 08:37:26PM -0500, Rik van Riel wrote:
> On Mon, 2022-02-14 at 15:24 -0800, Andrew Morton wrote:
> > [...]
> > Is this correct behaviour?
>
> It is not desirable, and the soft page offlining code
> tries to invalidate the page, but I don't think overhauling
> the way we lock and refcount page cache pages just to make
> offlining them more reliable would be worthwhile, when we
> already have a branch in the page fault handler to deal with
> these pages, anyway.

I don't have any idea how this kind of page is left in the page cache
after page offlining, but I agree with the suggested change.

> > [...]
> > Is this worth a cc:stable?
>
> Maybe. I don't know how far back this issue goes...

This issue should be orthogonal to recent changes on hwpoison, and the
base code targeted by this patch is unchanged since 2016 (4.10-rc1), so
this patch is simply applicable to most of the maintained stable trees
(maybe except 4.9.z).

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

Thanks,
Naoya Horiguchi
On 13.02.22 03:37, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.

Hi Rik,

am I correct that you are only talking about soft offlining as
triggered from mm/memory-failure.c, not page offlining as in memory
offlining (mm/memory_hotplug.c)?

Maybe you can clarify that in the patch description; it made me nervous
for a second :)
On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.

Hi Rik,

Do you know how that exactly happens? We should not really be leaving
anything behind, and the soft-offline (not hard) code works with the
premise of only poisoning a page in case it was contained, so I am
wondering what is going on here.

In-use page cache pages are migrated away and the actual page is
contained, and for clean ones we already do the invalidate_inode_page()
and then contain the page in case we succeed.

One scenario I can imagine this can happen is if by the time we call
page_handle_poison(), someone has taken another refcount on the page,
and the put_page() does not really free it, but I am not sure that
can happen.
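The intended soft-offline flow for page cache pages described above can be sketched as a small decision function. This is a userspace model under stated assumptions, not mm/memory-failure.c: the kernel helpers (invalidate_inode_page(), migration, page_handle_poison()) are replaced by boolean outcomes supplied by the caller, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Model of how soft offline is supposed to handle a page cache page:
 * a clean page is invalidated and then contained; an in-use/dirty one
 * is migrated away first; if neither works, the page must not be
 * poisoned while still usable.
 */
static const char *soft_offline_page_cache(bool clean,
                                           bool invalidate_ok,
                                           bool migrate_ok)
{
    if (clean && invalidate_ok)
        /* dropped from the cache, then contained (page_handle_poison) */
        return "invalidated";
    if (migrate_ok)
        /* data moved to a healthy page before containment */
        return "migrated";
    /* no containment happened; the page stays in service */
    return "not contained";
}
```

The bug under discussion is that a page somehow ends up both hwpoisoned and still sitting in the page cache, i.e. a state this model says should never be reached.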
On Tue, 2022-02-15 at 12:22 +0100, David Hildenbrand wrote:
> On 13.02.22 03:37, Rik van Riel wrote:
> > [...]
>
> Hi Rik,
>
> am I correct that you are only talking about soft offlining as
> triggered from mm/memory-failure.c, not page offlining as in memory
> offlining (mm/memory_hotplug.c)?

That is correct in that I am talking only about memory-failure.c;
however, the code in memory-failure.c has both hard and soft offlining
modes, and I suppose this patch covers both?
On Tue, 2022-02-15 at 13:51 +0100, Oscar Salvador wrote:
> On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> > [...]
>
> Hi Rik,
>
> Do you know how that exactly happens? We should not be really leaving
> anything behind, and soft-offline (not hard) code works with the
> premise of only poisoning a page in case it was contained, so I am
> wondering what is going on here.
>
> In-use pagecache pages are migrated away, and the actual page is
> contained, and for clean ones, we already do the
> invalidate_inode_page() and then contain it in case we succeed.

I do not know the exact failure case, since I have never caught a
system in the act of leaking one of these pages. I just know I have
seen this issue on systems where the "soft_offline: %#lx: invalidated\n"
printk was the only offline method leaving any message in the kernel
log.

However, there are a few code paths through the soft offlining code
that don't seem to have any printks, so I am not sure exactly where
things went wrong. I only really found the aftermath, and tested this
patch by loading it as a kernel live patch module on some of those
systems.
On 15.02.22 16:01, Rik van Riel wrote:
> On Tue, 2022-02-15 at 12:22 +0100, David Hildenbrand wrote:
> > [...]
>
> That is correct in that I am talking only about memory-failure.c,
> however the code in memory-failure.c has both hard and soft
> offlining modes, and I suppose this patch covers both?

Right, for hwpoison handling there is hard and soft offlining of
pages... maybe "hwpoison page offlining" would be clearer, not sure.

Thanks for clarifying!
On 2022/2/15 20:51, Oscar Salvador wrote:
> On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> > [...]
>
> Hi Rik,
>
> Do you know how that exactly happens? We should not be really leaving
> anything behind, and soft-offline (not hard) code works with the
> premise of only poisoning a page in case it was contained, so I am
> wondering what is going on here.
>
> In-use pagecache pages are migrated away, and the actual page is
> contained, and for clean ones, we already do the
> invalidate_inode_page() and then contain it in case we succeed.

IIUC, this could not happen when soft-offlining a page cache page:
they're either invalidated or migrated away before we set PageHWPoison.
I think this may happen on a clean page cache page when it's isolated,
so it's !PageLRU, and identify_page_state() treats it as me_unknown
because it's not reserved, slab, swapcache, and so on (see error_states
for details). Or am I missing anything?

Thanks.

> One scenario I can imagine this can happen is if by the time we call
> page_handle_poison(), someone has taken another refcount on the page,
> and the put_page() does not really free it, but I am not sure that
> can happen.
On Wed, Feb 16, 2022 at 10:13:14AM +0800, Miaohe Lin wrote:
> IIUC, this could not happen when soft-offlining a pagecache page.
> They're either invalidated or migrated away and then we set
> PageHWPoison. [...]

But the path you are talking about is when we do have a non-recoverable
error, so the memory_failure() path. AFAIU, Rik talks about pages with
corrected errors, and that is soft_offline().
On Sat, Feb 12, 2022 at 09:37:40PM -0500, Rik van Riel wrote:
> Sometimes the page offlining code can leave behind a hwpoisoned clean
> page cache page. This can lead to programs being killed over and over
> and over again as they fault in the hwpoisoned page, get killed, and
> then get re-spawned by whatever wanted to run them.
>
> This is particularly embarrassing when the page was offlined due to
> having too many corrected memory errors. Now we are killing tasks
> due to them trying to access memory that probably isn't even corrupted.
>
> This problem can be avoided by invalidating the page from the page
> fault handler, which already has a branch for dealing with these
> kinds of pages. With this patch we simply pretend the page fault
> was successful if the page was invalidated, return to userspace,
> incur another page fault, read in the file from disk (to a new
> memory page), and then everything works again.
>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

Although I would really love to understand how we got there, it fixes
the problem, so:

Reviewed-by: Oscar Salvador <osalvador@suse.de>
On 2022/2/21 20:54, Oscar Salvador wrote:
> On Wed, Feb 16, 2022 at 10:13:14AM +0800, Miaohe Lin wrote:
> > [...]
>
> But the path you are talking about is when we do have a
> non-recoverable error, so memory_failure() path.
> AFAIU, Rik talks about pages with corrected errors, and that is
> soft_offline().

Oh, yes, Rik talks about pages with corrected errors. My mistake.
Then I really want to understand how we got there too. :)

Thanks.
diff --git a/mm/memory.c b/mm/memory.c
index c125c4969913..55270ea2a7c7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3871,11 +3871,16 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf->page))) {
-		if (ret & VM_FAULT_LOCKED)
+		vm_fault_t poisonret = VM_FAULT_HWPOISON;
+		if (ret & VM_FAULT_LOCKED) {
+			/* Retry if a clean page was removed from the cache. */
+			if (invalidate_inode_page(vmf->page))
+				poisonret = 0;
 			unlock_page(vmf->page);
+		}
 		put_page(vmf->page);
 		vmf->page = NULL;
-		return VM_FAULT_HWPOISON;
+		return poisonret;
 	}
 
 	if (unlikely(!(ret & VM_FAULT_LOCKED)))
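Stripped of the kernel machinery, the patched branch reduces to a small piece of decision logic. The sketch below is a userspace model with stand-in values: the VM_FAULT_* constants are arbitrary here, and the invalidation outcome is passed in rather than computed by invalidate_inode_page().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's vm_fault_t bits (values are arbitrary). */
#define VM_FAULT_LOCKED   0x01
#define VM_FAULT_HWPOISON 0x02

/*
 * Mirrors the decision logic of the patched branch in __do_fault():
 * only when the page is locked may we try to drop a clean copy from
 * the page cache; on success we report 0 so the task retries the
 * fault, otherwise we keep returning VM_FAULT_HWPOISON (SIGBUS).
 */
static int hwpoison_fault_result(int ret, bool invalidated)
{
    int poisonret = VM_FAULT_HWPOISON;

    if (ret & VM_FAULT_LOCKED) {
        if (invalidated)
            poisonret = 0;  /* pretend success; userspace refaults */
        /* unlock_page(vmf->page) happens here in the real code */
    }
    /* put_page(vmf->page); vmf->page = NULL; in the real code */
    return poisonret;
}
```

Note that an unlocked page is never invalidated: without VM_FAULT_LOCKED the fault handler cannot safely touch the page cache entry, so the old kill-the-task behaviour is kept for that case.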