[v1,3/7] mm: rename and move page_is_poisoned()

Message ID	20210429122519.15183-4-david@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=wbcb=J2=kvack.org=owner-linux-mm@kernel.org> DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 16392613F1 From: David Hildenbrand <david@redhat.com> To: linux-kernel@vger.kernel.org Cc: David Hildenbrand <david@redhat.com>, Andrew Morton <akpm@linux-foundation.org>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, Alexey Dobriyan <adobriyan@gmail.com>, Mike Rapoport <rppt@kernel.org>, "Matthew Wilcox (Oracle)" <willy@infradead.org>, Oscar Salvador <osalvador@suse.de>, Michal Hocko <mhocko@suse.com>, Roman Gushchin <guro@fb.com>, Alex Shi <alex.shi@linux.alibaba.com>, Steven Price <steven.price@arm.com>, Mike Kravetz <mike.kravetz@oracle.com>, Aili Yao <yaoaili@kingsoft.com>, Jiri Bohac <jbohac@suse.cz>, "K. Y. Srinivasan" <kys@microsoft.com>, Haiyang Zhang <haiyangz@microsoft.com>, Stephen Hemminger <sthemmin@microsoft.com>, Wei Liu <wei.liu@kernel.org>, Naoya Horiguchi <naoya.horiguchi@nec.com>, linux-hyperv@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v1 3/7] mm: rename and move page_is_poisoned() Date: Thu, 29 Apr 2021 14:25:15 +0200 Message-Id: <20210429122519.15183-4-david@redhat.com> In-Reply-To: <20210429122519.15183-1-david@redhat.com> References: <20210429122519.15183-1-david@redhat.com> MIME-Version: 1.0 Received-SPF: none (redhat.com>: No applicable sender policy available) receiver=imf04; identity=mailfrom; envelope-from="<david@redhat.com>"; helo=us-smtp-delivery-124.mimecast.com; client-ip=170.10.133.124 Content-Transfer-Encoding: quoted-printable Sender: owner-linux-mm@kvack.org Precedence: bulk
Series	fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages \| expand [v1,0/7] fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages [v1,1/7] fs/proc/kcore: drop KCORE_REMAP and KCORE_OTHER [v1,2/7] fs/proc/kcore: pfn_is_ram check only applies to KCORE_RAM [v1,3/7] mm: rename and move page_is_poisoned() [v1,4/7] fs/proc/kcore: don't read offline sections, logically offline pages and hwpoisoned pages [v1,5/7] mm: introduce page_offline_(begin\|end\|freeze\|unfreeze) to synchronize setting PageOffline() [v1,6/7] virtio-mem: use page_offline_(start\|end) when setting PageOffline() [v1,7/7] fs/proc/kcore: use page_offline_(freeze\|unfreeze)

David Hildenbrand April 29, 2021, 12:25 p.m. UTC

Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
introduced page_is_poisoned(), however, v5 [1] of the patch used
"page_is_hwpoison()" and something went wrong while upstreaming. Rename the
function and move it to page-flags.h, from where it can be used in other
-- kcore -- context.

Move the comment to the place where it belongs and simplify.

[1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/page-flags.h |  7 +++++++
 mm/gup.c                   |  6 +++++-
 mm/internal.h              | 20 --------------------
 3 files changed, 12 insertions(+), 21 deletions(-)

Mike Rapoport May 2, 2021, 6:32 a.m. UTC | #1

On Thu, Apr 29, 2021 at 02:25:15PM +0200, David Hildenbrand wrote:
> Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> introduced page_is_poisoned(), however, v5 [1] of the patch used
> "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> function and move it to page-flags.h, from where it can be used in other
> -- kcore -- context.
> 
> Move the comment to the place where it belongs and simplify.
> 
> [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>

> ---
>  include/linux/page-flags.h |  7 +++++++
>  mm/gup.c                   |  6 +++++-
>  mm/internal.h              | 20 --------------------
>  3 files changed, 12 insertions(+), 21 deletions(-)

Nice :)

> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 04a34c08e0a6..b8c56672a588 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -694,6 +694,13 @@ PAGEFLAG_FALSE(DoubleMap)
>  	TESTSCFLAG_FALSE(DoubleMap)
>  #endif
>  
> +static inline bool is_page_hwpoison(struct page *page)
> +{
> +	if (PageHWPoison(page))
> +		return true;
> +	return PageHuge(page) && PageHWPoison(compound_head(page));
> +}
> +
>  /*
>   * For pages that are never mapped to userspace (and aren't PageSlab),
>   * page_type may be used.  Because it is initialised to -1, we invert the
> diff --git a/mm/gup.c b/mm/gup.c
> index ef7d2da9f03f..000f3303e7f2 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1536,7 +1536,11 @@ struct page *get_dump_page(unsigned long addr)
>  	if (locked)
>  		mmap_read_unlock(mm);
>  
> -	if (ret == 1 && is_page_poisoned(page))
> +	/*
> +	 * We might have hwpoisoned pages still mapped into user space. Don't
> +	 * read these pages when creating a coredump, access could be fatal.
> +	 */
> +	if (ret == 1 && is_page_hwpoison(page))
>  		return NULL;
>  
>  	return (ret == 1) ? page : NULL;
> diff --git a/mm/internal.h b/mm/internal.h
> index cb3c5e0a7799..1432feec62df 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -97,26 +97,6 @@ static inline void set_page_refcounted(struct page *page)
>  	set_page_count(page, 1);
>  }
>  
> -/*
> - * When kernel touch the user page, the user page may be have been marked
> - * poison but still mapped in user space, if without this page, the kernel
> - * can guarantee the data integrity and operation success, the kernel is
> - * better to check the posion status and avoid touching it, be good not to
> - * panic, coredump for process fatal signal is a sample case matching this
> - * scenario. Or if kernel can't guarantee the data integrity, it's better
> - * not to call this function, let kernel touch the poison page and get to
> - * panic.
> - */
> -static inline bool is_page_poisoned(struct page *page)
> -{
> -	if (PageHWPoison(page))
> -		return true;
> -	else if (PageHuge(page) && PageHWPoison(compound_head(page)))
> -		return true;
> -
> -	return false;
> -}
> -
>  extern unsigned long highest_memmap_pfn;
>  
>  /*
> -- 
> 2.30.2
>

Michal Hocko May 5, 2021, 1:13 p.m. UTC | #2

On Thu 29-04-21 14:25:15, David Hildenbrand wrote:
> Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> introduced page_is_poisoned(), however, v5 [1] of the patch used
> "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> function and move it to page-flags.h, from where it can be used in other
> -- kcore -- context.
> 
> Move the comment to the place where it belongs and simplify.
> 
> [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

I do agree that being explicit about hwpoison is much better. Poisoned
page can be also an unitialized one and I believe this is the reason why
you are bringing that up.

But you've made me look at d3378e86d182 and I am wondering whether this
is really a valid patch. First of all it can leak a reference count
AFAICS. Moreover it doesn't really fix anything because the page can be
marked hwpoison right after the check is done. I do not think the race
is feasible to be closed. So shouldn't we rather revert it?

David Hildenbrand May 5, 2021, 1:17 p.m. UTC | #3

On 05.05.21 15:13, Michal Hocko wrote:
> On Thu 29-04-21 14:25:15, David Hildenbrand wrote:
>> Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
>> introduced page_is_poisoned(), however, v5 [1] of the patch used
>> "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
>> function and move it to page-flags.h, from where it can be used in other
>> -- kcore -- context.
>>
>> Move the comment to the place where it belongs and simplify.
>>
>> [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> I do agree that being explicit about hwpoison is much better. Poisoned
> page can be also an unitialized one and I believe this is the reason why
> you are bringing that up.

I'm bringing it up because I want to reuse that function as state above :)

> 
> But you've made me look at d3378e86d182 and I am wondering whether this
> is really a valid patch. First of all it can leak a reference count
> AFAICS. Moreover it doesn't really fix anything because the page can be
> marked hwpoison right after the check is done. I do not think the race
> is feasible to be closed. So shouldn't we rather revert it?

I am not sure if we really care about races here that much here? I mean, 
essentially we are racing with HW breaking asynchronously. Just because 
we would be synchronizing with SetPageHWPoison() wouldn't mean we can 
stop HW from breaking.

Long story short, this should be good enough for the cases we actually 
can handle? What am I missing?

Michal Hocko May 5, 2021, 1:27 p.m. UTC | #4

On Wed 05-05-21 15:17:53, David Hildenbrand wrote:
> On 05.05.21 15:13, Michal Hocko wrote:
> > On Thu 29-04-21 14:25:15, David Hildenbrand wrote:
> > > Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> > > introduced page_is_poisoned(), however, v5 [1] of the patch used
> > > "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> > > function and move it to page-flags.h, from where it can be used in other
> > > -- kcore -- context.
> > > 
> > > Move the comment to the place where it belongs and simplify.
> > > 
> > > [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> > > 
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > 
> > I do agree that being explicit about hwpoison is much better. Poisoned
> > page can be also an unitialized one and I believe this is the reason why
> > you are bringing that up.
> 
> I'm bringing it up because I want to reuse that function as state above :)
> 
> > 
> > But you've made me look at d3378e86d182 and I am wondering whether this
> > is really a valid patch. First of all it can leak a reference count
> > AFAICS. Moreover it doesn't really fix anything because the page can be
> > marked hwpoison right after the check is done. I do not think the race
> > is feasible to be closed. So shouldn't we rather revert it?
> 
> I am not sure if we really care about races here that much here? I mean,
> essentially we are racing with HW breaking asynchronously. Just because we
> would be synchronizing with SetPageHWPoison() wouldn't mean we can stop HW
> from breaking.

Right

> Long story short, this should be good enough for the cases we actually can
> handle? What am I missing?

I am not sure I follow. My point is that I fail to see any added value
of the check as it doesn't prevent the race (it fundamentally cannot as
the page can be poisoned at any time) but the failure path doesn't
put_page which is incorrect even for hwpoison pages.

David Hildenbrand May 5, 2021, 1:39 p.m. UTC | #5

>> Long story short, this should be good enough for the cases we actually can
>> handle? What am I missing?
> 
> I am not sure I follow. My point is that I fail to see any added value
> of the check as it doesn't prevent the race (it fundamentally cannot as
> the page can be poisoned at any time) but the failure path doesn't
> put_page which is incorrect even for hwpoison pages.

Oh, I think you are right. If we have a page and return NULL we would 
leak a reference.

Actually, we discussed in that thread handling this entirely 
differently, which resulted in a v7 [1]; however Andrew moved forward 
with this (outdated?) patch, maybe that was just a mistake?

Yes, I agree we should revert that patch for now.

Regarding the race comment: AFAIU e.g., [2], it's not really a problem 
with a race, but rather some corner case issue that can happen if we 
fail in memory_failure().


[1] https://lkml.kernel.org/r/20210406104123.451ee3c3@alex-virtual-machine
[2] 
https://lkml.kernel.org/r/20210331015258.GB22060@hori.linux.bs1.fc.nec.co.jp

Michal Hocko May 5, 2021, 1:45 p.m. UTC | #6

On Wed 05-05-21 15:39:08, David Hildenbrand wrote:
> > > Long story short, this should be good enough for the cases we actually can
> > > handle? What am I missing?
> > 
> > I am not sure I follow. My point is that I fail to see any added value
> > of the check as it doesn't prevent the race (it fundamentally cannot as
> > the page can be poisoned at any time) but the failure path doesn't
> > put_page which is incorrect even for hwpoison pages.
> 
> Oh, I think you are right. If we have a page and return NULL we would leak a
> reference.
> 
> Actually, we discussed in that thread handling this entirely differently,
> which resulted in a v7 [1]; however Andrew moved forward with this
> (outdated?) patch, maybe that was just a mistake?
> 
> Yes, I agree we should revert that patch for now.

OK, Let me send the revert to Andrew.

yaoaili [么爱利] May 6, 2021, 12:56 a.m. UTC | #7

On Wed, 5 May 2021 15:27:39 +0200
Michal Hocko <mhocko@suse.com> wrote:

> On Wed 05-05-21 15:17:53, David Hildenbrand wrote:
> > On 05.05.21 15:13, Michal Hocko wrote:  
> > > On Thu 29-04-21 14:25:15, David Hildenbrand wrote:  
> > > > Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> > > > introduced page_is_poisoned(), however, v5 [1] of the patch used
> > > > "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> > > > function and move it to page-flags.h, from where it can be used in other
> > > > -- kcore -- context.
> > > > 
> > > > Move the comment to the place where it belongs and simplify.
> > > > 
> > > > [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> > > > 
> > > > Signed-off-by: David Hildenbrand <david@redhat.com>  
> > > 
> > > I do agree that being explicit about hwpoison is much better. Poisoned
> > > page can be also an unitialized one and I believe this is the reason why
> > > you are bringing that up.  
> > 
> > I'm bringing it up because I want to reuse that function as state above :)
> >   
> > > 
> > > But you've made me look at d3378e86d182 and I am wondering whether this
> > > is really a valid patch. First of all it can leak a reference count
> > > AFAICS. Moreover it doesn't really fix anything because the page can be
> > > marked hwpoison right after the check is done. I do not think the race
> > > is feasible to be closed. So shouldn't we rather revert it?  
> > 
> > I am not sure if we really care about races here that much here? I mean,
> > essentially we are racing with HW breaking asynchronously. Just because we
> > would be synchronizing with SetPageHWPoison() wouldn't mean we can stop HW
> > from breaking.  
> 
> Right
> 
> > Long story short, this should be good enough for the cases we actually can
> > handle? What am I missing?  
> 
> I am not sure I follow. My point is that I fail to see any added value
> of the check as it doesn't prevent the race (it fundamentally cannot as
> the page can be poisoned at any time) but the failure path doesn't
> put_page which is incorrect even for hwpoison pages.

Sorry, I have something to say:

I have noticed the ref count leak in the previous topic ,but  I don't think
it's a really matter. For memory recovery case for user pages, we will keep one
reference to the poison page so the error page will not be freed to buddy allocator.
which can be checked in memory_faulure() function.

For the case here, the reference count for error page may be greater than one as it's not
unmmapped successfully and may shared. we take a reference for that page will not result some
really mattering issue.

The only break I can think for this leak is that we will fail to unpoison the error page for test purpose.

Thanks!
Aili Yao

yaoaili [么爱利] May 6, 2021, 1:08 a.m. UTC | #8

On Wed, 5 May 2021 15:45:47 +0200
Michal Hocko <mhocko@suse.com> wrote:

> On Wed 05-05-21 15:39:08, David Hildenbrand wrote:
> > > > Long story short, this should be good enough for the cases we actually can
> > > > handle? What am I missing?  
> > > 
> > > I am not sure I follow. My point is that I fail to see any added value
> > > of the check as it doesn't prevent the race (it fundamentally cannot as
> > > the page can be poisoned at any time) but the failure path doesn't
> > > put_page which is incorrect even for hwpoison pages.  
> > 
> > Oh, I think you are right. If we have a page and return NULL we would leak a
> > reference.
> > 
> > Actually, we discussed in that thread handling this entirely differently,
> > which resulted in a v7 [1]; however Andrew moved forward with this
> > (outdated?) patch, maybe that was just a mistake?
> > 
> > Yes, I agree we should revert that patch for now.  
> 
> OK, Let me send the revert to Andrew.
> 

Got this!
Anyway, I will try to post a new patch for this issue based on the previous patch v7.

Thanks!
Aili Yao

Michal Hocko May 6, 2021, 7:06 a.m. UTC | #9

On Thu 06-05-21 08:56:11, Aili Yao wrote:
> On Wed, 5 May 2021 15:27:39 +0200
> Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Wed 05-05-21 15:17:53, David Hildenbrand wrote:
> > > On 05.05.21 15:13, Michal Hocko wrote:  
> > > > On Thu 29-04-21 14:25:15, David Hildenbrand wrote:  
> > > > > Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> > > > > introduced page_is_poisoned(), however, v5 [1] of the patch used
> > > > > "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> > > > > function and move it to page-flags.h, from where it can be used in other
> > > > > -- kcore -- context.
> > > > > 
> > > > > Move the comment to the place where it belongs and simplify.
> > > > > 
> > > > > [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> > > > > 
> > > > > Signed-off-by: David Hildenbrand <david@redhat.com>  
> > > > 
> > > > I do agree that being explicit about hwpoison is much better. Poisoned
> > > > page can be also an unitialized one and I believe this is the reason why
> > > > you are bringing that up.  
> > > 
> > > I'm bringing it up because I want to reuse that function as state above :)
> > >   
> > > > 
> > > > But you've made me look at d3378e86d182 and I am wondering whether this
> > > > is really a valid patch. First of all it can leak a reference count
> > > > AFAICS. Moreover it doesn't really fix anything because the page can be
> > > > marked hwpoison right after the check is done. I do not think the race
> > > > is feasible to be closed. So shouldn't we rather revert it?  
> > > 
> > > I am not sure if we really care about races here that much here? I mean,
> > > essentially we are racing with HW breaking asynchronously. Just because we
> > > would be synchronizing with SetPageHWPoison() wouldn't mean we can stop HW
> > > from breaking.  
> > 
> > Right
> > 
> > > Long story short, this should be good enough for the cases we actually can
> > > handle? What am I missing?  
> > 
> > I am not sure I follow. My point is that I fail to see any added value
> > of the check as it doesn't prevent the race (it fundamentally cannot as
> > the page can be poisoned at any time) but the failure path doesn't
> > put_page which is incorrect even for hwpoison pages.
> 
> Sorry, I have something to say:
> 
> I have noticed the ref count leak in the previous topic ,but  I don't think
> it's a really matter. For memory recovery case for user pages, we will keep one
> reference to the poison page so the error page will not be freed to buddy allocator.
> which can be checked in memory_faulure() function.

So what would happen if those pages are hwpoisoned from userspace rather
than by HW. And repeatedly so?

yaoaili [么爱利] May 6, 2021, 7:28 a.m. UTC | #10

On Thu, 6 May 2021 09:06:14 +0200
Michal Hocko <mhocko@suse.com> wrote:

> On Thu 06-05-21 08:56:11, Aili Yao wrote:
> > On Wed, 5 May 2021 15:27:39 +0200
> > Michal Hocko <mhocko@suse.com> wrote:
> >   
> > > On Wed 05-05-21 15:17:53, David Hildenbrand wrote:  
> > > > On 05.05.21 15:13, Michal Hocko wrote:    
> > > > > On Thu 29-04-21 14:25:15, David Hildenbrand wrote:    
> > > > > > Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> > > > > > introduced page_is_poisoned(), however, v5 [1] of the patch used
> > > > > > "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> > > > > > function and move it to page-flags.h, from where it can be used in other
> > > > > > -- kcore -- context.
> > > > > > 
> > > > > > Move the comment to the place where it belongs and simplify.
> > > > > > 
> > > > > > [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> > > > > > 
> > > > > > Signed-off-by: David Hildenbrand <david@redhat.com>    
> > > > > 
> > > > > I do agree that being explicit about hwpoison is much better. Poisoned
> > > > > page can be also an unitialized one and I believe this is the reason why
> > > > > you are bringing that up.    
> > > > 
> > > > I'm bringing it up because I want to reuse that function as state above :)
> > > >     
> > > > > 
> > > > > But you've made me look at d3378e86d182 and I am wondering whether this
> > > > > is really a valid patch. First of all it can leak a reference count
> > > > > AFAICS. Moreover it doesn't really fix anything because the page can be
> > > > > marked hwpoison right after the check is done. I do not think the race
> > > > > is feasible to be closed. So shouldn't we rather revert it?    
> > > > 
> > > > I am not sure if we really care about races here that much here? I mean,
> > > > essentially we are racing with HW breaking asynchronously. Just because we
> > > > would be synchronizing with SetPageHWPoison() wouldn't mean we can stop HW
> > > > from breaking.    
> > > 
> > > Right
> > >   
> > > > Long story short, this should be good enough for the cases we actually can
> > > > handle? What am I missing?    
> > > 
> > > I am not sure I follow. My point is that I fail to see any added value
> > > of the check as it doesn't prevent the race (it fundamentally cannot as
> > > the page can be poisoned at any time) but the failure path doesn't
> > > put_page which is incorrect even for hwpoison pages.  
> > 
> > Sorry, I have something to say:
> > 
> > I have noticed the ref count leak in the previous topic ,but  I don't think
> > it's a really matter. For memory recovery case for user pages, we will keep one
> > reference to the poison page so the error page will not be freed to buddy allocator.
> > which can be checked in memory_faulure() function.  
> 
> So what would happen if those pages are hwpoisoned from userspace rather
> than by HW. And repeatedly so?

Sorry, I may be not totally understand what you mean.

Do you mean hard page offline from mcelog?
If yes, I think it's not for one real UC error but for CE storms.
when we access this page in kernel, the access may success even it was marked hwpoison.

Thanks!
Aili Yao

Michal Hocko May 6, 2021, 7:55 a.m. UTC | #11

On Thu 06-05-21 15:28:05, Aili Yao wrote:
> On Thu, 6 May 2021 09:06:14 +0200
> Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Thu 06-05-21 08:56:11, Aili Yao wrote:
> > > On Wed, 5 May 2021 15:27:39 +0200
> > > Michal Hocko <mhocko@suse.com> wrote:
[...]
> > > > I am not sure I follow. My point is that I fail to see any added value
> > > > of the check as it doesn't prevent the race (it fundamentally cannot as
> > > > the page can be poisoned at any time) but the failure path doesn't
> > > > put_page which is incorrect even for hwpoison pages.  
> > > 
> > > Sorry, I have something to say:
> > > 
> > > I have noticed the ref count leak in the previous topic ,but  I don't think
> > > it's a really matter. For memory recovery case for user pages, we will keep one
> > > reference to the poison page so the error page will not be freed to buddy allocator.
> > > which can be checked in memory_faulure() function.  
> > 
> > So what would happen if those pages are hwpoisoned from userspace rather
> > than by HW. And repeatedly so?
> 
> Sorry, I may be not totally understand what you mean.
> 
> Do you mean hard page offline from mcelog?

No I mean soft hwpoison from userspace (e.g. by MADV_HWPOISON but there
are other interfaces AFAIK).

And just to be explicit. All those interfaces are root only
(CAP_SYS_ADMIN) so I am not really worried about any malitious abuse of
the reference leak. I am mostly concerned that this is obviously broken
without a good reason. The most trivial fix would have been to put_page
in the return path but as I've mentioned in other email thread the fix
really needs a deeper thought and consider other things.

Hope that clarifies this some more.

yaoaili [么爱利] May 6, 2021, 8:52 a.m. UTC | #12

On Thu, 6 May 2021 09:55:51 +0200
Michal Hocko <mhocko@suse.com> wrote:

> On Thu 06-05-21 15:28:05, Aili Yao wrote:
> > On Thu, 6 May 2021 09:06:14 +0200
> > Michal Hocko <mhocko@suse.com> wrote:
> >   
> > > On Thu 06-05-21 08:56:11, Aili Yao wrote:  
> > > > On Wed, 5 May 2021 15:27:39 +0200
> > > > Michal Hocko <mhocko@suse.com> wrote:  
> [...]
> > > > > I am not sure I follow. My point is that I fail to see any added value
> > > > > of the check as it doesn't prevent the race (it fundamentally cannot as
> > > > > the page can be poisoned at any time) but the failure path doesn't
> > > > > put_page which is incorrect even for hwpoison pages.    
> > > > 
> > > > Sorry, I have something to say:
> > > > 
> > > > I have noticed the ref count leak in the previous topic ,but  I don't think
> > > > it's a really matter. For memory recovery case for user pages, we will keep one
> > > > reference to the poison page so the error page will not be freed to buddy allocator.
> > > > which can be checked in memory_faulure() function.    
> > > 
> > > So what would happen if those pages are hwpoisoned from userspace rather
> > > than by HW. And repeatedly so?  
> > 
> > Sorry, I may be not totally understand what you mean.
> > 
> > Do you mean hard page offline from mcelog?  
> 
> No I mean soft hwpoison from userspace (e.g. by MADV_HWPOISON but there
> are other interfaces AFAIK).
> 
> And just to be explicit. All those interfaces are root only
> (CAP_SYS_ADMIN) so I am not really worried about any malitious abuse of
> the reference leak. I am mostly concerned that this is obviously broken
> without a good reason. The most trivial fix would have been to put_page
> in the return path but as I've mentioned in other email thread the fix
> really needs a deeper thought and consider other things.
> 
> Hope that clarifies this some more.

Thanks, got it!
Yes, there are some test scenarios that should be covered.

But for test, the default SIGBUS handlers is usually replaced, and the test process
may not hit the coredump code.

Anyway, there is a ref leak in the normal enviorments and better to be fixed.

Thanks!
Aili Yao

[v1,3/7] mm: rename and move page_is_poisoned()

Commit Message

Comments

Patch