diff mbox series

[4/4] mm: hwpoison: handle non-anonymous THP correctly

Message ID 20210914183718.4236-5-shy828301@gmail.com (mailing list archive)
State New
Headers show
Series Solve silent data loss caused by poisoned page cache (shmem/tmpfs) | expand

Commit Message

Yang Shi Sept. 14, 2021, 6:37 p.m. UTC
Currently hwpoison doesn't handle non-anonymous THP, but since v4.8 THP
support for tmpfs and read-only file cache has been added.  They could
be offlined by split THP, just like anonymous THP.

Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/memory-failure.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

Comments

Naoya Horiguchi Sept. 21, 2021, 9:50 a.m. UTC | #1
On Tue, Sep 14, 2021 at 11:37:18AM -0700, Yang Shi wrote:
> Currently hwpoison doesn't handle non-anonymous THP, but since v4.8 THP
> support for tmpfs and read-only file cache has been added.  They could
> be offlined by split THP, just like anonymous THP.
> 
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> ---
>  mm/memory-failure.c | 21 ++++++++++++---------
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 3e06cb9d5121..6f72aab8ec4a 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1150,13 +1150,16 @@ static int __get_hwpoison_page(struct page *page)
>  
>  	if (PageTransHuge(head)) {
>  		/*
> -		 * Non anonymous thp exists only in allocation/free time. We
> -		 * can't handle such a case correctly, so let's give it up.
> -		 * This should be better than triggering BUG_ON when kernel
> -		 * tries to touch the "partially handled" page.
> +		 * We can't handle allocating or freeing THPs, so let's give
> +		 * it up. This should be better than triggering BUG_ON when
> +		 * kernel tries to touch the "partially handled" page.
> +		 *
> +		 * page->mapping won't be initialized until the page is added
> +		 * to rmap or page cache.  Use this as an indicator for if
> +		 * this is an instantiated page.
>  		 */
> -		if (!PageAnon(head)) {
> -			pr_err("Memory failure: %#lx: non anonymous thp\n",
> +		if (!head->mapping) {
> +			pr_err("Memory failure: %#lx: non instantiated thp\n",
>  				page_to_pfn(page));
>  			return 0;
>  		}

How about cleaning up this whole "PageTransHuge()" block?  As explained in
commit 415c64c1453a (mm/memory-failure: split thp earlier in memory error
handling), this check was introduced to avoid that non-anonymous thp is
considered as hugetlb and code for hugetlb is executed (resulting in crash).

With recent improvement in __get_hwpoison_page(), this confusion never
happens (because hugetlb check is done before this check), so this check
seems to finish its role.

Thanks,
Naoya Horiguchi

> @@ -1415,12 +1418,12 @@ static int identify_page_state(unsigned long pfn, struct page *p,
>  static int try_to_split_thp_page(struct page *page, const char *msg)
>  {
>  	lock_page(page);
> -	if (!PageAnon(page) || unlikely(split_huge_page(page))) {
> +	if (!page->mapping || unlikely(split_huge_page(page))) {
>  		unsigned long pfn = page_to_pfn(page);
>  
>  		unlock_page(page);
> -		if (!PageAnon(page))
> -			pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
> +		if (!page->mapping)
> +			pr_info("%s: %#lx: not instantiated thp\n", msg, pfn);
>  		else
>  			pr_info("%s: %#lx: thp split failed\n", msg, pfn);
>  		put_page(page);
> -- 
> 2.26.2
>
Yang Shi Sept. 21, 2021, 7:46 p.m. UTC | #2
On Tue, Sep 21, 2021 at 2:50 AM Naoya Horiguchi
<naoya.horiguchi@linux.dev> wrote:
>
> On Tue, Sep 14, 2021 at 11:37:18AM -0700, Yang Shi wrote:
> > Currently hwpoison doesn't handle non-anonymous THP, but since v4.8 THP
> > support for tmpfs and read-only file cache has been added.  They could
> > be offlined by split THP, just like anonymous THP.
> >
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/memory-failure.c | 21 ++++++++++++---------
> >  1 file changed, 12 insertions(+), 9 deletions(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 3e06cb9d5121..6f72aab8ec4a 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1150,13 +1150,16 @@ static int __get_hwpoison_page(struct page *page)
> >
> >       if (PageTransHuge(head)) {
> >               /*
> > -              * Non anonymous thp exists only in allocation/free time. We
> > -              * can't handle such a case correctly, so let's give it up.
> > -              * This should be better than triggering BUG_ON when kernel
> > -              * tries to touch the "partially handled" page.
> > +              * We can't handle allocating or freeing THPs, so let's give
> > +              * it up. This should be better than triggering BUG_ON when
> > +              * kernel tries to touch the "partially handled" page.
> > +              *
> > +              * page->mapping won't be initialized until the page is added
> > +              * to rmap or page cache.  Use this as an indicator for if
> > +              * this is an instantiated page.
> >                */
> > -             if (!PageAnon(head)) {
> > -                     pr_err("Memory failure: %#lx: non anonymous thp\n",
> > +             if (!head->mapping) {
> > +                     pr_err("Memory failure: %#lx: non instantiated thp\n",
> >                               page_to_pfn(page));
> >                       return 0;
> >               }
>
> How about cleaning up this whole "PageTransHuge()" block?  As explained in
> commit 415c64c1453a (mm/memory-failure: split thp earlier in memory error
> handling), this check was introduced to avoid that non-anonymous thp is
> considered as hugetlb and code for hugetlb is executed (resulting in crash).
>
> With recent improvement in __get_hwpoison_page(), this confusion never
> happens (because hugetlb check is done before this check), so this check
> seems to finish its role.

I see. IIUC the !PageAnon check was used to prevent from mistreating
the THP to hugetlb page. But it was actually solved by splitting THP
earlier. If so this check definitely could go away since the worst
case is split failure. Will fix it in the next version.

>
> Thanks,
> Naoya Horiguchi
>
> > @@ -1415,12 +1418,12 @@ static int identify_page_state(unsigned long pfn, struct page *p,
> >  static int try_to_split_thp_page(struct page *page, const char *msg)
> >  {
> >       lock_page(page);
> > -     if (!PageAnon(page) || unlikely(split_huge_page(page))) {
> > +     if (!page->mapping || unlikely(split_huge_page(page))) {
> >               unsigned long pfn = page_to_pfn(page);
> >
> >               unlock_page(page);
> > -             if (!PageAnon(page))
> > -                     pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
> > +             if (!page->mapping)
> > +                     pr_info("%s: %#lx: not instantiated thp\n", msg, pfn);
> >               else
> >                       pr_info("%s: %#lx: thp split failed\n", msg, pfn);
> >               put_page(page);
> > --
> > 2.26.2
> >
diff mbox series

Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3e06cb9d5121..6f72aab8ec4a 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1150,13 +1150,16 @@  static int __get_hwpoison_page(struct page *page)
 
 	if (PageTransHuge(head)) {
 		/*
-		 * Non anonymous thp exists only in allocation/free time. We
-		 * can't handle such a case correctly, so let's give it up.
-		 * This should be better than triggering BUG_ON when kernel
-		 * tries to touch the "partially handled" page.
+		 * We can't handle allocating or freeing THPs, so let's give
+		 * it up. This should be better than triggering BUG_ON when
+		 * kernel tries to touch the "partially handled" page.
+		 *
+		 * page->mapping won't be initialized until the page is added
+		 * to rmap or page cache.  Use this as an indicator for if
+		 * this is an instantiated page.
 		 */
-		if (!PageAnon(head)) {
-			pr_err("Memory failure: %#lx: non anonymous thp\n",
+		if (!head->mapping) {
+			pr_err("Memory failure: %#lx: non instantiated thp\n",
 				page_to_pfn(page));
 			return 0;
 		}
@@ -1415,12 +1418,12 @@  static int identify_page_state(unsigned long pfn, struct page *p,
 static int try_to_split_thp_page(struct page *page, const char *msg)
 {
 	lock_page(page);
-	if (!PageAnon(page) || unlikely(split_huge_page(page))) {
+	if (!page->mapping || unlikely(split_huge_page(page))) {
 		unsigned long pfn = page_to_pfn(page);
 
 		unlock_page(page);
-		if (!PageAnon(page))
-			pr_info("%s: %#lx: non anonymous thp\n", msg, pfn);
+		if (!page->mapping)
+			pr_info("%s: %#lx: not instantiated thp\n", msg, pfn);
 		else
 			pr_info("%s: %#lx: thp split failed\n", msg, pfn);
 		put_page(page);