[v1] mm, hwpoison, hugetlb: Check hugetlb head page hwpoison flag when unpoison page

Message ID 20220804122819.2917249-1-luofei@unicloud.com (mailing list archive)
State New
Series [v1] mm, hwpoison, hugetlb: Check hugetlb head page hwpoison flag when unpoison page

Commit Message

luofei Aug. 4, 2022, 12:28 p.m. UTC
When a huge page is poisoned by software and dissolve_free_huge_page()
fails, the huge page is added back to hugepage_freelists. In this case the
head page holds the hwpoison flag, but the hwpoison flag of the tail page
that was actually poisoned is not set, so unpoison_memory() fails to
unpoison the previously poisoned page.

So add a check on the hugetlb head page, and also ensure that the
previously poisoned tail page is in the huge page's raw_hwp_list.

Signed-off-by: luofei <luofei@unicloud.com>
---
 mm/memory-failure.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

Comments

Naoya Horiguchi Aug. 4, 2022, 1:28 p.m. UTC | #1
On Thu, Aug 04, 2022 at 08:28:19AM -0400, luofei wrote:
> When software-poison a huge page, if dissolve_free_huge_page() failed,
> the huge page will be added to hugepage_freelists. In this case, the
> head page will hold the hwpoison flag, but the real poisoned tail page
> hwpoison flag is not set, this will cause unpoison_memory() fail to
> unpoison the previously poisoned page.

Hi luofei,

When you try to unpoison a hwpoisoned hugepage, you just have to pass the
pfn of the head page, not the pfn of the raw poisoned subpage.  Note that the
position of the raw error page is not exposed to userspace (dmesg shows it, but
saving and parsing it for unpoison is not that useful), and related
utilities like page-types only check the PageHWPoison flag to find error pages,
so it seems to me that you're introducing an inconsistent assumption.

Thanks,
Naoya Horiguchi
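
As an illustration of the point about passing the head page's pfn, the following is a minimal userspace sketch of how a head pfn could be derived from a subpage pfn. It assumes naturally aligned huge pages of 2 MiB on 4 KiB base pages (512 subpages); the helper name head_pfn_of() and the macro SUBPAGES_PER_HUGEPAGE are hypothetical and are not part of the kernel or of this patch.

```c
#include <assert.h>

/* Assumption: 2 MiB huge pages on 4 KiB base pages, i.e. 512 subpages. */
#define SUBPAGES_PER_HUGEPAGE 512UL

/*
 * Hypothetical helper: round a subpage pfn down to the pfn of its head
 * page. This relies on the huge page being naturally aligned, so that
 * the head pfn is a multiple of the number of subpages.
 */
static unsigned long head_pfn_of(unsigned long pfn, unsigned long nr_subpages)
{
	/* The mask trick only works for power-of-two sizes. */
	assert(nr_subpages != 0 && (nr_subpages & (nr_subpages - 1)) == 0);
	return pfn & ~(nr_subpages - 1);
}
```

With these assumptions, head_pfn_of(0x12345, SUBPAGES_PER_HUGEPAGE) yields 0x12200, which is the value one would then hand to the unpoison interface (for example the hwpoison debugfs file unpoison-pfn, where the injector module is available).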

> 
> So add a check on hugetlb head page, and also need to ensure the
> previously poisoned tail page in huge page raw_hwp_list.
> 
> Signed-off-by: luofei <luofei@unicloud.com>
> ---
>  mm/memory-failure.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 14439806b5ef..92dbeaa24afb 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -2293,6 +2293,28 @@ core_initcall(memory_failure_init);
>  		pr_info(fmt, pfn);			\
>  })
>  
> +static bool hugetlb_page_head_poison(struct page *hpage, struct page *page)
> +{
> +	struct llist_head *head;
> +	struct llist_node *t, *tnode;
> +	struct raw_hwp_page *p;
> +
> +	if (!PageHuge(page) || !PageHWPoison(hpage) || !HPageFreed(hpage))
> +		return false;
> +
> +	if (HPageRawHwpUnreliable(hpage))
> +		return false;
> +
> +	head = raw_hwp_list_head(hpage);
> +	llist_for_each_safe(tnode, t, head->first) {
> +		p = container_of(tnode, struct raw_hwp_page, node);
> +		if (p->page == page)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
>  /**
>   * unpoison_memory - Unpoison a previously poisoned page
>   * @pfn: Page number of the to be unpoisoned page
> @@ -2330,7 +2352,7 @@ int unpoison_memory(unsigned long pfn)
>  		goto unlock_mutex;
>  	}
>  
> -	if (!PageHWPoison(p)) {
> +	if (!PageHWPoison(p) && !hugetlb_page_head_poison(page, p)) {
>  		unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
>  				 pfn, &unpoison_rs);
>  		goto unlock_mutex;
> -- 
> 2.27.0
Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 14439806b5ef..92dbeaa24afb 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2293,6 +2293,28 @@  core_initcall(memory_failure_init);
 		pr_info(fmt, pfn);			\
 })
 
+static bool hugetlb_page_head_poison(struct page *hpage, struct page *page)
+{
+	struct llist_head *head;
+	struct llist_node *t, *tnode;
+	struct raw_hwp_page *p;
+
+	if (!PageHuge(page) || !PageHWPoison(hpage) || !HPageFreed(hpage))
+		return false;
+
+	if (HPageRawHwpUnreliable(hpage))
+		return false;
+
+	head = raw_hwp_list_head(hpage);
+	llist_for_each_safe(tnode, t, head->first) {
+		p = container_of(tnode, struct raw_hwp_page, node);
+		if (p->page == page)
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * unpoison_memory - Unpoison a previously poisoned page
  * @pfn: Page number of the to be unpoisoned page
@@ -2330,7 +2352,7 @@  int unpoison_memory(unsigned long pfn)
 		goto unlock_mutex;
 	}
 
-	if (!PageHWPoison(p)) {
+	if (!PageHWPoison(p) && !hugetlb_page_head_poison(page, p)) {
 		unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
 				 pfn, &unpoison_rs);
 		goto unlock_mutex;
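
The effect of the changed condition can be modelled outside the kernel. The sketch below is a toy userspace model only: every type and function prefixed with toy_ is hypothetical, the boolean fields merely stand in for PageHWPoison(), HPageFreed() and HPageRawHwpUnreliable(), and a plain singly linked list stands in for raw_hwp_list. It does not model the PageHuge() test, locking or refcounting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; none of these are real kernel types. */
struct toy_page {
	bool hwpoison;                  /* models PageHWPoison() */
};

struct toy_raw_hwp {
	struct toy_page *page;          /* raw error subpage */
	struct toy_raw_hwp *next;
};

struct toy_hugepage {
	struct toy_page *head;
	bool freed;                     /* models HPageFreed() */
	bool raw_hwp_unreliable;        /* models HPageRawHwpUnreliable() */
	struct toy_raw_hwp *raw_list;   /* models raw_hwp_list */
};

/* Mirrors the shape of hugetlb_page_head_poison() from the patch. */
static bool toy_head_poison(const struct toy_hugepage *h,
			    const struct toy_page *page)
{
	assert(h && h->head && page);

	if (!h->head->hwpoison || !h->freed || h->raw_hwp_unreliable)
		return false;

	/* Only subpages recorded as raw error pages qualify. */
	for (const struct toy_raw_hwp *r = h->raw_list; r; r = r->next)
		if (r->page == page)
			return true;

	return false;
}

/* Mirrors the patched condition in unpoison_memory(). */
static bool toy_can_unpoison(const struct toy_hugepage *h,
			     const struct toy_page *page)
{
	return page->hwpoison || toy_head_poison(h, page);
}
```

In this model, a tail page without its own flag is accepted only when the head page is flagged, the huge page is on the free list, and the tail page appears in the raw error list; any other unflagged tail page is still rejected.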