
[v3,1/2] mm: soft-offline: return -EBUSY if set_hwpoison_free_buddy_page() fails

Message ID 1560761476-4651-2-git-send-email-n-horiguchi@ah.jp.nec.com (mailing list archive)
State New, archived
Series fix return value issue of soft offlining hugepages

Commit Message

Naoya Horiguchi June 17, 2019, 8:51 a.m. UTC
The pass/fail of soft offline should be judged by checking whether the
raw error page was finally contained or not (i.e. the result of
set_hwpoison_free_buddy_page()), but the current code does not do so.
This patch fixes that by returning -EBUSY when containment fails.

Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may
not offline the original page and will not return an error.  It might
lead us to misjudge the test result when set_hwpoison_free_buddy_page()
actually fails.

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
Cc: <stable@vger.kernel.org> # v4.19+
---
ChangeLog v2->v3:
- update patch description to clarify user visible change
---
 mm/memory-failure.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Oscar Salvador June 18, 2019, 2:57 p.m. UTC | #1
On Mon, Jun 17, 2019 at 05:51:15PM +0900, Naoya Horiguchi wrote:
> The pass/fail of soft offline should be judged by checking whether the
> raw error page was finally contained or not (i.e. the result of
> set_hwpoison_free_buddy_page()), but the current code does not do so.
> This patch fixes that by returning -EBUSY when containment fails.
> 
> Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may
> not offline the original page and will not return an error.  It might
> lead us to misjudge the test result when set_hwpoison_free_buddy_page()
> actually fails.
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Fixes: 6bc9b56433b76 ("mm: fix race on soft-offlining")
> Cc: <stable@vger.kernel.org> # v4.19+

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> ---
> ChangeLog v2->v3:
> - update patch description to clarify user visible change
> ---
>  mm/memory-failure.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git v5.2-rc4/mm/memory-failure.c v5.2-rc4_patched/mm/memory-failure.c
> index 8da0334..8ee7b16 100644
> --- v5.2-rc4/mm/memory-failure.c
> +++ v5.2-rc4_patched/mm/memory-failure.c
> @@ -1730,6 +1730,8 @@ static int soft_offline_huge_page(struct page *page, int flags)
>  		if (!ret) {
>  			if (set_hwpoison_free_buddy_page(page))
>  				num_poisoned_pages_inc();
> +			else
> +				ret = -EBUSY;
>  		}
>  	}
>  	return ret;
> -- 
> 2.7.0
>

Mike Kravetz June 18, 2019, 5:33 p.m. UTC | #2
On 6/17/19 1:51 AM, Naoya Horiguchi wrote:
> The pass/fail of soft offline should be judged by checking whether the
> raw error page was finally contained or not (i.e. the result of
> set_hwpoison_free_buddy_page()), but the current code does not do so.
> This patch fixes that by returning -EBUSY when containment fails.
> 
> Without this fix, there are cases where madvise(MADV_SOFT_OFFLINE) may
> not offline the original page and will not return an error.  It might
> lead us to misjudge the test result when set_hwpoison_free_buddy_page()
> actually fails.
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

Thanks for the updates,

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

Patch

diff --git v5.2-rc4/mm/memory-failure.c v5.2-rc4_patched/mm/memory-failure.c
index 8da0334..8ee7b16 100644
--- v5.2-rc4/mm/memory-failure.c
+++ v5.2-rc4_patched/mm/memory-failure.c
@@ -1730,6 +1730,8 @@  static int soft_offline_huge_page(struct page *page, int flags)
 		if (!ret) {
 			if (set_hwpoison_free_buddy_page(page))
 				num_poisoned_pages_inc();
+			else
+				ret = -EBUSY;
 		}
 	}
 	return ret;
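
Illustration (not part of the patch): the control-flow change in the hunk
above can be modelled in user space. The sketch below is only a model under
stated assumptions. The names model_tail_before_patch(),
model_tail_after_patch(), model_selfcheck() and model_num_poisoned_pages are
hypothetical and do not exist in the kernel, and the boolean "contained"
stands in for the result of set_hwpoison_free_buddy_page(page). It shows
that, when migration succeeded (ret == 0) but the raw error page could not
be contained, the old tail returns 0 while the patched tail returns -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's poisoned-page counter. */
long model_num_poisoned_pages;

/*
 * Model of the tail of soft_offline_huge_page() before the patch.
 * 'ret' is the status on entry (0 means the earlier steps succeeded);
 * 'contained' models the result of set_hwpoison_free_buddy_page(page).
 */
int model_tail_before_patch(int ret, bool contained)
{
	if (!ret) {
		if (contained)
			model_num_poisoned_pages++;
		/* containment failure is silently ignored here */
	}
	return ret;
}

/* Same tail with the two added lines from the patch. */
int model_tail_after_patch(int ret, bool contained)
{
	if (!ret) {
		if (contained)
			model_num_poisoned_pages++;
		else
			ret = -EBUSY;	/* report the containment failure */
	}
	return ret;
}

/* Small self-check comparing the two variants. */
void model_selfcheck(void)
{
	/* Old behaviour: failure to contain the page still reports success. */
	assert(model_tail_before_patch(0, false) == 0);
	/* Patched behaviour: the same case now reports -EBUSY. */
	assert(model_tail_after_patch(0, false) == -EBUSY);
	/* An earlier error is passed through unchanged in both variants. */
	assert(model_tail_after_patch(-EIO, false) == -EIO);
}
```

In this model only the (ret == 0, contained == false) case changes. Successful
containment and earlier failures behave the same before and after the patch,
which matches the two-line size of the diff.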