[v7,14/14] mm,hwpoison: Try to narrow window race for free pages

Message ID 20200922135650.1634-15-osalvador@suse.de (mailing list archive)
State New, archived
Series HWPOISON: soft offline rework

Commit Message

Oscar Salvador Sept. 22, 2020, 1:56 p.m. UTC
Aristeu Rozanski reported that a customer test case started
to report -EBUSY after the hwpoison rework patchset.

There is a race window between spotting a free page and taking it off
its buddy freelist, so by the time we try to take it off, the page may
already have been allocated.

This patch narrows that window: if the page was allocated under us, we
retry once so that it is handled as whatever type of page it has become.

Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reported-by: Aristeu Rozanski <aris@ruivo.org>
Tested-by: Aristeu Rozanski <aris@ruivo.org>
---
 mm/memory-failure.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

HORIGUCHI NAOYA(堀口 直也) Sept. 23, 2020, 7:40 a.m. UTC | #1
On Tue, Sep 22, 2020 at 03:56:50PM +0200, Oscar Salvador wrote:
> Aristeu Rozanski reported that a customer test case started
> to report -EBUSY after the hwpoison rework patchset.
> 
> There is a race window between spotting a free page and taking it off
> its buddy freelist, so it might be that by the time we try to take it off,
> the page has been already allocated.
> 
> This patch tries to handle such race window by trying to handle the new
> type of page again if the page was allocated under us.
> 
> Signed-off-by: Oscar Salvador <osalvador@suse.de>
> Reported-by: Aristeu Rozanski <aris@ruivo.org>
> Tested-by: Aristeu Rozanski <aris@ruivo.org>

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

> ---
>  mm/memory-failure.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 46b1821d2817..8f23d3c7a0a2 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1903,6 +1903,7 @@ int soft_offline_page(unsigned long pfn, int flags)
>  {
>  	int ret;
>  	struct page *page;
> +	bool try_again = true;
>  
>  	if (!pfn_valid(pfn))
>  		return -ENXIO;
> @@ -1918,6 +1919,7 @@ int soft_offline_page(unsigned long pfn, int flags)
>  		return 0;
>  	}
>  
> +retry:
>  	get_online_mems();
>  	ret = get_any_page(page, pfn, flags);
>  	put_online_mems();
> @@ -1925,7 +1927,10 @@ int soft_offline_page(unsigned long pfn, int flags)
>  	if (ret > 0)
>  		ret = soft_offline_in_use_page(page);
>  	else if (ret == 0)
> -		ret = soft_offline_free_page(page);
> +		if (soft_offline_free_page(page) && try_again) {
> +			try_again = false;
> +			goto retry;
> +		}
>  
>  	return ret;
>  }
> -- 
> 2.26.2
>

Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 46b1821d2817..8f23d3c7a0a2 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1903,6 +1903,7 @@  int soft_offline_page(unsigned long pfn, int flags)
 {
 	int ret;
 	struct page *page;
+	bool try_again = true;
 
 	if (!pfn_valid(pfn))
 		return -ENXIO;
@@ -1918,6 +1919,7 @@  int soft_offline_page(unsigned long pfn, int flags)
 		return 0;
 	}
 
+retry:
 	get_online_mems();
 	ret = get_any_page(page, pfn, flags);
 	put_online_mems();
@@ -1925,7 +1927,10 @@  int soft_offline_page(unsigned long pfn, int flags)
 	if (ret > 0)
 		ret = soft_offline_in_use_page(page);
 	else if (ret == 0)
-		ret = soft_offline_free_page(page);
+		if (soft_offline_free_page(page) && try_again) {
+			try_again = false;
+			goto retry;
+		}
 
 	return ret;
 }
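
The retry-once control flow that the hunk above adds can be sketched in user space. This is only an illustration under stated assumptions: `struct sketch_page`, the `sketch_*` helpers, and the `races_left` counter are hypothetical stand-ins invented here, not kernel code. Unlike the posted hunk, which tests the return value of soft_offline_free_page() without assigning it to `ret`, the sketch stores it so that a second failure is reported to the caller; that is a choice made for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical page states for this sketch; these are not kernel types. */
enum sketch_state { SKETCH_FREE, SKETCH_IN_USE };

struct sketch_page {
	enum sketch_state state; /* what the classifier currently sees */
	int races_left;          /* how many times an allocation wins the race */
	int attempts;            /* classification passes performed so far */
};

/* Stand-in for get_any_page(): > 0 means in use, 0 means free. */
static int sketch_get_any_page(struct sketch_page *p)
{
	p->attempts++;
	return p->state == SKETCH_IN_USE ? 1 : 0;
}

/*
 * Stand-in for soft_offline_free_page(): fails with -EBUSY when the page
 * was allocated between classification and removal from the freelist.
 * While further races remain, the page is modelled as already freed
 * again; after the last race it is left in use.
 */
static int sketch_offline_free(struct sketch_page *p)
{
	if (p->races_left > 0) {
		p->races_left--;
		p->state = p->races_left == 0 ? SKETCH_IN_USE : SKETCH_FREE;
		return -EBUSY;
	}
	return 0;
}

/* Stand-in for soft_offline_in_use_page(): pretend migration succeeds. */
static int sketch_offline_in_use(struct sketch_page *p)
{
	(void)p;
	return 0;
}

/* Classify the page, and reclassify at most once if the free path loses. */
int sketch_soft_offline(struct sketch_page *p)
{
	bool try_again = true;
	int ret;

retry:
	ret = sketch_get_any_page(p);
	if (ret > 0) {
		ret = sketch_offline_in_use(p);
	} else if (ret == 0) {
		ret = sketch_offline_free(p);
		if (ret && try_again) {
			try_again = false;
			goto retry;
		}
	}

	/* The flag bounds the work: never more than two passes. */
	assert(p->attempts <= 2);
	return ret;
}
```

With `races_left` set to 1, the first pass loses the race, the second pass sees an in-use page and takes the in-use path, and the call returns 0. With `races_left` set to 2, both passes lose and the call returns -EBUSY after exactly two attempts, so the retry cannot loop indefinitely.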