diff mbox series

[v3] mm/page_alloc: fix counting of free pages after take off from buddy

Message ID 20210526075247.11130-1-dinghui@sangfor.com.cn (mailing list archive)
State New, archived
Series [v3] mm/page_alloc: fix counting of free pages after take off from buddy

Commit Message

Ding Hui May 26, 2021, 7:52 a.m. UTC
Recently we found that a lot of MemFree is still reported in
/proc/meminfo after soft-offlining a large number of pages, which is
not correct.

Before Oscar reworked soft offline for free pages [1], soft-offlined
free pages were left in buddy with the HWPoison flag set, and
NR_FREE_PAGES was not updated immediately. So the difference between
NR_FREE_PAGES and the real number of available free pages was also
large at the beginning.

However, as the workload ran, whenever an alloc function subsequently
caught a HWPoison page, it removed the page from buddy, updated
NR_FREE_PAGES and tried again, so NR_FREE_PAGES got closer and closer
to the real number of available free pages (regardless of
unpoison_memory()).

Now, for offlined free pages, after a successful call to
take_page_off_buddy() the page no longer belongs to the buddy allocator
and will not be used any more, but we missed accounting for it in
NR_FREE_PAGES, and there is no later chance for the counter to be
updated.

Update the counter in take_page_off_buddy() as rmqueue() does, but
avoid double counting if someone has already called
set_migratetype_isolate() on the page.
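
For anyone wanting to observe the symptom, MemFree can be sampled from
user space before and after a batch of soft-offline operations. The
following is only a minimal sketch: the helper names parse_memfree_kb()
and read_memfree_kb() are hypothetical, and it assumes the usual
"MemFree: <n> kB" line format of /proc/meminfo.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the MemFree value (in kB) found in a
 * /proc/meminfo-style text buffer, or -1 if no such line exists.
 */
long parse_memfree_kb(const char *text)
{
	const char *line = text;

	while (line && *line) {
		long kb;

		/* Only a line starting with "MemFree:" yields a conversion. */
		if (sscanf(line, "MemFree: %ld kB", &kb) == 1)
			return kb;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Hypothetical helper: read /proc/meminfo, return MemFree in kB or -1. */
long read_memfree_kb(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return parse_memfree_kb(buf);
}
```

Calling read_memfree_kb() before and after the soft-offline run and
comparing the two values gives a rough view of how far the counter
moved.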

[1]: commit 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")

Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
v3:
- as Naoya Horiguchi suggested, do the update only when
  is_migrate_isolate(migratetype) is false
- updated patch description

v2:
- https://lore.kernel.org/linux-mm/20210508035533.23222-1-dinghui@sangfor.com.cn/
- use __mod_zone_freepage_state instead of __mod_zone_page_state 

 mm/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

David Hildenbrand May 26, 2021, 7:58 a.m. UTC | #1
On 26.05.21 09:52, Ding Hui wrote:
> [...]
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index aaa1655cf682..d1f5de1c1283 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -9158,6 +9158,8 @@ bool take_page_off_buddy(struct page *page)
>   			del_page_from_free_list(page_head, zone, page_order);
>   			break_down_buddy_pages(zone, page_head, page, 0,
>   						page_order, migratetype);
> +			if (!is_migrate_isolate(migratetype))
> +				__mod_zone_freepage_state(zone, -1, migratetype);
>   			ret = true;
>   			break;
>   		}
> 

I guess if we'd actually be removing a page from the buddy while it's 
currently isolated by someone else (i.e., alloc_contig_range()), we 
might be in bigger trouble.

I think we should actually skip isolated pages completely. 
take_page_off_buddy() should not touch them.
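
As a rough user-space illustration of that idea (not kernel code; every
name below is hypothetical, and the single per-block migratetype is a
simplifying assumption), a variant that refuses isolated pages
outright could look like this:

```c
#include <stdbool.h>

/* Toy stand-ins for migratetypes; not the kernel's definitions. */
enum sketch_mt { SKETCH_MOVABLE, SKETCH_ISOLATE };

struct sketch_zone {
	long nr_free_pages;	/* models NR_FREE_PAGES */
};

struct sketch_block {
	enum sketch_mt mt;	/* migratetype of the block */
	long nr_free;		/* free pages still in the block */
};

/*
 * Take one free page out of the block, unless the block is isolated.
 * Isolated blocks are left entirely to whoever isolated them, so
 * neither the block nor the zone counter is touched in that case.
 */
bool sketch_take_page_skip_isolated(struct sketch_zone *z,
				    struct sketch_block *b)
{
	if (b->mt == SKETCH_ISOLATE)
		return false;
	if (b->nr_free == 0)
		return false;

	b->nr_free--;
	z->nr_free_pages--;
	return true;
}
```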

Anyhow, different problem, so

Acked-by: David Hildenbrand <david@redhat.com>

Oscar Salvador May 26, 2021, 10:42 a.m. UTC | #2
On Wed, May 26, 2021 at 03:52:47PM +0800, Ding Hui wrote:
> [...]

Reviewed-by: Oscar Salvador <osalvador@suse.de>

Oscar Salvador May 26, 2021, 10:43 a.m. UTC | #3
On Wed, May 26, 2021 at 09:58:15AM +0200, David Hildenbrand wrote:
> I guess if we'd actually be removing a page from the buddy while it's
> currently isolated by someone else (i.e., alloc_contig_range()), we might be
> in bigger trouble.
> 
> I think we should actually skip isolated pages completely.
> take_page_off_buddy() should not touch them.

That might be a problem indeed.
I will have a look at it.

Thanks

HORIGUCHI NAOYA(堀口 直也) May 27, 2021, 12:34 a.m. UTC | #4
On Wed, May 26, 2021 at 03:52:47PM +0800, Ding Hui wrote:
> [...]

Thank you very much.

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

As for unpoison_memory(), I'm writing patches to fix unpoison (it may take
a few weeks to post them). They will add a reverse operation of
take_page_off_buddy() which simply calls __free_one_page(), so the
NR_FREE_PAGES counter will also be handled correctly with those patches.

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aaa1655cf682..d1f5de1c1283 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -9158,6 +9158,8 @@  bool take_page_off_buddy(struct page *page)
 			del_page_from_free_list(page_head, zone, page_order);
 			break_down_buddy_pages(zone, page_head, page, 0,
 						page_order, migratetype);
+			if (!is_migrate_isolate(migratetype))
+				__mod_zone_freepage_state(zone, -1, migratetype);
 			ret = true;
 			break;
 		}
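
The accounting rule in the hunk above can be modeled in user space. The
sketch below is a toy, not kernel code: all names are hypothetical, and
it assumes that isolating a block already subtracts that block's free
pages from the zone counter, which is why a second decrement on removal
would double count.

```c
#include <stdbool.h>

/* Toy stand-ins for migratetypes; not the kernel's definitions. */
enum toy_migratetype { TOY_MOVABLE, TOY_ISOLATE };

struct toy_zone {
	long nr_free_pages;	/* models NR_FREE_PAGES */
};

struct toy_block {
	enum toy_migratetype mt;
	long nr_free;		/* free pages still in this block */
};

/*
 * Models isolation (assumption): the block's free pages stop counting
 * as free in the zone as soon as the block is isolated.
 */
void toy_isolate_block(struct toy_zone *z, struct toy_block *b)
{
	if (b->mt == TOY_ISOLATE)
		return;
	z->nr_free_pages -= b->nr_free;
	b->mt = TOY_ISOLATE;
}

/*
 * Models the patched removal path: take one free page out of the
 * block, and decrement the zone counter only when the block is not
 * isolated, since isolated pages were already subtracted.
 */
bool toy_take_page_off_buddy(struct toy_zone *z, struct toy_block *b)
{
	if (b->nr_free == 0)
		return false;

	b->nr_free--;
	if (b->mt != TOY_ISOLATE)
		z->nr_free_pages -= 1;
	return true;
}
```

With two blocks of four free pages each and one of them isolated, the
zone counter starts at four; taking a page from the movable block drops
it to three, while taking a page from the isolated block leaves it
unchanged.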