Message ID | 20230110212821.984047-9-sidhartha.kumar@oracle.com (mailing list archive)
---|---
State | New
Series | continue hugetlb folio conversion
On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
> @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>  {
>  	int nr_nodes, node;
>  	struct page *page;
> +	struct folio *folio;
>
>  	lockdep_assert_held(&hugetlb_lock);
>
> @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>  		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
>  			if (PageHWPoison(page))
>  				continue;
> -
> -			return demote_free_huge_page(h, page);
> +			folio = page_folio(page);
> +			return demote_free_hugetlb_folio(h, folio);
>  		}
>  	}

Can't this be

	list_for_each_entry(folio, &h->hugepage_freelists[node], lru)

which avoids the call to page_folio() here.

I think the call to PageHWPoison is actually wrong here.  That would
only check the hwpoison bit on the first page, whereas we want to know
about the hwpoison bit on any page (don't we?)  So this should be
folio_test_has_hwpoisoned()?

Or is that a THP-thing that is different for hugetlb pages?
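
For context on the two predicates contrasted above, the difference is where the bit is stored. A simplified paraphrase of the relevant declarations in include/linux/page-flags.h from around this time (condensed for illustration, not a verbatim copy):

	/*
	 * PG_hwpoison is an ordinary per-page flag (PF_ANY); for a folio
	 * it is read from the head page, which is what PageHWPoison() on
	 * a head page and folio_test_hwpoison() end up testing.
	 */
	PAGEFLAG(HWPoison, hwpoison, PF_ANY)
		TESTSCFLAG(HWPoison, hwpoison, PF_ANY)

	/*
	 * "has_hwpoisoned" lives on the *second* page of a compound page
	 * (PF_SECOND) and is only set for THPs, meaning "at least one
	 * subpage is poisoned" -- hugetlb never sets it.
	 */
	#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
	PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
		TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
	#endif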
On 01/10/23 21:40, Matthew Wilcox wrote:
> On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
> > @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> >  {
> >  	int nr_nodes, node;
> >  	struct page *page;
> > +	struct folio *folio;
> >
> >  	lockdep_assert_held(&hugetlb_lock);
> >
> > @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> >  		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
> >  			if (PageHWPoison(page))
> >  				continue;
> > -
> > -			return demote_free_huge_page(h, page);
> > +			folio = page_folio(page);
> > +			return demote_free_hugetlb_folio(h, folio);
> >  		}
> >  	}
>
> Can't this be
> 	list_for_each_entry(folio, &h->hugepage_freelists[node], lru)
>
> which avoids the call to page_folio() here.
>
> I think the call to PageHWPoison is actually wrong here.  That would
> only check the hwpoison bit on the first page, whereas we want to know
> about the hwpoison bit on any page (don't we?)  So this should be
> folio_test_has_hwpoisoned()?
>
> Or is that a THP-thing that is different for hugetlb pages?

I believe it is different for hugetlb pages.  See hugetlb_set_page_hwpoison()
where it sets PageHWPoison on head page as well as allocating a raw_hwp_page
to track the actual page with poison.  Note that we can not directly flag
hugetlb 'subpages' because we may not have the struct pages due to vmemmap
optimization.  Adding Naoya just to be sure.

Do agree that this could be list_for_each_entry(folio ...
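
A condensed sketch of the bookkeeping Mike describes, loosely based on hugetlb_set_page_hwpoison() in mm/memory-failure.c at the time; error handling and duplicate-entry checks are omitted, so treat the details as illustrative rather than verbatim:

	/* One entry per actually-poisoned subpage, chained off the head page. */
	struct raw_hwp_page {
		struct llist_node node;
		struct page *page;
	};

	static int hugetlb_set_page_hwpoison(struct page *hpage, struct page *page)
	{
		struct raw_hwp_page *raw_hwp;
		/* The PG_hwpoison bit itself is only ever set on the head page. */
		int ret = TestSetPageHWPoison(hpage) ? -EHWPOISON : 0;

		/*
		 * The poisoned subpage is recorded out of line because, with
		 * hugetlb vmemmap optimization, the subpage's struct page may
		 * be mapped read-only, so its own flag cannot be written.
		 */
		raw_hwp = kmalloc(sizeof(struct raw_hwp_page), GFP_ATOMIC);
		if (raw_hwp) {
			raw_hwp->page = page;
			llist_add(&raw_hwp->node, raw_hwp_list_head(hpage));
		}
		return ret;
	}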
On 1/10/23 6:00 PM, Mike Kravetz wrote:
> On 01/10/23 21:40, Matthew Wilcox wrote:
>> On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
>>> @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>>> {
>>> 	int nr_nodes, node;
>>> 	struct page *page;
>>> +	struct folio *folio;
>>>
>>> 	lockdep_assert_held(&hugetlb_lock);
>>>
>>> @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>>> 		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
>>> 			if (PageHWPoison(page))
>>> 				continue;
>>> -
>>> -			return demote_free_huge_page(h, page);
>>> +			folio = page_folio(page);
>>> +			return demote_free_hugetlb_folio(h, folio);
>>> 		}
>>> 	}
>>
>> Can't this be
>> 	list_for_each_entry(folio, &h->hugepage_freelists[node], lru)
>>
>> which avoids the call to page_folio() here.
>>
>> I think the call to PageHWPoison is actually wrong here.  That would
>> only check the hwpoison bit on the first page, whereas we want to know
>> about the hwpoison bit on any page (don't we?)  So this should be
>> folio_test_has_hwpoisoned()?
>>
>> Or is that a THP-thing that is different for hugetlb pages?
>
> I believe it is different for hugetlb pages.  See hugetlb_set_page_hwpoison()
> where it sets PageHWPoison on head page as well as allocating a raw_hwp_page

I agree, this line in hugetlb_set_page_hwpoison (which is now
folio_set_hugetlb_hwpoison) sets the HWPoison flag on the head page:

	int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;

so the correct code in demote_pool_huge_page() would be:

	list_for_each_entry(folio, &h->hugepage_freelists[node], lru) {
		if (folio_test_hwpoison(folio))
			continue;
		return demote_free_hugetlb_folio(h, folio);
	}

> to track the actual page with poison.  Note that we can not directly flag
> hugetlb 'subpages' because we may not have the struct pages due to vmemmap
> optimization.  Adding Naoya just to be sure.
>
> Do agree that this could be list_for_each_entry(folio ...

Will make this change in v3.

Thanks,
Sidhartha Kumar
On Fri, Jan 13, 2023 at 09:16:23AM -0600, Sidhartha Kumar wrote:
> On 1/10/23 6:00 PM, Mike Kravetz wrote:
> > On 01/10/23 21:40, Matthew Wilcox wrote:
> > > On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
> > > > @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> > > > {
> > > > 	int nr_nodes, node;
> > > > 	struct page *page;
> > > > +	struct folio *folio;
> > > >
> > > > 	lockdep_assert_held(&hugetlb_lock);
> > > >
> > > > @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> > > > 		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
> > > > 			if (PageHWPoison(page))
> > > > 				continue;
> > > > -
> > > > -			return demote_free_huge_page(h, page);
> > > > +			folio = page_folio(page);
> > > > +			return demote_free_hugetlb_folio(h, folio);
> > > > 		}
> > > > 	}
> > >
> > > Can't this be
> > > 	list_for_each_entry(folio, &h->hugepage_freelists[node], lru)
> > >
> > > which avoids the call to page_folio() here.
> > >
> > > I think the call to PageHWPoison is actually wrong here.  That would
> > > only check the hwpoison bit on the first page, whereas we want to know
> > > about the hwpoison bit on any page (don't we?)  So this should be
> > > folio_test_has_hwpoisoned()?
> > >
> > > Or is that a THP-thing that is different for hugetlb pages?
> >
> > I believe it is different for hugetlb pages.  See hugetlb_set_page_hwpoison()
> > where it sets PageHWPoison on head page as well as allocating a raw_hwp_page
>
> I agree, this line in hugetlb_set_page_hwpoison (which is now
> folio_set_hugetlb_hwpoison) sets the HWPoison flag on the head page:
>
> 	int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;
>
> so the correct code in demote_pool_huge_page() would be:
>
> 	list_for_each_entry(folio, &h->hugepage_freelists[node], lru) {
> 		if (folio_test_hwpoison(folio))
> 			continue;
> 		return demote_free_hugetlb_folio(h, folio);
> 	}
>
> > to track the actual page with poison.  Note that we can not directly flag
> > hugetlb 'subpages' because we may not have the struct pages due to vmemmap
> > optimization.  Adding Naoya just to be sure.

Yes, the above code and the reason totally make sense to me.
Thanks for the patch and the update.

- Naoya Horiguchi
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f8cd0c694fe9..21ffab744c8f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3435,12 +3435,12 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	return 0;
 }
 
-static int demote_free_huge_page(struct hstate *h, struct page *page)
+static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
-	int i, nid = page_to_nid(page);
+	int i, nid = folio_nid(folio);
 	struct hstate *target_hstate;
-	struct folio *folio = page_folio(page);
 	struct page *subpage;
+	struct folio *inner_folio;
 	int rc = 0;
 
 	target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order);
@@ -3448,18 +3448,18 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 	remove_hugetlb_folio_for_demote(h, folio, false);
 	spin_unlock_irq(&hugetlb_lock);
 
-	rc = hugetlb_vmemmap_restore(h, page);
+	rc = hugetlb_vmemmap_restore(h, &folio->page);
 	if (rc) {
-		/* Allocation of vmemmmap failed, we can not demote page */
+		/* Allocation of vmemmmap failed, we can not demote folio */
 		spin_lock_irq(&hugetlb_lock);
-		set_page_refcounted(page);
-		add_hugetlb_folio(h, page_folio(page), false);
+		folio_ref_unfreeze(folio, 1);
+		add_hugetlb_folio(h, folio, false);
 		return rc;
 	}
 
 	/*
 	 * Use destroy_compound_hugetlb_folio_for_demote for all huge page
-	 * sizes as it will not ref count pages.
+	 * sizes as it will not ref count folios.
 	 */
 	destroy_compound_hugetlb_folio_for_demote(folio, huge_page_order(h));
 
@@ -3474,15 +3474,15 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 	mutex_lock(&target_hstate->resize_lock);
 	for (i = 0; i < pages_per_huge_page(h);
 				i += pages_per_huge_page(target_hstate)) {
-		subpage = nth_page(page, i);
-		folio = page_folio(subpage);
+		subpage = folio_page(folio, i);
+		inner_folio = page_folio(subpage);
 		if (hstate_is_gigantic(target_hstate))
-			prep_compound_gigantic_folio_for_demote(folio,
+			prep_compound_gigantic_folio_for_demote(inner_folio,
 							target_hstate->order);
 		else
 			prep_compound_page(subpage, target_hstate->order);
-		set_page_private(subpage, 0);
-		prep_new_hugetlb_folio(target_hstate, folio, nid);
+		folio_change_private(inner_folio, NULL);
+		prep_new_hugetlb_folio(target_hstate, inner_folio, nid);
 		free_huge_page(subpage);
 	}
 	mutex_unlock(&target_hstate->resize_lock);
@@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
 {
 	int nr_nodes, node;
 	struct page *page;
+	struct folio *folio;
 
 	lockdep_assert_held(&hugetlb_lock);
 
@@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
 		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
 			if (PageHWPoison(page))
 				continue;
-
-			return demote_free_huge_page(h, page);
+			folio = page_folio(page);
+			return demote_free_hugetlb_folio(h, folio);
 		}
 	}
Change demote_free_huge_page() to demote_free_hugetlb_folio() and have
demote_pool_huge_page() pass in a folio.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)