Message ID | 20250415054705.370412-1-osalvador@suse.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm, hugetlb: Reset mapping to TAIL_MAPPING before restoring vmemmap | expand |
On 15.04.25 07:47, Oscar Salvador wrote: > commit 4eeec8c89a0c ("mm: move hugetlb specific things in folio to page[3]") > shifted hugetlb specific stuff, and now mapping overlaps _hugetlb_cgroup field. > > _hugetlb_cgroup is set to NULL when preparing the hugetlb page in > init_new_hugetlb_folio(). > For a better picture, this is page->mapping before and after the comming > for the first three tail pages: > > before: > page: fffff51a44358040 0000000000000000 > page: fffff51a44358080 0000000000000000 > page: fffff51a443580c0 dead000000000400 > > after: > page: fffff1f0042b0040 0000000000000000 > page: fffff1f0042b0080 fffff1f0042b0090 > page: fffff1f0042b00c0 0000000000000000 > > Tail#2 has fffff1f0042b0090 because of the _deferred_list initialization, > which was also shifted, but that is not a problem. > > For HVO, upon restoring that gets copied in some tail pages (reset_struct_pages) > and so those tail pages will not have TAIL_MAPPING set and the check > in free_tail_page_prepare() will fail: > > kernel: BUG: Bad page state in process kworker/0:3 pfn:10ac40 > kernel: page does not match folio > kernel: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ac40 > kernel: flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) > kernel: raw: 0017ffffc0000000 fffff1f0042b0000 0000000000000000 0000000000000000 > kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 > kernel: page dumped because: corrupted mapping in tail page > > Reset _hugetlb_cgroup to TAIL_MAPPING before restoring so tail pages have the > right value. Hi, To handle that for ordinary hugtlb alloc/free I added in that patch in free_tail_page_prepare(): case 3: /* the third tail page: hugetlb specifics overlap ->mappings */ if (IS_ENABLED(CONFIG_HUGETLB_PAGE)) break; fallthrough; default: if (page->mapping != TAIL_MAPPING) { bad_page(page, "corrupted mapping in tail page"); goto out; } break; } Now I am confused why that check doesn't catch that? Apparently only a problem with HVO? Because I recall testing the ordinary alloc/free.
On 15.04.25 09:23, David Hildenbrand wrote: > On 15.04.25 07:47, Oscar Salvador wrote: >> commit 4eeec8c89a0c ("mm: move hugetlb specific things in folio to page[3]") >> shifted hugetlb specific stuff, and now mapping overlaps _hugetlb_cgroup field. >> >> _hugetlb_cgroup is set to NULL when preparing the hugetlb page in >> init_new_hugetlb_folio(). >> For a better picture, this is page->mapping before and after the comming >> for the first three tail pages: >> >> before: >> page: fffff51a44358040 0000000000000000 >> page: fffff51a44358080 0000000000000000 >> page: fffff51a443580c0 dead000000000400 >> >> after: >> page: fffff1f0042b0040 0000000000000000 >> page: fffff1f0042b0080 fffff1f0042b0090 >> page: fffff1f0042b00c0 0000000000000000 >> >> Tail#2 has fffff1f0042b0090 because of the _deferred_list initialization, >> which was also shifted, but that is not a problem. >> >> For HVO, upon restoring that gets copied in some tail pages (reset_struct_pages) >> and so those tail pages will not have TAIL_MAPPING set and the check >> in free_tail_page_prepare() will fail: >> >> kernel: BUG: Bad page state in process kworker/0:3 pfn:10ac40 >> kernel: page does not match folio >> kernel: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ac40 >> kernel: flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) >> kernel: raw: 0017ffffc0000000 fffff1f0042b0000 0000000000000000 0000000000000000 >> kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 >> kernel: page dumped because: corrupted mapping in tail page >> >> Reset _hugetlb_cgroup to TAIL_MAPPING before restoring so tail pages have the >> right value. > > Hi, > > To handle that for ordinary hugtlb alloc/free I added in that patch in free_tail_page_prepare(): > > case 3: > /* the third tail page: hugetlb specifics overlap ->mappings */ > if (IS_ENABLED(CONFIG_HUGETLB_PAGE)) > break; > fallthrough; > default: > if (page->mapping != TAIL_MAPPING) { > bad_page(page, "corrupted mapping in tail page"); > goto out; > } > break; > } > > Now I am confused why that check doesn't catch that? > > Apparently only a problem with HVO? Because I recall testing the ordinary alloc/free. Ah, reading about the HVO hackery in the comment above NR_RESET_STRUCT_PAGE, might the following fix it? diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 9a99dfa3c4958..27245e86df250 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -238,11 +238,11 @@ static void vmemmap_remap_pte(pte_t *pte, unsigned long addr, * struct page, the special metadata (e.g. page->flags or page->mapping) * cannot copy to the tail struct page structs. The invalid value will be * checked in the free_tail_page_prepare(). In order to avoid the message - * of "corrupted mapping in tail page". We need to reset at least 3 (one - * head struct page struct and two tail struct page structs) struct page + * of "corrupted mapping in tail page". We need to reset at least 4 (one + * head struct page struct and three tail struct page structs) struct page * structs. */ -#define NR_RESET_STRUCT_PAGE 3 +#define NR_RESET_STRUCT_PAGE 4 static inline void reset_struct_pages(struct page *start) {
> On Apr 15, 2025, at 15:32, David Hildenbrand <david@redhat.com> wrote: > > On 15.04.25 09:23, David Hildenbrand wrote: >> On 15.04.25 07:47, Oscar Salvador wrote: >>> commit 4eeec8c89a0c ("mm: move hugetlb specific things in folio to page[3]") >>> shifted hugetlb specific stuff, and now mapping overlaps _hugetlb_cgroup field. >>> >>> _hugetlb_cgroup is set to NULL when preparing the hugetlb page in >>> init_new_hugetlb_folio(). >>> For a better picture, this is page->mapping before and after the comming >>> for the first three tail pages: >>> >>> before: >>> page: fffff51a44358040 0000000000000000 >>> page: fffff51a44358080 0000000000000000 >>> page: fffff51a443580c0 dead000000000400 >>> >>> after: >>> page: fffff1f0042b0040 0000000000000000 >>> page: fffff1f0042b0080 fffff1f0042b0090 >>> page: fffff1f0042b00c0 0000000000000000 >>> >>> Tail#2 has fffff1f0042b0090 because of the _deferred_list initialization, >>> which was also shifted, but that is not a problem. >>> >>> For HVO, upon restoring that gets copied in some tail pages (reset_struct_pages) >>> and so those tail pages will not have TAIL_MAPPING set and the check >>> in free_tail_page_prepare() will fail: >>> >>> kernel: BUG: Bad page state in process kworker/0:3 pfn:10ac40 >>> kernel: page does not match folio >>> kernel: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ac40 >>> kernel: flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) >>> kernel: raw: 0017ffffc0000000 fffff1f0042b0000 0000000000000000 0000000000000000 >>> kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 >>> kernel: page dumped because: corrupted mapping in tail page >>> >>> Reset _hugetlb_cgroup to TAIL_MAPPING before restoring so tail pages have the >>> right value. >> Hi, >> To handle that for ordinary hugtlb alloc/free I added in that patch in free_tail_page_prepare(): >> case 3: >> /* the third tail page: hugetlb specifics overlap ->mappings */ >> if (IS_ENABLED(CONFIG_HUGETLB_PAGE)) >> break; >> fallthrough; >> default: >> if (page->mapping != TAIL_MAPPING) { >> bad_page(page, "corrupted mapping in tail page"); >> goto out; >> } >> break; >> } >> Now I am confused why that check doesn't catch that? >> Apparently only a problem with HVO? Because I recall testing the ordinary alloc/free. > > Ah, reading about the HVO hackery in the comment above NR_RESET_STRUCT_PAGE, might the following fix it? Yes. And the Fixes tag should be Fixes: 4eeec8c89a0c ("mm: move hugetlb specific things in folio to page[3]") Thanks. > > > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c > index 9a99dfa3c4958..27245e86df250 100644 > --- a/mm/hugetlb_vmemmap.c > +++ b/mm/hugetlb_vmemmap.c > @@ -238,11 +238,11 @@ static void vmemmap_remap_pte(pte_t *pte, unsigned long addr, > * struct page, the special metadata (e.g. page->flags or page->mapping) > * cannot copy to the tail struct page structs. The invalid value will be > * checked in the free_tail_page_prepare(). In order to avoid the message > - * of "corrupted mapping in tail page". We need to reset at least 3 (one > - * head struct page struct and two tail struct page structs) struct page > + * of "corrupted mapping in tail page". We need to reset at least 4 (one > + * head struct page struct and three tail struct page structs) struct page > * structs. > */ > -#define NR_RESET_STRUCT_PAGE 3 > +#define NR_RESET_STRUCT_PAGE 4 > static inline void reset_struct_pages(struct page *start) > { > > -- > Cheers, > > David / dhildenb
On Tue, Apr 15, 2025 at 09:32:40AM +0200, David Hildenbrand wrote:
> Ah, reading about the HVO hackery in the comment above NR_RESET_STRUCT_PAGE, might the following fix it?
Yap, I was experimenting in parallel with that after sending the patch
and it also fixes the issue, and I think is nicer indeed.
I do not want to step on someone else's fet, so do you want me to send
the patch or will you?
thanks!
On 15.04.25 10:02, Oscar Salvador wrote: > On Tue, Apr 15, 2025 at 09:32:40AM +0200, David Hildenbrand wrote: >> Ah, reading about the HVO hackery in the comment above NR_RESET_STRUCT_PAGE, might the following fix it? > > Yap, I was experimenting in parallel with that after sending the patch > and it also fixes the issue, and I think is nicer indeed. > > I do not want to step on someone else's fet, so do you want me to send > the patch or will you? Yes, please send it out.
Hi Oscar, kernel test robot noticed the following build errors: [auto build test ERROR on akpm-mm/mm-everything] url: https://github.com/intel-lab-lkp/linux/commits/Oscar-Salvador/mm-hugetlb-Reset-mapping-to-TAIL_MAPPING-before-restoring-vmemmap/20250415-134835 base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything patch link: https://lore.kernel.org/r/20250415054705.370412-1-osalvador%40suse.de patch subject: [PATCH] mm, hugetlb: Reset mapping to TAIL_MAPPING before restoring vmemmap config: x86_64-buildonly-randconfig-003-20250416 (https://download.01.org/0day-ci/archive/20250416/202504161523.M2CmTjsj-lkp@intel.com/config) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250416/202504161523.M2CmTjsj-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202504161523.M2CmTjsj-lkp@intel.com/ All errors (new ones prefixed by >>): mm/hugetlb_vmemmap.c: In function 'hugetlb_vmemmap_restore_folio': >> mm/hugetlb_vmemmap.c:506:9: error: implicit declaration of function 'set_hugetlb_cgroup' [-Werror=implicit-function-declaration] 506 | set_hugetlb_cgroup(folio, TAIL_MAPPING); | ^~~~~~~~~~~~~~~~~~ cc1: some warnings being treated as errors vim +/set_hugetlb_cgroup +506 mm/hugetlb_vmemmap.c 488 489 /** 490 * hugetlb_vmemmap_restore_folio - restore previously optimized (by 491 * hugetlb_vmemmap_optimize_folio()) vmemmap pages which 492 * will be reallocated and remapped. 493 * @h: struct hstate. 494 * @folio: the folio whose vmemmap pages will be restored. 495 * 496 * Return: %0 if @folio's vmemmap pages have been reallocated and remapped, 497 * negative error code otherwise. 498 */ 499 int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio) 500 { 501 /* 502 * Before restoring vmemmap, make sure to reset mapping to TAIL_MAPPING, 503 * so tail pages that were reset will have the right thing after being 504 * restored, and the checks in free_tail_page_prepare() will pass. 505 */ > 506 set_hugetlb_cgroup(folio, TAIL_MAPPING); 507 return __hugetlb_vmemmap_restore_folio(h, folio, VMEMMAP_SYNCHRONIZE_RCU); 508 } 509
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 9a99dfa3c495..3d763182c834 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -498,6 +498,12 @@ static int __hugetlb_vmemmap_restore_folio(const struct hstate *h, */ int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio) { + /* + * Before restoring vmemmap, make sure to reset mapping to TAIL_MAPPING, + * so tail pages that were reset will have the right thing after being + * restored, and the checks in free_tail_page_prepare() will pass. + */ + set_hugetlb_cgroup(folio, TAIL_MAPPING); return __hugetlb_vmemmap_restore_folio(h, folio, VMEMMAP_SYNCHRONIZE_RCU); }
commit 4eeec8c89a0c ("mm: move hugetlb specific things in folio to page[3]") shifted hugetlb specific stuff, and now mapping overlaps _hugetlb_cgroup field. _hugetlb_cgroup is set to NULL when preparing the hugetlb page in init_new_hugetlb_folio(). For a better picture, this is page->mapping before and after the comming for the first three tail pages: before: page: fffff51a44358040 0000000000000000 page: fffff51a44358080 0000000000000000 page: fffff51a443580c0 dead000000000400 after: page: fffff1f0042b0040 0000000000000000 page: fffff1f0042b0080 fffff1f0042b0090 page: fffff1f0042b00c0 0000000000000000 Tail#2 has fffff1f0042b0090 because of the _deferred_list initialization, which was also shifted, but that is not a problem. For HVO, upon restoring that gets copied in some tail pages (reset_struct_pages) and so those tail pages will not have TAIL_MAPPING set and the check in free_tail_page_prepare() will fail: kernel: BUG: Bad page state in process kworker/0:3 pfn:10ac40 kernel: page does not match folio kernel: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10ac40 kernel: flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) kernel: raw: 0017ffffc0000000 fffff1f0042b0000 0000000000000000 0000000000000000 kernel: raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 kernel: page dumped because: corrupted mapping in tail page Reset _hugetlb_cgroup to TAIL_MAPPING before restoring so tail pages have the right value. Signed-off-by: Oscar Salvador <osalvador@suse.de> --- Hi guys, Although I can no longer reproduce the issue with this patch, I'm not entirely sure this is the right way to fix the problem, so I'm open to suggestions. --- mm/hugetlb_vmemmap.c | 6 ++++++ 1 file changed, 6 insertions(+)