Message ID | 010101857bbc4d26-d9683bb4-c4f0-465b-aea6-5314dbf0aa01-000000@us-west-2.amazonses.com (mailing list archive) |
---|---|
State | Deferred, archived |
Headers | show |
Series | [1/1] mm: Always release pages to the buddy allocator in memblock_free_late(). | expand |
Hi, On Wed, Jan 04, 2023 at 07:43:36AM +0000, Aaron Thompson wrote: > If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages() > only releases pages to the buddy allocator if they are not in the > deferred range. This is correct for free pages (as defined by > for_each_free_mem_pfn_range_in_zone()) because free pages in the > deferred range will be initialized and released as part of the deferred > init process. memblock_free_pages() is called by memblock_free_late(), > which is used to free reserved ranges after memblock_free_all() has > run. memblock_free_all() initializes all pages in reserved ranges, and To be precise, memblock_free_all() frees pages, or releases them to the pages allocator, rather than initializes. > accordingly, those pages are not touched by the deferred init > process. This means that currently, if the pages that > memblock_free_late() intends to release are in the deferred range, they > will never be released to the buddy allocator. They will forever be > reserved. > > In addition, memblock_free_pages() calls kmsan_memblock_free_pages(), > which is also correct for free pages but is not correct for reserved > pages. KMSAN metadata for reserved pages is initialized by > kmsan_init_shadow(), which runs shortly before memblock_free_all(). > > For both of these reasons, memblock_free_pages() should only be called > for free pages, and memblock_free_late() should call __free_pages_core() > directly instead. Overall looks fine to me and I couldn't spot potential issues. I'd appreciate if you add a paragraph about the actual issue with EFI boot you described in the cover letter to the commit message. > Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") > Signed-off-by: Aaron Thompson <dev@aaront.org> > --- > mm/memblock.c | 2 +- > tools/testing/memblock/internal.h | 4 ++++ > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/mm/memblock.c b/mm/memblock.c > index 511d4783dcf1..56a5b6086c50 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -1640,7 +1640,7 @@ void __init memblock_free_late(phys_addr_t base, phys_addr_t size) > end = PFN_DOWN(base + size); > > for (; cursor < end; cursor++) { > - memblock_free_pages(pfn_to_page(cursor), cursor, 0); > + __free_pages_core(pfn_to_page(cursor), 0); Please add a comment that explains why it is safe to call __free_pages_core() here. Something like /* * Reserved pages are always initialized by the end of * memblock_free_all() either during memmap_init() or, with deferred * initialization if struct page in reserve_bootmem_region() */ > totalram_pages_inc(); > } > } > diff --git a/tools/testing/memblock/internal.h b/tools/testing/memblock/internal.h > index fdb7f5db7308..85973e55489e 100644 > --- a/tools/testing/memblock/internal.h > +++ b/tools/testing/memblock/internal.h > @@ -15,6 +15,10 @@ bool mirrored_kernelcore = false; > > struct page {}; > > +void __free_pages_core(struct page *page, unsigned int order) > +{ > +} > + > void memblock_free_pages(struct page *page, unsigned long pfn, > unsigned int order) > { > -- > 2.30.2 >
Hi Mike, On 2023-01-04 11:34, Mike Rapoport wrote: > Hi, > > On Wed, Jan 04, 2023 at 07:43:36AM +0000, Aaron Thompson wrote: >> If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages() >> only releases pages to the buddy allocator if they are not in the >> deferred range. This is correct for free pages (as defined by >> for_each_free_mem_pfn_range_in_zone()) because free pages in the >> deferred range will be initialized and released as part of the >> deferred >> init process. memblock_free_pages() is called by memblock_free_late(), >> which is used to free reserved ranges after memblock_free_all() has >> run. memblock_free_all() initializes all pages in reserved ranges, and > > To be precise, memblock_free_all() frees pages, or releases them to the > pages allocator, rather than initializes. As you mentioned in the comment below, whether memblock_free_all() does any initializing depends on the particular deferred init situation. memblock_free_all() does ultimately call init_reserved_page() for every reserved page (via reserve_bootmem_region()), but that only actually initializes the page if it's in the deferred range. In either case, all I was trying to say here is that we can be certain that all reserved pages have been initialized after memblock_free_all() has run, so I'll rephrase that. >> accordingly, those pages are not touched by the deferred init >> process. This means that currently, if the pages that >> memblock_free_late() intends to release are in the deferred range, >> they >> will never be released to the buddy allocator. They will forever be >> reserved. >> >> In addition, memblock_free_pages() calls kmsan_memblock_free_pages(), >> which is also correct for free pages but is not correct for reserved >> pages. KMSAN metadata for reserved pages is initialized by >> kmsan_init_shadow(), which runs shortly before memblock_free_all(). >> >> For both of these reasons, memblock_free_pages() should only be called >> for free pages, and memblock_free_late() should call >> __free_pages_core() >> directly instead. > > Overall looks fine to me and I couldn't spot potential issues. > > I'd appreciate if you add a paragraph about the actual issue with EFI > boot > you described in the cover letter to the commit message. Sure, will do. >> Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages >> if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") >> Signed-off-by: Aaron Thompson <dev@aaront.org> >> --- >> mm/memblock.c | 2 +- >> tools/testing/memblock/internal.h | 4 ++++ >> 2 files changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/mm/memblock.c b/mm/memblock.c >> index 511d4783dcf1..56a5b6086c50 100644 >> --- a/mm/memblock.c >> +++ b/mm/memblock.c >> @@ -1640,7 +1640,7 @@ void __init memblock_free_late(phys_addr_t base, >> phys_addr_t size) >> end = PFN_DOWN(base + size); >> >> for (; cursor < end; cursor++) { >> - memblock_free_pages(pfn_to_page(cursor), cursor, 0); >> + __free_pages_core(pfn_to_page(cursor), 0); > > Please add a comment that explains why it is safe to call > __free_pages_core() here. > Something like > > /* > * Reserved pages are always initialized by the end of > * memblock_free_all() either during memmap_init() or, with deferred > * initialization if struct page in reserve_bootmem_region() > */ Will do. Thanks for the review. >> totalram_pages_inc(); >> } >> } >> diff --git a/tools/testing/memblock/internal.h >> b/tools/testing/memblock/internal.h >> index fdb7f5db7308..85973e55489e 100644 >> --- a/tools/testing/memblock/internal.h >> +++ b/tools/testing/memblock/internal.h >> @@ -15,6 +15,10 @@ bool mirrored_kernelcore = false; >> >> struct page {}; >> >> +void __free_pages_core(struct page *page, unsigned int order) >> +{ >> +} >> + >> void memblock_free_pages(struct page *page, unsigned long pfn, >> unsigned int order) >> { >> -- >> 2.30.2 >> Thanks, -- Aaron
diff --git a/mm/memblock.c b/mm/memblock.c index 511d4783dcf1..56a5b6086c50 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1640,7 +1640,7 @@ void __init memblock_free_late(phys_addr_t base, phys_addr_t size) end = PFN_DOWN(base + size); for (; cursor < end; cursor++) { - memblock_free_pages(pfn_to_page(cursor), cursor, 0); + __free_pages_core(pfn_to_page(cursor), 0); totalram_pages_inc(); } } diff --git a/tools/testing/memblock/internal.h b/tools/testing/memblock/internal.h index fdb7f5db7308..85973e55489e 100644 --- a/tools/testing/memblock/internal.h +++ b/tools/testing/memblock/internal.h @@ -15,6 +15,10 @@ bool mirrored_kernelcore = false; struct page {}; +void __free_pages_core(struct page *page, unsigned int order) +{ +} + void memblock_free_pages(struct page *page, unsigned long pfn, unsigned int order) {
If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages() only releases pages to the buddy allocator if they are not in the deferred range. This is correct for free pages (as defined by for_each_free_mem_pfn_range_in_zone()) because free pages in the deferred range will be initialized and released as part of the deferred init process. memblock_free_pages() is called by memblock_free_late(), which is used to free reserved ranges after memblock_free_all() has run. memblock_free_all() initializes all pages in reserved ranges, and accordingly, those pages are not touched by the deferred init process. This means that currently, if the pages that memblock_free_late() intends to release are in the deferred range, they will never be released to the buddy allocator. They will forever be reserved. In addition, memblock_free_pages() calls kmsan_memblock_free_pages(), which is also correct for free pages but is not correct for reserved pages. KMSAN metadata for reserved pages is initialized by kmsan_init_shadow(), which runs shortly before memblock_free_all(). For both of these reasons, memblock_free_pages() should only be called for free pages, and memblock_free_late() should call __free_pages_core() directly instead. Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Signed-off-by: Aaron Thompson <dev@aaront.org> --- mm/memblock.c | 2 +- tools/testing/memblock/internal.h | 4 ++++ 2 files changed, 5 insertions(+), 1 deletion(-)