Message ID | 20170920201714.19817-3-pasha.tatashin@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Wed 20-09-17 16:17:04, Pavel Tatashin wrote:
> Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT),
> flags and other fields in "struct page"es are never changed prior to first
> initializing struct pages by going through __init_single_page().
>
> With deferred struct page feature enabled there is a case where we set some
> fields prior to initializing:
>
> mem_init() {
>     register_page_bootmem_info();
>     free_all_bootmem();
>     ...
> }
>
> When register_page_bootmem_info() is called only non-deferred struct pages
> are initialized. But, this function goes through some reserved pages which
> might be part of the deferred, and thus are not yet initialized.
>
> mem_init
>  register_page_bootmem_info
>   register_page_bootmem_info_node
>    get_page_bootmem
>     .. setting fields here ..
>     such as: page->freelist = (void *)type;
>
> free_all_bootmem()
>  free_low_memory_core_early()
>   for_each_reserved_mem_region()
>    reserve_bootmem_region()
>     init_reserved_page() <- Only if this is deferred reserved page
>      __init_single_pfn()
>       __init_single_page()
>        memset(0) <-- Loose the set fields here
>
> We end-up with similar issue as in the previous patch, where currently we
> do not observe problem as memory is zeroed. But, if flag asserts are
> changed we can start hitting issues.
>
> Also, because in this patch series we will stop zeroing struct page memory
> during allocation, we must make sure that struct pages are properly
> initialized prior to using them.
>
> The deferred-reserved pages are initialized in free_all_bootmem().
> Therefore, the fix is to switch the above calls.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
> Reviewed-by: Bob Picco <bob.picco@oracle.com>
> Acked-by: David S. Miller <davem@davemloft.net>

As you separated x86 and sparc patches doing essentially the same I
assume David is going to take this patch?

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  arch/sparc/mm/init_64.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
> index 6034569e2c0d..310c6754bcaa 100644
> --- a/arch/sparc/mm/init_64.c
> +++ b/arch/sparc/mm/init_64.c
> @@ -2548,9 +2548,15 @@ void __init mem_init(void)
>  {
>  	high_memory = __va(last_valid_pfn << PAGE_SHIFT);
>
> -	register_page_bootmem_info();
>  	free_all_bootmem();
>
> +	/* Must be done after boot memory is put on freelist, because here we
> +	 * might set fields in deferred struct pages that have not yet been
> +	 * initialized, and free_all_bootmem() initializes all the reserved
> +	 * deferred pages for us.
> +	 */
> +	register_page_bootmem_info();
> +
>  	/*
>  	 * Set up the zero page, mark it reserved, so that page count
>  	 * is not manipulated when freeing the page from user ptes.
> --
> 2.14.1

> As you separated x86 and sparc patches doing essentially the same I
> assume David is going to take this patch?

Correct, I noticed that usually platform specific changes are done in
separate patches even if they are small. Dave already Acked this patch.
So, I do not think it should be separated from the rest of the patches
when this project goes into mm-tree.

> Acked-by: Michal Hocko <mhocko@suse.com>

Thank you,
Pasha

diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 6034569e2c0d..310c6754bcaa 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2548,9 +2548,15 @@ void __init mem_init(void)
 {
 	high_memory = __va(last_valid_pfn << PAGE_SHIFT);
 
-	register_page_bootmem_info();
 	free_all_bootmem();
 
+	/* Must be done after boot memory is put on freelist, because here we
+	 * might set fields in deferred struct pages that have not yet been
+	 * initialized, and free_all_bootmem() initializes all the reserved
+	 * deferred pages for us.
+	 */
+	register_page_bootmem_info();
+
 	/*
 	 * Set up the zero page, mark it reserved, so that page count
 	 * is not manipulated when freeing the page from user ptes.
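
The ordering problem the patch describes can be illustrated with a small userspace sketch. This is not kernel code: struct fake_page, init_single_page(), record_bootmem_type(), old_order() and new_order() are hypothetical stand-ins that only model the behavior stated in the commit message (a field is set in a struct page, and initialization does a memset(0) over it).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for the kernel's struct page. */
struct fake_page {
	unsigned long flags;
	void *freelist;
};

/* Models __init_single_page(): the whole struct is zeroed first. */
void init_single_page(struct fake_page *page)
{
	memset(page, 0, sizeof(*page));
}

/* Models get_page_bootmem(): page->freelist = (void *)type; */
void record_bootmem_type(struct fake_page *page, unsigned long type)
{
	page->freelist = (void *)type;
}

/* Old order: set the field, then initialize. The memset wipes the field. */
void *old_order(unsigned long type)
{
	struct fake_page page;

	memset(&page, 0xff, sizeof(page));	/* assume uninitialized contents */
	record_bootmem_type(&page, type);	/* register_page_bootmem_info() */
	init_single_page(&page);		/* free_all_bootmem() path */
	return page.freelist;
}

/* New order: initialize first, then set the field. The field survives. */
void *new_order(unsigned long type)
{
	struct fake_page page;

	memset(&page, 0xff, sizeof(page));	/* assume uninitialized contents */
	init_single_page(&page);		/* free_all_bootmem() path */
	record_bootmem_type(&page, type);	/* register_page_bootmem_info() */
	return page.freelist;
}

/* Self-check that the two orders differ in the way the patch describes. */
void check_ordering(void)
{
	assert(old_order(5) == NULL);
	assert(new_order(5) == (void *)5UL);
}
```

In this model the old order returns NULL because the value stored in freelist is cleared by the later memset, while the new order keeps it, which mirrors why the patch moves register_page_bootmem_info() after free_all_bootmem().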