Message ID | 20240629013322.12364-2-richard.weiyang@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [1/4] mm: use zonelist_zone() to get zone | expand |
On 29.06.24 03:33, Wei Yang wrote: > Initialize all pages reserved is an ancient behavior. > > Since commit 92923ca3aace ("mm: meminit: only set page reserved in the > memblock region"), SetPageReserved is removed from > __init_single_page(). Only those reserved pages are marked PG_reserved. > > But we still set PG_reserved on offline and check it on online. > > Following two commits removed both of them: > > * Commit 0ee5f4f31d36 ("mm/page_alloc.c: don't set pages PageReserved() > when offlining") removed the set on offline. > * Commit 5ecae6359e3a ("mm/memory_hotplug: drop PageReserved() check in > online_pages_range()") removed the check on online. > > This means we set PG_reserved for hot-plugged memory at initialization > is not helpful and a little different from bootmem initialization path. > Now we can remove it. It's not that easy for ZONE_DEVICE. Also, see mm/mm-stable commit 3dadec1babf9eee0c67c967df931d6f0cb124a04 Author: David Hildenbrand <david@redhat.com> Date: Fri Jun 7 11:09:36 2024 +0200 mm: pass meminit_context to __free_pages_core() Patch series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". commit b873faaa609ab44c223b2327f55d2b6a2ba4ca9c Author: David Hildenbrand <david@redhat.com> Date: Fri Jun 7 11:09:37 2024 +0200 mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() instead of PageReserved() If you want to work on removing it for ZONE_DEVICE, one idea is to replace all relevant PageReserved() checks by a more generic function that would check PageReserved() and the zone
On Sat, Jun 29, 2024 at 08:19:49AM +0200, David Hildenbrand wrote: >On 29.06.24 03:33, Wei Yang wrote: >> Initialize all pages reserved is an ancient behavior. >> >> Since commit 92923ca3aace ("mm: meminit: only set page reserved in the >> memblock region"), SetPageReserved is removed from >> __init_single_page(). Only those reserved pages are marked PG_reserved. >> >> But we still set PG_reserved on offline and check it on online. >> >> Following two commits removed both of them: >> >> * Commit 0ee5f4f31d36 ("mm/page_alloc.c: don't set pages PageReserved() >> when offlining") removed the set on offline. >> * Commit 5ecae6359e3a ("mm/memory_hotplug: drop PageReserved() check in >> online_pages_range()") removed the check on online. >> >> This means we set PG_reserved for hot-plugged memory at initialization >> is not helpful and a little different from bootmem initialization path. >> Now we can remove it. > >It's not that easy for ZONE_DEVICE. > >Also, see mm/mm-stable > >commit 3dadec1babf9eee0c67c967df931d6f0cb124a04 >Author: David Hildenbrand <david@redhat.com> >Date: Fri Jun 7 11:09:36 2024 +0200 > > mm: pass meminit_context to __free_pages_core() > > Patch series "mm/memory_hotplug: use PageOffline() instead of > PageReserved() for !ZONE_DEVICE". > > >commit b873faaa609ab44c223b2327f55d2b6a2ba4ca9c >Author: David Hildenbrand <david@redhat.com> >Date: Fri Jun 7 11:09:37 2024 +0200 > > mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() >instead of PageReserved() > Let me try to understand this. You also tries to get rid of PG_reserved but you want PG_offline instead, because this benefit virtio-mem, right? But I don't get why PG_offline is wrong for ZONE_DEVICE. I may miss some knowledge for it. > >If you want to work on removing it for ZONE_DEVICE, one idea >is to replace all relevant PageReserved() checks by a >more generic function that would check PageReserved() and the zone > > >-- >Cheers, > >David / dhildenb
On 29.06.24 10:32, Wei Yang wrote: > On Sat, Jun 29, 2024 at 08:19:49AM +0200, David Hildenbrand wrote: >> On 29.06.24 03:33, Wei Yang wrote: >>> Initialize all pages reserved is an ancient behavior. >>> >>> Since commit 92923ca3aace ("mm: meminit: only set page reserved in the >>> memblock region"), SetPageReserved is removed from >>> __init_single_page(). Only those reserved pages are marked PG_reserved. >>> >>> But we still set PG_reserved on offline and check it on online. >>> >>> Following two commits removed both of them: >>> >>> * Commit 0ee5f4f31d36 ("mm/page_alloc.c: don't set pages PageReserved() >>> when offlining") removed the set on offline. >>> * Commit 5ecae6359e3a ("mm/memory_hotplug: drop PageReserved() check in >>> online_pages_range()") removed the check on online. >>> >>> This means we set PG_reserved for hot-plugged memory at initialization >>> is not helpful and a little different from bootmem initialization path. >>> Now we can remove it. >> >> It's not that easy for ZONE_DEVICE. >> >> Also, see mm/mm-stable >> >> commit 3dadec1babf9eee0c67c967df931d6f0cb124a04 >> Author: David Hildenbrand <david@redhat.com> >> Date: Fri Jun 7 11:09:36 2024 +0200 >> >> mm: pass meminit_context to __free_pages_core() >> >> Patch series "mm/memory_hotplug: use PageOffline() instead of >> PageReserved() for !ZONE_DEVICE". >> >> >> commit b873faaa609ab44c223b2327f55d2b6a2ba4ca9c >> Author: David Hildenbrand <david@redhat.com> >> Date: Fri Jun 7 11:09:37 2024 +0200 >> >> mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() >> instead of PageReserved() >> > > Let me try to understand this. > > You also tries to get rid of PG_reserved but you want PG_offline instead, > because this benefit virtio-mem, right? We now make proper use of PG_offline. All hotplugged pages start out PG_offline once we turn the section online. Only the ones that actually get exposed to the buddy -- actually get onlined -- get PG_offline cleared. A side effect of that is less hacks for virtio-mem, and more natural handling for the other ballooning drivers that hotplug memory. In the future, I'm planning on moving more fake-offlining code from virtio-mem the core, making use of more PG_offline in memory. For now, it's stops the PG_reserved use while maintaining the same semantics as before: the page content and "struct page" is not to be touched by anybody except the "owner". > > But I don't get why PG_offline is wrong for ZONE_DEVICE. I may miss some > knowledge for it. I suggest you take a look at the PG_offline documentation. ZONE_DEVICE are certainly not logically offline pages. They will never be considered online as part of online sections. But they will never be handed to the buddy. Maybe we want a dedicate page type for them in the future, not sure. We can right now identify them reliably using the zone idx. Using a page type right now is very likely not possible, because we might be using the page->_mapcount in rmap code when mapping some of them to user space. If we want to get rid of the PG_reserved for them right now, we'll have to make sure all existing PageReserved checks won't be degraded. For example, drivers/vfio/vfio_iommu_type1.c might need some work (no sure). The KVM one in kvm_pfn_to_refcounted_page() should already be fine, because they really want to refcount them. A lot of other ones like can_gather_numa_stats(), already refuse is_zone_device_page() manually, and maybe we want to factor both checks out into a separate function like "is_special_reserved_page()" or sth like that.
On Sat, Jun 29, 2024 at 04:38:47PM +0200, David Hildenbrand wrote: >On 29.06.24 10:32, Wei Yang wrote: >> On Sat, Jun 29, 2024 at 08:19:49AM +0200, David Hildenbrand wrote: >> > On 29.06.24 03:33, Wei Yang wrote: >> > > Initialize all pages reserved is an ancient behavior. >> > > >> > > Since commit 92923ca3aace ("mm: meminit: only set page reserved in the >> > > memblock region"), SetPageReserved is removed from >> > > __init_single_page(). Only those reserved pages are marked PG_reserved. >> > > >> > > But we still set PG_reserved on offline and check it on online. >> > > >> > > Following two commits removed both of them: >> > > >> > > * Commit 0ee5f4f31d36 ("mm/page_alloc.c: don't set pages PageReserved() >> > > when offlining") removed the set on offline. >> > > * Commit 5ecae6359e3a ("mm/memory_hotplug: drop PageReserved() check in >> > > online_pages_range()") removed the check on online. >> > > >> > > This means we set PG_reserved for hot-plugged memory at initialization >> > > is not helpful and a little different from bootmem initialization path. >> > > Now we can remove it. >> > >> > It's not that easy for ZONE_DEVICE. >> > >> > Also, see mm/mm-stable >> > >> > commit 3dadec1babf9eee0c67c967df931d6f0cb124a04 >> > Author: David Hildenbrand <david@redhat.com> >> > Date: Fri Jun 7 11:09:36 2024 +0200 >> > >> > mm: pass meminit_context to __free_pages_core() >> > >> > Patch series "mm/memory_hotplug: use PageOffline() instead of >> > PageReserved() for !ZONE_DEVICE". >> > >> > >> > commit b873faaa609ab44c223b2327f55d2b6a2ba4ca9c >> > Author: David Hildenbrand <david@redhat.com> >> > Date: Fri Jun 7 11:09:37 2024 +0200 >> > >> > mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() >> > instead of PageReserved() >> > >> >> Let me try to understand this. >> >> You also tries to get rid of PG_reserved but you want PG_offline instead, >> because this benefit virtio-mem, right? > >We now make proper use of PG_offline. All hotplugged pages start out >PG_offline once we turn the section online. Only the ones that actually get >exposed to the buddy -- actually get onlined -- get PG_offline cleared. A >side effect of that is less hacks for virtio-mem, and more natural handling >for the other ballooning drivers that hotplug memory. > >In the future, I'm planning on moving more fake-offlining code from >virtio-mem the core, making use of more PG_offline in memory. > >For now, it's stops the PG_reserved use while maintaining the same semantics >as before: the page content and "struct page" is not to be touched by anybody >except the "owner". > >> >> But I don't get why PG_offline is wrong for ZONE_DEVICE. I may miss some >> knowledge for it. > >I suggest you take a look at the PG_offline documentation. ZONE_DEVICE are >certainly not logically offline pages. They will never be considered online >as part of online sections. But they will never be handed to the buddy. > >Maybe we want a dedicate page type for them in the future, not sure. We can >right now identify them reliably using the zone idx. > >Using a page type right now is very likely not possible, because we might be >using the page->_mapcount in rmap code when mapping some of them to user >space. > >If we want to get rid of the PG_reserved for them right now, we'll have to >make sure all existing PageReserved checks won't be degraded. For >example, drivers/vfio/vfio_iommu_type1.c might need some work (no sure). > >The KVM one in kvm_pfn_to_refcounted_page() should already be fine, because >they really want to refcount them. > >A lot of other ones like can_gather_numa_stats(), already refuse >is_zone_device_page() manually, and maybe we want to factor both checks out >into a separate function like "is_special_reserved_page()" or sth like that. > Thanks for the information, I need sometime to digest it :-) >-- >Cheers, > >David / dhildenb
diff --git a/mm/mm_init.c b/mm/mm_init.c index 3ec04933f7fd..362ac4334b99 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -839,9 +839,8 @@ static void __init init_unavailable_range(unsigned long spfn, } /* - * Initially all pages are reserved - free ones are freed - * up by memblock_free_all() once the early boot process is - * done. Non-atomic initialization, single-pass. + * Free ones are freed up by memblock_free_all() once the early boot process + * is done. Non-atomic initialization, single-pass. * * All aligned pageblocks are initialized to the specified migratetype * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related @@ -892,8 +891,6 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone page = pfn_to_page(pfn); __init_single_page(page, pfn, zone, nid); - if (context == MEMINIT_HOTPLUG) - __SetPageReserved(page); /* * Usually, we want to mark the pageblock MIGRATE_MOVABLE,
Initialize all pages reserved is an ancient behavior. Since commit 92923ca3aace ("mm: meminit: only set page reserved in the memblock region"), SetPageReserved is removed from __init_single_page(). Only those reserved pages are marked PG_reserved. But we still set PG_reserved on offline and check it on online. Following two commits removed both of them: * Commit 0ee5f4f31d36 ("mm/page_alloc.c: don't set pages PageReserved() when offlining") removed the set on offline. * Commit 5ecae6359e3a ("mm/memory_hotplug: drop PageReserved() check in online_pages_range()") removed the check on online. This means we set PG_reserved for hot-plugged memory at initialization is not helpful and a little different from bootmem initialization path. Now we can remove it. Memory hot-add and hot-remove have been tested. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Nathan Zimmer <nzimmer@sgi.com> CC: David Hildenbrand <david@redhat.com> CC: Mike Rapoport (IBM) <rppt@kernel.org> --- mm/mm_init.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)