Message ID | 20201130151838.11208-6-songmuchun@bytedance.com (mailing list archive) |
---|---|
State | New, archived |
Series | Free some vmemmap pages of hugetlb page |
On 30.11.20 16:18, Muchun Song wrote:
> In a later patch, we can use free_vmemmap_page() to free the
> unused vmemmap pages and prepare_vmemmap_page() to initialize a
> page for use as a vmemmap page.
>
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  include/linux/bootmem_info.h | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
>
> diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
> index 4ed6dee1adc9..239e3cc8f86c 100644
> --- a/include/linux/bootmem_info.h
> +++ b/include/linux/bootmem_info.h
> @@ -3,6 +3,7 @@
>  #define __LINUX_BOOTMEM_INFO_H
>
>  #include <linux/mmzone.h>
> +#include <linux/mm.h>
>
>  /*
>   * Types for free bootmem stored in page->lru.next. These have to be in
> @@ -22,6 +23,29 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
>  void get_page_bootmem(unsigned long info, struct page *page,
>  		      unsigned long type);
>  void put_page_bootmem(struct page *page);
> +
> +static inline void free_vmemmap_page(struct page *page)
> +{
> +	VM_WARN_ON(!PageReserved(page) || page_ref_count(page) != 2);
> +
> +	/* bootmem page has reserved flag in the reserve_bootmem_region */
> +	if (PageReserved(page)) {
> +		unsigned long magic = (unsigned long)page->freelist;
> +
> +		if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
> +			put_page_bootmem(page);
> +		else
> +			WARN_ON(1);
> +	}
> +}
> +
> +static inline void prepare_vmemmap_page(struct page *page)
> +{
> +	unsigned long section_nr = pfn_to_section_nr(page_to_pfn(page));
> +
> +	get_page_bootmem(section_nr, page, SECTION_INFO);
> +	mark_page_reserved(page);
> +}

Can you clarify in the description when exactly these functions are
called and on which type of pages?

Would indicating "bootmem" in the function names make it clearer what we
are dealing with?

E.g., any memory allocated via the memblock allocator and not via the
buddy will be marked reserved already in the memmap. It's unclear to me
why we need the mark_page_reserved() here - can you enlighten me? :)

On Mon, Dec 7, 2020 at 8:39 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 30.11.20 16:18, Muchun Song wrote:
> > [...]
>
> Can you clarify in the description when exactly these functions are
> called and on which type of pages?

Will do.

>
> Would indicating "bootmem" in the function names make it clearer what we
> are dealing with?
>
> E.g., any memory allocated via the memblock allocator and not via the
> buddy will be marked reserved already in the memmap. It's unclear to me
> why we need the mark_page_reserved() here - can you enlighten me? :)

Thank you very much for your suggestions.

>
> --
> Thanks,
>
> David / dhildenb
>

On Mon, Dec 7, 2020 at 8:39 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 30.11.20 16:18, Muchun Song wrote:
> > [...]
>
> Can you clarify in the description when exactly these functions are
> called and on which type of pages?
>
> Would indicating "bootmem" in the function names make it clearer what we
> are dealing with?
>
> E.g., any memory allocated via the memblock allocator and not via the
> buddy will be marked reserved already in the memmap. It's unclear to me
> why we need the mark_page_reserved() here - can you enlighten me? :)

Sorry for overlooking this question. The vmemmap pages are allocated
from the bootmem allocator, so they are marked PG_reserved. To free
those bootmem pages we should call put_page_bootmem(), and you can see
that put_page_bootmem() clears PG_reserved. To be consistent,
prepare_vmemmap_page() also marks the page PG_reserved.

Thanks.

>
> --
> Thanks,
>
> David / dhildenb
>

On 09.12.20 08:36, Muchun Song wrote:
> On Mon, Dec 7, 2020 at 8:39 PM David Hildenbrand <david@redhat.com> wrote:
>> [...]
>> E.g., any memory allocated via the memblock allocator and not via the
>> buddy will be marked reserved already in the memmap. It's unclear to me
>> why we need the mark_page_reserved() here - can you enlighten me? :)
>
> Sorry for overlooking this question. The vmemmap pages are allocated
> from the bootmem allocator, so they are marked PG_reserved. To free
> those bootmem pages we should call put_page_bootmem(), and you can see
> that put_page_bootmem() clears PG_reserved. To be consistent,
> prepare_vmemmap_page() also marks the page PG_reserved.

I don't think that really makes sense.

After put_page_bootmem() drops the last reference, it clears PG_reserved
and hands the page over to the buddy via free_reserved_page(). From that
point on, further get_page_bootmem() would be completely wrong and
dangerous.

Both put_page_bootmem() and get_page_bootmem() rely on the fact that
they are dealing with memblock allocations - marked via PG_reserved. If
prepare_vmemmap_page() would be called on something that's *not* coming
from the memblock allocator, it would be completely broken - or am I
missing something?

AFAICT, there should rather be a BUG_ON(!PageReserved(page)) in
prepare_vmemmap_page() - or proper handling to deal with !memblock
allocations.

And as I said, indicating "bootmem" as part of the function names might
make it clearer that this is not for getting any vmemmap pages (esp.
allocated when hotplugging memory).

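The lifecycle David describes can be illustrated with a small user-space
model. This is only a sketch, not kernel code: struct mock_page and the
mock_* helpers are hypothetical stand-ins, and the model assumes that a
registered bootmem page holds a count of 2 (as the VM_WARN_ON in the patch
suggests) and that the put which brings the count back to 1 releases it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct mock_page {
	bool reserved;	/* models PG_reserved */
	int refcount;	/* models page_ref_count() */
	bool in_buddy;	/* true once handed to the modelled buddy allocator */
};

/* Models a page handed out by memblock at boot: reserved, one reference. */
static struct mock_page mock_memblock_page(void)
{
	struct mock_page page = {
		.reserved = true,
		.refcount = 1,
		.in_buddy = false,
	};

	return page;
}

/*
 * Models get_page_bootmem(). It is only meaningful on reserved pages, so
 * the model refuses anything else (returning false where the discussion
 * suggests a BUG_ON).
 */
static bool mock_get_page_bootmem(struct mock_page *page)
{
	if (!page->reserved)
		return false;

	page->refcount++;
	return true;
}

/*
 * Models put_page_bootmem(): the put that brings the count back to 1
 * clears the reserved flag and hands the page to the buddy.
 */
static bool mock_put_page_bootmem(struct mock_page *page)
{
	if (!page->reserved || page->refcount < 2)
		return false;

	if (--page->refcount == 1) {
		page->reserved = false;
		page->in_buddy = true;
	}
	return true;
}
```

In this model, once mock_put_page_bootmem() has released the page, a
further mock_get_page_bootmem() is rejected, which is the situation the
reply calls wrong and dangerous for the real helpers.
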
On Wed, Dec 9, 2020 at 4:50 PM David Hildenbrand <david@redhat.com> wrote:
> [...]
> I don't think that really makes sense.
>
> After put_page_bootmem() drops the last reference, it clears PG_reserved
> and hands the page over to the buddy via free_reserved_page(). From that
> point on, further get_page_bootmem() would be completely wrong and
> dangerous.
>
> Both put_page_bootmem() and get_page_bootmem() rely on the fact that
> they are dealing with memblock allocations - marked via PG_reserved. If
> prepare_vmemmap_page() would be called on something that's *not* coming
> from the memblock allocator, it would be completely broken - or am I
> missing something?
>
> AFAICT, there should rather be a BUG_ON(!PageReserved(page)) in
> prepare_vmemmap_page() - or proper handling to deal with !memblock
> allocations.
>

I want to allocate some pages as vmemmap when we free a HugeTLB page to
the buddy allocator. So I use prepare_vmemmap_page() to initialize the
page (which is allocated from the buddy allocator) and make it the
vmemmap of the freed HugeTLB page.

Any suggestions to deal with this case?

I have a solution to address this. When pages are allocated from the
buddy as vmemmap pages, we do not call prepare_vmemmap_page().

When we free some vmemmap pages of a HugeTLB page, if PG_reserved is
set on the vmemmap page, we call free_vmemmap_page() to free it to the
buddy; otherwise we call free_page(). What is your opinion?

>
> And as I said, indicating "bootmem" as part of the function names might
> make it clearer that this is not for getting any vmemmap pages (esp.
> allocated when hotplugging memory).

Agree. I am doing that for the next version.

>
> --
> Thanks,
>
> David / dhildenb
>

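The dispatch Muchun proposes, choosing the release routine by looking at
PG_reserved at free time, could be modelled roughly as follows. This is a
user-space sketch only: struct mock_page, enum release_path and
pick_release_path() are hypothetical names used for illustration and do
not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct mock_page {
	bool reserved;	/* models PG_reserved */
};

/* Which release routine the caller would pick (names are illustrative). */
enum release_path {
	RELEASE_BOOTMEM,	/* would call free_vmemmap_page() */
	RELEASE_BUDDY,		/* would call free_page() */
};

/*
 * Pages that came from the boot-time allocator still carry the reserved
 * flag; pages obtained from the buddy never had it set because
 * prepare_vmemmap_page() is skipped for them in this proposal.
 */
static enum release_path pick_release_path(const struct mock_page *page)
{
	return page->reserved ? RELEASE_BOOTMEM : RELEASE_BUDDY;
}
```

With this split the bootmem helpers only ever see memblock pages, which is
why the "bootmem" naming discussed in the thread fits this variant.
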
On 09.12.20 10:25, Muchun Song wrote:
> On Wed, Dec 9, 2020 at 4:50 PM David Hildenbrand <david@redhat.com> wrote:
>> [...]
>> AFAICT, there should rather be a BUG_ON(!PageReserved(page)) in
>> prepare_vmemmap_page() - or proper handling to deal with !memblock
>> allocations.
>>
>
> I want to allocate some pages as vmemmap when we free a HugeTLB page to
> the buddy allocator. So I use prepare_vmemmap_page() to initialize the
> page (which is allocated from the buddy allocator) and make it the
> vmemmap of the freed HugeTLB page.
>
> Any suggestions to deal with this case?

If you obtained pages via the buddy, there shouldn't be anything special
to handle, no? What speaks against

prepare_vmemmap_page():
if (!PageReserved(page))
	return;

put_page_bootmem():
if (!PageReserved(page))
	__free_page(page);

Or if we care about multiple references, get_page() and put_page().

>
> I have a solution to address this. When pages are allocated from the
> buddy as vmemmap pages, we do not call prepare_vmemmap_page().
>
> When we free some vmemmap pages of a HugeTLB page, if PG_reserved is
> set on the vmemmap page, we call free_vmemmap_page() to free it to the
> buddy; otherwise we call free_page(). What is your opinion?

That would also work. Then, please include "bootmem" as part of the
function name. If you plan on using my suggestion, you can drop
"bootmem" from the name as it works for both types of pages.

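David's two guards can be tried out in a user-space model like the one
below. It is a sketch under stated assumptions rather than the kernel
implementation: struct mock_page and both mock_* functions are
hypothetical, and the bootmem branch assumes the page is released when its
count drops back to 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct mock_page {
	bool reserved;	/* models PG_reserved */
	int refcount;	/* models page_ref_count() */
	bool freed;	/* set once the page is back in the modelled buddy */
};

/* Guarded prepare step: buddy pages need no bootmem bookkeeping. */
static void mock_prepare_vmemmap_page(struct mock_page *page)
{
	if (!page->reserved)
		return;

	page->refcount++;	/* stands in for get_page_bootmem() */
}

/* Guarded release step that accepts both kinds of pages. */
static void mock_put_page_bootmem(struct mock_page *page)
{
	if (!page->reserved) {
		/* stands in for __free_page() on a buddy page */
		page->refcount = 0;
		page->freed = true;
		return;
	}

	/* stands in for the bootmem release path */
	if (--page->refcount == 1) {
		page->reserved = false;
		page->freed = true;
	}
}
```

Because both helpers check the flag themselves, callers do not need to
know where a page came from, which matches the remark that "bootmem" could
then be dropped from the function names.
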
On Wed, Dec 9, 2020 at 5:33 PM David Hildenbrand <david@redhat.com> wrote:
> [...]
> If you obtained pages via the buddy, there shouldn't be anything special
> to handle, no? What speaks against
>
> prepare_vmemmap_page():
> if (!PageReserved(page))
> 	return;
>
> put_page_bootmem():
> if (!PageReserved(page))
> 	__free_page(page);
>

Thanks.

>
> Or if we care about multiple references, get_page() and put_page().
>
> [...]
>
> That would also work. Then, please include "bootmem" as part of the
> function name. If you plan on using my suggestion, you can drop
> "bootmem" from the name as it works for both types of pages.
>

Agree. Thanks.

>
> --
> Thanks,
>
> David / dhildenb
>

diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
index 4ed6dee1adc9..239e3cc8f86c 100644
--- a/include/linux/bootmem_info.h
+++ b/include/linux/bootmem_info.h
@@ -3,6 +3,7 @@
 #define __LINUX_BOOTMEM_INFO_H
 
 #include <linux/mmzone.h>
+#include <linux/mm.h>
 
 /*
  * Types for free bootmem stored in page->lru.next. These have to be in
@@ -22,6 +23,29 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
 void get_page_bootmem(unsigned long info, struct page *page,
 		      unsigned long type);
 void put_page_bootmem(struct page *page);
+
+static inline void free_vmemmap_page(struct page *page)
+{
+	VM_WARN_ON(!PageReserved(page) || page_ref_count(page) != 2);
+
+	/* bootmem page has reserved flag in the reserve_bootmem_region */
+	if (PageReserved(page)) {
+		unsigned long magic = (unsigned long)page->freelist;
+
+		if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
+			put_page_bootmem(page);
+		else
+			WARN_ON(1);
+	}
+}
+
+static inline void prepare_vmemmap_page(struct page *page)
+{
+	unsigned long section_nr = pfn_to_section_nr(page_to_pfn(page));
+
+	get_page_bootmem(section_nr, page, SECTION_INFO);
+	mark_page_reserved(page);
+}
 #else
 static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 {
In a later patch, we can use free_vmemmap_page() to free the
unused vmemmap pages and prepare_vmemmap_page() to initialize a
page for use as a vmemmap page.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/bootmem_info.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)