
[resend] mm: hugetlb_vmemmap: use bulk allocator in alloc_vmemmap_page_list()

Message ID 20230905103508.2996474-1-wangkefeng.wang@huawei.com (mailing list archive)
State New
Series [resend] mm: hugetlb_vmemmap: use bulk allocator in alloc_vmemmap_page_list()

Commit Message

Kefeng Wang Sept. 5, 2023, 10:35 a.m. UTC
alloc_vmemmap_page_list() needs to allocate 4095 pages (1G) or 7 pages (2M)
in one go, so add a bulk allocator variant, alloc_pages_bulk_list_node(),
and switch alloc_vmemmap_page_list() to use it to accelerate page
allocation.

A simple test on arm64 qemu with 1G Hugetlb shows 870,842ns vs 3,845,252ns;
even with some fluctuation, this is still a nice improvement.

Tested-by: Yuan Can <yuancan@huawei.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
resend: fix the spelling of "allocated" and decrease nr_pages in the fallback logic

 include/linux/gfp.h  | 9 +++++++++
 mm/hugetlb_vmemmap.c | 6 ++++++
 2 files changed, 15 insertions(+)
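
(For reference, timing numbers like the ones quoted above can be collected with a simple probe around the allocation path; the ktime-based wrapper below is only an assumed sketch, not part of this series.)

/* Hypothetical measurement helper; it would sit next to
 * alloc_vmemmap_page_list() in mm/hugetlb_vmemmap.c. */
#include <linux/ktime.h>
#include <linux/printk.h>

static int timed_alloc_vmemmap_page_list(unsigned long start, unsigned long end,
					 struct list_head *list)
{
	u64 t0 = ktime_get_ns();
	int ret = alloc_vmemmap_page_list(start, end, list);

	pr_info("alloc_vmemmap_page_list: %llu ns\n", ktime_get_ns() - t0);
	return ret;
}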

Comments

Muchun Song Sept. 6, 2023, 2:41 a.m. UTC | #1
> On Sep 5, 2023, at 18:35, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> to use it to accelerate page allocation.
> 
> Simple test on arm64's qemu with 1G Hugetlb, 870,842ns vs 3,845,252ns,
> even if there is a certain fluctuation, it is still a nice improvement.
> 
> Tested-by: Yuan Can <yuancan@huawei.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Muchun Song <songmuchun@bytedance.com>

Thanks.
Matthew Wilcox Sept. 6, 2023, 2:47 a.m. UTC | #2
On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> to use it to accelerate page allocation.

Argh, no, please don't do this.

Iterating a linked list is _expensive_.  It is about 10x quicker to
iterate an array than a linked list.  Adding the list_head option
to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.

These pages are going into an array anyway.  Don't put them on a list
first.
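
(For illustration, a minimal sketch of the two consumption patterns being contrasted here, using the bulk allocator helpers as they exist at this point; the page count and nid are placeholders.)

#include <linux/gfp.h>
#include <linux/mm.h>

static void bulk_alloc_pattern_sketch(int nid)
{
	struct page *pages[8] = { NULL };
	struct page *page, *next;
	LIST_HEAD(list);
	unsigned long i, filled;

	/* Array form: one contiguous pointer array, sequential iteration. */
	filled = alloc_pages_bulk_array_node(GFP_KERNEL, nid, 8, pages);
	for (i = 0; i < filled; i++)
		__free_page(pages[i]);

	/* List form: every iteration chases page->lru pointers. */
	filled = alloc_pages_bulk_list(GFP_KERNEL, 8, &list);
	list_for_each_entry_safe(page, next, &list, lru)
		__free_page(page);
}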
Kefeng Wang Sept. 6, 2023, 3:13 a.m. UTC | #3
On 2023/9/6 10:47, Matthew Wilcox wrote:
> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>> to use it to accelerate page allocation.
> 
> Argh, no, please don't do this.
> 
> Iterating a linked list is _expensive_.  It is about 10x quicker to
> iterate an array than a linked list.  Adding the list_head option
> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
> 
> These pages are going into an array anyway.  Don't put them on a list
> first.

struct vmemmap_remap_walk - walk vmemmap page table

  * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
  *                  or is mapped from.

At present, struct vmemmap_remap_walk uses a list for the vmemmap page
table walk, so do you mean we need to change vmemmap_pages from a list to
an array first, then use the array bulk API, and even kill the list bulk
API?
Matthew Wilcox Sept. 6, 2023, 3:14 a.m. UTC | #4
On Wed, Sep 06, 2023 at 11:13:27AM +0800, Kefeng Wang wrote:
> On 2023/9/6 10:47, Matthew Wilcox wrote:
> > On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
> > > It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> > > alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> > > alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> > > to use it to accelerate page allocation.
> > 
> > Argh, no, please don't do this.
> > 
> > Iterating a linked list is _expensive_.  It is about 10x quicker to
> > iterate an array than a linked list.  Adding the list_head option
> > to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
> > 
> > These pages are going into an array anyway.  Don't put them on a list
> > first.
> 
> struct vmemmap_remap_walk - walk vmemmap page table
> 
>  * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
>  *                  or is mapped from.
> 
> At present, the struct vmemmap_remap_walk use a list for vmemmap page table
> walk, so do you mean we need change vmemmap_pages from a list to a array
> firstly and then use array bulk api, even kill list bulk api ?

That would be better, yes.
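
(A rough sketch of that direction, i.e. carrying the vmemmap pages in an array inside the walk state instead of on a list; apart from vmemmap_pages, the fields and names here are illustrative only, not a proposed patch.)

/* Illustrative only: vmemmap_remap_walk carrying an array of pages. */
struct vmemmap_remap_walk {
	void		(*remap_pte)(pte_t *pte, unsigned long addr,
				     struct vmemmap_remap_walk *walk);
	unsigned long	nr_walked;
	struct page	*reuse_page;
	unsigned long	reuse_addr;
	struct page	**vmemmap_pages;	/* was: struct list_head vmemmap_pages */
	unsigned long	nr_pages;		/* entries in vmemmap_pages */
};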
Muchun Song Sept. 6, 2023, 3:25 a.m. UTC | #5
> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
> 
> 
> On 2023/9/6 10:47, Matthew Wilcox wrote:
>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>> to use it to accelerate page allocation.
>> Argh, no, please don't do this.
>> Iterating a linked list is _expensive_.  It is about 10x quicker to
>> iterate an array than a linked list.  Adding the list_head option
>> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
>> These pages are going into an array anyway.  Don't put them on a list
>> first.
> 
> struct vmemmap_remap_walk - walk vmemmap page table
> 
> * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
> *                  or is mapped from.
> 
> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?

It'll be a little complex for hugetlb_vmemmap. Would it be reasonable to
use __alloc_pages_bulk directly in hugetlb_vmemmap itself?
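
(A minimal sketch of that suggestion, assuming the current __alloc_pages_bulk() signature with both list and array parameters; illustrative only, not a posted patch.)

static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
				   struct list_head *list)
{
	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
	int nid = page_to_nid((struct page *)start);
	unsigned long allocated;
	struct page *page, *next;

	/* Bulk-fill the list in one call, then fall back page by page. */
	allocated = __alloc_pages_bulk(gfp_mask, nid, NULL, nr_pages, list, NULL);
	nr_pages -= allocated;

	while (nr_pages--) {
		page = alloc_pages_node(nid, gfp_mask, 0);
		if (!page)
			goto out;
		list_add_tail(&page->lru, list);
	}

	return 0;
out:
	list_for_each_entry_safe(page, next, list, lru)
		__free_page(page);
	return -ENOMEM;
}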
Kefeng Wang Sept. 6, 2023, 9:33 a.m. UTC | #6
On 2023/9/6 11:25, Muchun Song wrote:
> 
> 
>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>
>>
>>
>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>> to use it to accelerate page allocation.
>>> Argh, no, please don't do this.
>>> Iterating a linked list is _expensive_.  It is about 10x quicker to
>>> iterate an array than a linked list.  Adding the list_head option
>>> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
>>> These pages are going into an array anyway.  Don't put them on a list
>>> first.
>>
>> struct vmemmap_remap_walk - walk vmemmap page table
>>
>> * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
>> *                  or is mapped from.
>>
>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
> 
> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?


We could use alloc_pages_bulk_array_node() here without introducing a new
alloc_pages_bulk_list_node(), and only focus on accelerating page
allocation for now.
Kefeng Wang Sept. 6, 2023, 12:32 p.m. UTC | #7
On 2023/9/6 11:14, Matthew Wilcox wrote:
> On Wed, Sep 06, 2023 at 11:13:27AM +0800, Kefeng Wang wrote:
>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>> to use it to accelerate page allocation.
>>>
>>> Argh, no, please don't do this.
>>>
>>> Iterating a linked list is _expensive_.  It is about 10x quicker to
>>> iterate an array than a linked list.  Adding the list_head option
>>> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
>>>
>>> These pages are going into an array anyway.  Don't put them on a list
>>> first.
>>
>> struct vmemmap_remap_walk - walk vmemmap page table
>>
>>   * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
>>   *                  or is mapped from.
>>
>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table
>> walk, so do you mean we need change vmemmap_pages from a list to a array
>> firstly and then use array bulk api, even kill list bulk api ?
> 
> That would be better, yes.
It may not be quick to convert vmemmap_remap_walk to use an array, so I
will use the page array bulk alloc API first and won't introduce a new
alloc_pages_bulk_list_node(), thanks.
Muchun Song Sept. 6, 2023, 2:32 p.m. UTC | #8
> On Sep 6, 2023, at 17:33, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
> 
> 
> On 2023/9/6 11:25, Muchun Song wrote:
>>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>> 
>>> 
>>> 
>>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>>> to use it to accelerate page allocation.
>>>> Argh, no, please don't do this.
>>>> Iterating a linked list is _expensive_.  It is about 10x quicker to
>>>> iterate an array than a linked list.  Adding the list_head option
>>>> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
>>>> These pages are going into an array anyway.  Don't put them on a list
>>>> first.
>>> 
>>> struct vmemmap_remap_walk - walk vmemmap page table
>>> 
>>> * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
>>> *                  or is mapped from.
>>> 
>>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
>> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
>> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?
> 
> 
> We could use alloc_pages_bulk_array_node() here without introduce a new
> alloc_pages_bulk_list_node(), only focus on accelerate page allocation
> for now.
> 

No. Using alloc_pages_bulk_array_node() will add more complexity (you need to
allocate an array first) for hugetlb_vmemmap, the path you are optimizing is
only a control path, and the optimization is at the millisecond level. So I
don't think there is great value in doing this.

Thanks.
Kefeng Wang Sept. 6, 2023, 2:58 p.m. UTC | #9
On 2023/9/6 22:32, Muchun Song wrote:
> 
> 
>> On Sep 6, 2023, at 17:33, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>
>>
>>
>> On 2023/9/6 11:25, Muchun Song wrote:
>>>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>>>> to use it to accelerate page allocation.
>>>>> Argh, no, please don't do this.
>>>>> Iterating a linked list is _expensive_.  It is about 10x quicker to
>>>>> iterate an array than a linked list.  Adding the list_head option
>>>>> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
>>>>> These pages are going into an array anyway.  Don't put them on a list
>>>>> first.
>>>>
>>>> struct vmemmap_remap_walk - walk vmemmap page table
>>>>
>>>> * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
>>>> *                  or is mapped from.
>>>>
>>>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
>>> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
>>> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?
>>
>>
>> We could use alloc_pages_bulk_array_node() here without introduce a new
>> alloc_pages_bulk_list_node(), only focus on accelerate page allocation
>> for now.
>>
> 
> No. Using alloc_pages_bulk_array_node() will add more complexity (you need to allocate
> an array fist) for hugetlb_vmemap and this path that you optimized is only a control
> path and this optimization is at the millisecond level. So I don't think it is a great
> value to do this.
> 
I tried it; yes, it is a little complex:

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 4b9734777f69..5f502e18f950 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -377,26 +377,53 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
  	return ret;
  }

+static int vmemmap_bulk_alloc_pages(gfp_t gfp, int nid, unsigned int nr_pages,
+				    struct page **pages)
+{
+	unsigned int last, allocated = 0;
+
+	do {
+		last = allocated;
+
+		allocated = alloc_pages_bulk_array_node(gfp, nid, nr_pages, pages);
+		if (allocated == last)
+			goto err;
+
+	} while (allocated < nr_pages);
+
+	return 0;
+err:
+	for (allocated = 0; allocated < nr_pages; allocated++) {
+		if (pages[allocated])
+			__free_page(pages[allocated]);
+	}
+
+	return -ENOMEM;
+}
+
  static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
  				   struct list_head *list)
  {
  	gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
  	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
  	int nid = page_to_nid((struct page *)start);
-	struct page *page, *next;
+	struct page **pages;
+	int ret = -ENOMEM;
+
+	pages = kzalloc(array_size(nr_pages, sizeof(struct page *)), gfp_mask);
+	if (!pages)
+		return ret;
+
+	ret = vmemmap_bulk_alloc_pages(gfp_mask, nid, nr_pages, pages);
+	if (ret)
+		goto out;

  	while (nr_pages--) {
-		page = alloc_pages_node(nid, gfp_mask, 0);
-		if (!page)
-			goto out;
-		list_add_tail(&page->lru, list);
+		list_add_tail(&pages[nr_pages]->lru, list);
  	}
-
-	return 0;
  out:
-	list_for_each_entry_safe(page, next, list, lru)
-		__free_page(page);
-	return -ENOMEM;
+	kfree(pages);
+	return ret;
  }

or just use __alloc_pages_bulk in it, but as Matthew said, we should avoid
list usage; the list API needs to be cleaned up and no one should use it.
Or make no change, since it is not a hot path :)

> Thanks.
>
Muchun Song Sept. 7, 2023, 6:35 a.m. UTC | #10
> On Sep 6, 2023, at 22:58, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> 
> 
> 
> On 2023/9/6 22:32, Muchun Song wrote:
>>> On Sep 6, 2023, at 17:33, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>> 
>>> 
>>> 
>>> On 2023/9/6 11:25, Muchun Song wrote:
>>>>> On Sep 6, 2023, at 11:13, Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 2023/9/6 10:47, Matthew Wilcox wrote:
>>>>>> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
>>>>>>> It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
>>>>>>> alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
>>>>>>> alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
>>>>>>> to use it to accelerate page allocation.
>>>>>> Argh, no, please don't do this.
>>>>>> Iterating a linked list is _expensive_.  It is about 10x quicker to
>>>>>> iterate an array than a linked list.  Adding the list_head option
>>>>>> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
>>>>>> These pages are going into an array anyway.  Don't put them on a list
>>>>>> first.
>>>>> 
>>>>> struct vmemmap_remap_walk - walk vmemmap page table
>>>>> 
>>>>> * @vmemmap_pages:  the list head of the vmemmap pages that can be freed
>>>>> *                  or is mapped from.
>>>>> 
>>>>> At present, the struct vmemmap_remap_walk use a list for vmemmap page table walk, so do you mean we need change vmemmap_pages from a list to a array firstly and then use array bulk api, even kill list bulk api ?
>>>> It'll be a little complex for hugetlb_vmemmap. Should it be reasonable to
>>>> directly use __alloc_pages_bulk in hugetlb_vmemmap itself?
>>> 
>>> 
>>> We could use alloc_pages_bulk_array_node() here without introduce a new
>>> alloc_pages_bulk_list_node(), only focus on accelerate page allocation
>>> for now.
>>> 
>> No. Using alloc_pages_bulk_array_node() will add more complexity (you need to allocate
>> an array fist) for hugetlb_vmemap and this path that you optimized is only a control
>> path and this optimization is at the millisecond level. So I don't think it is a great
>> value to do this.
> I tried it, yes, a little complex,
> 
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 4b9734777f69..5f502e18f950 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -377,26 +377,53 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
> return ret;
> }
> 
> +static int vmemmap_bulk_alloc_pages(gfp_t gfp, int nid, unsigned int nr_pages,
> +     struct page **pages)
> +{
> + unsigned int last, allocated = 0;
> +
> + do {
> + last = allocated;
> +
> + allocated = alloc_pages_bulk_array_node(gfp, nid, nr_pages, pages);
> + if (allocated == last)
> + goto err;
> +
> + } while (allocated < nr_pages)
> +
> + return 0;
> +err:
> + for (allocated = 0; allocated < nr_pages; allocated++) {
> + if (pages[allocated])
> + __free_page(pages[allocated]);
> + }
> +
> + return -ENOMEM;
> +}
> +
> static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
>    struct list_head *list)
> {
> gfp_t gfp_mask = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE;
> unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
> int nid = page_to_nid((struct page *)start);
> - struct page *page, *next;
> + struct page **pages;
> + int ret = -ENOMEM;
> +
> + pages = kzalloc(array_size(nr_pages, sizeof(struct page *)), gfp_mask);
> + if (!pages)
> + return ret;
> +
> + ret = vmemmap_bulk_alloc_pages(gfp_mask, nid, nr_pages, pages);
> + if (ret)
> + goto out;
> 
> while (nr_pages--) {
> - page = alloc_pages_node(nid, gfp_mask, 0);
> - if (!page)
> - goto out;
> - list_add_tail(&page->lru, list);
> + list_add_tail(&pages[nr_pages]->lru, list);
> }
> -
> - return 0;
> out:
> - list_for_each_entry_safe(page, next, list, lru)
> - __free_page(page);
> - return -ENOMEM;
> + kfree(pages);
> + return ret;
> }
> 
> or just use __alloc_pages_bulk in it, but as Matthew said, we should
> avoid list usage, list api need to be cleanup and no one should use it,
> or no change, since it is not a hot path :)
>> Thanks.

Let's keep it no change.

Thanks.
Mel Gorman Oct. 25, 2023, 9:32 a.m. UTC | #11
On Wed, Sep 06, 2023 at 03:47:42AM +0100, Matthew Wilcox wrote:
> On Tue, Sep 05, 2023 at 06:35:08PM +0800, Kefeng Wang wrote:
> > It is needed 4095 pages(1G) or 7 pages(2M) to be allocated once in
> > alloc_vmemmap_page_list(), so let's add a bulk allocator varietas
> > alloc_pages_bulk_list_node() and switch alloc_vmemmap_page_list()
> > to use it to accelerate page allocation.
> 
> Argh, no, please don't do this.
> 
> Iterating a linked list is _expensive_.  It is about 10x quicker to
> iterate an array than a linked list.  Adding the list_head option
> to __alloc_pages_bulk() was a colossal mistake.  Don't perpetuate it.
> 
> These pages are going into an array anyway.  Don't put them on a list
> first.
> 

I don't have access to my test infrastructure at the moment, so this isn't
even properly build tested, but I think it's time for something like:

--8<--
mm: page_alloc: Remove list interface for bulk page allocation

Commit 387ba26fb1cb ("mm/page_alloc: add a bulk page allocator") introduced
an API for bulk allocating pages stored on a list. Later an array interface
was introduced as it was faster for the initial use cases. For two years,
the array option has remained superior for each use case, and any attempt
to use the list interface ultimately ended up using arrays.  Remove the
list-based API for bulk page allocation and rename the array-based API.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/gfp.h   | 17 +++++------------
 mm/mempolicy.c        | 21 ++++++++++-----------
 mm/page_alloc.c       | 24 ++++++------------------
 mm/vmalloc.c          |  4 ++--
 net/core/page_pool.c  |  4 ++--
 net/sunrpc/svc.c      |  2 +-
 net/sunrpc/svc_xprt.c |  2 +-
 7 files changed, 27 insertions(+), 47 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 665f06675c83..d81eb255718a 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -181,33 +181,26 @@ struct folio *__folio_alloc(gfp_t gfp, unsigned int order, int preferred_nid,
 
 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 				nodemask_t *nodemask, int nr_pages,
-				struct list_head *page_list,
 				struct page **page_array);
 
-unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+unsigned long alloc_pages_bulk_mempolicy(gfp_t gfp,
 				unsigned long nr_pages,
 				struct page **page_array);
 
 /* Bulk allocate order-0 pages */
 static inline unsigned long
-alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
+alloc_pages_bulk(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
 {
-	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL);
+	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, page_array);
 }
 
 static inline unsigned long
-alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
-{
-	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, NULL, page_array);
-}
-
-static inline unsigned long
-alloc_pages_bulk_array_node(gfp_t gfp, int nid, unsigned long nr_pages, struct page **page_array)
+alloc_pages_bulk_node(gfp_t gfp, int nid, unsigned long nr_pages, struct page **page_array)
 {
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();
 
-	return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, NULL, page_array);
+	return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, page_array);
 }
 
 static inline void warn_if_node_offline(int this_node, gfp_t gfp_mask)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 29ebf1e7898c..2a763df3e1ee 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2315,7 +2315,7 @@ struct folio *folio_alloc(gfp_t gfp, unsigned order)
 }
 EXPORT_SYMBOL(folio_alloc);
 
-static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
+static unsigned long alloc_pages_bulk_interleave(gfp_t gfp,
 		struct mempolicy *pol, unsigned long nr_pages,
 		struct page **page_array)
 {
@@ -2334,13 +2334,12 @@ static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
 		if (delta) {
 			nr_allocated = __alloc_pages_bulk(gfp,
 					interleave_nodes(pol), NULL,
-					nr_pages_per_node + 1, NULL,
-					page_array);
+					nr_pages_per_node + 1, page_array);
 			delta--;
 		} else {
 			nr_allocated = __alloc_pages_bulk(gfp,
 					interleave_nodes(pol), NULL,
-					nr_pages_per_node, NULL, page_array);
+					nr_pages_per_node, page_array);
 		}
 
 		page_array += nr_allocated;
@@ -2350,7 +2349,7 @@ static unsigned long alloc_pages_bulk_array_interleave(gfp_t gfp,
 	return total_allocated;
 }
 
-static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
+static unsigned long alloc_pages_bulk_preferred_many(gfp_t gfp, int nid,
 		struct mempolicy *pol, unsigned long nr_pages,
 		struct page **page_array)
 {
@@ -2361,11 +2360,11 @@ static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
 	preferred_gfp &= ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
 
 	nr_allocated  = __alloc_pages_bulk(preferred_gfp, nid, &pol->nodes,
-					   nr_pages, NULL, page_array);
+					   nr_pages, page_array);
 
 	if (nr_allocated < nr_pages)
 		nr_allocated += __alloc_pages_bulk(gfp, numa_node_id(), NULL,
-				nr_pages - nr_allocated, NULL,
+				nr_pages - nr_allocated,
 				page_array + nr_allocated);
 	return nr_allocated;
 }
@@ -2376,7 +2375,7 @@ static unsigned long alloc_pages_bulk_array_preferred_many(gfp_t gfp, int nid,
  * It can accelerate memory allocation especially interleaving
  * allocate memory.
  */
-unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
+unsigned long alloc_pages_bulk_mempolicy(gfp_t gfp,
 		unsigned long nr_pages, struct page **page_array)
 {
 	struct mempolicy *pol = &default_policy;
@@ -2385,15 +2384,15 @@ unsigned long alloc_pages_bulk_array_mempolicy(gfp_t gfp,
 		pol = get_task_policy(current);
 
 	if (pol->mode == MPOL_INTERLEAVE)
-		return alloc_pages_bulk_array_interleave(gfp, pol,
+		return alloc_pages_bulk_interleave(gfp, pol,
 							 nr_pages, page_array);
 
 	if (pol->mode == MPOL_PREFERRED_MANY)
-		return alloc_pages_bulk_array_preferred_many(gfp,
+		return alloc_pages_bulk_preferred_many(gfp,
 				numa_node_id(), pol, nr_pages, page_array);
 
 	return __alloc_pages_bulk(gfp, policy_node(gfp, pol, numa_node_id()),
-				  policy_nodemask(gfp, pol), nr_pages, NULL,
+				  policy_nodemask(gfp, pol), nr_pages,
 				  page_array);
 }
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 85741403948f..8f035304fbf2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4221,23 +4221,20 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
  * @preferred_nid: The preferred NUMA node ID to allocate from
  * @nodemask: Set of nodes to allocate from, may be NULL
  * @nr_pages: The number of pages desired on the list or array
- * @page_list: Optional list to store the allocated pages
- * @page_array: Optional array to store the pages
+ * @page_array: Array to store the pages
  *
  * This is a batched version of the page allocator that attempts to
- * allocate nr_pages quickly. Pages are added to page_list if page_list
- * is not NULL, otherwise it is assumed that the page_array is valid.
+ * allocate nr_pages quickly with pointers stored in page_array.
  *
  * For lists, nr_pages is the number of pages that should be allocated.
  *
  * For arrays, only NULL elements are populated with pages and nr_pages
  * is the maximum number of pages that will be stored in the array.
  *
- * Returns the number of pages on the list or array.
+ * Returns the number of pages in the array.
  */
 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 			nodemask_t *nodemask, int nr_pages,
-			struct list_head *page_list,
 			struct page **page_array)
 {
 	struct page *page;
@@ -4351,11 +4348,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 		nr_account++;
 
 		prep_new_page(page, 0, gfp, 0);
-		if (page_list)
-			list_add(&page->lru, page_list);
-		else
-			page_array[nr_populated] = page;
-		nr_populated++;
+		page_array[nr_populated++] = page;
 	}
 
 	pcp_spin_unlock(pcp);
@@ -4372,13 +4365,8 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 
 failed:
 	page = __alloc_pages(gfp, 0, preferred_nid, nodemask);
-	if (page) {
-		if (page_list)
-			list_add(&page->lru, page_list);
-		else
-			page_array[nr_populated] = page;
-		nr_populated++;
-	}
+	if (page)
+		page_array[nr_populated++] = page;
 
 	goto out;
 }
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a3fedb3ee0db..46d49e04039e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3025,12 +3025,12 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 			 * but mempolicy wants to alloc memory by interleaving.
 			 */
 			if (IS_ENABLED(CONFIG_NUMA) && nid == NUMA_NO_NODE)
-				nr = alloc_pages_bulk_array_mempolicy(bulk_gfp,
+				nr = alloc_pages_bulk_mempolicy(bulk_gfp,
 							nr_pages_request,
 							pages + nr_allocated);
 
 			else
-				nr = alloc_pages_bulk_array_node(bulk_gfp, nid,
+				nr = alloc_pages_bulk_node(bulk_gfp, nid,
 							nr_pages_request,
 							pages + nr_allocated);
 
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 77cb75e63aca..9cff3e4350ee 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -426,10 +426,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
 	if (unlikely(pool->alloc.count > 0))
 		return pool->alloc.cache[--pool->alloc.count];
 
-	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk_array */
+	/* Mark empty alloc.cache slots "empty" for alloc_pages_bulk */
 	memset(&pool->alloc.cache, 0, sizeof(void *) * bulk);
 
-	nr_pages = alloc_pages_bulk_array_node(gfp, pool->p.nid, bulk,
+	nr_pages = alloc_pages_bulk_node(gfp, pool->p.nid, bulk,
 					       pool->alloc.cache);
 	if (unlikely(!nr_pages))
 		return NULL;
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index 812fda9d45dd..afce749a0871 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -613,7 +613,7 @@ svc_init_buffer(struct svc_rqst *rqstp, unsigned int size, int node)
 	if (pages > RPCSVC_MAXPAGES)
 		pages = RPCSVC_MAXPAGES;
 
-	ret = alloc_pages_bulk_array_node(GFP_KERNEL, node, pages,
+	ret = alloc_pages_bulk_node(GFP_KERNEL, node, pages,
 					  rqstp->rq_pages);
 	return ret == pages;
 }
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 4cfe9640df48..92c6eae6610c 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -666,7 +666,7 @@ static bool svc_alloc_arg(struct svc_rqst *rqstp)
 	}
 
 	for (filled = 0; filled < pages; filled = ret) {
-		ret = alloc_pages_bulk_array_node(GFP_KERNEL,
+		ret = alloc_pages_bulk_node(GFP_KERNEL,
 						  rqstp->rq_pool->sp_id,
 						  pages, rqstp->rq_pages);
 		if (ret > filled)
kernel test robot Oct. 25, 2023, 1:50 p.m. UTC | #12
Hi Mel,

kernel test robot noticed the following build errors:

[auto build test ERROR on trondmy-nfs/linux-next]
[also build test ERROR on linus/master v6.6-rc7]
[cannot apply to akpm-mm/mm-everything next-20231025]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mel-Gorman/Re-PATCH-resend-mm-hugetlb_vmemmap-use-bulk-allocator-in-alloc_vmemmap_page_list/20231025-173425
base:   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
patch link:    https://lore.kernel.org/r/20231025093254.xvomlctwhcuerzky%40techsingularity.net
patch subject: Re: [PATCH resend] mm: hugetlb_vmemmap: use bulk allocator in alloc_vmemmap_page_list()
config: loongarch-randconfig-002-20231025 (https://download.01.org/0day-ci/archive/20231025/202310252149.qzjGG49d-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231025/202310252149.qzjGG49d-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310252149.qzjGG49d-lkp@intel.com/

All errors (new ones prefixed by >>):

   fs/xfs/xfs_buf.c: In function 'xfs_buf_alloc_pages':
>> fs/xfs/xfs_buf.c:391:26: error: implicit declaration of function 'alloc_pages_bulk_array'; did you mean 'alloc_pages_bulk_node'? [-Werror=implicit-function-declaration]
     391 |                 filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count,
         |                          ^~~~~~~~~~~~~~~~~~~~~~
         |                          alloc_pages_bulk_node
   cc1: some warnings being treated as errors
--
   fs/btrfs/extent_io.c: In function 'btrfs_alloc_page_array':
>> fs/btrfs/extent_io.c:688:29: error: implicit declaration of function 'alloc_pages_bulk_array'; did you mean 'alloc_pages_bulk_node'? [-Werror=implicit-function-declaration]
     688 |                 allocated = alloc_pages_bulk_array(GFP_NOFS, nr_pages, page_array);
         |                             ^~~~~~~~~~~~~~~~~~~~~~
         |                             alloc_pages_bulk_node
   cc1: some warnings being treated as errors


vim +391 fs/xfs/xfs_buf.c

0e6e847ffe3743 fs/xfs/linux-2.6/xfs_buf.c Dave Chinner      2011-03-26  353  
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  354  static int
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  355  xfs_buf_alloc_pages(
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  356  	struct xfs_buf	*bp,
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  357  	xfs_buf_flags_t	flags)
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  358  {
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  359  	gfp_t		gfp_mask = __GFP_NOWARN;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  360  	long		filled = 0;
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  361  
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  362  	if (flags & XBF_READ_AHEAD)
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  363  		gfp_mask |= __GFP_NORETRY;
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  364  	else
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  365  		gfp_mask |= GFP_NOFS;
289ae7b48c2c4d fs/xfs/xfs_buf.c           Dave Chinner      2021-06-07  366  
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  367  	/* Make sure that we have a page list */
934d1076bb2c5b fs/xfs/xfs_buf.c           Christoph Hellwig 2021-06-07  368  	bp->b_page_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE);
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  369  	if (bp->b_page_count <= XB_PAGES) {
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  370  		bp->b_pages = bp->b_page_array;
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  371  	} else {
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  372  		bp->b_pages = kzalloc(sizeof(struct page *) * bp->b_page_count,
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  373  					gfp_mask);
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  374  		if (!bp->b_pages)
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  375  			return -ENOMEM;
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  376  	}
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  377  	bp->b_flags |= _XBF_PAGES;
02c5117386884e fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  378  
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  379  	/* Assure zeroed buffer for non-read cases. */
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  380  	if (!(flags & XBF_READ))
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  381  		gfp_mask |= __GFP_ZERO;
0a683794ace283 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  382  
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  383  	/*
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  384  	 * Bulk filling of pages can take multiple calls. Not filling the entire
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  385  	 * array is not an allocation failure, so don't back off if we get at
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  386  	 * least one extra page.
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  387  	 */
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  388  	for (;;) {
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  389  		long	last = filled;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  390  
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01 @391  		filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count,
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  392  						bp->b_pages);
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  393  		if (filled == bp->b_page_count) {
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  394  			XFS_STATS_INC(bp->b_mount, xb_page_found);
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  395  			break;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  396  		}
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  397  
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  398  		if (filled != last)
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  399  			continue;
c9fa563072e133 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  400  
ce8e922c0e79c8 fs/xfs/linux-2.6/xfs_buf.c Nathan Scott      2006-01-11  401  		if (flags & XBF_READ_AHEAD) {
e7d236a6fe5102 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  402  			xfs_buf_free_pages(bp);
e7d236a6fe5102 fs/xfs/xfs_buf.c           Dave Chinner      2021-06-01  403  			return -ENOMEM;
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  404  		}
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  405  
dbd329f1e44ed4 fs/xfs/xfs_buf.c           Christoph Hellwig 2019-06-28  406  		XFS_STATS_INC(bp->b_mount, xb_page_retries);
4034247a0d6ab2 fs/xfs/xfs_buf.c           NeilBrown         2022-01-14  407  		memalloc_retry_wait(gfp_mask);
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  408  	}
0e6e847ffe3743 fs/xfs/linux-2.6/xfs_buf.c Dave Chinner      2011-03-26  409  	return 0;
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  410  }
^1da177e4c3f41 fs/xfs/linux-2.6/xfs_buf.c Linus Torvalds    2005-04-16  411
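
(The failures above are just the remaining callers of the old alloc_pages_bulk_array() name; they would need the matching rename in the same series. A sketch of the xfs fixup, assumed from the annotated source above and not part of the posted diff; the btrfs caller would need the identical one-line rename.)

--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -388,8 +388,8 @@ xfs_buf_alloc_pages(
 	for (;;) {
 		long	last = filled;
 
-		filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count,
-						bp->b_pages);
+		filled = alloc_pages_bulk(gfp_mask, bp->b_page_count,
+					  bp->b_pages);
 		if (filled == bp->b_page_count) {
 			XFS_STATS_INC(bp->b_mount, xb_page_found);
 			break;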

Patch

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 665f06675c83..d6e82f15b61f 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -195,6 +195,15 @@  alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list)
 	return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL);
 }
 
+static inline unsigned long
+alloc_pages_bulk_list_node(gfp_t gfp, int nid, unsigned long nr_pages, struct list_head *list)
+{
+	if (nid == NUMA_NO_NODE)
+		nid = numa_mem_id();
+
+	return __alloc_pages_bulk(gfp, nid, NULL, nr_pages, list, NULL);
+}
+
 static inline unsigned long
 alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array)
 {
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 4b9734777f69..786e581703c7 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -384,7 +384,13 @@  static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
 	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
 	int nid = page_to_nid((struct page *)start);
 	struct page *page, *next;
+	unsigned long nr_allocated;
 
+	nr_allocated = alloc_pages_bulk_list_node(gfp_mask, nid, nr_pages, list);
+	if (!nr_allocated)
+		return -ENOMEM;
+
+	nr_pages -= nr_allocated;
 	while (nr_pages--) {
 		page = alloc_pages_node(nid, gfp_mask, 0);
 		if (!page)