From patchwork Mon Mar 22 09:18:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mel Gorman X-Patchwork-Id: 12154191 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E78AEC433E8 for ; Mon, 22 Mar 2021 09:20:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AB43F61973 for ; Mon, 22 Mar 2021 09:20:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229953AbhCVJTd (ORCPT ); Mon, 22 Mar 2021 05:19:33 -0400 Received: from outbound-smtp44.blacknight.com ([46.22.136.52]:36661 "EHLO outbound-smtp44.blacknight.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229893AbhCVJTD (ORCPT ); Mon, 22 Mar 2021 05:19:03 -0400 Received: from mail.blacknight.com (pemlinmail05.blacknight.ie [81.17.254.26]) by outbound-smtp44.blacknight.com (Postfix) with ESMTPS id 1F9F9F836B for ; Mon, 22 Mar 2021 09:18:46 +0000 (GMT) Received: (qmail 16115 invoked from network); 22 Mar 2021 09:18:45 -0000 Received: from unknown (HELO stampy.112glenside.lan) (mgorman@techsingularity.net@[84.203.22.4]) by 81.17.254.9 with ESMTPA; 22 Mar 2021 09:18:45 -0000 From: Mel Gorman To: Andrew Morton Cc: Vlastimil Babka , Chuck Lever , Jesper Dangaard Brouer , Christoph Hellwig , Alexander Duyck , Matthew Wilcox , LKML , Linux-Net , Linux-MM , Linux-NFS , Mel Gorman Subject: [PATCH 1/3] mm/page_alloc: Rename alloced to allocated Date: Mon, 22 Mar 2021 09:18:43 +0000 Message-Id: <20210322091845.16437-2-mgorman@techsingularity.net> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210322091845.16437-1-mgorman@techsingularity.net> References: <20210322091845.16437-1-mgorman@techsingularity.net> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Review feedback of the bulk allocator twice found problems with "alloced" being a counter for pages allocated. The naming was based on the API name "alloc" and was based on the idea that verbal communication about malloc tends to use the fake word "malloced" instead of the fake word mallocated. To be consistent, this preparation patch renames alloced to allocated in rmqueue_bulk so the bulk allocator and per-cpu allocator use similar names when the bulk allocator is introduced. Signed-off-by: Mel Gorman --- mm/page_alloc.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index dfa9af064f74..8a3e13277e22 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2908,7 +2908,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, unsigned long count, struct list_head *list, int migratetype, unsigned int alloc_flags) { - int i, alloced = 0; + int i, allocated = 0; spin_lock(&zone->lock); for (i = 0; i < count; ++i) { @@ -2931,7 +2931,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, * pages are ordered properly. */ list_add_tail(&page->lru, list); - alloced++; + allocated++; if (is_migrate_cma(get_pcppage_migratetype(page))) __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, -(1 << order)); @@ -2940,12 +2940,12 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order, /* * i pages were removed from the buddy list even if some leak due * to check_pcp_refill failing so adjust NR_FREE_PAGES based - * on i. Do not confuse with 'alloced' which is the number of + * on i. Do not confuse with 'allocated' which is the number of * pages added to the pcp list. */ __mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order)); spin_unlock(&zone->lock); - return alloced; + return allocated; } #ifdef CONFIG_NUMA From patchwork Mon Mar 22 09:18:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mel Gorman X-Patchwork-Id: 12154193 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18147C433EB for ; Mon, 22 Mar 2021 09:20:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E4BFA61972 for ; Mon, 22 Mar 2021 09:20:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230006AbhCVJTg (ORCPT ); Mon, 22 Mar 2021 05:19:36 -0400 Received: from outbound-smtp35.blacknight.com ([46.22.139.218]:36755 "EHLO outbound-smtp35.blacknight.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229890AbhCVJTF (ORCPT ); Mon, 22 Mar 2021 05:19:05 -0400 Received: from mail.blacknight.com (pemlinmail05.blacknight.ie [81.17.254.26]) by outbound-smtp35.blacknight.com (Postfix) with ESMTPS id 78A17273D for ; Mon, 22 Mar 2021 09:18:46 +0000 (GMT) Received: (qmail 16137 invoked from network); 22 Mar 2021 09:18:46 -0000 Received: from unknown (HELO stampy.112glenside.lan) (mgorman@techsingularity.net@[84.203.22.4]) by 81.17.254.9 with ESMTPA; 22 Mar 2021 09:18:46 -0000 From: Mel Gorman To: Andrew Morton Cc: Vlastimil Babka , Chuck Lever , Jesper Dangaard Brouer , Christoph Hellwig , Alexander Duyck , Matthew Wilcox , LKML , Linux-Net , Linux-MM , Linux-NFS , Mel Gorman Subject: [PATCH 2/3] mm/page_alloc: Add a bulk page allocator Date: Mon, 22 Mar 2021 09:18:44 +0000 Message-Id: <20210322091845.16437-3-mgorman@techsingularity.net> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210322091845.16437-1-mgorman@techsingularity.net> References: <20210322091845.16437-1-mgorman@techsingularity.net> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This patch adds a new page allocator interface via alloc_pages_bulk, and __alloc_pages_bulk_nodemask. A caller requests a number of pages to be allocated and added to a list. The API is not guaranteed to return the requested number of pages and may fail if the preferred allocation zone has limited free memory, the cpuset changes during the allocation or page debugging decides to fail an allocation. It's up to the caller to request more pages in batch if necessary. Note that this implementation is not very efficient and could be improved but it would require refactoring. The intent is to make it available early to determine what semantics are required by different callers. Once the full semantics are nailed down, it can be refactored. Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka --- include/linux/gfp.h | 11 ++++ mm/page_alloc.c | 124 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 135 insertions(+) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 0a88f84b08f4..4a304fd39916 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -518,6 +518,17 @@ static inline int arch_make_page_accessible(struct page *page) struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid, nodemask_t *nodemask); +int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, + nodemask_t *nodemask, int nr_pages, + struct list_head *list); + +/* Bulk allocate order-0 pages */ +static inline unsigned long +alloc_pages_bulk(gfp_t gfp, unsigned long nr_pages, struct list_head *list) +{ + return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list); +} + /* * Allocate pages, preferring the node given as nid. The node must be valid and * online. For more general interface, see alloc_pages_node(). diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 8a3e13277e22..3f4d56854c74 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4965,6 +4965,130 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order, return true; } +/* + * __alloc_pages_bulk - Allocate a number of order-0 pages to a list + * @gfp: GFP flags for the allocation + * @preferred_nid: The preferred NUMA node ID to allocate from + * @nodemask: Set of nodes to allocate from, may be NULL + * @nr_pages: The number of pages requested + * @page_list: List to store the allocated pages, must be empty + * + * This is a batched version of the page allocator that attempts to + * allocate nr_pages quickly and add them to a list. The list must be + * empty to allow new pages to be prepped with IRQs enabled. + * + * Returns the number of pages allocated. + */ +int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, + nodemask_t *nodemask, int nr_pages, + struct list_head *page_list) +{ + struct page *page; + unsigned long flags; + struct zone *zone; + struct zoneref *z; + struct per_cpu_pages *pcp; + struct list_head *pcp_list; + struct alloc_context ac; + gfp_t alloc_gfp; + unsigned int alloc_flags; + int allocated = 0; + + if (WARN_ON_ONCE(nr_pages <= 0)) + return 0; + + if (WARN_ON_ONCE(!list_empty(page_list))) + return 0; + + if (nr_pages == 1) + goto failed; + + /* May set ALLOC_NOFRAGMENT, fragmentation will return 1 page. */ + gfp &= gfp_allowed_mask; + alloc_gfp = gfp; + if (!prepare_alloc_pages(gfp, 0, preferred_nid, nodemask, &ac, &alloc_gfp, &alloc_flags)) + return 0; + gfp = alloc_gfp; + + /* Find an allowed local zone that meets the high watermark. */ + for_each_zone_zonelist_nodemask(zone, z, ac.zonelist, ac.highest_zoneidx, ac.nodemask) { + unsigned long mark; + + if (cpusets_enabled() && (alloc_flags & ALLOC_CPUSET) && + !__cpuset_zone_allowed(zone, gfp)) { + continue; + } + + if (nr_online_nodes > 1 && zone != ac.preferred_zoneref->zone && + zone_to_nid(zone) != zone_to_nid(ac.preferred_zoneref->zone)) { + goto failed; + } + + mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK) + nr_pages; + if (zone_watermark_fast(zone, 0, mark, + zonelist_zone_idx(ac.preferred_zoneref), + alloc_flags, gfp)) { + break; + } + } + + /* + * If there are no allowed local zones that meets the watermarks then + * try to allocate a single page and reclaim if necessary. + */ + if (!zone) + goto failed; + + /* Attempt the batch allocation */ + local_irq_save(flags); + pcp = &this_cpu_ptr(zone->pageset)->pcp; + pcp_list = &pcp->lists[ac.migratetype]; + + while (allocated < nr_pages) { + page = __rmqueue_pcplist(zone, ac.migratetype, alloc_flags, + pcp, pcp_list); + if (!page) { + /* Try and get at least one page */ + if (!allocated) + goto failed_irq; + break; + } + + /* + * Ideally this would be batched but the best way to do + * that cheaply is to first convert zone_statistics to + * be inaccurate per-cpu counter like vm_events to avoid + * a RMW cycle then do the accounting with IRQs enabled. + */ + __count_zid_vm_events(PGALLOC, zone_idx(zone), 1); + zone_statistics(ac.preferred_zoneref->zone, zone); + + list_add(&page->lru, page_list); + allocated++; + } + + local_irq_restore(flags); + + /* Prep pages with IRQs enabled. */ + list_for_each_entry(page, page_list, lru) + prep_new_page(page, 0, gfp, 0); + + return allocated; + +failed_irq: + local_irq_restore(flags); + +failed: + page = __alloc_pages(gfp, 0, preferred_nid, nodemask); + if (page) { + list_add(&page->lru, page_list); + allocated = 1; + } + + return allocated; +} +EXPORT_SYMBOL_GPL(__alloc_pages_bulk); + /* * This is the 'heart' of the zoned buddy allocator. */ From patchwork Mon Mar 22 09:18:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mel Gorman X-Patchwork-Id: 12154195 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3CD37C433EC for ; Mon, 22 Mar 2021 09:20:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 124EC6197F for ; Mon, 22 Mar 2021 09:20:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230018AbhCVJTh (ORCPT ); Mon, 22 Mar 2021 05:19:37 -0400 Received: from outbound-smtp11.blacknight.com ([46.22.139.106]:52571 "EHLO outbound-smtp11.blacknight.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229913AbhCVJTE (ORCPT ); Mon, 22 Mar 2021 05:19:04 -0400 Received: from mail.blacknight.com (pemlinmail05.blacknight.ie [81.17.254.26]) by outbound-smtp11.blacknight.com (Postfix) with ESMTPS id 0AA991C5845 for ; Mon, 22 Mar 2021 09:18:47 +0000 (GMT) Received: (qmail 16194 invoked from network); 22 Mar 2021 09:18:46 -0000 Received: from unknown (HELO stampy.112glenside.lan) (mgorman@techsingularity.net@[84.203.22.4]) by 81.17.254.9 with ESMTPA; 22 Mar 2021 09:18:46 -0000 From: Mel Gorman To: Andrew Morton Cc: Vlastimil Babka , Chuck Lever , Jesper Dangaard Brouer , Christoph Hellwig , Alexander Duyck , Matthew Wilcox , LKML , Linux-Net , Linux-MM , Linux-NFS , Mel Gorman Subject: [PATCH 3/3] mm/page_alloc: Add an array-based interface to the bulk page allocator Date: Mon, 22 Mar 2021 09:18:45 +0000 Message-Id: <20210322091845.16437-4-mgorman@techsingularity.net> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210322091845.16437-1-mgorman@techsingularity.net> References: <20210322091845.16437-1-mgorman@techsingularity.net> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The proposed callers for the bulk allocator store pages from the bulk allocator in an array. This patch adds an array-based interface to the API to avoid multiple list iterations. The page list interface is preserved to avoid requiring all users of the bulk API to allocate and manage enough storage to store the pages. Signed-off-by: Mel Gorman --- include/linux/gfp.h | 13 ++++++-- mm/page_alloc.c | 75 ++++++++++++++++++++++++++++++++++----------- 2 files changed, 67 insertions(+), 21 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 4a304fd39916..fb6234e1fe59 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -520,13 +520,20 @@ struct page *__alloc_pages(gfp_t gfp, unsigned int order, int preferred_nid, int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, nodemask_t *nodemask, int nr_pages, - struct list_head *list); + struct list_head *page_list, + struct page **page_array); /* Bulk allocate order-0 pages */ static inline unsigned long -alloc_pages_bulk(gfp_t gfp, unsigned long nr_pages, struct list_head *list) +alloc_pages_bulk_list(gfp_t gfp, unsigned long nr_pages, struct list_head *list) { - return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list); + return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, list, NULL); +} + +static inline unsigned long +alloc_pages_bulk_array(gfp_t gfp, unsigned long nr_pages, struct page **page_array) +{ + return __alloc_pages_bulk(gfp, numa_mem_id(), NULL, nr_pages, NULL, page_array); } /* diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 3f4d56854c74..c83d38dfe936 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4966,22 +4966,31 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order, } /* - * __alloc_pages_bulk - Allocate a number of order-0 pages to a list + * __alloc_pages_bulk - Allocate a number of order-0 pages to a list or array * @gfp: GFP flags for the allocation * @preferred_nid: The preferred NUMA node ID to allocate from * @nodemask: Set of nodes to allocate from, may be NULL * @nr_pages: The number of pages requested - * @page_list: List to store the allocated pages, must be empty + * @page_list: Optional list to store the allocated pages + * @page_array: Optional array to store the pages * * This is a batched version of the page allocator that attempts to - * allocate nr_pages quickly and add them to a list. The list must be - * empty to allow new pages to be prepped with IRQs enabled. + * allocate nr_pages quickly. Pages are added to page_list if page_list + * is not NULL, otherwise it is assumed that the page_array is valid. * - * Returns the number of pages allocated. + * For lists, nr_pages is the number of pages that should be allocated. + * + * For arrays, only NULL elements are populated with pages and nr_pages + * is the maximum number of pages that will be stored in the array. Note + * that arrays with NULL holes in the middle may return prematurely. + * + * Returns the number of pages added to the page_list or the known + * number of populated elements in the page_array. */ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, nodemask_t *nodemask, int nr_pages, - struct list_head *page_list) + struct list_head *page_list, + struct page **page_array) { struct page *page; unsigned long flags; @@ -4992,14 +5001,23 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, struct alloc_context ac; gfp_t alloc_gfp; unsigned int alloc_flags; - int allocated = 0; + int nr_populated = 0, prep_index = 0; if (WARN_ON_ONCE(nr_pages <= 0)) return 0; - if (WARN_ON_ONCE(!list_empty(page_list))) + if (WARN_ON_ONCE(page_list && !list_empty(page_list))) return 0; + /* Skip populated array elements. */ + if (page_array) { + while (nr_populated < nr_pages && page_array[nr_populated]) + nr_populated++; + if (nr_populated == nr_pages) + return nr_populated; + prep_index = nr_populated; + } + if (nr_pages == 1) goto failed; @@ -5044,12 +5062,22 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, pcp = &this_cpu_ptr(zone->pageset)->pcp; pcp_list = &pcp->lists[ac.migratetype]; - while (allocated < nr_pages) { + while (nr_populated < nr_pages) { + /* + * Stop allocating if the next index has a populated + * page or the page will be prepared a second time when + * IRQs are enabled. + */ + if (page_array && page_array[nr_populated]) { + nr_populated++; + break; + } + page = __rmqueue_pcplist(zone, ac.migratetype, alloc_flags, pcp, pcp_list); if (!page) { /* Try and get at least one page */ - if (!allocated) + if (!nr_populated) goto failed_irq; break; } @@ -5063,17 +5091,25 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, __count_zid_vm_events(PGALLOC, zone_idx(zone), 1); zone_statistics(ac.preferred_zoneref->zone, zone); - list_add(&page->lru, page_list); - allocated++; + if (page_list) + list_add(&page->lru, page_list); + else + page_array[nr_populated] = page; + nr_populated++; } local_irq_restore(flags); /* Prep pages with IRQs enabled. */ - list_for_each_entry(page, page_list, lru) - prep_new_page(page, 0, gfp, 0); + if (page_list) { + list_for_each_entry(page, page_list, lru) + prep_new_page(page, 0, gfp, 0); + } else { + while (prep_index < nr_populated) + prep_new_page(page_array[prep_index++], 0, gfp, 0); + } - return allocated; + return nr_populated; failed_irq: local_irq_restore(flags); @@ -5081,11 +5117,14 @@ int __alloc_pages_bulk(gfp_t gfp, int preferred_nid, failed: page = __alloc_pages(gfp, 0, preferred_nid, nodemask); if (page) { - list_add(&page->lru, page_list); - allocated = 1; + if (page_list) + list_add(&page->lru, page_list); + else + page_array[nr_populated] = page; + nr_populated++; } - return allocated; + return nr_populated; } EXPORT_SYMBOL_GPL(__alloc_pages_bulk);