From patchwork Sun Dec 1 01:55:06 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton <akpm@linux-foundation.org>
X-Patchwork-Id: 11268371
Date: Sat, 30 Nov 2019 17:55:06 -0800
From: akpm@linux-foundation.org
To: aarcange@redhat.com, akpm@linux-foundation.org, anshuman.khandual@arm.com,
 dan.j.williams@intel.com, david@redhat.com, linux-mm@kvack.org,
 mgorman@techsingularity.net, mhocko@suse.com, mike.kravetz@oracle.com,
 mm-commits@vger.kernel.org, osalvador@suse.de, pavel.tatashin@microsoft.com,
 rientjes@google.com, rppt@linux.ibm.com, torvalds@linux-foundation.org,
 vbabka@suse.cz, willy@infradead.org
Subject: [patch 097/158] mm/page_alloc: add alloc_contig_pages()
Message-ID: <20191201015506.7d9vdT2Lu%akpm@linux-foundation.org>
User-Agent: s-nail v14.8.16

From: Anshuman Khandual <anshuman.khandual@arm.com>
Subject: mm/page_alloc: add alloc_contig_pages()

The HugeTLB helper alloc_gigantic_page() implements a fairly generic
allocation method: it scans over various zones looking for a large
contiguous pfn range before trying to allocate it with
alloc_contig_range().  Other than deriving the requested order from
'struct hstate', there is nothing HugeTLB specific in it.  This method can
be made available for general use to allocate contiguous memory which
could not be allocated through the buddy allocator.

Split alloc_gigantic_page(), carving out the actual allocation method,
and make it available via a new alloc_contig_pages() helper wrapped under
CONFIG_CONTIG_ALLOC.  All references to 'gigantic' are replaced with the
more generic term 'contig'.  Pages allocated this way should be freed with
free_contig_range() or by calling __free_page() on each allocated page.

Link: http://lkml.kernel.org/r/1571300646-32240-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/gfp.h |    2 
 mm/hugetlb.c        |   77 --------------------------------
 mm/page_alloc.c     |  101 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 105 insertions(+), 75 deletions(-)

--- a/include/linux/gfp.h~mm-page_alloc-add-alloc_contig_pages
+++ a/include/linux/gfp.h
@@ -612,6 +612,8 @@ static inline bool pm_suspended_storage(
 /* The below functions must be run on a range from a single zone. */
 extern int alloc_contig_range(unsigned long start, unsigned long end,
			       unsigned migratetype, gfp_t gfp_mask);
+extern struct page *alloc_contig_pages(unsigned long nr_pages, gfp_t gfp_mask,
+				       int nid, nodemask_t *nodemask);
 #endif
 void free_contig_range(unsigned long pfn, unsigned int nr_pages);
--- a/mm/hugetlb.c~mm-page_alloc-add-alloc_contig_pages
+++ a/mm/hugetlb.c
@@ -1069,85 +1069,12 @@ static void free_gigantic_page(struct pa
 }
 
 #ifdef CONFIG_CONTIG_ALLOC
-static int __alloc_gigantic_page(unsigned long start_pfn,
-				unsigned long nr_pages, gfp_t gfp_mask)
-{
-	unsigned long end_pfn = start_pfn + nr_pages;
-	return alloc_contig_range(start_pfn, end_pfn, MIGRATE_MOVABLE,
-				  gfp_mask);
-}
-
-static bool pfn_range_valid_gigantic(struct zone *z,
-			unsigned long start_pfn, unsigned long nr_pages)
-{
-	unsigned long i, end_pfn = start_pfn + nr_pages;
-	struct page *page;
-
-	for (i = start_pfn; i < end_pfn; i++) {
-		page = pfn_to_online_page(i);
-		if (!page)
-			return false;
-
-		if (page_zone(page) != z)
-			return false;
-
-		if (PageReserved(page))
-			return false;
-
-		if (page_count(page) > 0)
-			return false;
-
-		if (PageHuge(page))
-			return false;
-	}
-
-	return true;
-}
-
-static bool zone_spans_last_pfn(const struct zone *zone,
-			unsigned long start_pfn, unsigned long nr_pages)
-{
-	unsigned long last_pfn = start_pfn + nr_pages - 1;
-	return zone_spans_pfn(zone, last_pfn);
-}
-
 static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 		int nid, nodemask_t *nodemask)
 {
-	unsigned int order = huge_page_order(h);
-	unsigned long nr_pages = 1 << order;
-	unsigned long ret, pfn, flags;
-	struct zonelist *zonelist;
-	struct zone *zone;
-	struct zoneref *z;
-
-	zonelist = node_zonelist(nid, gfp_mask);
-	for_each_zone_zonelist_nodemask(zone, z, zonelist, gfp_zone(gfp_mask), nodemask) {
-		spin_lock_irqsave(&zone->lock, flags);
-
-		pfn = ALIGN(zone->zone_start_pfn, nr_pages);
-		while (zone_spans_last_pfn(zone, pfn, nr_pages)) {
-			if (pfn_range_valid_gigantic(zone, pfn, nr_pages)) {
-				/*
-				 * We release the zone lock here because
-				 * alloc_contig_range() will also lock the zone
-				 * at some point. If there's an allocation
-				 * spinning on this lock, it may win the race
-				 * and cause alloc_contig_range() to fail...
-				 */
-				spin_unlock_irqrestore(&zone->lock, flags);
-				ret = __alloc_gigantic_page(pfn, nr_pages, gfp_mask);
-				if (!ret)
-					return pfn_to_page(pfn);
-				spin_lock_irqsave(&zone->lock, flags);
-			}
-			pfn += nr_pages;
-		}
-
-		spin_unlock_irqrestore(&zone->lock, flags);
-	}
+	unsigned long nr_pages = 1UL << huge_page_order(h);
 
-	return NULL;
+	return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
 }
 
 static void prep_new_huge_page(struct hstate *h, struct page *page, int nid);
--- a/mm/page_alloc.c~mm-page_alloc-add-alloc_contig_pages
+++ a/mm/page_alloc.c
@@ -8502,6 +8502,107 @@ done:
 				pfn_max_align_up(end), migratetype);
 	return ret;
 }
+
+static int __alloc_contig_pages(unsigned long start_pfn,
+				unsigned long nr_pages, gfp_t gfp_mask)
+{
+	unsigned long end_pfn = start_pfn + nr_pages;
+
+	return alloc_contig_range(start_pfn, end_pfn, MIGRATE_MOVABLE,
+				  gfp_mask);
+}
+
+static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
+				   unsigned long nr_pages)
+{
+	unsigned long i, end_pfn = start_pfn + nr_pages;
+	struct page *page;
+
+	for (i = start_pfn; i < end_pfn; i++) {
+		page = pfn_to_online_page(i);
+		if (!page)
+			return false;
+
+		if (page_zone(page) != z)
+			return false;
+
+		if (PageReserved(page))
+			return false;
+
+		if (page_count(page) > 0)
+			return false;
+
+		if (PageHuge(page))
+			return false;
+	}
+	return true;
+}
+
+static bool zone_spans_last_pfn(const struct zone *zone,
+				unsigned long start_pfn, unsigned long nr_pages)
+{
+	unsigned long last_pfn = start_pfn + nr_pages - 1;
+
+	return zone_spans_pfn(zone, last_pfn);
+}
+
+/**
+ * alloc_contig_pages() -- tries to find and allocate contiguous range of pages
+ * @nr_pages:	Number of contiguous pages to allocate
+ * @gfp_mask:	GFP mask to limit search and used during compaction
+ * @nid:	Target node
+ * @nodemask:	Mask for other possible nodes
+ *
+ * This routine is a wrapper around alloc_contig_range(). It scans over zones
+ * on an applicable zonelist to find a contiguous pfn range which can then be
+ * tried for allocation with alloc_contig_range(). This routine is intended
+ * for allocation requests which can not be fulfilled with the buddy allocator.
+ *
+ * The allocated memory is always aligned to a page boundary. If nr_pages is a
+ * power of two then the alignment is guaranteed to be to the given nr_pages
+ * (e.g. 1GB request would be aligned to 1GB).
+ *
+ * Allocated pages can be freed with free_contig_range() or by manually calling
+ * __free_page() on each allocated page.
+ *
+ * Return: pointer to contiguous pages on success, or NULL if not successful.
+ */
+struct page *alloc_contig_pages(unsigned long nr_pages, gfp_t gfp_mask,
+				int nid, nodemask_t *nodemask)
+{
+	unsigned long ret, pfn, flags;
+	struct zonelist *zonelist;
+	struct zone *zone;
+	struct zoneref *z;
+
+	zonelist = node_zonelist(nid, gfp_mask);
+	for_each_zone_zonelist_nodemask(zone, z, zonelist,
+					gfp_zone(gfp_mask), nodemask) {
+		spin_lock_irqsave(&zone->lock, flags);
+
+		pfn = ALIGN(zone->zone_start_pfn, nr_pages);
+		while (zone_spans_last_pfn(zone, pfn, nr_pages)) {
+			if (pfn_range_valid_contig(zone, pfn, nr_pages)) {
+				/*
+				 * We release the zone lock here because
+				 * alloc_contig_range() will also lock the zone
+				 * at some point. If there's an allocation
+				 * spinning on this lock, it may win the race
+				 * and cause alloc_contig_range() to fail...
+				 */
+				spin_unlock_irqrestore(&zone->lock, flags);
+				ret = __alloc_contig_pages(pfn, nr_pages,
+							   gfp_mask);
+				if (!ret)
+					return pfn_to_page(pfn);
+				spin_lock_irqsave(&zone->lock, flags);
+			}
+			pfn += nr_pages;
+		}
+		spin_unlock_irqrestore(&zone->lock, flags);
+	}
+	return NULL;
+}
 #endif /* CONFIG_CONTIG_ALLOC */
 
 void free_contig_range(unsigned long pfn, unsigned int nr_pages)
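
Not part of the patch: a minimal usage sketch of the new interface, for
illustration only.  The module name, the allocation size, the GFP flags and
the use of numa_node_id() are assumptions, and it requires
CONFIG_CONTIG_ALLOC=y.  Since this patch does not EXPORT_SYMBOL
alloc_contig_pages(), a caller like this would in practice have to be
built-in rather than a loadable module.

#include <linux/module.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/topology.h>

/* Arbitrary demo size: 512 order-0 pages (2MB with 4K pages). */
#define CONTIG_DEMO_NR_PAGES	512UL

static int __init contig_demo_init(void)
{
	struct page *page;

	/*
	 * Scan zones on the local node for CONTIG_DEMO_NR_PAGES free
	 * contiguous pages and try to allocate them as one block.
	 */
	page = alloc_contig_pages(CONTIG_DEMO_NR_PAGES, GFP_KERNEL,
				  numa_node_id(), NULL);
	if (!page)
		return -ENOMEM;

	pr_info("contig_demo: allocated %lu pages at pfn %lu\n",
		CONTIG_DEMO_NR_PAGES, page_to_pfn(page));

	/* Release the whole range; __free_page() per page would also work. */
	free_contig_range(page_to_pfn(page), CONTIG_DEMO_NR_PAGES);
	return 0;
}

static void __exit contig_demo_exit(void)
{
}

module_init(contig_demo_init);
module_exit(contig_demo_exit);
MODULE_LICENSE("GPL");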