From patchwork Sat Feb 18 00:28:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Houghton
X-Patchwork-Id: 13145397
Date: Sat, 18 Feb 2023 00:28:04 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-32-jthoughton@google.com>
Subject: [PATCH v2 31/46] hugetlb: sort hstates in hugetlb_init_hstates
From: James Houghton
To: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton
Cc: David Hildenbrand, David Rientjes, Axel Rasmussen, Mina Almasry,
 "Zach O'Keefe", Manish Mishra, Naoya Horiguchi,
 "Dr. David Alan Gilbert", "Matthew Wilcox (Oracle)", Vlastimil Babka,
 Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden, Jiaqi Yan,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton
Sender: owner-linux-mm@kvack.org

When using HugeTLB high-granularity mapping, we need to go through the
supported hugepage sizes in decreasing order so that we pick the largest
size that works. Consider the case where we're faulting in a 1G hugepage
for the first time: we want hugetlb_fault/hugetlb_no_page to map it with
a PUD. By going through the sizes in decreasing order, we will find that
PUD_SIZE works before finding out that PMD_SIZE or PAGE_SIZE work too.

This commit also changes bootmem hugepages from storing hstate pointers
directly to storing the hstate sizes. The hstate pointers used for
boot-time-allocated hugepages become invalid after we sort the hstates.
`gather_bootmem_prealloc`, called after the hstates have been sorted,
now converts the size to the correct hstate.

Signed-off-by: James Houghton

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2fe1eb6897d4..a344f9d9eba1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -766,7 +766,7 @@ struct hstate {
 
 struct huge_bootmem_page {
 	struct list_head list;
-	struct hstate *hstate;
+	unsigned long hstate_sz;
 };
 
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 39f541b4a0a8..e20df8f6216e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -49,6 +50,10 @@
 
 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
+/*
+ * After hugetlb_init_hstates is called, hstates will be sorted from largest
+ * to smallest.
+ */
 struct hstate hstates[HUGE_MAX_HSTATE];
 
 #ifdef CONFIG_CMA
@@ -3464,7 +3469,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	/* Put them into a private list first because mem_map is not up yet */
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages);
-	m->hstate = h;
+	m->hstate_sz = huge_page_size(h);
 	return 1;
 }
 
@@ -3479,7 +3484,7 @@ static void __init gather_bootmem_prealloc(void)
 	list_for_each_entry(m, &huge_boot_pages, list) {
 		struct page *page = virt_to_page(m);
 		struct folio *folio = page_folio(page);
-		struct hstate *h = m->hstate;
+		struct hstate *h = size_to_hstate(m->hstate_sz);
 
 		VM_BUG_ON(!hstate_is_gigantic(h));
 		WARN_ON(folio_ref_count(folio) != 1);
@@ -3595,9 +3600,38 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 	kfree(node_alloc_noretry);
 }
 
+static int compare_hstates_decreasing(const void *a, const void *b)
+{
+	unsigned long sz_a = huge_page_size((const struct hstate *)a);
+	unsigned long sz_b = huge_page_size((const struct hstate *)b);
+
+	if (sz_a < sz_b)
+		return 1;
+	if (sz_a > sz_b)
+		return -1;
+	return 0;
+}
+
+static void sort_hstates(void)
+{
+	unsigned long default_hstate_sz = huge_page_size(&default_hstate);
+
+	/* Sort from largest to smallest. */
+	sort(hstates, hugetlb_max_hstate, sizeof(*hstates),
+	     compare_hstates_decreasing, NULL);
+
+	/*
+	 * We may have changed the location of the default hstate, so we need to
+	 * update it.
+	 */
+	default_hstate_idx = hstate_index(size_to_hstate(default_hstate_sz));
+}
+
 static void __init hugetlb_init_hstates(void)
 {
-	struct hstate *h, *h2;
+	struct hstate *h;
+
+	sort_hstates();
 
 	for_each_hstate(h) {
 		/* oversize hugepages were init'ed in early boot */
@@ -3616,13 +3650,8 @@ static void __init hugetlb_init_hstates(void)
 			continue;
 		if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER)
 			continue;
-		for_each_hstate(h2) {
-			if (h2 == h)
-				continue;
-			if (h2->order < h->order &&
-			    h2->order > h->demote_order)
-				h->demote_order = h2->order;
-		}
+		if (h - 1 >= &hstates[0])
+			h->demote_order = huge_page_order(h - 1);
 	}
 }
 
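
The two points in the commit message (a pointer into the hstates array
goes stale once the array is sorted, and a largest-first walk finds the
biggest usable size first) can be tried out in userspace. The sketch
below is only an illustration under stated assumptions: struct
demo_hstate, demo_size_to_hstate(), demo_largest_fit() and demo_run()
are hypothetical stand-ins, not kernel APIs; qsort() replaces the
kernel's sort(); and the "size works" test (aligned address, size no
larger than the range) is an assumed simplification, not the check the
series actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct hstate: only the page size matters. */
struct demo_hstate {
	unsigned long size;
};

/* Same ordering as compare_hstates_decreasing: larger sizes sort first. */
static int demo_compare_decreasing(const void *a, const void *b)
{
	unsigned long sz_a = ((const struct demo_hstate *)a)->size;
	unsigned long sz_b = ((const struct demo_hstate *)b)->size;

	if (sz_a < sz_b)
		return 1;
	if (sz_a > sz_b)
		return -1;
	return 0;
}

/* Linear lookup by size, playing the role of size_to_hstate(). */
static struct demo_hstate *demo_size_to_hstate(struct demo_hstate *tbl,
					       size_t n, unsigned long size)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].size == size)
			return &tbl[i];
	return NULL;
}

/*
 * Walk a table sorted largest-first and return the first size that
 * "works". Assumed criterion: addr is aligned to the size and the size
 * does not exceed len. Returns 0 if no size fits.
 */
static unsigned long demo_largest_fit(const struct demo_hstate *tbl, size_t n,
				      unsigned long addr, unsigned long len)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].size <= len && addr % tbl[i].size == 0)
			return tbl[i].size;
	return 0;
}

/* Returns 1 when the stale-pointer and lookup-by-size checks hold. */
static int demo_run(void)
{
	/* Registration order: 2M, 1G, 64K (deliberately unsorted). */
	struct demo_hstate tbl[] = {
		{ 2UL << 20 }, { 1UL << 30 }, { 64UL << 10 }
	};
	size_t n = sizeof(tbl) / sizeof(tbl[0]);
	struct demo_hstate *saved_ptr = &tbl[0];  /* the 2M entry, for now */
	unsigned long saved_sz = tbl[0].size;

	qsort(tbl, n, sizeof(tbl[0]), demo_compare_decreasing);

	/* Slot 0 now holds the 1G entry, so the saved pointer is stale. */
	if (saved_ptr->size == saved_sz)
		return 0;
	/* The saved size still identifies the 2M entry, now in slot 1. */
	if (demo_size_to_hstate(tbl, n, saved_sz) != &tbl[1])
		return 0;
	/* A 1G-aligned, 1G-long range is matched by the 1G entry first. */
	return demo_largest_fit(tbl, n, 0, 1UL << 30) == (1UL << 30);
}
```

Calling demo_run() from a small main() and checking for a return value
of 1 exercises all three checks. Remembering the size rather than the
array slot is what keeps the lookup valid across the sort; that is the
same trade the patch makes with hstate_sz.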