From patchwork Thu Feb 27 23:02:10 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Prescher via B4 Relay
X-Patchwork-Id: 13995332
From: Thomas Prescher via B4 Relay
Date: Fri, 28 Feb 2025 00:02:10 +0100
Subject: [PATCH v3 1/3] mm: hugetlb: improve parallel huge page allocation time
Message-Id: <20250228-hugepage-parameter-v3-1-2628e9b2b5c0@cyberus-technology.de>
References: <20250228-hugepage-parameter-v3-0-2628e9b2b5c0@cyberus-technology.de>
In-Reply-To: <20250228-hugepage-parameter-v3-0-2628e9b2b5c0@cyberus-technology.de>
To: Jonathan Corbet, Muchun Song, Andrew Morton
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Thomas Prescher
X-Mailer: b4 0.14.2
X-Original-From: Thomas Prescher
Reply-To: thomas.prescher@cyberus-technology.de

From: Thomas Prescher

Before this patch, the kernel used a hard-coded value of 2 threads per
NUMA node for parallel huge page allocation at boot. This patch changes
that policy: the kernel now uses 25% of the available hardware threads
(online CPUs), with a minimum of one thread.

Signed-off-by: Thomas Prescher
---
 mm/hugetlb.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea16450026496c020b544877db147d1..e9b1b3e2b9d467f067d54359e1401a03f9926108 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -14,9 +14,11 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -3427,31 +3429,31 @@ static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
 		.numa_aware	= true
 	};
 
+	unsigned int num_allocation_threads = max(num_online_cpus() / 4, 1);
+
 	job.thread_fn	= hugetlb_pages_alloc_boot_node;
 	job.start	= 0;
 	job.size	= h->max_huge_pages;
 
 	/*
-	 * job.max_threads is twice the num_node_state(N_MEMORY),
+	 * job.max_threads is 25% of the available cpu threads by default.
 	 *
-	 * Tests below indicate that a multiplier of 2 significantly improves
-	 * performance, and although larger values also provide improvements,
-	 * the gains are marginal.
+	 * On large servers with terabytes of memory, huge page allocation
+	 * can consume a considerable amount of time.
 	 *
-	 * Therefore, choosing 2 as the multiplier strikes a good balance between
-	 * enhancing parallel processing capabilities and maintaining efficient
-	 * resource management.
+	 * Tests below show how long it takes to allocate 1 TiB of memory with
+	 * 2MiB huge pages. Using more threads can significantly improve allocation time.
 	 *
-	 * +------------+-------+-------+-------+-------+-------+
-	 * | multiplier |   1   |   2   |   3   |   4   |   5   |
-	 * +------------+-------+-------+-------+-------+-------+
-	 * | 256G 2node | 358ms | 215ms | 157ms | 134ms | 126ms |
-	 * | 2T   4node | 979ms | 679ms | 543ms | 489ms | 481ms |
-	 * | 50G  2node | 71ms  | 44ms  | 37ms  | 30ms  | 31ms  |
-	 * +------------+-------+-------+-------+-------+-------+
+	 * +-----------------------+-------+-------+-------+-------+-------+
+	 * | threads               |   8   |   16  |   32  |   64  |   128 |
+	 * +-----------------------+-------+-------+-------+-------+-------+
+	 * | skylake      144 cpus |   44s |   22s |   16s |   19s |   20s |
+	 * | cascade lake 192 cpus |   39s |   20s |   11s |   10s |    9s |
+	 * +-----------------------+-------+-------+-------+-------+-------+
 	 */
-	job.max_threads	= num_node_state(N_MEMORY) * 2;
-	job.min_chunk	= h->max_huge_pages / num_node_state(N_MEMORY) / 2;
+
+	job.max_threads	= num_allocation_threads;
+	job.min_chunk	= h->max_huge_pages / num_allocation_threads;
 	padata_do_multithreaded(&job);
 
 	return h->nr_huge_pages;
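
The thread-count policy changed by this patch can be illustrated outside the kernel. The following is a minimal userspace sketch in C, not kernel code. The helper names old_policy_threads, new_policy_threads and min_chunk_pages are hypothetical, and the node and CPU counts are passed in as plain arguments instead of being read through num_node_state(N_MEMORY) and num_online_cpus().

```c
#include <assert.h>

/*
 * Hypothetical helper: thread count under the old policy,
 * i.e. num_node_state(N_MEMORY) * 2.
 */
static unsigned int old_policy_threads(unsigned int memory_nodes)
{
	assert(memory_nodes > 0);
	return memory_nodes * 2;
}

/*
 * Hypothetical helper: thread count under the new policy,
 * i.e. max(num_online_cpus() / 4, 1).
 */
static unsigned int new_policy_threads(unsigned int online_cpus)
{
	unsigned int threads = online_cpus / 4;

	/* Never drop below one thread on machines with fewer than 4 CPUs. */
	return threads > 0 ? threads : 1;
}

/*
 * Hypothetical helper mirroring the new
 * job.min_chunk = h->max_huge_pages / num_allocation_threads.
 */
static unsigned long min_chunk_pages(unsigned long max_huge_pages,
				     unsigned int threads)
{
	assert(threads > 0);
	return max_huge_pages / threads;
}
```

Under these assumptions, the 144-CPU machine from the table would default to 36 threads and the 192-CPU machine to 48. For 1 TiB of 2MiB huge pages (524288 pages), min_chunk_pages(524288, 48) evaluates to 10922 pages.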