Message ID | 20250221-hugepage-parameter-v1-0-fa49a77c87c8@cyberus-technology.de (mailing list archive) |
---|---|
Headers | From: Thomas Prescher via B4 Relay <devnull+thomas.prescher.cyberus-technology.de@kernel.org>; Reply-To: thomas.prescher@cyberus-technology.de; Subject: [PATCH 0/2] Add a command line option that enables control of how many threads per NUMA node should be used to allocate huge pages.; Date: Fri, 21 Feb 2025 14:49:02 +0100; To: Jonathan Corbet <corbet@lwn.net>, Muchun Song <muchun.song@linux.dev>, Andrew Morton <akpm@linux-foundation.org>; Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Thomas Prescher <thomas.prescher@cyberus-technology.de> |
Series | Add a command line option that enables control of how many threads per NUMA node should be used to allocate huge pages. |
Allocating huge pages can take a very long time on servers with terabytes
of memory, even when they are allocated at boot time, where the allocation
happens in parallel.

The kernel currently uses a hard-coded value of 2 threads per NUMA node
for these allocations. This value might have been good enough in the past,
but it is not sufficient to fully utilize newer systems.

This series makes the value overridable through a kernel command line
option.

We tested this on two generations of Xeon CPUs, and the results show a
large improvement in the overall allocation time.

+--------------------+-------+-------+-------+-------+-------+
| threads per node   |   2   |   4   |   8   |  16   |  32   |
+--------------------+-------+-------+-------+-------+-------+
| skylake 4node      |  44s  |  22s  |  16s  |  19s  |  20s  |
| cascade lake 4node |  39s  |  20s  |  11s  |  10s  |   9s  |
+--------------------+-------+-------+-------+-------+-------+

On Skylake, we see an improvement of 2.75x when using 8 threads per node
(44s down to 16s); more threads do not help further there. On Cascade
Lake, the improvement is even larger at 4.3x with 32 threads per node
(39s down to 9s).

This speedup is quite significant, and users of large machines like these
should have the option to make the machines boot as fast as possible.

Signed-off-by: Thomas Prescher <thomas.prescher@cyberus-technology.de>
---
Thomas Prescher (2):
      mm: hugetlb: add hugetlb_alloc_threads cmdline option
      mm: hugetlb: log time needed to allocate hugepages

 Documentation/admin-guide/kernel-parameters.txt |  7 +++
 Documentation/admin-guide/mm/hugetlbpage.rst    |  9 +++-
 mm/hugetlb.c                                    | 59 ++++++++++++++++++-------
 3 files changed, 58 insertions(+), 17 deletions(-)
---
base-commit: 334426094588f8179fe175a09ecc887ff0c75758
change-id: 20250221-hugepage-parameter-e8542fdfc0ae

Best regards,
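
The speedup figures quoted in the cover letter can be cross-checked from the table with a few lines of Python. This is only an illustrative sketch: the timings are copied from the table above, while the helper name `best_speedup` and the data layout are assumptions made for this example and are not part of the patch series.

```python
# Boot-time huge page allocation times in seconds, copied from the
# cover letter's table. Index i corresponds to THREADS[i].
THREADS = [2, 4, 8, 16, 32]
TIMES = {
    "skylake 4node": [44, 22, 16, 19, 20],
    "cascade lake 4node": [39, 20, 11, 10, 9],
}


def best_speedup(times, threads=THREADS):
    """Return (threads_per_node, speedup) for the fastest run.

    The speedup is relative to the first entry, i.e. the current
    hard-coded default of 2 threads per NUMA node.
    """
    baseline = times[0]
    best_idx = min(range(len(times)), key=times.__getitem__)
    return threads[best_idx], baseline / times[best_idx]


for name, times in TIMES.items():
    n, speedup = best_speedup(times)
    print(f"{name}: fastest at {n} threads per node, "
          f"{speedup:.2f}x faster than 2 threads")
```

On these numbers the Skylake system is fastest at 8 threads per node and gets slightly slower beyond that, whereas the Cascade Lake system keeps improving up to 32, which suggests the best value is machine-specific.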