From patchwork Mon Jul 29 22:27:27 2024
X-Patchwork-Submitter: Nico Pache
X-Patchwork-Id: 13745845
From: Nico Pache
To: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, Matthew Wilcox, Barry Song,
    Ryan Roberts, Baolin Wang, Lance Yang, Peter Xu, Zi Yan,
    Rafael Aquini, Andrea Arcangeli, Jonathan Corbet
Subject: [RFC 2/2] mm: document transparent_hugepage=defer usage
Date: Mon, 29 Jul 2024 16:27:27 -0600
Message-ID: <20240729222727.64319-3-npache@redhat.com>
In-Reply-To: <20240729222727.64319-1-npache@redhat.com>
References: <20240729222727.64319-1-npache@redhat.com>

The new transparent_hugepage=defer option allows for a more conservative
approach to THPs. Document its usage in the transhuge admin-guide.

Cc: Andrew Morton
Cc: David Hildenbrand
Cc: Matthew Wilcox
Cc: Barry Song
Cc: Ryan Roberts
Cc: Baolin Wang
Cc: Lance Yang
Cc: Peter Xu
Cc: Zi Yan
Cc: Rafael Aquini
Cc: Andrea Arcangeli
Cc: Jonathan Corbet
Signed-off-by: Nico Pache
---
 Documentation/admin-guide/mm/transhuge.rst | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 058485daf186..1946fbb789b2 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -88,8 +88,9 @@ In certain cases when hugepages are enabled system wide, application may end up
 allocating more memory resources. An application may mmap a large
 region but only touch 1 byte of it, in that case a 2M page might be
 allocated instead of a 4k page for no good. This is why it's
-possible to disable hugepages system-wide and to only have them inside
-MADV_HUGEPAGE madvise regions.
+possible to disable hugepages system-wide, only have them inside
+MADV_HUGEPAGE madvise regions, or defer them away from the page fault
+handler to khugepaged.
 
 Embedded systems should enable hugepages only inside madvise regions
 to eliminate any risk of wasting any precious byte of memory and to
@@ -99,6 +100,15 @@ Applications that gets a lot of benefit from hugepages and that don't
 risk to lose memory by using hugepages, should use
 madvise(MADV_HUGEPAGE) on their critical mmapped regions.
 
+Applications that would like to benefit from THPs but would still like a
+more memory-conservative approach can choose 'defer'. This avoids
+inserting THPs at the page fault handler unless the region is marked
+MADV_HUGEPAGE. Khugepaged will then scan the mappings for potential
+collapses into PMD-sized pages. Admins using the 'defer' setting should
+consider tweaking khugepaged/max_ptes_none. The current default of 511
+may aggressively collapse your PTEs into PMDs. Lower this value to
+conserve more memory (e.g. max_ptes_none=64).
+
 .. _thp_sysfs:
 
 sysfs
@@ -136,6 +146,7 @@ The top-level setting (for use with "inherit") can be set by issuing
 one of the following commands::
 
 	echo always >/sys/kernel/mm/transparent_hugepage/enabled
+	echo defer >/sys/kernel/mm/transparent_hugepage/enabled
 	echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
 	echo never >/sys/kernel/mm/transparent_hugepage/enabled
 
@@ -264,7 +275,8 @@ of small pages into one large page::
 A higher value leads to use additional memory for programs.
 A lower value leads to gain less thp performance. Value of
 max_ptes_none can waste cpu time very little, you can
-ignore it.
+ignore it. Consider lowering this value when using
+``transparent_hugepage=defer``.
 
 ``max_ptes_swap`` specifies how many pages can be brought in from swap when
 collapsing a group of pages into a transparent huge page::
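
For illustration only, the two knobs described in this patch (the 'defer'
mode and a lower khugepaged/max_ptes_none) could be applied together roughly
as below. This is a minimal sketch, not part of the patch: the helper name
thp_defer_setup and its arguments are hypothetical, and it assumes that a
kernel carrying this RFC lists "defer" among the modes shown in the
"enabled" file. The value 64 is just the example value from the text above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): select the 'defer' THP mode
# and lower khugepaged's max_ptes_none in one step.
#   $1: THP sysfs directory (defaults to the real sysfs location)
#   $2: new max_ptes_none value (defaults to 64, the example in the docs)
thp_defer_setup() {
    thp_root="${1:-/sys/kernel/mm/transparent_hugepage}"
    thp_max_none="${2:-64}"

    # Assumption: a kernel with this series lists "defer" in 'enabled'.
    thp_modes=$(cat "$thp_root/enabled") || return 1
    case "$thp_modes" in
        *defer*) ;;
        *) echo "'defer' is not offered by $thp_root/enabled" >&2
           return 1 ;;
    esac

    # Stop inserting THPs at page fault time (outside MADV_HUGEPAGE regions).
    echo defer > "$thp_root/enabled" || return 1

    # Make khugepaged less eager: a PMD-sized range is only collapsed if at
    # most this many of its PTEs are still unmapped.
    echo "$thp_max_none" > "$thp_root/khugepaged/max_ptes_none" || return 1
}
```

Run as root, "thp_defer_setup" with no arguments would act on the real sysfs
tree; passing a directory as the first argument allows a dry run against a
scratch copy of the two files.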