From patchwork Tue Feb 11 00:40:50 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nico Pache
X-Patchwork-Id: 13968645
From: Nico Pache
To: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com,
 apopple@nvidia.com, dave.hansen@linux.intel.com, will@kernel.org,
 baohua@kernel.org, jack@suse.cz, srivatsa@csail.mit.edu,
 haowenchao22@gmail.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, ioworker0@gmail.com,
 wangkefeng.wang@huawei.com, ziy@nvidia.com, jglisse@google.com,
 surenb@google.com, vishal.moola@gmail.com, zokeefe@google.com,
 zhengqi.arch@bytedance.com, jhubbard@nvidia.com, 21cnbao@gmail.com,
 willy@infradead.org, kirill.shutemov@linux.intel.com, david@redhat.com,
 aarcange@redhat.com, raquini@redhat.com, dev.jain@arm.com,
 sunnanyong@huawei.com, usamaarif642@gmail.com, audra@redhat.com,
 akpm@linux-foundation.org, rostedt@goodmis.org,
 mathieu.desnoyers@efficios.com, tiwai@suse.de,
 baolin.wang@linux.alibaba.com, corbet@lwn.net, shuah@kernel.org
Subject: [RFC v2 1/5] mm: defer THP insertion to khugepaged
Date: Mon, 10 Feb 2025 17:40:50 -0700
Message-ID: <20250211004054.222931-2-npache@redhat.com>
In-Reply-To: <20250211004054.222931-1-npache@redhat.com>
References: <20250211004054.222931-1-npache@redhat.com>

Setting /transparent_hugepages/enabled=always allows applications to
benefit from THPs without having to madvise. However, the page fault
handler takes very few considerations into account when deciding whether
or not to actually use a THP. This can lead to a lot of wasted memory.

khugepaged only operates on memory that was either allocated with
enabled=always or MADV_HUGEPAGE. Introduce the ability to set
enabled=defer, which will prevent THPs from being allocated by the page
fault handler unless madvise is set, leaving it up to khugepaged to
decide which allocations will collapse to a THP. This should allow
applications to benefit from THPs, while curbing some of the memory
waste.

Signed-off-by: Nico Pache
---
 include/linux/huge_mm.h | 15 +++++++++++++--
 mm/huge_memory.c        | 31 +++++++++++++++++++++++++++----
 2 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 93e509b6c00e..fb381ca720ea 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -44,6 +44,7 @@ enum transparent_hugepage_flag {
 	TRANSPARENT_HUGEPAGE_UNSUPPORTED,
 	TRANSPARENT_HUGEPAGE_FLAG,
 	TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+	TRANSPARENT_HUGEPAGE_DEFER_PF_INST_FLAG,
 	TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG,
 	TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG,
 	TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG,
@@ -177,6 +178,7 @@ static inline bool hugepage_global_enabled(void)
 {
 	return transparent_hugepage_flags & ((1<

X-Patchwork-Id: 13968646
From: Nico Pache
Subject: [RFC v2 2/5] mm: document transparent_hugepage=defer usage
Date: Mon, 10 Feb 2025 17:40:51 -0700
Message-ID: <20250211004054.222931-3-npache@redhat.com>
In-Reply-To: <20250211004054.222931-1-npache@redhat.com>
References: <20250211004054.222931-1-npache@redhat.com>

The new transparent_hugepage=defer option allows for a more conservative
approach to THPs. Document its usage in the transhuge admin-guide.
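
For a rough sense of what the khugepaged/max_ptes_none tuning suggested
in this documentation change means in practice, the arithmetic can be
sketched as follows. This is only an illustration, assuming 4 KiB base
pages and 2 MiB PMD-sized THPs (512 PTEs per PMD); the macro and
function names are hypothetical and are not part of this series.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages and 512 PTEs per PMD (2 MiB THP). */
#define BASE_PAGE_SIZE	4096u
#define PTES_PER_PMD	512u

/*
 * Fewest PTEs that must already be populated in a PMD-sized range
 * before khugepaged is allowed to collapse it.
 */
static unsigned int min_populated_ptes(unsigned int max_ptes_none)
{
	assert(max_ptes_none < PTES_PER_PMD);
	return PTES_PER_PMD - max_ptes_none;
}

/*
 * Upper bound on the bytes of previously unmapped memory that a single
 * collapse may newly allocate.
 */
static size_t max_extra_bytes(unsigned int max_ptes_none)
{
	assert(max_ptes_none < PTES_PER_PMD);
	return (size_t)max_ptes_none * BASE_PAGE_SIZE;
}
```

Under these assumptions, the default of 511 lets a range with a single
populated PTE be collapsed, newly allocating up to 2093056 bytes, while
max_ptes_none=64 requires at least 448 populated PTEs and caps the extra
allocation at 262144 bytes.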

Signed-off-by: Nico Pache
---
 Documentation/admin-guide/mm/transhuge.rst | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index dff8d5985f0f..b3b18573bbb4 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -88,8 +88,9 @@ In certain cases when hugepages are enabled system wide, application
 may end up allocating more memory resources. An application may mmap a
 large region but only touch 1 byte of it, in that case a 2M page might
 be allocated instead of a 4k page for no good. This is why it's
-possible to disable hugepages system-wide and to only have them inside
-MADV_HUGEPAGE madvise regions.
+possible to disable hugepages system-wide, only have them inside
+MADV_HUGEPAGE madvise regions, or defer them away from the page fault
+handler to khugepaged.
 
 Embedded systems should enable hugepages only inside madvise regions
 to eliminate any risk of wasting any precious byte of memory and to
@@ -99,6 +100,15 @@ Applications that gets a lot of benefit from hugepages and that don't
 risk to lose memory by using hugepages, should use
 madvise(MADV_HUGEPAGE) on their critical mmapped regions.
 
+Applications that would like to benefit from THPs but would still like a
+more memory conservative approach can choose 'defer'. This avoids
+inserting THPs at the page fault handler unless they are MADV_HUGEPAGE.
+Khugepaged will then scan the mappings for potential collapses into PMD
+sized pages. Admins using the 'defer' setting should consider
+tweaking khugepaged/max_ptes_none. The current default of 511 may
+aggressively collapse your PTEs into PMDs. Lower this value to conserve
+more memory (e.g. max_ptes_none=64).
+
 .. _thp_sysfs:
 
 sysfs
@@ -136,6 +146,7 @@ The top-level setting (for use with "inherit") can be set by issuing
 one of the following commands::
 
 	echo always >/sys/kernel/mm/transparent_hugepage/enabled
+	echo defer >/sys/kernel/mm/transparent_hugepage/enabled
 	echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
 	echo never >/sys/kernel/mm/transparent_hugepage/enabled
 
@@ -274,7 +285,8 @@ of small pages into one large page::
 A higher value leads to use additional memory for programs.
 A lower value leads to gain less thp performance. Value of
 max_ptes_none can waste cpu time very little, you can
-ignore it.
+ignore it. Consider lowering this value when using
+``transparent_hugepage=defer``.
 
 ``max_ptes_swap`` specifies how many pages can be brought in from
 swap when collapsing a group of pages into a transparent huge page::
@@ -299,8 +311,8 @@ Boot parameters
 
 You can change the sysfs boot time default for the top-level "enabled"
 control by passing the parameter ``transparent_hugepage=always`` or
-``transparent_hugepage=madvise`` or ``transparent_hugepage=never`` to the
-kernel command line.
+``transparent_hugepage=madvise`` or ``transparent_hugepage=defer`` or
+``transparent_hugepage=never`` to the kernel command line.
 
 Alternatively, each supported anonymous THP size can be controlled by
 passing ``thp_anon=[KMG],[KMG]:;[KMG]-[KMG]:``,

From patchwork Tue Feb 11 00:40:52 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nico Pache
X-Patchwork-Id: 13968647
From: Nico Pache
Subject: [RFC v2 3/5] selftests: mm: add defer to thp setting parser
Date: Mon, 10 Feb 2025 17:40:52 -0700
Message-ID: <20250211004054.222931-4-npache@redhat.com>
In-Reply-To: <20250211004054.222931-1-npache@redhat.com>
References: <20250211004054.222931-1-npache@redhat.com>

Add the defer setting to the selftests library for reading thp settings.
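
The selftest library pairs an enum with a NULL-terminated string table,
so a new mode has to be added to both at the same position. A minimal
sketch of that lookup pattern is shown below; the names (thp_mode,
mode_strings, parse_selected_mode) are hypothetical stand-ins rather
than the helpers in thp_settings.c, and the sample sysfs strings in the
usage note are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the selftest enum; order must match the table. */
enum thp_mode { MODE_NEVER, MODE_ALWAYS, MODE_INHERIT, MODE_MADVISE, MODE_DEFER };

static const char * const mode_strings[] = {
	"never", "always", "inherit", "madvise", "defer", NULL
};

/*
 * Return the table index of the bracketed token in a sysfs-style
 * string, or -1 if no bracketed token is present or it is unknown.
 */
static int parse_selected_mode(const char *buf)
{
	const char *start, *end;
	size_t len;
	int i;

	assert(buf != NULL);

	start = strchr(buf, '[');
	end = start ? strchr(start, ']') : NULL;
	if (!end)
		return -1;

	start++;			/* skip the opening bracket */
	len = (size_t)(end - start);

	for (i = 0; mode_strings[i]; i++) {
		if (strlen(mode_strings[i]) == len &&
		    strncmp(mode_strings[i], start, len) == 0)
			return i;
	}
	return -1;
}
```

With this sketch, an assumed string such as "always [defer] madvise never"
maps to MODE_DEFER; had "defer" been left out of the table, the same
input would return -1, which is the failure this patch avoids.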

Signed-off-by: Nico Pache
---
 tools/testing/selftests/mm/thp_settings.c | 1 +
 tools/testing/selftests/mm/thp_settings.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/testing/selftests/mm/thp_settings.c b/tools/testing/selftests/mm/thp_settings.c
index ad872af1c81a..b2f9f62b302a 100644
--- a/tools/testing/selftests/mm/thp_settings.c
+++ b/tools/testing/selftests/mm/thp_settings.c
@@ -20,6 +20,7 @@ static const char * const thp_enabled_strings[] = {
 	"always",
 	"inherit",
 	"madvise",
+	"defer",
 	NULL
 };
 
diff --git a/tools/testing/selftests/mm/thp_settings.h b/tools/testing/selftests/mm/thp_settings.h
index fc131d23d593..0d52e6d4f754 100644
--- a/tools/testing/selftests/mm/thp_settings.h
+++ b/tools/testing/selftests/mm/thp_settings.h
@@ -11,6 +11,7 @@ enum thp_enabled {
 	THP_ALWAYS,
 	THP_INHERIT,
 	THP_MADVISE,
+	THP_DEFER,
 };
 
 enum thp_defrag {

From patchwork Tue Feb 11 00:40:53 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nico Pache
X-Patchwork-Id: 13968648
From: Nico Pache
Subject: [RFC v2 4/5] khugepaged: add defer option to mTHP options
Date: Mon, 10 Feb 2025 17:40:53 -0700
Message-ID: <20250211004054.222931-5-npache@redhat.com>
In-Reply-To: <20250211004054.222931-1-npache@redhat.com>
References: <20250211004054.222931-1-npache@redhat.com>

Now that we have defer to globally disable THPs at fault time, let's add
a defer setting to the mTHP options. This will allow khugepaged to
operate at that order, while avoiding it at PF time.

Signed-off-by: Nico Pache
---
 include/linux/huge_mm.h |  5 +++++
 mm/huge_memory.c        | 38 +++++++++++++++++++++++++++++++++-----
 mm/khugepaged.c         | 10 +++++-----
 3 files changed, 43 insertions(+), 10 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index fb381ca720ea..8173a9ab0f3b 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -92,6 +92,7 @@ extern struct kobj_attribute thpsize_shmem_enabled_attr;
 #define TVA_SMAPS		(1 << 0)	/* Will be used for procfs */
 #define TVA_IN_PF		(1 << 1)	/* Page fault handler */
 #define TVA_ENFORCE_SYSFS	(1 << 2)	/* Obey sysfs configuration */
+#define TVA_IN_KHUGEPAGE	((1 << 2) | (1 << 3)) /* Khugepaged defer support */
 
 #define thp_vma_allowable_order(vma, vm_flags, tva_flags, order) \
 	(!!thp_vma_allowable_orders(vma, vm_flags, tva_flags, BIT(order)))
@@ -173,6 +174,7 @@ extern unsigned long transparent_hugepage_flags;
 extern unsigned long huge_anon_orders_always;
 extern unsigned long huge_anon_orders_madvise;
 extern unsigned long huge_anon_orders_inherit;
+extern unsigned long huge_anon_orders_defer;
 
 static inline bool hugepage_global_enabled(void)
 {
@@ -297,6 +299,9 @@ unsigned long thp_vma_allowable_orders(struct
vm_area_struct *vma, /* Optimization to check if required orders are enabled early. */ if ((tva_flags & TVA_ENFORCE_SYSFS) && vma_is_anonymous(vma)) { unsigned long mask = READ_ONCE(huge_anon_orders_always); + + if ((tva_flags) & (TVA_IN_KHUGEPAGE)) + mask |= READ_ONCE(huge_anon_orders_defer); if (vm_flags & VM_HUGEPAGE) mask |= READ_ONCE(huge_anon_orders_madvise); if (hugepage_global_always() || hugepage_global_defer() || diff --git a/mm/huge_memory.c b/mm/huge_memory.c index a5e66a12bae8..de45595b0f98 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -81,6 +81,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL; unsigned long huge_anon_orders_always __read_mostly; unsigned long huge_anon_orders_madvise __read_mostly; unsigned long huge_anon_orders_inherit __read_mostly; +unsigned long huge_anon_orders_defer __read_mostly; static bool anon_orders_configured __initdata; static inline bool file_thp_enabled(struct vm_area_struct *vma) @@ -505,13 +506,15 @@ static ssize_t anon_enabled_show(struct kobject *kobj, const char *output; if (test_bit(order, &huge_anon_orders_always)) - output = "[always] inherit madvise never"; + output = "[always] inherit madvise defer never"; else if (test_bit(order, &huge_anon_orders_inherit)) - output = "always [inherit] madvise never"; + output = "always [inherit] madvise defer never"; else if (test_bit(order, &huge_anon_orders_madvise)) - output = "always inherit [madvise] never"; + output = "always inherit [madvise] defer never"; + else if (test_bit(order, &huge_anon_orders_defer)) + output = "always inherit madvise [defer] never"; else - output = "always inherit madvise [never]"; + output = "always inherit madvise defer [never]"; return sysfs_emit(buf, "%s\n", output); } @@ -527,25 +530,36 @@ static ssize_t anon_enabled_store(struct kobject *kobj, spin_lock(&huge_anon_orders_lock); clear_bit(order, &huge_anon_orders_inherit); clear_bit(order, &huge_anon_orders_madvise); + clear_bit(order, &huge_anon_orders_defer); 
 		set_bit(order, &huge_anon_orders_always);
 		spin_unlock(&huge_anon_orders_lock);
 	} else if (sysfs_streq(buf, "inherit")) {
 		spin_lock(&huge_anon_orders_lock);
 		clear_bit(order, &huge_anon_orders_always);
 		clear_bit(order, &huge_anon_orders_madvise);
+		clear_bit(order, &huge_anon_orders_defer);
 		set_bit(order, &huge_anon_orders_inherit);
 		spin_unlock(&huge_anon_orders_lock);
 	} else if (sysfs_streq(buf, "madvise")) {
 		spin_lock(&huge_anon_orders_lock);
 		clear_bit(order, &huge_anon_orders_always);
 		clear_bit(order, &huge_anon_orders_inherit);
+		clear_bit(order, &huge_anon_orders_defer);
 		set_bit(order, &huge_anon_orders_madvise);
 		spin_unlock(&huge_anon_orders_lock);
+	} else if (sysfs_streq(buf, "defer")) {
+		spin_lock(&huge_anon_orders_lock);
+		clear_bit(order, &huge_anon_orders_always);
+		clear_bit(order, &huge_anon_orders_inherit);
+		clear_bit(order, &huge_anon_orders_madvise);
+		set_bit(order, &huge_anon_orders_defer);
+		spin_unlock(&huge_anon_orders_lock);
 	} else if (sysfs_streq(buf, "never")) {
 		spin_lock(&huge_anon_orders_lock);
 		clear_bit(order, &huge_anon_orders_always);
 		clear_bit(order, &huge_anon_orders_inherit);
 		clear_bit(order, &huge_anon_orders_madvise);
+		clear_bit(order, &huge_anon_orders_defer);
 		spin_unlock(&huge_anon_orders_lock);
 	} else
 		ret = -EINVAL;
@@ -991,7 +1005,7 @@ static char str_dup[PAGE_SIZE] __initdata;
 static int __init setup_thp_anon(char *str)
 {
 	char *token, *range, *policy, *subtoken;
-	unsigned long always, inherit, madvise;
+	unsigned long always, inherit, madvise, defer;
 	char *start_size, *end_size;
 	int start, end, nr;
 	char *p;
@@ -1003,6 +1017,8 @@ static int __init setup_thp_anon(char *str)
 	always = huge_anon_orders_always;
 	madvise = huge_anon_orders_madvise;
 	inherit = huge_anon_orders_inherit;
+	defer = huge_anon_orders_defer;
+
 	p = str_dup;
 	while ((token = strsep(&p, ";")) != NULL) {
 		range = strsep(&token, ":");
@@ -1042,18 +1058,28 @@ static int __init setup_thp_anon(char *str)
 				bitmap_set(&always, start, nr);
 				bitmap_clear(&inherit, start, nr);
 				bitmap_clear(&madvise, start, nr);
+				bitmap_clear(&defer, start, nr);
 			} else if (!strcmp(policy, "madvise")) {
 				bitmap_set(&madvise, start, nr);
 				bitmap_clear(&inherit, start, nr);
 				bitmap_clear(&always, start, nr);
+				bitmap_clear(&defer, start, nr);
 			} else if (!strcmp(policy, "inherit")) {
 				bitmap_set(&inherit, start, nr);
 				bitmap_clear(&madvise, start, nr);
 				bitmap_clear(&always, start, nr);
+				bitmap_clear(&defer, start, nr);
+			} else if (!strcmp(policy, "defer")) {
+				bitmap_set(&defer, start, nr);
+				bitmap_clear(&madvise, start, nr);
+				bitmap_clear(&always, start, nr);
+				bitmap_clear(&inherit, start, nr);
 			} else if (!strcmp(policy, "never")) {
 				bitmap_clear(&inherit, start, nr);
 				bitmap_clear(&madvise, start, nr);
 				bitmap_clear(&always, start, nr);
+				bitmap_clear(&defer, start, nr);
+
 			} else {
 				pr_err("invalid policy %s in thp_anon boot parameter\n", policy);
 				goto err;
@@ -1064,6 +1090,8 @@ static int __init setup_thp_anon(char *str)
 	huge_anon_orders_always = always;
 	huge_anon_orders_madvise = madvise;
 	huge_anon_orders_inherit = inherit;
+	huge_anon_orders_defer = defer;
+
 	anon_orders_configured = true;
 	return 1;
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index fc30698b8e6e..a83bc812ea64 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -488,7 +488,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
 {
 	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
 	    hugepage_pmd_enabled()) {
-		if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS,
+		if (thp_vma_allowable_order(vma, vm_flags, TVA_IN_KHUGEPAGE,
 					    PMD_ORDER))
 			__khugepaged_enter(vma->vm_mm);
 	}
@@ -943,7 +943,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 				   struct collapse_control *cc, int order)
 {
 	struct vm_area_struct *vma;
-	unsigned long tva_flags = cc->is_khugepaged ? TVA_ENFORCE_SYSFS : 0;
+	unsigned long tva_flags = cc->is_khugepaged ? TVA_IN_KHUGEPAGE : 0;
 
 	if (unlikely(khugepaged_test_exit_or_disable(mm)))
 		return SCAN_ANY_PROCESS;
@@ -1393,7 +1393,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	bool writable = false;
 	int chunk_none_count = 0;
 	int scaled_none = khugepaged_max_ptes_none >> (HPAGE_PMD_ORDER - MIN_MTHP_ORDER);
-	unsigned long tva_flags = cc->is_khugepaged ? TVA_ENFORCE_SYSFS : 0;
+	unsigned long tva_flags = cc->is_khugepaged ? TVA_IN_KHUGEPAGE : 0;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	result = find_pmd_or_thp_or_none(mm, address, &pmd);
@@ -2505,7 +2505,7 @@ static int khugepaged_collapse_single_pmd(unsigned long addr, struct mm_struct *
 				   struct collapse_control *cc)
 {
 	int result = SCAN_FAIL;
-	unsigned long tva_flags = cc->is_khugepaged ? TVA_ENFORCE_SYSFS : 0;
+	unsigned long tva_flags = cc->is_khugepaged ? TVA_IN_KHUGEPAGE : 0;
 
 	if (!*mmap_locked) {
 		mmap_read_lock(mm);
@@ -2595,7 +2595,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 			break;
 		}
 		if (!thp_vma_allowable_order(vma, vma->vm_flags,
-					TVA_ENFORCE_SYSFS, PMD_ORDER)) {
+					TVA_IN_KHUGEPAGE, PMD_ORDER)) {
 skip:
 			progress++;
 			continue;

From patchwork Tue Feb 11 00:40:54 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nico Pache
X-Patchwork-Id: 13968649
From: Nico Pache
To: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-mm@kvack.org
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com,
	cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
	dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
	jack@suse.cz, srivatsa@csail.mit.edu, haowenchao22@gmail.com,
	hughd@google.com, aneesh.kumar@kernel.org, yang@os.amperecomputing.com,
	peterx@redhat.com, ioworker0@gmail.com, wangkefeng.wang@huawei.com,
	ziy@nvidia.com, jglisse@google.com, surenb@google.com,
	vishal.moola@gmail.com, zokeefe@google.com, zhengqi.arch@bytedance.com,
	jhubbard@nvidia.com, 21cnbao@gmail.com, willy@infradead.org,
	kirill.shutemov@linux.intel.com, david@redhat.com, aarcange@redhat.com,
	raquini@redhat.com, dev.jain@arm.com, sunnanyong@huawei.com,
	usamaarif642@gmail.com, audra@redhat.com, akpm@linux-foundation.org,
	rostedt@goodmis.org, mathieu.desnoyers@efficios.com, tiwai@suse.de,
	baolin.wang@linux.alibaba.com, corbet@lwn.net, shuah@kernel.org
Subject: [RFC v2 5/5] mm: document mTHP defer setting
Date: Mon, 10 Feb 2025 17:40:54 -0700
Message-ID: <20250211004054.222931-6-npache@redhat.com>
In-Reply-To: <20250211004054.222931-1-npache@redhat.com>
References: <20250211004054.222931-1-npache@redhat.com>
MIME-Version: 1.0

Now that we have mTHP support in khugepaged, let's add it to the
transhuge admin guide to provide proper guidance.

Signed-off-by: Nico Pache
---
 Documentation/admin-guide/mm/transhuge.rst | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index b3b18573bbb4..99ba3763c1c4 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -63,7 +63,7 @@ often.
 THP can be enabled system wide or restricted to certain tasks or even
 memory ranges inside task's address space. Unless THP is completely
 disabled, there is ``khugepaged`` daemon that scans memory and
-collapses sequences of basic pages into PMD-sized huge pages.
+collapses sequences of basic pages into huge pages.
 
 The THP behaviour is controlled via :ref:`sysfs `
 interface and using madvise(2) and prctl(2) system calls.
@@ -103,8 +103,8 @@ madvise(MADV_HUGEPAGE) on their critical mmapped regions.
 Applications that would like to benefit from THPs but would still like a
 more memory conservative approach can choose 'defer'. This avoids
 inserting THPs at the page fault handler unless they are MADV_HUGEPAGE.
-Khugepaged will then scan the mappings for potential collapses into PMD
-sized pages. Admins using this the 'defer' setting should consider
+Khugepaged will then scan the mappings for potential collapses into (m)THP
+pages. Admins using the 'defer' setting should consider
 tweaking khugepaged/max_ptes_none. The current default of 511 may
 aggressively collapse your PTEs into PMDs. Lower this value to conserve
 more memory (ie. max_ptes_none=64).
@@ -119,11 +119,14 @@ Global THP controls
 
 Transparent Hugepage Support for anonymous memory can be entirely disabled
 (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
-regions (to avoid the risk of consuming more memory resources) or enabled
-system wide. This can be achieved per-supported-THP-size with one of::
+regions (to avoid the risk of consuming more memory resources), deferred to
+khugepaged, or enabled system wide.
+
+This can be achieved per-supported-THP-size with one of::
 
 	echo always >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled
 	echo madvise >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled
+	echo defer >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled
 	echo never >/sys/kernel/mm/transparent_hugepage/hugepages-kB/enabled
 
 where is the hugepage size being addressed, the available sizes
@@ -155,6 +158,13 @@ hugepage sizes have enabled="never". If enabling multiple hugepage
 sizes, the kernel will select the most appropriate enabled size for a
 given allocation.
 
+khugepaged uses max_ptes_none scaled to the order of the enabled mTHP size to
+determine collapses. When using mTHPs, it's recommended to set max_ptes_none
+low, ideally less than HPAGE_PMD_NR / 2 (255 on 4k page size). This prevents
+undesired "creep" behavior that leads to continuously collapsing to a larger
+mTHP size. max_ptes_shared and max_ptes_swap have no effect when collapsing to
+an mTHP, and mTHP collapse will fail on shared or swapped out pages.
+
 It's also possible to limit defrag efforts in the VM to generate
 anonymous hugepages in case they're not immediately free to madvise
 regions or to never try to defrag memory and simply fallback to regular
@@ -318,7 +328,7 @@ Alternatively, each supported anonymous THP size can be controlled by
 passing ``thp_anon=[KMG],[KMG]:;[KMG]-[KMG]:``,
 where ```` is the THP size (must be a power of 2 of PAGE_SIZE and
 supported anonymous THP)  and ```` is one of ``always``, ``madvise``,
-``never`` or ``inherit``.
+``defer``, ``never`` or ``inherit``.
 
 For example, the following will set 16K, 32K, 64K THP to ``always``,
 set 128K, 512K to ``inherit``, set 256K to ``madvise`` and 1M, 2M