From patchwork Wed Feb 21 09:27:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13565162
From: Baolin Wang
To: akpm@linux-foundation.org
Cc: muchun.song@linux.dev, osalvador@suse.de, david@redhat.com,
 linmiaohe@huawei.com, naoya.horiguchi@nec.com, mhocko@kernel.org,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [RFC PATCH 1/3] mm: record the migration reason for struct migration_target_control
Date: Wed, 21 Feb 2024 17:27:53 +0800
Message-Id: <04e445d16ab9f8d5a4cd4082c0c9b5f5e0bbf54c.1708507022.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.39.3

To support different hugetlb allocation
strategies during hugetlb migration based on various migration reasons,
record the migration reason in the migration_target_control structure as
a preparation step.

Signed-off-by: Baolin Wang
---
 mm/gup.c            | 1 +
 mm/internal.h       | 1 +
 mm/memory-failure.c | 1 +
 mm/memory_hotplug.c | 1 +
 mm/mempolicy.c      | 1 +
 mm/migrate.c        | 1 +
 mm/page_alloc.c     | 1 +
 mm/vmscan.c         | 3 ++-
 8 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index df83182ec72d..959a1a05b059 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2132,6 +2132,7 @@ static int migrate_longterm_unpinnable_pages(
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
 		.gfp_mask = GFP_USER | __GFP_NOWARN,
+		.reason = MR_LONGTERM_PIN,
 	};
 
 	if (migrate_pages(movable_page_list, alloc_migration_target,
diff --git a/mm/internal.h b/mm/internal.h
index 93e229112045..7677ee4d8e12 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -958,6 +958,7 @@ struct migration_target_control {
 	int nid;		/* preferred node id */
 	nodemask_t *nmask;
 	gfp_t gfp_mask;
+	enum migrate_reason reason;
 };
 
 /*
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 9349948f1abf..780bb2aee0af 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2666,6 +2666,7 @@ static int soft_offline_in_use_page(struct page *page)
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
 		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+		.reason = MR_MEMORY_FAILURE,
 	};
 
 	if (!huge && folio_test_large(folio)) {
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a444e2d7dd2b..b79ba36e09e0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1841,6 +1841,7 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 		struct migration_target_control mtc = {
 			.nmask = &nmask,
 			.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+			.reason = MR_MEMORY_HOTPLUG,
 		};
 		int ret;
 
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index f60b4c99f130..98ceb12e0e17 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1070,6 +1070,7 @@ static long migrate_to_node(struct mm_struct *mm, int source, int dest,
 	struct migration_target_control mtc = {
 		.nid = dest,
 		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+		.reason = MR_SYSCALL,
 	};
 
 	nodes_clear(nmask);
diff --git a/mm/migrate.c b/mm/migrate.c
index 73a052a382f1..bde63010a3cf 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2060,6 +2060,7 @@ static int do_move_pages_to_node(struct list_head *pagelist, int node)
 	struct migration_target_control mtc = {
 		.nid = node,
 		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+		.reason = MR_SYSCALL,
 	};
 
 	err = migrate_pages(pagelist, alloc_migration_target, NULL,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b0b92ce997dc..81ba73d77921 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6264,6 +6264,7 @@ int __alloc_contig_migrate_range(struct compact_control *cc,
 	struct migration_target_control mtc = {
 		.nid = zone_to_nid(cc->zone),
 		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+		.reason = MR_CONTIG_RANGE,
 	};
 
 	lru_cache_disable();
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 87df3a48bdd7..d111c8e3b40e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -978,7 +978,8 @@ static unsigned int demote_folio_list(struct list_head *demote_folios,
 		.gfp_mask = (GFP_HIGHUSER_MOVABLE & ~__GFP_RECLAIM) | __GFP_NOWARN |
 			__GFP_NOMEMALLOC | GFP_NOWAIT,
 		.nid = target_nid,
-		.nmask = &allowed_mask
+		.nmask = &allowed_mask,
+		.reason = MR_DEMOTION,
 	};
 
 	if (list_empty(demote_folios))

From patchwork Wed Feb 21 09:27:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13565163
From: Baolin Wang
To: akpm@linux-foundation.org
Cc: muchun.song@linux.dev, osalvador@suse.de, david@redhat.com,
 linmiaohe@huawei.com, naoya.horiguchi@nec.com, mhocko@kernel.org,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [RFC PATCH 2/3] mm: hugetlb: make the hugetlb migration strategy consistent
Date: Wed, 21 Feb 2024 17:27:54 +0800
Message-Id: <0514e5139b17ecf3cd9e09d86c93e586c56688dc.1708507022.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.39.3

As discussed in the previous thread [1], there is an inconsistency when
handling hugetlb migration. When handling the migration of freed hugetlb,
alloc_and_dissolve_hugetlb_folio() prevents fallback to other NUMA nodes.
However, when dealing with in-use hugetlb, alloc_hugetlb_folio_nodemask()
allows fallback to other NUMA nodes, which can break the per-node hugetlb
pool and might result in unexpected failures when node-bound workloads
don't get what is assumed to be available.

To make the hugetlb migration strategy clearer, we should list all the
scenarios of hugetlb migration and analyze whether allocation fallback is
permitted:

1) Memory offline: will call dissolve_free_huge_pages() to free the freed
hugetlb, and call do_migrate_range() to migrate the in-use hugetlb.
Both can break the per-node hugetlb pool, but as this is an explicit
offlining operation there is no better choice, so hugetlb allocation
fallback should be allowed.

2) Memory failure: same as memory offline. Fallback to a different node
should be allowed, as it might be the only option to handle the failure;
otherwise the impact of poisoned memory can be amplified.

3) Longterm pinning: will call migrate_longterm_unpinnable_pages() to
migrate in-use hugetlb that is not longterm-pinnable, which can break the
per-node pool. But we should fail the longterm pinning if we cannot
allocate on the current node, to avoid breaking the per-node pool.

4) Syscalls (mbind, migrate_pages, move_pages): these are explicit user
operations to move pages to other nodes, so fallback to other nodes should
not be prohibited.

5) alloc_contig_range: used by CMA allocation and virtio-mem fake-offline
to allocate a given range of pages. Currently the freed hugetlb migration
is not allowed to fall back; to keep consistency, the in-use hugetlb
migration should not be allowed to fall back either.

6) alloc_contig_pages: used by kfence, pgtable_debug etc. The strategy
should be consistent with that of alloc_contig_range().

Based on the analysis of the scenarios above, determine whether fallback
is permitted according to the migration reason in
alloc_hugetlb_folio_nodemask().
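
The scenario analysis above can be sketched as a small standalone C
function. This is a minimal userspace illustration, not the kernel code:
the sketch_* names, the local enum and the SKETCH_GFP_THISNODE bit are
hypothetical stand-ins for enum migrate_reason and __GFP_THISNODE, and
only the reasons discussed above are modeled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's migration reasons. */
enum sketch_reason {
	SKETCH_MEMORY_FAILURE,
	SKETCH_MEMORY_HOTPLUG,
	SKETCH_SYSCALL,
	SKETCH_MEMPOLICY_MBIND,
	SKETCH_CONTIG_RANGE,
	SKETCH_LONGTERM_PIN,
};

/* Hypothetical flag bit standing in for __GFP_THISNODE. */
#define SKETCH_GFP_THISNODE 0x1u

/* Scenarios 1, 2 and 4 may fall back to another node; the rest may not. */
bool sketch_fallback_allowed(enum sketch_reason reason)
{
	switch (reason) {
	case SKETCH_MEMORY_HOTPLUG:
	case SKETCH_MEMORY_FAILURE:
	case SKETCH_SYSCALL:
	case SKETCH_MEMPOLICY_MBIND:
		return true;
	default:
		return false;
	}
}

/*
 * Return the allocation mask to use for a new hugetlb folio: a caller
 * that already asked for this-node-only keeps its mask, otherwise the
 * this-node bit is added whenever fallback is not allowed.
 */
unsigned int sketch_adjust_gfp(unsigned int gfp, enum sketch_reason reason)
{
	if (gfp & SKETCH_GFP_THISNODE)
		return gfp;
	if (!sketch_fallback_allowed(reason))
		gfp |= SKETCH_GFP_THISNODE;
	return gfp;
}
```

Under these assumptions, sketch_adjust_gfp(0, SKETCH_LONGTERM_PIN) returns
a mask with the this-node bit set, while
sketch_adjust_gfp(0, SKETCH_MEMORY_HOTPLUG) leaves the mask unchanged.
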

[1] https://lore.kernel.org/all/6f26ce22d2fcd523418a085f2c588fe0776d46e7.1706794035.git.baolin.wang@linux.alibaba.com/

Signed-off-by: Baolin Wang
---
 include/linux/hugetlb.h |  4 ++--
 mm/hugetlb.c            | 28 ++++++++++++++++++++++++++--
 mm/mempolicy.c          |  2 +-
 mm/migrate.c            |  2 +-
 4 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 77b30a8c6076..fa122dc509cf 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -747,7 +747,7 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
-				nodemask_t *nmask, gfp_t gfp_mask);
+				nodemask_t *nmask, gfp_t gfp_mask, int reason);
 int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
 			pgoff_t idx);
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
@@ -1065,7 +1065,7 @@ static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 
 static inline struct folio *
 alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
-			nodemask_t *nmask, gfp_t gfp_mask)
+			nodemask_t *nmask, gfp_t gfp_mask, int reason)
 {
 	return NULL;
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 68283e54c899..a55cfc7844bc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2621,8 +2621,10 @@ struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,
 
 /* folio migration callback function */
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
-		nodemask_t *nmask, gfp_t gfp_mask)
+		nodemask_t *nmask, gfp_t gfp_mask, int reason)
 {
+	bool allowed_fallback = false;
+
 	spin_lock_irq(&hugetlb_lock);
 	if (available_huge_pages(h)) {
 		struct folio *folio;
@@ -2636,6 +2638,28 @@ struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 	}
 	spin_unlock_irq(&hugetlb_lock);
 
+	if (gfp_mask & __GFP_THISNODE)
+		goto alloc_new;
+
+	/*
+	 * Note: the memory offline, memory failure and migration syscalls can break
+	 * the per-node hugetlb pool. Other cases can not allocate new hugetlb on
+	 * other nodes.
+	 */
+	switch (reason) {
+	case MR_MEMORY_HOTPLUG:
+	case MR_MEMORY_FAILURE:
+	case MR_SYSCALL:
+	case MR_MEMPOLICY_MBIND:
+		allowed_fallback = true;
+		break;
+	default:
+		break;
+	}
+
+	if (!allowed_fallback)
+		gfp_mask |= __GFP_THISNODE;
+alloc_new:
 	return alloc_migrate_hugetlb_folio(h, gfp_mask, preferred_nid, nmask);
 }
 
@@ -6666,7 +6690,7 @@ static struct folio *alloc_hugetlb_folio_vma(struct hstate *h,
 
 	gfp_mask = htlb_alloc_mask(h);
 	node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask);
+	folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask, -1);
 	mpol_cond_put(mpol);
 
 	return folio;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 98ceb12e0e17..436e817eeaeb 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1228,7 +1228,7 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 		h = folio_hstate(src);
 		gfp = htlb_alloc_mask(h);
 		nodemask = policy_nodemask(gfp, pol, ilx, &nid);
-		return alloc_hugetlb_folio_nodemask(h, nid, nodemask, gfp);
+		return alloc_hugetlb_folio_nodemask(h, nid, nodemask, gfp, MR_MEMPOLICY_MBIND);
 	}
 
 	if (folio_test_large(src))
diff --git a/mm/migrate.c b/mm/migrate.c
index bde63010a3cf..0c2b70800da3 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2022,7 +2022,7 @@ struct folio *alloc_migration_target(struct folio *src, unsigned long private)
 
 		gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
 		return alloc_hugetlb_folio_nodemask(h, nid,
-						mtc->nmask, gfp_mask);
+						mtc->nmask, gfp_mask, mtc->reason);
 	}
 
 	if (folio_test_large(src)) {

From patchwork Wed Feb 21 09:27:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13565161
From: Baolin Wang
To: akpm@linux-foundation.org
Cc: muchun.song@linux.dev, osalvador@suse.de, david@redhat.com,
 linmiaohe@huawei.com, naoya.horiguchi@nec.com, mhocko@kernel.org,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [RFC PATCH 3/3] docs: hugetlbpage.rst: add hugetlb migration description
Date: Wed, 21 Feb 2024 17:27:55 +0800
Message-Id: <75b80937a84bd98211cea0607707bfdee8cb5873.1708507022.git.baolin.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.39.3

Add some description of the hugetlb migration strategy.

Signed-off-by: Baolin Wang
---
 Documentation/admin-guide/mm/hugetlbpage.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index e4d4b4a8dc97..68d7bc2165c9 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -376,6 +376,13 @@ Note that the number of overcommit and reserve pages remain global quantities,
 as we don't know until fault time, when the faulting task's mempolicy is
 applied, from which node the huge page allocation will be attempted.
 
+Hugetlb pages may be migrated between per-node hugepages pools in the following
+scenarios: memory offline, memory failure, longterm pinning, syscalls (mbind,
+migrate_pages, move_pages), alloc_contig_range() and alloc_contig_pages().
+Currently, only memory offline, memory failure and syscalls allow falling back
+to allocating a new hugetlb page on a different node when the current node
+cannot satisfy the allocation, so these 3 cases can break the per-node pools.
+
 .. _using_huge_pages:
 
 Using Huge Pages
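
One way to observe whether a migration changed the per-node pools is to
read the per-node hugepages counters that hugetlbpage.rst describes under
/sys/devices/system/node/node[0-9]*/hugepages/. The following is a minimal
userspace sketch, not part of this series: the sketch_* helper names are
hypothetical, and the 2048kB huge page size used in the note below is an
assumption that depends on the architecture and configuration.

```c
#include <stdio.h>

/*
 * Build the sysfs path of a per-node hugetlb counter, for example
 * node 0, 2048kB pages, counter "nr_hugepages".
 * Returns the snprintf() result.
 */
int sketch_node_counter_path(char *buf, size_t len, int nid,
			     unsigned long size_kb, const char *counter)
{
	return snprintf(buf, len,
			"/sys/devices/system/node/node%d/hugepages/hugepages-%lukB/%s",
			nid, size_kb, counter);
}

/* Read one unsigned counter from a sysfs file; return -1 on any error. */
long sketch_read_counter(const char *path)
{
	unsigned long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fscanf(f, "%lu", &val) != 1) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return (long)val;
}
```

Comparing nr_hugepages for each node (for instance with size_kb = 2048)
before and after a memory offline, memory failure or migration syscall
would show whether pages moved from one node's pool to another.
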