From patchwork Wed Mar 6 10:13:26 2024
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org
Cc: muchun.song@linux.dev, osalvador@suse.de, david@redhat.com,
 linmiaohe@huawei.com, naoya.horiguchi@nec.com, mhocko@kernel.org,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v2 1/3] mm: record the migration reason for struct
 migration_target_control
Date: Wed, 6 Mar 2024 18:13:26 +0800
Message-Id: <7b95d4981e07211f57139fc5b1f7ce91b920cee4.1709719720.git.baolin.wang@linux.alibaba.com>

To support different hugetlb allocation strategies during hugetlb migration
based on various migration reasons, record the migration reason in the
migration_target_control structure as a preparation.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
---
 mm/gup.c            | 1 +
 mm/internal.h       | 1 +
 mm/memory-failure.c | 1 +
 mm/memory_hotplug.c | 1 +
 mm/mempolicy.c      | 1 +
 mm/migrate.c        | 1 +
 mm/page_alloc.c     | 1 +
 mm/vmscan.c         | 3 ++-
 8 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index df83182ec72d..959a1a05b059 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2132,6 +2132,7 @@ static int migrate_longterm_unpinnable_pages(
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
 		.gfp_mask = GFP_USER | __GFP_NOWARN,
+		.reason = MR_LONGTERM_PIN,
 	};

 	if (migrate_pages(movable_page_list, alloc_migration_target,
diff --git a/mm/internal.h b/mm/internal.h
index 2b7efffbe4d7..47edf69b6ee6 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -959,6 +959,7 @@ struct migration_target_control {
 	int nid;		/* preferred node id */
 	nodemask_t *nmask;
 	gfp_t gfp_mask;
+	enum migrate_reason reason;
 };

 /*
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 9349948f1abf..780bb2aee0af 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2666,6 +2666,7 @@ static int soft_offline_in_use_page(struct page *page)
 	struct migration_target_control mtc = {
 		.nid = NUMA_NO_NODE,
 		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+		.reason = MR_MEMORY_FAILURE,
 	};

 	if (!huge && folio_test_large(folio)) {
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a444e2d7dd2b..b79ba36e09e0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1841,6 +1841,7 @@ static void do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
 	struct migration_target_control mtc = {
 		.nmask = &nmask,
 		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+		.reason = MR_MEMORY_HOTPLUG,
 	};
 	int ret;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index f60b4c99f130..98ceb12e0e17 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1070,6 +1070,7 @@ static long migrate_to_node(struct mm_struct *mm, int source, int dest,
 	struct migration_target_control mtc = {
 		.nid = dest,
 		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+		.reason = MR_SYSCALL,
 	};

 	nodes_clear(nmask);
diff --git a/mm/migrate.c b/mm/migrate.c
index 73a052a382f1..bde63010a3cf 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2060,6 +2060,7 @@ static int do_move_pages_to_node(struct list_head *pagelist, int node)
 	struct migration_target_control mtc = {
 		.nid = node,
 		.gfp_mask = GFP_HIGHUSER_MOVABLE | __GFP_THISNODE,
+		.reason = MR_SYSCALL,
 	};

 	err = migrate_pages(pagelist, alloc_migration_target, NULL,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 96839b210abe..8e6dd3a1028b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6266,6 +6266,7 @@ int __alloc_contig_migrate_range(struct compact_control *cc,
 	struct migration_target_control mtc = {
 		.nid = zone_to_nid(cc->zone),
 		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
+		.reason = MR_CONTIG_RANGE,
 	};

 	lru_cache_disable();
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 402c290fbf5a..510f438bb9e0 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -978,7 +978,8 @@ static unsigned int demote_folio_list(struct list_head *demote_folios,
 		.gfp_mask = (GFP_HIGHUSER_MOVABLE & ~__GFP_RECLAIM) | __GFP_NOWARN |
 			__GFP_NOMEMALLOC | GFP_NOWAIT,
 		.nid = target_nid,
-		.nmask = &allowed_mask
+		.nmask = &allowed_mask,
+		.reason = MR_DEMOTION,
 	};

 	if (list_empty(demote_folios))

From patchwork Wed Mar 6 10:13:27 2024
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org
Cc: muchun.song@linux.dev, osalvador@suse.de, david@redhat.com,
 linmiaohe@huawei.com, naoya.horiguchi@nec.com, mhocko@kernel.org,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v2 2/3] mm: hugetlb: make the hugetlb migration strategy
 consistent
Date: Wed, 6 Mar 2024 18:13:27 +0800
Message-Id: <3519fcd41522817307a05b40fb551e2e17e68101.1709719720.git.baolin.wang@linux.alibaba.com>

As discussed in a previous thread [1], there is an inconsistency in how
hugetlb migration is handled. When migrating a freed hugetlb,
alloc_and_dissolve_hugetlb_folio() prevents fallback to other NUMA nodes.
However, when migrating an in-use hugetlb, alloc_hugetlb_folio_nodemask()
allows fallback to other NUMA nodes, which can break the per-node hugetlb
pool and might result in unexpected failures when node-bound workloads
don't get what they assume is available.
To make the hugetlb migration strategy clearer, let's list all the hugetlb
migration scenarios and analyze whether allocation fallback is permitted in
each:

1) Memory offline: dissolve_free_huge_pages() frees the freed hugetlb, and
   do_migrate_range() migrates the in-use hugetlb. Both can break the
   per-node hugetlb pool, but as this is an explicit offlining operation
   there is no better choice, so allocation fallback should be allowed.

2) Memory failure: same as memory offline. Falling back to a different node
   might be the only option to handle it; otherwise the impact of the
   poisoned memory can be amplified.

3) Longterm pinning: migrate_longterm_unpinnable_pages() migrates in-use,
   not-longterm-pinnable hugetlb, which can break the per-node pool. The
   longterm pin should instead fail if allocation on the current node is
   not possible, to avoid breaking the per-node pool.

4) Syscalls (mbind, migrate_pages, move_pages): these are explicit user
   operations to move pages to other nodes, so fallback to other nodes
   should not be prohibited.

5) alloc_contig_range: used by CMA allocation and virtio-mem fake-offline
   to allocate a given range of pages. Freed hugetlb migration is already
   not allowed to fall back, so to keep consistency, in-use hugetlb
   migration should not be allowed to fall back either.

6) alloc_contig_pages: used by kfence, pgtable_debug, etc. The strategy
   should be consistent with that of alloc_contig_range().

Based on the analysis of the scenarios above, introduce a new helper that
determines whether fallback is permitted according to the migration reason.
[1] https://lore.kernel.org/all/6f26ce22d2fcd523418a085f2c588fe0776d46e7.1706794035.git.baolin.wang@linux.alibaba.com/

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
---
 include/linux/hugetlb.h | 35 +++++++++++++++++++++++++++++++++--
 mm/hugetlb.c            | 14 ++++++++++++--
 mm/mempolicy.c          |  3 ++-
 mm/migrate.c            |  3 ++-
 4 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 77b30a8c6076..e6723aaadc09 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -747,7 +747,8 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				unsigned long addr, int avoid_reserve);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
-				nodemask_t *nmask, gfp_t gfp_mask);
+				nodemask_t *nmask, gfp_t gfp_mask,
+				bool allow_alloc_fallback);
 int hugetlb_add_to_page_cache(struct folio *folio, struct address_space *mapping,
 			pgoff_t idx);
 void restore_reserve_on_error(struct hstate *h, struct vm_area_struct *vma,
@@ -970,6 +971,30 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
 	return modified_mask;
 }

+static inline bool htlb_allow_alloc_fallback(int reason)
+{
+	bool allowed_fallback = false;
+
+	/*
+	 * Note: the memory offline, memory failure and migration syscalls will
+	 * be allowed to fallback to other nodes due to lack of a better choice,
+	 * which might break the per-node hugetlb pool. The other cases will set
+	 * __GFP_THISNODE to avoid breaking the per-node hugetlb pool.
+	 */
+	switch (reason) {
+	case MR_MEMORY_HOTPLUG:
+	case MR_MEMORY_FAILURE:
+	case MR_SYSCALL:
+	case MR_MEMPOLICY_MBIND:
+		allowed_fallback = true;
+		break;
+	default:
+		break;
+	}
+
+	return allowed_fallback;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					struct mm_struct *mm, pte_t *pte)
 {
@@ -1065,7 +1090,8 @@ static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,

 static inline struct folio *
 alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
-			nodemask_t *nmask, gfp_t gfp_mask)
+			nodemask_t *nmask, gfp_t gfp_mask,
+			bool allow_alloc_fallback)
 {
 	return NULL;
 }
@@ -1181,6 +1207,11 @@ static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mask)
 	return 0;
 }

+static inline bool htlb_allow_alloc_fallback(int reason)
+{
+	return false;
+}
+
 static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 					struct mm_struct *mm, pte_t *pte)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 418d66953224..071218141fb2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2621,7 +2621,7 @@ struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,

 /* folio migration callback function */
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
-		nodemask_t *nmask, gfp_t gfp_mask)
+		nodemask_t *nmask, gfp_t gfp_mask, bool allow_alloc_fallback)
 {
 	spin_lock_irq(&hugetlb_lock);
 	if (available_huge_pages(h)) {
@@ -2636,6 +2636,10 @@ struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 	}
 	spin_unlock_irq(&hugetlb_lock);

+	/* We cannot fallback to other nodes, as we could break the per-node pool. */
+	if (!allow_alloc_fallback)
+		gfp_mask |= __GFP_THISNODE;
+
 	return alloc_migrate_hugetlb_folio(h, gfp_mask, preferred_nid, nmask);
 }

@@ -6653,7 +6657,13 @@ static struct folio *alloc_hugetlb_folio_vma(struct hstate *h,

 	gfp_mask = htlb_alloc_mask(h);
 	node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask);
+	/*
+	 * This is used to allocate a temporary hugetlb to hold the copied
+	 * content, which will then be copied again to the final hugetlb
+	 * consuming a reservation. Set the alloc_fallback to false to indicate
+	 * that breaking the per-node hugetlb pool is not allowed in this case.
+	 */
+	folio = alloc_hugetlb_folio_nodemask(h, node, nodemask, gfp_mask, false);
 	mpol_cond_put(mpol);

 	return folio;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 98ceb12e0e17..fe853452db9f 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1228,7 +1228,8 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 		h = folio_hstate(src);
 		gfp = htlb_alloc_mask(h);
 		nodemask = policy_nodemask(gfp, pol, ilx, &nid);
-		return alloc_hugetlb_folio_nodemask(h, nid, nodemask, gfp);
+		return alloc_hugetlb_folio_nodemask(h, nid, nodemask, gfp,
+				htlb_allow_alloc_fallback(MR_MEMPOLICY_MBIND));
 	}

 	if (folio_test_large(src))
diff --git a/mm/migrate.c b/mm/migrate.c
index bde63010a3cf..ab9856f5931b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2022,7 +2022,8 @@ struct folio *alloc_migration_target(struct folio *src, unsigned long private)

 		gfp_mask = htlb_modify_alloc_mask(h, gfp_mask);
 		return alloc_hugetlb_folio_nodemask(h, nid,
-				mtc->nmask, gfp_mask);
+				mtc->nmask, gfp_mask,
+				htlb_allow_alloc_fallback(mtc->reason));
 	}

 	if (folio_test_large(src)) {

From patchwork Wed Mar 6 10:13:28 2024
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org
Cc: muchun.song@linux.dev, osalvador@suse.de, david@redhat.com,
 linmiaohe@huawei.com, naoya.horiguchi@nec.com, mhocko@kernel.org,
 baolin.wang@linux.alibaba.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v2 3/3] docs: hugetlbpage.rst: add hugetlb migration description
Date: Wed, 6 Mar 2024 18:13:28 +0800
Message-Id: <63fb16e7a4ebc5cb69ce655af86e29b2d8e9ba34.1709719720.git.baolin.wang@linux.alibaba.com>

Add a description of the hugetlb migration strategy.
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 Documentation/admin-guide/mm/hugetlbpage.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index e4d4b4a8dc97..f34a0d798d5b 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -376,6 +376,13 @@ Note that the number of overcommit and reserve pages remain global quantities,
 as we don't know until fault time, when the faulting task's mempolicy is
 applied, from which node the huge page allocation will be attempted.

+Hugetlb pages may be migrated between the per-node hugepage pools in the
+following scenarios: memory offline, memory failure, longterm pinning, syscalls
+(mbind, migrate_pages and move_pages), alloc_contig_range() and
+alloc_contig_pages(). Currently only memory offline, memory failure and the
+syscalls may fall back and allocate a new hugetlb on a different node when the
+current node cannot, which means these three cases can break the per-node pool.
+
 .. _using_huge_pages:

 Using Huge Pages