From: yangge1116@126.com
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org,
 21cnbao@gmail.com, david@redhat.com, baolin.wang@linux.alibaba.com,
 muchun.song@linux.dev, osalvador@suse.de, liuzixing@hygon.cn, Ge Yang
Subject: [PATCH V3] mm/hugetlb: wait for hugetlb folios to be freed
Date: Tue, 18 Feb 2025 19:40:28 +0800
Message-Id: <1739878828-9960-1-git-send-email-yangge1116@126.com>
X-Mailer: git-send-email 2.7.4
X-Patchwork-Submitter: Ge Yang
X-Patchwork-Id: 13979681

From: Ge Yang

Since the introduction of commit c77c0a8ac4c52
("mm/hugetlb: defer freeing of huge pages if in non-task context"), which
supports deferring the freeing of hugetlb pages, the allocation of
contiguous memory through cma_alloc() may fail intermittently.

In the CMA allocation process, if the CMA area is found to be occupied by
in-use hugetlb folios, those folios need to be migrated to another
location. When no hugetlb folios are available in the free hugetlb pool
during this migration, new folios are allocated from the buddy system and
a temporary state is set on each newly allocated folio. Upon completion of
the hugetlb folio migration, the temporary state is transferred from the
new folios to the old folios. Normally, when the old folios with the
temporary state are freed, they are released directly back to the buddy
system. However, because the freeing of hugetlb pages may now be deferred,
the old folios may not yet be back in the buddy system when
test_pages_isolated() runs. The PageBuddy() check then fails, which
ultimately causes cma_alloc() to fail.

Here is a simplified call trace illustrating the process:
cma_alloc()
    ->__alloc_contig_migrate_range() // Migrate in-use hugetlb folios
        ->unmap_and_move_huge_page()
            ->folio_putback_hugetlb() // Free old folios
    ->test_pages_isolated()
        ->__test_page_isolated_in_pageblock()
             ->PageBuddy(page) // Check if the page is in buddy

To resolve this issue, add a function named
wait_for_freed_hugetlb_folios(). It waits until the hugetlb folios whose
freeing was deferred have been released back to the buddy system after
their migration is completed. Invoking wait_for_freed_hugetlb_folios()
before the PageBuddy() check ensures that the check no longer fails
because of a pending deferred free.

Fixes: c77c0a8ac4c52 ("mm/hugetlb: defer freeing of huge pages if in non-task context")
Signed-off-by: Ge Yang
Cc:
Acked-by: David Hildenbrand
---

V3:
- adjust code and message suggested by Muchun and David

V2:
- flush all folios at once suggested by David

 include/linux/hugetlb.h |  5 +++++
 mm/hugetlb.c            |  5 +++++
 mm/page_isolation.c     | 10 ++++++++++
 3 files changed, 20 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 6c6546b..0c54b3a 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -697,6 +697,7 @@ bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
 
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
+void wait_for_freed_hugetlb_folios(void);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				unsigned long addr, bool cow_from_owner);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1092,6 +1093,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
 	return 0;
 }
 
+static inline void wait_for_freed_hugetlb_folios(void)
+{
+}
+
 static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 					   unsigned long addr,
 					   bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 30bc34d..b4630b3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2955,6 +2955,11 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
 	return ret;
 }
 
+void wait_for_freed_hugetlb_folios(void)
+{
+	flush_work(&free_hpage_work);
+}
+
 typedef enum {
 	/*
 	 * For either 0/1: we checked the per-vma resv map, and one resv
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 8ed53ee0..b2fc526 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -615,6 +615,16 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 	int ret;
 
 	/*
+	 * Due to the deferred freeing of hugetlb folios, the hugepage folios may
+	 * not immediately release to the buddy system. This can cause PageBuddy()
+	 * to fail in __test_page_isolated_in_pageblock(). To ensure that the
+	 * hugetlb folios are properly released back to the buddy system, we
+	 * invoke the wait_for_freed_hugetlb_folios() function to wait for the
+	 * release to complete.
+	 */
+	wait_for_freed_hugetlb_folios();
+
+	/*
 	 * Note: pageblock_nr_pages != MAX_PAGE_ORDER. Then, chunks of free
 	 * pages are not aligned to pageblock_nr_pages.
 	 * Then we just check migratetype first.
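
The ordering problem described in the commit message can be illustrated
with a small single-threaded userspace model. This is only a sketch under
stated assumptions, not kernel code: every toy_* name is hypothetical, the
pending[] array stands in for the list drained by the deferred-free work
item, and the in_buddy flag stands in for PageBuddy(). It shows that the
isolation check fails while frees are still queued and passes once the
queue has been flushed first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. None of these names are kernel APIs. */
#define TOY_MAX_PENDING 8

struct toy_folio {
	bool in_buddy;	/* stands in for PageBuddy() */
};

/* Stands in for the list of folios whose freeing was deferred. */
static struct toy_folio *pending[TOY_MAX_PENDING];
static size_t nr_pending;

/* Deferred free: queue the folio instead of releasing it right away. */
static bool toy_free_deferred(struct toy_folio *folio)
{
	if (nr_pending == TOY_MAX_PENDING)
		return false;
	pending[nr_pending++] = folio;
	return true;
}

/* Stands in for flush_work(): run every queued free to completion. */
static void toy_wait_for_freed_folios(void)
{
	for (size_t i = 0; i < nr_pending; i++)
		pending[i]->in_buddy = true;
	nr_pending = 0;
}

/* Stands in for the isolation test: passes only if all folios are free. */
static bool toy_test_isolated(const struct toy_folio *folios, size_t n,
			      bool flush_first)
{
	if (flush_first)
		toy_wait_for_freed_folios();
	for (size_t i = 0; i < n; i++)
		if (!folios[i].in_buddy)
			return false;
	return true;
}

/* Walk through the failing sequence, then the fixed one. */
int toy_demo(void)
{
	struct toy_folio folios[2] = { { false }, { false } };
	bool queued, before, after;

	/* Migration is done: the old folios are freed, but only deferred. */
	queued = toy_free_deferred(&folios[0]) &&
		 toy_free_deferred(&folios[1]);
	/* Without a flush, the check sees folios that are not yet free. */
	before = toy_test_isolated(folios, 2, false);
	/* With the flush in front of the check, it passes. */
	after = toy_test_isolated(folios, 2, true);

	assert(queued);
	assert(!before);
	assert(after);
	return 0;
}
```

In this model the flush sits at the top of the check, which mirrors where
the patch places wait_for_freed_hugetlb_folios() in test_pages_isolated().
The real work item runs asynchronously on a workqueue, which the model
deliberately leaves out.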