From patchwork Wed Mar 16 02:26:49 2022
Submitter: luofei <luofei@unicloud.com>
From: luofei <luofei@unicloud.com>
To: Mike Kravetz, Muchun Song
CC: Andrew Morton, Linux Memory Management List, LKML
Subject: Re: [PATCH] hugetlbfs: fix description about atomic allocation of vmemmap pages when free huge page
Date: Wed, 16 Mar 2022 02:26:49 +0000
Message-ID: <380ea251d3084917aaaf6cc973a857e1@unicloud.com>
References: <20220315042355.362810-1-luofei@unicloud.com>

>>> No matter what context update_and_free_page() is called in,
>>> the flags used to allocate the vmemmap pages are fixed
>>> (GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE), and no atomic
>>> allocation is involved, so the description of atomicity here
>>> is somewhat inappropriate.
>>>
>>> Also, the "atomic" parameter name of update_and_free_page() is
>>> somewhat misleading.
>>> Signed-off-by: luofei
>>> ---
>>>  mm/hugetlb.c | 10 ++++------
>>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index f8ca7cca3c1a..239ef82b7897 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -1570,8 +1570,8 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>>>
>>>  /*
>>>   * As update_and_free_page() can be called under any context, so we cannot
>>> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
>>> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
>>> + * use GFP_ATOMIC to allocate vmemmap pages. However, we can defer the
>>> + * actual freeing in a workqueue to prevent waits caused by allocating
>>>   * the vmemmap pages.
>>>   *
>>>   * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>>> @@ -1617,16 +1617,14 @@ static inline void flush_free_hpage_work(struct hstate *h)
>>>  }
>>>
>>>  static void update_and_free_page(struct hstate *h, struct page *page,
>>> -				 bool atomic)
>>> +				 bool delay)
>>
>> Hi luofei,
>>
>> At least, I don't agree with this change. "atomic" means that the
>> caller is in atomic context, not that an atomic GFP mask is used.
>> "delay" suggests the caller could choose not to delay the freeing
>> even in atomic context (in fact, it has no choice), whereas "atomic"
>> makes clear that the caller is being asked to tell us whether it is
>> in atomic context.
>
> There may be some confusion, since GFP_ATOMIC is mentioned in the
> comments but is not actually used to allocate the vmemmap pages. IIRC,
> the use of GFP_ATOMIC was discussed at one time but dismissed because
> of undesired side effects such as dipping into atomic reserves.
>
> How about an update to the comments as follows (sorry, my mailer may
> mess up the formatting)?
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f8ca7cca3c1a..6a4d27e24b21 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1569,10 +1569,12 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
>  }
>
>  /*
> - * As update_and_free_page() can be called under any context, so we cannot
> - * use GFP_KERNEL to allocate vmemmap pages. However, we can defer the
> - * actual freeing in a workqueue to prevent from using GFP_ATOMIC to allocate
> - * the vmemmap pages.
> + * Freeing hugetlb pages is done in update_and_free_page(). When freeing a
> + * hugetlb page, vmemmap pages may need to be allocated. The routine
> + * alloc_huge_page_vmemmap() can possibly sleep as it uses GFP_KERNEL.
> + * However, update_and_free_page() can be called under any context. To
> + * avoid the possibility of sleeping in a context where sleeping is not
> + * allowed, defer the actual freeing in a workqueue where sleeping is allowed.
>  *
>  * free_hpage_workfn() locklessly retrieves the linked list of pages to be
>  * freed and frees them one-by-one. As the page->mapping pointer is going
> @@ -1616,6 +1618,10 @@ static inline void flush_free_hpage_work(struct hstate *h)
>  	flush_work(&free_hpage_work);
>  }
>
> +/*
> + * atomic == true indicates called from a context where sleeping is
> + * not allowed.
> + */
>  static void update_and_free_page(struct hstate *h, struct page *page,
>  				 bool atomic)
>  {
> @@ -1625,7 +1631,8 @@ static void update_and_free_page(struct hstate *h, struct page *page,
>  	}
>
>  	/*
> -	 * Defer freeing to avoid using GFP_ATOMIC to allocate vmemmap pages.
> +	 * Defer freeing to avoid possible sleeping when allocating
> +	 * vmemmap pages.
>  	 *
>  	 * Only call schedule_work() if hpage_freelist is previously
>  	 * empty. Otherwise, schedule_work() had been called but the workfn

The comments made it clearer for me. I will revise the patch.
Thanks :)