From patchwork Mon Nov 27 08:46:42 2023
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 13469264
From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH 1/4] mm: pagewalk: assert write mmap lock only for walking the user page tables
Date: Mon, 27 Nov 2023 16:46:42 +0800
Message-Id: <20231127084645.27017-2-songmuchun@bytedance.com>
In-Reply-To: <20231127084645.27017-1-songmuchun@bytedance.com>
References: <20231127084645.27017-1-songmuchun@bytedance.com>

Commit 8782fb61cc848 ("mm: pagewalk: Fix race between unmap and page
walker") introduced an assertion in walk_page_range_novma() to make sure
all users of the page table walker are safe. However, the race only
exists when walking the user page tables, and it makes no sense to hold
a particular user's mmap write lock against changes to the kernel page
tables. So only assert that at least the mmap read lock is held when
walking the kernel page tables. Users matching this case can then
downgrade to an mmap read lock to relieve contention on the mmap lock of
init_mm; the next patch does this for hugetlb, which will then only hold
the mmap read lock.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/pagewalk.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index b7d7e4fcfad7a..f46c80b18ce4f 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -539,6 +539,11 @@ int walk_page_range(struct mm_struct *mm, unsigned long start,
  * not backed by VMAs. Because 'unusual' entries may be walked this function
  * will also not lock the PTEs for the pte_entry() callback. This is useful for
  * walking the kernel pages tables or page tables for firmware.
+ *
+ * Note: Be careful when walking the kernel page tables: the caller may need to
+ * take other effective approaches (the mmap lock may be insufficient) to
+ * prevent the intermediate kernel page tables belonging to the specified
+ * address range from being freed (e.g. by memory hot-remove).
  */
 int walk_page_range_novma(struct mm_struct *mm, unsigned long start,
 			  unsigned long end, const struct mm_walk_ops *ops,
@@ -556,7 +561,29 @@ int walk_page_range_novma(struct mm_struct *mm, unsigned long start,
 	if (start >= end || !walk.mm)
 		return -EINVAL;
 
-	mmap_assert_write_locked(walk.mm);
+	/*
+	 * 1) For walking the user virtual address space:
+	 *
+	 * The mmap lock protects the page walker from changes to the page
+	 * tables during the walk. However a read lock is insufficient to
+	 * protect those areas which don't have a VMA as munmap() detaches
+	 * the VMAs before downgrading to a read lock and actually tearing
+	 * down PTEs/page tables, in which case the mmap write lock should
+	 * be held.
+	 *
+	 * 2) For walking the kernel virtual address space:
+	 *
+	 * The kernel intermediate page tables are usually not freed, so
+	 * the mmap read lock is sufficient. But there are some exceptions,
+	 * e.g. memory hot-remove, in which case the mmap lock is insufficient
+	 * to prevent the intermediate kernel page tables belonging to the
+	 * specified address range from being freed. The caller should take
+	 * other actions to prevent this race.
+	 */
+	if (mm == &init_mm)
+		mmap_assert_locked(walk.mm);
+	else
+		mmap_assert_write_locked(walk.mm);
 
 	return walk_pgd_range(start, end, &walk);
 }
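
To illustrate the relaxed assertion, here is a minimal sketch of a hypothetical caller (not part of this series) that walks a kernel virtual address range while holding only the read lock of init_mm, which this patch now permits. Per the note above, the caller is assumed to prevent the range from being hot-removed by other means:

#include <linux/pagewalk.h>
#include <linux/mmap_lock.h>

static int kernel_pte_entry(pte_t *pte, unsigned long addr,
			    unsigned long next, struct mm_walk *walk)
{
	/* Inspect one kernel PTE; novma walks take no PTE lock. */
	return 0;
}

static const struct mm_walk_ops kernel_walk_ops = {
	.pte_entry = kernel_pte_entry,
};

static int walk_kernel_range(unsigned long start, unsigned long end)
{
	int ret;

	/* A read lock now satisfies the assertion for init_mm. */
	mmap_read_lock(&init_mm);
	ret = walk_page_range_novma(&init_mm, start, end,
				    &kernel_walk_ops, NULL, NULL);
	mmap_read_unlock(&init_mm);

	return ret;
}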
From patchwork Mon Nov 27 08:46:43 2023
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 13469265
From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH 2/4] mm: hugetlb_vmemmap: use walk_page_range_novma() to simplify the code
Date: Mon, 27 Nov 2023 16:46:43 +0800
Message-Id: <20231127084645.27017-3-songmuchun@bytedance.com>
In-Reply-To: <20231127084645.27017-1-songmuchun@bytedance.com>
References: <20231127084645.27017-1-songmuchun@bytedance.com>

It is unnecessary to implement a series of dedicated page table walking
helpers since there is already a general one, walk_page_range_novma().
So use it to simplify the code.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/hugetlb_vmemmap.c | 148 ++++++++++++-------------------------------
 1 file changed, 39 insertions(+), 109 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 87818ee7f01d7..ef14356855d13 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include <linux/pagewalk.h>
 #include
 #include
 #include "hugetlb_vmemmap.h"
@@ -45,21 +46,14 @@ struct vmemmap_remap_walk {
 	unsigned long		flags;
 };
 
-static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start, bool flush)
+static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
+			     struct vmemmap_remap_walk *walk)
 {
 	pmd_t __pmd;
 	int i;
 	unsigned long addr = start;
-	struct page *head;
 	pte_t *pgtable;
 
-	spin_lock(&init_mm.page_table_lock);
-	head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
-	spin_unlock(&init_mm.page_table_lock);
-
-	if (!head)
-		return 0;
-
 	pgtable = pte_alloc_one_kernel(&init_mm);
 	if (!pgtable)
 		return -ENOMEM;
@@ -88,7 +82,7 @@ static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start, bool flush)
 		/* Make pte visible before pmd. See comment in pmd_install(). */
 		smp_wmb();
 		pmd_populate_kernel(&init_mm, pmd, pgtable);
-		if (flush)
+		if (!(walk->flags & VMEMMAP_SPLIT_NO_TLB_FLUSH))
 			flush_tlb_kernel_range(start, start + PMD_SIZE);
 	} else {
 		pte_free_kernel(&init_mm, pgtable);
@@ -98,123 +92,59 @@ static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start, bool flush)
 	return 0;
 }
 
-static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
-			      unsigned long end,
-			      struct vmemmap_remap_walk *walk)
-{
-	pte_t *pte = pte_offset_kernel(pmd, addr);
-
-	/*
-	 * The reuse_page is found 'first' in table walk before we start
-	 * remapping (which is calling @walk->remap_pte).
-	 */
-	if (!walk->reuse_page) {
-		walk->reuse_page = pte_page(ptep_get(pte));
-		/*
-		 * Because the reuse address is part of the range that we are
-		 * walking, skip the reuse address range.
-		 */
-		addr += PAGE_SIZE;
-		pte++;
-		walk->nr_walked++;
-	}
-
-	for (; addr != end; addr += PAGE_SIZE, pte++) {
-		walk->remap_pte(pte, addr, walk);
-		walk->nr_walked++;
-	}
-}
-
-static int vmemmap_pmd_range(pud_t *pud, unsigned long addr,
-			     unsigned long end,
-			     struct vmemmap_remap_walk *walk)
+static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
+			     unsigned long next, struct mm_walk *walk)
 {
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_offset(pud, addr);
-	do {
-		int ret;
-
-		ret = split_vmemmap_huge_pmd(pmd, addr & PMD_MASK,
-				!(walk->flags & VMEMMAP_SPLIT_NO_TLB_FLUSH));
-		if (ret)
-			return ret;
+	struct page *head;
+	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
 
-		next = pmd_addr_end(addr, end);
+	/* Only splitting, not remapping the vmemmap pages. */
+	if (!vmemmap_walk->remap_pte)
+		walk->action = ACTION_CONTINUE;
 
-		/*
-		 * We are only splitting, not remapping the hugetlb vmemmap
-		 * pages.
-		 */
-		if (!walk->remap_pte)
-			continue;
-
-		vmemmap_pte_range(pmd, addr, next, walk);
-	} while (pmd++, addr = next, addr != end);
+	spin_lock(&init_mm.page_table_lock);
+	head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
+	spin_unlock(&init_mm.page_table_lock);
+	if (!head)
+		return 0;
 
-	return 0;
+	return vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
 }
 
-static int vmemmap_pud_range(p4d_t *p4d, unsigned long addr,
-			     unsigned long end,
-			     struct vmemmap_remap_walk *walk)
+static int vmemmap_pte_entry(pte_t *pte, unsigned long addr,
+			     unsigned long next, struct mm_walk *walk)
 {
-	pud_t *pud;
-	unsigned long next;
-
-	pud = pud_offset(p4d, addr);
-	do {
-		int ret;
+	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
 
-		next = pud_addr_end(addr, end);
-		ret = vmemmap_pmd_range(pud, addr, next, walk);
-		if (ret)
-			return ret;
-	} while (pud++, addr = next, addr != end);
+	/*
+	 * The reuse_page is found 'first' in page table walking before
+	 * starting remapping.
+	 */
+	if (!vmemmap_walk->reuse_page)
+		vmemmap_walk->reuse_page = pte_page(ptep_get(pte));
+	else
+		vmemmap_walk->remap_pte(pte, addr, vmemmap_walk);
+	vmemmap_walk->nr_walked++;
 
 	return 0;
 }
 
-static int vmemmap_p4d_range(pgd_t *pgd, unsigned long addr,
-			     unsigned long end,
-			     struct vmemmap_remap_walk *walk)
-{
-	p4d_t *p4d;
-	unsigned long next;
-
-	p4d = p4d_offset(pgd, addr);
-	do {
-		int ret;
-
-		next = p4d_addr_end(addr, end);
-		ret = vmemmap_pud_range(p4d, addr, next, walk);
-		if (ret)
-			return ret;
-	} while (p4d++, addr = next, addr != end);
-
-	return 0;
-}
+static const struct mm_walk_ops vmemmap_remap_ops = {
+	.pmd_entry	= vmemmap_pmd_entry,
+	.pte_entry	= vmemmap_pte_entry,
+};
 
 static int vmemmap_remap_range(unsigned long start, unsigned long end,
 			       struct vmemmap_remap_walk *walk)
 {
-	unsigned long addr = start;
-	unsigned long next;
-	pgd_t *pgd;
-
-	VM_BUG_ON(!PAGE_ALIGNED(start));
-	VM_BUG_ON(!PAGE_ALIGNED(end));
+	int ret;
 
-	pgd = pgd_offset_k(addr);
-	do {
-		int ret;
+	VM_BUG_ON(!PAGE_ALIGNED(start | end));
 
-		next = pgd_addr_end(addr, end);
-		ret = vmemmap_p4d_range(pgd, addr, next, walk);
-		if (ret)
-			return ret;
-	} while (pgd++, addr = next, addr != end);
+	ret = walk_page_range_novma(&init_mm, start, end, &vmemmap_remap_ops,
+				    NULL, walk);
+	if (ret)
+		return ret;
 
 	if (walk->remap_pte && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_kernel_range(start, end);
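
One detail of the mm_walk_ops contract that the conversion relies on: setting walk->action to ACTION_CONTINUE from a pmd_entry callback tells the core walker not to descend into the PTEs under that PMD, which is how the split-only pass above (remap_pte == NULL) skips the per-PTE callbacks. A minimal hypothetical walker (not from this series) showing the same pattern:

#include <linux/pagewalk.h>

struct count_walk {
	unsigned long leaf_pmds;
	unsigned long ptes;
};

static int count_pmd_entry(pmd_t *pmd, unsigned long addr,
			   unsigned long next, struct mm_walk *walk)
{
	struct count_walk *cw = walk->private;

	if (pmd_leaf(*pmd)) {
		cw->leaf_pmds++;
		/* Leaf PMD: no PTE level below it, skip pte_entry calls. */
		walk->action = ACTION_CONTINUE;
	}
	return 0;
}

static int count_pte_entry(pte_t *pte, unsigned long addr,
			   unsigned long next, struct mm_walk *walk)
{
	struct count_walk *cw = walk->private;

	cw->ptes++;
	return 0;
}

static const struct mm_walk_ops count_walk_ops = {
	.pmd_entry	= count_pmd_entry,
	.pte_entry	= count_pte_entry,
};

Passing a struct count_walk as the private argument of walk_page_range_novma() would then tally leaf PMDs and individual PTEs in a single pass.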
From patchwork Mon Nov 27 08:46:44 2023
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 13469266
From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH 3/4] mm: hugetlb_vmemmap: move PageVmemmapSelfHosted() check to split_vmemmap_huge_pmd()
Date: Mon, 27 Nov 2023 16:46:44 +0800
Message-Id: <20231127084645.27017-4-songmuchun@bytedance.com>
In-Reply-To: <20231127084645.27017-1-songmuchun@bytedance.com>
References: <20231127084645.27017-1-songmuchun@bytedance.com>

Checking whether a page is self-hosted requires traversing the page
table (e.g. via pmd_off_k()), but the subsequent call to
vmemmap_remap_range() already does that walk. Moving the
PageVmemmapSelfHosted() check into vmemmap_pmd_entry() therefore
simplifies the code a bit.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/hugetlb_vmemmap.c | 70 +++++++++++++++-----------------------------
 1 file changed, 24 insertions(+), 46 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index ef14356855d13..ce920ca6c90ee 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -95,6 +95,7 @@ static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 			     unsigned long next, struct mm_walk *walk)
 {
+	int ret = 0;
 	struct page *head;
 	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
 
@@ -104,9 +105,30 @@ static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 	spin_lock(&init_mm.page_table_lock);
 	head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
+	/*
+	 * Due to HugeTLB alignment requirements and the vmemmap
+	 * pages being at the start of the hotplugged memory
+	 * region in the memory_hotplug.memmap_on_memory case,
+	 * checking whether the vmemmap page associated with the
+	 * first vmemmap page is self-hosted is sufficient.
+	 *
+	 * [ hotplugged memory ]
+	 * [ section ][...][ section ]
+	 * [ vmemmap ][ usable memory ]
+	 *   ^  |      ^             |
+	 *   +--+      |             |
+	 *             +-------------+
+	 */
+	if (unlikely(!vmemmap_walk->nr_walked)) {
+		struct page *page = head ? head + pte_index(addr) :
+				pte_page(ptep_get(pte_offset_kernel(pmd, addr)));
+
+		if (PageVmemmapSelfHosted(page))
+			ret = -ENOTSUPP;
+	}
 	spin_unlock(&init_mm.page_table_lock);
-	if (!head)
-		return 0;
+	if (!head || ret)
+		return ret;
 
 	return vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
 }
 
@@ -524,50 +546,6 @@ static bool vmemmap_should_optimize(const struct hstate *h, const struct page *head)
 	if (!hugetlb_vmemmap_optimizable(h))
 		return false;
 
-	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG)) {
-		pmd_t *pmdp, pmd;
-		struct page *vmemmap_page;
-		unsigned long vaddr = (unsigned long)head;
-
-		/*
-		 * Only the vmemmap page's vmemmap page can be self-hosted.
-		 * Walking the page tables to find the backing page of the
-		 * vmemmap page.
-		 */
-		pmdp = pmd_off_k(vaddr);
-		/*
-		 * The READ_ONCE() is used to stabilize *pmdp in a register or
-		 * on the stack so that it will stop changing under the code.
-		 * The only concurrent operation where it can be changed is
-		 * split_vmemmap_huge_pmd() (*pmdp will be stable after this
-		 * operation).
-		 */
-		pmd = READ_ONCE(*pmdp);
-		if (pmd_leaf(pmd))
-			vmemmap_page = pmd_page(pmd) + pte_index(vaddr);
-		else
-			vmemmap_page = pte_page(*pte_offset_kernel(pmdp, vaddr));
-		/*
-		 * Due to HugeTLB alignment requirements and the vmemmap pages
-		 * being at the start of the hotplugged memory region in
-		 * memory_hotplug.memmap_on_memory case. Checking any vmemmap
-		 * page's vmemmap page if it is marked as VmemmapSelfHosted is
-		 * sufficient.
-		 *
-		 * [ hotplugged memory ]
-		 * [ section ][...][ section ]
-		 * [ vmemmap ][ usable memory ]
-		 *  ^   |        |                           |
-		 *  +---+        |                           |
-		 *    ^          |                           |
-		 *    +----------+                           |
-		 *      ^                                    |
-		 *      +------------------------------------+
-		 */
-		if (PageVmemmapSelfHosted(vmemmap_page))
-			return false;
-	}
-
 	return true;
 }
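
The lookup that the moved check performs is compact enough to be easy to miss; a sketch of just that step, as a hypothetical helper mirroring the expressions in vmemmap_pmd_entry() (assumed to run under init_mm.page_table_lock so the PMD cannot be split concurrently):

static struct page *page_backing_addr(pmd_t *pmd, unsigned long addr)
{
	/*
	 * For a leaf PMD the backing struct page lies at a pte_index()
	 * offset into the PMD-sized mapping; otherwise go through the
	 * PTE that maps the address.
	 */
	if (pmd_leaf(*pmd))
		return pmd_page(*pmd) + pte_index(addr);

	return pte_page(ptep_get(pte_offset_kernel(pmd, addr)));
}

Because only the first vmemmap page of a hotplugged region can be self-hosted, performing this lookup once per walk (nr_walked == 0) is sufficient.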
From patchwork Mon Nov 27 08:46:45 2023
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 13469267
From: Muchun Song <songmuchun@bytedance.com>
To: mike.kravetz@oracle.com, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH 4/4] mm: hugetlb_vmemmap: convert page to folio
Date: Mon, 27 Nov 2023 16:46:45 +0800
Message-Id: <20231127084645.27017-5-songmuchun@bytedance.com>
In-Reply-To: <20231127084645.27017-1-songmuchun@bytedance.com>
References: <20231127084645.27017-1-songmuchun@bytedance.com>

There are still some places that have not been converted to use folios;
convert all of them. Also do some trivial cleanups to fix code style
problems.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/hugetlb_vmemmap.c | 51 ++++++++++++++++++++++----------------------
 1 file changed, 25 insertions(+), 26 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index ce920ca6c90ee..54f388aa361fb 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -280,7 +280,7 @@ static void vmemmap_restore_pte(pte_t *pte, unsigned long addr,
  * Return: %0 on success, negative error code otherwise.
  */
 static int vmemmap_remap_split(unsigned long start, unsigned long end,
-			unsigned long reuse)
+			       unsigned long reuse)
 {
 	int ret;
 	struct vmemmap_remap_walk walk = {
@@ -447,14 +447,14 @@ EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
 static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
 core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);
 
-static int __hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio, unsigned long flags)
+static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
+					   struct folio *folio, unsigned long flags)
 {
 	int ret;
-	struct page *head = &folio->page;
-	unsigned long vmemmap_start = (unsigned long)head, vmemmap_end;
+	unsigned long vmemmap_start = (unsigned long)&folio->page, vmemmap_end;
 	unsigned long vmemmap_reuse;
 
-	VM_WARN_ON_ONCE(!PageHuge(head));
+	VM_WARN_ON_ONCE_FOLIO(!folio_test_hugetlb(folio), folio);
 	if (!folio_test_hugetlb_vmemmap_optimized(folio))
 		return 0;
@@ -517,7 +517,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 	list_for_each_entry_safe(folio, t_folio, folio_list, lru) {
 		if (folio_test_hugetlb_vmemmap_optimized(folio)) {
 			ret = __hugetlb_vmemmap_restore_folio(h, folio,
-						VMEMMAP_REMAP_NO_TLB_FLUSH);
+							      VMEMMAP_REMAP_NO_TLB_FLUSH);
 			if (ret)
 				break;
 			restored++;
@@ -535,9 +535,9 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 }
 
 /* Return true iff a HugeTLB whose vmemmap should and can be optimized. */
-static bool vmemmap_should_optimize(const struct hstate *h, const struct page *head)
+static bool vmemmap_should_optimize_folio(const struct hstate *h, struct folio *folio)
 {
-	if (HPageVmemmapOptimized((struct page *)head))
+	if (folio_test_hugetlb_vmemmap_optimized(folio))
 		return false;
 
 	if (!READ_ONCE(vmemmap_optimize_enabled))
@@ -550,17 +550,16 @@ static bool vmemmap_should_optimize(const struct hstate *h, const struct page *head)
 }
 
 static int __hugetlb_vmemmap_optimize_folio(const struct hstate *h,
-					struct folio *folio,
-					struct list_head *vmemmap_pages,
-					unsigned long flags)
+					    struct folio *folio,
+					    struct list_head *vmemmap_pages,
+					    unsigned long flags)
 {
 	int ret = 0;
-	struct page *head = &folio->page;
-	unsigned long vmemmap_start = (unsigned long)head, vmemmap_end;
+	unsigned long vmemmap_start = (unsigned long)&folio->page, vmemmap_end;
 	unsigned long vmemmap_reuse;
 
-	VM_WARN_ON_ONCE(!PageHuge(head));
-	if (!vmemmap_should_optimize(h, head))
+	VM_WARN_ON_ONCE_FOLIO(!folio_test_hugetlb(folio), folio);
+	if (!vmemmap_should_optimize_folio(h, folio))
 		return ret;
 
 	static_branch_inc(&hugetlb_optimize_vmemmap_key);
@@ -588,7 +587,7 @@ static int __hugetlb_vmemmap_optimize_folio(const struct hstate *h,
 	 * the caller.
 	 */
 	ret = vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse,
-					vmemmap_pages, flags);
+				 vmemmap_pages, flags);
 	if (ret) {
 		static_branch_dec(&hugetlb_optimize_vmemmap_key);
 		folio_clear_hugetlb_vmemmap_optimized(folio);
@@ -615,12 +614,12 @@ void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
-static int hugetlb_vmemmap_split(const struct hstate *h, struct page *head)
+static int hugetlb_vmemmap_split_folio(const struct hstate *h, struct folio *folio)
 {
-	unsigned long vmemmap_start = (unsigned long)head, vmemmap_end;
+	unsigned long vmemmap_start = (unsigned long)&folio->page, vmemmap_end;
 	unsigned long vmemmap_reuse;
 
-	if (!vmemmap_should_optimize(h, head))
+	if (!vmemmap_should_optimize_folio(h, folio))
 		return 0;
 
 	vmemmap_end = vmemmap_start + hugetlb_vmemmap_size(h);
@@ -640,7 +639,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 	LIST_HEAD(vmemmap_pages);
 
 	list_for_each_entry(folio, folio_list, lru) {
-		int ret = hugetlb_vmemmap_split(h, &folio->page);
+		int ret = hugetlb_vmemmap_split_folio(h, folio);
 
 		/*
 		 * Spliting the PMD requires allocating a page, thus lets fail
@@ -655,9 +654,10 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 	flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
-		int ret = __hugetlb_vmemmap_optimize_folio(h, folio,
-						&vmemmap_pages,
-						VMEMMAP_REMAP_NO_TLB_FLUSH);
+		int ret;
+
+		ret = __hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
+						       VMEMMAP_REMAP_NO_TLB_FLUSH);
 
 		/*
 		 * Pages to be freed may have been accumulated. If we
@@ -671,9 +671,8 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 			flush_tlb_all();
 			free_vmemmap_page_list(&vmemmap_pages);
 			INIT_LIST_HEAD(&vmemmap_pages);
-			__hugetlb_vmemmap_optimize_folio(h, folio,
-						&vmemmap_pages,
-						VMEMMAP_REMAP_NO_TLB_FLUSH);
+			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
+							 VMEMMAP_REMAP_NO_TLB_FLUSH);
 		}
 	}
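
The conversion applies one substitution pattern throughout the file; a condensed sketch of it (the wrapper function is hypothetical, the helpers are those used in the patch):

static bool folio_vmemmap_range(const struct hstate *h, struct folio *folio,
				unsigned long *start, unsigned long *end)
{
	/* Was: VM_WARN_ON_ONCE(!PageHuge(head)) on a bare struct page. */
	VM_WARN_ON_ONCE_FOLIO(!folio_test_hugetlb(folio), folio);

	/* Was: HPageVmemmapOptimized((struct page *)head). */
	if (folio_test_hugetlb_vmemmap_optimized(folio))
		return false;

	/* The vmemmap range still starts at the folio's first page. */
	*start = (unsigned long)&folio->page;
	*end = *start + hugetlb_vmemmap_size(h);

	return true;
}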