From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org,
    muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
    rppt@kernel.org, vishal.moola@gmail.com, peterx@redhat.com,
    ryan.roberts@arm.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
    Qi Zheng
Subject: [PATCH 00/14] introduce pte_offset_map_{readonly|maywrite}_nolock()
Date: Wed, 21 Aug 2024 16:18:43 +0800
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13770983

Hi all,

As proposed by David Hildenbrand [1], this series introduces the
following two new helper functions to replace pte_offset_map_nolock().

1. pte_offset_map_readonly_nolock()
2. pte_offset_map_maywrite_nolock()

As the name suggests, pte_offset_map_readonly_nolock() is used for the
read-only case.
In this case, only read-only operations will be performed on the PTE
page after the PTL is held. The RCU lock in pte_offset_map_nolock()
ensures that the PTE page will not be freed, and there is no need to
worry about whether the pmd entry is modified. Therefore
pte_offset_map_readonly_nolock() is just a renamed version of
pte_offset_map_nolock().

pte_offset_map_maywrite_nolock() is used for the may-write case. In this
case, the pte or pmd entry may be modified after the PTL is held, so we
need to ensure that the pmd entry has not been modified concurrently. So
in addition to the name change, it also outputs the pmdval on success.
This lets the caller recheck *pmd once the PTL is taken. In some cases
NULL can be passed as pmdvalp: holding the mmap_lock for write, or a
pte_same() check on the contents, is also enough to ensure that the pmd
entry is stable.

This series converts all pte_offset_map_nolock() callers to the above
two helper functions one by one, and finally deletes
pte_offset_map_nolock() completely. This is also preparation for
reclaiming empty user PTE page table pages.

This series is based on next-20240820.

Comments and suggestions are welcome!

Thanks,
Qi

[1].
https://lore.kernel.org/lkml/f79bbfc9-bb4c-4da4-9902-2e73817dd135@redhat.com/

Qi Zheng (14):
  mm: pgtable: introduce pte_offset_map_{readonly|maywrite}_nolock()
  arm: adjust_pte() use pte_offset_map_maywrite_nolock()
  powerpc: assert_pte_locked() use pte_offset_map_readonly_nolock()
  mm: filemap: filemap_fault_recheck_pte_none() use
    pte_offset_map_readonly_nolock()
  mm: khugepaged: __collapse_huge_page_swapin() use
    pte_offset_map_readonly_nolock()
  mm: handle_pte_fault() use pte_offset_map_maywrite_nolock()
  mm: khugepaged: collapse_pte_mapped_thp() use
    pte_offset_map_maywrite_nolock()
  mm: copy_pte_range() use pte_offset_map_maywrite_nolock()
  mm: mremap: move_ptes() use pte_offset_map_maywrite_nolock()
  mm: page_vma_mapped_walk: map_pte() use
    pte_offset_map_maywrite_nolock()
  mm: userfaultfd: move_pages_pte() use pte_offset_map_maywrite_nolock()
  mm: multi-gen LRU: walk_pte_range() use
    pte_offset_map_maywrite_nolock()
  mm: pgtable: remove pte_offset_map_nolock()
  mm: khugepaged: retract_page_tables() use
    pte_offset_map_maywrite_nolock()

 Documentation/mm/split_page_table_lock.rst |  6 +++-
 arch/arm/mm/fault-armv.c                   |  9 ++++-
 arch/powerpc/mm/pgtable.c                  |  2 +-
 include/linux/mm.h                         |  7 ++--
 mm/filemap.c                               |  4 +--
 mm/khugepaged.c                            | 39 ++++++++++++++++++--
 mm/memory.c                                | 13 +++++--
 mm/mremap.c                                |  7 +++-
 mm/page_vma_mapped.c                       | 24 ++++++++++---
 mm/pgtable-generic.c                       | 42 ++++++++++++++++------
 mm/userfaultfd.c                           | 12 +++++--
 mm/vmscan.c                                |  9 ++++-
 12 files changed, 143 insertions(+), 31 deletions(-)
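
For readers who want a concrete picture of the may-write case described
in this cover letter, here is a minimal userspace sketch of the
"snapshot pmdval, take the PTL, recheck *pmd" pattern. It is only a
model under stated assumptions, not kernel code: every model_* name is
hypothetical, the lock is a plain flag rather than a spinlock, and the
"racer" callback merely simulates a concurrent change to the pmd entry
in the window before the lock is taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
typedef struct { uint64_t val; } model_pmd_t;
struct model_ptl { bool held; };

static void model_lock(struct model_ptl *l)   { l->held = true; }
static void model_unlock(struct model_ptl *l) { l->held = false; }

/*
 * Model of a "maywrite" lookup: fails if the pmd entry is empty,
 * otherwise reports which lock to take and, when pmdvalp is non-NULL,
 * the pmd value that was observed before any lock was held.
 */
static bool model_map_maywrite_nolock(const model_pmd_t *pmd,
                                      model_pmd_t *pmdvalp,
                                      struct model_ptl *table_lock,
                                      struct model_ptl **ptlp)
{
    model_pmd_t snap = *pmd;

    if (snap.val == 0)
        return false;
    if (pmdvalp)
        *pmdvalp = snap;
    *ptlp = table_lock;
    return true;
}

/*
 * Caller pattern: look up without the lock, take the lock, then
 * recheck the pmd entry against the snapshot before writing.
 */
static bool model_try_update(model_pmd_t *pmd, struct model_ptl *table_lock,
                             void (*racer)(model_pmd_t *),
                             int *pte, int newval)
{
    model_pmd_t pmdval;
    struct model_ptl *ptl;

    if (!model_map_maywrite_nolock(pmd, &pmdval, table_lock, &ptl))
        return false;

    if (racer)
        racer(pmd);            /* simulated concurrent pmd change */

    model_lock(ptl);
    if (pmd->val != pmdval.val) {
        /* The pmd entry changed under us: back off without writing. */
        model_unlock(ptl);
        return false;
    }
    assert(ptl->held);         /* writes happen only under the lock */
    *pte = newval;
    model_unlock(ptl);
    return true;
}

/* Simulated racer: clears the pmd entry, as if the PTE page went away. */
static void model_clear_pmd(model_pmd_t *pmd) { pmd->val = 0; }
```

In this model, an update with no racer succeeds, while an update whose
racer clears the pmd entry is rejected and leaves the pte untouched. A
caller that passes NULL for pmdvalp would skip the recheck and would
need some other guarantee that the pmd entry is stable, as noted above
for the mmap_lock-for-write and pte_same() cases.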