From patchwork Wed Sep 4 08:40:08 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13790169
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org,
 muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org,
 rppt@kernel.org, vishal.moola@gmail.com, peterx@redhat.com,
 ryan.roberts@arm.com, christophe.leroy2@cs-soprasteria.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 Qi Zheng
Subject: [PATCH v3 00/14] introduce pte_offset_map_{ro|rw}_nolock()
Date: Wed, 4 Sep 2024 16:40:08 +0800
Message-Id: <20240904084022.32728-1-zhengqi.arch@bytedance.com>
X-Mailer: git-send-email 2.24.3 (Apple Git-128)

Changes in v3:
 - change to use VM_WARN_ON_ONCE() instead of BUG_ON() in
   pte_offset_map_rw_nolock() (David Hildenbrand)
 - modify the comment above the pte_offset_map_lock() in
   [PATCH v2 01/14] (David Hildenbrand and Muchun Song)
 - modify the comment above the pte_offset_map_rw_nolock() in
   [PATCH v2 06/14] (David Hildenbrand and Muchun Song)
 - also perform a pmd_same() check in [PATCH v2 08/14] and
   [PATCH v2 09/14] (since we may free the PTE page in
   retract_page_tables() without holding the read lock of mmap_lock)
 - collect the Acked-bys and Reviewed-bys
 - rebase onto the next-20240904

Changes in v2:
 - rename pte_offset_map_{readonly|maywrite}_nolock() to
   pte_offset_map_{ro|rw}_nolock() (LEROY Christophe)
 - make pte_offset_map_rw_nolock() not
   accept NULL parameters (David Hildenbrand)
 - rebase onto the next-20240822

Hi all,

As proposed by David Hildenbrand [1], this series introduces the
following two new helper functions to replace pte_offset_map_nolock().

1. pte_offset_map_ro_nolock()
2. pte_offset_map_rw_nolock()

As the name suggests, pte_offset_map_ro_nolock() is used for the
read-only case. In this case, only read-only operations will be
performed on the PTE page after the PTL is held. The RCU lock in
pte_offset_map_nolock() will ensure that the PTE page will not be freed,
and there is no need to worry about whether the pmd entry is modified.
Therefore pte_offset_map_ro_nolock() is just a renamed version of
pte_offset_map_nolock().

pte_offset_map_rw_nolock() is used for the may-write case. In this case,
the pte or pmd entry may be modified after the PTL is held, so we need
to ensure that the pmd entry has not been modified concurrently. So in
addition to the name change, it also outputs the pmdval when successful.
Users should make sure the page table is stable, for example by checking
pte_same() or by checking pmd_same() against the output pmdval, before
performing any write operations.

This series will convert all pte_offset_map_nolock() callers to the
above two helper functions one by one, and finally delete
pte_offset_map_nolock() completely.

This is also a preparation for reclaiming the empty user PTE page table
pages.

This series is based on the next-20240904.

Comments and suggestions are welcome!
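
To illustrate the may-write rule described above (map without the lock,
take the PTL, then recheck the pmd against the saved pmdval before
writing), here is a small userspace model. It is only a sketch and not
kernel code: every toy_* name is hypothetical and invented for this
example, the "lock" is a plain flag, and the concurrent pmd change is
simulated by a callback that runs in the window between mapping and
locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model. None of the toy_* names exist in the kernel. */
#define TOY_NR_PTES 4

typedef struct { uint64_t val; } toy_pmd_t;
typedef struct { uint64_t val; } toy_pte_t;

struct toy_pte_page {
	bool locked;			/* stand-in for the PTL */
	toy_pte_t pte[TOY_NR_PTES];
};

struct toy_pmd_slot {
	toy_pmd_t pmd;			/* current pmd entry */
	struct toy_pte_page *table;	/* PTE page it points at, or NULL */
};

static bool toy_pmd_same(toy_pmd_t a, toy_pmd_t b)
{
	return a.val == b.val;
}

/*
 * Models the rw variant: return the PTE pointer without locking, and
 * also hand back a snapshot of the pmd entry taken at map time.
 */
static toy_pte_t *toy_map_rw_nolock(struct toy_pmd_slot *slot, size_t idx,
				    toy_pmd_t *pmdvalp,
				    struct toy_pte_page **ptlp)
{
	assert(idx < TOY_NR_PTES);
	if (!slot->table)
		return NULL;
	*pmdvalp = slot->pmd;
	*ptlp = slot->table;
	return &slot->table->pte[idx];
}

/* Hook that simulates another thread running between map and lock. */
typedef void (*toy_race_fn)(struct toy_pmd_slot *slot);

/* Simulated retract: clear the pmd entry and detach the PTE page. */
static void toy_retract(struct toy_pmd_slot *slot)
{
	slot->pmd.val = 0;
	slot->table = NULL;
}

/* Write a PTE only if the pmd entry is unchanged once the lock is held. */
static bool toy_set_pte(struct toy_pmd_slot *slot, size_t idx,
			uint64_t newval, toy_race_fn race)
{
	toy_pmd_t pmdval;
	struct toy_pte_page *ptl;
	toy_pte_t *pte = toy_map_rw_nolock(slot, idx, &pmdval, &ptl);

	if (!pte)
		return false;
	if (race)
		race(slot);		/* window before the lock is taken */

	ptl->locked = true;		/* "spin_lock(ptl)" */
	if (!toy_pmd_same(pmdval, slot->pmd)) {
		ptl->locked = false;	/* pmd changed: do not write */
		return false;
	}
	pte->val = newval;
	ptl->locked = false;		/* "spin_unlock(ptl)" */
	return true;
}
```

In this model, toy_set_pte() with a NULL race hook stores the value,
while passing toy_retract as the hook makes the pmd recheck fail, so
the function returns false and leaves the PTE untouched. The model
keeps the detached page allocated, which loosely stands in for the RCU
guarantee that the PTE page is not freed under the caller.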

Thanks,
Qi

Qi Zheng (14):
  mm: pgtable: introduce pte_offset_map_{ro|rw}_nolock()
  arm: adjust_pte() use pte_offset_map_rw_nolock()
  powerpc: assert_pte_locked() use pte_offset_map_ro_nolock()
  mm: filemap: filemap_fault_recheck_pte_none() use
    pte_offset_map_ro_nolock()
  mm: khugepaged: __collapse_huge_page_swapin() use
    pte_offset_map_ro_nolock()
  mm: handle_pte_fault() use pte_offset_map_rw_nolock()
  mm: khugepaged: collapse_pte_mapped_thp() use
    pte_offset_map_rw_nolock()
  mm: copy_pte_range() use pte_offset_map_rw_nolock()
  mm: mremap: move_ptes() use pte_offset_map_rw_nolock()
  mm: page_vma_mapped_walk: map_pte() use pte_offset_map_rw_nolock()
  mm: userfaultfd: move_pages_pte() use pte_offset_map_rw_nolock()
  mm: multi-gen LRU: walk_pte_range() use pte_offset_map_rw_nolock()
  mm: pgtable: remove pte_offset_map_nolock()
  mm: khugepaged: retract_page_tables() use pte_offset_map_rw_nolock()

 Documentation/mm/split_page_table_lock.rst |  6 ++-
 arch/arm/mm/fault-armv.c                   |  9 ++++-
 arch/powerpc/mm/pgtable.c                  |  2 +-
 include/linux/mm.h                         |  7 +++-
 mm/filemap.c                               |  4 +-
 mm/khugepaged.c                            | 39 ++++++++++++++++++--
 mm/memory.c                                | 32 ++++++++++++++--
 mm/mremap.c                                | 20 +++++++++-
 mm/page_vma_mapped.c                       | 24 ++++++++++--
 mm/pgtable-generic.c                       | 43 ++++++++++++++++++----
 mm/userfaultfd.c                           | 15 ++++++--
 mm/vmscan.c                                |  9 ++++-
 12 files changed, 180 insertions(+), 30 deletions(-)