[00/14] introduce pte_offset_map_{readonly|maywrite}_nolock()

Message-ID: cover.1724226076.git.zhengqi.arch@bytedance.com

Qi Zheng Aug. 21, 2024, 8:18 a.m. UTC
Hi all,

As proposed by David Hildenbrand [1], this series introduces the following two
new helper functions to replace pte_offset_map_nolock().

1. pte_offset_map_readonly_nolock()
2. pte_offset_map_maywrite_nolock()

As the name suggests, pte_offset_map_readonly_nolock() is used for the read-only
case: only read-only operations will be performed on the PTE page after the PTL
is taken. The RCU lock in pte_offset_map_nolock() ensures that the PTE page will
not be freed, and there is no need to worry about whether the pmd entry is
modified. Therefore pte_offset_map_readonly_nolock() is just a renamed version
of pte_offset_map_nolock().

pte_offset_map_maywrite_nolock() is used for the may-write case. In this case,
the pte or pmd entry may be modified after the PTL is taken, so we need to
ensure that the pmd entry has not been modified concurrently. So in addition to
the name change, it also outputs the pmdval on success, which lets the caller
recheck *pmd once the PTL is taken. In some cases NULL can be passed as pmdvalp:
holding the mmap_lock for write, or a pte_same() check on the contents, is also
enough to ensure that the pmd entry is stable.
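
To make the snapshot-then-recheck idea concrete, here is a small userspace
model of the caller pattern. It is only a sketch, not kernel code: toy_pmd_t,
struct toy_table, toy_map_maywrite_nolock() and toy_set_pte() are all
hypothetical names invented for illustration, a pthread mutex stands in for the
PTL, and a plain integer comparison stands in for a pmd_same()-style check. The
"race" flag merely simulates a concurrent pmd change between map and lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins only; none of these are the kernel's real types. */
typedef uint64_t toy_pmd_t;

struct toy_table {
	toy_pmd_t pmd;        /* models the pmd entry that points at a PTE page */
	pthread_mutex_t ptl;  /* models the page table lock (PTL) */
	int pte;              /* models a single PTE slot */
};

/*
 * Models the may-write helper: hand back the PTE slot and its lock WITHOUT
 * taking the lock, and report the pmd value that was observed.
 * pmdvalp may be NULL when the caller has another way to know pmd is stable.
 */
static int *toy_map_maywrite_nolock(struct toy_table *t, toy_pmd_t *pmdvalp,
				    pthread_mutex_t **ptlp)
{
	toy_pmd_t pmdval = t->pmd;

	if (pmdval == 0)      /* empty pmd: nothing to map */
		return NULL;
	if (pmdvalp)
		*pmdvalp = pmdval;
	*ptlp = &t->ptl;
	return &t->pte;
}

/* Caller pattern: map without lock, take the lock, recheck pmd, then write. */
static bool toy_set_pte(struct toy_table *t, int val, bool race)
{
	toy_pmd_t pmdval;
	pthread_mutex_t *ptl;
	int *pte = toy_map_maywrite_nolock(t, &pmdval, &ptl);

	if (!pte)
		return false;

	if (race)             /* simulate a concurrent pmd change before locking */
		t->pmd = pmdval + 1;

	pthread_mutex_lock(ptl);
	if (t->pmd != pmdval) {   /* snapshot is stale: back off, do not write */
		pthread_mutex_unlock(ptl);
		return false;
	}
	*pte = val;               /* pmd unchanged under the lock: safe to modify */
	pthread_mutex_unlock(ptl);
	return true;
}
```

In this model, a write succeeds only if the pmd value seen at map time still
matches after the lock is taken; otherwise the caller backs off and would
normally retry or bail out.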

This series converts all pte_offset_map_nolock() callers to the above two helper
functions one by one, and finally removes pte_offset_map_nolock() entirely.

This is also preparation for reclaiming empty user PTE page table pages.

This series is based on next-20240820.

Comments and suggestions are welcome!

Thanks,
Qi

[1]. https://lore.kernel.org/lkml/f79bbfc9-bb4c-4da4-9902-2e73817dd135@redhat.com/

Qi Zheng (14):
  mm: pgtable: introduce pte_offset_map_{readonly|maywrite}_nolock()
  arm: adjust_pte() use pte_offset_map_maywrite_nolock()
  powerpc: assert_pte_locked() use pte_offset_map_readonly_nolock()
  mm: filemap: filemap_fault_recheck_pte_none() use
    pte_offset_map_readonly_nolock()
  mm: khugepaged: __collapse_huge_page_swapin() use
    pte_offset_map_readonly_nolock()
  mm: handle_pte_fault() use pte_offset_map_maywrite_nolock()
  mm: khugepaged: collapse_pte_mapped_thp() use
    pte_offset_map_maywrite_nolock()
  mm: copy_pte_range() use pte_offset_map_maywrite_nolock()
  mm: mremap: move_ptes() use pte_offset_map_maywrite_nolock()
  mm: page_vma_mapped_walk: map_pte() use
    pte_offset_map_maywrite_nolock()
  mm: userfaultfd: move_pages_pte() use pte_offset_map_maywrite_nolock()
  mm: multi-gen LRU: walk_pte_range() use
    pte_offset_map_maywrite_nolock()
  mm: pgtable: remove pte_offset_map_nolock()
  mm: khugepaged: retract_page_tables() use
    pte_offset_map_maywrite_nolock()

 Documentation/mm/split_page_table_lock.rst |  6 +++-
 arch/arm/mm/fault-armv.c                   |  9 ++++-
 arch/powerpc/mm/pgtable.c                  |  2 +-
 include/linux/mm.h                         |  7 ++--
 mm/filemap.c                               |  4 +--
 mm/khugepaged.c                            | 39 ++++++++++++++++++--
 mm/memory.c                                | 13 +++++--
 mm/mremap.c                                |  7 +++-
 mm/page_vma_mapped.c                       | 24 ++++++++++---
 mm/pgtable-generic.c                       | 42 ++++++++++++++++------
 mm/userfaultfd.c                           | 12 +++++--
 mm/vmscan.c                                |  9 ++++-
 12 files changed, 143 insertions(+), 31 deletions(-)