[v2,00/13] mm/munlock: rework of mlock+munlock page handling

Message ID: 55a49083-37f9-3766-1de9-9feea7428ac@google.com

Message

Hugh Dickins Feb. 15, 2022, 2:18 a.m. UTC
I wondered whether to post this munlocking rework in the thread at
https://lore.kernel.org/linux-mm/35c340a6-96f-28a0-2b7b-2f9fbddc01f@google.com/

There the discussion was OOM reaping, but the main reason for the rework
has been catastrophic contention on i_mmap_rwsem when exiting from
multiply mlocked files; and frustration with how contorted munlocking is.

tl;dr
 mm/mlock.c                |  637 +++++++++++++++-----------------------
 23 files changed, 514 insertions(+), 780 deletions(-)

v1 of the series was posted on 6 Feb 2022:
https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/
Previews of the changed v2 patches 01, 04, 07, 10 and 11 were posted on
13 Feb 2022.
Here is the full v2 series, in case this is easier to manage:
based again on 5.17-rc2, applies also to -rc3 and -rc4.

Andrew, many thanks for including v1 and fixes in mmotm: please now replace

mm-munlock-delete-page_mlock-and-all-its-works.patch
mm-munlock-delete-foll_mlock-and-foll_populate.patch
mm-munlock-delete-munlock_vma_pages_all-allow-oomreap.patch
mm-munlock-rmap-call-mlock_vma_page-munlock_vma_page.patch
mm-munlock-replace-clear_page_mlock-by-final-clearance.patch
mm-munlock-maintain-page-mlock_count-while-unevictable.patch
mm-munlock-mlock_pte_range-when-mlocking-or-munlocking.patch
mm-migrate-__unmap_and_move-push-good-newpage-to-lru.patch
mm-munlock-delete-smp_mb-from-__pagevec_lru_add_fn.patch
mm-munlock-mlock_page-munlock_page-batch-by-pagevec.patch
mm-munlock-mlock_page-munlock_page-batch-by-pagevec-fix.patch
mm-munlock-mlock_page-munlock_page-batch-by-pagevec-fix-2.patch
mm-munlock-page-migration-needs-mlock-pagevec-drained.patch
mm-thp-collapse_file-do-try_to_unmapttu_batch_flush.patch
mm-thp-shrink_page_list-avoid-splitting-vm_locked-thp.patch

by the following thirteen of v2. As before, some easy fixups will be
needed to apply in mm/huge_memory.c, but otherwise expected to be clean.

Many thanks to Vlastimil Babka for his review of 01 through 11, and
to Matthew Wilcox for graciously volunteering to rebase his over these.

At present there's no update to Documentation/vm/unevictable-lru.rst:
that always needs a different mindset and can follow later; please refer
to the commit messages for now.

There are two half-related mm/thp patches at the end: enhancements
we've had for a long time, but which are needed more after the mlock changes.

01/13 mm/munlock: delete page_mlock() and all its works
02/13 mm/munlock: delete FOLL_MLOCK and FOLL_POPULATE
03/13 mm/munlock: delete munlock_vma_pages_all(), allow oomreap
04/13 mm/munlock: rmap call mlock_vma_page() munlock_vma_page()
05/13 mm/munlock: replace clear_page_mlock() by final clearance
06/13 mm/munlock: maintain page->mlock_count while unevictable
07/13 mm/munlock: mlock_pte_range() when mlocking or munlocking
08/13 mm/migrate: __unmap_and_move() push good newpage to LRU
09/13 mm/munlock: delete smp_mb() from __pagevec_lru_add_fn()
10/13 mm/munlock: mlock_page() munlock_page() batch by pagevec
11/13 mm/munlock: page migration needs mlock pagevec drained
12/13 mm/thp: collapse_file() do try_to_unmap(TTU_BATCH_FLUSH)
13/13 mm/thp: shrink_page_list() avoid splitting VM_LOCKED THP

 include/linux/mm.h        |    2 
 include/linux/mm_inline.h |   11 
 include/linux/mm_types.h  |   19 +
 include/linux/rmap.h      |   23 -
 kernel/events/uprobes.c   |    7 
 mm/gup.c                  |   43 --
 mm/huge_memory.c          |   55 ---
 mm/hugetlb.c              |    4 
 mm/internal.h             |   66 ++-
 mm/khugepaged.c           |   14 
 mm/ksm.c                  |   12 
 mm/madvise.c              |    5 
 mm/memcontrol.c           |    3 
 mm/memory.c               |   45 --
 mm/migrate.c              |   42 +-
 mm/mlock.c                |  637 +++++++++++++++-----------------------
 mm/mmap.c                 |   32 -
 mm/mmzone.c               |    7 
 mm/oom_kill.c             |    2 
 mm/rmap.c                 |  156 ++-------
 mm/swap.c                 |   89 ++---
 mm/userfaultfd.c          |   14 
 mm/vmscan.c               |    6 
 23 files changed, 514 insertions(+), 780 deletions(-)

Hugh

Comments

Matthew Wilcox Feb. 15, 2022, 7:35 p.m. UTC
On Mon, Feb 14, 2022 at 06:18:34PM -0800, Hugh Dickins wrote:
> Andrew, many thanks for including v1 and fixes in mmotm: please now replace
> 
> mm-munlock-delete-page_mlock-and-all-its-works.patch
> mm-munlock-delete-foll_mlock-and-foll_populate.patch
> mm-munlock-delete-munlock_vma_pages_all-allow-oomreap.patch
> mm-munlock-rmap-call-mlock_vma_page-munlock_vma_page.patch
> mm-munlock-replace-clear_page_mlock-by-final-clearance.patch
> mm-munlock-maintain-page-mlock_count-while-unevictable.patch
> mm-munlock-mlock_pte_range-when-mlocking-or-munlocking.patch
> mm-migrate-__unmap_and_move-push-good-newpage-to-lru.patch
> mm-munlock-delete-smp_mb-from-__pagevec_lru_add_fn.patch
> mm-munlock-mlock_page-munlock_page-batch-by-pagevec.patch
> mm-munlock-mlock_page-munlock_page-batch-by-pagevec-fix.patch
> mm-munlock-mlock_page-munlock_page-batch-by-pagevec-fix-2.patch
> mm-munlock-page-migration-needs-mlock-pagevec-drained.patch
> mm-thp-collapse_file-do-try_to_unmapttu_batch_flush.patch
> mm-thp-shrink_page_list-avoid-splitting-vm_locked-thp.patch
> 
> by the following thirteen of v2. As before, some easy fixups will be
> needed to apply in mm/huge_memory.c, but otherwise expected to be clean.
> 
> Many thanks to Vlastimil Babka for his review of 01 through 11, and
> to Matthew Wilcox for graciously volunteering to rebase his over these.

I have now pushed these 13 patches to my for-next branch:

git://git.infradead.org/users/willy/pagecache.git for-next

and rebased my folio patches on top.  Mostly that involved dropping
my mlock-related patches, although there were a few other adjustments
that needed to be made.  That should make Stephen's merge resolution
much easier once Andrew drops v1 of these patches from his tree.