Message ID | 20220303222014.517033-1-shy828301@gmail.com (mailing list archive)
---|---
State | New
Series | mm: thp: don't have to lock page anymore when splitting PMD
On 2022/3/4 6:20, Yang Shi wrote: > The commit c444eb564fb1 ("mm: thp: make the THP mapcount atomic against > __split_huge_pmd_locked()") locked the page for PMD split to make > mapcount stable for reuse_swap_page(), then commit 1c2f67308af4 ("mm: > thp: fix MADV_REMOVE deadlock on shmem THP") reduce the scope to > anonymous page only. > > However COW has not used mapcount to determine if the page is shared or > not anymore due to the COW fixes [1] from David Hildenbrand and the > reuse_swap_page() was removed as well. So PMD split doesn't have to > lock the page anymore. This patch basically reverted the above two > commits. > Sounds reasonable. Since mapcount is not used to determine if we need to COW the page, PMD split doesn't have to lock the page anymore. > [1] https://lore.kernel.org/linux-mm/20220131162940.210846-1-david@redhat.com/ > > Cc: David Hildenbrand <david@redhat.com> > Cc: Andrea Arcangeli <aarcange@redhat.com> > Cc: Hugh Dickins <hughd@google.com> > Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> > Signed-off-by: Yang Shi <shy828301@gmail.com> > --- > mm/huge_memory.c | 44 +++++--------------------------------------- > 1 file changed, 5 insertions(+), 39 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index b49e1a11df2e..daaa698bd273 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -2134,8 +2134,6 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, > { > spinlock_t *ptl; > struct mmu_notifier_range range; > - bool do_unlock_folio = false; > - pmd_t _pmd; > > mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm, > address & HPAGE_PMD_MASK, > @@ -2148,48 +2146,16 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, > * pmd against. Otherwise we can end up replacing wrong folio. > */ > VM_BUG_ON(freeze && !folio); > - if (folio) { > - VM_WARN_ON_ONCE(!folio_test_locked(folio)); > - if (folio != page_folio(pmd_page(*pmd))) > - goto out; > - } > + if (folio && folio != page_folio(pmd_page(*pmd))) > + goto out; > > -repeat: > - if (pmd_trans_huge(*pmd)) { > - if (!folio) { > - folio = page_folio(pmd_page(*pmd)); > - /* > - * An anonymous page must be locked, to ensure that a > - * concurrent reuse_swap_page() sees stable mapcount; > - * but reuse_swap_page() is not used on shmem or file, > - * and page lock must not be taken when zap_pmd_range() > - * calls __split_huge_pmd() while i_mmap_lock is held. > - */ > - if (folio_test_anon(folio)) { > - if (unlikely(!folio_trylock(folio))) { > - folio_get(folio); > - _pmd = *pmd; > - spin_unlock(ptl); > - folio_lock(folio); > - spin_lock(ptl); > - if (unlikely(!pmd_same(*pmd, _pmd))) { > - folio_unlock(folio); > - folio_put(folio); > - folio = NULL; > - goto repeat; > - } > - folio_put(folio); > - } > - do_unlock_folio = true; > - } > - } > - } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) > + if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) > goto out; > + > __split_huge_pmd_locked(vma, pmd, range.start, freeze); IIUC, here should be something like below: if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)) __split_huge_pmd_locked(vma, pmd, range.start, freeze); i.e. the pmd_trans_huge case is missing in the above change. Or am I missing something? Thanks for the patch. This really simplifies the code and avoids the unneeded overhead. > out: > spin_unlock(ptl); > - if (do_unlock_folio) > - folio_unlock(folio); > + > /* > * No need to double call mmu_notifier->invalidate_range() callback. > * They are 3 cases to consider inside __split_huge_pmd_locked(): >
On Thu, Mar 3, 2022 at 6:25 PM Miaohe Lin <linmiaohe@huawei.com> wrote: > > On 2022/3/4 6:20, Yang Shi wrote: > > The commit c444eb564fb1 ("mm: thp: make the THP mapcount atomic against > > __split_huge_pmd_locked()") locked the page for PMD split to make > > mapcount stable for reuse_swap_page(), then commit 1c2f67308af4 ("mm: > > thp: fix MADV_REMOVE deadlock on shmem THP") reduce the scope to > > anonymous page only. > > > > However COW has not used mapcount to determine if the page is shared or > > not anymore due to the COW fixes [1] from David Hildenbrand and the > > reuse_swap_page() was removed as well. So PMD split doesn't have to > > lock the page anymore. This patch basically reverted the above two > > commits. > > > > Sounds reasonable. Since mapcount is not used to determine if we need to COW > the page, PMD split doesn't have to lock the page anymore. > > > [1] https://lore.kernel.org/linux-mm/20220131162940.210846-1-david@redhat.com/ > > > > Cc: David Hildenbrand <david@redhat.com> > > Cc: Andrea Arcangeli <aarcange@redhat.com> > > Cc: Hugh Dickins <hughd@google.com> > > Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> > > Signed-off-by: Yang Shi <shy828301@gmail.com> > > --- > > mm/huge_memory.c | 44 +++++--------------------------------------- > > 1 file changed, 5 insertions(+), 39 deletions(-) > > > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > > index b49e1a11df2e..daaa698bd273 100644 > > --- a/mm/huge_memory.c > > +++ b/mm/huge_memory.c > > @@ -2134,8 +2134,6 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, > > { > > spinlock_t *ptl; > > struct mmu_notifier_range range; > > - bool do_unlock_folio = false; > > - pmd_t _pmd; > > > > mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm, > > address & HPAGE_PMD_MASK, > > @@ -2148,48 +2146,16 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, > > * pmd against. Otherwise we can end up replacing wrong folio. 
> > */ > > VM_BUG_ON(freeze && !folio); > > - if (folio) { > > - VM_WARN_ON_ONCE(!folio_test_locked(folio)); > > - if (folio != page_folio(pmd_page(*pmd))) > > - goto out; > > - } > > + if (folio && folio != page_folio(pmd_page(*pmd))) > > + goto out; > > > > -repeat: > > - if (pmd_trans_huge(*pmd)) { > > - if (!folio) { > > - folio = page_folio(pmd_page(*pmd)); > > - /* > > - * An anonymous page must be locked, to ensure that a > > - * concurrent reuse_swap_page() sees stable mapcount; > > - * but reuse_swap_page() is not used on shmem or file, > > - * and page lock must not be taken when zap_pmd_range() > > - * calls __split_huge_pmd() while i_mmap_lock is held. > > - */ > > - if (folio_test_anon(folio)) { > > - if (unlikely(!folio_trylock(folio))) { > > - folio_get(folio); > > - _pmd = *pmd; > > - spin_unlock(ptl); > > - folio_lock(folio); > > - spin_lock(ptl); > > - if (unlikely(!pmd_same(*pmd, _pmd))) { > > - folio_unlock(folio); > > - folio_put(folio); > > - folio = NULL; > > - goto repeat; > > - } > > - folio_put(folio); > > - } > > - do_unlock_folio = true; > > - } > > - } > > - } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) > > + if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) > > goto out; > > + > > __split_huge_pmd_locked(vma, pmd, range.start, freeze); > > IUUC, here should be something like below: > if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)) > __split_huge_pmd_locked(vma, pmd, range.start, freeze); > > i.e. the pmd_trans_huge case is missing in the above change. Or am I miss something ? Yes, you are definitely right. Must have pmd_trans_huge(*pmd). > > Thanks for the patch. This really simplify the code and avoid the unneeded overhead. > > > out: > > spin_unlock(ptl); > > - if (do_unlock_folio) > > - folio_unlock(folio); > > + > > /* > > * No need to double call mmu_notifier->invalidate_range() callback. > > * They are 3 cases to consider inside __split_huge_pmd_locked(): > > >
Hi, This probably bounces on the list due to html junk from the gmail app. What happened to https://lore.kernel.org/linux-mm/20220131162940.210846-10-david@redhat.com/ Included in the very series mentioned below? Was this silently dropped due to folio conversion collisions? :/ Yang Shi <shy828301@gmail.com> schrieb am Do. 3. März 2022 um 23:20: > The commit c444eb564fb1 ("mm: thp: make the THP mapcount atomic against > __split_huge_pmd_locked()") locked the page for PMD split to make > mapcount stable for reuse_swap_page(), then commit 1c2f67308af4 ("mm: > thp: fix MADV_REMOVE deadlock on shmem THP") reduce the scope to > anonymous page only. > > However COW has not used mapcount to determine if the page is shared or > not anymore due to the COW fixes [1] from David Hildenbrand and the > reuse_swap_page() was removed as well. So PMD split doesn't have to > lock the page anymore. This patch basically reverted the above two > commits. > > [1] > https://lore.kernel.org/linux-mm/20220131162940.210846-1-david@redhat.com/ > > Cc: David Hildenbrand <david@redhat.com> > Cc: Andrea Arcangeli <aarcange@redhat.com> > Cc: Hugh Dickins <hughd@google.com> > Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> > Signed-off-by: Yang Shi <shy828301@gmail.com> > --- > mm/huge_memory.c | 44 +++++--------------------------------------- > 1 file changed, 5 insertions(+), 39 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index b49e1a11df2e..daaa698bd273 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -2134,8 +2134,6 @@ void __split_huge_pmd(struct vm_area_struct *vma, > pmd_t *pmd, > { > spinlock_t *ptl; > struct mmu_notifier_range range; > - bool do_unlock_folio = false; > - pmd_t _pmd; > > mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, > vma->vm_mm, > address & HPAGE_PMD_MASK, > @@ -2148,48 +2146,16 @@ void __split_huge_pmd(struct vm_area_struct *vma, > pmd_t *pmd, > * pmd against. Otherwise we can end up replacing wrong folio. 
> */ > VM_BUG_ON(freeze && !folio); > - if (folio) { > - VM_WARN_ON_ONCE(!folio_test_locked(folio)); > - if (folio != page_folio(pmd_page(*pmd))) > - goto out; > - } > + if (folio && folio != page_folio(pmd_page(*pmd))) > + goto out; > > -repeat: > - if (pmd_trans_huge(*pmd)) { > - if (!folio) { > - folio = page_folio(pmd_page(*pmd)); > - /* > - * An anonymous page must be locked, to ensure > that a > - * concurrent reuse_swap_page() sees stable > mapcount; > - * but reuse_swap_page() is not used on shmem or > file, > - * and page lock must not be taken when > zap_pmd_range() > - * calls __split_huge_pmd() while i_mmap_lock is > held. > - */ > - if (folio_test_anon(folio)) { > - if (unlikely(!folio_trylock(folio))) { > - folio_get(folio); > - _pmd = *pmd; > - spin_unlock(ptl); > - folio_lock(folio); > - spin_lock(ptl); > - if (unlikely(!pmd_same(*pmd, > _pmd))) { > - folio_unlock(folio); > - folio_put(folio); > - folio = NULL; > - goto repeat; > - } > - folio_put(folio); > - } > - do_unlock_folio = true; > - } > - } > - } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) > + if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) > goto out; > + > __split_huge_pmd_locked(vma, pmd, range.start, freeze); > out: > spin_unlock(ptl); > - if (do_unlock_folio) > - folio_unlock(folio); > + > /* > * No need to double call mmu_notifier->invalidate_range() > callback. > * They are 3 cases to consider inside __split_huge_pmd_locked(): > -- > 2.26.3 > >
On Thu, Mar 3, 2022 at 9:06 PM David Hildenbrand <david@redhat.com> wrote: > > Hi, > > This probably bounces on the list due to html junk from the gmail app. > > What happened to > > https://lore.kernel.org/linux-mm/20220131162940.210846-10-david@redhat.com/ > > Included in the very series mentioned below? > > Was this silently dropped due to folio conversion collisions? :/ I really didn't notice you already proposed this. Maybe folio conversion, maybe mlock cleanup, I can't tell. But anyway this patch needs to get rebased. I will submit v2 to solve the comment, will add your signed-off-by. > > Yang Shi <shy828301@gmail.com> schrieb am Do. 3. März 2022 um 23:20: >> >> The commit c444eb564fb1 ("mm: thp: make the THP mapcount atomic against >> __split_huge_pmd_locked()") locked the page for PMD split to make >> mapcount stable for reuse_swap_page(), then commit 1c2f67308af4 ("mm: >> thp: fix MADV_REMOVE deadlock on shmem THP") reduce the scope to >> anonymous page only. >> >> However COW has not used mapcount to determine if the page is shared or >> not anymore due to the COW fixes [1] from David Hildenbrand and the >> reuse_swap_page() was removed as well. So PMD split doesn't have to >> lock the page anymore. This patch basically reverted the above two >> commits. >> >> [1] https://lore.kernel.org/linux-mm/20220131162940.210846-1-david@redhat.com/ >> >> Cc: David Hildenbrand <david@redhat.com> >> Cc: Andrea Arcangeli <aarcange@redhat.com> >> Cc: Hugh Dickins <hughd@google.com> >> Cc: "Kirill A . 
Shutemov" <kirill.shutemov@linux.intel.com> >> Signed-off-by: Yang Shi <shy828301@gmail.com> >> --- >> mm/huge_memory.c | 44 +++++--------------------------------------- >> 1 file changed, 5 insertions(+), 39 deletions(-) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index b49e1a11df2e..daaa698bd273 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -2134,8 +2134,6 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, >> { >> spinlock_t *ptl; >> struct mmu_notifier_range range; >> - bool do_unlock_folio = false; >> - pmd_t _pmd; >> >> mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm, >> address & HPAGE_PMD_MASK, >> @@ -2148,48 +2146,16 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd, >> * pmd against. Otherwise we can end up replacing wrong folio. >> */ >> VM_BUG_ON(freeze && !folio); >> - if (folio) { >> - VM_WARN_ON_ONCE(!folio_test_locked(folio)); >> - if (folio != page_folio(pmd_page(*pmd))) >> - goto out; >> - } >> + if (folio && folio != page_folio(pmd_page(*pmd))) >> + goto out; >> >> -repeat: >> - if (pmd_trans_huge(*pmd)) { >> - if (!folio) { >> - folio = page_folio(pmd_page(*pmd)); >> - /* >> - * An anonymous page must be locked, to ensure that a >> - * concurrent reuse_swap_page() sees stable mapcount; >> - * but reuse_swap_page() is not used on shmem or file, >> - * and page lock must not be taken when zap_pmd_range() >> - * calls __split_huge_pmd() while i_mmap_lock is held. 
>> - */ >> - if (folio_test_anon(folio)) { >> - if (unlikely(!folio_trylock(folio))) { >> - folio_get(folio); >> - _pmd = *pmd; >> - spin_unlock(ptl); >> - folio_lock(folio); >> - spin_lock(ptl); >> - if (unlikely(!pmd_same(*pmd, _pmd))) { >> - folio_unlock(folio); >> - folio_put(folio); >> - folio = NULL; >> - goto repeat; >> - } >> - folio_put(folio); >> - } >> - do_unlock_folio = true; >> - } >> - } >> - } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) >> + if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) >> goto out; >> + >> __split_huge_pmd_locked(vma, pmd, range.start, freeze); >> out: >> spin_unlock(ptl); >> - if (do_unlock_folio) >> - folio_unlock(folio); >> + >> /* >> * No need to double call mmu_notifier->invalidate_range() callback. >> * They are 3 cases to consider inside __split_huge_pmd_locked(): >> -- >> 2.26.3 >>
On 04.03.22 19:30, Yang Shi wrote: > On Thu, Mar 3, 2022 at 9:06 PM David Hildenbrand <david@redhat.com> wrote: >> >> Hi, >> >> This probably bounces on the list due to html junk from the gmail app. >> >> What happened to >> >> https://lore.kernel.org/linux-mm/20220131162940.210846-10-david@redhat.com/ >> >> Included in the very series mentioned below? >> >> Was this silently dropped due to folio conversion collisions? :/ > > I really didn't notice you already proposed this. Maybe folio > conversion, maybe mlock cleanup, I can't tell. But anyway this patch > needs to get rebased. I will submit v2 to solve the comment, will add > your signed-off-by. > Why a rebase? The folio change comes via another tree (unfortunately not Andrew's tree, I wish we would have a single MM tree for MM patches). @Andrew, the last mail I received was + mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch added to -mm tree The patch shows up in mmotm as #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch ... which shouldn't be true.
On Fri, Mar 4, 2022 at 10:50 AM David Hildenbrand <david@redhat.com> wrote: > > On 04.03.22 19:30, Yang Shi wrote: > > On Thu, Mar 3, 2022 at 9:06 PM David Hildenbrand <david@redhat.com> wrote: > >> > >> Hi, > >> > >> This probably bounces on the list due to html junk from the gmail app. > >> > >> What happened to > >> > >> https://lore.kernel.org/linux-mm/20220131162940.210846-10-david@redhat.com/ > >> > >> Included in the very series mentioned below? > >> > >> Was this silently dropped due to folio conversion collisions? :/ > > > > I really didn't notice you already proposed this. Maybe folio > > conversion, maybe mlock cleanup, I can't tell. But anyway this patch > > needs to get rebased. I will submit v2 to solve the comment, will add > > your signed-off-by. > > > > Why a rebase? The folio change comes via another tree (unfortunately not > Andrews tree, I wish we would have a single MM tree for MM patches). > > @Andrew, the last mail I received was > > + mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > added to -mm tree > > The patch shows up in mmotm as > > #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > > ... which shouldn't be true. Yes, otherwise my patch can't be applied. 
What I saw on linux-next for mm/huge_memory.c looks like:

bfede97b8d6d mm/huge_memory: remove stale page_trans_huge_mapcount()
6c127ac2ff1d mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()
cb63c1eb2a3c mm/migration: add trace events for THP migrations
e65f964148fe Merge branch 'akpm-current/current'
61435e1e267c mm/readahead: Align file mappings for non-DAX
80ef527bf929 mm: Turn can_split_huge_page() into can_split_folio()
8339af1d0a18 mm/rmap: Convert rmap_walk() to take a folio
16f06327291e mm/migrate: Convert remove_migration_ptes() to folios
bd23d3f12232 memory tiering: skip to scan fast memory
47a3e10abb78 mm: thp: fix wrong cache flush in remove_migration_pmd()
1de8566cca0a mm/rmap: Convert try_to_migrate() to folios
5a470d51cb2b mm/rmap: Convert try_to_unmap() to take a folio
1c760ad73a13 mm/huge_memory: Convert __split_huge_pmd() to take a folio
82865a9e1187 mm: Add folio_mapcount()
07ca76067308 mm/munlock: maintain page->mlock_count while unevictable
cea86fe246b6 mm/munlock: rmap call mlock_vma_page() munlock_vma_page()
b67bf49ce7aa mm/munlock: delete FOLL_MLOCK and FOLL_POPULATE
f56caedaf94f Merge branch 'akpm' (patches from Andrew)

It looks like your series was rebased on top of the mlock cleanup and the folio work. Anyway, I will wait for Andrew. If he thinks a new patch rebased on top of both the mlock cleanup and folio would make his life easier, I will submit v2. > > -- > Thanks, > > David / dhildenb >
On Fri, 4 Mar 2022 19:50:08 +0100 David Hildenbrand <david@redhat.com> wrote: > @Andrew, the last mail I received was > > + mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > added to -mm tree > > The patch shows up in mmotm as > > #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > > ... which shouldn't be true. I guess I mislabelled the reason for dropping it. Should have been to-be-updated, due to https://lkml.kernel.org/r/CAHbLzkpbnQyHRckoRtbZoaLvANu92MY4kEsbKudaQ8MDUA3nVg@mail.gmail.com
On 07.03.22 03:07, Andrew Morton wrote: > On Fri, 4 Mar 2022 19:50:08 +0100 David Hildenbrand <david@redhat.com> wrote: > >> @Andrew, the last mail I received was >> >> + mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch >> added to -mm tree >> >> The patch shows up in mmotm as >> >> #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch >> >> ... which shouldn't be true. > > I guess I mislabelled the reason for dropping it. Should have been to-be-updated, > due to https://lkml.kernel.org/r/CAHbLzkpbnQyHRckoRtbZoaLvANu92MY4kEsbKudaQ8MDUA3nVg@mail.gmail.com >

Let me clarify.

1. I sent [1] (9 patches)

2. You queued the 9 patches

E.g., in "mmotm 2022-02-15-20-22 uploaded"

* mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch
* mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch
* mm-slightly-clarify-ksm-logic-in-do_swap_page.patch
* mm-streamline-cow-logic-in-do_swap_page.patch
* mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch
* mm-khugepaged-remove-reuse_swap_page-usage.patch
* mm-swapfile-remove-stale-reuse_swap_page.patch
* mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch
* mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch

3. The last patch in the series was dropped. What remains are 8 patches.

E.g., in "mmotm 2022-02-24-22-38 uploaded"

* mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch
* mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch
* mm-slightly-clarify-ksm-logic-in-do_swap_page.patch
* mm-streamline-cow-logic-in-do_swap_page.patch
* mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch
* mm-khugepaged-remove-reuse_swap_page-usage.patch
* mm-swapfile-remove-stale-reuse_swap_page.patch
* mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch

4. Yang Shi sent his patch (the one we're replying to)

5. You picked his patch and dropped it again due to [2]

I'm wondering why 3 happened and why https://www.ozlabs.org/~akpm/mmotm/series contains:

mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch
mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch
mm-slightly-clarify-ksm-logic-in-do_swap_page.patch
mm-streamline-cow-logic-in-do_swap_page.patch
mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch
mm-khugepaged-remove-reuse_swap_page-usage.patch
mm-swapfile-remove-stale-reuse_swap_page.patch
mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch
...
#[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch

[1] https://lore.kernel.org/linux-mm/20220131162940.210846-1-david@redhat.com/
[2] https://lkml.kernel.org/r/CAHbLzkpbnQyHRckoRtbZoaLvANu92MY4kEsbKudaQ8MDUA3nVg@mail.gmail.com
On Mon, 7 Mar 2022 09:24:58 +0100 David Hildenbrand <david@redhat.com> wrote: > On 07.03.22 03:07, Andrew Morton wrote: > > On Fri, 4 Mar 2022 19:50:08 +0100 David Hildenbrand <david@redhat.com> wrote: > > > >> @Andrew, the last mail I received was > >> > >> + mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > >> added to -mm tree > >> > >> The patch shows up in mmotm as > >> > >> #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > >> > >> ... which shouldn't be true. > > > > I guess I mislabelled the reason for dropping it. Should have been to-be-updated, > > due to https://lkml.kernel.org/r/CAHbLzkpbnQyHRckoRtbZoaLvANu92MY4kEsbKudaQ8MDUA3nVg@mail.gmail.com > > > > Let me clarify. > > 1. I sent [1] (9 patches) > > 2. You queued the 9 patches > > E.g., in "mmotm 2022-02-15-20-22 uploaded" > > * mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch > * mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch > * mm-slightly-clarify-ksm-logic-in-do_swap_page.patch > * mm-streamline-cow-logic-in-do_swap_page.patch > * mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch > * mm-khugepaged-remove-reuse_swap_page-usage.patch > * mm-swapfile-remove-stale-reuse_swap_page.patch > * mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch > * mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > > 3. The last patch in the series was dropped. What remains are 8 patches. 
> > E.g., in "mmotm 2022-02-24-22-38 uploaded" > > * mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch > * mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch > * mm-slightly-clarify-ksm-logic-in-do_swap_page.patch > * mm-streamline-cow-logic-in-do_swap_page.patch > * mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch > * mm-khugepaged-remove-reuse_swap_page-usage.patch > * mm-swapfile-remove-stale-reuse_swap_page.patch > * mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch > > 4. Yang Shi sent his patch (the one we're replying to) > > 5. You picked his patch and dropped it again due to [2] > > > I'm wondering why 3 happened and why > https://www.ozlabs.org/~akpm/mmotm/series contains: > > > mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch > mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch > mm-slightly-clarify-ksm-logic-in-do_swap_page.patch > mm-streamline-cow-logic-in-do_swap_page.patch > mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch > mm-khugepaged-remove-reuse_swap_page-usage.patch > mm-swapfile-remove-stale-reuse_swap_page.patch > mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch > ... > #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch OK, thanks. I guess it was me seeing 100% rejects when merging onto the folio changes then incorrectly deciding the patch was now in linux-next via some other tree. I restored it and fixed things up. Please check. 
--- a/mm/huge_memory.c~mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd
+++ a/mm/huge_memory.c
@@ -2133,8 +2133,6 @@ void __split_huge_pmd(struct vm_area_str
 {
 	spinlock_t *ptl;
 	struct mmu_notifier_range range;
-	bool do_unlock_folio = false;
-	pmd_t _pmd;
 
 	mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
 				address & HPAGE_PMD_MASK,
@@ -2153,42 +2151,14 @@ void __split_huge_pmd(struct vm_area_str
 		goto out;
 	}
 
-repeat:
 	if (pmd_trans_huge(*pmd)) {
-		if (!folio) {
+		if (!folio)
 			folio = page_folio(pmd_page(*pmd));
-			/*
-			 * An anonymous page must be locked, to ensure that a
-			 * concurrent reuse_swap_page() sees stable mapcount;
-			 * but reuse_swap_page() is not used on shmem or file,
-			 * and page lock must not be taken when zap_pmd_range()
-			 * calls __split_huge_pmd() while i_mmap_lock is held.
-			 */
-			if (folio_test_anon(folio)) {
-				if (unlikely(!folio_trylock(folio))) {
-					folio_get(folio);
-					_pmd = *pmd;
-					spin_unlock(ptl);
-					folio_lock(folio);
-					spin_lock(ptl);
-					if (unlikely(!pmd_same(*pmd, _pmd))) {
-						folio_unlock(folio);
-						folio_put(folio);
-						folio = NULL;
-						goto repeat;
-					}
-					folio_put(folio);
-				}
-				do_unlock_folio = true;
-			}
-		}
 	} else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
 		goto out;
 	__split_huge_pmd_locked(vma, pmd, range.start, freeze);
 out:
 	spin_unlock(ptl);
-	if (do_unlock_folio)
-		folio_unlock(folio);
 	/*
 	 * No need to double call mmu_notifier->invalidate_range() callback.
 	 * They are 3 cases to consider inside __split_huge_pmd_locked():
On Mon, Mar 7, 2022 at 3:43 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Mon, 7 Mar 2022 09:24:58 +0100 David Hildenbrand <david@redhat.com> wrote: > > > On 07.03.22 03:07, Andrew Morton wrote: > > > On Fri, 4 Mar 2022 19:50:08 +0100 David Hildenbrand <david@redhat.com> wrote: > > > > > >> @Andrew, the last mail I received was > > >> > > >> + mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > > >> added to -mm tree > > >> > > >> The patch shows up in mmotm as > > >> > > >> #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > > >> > > >> ... which shouldn't be true. > > > > > > I guess I mislabelled the reason for dropping it. Should have been to-be-updated, > > > due to https://lkml.kernel.org/r/CAHbLzkpbnQyHRckoRtbZoaLvANu92MY4kEsbKudaQ8MDUA3nVg@mail.gmail.com > > > > > > > Let me clarify. > > > > 1. I sent [1] (9 patches) > > > > 2. You queued the 9 patches > > > > E.g., in "mmotm 2022-02-15-20-22 uploaded" > > > > * mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch > > * mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch > > * mm-slightly-clarify-ksm-logic-in-do_swap_page.patch > > * mm-streamline-cow-logic-in-do_swap_page.patch > > * mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch > > * mm-khugepaged-remove-reuse_swap_page-usage.patch > > * mm-swapfile-remove-stale-reuse_swap_page.patch > > * mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch > > * mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > > > > 3. The last patch in the series was dropped. What remains are 8 patches. 
> > > > E.g., in "mmotm 2022-02-24-22-38 uploaded" > > > > * mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch > > * mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch > > * mm-slightly-clarify-ksm-logic-in-do_swap_page.patch > > * mm-streamline-cow-logic-in-do_swap_page.patch > > * mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch > > * mm-khugepaged-remove-reuse_swap_page-usage.patch > > * mm-swapfile-remove-stale-reuse_swap_page.patch > > * mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch > > > > 4. Yang Shi sent his patch (the one we're replying to) > > > > 5. You picked his patch and dropped it again due to [2] > > > > > > I'm wondering why 3 happened and why > > https://www.ozlabs.org/~akpm/mmotm/series contains: > > > > > > mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch > > mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch > > mm-slightly-clarify-ksm-logic-in-do_swap_page.patch > > mm-streamline-cow-logic-in-do_swap_page.patch > > mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch > > mm-khugepaged-remove-reuse_swap_page-usage.patch > > mm-swapfile-remove-stale-reuse_swap_page.patch > > mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch > > ... > > #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch > > OK, thanks. I guess it was me seeing 100% rejects when merging onto > the folio changes then incorrectly deciding the patch was now in > linux-next via some other tree. > > I restored it and fixed things up. Please check. Thanks, Andrew. I think we could clean it up a little bit further. 
> > > --- a/mm/huge_memory.c~mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd > +++ a/mm/huge_memory.c > @@ -2133,8 +2133,6 @@ void __split_huge_pmd(struct vm_area_str > { > spinlock_t *ptl; > struct mmu_notifier_range range; > - bool do_unlock_folio = false; > - pmd_t _pmd; > > mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm, > address & HPAGE_PMD_MASK, > @@ -2153,42 +2151,14 @@ void __split_huge_pmd(struct vm_area_str > goto out; > } > > -repeat: > if (pmd_trans_huge(*pmd)) { > - if (!folio) { > + if (!folio) > folio = page_folio(pmd_page(*pmd)); We could remove the "if (pmd_trans_huge(*pmd))" section since folio is actually not used afterward at all. > - /* > - * An anonymous page must be locked, to ensure that a > - * concurrent reuse_swap_page() sees stable mapcount; > - * but reuse_swap_page() is not used on shmem or file, > - * and page lock must not be taken when zap_pmd_range() > - * calls __split_huge_pmd() while i_mmap_lock is held. > - */ > - if (folio_test_anon(folio)) { > - if (unlikely(!folio_trylock(folio))) { > - folio_get(folio); > - _pmd = *pmd; > - spin_unlock(ptl); > - folio_lock(folio); > - spin_lock(ptl); > - if (unlikely(!pmd_same(*pmd, _pmd))) { > - folio_unlock(folio); > - folio_put(folio); > - folio = NULL; > - goto repeat; > - } > - folio_put(folio); > - } > - do_unlock_folio = true; > - } > - } > } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd))) > goto out; > __split_huge_pmd_locked(vma, pmd, range.start, freeze); With the above if removed, this could be changed to: if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)) __split_huge_pmd_locked(vma, pmd, range.start, freeze); > out: > spin_unlock(ptl); > - if (do_unlock_folio) > - folio_unlock(folio); > /* > * No need to double call mmu_notifier->invalidate_range() callback. > * They are 3 cases to consider inside __split_huge_pmd_locked(): > _ >
On Mon, 7 Mar 2022 16:03:12 -0800 Yang Shi <shy828301@gmail.com> wrote: > On Mon, Mar 7, 2022 at 3:43 PM Andrew Morton <akpm@linux-foundation.org> wrote: > > @@ -2133,8 +2133,6 @@ void __split_huge_pmd(struct vm_area_str > > { > > spinlock_t *ptl; > > struct mmu_notifier_range range; > > - bool do_unlock_folio = false; > > - pmd_t _pmd; > > > > mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm, > > address & HPAGE_PMD_MASK, > > @@ -2153,42 +2151,14 @@ void __split_huge_pmd(struct vm_area_str > > goto out; > > } > > > > -repeat: > > if (pmd_trans_huge(*pmd)) { > > - if (!folio) { > > + if (!folio) > > folio = page_folio(pmd_page(*pmd)); > > We could remove the "if (pmd_trans_huge(*pmd))" section since folio is > actually not used afterward at all. > > ... > > > With the above if removed, this could be changed to: > > if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) || > is_pmd_migration_entry(*pmd)) > __split_huge_pmd_locked(vma, pmd, range.start, freeze); > OK, looks sane. Can someone please test all this?

--- a/mm/huge_memory.c~mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd-fix
+++ a/mm/huge_memory.c
@@ -2151,12 +2151,10 @@ void __split_huge_pmd(struct vm_area_str
 		goto out;
 	}
 
-	if (pmd_trans_huge(*pmd)) {
-		if (!folio)
-			folio = page_folio(pmd_page(*pmd));
-	} else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
-		goto out;
-	__split_huge_pmd_locked(vma, pmd, range.start, freeze);
+	if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
+	    is_pmd_migration_entry(*pmd))
+		__split_huge_pmd_locked(vma, pmd, range.start, freeze);
+
 out:
 	spin_unlock(ptl);
 	/*
On Mon, Mar 7, 2022 at 4:50 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 7 Mar 2022 16:03:12 -0800 Yang Shi <shy828301@gmail.com> wrote:
>
> > On Mon, Mar 7, 2022 at 3:43 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> > > @@ -2133,8 +2133,6 @@ void __split_huge_pmd(struct vm_area_str
> > >  {
> > >          spinlock_t *ptl;
> > >          struct mmu_notifier_range range;
> > > -        bool do_unlock_folio = false;
> > > -        pmd_t _pmd;
> > >
> > >          mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
> > >                                  address & HPAGE_PMD_MASK,
> > > @@ -2153,42 +2151,14 @@ void __split_huge_pmd(struct vm_area_str
> > >                  goto out;
> > >          }
> > >
> > > -repeat:
> > >          if (pmd_trans_huge(*pmd)) {
> > > -                if (!folio) {
> > > +                if (!folio)
> > >                          folio = page_folio(pmd_page(*pmd));
> >
> > We could remove the "if (pmd_trans_huge(*pmd))" section since folio is
> > actually not used afterward at all.
> >
> > ...
> >
> > With the above if removed, this could be changed to:
> >
> >         if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
> >             is_pmd_migration_entry(*pmd))
> >                 __split_huge_pmd_locked(vma, pmd, range.start, freeze);
>
> OK, looks sane.  Can someone please test all this?

Build test? I had the exact same change in my tree, it worked for me.

>
> --- a/mm/huge_memory.c~mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd-fix
> +++ a/mm/huge_memory.c
> @@ -2151,12 +2151,10 @@ void __split_huge_pmd(struct vm_area_str
>                  goto out;
>          }
>
> -        if (pmd_trans_huge(*pmd)) {
> -                if (!folio)
> -                        folio = page_folio(pmd_page(*pmd));
> -        } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
> -                goto out;
> -        __split_huge_pmd_locked(vma, pmd, range.start, freeze);
> +        if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
> +            is_pmd_migration_entry(*pmd))
> +                __split_huge_pmd_locked(vma, pmd, range.start, freeze);
> +
> out:
>          spin_unlock(ptl);
>          /*
> _
>
On Mon, 7 Mar 2022, Andrew Morton wrote:
>
> OK, looks sane.  Can someone please test all this?

Only the briefest of testing, but

mmotm 2022-03-06-20-33
minus mm-thp-dont-have-to-lock-page-anymore-when-splitting-pmd.patch
plus mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch
plus mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd-fix.patch

is infinitely better than mmotm 2022-03-06-20-33.

I could just about reach a login on yesterday's mmotm, but graphics no.
Couldn't spend time to investigate, but Naresh's mail today that LTP
thp04 timeouted (hey, I'm English, we say timed out!) inspired me to
try booting with transparent_hugepage=never on cmdline, and that worked.

Again, no time to investigate, but the combination above has booted
and is running load, including transparent hugepages.  And before
setting that load going, I did try LTP thp04, which passed.

(There is an unrelated console printk lockdep spew, and am I the only
one to see these mm/workingset.c:567 shadow_lru_isolate warnings that
started up a week or three ago?)

Must dash,
Hugh
On Mon, Mar 7, 2022 at 6:36 PM Hugh Dickins <hughd@google.com> wrote:
>
> On Mon, 7 Mar 2022, Andrew Morton wrote:
> >
> > OK, looks sane.  Can someone please test all this?
>
> Only the briefest of testing, but
>
> mmotm 2022-03-06-20-33
> minus mm-thp-dont-have-to-lock-page-anymore-when-splitting-pmd.patch
> plus mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch
> plus mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd-fix.patch
>
> is infinitely better than mmotm 2022-03-06-20-33.
>
> I could just about reach a login on yesterday's mmotm, but graphics no.
> Couldn't spend time to investigate, but Naresh's mail today that LTP
> thp04 timeouted (hey, I'm English, we say timed out!) inspired me to
> try booting with transparent_hugepage=never on cmdline, and that worked.

I think it was because of
mm-thp-dont-have-to-lock-page-anymore-when-splitting-pmd.patch.  It was
buggy (missed pmd_trans_huge(*pmd) check) and was dropped by Andrew.

>
> Again, no time to investigate, but the combination above has booted
> and is running load, including transparent hugepages.  And before
> setting that load going, I did try LTP thp04, which passed.
>
> (There is an unrelated console printk lockdep spew, and am I the only
> one to see these mm/workingset.c:567 shadow_lru_isolate warnings that
> started up a week or three ago?)
>
> Must dash,
> Hugh
On 08.03.22 00:43, Andrew Morton wrote:
> On Mon, 7 Mar 2022 09:24:58 +0100 David Hildenbrand <david@redhat.com> wrote:
>
>> On 07.03.22 03:07, Andrew Morton wrote:
>>> On Fri, 4 Mar 2022 19:50:08 +0100 David Hildenbrand <david@redhat.com> wrote:
>>>
>>>> @Andrew, the last mail I received was
>>>>
>>>> + mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch
>>>>   added to -mm tree
>>>>
>>>> The patch shows up in mmotm as
>>>>
>>>> #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch
>>>>
>>>> ... which shouldn't be true.
>>>
>>> I guess I mislabelled the reason for dropping it.  Should have been
>>> to-be-updated, due to
>>> https://lkml.kernel.org/r/CAHbLzkpbnQyHRckoRtbZoaLvANu92MY4kEsbKudaQ8MDUA3nVg@mail.gmail.com
>>>
>>
>> Let me clarify.
>>
>> 1. I sent [1] (9 patches)
>>
>> 2. You queued the 9 patches
>>
>>    E.g., in "mmotm 2022-02-15-20-22 uploaded"
>>
>>    * mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch
>>    * mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch
>>    * mm-slightly-clarify-ksm-logic-in-do_swap_page.patch
>>    * mm-streamline-cow-logic-in-do_swap_page.patch
>>    * mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch
>>    * mm-khugepaged-remove-reuse_swap_page-usage.patch
>>    * mm-swapfile-remove-stale-reuse_swap_page.patch
>>    * mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch
>>    * mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch
>>
>> 3. The last patch in the series was dropped.  What remains are 8 patches.
>>
>>    E.g., in "mmotm 2022-02-24-22-38 uploaded"
>>
>>    * mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch
>>    * mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch
>>    * mm-slightly-clarify-ksm-logic-in-do_swap_page.patch
>>    * mm-streamline-cow-logic-in-do_swap_page.patch
>>    * mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch
>>    * mm-khugepaged-remove-reuse_swap_page-usage.patch
>>    * mm-swapfile-remove-stale-reuse_swap_page.patch
>>    * mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch
>>
>> 4. Yang Shi sent his patch (the one we're replying to)
>>
>> 5. You picked his patch and dropped it again due to [2]
>>
>> I'm wondering why 3 happened and why
>> https://www.ozlabs.org/~akpm/mmotm/series contains:
>>
>> mm-optimize-do_wp_page-for-exclusive-pages-in-the-swapcache.patch
>> mm-optimize-do_wp_page-for-fresh-pages-in-local-lru-pagevecs.patch
>> mm-slightly-clarify-ksm-logic-in-do_swap_page.patch
>> mm-streamline-cow-logic-in-do_swap_page.patch
>> mm-huge_memory-streamline-cow-logic-in-do_huge_pmd_wp_page.patch
>> mm-khugepaged-remove-reuse_swap_page-usage.patch
>> mm-swapfile-remove-stale-reuse_swap_page.patch
>> mm-huge_memory-remove-stale-page_trans_huge_mapcount.patch
>> ...
>> #[merged]mm-huge_memory-remove-stale-locking-logic-from-__split_huge_pmd.patch
>
> OK, thanks.  I guess it was me seeing 100% rejects when merging onto
> the folio changes then incorrectly deciding the patch was now in
> linux-next via some other tree.

Thanks Andrew, my 2 cents would have been that my series, which fixes
actual CVEs, should go in before folio cleanups.  But that's a different
discussion (and the patch in question is just a cleanup part of the same
series, so I don't particularly care).

> I restored it and fixed things up.  Please check.

That change looks good to me.  I'd even say that we do the second
cleanup separately, with Yang Shi being the author.  But whatever
you+others prefer.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b49e1a11df2e..daaa698bd273 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2134,8 +2134,6 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 {
         spinlock_t *ptl;
         struct mmu_notifier_range range;
-        bool do_unlock_folio = false;
-        pmd_t _pmd;

         mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm,
                                 address & HPAGE_PMD_MASK,
@@ -2148,48 +2146,16 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
          * pmd against. Otherwise we can end up replacing wrong folio.
          */
         VM_BUG_ON(freeze && !folio);
-        if (folio) {
-                VM_WARN_ON_ONCE(!folio_test_locked(folio));
-                if (folio != page_folio(pmd_page(*pmd)))
-                        goto out;
-        }
+        if (folio && folio != page_folio(pmd_page(*pmd)))
+                goto out;

-repeat:
-        if (pmd_trans_huge(*pmd)) {
-                if (!folio) {
-                        folio = page_folio(pmd_page(*pmd));
-                        /*
-                         * An anonymous page must be locked, to ensure that a
-                         * concurrent reuse_swap_page() sees stable mapcount;
-                         * but reuse_swap_page() is not used on shmem or file,
-                         * and page lock must not be taken when zap_pmd_range()
-                         * calls __split_huge_pmd() while i_mmap_lock is held.
-                         */
-                        if (folio_test_anon(folio)) {
-                                if (unlikely(!folio_trylock(folio))) {
-                                        folio_get(folio);
-                                        _pmd = *pmd;
-                                        spin_unlock(ptl);
-                                        folio_lock(folio);
-                                        spin_lock(ptl);
-                                        if (unlikely(!pmd_same(*pmd, _pmd))) {
-                                                folio_unlock(folio);
-                                                folio_put(folio);
-                                                folio = NULL;
-                                                goto repeat;
-                                        }
-                                        folio_put(folio);
-                                }
-                                do_unlock_folio = true;
-                        }
-                }
-        } else if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
+        if (!(pmd_devmap(*pmd) || is_pmd_migration_entry(*pmd)))
                 goto out;
+
         __split_huge_pmd_locked(vma, pmd, range.start, freeze);
 out:
         spin_unlock(ptl);
-        if (do_unlock_folio)
-                folio_unlock(folio);
+
         /*
          * No need to double call mmu_notifier->invalidate_range() callback.
          * They are 3 cases to consider inside __split_huge_pmd_locked():
The commit c444eb564fb1 ("mm: thp: make the THP mapcount atomic against
__split_huge_pmd_locked()") locked the page for PMD split to make the
mapcount stable for reuse_swap_page(); commit 1c2f67308af4 ("mm: thp:
fix MADV_REMOVE deadlock on shmem THP") later reduced the scope to
anonymous pages only.

However, COW no longer uses the mapcount to determine whether the page
is shared, due to the COW fixes [1] from David Hildenbrand, and
reuse_swap_page() was removed as well.  So PMD split doesn't have to
lock the page anymore.  This patch basically reverts the above two
commits.

[1] https://lore.kernel.org/linux-mm/20220131162940.210846-1-david@redhat.com/

Cc: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
---
 mm/huge_memory.c | 44 +++++---------------------------------------
 1 file changed, 5 insertions(+), 39 deletions(-)