mbox series

[mm-unstable,v5,0/7] Cleanup transhuge_xxx helpers

Message ID 20220616174840.1202070-1-shy828301@gmail.com (mailing list archive)
Headers show
Series Cleanup transhuge_xxx helpers | expand

Message

Yang Shi June 16, 2022, 5:48 p.m. UTC
v5: * Removed transparent_hugepage_active() for !THP, per Zach.
      Patch 4/7 and 5/7 were updated accordingly.
    * Collected review tags.
v4: * Consolidated the transhuge_vma_size_ok() helper proposed in the
      earlier versions into transhuge_vma_suitable(), per Zach.
    * Fixed the regression introduced by patch 3/7, per Zach and Miaohe.
    * Reworded the comment for transhuge_vma_suitable(), per Zach.
    * Removed khugepaged_enter() per Miaohe.
    * More comments for hugepage_vma_check(), per Zach.
    * Squashed patch 4/7 (mm: khugepaged: use transhuge_vma_suitable replace open-code)
      in the earlier version into patch 2/7 of this version.
    * Minor correction to the doc about THPeligible (patch 7/7), so the
      total number of patches is kept 7. 
v3: * Fixed the comment from Willy
v2: * Rebased to the latest mm-unstable
    * Fixed potential regression for smaps's THPeligible

This series is the follow-up of the discussion about cleaning up transhuge_xxx
helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.

THP has a bunch of helpers that do VMA sanity check for different paths, they
do the similar checks for the most callsites and have a lot duplicate codes.
And it is confusing what helpers should be used at what conditions.

This series reorganized and cleaned up the code so that we could consolidate
all the checks into hugepage_vma_check().

The transhuge_vma_enabled(), transparent_hugepage_active() and
__transparent_hugepage_enabled() are killed by this series.


Yang Shi (7):
      mm: khugepaged: check THP flag in hugepage_vma_check()
      mm: thp: consolidate vma size check to transhuge_vma_suitable
      mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate
      mm: thp: kill transparent_hugepage_active()
      mm: thp: kill __transhuge_page_enabled()
      mm: khugepaged: reorg some khugepaged helpers
      doc: proc: fix the description to THPeligible

 Documentation/filesystems/proc.rst |  4 ++-
 fs/proc/task_mmu.c                 |  2 +-
 include/linux/huge_mm.h            | 80 +++++++++++++++++++----------------------------------------
 include/linux/khugepaged.h         | 30 ----------------------
 mm/huge_memory.c                   | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
 mm/khugepaged.c                    | 84 +++++++++++++++++++-------------------------------------------
 mm/memory.c                        |  7 ++++--
 7 files changed, 130 insertions(+), 158 deletions(-)

Comments

Zach O'Keefe June 16, 2022, 11:08 p.m. UTC | #1
On 16 Jun 10:48, Yang Shi wrote:
> 
> v5: * Removed transparent_hugepage_active() for !THP, per Zach.
>       Patch 4/7 and 5/7 were updated accordingly.
>     * Collected review tags.
> v4: * Consolidated the transhuge_vma_size_ok() helper proposed in the
>       earlier versions into transhuge_vma_suitable(), per Zach.
>     * Fixed the regression introduced by patch 3/7, per Zach and Miaohe.
>     * Reworded the comment for transhuge_vma_suitable(), per Zach.
>     * Removed khugepaged_enter() per Miaohe.
>     * More comments for hugepage_vma_check(), per Zach.
>     * Squashed patch 4/7 (mm: khugepaged: use transhuge_vma_suitable replace open-code)
>       in the earlier version into patch 2/7 of this version.
>     * Minor correction to the doc about THPeligible (patch 7/7), so the
>       total number of patches is kept 7. 
> v3: * Fixed the comment from Willy
> v2: * Rebased to the latest mm-unstable
>     * Fixed potential regression for smaps's THPeligible
> 
> This series is the follow-up of the discussion about cleaning up transhuge_xxx
> helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.
> 
> THP has a bunch of helpers that do VMA sanity check for different paths, they
> do the similar checks for the most callsites and have a lot duplicate codes.
> And it is confusing what helpers should be used at what conditions.
> 
> This series reorganized and cleaned up the code so that we could consolidate
> all the checks into hugepage_vma_check().
> 
> The transhuge_vma_enabled(), transparent_hugepage_active() and
> __transparent_hugepage_enabled() are killed by this series.
> 
> 
> Yang Shi (7):
>       mm: khugepaged: check THP flag in hugepage_vma_check()
>       mm: thp: consolidate vma size check to transhuge_vma_suitable
>       mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate
>       mm: thp: kill transparent_hugepage_active()
>       mm: thp: kill __transhuge_page_enabled()
>       mm: khugepaged: reorg some khugepaged helpers
>       doc: proc: fix the description to THPeligible
> 
>  Documentation/filesystems/proc.rst |  4 ++-
>  fs/proc/task_mmu.c                 |  2 +-
>  include/linux/huge_mm.h            | 80 +++++++++++++++++++----------------------------------------
>  include/linux/khugepaged.h         | 30 ----------------------
>  mm/huge_memory.c                   | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
>  mm/khugepaged.c                    | 84 +++++++++++++++++++-------------------------------------------
>  mm/memory.c                        |  7 ++++--
>  7 files changed, 130 insertions(+), 158 deletions(-)
> 

Series LGTM. Thanks for these cleanups, Yang!

Best,
Zach
Yang Shi June 17, 2022, 5:54 p.m. UTC | #2
On Thu, Jun 16, 2022 at 4:08 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> On 16 Jun 10:48, Yang Shi wrote:
> >
> > v5: * Removed transparent_hugepage_active() for !THP, per Zach.
> >       Patch 4/7 and 5/7 were updated accordingly.
> >     * Collected review tags.
> > v4: * Consolidated the transhuge_vma_size_ok() helper proposed in the
> >       earlier versions into transhuge_vma_suitable(), per Zach.
> >     * Fixed the regression introduced by patch 3/7, per Zach and Miaohe.
> >     * Reworded the comment for transhuge_vma_suitable(), per Zach.
> >     * Removed khugepaged_enter() per Miaohe.
> >     * More comments for hugepage_vma_check(), per Zach.
> >     * Squashed patch 4/7 (mm: khugepaged: use transhuge_vma_suitable replace open-code)
> >       in the earlier version into patch 2/7 of this version.
> >     * Minor correction to the doc about THPeligible (patch 7/7), so the
> >       total number of patches is kept 7.
> > v3: * Fixed the comment from Willy
> > v2: * Rebased to the latest mm-unstable
> >     * Fixed potential regression for smaps's THPeligible
> >
> > This series is the follow-up of the discussion about cleaning up transhuge_xxx
> > helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.
> >
> > THP has a bunch of helpers that do VMA sanity check for different paths, they
> > do the similar checks for the most callsites and have a lot duplicate codes.
> > And it is confusing what helpers should be used at what conditions.
> >
> > This series reorganized and cleaned up the code so that we could consolidate
> > all the checks into hugepage_vma_check().
> >
> > The transhuge_vma_enabled(), transparent_hugepage_active() and
> > __transparent_hugepage_enabled() are killed by this series.
> >
> >
> > Yang Shi (7):
> >       mm: khugepaged: check THP flag in hugepage_vma_check()
> >       mm: thp: consolidate vma size check to transhuge_vma_suitable
> >       mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate
> >       mm: thp: kill transparent_hugepage_active()
> >       mm: thp: kill __transhuge_page_enabled()
> >       mm: khugepaged: reorg some khugepaged helpers
> >       doc: proc: fix the description to THPeligible
> >
> >  Documentation/filesystems/proc.rst |  4 ++-
> >  fs/proc/task_mmu.c                 |  2 +-
> >  include/linux/huge_mm.h            | 80 +++++++++++++++++++----------------------------------------
> >  include/linux/khugepaged.h         | 30 ----------------------
> >  mm/huge_memory.c                   | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++--------
> >  mm/khugepaged.c                    | 84 +++++++++++++++++++-------------------------------------------
> >  mm/memory.c                        |  7 ++++--
> >  7 files changed, 130 insertions(+), 158 deletions(-)
> >
>
> Series LGTM. Thanks for these cleanups, Yang!

Thank you so much for reviewing the patches.

>
> Best,
> Zach
Mike Kravetz June 21, 2022, 9:07 p.m. UTC | #3
On 06/16/22 10:48, Yang Shi wrote:
> This series is the follow-up of the discussion about cleaning up transhuge_xxx
> helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.
> 
> THP has a bunch of helpers that do VMA sanity check for different paths, they
> do the similar checks for the most callsites and have a lot duplicate codes.
> And it is confusing what helpers should be used at what conditions.
> 
> This series reorganized and cleaned up the code so that we could consolidate
> all the checks into hugepage_vma_check().
> 
> The transhuge_vma_enabled(), transparent_hugepage_active() and
> __transparent_hugepage_enabled() are killed by this series.

Running libhugetlbfs tests on next-20220621 produces the following:

[   77.436038] BUG: kernel NULL pointer dereference, address: 0000000000000378
[   77.437278] #PF: supervisor read access in kernel mode
[   77.438211] #PF: error_code(0x0000) - not-present page
[   77.439097] PGD 800000017a1a6067 P4D 800000017a1a6067 PUD 17f3b9067 PMD 0 
[   77.440021] Oops: 0000 [#7] PREEMPT SMP PTI
[   77.440635] CPU: 1 PID: 2720 Comm: get_huge_pages Tainted: G      D           5.19.0-rc3-next-20220621+ #22
[   77.441973] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
[   77.443115] RIP: 0010:hugepage_vma_check+0x15/0x170
[   77.444021] Code: 01 e9 84 fd ff ff 48 89 d8 e9 14 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f7 c6 00 00 00 40 0f 85 fe 00 00 00 48 8b 47 10 <48> 8b 80 78 03 00 00 48 c1 e8 18 83 e0 01 0f 85 e6 00 00 00 4c 8b
[   77.447327] RSP: 0018:ffffc900039dfd20 EFLAGS: 00010246
[   77.448317] RAX: 0000000000000000 RBX: ffff88817e4b27a0 RCX: 0000000000000000
[   77.449681] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffffff823f2000
[   77.451040] RBP: ffffffff823f2000 R08: 0000000000000008 R09: ffff8881a3c042e6
[   77.452353] R10: 00007ffe8b341000 R11: ffff8881a3c04526 R12: ffff88817e4b27a0
[   77.453677] R13: ffffc900039dfd28 R14: ffff88817e4b27c8 R15: ffffffff823f2000
[   77.455046] FS:  00007f0edc9880c0(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
[   77.456625] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   77.457745] CR2: 0000000000000378 CR3: 0000000179394006 CR4: 0000000000370ee0
[   77.459936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   77.461308] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   77.462477] Call Trace:
[   77.462950]  <TASK>
[   77.463402]  show_smap+0xed/0x1c0
[   77.464019]  seq_read_iter+0x2af/0x480
[   77.464674]  seq_read+0xeb/0x120
[   77.465286]  vfs_read+0x97/0x190
[   77.465880]  ksys_read+0x5f/0xe0
[   77.466488]  do_syscall_64+0x3b/0x90
[   77.467155]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[   77.468023] RIP: 0033:0x7f0edca7ade2
[   77.468609] Code: c0 e9 b2 fe ff ff 50 48 8d 3d b2 3f 0a 00 e8 05 f0 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[   77.471492] RSP: 002b:00007ffe8b324c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[   77.472845] RAX: ffffffffffffffda RBX: 00000000022f82a0 RCX: 00007f0edca7ade2
[   77.474017] RDX: 0000000000000400 RSI: 00000000022f8480 RDI: 0000000000000003
[   77.475184] RBP: 00007f0edcb4f320 R08: 0000000000000003 R09: 0000000000000000
[   77.476379] R10: 00007f0edcaffac0 R11: 0000000000000246 R12: 00000000022f82a0
[   77.477517] R13: 0000000000000d68 R14: 00007f0edcb4e720 R15: 0000000000000d68
[   77.478590]  </TASK>
[   77.479012] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm joydev snd_timer 9p netfs 9pnet_virtio snd soundcore virtio_balloon 9pnet virtio_console virtio_blk virtio_net net_failover failover crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio virtio_ring virtio_pci_legacy_dev virtio_pci_modern_dev fuse
[   77.484573] CR2: 0000000000000378
[   77.485123] ---[ end trace 0000000000000000 ]---


Looks to be related to this series.  I'll start debugging unless someone
knows what the issue may be.
Zach O'Keefe June 21, 2022, 10:43 p.m. UTC | #4
On 21 Jun 14:07, Mike Kravetz wrote:
> On 06/16/22 10:48, Yang Shi wrote:
> > This series is the follow-up of the discussion about cleaning up transhuge_xxx
> > helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.
> > 
> > THP has a bunch of helpers that do VMA sanity check for different paths, they
> > do the similar checks for the most callsites and have a lot duplicate codes.
> > And it is confusing what helpers should be used at what conditions.
> > 
> > This series reorganized and cleaned up the code so that we could consolidate
> > all the checks into hugepage_vma_check().
> > 
> > The transhuge_vma_enabled(), transparent_hugepage_active() and
> > __transparent_hugepage_enabled() are killed by this series.
> 
> Running libhugetlbfs tests on next-20220621 produces the following:
> 
> [   77.436038] BUG: kernel NULL pointer dereference, address: 0000000000000378
> [   77.437278] #PF: supervisor read access in kernel mode
> [   77.438211] #PF: error_code(0x0000) - not-present page
> [   77.439097] PGD 800000017a1a6067 P4D 800000017a1a6067 PUD 17f3b9067 PMD 0 
> [   77.440021] Oops: 0000 [#7] PREEMPT SMP PTI
> [   77.440635] CPU: 1 PID: 2720 Comm: get_huge_pages Tainted: G      D           5.19.0-rc3-next-20220621+ #22
> [   77.441973] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
> [   77.443115] RIP: 0010:hugepage_vma_check+0x15/0x170
> [   77.444021] Code: 01 e9 84 fd ff ff 48 89 d8 e9 14 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 f7 c6 00 00 00 40 0f 85 fe 00 00 00 48 8b 47 10 <48> 8b 80 78 03 00 00 48 c1 e8 18 83 e0 01 0f 85 e6 00 00 00 4c 8b
> [   77.447327] RSP: 0018:ffffc900039dfd20 EFLAGS: 00010246
> [   77.448317] RAX: 0000000000000000 RBX: ffff88817e4b27a0 RCX: 0000000000000000
> [   77.449681] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffffff823f2000
> [   77.451040] RBP: ffffffff823f2000 R08: 0000000000000008 R09: ffff8881a3c042e6
> [   77.452353] R10: 00007ffe8b341000 R11: ffff8881a3c04526 R12: ffff88817e4b27a0
> [   77.453677] R13: ffffc900039dfd28 R14: ffff88817e4b27c8 R15: ffffffff823f2000
> [   77.455046] FS:  00007f0edc9880c0(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
> [   77.456625] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   77.457745] CR2: 0000000000000378 CR3: 0000000179394006 CR4: 0000000000370ee0
> [   77.459936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   77.461308] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   77.462477] Call Trace:
> [   77.462950]  <TASK>
> [   77.463402]  show_smap+0xed/0x1c0
> [   77.464019]  seq_read_iter+0x2af/0x480
> [   77.464674]  seq_read+0xeb/0x120
> [   77.465286]  vfs_read+0x97/0x190
> [   77.465880]  ksys_read+0x5f/0xe0
> [   77.466488]  do_syscall_64+0x3b/0x90
> [   77.467155]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [   77.468023] RIP: 0033:0x7f0edca7ade2
> [   77.468609] Code: c0 e9 b2 fe ff ff 50 48 8d 3d b2 3f 0a 00 e8 05 f0 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
> [   77.471492] RSP: 002b:00007ffe8b324c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> [   77.472845] RAX: ffffffffffffffda RBX: 00000000022f82a0 RCX: 00007f0edca7ade2
> [   77.474017] RDX: 0000000000000400 RSI: 00000000022f8480 RDI: 0000000000000003
> [   77.475184] RBP: 00007f0edcb4f320 R08: 0000000000000003 R09: 0000000000000000
> [   77.476379] R10: 00007f0edcaffac0 R11: 0000000000000246 R12: 00000000022f82a0
> [   77.477517] R13: 0000000000000d68 R14: 00007f0edcb4e720 R15: 0000000000000d68
> [   77.478590]  </TASK>
> [   77.479012] Modules linked in: rfkill ip6table_filter ip6_tables sunrpc snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm joydev snd_timer 9p netfs 9pnet_virtio snd soundcore virtio_balloon 9pnet virtio_console virtio_blk virtio_net net_failover failover crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw virtio_pci virtio virtio_ring virtio_pci_legacy_dev virtio_pci_modern_dev fuse
> [   77.484573] CR2: 0000000000000378
> [   77.485123] ---[ end trace 0000000000000000 ]---
> 
> 
> Looks to be related to this series.  I'll start debugging unless someone
> knows what the issue may be.
> -- 
> Mike Kravetz

Thanks Mike,

I posted the issue in response to '[v5 PATCH 4/7] mm: thp: kill
transparent_hugepage_active()' of this series[1], where the issue was
introduced.

TLDR: we need to check vma->vm_mm because for vDSO vma, this is NULL.


[1] https://lore.kernel.org/linux-mm/YrIU2iP0H5LQbV7R@google.com/
Andrew Morton July 3, 2022, 11:14 p.m. UTC | #5
On Thu, 16 Jun 2022 10:48:33 -0700 Yang Shi <shy828301@gmail.com> wrote:

> This series is the follow-up of the discussion about cleaning up transhuge_xxx
> helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.
> 
> THP has a bunch of helpers that do VMA sanity check for different paths, they
> do the similar checks for the most callsites and have a lot duplicate codes.
> And it is confusing what helpers should be used at what conditions.
> 
> This series reorganized and cleaned up the code so that we could consolidate
> all the checks into hugepage_vma_check().
> 
> The transhuge_vma_enabled(), transparent_hugepage_active() and
> __transparent_hugepage_enabled() are killed by this series.

I plan to move this into mm-stable later this week, along with two fixups:

https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-thp-kill-transparent_hugepage_active-fix.patch
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-thp-kill-transparent_hugepage_active-fix-fix.patch
Yang Shi July 5, 2022, 7:45 p.m. UTC | #6
On Sun, Jul 3, 2022 at 4:14 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 16 Jun 2022 10:48:33 -0700 Yang Shi <shy828301@gmail.com> wrote:
>
> > This series is the follow-up of the discussion about cleaning up transhuge_xxx
> > helpers at https://lore.kernel.org/linux-mm/627a71f8-e879-69a5-ceb3-fc8d29d2f7f1@suse.cz/.
> >
> > THP has a bunch of helpers that do VMA sanity check for different paths, they
> > do the similar checks for the most callsites and have a lot duplicate codes.
> > And it is confusing what helpers should be used at what conditions.
> >
> > This series reorganized and cleaned up the code so that we could consolidate
> > all the checks into hugepage_vma_check().
> >
> > The transhuge_vma_enabled(), transparent_hugepage_active() and
> > __transparent_hugepage_enabled() are killed by this series.
>
> I plan to move this into mm-stable later this week, along with two fixups:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-thp-kill-transparent_hugepage_active-fix.patch
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-thp-kill-transparent_hugepage_active-fix-fix.patch

Looks good to me. Thanks for the heads up.