[2/2,RESEND] mm/huge_memory: do not overkill when splitting huge_zero_page

Message ID f35f8b97377d5d3ede1bc5ac3114da888c57cbce.1651052574.git.xuyu@linux.alibaba.com (mailing list archive)
State New

Commit Message

Xu Yu April 27, 2022, 9:44 a.m. UTC
The kernel panics when a memory failure is injected for the global
huge_zero_page with CONFIG_DEBUG_VM enabled, as follows.

  Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
  page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
  head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
  flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
  raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
  raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
  page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
  ------------[ cut here ]------------
  kernel BUG at mm/huge_memory.c:2499!
  invalid opcode: 0000 [#1] PREEMPT SMP PTI
  CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
  Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
  RIP: 0010:split_huge_page_to_list+0x66a/0x880
  Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
  RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
  RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
  RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
  R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
  R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
  FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
  try_to_split_thp_page+0x3a/0x130
  memory_failure+0x128/0x800
  madvise_inject_error.cold+0x8b/0xa1
  __x64_sys_madvise+0x54/0x60
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x7fc3754f8bf9
  Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
  RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
  RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
  RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
  R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
  R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000

We think that raising a BUG is overkill when splitting the
huge_zero_page: the huge_zero_page cannot be reached from normal paths
other than memory failure, but memory failure is a valid caller. So
replace the BUG with a WARN plus returning -EBUSY, so that the panic
above won't happen again.

Suggested-by: Yang Shi <shy828301@gmail.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
---
 mm/huge_memory.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Yang Shi April 27, 2022, 9:15 p.m. UTC | #1
On Wed, Apr 27, 2022 at 2:45 AM Xu Yu <xuyu@linux.alibaba.com> wrote:
>
> Kernel panic when injecting memory_failure for the global
> huge_zero_page, when CONFIG_DEBUG_VM is enabled, as follows.
>
>   Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
>   page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
>   head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
>   flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
>   raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
>   raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
>   page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
>   ------------[ cut here ]------------
>   kernel BUG at mm/huge_memory.c:2499!
>   invalid opcode: 0000 [#1] PREEMPT SMP PTI
>   CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
>   Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
>   RIP: 0010:split_huge_page_to_list+0x66a/0x880
>   Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
>   RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
>   RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
>   RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
>   RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
>   R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
>   R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
>   FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>   try_to_split_thp_page+0x3a/0x130
>   memory_failure+0x128/0x800
>   madvise_inject_error.cold+0x8b/0xa1
>   __x64_sys_madvise+0x54/0x60
>   do_syscall_64+0x35/0x80
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>   RIP: 0033:0x7fc3754f8bf9
>   Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
>   RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
>   RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
>   RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
>   RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
>   R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000
>
> We think that raising BUG is overkilling for splitting huge_zero_page,
> the huge_zero_page can't be met from normal paths other than memory
> failure, but memory failure is a valid caller. So we tend to replace the
> BUG to WARN + returning -EBUSY, and thus the panic above won't happen
> again.

Reviewed-by: Yang Shi <shy828301@gmail.com>

>
> Suggested-by: Yang Shi <shy828301@gmail.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
> ---
>  mm/huge_memory.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index c468fee595ff..910a138e9859 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2495,11 +2495,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>         struct address_space *mapping = NULL;
>         int extra_pins, ret;
>         pgoff_t end;
> +       bool is_hzp;
>
> -       VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
>         VM_BUG_ON_PAGE(!PageLocked(head), head);
>         VM_BUG_ON_PAGE(!PageCompound(head), head);
>
> +       is_hzp = is_huge_zero_page(head);
> +       VM_WARN_ON_ONCE_PAGE(is_hzp, head);
> +       if (is_hzp)
> +               return -EBUSY;
> +
>         if (PageWriteback(head))
>                 return -EBUSY;
>
> --
> 2.20.1.2432.ga663e714
>
Miaohe Lin April 28, 2022, 2:25 a.m. UTC | #2
On 2022/4/27 17:44, Xu Yu wrote:
> Kernel panic when injecting memory_failure for the global
> huge_zero_page, when CONFIG_DEBUG_VM is enabled, as follows.
> 
>   Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
>   page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
>   head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
>   flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
>   raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
>   raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
>   page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
>   ------------[ cut here ]------------
>   kernel BUG at mm/huge_memory.c:2499!
>   invalid opcode: 0000 [#1] PREEMPT SMP PTI
>   CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
>   Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
>   RIP: 0010:split_huge_page_to_list+0x66a/0x880
>   Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
>   RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
>   RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
>   RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
>   RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
>   R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
>   R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
>   FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>   try_to_split_thp_page+0x3a/0x130
>   memory_failure+0x128/0x800
>   madvise_inject_error.cold+0x8b/0xa1
>   __x64_sys_madvise+0x54/0x60
>   do_syscall_64+0x35/0x80
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>   RIP: 0033:0x7fc3754f8bf9
>   Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
>   RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
>   RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
>   RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
>   RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
>   R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000
> 
> We think that raising BUG is overkilling for splitting huge_zero_page,
> the huge_zero_page can't be met from normal paths other than memory
> failure, but memory failure is a valid caller. So we tend to replace the
> BUG to WARN + returning -EBUSY, and thus the panic above won't happen
> again.
> 
> Suggested-by: Yang Shi <shy828301@gmail.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>

LGTM. Thanks!

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> ---
>  mm/huge_memory.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index c468fee595ff..910a138e9859 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2495,11 +2495,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  	struct address_space *mapping = NULL;
>  	int extra_pins, ret;
>  	pgoff_t end;
> +	bool is_hzp;
>  
> -	VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
>  	VM_BUG_ON_PAGE(!PageLocked(head), head);
>  	VM_BUG_ON_PAGE(!PageCompound(head), head);
>  
> +	is_hzp = is_huge_zero_page(head);
> +	VM_WARN_ON_ONCE_PAGE(is_hzp, head);
> +	if (is_hzp)
> +		return -EBUSY;
> +
>  	if (PageWriteback(head))
>  		return -EBUSY;
>  
>
David Hildenbrand April 28, 2022, 4:04 p.m. UTC | #3
On 27.04.22 11:44, Xu Yu wrote:
> Kernel panic when injecting memory_failure for the global
> huge_zero_page, when CONFIG_DEBUG_VM is enabled, as follows.
> 
>   Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
>   page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
>   head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
>   flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
>   raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
>   raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
>   page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
>   ------------[ cut here ]------------
>   kernel BUG at mm/huge_memory.c:2499!
>   invalid opcode: 0000 [#1] PREEMPT SMP PTI
>   CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
>   Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
>   RIP: 0010:split_huge_page_to_list+0x66a/0x880
>   Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
>   RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
>   RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
>   RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
>   RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
>   R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
>   R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
>   FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>   try_to_split_thp_page+0x3a/0x130
>   memory_failure+0x128/0x800
>   madvise_inject_error.cold+0x8b/0xa1
>   __x64_sys_madvise+0x54/0x60
>   do_syscall_64+0x35/0x80
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
>   RIP: 0033:0x7fc3754f8bf9
>   Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
>   RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
>   RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
>   RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
>   RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
>   R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000
> 
> We think that raising BUG is overkilling for splitting huge_zero_page,
> the huge_zero_page can't be met from normal paths other than memory
> failure, but memory failure is a valid caller. So we tend to replace the
> BUG to WARN + returning -EBUSY, and thus the panic above won't happen
> again.
> 
> Suggested-by: Yang Shi <shy828301@gmail.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Reported-by: kernel test robot <lkp@intel.com>
> Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
> ---
>  mm/huge_memory.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index c468fee595ff..910a138e9859 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2495,11 +2495,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  	struct address_space *mapping = NULL;
>  	int extra_pins, ret;
>  	pgoff_t end;
> +	bool is_hzp;
>  
> -	VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
>  	VM_BUG_ON_PAGE(!PageLocked(head), head);
>  	VM_BUG_ON_PAGE(!PageCompound(head), head);
>  
> +	is_hzp = is_huge_zero_page(head);
> +	VM_WARN_ON_ONCE_PAGE(is_hzp, head);

If this code is valid to be reached, VM_WARN_ON_ONCE_PAGE is most
probably the wrong choice.

IIUC, after patch #1 (revert) we can reach this again?
Yang Shi April 28, 2022, 5:18 p.m. UTC | #4
On Thu, Apr 28, 2022 at 9:04 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 27.04.22 11:44, Xu Yu wrote:
> > Kernel panic when injecting memory_failure for the global
> > huge_zero_page, when CONFIG_DEBUG_VM is enabled, as follows.
> >
> >   Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
> >   page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
> >   head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
> >   flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
> >   raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
> >   raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
> >   page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
> >   ------------[ cut here ]------------
> >   kernel BUG at mm/huge_memory.c:2499!
> >   invalid opcode: 0000 [#1] PREEMPT SMP PTI
> >   CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
> >   Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
> >   RIP: 0010:split_huge_page_to_list+0x66a/0x880
> >   Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
> >   RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
> >   RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
> >   RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
> >   RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
> >   R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
> >   R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
> >   FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
> >   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >   CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
> >   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >   Call Trace:
> >   try_to_split_thp_page+0x3a/0x130
> >   memory_failure+0x128/0x800
> >   madvise_inject_error.cold+0x8b/0xa1
> >   __x64_sys_madvise+0x54/0x60
> >   do_syscall_64+0x35/0x80
> >   entry_SYSCALL_64_after_hwframe+0x44/0xae
> >   RIP: 0033:0x7fc3754f8bf9
> >   Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
> >   RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
> >   RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
> >   RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
> >   RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
> >   R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
> >   R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000
> >
> > We think that raising BUG is overkilling for splitting huge_zero_page,
> > the huge_zero_page can't be met from normal paths other than memory
> > failure, but memory failure is a valid caller. So we tend to replace the
> > BUG to WARN + returning -EBUSY, and thus the panic above won't happen
> > again.
> >
> > Suggested-by: Yang Shi <shy828301@gmail.com>
> > Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Reported-by: kernel test robot <lkp@intel.com>
> > Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
> > ---
> >  mm/huge_memory.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index c468fee595ff..910a138e9859 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2495,11 +2495,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> >       struct address_space *mapping = NULL;
> >       int extra_pins, ret;
> >       pgoff_t end;
> > +     bool is_hzp;
> >
> > -     VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
> >       VM_BUG_ON_PAGE(!PageLocked(head), head);
> >       VM_BUG_ON_PAGE(!PageCompound(head), head);
> >
> > +     is_hzp = is_huge_zero_page(head);
> > +     VM_WARN_ON_ONCE_PAGE(is_hzp, head);
>
> If this code is valid to be reached, VM_WARN_ON_ONCE_PAGE is most
> probably the wrong choice.

Only from the memory failure path; any other path is invalid. The
warning is mainly there to catch the invalid cases. In real life, a
memory failure on the huge zero page should be rare.

>
> IIUC, after patch #1 (revert) we can reach this again?
>
> --
> Thanks,
>
> David / dhildenb
>

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index c468fee595ff..910a138e9859 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2495,11 +2495,16 @@  int split_huge_page_to_list(struct page *page, struct list_head *list)
 	struct address_space *mapping = NULL;
 	int extra_pins, ret;
 	pgoff_t end;
+	bool is_hzp;
 
-	VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
 	VM_BUG_ON_PAGE(!PageLocked(head), head);
 	VM_BUG_ON_PAGE(!PageCompound(head), head);
 
+	is_hzp = is_huge_zero_page(head);
+	VM_WARN_ON_ONCE_PAGE(is_hzp, head);
+	if (is_hzp)
+		return -EBUSY;
+
 	if (PageWriteback(head))
 		return -EBUSY;
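
The control flow the patch introduces (warn once, then fail the split
with -EBUSY instead of taking a BUG) can be modelled in userspace. This
is only a sketch and not kernel code: `struct fake_page`,
`MODEL_WARN_ON_ONCE`, `warn_count` and `model_split_huge_page` are
hypothetical names invented for illustration. In the kernel,
VM_WARN_ON_ONCE_PAGE only emits a warning when CONFIG_DEBUG_VM is
enabled, while the -EBUSY return added by the patch is unconditional.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page; only what this sketch needs. */
struct fake_page {
	bool is_huge_zero;
};

/* Counts how many times the warning was actually printed. */
static int warn_count;

/*
 * Userspace approximation of a warn-once macro: the first hit at this
 * call site is reported, later hits are silent. Nothing is aborted.
 */
#define MODEL_WARN_ON_ONCE(cond, msg) do {			\
	static bool warned;					\
	if ((cond) && !warned) {				\
		warned = true;					\
		warn_count++;					\
		fprintf(stderr, "WARNING: %s\n", (msg));	\
	}							\
} while (0)

/*
 * Model of the patched entry check: splitting the huge zero page is
 * refused with -EBUSY, any other page "splits" successfully (0).
 */
static int model_split_huge_page(const struct fake_page *head)
{
	bool is_hzp = head->is_huge_zero;

	MODEL_WARN_ON_ONCE(is_hzp, "attempt to split the huge zero page");
	if (is_hzp)
		return -EBUSY;

	return 0;
}
```

Calling model_split_huge_page() twice on a page with is_huge_zero set
returns -EBUSY both times but prints the warning only once, which is
the behaviour a caller such as the memory failure path would see: an
ordinary error return it can handle rather than a panic.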