[RESEND] mm/hotplug: fix notification in offline error path

Message ID 20190320204255.53571-1-cai@lca.pw
State New

Commit Message

Qian Cai March 20, 2019, 8:42 p.m. UTC
When start_isolate_page_range() returns -EBUSY in __offline_pages(), the
error path calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an
uninitialized "arg". As a result, it triggers the warning below. Also, it
is only necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE.

page:ffffea0001200000 count:1 mapcount:0 mapping:0000000000000000
index:0x0
flags: 0x3fffe000001000(reserved)
raw: 003fffe000001000 ffffea0001200008 ffffea0001200008 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
WARNING: CPU: 25 PID: 1665 at mm/kasan/common.c:665
kasan_mem_notifier+0x34/0x23b
CPU: 25 PID: 1665 Comm: bash Tainted: G        W         5.0.0+ #94
Hardware name: HP ProLiant DL180 Gen9/ProLiant DL180 Gen9, BIOS U20
10/25/2017
RIP: 0010:kasan_mem_notifier+0x34/0x23b
RSP: 0018:ffff8883ec737890 EFLAGS: 00010206
RAX: 0000000000000246 RBX: ff10f0f4435f1000 RCX: f887a7a21af88000
RDX: dffffc0000000000 RSI: 0000000000000020 RDI: ffff8881f221af88
RBP: ffff8883ec737898 R08: ffff888000000000 R09: ffffffffb0bddcd0
R10: ffffed103e857088 R11: ffff8881f42b8443 R12: dffffc0000000000
R13: 00000000fffffff9 R14: dffffc0000000000 R15: 0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000560fbd31d730 CR3: 00000004049c6003 CR4: 00000000001606a0
Call Trace:
 notifier_call_chain+0xbf/0x130
 __blocking_notifier_call_chain+0x76/0xc0
 blocking_notifier_call_chain+0x16/0x20
 memory_notify+0x1b/0x20
 __offline_pages+0x3e2/0x1210
 offline_pages+0x11/0x20
 memory_block_action+0x144/0x300
 memory_subsys_offline+0xe5/0x170
 device_offline+0x13f/0x1e0
 state_store+0xeb/0x110
 dev_attr_store+0x3f/0x70
 sysfs_kf_write+0x104/0x150
 kernfs_fop_write+0x25c/0x410
 __vfs_write+0x66/0x120
 vfs_write+0x15a/0x4f0
 ksys_write+0xd2/0x1b0
 __x64_sys_write+0x73/0xb0
 do_syscall_64+0xeb/0xb78
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f14f75cc3b8
RSP: 002b:00007ffe84d01d68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f14f75cc3b8
RDX: 0000000000000008 RSI: 0000563f8e433d70 RDI: 0000000000000001
RBP: 0000563f8e433d70 R08: 000000000000000a R09: 00007ffe84d018f0
R10: 000000000000000a R11: 0000000000000246 R12: 00007f14f789e780
R13: 0000000000000008 R14: 00007f14f7899740 R15: 0000000000000008

Fixes: 7960509329c2 ("mm, memory_hotplug: print reason for the offlining failure")
CC: stable@vger.kernel.org # 5.0.x
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Qian Cai <cai@lca.pw>
---
 mm/memory_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Souptick Joarder March 22, 2019, 6:50 a.m. UTC | #1
On Thu, Mar 21, 2019 at 2:13 AM Qian Cai <cai@lca.pw> wrote:
>
> When start_isolate_page_range() returns -EBUSY in __offline_pages(), the
> error path calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an
> uninitialized "arg". As a result, it triggers the warning below. Also, it
> is only necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE.

For my own clarification: if test_pages_in_a_zone() fails in
__offline_pages(), we have a similar scenario as well. If so, do we need
to capture it in the change log?

Michal Hocko March 22, 2019, 8:22 a.m. UTC | #2
On Fri 22-03-19 12:20:12, Souptick Joarder wrote:
> On Thu, Mar 21, 2019 at 2:13 AM Qian Cai <cai@lca.pw> wrote:
> >
> > When start_isolate_page_range() returns -EBUSY in __offline_pages(), the
> > error path calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an
> > uninitialized "arg". As a result, it triggers the warning below. Also, it
> > is only necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE.
> 
> For my own clarification: if test_pages_in_a_zone() fails in
> __offline_pages(), we have a similar scenario as well. If so, do we need
> to capture it in the change log?

Yes, this is the same situation. We can add a note that the same applies
to the test_pages_in_a_zone() failure path, but I do not think it is
strictly necessary. Thanks for the note anyway.

Patch

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0e0a16021fd5..0082d699be94 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1699,12 +1699,12 @@  static int __ref __offline_pages(unsigned long start_pfn,
 
 failed_removal_isolated:
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 failed_removal:
 	pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
 		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
 		 reason);
-	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 	/* pushback to free area */
 	mem_hotplug_done();
 	return ret;
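
The effect of moving the memory_notify() call above the failed_removal
label can be illustrated with a small user-space model. This is only a
sketch and not kernel code: offline_model(), notify(), struct notify_arg,
the fail_at enum, the "patched" switch and the pfn values are all
hypothetical stand-ins invented for illustration, and the model assumes
that the argument is filled in only after the range has been isolated,
as the commit message describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MEM_GOING_OFFLINE / MEM_CANCEL_OFFLINE events. */
enum event { GOING_OFFLINE, CANCEL_OFFLINE };

/* Hypothetical stand-in for the notifier argument; "filled" records
 * whether the caller has populated it yet. */
struct notify_arg {
	bool filled;
	unsigned long start_pfn;
	unsigned long nr_pages;
};

/* Point at which the simulated offline attempt fails (or not at all). */
enum fail_at { FAIL_NONE, FAIL_ZONE_TEST, FAIL_ISOLATE, FAIL_NOTIFIER };

int cancels_seen;      /* CANCEL_OFFLINE deliveries */
int bad_cancels_seen;  /* deliveries that carried an unfilled argument */

static void notify(enum event ev, const struct notify_arg *arg)
{
	assert(arg != NULL);
	if (ev != CANCEL_OFFLINE)
		return;
	cancels_seen++;
	if (!arg->filled)
		bad_cancels_seen++;
}

/* patched == false models the old label layout, true the new one. */
int offline_model(enum fail_at where, bool patched)
{
	/* Explicitly zeroed so the model itself has no undefined behaviour. */
	struct notify_arg arg = { .filled = false };
	int ret = -EBUSY; /* single error code used by this model */

	/* Both early failures happen before the argument is set up. */
	if (where == FAIL_ZONE_TEST)
		goto failed_removal;
	if (where == FAIL_ISOLATE)
		goto failed_removal;

	arg.start_pfn = 0x48000; /* arbitrary illustrative values */
	arg.nr_pages = 0x8000;
	arg.filled = true;
	notify(GOING_OFFLINE, &arg);

	if (where == FAIL_NOTIFIER)
		goto failed_removal_isolated;
	return 0; /* offlining succeeded in this model */

failed_removal_isolated:
	/* the real code undoes the page isolation here */
	if (patched)
		notify(CANCEL_OFFLINE, &arg);
failed_removal:
	if (!patched)
		notify(CANCEL_OFFLINE, &arg);
	return ret;
}
```

In this model, offline_model(FAIL_ISOLATE, false) and
offline_model(FAIL_ZONE_TEST, false) each deliver one cancel notification
with an unfilled argument, whereas the patched layout delivers none for
either early failure. A FAIL_NOTIFIER failure delivers exactly one cancel
with a filled argument in both layouts, so listeners that saw the
going-offline event still get their cancellation.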