[-next] mm/hotplug: skip bad PFNs from pfn_to_online_page()

Message ID 1560366952-10660-1-git-send-email-cai@lca.pw (mailing list archive)
State New, archived

Commit Message

Qian Cai June 12, 2019, 7:15 p.m. UTC
The linux-next commit "mm/sparsemem: Add helpers track active portions
of a section at boot" [1] causes the crash below when the first kmemleak
scan kthread kicks in. This is because kmemleak_scan() calls
pfn_to_online_page(), which calls pfn_valid_within() instead of
pfn_valid() on x86 due to CONFIG_HOLES_IN_ZONE=n.

The commit [1] did add an additional check of pfn_section_valid() in
pfn_valid(), but forgot to add it in the above code path.
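
The gap described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every model_* name, the tiny section size and the per-PFN bitmap are hypothetical stand-ins for the kernel's section and sub-section bookkeeping, not the kernel code itself. It shows why an online-section check followed by a pfn_valid_within() that always succeeds can accept a PFN that sits in a hole, and how an extra pfn_section_valid()-style check filters it out.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: names mirror the kernel helpers, bodies do not. */
#define MODEL_PAGES_PER_SECTION 8UL
#define MODEL_NR_SECTIONS       4UL

struct model_section {
	bool online;             /* stands in for online_section_nr() */
	unsigned long valid_map; /* one bit per PFN with an initialized page */
};

static struct model_section model_sections[MODEL_NR_SECTIONS];

/* Test helper: describe one section of the model. */
static void model_set_section(unsigned long nr, bool online,
			      unsigned long valid_map)
{
	assert(nr < MODEL_NR_SECTIONS);
	model_sections[nr].online = online;
	model_sections[nr].valid_map = valid_map;
}

/* With CONFIG_HOLES_IN_ZONE=n, pfn_valid_within() is defined as (1). */
static bool model_pfn_valid_within(unsigned long pfn)
{
	(void)pfn;
	return true;
}

/* Stand-in for pfn_section_valid(): is the PFN in an active part? */
static bool model_pfn_section_valid(const struct model_section *ms,
				    unsigned long pfn)
{
	return (ms->valid_map >> (pfn % MODEL_PAGES_PER_SECTION)) & 1UL;
}

/* Lookup without the sub-section check (behaviour before the patch). */
static bool model_online_before(unsigned long pfn)
{
	unsigned long nr = pfn / MODEL_PAGES_PER_SECTION;

	return nr < MODEL_NR_SECTIONS && model_sections[nr].online &&
	       model_pfn_valid_within(pfn);
}

/* Lookup with the additional check (behaviour after the patch). */
static bool model_online_after(unsigned long pfn)
{
	unsigned long nr = pfn / MODEL_PAGES_PER_SECTION;

	return nr < MODEL_NR_SECTIONS && model_sections[nr].online &&
	       model_pfn_section_valid(&model_sections[nr], pfn) &&
	       model_pfn_valid_within(pfn);
}
```

In this model, an online section whose valid_map covers only its first four PFNs still lets model_online_before() accept a PFN in the upper half, whereas model_online_after() rejects it.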

page:ffffea0002748000 is uninitialized and poisoned
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:1084!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
CPU: 5 PID: 332 Comm: kmemleak Not tainted 5.2.0-rc4-next-20190612+ #6
Hardware name: Lenovo ThinkSystem SR530 -[7X07RCZ000]-/-[7X07RCZ000]-, BIOS -[TEE113T-1.00]- 07/07/2017
RIP: 0010:kmemleak_scan+0x6df/0xad0
Call Trace:
 kmemleak_scan_thread+0x9f/0xc7
 kthread+0x1d2/0x1f0
 ret_from_fork+0x35/0x4

[1] https://patchwork.kernel.org/patch/10977957/

Signed-off-by: Qian Cai <cai@lca.pw>
---
 include/linux/memory_hotplug.h | 1 +
 1 file changed, 1 insertion(+)

Comments

Dan Williams June 12, 2019, 7:37 p.m. UTC | #1
On Wed, Jun 12, 2019 at 12:16 PM Qian Cai <cai@lca.pw> wrote:
>
> The linux-next commit "mm/sparsemem: Add helpers track active portions
> of a section at boot" [1] causes the crash below when the first kmemleak
> scan kthread kicks in. This is because kmemleak_scan() calls
> pfn_to_online_page(), which calls pfn_valid_within() instead of
> pfn_valid() on x86 due to CONFIG_HOLES_IN_ZONE=n.
>
> The commit [1] did add an additional check of pfn_section_valid() in
> pfn_valid(), but forgot to add it in the above code path.
>
> page:ffffea0002748000 is uninitialized and poisoned
> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> ------------[ cut here ]------------
> kernel BUG at include/linux/mm.h:1084!
> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> CPU: 5 PID: 332 Comm: kmemleak Not tainted 5.2.0-rc4-next-20190612+ #6
> Hardware name: Lenovo ThinkSystem SR530 -[7X07RCZ000]-/-[7X07RCZ000]-, BIOS -[TEE113T-1.00]- 07/07/2017
> RIP: 0010:kmemleak_scan+0x6df/0xad0
> Call Trace:
>  kmemleak_scan_thread+0x9f/0xc7
>  kthread+0x1d2/0x1f0
>  ret_from_fork+0x35/0x4
>
> [1] https://patchwork.kernel.org/patch/10977957/
>
> Signed-off-by: Qian Cai <cai@lca.pw>
> ---
>  include/linux/memory_hotplug.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 0b8a5e5ef2da..f02be86077e3 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -28,6 +28,7 @@
>         unsigned long ___nr = pfn_to_section_nr(___pfn);           \
>                                                                    \
>         if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
> +           pfn_section_valid(__nr_to_section(___nr), pfn) &&      \
>             pfn_valid_within(___pfn))                              \
>                 ___page = pfn_to_page(___pfn);                     \
>         ___page;                                                   \

Looks ok to me:

Acked-by: Dan Williams <dan.j.williams@intel.com>

...but why is pfn_to_online_page() a multi-line macro instead of a
static inline like all the helper routines it invokes?
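
For illustration, a function form of such a lookup might look like the sketch below. It is a simplified userspace model, not the kernel's definition: the sketch_* types, sizes and arrays are assumptions made so that the example compiles on its own, and the sub-section check is left out for brevity. The point it tries to show is that a static inline evaluates its argument exactly once and gets type checking, which a multi-line macro only achieves through temporaries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so the sketch compiles outside the kernel. */
struct sketch_page {
	int dummy;
};

#define SKETCH_PAGES_PER_SECTION 4UL
#define SKETCH_NR_SECTIONS       2UL

static struct sketch_page
	sketch_mem_map[SKETCH_NR_SECTIONS * SKETCH_PAGES_PER_SECTION];
static int sketch_section_online[SKETCH_NR_SECTIONS];

/* Counts how often a caller's argument expression is evaluated. */
static int sketch_eval_count;

static unsigned long sketch_next_pfn(void)
{
	sketch_eval_count++;
	return 5;
}

/* Test helper: mark one section of the model as online. */
static void sketch_set_online(unsigned long nr)
{
	assert(nr < SKETCH_NR_SECTIONS);
	sketch_section_online[nr] = 1;
}

/*
 * Function form of the lookup: the argument is evaluated once by the
 * caller and the compiler checks its type.
 */
static inline struct sketch_page *sketch_pfn_to_online_page(unsigned long pfn)
{
	unsigned long nr = pfn / SKETCH_PAGES_PER_SECTION;

	if (nr < SKETCH_NR_SECTIONS && sketch_section_online[nr])
		return &sketch_mem_map[pfn];
	return NULL;
}
```

Calling sketch_pfn_to_online_page(sketch_next_pfn()) increments sketch_eval_count only once, regardless of how many times the function body refers to pfn.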
Dan Williams June 12, 2019, 7:38 p.m. UTC | #2
On Wed, Jun 12, 2019 at 12:37 PM Dan Williams <dan.j.williams@intel.com> wrote:
> Looks ok to me:
>
> Acked-by: Dan Williams <dan.j.williams@intel.com>
>
> ...but why is pfn_to_online_page() a multi-line macro instead of a
> static inline like all the helper routines it invokes?

I do need to send out a refreshed version of the sub-section patchset,
so I'll fold this in and give you a Reported-by credit.
Qian Cai June 12, 2019, 9:47 p.m. UTC | #3
On Wed, 2019-06-12 at 12:38 -0700, Dan Williams wrote:
> I do need to send out a refreshed version of the sub-section patchset,
> so I'll fold this in and give you a Reported-by credit.

BTW, I'm not sure whether your new version will fix the two problems below,
which are due to the same commit.

https://patchwork.kernel.org/patch/10977957/

1) offline is busted [1]. It looks like test_pages_in_a_zone() missed the same
pfn_section_valid() check.

2) powerpc booting is generating endless warnings [2]. In vmemmap_populated() at
arch/powerpc/mm/init_64.c, I tried changing PAGES_PER_SECTION to
PAGES_PER_SUBSECTION, but that alone does not seem to be enough.

[1]
[  415.158451][ T1946] page:ffffea00016a0000 is uninitialized and poisoned
[  415.158459][ T1946] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[  415.226266][ T1946] raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
[  415.264284][ T1946] page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
[  415.294332][ T1946] page_owner info is not active (free page?)
[  415.320902][ T1946] ------------[ cut here ]------------
[  415.345340][ T1946] kernel BUG at include/linux/mm.h:1084!
[  415.370284][ T1946] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  415.402589][ T1946] CPU: 12 PID: 1946 Comm: test.sh Not tainted 5.2.0-rc4-next-20190612+ #6
[  415.444923][ T1946] Hardware name: HP ProLiant XL420 Gen9/ProLiant XL420 Gen9, BIOS U19 12/27/2015
[  415.485079][ T1946] RIP: 0010:test_pages_in_a_zone+0x285/0x310
[  415.511320][ T1946] Code: c6 c0 96 4c a2 48 89 df e8 18 23 f6 ff 0f 0b 48 c7 c7 80 c7 ad a2 e8 ae c2 1f 00 48 c7 c6 c0 96 4c a2 48 89 cf e8 fb 22 f6 ff <0f> 0b 48 c7 c7 00 c8 ad a2 e8 91 c2 1f 00 48 85 db 0f 84 3c ff ff
[  415.598840][ T1946] RSP: 0018:ffff88832ba37930 EFLAGS: 00010292
[  415.625597][ T1946] RAX: 0000000000000000 RBX: ffff88847fff36c0 RCX: ffffffffa1b40b78
[  415.660713][ T1946] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88884743d380
[  415.695778][ T1946] RBP: ffff88832ba37988 R08: ffffed1108e87a71 R09: ffffed1108e87a70
[  415.730831][ T1946] R10: ffffed1108e87a70 R11: ffff88884743d387 R12: 0000000000060000
[  415.766058][ T1946] R13: 0000000000060000 R14: 0000000000060000 R15: 000000000005a800
[  415.800727][ T1946] FS:  00007fca293e7740(0000) GS:ffff888847400000(0000) knlGS:0000000000000000
[  415.840114][ T1946] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  415.868966][ T1946] CR2: 0000558da8ffffc0 CR3: 00000002bff10006 CR4: 00000000001606a0
[  415.904736][ T1946] Call Trace:
[  415.920601][ T1946]  __offline_pages+0xdd/0x990
[  415.942887][ T1946]  ? online_pages+0x4f0/0x4f0
[  415.963195][ T1946]  ? kasan_check_write+0x14/0x20
[  415.984710][ T1946]  ? __mutex_lock+0x2ac/0xb70
[  416.004986][ T1946]  ? device_offline+0x70/0x110
[  416.025654][ T1946]  ? klist_next+0x43/0x1c0
[  416.044819][ T1946]  ? __mutex_add_waiter+0xc0/0xc0
[  416.066741][ T1946]  ? do_raw_spin_unlock+0xa8/0x140
[  416.089036][ T1946]  ? klist_next+0xf2/0x1c0
[  416.108178][ T1946]  offline_pages+0x11/0x20
[  416.127490][ T1946]  memory_block_action+0x12e/0x210
[  416.149808][ T1946]  ? device_remove_class_symlinks+0xc0/0xc0
[  416.175650][ T1946]  memory_subsys_offline+0x7d/0xb0
[  416.197897][ T1946]  device_offline+0xd5/0x110
[  416.217800][ T1946]  ? memory_block_action+0x210/0x210
[  416.240809][ T1946]  state_store+0xc6/0xe0
[  416.259508][ T1946]  dev_attr_store+0x3f/0x60
[  416.279018][ T1946]  ? device_create_release+0x60/0x60
[  416.302081][ T1946]  sysfs_kf_write+0x89/0xb0
[  416.321625][ T1946]  ? sysfs_file_ops+0xa0/0xa0
[  416.341906][ T1946]  kernfs_fop_write+0x188/0x240
[  416.363700][ T1946]  __vfs_write+0x50/0xa0
[  416.382789][ T1946]  vfs_write+0x105/0x290
[  416.401087][ T1946]  ksys_write+0xc6/0x160
[  416.421144][ T1946]  ? __x64_sys_read+0x50/0x50
[  416.444824][ T1946]  ? fput+0x13/0x20
[  416.462255][ T1946]  ? filp_close+0x8e/0xa0
[  416.480951][ T1946]  ? __close_fd+0xe0/0x110
[  416.500343][ T1946]  __x64_sys_write+0x43/0x50
[  416.520327][ T1946]  do_syscall_64+0xc8/0x63b
[  416.540048][ T1946]  ? syscall_return_slowpath+0x120/0x120
[  416.564728][ T1946]  ? __do_page_fault+0x44d/0x5b0
[  416.586119][ T1946]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  416.611778][ T1946] RIP: 0033:0x7fca28ac63b8
[  416.630947][ T1946] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 63 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[  416.717953][ T1946] RSP: 002b:00007ffc33f8eb98 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  416.755847][ T1946] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007fca28ac63b8
[  416.790908][ T1946] RDX: 0000000000000008 RSI: 0000558daa079880 RDI: 0000000000000001
[  416.826002][ T1946] RBP: 0000558daa079880 R08: 000000000000000a R09: 00007ffc33f8e720
[  416.861054][ T1946] R10: 000000000000000a R11: 0000000000000246 R12: 00007fca28d98780
[  416.896253][ T1946] R13: 0000000000000008 R14: 00007fca28d93740 R15: 0000000000000008
[  416.932117][ T1946] Modules linked in: kvm_intel kvm irqbypass dax_pmem dax_pmem_core ip_tables x_tables xfs sd_mod igb i2c_algo_bit hpsa i2c_core scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[  417.019852][ T1946] ---[ end trace 5a30e75692517f36 ]---
[  417.044089][ T1946] RIP: 0010:test_pages_in_a_zone+0x285/0x310
[  417.070435][ T1946] Code: c6 c0 96 4c a2 48 89 df e8 18 23 f6 ff 0f 0b 48 c7 c7 80 c7 ad a2 e8 ae c2 1f 00 48 c7 c6 c0 96 4c a2 48 89 cf e8 fb 22 f6 ff <0f> 0b 48 c7 c7 00 c8 ad a2 e8 91 c2 1f 00 48 85 db 0f 84 3c ff ff
[  417.158165][ T1946] RSP: 0018:ffff88832ba37930 EFLAGS: 00010292
[  417.184809][ T1946] RAX: 0000000000000000 RBX: ffff88847fff36c0 RCX: ffffffffa1b40b78
[  417.220249][ T1946] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88884743d380
[  417.255589][ T1946] RBP: ffff88832ba37988 R08: ffffed1108e87a71 R09: ffffed1108e87a70
[  417.290652][ T1946] R10: ffffed1108e87a70 R11: ffff88884743d387 R12: 0000000000060000
[  417.325808][ T1946] R13: 0000000000060000 R14: 0000000000060000 R15: 000000000005a800
[  417.360953][ T1946] FS:  00007fca293e7740(0000) GS:ffff888847400000(0000) knlGS:0000000000000000
[  417.401830][ T1946] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  417.430817][ T1946] CR2: 0000558da8ffffc0 CR3: 00000002bff10006 CR4: 00000000001606a0
[  417.470406][ T1946] Kernel panic - not syncing: Fatal exception
[  417.497018][ T1946] Kernel Offset: 0x20600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  417.548754][ T1946] ---[ end Kernel panic - not syncing: Fatal exception ]---

[2]
[    0.000000][    T0] WARNING: CPU: 0 PID: 0 at arch/powerpc/mm/pgtable.c:186 set_pte_at+0x3c/0x190
[    0.000000][    T0] Modules linked in:
[    0.000000][    T0] CPU: 0 PID: 0 Comm: swapper Tainted: G        W         5.2.0-rc4+ #7
[    0.000000][    T0] NIP:  c00000000006129c LR: c000000000075724 CTR: c000000000061270
[    0.000000][    T0] REGS: c0000000016d7770 TRAP: 0700   Tainted: G        W          (5.2.0-rc4+)
[    0.000000][    T0] MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 44002884  XER: 20040000
[    0.000000][    T0] CFAR: c00000000005d514 IRQMASK: 1
[    0.000000][    T0] GPR00: c000000000075724 c0000000016d7a00 c0000000016d4900 c0000000016a48b0
[    0.000000][    T0] GPR04: c00c0000003d0000 c000001bff5300e8 8e014b001c000080 ffffffffffffffff
[    0.000000][    T0] GPR08: c000001bff530000 06000000000000c0 07000000000000c0 0000000000000001
[    0.000000][    T0] GPR12: c000000000061270 c000000002b30000 c0000000009e8830 c0000000009e8860
[    0.000000][    T0] GPR16: 0000000000000009 0000000000000009 c000001ffffca000 0000000000000000
[    0.000000][    T0] GPR20: 0000000000000015 0000000000000000 0000000000000000 c000001ffffc9000
[    0.000000][    T0] GPR24: c0000000016a48b0 c0000000018a07c0 0000000000000005 c00c0000003d0000
[    0.000000][    T0] GPR28: 800000000000018e 8000001c004b018e c000001bff5300e8 0000000000000008
[    0.000000][    T0] NIP [c00000000006129c] set_pte_at+0x3c/0x190
[    0.000000][    T0] LR [c000000000075724] __map_kernel_page+0x7a4/0x890
[    0.000000][    T0] Call Trace:
[    0.000000][    T0] [c0000000016d7a00] [0000000400000000] 0x400000000 (unreliable)
[    0.000000][    T0] [c0000000016d7a40] [0000001c004b0000] 0x1c004b0000
[    0.000000][    T0] [c0000000016d7af0] [c0000000008b858c] radix__vmemmap_create_mapping+0x98/0xbc
[    0.000000][    T0] [c0000000016d7b70] [c0000000008b7194] vmemmap_populate+0x284/0x31c
[    0.000000][    T0] [c0000000016d7c30] [c0000000008baeb0] sparse_mem_map_populate+0x40/0x68
[    0.000000][    T0] [c0000000016d7c60] [c000000000af5e10] sparse_init_nid+0x35c/0x550
[    0.000000][    T0] [c0000000016d7d20] [c000000000af63b0] sparse_init+0x1a8/0x240
[    0.000000][    T0] [c0000000016d7d60] [c000000000ac67b0] initmem_init+0x368/0x40c
[    0.000000][    T0] [c0000000016d7e80] [c000000000aba9b8] setup_arch+0x300/0x380
[    0.000000][    T0] [c0000000016d7ef0] [c000000000ab3fd8] start_kernel+0xb4/0x710
[    0.000000][    T0] [c0000000016d7f90] [c00000000000ab74] start_here_common+0x1c/0x4a8
Dan Williams June 12, 2019, 9:52 p.m. UTC | #4
On Wed, Jun 12, 2019 at 2:47 PM Qian Cai <cai@lca.pw> wrote:
> BTW, I'm not sure whether your new version will fix the two problems below,
> which are due to the same commit.
>
> https://patchwork.kernel.org/patch/10977957/
>
> 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed the same
> pfn_section_valid() check.
>
> 2) powerpc booting is generating endless warnings [2]. In vmemmap_populated() at
> arch/powerpc/mm/init_64.c, I tried changing PAGES_PER_SECTION to
> PAGES_PER_SUBSECTION, but that alone does not seem to be enough.

Yes, I was just sending you another note about this. I don't think
your proposed fix is sufficient. The original intent of
pfn_valid_within() was to use it as a cheaper lookup after already
validating that the first page in a MAX_ORDER_NR_PAGES range satisfied
pfn_valid(). Quoting commit 14e072984179 "add pfn_valid_within helper
for sub-MAX_ORDER hole detection":

    Add a pfn_valid_within() helper which should be used when scanning pages
    within a MAX_ORDER_NR_PAGES block when we have already checked the
    validility of the block normally with pfn_valid().  This can then be
    optimised away when we do not have holes within a MAX_ORDER_NR_PAGES
    block of pages.

So, with that insight I think the complete fix is this:

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6dd52d544857..9d15ec793330 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1428,7 +1428,7 @@ void memory_present(int nid, unsigned long start, unsigned long end);
 #ifdef CONFIG_HOLES_IN_ZONE
 #define pfn_valid_within(pfn) pfn_valid(pfn)
 #else
-#define pfn_valid_within(pfn) (1)
+#define pfn_valid_within(pfn) pfn_section_valid(pfn)
 #endif

 #ifdef CONFIG_ARCH_HAS_HOLES_MEMORYMODEL
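
The calling pattern that the quoted commit describes can be sketched as follows: pfn_valid() is checked once for the first page of a MAX_ORDER_NR_PAGES block, and pfn_valid_within() is only consulted for the pages inside that block. This is a userspace model under stated assumptions; the scan_* names and the tiny block size are hypothetical, and scan_pfn_valid_within() mimics the CONFIG_HOLES_IN_ZONE=n case, where the check is optimised away.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sizes, chosen small so the model is easy to follow. */
#define SCAN_MAX_ORDER_NR_PAGES 4UL
#define SCAN_NR_PFNS            16UL

static bool scan_pfn_present[SCAN_NR_PFNS];

/* Full check, stands in for pfn_valid(). */
static bool scan_pfn_valid(unsigned long pfn)
{
	return pfn < SCAN_NR_PFNS && scan_pfn_present[pfn];
}

/* CONFIG_HOLES_IN_ZONE=n flavour: the check is compiled away. */
static bool scan_pfn_valid_within(unsigned long pfn)
{
	(void)pfn;
	return true;
}

/*
 * Count the pages of one block. The block is validated once through its
 * first PFN; the cheaper helper is used for every page inside it.
 */
static unsigned long scan_count_block(unsigned long block_start)
{
	unsigned long pfn, count = 0;

	assert(block_start % SCAN_MAX_ORDER_NR_PAGES == 0);

	if (!scan_pfn_valid(block_start))
		return 0;

	for (pfn = block_start;
	     pfn < block_start + SCAN_MAX_ORDER_NR_PAGES; pfn++) {
		if (!scan_pfn_valid_within(pfn))
			continue;
		count++;
	}
	return count;
}
```

A caller that skips the initial scan_pfn_valid() step and relies on scan_pfn_valid_within() alone would accept every PFN in this model, which is the misuse the thread is concerned with.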
Dan Williams June 12, 2019, 11:13 p.m. UTC | #5
On Wed, Jun 12, 2019 at 2:52 PM Dan Williams <dan.j.williams@intel.com> wrote:
> So, with that insight I think the complete fix is this:
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 6dd52d544857..9d15ec793330 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1428,7 +1428,7 @@ void memory_present(int nid, unsigned long start, unsigned long end);
>  #ifdef CONFIG_HOLES_IN_ZONE
>  #define pfn_valid_within(pfn) pfn_valid(pfn)
>  #else
> -#define pfn_valid_within(pfn) (1)
> +#define pfn_valid_within(pfn) pfn_section_valid(pfn)

Well, obviously that won't work because pfn_section_valid needs a
'struct mem_section *' arg, but this does serve as a good check of
whether call sites were properly using pfn_valid_within() to constrain
the validity after an existing pfn_valid() check within the same
MAX_ORDER_NR_PAGES span.
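
To make the signature problem concrete: a one-argument check would first have to look up the section for the PFN and then pass both to the two-argument helper. The sketch below is a userspace model and not a proposed kernel change; the sub_* names, the sizes and the layout of the bitmap are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical geometry: 4 sub-sections of 4 pages per section. */
#define SUB_PAGES_PER_SECTION    16UL
#define SUB_PAGES_PER_SUBSECTION 4UL
#define SUB_NR_SECTIONS          4UL

struct sub_mem_section {
	unsigned long subsection_map; /* one bit per active sub-section */
};

static struct sub_mem_section sub_sections[SUB_NR_SECTIONS];

/* Index of the sub-section that holds pfn, relative to its section. */
static unsigned long sub_index(unsigned long pfn)
{
	return (pfn % SUB_PAGES_PER_SECTION) / SUB_PAGES_PER_SUBSECTION;
}

/* Test helper: mark the sub-section containing pfn as active. */
static void sub_activate(unsigned long pfn)
{
	unsigned long nr = pfn / SUB_PAGES_PER_SECTION;

	assert(nr < SUB_NR_SECTIONS);
	sub_sections[nr].subsection_map |= 1UL << sub_index(pfn);
}

/* Two-argument form, shaped like pfn_section_valid(ms, pfn). */
static bool sub_pfn_section_valid(const struct sub_mem_section *ms,
				  unsigned long pfn)
{
	return (ms->subsection_map >> sub_index(pfn)) & 1UL;
}

/* One-argument wrapper: resolve the section first, then delegate. */
static bool sub_pfn_valid_by_pfn(unsigned long pfn)
{
	unsigned long nr = pfn / SUB_PAGES_PER_SECTION;

	if (nr >= SUB_NR_SECTIONS)
		return false;
	return sub_pfn_section_valid(&sub_sections[nr], pfn);
}
```

The wrapper adds a section lookup to every call, so it is no longer the cheap check that pfn_valid_within() was meant to be.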
Dan Williams June 13, 2019, 12:06 a.m. UTC | #6
On Wed, Jun 12, 2019 at 2:47 PM Qian Cai <cai@lca.pw> wrote:
> BTW, I'm not sure whether your new version will fix the two problems below,
> which are due to the same commit.
>
> https://patchwork.kernel.org/patch/10977957/
>
> 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed the same
> pfn_section_valid() check.

All online memory is to be onlined as a complete section, so I think
the issue is more related to vmemmap_populated() not establishing the
mem_map for all pages in a section.

I take back my suggestions about pfn_valid_within(); that operation
should always be scoped to a section when validating online memory.

>
> 2) powerpc booting is generating endless warnings [2]. In vmemmap_populated() at
> arch/powerpc/mm/init_64.c, I tried to change PAGES_PER_SECTION to
> PAGES_PER_SUBSECTION, but it alone seems not enough.

On PowerPC PAGES_PER_SECTION == PAGES_PER_SUBSECTION because the
PowerPC section size was already small. Instead I think the issue is
that PowerPC is partially populating sections, but still expecting
pfn_valid() to succeed. I.e. prior to the subsection patches
pfn_valid() would still work for those holes, but now that it is more
precise it is failing.
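
As a rough illustration of that change in precision, here is a userspace
sketch (not the kernel implementation; the model_* names, the struct layout
and the shift values are all assumptions picked for the example). A section
that is present but only partially populated passes the section-granular
check for every pfn, while the subsection-granular check rejects pfns that
fall into the hole:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; the real values are per-arch. */
#define MODEL_PAGE_SHIFT        12	/* 4K pages */
#define MODEL_SECTION_SHIFT     27	/* 128MB sections */
#define MODEL_SUBSECTION_SHIFT  21	/* 2MB subsections */

#define MODEL_PFN_SECTION_SHIFT    (MODEL_SECTION_SHIFT - MODEL_PAGE_SHIFT)
#define MODEL_PFN_SUBSECTION_SHIFT (MODEL_SUBSECTION_SHIFT - MODEL_PAGE_SHIFT)
#define MODEL_PAGES_PER_SECTION    (1UL << MODEL_PFN_SECTION_SHIFT)

/* Hypothetical stand-in for struct mem_section. */
struct model_section {
	bool present;			/* section-granular state */
	uint64_t subsection_map;	/* one bit per populated subsection */
};

/* Section-granular check: any pfn in a present section is accepted. */
static inline bool model_pfn_valid_section(const struct model_section *ms,
					   unsigned long pfn)
{
	(void)pfn;
	return ms->present;
}

/* Subsection-granular check: the pfn's subsection must be populated too. */
static inline bool model_pfn_valid_subsection(const struct model_section *ms,
					      unsigned long pfn)
{
	unsigned long idx = (pfn & (MODEL_PAGES_PER_SECTION - 1)) >>
			    MODEL_PFN_SUBSECTION_SHIFT;

	assert(idx < 64);	/* 64 subsections per section in this model */
	return ms->present && ((ms->subsection_map >> idx) & 1);
}
```

With only bit 0 set in subsection_map, pfn 511 is accepted by both helpers,
but pfn 512 is accepted only by the section-granular one.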
Qian Cai June 13, 2019, 6:42 p.m. UTC | #7
On Wed, 2019-06-12 at 12:37 -0700, Dan Williams wrote:
> On Wed, Jun 12, 2019 at 12:16 PM Qian Cai <cai@lca.pw> wrote:
> > 
> > The linux-next commit "mm/sparsemem: Add helpers track active portions
> > of a section at boot" [1] causes a crash below when the first kmemleak
> > scan kthread kicks in. This is because kmemleak_scan() calls
> > pfn_to_online_page(() which calls pfn_valid_within() instead of
> > pfn_valid() on x86 due to CONFIG_HOLES_IN_ZONE=n.
> > 
> > The commit [1] did add an additional check of pfn_section_valid() in
> > pfn_valid(), but forgot to add it in the above code path.
> > 
> > page:ffffea0002748000 is uninitialized and poisoned
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > ------------[ cut here ]------------
> > kernel BUG at include/linux/mm.h:1084!
> > invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> > CPU: 5 PID: 332 Comm: kmemleak Not tainted 5.2.0-rc4-next-20190612+ #6
> > Hardware name: Lenovo ThinkSystem SR530 -[7X07RCZ000]-/-[7X07RCZ000]-,
> > BIOS -[TEE113T-1.00]- 07/07/2017
> > RIP: 0010:kmemleak_scan+0x6df/0xad0
> > Call Trace:
> >  kmemleak_scan_thread+0x9f/0xc7
> >  kthread+0x1d2/0x1f0
> >  ret_from_fork+0x35/0x4
> > 
> > [1] https://patchwork.kernel.org/patch/10977957/
> > 
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> >  include/linux/memory_hotplug.h | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > index 0b8a5e5ef2da..f02be86077e3 100644
> > --- a/include/linux/memory_hotplug.h
> > +++ b/include/linux/memory_hotplug.h
> > @@ -28,6 +28,7 @@
> >         unsigned long ___nr = pfn_to_section_nr(___pfn);           \
> >                                                                    \
> >         if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
> > +           pfn_section_valid(__nr_to_section(___nr), pfn) &&      \
> >             pfn_valid_within(___pfn))                              \
> >                 ___page = pfn_to_page(___pfn);                     \
> >         ___page;                                                   \
> 
> Looks ok to me:
> 
> Acked-by: Dan Williams <dan.j.williams@intel.com>
> 
> ...but why is pfn_to_online_page() a multi-line macro instead of a
> static inline like all the helper routines it invokes?

Sigh, probably because it is a mess over there.

memory_hotplug.h and mmzone.h include each other. Converting it directly to
a static inline triggers compilation errors because mmzone.h gets included
from somewhere else first, and pfn_to_online_page() needs things like
pfn_valid_within() and online_section_nr(), which are only defined later in
mmzone.h.

Moving pfn_to_online_page() into mmzone.h triggers the errors below.

In file included from ./arch/x86/include/asm/page.h:76,
                 from ./arch/x86/include/asm/thread_info.h:12,
                 from ./include/linux/thread_info.h:38,
                 from ./arch/x86/include/asm/preempt.h:7,
                 from ./include/linux/preempt.h:78,
                 from ./include/linux/spinlock.h:51,
                 from ./include/linux/mmzone.h:8,
                 from ./include/linux/gfp.h:6,
                 from ./include/linux/slab.h:15,
                 from ./include/linux/crypto.h:19,
                 from arch/x86/kernel/asm-offsets.c:9:
./include/linux/memory_hotplug.h: In function ‘pfn_to_online_page’:
./include/asm-generic/memory_model.h:54:29: error: ‘vmemmap’ undeclared (first
use in this function); did you mean ‘mem_map’?
 #define __pfn_to_page(pfn) (vmemmap + (pfn))
                             ^~~~~~~
./include/asm-generic/memory_model.h:82:21: note: in expansion of macro
‘__pfn_to_page’
 #define pfn_to_page __pfn_to_page
                     ^~~~~~~~~~~~~
./include/linux/memory_hotplug.h:30:10: note: in expansion of macro
‘pfn_to_page’
   return pfn_to_page(pfn);
          ^~~~~~~~~~~
./include/asm-generic/memory_model.h:54:29: note: each undeclared identifier is
reported only once for each function it appears in
 #define __pfn_to_page(pfn) (vmemmap + (pfn))
                             ^~~~~~~
./include/asm-generic/memory_model.h:82:21: note: in expansion of macro
‘__pfn_to_page’
 #define pfn_to_page __pfn_to_page
                     ^~~~~~~~~~~~~
./include/linux/memory_hotplug.h:30:10: note: in expansion of macro
‘pfn_to_page’
   return pfn_to_page(pfn);
          ^~~~~~~~~~~
make[1]: *** [scripts/Makefile.build:112: arch/x86/kernel/asm-offsets.s] Error 1
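
For what it is worth, the reason the macro form survives the include ordering
can be shown with a small standalone sketch. This is not the kernel code; all
demo_* names are made up for the example. A macro body is only looked up where
the macro is expanded, so it can name helpers that are declared later, whereas
a static inline placed at the same spot would be compiled before the helpers
are visible:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Defined first, like a macro in a header that is pulled in early.
 * Nothing in the body is resolved until the macro is expanded.
 */
#define demo_pfn_to_online_page(pfn) \
	(demo_section_online(pfn) ? demo_pfn_to_page(pfn) : NULL)

/* Hypothetical helpers that only become visible later on. */
static int demo_mem_map[4] = { 10, 11, 12, 13 };

static inline bool demo_section_online(unsigned long pfn)
{
	return pfn < 4;
}

static inline int *demo_pfn_to_page(unsigned long pfn)
{
	return &demo_mem_map[pfn];
}

/* First expansion: both helpers are declared by now, so this compiles. */
static inline int *demo_lookup(unsigned long pfn)
{
	return demo_pfn_to_online_page(pfn);
}
```

Turning demo_pfn_to_online_page() into a static inline at its current position
would not compile cleanly, because the helpers are not declared yet at that
point. That is the same class of failure as the 'vmemmap' undeclared error
above.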
Dan Williams June 14, 2019, 1:17 a.m. UTC | #8
On Thu, Jun 13, 2019 at 11:42 AM Qian Cai <cai@lca.pw> wrote:
>
> On Wed, 2019-06-12 at 12:37 -0700, Dan Williams wrote:
> > On Wed, Jun 12, 2019 at 12:16 PM Qian Cai <cai@lca.pw> wrote:
> > >
> > > The linux-next commit "mm/sparsemem: Add helpers track active portions
> > > of a section at boot" [1] causes a crash below when the first kmemleak
> > > scan kthread kicks in. This is because kmemleak_scan() calls
> > > pfn_to_online_page(() which calls pfn_valid_within() instead of
> > > pfn_valid() on x86 due to CONFIG_HOLES_IN_ZONE=n.
> > >
> > > The commit [1] did add an additional check of pfn_section_valid() in
> > > pfn_valid(), but forgot to add it in the above code path.
> > >
> > > page:ffffea0002748000 is uninitialized and poisoned
> > > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > > ------------[ cut here ]------------
> > > kernel BUG at include/linux/mm.h:1084!
> > > invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> > > CPU: 5 PID: 332 Comm: kmemleak Not tainted 5.2.0-rc4-next-20190612+ #6
> > > Hardware name: Lenovo ThinkSystem SR530 -[7X07RCZ000]-/-[7X07RCZ000]-,
> > > BIOS -[TEE113T-1.00]- 07/07/2017
> > > RIP: 0010:kmemleak_scan+0x6df/0xad0
> > > Call Trace:
> > >  kmemleak_scan_thread+0x9f/0xc7
> > >  kthread+0x1d2/0x1f0
> > >  ret_from_fork+0x35/0x4
> > >
> > > [1] https://patchwork.kernel.org/patch/10977957/
> > >
> > > Signed-off-by: Qian Cai <cai@lca.pw>
> > > ---
> > >  include/linux/memory_hotplug.h | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > > index 0b8a5e5ef2da..f02be86077e3 100644
> > > --- a/include/linux/memory_hotplug.h
> > > +++ b/include/linux/memory_hotplug.h
> > > @@ -28,6 +28,7 @@
> > >         unsigned long ___nr = pfn_to_section_nr(___pfn);           \
> > >                                                                    \
> > >         if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
> > > +           pfn_section_valid(__nr_to_section(___nr), pfn) &&      \
> > >             pfn_valid_within(___pfn))                              \
> > >                 ___page = pfn_to_page(___pfn);                     \
> > >         ___page;                                                   \
> >
> > Looks ok to me:
> >
> > Acked-by: Dan Williams <dan.j.williams@intel.com>
> >
> > ...but why is pfn_to_online_page() a multi-line macro instead of a
> > static inline like all the helper routines it invokes?
>
> Sigh, probably because it is a mess over there.
>
> memory_hotplug.h and mmzone.h are included each other. Converted it directly to
> a static inline triggers compilation errors because mmzone.h was included
> somewhere else and found pfn_to_online_page() needs things like
> pfn_valid_within() and online_section_nr() etc which are only defined later in
> mmzone.h.

Ok, makes sense. I had assumed it was something horrible like that.

Qian, can you send more details on the reproduction steps for the
failures you are seeing? Like configs and platforms you're testing.
I've tried enabling kmemleak and offlining memory and have yet to
trigger these failures. I also have a couple people willing to help me
out with tracking down the PowerPC issue, but I assume they need some
help with the reproduction as well.
Qian Cai June 14, 2019, 1:29 a.m. UTC | #9
> On Jun 13, 2019, at 9:17 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> 
> On Thu, Jun 13, 2019 at 11:42 AM Qian Cai <cai@lca.pw> wrote:
>> 
>> On Wed, 2019-06-12 at 12:37 -0700, Dan Williams wrote:
>>> On Wed, Jun 12, 2019 at 12:16 PM Qian Cai <cai@lca.pw> wrote:
>>>> 
>>>> The linux-next commit "mm/sparsemem: Add helpers track active portions
>>>> of a section at boot" [1] causes a crash below when the first kmemleak
>>>> scan kthread kicks in. This is because kmemleak_scan() calls
>>>> pfn_to_online_page(() which calls pfn_valid_within() instead of
>>>> pfn_valid() on x86 due to CONFIG_HOLES_IN_ZONE=n.
>>>> 
>>>> The commit [1] did add an additional check of pfn_section_valid() in
>>>> pfn_valid(), but forgot to add it in the above code path.
>>>> 
>>>> page:ffffea0002748000 is uninitialized and poisoned
>>>> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
>>>> raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
>>>> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
>>>> ------------[ cut here ]------------
>>>> kernel BUG at include/linux/mm.h:1084!
>>>> invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
>>>> CPU: 5 PID: 332 Comm: kmemleak Not tainted 5.2.0-rc4-next-20190612+ #6
>>>> Hardware name: Lenovo ThinkSystem SR530 -[7X07RCZ000]-/-[7X07RCZ000]-,
>>>> BIOS -[TEE113T-1.00]- 07/07/2017
>>>> RIP: 0010:kmemleak_scan+0x6df/0xad0
>>>> Call Trace:
>>>> kmemleak_scan_thread+0x9f/0xc7
>>>> kthread+0x1d2/0x1f0
>>>> ret_from_fork+0x35/0x4
>>>> 
>>>> [1] https://patchwork.kernel.org/patch/10977957/
>>>> 
>>>> Signed-off-by: Qian Cai <cai@lca.pw>
>>>> ---
>>>> include/linux/memory_hotplug.h | 1 +
>>>> 1 file changed, 1 insertion(+)
>>>> 
>>>> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>>>> index 0b8a5e5ef2da..f02be86077e3 100644
>>>> --- a/include/linux/memory_hotplug.h
>>>> +++ b/include/linux/memory_hotplug.h
>>>> @@ -28,6 +28,7 @@
>>>>        unsigned long ___nr = pfn_to_section_nr(___pfn);           \
>>>>                                                                   \
>>>>        if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
>>>> +           pfn_section_valid(__nr_to_section(___nr), pfn) &&      \
>>>>            pfn_valid_within(___pfn))                              \
>>>>                ___page = pfn_to_page(___pfn);                     \
>>>>        ___page;                                                   \
>>> 
>>> Looks ok to me:
>>> 
>>> Acked-by: Dan Williams <dan.j.williams@intel.com>
>>> 
>>> ...but why is pfn_to_online_page() a multi-line macro instead of a
>>> static inline like all the helper routines it invokes?
>> 
>> Sigh, probably because it is a mess over there.
>> 
>> memory_hotplug.h and mmzone.h are included each other. Converted it directly to
>> a static inline triggers compilation errors because mmzone.h was included
>> somewhere else and found pfn_to_online_page() needs things like
>> pfn_valid_within() and online_section_nr() etc which are only defined later in
>> mmzone.h.
> 
> Ok, makes sense I had I assumed it was something horrible like that.
> 
> Qian, can you send more details on the reproduction steps for the
> failures you are seeing? Like configs and platforms you're testing.
> I've tried enabling kmemleak and offlining memory and have yet to
> trigger these failures. I also have a couple people willing to help me
> out with tracking down the PowerPC issue, but I assume they need some
> help with the reproduction as well.

https://github.com/cailca/linux-mm

You can see the configs for each arch there. It was reproduced on several x86
NUMA bare-metal machines (HPE, Lenovo, etc.) with either Intel or AMD CPUs.
Check “test.sh”; the part that does offline/online will reproduce it.

The powerpc machine is an IBM 8335-GTC (ibm,witherspoon) POWER9, which is a
NUMA PowerNV platform.
Aneesh Kumar K.V June 14, 2019, 8:58 a.m. UTC | #10
Qian Cai <cai@lca.pw> writes:


> 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed the same
> pfn_section_valid() check.
>
> 2) powerpc booting is generating endless warnings [2]. In vmemmap_populated() at
> arch/powerpc/mm/init_64.c, I tried to change PAGES_PER_SECTION to
> PAGES_PER_SUBSECTION, but it alone seems not enough.
>

Can you check with this change on ppc64? I haven't reviewed this series yet,
and I did only limited testing with the change. Before merging this I need to
go through the full series again. The vmemmap populate on ppc64 needs to
handle two translation modes (hash and radix). With respect to vmemmap,
hash doesn't set up a translation in the Linux page table. Hence we need
to make sure we don't try to set up a mapping for a range which is
already covered by an existing mapping.

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index a4e17a979e45..15c342f0a543 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
  * which overlaps this vmemmap page is initialised then this page is
  * initialised already.
  */
-static int __meminit vmemmap_populated(unsigned long start, int page_size)
+static bool __meminit vmemmap_populated(unsigned long start, int page_size)
 {
 	unsigned long end = start + page_size;
 	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
 
-	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
-		if (pfn_valid(page_to_pfn((struct page *)start)))
-			return 1;
+	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
 
-	return 0;
+		struct mem_section *ms;
+		unsigned long pfn = page_to_pfn((struct page *)start);
+
+		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+			return 0;
+		ms = __nr_to_section(pfn_to_section_nr(pfn));
+		if (valid_section(ms))
+			return true;
+	}
+	return false;
 }
 
 /*
Qian Cai June 14, 2019, 2:59 p.m. UTC | #11
On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> Qian Cai <cai@lca.pw> writes:
> 
> 
> > 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed the
> > same
> > pfn_section_valid() check.
> > 
> > 2) powerpc booting is generating endless warnings [2]. In
> > vmemmap_populated() at
> > arch/powerpc/mm/init_64.c, I tried to change PAGES_PER_SECTION to
> > PAGES_PER_SUBSECTION, but it alone seems not enough.
> > 
> 
> Can you check with this change on ppc64.  I haven't reviewed this series yet.
> I did limited testing with change . Before merging this I need to go
> through the full series again. The vmemmap poplulate on ppc64 needs to
> handle two translation mode (hash and radix). With respect to vmemap
> hash doesn't setup a translation in the linux page table. Hence we need
> to make sure we don't try to setup a mapping for a range which is
> arleady convered by an existing mapping.

It works fine.

> 
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index a4e17a979e45..15c342f0a543 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -88,16 +88,23 @@ static unsigned long __meminit
> vmemmap_section_start(unsigned long page)
>   * which overlaps this vmemmap page is initialised then this page is
>   * initialised already.
>   */
> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
>  {
>  	unsigned long end = start + page_size;
>  	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>  
> -	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct
> page)))
> -		if (pfn_valid(page_to_pfn((struct page *)start)))
> -			return 1;
> +	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct
> page))) {
>  
> -	return 0;
> +		struct mem_section *ms;
> +		unsigned long pfn = page_to_pfn((struct page *)start);
> +
> +		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> +			return 0;
> +		ms = __nr_to_section(pfn_to_section_nr(pfn));
> +		if (valid_section(ms))
> +			return true;
> +	}
> +	return false;
>  }
>  
>  /*
>
Oscar Salvador June 14, 2019, 3:35 p.m. UTC | #12
On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
> Can you check with this change on ppc64.  I haven't reviewed this series yet.
> I did limited testing with change . Before merging this I need to go
> through the full series again. The vmemmap poplulate on ppc64 needs to
> handle two translation mode (hash and radix). With respect to vmemap
> hash doesn't setup a translation in the linux page table. Hence we need
> to make sure we don't try to setup a mapping for a range which is
> arleady convered by an existing mapping. 
> 
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index a4e17a979e45..15c342f0a543 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
>   * which overlaps this vmemmap page is initialised then this page is
>   * initialised already.
>   */
> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
>  {
>  	unsigned long end = start + page_size;
>  	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>  
> -	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
> -		if (pfn_valid(page_to_pfn((struct page *)start)))
> -			return 1;
> +	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
>  
> -	return 0;
> +		struct mem_section *ms;
> +		unsigned long pfn = page_to_pfn((struct page *)start);
> +
> +		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> +			return 0;

I might be missing something, but is this right?
Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
vmemmap_populate will go on and populate it.
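
To spell out the concern, here is a simplified userspace model. It is not the
ppc64 code: the model_* names are hypothetical, and it assumes only what is
stated above, namely that the caller skips a range when the "populated" check
returns nonzero and populates it otherwise:

```c
#include <stdbool.h>

#define MODEL_NR_SECTIONS 4UL	/* made-up limit for the example */

/* Mirrors the shape of the proposed check, including the early return. */
static inline bool model_populated(unsigned long section_nr, const bool *valid)
{
	if (section_nr >= MODEL_NR_SECTIONS)
		return false;	/* the "return 0" under discussion */
	return valid[section_nr];
}

/* Counts how many sections the caller would go on to populate. */
static inline unsigned long model_populate(unsigned long first,
					   unsigned long last,
					   const bool *valid)
{
	unsigned long nr, populated = 0;

	for (nr = first; nr <= last; nr++) {
		if (model_populated(nr, valid))
			continue;	/* already mapped, skip it */
		populated++;		/* stand-in for allocate + map */
	}
	return populated;
}
```

In this model an out-of-range section number is treated exactly like a
section that simply has not been mapped yet, so the loop proceeds to
"populate" it.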
Aneesh Kumar K.V June 14, 2019, 4:18 p.m. UTC | #13
On 6/14/19 9:05 PM, Oscar Salvador wrote:
> On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
>> Can you check with this change on ppc64.  I haven't reviewed this series yet.
>> I did limited testing with change . Before merging this I need to go
>> through the full series again. The vmemmap poplulate on ppc64 needs to
>> handle two translation mode (hash and radix). With respect to vmemap
>> hash doesn't setup a translation in the linux page table. Hence we need
>> to make sure we don't try to setup a mapping for a range which is
>> arleady convered by an existing mapping.
>>
>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>> index a4e17a979e45..15c342f0a543 100644
>> --- a/arch/powerpc/mm/init_64.c
>> +++ b/arch/powerpc/mm/init_64.c
>> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
>>    * which overlaps this vmemmap page is initialised then this page is
>>    * initialised already.
>>    */
>> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
>> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
>>   {
>>   	unsigned long end = start + page_size;
>>   	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>>   
>> -	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
>> -		if (pfn_valid(page_to_pfn((struct page *)start)))
>> -			return 1;
>> +	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
>>   
>> -	return 0;
>> +		struct mem_section *ms;
>> +		unsigned long pfn = page_to_pfn((struct page *)start);
>> +
>> +		if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>> +			return 0;
> 
> I might be missing something, but is this right?
> Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
> vmemmap_populate will go on and populate it.

I should drop that completely. We should not hit that condition at all. 
I will send a final patch once I go through the full patch series making 
sure we are not breaking any ppc64 details.

Wondering why we did the below

#if defined(ARCH_SUBSECTION_SHIFT)
#define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
#elif defined(PMD_SHIFT)
#define SUBSECTION_SHIFT (PMD_SHIFT)
#else
/*
  * Memory hotplug enabled platforms avoid this default because they
  * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
  * this is kept as a backstop to allow compilation on
  * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
  */
#define SUBSECTION_SHIFT 21
#endif

why not

#if defined(ARCH_SUBSECTION_SHIFT)
#define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
#else
#define SUBSECTION_SHIFT  SECTION_SHIFT
#endif

i.e., if subsections are not supported by the arch, we have one subsection
per section?


-aneesh
Dan Williams June 14, 2019, 4:22 p.m. UTC | #14
On Fri, Jun 14, 2019 at 9:18 AM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> On 6/14/19 9:05 PM, Oscar Salvador wrote:
> > On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
> >> Can you check with this change on ppc64.  I haven't reviewed this series yet.
> >> I did limited testing with change . Before merging this I need to go
> >> through the full series again. The vmemmap poplulate on ppc64 needs to
> >> handle two translation mode (hash and radix). With respect to vmemap
> >> hash doesn't setup a translation in the linux page table. Hence we need
> >> to make sure we don't try to setup a mapping for a range which is
> >> arleady convered by an existing mapping.
> >>
> >> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> >> index a4e17a979e45..15c342f0a543 100644
> >> --- a/arch/powerpc/mm/init_64.c
> >> +++ b/arch/powerpc/mm/init_64.c
> >> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
> >>    * which overlaps this vmemmap page is initialised then this page is
> >>    * initialised already.
> >>    */
> >> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
> >> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
> >>   {
> >>      unsigned long end = start + page_size;
> >>      start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
> >>
> >> -    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
> >> -            if (pfn_valid(page_to_pfn((struct page *)start)))
> >> -                    return 1;
> >> +    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
> >>
> >> -    return 0;
> >> +            struct mem_section *ms;
> >> +            unsigned long pfn = page_to_pfn((struct page *)start);
> >> +
> >> +            if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >> +                    return 0;
> >
> > I might be missing something, but is this right?
> > Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
> > vmemmap_populate will go on and populate it.
>
> I should drop that completely. We should not hit that condition at all.
> I will send a final patch once I go through the full patch series making
> sure we are not breaking any ppc64 details.
>
> Wondering why we did the below
>
> #if defined(ARCH_SUBSECTION_SHIFT)
> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
> #elif defined(PMD_SHIFT)
> #define SUBSECTION_SHIFT (PMD_SHIFT)
> #else
> /*
>   * Memory hotplug enabled platforms avoid this default because they
>   * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
>   * this is kept as a backstop to allow compilation on
>   * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
>   */
> #define SUBSECTION_SHIFT 21
> #endif
>
> why not
>
> #if defined(ARCH_SUBSECTION_SHIFT)
> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
> #else
> #define SUBSECTION_SHIFT  SECTION_SHIFT
> #endif
>
> ie, if SUBSECTION is not supported by arch we have one sub-section per
> section?

A couple comments:

The only reason ARCH_SUBSECTION_SHIFT exists is because PMD_SHIFT on
PowerPC was a non-constant value. However, I'm planning to remove the
distinction in the next rev of the patches. Jeff rightly points out
that having a variable subsection size per arch will lead to
situations where persistent memory namespaces are not portable across
archs. So I plan to just make SUBSECTION_SHIFT 21 everywhere.
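
As a quick sanity check of what a fixed shift of 21 implies, here is a small
arithmetic sketch. The demo_* helpers are hypothetical, and the page and
section shifts passed in are per-arch inputs that the caller has to supply;
only the value 21 comes from the plan above:

```c
#include <assert.h>

#define DEMO_SUBSECTION_SHIFT 21	/* 2MB, as proposed above */

/* Size of one subsection in bytes. */
static inline unsigned long demo_subsection_bytes(void)
{
	return 1UL << DEMO_SUBSECTION_SHIFT;
}

/* Number of base pages covered by one subsection. */
static inline unsigned long demo_pages_per_subsection(unsigned int page_shift)
{
	assert(page_shift <= DEMO_SUBSECTION_SHIFT);
	return 1UL << (DEMO_SUBSECTION_SHIFT - page_shift);
}

/* Number of subsections that make up one section. */
static inline unsigned long
demo_subsections_per_section(unsigned int section_size_shift)
{
	assert(section_size_shift >= DEMO_SUBSECTION_SHIFT);
	return 1UL << (section_size_shift - DEMO_SUBSECTION_SHIFT);
}
```

For example, assuming 4K pages and 128MB sections (shifts of 12 and 27), this
gives 512 pages per subsection and 64 subsections per section. Assuming 64K
pages and 16MB sections (shifts of 16 and 24), it gives 32 and 8.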
Aneesh Kumar K.V June 14, 2019, 4:26 p.m. UTC | #15
On 6/14/19 9:52 PM, Dan Williams wrote:
> On Fri, Jun 14, 2019 at 9:18 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> On 6/14/19 9:05 PM, Oscar Salvador wrote:
>>> On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
>>>> Can you check with this change on ppc64.  I haven't reviewed this series yet.
>>>> I did limited testing with change . Before merging this I need to go
>>>> through the full series again. The vmemmap poplulate on ppc64 needs to
>>>> handle two translation mode (hash and radix). With respect to vmemap
>>>> hash doesn't setup a translation in the linux page table. Hence we need
>>>> to make sure we don't try to setup a mapping for a range which is
>>>> arleady convered by an existing mapping.
>>>>
>>>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>>>> index a4e17a979e45..15c342f0a543 100644
>>>> --- a/arch/powerpc/mm/init_64.c
>>>> +++ b/arch/powerpc/mm/init_64.c
>>>> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
>>>>     * which overlaps this vmemmap page is initialised then this page is
>>>>     * initialised already.
>>>>     */
>>>> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
>>>> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
>>>>    {
>>>>       unsigned long end = start + page_size;
>>>>       start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>>>>
>>>> -    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
>>>> -            if (pfn_valid(page_to_pfn((struct page *)start)))
>>>> -                    return 1;
>>>> +    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
>>>>
>>>> -    return 0;
>>>> +            struct mem_section *ms;
>>>> +            unsigned long pfn = page_to_pfn((struct page *)start);
>>>> +
>>>> +            if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>> +                    return 0;
>>>
>>> I might be missing something, but is this right?
>>> Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
>>> vmemmap_populate will go on and populate it.
>>
>> I should drop that completely. We should not hit that condition at all.
>> I will send a final patch once I go through the full patch series making
>> sure we are not breaking any ppc64 details.
>>
>> Wondering why we did the below
>>
>> #if defined(ARCH_SUBSECTION_SHIFT)
>> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
>> #elif defined(PMD_SHIFT)
>> #define SUBSECTION_SHIFT (PMD_SHIFT)
>> #else
>> /*
>>    * Memory hotplug enabled platforms avoid this default because they
>>    * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
>>    * this is kept as a backstop to allow compilation on
>>    * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
>>    */
>> #define SUBSECTION_SHIFT 21
>> #endif
>>
>> why not
>>
>> #if defined(ARCH_SUBSECTION_SHIFT)
>> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
>> #else
>> #define SUBSECTION_SHIFT  SECTION_SHIFT

That should be SECTION_SIZE_SHIFT

>> #endif
>>
>> ie, if SUBSECTION is not supported by arch we have one sub-section per
>> section?
> 
> A couple comments:
> 
> The only reason ARCH_SUBSECTION_SHIFT exists is because PMD_SHIFT on
> PowerPC was a non-constant value. However, I'm planning to remove the
> distinction in the next rev of the patches. Jeff rightly points out
> that having a variable subsection size per arch will lead to
> situations where persistent memory namespaces are not portable across
> archs. So I plan to just make SUBSECTION_SHIFT 21 everywhere.
> 


Persistent memory namespaces are not portable across archs because they
have a PAGE_SIZE dependency. Then we have dependencies like the page size
with which we map the vmemmap area. Why not let the arch
decide the SUBSECTION_SHIFT and default to one subsection per
section if the arch is not enabled to work with subsections?

-aneesh
Dan Williams June 14, 2019, 4:36 p.m. UTC | #16
On Fri, Jun 14, 2019 at 9:26 AM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> On 6/14/19 9:52 PM, Dan Williams wrote:
> > On Fri, Jun 14, 2019 at 9:18 AM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
> >>
> >> On 6/14/19 9:05 PM, Oscar Salvador wrote:
> >>> On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
> >>>> Can you check with this change on ppc64.  I haven't reviewed this series yet.
> >>>> I did limited testing with change . Before merging this I need to go
> >>>> through the full series again. The vmemmap poplulate on ppc64 needs to
> >>>> handle two translation mode (hash and radix). With respect to vmemap
> >>>> hash doesn't setup a translation in the linux page table. Hence we need
> >>>> to make sure we don't try to setup a mapping for a range which is
> >>>> arleady convered by an existing mapping.
> >>>>
> >>>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> >>>> index a4e17a979e45..15c342f0a543 100644
> >>>> --- a/arch/powerpc/mm/init_64.c
> >>>> +++ b/arch/powerpc/mm/init_64.c
> >>>> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
> >>>>     * which overlaps this vmemmap page is initialised then this page is
> >>>>     * initialised already.
> >>>>     */
> >>>> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
> >>>> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
> >>>>    {
> >>>>       unsigned long end = start + page_size;
> >>>>       start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
> >>>>
> >>>> -    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
> >>>> -            if (pfn_valid(page_to_pfn((struct page *)start)))
> >>>> -                    return 1;
> >>>> +    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
> >>>>
> >>>> -    return 0;
> >>>> +            struct mem_section *ms;
> >>>> +            unsigned long pfn = page_to_pfn((struct page *)start);
> >>>> +
> >>>> +            if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >>>> +                    return 0;
> >>>
> >>> I might be missing something, but is this right?
> >>> Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
> >>> vmemmap_populate will go on and populate it.
> >>
> >> I should drop that completely. We should not hit that condition at all.
> >> I will send a final patch once I go through the full patch series making
> >> sure we are not breaking any ppc64 details.
> >>
> >> Wondering why we did the below
> >>
> >> #if defined(ARCH_SUBSECTION_SHIFT)
> >> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
> >> #elif defined(PMD_SHIFT)
> >> #define SUBSECTION_SHIFT (PMD_SHIFT)
> >> #else
> >> /*
> >>    * Memory hotplug enabled platforms avoid this default because they
> >>    * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
> >>    * this is kept as a backstop to allow compilation on
> >>    * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
> >>    */
> >> #define SUBSECTION_SHIFT 21
> >> #endif
> >>
> >> why not
> >>
> >> #if defined(ARCH_SUBSECTION_SHIFT)
> >> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
> >> #else
> >> #define SUBSECTION_SHIFT  SECTION_SHIFT
>
> That should be SECTION_SIZE_SHIFT
>
> >> #endif
> >>
> >> ie, if SUBSECTION is not supported by arch we have one sub-section per
> >> section?
> >
> > A couple comments:
> >
> > The only reason ARCH_SUBSECTION_SHIFT exists is because PMD_SHIFT on
> > PowerPC was a non-constant value. However, I'm planning to remove the
> > distinction in the next rev of the patches. Jeff rightly points out
> > that having a variable subsection size per arch will lead to
> > situations where persistent memory namespaces are not portable across
> > archs. So I plan to just make SUBSECTION_SHIFT 21 everywhere.
> >
>
>
> persistent memory namespaces are not portable across archs because they
> have PAGE_SIZE dependency.

We can fix that by reserving mem_map capacity assuming the smallest
PAGE_SIZE across archs.

> Then we have dependencies like the page size
> with which we map the vmemmap area.

How does that lead to cross-arch incompatibility? Even on a single
arch the vmemmap area will be mapped with 2MB pages for 128MB aligned
spans of pmem address space and 4K pages for subsections.

> Why not let the arch decide the SUBSECTION_SHIFT and default to one
> subsection per section if the arch is not enabled to work with
> subsections?

Because that keeps the implementation from ever reaching a point where
a namespace might be able to be moved from one arch to another. If we
can squash these arch differences then we can have a common tool to
initialize namespaces outside of the kernel. The one wrinkle is
device-dax that wants to enforce the mapping size, but I think we can
have a module-option compatibility override for that case for the
admin to say "yes, I know this namespace is defined for 2MB x86 pages,
but I want to force enable it with 64K pages on PowerPC"
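
For reference, the arithmetic behind a fixed SUBSECTION_SHIFT of 21 can be sketched as below. This is a standalone userspace illustration, not kernel code. The macro names mirror the ones quoted in this thread, but the SECTION_SIZE_SHIFT value of 27 (128MB sections) is an assumption, and all helper names are hypothetical.

```c
#include <assert.h>

/* Values assumed for illustration only. */
#define SUBSECTION_SHIFT   21	/* 2MB subsections, as proposed in the thread */
#define SECTION_SIZE_SHIFT 27	/* assumption: 128MB sections */

/* Hypothetical helper: bytes covered by one subsection. */
static unsigned long subsection_size(void)
{
	return 1UL << SUBSECTION_SHIFT;
}

/* Hypothetical helper: number of subsections that make up one section. */
static unsigned long subsections_per_section(void)
{
	/* A subsection can never be larger than its section. */
	assert(SECTION_SIZE_SHIFT >= SUBSECTION_SHIFT);
	return 1UL << (SECTION_SIZE_SHIFT - SUBSECTION_SHIFT);
}

/* Hypothetical helper: index of the subsection, within its section,
 * that contains the given physical address. */
static unsigned long subsection_index(unsigned long long phys_addr)
{
	unsigned long long offset = phys_addr & ((1ULL << SECTION_SIZE_SHIFT) - 1);

	return (unsigned long)(offset >> SUBSECTION_SHIFT);
}
```

Under these assumed values a section holds 64 subsections. With the alternative definition discussed above (SUBSECTION_SHIFT equal to SECTION_SIZE_SHIFT) the same helper would return 1, i.e. one sub-section per section.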
Aneesh Kumar K.V June 14, 2019, 4:50 p.m. UTC | #17
On 6/14/19 10:06 PM, Dan Williams wrote:
> On Fri, Jun 14, 2019 at 9:26 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> On 6/14/19 9:52 PM, Dan Williams wrote:
>>> On Fri, Jun 14, 2019 at 9:18 AM Aneesh Kumar K.V
>>> <aneesh.kumar@linux.ibm.com> wrote:
>>>>
>>>> On 6/14/19 9:05 PM, Oscar Salvador wrote:
>>>>> On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
>>>>>> Can you check with this change on ppc64?  I haven't reviewed this series yet.
>>>>>> I did limited testing with this change. Before merging this I need to go
>>>>>> through the full series again. The vmemmap populate on ppc64 needs to
>>>>>> handle two translation modes (hash and radix). With respect to vmemmap,
>>>>>> hash doesn't set up a translation in the Linux page table. Hence we need
>>>>>> to make sure we don't try to set up a mapping for a range which is
>>>>>> already covered by an existing mapping.
>>>>>>
>>>>>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>>>>>> index a4e17a979e45..15c342f0a543 100644
>>>>>> --- a/arch/powerpc/mm/init_64.c
>>>>>> +++ b/arch/powerpc/mm/init_64.c
>>>>>> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
>>>>>>      * which overlaps this vmemmap page is initialised then this page is
>>>>>>      * initialised already.
>>>>>>      */
>>>>>> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
>>>>>> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
>>>>>>     {
>>>>>>        unsigned long end = start + page_size;
>>>>>>        start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>>>>>>
>>>>>> -    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
>>>>>> -            if (pfn_valid(page_to_pfn((struct page *)start)))
>>>>>> -                    return 1;
>>>>>> +    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
>>>>>>
>>>>>> -    return 0;
>>>>>> +            struct mem_section *ms;
>>>>>> +            unsigned long pfn = page_to_pfn((struct page *)start);
>>>>>> +
>>>>>> +            if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>>>>>> +                    return 0;
>>>>>
>>>>> I might be missing something, but is this right?
>>>>> Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
>>>>> vmemmap_populate will go on and populate it.
>>>>
>>>> I should drop that completely. We should not hit that condition at all.
>>>> I will send a final patch once I go through the full patch series making
>>>> sure we are not breaking any ppc64 details.
>>>>
>>>> Wondering why we did the below
>>>>
>>>> #if defined(ARCH_SUBSECTION_SHIFT)
>>>> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
>>>> #elif defined(PMD_SHIFT)
>>>> #define SUBSECTION_SHIFT (PMD_SHIFT)
>>>> #else
>>>> /*
>>>>     * Memory hotplug enabled platforms avoid this default because they
>>>>     * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
>>>>     * this is kept as a backstop to allow compilation on
>>>>     * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
>>>>     */
>>>> #define SUBSECTION_SHIFT 21
>>>> #endif
>>>>
>>>> why not
>>>>
>>>> #if defined(ARCH_SUBSECTION_SHIFT)
>>>> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
>>>> #else
>>>> #define SUBSECTION_SHIFT  SECTION_SHIFT
>>
>> That should be SECTION_SIZE_SHIFT
>>
>>>> #endif
>>>>
>>>> ie, if SUBSECTION is not supported by arch we have one sub-section per
>>>> section?
>>>
>>> A couple comments:
>>>
>>> The only reason ARCH_SUBSECTION_SHIFT exists is because PMD_SHIFT on
>>> PowerPC was a non-constant value. However, I'm planning to remove the
>>> distinction in the next rev of the patches. Jeff rightly points out
>>> that having a variable subsection size per arch will lead to
>>> situations where persistent memory namespaces are not portable across
>>> archs. So I plan to just make SUBSECTION_SHIFT 21 everywhere.
>>>
>>
>>
>> persistent memory namespaces are not portable across archs because they
>> have PAGE_SIZE dependency.
> 
> We can fix that by reserving mem_map capacity assuming the smallest
> PAGE_SIZE across archs.
> 
>> Then we have dependencies like the page size
>> with which we map the vmemmap area.
> 
> How does that lead to cross-arch incompatibility? Even on a single
> arch the vmemmap area will be mapped with 2MB pages for 128MB aligned
> spans of pmem address space and 4K pages for subsections.

I am not sure I understood those details. On ppc64 vmemmap can be mapped
with either 16M, 2M, or 64K pages depending on the translation mode (hash
or radix). Doesn't that imply our reserve area size will vary between these
configs? I was thinking we should let the arch pick the largest value as
alignment and align things based on that. Otherwise, if you align the
vmemmap/altmap area to 2M and we move to a platform that maps the vmemmap
area using a 16MB page size, we fail, right? In other words, if you want to
build a portable pmem region, we have to configure these alignments
correctly.

Also, the label area storage is completely hidden in firmware, right? So
the portability will be limited to platforms that support the same firmware?


> 
>> Why not let the arch decide the SUBSECTION_SHIFT and default to one
>> subsection per section if the arch is not enabled to work with
>> subsections?
> 
> Because that keeps the implementation from ever reaching a point where
> a namespace might be able to be moved from one arch to another. If we
> can squash these arch differences then we can have a common tool to
> initialize namespaces outside of the kernel. The one wrinkle is
> device-dax that wants to enforce the mapping size, but I think we can
> have a module-option compatibility override for that case for the
> admin to say "yes, I know this namespace is defined for 2MB x86 pages,
> but I want to force enable it with 64K pages on PowerPC"

But then you can't say I want to enable this with 16M pages on PowerPC.
But I understood what you are suggesting here.

-aneesh
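
The alignment concern can be made concrete with a little arithmetic: one vmemmap mapping of a given size holds a fixed number of struct page entries, and those entries describe a span of physical memory that depends on the base page size. A minimal sketch, assuming a 64-byte struct page (an assumption, not something stated in this thread) and a hypothetical helper name:

```c
#include <assert.h>

/* Assumption for illustration: sizeof(struct page) is 64 bytes. */
#define STRUCT_PAGE_SIZE 64ULL

/*
 * Hypothetical helper: bytes of physical memory described by the
 * struct page entries that fit in one vmemmap mapping of map_size
 * bytes, given the base page size of the system.
 */
static unsigned long long vmemmap_span(unsigned long long map_size,
				       unsigned long long base_page_size)
{
	/* The mapping must hold a whole number of struct page entries. */
	assert(map_size % STRUCT_PAGE_SIZE == 0);
	return (map_size / STRUCT_PAGE_SIZE) * base_page_size;
}
```

Under these assumptions a 2MB vmemmap mapping with 4K base pages describes 128MB, which matches the 128MB figure quoted above, while a 16MB mapping with 64K base pages describes 16GB. A region laid out for the former is not necessarily aligned for the latter.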
Aneesh Kumar K.V June 14, 2019, 4:55 p.m. UTC | #18
On 6/14/19 10:06 PM, Dan Williams wrote:
> On Fri, Jun 14, 2019 at 9:26 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:

>> Why not let the arch decide the SUBSECTION_SHIFT and default to one
>> subsection per section if the arch is not enabled to work with
>> subsections?
> 
> Because that keeps the implementation from ever reaching a point where
> a namespace might be able to be moved from one arch to another. If we
> can squash these arch differences then we can have a common tool to
> initialize namespaces outside of the kernel. The one wrinkle is
> device-dax that wants to enforce the mapping size,

fsdax has a much bigger issue, right? The file system block size is
the same as PAGE_SIZE and we can't make it portable across archs that
support a different PAGE_SIZE?

-aneesh
Jeff Moyer June 14, 2019, 5:08 p.m. UTC | #19
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:

> On 6/14/19 10:06 PM, Dan Williams wrote:
>> On Fri, Jun 14, 2019 at 9:26 AM Aneesh Kumar K.V
>> <aneesh.kumar@linux.ibm.com> wrote:
>
>>> Why not let the arch decide the SUBSECTION_SHIFT and default to one
>>> subsection per section if the arch is not enabled to work with
>>> subsections?
>>
>> Because that keeps the implementation from ever reaching a point where
>> a namespace might be able to be moved from one arch to another. If we
>> can squash these arch differences then we can have a common tool to
>> initialize namespaces outside of the kernel. The one wrinkle is
>> device-dax that wants to enforce the mapping size,
>
> fsdax has a much bigger issue, right? The file system block size
> is the same as PAGE_SIZE and we can't make it portable across archs
> that support a different PAGE_SIZE?

File system blocks are not tied to page size.  They can't be *bigger*
than the page size currently, but they can be smaller.

Still, I don't see that as an argument against trying to make the
namespaces work across architectures.  Consider a user who only has
sector mode namespaces.  We'd like that to work if at all possible.

-Jeff
Dan Williams June 14, 2019, 5:14 p.m. UTC | #20
On Fri, Jun 14, 2019 at 10:09 AM Jeff Moyer <jmoyer@redhat.com> wrote:
>
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>
> > On 6/14/19 10:06 PM, Dan Williams wrote:
> >> On Fri, Jun 14, 2019 at 9:26 AM Aneesh Kumar K.V
> >> <aneesh.kumar@linux.ibm.com> wrote:
> >
> >>> Why not let the arch decide the SUBSECTION_SHIFT and default to one
> >>> subsection per section if the arch is not enabled to work with
> >>> subsections?
> >>
> >> Because that keeps the implementation from ever reaching a point where
> >> a namespace might be able to be moved from one arch to another. If we
> >> can squash these arch differences then we can have a common tool to
> >> initialize namespaces outside of the kernel. The one wrinkle is
> >> device-dax that wants to enforce the mapping size,
> >
> > fsdax has a much bigger issue, right? The file system block size
> > is the same as PAGE_SIZE and we can't make it portable across archs
> > that support a different PAGE_SIZE?
>
> File system blocks are not tied to page size.  They can't be *bigger*
> than the page size currently, but they can be smaller.
>
> Still, I don't see that as an argument against trying to make the
> namespaces work across architectures.  Consider a user who only has
> sector mode namespaces.  We'd like that to work if at all possible.

Even with fsdax namespaces I don't see the concern. Yes, DAX might be
disabled if the filesystem on the namespace has a block size that is
smaller than the current system PAGE_SIZE, but the filesystem will
still work. I.e. it's fine to put a 512 byte block size filesystem on
a system that has a 4K PAGE_SIZE, you only lose DAX operations, not
your data access.
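
The block-size rules stated in the last two messages can be summarized in a small decision function. This is a sketch of the reasoning only, not how any filesystem or the DAX core implements the check; the enum and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical outcome of using a filesystem on an fsdax namespace. */
enum fs_access {
	FS_UNSUPPORTED,	/* block size larger than PAGE_SIZE: not supported currently */
	FS_NO_DAX,	/* filesystem works, but DAX operations are lost */
	FS_DAX,		/* filesystem works with DAX */
};

/* Hypothetical helper encoding the rules as described in the thread. */
static enum fs_access fsdax_access(unsigned int block_size,
				   unsigned int page_size)
{
	assert(block_size != 0 && page_size != 0);

	if (block_size > page_size)
		return FS_UNSUPPORTED;	/* blocks can't be bigger than a page */
	if (block_size < page_size)
		return FS_NO_DAX;	/* data access still works, only DAX is disabled */
	return FS_DAX;
}
```

For example, a filesystem created with 4K blocks on a 4K PAGE_SIZE system and later moved to a 64K PAGE_SIZE system would fall into the FS_NO_DAX case under this model.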
Aneesh Kumar K.V June 14, 2019, 5:40 p.m. UTC | #21
On 6/14/19 10:38 PM, Jeff Moyer wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> 
>> On 6/14/19 10:06 PM, Dan Williams wrote:
>>> On Fri, Jun 14, 2019 at 9:26 AM Aneesh Kumar K.V
>>> <aneesh.kumar@linux.ibm.com> wrote:
>>
>>>> Why not let the arch decide the SUBSECTION_SHIFT and default to one
>>>> subsection per section if the arch is not enabled to work with
>>>> subsections?
>>>
>>> Because that keeps the implementation from ever reaching a point where
>>> a namespace might be able to be moved from one arch to another. If we
>>> can squash these arch differences then we can have a common tool to
>>> initialize namespaces outside of the kernel. The one wrinkle is
>>> device-dax that wants to enforce the mapping size,
>>
>> fsdax has a much bigger issue, right? The file system block size
>> is the same as PAGE_SIZE and we can't make it portable across archs
>> that support a different PAGE_SIZE?
> 
> File system blocks are not tied to page size.  They can't be *bigger*
> than the page size currently, but they can be smaller.
> 


ppc64 page size is 64K.

> Still, I don't see that as an argument against trying to make the
> namespaces work across architectures.  Consider a user who only has
> sector mode namespaces.  We'd like that to work if at all possible.
> 

Agreed. I was trying to list out the challenges here.

-aneesh
Dan Williams June 14, 2019, 6:03 p.m. UTC | #22
On Fri, Jun 14, 2019 at 7:59 AM Qian Cai <cai@lca.pw> wrote:
>
> On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> > Qian Cai <cai@lca.pw> writes:
> >
> >
> > > > 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed the
> > > > same pfn_section_valid() check.
> > > >
> > > > 2) powerpc booting is generating endless warnings [2]. In
> > > > vmemmap_populated() at arch/powerpc/mm/init_64.c, I tried to change
> > > > PAGES_PER_SECTION to PAGES_PER_SUBSECTION, but it alone seems not enough.
> > >
> >
> > > Can you check with this change on ppc64?  I haven't reviewed this series yet.
> > > I did limited testing with this change. Before merging this I need to go
> > > through the full series again. The vmemmap populate on ppc64 needs to
> > > handle two translation modes (hash and radix). With respect to vmemmap,
> > > hash doesn't set up a translation in the Linux page table. Hence we need
> > > to make sure we don't try to set up a mapping for a range which is
> > > already covered by an existing mapping.
>
> It works fine.

Strange... it would only change behavior if valid_section() is true
when pfn_valid() is not or vice versa. They "should" be identical
because subsection-size == section-size on PowerPC, at least with the
current definition of SUBSECTION_SHIFT. I suspect maybe
free_area_init_nodes() is too late to call subsection_map_init() for
PowerPC.
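
The ordering problem suspected here can be illustrated with a toy userspace model: if a pfn_valid()-style check also consults a per-section subsection bitmap, a section can be "valid" while every pfn in it still looks invalid until the bitmap has been filled in. All names prefixed with toy_ are hypothetical, and the geometry (4K pages, 2MB subsections, 128MB sections) is assumed for illustration and is not PowerPC's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry, for illustration only. */
#define PAGE_SHIFT		12
#define SUBSECTION_SHIFT	21
#define SECTION_SIZE_SHIFT	27
#define PFN_SECTION_SHIFT	(SECTION_SIZE_SHIFT - PAGE_SHIFT)
#define PFN_SUBSECTION_SHIFT	(SUBSECTION_SHIFT - PAGE_SHIFT)
#define PAGES_PER_SECTION	(1UL << PFN_SECTION_SHIFT)

/* Toy stand-in for a memory section: a present flag plus one bit
 * per subsection (64 subsections with the geometry above). */
struct toy_section {
	bool present;
	uint64_t subsection_map;
};

/* Section-level check only. */
static bool toy_valid_section(const struct toy_section *ms)
{
	return ms->present;
}

/* Section-level check plus the subsection bitmap lookup. */
static bool toy_pfn_valid(const struct toy_section *ms, unsigned long pfn)
{
	unsigned long idx = (pfn & (PAGES_PER_SECTION - 1)) >> PFN_SUBSECTION_SHIFT;

	assert(idx < 64);
	return toy_valid_section(ms) && ((ms->subsection_map >> idx) & 1);
}

/* Mark every subsection of the section as populated. */
static void toy_subsection_map_init(struct toy_section *ms)
{
	ms->subsection_map = ~(uint64_t)0;
}
```

In this model, any caller that runs between the section being marked present and toy_subsection_map_init() sees toy_valid_section() return true while toy_pfn_valid() returns false, which is the kind of mismatch that calling the real initialization too late could produce.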
Dan Williams June 14, 2019, 6:57 p.m. UTC | #23
On Fri, Jun 14, 2019 at 11:03 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> On Fri, Jun 14, 2019 at 7:59 AM Qian Cai <cai@lca.pw> wrote:
> >
> > On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> > > Qian Cai <cai@lca.pw> writes:
> > >
> > >
> > > > > 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed the
> > > > > same pfn_section_valid() check.
> > > > >
> > > > > 2) powerpc booting is generating endless warnings [2]. In
> > > > > vmemmap_populated() at arch/powerpc/mm/init_64.c, I tried to change
> > > > > PAGES_PER_SECTION to PAGES_PER_SUBSECTION, but it alone seems not enough.
> > > >
> > >
> > > > Can you check with this change on ppc64?  I haven't reviewed this series yet.
> > > > I did limited testing with this change. Before merging this I need to go
> > > > through the full series again. The vmemmap populate on ppc64 needs to
> > > > handle two translation modes (hash and radix). With respect to vmemmap,
> > > > hash doesn't set up a translation in the Linux page table. Hence we need
> > > > to make sure we don't try to set up a mapping for a range which is
> > > > already covered by an existing mapping.
> >
> > It works fine.
>
> Strange... it would only change behavior if valid_section() is true
> when pfn_valid() is not or vice versa. They "should" be identical
> because subsection-size == section-size on PowerPC, at least with the
> current definition of SUBSECTION_SHIFT. I suspect maybe
> free_area_init_nodes() is too late to call subsection_map_init() for
> PowerPC.

Can you give the attached incremental patch a try? This will break
support for doing sub-section hot-add in a section that was only
partially populated early at init, but that can be repaired later in
the series. First things first, don't regress.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 874eb22d22e4..520c83aa0fec 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7286,12 +7286,10 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 
 	/* Print out the early node map */
 	pr_info("Early memory node ranges\n");
-	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
+	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
 		pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
 			(u64)start_pfn << PAGE_SHIFT,
 			((u64)end_pfn << PAGE_SHIFT) - 1);
-		subsection_map_init(start_pfn, end_pfn - start_pfn);
-	}
 
 	/* Initialise every node */
 	mminit_verify_pageflags_layout();
diff --git a/mm/sparse.c b/mm/sparse.c
index 0baa2e55cfdd..bca8e6fa72d2 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -533,6 +533,7 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 		}
 		check_usemap_section_nr(nid, usage);
 		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
+		subsection_map_init(section_nr_to_pfn(pnum), PAGES_PER_SECTION);
 		usage = (void *) usage + mem_section_usage_size();
 	}
 	sparse_buffer_fini();
Qian Cai June 14, 2019, 7:40 p.m. UTC | #24
On Fri, 2019-06-14 at 11:57 -0700, Dan Williams wrote:
> On Fri, Jun 14, 2019 at 11:03 AM Dan Williams <dan.j.williams@intel.com>
> wrote:
> > 
> > On Fri, Jun 14, 2019 at 7:59 AM Qian Cai <cai@lca.pw> wrote:
> > > 
> > > On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> > > > Qian Cai <cai@lca.pw> writes:
> > > > 
> > > > 
> > > > > > 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed
> > > > > > the same pfn_section_valid() check.
> > > > > >
> > > > > > 2) powerpc booting is generating endless warnings [2]. In
> > > > > > vmemmap_populated() at arch/powerpc/mm/init_64.c, I tried to change
> > > > > > PAGES_PER_SECTION to PAGES_PER_SUBSECTION, but it alone seems not enough.
> > > > > 
> > > > 
> > > > > Can you check with this change on ppc64?  I haven't reviewed this series
> > > > > yet. I did limited testing with this change. Before merging this I need
> > > > > to go through the full series again. The vmemmap populate on ppc64 needs
> > > > > to handle two translation modes (hash and radix). With respect to
> > > > > vmemmap, hash doesn't set up a translation in the Linux page table.
> > > > > Hence we need to make sure we don't try to set up a mapping for a range
> > > > > which is already covered by an existing mapping.
> > > 
> > > It works fine.
> > 
> > Strange... it would only change behavior if valid_section() is true
> > when pfn_valid() is not or vice versa. They "should" be identical
> > because subsection-size == section-size on PowerPC, at least with the
> > current definition of SUBSECTION_SHIFT. I suspect maybe
> > free_area_init_nodes() is too late to call subsection_map_init() for
> > PowerPC.
> 
> Can you give the attached incremental patch a try? This will break
> support for doing sub-section hot-add in a section that was only
> partially populated early at init, but that can be repaired later in
> the series. First things first, don't regress.
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 874eb22d22e4..520c83aa0fec 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7286,12 +7286,10 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
> 
>         /* Print out the early node map */
>         pr_info("Early memory node ranges\n");
> -       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> +       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
>                 pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
>                         (u64)start_pfn << PAGE_SHIFT,
>                         ((u64)end_pfn << PAGE_SHIFT) - 1);
> -               subsection_map_init(start_pfn, end_pfn - start_pfn);
> -       }
> 
>         /* Initialise every node */
>         mminit_verify_pageflags_layout();
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 0baa2e55cfdd..bca8e6fa72d2 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -533,6 +533,7 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
>                 }
>                 check_usemap_section_nr(nid, usage);
>                 sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage);
> +               subsection_map_init(section_nr_to_pfn(pnum), PAGES_PER_SECTION);
>                 usage = (void *) usage + mem_section_usage_size();
>         }
>         sparse_buffer_fini();

It works fine except it starts to trigger slab debugging errors during boot. Not
sure if it is related yet.

[  OK  ] Mounted /boot.
[  OK  ] Started LVM event activation on device 8:1.
[  OK  ] Found device /dev/mapper/rhel_ibm--p9wr--06-home.
         Mounting /home...
[   47.553541][  T920] =============================================================================
[   47.553586][  T920] BUG kmem_cache (Not tainted): Poison overwritten
[   47.553618][  T920] -----------------------------------------------------------------------------
[   47.553618][  T920]
[   47.553655][  T920] Disabling lock debugging due to kernel taint
[   47.553697][  T920] INFO: 0x0000000056823988-0x0000000050e781ac. First byte 0x0 instead of 0x6b
[   47.553739][  T920] INFO: Allocated in create_cache+0x9c/0x2f0 age=1381 cpu=104 pid=751
[   47.553777][  T920] 	__slab_alloc+0x34/0x60
[   47.553801][  T920] 	kmem_cache_alloc+0x4e4/0x5a0
[   47.553815][  T920] 	create_cache+0x9c/0x2f0
[   47.553856][  T920] 	memcg_create_kmem_cache+0x150/0x1b0
[   47.553871][  T920] 	memcg_kmem_cache_create_func+0x3c/0x150
[   47.553888][  T920] 	process_one_work+0x300/0x800
[   47.553939][  T920] 	worker_thread+0x78/0x540
[   47.553991][  T920] 	kthread+0x1b8/0x1c0
[   47.554030][  T920] 	ret_from_kernel_thread+0x5c/0x70
[   47.554057][  T920] INFO: Freed in slab_kmem_cache_release+0x60/0xc0 age=379 cpu=94 pid=484
[   47.554100][  T920] 	kmem_cache_free+0x58c/0x680
[   47.554128][  T920] 	slab_kmem_cache_release+0x60/0xc0
[   47.554166][  T920] 	kmem_cache_release+0x24/0x40
[   47.554204][  T920] 	kobject_put+0x12c/0x300
[   47.554253][  T920] 	sysfs_slab_release+0x38/0x50
[   47.554293][  T920] 	shutdown_cache+0x2d4/0x3b0
[   47.554331][  T920] 	kmemcg_cache_shutdown_fn+0x20/0x40
[   47.554360][  T920] 	kmemcg_workfn+0x64/0xb0
[   47.554385][  T920] 	process_one_work+0x300/0x800
[   47.554420][  T920] 	worker_thread+0x78/0x540
[   47.554461][  T920] 	kthread+0x1b8/0x1c0
[   47.554486][  T920] 	ret_from_kernel_thread+0x5c/0x70
[   47.554534][  T920] INFO: Slab 0x00000000e5a3850e objects=21 used=21 fp=0x000000001c184c17 flags=0x83fffc000000200
[   47.554616][  T920] INFO: Object 0x000000004f30f83e @offset=40064 fp=0x00000000c5c64399
[   47.554616][  T920] 
[   47.554690][  T920] Redzone 00000000e463ee75: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.554758][  T920] Redzone 000000004b6f4884: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.554813][  T920] Redzone 000000005c73936a: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.554881][  T920] Redzone 00000000e294755c: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.554956][  T920] Redzone 0000000026ba2e61: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.555038][  T920] Redzone 00000000210aec0a: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.555108][  T920] Redzone 0000000047851caf: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.555181][  T920] Redzone 00000000a88fe569: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
[   47.555262][  T920] Object 000000004f30f83e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555307][  T920] Object 00000000d4d50ef6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555375][  T920] Object 000000002c43675d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555441][  T920] Object 000000002b7fff5c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555511][  T920] Object 00000000eaa8b500: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555581][  T920] Object 00000000e149fa9d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555660][  T920] Object 000000004a87fa48: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555741][  T920] Object 0000000093301b2a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555797][  T920] Object 00000000dc013892: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555864][  T920] Object 000000005fc6a904: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555931][  T920] Object 000000005f9f9d53: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.555994][  T920] Object 000000003b35200a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556063][  T920] Object 000000006800397f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556132][  T920] Object 0000000004744c02: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556200][  T920] Object 000000003241106b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556267][  T920] Object 00000000b051d781: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556324][  T920] Object 00000000ee00435d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556394][  T920] Object 00000000e4c76b09: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556452][  T920] Object 00000000a955601d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556521][  T920] Object 00000000f23d6d54: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556589][  T920] Object 00000000948d914f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556671][  T920] Object 000000009b83b552: 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 00 00 00 00  kkkkkkkk........
[   47.556742][  T920] Object 0000000098183b83: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556798][  T920] Object 000000005ae5b5d3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556867][  T920] Object 00000000abd5b5de: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.556923][  T920] Object 00000000e876d61c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557018][  T920] Object 0000000013c0228e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557084][  T920] Object 00000000307c7694: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557164][  T920] Object 000000000367c078: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557234][  T920] Object 000000008665e37a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557307][  T920] Object 0000000086fd7e15: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557376][  T920] Object 00000000429b53bb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557431][  T920] Object 00000000cea8da45: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557487][  T920] Object 00000000ca4efb98: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557557][  T920] Object 000000005a281995: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557626][  T920] Object 00000000cf084d69: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557687][  T920] Object 000000001fdf79e5: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557777][  T920] Object 000000005f5e054e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557833][  T920] Object 0000000046b6818a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557891][  T920] Object 00000000caac6967: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.557958][  T920] Object 00000000d540458f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558029][  T920] Object 00000000c0bf366b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558098][  T920] Object 00000000d3dfbf6f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558165][  T920] Object 000000001a35dd94: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558248][  T920] Object 00000000e5f4aba1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558328][  T920] Object 00000000c566b1d4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558384][  T920] Object 00000000633ab657: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558442][  T920] Object 000000007312cef0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558512][  T920] Object 00000000c8b1d277: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558631][  T920] Object 00000000a6e5ae5f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558748][  T920] Object 0000000094fa22e6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.558862][  T920] Object 000000004df6b97f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559003][  T920] Object 0000000027e179b2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559099][  T920] Object 000000001aa9ac19: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559245][  T920] Object 000000006ba9ce74: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559363][  T920] Object 00000000c9dcd994: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559478][  T920] Object 00000000741f43aa: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559595][  T920] Object 00000000b933f584: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559723][  T920] Object 000000003fcd984d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559834][  T920] Object 0000000097669358: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.559949][  T920] Object 0000000086db6bff: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560069][  T920] Object 00000000fda6b38a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560191][  T920] Object 000000008d81cdc4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560307][  T920] Object 00000000a9b762b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560437][  T920] Object 0000000030205af6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560546][  T920] Object 000000000232113f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560664][  T920] Object 00000000de5e4928: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560792][  T920] Object 00000000b6bfd22c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.560902][  T920] Object 00000000bcf857ae: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561016][  T920] Object 0000000049aad6d1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561131][  T920] Object 00000000e95cd85d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561255][  T920] Object 000000002354a060: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561361][  T920] Object 0000000099fb38b6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561521][  T920] Object 00000000622d0c0d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561630][  T920] Object 00000000802f3461: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561750][  T920] Object 000000000f29e0cd: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561868][  T920] Object 0000000079ec25b2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.561979][  T920] Object 0000000014d121be: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562108][  T920] Object 000000001751c3dc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562207][  T920] Object 00000000fa176337: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562328][  T920] Object 000000009fd0cdfb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562456][  T920] Object 00000000e3504fc7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562568][  T920] Object 000000005610dc63: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562686][  T920] Object 00000000566eae63: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562812][  T920] Object 00000000e3bb1fde: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.562945][  T920] Object 00000000bf6c1146: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563059][  T920] Object 000000005c9138b3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563174][  T920] Object 000000005e2563c6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563291][  T920] Object 00000000a140b499: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563412][  T920] Object 00000000d9a0bf91: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563542][  T920] Object 0000000078b76649: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563651][  T920] Object 000000009ea820a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563770][  T920] Object 0000000028c41bed: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.563900][  T920] Object 00000000f51e38ad: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564021][  T920] Object 00000000c02d82b9: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564158][  T920] Object 0000000058ebb46f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564264][  T920] Object 00000000ea08dece: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564386][  T920] Object 000000007a3b09c4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564498][  T920] Object 000000001a5867fa: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564608][  T920] Object 0000000032da9381: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564723][  T920] Object 00000000ff06b6e1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564859][  T920] Object 00000000c498e74f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.564962][  T920] Object 00000000fa5b10ba: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565094][  T920] Object 0000000091a64fdf: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565211][  T920] Object 00000000d8cbdea4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565330][  T920] Object 00000000822f8c2b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565458][  T920] Object 0000000077baccaa: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565574][  T920] Object 0000000020f7f917: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565692][  T920] Object 000000002665ae1e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565803][  T920] Object 000000009a085cfd: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.565914][  T920] Object 00000000a6349306: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566043][  T920] Object 00000000ed8bb9c6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566187][  T920] Object 000000007b63b8ca: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566345][  T920] Object 00000000ccc27101: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566471][  T920] Object 00000000ef4a11c7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566600][  T920] Object 00000000990e4e67: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566729][  T920] Object 000000001441daf8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566830][  T920] Object 00000000629b60fd: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.566959][  T920] Object 0000000005756df5: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567095][  T920] Object 000000001335c55d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567202][  T920] Object 000000008a10e58c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567311][  T920] Object 0000000075bd1fe7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567466][  T920] Object 000000007e35380c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567583][  T920] Object 00000000845eff4e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567704][  T920] Object 00000000785bc98b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567847][  T920] Object 0000000014e0c8d4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.567962][  T920] Object 0000000016a9470d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568077][  T920] Object 0000000043305971: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568190][  T920] Object 000000008c5c5689: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568340][  T920] Object 0000000036f07dab: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568445][  T920] Object 00000000b2bfcacf: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568542][  T920] Object 000000000ae7f17b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568650][  T920] Object 00000000a8314823: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568791][  T920] Object 000000008edcb310: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.568937][  T920] Object 00000000c1c5a76b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569040][  T920] Object 00000000d3df47c1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569163][  T920] Object 00000000409f0701: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569289][  T920] Object 00000000d905df04: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569413][  T920] Object 0000000093e68225: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569556][  T920] Object 00000000e31c71b1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569658][  T920] Object 00000000bb083b6d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569764][  T920] Object 0000000033ec023f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.569905][  T920] Object 0000000059b795ff: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570030][  T920] Object 00000000a433364d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570143][  T920] Object 0000000056e6045e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570260][  T920] Object 00000000496ffb36: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570391][  T920] Object 0000000053fd9d70: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570528][  T920] Object 000000005ba5d6bc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570625][  T920] Object 0000000022c2afa2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570745][  T920] Object 0000000035b58153: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.570888][  T920] Object 00000000cf9e2d08: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.571024][  T920] Object 0000000020e2e55a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.571136][  T920] Object 00000000dded7e78: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.571237][  T920] Object 000000005b5dc165: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.571357][  T920] Object 00000000ed1631b2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.571489][  T920] Object 00000000bb9fcf9e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.571628][  T920] Object 000000000b203ce1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
[   47.571724][  T920] Redzone 000000004649abf0: bb bb bb bb bb bb bb
bb                          ........
[   47.571827][  T920] Padding 000000009c7fa5f3: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.571970][  T920] Padding 00000000693f60d5: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.572091][  T920] Padding 000000002f6715e7: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.572205][  T920] Padding 00000000dd7440e4: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.572349][  T920] Padding 000000003da6a5e5: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.572472][  T920] Padding 00000000ddafc3df: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.572576][  T920] Padding 000000001892aa1d: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.572717][  T920] CPU: 73 PID: 920 Comm: kworker/73:1 Tainted: G    B             5.2.0-rc4-next-20190614+ #15
[   47.572835][  T920] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
[   47.572934][  T920] Call Trace:
[   47.572972][  T920] [c00020152898f680] [c00000000089793c] dump_stack+0xb0/0xf4 (unreliable)
[   47.573063][  T920] [c00020152898f6c0] [c0000000003df8a8] print_trailer+0x23c/0x264
[   47.573153][  T920] [c00020152898f750] [c0000000003cedd8] check_bytes_and_report+0x138/0x160
[   47.573266][  T920] [c00020152898f7f0] [c0000000003d1de8] check_object+0x348/0x3e0
[   47.573363][  T920] [c00020152898f860] [c0000000003d2038] alloc_debug_processing+0x1b8/0x2c0
[   47.573447][  T920] [c00020152898f900] [c0000000003d5214] ___slab_alloc+0xb94/0xf80
[   47.573538][  T920] [c00020152898fa30] [c0000000003d5634] __slab_alloc+0x34/0x60
[   47.573639][  T920] [c00020152898fa60] [c0000000003d5b44] kmem_cache_alloc+0x4e4/0x5a0
[   47.573731][  T920] [c00020152898faf0] [c00000000033288c] create_cache+0x9c/0x2f0
[   47.573830][  T920] [c00020152898fb60] [c000000000333350] memcg_create_kmem_cache+0x150/0x1b0
[   47.573931][  T920] [c00020152898fc00] [c00000000040f15c] memcg_kmem_cache_create_func+0x3c/0x150
[   47.574032][  T920] [c00020152898fc40] [c00000000014b060] process_one_work+0x300/0x800
[   47.574148][  T920] [c00020152898fd20] [c00000000014b5d8] worker_thread+0x78/0x540
[   47.574235][  T920] [c00020152898fdb0] [c000000000155ef8] kthread+0x1b8/0x1c0
[   47.574338][  T920] [c00020152898fe20] [c00000000000b4cc] ret_from_kernel_thread+0x5c/0x70
[   47.574407][  T920] FIX kmem_cache: Restoring 0x0000000056823988-0x0000000050e781ac=0x6b
[   47.574407][  T920] 
[   47.574550][  T920] FIX kmem_cache: Marking all objects used
[   47.622056][ T3790] XFS (dm-2): Mounting V5 Filesystem
[  OK  ] Started LVM event activation on device 8:19.
[   47.833132][ T3790] XFS (dm-2): Ending clean mount
[  OK  ] Mounted /home.
[  OK  ] Reached target Local File Systems.
         Starting Tell Plymouth To Write Out Runtime Data...
         Starting Import network configuration from initramfs...
         Starting Restore /run/initramfs on shutdown...
[   47.959491][  T924] =============================================================================
[   47.959532][  T924] BUG kmem_cache (Tainted: G    B            ): Poison overwritten
[   47.959565][  T924] -----------------------------------------------------------------------------
[   47.959565][  T924] 
[   47.959601][  T924] INFO: 0x000000005bf9327f-0x0000000012b186d0. First byte 0x0 instead of 0x6b
[   47.959643][  T924] INFO: Allocated in create_cache+0x9c/0x2f0 age=1444 cpu=104 pid=751
[   47.959684][  T924] 	__slab_alloc+0x34/0x60
[   47.959708][  T924] 	kmem_cache_alloc+0x4e4/0x5a0
[   47.959722][  T924] 	create_cache+0x9c/0x2f0
[   47.959761][  T924] 	memcg_create_kmem_cache+0x150/0x1b0
[   47.959811][  T924] 	memcg_kmem_cache_create_func+0x3c/0x150
[   47.959850][  T924] 	process_one_work+0x300/0x800
[   47.959874][  T924] 	worker_thread+0x78/0x540
[   47.959900][  T924] 	kthread+0x1b8/0x1c0
[   47.959913][  T924] 	ret_from_kernel_thread+0x5c/0x70
[   47.959938][  T924] INFO: Freed in slab_kmem_cache_release+0x60/0xc0 age=472 cpu=94 pid=484
[   47.960008][  T924] 	kmem_cache_free+0x58c/0x680
[   47.960045][  T924] 	slab_kmem_cache_release+0x60/0xc0
[   47.960081][  T924] 	kmem_cache_release+0x24/0x40
[   47.960121][  T924] 	kobject_put+0x12c/0x300
[   47.960146][  T924] 	sysfs_slab_release+0x38/0x50
[   47.960171][  T924] 	shutdown_cache+0x2d4/0x3b0
[   47.960246][  T924] 	kmemcg_cache_shutdown_fn+0x20/0x40
[   47.960282][  T924] 	kmemcg_workfn+0x64/0xb0
[   47.960330][  T924] 	process_one_work+0x300/0x800
[   47.960366][  T924] 	worker_thread+0x78/0x540
[   47.960413][  T924] 	kthread+0x1b8/0x1c0
[   47.960448][  T924] 	ret_from_kernel_thread+0x5c/0x70
[   47.960462][  T924] INFO: Slab 0x00000000db2ed41f objects=21 used=21 fp=0x000000001c184c17 flags=0x83fffc000000200
[   47.960506][  T924] INFO: Object 0x00000000c0f66338 @offset=21632 fp=0x00000000b3cc6b7b
[   47.960506][  T924] 
[   47.960580][  T924] Redzone 00000000ff912fe4: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.960659][  T924] Redzone 0000000071d29417: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.960717][  T924] Redzone 000000001cc1e9aa: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.960784][  T924] Redzone 0000000057ad4648: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.960876][  T924] Redzone 0000000063aa1956: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.960943][  T924] Redzone 000000005e03e281: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.961013][  T924] Redzone 00000000d1d049b4: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.961068][  T924] Redzone 00000000ea455f4b: bb bb bb bb bb bb bb bb bb bb
bb bb bb bb bb bb  ................
[   47.961134][  T924] Object 00000000c0f66338: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961201][  T924] Object 0000000054d58d73: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961270][  T924] Object 0000000080b3d18f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961326][  T924] Object 00000000cbecca66: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961395][  T924] Object 00000000e6f9bb18: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961462][  T924] Object 00000000528c4c8d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961532][  T924] Object 000000002cac1453: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961601][  T924] Object 000000002e2b052f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961647][  T924] Object 00000000af6b3436: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961714][  T924] Object 00000000d8c9093b: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961782][  T924] Object 00000000a443983c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961862][  T924] Object 000000007140fb0a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.961942][  T924] Object 0000000092423efb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962008][  T924] Object 00000000068bbb54: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962065][  T924] Object 00000000e1ec757d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962134][  T924] Object 000000005dfea769: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962201][  T924] Object 0000000044039bb7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962271][  T924] Object 000000002f80f51a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962369][  T924] Object 0000000030e34515: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962451][  T924] Object 0000000055d8fc06: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962506][  T924] Object 00000000f59d5107: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962573][  T924] Object 00000000874e788f: 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00
00 00 00 00 00  kkkkkkkk........
[   47.962628][  T924] Object 000000006b718d20: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962711][  T924] Object 00000000340f3026: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962778][  T924] Object 00000000d4e61a50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962864][  T924] Object 000000007038e3bb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.962956][  T924] Object 000000009960f486: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963033][  T924] Object 000000006cad65a2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963114][  T924] Object 00000000c99fba18: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963182][  T924] Object 00000000a106e4d7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963251][  T924] Object 00000000aeaeedbf: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963345][  T924] Object 00000000719d60fd: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963410][  T924] Object 00000000dfd03254: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963466][  T924] Object 000000008ad4ff34: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963534][  T924] Object 00000000e881cddb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963606][  T924] Object 00000000cf1484d6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963675][  T924] Object 00000000695331d9: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963732][  T924] Object 00000000cdded125: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963815][  T924] Object 000000006ce2abec: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963882][  T924] Object 00000000b211e85a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.963977][  T924] Object 0000000053457b75: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964031][  T924] Object 00000000a6f8d40e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964097][  T924] Object 00000000a02a557f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964152][  T924] Object 0000000059a320fb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964218][  T924] Object 00000000aaa0c239: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964288][  T924] Object 000000001add4b11: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964358][  T924] Object 000000007bf168d6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964426][  T924] Object 0000000084150200: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964493][  T924] Object 000000000642da96: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964562][  T924] Object 00000000f1c32a58: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964632][  T924] Object 00000000cfd89b02: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964698][  T924] Object 000000000c486447: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964803][  T924] Object 0000000011627406: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.964928][  T924] Object 000000008d53fcdc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965024][  T924] Object 000000009dffdfd4: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965143][  T924] Object 000000005d0eae17: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965270][  T924] Object 0000000075227d08: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965388][  T924] Object 00000000c8898d84: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965520][  T924] Object 000000005914b371: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965659][  T924] Object 00000000a96aa124: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965773][  T924] Object 00000000c22b7e51: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.965860][  T924] Object 00000000a7c6ee60: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966008][  T924] Object 0000000009caf8c1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966118][  T924] Object 00000000135fecb7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966238][  T924] Object 00000000370f2819: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966351][  T924] Object 0000000055aac92d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966484][  T924] Object 00000000ca111fe9: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966603][  T924] Object 00000000b8ad384d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966728][  T924] Object 000000001b7ced3d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966858][  T924] Object 0000000078732f09: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.966968][  T924] Object 00000000ee57157a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967106][  T924] Object 000000000f1d3779: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967230][  T924] Object 00000000eacfc252: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967332][  T924] Object 000000006abcee92: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967470][  T924] Object 00000000d1381811: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967589][  T924] Object 00000000a8b70685: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967711][  T924] Object 00000000ac0cc71f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967837][  T924] Object 000000005f11afd2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.967940][  T924] Object 000000009b914a99: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968079][  T924] Object 00000000ae873765: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968195][  T924] Object 00000000057d5d97: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968327][  T924] Object 000000007fcca92d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968432][  T924] Object 00000000f09756c6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968551][  T924] Object 0000000095172d31: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968681][  T924] Object 000000009ca23ebb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968805][  T924] Object 000000003e8e08f5: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.968909][  T924] Object 00000000d908bbe6: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969031][  T924] Object 000000001a41d8fc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969149][  T924] Object 0000000062402f77: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969270][  T924] Object 00000000dba67e95: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969403][  T924] Object 00000000a8001b4d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969518][  T924] Object 0000000006400943: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969640][  T924] Object 00000000fc2c21bb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969762][  T924] Object 000000003be34237: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.969875][  T924] Object 00000000d2cddc41: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970008][  T924] Object 0000000046257224: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970127][  T924] Object 00000000bc6dd975: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970248][  T924] Object 00000000397e5d55: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970363][  T924] Object 0000000022bd4778: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970477][  T924] Object 00000000963f1ff1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970604][  T924] Object 00000000043daeca: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970744][  T924] Object 00000000ee7d0bf1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.970859][  T924] Object 00000000e4e7ea50: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971005][  T924] Object 000000009ed61c4e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971087][  T924] Object 00000000064b9367: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971242][  T924] Object 0000000079caca89: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971344][  T924] Object 00000000b511810c: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971479][  T924] Object 0000000080ad1f92: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971578][  T924] Object 000000006ca56518: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971702][  T924] Object 000000000815f5f3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971818][  T924] Object 000000005550f6c1: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.971952][  T924] Object 00000000d3291dbe: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972060][  T924] Object 00000000bb53e90f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972177][  T924] Object 00000000459055fc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972299][  T924] Object 0000000064bfcd86: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972425][  T924] Object 0000000052b345e7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972552][  T924] Object 00000000f64553cc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972666][  T924] Object 00000000e39fc888: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972775][  T924] Object 0000000037e7c5f2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.972897][  T924] Object 00000000251af3d2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973048][  T924] Object 00000000bf4035c9: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973132][  T924] Object 00000000df11fb12: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973309][  T924] Object 000000004c13fbfe: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973412][  T924] Object 00000000a6ba6ac5: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973530][  T924] Object 000000004328e43a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973668][  T924] Object 000000003538c4db: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973782][  T924] Object 000000004314dd6f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.973889][  T924] Object 000000000303753f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974023][  T924] Object 0000000004c2996d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974151][  T924] Object 00000000498eff7f: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974339][  T924] Object 00000000aea60dfa: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974453][  T924] Object 00000000a122fe27: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974577][  T924] Object 00000000fcd25601: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974729][  T924] Object 00000000c02923fb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974822][  T924] Object 0000000058301ba5: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.974971][  T924] Object 000000006f987dfb: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975128][  T924] Object 0000000027a5dedc: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975264][  T924] Object 0000000005b3ecd7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975375][  T924] Object 00000000a6946bb3: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975474][  T924] Object 00000000df895e5e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975609][  T924] Object 000000005a8cfb87: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975729][  T924] Object 000000005a718b0e: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975863][  T924] Object 0000000029f7cb73: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.975961][  T924] Object 00000000c49cf6fe: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976092][  T924] Object 00000000382c5b8a: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976223][  T924] Object 00000000bbdbbd53: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976326][  T924] Object 0000000009866104: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976479][  T924] Object 000000006fb54a35: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976593][  T924] Object 00000000ba138b15: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976714][  T924] Object 000000004896d243: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976836][  T924] Object 00000000715e3719: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.976961][  T924] Object 00000000b31a6327: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.977080][  T924] Object 00000000469b3fa7: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.977211][  T924] Object 000000000ca34717: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.977324][  T924] Object 0000000062afcbf2: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.977452][  T924] Object 00000000ec39b624: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
[   47.977578][  T924] Object 00000000d9874c53: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
[   47.977686][  T924] Redzone 000000004d458caf: bb bb bb bb bb bb bb
bb                          ........
[   47.977805][  T924] Padding 000000003393d98c: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.977909][  T924] Padding 000000000d637794: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.978025][  T924] Padding 00000000d4fd521d: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.978192][  T924] Padding 00000000325c4503: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.978304][  T924] Padding 00000000f795c69f: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.978412][  T924] Padding 00000000e6145184: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.978570][  T924] Padding 000000001665d379: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a
5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
[   47.978706][  T924] CPU: 83 PID: 924 Comm: kworker/83:1 Tainted:
G    B             5.2.0-rc4-next-20190614+ #15
[   47.978840][  T924] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
[   47.978920][  T924] Call Trace:
[   47.978980][  T924] [c0002015285cf680] [c00000000089793c]
dump_stack+0xb0/0xf4 (unreliable)
[   47.979069][  T924] [c0002015285cf6c0] [c0000000003df8a8]
print_trailer+0x23c/0x264
[   47.979192][  T924] [c0002015285cf750] [c0000000003cedd8]
check_bytes_and_report+0x138/0x160
[   47.979288][  T924] [c0002015285cf7f0] [c0000000003d1de8]
check_object+0x348/0x3e0
[   47.979373][  T924] [c0002015285cf860] [c0000000003d2038]
alloc_debug_processing+0x1b8/0x2c0
[   47.979495][  T924] [c0002015285cf900] [c0000000003d5214]
___slab_alloc+0xb94/0xf80
[   47.979582][  T924] [c0002015285cfa30] [c0000000003d5634]
__slab_alloc+0x34/0x60
[   47.979669][  T924] [c0002015285cfa60] [c0000000003d5b44]
kmem_cache_alloc+0x4e4/0x5a0
[   47.979765][  T924] [c0002015285cfaf0] [c00000000033288c]
create_cache+0x9c/0x2f0
[   47.979898][  T924] [c0002015285cfb60] [c000000000333350]
memcg_create_kmem_cache+0x150/0x1b0
[   47.979997][  T924] [c0002015285cfc00] [c00000000040f15c]
memcg_kmem_cache_create_func+0x3c/0x150
[   47.980102][  T924] [c0002015285cfc40] [c00000000014b060]
process_one_work+0x300/0x800
[   47.980206][  T924] [c0002015285cfd20] [c00000000014b5d8]
worker_thread+0x78/0x540
[   47.980292][  T924] [c0002015285cfdb0] [c000000000155ef8] kthread+0x1b8/0x1c0
[   47.980400][  T924] [c0002015285cfe20] [c00000000000b4cc]
ret_from_kernel_thread+0x5c/0x70
[   47.980488][  T924] FIX kmem_cache: Restoring 0x000000005bf9327f-
0x0000000012b186d0=0x6b
[   47.980488][  T924] 
[   47.980640][  T924] FIX kmem_cache: Marking all objects used
[  OK  ] Started Restore /run/initramfs on shutdown.
[  OK  ] Started Tell Plymouth To Write Out Runtime Data.
[  OK  ] Started Import network configuration from initramfs.
         Starting Create Volatile Files and Directories...
[  OK  ] Started Create Volatile Files and Directories.
         Starting Update UTMP about System Boot/Shutdown...
Dan Williams June 14, 2019, 7:48 p.m. UTC | #25
On Fri, Jun 14, 2019 at 12:40 PM Qian Cai <cai@lca.pw> wrote:
>
> On Fri, 2019-06-14 at 11:57 -0700, Dan Williams wrote:
> > On Fri, Jun 14, 2019 at 11:03 AM Dan Williams <dan.j.williams@intel.com>
> > wrote:
> > >
> > > On Fri, Jun 14, 2019 at 7:59 AM Qian Cai <cai@lca.pw> wrote:
> > > >
> > > > On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> > > > > Qian Cai <cai@lca.pw> writes:
> > > > >
> > > > >
> > > > > > 1) offline is busted [1]. It looks like test_pages_in_a_zone() missed
> > > > > > the
> > > > > > same
> > > > > > pfn_section_valid() check.
> > > > > >
> > > > > > 2) powerpc booting is generating endless warnings [2]. In
> > > > > > vmemmap_populated() at
> > > > > > arch/powerpc/mm/init_64.c, I tried to change PAGES_PER_SECTION to
> > > > > > PAGES_PER_SUBSECTION, but it alone seems not enough.
> > > > > >
> > > > >
> > > > > Can you check with this change on ppc64.  I haven't reviewed this series
> > > > > yet.
> > > > > > I did limited testing with the change. Before merging this I need to go
> > > > > > through the full series again. The vmemmap populate on ppc64 needs to
> > > > > > handle two translation modes (hash and radix). With respect to vmemmap
> > > > > > hash doesn't set up a translation in the Linux page table. Hence we need
> > > > > > to make sure we don't try to set up a mapping for a range which is
> > > > > > already covered by an existing mapping.
> > > >
> > > > It works fine.
> > >
> > > Strange... it would only change behavior if valid_section() is true
> > > when pfn_valid() is not or vice versa. They "should" be identical
> > > because subsection-size == section-size on PowerPC, at least with the
> > > current definition of SUBSECTION_SHIFT. I suspect maybe
> > > free_area_init_nodes() is too late to call subsection_map_init() for
> > > PowerPC.
> >
> > Can you give the attached incremental patch a try? This will break
> > support for doing sub-section hot-add in a section that was only
> > partially populated early at init, but that can be repaired later in
> > the series. First things first, don't regress.
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 874eb22d22e4..520c83aa0fec 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7286,12 +7286,10 @@ void __init free_area_init_nodes(unsigned long
> > *max_zone_pfn)
> >
> >         /* Print out the early node map */
> >         pr_info("Early memory node ranges\n");
> > -       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
> > +       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
> >                 pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
> >                         (u64)start_pfn << PAGE_SHIFT,
> >                         ((u64)end_pfn << PAGE_SHIFT) - 1);
> > -               subsection_map_init(start_pfn, end_pfn - start_pfn);
> > -       }
> >
> >         /* Initialise every node */
> >         mminit_verify_pageflags_layout();
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index 0baa2e55cfdd..bca8e6fa72d2 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -533,6 +533,7 @@ static void __init sparse_init_nid(int nid,
> > unsigned long pnum_begin,
> >                 }
> >                 check_usemap_section_nr(nid, usage);
> >                 sparse_init_one_section(__nr_to_section(pnum), pnum,
> > map, usage);
> > +               subsection_map_init(section_nr_to_pfn(pnum),
> > PAGES_PER_SECTION);
> >                 usage = (void *) usage + mem_section_usage_size();
> >         }
> >         sparse_buffer_fini();
>
> It works fine except it starts to trigger slab debugging errors during boot. Not
> sure if it is related yet.

If you want you can give this branch a try if you suspect something
else in -next is triggering the slab warning.

https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=subsection-v9

It's the original v9 patchset + dependencies backported to v5.2-rc4.

I otherwise don't see how subsections would affect slab caches.
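
For readers following the ordering question in this message, here is a minimal user-space sketch, not kernel code, of why a section can look valid at section granularity while a PFN inside it still fails the subsection check until the subsection map has been filled in. The toy_* names, the four-section machine and the x86-64-like shifts are assumptions made for illustration; only subsection_map_init(), valid_section(), pfn_valid() and SUBSECTION_SHIFT are names taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed, x86-64-like geometry; chosen so one section has 64 subsections. */
#define PAGE_SHIFT           12   /* 4 KiB pages       */
#define SECTION_SIZE_BITS    27   /* 128 MiB sections  */
#define SUBSECTION_SHIFT     21   /* 2 MiB subsections */
#define PFN_SECTION_SHIFT    (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PFN_SUBSECTION_SHIFT (SUBSECTION_SHIFT - PAGE_SHIFT)
#define PAGES_PER_SECTION    (1UL << PFN_SECTION_SHIFT)
#define PAGES_PER_SUBSECTION (1UL << PFN_SUBSECTION_SHIFT)
#define TOY_NR_SECTIONS      4UL  /* hypothetical toy machine */

struct toy_section {
	bool present;            /* what a section-level check looks at */
	uint64_t subsection_map; /* one bit per 2 MiB subsection */
};

static struct toy_section toy_sections[TOY_NR_SECTIONS];

/* Index of the subsection that holds pfn, within its section. */
static unsigned long toy_subsection_idx(unsigned long pfn)
{
	return (pfn & (PAGES_PER_SECTION - 1)) >> PFN_SUBSECTION_SHIFT;
}

/* Model of early section setup: the section exists, its map is still empty. */
static void toy_section_mark_present(unsigned long nr)
{
	if (nr < TOY_NR_SECTIONS)
		toy_sections[nr].present = true;
}

/* Model of subsection_map_init(): set a bit for every subsection touched. */
static void toy_subsection_map_init(unsigned long pfn, unsigned long nr_pages)
{
	unsigned long end = pfn + nr_pages;
	unsigned long p = pfn & ~(PAGES_PER_SUBSECTION - 1);

	for (; p < end; p += PAGES_PER_SUBSECTION) {
		unsigned long nr = p >> PFN_SECTION_SHIFT;

		if (nr >= TOY_NR_SECTIONS)
			break;
		toy_sections[nr].subsection_map |=
			UINT64_C(1) << toy_subsection_idx(p);
	}
}

/* Section-granular check, in the spirit of valid_section(). */
static bool toy_valid_section(unsigned long pfn)
{
	unsigned long nr = pfn >> PFN_SECTION_SHIFT;

	return nr < TOY_NR_SECTIONS && toy_sections[nr].present;
}

/* Subsection-granular check, in the spirit of the reworked pfn_valid(). */
static bool toy_pfn_valid(unsigned long pfn)
{
	if (!toy_valid_section(pfn))
		return false;
	return (toy_sections[pfn >> PFN_SECTION_SHIFT].subsection_map &
		(UINT64_C(1) << toy_subsection_idx(pfn))) != 0;
}
```

In this model, calling toy_section_mark_present(0) alone leaves toy_valid_section(100) true and toy_pfn_valid(100) false. The two checks agree only once toy_subsection_map_init(0, 1024) has run, which is the kind of ordering dependency suspected for PowerPC in the quoted text.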
Qian Cai June 14, 2019, 8:43 p.m. UTC | #26
On Fri, 2019-06-14 at 12:48 -0700, Dan Williams wrote:
> On Fri, Jun 14, 2019 at 12:40 PM Qian Cai <cai@lca.pw> wrote:
> > 
> > On Fri, 2019-06-14 at 11:57 -0700, Dan Williams wrote:
> > > On Fri, Jun 14, 2019 at 11:03 AM Dan Williams <dan.j.williams@intel.com>
> > > wrote:
> > > > 
> > > > On Fri, Jun 14, 2019 at 7:59 AM Qian Cai <cai@lca.pw> wrote:
> > > > > 
> > > > > On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> > > > > > Qian Cai <cai@lca.pw> writes:
> > > > > > 
> > > > > > 
> > > > > > > 1) offline is busted [1]. It looks like test_pages_in_a_zone()
> > > > > > > missed
> > > > > > > the
> > > > > > > same
> > > > > > > pfn_section_valid() check.
> > > > > > > 
> > > > > > > 2) powerpc booting is generating endless warnings [2]. In
> > > > > > > vmemmap_populated() at
> > > > > > > arch/powerpc/mm/init_64.c, I tried to change PAGES_PER_SECTION to
> > > > > > > PAGES_PER_SUBSECTION, but it alone seems not enough.
> > > > > > > 
> > > > > > 
> > > > > > Can you check with this change on ppc64.  I haven't reviewed this
> > > > > > series
> > > > > > yet.
> > > > > > > I did limited testing with the change. Before merging this I need to go
> > > > > > > through the full series again. The vmemmap populate on ppc64 needs
> > > > > > > to
> > > > > > > handle two translation modes (hash and radix). With respect to vmemmap
> > > > > > > hash doesn't set up a translation in the Linux page table. Hence we
> > > > > > > need
> > > > > > > to make sure we don't try to set up a mapping for a range which is
> > > > > > > already covered by an existing mapping.
> > > > > 
> > > > > It works fine.
> > > > 
> > > > Strange... it would only change behavior if valid_section() is true
> > > > when pfn_valid() is not or vice versa. They "should" be identical
> > > > because subsection-size == section-size on PowerPC, at least with the
> > > > current definition of SUBSECTION_SHIFT. I suspect maybe
> > > > free_area_init_nodes() is too late to call subsection_map_init() for
> > > > PowerPC.
> > > 
> > > Can you give the attached incremental patch a try? This will break
> > > support for doing sub-section hot-add in a section that was only
> > > partially populated early at init, but that can be repaired later in
> > > the series. First things first, don't regress.
> > > 
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 874eb22d22e4..520c83aa0fec 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -7286,12 +7286,10 @@ void __init free_area_init_nodes(unsigned long
> > > *max_zone_pfn)
> > > 
> > >         /* Print out the early node map */
> > >         pr_info("Early memory node ranges\n");
> > > -       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
> > > &nid) {
> > > +       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
> > > &nid)
> > >                 pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
> > >                         (u64)start_pfn << PAGE_SHIFT,
> > >                         ((u64)end_pfn << PAGE_SHIFT) - 1);
> > > -               subsection_map_init(start_pfn, end_pfn - start_pfn);
> > > -       }
> > > 
> > >         /* Initialise every node */
> > >         mminit_verify_pageflags_layout();
> > > diff --git a/mm/sparse.c b/mm/sparse.c
> > > index 0baa2e55cfdd..bca8e6fa72d2 100644
> > > --- a/mm/sparse.c
> > > +++ b/mm/sparse.c
> > > @@ -533,6 +533,7 @@ static void __init sparse_init_nid(int nid,
> > > unsigned long pnum_begin,
> > >                 }
> > >                 check_usemap_section_nr(nid, usage);
> > >                 sparse_init_one_section(__nr_to_section(pnum), pnum,
> > > map, usage);
> > > +               subsection_map_init(section_nr_to_pfn(pnum),
> > > PAGES_PER_SECTION);
> > >                 usage = (void *) usage + mem_section_usage_size();
> > >         }
> > >         sparse_buffer_fini();
> > 
> > It works fine except it starts to trigger slab debugging errors during boot.
> > Not
> > sure if it is related yet.
> 
> If you want you can give this branch a try if you suspect something
> else in -next is triggering the slab warning.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=subsection-v9
> 
> It's the original v9 patchset + dependencies backported to v5.2-rc4.
> 
> I otherwise don't see how subsections would affect slab caches.

It works fine there.
Aneesh Kumar K.V June 16, 2019, 3:49 a.m. UTC | #27
Dan Williams <dan.j.williams@intel.com> writes:

> On Fri, Jun 14, 2019 at 9:18 AM Aneesh Kumar K.V
> <aneesh.kumar@linux.ibm.com> wrote:
>>
>> On 6/14/19 9:05 PM, Oscar Salvador wrote:
>> > On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
>> >> Can you check with this change on ppc64.  I haven't reviewed this series yet.
>> >> I did limited testing with the change. Before merging this I need to go
>> >> through the full series again. The vmemmap populate on ppc64 needs to
>> >> handle two translation modes (hash and radix). With respect to vmemmap
>> >> hash doesn't set up a translation in the Linux page table. Hence we need
>> >> to make sure we don't try to set up a mapping for a range which is
>> >> already covered by an existing mapping.
>> >>
>> >> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>> >> index a4e17a979e45..15c342f0a543 100644
>> >> --- a/arch/powerpc/mm/init_64.c
>> >> +++ b/arch/powerpc/mm/init_64.c
>> >> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
>> >>    * which overlaps this vmemmap page is initialised then this page is
>> >>    * initialised already.
>> >>    */
>> >> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
>> >> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
>> >>   {
>> >>      unsigned long end = start + page_size;
>> >>      start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>> >>
>> >> -    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
>> >> -            if (pfn_valid(page_to_pfn((struct page *)start)))
>> >> -                    return 1;
>> >> +    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
>> >>
>> >> -    return 0;
>> >> +            struct mem_section *ms;
>> >> +            unsigned long pfn = page_to_pfn((struct page *)start);
>> >> +
>> >> +            if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>> >> +                    return 0;
>> >
>> > I might be missing something, but is this right?
>> > Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
>> > vmemmap_populate will go on and populate it.
>>
>> I should drop that completely. We should not hit that condition at all.
>> I will send a final patch once I go through the full patch series making
>> sure we are not breaking any ppc64 details.
>>
>> Wondering why we did the below
>>
>> #if defined(ARCH_SUBSECTION_SHIFT)
>> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
>> #elif defined(PMD_SHIFT)
>> #define SUBSECTION_SHIFT (PMD_SHIFT)
>> #else
>> /*
>>   * Memory hotplug enabled platforms avoid this default because they
>>   * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
>>   * this is kept as a backstop to allow compilation on
>>   * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
>>   */
>> #define SUBSECTION_SHIFT 21
>> #endif
>>
>> why not
>>
>> #if defined(ARCH_SUBSECTION_SHIFT)
>> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
>> #else
>> #define SUBSECTION_SHIFT  SECTION_SHIFT
>> #endif
>>
>> ie, if SUBSECTION is not supported by arch we have one sub-section per
>> section?
>
> A couple comments:
>
> The only reason ARCH_SUBSECTION_SHIFT exists is because PMD_SHIFT on
> PowerPC was a non-constant value. However, I'm planning to remove the
> distinction in the next rev of the patches. Jeff rightly points out
> that having a variable subsection size per arch will lead to
> situations where persistent memory namespaces are not portable across
> archs. So I plan to just make SUBSECTION_SHIFT 21 everywhere.

What is the dependency between subsection and pageblock_order? Shouldn't
subsection size >= pageblock size?

We do have pageblock size derived from HugeTLB size.

-aneesh
Dan Williams June 16, 2019, 3:42 p.m. UTC | #28
On Fri, Jun 14, 2019 at 1:43 PM Qian Cai <cai@lca.pw> wrote:
>
> On Fri, 2019-06-14 at 12:48 -0700, Dan Williams wrote:
> > On Fri, Jun 14, 2019 at 12:40 PM Qian Cai <cai@lca.pw> wrote:
> > >
> > > On Fri, 2019-06-14 at 11:57 -0700, Dan Williams wrote:
> > > > On Fri, Jun 14, 2019 at 11:03 AM Dan Williams <dan.j.williams@intel.com>
> > > > wrote:
> > > > >
> > > > > On Fri, Jun 14, 2019 at 7:59 AM Qian Cai <cai@lca.pw> wrote:
> > > > > >
> > > > > > On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
> > > > > > > Qian Cai <cai@lca.pw> writes:
> > > > > > >
> > > > > > >
> > > > > > > > 1) offline is busted [1]. It looks like test_pages_in_a_zone()
> > > > > > > > missed
> > > > > > > > the
> > > > > > > > same
> > > > > > > > pfn_section_valid() check.
> > > > > > > >
> > > > > > > > 2) powerpc booting is generating endless warnings [2]. In
> > > > > > > > vmemmap_populated() at
> > > > > > > > arch/powerpc/mm/init_64.c, I tried to change PAGES_PER_SECTION to
> > > > > > > > PAGES_PER_SUBSECTION, but it alone seems not enough.
> > > > > > > >
> > > > > > >
> > > > > > > Can you check with this change on ppc64.  I haven't reviewed this
> > > > > > > series
> > > > > > > yet.
> > > > > > > > I did limited testing with the change. Before merging this I need to go
> > > > > > > > through the full series again. The vmemmap populate on ppc64 needs
> > > > > > > > to
> > > > > > > > handle two translation modes (hash and radix). With respect to vmemmap
> > > > > > > > hash doesn't set up a translation in the Linux page table. Hence we
> > > > > > > > need
> > > > > > > > to make sure we don't try to set up a mapping for a range which is
> > > > > > > > already covered by an existing mapping.
> > > > > >
> > > > > > It works fine.
> > > > >
> > > > > Strange... it would only change behavior if valid_section() is true
> > > > > when pfn_valid() is not or vice versa. They "should" be identical
> > > > > because subsection-size == section-size on PowerPC, at least with the
> > > > > current definition of SUBSECTION_SHIFT. I suspect maybe
> > > > > free_area_init_nodes() is too late to call subsection_map_init() for
> > > > > PowerPC.
> > > >
> > > > Can you give the attached incremental patch a try? This will break
> > > > support for doing sub-section hot-add in a section that was only
> > > > partially populated early at init, but that can be repaired later in
> > > > the series. First things first, don't regress.
> > > >
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index 874eb22d22e4..520c83aa0fec 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -7286,12 +7286,10 @@ void __init free_area_init_nodes(unsigned long
> > > > *max_zone_pfn)
> > > >
> > > >         /* Print out the early node map */
> > > >         pr_info("Early memory node ranges\n");
> > > > -       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
> > > > &nid) {
> > > > +       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
> > > > &nid)
> > > >                 pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
> > > >                         (u64)start_pfn << PAGE_SHIFT,
> > > >                         ((u64)end_pfn << PAGE_SHIFT) - 1);
> > > > -               subsection_map_init(start_pfn, end_pfn - start_pfn);
> > > > -       }
> > > >
> > > >         /* Initialise every node */
> > > >         mminit_verify_pageflags_layout();
> > > > diff --git a/mm/sparse.c b/mm/sparse.c
> > > > index 0baa2e55cfdd..bca8e6fa72d2 100644
> > > > --- a/mm/sparse.c
> > > > +++ b/mm/sparse.c
> > > > @@ -533,6 +533,7 @@ static void __init sparse_init_nid(int nid,
> > > > unsigned long pnum_begin,
> > > >                 }
> > > >                 check_usemap_section_nr(nid, usage);
> > > >                 sparse_init_one_section(__nr_to_section(pnum), pnum,
> > > > map, usage);
> > > > +               subsection_map_init(section_nr_to_pfn(pnum),
> > > > PAGES_PER_SECTION);
> > > >                 usage = (void *) usage + mem_section_usage_size();
> > > >         }
> > > >         sparse_buffer_fini();
> > >
> > > It works fine except it starts to trigger slab debugging errors during boot.
> > > Not
> > > sure if it is related yet.
> >
> > If you want you can give this branch a try if you suspect something
> > else in -next is triggering the slab warning.
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=subsection-v9
> >
> > It's the original v9 patchset + dependencies backported to v5.2-rc4.
> >
> > I otherwise don't see how subsections would affect slab caches.
>
> It works fine there.

Much appreciated Qian!

Does this change modulate the x86 failures?
Qian Cai June 17, 2019, 2:25 a.m. UTC | #29
> On Jun 16, 2019, at 11:42 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> 
> On Fri, Jun 14, 2019 at 1:43 PM Qian Cai <cai@lca.pw> wrote:
>> 
>> On Fri, 2019-06-14 at 12:48 -0700, Dan Williams wrote:
>>> On Fri, Jun 14, 2019 at 12:40 PM Qian Cai <cai@lca.pw> wrote:
>>>> 
>>>> On Fri, 2019-06-14 at 11:57 -0700, Dan Williams wrote:
>>>>> On Fri, Jun 14, 2019 at 11:03 AM Dan Williams <dan.j.williams@intel.com>
>>>>> wrote:
>>>>>> 
>>>>>> On Fri, Jun 14, 2019 at 7:59 AM Qian Cai <cai@lca.pw> wrote:
>>>>>>> 
>>>>>>> On Fri, 2019-06-14 at 14:28 +0530, Aneesh Kumar K.V wrote:
>>>>>>>> Qian Cai <cai@lca.pw> writes:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 1) offline is busted [1]. It looks like test_pages_in_a_zone()
>>>>>>>>> missed
>>>>>>>>> the
>>>>>>>>> same
>>>>>>>>> pfn_section_valid() check.
>>>>>>>>> 
>>>>>>>>> 2) powerpc booting is generating endless warnings [2]. In
>>>>>>>>> vmemmap_populated() at
>>>>>>>>> arch/powerpc/mm/init_64.c, I tried to change PAGES_PER_SECTION to
>>>>>>>>> PAGES_PER_SUBSECTION, but it alone seems not enough.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Can you check with this change on ppc64.  I haven't reviewed this
>>>>>>>> series
>>>>>>>> yet.
>>>>>>>> I did limited testing with the change. Before merging this I need to go
>>>>>>>> through the full series again. The vmemmap populate on ppc64 needs
>>>>>>>> to
>>>>>>>> handle two translation modes (hash and radix). With respect to vmemmap
>>>>>>>> hash doesn't set up a translation in the Linux page table. Hence we
>>>>>>>> need
>>>>>>>> to make sure we don't try to set up a mapping for a range which is
>>>>>>>> already covered by an existing mapping.
>>>>>>> 
>>>>>>> It works fine.
>>>>>> 
>>>>>> Strange... it would only change behavior if valid_section() is true
>>>>>> when pfn_valid() is not or vice versa. They "should" be identical
>>>>>> because subsection-size == section-size on PowerPC, at least with the
>>>>>> current definition of SUBSECTION_SHIFT. I suspect maybe
>>>>>> free_area_init_nodes() is too late to call subsection_map_init() for
>>>>>> PowerPC.
>>>>> 
>>>>> Can you give the attached incremental patch a try? This will break
>>>>> support for doing sub-section hot-add in a section that was only
>>>>> partially populated early at init, but that can be repaired later in
>>>>> the series. First things first, don't regress.
>>>>> 
>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>> index 874eb22d22e4..520c83aa0fec 100644
>>>>> --- a/mm/page_alloc.c
>>>>> +++ b/mm/page_alloc.c
>>>>> @@ -7286,12 +7286,10 @@ void __init free_area_init_nodes(unsigned long
>>>>> *max_zone_pfn)
>>>>> 
>>>>>        /* Print out the early node map */
>>>>>        pr_info("Early memory node ranges\n");
>>>>> -       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
>>>>> &nid) {
>>>>> +       for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
>>>>> &nid)
>>>>>                pr_info("  node %3d: [mem %#018Lx-%#018Lx]\n", nid,
>>>>>                        (u64)start_pfn << PAGE_SHIFT,
>>>>>                        ((u64)end_pfn << PAGE_SHIFT) - 1);
>>>>> -               subsection_map_init(start_pfn, end_pfn - start_pfn);
>>>>> -       }
>>>>> 
>>>>>        /* Initialise every node */
>>>>>        mminit_verify_pageflags_layout();
>>>>> diff --git a/mm/sparse.c b/mm/sparse.c
>>>>> index 0baa2e55cfdd..bca8e6fa72d2 100644
>>>>> --- a/mm/sparse.c
>>>>> +++ b/mm/sparse.c
>>>>> @@ -533,6 +533,7 @@ static void __init sparse_init_nid(int nid,
>>>>> unsigned long pnum_begin,
>>>>>                }
>>>>>                check_usemap_section_nr(nid, usage);
>>>>>                sparse_init_one_section(__nr_to_section(pnum), pnum,
>>>>> map, usage);
>>>>> +               subsection_map_init(section_nr_to_pfn(pnum),
>>>>> PAGES_PER_SECTION);
>>>>>                usage = (void *) usage + mem_section_usage_size();
>>>>>        }
>>>>>        sparse_buffer_fini();
>>>> 
>>>> It works fine except it starts to trigger slab debugging errors during boot.
>>>> Not
>>>> sure if it is related yet.
>>> 
>>> If you want you can give this branch a try if you suspect something
>>> else in -next is triggering the slab warning.
>>> 
>>> https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=subsection-v9
>>> 
>>> It's the original v9 patchset + dependencies backported to v5.2-rc4.
>>> 
>>> I otherwise don't see how subsections would affect slab caches.
>> 
>> It works fine there.
> 
> Much appreciated Qian!
> 
> Does this change modulate the x86 failures?

Yes, it also fixes the kmemleak_scan() and offline issues on x86.
Dan Williams June 17, 2019, 5:21 p.m. UTC | #30
On Sat, Jun 15, 2019 at 8:50 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
>
> Dan Williams <dan.j.williams@intel.com> writes:
>
> > On Fri, Jun 14, 2019 at 9:18 AM Aneesh Kumar K.V
> > <aneesh.kumar@linux.ibm.com> wrote:
> >>
> >> On 6/14/19 9:05 PM, Oscar Salvador wrote:
> >> > On Fri, Jun 14, 2019 at 02:28:40PM +0530, Aneesh Kumar K.V wrote:
> >> >> Can you check with this change on ppc64.  I haven't reviewed this series yet.
> >> >> I did limited testing with the change. Before merging this I need to go
> >> >> through the full series again. The vmemmap populate on ppc64 needs to
> >> >> handle two translation modes (hash and radix). With respect to vmemmap
> >> >> hash doesn't set up a translation in the Linux page table. Hence we need
> >> >> to make sure we don't try to set up a mapping for a range which is
> >> >> already covered by an existing mapping.
> >> >>
> >> >> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> >> >> index a4e17a979e45..15c342f0a543 100644
> >> >> --- a/arch/powerpc/mm/init_64.c
> >> >> +++ b/arch/powerpc/mm/init_64.c
> >> >> @@ -88,16 +88,23 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
> >> >>    * which overlaps this vmemmap page is initialised then this page is
> >> >>    * initialised already.
> >> >>    */
> >> >> -static int __meminit vmemmap_populated(unsigned long start, int page_size)
> >> >> +static bool __meminit vmemmap_populated(unsigned long start, int page_size)
> >> >>   {
> >> >>      unsigned long end = start + page_size;
> >> >>      start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
> >> >>
> >> >> -    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
> >> >> -            if (pfn_valid(page_to_pfn((struct page *)start)))
> >> >> -                    return 1;
> >> >> +    for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) {
> >> >>
> >> >> -    return 0;
> >> >> +            struct mem_section *ms;
> >> >> +            unsigned long pfn = page_to_pfn((struct page *)start);
> >> >> +
> >> >> +            if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> >> >> +                    return 0;
> >> >
> >> > I might be missing something, but is this right?
> >> > Having a section_nr above NR_MEM_SECTIONS is invalid, but if we return 0 here,
> >> > vmemmap_populate will go on and populate it.
> >>
> >> I should drop that completely. We should not hit that condition at all.
> >> I will send a final patch once I go through the full patch series making
> >> sure we are not breaking any ppc64 details.
> >>
> >> Wondering why we did the below
> >>
> >> #if defined(ARCH_SUBSECTION_SHIFT)
> >> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
> >> #elif defined(PMD_SHIFT)
> >> #define SUBSECTION_SHIFT (PMD_SHIFT)
> >> #else
> >> /*
> >>   * Memory hotplug enabled platforms avoid this default because they
> >>   * either define ARCH_SUBSECTION_SHIFT, or PMD_SHIFT is a constant, but
> >>   * this is kept as a backstop to allow compilation on
> >>   * !ARCH_ENABLE_MEMORY_HOTPLUG archs.
> >>   */
> >> #define SUBSECTION_SHIFT 21
> >> #endif
> >>
> >> why not
> >>
> >> #if defined(ARCH_SUBSECTION_SHIFT)
> >> #define SUBSECTION_SHIFT (ARCH_SUBSECTION_SHIFT)
> >> #else
> >> #define SUBSECTION_SHIFT  SECTION_SHIFT
> >> #endif
> >>
> >> ie, if SUBSECTION is not supported by arch we have one sub-section per
> >> section?
> >
> > A couple comments:
> >
> > The only reason ARCH_SUBSECTION_SHIFT exists is because PMD_SHIFT on
> > PowerPC was a non-constant value. However, I'm planning to remove the
> > distinction in the next rev of the patches. Jeff rightly points out
> > that having a variable subsection size per arch will lead to
> > situations where persistent memory namespaces are not portable across
> > archs. So I plan to just make SUBSECTION_SHIFT 21 everywhere.
>
> What is the dependency between subsection and pageblock_order? Shouldn't
> subsection size >= pageblock size?
>
> > We do have pageblock size derived from HugeTLB size.

The pageblock size is independent of subsection-size. The pageblock
size is a page-allocator concern, subsections only exist for pages
that are never onlined to the page-allocator.
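
To make the sizes in this exchange concrete, here is a small sketch of the arithmetic implied by a fixed SUBSECTION_SHIFT of 21. The helper names are hypothetical, and the page and section shifts mentioned afterwards (12 and 27 for x86-64, 16 and 24 for ppc64 with 64 KiB pages) are assumptions for illustration rather than values stated in the thread.

```c
#include <assert.h>

#define SUBSECTION_SHIFT 21 /* the value proposed in the thread: 2 MiB */

/* Bytes covered by one subsection. */
static unsigned long subsection_bytes(void)
{
	return 1UL << SUBSECTION_SHIFT;
}

/* Pages per subsection for a given page size, expressed as a shift. */
static unsigned long pages_per_subsection(unsigned int page_shift)
{
	return 1UL << (SUBSECTION_SHIFT - page_shift);
}

/* Subsections per section, i.e. bits needed in a per-section map. */
static unsigned long subsections_per_section(unsigned int section_size_bits)
{
	return 1UL << (section_size_bits - SUBSECTION_SHIFT);
}
```

With the assumed x86-64 values, pages_per_subsection(12) is 512 and subsections_per_section(27) is 64; with the assumed ppc64 values, pages_per_subsection(16) is 32 and subsections_per_section(24) is 8. None of these quantities involve pageblock_order, which is consistent with the reply above.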

Patch

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 0b8a5e5ef2da..f02be86077e3 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -28,6 +28,7 @@ 
 	unsigned long ___nr = pfn_to_section_nr(___pfn);	   \
 								   \
 	if (___nr < NR_MEM_SECTIONS && online_section_nr(___nr) && \
+	    pfn_section_valid(__nr_to_section(___nr), pfn) &&	   \
 	    pfn_valid_within(___pfn))				   \
 		___page = pfn_to_page(___pfn);			   \
 	___page;						   \
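
The effect of the one-line change can be sketched in user space. This is a toy model under stated assumptions, not the kernel macro: struct toy_section, toy_pfn_section_valid(), toy_lookup_ok() and the fixed shifts are hypothetical, and the function only reports whether a lookup would hand back a struct page rather than returning one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PFN_SECTION_SHIFT    15  /* assumed: 32768 pages per section  */
#define TOY_PFN_SUBSECTION_SHIFT 9   /* assumed: 512 pages per subsection */
#define TOY_NR_MEM_SECTIONS      4UL /* hypothetical toy machine          */

struct toy_section {
	bool online;             /* stands in for online_section_nr() */
	uint64_t subsection_map; /* one bit per populated subsection  */
};

/* Stand-in for pfn_section_valid(): is the PFN's subsection populated? */
static bool toy_pfn_section_valid(const struct toy_section *ms,
				  unsigned long pfn)
{
	unsigned long in_section = pfn & ((1UL << TOY_PFN_SECTION_SHIFT) - 1);
	unsigned long idx = in_section >> TOY_PFN_SUBSECTION_SHIFT;

	return (ms->subsection_map >> idx) & 1;
}

/*
 * Would the lookup return a page? check_subsection == false models the
 * macro before the patch, true models it with the added test.
 */
static bool toy_lookup_ok(const struct toy_section *secs, unsigned long pfn,
			  bool check_subsection)
{
	unsigned long nr = pfn >> TOY_PFN_SECTION_SHIFT;

	if (nr >= TOY_NR_MEM_SECTIONS || !secs[nr].online)
		return false;
	if (check_subsection && !toy_pfn_section_valid(&secs[nr], pfn))
		return false;
	return true;
}
```

For an online section whose map covers only its first two subsections, a PFN in the fourth subsection passes the lookup when check_subsection is false and is rejected when it is true. In the kernel that rejected case is the uninitialized, poisoned struct page that kmemleak_scan() tripped over.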