Message ID | 20240604031658.951493-1-lizhijian@fujitsu.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | nvdimm: Fix devs leaks in scan_labels() | expand |
On 6/3/24 8:16 PM, Li Zhijian wrote: > Don't allocate devs again when it's valid pointer which has pionted to > the memory allocated above with size (count + 2 * sizeof(dev)). > > A kmemleak reports: > unreferenced object 0xffff88800dda1980 (size 16): > comm "kworker/u10:5", pid 69, jiffies 4294671781 > hex dump (first 16 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace (crc 0): > [<00000000c5dea560>] __kmalloc+0x32c/0x470 > [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] > [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] > [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] > [<00000000a5f3da2e>] really_probe+0xc6/0x390 > [<00000000129e2a69>] __driver_probe_device+0x78/0x150 > [<000000002dfed28b>] driver_probe_device+0x1e/0x90 > [<00000000e7048de2>] __device_attach_driver+0x85/0x110 > [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 > [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 > [<0000000026dabec0>] bus_probe_device+0x94/0xb0 > [<00000000c590d936>] device_add+0x656/0x870 > [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] > [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 > [<00000000e201f4b0>] process_one_work+0x1ee/0x600 > [<000000006d90d5a9>] worker_thread+0x183/0x350 > > Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> > --- > drivers/nvdimm/namespace_devs.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c > index d6d558f94d6b..56b016dbe307 100644 > --- a/drivers/nvdimm/namespace_devs.c > +++ b/drivers/nvdimm/namespace_devs.c > @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region *nd_region) > /* Publish a zero-sized namespace for userspace to configure. */ > nd_mapping_free_labels(nd_mapping); > > - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > + /* devs probably has been allocated */ > + if (!devs) > + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); This changes the behavior of this code and possibly corrupting the previously allocated memory at times when 'devs' is valid. Was the 'devs' leaked out from the previous loop and should be freed instead? > if (!devs) > goto err; >
Li Zhijian wrote: > Don't allocate devs again when it's valid pointer which has pionted to > the memory allocated above with size (count + 2 * sizeof(dev)). > > A kmemleak reports: > unreferenced object 0xffff88800dda1980 (size 16): > comm "kworker/u10:5", pid 69, jiffies 4294671781 > hex dump (first 16 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace (crc 0): > [<00000000c5dea560>] __kmalloc+0x32c/0x470 > [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] > [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] > [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] > [<00000000a5f3da2e>] really_probe+0xc6/0x390 > [<00000000129e2a69>] __driver_probe_device+0x78/0x150 > [<000000002dfed28b>] driver_probe_device+0x1e/0x90 > [<00000000e7048de2>] __device_attach_driver+0x85/0x110 > [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 > [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 > [<0000000026dabec0>] bus_probe_device+0x94/0xb0 > [<00000000c590d936>] device_add+0x656/0x870 > [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] > [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 > [<00000000e201f4b0>] process_one_work+0x1ee/0x600 > [<000000006d90d5a9>] worker_thread+0x183/0x350 > > Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> > --- > drivers/nvdimm/namespace_devs.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c > index d6d558f94d6b..56b016dbe307 100644 > --- a/drivers/nvdimm/namespace_devs.c > +++ b/drivers/nvdimm/namespace_devs.c > @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region *nd_region) > /* Publish a zero-sized namespace for userspace to configure. */ > nd_mapping_free_labels(nd_mapping); > > - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > + /* devs probably has been allocated */ I don't think this is where the bug is. The loop above is processing the known labels and should exit with a count > 0 if devs is not NULL. From what I can tell create_namespace_pmem() must be returning EAGAIN which leaves devs allocated but fails to increment count. Thus there are no valid labels but devs was not free'ed. Can you trace the error you are seeing a bit more to see if this is the case? Thanks, Ira > + if (!devs) > + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > if (!devs) > goto err; > > -- > 2.29.2 >
On 07/06/2024 00:27, Dave Jiang wrote: > > > On 6/3/24 8:16 PM, Li Zhijian wrote: >> Don't allocate devs again when it's valid pointer which has pionted to >> the memory allocated above with size (count + 2 * sizeof(dev)). >> >> A kmemleak reports: >> unreferenced object 0xffff88800dda1980 (size 16): >> comm "kworker/u10:5", pid 69, jiffies 4294671781 >> hex dump (first 16 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> backtrace (crc 0): >> [<00000000c5dea560>] __kmalloc+0x32c/0x470 >> [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] >> [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] >> [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] >> [<00000000a5f3da2e>] really_probe+0xc6/0x390 >> [<00000000129e2a69>] __driver_probe_device+0x78/0x150 >> [<000000002dfed28b>] driver_probe_device+0x1e/0x90 >> [<00000000e7048de2>] __device_attach_driver+0x85/0x110 >> [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 >> [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 >> [<0000000026dabec0>] bus_probe_device+0x94/0xb0 >> [<00000000c590d936>] device_add+0x656/0x870 >> [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] >> [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 >> [<00000000e201f4b0>] process_one_work+0x1ee/0x600 >> [<000000006d90d5a9>] worker_thread+0x183/0x350 >> >> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") >> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> >> --- >> drivers/nvdimm/namespace_devs.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c >> index d6d558f94d6b..56b016dbe307 100644 >> --- a/drivers/nvdimm/namespace_devs.c >> +++ b/drivers/nvdimm/namespace_devs.c >> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region *nd_region) >> /* Publish a zero-sized namespace for userspace to configure. */ >> nd_mapping_free_labels(nd_mapping); >> >> - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); >> + /* devs probably has been allocated */ >> + if (!devs) >> + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > > This changes the behavior of this code and possibly corrupting the previously allocated memory at times when 'devs' is valid. If we reach here, that means count == 0 is true. so we can deduce that the previously allocated memory was not touched at all in previous loop. and its size was (2 * sizeof(dev)) too. Was the 'devs' leaked out from the previous loop and should be freed instead? this option is also fine to me. we can also fix this by free(devs) before allocate it again.: + kfree(devs); // kfree(NULL) is safe. devs = kcalloc(2, sizeof(dev), GFP_KERNEL); Thanks Zhijian > >> if (!devs) >> goto err; >> >
On 07/06/2024 00:49, Ira Weiny wrote: > Li Zhijian wrote: >> Don't allocate devs again when it's valid pointer which has pionted to >> the memory allocated above with size (count + 2 * sizeof(dev)). >> >> A kmemleak reports: >> unreferenced object 0xffff88800dda1980 (size 16): >> comm "kworker/u10:5", pid 69, jiffies 4294671781 >> hex dump (first 16 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> backtrace (crc 0): >> [<00000000c5dea560>] __kmalloc+0x32c/0x470 >> [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] >> [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] >> [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] >> [<00000000a5f3da2e>] really_probe+0xc6/0x390 >> [<00000000129e2a69>] __driver_probe_device+0x78/0x150 >> [<000000002dfed28b>] driver_probe_device+0x1e/0x90 >> [<00000000e7048de2>] __device_attach_driver+0x85/0x110 >> [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 >> [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 >> [<0000000026dabec0>] bus_probe_device+0x94/0xb0 >> [<00000000c590d936>] device_add+0x656/0x870 >> [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] >> [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 >> [<00000000e201f4b0>] process_one_work+0x1ee/0x600 >> [<000000006d90d5a9>] worker_thread+0x183/0x350 >> >> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") >> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> >> --- >> drivers/nvdimm/namespace_devs.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c >> index d6d558f94d6b..56b016dbe307 100644 >> --- a/drivers/nvdimm/namespace_devs.c >> +++ b/drivers/nvdimm/namespace_devs.c >> @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region *nd_region) >> /* Publish a zero-sized namespace for userspace to configure. */ >> nd_mapping_free_labels(nd_mapping); >> >> - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); >> + /* devs probably has been allocated */ > > I don't think this is where the bug is. The loop above is processing the > known labels and should exit with a count > 0 if devs is not NULL. > > From what I can tell create_namespace_pmem() must be returning EAGAIN > which leaves devs allocated but fails to increment count. Thus there are > no valid labels but devs was not free'ed. Per the piece of the code, return EAGAIN and ENODEV could cause this issue in theory. 1980 dev = create_namespace_pmem(nd_region, nd_mapping, nd_label); 1981 if (IS_ERR(dev)) { 1982 switch (PTR_ERR(dev)) { 1983 case -EAGAIN: 1984 /* skip invalid labels */ 1985 continue; 1986 case -ENODEV: 1987 /* fallthrough to seed creation */ 1988 break; 1989 default: 1990 goto err; 1991 } 1992 } else 1993 devs[count++] = dev; > > Can you trace the error you are seeing a bit more to see if this is the > case? I just tried, but I cannot reproduce this leaking again. When it happened(100% reproduce at that time), I probably had a corrupted LSA(I see empty output with command 'ndctl list'). It seemed the QEMU emulated Nvdimm device was broken for some reasons. Thanks Zhijian > > Thanks, > Ira > >> + if (!devs) >> + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); >> if (!devs) >> goto err; >> >> -- >> 2.29.2 >> > >
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c index d6d558f94d6b..56b016dbe307 100644 --- a/drivers/nvdimm/namespace_devs.c +++ b/drivers/nvdimm/namespace_devs.c @@ -1994,7 +1994,9 @@ static struct device **scan_labels(struct nd_region *nd_region) /* Publish a zero-sized namespace for userspace to configure. */ nd_mapping_free_labels(nd_mapping); - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); + /* devs probably has been allocated */ + if (!devs) + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); if (!devs) goto err;
Don't allocate devs again when it's valid pointer which has pionted to the memory allocated above with size (count + 2 * sizeof(dev)). A kmemleak reports: unreferenced object 0xffff88800dda1980 (size 16): comm "kworker/u10:5", pid 69, jiffies 4294671781 hex dump (first 16 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 0): [<00000000c5dea560>] __kmalloc+0x32c/0x470 [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] [<00000000a5f3da2e>] really_probe+0xc6/0x390 [<00000000129e2a69>] __driver_probe_device+0x78/0x150 [<000000002dfed28b>] driver_probe_device+0x1e/0x90 [<00000000e7048de2>] __device_attach_driver+0x85/0x110 [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 [<0000000026dabec0>] bus_probe_device+0x94/0xb0 [<00000000c590d936>] device_add+0x656/0x870 [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 [<00000000e201f4b0>] process_one_work+0x1ee/0x600 [<000000006d90d5a9>] worker_thread+0x183/0x350 Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> --- drivers/nvdimm/namespace_devs.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)