Message ID | 20240719015836.1060677-1-lizhijian@fujitsu.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v2,1/2] nvdimm: Fix devs leaks in scan_labels() | expand |
Li Zhijian wrote: > The leakage would happend when create_namespace_pmem() meets an invalid > label which gets failure in validating isetcookie. > > Try to resuse the devs that may have already been allocated with size > (2 * sizeof(dev)) previously. > > A kmemleak reports: > unreferenced object 0xffff88800dda1980 (size 16): > comm "kworker/u10:5", pid 69, jiffies 4294671781 > hex dump (first 16 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace (crc 0): > [<00000000c5dea560>] __kmalloc+0x32c/0x470 > [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] > [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] > [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] > [<00000000a5f3da2e>] really_probe+0xc6/0x390 > [<00000000129e2a69>] __driver_probe_device+0x78/0x150 > [<000000002dfed28b>] driver_probe_device+0x1e/0x90 > [<00000000e7048de2>] __device_attach_driver+0x85/0x110 > [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 > [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 > [<0000000026dabec0>] bus_probe_device+0x94/0xb0 > [<00000000c590d936>] device_add+0x656/0x870 > [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] > [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 > [<00000000e201f4b0>] process_one_work+0x1ee/0x600 > [<000000006d90d5a9>] worker_thread+0x183/0x350 > > Cc: Dave Jiang <dave.jiang@intel.com> > Cc: Ira Weiny <ira.weiny@intel.com> > Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") What is the bigger effect the user will see? Does this cause a long term user effect? For example, if a users system has a bad label I think this is going to be a pretty minor memory leak which just hangs around until reboot, correct? > Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> > --- > > Cc: Ira Weiny <ira.weiny@intel.com> > > From what I can tell create_namespace_pmem() must be returning EAGAIN > > which leaves devs allocated but fails to increment count. Thus there are > > no valid labels but devs was not free'ed. > > > Can you trace the error you are seeing a bit more to see if this is the > > case? > Hi Ira, Sorry for the late reply. I have reproduced it these days. > Yeah, the LSA is containing a label in which the isetcookie is invalid. NP, sometimes it takes a while to really debug something. > > V2: > update description and comment > --- > drivers/nvdimm/namespace_devs.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c > index d6d558f94d6b..28c9afc01dca 100644 > --- a/drivers/nvdimm/namespace_devs.c > +++ b/drivers/nvdimm/namespace_devs.c > @@ -1994,7 +1994,13 @@ static struct device **scan_labels(struct nd_region *nd_region) > /* Publish a zero-sized namespace for userspace to configure. */ > nd_mapping_free_labels(nd_mapping); > > - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > + /* > + * Try to use the devs that may have already been allocated > + * above first. This would happend when create_namespace_pmem() > + * meets an invalid label. > + */ > + if (!devs) > + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); I'm still tempted to try and fix the count but I think this will work. Let me know about the severity of the issue. Ira > if (!devs) > goto err; > > -- > 2.29.2 >
On 26/07/2024 21:37, Ira Weiny wrote: > Li Zhijian wrote: >> The leakage would happend when create_namespace_pmem() meets an invalid >> label which gets failure in validating isetcookie. >> >> Try to resuse the devs that may have already been allocated with size >> (2 * sizeof(dev)) previously. >> >> A kmemleak reports: >> unreferenced object 0xffff88800dda1980 (size 16): >> comm "kworker/u10:5", pid 69, jiffies 4294671781 >> hex dump (first 16 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> backtrace (crc 0): >> [<00000000c5dea560>] __kmalloc+0x32c/0x470 >> [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] >> [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] >> [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] >> [<00000000a5f3da2e>] really_probe+0xc6/0x390 >> [<00000000129e2a69>] __driver_probe_device+0x78/0x150 >> [<000000002dfed28b>] driver_probe_device+0x1e/0x90 >> [<00000000e7048de2>] __device_attach_driver+0x85/0x110 >> [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 >> [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 >> [<0000000026dabec0>] bus_probe_device+0x94/0xb0 >> [<00000000c590d936>] device_add+0x656/0x870 >> [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] >> [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 >> [<00000000e201f4b0>] process_one_work+0x1ee/0x600 >> [<000000006d90d5a9>] worker_thread+0x183/0x350 >> >> Cc: Dave Jiang <dave.jiang@intel.com> >> Cc: Ira Weiny <ira.weiny@intel.com> >> Fixes: 1b40e09a1232 ("libnvdimm: blk labels and namespace instantiation") > > What is the bigger effect the user will see? *Users* cannot use this pmem until they zero-label the device. In my understanding, once the leakage occurs, there is no way to reclaim the memory until reboot > > Does this cause a long term user effect? For example, if a users system > has a bad label I think this is going to be a pretty minor memory leak > which just hangs around until reboot, correct? > >> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> >> --- >> >> Cc: Ira Weiny <ira.weiny@intel.com> >>> From what I can tell create_namespace_pmem() must be returning EAGAIN >>> which leaves devs allocated but fails to increment count. Thus there are >>> no valid labels but devs was not free'ed. >> >>> Can you trace the error you are seeing a bit more to see if this is the >>> case? >> Hi Ira, Sorry for the late reply. I have reproduced it these days. >> Yeah, the LSA is containing a label in which the isetcookie is invalid. > > NP, sometimes it takes a while to really debug something. > >> >> V2: >> update description and comment >> --- >> drivers/nvdimm/namespace_devs.c | 8 +++++++- >> 1 file changed, 7 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c >> index d6d558f94d6b..28c9afc01dca 100644 >> --- a/drivers/nvdimm/namespace_devs.c >> +++ b/drivers/nvdimm/namespace_devs.c >> @@ -1994,7 +1994,13 @@ static struct device **scan_labels(struct nd_region *nd_region) >> /* Publish a zero-sized namespace for userspace to configure. */ >> nd_mapping_free_labels(nd_mapping); >> >> - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); >> + /* >> + * Try to use the devs that may have already been allocated >> + * above first. This would happend when create_namespace_pmem() >> + * meets an invalid label. >> + */ >> + if (!devs) >> + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > > I'm still tempted to try and fix the count but I think this will work. I cannot get your *fix the count* ? Does "fix the count*" means to free the devs in the case of error cases: $ git diff diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c index 28c9afc01dca..3fae00a05ad7 100644 --- a/drivers/nvdimm/namespace_devs.c +++ b/drivers/nvdimm/namespace_devs.c @@ -1970,6 +1970,10 @@ static struct device **scan_labels(struct nd_region *nd_region) dev = create_namespace_pmem(nd_region, nd_mapping, nd_label); if (IS_ERR(dev)) { + if (!count) { + kfree(devs); + devs = NULL; + } > Let me know about the severity of the issue. > > Ira > >> if (!devs) >> goto err; >> >> -- >> 2.29.2 >> > > >
I notice this patch is not upstream yet. Let's try to get it over the goal line. Li Zhijian wrote: > The leakage would happend when create_namespace_pmem() meets an invalid > label which gets failure in validating isetcookie. I would rewrite this as: "scan_labels() leaks memory when label scanning fails and it falls back to just creating a default "seed" namespace for userspace to configure. Root can force the kernel to leak memory." ...then a distribution developer knows the urgency to backport this fix. > Try to resuse the devs that may have already been allocated with size > (2 * sizeof(dev)) previously. Rather than conditionally reallocating I think it would be better to unconditionally allocate the minimum, something like: diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c index d6d558f94d6b..1c38c93bee21 100644 --- a/drivers/nvdimm/namespace_devs.c +++ b/drivers/nvdimm/namespace_devs.c @@ -1937,12 +1937,16 @@ static int cmp_dpa(const void *a, const void *b) static struct device **scan_labels(struct nd_region *nd_region) { int i, count = 0; - struct device *dev, **devs = NULL; + struct device *dev, **devs; struct nd_label_ent *label_ent, *e; struct nd_mapping *nd_mapping = &nd_region->mapping[0]; struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); resource_size_t map_end = nd_mapping->start + nd_mapping->size - 1; + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); + if (!devs) + return NULL; + /* "safe" because create_namespace_pmem() might list_move() label_ent */ list_for_each_entry_safe(label_ent, e, &nd_mapping->labels, list) { struct nd_namespace_label *nd_label = label_ent->label; @@ -1961,12 +1965,14 @@ static struct device **scan_labels(struct nd_region *nd_region) goto err; if (i < count) continue; - __devs = kcalloc(count + 2, sizeof(dev), GFP_KERNEL); - if (!__devs) - goto err; - memcpy(__devs, devs, sizeof(dev) * count); - kfree(devs); - devs = __devs; + if (count) { + __devs = kcalloc(count + 2, sizeof(dev), GFP_KERNEL); + if (!__devs) + goto err; + memcpy(__devs, devs, sizeof(dev) * count); + kfree(devs); + devs = __devs; + } dev = create_namespace_pmem(nd_region, nd_mapping, nd_label); if (IS_ERR(dev)) { @@ -1994,10 +2000,6 @@ static struct device **scan_labels(struct nd_region *nd_region) /* Publish a zero-sized namespace for userspace to configure. */ nd_mapping_free_labels(nd_mapping); - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); - if (!devs) - goto err; - nspm = kzalloc(sizeof(*nspm), GFP_KERNEL); if (!nspm) goto err; @@ -2036,12 +2038,10 @@ static struct device **scan_labels(struct nd_region *nd_region) return devs; err: - if (devs) { - for (i = 0; devs[i]; i++) - namespace_pmem_release(devs[i]); - kfree(devs); - } - return NULL; + for (i = 0; devs[i]; i++) + namespace_pmem_release(devs[i]); + kfree(devs); + return NULL; } static struct device **create_namespaces(struct nd_region *nd_region) > A kmemleak reports: > unreferenced object 0xffff88800dda1980 (size 16): > comm "kworker/u10:5", pid 69, jiffies 4294671781 > hex dump (first 16 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace (crc 0): > [<00000000c5dea560>] __kmalloc+0x32c/0x470 > [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] > [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] > [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] > [<00000000a5f3da2e>] really_probe+0xc6/0x390 > [<00000000129e2a69>] __driver_probe_device+0x78/0x150 > [<000000002dfed28b>] driver_probe_device+0x1e/0x90 > [<00000000e7048de2>] __device_attach_driver+0x85/0x110 > [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 > [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 > [<0000000026dabec0>] bus_probe_device+0x94/0xb0 > [<00000000c590d936>] device_add+0x656/0x870 > [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] > [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 > [<00000000e201f4b0>] process_one_work+0x1ee/0x600 > [<000000006d90d5a9>] worker_thread+0x183/0x350 Thanks for including this. With the above changes you can add: Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Dan Williams wrote: [..] > @@ -2036,12 +2038,10 @@ static struct device **scan_labels(struct nd_region *nd_region) ...of course you would also need something like: if (!count) { kfree(devs); return NULL; } ...here, I'll leave that to you to fix up and test. > return devs; > > err: > - if (devs) { > - for (i = 0; devs[i]; i++) > - namespace_pmem_release(devs[i]); > - kfree(devs); > - } > - return NULL; > + for (i = 0; devs[i]; i++) > + namespace_pmem_release(devs[i]); > + kfree(devs); > + return NULL; > } >
(Just back from the summer holiday) Sorry for the late reply. On 10/08/2024 06:38, Dan Williams wrote: > I notice this patch is not upstream yet. Let's try to get it over the > goal line. > > Li Zhijian wrote: >> The leakage would happen when create_namespace_pmem() meets an invalid >> label which gets failure in validating isetcookie. > > I would rewrite this as: > > "scan_labels() leaks memory when label scanning fails and it falls back > to just creating a default "seed" namespace for userspace to configure. > Root can force the kernel to leak memory." It sounds good to me. > > ...then a distribution developer knows the urgency to backport this fix. > >> Try to resuse the devs that may have already been allocated with size >> (2 * sizeof(dev)) previously. > > Rather than conditionally reallocating I think it would be better to > unconditionally allocate the minimum, something like: Okay, I will update it in V3. > > diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c > index d6d558f94d6b..1c38c93bee21 100644 > --- a/drivers/nvdimm/namespace_devs.c > +++ b/drivers/nvdimm/namespace_devs.c > @@ -1937,12 +1937,16 @@ static int cmp_dpa(const void *a, const void *b) > static struct device **scan_labels(struct nd_region *nd_region) > { > int i, count = 0; > - struct device *dev, **devs = NULL; > + struct device *dev, **devs; > struct nd_label_ent *label_ent, *e; > struct nd_mapping *nd_mapping = &nd_region->mapping[0]; > struct nvdimm_drvdata *ndd = to_ndd(nd_mapping); > resource_size_t map_end = nd_mapping->start + nd_mapping->size - 1; > > + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > + if (!devs) > + return NULL; > + > /* "safe" because create_namespace_pmem() might list_move() label_ent */ > list_for_each_entry_safe(label_ent, e, &nd_mapping->labels, list) { > struct nd_namespace_label *nd_label = label_ent->label; > @@ -1961,12 +1965,14 @@ static struct device **scan_labels(struct nd_region *nd_region) > goto err; > if (i < count) > continue; > - __devs = kcalloc(count + 2, sizeof(dev), GFP_KERNEL); > - if (!__devs) > - goto err; > - memcpy(__devs, devs, sizeof(dev) * count); > - kfree(devs); > - devs = __devs; > + if (count) { > + __devs = kcalloc(count + 2, sizeof(dev), GFP_KERNEL); > + if (!__devs) > + goto err; > + memcpy(__devs, devs, sizeof(dev) * count); > + kfree(devs); > + devs = __devs; > + } > > dev = create_namespace_pmem(nd_region, nd_mapping, nd_label); > if (IS_ERR(dev)) { > @@ -1994,10 +2000,6 @@ static struct device **scan_labels(struct nd_region *nd_region) > /* Publish a zero-sized namespace for userspace to configure. */ > nd_mapping_free_labels(nd_mapping); > > - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); > - if (!devs) > - goto err; > - > nspm = kzalloc(sizeof(*nspm), GFP_KERNEL); > if (!nspm) > goto err; > @@ -2036,12 +2038,10 @@ static struct device **scan_labels(struct nd_region *nd_region) > return devs; > > err: > - if (devs) { > - for (i = 0; devs[i]; i++) > - namespace_pmem_release(devs[i]); > - kfree(devs); > - } > - return NULL; > + for (i = 0; devs[i]; i++) > + namespace_pmem_release(devs[i]); > + kfree(devs); > + return NULL; > } > > static struct device **create_namespaces(struct nd_region *nd_region) > > >> A kmemleak reports: >> unreferenced object 0xffff88800dda1980 (size 16): >> comm "kworker/u10:5", pid 69, jiffies 4294671781 >> hex dump (first 16 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> backtrace (crc 0): >> [<00000000c5dea560>] __kmalloc+0x32c/0x470 >> [<000000009ed43c83>] nd_region_register_namespaces+0x6fb/0x1120 [libnvdimm] >> [<000000000e07a65c>] nd_region_probe+0xfe/0x210 [libnvdimm] >> [<000000007b79ce5f>] nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm] >> [<00000000a5f3da2e>] really_probe+0xc6/0x390 >> [<00000000129e2a69>] __driver_probe_device+0x78/0x150 >> [<000000002dfed28b>] driver_probe_device+0x1e/0x90 >> [<00000000e7048de2>] __device_attach_driver+0x85/0x110 >> [<0000000032dca295>] bus_for_each_drv+0x85/0xe0 >> [<00000000391c5a7d>] __device_attach+0xbe/0x1e0 >> [<0000000026dabec0>] bus_probe_device+0x94/0xb0 >> [<00000000c590d936>] device_add+0x656/0x870 >> [<000000003d69bfaa>] nd_async_device_register+0xe/0x50 [libnvdimm] >> [<000000003f4c52a4>] async_run_entry_fn+0x2e/0x110 >> [<00000000e201f4b0>] process_one_work+0x1ee/0x600 >> [<000000006d90d5a9>] worker_thread+0x183/0x350 > > Thanks for including this. > > With the above changes you can add: > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
On 10/08/2024 06:55, Dan Williams wrote: > Dan Williams wrote: > [..] >> @@ -2036,12 +2038,10 @@ static struct device **scan_labels(struct nd_region *nd_region) > > ...of course you would also need something like: > > if (!count) { > kfree(devs); > return NULL; > } It seems we don't need this cleanup, in the count=0 case, we would reach `err` to free devs. Thanks Zhijian > > ...here, I'll leave that to you to fix up and test. > >> return devs; >> >> err: >> - if (devs) { >> - for (i = 0; devs[i]; i++) >> - namespace_pmem_release(devs[i]); >> - kfree(devs); >> - } >> - return NULL; >> + for (i = 0; devs[i]; i++) >> + namespace_pmem_release(devs[i]); >> + kfree(devs); >> + return NULL; >> } >>
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c index d6d558f94d6b..28c9afc01dca 100644 --- a/drivers/nvdimm/namespace_devs.c +++ b/drivers/nvdimm/namespace_devs.c @@ -1994,7 +1994,13 @@ static struct device **scan_labels(struct nd_region *nd_region) /* Publish a zero-sized namespace for userspace to configure. */ nd_mapping_free_labels(nd_mapping); - devs = kcalloc(2, sizeof(dev), GFP_KERNEL); + /* + * Try to use the devs that may have already been allocated + * above first. This would happend when create_namespace_pmem() + * meets an invalid label. + */ + if (!devs) + devs = kcalloc(2, sizeof(dev), GFP_KERNEL); if (!devs) goto err;