[for-rc,1/2] IB/hfi1: Fix memory leaks in sysfs registration and unregistration

Message ID 20200326163807.21129.27371.stgit@awfm-01.aw.intel.com (mailing list archive)
State Mainlined
Commit 5c15abc4328ad696fa61e2f3604918ed0c207755
Delegated to: Jason Gunthorpe
Series Pre-req for hfi1 cdev rework

Commit Message

Dennis Dalessandro March 26, 2020, 4:38 p.m. UTC
From: Kaike Wan <kaike.wan@intel.com>

When the hfi1 driver is unloaded, kmemleak will report the following
issue:

unreferenced object 0xffff8888461a4c08 (size 8):
  comm "kworker/0:0", pid 5, jiffies 4298601264 (age 2047.134s)
  hex dump (first 8 bytes):
    73 64 6d 61 30 00 ff ff                          sdma0...
  backtrace:
    [<00000000311a6ef5>] kvasprintf+0x62/0xd0
    [<00000000ade94d9f>] kobject_set_name_vargs+0x1c/0x90
    [<0000000060657dbb>] kobject_init_and_add+0x5d/0xb0
    [<00000000346fe72b>] 0xffffffffa0c5ecba
    [<000000006cfc5819>] 0xffffffffa0c866b9
    [<0000000031c65580>] 0xffffffffa0c38e87
    [<00000000e9739b3f>] local_pci_probe+0x41/0x80
    [<000000006c69911d>] work_for_cpu_fn+0x16/0x20
    [<00000000601267b5>] process_one_work+0x171/0x380
    [<0000000049a0eefa>] worker_thread+0x1d1/0x3f0
    [<00000000909cf2b9>] kthread+0xf8/0x130
    [<0000000058f5f874>] ret_from_fork+0x35/0x40

This patch fixes the issue by:
- Releasing dd->per_sdma[i].kobj in hfi1_verbs_unregister_sysfs().
  - This will fix the memory leak.
- Calling kobject_put() to unwind operations only for those entries in
  dd->per_sdma[] whose operations have succeeded (including the current
  one that has just failed) in hfi1_verbs_register_sysfs().
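
The unwind order described in the second bullet can be sketched with a small
userspace model. This is only an illustration, not driver code: toy_kobj,
toy_init_and_add(), toy_put(), register_all() and the fail_at knob are all
hypothetical stand-ins. The one property taken from the kernel is that
kobject_init_and_add() leaves the kobject initialized even when it fails, so
the failing entry needs a put as well.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_SDMA 4

/* Toy stand-in for a kobject: a refcount plus an "added to sysfs" flag. */
struct toy_kobj {
	int refcount;
	bool in_sysfs;
};

/*
 * Models kobject_init_and_add(): the object is always initialized
 * (refcount = 1), but the add step may fail. fail_at is a hypothetical
 * knob used only to inject an error in this sketch.
 */
static int toy_init_and_add(struct toy_kobj *k, int idx, int fail_at)
{
	k->refcount = 1;
	if (idx == fail_at)
		return -1;
	k->in_sysfs = true;
	return 0;
}

/* Models kobject_put(): the final put also removes the sysfs entry. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		k->in_sysfs = false;
}

/* Register every entry; on failure unwind entries i, i-1, ..., 0. */
static int register_all(struct toy_kobj *objs, int fail_at)
{
	int i, ret = 0;

	for (i = 0; i < NUM_SDMA; i++) {
		ret = toy_init_and_add(&objs[i], i, fail_at);
		if (ret)
			goto bail;
	}
	return 0;
bail:
	/* Includes the entry that just failed, skips untouched ones. */
	for (; i >= 0; i--)
		toy_put(&objs[i]);
	return ret;
}
```

Starting the loop at the failing index and counting down means entries that
were never initialized are never passed to the put function.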

Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
---
 drivers/infiniband/hw/hfi1/sysfs.c |   13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Jason Gunthorpe March 26, 2020, 5:25 p.m. UTC | #1
On Thu, Mar 26, 2020 at 12:38:07PM -0400, Dennis Dalessandro wrote:

I'm not certain, but this seems unwise.

After hfi1_verbs_unregister_sysfs() returns there should be no sysfs left
under the ibdev as we are going to delete the ibdev sysfs next.

kobject_del() triggers synchronous delete of the sysfs, while
kobject_put() potentially defers it to the future.

Will ib unregister fail if the kobject_del() has not happened yet? I
am unsure.

Jason
Wan, Kaike March 26, 2020, 7:09 p.m. UTC | #2
> I'm not certain, but this seems unwise.
> 
> After hfi1_verbs_unregister_sysfs() returns there should be no sysfs left
> under the ibdev as we are going to delete the ibdev sysfs next.
> 
> kobject_del() triggers synchronous delete of the sysfs, while
> kobject_put() potentially defers it to the future.
True. However, kobject_del() only deletes the sysfs entries for the object, i.e., it undoes what kobject_add() did; it does not decrement the refcount of the kobject.
To unwind kobject_init_and_add(), one can call
(1) kobject_del() (optional)
(2) kobject_put()

The kobject cleanup function (kobject_cleanup()) will call kobject_del() if kobj->state_in_sysfs is set. Therefore, one can call kobject_put() alone to get the job done.
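
The claim above can be sketched as a toy userspace model. It is an assumption-
laden illustration, not the code in lib/kobject.c: struct toy_kobj and the
toy_*() helpers are hypothetical names that mirror only the two behaviours
described here (del leaves the refcount alone; the final put runs cleanup,
which performs the del if it has not happened yet).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kobject; "released" marks that cleanup has run. */
struct toy_kobj {
	int refcount;
	bool state_in_sysfs;
	bool released;
};

/* Models a successful kobject_init_and_add(). */
static void toy_init_and_add(struct toy_kobj *k)
{
	k->refcount = 1;
	k->state_in_sysfs = true;
	k->released = false;
}

/* Models kobject_del(): removes the sysfs entry, keeps the refcount. */
static void toy_del(struct toy_kobj *k)
{
	k->state_in_sysfs = false;
}

/* Models kobject_cleanup(): performs the del if still needed. */
static void toy_cleanup(struct toy_kobj *k)
{
	if (k->state_in_sysfs)
		toy_del(k);
	k->released = true;
}

/* Models kobject_put(): the final reference triggers cleanup. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		toy_cleanup(k);
}
```

In this model, toy_del() alone never sets released, which matches the leak
that kmemleak reported. toy_put() alone both removes the entry and releases
the object, provided it drops the last reference.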

> 
> Will ib unregister fail if the kobject_del() has not happened yet? I am unsure.
> 
I don't think so. We only observed the kmemleak complaints after unloading the driver, nothing else.

Kaike
Jason Gunthorpe March 26, 2020, 7:42 p.m. UTC | #3
On Thu, Mar 26, 2020 at 07:09:57PM +0000, Wan, Kaike wrote:
> > I'm not certain, but this seems unwise.
> > 
> > After hfi1_verbs_unregister_sysfs() returns there should be no sysfs left
> > under the ibdev as we are going to delete the ibdev sysfs next.
> > 
> > kobject_del() triggers synchronous delete of the sysfs, while
> > kobject_put() potentially defers it to the future.

> True. However, kobject_del() only deletes the sysfs entries for the
> object, i.e., it undoes what kobject_add() did; it does not
> decrement the refcount of the kobject. To unwind
> kobject_init_and_add(), one can call
> (1) kobject_del() (optional)
> (2) kobject_put()

Yes, you must call both, but kobject_put is not a replacement for
kobject_del.

> The kobject cleanup function (kobject_cleanup()) will call
> kobject_del() if kobj->state_in_sysfs is set. Therefore, one can call
> kobject_put() alone to get the job done.

No, as I already explained, the moment that kobject_del happens is no
longer reliable with kobject_put.
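
The timing concern can be sketched with a toy userspace model. It is an
illustration under assumptions, not kernel code: struct toy_kobj and the
toy_*() and teardown_*() helpers are hypothetical. The extra toy_get() stands
for any other holder of a reference at teardown time.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kobject: a refcount plus sysfs visibility. */
struct toy_kobj {
	int refcount;
	bool in_sysfs;
};

/* Models a successful kobject_init_and_add(). */
static void toy_add(struct toy_kobj *k)
{
	k->refcount = 1;
	k->in_sysfs = true;
}

/* Models kobject_get(): some other holder takes a reference. */
static void toy_get(struct toy_kobj *k)
{
	k->refcount++;
}

/* Models kobject_del(): the sysfs entry is removed right away. */
static void toy_del(struct toy_kobj *k)
{
	k->in_sysfs = false;
}

/* Models kobject_put(): sysfs removal happens only on the last put. */
static void toy_put(struct toy_kobj *k)
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		k->in_sysfs = false;
}

/* Teardown with put only; returns whether the entry is still visible. */
static bool teardown_put_only(struct toy_kobj *k)
{
	toy_put(k);
	return k->in_sysfs;
}

/* Teardown with an explicit del first; same return convention. */
static bool teardown_del_then_put(struct toy_kobj *k)
{
	toy_del(k);
	toy_put(k);
	return k->in_sysfs;
}
```

With a second reference outstanding, the put-only teardown returns with the
entry still visible, while del-then-put removes it before returning. With no
other holder the two paths end in the same state.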

> > Will ib unregister fail if the kobject_del() has not happened yet? I am unsure.
> 
> I don't think so. We only observed the kmemleak complaints after
> unloading the driver, nothing else.

Of course, hfi1 is missing the required kobject_put, so it was only a
leak.

To see if there is an issue here delete the kobject_del and
kobject_put entirely to leave a dangling sysfs during registration and
see if ib device unregistration explodes.

Jason
Wan, Kaike March 26, 2020, 10:24 p.m. UTC | #4
> > > I'm not certain, but this seems unwise.
> > >
> > > After hfi1_verbs_unregister_sysfs() returns there should be no sysfs
> > > left under the ibdev as we are going to delete the ibdev sysfs next.
> > >
> > > kobject_del() triggers synchronous delete of the sysfs, while
> > > kobject_put() potentially defers it to the future.
> 
> > True. However, kobject_del() only deletes the sysfs entries for the
> > object, i.e., it undoes what kobject_add() did; it does not
> > decrement the refcount of the kobject. To unwind
> > kobject_init_and_add(), one can call
> > (1) kobject_del() (optional)
> > (2) kobject_put()
> 
> Yes, you must call both, but kobject_put is not a replacement for kobject_del.
We can do that.
> 
> > The kobject cleanup function (kobject_cleanup()) will call kobject_del()
> > if kobj->state_in_sysfs is set. Therefore, one can call
> > kobject_put() alone to get the job done.
> 
> No, as I already explained, the moment that kobject_del happens is no
> longer reliable with kobject_put.
> 
> > > Will ib unregister fail if the kobject_del() has not happened yet? I am
> unsure.
> >
> > I don't think so. We only observed the kmemleak complaints after
> > unloading the driver, nothing else.
> 
> Of course, hfi1 is missing the required kobject_put, so it was only a leak.
> 
> To see if there is an issue here delete the kobject_del and kobject_put
> entirely to leave a dangling sysfs during registration and see if ib device
> unregistration explodes.
I tried a patch in which hfi1_verbs_unregister_sysfs() is never called at all. When unloading the driver, the ib device unregistration went through smoothly (no error, and the /sys/class/infiniband/hfi1_0 directory was gone). Only kmemleak complaints were observed.

I will re-spin the patches.

Thanks,

Kaike
Jason Gunthorpe March 26, 2020, 10:36 p.m. UTC | #5
On Thu, Mar 26, 2020 at 10:24:51PM +0000, Wan, Kaike wrote:

> > To see if there is an issue here delete the kobject_del and kobject_put
> > entirely to leave a dangling sysfs during registration and see if ib device
> > unregistration explodes.

> I tried a patch wherein the function hfi1_verbs_unregister_sysfs()
> is never called at all and when unloading the driver the ib device
> un-registration went through smoothly(no error, the
> /sys/class/infiniband/hfi1_0 directory gone). Only kmemleak
> complaints were observed.

Then perhaps there is nothing to worry about and the patches are fine

Jason
Wan, Kaike March 26, 2020, 11:30 p.m. UTC | #6
> On Thu, Mar 26, 2020 at 10:24:51PM +0000, Wan, Kaike wrote:
> 
> > > To see if there is an issue here delete the kobject_del and
> > > kobject_put entirely to leave a dangling sysfs during registration
> > > and see if ib device unregistration explodes.
> 
> > I tried a patch wherein the function hfi1_verbs_unregister_sysfs() is
> > never called at all and when unloading the driver the ib device
> > un-registration went through smoothly(no error, the
> > /sys/class/infiniband/hfi1_0 directory gone). Only kmemleak complaints
> > were observed.
> 
> Then perhaps there is nothing to worry about and the patches are fine
>
Then I don't have to re-spin the patches?

Kaike
Patch

diff --git a/drivers/infiniband/hw/hfi1/sysfs.c b/drivers/infiniband/hw/hfi1/sysfs.c
index 90f62c4..f1bcecf 100644
--- a/drivers/infiniband/hw/hfi1/sysfs.c
+++ b/drivers/infiniband/hw/hfi1/sysfs.c
@@ -853,8 +853,13 @@  int hfi1_verbs_register_sysfs(struct hfi1_devdata *dd)
 
 	return 0;
 bail:
-	for (i = 0; i < dd->num_sdma; i++)
-		kobject_del(&dd->per_sdma[i].kobj);
+	/*
+	 * The function kobject_put() will call kobject_del() if the kobject
+	 * has been added successfully. The sysfs files created under the
+	 * kobject directory will also be removed during the process.
+	 */
+	for (; i >= 0; i--)
+		kobject_put(&dd->per_sdma[i].kobj);
 
 	return ret;
 }
@@ -867,6 +872,10 @@  void hfi1_verbs_unregister_sysfs(struct hfi1_devdata *dd)
 	struct hfi1_pportdata *ppd;
 	int i;
 
+	/* Unwind operations in hfi1_verbs_register_sysfs() */
+	for (i = 0; i < dd->num_sdma; i++)
+		kobject_put(&dd->per_sdma[i].kobj);
+
 	for (i = 0; i < dd->num_pports; i++) {
 		ppd = &dd->pport[i];