diff mbox series

[net] net: mvpp2: debugfs: fix problem with previous memory leak fix

Message ID 20220921114444.2247083-1-gregkh@linuxfoundation.org (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net] net: mvpp2: debugfs: fix problem with previous memory leak fix | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: Invalid email format for stable: 'stable <stable@kernel.org>', prefer 'stable@kernel.org' WARNING: line length of 94 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Greg Kroah-Hartman Sept. 21, 2022, 11:44 a.m. UTC
In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
debugfs_lookup()"), if the module is unloaded, the directory will still
be present if the module is loaded again and creating the directory will
fail, causing the creation of additional child debugfs entries for the
individual devices to fail.

As this module never cleaned up the root directory it created, even when
loaded, and unloading/loading a module is not a normal operation, none
of would normally happen.

To clean all of this up, use a tiny reference counted structure to hold
the root debugfs directory for the driver, and then clean it up when the
last user of it is removed from the system.  This should resolve the
previously reported problems, and the original memory leak that
fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
debugfs_lookup()") was attempting to fix.

Reported-by: Russell King <linux@armlinux.org.uk>
Cc: Marcin Wojtas <mw@semihalf.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Cc: stable <stable@kernel.org>
Fixes: fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Note, test-built only, I do not have access to this hardware so please
review it for any foolish mistakes I might have again made.

 .../ethernet/marvell/mvpp2/mvpp2_debugfs.c    | 32 ++++++++++++++++---
 1 file changed, 28 insertions(+), 4 deletions(-)

Comments

Russell King (Oracle) Sept. 22, 2022, 4:05 p.m. UTC | #1
On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> debugfs_lookup()"), if the module is unloaded, the directory will still
> be present if the module is loaded again and creating the directory will
> fail, causing the creation of additional child debugfs entries for the
> individual devices to fail.
> 
> As this module never cleaned up the root directory it created, even when
> loaded, and unloading/loading a module is not a normal operation, none
> of would normally happen.
> 
> To clean all of this up, use a tiny reference counted structure to hold
> the root debugfs directory for the driver, and then clean it up when the
> last user of it is removed from the system.  This should resolve the
> previously reported problems, and the original memory leak that
> fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> debugfs_lookup()") was attempting to fix.

For the record... I have a better fix for this, but I haven't been able
to get it into a state suitable for submission yet.

http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff

Not yet against the net tree. Might have time tomorrow to do that, not
sure at the moment. Medical stuff is getting in the way. :(
Marcin Wojtas Sept. 22, 2022, 5:08 p.m. UTC | #2
Hi,

Thank you both for the patches.


czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
<linux@armlinux.org.uk> napisał(a):
>
> On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > debugfs_lookup()"), if the module is unloaded, the directory will still
> > be present if the module is loaded again and creating the directory will
> > fail, causing the creation of additional child debugfs entries for the
> > individual devices to fail.
> >
> > As this module never cleaned up the root directory it created, even when
> > loaded, and unloading/loading a module is not a normal operation, none
> > of would normally happen.
> >
> > To clean all of this up, use a tiny reference counted structure to hold
> > the root debugfs directory for the driver, and then clean it up when the
> > last user of it is removed from the system.  This should resolve the
> > previously reported problems, and the original memory leak that
> > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > debugfs_lookup()") was attempting to fix.
>
> For the record... I have a better fix for this, but I haven't been able
> to get it into a state suitable for submission yet.
>
> http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
>
> Not yet against the net tree. Might have time tomorrow to do that, not
> sure at the moment. Medical stuff is getting in the way. :(
>

I'd lean towards this version - it is a bit more compact. I'll try to
test that tomorrow or during the weekend.

Best regards,
Marcin
Marcin Wojtas Sept. 24, 2022, 11:27 p.m. UTC | #3
Hi Russell,

czw., 22 wrz 2022 o 19:08 Marcin Wojtas <mw@semihalf.com> napisał(a):
>
> Hi,
>
> Thank you both for the patches.
>
>
> czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
> <linux@armlinux.org.uk> napisał(a):
> >
> > On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > debugfs_lookup()"), if the module is unloaded, the directory will still
> > > be present if the module is loaded again and creating the directory will
> > > fail, causing the creation of additional child debugfs entries for the
> > > individual devices to fail.
> > >
> > > As this module never cleaned up the root directory it created, even when
> > > loaded, and unloading/loading a module is not a normal operation, none
> > > of would normally happen.
> > >
> > > To clean all of this up, use a tiny reference counted structure to hold
> > > the root debugfs directory for the driver, and then clean it up when the
> > > last user of it is removed from the system.  This should resolve the
> > > previously reported problems, and the original memory leak that
> > > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > debugfs_lookup()") was attempting to fix.
> >
> > For the record... I have a better fix for this, but I haven't been able
> > to get it into a state suitable for submission yet.
> >
> > http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
> >
> > Not yet against the net tree. Might have time tomorrow to do that, not
> > sure at the moment. Medical stuff is getting in the way. :(
> >
>
> I'd lean towards this version - it is a bit more compact. I'll try to
> test that tomorrow or during the weekend.
>

I improved the patch compile and work (tested on my CN913x board).
Feel free to submit (if you wish, I can do it too - please let me know
your preference):
https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b

Best regards,
Marcin
Russell King (Oracle) Sept. 25, 2022, 6:58 a.m. UTC | #4
On Sun, Sep 25, 2022 at 01:27:06AM +0200, Marcin Wojtas wrote:
> Hi Russell,
> 
> czw., 22 wrz 2022 o 19:08 Marcin Wojtas <mw@semihalf.com> napisał(a):
> >
> > Hi,
> >
> > Thank you both for the patches.
> >
> >
> > czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
> > <linux@armlinux.org.uk> napisał(a):
> > >
> > > On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > > > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > debugfs_lookup()"), if the module is unloaded, the directory will still
> > > > be present if the module is loaded again and creating the directory will
> > > > fail, causing the creation of additional child debugfs entries for the
> > > > individual devices to fail.
> > > >
> > > > As this module never cleaned up the root directory it created, even when
> > > > loaded, and unloading/loading a module is not a normal operation, none
> > > > of would normally happen.
> > > >
> > > > To clean all of this up, use a tiny reference counted structure to hold
> > > > the root debugfs directory for the driver, and then clean it up when the
> > > > last user of it is removed from the system.  This should resolve the
> > > > previously reported problems, and the original memory leak that
> > > > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > debugfs_lookup()") was attempting to fix.
> > >
> > > For the record... I have a better fix for this, but I haven't been able
> > > to get it into a state suitable for submission yet.
> > >
> > > http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
> > >
> > > Not yet against the net tree. Might have time tomorrow to do that, not
> > > sure at the moment. Medical stuff is getting in the way. :(
> > >
> >
> > I'd lean towards this version - it is a bit more compact. I'll try to
> > test that tomorrow or during the weekend.
> >
> 
> I improved the patch compile and work (tested on my CN913x board).
> Feel free to submit (if you wish, I can do it too - please let me know
> your preference):
> https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b

I don't see what the compile fixes were in that - it looks like my patch
ported onto current -net. Obvious changes:

- moved mvpp2_dbgfs_exit() declaration from after mvpp2_dbgfs_cleanup()
  to before.
- moved definition of mvpp2_root to the top of the file (as no effect
  on the code.)

and the change to port it:

- dropped my mvpp2_dbgfs_init() hunk (because it's different in -net)
- removed static declaration of mvpp2_root in mvpp2_dbgfs_init()

I'm not seeing any other changes.

Note that Sasha has submitted a revert of Greg's original patch for
mainline, so my original patch should apply as-is if that revert
happens - and I don't see any compile issues with it.
Russell King (Oracle) Sept. 25, 2022, 7:05 a.m. UTC | #5
On Sun, Sep 25, 2022 at 07:58:29AM +0100, Russell King (Oracle) wrote:
> On Sun, Sep 25, 2022 at 01:27:06AM +0200, Marcin Wojtas wrote:
> > Hi Russell,
> > 
> > I improved the patch compile and work (tested on my CN913x board).
> > Feel free to submit (if you wish, I can do it too - please let me know
> > your preference):
> > https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b
> 
> I don't see what the compile fixes were in that - it looks like my patch
> ported onto current -net. Obvious changes:
> 
> - moved mvpp2_dbgfs_exit() declaration from after mvpp2_dbgfs_cleanup()
>   to before.
> - moved definition of mvpp2_root to the top of the file (as no effect
>   on the code.)
> 
> and the change to port it:
> 
> - dropped my mvpp2_dbgfs_init() hunk (because it's different in -net)
> - removed static declaration of mvpp2_root in mvpp2_dbgfs_init()
> 
> I'm not seeing any other changes.
> 
> Note that Sasha has submitted a revert of Greg's original patch for
> mainline, so my original patch should apply as-is if that revert
> happens - and I don't see any compile issues with it.

On that - it seems the Stable kernel maintainers can't cope with being
told not to apply a patch - we've had a big long discussion about it
on IRC over the past few days.

Sasha states that the mainline process is broken - and as long as Greg's
original patch is in place, stable will repeatedly attempt to backport
it no matter whether a proper fix is being worked on, whether
maintainers are busy, and so on and so forth. The only way to stop
stable backporting patches is to revert them in mainline - as I asked to
happen in this case on the 13th.  So it's all our fault for not
reverting the patch. It's got nothing to do with stable maintainers
unable to keep track of which patches they shouldn't be picking up.

I maintain that the stable kernel process is totally broken due to
this - it makes no allowance for whether maintainers can sort the
problem out in a timely manner.

Quite honestly, I now regard the stable kernel process as being
utterly broken. In my mind, it's not a stable kernel. It's an unstable
kernel with randomly applied patches that may or may not be appropriate.
Certainly, stable-kernel-rules.rst is a total and utter joke - they
aren't rules at all, the stable kernel maintainers don't have any rules
about patches they backport.

Anyway, I guess we're going to have to wait for Sasha's revert to be
merged into -net first, which will make backporting our true and proper
memory-leak fix a lot easier.
Marcin Wojtas Sept. 25, 2022, 7:55 a.m. UTC | #6
niedz., 25 wrz 2022 o 08:58 Russell King (Oracle)
<linux@armlinux.org.uk> napisał(a):
>
> On Sun, Sep 25, 2022 at 01:27:06AM +0200, Marcin Wojtas wrote:
> > Hi Russell,
> >
> > czw., 22 wrz 2022 o 19:08 Marcin Wojtas <mw@semihalf.com> napisał(a):
> > >
> > > Hi,
> > >
> > > Thank you both for the patches.
> > >
> > >
> > > czw., 22 wrz 2022 o 18:05 Russell King (Oracle)
> > > <linux@armlinux.org.uk> napisał(a):
> > > >
> > > > On Wed, Sep 21, 2022 at 01:44:44PM +0200, Greg Kroah-Hartman wrote:
> > > > > In commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > > debugfs_lookup()"), if the module is unloaded, the directory will still
> > > > > be present if the module is loaded again and creating the directory will
> > > > > fail, causing the creation of additional child debugfs entries for the
> > > > > individual devices to fail.
> > > > >
> > > > > As this module never cleaned up the root directory it created, even when
> > > > > loaded, and unloading/loading a module is not a normal operation, none
> > > > > of would normally happen.
> > > > >
> > > > > To clean all of this up, use a tiny reference counted structure to hold
> > > > > the root debugfs directory for the driver, and then clean it up when the
> > > > > last user of it is removed from the system.  This should resolve the
> > > > > previously reported problems, and the original memory leak that
> > > > > fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using
> > > > > debugfs_lookup()") was attempting to fix.
> > > >
> > > > For the record... I have a better fix for this, but I haven't been able
> > > > to get it into a state suitable for submission yet.
> > > >
> > > > http://www.home.armlinux.org.uk/~rmk/misc/mvpp2-debugfs.diff
> > > >
> > > > Not yet against the net tree. Might have time tomorrow to do that, not
> > > > sure at the moment. Medical stuff is getting in the way. :(
> > > >
> > >
> > > I'd lean towards this version - it is a bit more compact. I'll try to
> > > test that tomorrow or during the weekend.
> > >
> >
> > I improved the patch compile and work (tested on my CN913x board).
> > Feel free to submit (if you wish, I can do it too - please let me know
> > your preference):
> > https://github.com/semihalf-wojtas-marcin/Linux-Kernel/commit/0abb75115ffb2772f595bb3346573e27e650018b
>
> I don't see what the compile fixes were in that - it looks like my patch

I overlooked drivers/net/ethernet/marvell/mvpp2/mvpp2.h diff (I
applied patch manually), but I double-checked and it was there, so you
are right.

> ported onto current -net. Obvious changes:

Yes, it's applied on top of net/net-next.

>
> - moved mvpp2_dbgfs_exit() declaration from after mvpp2_dbgfs_cleanup()
>   to before.
> - moved definition of mvpp2_root to the top of the file (as no effect
>   on the code.)

These are irrelevant. The most important part was:

--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -7714,8 +7714,8 @@ module_init(mvpp2_driver_init);

 static void __exit mvpp2_driver_exit(void)
 {
-       mvpp2_dbgfs_exit();
        platform_driver_unregister(&mvpp2_driver);
+       mvpp2_dbgfs_exit();
 }
 module_exit(mvpp2_driver_exit);

Otherwise, I observed NULL-pointer dereference when
mvpp2_remove()->mvpp2_dbgfs_cleanup().

>
> and the change to port it:
>
> - dropped my mvpp2_dbgfs_init() hunk (because it's different in -net)
> - removed static declaration of mvpp2_root in mvpp2_dbgfs_init()
>
> I'm not seeing any other changes.
>
> Note that Sasha has submitted a revert of Greg's original patch for
> mainline, so my original patch should apply as-is if that revert
> happens - and I don't see any compile issues with it.
>

Sure. In order to avoid misunderstandings, I stop here and leave the
submission to you.

Thanks,
Marcin
diff mbox series

Patch

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c
index 0eec05d905eb..16c303048f25 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_debugfs.c
@@ -691,23 +691,47 @@  static int mvpp2_dbgfs_port_init(struct dentry *parent,
 	return 0;
 }
 
+static struct mvpp2_debug_dir {
+	struct dentry *dir;
+	struct kref kref;
+} *mvpp2_root;
+
+static void mvpp2_release(struct kref *kref)
+{
+	struct mvpp2_debug_dir *mvpp2_root = container_of(kref, struct mvpp2_debug_dir, kref);
+
+	debugfs_remove(mvpp2_root->dir);
+	kfree(mvpp2_root);
+}
+
 void mvpp2_dbgfs_cleanup(struct mvpp2 *priv)
 {
 	debugfs_remove_recursive(priv->dbgfs_dir);
 
 	kfree(priv->dbgfs_entries);
+
+	if (mvpp2_root)
+		kref_put(&mvpp2_root->kref, mvpp2_release);
 }
 
 void mvpp2_dbgfs_init(struct mvpp2 *priv, const char *name)
 {
-	static struct dentry *mvpp2_root;
 	struct dentry *mvpp2_dir;
 	int ret, i;
 
-	if (!mvpp2_root)
-		mvpp2_root = debugfs_create_dir(MVPP2_DRIVER_NAME, NULL);
+	if (!mvpp2_root) {
+		mvpp2_root = kzalloc(sizeof(mvpp2_root), GFP_KERNEL);
+		if (!mvpp2_root) {
+			mvpp2_root = NULL;
+			return;
+		}
+		mvpp2_root->dir = debugfs_create_dir(MVPP2_DRIVER_NAME, NULL);
+		kref_init(&mvpp2_root->kref);
+	} else {
+		kref_get(&mvpp2_root->kref);
+	}
 
-	mvpp2_dir = debugfs_create_dir(name, mvpp2_root);
+	mvpp2_dir = debugfs_create_dir(name, mvpp2_root->dir);
 
 	priv->dbgfs_dir = mvpp2_dir;
 	priv->dbgfs_entries = kzalloc(sizeof(*priv->dbgfs_entries), GFP_KERNEL);