Message ID | 20240808202658.5933-1-david.hunter.linux@gmail.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Series | [1/1] Net: bcm.c: Remove Subtree Instead of Entry |
Hello David,

many thanks for the patch and the description.

Btw. the data structures of the elements inside that bcm proc dir should
have been removed at that point, so that the can-bcm dir should be empty.

I'm not sure what happens to the open sockets that are (later) removed
in bcm_release() when we use remove_proc_subtree() as suggested.
Removing this warning probably does not heal the root cause of the issue.

What did you do to trigger the warning?

Did you work with network namespaces or LXC/Docker and purged an entire
namespace?

Best regards,
Oliver

On 08.08.24 22:26, David Hunter wrote:
> Fix a warning with bcm.c that is caused by removing an entry. If the
> entry had a process as a child, a warning is generated:
>
> remove_proc_entry: removing non-empty directory 'net/can-bcm'...
> WARNING: CPU: 1 PID: 71 at fs/proc/generic.c:717 remove_proc_entry
> Call Trace:
> remove_proc_entry
> canbcm_pernet_exit
> ops_exit_list
>
> Instead of simply removing the entry, remove the entire subdirectory.
> The child process will still be removed, but without a warning occurring.
>
> This patch was compiled and the code traced with gdb to see that the
> tree was removed. The code was run to see that the warning was removed.
> In addition, the code was tested with the kselftest
> net subsystem. No regressions were detected.
>
> Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
> ---
> net/can/bcm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/can/bcm.c b/net/can/bcm.c
> index 27d5fcf0eac9..fea48fd793e5 100644
> --- a/net/can/bcm.c
> +++ b/net/can/bcm.c
> @@ -1779,7 +1779,7 @@ static void canbcm_pernet_exit(struct net *net)
>  #if IS_ENABLED(CONFIG_PROC_FS)
>  	/* remove /proc/net/can-bcm directory */
>  	if (net->can.bcmproc_dir)
> -		remove_proc_entry("can-bcm", net->proc_net);
> +		remove_proc_subtree("can-bcm", net->proc_net);
>  #endif /* CONFIG_PROC_FS */
>  }
>
Hello Oliver,

> What did you do to trigger the warning?

I am in the Linux Kernel Internship Program for the Linux Foundation. Our
goal is to fix outstanding bugs with the kernel. I found the following bug
on syzbot:

https://syzkaller.appspot.com/bug?extid=df49d48077305d17519a

This specific link is for a separate issue that I will soon send a separate
patch for; however, I found the bug for this patch after I switched the
kernel command-line parameter panic_on_warn to 0.

If you wish to reproduce the error, you can do the following steps:
1) compile and install a kernel with the config file from the link
2) pass kernel parameter panic_on_warn=0
3) build and run the C reproducer for the bug.

As best as I can tell, the C reproducer simply made a system call that
resulted in the can-bcm directory entry being deleted. I am still wrapping
my head around the code (I am new to kernel programming), but here is the
full stack trace.

[ 156.449047][ T71] Call Trace:
[ 156.450067][ T71] <TASK>
[ 156.451076][ T71] ? show_regs+0x84/0x8b
[ 156.452490][ T71] ? __warn+0x150/0x29e
[ 156.453754][ T71] ? remove_proc_entry+0x335/0x385
[ 156.456485][ T71] ? report_bug+0x33d/0x431
[ 156.457994][ T71] ? remove_proc_entry+0x335/0x385
[ 156.459845][ T71] ? handle_bug+0x3d/0x66
[ 156.461230][ T71] ? exc_invalid_op+0x17/0x3e
[ 156.462672][ T71] ? asm_exc_invalid_op+0x1a/0x20
[ 156.464282][ T71] ? __warn_printk+0x26d/0x2aa
[ 156.465759][ T71] ? remove_proc_entry+0x335/0x385
[ 156.467233][ T71] ? remove_proc_entry+0x334/0x385
[ 156.468821][ T71] ? proc_readdir+0x11a/0x11a
[ 156.470122][ T71] ? __sanitizer_cov_trace_pc+0x1e/0x42
[ 156.471697][ T71] ? cgw_remove_all_jobs+0xa5/0x16f
[ 156.474096][ T71] canbcm_pernet_exit+0x73/0x79
[ 156.476732][ T71] ops_exit_list+0xf1/0x146
[ 156.478358][ T71] cleanup_net+0x333/0x570
[ 156.479856][ T71] ? setup_net+0x7ba/0x7ba
[ 156.481479][ T71] ? process_scheduled_works+0x652/0xbab
[ 156.483592][ T71] process_scheduled_works+0x7b8/0xbab
[ 156.486039][ T71] ? drain_workqueue+0x33b/0x33b
[ 156.487841][ T71] ? __sanitizer_cov_trace_pc+0x1e/0x42
[ 156.489742][ T71] ? move_linked_works+0x9f/0x108
[ 156.491376][ T71] worker_thread+0x5bd/0x6cc
[ 156.492877][ T71] ? rescuer_thread+0x64d/0x64d
[ 156.494350][ T71] kthread+0x30a/0x31e
[ 156.495769][ T71] ? kthread_complete_and_exit+0x35/0x35
[ 156.497977][ T71] ret_from_fork+0x34/0x6b
[ 156.499734][ T71] ? kthread_complete_and_exit+0x35/0x35
[ 156.501494][ T71] ret_from_fork_asm+0x11/0x20

> Removing this warning probably does not heal the root cause of the issue.

I would love to work on the root cause of the issue if at all possible. Do
you think that the C reproducer went down an unlikely avenue, and therefore,
further work is not needed, or do you think that this is an issue that
requires some attention?

I appreciate the response to my patch. I am learning a lot.

Thanks,
David
From: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Fri, 9 Aug 2024 11:57:41 +0200
> Hello David,
>
> many thanks for the patch and the description.
>
> Btw. the data structures of the elements inside that bcm proc dir should
> have been removed at that point, so that the can-bcm dir should be empty.
>
> I'm not sure what happens to the open sockets that are (later) removed
> in bcm_release() when we use remove_proc_subtree() as suggested.
> Removing this warning probably does not heal the root cause of the issue.

I posted a patch to fix bcm's proc entry leak few weeks ago, and this might
be related.
https://lore.kernel.org/netdev/20240722192842.37421-1-kuniyu@amazon.com/

Oliver, could you take this patch to can tree ?
From: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Fri, 9 Aug 2024 13:22:49 -0700
> From: Oliver Hartkopp <socketcan@hartkopp.net>
> Date: Fri, 9 Aug 2024 11:57:41 +0200
> > Hello David,
> >
> > many thanks for the patch and the description.
> >
> > Btw. the data structures of the elements inside that bcm proc dir should
> > have been removed at that point, so that the can-bcm dir should be empty.
> >
> > I'm not sure what happens to the open sockets that are (later) removed
> > in bcm_release() when we use remove_proc_subtree() as suggested.
> > Removing this warning probably does not heal the root cause of the issue.
>
> I posted a patch to fix bcm's proc entry leak few weeks ago, and this might
> be related.
> https://lore.kernel.org/netdev/20240722192842.37421-1-kuniyu@amazon.com/

I just noticed the syzbot report that David pointed out has the same splat,
so this is the same issue that my patch fixes.
https://syzkaller.appspot.com/bug?extid=df49d48077305d17519a
On 09.08.2024 13:22:49, Kuniyuki Iwashima wrote:
> From: Oliver Hartkopp <socketcan@hartkopp.net>
> Date: Fri, 9 Aug 2024 11:57:41 +0200
> > Hello David,
> >
> > many thanks for the patch and the description.
> >
> > Btw. the data structures of the elements inside that bcm proc dir should
> > have been removed at that point, so that the can-bcm dir should be empty.
> >
> > I'm not sure what happens to the open sockets that are (later) removed
> > in bcm_release() when we use remove_proc_subtree() as suggested.
> > Removing this warning probably does not heal the root cause of the issue.
>
> I posted a patch to fix bcm's proc entry leak few weeks ago, and this might
> be related.
> https://lore.kernel.org/netdev/20240722192842.37421-1-kuniyu@amazon.com/
>
> Oliver, could you take this patch to can tree ?

That patch is included in my latest PR to net:

https://lore.kernel.org/all/20240829192947.1186760-1-mkl@pengutronix.de

Marc
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27d5fcf0eac9..fea48fd793e5 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1779,7 +1779,7 @@ static void canbcm_pernet_exit(struct net *net)
 #if IS_ENABLED(CONFIG_PROC_FS)
 	/* remove /proc/net/can-bcm directory */
 	if (net->can.bcmproc_dir)
-		remove_proc_entry("can-bcm", net->proc_net);
+		remove_proc_subtree("can-bcm", net->proc_net);
 #endif /* CONFIG_PROC_FS */
 }
Fix a warning with bcm.c that is caused by removing an entry. If the
entry had a process as a child, a warning is generated:

remove_proc_entry: removing non-empty directory 'net/can-bcm'...
WARNING: CPU: 1 PID: 71 at fs/proc/generic.c:717 remove_proc_entry
Call Trace:
remove_proc_entry
canbcm_pernet_exit
ops_exit_list

Instead of simply removing the entry, remove the entire subdirectory.
The child process will still be removed, but without a warning occurring.

This patch was compiled and the code traced with gdb to see that the
tree was removed. The code was run to see that the warning was removed.
In addition, the code was tested with the kselftest
net subsystem. No regressions were detected.

Signed-off-by: David Hunter <david.hunter.linux@gmail.com>
---
 net/can/bcm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
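
To make the difference between the two removal calls concrete, here is a small userspace sketch. It is not kernel code: `struct toy_entry`, `toy_remove_entry()`, `toy_remove_subtree()`, `toy_run()` and the child name `leftover` are all hypothetical stand-ins, and the model only assumes what the report above shows, namely that removing a directory entry that still has a child raises a warning, while removing the whole subtree does not. In both cases the unexpected child was present at teardown, which is the root-cause concern raised in the review.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for a proc entry; not the kernel's struct. */
struct toy_entry {
	char name[32];
	struct toy_entry *children;	/* head of the child list */
	struct toy_entry *next;		/* next sibling */
};

/* Counts how often the "non-empty directory" condition was hit. */
int toy_warnings;

/* Create an entry and link it under parent (parent may be NULL for a root). */
struct toy_entry *toy_add(struct toy_entry *parent, const char *name)
{
	struct toy_entry *e = calloc(1, sizeof(*e));

	assert(e != NULL);
	strncpy(e->name, name, sizeof(e->name) - 1);
	if (parent) {
		e->next = parent->children;
		parent->children = e;
	}
	return e;
}

/* Unlink the child called name from parent and return it, or NULL. */
struct toy_entry *toy_detach(struct toy_entry *parent, const char *name)
{
	struct toy_entry **pp;

	for (pp = &parent->children; *pp; pp = &(*pp)->next) {
		if (strcmp((*pp)->name, name) == 0) {
			struct toy_entry *e = *pp;

			*pp = e->next;
			e->next = NULL;
			return e;
		}
	}
	return NULL;
}

/* Free an entry and everything below it; returns the number of entries freed. */
int toy_free_tree(struct toy_entry *e)
{
	int freed = 1;

	while (e->children) {
		struct toy_entry *child = e->children;

		e->children = child->next;
		freed += toy_free_tree(child);
	}
	free(e);
	return freed;
}

/* Models the remove_proc_entry() call: complains if children are still there. */
void toy_remove_entry(struct toy_entry *parent, const char *name)
{
	struct toy_entry *e = toy_detach(parent, name);

	if (!e)
		return;
	if (e->children)
		toy_warnings++;	/* stands in for the WARNING in the report */
	toy_free_tree(e);	/* the toy frees leftovers so the demo itself leaks nothing */
}

/* Models the remove_proc_subtree() call: removes directory and contents silently. */
void toy_remove_subtree(struct toy_entry *parent, const char *name)
{
	struct toy_entry *e = toy_detach(parent, name);

	if (e)
		toy_free_tree(e);
}

/* Build net/can-bcm/leftover, tear it down, and return the warnings raised. */
int toy_run(int use_subtree)
{
	struct toy_entry *net = toy_add(NULL, "net");
	struct toy_entry *dir = toy_add(net, "can-bcm");
	int before = toy_warnings;

	toy_add(dir, "leftover");	/* hypothetical name for an entry that was never cleaned up */
	if (use_subtree)
		toy_remove_subtree(net, "can-bcm");
	else
		toy_remove_entry(net, "can-bcm");
	toy_free_tree(net);
	return toy_warnings - before;
}
```

Calling `toy_run(0)` exercises the entry-only removal and `toy_run(1)` the subtree removal. The only observable difference in this model is the warning count, which is why switching calls hides the symptom without explaining why `leftover` still existed.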