diff mbox series

[net] ipmr,ip6mr: acquire RTNL before calling ip[6]mr_free_table() on failure path

Message ID 20220208053451.2885398-1-eric.dumazet@gmail.com (mailing list archive)
State Accepted
Commit 5611a00697c8ecc5aad04392bea629e9d6a20463
Delegated to: Netdev Maintainers
Headers show
Series [net] ipmr,ip6mr: acquire RTNL before calling ip[6]mr_free_table() on failure path | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 6 this patch: 6
netdev/cc_maintainers fail 1 blamed authors not CCed: xiyou.wangcong@gmail.com; 2 maintainers not CCed: xiyou.wangcong@gmail.com yoshfuji@linux-ipv6.org
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 11 this patch: 11
netdev/checkpatch warning WARNING: Possible repeated word: 'Google'
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Eric Dumazet Feb. 8, 2022, 5:34 a.m. UTC
From: Eric Dumazet <edumazet@google.com>

ip[6]mr_free_table() can only be called under RTNL lock.

RTNL: assertion failed at net/core/dev.c (10367)
WARNING: CPU: 1 PID: 5890 at net/core/dev.c:10367 unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
Modules linked in:
CPU: 1 PID: 5890 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller-11627-g422ee58dc0ef #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
Code: 0f 85 9b ee ff ff e8 69 07 4b fa ba 7f 28 00 00 48 c7 c6 00 90 ae 8a 48 c7 c7 40 90 ae 8a c6 05 6d b1 51 06 01 e8 8c 90 d8 01 <0f> 0b e9 70 ee ff ff e8 3e 07 4b fa 4c 89 e7 e8 86 2a 59 fa e9 ee
RSP: 0018:ffffc900046ff6e0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888050f51d00 RSI: ffffffff815fa008 RDI: fffff520008dfece
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815f3d6e R11: 0000000000000000 R12: 00000000fffffff4
R13: dffffc0000000000 R14: ffffc900046ff750 R15: ffff88807b7dc000
FS:  00007f4ab736e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fee0b4f8990 CR3: 000000001e7d2000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 mroute_clean_tables+0x244/0xb40 net/ipv6/ip6mr.c:1509
 ip6mr_free_table net/ipv6/ip6mr.c:389 [inline]
 ip6mr_rules_init net/ipv6/ip6mr.c:246 [inline]
 ip6mr_net_init net/ipv6/ip6mr.c:1306 [inline]
 ip6mr_net_init+0x3f0/0x4e0 net/ipv6/ip6mr.c:1298
 ops_init+0xaf/0x470 net/core/net_namespace.c:140
 setup_net+0x54f/0xbb0 net/core/net_namespace.c:331
 copy_net_ns+0x318/0x760 net/core/net_namespace.c:475
 create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110
 copy_namespaces+0x391/0x450 kernel/nsproxy.c:178
 copy_process+0x2e0c/0x7300 kernel/fork.c:2167
 kernel_clone+0xe7/0xab0 kernel/fork.c:2555
 __do_sys_clone+0xc8/0x110 kernel/fork.c:2672
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f4ab89f9059
Code: Unable to access opcode bytes at RIP 0x7f4ab89f902f.
RSP: 002b:00007f4ab736e118 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 00007f4ab8b0bf60 RCX: 00007f4ab89f9059
RDX: 0000000020000280 RSI: 0000000020000270 RDI: 0000000040200000
RBP: 00007f4ab8a5308d R08: 0000000020000300 R09: 0000000020000300
R10: 00000000200002c0 R11: 0000000000000206 R12: 0000000000000000
R13: 00007ffc3977cc1f R14: 00007f4ab736e300 R15: 0000000000022000
 </TASK>

Fixes: f243e5a7859a ("ipmr,ip6mr: call ip6mr_free_table() on failure path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <cong.wang@bytedance.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
---
 net/ipv4/ipmr.c  | 2 ++
 net/ipv6/ip6mr.c | 2 ++
 2 files changed, 4 insertions(+)

Comments

patchwork-bot+netdevbpf@kernel.org Feb. 9, 2022, 5:20 a.m. UTC | #1
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Mon,  7 Feb 2022 21:34:51 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> ip[6]mr_free_table() can only be called under RTNL lock.
> 
> RTNL: assertion failed at net/core/dev.c (10367)
> WARNING: CPU: 1 PID: 5890 at net/core/dev.c:10367 unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> Modules linked in:
> CPU: 1 PID: 5890 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller-11627-g422ee58dc0ef #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> Code: 0f 85 9b ee ff ff e8 69 07 4b fa ba 7f 28 00 00 48 c7 c6 00 90 ae 8a 48 c7 c7 40 90 ae 8a c6 05 6d b1 51 06 01 e8 8c 90 d8 01 <0f> 0b e9 70 ee ff ff e8 3e 07 4b fa 4c 89 e7 e8 86 2a 59 fa e9 ee
> RSP: 0018:ffffc900046ff6e0 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff888050f51d00 RSI: ffffffff815fa008 RDI: fffff520008dfece
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffffff815f3d6e R11: 0000000000000000 R12: 00000000fffffff4
> R13: dffffc0000000000 R14: ffffc900046ff750 R15: ffff88807b7dc000
> FS:  00007f4ab736e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fee0b4f8990 CR3: 000000001e7d2000 CR4: 00000000003506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  mroute_clean_tables+0x244/0xb40 net/ipv6/ip6mr.c:1509
>  ip6mr_free_table net/ipv6/ip6mr.c:389 [inline]
>  ip6mr_rules_init net/ipv6/ip6mr.c:246 [inline]
>  ip6mr_net_init net/ipv6/ip6mr.c:1306 [inline]
>  ip6mr_net_init+0x3f0/0x4e0 net/ipv6/ip6mr.c:1298
>  ops_init+0xaf/0x470 net/core/net_namespace.c:140
>  setup_net+0x54f/0xbb0 net/core/net_namespace.c:331
>  copy_net_ns+0x318/0x760 net/core/net_namespace.c:475
>  create_new_namespaces+0x3f6/0xb20 kernel/nsproxy.c:110
>  copy_namespaces+0x391/0x450 kernel/nsproxy.c:178
>  copy_process+0x2e0c/0x7300 kernel/fork.c:2167
>  kernel_clone+0xe7/0xab0 kernel/fork.c:2555
>  __do_sys_clone+0xc8/0x110 kernel/fork.c:2672
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f4ab89f9059
> Code: Unable to access opcode bytes at RIP 0x7f4ab89f902f.
> RSP: 002b:00007f4ab736e118 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
> RAX: ffffffffffffffda RBX: 00007f4ab8b0bf60 RCX: 00007f4ab89f9059
> RDX: 0000000020000280 RSI: 0000000020000270 RDI: 0000000040200000
> RBP: 00007f4ab8a5308d R08: 0000000020000300 R09: 0000000020000300
> R10: 00000000200002c0 R11: 0000000000000206 R12: 0000000000000000
> R13: 00007ffc3977cc1f R14: 00007f4ab736e300 R15: 0000000000022000
>  </TASK>
> 
> [...]

Here is the summary with links:
  - [net] ipmr,ip6mr: acquire RTNL before calling ip[6]mr_free_table() on failure path
    https://git.kernel.org/netdev/net/c/5611a00697c8

You are awesome, thank you!
Cong Wang Feb. 15, 2022, 12:24 a.m. UTC | #2
On Mon, Feb 07, 2022 at 09:34:51PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> ip[6]mr_free_table() can only be called under RTNL lock.
> 
> RTNL: assertion failed at net/core/dev.c (10367)
> WARNING: CPU: 1 PID: 5890 at net/core/dev.c:10367 unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> Modules linked in:
> CPU: 1 PID: 5890 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller-11627-g422ee58dc0ef #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> Code: 0f 85 9b ee ff ff e8 69 07 4b fa ba 7f 28 00 00 48 c7 c6 00 90 ae 8a 48 c7 c7 40 90 ae 8a c6 05 6d b1 51 06 01 e8 8c 90 d8 01 <0f> 0b e9 70 ee ff ff e8 3e 07 4b fa 4c 89 e7 e8 86 2a 59 fa e9 ee
> RSP: 0018:ffffc900046ff6e0 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: ffff888050f51d00 RSI: ffffffff815fa008 RDI: fffff520008dfece
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffffff815f3d6e R11: 0000000000000000 R12: 00000000fffffff4
> R13: dffffc0000000000 R14: ffffc900046ff750 R15: ffff88807b7dc000
> FS:  00007f4ab736e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fee0b4f8990 CR3: 000000001e7d2000 CR4: 00000000003506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  mroute_clean_tables+0x244/0xb40 net/ipv6/ip6mr.c:1509
>  ip6mr_free_table net/ipv6/ip6mr.c:389 [inline]
>  ip6mr_rules_init net/ipv6/ip6mr.c:246 [inline]
>  ip6mr_net_init net/ipv6/ip6mr.c:1306 [inline]

Isn't that new table still empty in this case? Which means
mroute_clean_tables() should not actually unregister any netdevice??

Should we just move that assertion after list empty check?


diff --git a/net/core/dev.c b/net/core/dev.c
index 909fb3815910..ff6e7d0074dd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10359,11 +10359,11 @@ void unregister_netdevice_many(struct list_head *head)
        LIST_HEAD(close_head);
 
        BUG_ON(dev_boot_phase);
-       ASSERT_RTNL();
 
        if (list_empty(head))
                return;
 
+       ASSERT_RTNL();
        list_for_each_entry_safe(dev, tmp, head, unreg_list) {
                /* Some devices call without registering
                 * for initialization unwind. Remove those
Eric Dumazet Feb. 15, 2022, 12:36 a.m. UTC | #3
On Mon, Feb 14, 2022 at 4:24 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Mon, Feb 07, 2022 at 09:34:51PM -0800, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > ip[6]mr_free_table() can only be called under RTNL lock.
> >
> > RTNL: assertion failed at net/core/dev.c (10367)
> > WARNING: CPU: 1 PID: 5890 at net/core/dev.c:10367 unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> > Modules linked in:
> > CPU: 1 PID: 5890 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller-11627-g422ee58dc0ef #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> > Code: 0f 85 9b ee ff ff e8 69 07 4b fa ba 7f 28 00 00 48 c7 c6 00 90 ae 8a 48 c7 c7 40 90 ae 8a c6 05 6d b1 51 06 01 e8 8c 90 d8 01 <0f> 0b e9 70 ee ff ff e8 3e 07 4b fa 4c 89 e7 e8 86 2a 59 fa e9 ee
> > RSP: 0018:ffffc900046ff6e0 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > RDX: ffff888050f51d00 RSI: ffffffff815fa008 RDI: fffff520008dfece
> > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> > R10: ffffffff815f3d6e R11: 0000000000000000 R12: 00000000fffffff4
> > R13: dffffc0000000000 R14: ffffc900046ff750 R15: ffff88807b7dc000
> > FS:  00007f4ab736e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fee0b4f8990 CR3: 000000001e7d2000 CR4: 00000000003506e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <TASK>
> >  mroute_clean_tables+0x244/0xb40 net/ipv6/ip6mr.c:1509
> >  ip6mr_free_table net/ipv6/ip6mr.c:389 [inline]
> >  ip6mr_rules_init net/ipv6/ip6mr.c:246 [inline]
> >  ip6mr_net_init net/ipv6/ip6mr.c:1306 [inline]
>
> Isn't that new table still empty in this case? Which means
> mroute_clean_tables() should not actually unregister any netdevice??
>
> Should we just move that assertion after list empty check?
>
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 909fb3815910..ff6e7d0074dd 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10359,11 +10359,11 @@ void unregister_netdevice_many(struct list_head *head)
>         LIST_HEAD(close_head);
>
>         BUG_ON(dev_boot_phase);
> -       ASSERT_RTNL();
>
>         if (list_empty(head))

The rule is that we need to hold RTNL when calling unregister_netdevice_many().

Adding a special case for empty list would avoid this safety check,
and perhaps hide future bugs.

This ASSER_RTNL() check has been there forever (before git)

Not sure what this brings, my patch only fixed a super-rare case ?
Do you think the added rtrnl acquisition is an issue ?
Cong Wang Feb. 15, 2022, 12:54 a.m. UTC | #4
On Mon, Feb 14, 2022 at 4:36 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Feb 14, 2022 at 4:24 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Mon, Feb 07, 2022 at 09:34:51PM -0800, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@google.com>
> > >
> > > ip[6]mr_free_table() can only be called under RTNL lock.
> > >
> > > RTNL: assertion failed at net/core/dev.c (10367)
> > > WARNING: CPU: 1 PID: 5890 at net/core/dev.c:10367 unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> > > Modules linked in:
> > > CPU: 1 PID: 5890 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller-11627-g422ee58dc0ef #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > RIP: 0010:unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> > > Code: 0f 85 9b ee ff ff e8 69 07 4b fa ba 7f 28 00 00 48 c7 c6 00 90 ae 8a 48 c7 c7 40 90 ae 8a c6 05 6d b1 51 06 01 e8 8c 90 d8 01 <0f> 0b e9 70 ee ff ff e8 3e 07 4b fa 4c 89 e7 e8 86 2a 59 fa e9 ee
> > > RSP: 0018:ffffc900046ff6e0 EFLAGS: 00010286
> > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > > RDX: ffff888050f51d00 RSI: ffffffff815fa008 RDI: fffff520008dfece
> > > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> > > R10: ffffffff815f3d6e R11: 0000000000000000 R12: 00000000fffffff4
> > > R13: dffffc0000000000 R14: ffffc900046ff750 R15: ffff88807b7dc000
> > > FS:  00007f4ab736e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007fee0b4f8990 CR3: 000000001e7d2000 CR4: 00000000003506e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > >  <TASK>
> > >  mroute_clean_tables+0x244/0xb40 net/ipv6/ip6mr.c:1509
> > >  ip6mr_free_table net/ipv6/ip6mr.c:389 [inline]
> > >  ip6mr_rules_init net/ipv6/ip6mr.c:246 [inline]
> > >  ip6mr_net_init net/ipv6/ip6mr.c:1306 [inline]
> >
> > Isn't that new table still empty in this case? Which means
> > mroute_clean_tables() should not actually unregister any netdevice??
> >
> > Should we just move that assertion after list empty check?
> >
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 909fb3815910..ff6e7d0074dd 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -10359,11 +10359,11 @@ void unregister_netdevice_many(struct list_head *head)
> >         LIST_HEAD(close_head);
> >
> >         BUG_ON(dev_boot_phase);
> > -       ASSERT_RTNL();
> >
> >         if (list_empty(head))
>
> The rule is that we need to hold RTNL when calling unregister_netdevice_many().
>
> Adding a special case for empty list would avoid this safety check,
> and perhaps hide future bugs.

Why is that? What bugs are you talking about when it is just a nop?

>
> This ASSER_RTNL() check has been there forever (before git)

So is this bug? ;)

>
> Not sure what this brings, my patch only fixed a super-rare case ?
> Do you think the added rtrnl acquisition is an issue ?

Yes, it is just completely unnecessary, I fail to see why we want to
use RTNL to protect a nop.

Thanks.
Eric Dumazet Feb. 15, 2022, 1:13 a.m. UTC | #5
On Mon, Feb 14, 2022 at 4:54 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Mon, Feb 14, 2022 at 4:36 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Mon, Feb 14, 2022 at 4:24 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > >
> > > On Mon, Feb 07, 2022 at 09:34:51PM -0800, Eric Dumazet wrote:
> > > > From: Eric Dumazet <edumazet@google.com>
> > > >
> > > > ip[6]mr_free_table() can only be called under RTNL lock.
> > > >
> > > > RTNL: assertion failed at net/core/dev.c (10367)
> > > > WARNING: CPU: 1 PID: 5890 at net/core/dev.c:10367 unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> > > > Modules linked in:
> > > > CPU: 1 PID: 5890 Comm: syz-executor.2 Not tainted 5.16.0-syzkaller-11627-g422ee58dc0ef #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > > RIP: 0010:unregister_netdevice_many+0x1246/0x1850 net/core/dev.c:10367
> > > > Code: 0f 85 9b ee ff ff e8 69 07 4b fa ba 7f 28 00 00 48 c7 c6 00 90 ae 8a 48 c7 c7 40 90 ae 8a c6 05 6d b1 51 06 01 e8 8c 90 d8 01 <0f> 0b e9 70 ee ff ff e8 3e 07 4b fa 4c 89 e7 e8 86 2a 59 fa e9 ee
> > > > RSP: 0018:ffffc900046ff6e0 EFLAGS: 00010286
> > > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > > > RDX: ffff888050f51d00 RSI: ffffffff815fa008 RDI: fffff520008dfece
> > > > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> > > > R10: ffffffff815f3d6e R11: 0000000000000000 R12: 00000000fffffff4
> > > > R13: dffffc0000000000 R14: ffffc900046ff750 R15: ffff88807b7dc000
> > > > FS:  00007f4ab736e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
> > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 00007fee0b4f8990 CR3: 000000001e7d2000 CR4: 00000000003506e0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > Call Trace:
> > > >  <TASK>
> > > >  mroute_clean_tables+0x244/0xb40 net/ipv6/ip6mr.c:1509
> > > >  ip6mr_free_table net/ipv6/ip6mr.c:389 [inline]
> > > >  ip6mr_rules_init net/ipv6/ip6mr.c:246 [inline]
> > > >  ip6mr_net_init net/ipv6/ip6mr.c:1306 [inline]
> > >
> > > Isn't that new table still empty in this case? Which means
> > > mroute_clean_tables() should not actually unregister any netdevice??
> > >
> > > Should we just move that assertion after list empty check?
> > >
> > >
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 909fb3815910..ff6e7d0074dd 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -10359,11 +10359,11 @@ void unregister_netdevice_many(struct list_head *head)
> > >         LIST_HEAD(close_head);
> > >
> > >         BUG_ON(dev_boot_phase);
> > > -       ASSERT_RTNL();
> > >
> > >         if (list_empty(head))
> >
> > The rule is that we need to hold RTNL when calling unregister_netdevice_many().
> >
> > Adding a special case for empty list would avoid this safety check,
> > and perhaps hide future bugs.
>
> Why is that? What bugs are you talking about when it is just a nop?
>
> >
> > This ASSER_RTNL() check has been there forever (before git)
>
> So is this bug? ;)
>
> >
> > Not sure what this brings, my patch only fixed a super-rare case ?
> > Do you think the added rtrnl acquisition is an issue ?
>
> Yes, it is just completely unnecessary, I fail to see why we want to
> use RTNL to protect a nop.
>

Should we revert your patch then ?

There was no explanation of why was it needed to call p6mr_free_table()',
if later we had to shortcut innocent functions that are simply
assuming RTNL is held.

commit f243e5a7859a24d10975afb9a1708cac624ba6f1
Author: WANG Cong <xiyou.wangcong@gmail.com>
Date:   Wed Mar 25 14:45:03 2015 -0700

    ipmr,ip6mr: call ip6mr_free_table() on failure path

    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


What was the reason to break the kernel, then complain later that
someone had to spend time to fix it ?
diff mbox series

Patch

diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 07274619b9ea11837501f8fe812d616d20573ee0..29bbe2b08ae970e5accd7e1b01bcd5f502a66810 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -256,7 +256,9 @@  static int __net_init ipmr_rules_init(struct net *net)
 	return 0;
 
 err2:
+	rtnl_lock();
 	ipmr_free_table(mrt);
+	rtnl_unlock();
 err1:
 	fib_rules_unregister(ops);
 	return err;
diff --git a/net/ipv6/ip6mr.c b/net/ipv6/ip6mr.c
index 7cf73e60e619ba92ea1eccc90c181ba7150225dd..8a2db926b5eb6ab2e08691ad7244c504611540e6 100644
--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -243,7 +243,9 @@  static int __net_init ip6mr_rules_init(struct net *net)
 	return 0;
 
 err2:
+	rtnl_lock();
 	ip6mr_free_table(mrt);
+	rtnl_unlock();
 err1:
 	fib_rules_unregister(ops);
 	return err;