[for-rc] RDMA/ipoib: Fix warning caused by destroying non-initial netns

Message ID 20210525150134.139342-1-kamalheib1@gmail.com (mailing list archive)
State Accepted
Delegated to: Jason Gunthorpe

Commit Message

Kamal Heib May 25, 2021, 3:01 p.m. UTC
After the introduction of 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for
ipoib interfaces"), if the IPoIB device is moved to a non-initial netns,
destroying that netns makes the device vanish instead of moving it back
to the initial netns. This happens because default_device_exit() skips
interfaces that have rtnl_link_ops set.

Steps to reproduce:
  ip netns add foo
  ip link set mlx5_ib0 netns foo
  ip netns delete foo

------------[ cut here ]------------
WARNING: CPU: 1 PID: 704 at net/core/dev.c:11435 netdev_exit+0x3f/0x50
Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun d
 fuse
CPU: 1 PID: 704 Comm: kworker/u64:3 Tainted: G S      W  5.13.0-rc1+ #1
Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
Workqueue: netns cleanup_net
RIP: 0010:netdev_exit+0x3f/0x50
Code: 48 8b bb 30 01 00 00 e8 ef 81 b1 ff 48 81 fb c0 3a 54 a1 74 13 48
8b 83 90 00 00 00 48 81 c3 90 00 00 00 48 39 d8 75 02 5b c3 <0f> 0b 5b
c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00
RSP: 0018:ffffb297079d7e08 EFLAGS: 00010206
RAX: ffff8eb542c00040 RBX: ffff8eb541333150 RCX: 000000008010000d
RDX: 000000008010000e RSI: 000000008010000d RDI: ffff8eb440042c00
RBP: ffffb297079d7e48 R08: 0000000000000001 R09: ffffffff9fdeac00
R10: ffff8eb5003be000 R11: 0000000000000001 R12: ffffffffa1545620
R13: ffffffffa1545628 R14: 0000000000000000 R15: ffffffffa1543b20
FS:  0000000000000000(0000) GS:ffff8ed37fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005601b5f4c2e8 CR3: 0000001fc8c10002 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ops_exit_list.isra.9+0x36/0x70
 cleanup_net+0x234/0x390
 process_one_work+0x1cb/0x360
 ? process_one_work+0x360/0x360
 worker_thread+0x30/0x370
 ? process_one_work+0x360/0x360
 kthread+0x116/0x130
 ? kthread_park+0x80/0x80
 ret_from_fork+0x22/0x30
---[ end trace 74b40f8fbd65a323 ]---

To avoid the above warning, and the kernel panic that could later
happen on shutdown due to a NULL pointer dereference, set the
netns_refund flag that was introduced by [1] so that the IPoIB
interfaces are properly restored to the initial netns.

[1] - 3a5ca857079e ("can: dev: Move device back to init netns on owning
netns delete").

Fixes: 5ce2dced8e95 ("RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 1 +
 1 file changed, 1 insertion(+)
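
The decision described in the commit message can be sketched as a small
userspace model. This is only an illustration, not the kernel code: the
struct, enum and function names below are hypothetical, and the
NETIF_F_NETNS_LOCAL check is included on the assumption that
default_device_exit() performs it before looking at rtnl_link_ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of struct rtnl_link_ops used here. */
struct link_ops_model {
	const char *kind;
	bool netns_refund;	/* the flag this patch sets for "ipoib" */
};

/* Hypothetical stand-in for the part of struct net_device used here. */
struct netdev_model {
	const char *name;
	bool netns_local;			/* models NETIF_F_NETNS_LOCAL (assumption) */
	const struct link_ops_model *ops;	/* models dev->rtnl_link_ops, may be NULL */
};

enum exit_action {
	EXIT_SKIP_LOCAL,		/* device cannot change netns at all */
	EXIT_LEAVE_FOR_LINK_OPS,	/* skipped here; not moved to the initial netns */
	EXIT_MOVE_TO_INIT_NET,		/* pushed back to the initial netns */
};

/* What happens to one device when its owning netns is destroyed. */
enum exit_action exit_action_for(const struct netdev_model *dev)
{
	assert(dev != NULL);

	if (dev->netns_local)
		return EXIT_SKIP_LOCAL;

	/* rtnl_link_ops set and no refund requested: the device is skipped. */
	if (dev->ops && !dev->ops->netns_refund)
		return EXIT_LEAVE_FOR_LINK_OPS;

	return EXIT_MOVE_TO_INIT_NET;
}
```

In this model, a device whose ops have kind "ipoib" and netns_refund
false yields EXIT_LEAVE_FOR_LINK_OPS, which corresponds to the behaviour
seen since 5ce2dced8e95. With netns_refund true the same device yields
EXIT_MOVE_TO_INIT_NET, which is the behaviour this patch restores.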

Comments

Leon Romanovsky June 2, 2021, 8:10 a.m. UTC | #1
On Tue, May 25, 2021 at 06:01:34PM +0300, Kamal Heib wrote:
> [...]

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

Jason Gunthorpe June 2, 2021, 6:35 p.m. UTC | #2
On Tue, May 25, 2021 at 06:01:34PM +0300, Kamal Heib wrote:
> [...]

Applied to for-next, thanks

Jason

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
index d5a90a66b45c..5b05cf3837da 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
@@ -163,6 +163,7 @@  static size_t ipoib_get_size(const struct net_device *dev)
 
 static struct rtnl_link_ops ipoib_link_ops __read_mostly = {
 	.kind		= "ipoib",
+	.netns_refund   = true,
 	.maxtype	= IFLA_IPOIB_MAX,
 	.policy		= ipoib_policy,
 	.priv_size	= sizeof(struct ipoib_dev_priv),