[rdma-rc] IB/IPoIB: Fix legacy IPoIB due to wrong number of queues

Message ID 4a7ecec08ee30ad8004019818fadf1e58057e945.1674137153.git.leon@kernel.org (mailing list archive)
State Superseded

Commit Message

Leon Romanovsky Jan. 19, 2023, 2:06 p.m. UTC
From: Dragos Tatulea <dtatulea@nvidia.com>

The cited commit creates child PKEY interfaces over netlink with multiple
tx and rx queues, but some devices don't support more than 1 tx and 1 rx
queue. This causes a crash when traffic is sent over the PKEY interface
because the parent has a single queue but the child has multiple queues.

This patch inherits the real_num_tx/rx_queues from the parent netdev.

BUG: kernel NULL pointer dereference, address: 000000000000036b
PGD 0 P4D 0
Oops: 0000 [#1] SMP
CPU: 4 PID: 209665 Comm: python3 Not tainted 6.1.0_for_upstream_min_debug_2022_12_12_17_02 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmem_cache_alloc+0xcb/0x450
Code: ce 7e 49 8b 50 08 49 83 78 10 00 4d 8b 28 0f 84 cb 02 00 00 4d 85 ed 0f 84 c2 02 00 00 41 8b 44 24 28 48 8d 4a 01 49 8b 3c 24 <49> 8b 5c 05 00 4c 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 b8 41 8b
RSP: 0018:ffff88822acbbab8 EFLAGS: 00010202
RAX: 0000000000000070 RBX: ffff8881c28e3e00 RCX: 00000000064f8dae
RDX: 00000000064f8dad RSI: 0000000000000a20 RDI: 0000000000030d00
RBP: 0000000000000a20 R08: ffff8882f5d30d00 R09: ffff888104032f40
R10: ffff88810fade828 R11: 736f6d6570736575 R12: ffff88810081c000
R13: 00000000000002fb R14: ffffffff817fc865 R15: 0000000000000000
FS:  00007f9324ff9700(0000) GS:ffff8882f5d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000036b CR3: 00000001125af004 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 skb_clone+0x55/0xd0
 ip6_finish_output2+0x3fe/0x690
 ip6_finish_output+0xfa/0x310
 ip6_send_skb+0x1e/0x60
 udp_v6_send_skb+0x1e5/0x420
 udpv6_sendmsg+0xb3c/0xe60
 ? ip_mc_finish_output+0x180/0x180
 ? __switch_to_asm+0x3a/0x60
 ? __switch_to_asm+0x34/0x60
 sock_sendmsg+0x33/0x40
 __sys_sendto+0x103/0x160
 ? _copy_to_user+0x21/0x30
 ? kvm_clock_get_cycles+0xd/0x10
 ? ktime_get_ts64+0x49/0xe0
 __x64_sys_sendto+0x25/0x30
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f9374f1ed14
Code: 42 41 f8 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 68 41 f8 ff 48 8b
RSP: 002b:00007f9324ff7bd0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f9324ff7cc8 RCX: 00007f9374f1ed14
RDX: 00000000000002fb RSI: 00007f93000052f0 RDI: 0000000000000030
RBP: 0000000000000000 R08: 00007f9324ff7d40 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 000000012a05f200 R14: 0000000000000001 R15: 00007f9374d57bdc
 </TASK>

Fixes: dbc94a0fb817 ("IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_netlink.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

Leon Romanovsky Jan. 20, 2023, 5 p.m. UTC | #1
On Thu, Jan 19, 2023 at 04:06:13PM +0200, Leon Romanovsky wrote:
> From: Dragos Tatulea <dtatulea@nvidia.com>

<...>

> +	err = netif_set_real_num_tx_queues(dev, pdev->real_num_tx_queues);
> +	if (err) {
> +		ipoib_warn(ppriv, "failed setting the child tx queue count based on parent\n");
> +		return err;
> +	}
> +
> +	err = netif_set_real_num_rx_queues(dev, pdev->real_num_rx_queues);
> +	if (err) {
> +		ipoib_warn(ppriv, "failed setting the child rx queue count based on parent\m");
                                                                                          ^^^^
Strange, I thought I had fixed it prior to sending. I'll resend.


> +		return err;
> +	}
> +
>  	err = ipoib_intf_init(ppriv->ca, ppriv->port, dev->name, dev);
>  	if (err) {
>  		ipoib_warn(ppriv, "failed to initialize pkey device\n");
> -- 
> 2.39.0
>
Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
index 9ad8d9856275..589ae7af33ea 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_netlink.c
@@ -126,6 +126,18 @@  static int ipoib_new_child_link(struct net *src_net, struct net_device *dev,
 	} else
 		child_pkey  = nla_get_u16(data[IFLA_IPOIB_PKEY]);
 
+	err = netif_set_real_num_tx_queues(dev, pdev->real_num_tx_queues);
+	if (err) {
+		ipoib_warn(ppriv, "failed setting the child tx queue count based on parent\n");
+		return err;
+	}
+
+	err = netif_set_real_num_rx_queues(dev, pdev->real_num_rx_queues);
+	if (err) {
+		ipoib_warn(ppriv, "failed setting the child rx queue count based on parent\n");
+		return err;
+	}
+
 	err = ipoib_intf_init(ppriv->ca, ppriv->port, dev->name, dev);
 	if (err) {
 		ipoib_warn(ppriv, "failed to initialize pkey device\n");
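
The inheritance step described in the commit message can be sketched in user space as follows. This is only an illustrative model, not the kernel code: `struct model_netdev` and all `model_*` functions are hypothetical stand-ins for `struct net_device`, `netif_set_real_num_tx_queues()` and `netif_set_real_num_rx_queues()`, and the range check is an assumption about how the real helpers validate the requested count.

```c
#include <errno.h>

/* Hypothetical stand-in for the queue-related fields of struct net_device. */
struct model_netdev {
	unsigned int num_tx_queues;      /* queues allocated at creation time */
	unsigned int num_rx_queues;
	unsigned int real_num_tx_queues; /* queues currently in use */
	unsigned int real_num_rx_queues;
};

/* Assumed check: the active count must lie in 1..allocated. */
static int model_set_real_num_tx_queues(struct model_netdev *dev,
					unsigned int txq)
{
	if (txq < 1 || txq > dev->num_tx_queues)
		return -EINVAL;
	dev->real_num_tx_queues = txq;
	return 0;
}

static int model_set_real_num_rx_queues(struct model_netdev *dev,
					unsigned int rxq)
{
	if (rxq < 1 || rxq > dev->num_rx_queues)
		return -EINVAL;
	dev->real_num_rx_queues = rxq;
	return 0;
}

/*
 * Mirror of the two calls added by the patch: the child takes over the
 * parent's active tx and rx queue counts, stopping at the first error.
 */
static int model_inherit_queue_counts(struct model_netdev *child,
				      const struct model_netdev *parent)
{
	int err;

	err = model_set_real_num_tx_queues(child, parent->real_num_tx_queues);
	if (err)
		return err;

	return model_set_real_num_rx_queues(child, parent->real_num_rx_queues);
}
```

In this model, a child allocated with 8 tx and 8 rx queues under a parent with a single active queue of each kind ends up with `real_num_tx_queues == 1` and `real_num_rx_queues == 1` after `model_inherit_queue_counts()`. A child with fewer allocated queues than the parent has active gets `-EINVAL`, which corresponds to the error paths that log a warning in the patch.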