diff mbox series

[rdma-next] RDMA/core: Skip initialized but not leaked GID entries

Message ID 7cce156160c4da8062e3cc8c5e9d5b7880feaafd.1725284500.git.leonro@nvidia.com (mailing list archive)
State Handled Elsewhere
Headers show
Series [rdma-next] RDMA/core: Skip initialized but not leaked GID entries | expand

Commit Message

Leon Romanovsky Sept. 2, 2024, 1:42 p.m. UTC
From: Leon Romanovsky <leonro@nvidia.com>

Failure in driver initialization can lead to a situation where the GID
entries are set but not used yet. In this case, the kref will be equal to 1,
which will trigger a false positive leak detection.

For example, these messages are printed during the driver initialization
and followed by release_gid_table() call:

 infiniband syz1: ib_query_port failed (-19)
 infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
------------[ cut here ]------------
 GID entry ref leak for dev syz1 index 0 ref=1
 WARNING: CPU: 0 PID: 19837 at drivers/infiniband/core/cache.c:806 gid_table_release_one+0x387/0x4b0
 Modules linked in:
 CPU: 0 UID: 0 PID: 19837 Comm: syz.1.3934 Not tainted 6.11.0-rc5-syzkaller-00079-g928f79a188aa #0
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
 RIP: 0010:gid_table_release_one+0x387/0x4b0
 Code: 78 07 00 00 48 85 f6 74 2a 48 89 74 24 38
e8 b0 0a 76 f9 48 8b 74 24 38 44 89 f9 89 da 48 c7 c7 c0 69 51 8c e8 5a
c3 38 f9 90 <0f> 0b 90 90 e9 6f fe ff ff e8 8b 0a 76 f9 49 8d bc 24 28
07 00 00
 RSP: 0018:ffffc900042b7080 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9002811e000
 RDX: 0000000000040000 RSI: ffffffff814dd406 RDI: 0000000000000001
 RBP: ffff88807ebaaf00 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff888051860000
 R13: dffffc0000000000 R14: ffffed100fd755fb R15: 0000000000000001
 FS:  0000000000000000(0000) GS:ffff88802c000000(0063) knlGS:00000000f56c6b40
 CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
 CR2: 000000002effcff8 CR3: 0000000060c5e000 CR4: 0000000000350ef0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
 Call Trace:
  <TASK>
  ? show_regs+0x8c/0xa0
  ? __warn+0xe5/0x3c0
  ? gid_table_release_one+0x387/0x4b0
  ? report_bug+0x3c0/0x580
  ? handle_bug+0x3d/0x70
  ? exc_invalid_op+0x17/0x50
  ? asm_exc_invalid_op+0x1a/0x20
  ? __warn_printk+0x1a6/0x350
  ? gid_table_release_one+0x387/0x4b0
  ib_device_release+0xef/0x1e0
  ? __pfx_ib_device_release+0x10/0x10
  device_release+0xa1/0x240
  kobject_put+0x1e4/0x5a0
  put_device+0x1f/0x30
  rxe_net_add+0xe0/0x110
  rxe_newlink+0x70/0x190
  nldev_newlink+0x373/0x5e0
  ? __pfx_nldev_newlink+0x10/0x10
  ? aa_get_newest_label+0x376/0x680
  ? __pfx_lock_acquire+0x10/0x10
  ? __pfx_aa_get_newest_label+0x10/0x10
  ? __pfx_rwsem_read_trylock+0x10/0x10
  ? __pfx___might_resched+0x10/0x10
  ? apparmor_capable+0x114/0x1d0
  ? ns_capable+0xd7/0x110
  ? __pfx_nldev_newlink+0x10/0x10
  rdma_nl_rcv_msg+0x388/0x6e0
  ? __pfx_rdma_nl_rcv_msg+0x10/0x10
  ? __pfx___lock_acquire+0x10/0x10
  ? find_held_lock+0x2d/0x110
  rdma_nl_rcv_skb.constprop.0.isra.0+0x2e6/0x450
  ? __pfx_rdma_nl_rcv_skb.constprop.0.isra.0+0x10/0x10
  ? netlink_deliver_tap+0x1ae/0xcf0
  netlink_unicast+0x53c/0x7f0
  ? __pfx_netlink_unicast+0x10/0x10
  ? __phys_addr_symbol+0x30/0x80
  ? __check_object_size+0x497/0x720
  netlink_sendmsg+0x8b8/0xd70
  ? __pfx_netlink_sendmsg+0x10/0x10
  ? bpf_lsm_socket_sendmsg+0x9/0x10
  ____sys_sendmsg+0x9b4/0xb50
  ? __pfx_____sys_sendmsg+0x10/0x10
  ? get_compat_msghdr+0x11b/0x170
  ? __pfx___lock_acquire+0x10/0x10
  ? try_to_wake_up+0xc08/0x13e0
  ___sys_sendmsg+0x135/0x1e0
  ? __pfx____sys_sendmsg+0x10/0x10
  ? __fget_light+0x173/0x210
  __sys_sendmsg+0x117/0x1f0
  ? __pfx___sys_sendmsg+0x10/0x10
  ? __ia32_sys_futex_time32+0x1da/0x460
  __do_fast_syscall_32+0x73/0x120
  do_fast_syscall_32+0x32/0x80
  entry_SYSENTER_compat_after_hwframe+0x84/0x8e
 RIP: 0023:0xf7f20579
 Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01
10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5
0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00
00 00 00
 RSP: 002b:00000000f56c656c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
 RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00000000200003c0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
SYZFAIL: failed to recv rpc fd=3 want=4 recv=0 n=0 (errno 9: Bad file descriptor)
 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
 </TASK>
 Kernel panic - not syncing: kernel: panic_on_warn set ...

In order to fix it, don't print kern

Fixes: a92fbeac7e94 ("RDMA/cache: Release GID table even if leak is detected")
Reported-by: syzbot+b8b7a6774bf40cf8296b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000d70eed06211ac86b@google.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/cache.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Zhu Yanjun Sept. 2, 2024, 10:46 p.m. UTC | #1
在 2024/9/2 21:42, Leon Romanovsky 写道:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> Failure in driver initialization can lead to a situation where the GID
> entries are set but not used yet. In this case, the kref will be equal to 1,
> which will trigger a false positive leak detection.

I have also delved into this problem. And I agree with you. This is a 
false positive leak detection.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

> 
> For example, these messages are printed during the driver initialization
> and followed by release_gid_table() call:
> 
>   infiniband syz1: ib_query_port failed (-19)
>   infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
> ------------[ cut here ]------------
>   GID entry ref leak for dev syz1 index 0 ref=1
>   WARNING: CPU: 0 PID: 19837 at drivers/infiniband/core/cache.c:806 gid_table_release_one+0x387/0x4b0
>   Modules linked in:
>   CPU: 0 UID: 0 PID: 19837 Comm: syz.1.3934 Not tainted 6.11.0-rc5-syzkaller-00079-g928f79a188aa #0
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
>   RIP: 0010:gid_table_release_one+0x387/0x4b0
>   Code: 78 07 00 00 48 85 f6 74 2a 48 89 74 24 38
> e8 b0 0a 76 f9 48 8b 74 24 38 44 89 f9 89 da 48 c7 c7 c0 69 51 8c e8 5a
> c3 38 f9 90 <0f> 0b 90 90 e9 6f fe ff ff e8 8b 0a 76 f9 49 8d bc 24 28
> 07 00 00
>   RSP: 0018:ffffc900042b7080 EFLAGS: 00010286
>   RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9002811e000
>   RDX: 0000000000040000 RSI: ffffffff814dd406 RDI: 0000000000000001
>   RBP: ffff88807ebaaf00 R08: 0000000000000001 R09: 0000000000000000
>   R10: 0000000000000001 R11: 0000000000000000 R12: ffff888051860000
>   R13: dffffc0000000000 R14: ffffed100fd755fb R15: 0000000000000001
>   FS:  0000000000000000(0000) GS:ffff88802c000000(0063) knlGS:00000000f56c6b40
>   CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
>   CR2: 000000002effcff8 CR3: 0000000060c5e000 CR4: 0000000000350ef0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
>   Call Trace:
>    <TASK>
>    ? show_regs+0x8c/0xa0
>    ? __warn+0xe5/0x3c0
>    ? gid_table_release_one+0x387/0x4b0
>    ? report_bug+0x3c0/0x580
>    ? handle_bug+0x3d/0x70
>    ? exc_invalid_op+0x17/0x50
>    ? asm_exc_invalid_op+0x1a/0x20
>    ? __warn_printk+0x1a6/0x350
>    ? gid_table_release_one+0x387/0x4b0
>    ib_device_release+0xef/0x1e0
>    ? __pfx_ib_device_release+0x10/0x10
>    device_release+0xa1/0x240
>    kobject_put+0x1e4/0x5a0
>    put_device+0x1f/0x30
>    rxe_net_add+0xe0/0x110
>    rxe_newlink+0x70/0x190
>    nldev_newlink+0x373/0x5e0
>    ? __pfx_nldev_newlink+0x10/0x10
>    ? aa_get_newest_label+0x376/0x680
>    ? __pfx_lock_acquire+0x10/0x10
>    ? __pfx_aa_get_newest_label+0x10/0x10
>    ? __pfx_rwsem_read_trylock+0x10/0x10
>    ? __pfx___might_resched+0x10/0x10
>    ? apparmor_capable+0x114/0x1d0
>    ? ns_capable+0xd7/0x110
>    ? __pfx_nldev_newlink+0x10/0x10
>    rdma_nl_rcv_msg+0x388/0x6e0
>    ? __pfx_rdma_nl_rcv_msg+0x10/0x10
>    ? __pfx___lock_acquire+0x10/0x10
>    ? find_held_lock+0x2d/0x110
>    rdma_nl_rcv_skb.constprop.0.isra.0+0x2e6/0x450
>    ? __pfx_rdma_nl_rcv_skb.constprop.0.isra.0+0x10/0x10
>    ? netlink_deliver_tap+0x1ae/0xcf0
>    netlink_unicast+0x53c/0x7f0
>    ? __pfx_netlink_unicast+0x10/0x10
>    ? __phys_addr_symbol+0x30/0x80
>    ? __check_object_size+0x497/0x720
>    netlink_sendmsg+0x8b8/0xd70
>    ? __pfx_netlink_sendmsg+0x10/0x10
>    ? bpf_lsm_socket_sendmsg+0x9/0x10
>    ____sys_sendmsg+0x9b4/0xb50
>    ? __pfx_____sys_sendmsg+0x10/0x10
>    ? get_compat_msghdr+0x11b/0x170
>    ? __pfx___lock_acquire+0x10/0x10
>    ? try_to_wake_up+0xc08/0x13e0
>    ___sys_sendmsg+0x135/0x1e0
>    ? __pfx____sys_sendmsg+0x10/0x10
>    ? __fget_light+0x173/0x210
>    __sys_sendmsg+0x117/0x1f0
>    ? __pfx___sys_sendmsg+0x10/0x10
>    ? __ia32_sys_futex_time32+0x1da/0x460
>    __do_fast_syscall_32+0x73/0x120
>    do_fast_syscall_32+0x32/0x80
>    entry_SYSENTER_compat_after_hwframe+0x84/0x8e
>   RIP: 0023:0xf7f20579
>   Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01
> 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5
> 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00
> 00 00 00
>   RSP: 002b:00000000f56c656c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
>   RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00000000200003c0
>   RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
> SYZFAIL: failed to recv rpc fd=3 want=4 recv=0 n=0 (errno 9: Bad file descriptor)
>   R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
>   </TASK>
>   Kernel panic - not syncing: kernel: panic_on_warn set ...
> 
> In order to fix it, don't print kern
> 
> Fixes: a92fbeac7e94 ("RDMA/cache: Release GID table even if leak is detected")
> Reported-by: syzbot+b8b7a6774bf40cf8296b@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000d70eed06211ac86b@google.com
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>   drivers/infiniband/core/cache.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index b7c078b7f7cf..c6aec2e04d4c 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
>   		return;
>   
>   	for (i = 0; i < table->sz; i++) {
> +		int gid_kref;
> +
>   		if (is_gid_entry_free(table->data_vec[i]))
>   			continue;
>   
> -		WARN_ONCE(true,
> +		gid_kref = kref_read(&table->data_vec[i]->kref);
> +		WARN_ONCE(gid_kref > 1,
>   			  "GID entry ref leak for dev %s index %d ref=%u\n",
> -			  dev_name(&device->dev), i,
> -			  kref_read(&table->data_vec[i]->kref));
> +			  dev_name(&device->dev), i, gid_kref);
>   	}
>   
>   	mutex_destroy(&table->lock);
Leon Romanovsky Sept. 3, 2024, 7:26 a.m. UTC | #2
On Tue, Sep 03, 2024 at 06:46:24AM +0800, Zhu Yanjun wrote:
> 在 2024/9/2 21:42, Leon Romanovsky 写道:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Failure in driver initialization can lead to a situation where the GID
> > entries are set but not used yet. In this case, the kref will be equal to 1,
> > which will trigger a false positive leak detection.
> 
> I have also delved into this problem. And I agree with you. This is a false
> positive leak detection.
> 
> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Thanks for the review.
Leon Romanovsky Sept. 4, 2024, 8:33 a.m. UTC | #3
On Tue, Sep 03, 2024 at 06:46:24AM +0800, Zhu Yanjun wrote:
> 在 2024/9/2 21:42, Leon Romanovsky 写道:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Failure in driver initialization can lead to a situation where the GID
> > entries are set but not used yet. In this case, the kref will be equal to 1,
> > which will trigger a false positive leak detection.
> 
> I have also delved into this problem. And I agree with you. This is a false
> positive leak detection.

I retested after applying commit 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow cleanup")
and it looks like this patch is not needed anymore. I will drop it.

Thanks

> 
> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> 
> Zhu Yanjun
> 
> > 
> > For example, these messages are printed during the driver initialization
> > and followed by release_gid_table() call:
> > 
> >   infiniband syz1: ib_query_port failed (-19)
> >   infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
> > ------------[ cut here ]------------
> >   GID entry ref leak for dev syz1 index 0 ref=1
> >   WARNING: CPU: 0 PID: 19837 at drivers/infiniband/core/cache.c:806 gid_table_release_one+0x387/0x4b0
> >   Modules linked in:
> >   CPU: 0 UID: 0 PID: 19837 Comm: syz.1.3934 Not tainted 6.11.0-rc5-syzkaller-00079-g928f79a188aa #0
> >   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> >   RIP: 0010:gid_table_release_one+0x387/0x4b0
> >   Code: 78 07 00 00 48 85 f6 74 2a 48 89 74 24 38
> > e8 b0 0a 76 f9 48 8b 74 24 38 44 89 f9 89 da 48 c7 c7 c0 69 51 8c e8 5a
> > c3 38 f9 90 <0f> 0b 90 90 e9 6f fe ff ff e8 8b 0a 76 f9 49 8d bc 24 28
> > 07 00 00
> >   RSP: 0018:ffffc900042b7080 EFLAGS: 00010286
> >   RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc9002811e000
> >   RDX: 0000000000040000 RSI: ffffffff814dd406 RDI: 0000000000000001
> >   RBP: ffff88807ebaaf00 R08: 0000000000000001 R09: 0000000000000000
> >   R10: 0000000000000001 R11: 0000000000000000 R12: ffff888051860000
> >   R13: dffffc0000000000 R14: ffffed100fd755fb R15: 0000000000000001
> >   FS:  0000000000000000(0000) GS:ffff88802c000000(0063) knlGS:00000000f56c6b40
> >   CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
> >   CR2: 000000002effcff8 CR3: 0000000060c5e000 CR4: 0000000000350ef0
> >   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
> >   Call Trace:
> >    <TASK>
> >    ? show_regs+0x8c/0xa0
> >    ? __warn+0xe5/0x3c0
> >    ? gid_table_release_one+0x387/0x4b0
> >    ? report_bug+0x3c0/0x580
> >    ? handle_bug+0x3d/0x70
> >    ? exc_invalid_op+0x17/0x50
> >    ? asm_exc_invalid_op+0x1a/0x20
> >    ? __warn_printk+0x1a6/0x350
> >    ? gid_table_release_one+0x387/0x4b0
> >    ib_device_release+0xef/0x1e0
> >    ? __pfx_ib_device_release+0x10/0x10
> >    device_release+0xa1/0x240
> >    kobject_put+0x1e4/0x5a0
> >    put_device+0x1f/0x30
> >    rxe_net_add+0xe0/0x110
> >    rxe_newlink+0x70/0x190
> >    nldev_newlink+0x373/0x5e0
> >    ? __pfx_nldev_newlink+0x10/0x10
> >    ? aa_get_newest_label+0x376/0x680
> >    ? __pfx_lock_acquire+0x10/0x10
> >    ? __pfx_aa_get_newest_label+0x10/0x10
> >    ? __pfx_rwsem_read_trylock+0x10/0x10
> >    ? __pfx___might_resched+0x10/0x10
> >    ? apparmor_capable+0x114/0x1d0
> >    ? ns_capable+0xd7/0x110
> >    ? __pfx_nldev_newlink+0x10/0x10
> >    rdma_nl_rcv_msg+0x388/0x6e0
> >    ? __pfx_rdma_nl_rcv_msg+0x10/0x10
> >    ? __pfx___lock_acquire+0x10/0x10
> >    ? find_held_lock+0x2d/0x110
> >    rdma_nl_rcv_skb.constprop.0.isra.0+0x2e6/0x450
> >    ? __pfx_rdma_nl_rcv_skb.constprop.0.isra.0+0x10/0x10
> >    ? netlink_deliver_tap+0x1ae/0xcf0
> >    netlink_unicast+0x53c/0x7f0
> >    ? __pfx_netlink_unicast+0x10/0x10
> >    ? __phys_addr_symbol+0x30/0x80
> >    ? __check_object_size+0x497/0x720
> >    netlink_sendmsg+0x8b8/0xd70
> >    ? __pfx_netlink_sendmsg+0x10/0x10
> >    ? bpf_lsm_socket_sendmsg+0x9/0x10
> >    ____sys_sendmsg+0x9b4/0xb50
> >    ? __pfx_____sys_sendmsg+0x10/0x10
> >    ? get_compat_msghdr+0x11b/0x170
> >    ? __pfx___lock_acquire+0x10/0x10
> >    ? try_to_wake_up+0xc08/0x13e0
> >    ___sys_sendmsg+0x135/0x1e0
> >    ? __pfx____sys_sendmsg+0x10/0x10
> >    ? __fget_light+0x173/0x210
> >    __sys_sendmsg+0x117/0x1f0
> >    ? __pfx___sys_sendmsg+0x10/0x10
> >    ? __ia32_sys_futex_time32+0x1da/0x460
> >    __do_fast_syscall_32+0x73/0x120
> >    do_fast_syscall_32+0x32/0x80
> >    entry_SYSENTER_compat_after_hwframe+0x84/0x8e
> >   RIP: 0023:0xf7f20579
> >   Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01
> > 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5
> > 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00
> > 00 00 00
> >   RSP: 002b:00000000f56c656c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
> >   RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00000000200003c0
> >   RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >   RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> >   R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
> > SYZFAIL: failed to recv rpc fd=3 want=4 recv=0 n=0 (errno 9: Bad file descriptor)
> >   R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> >   </TASK>
> >   Kernel panic - not syncing: kernel: panic_on_warn set ...
> > 
> > In order to fix it, don't print kern
> > 
> > Fixes: a92fbeac7e94 ("RDMA/cache: Release GID table even if leak is detected")
> > Reported-by: syzbot+b8b7a6774bf40cf8296b@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/all/000000000000d70eed06211ac86b@google.com
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >   drivers/infiniband/core/cache.c | 8 +++++---
> >   1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index b7c078b7f7cf..c6aec2e04d4c 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
> >   		return;
> >   	for (i = 0; i < table->sz; i++) {
> > +		int gid_kref;
> > +
> >   		if (is_gid_entry_free(table->data_vec[i]))
> >   			continue;
> > -		WARN_ONCE(true,
> > +		gid_kref = kref_read(&table->data_vec[i]->kref);
> > +		WARN_ONCE(gid_kref > 1,
> >   			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > -			  dev_name(&device->dev), i,
> > -			  kref_read(&table->data_vec[i]->kref));
> > +			  dev_name(&device->dev), i, gid_kref);
> >   	}
> >   	mutex_destroy(&table->lock);
> 
>
Jason Gunthorpe Sept. 4, 2024, 2:31 p.m. UTC | #4
On Mon, Sep 02, 2024 at 04:42:52PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> Failure in driver initialization can lead to a situation where the GID
> entries are set but not used yet. In this case, the kref will be equal to 1,
> which will trigger a false positive leak detection.

Why does that happen??


> For example, these messages are printed during the driver initialization
> and followed by release_gid_table() call:
> 
>  infiniband syz1: ib_query_port failed (-19)
>  infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache

Okay, but who set the ref=1?

> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index b7c078b7f7cf..c6aec2e04d4c 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
>  		return;
>  
>  	for (i = 0; i < table->sz; i++) {
> +		int gid_kref;
> +
>  		if (is_gid_entry_free(table->data_vec[i]))
>  			continue;
>  
> -		WARN_ONCE(true,
> +		gid_kref = kref_read(&table->data_vec[i]->kref);
> +		WARN_ONCE(gid_kref > 1,
>  			  "GID entry ref leak for dev %s index %d ref=%u\n",
> -			  dev_name(&device->dev), i,
> -			  kref_read(&table->data_vec[i]->kref));
> +			  dev_name(&device->dev), i, gid_kref);
>  	}

I'm not convinced, I think the bug here is something wrong on the
refcounting side not the freeing side. Ref should not be 1. Seems like
missing error unwinding in the init side.

Jason
Leon Romanovsky Sept. 4, 2024, 3:34 p.m. UTC | #5
On Wed, Sep 04, 2024 at 11:31:13AM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 02, 2024 at 04:42:52PM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > Failure in driver initialization can lead to a situation where the GID
> > entries are set but not used yet. In this case, the kref will be equal to 1,
> > which will trigger a false positive leak detection.
> 
> Why does that happen??
> 
> 
> > For example, these messages are printed during the driver initialization
> > and followed by release_gid_table() call:
> > 
> >  infiniband syz1: ib_query_port failed (-19)
> >  infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
> 
> Okay, but who set the ref=1?
> 
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index b7c078b7f7cf..c6aec2e04d4c 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
> >  		return;
> >  
> >  	for (i = 0; i < table->sz; i++) {
> > +		int gid_kref;
> > +
> >  		if (is_gid_entry_free(table->data_vec[i]))
> >  			continue;
> >  
> > -		WARN_ONCE(true,
> > +		gid_kref = kref_read(&table->data_vec[i]->kref);
> > +		WARN_ONCE(gid_kref > 1,
> >  			  "GID entry ref leak for dev %s index %d ref=%u\n",
> > -			  dev_name(&device->dev), i,
> > -			  kref_read(&table->data_vec[i]->kref));
> > +			  dev_name(&device->dev), i, gid_kref);
> >  	}
> 
> I'm not convinced, I think the bug here is something wrong on the
> refcounting side not the freeing side. Ref should not be 1. Seems like
> missing error unwinding in the init side.

I dropped this patch as the real fix is here 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow cleanup")

Thanks

> 
> Jason
>
Zhu Yanjun Sept. 5, 2024, 6:54 a.m. UTC | #6
在 2024/9/4 23:34, Leon Romanovsky 写道:
> On Wed, Sep 04, 2024 at 11:31:13AM -0300, Jason Gunthorpe wrote:
>> On Mon, Sep 02, 2024 at 04:42:52PM +0300, Leon Romanovsky wrote:
>>> From: Leon Romanovsky <leonro@nvidia.com>
>>>
>>> Failure in driver initialization can lead to a situation where the GID
>>> entries are set but not used yet. In this case, the kref will be equal to 1,
>>> which will trigger a false positive leak detection.
>>
>> Why does that happen??
>>
>>
>>> For example, these messages are printed during the driver initialization
>>> and followed by release_gid_table() call:
>>>
>>>   infiniband syz1: ib_query_port failed (-19)
>>>   infiniband syz1: Couldn't set up InfiniBand P_Key/GID cache
>>
>> Okay, but who set the ref=1?
>>
>>> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
>>> index b7c078b7f7cf..c6aec2e04d4c 100644
>>> --- a/drivers/infiniband/core/cache.c
>>> +++ b/drivers/infiniband/core/cache.c
>>> @@ -800,13 +800,15 @@ static void release_gid_table(struct ib_device *device,
>>>   		return;
>>>   
>>>   	for (i = 0; i < table->sz; i++) {
>>> +		int gid_kref;
>>> +
>>>   		if (is_gid_entry_free(table->data_vec[i]))
>>>   			continue;
>>>   
>>> -		WARN_ONCE(true,
>>> +		gid_kref = kref_read(&table->data_vec[i]->kref);
>>> +		WARN_ONCE(gid_kref > 1,
>>>   			  "GID entry ref leak for dev %s index %d ref=%u\n",
>>> -			  dev_name(&device->dev), i,
>>> -			  kref_read(&table->data_vec[i]->kref));
>>> +			  dev_name(&device->dev), i, gid_kref);
>>>   	}
>>
>> I'm not convinced, I think the bug here is something wrong on the
>> refcounting side not the freeing side. Ref should not be 1. Seems like
>> missing error unwinding in the init side.
> 
> I dropped this patch as the real fix is here 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow cleanup")

The commit 1403c8b14765 ("IB/core: Fix ib_cache_setup_one error flow 
cleanup") is in the link 
https://patchwork.kernel.org/project/linux-rdma/patch/79137687d829899b0b1c9835fcb4b258004c439a.1725273354.git.leon@kernel.org/

Zhu Yanjun

> 
> Thanks
> 
>>
>> Jason
>>
diff mbox series

Patch

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index b7c078b7f7cf..c6aec2e04d4c 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -800,13 +800,15 @@  static void release_gid_table(struct ib_device *device,
 		return;
 
 	for (i = 0; i < table->sz; i++) {
+		int gid_kref;
+
 		if (is_gid_entry_free(table->data_vec[i]))
 			continue;
 
-		WARN_ONCE(true,
+		gid_kref = kref_read(&table->data_vec[i]->kref);
+		WARN_ONCE(gid_kref > 1,
 			  "GID entry ref leak for dev %s index %d ref=%u\n",
-			  dev_name(&device->dev), i,
-			  kref_read(&table->data_vec[i]->kref));
+			  dev_name(&device->dev), i, gid_kref);
 	}
 
 	mutex_destroy(&table->lock);