[rdma-next] RDMA/rxe: Fix slab-out-bounda access which lead to kernel crash later

Message ID 20190312081544.5756-1-leon@kernel.org (mailing list archive)
State Mainlined
Commit a4b7013db23e93824ac53083eeb3e4efdef4b5b0
Delegated to: Jason Gunthorpe

Commit Message

Leon Romanovsky March 12, 2019, 8:15 a.m. UTC
From: Leon Romanovsky <leonro@mellanox.com>

[   80.194474] BUG: KASAN: slab-out-of-bounds in rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
[   80.194852] Read of size 8 at addr ffff88805c01a608 by task ib_send_bw/573
[   80.195245]
[   80.195389] CPU: 24 PID: 573 Comm: ib_send_bw Not tainted 5.0.0-rc5+ #189
[   80.195772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[   80.196436] Call Trace:
[   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
[   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
[   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
[   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
[   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
[   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
[   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
[   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
[   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
[   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
[   80.204980]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
[   80.206553]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
[   80.207298]  do_vfs_ioctl+0x193/0x1440
[   80.209126]  ksys_ioctl+0x3a/0x70
[   80.209266]  __x64_sys_ioctl+0x6f/0xb0
[   80.209415]  do_syscall_64+0x13f/0x570
[   80.210320]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   80.210508] RIP: 0033:0x7fa2399aa09b
[   80.210651] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01  48
[   80.211272] RSP: 002b:00007ffce51e7c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   80.211567] RAX: ffffffffffffffda RBX: 00007ffce51e7cf0 RCX: 00007fa2399aa09b
[   80.211835] RDX: 00007ffce51e7d10 RSI: 00000000c0181b01 RDI: 0000000000000003
[   80.212133] RBP: 00007ffce51e7d28 R08: 0000000000000028 R09: 00007ffce51e7ea4
[   80.212409] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000023d6420
[   80.212693] R13: 00007ffce51e7cf0 R14: 00007ffce51e7eb8 R15: 0000000000000000
[   80.212972]
[   80.213066] Allocated by task 573:
[   80.213208]  __kasan_kmalloc.constprop.5+0xc1/0xd0
[   80.213392]  __kmalloc+0x161/0x310
[   80.213536]  rxe_mem_alloc+0x52/0x470 [rdma_rxe]
[   80.213719]  rxe_mem_init_user+0x113/0x740 [rdma_rxe]
[   80.213913]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
[   80.214121]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
[   80.214309]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
[   80.214584]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
[   80.214769]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
[   80.214971]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
[   80.215156]  do_vfs_ioctl+0x193/0x1440
[   80.215296]  ksys_ioctl+0x3a/0x70
[   80.215435]  __x64_sys_ioctl+0x6f/0xb0
[   80.215572]  do_syscall_64+0x13f/0x570
[   80.215708]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[   80.215886]
[   80.215995] Freed by task 0:
[   80.216134]  __kasan_slab_free+0x12e/0x180
[   80.216278]  kfree+0x10a/0x2c0
[   80.216445]  rcu_process_callbacks+0xa77/0x1260
[   80.216637]  __do_softirq+0x2ad/0xacb
[   80.216771]
[   80.216867] The buggy address belongs to the object at ffff88805c01a588
[   80.216867]  which belongs to the cache kmalloc-128 of size 128
[   80.217281] The buggy address is located 0 bytes to the right of
[   80.217281]  128-byte region [ffff88805c01a588, ffff88805c01a608)
[   80.217684] The buggy address belongs to the page:
[   80.217871] page:ffffea0001700600 count:1 mapcount:0 mapping:ffff8880648173c0 index:0xffff88805c018008 compound_mapcount: 0
[   80.218236] flags: 0x4000000000010200(slab|head)
[   80.218420] raw: 4000000000010200 ffffea0001786b08 ffff888064800990 ffff8880648173c0
[   80.218707] raw: ffff88805c018008 0000000000220011 00000001ffffffff 0000000000000000
[   80.218984] page dumped because: kasan: bad access detected
[   80.219166]
[   80.219261] Memory state around the buggy address:
[   80.219451]  ffff88805c01a500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   80.219724]  ffff88805c01a580: fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   80.220007] >ffff88805c01a600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   80.220275]                       ^
[   80.220418]  ffff88805c01a680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   80.220689]  ffff88805c01a700: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb

Test scenario:
 ib_send_bw -x 1 -d rxe0 -a &
 ib_send_bw -x 1 -d rxe0 -a localhost

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Reported-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Comments

Leon Romanovsky March 12, 2019, 9:24 a.m. UTC | #1
Subject should be "slab-out-bounds" and not "slab-out-bounda".

Thanks

On Tue, Mar 12, 2019 at 10:15:44AM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
>
> [   80.194474] BUG: KASAN: slab-out-of-bounds in rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> [   80.194852] Read of size 8 at addr ffff88805c01a608 by task ib_send_bw/573
> [   80.195245]
> [   80.195389] CPU: 24 PID: 573 Comm: ib_send_bw Not tainted 5.0.0-rc5+ #189
> [   80.195772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> [   80.196436] Call Trace:
> [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> [   80.204980]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
> [   80.206553]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
> [   80.207298]  do_vfs_ioctl+0x193/0x1440
> [   80.209126]  ksys_ioctl+0x3a/0x70
> [   80.209266]  __x64_sys_ioctl+0x6f/0xb0
> [   80.209415]  do_syscall_64+0x13f/0x570
> [   80.210320]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [   80.210508] RIP: 0033:0x7fa2399aa09b
> [   80.210651] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00
> 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01  48
> [   80.211272] RSP: 002b:00007ffce51e7c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [   80.211567] RAX: ffffffffffffffda RBX: 00007ffce51e7cf0 RCX: 00007fa2399aa09b
> [   80.211835] RDX: 00007ffce51e7d10 RSI: 00000000c0181b01 RDI: 0000000000000003
> [   80.212133] RBP: 00007ffce51e7d28 R08: 0000000000000028 R09: 00007ffce51e7ea4
> [   80.212409] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000023d6420
> [   80.212693] R13: 00007ffce51e7cf0 R14: 00007ffce51e7eb8 R15: 0000000000000000
> [   80.212972]
> [   80.213066] Allocated by task 573:
> [   80.213208]  __kasan_kmalloc.constprop.5+0xc1/0xd0
> [   80.213392]  __kmalloc+0x161/0x310
> [   80.213536]  rxe_mem_alloc+0x52/0x470 [rdma_rxe]
> [   80.213719]  rxe_mem_init_user+0x113/0x740 [rdma_rxe]
> [   80.213913]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> [   80.214121]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> [   80.214309]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> [   80.214584]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> [   80.214769]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
> [   80.214971]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
> [   80.215156]  do_vfs_ioctl+0x193/0x1440
> [   80.215296]  ksys_ioctl+0x3a/0x70
> [   80.215435]  __x64_sys_ioctl+0x6f/0xb0
> [   80.215572]  do_syscall_64+0x13f/0x570
> [   80.215708]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [   80.215886]
> [   80.215995] Freed by task 0:
> [   80.216134]  __kasan_slab_free+0x12e/0x180
> [   80.216278]  kfree+0x10a/0x2c0
> [   80.216445]  rcu_process_callbacks+0xa77/0x1260
> [   80.216637]  __do_softirq+0x2ad/0xacb
> [   80.216771]
> [   80.216867] The buggy address belongs to the object at ffff88805c01a588
> [   80.216867]  which belongs to the cache kmalloc-128 of size 128
> [   80.217281] The buggy address is located 0 bytes to the right of
> [   80.217281]  128-byte region [ffff88805c01a588, ffff88805c01a608)
> [   80.217684] The buggy address belongs to the page:
> [   80.217871] page:ffffea0001700600 count:1 mapcount:0 mapping:ffff8880648173c0 index:0xffff88805c018008 compound_mapcount: 0
> [   80.218236] flags: 0x4000000000010200(slab|head)
> [   80.218420] raw: 4000000000010200 ffffea0001786b08 ffff888064800990 ffff8880648173c0
> [   80.218707] raw: ffff88805c018008 0000000000220011 00000001ffffffff 0000000000000000
> [   80.218984] page dumped because: kasan: bad access detected
> [   80.219166]
> [   80.219261] Memory state around the buggy address:
> [   80.219451]  ffff88805c01a500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   80.219724]  ffff88805c01a580: fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   80.220007] >ffff88805c01a600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   80.220275]                       ^
> [   80.220418]  ffff88805c01a680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   80.220689]  ffff88805c01a700: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
>
> Test scenario:
>  ib_send_bw -x 1 -d rxe0 -a &
>  ib_send_bw -x 1 -d rxe0 -a localhost
>
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Reported-by: Parav Pandit <parav@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_mr.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index 42f0f25e396c..ec89fbd06c53 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -199,6 +199,12 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
>  		buf = map[0]->buf;
>
>  		for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
> +			if (num_buf >= RXE_BUF_PER_MAP) {
> +				map++;
> +				buf = map[0]->buf;
> +				num_buf = 0;
> +			}
> +
>  			vaddr = page_address(sg_page_iter_page(&sg_iter));
>  			if (!vaddr) {
>  				pr_warn("null vaddr\n");
> @@ -211,11 +217,6 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
>  			num_buf++;
>  			buf++;
>
> -			if (num_buf >= RXE_BUF_PER_MAP) {
> -				map++;
> -				buf = map[0]->buf;
> -				num_buf = 0;
> -			}
>  		}
>  	}
>
> --
> 2.19.1
>
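
A note on why moving the check matters, as a minimal userspace sketch rather than the driver code itself. It assumes the map array is sized to hold exactly the number of pages, rounded up to whole maps; BUF_PER_MAP, maps_needed() and both oob_steps_*() helpers are hypothetical names used only for illustration. With the check at the bottom of the loop, filling the last slot of the last map still advances map and reads map[0]->buf one element past the array, which would be consistent with the KASAN report of an 8-byte read located 0 bytes to the right of the allocated region. With the check at the top, the advance happens only when another page actually remains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RXE_BUF_PER_MAP; the real value differs. */
#define BUF_PER_MAP 4

/* Maps needed to hold num_pages buffers, rounded up (assumed sizing). */
static size_t maps_needed(size_t num_pages)
{
	return (num_pages + BUF_PER_MAP - 1) / BUF_PER_MAP;
}

/*
 * Old ordering: advance to the next map right after a slot is filled.
 * Returns how many times the loop would touch map[num_map] or beyond.
 */
static int oob_steps_check_after(size_t num_pages)
{
	size_t num_map = maps_needed(num_pages);
	size_t map_idx = 0, num_buf = 0;
	int oob = 0;

	for (size_t page = 0; page < num_pages; page++) {
		assert(map_idx < num_map);	/* the slot being filled is valid */
		num_buf++;			/* fill one buf, then buf++ */

		if (num_buf >= BUF_PER_MAP) {
			map_idx++;		/* map++ */
			if (map_idx >= num_map)
				oob++;		/* buf = map[0]->buf reads past the array */
			num_buf = 0;
		}
	}
	return oob;
}

/*
 * New ordering: advance only when another page is about to be stored.
 */
static int oob_steps_check_before(size_t num_pages)
{
	size_t num_map = maps_needed(num_pages);
	size_t map_idx = 0, num_buf = 0;
	int oob = 0;

	for (size_t page = 0; page < num_pages; page++) {
		if (num_buf >= BUF_PER_MAP) {
			map_idx++;		/* map++ */
			if (map_idx >= num_map)
				oob++;		/* not reached under the assumed sizing */
			num_buf = 0;
		}

		assert(map_idx < num_map);	/* the slot being filled is valid */
		num_buf++;			/* fill one buf, then buf++ */
	}
	return oob;
}
```

With BUF_PER_MAP set to 4, oob_steps_check_after(8) reports one out-of-bounds step while oob_steps_check_before(8) reports none. A page count of 7 is clean under both orderings, which may be one reason the problem does not show up on every setup.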
Zhu Yanjun March 13, 2019, 2:30 a.m. UTC | #2
On 2019/3/12 16:15, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
>
> [   80.194474] BUG: KASAN: slab-out-of-bounds in rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> [   80.194852] Read of size 8 at addr ffff88805c01a608 by task ib_send_bw/573
> [   80.195245]
> [   80.195389] CPU: 24 PID: 573 Comm: ib_send_bw Not tainted 5.0.0-rc5+ #189
> [   80.195772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> [   80.196436] Call Trace:
> [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> [   80.204980]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
> [   80.206553]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
> [   80.207298]  do_vfs_ioctl+0x193/0x1440
> [   80.209126]  ksys_ioctl+0x3a/0x70
> [   80.209266]  __x64_sys_ioctl+0x6f/0xb0
> [   80.209415]  do_syscall_64+0x13f/0x570
> [   80.210320]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [   80.210508] RIP: 0033:0x7fa2399aa09b
> [   80.210651] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00
> 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01  48
> [   80.211272] RSP: 002b:00007ffce51e7c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [   80.211567] RAX: ffffffffffffffda RBX: 00007ffce51e7cf0 RCX: 00007fa2399aa09b
> [   80.211835] RDX: 00007ffce51e7d10 RSI: 00000000c0181b01 RDI: 0000000000000003
> [   80.212133] RBP: 00007ffce51e7d28 R08: 0000000000000028 R09: 00007ffce51e7ea4
> [   80.212409] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000023d6420
> [   80.212693] R13: 00007ffce51e7cf0 R14: 00007ffce51e7eb8 R15: 0000000000000000
> [   80.212972]
> [   80.213066] Allocated by task 573:
> [   80.213208]  __kasan_kmalloc.constprop.5+0xc1/0xd0
> [   80.213392]  __kmalloc+0x161/0x310
> [   80.213536]  rxe_mem_alloc+0x52/0x470 [rdma_rxe]
> [   80.213719]  rxe_mem_init_user+0x113/0x740 [rdma_rxe]
> [   80.213913]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> [   80.214121]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> [   80.214309]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> [   80.214584]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> [   80.214769]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
> [   80.214971]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
> [   80.215156]  do_vfs_ioctl+0x193/0x1440
> [   80.215296]  ksys_ioctl+0x3a/0x70
> [   80.215435]  __x64_sys_ioctl+0x6f/0xb0
> [   80.215572]  do_syscall_64+0x13f/0x570
> [   80.215708]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> [   80.215886]
> [   80.215995] Freed by task 0:
> [   80.216134]  __kasan_slab_free+0x12e/0x180
> [   80.216278]  kfree+0x10a/0x2c0
> [   80.216445]  rcu_process_callbacks+0xa77/0x1260
> [   80.216637]  __do_softirq+0x2ad/0xacb
> [   80.216771]
> [   80.216867] The buggy address belongs to the object at ffff88805c01a588
> [   80.216867]  which belongs to the cache kmalloc-128 of size 128
> [   80.217281] The buggy address is located 0 bytes to the right of
> [   80.217281]  128-byte region [ffff88805c01a588, ffff88805c01a608)
> [   80.217684] The buggy address belongs to the page:
> [   80.217871] page:ffffea0001700600 count:1 mapcount:0 mapping:ffff8880648173c0 index:0xffff88805c018008 compound_mapcount: 0
> [   80.218236] flags: 0x4000000000010200(slab|head)
> [   80.218420] raw: 4000000000010200 ffffea0001786b08 ffff888064800990 ffff8880648173c0
> [   80.218707] raw: ffff88805c018008 0000000000220011 00000001ffffffff 0000000000000000
> [   80.218984] page dumped because: kasan: bad access detected
> [   80.219166]
> [   80.219261] Memory state around the buggy address:
> [   80.219451]  ffff88805c01a500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   80.219724]  ffff88805c01a580: fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   80.220007] >ffff88805c01a600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   80.220275]                       ^
> [   80.220418]  ffff88805c01a680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> [   80.220689]  ffff88805c01a700: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
>
> Test scenario:
>   ib_send_bw -x 1 -d rxe0 -a &
>   ib_send_bw -x 1 -d rxe0 -a localhost

With the above test commands, I cannot reproduce this problem. Does it
need some other condition to trigger it?

The following is the test result.

[root@localhost ~]# uname -a
Linux localhost.localdomain 5.0.0-rc7+ #1 SMP Sun Feb 24 00:33:33 EST 2019 x86_64 x86_64 x86_64 GNU/Linux

[root@localhost ~]# ib_send_bw -x 1 -d rxe0 -a &
[1] 12355
[root@localhost ~]#
************************************
* Waiting for client to connect... *
************************************

[root@localhost ~]# ib_send_bw -x 1 -d rxe0 -a localhost
---------------------------------------------------------------------------------------
                     Send BW Test
  Dual-port       : OFF          Device         : rxe0
  Number of qps   : 1            Transport type : IB
  Connection type : RC           Using SRQ      : OFF
  RX depth        : 512
  CQ Moderation   : 100
  Mtu             : 1024[B]
  Link type       : Ethernet
  GID index       : 1
  Max inline data : 0[B]
  rdma_cm QPs     : OFF
  Data ex. method : Ethernet
---------------------------------------------------------------------------------------
  local address: LID 0000 QPN 0x0013 PSN 0xbbaf32
  GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:211:03:76
---------------------------------------------------------------------------------------
                     Send BW Test
  Dual-port       : OFF          Device         : rxe0
  Number of qps   : 1            Transport type : IB
  Connection type : RC           Using SRQ      : OFF
  TX depth        : 128
  CQ Moderation   : 100
  Mtu             : 1024[B]
  Link type       : Ethernet
  GID index       : 1
  Max inline data : 0[B]
  rdma_cm QPs     : OFF
  Data ex. method : Ethernet
---------------------------------------------------------------------------------------
  local address: LID 0000 QPN 0x0014 PSN 0x4cfe54
  GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:211:03:76
  remote address: LID 0000 QPN 0x0014 PSN 0x4cfe54
  GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:211:03:76
  remote address: LID 0000 QPN 0x0013 PSN 0xbbaf32
  GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:211:03:76
---------------------------------------------------------------------------------------
  #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
---------------------------------------------------------------------------------------
  #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
  2          1000             0.00               0.47               0.247933
  2          1000             0.48               0.47               0.246355
  4          1000             0.00               0.94               0.247689
  4          1000             0.95               0.94               0.247094
  8          1000             0.00               1.92               0.251087
  8          1000             1.93               1.91               0.250482
  16         1000             0.00               3.80               0.249234
  16         1000             3.82               3.79               0.248637
  32         1000             0.00               7.61               0.249468
  32         1000             7.69               7.59               0.248859
  64         1000             0.00               15.01              0.245916
  64         1000             15.15              14.97              0.245330
  128        1000             0.00               27.99              0.229257
  128        1000             28.08              27.92              0.228700
  256        1000             0.00               55.02              0.225382
  256        1000             55.26              54.89              0.224844
  512        1000             0.00               104.48             0.213984
  512        1000             105.18             104.25             0.213500
  1024       1000             0.00               193.27             0.197909
  1024       1000             194.40             192.86             0.197485
  2048       1000             0.00               261.74             0.134011
  2048       1000             263.59             261.25             0.133758
  4096       1000             0.00               331.09             0.084759
  4096       1000             332.41             330.55             0.084622
  8192       1000             0.00               374.75             0.047968
  8192       1000             374.63             374.22             0.047900
  16384      1000             0.00               400.41             0.025626
  16384      1000             401.26             399.91             0.025594
  32768      1000             0.00               417.90             0.013373
  32768      1000             417.63             417.41             0.013357
  65536      1000             0.00               426.77             0.006828
  65536      1000             427.17             426.29             0.006821
  131072     1000             0.00               427.31             0.003418
  131072     1000             427.75             426.84             0.003415
  262144     1000             0.00               424.07             0.001696
  262144     1000             425.43             423.62             0.001694
  524288     1000             0.00               424.54             0.000849
  524288     1000             426.71             424.09             0.000848
  1048576    1000             0.00               441.52             0.000442
  1048576    1000             450.96             441.03             0.000441
  2097152    1000             0.00               426.05             0.000213
  2097152    1000             425.63             425.59             0.000213
  4194304    1000             0.00               428.65             0.000107
  4194304    1000             448.96             428.19             0.000107
  8388608    1000             0.00               405.47             0.000051
  8388608    1000             407.23             405.06             0.000051
---------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------
[1]+  Done                    ib_send_bw -x 1 -d rxe0 -a

>
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Reported-by: Parav Pandit <parav@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_mr.c | 11 ++++++-----
>   1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index 42f0f25e396c..ec89fbd06c53 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -199,6 +199,12 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
>   		buf = map[0]->buf;
>   
>   		for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
> +			if (num_buf >= RXE_BUF_PER_MAP) {
> +				map++;
> +				buf = map[0]->buf;
> +				num_buf = 0;
> +			}
> +
>   			vaddr = page_address(sg_page_iter_page(&sg_iter));
>   			if (!vaddr) {
>   				pr_warn("null vaddr\n");
> @@ -211,11 +217,6 @@ int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
>   			num_buf++;
>   			buf++;
>   
> -			if (num_buf >= RXE_BUF_PER_MAP) {
> -				map++;
> -				buf = map[0]->buf;
> -				num_buf = 0;
> -			}
>   		}
>   	}
>
Leon Romanovsky March 13, 2019, 4:15 a.m. UTC | #3
On Wed, Mar 13, 2019 at 10:30:09AM +0800, Yanjun Zhu wrote:
>
> On 2019/3/12 16:15, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > [   80.194474] BUG: KASAN: slab-out-of-bounds in rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> > [   80.194852] Read of size 8 at addr ffff88805c01a608 by task ib_send_bw/573
> > [   80.195245]
> > [   80.195389] CPU: 24 PID: 573 Comm: ib_send_bw Not tainted 5.0.0-rc5+ #189
> > [   80.195772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
> > [   80.196436] Call Trace:
> > [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> > [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> > [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> > [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> > [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> > [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
> > [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> > [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> > [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> > [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> > [   80.204980]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
> > [   80.206553]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
> > [   80.207298]  do_vfs_ioctl+0x193/0x1440
> > [   80.209126]  ksys_ioctl+0x3a/0x70
> > [   80.209266]  __x64_sys_ioctl+0x6f/0xb0
> > [   80.209415]  do_syscall_64+0x13f/0x570
> > [   80.210320]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > [   80.210508] RIP: 0033:0x7fa2399aa09b
> > [   80.210651] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00
> > 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f
> > 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01  48
> > [   80.211272] RSP: 002b:00007ffce51e7c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > [   80.211567] RAX: ffffffffffffffda RBX: 00007ffce51e7cf0 RCX: 00007fa2399aa09b
> > [   80.211835] RDX: 00007ffce51e7d10 RSI: 00000000c0181b01 RDI: 0000000000000003
> > [   80.212133] RBP: 00007ffce51e7d28 R08: 0000000000000028 R09: 00007ffce51e7ea4
> > [   80.212409] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000023d6420
> > [   80.212693] R13: 00007ffce51e7cf0 R14: 00007ffce51e7eb8 R15: 0000000000000000
> > [   80.212972]
> > [   80.213066] Allocated by task 573:
> > [   80.213208]  __kasan_kmalloc.constprop.5+0xc1/0xd0
> > [   80.213392]  __kmalloc+0x161/0x310
> > [   80.213536]  rxe_mem_alloc+0x52/0x470 [rdma_rxe]
> > [   80.213719]  rxe_mem_init_user+0x113/0x740 [rdma_rxe]
> > [   80.213913]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
> > [   80.214121]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
> > [   80.214309]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
> > [   80.214584]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
> > [   80.214769]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
> > [   80.214971]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
> > [   80.215156]  do_vfs_ioctl+0x193/0x1440
> > [   80.215296]  ksys_ioctl+0x3a/0x70
> > [   80.215435]  __x64_sys_ioctl+0x6f/0xb0
> > [   80.215572]  do_syscall_64+0x13f/0x570
> > [   80.215708]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > [   80.215886]
> > [   80.215995] Freed by task 0:
> > [   80.216134]  __kasan_slab_free+0x12e/0x180
> > [   80.216278]  kfree+0x10a/0x2c0
> > [   80.216445]  rcu_process_callbacks+0xa77/0x1260
> > [   80.216637]  __do_softirq+0x2ad/0xacb
> > [   80.216771]
> > [   80.216867] The buggy address belongs to the object at ffff88805c01a588
> > [   80.216867]  which belongs to the cache kmalloc-128 of size 128
> > [   80.217281] The buggy address is located 0 bytes to the right of
> > [   80.217281]  128-byte region [ffff88805c01a588, ffff88805c01a608)
> > [   80.217684] The buggy address belongs to the page:
> > [   80.217871] page:ffffea0001700600 count:1 mapcount:0 mapping:ffff8880648173c0 index:0xffff88805c018008 compound_mapcount: 0
> > [   80.218236] flags: 0x4000000000010200(slab|head)
> > [   80.218420] raw: 4000000000010200 ffffea0001786b08 ffff888064800990 ffff8880648173c0
> > [   80.218707] raw: ffff88805c018008 0000000000220011 00000001ffffffff 0000000000000000
> > [   80.218984] page dumped because: kasan: bad access detected
> > [   80.219166]
> > [   80.219261] Memory state around the buggy address:
> > [   80.219451]  ffff88805c01a500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > [   80.219724]  ffff88805c01a580: fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > [   80.220007] >ffff88805c01a600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > [   80.220275]                       ^
> > [   80.220418]  ffff88805c01a680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > [   80.220689]  ffff88805c01a700: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
> >
> > Test scenario:
> >   ib_send_bw -x 1 -d rxe0 -a &
> >   ib_send_bw -x 1 -d rxe0 -a localhost
>
> With the above test commands, I cannot reproduce this problem. Does it need
> some other condition to trigger it?

Nothing special: KASAN option enabled in kernel, latest GCC, rdma-next and
upstream version of perftest.

Thanks
Zhu Yanjun March 14, 2019, 4:03 a.m. UTC | #4
On 2019/3/13 12:15, Leon Romanovsky wrote:
> On Wed, Mar 13, 2019 at 10:30:09AM +0800, Yanjun Zhu wrote:
>> On 2019/3/12 16:15, Leon Romanovsky wrote:
>>> From: Leon Romanovsky <leonro@mellanox.com>
>>>
>>> [   80.194474] BUG: KASAN: slab-out-of-bounds in rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
>>> [   80.194852] Read of size 8 at addr ffff88805c01a608 by task ib_send_bw/573
>>> [   80.195245]
>>> [   80.195389] CPU: 24 PID: 573 Comm: ib_send_bw Not tainted 5.0.0-rc5+ #189
>>> [   80.195772] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
>>> [   80.196436] Call Trace:
>>> [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
>>> [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
>>> [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
>>> [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
>>> [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
>>> [   80.198760]  rxe_mem_init_user+0x6c1/0x740 [rdma_rxe]
>>> [   80.199603]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
>>> [   80.200210]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
>>> [   80.201522]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
>>> [   80.202351]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
>>> [   80.204980]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
>>> [   80.206553]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
>>> [   80.207298]  do_vfs_ioctl+0x193/0x1440
>>> [   80.209126]  ksys_ioctl+0x3a/0x70
>>> [   80.209266]  __x64_sys_ioctl+0x6f/0xb0
>>> [   80.209415]  do_syscall_64+0x13f/0x570
>>> [   80.210320]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> [   80.210508] RIP: 0033:0x7fa2399aa09b
>>> [   80.210651] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00
>>> 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f
>>> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01  48
>>> [   80.211272] RSP: 002b:00007ffce51e7c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>> [   80.211567] RAX: ffffffffffffffda RBX: 00007ffce51e7cf0 RCX: 00007fa2399aa09b
>>> [   80.211835] RDX: 00007ffce51e7d10 RSI: 00000000c0181b01 RDI: 0000000000000003
>>> [   80.212133] RBP: 00007ffce51e7d28 R08: 0000000000000028 R09: 00007ffce51e7ea4
>>> [   80.212409] R10: 00000000ffffffff R11: 0000000000000246 R12: 00000000023d6420
>>> [   80.212693] R13: 00007ffce51e7cf0 R14: 00007ffce51e7eb8 R15: 0000000000000000
>>> [   80.212972]
>>> [   80.213066] Allocated by task 573:
>>> [   80.213208]  __kasan_kmalloc.constprop.5+0xc1/0xd0
>>> [   80.213392]  __kmalloc+0x161/0x310
>>> [   80.213536]  rxe_mem_alloc+0x52/0x470 [rdma_rxe]
>>> [   80.213719]  rxe_mem_init_user+0x113/0x740 [rdma_rxe]
>>> [   80.213913]  rxe_reg_user_mr+0x9b/0x110 [rdma_rxe]
>>> [   80.214121]  ib_uverbs_reg_mr+0x428/0x9c0 [ib_uverbs]
>>> [   80.214309]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x2b0/0x410 [ib_uverbs]
>>> [   80.214584]  ib_uverbs_run_method+0x79c/0x1da0 [ib_uverbs]
>>> [   80.214769]  ib_uverbs_cmd_verbs+0x5f2/0xf20 [ib_uverbs]
>>> [   80.214971]  ib_uverbs_ioctl+0x202/0x310 [ib_uverbs]
>>> [   80.215156]  do_vfs_ioctl+0x193/0x1440
>>> [   80.215296]  ksys_ioctl+0x3a/0x70
>>> [   80.215435]  __x64_sys_ioctl+0x6f/0xb0
>>> [   80.215572]  do_syscall_64+0x13f/0x570
>>> [   80.215708]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> [   80.215886]
>>> [   80.215995] Freed by task 0:
>>> [   80.216134]  __kasan_slab_free+0x12e/0x180
>>> [   80.216278]  kfree+0x10a/0x2c0
>>> [   80.216445]  rcu_process_callbacks+0xa77/0x1260
>>> [   80.216637]  __do_softirq+0x2ad/0xacb
>>> [   80.216771]
>>> [   80.216867] The buggy address belongs to the object at ffff88805c01a588
>>> [   80.216867]  which belongs to the cache kmalloc-128 of size 128
>>> [   80.217281] The buggy address is located 0 bytes to the right of
>>> [   80.217281]  128-byte region [ffff88805c01a588, ffff88805c01a608)
>>> [   80.217684] The buggy address belongs to the page:
>>> [   80.217871] page:ffffea0001700600 count:1 mapcount:0 mapping:ffff8880648173c0 index:0xffff88805c018008 compound_mapcount: 0
>>> [   80.218236] flags: 0x4000000000010200(slab|head)
>>> [   80.218420] raw: 4000000000010200 ffffea0001786b08 ffff888064800990 ffff8880648173c0
>>> [   80.218707] raw: ffff88805c018008 0000000000220011 00000001ffffffff 0000000000000000
>>> [   80.218984] page dumped because: kasan: bad access detected
>>> [   80.219166]
>>> [   80.219261] Memory state around the buggy address:
>>> [   80.219451]  ffff88805c01a500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> [   80.219724]  ffff88805c01a580: fc 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> [   80.220007] >ffff88805c01a600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> [   80.220275]                       ^
>>> [   80.220418]  ffff88805c01a680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> [   80.220689]  ffff88805c01a700: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
>>>
>>> Test scenario:
>>>    ib_send_bw -x 1 -d rxe0 -a &
>>>    ib_send_bw -x 1 -d rxe0 -a localhost
>> With the above test commands, I cannot reproduce this problem. Does it need
>> any other condition to trigger it?
> Nothing special: KASAN option enabled in kernel, latest GCC, rdma-next and
> upstream version of perftest.

Thanks. With the KASAN option enabled in the kernel, on Ubuntu 16.04 with all
packages updated, I built the latest kernel (with this patch).

ib_send_bw --version
Version: 5.60

The above call trace does not appear. It seems that this patch works
well in my test environment.

Zhu Yanjun

>
> Thanks
>
Leon Romanovsky March 14, 2019, 7:01 a.m. UTC | #5
On Thu, Mar 14, 2019 at 12:03:40PM +0800, Yanjun Zhu wrote:
>
> On 2019/3/13 12:15, Leon Romanovsky wrote:
> > On Wed, Mar 13, 2019 at 10:30:09AM +0800, Yanjun Zhu wrote:
> > > On 2019/3/12 16:15, Leon Romanovsky wrote:
> > > > From: Leon Romanovsky <leonro@mellanox.com>
> > > >
> > > >
> > > > Test scenario:
> > > >    ib_send_bw -x 1 -d rxe0 -a &
> > > >    ib_send_bw -x 1 -d rxe0 -a localhost
> > > With the above test commands, I cannot reproduce this problem. Does it need
> > > any other condition to trigger it?
> > Nothing special: KASAN option enabled in kernel, latest GCC, rdma-next and
> > upstream version of perftest.
>
> Thanks. With the KASAN option enabled in the kernel, on Ubuntu 16.04 with all
> packages updated, I built the latest kernel (with this patch).
>
> ib_send_bw --version
> Version: 5.60
>
> The above call trace does not appear. It seems that this patch works
> well in my test environment.

Thanks,

Can you please send your Tested-by or Reviewed-by for this patch?

>
> Zhu Yanjun
>
> >
> > Thanks
> >
Zhu Yanjun March 14, 2019, 7:56 a.m. UTC | #6
On 2019/3/14 15:01, Leon Romanovsky wrote:
> On Thu, Mar 14, 2019 at 12:03:40PM +0800, Yanjun Zhu wrote:
>> On 2019/3/13 12:15, Leon Romanovsky wrote:
>>> On Wed, Mar 13, 2019 at 10:30:09AM +0800, Yanjun Zhu wrote:
>>>> On 2019/3/12 16:15, Leon Romanovsky wrote:
>>>>> From: Leon Romanovsky <leonro@mellanox.com>
>>>>>
>>>>>
>>>>> Test scenario:
>>>>>     ib_send_bw -x 1 -d rxe0 -a &
>>>>>     ib_send_bw -x 1 -d rxe0 -a localhost
>>>> With the above test commands, I cannot reproduce this problem. Does it need
>>>> any other condition to trigger it?
>>> Nothing special: KASAN option enabled in kernel, latest GCC, rdma-next and
>>> upstream version of perftest.
>> Thanks. With the KASAN option enabled in the kernel, on Ubuntu 16.04 with all
>> packages updated, I built the latest kernel (with this patch).
>>
>> ib_send_bw --version
>> Version: 5.60
>>
>> The above call trace does not appear. It seems that this patch works
>> well in my test environment.
> Thanks,
>
> Can you please send your Tested-by or Reviewed-by for this patch?

OK.

Reviewed-and-tested-by: Zhu Yanjun <yanjun.zhu@oracle.com>

Zhu Yanjun

>
>> Zhu Yanjun
>>
>>> Thanks
>>>
Jason Gunthorpe March 26, 2019, 3:58 p.m. UTC | #7
On Tue, Mar 12, 2019 at 10:15:44AM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> 
> Test scenario:
>  ib_send_bw -x 1 -d rxe0 -a &
>  ib_send_bw -x 1 -d rxe0 -a localhost
> 
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Reported-by: Parav Pandit <parav@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_mr.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)

Applied to for-next, thanks.

Jason

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 42f0f25e396c..ec89fbd06c53 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -199,6 +199,12 @@  int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
 		buf = map[0]->buf;
 
 		for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
+			if (num_buf >= RXE_BUF_PER_MAP) {
+				map++;
+				buf = map[0]->buf;
+				num_buf = 0;
+			}
+
 			vaddr = page_address(sg_page_iter_page(&sg_iter));
 			if (!vaddr) {
 				pr_warn("null vaddr\n");
@@ -211,11 +217,6 @@  int rxe_mem_init_user(struct rxe_pd *pd, u64 start,
 			num_buf++;
 			buf++;
 
-			if (num_buf >= RXE_BUF_PER_MAP) {
-				map++;
-				buf = map[0]->buf;
-				num_buf = 0;
-			}
 		}
 	}
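
The reordering in the hunk above can be modelled in user space. The sketch
below is not the driver code: BUF_PER_MAP is an illustrative stand-in for
RXE_BUF_PER_MAP, and the two helper functions are hypothetical. Each one only
tracks the highest map index the loop would dereference, which appears to be
what the KASAN report is about (an 8-byte read directly past the allocated
map array).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for RXE_BUF_PER_MAP; the real value is not used here. */
#define BUF_PER_MAP 4

/*
 * Old ordering: advance to the next map immediately after storing a page.
 * Returns the highest map index that gets dereferenced.
 */
static size_t highest_map_index_old(size_t num_pages)
{
	size_t map = 0;		/* models: buf = map[0]->buf before the loop */
	size_t num_buf = 0;
	size_t highest = 0;

	assert(num_pages > 0);
	for (size_t i = 0; i < num_pages; i++) {
		/* ... store page i in the current buf slot ... */
		num_buf++;
		if (num_buf >= BUF_PER_MAP) {
			map++;		/* models: map++; buf = map[0]->buf; */
			highest = map;
			num_buf = 0;
		}
	}
	return highest;
}

/*
 * New ordering: advance only when another page is about to be stored.
 * Returns the highest map index that gets dereferenced.
 */
static size_t highest_map_index_new(size_t num_pages)
{
	size_t map = 0;		/* models: buf = map[0]->buf before the loop */
	size_t num_buf = 0;
	size_t highest = 0;

	assert(num_pages > 0);
	for (size_t i = 0; i < num_pages; i++) {
		if (num_buf >= BUF_PER_MAP) {
			map++;		/* models: map++; buf = map[0]->buf; */
			highest = map;
			num_buf = 0;
		}
		/* ... store page i in the current buf slot ... */
		num_buf++;
	}
	return highest;
}
```

With 8 pages and 4 buffers per map, two map entries (indices 0 and 1) are
enough. In this model the old ordering still dereferences index 2 after the
last page is stored, while the new ordering stops at index 1. When the page
count is not a multiple of BUF_PER_MAP (for example 5), both orderings stay
in bounds, which may explain why the problem does not show up in every
setup.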