Message ID | 20240809083148.1989912-4-liujian56@huawei.com |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | Make SMC-R can work with rxe devices |
On 2024/8/9 16:31, Liu Jian wrote:
> BUG: kernel NULL pointer dereference, address: 0000000000000238
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:dma_need_sync+0x5/0x60
> ...
> Call Trace:
>  <TASK>
>  ? dma_need_sync+0x5/0x60
>  ? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
>  smcr_buf_map_link+0x24a/0x380 [smc]
>  __smc_buf_create+0x483/0xb10 [smc]
>  smc_buf_create+0x21/0xe0 [smc]
>  smc_listen_work+0xf11/0x14f0 [smc]
>  ? smc_tcp_listen_work+0x364/0x520 [smc]
>  process_one_work+0x18d/0x3f0
>  worker_thread+0x304/0x440
>  kthread+0xe4/0x110
>  ret_from_fork+0x47/0x70
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
>
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
>  net/smc/smc_ib.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 382351ac9434..059822cc3fde 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
>  		    buf_slot->sgt[lnk->link_idx].nents, i) {
>  		if (!sg_dma_len(sg))
>  			break;
> +		if (!lnk->smcibdev->ibdev->dma_device)
> +			break;

LGTM.

Reviewed-by: Wen Gu <guwen@linux.alibaba.com>

>  		if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
>  				  sg_dma_address(sg))) {
>  			ret = true;
On 2024-08-09 16:31:47, Liu Jian wrote:
>BUG: kernel NULL pointer dereference, address: 0000000000000238
>PGD 0 P4D 0
>Oops: 0000 [#1] PREEMPT SMP PTI
>CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
>Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>Workqueue: smc_hs_wq smc_listen_work [smc]
>RIP: 0010:dma_need_sync+0x5/0x60
>...
>Call Trace:
> <TASK>
> ? dma_need_sync+0x5/0x60
> ? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
> smcr_buf_map_link+0x24a/0x380 [smc]
> __smc_buf_create+0x483/0xb10 [smc]
> smc_buf_create+0x21/0xe0 [smc]
> smc_listen_work+0xf11/0x14f0 [smc]
> ? smc_tcp_listen_work+0x364/0x520 [smc]
> process_one_work+0x18d/0x3f0
> worker_thread+0x304/0x440
> kthread+0xe4/0x110
> ret_from_fork+0x47/0x70
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
>If the software RoCE device is used, ibdev->dma_device is a null pointer.
>As a result, the problem occurs. Null pointer detection is added to
>prevent problems.
>
>Signed-off-by: Liu Jian <liujian56@huawei.com>

Reviewed-by: Dust Li <dust.li@linux.alibaba.com>

Best regards,
Dust

>---
> net/smc/smc_ib.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>index 382351ac9434..059822cc3fde 100644
>--- a/net/smc/smc_ib.c
>+++ b/net/smc/smc_ib.c
>@@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
> 		    buf_slot->sgt[lnk->link_idx].nents, i) {
> 		if (!sg_dma_len(sg))
> 			break;
>+		if (!lnk->smcibdev->ibdev->dma_device)
>+			break;
> 		if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
> 				  sg_dma_address(sg))) {
> 			ret = true;
>--
>2.34.1
>
On 8/9/24 4:31 PM, Liu Jian wrote:
> BUG: kernel NULL pointer dereference, address: 0000000000000238
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> Workqueue: smc_hs_wq smc_listen_work [smc]
> RIP: 0010:dma_need_sync+0x5/0x60
> ...
> Call Trace:
>  <TASK>
>  ? dma_need_sync+0x5/0x60
>  ? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
>  smcr_buf_map_link+0x24a/0x380 [smc]
>  __smc_buf_create+0x483/0xb10 [smc]
>  smc_buf_create+0x21/0xe0 [smc]
>  smc_listen_work+0xf11/0x14f0 [smc]
>  ? smc_tcp_listen_work+0x364/0x520 [smc]
>  process_one_work+0x18d/0x3f0
>  worker_thread+0x304/0x440
>  kthread+0xe4/0x110
>  ret_from_fork+0x47/0x70
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
>
> If the software RoCE device is used, ibdev->dma_device is a null pointer.
> As a result, the problem occurs. Null pointer detection is added to
> prevent problems.
>
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
>  net/smc/smc_ib.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 382351ac9434..059822cc3fde 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
>  		    buf_slot->sgt[lnk->link_idx].nents, i) {
>  		if (!sg_dma_len(sg))
>  			break;
> +		if (!lnk->smcibdev->ibdev->dma_device)
> +			break;
>  		if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
>  				  sg_dma_address(sg))) {
>  			ret = true;

Maybe you need to add a Fixes tag?
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 382351ac9434..059822cc3fde 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -748,6 +748,8 @@ bool smc_ib_is_sg_need_sync(struct smc_link *lnk,
 		    buf_slot->sgt[lnk->link_idx].nents, i) {
 		if (!sg_dma_len(sg))
 			break;
+		if (!lnk->smcibdev->ibdev->dma_device)
+			break;
 		if (dma_need_sync(lnk->smcibdev->ibdev->dma_device,
 				  sg_dma_address(sg))) {
 			ret = true;
BUG: kernel NULL pointer dereference, address: 0000000000000238
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 289 Comm: kworker/3:1 Kdump: loaded Tainted: G OE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
Workqueue: smc_hs_wq smc_listen_work [smc]
RIP: 0010:dma_need_sync+0x5/0x60
...
Call Trace:
 <TASK>
 ? dma_need_sync+0x5/0x60
 ? smc_ib_is_sg_need_sync+0x61/0xf0 [smc]
 smcr_buf_map_link+0x24a/0x380 [smc]
 __smc_buf_create+0x483/0xb10 [smc]
 smc_buf_create+0x21/0xe0 [smc]
 smc_listen_work+0xf11/0x14f0 [smc]
 ? smc_tcp_listen_work+0x364/0x520 [smc]
 process_one_work+0x18d/0x3f0
 worker_thread+0x304/0x440
 kthread+0xe4/0x110
 ret_from_fork+0x47/0x70
 ret_from_fork_asm+0x1a/0x30
 </TASK>

If the software RoCE device is used, ibdev->dma_device is a null pointer.
As a result, the problem occurs. Null pointer detection is added to
prevent problems.

Signed-off-by: Liu Jian <liujian56@huawei.com>
---
 net/smc/smc_ib.c | 2 ++
 1 file changed, 2 insertions(+)
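
For readers who want to see the control flow of the added check in isolation, here is a minimal userspace sketch of the patched loop. It is not the kernel code: struct mock_device, struct mock_sg, mock_dma_need_sync() and mock_sg_need_sync() are hypothetical stand-ins invented for illustration. The only assumption carried over from the patch is that the helper reads through the device pointer, as the oops at dma_need_sync+0x5 suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct mock_device {
	bool needs_sync;	/* pretend per-device DMA sync requirement */
};

struct mock_sg {
	unsigned int dma_len;	/* 0 marks the end of the mapped entries */
};

/*
 * Stand-in for dma_need_sync(): it reads through dev, so calling it
 * with a NULL dev is fatal (modelled here with an assertion).
 */
static bool mock_dma_need_sync(const struct mock_device *dev)
{
	assert(dev != NULL);
	return dev->needs_sync;
}

/*
 * Mirrors the shape of the patched loop: stop at the first empty entry,
 * and stop before touching the device when there is none.
 */
static bool mock_sg_need_sync(const struct mock_device *dma_device,
			      const struct mock_sg *sgl, size_t nents)
{
	size_t i;

	for (i = 0; i < nents; i++) {
		if (!sgl[i].dma_len)
			break;
		if (!dma_device)	/* the check added by the patch */
			break;
		if (mock_dma_need_sync(dma_device))
			return true;
	}
	return false;
}
```

In this sketch a NULL device makes the function return false, so the caller treats the buffer as not needing a sync. Whether that is the right answer for software RoCE devices in the real driver is a question for the patch discussion, not something this mock can settle.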