[0/3] Misc changes for siw

Message ID 20230818082318.17489-1-guoqing.jiang@linux.dev (mailing list archive)

Message

Guoqing Jiang Aug. 18, 2023, 8:23 a.m. UTC
Hi,

The first patch fixes the call trace below, which can happen if siw_connect
jumps to its error path (I manually set rv to -1 after siw_send_mpareqrep to
trigger it) after cep is allocated.

[   97.341035] ------------[ cut here ]------------
[   97.341037] WARNING: CPU: 0 PID: 143 at drivers/infiniband/sw/siw/siw_cm.c:444 siw_cep_put+0x1c5/0x1e0 [siw]
...
[   97.341126] CPU: 0 PID: 143 Comm: kworker/u4:4 Tainted: G           OE      6.5.0-rc3+ #16
[   97.341128] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
[   97.341130] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[   97.341137] RIP: 0010:siw_cep_put+0x1c5/0x1e0 [siw]
...
[   97.341159] Call Trace:
[   97.341160]  <TASK>
[   97.341162]  ? show_regs+0x72/0x90
[   97.341166]  ? siw_cep_put+0x1c5/0x1e0 [siw]
[   97.341170]  ? __warn+0x8d/0x1a0
[   97.341175]  ? siw_cep_put+0x1c5/0x1e0 [siw]
[   97.341180]  ? report_bug+0x1f9/0x250
[   97.341185]  ? handle_bug+0x46/0x90
[   97.341188]  ? exc_invalid_op+0x19/0x80
[   97.341190]  ? asm_exc_invalid_op+0x1b/0x20
[   97.341196]  ? siw_cep_put+0x1c5/0x1e0 [siw]
[   97.341204]  siw_connect+0x474/0x780 [siw]
[   97.341211]  iw_cm_connect+0x1ca/0x250 [iw_cm]
[   97.341216]  rdma_connect_locked+0x1bf/0x940 [rdma_cm]
[   97.341227]  nvme_rdma_cm_handler+0x5d7/0x9c0 [nvme_rdma]
[   97.341235]  cma_cm_event_handler+0x4f/0x170 [rdma_cm]
[   97.341241]  cma_work_handler+0x6a/0xe0 [rdma_cm]
[   97.341247]  process_one_work+0x2bd/0x590
...

The second patch makes the debug message consistent with the condition,
and the last one cleans up the code a bit. Please help review them.

Thanks,
Guoqing

Guoqing Jiang (3):
  RDMA/siw: Balance the reference of cep->kref in the error path
  RDMA/siw: Correct wrong debug message
  RDMA/siw: Call llist_reverse_order in siw_run_sq

 drivers/infiniband/sw/siw/siw_cm.c    |  1 -
 drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
 drivers/infiniband/sw/siw/siw_verbs.c |  2 +-
 3 files changed, 2 insertions(+), 13 deletions(-)
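
As an illustration of the kind of imbalance the first patch title describes, the
following is a minimal userspace sketch, not the siw code. All names
(demo_cep, demo_cep_get, demo_cep_put, demo_connect) are hypothetical, and the
number of references taken before the failure is an assumption chosen only to
keep the example small. It shows how one put too many in an error path ends up
operating on an object whose count has already reached zero, which is the sort
of condition a refcount warning is meant to catch.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted endpoint; not struct siw_cep. */
struct demo_cep {
	int refcount;	/* plays the role of cep->kref */
	int freed;	/* set once the last reference is dropped */
	int warned;	/* number of puts attempted on a dead object */
};

static void demo_cep_init(struct demo_cep *cep)
{
	cep->refcount = 1;	/* the creator holds the initial reference */
	cep->freed = 0;
	cep->warned = 0;
}

static void demo_cep_get(struct demo_cep *cep)
{
	cep->refcount++;
}

static void demo_cep_put(struct demo_cep *cep)
{
	if (cep->refcount < 1) {
		/* A real implementation would warn here; we only count it. */
		cep->warned++;
		return;
	}
	if (--cep->refcount == 0)
		cep->freed = 1;
}

/*
 * Simulated connect that always fails after taking one extra reference.
 * extra_put = 1 models an unbalanced error path, 0 a balanced one.
 */
static int demo_connect(struct demo_cep *cep, int extra_put)
{
	demo_cep_init(cep);	/* refcount == 1 */
	demo_cep_get(cep);	/* refcount == 2 (assumed second holder) */

	/* Forced failure: drop exactly the references taken above. */
	demo_cep_put(cep);	/* refcount == 1 */
	demo_cep_put(cep);	/* refcount == 0, object released */
	if (extra_put)
		demo_cep_put(cep);	/* one put too many: counted as a warning */

	return -1;
}
```

With extra_put set to 0 the object is released exactly once and no warning is
counted; with extra_put set to 1 the third put hits an already-released object.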

Comments

Leon Romanovsky Aug. 20, 2023, 9:43 a.m. UTC | #1
On Fri, Aug 18, 2023 at 04:23:15PM +0800, Guoqing Jiang wrote:
> Guoqing Jiang (3):
>   RDMA/siw: Balance the reference of cep->kref in the error path
>   RDMA/siw: Correct wrong debug message
>   RDMA/siw: Call llist_reverse_order in siw_run_sq

All of these patches need Fixes lines.

Thanks

> 
>  drivers/infiniband/sw/siw/siw_cm.c    |  1 -
>  drivers/infiniband/sw/siw/siw_qp_tx.c | 12 +-----------
>  drivers/infiniband/sw/siw/siw_verbs.c |  2 +-
>  3 files changed, 2 insertions(+), 13 deletions(-)
> 
> -- 
> 2.35.3
>
Guoqing Jiang Aug. 21, 2023, 1:39 a.m. UTC | #2
On 8/20/23 17:43, Leon Romanovsky wrote:
> On Fri, Aug 18, 2023 at 04:23:15PM +0800, Guoqing Jiang wrote:
>> Guoqing Jiang (3):
>>    RDMA/siw: Balance the reference of cep->kref in the error path
>>    RDMA/siw: Correct wrong debug message
>>    RDMA/siw: Call llist_reverse_order in siw_run_sq
> All of these patches need to be with Fixes lines.

The last one doesn't need one since it is a cleanup; I will update the
first two.

Thanks,
Guoqing
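
For readers unfamiliar with the helper named in the third patch title: a
lock-less list is filled by pushing at the head, so taking the whole list
yields the newest entry first, and a consumer that wants submission order has
to reverse it. The sketch below is a plain userspace version of that reversal,
written only to show why swapping an open-coded loop for a shared helper is a
cleanup rather than a behavior change. The names demo_node and demo_reverse are
hypothetical; this is not the kernel's llist implementation.

```c
#include <stddef.h>

/* Hypothetical singly linked node; stands in for a lock-less list entry. */
struct demo_node {
	struct demo_node *next;
	int id;
};

/*
 * Reverse a NULL-terminated singly linked list in place and return the
 * new head. An empty list (NULL) is returned unchanged.
 */
static struct demo_node *demo_reverse(struct demo_node *head)
{
	struct demo_node *new_head = NULL;

	while (head) {
		struct demo_node *tmp = head;

		head = head->next;	/* advance before relinking */
		tmp->next = new_head;	/* push onto the reversed list */
		new_head = tmp;
	}

	return new_head;
}
```

A caller that collected entries newest-first would pass the head to
demo_reverse() once and then walk the result in oldest-first order.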