mbox series

[00/31] rdma/siw: fix a lot of deadlocks and use after free bugs

Message ID cover.1620343860.git.metze@samba.org (mailing list archive)
Headers show
Series rdma/siw: fix a lot of deadlocks and use after free bugs | expand

Message

Stefan Metzmacher May 6, 2021, 11:36 p.m. UTC
Hi Bernard,

while testing with my smbdirect driver I hit a lot of
bugs in the siw.ko driver. They all cause problems where
the siw driver was not able to unload anymore and I had to
reboot the machine.

I implemented:
- a non blocking connect
- fixed a lot of bugs where siw_cep_put() was called too often.
- fixed bugs where siw_cm_upcall() confused the core IWCM logic

I have some more changes to follow, but I wanted to send them
finally out after having them one and a half year sitting in some
private branch...

Stefan Metzmacher (31):
  rdma/siw: fix warning in siw_proc_send()
  rdma/siw: call smp_mb() after mem->stag_valid = 0 in
    siw_invalidate_stag() too
  rdma/siw: remove superfluous siw_cep_put() from siw_connect() error
    path
  rdma/siw: let siw_accept() deferr RDMA_MODE until EVENT_ESTABLISHED
  rdma/siw: make use of kernel_{bind,connect,listen}()
  rdma/siw: make siw_cm_upcall() a noop without valid 'id'
  rdma/siw: split out a __siw_cep_terminate_upcall() function
  rdma/siw: use __siw_cep_terminate_upcall() for indirect
    SIW_CM_WORK_CLOSE_LLP
  rdma/siw: use __siw_cep_terminate_upcall() for SIW_CM_WORK_PEER_CLOSE
  rdma/siw: use __siw_cep_terminate_upcall() for SIW_CM_WORK_MPATIMEOUT
  rdma/siw: introduce SIW_EPSTATE_ACCEPTING/REJECTING for
    rdma_accept/rdma_reject
  rdma/siw: add some debugging of state and sk_state to the teardown
    process
  rdma/siw: handle SIW_EPSTATE_CONNECTING in
    __siw_cep_terminate_upcall()
  rdma/siw: let siw_connect() set AWAIT_MPAREP before
    siw_send_mpareqrep()
  rdma/siw: create a temporary copy of private data
  rdma/siw: use error and out logic at the end of siw_connect()
  rdma/siw: start mpa timer before calling siw_send_mpareqrep()
  rdma/siw: call the blocking kernel_bindconnect() just before
    siw_send_mpareqrep()
  rdma/siw: split out a __siw_cep_close() function
  rdma/siw: implement non-blocking connect.
  rdma/siw: let siw_listen_address() call siw_cep_alloc() first
  rdma/siw: let siw_listen_address() call siw_cep_set_inuse() early
  rdma/siw: make use of __siw_cep_close() in siw_accept()
  rdma/siw: do the full disassociation of cep and qp in
    siw_qp_llp_close()
  rdma/siw: fix double siw_cep_put() in siw_cm_work_handler()
  rdma/siw: make use of __siw_cep_close() in siw_cm_work_handler()
  rdma/siw: fix the "close" logic in siw_qp_cm_drop()
  rdma/siw: make use of __siw_cep_close() in siw_qp_cm_drop()
  rdma/siw: make use of __siw_cep_close() in siw_reject()
  rdma/siw: make use of __siw_cep_close() in siw_listen_address()
  rdma/siw: make use of __siw_cep_close() in siw_drop_listeners()

 drivers/infiniband/sw/siw/siw_cm.c    | 537 +++++++++++++++-----------
 drivers/infiniband/sw/siw/siw_cm.h    |   3 +
 drivers/infiniband/sw/siw/siw_mem.c   |   2 +
 drivers/infiniband/sw/siw/siw_qp.c    |   3 +
 drivers/infiniband/sw/siw/siw_qp_rx.c |   2 +-
 5 files changed, 316 insertions(+), 231 deletions(-)

Comments

Bernard Metzler May 7, 2021, 12:08 p.m. UTC | #1
-----"Stefan Metzmacher" <metze@samba.org> wrote: -----

>To: "Bernard Metzler" <bmt@zurich.ibm.com>
>From: "Stefan Metzmacher" <metze@samba.org>
>Date: 05/07/2021 01:37AM
>Cc: linux-rdma@vger.kernel.org, "Stefan Metzmacher" <metze@samba.org>
>Subject: [EXTERNAL] [PATCH 00/31] rdma/siw: fix a lot of deadlocks
>and use after free bugs
>
>Hi Bernard,
>
>while testing with my smbdirect driver I hit a lot of
>bugs in the siw.ko driver. They all cause problems where
>the siw driver was not able to unload anymore and I had to
>reboot the machine.
>
Hi Stefan,

Much appreciated!
These are quite some patches, and I will need some time
to go through. Would bee nice if those would be broken
down into smaller bundles (introduce non-blocking connect,
_siw_cep_close() subroutine, fixing cep reference counting,
smp_mb() after STag invalidation, ..). Anyway, many
thanks for the effort, it will improve the driver!

First comments:

A non blocking connect does really makes sense as you
are pointing out. I hope it doesn't complicate the CM
code even further.

I think we agreed upon not using BUG() and BUG_ON(),
so please don't introduce it.

'I hit a lot of bugs' is not very helpful, but just
a statement.

Thanks very much!
Bernard.


>I implemented:
>- a non blocking connect
>- fixed a lot of bugs where siw_cep_put() was called too often.
>- fixed bugs where siw_cm_upcall() confused the core IWCM logic
>
>I have some more changes to follow, but I wanted to send them
>finally out after having them one and a half year sitting in some
>private branch...
>
>Stefan Metzmacher (31):
>  rdma/siw: fix warning in siw_proc_send()
>  rdma/siw: call smp_mb() after mem->stag_valid = 0 in
>    siw_invalidate_stag() too
>  rdma/siw: remove superfluous siw_cep_put() from siw_connect() error
>    path
>  rdma/siw: let siw_accept() deferr RDMA_MODE until EVENT_ESTABLISHED
>  rdma/siw: make use of kernel_{bind,connect,listen}()
>  rdma/siw: make siw_cm_upcall() a noop without valid 'id'
>  rdma/siw: split out a __siw_cep_terminate_upcall() function
>  rdma/siw: use __siw_cep_terminate_upcall() for indirect
>    SIW_CM_WORK_CLOSE_LLP
>  rdma/siw: use __siw_cep_terminate_upcall() for
>SIW_CM_WORK_PEER_CLOSE
>  rdma/siw: use __siw_cep_terminate_upcall() for
>SIW_CM_WORK_MPATIMEOUT
>  rdma/siw: introduce SIW_EPSTATE_ACCEPTING/REJECTING for
>    rdma_accept/rdma_reject
>  rdma/siw: add some debugging of state and sk_state to the teardown
>    process
>  rdma/siw: handle SIW_EPSTATE_CONNECTING in
>    __siw_cep_terminate_upcall()
>  rdma/siw: let siw_connect() set AWAIT_MPAREP before
>    siw_send_mpareqrep()
>  rdma/siw: create a temporary copy of private data
>  rdma/siw: use error and out logic at the end of siw_connect()
>  rdma/siw: start mpa timer before calling siw_send_mpareqrep()
>  rdma/siw: call the blocking kernel_bindconnect() just before
>    siw_send_mpareqrep()
>  rdma/siw: split out a __siw_cep_close() function
>  rdma/siw: implement non-blocking connect.
>  rdma/siw: let siw_listen_address() call siw_cep_alloc() first
>  rdma/siw: let siw_listen_address() call siw_cep_set_inuse() early
>  rdma/siw: make use of __siw_cep_close() in siw_accept()
>  rdma/siw: do the full disassociation of cep and qp in
>    siw_qp_llp_close()
>  rdma/siw: fix double siw_cep_put() in siw_cm_work_handler()
>  rdma/siw: make use of __siw_cep_close() in siw_cm_work_handler()
>  rdma/siw: fix the "close" logic in siw_qp_cm_drop()
>  rdma/siw: make use of __siw_cep_close() in siw_qp_cm_drop()
>  rdma/siw: make use of __siw_cep_close() in siw_reject()
>  rdma/siw: make use of __siw_cep_close() in siw_listen_address()
>  rdma/siw: make use of __siw_cep_close() in siw_drop_listeners()
>
> drivers/infiniband/sw/siw/siw_cm.c    | 537
>+++++++++++++++-----------
> drivers/infiniband/sw/siw/siw_cm.h    |   3 +
> drivers/infiniband/sw/siw/siw_mem.c   |   2 +
> drivers/infiniband/sw/siw/siw_qp.c    |   3 +
> drivers/infiniband/sw/siw/siw_qp_rx.c |   2 +-
> 5 files changed, 316 insertions(+), 231 deletions(-)
>
>-- 
>2.25.1
>
>
Stefan Metzmacher May 26, 2021, 3:43 p.m. UTC | #2
Hi Bernard,

> Much appreciated!
> These are quite some patches, and I will need some time
> to go through. Would bee nice if those would be broken
> down into smaller bundles (introduce non-blocking connect,
> _siw_cep_close() subroutine, fixing cep reference counting,
> smp_mb() after STag invalidation, ..). 

They mostly fall out naturally getting one step further
with each commit. So most of them depend on each other.
I'll see if I can reorder some of them, but I'm not sure it's really
worth the effort.

> Anyway, many thanks for the effort,

Thanks a lot for the review.

> it will improve the driver!

Yes!

On top I have some code to support MPA rev1 in peer_to_peer mode
in order to interoperate with a Chelcio T404-BT card running
under Windows.

In preparation I've code that moves the currently hardcoded values
(which where module params before) into a per device structure,
some like 'sdev->options.crc_strict'. With that we only need to
find a good way to pass these parameters from userspace to the
device. I guess that should be done somehow via the 'rdma link add'
command, or via files similar to /proc/sys/net/ipv4/conf/*.

Here's my branch with all (partly incomplete) siw changes:
https://git.samba.org/?p=metze/linux/wip.git;a=shortlog;h=refs/heads/rdma-next-siw

> First comments:
> 
> A non blocking connect does really makes sense as you
> are pointing out. I hope it doesn't complicate the CM
> code even further.
> 
> I think we agreed upon not using BUG() and BUG_ON(),
> so please don't introduce it.

Ok.

> 'I hit a lot of bugs' is not very helpful, but just
> a statement.

More details are in the individual commit messages,
should I double them in the cover letter next time?

Currently I'm quite busy with other stuff...
I hope to find some time in the next weeks to
comment more detailed and post a new revision.

metze
Bernard Metzler May 28, 2021, 2:35 p.m. UTC | #3
-----"Stefan Metzmacher" <metze@samba.org> wrote: -----

>To: "Bernard Metzler" <BMT@zurich.ibm.com>
>From: "Stefan Metzmacher" <metze@samba.org>
>Date: 05/26/2021 05:44PM
>Cc: "linux-rdma" <linux-rdma@vger.kernel.org>
>Subject: [EXTERNAL] Re: [PATCH 00/31] rdma/siw: fix a lot of
>deadlocks and use after free bugs
>
>Hi Bernard,
>
>> Much appreciated!
>> These are quite some patches, and I will need some time
>> to go through. Would bee nice if those would be broken
>> down into smaller bundles (introduce non-blocking connect,
>> _siw_cep_close() subroutine, fixing cep reference counting,
>> smp_mb() after STag invalidation, ..). 
>
>They mostly fall out naturally getting one step further
>with each commit. So most of them depend on each other.
>I'll see if I can reorder some of them, but I'm not sure it's really
>worth the effort.

Why not having just a few patches - one fixing the obvious
object management bug(s), one on a more concise error handling,
one on using exported kernel functions instead of calling
socket method directly, one introducing asynchronous connect..?

I understand you collected problems over time and fixed those,
but it would be much easier to digest if separated logically.

>
>> Anyway, many thanks for the effort,
>
>Thanks a lot for the review.
>
>> it will improve the driver!
>
>Yes!
>
>On top I have some code to support MPA rev1 in peer_to_peer mode
>in order to interoperate with a Chelcio T404-BT card running
>under Windows.
>
>In preparation I've code that moves the currently hardcoded values
>(which where module params before) into a per device structure,
>some like 'sdev->options.crc_strict'. With that we only need to
>find a good way to pass these parameters from userspace to the
>device. I guess that should be done somehow via the 'rdma link add'
>command, or via files similar to /proc/sys/net/ipv4/conf/*.
>
yes with dropping the module parameters we lost that flexibility...

I'd prefer protocol specific extensions to the rdma tools.

In fact, we could allow different CRC and MPA settings per
QP (which would make sense if we have connections from a
local siw device to multiple remote devices with different
capabilities etc.). But we do not have endpoint/QP object
specific settings in rdma netlink currently. 
Having link specific settings might be sufficient though.


>Here's my branch with all (partly incomplete) siw changes:
>INVALID URI REMOVED
>Fp-3Dmetze_linux_wip.git-3Ba-3Dshortlog-3Bh-3Drefs_heads_rdma-2Dnext-
>2Dsiw&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=2TaYXQ0T-r8ZO1PP1alNwU_QJcR
>RLfmYTAgd3QCvqSc&m=bcCk65hNAmUVFgsBxIE9Y6S1cnxdE1otmHllxAlO-Ko&s=Rgu7
>GuEAeI9MyUx7m03KEMLH2qA7Y3065X8LCBo3EBY&e= 
>

Thanks, I'll have a look.

>> First comments:
>> 
>> A non blocking connect does really makes sense as you
>> are pointing out. I hope it doesn't complicate the CM
>> code even further.
>> 
>> I think we agreed upon not using BUG() and BUG_ON(),
>> so please don't introduce it.
>
>Ok.
>
>> 'I hit a lot of bugs' is not very helpful, but just
>> a statement.
>
>More details are in the individual commit messages,
>should I double them in the cover letter next time?
>
>Currently I'm quite busy with other stuff...
>I hope to find some time in the next weeks to
>comment more detailed and post a new revision.
>
>metze
>
Thanks again!
Bernard.