[v4,00/13] SIW: Request for Comments

Message ID 20190130172136.23625-1-bmt@zurich.ibm.com (mailing list archive)

Bernard Metzler Jan. 30, 2019, 5:21 p.m. UTC
From: Bernard Metzler <bmt@zurich.ibm.com>

This patch set contributes a new version of the SoftiWarp
driver, as originally introduced to the list Oct 6th, 2017.
SoftiWarp (siw) implements the iWarp RDMA protocol over
kernel TCP sockets. The driver integrates with the
linux-rdma framework.

For this patch series, we aimed at fixing the main
obstacles that prevented siw acceptance in the past:

1. siw now uses the recently extended rdma netlink protocol
   for adding and removing siw devices. It became the
   only way of managing siw devices.

2. The driver integrates with the RDMA/IWPM patch series on
   introducing no port mapping requirements, which is
   currently under review. These patches, as provided by
   Steve Wise, are a prerequisite for running siw in an
   environment with an active iwpmd.

The code has the following known limitations:

1. Only IPv4 addresses are supported / no IPv6 support.

2. All previously flexible module parameters are translated
   into const values as defined in siw_main.c. We propose
   another extension of the netlink protocol to make those
   driver parameters dynamically settable. Ideally, we would
   distinguish between link specific and connection specific
   parameters.

   Currently, we would like to see the following parameters
   settable:
   o MPA peer-to-peer mode (boolean on/off)
   o MPA CRC (boolean on/off)
   o MPA CRC negotiation mode: accept different CRC setting
     from peer (boolean on/off)
   o TCP_NODELAY to control Nagle settings of TCP socket
     (boolean on/off)
   o MPA version (0, 1 or 2)
   o Zerocopy to let TCP transmit out of application
     buffers w/o copying data (boolean on/off)
   o GSO to select Generic Segmentation Offload for
     larger frames (one frame may span 1 .. n Ethernet
     frames, if advertised by TCP socket)

   In principle, all of those parameters could be controlled
   per connection. At least, dynamically setting those per
   device is highly desirable. Let's agree on the
   cleanest solution for that.
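
To make the per-device vs. per-connection distinction concrete, here
is a minimal user-space sketch in Python. It is not siw code and not
a proposed netlink interface: the class name, the field names, the
helper functions and all default values are hypothetical placeholders
(the defaults are not the constants from siw_main.c). It only models
the idea that a connection inherits the device settings unless a
parameter is explicitly overridden, and shows how the TCP_NODELAY
item maps to the standard socket option.

```python
import socket
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SiwParams:
    """Hypothetical parameter set mirroring the list above (not the siw API)."""
    mpa_peer_to_peer: bool = True     # placeholder default
    mpa_crc: bool = True              # placeholder default
    mpa_crc_negotiable: bool = False  # accept a different CRC setting from peer
    tcp_nodelay: bool = True          # True disables Nagle on the TCP socket
    mpa_version: int = 2              # allowed values: 0, 1 or 2
    zerocopy_tx: bool = False         # transmit out of application buffers
    gso: bool = True                  # use GSO if advertised by the TCP socket

    def __post_init__(self):
        # Reject values outside the range named in the cover letter.
        if self.mpa_version not in (0, 1, 2):
            raise ValueError("mpa_version must be 0, 1 or 2")


def effective_params(device: SiwParams, **overrides) -> SiwParams:
    """Return per-connection settings: device defaults plus overrides."""
    return replace(device, **overrides)


def apply_to_socket(sock, params: SiwParams) -> None:
    """Apply the one parameter that is a plain TCP socket option."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY,
                    int(params.tcp_nodelay))
```

With this model, a device-wide change is a new SiwParams instance,
and a connection-specific change is a call such as
effective_params(dev, mpa_crc=False), which leaves the device
settings untouched.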

We maintain a snapshot of the current code at
https://github.com/zrlio/softiwarp-for-linux-rdma.git
within branch 'siw-for-rdma-next-nvme-5.0'.
This branch is based on the rdma-next tree and additionally
includes the latest netlink and portmapper patches from
Steve Wise as well as the latest nvme-5.0 code changes from
git://git.infradead.org/nvme.git. We tested siw with
NVMeF host and target applications and therefore merged
with the latest nvme development.

The matching siw user library is maintained at
https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
It is based on rdma-core, and extended with Steve's patches
to both rdma netlink and portmapper. The relevant branch
name is 'siw-for-rdma-next'.


As always, we'd highly appreciate your code review. Thanks
very much for your time.

Bernard

Bernard Metzler (13):
  iWarp wire packet format
  SIW main include file
  SIW network and RDMA core interface
  SIW object management
  SIW connection management
  SIW application interface
  SIW application buffer management
  SIW queue pair methods
  SIW transmit path
  SIW receive path
  SIW completion queue methods
  SIW debugging
  SIW addition to kernel build environment

 drivers/infiniband/Kconfig               |    1 +
 drivers/infiniband/sw/Makefile           |    1 +
 drivers/infiniband/sw/siw/Kconfig        |   17 +
 drivers/infiniband/sw/siw/Makefile       |   15 +
 drivers/infiniband/sw/siw/iwarp.h        |  415 ++++
 drivers/infiniband/sw/siw/siw.h          |  805 ++++++++
 drivers/infiniband/sw/siw/siw_ae.c       |  120 ++
 drivers/infiniband/sw/siw/siw_cm.c       | 2185 ++++++++++++++++++++++
 drivers/infiniband/sw/siw/siw_cm.h       |  156 ++
 drivers/infiniband/sw/siw/siw_cq.c       |  150 ++
 drivers/infiniband/sw/siw/siw_debug.c    |  467 +++++
 drivers/infiniband/sw/siw/siw_debug.h    |   87 +
 drivers/infiniband/sw/siw/siw_main.c     |  846 +++++++++
 drivers/infiniband/sw/siw/siw_mem.c      |  243 +++
 drivers/infiniband/sw/siw/siw_obj.c      |  338 ++++
 drivers/infiniband/sw/siw/siw_obj.h      |  200 ++
 drivers/infiniband/sw/siw/siw_qp.c       | 1473 +++++++++++++++
 drivers/infiniband/sw/siw/siw_qp_rx.c    | 1533 +++++++++++++++
 drivers/infiniband/sw/siw/siw_qp_tx.c    | 1340 +++++++++++++
 drivers/infiniband/sw/siw/siw_verbs.c    | 1888 +++++++++++++++++++
 drivers/infiniband/sw/siw/siw_verbs.h    |  119 ++
 include/uapi/rdma/rdma_user_ioctl_cmds.h |    1 +
 include/uapi/rdma/siw_user.h             |  216 +++
 23 files changed, 12616 insertions(+)
 create mode 100644 drivers/infiniband/sw/siw/Kconfig
 create mode 100644 drivers/infiniband/sw/siw/Makefile
 create mode 100644 drivers/infiniband/sw/siw/iwarp.h
 create mode 100644 drivers/infiniband/sw/siw/siw.h
 create mode 100644 drivers/infiniband/sw/siw/siw_ae.c
 create mode 100644 drivers/infiniband/sw/siw/siw_cm.c
 create mode 100644 drivers/infiniband/sw/siw/siw_cm.h
 create mode 100644 drivers/infiniband/sw/siw/siw_cq.c
 create mode 100644 drivers/infiniband/sw/siw/siw_debug.c
 create mode 100644 drivers/infiniband/sw/siw/siw_debug.h
 create mode 100644 drivers/infiniband/sw/siw/siw_main.c
 create mode 100644 drivers/infiniband/sw/siw/siw_mem.c
 create mode 100644 drivers/infiniband/sw/siw/siw_obj.c
 create mode 100644 drivers/infiniband/sw/siw/siw_obj.h
 create mode 100644 drivers/infiniband/sw/siw/siw_qp.c
 create mode 100644 drivers/infiniband/sw/siw/siw_qp_rx.c
 create mode 100644 drivers/infiniband/sw/siw/siw_qp_tx.c
 create mode 100644 drivers/infiniband/sw/siw/siw_verbs.c
 create mode 100644 drivers/infiniband/sw/siw/siw_verbs.h
 create mode 100644 include/uapi/rdma/siw_user.h

Comments

Doug Ledford Jan. 30, 2019, 5:36 p.m. UTC | #1
Hi Bernard,

I'm *thrilled* to see this updated submission at this point in time.  I
would love nothing more than to see this make the next merge window.
Dennis Dalessandro Jan. 30, 2019, 5:56 p.m. UTC | #2
Don't forget a MAINTAINERS file update.

-Denny
Jason Gunthorpe Jan. 30, 2019, 10:36 p.m. UTC | #3
On Wed, Jan 30, 2019 at 06:21:23PM +0100, bmt@zurich.ibm.com wrote:
> The matching siw user library is maintained at
> https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
> It is based on rdma-core, and extended with Steve's patches
> to both rdma netlink and portmapper. The relevant branch
> name is 'siw-for-rdma-next'.

This needs to be sent as a PR to rdma-core - I don't want to merge
drivers that don't have a ready to merge user space as well.

Jason
Jason Gunthorpe Jan. 30, 2019, 10:42 p.m. UTC | #4
On Wed, Jan 30, 2019 at 12:36:49PM -0500, Doug Ledford wrote:
> Hi Bernard,
> 
> I'm *thrilled* to see this updated submission at this point in time.  I
> would love nothing more than to see this make the next merge window.

It depends on two big series I haven't sent yet :(

Jason
Doug Ledford Jan. 30, 2019, 11:10 p.m. UTC | #5
On Wed, 2019-01-30 at 15:42 -0700, Jason Gunthorpe wrote:
> On Wed, Jan 30, 2019 at 12:36:49PM -0500, Doug Ledford wrote:
> > Hi Bernard,
> > 
> > I'm *thrilled* to see this updated submission at this point in time.  I
> > would love nothing more than to see this make the next merge window.
> 
> It depends on two big series I haven't sent yet :(

The cover letter only mentions the port mapper issue and that it was
tested against the latest nvme tree (I assume that isn't needed for this
code to actually work and that the nvme stuff was just pulled for
interop testing).  Which two large series are you referring to?  That
needs to be clearly called out.
Jason Gunthorpe Jan. 30, 2019, 11:33 p.m. UTC | #6
On Wed, Jan 30, 2019 at 06:10:43PM -0500, Doug Ledford wrote:

> > > I'm *thrilled* to see this updated submission at this point in time.  I
> > > would love nothing more than to see this make the next merge window.
> > 
> > It depends on two big series I haven't sent yet :(
> 
> The cover letter only mentions the port mapper issue and that it was
> tested against the latest nvme tree (I assume that isn't needed for this
> code to actually work and that the nvme stuff was just pulled for
> interop testing).  Which two large series are you referring to?  That
> needs to be clearly called out.

It calls ib_unregister_driver() which is part of this:

https://github.com/jgunthorpe/linux/commits/device_locking_cleanup 

An earlier version of this was posted, but I've pretty much rewritten it
again in a different way after Parav's remarks.

Jason
Bernard Metzler Jan. 31, 2019, 1:40 p.m. UTC | #7
-----"Dennis Dalessandro" <dennis.dalessandro@intel.com> wrote: -----

>Don't forget a MAINTAINERS file update.
>
>-

Absolutely! Thanks for pointing that out.

Bernard
Steve Wise Jan. 31, 2019, 4:36 p.m. UTC | #8
On 1/30/2019 5:33 PM, Jason Gunthorpe wrote:
> On Wed, Jan 30, 2019 at 06:10:43PM -0500, Doug Ledford wrote:
>
>>>> I'm *thrilled* to see this updated submission at this point in time.  I
>>>> would love nothing more than to see this make the next merge window.
>>> It depends on two big series I haven't sent yet :(
>> The cover letter only mentions the port mapper issue and that it was
>> tested against the latest nvme tree (I assume that isn't needed for this
>> code to actually work and that the nvme stuff was just pulled for
>> interop testing).  Which two large series are you referring to?  That
>> needs to be clearly called out.
> It calls ib_unregister_driver() which is part of this:
>
> https://github.com/jgunthorpe/linux/commits/device_locking_cleanup 
>
> An earlier version of this was posted, but I've pretty much rewritten it
> again in a different way after Parav's remarks.
>
> Jason

I suggested that Bernard get this submission going and use the last
reviewed ib_unregister_driver() code that was included with my
NEWLINK/DELLINK series [1].  So his github has that series, the iwpm
series [2], the siw stuff, and then a merge of nvmf because of a recent
regression fix we wanted to pull in.

Once we finalize Jason's device work and my NEWLINK/DELLINK series,
Bernard can rebase on top of -next and refactor his code as needed.

[1] https://www.spinics.net/lists/linux-rdma/msg72566.html

[2] https://www.spinics.net/lists/linux-rdma/msg74547.html