[v7,00/12] SIW: Request for Comments

Message ID 20190417150051.365-1-bmt@zurich.ibm.com (mailing list archive)

Message

Bernard Metzler April 17, 2019, 3 p.m. UTC
This patch set contributes version 7 of the SoftiWarp
driver, as originally introduced to the list Oct 6th, 2017.
SoftiWarp (siw) implements the iWarp RDMA protocol over
kernel TCP sockets. The driver integrates with the
linux-rdma framework.

Mainly in response to the helpful feedback received,
I fixed the following issues:

1. The code now relies on proper object management
   provided by the RDMA midlayer. With that, reference
   counting for PDs, CQs and SRQs was dropped.
   The corresponding files siw_obj.[ch] are removed.

2. The code now supports multiple user mmap operations
   of the same object (CQ, SQ, RQ, SRQ array) during
   its lifetime. To efficiently maintain the potentially
   large number of objects, they are now kept in a
   cyclic xarray private to the user context.

3. The siw private memory access flags definition was
   dropped in favor of ib_access_flags.

4. Added code to consistently check the complete STag
   during memory access. Checking of the user-controlled
   8-bit 'key' field was inconsistent and partially
   missing.
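
As an illustration of item 4, here is a minimal user-space sketch of the
difference between matching only the STag index and matching the complete
STag. It assumes the common split of a 32-bit STag into a 24-bit index
(upper bits) and an 8-bit user-controlled key (lower bits). The struct and
function names are hypothetical and are not taken from the siw sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this sketch: upper 24 bits index, lower 8 bits key. */
#define STAG_KEY_BITS 8u
#define STAG_KEY_MASK 0xffu

/* Hypothetical memory object record, not the driver's actual structure. */
struct mem_obj {
	uint32_t stag;  /* complete STag currently valid for this object */
	bool valid;     /* false once the registration is invalidated */
};

static inline uint32_t stag_index(uint32_t stag)
{
	return stag >> STAG_KEY_BITS;
}

static inline uint8_t stag_key(uint32_t stag)
{
	return (uint8_t)(stag & STAG_KEY_MASK);
}

/* Incomplete check: finds the object by index but ignores the key. */
static bool stag_match_index_only(const struct mem_obj *m, uint32_t wire_stag)
{
	return m->valid && stag_index(m->stag) == stag_index(wire_stag);
}

/* Complete check: index and key must both match the registered STag. */
static bool stag_match_full(const struct mem_obj *m, uint32_t wire_stag)
{
	return m->valid && m->stag == wire_stag;
}
```

With a registered STag of 0x00012a07, a request carrying 0x00012a08
passes the index-only check but is rejected by the complete check, which
is the kind of stale-key access the full comparison is meant to catch.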

We maintain a snapshot of the current code at
https://github.com/zrlio/softiwarp-for-linux-rdma.git
within branch 'siw-for-rdma-next-v7'.

The matching siw user library is maintained at
https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
The relevant branch name is 'siw-for-rdma-next-v7'.

As always, I highly appreciate your feedback. Thanks
very much for your time and help!

Bernard.


Bernard Metzler (12):
  iWarp wire packet format
  SIW main include file
  SIW network and RDMA core interface
  SIW connection management
  SIW application interface
  SIW application buffer management
  SIW queue pair methods
  SIW transmit path
  SIW receive path
  SIW completion queue methods
  SIW debugging
  SIW addition to kernel build environment

 MAINTAINERS                              |  579 ++++--
 drivers/infiniband/Kconfig               |    1 +
 drivers/infiniband/sw/Makefile           |    1 +
 drivers/infiniband/sw/siw/Kconfig        |   17 +
 drivers/infiniband/sw/siw/Makefile       |   12 +
 drivers/infiniband/sw/siw/iwarp.h        |  379 ++++
 drivers/infiniband/sw/siw/siw.h          |  733 ++++++++
 drivers/infiniband/sw/siw/siw_cm.c       | 2106 ++++++++++++++++++++++
 drivers/infiniband/sw/siw/siw_cm.h       |  121 ++
 drivers/infiniband/sw/siw/siw_cq.c       |  109 ++
 drivers/infiniband/sw/siw/siw_debug.c    |   91 +
 drivers/infiniband/sw/siw/siw_debug.h    |   40 +
 drivers/infiniband/sw/siw/siw_main.c     |  711 ++++++++
 drivers/infiniband/sw/siw/siw_mem.c      |  464 +++++
 drivers/infiniband/sw/siw/siw_mem.h      |   53 +
 drivers/infiniband/sw/siw/siw_qp.c       | 1354 ++++++++++++++
 drivers/infiniband/sw/siw/siw_qp_rx.c    | 1520 ++++++++++++++++
 drivers/infiniband/sw/siw/siw_qp_tx.c    | 1289 +++++++++++++
 drivers/infiniband/sw/siw/siw_verbs.c    | 1841 +++++++++++++++++++
 drivers/infiniband/sw/siw/siw_verbs.h    |   84 +
 include/uapi/rdma/rdma_user_ioctl_cmds.h |    1 +
 include/uapi/rdma/siw_user.h             |  186 ++
 22 files changed, 11559 insertions(+), 133 deletions(-)
 create mode 100644 drivers/infiniband/sw/siw/Kconfig
 create mode 100644 drivers/infiniband/sw/siw/Makefile
 create mode 100644 drivers/infiniband/sw/siw/iwarp.h
 create mode 100644 drivers/infiniband/sw/siw/siw.h
 create mode 100644 drivers/infiniband/sw/siw/siw_cm.c
 create mode 100644 drivers/infiniband/sw/siw/siw_cm.h
 create mode 100644 drivers/infiniband/sw/siw/siw_cq.c
 create mode 100644 drivers/infiniband/sw/siw/siw_debug.c
 create mode 100644 drivers/infiniband/sw/siw/siw_debug.h
 create mode 100644 drivers/infiniband/sw/siw/siw_main.c
 create mode 100644 drivers/infiniband/sw/siw/siw_mem.c
 create mode 100644 drivers/infiniband/sw/siw/siw_mem.h
 create mode 100644 drivers/infiniband/sw/siw/siw_qp.c
 create mode 100644 drivers/infiniband/sw/siw/siw_qp_rx.c
 create mode 100644 drivers/infiniband/sw/siw/siw_qp_tx.c
 create mode 100644 drivers/infiniband/sw/siw/siw_verbs.c
 create mode 100644 drivers/infiniband/sw/siw/siw_verbs.h
 create mode 100644 include/uapi/rdma/siw_user.h

Comments

Jason Gunthorpe April 22, 2019, 4:48 p.m. UTC | #1
On Wed, Apr 17, 2019 at 05:00:39PM +0200, Bernard Metzler wrote:
> This patch set contributes version 7 of the SoftiWarp
> driver, as originally introduced to the list Oct 6th, 2017.
> SoftiWarp (siw) implements the iWarp RDMA protocol over
> kernel TCP sockets. The driver integrates with the
> linux-rdma framework.
> 
> Mainly in response to the various helpful feedback,
> I fixed the following issues:
> 
> 1. The code now relies on proper object management
>    provided by the RDMA midlayer. With that, reference
>    counting for PD's, CQ's and SRQ's got dropped.
>    The corresponding files siw_obj.[ch] are removed.
> 
> 2. The code now supports multiple user mmap operations
>    of the same object (CQ, SQ, RQ, SRQ array) during
>    its lifetime. To efficiently maintain the potentially
>    large number of objects, those are now kept in a
>    user context private cyclic xarray.
> 
> 3. siw private memory access flags definition got dropped
>    in favor of ib_access_flags.
> 
> 4. Added code to consistently check complete STag
>    during memory access - checking the user controlled
>    8 bit 'key' field was inconsistent and partially
>    missing.
> 
> We maintain a snapshot of the current code at
> https://github.com/zrlio/softiwarp-for-linux-rdma.git
> within branch 'siw-for-rdma-next-v7'.
> 
> The matching siw user library is maintained at
> https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
> The relevant branch name is 'siw-for-rdma-next-v7'.
> 
> As always, I highly appreciate your feedback. Thanks
> very much for your time and help!

As before, I really want to see the various people stand up and say
this driver works and passes their existing test suites (NFS, SRP,
iSER, NVMe-oF, etc.).

I think that is the main remaining blocker to acceptance.

Jason
Bart Van Assche April 22, 2019, 5:03 p.m. UTC | #2
On Wed, 2019-04-17 at 17:00 +0200, Bernard Metzler wrote:
> We maintain a snapshot of the current code at
> https://github.com/zrlio/softiwarp-for-linux-rdma.git
> within branch 'siw-for-rdma-next-v7'.

Hi Bernard,

I had a look at that branch. What I found on that branch (compared to
Linus' master branch) is the following:
* Version 6 of the SIW patch series.
* A merge with Linus' v5.1-rc2 tag.
* A series of fixes for v6.

That is not how patch series should be prepared. I think Jason expects
something like the following:
* git remote add rdma git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
* git branch --set-upstream-to=rdma/for-next
* git pull --rebase

and next run git rebase -i rdma/for-next to apply the fixes to the patches
these are intended for. The patches in the branches of your github repo 
should match what is posted on the linux-rdma mailing list.

Thanks,

Bart.
Bernard Metzler April 23, 2019, 2:07 p.m. UTC | #3
-----"Bart Van Assche" <bvanassche@acm.org> wrote: -----

>To: "Bernard Metzler" <bmt@zurich.ibm.com>,
>linux-rdma@vger.kernel.org
>From: "Bart Van Assche" <bvanassche@acm.org>
>Date: 04/22/2019 07:03PM
>Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
>
>On Wed, 2019-04-17 at 17:00 +0200, Bernard Metzler wrote:
>> We maintain a snapshot of the current code at
>> https://github.com/zrlio/softiwarp-for-linux-rdma.git
>> within branch 'siw-for-rdma-next-v7'.
>
>Hi Bernard,
>
>I had a look at that branch. What I found on that branch (compared to
>Linus' master branch) is the following:
>* Version 6 of the SIW patch series.
>* A merge with Linus' v5.1-rc2 tag.
>* A series of fixes for v6.
>
>That is not how patch series should be prepared. I think Jason expects
>something like the following:
>* git remote add rdma git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
>* git branch --set-upstream-to=rdma/for-next
>* git pull --rebase
>
>and next run git rebase -i rdma/for-next to apply the fixes to the patches
>these are intended for. The patches in the branches of your github repo
>should match what is posted on the linux-rdma mailing list.
>
>Thanks,
>
>Bart.

Hi Bart,

thanks a lot for clarifying this! And, sorry for the mess on that
repo. I am going to fix it as suggested.

Thank you,
Bernard.
Olga Kornievskaia April 24, 2019, 4:17 p.m. UTC | #4
On Mon, Apr 22, 2019 at 12:48 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Wed, Apr 17, 2019 at 05:00:39PM +0200, Bernard Metzler wrote:
> > This patch set contributes version 7 of the SoftiWarp
> > driver, as originally introduced to the list Oct 6th, 2017.
> > SoftiWarp (siw) implements the iWarp RDMA protocol over
> > kernel TCP sockets. The driver integrates with the
> > linux-rdma framework.
> >
> > Mainly in response to the various helpful feedback,
> > I fixed the following issues:
> >
> > 1. The code now relies on proper object management
> >    provided by the RDMA midlayer. With that, reference
> >    counting for PD's, CQ's and SRQ's got dropped.
> >    The corresponding files siw_obj.[ch] are removed.
> >
> > 2. The code now supports multiple user mmap operations
> >    of the same object (CQ, SQ, RQ, SRQ array) during
> >    its lifetime. To efficiently maintain the potentially
> >    large number of objects, those are now kept in a
> >    user context private cyclic xarray.
> >
> > 3. siw private memory access flags definition got dropped
> >    in favor of ib_access_flags.
> >
> > 4. Added code to consistently check complete STag
> >    during memory access - checking the user controlled
> >    8 bit 'key' field was inconsistent and partially
> >    missing.
> >
> > We maintain a snapshot of the current code at
> > https://github.com/zrlio/softiwarp-for-linux-rdma.git
> > within branch 'siw-for-rdma-next-v7'.
> >
> > The matching siw user library is maintained at
> > https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
> > The relevant branch name is 'siw-for-rdma-next-v7'.
> >
> > As always, I highly appreciate your feedback. Thanks
> > very much for your time and help!
>
> As before, I really want to see the various people stand up and say
> this driver works, it passes their existing test suites (NFS, SRP,
> iSER, NVMEOf, etc, etc)
>
> I think that is the main remaning blocker to acceptance.

Hi Jason,

I'd like to provide my feedback about testing this code and running
NFS over RDMA over the software iWarp. With much appreciated help from
Bernard, I set up two CentOS 7.6 VMs with his v7 kernel branch. I
successfully ran the NFS connectathon test suite and xfstests, and ran a
"make -j" compile of the Linux kernel. The current code is useful for
NFSoRDMA functional testing. From a very limited comparative timing study
in an all-virtual environment, it lags a bit in performance compared to a
non-RDMA mount (but it's better than software RoCE).


>
> Jason
Jason Gunthorpe April 24, 2019, 4:21 p.m. UTC | #5
On Wed, Apr 24, 2019 at 12:17:15PM -0400, Olga Kornievskaia wrote:
> On Mon, Apr 22, 2019 at 12:48 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > On Wed, Apr 17, 2019 at 05:00:39PM +0200, Bernard Metzler wrote:
> > > This patch set contributes version 7 of the SoftiWarp
> > > driver, as originally introduced to the list Oct 6th, 2017.
> > > SoftiWarp (siw) implements the iWarp RDMA protocol over
> > > kernel TCP sockets. The driver integrates with the
> > > linux-rdma framework.
> > >
> > > Mainly in response to the various helpful feedback,
> > > I fixed the following issues:
> > >
> > > 1. The code now relies on proper object management
> > >    provided by the RDMA midlayer. With that, reference
> > >    counting for PD's, CQ's and SRQ's got dropped.
> > >    The corresponding files siw_obj.[ch] are removed.
> > >
> > > 2. The code now supports multiple user mmap operations
> > >    of the same object (CQ, SQ, RQ, SRQ array) during
> > >    its lifetime. To efficiently maintain the potentially
> > >    large number of objects, those are now kept in a
> > >    user context private cyclic xarray.
> > >
> > > 3. siw private memory access flags definition got dropped
> > >    in favor of ib_access_flags.
> > >
> > > 4. Added code to consistently check complete STag
> > >    during memory access - checking the user controlled
> > >    8 bit 'key' field was inconsistent and partially
> > >    missing.
> > >
> > > We maintain a snapshot of the current code at
> > > https://github.com/zrlio/softiwarp-for-linux-rdma.git
> > > within branch 'siw-for-rdma-next-v7'.
> > >
> > > The matching siw user library is maintained at
> > > https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
> > > The relevant branch name is 'siw-for-rdma-next-v7'.
> > >
> > > As always, I highly appreciate your feedback. Thanks
> > > very much for your time and help!
> >
> > As before, I really want to see the various people stand up and say
> > this driver works, it passes their existing test suites (NFS, SRP,
> > iSER, NVMEOf, etc, etc)
> >
> > I think that is the main remaning blocker to acceptance.
> 
> Hi Jason,
> 
> I'd like to provide my feedback about testing this code and running
> NFS over RDMA over the software iWarp. With much appreciated help from
> Bernard, I setup 2 CentOS 7.6 VMs and his v7 kernel branch. I
> successfully, ran NFS connectathon test suite, xfstests, and ran "make
> -j" compile of the linux kernel. Current code is useful for NFSoRDMA
> functional testing. From a very limited comparison timing study in all
> virtual environment, it is lacking a bit in performance compared to
> non-RDMA mount (but it's better than software RoCE).

Excellent feedback, thank you.

Let's hear from NVMe-oF too, please.

Jason
Bernard Metzler April 24, 2019, 4:54 p.m. UTC | #6
-----"Olga Kornievskaia" <aglo@umich.edu> wrote: -----

>To: "Jason Gunthorpe" <jgg@ziepe.ca>
>From: "Olga Kornievskaia" <aglo@umich.edu>
>Date: 04/24/2019 06:17PM
>Cc: "Bernard Metzler" <bmt@zurich.ibm.com>, "linux-rdma"
><linux-rdma@vger.kernel.org>
>Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
>
>On Mon, Apr 22, 2019 at 12:48 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>>
>> On Wed, Apr 17, 2019 at 05:00:39PM +0200, Bernard Metzler wrote:
>> > This patch set contributes version 7 of the SoftiWarp
>> > driver, as originally introduced to the list Oct 6th, 2017.
>> > SoftiWarp (siw) implements the iWarp RDMA protocol over
>> > kernel TCP sockets. The driver integrates with the
>> > linux-rdma framework.
>> >
>> > Mainly in response to the various helpful feedback,
>> > I fixed the following issues:
>> >
>> > 1. The code now relies on proper object management
>> >    provided by the RDMA midlayer. With that, reference
>> >    counting for PD's, CQ's and SRQ's got dropped.
>> >    The corresponding files siw_obj.[ch] are removed.
>> >
>> > 2. The code now supports multiple user mmap operations
>> >    of the same object (CQ, SQ, RQ, SRQ array) during
>> >    its lifetime. To efficiently maintain the potentially
>> >    large number of objects, those are now kept in a
>> >    user context private cyclic xarray.
>> >
>> > 3. siw private memory access flags definition got dropped
>> >    in favor of ib_access_flags.
>> >
>> > 4. Added code to consistently check complete STag
>> >    during memory access - checking the user controlled
>> >    8 bit 'key' field was inconsistent and partially
>> >    missing.
>> >
>> > We maintain a snapshot of the current code at
>> > https://github.com/zrlio/softiwarp-for-linux-rdma.git
>> > within branch 'siw-for-rdma-next-v7'.
>> >
>> > The matching siw user library is maintained at
>> > https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
>> > The relevant branch name is 'siw-for-rdma-next-v7'.
>> >
>> > As always, I highly appreciate your feedback. Thanks
>> > very much for your time and help!
>>
>> As before, I really want to see the various people stand up and say
>> this driver works, it passes their existing test suites (NFS, SRP,
>> iSER, NVMEOf, etc, etc)
>>
>> I think that is the main remaning blocker to acceptance.
>
>Hi Jason,
>
>I'd like to provide my feedback about testing this code and running
>NFS over RDMA over the software iWarp. With much appreciated help from
>Bernard, I setup 2 CentOS 7.6 VMs and his v7 kernel branch. I
>successfully, ran NFS connectathon test suite, xfstests, and ran "make
>-j" compile of the linux kernel. Current code is useful for NFSoRDMA
>functional testing. From a very limited comparison timing study in all
>virtual environment, it is lacking a bit in performance compared to
>non-RDMA mount (but it's better than software RoCE).
>

Hi Olga,

Many thanks again for taking the time to check siw!
It was your testing that pointed me to an issue in the
last patch series: siw did not take into account a new
I/O address that the user may provide during fast
registration.

Thanks and best regards,
Bernard.
Chuck Lever III April 24, 2019, 4:56 p.m. UTC | #7
> On Apr 24, 2019, at 12:17 PM, Olga Kornievskaia <aglo@umich.edu> wrote:
> 
> On Mon, Apr 22, 2019 at 12:48 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>> 
>> On Wed, Apr 17, 2019 at 05:00:39PM +0200, Bernard Metzler wrote:
>>> This patch set contributes version 7 of the SoftiWarp
>>> driver, as originally introduced to the list Oct 6th, 2017.
>>> SoftiWarp (siw) implements the iWarp RDMA protocol over
>>> kernel TCP sockets. The driver integrates with the
>>> linux-rdma framework.
>>> 
>>> Mainly in response to the various helpful feedback,
>>> I fixed the following issues:
>>> 
>>> 1. The code now relies on proper object management
>>>   provided by the RDMA midlayer. With that, reference
>>>   counting for PD's, CQ's and SRQ's got dropped.
>>>   The corresponding files siw_obj.[ch] are removed.
>>> 
>>> 2. The code now supports multiple user mmap operations
>>>   of the same object (CQ, SQ, RQ, SRQ array) during
>>>   its lifetime. To efficiently maintain the potentially
>>>   large number of objects, those are now kept in a
>>>   user context private cyclic xarray.
>>> 
>>> 3. siw private memory access flags definition got dropped
>>>   in favor of ib_access_flags.
>>> 
>>> 4. Added code to consistently check complete STag
>>>   during memory access - checking the user controlled
>>>   8 bit 'key' field was inconsistent and partially
>>>   missing.
>>> 
>>> We maintain a snapshot of the current code at
>>> https://github.com/zrlio/softiwarp-for-linux-rdma.git
>>> within branch 'siw-for-rdma-next-v7'.
>>> 
>>> The matching siw user library is maintained at
>>> https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
>>> The relevant branch name is 'siw-for-rdma-next-v7'.
>>> 
>>> As always, I highly appreciate your feedback. Thanks
>>> very much for your time and help!
>> 
>> As before, I really want to see the various people stand up and say
>> this driver works, it passes their existing test suites (NFS, SRP,
>> iSER, NVMEOf, etc, etc)
>> 
>> I think that is the main remaning blocker to acceptance.
> 
> Hi Jason,
> 
> I'd like to provide my feedback about testing this code and running
> NFS over RDMA over the software iWarp. With much appreciated help from
> Bernard, I setup 2 CentOS 7.6 VMs and his v7 kernel branch. I
> successfully, ran NFS connectathon test suite, xfstests, and ran "make
> -j" compile of the linux kernel. Current code is useful for NFSoRDMA
> functional testing. From a very limited comparison timing study in all
> virtual environment, it is lacking a bit in performance compared to
> non-RDMA mount (but it's better than software RoCE).

Thanks for your thorough effort!

--
Chuck Lever
Sagi Grimberg April 25, 2019, 7:06 a.m. UTC | #8
>> Hi Jason,
>>
>> I'd like to provide my feedback about testing this code and running
>> NFS over RDMA over the software iWarp. With much appreciated help from
>> Bernard, I setup 2 CentOS 7.6 VMs and his v7 kernel branch. I
>> successfully, ran NFS connectathon test suite, xfstests, and ran "make
>> -j" compile of the linux kernel. Current code is useful for NFSoRDMA
>> functional testing. From a very limited comparison timing study in all
>> virtual environment, it is lacking a bit in performance compared to
>> non-RDMA mount (but it's better than software RoCE).
> 
> Excellent feed back, thank you.
> 
> Lets hear from NVMeof too please

I actually took a stab and gave this a test drive with nvme/rdma
and iser (thanks Steve for making our lives better with rdma tool add 
link support), think it was v6 though...

There were some strange debug messages overlooked IIRC, and there
were some error messages, but things worked so don't know what
to make of it.

Pretty much the same feedback here, very limited testing on my VMs shows:
- functionally works
- faster than rxe
- slower than non-rdma (which sorta makes sense I assume)
Bernard Metzler April 25, 2019, 9:15 a.m. UTC | #9
-----"Sagi Grimberg" <sagi@grimberg.me> wrote: -----

>To: "Jason Gunthorpe" <jgg@ziepe.ca>, "Olga Kornievskaia"
><aglo@umich.edu>
>From: "Sagi Grimberg" <sagi@grimberg.me>
>Date: 04/25/2019 09:07AM
>Cc: "Bernard Metzler" <bmt@zurich.ibm.com>, "linux-rdma"
><linux-rdma@vger.kernel.org>
>Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
>
>>> Hi Jason,
>>>
>>> I'd like to provide my feedback about testing this code and running
>>> NFS over RDMA over the software iWarp. With much appreciated help from
>>> Bernard, I setup 2 CentOS 7.6 VMs and his v7 kernel branch. I
>>> successfully, ran NFS connectathon test suite, xfstests, and ran "make
>>> -j" compile of the linux kernel. Current code is useful for NFSoRDMA
>>> functional testing. From a very limited comparison timing study in all
>>> virtual environment, it is lacking a bit in performance compared to
>>> non-RDMA mount (but it's better than software RoCE).
>> 
>> Excellent feed back, thank you.
>> 
>> Lets hear from NVMeof too please
>
>I actually took a stab and gave this a test drive with nvme/rdma
>and iser (thanks Steve for making our lives better with rdma tool add
>link support), think it was v6 though...
>
>There were some strange debug messages overlooked IIRC, and there
>were some error messages, but things worked so don't know what
>to make of it.
>
>Pretty much the same feedback here, very limited testing on my VMs shows:
>- functionally works
>- faster than rxe
>- slower than non-rdma (which sorta makes sense I assume)
>
>
Hi Sagi,

Many thanks for the feedback!

Performance has not been my main concern while re-trying for upstream
acceptance. I will look into perf tuning once we have it accepted.

One penalty we pay, for the sake of HW interoperability, is disabling
segmentation offload awareness on the sender side. While we could build
frames of up to 64k in one shot (having them segmented on the wire by the NIC)
and process them the same way in one shot on the target side,
we don't do so, since some target iWarp hardware cannot handle MPA
frames larger than the real MTU size. For siw-to-siw testing, we may switch
GSO awareness back on. Currently, this is a compile-time selection
only (since we abandoned all module parameters). Proposing another
extension of the netlink interface for passing such driver-private
parameters is on my todo list, but definitely not at the current stage.

In general, sitting on top of a kernel TCP socket and adding some protocol
overhead, including a 4-byte trailer checksum _after_ the data buffers,
comes with a penalty if the kernel application would otherwise use
the plain kernel TCP socket itself...
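
To put a number on the framing part of that overhead, here is a rough
sketch of the MPA FPDU size calculation. It assumes the usual MPA framing:
a 2-byte ULPDU length field, the ULPDU itself, 0 to 3 pad bytes to reach
4-byte alignment, and the 4-byte CRC trailer. Optional MPA markers are
ignored, and the ULPDU length is taken as given (including whatever
DDP/RDMAP headers it carries). The helper names are made up for this
example and do not appear in the driver.

```c
#include <stddef.h>

#define MPA_HDR_SIZE 2u  /* ULPDU length field in front of the ULPDU */
#define MPA_CRC_SIZE 4u  /* trailer checksum following data and padding */

/* Pad bytes needed so that length field + ULPDU end on a 4-byte boundary. */
static size_t mpa_pad_len(size_t ulpdu_len)
{
	return (4u - ((MPA_HDR_SIZE + ulpdu_len) & 3u)) & 3u;
}

/* Total bytes handed to TCP for one FPDU carrying ulpdu_len bytes. */
static size_t mpa_fpdu_len(size_t ulpdu_len)
{
	return MPA_HDR_SIZE + ulpdu_len + mpa_pad_len(ulpdu_len) + MPA_CRC_SIZE;
}

/* Framing bytes added by MPA on top of the ULPDU itself. */
static size_t mpa_overhead(size_t ulpdu_len)
{
	return mpa_fpdu_len(ulpdu_len) - ulpdu_len;
}
```

Under these assumptions the per-frame cost is between 6 and 9 bytes, so
keeping frames at MTU size instead of 64k mainly costs in the number of
frames to build, checksum and parse, not in bytes on the wire.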

The performance story might be different for user level applications,
which potentially benefit more from the asynchronous verbs interface.


I learned that Chelsio did some perf testing of NVMe-oF via siw against
iWarp HW themselves. They report line speed in a 100 Gb/s setup with siw
running on two clients talking to a T6 RNIC:
https://www.prnewswire.com/news-releases/chelsio-demonstrated-soft-iwarp-at-nvme-developer-days-300815249.html
and
https://www.chelsio.com/wp-content/uploads/resources/t6-100g-siw-nvmeof.pdf


Thanks,
Bernard.
Bernard Metzler April 25, 2019, 2:57 p.m. UTC | #10
-----linux-rdma-owner@vger.kernel.org wrote: -----

>To: "Bart Van Assche" <bvanassche@acm.org>
>From: "Bernard Metzler" 
>Sent by: linux-rdma-owner@vger.kernel.org
>Date: 04/23/2019 04:07PM
>Cc: linux-rdma@vger.kernel.org
>Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
>
>-----"Bart Van Assche" <bvanassche@acm.org> wrote: -----
>
>>To: "Bernard Metzler" <bmt@zurich.ibm.com>,
>>linux-rdma@vger.kernel.org
>>From: "Bart Van Assche" <bvanassche@acm.org>
>>Date: 04/22/2019 07:03PM
>>Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
>>
>>On Wed, 2019-04-17 at 17:00 +0200, Bernard Metzler wrote:
>>> We maintain a snapshot of the current code at
>>> https://github.com/zrlio/softiwarp-for-linux-rdma.git
>>> within branch 'siw-for-rdma-next-v7'.
>>
>>Hi Bernard,
>>
>>I had a look at that branch. What I found on that branch (compared to
>>Linus' master branch) is the following:
>>* Version 6 of the SIW patch series.
>>* A merge with Linus' v5.1-rc2 tag.
>>* A series of fixes for v6.
>>
>>That is not how patch series should be prepared. I think Jason expects
>>something like the following:
>>* git remote add rdma git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
>>* git branch --set-upstream-to=rdma/for-next
>>* git pull --rebase
>>
>>and next run git rebase -i rdma/for-next to apply the fixes to the patches
>>these are intended for. The patches in the branches of your github repo
>>should match what is posted on the linux-rdma mailing list.
>>
>>Thanks,
>>
>>Bart.
>
>Hi Bart,
>
>thanks a lot for clarifying this! And, sorry for the mess on that
>repo. I am going to fix it as suggested.
>

Things seem to be a little tricky, since rdma/for-next is of course
a moving target. I cannot simply rebase siw version 7 (the RFC I
sent last week) onto its current state: rdma/for-next has evolved
regarding core object management, and some of siw's verbs methods
no longer fit. Would it be a good idea to rebase it anyway
(for the record, since it won't compile), tag it 'RFC version 7',
and push the needed fixes on top of it to make v7 compile with current
rdma/for-next?

Unfortunately, some functionality needed by siw was not
part of rdma/for-next until recently (deselecting the TCP port
mapper, netlink extensions for device addition), so I worked on
my private repo, which pulled together the needed pieces from here
and there. Since everything needed is available in rdma/for-next now,
it is probably best that I restart with a rebase onto it and fix things up.

Thanks,
Bernard.
Sagi Grimberg April 25, 2019, 3:04 p.m. UTC | #11
> Hi Sagi,
> 
> Many thanks for the feedback!
> 
> Performance was not my main concern since re-trying for acceptance
> for upstream. I will look into perf tuning once we have it accepted.

Yeah, performance is less of a concern here. Good to see that it at least
seems functional, which is a good start.

The only weirdness I was seeing is messages that the ULP is producing
(which seem harmless):
[47932.678136] nvmet_rdma: post_recv cmd failed
[47932.681190] nvmet_rdma: sending cmd response failed
Jason Gunthorpe April 25, 2019, 3:19 p.m. UTC | #12
On Thu, Apr 25, 2019 at 02:57:15PM +0000, Bernard Metzler wrote:
> 
> >To: "Bart Van Assche" <bvanassche@acm.org>
> >From: "Bernard Metzler" 
> >Sent by: linux-rdma-owner@vger.kernel.org
> >Date: 04/23/2019 04:07PM
> >Cc: linux-rdma@vger.kernel.org
> >Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
> >
> >
> >>To: "Bernard Metzler" <bmt@zurich.ibm.com>,
> >>linux-rdma@vger.kernel.org
> >>From: "Bart Van Assche" <bvanassche@acm.org>
> >>Date: 04/22/2019 07:03PM
> >>Subject: Re: [PATCH v7 00/12] SIW: Request for Comments
> >>
> >>On Wed, 2019-04-17 at 17:00 +0200, Bernard Metzler wrote:
> >>> We maintain a snapshot of the current code at
> >>> https://github.com/zrlio/softiwarp-for-linux-rdma.git
> >>> within branch 'siw-for-rdma-next-v7'.
> >>
> >>Hi Bernard,
> >>
> >>I had a look at that branch. What I found on that branch (compared to
> >>Linus' master branch) is the following:
> >>* Version 6 of the SIW patch series.
> >>* A merge with Linus' v5.1-rc2 tag.
> >>* A series of fixes for v6.
> >>
> >>That is not how patch series should be prepared. I think Jason expects
> >>something like the following:
> >>* git remote add rdma git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
> >>* git branch --set-upstream-to=rdma/for-next
> >>* git pull --rebase
> >>
> >>and next run git rebase -i rdma/for-next to apply the fixes to the patches
> >>these are intended for. The patches in the branches of your github repo
> >>should match what is posted on the linux-rdma mailing list.
> >>
> >>Thanks,
> >>
> >>Bart.
> >
> >Hi Bart,
> >
> >thanks a lot for clarifying this! And, sorry for the mess on that
> >repo. I am going to fix it as suggested.
> >
> 
> Things seem to be a little tricky, since rdma/for-next of course
> is a moving target. So I cannot rebase siw version 7 (the RFC I
> sent last week) to its current status. rdma/for-next evolved
> regarding core object management, and some of siw's verbs method
> wouldn't fit anymore.

Okay, well, if it can't be applied I have to drop it from patchwork.
Please resend something that can be applied.

Jason
Raju Rangoju April 25, 2019, 6:52 p.m. UTC | #13
On Wednesday, April 24, 2019 at 21:51:13 +0530, Jason Gunthorpe wrote:
> On Wed, Apr 24, 2019 at 12:17:15PM -0400, Olga Kornievskaia wrote:
> > On Mon, Apr 22, 2019 at 12:48 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > >
> > > On Wed, Apr 17, 2019 at 05:00:39PM +0200, Bernard Metzler wrote:
> > > > This patch set contributes version 7 of the SoftiWarp
> > > > driver, as originally introduced to the list Oct 6th, 2017.
> > > > SoftiWarp (siw) implements the iWarp RDMA protocol over
> > > > kernel TCP sockets. The driver integrates with the
> > > > linux-rdma framework.
> > > >
> > > > Mainly in response to the various helpful feedback,
> > > > I fixed the following issues:
> > > >
> > > > 1. The code now relies on proper object management
> > > >    provided by the RDMA midlayer. With that, reference
> > > >    counting for PD's, CQ's and SRQ's got dropped.
> > > >    The corresponding files siw_obj.[ch] are removed.
> > > >
> > > > 2. The code now supports multiple user mmap operations
> > > >    of the same object (CQ, SQ, RQ, SRQ array) during
> > > >    its lifetime. To efficiently maintain the potentially
> > > >    large number of objects, those are now kept in a
> > > >    user context private cyclic xarray.
> > > >
> > > > 3. siw private memory access flags definition got dropped
> > > >    in favor of ib_access_flags.
> > > >
> > > > 4. Added code to consistently check complete STag
> > > >    during memory access - checking the user controlled
> > > >    8 bit 'key' field was inconsistent and partially
> > > >    missing.
> > > >
> > > > We maintain a snapshot of the current code at
> > > > https://github.com/zrlio/softiwarp-for-linux-rdma.git
> > > > within branch 'siw-for-rdma-next-v7'.
> > > >
> > > > The matching siw user library is maintained at
> > > > https://github.com/zrlio/softiwarp-user-for-linux-rdma.git.
> > > > The relevant branch name is 'siw-for-rdma-next-v7'.
> > > >
> > > > As always, I highly appreciate your feedback. Thanks
> > > > very much for your time and help!
> > >
> > > As before, I really want to see the various people stand up and say
> > > this driver works, it passes their existing test suites (NFS, SRP,
> > > iSER, NVMEOf, etc, etc)
> > >
> > > I think that is the main remaning blocker to acceptance.
> > 
> > Hi Jason,
> > 
> > I'd like to provide my feedback about testing this code and running
> > NFS over RDMA over the software iWarp. With much appreciated help from
> > Bernard, I setup 2 CentOS 7.6 VMs and his v7 kernel branch. I
> > successfully, ran NFS connectathon test suite, xfstests, and ran "make
> > -j" compile of the linux kernel. Current code is useful for NFSoRDMA
> > functional testing. From a very limited comparison timing study in all
> > virtual environment, it is lacking a bit in performance compared to
> > non-RDMA mount (but it's better than software RoCE).
> 
> Excellent feed back, thank you.
> 
> Lets hear from NVMeof too please
>

Hi Jason,

Chelsio was able to test SIW with NVMeoF on both the Initiator and
Target side. We can do more coverage with other products (iSER,
user-space apps, etc.) in the future.

-Raju

> Jason