RDMA/ipoib: distribute cq completion vector better

Message ID: 20201013074342.15867-1-jinpu.wang@cloud.ionos.com (mailing list archive)
State: Accepted
Delegated to: Jason Gunthorpe

Commit Message

Jinpu Wang Oct. 13, 2020, 7:43 a.m. UTC
Currently ipoib chooses the CQ completion vector based on the port
number. When the HCA has only one port, all the interfaces' receive
queue completions are bound to CQ completion vector 0.

To better distribute the load, use the same method as __ib_alloc_cq_any
to choose the completion vector. With this change, each interface now
uses a different completion vector.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
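
For context, the __ib_alloc_cq_any-style selection mentioned above is a
global round-robin over the device's completion vectors. A minimal sketch
of the idea (simplified; pick_comp_vector is an illustrative helper name,
not a real kernel symbol):

	/* Spread successive CQ allocations across the available vectors. */
	static int pick_comp_vector(struct ib_device *dev)
	{
		static atomic_t counter;

		if (dev->num_comp_vectors <= 1)
			return 0;

		return atomic_inc_return(&counter) %
		       min_t(int, dev->num_comp_vectors, num_online_cpus());
	}

The patch below applies the same atomic-counter idea directly inside
ipoib_transport_dev_init() instead of deriving the vector from the port
number.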

Comments

Jason Gunthorpe Oct. 28, 2020, 1:44 p.m. UTC | #1
On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> Currently ipoib chooses the CQ completion vector based on the port
> number. When the HCA has only one port, all the interfaces' receive
> queue completions are bound to CQ completion vector 0.
>
> To better distribute the load, use the same method as __ib_alloc_cq_any
> to choose the completion vector. With this change, each interface now
> uses a different completion vector.
> 
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

If you care about IPoIB performance you should be using the
accelerated IPoIB stuff; the drivers implementing that all provide
much better handling of CQ affinity.

In what scenario does this patch make a difference?

Jason
Jinpu Wang Oct. 28, 2020, 1:48 p.m. UTC | #2
On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > Currently ipoib chooses the CQ completion vector based on the port
> > number. When the HCA has only one port, all the interfaces' receive
> > queue completions are bound to CQ completion vector 0.
> >
> > To better distribute the load, use the same method as __ib_alloc_cq_any
> > to choose the completion vector. With this change, each interface now
> > uses a different completion vector.
> >
> > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > ---
> >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> If you care about IPoIB performance you should be using the
> accelerated IPoIB stuff; the drivers implementing that all provide
> much better handling of CQ affinity.
>
> In what scenario does this patch make a difference?

AFAIK the enhanced mode is only for datagram mode; we are using connected mode.

Thanks!
Jason Gunthorpe Oct. 28, 2020, 1:53 p.m. UTC | #3
On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > Currently ipoib chooses the CQ completion vector based on the port
> > > number. When the HCA has only one port, all the interfaces' receive
> > > queue completions are bound to CQ completion vector 0.
> > >
> > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > to choose the completion vector. With this change, each interface now
> > > uses a different completion vector.
> > >
> > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > If you care about IPoIB performance you should be using the
> > accelerated IPoIB stuff; the drivers implementing that all provide
> > much better handling of CQ affinity.
> >
> > In what scenario does this patch make a difference?
>
> AFAIK the enhanced mode is only for datagram mode; we are using connected mode.

And you are using child interfaces or multiple cards?

Jason
Jinpu Wang Oct. 28, 2020, 1:57 p.m. UTC | #4
On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > Currently ipoib chooses the CQ completion vector based on the port
> > > > number. When the HCA has only one port, all the interfaces' receive
> > > > queue completions are bound to CQ completion vector 0.
> > > >
> > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > to choose the completion vector. With this change, each interface now
> > > > uses a different completion vector.
> > > >
> > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > If you care about IPoIB performance you should be using the
> > > accelerated IPoIB stuff; the drivers implementing that all provide
> > > much better handling of CQ affinity.
> > >
> > > In what scenario does this patch make a difference?
> >
> > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
>
> And you are using child interfaces or multiple cards?
We are using multiple child interfaces on a ConnectX-5 HCA (MCX556A-ECAT):
ibstat
CA 'mlx5_0'
	CA type: MT4119
	Number of ports: 1
	Firmware version: 16.28.2006
	Hardware version: 0
	Node GUID: 0x98039b03006c7912
	System image GUID: 0x98039b03006c7912
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 40
		Base lid: 38
		LMC: 0
		SM lid: 4
		Capability mask: 0x2651e848
		Port GUID: 0x98039b03006c7912
		Link layer: InfiniBand
CA 'mlx5_1'
	CA type: MT4119
	Number of ports: 1
	Firmware version: 16.28.2006
	Hardware version: 0
	Node GUID: 0x98039b03006c7913
	System image GUID: 0x98039b03006c7912
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 40
		Base lid: 134
		LMC: 0
		SM lid: 224
		Capability mask: 0x2651e848
		Port GUID: 0x98039b03006c7913
		Link layer: InfiniBand

>
> Jason
Jason Gunthorpe Oct. 28, 2020, 1:59 p.m. UTC | #5
On Wed, Oct 28, 2020 at 02:57:57PM +0100, Jinpu Wang wrote:
> On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >
> > > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > > Currently ipoib chooses the CQ completion vector based on the port
> > > > > number. When the HCA has only one port, all the interfaces' receive
> > > > > queue completions are bound to CQ completion vector 0.
> > > > >
> > > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > > to choose the completion vector. With this change, each interface now
> > > > > uses a different completion vector.
> > > > >
> > > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > If you care about IPoIB performance you should be using the
> > > > accelerated IPoIB stuff; the drivers implementing that all provide
> > > > much better handling of CQ affinity.
> > > >
> > > > In what scenario does this patch make a difference?
> > >
> > > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
> >
> > And you are using child interfaces or multiple cards?
> We are using multiple child interfaces on a ConnectX-5 HCA
> (MCX556A-ECAT)

Ok.. Do you think it would be better to change IPoIB to use the new
shared CQ pool API?

Jason
Jinpu Wang Oct. 28, 2020, 2:08 p.m. UTC | #6
On Wed, Oct 28, 2020 at 2:59 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Oct 28, 2020 at 02:57:57PM +0100, Jinpu Wang wrote:
> > On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > > > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > > >
> > > > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > > > Currently ipoib chooses the CQ completion vector based on the port
> > > > > > number. When the HCA has only one port, all the interfaces' receive
> > > > > > queue completions are bound to CQ completion vector 0.
> > > > > >
> > > > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > > > to choose the completion vector. With this change, each interface now
> > > > > > uses a different completion vector.
> > > > > >
> > > > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > If you care about IPoIB performance you should be using the
> > > > > accelerated IPoIB stuff; the drivers implementing that all provide
> > > > > much better handling of CQ affinity.
> > > > >
> > > > > In what scenario does this patch make a difference?
> > > >
> > > > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
> > >
> > > And you are using child interfaces or multiple cards?
> > We are using multiple child interfaces on a ConnectX-5 HCA
> > (MCX556A-ECAT)
>
> Ok.. Do you think it would be better to change IPoIB to use the new
> shared CQ pool API?
>
> Jason
I thought about it, but the CQ pool API doesn't work with IB_POLL_DIRECT
and ipoib uses NAPI, so converting to the CQ pool API is non-trivial.
I'm not confident I could do it properly.
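
For reference, the shared CQ pool API being discussed is used roughly as
follows (a minimal sketch; ca and nr_cqe are placeholder variables, and the
details reflect the API as merged around v5.8, so treat them as
illustrative). The pool hands out CQs polled in softirq or workqueue
context only, which is why a ULP that drives its own polling from NAPI,
as ipoib does, cannot adopt it directly:

	struct ib_cq *cq;

	/* A negative comp_vector_hint lets the core spread CQs itself. */
	cq = ib_cq_pool_get(ca, nr_cqe, -1, IB_POLL_SOFTIRQ);
	if (IS_ERR(cq))
		return PTR_ERR(cq);

	/* ... post WRs; completions arrive via wr_cqe->done callbacks ... */

	ib_cq_pool_put(cq, nr_cqe);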
Jason Gunthorpe Nov. 20, 2020, 8:28 p.m. UTC | #7
On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> Currently ipoib chooses the CQ completion vector based on the port
> number. When the HCA has only one port, all the interfaces' receive
> queue completions are bound to CQ completion vector 0.
>
> To better distribute the load, use the same method as __ib_alloc_cq_any
> to choose the completion vector. With this change, each interface now
> uses a different completion vector.
> 
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied to for-next, thanks

Jason

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 587252fd6f57..5a150a080ac2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -158,6 +158,7 @@  int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
 
 	int ret, size, req_vec;
 	int i;
+	static atomic_t counter;
 
 	size = ipoib_recvq_size + 1;
 	ret = ipoib_cm_dev_init(dev);
@@ -171,8 +172,7 @@  int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
 		if (ret != -EOPNOTSUPP)
 			return ret;
 
-	req_vec = (priv->port - 1) * 2;
-
+	req_vec = atomic_inc_return(&counter) * 2;
 	cq_attr.cqe = size;
 	cq_attr.comp_vector = req_vec % priv->ca->num_comp_vectors;
 	priv->recv_cq = ib_create_cq(priv->ca, ipoib_ib_rx_completion, NULL,
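
A quick worked example of the new selection (assuming a device with
num_comp_vectors = 8 and no other interface having bumped the counter yet):
the first interface initialized gets req_vec = 2 and therefore receive-CQ
comp_vector 2, the second gets 4, the third 6, the fourth wraps to 0, and
so on. With the old (priv->port - 1) * 2 formula, every interface on a
single-port HCA ended up on comp_vector 0.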