RDMA/ipoib: distribute cq completion vector better

Message ID: 20201013074342.15867-1-jinpu.wang@cloud.ionos.com (mailing list archive)
State: Accepted
Delegated to: Jason Gunthorpe

Commit Message

Jinpu Wang Oct. 13, 2020, 7:43 a.m. UTC
Currently ipoib chooses the CQ completion vector based on the port
number. When the HCA has only one port, all the interfaces' receive
queue completions are bound to CQ completion vector 0.

To better distribute the load, use the same method as __ib_alloc_cq_any
to choose the completion vector. With this change, each interface now
uses a different completion vector.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
---
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
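
For context, the __ib_alloc_cq_any-style selection mentioned above is a
global round-robin over the device's completion vectors. A minimal sketch
of the idea (simplified; pick_comp_vector is an illustrative helper name,
not a real kernel symbol):

	/* Spread successive CQ allocations across the available vectors. */
	static int pick_comp_vector(struct ib_device *dev)
	{
		static atomic_t counter;

		if (dev->num_comp_vectors <= 1)
			return 0;

		return atomic_inc_return(&counter) %
		       min_t(int, dev->num_comp_vectors, num_online_cpus());
	}

The patch below applies the same atomic-counter idea directly inside
ipoib_transport_dev_init() instead of deriving the vector from the port
number.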

Comments

Jason Gunthorpe Oct. 28, 2020, 1:44 p.m. UTC | #1
On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> Currently ipoib chooses the CQ completion vector based on the port
> number. When the HCA has only one port, all the interfaces' receive
> queue completions are bound to CQ completion vector 0.
>
> To better distribute the load, use the same method as __ib_alloc_cq_any
> to choose the completion vector. With this change, each interface now
> uses a different completion vector.
> 
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

If you care about IPoIB performance you should be using the
accelerated IPoIB stuff; the drivers implementing that all provide
much better handling of CQ affinity.

In what scenario does this patch make a difference?

Jason
Jinpu Wang Oct. 28, 2020, 1:48 p.m. UTC | #2
On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > Currently ipoib chooses the CQ completion vector based on the port
> > number. When the HCA has only one port, all the interfaces' receive
> > queue completions are bound to CQ completion vector 0.
> >
> > To better distribute the load, use the same method as __ib_alloc_cq_any
> > to choose the completion vector. With this change, each interface now
> > uses a different completion vector.
> >
> > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > ---
> >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> If you care about IPoIB performance you should be using the
> accelerated IPoIB stuff; the drivers implementing that all provide
> much better handling of CQ affinity.
>
> In what scenario does this patch make a difference?

AFAIK the enhanced mode is only for datagram mode; we are using connected mode.

Thanks!
Jason Gunthorpe Oct. 28, 2020, 1:53 p.m. UTC | #3
On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > Currently ipoib chooses the CQ completion vector based on the port
> > > number. When the HCA has only one port, all the interfaces' receive
> > > queue completions are bound to CQ completion vector 0.
> > >
> > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > to choose the completion vector. With this change, each interface now
> > > uses a different completion vector.
> > >
> > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > If you care about IPoIB performance you should be using the
> > accelerated IPoIB stuff; the drivers implementing that all provide
> > much better handling of CQ affinity.
> >
> > In what scenario does this patch make a difference?
>
> AFAIK the enhanced mode is only for datagram mode; we are using connected mode.

And you are using child interfaces or multiple cards?

Jason
Jinpu Wang Oct. 28, 2020, 1:57 p.m. UTC | #4
On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > Currently ipoib chooses the CQ completion vector based on the port
> > > > number. When the HCA has only one port, all the interfaces' receive
> > > > queue completions are bound to CQ completion vector 0.
> > > >
> > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > to choose the completion vector. With this change, each interface now
> > > > uses a different completion vector.
> > > >
> > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > If you care about IPoIB performance you should be using the
> > > accelerated IPoIB stuff; the drivers implementing that all provide
> > > much better handling of CQ affinity.
> > >
> > > In what scenario does this patch make a difference?
> >
> > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
>
> And you are using child interfaces or multiple cards?
We are using multiple child interfaces on a ConnectX-5 HCA (MCX556A-ECAT):
ibstat
CA 'mlx5_0'
	CA type: MT4119
	Number of ports: 1
	Firmware version: 16.28.2006
	Hardware version: 0
	Node GUID: 0x98039b03006c7912
	System image GUID: 0x98039b03006c7912
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 40
		Base lid: 38
		LMC: 0
		SM lid: 4
		Capability mask: 0x2651e848
		Port GUID: 0x98039b03006c7912
		Link layer: InfiniBand
CA 'mlx5_1'
	CA type: MT4119
	Number of ports: 1
	Firmware version: 16.28.2006
	Hardware version: 0
	Node GUID: 0x98039b03006c7913
	System image GUID: 0x98039b03006c7912
	Port 1:
		State: Active
		Physical state: LinkUp
		Rate: 40
		Base lid: 134
		LMC: 0
		SM lid: 224
		Capability mask: 0x2651e848
		Port GUID: 0x98039b03006c7913
		Link layer: InfiniBand

>
> Jason
Jason Gunthorpe Oct. 28, 2020, 1:59 p.m. UTC | #5
On Wed, Oct 28, 2020 at 02:57:57PM +0100, Jinpu Wang wrote:
> On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >
> > > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > > Currently ipoib chooses the CQ completion vector based on the port
> > > > > number. When the HCA has only one port, all the interfaces' receive
> > > > > queue completions are bound to CQ completion vector 0.
> > > > >
> > > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > > to choose the completion vector. With this change, each interface now
> > > > > uses a different completion vector.
> > > > >
> > > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > If you care about IPoIB performance you should be using the
> > > > accelerated IPoIB stuff; the drivers implementing that all provide
> > > > much better handling of CQ affinity.
> > > >
> > > > In what scenario does this patch make a difference?
> > >
> > > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
> >
> > And you are using child interfaces or multiple cards?
> We are using multiple child interfaces on a ConnectX-5 HCA
> (MCX556A-ECAT)

Ok.. Do you think it would be better to change IPoIB to use the new
shared CQ pool API?

Jason
Jinpu Wang Oct. 28, 2020, 2:08 p.m. UTC | #6
On Wed, Oct 28, 2020 at 2:59 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Oct 28, 2020 at 02:57:57PM +0100, Jinpu Wang wrote:
> > On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > > > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > > >
> > > > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > > > Currently ipoib chooses the CQ completion vector based on the port
> > > > > > number. When the HCA has only one port, all the interfaces' receive
> > > > > > queue completions are bound to CQ completion vector 0.
> > > > > >
> > > > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > > > to choose the completion vector. With this change, each interface now
> > > > > > uses a different completion vector.
> > > > > >
> > > > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > > >  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > If you care about IPoIB performance you should be using the
> > > > > accelerated IPoIB stuff; the drivers implementing that all provide
> > > > > much better handling of CQ affinity.
> > > > >
> > > > > In what scenario does this patch make a difference?
> > > >
> > > > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
> > >
> > > And you are using child interfaces or multiple cards?
> > We are using multiple child interfaces on a ConnectX-5 HCA
> > (MCX556A-ECAT)
>
> Ok.. Do you think it would be better to change IPoIB to use the new
> shared CQ pool API?
>
> Jason
I thought about it, but the CQ pool API doesn't work with IB_POLL_DIRECT
and ipoib uses NAPI, so converting to the CQ pool API is non-trivial.
I'm not confident I could do it properly.
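
For reference, the shared CQ pool API being discussed is used roughly as
follows (a minimal sketch; ca and nr_cqe are placeholder variables, and the
details reflect the API as merged around v5.8, so treat them as
illustrative). The pool hands out CQs polled in softirq or workqueue
context only, which is why a ULP that drives its own polling from NAPI,
as ipoib does, cannot adopt it directly:

	struct ib_cq *cq;

	/* A negative comp_vector_hint lets the core spread CQs itself. */
	cq = ib_cq_pool_get(ca, nr_cqe, -1, IB_POLL_SOFTIRQ);
	if (IS_ERR(cq))
		return PTR_ERR(cq);

	/* ... post WRs; completions arrive via wr_cqe->done callbacks ... */

	ib_cq_pool_put(cq, nr_cqe);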
Jason Gunthorpe Nov. 20, 2020, 8:28 p.m. UTC | #7
On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> Currently ipoib chooses the CQ completion vector based on the port
> number. When the HCA has only one port, all the interfaces' receive
> queue completions are bound to CQ completion vector 0.
>
> To better distribute the load, use the same method as __ib_alloc_cq_any
> to choose the completion vector. With this change, each interface now
> uses a different completion vector.
> 
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Applied to for-next, thanks

Jason

Patch

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 587252fd6f57..5a150a080ac2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -158,6 +158,7 @@  int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
 
 	int ret, size, req_vec;
 	int i;
+	static atomic_t counter;
 
 	size = ipoib_recvq_size + 1;
 	ret = ipoib_cm_dev_init(dev);
@@ -171,8 +172,7 @@  int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
 		if (ret != -EOPNOTSUPP)
 			return ret;
 
-	req_vec = (priv->port - 1) * 2;
-
+	req_vec = atomic_inc_return(&counter) * 2;
 	cq_attr.cqe = size;
 	cq_attr.comp_vector = req_vec % priv->ca->num_comp_vectors;
 	priv->recv_cq = ib_create_cq(priv->ca, ipoib_ib_rx_completion, NULL,
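
A quick worked example of the new selection (assuming a device with
num_comp_vectors = 8 and no other interface having bumped the counter yet):
the first interface initialized gets req_vec = 2 and therefore receive-CQ
comp_vector 2, the second gets 4, the third 6, the fourth wraps to 0, and
so on. With the old (priv->port - 1) * 2 formula, every interface on a
single-port HCA ended up on comp_vector 0.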