Message ID | 20201013074342.15867-1-jinpu.wang@cloud.ionos.com |
---|---|
State | Accepted |
Delegated to: | Jason Gunthorpe |
Series | RDMA/ipoib: distribute cq completion vector better |
On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> Currently ipoib chooses the cq completion vector based on the port number;
> when the HCA has only one port, all the interface recv queue completions
> are bound to cq completion vector 0.
>
> To better distribute the load, use the same method as __ib_alloc_cq_any
> to choose the completion vector; with the change, each interface now uses
> a different completion vector.
>
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> ---
> drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

If you care about IPoIB performance you should be using the accelerated
IPoIB stuff; the drivers implementing that all provide much better
handling of CQ affinity.

In what scenario does this patch make a difference?

Jason
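For reference, the selection scheme the commit message borrows from __ib_alloc_cq_any boils down to a global round-robin over the device's completion vectors. A minimal sketch of the idea follows; it is simplified, and pick_comp_vector / comp_vector_counter are illustrative names, not the actual kernel symbols:

#include <linux/atomic.h>
#include <rdma/ib_verbs.h>

/* One counter shared by all callers, bumped once per allocation. */
static atomic_t comp_vector_counter;

/*
 * Spread consecutive CQ allocations across the device's completion
 * vectors instead of pinning them all to the same one (a simplified
 * version of what __ib_alloc_cq_any does).
 */
static int pick_comp_vector(struct ib_device *dev)
{
        return atomic_inc_return(&comp_vector_counter) %
               dev->num_comp_vectors;
}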
On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > Currently ipoib chooses the cq completion vector based on the port number;
> > when the HCA has only one port, all the interface recv queue completions
> > are bound to cq completion vector 0.
> >
> > To better distribute the load, use the same method as __ib_alloc_cq_any
> > to choose the completion vector; with the change, each interface now uses
> > a different completion vector.
> >
> > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > ---
> > drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> If you care about IPoIB performance you should be using the accelerated
> IPoIB stuff; the drivers implementing that all provide much better
> handling of CQ affinity.
>
> In what scenario does this patch make a difference?

AFAIK the enhanced mode is only for datagram mode; we are using connected mode.

Thanks!
On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > Currently ipoib chooses the cq completion vector based on the port number;
> > > when the HCA has only one port, all the interface recv queue completions
> > > are bound to cq completion vector 0.
> > >
> > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > to choose the completion vector; with the change, each interface now uses
> > > a different completion vector.
> > >
> > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > If you care about IPoIB performance you should be using the accelerated
> > IPoIB stuff; the drivers implementing that all provide much better
> > handling of CQ affinity.
> >
> > In what scenario does this patch make a difference?
>
> AFAIK the enhanced mode is only for datagram mode; we are using connected mode.

And you are using child interfaces or multiple cards?

Jason
On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > Currently ipoib chooses the cq completion vector based on the port number;
> > > > when the HCA has only one port, all the interface recv queue completions
> > > > are bound to cq completion vector 0.
> > > >
> > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > to choose the completion vector; with the change, each interface now uses
> > > > a different completion vector.
> > > >
> > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > If you care about IPoIB performance you should be using the accelerated
> > > IPoIB stuff; the drivers implementing that all provide much better
> > > handling of CQ affinity.
> > >
> > > In what scenario does this patch make a difference?
> >
> > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
>
> And you are using child interfaces or multiple cards?
we are using multiple child interfaces on a ConnectX-5 HCA
(MCX556A-ECAT)

ibstat
CA 'mlx5_0'
        CA type: MT4119
        Number of ports: 1
        Firmware version: 16.28.2006
        Hardware version: 0
        Node GUID: 0x98039b03006c7912
        System image GUID: 0x98039b03006c7912
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 38
                LMC: 0
                SM lid: 4
                Capability mask: 0x2651e848
                Port GUID: 0x98039b03006c7912
                Link layer: InfiniBand
CA 'mlx5_1'
        CA type: MT4119
        Number of ports: 1
        Firmware version: 16.28.2006
        Hardware version: 0
        Node GUID: 0x98039b03006c7913
        System image GUID: 0x98039b03006c7912
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 40
                Base lid: 134
                LMC: 0
                SM lid: 224
                Capability mask: 0x2651e848
                Port GUID: 0x98039b03006c7913
                Link layer: InfiniBand
>
> Jason
On Wed, Oct 28, 2020 at 02:57:57PM +0100, Jinpu Wang wrote:
> On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> > On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >
> > > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > > Currently ipoib chooses the cq completion vector based on the port number;
> > > > > when the HCA has only one port, all the interface recv queue completions
> > > > > are bound to cq completion vector 0.
> > > > >
> > > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > > to choose the completion vector; with the change, each interface now uses
> > > > > a different completion vector.
> > > > >
> > > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > > drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > If you care about IPoIB performance you should be using the accelerated
> > > > IPoIB stuff; the drivers implementing that all provide much better
> > > > handling of CQ affinity.
> > > >
> > > > In what scenario does this patch make a difference?
> > >
> > > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
> >
> > And you are using child interfaces or multiple cards?
> we are using multiple child interfaces on a ConnectX-5 HCA
> (MCX556A-ECAT)

Ok.. Do you think it would be better to change IPoIB to use the new
shared CQ pool API?

Jason
On Wed, Oct 28, 2020 at 2:59 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Wed, Oct 28, 2020 at 02:57:57PM +0100, Jinpu Wang wrote:
> > On Wed, Oct 28, 2020 at 2:53 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > On Wed, Oct 28, 2020 at 02:48:01PM +0100, Jinpu Wang wrote:
> > > > On Wed, Oct 28, 2020 at 2:44 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > > >
> > > > > On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> > > > > > Currently ipoib chooses the cq completion vector based on the port number;
> > > > > > when the HCA has only one port, all the interface recv queue completions
> > > > > > are bound to cq completion vector 0.
> > > > > >
> > > > > > To better distribute the load, use the same method as __ib_alloc_cq_any
> > > > > > to choose the completion vector; with the change, each interface now uses
> > > > > > a different completion vector.
> > > > > >
> > > > > > Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> > > > > > Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> > > > > > drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> > > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > If you care about IPoIB performance you should be using the accelerated
> > > > > IPoIB stuff; the drivers implementing that all provide much better
> > > > > handling of CQ affinity.
> > > > >
> > > > > In what scenario does this patch make a difference?
> > > >
> > > > AFAIK the enhanced mode is only for datagram mode; we are using connected mode.
> > >
> > > And you are using child interfaces or multiple cards?
> > we are using multiple child interfaces on a ConnectX-5 HCA
> > (MCX556A-ECAT)
>
> Ok.. Do you think it would be better to change IPoIB to use the new
> shared CQ pool API?
>
> Jason
I thought about it, but the CQ pool API doesn't work with IB_POLL_DIRECT;
ipoib uses NAPI, and it's non-trivial to convert it to the CQ pool API.
I'm not confident I could do it properly.
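For context, the shared CQ pool API discussed above is the ib_cq_pool_get()/ib_cq_pool_put() pair, which only hands out CQs polled by the core (softirq/workqueue style contexts); per the thread, IB_POLL_DIRECT, which ipoib's NAPI polling relies on, is not supported there. A rough usage sketch under those assumptions; the example_* wrappers and the specific arguments are hypothetical, not ipoib code:

#include <rdma/ib_verbs.h>

/*
 * Grab a CQ with room for nr_cqe entries from the device-wide pool.
 * A negative comp_vector hint lets the core pick a vector itself;
 * only core-polled contexts such as IB_POLL_SOFTIRQ are usable here.
 */
static struct ib_cq *example_get_cq(struct ib_device *dev, unsigned int nr_cqe)
{
        return ib_cq_pool_get(dev, nr_cqe, -1, IB_POLL_SOFTIRQ);
}

/* Return the CQ (and its nr_cqe reservation) to the pool when done. */
static void example_put_cq(struct ib_cq *cq, unsigned int nr_cqe)
{
        ib_cq_pool_put(cq, nr_cqe);
}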
On Tue, Oct 13, 2020 at 09:43:42AM +0200, Jack Wang wrote:
> Currently ipoib chooses the cq completion vector based on the port number;
> when the HCA has only one port, all the interface recv queue completions
> are bound to cq completion vector 0.
>
> To better distribute the load, use the same method as __ib_alloc_cq_any
> to choose the completion vector; with the change, each interface now uses
> a different completion vector.
>
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
> ---
> drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Applied to for-next, thanks

Jason
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
index 587252fd6f57..5a150a080ac2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_verbs.c
@@ -158,6 +158,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
 	int ret, size, req_vec;
 	int i;
+	static atomic_t counter;
 
 	size = ipoib_recvq_size + 1;
 	ret = ipoib_cm_dev_init(dev);
@@ -171,8 +172,7 @@ int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca)
 		if (ret != -EOPNOTSUPP)
 			return ret;
 
-	req_vec = (priv->port - 1) * 2;
-
+	req_vec = atomic_inc_return(&counter) * 2;
 	cq_attr.cqe = size;
 	cq_attr.comp_vector = req_vec % priv->ca->num_comp_vectors;
 	priv->recv_cq = ib_create_cq(priv->ca, ipoib_ib_rx_completion, NULL,
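To see why the one-line change matters on a single-port HCA, here is a small stand-alone demonstration of how the two formulas map successive child interfaces to receive-CQ vectors. The numbers are assumptions for illustration: eight completion vectors, four child interfaces; the factor of two presumably leaves the adjacent vector free for the send CQ.

#include <stdio.h>

int main(void)
{
        int num_comp_vectors = 8;       /* assumed HCA vector count */
        int port = 1;                   /* single-port HCA */
        int counter = 0;                /* mimics the new static atomic_t */

        for (int i = 1; i <= 4; i++) {
                /* old scheme: every interface on port 1 computes the same vector */
                int old_vec = ((port - 1) * 2) % num_comp_vectors;
                /* new scheme: the shared counter spreads interfaces across vectors */
                int new_vec = (++counter * 2) % num_comp_vectors;

                printf("interface %d: old recv comp_vector=%d, new recv comp_vector=%d\n",
                       i, old_vec, new_vec);
        }
        return 0;
}

With these assumed values the old formula pins every interface to vector 0, while the new one yields vectors 2, 4, 6, 0 and so distributes the receive completion load.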