diff mbox

[rdma-rc] RDMA/core: Do not use invalid destination in determining port reuse

Message ID 20180305175906.6416-1-shiraz.saleem@intel.com (mailing list archive)
State Superseded
Headers show

Commit Message

Saleem, Shiraz March 5, 2018, 5:59 p.m. UTC
From: Tatyana Nikolova <tatyana.e.nikolova@intel.com>

cma_port_is_unique() allows local port reuse if the quad (source
address and port, destination address and port) for this connection
is unique. However, if the destination info is zero or unspecified, it
can't make a correct decision but still allows port reuse. For example,
sometimes rdma_bind_addr() is called with unspecified destination and
reusing the port can lead to creating a connection with a duplicate quad,
after the destination is resolved. The issue manifests when MPI scale-up
tests hang after the duplicate quad is used.

Add checks for unspecified or zero destination address and port to
prevent source port reuse based on invalid destination.

Fixes: 19b752a19dce ("IB/cma: Allow port reuse for rdma_id")
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
---
 drivers/infiniband/core/cma.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

Comments

Jason Gunthorpe March 6, 2018, 11:24 p.m. UTC | #1
On Mon, Mar 05, 2018 at 11:59:06AM -0600, Shiraz Saleem wrote:
> From: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> 
> cma_port_is_unique() allows local port reuse if the quad (source
> address and port, destination address and port) for this connection
> is unique. However, if the destination info is zero or unspecified, it
> can't make a correct decision but still allows port reuse. For example,
> sometimes rdma_bind_addr() is called with unspecified destination and
> reusing the port can lead to creating a connection with a duplicate quad,
> after the destination is resolved. The issue manifests when MPI scale-up
> tests hang after the duplicate quad is used.
> 
> Add checks for unspecified or zero destination address and port to
> prevent source port reuse based on invalid destination.
> 
> Fixes: 19b752a19dce ("IB/cma: Allow port reuse for rdma_id")
> Reviewed-by: Sean Hefty <sean.hefty@intel.com>
> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
>  drivers/infiniband/core/cma.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index 6294a70..bb8f0e2 100644
> +++ b/drivers/infiniband/core/cma.c
> @@ -1047,6 +1047,8 @@ EXPORT_SYMBOL(rdma_init_qp_attr);
>  static inline int cma_zero_addr(struct sockaddr *addr)
>  {
>  	switch (addr->sa_family) {
> +	case AF_UNSPEC:
> +		return 1;

I can't figure out why this is here?

I thought an 'unspecified destination' shouldn't be AF_UNSPEC. It is
AF_INETxx and a zero address in CMA? Where can AF_UNSPEC be set that
trips this up?

It seems really wierd to call AF_UNSPEC 'zero'..

And if you change this then it means cma_any_addr() becomes true for
AF_UNSPEC.. And that seems like it would have impact beyond just
cma_port_is_unique,

eg here:

static bool cma_match_private_data(struct rdma_id_private *id_priv,
				   const struct cma_hdr *hdr)
{
	if (cma_any_addr(addr) && !id_priv->afonly)
		return true;

Now AF_UNSPEC returns true, where before it was false.. Seems
concerning.

This really needs more explanation please.

>  		/* different dest port -> unique */
> -		if (!cma_any_port(cur_daddr) &&
> +		if (!cma_any_port(daddr) &&
> +		    !cma_any_port(cur_daddr) &&
>  		    (dport != cur_dport))
>  			continue;

> +		/* different dst address -> unique */
> +		if (!cma_any_addr(daddr) &&
> +		    !cma_any_addr(cur_daddr) &&
> +		    cma_addr_cmp(daddr, cur_daddr))
> +			continue;

Why re-order so much stuff? The diff could be much smaller.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean March 7, 2018, 11:52 p.m. UTC | #2
> > diff --git a/drivers/infiniband/core/cma.c
> > b/drivers/infiniband/core/cma.c index 6294a70..bb8f0e2 100644
> > +++ b/drivers/infiniband/core/cma.c
> > @@ -1047,6 +1047,8 @@ EXPORT_SYMBOL(rdma_init_qp_attr);  static
> inline
> > int cma_zero_addr(struct sockaddr *addr)  {
> >  	switch (addr->sa_family) {
> > +	case AF_UNSPEC:
> > +		return 1;
> 
> I can't figure out why this is here?
> 
> I thought an 'unspecified destination' shouldn't be AF_UNSPEC. It is
> AF_INETxx and a zero address in CMA? Where can AF_UNSPEC be set that
> trips this up?

At the time rdma_bind_addr() is called, the destination address of the rdma_cm_id is unset (sa_family = UNSPEC).  The daddr->sa_family isn't set until the end of bind.


> It seems really wierd to call AF_UNSPEC 'zero'..

The check for UNSPEC can move out of cma_zero_addr() into cma_any_addr(), but that doesn't address the next concern...


> And if you change this then it means cma_any_addr() becomes true for
> AF_UNSPEC.. And that seems like it would have impact beyond just
> cma_port_is_unique,
> 
> eg here:
> 
> static bool cma_match_private_data(struct rdma_id_private *id_priv,
> 				   const struct cma_hdr *hdr)
> {
> 	if (cma_any_addr(addr) && !id_priv->afonly)
> 		return true;
> 
> Now AF_UNSPEC returns true, where before it was false.. Seems
> concerning.
> 
> This really needs more explanation please.

The behavior of cma_any_addr() and cma_any_port() is different for UNSPEC.  cma_any_addr() returns false, but cma_any_port() returns true.  The latter results from cma_port() returning 0 in the case of UNSPEC (or any unhandled address format).  For consistency, cma_zero_addr() was modified to align with cma_port(), which also makes any_addr/port consistent.

In most of the cases I checked, passing in an address with sa_family set to UNSPEC to cma_any_addr() should not occur.  The port available check is one (valid) case where it can occur.  The cma_match_private_data() call referenced above _should_ always have a valid sa_family set for the source address, for example.

The only place that it looks like cma_any_addr() might be called with sa_family = UNSPEC is cma_set_mgid()/cma_iboe_set_mgid().  And I'm not sure the existing code through those paths is correct.  If the address is UNSPEC, then the existing code will default to the AF_INET case.

The librdmacm checks for UNSPEC in multicast join operations, so we may never be hitting this case in practice.  But I think a user space app could invoke this code path.  I think the result of that would just be the formation of a gid with 4 bytes of 0's in it.  In any case, the proposed changes would modify this path and convert UNSPEC to an mgid = 0.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 8, 2018, 12:01 a.m. UTC | #3
On Wed, Mar 07, 2018 at 11:52:28PM +0000, Hefty, Sean wrote:
> > > diff --git a/drivers/infiniband/core/cma.c
> > > b/drivers/infiniband/core/cma.c index 6294a70..bb8f0e2 100644
> > > +++ b/drivers/infiniband/core/cma.c
> > > @@ -1047,6 +1047,8 @@ EXPORT_SYMBOL(rdma_init_qp_attr);  static
> > inline
> > > int cma_zero_addr(struct sockaddr *addr)  {
> > >  	switch (addr->sa_family) {
> > > +	case AF_UNSPEC:
> > > +		return 1;
> > 
> > I can't figure out why this is here?
> > 
> > I thought an 'unspecified destination' shouldn't be AF_UNSPEC. It is
> > AF_INETxx and a zero address in CMA? Where can AF_UNSPEC be set that
> > trips this up?
> 
> At the time rdma_bind_addr() is called, the destination address of
> the rdma_cm_id is unset (sa_family = UNSPEC).  The daddr->sa_family
> isn't set until the end of bind.

This is surprising to me. You can't bind without also specifying a
family, so why should be family be left as UNSPEC for so long?

Doesn't that just risk bugs where we mix and match AF's incorrectly?

Could we get to the same place without the above case by just
adjusting the bind flow to set the family earlier?

> > It seems really wierd to call AF_UNSPEC 'zero'..
> 
> The check for UNSPEC can move out of cma_zero_addr() into
> cma_any_addr(), but that doesn't address the next concern...
> 
> 
> > And if you change this then it means cma_any_addr() becomes true for
> > AF_UNSPEC.. And that seems like it would have impact beyond just
> > cma_port_is_unique,
> > 
> > eg here:
> > 
> > static bool cma_match_private_data(struct rdma_id_private *id_priv,
> > 				   const struct cma_hdr *hdr)
> > {
> > 	if (cma_any_addr(addr) && !id_priv->afonly)
> > 		return true;
> > 
> > Now AF_UNSPEC returns true, where before it was false.. Seems
> > concerning.
> > 
> > This really needs more explanation please.
> 
> The behavior of cma_any_addr() and cma_any_port() is different for
> UNSPEC.  cma_any_addr() returns false, but cma_any_port() returns
> true.  The latter results from cma_port() returning 0 in the case of
> UNSPEC (or any unhandled address format).  For consistency,
> cma_zero_addr() was modified to align with cma_port(), which also
> makes any_addr/port consistent.
> 
> In most of the cases I checked, passing in an address with sa_family
> set to UNSPEC to cma_any_addr() should not occur.  The port
> available check is one (valid) case where it can occur.  The
> cma_match_private_data() call referenced above _should_ always have
> a valid sa_family set for the source address, for example.
> 
> The only place that it looks like cma_any_addr() might be called
> with sa_family = UNSPEC is cma_set_mgid()/cma_iboe_set_mgid().  And
> I'm not sure the existing code through those paths is correct.  If
> the address is UNSPEC, then the existing code will default to the
> AF_INET case.
> 
> The librdmacm checks for UNSPEC in multicast join operations, so we
> may never be hitting this case in practice.  But I think a user
> space app could invoke this code path.  I think the result of that
> would just be the formation of a gid with 4 bytes of 0's in it.  In
> any case, the proposed changes would modify this path and convert
> UNSPEC to an mgid = 0.

Okay, so basically I think some summary of this email needs to be
inserted as a comment above the case AF_UNSPEC, assuming it isn't
simplified as above..

The fact you had to audit existing paths to prove we don't use
AF_UNSPEC in them says this is just way to subtle and tricky...

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean March 8, 2018, 12:17 a.m. UTC | #4
> On Wed, Mar 07, 2018 at 11:52:28PM +0000, Hefty, Sean wrote:
> > > > diff --git a/drivers/infiniband/core/cma.c
> > > > b/drivers/infiniband/core/cma.c index 6294a70..bb8f0e2 100644
> > > > +++ b/drivers/infiniband/core/cma.c
> > > > @@ -1047,6 +1047,8 @@ EXPORT_SYMBOL(rdma_init_qp_attr);
> static
> > > inline
> > > > int cma_zero_addr(struct sockaddr *addr)  {
> > > >  	switch (addr->sa_family) {
> > > > +	case AF_UNSPEC:
> > > > +		return 1;
> > >
> > > I can't figure out why this is here?
> > >
> > > I thought an 'unspecified destination' shouldn't be AF_UNSPEC.
> It is
> > > AF_INETxx and a zero address in CMA? Where can AF_UNSPEC be set
> that
> > > trips this up?
> >
> > At the time rdma_bind_addr() is called, the destination address of
> the
> > rdma_cm_id is unset (sa_family = UNSPEC).  The daddr->sa_family
> isn't
> > set until the end of bind.
> 
> This is surprising to me. You can't bind without also specifying a
> family, so why should be family be left as UNSPEC for so long?

rdma_bind_addr() has an sa_family for the *source* address.  The *destination* address is just zeroed memory associated with the rdma_cm_id.

> Doesn't that just risk bugs where we mix and match AF's incorrectly?
> 
> Could we get to the same place without the above case by just
> adjusting the bind flow to set the family earlier?

Yes, the sa_family for the destination address could be set at the top of the call and cleared on failure.  It is currently only set at the end on success.  Even with this change, I would still vote to make the code more consistent.  If multicast join verifies the sa_family, it should be possible to assert that we're never dealing with unknown/unspecified local address formats.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 8, 2018, 3:15 a.m. UTC | #5
On Thu, Mar 08, 2018 at 12:17:10AM +0000, Hefty, Sean wrote:

> It is currently only set at the end on success.  Even with this
> change, I would still vote to make the code more consistent.  If
> multicast join verifies the sa_family, it should be possible to
> assert that we're never dealing with unknown/unspecified local
> address formats.

Never dealing with AF_UNSPEC sounds like a good goal if it isn't too
hard.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 6294a70..bb8f0e2 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1047,6 +1047,8 @@  EXPORT_SYMBOL(rdma_init_qp_attr);
 static inline int cma_zero_addr(struct sockaddr *addr)
 {
 	switch (addr->sa_family) {
+	case AF_UNSPEC:
+		return 1;
 	case AF_INET:
 		return ipv4_is_zeronet(((struct sockaddr_in *)addr)->sin_addr.s_addr);
 	case AF_INET6:
@@ -3013,21 +3015,23 @@  static int cma_port_is_unique(struct rdma_bind_list *bind_list,
 			continue;
 
 		/* different dest port -> unique */
-		if (!cma_any_port(cur_daddr) &&
+		if (!cma_any_port(daddr) &&
+		    !cma_any_port(cur_daddr) &&
 		    (dport != cur_dport))
 			continue;
 
+		/* different dst address -> unique */
+		if (!cma_any_addr(daddr) &&
+		    !cma_any_addr(cur_daddr) &&
+		    cma_addr_cmp(daddr, cur_daddr))
+			continue;
+
 		/* different src address -> unique */
 		if (!cma_any_addr(saddr) &&
 		    !cma_any_addr(cur_saddr) &&
 		    cma_addr_cmp(saddr, cur_saddr))
 			continue;
 
-		/* different dst address -> unique */
-		if (!cma_any_addr(cur_daddr) &&
-		    cma_addr_cmp(daddr, cur_daddr))
-			continue;
-
 		return -EADDRNOTAVAIL;
 	}
 	return 0;