[3/4] ibacm: Unable to resurrect an interface

Message ID 20190124144738.7961-4-haakon.bugge@oracle.com (mailing list archive)
State Superseded
Delegated to: Jason Gunthorpe
Series ibacm: Replace ioctl with netlink and fix inablity to resurrect an interface

Commit Message

Haakon Bugge Jan. 24, 2019, 2:47 p.m. UTC
When an IB port is brought back to the Active state after being down,
ibacm receives an event about it. It then re-enumerates the devices by
executing an ioctl with SIOCGIFCONF. This particular ioctl only
returns interfaces that are "running".

There may be a delay between the IB port becoming Active and its IPoIB
interface getting an address and becoming "running". If ibacm attempts
to associate IPoIB interfaces with the port during this interval, it
will not see the interface because it is not "running".

Later, when ibacm is asked for a Path Record (PR) using the IP address
of the resurrected IPoIB interface, it will not be able to find the
associated end-point (EP), and the following is printed in the log:

acm_svr_resolve_path: notice - unknown local end point address

The bug can be provoked by the following script. The setup is a single
HCA with two ports; the IPoIB interfaces are named stib{0,1}, the IP
address of the first interface is 192.168.200.200, and the remote IP
address is 192.168.200.202. The LID of the IB switch is 1, and switch
port 22 is connected to port 1 of the HCA.

<script>
 #!/bin/bash

ibportstate -P 2 1 22 disable

 # move the IP address
ip addr del 192.168.200.200/24 dev stib0
ip addr add 192.168.200.200/24 dev stib1

ibportstate -P 2 1 22 enable

 # Wait until port becomes active again
while [[ $(ibstat|grep State:|grep -c Active) != 2 ]]; do
   echo -n "."; sleep 1
done
echo

 # give ibacm time to re-enumerate the interfaces
sleep 1

 # move the IP address back
ip addr del 192.168.200.200/24 dev stib1
ip addr add 192.168.200.200/24 dev stib0

 # take the other port down, so we are sure we use stib0
ip link set down dev stib1

 # Start a utility requesting a PR
qperf 192.168.200.202 -cm1 rc_bw

 # check for failure
grep "unknown local end point" ibacm.log

 # restore the state
ip link set up dev stib1
</script>

This fix depends on the commit that refactors acm_if_iter() to use
netlink instead of ioctl. Now, by relaxing the requirements on the
state of the interface, the EP is added, and when an address is
assigned afterwards, it is associated with the EP.

This commit is a new implementation of
https://patchwork.kernel.org/patch/10748357, which was NAKed.

Change-Id: Ideb7725c3bf29a1c87a335bdf374efd4ea16c007
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
---
 ibacm/src/acm_util.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

Patch

diff --git a/ibacm/src/acm_util.c b/ibacm/src/acm_util.c
index fc22432c..2c050aa0 100644
--- a/ibacm/src/acm_util.c
+++ b/ibacm/src/acm_util.c
@@ -187,19 +187,13 @@  void acm_if_iter(struct nl_object *obj, void *_ctx_and_cb)
 	uint16_t pkey;
 	int addr_len;
 	char *label;
-	int flags;
-	int ret;
 	int af;
 
 	link = rtnl_link_get(link_cache, rtnl_addr_get_ifindex(addr));
-	flags = rtnl_link_get_flags(link);
 
 	if (rtnl_link_get_arptype(link) != ARPHRD_INFINIBAND)
 		return;
 
-	if (!(flags & IFF_RUNNING))
-		return;
-
 	if (!(a = rtnl_addr_get_local(addr)))
 		return;
 
@@ -213,23 +207,23 @@  void acm_if_iter(struct nl_object *obj, void *_ctx_and_cb)
 		return;
 
 	label = rtnl_addr_get_label(addr);
-	if (!label)
-		return;
 
 	link_addr = rtnl_link_get_addr(link);
+	/* gid has a 4 byte offset into the link address */
 	memcpy(sgid.raw, nl_addr_get_binary_addr(link_addr) + 4, sizeof(sgid));
+	/* guid is the least significant 64b of the gid */
 	guid_ptr = (unsigned long *)&sgid.raw;
+	++guid_ptr;
 
-	ret = acm_if_get_pkey(rtnl_link_get_name(link), &pkey);
-	if (ret)
+	if (acm_if_get_pkey(rtnl_link_get_name(link), &pkey))
 		return;
 
 	acm_log(2, "name: %5s label: %9s index: %2d flags: %s addr: %s pkey: 0x%04x guid: 0x%lx\n",
 		rtnl_link_get_name(link), label,
 		rtnl_addr_get_ifindex(addr),
-		rtnl_link_flags2str(flags, flags_str, sizeof(flags_str)),
+		rtnl_link_flags2str(rtnl_link_get_flags(link), flags_str, sizeof(flags_str)),
 		nl_addr2str(a, ip_str, sizeof(ip_str)),	pkey,
-		be64toh(*(guid_ptr + 1)));
+		be64toh(*guid_ptr));
 
 	memcpy(&bin_addr, nl_addr_get_binary_addr(a), addr_len);
 	ctx_cb->cb(rtnl_link_get_name(link), &sgid, pkey, af2acm_addr_type(af), bin_addr, ip_str, ctx_cb->ctx);
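
The two comments added by the patch (the GID sits at a 4 byte offset in
the link address, and the GUID is the least significant 64 bits of the
GID) can be illustrated with a standalone sketch. The helper names and
the IPOIB_* / GID_LEN constants below are hypothetical and not part of
ibacm. The sketch assumes a 20 byte IPoIB link address, and it assembles
the GUID byte by byte instead of using the pointer cast and be64toh()
seen in the patch, which sidesteps alignment and word-size concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPOIB_LLADDR_LEN 20 /* assumed: 4 leading bytes + 16 byte GID */
#define IPOIB_GID_OFFSET 4  /* from the patch comment */
#define GID_LEN 16

/* Hypothetical helper: copy the GID out of an IPoIB link address. */
static void lladdr_to_gid(const uint8_t *lladdr, size_t len,
			  uint8_t gid[GID_LEN])
{
	assert(len >= IPOIB_GID_OFFSET + GID_LEN);
	memcpy(gid, lladdr + IPOIB_GID_OFFSET, GID_LEN);
}

/* Hypothetical helper: the GUID is the low 64 bits of the GID, stored
 * big-endian; return it in host byte order. */
static uint64_t gid_to_guid(const uint8_t gid[GID_LEN])
{
	uint64_t guid = 0;

	for (size_t i = GID_LEN - 8; i < GID_LEN; i++)
		guid = (guid << 8) | gid[i];
	return guid;
}
```

With a link address whose last eight bytes are 00 02 c9 03 00 a1 b2 c3,
gid_to_guid() yields 0x0002c90300a1b2c3, which is the kind of value the
"guid: 0x%lx" field of the acm_log() line is meant to show.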