Message ID | 20190129084406.45591-4-haakon.bugge@oracle.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v2,1/4] ibacm: Replace ioctl with netlink | expand |
On Tue, Jan 29, 2019 at 09:44:05AM +0100, Håkon Bugge wrote: > When an IB port has been brought back to Active state, after being > down, ibacm gets an event about it. It will then (re) enumerate the > devices, and does so by executing an ioctl with SIOCGIFCONF. This > particular ioctl will only return interfaces that are "running". > > There may be a delay after the IB port becomes Active until its > address has been provisioned, and becomes "running". If ibacm attempts > to associate IPoIB interfaces to the port during this interval, it > will not see the interface because it is not "running". > > Later, when ibacm is asked for a Path Record (PR) using the IP address > of the resurrected IPoIB interface, it will not be able to find the > associated end-point (EP), and the following is printed in the log: > > acm_svr_resolve_path: notice - unknown local end point address > > The bug can be provoked by the following script. We have a single HCA > with two ports, the IPoIB interfaces are named stib{0,1}, the IP > address of the first interface is 192.168.200.200, and the remote IP > address is 192.168.200.202. The LID of the IB switch is 1 and the > switch port number connected to port 1 of the HCA is 22. > > <script> > #!/bin/bash > > ibportstate -P 2 1 22 disable > > # move the IP address > ip addr del 192.168.200.200/24 dev stib0 > ip addr add 192.168.200.200/24 dev stib1 > > ibportstate -P 2 1 22 enable > > # Wait until port becomes active again > while [[ $(ibstat|grep State:|grep -c Active) != 2 ]]; do > echo -n "."; sleep 1 > done > echo > > # give ibacm time to re-enumerate the interfaces > sleep 1 > > # move the IP address back > ip addr del 192.168.200.200/24 dev stib1 > ip addr add 192.168.200.200/24 dev stib0 > > # take the other port down, so we are sure we use stib0 > ip link set down dev stib1 > > # Start a utility requesting a PR > qperf 192.168.200.202 -cm1 rc_bw > > # check for failure > grep "unknown local end point" ibacm.log > > # restore the state > ip link set up dev stib1 > </script> > > This fix depends on the commit that re-factors the use of ioctl in > acm_if_iter(), and instead uses netlink. Now, by reducing the > requirements of the state of the interface, the EP is added, and > afterwards, when an address is assigned, it is associated with the EP. > > This commit is a new implementation of > https://patchwork.kernel.org/patch/10748357, which was NAKed. > > Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> > > v1 -> v2: > * Removed Gerrit's Change-Id tag (Håkon) > --- > ibacm/src/acm_util.c | 14 +++----------- > 1 file changed, 3 insertions(+), 11 deletions(-) > > diff --git a/ibacm/src/acm_util.c b/ibacm/src/acm_util.c > index fecb6c89..8807579d 100644 > --- a/ibacm/src/acm_util.c > +++ b/ibacm/src/acm_util.c > @@ -180,19 +180,13 @@ static void acm_if_iter(struct nl_object *obj, void *_ctx_and_cb) > uint16_t pkey; > int addr_len; > char *label; > - int flags; > - int ret; > int af; > > link = rtnl_link_get(link_cache, rtnl_addr_get_ifindex(addr)); > - flags = rtnl_link_get_flags(link); > > if (rtnl_link_get_arptype(link) != ARPHRD_INFINIBAND) > return; > > - if (!(flags & IFF_RUNNING)) > - return; > - > if (!(a = rtnl_addr_get_local(addr))) > return; > > @@ -206,20 +200,18 @@ static void acm_if_iter(struct nl_object *obj, void *_ctx_and_cb) > return; > > label = rtnl_addr_get_label(addr); > - if (!label) > - return; > > link_addr = rtnl_link_get_addr(link); > + /* gid has a 4 byte offset into the link address */ > memcpy(sgid.raw, nl_addr_get_binary_addr(link_addr) + 4, sizeof(sgid)); > > - ret = acm_if_get_pkey(rtnl_link_get_name(link), &pkey); > - if (ret) > + if (acm_if_get_pkey(rtnl_link_get_name(link), &pkey)) > return; > > acm_log(2, "name: %5s label: %9s index: %2d flags: %s addr: %s pkey: 0x%04x guid: 0x%lx\n", > rtnl_link_get_name(link), label, > rtnl_addr_get_ifindex(addr), > - rtnl_link_flags2str(flags, flags_str, sizeof(flags_str)), > + rtnl_link_flags2str(rtnl_link_get_flags(link), flags_str, sizeof(flags_str)), > nl_addr2str(a, ip_str, sizeof(ip_str)), pkey, > be64toh(sgid.global.interface_id)); > > -- > 2.20.1 >
diff --git a/ibacm/src/acm_util.c b/ibacm/src/acm_util.c index fecb6c89..8807579d 100644 --- a/ibacm/src/acm_util.c +++ b/ibacm/src/acm_util.c @@ -180,19 +180,13 @@ static void acm_if_iter(struct nl_object *obj, void *_ctx_and_cb) uint16_t pkey; int addr_len; char *label; - int flags; - int ret; int af; link = rtnl_link_get(link_cache, rtnl_addr_get_ifindex(addr)); - flags = rtnl_link_get_flags(link); if (rtnl_link_get_arptype(link) != ARPHRD_INFINIBAND) return; - if (!(flags & IFF_RUNNING)) - return; - if (!(a = rtnl_addr_get_local(addr))) return; @@ -206,20 +200,18 @@ static void acm_if_iter(struct nl_object *obj, void *_ctx_and_cb) return; label = rtnl_addr_get_label(addr); - if (!label) - return; link_addr = rtnl_link_get_addr(link); + /* gid has a 4 byte offset into the link address */ memcpy(sgid.raw, nl_addr_get_binary_addr(link_addr) + 4, sizeof(sgid)); - ret = acm_if_get_pkey(rtnl_link_get_name(link), &pkey); - if (ret) + if (acm_if_get_pkey(rtnl_link_get_name(link), &pkey)) return; acm_log(2, "name: %5s label: %9s index: %2d flags: %s addr: %s pkey: 0x%04x guid: 0x%lx\n", rtnl_link_get_name(link), label, rtnl_addr_get_ifindex(addr), - rtnl_link_flags2str(flags, flags_str, sizeof(flags_str)), + rtnl_link_flags2str(rtnl_link_get_flags(link), flags_str, sizeof(flags_str)), nl_addr2str(a, ip_str, sizeof(ip_str)), pkey, be64toh(sgid.global.interface_id));
When an IB port has been brought back to Active state, after being down, ibacm gets an event about it. It will then (re) enumerate the devices, and does so by executing an ioctl with SIOCGIFCONF. This particular ioctl will only return interfaces that are "running". There may be a delay after the IB port becomes Active until its address has been provisioned, and becomes "running". If ibacm attempts to associate IPoIB interfaces to the port during this interval, it will not see the interface because it is not "running". Later, when ibacm is asked for a Path Record (PR) using the IP address of the resurrected IPoIB interface, it will not be able to find the associated end-point (EP), and the following is printed in the log: acm_svr_resolve_path: notice - unknown local end point address The bug can be provoked by the following script. We have a single HCA with two ports, the IPoIB interfaces are named stib{0,1}, the IP address of the first interface is 192.168.200.200, and the remote IP address is 192.168.200.202. The LID of the IB switch is 1 and the switch port number connected to port 1 of the HCA is 22. <script> #!/bin/bash ibportstate -P 2 1 22 disable # move the IP address ip addr del 192.168.200.200/24 dev stib0 ip addr add 192.168.200.200/24 dev stib1 ibportstate -P 2 1 22 enable # Wait until port becomes active again while [[ $(ibstat|grep State:|grep -c Active) != 2 ]]; do echo -n "."; sleep 1 done echo # give ibacm time to re-enumerate the interfaces sleep 1 # move the IP address back ip addr del 192.168.200.200/24 dev stib1 ip addr add 192.168.200.200/24 dev stib0 # take the other port down, so we are sure we use stib0 ip link set down dev stib1 # Start a utility requesting a PR qperf 192.168.200.202 -cm1 rc_bw # check for failure grep "unknown local end point" ibacm.log # restore the state ip link set up dev stib1 </script> This fix depends on the commit that re-factors the use of ioctl in acm_if_iter(), and instead uses netlink. Now, by reducing the requirements of the state of the interface, the EP is added, and afterwards, when an address is assigned, it is associated with the EP. This commit is a new implementation of https://patchwork.kernel.org/patch/10748357, which was NAKed. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> v1 -> v2: * Removed Gerrit's Change-Id tag (Håkon) --- ibacm/src/acm_util.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)