diff mbox series

ibacm: Unable to resurrect an interface

Message ID 20190104131720.466386-1-haakon.bugge@oracle.com (mailing list archive)
State Superseded
Headers show
Series ibacm: Unable to resurrect an interface | expand

Commit Message

Haakon Bugge Jan. 4, 2019, 1:17 p.m. UTC
When an IB port has been brought back to Active state, after being
down, ibacm gets an event about it. It will then (re) enumerate the
devices, and does so by executing an ioctl with SIOCGIFCONF. This
particular ioctl will only return interfaces that are "running".

There may be a delay after the IB port becomes Active until its
address has been provisioned, and becomes "running". If ibacm attempts
to associate IPoIB interfaces to the port during this interval, it
will not see the interface because it is not "running".

Later, when ibacm is asked for a Path Record (PR) using the IP address
of the resurrected IPoIB interface, it will not be able to find the
associated EP, and the following is printed in the log:

acm_svr_resolve_path: notice - unknown local end point address

The bug can be provoked by the following script. We have a single HCA
with two ports, the IPoIB interfaces are named stib{0,1}, the IP
address of the first interface is 192.168.200.200, and the remote IP
address is 192.168.200.202. The LID of the IB switch is 1 and the
switch port number connected to port 1 of the HCA is 22.

<script>
 #!/bin/bash

 ibportstate -P 2 1 22 disable

 # move the IP address
 ip addr del 192.168.200.200/24 dev stib0
 ip addr add 192.168.200.200/24 dev stib1

 ibportstate -P 2 1 22 enable

 # Wait until port becomes active again
 while [[ $(ibstat|grep State:|grep -c Active) != 2 ]]; do
    echo -n "."; sleep 1
 done
 echo

 # give ibacm time to re-enumerate the interfaces
 sleep 1

 # move the IP address back
 ip addr del 192.168.200.200/24 dev stib1
 ip addr add 192.168.200.200/24 dev stib0

 # take the other port down, so we are sure we use stib0
 ip link set down dev stib1

 # Start a utility requesting a PR
 qperf 192.168.200.202 -cm1 rc_bw

 # check for failure
 grep "unknown local end point" ibacm.log

 # restore the state
 ip link set up dev stib1
</script>

The fix is in acm_add_ep_ip(). When acm_find_ep() fails, an attempt to
take the EP up is performed by calling acm_ep_up().

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
---
 ibacm/src/acm.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Jason Gunthorpe Jan. 4, 2019, 4:27 p.m. UTC | #1
On Fri, Jan 04, 2019 at 02:17:20PM +0100, Håkon Bugge wrote:
> When an IB port has been brought back to Active state, after being
> down, ibacm gets an event about it. It will then (re) enumerate the
> devices, and does so by executing an ioctl with SIOCGIFCONF. This
> particular ioctl will only return interfaces that are "running".

This seems like an ugly hack, should this be using netlink instead of
SIOCGIFCONF to get the interface list?

Jason
Haakon Bugge Jan. 4, 2019, 8:19 p.m. UTC | #2
> On 4 Jan 2019, at 17:27, Jason Gunthorpe <jgg@mellanox.com> wrote:
> 
> On Fri, Jan 04, 2019 at 02:17:20PM +0100, Håkon Bugge wrote:
>> When an IB port has been brought back to Active state, after being
>> down, ibacm gets an event about it. It will then (re) enumerate the
>> devices, and does so by executing an ioctl with SIOCGIFCONF. This
>> particular ioctl will only return interfaces that are "running".
> 
> This seems like an ugly hack, should this be using netlink instead of
> SIOCGIFCONF to get the interface list?

You mean the SIOCGIFCONF ioctl or the commit that is ugly? 

At least, now we have something that works from kernel clients which requests PRs using GIDs, in an environment with limited pkeys, taking ports down and up, with corresponding move of IP addresses.

I am in favour of pursuing this patch. Down the road, the ioctl method can be hardened by using netlink instead, and this patch can be reverted.


Thxs, Håkon
Jason Gunthorpe Jan. 5, 2019, 3:37 a.m. UTC | #3
On Fri, Jan 04, 2019 at 09:19:26PM +0100, Håkon Bugge wrote:
> 
> 
> > On 4 Jan 2019, at 17:27, Jason Gunthorpe <jgg@mellanox.com> wrote:
> > 
> > On Fri, Jan 04, 2019 at 02:17:20PM +0100, Håkon Bugge wrote:
> >> When an IB port has been brought back to Active state, after being
> >> down, ibacm gets an event about it. It will then (re) enumerate the
> >> devices, and does so by executing an ioctl with SIOCGIFCONF. This
> >> particular ioctl will only return interfaces that are "running".
> > 
> > This seems like an ugly hack, should this be using netlink instead of
> > SIOCGIFCONF to get the interface list?
> 
> You mean the SIOCGIFCONF ioctl or the commit that is ugly? 

I think trying to recover from the error that is caused by not
enumerating the interfaces correctly is pretty ugly

Jason
Haakon Bugge Jan. 7, 2019, 5:25 p.m. UTC | #4
> On 5 Jan 2019, at 04:37, Jason Gunthorpe <jgg@ziepe.ca> wrote:
> 
> On Fri, Jan 04, 2019 at 09:19:26PM +0100, Håkon Bugge wrote:
>> 
>> 
>>> On 4 Jan 2019, at 17:27, Jason Gunthorpe <jgg@mellanox.com> wrote:
>>> 
>>> On Fri, Jan 04, 2019 at 02:17:20PM +0100, Håkon Bugge wrote:
>>>> When an IB port has been brought back to Active state, after being
>>>> down, ibacm gets an event about it. It will then (re) enumerate the
>>>> devices, and does so by executing an ioctl with SIOCGIFCONF. This
>>>> particular ioctl will only return interfaces that are "running".
>>> 
>>> This seems like an ugly hack, should this be using netlink instead of
>>> SIOCGIFCONF to get the interface list?
>> 
>> You mean the SIOCGIFCONF ioctl or the commit that is ugly? 
> 
> I think trying to recover from the error that is caused by not
> enumerating the interfaces correctly is pretty ugly

OK. I take that as a Nay to my proposal:

> I am in favour of pursuing this patch. Down the road, the ioctl method can be hardened by using netlink instead, and this patch can be reverted.

I am looking into reading /proc/net/dev in addition to the existing SIOCGIFCONF method. Give a hint if that can be accepted, or if a pure netlink implementation is required.


Thxs, Håkon
Jason Gunthorpe Jan. 7, 2019, 7:18 p.m. UTC | #5
On Mon, Jan 07, 2019 at 06:25:45PM +0100, Håkon Bugge wrote:
> 
> 
> > On 5 Jan 2019, at 04:37, Jason Gunthorpe <jgg@ziepe.ca> wrote:
> > 
> > On Fri, Jan 04, 2019 at 09:19:26PM +0100, Håkon Bugge wrote:
> >> 
> >> 
> >>> On 4 Jan 2019, at 17:27, Jason Gunthorpe <jgg@mellanox.com> wrote:
> >>> 
> >>> On Fri, Jan 04, 2019 at 02:17:20PM +0100, Håkon Bugge wrote:
> >>>> When an IB port has been brought back to Active state, after being
> >>>> down, ibacm gets an event about it. It will then (re) enumerate the
> >>>> devices, and does so by executing an ioctl with SIOCGIFCONF. This
> >>>> particular ioctl will only return interfaces that are "running".
> >>> 
> >>> This seems like an ugly hack, should this be using netlink instead of
> >>> SIOCGIFCONF to get the interface list?
> >> 
> >> You mean the SIOCGIFCONF ioctl or the commit that is ugly? 
> > 
> > I think trying to recover from the error that is caused by not
> > enumerating the interfaces correctly is pretty ugly
> 
> OK. I take that as a Nay to my proposal:
> 
> > I am in favour of pursuing this patch. Down the road, the ioctl method can be hardened by using netlink instead, and this patch can be reverted.
> 
> I am looking into reading /proc/net/dev in addition to the existing
> SIOCGIFCONF method. Give a hint if that can be accepted, or if a
> pure netlink implementation is required.

I think you should just use netlink, it is very simple and we already
have libnl3 available in the build system for this purpose

Jason
diff mbox series

Patch

diff --git a/ibacm/src/acm.c b/ibacm/src/acm.c
index 6453c5f0..0887d0c6 100644
--- a/ibacm/src/acm.c
+++ b/ibacm/src/acm.c
@@ -200,6 +200,7 @@  static int acm_ep_insert_addr(struct acmc_ep *ep, const char *name, uint8_t *add
 			      uint8_t addr_type);
 static void acm_event_handler(struct acmc_device *dev);
 static int acm_nl_send(int sock, struct acm_msg *msg);
+static void acm_ep_up(struct acmc_port *port, uint16_t pkey);
 
 static struct sa_data {
 	int		timeout;
@@ -1321,9 +1322,17 @@  static void acm_add_ep_ip(char *ifname, struct acm_ep_addr_data *data, char *ip_
 	if (acm_if_get_pkey(ifname, &pkey))
 		return;
 
-	acm_log(0, " %s\n", ip_str);
+	acm_log(0, " %s pkey: %04x port: %d\n", ip_str, pkey, port_num);
 
 	ep = acm_find_ep(&dev->port[port_num - 1], pkey);
+
+	if (!ep) {
+		acm_log(2, "no EP found, attempt adding it\n");
+		acm_ep_up(&dev->port[port_num - 1], pkey);
+		ep = acm_find_ep(&dev->port[port_num - 1], pkey);
+		acm_log(2, "EP was %s\n", ep ? "found" : "still not found");
+	}
+
 	if (ep) {
 		if (acm_ep_insert_addr(ep, ip_str, data->info.addr,
 				       data->type))