Message ID | 20180807085313.36597-1-ubraun@linux.ibm.com (mailing list archive) |
---|---|
State | RFC |
Headers | show |
Series | [RFC,1/1] mlx4: trigger IB events needed by SMC | expand |
On Tue, Aug 07, 2018 at 10:53:13AM +0200, Ursula Braun wrote: > The mlx4 driver does not trigger an IB_EVENT_PORT_ACTIVE when the > RoCE network interface is activated. When SMC determines the RoCE > device port to be used, it checks the port states. > This patch triggers IB events for NETDEV_UP and NETDEV_DOWN. > > Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> > --- > drivers/infiniband/hw/mlx4/main.c | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c > index 4ec519afc45b..81a1fd9e8615 100644 > --- a/drivers/infiniband/hw/mlx4/main.c > +++ b/drivers/infiniband/hw/mlx4/main.c > @@ -2447,6 +2447,26 @@ static void mlx4_ib_scan_netdevs(struct mlx4_ib_dev *ibdev, > event == NETDEV_UP || event == NETDEV_CHANGE)) > update_qps_port = port; > > + if (dev == iboe->netdevs[port - 1] && > + event == NETDEV_UP) { > + struct ib_event ibev = { }; > + > + ibev.device = &ibdev->ib_dev; > + ibev.element.port_num = port; > + ibev.event = IB_EVENT_PORT_ACTIVE; > + ib_dispatch_event(&ibev); > + } > + > + if (dev == iboe->netdevs[port - 1] && > + event == NETDEV_DOWN) { > + struct ib_event ibev = { }; > + > + ibev.device = &ibdev->ib_dev; > + ibev.element.port_num = port; > + ibev.event = IB_EVENT_PORT_ERR; > + ib_dispatch_event(&ibev); > + } > + Thanks Ursula, I think that you need to take into account previous port states (IB_PORT_DOWN vs. IB_PORT_ACTIVE). Thanks
On 08/07/2018 12:19 PM, Leon Romanovsky wrote: > On Tue, Aug 07, 2018 at 10:53:13AM +0200, Ursula Braun wrote: >> The mlx4 driver does not trigger an IB_EVENT_PORT_ACTIVE when the >> RoCE network interface is activated. When SMC determines the RoCE >> device port to be used, it checks the port states. >> This patch triggers IB events for NETDEV_UP and NETDEV_DOWN. >> >> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> >> --- >> drivers/infiniband/hw/mlx4/main.c | 20 ++++++++++++++++++++ >> 1 file changed, 20 insertions(+) >> >> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c >> index 4ec519afc45b..81a1fd9e8615 100644 >> --- a/drivers/infiniband/hw/mlx4/main.c >> +++ b/drivers/infiniband/hw/mlx4/main.c >> @@ -2447,6 +2447,26 @@ static void mlx4_ib_scan_netdevs(struct mlx4_ib_dev *ibdev, >> event == NETDEV_UP || event == NETDEV_CHANGE)) >> update_qps_port = port; >> >> + if (dev == iboe->netdevs[port - 1] && >> + event == NETDEV_UP) { >> + struct ib_event ibev = { }; >> + >> + ibev.device = &ibdev->ib_dev; >> + ibev.element.port_num = port; >> + ibev.event = IB_EVENT_PORT_ACTIVE; >> + ib_dispatch_event(&ibev); >> + } >> + >> + if (dev == iboe->netdevs[port - 1] && >> + event == NETDEV_DOWN) { >> + struct ib_event ibev = { }; >> + >> + ibev.device = &ibdev->ib_dev; >> + ibev.element.port_num = port; >> + ibev.event = IB_EVENT_PORT_ERR; >> + ib_dispatch_event(&ibev); >> + } >> + > > Thanks Ursula, > > I think that you need to take into account previous port states > (IB_PORT_DOWN vs. IB_PORT_ACTIVE). > Do you want to skip raising an IB event in these scenarios: NETDEV_UP event and previous port state is already IB_PORT_ACTIVE NETDEV_UP event and current port state is IB_PORT_DOWN NETDEV_DOWN event and previous port state is IB_PORT_DOWN NETDEV_DOWN event and current port state is IB_PORT_UP Isn't it guaranteed that a NETDEV_UP event is seen only when the port state has been IB_PORT_DOWN before (and vice versa)? At least the net/core/dev.c code makes sure, a NETDEV_UP event is not raised twice. And even if the previous port state could be the same, would it hurt if an IB_PORT_ACTIVE or IB_PORT_DOWN event is raised a second time in rare cases? Nevertheless I tried to determine the current port state, and found: If I use ib_get_cached_port_state(), I still get IB_PORT_DOWN for the NETDEV_UP event and vice versa. If I call ibdev->query_port() I run into locking problems, since the iboe->lock is held. For the net/smc code the main problem is the missing IB event in the mlx4 driver, since the net/smc code relies on it. If there are scenarios raising the same IB event more than once, this does not cause trouble. Taking into account the previous port state might be an optimization, but facing the problems described I am not sure if it is really worth spending extra code to maintain the previous and current port state. Do you have a sample demonstrating the urgent need for that? > Thanks >
On Wed, Sep 05, 2018 at 04:50:33PM +0200, Ursula Braun wrote: > > > On 08/07/2018 12:19 PM, Leon Romanovsky wrote: > > On Tue, Aug 07, 2018 at 10:53:13AM +0200, Ursula Braun wrote: > >> The mlx4 driver does not trigger an IB_EVENT_PORT_ACTIVE when the > >> RoCE network interface is activated. When SMC determines the RoCE > >> device port to be used, it checks the port states. > >> This patch triggers IB events for NETDEV_UP and NETDEV_DOWN. > >> > >> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> > >> --- > >> drivers/infiniband/hw/mlx4/main.c | 20 ++++++++++++++++++++ > >> 1 file changed, 20 insertions(+) > >> > >> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c > >> index 4ec519afc45b..81a1fd9e8615 100644 > >> --- a/drivers/infiniband/hw/mlx4/main.c > >> +++ b/drivers/infiniband/hw/mlx4/main.c > >> @@ -2447,6 +2447,26 @@ static void mlx4_ib_scan_netdevs(struct mlx4_ib_dev *ibdev, > >> event == NETDEV_UP || event == NETDEV_CHANGE)) > >> update_qps_port = port; > >> > >> + if (dev == iboe->netdevs[port - 1] && > >> + event == NETDEV_UP) { > >> + struct ib_event ibev = { }; > >> + > >> + ibev.device = &ibdev->ib_dev; > >> + ibev.element.port_num = port; > >> + ibev.event = IB_EVENT_PORT_ACTIVE; > >> + ib_dispatch_event(&ibev); > >> + } > >> + > >> + if (dev == iboe->netdevs[port - 1] && > >> + event == NETDEV_DOWN) { > >> + struct ib_event ibev = { }; > >> + > >> + ibev.device = &ibdev->ib_dev; > >> + ibev.element.port_num = port; > >> + ibev.event = IB_EVENT_PORT_ERR; > >> + ib_dispatch_event(&ibev); > >> + } > >> + > > > > Thanks Ursula, > > > > I think that you need to take into account previous port states > > (IB_PORT_DOWN vs. IB_PORT_ACTIVE). > > > > Do you want to skip raising an IB event in these scenarios: > NETDEV_UP event and previous port state is already IB_PORT_ACTIVE > NETDEV_UP event and current port state is IB_PORT_DOWN > NETDEV_DOWN event and previous port state is IB_PORT_DOWN > NETDEV_DOWN event and current port state is IB_PORT_UP > > Isn't it guaranteed that a NETDEV_UP event is seen only when the port state has > been IB_PORT_DOWN before (and vice versa)? At least the net/core/dev.c code makes > sure, a NETDEV_UP event is not raised twice. I would like to know the definitive answer too, because of I need to clean mlx5 code from such logic or we need to add to mlx4 those checks. > > And even if the previous port state could be the same, would it hurt if an IB_PORT_ACTIVE > or IB_PORT_DOWN event is raised a second time in rare cases? I don't know, according to the code, we are relying on IB_EVENT_PORT_ERR and IB_EVENT_PORT_ACTIVE in many places. > > Nevertheless I tried to determine the current port state, and found: > If I use ib_get_cached_port_state(), I still get IB_PORT_DOWN for the NETDEV_UP event > and vice versa. If I call ibdev->query_port() I run into locking problems, since the > iboe->lock is held. > > For the net/smc code the main problem is the missing IB event in the mlx4 driver, since > the net/smc code relies on it. If there are scenarios raising the same IB event more than > once, this does not cause trouble. > > Taking into account the previous port state might be an optimization, but facing the > problems described I am not sure if it is really worth spending extra code to maintain > the previous and current port state. > Do you have a sample demonstrating the urgent need for that? > No, I don't, but my worries about mlx4 driver are focused on preserving (do not break mode) instead of extra features. So, I would like to know if raising twice or more IB_EVENT_PORT_ERR and IB_EVENT_PORT_ACTIVE is safe. If yes, let's merge it, if no, we need to fix it. Thanks > > Thanks > > >
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c index 4ec519afc45b..81a1fd9e8615 100644 --- a/drivers/infiniband/hw/mlx4/main.c +++ b/drivers/infiniband/hw/mlx4/main.c @@ -2447,6 +2447,26 @@ static void mlx4_ib_scan_netdevs(struct mlx4_ib_dev *ibdev, event == NETDEV_UP || event == NETDEV_CHANGE)) update_qps_port = port; + if (dev == iboe->netdevs[port - 1] && + event == NETDEV_UP) { + struct ib_event ibev = { }; + + ibev.device = &ibdev->ib_dev; + ibev.element.port_num = port; + ibev.event = IB_EVENT_PORT_ACTIVE; + ib_dispatch_event(&ibev); + } + + if (dev == iboe->netdevs[port - 1] && + event == NETDEV_DOWN) { + struct ib_event ibev = { }; + + ibev.device = &ibdev->ib_dev; + ibev.element.port_num = port; + ibev.event = IB_EVENT_PORT_ERR; + ib_dispatch_event(&ibev); + } + } spin_unlock_bh(&iboe->lock);
The mlx4 driver does not trigger an IB_EVENT_PORT_ACTIVE when the RoCE network interface is activated. When SMC determines the RoCE device port to be used, it checks the port states. This patch triggers IB events for NETDEV_UP and NETDEV_DOWN. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> --- drivers/infiniband/hw/mlx4/main.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)