
[rdma-next,1/2] RDMA/mlx5: Get upper device only if device is lagged

Message ID: 117b591f5e6e130aeccc871888084fb92fb43b5a.1692168533.git.leon@kernel.org
State: Changes Requested
Series: mlx5 RDMA LAG fixes

Commit Message

Leon Romanovsky Aug. 16, 2023, 6:52 a.m. UTC
From: Mark Bloch <mbloch@nvidia.com>

If the RDMA device isn't in LAG mode there is no need
to try to get the upper device.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/main.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

Comments

Jason Gunthorpe Aug. 18, 2023, 4:33 p.m. UTC | #1
On Wed, Aug 16, 2023 at 09:52:23AM +0300, Leon Romanovsky wrote:
> From: Mark Bloch <mbloch@nvidia.com>
> 
> If the RDMA device isn't in LAG mode there is no need
> to try to get the upper device.
> 
> Signed-off-by: Mark Bloch <mbloch@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/hw/mlx5/main.c | 22 +++++++++++++++-------
>  1 file changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> index f0b394ed7452..215d7b0add8f 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -195,12 +195,18 @@ static int mlx5_netdev_event(struct notifier_block *this,
>  	case NETDEV_CHANGE:
>  	case NETDEV_UP:
>  	case NETDEV_DOWN: {
> -		struct net_device *lag_ndev = mlx5_lag_get_roce_netdev(mdev);
>  		struct net_device *upper = NULL;
>  
> -		if (lag_ndev) {
> -			upper = netdev_master_upper_dev_get(lag_ndev);
> -			dev_put(lag_ndev);
> +		if (ibdev->lag_active) {

Needs locking to read lag_active

Jason
Jason Gunthorpe Aug. 18, 2023, 4:42 p.m. UTC | #2
On Fri, Aug 18, 2023 at 01:33:35PM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 16, 2023 at 09:52:23AM +0300, Leon Romanovsky wrote:
> > From: Mark Bloch <mbloch@nvidia.com>
> > 
> > If the RDMA device isn't in LAG mode there is no need
> > to try to get the upper device.
> > 
> > Signed-off-by: Mark Bloch <mbloch@nvidia.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> >  drivers/infiniband/hw/mlx5/main.c | 22 +++++++++++++++-------
> >  1 file changed, 15 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index f0b394ed7452..215d7b0add8f 100644
> > --- a/drivers/infiniband/hw/mlx5/main.c
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -195,12 +195,18 @@ static int mlx5_netdev_event(struct notifier_block *this,
> >  	case NETDEV_CHANGE:
> >  	case NETDEV_UP:
> >  	case NETDEV_DOWN: {
> > -		struct net_device *lag_ndev = mlx5_lag_get_roce_netdev(mdev);
> >  		struct net_device *upper = NULL;
> >  
> > -		if (lag_ndev) {
> > -			upper = netdev_master_upper_dev_get(lag_ndev);
> > -			dev_put(lag_ndev);
> > +		if (ibdev->lag_active) {
> 
> Needs locking to read lag_active

Specifically the use of the bitfield looks messed up.. If lag_active
and some others were set only during probe it could be OK.

But mixing other stuff that is being written concurrently is not OK to
do like this. (eg ib_active via a mlx5 notifier)

Jason
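
Editorial note on the bit-field concern above: in the C11 memory model, adjacent non-zero-width bit-fields form a single memory location, so storing to one flag is a read-modify-write of the storage it shares with its neighbours. The sketch below is a userspace illustration only, not the driver's code. The struct names and helper functions are hypothetical, the flag names are borrowed from the thread, and the layout comments assume GCC or Clang on a common Linux ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the flag block discussed in the thread.
 * uint8_t bit-fields are an implementation-defined extension that
 * GCC and Clang accept; all four flags share one byte here. */
struct flags_packed {
	uint8_t ib_active:1;
	uint8_t is_rep:1;
	uint8_t lag_active:1;
	uint8_t wc_support:1;
};

/* Same flags with ib_active moved out into its own object, so a
 * store to it no longer touches the byte holding the other flags. */
struct flags_split {
	bool    ib_active;
	uint8_t is_rep:1;
	uint8_t lag_active:1;
	uint8_t wc_support:1;
};

/* Roughly what "flags.ib_active = v" does to the shared byte:
 * load the whole byte, change bit 0, store the whole byte back. */
static uint8_t set_bit0_rmw(uint8_t byte, bool v)
{
	return (uint8_t)((byte & ~1u) | (v ? 1u : 0u));
}

/* Deterministic replay of one bad interleaving of two unsynchronised
 * writers that both update the same byte. */
static uint8_t lost_update_demo(void)
{
	uint8_t shared = 0;
	uint8_t a = shared;		/* writer A loads the byte */
	uint8_t b = shared;		/* writer B loads the same byte */

	a = set_bit0_rmw(a, true);	/* A sets bit 0 (ib_active) */
	b = (uint8_t)(b | 0x04);	/* B sets bit 2 (lag_active) */

	shared = a;			/* A stores 0x01 */
	shared = b;			/* B stores 0x04, A's bit is lost */
	return shared;
}
```

With the packed layout a writer to ib_active always rewrites the byte that also holds lag_active, which is why flags that change at runtime do not mix well with flags that are only set during probe.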
Leon Romanovsky Aug. 20, 2023, 9:59 a.m. UTC | #3
On Fri, Aug 18, 2023 at 01:42:30PM -0300, Jason Gunthorpe wrote:
> On Fri, Aug 18, 2023 at 01:33:35PM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 16, 2023 at 09:52:23AM +0300, Leon Romanovsky wrote:
> > > From: Mark Bloch <mbloch@nvidia.com>
> > > 
> > > If the RDMA device isn't in LAG mode there is no need
> > > to try to get the upper device.
> > > 
> > > Signed-off-by: Mark Bloch <mbloch@nvidia.com>
> > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > ---
> > >  drivers/infiniband/hw/mlx5/main.c | 22 +++++++++++++++-------
> > >  1 file changed, 15 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > > index f0b394ed7452..215d7b0add8f 100644
> > > --- a/drivers/infiniband/hw/mlx5/main.c
> > > +++ b/drivers/infiniband/hw/mlx5/main.c
> > > @@ -195,12 +195,18 @@ static int mlx5_netdev_event(struct notifier_block *this,
> > >  	case NETDEV_CHANGE:
> > >  	case NETDEV_UP:
> > >  	case NETDEV_DOWN: {
> > > -		struct net_device *lag_ndev = mlx5_lag_get_roce_netdev(mdev);
> > >  		struct net_device *upper = NULL;
> > >  
> > > -		if (lag_ndev) {
> > > -			upper = netdev_master_upper_dev_get(lag_ndev);
> > > -			dev_put(lag_ndev);
> > > +		if (ibdev->lag_active) {
> > 
> > Needs locking to read lag_active
> 
> Specifically the use of the bitfield looks messed up.. If lag_active
> and some others were set only during probe it could be OK.

All fields except ib_active are static and set during probe.

> 
> But mixing other stuff that is being written concurrently is not OK to
> do like this. (eg ib_active via a mlx5 notifier)

What you are looking for is the following change, did I get you right?

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 9d0c56b59ed2..ee73113717b2 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1094,7 +1094,7 @@ struct mlx5_ib_dev {
        /* serialize update of capability mask
         */
        struct mutex                    cap_mask_mutex;
-       u8                              ib_active:1;
+       bool                            ib_active;
        u8                              is_rep:1;
        u8                              lag_active:1;
        u8                              wc_support:1;

> 
> Jason
Jason Gunthorpe Aug. 21, 2023, 1:39 p.m. UTC | #4
On Sun, Aug 20, 2023 at 12:59:26PM +0300, Leon Romanovsky wrote:
> On Fri, Aug 18, 2023 at 01:42:30PM -0300, Jason Gunthorpe wrote:
> > On Fri, Aug 18, 2023 at 01:33:35PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Aug 16, 2023 at 09:52:23AM +0300, Leon Romanovsky wrote:
> > > > From: Mark Bloch <mbloch@nvidia.com>
> > > > 
> > > > If the RDMA device isn't in LAG mode there is no need
> > > > to try to get the upper device.
> > > > 
> > > > Signed-off-by: Mark Bloch <mbloch@nvidia.com>
> > > > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > > > ---
> > > >  drivers/infiniband/hw/mlx5/main.c | 22 +++++++++++++++-------
> > > >  1 file changed, 15 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > > > index f0b394ed7452..215d7b0add8f 100644
> > > > --- a/drivers/infiniband/hw/mlx5/main.c
> > > > +++ b/drivers/infiniband/hw/mlx5/main.c
> > > > @@ -195,12 +195,18 @@ static int mlx5_netdev_event(struct notifier_block *this,
> > > >  	case NETDEV_CHANGE:
> > > >  	case NETDEV_UP:
> > > >  	case NETDEV_DOWN: {
> > > > -		struct net_device *lag_ndev = mlx5_lag_get_roce_netdev(mdev);
> > > >  		struct net_device *upper = NULL;
> > > >  
> > > > -		if (lag_ndev) {
> > > > -			upper = netdev_master_upper_dev_get(lag_ndev);
> > > > -			dev_put(lag_ndev);
> > > > +		if (ibdev->lag_active) {
> > > 
> > > Needs locking to read lag_active
> > 
> > Specifically the use of the bitfield looks messed up.. If lag_active
> > and some others were set only during probe it could be OK.
> 
> All fields except ib_active are static and set during probe.
> 
> > 
> > But mixing other stuff that is being written concurrently is not OK to
> > do like this. (eg ib_active via a mlx5 notifier)
> 
> What you are looking for is the following change, did I get you right?
> 
> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> index 9d0c56b59ed2..ee73113717b2 100644
> --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> @@ -1094,7 +1094,7 @@ struct mlx5_ib_dev {
>         /* serialize update of capability mask
>          */
>         struct mutex                    cap_mask_mutex;
> -       u8                              ib_active:1;
> +       bool                            ib_active;
>         u8                              is_rep:1;
>         u8                              lag_active:1;
>         u8                              wc_support:1;

That helps, but it still needs some kind of concurrency management for
ib_active

Jason
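
Editorial note on "concurrency management" for ib_active: the thread does not say which mechanism was eventually chosen. In kernel code the usual candidates would be a lock, READ_ONCE()/WRITE_ONCE(), or atomic bitops such as set_bit()/test_bit(). The sketch below is only a userspace analogue using C11 atomics; struct demo_dev and both helpers are hypothetical names, not part of mlx5.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device state: one flag that changes at runtime and
 * one that is written once before the device is published. */
struct demo_dev {
	atomic_bool ib_active;	/* flipped by an event handler */
	bool lag_active;	/* set during probe, read-only afterwards */
};

/* Writer side: publish the new state with release ordering so that
 * earlier stores are visible to a reader that observes the flag. */
static void demo_set_active(struct demo_dev *dev, bool active)
{
	atomic_store_explicit(&dev->ib_active, active, memory_order_release);
}

/* Reader side: a single atomic load, paired with the release store. */
static bool demo_is_active(struct demo_dev *dev)
{
	return atomic_load_explicit(&dev->ib_active, memory_order_acquire);
}
```

Because ib_active is its own atomic object here, updating it never rewrites lag_active, and readers need no lock to sample either flag.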

Patch

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index f0b394ed7452..215d7b0add8f 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -195,12 +195,18 @@  static int mlx5_netdev_event(struct notifier_block *this,
 	case NETDEV_CHANGE:
 	case NETDEV_UP:
 	case NETDEV_DOWN: {
-		struct net_device *lag_ndev = mlx5_lag_get_roce_netdev(mdev);
 		struct net_device *upper = NULL;
 
-		if (lag_ndev) {
-			upper = netdev_master_upper_dev_get(lag_ndev);
-			dev_put(lag_ndev);
+		if (ibdev->lag_active) {
+			struct net_device *lag_ndev;
+
+			lag_ndev = mlx5_lag_get_roce_netdev(mdev);
+			if (lag_ndev) {
+				upper = netdev_master_upper_dev_get(lag_ndev);
+				dev_put(lag_ndev);
+			} else {
+				goto done;
+			}
 		}
 
 		if (ibdev->is_rep)
@@ -254,9 +260,11 @@  static struct net_device *mlx5_ib_get_netdev(struct ib_device *device,
 	if (!mdev)
 		return NULL;
 
-	ndev = mlx5_lag_get_roce_netdev(mdev);
-	if (ndev)
-		goto out;
+	if (ibdev->lag_active) {
+		ndev = mlx5_lag_get_roce_netdev(mdev);
+		if (ndev)
+			goto out;
+	}
 
 	/* Ensure ndev does not disappear before we invoke dev_hold()
 	 */