diff mbox series

[net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()

Message ID 20241025072356.56093-1-wenjia@linux.ibm.com (mailing list archive)
State Not Applicable
Series [net] net/smc: Fix lookup of netdev by using ib_device_get_netdev()

Commit Message

Wenjia Zhang Oct. 25, 2024, 7:23 a.m. UTC
Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
alternative to get_netdev") introduced an API ib_device_get_netdev.
The SMC-R variant of the SMC protocol continued to use the old API
ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
ib_device_ops.get_netdev didn't work any more at least by using a mlx5
device driver. Thus, using ib_device_set_netdev() now became mandatory.

Replace ib_device_ops.get_netdev() with ib_device_get_netdev().

Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
Reported-by: Aswin K <aswin@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
---
 net/smc/smc_ib.c   | 8 ++------
 net/smc/smc_pnet.c | 4 +---
 2 files changed, 3 insertions(+), 9 deletions(-)

Comments

Halil Pasic Oct. 25, 2024, 8:57 a.m. UTC | #1
On Fri, 25 Oct 2024 09:23:55 +0200
Wenjia Zhang <wenjia@linux.ibm.com> wrote:

> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Simon Horman Oct. 25, 2024, 2:01 p.m. UTC | #2
On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Reviewed-by: Simon Horman <horms@kernel.org>
Dust Li Oct. 26, 2024, 12:42 a.m. UTC | #3
On 2024-10-25 09:23:55, Wenjia Zhang wrote:
>Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
>alternative to get_netdev") introduced an API ib_device_get_netdev.
>The SMC-R variant of the SMC protocol continued to use the old API
>ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
>("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
>get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
>ib_device_ops.get_netdev didn't work any more at least by using a mlx5
>device driver. Thus, using ib_device_set_netdev() now became mandatory.
>
>Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
>
>Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
>Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
>Reported-by: Aswin K <aswin@linux.ibm.com>
>Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
>Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Reviewed-by: Dust Li <dust.li@linux.alibaba.com>

>---
> net/smc/smc_ib.c   | 8 ++------
> net/smc/smc_pnet.c | 4 +---
> 2 files changed, 3 insertions(+), 9 deletions(-)
>
>diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>index 9297dc20bfe2..9c563cdbea90 100644
>--- a/net/smc/smc_ib.c
>+++ b/net/smc/smc_ib.c
>@@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
> 	struct ib_device *ibdev = smcibdev->ibdev;
> 	struct net_device *ndev;
> 
>-	if (!ibdev->ops.get_netdev)
>-		return;
>-	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
>+	ndev = ib_device_get_netdev(ibdev, port + 1);
> 	if (ndev) {
> 		smcibdev->ndev_ifidx[port] = ndev->ifindex;
> 		dev_put(ndev);
>@@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
> 		port_cnt = smcibdev->ibdev->phys_port_cnt;
> 		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
> 			libdev = smcibdev->ibdev;
>-			if (!libdev->ops.get_netdev)
>-				continue;
>-			lndev = libdev->ops.get_netdev(libdev, i + 1);
>+			lndev = ib_device_get_netdev(libdev, i + 1);
> 			dev_put(lndev);
> 			if (lndev != ndev)
> 				continue;
>diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
>index 1dd362326c0a..8566937c8903 100644
>--- a/net/smc/smc_pnet.c
>+++ b/net/smc/smc_pnet.c
>@@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
> 		for (i = 1; i <= SMC_MAX_PORTS; i++) {
> 			if (!rdma_is_port_valid(ibdev->ibdev, i))
> 				continue;
>-			if (!ibdev->ibdev->ops.get_netdev)
>-				continue;
>-			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
>+			ndev = ib_device_get_netdev(ibdev->ibdev, i);
> 			if (!ndev)
> 				continue;
> 			dev_put(ndev);
>-- 
>2.43.0
>
Wen Gu Oct. 27, 2024, 11:18 a.m. UTC | #4
On 2024/10/25 15:23, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

LGTM!

Reviewed-by: Wen Gu <guwen@linux.alibaba.com>

> ---
>   net/smc/smc_ib.c   | 8 ++------
>   net/smc/smc_pnet.c | 4 +---
>   2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>   	struct ib_device *ibdev = smcibdev->ibdev;
>   	struct net_device *ndev;
>   
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>   	if (ndev) {
>   		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>   		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>   		port_cnt = smcibdev->ibdev->phys_port_cnt;
>   		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>   			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>   			dev_put(lndev);
>   			if (lndev != ndev)
>   				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>   		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>   			if (!rdma_is_port_valid(ibdev->ibdev, i))
>   				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>   			if (!ndev)
>   				continue;
>   			dev_put(ndev);
Zhu Yanjun Oct. 27, 2024, 7:28 p.m. UTC | #5
On 2024/10/25 9:23, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling

Thanks a lot.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and
get_netdev functions") removed the get_netdev callback from
mlx5_ib_dev_common_roce_ops, but in mlx4 the get_netdev callback is
still present in mlx4_ib_dev_ops. The following commit follows mlx5 and
removes get_netdev from the mlx4 driver.

 From a59f2e01428640a332a51b8d910ec166704aa441 Mon Sep 17 00:00:00 2001
From: Zhu Yanjun <yanjun.zhu@linux.dev>
Date: Sun, 27 Oct 2024 20:21:27 +0100
Subject: [PATCH 1/1] RDMA/mlx4: Use IB get_netdev functions and remove
  get_netdev callback

Commit 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev
functions") removed the get_netdev callback from
mlx5_ib_dev_common_roce_ops. The get_netdev callback should be removed
from mlx4 as well.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
Compile-tested only.
---
  drivers/infiniband/hw/mlx4/main.c | 35 -------------------------------
  1 file changed, 35 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 529db874d67c..cf34d92de7b1 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -123,40 +123,6 @@ static int num_ib_ports(struct mlx4_dev *dev)
         return ib_ports;
  }

-static struct net_device *mlx4_ib_get_netdev(struct ib_device *device,
-                                            u32 port_num)
-{
-       struct mlx4_ib_dev *ibdev = to_mdev(device);
-       struct net_device *dev, *ret = NULL;
-
-       rcu_read_lock();
-       for_each_netdev_rcu(&init_net, dev) {
-               if (dev->dev.parent != ibdev->ib_dev.dev.parent ||
-                   dev->dev_port + 1 != port_num)
-                       continue;
-
-               if (mlx4_is_bonded(ibdev->dev)) {
-                       struct net_device *upper;
-
-                       upper = netdev_master_upper_dev_get_rcu(dev);
-                       if (upper) {
-                               struct net_device *active;
-
-                               active = bond_option_active_slave_get_rcu(netdev_priv(upper));
-                               if (active)
-                                       dev = active;
-                       }
-               }
-
-               dev_hold(dev);
-               ret = dev;
-               break;
-       }
-
-       rcu_read_unlock();
-       return ret;
-}
-
  static int mlx4_ib_update_gids_v1(struct gid_entry *gids,
                                   struct mlx4_ib_dev *ibdev,
                                   u32 port_num)
@@ -2544,7 +2510,6 @@ static const struct ib_device_ops mlx4_ib_dev_ops = {
         .get_dev_fw_str = get_fw_ver_str,
         .get_dma_mr = mlx4_ib_get_dma_mr,
         .get_link_layer = mlx4_ib_port_link_layer,
-       .get_netdev = mlx4_ib_get_netdev,
         .get_port_immutable = mlx4_port_immutable,
         .map_mr_sg = mlx4_ib_map_mr_sg,
         .mmap = mlx4_ib_mmap,
--
2.34.1

Zhu Yanjun

> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
>   net/smc/smc_ib.c   | 8 ++------
>   net/smc/smc_pnet.c | 4 +---
>   2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>   	struct ib_device *ibdev = smcibdev->ibdev;
>   	struct net_device *ndev;
>   
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>   	if (ndev) {
>   		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>   		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>   		port_cnt = smcibdev->ibdev->phys_port_cnt;
>   		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>   			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>   			dev_put(lndev);
>   			if (lndev != ndev)
>   				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>   		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>   			if (!rdma_is_port_valid(ibdev->ibdev, i))
>   				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>   			if (!ndev)
>   				continue;
>   			dev_put(ndev);
Leon Romanovsky Oct. 27, 2024, 8:18 p.m. UTC | #6
On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. 

I would say that calls to ibdev ops from ULPs were never the right
thing to do. ib_device_set_netdev() was introduced for the drivers.

So the whole commit message is not accurate and should be rewritten.

> As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver.

That statement is not correct either. All modern drivers (for the last 5 years)
don't have the .get_netdev() op, so this is not mlx5 specific, but it is another
justification for saying that SMC-R was doing it wrong.

> Thus, using ib_device_set_netdev() now became mandatory.

ib_device_set_netdev() is mandatory for the drivers; it has nothing to do
with ULPs.

> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().

It is too late for me to do a proper review today, but I would say
that it is worth paying attention to the multiple dev_put() calls in the
functions around ib_device_get_netdev().

> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")

This Fixes line is not related to this change.

> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
>  net/smc/smc_ib.c   | 8 ++------
>  net/smc/smc_pnet.c | 4 +---
>  2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>  	struct ib_device *ibdev = smcibdev->ibdev;
>  	struct net_device *ndev;
>  
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>  	if (ndev) {
>  		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>  		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>  		port_cnt = smcibdev->ibdev->phys_port_cnt;
>  		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>  			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>  			dev_put(lndev);
>  			if (lndev != ndev)
>  				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>  		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>  			if (!rdma_is_port_valid(ibdev->ibdev, i))
>  				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>  			if (!ndev)
>  				continue;
>  			dev_put(ndev);
> -- 
> 2.43.0
> 
>
Leon Romanovsky Oct. 27, 2024, 8:30 p.m. UTC | #7
On Sun, Oct 27, 2024 at 10:18:57PM +0200, Leon Romanovsky wrote:
> On Fri, Oct 25, 2024 at 09:23:55AM +0200, Wenjia Zhang wrote:
> > Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> > alternative to get_netdev") introduced an API ib_device_get_netdev.
> > The SMC-R variant of the SMC protocol continued to use the old API
> > ib_device_ops.get_netdev() to lookup netdev. 
> 
> I would say that calls to ibdev ops from ULPs were never the right
> thing to do. ib_device_set_netdev() was introduced for the drivers.
> 
> So the whole commit message is not accurate and should be rewritten.
> 
> > As this commit 8d159eb2117b
> > ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> > get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> > ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> > device driver.
> 
> That statement is not correct either. All modern drivers (for the last 5 years)
> don't have the .get_netdev() op, so this is not mlx5 specific, but it is another
> justification for saying that SMC-R was doing it wrong.
> 
> > Thus, using ib_device_set_netdev() now became mandatory.
> 
> ib_device_set_netdev() is mandatory for the drivers; it has nothing to do
> with ULPs.
> 
> > 
> > Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> It is too late for me to do a proper review today, but I would say
> that it is worth paying attention to the multiple dev_put() calls in the
> functions around ib_device_get_netdev().
> 
> > 
> > Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")

Honestly, the patch named in this Fixes line doesn't look right to me. It pokes
inside of ib_device to get the netdev index. For example, a call to
smc_ib_ndev_change() will return completely unpredictable results due to races.

It is bad that the RDMA ML wasn't even CCed back then; we would have said NAK to
this patch.
https://lore.kernel.org/netdev/20201201192049.53517-6-kgraul@linux.ibm.com/

Thanks
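
The remark about dev_put() placement can be illustrated with a small user-space sketch. This is not the SMC code and not a proposed fix: struct mock_netdev, mock_get(), mock_put() and mock_port_matches() are hypothetical stand-ins, and the plain integer counter only models the idea of a held reference. It shows one way to keep the reference for as long as the pointer is compared or dereferenced, with a single release point per lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a net device; refcnt models held references. */
struct mock_netdev {
	int ifindex;
	int refcnt;
};

/* Models a lookup that returns a held reference, or NULL if there is none. */
static struct mock_netdev *mock_get(struct mock_netdev *slot)
{
	if (slot)
		slot->refcnt++;
	return slot;
}

/* Models dropping a reference; tolerates NULL. */
static void mock_put(struct mock_netdev *nd)
{
	if (nd)
		nd->refcnt--;
}

/*
 * Returns 1 and stores the ifindex if the port's netdev is the target.
 * The reference is held across both the comparison and the dereference,
 * and is released exactly once before returning.
 */
static int mock_port_matches(struct mock_netdev *slot,
			     const struct mock_netdev *target,
			     int *ifindex_out)
{
	struct mock_netdev *nd = mock_get(slot);
	int match = 0;

	if (!nd)
		return 0;
	if (nd == target) {
		*ifindex_out = nd->ifindex;
		match = 1;
	}
	mock_put(nd);
	return match;
}
```

With this shape, the reference count is back at its starting value after every call, whether the lookup matched, did not match, or found no netdev at all.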
D. Wythe Oct. 29, 2024, 8:43 a.m. UTC | #8
On 10/25/24 3:23 PM, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>
> ---
>   net/smc/smc_ib.c   | 8 ++------
>   net/smc/smc_pnet.c | 4 +---
>   2 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 9297dc20bfe2..9c563cdbea90 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -899,9 +899,7 @@ static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
>   	struct ib_device *ibdev = smcibdev->ibdev;
>   	struct net_device *ndev;
>   
> -	if (!ibdev->ops.get_netdev)
> -		return;
> -	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
> +	ndev = ib_device_get_netdev(ibdev, port + 1);
>   	if (ndev) {
>   		smcibdev->ndev_ifidx[port] = ndev->ifindex;
>   		dev_put(ndev);
> @@ -921,9 +919,7 @@ void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
>   		port_cnt = smcibdev->ibdev->phys_port_cnt;
>   		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
>   			libdev = smcibdev->ibdev;
> -			if (!libdev->ops.get_netdev)
> -				continue;
> -			lndev = libdev->ops.get_netdev(libdev, i + 1);
> +			lndev = ib_device_get_netdev(libdev, i + 1);
>   			dev_put(lndev);
>   			if (lndev != ndev)
>   				continue;
> diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
> index 1dd362326c0a..8566937c8903 100644
> --- a/net/smc/smc_pnet.c
> +++ b/net/smc/smc_pnet.c
> @@ -1054,9 +1054,7 @@ static void smc_pnet_find_rdma_dev(struct net_device *netdev,
>   		for (i = 1; i <= SMC_MAX_PORTS; i++) {
>   			if (!rdma_is_port_valid(ibdev->ibdev, i))
>   				continue;
> -			if (!ibdev->ibdev->ops.get_netdev)
> -				continue;
> -			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
> +			ndev = ib_device_get_netdev(ibdev->ibdev, i);
>   			if (!ndev)
>   				continue;
>   			dev_put(ndev);


Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Paolo Abeni Oct. 31, 2024, 10:01 a.m. UTC | #9
On 10/25/24 09:23, Wenjia Zhang wrote:
> Commit c2261dd76b54 ("RDMA/device: Add ib_device_set_netdev() as an
> alternative to get_netdev") introduced an API ib_device_get_netdev.
> The SMC-R variant of the SMC protocol continued to use the old API
> ib_device_ops.get_netdev() to lookup netdev. As this commit 8d159eb2117b
> ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") removed the
> get_netdev callback from mlx5_ib_dev_common_roce_ops, calling
> ib_device_ops.get_netdev didn't work any more at least by using a mlx5
> device driver. Thus, using ib_device_set_netdev() now became mandatory.
> 
> Replace ib_device_ops.get_netdev() with ib_device_get_netdev().
> 
> Fixes: 54903572c23c ("net/smc: allow pnetid-less configuration")
> Fixes: 8d159eb2117b ("RDMA/mlx5: Use IB set_netdev and get_netdev functions")
> Reported-by: Aswin K <aswin@linux.ibm.com>
> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
> Signed-off-by: Wenjia Zhang <wenjia@linux.ibm.com>

Please adjust the commit message as per Leon's suggestion. You can retain
all the acks collected so far.

Thanks,

Paolo

Patch

diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 9297dc20bfe2..9c563cdbea90 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -899,9 +899,7 @@  static void smc_copy_netdev_ifindex(struct smc_ib_device *smcibdev, int port)
 	struct ib_device *ibdev = smcibdev->ibdev;
 	struct net_device *ndev;
 
-	if (!ibdev->ops.get_netdev)
-		return;
-	ndev = ibdev->ops.get_netdev(ibdev, port + 1);
+	ndev = ib_device_get_netdev(ibdev, port + 1);
 	if (ndev) {
 		smcibdev->ndev_ifidx[port] = ndev->ifindex;
 		dev_put(ndev);
@@ -921,9 +919,7 @@  void smc_ib_ndev_change(struct net_device *ndev, unsigned long event)
 		port_cnt = smcibdev->ibdev->phys_port_cnt;
 		for (i = 0; i < min_t(size_t, port_cnt, SMC_MAX_PORTS); i++) {
 			libdev = smcibdev->ibdev;
-			if (!libdev->ops.get_netdev)
-				continue;
-			lndev = libdev->ops.get_netdev(libdev, i + 1);
+			lndev = ib_device_get_netdev(libdev, i + 1);
 			dev_put(lndev);
 			if (lndev != ndev)
 				continue;
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c
index 1dd362326c0a..8566937c8903 100644
--- a/net/smc/smc_pnet.c
+++ b/net/smc/smc_pnet.c
@@ -1054,9 +1054,7 @@  static void smc_pnet_find_rdma_dev(struct net_device *netdev,
 		for (i = 1; i <= SMC_MAX_PORTS; i++) {
 			if (!rdma_is_port_valid(ibdev->ibdev, i))
 				continue;
-			if (!ibdev->ibdev->ops.get_netdev)
-				continue;
-			ndev = ibdev->ibdev->ops.get_netdev(ibdev->ibdev, i);
+			ndev = ib_device_get_netdev(ibdev->ibdev, i);
 			if (!ndev)
 				continue;
 			dev_put(ndev);
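
The difference between the two lookup styles in the patch above can be sketched in user space. This is a minimal model under stated assumptions, not kernel code: every mock_* name, the struct layouts and MOCK_MAX_PORTS are invented for illustration. The fallback to a legacy callback inside the helper is an assumption about how the core helper behaves, not something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_PORTS 2 /* hypothetical port limit for this model */

/* Hypothetical stand-in for a net device. */
struct mock_netdev {
	int ifindex;
	int refcnt;
};

/* Hypothetical stand-in for an IB device. */
struct mock_ibdev {
	/* Legacy per-driver callback; NULL for drivers that do not provide it. */
	struct mock_netdev *(*get_netdev)(struct mock_ibdev *dev,
					  unsigned int port);
	/* Netdevs registered by the driver, indexed by port - 1. */
	struct mock_netdev *port_netdev[MOCK_MAX_PORTS];
};

static void mock_dev_hold(struct mock_netdev *nd)
{
	if (nd)
		nd->refcnt++;
}

static void mock_dev_put(struct mock_netdev *nd)
{
	if (nd)
		nd->refcnt--;
}

/* Driver side: models registering a netdev for a 1-based port number. */
static int mock_set_netdev(struct mock_ibdev *dev, struct mock_netdev *nd,
			   unsigned int port)
{
	if (port < 1 || port > MOCK_MAX_PORTS)
		return -1;
	dev->port_netdev[port - 1] = nd;
	return 0;
}

/* Old consumer pattern: finds nothing when the driver has no callback. */
static struct mock_netdev *lookup_via_ops(struct mock_ibdev *dev,
					  unsigned int port)
{
	if (!dev->get_netdev)
		return NULL;
	return dev->get_netdev(dev, port);
}

/*
 * New consumer pattern: one helper that works with or without the
 * callback and returns a held reference (or NULL).
 */
static struct mock_netdev *mock_get_netdev(struct mock_ibdev *dev,
					   unsigned int port)
{
	struct mock_netdev *nd;

	if (port < 1 || port > MOCK_MAX_PORTS)
		return NULL;
	if (dev->get_netdev)
		return dev->get_netdev(dev, port);
	nd = dev->port_netdev[port - 1];
	mock_dev_hold(nd);
	return nd;
}
```

In this model, a device that registered its netdev through mock_set_netdev() and has no callback is invisible to lookup_via_ops() but is found by mock_get_netdev(). The caller then owns one reference and releases it with mock_dev_put().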