[net] hv_netvsc: fix race of netvsc and VF register_netdevice

Message ID 1698268592-20373-1-git-send-email-haiyangz@microsoft.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [net] hv_netvsc: fix race of netvsc and VF register_netdevice

Checks

Context                         Check    Description
netdev/series_format            success  Single patches do not need cover letters
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1362 this patch: 1362
netdev/cc_maintainers           success  CCed 10 of 10 maintainers
netdev/build_clang              success  Errors and warnings before: 1386 this patch: 1386
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1386 this patch: 1386
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 54 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Haiyang Zhang Oct. 25, 2023, 9:16 p.m. UTC
The rtnl lock also needs to be held before calling rndis_filter_device_add(),
which advertises the nvsp_2_vsc_capability / sriov bit and triggers
VF NIC offering and registration. If the VF NIC finishes register_netdev()
before netvsc does, name-based configuration may fail.

To fix this, move the call to rtnl_lock() before
rndis_filter_device_add(), so that the VF is registered later than the
netvsc / synthetic NIC and gets a name numbered (ethX) after netvsc.

Also move register_netdevice_notifier() earlier, so the callback
function is set before probing.

Cc: stable@vger.kernel.org
Fixes: e04e7a7bbd4b ("hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>

---
 drivers/net/hyperv/netvsc_drv.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)
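
The ordering argument in the commit message can be illustrated with a single-threaded toy model. This is only a sketch: every `toy_` name is hypothetical and none of it is kernel code. A flag stands in for the rtnl lock, and a VF registration that would block on the lock is modelled as a pending request that runs when the lock is released. The model assumes the worst case, in which the VF tries to register as soon as the SR-IOV capability is advertised.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, not kernel APIs: a flag models the rtnl lock and a
 * second flag models a VF registration blocked on that lock. */
static bool toy_rtnl_held;
static bool toy_vf_waiting;
static int toy_next_index;		/* next free N for "ethN" */
static int toy_netvsc_index = -1;
static int toy_vf_index = -1;

static void toy_reset(void)
{
	toy_rtnl_held = false;
	toy_vf_waiting = false;
	toy_next_index = 0;
	toy_netvsc_index = -1;
	toy_vf_index = -1;
}

/* The VF driver wants to register; it must wait while the lock is held. */
static void toy_vf_register(void)
{
	if (toy_rtnl_held) {
		toy_vf_waiting = true;
		return;
	}
	toy_vf_index = toy_next_index++;
}

static void toy_rtnl_lock(void)
{
	assert(!toy_rtnl_held);
	toy_rtnl_held = true;
}

/* Releasing the lock lets a waiting VF registration proceed. */
static void toy_rtnl_unlock(void)
{
	toy_rtnl_held = false;
	if (toy_vf_waiting) {
		toy_vf_waiting = false;
		toy_vf_register();
	}
}

/* Advertising SR-IOV is what causes the VF to be offered. */
static void toy_device_add(void)
{
	toy_vf_register();
}

/* lock_before_add == false models the old order, true the patched one. */
static void toy_probe(bool lock_before_add)
{
	if (lock_before_add)
		toy_rtnl_lock();
	toy_device_add();
	if (!lock_before_add)
		toy_rtnl_lock();
	toy_netvsc_index = toy_next_index++;	/* netvsc registers here */
	toy_rtnl_unlock();
}
```

With `toy_probe(false)` the VF takes index 0 and netvsc gets index 1. With `toy_probe(true)` the VF request is held back until the unlock, so netvsc gets index 0 and the VF index 1.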

Comments

Wojciech Drewek Oct. 26, 2023, 10:47 a.m. UTC | #1
On 25.10.2023 23:16, Haiyang Zhang wrote:
> The rtnl lock also needs to be held before rndis_filter_device_add()
> which advertises nvsp_2_vsc_capability / sriov bit, and triggers
> VF NIC offering and registering. If VF NIC finished register_netdev()
> earlier it may cause name based config failure.
> 
> To fix this issue, move the call to rtnl_lock() before
> rndis_filter_device_add(), so VF will be registered later than netvsc
> / synthetic NIC, and gets a name numbered (ethX) after netvsc.
> 
> And, move register_netdevice_notifier() earlier, so the call back
> function is set before probing.
> 
> Cc: stable@vger.kernel.org
> Fixes: e04e7a7bbd4b ("hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()")
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> 
> ---
>  drivers/net/hyperv/netvsc_drv.c | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 3ba3c8fb28a5..feca1391f756 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2531,15 +2531,6 @@ static int netvsc_probe(struct hv_device *dev,
>  		goto devinfo_failed;
>  	}
>  
> -	nvdev = rndis_filter_device_add(dev, device_info);
> -	if (IS_ERR(nvdev)) {
> -		ret = PTR_ERR(nvdev);
> -		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
> -		goto rndis_failed;
> -	}
> -
> -	eth_hw_addr_set(net, device_info->mac_adr);
> -
>  	/* We must get rtnl lock before scheduling nvdev->subchan_work,
>  	 * otherwise netvsc_subchan_work() can get rtnl lock first and wait
>  	 * all subchannels to show up, but that may not happen because
> @@ -2547,9 +2538,23 @@ static int netvsc_probe(struct hv_device *dev,
>  	 * -> ... -> device_add() -> ... -> __device_attach() can't get
>  	 * the device lock, so all the subchannels can't be processed --
>  	 * finally netvsc_subchan_work() hangs forever.
> +	 *
> +	 * The rtnl lock also needs to be held before rndis_filter_device_add()
> +	 * which advertises nvsp_2_vsc_capability / sriov bit, and triggers
> +	 * VF NIC offering and registering. If VF NIC finished register_netdev()
> +	 * earlier it may cause name based config failure.
>  	 */
>  	rtnl_lock();
>  
> +	nvdev = rndis_filter_device_add(dev, device_info);
> +	if (IS_ERR(nvdev)) {
> +		ret = PTR_ERR(nvdev);
> +		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
> +		goto rndis_failed;

In case of error rtnl won't be unlocked.

> +	}
> +
> +	eth_hw_addr_set(net, device_info->mac_adr);
> +
>  	if (nvdev->num_chn > 1)
>  		schedule_work(&nvdev->subchan_work);
>  
> @@ -2788,11 +2793,14 @@ static int __init netvsc_drv_init(void)
>  	}
>  	netvsc_ring_bytes = ring_size * PAGE_SIZE;
>  
> +	register_netdevice_notifier(&netvsc_netdev_notifier);
> +
>  	ret = vmbus_driver_register(&netvsc_drv);
> -	if (ret)
> +	if (ret) {
> +		unregister_netdevice_notifier(&netvsc_netdev_notifier);
>  		return ret;
> +	}
>  
> -	register_netdevice_notifier(&netvsc_netdev_notifier);
>  	return 0;
>  }
>
Haiyang Zhang Oct. 26, 2023, 2:52 p.m. UTC | #2
> -----Original Message-----
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Sent: Thursday, October 26, 2023 6:48 AM
> To: Haiyang Zhang <haiyangz@microsoft.com>; linux-hyperv@vger.kernel.org;
> netdev@vger.kernel.org
> Cc: KY Srinivasan <kys@microsoft.com>; wei.liu@kernel.org; Dexuan Cui
> <decui@microsoft.com>; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; davem@davemloft.net; linux-kernel@vger.kernel.org;
> stable@vger.kernel.org
> Subject: Re: [PATCH net] hv_netvsc: fix race of netvsc and VF
> register_netdevice
> 
> On 25.10.2023 23:16, Haiyang Zhang wrote:
> > The rtnl lock also needs to be held before rndis_filter_device_add()
> > which advertises nvsp_2_vsc_capability / sriov bit, and triggers
> > VF NIC offering and registering. If VF NIC finished register_netdev()
> > earlier it may cause name based config failure.
> >
> > To fix this issue, move the call to rtnl_lock() before
> > rndis_filter_device_add(), so VF will be registered later than netvsc
> > / synthetic NIC, and gets a name numbered (ethX) after netvsc.
> >
> > And, move register_netdevice_notifier() earlier, so the call back
> > function is set before probing.
> >
> > Cc: stable@vger.kernel.org
> > Fixes: e04e7a7bbd4b ("hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()")
> > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> >
> > ---
> >  drivers/net/hyperv/netvsc_drv.c | 30 +++++++++++++++++++-----------
> >  1 file changed, 19 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> > index 3ba3c8fb28a5..feca1391f756 100644
> > --- a/drivers/net/hyperv/netvsc_drv.c
> > +++ b/drivers/net/hyperv/netvsc_drv.c
> > @@ -2531,15 +2531,6 @@ static int netvsc_probe(struct hv_device *dev,
> >               goto devinfo_failed;
> >       }
> >
> > -     nvdev = rndis_filter_device_add(dev, device_info);
> > -     if (IS_ERR(nvdev)) {
> > -             ret = PTR_ERR(nvdev);
> > -             netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
> > -             goto rndis_failed;
> > -     }
> > -
> > -     eth_hw_addr_set(net, device_info->mac_adr);
> > -
> >       /* We must get rtnl lock before scheduling nvdev->subchan_work,
> >        * otherwise netvsc_subchan_work() can get rtnl lock first and wait
> >        * all subchannels to show up, but that may not happen because
> > @@ -2547,9 +2538,23 @@ static int netvsc_probe(struct hv_device *dev,
> >        * -> ... -> device_add() -> ... -> __device_attach() can't get
> >        * the device lock, so all the subchannels can't be processed --
> >        * finally netvsc_subchan_work() hangs forever.
> > +      *
> > +      * The rtnl lock also needs to be held before rndis_filter_device_add()
> > +      * which advertises nvsp_2_vsc_capability / sriov bit, and triggers
> > +      * VF NIC offering and registering. If VF NIC finished register_netdev()
> > +      * earlier it may cause name based config failure.
> >        */
> >       rtnl_lock();
> >
> > +     nvdev = rndis_filter_device_add(dev, device_info);
> > +     if (IS_ERR(nvdev)) {
> > +             ret = PTR_ERR(nvdev);
> > +             netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
> > +             goto rndis_failed;
> 
> In case of error rtnl won't be unlocked.

Good catch! Will correct this. 

Thanks,
- Haiyang
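
The problem raised in the review is that, with rtnl_lock() now taken before rndis_filter_device_add(), the `goto rndis_failed` path leaves the function with the lock still held. The sketch below shows one possible shape of an error path that releases the lock. It is a userspace illustration under assumptions, not the follow-up revision of this patch: all `sketch_` names are hypothetical, and a counter stands in for the rtnl lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rtnl lock: 1 while held, 0 otherwise. */
static int sketch_lock_depth;

static void sketch_rtnl_lock(void)   { sketch_lock_depth++; }
static void sketch_rtnl_unlock(void) { sketch_lock_depth--; }

/* Stand-in for the device-add step; can be forced to fail. */
static int sketch_device_add(bool fail)
{
	return fail ? -ENODEV : 0;
}

static int sketch_probe(bool add_fails)
{
	int ret;

	sketch_rtnl_lock();

	ret = sketch_device_add(add_fails);
	if (ret)
		goto add_failed;

	/* ... the net device would be registered here ... */

	sketch_rtnl_unlock();
	return 0;

add_failed:
	/* The lock was taken before the failing call, so drop it here. */
	sketch_rtnl_unlock();
	/* ... remaining cleanup would follow ... */
	return ret;
}
```

After `sketch_probe(true)` returns `-ENODEV`, `sketch_lock_depth` is back at 0. Without the unlock under `add_failed`, it would stay at 1.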

Patch

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 3ba3c8fb28a5..feca1391f756 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2531,15 +2531,6 @@  static int netvsc_probe(struct hv_device *dev,
 		goto devinfo_failed;
 	}
 
-	nvdev = rndis_filter_device_add(dev, device_info);
-	if (IS_ERR(nvdev)) {
-		ret = PTR_ERR(nvdev);
-		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
-		goto rndis_failed;
-	}
-
-	eth_hw_addr_set(net, device_info->mac_adr);
-
 	/* We must get rtnl lock before scheduling nvdev->subchan_work,
 	 * otherwise netvsc_subchan_work() can get rtnl lock first and wait
 	 * all subchannels to show up, but that may not happen because
@@ -2547,9 +2538,23 @@  static int netvsc_probe(struct hv_device *dev,
 	 * -> ... -> device_add() -> ... -> __device_attach() can't get
 	 * the device lock, so all the subchannels can't be processed --
 	 * finally netvsc_subchan_work() hangs forever.
+	 *
+	 * The rtnl lock also needs to be held before rndis_filter_device_add()
+	 * which advertises nvsp_2_vsc_capability / sriov bit, and triggers
+	 * VF NIC offering and registering. If VF NIC finished register_netdev()
+	 * earlier it may cause name based config failure.
 	 */
 	rtnl_lock();
 
+	nvdev = rndis_filter_device_add(dev, device_info);
+	if (IS_ERR(nvdev)) {
+		ret = PTR_ERR(nvdev);
+		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
+		goto rndis_failed;
+	}
+
+	eth_hw_addr_set(net, device_info->mac_adr);
+
 	if (nvdev->num_chn > 1)
 		schedule_work(&nvdev->subchan_work);
 
@@ -2788,11 +2793,14 @@  static int __init netvsc_drv_init(void)
 	}
 	netvsc_ring_bytes = ring_size * PAGE_SIZE;
 
+	register_netdevice_notifier(&netvsc_netdev_notifier);
+
 	ret = vmbus_driver_register(&netvsc_drv);
-	if (ret)
+	if (ret) {
+		unregister_netdevice_notifier(&netvsc_netdev_notifier);
 		return ret;
+	}
 
-	register_netdevice_notifier(&netvsc_netdev_notifier);
 	return 0;
 }
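
The last hunk registers the notifier before the driver and rolls it back if driver registration fails. A minimal userspace sketch of that ordering follows. The `demo_` names are hypothetical stand-ins, and the assumption that registering the driver may probe devices right away is modelled by recording whether the notifier was already in place. Unlike the hunk above, the sketch also checks the return value of the notifier registration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool demo_notifier_registered;
static bool demo_probe_saw_notifier;

static int demo_register_notifier(void)
{
	demo_notifier_registered = true;
	return 0;
}

static void demo_unregister_notifier(void)
{
	demo_notifier_registered = false;
}

/* Assumption: registering the driver may probe devices immediately,
 * so record whether the notifier was in place at that moment. */
static int demo_driver_register(bool fail)
{
	if (fail)
		return -ENOMEM;
	demo_probe_saw_notifier = demo_notifier_registered;
	return 0;
}

static int demo_drv_init(bool driver_fails)
{
	int ret;

	/* Set up the callback first so no probe can run without it. */
	ret = demo_register_notifier();
	if (ret)
		return ret;

	ret = demo_driver_register(driver_fails);
	if (ret) {
		/* Undo the earlier step so a failed init leaves nothing behind. */
		demo_unregister_notifier();
		return ret;
	}

	return 0;
}
```

`demo_drv_init(false)` returns 0 with the notifier visible to the probe step. `demo_drv_init(true)` returns `-ENOMEM` and leaves the notifier unregistered.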