Message ID | 1698268592-20373-1-git-send-email-haiyangz@microsoft.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | [net] hv_netvsc: fix race of netvsc and VF register_netdevice |
On 25.10.2023 23:16, Haiyang Zhang wrote:
> The rtnl lock also needs to be held before rndis_filter_device_add()
> which advertises nvsp_2_vsc_capability / sriov bit, and triggers
> VF NIC offering and registering. If VF NIC finished register_netdev()
> earlier it may cause name based config failure.
>
> To fix this issue, move the call to rtnl_lock() before
> rndis_filter_device_add(), so VF will be registered later than netvsc
> / synthetic NIC, and gets a name numbered (ethX) after netvsc.
>
> And, move register_netdevice_notifier() earlier, so the call back
> function is set before probing.
>
> Cc: stable@vger.kernel.org
> Fixes: e04e7a7bbd4b ("hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()")
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
>
> ---
>  drivers/net/hyperv/netvsc_drv.c | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index 3ba3c8fb28a5..feca1391f756 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2531,15 +2531,6 @@ static int netvsc_probe(struct hv_device *dev,
>  		goto devinfo_failed;
>  	}
>
> -	nvdev = rndis_filter_device_add(dev, device_info);
> -	if (IS_ERR(nvdev)) {
> -		ret = PTR_ERR(nvdev);
> -		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
> -		goto rndis_failed;
> -	}
> -
> -	eth_hw_addr_set(net, device_info->mac_adr);
> -
>  	/* We must get rtnl lock before scheduling nvdev->subchan_work,
>  	 * otherwise netvsc_subchan_work() can get rtnl lock first and wait
>  	 * all subchannels to show up, but that may not happen because
> @@ -2547,9 +2538,23 @@ static int netvsc_probe(struct hv_device *dev,
>  	 * -> ... -> device_add() -> ... -> __device_attach() can't get
>  	 * the device lock, so all the subchannels can't be processed --
>  	 * finally netvsc_subchan_work() hangs forever.
> +	 *
> +	 * The rtnl lock also needs to be held before rndis_filter_device_add()
> +	 * which advertises nvsp_2_vsc_capability / sriov bit, and triggers
> +	 * VF NIC offering and registering. If VF NIC finished register_netdev()
> +	 * earlier it may cause name based config failure.
>  	 */
>  	rtnl_lock();
>
> +	nvdev = rndis_filter_device_add(dev, device_info);
> +	if (IS_ERR(nvdev)) {
> +		ret = PTR_ERR(nvdev);
> +		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
> +		goto rndis_failed;

In case of error rtnl won't be unlocked.

> +	}
> +
> +	eth_hw_addr_set(net, device_info->mac_adr);
> +
>  	if (nvdev->num_chn > 1)
>  		schedule_work(&nvdev->subchan_work);
>
> @@ -2788,11 +2793,14 @@ static int __init netvsc_drv_init(void)
>  	}
>  	netvsc_ring_bytes = ring_size * PAGE_SIZE;
>
> +	register_netdevice_notifier(&netvsc_netdev_notifier);
> +
>  	ret = vmbus_driver_register(&netvsc_drv);
> -	if (ret)
> +	if (ret) {
> +		unregister_netdevice_notifier(&netvsc_netdev_notifier);
>  		return ret;
> +	}
>
> -	register_netdevice_notifier(&netvsc_netdev_notifier);
>  	return 0;
>  }
>
> -----Original Message-----
> From: Wojciech Drewek <wojciech.drewek@intel.com>
> Sent: Thursday, October 26, 2023 6:48 AM
> To: Haiyang Zhang <haiyangz@microsoft.com>; linux-hyperv@vger.kernel.org;
> netdev@vger.kernel.org
> Cc: KY Srinivasan <kys@microsoft.com>; wei.liu@kernel.org; Dexuan Cui
> <decui@microsoft.com>; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; davem@davemloft.net; linux-kernel@vger.kernel.org;
> stable@vger.kernel.org
> Subject: Re: [PATCH net] hv_netvsc: fix race of netvsc and VF
> register_netdevice
>
> On 25.10.2023 23:16, Haiyang Zhang wrote:
> > [...]
> >  	rtnl_lock();
> >
> > +	nvdev = rndis_filter_device_add(dev, device_info);
> > +	if (IS_ERR(nvdev)) {
> > +		ret = PTR_ERR(nvdev);
> > +		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
> > +		goto rndis_failed;
>
> In case of error rtnl won't be unlocked.

Good catch! Will correct this.

Thanks,
- Haiyang
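For readers following the review comment, here is a minimal user-space sketch of the problem and one possible shape of a fix. It is not the follow-up patch, which is not shown in this thread. All `model_*` names are hypothetical stand-ins (`model_rtnl_lock()` for `rtnl_lock()`, `model_device_add()` for `rndis_filter_device_add()`), and the lock is modeled as a plain depth counter so that a leaked lock is easy to detect. The assumption is only that the failure path must release the lock before jumping to cleanup code that does not expect to hold it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the rtnl mutex: a depth counter makes a leaked lock visible. */
static int model_lock_depth;

static void model_rtnl_lock(void)
{
	model_lock_depth++;
}

static void model_rtnl_unlock(void)
{
	assert(model_lock_depth > 0);	/* unlocking an unheld lock is a bug */
	model_lock_depth--;
}

/* Hypothetical stand-in for rndis_filter_device_add(): 0 or a negative errno. */
static int model_device_add(bool fail)
{
	return fail ? -ENODEV : 0;
}

/* Models the reordered probe: the lock is taken before the device is added. */
static int model_probe(bool fail_add)
{
	int ret;

	model_rtnl_lock();

	ret = model_device_add(fail_add);
	if (ret) {
		/* The lock is now held on this path, so release it before unwinding. */
		model_rtnl_unlock();
		goto add_failed;
	}

	/* ... the net device would be registered here, still under the lock ... */

	model_rtnl_unlock();
	return 0;

add_failed:
	/* Cleanup that runs without the lock held. */
	return ret;
}
```

Calling `model_probe(true)` returns `-ENODEV` and leaves `model_lock_depth` at zero. Without the `model_rtnl_unlock()` call in the error branch the counter would stay at one, which is the leak the review points out.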
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 3ba3c8fb28a5..feca1391f756 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -2531,15 +2531,6 @@ static int netvsc_probe(struct hv_device *dev,
 		goto devinfo_failed;
 	}
 
-	nvdev = rndis_filter_device_add(dev, device_info);
-	if (IS_ERR(nvdev)) {
-		ret = PTR_ERR(nvdev);
-		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
-		goto rndis_failed;
-	}
-
-	eth_hw_addr_set(net, device_info->mac_adr);
-
 	/* We must get rtnl lock before scheduling nvdev->subchan_work,
 	 * otherwise netvsc_subchan_work() can get rtnl lock first and wait
 	 * all subchannels to show up, but that may not happen because
@@ -2547,9 +2538,23 @@ static int netvsc_probe(struct hv_device *dev,
 	 * -> ... -> device_add() -> ... -> __device_attach() can't get
 	 * the device lock, so all the subchannels can't be processed --
 	 * finally netvsc_subchan_work() hangs forever.
+	 *
+	 * The rtnl lock also needs to be held before rndis_filter_device_add()
+	 * which advertises nvsp_2_vsc_capability / sriov bit, and triggers
+	 * VF NIC offering and registering. If VF NIC finished register_netdev()
+	 * earlier it may cause name based config failure.
 	 */
 	rtnl_lock();
 
+	nvdev = rndis_filter_device_add(dev, device_info);
+	if (IS_ERR(nvdev)) {
+		ret = PTR_ERR(nvdev);
+		netdev_err(net, "unable to add netvsc device (ret %d)\n", ret);
+		goto rndis_failed;
+	}
+
+	eth_hw_addr_set(net, device_info->mac_adr);
+
 	if (nvdev->num_chn > 1)
 		schedule_work(&nvdev->subchan_work);
 
@@ -2788,11 +2793,14 @@ static int __init netvsc_drv_init(void)
 	}
 	netvsc_ring_bytes = ring_size * PAGE_SIZE;
 
+	register_netdevice_notifier(&netvsc_netdev_notifier);
+
 	ret = vmbus_driver_register(&netvsc_drv);
-	if (ret)
+	if (ret) {
+		unregister_netdevice_notifier(&netvsc_netdev_notifier);
 		return ret;
+	}
 
-	register_netdevice_notifier(&netvsc_netdev_notifier);
 	return 0;
 }
The rtnl lock also needs to be held before rndis_filter_device_add()
which advertises nvsp_2_vsc_capability / sriov bit, and triggers
VF NIC offering and registering. If VF NIC finished register_netdev()
earlier it may cause name based config failure.

To fix this issue, move the call to rtnl_lock() before
rndis_filter_device_add(), so VF will be registered later than netvsc
/ synthetic NIC, and gets a name numbered (ethX) after netvsc.

And, move register_netdevice_notifier() earlier, so the call back
function is set before probing.

Cc: stable@vger.kernel.org
Fixes: e04e7a7bbd4b ("hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
---
 drivers/net/hyperv/netvsc_drv.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)
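
The notifier reordering in the commit message can be illustrated with a small user-space model. This is a sketch under stated assumptions, not driver code: every `model_*` name is hypothetical, and the model simply assumes that registering the driver may run a probe immediately, which is why the callback has to be in place first and has to be removed again if driver registration fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool model_notifier_registered;	/* stands in for the netdevice notifier */
static bool model_probe_saw_notifier;	/* what a probe observed when it ran */

static void model_register_notifier(void)
{
	model_notifier_registered = true;
}

static void model_unregister_notifier(void)
{
	assert(model_notifier_registered);	/* only undo what was done */
	model_notifier_registered = false;
}

/* Hypothetical stand-in for driver registration; on success a probe runs at once. */
static int model_driver_register(bool fail)
{
	if (fail)
		return -ENOMEM;
	model_probe_saw_notifier = model_notifier_registered;
	return 0;
}

/* Mirrors the ordering in the patch: notifier first, then the driver. */
static int model_drv_init(bool fail_driver)
{
	int ret;

	model_register_notifier();

	ret = model_driver_register(fail_driver);
	if (ret) {
		/* Undo the earlier step so a failed init leaves nothing behind. */
		model_unregister_notifier();
		return ret;
	}

	return 0;
}
```

With this ordering, `model_drv_init(false)` leaves `model_probe_saw_notifier` set, while `model_drv_init(true)` returns `-ENOMEM` with the notifier unregistered again.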