Message ID | 20221121132935.2032325-1-lizetao1@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | virtio_net: Fix probe failed when modprobe virtio_net |
On Mon, Nov 21, 2022 at 09:29:35PM +0800, Li Zetao wrote:
> When doing the following test steps, an error was found:
>   step 1: modprobe virtio_net succeeded
>     # modprobe virtio_net <-- OK
>
>   step 2: fault injection in register_netdevice()
>     # modprobe -r virtio_net <-- OK
>     # ...
>       FAULT_INJECTION: forcing a failure.
>       name failslab, interval 1, probability 0, space 0, times 0
>       CPU: 0 PID: 3521 Comm: modprobe
>       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>       Call Trace:
>        <TASK>
>        ...
>        should_failslab+0xa/0x20
>        ...
>        dev_set_name+0xc0/0x100
>        netdev_register_kobject+0xc2/0x340
>        register_netdevice+0xbb9/0x1320
>        virtnet_probe+0x1d72/0x2658 [virtio_net]
>        ...
>        </TASK>
>       virtio_net: probe of virtio0 failed with error -22
>
>   step 3: modprobe virtio_net failed
>     # modprobe virtio_net <-- failed
>       virtio_net: probe of virtio0 failed with error -2
>
> The root cause of the problem is that the queues are not
> disabled on the error handling path when register_netdevice()
> fails in virtnet_probe(), resulting in an error "-ENOENT"
> returned in the next modprobe call in setup_vq().
>
> virtio_pci_modern_device uses virtqueues to send or
> receive messages, and "queue_enable" records whether the
> queues are available. In vp_modern_find_vqs(), all queues
> will be selected and activated, but once queues are enabled
> there is no way to go back except reset.
>
> Fix it by resetting the virtio device on the error handling path.
>
> Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
> Signed-off-by: Li Zetao <lizetao1@huawei.com>

I would add to this:

------
This makes error handling follow the same order as normal device
cleanup which does:

static void remove_vq_common(struct virtnet_info *vi)
{
	virtio_reset_device(vi->vdev);

	/* Free unused buffers in both send and recv, if any. */
	free_unused_bufs(vi);

	free_receive_bufs(vi);

	free_receive_page_frags(vi);

	virtnet_del_vqs(vi);
}

static void virtnet_remove(struct virtio_device *vdev)
{
	struct virtnet_info *vi = vdev->priv;

	virtnet_cpu_notif_remove(vi);

	/* Make sure no work handler is accessing the device. */
	flush_work(&vi->config_work);

	unregister_netdev(vi->dev);

	net_failover_destroy(vi->failover);

	remove_vq_common(vi);

	free_netdev(vi->dev);
}

So unregister, destroy failover, then reset - and that flow is
better tested than error handling so we can be reasonably sure it works
well.
-----

I would thus probably also include this tag instead:

Fixes: 0246555550 ("virtio_net: fix use after free on allocation failure")

This is what introduced the difference in cleanup order; the modern
driver just added hardware support.

Besides extending the commit log:

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/virtio_net.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7106932c6f88..86e52454b5b5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -3949,12 +3949,11 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	return 0;
>
>  free_unregister_netdev:
> -	virtio_reset_device(vdev);
> -
>  	unregister_netdev(dev);
>  free_failover:
>  	net_failover_destroy(vi->failover);
>  free_vqs:
> +	virtio_reset_device(vdev);
>  	cancel_delayed_work_sync(&vi->refill);
>  	free_receive_page_frags(vi);
>  	virtnet_del_vqs(vi);
> --
> 2.25.1
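The ordering argument can be illustrated with a small userspace model. This is only a sketch and not driver code: `toy_remove()`, `toy_probe_unwind()`, `struct trace` and the `unwind_from` stages are hypothetical names, and each teardown step is reduced to recording that it ran. The claim about which failure enters at which label is an assumption of the model; only the label order is taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Teardown steps, reduced to labels that can be recorded (toy model). */
enum step { UNREGISTER, DESTROY_FAILOVER, RESET, DEL_VQS };

#define MAX_STEPS 8

struct trace {
	enum step steps[MAX_STEPS];
	size_t len;
};

static void record(struct trace *t, enum step s)
{
	if (t->len < MAX_STEPS)
		t->steps[t->len++] = s;
}

/* Order used by the normal removal path quoted in the review. */
static void toy_remove(struct trace *t)
{
	record(t, UNREGISTER);
	record(t, DESTROY_FAILOVER);
	record(t, RESET);	/* first action of remove_vq_common() */
	record(t, DEL_VQS);
}

/* Hypothetical entry points into the unwind chain, one per label. */
enum unwind_from { FROM_FREE_VQS, FROM_FREE_FAILOVER, FROM_FREE_UNREGISTER };

/* Error unwinding with the label order the patch produces. */
static int toy_probe_unwind(struct trace *t, enum unwind_from where)
{
	if (where == FROM_FREE_VQS)
		goto free_vqs;
	if (where == FROM_FREE_FAILOVER)
		goto free_failover;

	/* corresponds to the free_unregister_netdev label */
	record(t, UNREGISTER);
free_failover:
	record(t, DESTROY_FAILOVER);
free_vqs:
	record(t, RESET);	/* reset now sits here, so every path reaches it */
	record(t, DEL_VQS);
	return -EINVAL;
}
```

In this model the deepest unwind produces the same sequence as `toy_remove()`, and the shallower entry points still pass through the reset step, which is the property the patch is after.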
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7106932c6f88..86e52454b5b5 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -3949,12 +3949,11 @@ static int virtnet_probe(struct virtio_device *vdev)
 	return 0;
 
 free_unregister_netdev:
-	virtio_reset_device(vdev);
-
 	unregister_netdev(dev);
 free_failover:
 	net_failover_destroy(vi->failover);
 free_vqs:
+	virtio_reset_device(vdev);
 	cancel_delayed_work_sync(&vi->refill);
 	free_receive_page_frags(vi);
 	virtnet_del_vqs(vi);
When doing the following test steps, an error was found:
  step 1: modprobe virtio_net succeeded
    # modprobe virtio_net <-- OK

  step 2: fault injection in register_netdevice()
    # modprobe -r virtio_net <-- OK
    # ...
      FAULT_INJECTION: forcing a failure.
      name failslab, interval 1, probability 0, space 0, times 0
      CPU: 0 PID: 3521 Comm: modprobe
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
      Call Trace:
       <TASK>
       ...
       should_failslab+0xa/0x20
       ...
       dev_set_name+0xc0/0x100
       netdev_register_kobject+0xc2/0x340
       register_netdevice+0xbb9/0x1320
       virtnet_probe+0x1d72/0x2658 [virtio_net]
       ...
       </TASK>
      virtio_net: probe of virtio0 failed with error -22

  step 3: modprobe virtio_net failed
    # modprobe virtio_net <-- failed
      virtio_net: probe of virtio0 failed with error -2

The root cause of the problem is that the queues are not
disabled on the error handling path when register_netdevice()
fails in virtnet_probe(), resulting in an error "-ENOENT"
returned in the next modprobe call in setup_vq().

virtio_pci_modern_device uses virtqueues to send or
receive messages, and "queue_enable" records whether the
queues are available. In vp_modern_find_vqs(), all queues
will be selected and activated, but once queues are enabled
there is no way to go back except reset.

Fix it by resetting the virtio device on the error handling path.

Fixes: 1fcf0512c9c8 ("virtio_pci: modern driver")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
---
 drivers/net/virtio_net.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
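The failure sequence described in the commit message can be reproduced in a minimal userspace model. This is a sketch under stated assumptions, not the virtio code: `struct toy_device`, `toy_reset()`, `toy_setup_vq()` and `toy_probe()` are hypothetical names, and the only behaviour modelled is that an enabled queue stays enabled until a reset and that setting up an already enabled queue returns -ENOENT.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define NUM_QUEUES 2

/* Hypothetical stand-in for a device: one queue_enable flag per queue. */
struct toy_device {
	bool queue_enable[NUM_QUEUES];
};

/* In this model, reset is the only operation that clears queue_enable. */
static void toy_reset(struct toy_device *dev)
{
	memset(dev->queue_enable, 0, sizeof(dev->queue_enable));
}

/* Refuse a queue that is still enabled, otherwise enable it. */
static int toy_setup_vq(struct toy_device *dev, int index)
{
	if (dev->queue_enable[index])
		return -ENOENT;
	dev->queue_enable[index] = true;
	return 0;
}

/*
 * Toy probe: enable every queue, then optionally fail at the step that
 * stands in for register_netdevice(). reset_on_error selects whether
 * the error path resets the device before returning.
 */
static int toy_probe(struct toy_device *dev, bool fail_register,
		     bool reset_on_error)
{
	int i, err;

	for (i = 0; i < NUM_QUEUES; i++) {
		err = toy_setup_vq(dev, i);
		if (err)
			return err;
	}

	if (fail_register) {
		if (reset_on_error)
			toy_reset(dev);
		return -EINVAL;
	}
	return 0;
}
```

Calling `toy_probe(&dev, true, false)` and then `toy_probe(&dev, false, false)` on a zero-initialized device yields -EINVAL followed by -ENOENT, matching the -22 and -2 in the log above; with `reset_on_error` set to true, the second probe succeeds.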