Message ID | 20190628123659.139576-1-sgarzare@redhat.com (mailing list archive) |
---|---|
Series | vsock/virtio: several fixes in the .probe() and .remove() |
On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> before registering the driver", Stefan pointed out some possible issues
> in the .probe() and .remove() callbacks of the virtio-vsock driver.
>
> This series tries to solve these issues:
> - Patch 1 adds RCU critical sections to avoid use-after-free of the
>   'the_virtio_vsock' pointer.
> - Patch 2 stops workers before calling vdev->config->reset(vdev) to
>   be sure that no one is accessing the device.
> - Patch 3 moves the work flush to the end of .remove() to avoid
>   use-after-free of the 'vsock' object.
>
> v2:
> - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> - Patch 2: no changes
> - Patch 3: flush works only at the end of .remove()
> - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
>   allocated.
>
> v1: https://patchwork.kernel.org/cover/10964733/

This looks good to me.

Did you run any stress tests? For example an SMP guest constantly
connecting and sending packets together with a script that
hotplug/unplugs vhost-vsock-pci from the host side.

Stefan

On Mon, Jul 01, 2019 at 04:11:13PM +0100, Stefan Hajnoczi wrote:
> On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> > During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> > before registering the driver", Stefan pointed out some possible issues
> > in the .probe() and .remove() callbacks of the virtio-vsock driver.
> >
> > This series tries to solve these issues:
> > - Patch 1 adds RCU critical sections to avoid use-after-free of the
> >   'the_virtio_vsock' pointer.
> > - Patch 2 stops workers before calling vdev->config->reset(vdev) to
> >   be sure that no one is accessing the device.
> > - Patch 3 moves the work flush to the end of .remove() to avoid
> >   use-after-free of the 'vsock' object.
> >
> > v2:
> > - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> > - Patch 2: no changes
> > - Patch 3: flush works only at the end of .remove()
> > - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
> >   allocated.
> >
> > v1: https://patchwork.kernel.org/cover/10964733/
>
> This looks good to me.

Thanks for the review!

>
> Did you run any stress tests? For example an SMP guest constantly
> connecting and sending packets together with a script that
> hotplug/unplugs vhost-vsock-pci from the host side.

Yes, I started an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait)
and I ran these scripts to stress the .probe()/.remove() path:

- guest
    while true; do
        cat /dev/urandom | nc-vsock -l 4321 > /dev/null &
        cat /dev/urandom | nc-vsock -l 5321 > /dev/null &
        cat /dev/urandom | nc-vsock -l 6321 > /dev/null &
        cat /dev/urandom | nc-vsock -l 7321 > /dev/null &
        wait
    done

- host
    while true; do
        cat /dev/urandom | nc-vsock 3 4321 > /dev/null &
        cat /dev/urandom | nc-vsock 3 5321 > /dev/null &
        cat /dev/urandom | nc-vsock 3 6321 > /dev/null &
        cat /dev/urandom | nc-vsock 3 7321 > /dev/null &
        sleep 2
        echo "device_del v1" | nc 127.0.0.1 1234
        sleep 1
        echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" | nc 127.0.0.1 1234
        sleep 1
    done

Do you think that is enough, or would a more accurate test be better?

Thanks,
Stefano

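The host-side loop above runs until it is interrupted. For long runs it may help to bound the loop by a cycle count and report how many unplug/plug cycles actually completed. What follows is a minimal sketch under stated assumptions, not the script used in this thread: the names `hotplug_loop` and `send_monitor` and the environment variables are hypothetical, while the monitor address, device id, guest CID and the `device_del`/`device_add` commands are taken from the host script above. The `nc-vsock` traffic generators are left out and would run alongside it.

```shell
#!/bin/sh
# Sketch only: count-bounded variant of the host-side hotplug loop.
# Defaults mirror the values used in the thread; the variable names and
# function names are hypothetical.
MONITOR_HOST=${MONITOR_HOST:-127.0.0.1}
MONITOR_PORT=${MONITOR_PORT:-1234}
DEV_ID=${DEV_ID:-v1}
GUEST_CID=${GUEST_CID:-3}

# Send one command line to the QEMU monitor over TCP.
# Redefine this function to dry-run the loop without a running guest.
send_monitor() {
    echo "$1" | nc "$MONITOR_HOST" "$MONITOR_PORT"
}

# hotplug_loop CYCLES PAUSE_SECONDS
# Unplug and re-plug the vsock device CYCLES times, sleeping PAUSE_SECONDS
# after each monitor command, then print the number of completed cycles.
hotplug_loop() {
    total=$1
    pause=$2
    cycles=0
    while [ "$cycles" -lt "$total" ]; do
        send_monitor "device_del $DEV_ID"
        sleep "$pause"
        send_monitor "device_add vhost-vsock-pci,id=$DEV_ID,guest-cid=$GUEST_CID"
        sleep "$pause"
        cycles=$((cycles + 1))
    done
    echo "completed $cycles hotplug cycles"
}
```

Calling `hotplug_loop 5000 1` would perform 5000 unplug/plug cycles with a one-second pause after each monitor command, which takes a little under three hours.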
On Mon, Jul 01, 2019 at 07:03:57PM +0200, Stefano Garzarella wrote:
> On Mon, Jul 01, 2019 at 04:11:13PM +0100, Stefan Hajnoczi wrote:
> > On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> > > During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> > > before registering the driver", Stefan pointed out some possible issues
> > > in the .probe() and .remove() callbacks of the virtio-vsock driver.
> > >
> > > This series tries to solve these issues:
> > > - Patch 1 adds RCU critical sections to avoid use-after-free of the
> > >   'the_virtio_vsock' pointer.
> > > - Patch 2 stops workers before calling vdev->config->reset(vdev) to
> > >   be sure that no one is accessing the device.
> > > - Patch 3 moves the work flush to the end of .remove() to avoid
> > >   use-after-free of the 'vsock' object.
> > >
> > > v2:
> > > - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> > > - Patch 2: no changes
> > > - Patch 3: flush works only at the end of .remove()
> > > - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
> > >   allocated.
> > >
> > > v1: https://patchwork.kernel.org/cover/10964733/
> >
> > This looks good to me.
>
> Thanks for the review!
>
> >
> > Did you run any stress tests? For example an SMP guest constantly
> > connecting and sending packets together with a script that
> > hotplug/unplugs vhost-vsock-pci from the host side.
>
> Yes, I started an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait)
> and I ran these scripts to stress the .probe()/.remove() path:
>
> - guest
>     while true; do
>         cat /dev/urandom | nc-vsock -l 4321 > /dev/null &
>         cat /dev/urandom | nc-vsock -l 5321 > /dev/null &
>         cat /dev/urandom | nc-vsock -l 6321 > /dev/null &
>         cat /dev/urandom | nc-vsock -l 7321 > /dev/null &
>         wait
>     done
>
> - host
>     while true; do
>         cat /dev/urandom | nc-vsock 3 4321 > /dev/null &
>         cat /dev/urandom | nc-vsock 3 5321 > /dev/null &
>         cat /dev/urandom | nc-vsock 3 6321 > /dev/null &
>         cat /dev/urandom | nc-vsock 3 7321 > /dev/null &
>         sleep 2
>         echo "device_del v1" | nc 127.0.0.1 1234
>         sleep 1
>         echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" | nc 127.0.0.1 1234
>         sleep 1
>     done
>
> Do you think that is enough, or would a more accurate test be better?

That's good when left running overnight so that thousands of hotplug
events are tested.

Stefan

On Wed, Jul 03, 2019 at 10:14:53AM +0100, Stefan Hajnoczi wrote:
> On Mon, Jul 01, 2019 at 07:03:57PM +0200, Stefano Garzarella wrote:
> > On Mon, Jul 01, 2019 at 04:11:13PM +0100, Stefan Hajnoczi wrote:
> > > On Fri, Jun 28, 2019 at 02:36:56PM +0200, Stefano Garzarella wrote:
> > > > During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock
> > > > before registering the driver", Stefan pointed out some possible issues
> > > > in the .probe() and .remove() callbacks of the virtio-vsock driver.
> > > >
> > > > This series tries to solve these issues:
> > > > - Patch 1 adds RCU critical sections to avoid use-after-free of the
> > > >   'the_virtio_vsock' pointer.
> > > > - Patch 2 stops workers before calling vdev->config->reset(vdev) to
> > > >   be sure that no one is accessing the device.
> > > > - Patch 3 moves the work flush to the end of .remove() to avoid
> > > >   use-after-free of the 'vsock' object.
> > > >
> > > > v2:
> > > > - Patch 1: use RCU to protect 'the_virtio_vsock' pointer
> > > > - Patch 2: no changes
> > > > - Patch 3: flush works only at the end of .remove()
> > > > - Removed patch 4 because virtqueue_detach_unused_buf() returns all the buffers
> > > >   allocated.
> > > >
> > > > v1: https://patchwork.kernel.org/cover/10964733/
> > >
> > > This looks good to me.
> >
> > Thanks for the review!
> >
> > >
> > > Did you run any stress tests? For example an SMP guest constantly
> > > connecting and sending packets together with a script that
> > > hotplug/unplugs vhost-vsock-pci from the host side.
> >
> > Yes, I started an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait)
> > and I ran these scripts to stress the .probe()/.remove() path:
> >
> > - guest
> >     while true; do
> >         cat /dev/urandom | nc-vsock -l 4321 > /dev/null &
> >         cat /dev/urandom | nc-vsock -l 5321 > /dev/null &
> >         cat /dev/urandom | nc-vsock -l 6321 > /dev/null &
> >         cat /dev/urandom | nc-vsock -l 7321 > /dev/null &
> >         wait
> >     done
> >
> > - host
> >     while true; do
> >         cat /dev/urandom | nc-vsock 3 4321 > /dev/null &
> >         cat /dev/urandom | nc-vsock 3 5321 > /dev/null &
> >         cat /dev/urandom | nc-vsock 3 6321 > /dev/null &
> >         cat /dev/urandom | nc-vsock 3 7321 > /dev/null &
> >         sleep 2
> >         echo "device_del v1" | nc 127.0.0.1 1234
> >         sleep 1
> >         echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" | nc 127.0.0.1 1234
> >         sleep 1
> >     done
> >
> > Do you think that is enough, or would a more accurate test be better?
>
> That's good when left running overnight so that thousands of hotplug
> events are tested.

Honestly, I ran the test for ~30 mins (because without the patch the crash
happens in a few seconds), but of course, I'll run it tonight :)

Thanks,
Stefano