Message ID | 20210120035027.51037-1-dubo163@126.com (mailing list archive) |
---|---|
State | New, archived |
Series | vhost-user-blk.c:fix the qemu-kvm crash in the vhost-user-blk env. |
Looks good - I just have some suggestions for the commit message.

On Wed, Jan 20, 2021 at 8:46 AM Bobo Du <dubo163@126.com> wrote:
>
> In our spdk env, when we restart spdk vhost process, all the spdk
> vhost dev will be reconnected,if the vhost_user_blk_device_realize
> failed in the reconnect code goto label with the qemu_chr_fe_wait_connected,
> the vhost_user_cleanup will set user->chr be NULL,but the fe handler
> vhost_user_blk_event is still work on the env.
>
> If the vhost slave(eg:spdk) has not been done,we will see the qemu-kvm
> crash after reopen the vhost-user-blk dev:

The English up to here is a little awkward. Maybe rather:

When a vhost-user-blk backend, such as SPDK, restarts, all of the
backend's vhost devs will be reconnected. If
vhost_user_blk_device_realize() fails at qemu_chr_fe_wait_connected(),
goto virtio_err will be executed. This in turn calls
vhost_user_cleanup(), which sets user->chr to NULL. This is problematic
because the qemu_chr_fe_handler will still be active and will crash if
the backend comes back up and reopens the chardev.

> gdb debug info from qemu-kvm-2.10:
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `/usr/libexec/qemu-kvm -name guest=db1ae942ac9c5486bf93c7baac5fcce6,debug-thread'.

I don't think you need these three lines:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name guest=db1ae942ac9c5486bf93c7baac5fcce6,debug-thread'.

> Program terminated with signal 11, Segmentation fault.
> #0 qemu_chr_fe_set_msgfds (be=0x0, fds=0x0, num=0) at chardev/char-fe.c:144
> 144 Chardev *s = be->chr;
>
> So,we must reset the fe handler after the goto label virtio_err.
>
> Signed-off-by: Bobo Du <dubo163@126.com>

Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

> ---
> hw/block/vhost-user-blk.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
> index da4fbf9084..c90687ab82 100644
> --- a/hw/block/vhost-user-blk.c
> +++ b/hw/block/vhost-user-blk.c
> @@ -507,6 +507,8 @@ virtio_err:
>      }
>      g_free(s->virtqs);
>      virtio_cleanup(vdev);
> +    qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL,
> +                             NULL, NULL, NULL, false);
>      vhost_user_cleanup(&s->vhost_user);
>  }
>
> --
> 2.17.0
>
>
diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
index da4fbf9084..c90687ab82 100644
--- a/hw/block/vhost-user-blk.c
+++ b/hw/block/vhost-user-blk.c
@@ -507,6 +507,8 @@ virtio_err:
     }
     g_free(s->virtqs);
     virtio_cleanup(vdev);
+    qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, NULL,
+                             NULL, NULL, NULL, false);
     vhost_user_cleanup(&s->vhost_user);
 }
In our spdk env, when we restart spdk vhost process, all the spdk
vhost dev will be reconnected,if the vhost_user_blk_device_realize
failed in the reconnect code goto label with the qemu_chr_fe_wait_connected,
the vhost_user_cleanup will set user->chr be NULL,but the fe handler
vhost_user_blk_event is still work on the env.

If the vhost slave(eg:spdk) has not been done,we will see the qemu-kvm
crash after reopen the vhost-user-blk dev:
gdb debug info from qemu-kvm-2.10:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/libexec/qemu-kvm -name guest=db1ae942ac9c5486bf93c7baac5fcce6,debug-thread'.
Program terminated with signal 11, Segmentation fault.
#0 qemu_chr_fe_set_msgfds (be=0x0, fds=0x0, num=0) at chardev/char-fe.c:144
144 Chardev *s = be->chr;

So,we must reset the fe handler after the goto label virtio_err.

Signed-off-by: Bobo Du <dubo163@126.com>
---
 hw/block/vhost-user-blk.c | 2 ++
 1 file changed, 2 insertions(+)
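
For readers following the thread, the failure mode described in the commit message can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions, not QEMU code: `Device`, `FrontEnd`, `UserState`, `simulate()` and the other names are hypothetical stand-ins, and the "crash" is recorded in a flag rather than triggered by a real NULL dereference. It shows why a handler that stays registered after cleanup is dangerous, and why clearing it first (as the patch does with qemu_chr_fe_set_handlers()) avoids the problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the QEMU types. */
typedef struct Device Device;

typedef struct {
    void (*on_event)(Device *dev); /* callback fired when the backend reconnects */
    Device *opaque;
} FrontEnd;

typedef struct {
    int *chr; /* models user->chr; set to NULL by cleanup */
} UserState;

struct Device {
    FrontEnd fe;
    UserState user;
    int backing_chr;
    bool would_crash; /* set instead of really dereferencing NULL */
};

/* Models the event handler: it assumes user.chr is still valid. */
static void device_event(Device *dev)
{
    if (dev->user.chr == NULL) {
        /* The real code would dereference the pointer here and segfault. */
        dev->would_crash = true;
        return;
    }
    (void)*dev->user.chr;
}

static void fe_set_handler(FrontEnd *fe, void (*cb)(Device *), Device *opaque)
{
    fe->on_event = cb;
    fe->opaque = opaque;
}

/* Models the cleanup step that sets the chr pointer to NULL. */
static void user_cleanup(UserState *u)
{
    u->chr = NULL;
}

static void device_init(Device *dev)
{
    assert(dev != NULL);
    dev->backing_chr = 0;
    dev->user.chr = &dev->backing_chr;
    dev->would_crash = false;
    fe_set_handler(&dev->fe, device_event, dev);
}

/* Models the error path; reset_handler selects the patched ordering. */
static void realize_error_path(Device *dev, bool reset_handler)
{
    if (reset_handler) {
        fe_set_handler(&dev->fe, NULL, NULL); /* unregister before cleanup */
    }
    user_cleanup(&dev->user);
}

/* Models the backend coming back up and triggering a chardev event. */
static void backend_reconnects(FrontEnd *fe)
{
    if (fe->on_event != NULL) {
        fe->on_event(fe->opaque);
    }
}

/* Returns true if the late event would have hit the NULL pointer. */
bool simulate(bool reset_handler)
{
    Device dev;
    device_init(&dev);
    realize_error_path(&dev, reset_handler);
    backend_reconnects(&dev.fe);
    return dev.would_crash;
}
```

In this model, `simulate(false)` corresponds to the unpatched error path and reports that the late event reaches the NULL pointer, while `simulate(true)` corresponds to the patched ordering, where the handler is cleared before cleanup and the event is ignored.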