Message ID | 20210125180115.22936-1-vgoyal@redhat.com (mailing list archive) |
---|---|
Series | vhost-user: Shutdown/Flush slave channel properly |
On Mon, Jan 25, 2021 at 01:01:09PM -0500, Vivek Goyal wrote:
> Hi,
>
> We are working on DAX support in virtiofs and have some patches out of
> the tree hosted here.
>
> https://gitlab.com/virtio-fs/qemu/-/commits/virtio-fs-dev
>
> These patches have not been proposed for merge yet, because David
> Gilbert noticed that we can run into a deadlock during an emergency
> reboot of guest kernel. (echo b > /proc/sysrq-trigger).
>
> I have provided details of deadlock in 4th patch of the series with
> subject "qemu, vhost-user: Extend protocol to start/stop/flush slave
> channel".
>
> Basic problem seems to be that we don't have a proper mechanism to
> shutdown slave channel when vhost-user device is stopping. This means
> there might be pending messages in slave channel and slave is blocked
> and waiting for response.
>
> This is an RFC patch series to enhance vhost-user protocol to
> properly shutdown/flush slave channel and avoid the deadlock. Though
> we faced the issue in the context of virtiofs, any vhost-user
> device using slave channel can potentially run into issues and
> can benefit from these patches.
>
> Any feedback is welcome. Currently patches are based on out of
> tree code but after I get some feedback, I can only take pieces
> which are relevant to upstream and post separately.
>
> Thanks
> Vivek

No comments so far - do you plan to post a non-RFC patchset?

> Vivek Goyal (6):
>   virtiofsd: Drop ->vu_dispatch_rwlock while waiting for thread to exit
>   libvhost-user: Use slave_mutex in all slave messages
>   vhost-user: Return error code from slave_read()
>   qemu, vhost-user: Extend protocol to start/stop/flush slave channel
>   libvhost-user: Add support to start/stop/flush slave channel
>   virtiofsd: Opt in for slave start/stop/shutdown functionality
>
>  hw/virtio/vhost-user.c                    | 151 +++++++++++++++++++++-
>  subprojects/libvhost-user/libvhost-user.c | 147 +++++++++++++++++----
>  subprojects/libvhost-user/libvhost-user.h |   8 +-
>  tools/virtiofsd/fuse_virtio.c             |  20 +++
>  4 files changed, 294 insertions(+), 32 deletions(-)
>
> --
> 2.25.4
On Wed, Feb 10, 2021 at 04:39:06PM -0500, Michael S. Tsirkin wrote:
> No comments so far - do you plan to post a non-RFC patchset?

Yes. Stefan wants me to poll both the unix fd and the slave fd in the
device shutdown path and serve both of these in parallel, instead of
adding a new slave channel shutdown message. I am planning to give it
a try and post new patches.

Vivek
On Mon, Jan 25, 2021 at 01:01:09PM -0500, Vivek Goyal wrote:
> Hi,
>
> We are working on DAX support in virtiofs and have some patches out of
> the tree hosted here.
>
> https://gitlab.com/virtio-fs/qemu/-/commits/virtio-fs-dev

any plans to post a non RFC version?
On Tue, Feb 23, 2021 at 09:14:16AM -0500, Michael S. Tsirkin wrote:
> any plans to post a non RFC version?

We want to post a non-RFC version, but the review comments have not
been taken care of yet. Stefan says not to extend the vhost-user
protocol. Instead, modify vhost_user_read() so that it polls both
u->user->chr (the unix domain socket) and u->slave_fd. IOW, keep
servicing slave fd requests while we are waiting for the vhost-user
message response.

I have not been able to figure out how to do that, given that the unix
domain socket details are abstracted behind the char device interface.
CCing Greg, he might have ideas on how to do that.

Vivek
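The idea described above (keep servicing the slave fd while blocked
waiting for the reply on the main socket) can be illustrated outside of
QEMU with plain poll(2). The following is only a minimal standalone
sketch, not QEMU code: wait_for_reply() and service_slave_request() are
hypothetical names, messages are reduced to single bytes, and it assumes
both channels are available as raw file descriptors, which is exactly
what the char device abstraction hides in the real vhost_user_read().

```c
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for handling one slave-channel request: read a
 * one-byte request from slave_fd and acknowledge it by writing the same
 * byte to ack_fd. Returns 0 on success, -1 on error or EOF.
 */
int service_slave_request(int slave_fd, int ack_fd)
{
    uint8_t req;

    if (read(slave_fd, &req, sizeof(req)) != sizeof(req)) {
        return -1;
    }
    if (write(ack_fd, &req, sizeof(req)) != sizeof(req)) {
        return -1;
    }
    return 0;
}

/*
 * Wait for a one-byte reply on main_fd. While waiting, keep servicing
 * requests that arrive on slave_fd so the peer is never left blocked.
 * Returns 0 and stores the reply on success, -1 on error.
 * *serviced counts the slave requests handled along the way.
 */
int wait_for_reply(int main_fd, int slave_fd, int ack_fd,
                   uint8_t *reply, int *serviced)
{
    struct pollfd fds[2] = {
        { .fd = main_fd,  .events = POLLIN },
        { .fd = slave_fd, .events = POLLIN },
    };

    *serviced = 0;
    for (;;) {
        if (poll(fds, 2, -1) < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted by a signal, retry */
            }
            return -1;
        }
        /* Handle pending slave requests first so the peer can progress. */
        if (fds[1].revents & POLLIN) {
            if (service_slave_request(slave_fd, ack_fd) < 0) {
                return -1;
            }
            (*serviced)++;
            continue;
        }
        if (fds[0].revents & POLLIN) {
            return read(main_fd, reply, 1) == 1 ? 0 : -1;
        }
        /* Neither fd is readable: treat hangup or error as failure. */
        if ((fds[0].revents | fds[1].revents) &
            (POLLERR | POLLHUP | POLLNVAL)) {
            return -1;
        }
    }
}
```

In this sketch the caller never blocks in a read on the main channel
alone, so a peer that is itself waiting for a slave-channel response
cannot deadlock against it. How to get an equivalent of main_fd out of
the char device layer is the open question raised in the mail above and
is not addressed here.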
On Mon, Jan 25, 2021 at 01:01:09PM -0500, Vivek Goyal wrote:
> Hi,
>
> We are working on DAX support in virtiofs and have some patches out of
> the tree hosted here.
>
> https://gitlab.com/virtio-fs/qemu/-/commits/virtio-fs-dev

ping

anyone wants to pick this up and post a non-rfc version?
On Sun, Mar 14, 2021 at 06:21:04PM -0400, Michael S. Tsirkin wrote:
> ping
>
> anyone wants to pick this up and post a non-rfc version?

Hi Michael,

Greg has picked up this work and has already posted V2 of the patches
here.

https://lore.kernel.org/qemu-devel/20210312092212.782255-8-groug@kaod.org/T/

Please have a look.

Thanks
Vivek