Message ID | 20191208225150.5944-1-mchristi@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | nbd: fix shutdown and recv work deadlock v2 |
On 12/8/19 3:51 PM, Mike Christie wrote:
> This fixes a regression added with:
>
> commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
> Author: Mike Christie <mchristi@redhat.com>
> Date: Sun Aug 4 14:10:06 2019 -0500
>
> nbd: fix max number of supported devs
>
> where we can deadlock during device shutdown. The problem occurs if
> the recv_work's nbd_config_put occurs after nbd_start_device_ioctl has
> returned and the userspace app has dropped its reference via closing
> the device and running nbd_release. The recv_work nbd_config_put call
> would then drop the refcount to zero and try to destroy the config which
> would try to do destroy_workqueue from the recv work.
>
> This patch just has nbd_start_device_ioctl do a flush_workqueue when it
> wakes so we know after the ioctl returns running works have exited. This
> also fixes a possible race where we could try to reuse the device while
> old recv_works are still running.

Applied, thanks Mike.
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 57532465fb83..b4607dd96185 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1296,10 +1296,10 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
 	mutex_unlock(&nbd->config_lock);
 	ret = wait_event_interruptible(config->recv_wq,
 					 atomic_read(&config->recv_threads) == 0);
-	if (ret) {
+	if (ret)
 		sock_shutdown(nbd);
-		flush_workqueue(nbd->recv_workq);
-	}
+	flush_workqueue(nbd->recv_workq);
+
 	mutex_lock(&nbd->config_lock);
 	nbd_bdev_reset(bdev);
 	/* user requested, ignore socket errors */
This fixes a regression added with:

commit e9e006f5fcf2bab59149cb38a48a4817c1b538b4
Author: Mike Christie <mchristi@redhat.com>
Date: Sun Aug 4 14:10:06 2019 -0500

nbd: fix max number of supported devs

where we can deadlock during device shutdown. The problem occurs if
the recv_work's nbd_config_put occurs after nbd_start_device_ioctl has
returned and the userspace app has dropped its reference via closing
the device and running nbd_release. The recv_work nbd_config_put call
would then drop the refcount to zero and try to destroy the config which
would try to do destroy_workqueue from the recv work.

This patch just has nbd_start_device_ioctl do a flush_workqueue when it
wakes so we know after the ioctl returns running works have exited. This
also fixes a possible race where we could try to reuse the device while
old recv_works are still running.

Cc: stable@vger.kernel.org
Signed-off-by: Mike Christie <mchristi@redhat.com>
---

v2:
- Drop the taking/dropping of a config_refs around the ioctl. This is not
needed because the caller has incremented the refcount already via the
open() call before doing the ioctl().

drivers/block/nbd.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
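
The ordering problem described in the commit message can be illustrated with a small single-threaded userspace model. This is only a sketch, not kernel code: `struct model`, `model_config_put`, `model_recv_work`, `model_flush`, and `model_shutdown` are hypothetical names invented for the illustration, and the model assumes exactly two references (one taken by open(), one held by a single recv work). It reports whether the last reference would be dropped from inside the work itself, which is the point where the real driver would call destroy_workqueue from the recv work.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
	int config_refs;          /* stands in for the config refcount */
	bool work_pending;        /* a recv work has not finished yet */
	bool in_work;             /* true while the work function runs */
	bool destroyed_from_work; /* final put happened inside the work */
};

/* Drop one reference; note who performs the final put. */
static void model_config_put(struct model *m)
{
	if (--m->config_refs > 0)
		return;
	/*
	 * In the driver, this is where the config is destroyed and
	 * destroy_workqueue would run. Doing that from the work itself
	 * means waiting for the work that is doing the waiting.
	 */
	if (m->in_work)
		m->destroyed_from_work = true;
}

/* The recv work drops its own reference just before it exits. */
static void model_recv_work(struct model *m)
{
	m->in_work = true;
	model_config_put(m);
	m->in_work = false;
	m->work_pending = false;
}

/* Stand-in for flush_workqueue: run the work to completion if pending. */
static void model_flush(struct model *m)
{
	if (m->work_pending)
		model_recv_work(m);
}

/*
 * Replay the shutdown sequence. Returns true if the final put happens
 * inside the recv work (the deadlock-prone ordering).
 */
bool model_shutdown(bool flush_before_return)
{
	/* One reference from open(), one held by the recv work. */
	struct model m = { .config_refs = 2, .work_pending = true };

	/* The ioctl wakes up; the patched code always flushes here. */
	if (flush_before_return)
		model_flush(&m);

	/* The ioctl returns, userspace closes the device (release). */
	model_config_put(&m);

	/* A work that was still running finally drops its reference. */
	model_flush(&m);

	return m.destroyed_from_work;
}
```

In this model, `model_shutdown(false)` returns true because the work performs the final put, while `model_shutdown(true)` returns false because the flush forces the work's put to happen before the ioctl returns, leaving the final put to the release path.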