[v2] nbd: add a flush_workqueue in nbd_start_device

Message ID 20200122031857.5859-1-sunke32@huawei.com (mailing list archive)
State New, archived

Commit Message

Sun Ke Jan. 22, 2020, 3:18 a.m. UTC
When kzalloc fails, nbd may try to destroy the
workqueue from inside the workqueue.

Suppose num_connections is m (2 < m), the first n
(1 < n < m) kzalloc calls succeed, and call n + 1
fails. nbd_start_device then returns -ENOMEM to
nbd_start_device_ioctl, which returns immediately
without running flush_workqueue. However, n recv
threads still exist. If nbd_release runs first, a
recv thread may drop the last config_refs and try
to destroy the workqueue from inside the workqueue.

To fix this, add a flush_workqueue call to
nbd_start_device.

Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Sun Ke <sunke32@huawei.com>
---
 drivers/block/nbd.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Sun Ke Feb. 4, 2020, 2:28 a.m. UTC | #1
ping

On 2020/1/22 11:18, Sun Ke wrote:
> When kzalloc fails, nbd may try to destroy the
> workqueue from inside the workqueue.
>
> Suppose num_connections is m (2 < m), the first n
> (1 < n < m) kzalloc calls succeed, and call n + 1
> fails. nbd_start_device then returns -ENOMEM to
> nbd_start_device_ioctl, which returns immediately
> without running flush_workqueue. However, n recv
> threads still exist. If nbd_release runs first, a
> recv thread may drop the last config_refs and try
> to destroy the workqueue from inside the workqueue.
>
> To fix this, add a flush_workqueue call to
> nbd_start_device.
> 
> Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
> Signed-off-by: Sun Ke <sunke32@huawei.com>
> ---
>   drivers/block/nbd.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index b4607dd96185..78181908f0df 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1265,6 +1265,16 @@ static int nbd_start_device(struct nbd_device *nbd)
>   		args = kzalloc(sizeof(*args), GFP_KERNEL);
>   		if (!args) {
>   			sock_shutdown(nbd);
> +			/*
> +			 * If num_connections is m (2 < m),
> +			 * and NO.1 ~ NO.n(1 < n < m) kzallocs are successful.
> +			 * But NO.(n + 1) failed. We still have n recv threads.
> +			 * So, add flush_workqueue here to prevent recv threads
> +			 * dropping the last config_refs and trying to destroy
> +			 * the workqueue from inside the workqueue.
> +			 */
> +			if (i)
> +				flush_workqueue(nbd->recv_workq);
>   			return -ENOMEM;
>   		}
>   		sk_set_memalloc(config->socks[i]->sock->sk);
>
Jens Axboe Feb. 4, 2020, 2:41 a.m. UTC | #2
On 2/3/20 7:28 PM, sunke (E) wrote:
> ping

Maybe I forgot to reply, but I queued it up last week:

https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.6&id=5c0dd228b5fc30a3b732c7ae2657e0161ec7ed80

Patch

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b4607dd96185..78181908f0df 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1265,6 +1265,16 @@  static int nbd_start_device(struct nbd_device *nbd)
 		args = kzalloc(sizeof(*args), GFP_KERNEL);
 		if (!args) {
 			sock_shutdown(nbd);
+			/*
+			 * If num_connections is m (2 < m),
+			 * and NO.1 ~ NO.n(1 < n < m) kzallocs are successful.
+			 * But NO.(n + 1) failed. We still have n recv threads.
+			 * So, add flush_workqueue here to prevent recv threads
+			 * dropping the last config_refs and trying to destroy
+			 * the workqueue from inside the workqueue.
+			 */
+			if (i)
+				flush_workqueue(nbd->recv_workq);
 			return -ENOMEM;
 		}
 		sk_set_memalloc(config->socks[i]->sock->sk);
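
The ordering problem described in the commit message can be illustrated with a small single-threaded userspace model. This is only a sketch, not kernel code: every name below (toy_dev, toy_config_put, toy_run_pending, toy_start_device, toy_scenario) is hypothetical, the "workqueue" is reduced to a counter of pending work items, and the model assumes that each queued recv work item holds one config reference that it drops when it finishes. It shows why draining the queued work on the error path keeps the final reference drop out of worker context.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the nbd device state; all names are hypothetical. */
struct toy_dev {
	int config_refs;          /* models config_refs */
	int pending;              /* recv work items queued but not yet run */
	bool in_worker;           /* true while a work item is executing */
	bool destroyed_in_worker; /* last ref was dropped from a work item */
};

/* Drop one reference; record whether the final drop happened in a worker. */
static void toy_config_put(struct toy_dev *d)
{
	assert(d->config_refs > 0);
	if (--d->config_refs == 0 && d->in_worker)
		d->destroyed_in_worker = true;
}

/* Models flush_workqueue: run every queued work item to completion. */
static void toy_run_pending(struct toy_dev *d)
{
	while (d->pending > 0) {
		d->pending--;
		d->in_worker = true;
		toy_config_put(d); /* each recv work drops its own ref */
		d->in_worker = false;
	}
}

/* fail_at is the index of the allocation that is assumed to fail. */
static int toy_start_device(struct toy_dev *d, int num_connections,
			    int fail_at, bool flush_on_error)
{
	for (int i = 0; i < num_connections; i++) {
		if (i == fail_at) {
			/* Mirrors the patch: flush only if work was queued. */
			if (flush_on_error && i)
				toy_run_pending(d);
			return -ENOMEM;
		}
		d->config_refs++; /* ref held on behalf of the recv work */
		d->pending++;
	}
	return 0;
}

/* Returns true if the last reference was dropped from worker context. */
static bool toy_scenario(bool flush_on_error)
{
	struct toy_dev d = { .config_refs = 1 }; /* ref held by the opener */

	toy_start_device(&d, 4, 2, flush_on_error);
	toy_config_put(&d);  /* release path runs first */
	toy_run_pending(&d); /* any work still queued runs afterwards */
	return d.destroyed_in_worker;
}
```

In this model, toy_scenario(false) reports that the last reference was dropped from inside a work item, which corresponds to the case the patch guards against, while toy_scenario(true) leaves the final drop to the release path. The real driver runs recv work concurrently, so the model captures only the ordering, not the locking or timing.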