nbd: create a recv workqueue per nbd device

Message ID 1484322673-10606-1-git-send-email-jbacik@fb.com (mailing list archive)
State New, archived

Commit Message

Josef Bacik Jan. 13, 2017, 3:51 p.m. UTC
Since we are in the memory reclaim path we need our recv work to be on a
workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks.  Also
set WQ_HIGHPRI since we are in the completion path for IO.

Signed-off-by: Josef Bacik <jbacik@fb.com>
---
 drivers/block/nbd.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
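
For context on the two flags: WQ_MEM_RECLAIM guarantees the queue a dedicated rescuer thread so queued work can still make forward progress while the system is reclaiming memory, and WQ_HIGHPRI places work items on the high-priority worker pool. Below is a minimal, self-contained sketch of that part of the workqueue API; all identifiers are illustrative, not taken from the driver (the real hunks are in the patch at the bottom).

#include <linux/errno.h>
#include <linux/workqueue.h>

static struct workqueue_struct *demo_wq;	/* illustrative name */
static struct work_struct demo_work;

static void demo_recv_fn(struct work_struct *work)
{
	/* runs in process context on a rescuer-backed, high-priority pool */
}

static int demo_setup(void)
{
	/* max_active = 0 selects the default concurrency limit */
	demo_wq = alloc_workqueue("demo-recv",
				  WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
	if (!demo_wq)
		return -ENOMEM;

	INIT_WORK(&demo_work, demo_recv_fn);
	queue_work(demo_wq, &demo_work);
	return 0;
}

static void demo_teardown(void)
{
	destroy_workqueue(demo_wq);	/* waits for pending work to finish */
}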

Comments

Sagi Grimberg Jan. 13, 2017, 10:24 p.m. UTC | #1
Hey Josef,

> Since we are in the memory reclaim path we need our recv work to be on a
> workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks.  Also
> set WQ_HIGHPRI since we are in the completion path for IO.

Really a workqueue per device?? Did this really give a performance
advantage? Can this really scale with the number of devices?
Josef Bacik Jan. 14, 2017, 1:04 a.m. UTC | #2
> On Jan 13, 2017, at 5:24 PM, Sagi Grimberg <sagi@grimberg.me> wrote:
> 
> Hey Josef,
> 
>> Since we are in the memory reclaim path we need our recv work to be on a
>> workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks.  Also
>> set WQ_HIGHPRI since we are in the completion path for IO.
> 
> Really a workqueue per device?? Did this really give a performance
> advantage? Can this really scale with the number of devices?

I don't see why not, especially since these things run the whole time the device is active.  I have patches forthcoming to make device creation dynamic so we don't have a bunch all at once.  That being said, I'm not married to the idea; it just seemed like a good idea at the time and not particularly harmful.  Thanks,

Josef
Sagi Grimberg Jan. 14, 2017, 9:15 p.m. UTC | #3
>> Hey Josef,
>>
>>> Since we are in the memory reclaim path we need our recv work to be on a
>>> workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks.  Also
>>> set WQ_HIGHPRI since we are in the completion path for IO.
>>
>> Really a workqueue per device?? Did this really give a performance
>> advantage? Can this really scale with the number of devices?
>
> I don't see why not, especially since these things run the whole time the device is active.  I have patches forthcoming to make device creation dynamic so we don't have a bunch all at once.  That being said, I'm not married to the idea; it just seemed like a good idea at the time and not particularly harmful.  Thanks,

I just don't see how having a workqueue per device helps anything. There
are plenty of active workers per workqueue, and even if it's not enough
you can specify more with max_active.

I guess what I'm trying to say is that I don't understand what this is
solving. The commit message explains why you need WQ_MEM_RECLAIM and why
you want WQ_HIGHPRI, but it does not explain why a workqueue per device
helps or solves anything.
Josef Bacik Jan. 14, 2017, 9:27 p.m. UTC | #4
> On Jan 14, 2017, at 4:15 PM, Sagi Grimberg <sagi@grimberg.me> wrote:
> 
> 
>>> Hey Josef,
>>> 
>>>> Since we are in the memory reclaim path we need our recv work to be on a
>>>> workqueue that has WQ_MEM_RECLAIM set so we can avoid deadlocks.  Also
>>>> set WQ_HIGHPRI since we are in the completion path for IO.
>>> 
>>> Really a workqueue per device?? Did this really give a performance
>>> advantage? Can this really scale with the number of devices?
>> 
>> I don't see why not, especially since these things run the whole time the device is active.  I have patches forthcoming to make device creation dynamic so we don't have a bunch all at once.  That being said, I'm not married to the idea; it just seemed like a good idea at the time and not particularly harmful.  Thanks,
> 
> I just don't see how having a workqueue per device helps anything. There
> are plenty of active workers per workqueue, and even if it's not enough
> you can specify more with max_active.
> 
> I guess what I'm trying to say is that I don't understand what this is
> solving. The commit message explains why you need WQ_MEM_RECLAIM and why
> you want WQ_HIGHPRI, but it does not explain why a workqueue per device
> helps or solves anything.

There's no reason for it; that's just the way I did it. I will test both ways on Tuesday, and if there's no measurable difference I'll do a global one.  Thanks,

Josef
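
For illustration only, the global variant being discussed could look roughly like the sketch below: one module-level queue allocated once and shared by every device, with max_active left at the default so many devices can still have recv work in flight concurrently. The queue name "knbd-recv" is taken from the posted patch; the other identifiers (shared_recv_wq, recv_args, recv_fn) are hypothetical, not from the driver.

#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/workqueue.h>

/* hypothetical sketch of the shared-queue alternative, not the posted patch */
static struct workqueue_struct *shared_recv_wq;

struct recv_args {
	struct work_struct work;
	int index;			/* which device/socket this recv serves */
};

static void recv_fn(struct work_struct *work)
{
	struct recv_args *args = container_of(work, struct recv_args, work);

	/* the per-connection receive loop would run here */
	(void)args;
}

static int shared_recv_setup(struct recv_args *args, int nr)
{
	int i;

	/* one rescuer-backed, high-priority queue shared by all devices */
	shared_recv_wq = alloc_workqueue("knbd-recv",
					 WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
	if (!shared_recv_wq)
		return -ENOMEM;

	for (i = 0; i < nr; i++) {
		INIT_WORK(&args[i].work, recv_fn);
		args[i].index = i;
		queue_work(shared_recv_wq, &args[i].work);
	}
	return 0;
}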

Patch

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 99c8446..e0a8d51 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -70,6 +70,7 @@  struct nbd_device {
 	struct task_struct *task_recv;
 	struct task_struct *task_setup;
 
+	struct workqueue_struct *recv_workqueue;
 #if IS_ENABLED(CONFIG_DEBUG_FS)
 	struct dentry *dbg_dir;
 #endif
@@ -787,7 +788,7 @@  static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 			INIT_WORK(&args[i].work, recv_work);
 			args[i].nbd = nbd;
 			args[i].index = i;
-			queue_work(system_long_wq, &args[i].work);
+			queue_work(nbd->recv_workqueue, &args[i].work);
 		}
 		wait_event_interruptible(nbd->recv_wq,
 					 atomic_read(&nbd->recv_threads) == 0);
@@ -1074,6 +1075,16 @@  static int __init nbd_init(void)
 			goto out;
 		}
 
+		nbd_dev[i].recv_workqueue =
+			alloc_workqueue("knbd-recv",
+					WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
+		if (!nbd_dev[i].recv_workqueue) {
+			blk_mq_free_tag_set(&nbd_dev[i].tag_set);
+			blk_cleanup_queue(disk->queue);
+			put_disk(disk);
+			goto out;
+		}
+
 		/*
 		 * Tell the block layer that we are not a rotational device
 		 */
@@ -1115,6 +1126,7 @@  static int __init nbd_init(void)
 		blk_mq_free_tag_set(&nbd_dev[i].tag_set);
 		blk_cleanup_queue(nbd_dev[i].disk->queue);
 		put_disk(nbd_dev[i].disk);
+		destroy_workqueue(nbd_dev[i].recv_workqueue);
 	}
 	kfree(nbd_dev);
 	return err;
@@ -1134,6 +1146,7 @@  static void __exit nbd_cleanup(void)
 			blk_cleanup_queue(disk->queue);
 			blk_mq_free_tag_set(&nbd_dev[i].tag_set);
 			put_disk(disk);
+			destroy_workqueue(nbd_dev[i].recv_workqueue);
 		}
 	}
 	unregister_blkdev(NBD_MAJOR, "nbd");