[RFC,V2,04/17] blk-mq: don't reserve tags for admin queue

Message ID	20180811071220.357-5-ming.lei@redhat.com (mailing list archive)
State	Changes Requested
Headers
Return-Path: <linux-scsi-owner@kernel.org>
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, Ming Lei <ming.lei@redhat.com>, Alan Stern <stern@rowland.harvard.edu>, Christoph Hellwig <hch@lst.de>, Bart Van Assche <bart.vanassche@wdc.com>, Jianchao Wang <jianchao.w.wang@oracle.com>, Hannes Reinecke <hare@suse.de>, Johannes Thumshirn <jthumshirn@suse.de>, Adrian Hunter <adrian.hunter@intel.com>, "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>, "Martin K. Petersen" <martin.petersen@oracle.com>, linux-scsi@vger.kernel.org
Subject: [RFC PATCH V2 04/17] blk-mq: don't reserve tags for admin queue
Date: Sat, 11 Aug 2018 15:12:07 +0800
Message-Id: <20180811071220.357-5-ming.lei@redhat.com>
In-Reply-To: <20180811071220.357-1-ming.lei@redhat.com>
References: <20180811071220.357-1-ming.lei@redhat.com>
Sender: linux-scsi-owner@vger.kernel.org
Precedence: bulk
Series	SCSI: introduce per-host admin queue & enable runtime PM
  [RFC,V2,00/17] SCSI: introduce per-host admin queue & enable runtime PM
  [RFC,V2,01/17] blk-mq: allow to pass default queue flags for creating & initializing queue
  [RFC,V2,02/17] blk-mq: convert BLK_MQ_F_NO_SCHED into per-queue flag
  [RFC,V2,03/17] block: rename QUEUE_FLAG_NO_SCHED as QUEUE_FLAG_ADMIN
  [RFC,V2,04/17] blk-mq: don't reserve tags for admin queue
  [RFC,V2,05/17] SCSI: try to retrieve request_queue via 'scsi_cmnd' if possible
  [RFC,V2,06/17] SCSI: pass 'scsi_device' instance from 'scsi_request'
  [RFC,V2,07/17] SCSI: prepare for introducing admin queue for legacy path
  [RFC,V2,08/17] SCSI: pass scsi_device to scsi_mq_prep_fn
  [RFC,V2,09/17] SCSI: don't set .queuedata in scsi_mq_alloc_queue()
  [RFC,V2,10/17] SCSI: deal with admin queue busy
  [RFC,V2,11/17] SCSI: track pending admin commands
  [RFC,V2,12/17] SCSI: create admin queue for each host
  [RFC,V2,13/17] SCSI: use the dedicated admin queue to send admin commands
  [RFC,V2,14/17] SCSI: transport_spi: resume a quiesced device
  [RFC,V2,15/17] SCSI: use admin queue to implement queue QUIESCE
  [RFC,V2,16/17] block: simplify runtime PM support
  [RFC,V2,17/17] block: enable runtime PM for blk-mq

Ming Lei Aug. 11, 2018, 7:12 a.m. UTC

It is not necessary to reserve tags for the admin queue, since there
usually aren't many in-flight commands in the admin queue.

This change won't starve the admin queue either, because each blocked
queue has equal priority to get a new tag when a driver tag is released,
no matter which queue it is freed from.

So IO performance won't be affected after the admin queue (which shares
tags with the IO queues) is introduced in the following patches.

Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jianchao Wang <jianchao.w.wang@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq-tag.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
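
The accounting described in the commit message can be pictured with a toy
user-space model. This is only a sketch and not the hunk from
block/blk-mq-tag.c: every toy_* name is hypothetical, and it assumes that the
admin queue is still marked as tag-active while being left out of the
active_queues count.

```c
#include <stdbool.h>

/* Toy stand-in for the shared tag set; hypothetical, not the kernel struct. */
struct toy_tags {
	unsigned int active_queues;	/* queues that count toward the fair share */
};

/* Toy stand-in for one queue that draws from the shared tag set. */
struct toy_queue {
	struct toy_tags *tags;
	bool is_admin;		/* models a queue flagged as the admin queue */
	bool tag_active;	/* models the "this queue is competing for tags" bit */
};

/* A queue starts competing for tags: only IO queues bump active_queues. */
static void toy_tag_busy(struct toy_queue *q)
{
	if (q->tag_active)
		return;
	q->tag_active = true;
	if (!q->is_admin)
		q->tags->active_queues++;
}

/* A queue goes idle: undo exactly what toy_tag_busy() did. */
static void toy_tag_idle(struct toy_queue *q)
{
	if (!q->tag_active)
		return;
	q->tag_active = false;
	if (!q->is_admin)
		q->tags->active_queues--;
}
```

With eight busy IO queues and one busy admin queue, active_queues stays at
eight in this model, so the per-queue share seen by the IO queues is the same
as if the admin queue did not exist.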

jianchao.wang Aug. 13, 2018, 10:02 a.m. UTC | #1

Hi Ming

On 08/11/2018 03:12 PM, Ming Lei wrote:
> Not necessary to reserve tags for admin queue since there isn't
> many inflight commands in admin queue usually.
> 
> This change won't starve admin queue too because each blocked queue
> has equal priority to get one new tag when one driver tag is released,
> no matter it is freed from any queue.
> 

We don't count the adminq in tags->active_queues, so there may be the following side effects:
 - If we send an admin request for LUN_a, it may prevent LUN_b from getting a request even though
   its budget allows it. And when the admin request for LUN_a completes, LUN_b may not be in the
   first batch to be woken up.

 - All the LUNs still have to share a limited budget of tags.

Certainly, I'm not sure whether performance will be affected; all of the above comments are
just concerns in theory. ;)

Thanks
Jianchao

Ming Lei Aug. 13, 2018, 10:48 a.m. UTC | #2

On Mon, Aug 13, 2018 at 06:02:18PM +0800, jianchao.wang wrote:
> Hi Ming
> 
> On 08/11/2018 03:12 PM, Ming Lei wrote:
> > Not necessary to reserve tags for admin queue since there isn't
> > many inflight commands in admin queue usually.
> > 
> > This change won't starve admin queue too because each blocked queue
> > has equal priority to get one new tag when one driver tag is released,
> > no matter it is freed from any queue.
> > 
> 
> We don't count the adminq into tags->active_queues,

Yes, just like without this patchset.

> there maybe following side-effect following:
>  - if send a admin request for the LUN_a, it may cause LUN_b cannot get request even though its
>    budget is allowed. And when the admin request for LUN_a is completed, LUN_b may not be in the
>    first batch to be waked up.
> 
>  - all the LUNs still have to share a limited budget of tags

It has nothing to do with where the admin request is sent, so there is
no difference wrt. this issue with or without this patchset,
right?


Thanks,
Ming

jianchao.wang Aug. 14, 2018, 1:29 a.m. UTC | #3

Hi Ming

On 08/13/2018 06:48 PM, Ming Lei wrote:
> It is nothing to do with where the admin request is sent, so no any
> difference wrt. this issue between with and without this patchset,
> right?

I'm afraid not.

For example:
  A scsi host has 8 LUNs associated with it.
  Before this patch set,
  When we send out the admin command, the budget is _per_ LUN, 1/8 of the total tags.
  After this patch set,
  When we send out the admin command, the budget is equal to _one_ LUN, 1/8 of the total tags.

However, the two 1/8 figures above are different.
  Before the patch set, each LUN's admin commands have a 1/8 budget to use, which is per LUN.
  After this patch set, the admin commands of all 8 LUNs have to share a single 1/8 budget.
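
The arithmetic behind the two 1/8 figures can be sketched as below. The share
formula (round-up division with a floor of 4 tags) is modelled on the one used
by hctx_may_queue(); the toy_* names and the figure of 64 total tags used in
the note after the code are assumptions for illustration only.

```c
/*
 * Per-queue share of a shared tag set: total_tags / users rounded up,
 * but never fewer than 4 tags.
 */
static unsigned int toy_fair_depth(unsigned int total_tags, unsigned int users)
{
	unsigned int depth;

	if (users == 0)
		return total_tags;
	depth = (total_tags + users - 1) / users;
	return depth < 4 ? 4 : depth;
}

/*
 * Before the patch set: admin commands travel through each LUN's own
 * queue, so the host-wide ceiling is the sum of the per-LUN shares,
 * bounded by the number of tags that exist.
 */
static unsigned int toy_admin_cap_before(unsigned int total_tags,
					 unsigned int luns)
{
	unsigned int sum = luns * toy_fair_depth(total_tags, luns);

	return sum < total_tags ? sum : total_tags;
}

/*
 * After the patch set: one admin queue per host, not counted as a user,
 * but still held to the size of a single share.
 */
static unsigned int toy_admin_cap_after(unsigned int total_tags,
					unsigned int luns)
{
	return toy_fair_depth(total_tags, luns);
}
```

Assuming 64 tags and 8 LUNs, each share is 8 tags. Admin commands could
collectively reach 64 in flight before the patch set (only if no other IO is
using the per-LUN budgets), and 8 in flight after it.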

Thanks
Jianchao

Ming Lei Aug. 14, 2018, 2:10 a.m. UTC | #4

On Tue, Aug 14, 2018 at 09:29:25AM +0800, jianchao.wang wrote:
> Hi Ming
> 
> On 08/13/2018 06:48 PM, Ming Lei wrote:
> > It is nothing to do with where the admin request is sent, so no any
> > difference wrt. this issue between with and without this patchset,
> > right?
> 
> I'm afraid not.
> 
> For example:
>   A scsi host has 8 LUNs associated with it.
>   Before this patch set,
>   When we send out the admin command, the budget is _per_ LUN, 1/8 of the total tags.
>   After this patch set,
>   When we send out the admin command, the budget is equal to _one_ LUN, 1/8 of the total tags.
> 
> However, the 1/8 above is different.
>   Before the patch set, every LUN's admin command has 1/8 budget to use which is per LUN.

Strictly speaking, all admin commands and all other IOs share the 1/8 budget
if they are aimed at the same LUN.

>   After this patch set, all the 8 LUNs admin command has to share the 1/8 budget.

That only means the number of active admin commands won't be bigger than the 1/8 budget,
which is one extra implicit limit on the admin queue. However, every other LUN's budget is
still 1/8.

So performance for the IO queues won't be affected at all, will it?

scsi_execute_* can't be called often; it is really in the slow path, so I
don't think there is any possible performance effect from this patch. Or do
you have another performance concern wrt. this patch?

We still have q->queue_depth if we need to enforce any further limit on the admin queue,
but so far I don't see that it is necessary.

Thanks, 
Ming

jianchao.wang Aug. 14, 2018, 2:47 a.m. UTC | #5

Hi Ming

On 08/14/2018 10:10 AM, Ming Lei wrote:
> On Tue, Aug 14, 2018 at 09:29:25AM +0800, jianchao.wang wrote:
>> Hi Ming
>>
>> On 08/13/2018 06:48 PM, Ming Lei wrote:
>>> It is nothing to do with where the admin request is sent, so no any
>>> difference wrt. this issue between with and without this patchset,
>>> right?
>>
>> I'm afraid not.
>>
>> For example:
>>   A scsi host has 8 LUNs associated with it.
>>   Before this patch set,
>>   When we send out the admin command, the budget is _per_ LUN, 1/8 of the total tags.
>>   After this patch set,
>>   When we send out the admin command, the budget is equal to _one_ LUN, 1/8 of the total tags.
>>
>> However, the 1/8 above is different.
>>   Before the patch set, every LUN's admin command has 1/8 budget to use which is per LUN.
> 
> Strictly speaking, it is that all admin command and all other IOs share the 1/8 budget
> if they aimed at same LUN.

Yes.

> 
>>   After this patch set, all the 8 LUNs admin command has to share the 1/8 budget.
> 
> That only means number of active admin commands won't be bigger than 1/8 budget, which
> is one extra implicit limit on admin queue. However, other LUN's budget is still 1/8.
> 
> So performance for IO queue won't be affected at all, will it?
> 
> scsi_execute_* can't be called often, it is really in slow path, so I
> don't think there is any possible performance effect with this patch, or do
> you have other performance concern wrt. this patch?
> 
> We still have q->queue_depth for enhancing any limit for admin queue, but up to now,
> not see it is necessary.
> 

I agree with you that performance will not be affected.
But the adminq's budget here looks weird.
We don't reserve budget for the admin queue (it is not counted in tags->active_queues),
but the admin queue still has to comply with the limit in hctx_may_queue.

Since there usually aren't many in-flight admin commands, we could take the
admin queue out of the limit in hctx_may_queue; then things could be clearer. :)

Thanks
Jianchao

Ming Lei Aug. 14, 2018, 3:06 a.m. UTC | #6

On Tue, Aug 14, 2018 at 10:47:21AM +0800, jianchao.wang wrote:
> Hi Ming
> 
> On 08/14/2018 10:10 AM, Ming Lei wrote:
> > On Tue, Aug 14, 2018 at 09:29:25AM +0800, jianchao.wang wrote:
> >> Hi Ming
> >>
> >> On 08/13/2018 06:48 PM, Ming Lei wrote:
> >>> It is nothing to do with where the admin request is sent, so no any
> >>> difference wrt. this issue between with and without this patchset,
> >>> right?
> >>
> >> I'm afraid not.
> >>
> >> For example:
> >>   A scsi host has 8 LUNs associated with it.
> >>   Before this patch set,
> >>   When we send out the admin command, the budget is _per_ LUN, 1/8 of the total tags.
> >>   After this patch set,
> >>   When we send out the admin command, the budget is equal to _one_ LUN, 1/8 of the total tags.
> >>
> >> However, the 1/8 above is different.
> >>   Before the patch set, every LUN's admin command has 1/8 budget to use which is per LUN.
> > 
> > Strictly speaking, it is that all admin command and all other IOs share the 1/8 budget
> > if they aimed at same LUN.
> 
> Yes.
> 
> > 
> >>   After this patch set, all the 8 LUNs admin command has to share the 1/8 budget.
> > 
> > That only means number of active admin commands won't be bigger than 1/8 budget, which
> > is one extra implicit limit on admin queue. However, other LUN's budget is still 1/8.
> > 
> > So performance for IO queue won't be affected at all, will it?
> > 
> > scsi_execute_* can't be called often, it is really in slow path, so I
> > don't think there is any possible performance effect with this patch, or do
> > you have other performance concern wrt. this patch?
> > 
> > We still have q->queue_depth for enhancing any limit for admin queue, but up to now,
> > not see it is necessary.
> > 
> 
> I agree with you that the performance will not be affected.

OK, thanks for confirming.

> But the adminq's budget here looks weird.
> We don't reserve budget for admin queue (not count tag->active_queues for it).
> But the admin queue has to comply to the limit in hctx_may_queue.
> 
> Since the there isn't many in-flight admin commands usually, we could take
> admin queue here out of the limit of hctx_may_queue, then things could be clearer. :)

I just didn't want to add one line of code to the fast path of hctx_may_queue()
because it isn't necessary.

Now it looks like this approach may have one implicit benefit: avoiding too many
in-flight admin requests.

I will leave the code this way, but add a comment like the one below to hctx_may_queue():

	No need to deal with the admin queue specially here even though we
	don't count it in tags->active_queues, so the blk_queue_admin() check
	can be avoided in the fast path, with the implicit benefit of
	limiting the number of in-flight admin requests.
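
The two options discussed here can be compared in a small user-space sketch.
It is not the kernel's hctx_may_queue(): the toy_* names are hypothetical and
the per-queue depth is passed in already computed, to keep the focus on the
extra branch.

```c
#include <stdbool.h>

/* Toy stand-in for a hardware queue context; hypothetical fields. */
struct toy_hctx {
	unsigned int nr_active;	/* requests currently in flight on this queue */
	bool is_admin;		/* models what blk_queue_admin() would report */
};

/*
 * Variant kept in this thread: no special case, so the admin queue is
 * held to the same per-queue depth as any IO queue.
 */
static bool toy_may_queue(const struct toy_hctx *h, unsigned int depth)
{
	return h->nr_active < depth;
}

/*
 * Alternative raised in review: exempt the admin queue, which costs one
 * extra check on every call in the fast path.
 */
static bool toy_may_queue_exempt_admin(const struct toy_hctx *h,
				       unsigned int depth)
{
	if (h->is_admin)
		return true;
	return h->nr_active < depth;
}
```

With a depth of 8, an admin queue that already has 8 requests in flight is
refused by the first variant and allowed by the second; an IO queue is treated
identically by both.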


Thanks, 
Ming
