blk-mq: test QUEUE_FLAG_HCTX_ACTIVE for sbitmap_shared in hctx_may_queue

Message ID 20201227113458.3289082-1-ming.lei@redhat.com (mailing list archive)
State New, archived
Series blk-mq: test QUEUE_FLAG_HCTX_ACTIVE for sbitmap_shared in hctx_may_queue

Commit Message

Ming Lei Dec. 27, 2020, 11:34 a.m. UTC
In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.

So fix it.

Cc: John Garry <john.garry@huawei.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

John Garry Jan. 4, 2021, 10:41 a.m. UTC | #1
On 27/12/2020 11:34, Ming Lei wrote:
> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
> 
> So fix it.
> 
> Cc: John Garry <john.garry@huawei.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Reviewed-by: John Garry <john.garry@huawei.com>

> ---
>   block/blk-mq.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.h b/block/blk-mq.h
> index c1458d9502f1..3616453ca28c 100644
> --- a/block/blk-mq.h
> +++ b/block/blk-mq.h
> @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
>   		struct request_queue *q = hctx->queue;
>   		struct blk_mq_tag_set *set = q->tag_set;
>   
> -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
> +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))

I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit 
index 1, and for q->queue_flags that means QUEUE_FLAG_DYING bit, which I 
figure is not set normally..

>   			return true;
>   		users = atomic_read(&set->active_queues_shared_sbitmap);
>   	} else {
>
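To make the collision John describes concrete, here is a small self-contained
userspace sketch (not kernel code; the bit values follow the v5.10-era headers
and are an assumption, since they can change between releases) of why the old
test effectively checked QUEUE_FLAG_DYING:

#include <stdio.h>

/* Bit values per the v5.10-era headers (assumption: subject to change).
 * BLK_MQ_S_* bits index hctx->state, while QUEUE_FLAG_* bits index
 * q->queue_flags -- two different namespaces that happen to overlap. */
#define BLK_MQ_S_TAG_ACTIVE     1	/* an hctx->state bit */
#define QUEUE_FLAG_DYING        1	/* what bit 1 means in queue_flags */
#define QUEUE_FLAG_HCTX_ACTIVE 28	/* the bit hctx_may_queue() should test */

int main(void)
{
	/* A live queue whose shared sbitmap has active users, i.e. the
	 * state blk_mq_tag_busy() records for a shared sbitmap. */
	unsigned long queue_flags = 1UL << QUEUE_FLAG_HCTX_ACTIVE;

	/* Buggy test: bit 1 of queue_flags is QUEUE_FLAG_DYING, which is
	 * normally clear, so the early "return true" always fired. */
	printf("buggy test sees active: %d\n",
	       !!(queue_flags & (1UL << BLK_MQ_S_TAG_ACTIVE)));
	printf("fixed test sees active: %d\n",
	       !!(queue_flags & (1UL << QUEUE_FLAG_HCTX_ACTIVE)));
	return 0;
}

The first printf prints 0 on a healthy queue, which is exactly the
"always return true" behaviour Ming describes below.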
Ming Lei Jan. 5, 2021, 2:20 a.m. UTC | #2
On Mon, Jan 04, 2021 at 10:41:36AM +0000, John Garry wrote:
> On 27/12/2020 11:34, Ming Lei wrote:
> > In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> > q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
> > 
> > So fix it.
> > 
> > Cc: John Garry <john.garry@huawei.com>
> > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> 
> Reviewed-by: John Garry <john.garry@huawei.com>
> 
> > ---
> >   block/blk-mq.h | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/block/blk-mq.h b/block/blk-mq.h
> > index c1458d9502f1..3616453ca28c 100644
> > --- a/block/blk-mq.h
> > +++ b/block/blk-mq.h
> > @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
> >   		struct request_queue *q = hctx->queue;
> >   		struct blk_mq_tag_set *set = q->tag_set;
> > -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
> > +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
> 
> I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit index
> 1, and for q->queue_flags that means QUEUE_FLAG_DYING bit, which I figure is
> not set normally..

It always returns true, and might just take a bit more CPU, especially since
the tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.

Thanks,
Ming
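What that early return skips is the fair-share throttle at the tail of
hctx_may_queue(). A minimal userspace sketch of the arithmetic (a simplified
reconstruction from block/blk-mq.h; the real code reads atomic counters and
ends by comparing against the hctx's active request count):

#include <stdio.h>

/* Fair-share tag depth, simplified from the tail of hctx_may_queue():
 * each of 'users' active queues gets roughly depth/users tags, with a
 * floor of 4, so one busy queue cannot starve the rest. */
static unsigned int fair_share_depth(unsigned int sb_depth, unsigned int users)
{
	unsigned int depth;

	if (!users)
		return sb_depth;		/* no contention, no limit */
	depth = (sb_depth + users - 1) / users;	/* divide, rounding up */
	return depth < 4 ? 4 : depth;
}

int main(void)
{
	/* With the bug, this split never ran for shared-sbitmap hosts. */
	printf("4000 tags, 12 queues -> %u tags each\n",
	       fair_share_depth(4000, 12));
	printf(" 260 tags, 12 queues -> %u tags each\n",
	       fair_share_depth(260, 12));
	return 0;
}

At 260 tags over 12 active queues this works out to roughly 22 tags per
queue, which lines up with the much lower IOPS in the 260-depth column of
the results below.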
John Garry Jan. 5, 2021, 10:04 a.m. UTC | #3
On 05/01/2021 02:20, Ming Lei wrote:
> On Mon, Jan 04, 2021 at 10:41:36AM +0000, John Garry wrote:
>> On 27/12/2020 11:34, Ming Lei wrote:
>>> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
>>> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
>>>
>>> So fix it.
>>>
>>> Cc: John Garry<john.garry@huawei.com>
>>> Cc: Kashyap Desai<kashyap.desai@broadcom.com>
>>> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
>>> Signed-off-by: Ming Lei<ming.lei@redhat.com>
>> Reviewed-by: John Garry<john.garry@huawei.com>
>>
>>> ---
>>>    block/blk-mq.h | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-mq.h b/block/blk-mq.h
>>> index c1458d9502f1..3616453ca28c 100644
>>> --- a/block/blk-mq.h
>>> +++ b/block/blk-mq.h
>>> @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
>>>    		struct request_queue *q = hctx->queue;
>>>    		struct blk_mq_tag_set *set = q->tag_set;
>>> -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
>>> +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
>> I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit index
>> 1, and for q->queue_flags that means QUEUE_FLAG_DYING bit, which I figure is
>> not set normally..
> It always returns true, and might just take a bit more CPU, especially since
> the tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.

Hi Ming,

Right, but we actually tested by hacking the host tag queue depth to be 
lower such that we should have tag contention, here is an extract from 
the original series cover letter for my results:

Tag depth 		4000 (default)		260**

Baseline (v5.9-rc1):
none sched:		2094K IOPS		513K
mq-deadline sched:	2145K IOPS		1336K

Final, host_tagset=0 in LLDD *, ***:
none sched:		2120K IOPS		550K
mq-deadline sched:	2121K IOPS		1309K

Final ***:
none sched:		2132K IOPS		1185K
mq-deadline sched:	2145K IOPS		2097K

Maybe my test did not expose the issue. Kashyap also tested this and 
reported the original issue such that we needed this feature, so I'm 
confused.

Thanks,
John
Ming Lei Jan. 5, 2021, 11:18 a.m. UTC | #4
On Tue, Jan 05, 2021 at 10:04:58AM +0000, John Garry wrote:
> On 05/01/2021 02:20, Ming Lei wrote:
> > On Mon, Jan 04, 2021 at 10:41:36AM +0000, John Garry wrote:
> > > On 27/12/2020 11:34, Ming Lei wrote:
> > > > In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> > > > q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
> > > > 
> > > > So fix it.
> > > > 
> > > > Cc: John Garry<john.garry@huawei.com>
> > > > Cc: Kashyap Desai<kashyap.desai@broadcom.com>
> > > > Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> > > > Signed-off-by: Ming Lei<ming.lei@redhat.com>
> > > Reviewed-by: John Garry<john.garry@huawei.com>
> > > 
> > > > ---
> > > >    block/blk-mq.h | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/block/blk-mq.h b/block/blk-mq.h
> > > > index c1458d9502f1..3616453ca28c 100644
> > > > --- a/block/blk-mq.h
> > > > +++ b/block/blk-mq.h
> > > > @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
> > > >    		struct request_queue *q = hctx->queue;
> > > >    		struct blk_mq_tag_set *set = q->tag_set;
> > > > -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
> > > > +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
> > > I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit index
> > > 1, and for q->queue_flags that means QUEUE_FLAG_DYING bit, which I figure is
> > > not set normally..
> > It always returns true, and might just take a bit more CPU, especially since
> > the tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.
> 
> Hi Ming,
> 
> Right, but we actually tested by hacking the host tag queue depth to be
> lower such that we should have tag contention, here is an extract from the
> original series cover letter for my results:
> 
> Tag depth 		4000 (default)		260**
> 
> Baseline (v5.9-rc1):
> none sched:		2094K IOPS		513K
> mq-deadline sched:	2145K IOPS		1336K
> 
> Final, host_tagset=0 in LLDD *, ***:
> none sched:		2120K IOPS		550K
> mq-deadline sched:	2121K IOPS		1309K
> 
> Final ***:
> none sched:		2132K IOPS		1185K
> mq-deadline sched:	2145K IOPS		2097K
> 
> Maybe my test did not expose the issue. Kashyap also tested this and
> reported the original issue such that we needed this feature, so I'm
> confused.

How many LUNs are involved in the above test with 260 depth?


Thanks,
Ming
John Garry Jan. 5, 2021, 11:38 a.m. UTC | #5
On 05/01/2021 11:18, Ming Lei wrote:
>>>> not set normally..
>>> It always returns true, and might just take a bit more CPU, especially since
>>> the tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.
>> Hi Ming,
>>
>> Right, but we actually tested by hacking the host tag queue depth to be
>> lower such that we should have tag contention, here is an extract from the
>> original series cover letter for my results:
>>
>> Tag depth 		4000 (default)		260**
>>
>> Baseline (v5.9-rc1):
>> none sched:		2094K IOPS		513K
>> mq-deadline sched:	2145K IOPS		1336K
>>
>> Final, host_tagset=0 in LLDD *, ***:
>> none sched:		2120K IOPS		550K
>> mq-deadline sched:	2121K IOPS		1309K
>>
>> Final ***:
>> none sched:		2132K IOPS		1185K
>> mq-deadline sched:	2145K IOPS		2097K
>>
>> Maybe my test did not expose the issue. Kashyap also tested this and
>> reported the original issue such that we needed this feature, so I'm
>> confused.

Hi Ming,

> How many LUNs are involved in the above test with 260 depth?

For me, there was 12 SAS SSDs; for convenience here is the cover letter 
with details:
https://lore.kernel.org/linux-block/1597850436-116171-1-git-send-email-john.garry@huawei.com/

IIRC, for megaraid sas, Kashyap used many more LUNs for testing (64) and 
high fio depth (128) but did not reduce .can_queue, topic originally 
raised here:
https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/

Thanks,
John
Ming Lei Jan. 6, 2021, 1:28 a.m. UTC | #6
On Tue, Jan 05, 2021 at 11:38:48AM +0000, John Garry wrote:
> On 05/01/2021 11:18, Ming Lei wrote:
> > > > > not set normally..
> > > > It always returns true, and might just take a bit more CPU, especially since
> > > > the tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.
> > > Hi Ming,
> > > 
> > > Right, but we actually tested by hacking the host tag queue depth to be
> > > lower such that we should have tag contention, here is an extract from the
> > > original series cover letter for my results:
> > > 
> > > Tag depth 		4000 (default)		260**
> > > 
> > > Baseline (v5.9-rc1):
> > > none sched:		2094K IOPS		513K
> > > mq-deadline sched:	2145K IOPS		1336K
> > > 
> > > Final, host_tagset=0 in LLDD *, ***:
> > > none sched:		2120K IOPS		550K
> > > mq-deadline sched:	2121K IOPS		1309K
> > > 
> > > Final ***:
> > > none sched:		2132K IOPS		1185K
> > > mq-deadline sched:	2145K IOPS		2097K
> > > 
> > > Maybe my test did not expose the issue. Kashyap also tested this and
> > > reported the original issue such that we needed this feature, so I'm
> > > confused.
> 
> Hi Ming,
> 
> > How many LUNs are involved in the above test with 260 depth?
> 
> For me, there was 12 SAS SSDs; for convenience here is the cover letter with
> details:
> https://lore.kernel.org/linux-block/1597850436-116171-1-git-send-email-john.garry@huawei.com/
> 
> IIRC, for megaraid sas, Kashyap used many more LUNs for testing (64) and
> high fio depth (128) but did not reduce .can_queue, topic originally raised
> here:
> https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/

OK, in both tests nr_luns is big enough wrt. the 260 depth. Maybe that is
why very low IOPS is observed in 'Final (hosttag=1)' with 260 depth.

I'd suggest to run your previous test again after applying this patch,
and see if difference can be observed.
John Garry Jan. 6, 2021, 11:38 a.m. UTC | #7
On 06/01/2021 01:28, Ming Lei wrote:
>>> How many LUNs are involved in the above test with 260 depth?
>> For me, there was 12 SAS SSDs; for convenience here is the cover letter with
>> details:
>> https://lore.kernel.org/linux-block/1597850436-116171-1-git-send-email-john.garry@huawei.com/
>>
>> IIRC, for megaraid sas, Kashyap used many more LUNs for testing (64) and
>> high fio depth (128) but did not reduce .can_queue, topic originally raised
>> here:
>> https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/
> OK, in both tests nr_luns is big enough wrt. the 260 depth. Maybe that is
> why very low IOPS is observed in 'Final (hosttag=1)' with 260 depth.
> 
> I'd suggest to run your previous test again after applying this patch,
> and see if difference can be observed.

Hi Ming,

I tested and didn't see a noticeable difference with the fix when using
the reduced tag queue depth. I got ~500K IOPS with a tag queue depth of
260, as opposed to 2M with the full tag queue depth. However, I was
doubtful about this test method before. Regardless, your change and this
feature still look proper.

@Kashyap, it would be great if you guys could test this also on that 
same setup you described previously:

https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/

Thanks,
John
Ming Lei Jan. 25, 2021, 2:29 a.m. UTC | #8
On Sun, Dec 27, 2020 at 07:34:58PM +0800, Ming Lei wrote:
> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
> 
> So fix it.
> 
> Cc: John Garry <john.garry@huawei.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Hello Jens,

This one fixes a v5.11 issue, can you queue it?


Thanks, 
Ming
Jens Axboe Jan. 25, 2021, 4:25 a.m. UTC | #9
On 1/24/21 7:29 PM, Ming Lei wrote:
> On Sun, Dec 27, 2020 at 07:34:58PM +0800, Ming Lei wrote:
>> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
>> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
>>
>> So fix it.
>>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> 
> Hello Jens,
> 
> This one fixes a v5.11 issue, can you queue it?

Queued up, thanks.

Patch

diff --git a/block/blk-mq.h b/block/blk-mq.h
index c1458d9502f1..3616453ca28c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -304,7 +304,7 @@  static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 		struct request_queue *q = hctx->queue;
 		struct blk_mq_tag_set *set = q->tag_set;
 
-		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
+		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return true;
 		users = atomic_read(&set->active_queues_shared_sbitmap);
 	} else {
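For context, a sketch of the whole function with the fix applied,
reconstructed from the v5.11-era block/blk-mq.h (assumption: surrounding
details may differ slightly in other trees):

/*
 * For shared tag users, we track the number of currently active users
 * and attempt to provide a fair share of the tag depth for each of them.
 */
static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
				  struct sbitmap_queue *bt)
{
	unsigned int depth, users;

	if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
		return true;

	/*
	 * Don't try dividing an ant
	 */
	if (bt->sb.depth == 1)
		return true;

	if (blk_mq_is_sbitmap_shared(hctx->flags)) {
		struct request_queue *q = hctx->queue;
		struct blk_mq_tag_set *set = q->tag_set;

		/* queue_flags wants a QUEUE_FLAG_* bit, not a BLK_MQ_S_* one */
		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
			return true;
		users = atomic_read(&set->active_queues_shared_sbitmap);
	} else {
		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
			return true;
		users = atomic_read(&hctx->tags->active_queues);
	}

	if (!users)
		return true;

	/*
	 * Allow at least some tags
	 */
	depth = max((bt->sb.depth + users - 1) / users, 4U);
	return __blk_mq_active_requests(hctx) < depth;
}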