Message ID | 20201227113458.3289082-1-ming.lei@redhat.com (mailing list archive)
---|---
State | New, archived
Series | blk-mq: test QUEUE_FLAG_HCTX_ACTIVE for sbitmap_shared in hctx_may_queue
On 27/12/2020 11:34, Ming Lei wrote:
> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
>
> So fix it.
>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Reviewed-by: John Garry <john.garry@huawei.com>

> ---
>  block/blk-mq.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/blk-mq.h b/block/blk-mq.h
> index c1458d9502f1..3616453ca28c 100644
> --- a/block/blk-mq.h
> +++ b/block/blk-mq.h
> @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
>  		struct request_queue *q = hctx->queue;
>  		struct blk_mq_tag_set *set = q->tag_set;
>
> -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
> +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))

I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit index
1, and for q->queue_flags that means the QUEUE_FLAG_DYING bit, which I figure
is not set normally.

>  			return true;
>  		users = atomic_read(&set->active_queues_shared_sbitmap);
>  	} else {
On Mon, Jan 04, 2021 at 10:41:36AM +0000, John Garry wrote:
> On 27/12/2020 11:34, Ming Lei wrote:
> > In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> > q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
> >
> > So fix it.
> >
> > Cc: John Garry <john.garry@huawei.com>
> > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
>
> Reviewed-by: John Garry <john.garry@huawei.com>
>
> > ---
> >  block/blk-mq.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block/blk-mq.h b/block/blk-mq.h
> > index c1458d9502f1..3616453ca28c 100644
> > --- a/block/blk-mq.h
> > +++ b/block/blk-mq.h
> > @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
> >  		struct request_queue *q = hctx->queue;
> >  		struct blk_mq_tag_set *set = q->tag_set;
> >
> > -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
> > +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
>
> I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit index
> 1, and for q->queue_flags that means the QUEUE_FLAG_DYING bit, which I figure
> is not set normally.

It always returns true, and might just take a bit more CPU, especially as the
tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.

Thanks,
Ming
On 05/01/2021 02:20, Ming Lei wrote:
> On Mon, Jan 04, 2021 at 10:41:36AM +0000, John Garry wrote:
>> On 27/12/2020 11:34, Ming Lei wrote:
>>> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
>>> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
>>>
>>> So fix it.
>>>
>>> Cc: John Garry <john.garry@huawei.com>
>>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>>> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>> Reviewed-by: John Garry <john.garry@huawei.com>
>>
>>> ---
>>>  block/blk-mq.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-mq.h b/block/blk-mq.h
>>> index c1458d9502f1..3616453ca28c 100644
>>> --- a/block/blk-mq.h
>>> +++ b/block/blk-mq.h
>>> @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
>>>  		struct request_queue *q = hctx->queue;
>>>  		struct blk_mq_tag_set *set = q->tag_set;
>>>
>>> -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
>>> +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
>> I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit index
>> 1, and for q->queue_flags that means the QUEUE_FLAG_DYING bit, which I figure
>> is not set normally.
> It always returns true, and might just take a bit more CPU, especially as the
> tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.

Hi Ming,

Right, but we actually tested by hacking the host tag queue depth to be lower
such that we should have tag contention. Here is an extract from the original
series cover letter with my results:

Tag depth             4000 (default)   260**

Baseline (v5.9-rc1):
none sched:           2094K IOPS       513K
mq-deadline sched:    2145K IOPS       1336K

Final, host_tagset=0 in LLDD *, ***:
none sched:           2120K IOPS       550K
mq-deadline sched:    2121K IOPS       1309K

Final ***:
none sched:           2132K IOPS       1185
mq-deadline sched:    2145K IOPS       2097

Maybe my test did not expose the issue. Kashyap also tested this and reported
the original issue such that we needed this feature, so I'm confused.

Thanks,
John
On Tue, Jan 05, 2021 at 10:04:58AM +0000, John Garry wrote:
> On 05/01/2021 02:20, Ming Lei wrote:
> > On Mon, Jan 04, 2021 at 10:41:36AM +0000, John Garry wrote:
> > > On 27/12/2020 11:34, Ming Lei wrote:
> > > > In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> > > > q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
> > > >
> > > > So fix it.
> > > >
> > > > Cc: John Garry <john.garry@huawei.com>
> > > > Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> > > > Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > Reviewed-by: John Garry <john.garry@huawei.com>
> > >
> > > > ---
> > > >  block/blk-mq.h | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/block/blk-mq.h b/block/blk-mq.h
> > > > index c1458d9502f1..3616453ca28c 100644
> > > > --- a/block/blk-mq.h
> > > > +++ b/block/blk-mq.h
> > > > @@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
> > > >  		struct request_queue *q = hctx->queue;
> > > >  		struct blk_mq_tag_set *set = q->tag_set;
> > > >
> > > > -		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
> > > > +		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
> > > I wonder how this ever worked properly, as BLK_MQ_S_TAG_ACTIVE is bit index
> > > 1, and for q->queue_flags that means the QUEUE_FLAG_DYING bit, which I figure
> > > is not set normally.
> > It always returns true, and might just take a bit more CPU, especially as the
> > tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.
>
> Hi Ming,
>
> Right, but we actually tested by hacking the host tag queue depth to be
> lower such that we should have tag contention. Here is an extract from the
> original series cover letter with my results:
>
> Tag depth             4000 (default)   260**
>
> Baseline (v5.9-rc1):
> none sched:           2094K IOPS       513K
> mq-deadline sched:    2145K IOPS       1336K
>
> Final, host_tagset=0 in LLDD *, ***:
> none sched:           2120K IOPS       550K
> mq-deadline sched:    2121K IOPS       1309K
>
> Final ***:
> none sched:           2132K IOPS       1185
> mq-deadline sched:    2145K IOPS       2097
>
> Maybe my test did not expose the issue. Kashyap also tested this and
> reported the original issue such that we needed this feature, so I'm
> confused.

How many LUNs are involved in the above test with 260 depth?

Thanks,
Ming
On 05/01/2021 11:18, Ming Lei wrote:
>>>> ot set normally.
>>> It always returns true, and might just take a bit more CPU, especially as
>>> the tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.
>> Hi Ming,
>>
>> Right, but we actually tested by hacking the host tag queue depth to be
>> lower such that we should have tag contention. Here is an extract from the
>> original series cover letter with my results:
>>
>> Tag depth             4000 (default)   260**
>>
>> Baseline (v5.9-rc1):
>> none sched:           2094K IOPS       513K
>> mq-deadline sched:    2145K IOPS       1336K
>>
>> Final, host_tagset=0 in LLDD *, ***:
>> none sched:           2120K IOPS       550K
>> mq-deadline sched:    2121K IOPS       1309K
>>
>> Final ***:
>> none sched:           2132K IOPS       1185
>> mq-deadline sched:    2145K IOPS       2097
>>
>> Maybe my test did not expose the issue. Kashyap also tested this and
>> reported the original issue such that we needed this feature, so I'm
>> confused.

Hi Ming,

> How many LUNs are involved in the above test with 260 depth?

For me, there were 12 SAS SSDs; for convenience, here is the cover letter
with details:
https://lore.kernel.org/linux-block/1597850436-116171-1-git-send-email-john.garry@huawei.com/

IIRC, for megaraid_sas, Kashyap used many more LUNs for testing (64) and a
high fio depth (128) but did not reduce .can_queue; the topic was originally
raised here:
https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/

Thanks,
John
On Tue, Jan 05, 2021 at 11:38:48AM +0000, John Garry wrote:
> On 05/01/2021 11:18, Ming Lei wrote:
> > > > > ot set normally.
> > > > It always returns true, and might just take a bit more CPU, especially as
> > > > the tag queue depth of megaraid_sas and hisi_sas_v3 is quite high.
> > > Hi Ming,
> > >
> > > Right, but we actually tested by hacking the host tag queue depth to be
> > > lower such that we should have tag contention. Here is an extract from the
> > > original series cover letter with my results:
> > >
> > > Tag depth             4000 (default)   260**
> > >
> > > Baseline (v5.9-rc1):
> > > none sched:           2094K IOPS       513K
> > > mq-deadline sched:    2145K IOPS       1336K
> > >
> > > Final, host_tagset=0 in LLDD *, ***:
> > > none sched:           2120K IOPS       550K
> > > mq-deadline sched:    2121K IOPS       1309K
> > >
> > > Final ***:
> > > none sched:           2132K IOPS       1185
> > > mq-deadline sched:    2145K IOPS       2097
> > >
> > > Maybe my test did not expose the issue. Kashyap also tested this and
> > > reported the original issue such that we needed this feature, so I'm
> > > confused.
>
> Hi Ming,
>
> > How many LUNs are involved in the above test with 260 depth?
>
> For me, there were 12 SAS SSDs; for convenience, here is the cover letter
> with details:
> https://lore.kernel.org/linux-block/1597850436-116171-1-git-send-email-john.garry@huawei.com/
>
> IIRC, for megaraid_sas, Kashyap used many more LUNs for testing (64) and a
> high fio depth (128) but did not reduce .can_queue; the topic was originally
> raised here:
> https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/

OK, in both tests, nr_luns is big enough wrt. the 260 depth. Maybe that is
why very low IOPS is observed in 'Final (hosttag=1)' with 260 depth.

I'd suggest running your previous test again after applying this patch to
see if a difference can be observed.
On 06/01/2021 01:28, Ming Lei wrote:
>>> How many LUNs are involved in the above test with 260 depth?
>> For me, there were 12 SAS SSDs; for convenience, here is the cover letter
>> with details:
>> https://lore.kernel.org/linux-block/1597850436-116171-1-git-send-email-john.garry@huawei.com/
>>
>> IIRC, for megaraid_sas, Kashyap used many more LUNs for testing (64) and a
>> high fio depth (128) but did not reduce .can_queue; the topic was originally
>> raised here:
>> https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/
> OK, in both tests, nr_luns is big enough wrt. the 260 depth. Maybe that is
> why very low IOPS is observed in 'Final (hosttag=1)' with 260 depth.
>
> I'd suggest running your previous test again after applying this patch to
> see if a difference can be observed.

Hi Ming,

I tested and didn't see a noticeable difference with the fix when using the
reduced tag queue depth. I got ~500K IOPS with a tag queue depth of 260, as
opposed to 2M with the full tag queue depth. However, I was doubtful about
this test method before.

Regardless, your change and this feature still look proper.

@Kashyap, it would be great if you guys could also test this on the same
setup you described previously:
https://lore.kernel.org/linux-block/29f8062c1fccace73c45252073232917@mail.gmail.com/

Thanks,
John
On Sun, Dec 27, 2020 at 07:34:58PM +0800, Ming Lei wrote:
> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
>
> So fix it.
>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

Hello Jens,

This one fixes a v5.11 issue; can you queue it?

Thanks,
Ming
On 1/24/21 7:29 PM, Ming Lei wrote:
> On Sun, Dec 27, 2020 at 07:34:58PM +0800, Ming Lei wrote:
>> In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
>> q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.
>>
>> So fix it.
>>
>> Cc: John Garry <john.garry@huawei.com>
>> Cc: Kashyap Desai <kashyap.desai@broadcom.com>
>> Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>
> Hello Jens,
>
> This one fixes one v5.11 issue, can you queue it?

Queued up, thanks.
diff --git a/block/blk-mq.h b/block/blk-mq.h
index c1458d9502f1..3616453ca28c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -304,7 +304,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 		struct request_queue *q = hctx->queue;
 		struct blk_mq_tag_set *set = q->tag_set;
 
-		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &q->queue_flags))
+		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return true;
 		users = atomic_read(&set->active_queues_shared_sbitmap);
 	} else {
In case of blk_mq_is_sbitmap_shared(), we should test QUEUE_FLAG_HCTX_ACTIVE against
q->queue_flags instead of BLK_MQ_S_TAG_ACTIVE.

So fix it.

Cc: John Garry <john.garry@huawei.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Fixes: f1b49fdc1c64 ("blk-mq: Record active_queues_shared_sbitmap per tag_set for when using shared sbitmap")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)