
[v3,3/5] block, bfq: don't disable wbt if CONFIG_BFQ_GROUP_IOSCHED is disabled

Message ID 20220922113558.1085314-4-yukuai3@huawei.com (mailing list archive)
State New, archived
Series blk-wbt: simple improvement to enable wbt correctly

Commit Message

Yu Kuai Sept. 22, 2022, 11:35 a.m. UTC
wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/bfq-iosched.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig Sept. 23, 2022, 8:56 a.m. UTC | #1
On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.

Umm, wouldn't this be something decided at runtime, that is, not
whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
but whether the hierarchical cgroup-based scheduling is actually used for
a given device?
Yu Kuai Sept. 23, 2022, 9:50 a.m. UTC | #2
Hi, Christoph

On 2022/09/23 16:56, Christoph Hellwig wrote:
> On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
>> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
> 
> Umm, wouldn't this be something decided at runtime, that is, not
> whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
> but whether the hierarchical cgroup-based scheduling is actually used for
> a given device?
> 

That's a good point.

Before this patch, wbt is simply disabled if the elevator is bfq.

With this patch, if the elevator is bfq but bfq doesn't throttle
any IO yet, wbt is still disabled unnecessarily.

I have an idea to enable/disable wbt while tracking how many bfq_groups
are activated, which relies on another patchset of mine that is not
applied yet:

"support concurrent sync io for bfq on a special occasion".
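
Something like the following untested sketch, reusing the
bfqd->num_groups_with_pending_reqs counter from that series (the helper
and its call sites are made up here, just to show the idea):

/*
 * Hypothetical helper: keep wbt disabled only while bfq actually has
 * more than one group with pending requests, i.e. while inter-group
 * service guarantees matter.
 */
static void bfq_update_wbt_state(struct bfq_data *bfqd)
{
	struct request_queue *q = bfqd->queue;

	if (bfqd->num_groups_with_pending_reqs > 1)
		wbt_disable_default(q);	/* bfq provides the throttling */
	else
		wbt_enable_default(q);	/* a single group, let wbt run */
}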

I think this patch still makes sense for now; perhaps I can do more work
after the above patchset is finally applied?

Thanks,
Kuai
Jan Kara Sept. 23, 2022, 10:06 a.m. UTC | #3
On Fri 23-09-22 17:50:49, Yu Kuai wrote:
> Hi, Christoph
> 
> On 2022/09/23 16:56, Christoph Hellwig wrote:
> > On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
> > > wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
> > 
> > Umm, wouldn't this be something decided at runtime, that is, not
> > whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
> > but whether the hierarchical cgroup-based scheduling is actually used for
> > a given device?
> > 
> 
> That's a good point.
> 
> Before this patch, wbt is simply disabled if the elevator is bfq.
> 
> With this patch, if the elevator is bfq but bfq doesn't throttle
> any IO yet, wbt is still disabled unnecessarily.

It is not really disabled unnecessarily. Have you actually tested the
performance of the combination? I did once and the results were just
horrible (which is why I made BFQ just disable wbt by default). The problem is
that blk-wbt assumes a certain model of the underlying storage stack and
hardware behavior, and BFQ just does not fit that model. For example, BFQ
wants to see as many requests as possible so that it can heavily reorder
them, estimate think times of applications, etc. On the other hand, blk-wbt
assumes that if request latency gets higher, it means there is too much IO
going on and we need to allow less of the "lower priority" IO types to be
submitted. These two go directly against one another, and I was easily
observing blk-wbt spiraling down to allowing only a very small number of
requests to be submitted while BFQ was idling waiting for more IO from the
process that was currently scheduled.
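
Roughly, that downward spiral looks like this (a toy userspace model
with made-up numbers, not the actual blk-wbt code):

#include <stdio.h>

int main(void)
{
	unsigned int depth = 64;	/* inflight limit wbt starts from */

	for (int window = 0; window < 7; window++) {
		/*
		 * BFQ's idling keeps completion latency above wbt's
		 * target, so every sampling window looks "congested".
		 */
		int latency_above_target = 1;

		if (latency_above_target && depth > 1)
			depth /= 2;	/* wbt steps its limit down */
		printf("window %d: %u requests allowed in flight\n",
		       window, depth);
	}
	return 0;
}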

So I'm kind of wondering why you'd like to use blk-wbt and BFQ together...

								Honza
Yu Kuai Sept. 23, 2022, 10:23 a.m. UTC | #4
Hi, Jan

On 2022/09/23 18:06, Jan Kara wrote:
> On Fri 23-09-22 17:50:49, Yu Kuai wrote:
>> Hi, Christoph
>>
>> On 2022/09/23 16:56, Christoph Hellwig wrote:
>>> On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
>>>> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
>>>
>>> Umm, wouldn't this be something decided at runtime, that is, not
>>> whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
>>> but whether the hierarchical cgroup-based scheduling is actually used for
>>> a given device?
>>>
>>
>> That's a good point.
>>
>> Before this patch, wbt is simply disabled if the elevator is bfq.
>>
>> With this patch, if the elevator is bfq but bfq doesn't throttle
>> any IO yet, wbt is still disabled unnecessarily.
> 
> It is not really disabled unnecessarily. Have you actually tested the
> performance of the combination? I did once and the results were just
> horrible (which is why I made BFQ just disable wbt by default). The problem is
> that blk-wbt assumes a certain model of the underlying storage stack and
> hardware behavior, and BFQ just does not fit that model. For example, BFQ
> wants to see as many requests as possible so that it can heavily reorder
> them, estimate think times of applications, etc. On the other hand, blk-wbt
> assumes that if request latency gets higher, it means there is too much IO
> going on and we need to allow less of the "lower priority" IO types to be
> submitted. These two go directly against one another, and I was easily
> observing blk-wbt spiraling down to allowing only a very small number of
> requests to be submitted while BFQ was idling waiting for more IO from the
> process that was currently scheduled.
> 

Thanks for your explanation; I understand that bfq and wbt should not
work together.

However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
service guarantee is not needed, does the above phenomenon still exist? I
find it hard to understand... Perhaps I need to do some tests.

Thanks,
Kuai

> So I'm kind of wondering why you'd like to use blk-wbt and BFQ together...
> 
> 								Honza
>
Jan Kara Sept. 23, 2022, 11:03 a.m. UTC | #5
Hi Kuai!

On Fri 23-09-22 18:23:03, Yu Kuai wrote:
> On 2022/09/23 18:06, Jan Kara wrote:
> > On Fri 23-09-22 17:50:49, Yu Kuai wrote:
> > > Hi, Christoph
> > > 
> > > On 2022/09/23 16:56, Christoph Hellwig wrote:
> > > > On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
> > > > > wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
> > > > 
> > > > Umm, wouldn't this be something decided at runtime, that is, not
> > > > whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
> > > > but whether the hierarchical cgroup-based scheduling is actually used for
> > > > a given device?
> > > > 
> > > 
> > > That's a good point.
> > > 
> > > Before this patch, wbt is simply disabled if the elevator is bfq.
> > > 
> > > With this patch, if the elevator is bfq but bfq doesn't throttle
> > > any IO yet, wbt is still disabled unnecessarily.
> > 
> > It is not really disabled unnecessarily. Have you actually tested the
> > performance of the combination? I did once and the results were just
> > horrible (which is why I made BFQ just disable wbt by default). The problem is
> > that blk-wbt assumes a certain model of the underlying storage stack and
> > hardware behavior, and BFQ just does not fit that model. For example, BFQ
> > wants to see as many requests as possible so that it can heavily reorder
> > them, estimate think times of applications, etc. On the other hand, blk-wbt
> > assumes that if request latency gets higher, it means there is too much IO
> > going on and we need to allow less of the "lower priority" IO types to be
> > submitted. These two go directly against one another, and I was easily
> > observing blk-wbt spiraling down to allowing only a very small number of
> > requests to be submitted while BFQ was idling waiting for more IO from the
> > process that was currently scheduled.
> > 
> 
> Thanks for your explanation; I understand that bfq and wbt should not
> work together.
> 
> However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
> service guarantee is not needed, does the above phenomenon still exist? I
> find it hard to understand... Perhaps I need to do some tests.

Well, BFQ implements, for example, idling on sync IO queues, which is one
of the features that upsets blk-wbt. That does not depend on
CONFIG_BFQ_GROUP_IOSCHED in any way. Also, more generally, the fact that
BFQ assigns storage *time slots* to different processes, while IO from
other processes is just queued during those slots, increases IO completion
latency (for IOs of processes that are not currently scheduled), and this
tends to confuse blk-wbt.
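
As a back-of-the-envelope illustration (made-up numbers, not measured
data): while the in-service queue idles, IO from everyone else just
waits, so the latency blk-wbt samples is dominated by the idle slice
rather than by actual device congestion:

#include <stdio.h>

int main(void)
{
	unsigned int device_lat_us = 200;	/* raw completion time   */
	unsigned int idle_slice_us = 8000;	/* BFQ waits for sync IO */
	unsigned int wbt_target_us = 2000;	/* wbt's latency target  */
	unsigned int observed = idle_slice_us + device_lat_us;

	printf("observed: %u us, target: %u us -> %s\n",
	       observed, wbt_target_us,
	       observed > wbt_target_us ?
	       "wbt scales down although the device is fine" : "ok");
	return 0;
}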

								Honza
Yu Kuai Sept. 23, 2022, 11:32 a.m. UTC | #6
Hi, Jan!

On 2022/09/23 19:03, Jan Kara wrote:
> Hi Kuai!
> 
> On Fri 23-09-22 18:23:03, Yu Kuai wrote:
>> On 2022/09/23 18:06, Jan Kara wrote:
>>> On Fri 23-09-22 17:50:49, Yu Kuai wrote:
>>>> Hi, Christoph
>>>>
>>>> On 2022/09/23 16:56, Christoph Hellwig wrote:
>>>>> On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
>>>>>> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
>>>>>
>>>>> Umm, wouldn't this be something decided at runtime, that is, not
>>>>> whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
>>>>> but whether the hierarchical cgroup-based scheduling is actually used for
>>>>> a given device?
>>>>>
>>>>
>>>> That's a good point.
>>>>
>>>> Before this patch, wbt is simply disabled if the elevator is bfq.
>>>>
>>>> With this patch, if the elevator is bfq but bfq doesn't throttle
>>>> any IO yet, wbt is still disabled unnecessarily.
>>>
>>> It is not really disabled unnecessarily. Have you actually tested the
>>> performance of the combination? I did once and the results were just
>>> horrible (which is why I made BFQ just disable wbt by default). The problem is
>>> that blk-wbt assumes a certain model of the underlying storage stack and
>>> hardware behavior, and BFQ just does not fit that model. For example, BFQ
>>> wants to see as many requests as possible so that it can heavily reorder
>>> them, estimate think times of applications, etc. On the other hand, blk-wbt
>>> assumes that if request latency gets higher, it means there is too much IO
>>> going on and we need to allow less of the "lower priority" IO types to be
>>> submitted. These two go directly against one another, and I was easily
>>> observing blk-wbt spiraling down to allowing only a very small number of
>>> requests to be submitted while BFQ was idling waiting for more IO from the
>>> process that was currently scheduled.
>>>
>>
>> Thanks for your explanation; I understand that bfq and wbt should not
>> work together.
>>
>> However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
>> service guarantee is not needed, does the above phenomenon still exist? I
>> find it hard to understand... Perhaps I need to do some tests.
> 
> Well, BFQ implements, for example, idling on sync IO queues, which is one
> of the features that upsets blk-wbt. That does not depend on
> CONFIG_BFQ_GROUP_IOSCHED in any way. Also, more generally, the fact that
> BFQ assigns storage *time slots* to different processes, while IO from
> other processes is just queued during those slots, increases IO completion
> latency (for IOs of processes that are not currently scheduled), and this
> tends to confuse blk-wbt.
> 
I see it now, thanks a lot for your explanations, that really helps a lot.

I misunderstood how bfq works. I'll remove this patch in the
next version.

Thanks,
Kuai

> 								Honza
>
Yu Kuai Sept. 26, 2022, 1 p.m. UTC | #7
Hi, Jan

On 2022/09/23 19:03, Jan Kara wrote:
> Hi Kuai!
> 
> On Fri 23-09-22 18:23:03, Yu Kuai wrote:
>> On 2022/09/23 18:06, Jan Kara wrote:
>>> On Fri 23-09-22 17:50:49, Yu Kuai wrote:
>>>> Hi, Christoph
>>>>
>>>> On 2022/09/23 16:56, Christoph Hellwig wrote:
>>>>> On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
>>>>>> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
>>>>>
>>>>> Umm, wouldn't this be something decided at runtime, that is, not
>>>>> whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
>>>>> but whether the hierarchical cgroup-based scheduling is actually used for
>>>>> a given device?
>>>>>
>>>>
>>>> That's a good point.
>>>>
>>>> Before this patch, wbt is simply disabled if the elevator is bfq.
>>>>
>>>> With this patch, if the elevator is bfq but bfq doesn't throttle
>>>> any IO yet, wbt is still disabled unnecessarily.
>>>
>>> It is not really disabled unnecessarily. Have you actually tested the
>>> performance of the combination? I did once and the results were just
>>> horrible (which is why I made BFQ just disable wbt by default). The problem is
>>> that blk-wbt assumes a certain model of the underlying storage stack and
>>> hardware behavior, and BFQ just does not fit that model. For example, BFQ
>>> wants to see as many requests as possible so that it can heavily reorder
>>> them, estimate think times of applications, etc. On the other hand, blk-wbt
>>> assumes that if request latency gets higher, it means there is too much IO
>>> going on and we need to allow less of the "lower priority" IO types to be
>>> submitted. These two go directly against one another, and I was easily
>>> observing blk-wbt spiraling down to allowing only a very small number of
>>> requests to be submitted while BFQ was idling waiting for more IO from the
>>> process that was currently scheduled.
>>>
>>
>> Thanks for your explanation; I understand that bfq and wbt should not
>> work together.
>>
>> However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
>> service guarantee is not needed, does the above phenomenon still exist? I
>> find it hard to understand... Perhaps I need to do some tests.
> 
> Well, BFQ implements, for example, idling on sync IO queues, which is one
> of the features that upsets blk-wbt. That does not depend on
> CONFIG_BFQ_GROUP_IOSCHED in any way. Also, more generally, the fact that
> BFQ assigns storage *time slots* to different processes, while IO from
> other processes is just queued during those slots, increases IO completion
> latency (for IOs of processes that are not currently scheduled), and this
> tends to confuse blk-wbt.
> 
Just out of curiosity, have you ever thought about or tested wbt with
io-cost? And, going further, how does bfq work with io-cost?

I haven't tested yet, but it seems to me some of them can work well
together.

Thanks,
Kuai
> 								Honza
>
Jan Kara Sept. 26, 2022, 2:22 p.m. UTC | #8
Hi Kuai!

On Mon 26-09-22 21:00:48, Yu Kuai wrote:
> On 2022/09/23 19:03, Jan Kara wrote:
> > Hi Kuai!
> > 
> > On Fri 23-09-22 18:23:03, Yu Kuai wrote:
> > > On 2022/09/23 18:06, Jan Kara wrote:
> > > > On Fri 23-09-22 17:50:49, Yu Kuai wrote:
> > > > > Hi, Christoph
> > > > > 
> > > > > On 2022/09/23 16:56, Christoph Hellwig wrote:
> > > > > > On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
> > > > > > > wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
> > > > > > 
> > > > > > Umm, wouldn't this be something decided at runtime, that is, not
> > > > > > whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
> > > > > > but whether the hierarchical cgroup-based scheduling is actually used for
> > > > > > a given device?
> > > > > > 
> > > > > 
> > > > > That's a good point.
> > > > > 
> > > > > Before this patch, wbt is simply disabled if the elevator is bfq.
> > > > > 
> > > > > With this patch, if the elevator is bfq but bfq doesn't throttle
> > > > > any IO yet, wbt is still disabled unnecessarily.
> > > > 
> > > > It is not really disabled unnecessarily. Have you actually tested the
> > > > performance of the combination? I did once and the results were just
> > > > horrible (which is why I made BFQ just disable wbt by default). The problem is
> > > > that blk-wbt assumes a certain model of the underlying storage stack and
> > > > hardware behavior, and BFQ just does not fit that model. For example, BFQ
> > > > wants to see as many requests as possible so that it can heavily reorder
> > > > them, estimate think times of applications, etc. On the other hand, blk-wbt
> > > > assumes that if request latency gets higher, it means there is too much IO
> > > > going on and we need to allow less of the "lower priority" IO types to be
> > > > submitted. These two go directly against one another, and I was easily
> > > > observing blk-wbt spiraling down to allowing only a very small number of
> > > > requests to be submitted while BFQ was idling waiting for more IO from the
> > > > process that was currently scheduled.
> > > > 
> > > 
> > > Thanks for your explanation; I understand that bfq and wbt should not
> > > work together.
> > > 
> > > However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
> > > service guarantee is not needed, does the above phenomenon still exist? I
> > > find it hard to understand... Perhaps I need to do some tests.
> > 
> > Well, BFQ implements, for example, idling on sync IO queues, which is one
> > of the features that upsets blk-wbt. That does not depend on
> > CONFIG_BFQ_GROUP_IOSCHED in any way. Also, more generally, the fact that
> > BFQ assigns storage *time slots* to different processes, while IO from
> > other processes is just queued during those slots, increases IO completion
> > latency (for IOs of processes that are not currently scheduled), and this
> > tends to confuse blk-wbt.
> > 
> Just out of curiosity, have you ever thought about or tested wbt with
> io-cost? And, going further, how does bfq work with io-cost?
> 
> I haven't tested yet, but it seems to me some of them can work well
> together.

No, I didn't test these combinations. I actually expect there would be
trouble in both cases under high IO load, but you can try :)

								Honza
Yu Kuai Sept. 27, 2022, 1:02 a.m. UTC | #9
Hi, Jan

On 2022/09/26 22:22, Jan Kara wrote:
> Hi Kuai!
> 
> On Mon 26-09-22 21:00:48, Yu Kuai wrote:
>> On 2022/09/23 19:03, Jan Kara wrote:
>>> Hi Kuai!
>>>
>>> On Fri 23-09-22 18:23:03, Yu Kuai wrote:
>>>> On 2022/09/23 18:06, Jan Kara wrote:
>>>>> On Fri 23-09-22 17:50:49, Yu Kuai wrote:
>>>>>> Hi, Christoph
>>>>>>
>>>>>> On 2022/09/23 16:56, Christoph Hellwig wrote:
>>>>>>> On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
>>>>>>>> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
>>>>>>>
>>>>>>> Umm, wouldn't this be something decided at runtime, that is, not
>>>>>>> whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
>>>>>>> but whether the hierarchical cgroup-based scheduling is actually used for
>>>>>>> a given device?
>>>>>>>
>>>>>>
>>>>>> That's a good point.
>>>>>>
>>>>>> Before this patch, wbt is simply disabled if the elevator is bfq.
>>>>>>
>>>>>> With this patch, if the elevator is bfq but bfq doesn't throttle
>>>>>> any IO yet, wbt is still disabled unnecessarily.
>>>>>
>>>>> It is not really disabled unnecessarily. Have you actually tested the
>>>>> performance of the combination? I did once and the results were just
>>>>> horrible (which is why I made BFQ just disable wbt by default). The problem is
>>>>> that blk-wbt assumes a certain model of the underlying storage stack and
>>>>> hardware behavior, and BFQ just does not fit that model. For example, BFQ
>>>>> wants to see as many requests as possible so that it can heavily reorder
>>>>> them, estimate think times of applications, etc. On the other hand, blk-wbt
>>>>> assumes that if request latency gets higher, it means there is too much IO
>>>>> going on and we need to allow less of the "lower priority" IO types to be
>>>>> submitted. These two go directly against one another, and I was easily
>>>>> observing blk-wbt spiraling down to allowing only a very small number of
>>>>> requests to be submitted while BFQ was idling waiting for more IO from the
>>>>> process that was currently scheduled.
>>>>>
>>>>
>>>> Thanks for your explanation; I understand that bfq and wbt should not
>>>> work together.
>>>>
>>>> However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
>>>> service guarantee is not needed, does the above phenomenon still exist? I
>>>> find it hard to understand... Perhaps I need to do some tests.
>>>
>>> Well, BFQ implements, for example, idling on sync IO queues, which is one
>>> of the features that upsets blk-wbt. That does not depend on
>>> CONFIG_BFQ_GROUP_IOSCHED in any way. Also, more generally, the fact that
>>> BFQ assigns storage *time slots* to different processes, while IO from
>>> other processes is just queued during those slots, increases IO completion
>>> latency (for IOs of processes that are not currently scheduled), and this
>>> tends to confuse blk-wbt.
>>>
>> Just out of curiosity, have you ever thought about or tested wbt with
>> io-cost? And, going further, how does bfq work with io-cost?
>>
>> I haven't tested yet, but it seems to me some of them can work well
>> together.
> 
> No, I didn't test these combinations. I actually expect there would be
> trouble in both cases under high IO load, but you can try :)

Just realized I made a clerical error; I actually wanted to say that they
*can't* work well together.

I'll try to test the combinations.

Thanks,
Kuai
> 
> 								Honza
>
Paolo Valente Sept. 27, 2022, 4:14 p.m. UTC | #10
> On 27 Sep 2022, at 03:02, Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
> Hi, Jan
> 
> On 2022/09/26 22:22, Jan Kara wrote:
>> Hi Kuai!
>> On Mon 26-09-22 21:00:48, Yu Kuai wrote:
>>> On 2022/09/23 19:03, Jan Kara wrote:
>>>> Hi Kuai!
>>>> 
>>>> On Fri 23-09-22 18:23:03, Yu Kuai wrote:
>>>>> On 2022/09/23 18:06, Jan Kara wrote:
>>>>>> On Fri 23-09-22 17:50:49, Yu Kuai wrote:
>>>>>>> Hi, Christoph
>>>>>>> 
>>>>>>> On 2022/09/23 16:56, Christoph Hellwig wrote:
>>>>>>>> On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
>>>>>>>>> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
>>>>>>>> 
>>>>>>>> Umm, wouldn't this be something decided at runtime, that is, not
>>>>>>>> whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
>>>>>>>> but whether the hierarchical cgroup-based scheduling is actually used for
>>>>>>>> a given device?
>>>>>>>> 
>>>>>>> 
>>>>>>> That's a good point.
>>>>>>> 
>>>>>>> Before this patch, wbt is simply disabled if the elevator is bfq.
>>>>>>> 
>>>>>>> With this patch, if the elevator is bfq but bfq doesn't throttle
>>>>>>> any IO yet, wbt is still disabled unnecessarily.
>>>>>> 
>>>>>> It is not really disabled unnecessarily. Have you actually tested the
>>>>>> performance of the combination? I did once and the results were just
>>>>>> horrible (which is why I made BFQ just disable wbt by default). The problem is
>>>>>> that blk-wbt assumes a certain model of the underlying storage stack and
>>>>>> hardware behavior, and BFQ just does not fit that model. For example, BFQ
>>>>>> wants to see as many requests as possible so that it can heavily reorder
>>>>>> them, estimate think times of applications, etc. On the other hand, blk-wbt
>>>>>> assumes that if request latency gets higher, it means there is too much IO
>>>>>> going on and we need to allow less of the "lower priority" IO types to be
>>>>>> submitted. These two go directly against one another, and I was easily
>>>>>> observing blk-wbt spiraling down to allowing only a very small number of
>>>>>> requests to be submitted while BFQ was idling waiting for more IO from the
>>>>>> process that was currently scheduled.
>>>>>> 
>>>>> 
>>>>> Thanks for your explanation; I understand that bfq and wbt should not
>>>>> work together.
>>>>> 
>>>>> However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
>>>>> service guarantee is not needed, does the above phenomenon still exist? I
>>>>> find it hard to understand... Perhaps I need to do some tests.
>>>> 
>>>> Well, BFQ implements, for example, idling on sync IO queues, which is one
>>>> of the features that upsets blk-wbt. That does not depend on
>>>> CONFIG_BFQ_GROUP_IOSCHED in any way. Also, more generally, the fact that
>>>> BFQ assigns storage *time slots* to different processes, while IO from
>>>> other processes is just queued during those slots, increases IO completion
>>>> latency (for IOs of processes that are not currently scheduled), and this
>>>> tends to confuse blk-wbt.
>>>> 
>>> Just out of curiosity, have you ever thought about or tested wbt with
>>> io-cost? And, going further, how does bfq work with io-cost?
>>> 
>>> I haven't tested yet, but it seems to me some of them can work well
>>> together.
>> No, I didn't test these combinations. I actually expect there would be
>> trouble in both cases under high IO load, but you can try :)
> 
> Just realized I made a clerical error; I actually wanted to say that they
> *can't* work well together.
> 

You are right, they can't work together, conceptually. Their control
logics would simply keep conflicting, and neither of the two would manage
to control IO as desired.

Thanks,
Paolo

> I'll try to test the combinations.
> 
> Thanks,
> Kuai
>> 								Honza
Yu Kuai Sept. 28, 2022, 3:30 a.m. UTC | #11
Hi,

On 2022/09/28 0:14, Paolo Valente wrote:
> 
> 
>> On 27 Sep 2022, at 03:02, Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> Hi, Jan
>>
>> On 2022/09/26 22:22, Jan Kara wrote:
>>> Hi Kuai!
>>> On Mon 26-09-22 21:00:48, Yu Kuai wrote:
>>>> On 2022/09/23 19:03, Jan Kara wrote:
>>>>> Hi Kuai!
>>>>>
>>>>> On Fri 23-09-22 18:23:03, Yu Kuai wrote:
>>>>>> On 2022/09/23 18:06, Jan Kara wrote:
>>>>>>> On Fri 23-09-22 17:50:49, Yu Kuai wrote:
>>>>>>>> Hi, Christoph
>>>>>>>>
>>>>>>>> On 2022/09/23 16:56, Christoph Hellwig wrote:
>>>>>>>>> On Thu, Sep 22, 2022 at 07:35:56PM +0800, Yu Kuai wrote:
>>>>>>>>>> wbt and bfq should work just fine if CONFIG_BFQ_GROUP_IOSCHED is disabled.
>>>>>>>>>
>>>>>>>>> Umm, wouldn't this be something decided at runtime, that is, not
>>>>>>>>> whether CONFIG_BFQ_GROUP_IOSCHED is enabled/disabled in the kernel build
>>>>>>>>> but whether the hierarchical cgroup-based scheduling is actually used for
>>>>>>>>> a given device?
>>>>>>>>>
>>>>>>>>
>>>>>>>> That's a good point.
>>>>>>>>
>>>>>>>> Before this patch, wbt is simply disabled if the elevator is bfq.
>>>>>>>>
>>>>>>>> With this patch, if the elevator is bfq but bfq doesn't throttle
>>>>>>>> any IO yet, wbt is still disabled unnecessarily.
>>>>>>>
>>>>>>> It is not really disabled unnecessarily. Have you actually tested the
>>>>>>> performance of the combination? I did once and the results were just
>>>>>>> horrible (which is why I made BFQ just disable wbt by default). The problem is
>>>>>>> that blk-wbt assumes a certain model of the underlying storage stack and
>>>>>>> hardware behavior, and BFQ just does not fit that model. For example, BFQ
>>>>>>> wants to see as many requests as possible so that it can heavily reorder
>>>>>>> them, estimate think times of applications, etc. On the other hand, blk-wbt
>>>>>>> assumes that if request latency gets higher, it means there is too much IO
>>>>>>> going on and we need to allow less of the "lower priority" IO types to be
>>>>>>> submitted. These two go directly against one another, and I was easily
>>>>>>> observing blk-wbt spiraling down to allowing only a very small number of
>>>>>>> requests to be submitted while BFQ was idling waiting for more IO from the
>>>>>>> process that was currently scheduled.
>>>>>>>
>>>>>>
>>>>>> Thanks for your explanation; I understand that bfq and wbt should not
>>>>>> work together.
>>>>>>
>>>>>> However, I wonder: if CONFIG_BFQ_GROUP_IOSCHED is disabled, or the
>>>>>> service guarantee is not needed, does the above phenomenon still exist? I
>>>>>> find it hard to understand... Perhaps I need to do some tests.
>>>>>
>>>>> Well, BFQ implements, for example, idling on sync IO queues, which is one
>>>>> of the features that upsets blk-wbt. That does not depend on
>>>>> CONFIG_BFQ_GROUP_IOSCHED in any way. Also, more generally, the fact that
>>>>> BFQ assigns storage *time slots* to different processes, while IO from
>>>>> other processes is just queued during those slots, increases IO completion
>>>>> latency (for IOs of processes that are not currently scheduled), and this
>>>>> tends to confuse blk-wbt.
>>>>>
>>>> Just out of curiosity, have you ever thought about or tested wbt with
>>>> io-cost? And, going further, how does bfq work with io-cost?
>>>>
>>>> I haven't tested yet, but it seems to me some of them can work well
>>>> together.
>>> No, I didn't test these combinations. I actually expect there would be
>>> trouble in both cases under high IO load, but you can try :)
>>
>> Just realized I made a clerical error; I actually wanted to say that they
>> *can't* work well together.
>>
> 
> You are right, they can't work together, conceptually. Their control
> logics would simply keep conflicting, and neither of the two would manage
> to control IO as desired.

Yes, I just ran some simple tests, and the results are very bad...

Perhaps we can do something like what bfq does to disable wbt.
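
Something along the lines of what bfq does in bfq_init_queue() (a
rough, untested sketch; the hook and its name on the blk-iocost side
are made up here):

/* Park wbt while io-cost is actively controlling the queue. */
static void ioc_set_wbt_state(struct request_queue *q, bool ioc_enabled)
{
	if (ioc_enabled)
		wbt_disable_default(q);	/* io-cost owns the throttling */
	else
		wbt_enable_default(q);	/* restore wbt's default state */
}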

Thanks,
Kuai
> 
> Thanks,
> Paolo
> 
>> I'll try to test the combinations.
>>
>> Thanks,
>> Kuai
>>> 								Honza
> 
>

Patch

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 7ea427817f7f..fec52968fe07 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -7037,6 +7037,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
 
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
 	blkcg_deactivate_policy(bfqd->queue, &blkcg_policy_bfq);
+	wbt_enable_default(bfqd->queue);
 #else
 	spin_lock_irq(&bfqd->lock);
 	bfq_put_async_queues(bfqd, bfqd->root_group);
@@ -7045,7 +7046,6 @@ static void bfq_exit_queue(struct elevator_queue *e)
 #endif
 
 	blk_stat_disable_accounting(bfqd->queue);
-	wbt_enable_default(bfqd->queue);
 
 	kfree(bfqd);
 }
@@ -7190,7 +7190,9 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_type *e)
 	/* We dispatch from request queue wide instead of hw queue */
 	blk_queue_flag_set(QUEUE_FLAG_SQ_SCHED, q);
 
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
 	wbt_disable_default(q);
+#endif
 	blk_stat_enable_accounting(q);
 
 	return 0;