mbox series

[for-6.12,0/4] block, bfq: fix corner cases related to bfqq merging

Message ID 20240902130329.3787024-1-yukuai1@huaweicloud.com (mailing list archive)
Headers show
Series block, bfq: fix corner cases related to bfqq merging | expand

Message

Yu Kuai Sept. 2, 2024, 1:03 p.m. UTC
From: Yu Kuai <yukuai3@huawei.com>

Our syzkaller report a UAF problem(details in patch 1), however it can't
be reporduced. And this set are some corner cases fix that might be
related, and they are found by code review.

Yu Kuai (4):
  block, bfq: fix possible UAF for bfqq->bic with merge chain
  block, bfq: choose the last bfqq from merge chain in
    bfq_setup_cooperator()
  block, bfq: don't break merge chain in bfq_split_bfqq()
  block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()

 block/bfq-cgroup.c  |  7 +------
 block/bfq-iosched.c | 17 +++++++++++------
 block/bfq-iosched.h |  2 ++
 3 files changed, 14 insertions(+), 12 deletions(-)

Comments

Jens Axboe Sept. 3, 2024, 3:51 p.m. UTC | #1
On 9/2/24 7:03 AM, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
> 
> Our syzkaller report a UAF problem(details in patch 1), however it can't
> be reporduced. And this set are some corner cases fix that might be
> related, and they are found by code review.
> 
> Yu Kuai (4):
>   block, bfq: fix possible UAF for bfqq->bic with merge chain
>   block, bfq: choose the last bfqq from merge chain in
>     bfq_setup_cooperator()
>   block, bfq: don't break merge chain in bfq_split_bfqq()
>   block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
> 
>  block/bfq-cgroup.c  |  7 +------
>  block/bfq-iosched.c | 17 +++++++++++------
>  block/bfq-iosched.h |  2 ++
>  3 files changed, 14 insertions(+), 12 deletions(-)

BFQ is effectively unmaintained, and has been for quite a while at
this point. I'll apply these, thanks for looking into it, but I think we
should move BFQ to an unmaintained state at this point.
Jens Axboe Sept. 3, 2024, 3:56 p.m. UTC | #2
On Mon, 02 Sep 2024 21:03:25 +0800, Yu Kuai wrote:
> Our syzkaller report a UAF problem(details in patch 1), however it can't
> be reporduced. And this set are some corner cases fix that might be
> related, and they are found by code review.
> 
> Yu Kuai (4):
>   block, bfq: fix possible UAF for bfqq->bic with merge chain
>   block, bfq: choose the last bfqq from merge chain in
>     bfq_setup_cooperator()
>   block, bfq: don't break merge chain in bfq_split_bfqq()
>   block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
> 
> [...]

Applied, thanks!

[1/4] block, bfq: fix possible UAF for bfqq->bic with merge chain
      commit: 18ad4df091dd5d067d2faa8fce1180b79f7041a7
[2/4] block, bfq: choose the last bfqq from merge chain in bfq_setup_cooperator()
      commit: 0e456dba86c7f9a19792204a044835f1ca2c8dbb
[3/4] block, bfq: don't break merge chain in bfq_split_bfqq()
      commit: 42c306ed723321af4003b2a41bb73728cab54f85
[4/4] block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
      commit: f45916ae60eb60e7c9c3ac60cf07e66fe1a7faad

Best regards,
Yu Kuai Sept. 4, 2024, 1:32 a.m. UTC | #3
Hi,

在 2024/09/03 23:51, Jens Axboe 写道:
> On 9/2/24 7:03 AM, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Our syzkaller report a UAF problem(details in patch 1), however it can't
>> be reporduced. And this set are some corner cases fix that might be
>> related, and they are found by code review.
>>
>> Yu Kuai (4):
>>    block, bfq: fix possible UAF for bfqq->bic with merge chain
>>    block, bfq: choose the last bfqq from merge chain in
>>      bfq_setup_cooperator()
>>    block, bfq: don't break merge chain in bfq_split_bfqq()
>>    block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
>>
>>   block/bfq-cgroup.c  |  7 +------
>>   block/bfq-iosched.c | 17 +++++++++++------
>>   block/bfq-iosched.h |  2 ++
>>   3 files changed, 14 insertions(+), 12 deletions(-)
> 
> BFQ is effectively unmaintained, and has been for quite a while at
> this point. I'll apply these, thanks for looking into it, but I think we
> should move BFQ to an unmaintained state at this point.

Sorry to hear that, we would be willing to take on the responsibility of
maintaining this code, please let me know if there are any specific
guidelines or processes we should follow. We do have customers are using
bfq in downstream kernels, and we are still running lots of test for
bfq.

Thanks,
Kuai


>
Bart Van Assche Sept. 4, 2024, 2:28 a.m. UTC | #4
On 9/3/24 6:32 PM, Yu Kuai wrote:
> We do have customers are using bfq in downstream kernels, and we are
> still running lots of test for bfq.

It may take less time to add any missing functionality to another I/O
scheduler rather than to keep maintaining BFQ.

If Android device vendors would stop using BFQ, my job would become
easier.

Thanks,

Bart.
Yu Kuai Sept. 4, 2024, 2:45 a.m. UTC | #5
Hi,

在 2024/09/04 10:28, Bart Van Assche 写道:
> On 9/3/24 6:32 PM, Yu Kuai wrote:
>> We do have customers are using bfq in downstream kernels, and we are
>> still running lots of test for bfq.
> 
> It may take less time to add any missing functionality to another I/O
> scheduler rather than to keep maintaining BFQ.
> 
> If Android device vendors would stop using BFQ, my job would become
> easier.

I'm confused now, I think keep maintaining BFQ won't stop you from
adding new functionality to another scheduler, right? Is this something
that all scheduler have to support?

Thanks,
Kuai

> 
> Thanks,
> 
> Bart.
> .
>
Ming Lei Sept. 4, 2024, 4:38 a.m. UTC | #6
On Wed, Sep 04, 2024 at 09:32:26AM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2024/09/03 23:51, Jens Axboe 写道:
> > On 9/2/24 7:03 AM, Yu Kuai wrote:
> > > From: Yu Kuai <yukuai3@huawei.com>
> > > 
> > > Our syzkaller report a UAF problem(details in patch 1), however it can't
> > > be reporduced. And this set are some corner cases fix that might be
> > > related, and they are found by code review.
> > > 
> > > Yu Kuai (4):
> > >    block, bfq: fix possible UAF for bfqq->bic with merge chain
> > >    block, bfq: choose the last bfqq from merge chain in
> > >      bfq_setup_cooperator()
> > >    block, bfq: don't break merge chain in bfq_split_bfqq()
> > >    block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
> > > 
> > >   block/bfq-cgroup.c  |  7 +------
> > >   block/bfq-iosched.c | 17 +++++++++++------
> > >   block/bfq-iosched.h |  2 ++
> > >   3 files changed, 14 insertions(+), 12 deletions(-)
> > 
> > BFQ is effectively unmaintained, and has been for quite a while at
> > this point. I'll apply these, thanks for looking into it, but I think we
> > should move BFQ to an unmaintained state at this point.
> 
> Sorry to hear that, we would be willing to take on the responsibility of
> maintaining this code, please let me know if there are any specific
> guidelines or processes we should follow. We do have customers are using
> bfq in downstream kernels, and we are still running lots of test for
> bfq.

Hi Yukuai,
 
BTW, BFQ is default IO scheduler for SCSI/MMC in Fedora. That is great if you'd
like to volunteer to take over maintaining it.


thanks,
Ming
Jan Kara Sept. 4, 2024, 12:29 p.m. UTC | #7
On Wed 04-09-24 09:32:26, Yu Kuai wrote:
> 在 2024/09/03 23:51, Jens Axboe 写道:
> > On 9/2/24 7:03 AM, Yu Kuai wrote:
> > > From: Yu Kuai <yukuai3@huawei.com>
> > > 
> > > Our syzkaller report a UAF problem(details in patch 1), however it can't
> > > be reporduced. And this set are some corner cases fix that might be
> > > related, and they are found by code review.
> > > 
> > > Yu Kuai (4):
> > >    block, bfq: fix possible UAF for bfqq->bic with merge chain
> > >    block, bfq: choose the last bfqq from merge chain in
> > >      bfq_setup_cooperator()
> > >    block, bfq: don't break merge chain in bfq_split_bfqq()
> > >    block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
> > > 
> > >   block/bfq-cgroup.c  |  7 +------
> > >   block/bfq-iosched.c | 17 +++++++++++------
> > >   block/bfq-iosched.h |  2 ++
> > >   3 files changed, 14 insertions(+), 12 deletions(-)
> > 
> > BFQ is effectively unmaintained, and has been for quite a while at
> > this point. I'll apply these, thanks for looking into it, but I think we
> > should move BFQ to an unmaintained state at this point.
> 
> Sorry to hear that, we would be willing to take on the responsibility of
> maintaining this code, please let me know if there are any specific
> guidelines or processes we should follow. We do have customers are using
> bfq in downstream kernels, and we are still running lots of test for
> bfq.

That would be awesome. I don't think there's much of a process to follow
given there's not much happening in BFQ. You can add yourself to
MAINTAINERS file under "BFQ I/O SCHEDULER" entry and then do your best to
keep BFQ alive by fixing bugs and responding to reports :) I'm not sure if
Jens would prefer you'd create your git tree from which he will pull or
whether merging patches is fine - he has to decide.

								Honza
Jens Axboe Sept. 4, 2024, 1:49 p.m. UTC | #8
On 9/4/24 6:29 AM, Jan Kara wrote:
> On Wed 04-09-24 09:32:26, Yu Kuai wrote:
>> ? 2024/09/03 23:51, Jens Axboe ??:
>>> On 9/2/24 7:03 AM, Yu Kuai wrote:
>>>> From: Yu Kuai <yukuai3@huawei.com>
>>>>
>>>> Our syzkaller report a UAF problem(details in patch 1), however it can't
>>>> be reporduced. And this set are some corner cases fix that might be
>>>> related, and they are found by code review.
>>>>
>>>> Yu Kuai (4):
>>>>    block, bfq: fix possible UAF for bfqq->bic with merge chain
>>>>    block, bfq: choose the last bfqq from merge chain in
>>>>      bfq_setup_cooperator()
>>>>    block, bfq: don't break merge chain in bfq_split_bfqq()
>>>>    block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
>>>>
>>>>   block/bfq-cgroup.c  |  7 +------
>>>>   block/bfq-iosched.c | 17 +++++++++++------
>>>>   block/bfq-iosched.h |  2 ++
>>>>   3 files changed, 14 insertions(+), 12 deletions(-)
>>>
>>> BFQ is effectively unmaintained, and has been for quite a while at
>>> this point. I'll apply these, thanks for looking into it, but I think we
>>> should move BFQ to an unmaintained state at this point.
>>
>> Sorry to hear that, we would be willing to take on the responsibility of
>> maintaining this code, please let me know if there are any specific
>> guidelines or processes we should follow. We do have customers are using
>> bfq in downstream kernels, and we are still running lots of test for
>> bfq.
> 
> That would be awesome. I don't think there's much of a process to follow
> given there's not much happening in BFQ. You can add yourself to
> MAINTAINERS file under "BFQ I/O SCHEDULER" entry and then do your best to
> keep BFQ alive by fixing bugs and responding to reports :) I'm not sure if
> Jens would prefer you'd create your git tree from which he will pull or
> whether merging patches is fine - he has to decide.

The usual process is that you start actually maintaining it, and after a
bit of a track record has been proven, then add the maintainers entry.
Too many times people start by adding a maintainers entry and then don't
really do anything. Not saying that'd necessarily be the case here, but
maintaining first and then adding an entry down the line seems like the
better approach.

I prefer people sending patches, as there's less risk there for messing
it up. Maintaining a git tree may seem easy, but lots of people end up
messing it up, particularly as a new maintainer.
Jens Axboe Sept. 4, 2024, 1:53 p.m. UTC | #9
On 9/3/24 7:32 PM, Yu Kuai wrote:
> Hi,
> 
> 在 2024/09/03 23:51, Jens Axboe 写道:
>> On 9/2/24 7:03 AM, Yu Kuai wrote:
>>> From: Yu Kuai <yukuai3@huawei.com>
>>>
>>> Our syzkaller report a UAF problem(details in patch 1), however it can't
>>> be reporduced. And this set are some corner cases fix that might be
>>> related, and they are found by code review.
>>>
>>> Yu Kuai (4):
>>>    block, bfq: fix possible UAF for bfqq->bic with merge chain
>>>    block, bfq: choose the last bfqq from merge chain in
>>>      bfq_setup_cooperator()
>>>    block, bfq: don't break merge chain in bfq_split_bfqq()
>>>    block, bfq: use bfq_reassign_last_bfqq() in bfq_bfqq_move()
>>>
>>>   block/bfq-cgroup.c  |  7 +------
>>>   block/bfq-iosched.c | 17 +++++++++++------
>>>   block/bfq-iosched.h |  2 ++
>>>   3 files changed, 14 insertions(+), 12 deletions(-)
>>
>> BFQ is effectively unmaintained, and has been for quite a while at
>> this point. I'll apply these, thanks for looking into it, but I think we
>> should move BFQ to an unmaintained state at this point.
> 
> Sorry to hear that, we would be willing to take on the responsibility of
> maintaining this code, please let me know if there are any specific
> guidelines or processes we should follow. We do have customers are using
> bfq in downstream kernels, and we are still running lots of test for
> bfq.

Most important is just reviewing fixes and tending to bug reports, and
then collecting those fixes and sending them out to the list+me for
inclusion. Not much more needs to happen, this series is a good example
of it.
Jens Axboe Sept. 4, 2024, 1:55 p.m. UTC | #10
On 9/3/24 8:45 PM, Yu Kuai wrote:
> Hi,
> 
> ? 2024/09/04 10:28, Bart Van Assche ??:
>> On 9/3/24 6:32 PM, Yu Kuai wrote:
>>> We do have customers are using bfq in downstream kernels, and we are
>>> still running lots of test for bfq.
>>
>> It may take less time to add any missing functionality to another I/O
>> scheduler rather than to keep maintaining BFQ.
>>
>> If Android device vendors would stop using BFQ, my job would become
>> easier.
> 
> I'm confused now, I think keep maintaining BFQ won't stop you from
> adding new functionality to another scheduler, right? Is this something
> that all scheduler have to support?

With fear of putting words into Bart's mouth, perhaps he's saying that
the BFQ is a bit of a mess and it'd be nice if we had a cleaner version
of some of the features it brings. But having someone actually maintain
it and perhaps clean it up a bit and reduce the complexity would be a
good thing. Really it's the authors choice on where to best spend his or
her time.
Bart Van Assche Sept. 4, 2024, 5:17 p.m. UTC | #11
On 9/3/24 7:45 PM, Yu Kuai wrote:
> 在 2024/09/04 10:28, Bart Van Assche 写道:
>> On 9/3/24 6:32 PM, Yu Kuai wrote:
>>> We do have customers are using bfq in downstream kernels, and we are
>>> still running lots of test for bfq.
>>
>> It may take less time to add any missing functionality to another I/O
>> scheduler rather than to keep maintaining BFQ.
>>
>> If Android device vendors would stop using BFQ, my job would become
>> easier.
> 
> I'm confused now, I think keep maintaining BFQ won't stop you from
> adding new functionality to another scheduler, right? Is this something
> that all scheduler have to support?

As long as the BFQ I/O scheduler does not get deprecated, there will be
Android device vendors that select it for their devices. BFQ bug reports
are either sent to one of my colleagues or to myself.

For Android devices that use UFS storage, we noticed that the
mq-deadline scheduler is good enough. The device boot time is shorter
and I'm not aware of any significant differences in application startup
time.

Thanks,

Bart.
Yu Kuai Sept. 5, 2024, 1:48 a.m. UTC | #12
Hi,

在 2024/09/05 1:17, Bart Van Assche 写道:
> On 9/3/24 7:45 PM, Yu Kuai wrote:
>> 在 2024/09/04 10:28, Bart Van Assche 写道:
>>> On 9/3/24 6:32 PM, Yu Kuai wrote:
>>>> We do have customers are using bfq in downstream kernels, and we are
>>>> still running lots of test for bfq.
>>>
>>> It may take less time to add any missing functionality to another I/O
>>> scheduler rather than to keep maintaining BFQ.
>>>
>>> If Android device vendors would stop using BFQ, my job would become
>>> easier.
>>
>> I'm confused now, I think keep maintaining BFQ won't stop you from
>> adding new functionality to another scheduler, right? Is this something
>> that all scheduler have to support?
> 
> As long as the BFQ I/O scheduler does not get deprecated, there will be
> Android device vendors that select it for their devices. BFQ bug reports
> are either sent to one of my colleagues or to myself.

Then, you can share them to me now, I'll like to help.
> 
> For Android devices that use UFS storage, we noticed that the
> mq-deadline scheduler is good enough. The device boot time is shorter
> and I'm not aware of any significant differences in application startup
> time.

We're using bfq for HDD, performance overhead in bfq is not less,
like you said, if bfq doen't show better results in UFS storage, and you
don't want to use the io control feature, you can choose not to use it,
however, remove bfq will be too aggressive.

Thanks,
Kuai

> 
> Thanks,
> 
> Bart.
> 
> .
>