[0/9,v6] bfq: Avoid use-after-free when moving processes between cgroups

Message ID: 20220330123438.32719-1-jack@suse.cz

Message

Jan Kara March 30, 2022, 12:42 p.m. UTC
Hello,

with a big delay (I'm sorry for that), here is the sixth version of my patches
to fix use-after-free issues in BFQ when processes with merged queues get moved
to different cgroups. The patches have survived some beating in my test VM, but
so far I have failed to reproduce the original KASAN reports, so testing from
people who can reproduce them is most welcome. Kuai, can you please give these
patches a run in your setup? Thanks a lot for your help with fixing this!

Changes since v5:
* Added handling of the situation where a bio is submitted for a cgroup that
  has already gone through bfq_pd_offline()
* Converted bfq to avoid the deprecated __bio_blkcg() and thus fixed possible
  races where the returned cgroup can change while bfq is working with a request

Changes since v4:
* Even more aggressive splitting of merged bfq queues to avoid problems with
  long merge chains.

Changes since v3:
* Reworked bfq group move handling to cover the case where the target of the
  merge has moved.

Changes since v2:
* Improved handling of bfq queue splitting on move between cgroups
* Removed broken change to bfq_put_cooperator()

Changes since v1:
* Added fix for bfq_put_cooperator()
* Added fix to handle move between cgroups in bfq_merge_bio()

								Honza
Previous versions:
Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1
Link: http://lore.kernel.org/r/20220105143037.20542-1-jack@suse.cz # v2
Link: http://lore.kernel.org/r/20220112113529.6355-1-jack@suse.cz # v3
Link: http://lore.kernel.org/r/20220114164215.28972-1-jack@suse.cz # v4
Link: http://lore.kernel.org/r/20220121105503.14069-1-jack@suse.cz # v5

Comments

Yu Kuai March 31, 2022, 9:31 a.m. UTC | #1
On 2022/03/30 20:42, Jan Kara wrote:
> Hello,
> 
> with a big delay (I'm sorry for that) here is the sixth version of my patches
> to fix use-after-free issues in BFQ when processes with merged queues get moved
> to different cgroups. The patches have survived some beating in my test VM, but
> so far I fail to reproduce the original KASAN reports so testing from people
> who can reproduce them is most welcome. Kuai, can you please give these patches
> a run in your setup? Thanks a lot for your help with fixing this!

Thanks for the patch, I'll run the reproducer and report back to you soon.

Kuai
Yu Kuai April 1, 2022, 3:40 a.m. UTC | #2
On 2022/03/30 20:42, Jan Kara wrote:
> Hello,
> 
> with a big delay (I'm sorry for that) here is the sixth version of my patches
> to fix use-after-free issues in BFQ when processes with merged queues get moved
> to different cgroups. The patches have survived some beating in my test VM, but
> so far I fail to reproduce the original KASAN reports so testing from people
> who can reproduce them is most welcome. Kuai, can you please give these patches
> a run in your setup? Thanks a lot for your help with fixing this!
> 
Hi, Jan

I ran the reproducer for more than 12 hours already, and the UAF is not
reproduced anymore. Before this patchset, the problem could be reproduced
within an hour.

Thanks,
Kuai
Jan Kara April 1, 2022, 9:26 a.m. UTC | #3
On Fri 01-04-22 11:40:39, yukuai (C) wrote:
> On 2022/03/30 20:42, Jan Kara wrote:
> > [...]
> Hi, Jan
> 
> I ran the reproducer for more than 12 hours already, and the UAF is not
> reproduced anymore. Before this patchset, the problem could be reproduced
> within an hour.

Great to hear that! Thanks a lot for testing and help with analysis! Can I
add your Tested-by tag?

									Honza

Yu Kuai April 1, 2022, 9:40 a.m. UTC | #4
On 2022/04/01 17:26, Jan Kara wrote:
> On Fri 01-04-22 11:40:39, yukuai (C) wrote:
>> [...]
>> Hi, Jan
>>
>> I ran the reproducer for more than 12 hours already, and the UAF is not
>> reproduced anymore. Before this patchset, the problem could be reproduced
>> within an hour.
> 
> Great to hear that! Thanks a lot for testing and help with analysis! Can I
> add your Tested-by tag?

Of course.

Cheers for addressing this problem.
Kuai