Message ID: 20220105143037.20542-1-jack@suse.cz (mailing list archive)
Series: bfq: Avoid use-after-free when moving processes between cgroups
On 2022/01/05 22:36, Jan Kara wrote:
> Hello,
>
> here is the second version of my patches to fix use-after-free issues in BFQ
> when processes with merged queues get moved to different cgroups. The patches
> have survived some beating in my test VM but so far I fail to reproduce the
> original KASAN reports so testing from people who can reproduce them is most
> welcome. Thanks!
>
> Changes since v1:
> * Added fix for bfq_put_cooperator()
> * Added fix to handle move between cgroups in bfq_merge_bio()
>
>                                                                 Honza
> Previous versions:
> Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1
> .

Hi,

I reproduced the problem again with this patchset...

[   71.004788] BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x21/0x290
[   71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801
[   71.007723]
[   71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G        W        5.16.0-rc5-next-2021127
[   71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-4
[   71.012274] Call Trace:
[   71.012603]  <TASK>
[   71.012886]  dump_stack_lvl+0x34/0x44
[   71.013379]  print_address_description.constprop.0.cold+0xab/0x36b
[   71.014182]  ? __bfq_deactivate_entity+0x21/0x290
[   71.014795]  ? __bfq_deactivate_entity+0x21/0x290
[   71.015398]  kasan_report.cold+0x83/0xdf
[   71.015904]  ? _raw_read_lock_bh+0x20/0x40
[   71.016433]  ? __bfq_deactivate_entity+0x21/0x290
[   71.017033]  __bfq_deactivate_entity+0x21/0x290
[   71.017617]  bfq_pd_offline+0xc1/0x110
[   71.018105]  blkcg_deactivate_policy+0x14b/0x210
[   71.018699]  bfq_exit_queue+0xe5/0x100
[   71.019189]  blk_mq_exit_sched+0x113/0x140
[   71.019720]  elevator_exit+0x30/0x50
[   71.020186]  blk_release_queue+0xa8/0x160
[   71.020702]  kobject_put+0xd4/0x270
[   71.021158]  disk_release+0xc5/0xf0
[   71.021609]  device_release+0x56/0xe0
[   71.022086]  kobject_put+0xd4/0x270
[   71.022546]  null_del_dev.part.0+0xe6/0x220 [null_blk]
[   71.023228]  null_exit+0x62/0xa6 [null_blk]
[   71.023792]  __x64_sys_delete_module+0x20a/0x2f0
[   71.024387]  ? __ia32_sys_delete_module+0x2f0/0x2f0
[   71.025008]  ? fpregs_assert_state_consistent+0x55/0x60
[   71.025674]  ? exit_to_user_mode_prepare+0x39/0x1e0
[   71.026304]  do_syscall_64+0x35/0x80
[   71.026763]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   71.027410] RIP: 0033:0x7fddec6bdfc7
[   71.027869] Code: 73 01 c3 48 8b 0d c1 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 8
[   71.030206] RSP: 002b:00007fff9486bc08 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[   71.031160] RAX: ffffffffffffffda RBX: 00007fff9486bc68 RCX: 00007fddec6bdfc7
[   71.032058] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055fc863ac258
[   71.032962] RBP: 000055fc863ac1f0 R08: 00007fff9486ab81 R09: 0000000000000000
[   71.033861] R10: 00007fddec730260 R11: 0000000000000206 R12: 00007fff9486be30
[   71.034758] R13: 00007fff9486c45b R14: 000055fc863ab010 R15: 000055fc863ac1f0
[   71.035659]  </TASK>
[   71.035946]
[   71.036149] Allocated by task 620:
[   71.036585]  kasan_save_stack+0x1e/0x40
[   71.037076]  __kasan_kmalloc+0x81/0xa0
[   71.037558]  bfq_pd_alloc+0x72/0x110
[   71.038015]  blkg_alloc+0x252/0x2c0
[   71.038467]  blkg_create+0x38e/0x560
[   71.038927]  bio_associate_blkg_from_css+0x3bc/0x490
[   71.039563]  bio_associate_blkg+0x3b/0x90
[   71.040076]  mpage_alloc+0x7c/0xe0
[   71.040516]  do_mpage_readpage+0xb42/0xff0
[   71.041040]  mpage_readahead+0x1fd/0x3f0
[   71.041544]  read_pages+0x144/0x770
[   71.041994]  page_cache_ra_unbounded+0x2d2/0x380
[   71.042586]  filemap_get_pages+0x1bc/0xaf0
[   71.043115]  filemap_read+0x1bf/0x5a0
[   71.043585]  aio_read+0x1ca/0x2f0
[   71.044014]  io_submit_one+0x874/0x11c0
[   71.044511]  __x64_sys_io_submit+0xfa/0x250
[   71.045048]  do_syscall_64+0x35/0x80
[   71.045513]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   71.046161]
[   71.046361] Freed by task 0:
[   71.046735]  kasan_save_stack+0x1e/0x40
[   71.047230]  kasan_set_track+0x21/0x30
[   71.047709]  kasan_set_free_info+0x20/0x30
[   71.048235]  __kasan_slab_free+0x103/0x180
[   71.048757]  kfree+0x9a/0x4b0
[   71.049144]  blkg_free.part.0+0x4a/0x90
[   71.049635]  rcu_do_batch+0x2e1/0x760
[   71.050111]  rcu_core+0x367/0x530
[   71.050536]  __do_softirq+0x119/0x3ba
[   71.051007]
[   71.051210] The buggy address belongs to the object at ffff88817a3dc000
[   71.051210]  which belongs to the cache kmalloc-2k of size 2048
[   71.052766] The buggy address is located 176 bytes inside of
[   71.052766]  2048-byte region [ffff88817a3dc000, ffff88817a3dc800)
[   71.054240] The buggy address belongs to the page:
[   71.054843] page:00000000ce62c7c2 refcount:1 mapcount:0 mapping:0000000000000000 index:0x08
[   71.056021] head:00000000ce62c7c2 order:3 compound_mapcount:0 compound_pincount:0
[   71.056961] flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
[   71.057894] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff888100042f00
[   71.058866] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[   71.059840] page dumped because: kasan: bad access detected
[   71.060543]
[   71.060742] Memory state around the buggy address:
[   71.061349]  ffff88817a3dbf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   71.062256]  ffff88817a3dc000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.063163] >ffff88817a3dc080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.064129]                                      ^
[   71.064734]  ffff88817a3dc100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.065639]  ffff88817a3dc180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   71.066544] ==================================================================

Here is the caller of __bfq_deactivate_entity:

(gdb) list *(bfq_pd_offline+0xc1)
0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942).
937                      * entities to the idle tree. It happens if, in some
938                      * of the calls to bfq_bfqq_move() performed by
939                      * bfq_reparent_active_queues(), the queue to move is
940                      * empty and gets expired.
941                      */
942                     bfq_flush_idle_tree(st);
943             }
944
945             __bfq_deactivate_entity(entity, false);
On Fri 07-01-22 17:15:43, yukuai (C) wrote: > 在 2022/01/05 22:36, Jan Kara 写道: > > Hello, > > > > here is the second version of my patches to fix use-after-free issues in BFQ > > when processes with merged queues get moved to different cgroups. The patches > > have survived some beating in my test VM but so far I fail to reproduce the > > original KASAN reports so testing from people who can reproduce them is most > > welcome. Thanks! > > > > Changes since v1: > > * Added fix for bfq_put_cooperator() > > * Added fix to handle move between cgroups in bfq_merge_bio() > > > > Honza > > Previous versions: > > Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1 > > . > > > > Hi, > > I repoduced the problem again with this patchset... Thanks for testing! > [ 71.004788] BUG: KASAN: use-after-free in > __bfq_deactivate_entity+0x21/0x290 > [ 71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801 > [ 71.007723] > [ 71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G W > 5.16.0-rc5-next-2021127 > [ 71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > ?-20190727_073836-4 > [ 71.012274] Call Trace: > [ 71.012603] <TASK> > [ 71.012886] dump_stack_lvl+0x34/0x44 > [ 71.013379] print_address_description.constprop.0.cold+0xab/0x36b > [ 71.014182] ? __bfq_deactivate_entity+0x21/0x290 > [ 71.014795] ? __bfq_deactivate_entity+0x21/0x290 > [ 71.015398] kasan_report.cold+0x83/0xdf > [ 71.015904] ? _raw_read_lock_bh+0x20/0x40 > [ 71.016433] ? __bfq_deactivate_entity+0x21/0x290 > [ 71.017033] __bfq_deactivate_entity+0x21/0x290 > [ 71.017617] bfq_pd_offline+0xc1/0x110 > [ 71.018105] blkcg_deactivate_policy+0x14b/0x210 ... > Here is the caller of __bfq_deactivate_entity: > (gdb) list *(bfq_pd_offline+0xc1) > 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942). > 937 * entities to the idle tree. It happens if, in some > 938 * of the calls to bfq_bfqq_move() performed by > 939 * bfq_reparent_active_queues(), the queue to move > is > 940 * empty and gets expired. > 941 */ > 942 bfq_flush_idle_tree(st); > 943 } > 944 > 945 __bfq_deactivate_entity(entity, false); So this is indeed strange. The group has in some of its idle service trees an entity whose entity->sched_data points to already freed cgroup. In particular it means the bfqq_entity_service_tree() leads to somewhere else than the 'st' we called bfq_flush_idle_tree() with. This looks like a different kind of problem AFAICT but still it needs fixing :). Can you please run your reproducer with my patches + the attached debug patch on top? Hopefully it should tell us more about the problematic entity and how it got there... Thanks! Honza
On Fri 07-01-22 15:58:53, Jan Kara wrote: > On Fri 07-01-22 17:15:43, yukuai (C) wrote: > > 在 2022/01/05 22:36, Jan Kara 写道: > > > Hello, > > > > > > here is the second version of my patches to fix use-after-free issues in BFQ > > > when processes with merged queues get moved to different cgroups. The patches > > > have survived some beating in my test VM but so far I fail to reproduce the > > > original KASAN reports so testing from people who can reproduce them is most > > > welcome. Thanks! > > > > > > Changes since v1: > > > * Added fix for bfq_put_cooperator() > > > * Added fix to handle move between cgroups in bfq_merge_bio() > > > > > > Honza > > > Previous versions: > > > Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1 > > > . > > > > > > > Hi, > > > > I repoduced the problem again with this patchset... > > Thanks for testing! > > > [ 71.004788] BUG: KASAN: use-after-free in > > __bfq_deactivate_entity+0x21/0x290 > > [ 71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801 > > [ 71.007723] > > [ 71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G W > > 5.16.0-rc5-next-2021127 > > [ 71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > ?-20190727_073836-4 > > [ 71.012274] Call Trace: > > [ 71.012603] <TASK> > > [ 71.012886] dump_stack_lvl+0x34/0x44 > > [ 71.013379] print_address_description.constprop.0.cold+0xab/0x36b > > [ 71.014182] ? __bfq_deactivate_entity+0x21/0x290 > > [ 71.014795] ? __bfq_deactivate_entity+0x21/0x290 > > [ 71.015398] kasan_report.cold+0x83/0xdf > > [ 71.015904] ? _raw_read_lock_bh+0x20/0x40 > > [ 71.016433] ? __bfq_deactivate_entity+0x21/0x290 > > [ 71.017033] __bfq_deactivate_entity+0x21/0x290 > > [ 71.017617] bfq_pd_offline+0xc1/0x110 > > [ 71.018105] blkcg_deactivate_policy+0x14b/0x210 > ... > > > Here is the caller of __bfq_deactivate_entity: > > (gdb) list *(bfq_pd_offline+0xc1) > > 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942). > > 937 * entities to the idle tree. It happens if, in some > > 938 * of the calls to bfq_bfqq_move() performed by > > 939 * bfq_reparent_active_queues(), the queue to move > > is > > 940 * empty and gets expired. > > 941 */ > > 942 bfq_flush_idle_tree(st); > > 943 } > > 944 > > 945 __bfq_deactivate_entity(entity, false); > > So this is indeed strange. The group has in some of its idle service trees > an entity whose entity->sched_data points to already freed cgroup. In > particular it means the bfqq_entity_service_tree() leads to somewhere else > than the 'st' we called bfq_flush_idle_tree() with. This looks like a > different kind of problem AFAICT but still it needs fixing :). Can you > please run your reproducer with my patches + the attached debug patch on > top? Hopefully it should tell us more about the problematic entity and how > it got there... Thanks! Forgot to attach the patch. Here it is... Honza
在 2022/01/07 22:58, Jan Kara 写道: > On Fri 07-01-22 17:15:43, yukuai (C) wrote: >> 在 2022/01/05 22:36, Jan Kara 写道: >>> Hello, >>> >>> here is the second version of my patches to fix use-after-free issues in BFQ >>> when processes with merged queues get moved to different cgroups. The patches >>> have survived some beating in my test VM but so far I fail to reproduce the >>> original KASAN reports so testing from people who can reproduce them is most >>> welcome. Thanks! >>> >>> Changes since v1: >>> * Added fix for bfq_put_cooperator() >>> * Added fix to handle move between cgroups in bfq_merge_bio() >>> >>> Honza >>> Previous versions: >>> Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1 >>> . >>> >> >> Hi, >> >> I repoduced the problem again with this patchset... > > Thanks for testing! > >> [ 71.004788] BUG: KASAN: use-after-free in >> __bfq_deactivate_entity+0x21/0x290 >> [ 71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801 >> [ 71.007723] >> [ 71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G W >> 5.16.0-rc5-next-2021127 >> [ 71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS >> ?-20190727_073836-4 >> [ 71.012274] Call Trace: >> [ 71.012603] <TASK> >> [ 71.012886] dump_stack_lvl+0x34/0x44 >> [ 71.013379] print_address_description.constprop.0.cold+0xab/0x36b >> [ 71.014182] ? __bfq_deactivate_entity+0x21/0x290 >> [ 71.014795] ? __bfq_deactivate_entity+0x21/0x290 >> [ 71.015398] kasan_report.cold+0x83/0xdf >> [ 71.015904] ? _raw_read_lock_bh+0x20/0x40 >> [ 71.016433] ? __bfq_deactivate_entity+0x21/0x290 >> [ 71.017033] __bfq_deactivate_entity+0x21/0x290 >> [ 71.017617] bfq_pd_offline+0xc1/0x110 >> [ 71.018105] blkcg_deactivate_policy+0x14b/0x210 > ... > >> Here is the caller of __bfq_deactivate_entity: >> (gdb) list *(bfq_pd_offline+0xc1) >> 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942). >> 937 * entities to the idle tree. It happens if, in some >> 938 * of the calls to bfq_bfqq_move() performed by >> 939 * bfq_reparent_active_queues(), the queue to move >> is >> 940 * empty and gets expired. >> 941 */ >> 942 bfq_flush_idle_tree(st); >> 943 } >> 944 >> 945 __bfq_deactivate_entity(entity, false); > > So this is indeed strange. The group has in some of its idle service trees > an entity whose entity->sched_data points to already freed cgroup. In > particular it means the bfqq_entity_service_tree() leads to somewhere else > than the 'st' we called bfq_flush_idle_tree() with. This looks like a > different kind of problem AFAICT but still it needs fixing :). Can you > please run your reproducer with my patches + the attached debug patch on > top? Hopefully it should tell us more about the problematic entity and how > it got there... Thanks! Hi, I'm not sure I understand what you mean... I reporduced again with your debug patch applied, however, no extra messages are printed. I think this is exactly the same problem we discussed before: 1) bfqq->new_bfqq is set, they are under g1 2) bfqq is moved to another group, and user thread of new_bfqq exit 3) g1 is offlied 3) io issued from bfqq will end up in new_bfqq, and the offlined g1 will be inserted to st of g1's parent. Currently such entity should be in active tree, however, I think it's prossible that it can end up in idle tree throuph deactivation of entity? 4) io is done, new_bfqq is deactivated 5) new_bfqq is freed, and then g1 is freed 6) the problem is triggered when g1's parent is offlined. Thanks, Kuai > > Honza >
在 2022/01/10 9:49, yukuai (C) 写道: > 在 2022/01/07 22:58, Jan Kara 写道: >> On Fri 07-01-22 17:15:43, yukuai (C) wrote: >>> 在 2022/01/05 22:36, Jan Kara 写道: >>>> Hello, >>>> >>>> here is the second version of my patches to fix use-after-free >>>> issues in BFQ >>>> when processes with merged queues get moved to different cgroups. >>>> The patches >>>> have survived some beating in my test VM but so far I fail to >>>> reproduce the >>>> original KASAN reports so testing from people who can reproduce them >>>> is most >>>> welcome. Thanks! >>>> >>>> Changes since v1: >>>> * Added fix for bfq_put_cooperator() >>>> * Added fix to handle move between cgroups in bfq_merge_bio() >>>> >>>> Honza >>>> Previous versions: >>>> Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1 >>>> . >>>> >>> >>> Hi, >>> >>> I repoduced the problem again with this patchset... >> >> Thanks for testing! >> >>> [ 71.004788] BUG: KASAN: use-after-free in >>> __bfq_deactivate_entity+0x21/0x290 >>> [ 71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801 >>> [ 71.007723] >>> [ 71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G W >>> 5.16.0-rc5-next-2021127 >>> [ 71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), >>> BIOS >>> ?-20190727_073836-4 >>> [ 71.012274] Call Trace: >>> [ 71.012603] <TASK> >>> [ 71.012886] dump_stack_lvl+0x34/0x44 >>> [ 71.013379] print_address_description.constprop.0.cold+0xab/0x36b >>> [ 71.014182] ? __bfq_deactivate_entity+0x21/0x290 >>> [ 71.014795] ? __bfq_deactivate_entity+0x21/0x290 >>> [ 71.015398] kasan_report.cold+0x83/0xdf >>> [ 71.015904] ? _raw_read_lock_bh+0x20/0x40 >>> [ 71.016433] ? __bfq_deactivate_entity+0x21/0x290 >>> [ 71.017033] __bfq_deactivate_entity+0x21/0x290 >>> [ 71.017617] bfq_pd_offline+0xc1/0x110 >>> [ 71.018105] blkcg_deactivate_policy+0x14b/0x210 >> ... >> >>> Here is the caller of __bfq_deactivate_entity: >>> (gdb) list *(bfq_pd_offline+0xc1) >>> 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942). >>> 937 * entities to the idle tree. It happens if, >>> in some >>> 938 * of the calls to bfq_bfqq_move() performed by >>> 939 * bfq_reparent_active_queues(), the queue to >>> move >>> is >>> 940 * empty and gets expired. >>> 941 */ >>> 942 bfq_flush_idle_tree(st); >>> 943 } >>> 944 >>> 945 __bfq_deactivate_entity(entity, false); >> >> So this is indeed strange. The group has in some of its idle service >> trees >> an entity whose entity->sched_data points to already freed cgroup. In >> particular it means the bfqq_entity_service_tree() leads to somewhere >> else >> than the 'st' we called bfq_flush_idle_tree() with. This looks like a >> different kind of problem AFAICT but still it needs fixing :). Can you >> please run your reproducer with my patches + the attached debug patch on >> top? Hopefully it should tell us more about the problematic entity and >> how >> it got there... Thanks! > > Hi, > > I'm not sure I understand what you mean... I reporduced again with your > debug patch applied, however, no extra messages are printed. > > I think this is exactly the same problem we discussed before: > > 1) bfqq->new_bfqq is set, they are under g1 > 2) bfqq is moved to another group, and user thread of new_bfqq exit > 3) g1 is offlied > 3) io issued from bfqq will end up in new_bfqq, and the offlined > g1 will be inserted to st of g1's parent. > > Currently such entity should be in active tree, however, I think it's > prossible that it can end up in idle tree throuph deactivation of > entity? 
>
> 4) io is done, new_bfqq is deactivated
> 5) new_bfqq is freed, and then g1 is freed
> 6) the problem is triggered when g1's parent is offlined.

Hi,

I applied the following modification and the problem can't be
reproduced anymore.

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 24a5c5329bcd..9b29af66635f 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -730,8 +730,19 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct bfq_data *bfqd,
 
 	if (sync_bfqq) {
 		entity = &sync_bfqq->entity;
-		if (entity->sched_data != &bfqg->sched_data)
+		if (entity->sched_data != &bfqg->sched_data) {
+			if (sync_bfqq->new_bfqq) {
+				bfq_put_queue(sync_bfqq->new_bfqq);
+				sync_bfqq->new_bfqq = NULL;
+			}
+
+			if (bfq_bfqq_coop(sync_bfqq)) {
+				bic->stably_merged = false;
+				bfq_mark_bfqq_split_coop(sync_bfqq);
+			}
+
 			bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+		}

I'm not sure if we can set sync_bfqq->new_bfqq = NULL here, however,
this can make sure the problem here really is what we discussed
before...

>
> Thanks,
> Kuai
>>
>> Honza
>>
On Tue 11-01-22 16:28:35, yukuai (C) wrote: > 在 2022/01/10 9:49, yukuai (C) 写道: > > 在 2022/01/07 22:58, Jan Kara 写道: > > > On Fri 07-01-22 17:15:43, yukuai (C) wrote: > > > > 在 2022/01/05 22:36, Jan Kara 写道: > > > > > Hello, > > > > > > > > > > here is the second version of my patches to fix > > > > > use-after-free issues in BFQ > > > > > when processes with merged queues get moved to different > > > > > cgroups. The patches > > > > > have survived some beating in my test VM but so far I fail > > > > > to reproduce the > > > > > original KASAN reports so testing from people who can > > > > > reproduce them is most > > > > > welcome. Thanks! > > > > > > > > > > Changes since v1: > > > > > * Added fix for bfq_put_cooperator() > > > > > * Added fix to handle move between cgroups in bfq_merge_bio() > > > > > > > > > > Honza > > > > > Previous versions: > > > > > Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1 > > > > > . > > > > > > > > > > > > > Hi, > > > > > > > > I repoduced the problem again with this patchset... > > > > > > Thanks for testing! > > > > > > > [ 71.004788] BUG: KASAN: use-after-free in > > > > __bfq_deactivate_entity+0x21/0x290 > > > > [ 71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801 > > > > [ 71.007723] > > > > [ 71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G W > > > > 5.16.0-rc5-next-2021127 > > > > [ 71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, > > > > 1996), BIOS > > > > ?-20190727_073836-4 > > > > [ 71.012274] Call Trace: > > > > [ 71.012603] <TASK> > > > > [ 71.012886] dump_stack_lvl+0x34/0x44 > > > > [ 71.013379] print_address_description.constprop.0.cold+0xab/0x36b > > > > [ 71.014182] ? __bfq_deactivate_entity+0x21/0x290 > > > > [ 71.014795] ? __bfq_deactivate_entity+0x21/0x290 > > > > [ 71.015398] kasan_report.cold+0x83/0xdf > > > > [ 71.015904] ? _raw_read_lock_bh+0x20/0x40 > > > > [ 71.016433] ? __bfq_deactivate_entity+0x21/0x290 > > > > [ 71.017033] __bfq_deactivate_entity+0x21/0x290 > > > > [ 71.017617] bfq_pd_offline+0xc1/0x110 > > > > [ 71.018105] blkcg_deactivate_policy+0x14b/0x210 > > > ... > > > > > > > Here is the caller of __bfq_deactivate_entity: > > > > (gdb) list *(bfq_pd_offline+0xc1) > > > > 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942). > > > > 937 * entities to the idle tree. It happens > > > > if, in some > > > > 938 * of the calls to bfq_bfqq_move() performed by > > > > 939 * bfq_reparent_active_queues(), the > > > > queue to move > > > > is > > > > 940 * empty and gets expired. > > > > 941 */ > > > > 942 bfq_flush_idle_tree(st); > > > > 943 } > > > > 944 > > > > 945 __bfq_deactivate_entity(entity, false); > > > > > > So this is indeed strange. The group has in some of its idle service > > > trees > > > an entity whose entity->sched_data points to already freed cgroup. In > > > particular it means the bfqq_entity_service_tree() leads to > > > somewhere else > > > than the 'st' we called bfq_flush_idle_tree() with. This looks like a > > > different kind of problem AFAICT but still it needs fixing :). Can you > > > please run your reproducer with my patches + the attached debug patch on > > > top? Hopefully it should tell us more about the problematic entity > > > and how > > > it got there... Thanks! > > > > Hi, > > > > I'm not sure I understand what you mean... I reporduced again with your > > debug patch applied, however, no extra messages are printed. 
> > > > I think this is exactly the same problem we discussed before: > > > > 1) bfqq->new_bfqq is set, they are under g1 > > 2) bfqq is moved to another group, and user thread of new_bfqq exit > > 3) g1 is offlied > > 3) io issued from bfqq will end up in new_bfqq, and the offlined > > g1 will be inserted to st of g1's parent. > > > > Currently such entity should be in active tree, however, I think it's > > prossible that it can end up in idle tree throuph deactivation of > > entity? > > > > 4) io is done, new_bfqq is deactivated > > 5) new_bfqq is freed, and then g1 is freed > > 6) the problem is triggered when g1's parent is offlined. > > Hi, > > I applied the following modification and the problem can't be > repoduced anymore. > > diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c > index 24a5c5329bcd..9b29af66635f 100644 > --- a/block/bfq-cgroup.c > +++ b/block/bfq-cgroup.c > @@ -730,8 +730,19 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct > bfq_data *bfqd, > > if (sync_bfqq) { > entity = &sync_bfqq->entity; > - if (entity->sched_data != &bfqg->sched_data) > + if (entity->sched_data != &bfqg->sched_data) { > + if (sync_bfqq->new_bfqq) { > + bfq_put_queue(sync_bfqq->new_bfqq); > + sync_bfqq->new_bfqq = NULL; > + } > + > + if (bfq_bfqq_coop(sync_bfqq)) { > + bic->stably_merged = false; > + bfq_mark_bfqq_split_coop(sync_bfqq); > + } > + > bfq_bfqq_move(bfqd, sync_bfqq, bfqg); > + } > > I'm not sure if we can set sync_bfqq->new_bfqq = NULL here, however, > this can make sure the problem here really is what we discussed > before... OK, if this fixed the problem for you then I'm wondering why WARN_ON_ONCE(sync_bfqq->new_bfqq); in my patch "bfq: Split shared queues on move between cgroups" didn't trigger? Now I see how sync_bfqq->new_bfqq can indeed be != NULL here so I agree with the change in principle (it has to use bfq_put_cooperator() though) but I'm wondering why the WARN_ON didn't trigger. Or you just didn't notice? Honza
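[Editor's note: for reference, a minimal sketch of the direction Jan suggests above, with the whole new_bfqq chain dropped via bfq_put_cooperator() rather than a single bfq_put_queue(), might look as follows. This is illustrative only, not the patch that was eventually posted, and it assumes bfq_put_cooperator() is made callable from bfq-cgroup.c (it is currently a local helper in bfq-iosched.c).]

```c
/*
 * Illustrative sketch only -- not the posted patch. Inside
 * __bfq_bic_change_cgroup(), when the queue is about to change groups,
 * abandon any scheduled queue merge and split a previously shared queue
 * before moving it.
 */
if (sync_bfqq) {
	entity = &sync_bfqq->entity;
	if (entity->sched_data != &bfqg->sched_data) {
		if (sync_bfqq->new_bfqq) {
			/* Drop the whole merge chain, not just its head. */
			bfq_put_cooperator(sync_bfqq);
			sync_bfqq->new_bfqq = NULL;
		}
		if (bfq_bfqq_coop(sync_bfqq)) {
			/* Undo a previous queue merge for this bic. */
			bic->stably_merged = false;
			bfq_mark_bfqq_split_coop(sync_bfqq);
		}
		bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
	}
}
```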
在 2022/01/11 20:36, Jan Kara 写道: > On Tue 11-01-22 16:28:35, yukuai (C) wrote: >> 在 2022/01/10 9:49, yukuai (C) 写道: >>> 在 2022/01/07 22:58, Jan Kara 写道: >>>> On Fri 07-01-22 17:15:43, yukuai (C) wrote: >>>>> 在 2022/01/05 22:36, Jan Kara 写道: >>>>>> Hello, >>>>>> >>>>>> here is the second version of my patches to fix >>>>>> use-after-free issues in BFQ >>>>>> when processes with merged queues get moved to different >>>>>> cgroups. The patches >>>>>> have survived some beating in my test VM but so far I fail >>>>>> to reproduce the >>>>>> original KASAN reports so testing from people who can >>>>>> reproduce them is most >>>>>> welcome. Thanks! >>>>>> >>>>>> Changes since v1: >>>>>> * Added fix for bfq_put_cooperator() >>>>>> * Added fix to handle move between cgroups in bfq_merge_bio() >>>>>> >>>>>> Honza >>>>>> Previous versions: >>>>>> Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1 >>>>>> . >>>>>> >>>>> >>>>> Hi, >>>>> >>>>> I repoduced the problem again with this patchset... >>>> >>>> Thanks for testing! >>>> >>>>> [ 71.004788] BUG: KASAN: use-after-free in >>>>> __bfq_deactivate_entity+0x21/0x290 >>>>> [ 71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801 >>>>> [ 71.007723] >>>>> [ 71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G W >>>>> 5.16.0-rc5-next-2021127 >>>>> [ 71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, >>>>> 1996), BIOS >>>>> ?-20190727_073836-4 >>>>> [ 71.012274] Call Trace: >>>>> [ 71.012603] <TASK> >>>>> [ 71.012886] dump_stack_lvl+0x34/0x44 >>>>> [ 71.013379] print_address_description.constprop.0.cold+0xab/0x36b >>>>> [ 71.014182] ? __bfq_deactivate_entity+0x21/0x290 >>>>> [ 71.014795] ? __bfq_deactivate_entity+0x21/0x290 >>>>> [ 71.015398] kasan_report.cold+0x83/0xdf >>>>> [ 71.015904] ? _raw_read_lock_bh+0x20/0x40 >>>>> [ 71.016433] ? __bfq_deactivate_entity+0x21/0x290 >>>>> [ 71.017033] __bfq_deactivate_entity+0x21/0x290 >>>>> [ 71.017617] bfq_pd_offline+0xc1/0x110 >>>>> [ 71.018105] blkcg_deactivate_policy+0x14b/0x210 >>>> ... >>>> >>>>> Here is the caller of __bfq_deactivate_entity: >>>>> (gdb) list *(bfq_pd_offline+0xc1) >>>>> 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942). >>>>> 937 * entities to the idle tree. It happens >>>>> if, in some >>>>> 938 * of the calls to bfq_bfqq_move() performed by >>>>> 939 * bfq_reparent_active_queues(), the >>>>> queue to move >>>>> is >>>>> 940 * empty and gets expired. >>>>> 941 */ >>>>> 942 bfq_flush_idle_tree(st); >>>>> 943 } >>>>> 944 >>>>> 945 __bfq_deactivate_entity(entity, false); >>>> >>>> So this is indeed strange. The group has in some of its idle service >>>> trees >>>> an entity whose entity->sched_data points to already freed cgroup. In >>>> particular it means the bfqq_entity_service_tree() leads to >>>> somewhere else >>>> than the 'st' we called bfq_flush_idle_tree() with. This looks like a >>>> different kind of problem AFAICT but still it needs fixing :). Can you >>>> please run your reproducer with my patches + the attached debug patch on >>>> top? Hopefully it should tell us more about the problematic entity >>>> and how >>>> it got there... Thanks! >>> >>> Hi, >>> >>> I'm not sure I understand what you mean... I reporduced again with your >>> debug patch applied, however, no extra messages are printed. 
>>> >>> I think this is exactly the same problem we discussed before: >>> >>> 1) bfqq->new_bfqq is set, they are under g1 >>> 2) bfqq is moved to another group, and user thread of new_bfqq exit >>> 3) g1 is offlied >>> 3) io issued from bfqq will end up in new_bfqq, and the offlined >>> g1 will be inserted to st of g1's parent. >>> >>> Currently such entity should be in active tree, however, I think it's >>> prossible that it can end up in idle tree throuph deactivation of >>> entity? >>> >>> 4) io is done, new_bfqq is deactivated >>> 5) new_bfqq is freed, and then g1 is freed >>> 6) the problem is triggered when g1's parent is offlined. >> >> Hi, >> >> I applied the following modification and the problem can't be >> repoduced anymore. >> >> diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c >> index 24a5c5329bcd..9b29af66635f 100644 >> --- a/block/bfq-cgroup.c >> +++ b/block/bfq-cgroup.c >> @@ -730,8 +730,19 @@ static struct bfq_group *__bfq_bic_change_cgroup(struct >> bfq_data *bfqd, >> >> if (sync_bfqq) { >> entity = &sync_bfqq->entity; >> - if (entity->sched_data != &bfqg->sched_data) >> + if (entity->sched_data != &bfqg->sched_data) { >> + if (sync_bfqq->new_bfqq) { >> + bfq_put_queue(sync_bfqq->new_bfqq); >> + sync_bfqq->new_bfqq = NULL; >> + } >> + >> + if (bfq_bfqq_coop(sync_bfqq)) { >> + bic->stably_merged = false; >> + bfq_mark_bfqq_split_coop(sync_bfqq); >> + } >> + >> bfq_bfqq_move(bfqd, sync_bfqq, bfqg); >> + } >> >> I'm not sure if we can set sync_bfqq->new_bfqq = NULL here, however, >> this can make sure the problem here really is what we discussed >> before... > > OK, if this fixed the problem for you then I'm wondering why > > WARN_ON_ONCE(sync_bfqq->new_bfqq); > > in my patch "bfq: Split shared queues on move between cgroups" didn't > trigger? Now I see how sync_bfqq->new_bfqq can indeed be != NULL here so I > agree with the change in principle (it has to use bfq_put_cooperator() > though) but I'm wondering why the WARN_ON didn't trigger. Or you just > didn't notice? Hi, When the repoducer start running, the WARN_ON() will be triggered immediately, sorry that the missing log lead to misunderstanding. Thanks, Kuai > > Honza >
On Mon 10-01-22 09:49:34, yukuai (C) wrote: > 在 2022/01/07 22:58, Jan Kara 写道: > > On Fri 07-01-22 17:15:43, yukuai (C) wrote: > > > 在 2022/01/05 22:36, Jan Kara 写道: > > > > Hello, > > > > > > > > here is the second version of my patches to fix use-after-free issues in BFQ > > > > when processes with merged queues get moved to different cgroups. The patches > > > > have survived some beating in my test VM but so far I fail to reproduce the > > > > original KASAN reports so testing from people who can reproduce them is most > > > > welcome. Thanks! > > > > > > > > Changes since v1: > > > > * Added fix for bfq_put_cooperator() > > > > * Added fix to handle move between cgroups in bfq_merge_bio() > > > > > > > > Honza > > > > Previous versions: > > > > Link: http://lore.kernel.org/r/20211223171425.3551-1-jack@suse.cz # v1 > > > > . > > > > > > > > > > Hi, > > > > > > I repoduced the problem again with this patchset... > > > > Thanks for testing! > > > > > [ 71.004788] BUG: KASAN: use-after-free in > > > __bfq_deactivate_entity+0x21/0x290 > > > [ 71.006328] Read of size 1 at addr ffff88817a3dc0b0 by task rmmod/801 > > > [ 71.007723] > > > [ 71.008068] CPU: 7 PID: 801 Comm: rmmod Tainted: G W > > > 5.16.0-rc5-next-2021127 > > > [ 71.009995] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > > ?-20190727_073836-4 > > > [ 71.012274] Call Trace: > > > [ 71.012603] <TASK> > > > [ 71.012886] dump_stack_lvl+0x34/0x44 > > > [ 71.013379] print_address_description.constprop.0.cold+0xab/0x36b > > > [ 71.014182] ? __bfq_deactivate_entity+0x21/0x290 > > > [ 71.014795] ? __bfq_deactivate_entity+0x21/0x290 > > > [ 71.015398] kasan_report.cold+0x83/0xdf > > > [ 71.015904] ? _raw_read_lock_bh+0x20/0x40 > > > [ 71.016433] ? __bfq_deactivate_entity+0x21/0x290 > > > [ 71.017033] __bfq_deactivate_entity+0x21/0x290 > > > [ 71.017617] bfq_pd_offline+0xc1/0x110 > > > [ 71.018105] blkcg_deactivate_policy+0x14b/0x210 > > ... > > > > > Here is the caller of __bfq_deactivate_entity: > > > (gdb) list *(bfq_pd_offline+0xc1) > > > 0xffffffff81c504f1 is in bfq_pd_offline (block/bfq-cgroup.c:942). > > > 937 * entities to the idle tree. It happens if, in some > > > 938 * of the calls to bfq_bfqq_move() performed by > > > 939 * bfq_reparent_active_queues(), the queue to move > > > is > > > 940 * empty and gets expired. > > > 941 */ > > > 942 bfq_flush_idle_tree(st); > > > 943 } > > > 944 > > > 945 __bfq_deactivate_entity(entity, false); > > > > So this is indeed strange. The group has in some of its idle service trees > > an entity whose entity->sched_data points to already freed cgroup. In > > particular it means the bfqq_entity_service_tree() leads to somewhere else > > than the 'st' we called bfq_flush_idle_tree() with. This looks like a > > different kind of problem AFAICT but still it needs fixing :). Can you > > please run your reproducer with my patches + the attached debug patch on > > top? Hopefully it should tell us more about the problematic entity and how > > it got there... Thanks! > > Hi, > > I'm not sure I understand what you mean... I reporduced again with your > debug patch applied, however, no extra messages are printed. > > I think this is exactly the same problem we discussed before: > > 1) bfqq->new_bfqq is set, they are under g1 > 2) bfqq is moved to another group, and user thread of new_bfqq exit > 3) g1 is offlied > 3) io issued from bfqq will end up in new_bfqq, and the offlined > g1 will be inserted to st of g1's parent. Hmm, you are right. 
I was confused by the fact that bfq_setup_merge() is always immediately
(under the big bfq lock) followed by bfq_merge_bfqqs(), but that redirects
the BIC of just one process to the new bfqq. So the following is a bit more
detailed and graphical version of your scenario for future reference :):

Initial state, bfqq2 is shared by Process 2 & Process 3:

Process 1 (blkcg1)    Process 2 (blkcg1)    Process 3 (blkcg1)
     (BIC)                 (BIC)                 (BIC)
       |                     |                     |
       v                     v                     |
     bfqq1                 bfqq2<------------------/
        \                   /
         \                 /
          \               /
           -> BFQ group1 <-

Now, while processing a request for Process 2, we decide to merge bfqq2
into bfqq1. Situation after the merge:

Process 1 (blkcg1)    Process 2 (blkcg1)    Process 3 (blkcg1)
     (BIC)                 (BIC)                 (BIC)
       |                     |                     |
       |    /----------------/                     |
       v   v      new_bfqq                         |
     bfqq1<----------------bfqq2<------------------/
        \                   /
         \                 /
          \               /
           -> BFQ group1 <-

Processes 1 and 2 exit:

                                            Process 3 (blkcg1)
                                                 (BIC)
                                                   |
                  new_bfqq                         |
     bfqq1<----------------bfqq2<------------------/
        \                   /
         \                 /
          \               /
           -> BFQ group1 <-

Process 3 is moved to blkcg2 and submits IO, blkcg1 is offlined.
bfq_bic_update_cgroup() will change the picture to:

                                            Process 3 (blkcg2)
                                                 (BIC)
                                                   |
                  new_bfqq                         |
     bfqq1<----------------bfqq2<------------------/
       |                     |
       v                     v
  BFQ group1            BFQ group2

and the following bfq_merge_bfqqs() when submitting the request will
further modify the picture to:

          Process 3 (blkcg2)
               (BIC)
                 |
       /---------/
       v          new_bfqq
     bfqq1<----------------bfqq2
       |                     |
       v                     v
  BFQ group1            BFQ group2

and boom, we queue again the offlined BFQ group1.

                                                                Honza
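[Editor's note: the failure mode drawn above boils down to bfq_merge_bfqqs() redirecting a bic to a new_bfqq whose group may already have been offlined. One illustrative way to close that hole -- an assumption here, not necessarily the fix adopted by this series -- is to refuse to set up a merge between queues whose entities no longer share a parent, i.e. which now live in different groups. The helper below is hypothetical:]

```c
/*
 * Illustrative sketch (hypothetical helper, not from this series):
 * refuse cross-group queue merges so that bfqq->new_bfqq can never
 * point into a group the process has already left.
 */
static bool bfq_queues_may_merge(struct bfq_queue *bfqq,
				 struct bfq_queue *new_bfqq)
{
	/*
	 * Queues in the same bfq_group share the group entity as their
	 * parent; a mismatch means one of them was moved to another
	 * cgroup, and merging them could re-activate an already
	 * offlined group.
	 */
	return bfqq->entity.parent == new_bfqq->entity.parent;
}
```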