
[RFC] fix use after free in xlog_wait()

Message ID 20200611013952.2589997-1-yukuai3@huawei.com (mailing list archive)
State New, archived
Series [RFC] fix use after free in xlog_wait()

Commit Message

Yu Kuai June 11, 2020, 1:39 a.m. UTC
I recently hit a use-after-free (UAF) while running generic/019 in qemu:

==================================================================
  BUG: KASAN: use-after-free in __lock_acquire+0x4508/0x68c0
  Read of size 8 at addr ffff88811327f080 by task fio/11147

  CPU: 6 PID: 11147 Comm: fio Tainted: G        W         5.7.0-next-20200602+ #8
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
  Call Trace:
   dump_stack+0xf6/0x16e
   ? __lock_acquire+0x4508/0x68c0
   ? __lock_acquire+0x4508/0x68c0
   print_address_description.constprop.0+0x1a/0x210
   ? __lock_acquire+0x4508/0x68c0
   kasan_report.cold+0x1f/0x37
   ? lockdep_hardirqs_on_prepare+0x480/0x550
   ? __lock_acquire+0x4508/0x68c0
   __lock_acquire+0x4508/0x68c0
   ? print_usage_bug+0x1f0/0x1f0
   ? finish_task_switch+0x126/0x5e0
   ? lockdep_hardirqs_on_prepare+0x550/0x550
   ? mark_held_locks+0x9e/0xe0
   ? __schedule+0x801/0x1d90
   ? _raw_spin_unlock_irq+0x1f/0x30
   lock_acquire+0x182/0x790
   ? remove_wait_queue+0x1d/0x180
   ? __switch_to_asm+0x42/0x70
   ? lock_release+0x710/0x710
   ? __schedule+0x85c/0x1d90
   ? xfs_log_commit_cil+0x1d8e/0x2a50
   ? __sched_text_start+0x8/0x8
   _raw_spin_lock_irqsave+0x32/0x50
   ? remove_wait_queue+0x1d/0x180
   remove_wait_queue+0x1d/0x180
   xfs_log_commit_cil+0x1d9e/0x2a50
   ? xlog_cil_empty+0x90/0x90
   ? wake_up_q+0x140/0x140
   ? rcu_read_lock_sched_held+0x9c/0xd0
   ? rcu_read_lock_bh_held+0xb0/0xb0
   __xfs_trans_commit+0x292/0xec0
   ? xfs_trans_unreserve_and_mod_sb+0xab0/0xab0
   ? rcu_read_lock_bh_held+0xb0/0xb0
   ? xfs_isilocked+0x87/0x2e0
   ? xfs_trans_log_inode+0x1ad/0x480
   xfs_vn_update_time+0x3eb/0x6d0
   ? xfs_setattr_mode.isra.0+0xa0/0xa0
   ? current_time+0xa8/0x110
   ? timestamp_truncate+0x2f0/0x2f0
   ? xfs_setattr_mode.isra.0+0xa0/0xa0
   update_time+0x70/0xc0
   file_update_time+0x2b7/0x490
   ? update_time+0xc0/0xc0
   ? __sb_start_write+0x197/0x3e0
   __xfs_filemap_fault.constprop.0+0x1b7/0x480
   do_page_mkwrite+0x1ac/0x470
   do_wp_page+0x9e2/0x1b10
   ? do_raw_spin_lock+0x121/0x290
   ? finish_mkwrite_fault+0x4a0/0x4a0
   ? rwlock_bug.part.0+0x90/0x90
   ? handle_mm_fault+0xa81/0x3570
   handle_mm_fault+0x1c65/0x3570
   ? __pmd_alloc+0x4c0/0x4c0
   ? vmacache_find+0x55/0x2a0
   do_user_addr_fault+0x635/0xd42
   exc_page_fault+0xdd/0x5b0
   ? asm_common_interrupt+0x8/0x40
   ? asm_exc_page_fault+0x8/0x30
   asm_exc_page_fault+0x1e/0x30
  RIP: 0033:0x7f40e022336a
  Code: Bad RIP value.
  RSP: 002b:00007ffedefb0218 EFLAGS: 00010206
  RAX: 00007f40b7a5a000 RBX: 0000000002562280 RCX: 00000000025633d0
  RDX: 0000000000000fc0 RSI: 0000000002562420 RDI: 00007f40b7a5a000
  RBP: 00007f40b8620190 R08: 0000000000000000 R09: 00007f40b7a5aff0
  R10: 00007ffedeff8000 R11: 00007f40b7a5aff0 R12: 0000000000000001
  R13: 0000000000001000 R14: 00000000025622a8 R15: 00007f40b8620198

  Allocated by task 6826:
   save_stack+0x1b/0x40
   __kasan_kmalloc.constprop.0+0xc2/0xd0
   kmem_alloc+0x154/0x450
   xlog_cil_push_work+0xff/0x1250
   process_one_work+0xa3e/0x17a0
   worker_thread+0x8e2/0x1050
   kthread+0x355/0x470
   ret_from_fork+0x22/0x30

   Freed by task 6826:
   save_stack+0x1b/0x40
   __kasan_slab_free+0x12c/0x170
   kfree+0xd6/0x300
   kvfree+0x42/0x50
   xlog_cil_committed+0xa9c/0xf30
   xlog_cil_push_work+0xa8c/0x1250
   process_one_work+0xa3e/0x17a0
   worker_thread+0x8e2/0x1050
   kthread+0x355/0x470
   ret_from_fork+0x22/0x30

  The buggy address belongs to the object at ffff88811327f000
   which belongs to the cache kmalloc-256 of size 256
  The buggy address is located 128 bytes inside of
   256-byte region [ffff88811327f000, ffff88811327f100)
  The buggy address belongs to the page:
  page:ffffea00044c9f00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 head:ffffea00044c9f00 order:2 compound_mapcount:0 compound_pincount:0
  flags: 0x200000000010200(slab|head)
  raw: 0200000000010200 dead000000000100 dead000000000122 ffff88811a40e800
  raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff88811327ef80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   ffff88811327f000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  >ffff88811327f080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                     ^
   ffff88811327f100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   ffff88811327f180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ==================================================================

I think the reason is that 'ctx' is freed in xlog_cil_committed() while
a previous call to xlog_wait(&cil->xc_ctx->push_wait, ...) has not yet
finished. When remove_wait_queue() then runs, it triggers a UAF because
'ctx', which embeds the wait queue head, has already been freed:

thread1		    thread2             thread3

__xfs_trans_commit
 xfs_log_commit_cil
  xlog_wait
   schedule
                    xlog_cil_push_work
		     wake_up_all
		                        xlog_cil_committed
					 kmem_free
   remove_wait_queue
    spin_lock_irqsave --> UAF
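For reference, xlog_wait() open-codes the sleep; the sketch below approximates its definition in fs/xfs/xfs_log_priv.h (reproduced from memory, so treat details as illustrative). The waiter dereferences the wait queue head again in remove_wait_queue() after schedule() returns, and that is the access that hits freed memory:

```c
/* Approximate excerpt of fs/xfs/xfs_log_priv.h at the time of this report. */
static inline void
xlog_wait(
	struct wait_queue_head	*wq,
	struct spinlock		*lock)
		__releases(lock)
{
	DECLARE_WAITQUEUE(wait, current);

	add_wait_queue_exclusive(wq, &wait);
	__set_current_state(TASK_UNINTERRUPTIBLE);
	spin_unlock(lock);
	schedule();
	remove_wait_queue(wq, &wait);	/* touches *wq after the wakeup */
}
```

Because the wait queue entry lives on the waiter's stack, the waiter must come back and unlink it itself; a wake_up_all() alone does not mean the waiter is done with the wait queue head.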

I tried to fix the problem by using autoremove_wake_function() in
xlog_wait(); however, that triggers a soft lockup instead.

Instead, make sure waitqueue_active(&ctx->push_wait) returns false
before freeing 'ctx'.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 fs/xfs/xfs_log_cil.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dave Chinner June 11, 2020, 2:28 a.m. UTC | #1
On Thu, Jun 11, 2020 at 09:39:52AM +0800, Yu Kuai wrote:
> I recently hit a use-after-free (UAF) while running generic/019 in qemu:
> 
> ==================================================================
>   BUG: KASAN: use-after-free in __lock_acquire+0x4508/0x68c0
>   Read of size 8 at addr ffff88811327f080 by task fio/11147
....
>    remove_wait_queue+0x1d/0x180
>    xfs_log_commit_cil+0x1d9e/0x2a50
>    __xfs_trans_commit+0x292/0xec0

Ok, so this is waking up from the CIL context overrunning the hard
size limit....

>    Freed by task 6826:
>    save_stack+0x1b/0x40
>    __kasan_slab_free+0x12c/0x170
>    kfree+0xd6/0x300
>    kvfree+0x42/0x50
>    xlog_cil_committed+0xa9c/0xf30
>    xlog_cil_push_work+0xa8c/0x1250
>    process_one_work+0xa3e/0x17a0
>    worker_thread+0x8e2/0x1050
>    kthread+0x355/0x470
>    ret_from_fork+0x22/0x30

Hmmmm. The CIL push work freed the context, which means something
went wrong somewhere - we must be in the CIL commit error path here...

/me checks generic/019

Oh, it's a repeated shutdown test. Right, so we're getting a
shutdown in the middle of a CIL push when the CIL is hard throttling
callers and the CIL context gets freed before the throttled tasks
can be woken.

Gotcha. Yup, that's a real issue, thanks for reporting it!

> I think the reason is that 'ctx' is freed in xlog_cil_committed() while
> a previous call to xlog_wait(&cil->xc_ctx->push_wait, ...) has not yet
> finished. When remove_wait_queue() then runs, it triggers a UAF because
> 'ctx', which embeds the wait queue head, has already been freed:
> 
> thread1		    thread2             thread3
> 
> __xfs_trans_commit
>  xfs_log_commit_cil
>   xlog_wait
>    schedule
>                     xlog_cil_push_work
> 		     wake_up_all
> 		                        xlog_cil_committed
> 					 kmem_free
>    remove_wait_queue
>     spin_lock_irqsave --> UAF

Actually, it's a lot simpler:

thread1			thread2

__xfs_trans_commit
 xfs_log_commit_cil
  xlog_wait
   schedule
			xlog_cil_push_work
			wake_up_all
			<shutdown aborts commit>
			xlog_cil_committed
			kmem_free

   remove_wait_queue
    spin_lock_irqsave --> UAF

> Instead, make sure waitqueue_active(&ctx->push_wait) returns false
> before freeing 'ctx'.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  fs/xfs/xfs_log_cil.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index b43f0e8f43f2..59b21485b0fc 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -607,7 +607,7 @@ xlog_cil_committed(
>  
>  	if (!list_empty(&ctx->busy_extents))
>  		xlog_discard_busy_extents(mp, ctx);
> -	else
> +	else if (!waitqueue_active(&ctx->push_wait))
>  		kmem_free(ctx);

That will just leak the memory instead, which is no better.

Let me go write a patch to fix this.

Cheers,

Dave.
Yu Kuai June 11, 2020, 3:01 a.m. UTC | #2
On 2020/6/11 10:28, Dave Chinner wrote:
> Actually, it's a lot simpler:
> 
> thread1			thread2
> 
> __xfs_trans_commit
>   xfs_log_commit_cil
>    xlog_wait
>     schedule
> 			xlog_cil_push_work
> 			wake_up_all
> 			<shutdown aborts commit>
> 			xlog_cil_committed
> 			kmem_free
> 
>     remove_wait_queue
>      spin_lock_irqsave --> UAF
> 

It's true in this case; however, I got another result when I tried to
reproduce it, which suggests 'ctx' can also be freed in a different
path:

[   64.975996] run fstests generic/019 at 2020-06-10 16:13:44
[   69.126208] xfs filesystem being mounted at /tmp/scratch supports 
timestamps until 2038 (0x7fffffff)
[   99.166846] XFS (sdb): log I/O error -5
[   99.170111] XFS (sdb): Log I/O Error Detected. Shutting down filesystem
[   99.171071] XFS (sdb): Please unmount the filesystem and rectify the 
problem(s)
[   99.179569] ------------[ cut here ]------------
[   99.180432] WARNING: CPU: 7 PID: 2705 at fs/iomap/buffered-io.c:1030 
iomap_page_mkwrite_actor+0x17d/0x1b0
[   99.181273] 
==================================================================
[   99.181758] Modules linked in:
[   99.182806] BUG: KASAN: use-after-free in do_raw_spin_trylock+0x67/0x180
[   99.183255] CPU: 7 PID: 2705 Comm: fio Not tainted 
5.7.0-next-20200602+ #29
[   99.184166] Read of size 4 at addr ffff888115f28868 by task fio/2704
[   99.184171]
[   99.185142] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 
04/01/2014
[   99.185995] CPU: 6 PID: 2704 Comm: fio Not tainted 
5.7.0-next-20200602+ #29
[   99.186227] RIP: 0010:iomap_page_mkwrite_actor+0x17d/0x1b0
[   99.188199] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 
04/01/2014
[   99.189178] Code: 89 ef e8 a6 d4 c7 ff e9 3f ff ff ff e8 fc 64 ad ff 
89 da 31 f6 48 89 ef e8 b0 1e f2 ff 49 89 dc e9 26 ff ff ff e8 e3 64 ad 
ff <0f> 0b eb be e8 da 64 ad ff 4d 8d 67 ffe
[   99.189899] Call Trace:
[   99.191748] RSP: 0000:ffff88810599fa18 EFLAGS: 00010293
[   99.194218]  dump_stack+0xf6/0x16e
[   99.194574] RAX: ffff888106df3680 RBX: 0000000000001000 RCX: 
ffffffff94338cad
[   99.195301]  ? do_raw_spin_trylock+0x67/0x180
[   99.195777] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000001
[   99.195786] RBP: ffffea0002544800 R08: ffff888106df3680 R09: 
fffff940004a8901
[   99.196764]  ? do_raw_spin_trylock+0x67/0x180
[   99.196778]  print_address_description.constprop.0.cold+0xd3/0x415
[   99.197393] R10: ffffea0002544807 R11: fffff940004a8900 R12: 
0000000000000000
[   99.198380]  ? do_raw_spin_trylock+0x67/0x180
[   99.199349] R13: 0000000000000000 R14: ffff8880b6ee7280 R15: 
ffffea00025447c8
[   99.199939]  kasan_report.cold+0x1f/0x37
[   99.199949]  ? __switch_to+0x510/0xef0
[   99.200812] FS:  00007ff7d8562740(0000) GS:ffff88811a400000(0000) 
knlGS:0000000000000000
[   99.201755]  ? do_raw_spin_trylock+0x67/0x180
[   99.201765]  check_memory_region+0x14e/0x1b0
[   99.202378] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   99.203319]  do_raw_spin_trylock+0x67/0x180
[   99.203858] CR2: 00007ff7ae3e4ff0 CR3: 00000001028be000 CR4: 
00000000000006e0
[   99.204374]  ? do_raw_spin_lock+0x290/0x290
[   99.205470] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[   99.206069]  _raw_spin_lock_irqsave+0x44/0x80
[   99.206641] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[   99.206647] Call Trace:
[   99.207420]  ? remove_wait_queue+0x22/0x190
[   99.208014]  iomap_apply+0x2d7/0xc00
[   99.208949]  remove_wait_queue+0x22/0x190
[   99.209539]  ? iomap_page_create+0x350/0x350
[   99.210510]  xfs_log_commit_cil+0x1c7e/0x2940
[   99.211115]  ? trace_event_raw_event_iomap_apply+0x430/0x430
[   99.212073]  ? xlog_cil_empty+0x90/0x90
[   99.212421]  ? down_write_trylock+0x2f0/0x2f0
[   99.212979]  ? check_flags.part.0+0x430/0x430
[   99.213486]  ? update_time+0xc0/0xc0
[   99.214025]  ? wake_up_q+0x140/0x140
[   99.214610]  iomap_page_mkwrite+0x26a/0x3b0
[   99.215201]  ? xlog_ticket_alloc+0x3e0/0x3e0
[   99.215966]  ? iomap_page_create+0x350/0x350
[   99.216487]  ? __kasan_kmalloc.constprop.0+0xc2/0xd0
[   99.217097]  __xfs_filemap_fault.constprop.0+0x32f/0x4e0
[   99.217675]  __xfs_trans_commit+0x2b3/0xf20
[   99.218206]  do_page_mkwrite+0x1b1/0x470
[   99.218681]  ? xfs_trans_unreserve_and_mod_sb+0xab0/0xab0
[   99.219259]  do_wp_page+0x9e7/0x1b10
[   99.219838]  ? xfs_isilocked+0x8c/0x2f0
[   99.220439]  ? finish_mkwrite_fault+0x4b0/0x4b0
[   99.221102]  ? xfs_trans_log_inode+0x1b2/0x480
[   99.221817]  ? do_user_addr_fault+0x2fd/0xd42
[   99.222382]  xfs_vn_update_time+0x40a/0x730
[   99.222922]  handle_mm_fault+0x1c9f/0x3600
[   99.223634]  ? xfs_setattr_mode.isra.0+0xb0/0xb0
[   99.224140]  ? __pmd_alloc+0x390/0x390
[   99.224653]  ? current_time+0xad/0x110
[   99.225278]  ? vmacache_find+0x5a/0x2a0
[   99.225861]  ? timestamp_truncate+0x2f0/0x2f0
[   99.226461]  do_user_addr_fault+0x635/0xd42
[   99.227019]  ? xfs_setattr_mode.isra.0+0xb0/0xb0
[   99.227029]  update_time+0x75/0xc0
[   99.227760]  exc_page_fault+0x12a/0x6f0
[   99.228377]  file_update_time+0x2bc/0x490
[   99.228889]  ? asm_exc_page_fault+0x8/0x30
[   99.229414]  ? update_time+0xc0/0xc0
[   99.229933]  asm_exc_page_fault+0x1e/0x30
[   99.230515]  ? __sb_start_write+0x225/0x3f0
[   99.231099] RIP: 0033:0x7ff7d709636a
[   99.231743]  ? __sb_start_write+0x1a3/0x3f0
[   99.231755]  __xfs_filemap_fault.constprop.0+0x313/0x4e0
[   99.232230] Code: Bad RIP value.
[   99.232747]  do_page_mkwrite+0x1b1/0x470
[   99.233296] RSP: 002b:00007ffd89c6e2a8 EFLAGS: 00010206
[   99.233841]  do_wp_page+0x9e7/0x1b10
[   99.234333] RAX: 00007ff7ae3e4000 RBX: 000000000159c280 RCX: 
000000000159d3d0
[   99.234863]  ? finish_mkwrite_fault+0x4b0/0x4b0
[   99.235432] RDX: 0000000000000fc0 RSI: 000000000159c420 RDI: 
00007ff7ae3e4000
[   99.235906]  ? _raw_spin_unlock_irq+0x24/0x30
[   99.236485] RBP: 00007ff7af493190 R08: 0000000000000000 R09: 
00007ff7ae3e4ff0
[   99.237198]  handle_mm_fault+0x1c9f/0x3600
[   99.237638] R10: 00007ffd89d04000 R11: 00007ff7ae3e4ff0 R12: 
0000000000000001
[   99.238163]  ? __pmd_alloc+0x390/0x390
[   99.238861] R13: 0000000000001000 R14: 000000000159c2a8 R15: 
00007ff7af493198
[   99.238874] irq event stamp: 0
[   99.239368]  ? vmacache_find+0x5a/0x2a0
[   99.240324] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[   99.240946]  do_user_addr_fault+0x635/0xd42
[   99.241903] hardirqs last disabled at (0): [<ffffffff93b3715c>] 
copy_process+0x182c/0x64b0
[   99.242488]  exc_page_fault+0x12a/0x6f0
[   99.243447] softirqs last  enabled at (0): [<ffffffff93b371fd>] 
copy_process+0x18cd/0x64b0
[   99.244001]  ? asm_exc_page_fault+0x8/0x30
[   99.244959] softirqs last disabled at (0): [<0000000000000000>] 0x0
[   99.244965] ---[ end trace a1aa182be3fbd861 ]---
[   99.245482]  asm_exc_page_fault+0x1e/0x30
[   99.254277] RIP: 0033:0x7ff7d709636a
[   99.254749] Code: Bad RIP value.
[   99.255182] RSP: 002b:00007ffd89c6e2a8 EFLAGS: 00010206
[   99.255879] RAX: 00007ff7ae7f5000 RBX: 000000000159c280 RCX: 
000000000159d3d0
[   99.256815] RDX: 0000000000000fc0 RSI: 000000000159c420 RDI: 
00007ff7ae7f5000
[   99.257760] RBP: 00007ff7af47a3a0 R08: 0000000000000000 R09: 
00007ff7ae7f5ff0
[   99.258720] R10: 00007ffd89d04000 R11: 00007ff7ae7f5ff0 R12: 
0000000000000001
[   99.259727] R13: 0000000000001000 R14: 000000000159c2a8 R15: 
00007ff7af47a3a8
[   99.260722]
[   99.260934] Allocated by task 986:
[   99.261403]  save_stack+0x1b/0x40
[   99.261843]  __kasan_kmalloc.constprop.0+0xc2/0xd0
[   99.262483]  __kmalloc+0x179/0x3e0
[   99.262937]  kmem_alloc+0x175/0x4a0
[   99.263416]  xlog_cil_push_work+0x104/0x1250
[   99.264038]  process_one_work+0xa62/0x1840
[   99.264583]  worker_thread+0xa3/0x1050
[   99.265087]  kthread+0x35a/0x470
[   99.265520]  ret_from_fork+0x22/0x30
[   99.266021]
[   99.266238] Freed by task 1123:
[   99.266658]  save_stack+0x1b/0x40
[   99.267095]  __kasan_slab_free+0x117/0x160
[   99.267652]  kfree+0xdf/0x320
[   99.268070]  kvfree+0x47/0x50
[   99.268500]  xlog_cil_committed+0xaf0/0xf80
[   99.269104]  xlog_cil_process_committed+0xfe/0x1e0
[   99.269743]  xlog_state_do_callback+0x513/0x9b0
[   99.270335]  xfs_log_force_umount+0x2d6/0x4a0
[   99.270925]  xfs_do_force_shutdown+0xa2/0x220
[   99.271514]  xlog_ioend_work+0x13a/0x240
[   99.272042]  process_one_work+0xa62/0x1840
[   99.272582]  worker_thread+0xa3/0x1050
[   99.273104]  kthread+0x35a/0x470
[   99.273527]  ret_from_fork+0x22/0x30
[   99.274008]
[   99.274221] The buggy address belongs to the object at ffff888115f28800
[   99.274221]  which belongs to the cache kmalloc-512 of size 512
[   99.275821] The buggy address is located 104 bytes inside of
[   99.275821]  512-byte region [ffff888115f28800, ffff888115f28a00)
[   99.277477] The buggy address belongs to the page:
[   99.278121] page:ffffea000457ca00 refcount:1 mapcount:0 
mapping:0000000000000000 index:0x0 head:ffffea000457ca00 order:3 
compound_mapcount:0 compound_pincount:0
[   99.279988] flags: 0x200000000010200(slab|head)
[   99.280582] raw: 0200000000010200 ffffea0004156c00 0000000400000004 
ffff888117c0e580
[   99.281615] raw: 0000000000000000 0000000080200020 00000001ffffffff 
0000000000000000
[   99.282615] page dumped because: kasan: bad access detected
[   99.283364]
[   99.283585] Memory state around the buggy address:
[   99.284227]  ffff888115f28700: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   99.285206]  ffff888115f28780: fc fc fc fc fc fc fc fc fc fc fc fc fc 
fc fc fc
[   99.286164] >ffff888115f28800: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   99.287116]                                                           ^
[   99.287962]  ffff888115f28880: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   99.288915]  ffff888115f28900: fb fb fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb
[   99.289874] 
==================================================================

Thanks!
Yu Kuai
Dave Chinner June 11, 2020, 5:05 a.m. UTC | #3
On Thu, Jun 11, 2020 at 11:01:38AM +0800, yukuai (C) wrote:
> On 2020/6/11 10:28, Dave Chinner wrote
> > Actually, it's a lot simpler:
> > 
> > thread1			thread2
> > 
> > __xfs_trans_commit
> >   xfs_log_commit_cil
> >    xlog_wait
> >     schedule
> > 			xlog_cil_push_work
> > 			wake_up_all
> > 			<shutdown aborts commit>
> > 			xlog_cil_committed
> > 			kmem_free
> > 
> >     remove_wait_queue
> >      spin_lock_irqsave --> UAF
> > 
> 
> It's true in this case; however, I got another result when I tried to
> reproduce it, which suggests 'ctx' can also be freed in a different
> path:

Yup, it's effectively the same thing because of the nature of the IO
failures (generated at submit time) and the scheduler behaviour of
workqueues. This means the IO completion that processes the error is
queued to a workqueue on the same CPU. When thread 2 finishes running
(it hasn't seen an error yet), the completion work gets scheduled
ahead of thread 1 (cpu bound kernel task vs unbound user task). The
completion work then runs the shutdown because it saw a log IO error,
and because it's the commit record bio it runs the journal checkpoint
completion to abort all the items attached to it and free the CIL
context. Then thread 1 runs again.

The only difference between the two cases is which IO of the CIL
commit the request was failed on....

Cheers,

Dave.

Patch

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index b43f0e8f43f2..59b21485b0fc 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -607,7 +607,7 @@  xlog_cil_committed(
 
 	if (!list_empty(&ctx->busy_extents))
 		xlog_discard_busy_extents(mp, ctx);
-	else
+	else if (!waitqueue_active(&ctx->push_wait))
 		kmem_free(ctx);
 }