fix syzkaller task hung in exit_aio

Message ID 1551880403-132638-1-git-send-email-zhengbin13@huawei.com (mailing list archive)
State New, archived
Series fix syzkaller task hung in exit_aio

Commit Message

Zheng Bin March 6, 2019, 1:53 p.m. UTC
When running syzkaller tests against the kernel, a task hangs in exit_aio.

INFO: task syz-executor.2:22372 blocked for more than 140 seconds.
      Not tainted 4.19.25 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2  D27568 22372   2689 0x90000002
Call Trace:
 schedule+0x7c/0x1a0 kernel/sched/core.c:3516
 schedule_timeout+0x4cf/0x1140 kernel/time/timer.c:1780
 do_wait_for_common kernel/sched/completion.c:83 [inline]
 __wait_for_common kernel/sched/completion.c:104 [inline]
 wait_for_common kernel/sched/completion.c:115 [inline]
 wait_for_completion+0x27a/0x3d0 kernel/sched/completion.c:136
 exit_aio+0x2ef/0x3c0 fs/aio.c:881
 __mmput kernel/fork.c:1047 [inline]
 mmput+0xb4/0x460 kernel/fork.c:1071
 exit_mm kernel/exit.c:545 [inline]
 do_exit+0x79c/0x2cb0 kernel/exit.c:862
 do_group_exit+0x106/0x2f0 kernel/exit.c:978
 get_signal+0x325/0x1c80 kernel/signal.c:2572
 do_signal+0x94/0x16a0 arch/x86/kernel/signal.c:816
 exit_to_usermode_loop+0x108/0x1d0 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
 do_syscall_64+0x461/0x580 arch/x86/entry/common.c:293

The reason is as follows:
io_submit_one-->aio_get_req-->percpu_ref_get(&ctx->reqs)
                           -->req->ki_refcnt=0
             -->aio_poll-->req->ki_refcnt=2
                        -->aio_poll_complete-->aio_complete-->iocb_put
                        -->iocb_put

iocb_put decreases req->ki_refcnt, so both references taken in aio_poll must
eventually be dropped: one via aio_poll_complete (aio_complete-->iocb_put)
and one by the final iocb_put. Unfortunately, in some cases aio_poll_complete
is never called, as follows:

CPU 0                          CPU 1
aio_poll-->vfs_poll
                               eventfd_write-->spin_lock_irq(lock)
                                            -->..-->aio_poll_wake
                                            -->spin_unlock_irq(lock)
        -->spin_lock(lock)
        -->if (req->woken)
		mask = 0; --->did not call aio_poll_complete
        -->iocb_put

aio_poll_wake
	req->woken = true;
	if (mask) {
		if (!(mask & req->events))
			return 0;  --->did not call aio_poll_complete too

vfs_poll-->eventfd_poll-->poll_wait-->aio_poll_queue_proc (adds
aio_poll_wake to req->head)

eventfd_write-->wake_up_locked_poll-->__wake_up_common-->curr->func
-->aio_poll_wake
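
To make the imbalance concrete, here is a minimal userspace sketch of the
pattern above. It is not kernel code: struct fake_iocb and the helper names
are invented stand-ins for aio_kiocb, iocb_put() and the early-return path
in aio_poll_wake(). The point is only that once the wakeup runs with a
non-matching mask and aio_poll() then forces mask to 0, one of the two
references taken in aio_poll() is never dropped.

/*
 * Minimal userspace model of the refcount imbalance described above.
 * Purely illustrative, not kernel code; all names are invented.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct fake_iocb {
	int refcnt;	/* models req->ki_refcnt */
	bool woken;	/* models req->woken */
};

static void fake_iocb_put(struct fake_iocb *req)
{
	if (--req->refcnt == 0) {
		printf("request freed\n");
		free(req);
	}
}

/* Models aio_poll_wake() hitting "if (!(mask & req->events)) return 0;". */
static void wake_without_matching_event(struct fake_iocb *req)
{
	req->woken = true;
	/* returns without completing the iocb and without dropping a ref */
}

int main(void)
{
	struct fake_iocb *req = calloc(1, sizeof(*req));
	int mask;

	if (!req)
		return 1;

	req->refcnt = 2;			/* aio_poll() takes two references */

	wake_without_matching_event(req);	/* CPU 1: wakeup, mask does not match */

	/* CPU 0, back in aio_poll(): req->woken is set, so mask is forced to 0 */
	mask = req->woken ? 0 : 1;
	if (mask)
		fake_iocb_put(req);	/* aio_poll_complete()->aio_complete()->iocb_put() */
	fake_iocb_put(req);		/* the iocb_put() at the end of aio_poll() */

	/* One reference is still held and nothing will ever drop it. */
	printf("leaked refcnt = %d\n", req->refcnt);
	return 0;
}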

This patch fixes that. It also fixes a reference leak in the error handling path.

Signed-off-by: zhengbin <zhengbin13@huawei.com>
---
 fs/aio.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--
2.7.4

Comments

Al Viro March 6, 2019, 7:44 p.m. UTC | #1
On Wed, Mar 06, 2019 at 09:53:23PM +0800, zhengbin wrote:

> CPU 0                          CPU 1
> aio_poll-->vfs_poll
>                                eventfd_write-->spin_lock_irq(lock)
>                                             -->..-->aio_poll_wake
>                                             -->spin_unlock_irq(lock)
>         -->spin_lock(lock)
>         -->if (req->woken)
> 		mask = 0; --->did not call aio_poll_complete
>         -->iocb_put
> 
> aio_poll_wake
> 	req->woken = true;
> 	if (mask) {
> 		if (!(mask & req->events))
> 			return 0;  --->did not call aio_poll_complete too

... and it's still on the waitqueue, so it shouldn't be any different from
_not_ having had a wakeup yet.  And yes, aio_poll() in mainline right
now ends up _not_ adding it to the "can be cancelled" list, leading to
that bug.

> vfs_poll-->eventfd_poll-->poll_wait-->aio_poll_queue_proc(add
> aio_poll_wake to req->head)
> 
> eventfd_write-->wake_up_locked_poll-->__wake_up_common-->curr->func
> -->aio_poll_wake
> 
> This patch fixes that. It also fixes a reference leak in the error handling path.

Leak on error is real (see the thread from a few days ago), and the overall
logic for "woken" should be similar to what you suggest, but I'd rather
handle it slightly differently (see the same thread).

I've a patch that ought to fix that and it seems to survive testing; I'll
post once I finish carving it up - too many cleanups mixed into it.  Give
me a couple of hours; should be done (and posted) by then.
Al Viro March 7, 2019, 12:07 a.m. UTC | #2
On Wed, Mar 06, 2019 at 07:44:55PM +0000, Al Viro wrote:

> Leak on error is real (see the thread from a few days ago), and the overall
> logic for "woken" should be similar to what you suggest, but I'd rather
> handle it slightly differently (see the same thread).
> 
> I've a patch that ought to fix that and it seems to survive testing; I'll
> post once I finish carving it up - too many cleanups mixed into it.  Give
> me a couple of hours; should be done (and posted) by then.

Carved up and posted - sorry, took longer than I hoped ;-/

Patch

diff --git a/fs/aio.c b/fs/aio.c
index 38b741a..3bf8cdc 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1668,8 +1668,6 @@  static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 	__poll_t mask = key_to_poll(key);
 	unsigned long flags;

-	req->woken = true;
-
 	/* for instances that support it check for an event match first: */
 	if (mask) {
 		if (!(mask & req->events))
@@ -1687,12 +1685,14 @@  static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,

 			list_del_init(&req->wait.entry);
 			aio_poll_complete(iocb, mask);
+			req->woken = true;
 			return 1;
 		}
 	}

 	list_del_init(&req->wait.entry);
 	schedule_work(&req->work);
+	req->woken = true;
 	return 1;
 }

@@ -1777,8 +1777,10 @@  static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 	spin_unlock_irq(&ctx->ctx_lock);

 out:
-	if (unlikely(apt.error))
+	if (unlikely(apt.error)) {
+		iocb_put(aiocb);
 		return apt.error;
+	}

 	if (mask)
 		aio_poll_complete(aiocb, mask);
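
As a footnote on why the leaked reference in the error path manifests as a
hang in exit_aio() rather than as a quiet memory leak: the commit message
notes that io_submit_one() takes a reference on ctx->reqs via
percpu_ref_get(), and exit_aio() waits for those references to drain. Below
is a tiny, illustrative-only userspace model of that dependency;
pending_reqs, req_get()/req_put() and submit_poll_buggy() are invented
names, not kernel APIs.

/*
 * Illustrative-only userspace model, not kernel code: pending_reqs and
 * submit_poll_buggy() stand in for the percpu_ref on ctx->reqs and for
 * aio_poll() before this patch.
 */
#include <stdbool.h>
#include <stdio.h>

static int pending_reqs;		/* models the percpu_ref on ctx->reqs */

static void req_get(void) { pending_reqs++; }
static void req_put(void) { pending_reqs--; }

/* Models aio_poll() bailing out on apt.error without dropping its reference. */
static int submit_poll_buggy(bool fail_setup)
{
	req_get();			/* aio_get_req() pins the context */
	if (fail_setup)
		return -1;		/* missing req_put(): the leak fixed above */
	req_put();			/* completion normally drops the pin */
	return 0;
}

int main(void)
{
	submit_poll_buggy(true);

	/* Models exit_aio() waiting for ctx->reqs to drain before the mm goes away. */
	if (pending_reqs != 0)
		printf("exit_aio() would wait forever: %d request(s) still pinned\n",
		       pending_reqs);
	return 0;
}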