From patchwork Wed Mar 6 13:53:23 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zheng Bin
X-Patchwork-Id: 10841149
From: zhengbin
Subject: [PATCH] fix syzkaller task hung in exit_aio
Date: Wed, 6 Mar 2019 21:53:23 +0800
Message-ID: <1551880403-132638-1-git-send-email-zhengbin13@huawei.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-fsdevel@vger.kernel.org

When testing the kernel with syzkaller, a task hangs in exit_aio:

INFO: task syz-executor.2:22372 blocked for more than 140 seconds.
      Not tainted 4.19.25 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.2  D27568 22372   2689 0x90000002
Call Trace:
 schedule+0x7c/0x1a0 kernel/sched/core.c:3516
 schedule_timeout+0x4cf/0x1140 kernel/time/timer.c:1780
 do_wait_for_common kernel/sched/completion.c:83 [inline]
 __wait_for_common kernel/sched/completion.c:104 [inline]
 wait_for_common kernel/sched/completion.c:115 [inline]
 wait_for_completion+0x27a/0x3d0 kernel/sched/completion.c:136
 exit_aio+0x2ef/0x3c0 fs/aio.c:881
 __mmput kernel/fork.c:1047 [inline]
 mmput+0xb4/0x460 kernel/fork.c:1071
 exit_mm kernel/exit.c:545 [inline]
 do_exit+0x79c/0x2cb0 kernel/exit.c:862
 do_group_exit+0x106/0x2f0 kernel/exit.c:978
 get_signal+0x325/0x1c80 kernel/signal.c:2572
 do_signal+0x94/0x16a0 arch/x86/kernel/signal.c:816
 exit_to_usermode_loop+0x108/0x1d0 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
 do_syscall_64+0x461/0x580 arch/x86/entry/common.c:293

The reason is as follows:

io_submit_one-->aio_get_req-->percpu_ref_get(&ctx->reqs)
                           -->req->ki_refcnt=0
             -->aio_poll-->req->ki_refcnt=2
                        -->aio_poll_complete-->aio_complete-->iocb_put
                        -->iocb_put

iocb_put decrements req->ki_refcnt, so each iocb_put at the end of aio_poll
must be matched by exactly one call to aio_poll_complete. Otherwise the
request keeps its reference on ctx->reqs and exit_aio waits forever.
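
The accounting above can be modelled in userspace. The sketch below is not
kernel code: struct model_req and the model_* helpers are hypothetical names
that only mirror the ki_refcnt arithmetic, under the simplifying assumption
that the request is freed when the count reaches zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; names mirror fs/aio.c but are not kernel APIs. */
struct model_req {
	int ki_refcnt;	/* models req->ki_refcnt */
	bool freed;	/* set when the last reference is dropped */
};

/* Models iocb_put(): drop one reference, free the request on the last one. */
static void model_iocb_put(struct model_req *req)
{
	if (--req->ki_refcnt == 0)
		req->freed = true;
}

/* Models aio_poll_complete() --> aio_complete() --> iocb_put(). */
static void model_poll_complete(struct model_req *req)
{
	model_iocb_put(req);
}

/*
 * Models one aio_poll() submission. 'completed' says whether
 * aio_poll_complete() runs for this request at all (inline or from the
 * wakeup path). Returns true if the request ended up freed.
 */
static bool model_aio_poll(bool completed)
{
	struct model_req req = { .ki_refcnt = 2, .freed = false };

	if (completed)
		model_poll_complete(&req);
	model_iocb_put(&req);	/* the put at the end of aio_poll() */
	return req.freed;
}
```

model_aio_poll(true) returns true: both references are dropped and the
request is freed. model_aio_poll(false) returns false: one reference is left
behind, which corresponds to the request that keeps exit_aio waiting.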

Unfortunately, in some cases the calls do not balance, as in the following
race:

CPU 0                              CPU 1
aio_poll-->vfs_poll
                                   eventfd_write-->spin_lock_irq(lock)
                                                -->..-->aio_poll_wake
                                                -->spin_unlock_irq(lock)
        -->spin_lock(lock)
        -->if (req->woken) mask = 0;  --->did not call aio_poll_complete
        -->iocb_put

aio_poll_wake
	req->woken = true;
	if (mask) {
		if (!(mask & req->events))
			return 0;  --->did not call aio_poll_complete either

vfs_poll-->eventfd_poll-->poll_wait-->aio_poll_queue_proc(add aio_poll_wake to req->head)
eventfd_write-->wake_up_locked_poll-->__wake_up_common-->curr->func
             -->aio_poll_wake

This patch fixes that by setting req->woken only after aio_poll_wake has
actually completed the request or scheduled the work. While at it, also fix
the missing iocb_put in the error handling path of aio_poll.

Signed-off-by: zhengbin
---
 fs/aio.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

-- 
2.7.4

diff --git a/fs/aio.c b/fs/aio.c
index 38b741a..3bf8cdc 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1668,8 +1668,6 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 	__poll_t mask = key_to_poll(key);
 	unsigned long flags;
 
-	req->woken = true;
-
 	/* for instances that support it check for an event match first: */
 	if (mask) {
 		if (!(mask & req->events))
@@ -1687,12 +1685,14 @@ static int aio_poll_wake(struct wait_queue_entry *wait, unsigned mode, int sync,
 
 			list_del_init(&req->wait.entry);
 			aio_poll_complete(iocb, mask);
+			req->woken = true;
 			return 1;
 		}
 	}
 
 	list_del_init(&req->wait.entry);
 	schedule_work(&req->work);
+	req->woken = true;
 	return 1;
 }
 
@@ -1777,8 +1777,10 @@ static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
 	spin_unlock_irq(&ctx->ctx_lock);
 
 out:
-	if (unlikely(apt.error))
+	if (unlikely(apt.error)) {
+		iocb_put(aiocb);
 		return apt.error;
+	}
 
 	if (mask)
 		aio_poll_complete(aiocb, mask);