From patchwork Tue Oct 11 09:04:09 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: kernel@kyup.com
X-Patchwork-Id: 9370261
From: Nikolay Borisov
To: idryomov@gmail.com, zyan@redhat.com
Cc: linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, Nikolay Borisov
Subject: [PATCH] cephfs: Fix scheduler warning due to nested blocking
Date: Tue, 11 Oct 2016 12:04:09 +0300
Message-Id: <1476176649-13393-1-git-send-email-kernel@kyup.com>
X-Mailer: git-send-email 1.7.1
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

try_get_cap_refs can be used as the condition in wait_event* calls.
This is fine until it has to call __ceph_do_pending_vmtruncate, which
in turn acquires the i_truncate_mutex. The task's state is then
!TASK_RUNNING while it is trying to acquire a sleeping primitive. In
essence, nested sleeping primitives are being used.
This causes the following warning:

WARNING: CPU: 22 PID: 11064 at kernel/sched/core.c:7631 __might_sleep+0x9f/0xb0()
do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait_event+0x5d/0x110
ipmi_msghandler tcp_scalable ib_qib dca ib_mad ib_core ib_addr ipv6
CPU: 22 PID: 11064 Comm: fs_checker.pl Tainted: G O 4.4.20-clouder2 #6
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1a 10/16/2015
 0000000000000000 ffff8838b416fa88 ffffffff812f4409 ffff8838b416fad0
 ffffffff81a034f2 ffff8838b416fac0 ffffffff81052b46 ffffffff81a0432c
 0000000000000061 0000000000000000 0000000000000000 ffff88167bda54a0
Call Trace:
 [] dump_stack+0x67/0x9e
 [] warn_slowpath_common+0x86/0xc0
 [] warn_slowpath_fmt+0x4c/0x50
 [] ? prepare_to_wait_event+0x5d/0x110
 [] ? prepare_to_wait_event+0x5d/0x110
 [] __might_sleep+0x9f/0xb0
 [] mutex_lock+0x20/0x40
 [] __ceph_do_pending_vmtruncate+0x44/0x1a0 [ceph]
 [] try_get_cap_refs+0xa2/0x320 [ceph]
 [] ceph_get_caps+0x255/0x2b0 [ceph]
 [] ? wait_woken+0xb0/0xb0
 [] ceph_write_iter+0x2b1/0xde0 [ceph]
 [] ? schedule_timeout+0x202/0x260
 [] ? kmem_cache_free+0x1ea/0x200
 [] ? iput+0x9e/0x230
 [] ? __might_sleep+0x52/0xb0
 [] ? __might_fault+0x37/0x40
 [] ? cp_new_stat+0x153/0x170
 [] __vfs_write+0xaa/0xe0
 [] vfs_write+0xa9/0x190
 [] ? set_close_on_exec+0x31/0x70
 [] SyS_write+0x46/0xa0

This happens because wait_event_interruptible can interfere with the
mutex locking code: both of them manipulate the task state.

Fix the issue by using the newly-added nested blocking infrastructure
in 61ada528dea0 ("sched/wait: Provide infrastructure to deal with
nested blocking").

Link: https://lwn.net/Articles/628628/
Signed-off-by: Nikolay Borisov
---
 fs/ceph/caps.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index c69e1253b47b..c6bf34e29ea4 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2467,6 +2467,7 @@ int ceph_get_caps(struct ceph_inode_info *ci, int need, int want,
 		  loff_t endoff, int *got, struct page **pinned_page)
 {
 	int _got, ret, err = 0;
+	DEFINE_WAIT_FUNC(wait, woken_wake_function);
 
 	ret = ceph_pool_perm_check(ci, need);
 	if (ret < 0)
@@ -2486,9 +2487,14 @@ int ceph_get_caps(struct ceph_inode_info *ci, int need, int want,
 			if (err < 0)
 				return err;
 		} else {
-			ret = wait_event_interruptible(ci->i_cap_wq,
-					try_get_cap_refs(ci, need, want, endoff,
-							 true, &_got, &err));
+			add_wait_queue(&ci->i_cap_wq, &wait);
+
+			while (!try_get_cap_refs(ci, need, want, endoff,
+						 true, &_got, &err))
+				wait_woken(&wait, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
+
+			remove_wait_queue(&ci->i_cap_wq, &wait);
+
 			if (err == -EAGAIN)
 				continue;
 			if (err < 0)
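
The difference between the two wait loops can be illustrated with a toy, single-threaded userspace model. This is only a sketch under loud assumptions: every `toy_*` name is hypothetical, nothing here is kernel API, the "sleep" is a no-op that returns immediately, and signals and wakeup races are ignored. It only tracks the task state at the moment the condition takes its mutex, which is what __might_sleep complains about.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names mirror kernel concepts but are not kernel API. */
enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE };

struct toy_task {
	enum toy_state state;
	int warnings;   /* blocking ops attempted while state != TOY_RUNNING */
	int cond_calls; /* number of times the condition was evaluated */
};

/* Stand-in for __might_sleep(): count a warning instead of printing one. */
static void toy_might_sleep(struct toy_task *t)
{
	if (t->state != TOY_RUNNING)
		t->warnings++;
}

/* Stand-in for a mutex_lock()/mutex_unlock() pair: it may block, and a
 * task that blocked on the mutex comes back in the running state. */
static void toy_mutex_section(struct toy_task *t)
{
	toy_might_sleep(t);
	t->state = TOY_RUNNING;
}

/* Stand-in for try_get_cap_refs(): takes a mutex, then reports success
 * once it has been called succeed_on times. */
static bool toy_condition(struct toy_task *t, int succeed_on)
{
	assert(succeed_on >= 1);
	toy_mutex_section(t);
	return ++t->cond_calls >= succeed_on;
}

/* wait_event-style loop: after the fast-path check, the task state is
 * changed before every evaluation of the condition. */
static void toy_wait_event(struct toy_task *t, int succeed_on)
{
	if (toy_condition(t, succeed_on))
		return;
	for (;;) {
		t->state = TOY_INTERRUPTIBLE; /* models prepare_to_wait_event() */
		if (toy_condition(t, succeed_on))
			break;
		/* the task would sleep here; this model wakes up at once */
	}
	t->state = TOY_RUNNING; /* models finish_wait() */
}

/* wait_woken-style loop: the condition always runs in the running state;
 * the state only changes inside the wait helper itself. */
static void toy_wait_woken_loop(struct toy_task *t, int succeed_on)
{
	while (!toy_condition(t, succeed_on)) {
		t->state = TOY_INTERRUPTIBLE; /* inside the wait helper */
		/* the task would sleep here; this model wakes up at once */
		t->state = TOY_RUNNING;       /* restored before returning */
	}
}
```

Running each loop on a fresh `struct toy_task` with `succeed_on = 3` leaves `warnings` at 2 for `toy_wait_event` (every evaluation after the fast path happens in the non-running state) and at 0 for `toy_wait_woken_loop`.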