From patchwork Thu May 17 03:29:44 2018
X-Patchwork-Submitter: "Yan, Zheng"
X-Patchwork-Id: 10405189
From: "Yan, Zheng"
To: ceph-devel@vger.kernel.org
Cc: idryomov@gmail.com, jlayton@redhat.com, "Yan, Zheng"
Subject: [PATCH] ceph: fix writeback thread waits on itself
Date: Thu, 17 May 2018 11:29:44 +0800
Message-Id: <20180517032944.13230-1-zyan@redhat.com>

In the case of -ENOSPC, the writeback thread may end up waiting on
itself: the OSD request is completed with an error directly from
ceph_osdc_start_request(), so writepages_finish() runs in the context
of the writeback thread. If it drops the inode's last reference there,
iput_final() calls inode_wait_for_writeback(), which waits for the
very writeback that same thread is performing. The call stack looks
like:

  inode_wait_for_writeback+0x26/0x40
  evict+0xb5/0x1a0
  iput+0x1d2/0x220
  ceph_put_wrbuffer_cap_refs+0xe0/0x2c0 [ceph]
  writepages_finish+0x2d3/0x410 [ceph]
  __complete_request+0x26/0x60 [libceph]
  complete_request+0x2e/0x70 [libceph]
  __submit_request+0x256/0x330 [libceph]
  submit_request+0x2b/0x30 [libceph]
  ceph_osdc_start_request+0x25/0x40 [libceph]
  ceph_writepages_start+0xdfe/0x1320 [ceph]
  do_writepages+0x1f/0x70
  __writeback_single_inode+0x45/0x330
  writeback_sb_inodes+0x26a/0x600
  __writeback_inodes_wb+0x92/0xc0
  wb_writeback+0x274/0x330
  wb_workfn+0x2d5/0x3b0

The fix is to make writepages_finish() not drop the inode's last
reference.
Link: http://tracker.ceph.com/issues/23978
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
---
 fs/ceph/addr.c  | 21 +++++++++++++++++++++
 fs/ceph/inode.c | 12 ++++++++++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 5f7ad3d0df2e..9db2f4108951 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -772,6 +772,17 @@ static void writepages_finish(struct ceph_osd_request *req)
 		ceph_release_pages(osd_data->pages, num_pages);
 	}
 
+	if (rc < 0 && total_pages) {
+		/*
+		 * In the case of error, this function may directly get
+		 * called by the thread that does writeback. The writeback
+		 * thread should not drop inode's last reference. Otherwise
+		 * iput_final() may call inode_wait_for_writeback(), which
+		 * waits on writeback.
+		 */
+		ihold(inode);
+	}
+
 	ceph_put_wrbuffer_cap_refs(ci, total_pages, snapc);
 
 	osd_data = osd_req_op_extent_osd_data(req, 0);
@@ -781,6 +792,16 @@ static void writepages_finish(struct ceph_osd_request *req)
 	else
 		kfree(osd_data->pages);
 	ceph_osdc_put_request(req);
+
+	if (rc < 0 && total_pages) {
+		for (;;) {
+			if (atomic_add_unless(&inode->i_count, -1, 1))
+				break;
+			/* let writeback work drop the last reference */
+			if (queue_work(fsc->wb_wq, &ci->i_wb_work))
+				break;
+		}
+	}
 }
 
 /*
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index df3875fdfa41..aa7c5a4ff137 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1752,9 +1752,17 @@ static void ceph_writeback_work(struct work_struct *work)
 	struct ceph_inode_info *ci = container_of(work, struct ceph_inode_info,
 						  i_wb_work);
 	struct inode *inode = &ci->vfs_inode;
+	int wrbuffer_refs;
+
+	spin_lock(&ci->i_ceph_lock);
+	wrbuffer_refs = ci->i_wrbuffer_ref;
+	spin_unlock(&ci->i_ceph_lock);
+
+	if (wrbuffer_refs) {
+		dout("writeback %p\n", inode);
+		filemap_fdatawrite(&inode->i_data);
+	}
 
-	dout("writeback %p\n", inode);
-	filemap_fdatawrite(&inode->i_data);
 	iput(inode);
 }
 
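
For reference, the deferred-drop loop in the addr.c hunk relies on
atomic_add_unless(&inode->i_count, -1, 1), which decrements i_count
only when it is not currently 1, i.e. only when this is not the last
reference. A minimal userspace sketch of the same compare-and-swap
idea (illustrative only, not kernel code; dec_unless_last is a made-up
name):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Rough analogue of atomic_add_unless(v, -1, 1): decrement *count
 * unless it currently equals 1; return true iff the decrement happened. */
static bool dec_unless_last(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 1) {
		if (atomic_compare_exchange_weak(count, &old, old - 1))
			return true;	/* dropped a non-last reference */
		/* the failed CAS refreshed 'old'; loop and retry */
	}
	return false;	/* would be the last reference; caller must defer */
}

int main(void)
{
	atomic_int refs = 2;

	printf("drop #1: %d\n", dec_unless_last(&refs));	/* 1: 2 -> 1 */
	printf("drop #2: %d\n", dec_unless_last(&refs));	/* 0: refs == 1, defer */
	return 0;
}

In the patch, the "defer" case is the queue_work(fsc->wb_wq,
&ci->i_wb_work) branch: ceph_writeback_work() then drops the reference
with iput() from a workqueue thread, so iput_final() no longer waits
on the writeback thread that is still inside writepages_finish().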