From patchwork Fri Dec 22 06:41:22 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Changwei Ge
X-Patchwork-Id: 10128863
From: Changwei Ge
To: Joseph Qi, Mark Fasheh, Junxiao Bi, Joel Becker
Cc: "ocfs2-devel@oss.oracle.com"
Date: Fri, 22 Dec 2017 06:41:22 +0000
Message-ID: <63ADC13FD55D6546B7DECE290D39E373F290800A@H3CMLB12-EX.srv.huawei-3com.com>
Subject: [Ocfs2-devel] [PATCH] ocfs2: don't merge rightmost extent block if it was locked
Sender: ocfs2-devel-bounces@oss.oracle.com

A crash was reported by John. The call trace follows:

ocfs2_split_extent+0x1ad3/0x1b40 [ocfs2]
ocfs2_change_extent_flag+0x33a/0x470 [ocfs2]
ocfs2_mark_extent_written+0x172/0x220 [ocfs2]
ocfs2_dio_end_io+0x62d/0x910 [ocfs2]
dio_complete+0x19a/0x1a0
do_blockdev_direct_IO+0x19dd/0x1eb0
__blockdev_direct_IO+0x43/0x50
ocfs2_direct_IO+0x8f/0xa0 [ocfs2]
generic_file_direct_write+0xb2/0x170
__generic_file_write_iter+0xc3/0x1b0
ocfs2_file_write_iter+0x4bb/0xca0 [ocfs2]
__vfs_write+0xae/0xf0
vfs_write+0xb8/0x1b0
SyS_write+0x4f/0xb0
system_call_fastpath+0x16/0x75

The BUG code indicates that the extent tree wanted to grow but no
metadata had been reserved ahead of time.

My investigation shows that the root cause is the following: although
enough metadata is reserved, the rightmost extent is merged into the one
on its left after extents have been marked written a number of times,
because marking extents written leaves many physically contiguous
extents. Eventually an empty extent shows up and the rightmost path is
removed from the extent tree.

To solve this issue, introduce a member named ::et_lock in the extent
tree. When ocfs2_lock_allocators() is invoked and metadata does need to
be reserved, set ::et_lock so that the rightmost path is not removed
while extents are being marked written.

This patch also addresses the issue John reported that ::dw_zero_count
is not calculated properly.

With this patch applied, the issue John reported no longer occurs.
Thanks to John for providing the reproducer. The patch has also passed
the ocfs2-test suite run by New H3C Group.
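
For illustration only, the ::et_lock guard described above can be
sketched as a small user-space model. The model_* types and the
functions below are hypothetical stand-ins, not the kernel structures;
they keep only the fields that the patched condition reads, and they use
host-endian integers where the kernel code goes through le16_to_cpu().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct ocfs2_extent_tree. */
struct model_extent_tree {
	int et_lock;			/* non-zero: keep the rightmost path */
};

/* Hypothetical stand-in for the rightmost leaf's extent list. */
struct model_extent_list {
	uint16_t l_next_free_rec;	/* number of records in use */
};

/* Hypothetical stand-in for an extent record in that leaf. */
struct model_extent_rec {
	uint16_t e_leaf_clusters;	/* clusters covered by the record */
};

/*
 * Models the patched test in ocfs2_merge_rec_left(): the right extent
 * block is only deleted when it holds a single empty record AND the
 * tree has not been locked.
 */
static bool should_remove_rightmost_path(const struct model_extent_tree *et,
					 const struct model_extent_list *el,
					 const struct model_extent_rec *right_rec)
{
	return right_rec->e_leaf_clusters == 0 &&
	       el->l_next_free_rec == 1 &&
	       !et->et_lock;
}

/* Models the lines added to ocfs2_lock_allocators(). */
static void model_lock_allocators(struct model_extent_tree *et,
				  uint32_t extents_to_split)
{
	if (extents_to_split)
		et->et_lock = 1;
}

/* Walks through the unlocked and the locked case; returns 0 on success. */
int model_demo(void)
{
	struct model_extent_tree et = { .et_lock = 0 };
	struct model_extent_list el = { .l_next_free_rec = 1 };
	struct model_extent_rec right_rec = { .e_leaf_clusters = 0 };

	/* Unlocked tree: an empty rightmost leaf would be removed. */
	assert(should_remove_rightmost_path(&et, &el, &right_rec));

	/* Splits are expected, so the tree is locked ... */
	model_lock_allocators(&et, 2);

	/* ... and the same empty leaf now stays in place. */
	assert(!should_remove_rightmost_path(&et, &el, &right_rec));
	return 0;
}
```

Calling model_demo() from a main() exercises both cases. The sketch
models only the guard; it says nothing about journal credits or about
how the reservation size is computed.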

Reported-by: John Lightsey
Signed-off-by: Changwei Ge
---
 fs/ocfs2/alloc.c    | 4 +++-
 fs/ocfs2/alloc.h    | 1 +
 fs/ocfs2/aops.c     | 8 +++++---
 fs/ocfs2/suballoc.c | 3 +++
 4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index ab5105f..160e393 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -444,6 +444,7 @@ static void __ocfs2_init_extent_tree(struct ocfs2_extent_tree *et,
 	et->et_ops = ops;
 	et->et_root_bh = bh;
 	et->et_ci = ci;
+	et->et_lock = 0;
 	et->et_root_journal_access = access;
 	if (!obj)
 		obj = (void *)bh->b_data;
@@ -3606,7 +3607,8 @@ static int ocfs2_merge_rec_left(struct ocfs2_path *right_path,
 		 * it and we need to delete the right extent block.
 		 */
 		if (le16_to_cpu(right_rec->e_leaf_clusters) == 0 &&
-		    le16_to_cpu(el->l_next_free_rec) == 1) {
+		    le16_to_cpu(el->l_next_free_rec) == 1 &&
+		    !et->et_lock) {
 			/* extend credit for ocfs2_remove_rightmost_path */
 			ret = ocfs2_extend_rotate_transaction(handle, 0,
 					handle->h_buffer_credits,
diff --git a/fs/ocfs2/alloc.h b/fs/ocfs2/alloc.h
index 27b75cf..898671d 100644
--- a/fs/ocfs2/alloc.h
+++ b/fs/ocfs2/alloc.h
@@ -61,6 +61,7 @@ struct ocfs2_extent_tree {
 	ocfs2_journal_access_func et_root_journal_access;
 	void *et_object;
 	unsigned int et_max_leaf_clusters;
+	int et_lock;
 };
 
 /*
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index d151632..c72ce60 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -797,6 +797,7 @@ struct ocfs2_write_ctxt {
 	struct ocfs2_cached_dealloc_ctxt w_dealloc;
 
 	struct list_head w_unwritten_list;
+	unsigned int w_unwritten_count;
 };
 
 void ocfs2_unlock_and_free_pages(struct page **pages, int num_pages)
@@ -1386,6 +1387,7 @@ static int ocfs2_unwritten_check(struct inode *inode,
 	desc->c_clear_unwritten = 0;
 	list_add_tail(&new->ue_ip_node, &oi->ip_unwritten_list);
 	list_add_tail(&new->ue_node, &wc->w_unwritten_list);
+	wc->w_unwritten_count++;
 	new = NULL;
 unlock:
 	spin_unlock(&oi->ip_lock);
@@ -1762,8 +1764,8 @@ int ocfs2_write_begin_nolock(struct address_space *mapping,
 		 */
 		ocfs2_init_dinode_extent_tree(&et, INODE_CACHE(inode),
 					      wc->w_di_bh);
-		ret = ocfs2_lock_allocators(inode, &et,
-					    clusters_to_alloc, extents_to_split,
+		ret = ocfs2_lock_allocators(inode, &et, clusters_to_alloc,
+					    2*extents_to_split,
 					    &data_ac, &meta_ac);
 		if (ret) {
 			mlog_errno(ret);
@@ -2256,7 +2258,7 @@ static int ocfs2_dio_wr_get_block(struct inode *inode, sector_t iblock,
 		ue->ue_phys = desc->c_phys;
 
 		list_splice_tail_init(&wc->w_unwritten_list, &dwc->dw_zero_list);
-		dwc->dw_zero_count++;
+		dwc->dw_zero_count += wc->w_unwritten_count;
 	}
 
 	ret = ocfs2_write_end_nolock(inode->i_mapping, pos, len, len, wc);
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 9f0b95a..32bc38e 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -2727,6 +2727,9 @@ int ocfs2_lock_allocators(struct inode *inode,
 		}
 	}
 
+	if (extents_to_split)
+		et->et_lock = 1;
+
 	if (clusters_to_add == 0)
 		goto out;
 