From patchwork Mon Nov 20 18:54:50 2017
X-Patchwork-Submitter: John Lightsey
X-Patchwork-Id: 10067045
Message-ID: <1511204090.3644.6.camel@nixnuts.net>
From: John Lightsey
To: ocfs2-devel@oss.oracle.com
Date: Mon, 20 Nov 2017 12:54:50 -0600
Subject: [Ocfs2-devel] [PATCH] Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!

In January Ben Hutchings reported Debian bug 841144 to the ocfs2-devel
list:

https://oss.oracle.com/pipermail/ocfs2-devel/2017-January/012701.html

cPanel encountered this bug after upgrading our cluster to the 4.9
Debian stable kernel. In our environment, the bug would trigger every
few hours.

The core problem seems to be that the size of dw_zero_list is not
tracked correctly. This causes the ocfs2_lock_allocators() call in
ocfs2_dio_end_io_write() to underestimate the number of extents needed.
As a result, meta_ac is null when it's needed in ocfs2_grow_tree().

The attached patch is a forward-ported version of the fix we applied to
Debian's 4.9 kernel to correct the issue.

From a3107e92b07ed95752d72703ee53ae71a7607098 Mon Sep 17 00:00:00 2001
From: John Lightsey
Date: Mon, 20 Nov 2017 12:05:37 -0600
Subject: [PATCH] Fix OCFS2 extent split estimation for dio allocators locking.

The dw_zero_count tracking was assuming that w_unwritten_list would
always contain one element. The actual count is now tracked whenever
the list is extended.
---
 fs/ocfs2/aops.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index 88a31e9340a0..eb0a81368dbb 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -784,6 +784,8 @@ struct ocfs2_write_ctxt {
 	struct ocfs2_cached_dealloc_ctxt w_dealloc;
 
 	struct list_head		w_unwritten_list;
+
+	unsigned int			w_unwritten_count;
 };
 
 void ocfs2_unlock_and_free_pages(struct page **pages, int num_pages)
@@ -873,6 +875,7 @@ static int ocfs2_alloc_write_ctxt(struct ocfs2_write_ctxt **wcp,
 
 	ocfs2_init_dealloc_ctxt(&wc->w_dealloc);
 	INIT_LIST_HEAD(&wc->w_unwritten_list);
+	wc->w_unwritten_count = 0;
 
 	*wcp = wc;
 
@@ -1373,6 +1376,7 @@ static int ocfs2_unwritten_check(struct inode *inode,
 	desc->c_clear_unwritten = 0;
 	list_add_tail(&new->ue_ip_node, &oi->ip_unwritten_list);
 	list_add_tail(&new->ue_node, &wc->w_unwritten_list);
+	wc->w_unwritten_count++;
 	new = NULL;
 unlock:
 	spin_unlock(&oi->ip_lock);
@@ -2246,7 +2250,7 @@ static int ocfs2_dio_get_block(struct inode *inode, sector_t iblock,
 		ue->ue_phys = desc->c_phys;
 
 		list_splice_tail_init(&wc->w_unwritten_list, &dwc->dw_zero_list);
-		dwc->dw_zero_count++;
+		dwc->dw_zero_count += wc->w_unwritten_count;
 	}
 
 	ret = ocfs2_write_end_nolock(inode->i_mapping, pos, len, len, wc);
-- 
2.11.0