From patchwork Mon Jan 29 02:01:54 2018
X-Patchwork-Submitter: piaojun
X-Patchwork-Id: 10188705
To: "akpm@linux-foundation.org", Mark Fasheh, Joseph Qi, Joel Becker, Junxiao Bi
From: piaojun
Message-ID: <5A6E8092.8090701@huawei.com>
Date: Mon, 29 Jan 2018 10:01:54 +0800
Cc: "ocfs2-devel@oss.oracle.com"
Subject: [Ocfs2-devel] [PATCH v3] ocfs2: return error when we attempt to access a dirty bh in jbd2

We should not reuse a dirty bh in jbd2 directly, because of the following
situation:

1. When removing an extent rec, we dirty the bhs of the extent rec and the
   truncate log at the same time, and hand them over to jbd2.

2. The bhs are submitted to the jbd2 area successfully.

3. The device's write-back thread tries to flush the bhs to disk but
   encounters a write error due to an abnormal storage link.

4. After a while the storage link becomes normal again. The truncate log
   flush worker, triggered by the next space reclaim, finds the dirty bh of
   the truncate log, clears its 'BH_Write_EIO' and then sets it uptodate in
   __ocfs2_journal_access():

   ocfs2_truncate_log_worker
     ocfs2_flush_truncate_log
       __ocfs2_flush_truncate_log
         ocfs2_replay_truncate_records
           ocfs2_journal_access_di
             __ocfs2_journal_access // here we clear io_error and set 'tl_bh' uptodate.

5. Then jbd2 flushes the bh of the truncate log to disk, but the bh of the
   extent rec is still in an error state, and unfortunately nobody takes
   care of it.

6. In the end the space of the extent rec is not reduced, but the truncate
   log flush worker has already given it back to globalalloc. That causes a
   duplicate cluster problem, which can be identified by fsck.ocfs2.

Sadly we can hardly revert this, so we set the fs read-only to avoid ruining
the atomicity and consistency of space reclaim.

Fixes: acf8fdbe6afb ("ocfs2: do not BUG if buffer not uptodate in __ocfs2_journal_access")

Signed-off-by: Jun Piao
Reviewed-by: Yiwen Jiang
Reviewed-by: Changwei Ge
---
 fs/ocfs2/journal.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 3630443..e5dcea6 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -666,23 +666,24 @@ static int __ocfs2_journal_access(handle_t *handle,
 	/* we can safely remove this assertion after testing. */
 	if (!buffer_uptodate(bh)) {
 		mlog(ML_ERROR, "giving me a buffer that's not uptodate!\n");
-		mlog(ML_ERROR, "b_blocknr=%llu\n",
-		     (unsigned long long)bh->b_blocknr);
+		mlog(ML_ERROR, "b_blocknr=%llu, b_state=0x%lx\n",
+		     (unsigned long long)bh->b_blocknr, bh->b_state);
 
 		lock_buffer(bh);
 		/*
-		 * A previous attempt to write this buffer head failed.
-		 * Nothing we can do but to retry the write and hope for
-		 * the best.
+		 * A previous transaction with a couple of buffer heads fail
+		 * to checkpoint, so all the bhs are marked as BH_Write_EIO.
+		 * For current transaction, the bh is just among those error
+		 * bhs which previous transaction handle. We can't just clear
+		 * its BH_Write_EIO and reuse directly, since other bhs are
+		 * not written to disk yet and that will cause metadata
+		 * inconsistency. So we should set fs read-only to avoid
+		 * further damage.
 		 */
 		if (buffer_write_io_error(bh) && !buffer_uptodate(bh)) {
-			clear_buffer_write_io_error(bh);
-			set_buffer_uptodate(bh);
-		}
-
-		if (!buffer_uptodate(bh)) {
 			unlock_buffer(bh);
-			return -EIO;
+			return ocfs2_error(osb->sb, "A previous attempt to "
+					"write this buffer head failed\n");
 		}
 		unlock_buffer(bh);
 	}
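
For readers who want to follow the state change without the kernel tree at
hand, here is a rough user-space model of the check before and after this
patch. It is only a sketch: the model_* names, the flag values and the
read_only field are made up for illustration and are not kernel APIs, and
the -EROFS return is an assumed stand-in for whatever ocfs2_error() returns
under the mount's error policy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the buffer_head state bits (not kernel values). */
enum {
	MODEL_BH_UPTODATE  = 1u << 0,
	MODEL_BH_WRITE_EIO = 1u << 1,
};

struct model_bh { unsigned int state; };	/* models bh->b_state */
struct model_sb { bool read_only; };		/* models the fs going read-only */

/* Before the patch: clear the write error, mark the bh uptodate, carry on. */
static int model_access_old(struct model_sb *sb, struct model_bh *bh)
{
	(void)sb;
	if (!(bh->state & MODEL_BH_UPTODATE)) {
		if (bh->state & MODEL_BH_WRITE_EIO) {
			bh->state &= ~MODEL_BH_WRITE_EIO;
			bh->state |= MODEL_BH_UPTODATE;
		}
		if (!(bh->state & MODEL_BH_UPTODATE))
			return -EIO;
	}
	return 0;
}

/* After the patch: a bh with a failed write is refused and the fs is fenced. */
static int model_access_new(struct model_sb *sb, struct model_bh *bh)
{
	if (!(bh->state & MODEL_BH_UPTODATE) &&
	    (bh->state & MODEL_BH_WRITE_EIO)) {
		sb->read_only = true;	/* assumed effect of ocfs2_error() */
		return -EROFS;		/* assumed return value */
	}
	return 0;
}

/* Replays step 4 of the changelog against both variants. */
int model_demo(void)
{
	struct model_sb sb = { false };
	struct model_bh tl_old = { MODEL_BH_WRITE_EIO };
	struct model_bh tl_new = { MODEL_BH_WRITE_EIO };

	/* Old code silently reuses the truncate log bh. */
	assert(model_access_old(&sb, &tl_old) == 0);
	/* New code refuses it and leaves the error bit in place. */
	assert(model_access_new(&sb, &tl_new) < 0);
	return sb.read_only ? 1 : 0;
}
```

In this model the old path returns 0 and wipes the error bit, which is what
lets the truncate log get ahead of the extent rec bh. The new path leaves the
bit set and reports the failure to the caller.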