From patchwork Thu Dec 17 05:33:27 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhangguanghui
X-Patchwork-Id: 7869681
From: Zhangguanghui
To: "ocfs2-devel@oss.oracle.com"
Date: Thu, 17 Dec 2015 05:33:27 +0000
Message-ID: <2015121713343524045332@h3c.com>
Subject: [Ocfs2-devel] ocfs2 cannot continue when JBD2 has aborted the journal,

Hi all,

There is a small race between JBD2 aborting the journal and
jbd2_journal_flush, caused by an unstable storage link under I/O stress.
Once the JBD2 journal is in the aborted state, -EIO errors are returned,
which may cause all cluster nodes to hang. So I think that when JBD2 has
aborted the journal, ocfs2 cannot continue and should trigger ocfs2_abort.

Thanks. Any ideas about this patch?

description:

ocfs2_commit_thread
  ocfs2_commit_cache
    jbd2_journal_flush

zhangguanghui
--- journal.c	2015-12-17 11:36:39.140542941 +0800
+++ journal.c.diff	2015-12-17 11:39:21.308542922 +0800
@@ -328,6 +328,9 @@
 	if (status < 0) {
 		up_write(&journal->j_trans_barrier);
 		mlog_errno(status);
+		if (is_journal_aborted(journal)) {
+			ocfs2_abort(osb->sb, "Detect aborted journal,while committing cache.");
+		}
 		goto finally;
 	}
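
For discussion, here is a minimal user-space sketch of the error path the patch proposes: when the flush fails, check whether the journal is aborted and, if so, escalate to a filesystem-level abort instead of only logging the error. Every name below (struct mock_journal, mock_journal_flush, mock_fs_abort, mock_commit_cache) is a hypothetical stand-in and not a kernel or ocfs2 API. The real code would go through jbd2_journal_flush, is_journal_aborted and ocfs2_abort as in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct mock_journal {
	bool aborted;     /* models the JBD2 "journal aborted" state */
	int abort_calls;  /* counts filesystem-level aborts that were triggered */
};

/* Models a flush that fails with -EIO once the journal has been aborted. */
static int mock_journal_flush(struct mock_journal *j)
{
	return j->aborted ? -EIO : 0;
}

/* Models ocfs2_abort(): here it only records and logs the event. */
static void mock_fs_abort(struct mock_journal *j, const char *why)
{
	j->abort_calls++;
	fprintf(stderr, "mock abort: %s\n", why);
}

/* Models the commit error path: escalate only when the journal is aborted. */
static int mock_commit_cache(struct mock_journal *j)
{
	int status;

	assert(j != NULL);

	status = mock_journal_flush(j);
	if (status < 0) {
		if (j->aborted)
			mock_fs_abort(j, "journal aborted while committing cache");
		return status;
	}
	return 0;
}
```

With a healthy mock journal, mock_commit_cache returns 0 and never escalates. With aborted set, it returns -EIO and calls mock_fs_abort once, which is the behaviour the patch is aiming for in ocfs2_commit_cache.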