From patchwork Tue Jun 9 09:59:38 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhangguanghui X-Patchwork-Id: 6570901 Return-Path: X-Original-To: patchwork-ocfs2-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 15D95C0020 for ; Tue, 9 Jun 2015 10:01:23 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 6098E204B5 for ; Tue, 9 Jun 2015 10:01:21 +0000 (UTC) Received: from aserp1040.oracle.com (aserp1040.oracle.com [141.146.126.69]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3E69B204AB for ; Tue, 9 Jun 2015 10:01:19 +0000 (UTC) Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id t59A0Mnc023756 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 9 Jun 2015 10:00:22 GMT Received: from oss.oracle.com (oss-old-reserved.oracle.com [137.254.22.2]) by aserv0021.oracle.com (8.13.8/8.13.8) with ESMTP id t59A0Khc004375 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 9 Jun 2015 10:00:20 GMT Received: from localhost ([127.0.0.1] helo=lb-oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1Z2GKS-0008Hy-C3; Tue, 09 Jun 2015 03:00:20 -0700 Received: from userv0022.oracle.com ([156.151.31.74]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1Z2GK4-0008C0-AM for ocfs2-devel@oss.oracle.com; Tue, 09 Jun 2015 02:59:56 -0700 Received: from aserp1020.oracle.com (aserp1020.oracle.com [141.146.126.67]) by userv0022.oracle.com (8.13.8/8.13.8) with ESMTP id t599xt4q002068 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 
verify=OK) for ; Tue, 9 Jun 2015 09:59:55 GMT Received: from userp2030.oracle.com (userp2030.oracle.com [156.151.31.89]) by aserp1020.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id t599xsKX001617 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Tue, 9 Jun 2015 09:59:55 GMT Received: from pps.filterd (userp2030.oracle.com [127.0.0.1]) by userp2030.oracle.com (8.14.7/8.14.7) with SMTP id t599xsdP027439 for ; Tue, 9 Jun 2015 09:59:54 GMT Received: from h3cmg01-ex.h3c.com (smtp.h3c.com [221.12.31.13] (may be forged)) by userp2030.oracle.com with ESMTP id 1uwum3j9jp-1 for ; Tue, 09 Jun 2015 09:59:50 +0000 Received: from H3CHUB03-EX.srv.huawei-3com.com (unknown [10.63.20.169]) by h3cmg01-ex.h3c.com with smtp id 4113_065f_3ecb1f99_ce86_436d_94bc_356f8da8c8e3; Tue, 09 Jun 2015 17:59:46 +0800 Received: from H3CMLB12-EX.srv.huawei-3com.com ([fe80::f091:bd11:f0a9:5cbe]) by H3CHUB03-EX.srv.huawei-3com.com ([fe80::1d84:7d22:976e:809%15]) with mapi id 14.01.0355.002; Tue, 9 Jun 2015 17:59:39 +0800 From: Zhangguanghui To: "ocfs2-devel@oss.oracle.com" Thread-Topic: __ocfs2_journal_access review, BUG Thread-Index: AdCik53DmEujK7A8Q4uIiZzTrlZg/g== Date: Tue, 9 Jun 2015 09:59:38 +0000 Message-ID: <2015060917590038012350@h3c.com> Accept-Language: zh-CN, en-US Content-Language: zh-CN X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.153.28.107] MIME-Version: 1.0 X-ServerName: [221.12.31.13] X-Proofpoint-Virus-Version: vendor=nai engine=5700 definitions=7826 signatures=670588 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 kscore.is_bulkscore=6.49480469405717e-15 kscore.compositescore=0 circleOfTrustscore=0 compositescore=0.596674908092042 suspectscore=0 recipient_domain_to_sender_totalscore=0 phishscore=0 bulkscore=0 kscore.is_spamscore=0 rbsscore=0.596674908092042 recipient_to_sender_totalscore=0 recipient_domain_to_sender_domain_totalscore=0 spamscore=0 recipient_to_sender_domain_totalscore=0 
urlsuspectscore=0.886699632368168 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=7.0.1-1402240000 definitions=main-1506090178
Subject: [Ocfs2-devel] __ocfs2_journal_access review, BUG
X-BeenThere: ocfs2-devel@oss.oracle.com
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Sender: ocfs2-devel-bounces@oss.oracle.com
Errors-To: ocfs2-devel-bounces@oss.oracle.com
X-Source-IP: aserv0021.oracle.com [141.146.126.233]
X-Spam-Status: No, score=-3.2 required=5.0 tests=BAYES_00, HTML_FONT_FACE_BAD, HTML_FONT_LOW_CONTRAST, HTML_MESSAGE, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

In __ocfs2_journal_access(), if the LUNs cannot be accessed for some reason (for example, the storage network fails), the code hits BUG(). When the disk times out, fencing the server via emergency_restart() also fails, and the node can only be recovered by resetting it through iLO. So we should return the error -EIO instead and avoid the BUG() (panic).

Moreover, can all BUG_ON(!buffer_uptodate(bh)) checks in the ocfs2 file system be handled in the same way?

Finally, any feedback about this change (positive or negative) would be greatly appreciated.

Jun 9 15:20:23 cvk68 kernel: [76994.822719] (pool,13568,12):__ocfs2_journal_access:664 ERROR: giving me a buffer that's not uptodate!
Jun 9 15:20:23 cvk68 kernel: [76994.822721] (pool,13568,12):__ocfs2_journal_access:666 ERROR: b_blocknr=33030401
Jun 9 15:20:23 cvk68 kernel: [76994.822716] Read(10): 28 00 00 00 29 80 00 00 1f 00
Jun 9 15:20:23 cvk68 kernel: [76994.822729] (ksoftirqd/25,263,25):o2hb_bio_end_io:381 ERROR: IO Error -5
Jun 9 15:20:23 cvk68 kernel: [76994.822737] ------------[ cut here ]------------
Jun 9 15:20:23 cvk68 kernel: [76994.822740] (o2hb-771CAAF371,7589,9):o2hb_do_disk_heartbeat:993 ERROR: status = -5
Jun 9 15:20:23 cvk68 kernel: [76994.822746] Kernel BUG at ffffffffa048b15d [verbose debug info unavailable]
Jun 9 15:20:23 cvk68 kernel: [76994.822748] invalid opcode: 0000 [#1] SMP
Jun 9 15:20:23 cvk68 kernel: [76994.822751] sd 13:0:0:0: rejecting I/O to offline device
Jun 9 15:20:23 cvk68 kernel: [76994.822753] (o2hb-771CAAF371,7589,9):o2hb_bio_end_io:381 ERROR: IO Error -5
Jun 9 15:20:23 cvk68 kernel: [76994.822755] (o2hb-771CAAF371,7589,9):o2hb_do_disk_heartbeat:993 ERROR: status = -5
Jun 9 15:20:23 cvk68 kernel: [76994.822751] Modules linked in: ip6table_filter(F) ip6_tables(F) iptable_filter(F) ip_tables(F) ebtable_nat(F) ebtables(F) x_tables(F) ocfs2(OF) quota_tree(F) cls_u32(F) sch_sfq(F) sch_htb(F) drbd(F) lru_cache(F) 8021q(F) mrp(F) garp(F) stp(F) llc(F) vhost_net(F) macvtap(F) macvlan(F) vhost(F) kvm_intel(F) kvm(F) ib_iser(F) rdma_cm(F) ib_cm(F) iw_cm(F) ib_sa(F) ib_mad(F) ib_core(F) ib_addr(F) iscsi_tcp(F) libiscsi_tcp(F) ocfs2_dlmfs(OF) ocfs2_stack_o2cb(OF) ocfs2_dlm(OF) ocfs2_nodemanager(OF) ocfs2_stackglue(OF) configfs(F) openvswitch(OF) libcrc32c(F) gre(F) nfsd(F) nfs_acl(F) auth_rpcgss(F) nfs(F) fscache(F) lockd(F) sunrpc(F) psmouse(F) sb_edac(F) ioatdma(F) edac_core(F) gpio_ich(F) dm_multipath(F) serio_raw(F) scsi_dh(F) dca(F) hpwdt(F) hpilo(F) mac_hid(F) lpc_ich(F) video(F) acpi_power_meter(F) lp(F) parport(F) be2iscsi(F) iscsi_boot_sysfs(F) libiscsi(F) hpsa(F) scsi_transport_iscsi(F) be2net(F) nbd(F) [last unloaded: ipmi_si]
Jun 9 15:20:23 cvk68 kernel: [76994.822802] CPU: 12 PID: 13568 Comm: pool Tainted: GF O 3.13.6 #1
Jun 9 15:20:23 cvk68 kernel: [76994.822804] Hardware name: H3C FlexServer B390, BIOS I31 02/10/2014
Jun 9 15:20:23 cvk68 kernel: [76994.822806] task: ffff880611451810 ti: ffff8802cf8da000 task.ti: ffff8802cf8da000
Jun 9 15:20:23 cvk68 kernel: [76994.822808] RIP: 0010:[] [] __ocfs2_journal_access+0x30d/0x350 [ocfs2]
Jun 9 15:20:23 cvk68 kernel: [76994.822832] RSP: 0018:ffff8802cf8dbb78 EFLAGS: 00010292
Jun 9 15:20:23 cvk68 kernel: [76994.822834] RAX: 0000000000000044 RBX: 1000000000000000 RCX: 000000000000c5c0
Jun 9 15:20:23 cvk68 kernel: [76994.822836] RDX: 0000000000000082 RSI: 0000000065ee65ea RDI: 0000000000000246
Jun 9 15:20:23 cvk68 kernel: [76994.822838] RBP: ffff8802cf8dbbf8 R08: ffffffff81ec09a8 R09: ffffffff81ee8f20
Jun 9 15:20:23 cvk68 kernel: [76994.822840] R10: 0000000000000064 R11: 0000000000017adc R12: ffff880604b31138
Jun 9 15:20:23 cvk68 kernel: [76994.822842] R13: ffff880611451810 R14: ffff880611451ce0 R15: 0000000000000001
Jun 9 15:20:23 cvk68 kernel: [76994.822845] FS: 00007f9bcffff700(0000) GS:ffff880c3f880000(0000) knlGS:0000000000000000
Jun 9 15:20:23 cvk68 kernel: [76994.822847] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 9 15:20:23 cvk68 kernel: [76994.822849] CR2: 000000000133b7b8 CR3: 000000061168a000 CR4: 00000000001427e0
Jun 9 15:20:23 cvk68 kernel: [76994.822851] Stack:
Jun 9 15:20:23 cvk68 kernel: [76994.822852] 0000000001f80101 000000000000000b ffff880c1cc84030 0000000000000000
Jun 9 15:20:23 cvk68 kernel: [76994.822857] ffffffffa0505430 ffff880c1d183000 ffff880c1cc84030 0000000001f80101
Jun 9 15:20:23 cvk68 kernel: [76994.822861] 0000000001f80101 00001000a0473010 0000000000000000 ffff880c1dd35000
Jun 9 15:20:23 cvk68 kernel: [76994.822865] Call Trace:
Jun 9 15:20:23 cvk68 kernel: [76994.822878] [] ocfs2_journal_access_di+0x18/0x20 [ocfs2]
Jun 9 15:20:23 cvk68 kernel: [76994.822888] [] ocfs2_write_end_nolock+0x63/0x430 [ocfs2]
Jun 9 15:20:23 cvk68 kernel: [76994.822897] [] ? ocfs2_write_begin+0x1e2/0x230 [ocfs2]
Jun 9 15:20:23 cvk68 kernel: [76994.822906] [] ocfs2_write_end+0x26/0x50 [ocfs2]
Jun 9 15:20:23 cvk68 kernel: [76994.822910] [] generic_file_buffered_write+0x165/0x280
Jun 9 15:20:23 cvk68 kernel: [76994.822921] [] ocfs2_file_aio_write+0x74f/0x790 [ocfs2]
Jun 9 15:20:23 cvk68 kernel: [76994.822925] [] do_sync_write+0x5a/0x90
Jun 9 15:20:23 cvk68 kernel: [76994.822928] [] vfs_write+0xc5/0x1f0
Jun 9 15:20:23 cvk68 kernel: [76994.822931] [] SyS_write+0x52/0xa0
Jun 9 15:20:23 cvk68 kernel: [76994.822934] [] system_call_fastpath+0x1a/0x1f
Jun 9 15:20:23 cvk68 kernel: [76994.822936] Code: 8b 95 fc 02 00 00 48 63 c9 48 89 04 24 41 b9 9a 02 00 00 49 c7 c0 e0 dc 4e a0 4c 89 f6 48 c7 c7 18 a4 4f a0 31 c0 e8 29 09 2c e1 <0f> 0b 65 8b 0c 25 64 b0 00 00 65 48 8b 34 25 c0 c7 00 00 8b 96
Jun 9 15:20:23 cvk68 kernel: [76994.822961] RIP [] __ocfs2_journal_access+0x30d/0x350 [ocfs2]

--- journal.c	2015-05-18 00:55:21.000000000 +0800
+++ journal.c.bk	2015-06-09 17:37:13.531333444 +0800
@@ -670,7 +670,7 @@
 		mlog(ML_ERROR, "giving me a buffer that's not uptodate!\n");
 		mlog(ML_ERROR, "b_blocknr=%llu\n",
 		     (unsigned long long)bh->b_blocknr);
-		BUG();
+		return -EIO;
 	}
 
 	/* Set the current transaction information on the ci so
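
For discussion, here is a minimal userspace sketch of the pattern the patch relies on. It is not ocfs2 code: struct fake_bh, journal_access_sketch() and write_end_sketch() are hypothetical stand-ins invented for illustration, and the 4096-byte count is arbitrary. The point it tries to show is that returning -EIO only helps if every caller checks the status and unwinds before touching the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct buffer_head; NOT the kernel type. */
struct fake_bh {
	unsigned long long blocknr;
	bool uptodate;
};

/*
 * Models the patched check. A NULL pointer is a programming error, so an
 * assertion is still reasonable there. A buffer that is not uptodate can
 * be caused by a storage failure at runtime, so it is reported and
 * returned as -EIO instead of stopping the program.
 */
int journal_access_sketch(const struct fake_bh *bh)
{
	assert(bh != NULL);

	if (!bh->uptodate) {
		fprintf(stderr, "buffer not uptodate, b_blocknr=%llu\n",
			bh->blocknr);
		return -EIO;
	}
	return 0;
}

/*
 * Models a caller (hypothetical, loosely shaped like a write-end path).
 * It has to check the status and back out; otherwise it would go on to
 * modify a buffer that was never added to the transaction.
 */
int write_end_sketch(const struct fake_bh *bh, int *bytes_copied)
{
	int status = journal_access_sketch(bh);

	if (status < 0) {
		*bytes_copied = 0;	/* nothing was written */
		return status;		/* propagate -EIO upward */
	}

	*bytes_copied = 4096;		/* arbitrary value for the sketch */
	return 0;
}
```

With a buffer marked not uptodate, write_end_sketch() returns -EIO and reports zero bytes copied; with an uptodate buffer it returns 0. In the real file system, each caller of __ocfs2_journal_access() in the trace above would presumably need the same kind of review before BUG() is replaced.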