From patchwork Thu Aug 11 22:20:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Somnath Roy
X-Patchwork-Id: 9276097
From: Somnath Roy
To: Sage Weil, Mark Nelson
CC: ceph-devel
Subject: RE: Bluestore assert
Date: Thu, 11 Aug 2016 22:20:45 +0000
References: <7dc67e25-4e1c-09a1-8667-ee47572b9290@redhat.com>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

Sage,
Regarding the db assert, I hit that again on multiple OSDs while I was populating 40TB rbd images (~35TB written before the crash). I made the following changes in the code:

@@ -370,7 +370,7 @@ int RocksDBStore::submit_transaction(KeyValueDB::Transaction t)
   utime_t lat = ceph_clock_now(g_ceph_context) - start;
   logger->inc(l_rocksdb_txns);
   logger->tinc(l_rocksdb_submit_latency, lat);
-  return s.ok() ? 0 : -1;
+  return s.ok() ? 0 : -s.code();
 }

 int RocksDBStore::submit_transaction_sync(KeyValueDB::Transaction t)
@@ -385,7 +385,7 @@ int RocksDBStore::submit_transaction_sync(KeyValueDB::Transaction t)
   utime_t lat = ceph_clock_now(g_ceph_context) - start;
   logger->inc(l_rocksdb_txns_sync);
   logger->tinc(l_rocksdb_submit_sync_latency, lat);
-  return s.ok() ? 0 : -1;
+  return s.ok() ? 0 : -s.code();
 }

 int RocksDBStore::get_info_log_level(string info_log_level) {

With this change, -1 is printed in the log before the assert, so the corresponding code on the rocksdb side is "kNotFound". It is not related to space, as I hit the same issue irrespective of whether the db partition size is 100G or 300G. It seems to be some kind of corruption within Bluestore?
Let me know the next step.

Thanks & Regards
Somnath

-----Original Message-----
From: Sage Weil [mailto:sweil@redhat.com]
Sent: Thursday, August 11, 2016 9:36 AM
To: Mark Nelson
Cc: Somnath Roy; ceph-devel
Subject: Re: Bluestore assert

On Thu, 11 Aug 2016, Mark Nelson wrote:
> Sorry if I missed this during discussion, but why are these being
> called if the file is deleted?

I'm not sure... rocksdb is the one consuming the interface. Looking through the code, though, this is the only way I can see that we could log an op_file_update *after* an op_file_remove.

sage

>
> Mark
>
> On 08/11/2016 11:29 AM, Sage Weil wrote:
> > On Thu, 11 Aug 2016, Somnath Roy wrote:
> > > Sage,
> > > Please find the full log for the BlueFS replay bug in the
> > > following location.
> > >
> > > https://github.com/somnathr/ceph/blob/master/ceph-osd.1.log.zip
> > >
> > > For the db transaction one , I have added code to dump the rocksdb
> > > error code before the assert as you suggested and waiting to reproduce.
> > >
> > I'm pretty sure this is the root cause:
> >
> > https://github.com/ceph/ceph/pull/10686
> >
> > sage
> > --
> > To unsubscribe from this list: send the line "unsubscribe
> > ceph-devel" in the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>

---
diff --git a/src/os/bluestore/BlueStore.cc b/src/os/bluestore/BlueStore.cc
index fe7f743..3f4ecd5 100644
--- a/src/os/bluestore/BlueStore.cc
+++ b/src/os/bluestore/BlueStore.cc
@@ -4989,6 +4989,9 @@ void BlueStore::_kv_sync_thread()
            ++it) {
         _txc_finalize_kv((*it), (*it)->t);
         int r = db->submit_transaction((*it)->t);
+        if (r < 0 ) {
+          dout(0) << "submit_transaction returned = " << r << dendl;
+        }
         assert(r == 0);
       }
     }
@@ -5026,6 +5029,10 @@ void BlueStore::_kv_sync_thread()
         t->rm_single_key(PREFIX_WAL, key);
       }
       int r = db->submit_transaction_sync(t);
+      if (r < 0 ) {
+        dout(0) << "submit_transaction_sync returned = " << r << dendl;
+      }
+
       assert(r == 0);
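
For reading the value that the dout() lines in this patch print: since submit_transaction() now returns -s.code(), the logged number is the negated rocksdb status code, which is how -1 maps to "kNotFound". A minimal sketch of that decoding follows. The enum is a local copy of only the first few rocksdb::Status::Code values and the helper name status_code_name is hypothetical; neither is part of Ceph or rocksdb. In real code, logging s.ToString() on the rocksdb::Status would likely be the simpler option.

```cpp
#include <cassert>
#include <string>

// Assumption: local mirror of the first few rocksdb::Status::Code values,
// copied here so the sketch builds without the rocksdb headers.
enum class StatusCode : int {
  kOk = 0,
  kNotFound = 1,
  kCorruption = 2,
  kNotSupported = 3,
  kInvalidArgument = 4,
  kIOError = 5,
};

// Hypothetical helper: map the negated code returned by the patched
// submit_transaction()/submit_transaction_sync() back to a readable name.
// Codes not listed in the local enum are reported as "unknown".
std::string status_code_name(int r) {
  switch (static_cast<StatusCode>(-r)) {
    case StatusCode::kOk:              return "kOk";
    case StatusCode::kNotFound:        return "kNotFound";
    case StatusCode::kCorruption:      return "kCorruption";
    case StatusCode::kNotSupported:    return "kNotSupported";
    case StatusCode::kInvalidArgument: return "kInvalidArgument";
    case StatusCode::kIOError:         return "kIOError";
    default:                           return "unknown";
  }
}
```

With this helper, status_code_name(-1) yields "kNotFound", matching the value seen in the log before the assert.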