From: Marcin Gibuła
To: ceph-devel@vger.kernel.org
Subject: Re: OSD memory usage during startup - advice needed
Date: Thu, 19 Nov 2015 21:30:34 +0100
Message-ID: <564E316A.8050601@beyond.pl>
References: <564E11ED.7080001@beyond.pl>
X-Patchwork-Id: 7661371
In-Reply-To: <564E11ED.7080001@beyond.pl>

> Judging from debug output, the problem is in journal recovery, when it
> tries to delete an object with a huge number of keys (several million; it
> is a radosgw index* for a bucket with over 50 million objects), using
> leveldb's rmkeys_by_prefix() method.
>
> Looking at the source code, rmkeys_by_prefix() batches all operations
> into one list and then submit_transaction() executes them all atomically.
>
> I'd love to write a patch for this issue, but it seems unfixable (or is
> it?) with the current API and method behaviour. Could you offer any advice
> on how to proceed?

Answering myself: could anyone verify whether the attached patch looks OK?
It should reduce the memory footprint a bit.

When I first read this code, I assumed that the data pointed to by a
leveldb::Slice had to stay reachable until db->Write() is called. However,
after looking into leveldb and its source code, there is no such
requirement: leveldb makes its own copy of the key when the operation is
added to the batch, so keeping a second copy in the `keys` list effectively
doubles the memory used for keys for no reason.
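To illustrate the ownership argument, here is a minimal stand-alone sketch.
ToySlice and ToyBatch are hypothetical stand-ins, not leveldb's classes, and
the "/" separator is an assumption made for readability rather than what
combine_strings() actually emits. The sketch only mimics the behaviour
described above: the batch copies the key bytes into a buffer it owns, so the
caller's string may go out of scope right after Delete() returns.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for leveldb::Slice: a non-owning pointer + length.
struct ToySlice {
  const char *data;
  std::size_t size;
  explicit ToySlice(const std::string &s) : data(s.data()), size(s.size()) {}
};

// Hypothetical stand-in for leveldb::WriteBatch. It serialises every
// operation into a buffer it owns; the record layout here is made up.
class ToyBatch {
public:
  void Delete(const ToySlice &key) {
    rep_.push_back('D');              // operation tag (illustrative)
    rep_.append(key.data, key.size);  // the key bytes are copied here
    rep_.push_back('\n');
    ++count_;
  }
  const std::string &rep() const { return rep_; }
  std::size_t count() const { return count_; }

private:
  std::string rep_;
  std::size_t count_ = 0;
};

// Mirrors the shape of the patched rmkey(): the combined key is a local
// and is not stored in any side list.
void rmkey(ToyBatch &bat, const std::string &prefix, const std::string &k) {
  std::string key = prefix + "/" + k;  // assumed separator, see lead-in
  bat.Delete(ToySlice(key));
}  // `key` is destroyed here; `bat` still holds its own copy of the bytes
```

After two calls such as rmkey(bat, "index", "obj1") and
rmkey(bat, "index", "obj2"), the batch still contains both keys even though
the locals are gone, which is the property the patch relies on.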
--- a/src/os/LevelDBStore.cc
+++ b/src/os/LevelDBStore.cc
@@ -156,9 +156,8 @@ void LevelDBStore::LevelDBTransactionImpl::set(
   buffers.push_back(to_set_bl);
   bufferlist &bl = *(buffers.rbegin());
   string key = combine_strings(prefix, k);
-  keys.push_back(key);
-  bat.Delete(leveldb::Slice(*(keys.rbegin())));
-  bat.Put(leveldb::Slice(*(keys.rbegin())),
+  bat.Delete(leveldb::Slice(key));
+  bat.Put(leveldb::Slice(key),
 	  leveldb::Slice(bl.c_str(), bl.length()));
 }
 
@@ -166,8 +165,7 @@ void LevelDBStore::LevelDBTransactionImpl::rmkey(const string &prefix,
 					         const string &k)
 {
   string key = combine_strings(prefix, k);
-  keys.push_back(key);
-  bat.Delete(leveldb::Slice(*(keys.rbegin())));
+  bat.Delete(leveldb::Slice(key));
 }
 
 void LevelDBStore::LevelDBTransactionImpl::rmkeys_by_prefix(const string &prefix)
@@ -177,8 +175,7 @@ void LevelDBStore::LevelDBTransactionImpl::rmkeys_by_prefix(const string &prefix
        it->valid();
        it->next()) {
     string key = combine_strings(prefix, it->key());
-    keys.push_back(key);
-    bat.Delete(*(keys.rbegin()));
+    bat.Delete(key);
   }
 }
 
diff --git a/src/os/LevelDBStore.h b/src/os/LevelDBStore.h
index 4617c5c..dd248dd 100644
--- a/src/os/LevelDBStore.h
+++ b/src/os/LevelDBStore.h
@@ -175,7 +175,6 @@ public:
   public:
     leveldb::WriteBatch bat;
     list<bufferlist> buffers;
-    list<string> keys;
     LevelDBStore *db;
 
     LevelDBTransactionImpl(LevelDBStore *db) : db(db) {}