From patchwork Tue May 31 16:49:53 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Haomai Wang
X-Patchwork-Id: 9145201
Date: Wed, 1 Jun 2016 00:49:53 +0800
Subject: RocksDB Incorrect API Usage
From: Haomai Wang
To: Sage Weil, Mark Nelson
CC: "ceph-devel@vger.kernel.org"

Hi Sage and Mark,

As mentioned in the BlueStore standup, I found that the RocksDB iterator
*Seek* does not use the bloom filter the way *Get* does.

*Get* implementation: it checks the filter first.
https://github.com/facebook/rocksdb/blob/master/table/block_based_table_reader.cc#L1369

Iterator *Seek*: it does a binary search, and by default we don't specify
the prefix feature
(https://github.com/facebook/rocksdb/wiki/Prefix-Seek-API-Changes).
https://github.com/facebook/rocksdb/blob/master/table/block.cc#L94

So I ran a simple test:

./db_bench -num 10000000 -benchmarks fillbatch

This first fills the db with 10 million records.

./db_bench -use_existing_db -benchmarks readrandomfast

The readrandomfast case uses the *Get* API to retrieve data:

[root@hunter-node2 rocksdb]# ./db_bench -use_existing_db -benchmarks readrandomfast
LevelDB: version 4.3
Date: Wed Jun 1 00:29:16 2016
CPU: 32 * Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
CPUCache: 20480 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Writes per second: 0
Compression: Snappy
Memtablerep: skip_list
Perf Level: 0
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
DB path: [/tmp/rocksdbtest-0/dbbench]
readrandomfast : 4.570 micros/op 218806 ops/sec; (1000100 of 1000100 found, issued 46639 non-exist keys)

===========================

Then I modified readrandomfast to use the Iterator API [0]:

[root@hunter-node2 rocksdb]# ./db_bench -use_existing_db -benchmarks readrandomfast
LevelDB: version 4.3
Date: Wed Jun 1 00:33:03 2016
CPU: 32 * Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
CPUCache: 20480 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Writes per second: 0
Compression: Snappy
Memtablerep: skip_list
Perf Level: 0
WARNING: Assertions are enabled; benchmarks unnecessarily slow
------------------------------------------------
DB path: [/tmp/rocksdbtest-0/dbbench]
readrandomfast : 45.188 micros/op 22129 ops/sec; (1000100 of 1000100 found, issued 46639 non-exist keys)

45.188 us/op vs 4.570 us/op, roughly a 10x difference! The test is easy
to repeat. Please correct me if I'm doing something foolish that I'm not
aware of.

So I propose this PR: https://github.com/ceph/ceph/pull/9411

We can still make further improvements by auditing every iterator usage.

[0]:
--- a/db/db_bench.cc
+++ b/db/db_bench.cc
@@ -2923,14 +2923,12 @@ class Benchmark {
         int64_t key_rand = thread->rand.Next() & (pot - 1);
         GenerateKeyFromInt(key_rand, FLAGS_num, &key);
         ++read;
-        auto status = db->Get(options, key, &value);
-        if (status.ok()) {
-          ++found;
-        } else if (!status.IsNotFound()) {
-          fprintf(stderr, "Get returned an error: %s\n",
-                  status.ToString().c_str());
-          abort();
-        }
+        Iterator* iter = db->NewIterator(options);
+        iter->Seek(key);
+        if (iter->Valid() && iter->key().compare(key) == 0) {
+          found++;
+        }
+
         if (key_rand >= FLAGS_num) {
           ++nonexist;
         }
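
To illustrate why a whole-key bloom filter can help *Get* but not a plain *Seek*, here is a minimal toy sketch. It is not RocksDB code: ToyTable, ToyDB, MayContain and the searches counter are hypothetical names invented for this example, and the two-hash filter is a deliberately crude stand-in for a real bloom filter. The point is only that an exact-key filter can rule a table out for a point lookup, while a seek has to find the smallest key >= target, which may be a different key that the filter knows nothing about.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one sorted table file with a whole-key filter.
struct ToyTable {
  std::vector<std::string> keys;  // kept sorted
  std::vector<bool> bloom;        // bit array of the toy filter

  explicit ToyTable(std::vector<std::string> k, std::size_t bits = 1024)
      : keys(std::move(k)), bloom(bits, false) {
    assert(bits > 0);
    std::sort(keys.begin(), keys.end());
    for (const auto& key : keys) {
      bloom[Hash1(key) % bits] = true;
      bloom[Hash2(key) % bits] = true;
    }
  }

  static std::size_t Hash1(const std::string& s) {
    return std::hash<std::string>{}(s);
  }
  static std::size_t Hash2(const std::string& s) {
    return std::hash<std::string>{}(s + "#");
  }

  // False means "definitely absent"; true means "possibly present".
  bool MayContain(const std::string& key) const {
    return bloom[Hash1(key) % bloom.size()] &&
           bloom[Hash2(key) % bloom.size()];
  }
};

// Hypothetical stand-in for a database made of several tables.
struct ToyDB {
  std::vector<ToyTable> tables;
  mutable int searches = 0;  // binary searches performed so far

  // Point lookup: the filter lets us skip tables without searching them.
  bool Get(const std::string& key) const {
    for (const auto& t : tables) {
      if (!t.MayContain(key)) continue;
      ++searches;
      if (std::binary_search(t.keys.begin(), t.keys.end(), key)) return true;
    }
    return false;
  }

  // Seek: smallest key >= target across all tables. The exact-key filter
  // cannot be consulted, so every table is binary-searched.
  const std::string* Seek(const std::string& target) const {
    const std::string* best = nullptr;
    for (const auto& t : tables) {
      ++searches;
      auto it = std::lower_bound(t.keys.begin(), t.keys.end(), target);
      if (it != t.keys.end() && (best == nullptr || *it < *best)) best = &*it;
    }
    return best;
  }
};
```

With three tables, a Get for a key stored in the first table costs one binary search, while a Seek always costs three. This toy model only mirrors the shape of the problem; the real cost difference measured above also depends on everything else the iterator path does.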