From patchwork Thu Feb 1 10:24:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540886 Received: from wout5-smtp.messagingengine.com (wout5-smtp.messagingengine.com [64.147.123.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 49DF815B11A for ; Thu, 1 Feb 2024 10:25:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783104; cv=none; b=os5XDpglDKE+tSuGuuklzsz80t6K8TwzBD7eXmkaNuWRkJukOEKId0QEZzN+IQQukvRQClLQvzmewKn2Jhlm5/6KD66cFvXlRZ4qq7VU5rs++MBalzpMF9TEEc8hwNZxi0IDvWgWbM91KgPA1dwvZt1tp2r0YptZrmfWikaiGtU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783104; c=relaxed/simple; bh=t1lhvWCIhLgBM0o9XXQ6fD4AOsIEGH8RJVfbe++jovM=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=VcvmywLblR+3f8LAo+kd8wB0RRo6ATHk/wj04NcKSfLGcyn7QcOB8W/aMLNQtGy34jUX5MOpEEZBATmhnL6KnWmjF2XFvfYU2rJCz36UTr4m1VKM3R//MZS5WZEoK4UzDZ00uFHCG0Jq9XzsPBx6rmFC0K29wml4yCKAipIrlVs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=MVQy2UG1; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=JyRpzqqf; arc=none smtp.client-ip=64.147.123.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="MVQy2UG1"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="JyRpzqqf" Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.west.internal (Postfix) with ESMTP id 64D373200AD5 for ; Thu, 1 Feb 2024 05:25:01 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Thu, 01 Feb 2024 05:25:01 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783100; x=1706869500; bh=Lhv61fwiKG 9qJgUQhy1YufZexrQNBvebUo+okskXm1I=; b=MVQy2UG1/DFcCuHWXcayOb5hTI TnYOH/zHbAHTfAm0JPZnYfhrRFDFJ4HdNlJiozZ+IxRa0QuAZVwT8uSpWnsk/9ST YTLzKDlLnP+s9lI/prrgFaclGy1abn6VZgItO0MTePKa5wH82a42LwfT6YXsKlKz DerRxcqwkp7bAKflbaUTQmkBz3IxmSMmvGbShXeeOw2Phrelq0wxGcYf9Kyz6EF0 MJlU9nl1APB5zk43PxXxapjvCEbjN9TtBEC5PliRPUh48eIswJ8EDweEMIbsVYBb 84wQKhUnsVZCpjctLr0zzenTWmmjyxkbUgycSx9RYcYSWGZSA8CW8qXWTJGA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783100; x=1706869500; bh=Lhv61fwiKG9qJgUQhy1YufZexrQN BvebUo+okskXm1I=; b=JyRpzqqfCfGFdiKZiLb2XKz4+w4wYpuODQ/dC/BSxSdp rOT9UKpdmlMoqgYNcfTO0P3GcFfLMSz8ZdTN6ADjI3IwXX/VLkpQe9mo9aDl2LIi L8cWWDWB4JDuiY5gMXMSJG+t1LGB3EKGWAKVVH8ZawxUBtbfbe7jIuAiiF8g4R8J CteOweQielVCfy44uUPs7k/k84XoMpVayngEZJFkASgBwhdes3xR/negnI/BQ7UP j53lzCkJI8utAhdWnAlfzwiq1uSPTm2jt4owaFIgOchn+aX6ltS0U4xP8VRYw2fi KRaie4e8erEyhnUEMqTxANCYWFeSBEIdRAWtXtaYKg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:24:59 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 1270aebe (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:21:37 +0000 (UTC) Date: Thu, 1 Feb 2024 11:24:58 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 1/7] reftable/record: introduce function to compare records by key Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In some places we need to sort reftable records by their keys to determine their ordering. This is done by first formatting the keys into a `struct strbuf` and then using `strbuf_cmp()` to compare them. This logic is needlessly roundabout and can end up costing quite a bit fo CPU cycles, both due to the allocation and formatting logic. Introduce a new `reftable_record_cmp()` function that knows to compare two records with each other without requiring allocations. Signed-off-by: Patrick Steinhardt --- reftable/record.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++- reftable/record.h | 7 ++++++ 2 files changed, 68 insertions(+), 1 deletion(-) diff --git a/reftable/record.c b/reftable/record.c index 5c3fbb7b2a..f1b6a5eac9 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -430,7 +430,6 @@ static int reftable_ref_record_is_deletion_void(const void *p) (const struct reftable_ref_record *)p); } - static int reftable_ref_record_equal_void(const void *a, const void *b, int hash_size) { @@ -439,6 +438,13 @@ static int reftable_ref_record_equal_void(const void *a, return reftable_ref_record_equal(ra, rb, hash_size); } +static int reftable_ref_record_cmp_void(const void *_a, const void *_b) +{ + const struct reftable_ref_record *a = _a; + const struct reftable_ref_record *b = _b; + return strcmp(a->refname, b->refname); +} + static void reftable_ref_record_print_void(const void *rec, int hash_size) { @@ -455,6 +461,7 @@ static struct reftable_record_vtable reftable_ref_record_vtable = { .release = &reftable_ref_record_release_void, .is_deletion = &reftable_ref_record_is_deletion_void, .equal = &reftable_ref_record_equal_void, + .cmp = &reftable_ref_record_cmp_void, .print = &reftable_ref_record_print_void, }; @@ -625,6 +632,25 @@ static int reftable_obj_record_equal_void(const void *a, const void *b, int hash return 1; } +static int reftable_obj_record_cmp_void(const void *_a, const void *_b) +{ + const struct reftable_obj_record *a = _a; + const struct reftable_obj_record *b = _b; + int cmp; + + cmp = memcmp(a->hash_prefix, b->hash_prefix, + a->hash_prefix_len > b->hash_prefix_len ? + a->hash_prefix_len : b->hash_prefix_len); + if (cmp) + return cmp; + + /* + * When the prefix is the same then the object record that is longer is + * considered to be bigger. + */ + return a->hash_prefix_len - b->hash_prefix_len; +} + static struct reftable_record_vtable reftable_obj_record_vtable = { .key = &reftable_obj_record_key, .type = BLOCK_TYPE_OBJ, @@ -635,6 +661,7 @@ static struct reftable_record_vtable reftable_obj_record_vtable = { .release = &reftable_obj_record_release, .is_deletion = ¬_a_deletion, .equal = &reftable_obj_record_equal_void, + .cmp = &reftable_obj_record_cmp_void, .print = &reftable_obj_record_print, }; @@ -953,6 +980,22 @@ static int reftable_log_record_equal_void(const void *a, hash_size); } +static int reftable_log_record_cmp_void(const void *_a, const void *_b) +{ + const struct reftable_log_record *a = _a; + const struct reftable_log_record *b = _b; + int cmp = strcmp(a->refname, b->refname); + if (cmp) + return cmp; + + /* + * Note that the comparison here is reversed. This is because the + * update index is reversed when comparing keys. For reference, see how + * we handle this in reftable_log_record_key()`. + */ + return b->update_index - a->update_index; +} + int reftable_log_record_equal(const struct reftable_log_record *a, const struct reftable_log_record *b, int hash_size) { @@ -1002,6 +1045,7 @@ static struct reftable_record_vtable reftable_log_record_vtable = { .release = &reftable_log_record_release_void, .is_deletion = &reftable_log_record_is_deletion_void, .equal = &reftable_log_record_equal_void, + .cmp = &reftable_log_record_cmp_void, .print = &reftable_log_record_print_void, }; @@ -1077,6 +1121,13 @@ static int reftable_index_record_equal(const void *a, const void *b, int hash_si return ia->offset == ib->offset && !strbuf_cmp(&ia->last_key, &ib->last_key); } +static int reftable_index_record_cmp(const void *_a, const void *_b) +{ + const struct reftable_index_record *a = _a; + const struct reftable_index_record *b = _b; + return strbuf_cmp(&a->last_key, &b->last_key); +} + static void reftable_index_record_print(const void *rec, int hash_size) { const struct reftable_index_record *idx = rec; @@ -1094,6 +1145,7 @@ static struct reftable_record_vtable reftable_index_record_vtable = { .release = &reftable_index_record_release, .is_deletion = ¬_a_deletion, .equal = &reftable_index_record_equal, + .cmp = &reftable_index_record_cmp, .print = &reftable_index_record_print, }; @@ -1147,6 +1199,14 @@ int reftable_record_is_deletion(struct reftable_record *rec) reftable_record_data(rec)); } +int reftable_record_cmp(struct reftable_record *a, struct reftable_record *b) +{ + if (a->type != b->type) + BUG("cannot compare reftable records of different type"); + return reftable_record_vtable(a)->cmp( + reftable_record_data(a), reftable_record_data(b)); +} + int reftable_record_equal(struct reftable_record *a, struct reftable_record *b, int hash_size) { if (a->type != b->type) diff --git a/reftable/record.h b/reftable/record.h index fd80cd451d..0d96fbfd1b 100644 --- a/reftable/record.h +++ b/reftable/record.h @@ -62,6 +62,12 @@ struct reftable_record_vtable { /* Are two records equal? This assumes they have the same type. Returns 0 for non-equal. */ int (*equal)(const void *a, const void *b, int hash_size); + /* + * Compare keys of two records with each other. The records must have + * the same type. + */ + int (*cmp)(const void *a, const void *b); + /* Print on stdout, for debugging. */ void (*print)(const void *rec, int hash_size); }; @@ -114,6 +120,7 @@ struct reftable_record { }; /* see struct record_vtable */ +int reftable_record_cmp(struct reftable_record *a, struct reftable_record *b); int reftable_record_equal(struct reftable_record *a, struct reftable_record *b, int hash_size); void reftable_record_print(struct reftable_record *rec, int hash_size); void reftable_record_key(struct reftable_record *rec, struct strbuf *dest); From patchwork Thu Feb 1 10:25:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540887 Received: from wout5-smtp.messagingengine.com (wout5-smtp.messagingengine.com [64.147.123.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 386AE15B990 for ; Thu, 1 Feb 2024 10:25:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783108; cv=none; b=EwStRGkDy4c0Q7EDMNmROsLlLPUceaeQUjsTp8xSH+1QWkQjwGgcwNf08n4kqDTKRurA6atAsLVIF0zMhEeVJkRqLcMOEAn89/7x5bKIX+VL4TeJbnGTue17Fefsd5fRElYgInf3v9VslssKt2UULCCC3hpd+KygqHX9NdJ2OKs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783108; c=relaxed/simple; bh=0mp2WUd2epIIoEhc6vE0SXtPS7rLXG2Ojz9rCgy2qrI=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=KKaXt2bSa2vvA093NdeD1vgixHd6b8nspvsB1vpEHQsPY7FJq6s72A5G53LjXPLe0Z4fQi2isF0NuDehxpfmMDhuKI4f2Lzoa3TgQZTdkN75KeoJZV8GUrn5CEhzDI6QSSTAIhuY0SHEMtDojFWa1Minw7WiLfbVSYI8IpM/ZDg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=k47CaNFv; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=ovAP9mJm; arc=none smtp.client-ip=64.147.123.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="k47CaNFv"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="ovAP9mJm" Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 23BB63200B1C for ; Thu, 1 Feb 2024 05:25:05 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Thu, 01 Feb 2024 05:25:05 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783104; x=1706869504; bh=oohmreglTx TteCMuyKsk6GVYX+wxqA2vk70Tz8xOXxs=; b=k47CaNFv8Frf7UIDGEyM3dsED1 vJpajsGfMVcrgaoCErvM3Xpw3ASip5CE9YOPMhM2xwkqQNqmMeGDaQpxHKgiwIPs DNUT5p12LWjG8RXFYx7lQmSPjvmu4n+LmrQds12saoUihqGA+hP9hAYGLcrrNEkh sPiQiXqsNHA7Zkx+9t87IVd721T6kcpHqIwZVoaJh0XRzNu8yeyOA2m5TbBPIpRE mTLG1bOfFBLdA6Od7SD9ipdD47YRjdEG0rFROj/IX+zxSb8T/VUD6YXXVS2FSyrc BDSBVaoddyZwSTY85wz4qJbqCT829H4vU2FjeYQD8nDhHbTbFH/TDWSqbQcg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783104; x=1706869504; bh=oohmreglTxTteCMuyKsk6GVYX+wx qA2vk70Tz8xOXxs=; b=ovAP9mJmqeYmXBOUjsFymGQITg+5SPFIx9qIhoDQECxF Eh1WxDY26B1+S+8hT8tV0alRM3On6y8LvpNFdzY+sa41LAm9QxO06RjVVQ7oEf4y UwmwDt4jNPVylDHcirD1Kz7Sh5KAOzu4LcDYvdj3a/dGCThx1Zr130KDG2nGV5x3 nQL2TLefvdtxF9Wl7GA3WYvfZDYfuzFBAJ2dyS2iaKyfOX0/y1dC0cSCbOsmPCGs xvgAzax1dG9uPi4kN2e5swrWIFzPZF5xAPDD8Nj82lG5DgrCe48COKxxs6jZrnQo +wO9IvBy6BcjtCOwGNlym2y7yr9O52tIIi2gwPiHzg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtjeenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhephefgjeeuveejteduhefgffefffdvjeefje eivdekfffgkeeugfehveetueefleeknecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:25:03 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 8240fba4 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:21:41 +0000 (UTC) Date: Thu, 1 Feb 2024 11:25:02 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 2/7] reftable/merged: allocation-less dropping of shadowed records Message-ID: <576a96f2e5cf462d2f5793a0c8244f4966789654.1706782841.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The purpose of the merged reftable iterator is to iterate through all entries of a set of tables in the correct order. This is implemented by using a sub-iterator for each table, where the next entry of each of these iterators gets put into a priority queue. For each iteration, we do roughly the following steps: 1. Retrieve the top record of the priority queue. This is the entry we want to return to the caller. 2. Retrieve the next record of the sub-iterator that this record came from. If any, add it to the priority queue at the correct position. The position is determined by comparing the record keys, which e.g. corresponds to the refname for ref records. 3. Keep removing the top record of the priority queue until we hit the first entry whose key is larger than the returned record's key. This is required to drop "shadowed" records. The last step will lead to at least one comparison to the next entry, but may lead to many comparisons in case the reftable stack consists of many tables with shadowed records. It is thus part of the hot code path when iterating through records. The code to compare the entries with each other is quite inefficient though. Instead of comparing record keys with each other directly, we first format them into `struct strbuf`s and only then compare them with each other. While we already optimized this code path to reuse buffers in 829231dc20 (reftable/merged: reuse buffer to compute record keys, 2023-12-11), the cost to format the keys into the buffers still adds up quite significantly. Refactor the code to use `reftable_record_cmp()` instead, which has been introduced in the preceding commit. This function compares records with each other directly without requiring any memory allocations or copying and is thus way more efficient. The following benchmark uses git-show-ref(1) to print a single ref matching a pattern out of 1 million refs. This is the most direct way to exercise ref iteration speed as we remove all overhead of having to show the refs, too. Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 180.7 ms ± 4.7 ms [User: 177.1 ms, System: 3.4 ms] Range (min … max): 174.9 ms … 211.7 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 162.1 ms ± 4.4 ms [User: 158.5 ms, System: 3.4 ms] Range (min … max): 155.4 ms … 189.3 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.11 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/merged.c | 11 ++--------- reftable/merged.h | 2 -- 2 files changed, 2 insertions(+), 11 deletions(-) diff --git a/reftable/merged.c b/reftable/merged.c index c258ce953e..fb9978d798 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -51,8 +51,6 @@ static void merged_iter_close(void *p) reftable_iterator_destroy(&mi->stack[i]); } reftable_free(mi->stack); - strbuf_release(&mi->key); - strbuf_release(&mi->entry_key); } static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi, @@ -105,14 +103,11 @@ static int merged_iter_next_entry(struct merged_iter *mi, such a deployment, the loop below must be changed to collect all entries for the same key, and return new the newest one. */ - reftable_record_key(&entry.rec, &mi->entry_key); while (!merged_iter_pqueue_is_empty(mi->pq)) { struct pq_entry top = merged_iter_pqueue_top(mi->pq); - int cmp = 0; + int cmp; - reftable_record_key(&top.rec, &mi->key); - - cmp = strbuf_cmp(&mi->key, &mi->entry_key); + cmp = reftable_record_cmp(&top.rec, &entry.rec); if (cmp > 0) break; @@ -246,8 +241,6 @@ static int merged_table_seek_record(struct reftable_merged_table *mt, .typ = reftable_record_type(rec), .hash_id = mt->hash_id, .suppress_deletions = mt->suppress_deletions, - .key = STRBUF_INIT, - .entry_key = STRBUF_INIT, }; int n = 0; int err = 0; diff --git a/reftable/merged.h b/reftable/merged.h index d5b39dfe7f..7d9f95d27e 100644 --- a/reftable/merged.h +++ b/reftable/merged.h @@ -31,8 +31,6 @@ struct merged_iter { uint8_t typ; int suppress_deletions; struct merged_iter_pqueue pq; - struct strbuf key; - struct strbuf entry_key; }; void merged_table_release(struct reftable_merged_table *mt); From patchwork Thu Feb 1 10:25:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540888 Received: from wfhigh4-smtp.messagingengine.com (wfhigh4-smtp.messagingengine.com [64.147.123.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1A2B915CD41 for ; Thu, 1 Feb 2024 10:25:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783112; cv=none; b=cJD8DZ16MHRzbwijYOuOGiQSTM4mh+pOxrAmKyL839rRyHoWmz+OOKOBfCQNJDHpDsWYHwxqEGc2NAHRdzA3hmXjNI3d6sHPnXkwuXk2V+GLLDm8irWRD87awSrpwhdPsO5WktW0pojgWX/r4eJd0klTMZqHrcaaDHJgB/kzums= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783112; c=relaxed/simple; bh=q07VK1G7ghTfSbM8ha7oMYJsMjU1/IS6+YHOuvAh9lA=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=DIMuYkmoSf5nAyv9CiANer45FTIgmx57djopoVZKNlBEpqGlMWoeX7yxEJWLHz2j4ZQNjXihNu0404sGEC6lFIS0Ly05ICO8Wrzz3P9+J9i5K5KcDjgXJc5t1ig5FzAzQDgG72MbfkBNRgYk4mNLuOSW3Zq3S5vFDJsKkfz0mY8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=TVV8Jlpm; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=TlF2Qgm4; arc=none smtp.client-ip=64.147.123.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="TVV8Jlpm"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="TlF2Qgm4" Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailfhigh.west.internal (Postfix) with ESMTP id 2B8F91800089 for ; Thu, 1 Feb 2024 05:25:09 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Thu, 01 Feb 2024 05:25:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783108; x=1706869508; bh=IDlrT2Tn3h FqGEzl+OPSrJ8Pqa/MpCdct8/W6QvvLnQ=; b=TVV8JlpmKzLHZMiBpAktNi5Lpx AzPx7oBb6IFSQe+K20pCrFVHdLAoI+k3TXU8JqGL4tJzPqcuqd+pY5jeHiyarRQ3 Td3ScgXbPHbs8WZHrS0qQ1mbQcxYh0mdzAce1rcuqyZVU7WJX2SxX/0cYlN15Xdu QUAFueWhMgEu3ussuFmcbW8qLjiOZ6+9QEd5x1SdMX0G8FdWgxlRwPZ4FZUZVfJA KhvHENb5UuqTlBQmSld+6fb+hlIHAFA355cojFerNT68N7GU7SBWa5DBlxW71SUj l68MtDQOnssbfPDlCzXGW7BNa1aF2DZe1Eqf4WXrG/UTs2Cg7TmoMhT1031Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783108; x=1706869508; bh=IDlrT2Tn3hFqGEzl+OPSrJ8Pqa/M pCdct8/W6QvvLnQ=; b=TlF2Qgm4K5riOO1Ueu0ttSy+9Ag9RPMqLoqzu6ZhCH9r sXmXyFGZKBpPd9sENgvCVgriCeGfDp43fgkRmnpcFbMDA0J4300CkVu11Abs+zt5 woXfHpWZSPH/pE9nGxm3W3FxmQTDZ5EfAWu1j1z+LsU2XjGVMDHhzSKYnrHUkkgk jPNZKTMyu53wYn6aC3t4PQGtAFal0m33yYDT2/6gfRReakSMkiwvU0DyHAAPNY+s wueKe4vLgb61hhBbZ047LYmFRpAzD/zAMulRvzoaxmPQ3FR8IchYXfypNh+w5MR9 uuxKhH3tDOWPqvpO6M5Kp4NLnKxWBMpbqnz0Mq7HPw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtjeenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhephefgjeeuveejteduhefgffefffdvjeefje eivdekfffgkeeugfehveetueefleeknecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:25:07 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 0257954a (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:21:45 +0000 (UTC) Date: Thu, 1 Feb 2024 11:25:06 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 3/7] reftable/merged: skip comparison for records of the same subiter Message-ID: <0ca86eba710895f0e22fc15fe5221f5487031f64.1706782841.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When retrieving the next entry of a merged iterator we need to drop all records of other sub-iterators that would be shadowed by the record that we are about to return. We do this by comparing record keys, dropping all keys that are smaller or equal to the key of the record we are about to return. There is an edge case here where we can skip that comparison: when the record in the priority queue comes from the same subiterator than the record we are about to return then we know that its key must be larger than the key of the record we are about to return. This property is guaranteed by the sub-iterators, and if it didn't hold then the whole merged iterator would return records in the wrong order, too. While this may seem like a very specific edge case it's in fact quite likely to happen. For most repositories out there you can assume that we will end up with one large table and several smaller ones on top of it. Thus, it is very likely that the next entry will sort towards the top of the priority queue. Special case this and break out of the loop in that case. The following benchmark uses git-show-ref(1) to print a single ref matching a pattern out of 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 162.6 ms ± 4.5 ms [User: 159.0 ms, System: 3.5 ms] Range (min … max): 156.6 ms … 188.5 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 156.8 ms ± 4.7 ms [User: 153.0 ms, System: 3.6 ms] Range (min … max): 151.4 ms … 188.4 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.04 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/merged.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/reftable/merged.c b/reftable/merged.c index fb9978d798..0f74a14a77 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -107,6 +107,14 @@ static int merged_iter_next_entry(struct merged_iter *mi, struct pq_entry top = merged_iter_pqueue_top(mi->pq); int cmp; + /* + * When the next entry comes from the same queue as the current + * entry then it must by definition be larger. This avoids a + * comparison in the most common case. + */ + if (top.index == entry.index) + break; + cmp = reftable_record_cmp(&top.rec, &entry.rec); if (cmp > 0) break; From patchwork Thu Feb 1 10:25:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540889 Received: from wfhigh4-smtp.messagingengine.com (wfhigh4-smtp.messagingengine.com [64.147.123.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BBD2C15B964 for ; Thu, 1 Feb 2024 10:25:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783116; cv=none; b=Ynv6PgSwuDwxb253pJtHFqZlD2vK/7yOM/wAQIPJLMsDcLYujCeUTssV+l+RFEXNiNQqQ+qoc6/k7k2PS9UcjpdgQOzz86PsLEUypO5ynYnpJ0RkSvWb+ig2EkywsXxDzVRJkifSYhYl2jAhKwqgs2O4U71KvioFjaCh3+9g5sI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783116; c=relaxed/simple; bh=OsJ8/eNqG8jcNoZ95dVKdNW714/2RfGsvyCXIA6aLzw=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=nK2nsRZQRGbZ2l2QU1Ty/CXm+xjeXL0EgkOxSeJbftDASilIqrYssgSnShbZIT9KT4DmAWb/UGxVT8i8mUJFRwD04LdUtDt1xMyPO9/wTctmnBOU31QQ1M0zd5SGmy+OahIa41Ic5CZTXoIGsqB94LvBixk4ikaGeMrOgU0Sofg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=gWO0F3EZ; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=V4SZSBSb; arc=none smtp.client-ip=64.147.123.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="gWO0F3EZ"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="V4SZSBSb" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfhigh.west.internal (Postfix) with ESMTP id D5C6A180007A for ; Thu, 1 Feb 2024 05:25:13 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Thu, 01 Feb 2024 05:25:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783113; x=1706869513; bh=TL7nbh2/B/ 7RlZEt8sHbJki5BX2GEwIlzPMIot3hJc4=; b=gWO0F3EZLM5MDU2m0HCJpUGFcE sVxdFnu2a9YnghemVsbJEXkdMJ25sKIypaIZ97npmrrh9KD7ajxyDHcYHdA8wX+8 Tt3HP0sbKOmxsvIDDZymxA3j3vaNysawlJiUFnJg13B92KguCynhrScw9SG7PCu0 xfWj5d8qQGENBfQxTSvfHfT07tDOGwZwMh1bURwaRiWW+34vu2I2+VAaYXa+mE20 RYDOFfpLmnKaWaPQrcGwBi6lgO5j+SB3IVzppizVD5gA2N+1phLDKzC7IilvbJvA HUQb6gWgh4dx9xWPvvAyHT8Wk/MN7AU8AW3whvg7iCZGQu4xKCtDJp0AM/sA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783113; x=1706869513; bh=TL7nbh2/B/7RlZEt8sHbJki5BX2G EwIlzPMIot3hJc4=; b=V4SZSBSb7cbTiNnGQ+1/UacczfQDQLnJQwZWQCJf4KM4 GI5Re1AT2LNKTbahnwq/t+/JCYQ+rOkL6CbeP13VRQvB8VPR4DW18FtihhUEKTC2 GMSDRUy1twfch8l3vt7UYQhHLj0gpjlQzKSi85Xu9kn5+xxi/S4FHUC4Bh9nS287 Bw0xdcB+YCzmzkF4AY2uQgyd4ExhrWVIDUFf82sY1RL7M3lu+N9Dd62c4nyRbTGn JhkJLbHcFMBH0YtTUc4UlmACDaB0tfeKfi+6Ii3gTBf7NhbgZRHgnjYB1GJIZZeQ 8C9nNStzriQntHxiwAajpp+4mMrc/iZJy8vnlFTbzQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtjeenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhephefgjeeuveejteduhefgffefffdvjeefje eivdekfffgkeeugfehveetueefleeknecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:25:12 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id c318743b (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:21:49 +0000 (UTC) Date: Thu, 1 Feb 2024 11:25:10 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 4/7] reftable/pq: allocation-less comparison of entry keys Message-ID: <1c9c19a3b3daf9690fda0423c52da33e337c8b54.1706782841.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The priority queue is used by the merged iterator to iterate over reftable records from multiple tables in the correct order. The queue ends up having one record for each table that is being iterated over, with the record that is supposed to be shown next at the top. For example, the key of a ref record is equal to its name so that we end up sorting the priority queue lexicographically by ref name. To figure out the order we need to compare the reftable record keys with each other. This comparison is done by formatting them into a `struct strbuf` and then doing `strbuf_strcmp()` on the result. We then discard the buffers immediately after the comparison. This ends up being very expensive. Because the priority queue usually contains as many records as we have tables, we call the comparison function `O(log($tablecount))` many times for every record we insert. Furthermore, when iterating over many refs, we will insert at least one record for every ref we are iterating over. So ultimately, this ends up being called `O($refcount * log($tablecount))` many times. Refactor the code to use the new `refatble_record_cmp()` function that has been implemented in a preceding commit. This function does not need to allocate memory and is thus significantly more efficient. The following benchmark prints a single ref matching a specific pattern out of 1 million refs via git-show-ref(1), where the reftable stack consists of three tables: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 224.4 ms ± 6.5 ms [User: 220.6 ms, System: 3.6 ms] Range (min … max): 216.5 ms … 261.1 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 172.9 ms ± 4.4 ms [User: 169.2 ms, System: 3.6 ms] Range (min … max): 166.5 ms … 204.6 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.30 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/pq.c | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/reftable/pq.c b/reftable/pq.c index dcefeb793a..7220efc39a 100644 --- a/reftable/pq.c +++ b/reftable/pq.c @@ -14,20 +14,9 @@ license that can be found in the LICENSE file or at int pq_less(struct pq_entry *a, struct pq_entry *b) { - struct strbuf ak = STRBUF_INIT; - struct strbuf bk = STRBUF_INIT; - int cmp = 0; - reftable_record_key(&a->rec, &ak); - reftable_record_key(&b->rec, &bk); - - cmp = strbuf_cmp(&ak, &bk); - - strbuf_release(&ak); - strbuf_release(&bk); - + int cmp = reftable_record_cmp(&a->rec, &b->rec); if (cmp == 0) return a->index > b->index; - return cmp < 0; } From patchwork Thu Feb 1 10:25:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540890 Received: from wfhigh4-smtp.messagingengine.com (wfhigh4-smtp.messagingengine.com [64.147.123.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A5AE15B964 for ; Thu, 1 Feb 2024 10:25:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783119; cv=none; b=PCaTo5M4jk3RZfQLUenMW+yVeb0znx7yKOF5bOYXqIIdU1vKR16XSgyD3vZtE8yvkuhnUOcCqa2giCERAjG4FHnWMkT1RvaTdKtrgWQh3PS3uSISCbu3ZcSA3d0t79R4+PZzi0cZ04NSSQ9sA+FU4skrBwwdPgZGG95YVcLtrro= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783119; c=relaxed/simple; bh=lUG38f103Jl8DBAb32Hr0CeZ0Z+WL1i2ePjrbMkMN68=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ZNIZ6ncPph1aC1zyF0tDZjCLcSb93+qR0PXsTSoIuGPA1pflfl0ckDzClWQIOvS+I7vLYvDmLmkEE/GUsZHj2sKeeTG4lQ3+VhFH515f8GmsLsAZJ+UDuFwPAl3/55oHp6wyE3Li/NmH3lxJuwpEbjklyG2F35P37wlyMqJPygM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=ZaJ2AU1k; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=TbZUD3PH; arc=none smtp.client-ip=64.147.123.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="ZaJ2AU1k"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="TbZUD3PH" Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailfhigh.west.internal (Postfix) with ESMTP id 8E762180007A for ; Thu, 1 Feb 2024 05:25:17 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Thu, 01 Feb 2024 05:25:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783117; x=1706869517; bh=VDwjXmBopJ cxeEzKGPCq58246N5MQB0qX7df64U00WQ=; b=ZaJ2AU1k8eskFp9vCO1Rthrp5z GWnhM0H1TgLyuX9ZUMcKt7YJzA8ZtYtF2aSzBdufjio5DZ2e7wdfQuGIfoI32mzx G44TOpDzNkM+NiOd5NnfadcICnm76Rn7fY3+/p9GudXsmkypgsxtPvKgEoCggl5x QTKhZQaftt8gBNUU5W4eyM3qvHaCgKG7b2qetbIP18Mlz6YYtxF51Ri1YLJcpHyF Oxl7GuQw+DQaytGvYX8aQzy/NFHWDpwO014crZeIlT9r/o+DhIXLofWnJaZWNd5H 5Wi+r6Gv2i/B4iY8Y0J4hwIPpzeJctoHe6UL4Xb++KaJStbWNVUJH6nqS0Yw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783117; x=1706869517; bh=VDwjXmBopJcxeEzKGPCq58246N5M QB0qX7df64U00WQ=; b=TbZUD3PHEMo8FYY3xqvSBZhM+uF+loZGULmrDP26mppa Eu1ZgwZE3lSZeEIkDX7fi4zTHVgNKBqXkUBm/jA5rR9BV98ojrroQ3se/QD0bJ3H UCBKrNrSqATKCj4ELXI1eYpDC9Hrx5mT4LHoBxODZE1Ex4kWf2Wi1Yx/AWqXma0X Gkl2v7Uk+4gP2rsG3SE7AI7NP/vcutbvg/81YsYGoHd1RkxXzFB8je5bhWtxplZc X9CVNCkDcFu9CFisBPDa6ZfrOTzDq0gEcz0Zpp2dCKkNaFPDxeZ98NQbLvHQ1B6l e+FiN7FsI2gh8wN0CqWZkHVb9lyQg5za1FT9L4fQCA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtjeenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhephefgjeeuveejteduhefgffefffdvjeefje eivdekfffgkeeugfehveetueefleeknecuvehluhhsthgvrhfuihiivgepudenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:25:16 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id a59ab554 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:21:53 +0000 (UTC) Date: Thu, 1 Feb 2024 11:25:14 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 5/7] reftable/block: swap buffers instead of copying Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When iterating towards the next record in a reftable block we need to keep track of the key that the last record had. This is required because reftable records use prefix compression, where subsequent records may reuse parts of their preceding record's key. This key is stored in the `block_iter::last_key`, which we update after every call to `block_iter_next()`: we simply reset the buffer and then add the current key to it. This is a bit inefficient though because it requires us to copy over the key on every iteration, which adds up when iterating over many records. Instead, we can make use of the fact that the `block_iter::key` buffer is basically only a scratch buffer. So instead of copying over contents, we can just swap both buffers. The following benchmark prints a single ref matching a specific pattern out of 1 million refs via git-show-ref(1): Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 155.7 ms ± 5.0 ms [User: 152.1 ms, System: 3.4 ms] Range (min … max): 150.8 ms … 185.7 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 150.8 ms ± 4.2 ms [User: 147.1 ms, System: 3.5 ms] Range (min … max): 145.1 ms … 180.7 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.03 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/block.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 1df3d8a0f0..44381ea6a3 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -342,8 +342,7 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec) return -1; string_view_consume(&in, n); - strbuf_reset(&it->last_key); - strbuf_addbuf(&it->last_key, &it->key); + strbuf_swap(&it->last_key, &it->key); it->next_off += start.len - in.len; return 0; } From patchwork Thu Feb 1 10:25:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540891 Received: from wfhigh4-smtp.messagingengine.com (wfhigh4-smtp.messagingengine.com [64.147.123.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2F89F15B119 for ; Thu, 1 Feb 2024 10:25:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783123; cv=none; b=AXBxRj0rjXD9u386m9mqJbB15C4wNY+Xge1V9tbXZ3DDB1E1EUYhypAGUXWEKujDha4cIiotMLtQ5YUBS0ty9gbBmNHJLSfCOQs470q81m/RUzktxFU2KgRRKbiae/KsZDCNPiNlJEalb/uJ3GRJc3eGSQtAMgOBMSgxUxfUoE4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783123; c=relaxed/simple; bh=1CoiM0BAo4Y14MB2O4BLRomN+MVAgQbWGoI0OhPSRXE=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=gY8EFUejliuDQNWEiZE1pIxbKCIVZYNSkG2SDFrA756hUsnL3KEvvslODP1WSugHYlvR8EPc2XqiTM94Wr/kmOKOAakipIeIj+hJDugmukUMMfzN49YX49h1vCgQyKtL/cWMtm7uNhNeL+V+ryX+nDFVAg02gbsfi4GGzp0YZtU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=eMwvvJpk; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=c4RPy9Dr; arc=none smtp.client-ip=64.147.123.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="eMwvvJpk"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="c4RPy9Dr" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfhigh.west.internal (Postfix) with ESMTP id 45717180007A for ; Thu, 1 Feb 2024 05:25:21 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Thu, 01 Feb 2024 05:25:21 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783120; x=1706869520; bh=rLG/PQ52GT PDY3nbMdNRAcnoU6z9+a2r/rMNHTfxskM=; b=eMwvvJpkvr437wAlPsx/IIOyRW ZekYcH9btI+tM0M4G2c6F4oiaKA/LaYsFFXTw8ux5rw5J2xEGZOxmtc7n2yNwmWv A51Pqmnx+bwBNHP96S8yWz7cTx14WuLzKp9ScFp7xzPJ4omgzla48RT1VDTA0kFN AtqDTmQlZTE2t4FpeZYTs/kWRombCwQzjCxUCwgta5war4gJY6dvBxEfOB6rGffK KV9PY24T0XdJUKHWTki8q18jIEiDfQvfZFnQ410FlyX2GRqGbVFa17VghFafvF52 gRWY5gavpZ+90EF2TVt0tDGrHHkNdrr2ZUbpZtnELKUmG9KfZCUNRtjtQV2Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783120; x=1706869520; bh=rLG/PQ52GTPDY3nbMdNRAcnoU6z9 +a2r/rMNHTfxskM=; b=c4RPy9DrfYHk/iwM0A+1WBZhPy5fEFFfx3ztWZ3EjXJb 8EyPhvuzxg5iI8eC02F0Yi4+Y2SedRF78zESaAZQqe4rNoJE3c9WdYC3dcma484s Vof7KlGE6RLrITtyLWE6MrJJ0Fy9aH4gCrm5DsBD8pCQwY9/U1xE2PgzXakOsSqZ Q78vDA4TYZw3k3QOhvOTO878CcNvM0Prf3NUwDuEy+ssoKn0z5PeJcEerkDgZR0m A+B1SNYr/D/eqTG2MdpW7VW7+/JE4UYbTaX7QA6UOsxJxU9oASr3XphMZPLiowRf draaGfIeZylfY6+Y8duKdWjoflLsAS1PHzcuoNTk1Q== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudegucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtjeenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhephefgjeeuveejteduhefgffefffdvjeefje eivdekfffgkeeugfehveetueefleeknecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:25:20 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id a4752ee0 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:21:57 +0000 (UTC) Date: Thu, 1 Feb 2024 11:25:18 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 6/7] reftable/record: don't try to reallocate ref record name Message-ID: <41dff8731c308c6d72ebd0066be8963bda725ea2.1706782841.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When decoding reftable ref records we first release the pointer to the record passed to us and then use realloc(3P) to allocate the refname array. This is a bit misleading though as we know at that point that the refname will always be `NULL`, so we would always end up allocating a new char array anyway. Refactor the code to use `REFTABLE_ALLOC_ARRAY()` instead. As the following benchmark demonstrates this is a tiny bit more efficient. But the bigger selling point really is the gained clarity. Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 150.1 ms ± 4.1 ms [User: 146.6 ms, System: 3.3 ms] Range (min … max): 144.5 ms … 180.5 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 148.9 ms ± 4.5 ms [User: 145.2 ms, System: 3.4 ms] Range (min … max): 143.0 ms … 185.4 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.01 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Ideally, we should try and reuse the memory of the old record instead of first freeing and then immediately reallocating it. This requires some more surgery though and is thus left for a future iteration. Signed-off-by: Patrick Steinhardt --- reftable/record.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/reftable/record.c b/reftable/record.c index f1b6a5eac9..6465a7b8f4 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -377,10 +377,11 @@ static int reftable_ref_record_decode(void *rec, struct strbuf key, assert(hash_size > 0); - r->refname = reftable_realloc(r->refname, key.len + 1); + r->refname = reftable_malloc(key.len + 1); memcpy(r->refname, key.buf, key.len); - r->update_index = update_index; r->refname[key.len] = 0; + + r->update_index = update_index; r->value_type = val_type; switch (val_type) { case REFTABLE_REF_VAL1: From patchwork Thu Feb 1 10:25:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540892 Received: from wout5-smtp.messagingengine.com (wout5-smtp.messagingengine.com [64.147.123.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CDBA01586E6 for ; Thu, 1 Feb 2024 10:25:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783128; cv=none; b=gZwBL3gXk7leClxq7g72yxVouL61kjnX2RDCcU0xpM7DbB7hvXOk9kRDgOI+Di6b6/3U3k9mbbIsCPisA+lqxF7sGXQkO357XFFaIwYiUUhHaUbpSQ28WIfx30SHsszwe7yUtqdG3784V4K0MWuo8QdBwKY4sa2ydZtLZci0XWc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783128; c=relaxed/simple; bh=7V6A1twzfvQCyEKkX3cj7OvUNn0Kj0MyZjVGcC0L3/E=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Tea11465HjLOgvYHxOR+x3hjHxEAzAy0Y9Fa7dXAFi03YkSqtm+Y7NpHohDB+Geew8kZPkqDguzfMv8NBoXQhDsAERmmUoFbK/hXUcxiVrZZiyPOuMsjWh7w/YZKWb6XeTPZIdRWFuN1J6sDpb1C5zJqvQcDWk7qZFi0RgtPG/A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=A8dufSI9; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=pyVezRWN; arc=none smtp.client-ip=64.147.123.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="A8dufSI9"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="pyVezRWN" Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.west.internal (Postfix) with ESMTP id 073013200B76 for ; Thu, 1 Feb 2024 05:25:24 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute1.internal (MEProxy); Thu, 01 Feb 2024 05:25:25 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783124; x=1706869524; bh=Vp5hTAHxFp J4LH+lE0B3pWwZeTpdMqvzxwvdZOkkwPc=; b=A8dufSI9R+eAQQtVy9RSLoHL/d BxDsOFGC1cWpbRRHteupQt6pP+U+W3s9gvTsJxhpTETmTKhV6/wk8L2g/Rp3N0Mf T+GThywBlCt4Yc4615NiX8oIVHxJOwolVlBJ4hXO+EGIdf2AyAmA26BL9M2JMOes +6BgJWQ/UuT5nRn8w0W0MJJt94tu2kikUV1J6CUQDrwLeGsV0/EJmL1ezRq2t//0 5myKK+y36YvQY+MCmVSaHLAnVZ+AZRUS1JHSrnBpbcXdDQIUCqAJgqopMsH7YDpc eLGZf4BeiQAyWiWdxwQx+ADaFAnbvHgbkNnQw15+MU5+d6sN1GXtMhYskEzQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783124; x=1706869524; bh=Vp5hTAHxFpJ4LH+lE0B3pWwZeTpd MqvzxwvdZOkkwPc=; b=pyVezRWNt4cU5UH8lavdZE3VzL+7M26rDe8TE59GB3Ff Jl0CyKyxM8lXvzQSTbMJlXbu9QFNozKbLhMYLumBYSJqKcH9BHwivMOYqYt6FNeL y0iJ94m7n/pEqS3qBZUVN/YMkWMeRYp6CqIDsnyuibmyiqtIWEGCOp+k6zTWhK9U MsDKJVA9xRDEoLEsDbgPMzPCVBjwT5a7ISLF+BVZ/LYpvIG+UXCVGrpPjChUzrk4 nFNalHLfQjoFtn3YHQFOgOW9qzHA8lyIQNkotQmKMhp9xdYdpFhonVFt2lf5n3vW cI1n/hWlSH73CZ87GR+MEa1e75CnkLAfpcsUa7Ya/Q== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:25:23 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id af856a86 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:22:01 +0000 (UTC) Date: Thu, 1 Feb 2024 11:25:22 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 7/7] reftable/reader: add comments to `table_iter_next()` Message-ID: <2f1f1dd95e1cc360bde3547bd18e227a9c326e13.1706782841.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: While working on the optimizations in the preceding patches I stumbled upon `table_iter_next()` multiple times. It is quite easy to miss the fact that we don't call `table_iter_next_in_block()` twice, but that the second call is in fact `table_iter_next_block()`. Add comments to explain what exactly is going on here to make things more obvious. While at it, touch up the code to conform to our code style better. Signed-off-by: Patrick Steinhardt --- reftable/reader.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/reftable/reader.c b/reftable/reader.c index 64dc366fb1..add7d57f0b 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -357,24 +357,32 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) while (1) { struct table_iter next = TABLE_ITER_INIT; - int err = 0; - if (ti->is_finished) { + int err; + + if (ti->is_finished) return 1; - } + /* + * Check whether the current block still has more records. If + * so, return it. If the iterator returns positive then the + * current block has been exhausted. + */ err = table_iter_next_in_block(ti, rec); - if (err <= 0) { + if (err <= 0) return err; - } + /* + * Otherwise, we need to continue to the next block in the + * table and retry. If there are no more blocks then the + * iterator is drained. + */ err = table_iter_next_block(&next, ti); - if (err != 0) { - ti->is_finished = 1; - } table_iter_block_done(ti); - if (err != 0) { + if (err) { + ti->is_finished = 1; return err; } + table_iter_copy_from(ti, &next); block_iter_close(&next.bi); }