From patchwork Mon Feb 12 08:32:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13552782 Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C0CBD101C2 for ; Mon, 12 Feb 2024 08:32:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=66.111.4.28 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726752; cv=none; b=dES2ZrvPBHoFrgeju3CUF2AJQPpCAHqIMNtwuW3T3U4iiJf5Tx4JLdTsuXCDhuOVQhzFUthndB03T2yCn+dmtefKJow9Ga5s/+CTW91quc6ap63t3NY8Was5ab60KtQq6/1QXvVy124dJD4ellgiaTKtogdBukveeSP7QnSkE3g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726752; c=relaxed/simple; bh=wZGXdH52xM95KgSbzJKgABV3bjmkO3A24LPFqC7IqsE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=K7LPQmY9f5cXYEVKDVqDi/3HisAHS9h/GpC2uxp2ZZZ0du1BPw22ohnRmZeOqQQqDmEglOEGBsi4AfYh5RQEJHK5abIZTpMKclewRZcQZzktwZ72GD/WlYL51qr7WwmlAZUvtKkAwYkPPKMVXnnxicLu1BlMKzQR7Et7N1Yqfic= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=gkmjpJcz; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=LE23OfMo; arc=none smtp.client-ip=66.111.4.28 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="gkmjpJcz"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="LE23OfMo" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 9548D5C0090; Mon, 12 Feb 2024 03:32:28 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Mon, 12 Feb 2024 03:32:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1707726748; x=1707813148; bh=z6a+ohGe6f 291QgPQTdPn9UzOYrWCmNWHa2agK/dR6U=; b=gkmjpJczA99EnAFgHa9l8EBGkz 6SFZmPxRZ4ZGtYFTa/+qVTsjar+xTRnc2As8ujqWk2tLpXeHH2+fiOeGiAPuorNO OG95b3K4SlZe/Xvpb5kUrjXvP7y2ygzreT0BZhek55S5fdQt0Yf//E/BxwsJ9wlD sLoVahZ1QZ0JOW/xA0yAINEJuInGBpvuG3lr6ELWpcpgCUni5PWH4OevLNyEgCJj tauWAff7rE4A+KoHTImGUTft+7rVh5AwPVu96KCyE+WjpoH5w7D1T9I4+BIDk3NR FKN/QsAOIySmxsnh+9iLA3T38JuwsDArntabgHvJ5wWignPboFwYhFF+Ognw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1707726748; x=1707813148; bh=z6a+ohGe6f291QgPQTdPn9UzOYrW CmNWHa2agK/dR6U=; b=LE23OfMoFUCbS0fsDoGhahyFXCjELbtTnUzXGnYffjkS QQry1jC4cyn/tAgpADsSEyIu+B9dis0oCDMmf5Hndxv3RlypvqZo1eT8RLSWtXqF N9BeNXbOaQCOc0soIBWEIP9fjp8TQGtKLOJkeSm8VltIMxOjrCCf8WWuToAmz/z2 f2P4Uuv3xbN9boc85/5mCx4BRiZx3cad7bzeZP9eIDVCFL5pv6rxXwvUw4oqKAu8 +gEXwIkmpGBY+DxJKirk0LU51NZ8RqnDX2CMooMlm15byM2LFuu0Q6wx3QTopit8 z90bbJPI2f3DLK7FzvjA8yrupEQtboZnim+y07nvkA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddvgdduvdefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 12 Feb 2024 03:32:27 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 95ce38b1 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 12 Feb 2024 08:28:41 +0000 (UTC) Date: Mon, 12 Feb 2024 09:32:25 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Eric Sunshine , John Cai Subject: [PATCH v2 1/7] reftable/record: introduce function to compare records by key Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In some places we need to sort reftable records by their keys to determine their ordering. This is done by first formatting the keys into a `struct strbuf` and then using `strbuf_cmp()` to compare them. This logic is needlessly roundabout and can end up costing quite a bit of CPU cycles, both due to the allocation and formatting logic. Introduce a new `reftable_record_cmp()` function that knows how to compare two records with each other without requiring allocations. Signed-off-by: Patrick Steinhardt --- reftable/record.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++- reftable/record.h | 7 ++++++ 2 files changed, 68 insertions(+), 1 deletion(-) diff --git a/reftable/record.c b/reftable/record.c index 5c3fbb7b2a..f1b6a5eac9 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -430,7 +430,6 @@ static int reftable_ref_record_is_deletion_void(const void *p) (const struct reftable_ref_record *)p); } - static int reftable_ref_record_equal_void(const void *a, const void *b, int hash_size) { @@ -439,6 +438,13 @@ static int reftable_ref_record_equal_void(const void *a, return reftable_ref_record_equal(ra, rb, hash_size); } +static int reftable_ref_record_cmp_void(const void *_a, const void *_b) +{ + const struct reftable_ref_record *a = _a; + const struct reftable_ref_record *b = _b; + return strcmp(a->refname, b->refname); +} + static void reftable_ref_record_print_void(const void *rec, int hash_size) { @@ -455,6 +461,7 @@ static struct reftable_record_vtable reftable_ref_record_vtable = { .release = &reftable_ref_record_release_void, .is_deletion = &reftable_ref_record_is_deletion_void, .equal = &reftable_ref_record_equal_void, + .cmp = &reftable_ref_record_cmp_void, .print = &reftable_ref_record_print_void, }; @@ -625,6 +632,25 @@ static int reftable_obj_record_equal_void(const void *a, const void *b, int hash return 1; } +static int reftable_obj_record_cmp_void(const void *_a, const void *_b) +{ + const struct reftable_obj_record *a = _a; + const struct reftable_obj_record *b = _b; + int cmp; + + cmp = memcmp(a->hash_prefix, b->hash_prefix, + a->hash_prefix_len > b->hash_prefix_len ? + a->hash_prefix_len : b->hash_prefix_len); + if (cmp) + return cmp; + + /* + * When the prefix is the same then the object record that is longer is + * considered to be bigger. + */ + return a->hash_prefix_len - b->hash_prefix_len; +} + static struct reftable_record_vtable reftable_obj_record_vtable = { .key = &reftable_obj_record_key, .type = BLOCK_TYPE_OBJ, @@ -635,6 +661,7 @@ static struct reftable_record_vtable reftable_obj_record_vtable = { .release = &reftable_obj_record_release, .is_deletion = ¬_a_deletion, .equal = &reftable_obj_record_equal_void, + .cmp = &reftable_obj_record_cmp_void, .print = &reftable_obj_record_print, }; @@ -953,6 +980,22 @@ static int reftable_log_record_equal_void(const void *a, hash_size); } +static int reftable_log_record_cmp_void(const void *_a, const void *_b) +{ + const struct reftable_log_record *a = _a; + const struct reftable_log_record *b = _b; + int cmp = strcmp(a->refname, b->refname); + if (cmp) + return cmp; + + /* + * Note that the comparison here is reversed. This is because the + * update index is reversed when comparing keys. For reference, see how + * we handle this in reftable_log_record_key()`. + */ + return b->update_index - a->update_index; +} + int reftable_log_record_equal(const struct reftable_log_record *a, const struct reftable_log_record *b, int hash_size) { @@ -1002,6 +1045,7 @@ static struct reftable_record_vtable reftable_log_record_vtable = { .release = &reftable_log_record_release_void, .is_deletion = &reftable_log_record_is_deletion_void, .equal = &reftable_log_record_equal_void, + .cmp = &reftable_log_record_cmp_void, .print = &reftable_log_record_print_void, }; @@ -1077,6 +1121,13 @@ static int reftable_index_record_equal(const void *a, const void *b, int hash_si return ia->offset == ib->offset && !strbuf_cmp(&ia->last_key, &ib->last_key); } +static int reftable_index_record_cmp(const void *_a, const void *_b) +{ + const struct reftable_index_record *a = _a; + const struct reftable_index_record *b = _b; + return strbuf_cmp(&a->last_key, &b->last_key); +} + static void reftable_index_record_print(const void *rec, int hash_size) { const struct reftable_index_record *idx = rec; @@ -1094,6 +1145,7 @@ static struct reftable_record_vtable reftable_index_record_vtable = { .release = &reftable_index_record_release, .is_deletion = ¬_a_deletion, .equal = &reftable_index_record_equal, + .cmp = &reftable_index_record_cmp, .print = &reftable_index_record_print, }; @@ -1147,6 +1199,14 @@ int reftable_record_is_deletion(struct reftable_record *rec) reftable_record_data(rec)); } +int reftable_record_cmp(struct reftable_record *a, struct reftable_record *b) +{ + if (a->type != b->type) + BUG("cannot compare reftable records of different type"); + return reftable_record_vtable(a)->cmp( + reftable_record_data(a), reftable_record_data(b)); +} + int reftable_record_equal(struct reftable_record *a, struct reftable_record *b, int hash_size) { if (a->type != b->type) diff --git a/reftable/record.h b/reftable/record.h index fd80cd451d..0d96fbfd1b 100644 --- a/reftable/record.h +++ b/reftable/record.h @@ -62,6 +62,12 @@ struct reftable_record_vtable { /* Are two records equal? This assumes they have the same type. Returns 0 for non-equal. */ int (*equal)(const void *a, const void *b, int hash_size); + /* + * Compare keys of two records with each other. The records must have + * the same type. + */ + int (*cmp)(const void *a, const void *b); + /* Print on stdout, for debugging. */ void (*print)(const void *rec, int hash_size); }; @@ -114,6 +120,7 @@ struct reftable_record { }; /* see struct record_vtable */ +int reftable_record_cmp(struct reftable_record *a, struct reftable_record *b); int reftable_record_equal(struct reftable_record *a, struct reftable_record *b, int hash_size); void reftable_record_print(struct reftable_record *rec, int hash_size); void reftable_record_key(struct reftable_record *rec, struct strbuf *dest); From patchwork Mon Feb 12 08:32:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13552783 Received: from fhigh4-smtp.messagingengine.com (fhigh4-smtp.messagingengine.com [103.168.172.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 28F5610A05 for ; Mon, 12 Feb 2024 08:32:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726755; cv=none; b=dYNSALBqmltOa7rz8UIbm9gkmqxJltqb7B1En/+wUHCH56kRwQ6SZc1jsjmHog7A9Do9p+BO0BoWNtNvI46GEPZXUtdbwFziMwO3AuOp2vc5dw5O3X7N7a4H2LVlLZOBSLtSRb0pdk2mIIq1D9oU3C8fJfyZsZMlaXDZw6GqCXQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726755; c=relaxed/simple; bh=CJHSx1vBgsqj2R9gf0Wi7C+jSmP0h4B8Ii5Njbs9gNQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=japNEzwoXfjL02v/zWkgu+4CLhGWK0yrBvK0XCol47t8ND55U+QEYA29YG7TPu57Ny9wu9+9Xz0qPrGiKuJcq8DKxM3o1dUXL/CWp4+BRzu7Ppptmt5cPgkoNqDyqZ+nVjtWBFSx9JsRZc/pEoURKqFwJ+SRm6AYRntAGzT37kM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=SFaH0j0R; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=ihIFeZXC; arc=none smtp.client-ip=103.168.172.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="SFaH0j0R"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="ihIFeZXC" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfhigh.nyi.internal (Postfix) with ESMTP id D68E81140091; Mon, 12 Feb 2024 03:32:32 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Mon, 12 Feb 2024 03:32:32 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1707726752; x=1707813152; bh=66N+DOdzXN mWAMITmQ8SJRMgrkELdG2OYYK5RPVpKAI=; b=SFaH0j0R49Xez9YH5Bd5NC4/IP /MFJRbQEdv9aExCjxUO0/4Lujgpj5xr83ApzHEhwH8tBwTNZovgGJUlcAl+0QJYn uOd0WmixYT2C+cZP4zFaWFipI57P8RnFYP3TZPSLuvOa5tkCldTGHNj0racaILz6 4IO5QBqA95DlLC/kOp5zyigCIFQ7FdcwKUwru4hlS+kT58WU+JYGfOGn0iVhSTkz 0ELj5YeNuki44yQTsvGfxuqqRjyGqNP9R2iyyaykJX35yI7eJEbMLlBVB62lUzOF mSVG5hG/h7GsfuKQEZxGSVBQna2NB4Cod81+z0dQE9pwXNp/7/RGzv2Cprog== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1707726752; x=1707813152; bh=66N+DOdzXNmWAMITmQ8SJRMgrkEL dG2OYYK5RPVpKAI=; b=ihIFeZXC8mCvHIvPfVlWHi0kuHD/acSGJMVsw5VqD0C0 unYgVJ/d58L4LfkRs6su7HDyb/y8g8bsZ1HR6W5k9G3RMMAxyQP6EAg4rlkZmjNN zXbhXoqeOY/iYkzuNpn6MCjyheBfWMsePI90XRxK1F7CMey9V379NnJoBqniVDIL QhmRpDpH0ajq5JTJJZrhnHh7j680JJ2s7ETMpV0uG4dIei01UiiWt6BWmLBF4yRS s8f4viOQx2XzPkSUUZ2TTW4HyEemhzQ9CuZfV7ytO5EyKGSn6alHPdQuN2nsmMUE 6n1HL/INZ8z3ooyBYvc97qSFO+yTUixB88QOJjKd9g== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddvgdduvdefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtjeenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepteeuvefhhfdufedvgeeiueeileegtdfhgeeftdeuveejjedtgfejhedujeeutddu necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 12 Feb 2024 03:32:31 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 4e5c3c38 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 12 Feb 2024 08:28:45 +0000 (UTC) Date: Mon, 12 Feb 2024 09:32:29 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Eric Sunshine , John Cai Subject: [PATCH v2 2/7] reftable/merged: allocation-less dropping of shadowed records Message-ID: <2364a0fa3369063355bed966267a19fb9af3a6b6.1707726654.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The purpose of the merged reftable iterator is to iterate through all entries of a set of tables in the correct order. This is implemented by using a sub-iterator for each table, where the next entry of each of these iterators gets put into a priority queue. For each iteration, we do roughly the following steps: 1. Retrieve the top record of the priority queue. This is the entry we want to return to the caller. 2. Retrieve the next record of the sub-iterator that this record came from. If any, add it to the priority queue at the correct position. The position is determined by comparing the record keys, which e.g. corresponds to the refname for ref records. 3. Keep removing the top record of the priority queue until we hit the first entry whose key is larger than the returned record's key. This is required to drop "shadowed" records. The last step will lead to at least one comparison to the next entry, but may lead to many comparisons in case the reftable stack consists of many tables with shadowed records. It is thus part of the hot code path when iterating through records. The code to compare the entries with each other is quite inefficient though. Instead of comparing record keys with each other directly, we first format them into `struct strbuf`s and only then compare them with each other. While we already optimized this code path to reuse buffers in 829231dc20 (reftable/merged: reuse buffer to compute record keys, 2023-12-11), the cost to format the keys into the buffers still adds up quite significantly. Refactor the code to use `reftable_record_cmp()` instead, which has been introduced in the preceding commit. This function compares records with each other directly without requiring any memory allocations or copying and is thus way more efficient. The following benchmark uses git-show-ref(1) to print a single ref matching a pattern out of 1 million refs. This is the most direct way to exercise ref iteration speed as we remove all overhead of having to show the refs, too. Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 180.7 ms ± 4.7 ms [User: 177.1 ms, System: 3.4 ms] Range (min … max): 174.9 ms … 211.7 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 162.1 ms ± 4.4 ms [User: 158.5 ms, System: 3.4 ms] Range (min … max): 155.4 ms … 189.3 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.11 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/merged.c | 11 ++--------- reftable/merged.h | 2 -- 2 files changed, 2 insertions(+), 11 deletions(-) diff --git a/reftable/merged.c b/reftable/merged.c index c258ce953e..fb9978d798 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -51,8 +51,6 @@ static void merged_iter_close(void *p) reftable_iterator_destroy(&mi->stack[i]); } reftable_free(mi->stack); - strbuf_release(&mi->key); - strbuf_release(&mi->entry_key); } static int merged_iter_advance_nonnull_subiter(struct merged_iter *mi, @@ -105,14 +103,11 @@ static int merged_iter_next_entry(struct merged_iter *mi, such a deployment, the loop below must be changed to collect all entries for the same key, and return new the newest one. */ - reftable_record_key(&entry.rec, &mi->entry_key); while (!merged_iter_pqueue_is_empty(mi->pq)) { struct pq_entry top = merged_iter_pqueue_top(mi->pq); - int cmp = 0; + int cmp; - reftable_record_key(&top.rec, &mi->key); - - cmp = strbuf_cmp(&mi->key, &mi->entry_key); + cmp = reftable_record_cmp(&top.rec, &entry.rec); if (cmp > 0) break; @@ -246,8 +241,6 @@ static int merged_table_seek_record(struct reftable_merged_table *mt, .typ = reftable_record_type(rec), .hash_id = mt->hash_id, .suppress_deletions = mt->suppress_deletions, - .key = STRBUF_INIT, - .entry_key = STRBUF_INIT, }; int n = 0; int err = 0; diff --git a/reftable/merged.h b/reftable/merged.h index d5b39dfe7f..7d9f95d27e 100644 --- a/reftable/merged.h +++ b/reftable/merged.h @@ -31,8 +31,6 @@ struct merged_iter { uint8_t typ; int suppress_deletions; struct merged_iter_pqueue pq; - struct strbuf key; - struct strbuf entry_key; }; void merged_table_release(struct reftable_merged_table *mt); From patchwork Mon Feb 12 08:32:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13552784 Received: from fhigh4-smtp.messagingengine.com (fhigh4-smtp.messagingengine.com [103.168.172.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5E50D10A05 for ; Mon, 12 Feb 2024 08:32:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726763; cv=none; b=qzNtRHWmweVLHf41k78s2kWTbsSRLSPZZnKZjQpdo2K7Cl1n3Lo3J38u+vrCRcYNYyJYf4Kg5panMBeZyOX6HpN2uBlhSvRnCfijCOv0HYWOb6WQJqtz3X73brD00QJPBAoBrTPAW6z0/9p1dTCTnPQ7afMFRAksyfUPbJh+cLc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726763; c=relaxed/simple; bh=g59/EbwwpiXrVK1/mtUySWh8H6xpWpH2KPzit1t7YIQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=umyMRU8x+CkZzDKzplqE9pBEr6Y4NCl1/835VKKKfHZXL1SmOb0vqJ4Snk4CM6Z2a8avkTuJHWDP3E1UgCyRbYh8VrDp9S8yfWTVwRJkXzvFPhFZEs4kOTIe8Mqkx6WkDVutOftnojc7R0nCcmpucvlONy5XeBYljMdVa/iTJns= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=ZraYAelz; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=UKeZzRNC; arc=none smtp.client-ip=103.168.172.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="ZraYAelz"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="UKeZzRNC" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 4D67B1140099; Mon, 12 Feb 2024 03:32:40 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Mon, 12 Feb 2024 03:32:40 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1707726760; x=1707813160; bh=VJzeEQIQHi q40lFoyRDy3hyPnHDudasBZ7yDj8vwh6I=; b=ZraYAelzkXgB7wDBFwx4c/tMTA xe2EBmdatjkPlIVps3xG9sOl2DJseYhdmTXdvjwHGgQGacRL8KBXM0Z5EUt/BeEu AdPmNB0oPNDsXV6Qzs12CKSo9UoFDMG+3sNMIoJEqmEiOfSFFPsQ/ffy6QSmO/rK e0UjFMhW3zRua1qc6YZ5gkAZAoa4sX/8NvxNyBaNmGHdL6S1IjbU5AVNmbJJ7FTr hvwcasYuaQBmfnieDyW3WCx72BUYLoLkhdhqbhnhhlkbkO83reRv3ht8nMmtz8Be 4SYHAPRSj9CnwsIpSrbpFU6TSX14B1dorNbtjn9MSms+faz/EYQdokwnUTNw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1707726760; x=1707813160; bh=VJzeEQIQHiq40lFoyRDy3hyPnHDu dasBZ7yDj8vwh6I=; b=UKeZzRNC7LIIFcpOrumXQIUEqnO63SYvFR1cOMwZhClX dCj8JOm706yNAimF1ysNg4afW/Vcpj2RH9I+fwUZL1PXydSzIkPKMYrqIVDJB80u +Bo8oMqjJ+vMvqybZXxGmk9+vzQqXNpY3c0JxRuuKijNzNevmQgxX23Gpn3O3w6d V7kU9MlkK0b0ZmeZ10f2mT+7OVUyFjusgZc0bpDH3bWX+DnSraoC1JlzzisBqADq d4Ae9R7MU/oibDhPc352VmsOMpDBg8AbHz3VbplMp9Q05xdPoA5liE92LYX2d8HB 56n1eTdTYqKmQ/ZktFodqaQfvbjwGrr8g1JFg6Orjg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddvgdduvdefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtjeenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepteeuvefhhfdufedvgeeiueeileegtdfhgeeftdeuveejjedtgfejhedujeeutddu necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 12 Feb 2024 03:32:39 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id ee8dfd64 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 12 Feb 2024 08:28:52 +0000 (UTC) Date: Mon, 12 Feb 2024 09:32:37 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Eric Sunshine , John Cai Subject: [PATCH v2 3/7] reftable/merged: skip comparison for records of the same subiter Message-ID: <205e6b488be9165d08e0fecb6197ebfc6747ba1c.1707726654.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When retrieving the next entry of a merged iterator we need to drop all records of other sub-iterators that would be shadowed by the record that we are about to return. We do this by comparing record keys, dropping all keys that are smaller or equal to the key of the record we are about to return. There is an edge case here where we can skip that comparison: when the record in the priority queue comes from the same subiterator as the record we are about to return then we know that its key must be larger than the key of the record we are about to return. This property is guaranteed by the sub-iterators, and if it didn't hold then the whole merged iterator would return records in the wrong order, too. While this may seem like a very specific edge case it's in fact quite likely to happen. For most repositories out there you can assume that we will end up with one large table and several smaller ones on top of it. Thus, it is very likely that the next entry will sort towards the top of the priority queue. Special case this and break out of the loop in that case. The following benchmark uses git-show-ref(1) to print a single ref matching a pattern out of 1 million refs: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 162.6 ms ± 4.5 ms [User: 159.0 ms, System: 3.5 ms] Range (min … max): 156.6 ms … 188.5 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 156.8 ms ± 4.7 ms [User: 153.0 ms, System: 3.6 ms] Range (min … max): 151.4 ms … 188.4 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.04 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/merged.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/reftable/merged.c b/reftable/merged.c index fb9978d798..0f74a14a77 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -107,6 +107,14 @@ static int merged_iter_next_entry(struct merged_iter *mi, struct pq_entry top = merged_iter_pqueue_top(mi->pq); int cmp; + /* + * When the next entry comes from the same queue as the current + * entry then it must by definition be larger. This avoids a + * comparison in the most common case. + */ + if (top.index == entry.index) + break; + cmp = reftable_record_cmp(&top.rec, &entry.rec); if (cmp > 0) break; From patchwork Mon Feb 12 08:32:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13552785 Received: from fhigh4-smtp.messagingengine.com (fhigh4-smtp.messagingengine.com [103.168.172.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 928E2111B1 for ; Mon, 12 Feb 2024 08:32:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726769; cv=none; b=I9/MHFvD0hcTg7+BNNtKQP+LjN3VEPP18DW6MnfNsKkXmBPv2Wr/QKRIggDKGjG8pRHL6A9d32omtH3zU7Z+MaS+vtTjaHqC55Amf1L6zqBcnnYGz9CzpjMDBF7dMj8BFkhnPmyVmUDGjhYjWFnywCHRvWwWnT3TdMKIEJsxyTA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726769; c=relaxed/simple; bh=tASX5euincqdLAY5iI+k5eMNBO1zjQlDosjl8g0jCjY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=p93//XLhZqavtAxP2tt7Tm8FijfuDCWC1iK3YMf/9Buo2e61cBVFaCxFZuB70ddE86qWzP4TPA+AkkY/UeGvz0UCVtzsliW03pVARHsU7J7BqteWeboo23A67Lop//ML27Ge3/RMAJaCOOM4ZFNaxh8/rgzXcDmwF4JV2YmKb8Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=hGO4WXqy; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=xU2Byf1t; arc=none smtp.client-ip=103.168.172.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="hGO4WXqy"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="xU2Byf1t" Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 5414C1140091; Mon, 12 Feb 2024 03:32:46 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute7.internal (MEProxy); Mon, 12 Feb 2024 03:32:46 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1707726766; x=1707813166; bh=PjMeQvFcvJ zfk8C+QLgS/LrXjOP+clCzIg2y/N2t4eg=; b=hGO4WXqy2YxMhDfqeHHmbvD+I9 KSHtrS3OeFigjxu7873FgxP4f/LFxvTJ1icctTd0mZrL1xnIGozNl5Fl73gIbqD0 kKAI73109QGPlJn1ZroH1/R8k10Vb0dD+D53pvRtiHSoN4gdlfDEh+YRECC95Zsu vFQ4rPkp1W4wW+Xu7/CELCnEStnZ1FcowjCYmvB3PPWk8/Enhw67BPgzFWSXCEK3 NdAamAD2yrN7g6wT2GEeMBVzW+TymgpcRAUa2QiQ44Z1oTGdsALG+Kq/ARB47+I1 O09ai4CkPNaKYDXjMWFh4IYDwEax5CaPsCzMLylAq+mAYiS23Cf5ktXnNNEQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1707726766; x=1707813166; bh=PjMeQvFcvJzfk8C+QLgS/LrXjOP+ clCzIg2y/N2t4eg=; b=xU2Byf1tVcnlvZtf/1CkmIbc+El7vLKr37ZbE8poK+Ly a66tDVlWLnRryWZZAcaaI1aXGkx66ECst/yqm8V2ku37zrtdRJbZzsLQb8gDhZ55 MpW3gVbK4xu0AkB6dFQ9U2JltbkXkeGoPK/zts0w0/ArTVYlgibggQvwNkQOHpwR mBrBUxvmdcEOQfzrTt7UXWkHWeU3QyKOHnwWGKNO22BygKSQ7coDAUnzIEVy35nC ZRD6B8MXJG8QeCZjqVxN6XYtARBOXly5KwGWNIAT+IZm3i80lBFMMiWPiKl28L55 UvHaVCzUbZ0YiXuPHy2WEgsaYdTbxSETo4Rizbon9w== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddvgdduvddvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtjeenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepteeuvefhhfdufedvgeeiueeileegtdfhgeeftdeuveejjedtgfejhedujeeutddu necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 12 Feb 2024 03:32:45 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id e3c3293f (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 12 Feb 2024 08:28:59 +0000 (UTC) Date: Mon, 12 Feb 2024 09:32:43 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Eric Sunshine , John Cai Subject: [PATCH v2 4/7] reftable/pq: allocation-less comparison of entry keys Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The priority queue is used by the merged iterator to iterate over reftable records from multiple tables in the correct order. The queue ends up having one record for each table that is being iterated over, with the record that is supposed to be shown next at the top. For example, the key of a ref record is equal to its name so that we end up sorting the priority queue lexicographically by ref name. To figure out the order we need to compare the reftable record keys with each other. This comparison is done by formatting them into a `struct strbuf` and then doing `strbuf_strcmp()` on the result. We then discard the buffers immediately after the comparison. This ends up being very expensive. Because the priority queue usually contains as many records as we have tables, we call the comparison function `O(log($tablecount))` many times for every record we insert. Furthermore, when iterating over many refs, we will insert at least one record for every ref we are iterating over. So ultimately, this ends up being called `O($refcount * log($tablecount))` many times. Refactor the code to use the new `refatble_record_cmp()` function that has been implemented in a preceding commit. This function does not need to allocate memory and is thus significantly more efficient. The following benchmark prints a single ref matching a specific pattern out of 1 million refs via git-show-ref(1), where the reftable stack consists of three tables: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 224.4 ms ± 6.5 ms [User: 220.6 ms, System: 3.6 ms] Range (min … max): 216.5 ms … 261.1 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 172.9 ms ± 4.4 ms [User: 169.2 ms, System: 3.6 ms] Range (min … max): 166.5 ms … 204.6 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.30 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/pq.c | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/reftable/pq.c b/reftable/pq.c index dcefeb793a..7220efc39a 100644 --- a/reftable/pq.c +++ b/reftable/pq.c @@ -14,20 +14,9 @@ license that can be found in the LICENSE file or at int pq_less(struct pq_entry *a, struct pq_entry *b) { - struct strbuf ak = STRBUF_INIT; - struct strbuf bk = STRBUF_INIT; - int cmp = 0; - reftable_record_key(&a->rec, &ak); - reftable_record_key(&b->rec, &bk); - - cmp = strbuf_cmp(&ak, &bk); - - strbuf_release(&ak); - strbuf_release(&bk); - + int cmp = reftable_record_cmp(&a->rec, &b->rec); if (cmp == 0) return a->index > b->index; - return cmp < 0; } From patchwork Mon Feb 12 08:32:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13552786 Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 560A711CBE for ; Mon, 12 Feb 2024 08:32:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=66.111.4.28 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726773; cv=none; b=T0+fuBI3E7a4EgHP97Ny28bzJY1+XJgfTQXnr6qZNc586LpRBvjc3X4RmgWuQQAIjdlzQEeATFehxqfn8BriaUBn08fBUH0ihjz2oZ1vPhUWltZDuI8+l3CT3ipKq4v8dGHDu1GlZ5yGKJD7PTsHr0+Crgb8qE2mY83vRfcB324= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726773; c=relaxed/simple; bh=4HKk/LFukbUDGcCm7PpiZwJdzEamaMMK3u2W2WrdCio=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Zplg5SnYI4Knkbs5K/tRheP1WN4SL9UJPEIJS7xLjwriDJgd7uK3AuTVu0TTzOh6CDo9RCAXt8tUxt1XVNuA1TAb42anqFVYVe8wtbJIyWom4HPsTRBJPlOKvm6j9HIOmIq7/rrK2yyU5xjJmJoO77Wnn17pdXUjnH3yovURa9s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=G47eDYA6; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=q/9bbEpB; arc=none smtp.client-ip=66.111.4.28 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="G47eDYA6"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="q/9bbEpB" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 5A0F75C0091; Mon, 12 Feb 2024 03:32:51 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Mon, 12 Feb 2024 03:32:51 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1707726771; x=1707813171; bh=Xlms8ZWkWU HMuSbm6JfOSjwsihVm52vqi7v4sacYMyg=; b=G47eDYA6jqTz/RAlKzqhxRXov6 98r61JSOzmdwvxpRJ7LyoZchKerWAOV8c8kbiQRfe8QsGG/LYXw97VtvOsBewvtD K+zvlQbK8+akuqaOO8C4J85vOsDxuSFQ0K3GkMVcnHakLq9HWG9JS4ar0zgIAXue zC1eqW08VrEmmtrcEZlTO/eaZduETBfy1ASiSFBvg6wyJyezcAik8IkQWLP8rr9b cVQgAf46cKV3fB98bdfarfrbNg8kJvTD14heHDj3Ngq8mCfkC2rdpK08bJJ2nrSz Uz+90jOEELBn1E4e3HpgGL/XDcEA3BhRDPlLUdtrouyj6DgEPavpgtJ0ZJAw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1707726771; x=1707813171; bh=Xlms8ZWkWUHMuSbm6JfOSjwsihVm 52vqi7v4sacYMyg=; b=q/9bbEpBEkSNmIlpr82kL8QiKQHVJzAWvdGU7GNlAXp6 YvA49Lswm7QkpRJQJlC6DTC75rHI8ZCNUTqfK5sAttPf93uecbFlsTtFG12ule+M Vf5AH9f6i+ED/rsJshd5ivaOnPOLoIjG/ip4asORyzCGOlEXkPU1F/f5Z1yDJEbO VMsC+QDudakDP3BSmikCsvsBE4Sy0ebQS69oszrG2bphUWCKLBj6MDQ8kM4f8gmC qNQ2WhtqQuDLpcBiAJCqU1Dk4BvefAYAwALV1mcABqE6nnY9lPDwjAbc/PX39KCE 6/86OI41bbpiQ2ikOrLjHZmIpgHGwlI968GYNM+MbA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddvgdduvdefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtjeenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepteeuvefhhfdufedvgeeiueeileegtdfhgeeftdeuveejjedtgfejhedujeeutddu necuvehluhhsthgvrhfuihiivgepudenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 12 Feb 2024 03:32:50 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 54eea43e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 12 Feb 2024 08:29:04 +0000 (UTC) Date: Mon, 12 Feb 2024 09:32:48 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Eric Sunshine , John Cai Subject: [PATCH v2 5/7] reftable/block: swap buffers instead of copying Message-ID: <2317aa43b95fc5f418504e9cf24b01048c3dbf8f.1707726654.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When iterating towards the next record in a reftable block we need to keep track of the key that the last record had. This is required because reftable records use prefix compression, where subsequent records may reuse parts of their preceding record's key. This key is stored in the `block_iter::last_key`, which we update after every call to `block_iter_next()`: we simply reset the buffer and then add the current key to it. This is a bit inefficient though because it requires us to copy over the key on every iteration, which adds up when iterating over many records. Instead, we can make use of the fact that the `block_iter::key` buffer is basically only a scratch buffer. So instead of copying over contents, we can just swap both buffers. The following benchmark prints a single ref matching a specific pattern out of 1 million refs via git-show-ref(1): Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 155.7 ms ± 5.0 ms [User: 152.1 ms, System: 3.4 ms] Range (min … max): 150.8 ms … 185.7 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 150.8 ms ± 4.2 ms [User: 147.1 ms, System: 3.5 ms] Range (min … max): 145.1 ms … 180.7 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.03 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/block.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 1df3d8a0f0..44381ea6a3 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -342,8 +342,7 @@ int block_iter_next(struct block_iter *it, struct reftable_record *rec) return -1; string_view_consume(&in, n); - strbuf_reset(&it->last_key); - strbuf_addbuf(&it->last_key, &it->key); + strbuf_swap(&it->last_key, &it->key); it->next_off += start.len - in.len; return 0; } From patchwork Mon Feb 12 08:32:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13552787 Received: from out4-smtp.messagingengine.com (out4-smtp.messagingengine.com [66.111.4.28]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5C69112B81 for ; Mon, 12 Feb 2024 08:32:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=66.111.4.28 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726778; cv=none; b=JfGxeJ6aydjRw9JS5gGR/NFxyDniX4BlCY0eKSce9ibHqLhf+BKxkZbCvXdNgKpm6yLnbp8DB45gMB+jaqSZpgmPoqbYIlRLkGnXBgRWK68zZPs4vugZQ2m438Hs2cGU8CkDXXO4fKM8v/sYZQNLC4zVo3KwZB4TN+OZexOg/yw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726778; c=relaxed/simple; bh=aiQn5198zMQXbcbtBwh+KZT12MqHW3jKcIQ7U7tHsfI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=f457xdPuSS4LuSNCk/pGzeMh/U0gHHwIBCT8aQAuwm1S77hm/s7zfWax9SxEQWOyDTdZHlBA1YuxgczTJd8JS0yNbYZy9VL+bxvfXsTUnmoiM8t6IxYZcypWsgGRQMBs+pgwaXiWbkJjOMYYy0uj0jAklYAtNN8T+8EEs9RaA2c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=gZ3syMPi; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=dnwv1y3l; arc=none smtp.client-ip=66.111.4.28 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="gZ3syMPi"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="dnwv1y3l" Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 8A0085C0091; Mon, 12 Feb 2024 03:32:55 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Mon, 12 Feb 2024 03:32:55 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1707726775; x=1707813175; bh=0nPAyquOxv 0qb5AJNxVdLotcWcQPSDmvSVu9DN36bV0=; b=gZ3syMPiQmORvajvfHKXbAOFAW R6lC2D+c0gA1gUz9rx1IsE/uMdHu37U7doazAycH3vWR/yEnaGsWWxuhpqM36n4Y NyTCcrZjFnQXdL79IYcgnyWGQmp9ja+13oU7zUpC9JBiNyNdOEb/w54AmI3lJFaR SZKXIu9OJXSZEbwuD9dZ6T+F5ia9f5IjjwdJjhHvsNUzW4BNHb7CGESoydEUCw56 FxxZK3tPTXgRpLjmHkUWCDj9hqKRrJPYMQRbALzMGFj+4hImK1s0w6LIcf5SMGof wnZRndMup1mu4cQHtSr++FOrSMAdhziuE5TT5Mnujd1D5UN5febiXsJs0tUw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1707726775; x=1707813175; bh=0nPAyquOxv0qb5AJNxVdLotcWcQP SDmvSVu9DN36bV0=; b=dnwv1y3llAqC2aQJucOqHQNMKpnKDtBElfbDbsm4Y6Vk OGHgKxHLUy00CiYNldgcjwoN+lKEwiz6LX8AKHeLDFzfYFCqATTOtB9hnt9eyJ0V gFTYXsoMxbgFXVju6lucSWneG698FUyHL+pszCrrG2lMKi+4NtjFkYa0fBdUqBK2 wY5A9y85rZ64lxIjljmzwC0/eKwm45UK8BKBmINaz3aDC7xVcNuQcUwzJlu1FWSk m/RqRwc2qJZC5rOknGThS0BwPRi4ZGrpAq7Av/gRNSbMHmatxrVaxYz230Xo9LLl CqE15uGUHV4kaxk49tywTnVU48GI+qO0zjZswjUVWA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddvgdduvdefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtjeenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepteeuvefhhfdufedvgeeiueeileegtdfhgeeftdeuveejjedtgfejhedujeeutddu necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 12 Feb 2024 03:32:54 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id cd974518 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 12 Feb 2024 08:29:08 +0000 (UTC) Date: Mon, 12 Feb 2024 09:32:53 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Eric Sunshine , John Cai Subject: [PATCH v2 6/7] reftable/record: don't try to reallocate ref record name Message-ID: <8c674915044c47c46c8e4efcb4d914fdd26e5c55.1707726654.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When decoding reftable ref records we first release the pointer to the record passed to us and then use realloc(3P) to allocate the refname array. This is a bit misleading though as we know at that point that the refname will always be `NULL`, so we would always end up allocating a new char array anyway. Refactor the code to use `REFTABLE_ALLOC_ARRAY()` instead. As the following benchmark demonstrates this is a tiny bit more efficient. But the bigger selling point really is the gained clarity. Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 150.1 ms ± 4.1 ms [User: 146.6 ms, System: 3.3 ms] Range (min … max): 144.5 ms … 180.5 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 148.9 ms ± 4.5 ms [User: 145.2 ms, System: 3.4 ms] Range (min … max): 143.0 ms … 185.4 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.01 ± 0.04 times faster than show-ref: single matching ref (revision = HEAD~) Ideally, we should try and reuse the memory of the old record instead of first freeing and then immediately reallocating it. This requires some more surgery though and is thus left for a future iteration. Signed-off-by: Patrick Steinhardt --- reftable/record.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/reftable/record.c b/reftable/record.c index f1b6a5eac9..6465a7b8f4 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -377,10 +377,11 @@ static int reftable_ref_record_decode(void *rec, struct strbuf key, assert(hash_size > 0); - r->refname = reftable_realloc(r->refname, key.len + 1); + r->refname = reftable_malloc(key.len + 1); memcpy(r->refname, key.buf, key.len); - r->update_index = update_index; r->refname[key.len] = 0; + + r->update_index = update_index; r->value_type = val_type; switch (val_type) { case REFTABLE_REF_VAL1: From patchwork Mon Feb 12 08:32:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13552788 Received: from fhigh4-smtp.messagingengine.com (fhigh4-smtp.messagingengine.com [103.168.172.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8BDD5171BF for ; Mon, 12 Feb 2024 08:33:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726782; cv=none; b=hQl7KiiUnLB3XHaqNLejRKLnMcIfBWwqi3mrBxQJvGcpczgMQgcqbaLgvAquY8EuwQQpmv7U1RQIPXT1D2y2iPJrZIbcXLgKSnC+Aprq8jww6HufNC8iYiioTcMoCLnygl2zdVLb7buEoQwHyRZEapxH84+V2qnzysiamO0mH5o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707726782; c=relaxed/simple; bh=2EgDi3phXmNUlLsMxAsU37jyBnHUTs9ZsrwDQNBTdA4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=dwt4KsIImvoL7XovYqoA+jF3RzJ24wKxeYZE8PQvsjuk3gtQ/gGFaaS78gLxvgnTnJcpveRoSWvejvgZ8A8U6vYadYgL5NnjYVo53hYNA7eXdlFW0QBm52dZgCvIZFkdACNpr9X78IlXeomIemQUNyTMPHw2RGG6lSipRDv12SM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=asAoQR0t; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=IjDqDvJz; arc=none smtp.client-ip=103.168.172.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="asAoQR0t"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="IjDqDvJz" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 9853E1140091; Mon, 12 Feb 2024 03:32:59 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Mon, 12 Feb 2024 03:32:59 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1707726779; x=1707813179; bh=P0O5cnO4RQ q7yqB8Y9NpgICI1dXNHsUFquFVZRWwRvA=; b=asAoQR0tftCphib2tzQgaaI0wV JzY5A2O4c59kZ7TwGX+JJD2hmlfE2FoqvOD/wJuFuY4lydp3DN0puOPEynIXgkQy m1HifiT9ogUtWxvRT3TScmUKQrn+V5AB6j3hzt/zCkE+bmflTm1eEtj7clmcsMAx d7kSrZVUtioL8WYJStFQLe1btyay0WVg/J/QQVxG7k0zDJah/AJx6IxqnDyUwY5x S4uZfCP4BKDr31GmV81bGuRmXlmT3bxndgGgR4sA9f1lfQKEwsvxU31VqlhNSLOr H2+vBMhAHo8PKaPYB4JUV446+0WSDrXQJ5F3PHBP1q2Xcr51ZSdg9FiogmEg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1707726779; x=1707813179; bh=P0O5cnO4RQq7yqB8Y9NpgICI1dXN HsUFquFVZRWwRvA=; b=IjDqDvJz1MbDxGMtPYCUZ0TdROD1/hgdGe3Neh5BIdlR qqy3imXNkWFPWg9TCL1aCpXPLOqIkTTd+sQYSuG/59OQ6bkM3R88tMr4QXc6e3aV JepLBx1SQiRA4uP5cFrQPlKGVA2jtJ1QrfA3du8Xin0WYP4IhV+7t08bEoG/Aq53 umMcSrlSvzVsTRUlkACt7l2Y1xLNuJSG/ykdUjbnHBDDHC692OMudr9lAS8BYsyF dFWoDdde/EYw6M2pKuHUnVUH6wWe4j+DTOFDW8qXjwNk4jrLWt1pFN+i2ssPf3b5 2sKWTarpKxw8EWVYYXvx1mOB7XXdQp9eb9+fQhGWlw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddvgdduvdefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepvdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 12 Feb 2024 03:32:58 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id f02954d4 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 12 Feb 2024 08:29:13 +0000 (UTC) Date: Mon, 12 Feb 2024 09:32:57 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Eric Sunshine , John Cai Subject: [PATCH v2 7/7] reftable/reader: add comments to `table_iter_next()` Message-ID: <167f67fad841ad06535a5532088fa6c9125fb1cd.1707726654.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: While working on the optimizations in the preceding patches I stumbled upon `table_iter_next()` multiple times. It is quite easy to miss the fact that we don't call `table_iter_next_in_block()` twice, but that the second call is in fact `table_iter_next_block()`. Add comments to explain what exactly is going on here to make things more obvious. While at it, touch up the code to conform to our code style better. Note that one of the refactorings merges two conditional blocks into one. Before, we had the following code: ``` err = table_iter_next_block(&next, ti if (err != 0) { ti->is_finished = 1; } table_iter_block_done(ti); if (err != 0) { return err; } ``` As `table_iter_block_done()` does not care about `is_finished`, the conditional blocks can be merged into one block: ``` err = table_iter_next_block(&next, ti table_iter_block_done(ti); if (err != 0) { ti->is_finished = 1; return err; } ``` This is both easier to reason about and more performant because we have one branch less. Signed-off-by: Patrick Steinhardt --- reftable/reader.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) diff --git a/reftable/reader.c b/reftable/reader.c index 64dc366fb1..add7d57f0b 100644 --- a/reftable/reader.c +++ b/reftable/reader.c @@ -357,24 +357,32 @@ static int table_iter_next(struct table_iter *ti, struct reftable_record *rec) while (1) { struct table_iter next = TABLE_ITER_INIT; - int err = 0; - if (ti->is_finished) { + int err; + + if (ti->is_finished) return 1; - } + /* + * Check whether the current block still has more records. If + * so, return it. If the iterator returns positive then the + * current block has been exhausted. + */ err = table_iter_next_in_block(ti, rec); - if (err <= 0) { + if (err <= 0) return err; - } + /* + * Otherwise, we need to continue to the next block in the + * table and retry. If there are no more blocks then the + * iterator is drained. + */ err = table_iter_next_block(&next, ti); - if (err != 0) { - ti->is_finished = 1; - } table_iter_block_done(ti); - if (err != 0) { + if (err) { + ti->is_finished = 1; return err; } + table_iter_copy_from(ti, &next); block_iter_close(&next.bi); }