From patchwork Wed Jan 3 06:22:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509678 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A062F1799A for ; Wed, 3 Jan 2024 06:22:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="gbunOJ+g"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="NmtO6noT" Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailout.nyi.internal (Postfix) with ESMTP id 88D895C01DC; Wed, 3 Jan 2024 01:22:15 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Wed, 03 Jan 2024 01:22:15 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262935; x=1704349335; bh=Y4tzJ45r/+ RjhrXXIK89PFK2Y+AZPkz7gE0HDvEkL1Q=; b=gbunOJ+gaNrC8a48t8NInxfwhv jQWytPcMXB/0oAoxFCr27CgdEcifG6FiglJR7+RoUoSCEfFKAb1dWeR1NBgjDBJ5 2KTjeCSVuBgkNygT+46dHP+wS3y+AoUtfrAq4vj54J8oinvTuAW45M3XYsAPuVyo i2Qq6/yM2M5NC/NLWGjk7Ju4U/WemizhcmphnuMjEg68MGRrvTudBS7CjJ8YhZk/ 8b7VkxRNiuD1DMgg86cpTaPQ6rIC6ZXueK4XSgMqa4Ypt+qY9VZUqMT3lN1mgNiz SAz++jiYguHXr3FvDm9MYwrw2YmsZpuTyrCFmWOCW8k2AeZiQlCB6+LTgyQg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262935; x=1704349335; bh=Y4tzJ45r/+RjhrXXIK89PFK2Y+AZ Pkz7gE0HDvEkL1Q=; b=NmtO6noTJ9K1gwhh7T04IN9ImWSn1VY2bFk+kZ6sFQNq Zv7ILPEra+0QoiPt7S/13LTDrsyTf7O3Szg6FSoFX85sRjANlsIBLoAOldIoryYQ v6dFN2K6AsmNgB97C4rek6J8JIWOOX0Xyuk5Prl31roao53NtSQ5kHatmD1h5idL gYnehsw3WCseRjHqlFld6T3H+4xyrivhEnbxQmqGEcOUS7/ARxheQc03RQi2u0Tt BgG+8Opb0LOBALXzIUybfZdoxm4n/AwVcRx+8EaeXA4pg5AVJKoDGHtAR3CrOt6o DvxbANRS7K+1AzS4I1b5Lqftc/GM630XGlDyzKiHDw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:14 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 7bf19f5e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:19:49 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:13 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 1/8] reftable/stack: do not overwrite errors when compacting Message-ID: <1dc8ddf04a112c38f41d573a48dac3f99b4b51e9.1704262787.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In order to compact multiple stacks we iterate through the merged ref and log records. When there is any error either when reading the records from the old merged table or when writing the records to the new table then we break out of the respective loops. When breaking out of the loop for the ref records though the error code will be overwritten, which may cause us to inadvertently skip over bad ref records. In the worst case, this can lead to a compacted stack that is missing records. Fix the code by using `goto done` instead so that any potential error codes are properly returned to the caller. Signed-off-by: Patrick Steinhardt --- reftable/stack.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/reftable/stack.c b/reftable/stack.c index 16bab82063..8729508dc3 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -801,18 +801,16 @@ static int stack_write_compact(struct reftable_stack *st, err = 0; break; } - if (err < 0) { - break; - } + if (err < 0) + goto done; if (first == 0 && reftable_ref_record_is_deletion(&ref)) { continue; } err = reftable_writer_add_ref(wr, &ref); - if (err < 0) { - break; - } + if (err < 0) + goto done; entries++; } reftable_iterator_destroy(&it); @@ -827,9 +825,8 @@ static int stack_write_compact(struct reftable_stack *st, err = 0; break; } - if (err < 0) { - break; - } + if (err < 0) + goto done; if (first == 0 && reftable_log_record_is_deletion(&log)) { continue; } @@ -845,9 +842,8 @@ static int stack_write_compact(struct reftable_stack *st, } err = reftable_writer_add_log(wr, &log); - if (err < 0) { - break; - } + if (err < 0) + goto done; entries++; } From patchwork Wed Jan 3 06:22:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509679 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B012C18053 for ; Wed, 3 Jan 2024 06:22:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="V7g+B+xF"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="ToUNbGvU" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id B78165C019A; Wed, 3 Jan 2024 01:22:19 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Wed, 03 Jan 2024 01:22:19 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262939; x=1704349339; bh=Fxf4ccFifo Mq7CopDOK6rkoC+1syMSYWoybI9T2OtC4=; b=V7g+B+xFuu5ronayLDUZ/W6V9E mTYMNd0oyR1W6q+tbm8OahMopx3VMNqa3K4i8CiJ2Rxa2juv1BS+OrNSo8snyL5Y aLdrDiCAQKK9J2nDGZJG8C/RKXJsb71ID/MauYgLIQJvH1/3fn3JsjRSPgEOalAN V+sYYIuOJwtEOme3bbyxWuz9+VTsw5RMqaFJa10oZG5BeT10ay+89WhN/c7hi+IC rzebiA9LgWeJiRfoV9U7IVdRDsDEGc5ZnsqlIs5tIgyL+YTQRKZF2McuhvVMGQyr sNygw5BOKbCymFWUgpqxoemADx3dUXOBTLZR+ROJ7ddOYN2IqkVsGx2phBsQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262939; x=1704349339; bh=Fxf4ccFifoMq7CopDOK6rkoC+1sy MSYWoybI9T2OtC4=; b=ToUNbGvUC9csq2XGb3fkIjQ4qY//TA/WEaIKRSR0naIk PBn9KSR2RaLihLGruRNCByx9lI7RmDJfmWQiYwqo+FTQ7QAj+zNm8p1WNzfO5Usk kt2ko4kQHSas2LYluIVKlj5Xosu8rhiz/v5RkXJe4erExSYgeDM78ll9noi3Ske1 PabGa6cmvFCbBQClaBRX5u8uvfbfqbV6W9oJ8xASENaiYGGrkW1gnMIFGUDe72YP 0NTY5AuObqRugCa/aY3Hg7psQhXQS2vk8sCVQcZashO+iiFmN/8PeBrepm+vDMUV ZNcrBX2+P+hS0WYivKesYhAa5Wf0TSmv4ePCq1/aWw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedunecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:18 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id a9f7b756 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:19:53 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:17 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 2/8] reftable/stack: do not auto-compact twice in `reftable_stack_add()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In 5c086453ff (reftable/stack: perform auto-compaction with transactional interface, 2023-12-11), we fixed a bug where the transactional interface to add changes to a reftable stack did not perform auto-compaction by calling `reftable_stack_auto_compact()` in `reftable_stack_addition_commit()`. While correct, this change may now cause us to perform auto-compaction twice in the non-transactional interface `reftable_stack_add()`: - It performs auto-compaction by itself. - It now transitively performs auto-compaction via the transactional interface. Remove the first instance so that we only end up doing auto-compaction once. Reported-by: Han-Wen Nienhuys Signed-off-by: Patrick Steinhardt --- reftable/stack.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/reftable/stack.c b/reftable/stack.c index 8729508dc3..7ffeb3ee10 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -425,9 +425,6 @@ int reftable_stack_add(struct reftable_stack *st, return err; } - if (!st->disable_auto_compact) - return reftable_stack_auto_compact(st); - return 0; } From patchwork Wed Jan 3 06:22:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509680 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CC686182AB for ; Wed, 3 Jan 2024 06:22:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="wsGXED8c"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="nyfcYIe1" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.nyi.internal (Postfix) with ESMTP id E13515C019A; Wed, 3 Jan 2024 01:22:23 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Wed, 03 Jan 2024 01:22:23 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262943; x=1704349343; bh=mYs4vHIgO/ 0zSHx3I0dlaP7G/uteq0iLRfRgKMzBe1g=; b=wsGXED8ck82yxrngS/Du9u2TZ/ ORYQTvoPeUh+rDBO23TgXXtj4R5Wpkidduih2WxRmWcilShulTKak06HYiGQcOVs 7KaqJRECP+1ajwqDfO2d9s8aIXxhhmb3SKhMhmu0ScWgLME0qQ6iJXgHV1GbjTf8 Nxu1fWGUWxmIdCx057XKvn1MhgUUCdaiVStfljUvbHMnKh+7LQcC/f7rHWifuSeY 8I/HLivUeSxoXIFxiOCYFqIJ4qboZ284YEnwBtNoRZGkkwk0qXFUTqj1OiIeuR0H BDiCmr08brkZ858aCAmFDj7O2BnIqjcqzn78P6SufcFc5O4WHzvedqY5Y3fw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262943; x=1704349343; bh=mYs4vHIgO/0zSHx3I0dlaP7G/ute q0iLRfRgKMzBe1g=; b=nyfcYIe1wFAvrteSrluBTO0M0EfBvsd49q92WgHMVQ0l NIqvZTrFC6YI43r8ibYoAIcDLYnuBNsf95ZUZ5gd0h8yz+qiiRbH5A16t5K6pleY Di0L2/8QlYul7cUM31UigabPYWNKStHqHC7wYkY4+hmIhMeXx8KGmMVeLk7Tjy8O mTC2sl6fou5c6Yp2KLM5IRrp4SENjvUbfTuLuqiqEu0Hx4LU2uhp++GH2czT1HbR ue+eqhKjnURIXmZ8ymL8QuzFdXBjBi3FJYo+jvODAhMn2CsQfVPKYCna4WavIXmR e/qnHbSVOiiYxcpl9b74kl61PvxVIrP0ooRqpQ88yw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:22 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 6fbe14de (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:19:57 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:21 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 3/8] reftable/writer: fix index corruption when writing multiple indices Message-ID: <15e12b8f2904cc879498c344945f3f6380c0aa86.1704262787.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Each reftable may contain multiple types of blocks for refs, objects and reflog records, where each of these may have an index that makes it more efficient to find the records. It was observed that the index for log records can become corrupted under certain circumstances, where the first entry of the index points into the object index instead of to the log records. As it turns out, this corruption can occur whenever we write a log index as well as at least one additional index. Writing records and their index is basically a two-step process: 1. We write all blocks for the corresponding record. Each block that gets written is added to a list of blocks to index. 2. Once all blocks were written we finish the section. If at least two blocks have been added to the list of blocks to index then we will now write the index for those blocks and flush it, as well. When we have a very large number of blocks then we may decide to write a multi-level index, which is why we also keep track of the list of the index blocks in the same way as we previously kept track of the blocks to index. Now when we have finished writing all index blocks we clear the index and flush the last block to disk. This is done in the wrong order though because flushing the block to disk will re-add it to the list of blocks to be indexed. The result is that the next section we are about to write will have an entry in the list of blocks to index that points to the last block of the preceding section's index, which will corrupt the log index. Fix this corruption by clearing the index after having written the last block. Signed-off-by: Patrick Steinhardt --- reftable/readwrite_test.c | 80 +++++++++++++++++++++++++++++++++++++++ reftable/writer.c | 4 +- 2 files changed, 82 insertions(+), 2 deletions(-) diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c index 278663f22d..9c16e0504e 100644 --- a/reftable/readwrite_test.c +++ b/reftable/readwrite_test.c @@ -798,6 +798,85 @@ static void test_write_key_order(void) strbuf_release(&buf); } +static void test_write_multiple_indices(void) +{ + struct reftable_write_options opts = { + .block_size = 100, + }; + struct strbuf writer_buf = STRBUF_INIT, buf = STRBUF_INIT; + struct reftable_block_source source = { 0 }; + struct reftable_iterator it = { 0 }; + const struct reftable_stats *stats; + struct reftable_writer *writer; + struct reftable_reader *reader; + int err, i; + + writer = reftable_new_writer(&strbuf_add_void, &writer_buf, &opts); + reftable_writer_set_limits(writer, 1, 1); + for (i = 0; i < 100; i++) { + unsigned char hash[GIT_SHA1_RAWSZ] = {i}; + struct reftable_ref_record ref = { + .update_index = 1, + .value_type = REFTABLE_REF_VAL1, + .value.val1 = hash, + }; + + strbuf_reset(&buf); + strbuf_addf(&buf, "refs/heads/%04d", i); + ref.refname = buf.buf, + + err = reftable_writer_add_ref(writer, &ref); + EXPECT_ERR(err); + } + + for (i = 0; i < 100; i++) { + unsigned char hash[GIT_SHA1_RAWSZ] = {i}; + struct reftable_log_record log = { + .update_index = 1, + .value_type = REFTABLE_LOG_UPDATE, + .value.update = { + .old_hash = hash, + .new_hash = hash, + }, + }; + + strbuf_reset(&buf); + strbuf_addf(&buf, "refs/heads/%04d", i); + log.refname = buf.buf, + + err = reftable_writer_add_log(writer, &log); + EXPECT_ERR(err); + } + + reftable_writer_close(writer); + + /* + * The written data should be sufficiently large to result in indices + * for each of the block types. + */ + stats = reftable_writer_stats(writer); + EXPECT(stats->ref_stats.index_offset > 0); + EXPECT(stats->obj_stats.index_offset > 0); + EXPECT(stats->log_stats.index_offset > 0); + + block_source_from_strbuf(&source, &writer_buf); + err = reftable_new_reader(&reader, &source, "filename"); + EXPECT_ERR(err); + + /* + * Seeking the log uses the log index now. In case there is any + * confusion regarding indices we would notice here. + */ + err = reftable_reader_seek_log(reader, &it, ""); + EXPECT_ERR(err); + + reftable_iterator_destroy(&it); + reftable_writer_free(writer); + reftable_reader_free(reader); + strbuf_release(&writer_buf); + strbuf_release(&buf); +} + static void test_corrupt_table_empty(void) { struct strbuf buf = STRBUF_INIT; @@ -847,5 +926,6 @@ int readwrite_test_main(int argc, const char *argv[]) RUN_TEST(test_log_overflow); RUN_TEST(test_write_object_id_length); RUN_TEST(test_write_object_id_min_length); + RUN_TEST(test_write_multiple_indices); return 0; } diff --git a/reftable/writer.c b/reftable/writer.c index 2e322a5683..ee4590e20f 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -432,12 +432,12 @@ static int writer_finish_section(struct reftable_writer *w) reftable_free(idx); } - writer_clear_index(w); - err = writer_flush_block(w); if (err < 0) return err; + writer_clear_index(w); + bstats = writer_reftable_block_stats(w, typ); bstats->index_blocks = w->stats.idx_stats.blocks - before_blocks; bstats->index_offset = index_start; From patchwork Wed Jan 3 06:22:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509681 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 256D8182C6 for ; Wed, 3 Jan 2024 06:22:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="UplzA90S"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="1lL2BaFD" Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailout.nyi.internal (Postfix) with ESMTP id 1F9ED5C01E3; Wed, 3 Jan 2024 01:22:29 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute7.internal (MEProxy); Wed, 03 Jan 2024 01:22:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262949; x=1704349349; bh=tUcaSsK13l ZbfrDLNbP66sBapQF9wLtTHjI2MWtsVxM=; b=UplzA90SkZMvH84QxndYigAydo qua6j7m9i0uvzPml7vzlIq+K+hw80gWlDPk7K48hIY5u02EN90FeToaBnR0qWAuh Q4+S0WoPRF60N05IFCKu5I4cDNFamrmPVxUYLv8MhP7e3ANZa3tDXPxFz2gqg5QN BzLLFkXG7Ip2NOU6UxDP7BdU/rKVHAa99/c7eIHQnYYsu2khea1mprB5w2PuOdAd c4tQxNlkLJiIhOsnGN4FjsrbzBjxUV6qVKs/Mpgr7J7B3kiFWF10Xg7Mmg90sBXA PcRp9n+2/zeBOTp4O+11TYMx1wSy4KHhbIt/njQFAQVk+vuSvRmVQH71gZfA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262949; x=1704349349; bh=tUcaSsK13lZbfrDLNbP66sBapQF9 wLtTHjI2MWtsVxM=; b=1lL2BaFDJpF8VRbJX871M53x0I66QPFltuReYAQG50R/ VgNtIAUdxXpTiXXumI44LykYeF0Iua4vNXfMP+4znRioa5aTzduR3LapAiVdX8Pe r3/xag0Au9NOAnbimWNBe42hLGhd38QKvRMoTgCBBCDVwQylvqlRp/A6EGGsG03P bAcmP72CEHHiLqr5WCqDVM8zD0DjfmHGrtiwYFZjqx5hI6qtu/jYNlB4hA3VAQuL 1ZgsAWfj7deaYS8MC50boqyCayZBDO2+5e7aj3OxRIHLgjFJoiZeRK6Q2cLXBhXQ DRMEh6XRqK8+E+uBh9y/ubznU29p4kDkZcOsvDl3Eg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:28 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id faf4a90b (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:20:02 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:26 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 4/8] reftable/record: constify some parts of the interface Message-ID: <017e8943c7b0b735d0252ad368f52443601da716.1704262787.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We're about to convert reftable records to stop storing their object IDs as allocated hashes. Prepare for this refactoring by constifying some parts of the interface that will be impacted by this. Signed-off-by: Patrick Steinhardt --- reftable/record.c | 8 ++++---- reftable/reftable-record.h | 4 ++-- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/reftable/record.c b/reftable/record.c index fbaa1fbef5..5e258c734b 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -76,7 +76,7 @@ int reftable_is_block_type(uint8_t typ) return 0; } -uint8_t *reftable_ref_record_val1(const struct reftable_ref_record *rec) +const unsigned char *reftable_ref_record_val1(const struct reftable_ref_record *rec) { switch (rec->value_type) { case REFTABLE_REF_VAL1: @@ -88,7 +88,7 @@ uint8_t *reftable_ref_record_val1(const struct reftable_ref_record *rec) } } -uint8_t *reftable_ref_record_val2(const struct reftable_ref_record *rec) +const unsigned char *reftable_ref_record_val2(const struct reftable_ref_record *rec) { switch (rec->value_type) { case REFTABLE_REF_VAL2: @@ -242,7 +242,7 @@ static char hexdigit(int c) return 'a' + (c - 10); } -static void hex_format(char *dest, uint8_t *src, int hash_size) +static void hex_format(char *dest, const unsigned char *src, int hash_size) { assert(hash_size > 0); if (src) { @@ -1164,7 +1164,7 @@ int reftable_record_equal(struct reftable_record *a, struct reftable_record *b, reftable_record_data(a), reftable_record_data(b), hash_size); } -static int hash_equal(uint8_t *a, uint8_t *b, int hash_size) +static int hash_equal(const unsigned char *a, const unsigned char *b, int hash_size) { if (a && b) return !memcmp(a, b, hash_size); diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h index 67104f8fbf..f7eb2d6015 100644 --- a/reftable/reftable-record.h +++ b/reftable/reftable-record.h @@ -49,11 +49,11 @@ struct reftable_ref_record { /* Returns the first hash, or NULL if `rec` is not of type * REFTABLE_REF_VAL1 or REFTABLE_REF_VAL2. */ -uint8_t *reftable_ref_record_val1(const struct reftable_ref_record *rec); +const unsigned char *reftable_ref_record_val1(const struct reftable_ref_record *rec); /* Returns the second hash, or NULL if `rec` is not of type * REFTABLE_REF_VAL2. */ -uint8_t *reftable_ref_record_val2(const struct reftable_ref_record *rec); +const unsigned char *reftable_ref_record_val2(const struct reftable_ref_record *rec); /* returns whether 'ref' represents a deletion */ int reftable_ref_record_is_deletion(const struct reftable_ref_record *ref); From patchwork Wed Jan 3 06:22:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509682 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76A1318623 for ; Wed, 3 Jan 2024 06:22:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="G5dnpLtO"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="EkOx1Gqp" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 4F5285C01E8; Wed, 3 Jan 2024 01:22:33 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Wed, 03 Jan 2024 01:22:33 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262953; x=1704349353; bh=GUCInya8GL 9eLoSa0aP3KhiheroH7P//W+4lLi/5Hug=; b=G5dnpLtOGK63kvDG9sMf5Dpifg rkk/ZmyUWp2IspckUMT54pNyt7vXMGUFR3qCWIk89yy+ebhsMzC+HGJrpDqaw6vJ o65bmRWjiUa2pwu2m+7lj8I2iyYkEFE+CuPr/n11gIkuVugLzW3Y2aM+UkQoL+6w G09waHns4SjyJqlibp87acroatHwIKmkz9cmKDB/7OSBpZYey9Jc0tHMUnSdBwDt tv97UV5o+Ho839T//0DBy4n8O+OnPGJpYR2e3irIgvkOgoDIxEpgcQSf3K2TrpC3 gkFESdpAhmLl8y2KqurvKCmj3cVcNcjYwoHYs9EClD7vLnPXidlNIVdd+SqQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262953; x=1704349353; bh=GUCInya8GL9eLoSa0aP3KhiheroH 7P//W+4lLi/5Hug=; b=EkOx1GqpDe1PyzIVHY9gs6E96afdpKQGkWfXq6YPCIDm wNU51Wi3Ft8nPBU1z4u+1hoo9SJwEZWY779WCthNmijZRpqCJXsPH16Nal+F79d+ 6INfzwSkNdeM3ISYzjLLoHvkf9YmU+wcMZP+3CTx7dfGw42U6xH+/nbLZXEtXhow 5RQx+/QmjAx8lc04PTlb5hWUhRojldpAtjxgkm81DxZSJ71XlbqY3zzqqkKnEHJA k15XMc9+3ZwJkaydWdXnv7evj7Ia6GXFHvcACNrsYkAE9GU4pjU6XKbLQTvjwUV4 b8UuCCSmpI4BMlc69pwPfW1nC4X6o8PBkxGJRRT94g== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedvnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:32 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 6f8fbf91 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:20:06 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:30 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 5/8] reftable/record: store "val1" hashes as static arrays Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When reading ref records of type "val1", we store its object ID in an allocated array. This results in an additional allocation for every single ref record we read, which is rather inefficient especially when iterating over refs. Refactor the code to instead use an embedded array of `GIT_MAX_RAWSZ` bytes. While this means that `struct ref_record` is bigger now, we typically do not store all refs in an array anyway and instead only handle a limited number of records at the same point in time. Using `git show-ref --quiet` in a repository with ~350k refs this leads to a significant drop in allocations. Before: HEAP SUMMARY: in use at exit: 21,098 bytes in 192 blocks total heap usage: 2,116,683 allocs, 2,116,491 frees, 76,098,060 bytes allocated After: HEAP SUMMARY: in use at exit: 21,098 bytes in 192 blocks total heap usage: 1,419,031 allocs, 1,418,839 frees, 62,145,036 bytes allocated Signed-off-by: Patrick Steinhardt --- reftable/block_test.c | 4 +--- reftable/merged_test.c | 16 ++++++---------- reftable/readwrite_test.c | 14 ++++---------- reftable/record.c | 3 --- reftable/record_test.c | 1 - reftable/reftable-record.h | 3 ++- reftable/stack_test.c | 2 -- 7 files changed, 13 insertions(+), 30 deletions(-) diff --git a/reftable/block_test.c b/reftable/block_test.c index c00bbc8aed..dedb05c7d8 100644 --- a/reftable/block_test.c +++ b/reftable/block_test.c @@ -49,13 +49,11 @@ static void test_block_read_write(void) for (i = 0; i < N; i++) { char name[100]; - uint8_t hash[GIT_SHA1_RAWSZ]; snprintf(name, sizeof(name), "branch%02d", i); - memset(hash, i, sizeof(hash)); rec.u.ref.refname = name; rec.u.ref.value_type = REFTABLE_REF_VAL1; - rec.u.ref.value.val1 = hash; + memset(rec.u.ref.value.val1, i, GIT_SHA1_RAWSZ); names[i] = xstrdup(name); n = block_writer_add(&bw, &rec); diff --git a/reftable/merged_test.c b/reftable/merged_test.c index d08c16abef..b3927a5d73 100644 --- a/reftable/merged_test.c +++ b/reftable/merged_test.c @@ -123,13 +123,11 @@ static void readers_destroy(struct reftable_reader **readers, size_t n) static void test_merged_between(void) { - uint8_t hash1[GIT_SHA1_RAWSZ] = { 1, 2, 3, 0 }; - struct reftable_ref_record r1[] = { { .refname = "b", .update_index = 1, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash1, + .value.val1 = { 1, 2, 3, 0 }, } }; struct reftable_ref_record r2[] = { { .refname = "a", @@ -165,26 +163,24 @@ static void test_merged_between(void) static void test_merged(void) { - uint8_t hash1[GIT_SHA1_RAWSZ] = { 1 }; - uint8_t hash2[GIT_SHA1_RAWSZ] = { 2 }; struct reftable_ref_record r1[] = { { .refname = "a", .update_index = 1, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash1, + .value.val1 = { 1 }, }, { .refname = "b", .update_index = 1, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash1, + .value.val1 = { 1 }, }, { .refname = "c", .update_index = 1, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash1, + .value.val1 = { 1 }, } }; struct reftable_ref_record r2[] = { { @@ -197,13 +193,13 @@ static void test_merged(void) .refname = "c", .update_index = 3, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash2, + .value.val1 = { 2 }, }, { .refname = "d", .update_index = 3, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash1, + .value.val1 = { 1 }, }, }; diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c index 9c16e0504e..87b238105c 100644 --- a/reftable/readwrite_test.c +++ b/reftable/readwrite_test.c @@ -60,18 +60,15 @@ static void write_table(char ***names, struct strbuf *buf, int N, *names = reftable_calloc(sizeof(char *) * (N + 1)); reftable_writer_set_limits(w, update_index, update_index); for (i = 0; i < N; i++) { - uint8_t hash[GIT_SHA256_RAWSZ] = { 0 }; char name[100]; int n; - set_test_hash(hash, i); - snprintf(name, sizeof(name), "refs/heads/branch%02d", i); ref.refname = name; ref.update_index = update_index; ref.value_type = REFTABLE_REF_VAL1; - ref.value.val1 = hash; + set_test_hash(ref.value.val1, i); (*names)[i] = xstrdup(name); n = reftable_writer_add_ref(w, &ref); @@ -675,11 +672,10 @@ static void test_write_object_id_min_length(void) struct strbuf buf = STRBUF_INIT; struct reftable_writer *w = reftable_new_writer(&strbuf_add_void, &buf, &opts); - uint8_t hash[GIT_SHA1_RAWSZ] = {42}; struct reftable_ref_record ref = { .update_index = 1, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash, + .value.val1 = {42}, }; int err; int i; @@ -711,11 +707,10 @@ static void test_write_object_id_length(void) struct strbuf buf = STRBUF_INIT; struct reftable_writer *w = reftable_new_writer(&strbuf_add_void, &buf, &opts); - uint8_t hash[GIT_SHA1_RAWSZ] = {42}; struct reftable_ref_record ref = { .update_index = 1, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash, + .value.val1 = {42}, }; int err; int i; @@ -814,11 +809,10 @@ static void test_write_multiple_indices(void) writer = reftable_new_writer(&strbuf_add_void, &writer_buf, &opts); reftable_writer_set_limits(writer, 1, 1); for (i = 0; i < 100; i++) { - unsigned char hash[GIT_SHA1_RAWSZ] = {i}; struct reftable_ref_record ref = { .update_index = 1, .value_type = REFTABLE_REF_VAL1, - .value.val1 = hash, + .value.val1 = {i}, }; strbuf_reset(&buf); diff --git a/reftable/record.c b/reftable/record.c index 5e258c734b..a67a6b4d8a 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -219,7 +219,6 @@ static void reftable_ref_record_copy_from(void *rec, const void *src_rec, case REFTABLE_REF_DELETION: break; case REFTABLE_REF_VAL1: - ref->value.val1 = reftable_malloc(hash_size); memcpy(ref->value.val1, src->value.val1, hash_size); break; case REFTABLE_REF_VAL2: @@ -303,7 +302,6 @@ void reftable_ref_record_release(struct reftable_ref_record *ref) reftable_free(ref->value.val2.value); break; case REFTABLE_REF_VAL1: - reftable_free(ref->value.val1); break; case REFTABLE_REF_DELETION: break; @@ -394,7 +392,6 @@ static int reftable_ref_record_decode(void *rec, struct strbuf key, return -1; } - r->value.val1 = reftable_malloc(hash_size); memcpy(r->value.val1, in.buf, hash_size); string_view_consume(&in, hash_size); break; diff --git a/reftable/record_test.c b/reftable/record_test.c index 70ae78feca..5c94d26e35 100644 --- a/reftable/record_test.c +++ b/reftable/record_test.c @@ -119,7 +119,6 @@ static void test_reftable_ref_record_roundtrip(void) case REFTABLE_REF_DELETION: break; case REFTABLE_REF_VAL1: - in.u.ref.value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); set_hash(in.u.ref.value.val1, 1); break; case REFTABLE_REF_VAL2: diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h index f7eb2d6015..7f3a0df635 100644 --- a/reftable/reftable-record.h +++ b/reftable/reftable-record.h @@ -9,6 +9,7 @@ license that can be found in the LICENSE file or at #ifndef REFTABLE_RECORD_H #define REFTABLE_RECORD_H +#include "hash-ll.h" #include /* @@ -38,7 +39,7 @@ struct reftable_ref_record { #define REFTABLE_NR_REF_VALUETYPES 4 } value_type; union { - uint8_t *val1; /* malloced hash. */ + unsigned char val1[GIT_MAX_RAWSZ]; struct { uint8_t *value; /* first value, malloced hash */ uint8_t *target_value; /* second value, malloced hash */ diff --git a/reftable/stack_test.c b/reftable/stack_test.c index 14a3fc11ee..feab49d7f7 100644 --- a/reftable/stack_test.c +++ b/reftable/stack_test.c @@ -463,7 +463,6 @@ static void test_reftable_stack_add(void) refs[i].refname = xstrdup(buf); refs[i].update_index = i + 1; refs[i].value_type = REFTABLE_REF_VAL1; - refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); set_test_hash(refs[i].value.val1, i); logs[i].refname = xstrdup(buf); @@ -600,7 +599,6 @@ static void test_reftable_stack_tombstone(void) refs[i].update_index = i + 1; if (i % 2 == 0) { refs[i].value_type = REFTABLE_REF_VAL1; - refs[i].value.val1 = reftable_malloc(GIT_SHA1_RAWSZ); set_test_hash(refs[i].value.val1, i); } From patchwork Wed Jan 3 06:22:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509683 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5BCF918623 for ; Wed, 3 Jan 2024 06:22:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="UEZYtAw2"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="KigAHhbm" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id 728DB5C01E3; Wed, 3 Jan 2024 01:22:37 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Wed, 03 Jan 2024 01:22:37 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262957; x=1704349357; bh=LUEzIz2MvW GXS0GMwIs0PepWxY49DadEnt4HKy0NAbE=; b=UEZYtAw2b4580caghQSiG5ILHM 8ymDWh2vL7mgSyVmPS4jiOCTUWjVtePgOnrNPPt4pM+Kb8Er3DVDwrkxqB+14SBf Y1sEOL8bPTq6KA+wT6zRNZIaNXa3P9Lf6i/uUnb89VWScFbf/92XTyJsnIE9fule B104EmFkTufq5UmOQ89WTvJFunejvRBz9Ng/NOsTtunCbpFo1y0q/EkoktBgZOjW UAugTOXJOcwU09cRQWn5WmGFv1CdEEaCTlMoXMXleJViA7ftVStcaggPEB0TRSuZ ZJIXNStkvrDJ9al4hD3k+ZjXldV4LSToTYcegKSx5aSGgs5/aDXF3UT6/nig== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262957; x=1704349357; bh=LUEzIz2MvWGXS0GMwIs0PepWxY49 DadEnt4HKy0NAbE=; b=KigAHhbmGPxIgPkf396hM07WQZu++/xiqvEub2p2cSQU eXmFmiTV0w4orddm82T6z8aeot892QNctumN7jbLT73U2d/gRh47lVbbw9CmBwNV 6G/D3WQt9YAwmdmrdLG+XLdlBqNDGF2yyQOhzTL5ww4wA6EZuHrr7f1FlVfK0dpB ZpMmz1QAA78SwMVBt3aF6adcMVCFmDRaFdJZLTAM0r2xsfyidVL71TTPMWSPBZ1o u+TC7RmJGrOJ7G+8MY1SHLQZ1D88LVD/1KLYZcAsupaOXK1d9/GbutKn4iKicEi5 Rwmhlf+2xrSnMFct+vnQfFdtGSRzAX+xoxJj4Vb8pA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeehjeffudefkeejheekfefguedvtdehte fgiefgveelgefgheekhfdvjeefvdekudenucffohhmrghinhepvhgrlhdvrdhtrghrghgv thenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpsh esphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:36 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id e172148d (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:20:11 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:34 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 6/8] reftable/record: store "val2" hashes as static arrays Message-ID: <6ec02d6775c0f3e3274d57ef4a7dfeefac8308a9.1704262787.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Similar to the preceding commit, convert ref records of type "val2" to store their object IDs in static arrays instead of allocating them for every single record. We're using the same benchmark as in the preceding commit, with `git show-ref --quiet` in a repository with ~350k refs. This time around though the effects aren't this huge. Before: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 1,419,040 allocs, 1,418,847 frees, 62,153,868 bytes allocated After: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated This is because "val2"-type records are typically only stored for peeled tags, and the number of annotated tags in the benchmark repository is rather low. Still, it can be seen that this change leads to a reduction of allocations overall, even if only a small one. Signed-off-by: Patrick Steinhardt --- reftable/readwrite_test.c | 12 ++++-------- reftable/record.c | 6 ------ reftable/record_test.c | 4 ---- reftable/reftable-record.h | 4 ++-- 4 files changed, 6 insertions(+), 20 deletions(-) diff --git a/reftable/readwrite_test.c b/reftable/readwrite_test.c index 87b238105c..178766dfa8 100644 --- a/reftable/readwrite_test.c +++ b/reftable/readwrite_test.c @@ -547,8 +547,6 @@ static void test_table_refs_for(int indexed) uint8_t hash[GIT_SHA1_RAWSZ]; char fill[51] = { 0 }; char name[100]; - uint8_t hash1[GIT_SHA1_RAWSZ]; - uint8_t hash2[GIT_SHA1_RAWSZ]; struct reftable_ref_record ref = { NULL }; memset(hash, i, sizeof(hash)); @@ -558,11 +556,9 @@ static void test_table_refs_for(int indexed) name[40] = 0; ref.refname = name; - set_test_hash(hash1, i / 4); - set_test_hash(hash2, 3 + i / 4); ref.value_type = REFTABLE_REF_VAL2; - ref.value.val2.value = hash1; - ref.value.val2.target_value = hash2; + set_test_hash(ref.value.val2.value, i / 4); + set_test_hash(ref.value.val2.target_value, 3 + i / 4); /* 80 bytes / entry, so 3 entries per block. Yields 17 */ @@ -570,8 +566,8 @@ static void test_table_refs_for(int indexed) n = reftable_writer_add_ref(w, &ref); EXPECT(n == 0); - if (!memcmp(hash1, want_hash, GIT_SHA1_RAWSZ) || - !memcmp(hash2, want_hash, GIT_SHA1_RAWSZ)) { + if (!memcmp(ref.value.val2.value, want_hash, GIT_SHA1_RAWSZ) || + !memcmp(ref.value.val2.target_value, want_hash, GIT_SHA1_RAWSZ)) { want_names[want_names_len++] = xstrdup(name); } } diff --git a/reftable/record.c b/reftable/record.c index a67a6b4d8a..5c3fbb7b2a 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -222,9 +222,7 @@ static void reftable_ref_record_copy_from(void *rec, const void *src_rec, memcpy(ref->value.val1, src->value.val1, hash_size); break; case REFTABLE_REF_VAL2: - ref->value.val2.value = reftable_malloc(hash_size); memcpy(ref->value.val2.value, src->value.val2.value, hash_size); - ref->value.val2.target_value = reftable_malloc(hash_size); memcpy(ref->value.val2.target_value, src->value.val2.target_value, hash_size); break; @@ -298,8 +296,6 @@ void reftable_ref_record_release(struct reftable_ref_record *ref) reftable_free(ref->value.symref); break; case REFTABLE_REF_VAL2: - reftable_free(ref->value.val2.target_value); - reftable_free(ref->value.val2.value); break; case REFTABLE_REF_VAL1: break; @@ -401,11 +397,9 @@ static int reftable_ref_record_decode(void *rec, struct strbuf key, return -1; } - r->value.val2.value = reftable_malloc(hash_size); memcpy(r->value.val2.value, in.buf, hash_size); string_view_consume(&in, hash_size); - r->value.val2.target_value = reftable_malloc(hash_size); memcpy(r->value.val2.target_value, in.buf, hash_size); string_view_consume(&in, hash_size); break; diff --git a/reftable/record_test.c b/reftable/record_test.c index 5c94d26e35..2876db7d27 100644 --- a/reftable/record_test.c +++ b/reftable/record_test.c @@ -122,11 +122,7 @@ static void test_reftable_ref_record_roundtrip(void) set_hash(in.u.ref.value.val1, 1); break; case REFTABLE_REF_VAL2: - in.u.ref.value.val2.value = - reftable_malloc(GIT_SHA1_RAWSZ); set_hash(in.u.ref.value.val2.value, 1); - in.u.ref.value.val2.target_value = - reftable_malloc(GIT_SHA1_RAWSZ); set_hash(in.u.ref.value.val2.target_value, 2); break; case REFTABLE_REF_SYMREF: diff --git a/reftable/reftable-record.h b/reftable/reftable-record.h index 7f3a0df635..bb6e99acd3 100644 --- a/reftable/reftable-record.h +++ b/reftable/reftable-record.h @@ -41,8 +41,8 @@ struct reftable_ref_record { union { unsigned char val1[GIT_MAX_RAWSZ]; struct { - uint8_t *value; /* first value, malloced hash */ - uint8_t *target_value; /* second value, malloced hash */ + unsigned char value[GIT_MAX_RAWSZ]; /* first hash */ + unsigned char target_value[GIT_MAX_RAWSZ]; /* second hash */ } val2; char *symref; /* referent, malloced 0-terminated string */ } value; From patchwork Wed Jan 3 06:22:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509684 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6836A18623 for ; Wed, 3 Jan 2024 06:22:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="V9XFQHpz"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="eueJjhDQ" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id B39AD5C01EC; Wed, 3 Jan 2024 01:22:41 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Wed, 03 Jan 2024 01:22:41 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262961; x=1704349361; bh=ZVLHiZ2Hii FnyoYQac5W3gZUnnqU8PyqnnHeSaBy7Sw=; b=V9XFQHpzVnG97Lh8LMiSMKq1Su NsJk6yy0pV8t5RyV0D4xCHgDVq787ATvATDiSV2djw2LeKVxNvlt5aQYxDlaj2FI M6cRO8OniJDsnwFiiBfU5C1Z0zwXLGeHeFSzwqDyC7k3s7uPhA5mTWMDoqmvq75i 6NJy25GC57cDIUiah24z80iDL6zEiK0CFI91W8lLUcviyleWUePge/aEwtsyR5cu IuI0IlpenYHUp4b/Cx/dNC/ZtpZRanqMArc3ppax4S0U3FrTxnMTdwODuxyEOjDC 9TnObGlzrwgew53Ec+3hKcqVhWLl2aEUc5aMD6FlQDAXSETUaDTZVd6t60UA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262961; x=1704349361; bh=ZVLHiZ2HiiFnyoYQac5W3gZUnnqU 8PyqnnHeSaBy7Sw=; b=eueJjhDQzKuw337rq/TSZgwbrSmL1ubKPBc/hOZjgbmI m+h5TQSiWXOJ43npUrnJs226rUlnG0jA5zVlY28V2yoelII/4NR97CC7bWQsvi8C vHiXcF1K/kdTePsPbZeLXWMpBhcg4VhRkf1xxj8nVd+mm+mX2b07rocZV+lBSceW EGh6bA5z39RiLMeMwcMk+jYTDnRSTVP5FMA/d/GGfgdNnShYDPgHWNpSDCyn4zHv M/kw8qkUNevFehp6AMnwLdiReeVrm7iW2VK8vcd4AQDuuuXBczQG81ZVbHaAs3Kk A+9/Zprrf9xfbZke1pfyzwv5b0kRZSgKEzC4LB5iTQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpeefnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:40 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id b1c86c43 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:20:15 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:39 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 7/8] reftable/merged: really reuse buffers to compute record keys Message-ID: <845dec239019886575ba6796f34b0181c78981de.1704262787.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In 829231dc20 (reftable/merged: reuse buffer to compute record keys, 2023-12-11), we have refactored the merged iterator to reuse a pair of long-living strbufs by relying on the fact that `reftable_record_key()` tries to reuse already allocated strbufs by calling `strbuf_reset()`, which should give us significantly fewer reallocations compared to the old code that used on-stack strbufs that are allocated for each and every iteration. Unfortunately, we called `strbuf_release()` on these long-living strbufs that we meant to reuse on each iteration, defeating the optimization. Fix this performance issue by not releasing those buffers on iteration anymore, where we instead rely on `merged_iter_close()` to release the buffers for us. Using `git show-ref --quiet` in a repository with ~350k refs this leads to a significant drop in allocations. Before: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 1,410,148 allocs, 1,409,955 frees, 61,976,068 bytes allocated After: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated Signed-off-by: Patrick Steinhardt --- reftable/merged.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/reftable/merged.c b/reftable/merged.c index 556bb5c556..a28bb99aaf 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -128,8 +128,6 @@ static int merged_iter_next_entry(struct merged_iter *mi, done: reftable_record_release(&entry.rec); - strbuf_release(&mi->entry_key); - strbuf_release(&mi->key); return err; } From patchwork Wed Jan 3 06:22:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13509685 Received: from out3-smtp.messagingengine.com (out3-smtp.messagingengine.com [66.111.4.27]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 885CB1864E for ; Wed, 3 Jan 2024 06:22:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="xDrDuSeA"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="tJCOWOQ+" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id DB24A5C00E2; Wed, 3 Jan 2024 01:22:45 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Wed, 03 Jan 2024 01:22:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1704262965; x=1704349365; bh=FqbAHKp6GZ pMNU6U9RvoV84G+vN6lNcnKBEXD1pYKy4=; b=xDrDuSeANg0OubpLbcQRkbqg+C pzdd6DMKztyX0LoOJgdqUhO9m8VMNv2UWod/lr/2KfzI2zpUzGfIO88JfyFrZrik tc9I0dMrfl1L4Kj9nR/gxZDbLHAkK4aZ3fT6EAXY9rMTydeAWmm0/wl0GGFl+wFa 2PlYRy6fX45Hn3Snl0mR9I5UjkkC0KxDPRnd20QX7QWhRODSN7P4Ncu+1UpkCTKt CJPSPiRwW072UqztVvwCXuMCLSwAPd6i90LRNlwMS2zPli7iG0WMI3Nzu2Zi35IX IUYwYEbWDCGyLeaqC8crCpdFBCZZVaLNRgbM9elN9s1pTHHt4QGNvebBGVqA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1704262965; x=1704349365; bh=FqbAHKp6GZpMNU6U9RvoV84G+vN6 lNcnKBEXD1pYKy4=; b=tJCOWOQ+m5mSc/riWQkkTJRkYamud4uTvW0flj3Bidj6 +Lo6BXRWTsyyTLG1bSfWup/bKJomhMlsqrz9gi6c+b25zs3pR1c2/YIcc15qDkX6 8Vqrteb/+/mkx+19QE5d6ij6Zl8UvvYPG1Y8O563d6Ji5kTRydbzcfBL7KHSxqgS VBCa47k97+C9DQMLSUCIjhO+lzf5QJt2upioR4mQMdi7Zmz67I0ZQsvih4Uo5MxI Fm4Spd5SLMdopVUQ2J4uZ7lBdQGV1PSZbV2xIZFZCWpqXvKv9FDBXlOrgt8beyw9 vI7umEszWegJCHXQ095v4L2S5HzVIadz1FG38MCREg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrvdeggedgleeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jan 2024 01:22:44 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 63004312 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Wed, 3 Jan 2024 06:20:19 +0000 (UTC) Date: Wed, 3 Jan 2024 07:22:43 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Han-Wen Nienhuys , Junio C Hamano Subject: [PATCH v3 8/8] reftable/merged: transfer ownership of records when iterating Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When iterating over records with the merged iterator we put the records into a priority queue before yielding them to the caller. This means that we need to allocate the contents of these records before we can pass them over to the caller. The handover to the caller is quite inefficient though because we first deallocate the record passed in by the caller and then copy over the new record, which requires us to reallocate memory. Refactor the code to instead transfer ownership of the new record to the caller. So instead of reallocating all contents, we now release the old record and then copy contents of the new record into place. The following benchmark of `git show-ref --quiet` in a repository with around 350k refs shows a clear improvement. Before: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 708,058 allocs, 707,865 frees, 36,783,255 bytes allocated After: HEAP SUMMARY: in use at exit: 21,163 bytes in 193 blocks total heap usage: 357,007 allocs, 356,814 frees, 24,193,602 bytes allocated This shows that we now have roundabout a single allocation per record that we're yielding from the iterator. Ideally, we'd also get rid of this allocation so that the number of allocations doesn't scale with the number of refs anymore. This would require some larger surgery though because the memory is owned by the priority queue before transferring it over to the caller. Signed-off-by: Patrick Steinhardt --- reftable/merged.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/reftable/merged.c b/reftable/merged.c index a28bb99aaf..a52914d667 100644 --- a/reftable/merged.c +++ b/reftable/merged.c @@ -124,10 +124,12 @@ static int merged_iter_next_entry(struct merged_iter *mi, reftable_record_release(&top.rec); } - reftable_record_copy_from(rec, &entry.rec, hash_size(mi->hash_id)); + reftable_record_release(rec); + *rec = entry.rec; done: - reftable_record_release(&entry.rec); + if (err) + reftable_record_release(&entry.rec); return err; }