From patchwork Thu Feb 1 10:25:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13540889 Received: from wfhigh4-smtp.messagingengine.com (wfhigh4-smtp.messagingengine.com [64.147.123.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BBD2C15B964 for ; Thu, 1 Feb 2024 10:25:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=64.147.123.155 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783116; cv=none; b=Ynv6PgSwuDwxb253pJtHFqZlD2vK/7yOM/wAQIPJLMsDcLYujCeUTssV+l+RFEXNiNQqQ+qoc6/k7k2PS9UcjpdgQOzz86PsLEUypO5ynYnpJ0RkSvWb+ig2EkywsXxDzVRJkifSYhYl2jAhKwqgs2O4U71KvioFjaCh3+9g5sI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706783116; c=relaxed/simple; bh=OsJ8/eNqG8jcNoZ95dVKdNW714/2RfGsvyCXIA6aLzw=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=nK2nsRZQRGbZ2l2QU1Ty/CXm+xjeXL0EgkOxSeJbftDASilIqrYssgSnShbZIT9KT4DmAWb/UGxVT8i8mUJFRwD04LdUtDt1xMyPO9/wTctmnBOU31QQ1M0zd5SGmy+OahIa41Ic5CZTXoIGsqB94LvBixk4ikaGeMrOgU0Sofg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=gWO0F3EZ; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=V4SZSBSb; arc=none smtp.client-ip=64.147.123.155 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="gWO0F3EZ"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="V4SZSBSb" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfhigh.west.internal (Postfix) with ESMTP id D5C6A180007A for ; Thu, 1 Feb 2024 05:25:13 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Thu, 01 Feb 2024 05:25:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1706783113; x=1706869513; bh=TL7nbh2/B/ 7RlZEt8sHbJki5BX2GEwIlzPMIot3hJc4=; b=gWO0F3EZLM5MDU2m0HCJpUGFcE sVxdFnu2a9YnghemVsbJEXkdMJ25sKIypaIZ97npmrrh9KD7ajxyDHcYHdA8wX+8 Tt3HP0sbKOmxsvIDDZymxA3j3vaNysawlJiUFnJg13B92KguCynhrScw9SG7PCu0 xfWj5d8qQGENBfQxTSvfHfT07tDOGwZwMh1bURwaRiWW+34vu2I2+VAaYXa+mE20 RYDOFfpLmnKaWaPQrcGwBi6lgO5j+SB3IVzppizVD5gA2N+1phLDKzC7IilvbJvA HUQb6gWgh4dx9xWPvvAyHT8Wk/MN7AU8AW3whvg7iCZGQu4xKCtDJp0AM/sA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1706783113; x=1706869513; bh=TL7nbh2/B/7RlZEt8sHbJki5BX2G EwIlzPMIot3hJc4=; b=V4SZSBSb7cbTiNnGQ+1/UacczfQDQLnJQwZWQCJf4KM4 GI5Re1AT2LNKTbahnwq/t+/JCYQ+rOkL6CbeP13VRQvB8VPR4DW18FtihhUEKTC2 GMSDRUy1twfch8l3vt7UYQhHLj0gpjlQzKSi85Xu9kn5+xxi/S4FHUC4Bh9nS287 Bw0xdcB+YCzmzkF4AY2uQgyd4ExhrWVIDUFf82sY1RL7M3lu+N9Dd62c4nyRbTGn JhkJLbHcFMBH0YtTUc4UlmACDaB0tfeKfi+6Ii3gTBf7NhbgZRHgnjYB1GJIZZeQ 8C9nNStzriQntHxiwAajpp+4mMrc/iZJy8vnlFTbzQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvkedrfeduuddgudefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtjeenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhephefgjeeuveejteduhefgffefffdvjeefje eivdekfffgkeeugfehveetueefleeknecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Thu, 1 Feb 2024 05:25:12 -0500 (EST) Received: by vm-mail (OpenSMTPD) with ESMTPSA id c318743b (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Thu, 1 Feb 2024 10:21:49 +0000 (UTC) Date: Thu, 1 Feb 2024 11:25:10 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 4/7] reftable/pq: allocation-less comparison of entry keys Message-ID: <1c9c19a3b3daf9690fda0423c52da33e337c8b54.1706782841.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The priority queue is used by the merged iterator to iterate over reftable records from multiple tables in the correct order. The queue ends up having one record for each table that is being iterated over, with the record that is supposed to be shown next at the top. For example, the key of a ref record is equal to its name so that we end up sorting the priority queue lexicographically by ref name. To figure out the order we need to compare the reftable record keys with each other. This comparison is done by formatting them into a `struct strbuf` and then doing `strbuf_strcmp()` on the result. We then discard the buffers immediately after the comparison. This ends up being very expensive. Because the priority queue usually contains as many records as we have tables, we call the comparison function `O(log($tablecount))` many times for every record we insert. Furthermore, when iterating over many refs, we will insert at least one record for every ref we are iterating over. So ultimately, this ends up being called `O($refcount * log($tablecount))` many times. Refactor the code to use the new `refatble_record_cmp()` function that has been implemented in a preceding commit. This function does not need to allocate memory and is thus significantly more efficient. The following benchmark prints a single ref matching a specific pattern out of 1 million refs via git-show-ref(1), where the reftable stack consists of three tables: Benchmark 1: show-ref: single matching ref (revision = HEAD~) Time (mean ± σ): 224.4 ms ± 6.5 ms [User: 220.6 ms, System: 3.6 ms] Range (min … max): 216.5 ms … 261.1 ms 1000 runs Benchmark 2: show-ref: single matching ref (revision = HEAD) Time (mean ± σ): 172.9 ms ± 4.4 ms [User: 169.2 ms, System: 3.6 ms] Range (min … max): 166.5 ms … 204.6 ms 1000 runs Summary show-ref: single matching ref (revision = HEAD) ran 1.30 ± 0.05 times faster than show-ref: single matching ref (revision = HEAD~) Signed-off-by: Patrick Steinhardt --- reftable/pq.c | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/reftable/pq.c b/reftable/pq.c index dcefeb793a..7220efc39a 100644 --- a/reftable/pq.c +++ b/reftable/pq.c @@ -14,20 +14,9 @@ license that can be found in the LICENSE file or at int pq_less(struct pq_entry *a, struct pq_entry *b) { - struct strbuf ak = STRBUF_INIT; - struct strbuf bk = STRBUF_INIT; - int cmp = 0; - reftable_record_key(&a->rec, &ak); - reftable_record_key(&b->rec, &bk); - - cmp = strbuf_cmp(&ak, &bk); - - strbuf_release(&ak); - strbuf_release(&bk); - + int cmp = reftable_record_cmp(&a->rec, &b->rec); if (cmp == 0) return a->index > b->index; - return cmp < 0; }