From patchwork Fri Mar 22 12:22:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13600026 Received: from fhigh2-smtp.messagingengine.com (fhigh2-smtp.messagingengine.com [103.168.172.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 65B9940BF5 for ; Fri, 22 Mar 2024 12:22:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.153 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110158; cv=none; b=erqSwyMfu73ZUbtR+Oxd1Dv6u2sN6y0iQAxyisptc8A/mOdL0/a4Swr0Trk9t3RU2PEKEh88KWD2EuY4ksk41KHG8x3XXrPDGFwSuSyiP08NyC7a8XRFJnaPZ1C6ni9ytXsX7KLZ4ZuA5fNKHUTz36v6ynsG7JQQwWQ8iPeiNO4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110158; c=relaxed/simple; bh=R4CkqGfM3uVLn/P5K39gjJ7SoIpQtvV1GgeWdSjzlTo=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Y4gMkrKwtHiCyk9sj62nJ9ZksYv8r5FxJ9XgCPwbralBs3QULdT+UhIBNzHYA9qg1/cFtp0r2FTT03jcw791kQD6WAAfSdTYKfsRB11itw1NszFkE9kCKG3L2fx7UPANzB9I2jb/hEwT3Myk4oqTpC5q4Hw0lfYO97MgVnTHch0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=qb+D7Cuw; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=v6vJF0y0; arc=none smtp.client-ip=103.168.172.153 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="qb+D7Cuw"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="v6vJF0y0" Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 7ECB3114013C for ; Fri, 22 Mar 2024 08:22:33 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Fri, 22 Mar 2024 08:22:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1711110153; x=1711196553; bh=GjH2fo8ECo SfW8dEbznRXafQRWSFIQAL29EYm3QyQjY=; b=qb+D7CuweRfQkgeXVaJoq9VHmr wxy4+wuC2537WhN57U0qdSDC7XXkIBD2eHxr5lX41H/mUVaqdLiWvWKEuj6oIRXC w8cZDv3hlnWivnaB0NPS/Fhmnctdf7DVqjZUHvcASVk62hWUl7tO555MpCGJ2X6r dNSe9Sq5WsZhOSPwvQctE7VMEB3ynrV/y6i7+EvuFWpPt52crjj7SVbI4r0rK2Kl hFVbNFMqCqew3QpMRgnZRq1D1YJkkP2wTexVolfEpKh/NwsCtL8l0eQ93iw+WjjG Cuk9PEhLWR/ymA4tFKTuCBJxMmHy2b4bVSLgJSi4bBA4ZPmjuknmjDHUkYjw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1711110153; x=1711196553; bh=GjH2fo8ECoSfW8dEbznRXafQRWSF IQAL29EYm3QyQjY=; b=v6vJF0y0MHo0D0o+5Ms416vYETo6Z6+RNelU1QEY8mHY nCgncV0WhoG4xrNDfP0m/Yk8kgLSoCtxz3c7cQajgSbhIdT5RIEJdbJL6Om2hOLw ArqV1M0UsHAEuXt1gfq+JtMlvGdw0BxA9uezqUcKF+iJBURQ77zccaWRkkrN1i0z DTGQ7RnONGEE1xuiWhaZyXDFH1wAOSdo8K1s7TbtLdp8CUJk+FMpn17zMc5jZ7x/ cEJbAgaNMZbQtELtKisfR/8tE9QNZsEOUh9lqCVnN42jxVLfZ5eXhZj8thHqjo8x 9+zMgevzGsxVyEH0xuGN0ua4P05pXlU7XAn8XonrdA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddttddgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 22 Mar 2024 08:22:32 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id ce5bfa80 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Fri, 22 Mar 2024 12:22:27 +0000 (UTC) Date: Fri, 22 Mar 2024 13:22:30 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 1/7] reftable/basics: fix return type of `binsearch()` to be `size_t` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The `binsearch()` function can be used to find the first element for which a callback functions returns a truish value. But while the array size is of type `size_t`, the function in fact returns an `int` that is supposed to index into that array. Fix the function signature to return a `size_t`. This conversion does not change any semantics given that the function would only ever return a value in the range `[0, sz]` anyway. Signed-off-by: Patrick Steinhardt --- reftable/basics.c | 2 +- reftable/basics.h | 2 +- reftable/basics_test.c | 6 +++--- reftable/block.c | 3 ++- reftable/refname.c | 17 +++++++---------- 5 files changed, 14 insertions(+), 16 deletions(-) diff --git a/reftable/basics.c b/reftable/basics.c index 0785aff941..2c5f34b39e 100644 --- a/reftable/basics.c +++ b/reftable/basics.c @@ -27,7 +27,7 @@ void put_be16(uint8_t *out, uint16_t i) out[1] = (uint8_t)(i & 0xff); } -int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) +size_t binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) { size_t lo = 0; size_t hi = sz; diff --git a/reftable/basics.h b/reftable/basics.h index 91f3533efe..2672520e76 100644 --- a/reftable/basics.h +++ b/reftable/basics.h @@ -28,7 +28,7 @@ void put_be16(uint8_t *out, uint16_t i); * Contrary to bsearch(3), this returns something useful if the argument is not * found. */ -int binsearch(size_t sz, int (*f)(size_t k, void *args), void *args); +size_t binsearch(size_t sz, int (*f)(size_t k, void *args), void *args); /* * Frees a NULL terminated array of malloced strings. The array itself is also diff --git a/reftable/basics_test.c b/reftable/basics_test.c index 1fcd229725..dc1c87c5df 100644 --- a/reftable/basics_test.c +++ b/reftable/basics_test.c @@ -34,15 +34,15 @@ static void test_binsearch(void) int i = 0; for (i = 1; i < 11; i++) { - int res; + size_t res; + args.key = i; res = binsearch(sz, &binsearch_func, &args); if (res < sz) { EXPECT(args.key < arr[res]); - if (res > 0) { + if (res > 0) EXPECT(args.key >= arr[res - 1]); - } } else { EXPECT(args.key == 10 || args.key == 11); } diff --git a/reftable/block.c b/reftable/block.c index e2a2cee58d..422885bddb 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -382,7 +382,8 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, }; struct block_iter next = BLOCK_ITER_INIT; struct reftable_record rec; - int err = 0, i; + int err = 0; + size_t i; if (args.error) { err = REFTABLE_FORMAT_ERROR; diff --git a/reftable/refname.c b/reftable/refname.c index 7570e4acf9..64eba1b886 100644 --- a/reftable/refname.c +++ b/reftable/refname.c @@ -33,10 +33,9 @@ static int modification_has_ref(struct modification *mod, const char *name) .names = mod->add, .want = name, }; - int idx = binsearch(mod->add_len, find_name, &arg); - if (idx < mod->add_len && !strcmp(mod->add[idx], name)) { + size_t idx = binsearch(mod->add_len, find_name, &arg); + if (idx < mod->add_len && !strcmp(mod->add[idx], name)) return 0; - } } if (mod->del_len > 0) { @@ -44,10 +43,9 @@ static int modification_has_ref(struct modification *mod, const char *name) .names = mod->del, .want = name, }; - int idx = binsearch(mod->del_len, find_name, &arg); - if (idx < mod->del_len && !strcmp(mod->del[idx], name)) { + size_t idx = binsearch(mod->del_len, find_name, &arg); + if (idx < mod->del_len && !strcmp(mod->del[idx], name)) return 1; - } } err = reftable_table_read_ref(&mod->tab, name, &ref); @@ -77,7 +75,7 @@ static int modification_has_ref_with_prefix(struct modification *mod, .names = mod->add, .want = prefix, }; - int idx = binsearch(mod->add_len, find_name, &arg); + size_t idx = binsearch(mod->add_len, find_name, &arg); if (idx < mod->add_len && !strncmp(prefix, mod->add[idx], strlen(prefix))) goto done; @@ -96,11 +94,10 @@ static int modification_has_ref_with_prefix(struct modification *mod, .names = mod->del, .want = ref.refname, }; - int idx = binsearch(mod->del_len, find_name, &arg); + size_t idx = binsearch(mod->del_len, find_name, &arg); if (idx < mod->del_len && - !strcmp(ref.refname, mod->del[idx])) { + !strcmp(ref.refname, mod->del[idx])) continue; - } } if (strncmp(ref.refname, prefix, strlen(prefix))) { From patchwork Fri Mar 22 12:22:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13600027 Received: from fout6-smtp.messagingengine.com (fout6-smtp.messagingengine.com [103.168.172.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 53FFA3FE46 for ; Fri, 22 Mar 2024 12:22:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.149 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110160; cv=none; b=lzXsUevG52PCK/hsHsvr7hJZwmV7qdVTCcqySy9XjX6QFFjCHInr7VaQuIp6idOok2IvrkBGCbTNaO3Mv9Z+IUqdRrtrxe3/AXoebrYdVjkS0nVO6ouPclR7RShQCJJAnY7v5XDTaxAxp7my8MNWU0if/1+0AliAutUAfi1wszI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110160; c=relaxed/simple; bh=BaNYT5MLIxYYcMXMcDWu0jfzBN1e9ky2CY/eKLGjm1A=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=IxJGx8UD+XRMSCKzdbenl0eI/8pnwJsX2HZQyOzcbmSxpJYuMVlBgA9takKM7wKaWgNO3+2GbKhZWVla97vuJT9eN1QkDcW9MNdXpjwzg8uDv9qoIqAkWL9jSKZcrcUc5F+SpRHtJutsodqivkpvx/kYE6MyEJIxKtV+bX6aRQs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=c947KVMs; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=uLNuOVmu; arc=none smtp.client-ip=103.168.172.149 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="c947KVMs"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="uLNuOVmu" Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailfout.nyi.internal (Postfix) with ESMTP id 6599513800F8 for ; Fri, 22 Mar 2024 08:22:37 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Fri, 22 Mar 2024 08:22:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1711110157; x=1711196557; bh=Kigp1rEQ+J AaQBkBHmAEpbNjaREoSvpq2xWlq4hEuUs=; b=c947KVMsJStRuQesPBEiOD8x8a UsFL+4aIBGHvTuCCqP2kTBUX4RlbD78X882j3daSq0FDpBAquuHttLaO2KRS/Z+q O49/qisGFeUqpHZjBzqJbODU4Fekuj/s7/rZMT3Dp505Dwd/lMnS3qwq2lLQiLw5 +z+i0evHlLKbUpxY1sOXE0imMQ9nRvG8GjmsxWR181lKrOm2gAunFNV/n0EAghD1 c+aeIP1H/lm4qhp7U7raIB1OmsQMIsWa5R/xukjHxn+c4YBYI991p8sAnoAZsXeT fGkn2DoIEQKU/dnTHnhzTPtI5sM93yx1dHGAGadCWrPsiptNAYYcjpZYsGmA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1711110157; x=1711196557; bh=Kigp1rEQ+JAaQBkBHmAEpbNjaREo Svpq2xWlq4hEuUs=; b=uLNuOVmufiuFpyM8Iob4w35GYMHps/MWCLSVAswmxMot wAngOupBeFSzbJIzb4B6M5e7/bgPaIcDIYqkWuFXHeJdkzrNLdf1ACsRep5AS0D9 mY8cmzRWvIBmPtl5SCGIyBn6iClHXMNkBAEqwydK3dzHto3SMXeVD9Ks7V6HogNr 8rH0In5d8UaESgfyZDDZy4WGUg4Jd7WjX6s0AqxZYM9UXA05rioAn8UC4ICPbOLp Z28rnnjz2g9KgBHckYmIsPiHowtDKHMPN2sUkk2CHE95n+agddrQiknFxOgLnm17 Z/sSqGPkcgm007c2Yi7RkDsEsuoThgYyMhAjus6sGA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddttddgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 22 Mar 2024 08:22:36 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 021578d0 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Fri, 22 Mar 2024 12:22:31 +0000 (UTC) Date: Fri, 22 Mar 2024 13:22:35 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 2/7] reftable/basics: improve `binsearch()` test Message-ID: <7955f7983a6d8ef81a572f108b11c7afa93e34fd.1711109214.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The `binsearch()` test is somewhat weird in that it doesn't explicitly spell out its expectations. Instead it does so in a rather ad-hoc way with some hard-to-understand computations. Refactor the test to spell out the needle as well as expected index for all testcases. This refactoring highlights that the `binsearch_func()` is written somewhat weirdly to find the first integer smaller than the needle, not smaller or equal to it. Adjust the function accordingly. While at it, rename the callback function to better convey its meaning. Signed-off-by: Patrick Steinhardt --- reftable/basics_test.c | 55 ++++++++++++++++++++++++------------------ 1 file changed, 31 insertions(+), 24 deletions(-) diff --git a/reftable/basics_test.c b/reftable/basics_test.c index dc1c87c5df..85c4d1621c 100644 --- a/reftable/basics_test.c +++ b/reftable/basics_test.c @@ -12,40 +12,47 @@ license that can be found in the LICENSE file or at #include "test_framework.h" #include "reftable-tests.h" -struct binsearch_args { - int key; - int *arr; +struct integer_needle_lesseq { + int needle; + int *haystack; }; -static int binsearch_func(size_t i, void *void_args) +static int integer_needle_lesseq(size_t i, void *_args) { - struct binsearch_args *args = void_args; - - return args->key < args->arr[i]; + struct integer_needle_lesseq *args = _args; + return args->needle <= args->haystack[i]; } static void test_binsearch(void) { - int arr[] = { 2, 4, 6, 8, 10 }; - size_t sz = ARRAY_SIZE(arr); - struct binsearch_args args = { - .arr = arr, + int haystack[] = { 2, 4, 6, 8, 10 }; + struct { + int needle; + size_t expected_idx; + } testcases[] = { + {-9000, 0}, + {-1, 0}, + {0, 0}, + {2, 0}, + {3, 1}, + {4, 1}, + {7, 3}, + {9, 4}, + {10, 4}, + {11, 5}, + {9000, 5}, }; + size_t i = 0; - int i = 0; - for (i = 1; i < 11; i++) { - size_t res; - - args.key = i; - res = binsearch(sz, &binsearch_func, &args); + for (i = 0; i < ARRAY_SIZE(testcases); i++) { + struct integer_needle_lesseq args = { + .haystack = haystack, + .needle = testcases[i].needle, + }; + size_t idx; - if (res < sz) { - EXPECT(args.key < arr[res]); - if (res > 0) - EXPECT(args.key >= arr[res - 1]); - } else { - EXPECT(args.key == 10 || args.key == 11); - } + idx = binsearch(ARRAY_SIZE(haystack), &integer_needle_lesseq, &args); + EXPECT(idx == testcases[i].expected_idx); } } From patchwork Fri Mar 22 12:22:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13600028 Received: from fout6-smtp.messagingengine.com (fout6-smtp.messagingengine.com [103.168.172.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 33838446DB for ; Fri, 22 Mar 2024 12:22:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.149 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110165; cv=none; b=FZicla4gKSdYrN/oANAOfpO8bJrlHzxOJNmhKcb8FTlWBJ/dbNObF01Sc2TtQs8GIOWvDYi9gOsG5tYjPAh5etYNlVSYEtrEEU/pBDlWhTpp81PWhopONZ+xzO7DLxoJS5PXS1Xe7f80vRNh5ACFs85YkQJoyZuE41DNWDcyRzk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110165; c=relaxed/simple; bh=3aHD4qJdJg8ITYYJJJsvxNOrP2fUOB1b73JOMJbEDRo=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=GuuPZeDq0yRnlYWVHwN6feDkIFMXmDlAgCHr5k4rgkY1hgWAjO8o9Grk9rxIsyqeakX9sPNBvOR/BQv4wbafJSscF3OlkKzXqlZNICN1jzqfUV9Yvm8uNiVyZm5o6P7vE725DqCX2PHmU5+Y7XP9OCAgfBXG3gj/OyHw3NxnxNw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=bAKtWuIm; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=wSpB1eZi; arc=none smtp.client-ip=103.168.172.149 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="bAKtWuIm"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="wSpB1eZi" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfout.nyi.internal (Postfix) with ESMTP id 4F31C13800F7 for ; Fri, 22 Mar 2024 08:22:42 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute2.internal (MEProxy); Fri, 22 Mar 2024 08:22:42 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1711110162; x=1711196562; bh=42Pbfgda5c kYq5TiurrBvSIz/V+eiMQewuGOQn4u+JU=; b=bAKtWuImtGMW/3+s3nm3PBe+Mk QzaF1dOwiWWowA2KRkA7u1skwlpk9xX9a7dtVyYgoAMjG17ppTQFZnDVjsMKgC0K Yg4gdUo4MyyR2ScTl4rgdD5KhVjscR7ibpMGS6vxjrEHBw70feLBTo96bs6CB5Ct L4J8bok9ThrqD+wyY9vLluTAYsh01HxhTbvemdaEeTp3ArBo2VLeCOeGb8HjlbhC b+b1/GftmEx3EaBZBWqnsXNBdP510HA6x9ia+sPvLdcr5SpVn30JG5YCGhGOX6Er SMqjztjPgMTWnmvIheYn13IxOBT8CvRCBEbN4O9Sa/QN9Uycu2RAb1G/oS2w== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1711110162; x=1711196562; bh=42Pbfgda5ckYq5TiurrBvSIz/V+e iMQewuGOQn4u+JU=; b=wSpB1eZin2eNF+xs7myDHJJfaelGjEwEFr13vLBfy3tV b4d5PiZYnLkVU5duVk97is+WQzl8EjE3Hfbhz5zNRsPwT+8IAgSBBBglZpU4it2F an8S9hp8F1Q4FjL03I/+BXCgAMILJThFJ5bcAJtDYXmBkMx6YxjHBjxFRQOCFq2V 2iXtjFjG6t3jnXaGuupKDfdHpqrJeaa18t1HLWIg9aJr52iiUUJyKZqTjvZKYbOo QEs9r3TwfBQPC4LXj42P1O1elJD4AvkzQgNQpbCnbr5+fxI0S2jFS5bYjJDZquzR FWJbNuEF7bg48g4mc/3uk7sWVHr5Jn8HKmbXKGcJbw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddttddgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 22 Mar 2024 08:22:41 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id c8fc589e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Fri, 22 Mar 2024 12:22:35 +0000 (UTC) Date: Fri, 22 Mar 2024 13:22:39 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 3/7] reftable/refname: refactor binary search over refnames Message-ID: <44386818ce681da02f00a498acf66043aa55558e.1711109214.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: It is comparatively hard to understand how exactly the binary search over refnames works given that the function and variable names are not exactly easy to grasp. Rename them to make this more obvious. This should not result in any change in behaviour. Signed-off-by: Patrick Steinhardt --- reftable/refname.c | 44 ++++++++++++++++++++++---------------------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/reftable/refname.c b/reftable/refname.c index 64eba1b886..9ec488d727 100644 --- a/reftable/refname.c +++ b/reftable/refname.c @@ -12,15 +12,15 @@ #include "refname.h" #include "reftable-iterator.h" -struct find_arg { - char **names; - const char *want; +struct refname_needle_lesseq_args { + char **haystack; + const char *needle; }; -static int find_name(size_t k, void *arg) +static int refname_needle_lesseq(size_t k, void *arg) { - struct find_arg *f_arg = arg; - return strcmp(f_arg->names[k], f_arg->want) >= 0; + struct refname_needle_lesseq_args *f_arg = arg; + return strcmp(f_arg->needle, f_arg->haystack[k]) <= 0; } static int modification_has_ref(struct modification *mod, const char *name) @@ -29,21 +29,21 @@ static int modification_has_ref(struct modification *mod, const char *name) int err = 0; if (mod->add_len > 0) { - struct find_arg arg = { - .names = mod->add, - .want = name, + struct refname_needle_lesseq_args arg = { + .haystack = mod->add, + .needle = name, }; - size_t idx = binsearch(mod->add_len, find_name, &arg); + size_t idx = binsearch(mod->add_len, refname_needle_lesseq, &arg); if (idx < mod->add_len && !strcmp(mod->add[idx], name)) return 0; } if (mod->del_len > 0) { - struct find_arg arg = { - .names = mod->del, - .want = name, + struct refname_needle_lesseq_args arg = { + .haystack = mod->del, + .needle = name, }; - size_t idx = binsearch(mod->del_len, find_name, &arg); + size_t idx = binsearch(mod->del_len, refname_needle_lesseq, &arg); if (idx < mod->del_len && !strcmp(mod->del[idx], name)) return 1; } @@ -71,11 +71,11 @@ static int modification_has_ref_with_prefix(struct modification *mod, int err = 0; if (mod->add_len > 0) { - struct find_arg arg = { - .names = mod->add, - .want = prefix, + struct refname_needle_lesseq_args arg = { + .haystack = mod->add, + .needle = prefix, }; - size_t idx = binsearch(mod->add_len, find_name, &arg); + size_t idx = binsearch(mod->add_len, refname_needle_lesseq, &arg); if (idx < mod->add_len && !strncmp(prefix, mod->add[idx], strlen(prefix))) goto done; @@ -90,11 +90,11 @@ static int modification_has_ref_with_prefix(struct modification *mod, goto done; if (mod->del_len > 0) { - struct find_arg arg = { - .names = mod->del, - .want = ref.refname, + struct refname_needle_lesseq_args arg = { + .haystack = mod->del, + .needle = ref.refname, }; - size_t idx = binsearch(mod->del_len, find_name, &arg); + size_t idx = binsearch(mod->del_len, refname_needle_lesseq, &arg); if (idx < mod->del_len && !strcmp(ref.refname, mod->del[idx])) continue; From patchwork Fri Mar 22 12:22:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13600029 Received: from fhigh2-smtp.messagingengine.com (fhigh2-smtp.messagingengine.com [103.168.172.153]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2D0B345033 for ; Fri, 22 Mar 2024 12:22:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.153 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110169; cv=none; b=C/OUcJBk3iBzBKlQFsGYMwiR3syBP+zEeshdBMiBEEe6+y+nfmXqZSBvfDyj6UkKBOzIclAEIm8liUfljqwxQ30hznHms5pg8Jb40JMsh5cjJie8DHq9RE82VAQ/wdPrIrKU+u4Hs4DdWwJxTIlvgK3g1vgA/KHfJIPVKrjRbME= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110169; c=relaxed/simple; bh=rKovJTyTp+8/JRebPji9kMT5D2tNRmHtEeI0pxH+K8k=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=WKn0neRNpreKj2YiztR6OpIWDlFJZ4xnBTCgthsquz+/6VKg47EmHB1F77ns2aXpMRfBkQEzKrhuL5xC6vu/ot24oeh3MAbkfm/yzPGuSDPSgEvaIYzzKI9ecNz0JPIdVAYbc2qoLj37DQ5PSiHHQejA1zaDNGtdmUsncyX5jHY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=Wt9EGCFC; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=hiGOFOP2; arc=none smtp.client-ip=103.168.172.153 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="Wt9EGCFC"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="hiGOFOP2" Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 38C8B114013D for ; Fri, 22 Mar 2024 08:22:46 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Fri, 22 Mar 2024 08:22:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1711110166; x=1711196566; bh=qfiGegWuXb rIgwMta1Zr0bm7JNs0q4sDzvVnJGIVu8o=; b=Wt9EGCFCsHySSkaofXIziQhfJp fRQMUAXrlMI2SmztzzsBPEXTcrQgnFzKm9pc523NHqN1GUUs4nO7wodeIT+TolEH MY13bBPJy/TZ2P2mK7CdbD6CL5fN8hsRcrCwtHXAZj/i3cvec0nnE85BkFomyt2H IwBuOrNs1jygiH3L8jGUcOlJ6PMtcFGLl050Bcg4J5qFjElZuATg4lbj8x8/zB75 nDX10Jm+Rk9eAjNStnsAWfUbA2+r3+erdK0e0jghIcvsH88TNU57R9Gb5rj887Dg nnJ3ioVGa0is0rNq7MhAjVIMvssUJK8q8wxm+abVruim6iT2k12GukzfX47A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1711110166; x=1711196566; bh=qfiGegWuXbrIgwMta1Zr0bm7JNs0 q4sDzvVnJGIVu8o=; b=hiGOFOP2YIu8+dG/YtTHpvC7i6zmZrLsWt0t96Voz8JW UhRMKezzrrGYG0iQb3yeUJ6FChC+XTBUWVp8WJVRh55WuPS67sD2TqSMGgBPuq38 xu3fS4XRqZr6M5ww3wLoYXDXvZMpj5sciHJJrD7K9ru308YWiXHNGjN8jGKBzuS7 QtJ3Ztjumr/3U29YcnRJdTPtJ852oa0Q2dKXOV6C+l0mXpAgFR96gmy+xmpz0tKz JfXsL/0pBUqAUHS7wBPtRTSe2LpUrkmtUuIdTJQ7PFcUuMSpJT89JYSlwTiBeyZq E/y1wryHU4YE55/ikjpY8VK1fIpSHNC8MWtpggh5LQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddttddgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgepudenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 22 Mar 2024 08:22:45 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 26a0ec49 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Fri, 22 Mar 2024 12:22:39 +0000 (UTC) Date: Fri, 22 Mar 2024 13:22:43 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 4/7] reftable/block: refactor binary search over restart points Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When seeking a record in our block reader we perform a binary search over the block's restart points so that we don't have to do a linear scan over the whole block. The logic to do so is quite intricate though, which makes it hard to understand. Improve documentation and rename some of the functions and variables so that the code becomes easier to understand overall. This refactoring should not result in any change in behaviour. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 97 ++++++++++++++++++++++++++++++++++-------------- 1 file changed, 70 insertions(+), 27 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 422885bddb..6cd4c053df 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -273,34 +273,36 @@ void block_reader_start(struct block_reader *br, struct block_iter *it) it->next_off = br->header_off + 4; } -struct restart_find_args { +struct restart_needle_less_args { int error; - struct strbuf key; - struct block_reader *r; + struct strbuf needle; + struct block_reader *reader; }; -static int restart_key_less(size_t idx, void *args) +static int restart_needle_less(size_t idx, void *_args) { - struct restart_find_args *a = args; - uint32_t off = block_reader_restart_offset(a->r, idx); + struct restart_needle_less_args *args = _args; + uint32_t off = block_reader_restart_offset(args->reader, idx); struct string_view in = { - .buf = a->r->block.data + off, - .len = a->r->block_len - off, + .buf = args->reader->block.data + off, + .len = args->reader->block_len - off, }; - - /* the restart key is verbatim in the block, so this could avoid the - alloc for decoding the key */ - struct strbuf rkey = STRBUF_INIT; + struct strbuf kth_restart_key = STRBUF_INIT; uint8_t unused_extra; - int n = reftable_decode_key(&rkey, &unused_extra, in); - int result; + int result, n; + + /* + * TODO: The restart key is verbatim in the block, so we can in theory + * avoid decoding the key and thus save some allocations. + */ + n = reftable_decode_key(&kth_restart_key, &unused_extra, in); if (n < 0) { - a->error = 1; + args->error = 1; return -1; } - result = strbuf_cmp(&a->key, &rkey); - strbuf_release(&rkey); + result = strbuf_cmp(&args->needle, &kth_restart_key); + strbuf_release(&kth_restart_key); return result < 0; } @@ -376,9 +378,9 @@ void block_iter_close(struct block_iter *it) int block_reader_seek(struct block_reader *br, struct block_iter *it, struct strbuf *want) { - struct restart_find_args args = { - .key = *want, - .r = br, + struct restart_needle_less_args args = { + .needle = *want, + .reader = br, }; struct block_iter next = BLOCK_ITER_INIT; struct reftable_record rec; @@ -390,7 +392,35 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, goto done; } - i = binsearch(br->restart_count, &restart_key_less, &args); + /* + * Perform a binary search over the block's restart points, which + * avoids doing a linear scan over the whole block. Like this, we + * identify the section of the block that should contain our key. + * + * Note that we explicitly search for the first restart point _greater_ + * than the sought-after record, not _greater or equal_ to it. In case + * the sought-after record is located directly at the restart point we + * would otherwise start doing the linear search at the preceding + * restart point. While that works alright, we would end up scanning + * too many record. + */ + i = binsearch(br->restart_count, &restart_needle_less, &args); + + /* + * Now there are multiple cases: + * + * - `i == 0`: The wanted record must be contained before the first + * restart point. We will thus start searching for the record in + * that section after accounting for the header offset. + * + * - `i == restart_count`: The wanted record was not found at any of + * the restart points. As there is no restart point at the end of + * the section the record may thus be contained in the last block. + * + * - `i > 0`: The wanted record must be contained in the section + * before the found restart point. We thus do a linear search + * starting from the preceding restart point. + */ if (i > 0) it->next_off = block_reader_restart_offset(br, i - 1); else @@ -399,21 +429,34 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, reftable_record_init(&rec, block_reader_type(br)); - /* We're looking for the last entry less/equal than the wanted key, so - we have to go one entry too far and then back up. - */ + /* + * We're looking for the last entry less than the wanted key so that + * the next call to `block_reader_next()` would yield the wanted + * record. We thus don't want to position our reader at the sought + * after record, but one before. To do so, we have to go one entry too + * far and then back up. + */ while (1) { block_iter_copy_from(&next, it); err = block_iter_next(&next, &rec); if (err < 0) goto done; - - reftable_record_key(&rec, &it->last_key); - if (err > 0 || strbuf_cmp(&it->last_key, want) >= 0) { + if (err > 0) { err = 0; goto done; } + /* + * Check whether the current key is greater or equal to the + * sought-after key. In case it is greater we know that the + * record does not exist in the block and can thus abort early. + * In case it is equal to the sought-after key we have found + * the desired record. + */ + reftable_record_key(&rec, &it->last_key); + if (strbuf_cmp(&it->last_key, want) >= 0) + goto done; + block_iter_copy_from(it, &next); } From patchwork Fri Mar 22 12:22:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13600030 Received: from fout6-smtp.messagingengine.com (fout6-smtp.messagingengine.com [103.168.172.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0F7FB45C04 for ; Fri, 22 Mar 2024 12:22:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.149 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110172; cv=none; b=a2yVT5jfDEYIjWh8LtQwoR32fjPXdLIkSLgAKSqtPoTM2qdynhRz8028ix1iNYN8+knLGapWjvHl2NLhbhBSpvXdCiRTkak4k+s//L3Il+zIiRi8LSQ3/bds4wBwmvLKUS6f7pIEYPPRMXdM8KLL18P+xTTve4r+Ki41u5GtCzI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110172; c=relaxed/simple; bh=qymDGwTb333PeIIL16hBSfNl0KsmgDnSkLsI2PjpcAw=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=QMK/k6wamb+QzZu1OIKqvLnrvTG8kYxSWh5P5WoeqcHHGtG8MdbUsAFeW3wg2TMSwlCMWCUxK3xo/k2AYTbR9YRcw6TX3l+HjhN+anSxF2GzzfnTyBxwzSaxhllFmpf7BM5m6lTl9p5Qawt1TgAPgIKgWY74dMWbvZZ8bD9FmEo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=YSA11hCO; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=YiCDVZ5k; arc=none smtp.client-ip=103.168.172.149 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="YSA11hCO"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="YiCDVZ5k" Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailfout.nyi.internal (Postfix) with ESMTP id 2181F1380068 for ; Fri, 22 Mar 2024 08:22:50 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Fri, 22 Mar 2024 08:22:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1711110170; x=1711196570; bh=bl4V1gOTc8 TxqvhDWrpWQEwA97Q1CEoWpPydeboV3XY=; b=YSA11hCO4Kb15kSOZXBtO9yV5H vJq4jJeh9ZdPEh6kxHtPyE5A7L2iuEQjvLQhv6QOptYyODPCnsqyzkrXC5y/ER46 7tWgjgedebEIpXA7FWNxronJpjPwPJ+SJLMdOXSxX+iLNcixH6HeJX4Y5XT11/et AKYH9kePH0kU3bkbh4ciEFRnrv+3Zu2s9JRu6LfFBKh+pzevUfYXS9TMdPBpag9C LtKHBL2j6xz0pLE4R00q7xjaLyfCUfSTCespAQh5ygBzLpQjun5Ew/PgqcIylQNB WVPGRkL6bhL8GkNJfJtrQCupqQm04AzM4WHr+5YO7tIhQO1huUBjA95naJYw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1711110170; x=1711196570; bh=bl4V1gOTc8TxqvhDWrpWQEwA97Q1 CEoWpPydeboV3XY=; b=YiCDVZ5klUpKS2dAcesBdvR1CQAWRDqU7CWP7GSJD3rE XtnYrM/R5+RonMEZPjoO1aWRbvcQTVf6trrKfJwP1mPGbaAXygGFsa9heWamHJXm NmKSX/S/g5mGR/uTKoGVlrUVV+bQAUvBH3HhkIx3doVx+XUG9Exd203lzohNo0Ll EDdoBfaifNNxkBG8lhI7SSduE2WWZMPuWnbUHUVMKfxuh5UqNB+Iv944n/4fmty9 jm/hZfBXwq3W0TQotFnpJe0k7Q9oQeJI9tuNDj/NtW7mveui9H0ZN2d430qcITbo aqdEflmlX0ccf184w2zleMSy/gBR9xYX7F24J2XkiQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddttddgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgepudenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 22 Mar 2024 08:22:49 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 1c367154 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Fri, 22 Mar 2024 12:22:43 +0000 (UTC) Date: Fri, 22 Mar 2024 13:22:47 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 5/7] reftable/block: fix error handling when searching restart points Message-ID: <36b1ef8e5c0948392210e859b61d9ae3bdb58be9.1711109214.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When doing the binary search over restart points in a block we need to decode the record keys. This decoding step can result in an error when the block is corrupted, which we indicate to the caller of the binary search by setting `args.error = 1`. But the only caller that exists mishandles this because it in fact performs the error check before calling `binsearch()`. Fix this bug by checking for errors at the right point in time. Furthermore, refactor `binsearch()` so that it aborts the search in case the callback function returns a negative value so that we don't needlessly continue to search the block. Signed-off-by: Patrick Steinhardt --- reftable/basics.c | 5 ++++- reftable/basics.h | 5 +++-- reftable/block.c | 9 ++++----- 3 files changed, 11 insertions(+), 8 deletions(-) diff --git a/reftable/basics.c b/reftable/basics.c index 2c5f34b39e..fea711db7e 100644 --- a/reftable/basics.c +++ b/reftable/basics.c @@ -39,8 +39,11 @@ size_t binsearch(size_t sz, int (*f)(size_t k, void *args), void *args) */ while (hi - lo > 1) { size_t mid = lo + (hi - lo) / 2; + int ret = f(mid, args); + if (ret < 0) + return sz; - if (f(mid, args)) + if (ret > 0) hi = mid; else lo = mid; diff --git a/reftable/basics.h b/reftable/basics.h index 2672520e76..523ecd5307 100644 --- a/reftable/basics.h +++ b/reftable/basics.h @@ -22,8 +22,9 @@ uint32_t get_be24(uint8_t *in); void put_be16(uint8_t *out, uint16_t i); /* - * find smallest index i in [0, sz) at which f(i) is true, assuming - * that f is ascending. Return sz if f(i) is false for all indices. + * find smallest index i in [0, sz) at which `f(i) > 0`, assuming that f is + * ascending. Return sz if `f(i) == 0` for all indices. The search is aborted + * and `sz` is returned in case `f(i) < 0`. * * Contrary to bsearch(3), this returns something useful if the argument is not * found. diff --git a/reftable/block.c b/reftable/block.c index 6cd4c053df..ca80a05e21 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -387,11 +387,6 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, int err = 0; size_t i; - if (args.error) { - err = REFTABLE_FORMAT_ERROR; - goto done; - } - /* * Perform a binary search over the block's restart points, which * avoids doing a linear scan over the whole block. Like this, we @@ -405,6 +400,10 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, * too many record. */ i = binsearch(br->restart_count, &restart_needle_less, &args); + if (args.error) { + err = REFTABLE_FORMAT_ERROR; + goto done; + } /* * Now there are multiple cases: From patchwork Fri Mar 22 12:22:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13600031 Received: from out2-smtp.messagingengine.com (out2-smtp.messagingengine.com [66.111.4.26]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 17D213FE35 for ; Fri, 22 Mar 2024 12:22:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=66.111.4.26 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110176; cv=none; b=F/jSGBB86jsjeYEyklJEmyAea14WF5iB66ECUQvjZ0seVOYmhtDgUwDv6tGcZ9RYYYMl2NNsIP67GyQCsUlOFWQwR5Lk3dHcHP9z7HtBl4GLej8d05G4eycDE2LHu7rm4/PanO9dr3g3LOfoJSc0tTMgcZGJ85B6LhOznI/VwDs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110176; c=relaxed/simple; bh=Fr9LJfIB+JrZxiqmP67X+MhljCSwejQ7Tq4cU4io5gg=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=gyH8abcMLCjRm2vVJnEL/dKCIvgQJbmOhUQSn8G/RVGr7ZoItRxcrMp0rYTtVvozmVqLGYz5LVopzNrgPhrLzFbnEEZ56XOkcRA12e2C69ILZfDyRimtaiw/fAj/PlY5qVDkoTmmD6wWmH8VV9QT0kXwjBncFVFDf4jZUpF3k+c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=iOcFuz/l; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=UU+VyN7B; arc=none smtp.client-ip=66.111.4.26 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="iOcFuz/l"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="UU+VyN7B" Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 25C825C0096 for ; Fri, 22 Mar 2024 08:22:54 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Fri, 22 Mar 2024 08:22:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1711110174; x=1711196574; bh=R3BC2Jdu0u ZGvghyx3l8jrFbgh4VIJ05l18uTDDkOPk=; b=iOcFuz/lL8R5NlXK66mlZCjjiJ N+Uoas1AKkB3DuMPN6OejOgR6DQHbLpFWnOQmw3ljm2bTnO12K8pPjMAqFUWfq99 thsVXyGKp5RIEijQ3OcAFiDVOw79q9GOJjTR3ck+ExrLszzPATI2fHk+n8ILsEGT c0AO0nCVS7BQjYWeVMUnWcVcxNHAl3ttNsrE/zl0yn4pMIjYpbFIPnmfSa7cIoR3 ASwlE1b9XDeBvdPkDCqVD4bbEyhNNT0/bTfDAopVqqqruKGXK83CzZQYzIY8zd5X T7FQmPvoiLfemIJEuR46q5ayf59I5GVKVv/ETK/UvrcwTJnPzSzPTuG/UoSA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1711110174; x=1711196574; bh=R3BC2Jdu0uZGvghyx3l8jrFbgh4V IJ05l18uTDDkOPk=; b=UU+VyN7BhC4/ie5++K8d+g2vdn/zRMIAswWbZ4nLcmNr 4PfkxKPw1jVtD+K5bu7UDuqqthHvWYFML04xU7iWXQYi9kjCQ4rQQrHQSjJSSrb4 Eh/cEeScGXOoss/JgreRJfnOKgRDW0OZOOb3oSgvspLi2fqRPuhiEfu/6llCF7x5 Z2vy2AyRBUeH4zkdOkIHc6UY4cT2ijsHh8YmPAwmVc2yRLPeUCEb7YEf8oxaTkPW vZKzjNGMsh8ONvCGXF/XNGfmea0KaotMsIbrKQMAf6U/adpr/1W4TJD8gzl+KGTe ShcA0oR33/EVtQKO615tZ2uqm9yqi46BwBbXr7sxvQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddttddgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 22 Mar 2024 08:22:53 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 3da4b4f8 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Fri, 22 Mar 2024 12:22:47 +0000 (UTC) Date: Fri, 22 Mar 2024 13:22:51 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 6/7] reftable/record: extract function to decode key lengths Message-ID: <38666de451348b0306c55b57d22fa55d2365792d.1711109214.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We're about to refactor the binary search over restart points so that it does not need to fully decode the record keys anymore. To do so we will need to decode the record key lengths, which is non-trivial logic. Extract the logic to decode these lengths from `refatble_decode_key()` so that we can reuse it. Signed-off-by: Patrick Steinhardt --- reftable/record.c | 34 +++++++++++++++++++++++++--------- reftable/record.h | 6 ++++++ 2 files changed, 31 insertions(+), 9 deletions(-) diff --git a/reftable/record.c b/reftable/record.c index 23b497adab..5506f3e913 100644 --- a/reftable/record.c +++ b/reftable/record.c @@ -159,26 +159,42 @@ int reftable_encode_key(int *restart, struct string_view dest, return start.len - dest.len; } -int reftable_decode_key(struct strbuf *last_key, uint8_t *extra, - struct string_view in) +int reftable_decode_keylen(struct string_view in, + uint64_t *prefix_len, + uint64_t *suffix_len, + uint8_t *extra) { - int start_len = in.len; - uint64_t prefix_len = 0; - uint64_t suffix_len = 0; + size_t start_len = in.len; int n; - n = get_var_int(&prefix_len, &in); + n = get_var_int(prefix_len, &in); if (n < 0) return -1; string_view_consume(&in, n); - n = get_var_int(&suffix_len, &in); + n = get_var_int(suffix_len, &in); if (n <= 0) return -1; string_view_consume(&in, n); - *extra = (uint8_t)(suffix_len & 0x7); - suffix_len >>= 3; + *extra = (uint8_t)(*suffix_len & 0x7); + *suffix_len >>= 3; + + return start_len - in.len; +} + +int reftable_decode_key(struct strbuf *last_key, uint8_t *extra, + struct string_view in) +{ + int start_len = in.len; + uint64_t prefix_len = 0; + uint64_t suffix_len = 0; + int n; + + n = reftable_decode_keylen(in, &prefix_len, &suffix_len, extra); + if (n < 0) + return -1; + string_view_consume(&in, n); if (in.len < suffix_len || prefix_len > last_key->len) diff --git a/reftable/record.h b/reftable/record.h index 826ee1c55c..d778133e6e 100644 --- a/reftable/record.h +++ b/reftable/record.h @@ -86,6 +86,12 @@ int reftable_encode_key(int *is_restart, struct string_view dest, struct strbuf prev_key, struct strbuf key, uint8_t extra); +/* Decode a record's key lengths. */ +int reftable_decode_keylen(struct string_view in, + uint64_t *prefix_len, + uint64_t *suffix_len, + uint8_t *extra); + /* * Decode into `last_key` and `extra` from `in`. `last_key` is expected to * contain the decoded key of the preceding record, if any. From patchwork Fri Mar 22 12:22:55 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13600032 Received: from fout6-smtp.messagingengine.com (fout6-smtp.messagingengine.com [103.168.172.149]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 09FA9405DF for ; Fri, 22 Mar 2024 12:22:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.149 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110180; cv=none; b=XNjbuDYE822pBIaiCL6DjflC8BqeD/qwvhKj0FaPmDHsWcJvo3Ood3Ifrap6gL6zACJN6KSi08UjOXjc2NC0P1jFw49P54vHHyneBhAE9UkBo9jhs2euG+cezwwVALlXb3kuhZXZwtlUgk5z8vYKhI8vbIhgTjfxhQTN7AQWtTo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711110180; c=relaxed/simple; bh=OuEX5uuBA4g6PW9rT9RCPgdHwen7vBOvN2wtbwB5nnc=; h=Date:From:To:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ShfU3uPDtSBthjdN6spBw/pLzIwDm4qZz6Z8qadTdDf2imTf9dHqbsNqKkmWOVsGiom1tA5TCaWYQj2N6rjOMhGNy/8yZNCso2VxuJ5IyMbkb2pcFbAHKhL7jjihtXQufl61IzVQkd1dy+yNWoVSikv6Cgh9f46ZNIz700sN1ug= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=bP6AeK8D; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=h+kr3DH8; arc=none smtp.client-ip=103.168.172.149 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="bP6AeK8D"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="h+kr3DH8" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfout.nyi.internal (Postfix) with ESMTP id 0DBE513800F7 for ; Fri, 22 Mar 2024 08:22:58 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Fri, 22 Mar 2024 08:22:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1711110178; x=1711196578; bh=CkLybLgvrV 3KAxVbHMFiO58uv0nvmM4UwXESGs7FeMo=; b=bP6AeK8DwgUP72e1WUX5t8Dd2z yTdfa0sWEXn6EdPd1xevxxeMU3sHsAio/hNYj5lEyRupB0WMtorgr/vSytLnF65N i1lkklrEu/shM+XB56zrZsa2GvaIVYmVVgL9QwqiCNWeN5FrvV6GFIimyujq7sYa 7S/6IYYFSu2RL/nBx1vfJs0/3vYg7aOop7CaYHRGGTkj5WuP3u25oWQ/myarHOjW DfltosVx12zo2fWrLwcGsZkOE2hfs4ewW5HZ8f/a93zV93L5MKJ0LH0btnNpgGcs yf/ywxNu0hB2VY4Pv4iuloM/zh8vUSp5ebld3p9kvp8W46Rs8I2sssp9CF0A== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1711110178; x=1711196578; bh=CkLybLgvrV3KAxVbHMFiO58uv0nv mM4UwXESGs7FeMo=; b=h+kr3DH8fGA82zKF1LUxrPQfMphyLa64o1XhNT7npvKA STkWvj8CFUpzx0AED3Hg1xRjNyDkR8Vg3/C2rRILmJB35yyFOKWEJv7FL/DDBVIt np0OhK8LnIDaJL17N32W6tqZqms6UW+DK2h+zozDkPQjx/0B0quJe1VZx/zTmuqe LKdZVGOiiuJfRV39z1DQ6wCZq1e5jTr4GL8xUC9ktkisVSlsrkOtVEaIJ5iR2Qpz 1amlrJOlo3tNibajgFoeDdaj8Eb6qQhtvw4UQp7+VakhY+YsrKg/m6ls8guLAPOi AaxeDcn2+mTKUejR+H15x/UWKMezM8NO/nVOcMWb9Q== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledruddttddgfeeiucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvuffkfhggtggujgesghdtre ertddtvdenucfhrhhomheprfgrthhrihgtkhcuufhtvghinhhhrghrughtuceophhssehp khhsrdhimheqnecuggftrfgrthhtvghrnhepheeghfdtfeeuffehkefgffduleffjedthf dvjeektdfhhedvlefgtefgvdettdfhnecuvehluhhsthgvrhfuihiivgeptdenucfrrghr rghmpehmrghilhhfrhhomhepphhssehpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA for ; Fri, 22 Mar 2024 08:22:57 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 27ef84ad (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO) for ; Fri, 22 Mar 2024 12:22:52 +0000 (UTC) Date: Fri, 22 Mar 2024 13:22:55 +0100 From: Patrick Steinhardt To: git@vger.kernel.org Subject: [PATCH 7/7] reftable/block: avoid decoding keys when searching restart points Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When searching over restart points in a block we decode the key of each of the records, which results in a memory allocation. This is quite pointless though given that records it restart points will never use prefix compression and thus store their keys verbatim in the block. Refactor the code so that we can avoid decoding the keys, which saves us some allocations. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index ca80a05e21..8bb4e43cec 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -287,23 +287,32 @@ static int restart_needle_less(size_t idx, void *_args) .buf = args->reader->block.data + off, .len = args->reader->block_len - off, }; - struct strbuf kth_restart_key = STRBUF_INIT; - uint8_t unused_extra; - int result, n; + uint64_t prefix_len, suffix_len; + uint8_t extra; + int n; /* - * TODO: The restart key is verbatim in the block, so we can in theory - * avoid decoding the key and thus save some allocations. + * Records at restart points are stored without prefix compression, so + * there is no need to fully decode the record key here. This removes + * the need for allocating memory. */ - n = reftable_decode_key(&kth_restart_key, &unused_extra, in); - if (n < 0) { + n = reftable_decode_keylen(in, &prefix_len, &suffix_len, &extra); + if (n < 0 || prefix_len) { args->error = 1; return -1; } - result = strbuf_cmp(&args->needle, &kth_restart_key); - strbuf_release(&kth_restart_key); - return result < 0; + string_view_consume(&in, n); + if (suffix_len > in.len) { + args->error = 1; + return -1; + } + + n = memcmp(args->needle.buf, in.buf, + args->needle.len < suffix_len ? args->needle.len : suffix_len); + if (n) + return n < 0; + return args->needle.len < suffix_len; } void block_iter_copy_from(struct block_iter *dest, struct block_iter *src)