diff mbox series

[08/15] refs/packed-backend.c: refactor `find_reference_location()`

Message ID 836a5665b7df065811edc678cb8e70004f7b7c49.1683581621.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series refs: implement skip lists for packed backend | expand

Commit Message

Taylor Blau May 8, 2023, 10 p.m. UTC
The function `find_reference_location()` is used to perform a
binary search-like function over the contents of a repository's
`$GIT_DIR/packed-refs` file.

The search it implements is unlike a standard binary search in that the
records it searches over are not of a fixed width, so the comparison
must locate the end of a record before comparing it.

Extract the core routine of `find_reference_location()` in order to
implement a function in the following patch which will find the first
location in the `packed-refs` file that *doesn't* match the given
pattern.

The behavior of `find_reference_location()` is unchanged.

Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 refs/packed-backend.c | 46 +++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 19 deletions(-)

Comments

Junio C Hamano May 8, 2023, 11:56 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> The function `find_reference_location()` is used to perform a
> binary search-like function over the contents of a repository's
> `$GIT_DIR/packed-refs` file.
>
> The search it implements is unlike a standard binary search in that the
> records it searches over are not of a fixed width, so the comparison
> must locate the end of a record before comparing it.
>
> Extract the core routine of `find_reference_location()` in order to
> implement a function in the following patch which will find the first
> location in the `packed-refs` file that *doesn't* match the given
> pattern.
>
> The behavior of `find_reference_location()` is unchanged.

The splitting out of this step is rather unfortunate in that the
extra parameter "start" given to cmp_record_to_refname(), together
with a rather curious returning to "-1" when the parameter is false,
are not justified at all by looking at the caller, because the only
caller of the function, find_reference_location_1() always passes
"start" it got from its caller without modifying, and the sole
caller of that passes "1" to "start".  IOW, the behaviour of
cmp_record_to_refname() when "start" is false can be any arbitrary
garbage (it could even be BUG("")) and the claim "the behaviour is
unchanged" will still be true.

But that does not help the readers all that much.

So, yes I can agree that this does not introduce any new bug, it is
a mysterious no-op, and why we want to pass different values in "start"
in future steps in order to achieve what is not explained and leaves
the readers frustrated.

I'll stop here for now and come back to the rest laster.

Thanks.
Taylor Blau May 9, 2023, 8:29 p.m. UTC | #2
On Mon, May 08, 2023 at 04:56:41PM -0700, Junio C Hamano wrote:
> So, yes I can agree that this does not introduce any new bug, it is
> a mysterious no-op, and why we want to pass different values in "start"
> in future steps in order to achieve what is not explained and leaves
> the readers frustrated.

Re-reading this patch again with your review in mind, I agree that the
split is poorly placed.

I modified this patch to just extract the core routine behind a helper
function and avoided adding the "start" parameter here.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index e54e78e540..98f96bf3ee 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -302,7 +302,8 @@  static int cmp_packed_ref_records(const void *v1, const void *v2)
  * Compare a snapshot record at `rec` to the specified NUL-terminated
  * refname.
  */
-static int cmp_record_to_refname(const char *rec, const char *refname)
+static int cmp_record_to_refname(const char *rec, const char *refname,
+				 int start)
 {
 	const char *r1 = rec + the_hash_algo->hexsz + 1;
 	const char *r2 = refname;
@@ -311,7 +312,7 @@  static int cmp_record_to_refname(const char *rec, const char *refname)
 		if (*r1 == '\n')
 			return *r2 ? -1 : 0;
 		if (!*r2)
-			return 1;
+			return start ? 1 : -1;
 		if (*r1 != *r2)
 			return (unsigned char)*r1 < (unsigned char)*r2 ? -1 : +1;
 		r1++;
@@ -526,22 +527,9 @@  static int load_contents(struct snapshot *snapshot)
 	return 1;
 }
 
-/*
- * Find the place in `snapshot->buf` where the start of the record for
- * `refname` starts. If `mustexist` is true and the reference doesn't
- * exist, then return NULL. If `mustexist` is false and the reference
- * doesn't exist, then return the point where that reference would be
- * inserted, or `snapshot->eof` (which might be NULL) if it would be
- * inserted at the end of the file. In the latter mode, `refname`
- * doesn't have to be a proper reference name; for example, one could
- * search for "refs/replace/" to find the start of any replace
- * references.
- *
- * The record is sought using a binary search, so `snapshot->buf` must
- * be sorted.
- */
-static const char *find_reference_location(struct snapshot *snapshot,
-					   const char *refname, int mustexist)
+static const char *find_reference_location_1(struct snapshot *snapshot,
+					     const char *refname, int mustexist,
+					     int start)
 {
 	/*
 	 * This is not *quite* a garden-variety binary search, because
@@ -571,7 +559,7 @@  static const char *find_reference_location(struct snapshot *snapshot,
 
 		mid = lo + (hi - lo) / 2;
 		rec = find_start_of_record(lo, mid);
-		cmp = cmp_record_to_refname(rec, refname);
+		cmp = cmp_record_to_refname(rec, refname, start);
 		if (cmp < 0) {
 			lo = find_end_of_record(mid, hi);
 		} else if (cmp > 0) {
@@ -587,6 +575,26 @@  static const char *find_reference_location(struct snapshot *snapshot,
 		return lo;
 }
 
+/*
+ * Find the place in `snapshot->buf` where the start of the record for
+ * `refname` starts. If `mustexist` is true and the reference doesn't
+ * exist, then return NULL. If `mustexist` is false and the reference
+ * doesn't exist, then return the point where that reference would be
+ * inserted, or `snapshot->eof` (which might be NULL) if it would be
+ * inserted at the end of the file. In the latter mode, `refname`
+ * doesn't have to be a proper reference name; for example, one could
+ * search for "refs/replace/" to find the start of any replace
+ * references.
+ *
+ * The record is sought using a binary search, so `snapshot->buf` must
+ * be sorted.
+ */
+static const char *find_reference_location(struct snapshot *snapshot,
+					   const char *refname, int mustexist)
+{
+	return find_reference_location_1(snapshot, refname, mustexist, 1);
+}
+
 /*
  * Create a newly-allocated `snapshot` of the `packed-refs` file in
  * its current state and return it. The return value will already have