[v2,6/9] oid-array: provide a for-loop iterator

Message ID	X85+RNGwV9WWeIXZ@coredump.intra.peff.net (mailing list archive)
State	Accepted
Commit	12c4b4ce754640ac565f8b9b6f072bc10fbed5af
Headers
	Return-Path: <git-owner@kernel.org>
	Date: Mon, 7 Dec 2020 14:11:00 -0500
	From: Jeff King <peff@peff.net>
	To: git@vger.kernel.org
	Cc: Derrick Stolee <dstolee@microsoft.com>, Eric Sunshine <sunshine@sunshineco.com>
	Subject: [PATCH v2 6/9] oid-array: provide a for-loop iterator
	Message-ID: <X85+RNGwV9WWeIXZ@coredump.intra.peff.net>
	References: <X85+GbvmN4wIjsYY@coredump.intra.peff.net>
	MIME-Version: 1.0
	Content-Type: text/plain; charset=utf-8
	Content-Disposition: inline
	In-Reply-To: <X85+GbvmN4wIjsYY@coredump.intra.peff.net>
	Precedence: bulk
Series	misc commit-graph and oid-array cleanups
	[v2,0/9] misc commit-graph and oid-array cleanups
	[v2,1/9] oid-array.h: drop sha1 mention from header guard
	[v2,2/9] t0064: drop sha1 mention from filename
	[v2,3/9] t0064: make duplicate tests more robust
	[v2,4/9] cache.h: move hash/oid functions to hash.h
	[v2,5/9] oid-array: make sort function public
	[v2,6/9] oid-array: provide a for-loop iterator
	[v2,7/9] commit-graph: drop count_distinct_commits() function
	[v2,8/9] commit-graph: replace packed_oid_list with oid_array
	[v2,9/9] commit-graph: use size_t for array allocation and indexing

Commit Message

Jeff King Dec. 7, 2020, 7:11 p.m. UTC

We provide oid_array_for_each_unique() for iterating over the
de-duplicated items in an array. But it's awkward to use for two
reasons:

  1. It uses a callback, which means marshaling arguments into a struct
     and passing it to the callback through a void pointer parameter.

  2. The callback doesn't know the numeric index of the oid we're
     looking at. This is useful for things like progress meters.

Iterating with a for-loop is much more natural for some cases, but the
caller has to do the de-duping itself. However, we can provide a small
helper to make this easier (see the docstring in the header for an
example use).
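
To illustrate the point about the numeric index, here is a minimal
standalone sketch (not part of the patch). The struct definitions and
oideq() are simplified stand-ins for git's real ones, and
unique_positions() is a hypothetical caller invented for this example;
only oid_array_next_unique() mirrors the helper added below.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for git's types; NOT the real definitions. */
struct object_id { unsigned char hash[4]; };
struct oid_array { struct object_id *oid; size_t nr; };

static int oideq(const struct object_id *a, const struct object_id *b)
{
	return !memcmp(a->hash, b->hash, sizeof(a->hash));
}

/* Same logic as the helper added by this patch. */
static size_t oid_array_next_unique(struct oid_array *array, size_t cur)
{
	do {
		cur++;
	} while (cur < array->nr &&
		 oideq(array->oid + cur, array->oid + cur - 1));
	return cur;
}

/*
 * Hypothetical caller: walk an already-sorted array and record the
 * index at which each run of duplicates starts. Returns the number of
 * distinct oids and writes at most "max" indices into "pos".
 */
static size_t unique_positions(struct oid_array *array, size_t *pos, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < array->nr; i = oid_array_next_unique(array, i)) {
		if (n < max)
			pos[n] = i; /* the index a callback would not see */
		n++;
	}
	return n;
}
```

Because "i" is an ordinary loop variable, a caller could feed it to a
progress meter without packing extra state into a callback struct.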

The caller does have to remember to sort the array first. We could add
an assertion into the helper that array->sorted is set, but I didn't
want to complicate what is otherwise a pretty fast code path.
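
For comparison, a checked variant might look roughly like the sketch
below. This is an assumption about what such a helper could be, not
code from the patch: the name oid_array_next_unique_checked() is
hypothetical, the types are simplified stand-ins, and plain assert() is
used here where git code would more likely use its own bug-reporting
machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for git's types; NOT the real definitions. */
struct object_id { unsigned char hash[4]; };
struct oid_array { struct object_id *oid; size_t nr; int sorted; };

static int oideq(const struct object_id *a, const struct object_id *b)
{
	return !memcmp(a->hash, b->hash, sizeof(a->hash));
}

/* Hypothetical checked variant; the patch deliberately omits the check. */
static size_t oid_array_next_unique_checked(struct oid_array *array, size_t cur)
{
	/* One extra test on every call unless compiled with NDEBUG. */
	assert(array->sorted);
	do {
		cur++;
	} while (cur < array->nr &&
		 oideq(array->oid + cur, array->oid + cur - 1));
	return cur;
}
```

The check runs once per unique element, which is the extra work in the
hot path that the patch chooses to avoid.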

I also considered adding a full iterator type with init/next/end
functions (similar to what we have for hashmaps). But it ended up making
the callers much harder to read. This version keeps us close to a basic
for-loop.
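
A rough idea of what the rejected iterator-type design could have
looked like is sketched below. Everything here is hypothetical: the
names oid_array_iter, oid_array_iter_init(), oid_array_iter_next() and
count_unique_with_iter() are invented for illustration, and the types
are simplified stand-ins rather than git's real definitions.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for git's types; NOT the real definitions. */
struct object_id { unsigned char hash[4]; };
struct oid_array { struct object_id *oid; size_t nr; };

static int oideq(const struct object_id *a, const struct object_id *b)
{
	return !memcmp(a->hash, b->hash, sizeof(a->hash));
}

/* Hypothetical iterator state: the array plus the current position. */
struct oid_array_iter {
	struct oid_array *array;
	size_t pos;
};

static void oid_array_iter_init(struct oid_array *array,
				struct oid_array_iter *iter)
{
	iter->array = array;
	iter->pos = 0;
}

/* Return the next distinct oid of a sorted array, or NULL at the end. */
static const struct object_id *oid_array_iter_next(struct oid_array_iter *iter)
{
	struct oid_array *array = iter->array;
	const struct object_id *ret;

	if (iter->pos >= array->nr)
		return NULL;
	ret = array->oid + iter->pos;
	do {
		iter->pos++;
	} while (iter->pos < array->nr &&
		 oideq(array->oid + iter->pos, array->oid + iter->pos - 1));
	return ret;
}

/* Hypothetical caller: count distinct oids using the iterator. */
static size_t count_unique_with_iter(struct oid_array *array)
{
	struct oid_array_iter iter;
	size_t n = 0;

	oid_array_iter_init(array, &iter);
	while (oid_array_iter_next(&iter))
		n++;
	return n;
}
```

Note that the caller needs an extra struct and no longer has a plain
index variable in scope, which is part of what the for-loop version
avoids.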

Yet another option would be adding an option to sort the array and
compact out the duplicates. This would mean iterating over the array an
extra time, though that's probably not a big deal (we did just do an
O(n log n) sort). But we'd still have to write a for-loop to iterate, so
it doesn't really make anything easier for the caller.
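
The sort-and-compact alternative could be sketched as follows. This is
only an illustration under stated assumptions: the function name
oid_array_sort_and_compact() is hypothetical, the types are simplified
stand-ins for git's, and qsort() with memcmp() stands in for git's own
sorting code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for git's types; NOT the real definitions. */
struct object_id { unsigned char hash[4]; };
struct oid_array { struct object_id *oid; size_t nr; };

static int oideq(const struct object_id *a, const struct object_id *b)
{
	return !memcmp(a->hash, b->hash, sizeof(a->hash));
}

static int oid_cmp(const void *a, const void *b)
{
	const struct object_id *x = a, *y = b;
	return memcmp(x->hash, y->hash, sizeof(x->hash));
}

/* Hypothetical helper: sort, then squeeze out adjacent duplicates. */
static void oid_array_sort_and_compact(struct oid_array *array)
{
	size_t src, dst = 0;

	if (array->nr)
		qsort(array->oid, array->nr, sizeof(*array->oid), oid_cmp);

	/* The extra O(n) pass mentioned above. */
	for (src = 0; src < array->nr; src++) {
		if (src > 0 && oideq(array->oid + src, array->oid + dst - 1))
			continue; /* same as the last oid we kept */
		array->oid[dst++] = array->oid[src];
	}
	array->nr = dst;
}
```

After this runs, a caller would still write
"for (i = 0; i < array->nr; i++)" to visit the entries, so the loop
itself is no shorter than with oid_array_next_unique().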

No new test, since we'll convert the callback iterator (which is covered
by t0064, among other callers) to use the new code.

Signed-off-by: Jeff King <peff@peff.net>
---
 oid-array.c |  7 ++-----
 oid-array.h | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/oid-array.c b/oid-array.c
index 29f718d835..8e1bcedc0c 100644
--- a/oid-array.c
+++ b/oid-array.c
@@ -67,11 +67,8 @@  int oid_array_for_each_unique(struct oid_array *array,
 
 	oid_array_sort(array);
 
-	for (i = 0; i < array->nr; i++) {
-		int ret;
-		if (i > 0 && oideq(array->oid + i, array->oid + i - 1))
-			continue;
-		ret = fn(array->oid + i, data);
+	for (i = 0; i < array->nr; i = oid_array_next_unique(array, i)) {
+		int ret = fn(array->oid + i, data);
 		if (ret)
 			return ret;
 	}
diff --git a/oid-array.h b/oid-array.h
index 6a22c0ac94..72bca78b7d 100644
--- a/oid-array.h
+++ b/oid-array.h
@@ -1,6 +1,8 @@ 
 #ifndef OID_ARRAY_H
 #define OID_ARRAY_H
 
+#include "hash.h"
+
 /**
  * The API provides storage and manipulation of sets of object identifiers.
  * The emphasis is on storage and processing efficiency, making them suitable
@@ -111,4 +113,25 @@  void oid_array_filter(struct oid_array *array,
  */
 void oid_array_sort(struct oid_array *array);
 
+/**
+ * Find the next unique oid in the array after position "cur".
+ * The array must be sorted for this to work. You can iterate
+ * over unique elements like this:
+ *
+ *   size_t i;
+ *   oid_array_sort(array);
+ *   for (i = 0; i < array->nr; i = oid_array_next_unique(array, i))
+ *	printf("%s", oid_to_hex(&array->oid[i]));
+ *
+ * Non-unique iteration can just increment with "i++" to visit each element.
+ */
+static inline size_t oid_array_next_unique(struct oid_array *array, size_t cur)
+{
+	do {
+		cur++;
+	} while (cur < array->nr &&
+		 oideq(array->oid + cur, array->oid + cur - 1));
+	return cur;
+}
+
 #endif /* OID_ARRAY_H */
