diff mbox series

[v2,2/3] for-each-ref: add 'is-base' token

Message ID 13341e7e51241e077a85ea83eb76d4e48d04be7b.1723397687.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Series git for-each-ref: is-base atom and base branches

Commit Message

Derrick Stolee Aug. 11, 2024, 5:34 p.m. UTC
From: Derrick Stolee <stolee@gmail.com>

The previous change introduced the get_branch_base_for_tip() method in
commit-reach.c. The motivation for that change was to use a heuristic to
determine the base branch for a source commit from a list of candidate
commit tips. This change makes that algorithm visible to users via a new
atom in the 'git for-each-ref' format. This change is very similar to the
change in 49abcd21da6 (for-each-ref: add ahead-behind format atom,
2023-03-20).

Introduce the 'is-base:<source>' atom, which requests that the algorithm
be run and reports its result using an indicator of the form
'(<source>)'. For example, using '%(is-base:HEAD)' would result in at most
one line having the token '(HEAD)'.

Use the sorted order of refs included in the ref filter to break ties in the
algorithm's heuristic. In the previous change, the motivating examples
include using an L0 trunk, long-lived L1 branches, and temporary release
branches. A caller could communicate the ordered preference among these
categories using the input refspecs, avoiding the need for a different sort
mechanism. This sorting behavior is tested in the test scripts.
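
As a rough illustration of the heuristic and its tie-breaking rule, the toy
model below represents commits as small integers with a first-parent array.
It is only a sketch under those assumptions: toy_branch_base_for_tip(),
example_first_parent[] and the commit numbering are hypothetical names made
up for this example, and the quadratic walk is not how
get_branch_base_for_tip() in commit-reach.c is implemented.

```c
#include <assert.h>
#include <stddef.h>

#define NO_PARENT -1

/*
 * Hypothetical toy history mirroring the figure in the documentation:
 *   A = 0..5, B = 6..9 (forked from 0), C = 10 (forked from 8),
 *   D = 11..13 (forked from 6), and 14 is an unrelated root.
 * example_first_parent[c] is the first parent of commit c.
 */
static const int example_first_parent[] = {
	NO_PARENT, 0, 1, 2, 3, 4,	/* A: tip is 5 */
	0, 6, 7, 8,			/* B: tip is 9 */
	8,				/* C: tip is 10 */
	6, 11, 12,			/* D: tip is 13 */
	NO_PARENT,			/* unrelated root 14 */
};

/* Return 1 if 'target' is in the first-parent history of 'start'. */
static int in_first_parent_history(const int *first_parent,
				   int start, int target)
{
	for (int c = start; c != NO_PARENT; c = first_parent[c])
		if (c == target)
			return 1;
	return 0;
}

/*
 * For each candidate base, count the commits in the first-parent
 * history of 'tip' that are not in the first-parent history of the
 * base. Return the index of the base with the smallest count, or -1
 * if no base shares first-parent history with 'tip'. Using a strict
 * '<' keeps the earliest base in input order when counts tie.
 */
static int toy_branch_base_for_tip(const int *first_parent, int tip,
				   const int *bases, size_t bases_nr)
{
	int best = -1;
	int best_count = -1;

	assert(tip >= 0);

	for (size_t i = 0; i < bases_nr; i++) {
		int count = 0;
		int c;

		for (c = tip; c != NO_PARENT; c = first_parent[c]) {
			if (in_first_parent_history(first_parent, bases[i], c))
				break;
			count++;
		}
		if (c == NO_PARENT)
			continue; /* histories never intersect */

		if (best < 0 || count < best_count) {
			best = (int)i;
			best_count = count;
		}
	}
	return best;
}
```

With the bases given in the order A, B, C (tips 5, 9, 10) and D (tip 13) as
the source, B and C tie on the count and the earlier one, B, is chosen. If C
is listed before B, C is chosen instead.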

It is important to include this atom as a special case to
can_do_iterative_format() to match the expectations created in bd98f9774e1
(ref-filter.c: filter & format refs in the same callback, 2023-11-14). The
ahead-behind atom was one of the special cases, and this similarly requires
using an algorithm across all input refs before starting the format of any
single ref.

In the test script, the format tokens use colons or lack whitespace to avoid
Git complaining about trailing whitespace errors.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
---
 Documentation/git-for-each-ref.txt | 42 ++++++++++++++++
 ref-filter.c                       | 78 +++++++++++++++++++++++++++++-
 ref-filter.h                       | 15 ++++++
 t/t6600-test-reach.sh              | 47 ++++++++++++++++++
 4 files changed, 181 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Aug. 12, 2024, 9:05 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +is-base:<committish>::
> +	In at most one row, `(<committish>)` will appear to indicate the ref
> +	that is most likely the ref used as a starting point for the branch
> +	that produced `<committish>`. This choice is made using a heuristic:
> +	choose the ref that minimizes the number of commits in the
> +	first-parent history of `<committish>` and not in the first-parent
> +	history of the ref.

Very nicely described.  

Giving the end-user oriented "purpose/meaning" first makes it easier
to understand for readers when they want to use it, and giving the
heuristics to compute the result (and the example) next allows them
to verify that the feature matches what they are looking for.


> @@ -2475,6 +2495,16 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>  				v->s = xstrdup("");
>  			}
>  			continue;
> +		} else if (atom_type == ATOM_ISBASE) {
> +			if (ref->is_base && ref->is_base[is_base_atoms]) {
> +				v->s = xstrfmt("(%s)", ref->is_base[is_base_atoms]);
> +				free(ref->is_base[is_base_atoms]);
> +			} else {
> +				/* Not a commit. */

This is unexpected.  I thought that most of the branches except at
most one that gets annotated with "Yeah, this is forked from branch
B" would take the "else" side.  They are still commits, no?

> +				v->s = xstrdup("");
> +			}
> +			is_base_atoms++;
> +			continue;
>  		} else
>  			continue;
>  
> @@ -2876,6 +2906,7 @@ static void free_array_item(struct ref_array_item *item)
>  		free(item->value);
>  	}
>  	free(item->counts);
> +	free(item->is_base);
>  	free(item);
>  }
>  
> @@ -3040,6 +3071,49 @@ void filter_ahead_behind(struct repository *r,
>  	free(commits);
>  }
>  
> +void filter_is_base(struct repository *r,
> +		    struct ref_format *format,
> +		    struct ref_array *array)
> +{
> +	struct commit **bases;
> +	size_t bases_nr = 0;
> +	struct ref_array_item **back_index;
> +
> +	if (!format->is_base_tips.nr || !array->nr)
> +		return;
> +
> +	CALLOC_ARRAY(back_index, array->nr);
> +	CALLOC_ARRAY(bases, array->nr);
> +
> +	for (size_t i = 0; i < array->nr; i++) {
> +		const char *name = array->items[i]->refname;
> +		struct commit *c = lookup_commit_reference_by_name(name);
> +
> +		CALLOC_ARRAY(array->items[i]->is_base, format->is_base_tips.nr);
> +
> +		if (!c)
> +			continue;

Hmph, wouldn't we want to leave array->items[i]->is_base NULL if
"name" looked up to "c" happens to be non-commit (i.e. NULL)?

> +		back_index[bases_nr] = array->items[i];
> +		bases[bases_nr] = c;
> +		bases_nr++;
> +	}


Thanks.

Derrick Stolee Aug. 13, 2024, 1:44 p.m. UTC | #2
On 8/12/24 5:05 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

>> +		} else if (atom_type == ATOM_ISBASE) {
>> +			if (ref->is_base && ref->is_base[is_base_atoms]) {
>> +				v->s = xstrfmt("(%s)", ref->is_base[is_base_atoms]);
>> +				free(ref->is_base[is_base_atoms]);
>> +			} else {
>> +				/* Not a commit. */
> 
> This is unexpected.  I thought that most of the branches except at
> most one that gets annotated with "Yeah, this is forked from branch
> B" would take the "else" side.  They are still commits, no?

You are correct. This is leftover from copy-pasting the ahead-behind section.
Will remove.

>> +	for (size_t i = 0; i < array->nr; i++) {
>> +		const char *name = array->items[i]->refname;
>> +		struct commit *c = lookup_commit_reference_by_name(name);
>> +
>> +		CALLOC_ARRAY(array->items[i]->is_base, format->is_base_tips.nr);
>> +
>> +		if (!c)
>> +			continue;
> 
> Hmph, wouldn't we want to leave array->items[i]->is_base NULL if
> "name" looked up to "c" happens to be non-commit (i.e. NULL)?

Your comment initially made me second-guess the logic here, but...

>> +		back_index[bases_nr] = array->items[i];
>> +		bases[bases_nr] = c;
>> +		bases_nr++;

This array of "back_index" is intended to allow the array being passed to
get_branch_base_for_tip() to have no gaps with NULL commits. The indices
are then translated back to the original array items when scanning the
results.

This matches the behavior of the ahead-behind code, so it follows an
existing pattern. The alternative would be to allow
get_branch_base_for_tip() to be sensitive to NULL commits in the 'bases'
array. But since we need to create an array of commit pointers (different
from the array of ref items that we start with) this is likely the
simplest approach.
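
The compaction pattern described above can be sketched in isolation. This is
a minimal standalone model, not the ref-filter.c code: struct toy_item,
toy_pick(), toy_mark_base() and example_items[] are hypothetical names, and
toy_pick() is only a stand-in for get_branch_base_for_tip() that returns an
index into the gap-free array it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_item {
	const char *refname;
	int commit_id;		/* -1 models a ref that is not a commit */
	const char *is_base;	/* set on at most one item */
};

/* Hypothetical input: the first ref does not resolve to a commit. */
static struct toy_item example_items[] = {
	{ "refs/tags/blob", -1, NULL },
	{ "refs/heads/a", 3, NULL },
	{ "refs/heads/b", 7, NULL },
};

/*
 * Stand-in chooser: pick the largest id that does not exceed 'tip'.
 * Returns an index into 'bases', or -1 if nothing qualifies.
 */
static int toy_pick(int tip, const int *bases, size_t nr)
{
	int best = -1;

	for (size_t i = 0; i < nr; i++)
		if (bases[i] <= tip && (best < 0 || bases[i] > bases[best]))
			best = (int)i;
	return best;
}

/*
 * Build a gap-free 'bases' array plus a parallel 'back_index' array,
 * run the chooser on the compact array, then translate the returned
 * index back to the original item. Returns the marked item or NULL.
 */
static struct toy_item *toy_mark_base(struct toy_item *items, size_t nr,
				      int tip, const char *label)
{
	struct toy_item **back_index;
	struct toy_item *result = NULL;
	size_t bases_nr = 0;
	int *bases;
	int picked;

	if (!nr)
		return NULL;

	bases = calloc(nr, sizeof(*bases));
	back_index = calloc(nr, sizeof(*back_index));
	if (!bases || !back_index)
		goto cleanup;

	for (size_t i = 0; i < nr; i++) {
		if (items[i].commit_id < 0)
			continue; /* skip non-commits, leaving no gap */
		back_index[bases_nr] = &items[i];
		bases[bases_nr] = items[i].commit_id;
		bases_nr++;
	}
	assert(bases_nr <= nr);

	picked = toy_pick(tip, bases, bases_nr);
	if (picked >= 0) {
		result = back_index[picked];
		result->is_base = label;
	}

cleanup:
	free(back_index);
	free(bases);
	return result;
}
```

Here the chooser sees only the two commit ids, so an index of 1 in the
compact array maps back to the third original item, and the non-commit item
is never touched.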

You did inspire me to double-check that this code works in the presence
of non-commit refs, so I'll update some things and send a v3 with a new
test. It will also include some things to make error messages quieter
for that case.

Thanks,
-Stolee

Patch

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index c1dd12b93cf..d3764401a23 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -264,6 +264,48 @@  ahead-behind:<committish>::
 	commits ahead and behind, respectively, when comparing the output
 	ref to the `<committish>` specified in the format.
 
+is-base:<committish>::
+	In at most one row, `(<committish>)` will appear to indicate the ref
+	that is most likely the ref used as a starting point for the branch
+	that produced `<committish>`. This choice is made using a heuristic:
+	choose the ref that minimizes the number of commits in the
+	first-parent history of `<committish>` and not in the first-parent
+	history of the ref.
++
+For example, consider the following figure of first-parent histories of
+several refs:
++
+----
+*--*--*--*--*--* refs/heads/A
+\
+ \
+  *--*--*--* refs/heads/B
+   \     \
+    \     \
+     *     * refs/heads/C
+      \
+       \
+	*--* refs/heads/D
+----
++
+Here, if `A`, `B`, and `C` are the filtered references, and the format
+string is `%(refname):%(is-base:D)`, then the output would be
++
+----
+refs/heads/A:
+refs/heads/B:(D)
+refs/heads/C:
+----
++
+This is because the first-parent history of `D` has its earliest
+intersection with the first-parent histories of the filtered refs at a
+common first-parent ancestor of `B` and `C` and ties are broken by the
+earliest ref in the sorted order.
++
+Note that this token will not appear if the first-parent history of
+`<committish>` does not intersect the first-parent histories of the
+filtered refs.
+
 describe[:options]::
 	A human-readable name, like linkgit:git-describe[1];
 	empty string for undescribable commits. The `describe` string may
diff --git a/ref-filter.c b/ref-filter.c
index 59ad6f54ddb..59689672da1 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -167,6 +167,7 @@  enum atom_type {
 	ATOM_ELSE,
 	ATOM_REST,
 	ATOM_AHEADBEHIND,
+	ATOM_ISBASE,
 };
 
 /*
@@ -889,6 +890,23 @@  static int ahead_behind_atom_parser(struct ref_format *format,
 	return 0;
 }
 
+static int is_base_atom_parser(struct ref_format *format,
+			       struct used_atom *atom UNUSED,
+			       const char *arg, struct strbuf *err)
+{
+	struct string_list_item *item;
+
+	if (!arg)
+		return strbuf_addf_ret(err, -1, _("expected format: %%(is-base:<committish>)"));
+
+	item = string_list_append(&format->is_base_tips, arg);
+	item->util = lookup_commit_reference_by_name(arg);
+	if (!item->util)
+		die("failed to find '%s'", arg);
+
+	return 0;
+}
+
 static int head_atom_parser(struct ref_format *format UNUSED,
 			    struct used_atom *atom,
 			    const char *arg, struct strbuf *err)
@@ -952,6 +970,7 @@  static struct {
 	[ATOM_ELSE] = { "else", SOURCE_NONE },
 	[ATOM_REST] = { "rest", SOURCE_NONE, FIELD_STR, rest_atom_parser },
 	[ATOM_AHEADBEHIND] = { "ahead-behind", SOURCE_OTHER, FIELD_STR, ahead_behind_atom_parser },
+	[ATOM_ISBASE] = { "is-base", SOURCE_OTHER, FIELD_STR, is_base_atom_parser },
 	/*
 	 * Please update $__git_ref_fieldlist in git-completion.bash
 	 * when you add new atoms
@@ -2334,6 +2353,7 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 	int i;
 	struct object_info empty = OBJECT_INFO_INIT;
 	int ahead_behind_atoms = 0;
+	int is_base_atoms = 0;
 
 	CALLOC_ARRAY(ref->value, used_atom_cnt);
 
@@ -2475,6 +2495,16 @@  static int populate_value(struct ref_array_item *ref, struct strbuf *err)
 				v->s = xstrdup("");
 			}
 			continue;
+		} else if (atom_type == ATOM_ISBASE) {
+			if (ref->is_base && ref->is_base[is_base_atoms]) {
+				v->s = xstrfmt("(%s)", ref->is_base[is_base_atoms]);
+				free(ref->is_base[is_base_atoms]);
+			} else {
+				/* Not a commit. */
+				v->s = xstrdup("");
+			}
+			is_base_atoms++;
+			continue;
 		} else
 			continue;
 
@@ -2876,6 +2906,7 @@  static void free_array_item(struct ref_array_item *item)
 		free(item->value);
 	}
 	free(item->counts);
+	free(item->is_base);
 	free(item);
 }
 
@@ -3040,6 +3071,49 @@  void filter_ahead_behind(struct repository *r,
 	free(commits);
 }
 
+void filter_is_base(struct repository *r,
+		    struct ref_format *format,
+		    struct ref_array *array)
+{
+	struct commit **bases;
+	size_t bases_nr = 0;
+	struct ref_array_item **back_index;
+
+	if (!format->is_base_tips.nr || !array->nr)
+		return;
+
+	CALLOC_ARRAY(back_index, array->nr);
+	CALLOC_ARRAY(bases, array->nr);
+
+	for (size_t i = 0; i < array->nr; i++) {
+		const char *name = array->items[i]->refname;
+		struct commit *c = lookup_commit_reference_by_name(name);
+
+		CALLOC_ARRAY(array->items[i]->is_base, format->is_base_tips.nr);
+
+		if (!c)
+			continue;
+
+		back_index[bases_nr] = array->items[i];
+		bases[bases_nr] = c;
+		bases_nr++;
+	}
+
+	for (size_t i = 0; i < format->is_base_tips.nr; i++) {
+		struct commit *tip = format->is_base_tips.items[i].util;
+		int base_index = get_branch_base_for_tip(r, tip, bases, bases_nr);
+
+		if (base_index < 0)
+			continue;
+
+		/* Store the string for use in output later. */
+		back_index[base_index]->is_base[i] = xstrdup(format->is_base_tips.items[i].string);
+	}
+
+	free(back_index);
+	free(bases);
+}
+
 static int do_filter_refs(struct ref_filter *filter, unsigned int type, each_ref_fn fn, void *cb_data)
 {
 	int ret = 0;
@@ -3126,7 +3200,8 @@  static inline int can_do_iterative_format(struct ref_filter *filter,
 	return !(filter->reachable_from ||
 		 filter->unreachable_from ||
 		 sorting ||
-		 format->bases.nr);
+		 format->bases.nr ||
+		 format->is_base_tips.nr);
 }
 
 void filter_and_format_refs(struct ref_filter *filter, unsigned int type,
@@ -3150,6 +3225,7 @@  void filter_and_format_refs(struct ref_filter *filter, unsigned int type,
 		struct ref_array array = { 0 };
 		filter_refs(&array, filter, type);
 		filter_ahead_behind(the_repository, format, &array);
+		filter_is_base(the_repository, format, &array);
 		ref_array_sort(sorting, &array);
 		print_formatted_ref_array(&array, format);
 		ref_array_clear(&array);
diff --git a/ref-filter.h b/ref-filter.h
index 0ca28d2bba6..20419a56218 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -48,6 +48,7 @@  struct ref_array_item {
 	struct commit *commit;
 	struct atom_value *value;
 	struct ahead_behind_count **counts;
+	char **is_base;
 
 	char refname[FLEX_ARRAY];
 };
@@ -101,6 +102,9 @@  struct ref_format {
 	/* List of bases for ahead-behind counts. */
 	struct string_list bases;
 
+	/* List of bases for is-base indicators. */
+	struct string_list is_base_tips;
+
 	struct {
 		int max_count;
 		int omit_empty;
@@ -114,6 +118,7 @@  struct ref_format {
 #define REF_FORMAT_INIT {             \
 	.use_color = -1,              \
 	.bases = STRING_LIST_INIT_DUP, \
+	.is_base_tips = STRING_LIST_INIT_DUP, \
 }
 
 /*  Macros for checking --merged and --no-merged options */
@@ -203,6 +208,16 @@  void filter_ahead_behind(struct repository *r,
 			 struct ref_format *format,
 			 struct ref_array *array);
 
+/*
+ * If the provided format includes is-base atoms, then compute the base checks
+ * for those tips against all refs.
+ *
+ * If this is not called, then any is-base atoms will be blank.
+ */
+void filter_is_base(struct repository *r,
+		    struct ref_format *format,
+		    struct ref_array *array);
+
 void ref_filter_init(struct ref_filter *filter);
 void ref_filter_clear(struct ref_filter *filter);
 
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 3069efc8601..6c7f92bcb38 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -659,4 +659,51 @@  test_expect_success 'get_branch_base_for_tip: all reach tip' '
 	test_all_modes get_branch_base_for_tip
 '
 
+test_expect_success 'for-each-ref is-base: none reach' '
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-4-2
+	refs/heads/commit-4-4
+	refs/heads/commit-8-4
+	EOF
+	cat >expect <<-\EOF &&
+	refs/heads/commit-1-1:
+	refs/heads/commit-4-2:(commit-2-3)
+	refs/heads/commit-4-4:
+	refs/heads/commit-8-4:
+	EOF
+	run_all_modes git for-each-ref \
+		--format="%(refname):%(is-base:commit-2-3)" --stdin
+'
+
+test_expect_success 'for-each-ref is-base: all reach' '
+	cat >input <<-\EOF &&
+	refs/heads/commit-4-2
+	refs/heads/commit-5-1
+	EOF
+	cat >expect <<-\EOF &&
+	refs/heads/commit-4-2:(commit-4-1)
+	refs/heads/commit-5-1:
+	EOF
+	run_all_modes git for-each-ref \
+		--format="%(refname):%(is-base:commit-4-1)" --stdin
+'
+
+test_expect_success 'for-each-ref is-base:multiple' '
+	cat >input <<-\EOF &&
+	refs/heads/commit-1-1
+	refs/heads/commit-4-2
+	refs/heads/commit-4-4
+	refs/heads/commit-8-4
+	EOF
+	cat >expect <<-\EOF &&
+	refs/heads/commit-1-1[-]
+	refs/heads/commit-4-2[(commit-2-3)-]
+	refs/heads/commit-4-4[-]
+	refs/heads/commit-8-4[-(commit-6-5)]
+	EOF
+	run_all_modes git for-each-ref \
+		--format="%(refname)[%(is-base:commit-2-3)-%(is-base:commit-6-5)]" --stdin
+'
+
 test_done