[v7,1/1] Implement rev-list --bisect* --first-parent
diff mbox series

Message ID 20191105052141.15913-2-workingjubilee@gmail.com
State New
Headers show
Series
  • Revenge of --bisect --first-parent
Related show

Commit Message

Jubilee Young Nov. 5, 2019, 5:21 a.m. UTC
From: Jubilee Young <workingjubilee@gmail.com>

Not all repository maintainers expect every commit to pass tests, only
testing the merge commits. Currently bisection assumes every commit is
of interest. The highly-requested --bisect --first-parent feature imbues
git with the same indifference to minutiae when the option is set, so
that it casually riffles through commits, throwing aside mountains of
irrelevant data when looking for a breaking change. Further refinement
of where breaks occurred can be gained by bisecting over the merge's
range.

Note, bisecting on --first-parent becomes part of findall's previously
existing pass-through as an "option state" flag.

In order to limit possible obfuscation of bisect operations resulting
from the addition of new flags, some extra documentation was folded in
to the patch.

Signed-off-by: Jubilee Young <workingjubilee@gmail.com>
Based-on-patch-by: Tiago Botelho <tiagonbotelho@hotmail.com>
Based-on-patch-by: Harald Nordgren <haraldnordgren@gmail.com>
---
 bisect.c                   | 56 ++++++++++++++++++++++++++------------
 bisect.h                   | 17 ++++++++++--
 builtin/rev-list.c         |  9 ++++--
 revision.c                 |  3 --
 t/t6000-rev-list-misc.sh   |  4 +--
 t/t6002-rev-list-bisect.sh | 54 ++++++++++++++++++++++++++++++++++++
 6 files changed, 116 insertions(+), 27 deletions(-)

Comments

Jonathan Tan Nov. 5, 2019, 11:04 p.m. UTC | #1
First of all, thanks for taking a year-old patch and updating it.
Overall, this looks good. I have some minor comments, but it might be
best to wait until someone more experienced with bisect.c takes a look
too.

Your commit message title should be of the form "<component>: <change>",
e.g.:

  rev-list: support --first-parent with --bisect*

> Not all repository maintainers expect every commit to pass tests, only
> testing the merge commits. Currently bisection assumes every commit is
> of interest. The highly-requested --bisect --first-parent feature imbues
> git with the same indifference to minutiae when the option is set, so
> that it casually riffles through commits, throwing aside mountains of
> irrelevant data when looking for a breaking change. Further refinement
> of where breaks occurred can be gained by bisecting over the merge's
> range.

I would be much more laconic (in particular, omitting subjective terms
like "minutiae" and "mountains of irrelevant data"), but perhaps that is
just a matter of subjective style.

> Note, bisecting on --first-parent becomes part of findall's previously
> existing pass-through as an "option state" flag.

I don't understand this part.

> In order to limit possible obfuscation of bisect operations resulting
> from the addition of new flags, some extra documentation was folded in
> to the patch.

What is being obfuscated, and what extra documentation was "folded"?

Also, clarify in the commit message somewhere that this commit does not
change the behavior of "git bisect".

As for the diff, besides my comments below, a change in the user-facing
documentation of "rev-list" is needed, since --bisect and --first-parent
now work together.

> -static int count_interesting_parents(struct commit *commit)
> +static int count_interesting_parents(struct commit *commit, unsigned bisect_flags)
>  {
>  	struct commit_list *p;
>  	int count;
>  
>  	for (count = 0, p = commit->parents; p; p = p->next) {
> -		if (p->item->object.flags & UNINTERESTING)
> -			continue;
> -		count++;
> +		if (!(p->item->object.flags & UNINTERESTING))
> +			count++;
> +		if (bisect_flags & BISECT_FIRST_PARENT)
> +			break;
>  	}
>  	return count;
>  }

(Note that I'm writing my thoughts as I go along to aid future
reviewers, and to show the author (you) how I'm understanding the
patch.)

We only take into account the first parent - straightforward enough.
I'll have to see how this function is used to ensure that this change is
correct.

>  static void show_list(const char *debug, int counted, int nr,
> -		      struct commit_list *list)
> +		      struct commit_list *list, unsigned bisect_flags)
>  {
>  	struct commit_list *p;

What is the purpose of this change? bisect_flags is never used anywhere
in show_list().

> @@ -271,13 +274,13 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
>  		unsigned flags = commit->object.flags;
>  
>  		*commit_weight_at(&commit_weight, p->item) = &weights[n++];
> -		switch (count_interesting_parents(commit)) {
> +		switch (count_interesting_parents(commit, bisect_flags)) {
>  		case 0:

[snip]

Not shown, but this is an iteration over all commits in the parameter
"list". I'll quote the entire loop:

>         for (n = 0, p = list; p; p = p->next) {
>                 struct commit *commit = p->item;
>                 unsigned flags = commit->object.flags;
> 
>                 *commit_weight_at(&commit_weight, p->item) = &weights[n++];
>                 switch (count_interesting_parents(commit, bisect_flags)) {
>                 case 0:
>                         if (!(flags & TREESAME)) {
>                                 weight_set(p, 1);
>                                 counted++;
>                                 show_list("bisection 2 count one",
>                                           counted, nr, list, bisect_flags);
>                         }
>                         /*
>                          * otherwise, it is known not to reach any
>                          * tree-changing commit and gets weight 0.
>                          */
>                         break;
>                 case 1:
>                         weight_set(p, -1);
>                         break;
>                 default:
>                         weight_set(p, -2);
>                         break;
>                 }
>         }

Looks like count_interesting_parents() is correct. After this loop, all
commits have weight 1, 0 (weights is xcalloc'ed by the caller), -1, or
-2 (impossible when BISECT_FIRST_PARENT). We'll have to observe how
subsequent code treats the weights 1, 0, and -1.

> -	show_list("bisection 2 count_distance", counted, nr, list);
> +	/* should match bisection 2 initialize when BISECT_FIRST_PARENT */
> +	show_list("bisection 2 count_distance", counted, nr, list, bisect_flags);

Besides the review comment I made about show_list() earlier, this /* */
comment is also unnecessary.

> @@ -333,6 +337,17 @@ static struct commit_list *do_find_bisection(struct commit_list *list,
>  			if (0 <= weight(p))
>  				continue;
>  			for (q = p->item->parents; q; q = q->next) {
> +				/*
> +				 * first_parent can skip parent nodes, but only when
> +				 * confirmed as being of no interest.
> +				 */
> +				if (first_parent) {
> +					if ((q->item->object.flags & UNINTERESTING) ||
> +						(weight(q) < 0)) {
> +						q = NULL;
> +					}
> +					break;
> +				}
>  				if (q->item->object.flags & UNINTERESTING)
>  					continue;
>  				if (0 <= weight(q))

This is also in a loop. As can be seen at the top of the diff ("if (0 <=
weight(p))"), this only operates on commits with negative weights.

Originally, the inner loop advances until a non-UNINTERESTING parent
with a non-negative weight. If no such parent is found, at the end of
the loop, q is NULL. The added code effectively replicates what's going
on, but ignoring any parents after the first.

A previous reviewer [1] wanted an explanation for this part, so thanks
for attempting to do that. But I don't understand the explanation -
firstly, it is not a question of "can" (optional) but of "will"
(mandatory), and it is not only UNINTERESTING that determines skipping,
but weight as well.

I would write the entire section like this (remember to wrap the lines):

	if (first_parent) {
		q = p->item->parents;
		if (q && ((q->item->object.flags & UNINTERESTING) || weight(q) < 0))
			q = NULL;
	} else {
		/*
		 * Find an interesting parent with non-negative weight.
		 */
		for (...) {
		}
	}

Looking at the rest of do_find_bisection():

- I don't see any other parts that would be affected by only calculating
  weights based on the first parent, so that's fine.

- There are some early returns that assume that "list" is generated by
  iterating only over first parents. do_find_bisection() is called only
  by find_bisection(), and the latter is called only by cmd_rev_list()
  and bisect_next_all(). The former is fine, but I will discuss the
  latter later.

- I do see some unclear parts (in particular, counter might not reach nr
  if any of the weights are 0 and if the "weight_set(p, weight(q));"
  line is reached, potentially resulting in an infinite loop) but that
  is unrelated to this patch, so don't worry about it.

[1] https://public-inbox.org/git/nycvar.QRO.7.76.6.1808281512240.73@tvgsbejvaqbjf.bet/

> @@ -964,7 +981,12 @@ int bisect_next_all(struct repository *r, const char *prefix, int no_checkout)
>  
>  	bisect_common(&revs);
>  
> -	find_bisection(&revs.commits, &reaches, &all, !!skipped_revs.nr);
> +	if (skipped_revs.nr)
> +		bisect_flags |= BISECT_FIND_ALL;
> +	if (revs.first_parent_only)
> +		bisect_flags |= BISECT_FIRST_PARENT;
> +
> +	find_bisection(&revs.commits, &reaches, &all, bisect_flags);
>  	revs.commits = managed_skipped(revs.commits, &tried);
>  
>  	if (!revs.commits) {

I don't see how revs.first_parent_only is ever set in this function. If
it's never set, undo this change, since this code is never executed.

> +/*
> + * Coordinates a bisection by examining input made available so far,
> + * setting up internal variables, and then bisecting with them.
> + * no_checkout directs this to only update BISECT_HEAD refs.
> + *
> + * Exit code 10 on successful bisection, so caller should exit with 0.
> + * Exit code 4 when no commits were found to bisect through.
> + * Exit code 1 MAY result from skipping the commit it would report.
> + *
> + * Otherwise, returns a call to command handlers which choose an exit.
> + */
>  int bisect_next_all(struct repository *r,
>  		    const char *prefix,
>  		    int no_checkout);

I haven't looked at this function in detail, so I can't verify that this
comment is correct. But in any case, if you end up not making any change
to bisect_next_all(), I think it's best to leave this change out as
well.

> diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
> index b8cf82349b..95949e4ff1 100755
> --- a/t/t6000-rev-list-misc.sh
> +++ b/t/t6000-rev-list-misc.sh
> @@ -122,8 +122,8 @@ test_expect_success 'rev-list can negate index objects' '
>  	test_cmp expect actual
>  '
>  
> -test_expect_success '--bisect and --first-parent can not be combined' '
> -	test_must_fail git rev-list --bisect --first-parent HEAD
> +test_expect_success '--bisect and --first-parent CAN be combined' '
> +	git rev-list --bisect --first-parent HEAD
>  '
>  

I think this test can just be deleted. It is tested in t6002.

> diff --git a/t/t6002-rev-list-bisect.sh b/t/t6002-rev-list-bisect.sh
> index a661408038..6caf2af650 100755
> --- a/t/t6002-rev-list-bisect.sh
> +++ b/t/t6002-rev-list-bisect.sh
> @@ -263,4 +263,58 @@ test_expect_success 'rev-parse --bisect can default to good/bad refs' '
>  	test_cmp expect.sorted actual.sorted
>  '
>  
> +# --first-parent tests
> +
> +# --bisect --first-parent should pluck out the middle.
> +printf "%s\n" e4 |
> +test_output_expect_success "--bisect --first-parent" '
> +	git rev-list --bisect --first-parent E ^F
> +'
> +
> +printf "%s\n" E e1 e2 e3 e4 e5 e6 e7 e8 |
> +test_output_expect_success "--first-parent" '
> +	git rev-list --first-parent E ^F
> +'
> +
> +test_output_expect_success '--bisect-vars --first-parent' '
> +	git rev-list --bisect-vars --first-parent E ^F
> +' <<-EOF
> +	bisect_rev='e5'
> +	bisect_nr=4
> +	bisect_good=4
> +	bisect_bad=3
> +	bisect_all=9
> +	bisect_steps=2
> +EOF

Looks good, except for the middle test - that should already be working
even before the current patch, right? If there's a reason for including
it anyway, mention it in the commit message.

> +test_expect_success '--bisect-all --first-parent returns correct order' '
> +	git rev-list --bisect-all --first-parent E ^F >actual &&
> +
> +	# Make sure the entries are sorted in the dist order
> +	sed -e "s/.*dist=\([0-9]\).*/\1/" actual >actual.dists &&
> +	sort -r -n actual.dists >actual.dists.sorted &&
> +	test_cmp actual.dists.sorted actual.dists
> +'
> +
> +# NEEDSWORK: this test could afford being hardened against other
> +# changes in the same file.
> +test_expect_success '--bisect-all --first-parent compares correctly' '
> +	cat >expect <<-EOF &&
> +	$(git rev-parse tags/e5) (tag: e5, dist=4)
> +	$(git rev-parse tags/e4) (tag: e4, dist=4)
> +	$(git rev-parse tags/e6) (tag: e6, dist=3)
> +	$(git rev-parse tags/e3) (tag: e3, dist=3)
> +	$(git rev-parse tags/e7) (tag: e7, dist=2)
> +	$(git rev-parse tags/e2) (tag: e2, dist=2)
> +	$(git rev-parse tags/e8) (tag: e8, dist=1)
> +	$(git rev-parse tags/e1) (tag: e1, dist=1)
> +	$(git rev-parse tags/E) (tag: E, dist=0)
> +EOF
> +
> +git rev-list --bisect-all --first-parent E ^F >actual &&
> +	sort actual >actual.sorted &&
> +	sort expect >expect.sorted &&
> +	test_cmp expect.sorted actual.sorted
> +'

I think these 2 tests can be combined, since the latter also checks the
dists. Also, correct the indentation of the latter test.
Junio C Hamano Nov. 6, 2019, 2:42 a.m. UTC | #2
Jonathan Tan <jonathantanmy@google.com> writes:

> Your commit message title should be of the form "<component>: <change>",
> e.g.:
>
>   rev-list: support --first-parent with --bisect*

Good suggestion.

> I would be much more laconic (in particular, omitting subjective terms
> like "minutiae" and "mountains of irrelevant data"), but perhaps that is
> just a matter of subjective style.

FWIW, I had the same reaction.  That part of the message was too
noisy without adding much actual value.

>> Note, bisecting on --first-parent becomes part of findall's previously
>> existing pass-through as an "option state" flag.
>
> I don't understand this part.

Me neither.

> Also, clarify in the commit message somewhere that this commit does not
> change the behavior of "git bisect".

s/\.$/ when used without the "--first-parent" option&/; you mean?

> As for the diff, besides my comments below, a change in the user-facing
> documentation of "rev-list" is needed, since --bisect and --first-parent
> now work together.

True.  I too am, like you are, happy to see that these two options
made to work well together.

Thanks, both, for the patch and useful comments.  My own review on
it may take a bit more time.
Johannes Schindelin Nov. 6, 2019, 11:30 a.m. UTC | #3
Hi,

On Wed, 6 Nov 2019, Junio C Hamano wrote:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
> > Your commit message title should be of the form "<component>: <change>",
> > e.g.:
> >
> >   rev-list: support --first-parent with --bisect*
>
> Good suggestion.
>
> > I would be much more laconic (in particular, omitting subjective terms
> > like "minutiae" and "mountains of irrelevant data"), but perhaps that is
> > just a matter of subjective style.
>
> FWIW, I had the same reaction.  That part of the message was too
> noisy without adding much actual value.
>
> >> Note, bisecting on --first-parent becomes part of findall's previously
> >> existing pass-through as an "option state" flag.
> >
> > I don't understand this part.
>
> Me neither.
>
> > Also, clarify in the commit message somewhere that this commit does not
> > change the behavior of "git bisect".
>
> s/\.$/ when used without the "--first-parent" option&/; you mean?
>
> > As for the diff, besides my comments below, a change in the user-facing
> > documentation of "rev-list" is needed, since --bisect and --first-parent
> > now work together.
>
> True.  I too am, like you are, happy to see that these two options
> made to work well together.
>
> Thanks, both, for the patch and useful comments.  My own review on
> it may take a bit more time.

While I sadly won't have time to review this patch, let me first state
that I am very excited that you revived this stalled patch. Thank you!

In addition to Jonathan's comments, I would like to add another one: I
would have loved for this patch to be split in two, the first one
introducing the `bisect_flags` and using it with `BISECT_FIND_ALL`
instead of doing the `find_all` thing, the second one building the
first-parent feature on top.

Thanks,
Dscho
Jonathan Tan Nov. 6, 2019, 9:36 p.m. UTC | #4
> > Also, clarify in the commit message somewhere that this commit does not
> > change the behavior of "git bisect".
> 
> s/\.$/ when used without the "--first-parent" option&/; you mean?

As far as I know, "git bisect" doesn't support --first-parent, whether
before or after this patch.

At first I thought that this patch also teaches "git bisect" to support
"--first-parent", but that is not the case. Only "git rev-list" learns
to make "--bisect" work with "--first-parent". So I wanted the
clarification. (But if you think that the clarification is unnecessary,
that's fine too.)
Junio C Hamano Nov. 7, 2019, 3:35 a.m. UTC | #5
Jonathan Tan <jonathantanmy@google.com> writes:

>> > Also, clarify in the commit message somewhere that this commit does not
>> > change the behavior of "git bisect".
>> 
>> s/\.$/ when used without the "--first-parent" option&/; you mean?
>
> As far as I know, "git bisect" doesn't support --first-parent, whether
> before or after this patch.
>
> At first I thought that this patch also teaches "git bisect" to support
> "--first-parent", but that is not the case.

Ah, I was not paying enough attention.  So this is only the first
part of the whole solution that does not have much end-user visible
benefit yet.

With "yet" at the end of what you wrote, I do agree with you that
such a clarification would be necessary.

Thanks.
Jubilee Young Nov. 7, 2019, 5:07 a.m. UTC | #6
Hi! Thanks for all the feedback.

On Wed, Nov 6, 2019 at 1:36 PM Jonathan Tan <jonathantanmy@google.com>
wrote:
>
> > > Also, clarify in the commit message somewhere that this commit does
not
> > > change the behavior of "git bisect".
> >
> > s/\.$/ when used without the "--first-parent" option&/; you mean?
>
> As far as I know, "git bisect" doesn't support --first-parent, whether
> before or after this patch.

Correct, it still provides the usual "usage: git bisect ..." error.
This provides plumbing but no porcelain right now.

> At first I thought that this patch also teaches "git bisect" to support
> "--first-parent", but that is not the case. Only "git rev-list" learns
> to make "--bisect" work with "--first-parent". So I wanted the
> clarification. (But if you think that the clarification is unnecessary,
> that's fine too.)

I had been trying to deduce the best approach for writing the commit
message but in the end I retained a hunch that I didn't yet know enough
to write the best one so this conversation is quite illuminating re:
commit style.

If it's even slightly confusing I would like to clarify, since bisect.c
and git bisect are... very different!

I think Dscho's recommendation also will result in the patch overall
being far more lucid.
Jubilee Young Nov. 7, 2019, 4 p.m. UTC | #7
On Tue, Nov 5, 2019 at 3:04 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> First of all, thanks for taking a year-old patch and updating it.
> Overall, this looks good. I have some minor comments, but it might be
> best to wait until someone more experienced with bisect.c takes a look
> too.
>
> Your commit message title should be of the form "<component>: <change>",
> e.g.:
>
>   rev-list: support --first-parent with --bisect*
>
> > Not all repository maintainers expect every commit to pass tests, only
> > testing the merge commits. Currently bisection assumes every commit is
> > of interest. The highly-requested --bisect --first-parent feature imbues
> > git with the same indifference to minutiae when the option is set, so
> > that it casually riffles through commits, throwing aside mountains of
> > irrelevant data when looking for a breaking change. Further refinement
> > of where breaks occurred can be gained by bisecting over the merge's
> > range.
>
> I would be much more laconic (in particular, omitting subjective terms
> like "minutiae" and "mountains of irrelevant data"), but perhaps that is
> just a matter of subjective style.
>
> > Note, bisecting on --first-parent becomes part of findall's previously
> > existing pass-through as an "option state" flag.
>
> I don't understand this part.
>
> > In order to limit possible obfuscation of bisect operations resulting
> > from the addition of new flags, some extra documentation was folded in
> > to the patch.
>
> What is being obfuscated, and what extra documentation was "folded"?
>
> Also, clarify in the commit message somewhere that this commit does not
> change the behavior of "git bisect".
>
> As for the diff, besides my comments below, a change in the user-facing
> documentation of "rev-list" is needed, since --bisect and --first-parent
> now work together.

Will do.

>
> > -static int count_interesting_parents(struct commit *commit)
> > +static int count_interesting_parents(struct commit *commit, unsigned bisect_flags)
> >  {
> >       struct commit_list *p;
> >       int count;
> >
> >       for (count = 0, p = commit->parents; p; p = p->next) {
> > -             if (p->item->object.flags & UNINTERESTING)
> > -                     continue;
> > -             count++;
> > +             if (!(p->item->object.flags & UNINTERESTING))
> > +                     count++;
> > +             if (bisect_flags & BISECT_FIRST_PARENT)
> > +                     break;
> >       }
> >       return count;
> >  }
>
> (Note that I'm writing my thoughts as I go along to aid future
> reviewers, and to show the author (you) how I'm understanding the
> patch.)
>
> We only take into account the first parent - straightforward enough.
> I'll have to see how this function is used to ensure that this change is
> correct.
>
> >  static void show_list(const char *debug, int counted, int nr,
> > -                   struct commit_list *list)
> > +                   struct commit_list *list, unsigned bisect_flags)
> >  {
> >       struct commit_list *p;
>
> What is the purpose of this change? bisect_flags is never used anywhere
> in show_list().

Insufficiently cleaned-up change cruft! Thanks for catching it.

> This is also in a loop. As can be seen at the top of the diff ("if (0 <=
> weight(p))"), this only operates on commits with negative weights.
>
> Originally, the inner loop advances until a non-UNINTERESTING parent
> with a non-negative weight. If no such parent is found, at the end of
> the loop, q is NULL. The added code effectively replicates what's going
> on, but ignoring any parents after the first.
>
> A previous reviewer [1] wanted an explanation for this part, so thanks
> for attempting to do that. But I don't understand the explanation -
> firstly, it is not a question of "can" (optional) but of "will"
> (mandatory), and it is not only UNINTERESTING that determines skipping,
> but weight as well.
>
> I would write the entire section like this (remember to wrap the lines):
>
>         if (first_parent) {
>                 q = p->item->parents;
>                 if (q && ((q->item->object.flags & UNINTERESTING) || weight(q) < 0))
>                         q = NULL;
>         } else {
>                 /*
>                  * Find an interesting parent with non-negative weight.
>                  */
>                 for (...) {
>                 }
>         }
>

"uninteresting" was meant in the colloquial sense rather than the
CONSTANT, but fair, it's probably just confusing.

> Looking at the rest of do_find_bisection():
>
> - I don't see any other parts that would be affected by only calculating
>   weights based on the first parent, so that's fine.
>
> - There are some early returns that assume that "list" is generated by
>   iterating only over first parents. do_find_bisection() is called only
>   by find_bisection(), and the latter is called only by cmd_rev_list()
>   and bisect_next_all(). The former is fine, but I will discuss the
>   latter later.
>
> - I do see some unclear parts (in particular, counter might not reach nr
>   if any of the weights are 0 and if the "weight_set(p, weight(q));"
>   line is reached, potentially resulting in an infinite loop) but that
>   is unrelated to this patch, so don't worry about it.
>
> [1] https://public-inbox.org/git/nycvar.QRO.7.76.6.1808281512240.73@tvgsbejvaqbjf.bet/
>
> > @@ -964,7 +981,12 @@ int bisect_next_all(struct repository *r, const char *prefix, int no_checkout)
> >
> >       bisect_common(&revs);
> >
> > -     find_bisection(&revs.commits, &reaches, &all, !!skipped_revs.nr);
> > +     if (skipped_revs.nr)
> > +             bisect_flags |= BISECT_FIND_ALL;
> > +     if (revs.first_parent_only)
> > +             bisect_flags |= BISECT_FIRST_PARENT;
> > +
> > +     find_bisection(&revs.commits, &reaches, &all, bisect_flags);
> >       revs.commits = managed_skipped(revs.commits, &tried);
> >
> >       if (!revs.commits) {
>
> I don't see how revs.first_parent_only is ever set in this function. If
> it's never set, undo this change, since this code is never executed.

In this function, we call bisect_rev_setup() using the revs struct we
made, which then calls setup_revisions() on the revs, which appears to
call handle_revision_opt() with that struct again,which finally is
allowed to set revs->first_parent_only = 1; in revision.c.

So unless I am horribly misreading something, we do set it.

> > diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
> > index b8cf82349b..95949e4ff1 100755
> > --- a/t/t6000-rev-list-misc.sh
> > +++ b/t/t6000-rev-list-misc.sh
> > @@ -122,8 +122,8 @@ test_expect_success 'rev-list can negate index objects' '
> >       test_cmp expect actual
> >  '
> >
> > -test_expect_success '--bisect and --first-parent can not be combined' '
> > -     test_must_fail git rev-list --bisect --first-parent HEAD
> > +test_expect_success '--bisect and --first-parent CAN be combined' '
> > +     git rev-list --bisect --first-parent HEAD
> >  '
> >
>
> I think this test can just be deleted. It is tested in t6002.

Sure, I'll drop it.

> > diff --git a/t/t6002-rev-list-bisect.sh b/t/t6002-rev-list-bisect.sh
> > index a661408038..6caf2af650 100755
> > --- a/t/t6002-rev-list-bisect.sh
> > +++ b/t/t6002-rev-list-bisect.sh
> > @@ -263,4 +263,58 @@ test_expect_success 'rev-parse --bisect can default to good/bad refs' '
> >       test_cmp expect.sorted actual.sorted
> >  '
> >
> > +# --first-parent tests
> > +
> > +# --bisect --first-parent should pluck out the middle.
> > +printf "%s\n" e4 |
> > +test_output_expect_success "--bisect --first-parent" '
> > +     git rev-list --bisect --first-parent E ^F
> > +'
> > +
> > +printf "%s\n" E e1 e2 e3 e4 e5 e6 e7 e8 |
> > +test_output_expect_success "--first-parent" '
> > +     git rev-list --first-parent E ^F
> > +'
> > +
> > +test_output_expect_success '--bisect-vars --first-parent' '
> > +     git rev-list --bisect-vars --first-parent E ^F
> > +' <<-EOF
> > +     bisect_rev='e5'
> > +     bisect_nr=4
> > +     bisect_good=4
> > +     bisect_bad=3
> > +     bisect_all=9
> > +     bisect_steps=2
> > +EOF
>
> Looks good, except for the middle test - that should already be working
> even before the current patch, right? If there's a reason for including
> it anyway, mention it in the commit message.

Also cutting, I think.

> > +test_expect_success '--bisect-all --first-parent returns correct order' '
> > +     git rev-list --bisect-all --first-parent E ^F >actual &&
> > +
> > +     # Make sure the entries are sorted in the dist order
> > +     sed -e "s/.*dist=\([0-9]\).*/\1/" actual >actual.dists &&
> > +     sort -r -n actual.dists >actual.dists.sorted &&
> > +     test_cmp actual.dists.sorted actual.dists
> > +'
> > +
> > +# NEEDSWORK: this test could afford being hardened against other
> > +# changes in the same file.
> > +test_expect_success '--bisect-all --first-parent compares correctly' '
> > +     cat >expect <<-EOF &&
> > +     $(git rev-parse tags/e5) (tag: e5, dist=4)
> > +     $(git rev-parse tags/e4) (tag: e4, dist=4)
> > +     $(git rev-parse tags/e6) (tag: e6, dist=3)
> > +     $(git rev-parse tags/e3) (tag: e3, dist=3)
> > +     $(git rev-parse tags/e7) (tag: e7, dist=2)
> > +     $(git rev-parse tags/e2) (tag: e2, dist=2)
> > +     $(git rev-parse tags/e8) (tag: e8, dist=1)
> > +     $(git rev-parse tags/e1) (tag: e1, dist=1)
> > +     $(git rev-parse tags/E) (tag: E, dist=0)
> > +EOF
> > +
> > +git rev-list --bisect-all --first-parent E ^F >actual &&
> > +     sort actual >actual.sorted &&
> > +     sort expect >expect.sorted &&
> > +     test_cmp expect.sorted actual.sorted
> > +'
>
> I think these 2 tests can be combined, since the latter also checks the
> dists. Also, correct the indentation of the latter test.

I understand they are similar tests, but... Is there a tangible
reason for combining them? Especially when their logic can
live and breathe completely separately, compacting tests
reduces the resolution of the information we can extract from failure.

I would rather simply drop one and preserve 1 test = 1 data point.
Jonathan Tan Nov. 7, 2019, 6:56 p.m. UTC | #8
> > > @@ -964,7 +981,12 @@ int bisect_next_all(struct repository *r, const char *prefix, int no_checkout)
> > >
> > >       bisect_common(&revs);
> > >
> > > -     find_bisection(&revs.commits, &reaches, &all, !!skipped_revs.nr);
> > > +     if (skipped_revs.nr)
> > > +             bisect_flags |= BISECT_FIND_ALL;
> > > +     if (revs.first_parent_only)
> > > +             bisect_flags |= BISECT_FIRST_PARENT;
> > > +
> > > +     find_bisection(&revs.commits, &reaches, &all, bisect_flags);
> > >       revs.commits = managed_skipped(revs.commits, &tried);
> > >
> > >       if (!revs.commits) {
> >
> > I don't see how revs.first_parent_only is ever set in this function. If
> > it's never set, undo this change, since this code is never executed.
> 
> In this function, 

(1)
> we call bisect_rev_setup() using the revs struct we
> made,

(2)
> which then calls setup_revisions() on the revs,

(3)
> which appears to
> call handle_revision_opt() with that struct again,

(4)
> which finally is
> allowed to set revs->first_parent_only = 1; in revision.c.
> 
> So unless I am horribly misreading something, we do set it.

Thanks for performing this analysis. Working backwards:

(4) handle_revision_opt() in revision.c sets revs->first_parent_only
only if argv[0] is "--first-parent".

(3) setup_revisions() calls handle_revision_opt() in a loop over its
argv parameter.

(2) bisect_rev_setup() initializes rev_argv (with ARGV_ARRAY_INIT, a
blank array). Then, it pushes "bisect_rev_setup", an OID in
"bad_format", some OIDs in "good_format", and possibly some paths. Then
it calls setup_revisions() with rev_argv as the argv parameter. Notice
that there is no "--first-parent" sent at all.

> > > +test_expect_success '--bisect-all --first-parent returns correct order' '
> > > +     git rev-list --bisect-all --first-parent E ^F >actual &&
> > > +
> > > +     # Make sure the entries are sorted in the dist order
> > > +     sed -e "s/.*dist=\([0-9]\).*/\1/" actual >actual.dists &&
> > > +     sort -r -n actual.dists >actual.dists.sorted &&
> > > +     test_cmp actual.dists.sorted actual.dists
> > > +'
> > > +
> > > +# NEEDSWORK: this test could afford being hardened against other
> > > +# changes in the same file.
> > > +test_expect_success '--bisect-all --first-parent compares correctly' '
> > > +     cat >expect <<-EOF &&
> > > +     $(git rev-parse tags/e5) (tag: e5, dist=4)
> > > +     $(git rev-parse tags/e4) (tag: e4, dist=4)
> > > +     $(git rev-parse tags/e6) (tag: e6, dist=3)
> > > +     $(git rev-parse tags/e3) (tag: e3, dist=3)
> > > +     $(git rev-parse tags/e7) (tag: e7, dist=2)
> > > +     $(git rev-parse tags/e2) (tag: e2, dist=2)
> > > +     $(git rev-parse tags/e8) (tag: e8, dist=1)
> > > +     $(git rev-parse tags/e1) (tag: e1, dist=1)
> > > +     $(git rev-parse tags/E) (tag: E, dist=0)
> > > +EOF
> > > +
> > > +git rev-list --bisect-all --first-parent E ^F >actual &&
> > > +     sort actual >actual.sorted &&
> > > +     sort expect >expect.sorted &&
> > > +     test_cmp expect.sorted actual.sorted
> > > +'
> >
> > I think these 2 tests can be combined, since the latter also checks the
> > dists. Also, correct the indentation of the latter test.
> 
> I understand they are similar tests, but... Is there a tangible
> reason for combining them? Especially when their logic can
> live and breathe completely separately, compacting tests
> reduces the resolution of the information we can extract from failure.
> 
> I would rather simply drop one and preserve 1 test = 1 data point.

In general, I agree that 1 test should represent 1 data point, and the
reason for that being (as you said) the resolution of the information we
can extract from failure. But here, the resolution is diminished if we
don't combine the tests. Here, if either test fails, because of the
postprocessing we've had to do on the output, we lose signal. But if we
had the combined test, we wouldn't.

Patch
diff mbox series

diff --git a/bisect.c b/bisect.c
index e81c91d02c..54129a796b 100644
--- a/bisect.c
+++ b/bisect.c
@@ -88,15 +88,16 @@  static inline void weight_set(struct commit_list *elem, int weight)
 	**commit_weight_at(&commit_weight, elem->item) = weight;
 }
 
-static int count_interesting_parents(struct commit *commit)
+static int count_interesting_parents(struct commit *commit, unsigned bisect_flags)
 {
 	struct commit_list *p;
 	int count;
 
 	for (count = 0, p = commit->parents; p; p = p->next) {
-		if (p->item->object.flags & UNINTERESTING)
-			continue;
-		count++;
+		if (!(p->item->object.flags & UNINTERESTING))
+			count++;
+		if (bisect_flags & BISECT_FIRST_PARENT)
+			break;
 	}
 	return count;
 }
@@ -123,7 +124,7 @@  static inline int halfway(struct commit_list *p, int nr)
 }
 
 static void show_list(const char *debug, int counted, int nr,
-		      struct commit_list *list)
+		      struct commit_list *list, unsigned bisect_flags)
 {
 	struct commit_list *p;
 
@@ -259,10 +260,12 @@  static struct commit_list *best_bisection_sorted(struct commit_list *list, int n
  */
 static struct commit_list *do_find_bisection(struct commit_list *list,
 					     int nr, int *weights,
-					     int find_all)
+					     unsigned bisect_flags)
 {
 	int n, counted;
 	struct commit_list *p;
+	unsigned first_parent = bisect_flags & BISECT_FIRST_PARENT;
+	unsigned find_all = bisect_flags & BISECT_FIND_ALL;
 
 	counted = 0;
 
@@ -271,13 +274,13 @@  static struct commit_list *do_find_bisection(struct commit_list *list,
 		unsigned flags = commit->object.flags;
 
 		*commit_weight_at(&commit_weight, p->item) = &weights[n++];
-		switch (count_interesting_parents(commit)) {
+		switch (count_interesting_parents(commit, bisect_flags)) {
 		case 0:
 			if (!(flags & TREESAME)) {
 				weight_set(p, 1);
 				counted++;
 				show_list("bisection 2 count one",
-					  counted, nr, list);
+					  counted, nr, list, bisect_flags);
 			}
 			/*
 			 * otherwise, it is known not to reach any
@@ -293,7 +296,7 @@  static struct commit_list *do_find_bisection(struct commit_list *list,
 		}
 	}
 
-	show_list("bisection 2 initialize", counted, nr, list);
+	show_list("bisection 2 initialize", counted, nr, list, bisect_flags);
 
 	/*
 	 * If you have only one parent in the resulting set
@@ -323,7 +326,8 @@  static struct commit_list *do_find_bisection(struct commit_list *list,
 		counted++;
 	}
 
-	show_list("bisection 2 count_distance", counted, nr, list);
+	/* should match bisection 2 initialize when BISECT_FIRST_PARENT */
+	show_list("bisection 2 count_distance", counted, nr, list, bisect_flags);
 
 	while (counted < nr) {
 		for (p = list; p; p = p->next) {
@@ -333,6 +337,17 @@  static struct commit_list *do_find_bisection(struct commit_list *list,
 			if (0 <= weight(p))
 				continue;
 			for (q = p->item->parents; q; q = q->next) {
+				/*
+				 * first_parent can skip parent nodes, but only when
+				 * confirmed as being of no interest.
+				 */
+				if (first_parent) {
+					if ((q->item->object.flags & UNINTERESTING) ||
+						(weight(q) < 0)) {
+						q = NULL;
+					}
+					break;
+				}
 				if (q->item->object.flags & UNINTERESTING)
 					continue;
 				if (0 <= weight(q))
@@ -350,7 +365,7 @@  static struct commit_list *do_find_bisection(struct commit_list *list,
 				weight_set(p, weight(q)+1);
 				counted++;
 				show_list("bisection 2 count one",
-					  counted, nr, list);
+					  counted, nr, list, bisect_flags);
 			}
 			else
 				weight_set(p, weight(q));
@@ -361,7 +376,7 @@  static struct commit_list *do_find_bisection(struct commit_list *list,
 		}
 	}
 
-	show_list("bisection 2 counted all", counted, nr, list);
+	show_list("bisection 2 counted all", counted, nr, list, bisect_flags);
 
 	if (!find_all)
 		return best_bisection(list, nr);
@@ -370,13 +385,14 @@  static struct commit_list *do_find_bisection(struct commit_list *list,
 }
 
 void find_bisection(struct commit_list **commit_list, int *reaches,
-		    int *all, int find_all)
+		    int *all, unsigned bisect_flags)
 {
 	int nr, on_list;
 	struct commit_list *list, *p, *best, *next, *last;
 	int *weights;
+	unsigned find_all = bisect_flags & BISECT_FIND_ALL;
 
-	show_list("bisection 2 entry", 0, 0, *commit_list);
+	show_list("bisection 2 entry", 0, 0, *commit_list, bisect_flags);
 	init_commit_weight(&commit_weight);
 
 	/*
@@ -400,13 +416,13 @@  void find_bisection(struct commit_list **commit_list, int *reaches,
 		on_list++;
 	}
 	list = last;
-	show_list("bisection 2 sorted", 0, nr, list);
+	show_list("bisection 2 sorted", 0, nr, list, bisect_flags);
 
 	*all = nr;
 	weights = xcalloc(on_list, sizeof(*weights));
 
 	/* Do the real work of finding bisection commit. */
-	best = do_find_bisection(list, nr, weights, find_all);
+	best = do_find_bisection(list, nr, weights, bisect_flags);
 	if (best) {
 		if (!find_all) {
 			list->item = best->item;
@@ -950,6 +966,7 @@  int bisect_next_all(struct repository *r, const char *prefix, int no_checkout)
 	struct rev_info revs;
 	struct commit_list *tried;
 	int reaches = 0, all = 0, nr, steps;
+	unsigned bisect_flags = 0;
 	struct object_id *bisect_rev;
 	char *steps_msg;
 
@@ -964,7 +981,12 @@  int bisect_next_all(struct repository *r, const char *prefix, int no_checkout)
 
 	bisect_common(&revs);
 
-	find_bisection(&revs.commits, &reaches, &all, !!skipped_revs.nr);
+	if (skipped_revs.nr)
+		bisect_flags |= BISECT_FIND_ALL;
+	if (revs.first_parent_only)
+		bisect_flags |= BISECT_FIRST_PARENT;
+
+	find_bisection(&revs.commits, &reaches, &all, bisect_flags);
 	revs.commits = managed_skipped(revs.commits, &tried);
 
 	if (!revs.commits) {
diff --git a/bisect.h b/bisect.h
index 4e69a11ea8..518be01999 100644
--- a/bisect.h
+++ b/bisect.h
@@ -3,16 +3,18 @@ 
 
 struct commit_list;
 struct repository;
+#define BISECT_FIND_ALL	(1U<<0)
+#define BISECT_FIRST_PARENT	(1U<<1)
 
 /*
  * Find bisection. If something is found, `reaches` will be the number of
  * commits that the best commit reaches. `all` will be the count of
  * non-SAMETREE commits. If nothing is found, `list` will be NULL.
  * Otherwise, it will be either all non-SAMETREE commits or the single
- * best commit, as chosen by `find_all`.
+ * best commit, as chosen by the flag `BISECT_FIND_ALL`.
  */
 void find_bisection(struct commit_list **list, int *reaches, int *all,
-		    int find_all);
+		    unsigned bisect_flags);
 
 struct commit_list *filter_skipped(struct commit_list *list,
 				   struct commit_list **tried,
@@ -31,6 +33,17 @@  struct rev_list_info {
 	const char *header_prefix;
 };
 
+/*
+ * Coordinates a bisection by examining input made available so far,
+ * setting up internal variables, and then bisecting with them.
+ * no_checkout directs this to only update BISECT_HEAD refs.
+ *
+ * Exit code 10 on successful bisection, so caller should exit with 0.
+ * Exit code 4 when no commits were found to bisect through.
+ * Exit code 1 MAY result from skipping the commit it would report.
+ *
+ * Otherwise, returns a call to command handlers which choose an exit.
+ */
 int bisect_next_all(struct repository *r,
 		    const char *prefix,
 		    int no_checkout);
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index e28d62ec64..759a182ec6 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -374,8 +374,8 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	int i;
 	int bisect_list = 0;
 	int bisect_show_vars = 0;
-	int bisect_find_all = 0;
 	int use_bitmap_index = 0;
+	unsigned bisect_flags = 0;
 	const char *show_progress = NULL;
 
 	if (argc == 2 && !strcmp(argv[1], "-h"))
@@ -443,7 +443,7 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 		}
 		if (!strcmp(arg, "--bisect-all")) {
 			bisect_list = 1;
-			bisect_find_all = 1;
+			bisect_flags |= BISECT_FIND_ALL;
 			info.flags |= BISECT_SHOW_ALL;
 			revs.show_decorations = 1;
 			continue;
@@ -565,7 +565,10 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (bisect_list) {
 		int reaches, all;
 
-		find_bisection(&revs.commits, &reaches, &all, bisect_find_all);
+		if (revs.first_parent_only)
+			bisect_flags |= BISECT_FIRST_PARENT;
+
+		find_bisection(&revs.commits, &reaches, &all, bisect_flags);
 
 		if (bisect_show_vars)
 			return show_bisect_vars(&info, reaches, all);
diff --git a/revision.c b/revision.c
index 0e39b2b8a5..c6f0a9f213 100644
--- a/revision.c
+++ b/revision.c
@@ -2716,9 +2716,6 @@  int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
 	if (!revs->reflog_info && revs->grep_filter.use_reflog_filter)
 		die("cannot use --grep-reflog without --walk-reflogs");
 
-	if (revs->first_parent_only && revs->bisect)
-		die(_("--first-parent is incompatible with --bisect"));
-
 	if (revs->line_level_traverse &&
 	    (revs->diffopt.output_format & ~(DIFF_FORMAT_PATCH | DIFF_FORMAT_NO_OUTPUT)))
 		die(_("-L does not yet support diff formats besides -p and -s"));
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index b8cf82349b..95949e4ff1 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -122,8 +122,8 @@  test_expect_success 'rev-list can negate index objects' '
 	test_cmp expect actual
 '
 
-test_expect_success '--bisect and --first-parent can not be combined' '
-	test_must_fail git rev-list --bisect --first-parent HEAD
+test_expect_success '--bisect and --first-parent CAN be combined' '
+	git rev-list --bisect --first-parent HEAD
 '
 
 test_expect_success '--header shows a NUL after each commit' '
diff --git a/t/t6002-rev-list-bisect.sh b/t/t6002-rev-list-bisect.sh
index a661408038..6caf2af650 100755
--- a/t/t6002-rev-list-bisect.sh
+++ b/t/t6002-rev-list-bisect.sh
@@ -263,4 +263,58 @@  test_expect_success 'rev-parse --bisect can default to good/bad refs' '
 	test_cmp expect.sorted actual.sorted
 '
 
+# --first-parent tests
+
+# --bisect --first-parent should pluck out the middle.
+printf "%s\n" e4 |
+test_output_expect_success "--bisect --first-parent" '
+	git rev-list --bisect --first-parent E ^F
+'
+
+printf "%s\n" E e1 e2 e3 e4 e5 e6 e7 e8 |
+test_output_expect_success "--first-parent" '
+	git rev-list --first-parent E ^F
+'
+
+test_output_expect_success '--bisect-vars --first-parent' '
+	git rev-list --bisect-vars --first-parent E ^F
+' <<-EOF
+	bisect_rev='e5'
+	bisect_nr=4
+	bisect_good=4
+	bisect_bad=3
+	bisect_all=9
+	bisect_steps=2
+EOF
+
+test_expect_success '--bisect-all --first-parent returns correct order' '
+	git rev-list --bisect-all --first-parent E ^F >actual &&
+
+	# Make sure the entries are sorted in the dist order
+	sed -e "s/.*dist=\([0-9]\).*/\1/" actual >actual.dists &&
+	sort -r -n actual.dists >actual.dists.sorted &&
+	test_cmp actual.dists.sorted actual.dists
+'
+
+# NEEDSWORK: this test could afford being hardened against other
+# changes in the same file.
+test_expect_success '--bisect-all --first-parent compares correctly' '
+	cat >expect <<-EOF &&
+	$(git rev-parse tags/e5) (tag: e5, dist=4)
+	$(git rev-parse tags/e4) (tag: e4, dist=4)
+	$(git rev-parse tags/e6) (tag: e6, dist=3)
+	$(git rev-parse tags/e3) (tag: e3, dist=3)
+	$(git rev-parse tags/e7) (tag: e7, dist=2)
+	$(git rev-parse tags/e2) (tag: e2, dist=2)
+	$(git rev-parse tags/e8) (tag: e8, dist=1)
+	$(git rev-parse tags/e1) (tag: e1, dist=1)
+	$(git rev-parse tags/E) (tag: E, dist=0)
+EOF
+
+git rev-list --bisect-all --first-parent E ^F >actual &&
+	sort actual >actual.sorted &&
+	sort expect >expect.sorted &&
+	test_cmp expect.sorted actual.sorted
+'
+
 test_done