[v2,06/15] rev-list: make --count work with --objects
diff mbox series

Message ID 20200214182220.GF150965@coredump.intra.peff.net
State New
Headers show
Series
  • combining object filters and bitmaps
Related show

Commit Message

Jeff King Feb. 14, 2020, 6:22 p.m. UTC
The current behavior from "rev-list --count --objects" is nonsensical:
we enumerate all of the objects except commits, but then give a count of
commits. This wasn't planned, and is just what the code happens to do.

Instead, let's give the answer the user almost certainly wanted: the
full count of objects.

Note that there are more complicated cases around cherry-marking, etc.
We'll punt on those for now, but let the user know that we can't produce
an answer (rather than giving them something useless).

We'll test both the new feature as well as a vanilla --count of commits,
since that surprisingly doesn't seem to be covered in the existing
tests.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c       | 13 +++++++++++++
 t/t6000-rev-list-misc.sh | 12 ++++++++++++
 2 files changed, 25 insertions(+)

Comments

Taylor Blau Feb. 15, 2020, 12:42 a.m. UTC | #1
On Fri, Feb 14, 2020 at 01:22:20PM -0500, Jeff King wrote:
> The current behavior from "rev-list --count --objects" is nonsensical:
> we enumerate all of the objects except commits, but then give a count of
> commits. This wasn't planned, and is just what the code happens to do.
>
> Instead, let's give the answer the user almost certainly wanted: the
> full count of objects.

This makes sense: I've often worried about introducing
backwards-incompatible changes in newer versions of Git, even for
behavior that didn't make sense to begin with.

Of course, backwards-incompatible changes *are* something worth worrying
about, but I don't find that the behavior was sensible to begin with, so
I don't have a problem "breaking" it if "breaking" means making
something nonsensical behave correctly.

> Note that there are more complicated cases around cherry-marking, etc.
> We'll punt on those for now, but let the user know that we can't produce
> an answer (rather than giving them something useless).

Yep, sounds good.

> We'll test both the new feature as well as a vanilla --count of commits,
> since that surprisingly doesn't seem to be covered in the existing
> tests.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  builtin/rev-list.c       | 13 +++++++++++++
>  t/t6000-rev-list-misc.sh | 12 ++++++++++++
>  2 files changed, 25 insertions(+)
>
> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index 38c5ca5603..9452123988 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -253,11 +253,19 @@ static int finish_object(struct object *obj, const char *name, void *cb_data)
>  static void show_object(struct object *obj, const char *name, void *cb_data)
>  {
>  	struct rev_list_info *info = cb_data;
> +	struct rev_info *revs = info->revs;
> +
>  	if (finish_object(obj, name, cb_data))
>  		return;
>  	display_progress(progress, ++progress_counter);
>  	if (info->flags & REV_LIST_QUIET)
>  		return;
> +
> +	if (revs->count) {
> +		revs->count_right++;
> +		return;
> +	}
> +

Hmm. This puzzled me at first. Do you think that it could benefit from a
comment?

>  	if (arg_show_object_names)
>  		show_object_with_name(stdout, obj, name);
>  	else
> @@ -584,6 +592,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  	if (revs.show_notes)
>  		die(_("rev-list does not support display of notes"));
>
> +	if (revs.count &&
> +	    (revs.tag_objects || revs.tree_objects || revs.blob_objects) &&
> +	    (revs.left_right || revs.cherry_mark))
> +		die(_("marked counting is incompatible with --objects"));
> +
>  	if (filter_options.choice)
>  		use_bitmap_index = 0;
>
> diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
> index b8cf82349b..383f2c457d 100755
> --- a/t/t6000-rev-list-misc.sh
> +++ b/t/t6000-rev-list-misc.sh
> @@ -148,4 +148,16 @@ test_expect_success 'rev-list --end-of-options' '
>  	test_cmp expect actual
>  '
>
> +test_expect_success 'rev-list --count' '
> +	count=$(git rev-list --count HEAD) &&
> +	git rev-list HEAD >actual &&
> +	test_line_count = $count actual
> +'
> +
> +test_expect_success 'rev-list --count --objects' '
> +	count=$(git rev-list --count --objects HEAD) &&
> +	git rev-list --objects HEAD >actual &&
> +	test_line_count = $count actual
> +'
> +
>  test_done
> --
> 2.25.0.796.gcc29325708

All looks good.

Thanks,
Taylor
Jeff King Feb. 15, 2020, 6:48 a.m. UTC | #2
On Fri, Feb 14, 2020 at 04:42:16PM -0800, Taylor Blau wrote:

> On Fri, Feb 14, 2020 at 01:22:20PM -0500, Jeff King wrote:
> > The current behavior from "rev-list --count --objects" is nonsensical:
> > we enumerate all of the objects except commits, but then give a count of
> > commits. This wasn't planned, and is just what the code happens to do.
> >
> > Instead, let's give the answer the user almost certainly wanted: the
> > full count of objects.
> 
> This makes sense: I've often worried about introducing
> backwards-incompatible changes in newer versions of Git, even for
> behavior that didn't make sense to begin with.
> 
> Of course, backwards-incompatible changes *are* something worth worrying
> about, but I don't find that the behavior was sensible to begin with, so
> I don't have a problem "breaking" it if "breaking" means making
> something nonsensical behave correctly.

Yeah, I admit I'm guessing that nobody cares about the current behavior,
or that it was unplanned. But it seems sufficiently insane to me to take
a chance on.

> > +	if (revs->count) {
> > +		revs->count_right++;
> > +		return;
> > +	}
> > +
> 
> Hmm. This puzzled me at first. Do you think that it could benefit from a
> comment?

What would it say (i.e., I'm not sure what confused you)?

-Peff
Junio C Hamano Feb. 16, 2020, 11:34 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

>> > +	if (revs->count) {
>> > +		revs->count_right++;
>> > +		return;
>> > +	}
>> > +
>> 
>> Hmm. This puzzled me at first. Do you think that it could benefit from a
>> comment?
>
> What would it say (i.e., I'm not sure what confused you)?

I think the question reader had was "why *right*?"
Jeff King Feb. 18, 2020, 5:24 a.m. UTC | #4
On Sun, Feb 16, 2020 at 03:34:13PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> > +	if (revs->count) {
> >> > +		revs->count_right++;
> >> > +		return;
> >> > +	}
> >> > +
> >> 
> >> Hmm. This puzzled me at first. Do you think that it could benefit from a
> >> comment?
> >
> > What would it say (i.e., I'm not sure what confused you)?
> 
> I think the question reader had was "why *right*?"

Ah. The answer is: because it's not SYMMETRIC_LEFT. ;)

The counting code accumulates there by default when there are no side
markings, so that's what it will report when there's no left_right or
cherry_mark (which we know will be the case here, since we die()
otherwise).

-Peff
Junio C Hamano Feb. 18, 2020, 5:28 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

> On Sun, Feb 16, 2020 at 03:34:13PM -0800, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> >> > +	if (revs->count) {
>> >> > +		revs->count_right++;
>> >> > +		return;
>> >> > +	}
>> >> > +
>> >> 
>> >> Hmm. This puzzled me at first. Do you think that it could benefit from a
>> >> comment?
>> >
>> > What would it say (i.e., I'm not sure what confused you)?
>> 
>> I think the question reader had was "why *right*?"
>
> Ah. The answer is: because it's not SYMMETRIC_LEFT. ;)
>
> The counting code accumulates there by default when there are no side
> markings, so that's what it will report when there's no left_right or
> cherry_mark (which we know will be the case here, since we die()
> otherwise).

Yup.

	/*
	 * The object count is always accumulated in the .count_right
	 * field for traversal that is not a left-right traversal,
	 * and cmd_rev_list() made sure that a .count request that
	 * wants to count non-commit objects, which is handled by
	 * the show_object() callback, does not ask for .left_right.
	 */

Overkill?  I dunno.
Derrick Stolee Feb. 18, 2020, 6:05 p.m. UTC | #6
On 2/14/2020 1:22 PM, Jeff King wrote:
> +test_expect_success 'rev-list --count' '
> +	count=$(git rev-list --count HEAD) &&
> +	git rev-list HEAD >actual &&
> +	test_line_count = $count actual
> +'
> +
> +test_expect_success 'rev-list --count --objects' '
> +	count=$(git rev-list --count --objects HEAD) &&
> +	git rev-list --objects HEAD >actual &&
> +	test_line_count = $count actual
> +'

I suppose these tests work, although I would probably
prefer precomputed explicit expected counts instead
of asking rev-list for the correct answer to a
rev-list command. This is still fine, because we test
that the non-count versions return the correct results,
but I would hate for a bug to affect both modes equally
and cause these tests to pass.

Thanks,
-Stolee
Jeff King Feb. 18, 2020, 7:55 p.m. UTC | #7
On Tue, Feb 18, 2020 at 09:28:31AM -0800, Junio C Hamano wrote:

> >> I think the question reader had was "why *right*?"
> >
> > Ah. The answer is: because it's not SYMMETRIC_LEFT. ;)
> >
> > The counting code accumulates there by default when there are no side
> > markings, so that's what it will report when there's no left_right or
> > cherry_mark (which we know will be the case here, since we die()
> > otherwise).
> 
> Yup.
> 
> 	/*
> 	 * The object count is always accumulated in the .count_right
> 	 * field for traversal that is not a left-right traversal,
> 	 * and cmd_rev_list() made sure that a .count request that
> 	 * wants to count non-commit objects, which is handled by
> 	 * the show_object() callback, does not ask for .left_right.
> 	 */
> 
> Overkill?  I dunno.

No, I think it looks good. Want to squash that in (and I'll do likewise
locally)?

-Peff
Jeff King Feb. 18, 2020, 7:59 p.m. UTC | #8
On Tue, Feb 18, 2020 at 01:05:35PM -0500, Derrick Stolee wrote:

> On 2/14/2020 1:22 PM, Jeff King wrote:
> > +test_expect_success 'rev-list --count' '
> > +	count=$(git rev-list --count HEAD) &&
> > +	git rev-list HEAD >actual &&
> > +	test_line_count = $count actual
> > +'
> > +
> > +test_expect_success 'rev-list --count --objects' '
> > +	count=$(git rev-list --count --objects HEAD) &&
> > +	git rev-list --objects HEAD >actual &&
> > +	test_line_count = $count actual
> > +'
> 
> I suppose these tests work, although I would probably
> prefer precomputed explicit expected counts instead
> of asking rev-list for the correct answer to a
> rev-list command. This is still fine, because we test
> that the non-count versions return the correct results,
> but I would hate for a bug to affect both modes equally
> and cause these tests to pass.

I was hoping it would make the tests a bit less brittle. And as you
note, we should be checking the enumeration results themselves more
carefully in other tests, so I find it fairly unlikely to see a bug that
doesn't trigger _any_ test failure.

-Peff
Junio C Hamano Feb. 18, 2020, 9:19 p.m. UTC | #9
Jeff King <peff@peff.net> writes:

> On Tue, Feb 18, 2020 at 09:28:31AM -0800, Junio C Hamano wrote:
>
>> >> I think the question reader had was "why *right*?"
>> >
>> > Ah. The answer is: because it's not SYMMETRIC_LEFT. ;)
>> >
>> > The counting code accumulates there by default when there are no side
>> > markings, so that's what it will report when there's no left_right or
>> > cherry_mark (which we know will be the case here, since we die()
>> > otherwise).
>> 
>> Yup.
>> 
>> 	/*
>> 	 * The object count is always accumulated in the .count_right
>> 	 * field for traversal that is not a left-right traversal,
>> 	 * and cmd_rev_list() made sure that a .count request that
>> 	 * wants to count non-commit objects, which is handled by
>> 	 * the show_object() callback, does not ask for .left_right.
>> 	 */
>> 
>> Overkill?  I dunno.
>
> No, I think it looks good. Want to squash that in (and I'll do likewise
> locally)?

It's too late for squashing anything in, as they are in 'next'
already.
Jeff King Feb. 18, 2020, 9:23 p.m. UTC | #10
On Tue, Feb 18, 2020 at 01:19:51PM -0800, Junio C Hamano wrote:

> >> 	/*
> >> 	 * The object count is always accumulated in the .count_right
> >> 	 * field for traversal that is not a left-right traversal,
> >> 	 * and cmd_rev_list() made sure that a .count request that
> >> 	 * wants to count non-commit objects, which is handled by
> >> 	 * the show_object() callback, does not ask for .left_right.
> >> 	 */
> >> 
> >> Overkill?  I dunno.
> >
> > No, I think it looks good. Want to squash that in (and I'll do likewise
> > locally)?
> 
> It's too late for squashing anything in, as they are in 'next'
> already.

Ah, I hadn't noticed that yet. I can live without it, but I'd also be OK
with a patch on top if you want to add it.

-Peff

Patch
diff mbox series

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 38c5ca5603..9452123988 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -253,11 +253,19 @@  static int finish_object(struct object *obj, const char *name, void *cb_data)
 static void show_object(struct object *obj, const char *name, void *cb_data)
 {
 	struct rev_list_info *info = cb_data;
+	struct rev_info *revs = info->revs;
+
 	if (finish_object(obj, name, cb_data))
 		return;
 	display_progress(progress, ++progress_counter);
 	if (info->flags & REV_LIST_QUIET)
 		return;
+
+	if (revs->count) {
+		revs->count_right++;
+		return;
+	}
+
 	if (arg_show_object_names)
 		show_object_with_name(stdout, obj, name);
 	else
@@ -584,6 +592,11 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (revs.show_notes)
 		die(_("rev-list does not support display of notes"));
 
+	if (revs.count &&
+	    (revs.tag_objects || revs.tree_objects || revs.blob_objects) &&
+	    (revs.left_right || revs.cherry_mark))
+		die(_("marked counting is incompatible with --objects"));
+
 	if (filter_options.choice)
 		use_bitmap_index = 0;
 
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index b8cf82349b..383f2c457d 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -148,4 +148,16 @@  test_expect_success 'rev-list --end-of-options' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list --count' '
+	count=$(git rev-list --count HEAD) &&
+	git rev-list HEAD >actual &&
+	test_line_count = $count actual
+'
+
+test_expect_success 'rev-list --count --objects' '
+	count=$(git rev-list --count --objects HEAD) &&
+	git rev-list --objects HEAD >actual &&
+	test_line_count = $count actual
+'
+
 test_done