[06/13] rev-list: make --count work with --objects
diff mbox series

Message ID 20200213021929.GF1126038@coredump.intra.peff.net
State New
Headers show
Series
  • combining object filters and bitmaps
Related show

Commit Message

Jeff King Feb. 13, 2020, 2:19 a.m. UTC
The current behavior from "rev-list --count --objects" is nonsensical:
we enumerate all of the objects except commits, but then give a count of
commits. This wasn't planned, and is just what the code happens to do.

Instead, let's give the answer the user almost certainly wanted: the
full count of objects.

Note that there are more complicated cases around cherry-marking, etc.
We'll punt on those for now, but let the user know that we can't produce
an answer (rather than giving them something useless).

We'll test both the new feature as well as a vanilla --count of commits,
since that surprisingly doesn't seem to be covered in the existing
tests.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c       | 13 +++++++++++++
 t/t6000-rev-list-misc.sh | 12 ++++++++++++
 2 files changed, 25 insertions(+)

Comments

Junio C Hamano Feb. 13, 2020, 7:14 p.m. UTC | #1
Jeff King <peff@peff.net> writes:

> The current behavior from "rev-list --count --objects" is nonsensical:
> we enumerate all of the objects except commits, but then give a count of
> commits. This wasn't planned, and is just what the code happens to do.
>
> Instead, let's give the answer the user almost certainly wanted: the
> full count of objects.
>
> Note that there are more complicated cases around cherry-marking, etc.
> We'll punt on those for now, but let the user know that we can't produce
> an answer (rather than giving them something useless).

Now, finally we start changing the behaviour.  

Is the reason why --left-right and --objects do not work well
together because same trees and blobs can be shared between commits
on both sides?

> diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
> index b8cf82349b..383f2c457d 100755
> --- a/t/t6000-rev-list-misc.sh
> +++ b/t/t6000-rev-list-misc.sh
> @@ -148,4 +148,16 @@ test_expect_success 'rev-list --end-of-options' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'rev-list --count' '
> +	count=$(git rev-list --count HEAD) &&
> +	git rev-list HEAD >actual &&
> +	test_line_count = $count actual
> +'
> +
> +test_expect_success 'rev-list --count --objects' '
> +	count=$(git rev-list --count --objects HEAD) &&
> +	git rev-list --objects HEAD >actual &&
> +	test_line_count = $count actual
> +'

Makes sense to define --count as the same as number of objects
shown and verify it.
Jeff King Feb. 13, 2020, 8:27 p.m. UTC | #2
On Thu, Feb 13, 2020 at 11:14:06AM -0800, Junio C Hamano wrote:

> > The current behavior from "rev-list --count --objects" is nonsensical:
> > we enumerate all of the objects except commits, but then give a count of
> > commits. This wasn't planned, and is just what the code happens to do.
> >
> > Instead, let's give the answer the user almost certainly wanted: the
> > full count of objects.
> >
> > Note that there are more complicated cases around cherry-marking, etc.
> > We'll punt on those for now, but let the user know that we can't produce
> > an answer (rather than giving them something useless).
> 
> Now, finally we start changing the behaviour.  
> 
> Is the reason why --left-right and --objects do not work well
> together because same trees and blobs can be shared between commits
> on both sides?

Yes, exactly. I think you could probably define some sensible responses
there. E.g., consider this history:

  commit() { echo $1 >$1 && git add $1 && git commit -m $1; }
  commit base
  commit left
  git checkout -b side HEAD^
  commit right

which looks like this:

  $ git log --graph --oneline --all
  * e0b7b94 (master) left
  | * 2a4163a (HEAD -> side) right
  |/  
  * 8d8a806 base

That "base" commit is sort of in the same boat: it's reachable from
either side. By default we don't count it at all:

  $ git rev-list --count --left-right master...side
  1	1

but we do know about it as a boundary commit, and you could ask for it
to be counted on the right:

  $ git rev-list --boundary --left-right master...side
  <e0b7b9473efa49db3c6bf4ceab587f22d1935f28
  >2a4163a0a9bbcb837af2ac2e80e17120798f863a
  -8d8a806249905f101b5ac3f1eb74fa426f90ddf2

  $ git rev-list --count --boundary --left-right master...side
  2	1

So probably it would be sensible to do the same for the objects.
Anything reachable from both is a "boundary" object. But I don't think
we extend the left-right marking to --objects at all now:

  $ git rev-list --boundary --left-right --objects master...side
  <e0b7b9473efa49db3c6bf4ceab587f22d1935f28
  >2a4163a0a9bbcb837af2ac2e80e17120798f863a
  -8d8a806249905f101b5ac3f1eb74fa426f90ddf2
  f48654b5fe8de485e4622598842ca14b04b62c0a
  45cf141ba67d59203f02a54f03162f3fcef57830 left
  4540e3db9d99d518601ecadb81f7d55d55855033
  c376d892e8b105bd712d06ec5162b5f31ce949c3 right

Nor do we even seem to dig into the boundary objects.

So there would need to be more infrastructure in the traversal itself to
be able to do this right, I think. I'm happy to wait until somebody
demonstrates a real use for it. :)

-Peff

Patch
diff mbox series

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index c2daf40449..b4fd507f25 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -253,11 +253,19 @@  static int finish_object(struct object *obj, const char *name, void *cb_data)
 static void show_object(struct object *obj, const char *name, void *cb_data)
 {
 	struct rev_list_info *info = cb_data;
+	struct rev_info *revs = info->revs;
+
 	if (finish_object(obj, name, cb_data))
 		return;
 	display_progress(progress, ++progress_counter);
 	if (info->flags & REV_LIST_QUIET)
 		return;
+
+	if (revs->count) {
+		revs->count_right++;
+		return;
+	}
+
 	if (arg_show_object_names)
 		show_object_with_name(stdout, obj, name);
 	else
@@ -584,6 +592,11 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (revs.show_notes)
 		die(_("rev-list does not support display of notes"));
 
+	if (revs.count &&
+	    (revs.tag_objects || revs.tree_objects || revs.blob_objects) &&
+	    (revs.left_right || revs.cherry_mark))
+		die(_("marked counting is incompatible with --objects"));
+
 	if (filter_options.choice || revs.prune)
 		use_bitmap_index = 0;
 
diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh
index b8cf82349b..383f2c457d 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -148,4 +148,16 @@  test_expect_success 'rev-list --end-of-options' '
 	test_cmp expect actual
 '
 
+test_expect_success 'rev-list --count' '
+	count=$(git rev-list --count HEAD) &&
+	git rev-list HEAD >actual &&
+	test_line_count = $count actual
+'
+
+test_expect_success 'rev-list --count --objects' '
+	count=$(git rev-list --count --objects HEAD) &&
+	git rev-list --objects HEAD >actual &&
+	test_line_count = $count actual
+'
+
 test_done