[v2,08/15] t5310: factor out bitmap traversal comparison
diff mbox series

Message ID 20200214182225.GH150965@coredump.intra.peff.net
State New
Headers show
Series
  • combining object filters and bitmaps
Related show

Commit Message

Jeff King Feb. 14, 2020, 6:22 p.m. UTC
We check the results of "rev-list --use-bitmap-index" by comparing it to
the same traversal without the bitmap option. However, this is a little
tricky to do because of some output differences (see the included
comment for details). Let's pull this out into a helper function, since
we'll be adding some similar tests.

While we're at it, let's also try to confirm that the bitmap output did
indeed use bitmaps. Since the code internally falls back to the
non-bitmap path in some cases, the tests are at risk of becoming trivial
noops.

This is a bit fragile, as not all outputs will differ (e.g., looking at
only the commits from a fully-bitmapped pack will end up exactly the
same as the normal traversal order, since it also matches the pack
order). So we'll provide an escape hatch by which tests can disable this
check (which should only be used after manually confirming that bitmaps
kicked in).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t5310-pack-bitmaps.sh | 10 +++-------
 t/test-lib-functions.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 7 deletions(-)

Comments

Taylor Blau Feb. 15, 2020, 2:14 a.m. UTC | #1
On Fri, Feb 14, 2020 at 01:22:25PM -0500, Jeff King wrote:
> We check the results of "rev-list --use-bitmap-index" by comparing it to
> the same traversal without the bitmap option. However, this is a little
> tricky to do because of some output differences (see the included
> comment for details). Let's pull this out into a helper function, since
> we'll be adding some similar tests.
>
> While we're at it, let's also try to confirm that the bitmap output did
> indeed use bitmaps. Since the code internally falls back to the
> non-bitmap path in some cases, the tests are at risk of becoming trivial
> noops.
>
> This is a bit fragile, as not all outputs will differ (e.g., looking at
> only the commits from a fully-bitmapped pack will end up exactly the
> same as the normal traversal order, since it also matches the pack
> order). So we'll provide an escape hatch by which tests can disable this
> check (which should only be used after manually confirming that bitmaps
> kicked in).

Thanks for pointing out the rationale behind this.

> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  t/t5310-pack-bitmaps.sh | 10 +++-------
>  t/test-lib-functions.sh | 27 +++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> index 7ba7d294a5..b8645ae070 100755
> --- a/t/t5310-pack-bitmaps.sh
> +++ b/t/t5310-pack-bitmaps.sh
> @@ -81,13 +81,9 @@ rev_list_tests() {
>  	'
>
>  	test_expect_success "enumerate --objects ($state)" '
> -		git rev-list --objects --use-bitmap-index HEAD >tmp &&
> -		cut -d" " -f1 <tmp >tmp2 &&
> -		sort <tmp2 >actual &&
> -		git rev-list --objects HEAD >tmp &&
> -		cut -d" " -f1 <tmp >tmp2 &&
> -		sort <tmp2 >expect &&
> -		test_cmp expect actual
> +		git rev-list --objects --use-bitmap-index HEAD >actual &&
> +		git rev-list --objects HEAD >expect &&
> +		test_bitmap_traversal expect actual
>  	'
>
>  	test_expect_success "bitmap --objects handles non-commit objects ($state)" '
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 284c52d076..352c213d52 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1516,3 +1516,30 @@ test_set_port () {
>  	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
>  	eval $var=$port
>  }
> +
> +# Compare a file containing rev-list bitmap traversal output to its non-bitmap
> +# counterpart. You can't just use test_cmp for this, because the two produce
> +# subtly different output:
> +#
> +#   - regular output is in traversal order, whereas bitmap is split by type,
> +#     with non-packed objects at the end
> +#
> +#   - regular output has a space and the pathname appended to non-commit
> +#     objects; bitmap output omits this
> +#
> +# This function normalizes and compares the two. The second file should
> +# always be the bitmap output.
> +test_bitmap_traversal () {
> +	if test "$1" = "--no-confirm-bitmaps"
> +	then
> +		shift
> +	elif cmp "$1" "$2"

Any reason to prefer 'cmp' here and then use 'test_cmp' below? I can't
think of a good reason one way or another, especially in places where
GIT_TEST_CMP is just 'cmp', but I think the two usages should be
consistent with one another.

If there *is* a favorable argument for 'test_cmp' (instead of a bare
'cmp'), I think that it'd be that the original code a couple of hunks up
uses it.

> +	then
> +		echo >&2 "identical raw outputs; are you sure bitmaps were used?"
> +		return 1
> +	fi &&
> +	cut -d' ' -f1 "$1" | sort >"$1.normalized" &&
> +	sort "$2" >"$2.normalized" &&
> +	test_cmp "$1.normalized" "$2.normalized" &&
> +	rm -f "$1.normalized" "$2.normalized"

This implementation looks good to me.

> +}
> --
> 2.25.0.796.gcc29325708

Thanks,
Taylor
Jeff King Feb. 15, 2020, 7 a.m. UTC | #2
On Fri, Feb 14, 2020 at 06:14:46PM -0800, Taylor Blau wrote:

> > This is a bit fragile, as not all outputs will differ (e.g., looking at
> > only the commits from a fully-bitmapped pack will end up exactly the
> > same as the normal traversal order, since it also matches the pack
> > order). So we'll provide an escape hatch by which tests can disable this
> > check (which should only be used after manually confirming that bitmaps
> > kicked in).
> 
> Thanks for pointing out the rationale behind this.

I didn't write it this way at first, but the need became quite apparent
when I ran the test from the next patch. :)

I waffled on whether the extra option made the helper too convoluted to
be worthwhile.  Another way to do it would be two separate functions:

  test_for_bitmap_output expect actual &&
  test_bitmap_output_matches expect actual

but as you can see I couldn't come up with good names.

I doubt it matters all that much either way, as long as the docstring is
clear.

> > +test_bitmap_traversal () {
> > +	if test "$1" = "--no-confirm-bitmaps"
> > +	then
> > +		shift
> > +	elif cmp "$1" "$2"
> 
> Any reason to prefer 'cmp' here and then use 'test_cmp' below? I can't
> think of a good reason one way or another, especially in places where
> GIT_TEST_CMP is just 'cmp', but I think the two usages should be
> consistent with one another.

On most platforms GIT_TEST_CMP is "diff -u" (it's only on platforms with
a crappy system diff tool that we replace it with "cmp"). So I thought
it would be confusing for it to dump a diff in the expected case (since
unlike a normal test_cmp, we want the outputs to differ).  "cmp" does
still write a short message to stderr; I thought about redirecting it to
/dev/null, but worried that would make debugging verbose output harder).

> If there *is* a favorable argument for 'test_cmp' (instead of a bare
> 'cmp'), I think that it'd be that the original code a couple of hunks up
> uses it.

Right, but it uses to check that two things are equal (which we also use
it for here).

-Peff

Patch
diff mbox series

diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index 7ba7d294a5..b8645ae070 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -81,13 +81,9 @@  rev_list_tests() {
 	'
 
 	test_expect_success "enumerate --objects ($state)" '
-		git rev-list --objects --use-bitmap-index HEAD >tmp &&
-		cut -d" " -f1 <tmp >tmp2 &&
-		sort <tmp2 >actual &&
-		git rev-list --objects HEAD >tmp &&
-		cut -d" " -f1 <tmp >tmp2 &&
-		sort <tmp2 >expect &&
-		test_cmp expect actual
+		git rev-list --objects --use-bitmap-index HEAD >actual &&
+		git rev-list --objects HEAD >expect &&
+		test_bitmap_traversal expect actual
 	'
 
 	test_expect_success "bitmap --objects handles non-commit objects ($state)" '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 284c52d076..352c213d52 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1516,3 +1516,30 @@  test_set_port () {
 	port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
 	eval $var=$port
 }
+
+# Compare a file containing rev-list bitmap traversal output to its non-bitmap
+# counterpart. You can't just use test_cmp for this, because the two produce
+# subtly different output:
+#
+#   - regular output is in traversal order, whereas bitmap is split by type,
+#     with non-packed objects at the end
+#
+#   - regular output has a space and the pathname appended to non-commit
+#     objects; bitmap output omits this
+#
+# This function normalizes and compares the two. The second file should
+# always be the bitmap output.
+test_bitmap_traversal () {
+	if test "$1" = "--no-confirm-bitmaps"
+	then
+		shift
+	elif cmp "$1" "$2"
+	then
+		echo >&2 "identical raw outputs; are you sure bitmaps were used?"
+		return 1
+	fi &&
+	cut -d' ' -f1 "$1" | sort >"$1.normalized" &&
+	sort "$2" >"$2.normalized" &&
+	test_cmp "$1.normalized" "$2.normalized" &&
+	rm -f "$1.normalized" "$2.normalized"
+}