diff mbox series

[v9,1/6] t5323: test cases for git-pack-redundant

Message ID 20190201162152.31136-2-worldhello.net@gmail.com (mailing list archive)
State New, archived
Headers show
Series pack-redundant: new algorithm to find min packs | expand

Commit Message

Jiang Xin Feb. 1, 2019, 4:21 p.m. UTC
From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

Add test cases for git pack-redundant to validate new algorithm for git
pack-redundant.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Reviewed-by: Sun Chao <sunchao9@huawei.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t5323-pack-redundant.sh | 510 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 510 insertions(+)
 create mode 100755 t/t5323-pack-redundant.sh

Comments

Eric Sunshine Feb. 1, 2019, 7:42 p.m. UTC | #1
On Fri, Feb 1, 2019 at 11:22 AM Jiang Xin <worldhello.net@gmail.com> wrote:
> Add test cases for git pack-redundant to validate new algorithm for git
> pack-redundant.
>
> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> ---
> diff --git a/t/t5323-pack-redundant.sh b/t/t5323-pack-redundant.sh
> @@ -0,0 +1,510 @@
> +# Note: DO NOT run it in a subshell, otherwise the variables will not be set

Which variables won't be set? It's not clear what this restriction is about.

> +# Usage: create_commits_in <repo> A B C ...
> +create_commits_in () {
> +       repo="$1" &&
> +       parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null) || parent=

Broken &&-chain. Instead, perhaps:

    if ! parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null)
    then
        parent=
    fi &&

or something simpler.

> +       T=$(git -C "$repo" write-tree) &&
> +       shift &&
> +       while test $# -gt 0
> +       do
> +               name=$1 &&
> +               test_tick &&
> +               if test -z "$parent"
> +               then
> +                       oid=$(echo $name | git -C "$repo" commit-tree $T)
> +               else
> +                       oid=$(echo $name | git -C "$repo" commit-tree -p $parent $T)
> +               fi &&
> +               eval $name=$oid &&
> +               parent=$oid &&
> +               shift ||
> +               return 1
> +       done

Broken &&-chain. Use:

    done &&

> +       git -C "$repo" update-ref refs/heads/master $oid
> +}
> +
> +# Note: DO NOT run it in a subshell, otherwise the variables will not be set
> +create_pack_1 () {
> +       P1=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF

Which variables? Note that you can capture output of a subshell into a
variable, if necessary.
Junio C Hamano Feb. 1, 2019, 9:03 p.m. UTC | #2
Eric Sunshine <sunshine@sunshineco.com> writes:

> On Fri, Feb 1, 2019 at 11:22 AM Jiang Xin <worldhello.net@gmail.com> wrote:
>> Add test cases for git pack-redundant to validate new algorithm for git
>> pack-redundant.
>>
>> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>> ---
>> diff --git a/t/t5323-pack-redundant.sh b/t/t5323-pack-redundant.sh
>> @@ -0,0 +1,510 @@
>> +# Note: DO NOT run it in a subshell, otherwise the variables will not be set
>
> Which variables won't be set? It's not clear what this restriction is about.

>> +       git -C "$repo" update-ref refs/heads/master $oid
>> +}
>> +
>> +# Note: DO NOT run it in a subshell, otherwise the variables will not be set
>> +create_pack_1 () {
>> +       P1=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
>
> Which variables? Note that you can capture output of a subshell into a
> variable, if necessary.

These helper functions set a bunch of variables $P1, $P2, etc. as
well as variables whose name begin with P and followed by 40-hex.
The script wants to use them later when preparing expected output,
and with the most natural way to organize the code, that "later"
happens in the process that would have spawned a subshell to run
this function.

It would have been easier for you to grok if the note instead said
"this function sets two global shell variables" or something,
perhaps?  Such a variable would certainly not be visible if this
function is called inside a subshell to the main process.
Eric Sunshine Feb. 1, 2019, 9:49 p.m. UTC | #3
On Fri, Feb 1, 2019 at 4:03 PM Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> > On Fri, Feb 1, 2019 at 11:22 AM Jiang Xin <worldhello.net@gmail.com> wrote:
> >> +# Note: DO NOT run it in a subshell, otherwise the variables will not be set
> >
> > Which variables won't be set? It's not clear what this restriction is about.
>
> >> +# Note: DO NOT run it in a subshell, otherwise the variables will not be set
> >> +create_pack_1 () {
> >> +       P1=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
> >
> > Which variables? Note that you can capture output of a subshell into a
> > variable, if necessary.
>
> These helper functions set a bunch of variables $P1, $P2, etc. as
> well as variables whose name begin with P and followed by 40-hex.
> The script wants to use them later when preparing expected output,
> and with the most natural way to organize the code, that "later"
> happens in the process that would have spawned a subshell to run
> this function.
>
> It would have been easier for you to grok if the note instead said
> "this function sets two global shell variables" or something,
> perhaps?  Such a variable would certainly not be visible if this
> function is called inside a subshell to the main process.

Yes, better function comments would facilitate comprehension both for
the reviewer and those working on the code in the future. For
instance:

    # Create commit for each argument [...with blah properties...] and
    # assign [...] to shell variable of same name as argument.
    # NOTE: Avoid calling this function from a subshell since variable
    # assignments will disappear when subshell exits.

or something.

But, looking more closely at the patch, I'm wondering why the various
create_pack_#() functions are defined at all since they are each only
ever called from a single place.

    create_pack_4 () {
        ...
        eval P$P4=P4:$P4
    }
   ...
   test_expect_success 'create pack 4, 5' '
        create_pack_4 && create_pack_5
    '

I haven't been able to convince myself that this helps readability --
especially since the function definition is often far removed from the
single point of use -- over merely inlining the function body directly
in the sole test which calls it.

Anyhow, this is all pretty minor, not necessarily worth a re-roll.
diff mbox series

Patch

diff --git a/t/t5323-pack-redundant.sh b/t/t5323-pack-redundant.sh
new file mode 100755
index 0000000000..d224ff3c50
--- /dev/null
+++ b/t/t5323-pack-redundant.sh
@@ -0,0 +1,510 @@ 
+#!/bin/sh
+#
+# Copyright (c) 2018 Jiang Xin
+#
+
+test_description='Test git pack-redundant
+
+In order to test git-pack-redundant, we will create a number of objects and
+packs in the repository `master.git`. The relationship between packs (P1-P8)
+and objects (T, A-R) is showed in the following chart. Objects of a pack will
+be marked with letter x, while objects of redundant packs will be marked with
+exclamation point, and redundant pack itself will be marked with asterisk.
+
+        | T A B C D E F G H I J K L M N O P Q R
+    ----+--------------------------------------
+    P1  | x x x x x x x                       x
+    P2* |     ! ! ! !   ! ! !
+    P3  |             x     x x x x x
+    P4* |                     ! ! ! !     !
+    P5  |               x x           x x
+    P6* |                             ! !   !
+    P7  |                                 x x
+    P8* |   !
+    ----+--------------------------------------
+    ALL | x x x x x x x x x x x x x x x x x x x
+
+Another repository `shared.git` has unique objects (X-Z), while other objects
+(marked with letter s) are shared through alt-odb (of `master.git`). The
+relationship between packs and objects is as follows:
+
+        | T A B C D E F G H I J K L M N O P Q R   X Y Z
+    ----+----------------------------------------------
+    Px1 |   s s s                                 x x x
+    Px2 |         s s s                           x x x
+'
+
+. ./test-lib.sh
+
+master_repo=master.git
+shared_repo=shared.git
+
+# Note: DO NOT run it in a subshell, otherwise the variables will not be set
+# Usage: create_commits_in <repo> A B C ...
+create_commits_in () {
+	repo="$1" &&
+	parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null) || parent=
+	T=$(git -C "$repo" write-tree) &&
+	shift &&
+	while test $# -gt 0
+	do
+		name=$1 &&
+		test_tick &&
+		if test -z "$parent"
+		then
+			oid=$(echo $name | git -C "$repo" commit-tree $T)
+		else
+			oid=$(echo $name | git -C "$repo" commit-tree -p $parent $T)
+		fi &&
+		eval $name=$oid &&
+		parent=$oid &&
+		shift ||
+		return 1
+	done
+	git -C "$repo" update-ref refs/heads/master $oid
+}
+
+# Note: DO NOT run it in a subshell, otherwise the variables will not be set
+create_pack_1 () {
+	P1=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$T
+		$A
+		$B
+		$C
+		$D
+		$E
+		$F
+		$R
+		EOF
+	) &&
+	eval P$P1=P1:$P1
+}
+
+create_pack_2 () {
+	P2=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$B
+		$C
+		$D
+		$E
+		$G
+		$H
+		$I
+		EOF
+	) &&
+	eval P$P2=P2:$P2
+}
+
+create_pack_3 () {
+	P3=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$F
+		$I
+		$J
+		$K
+		$L
+		$M
+		EOF
+	) &&
+	eval P$P3=P3:$P3
+}
+
+create_pack_4 () {
+	P4=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$J
+		$K
+		$L
+		$M
+		$P
+		EOF
+	) &&
+	eval P$P4=P4:$P4
+}
+
+create_pack_5 () {
+	P5=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$G
+		$H
+		$N
+		$O
+		EOF
+	) &&
+	eval P$P5=P5:$P5
+}
+
+create_pack_6 () {
+	P6=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$N
+		$O
+		$Q
+		EOF
+	) &&
+	eval P$P6=P6:$P6
+}
+
+create_pack_7 () {
+	P7=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$P
+		$Q
+		EOF
+	) &&
+	eval P$P7=P7:$P7
+}
+
+create_pack_8 () {
+	P8=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
+		$A
+		EOF
+	) &&
+	eval P$P8=P8:$P8
+}
+
+format_packfiles () {
+	sed \
+		-e "s#.*/pack-\(.*\)\.idx#\1#" \
+		-e "s#.*/pack-\(.*\)\.pack#\1#" |
+	sort -u |
+	while read p
+	do
+		if test -z "$(eval echo \${P$p})"
+		then
+			echo $p
+		else
+			eval echo "\${P$p}"
+		fi
+	done |
+	sort
+}
+
+test_expect_success 'setup master repo' '
+	git init --bare "$master_repo" &&
+	create_commits_in "$master_repo" A B C D E F G H I J K L M N O P Q R
+'
+
+#############################################################################
+# Chart of packs and objects for this test case
+#
+#         | T A B C D E F G H I J K L M N O P Q R
+#     ----+--------------------------------------
+#     P1  | x x x x x x x                       x
+#     P2  |     x x x x   x x x
+#     P3  |             x     x x x x x
+#     ----+--------------------------------------
+#     ALL | x x x x x x x x x x x x x x         x
+#
+#############################################################################
+test_expect_success 'no redundant for pack 1, 2, 3' '
+	create_pack_1 && create_pack_2 && create_pack_3 &&
+	(
+		cd "$master_repo" &&
+		git pack-redundant --all >out &&
+		test_must_be_empty out
+	)
+'
+
+test_expect_success 'create pack 4, 5' '
+	create_pack_4 && create_pack_5
+'
+
+#############################################################################
+# Chart of packs and objects for this test case
+#
+#         | T A B C D E F G H I J K L M N O P Q R
+#     ----+--------------------------------------
+#     P1  | x x x x x x x                       x
+#     P2* |     ! ! ! !   ! ! !
+#     P3  |             x     x x x x x
+#     P4  |                     x x x x     x
+#     P5  |               x x           x x
+#     ----+--------------------------------------
+#     ALL | x x x x x x x x x x x x x x x x x   x
+#
+#############################################################################
+test_expect_success 'one of pack-2/pack-3 is redundant' '
+	(
+		cd "$master_repo" &&
+		cat >expect <<-EOF &&
+			P2:$P2
+			EOF
+		git pack-redundant --all >out &&
+		format_packfiles <out >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'create pack 6, 7' '
+	create_pack_6 && create_pack_7
+'
+
+#############################################################################
+# Chart of packs and objects for this test case
+#
+#         | T A B C D E F G H I J K L M N O P Q R
+#     ----+--------------------------------------
+#     P1  | x x x x x x x                       x
+#     P2* |     ! ! ! !   ! ! !
+#     P3  |             x     x x x x x
+#     P4* |                     ! ! ! !     !
+#     P5  |               x x           x x
+#     P6* |                             ! !   !
+#     P7  |                                 x x
+#     ----+--------------------------------------
+#     ALL | x x x x x x x x x x x x x x x x x x x
+#
+#############################################################################
+test_expect_success 'pack 2, 4, and 6 are redundant' '
+	(
+		cd "$master_repo" &&
+		cat >expect <<-EOF &&
+			P2:$P2
+			P4:$P4
+			P6:$P6
+			EOF
+		git pack-redundant --all >out &&
+		format_packfiles <out >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'create pack 8' '
+	create_pack_8
+'
+
+#############################################################################
+# Chart of packs and objects for this test case
+#
+#         | T A B C D E F G H I J K L M N O P Q R
+#     ----+--------------------------------------
+#     P1  | x x x x x x x                       x
+#     P2* |     ! ! ! !   ! ! !
+#     P3  |             x     x x x x x
+#     P4* |                     ! ! ! !     !
+#     P5  |               x x           x x
+#     P6* |                             ! !   !
+#     P7  |                                 x x
+#     P8* |   !
+#     ----+--------------------------------------
+#     ALL | x x x x x x x x x x x x x x x x x x x
+#
+#############################################################################
+test_expect_success 'pack-8 (subset of pack-1) is also redundant' '
+	(
+		cd "$master_repo" &&
+		cat >expect <<-EOF &&
+			P2:$P2
+			P4:$P4
+			P6:$P6
+			P8:$P8
+			EOF
+		git pack-redundant --all >out &&
+		format_packfiles <out >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'clean loose objects' '
+	(
+		cd "$master_repo" &&
+		git prune-packed &&
+		find objects -type f | sed -e "/objects\/pack\//d" >out &&
+		test_must_be_empty out
+	)
+'
+
+test_expect_success 'remove redundant packs and pass fsck' '
+	(
+		cd "$master_repo" &&
+		git pack-redundant --all | xargs rm &&
+		git fsck &&
+		git pack-redundant --all >out &&
+		test_must_be_empty out
+	)
+'
+
+# The following test cases will execute inside `shared.git`, instead of
+# inside `master.git`.
+test_expect_success 'setup shared.git' '
+	git clone --mirror "$master_repo" "$shared_repo" &&
+	(
+		cd "$shared_repo" &&
+		printf "../../$master_repo/objects\n" >objects/info/alternates
+	)
+'
+
+test_expect_success 'no redundant packs without --alt-odb' '
+	(
+		cd "$shared_repo" &&
+		git pack-redundant --all >out &&
+		test_must_be_empty out
+	)
+'
+
+#############################################################################
+# Chart of packs and objects for this test case
+#
+#     ================ master.git ===============
+#         | T A B C D E F G H I J K L M N O P Q R  <----------+
+#     ----+--------------------------------------             |
+#     P1  | x x x x x x x                       x             |
+#     P3  |             x     x x x x x                       |
+#     P5  |               x x           x x                   |
+#     P7  |                                 x x               |
+#     ----+--------------------------------------             |
+#     ALL | x x x x x x x x x x x x x x x x x x x             |
+#                                                             |
+#                                                             |
+#     ================ shared.git ===============             |
+#         | T A B C D E F G H I J K L M N O P Q R  <objects/info/alternates>
+#     ----+--------------------------------------
+#     P1* | s s s s s s s                       s
+#     P3* |             s     s s s s s
+#     P5* |               s s           s s
+#     P7* |                                 s s
+#     ----+--------------------------------------
+#     ALL | x x x x x x x x x x x x x x x x x x x
+#
+#############################################################################
+test_expect_success 'pack-redundant --verbose: show duplicate packs in stderr' '
+	(
+		cd "$shared_repo" &&
+		cat >expect <<-EOF &&
+			P1:$P1
+			P3:$P3
+			P5:$P5
+			P7:$P7
+			EOF
+		git pack-redundant --all --verbose >out 2>out.err &&
+		test_must_be_empty out &&
+		grep "pack$" out.err | format_packfiles >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'remove redundant packs by alt-odb, no packs left' '
+	(
+		cd "$shared_repo" &&
+		cat >expect <<-EOF &&
+			fatal: Zero packs found!
+			EOF
+		git pack-redundant --all --alt-odb | xargs rm &&
+		git fsck &&
+		test_must_fail git pack-redundant --all --alt-odb >actual 2>&1 &&
+		test_cmp expect actual
+	)
+'
+
+# Note: DO NOT run function `create_pack_*` in sub shell, or variables are not set
+create_pack_x1_in () {
+	repo="$1" &&
+	Px1=$(git -C "$repo/objects/pack" pack-objects -q pack <<-EOF
+		$X
+		$Y
+		$Z
+		$A
+		$B
+		$C
+		EOF
+	) &&
+	eval P${Px1}=Px1:${Px1}
+}
+
+create_pack_x2_in () {
+	repo="$1" &&
+	Px2=$(git -C "$repo/objects/pack" pack-objects -q pack <<-EOF
+		$X
+		$Y
+		$Z
+		$D
+		$E
+		$F
+		EOF
+	) &&
+	eval P${Px2}=Px2:${Px2}
+}
+
+test_expect_success 'create new objects and packs in shared.git' '
+	create_commits_in "$shared_repo" X Y Z &&
+	create_pack_x1_in "$shared_repo" &&
+	create_pack_x2_in "$shared_repo"
+'
+
+test_expect_success 'no redundant without --alt-odb' '
+	(
+		cd "$shared_repo" &&
+		git pack-redundant --all >out &&
+		test_must_be_empty out
+	)
+'
+
+#############################################################################
+# Chart of packs and objects for this test case
+#
+#     ================ master.git ===============
+#         | T A B C D E F G H I J K L M N O P Q R  <----------------+
+#     ----+--------------------------------------                   |
+#     P1  | x x x x x x x                       x                   |
+#     P3  |             x     x x x x x                             |
+#     P5  |               x x           x x                         |
+#     P7  |                                 x x                     |
+#     ----+--------------------------------------                   |
+#     ALL | x x x x x x x x x x x x x x x x x x x                   |
+#                                                                   |
+#                                                                   |
+#     ================ shared.git =======================           |
+#         | T A B C D E F G H I J K L M N O P Q R   X Y Z <objects/info/alternates>
+#     ----+----------------------------------------------
+#     Px1 |   s s s                                 x x x
+#     Px2*|         s s s                           ! ! !
+#     ----+----------------------------------------------
+#     ALL | s s s s s s s s s s s s s s s s s s s   x x x
+#
+#############################################################################
+test_expect_success 'one pack is redundant' '
+	(
+		cd "$shared_repo" &&
+		git pack-redundant --all --alt-odb >out &&
+		format_packfiles <out >actual &&
+		test_line_count = 1 actual
+	)
+'
+
+#############################################################################
+# Chart of packs and objects for this test case
+#
+#     ================ master.git ===============
+#         | T A B C D E F G H I J K L M N O P Q R  <----------------+
+#     ----+--------------------------------------                   |
+#     P1  | x x x x x x x                       x                   |
+#     P3  |             x     x x x x x                             |
+#     P5  |               x x           x x                         |
+#     P7  |                                 x x                     |
+#     ----+--------------------------------------                   |
+#     ALL | x x x x x x x x x x x x x x x x x x x                   |
+#                                                                   |
+#                                                                   |
+#     ================ shared.git =======================           |
+#         | T A B C D E F G H I J K L M N O P Q R   X Y Z <objects/info/alternates>
+#     ----+----------------------------------------------
+#     Px1*|   s s s                                 i i i
+#     Px2*|         s s s                           i i i
+#     ----+----------------------------------------------
+#     ALL | s s s s s s s s s s s s s s s s s s s   i i i
+#                                                  (ignored objects, marked with i)
+#
+#############################################################################
+test_expect_success 'set ignore objects and all two packs are redundant' '
+	(
+		cd "$shared_repo" &&
+		cat >expect <<-EOF &&
+			Px1:$Px1
+			Px2:$Px2
+			EOF
+		git pack-redundant --all --alt-odb >out <<-EOF &&
+			$X
+			$Y
+			$Z
+			EOF
+		format_packfiles <out >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_done