diff mbox series

[v4,1/2] bundle: lost objects when removing duplicate pendings

Message ID 20210108144514.24805-2-worldhello.net@gmail.com (mailing list archive)
State New, archived
Headers show
Series Improvements for git-bundle | expand

Commit Message

Jiang Xin Jan. 8, 2021, 2:45 p.m. UTC
From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

`git rev-list` will list one commit for the following command:

    $ git rev-list 'main^!'
    <tip-commit-of-main-branch>

But providing the same rev-list args to `git bundle`, fail to create
a bundle file.

    $ git bundle create - 'main^!'
    # v2 git bundle
    -<OID> <one-line-message>

    fatal: Refusing to create empty bundle.

This is because when removing duplicate objects in function
`object_array_remove_duplicates()`, one unique pending object which has
the same name is deleted by mistake.  The revision arg 'main^!' in the
above example is parsed by `handle_revision_arg()`, and at lease two
different objects will be appended to `revs.pending`, one points to the
parent commit of the "main" branch, and the other points to the tip
commit of the "main" branch.  These two objects have the same name
"main".  Only one object is left with the name "main" after calling the
function `object_array_remove_duplicates()`.

And what's worse, when adding boundary commits into pending list, we use
one-line commit message as names, and the arbitory names may surprise
git-bundle.

Only comparing objects themselves (".item") is also not good enough,
because user may want to create a bundle with two identical objects but
with different reference names, such as: "HEAD" and "refs/heads/main".

Add new function `contains_object()` which compare both the address and
the name of the object.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
---
 object.c               |  10 +-
 t/t6020-bundle-misc.sh | 477 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 483 insertions(+), 4 deletions(-)
 create mode 100755 t/t6020-bundle-misc.sh

Comments

Junio C Hamano Jan. 9, 2021, 2:10 a.m. UTC | #1
Jiang Xin <worldhello.net@gmail.com> writes:

> diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
> new file mode 100755
> index 0000000000..c4447ca88f
> --- /dev/null
> +++ b/t/t6020-bundle-misc.sh
> @@ -0,0 +1,477 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2021 Jiang Xin
> +#
> +
> +test_description='Test git-bundle'
> +
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> +
> +. ./test-lib.sh
> +
> +# Check count of objects in a bundle file.
> +# We can use "--thin" opiton to check thin pack, which must be fixed by
> +# command `git-index-pack --fix-thin --stdin`.
> +test_bundle_object_count () {
> +	thin= &&
> +	if test "$1" = "--thin"
> +	then
> +		thin=yes
> +		shift
> +	fi &&
> +	if test $# -ne 2
> +	then
> +		echo >&2 "args should be: <bundle> <count>"
> +		return 1
> +	fi
> +	bundle=$1 &&
> +	pack=${bundle%.bdl}.pack &&
> +	convert_bundle_to_pack <"$bundle" >"$pack" &&
> +	if test -n "$thin"
> +	then
> +		test_must_fail git index-pack "$pack" &&

This is overly strict, isn't it?  

Imagine a case where the objects newer revisions introduce have *no*
resemblance to the objects in the prerequisites' trees---the
resulting pack will have no object that is expressed as a delta
against anything outside the pack, and the above "index-pack" would
succeed.

Besides, "git pack-objects --thin" is *not* obligated to create a
pack that lacks one or more objects.  The "--thin" option merely
*allows* pack-objects to omit base objects if it is convenient to do
so.

> +		mv "$pack" "$pack"-thin &&
> +		cat "$pack"-thin |
> +			git index-pack --stdin --fix-thin "$pack"

This side is good, but do not cat a single file into a pipe.
The whole "then" clause would become

	then
		mv "$pack" "$pack-thin" &&
		git index-pack --stdin --fix-thin "$pack" <"$pack-thin"
	else

I would think.

> +	else
> +		git index-pack "$pack"
> +	fi &&
> +	git verify-pack -v "$pack" >verify.out
> +	if test $? -ne 0
> +	then
> +		echo >&2 "error: fail to convert $bundle to $pack"
> +		return 1
> +	fi

At this point, we are not testing the bundle subcommand, but is
testing "git index-pack --fix-thin" that we run ourselves.  Is it
essential to ensure $pack is sane here?  I doubt it.

> +	count=$(grep -c "^$OID_REGEX " verify.out) &&

And if there is no need to run verify-pack, then we can do
count=$(git show-index "${pack%pack}idx" | wc -l) instead, perhaps?

> +	test $2 = $count && return 0
> +	echo >&2 "error: object count for $bundle is $count, not $2"
> +	return 1
> +}
> +
> +# Display the pack data contained in the bundle file, bypassing the
> +# header that contains the signature, prerequisites and references.
> +convert_bundle_to_pack () {
> +	while read x && test -n "$x"
> +	do
> +		:;
> +	done
> +	cat
> +}

This looks somewhat familiar.  Perhaps extract out necessary helpers
including this one into t/test-bundle-lib or something similar in a
preparatory step before this patch?

> +# Create a commit or tag and set the variable with the object ID.
> +test_commit_setvar () {
> +	notick= &&
> +	signoff= &&
> +	indir= &&
> +	merge= &&
> +	tag= &&
> +	var= &&
> +	while test $# != 0
> +	do
> +		case "$1" in
> +		--merge)
> +			merge=yes
> +			;;
> +		--tag)
> +			tag=yes
> +			;;
> +		--notick)
> +			notick=yes
> +			;;
> +		--signoff)
> +			signoff="$1"
> +			;;
> +		-C)
> +			indir="$2"
> +			shift
> +			;;
> +		-*)
> +			echo >&2 "error: unknown option $1"
> +			return 1
> +			;;
> +		*)
> +			test -n "$var" && break
> +			var=$1
> +			;;
> +		esac
> +		shift
> +	done &&

At this point, if $var is still empty, the caller is buggy, and ...

> +	indir=${indir:+"$indir"/} &&
> +	if test $# -eq 0
> +	then
> +		echo >&2 "no args provided"
> +		return 1
> +	fi &&
> +	if test -z "$notick"
> +	then
> +		test_tick
> +	fi &&
> +	if test -n "$merge"
> +	then
> +		git ${indir:+ -C "$indir"} merge --no-edit --no-ff \
> +			${2:+-m "$2"} "$1" &&
> +		oid=$(git ${indir:+ -C "$indir"} rev-parse HEAD)
> +	elif test -n "$tag"
> +	then
> +		git ${indir:+ -C "$indir"} tag -m "$1" "$1" &&
> +		oid=$(git ${indir:+ -C "$indir"} rev-parse "$1")
> +	else
> +		file=${2:-"$1.t"} &&
> +		echo "${3-$1}" > "$indir$file" &&
> +		git ${indir:+ -C "$indir"} add "$file" &&
> +		git ${indir:+ -C "$indir"} commit $signoff -m "$1" &&
> +		oid=$(git ${indir:+ -C "$indir"} rev-parse HEAD)
> +	fi &&
> +	eval $var=$oid
> +}

... it will cause a failure in 'eval' we have here.  Not good.

> +# Format the output of git commands to make a user-friendly and stable
> +# text.  We can easily prepare the expect text without having to worry
> +# about future changes of the commit ID and spaces of the output.

Hmph.  This relies on 7 hexdigits being sufficient to uniquely
identify all objects involved in the test?  It should be OK in
practice.

Is there a point in having both <COMMIT-A> and <OID-A>?  I would
have expected that all these "full object name" conversions are
unneeded.

> +make_user_friendly_and_stable_output () {
> +	sed \
> +		-e "s/$A/<COMMIT-A>/" \
> +		-e "s/$B/<COMMIT-B>/" \
> +		-e "s/$C/<COMMIT-C>/" \
> +		-e "s/$D/<COMMIT-D>/" \
> +		-e "s/$E/<COMMIT-E>/" \
> +		-e "s/$F/<COMMIT-F>/" \
> +		-e "s/$G/<COMMIT-G>/" \
> +		-e "s/$H/<COMMIT-H>/" \
> +		-e "s/$I/<COMMIT-I>/" \
> +		-e "s/$J/<COMMIT-J>/" \
> +		-e "s/$K/<COMMIT-K>/" \
> +		-e "s/$L/<COMMIT-L>/" \
> +		-e "s/$M/<COMMIT-M>/" \
> +		-e "s/$N/<COMMIT-N>/" \
> +		-e "s/$O/<COMMIT-O>/" \
> +		-e "s/$P/<COMMIT-P>/" \
> +		-e "s/$TAG1/<TAG-1>/" \
> +		-e "s/$TAG2/<TAG-2>/" \
> +		-e "s/$TAG3/<TAG-3>/" \
> +		-e "s/$(echo $A | cut -c1-7)[0-9a-f]*/<OID-A>/g" \
> +		-e "s/$(echo $B | cut -c1-7)[0-9a-f]*/<OID-B>/g" \
> +		-e "s/$(echo $C | cut -c1-7)[0-9a-f]*/<OID-C>/g" \
> +		-e "s/$(echo $D | cut -c1-7)[0-9a-f]*/<OID-D>/g" \
> +		-e "s/$(echo $E | cut -c1-7)[0-9a-f]*/<OID-E>/g" \
> +		-e "s/$(echo $F | cut -c1-7)[0-9a-f]*/<OID-F>/g" \
> +		-e "s/$(echo $G | cut -c1-7)[0-9a-f]*/<OID-G>/g" \
> +		-e "s/$(echo $H | cut -c1-7)[0-9a-f]*/<OID-H>/g" \
> +		-e "s/$(echo $I | cut -c1-7)[0-9a-f]*/<OID-I>/g" \
> +		-e "s/$(echo $J | cut -c1-7)[0-9a-f]*/<OID-J>/g" \
> +		-e "s/$(echo $K | cut -c1-7)[0-9a-f]*/<OID-K>/g" \
> +		-e "s/$(echo $L | cut -c1-7)[0-9a-f]*/<OID-L>/g" \
> +		-e "s/$(echo $M | cut -c1-7)[0-9a-f]*/<OID-M>/g" \
> +		-e "s/$(echo $N | cut -c1-7)[0-9a-f]*/<OID-N>/g" \
> +		-e "s/$(echo $O | cut -c1-7)[0-9a-f]*/<OID-O>/g" \
> +		-e "s/$(echo $P | cut -c1-7)[0-9a-f]*/<OID-P>/g" \
> +		-e "s/$(echo $TAG1 | cut -c1-7)[0-9a-f]*/<OID-TAG-1>/g" \
> +		-e "s/$(echo $TAG2 | cut -c1-7)[0-9a-f]*/<OID-TAG-2>/g" \
> +		-e "s/$(echo $TAG3 | cut -c1-7)[0-9a-f]*/<OID-TAG-3>/g" \
> +		-e "s/ *\$//"
> +}
> ...
> +test_expect_success 'create bundle from special rev: main^!' '
> +	git bundle create special-rev.bdl "main^!" &&
> +
> +	git bundle list-heads special-rev.bdl |
> +		make_user_friendly_and_stable_output >actual &&
> +	cat >expect <<-EOF &&
> +		<COMMIT-P> refs/heads/main
> +		EOF

We prefer to indent these more like so:

	cat >expect <<-\EOF &&
	<COMMIT-P> refs/heads/main
	EOF

i.e. the indent of the line with <<EOF on it and the indent of the
line with the matching EOF are the same.  Also, quote EOF to signal
that the body of the here text should be taken as-is without $var
substitution.
Jiang Xin Jan. 9, 2021, 1:32 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> 于2021年1月9日周六 上午10:11写道:
>
> Jiang Xin <worldhello.net@gmail.com> writes:
>
> > diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
> > new file mode 100755
> > index 0000000000..c4447ca88f
> > --- /dev/null
> > +++ b/t/t6020-bundle-misc.sh
> > @@ -0,0 +1,477 @@
> > +#!/bin/sh
> > +#
> > +# Copyright (c) 2021 Jiang Xin
> > +#
> > +
> > +test_description='Test git-bundle'
> > +
> > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
> > +
> > +. ./test-lib.sh
> > +
> > +# Check count of objects in a bundle file.
> > +# We can use "--thin" opiton to check thin pack, which must be fixed by
> > +# command `git-index-pack --fix-thin --stdin`.
> > +test_bundle_object_count () {
> > +     thin= &&
> > +     if test "$1" = "--thin"
> > +     then
> > +             thin=yes
> > +             shift
> > +     fi &&
> > +     if test $# -ne 2
> > +     then
> > +             echo >&2 "args should be: <bundle> <count>"
> > +             return 1
> > +     fi
> > +     bundle=$1 &&
> > +     pack=${bundle%.bdl}.pack &&
> > +     convert_bundle_to_pack <"$bundle" >"$pack" &&
> > +     if test -n "$thin"
> > +     then
> > +             test_must_fail git index-pack "$pack" &&
>
> This is overly strict, isn't it?

I want to make sure that the created bundle file which contains a thin
pack is not changed, but now I think this check is not necessary,
testing the count of objects is enough.

> Imagine a case where the objects newer revisions introduce have *no*
> resemblance to the objects in the prerequisites' trees---the
> resulting pack will have no object that is expressed as a delta
> against anything outside the pack, and the above "index-pack" would
> succeed.
>
> Besides, "git pack-objects --thin" is *not* obligated to create a
> pack that lacks one or more objects.  The "--thin" option merely
> *allows* pack-objects to omit base objects if it is convenient to do
> so.
>
> > +             mv "$pack" "$pack"-thin &&
> > +             cat "$pack"-thin |
> > +                     git index-pack --stdin --fix-thin "$pack"
>
> This side is good, but do not cat a single file into a pipe.
> The whole "then" clause would become
>
>         then
>                 mv "$pack" "$pack-thin" &&
>                 git index-pack --stdin --fix-thin "$pack" <"$pack-thin"
>         else
>
> I would think.

That's better.

> > +     else
> > +             git index-pack "$pack"
> > +     fi &&
> > +     git verify-pack -v "$pack" >verify.out
> > +     if test $? -ne 0
> > +     then
> > +             echo >&2 "error: fail to convert $bundle to $pack"
> > +             return 1
> > +     fi
>
> At this point, we are not testing the bundle subcommand, but is
> testing "git index-pack --fix-thin" that we run ourselves.  Is it
> essential to ensure $pack is sane here?  I doubt it.

If not check the return error code of `git index-pack`, the report
message will be "error: object count for $bundle is , not $2". I want
to give a specific error message for developer.

> > +     count=$(grep -c "^$OID_REGEX " verify.out) &&
>
> And if there is no need to run verify-pack, then we can do
> count=$(git show-index "${pack%pack}idx" | wc -l) instead, perhaps?

Will do.

> > +     test $2 = $count && return 0
> > +     echo >&2 "error: object count for $bundle is $count, not $2"
> > +     return 1
> > +}
> > +
> > +# Display the pack data contained in the bundle file, bypassing the
> > +# header that contains the signature, prerequisites and references.
> > +convert_bundle_to_pack () {
> > +     while read x && test -n "$x"
> > +     do
> > +             :;
> > +     done
> > +     cat
> > +}
>
> This looks somewhat familiar.  Perhaps extract out necessary helpers
> including this one into t/test-bundle-lib or something similar in a
> preparatory step before this patch?
>
> > +# Create a commit or tag and set the variable with the object ID.
> > +test_commit_setvar () {
> > +     notick= &&
> > +     signoff= &&
> > +     indir= &&
> > +     merge= &&
> > +     tag= &&
> > +     var= &&
> > +     while test $# != 0
> > +     do
> > +             case "$1" in
> > +             --merge)
> > +                     merge=yes
> > +                     ;;
> > +             --tag)
> > +                     tag=yes
> > +                     ;;
> > +             --notick)
> > +                     notick=yes
> > +                     ;;
> > +             --signoff)
> > +                     signoff="$1"
> > +                     ;;
> > +             -C)
> > +                     indir="$2"
> > +                     shift
> > +                     ;;
> > +             -*)
> > +                     echo >&2 "error: unknown option $1"
> > +                     return 1
> > +                     ;;
> > +             *)
> > +                     test -n "$var" && break
> > +                     var=$1

The loop ends only if $var has been assigned a value, or no other
args.  Will report error if no other args later.

> > +                     ;;
> > +             esac
> > +             shift
> > +     done &&
>
> At this point, if $var is still empty, the caller is buggy, and ...

See the above note.

> > +     indir=${indir:+"$indir"/} &&
> > +     if test $# -eq 0
> > +     then
> > +             echo >&2 "no args provided"
> > +             return 1
> > +     fi &&
> > +     if test -z "$notick"
> > +     then
> > +             test_tick
> > +     fi &&
> > +     if test -n "$merge"
> > +     then
> > +             git ${indir:+ -C "$indir"} merge --no-edit --no-ff \
> > +                     ${2:+-m "$2"} "$1" &&
> > +             oid=$(git ${indir:+ -C "$indir"} rev-parse HEAD)
> > +     elif test -n "$tag"
> > +     then
> > +             git ${indir:+ -C "$indir"} tag -m "$1" "$1" &&
> > +             oid=$(git ${indir:+ -C "$indir"} rev-parse "$1")
> > +     else
> > +             file=${2:-"$1.t"} &&
> > +             echo "${3-$1}" > "$indir$file" &&
> > +             git ${indir:+ -C "$indir"} add "$file" &&
> > +             git ${indir:+ -C "$indir"} commit $signoff -m "$1" &&
> > +             oid=$(git ${indir:+ -C "$indir"} rev-parse HEAD)
> > +     fi &&
> > +     eval $var=$oid
> > +}
>
> ... it will cause a failure in 'eval' we have here.  Not good.
>
> > +# Format the output of git commands to make a user-friendly and stable
> > +# text.  We can easily prepare the expect text without having to worry
> > +# about future changes of the commit ID and spaces of the output.
>
> Hmph.  This relies on 7 hexdigits being sufficient to uniquely
> identify all objects involved in the test?  It should be OK in
> practice.
>
> Is there a point in having both <COMMIT-A> and <OID-A>?  I would
> have expected that all these "full object name" conversions are
> unneeded.

Will do.

> > +make_user_friendly_and_stable_output () {
> > +     sed \
> > +             -e "s/$A/<COMMIT-A>/" \
> > +             -e "s/$B/<COMMIT-B>/" \
> > +             -e "s/$C/<COMMIT-C>/" \
> > +             -e "s/$D/<COMMIT-D>/" \
> > +             -e "s/$E/<COMMIT-E>/" \
> > +             -e "s/$F/<COMMIT-F>/" \
> > +             -e "s/$G/<COMMIT-G>/" \
> > +             -e "s/$H/<COMMIT-H>/" \
> > +             -e "s/$I/<COMMIT-I>/" \
> > +             -e "s/$J/<COMMIT-J>/" \
> > +             -e "s/$K/<COMMIT-K>/" \
> > +             -e "s/$L/<COMMIT-L>/" \
> > +             -e "s/$M/<COMMIT-M>/" \
> > +             -e "s/$N/<COMMIT-N>/" \
> > +             -e "s/$O/<COMMIT-O>/" \
> > +             -e "s/$P/<COMMIT-P>/" \
> > +             -e "s/$TAG1/<TAG-1>/" \
> > +             -e "s/$TAG2/<TAG-2>/" \
> > +             -e "s/$TAG3/<TAG-3>/" \
> > +             -e "s/$(echo $A | cut -c1-7)[0-9a-f]*/<OID-A>/g" \
> > +             -e "s/$(echo $B | cut -c1-7)[0-9a-f]*/<OID-B>/g" \
> > +             -e "s/$(echo $C | cut -c1-7)[0-9a-f]*/<OID-C>/g" \
> > +             -e "s/$(echo $D | cut -c1-7)[0-9a-f]*/<OID-D>/g" \
> > +             -e "s/$(echo $E | cut -c1-7)[0-9a-f]*/<OID-E>/g" \
> > +             -e "s/$(echo $F | cut -c1-7)[0-9a-f]*/<OID-F>/g" \
> > +             -e "s/$(echo $G | cut -c1-7)[0-9a-f]*/<OID-G>/g" \
> > +             -e "s/$(echo $H | cut -c1-7)[0-9a-f]*/<OID-H>/g" \
> > +             -e "s/$(echo $I | cut -c1-7)[0-9a-f]*/<OID-I>/g" \
> > +             -e "s/$(echo $J | cut -c1-7)[0-9a-f]*/<OID-J>/g" \
> > +             -e "s/$(echo $K | cut -c1-7)[0-9a-f]*/<OID-K>/g" \
> > +             -e "s/$(echo $L | cut -c1-7)[0-9a-f]*/<OID-L>/g" \
> > +             -e "s/$(echo $M | cut -c1-7)[0-9a-f]*/<OID-M>/g" \
> > +             -e "s/$(echo $N | cut -c1-7)[0-9a-f]*/<OID-N>/g" \
> > +             -e "s/$(echo $O | cut -c1-7)[0-9a-f]*/<OID-O>/g" \
> > +             -e "s/$(echo $P | cut -c1-7)[0-9a-f]*/<OID-P>/g" \
> > +             -e "s/$(echo $TAG1 | cut -c1-7)[0-9a-f]*/<OID-TAG-1>/g" \
> > +             -e "s/$(echo $TAG2 | cut -c1-7)[0-9a-f]*/<OID-TAG-2>/g" \
> > +             -e "s/$(echo $TAG3 | cut -c1-7)[0-9a-f]*/<OID-TAG-3>/g" \
> > +             -e "s/ *\$//"
> > +}
> > ...
> > +test_expect_success 'create bundle from special rev: main^!' '
> > +     git bundle create special-rev.bdl "main^!" &&
> > +
> > +     git bundle list-heads special-rev.bdl |
> > +             make_user_friendly_and_stable_output >actual &&
> > +     cat >expect <<-EOF &&
> > +             <COMMIT-P> refs/heads/main
> > +             EOF
>
> We prefer to indent these more like so:
>
>         cat >expect <<-\EOF &&
>         <COMMIT-P> refs/heads/main
>         EOF
>
> i.e. the indent of the line with <<EOF on it and the indent of the
> line with the matching EOF are the same.  Also, quote EOF to signal
> that the body of the here text should be taken as-is without $var
> substitution.
>
Jiang Xin Jan. 9, 2021, 3:09 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> 于2021年1月9日周六 上午10:11写道:
> > +# Display the pack data contained in the bundle file, bypassing the
> > +# header that contains the signature, prerequisites and references.
> > +convert_bundle_to_pack () {
> > +     while read x && test -n "$x"
> > +     do
> > +             :;
> > +     done
> > +     cat
> > +}
>
> This looks somewhat familiar.  Perhaps extract out necessary helpers
> including this one into t/test-bundle-lib or something similar in a
> preparatory step before this patch?

Will add a new helper "t/test-bundle-functions.sh". But as how to
include this helper, there are several solutions, which is better?

1. For those scripts which use  these shared functions include this helper like:

        . ./test-lib.sh
        # current directory changed, so add path prefix
        .  "$TEST_DIRECTORY"/test-bundle-functions.sh

2. Include "test-bundle-functions.sh" in "test-lib.sh" like this:

        . "$TEST_DIRECTORY/test-lib-functions.sh"
        . "$TEST_DIRECTORY/test-bundle-functions.sh"

3. Create a "t/test-lib.d/" directory, add add
“test-bundle-functions.sh” inside "t/test-lib.d/" as an extension,
just like chriscool introduced in sharness project.

----8<-----
 # test_perf subshells can have them too
 . "$TEST_DIRECTORY/test-lib-functions.sh"

+if test -d "$TEST_DIRECTORY/test-lib.d"
+then
+       for file in "$TEST_DIRECTORY/test-lib.d/"*.sh
+       do
+               # Ensure glob was not an empty match:
+               test -e "${file}" || break
+
+               if test -n "$debug"
+               then
+                       echo >&5 "test-lib: loading extensions from ${file}"
+               fi
+               . "${file}"
+               if test $? != 0
+               then
+                       echo >&5 "test-lib: Error loading ${file}. Aborting."
+                       exit 1
+               fi
+       done
+fi
+
----->8-----
Junio C Hamano Jan. 9, 2021, 10:02 p.m. UTC | #4
Jiang Xin <worldhello.net@gmail.com> writes:

>> > +# Create a commit or tag and set the variable with the object ID.
>> > +test_commit_setvar () {
>> > +     notick= &&
>> > +     signoff= &&
>> > +     indir= &&
>> > +     merge= &&
>> > +     tag= &&
>> > +     var= &&
>> > +     while test $# != 0
>> > +     do
>> > +             case "$1" in
>> > ...
>> > +             -*)
>> > +                     echo >&2 "error: unknown option $1"
>> > +                     return 1
>> > +                     ;;
>> > +             *)
>> > +                     test -n "$var" && break
>> > +                     var=$1
>
> The loop ends only if $var has been assigned a value, or no other
> args.  Will report error if no other args later.

"We see an arg that is not an dashed option.  We already saw an arg
and taken it as the name of the variable, so break" was a misleading
way to structure this loop.

It looked as if it was just refusing to parse the remainder of the
command line, so that a check after the loop would complain if there
is still remaining arguments (as if to warn "var given twice").  But
that is not what the post-loop check does; it expects there still
are some argument left to be processed in mode-specific code that
follows, so it happens to work as intended.

That is brittle, though.  The current code may always consume one or
more extra arguments in $merge/$tag/other specific mode in the code
after the loop, but a new mode that will get added in the future to
sit next to --merge and --tag may learn all necessary info in the
command line parsing loop above, without any need for extra args to
be processed after the loop.  And $#=0 may not always be an error at
that point.

I forgot to notice / mention it, but now you made me to look at the
loop again, I see this part

	-C)
		indir="$2"
		shift
		;;

does not ensure we are getting something sensible in -C; that
potential bug by the caller also happens to be covered by the
post-loop "we require at least one argument that we can use as an
arg" check, but as I said already, that feels rather brittle.

Anyway, let's queue the patches as-is and see what others think.

Thanks.
Junio C Hamano Jan. 9, 2021, 10:02 p.m. UTC | #5
Jiang Xin <worldhello.net@gmail.com> writes:

> 1. For those scripts which use  these shared functions include this helper like:
>
>         . ./test-lib.sh
>         # current directory changed, so add path prefix
>         .  "$TEST_DIRECTORY"/test-bundle-functions.sh

This one.  But without double spaces after the dot.
diff mbox series

Patch

diff --git a/object.c b/object.c
index 68f80b0b3d..98017bed8e 100644
--- a/object.c
+++ b/object.c
@@ -412,15 +412,16 @@  void object_array_clear(struct object_array *array)
 }
 
 /*
- * Return true iff array already contains an entry with name.
+ * Return true if array already contains an entry.
  */
-static int contains_name(struct object_array *array, const char *name)
+static int contains_object(struct object_array *array,
+			   const struct object *item, const char *name)
 {
 	unsigned nr = array->nr, i;
 	struct object_array_entry *object = array->objects;
 
 	for (i = 0; i < nr; i++, object++)
-		if (!strcmp(object->name, name))
+		if (item == object->item && !strcmp(object->name, name))
 			return 1;
 	return 0;
 }
@@ -432,7 +433,8 @@  void object_array_remove_duplicates(struct object_array *array)
 
 	array->nr = 0;
 	for (src = 0; src < nr; src++) {
-		if (!contains_name(array, objects[src].name)) {
+		if (!contains_object(array, objects[src].item,
+				     objects[src].name)) {
 			if (src != array->nr)
 				objects[array->nr] = objects[src];
 			array->nr++;
diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
new file mode 100755
index 0000000000..c4447ca88f
--- /dev/null
+++ b/t/t6020-bundle-misc.sh
@@ -0,0 +1,477 @@ 
+#!/bin/sh
+#
+# Copyright (c) 2021 Jiang Xin
+#
+
+test_description='Test git-bundle'
+
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+
+. ./test-lib.sh
+
+# Check count of objects in a bundle file.
+# We can use "--thin" opiton to check thin pack, which must be fixed by
+# command `git-index-pack --fix-thin --stdin`.
+test_bundle_object_count () {
+	thin= &&
+	if test "$1" = "--thin"
+	then
+		thin=yes
+		shift
+	fi &&
+	if test $# -ne 2
+	then
+		echo >&2 "args should be: <bundle> <count>"
+		return 1
+	fi
+	bundle=$1 &&
+	pack=${bundle%.bdl}.pack &&
+	convert_bundle_to_pack <"$bundle" >"$pack" &&
+	if test -n "$thin"
+	then
+		test_must_fail git index-pack "$pack" &&
+		mv "$pack" "$pack"-thin &&
+		cat "$pack"-thin |
+			git index-pack --stdin --fix-thin "$pack"
+	else
+		git index-pack "$pack"
+	fi &&
+	git verify-pack -v "$pack" >verify.out
+	if test $? -ne 0
+	then
+		echo >&2 "error: fail to convert $bundle to $pack"
+		return 1
+	fi
+	count=$(grep -c "^$OID_REGEX " verify.out) &&
+	test $2 = $count && return 0
+	echo >&2 "error: object count for $bundle is $count, not $2"
+	return 1
+}
+
+# Display the pack data contained in the bundle file, bypassing the
+# header that contains the signature, prerequisites and references.
+convert_bundle_to_pack () {
+	while read x && test -n "$x"
+	do
+		:;
+	done
+	cat
+}
+
+# Create a commit or tag and set the variable with the object ID.
+test_commit_setvar () {
+	notick= &&
+	signoff= &&
+	indir= &&
+	merge= &&
+	tag= &&
+	var= &&
+	while test $# != 0
+	do
+		case "$1" in
+		--merge)
+			merge=yes
+			;;
+		--tag)
+			tag=yes
+			;;
+		--notick)
+			notick=yes
+			;;
+		--signoff)
+			signoff="$1"
+			;;
+		-C)
+			indir="$2"
+			shift
+			;;
+		-*)
+			echo >&2 "error: unknown option $1"
+			return 1
+			;;
+		*)
+			test -n "$var" && break
+			var=$1
+			;;
+		esac
+		shift
+	done &&
+	indir=${indir:+"$indir"/} &&
+	if test $# -eq 0
+	then
+		echo >&2 "no args provided"
+		return 1
+	fi &&
+	if test -z "$notick"
+	then
+		test_tick
+	fi &&
+	if test -n "$merge"
+	then
+		git ${indir:+ -C "$indir"} merge --no-edit --no-ff \
+			${2:+-m "$2"} "$1" &&
+		oid=$(git ${indir:+ -C "$indir"} rev-parse HEAD)
+	elif test -n "$tag"
+	then
+		git ${indir:+ -C "$indir"} tag -m "$1" "$1" &&
+		oid=$(git ${indir:+ -C "$indir"} rev-parse "$1")
+	else
+		file=${2:-"$1.t"} &&
+		echo "${3-$1}" > "$indir$file" &&
+		git ${indir:+ -C "$indir"} add "$file" &&
+		git ${indir:+ -C "$indir"} commit $signoff -m "$1" &&
+		oid=$(git ${indir:+ -C "$indir"} rev-parse HEAD)
+	fi &&
+	eval $var=$oid
+}
+
+
+# Format the output of git commands to make a user-friendly and stable
+# text.  We can easily prepare the expect text without having to worry
+# about future changes of the commit ID and spaces of the output.
+make_user_friendly_and_stable_output () {
+	sed \
+		-e "s/$A/<COMMIT-A>/" \
+		-e "s/$B/<COMMIT-B>/" \
+		-e "s/$C/<COMMIT-C>/" \
+		-e "s/$D/<COMMIT-D>/" \
+		-e "s/$E/<COMMIT-E>/" \
+		-e "s/$F/<COMMIT-F>/" \
+		-e "s/$G/<COMMIT-G>/" \
+		-e "s/$H/<COMMIT-H>/" \
+		-e "s/$I/<COMMIT-I>/" \
+		-e "s/$J/<COMMIT-J>/" \
+		-e "s/$K/<COMMIT-K>/" \
+		-e "s/$L/<COMMIT-L>/" \
+		-e "s/$M/<COMMIT-M>/" \
+		-e "s/$N/<COMMIT-N>/" \
+		-e "s/$O/<COMMIT-O>/" \
+		-e "s/$P/<COMMIT-P>/" \
+		-e "s/$TAG1/<TAG-1>/" \
+		-e "s/$TAG2/<TAG-2>/" \
+		-e "s/$TAG3/<TAG-3>/" \
+		-e "s/$(echo $A | cut -c1-7)[0-9a-f]*/<OID-A>/g" \
+		-e "s/$(echo $B | cut -c1-7)[0-9a-f]*/<OID-B>/g" \
+		-e "s/$(echo $C | cut -c1-7)[0-9a-f]*/<OID-C>/g" \
+		-e "s/$(echo $D | cut -c1-7)[0-9a-f]*/<OID-D>/g" \
+		-e "s/$(echo $E | cut -c1-7)[0-9a-f]*/<OID-E>/g" \
+		-e "s/$(echo $F | cut -c1-7)[0-9a-f]*/<OID-F>/g" \
+		-e "s/$(echo $G | cut -c1-7)[0-9a-f]*/<OID-G>/g" \
+		-e "s/$(echo $H | cut -c1-7)[0-9a-f]*/<OID-H>/g" \
+		-e "s/$(echo $I | cut -c1-7)[0-9a-f]*/<OID-I>/g" \
+		-e "s/$(echo $J | cut -c1-7)[0-9a-f]*/<OID-J>/g" \
+		-e "s/$(echo $K | cut -c1-7)[0-9a-f]*/<OID-K>/g" \
+		-e "s/$(echo $L | cut -c1-7)[0-9a-f]*/<OID-L>/g" \
+		-e "s/$(echo $M | cut -c1-7)[0-9a-f]*/<OID-M>/g" \
+		-e "s/$(echo $N | cut -c1-7)[0-9a-f]*/<OID-N>/g" \
+		-e "s/$(echo $O | cut -c1-7)[0-9a-f]*/<OID-O>/g" \
+		-e "s/$(echo $P | cut -c1-7)[0-9a-f]*/<OID-P>/g" \
+		-e "s/$(echo $TAG1 | cut -c1-7)[0-9a-f]*/<OID-TAG-1>/g" \
+		-e "s/$(echo $TAG2 | cut -c1-7)[0-9a-f]*/<OID-TAG-2>/g" \
+		-e "s/$(echo $TAG3 | cut -c1-7)[0-9a-f]*/<OID-TAG-3>/g" \
+		-e "s/ *\$//"
+}
+
+#            (C)   (D, pull/1/head, topic/1)
+#             o --- o
+#            /       \                              (L)
+#           /         \        o (H, topic/2)             (M, tag:v2)
+#          /    (F)    \      /                                 (N, tag:v3)
+#         /      o --------- o (G, pull/2/head)      o --- o --- o (release)
+#        /      /        \    \                      /       \
+#  o --- o --- o -------- o -- o ------------------ o ------- o --- o (main)
+# (A)   (B)  (E, tag:v1) (I)  (J)                  (K)       (O)   (P)
+#
+test_expect_success 'setup' '
+	# Try to make a stable fixed width for abbreviated commit ID,
+	# this fixed-width oid will be replaced with "<OID>".
+	git config core.abbrev 7 &&
+
+	# branch main: commit A & B
+	test_commit_setvar A "Commit A" main.txt &&
+	test_commit_setvar B "Commit B" main.txt &&
+
+	# branch topic/1: commit C & D, refs/pull/1/head
+	git checkout -b topic/1 &&
+	test_commit_setvar C "Commit C" topic-1.txt &&
+	test_commit_setvar D "Commit D" topic-1.txt &&
+	git update-ref refs/pull/1/head HEAD &&
+
+	# branch topic/1: commit E, tag v1
+	git checkout main &&
+	test_commit_setvar E "Commit E" main.txt &&
+	test_commit_setvar TAG1 --tag v1 &&
+
+	# branch topic/2: commit F & G, refs/pull/2/head
+	git checkout -b topic/2 &&
+	test_commit_setvar F "Commit F" topic-2.txt &&
+	test_commit_setvar G "Commit G" topic-2.txt &&
+	git update-ref refs/pull/2/head HEAD &&
+	test_commit_setvar H "Commit H" topic-2.txt &&
+
+	# branch main: merge commit I & J
+	git checkout main &&
+	test_commit_setvar I --merge topic/1 "Merge commit I" &&
+	test_commit_setvar J --merge refs/pull/2/head "Merge commit J" &&
+
+	# branch main: commit K
+	git checkout main &&
+	test_commit_setvar K "Commit K" main.txt &&
+
+	# branch release:
+	git checkout -b release &&
+	test_commit_setvar L "Commit L" release.txt &&
+	test_commit_setvar M "Commit M" release.txt &&
+	test_commit_setvar TAG2 --tag v2 &&
+	test_commit_setvar N "Commit N" release.txt &&
+	test_commit_setvar TAG3 --tag v3 &&
+
+	# branch main: merge commit O, commit P
+	git checkout main &&
+	test_commit_setvar O --merge tags/v2 "Merge commit O" &&
+	test_commit_setvar P "Commit P" main.txt
+'
+
+test_expect_success 'create bundle from special rev: main^!' '
+	git bundle create special-rev.bdl "main^!" &&
+
+	git bundle list-heads special-rev.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		<COMMIT-P> refs/heads/main
+		EOF
+	test_i18ncmp expect actual &&
+
+	git bundle verify special-rev.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		The bundle contains this ref:
+		<COMMIT-P> refs/heads/main
+		The bundle requires this ref:
+		<COMMIT-O>
+		EOF
+	test_i18ncmp expect actual &&
+
+	test_bundle_object_count special-rev.bdl 3
+'
+
+test_expect_success 'create bundle with --max-count option' '
+	git bundle create max-count.bdl --max-count 1 \
+		main \
+		"^release" \
+		refs/tags/v1 \
+		refs/pull/1/head \
+		refs/pull/2/head &&
+
+	git bundle list-heads max-count.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		<COMMIT-P> refs/heads/main
+		<TAG-1> refs/tags/v1
+		EOF
+	test_i18ncmp expect actual &&
+
+	git bundle verify max-count.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		The bundle contains these 2 refs:
+		<COMMIT-P> refs/heads/main
+		<TAG-1> refs/tags/v1
+		The bundle requires this ref:
+		<COMMIT-O>
+		EOF
+	test_i18ncmp expect actual &&
+
+	test_bundle_object_count max-count.bdl 4
+'
+
+test_expect_success 'create bundle with --since option' '
+	since="Thu Apr 7 15:26:13 2005 -0700" &&
+	git log -1 --pretty="%ad" $M >actual &&
+	echo "$since" >expect &&
+	test_cmp expect actual &&
+
+	git bundle create since.bdl \
+		--since "$since" --all &&
+
+	git bundle list-heads since.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		<COMMIT-P> refs/heads/main
+		<COMMIT-N> refs/heads/release
+		<TAG-2> refs/tags/v2
+		<TAG-3> refs/tags/v3
+		<COMMIT-P> HEAD
+		EOF
+	test_i18ncmp expect actual &&
+
+	git bundle verify since.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		The bundle contains these 5 refs:
+		<COMMIT-P> refs/heads/main
+		<COMMIT-N> refs/heads/release
+		<TAG-2> refs/tags/v2
+		<TAG-3> refs/tags/v3
+		<COMMIT-P> HEAD
+		The bundle requires these 2 refs:
+		<COMMIT-L>
+		<COMMIT-K>
+		EOF
+	test_i18ncmp expect actual &&
+
+	test_bundle_object_count --thin since.bdl 16
+'
+
+test_expect_success 'create bundle 1 - no prerequisites' '
+	git bundle create 1.bdl topic/1 topic/2 &&
+
+	cat >expect <<-EOF &&
+		The bundle contains these 2 refs:
+		<COMMIT-D> refs/heads/topic/1
+		<COMMIT-H> refs/heads/topic/2
+		The bundle records a complete history.
+		EOF
+
+	# verify bundle, which has no prerequisites
+	git bundle verify 1.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	test_i18ncmp expect actual &&
+
+	test_bundle_object_count 1.bdl 24
+'
+
+test_expect_success 'create bundle 2 - has prerequisites' '
+	git bundle create 2.bdl \
+		--ignore-missing \
+		^topic/deleted \
+		^$D \
+		^topic/2 \
+		release &&
+
+	cat >expect <<-EOF &&
+		The bundle contains this ref:
+		<COMMIT-N> refs/heads/release
+		The bundle requires these 3 refs:
+		<COMMIT-D>
+		<COMMIT-E>
+		<COMMIT-G>
+		EOF
+
+	git bundle verify 2.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	test_i18ncmp expect actual &&
+
+	test_bundle_object_count 2.bdl 16
+'
+
+test_expect_success 'fail to verify bundle without prerequisites' '
+	git init --bare test1.git &&
+
+	cat >expect <<-EOF &&
+		error: Repository lacks these prerequisite commits:
+		error: <COMMIT-D>
+		error: <COMMIT-E>
+		error: <COMMIT-G>
+		EOF
+
+	test_must_fail git -C test1.git bundle verify ../2.bdl 2>&1 |
+		make_user_friendly_and_stable_output >actual &&
+	test_i18ncmp expect actual
+'
+
+test_expect_success 'create bundle 3 - two refs, same object' '
+	git bundle create --version=3 3.bdl \
+		^release \
+		^topic/1 \
+		^topic/2 \
+		main \
+		HEAD &&
+
+	cat >expect <<-EOF &&
+		The bundle contains these 2 refs:
+		<COMMIT-P> refs/heads/main
+		<COMMIT-P> HEAD
+		The bundle requires these 2 refs:
+		<COMMIT-M>
+		<COMMIT-K>
+		EOF
+
+	git bundle verify 3.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	test_i18ncmp expect actual &&
+
+	test_bundle_object_count 3.bdl 4
+'
+
+test_expect_success 'create bundle 4 - with tags' '
+	git bundle create 4.bdl \
+		^main \
+		^release \
+		^topic/1 \
+		^topic/2 \
+		--all &&
+
+	cat >expect <<-EOF &&
+		The bundle contains these 3 refs:
+		<TAG-1> refs/tags/v1
+		<TAG-2> refs/tags/v2
+		<TAG-3> refs/tags/v3
+		The bundle records a complete history.
+		EOF
+
+	git bundle verify 4.bdl |
+		make_user_friendly_and_stable_output >actual &&
+	test_i18ncmp expect actual &&
+
+	test_bundle_object_count 4.bdl 3
+'
+
+test_expect_success 'clone from bundle' '
+	git clone --mirror 1.bdl mirror.git &&
+	git -C mirror.git show-ref |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		<COMMIT-D> refs/heads/topic/1
+		<COMMIT-H> refs/heads/topic/2
+		EOF
+	test_cmp expect actual &&
+
+	git -C mirror.git fetch ../2.bdl "+refs/*:refs/*" &&
+	git -C mirror.git show-ref |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		<COMMIT-N> refs/heads/release
+		<COMMIT-D> refs/heads/topic/1
+		<COMMIT-H> refs/heads/topic/2
+		EOF
+	test_cmp expect actual &&
+
+	git -C mirror.git fetch ../3.bdl "+refs/*:refs/*" &&
+	git -C mirror.git show-ref |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		<COMMIT-P> refs/heads/main
+		<COMMIT-N> refs/heads/release
+		<COMMIT-D> refs/heads/topic/1
+		<COMMIT-H> refs/heads/topic/2
+		EOF
+	test_cmp expect actual &&
+
+	git -C mirror.git fetch ../4.bdl "+refs/*:refs/*" &&
+	git -C mirror.git show-ref |
+		make_user_friendly_and_stable_output >actual &&
+	cat >expect <<-EOF &&
+		<COMMIT-P> refs/heads/main
+		<COMMIT-N> refs/heads/release
+		<COMMIT-D> refs/heads/topic/1
+		<COMMIT-H> refs/heads/topic/2
+		<TAG-1> refs/tags/v1
+		<TAG-2> refs/tags/v2
+		<TAG-3> refs/tags/v3
+		EOF
+	test_cmp expect actual
+'
+
+test_done