diff mbox series

[v6,1/3] bundle-uri: verify oid before writing refs

Message ID e958a3ab20c0c06f52a00038f39605f14302032a.1718109943.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series object checking related additions and fixes for bundles in fetches | expand

Commit Message

Xing Xin June 11, 2024, 12:45 p.m. UTC
From: Xing Xin <xingxin.xx@bytedance.com>

When using the bundle-uri mechanism with a bundle list containing
multiple interrelated bundles, we encountered a bug where tips from
downloaded bundles were not discovered, thus resulting in rather slow
clones. This was particularly problematic when employing the
"creationTokens" heuristic.

To reproduce this issue, consider a repository with a single branch
"main" pointing to commit "A". Firstly, create a base bundle with:

  git bundle create base.bundle main

Then, add a new commit "B" on top of "A", and create an incremental
bundle for "main":

  git bundle create incr.bundle A..main

Now, generate a bundle list with the following content:

  [bundle]
      version = 1
      mode = all
      heuristic = creationToken

  [bundle "base"]
      uri = base.bundle
      creationToken = 1

  [bundle "incr"]
      uri = incr.bundle
      creationToken = 2

A fresh clone with the bundle list above should result in a reference
"refs/bundles/main" pointing to "B" in the new repository. However, git
would still download everything from the server, as if it had fetched
nothing locally.

So why the "refs/bundles/main" is not discovered? After some digging I
found that:

1. Bundles in bundle list are downloaded to local files via
   `bundle-uri.c:download_bundle_list` or via
   `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
   heuristic.
2. Each bundle is unbundled via `bundle-uri.c:unbundle_from_file`, which
   is called by `bundle-uri.c:unbundle_all_bundles` or called within
   `bundle-uri.c:fetch_bundles_by_token` for the "creationToken"
   heuristic.
3. To get all prerequisites of the bundle, the bundle header is read
   inside `bundle-uri.c:unbundle_from_file` to by calling
   `bundle.c:read_bundle_header`.
4. Then it calls `bundle.c:unbundle`, which calls
   `bundle.c:verify_bundle` to ensure the repository contains all the
   prerequisites.
5. `bundle.c:verify_bundle` calls `parse_object`, which eventually
   invokes `packfile.c:prepare_packed_git` or
   `packfile.c:reprepare_packed_git`, filling
   `raw_object_store->packed_git` and setting `packed_git_initialized`.
6. If `bundle.c:unbundle` succeeds, it writes refs via
   `refs.c:refs_update_ref` with `REF_SKIP_OID_VERIFICATION` set. Here
   bundle refs which can target arbitrary objects are written to the
   repository.
7. Finally, in `fetch-pack.c:do_fetch_pack_v2`, the functions
   `fetch-pack.c:mark_complete_and_common_ref` and
   `fetch-pack.c:mark_tips` are called with `OBJECT_INFO_QUICK` set to
   find local tips for negotiation. The `OBJECT_INFO_QUICK` flag
   prevents `packfile.c:reprepare_packed_git` from being called,
   resulting in failures to parse OIDs that reside only in the latest
   bundle.

In the example above, when unbunding "incr.bundle", "base.pack" is added
to `packed_git` due to prerequisites verification. However, "B" cannot
be found for negotiation because it exists in "incr.pack", which is not
included in `packed_git`.

This commit fixes the bug by removing `REF_SKIP_OID_VERIFICATION` flag
when writing bundle refs. When `refs.c:refs_update_ref` is called to to
write the corresponding bundle refs, it triggers
`refs.c:ref_transaction_commit`.  This, in turn, invokes
`refs.c:ref_transaction_prepare`, which calls `transaction_prepare` of
the refs storage backend. For files backend, this function is
`files-backend.c:files_transaction_prepare`, and for reftable backend,
it is `reftable-backend.c:reftable_be_transaction_prepare`. Both
functions eventually call `object.c:parse_object`, which can invoke
`packfile.c:reprepare_packed_git` to refresh `packed_git`. This ensures
that bundle refs point to valid objects and that all tips from bundle
refs are correctly parsed during subsequent negotiations.

A test has been added to demonstrate that bundles with incorrect
headers, where refs point to non-existent objects, do not result in any
bundle refs being created in the repository. Additionally, a set of
negotiation-related tests for fetching with bundle-uri has been
included.

Reviewed-by: Karthik Nayak <karthik.188@gmail.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
---
 bundle-uri.c                |   3 +-
 t/t5558-clone-bundle-uri.sh | 153 +++++++++++++++++++++++++++++++++++-
 2 files changed, 150 insertions(+), 6 deletions(-)

Comments

Junio C Hamano June 11, 2024, 7:08 p.m. UTC | #1
"Xing Xin via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This commit fixes the bug by removing `REF_SKIP_OID_VERIFICATION` flag

"Fix the bug by ...".

> when writing bundle refs. When `refs.c:refs_update_ref` is called to to

"to to"

> write the corresponding bundle refs, it triggers
> `refs.c:ref_transaction_commit`.  This, in turn, invokes
> `refs.c:ref_transaction_prepare`, which calls `transaction_prepare` of
> the refs storage backend. For files backend, this function is
> `files-backend.c:files_transaction_prepare`, and for reftable backend,
> it is `reftable-backend.c:reftable_be_transaction_prepare`. Both
> functions eventually call `object.c:parse_object`, which can invoke
> `packfile.c:reprepare_packed_git` to refresh `packed_git`.

Nice.  That does sound like the right fix.  The one who did
something to _require_ us to reprepare causes the packed_git
list refreshed.

>  test_expect_success 'create bundle' '
>  	git init clone-from &&
> -	git -C clone-from checkout -b topic &&
> -	test_commit -C clone-from A &&
> -	test_commit -C clone-from B &&
> -	git -C clone-from bundle create B.bundle topic
> +	(
> +		cd clone-from &&
> +		git checkout -b topic &&
> +
> +		test_commit A &&
> +		git bundle create A.bundle topic &&
> +
> +		test_commit B &&
> +		git bundle create B.bundle topic &&
> +
> +		# Create a bundle with reference pointing to non-existent object.
> +		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle

I suspect that this would be terribly unportable.  The early part of
a bundle file may be text and sed may be able to grok but are you
sure everybody's implementation of sed would not barf (or even
worse, corrupt) the pack stream data that follows?

The code used in t/lib-bundle.sh:convert_bundle_to_pack() has been
in use since 8315588b (bundle: fix wrong check of read_header()'s
return value & add tests, 2007-03-06), so munging the bundle with
a code similar to it may have a better portability story.

Add something like:

        corrupt_bundle_header () {
                sed -e '/^$/q' "$@"
                cat
        }

to t/lib-bundle.sh, which can take an arbitrary sequence of command
line parameters to drive "sed", and can be used like so:

	corrupt_bundle_header \
		-e "s/^$(git rev-parse A) /$(git rev-parse B) /" \
		<A.bndl >B.bndl

perhaps?

> @@ -33,6 +42,16 @@ test_expect_success 'clone with path bundle' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'clone with bundle that has bad header' '
> +	git clone --bundle-uri="clone-from/bad-header.bundle" \
> +		clone-from clone-bad-header 2>err &&
> +	# Write bundle ref fails, but clone can still proceed.
> +	commit_b=$(git -C clone-from rev-parse B) &&
> +	test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err &&
> +	git -C clone-bad-header for-each-ref --format="%(refname)" >refs &&
> +	! grep "refs/bundles/" refs
> +'
> +

So this is the test the proposed log message discussed.  The
description gave a false impression that the "broken header" test
that has not much to do with the bug being fixed was the only added
test---we probably want to correct that impression.

> @@ -259,6 +278,132 @@ test_expect_success 'clone bundle list (file, any mode, all failures)' '
>  	! grep "refs/bundles/" refs
>  '
>  
> +#########################################################################
> +# Clone negotiation related tests begin here

Drop this divider and comment.  The HTTP tests you see below has a
much better reason to be separated like that in order to warn test
writers (they shouldn't add their random new tests after that point,
because everything after that one is skipped when HTTPD tests are
disabled---see the beginning of t/lib-httpd.sh which is included
after that divider line), but everything here you added is not
special.  Everybody should run these tests.

> +test_expect_success 'negotiation: bundle with part of wanted commits' '
> +	test_when_finished rm -rf trace*.txt &&

Do not overly depend on glob not matching at this point when you
establish the when-finished handler (or glob matching the files you
want to catch and later test not adding anything you would want to
clean).  Quote "rm -f trace*.txt" and drop "r" if you do not absolutely
need it (and I would imagine you don't, given the .txt suffix).

> +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
> +	git clone --no-local --bundle-uri="clone-from/A.bundle" \
> +		clone-from nego-bundle-part &&
> +	git -C nego-bundle-part for-each-ref --format="%(refname)" >refs &&
> +	grep "refs/bundles/" refs >actual &&
> +	cat >expect <<-\EOF &&
> +	refs/bundles/topic
> +	EOF

Hmph, if the expected pattern is only a few lines without any magic,

	test_write_lines >expect refs/bundles/topic &&

may be easier to follow.

> +	test_cmp expect actual &&
> +	# Ensure that refs/bundles/topic are sent as "have".
> +	grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt
> +'

Using "test_grep" would make it easier to diagnose when test breaks.
A failing "grep" will be silent (i.e., "I didn't find anything you
told me to look for").  A failing "test_grep" will tell you "I was
told to find this, but didn't find any in that".

> +test_expect_success 'negotiation: bundle with all wanted commits' '
> +	test_when_finished rm -rf trace*.txt &&
> +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
> +	git clone --no-local --single-branch --branch=topic --no-tags \
> +		--bundle-uri="clone-from/B.bundle" \
> +		clone-from nego-bundle-all &&
> +	git -C nego-bundle-all for-each-ref --format="%(refname)" >refs &&
> +	grep "refs/bundles/" refs >actual &&
> +	cat >expect <<-\EOF &&
> +	refs/bundles/topic
> +	EOF
> +	test_cmp expect actual &&
> +	# We already have all needed commits so no "want" needed.
> +	! grep "clone> want " trace-packet.txt
> +'
> +
> +test_expect_success 'negotiation: bundle list (no heuristic)' '
> +	test_when_finished rm -f trace*.txt &&
> +	cat >bundle-list <<-EOF &&
> +	[bundle]
> +		version = 1
> +		mode = all
> +
> +	[bundle "bundle-1"]
> +		uri = file://$(pwd)/clone-from/bundle-1.bundle
> +
> +	[bundle "bundle-2"]
> +		uri = file://$(pwd)/clone-from/bundle-2.bundle
> +	EOF

OK.  This is a good use of here-doc (as opposed to test_write_lines
I sugested earlier for different purposes).  I wondered if these
$(pwd) and file://$(pwd) are safe (I always get confused by the need
to sometimes use $PWD to help Windows), but I see them used in what
Derrick wrote in this file, so they must be fine.

But there may be characters in the leading part of $(pwd) that we do
not control that needs quoting (like a double quote '"').  The value
of bundle.*.uri may need to be quoted a bit carefully.  This is not
a new problem this patch introduces, so you do not have to rewrite
this part of the patch; I'll mark it with #leftoverbits here---the
idea being somebody else who is too bored can come back, see if it
is truly broken, and fix them after all dust settles.

Abusing "git config -f bundle-list" might be safer, e.g.

	$ git config -f bundle.list bundle.bundle-1.uri \
		'file:///home/abc"def/t/trash dir/clone-from/b1.bndl'
	$ cat bundle.list
        [bundle "bundle-1"]
                uri = file:///home/abc\"def/t/trash dir/clone-from/b1.bndl

as you do not know what other garbage character is in $(pwd) part.

> +	# We already have all needed commits so no "want" needed.
> +	! grep "clone> want " trace-packet.txt

Just FYI, to negate test_grep, use

	test_grep ! "clone > want " trace-packet.txt

not	

	! test_grep "clone > want " trace-packet.txt ;# WRONG
Xing Xin June 17, 2024, 1:53 p.m. UTC | #2
At 2024-06-12 03:08:04, "Junio C Hamano" <gitster@pobox.com> wrote:
>"Xing Xin via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> This commit fixes the bug by removing `REF_SKIP_OID_VERIFICATION` flag
>
>"Fix the bug by ...".
>
>> when writing bundle refs. When `refs.c:refs_update_ref` is called to to
>
>"to to"

Ah, always typos. :-)

[snip]
>>  test_expect_success 'create bundle' '
>>  	git init clone-from &&
>> -	git -C clone-from checkout -b topic &&
>> -	test_commit -C clone-from A &&
>> -	test_commit -C clone-from B &&
>> -	git -C clone-from bundle create B.bundle topic
>> +	(
>> +		cd clone-from &&
>> +		git checkout -b topic &&
>> +
>> +		test_commit A &&
>> +		git bundle create A.bundle topic &&
>> +
>> +		test_commit B &&
>> +		git bundle create B.bundle topic &&
>> +
>> +		# Create a bundle with reference pointing to non-existent object.
>> +		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle
>
>I suspect that this would be terribly unportable.  The early part of
>a bundle file may be text and sed may be able to grok but are you
>sure everybody's implementation of sed would not barf (or even
>worse, corrupt) the pack stream data that follows?
>
>The code used in t/lib-bundle.sh:convert_bundle_to_pack() has been
>in use since 8315588b (bundle: fix wrong check of read_header()'s
>return value & add tests, 2007-03-06), so munging the bundle with
>a code similar to it may have a better portability story.
>
>Add something like:
>
>        corrupt_bundle_header () {
>                sed -e '/^$/q' "$@"
>                cat
>        }
>
>to t/lib-bundle.sh, which can take an arbitrary sequence of command
>line parameters to drive "sed", and can be used like so:
>
>	corrupt_bundle_header \
>		-e "s/^$(git rev-parse A) /$(git rev-parse B) /" \
>		<A.bndl >B.bndl
>
>perhaps?

Thanks, I never knew sed could be used this way! It's so concise!

However, after applying these code, the added test "t5558.5 clone with
bundle that has bad header" fails in some CI jobs showing that we can
not get expected error. For example CI "linux-musl (alpine)" gives:

	+ git clone '--bundle-uri=clone-from/bad-header.bundle' clone-from clone-bad-header
	+ git -C clone-from rev-parse B
	+ commit_b=d9df4505cb3522088b9e29d6051ac16f1564154a
	+ test_grep 'trying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a' err
	+ eval 'last_arg=${2}'
	+ last_arg=err
	+ test -f err
	+ test 2 -lt 2
	+ test 'x!' '=' 'xtrying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a'
	+ test 'x!' '=' 'xtrying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a'
	+ grep 'trying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a' err
	+ echo 'error: '"'"'grep trying to write ref '"'"'refs/bundles/topic'"'"' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a' 'err'"'"' didn'"'"'t find a match in:'
	error: 'grep trying to write ref 'refs/bundles/topic' with nonexistent object d9df4505cb3522088b9e29d6051ac16f1564154a err' didn't find a match in:
	+ test -s err
	+ cat err
	Cloning into 'clone-bad-header'...
	fatal: pack signature mismatch
	error: index-pack died
	done.
	+ return 1
	error: last command exited with $?=1
	not ok 5 - clone with bundle that has bad header

	+# Create a bundle with reference pointing to non-existent object.
	+sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \
	+	<A.bundle >bad-header.bundle &&
	+convert_bundle_to_pack \
	+	<A.bundle >>bad-header.bundle

More details can be found at https://github.com/blanet/git/actions/runs/9541478191/job/26294731254.

After some digging, I discovered that inside the Alpine container,
`corrupt_bundle_header` is missing the leading "PACK\x00" in the pack
section of the converted bundle. This is likely caused by the
incompatibility of "sed" across different operating systems. I stopped
investigating further due to my unfamiliarity with containers and "sed"
itself. Instead, I found another approach that utilizes the suggested
usage of "sed" and `lib-bundle.sh:convert_bundle_to_pack`.

 	+# Create a bundle with reference pointing to non-existent object.
	+sed -e "/^$/q" -e "s/$(git rev-parse A) /$(git rev-parse B) /" \
	+	<A.bundle >bad-header.bundle &&
	+convert_bundle_to_pack \
	+	<A.bundle >>bad-header.bundle

>> @@ -33,6 +42,16 @@ test_expect_success 'clone with path bundle' '
>>  	test_cmp expect actual
>>  '
>>  
>> +test_expect_success 'clone with bundle that has bad header' '
>> +	git clone --bundle-uri="clone-from/bad-header.bundle" \
>> +		clone-from clone-bad-header 2>err &&
>> +	# Write bundle ref fails, but clone can still proceed.
>> +	commit_b=$(git -C clone-from rev-parse B) &&
>> +	test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err &&
>> +	git -C clone-bad-header for-each-ref --format="%(refname)" >refs &&
>> +	! grep "refs/bundles/" refs
>> +'
>> +
>
>So this is the test the proposed log message discussed.  The
>description gave a false impression that the "broken header" test
>that has not much to do with the bug being fixed was the only added
>test---we probably want to correct that impression.

Adjusted the commit message.

>> @@ -259,6 +278,132 @@ test_expect_success 'clone bundle list (file, any mode, all failures)' '
>>  	! grep "refs/bundles/" refs
>>  '
>>  
>> +#########################################################################
>> +# Clone negotiation related tests begin here
>
>Drop this divider and comment.  The HTTP tests you see below has a
>much better reason to be separated like that in order to warn test
>writers (they shouldn't add their random new tests after that point,
>because everything after that one is skipped when HTTPD tests are
>disabled---see the beginning of t/lib-httpd.sh which is included
>after that divider line), but everything here you added is not
>special.  Everybody should run these tests.

Thanks for the explanation, dropped the divider in new series.

>> +test_expect_success 'negotiation: bundle with part of wanted commits' '
>> +	test_when_finished rm -rf trace*.txt &&
>
>Do not overly depend on glob not matching at this point when you
>establish the when-finished handler (or glob matching the files you
>want to catch and later test not adding anything you would want to
>clean).  Quote "rm -f trace*.txt" and drop "r" if you do not absolutely
>need it (and I would imagine you don't, given the .txt suffix).

Yes, the "-r" is not safe and I just need to clean the "*.txt", not involving directories.

>> +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
>> +	git clone --no-local --bundle-uri="clone-from/A.bundle" \
>> +		clone-from nego-bundle-part &&
>> +	git -C nego-bundle-part for-each-ref --format="%(refname)" >refs &&
>> +	grep "refs/bundles/" refs >actual &&
>> +	cat >expect <<-\EOF &&
>> +	refs/bundles/topic
>> +	EOF
>
>Hmph, if the expected pattern is only a few lines without any magic,
>
>	test_write_lines >expect refs/bundles/topic &&
>
>may be easier to follow.

Applied!

>> +	test_cmp expect actual &&
>> +	# Ensure that refs/bundles/topic are sent as "have".
>> +	grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt
>> +'
>
>Using "test_grep" would make it easier to diagnose when test breaks.
>A failing "grep" will be silent (i.e., "I didn't find anything you
>told me to look for").  A failing "test_grep" will tell you "I was
>told to find this, but didn't find any in that".

Thanks, it just accelerated the digging process of the test failure
mentioned above. :)

>> +test_expect_success 'negotiation: bundle with all wanted commits' '
>> +	test_when_finished rm -rf trace*.txt &&
>> +	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
>> +	git clone --no-local --single-branch --branch=topic --no-tags \
>> +		--bundle-uri="clone-from/B.bundle" \
>> +		clone-from nego-bundle-all &&
>> +	git -C nego-bundle-all for-each-ref --format="%(refname)" >refs &&
>> +	grep "refs/bundles/" refs >actual &&
>> +	cat >expect <<-\EOF &&
>> +	refs/bundles/topic
>> +	EOF
>> +	test_cmp expect actual &&
>> +	# We already have all needed commits so no "want" needed.
>> +	! grep "clone> want " trace-packet.txt
>> +'
>> +
>> +test_expect_success 'negotiation: bundle list (no heuristic)' '
>> +	test_when_finished rm -f trace*.txt &&
>> +	cat >bundle-list <<-EOF &&
>> +	[bundle]
>> +		version = 1
>> +		mode = all
>> +
>> +	[bundle "bundle-1"]
>> +		uri = file://$(pwd)/clone-from/bundle-1.bundle
>> +
>> +	[bundle "bundle-2"]
>> +		uri = file://$(pwd)/clone-from/bundle-2.bundle
>> +	EOF
>
>OK.  This is a good use of here-doc (as opposed to test_write_lines
>I sugested earlier for different purposes).  I wondered if these
>$(pwd) and file://$(pwd) are safe (I always get confused by the need
>to sometimes use $PWD to help Windows), but I see them used in what
>Derrick wrote in this file, so they must be fine.
>
>But there may be characters in the leading part of $(pwd) that we do
>not control that needs quoting (like a double quote '"').  The value
>of bundle.*.uri may need to be quoted a bit carefully.  This is not
>a new problem this patch introduces, so you do not have to rewrite
>this part of the patch; I'll mark it with #leftoverbits here---the
>idea being somebody else who is too bored can come back, see if it
>is truly broken, and fix them after all dust settles.
>
>Abusing "git config -f bundle-list" might be safer, e.g.
>
>	$ git config -f bundle.list bundle.bundle-1.uri \
>		'file:///home/abc"def/t/trash dir/clone-from/b1.bndl'
>	$ cat bundle.list
>        [bundle "bundle-1"]
>                uri = file:///home/abc\"def/t/trash dir/clone-from/b1.bndl
>
>as you do not know what other garbage character is in $(pwd) part.
>
>> +	# We already have all needed commits so no "want" needed.
>> +	! grep "clone> want " trace-packet.txt
>
>Just FYI, to negate test_grep, use
>
>	test_grep ! "clone > want " trace-packet.txt
>
>not	
>
>	! test_grep "clone > want " trace-packet.txt ;# WRONG

Thanks your through review!

Xing Xin
diff mbox series

Patch

diff --git a/bundle-uri.c b/bundle-uri.c
index 91b3319a5c1..65666a11d9c 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -400,8 +400,7 @@  static int unbundle_from_file(struct repository *r, const char *file)
 		refs_update_ref(get_main_ref_store(the_repository),
 				"fetched bundle", bundle_ref.buf, oid,
 				has_old ? &old_oid : NULL,
-				REF_SKIP_OID_VERIFICATION,
-				UPDATE_REFS_MSG_ON_ERR);
+				0, UPDATE_REFS_MSG_ON_ERR);
 	}
 
 	bundle_header_release(&header);
diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
index 1ca5f745e73..8f4f802e4f1 100755
--- a/t/t5558-clone-bundle-uri.sh
+++ b/t/t5558-clone-bundle-uri.sh
@@ -19,10 +19,19 @@  test_expect_success 'fail to clone from non-bundle file' '
 
 test_expect_success 'create bundle' '
 	git init clone-from &&
-	git -C clone-from checkout -b topic &&
-	test_commit -C clone-from A &&
-	test_commit -C clone-from B &&
-	git -C clone-from bundle create B.bundle topic
+	(
+		cd clone-from &&
+		git checkout -b topic &&
+
+		test_commit A &&
+		git bundle create A.bundle topic &&
+
+		test_commit B &&
+		git bundle create B.bundle topic &&
+
+		# Create a bundle with reference pointing to non-existent object.
+		sed "s/$(git rev-parse A)/$(git rev-parse B)/" <A.bundle >bad-header.bundle
+	)
 '
 
 test_expect_success 'clone with path bundle' '
@@ -33,6 +42,16 @@  test_expect_success 'clone with path bundle' '
 	test_cmp expect actual
 '
 
+test_expect_success 'clone with bundle that has bad header' '
+	git clone --bundle-uri="clone-from/bad-header.bundle" \
+		clone-from clone-bad-header 2>err &&
+	# Write bundle ref fails, but clone can still proceed.
+	commit_b=$(git -C clone-from rev-parse B) &&
+	test_grep "trying to write ref '\''refs/bundles/topic'\'' with nonexistent object $commit_b" err &&
+	git -C clone-bad-header for-each-ref --format="%(refname)" >refs &&
+	! grep "refs/bundles/" refs
+'
+
 test_expect_success 'clone with path bundle and non-default hash' '
 	test_when_finished "rm -rf clone-path-non-default-hash" &&
 	GIT_DEFAULT_HASH=sha256 git clone --bundle-uri="clone-from/B.bundle" \
@@ -259,6 +278,132 @@  test_expect_success 'clone bundle list (file, any mode, all failures)' '
 	! grep "refs/bundles/" refs
 '
 
+#########################################################################
+# Clone negotiation related tests begin here
+
+test_expect_success 'negotiation: bundle with part of wanted commits' '
+	test_when_finished rm -rf trace*.txt &&
+	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
+	git clone --no-local --bundle-uri="clone-from/A.bundle" \
+		clone-from nego-bundle-part &&
+	git -C nego-bundle-part for-each-ref --format="%(refname)" >refs &&
+	grep "refs/bundles/" refs >actual &&
+	cat >expect <<-\EOF &&
+	refs/bundles/topic
+	EOF
+	test_cmp expect actual &&
+	# Ensure that refs/bundles/topic are sent as "have".
+	grep "clone> have $(git -C clone-from rev-parse A)" trace-packet.txt
+'
+
+test_expect_success 'negotiation: bundle with all wanted commits' '
+	test_when_finished rm -rf trace*.txt &&
+	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
+	git clone --no-local --single-branch --branch=topic --no-tags \
+		--bundle-uri="clone-from/B.bundle" \
+		clone-from nego-bundle-all &&
+	git -C nego-bundle-all for-each-ref --format="%(refname)" >refs &&
+	grep "refs/bundles/" refs >actual &&
+	cat >expect <<-\EOF &&
+	refs/bundles/topic
+	EOF
+	test_cmp expect actual &&
+	# We already have all needed commits so no "want" needed.
+	! grep "clone> want " trace-packet.txt
+'
+
+test_expect_success 'negotiation: bundle list (no heuristic)' '
+	test_when_finished rm -f trace*.txt &&
+	cat >bundle-list <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+
+	[bundle "bundle-1"]
+		uri = file://$(pwd)/clone-from/bundle-1.bundle
+
+	[bundle "bundle-2"]
+		uri = file://$(pwd)/clone-from/bundle-2.bundle
+	EOF
+
+	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
+	git clone --no-local --bundle-uri="file://$(pwd)/bundle-list" \
+		clone-from nego-bundle-list-no-heuristic &&
+
+	git -C nego-bundle-list-no-heuristic for-each-ref --format="%(refname)" >refs &&
+	grep "refs/bundles/" refs >actual &&
+	cat >expect <<-\EOF &&
+	refs/bundles/base
+	refs/bundles/left
+	EOF
+	test_cmp expect actual &&
+	grep "clone> have $(git -C nego-bundle-list-no-heuristic rev-parse refs/bundles/left)" trace-packet.txt
+'
+
+test_expect_success 'negotiation: bundle list (creationToken)' '
+	test_when_finished rm -f trace*.txt &&
+	cat >bundle-list <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+		heuristic = creationToken
+
+	[bundle "bundle-1"]
+		uri = file://$(pwd)/clone-from/bundle-1.bundle
+		creationToken = 1
+
+	[bundle "bundle-2"]
+		uri = file://$(pwd)/clone-from/bundle-2.bundle
+		creationToken = 2
+	EOF
+
+	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
+	git clone --no-local --bundle-uri="file://$(pwd)/bundle-list" \
+		clone-from nego-bundle-list-heuristic &&
+
+	git -C nego-bundle-list-heuristic for-each-ref --format="%(refname)" >refs &&
+	grep "refs/bundles/" refs >actual &&
+	cat >expect <<-\EOF &&
+	refs/bundles/base
+	refs/bundles/left
+	EOF
+	test_cmp expect actual &&
+	grep "clone> have $(git -C nego-bundle-list-heuristic rev-parse refs/bundles/left)" trace-packet.txt
+'
+
+test_expect_success 'negotiation: bundle list with all wanted commits' '
+	test_when_finished rm -f trace*.txt &&
+	cat >bundle-list <<-EOF &&
+	[bundle]
+		version = 1
+		mode = all
+		heuristic = creationToken
+
+	[bundle "bundle-1"]
+		uri = file://$(pwd)/clone-from/bundle-1.bundle
+		creationToken = 1
+
+	[bundle "bundle-2"]
+		uri = file://$(pwd)/clone-from/bundle-2.bundle
+		creationToken = 2
+	EOF
+
+	GIT_TRACE_PACKET="$(pwd)/trace-packet.txt" \
+	git clone --no-local --single-branch --branch=left --no-tags \
+		--bundle-uri="file://$(pwd)/bundle-list" \
+		clone-from nego-bundle-list-all &&
+
+	git -C nego-bundle-list-all for-each-ref --format="%(refname)" >refs &&
+	grep "refs/bundles/" refs >actual &&
+	cat >expect <<-\EOF &&
+	refs/bundles/base
+	refs/bundles/left
+	EOF
+	test_cmp expect actual &&
+	# We already have all needed commits so no "want" needed.
+	! grep "clone> want " trace-packet.txt
+'
+
 #########################################################################
 # HTTP tests begin here