[v4] upload-pack.c: fix filter spec quoting bug

Message ID: 20210128160453.79169-1-jacob@gitlab.com
State: Superseded
Commit: ad5df6b782a13854c9ae9d273dd03c5b935ed7cb

Commit Message

Jacob Vosmaer Jan. 28, 2021, 4:04 p.m. UTC
Fix a bug in upload-pack.c that occurs when you combine partial clone and
uploadpack.packObjectsHook. You can reproduce it as follows:

	git clone -u 'git -c uploadpack.allowfilter '\
	'-c uploadpack.packobjectshook=env '\
	'upload-pack' --filter=blob:none --no-local \
	src.git dst.git

Be careful with the line continuations when copying this: the -u
argument is one long quoted string split across several lines.

The error I get when I run this is:

	Cloning into '/tmp/broken'...
	remote: fatal: invalid filter-spec ''blob:none''
	error: git upload-pack: git-pack-objects died with error.
	fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
	remote: aborting due to possible repository corruption on the remote side.
	fatal: early EOF
	fatal: index-pack failed

The problem is caused by unneeded quoting. This bug was already
present in 10ac85c785 (upload-pack: add object filtering for partial
clone, 2017-12-08), when server-side filter support was introduced.
In fact, in 10ac85c785 this was broken regardless of
uploadpack.packObjectsHook. Then in 0b6069fe0a (fetch-pack: test
support excluding large blobs, 2017-12-08) the quoting was removed but
only behind a conditional that depends on whether
uploadpack.packObjectsHook is set. Because uploadpack.packObjectsHook
is apparently rarely used, nobody noticed the problematic quoting
could still happen.
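
To illustrate why the quoting is unneeded, here is a rough shell model.
It assumes the hook command receives its arguments as separate
positional parameters (forwarded with "$@") rather than as text that a
shell parses a second time. The variable names and the placeholder
"hook" are made up for this sketch and are not taken from upload-pack.c.

```shell
#!/bin/sh
# Sketch only: models argument passing, not the real upload-pack code path.
spec='blob:none'
quoted="'$spec'"	# assumed stand-in for a single-quoted filter spec

# Everything after the script text becomes $0, $1, ...; "$@" forwards
# $1 onward verbatim, so no second round of quote removal happens.
received_quoted=$(sh -c 'printf "%s" "$@"' hook "--filter=$quoted")
received_plain=$(sh -c 'printf "%s" "$@"' hook "--filter=$spec")

echo "quoted:   $received_quoted"
echo "unquoted: $received_plain"
```

In this model the quoted variant arrives with literal single quotes
around blob:none, which matches the shape of the filter-spec in the
error message above.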

This commit removes the conditional quoting and adds a test for
partial clone in t5544-pack-objects-hook.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
---
 t/t5544-pack-objects-hook.sh | 9 +++++++++
 upload-pack.c                | 9 +--------
 2 files changed, 10 insertions(+), 8 deletions(-)

Comments

Jacob Vosmaer Jan. 28, 2021, 9:12 p.m. UTC | #1
On Thu, Jan 28, 2021 at 8:12 PM Junio C Hamano <gitster@pobox.com> wrote:
>  * As readers know "clone" would get not just the current branch,
>    the way this rev-list traverses only from HEAD makes them wonder
>    if you are trying to exclude refs other than the current branch
>    for a reason.  Better write it as
>
>         git -C dst.git rev-list --objects --missing=print --all >objects &&

Good point!

>  * The above says that we are happy if we can clone without erroring
>    out, as long as some objects are missing, but we could go a bit
>    stronger than that: among the objects we have, none should be a
>    blob object.  Is that something we can easily check?
>
>    Something along the lines of...
>
>         grep -v "^?" objects |
>         git -C dst.git cat-file --batch-check="%(objecttype)" >types &&
>         sed -e '/^commit/d' -e '/^tag/d' -e '/^tree/d' types >actual &&
>         test_must_be_empty actual
>
>    ... to ensure everything is either commit, tag or tree, perhaps?

Makes sense, I started down that route at first but I bumped my head
against needing --missing to prevent the lazy fetching, and stopped
once I had the question mark grep.
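
The type check suggested above can be tried in isolation, without a
repository. A small sketch, assuming the input is a list of object
types, one per line, in the form a "%(objecttype)" batch-check format
would print them. The helper name leftover_types and the sample lists
are hypothetical.

```shell
#!/bin/sh
# Sketch only: the sample type lists are made up, no git commands are run.

# Print every input line that is not a commit, tag or tree.
leftover_types () {
	sed -e '/^commit/d' -e '/^tag/d' -e '/^tree/d'
}

# A list without blobs leaves nothing behind.
filtered=$(printf '%s\n' commit tree tag tree | leftover_types)

# A blob that slipped through shows up in the result.
unfiltered=$(printf '%s\n' commit tree blob | leftover_types)

test -z "$filtered" && echo "filtered: only commits, tags and trees"
test "$unfiltered" = blob && echo "unfiltered: found $unfiltered"
```

An empty result is what test_must_be_empty would then assert on in the
real test.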

Now that we're talking about the test, I was wondering about something else.

In my original patch, I purposely did not add a test. Why? Because
--filter is just one option of several that upload-pack passes to
pack-objects (think of --shallow-file and --include-tag, for
instance). Why is --filter special? If the original quoting bug had
not happened, would we be testing various permutations of clone
options in combination with packObjectsHook?

As a reader looking at t5544, unless I know the backstory of the bug,
I do not understand why --filter gets a test but those other things do
not.

I am not saying this to push back on going back and improving the
test, I'm quite happy to. It's more that deleting the test may be the
ultimate improvement. Thanks in advance for humoring this contrarian
suggestion. :)

Cheers, Jacob

Jacob Vosmaer Jan. 28, 2021, 9:40 p.m. UTC | #2
On Thu, Jan 28, 2021 at 10:12 PM Jacob Vosmaer <jacob@gitlab.com> wrote:
> Thanks in advance for humoring this contrarian
> suggestion. :)

English is not my first language. I should have said "considering",
not "humoring".

Jeff King Jan. 28, 2021, 9:51 p.m. UTC | #3
On Thu, Jan 28, 2021 at 10:12:42PM +0100, Jacob Vosmaer wrote:

> Now that we're talking about the test, I was wondering about something else.
> 
> In my original patch, I purposely did not add a test. Why? Because
> --filter is just one option of several that upload-pack passes to
> pack-objects (think of --shallow-file and --include-tag, for
> instance). Why is --filter special? If the original quoting bug had
> not happened, would we be testing various permutations of clone
> options in combination with packObjectsHook?
> 
> As a reader looking at t5544, unless I know the backstory of the bug,
> I do not understand why --filter gets a test but those other things do
> not.

You're definitely not wrong that this is somewhat closing the barn door
after the horse has left, and there may be other barns still to be fixed.

But usually we'll add a test that demonstrates the breakage, if only
because the fact that something so mundane and easy to trigger _wasn't_
caught by the existing test is pretty bad. So we should at least make
sure the combination is now covered, which your test does. And then over
time we build up a set of coverage.

Of course it is sometimes nice to be exhaustive, too. The organically
built-up set of tests is going to have holes if it's not done
systematically. But Git also isn't a black box; we know that the
implementation of this option was weirdly different.

I guess the argument you are making is that now that it's fixed, it
_isn't_ different, and we're unlikely to reintroduce the bug. I can buy
that, but I think we just do the "test the fixed bug" thing on
principle.

-Peff

Junio C Hamano Jan. 28, 2021, 9:58 p.m. UTC | #4
Jacob Vosmaer <jacob@gitlab.com> writes:

> As a reader looking at t5544, unless I know the backstory of the bug,
> I do not understand why --filter gets a test but those other things do
> not.

Good point.  Perhaps a retitle of the test or a bit of comment would
benefit future readers.


# git clone internally has to invoke upload-pack on the
# other end with multiple arguments, and it used to quote
# them incorrectly only when hooks are enabled.
test_expect_success 'hook works with partial clone' '
	clear_hook_results &&
	test_config_global uploadpack.packObjectsHook ./hook &&
	test_config_global uploadpack.allowFilter true &&

	git clone --bare --no-local --filter=blob:none . dst.git &&
	git -C dst.git rev-list --objects --missing=print HEAD >objects &&
	grep "^?" objects
'

But that may be overkill.  Those curious can run "git blame" to go
back to what you wrote in the log message, and that should be clear
enough for them why we care about this case.

And from that point of view, it may be sufficient to check that the
resulting repository lacks "some" objects, without necessarily checking
which ones are missing.

Thanks.

Jacob Vosmaer Feb. 1, 2021, 8:31 p.m. UTC | #5
On Thu, Jan 28, 2021 at 10:51 PM Jeff King <peff@peff.net> wrote:
> I guess the argument you are making is that now that it's fixed, it
> _isn't_ different, and we're unlikely to reintroduce the bug. I can buy
> that, but I think we just do the "test the fixed bug" thing on
> principle.

Makes sense, thanks for explaining!

Patch

diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
index 4357af1525..f5ba663d64 100755
--- a/t/t5544-pack-objects-hook.sh
+++ b/t/t5544-pack-objects-hook.sh
@@ -59,4 +59,13 @@  test_expect_success 'hook does not run from repo config' '
 	test_path_is_missing .git/hook.stdout
 '
 
+test_expect_success 'hook works with partial clone' '
+	clear_hook_results &&
+	test_config_global uploadpack.packObjectsHook ./hook &&
+	test_config_global uploadpack.allowFilter true &&
+	git clone --bare --no-local --filter=blob:none . dst.git &&
+	git -C dst.git rev-list --objects --missing=print HEAD >objects &&
+	grep "^?" objects
+'
+
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index 3b66bf92ba..eae1fdbc55 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -305,14 +305,7 @@  static void create_pack_file(struct upload_pack_data *pack_data,
 	if (pack_data->filter_options.choice) {
 		const char *spec =
 			expand_list_objects_filter_spec(&pack_data->filter_options);
-		if (pack_objects.use_shell) {
-			struct strbuf buf = STRBUF_INIT;
-			sq_quote_buf(&buf, spec);
-			strvec_pushf(&pack_objects.args, "--filter=%s", buf.buf);
-			strbuf_release(&buf);
-		} else {
-			strvec_pushf(&pack_objects.args, "--filter=%s", spec);
-		}
+		strvec_pushf(&pack_objects.args, "--filter=%s", spec);
 	}
 	if (uri_protocols) {
 		for (i = 0; i < uri_protocols->nr; i++)