diff mbox series

[v2] upload-pack.c: fix filter spec quoting bug

Message ID 20210125170921.14291-1-jacob@gitlab.com (mailing list archive)
State Superseded
Headers show
Series [v2] upload-pack.c: fix filter spec quoting bug | expand

Commit Message

Jacob Vosmaer Jan. 25, 2021, 5:09 p.m. UTC
This fixes a bug that occurs when you combine partial clone and
uploadpack.packobjectshook. You can reproduce it as follows:

git clone -u 'git -c uploadpack.allowfilter '\
'-c uploadpack.packobjectshook=env '\
'upload-pack' --filter=blob:none --no-local \
src.git dst.git

Be careful with the line endings because this has a long quoted string
as the -u argument.

The error I get when I run this is:

Cloning into '/tmp/broken'...
remote: fatal: invalid filter-spec ''blob:none''
error: git upload-pack: git-pack-objects died with error.
fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
remote: aborting due to possible repository corruption on the remote side.
fatal: early EOF
fatal: index-pack failed

The problem is an unnecessary and harmful layer of quoting. I tried
digging through the history of this function and I think this quoting
was there from the start. My best guess is that it stems from a
misunderstanding what use_shell=1 means. The code seems to assume it
means "arguments get joined into one big string, then fed to /bin/sh".
But that is not what it means: use_shell=1 means that the first
argument in the arguments array may be a shell script and if so should
be passed to /bin/sh. All other arguments are passed as normal
arguments.

The solution is simple: never quote the filter spec.

This commit removes the conditional quoting and adds a test for
partial clone in t5544.
---
 t/t5544-pack-objects-hook.sh | 9 +++++++++
 upload-pack.c                | 9 +--------
 2 files changed, 10 insertions(+), 8 deletions(-)

Comments

Junio C Hamano Jan. 25, 2021, 7:48 p.m. UTC | #1
Jacob Vosmaer <jacob@gitlab.com> writes:

> This fixes a bug that occurs when you combine partial clone and
> uploadpack.packobjectshook. You can reproduce it as follows:
>
> git clone -u 'git -c uploadpack.allowfilter '\
> '-c uploadpack.packobjectshook=env '\
> 'upload-pack' --filter=blob:none --no-local \
> src.git dst.git
>
> Be careful with the line endings because this has a long quoted string
> as the -u argument.
>
> The error I get when I run this is:
>
> Cloning into '/tmp/broken'...
> remote: fatal: invalid filter-spec ''blob:none''
> error: git upload-pack: git-pack-objects died with error.
> fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
> remote: aborting due to possible repository corruption on the remote side.
> fatal: early EOF
> fatal: index-pack failed
>
> The problem is an unnecessary and harmful layer of quoting. I tried
> digging through the history of this function and I think this quoting
> was there from the start.

Meaning that 10ac85c7 (upload-pack: add object filtering for partial
clone, 2017-12-08) that added:

        if (filter_options.filter_spec) {
                struct strbuf buf = STRBUF_INIT;
                sq_quote_buf(&buf, filter_options.filter_spec);
                argv_array_pushf(&pack_objects.args, "--filter=%s", buf.buf);
                strbuf_release(&buf);
        }

> My best guess is that it stems from a
> misunderstanding what use_shell=1 means. The code seems to assume it
> means "arguments get joined into one big string, then fed to /bin/sh".
> But that is not what it means: use_shell=1 means that the first
> argument in the arguments array may be a shell script and if so should
> be passed to /bin/sh. All other arguments are passed as normal
> arguments.

I noticed another thing that hasn't changed since that commit, which
is that the setting of .use_shell is conditional.  In today's code,
at the beginning of create_pack_file(), we have

        if (!pack_data->pack_objects_hook)
                pack_objects.git_cmd = 1;
        else {
                strvec_push(&pack_objects.args, pack_data->pack_objects_hook);
                strvec_push(&pack_objects.args, "git");
                pack_objects.use_shell = 1;
        }

I suspect that 0b6069fe (fetch-pack: test support excluding large
blobs, 2017-12-08) sort-of fixed half of the problem (i.e. the half
when there is no hook used) while leaving the other half still
broken as before.

But because .use_shell does not affect if we should or should not
quote, we can unconditionally drop the use of sq_quote_buf().

> The solution is simple: never quote the filter spec.

Which makes sense.

> This commit removes the conditional quoting and adds a test for
> partial clone in t5544.
> ---

Thanks.  Missing sign-off.

>  	if (pack_data->filter_options.choice) {
>  		const char *spec =
>  			expand_list_objects_filter_spec(&pack_data->filter_options);
> -		if (pack_objects.use_shell) {
> -			struct strbuf buf = STRBUF_INIT;
> -			sq_quote_buf(&buf, spec);
> -			strvec_pushf(&pack_objects.args, "--filter=%s", buf.buf);
> -			strbuf_release(&buf);
> -		} else {
> -			strvec_pushf(&pack_objects.args, "--filter=%s", spec);
> -		}
> +		strvec_pushf(&pack_objects.args, "--filter=%s", spec);
>  	}
>  	if (uri_protocols) {
>  		for (i = 0; i < uri_protocols->nr; i++)
Jeff King Jan. 25, 2021, 9:16 p.m. UTC | #2
On Mon, Jan 25, 2021 at 06:09:21PM +0100, Jacob Vosmaer wrote:

> +test_expect_success 'hook works with partial clone' '
> +	clear_hook_results &&
> +	test_config_global uploadpack.packObjectsHook ./hook &&
> +	test_config_global uploadpack.allowFilter true &&

I was going to complain that:

 test_config -C dst.git uploadpack.packObjectsHook ./hook &&
 test_config -C dst.git uploadpack.allowFilter true &&

would be more clear, since it tells us which repo we care about
impacting. But the rest of the script doesn't do that, and indeed it
can't, because we don't allow packObjectsHook to be set in per-repo
config for security reasons. :)

(We could do it per-repo for allowFilter, but I think it is just as well
to keep the two calls consistent).

> +	git clone --bare --no-local --filter=blob:none . dst.git &&
> +	git -C dst.git rev-list --objects --missing=print HEAD >objects &&
> +	grep "^?" objects

Previously that clone would fail, so the bug-fix is demonstrated by it
succeeding. But the other two commands are added to make sure that we
actually did apply the filter correctly. Even better. Thanks for being
thorough.

-Peff
Jeff King Jan. 25, 2021, 9:16 p.m. UTC | #3
On Mon, Jan 25, 2021 at 11:48:07AM -0800, Junio C Hamano wrote:

> I suspect that 0b6069fe (fetch-pack: test support excluding large
> blobs, 2017-12-08) sort-of fixed half of the problem (i.e. the half
> when there is no hook used) while leaving the other half still
> broken as before.
> 
> But because .use_shell does not affect if we should or should not
> quote, we can unconditionally drop the use of sq_quote_buf().

Yep. That was the same conclusion I came to in my earlier reply:

  https://lore.kernel.org/git/YAs2RMT1rEH%2F2LSp@coredump.intra.peff.net/

> > This commit removes the conditional quoting and adds a test for
> > partial clone in t5544.
> > ---
> 
> Thanks.  Missing sign-off.

Aside from the sign-off issue, this version looks good to me.

-Peff
Junio C Hamano Jan. 26, 2021, 2:25 a.m. UTC | #4
Jeff King <peff@peff.net> writes:

> On Mon, Jan 25, 2021 at 11:48:07AM -0800, Junio C Hamano wrote:
>
>> I suspect that 0b6069fe (fetch-pack: test support excluding large
>> blobs, 2017-12-08) sort-of fixed half of the problem (i.e. the half
>> when there is no hook used) while leaving the other half still
>> broken as before.
>> 
>> But because .use_shell does not affect if we should or should not
>> quote, we can unconditionally drop the use of sq_quote_buf().
>
> Yep. That was the same conclusion I came to in my earlier reply:
>
>   https://lore.kernel.org/git/YAs2RMT1rEH%2F2LSp@coredump.intra.peff.net/

Thanks.  As I am behind, I needed to think it aloud to make sure I
am on the same page as others ;-)  A sanity check like this is
greatly appreciated.
diff mbox series

Patch

diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
index 4357af1525..f5ba663d64 100755
--- a/t/t5544-pack-objects-hook.sh
+++ b/t/t5544-pack-objects-hook.sh
@@ -59,4 +59,13 @@  test_expect_success 'hook does not run from repo config' '
 	test_path_is_missing .git/hook.stdout
 '
 
+test_expect_success 'hook works with partial clone' '
+	clear_hook_results &&
+	test_config_global uploadpack.packObjectsHook ./hook &&
+	test_config_global uploadpack.allowFilter true &&
+	git clone --bare --no-local --filter=blob:none . dst.git &&
+	git -C dst.git rev-list --objects --missing=print HEAD >objects &&
+	grep "^?" objects
+'
+
 test_done
diff --git a/upload-pack.c b/upload-pack.c
index 3b66bf92ba..eae1fdbc55 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -305,14 +305,7 @@  static void create_pack_file(struct upload_pack_data *pack_data,
 	if (pack_data->filter_options.choice) {
 		const char *spec =
 			expand_list_objects_filter_spec(&pack_data->filter_options);
-		if (pack_objects.use_shell) {
-			struct strbuf buf = STRBUF_INIT;
-			sq_quote_buf(&buf, spec);
-			strvec_pushf(&pack_objects.args, "--filter=%s", buf.buf);
-			strbuf_release(&buf);
-		} else {
-			strvec_pushf(&pack_objects.args, "--filter=%s", spec);
-		}
+		strvec_pushf(&pack_objects.args, "--filter=%s", spec);
 	}
 	if (uri_protocols) {
 		for (i = 0; i < uri_protocols->nr; i++)