diff mbox series

[v2,3/3] archive: support remote archive from stateless transport

Message ID 20230923152201.14741-4-worldhello.net@gmail.com (mailing list archive)
State Superseded
Headers show
Series support remote archive from stateless transport | expand

Commit Message

Jiang Xin Sept. 23, 2023, 3:22 p.m. UTC
From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

Even though we can establish a stateless connection, we still cannot
archive the remote repository using a stateless HTTP protocol. Try the
following steps to make it work.

 1. Add support for "git-upload-archive" service in "http-backend".

 2. Use the URL ".../info/refs?service=git-upload-pack" to detect the
    protocol version, instead of use the "git-upload-archive" service.

 3. "git-archive" does not expect to see protocol version and
    capabilities when connecting to remote-helper, so do not send them
    in "remote-curl.c" for the "git-upload-archive" service.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
---
 http-backend.c         | 15 +++++++++++++--
 remote-curl.c          | 14 +++++++++++---
 t/t5003-archive-zip.sh | 30 ++++++++++++++++++++++++++++++
 transport-helper.c     |  3 ++-
 4 files changed, 56 insertions(+), 6 deletions(-)

Comments

Eric Sunshine Sept. 24, 2023, 6:52 a.m. UTC | #1
On Sat, Sep 23, 2023 at 11:22 AM Jiang Xin <worldhello.net@gmail.com> wrote:
> Even though we can establish a stateless connection, we still cannot
> archive the remote repository using a stateless HTTP protocol. Try the
> following steps to make it work.
> [...]
> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> ---
> diff --git a/http-backend.c b/http-backend.c
> @@ -639,10 +640,19 @@ static void check_content_type(struct strbuf *hdr, const char *accepted_type)
> -       const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};
> +       const char *argv[4];
>
> +       if (!strcmp(service_name, "git-upload-archive")) {
> +               argv[1] = ".";
> +               argv[2] = NULL;
> +       } else {
> +               argv[1] = "--stateless-rpc";
> +               argv[2] = ".";
> +               argv[3] = NULL;
> +       }

It may not be worth a reroll, but since you're touching this code
anyhow, these days we'd use `strvec` for this:

    struct strvec argv = STRVEC_INIT;
    if (strcmp(service_name, "git-upload-archive"))
        strvec_push(&argv, "--stateless-rpc");
    strvec_push(&argv, ".");
Phillip Wood Sept. 24, 2023, 1:41 p.m. UTC | #2
On 23/09/2023 16:22, Jiang Xin wrote:
> From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> 
> Even though we can establish a stateless connection, we still cannot
> archive the remote repository using a stateless HTTP protocol. Try the
> following steps to make it work.
> 
>   1. Add support for "git-upload-archive" service in "http-backend".
> 
>   2. Use the URL ".../info/refs?service=git-upload-pack" to detect the
>      protocol version, instead of use the "git-upload-archive" service.
> 
>   3. "git-archive" does not expect to see protocol version and
>      capabilities when connecting to remote-helper, so do not send them
>      in "remote-curl.c" for the "git-upload-archive" service.

I'm not familiar enough with the server side of git to comment on 
whether this patch is a good idea, but I did notice one C language issue 
below.

>   static struct string_list *get_parameters(void)
> @@ -639,10 +640,19 @@ static void check_content_type(struct strbuf *hdr, const char *accepted_type)
>   
>   static void service_rpc(struct strbuf *hdr, char *service_name)
>   {
> -	const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};

In the pre-image argv[0] is initialized to NULL

> +	const char *argv[4];

In the post-image argv is not initialized and the first element is not 
set in the code below.

>   	struct rpc_service *svc = select_service(hdr, service_name);
>   	struct strbuf buf = STRBUF_INIT;
>   
> +	if (!strcmp(service_name, "git-upload-archive")) {
> +		argv[1] = ".";
> +		argv[2] = NULL;
> +	} else {
> +		argv[1] = "--stateless-rpc";
> +		argv[2] = ".";
> +		argv[3] = NULL;
> +	}

Best Wishes

Phillip
Jiang Xin Sept. 24, 2023, 11:36 p.m. UTC | #3
On Sun, Sep 24, 2023 at 9:41 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> On 23/09/2023 16:22, Jiang Xin wrote:
> > From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> >
> > Even though we can establish a stateless connection, we still cannot
> > archive the remote repository using a stateless HTTP protocol. Try the
> > following steps to make it work.
> >
> >   1. Add support for "git-upload-archive" service in "http-backend".
> >
> >   2. Use the URL ".../info/refs?service=git-upload-pack" to detect the
> >      protocol version, instead of use the "git-upload-archive" service.
> >
> >   3. "git-archive" does not expect to see protocol version and
> >      capabilities when connecting to remote-helper, so do not send them
> >      in "remote-curl.c" for the "git-upload-archive" service.
>
> I'm not familiar enough with the server side of git to comment on
> whether this patch is a good idea, but I did notice one C language issue
> below.
>
> >   static struct string_list *get_parameters(void)
> > @@ -639,10 +640,19 @@ static void check_content_type(struct strbuf *hdr, const char *accepted_type)
> >
> >   static void service_rpc(struct strbuf *hdr, char *service_name)
> >   {
> > -     const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};

For the original implementation, the first NULL is used as a
placeholder, and will be initialized somewhere below.

> In the pre-image argv[0] is initialized to NULL
>
> > +     const char *argv[4];
>
> In the post-image argv is not initialized and the first element is not
> set in the code below.
>
> >       struct rpc_service *svc = select_service(hdr, service_name);
> >       struct strbuf buf = STRBUF_INIT;
> >
> > +     if (!strcmp(service_name, "git-upload-archive")) {
> > +             argv[1] = ".";
> > +             argv[2] = NULL;
> > +     } else {
> > +             argv[1] = "--stateless-rpc";
> > +             argv[2] = ".";
> > +             argv[3] = NULL;
> > +     }

It will be initialized in the code further below, see http-backend.c:668.

        argv[0] = svc->name;
        run_service(argv, svc->buffer_input);
        strbuf_release(&buf);

Anyway, I will rewrite these code in reroll v3 to follow Eric's suggestion.

> Best Wishes
>
> Phillip
>
Jiang Xin Sept. 24, 2023, 11:39 p.m. UTC | #4
On Sun, Sep 24, 2023 at 2:52 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Sat, Sep 23, 2023 at 11:22 AM Jiang Xin <worldhello.net@gmail.com> wrote:
> > Even though we can establish a stateless connection, we still cannot
> > archive the remote repository using a stateless HTTP protocol. Try the
> > following steps to make it work.
> > [...]
> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> > ---
> > diff --git a/http-backend.c b/http-backend.c
> > @@ -639,10 +640,19 @@ static void check_content_type(struct strbuf *hdr, const char *accepted_type)
> > -       const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};
> > +       const char *argv[4];
> >
> > +       if (!strcmp(service_name, "git-upload-archive")) {
> > +               argv[1] = ".";
> > +               argv[2] = NULL;
> > +       } else {
> > +               argv[1] = "--stateless-rpc";
> > +               argv[2] = ".";
> > +               argv[3] = NULL;
> > +       }
>
> It may not be worth a reroll, but since you're touching this code
> anyhow, these days we'd use `strvec` for this:
>
>     struct strvec argv = STRVEC_INIT;
>     if (strcmp(service_name, "git-upload-archive"))
>         strvec_push(&argv, "--stateless-rpc");
>     strvec_push(&argv, ".");

Good suggestion, I'll queue this up as part of next reroll.

--
Jiang Xin
Randall S. Becker Sept. 24, 2023, 11:58 p.m. UTC | #5
On Sunday, September 24, 2023 7:40 PM, Jiang Xin wrote:
>On Sun, Sep 24, 2023 at 2:52 PM Eric Sunshine <sunshine@sunshineco.com>
>wrote:
>>
>> On Sat, Sep 23, 2023 at 11:22 AM Jiang Xin <worldhello.net@gmail.com> wrote:
>> > Even though we can establish a stateless connection, we still cannot
>> > archive the remote repository using a stateless HTTP protocol. Try
>> > the following steps to make it work.
>> > [...]
>> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>> > ---
>> > diff --git a/http-backend.c b/http-backend.c @@ -639,10 +640,19 @@
>> > static void check_content_type(struct strbuf *hdr, const char *accepted_type)
>> > -       const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};
>> > +       const char *argv[4];
>> >
>> > +       if (!strcmp(service_name, "git-upload-archive")) {
>> > +               argv[1] = ".";
>> > +               argv[2] = NULL;
>> > +       } else {
>> > +               argv[1] = "--stateless-rpc";
>> > +               argv[2] = ".";
>> > +               argv[3] = NULL;
>> > +       }
>>
>> It may not be worth a reroll, but since you're touching this code
>> anyhow, these days we'd use `strvec` for this:
>>
>>     struct strvec argv = STRVEC_INIT;
>>     if (strcmp(service_name, "git-upload-archive"))
>>         strvec_push(&argv, "--stateless-rpc");
>>     strvec_push(&argv, ".");
>
>Good suggestion, I'll queue this up as part of next reroll.

Which test covers this change?

Thanks,
Randall
Jiang Xin Sept. 25, 2023, 12:15 a.m. UTC | #6
On Mon, Sep 25, 2023 at 7:58 AM <rsbecker@nexbridge.com> wrote:
>
> On Sunday, September 24, 2023 7:40 PM, Jiang Xin wrote:
> >On Sun, Sep 24, 2023 at 2:52 PM Eric Sunshine <sunshine@sunshineco.com>
> >wrote:
> >>
> >> On Sat, Sep 23, 2023 at 11:22 AM Jiang Xin <worldhello.net@gmail.com> wrote:
> >> > Even though we can establish a stateless connection, we still cannot
> >> > archive the remote repository using a stateless HTTP protocol. Try
> >> > the following steps to make it work.
> >> > [...]
> >> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> >> > ---
> >> > diff --git a/http-backend.c b/http-backend.c @@ -639,10 +640,19 @@
> >> > static void check_content_type(struct strbuf *hdr, const char *accepted_type)
> >> > -       const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};
> >> > +       const char *argv[4];
> >> >
> >> > +       if (!strcmp(service_name, "git-upload-archive")) {
> >> > +               argv[1] = ".";
> >> > +               argv[2] = NULL;
> >> > +       } else {
> >> > +               argv[1] = "--stateless-rpc";
> >> > +               argv[2] = ".";
> >> > +               argv[3] = NULL;
> >> > +       }
> >>
> >> It may not be worth a reroll, but since you're touching this code
> >> anyhow, these days we'd use `strvec` for this:
> >>
> >>     struct strvec argv = STRVEC_INIT;
> >>     if (strcmp(service_name, "git-upload-archive"))
> >>         strvec_push(&argv, "--stateless-rpc");
> >>     strvec_push(&argv, ".");
> >
> >Good suggestion, I'll queue this up as part of next reroll.
>
> Which test covers this change?

See: https://lore.kernel.org/git/20230923152201.14741-4-worldhello.net@gmail.com/#Z31t:t5003-archive-zip.sh

> Thanks,
> Randall
>
Randall S. Becker Sept. 25, 2023, 1:04 a.m. UTC | #7
On Sunday, September 24, 2023 8:16 PM, Jiang Xin wrote:
>On Mon, Sep 25, 2023 at 7:58 AM <rsbecker@nexbridge.com> wrote:
>>
>> On Sunday, September 24, 2023 7:40 PM, Jiang Xin wrote:
>> >On Sun, Sep 24, 2023 at 2:52 PM Eric Sunshine
>> ><sunshine@sunshineco.com>
>> >wrote:
>> >>
>> >> On Sat, Sep 23, 2023 at 11:22 AM Jiang Xin <worldhello.net@gmail.com>
>wrote:
>> >> > Even though we can establish a stateless connection, we still
>> >> > cannot archive the remote repository using a stateless HTTP
>> >> > protocol. Try the following steps to make it work.
>> >> > [...]
>> >> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>> >> > ---
>> >> > diff --git a/http-backend.c b/http-backend.c @@ -639,10 +640,19
>> >> > @@ static void check_content_type(struct strbuf *hdr, const char
>*accepted_type)
>> >> > -       const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};
>> >> > +       const char *argv[4];
>> >> >
>> >> > +       if (!strcmp(service_name, "git-upload-archive")) {
>> >> > +               argv[1] = ".";
>> >> > +               argv[2] = NULL;
>> >> > +       } else {
>> >> > +               argv[1] = "--stateless-rpc";
>> >> > +               argv[2] = ".";
>> >> > +               argv[3] = NULL;
>> >> > +       }
>> >>
>> >> It may not be worth a reroll, but since you're touching this code
>> >> anyhow, these days we'd use `strvec` for this:
>> >>
>> >>     struct strvec argv = STRVEC_INIT;
>> >>     if (strcmp(service_name, "git-upload-archive"))
>> >>         strvec_push(&argv, "--stateless-rpc");
>> >>     strvec_push(&argv, ".");
>> >
>> >Good suggestion, I'll queue this up as part of next reroll.
>>
>> Which test covers this change?
>
>See: https://lore.kernel.org/git/20230923152201.14741-4-
>worldhello.net@gmail.com/#Z31t:t5003-archive-zip.sh

Thanks. That is what I needed. Looking forward to the merge.
--Randall
diff mbox series

Patch

diff --git a/http-backend.c b/http-backend.c
index ff07b87e64..ed3bed965a 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -38,6 +38,7 @@  struct rpc_service {
 static struct rpc_service rpc_service[] = {
 	{ "upload-pack", "uploadpack", 1, 1 },
 	{ "receive-pack", "receivepack", 0, -1 },
+	{ "upload-archive", "uploadarchive", 0, -1 },
 };
 
 static struct string_list *get_parameters(void)
@@ -639,10 +640,19 @@  static void check_content_type(struct strbuf *hdr, const char *accepted_type)
 
 static void service_rpc(struct strbuf *hdr, char *service_name)
 {
-	const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};
+	const char *argv[4];
 	struct rpc_service *svc = select_service(hdr, service_name);
 	struct strbuf buf = STRBUF_INIT;
 
+	if (!strcmp(service_name, "git-upload-archive")) {
+		argv[1] = ".";
+		argv[2] = NULL;
+	} else {
+		argv[1] = "--stateless-rpc";
+		argv[2] = ".";
+		argv[3] = NULL;
+	}
+
 	strbuf_reset(&buf);
 	strbuf_addf(&buf, "application/x-git-%s-request", svc->name);
 	check_content_type(hdr, buf.buf);
@@ -723,7 +733,8 @@  static struct service_cmd {
 	{"GET", "/objects/pack/pack-[0-9a-f]{64}\\.idx$", get_idx_file},
 
 	{"POST", "/git-upload-pack$", service_rpc},
-	{"POST", "/git-receive-pack$", service_rpc}
+	{"POST", "/git-receive-pack$", service_rpc},
+	{"POST", "/git-upload-archive$", service_rpc}
 };
 
 static int bad_request(struct strbuf *hdr, const struct service_cmd *c)
diff --git a/remote-curl.c b/remote-curl.c
index ef05752ca5..ce6cb8ac05 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1447,8 +1447,14 @@  static int stateless_connect(const char *service_name)
 	 * establish a stateless connection, otherwise we need to tell the
 	 * client to fallback to using other transport helper functions to
 	 * complete their request.
+	 *
+	 * The "git-upload-archive" service is a read-only operation. Fallback
+	 * to use "git-upload-pack" service to discover protocol version.
 	 */
-	discover = discover_refs(service_name, 0);
+	if (!strcmp(service_name, "git-upload-archive"))
+		discover = discover_refs("git-upload-pack", 0);
+	else
+		discover = discover_refs(service_name, 0);
 	if (discover->version != protocol_v2) {
 		printf("fallback\n");
 		fflush(stdout);
@@ -1486,9 +1492,11 @@  static int stateless_connect(const char *service_name)
 
 	/*
 	 * Dump the capability listing that we got from the server earlier
-	 * during the info/refs request.
+	 * during the info/refs request. This does not work with the
+	 * "git-upload-archive" service.
 	 */
-	write_or_die(rpc.in, discover->buf, discover->len);
+	if (strcmp(service_name, "git-upload-archive"))
+		write_or_die(rpc.in, discover->buf, discover->len);
 
 	/* Until we see EOF keep sending POSTs */
 	while (1) {
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index fc499cdff0..80123c1e06 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -239,4 +239,34 @@  check_zip with_untracked2
 check_added with_untracked2 untracked one/untracked
 check_added with_untracked2 untracked two/untracked
 
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success "setup for HTTP protocol" '
+	cp -R bare.git "$HTTPD_DOCUMENT_ROOT_PATH/bare.git" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/bare.git" \
+		config http.uploadpack true &&
+	set_askpass user@host pass@host
+'
+
+setup_askpass_helper
+
+test_expect_success 'remote archive does not work with protocol v1' '
+	test_when_finished "rm -f d5.zip" &&
+	test_must_fail git -c protocol.version=1 archive \
+		--remote="$HTTPD_URL/auth/smart/bare.git" \
+		--output=d5.zip HEAD >actual 2>&1 &&
+	cat >expect <<-EOF &&
+	fatal: can${SQ}t connect to subservice git-upload-archive
+	EOF
+	test_cmp expect actual
+'
+
+test_expect_success 'archive remote http repository' '
+	test_when_finished "rm -f d5.zip" &&
+	git archive --remote="$HTTPD_URL/auth/smart/bare.git" \
+		--output=d5.zip HEAD &&
+	test_cmp_bin d.zip d5.zip
+'
+
 test_done
diff --git a/transport-helper.c b/transport-helper.c
index 3c8802b7a3..91381be622 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -628,7 +628,8 @@  static int process_connect_service(struct transport *transport,
 		ret = run_connect(transport, &cmdbuf);
 	} else if (data->stateless_connect &&
 		   (get_protocol_version_config() == protocol_v2) &&
-		   !strcmp("git-upload-pack", name)) {
+		   (!strcmp("git-upload-pack", name) ||
+		    !strcmp("git-upload-archive", name))) {
 		strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
 		ret = run_connect(transport, &cmdbuf);
 		if (ret)