diff mbox series

upload-pack: allow stateless client EOF just prior to haves

Message ID 1604022059-18527-1-git-send-email-dan@mutual.io (mailing list archive)
State Superseded
Headers show
Series upload-pack: allow stateless client EOF just prior to haves | expand

Commit Message

Daniel Duvall Oct. 30, 2020, 1:40 a.m. UTC
During stateless packfile negotiation, it is normal behavior for
stateless RPC clients (e.g. git-remote-curl) to send multiple
upload-pack requests with the first containing only the
wants/shallows/deepens/filters followed by a flush.

When run in stateless mode, continuing on without first checking that
the client request has reached EOF can result in a bad file descriptor
during get_common_commits.

Instead, upload-pack should gently peek for an EOF between the sending
of shallow/unshallow lines (followed by flush) and the reading of client
haves. If the client has hung up at this point, exit normally.

Signed-off-by: Daniel Duvall <dan@mutual.io>
---
 t/t9904-upload-pack-stateless-timely-eof.sh | 24 ++++++++++++++++++++++++
 upload-pack.c                               | 13 ++++++++++++-
 2 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100755 t/t9904-upload-pack-stateless-timely-eof.sh

Comments

Eric Sunshine Oct. 30, 2020, 2:59 a.m. UTC | #1
On Thu, Oct 29, 2020 at 9:51 PM Daniel Duvall <dan@mutual.io> wrote:
> [...]
> Instead, upload-pack should gently peek for an EOF between the sending
> of shallow/unshallow lines (followed by flush) and the reading of client
> haves. If the client has hung up at this point, exit normally.
>
> Signed-off-by: Daniel Duvall <dan@mutual.io>
> ---
> diff --git a/t/t9904-upload-pack-stateless-timely-eof.sh b/t/t9904-upload-pack-stateless-timely-eof.sh
> @@ -0,0 +1,24 @@
> +#!/bin/sh
> +
> +test_description='stateless upload-pack gently handles EOF just after want/shallow/depth/flush'
> +
> +. ./test-lib.sh
> +
> +D=$(pwd)

What is the purpose of this assignment? It doesn't seem to be used
anywhere in this script.

> +test_expect_success 'upload-pack outputs flush and exits ok' '
> +       test_commit initial &&
> +       head=$(git rev-parse HEAD) &&
> +       hexsz=$(test_oid hexsz) &&
> +
> +       printf "%04xwant %s\n%04xshallow %s\n000ddeepen 1\n0000" \
> +               $(($hexsz + 10)) $head $(($hexsz + 13)) $head >request &&
> +
> +       git upload-pack --stateless-rpc "$(pwd)" <request >actual &&
> +
> +       printf "0000" >expect &&
> +
> +       test_cmp expect actual
> +'
> +
> +test_done
Daniel Duvall Oct. 30, 2020, 3:31 a.m. UTC | #2
On Thu, Oct 29, 2020 at 8:00 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Thu, Oct 29, 2020 at 9:51 PM Daniel Duvall <dan@mutual.io> wrote:
> > [...]
> > Instead, upload-pack should gently peek for an EOF between the sending
> > of shallow/unshallow lines (followed by flush) and the reading of client
> > haves. If the client has hung up at this point, exit normally.
> >
> > Signed-off-by: Daniel Duvall <dan@mutual.io>
> > ---
> > diff --git a/t/t9904-upload-pack-stateless-timely-eof.sh b/t/t9904-upload-pack-stateless-timely-eof.sh
> > @@ -0,0 +1,24 @@
> > +#!/bin/sh
> > +
> > +test_description='stateless upload-pack gently handles EOF just after want/shallow/depth/flush'
> > +
> > +. ./test-lib.sh
> > +
> > +D=$(pwd)
>
> What is the purpose of this assignment? It doesn't seem to be used
> anywhere in this script.

It's an artifact of my copying/pasting a previous test script as a
template. I didn't see it used in that script either, so I assumed it
was needed somewhere in the test libs. If that's not the case, I can
definitely remove it.

>
> > +test_expect_success 'upload-pack outputs flush and exits ok' '
> > +       test_commit initial &&
> > +       head=$(git rev-parse HEAD) &&
> > +       hexsz=$(test_oid hexsz) &&
> > +
> > +       printf "%04xwant %s\n%04xshallow %s\n000ddeepen 1\n0000" \
> > +               $(($hexsz + 10)) $head $(($hexsz + 13)) $head >request &&
> > +
> > +       git upload-pack --stateless-rpc "$(pwd)" <request >actual &&
> > +
> > +       printf "0000" >expect &&
> > +
> > +       test_cmp expect actual
> > +'
> > +
> > +test_done
Jeff King Oct. 30, 2020, 4:40 a.m. UTC | #3
On Thu, Oct 29, 2020 at 06:40:59PM -0700, Daniel Duvall wrote:

> During stateless packfile negotiation, it is normal behavior for
> stateless RPC clients (e.g. git-remote-curl) to send multiple
> upload-pack requests with the first containing only the
> wants/shallows/deepens/filters followed by a flush.

Hmm, is this normal? I'd expect it to send a "done" after the flush, and
indeed that's what happens if I try it. I see:

E.g., here's GIT_TRACE_CURL output from a v0 request (fetching into an
empty repo):

 Send data: 00a8want 4b379b7a1b97790f805c365b1e58b75b70d3d904 multi_ack_
 Send data: detailed no-done side-band-64k thin-pack ofs-delta deepen-si
 Send data: nce deepen-not agent=git/2.29.2.477.g2cec8aa0af.00000009done
 Send data: .
 Info: upload completely sent off: 181 out of 181 bytes

However, if I add --depth 1 to my fetch, I get:

  Send data: 00a8want 4b379b7a1b97790f805c365b1e58b75b70d3d904 multi_ack_
  Send data: detailed no-done side-band-64k thin-pack ofs-delta deepen-si
  Send data: nce deepen-not agent=git/2.29.2.477.g2cec8aa0af.000cdeepen 1
  Send data: 0000
  Info: upload completely sent off: 184 out of 184 bytes

Which maybe makes sense if we need the shallow response from the server
to determine the next step of the request. It doesn't matter for my
trivial case here (we end up resending the same request with a "done"
added), but I guess it could in other cases.

This client side code is in fetch-pack.c:

  static int find_common(struct fetch_negotiator *negotiator,
  [...]
  {
          [...]
          if (args->deepen) {
		  [...]
                  send_request(args, fd[1], &req_buf);
                  while (packet_reader_read(&reader) == PACKET_READ_NORMAL) {
                          if (skip_prefix(reader.line, "shallow ", &arg)) {
				[...etc...]
                  }
          } else if (!args->stateless_rpc)
                  send_request(args, fd[1], &req_buf);


and dates all the way back to 249b2004d8 (Smart fetch over HTTP: client
side, 2009-10-30). Curiously we don't send it in non-stateless-rpc mode,
even if we're shallow. So I'm a little puzzled why we need the response
first in the stateless case, as we start writing "have" lines at the
server whether before we get the shallow lines back from the server in
either case.

So I'm not entirely convinced there isn't a small bug lurking on the
client side here. Not one that produces a wrong answer, but one which
wastes an extra round trip to the server. I won't be at all surprised if
there is some subtle timing dependency here, though, and the extra
request really is necessary.

But even if it is a client-side bug, it has been in the wild for a long
time, and it may be worth having the server side handle this more
gracefully.

> When run in stateless mode, continuing on without first checking that
> the client request has reached EOF can result in a bad file descriptor
> during get_common_commits.

When you say "bad file descriptor" that makes me think we're getting
EBADF after trying to use a closed descriptor. But we'd die() as soon as
the pkt-line code tries to read and gets eof, right?

That's also worth fixing, but I want to make sure I understand the
problem completely. I think this part of the commit message would be a
good place to talk about the real-world effects:

  - the client doesn't care; by definition it has hung up at this point
    and will keep going with its next request

  - likewise, the server doesn't care in terms of its response; by
    definition it is stateless, so the next request the client makes
    will start fresh

  - it is annoying for server admins who get a bunch of useless logs
    with "remote end hung up unexpected", or if they are tracking exit
    codes from upload-pack

As somebody who has admin'd a busy git site, unexpected client network
drops are just a fact of life and you have to look past them in your
logs. But I still think it's worth keeping it as uncluttered as possible
and having upload-pack handle this without an error message.

> Instead, upload-pack should gently peek for an EOF between the sending
> of shallow/unshallow lines (followed by flush) and the reading of client
> haves. If the client has hung up at this point, exit normally.

Should we do this only if we saw a deepen line? From my reading of the
client code, that's the only thing that would cause this early request.
I don't know if there's any particular advantage to being more strict
here.

If we're _not_ going to be strict, then I actually wonder if we ought to
simply teach get_common_commits() that seeing an EOF is OK in stateless
mode, wherever it comes. It can't possibly impact the correctness of the
protocol conversation (since we're stateless and the client is gone),
but maybe it's useful if you're trying to count how often clients really
do hang up.

> --- /dev/null
> +++ b/t/t9904-upload-pack-stateless-timely-eof.sh

We usually try to group related tests by number. Maybe t5705 would be a
better spot? I also wondered if this could go into t5704, but its title
is "protocol violations". It's not clear to me yet if this is a
violation that happens to be mostly harmless, or something we need to be
doing. :)

> +test_expect_success 'upload-pack outputs flush and exits ok' '
> +	test_commit initial &&
> +	head=$(git rev-parse HEAD) &&
> +	hexsz=$(test_oid hexsz) &&
> +
> +	printf "%04xwant %s\n%04xshallow %s\n000ddeepen 1\n0000" \
> +		$(($hexsz + 10)) $head $(($hexsz + 13)) $head >request &&

We have a helper function that makes this a bit easier to read:

diff --git a/t/t9904-upload-pack-stateless-timely-eof.sh b/t/t9904-upload-pack-stateless-timely-eof.sh
index f8385a7ebd..1108401e8f 100755
--- a/t/t9904-upload-pack-stateless-timely-eof.sh
+++ b/t/t9904-upload-pack-stateless-timely-eof.sh
@@ -9,10 +9,13 @@ D=$(pwd)
 test_expect_success 'upload-pack outputs flush and exits ok' '
 	test_commit initial &&
 	head=$(git rev-parse HEAD) &&
-	hexsz=$(test_oid hexsz) &&
 
-	printf "%04xwant %s\n%04xshallow %s\n000ddeepen 1\n0000" \
-		$(($hexsz + 10)) $head $(($hexsz + 13)) $head >request &&
+	{
+		packetize "want $head" &&
+		packetize "shallow $head" &&
+		packetize "deepen 1" &&
+		printf "0000"
+	} >request &&
 
 	git upload-pack --stateless-rpc "$(pwd)" <request >actual &&
 

> +	git upload-pack --stateless-rpc "$(pwd)" <request >actual &&

You can just use "." here, which is a little shorter. It's not entirely
cosmetic; the difference between $PWD and $(pwd) on Windows always trips
me up, so I try to avoid using either whenever I can. ;)

It would be nice if we could test this through a real use of Git, but it
might not be worth the hassle. I guess we'd have to mine the apache logs
in one of our http test scripts to see if upload-pack failed. And I
guess if we _do_ change the client side to stop sending the extra
request but want to treat historical clients more gracefully, we'd still
need a manual test like this.

> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -1344,7 +1344,18 @@ void upload_pack(struct upload_pack_options *options)
>  				   PACKET_READ_DIE_ON_ERR_PACKET);

The actual fix looks correct to me, modulo the alternatives I raised
earlier.

This function only handles the v0 protocol. For v2, we end up in
upload_pack_v2(). But my reading of the client side do_fetch_pack_v2()
is that it _doesn't_ send this extra request. And a simple test seems to
confirm it. Which gives me further pause as to whether the extra request
is necessary for v0.


Well, that review ended up a bit longer than I had imagined. So let me
add what I should have said at the top: welcome to the Git list, and
thanks for looking into this issue. :)

-Peff
Daniel Duvall Oct. 30, 2020, 7:47 a.m. UTC | #4
Thanks for the welcome! I appreciate the lengthy feedback, because
above all else I'd love to understand the low-level workings of git
better than I do now.

On Thu, Oct 29, 2020 at 9:40 PM Jeff King <peff@peff.net> wrote:
>
> On Thu, Oct 29, 2020 at 06:40:59PM -0700, Daniel Duvall wrote:
>
> > During stateless packfile negotiation, it is normal behavior for
> > stateless RPC clients (e.g. git-remote-curl) to send multiple
> > upload-pack requests with the first containing only the
> > wants/shallows/deepens/filters followed by a flush.
>
> Hmm, is this normal? I'd expect it to send a "done" after the flush, and
> indeed that's what happens if I try it. I see:
>
> E.g., here's GIT_TRACE_CURL output from a v0 request (fetching into an
> empty repo):
>
>  Send data: 00a8want 4b379b7a1b97790f805c365b1e58b75b70d3d904 multi_ack_
>  Send data: detailed no-done side-band-64k thin-pack ofs-delta deepen-si
>  Send data: nce deepen-not agent=git/2.29.2.477.g2cec8aa0af.00000009done
>  Send data: .
>  Info: upload completely sent off: 181 out of 181 bytes
>
> However, if I add --depth 1 to my fetch, I get:
>
>   Send data: 00a8want 4b379b7a1b97790f805c365b1e58b75b70d3d904 multi_ack_
>   Send data: detailed no-done side-band-64k thin-pack ofs-delta deepen-si
>   Send data: nce deepen-not agent=git/2.29.2.477.g2cec8aa0af.000cdeepen 1
>   Send data: 0000
>   Info: upload completely sent off: 184 out of 184 bytes
>
> Which maybe makes sense if we need the shallow response from the server
> to determine the next step of the request. It doesn't matter for my
> trivial case here (we end up resending the same request with a "done"
> added), but I guess it could in other cases.

Right. This is what I was observing too—in a trivial case, the same
roundtrip being made again with nothing additional but "done". I
should have clarified that I thought it was normal only when a depth
was provided. I've been learning the packfile negotiation protocols as
I've been debugging this issue, so I probably should have reserved my
assertion of "normal" for when I have a firm grasp of them. :)

>
> > When run in stateless mode, continuing on without first checking that
> > the client request has reached EOF can result in a bad file descriptor
> > during get_common_commits.
>
> When you say "bad file descriptor" that makes me think we're getting
> EBADF after trying to use a closed descriptor. But we'd die() as soon as
> the pkt-line code tries to read and gets eof, right?

Right. It's still a valid file descriptor but a premature die() upon
EOF. I'll clarify that.

>
> That's also worth fixing, but I want to make sure I understand the
> problem completely. I think this part of the commit message would be a
> good place to talk about the real-world effects:
>
>   - the client doesn't care; by definition it has hung up at this point
>     and will keep going with its next request

That's true in the case where the server doesn't surface the non-zero
exit code from git-upload-pack—which results in git-http-backend
existing non-zero as well. An apache setup I used in testing doesn't
seem to care about the failure—it responds 200 and so the client is
happy—but in our (dayjob) case, we're running a Phabricator instance
which handles a non-zero exit from git-http-backend by responding 500,
and the client dies.

$ git fetch --depth 1
https://phabricator.wikimedia.org/source/phabricator.git HEAD
error: RPC failed; HTTP 500 curl 22 The requested URL returned error: 500
fatal: the remote end hung up unexpectedly

>
>   - likewise, the server doesn't care in terms of its response; by
>     definition it is stateless, so the next request the client makes
>     will start fresh

That makes sense to me.

>
>   - it is annoying for server admins who get a bunch of useless logs
>     with "remote end hung up unexpected", or if they are tracking exit
>     codes from upload-pack

Yes! Phabricator tries to work around this by detecting this stderr
output with a regex, which doesn't seem to hold up too well over time.

See https://secure.phabricator.com/D21484

This is actually how I started down this rabbithole. It seemed odd to
me that git-http-backend was exiting non-zero in the first place. Then
again, I didn't much understand packfile negotiation, so it was just a
hunch that there was some kind of bug. Knowing more about the protocol
now, it definitely seems buggy.

>
> As somebody who has admin'd a busy git site, unexpected client network
> drops are just a fact of life and you have to look past them in your
> logs. But I still think it's worth keeping it as uncluttered as possible
> and having upload-pack handle this without an error message.
>
> > Instead, upload-pack should gently peek for an EOF between the sending
> > of shallow/unshallow lines (followed by flush) and the reading of client
> > haves. If the client has hung up at this point, exit normally.
>
> Should we do this only if we saw a deepen line? From my reading of the
> client code, that's the only thing that would cause this early request.
> I don't know if there's any particular advantage to being more strict
> here.

I'm not sure what would be more correct. It seems to go against the
"server doesn't care" in a stateless case to be that strict, but maybe
there are additional benefits to strictness I'm not aware of.

>
> If we're _not_ going to be strict, then I actually wonder if we ought to
> simply teach get_common_commits() that seeing an EOF is OK in stateless
> mode, wherever it comes. It can't possibly impact the correctness of the
> protocol conversation (since we're stateless and the client is gone),
> but maybe it's useful if you're trying to count how often clients really
> do hang up.

I originally took that approach, but gently handling an EOF in the
get_common_commits loop resulted in a NAK being sent back because of:

                if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
                        [...]
                        if (data->have_obj.nr == 0 || data->multi_ack)
                                packet_write_fmt(1, "NAK\n");
                        [...]
                        if (data->stateless_rpc)
                                exit(0);
                        [...]
                }

which the client died on with an "expected shallow list" message. I
didn't see a straightforward way of modifying the conditions _inside_
the loop while ensuring I wasn't changing any expected behavior upon
EOF.

>
> > --- /dev/null
> > +++ b/t/t9904-upload-pack-stateless-timely-eof.sh
>
> We usually try to group related tests by number. Maybe t5705 would be a
> better spot? I also wondered if this could go into t5704, but its title
> is "protocol violations". It's not clear to me yet if this is a
> violation that happens to be mostly harmless, or something we need to be
> doing. :)

Ah! Okay, I was wondering about that, whether the numbered prefixes
were serial or otherwise meaningful. I'll try to group it
appropriately.

>
> > +test_expect_success 'upload-pack outputs flush and exits ok' '
> > +     test_commit initial &&
> > +     head=$(git rev-parse HEAD) &&
> > +     hexsz=$(test_oid hexsz) &&
> > +
> > +     printf "%04xwant %s\n%04xshallow %s\n000ddeepen 1\n0000" \
> > +             $(($hexsz + 10)) $head $(($hexsz + 13)) $head >request &&
>
> We have a helper function that makes this a bit easier to read:
>
> diff --git a/t/t9904-upload-pack-stateless-timely-eof.sh b/t/t9904-upload-pack-stateless-timely-eof.sh
> index f8385a7ebd..1108401e8f 100755
> --- a/t/t9904-upload-pack-stateless-timely-eof.sh
> +++ b/t/t9904-upload-pack-stateless-timely-eof.sh
> @@ -9,10 +9,13 @@ D=$(pwd)
>  test_expect_success 'upload-pack outputs flush and exits ok' '
>         test_commit initial &&
>         head=$(git rev-parse HEAD) &&
> -       hexsz=$(test_oid hexsz) &&
>
> -       printf "%04xwant %s\n%04xshallow %s\n000ddeepen 1\n0000" \
> -               $(($hexsz + 10)) $head $(($hexsz + 13)) $head >request &&
> +       {
> +               packetize "want $head" &&
> +               packetize "shallow $head" &&
> +               packetize "deepen 1" &&
> +               printf "0000"
> +       } >request &&
>
>         git upload-pack --stateless-rpc "$(pwd)" <request >actual &&
>
>
> > +     git upload-pack --stateless-rpc "$(pwd)" <request >actual &&
>
> You can just use "." here, which is a little shorter. It's not entirely
> cosmetic; the difference between $PWD and $(pwd) on Windows always trips
> me up, so I try to avoid using either whenever I can. ;)

Very cool re: the helpers. I'll make those changes.

>
> It would be nice if we could test this through a real use of Git, but it
> might not be worth the hassle. I guess we'd have to mine the apache logs
> in one of our http test scripts to see if upload-pack failed. And I
> guess if we _do_ change the client side to stop sending the extra
> request but want to treat historical clients more gracefully, we'd still
> need a manual test like this.

Yeah I just went with the minimal case I'm familiar with, but I'm game
to set up a more thorough case with some guidance.

>
> > --- a/upload-pack.c
> > +++ b/upload-pack.c
> > @@ -1344,7 +1344,18 @@ void upload_pack(struct upload_pack_options *options)
> >                                  PACKET_READ_DIE_ON_ERR_PACKET);
>
> The actual fix looks correct to me, modulo the alternatives I raised
> earlier.
>
> This function only handles the v0 protocol. For v2, we end up in
> upload_pack_v2(). But my reading of the client side do_fetch_pack_v2()
> is that it _doesn't_ send this extra request. And a simple test seems to
> confirm it. Which gives me further pause as to whether the extra request
> is necessary for v0.

When I do a trace using v2, I see two roundtrip requests as well. I
haven't tested the exit status of git-upload-pack in that case
however. It's getting late for me but I'll investigate tomorrow.

>
>
> Well, that review ended up a bit longer than I had imagined. So let me
> add what I should have said at the top: welcome to the Git list, and
> thanks for looking into this issue. :)

Thanks again! I appreciate all the feedback.
Jeff King Oct. 30, 2020, 9:09 a.m. UTC | #5
On Fri, Oct 30, 2020 at 12:47:29AM -0700, Daniel Duvall wrote:

> >   - the client doesn't care; by definition it has hung up at this point
> >     and will keep going with its next request
> 
> That's true in the case where the server doesn't surface the non-zero
> exit code from git-upload-pack—which results in git-http-backend
> existing non-zero as well. An apache setup I used in testing doesn't
> seem to care about the failure—it responds 200 and so the client is
> happy—but in our (dayjob) case, we're running a Phabricator instance
> which handles a non-zero exit from git-http-backend by responding 500,
> and the client dies.
> 
> $ git fetch --depth 1
> https://phabricator.wikimedia.org/source/phabricator.git HEAD
> error: RPC failed; HTTP 500 curl 22 The requested URL returned error: 500
> fatal: the remote end hung up unexpectedly

Thanks, that's a really helpful data point. The flaw in my reasoning is
that the client connection is bidirectional: just because the client
hung up does not mean they are not still listening to the remainder of
what the server has to say.

It sounds like Phabricator should probably be ignoring the exit code
from http-backend. Or at least doing so when the CGI managed to produce
an HTTP status code, and assuming that Git's response was either
well-formed, or that the client will realize it was broken. (Or possibly
http-backend should be less eager to pass along a non-zero exit code,
but it amounts to the same thing).

But regardless, we should make sure that upload-pack is doing the right
thing at least for this case.

> > If we're _not_ going to be strict, then I actually wonder if we ought to
> > simply teach get_common_commits() that seeing an EOF is OK in stateless
> > mode, wherever it comes. It can't possibly impact the correctness of the
> > protocol conversation (since we're stateless and the client is gone),
> > but maybe it's useful if you're trying to count how often clients really
> > do hang up.
> 
> I originally took that approach, but gently handling an EOF in the
> get_common_commits loop resulted in a NAK being sent back because of:
> 
>                 if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
>                         [...]
>                         if (data->have_obj.nr == 0 || data->multi_ack)
>                                 packet_write_fmt(1, "NAK\n");
>                         [...]
>                         if (data->stateless_rpc)
>                                 exit(0);
>                         [...]
>                 }
> 
> which the client died on with an "expected shallow list" message. I
> didn't see a straightforward way of modifying the conditions _inside_
> the loop while ensuring I wasn't changing any expected behavior upon
> EOF.

I was thinking something more like:

diff --git a/upload-pack.c b/upload-pack.c
index 2b128e4ad8..f6d3ef3e13 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -520,10 +520,18 @@ static int get_common_commits(struct upload_pack_data *data,
 
 	for (;;) {
 		const char *arg;
+		enum packet_read_status status;
 
 		reset_timeout(data->timeout);
 
-		if (packet_reader_read(reader) != PACKET_READ_NORMAL) {
+		status = packet_reader_read(reader);
+		if (status == PACKET_READ_EOF) {
+			if (data->stateless_rpc)
+				exit(0);
+			die("the remote end hung up unexpectedly");
+		}
+
+		if (status != PACKET_READ_NORMAL) {
 			if (data->multi_ack == MULTI_ACK_DETAILED
 			    && got_common
 			    && !got_other

But after reading the rest of your response, I think it probably does
make sense to be more strict here, and only handle the unexpected EOF
before entering get_common_commits(). I don't _think_ it should matter
to the client, because this whole "status != PACKET_READ_NORMAL" path
would never be entered with the current code (we'd die() when we see the
EOF inside packet_reader_read()).

> > This function only handles the v0 protocol. For v2, we end up in
> > upload_pack_v2(). But my reading of the client side do_fetch_pack_v2()
> > is that it _doesn't_ send this extra request. And a simple test seems to
> > confirm it. Which gives me further pause as to whether the extra request
> > is necessary for v0.
> 
> When I do a trace using v2, I see two roundtrip requests as well. I
> haven't tested the exit status of git-upload-pack in that case
> however. It's getting late for me but I'll investigate tomorrow.

There's an extra round-trip in v2: we probe for v2 and get the
capabilities in the initial request, then we get the ref advertisement,
then we start the object negotiation. In v0 the ref advertisement is
lumped into the initial response.

Here's what I see with:

  $ git init
  $ GIT_TRACE_CURL=$PWD/trace.out \
      git -c protocol.version=2 fetch --depth 1 \
      https://github.com/git/git master
  $ perl -lne '/(Send data:|Info: upload completely sent) .*/ and print $&' trace.out
  Send data: 0014command=ls-refs.0024agent=git/2.29.2.477.g2cec8aa0af0001
  Send data: 0009peel.000csymrefs.0016ref-prefix master.001bref-prefix re
  Send data: fs/master.0020ref-prefix refs/tags/master.0021ref-prefix ref
  Send data: s/heads/master.0023ref-prefix refs/remotes/master.0028ref-pr
  Send data: efix refs/remotes/master/HEAD.001aref-prefix refs/tags/.0000
  Info: upload completely sent off: 300 out of 300 bytes
  Send data: 0011command=fetch0024agent=git/2.29.2.477.g2cec8aa0af0001000
  Send data: dthin-pack000dofs-delta000cdeepen 10032want ad27df6a5cff694a
  Send data: dd500ab8c7f97234feb4a91f.0009done.0000
  Info: upload completely sent off: 158 out of 158 bytes

So the first one POST is getting the ref advertisement, and the second
one is the full negotiation request, including the "done".

I'm still uncertain whether it could all be done in one request for v0.
But one possible solution is: let's not care. If v2 does it correctly,
that's the future anyway (or present; it's now the default in v2.29).
And the change you're proposing in upload-pack would be desirable anyway
to help deal with older clients.

If that's the route we go, we should make sure the commit message
explains it.

-Peff
Junio C Hamano Nov. 3, 2020, 9:10 p.m. UTC | #6
Jeff King <peff@peff.net> writes:

> I'm still uncertain whether it could all be done in one request for v0.
> But one possible solution is: let's not care. If v2 does it correctly,
> that's the future anyway (or present; it's now the default in v2.29).
> And the change you're proposing in upload-pack would be desirable anyway
> to help deal with older clients.
>
> If that's the route we go, we should make sure the commit message
> explains it.

Yeah, I'd agree that punting on v0 and making sure the current
version would work well is good enough.

I lost track and am not sure what's the current status of the topic
is.  Is v3 [*1*] the latest and satisfactory one?

Thanks.


[Reference]
*1* https://lore.kernel.org/git/20201031023901.48193-1-dan@mutual.io/
Jeff King Nov. 4, 2020, 1:33 p.m. UTC | #7
On Tue, Nov 03, 2020 at 01:10:53PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > I'm still uncertain whether it could all be done in one request for v0.
> > But one possible solution is: let's not care. If v2 does it correctly,
> > that's the future anyway (or present; it's now the default in v2.29).
> > And the change you're proposing in upload-pack would be desirable anyway
> > to help deal with older clients.
> >
> > If that's the route we go, we should make sure the commit message
> > explains it.
> 
> Yeah, I'd agree that punting on v0 and making sure the current
> version would work well is good enough.
> 
> I lost track and am not sure what's the current status of the topic
> is.  Is v3 [*1*] the latest and satisfactory one?

Yeah, I just read over v3 again and it looks good to me. Thanks, Daniel!

-Peff
Daniel Duvall Nov. 4, 2020, 2:06 p.m. UTC | #8
On Wed, Nov 4, 2020 at 5:33 AM Jeff King <peff@peff.net> wrote:
>
> On Tue, Nov 03, 2020 at 01:10:53PM -0800, Junio C Hamano wrote:
>
> > Yeah, I'd agree that punting on v0 and making sure the current
> > version would work well is good enough.
> >
> > I lost track and am not sure what's the current status of the topic
> > is.  Is v3 [*1*] the latest and satisfactory one?
>
> Yeah, I just read over v3 again and it looks good to me. Thanks, Daniel!
>
> -Peff

Contributing this tiny patch was a rewarding experience, getting to
know Git more intimately and collaborating with you all. Thanks so
much!

Kindly,
Daniel
Junio C Hamano Nov. 4, 2020, 6:25 p.m. UTC | #9
Daniel Duvall <dan@mutual.io> writes:

> On Wed, Nov 4, 2020 at 5:33 AM Jeff King <peff@peff.net> wrote:
>>
>> On Tue, Nov 03, 2020 at 01:10:53PM -0800, Junio C Hamano wrote:
>>
>> > Yeah, I'd agree that punting on v0 and making sure the current
>> > version would work well is good enough.
>> >
>> > I lost track and am not sure what's the current status of the topic
>> > is.  Is v3 [*1*] the latest and satisfactory one?
>>
>> Yeah, I just read over v3 again and it looks good to me. Thanks, Daniel!
>>
>> -Peff
>
> Contributing this tiny patch was a rewarding experience, getting to
> know Git more intimately and collaborating with you all. Thanks so
> much!

Thanks, both of you.  Marked v3 to be ready for 'next'.
diff mbox series

Patch

diff --git a/t/t9904-upload-pack-stateless-timely-eof.sh b/t/t9904-upload-pack-stateless-timely-eof.sh
new file mode 100755
index 0000000..f8385a7
--- /dev/null
+++ b/t/t9904-upload-pack-stateless-timely-eof.sh
@@ -0,0 +1,24 @@ 
+#!/bin/sh
+
+test_description='stateless upload-pack gently handles EOF just after want/shallow/depth/flush'
+
+. ./test-lib.sh
+
+D=$(pwd)
+
+test_expect_success 'upload-pack outputs flush and exits ok' '
+	test_commit initial &&
+	head=$(git rev-parse HEAD) &&
+	hexsz=$(test_oid hexsz) &&
+
+	printf "%04xwant %s\n%04xshallow %s\n000ddeepen 1\n0000" \
+		$(($hexsz + 10)) $head $(($hexsz + 13)) $head >request &&
+
+	git upload-pack --stateless-rpc "$(pwd)" <request >actual &&
+
+	printf "0000" >expect &&
+
+	test_cmp expect actual
+'
+
+test_done
diff --git a/upload-pack.c b/upload-pack.c
index 3b858eb..2b128e4 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1344,7 +1344,18 @@  void upload_pack(struct upload_pack_options *options)
 				   PACKET_READ_DIE_ON_ERR_PACKET);
 
 		receive_needs(&data, &reader);
-		if (data.want_obj.nr) {
+
+		/*
+		 * An EOF at this exact point in negotiation should be
+		 * acceptable from stateless clients as they will consume the
+		 * shallow list before doing subsequent rpc with haves/etc.
+		 */
+		if (data.stateless_rpc)
+			reader.options |= PACKET_READ_GENTLE_ON_EOF;
+
+		if (data.want_obj.nr &&
+		    packet_reader_peek(&reader) != PACKET_READ_EOF) {
+			reader.options ^= PACKET_READ_GENTLE_ON_EOF;
 			get_common_commits(&data, &reader);
 			create_pack_file(&data, NULL);
 		}