diff mbox series

[6/6] remote-curl: ensure last packet is a flush

Message ID 7a689da2bb820f70d9e668d656b088af2297d456.1589393036.git.liu.denton@gmail.com (mailing list archive)
State New, archived
Headers show
Series remote-curl: partial fix for a deadlock with stateless rpc | expand

Commit Message

Denton Liu May 13, 2020, 6:04 p.m. UTC
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated before the transaction is
complete, remote-curl will blindly forward the packets before waiting on
more input from fetch-pack. Meanwhile, fetch-pack will read the
transaction and continue reading, expecting more input to continue the
transaction. This results in a deadlock between the two processes.

This can be seen in the following command which does not terminate:

	$ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...

whereas the v1 version does terminate as expected:

	$ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012
	Cloning into 'git'...
	fatal: the remote end hung up unexpectedly

Instead of blindly forwarding packets, raise a flag when a flush packet
is encountered. Ensure that the last packet sent is a flush packet
otherwise error out, breaking the deadlock.

This is not a complete solution to the problem, however. It is possible
that a flush packet could be sent in the middle of a message and the
connection could die immediately after. Then, remote-curl would not
error out and fetch-pack would still be in the middle of a transaction
and they would enter deadlock. A complete solution would involve
reframing the stateless-connect protocol, possibly by introducing
another control packet ("0002"?) as a stateless request separator
packet which is always sent at the end of post_rpc().

Although this is not a complete solution, it is better than nothing and
it resolves the reported issue for now.

Reported-by: Force Charlie <charlieio@outlook.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---

Notes:
    I wish there were some way to insert a timeout on the test case so that
    we don't block forever in case we regress.

 remote-curl.c          |  9 +++++++++
 t/t5702-protocol-v2.sh | 17 +++++++++++++++++
 2 files changed, 26 insertions(+)

Comments

Denton Liu May 15, 2020, 9:02 p.m. UTC | #1
On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
> This is not a complete solution to the problem, however. It is possible
> that a flush packet could be sent in the middle of a message and the
> connection could die immediately after. Then, remote-curl would not
> error out and fetch-pack would still be in the middle of a transaction
> and they would enter deadlock. A complete solution would involve
> reframing the stateless-connect protocol, possibly by introducing
> another control packet ("0002"?) as a stateless request separator
> packet which is always sent at the end of post_rpc().
> 
> Although this is not a complete solution, it is better than nothing and
> it resolves the reported issue for now.

I managed to get the implementation of the control packet working. As a
result, I will be dropping this patch in the next reroll and replacing
it with the more complete solution. For anyone reviewing, feel free to
skip this patch.
Jeff King May 15, 2020, 9:41 p.m. UTC | #2
On Fri, May 15, 2020 at 05:02:45PM -0400, Denton Liu wrote:

> On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
> > This is not a complete solution to the problem, however. It is possible
> > that a flush packet could be sent in the middle of a message and the
> > connection could die immediately after. Then, remote-curl would not
> > error out and fetch-pack would still be in the middle of a transaction
> > and they would enter deadlock. A complete solution would involve
> > reframing the stateless-connect protocol, possibly by introducing
> > another control packet ("0002"?) as a stateless request separator
> > packet which is always sent at the end of post_rpc().
> > 
> > Although this is not a complete solution, it is better than nothing and
> > it resolves the reported issue for now.
> 
> I managed to get the implementation of the control packet working. As a
> result, I will be dropping this patch in the next reroll and replacing
> it with the more complete solution. For anyone reviewing, feel free to
> skip this patch.

OK. I'm less concerned about a flush packet in the middle of the
response fooling us into thinking things are done, and more that there
may be responses which _don't_ end in a flush. But maybe they all do.

This (and the previous patch) are definitely adding an extra layer of
assumptions about what the protocol going over the rpc channel looks
like. That seems like it will introduce more fragility.

I do kind of like the idea of a stateless separator packet, if I
understand your meaning correctly. I'll wait to see what the patches
look like. :)

-Peff
Junio C Hamano May 18, 2020, 4:34 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

> On Fri, May 15, 2020 at 05:02:45PM -0400, Denton Liu wrote:
>
>> On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
>> > This is not a complete solution to the problem, however. It is possible
>> > that a flush packet could be sent in the middle of a message and the
>> > connection could die immediately after. Then, remote-curl would not
>> > error out and fetch-pack would still be in the middle of a transaction
>> > and they would enter deadlock. A complete solution would involve
>> > reframing the stateless-connect protocol, possibly by introducing
>> > another control packet ("0002"?) as a stateless request separator
>> > packet which is always sent at the end of post_rpc().
>> > ...
> I do kind of like the idea of a stateless separator packet, if I
> understand your meaning correctly. I'll wait to see what the patches
> look like. :)

I vaguely recall talking with somebody (perhaps it was Shawn Pearce)
about my long-time complaint on the on-the-wire protocol, in that
the protocol sometimes uses flush to merely mean "I've finished
speaking one section of my speech" and sometimes "I've done talking
for now and it's your turn to speak to me" (the receiving end needs
to know which one a particular flush means without knowing the
context in which it is issued---which makes it impossible to write a
protocol agnostic proxy.  

If the above proposal means that we'll have an explicit way to say
"Not just I am done with one section of my speech, but it is your
turn to speak and I am going to listen", that would be wonderful.
Jeff King May 18, 2020, 4:52 p.m. UTC | #4
On Mon, May 18, 2020 at 09:34:42AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On Fri, May 15, 2020 at 05:02:45PM -0400, Denton Liu wrote:
> >
> >> On Wed, May 13, 2020 at 02:04:58PM -0400, Denton Liu wrote:
> >> > This is not a complete solution to the problem, however. It is possible
> >> > that a flush packet could be sent in the middle of a message and the
> >> > connection could die immediately after. Then, remote-curl would not
> >> > error out and fetch-pack would still be in the middle of a transaction
> >> > and they would enter deadlock. A complete solution would involve
> >> > reframing the stateless-connect protocol, possibly by introducing
> >> > another control packet ("0002"?) as a stateless request separator
> >> > packet which is always sent at the end of post_rpc().
> >> > ...
> > I do kind of like the idea of a stateless separator packet, if I
> > understand your meaning correctly. I'll wait to see what the patches
> > look like. :)
> 
> I vaguely recall talking with somebody (perhaps it was Shawn Pearce)
> about my long-time complaint on the on-the-wire protocol, in that
> the protocol sometimes uses flush to merely mean "I've finished
> speaking one section of my speech" and sometimes "I've done talking
> for now and it's your turn to speak to me" (the receiving end needs
> to know which one a particular flush means without knowing the
> context in which it is issued---which makes it impossible to write a
> protocol agnostic proxy.  
> 
> If the above proposal means that we'll have an explicit way to say
> "Not just I am done with one section of my speech, but it is your
> turn to speak and I am going to listen", that would be wonderful.

I think we already have that now in v2 because of the "0001" delim
packet. All of the flushes are (I think) really "this is the end of my
speech", and any inner "my list is ending, but I have more to say"
delimiters are "0001".

-Peff
Jeff King May 18, 2020, 9 p.m. UTC | #5
On Mon, May 18, 2020 at 12:52:07PM -0400, Jeff King wrote:

> > I vaguely recall talking with somebody (perhaps it was Shawn Pearce)
> > about my long-time complaint on the on-the-wire protocol, in that
> > the protocol sometimes uses flush to merely mean "I've finished
> > speaking one section of my speech" and sometimes "I've done talking
> > for now and it's your turn to speak to me" (the receiving end needs
> > to know which one a particular flush means without knowing the
> > context in which it is issued---which makes it impossible to write a
> > protocol agnostic proxy.  
> > 
> > If the above proposal means that we'll have an explicit way to say
> > "Not just I am done with one section of my speech, but it is your
> > turn to speak and I am going to listen", that would be wonderful.
> 
> I think we already have that now in v2 because of the "0001" delim
> packet. All of the flushes are (I think) really "this is the end of my
> speech", and any inner "my list is ending, but I have more to say"
> delimiters are "0001".

Sadly, this is only _mostly_ true. See my response in:

  https://lore.kernel.org/git/20200518205854.GB63978@coredump.intra.peff.net/

-Peff
diff mbox series

Patch

diff --git a/remote-curl.c b/remote-curl.c
index 8b740354e5..aab17851be 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -684,6 +684,7 @@  struct rpc_in_data {
 	struct active_request_slot *slot;
 	struct strbuf len_buf;
 	int remaining;
+	int last_flush;
 };
 
 /*
@@ -707,6 +708,8 @@  static size_t rpc_in(char *ptr, size_t eltsize,
 		data->rpc->any_written = 1;
 
 	while (unwritten) {
+		data->last_flush = 0;
+
 		if (!data->remaining) {
 			int digits_remaining = 4 - data->len_buf.len;
 			if (digits_remaining > unwritten)
@@ -720,6 +723,8 @@  static size_t rpc_in(char *ptr, size_t eltsize,
 				if (data->remaining < 0) {
 					die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf);
 				} else if (data->remaining <= 1) {
+					if (!data->remaining)
+						data->last_flush = 1;
 					data->remaining = 0;
 				} else if (data->remaining < 4) {
 					die(_("remote-curl: bad line length %d"), data->remaining);
@@ -960,6 +965,7 @@  static int post_rpc(struct rpc_state *rpc, int flush_received)
 	rpc_in_data.slot = slot;
 	strbuf_init(&rpc_in_data.len_buf, 4);
 	rpc_in_data.remaining = 0;
+	rpc_in_data.last_flush = 0;
 	curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data);
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -979,6 +985,9 @@  static int post_rpc(struct rpc_state *rpc, int flush_received)
 	if (rpc_in_data.remaining)
 		err = error(_("%d bytes are still expected"), rpc_in_data.remaining);
 
+	if (!rpc_in_data.last_flush)
+		err = error(_("last packet was not a flush"));
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 5039e66dc4..4570d0746c 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -586,6 +586,23 @@  test_expect_success 'clone with http:// using protocol v2' '
 	! grep "Send header: Transfer-Encoding: chunked" log
 '
 
+test_expect_success 'clone with http:// using protocol v2 and invalid parameters' '
+	test_when_finished "rm -f log" &&
+
+	test_must_fail env GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" \
+		git -c protocol.version=2 \
+		clone --shallow-since=20151012 "$HTTPD_URL/smart/http_parent" http_child_invalid 2>err &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+	# Verify that we errored out in the expected way
+	test_i18ngrep "last packet was not a flush" err &&
+	# Verify that the chunked encoding sending codepath is NOT exercised
+	! grep "Send header: Transfer-Encoding: chunked" log
+'
+
 test_expect_success 'clone big repository with http:// using protocol v2' '
 	test_when_finished "rm -f log" &&