diff mbox series

[5/6] remote-curl: error on incomplete packet

Message ID 3ed7cf87aaa40ee66b20aa929d89d28fefcec312.1589393036.git.liu.denton@gmail.com (mailing list archive)
State New, archived
Headers show
Series remote-curl: partial fix for a deadlock with stateless rpc | expand

Commit Message

Denton Liu May 13, 2020, 6:04 p.m. UTC
Currently, remote-curl acts as a proxy and blindly forwards packets
between an HTTP server and fetch-pack. In the case of a stateless RPC
connection where the connection is terminated with a partially written
packet, remote-curl will blindly send the partially written packet
before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
read the partial packet and continue reading, expecting more input. This
results in a deadlock between the two processes.

Instead of blindly forwarding packets, inspect each packet to ensure
that it is a full packet, erroring out if a partial packet is sent.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Denton Liu <liu.denton@gmail.com>
---

Notes:
    Unfortunately, I'm not really sure how to test this.

 remote-curl.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

Comments

Jeff King May 15, 2020, 9:38 p.m. UTC | #1
On Wed, May 13, 2020 at 02:04:57PM -0400, Denton Liu wrote:

> Currently, remote-curl acts as a proxy and blindly forwards packets
> between an HTTP server and fetch-pack. In the case of a stateless RPC
> connection where the connection is terminated with a partially written
> packet, remote-curl will blindly send the partially written packet
> before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
> read the partial packet and continue reading, expecting more input. This
> results in a deadlock between the two processes.
> 
> Instead of blindly forwarding packets, inspect each packet to ensure
> that it is a full packet, erroring out if a partial packet is sent.

Hmm. Right now there's no assumption in rpc_in that we're writing
pktlines. Will this always be the case?

I think the answer is probably yes. If there's a case where it isn't, it
might be something like v0 git-over-http against a server which doesn't
have the sideband capability.

> diff --git a/remote-curl.c b/remote-curl.c
> index da3e07184a..8b740354e5 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -682,6 +682,8 @@ static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp)
>  struct rpc_in_data {
>  	struct rpc_state *rpc;
>  	struct active_request_slot *slot;
> +	struct strbuf len_buf;
> +	int remaining;

This remaining is "remaining in the current packet", I assume? An "int"
should be OK for that.

Using a strbuf feels a bit like overkill, because we have to remember to
free it (which I think doesn't actually happen in your patch). Could we
just use a "char len_buf[4]" (you'd need an extra int to count how many
bytes you've received there).

> @@ -702,7 +705,42 @@ static size_t rpc_in(char *ptr, size_t eltsize,
>  		return size;
>  	if (size)
>  		data->rpc->any_written = 1;
> -	write_or_die(data->rpc->in, ptr, size);
> +
> +	while (unwritten) {
> +		if (!data->remaining) {
> +			int digits_remaining = 4 - data->len_buf.len;
> +			if (digits_remaining > unwritten)
> +				digits_remaining = unwritten;
> +			strbuf_add(&data->len_buf, ptr, digits_remaining);
> +			ptr += digits_remaining;
> +			unwritten -= digits_remaining;

So we actually save up the 4 bytes, not sending them at all until we get
them. I wonder if this might be easier to follow if our count was simply
advisory. I.e., continue to write data as we get it, but keep a small
state machine tracking pktlines (which could even be done in its own
separate struct/function pair).

I dunno. It might be about the same level of confusing, but it makes it
easy to keep the logic separate from rpc_in, and it lets us put all of
the policy bits at the end, after we know we've received all of the
data (in post_rpc):

  if (data->len_buf.len < 4)
          die("incomplete packet header");
  if (data->remaining)
          die("incomplete packet");
  /* imagine we kept the actual pktlen value, instead of just counting
   * down remaining */
  if (data->pktlen)
          die("did not end in flush");

Notably, I'm not sure if your code will complain if the connection dies
will reading the 4-byte header (remaining would still be 0). That won't
leave the caller trying to read the packet (we never sent them the
header), but they may still be waiting for a response.

> +			if (data->len_buf.len == 4) {
> +				data->remaining = packet_length(data->len_buf.buf);
> +				if (data->remaining < 0) {
> +					die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf);
> +				} else if (data->remaining <= 1) {
> +					data->remaining = 0;

This treats 0001 delimiters the same as a 0000 flush. Expecting 0 more
bytes is the right thing, but would we later want to differentiate in
post_rpc()? I.e., is it ever correct for the server data to end with a
delim?

> +				} else if (data->remaining < 4) {
> +					die(_("remote-curl: bad line length %d"), data->remaining);

We don't use an 0002 or 0003 packet for anything yet, but this would
need to be updated if we ever did. I wonder if we should treat them also
as zero-length packets and quietly pass through, which is likely the
right thing to do. OTOH, this should complain loudly enough that
somebody would presumably notice as soon as they tried to use those
packets.

Regarding testing, I think the ideal thing would be a proxy layer we
could insert that simply eats all data after N bytes. That would be easy
to do if we could use --upload-pack='git-upload-pack | head -c 500' or
something. But we need it to happen between curl and the server side of
Git. Possibly we could insert something between apache and
git-http-backend (simialr to how we handle broken-smart-http.sh.

-Peff
Denton Liu May 18, 2020, 9:08 a.m. UTC | #2
Hi Peff,

On Fri, May 15, 2020 at 05:38:44PM -0400, Jeff King wrote:
> On Wed, May 13, 2020 at 02:04:57PM -0400, Denton Liu wrote:
> 
> > Currently, remote-curl acts as a proxy and blindly forwards packets
> > between an HTTP server and fetch-pack. In the case of a stateless RPC
> > connection where the connection is terminated with a partially written
> > packet, remote-curl will blindly send the partially written packet
> > before waiting on more input from fetch-pack. Meanwhile, fetch-pack will
> > read the partial packet and continue reading, expecting more input. This
> > results in a deadlock between the two processes.
> > 
> > Instead of blindly forwarding packets, inspect each packet to ensure
> > that it is a full packet, erroring out if a partial packet is sent.
> 
> Hmm. Right now there's no assumption in rpc_in that we're writing
> pktlines. Will this always be the case?
> 
> I think the answer is probably yes. If there's a case where it isn't, it
> might be something like v0 git-over-http against a server which doesn't
> have the sideband capability.

As far as I can tell from skimming the code, v0 uses always pktlines
although I'm far from being an expert on Git's networking stuff. Perhaps
we could implement this such that the line-length checking only happens
for stateless_connect()?
Jeff King May 18, 2020, 3:49 p.m. UTC | #3
On Mon, May 18, 2020 at 05:08:57AM -0400, Denton Liu wrote:

> > Hmm. Right now there's no assumption in rpc_in that we're writing
> > pktlines. Will this always be the case?
> > 
> > I think the answer is probably yes. If there's a case where it isn't, it
> > might be something like v0 git-over-http against a server which doesn't
> > have the sideband capability.
> 
> As far as I can tell from skimming the code, v0 uses always pktlines
> although I'm far from being an expert on Git's networking stuff. Perhaps
> we could implement this such that the line-length checking only happens
> for stateless_connect()?

Yeah, that would certainly limit the possibility of unintended side
effects (and I don't think there's any benefit to this patch for the
non-stateless-connect cases).

We'd still be locking stateless-connect into always using pktlines, but
I think that's probably OK in practice. AFAICT it's the case now, and I
doubt we'd have any desire to change it in the future (and if we do,
this is all local-ish code that we could modify).

One unfortunate thing about the protocol (but not new to your patch) is
that this will be a problem for _any_ remote-helper which claims to
support stateless-connect. So they'd all need to carry similar code to
deal with this issue. Right now remote-curl is the only one, but
probably git-remote-ext and git-remote-fd could support this, too. Those
are pretty exotic, though (I don't think anyone has even bothered to
make them support v2 yet).

-Peff
diff mbox series

Patch

diff --git a/remote-curl.c b/remote-curl.c
index da3e07184a..8b740354e5 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -682,6 +682,8 @@  static curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp)
 struct rpc_in_data {
 	struct rpc_state *rpc;
 	struct active_request_slot *slot;
+	struct strbuf len_buf;
+	int remaining;
 };
 
 /*
@@ -692,6 +694,7 @@  static size_t rpc_in(char *ptr, size_t eltsize,
 		size_t nmemb, void *buffer_)
 {
 	size_t size = eltsize * nmemb;
+	size_t unwritten = size;
 	struct rpc_in_data *data = buffer_;
 	long response_code;
 
@@ -702,7 +705,42 @@  static size_t rpc_in(char *ptr, size_t eltsize,
 		return size;
 	if (size)
 		data->rpc->any_written = 1;
-	write_or_die(data->rpc->in, ptr, size);
+
+	while (unwritten) {
+		if (!data->remaining) {
+			int digits_remaining = 4 - data->len_buf.len;
+			if (digits_remaining > unwritten)
+				digits_remaining = unwritten;
+			strbuf_add(&data->len_buf, ptr, digits_remaining);
+			ptr += digits_remaining;
+			unwritten -= digits_remaining;
+
+			if (data->len_buf.len == 4) {
+				data->remaining = packet_length(data->len_buf.buf);
+				if (data->remaining < 0) {
+					die(_("remote-curl: bad line length character: %.4s"), data->len_buf.buf);
+				} else if (data->remaining <= 1) {
+					data->remaining = 0;
+				} else if (data->remaining < 4) {
+					die(_("remote-curl: bad line length %d"), data->remaining);
+				} else {
+					data->remaining -= 4;
+				}
+				write_or_die(data->rpc->in, data->len_buf.buf, 4);
+				strbuf_reset(&data->len_buf);
+			}
+		}
+
+		if (data->remaining) {
+			int remaining = data->remaining;
+			if (remaining > unwritten)
+				remaining = unwritten;
+			write_or_die(data->rpc->in, ptr, remaining);
+			ptr += remaining;
+			unwritten -= remaining;
+			data->remaining -= remaining;
+		}
+	}
 	return size;
 }
 
@@ -920,6 +958,8 @@  static int post_rpc(struct rpc_state *rpc, int flush_received)
 	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, rpc_in);
 	rpc_in_data.rpc = rpc;
 	rpc_in_data.slot = slot;
+	strbuf_init(&rpc_in_data.len_buf, 4);
+	rpc_in_data.remaining = 0;
 	curl_easy_setopt(slot->curl, CURLOPT_FILE, &rpc_in_data);
 	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);
 
@@ -936,6 +976,9 @@  static int post_rpc(struct rpc_state *rpc, int flush_received)
 	if (!rpc->any_written)
 		err = -1;
 
+	if (rpc_in_data.remaining)
+		err = error(_("%d bytes are still expected"), rpc_in_data.remaining);
+
 	curl_slist_free_all(headers);
 	free(gzip_body);
 	return err;