http-backend: Treat empty CONTENT_LENGTH as zero
diff mbox series

Message ID 20180910205359.32332-1-max@max630.net
State New
Headers show
Series
  • http-backend: Treat empty CONTENT_LENGTH as zero
Related show

Commit Message

Max Kirillov Sept. 10, 2018, 8:53 p.m. UTC
From: Jeff King <peff@peff.net>
Subject: [PATCH] http-backend: Treat empty CONTENT_LENGTH as zero

There is no known case where empty body it used by a server as
instruction to read until EOF, so there is no need to violate the RFC.
Make get_content_length() return 0 in this case.

Currently there is no practical difference, as the GET request
where it can be empty is handled without actual reading the body
(in get_info_refs() function), but it is better to stick to the correct
behavior.

Signed-off-by: Max Kirillov <max@max630.net>
---
The incremental. Hopefully I described the reason right. Needs "signed-off-by"
 http-backend.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

Comments

Jonathan Nieder Sept. 10, 2018, 9:22 p.m. UTC | #1
Hi,

Max Kirillov wrote:

> From: Jeff King <peff@peff.net>
> Subject: [PATCH] http-backend: Treat empty CONTENT_LENGTH as zero

micronit: s/Treat/treat/

> There is no known case where empty body it used by a server as
> instruction to read until EOF, so there is no need to violate the RFC.
> Make get_content_length() return 0 in this case.
>
> Currently there is no practical difference, as the GET request
> where it can be empty is handled without actual reading the body
> (in get_info_refs() function), but it is better to stick to the correct
> behavior.
>
> Signed-off-by: Max Kirillov <max@max630.net>
> ---
> The incremental. Hopefully I described the reason right. Needs "signed-off-by"

Thanks.  I am wondering if we should go all the way and do

	ssize_t val;
	const char *str = getenv("CONTENT_LENGTH");

	if (!str || !*str)
		return 0;
	if (!git_parse_ssize_t(str, &val))
		die(...);
	return val;

That would match the RFC, but it seems to make t5510-fetch.sh hang,
right after

  ok 165 - --negotiation-tip understands abbreviated SHA-1

When I run with -v -i -x, it stalls at

  ++ git -C '/usr/local/google/home/jrn/src/git/t/trash directory.t5510-fetch/httpd/www/server' tag -d alpha_1 alpha_2 beta_1 beta_2
  Deleted tag 'alpha_1' (was a84e4a9)
  Deleted tag 'alpha_2' (was 7dd5cf4)
  Deleted tag 'beta_1' (was bcb5c65)
  Deleted tag 'beta_2' (was d3b6dcd)
  +++ pwd
  ++ GIT_TRACE_PACKET='/usr/local/google/home/jrn/src/git/t/trash directory.t5510-fetch/trace'
  ++ git -C client fetch --negotiation-tip=alpha_1 --negotiation-tip=beta_1 origin alpha_s beta_s

Do you know why?

Thanks,
Jonathan
Jeff King Sept. 11, 2018, 1:55 a.m. UTC | #2
On Mon, Sep 10, 2018 at 02:22:21PM -0700, Jonathan Nieder wrote:

> Thanks.  I am wondering if we should go all the way and do
> 
> 	ssize_t val;
> 	const char *str = getenv("CONTENT_LENGTH");
> 
> 	if (!str || !*str)
> 		return 0;
> 	if (!git_parse_ssize_t(str, &val))
> 		die(...);
> 	return val;
> 
> That would match the RFC, but it seems to make t5510-fetch.sh hang,
> right after
> 
>   ok 165 - --negotiation-tip understands abbreviated SHA-1
> 
> When I run with -v -i -x, it stalls at
> 
>   ++ git -C '/usr/local/google/home/jrn/src/git/t/trash directory.t5510-fetch/httpd/www/server' tag -d alpha_1 alpha_2 beta_1 beta_2
>   Deleted tag 'alpha_1' (was a84e4a9)
>   Deleted tag 'alpha_2' (was 7dd5cf4)
>   Deleted tag 'beta_1' (was bcb5c65)
>   Deleted tag 'beta_2' (was d3b6dcd)
>   +++ pwd
>   ++ GIT_TRACE_PACKET='/usr/local/google/home/jrn/src/git/t/trash directory.t5510-fetch/trace'
>   ++ git -C client fetch --negotiation-tip=alpha_1 --negotiation-tip=beta_1 origin alpha_s beta_s
> 
> Do you know why?

Yes. :)

It's due to this comment in the patch you are replying to:

+       if (!str) {
+               /*
+                * RFC3875 says this must mean "no body", but in practice we
+                * receive chunked encodings with no CONTENT_LENGTH. Tell the
+                * caller to read until EOF.
+                */
+               val = -1;

-Peff
Jeff King Sept. 11, 2018, 1:58 a.m. UTC | #3
On Mon, Sep 10, 2018 at 11:53:59PM +0300, Max Kirillov wrote:

> From: Jeff King <peff@peff.net>
> Subject: [PATCH] http-backend: Treat empty CONTENT_LENGTH as zero
> 
> There is no known case where empty body it used by a server as
> instruction to read until EOF, so there is no need to violate the RFC.
> Make get_content_length() return 0 in this case.
> 
> Currently there is no practical difference, as the GET request
> where it can be empty is handled without actual reading the body
> (in get_info_refs() function), but it is better to stick to the correct
> behavior.

There could be a difference if there is a server which actually sets
CONTENT_LENGTH to the empty string for a chunked body. But we don't know
of any such server at this point.

> Signed-off-by: Max Kirillov <max@max630.net>
> ---
> The incremental. Hopefully I described the reason right. Needs "signed-off-by"

Certainly this is:

  Signed-off-by: Jeff King <peff@peff.net>

-Peff
Jonathan Nieder Sept. 11, 2018, 2:20 a.m. UTC | #4
Jeff King wrote:
> On Mon, Sep 10, 2018 at 02:22:21PM -0700, Jonathan Nieder wrote:

>> Thanks.  I am wondering if we should go all the way and do
>>
>> 	ssize_t val;
>> 	const char *str = getenv("CONTENT_LENGTH");
>>
>> 	if (!str || !*str)
>> 		return 0;
>> 	if (!git_parse_ssize_t(str, &val))
>> 		die(...);
>> 	return val;
>>
>> That would match the RFC, but it seems to make t5510-fetch.sh hang,
[...]
>> Do you know why?
>
> Yes. :)
>
> It's due to this comment in the patch you are replying to:
>
> +       if (!str) {
> +               /*
> +                * RFC3875 says this must mean "no body", but in practice we
> +                * receive chunked encodings with no CONTENT_LENGTH. Tell the
> +                * caller to read until EOF.
> +                */
> +               val = -1;

Ah!  So "in practice" includes "in Apache".  An old discussion[1] on
Apache's httpd-users list agrees.

The question then becomes: what does IIS do for zero-length requests?
Does any other web server fail to support "read until EOF" in general?

The CGI standard does not cover chunked encoding so we can't lean on
the standard for advice.  It's not clear to me yet whether this patch
improves on what's in "master".

Thanks,
Jonathan

[1] http://mail-archives.apache.org/mod_mbox/httpd-users/200909.mbox/%3C4AAACC38.3070200@rowe-clan.net%3E
Jeff King Sept. 11, 2018, 2:30 a.m. UTC | #5
On Mon, Sep 10, 2018 at 07:20:28PM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> > On Mon, Sep 10, 2018 at 02:22:21PM -0700, Jonathan Nieder wrote:
> 
> >> Thanks.  I am wondering if we should go all the way and do
> >>
> >> 	ssize_t val;
> >> 	const char *str = getenv("CONTENT_LENGTH");
> >>
> >> 	if (!str || !*str)
> >> 		return 0;
> >> 	if (!git_parse_ssize_t(str, &val))
> >> 		die(...);
> >> 	return val;
> >>
> >> That would match the RFC, but it seems to make t5510-fetch.sh hang,
> [...]
> >> Do you know why?
> >
> > Yes. :)
> >
> > It's due to this comment in the patch you are replying to:
> >
> > +       if (!str) {
> > +               /*
> > +                * RFC3875 says this must mean "no body", but in practice we
> > +                * receive chunked encodings with no CONTENT_LENGTH. Tell the
> > +                * caller to read until EOF.
> > +                */
> > +               val = -1;
> 
> Ah!  So "in practice" includes "in Apache".  An old discussion[1] on
> Apache's httpd-users list agrees.
> 
> The question then becomes: what does IIS do for zero-length requests?
> Does any other web server fail to support "read until EOF" in general?
> 
> The CGI standard does not cover chunked encoding so we can't lean on
> the standard for advice.  It's not clear to me yet whether this patch
> improves on what's in "master".

I'd note that the case in question (no CONTENT_LENGTH at all) is not
changed between this patch and master. It's only the case of
CONTENT_LENGTH set to an empty string. But I agree that it is not clear
to me whether it is actually improving anything in practice.

-Peff

Patch
diff mbox series

diff --git a/http-backend.c b/http-backend.c
index 458642ef72..ea36a52118 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -353,8 +353,28 @@  static ssize_t get_content_length(void)
 	ssize_t val = -1;
 	const char *str = getenv("CONTENT_LENGTH");
 
-	if (str && *str && !git_parse_ssize_t(str, &val))
-		die("failed to parse CONTENT_LENGTH: %s", str);
+	if (!str) {
+		/*
+		 * RFC3875 says this must mean "no body", but in practice we
+		 * receive chunked encodings with no CONTENT_LENGTH. Tell the
+		 * caller to read until EOF.
+		 */
+		val = -1;
+	} else if (!*str) {
+		/*
+		 * An empty length should be treated as "no body" according to
+		 * RFC3875, and this seems to hold in practice.
+		 */
+		val = 0;
+	} else {
+		/*
+		 * We have a non-empty CONTENT_LENGTH; trust what's in it as long
+		 * as it can be parsed.
+		 */
+		if (!git_parse_ssize_t(str, &val))
+			die("failed to parse CONTENT_LENGTH: '%s'", str);
+	}
+
 	return val;
 }