diff mbox series

[v10,2/3] http: read HTTP WWW-Authenticate response headers

Message ID 703ac15222fdcfc98751b11af725cc1395134bd1.1676586881.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Enhance credential helper protocol to include auth headers | expand

Commit Message

Matthew John Cheetham Feb. 16, 2023, 10:34 p.m. UTC
From: Matthew John Cheetham <mjcheetham@outlook.com>

Read and store the HTTP WWW-Authenticate response headers made for
a particular request.

This will allow us to pass important authentication challenge
information to credential helpers or others that would otherwise have
been lost.

libcurl only provides us with the ability to read all headers recieved
for a particular request, including any intermediate redirect requests
or proxies. The lines returned by libcurl include HTTP status lines
delinating any intermediate requests such as "HTTP/1.1 200". We use
these lines to reset the strvec of WWW-Authenticate header values as
we encounter them in order to only capture the final response headers.

The collection of all header values matching the WWW-Authenticate
header is complicated by the fact that it is legal for header fields to
be continued over multiple lines, but libcurl only gives us each
physical line a time, not each logical header. This line folding feature
is deprecated in RFC 7230 [1] but older servers may still emit them, so
we need to handle them.

In the future [2] we may be able to leverage functions to read headers
from libcurl itself, but as of today we must do this ourselves.

[1] https://www.rfc-editor.org/rfc/rfc7230#section-3.2
[2] https://daniel.haxx.se/blog/2022/03/22/a-headers-api-for-libcurl/

Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
---
 credential.c      |   1 +
 credential.h      |  16 +++++++
 git-compat-util.h |  23 ++++++++++
 http.c            | 111 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 151 insertions(+)

Comments

Jeff King Feb. 23, 2023, 9:46 a.m. UTC | #1
On Thu, Feb 16, 2023 at 10:34:40PM +0000, Matthew John Cheetham via GitGitGadget wrote:

> +/*
> + * Like skip_prefix_mem, but compare case-insensitively. Note that the
> + * comparison is done via tolower(), so it is strictly ASCII (no multi-byte
> + * characters or locale-specific conversions).
> + */
> +static inline int skip_iprefix_mem(const char *buf, size_t len,
> +				   const char *prefix,
> +				   const char **out, size_t *outlen)
> +{
> +	size_t prefix_len = strlen(prefix);
> +
> +	if (len < prefix_len)
> +		return 0;
> +
> +	if (!strncasecmp(buf, prefix, prefix_len)) {
> +		*out = buf + prefix_len;
> +		*outlen = len - prefix_len;
> +		return 1;
> +	}
> +
> +	return 0;
> +}

The comment at the top of the function seems out of date. It's using
strncasecmp(), so it probably would be locale-dependent. I think that's
probably OK, but we should probably fix the comment.

Alternatively, you could copy the tolower() loop from skip_iprefix().
Something like:
  
diff --git a/git-compat-util.h b/git-compat-util.h
index 28456241b6..f671a0ec3f 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1296,17 +1296,13 @@ static inline int skip_iprefix_mem(const char *buf, size_t len,
 				   const char *prefix,
 				   const char **out, size_t *outlen)
 {
-	size_t prefix_len = strlen(prefix);
-
-	if (len < prefix_len)
-		return 0;
-
-	if (!strncasecmp(buf, prefix, prefix_len)) {
-		*out = buf + prefix_len;
-		*outlen = len - prefix_len;
-		return 1;
-	}
-
+	do {
+		if (!*prefix) {
+			*out = buf;
+			*outlen = len;
+			return 1;
+		}
+	} while (len-- > 0 && tolower(*buf++) == tolower(*prefix++));
 	return 0;
 }
 

looks right to me, though only lightly tested (via t5563). I'm happy
with either implementation.

> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
> [...]
> +	/*
> +	 * If this is a HTTP status line and not a header field, this signals
> +	 * a different HTTP response. libcurl writes all the output of all
> +	 * response headers of all responses, including redirects.
> +	 * We only care about the last HTTP request response's headers so clear
> +	 * the existing array.
> +	 */
> +	if (!strncasecmp(ptr, "http/", 5))
> +		strvec_clear(values);

Since "ptr" isn't NUL terminated, using strncasecmp() may walk off the
end. I think you'd need to check that there are five bytes. You could
even use skip_iprefix_mem(), though of course we'd throw away the output
values. (For strings there is also istarts_with(), but I don't think we
have a "mem" equivalent).

The rest of the parsing looks good to me.

-Peff
Junio C Hamano Feb. 23, 2023, 7:49 p.m. UTC | #2
Jeff King <peff@peff.net> writes:

> Alternatively, you could copy the tolower() loop from skip_iprefix().
> Something like:
>   
> diff --git a/git-compat-util.h b/git-compat-util.h
> index 28456241b6..f671a0ec3f 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -1296,17 +1296,13 @@ static inline int skip_iprefix_mem(const char *buf, size_t len,
>  				   const char *prefix,
>  				   const char **out, size_t *outlen)
>  {
> -	size_t prefix_len = strlen(prefix);
> -
> -	if (len < prefix_len)
> -		return 0;
> -
> -	if (!strncasecmp(buf, prefix, prefix_len)) {
> -		*out = buf + prefix_len;
> -		*outlen = len - prefix_len;
> -		return 1;
> -	}
> -
> +	do {
> +		if (!*prefix) {
> +			*out = buf;
> +			*outlen = len;
> +			return 1;
> +		}
> +	} while (len-- > 0 && tolower(*buf++) == tolower(*prefix++));
>  	return 0;
>  }
>  
>
> looks right to me, though only lightly tested (via t5563). I'm happy
> with either implementation.

Yeah, the alternative version looks clearer to me.

>> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
>> [...]
>> +	/*
>> +	 * If this is a HTTP status line and not a header field, this signals
>> +	 * a different HTTP response. libcurl writes all the output of all
>> +	 * response headers of all responses, including redirects.
>> +	 * We only care about the last HTTP request response's headers so clear
>> +	 * the existing array.
>> +	 */
>> +	if (!strncasecmp(ptr, "http/", 5))
>> +		strvec_clear(values);
>
> Since "ptr" isn't NUL terminated, using strncasecmp() may walk off the
> end. I think you'd need to check that there are five bytes. You could
> even use skip_iprefix_mem(), though of course we'd throw away the output
> values. (For strings there is also istarts_with(), but I don't think we
> have a "mem" equivalent).

Yuck, thank you very much for carefully reading.  I missed this one
when I queued it.

> The rest of the parsing looks good to me.

Thanks.
Matthew John Cheetham Feb. 27, 2023, 5:14 p.m. UTC | #3
On 2023-02-23 11:49, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
>> Alternatively, you could copy the tolower() loop from skip_iprefix().
>> Something like:
>>   
>> diff --git a/git-compat-util.h b/git-compat-util.h
>> index 28456241b6..f671a0ec3f 100644
>> --- a/git-compat-util.h
>> +++ b/git-compat-util.h
>> @@ -1296,17 +1296,13 @@ static inline int skip_iprefix_mem(const char *buf, size_t len,
>>  				   const char *prefix,
>>  				   const char **out, size_t *outlen)
>>  {
>> -	size_t prefix_len = strlen(prefix);
>> -
>> -	if (len < prefix_len)
>> -		return 0;
>> -
>> -	if (!strncasecmp(buf, prefix, prefix_len)) {
>> -		*out = buf + prefix_len;
>> -		*outlen = len - prefix_len;
>> -		return 1;
>> -	}
>> -
>> +	do {
>> +		if (!*prefix) {
>> +			*out = buf;
>> +			*outlen = len;
>> +			return 1;
>> +		}
>> +	} while (len-- > 0 && tolower(*buf++) == tolower(*prefix++));
>>  	return 0;
>>  }
>>  
>>
>> looks right to me, though only lightly tested (via t5563). I'm happy
>> with either implementation.
> 
> Yeah, the alternative version looks clearer to me.

Will update - thanks!

>>> +static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
>>> [...]
>>> +	/*
>>> +	 * If this is a HTTP status line and not a header field, this signals
>>> +	 * a different HTTP response. libcurl writes all the output of all
>>> +	 * response headers of all responses, including redirects.
>>> +	 * We only care about the last HTTP request response's headers so clear
>>> +	 * the existing array.
>>> +	 */
>>> +	if (!strncasecmp(ptr, "http/", 5))
>>> +		strvec_clear(values);
>>
>> Since "ptr" isn't NUL terminated, using strncasecmp() may walk off the
>> end. I think you'd need to check that there are five bytes. You could
>> even use skip_iprefix_mem(), though of course we'd throw away the output
>> values. (For strings there is also istarts_with(), but I don't think we
>> have a "mem" equivalent).
> 
> Yuck, thank you very much for carefully reading.  I missed this one
> when I queued it.

Oops! Will update in a v11

>> The rest of the parsing looks good to me.
> 
> Thanks.
diff mbox series

Patch

diff --git a/credential.c b/credential.c
index f6389a50684..897b4679333 100644
--- a/credential.c
+++ b/credential.c
@@ -22,6 +22,7 @@  void credential_clear(struct credential *c)
 	free(c->username);
 	free(c->password);
 	string_list_clear(&c->helpers, 0);
+	strvec_clear(&c->wwwauth_headers);
 
 	credential_init(c);
 }
diff --git a/credential.h b/credential.h
index f430e77fea4..3756a54c74d 100644
--- a/credential.h
+++ b/credential.h
@@ -2,6 +2,7 @@ 
 #define CREDENTIAL_H
 
 #include "string-list.h"
+#include "strvec.h"
 
 /**
  * The credentials API provides an abstracted way of gathering username and
@@ -115,6 +116,20 @@  struct credential {
 	 */
 	struct string_list helpers;
 
+	/**
+	 * A `strvec` of WWW-Authenticate header values. Each string
+	 * is the value of a WWW-Authenticate header in an HTTP response,
+	 * in the order they were received in the response.
+	 */
+	struct strvec wwwauth_headers;
+
+	/**
+	 * Internal use only. Keeps track of if we previously matched against a
+	 * WWW-Authenticate header line in order to re-fold future continuation
+	 * lines into one value.
+	 */
+	unsigned header_is_last_match:1;
+
 	unsigned approved:1,
 		 configured:1,
 		 quit:1,
@@ -130,6 +145,7 @@  struct credential {
 
 #define CREDENTIAL_INIT { \
 	.helpers = STRING_LIST_INIT_DUP, \
+	.wwwauth_headers = STRVEC_INIT, \
 }
 
 /* Initialize a credential structure, setting all fields to empty. */
diff --git a/git-compat-util.h b/git-compat-util.h
index a76d0526f79..a59230564e8 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -1266,6 +1266,29 @@  static inline int skip_iprefix(const char *str, const char *prefix,
 	return 0;
 }
 
+/*
+ * Like skip_prefix_mem, but compare case-insensitively. Note that the
+ * comparison is done via tolower(), so it is strictly ASCII (no multi-byte
+ * characters or locale-specific conversions).
+ */
+static inline int skip_iprefix_mem(const char *buf, size_t len,
+				   const char *prefix,
+				   const char **out, size_t *outlen)
+{
+	size_t prefix_len = strlen(prefix);
+
+	if (len < prefix_len)
+		return 0;
+
+	if (!strncasecmp(buf, prefix, prefix_len)) {
+		*out = buf + prefix_len;
+		*outlen = len - prefix_len;
+		return 1;
+	}
+
+	return 0;
+}
+
 static inline int strtoul_ui(char const *s, int base, unsigned int *result)
 {
 	unsigned long ul;
diff --git a/http.c b/http.c
index 8a5ba3f4776..3ff570ee3a9 100644
--- a/http.c
+++ b/http.c
@@ -183,6 +183,115 @@  size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
 	return nmemb;
 }
 
+/*
+ * A folded header continuation line starts with any number of spaces or
+ * horizontal tab characters (SP or HTAB) as per RFC 7230 section 3.2.
+ * It is not a continuation line if the line starts with any other character.
+ */
+static inline int is_hdr_continuation(const char *ptr, const size_t size)
+{
+	return size && (*ptr == ' ' || *ptr == '\t');
+}
+
+static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p)
+{
+	size_t size = eltsize * nmemb;
+	struct strvec *values = &http_auth.wwwauth_headers;
+	struct strbuf buf = STRBUF_INIT;
+	const char *val;
+	size_t val_len;
+
+	/*
+	 * Header lines may not come NULL-terminated from libcurl so we must
+	 * limit all scans to the maximum length of the header line, or leverage
+	 * strbufs for all operations.
+	 *
+	 * In addition, it is possible that header values can be split over
+	 * multiple lines as per RFC 7230. 'Line folding' has been deprecated
+	 * but older servers may still emit them. A continuation header field
+	 * value is identified as starting with a space or horizontal tab.
+	 *
+	 * The formal definition of a header field as given in RFC 7230 is:
+	 *
+	 * header-field   = field-name ":" OWS field-value OWS
+	 *
+	 * field-name     = token
+	 * field-value    = *( field-content / obs-fold )
+	 * field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
+	 * field-vchar    = VCHAR / obs-text
+	 *
+	 * obs-fold       = CRLF 1*( SP / HTAB )
+	 *                ; obsolete line folding
+	 *                ; see Section 3.2.4
+	 */
+
+	/* Start of a new WWW-Authenticate header */
+	if (skip_iprefix_mem(ptr, size, "www-authenticate:", &val, &val_len)) {
+		strbuf_add(&buf, val, val_len);
+
+		/*
+		 * Strip the CRLF that should be present at the end of each
+		 * field as well as any trailing or leading whitespace from the
+		 * value.
+		 */
+		strbuf_trim(&buf);
+
+		strvec_push(values, buf.buf);
+		http_auth.header_is_last_match = 1;
+		goto exit;
+	}
+
+	/*
+	 * This line could be a continuation of the previously matched header
+	 * field. If this is the case then we should append this value to the
+	 * end of the previously consumed value.
+	 */
+	if (http_auth.header_is_last_match && is_hdr_continuation(ptr, size)) {
+		/*
+		 * Trim the CRLF and any leading or trailing from this line.
+		 */
+		strbuf_add(&buf, ptr, size);
+		strbuf_trim(&buf);
+
+		/*
+		 * At this point we should always have at least one existing
+		 * value, even if it is empty. Do not bother appending the new
+		 * value if this continuation header is itself empty.
+		 */
+		if (!values->nr) {
+			BUG("should have at least one existing header value");
+		} else if (buf.len) {
+			char *prev = xstrdup(values->v[values->nr - 1]);
+
+			/* Join two non-empty values with a single space. */
+			const char *const sp = *prev ? " " : "";
+
+			strvec_pop(values);
+			strvec_pushf(values, "%s%s%s", prev, sp, buf.buf);
+			free(prev);
+		}
+
+		goto exit;
+	}
+
+	/* Not a continuation of a previously matched auth header line. */
+	http_auth.header_is_last_match = 0;
+
+	/*
+	 * If this is a HTTP status line and not a header field, this signals
+	 * a different HTTP response. libcurl writes all the output of all
+	 * response headers of all responses, including redirects.
+	 * We only care about the last HTTP request response's headers so clear
+	 * the existing array.
+	 */
+	if (!strncasecmp(ptr, "http/", 5))
+		strvec_clear(values);
+
+exit:
+	strbuf_release(&buf);
+	return size;
+}
+
 size_t fwrite_null(char *ptr, size_t eltsize, size_t nmemb, void *strbuf)
 {
 	return nmemb;
@@ -1864,6 +1973,8 @@  static int http_request(const char *url,
 					 fwrite_buffer);
 	}
 
+	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);
+
 	accept_language = http_get_accept_language_header();
 
 	if (accept_language)