Message ID | 20190603204526.7723-3-matvore@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Harden url.c URL-decoding logic | expand |
On 2019-06-03 at 20:45:26, Matthew DeVore wrote: > There is no reason to allow %00 to terminate a string, so do not allow it. > Otherwise, we end up returning arbitrary content in the string (that which is > after the %00) which is effectively hidden from callers and can escape sanity > checks and validation, and possible be used in tandem with a security > vulnerability to introduce a payload. So I think the reason you've stated is good and I agree that we shouldn't decode data we're not going to use. However, I'm also interested in the cases in which we decode data and don't want to allow NULs, because we should, in general, allow bizarre URLs as long as they're URL-encoded. It looks like several of the places we do this are in the credential manager code, and I think I can agree that usernames and passwords should not contain NUL characters (for Basic auth, RFC 7617 prohibits it). It also seems that the credential code decodes the path parameter before passing it on, which is unfortunate, but can't be changed for backward compatibility reasons. And then the other instances are a file: URL in remote-testsvn.c and query parameters that have no reason to contain NULs in http-backend.c. So I think overall this is fine, although we probably want to change the commit summary to say "NUL" instead of "NULL".
Am 03.06.19 um 22:45 schrieb Matthew DeVore: > There is no reason to allow %00 to terminate a string, so do not allow it. > Otherwise, we end up returning arbitrary content in the string (that which is > after the %00) which is effectively hidden from callers and can escape sanity > checks and validation, and possible be used in tandem with a security > vulnerability to introduce a payload. It's a bit hard to see with the (extended, but still) limited context, but url_decode_internal() effectively returns a NUL-terminated string, even though it does use a strbuf parameter named "out" for temporary storage. So callers really have no use for decoded NULs, and this change thus makes sense to me. > > Signed-off-by: Matthew DeVore <matvore@google.com> > --- > url.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/url.c b/url.c > index c0bb4e23c3..cf791cb139 100644 > --- a/url.c > +++ b/url.c > @@ -41,21 +41,21 @@ static char *url_decode_internal(const char **query, int len, > if (!c) > break; > if (stop_at && strchr(stop_at, c)) { > q++; > len--; > break; > } > > if (c == '%' && len >= 3) { > int val = hex2chr(q + 1); > - if (0 <= val) { > + if (0 < val) { > strbuf_addch(out, val); > q += 3; > len -= 3; > continue; > } > } > > if (decode_plus && c == '+') > strbuf_addch(out, ' '); > else >
On Tue, Jun 04, 2019 at 07:01:01AM +0200, René Scharfe wrote: > It's a bit hard to see with the (extended, but still) limited context, > but url_decode_internal() effectively returns a NUL-terminated string, > even though it does use a strbuf parameter named "out" for temporary > storage. So callers really have no use for decoded NULs, and this > change thus makes sense to me. > That was more or less my train of thought as well. Thank you for taking a look.
On Tue, Jun 04, 2019 at 01:02:43AM +0000, brian m. carlson wrote: > It looks like several of the places we do this are in the credential > manager code, and I think I can agree that usernames and passwords > should not contain NUL characters (for Basic auth, RFC 7617 prohibits > it). It also seems that the credential code decodes the path parameter > before passing it on, which is unfortunate, but can't be changed for > backward compatibility reasons. > > And then the other instances are a file: URL in remote-testsvn.c and > query parameters that have no reason to contain NULs in http-backend.c. OK. Good to know that there is no justification to support %00 in URLs. > So I think overall this is fine, although we probably want to change the > commit summary to say "NUL" instead of "NULL". Applied for the next roll-up. Thank you for taking a look.
diff --git a/url.c b/url.c index c0bb4e23c3..cf791cb139 100644 --- a/url.c +++ b/url.c @@ -41,21 +41,21 @@ static char *url_decode_internal(const char **query, int len, if (!c) break; if (stop_at && strchr(stop_at, c)) { q++; len--; break; } if (c == '%' && len >= 3) { int val = hex2chr(q + 1); - if (0 <= val) { + if (0 < val) { strbuf_addch(out, val); q += 3; len -= 3; continue; } } if (decode_plus && c == '+') strbuf_addch(out, ' '); else
There is no reason to allow %00 to terminate a string, so do not allow it. Otherwise, we end up returning arbitrary content in the string (that which is after the %00) which is effectively hidden from callers and can escape sanity checks and validation, and possible be used in tandem with a security vulnerability to introduce a payload. Signed-off-by: Matthew DeVore <matvore@google.com> --- url.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)