Message ID | 20190404232704.GE21839@sigill.intra.peff.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | a rabbit hole of update-server-info fixes | expand |
Am 05.04.2019 um 01:27 schrieb Jeff King: > We can use skip_prefix() and parse_oid_hex() to continuously increment > our pointer, rather than dealing with magic numbers. This also fixes a > few small shortcomings: > > - if we see a 'P' line that does not match our expectations, we'll > leave our "i" counter in the middle of the line. So we'll parse: > "P P P pack-1234.pack" as if there was just one "P" which was not > intentional (though probably not that harmful). How so? The default case, which we'd fall through to, skips the rest of such a line, doesn't it? > > - if we see a line with the right prefix, suffix, and length, i.e. > matching /P pack-.{40}.pack\n/, we'll interpret the middle part as > hex without checking if it could be parsed. This could lead to us > looking at uninitialized garbage in the hash array. In practice this > means we'll just make a garbage request to the server which will > fail, though it's interesting that a malicious server could convince > us to leak 40 bytes of uninitialized stack to them. > > - the current code is picky about seeing a newline at the end of file, > but we can easily be more liberal > > Signed-off-by: Jeff King <peff@peff.net> > --- > http.c | 35 ++++++++++++++--------------------- > 1 file changed, 14 insertions(+), 21 deletions(-) > > diff --git a/http.c b/http.c > index a32ad36ddf..2ef47bc779 100644 > --- a/http.c > +++ b/http.c > @@ -2147,11 +2147,11 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head, > int http_get_info_packs(const char *base_url, struct packed_git **packs_head) > { > struct http_get_options options = {0}; > - int ret = 0, i = 0; > - char *url, *data; > + int ret = 0; > + char *url; > + const char *data; > struct strbuf buf = STRBUF_INIT; > - unsigned char hash[GIT_MAX_RAWSZ]; > - const unsigned hexsz = the_hash_algo->hexsz; > + struct object_id oid; > > end_url_with_slash(&buf, base_url); > strbuf_addstr(&buf, "objects/info/packs"); > @@ -2163,24 +2163,17 @@ int http_get_info_packs(const char *base_url, struct packed_git **packs_head) > goto cleanup; > > data = buf.buf; > - while (i < buf.len) { > - switch (data[i]) { > - case 'P': > - i++; > - if (i + hexsz + 12 <= buf.len && > - starts_with(data + i, " pack-") && > - starts_with(data + i + hexsz + 6, ".pack\n")) { > - get_sha1_hex(data + i + 6, hash); > - fetch_and_setup_pack_index(packs_head, hash, > - base_url); > - i += hexsz + 11; > - break; > - } > - default: > - while (i < buf.len && data[i] != '\n') > - i++; > + while (*data) { > + if (skip_prefix(data, "P pack-", &data) && > + !parse_oid_hex(data, &oid, &data) && > + skip_prefix(data, ".pack", &data) && > + (*data == '\n' || *data == '\0')) { > + fetch_and_setup_pack_index(packs_head, oid.hash, base_url); > + } else { > + data = strchrnul(data, '\n'); > } > - i++; > + if (*data) > + data++; /* skip past newline */ So much simpler, *and* converted to object_id -- I like it! Parsing "P" and "pack-" together crosses logical token boundaries, but that I don't mind it here. René
On Fri, Apr 05, 2019 at 12:41:27PM +0200, René Scharfe wrote: > Am 05.04.2019 um 01:27 schrieb Jeff King: > > We can use skip_prefix() and parse_oid_hex() to continuously increment > > our pointer, rather than dealing with magic numbers. This also fixes a > > few small shortcomings: > > > > - if we see a 'P' line that does not match our expectations, we'll > > leave our "i" counter in the middle of the line. So we'll parse: > > "P P P pack-1234.pack" as if there was just one "P" which was not > > intentional (though probably not that harmful). > > How so? The default case, which we'd fall through to, skips the rest > of such a line, doesn't it? Oh, you're right. I didn't notice the fall-through, which is quite subtle (I'm actually surprised -Wimplicit-fallthrough doesn't complain about this). I'll fix up the commit message (the cleanup is still very much worth it for the garbage-oid issue, IMHO). > > + while (*data) { > > + if (skip_prefix(data, "P pack-", &data) && > > + !parse_oid_hex(data, &oid, &data) && > > + skip_prefix(data, ".pack", &data) && > > + (*data == '\n' || *data == '\0')) { > > + fetch_and_setup_pack_index(packs_head, oid.hash, base_url); > > + } else { > > + data = strchrnul(data, '\n'); > > } > > - i++; > > + if (*data) > > + data++; /* skip past newline */ > > So much simpler, *and* converted to object_id -- I like it! > > Parsing "P" and "pack-" together crosses logical token boundaries, > but that I don't mind it here. Yeah, I was tempted to write: if (skip_prefix(data, "P ", &data) && skip_prefix(data, "pack-", &data) && ... but that felt a little silly. I dunno. I guess it is probably not any less efficient, because we'd expect skip_prefix() and its loop to get inlined here anyway. -Peff
Am 05.04.2019 um 20:11 schrieb Jeff King: > On Fri, Apr 05, 2019 at 12:41:27PM +0200, René Scharfe wrote: >> Parsing "P" and "pack-" together crosses logical token boundaries, >> but that I don't mind it here. > > Yeah, I was tempted to write: > > if (skip_prefix(data, "P ", &data) && > skip_prefix(data, "pack-", &data) && > ... > > but that felt a little silly. I dunno. I guess it is probably not any > less efficient, because we'd expect skip_prefix() and its loop to get > inlined here anyway. Didn't think of inlining. Clang unrolls the whole comparison (except on powerpc64), but the other compilers available at the Compiler Explorer website keep the consecutive calls separate: https://godbolt.org/z/7eTarV René
diff --git a/http.c b/http.c index a32ad36ddf..2ef47bc779 100644 --- a/http.c +++ b/http.c @@ -2147,11 +2147,11 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head, int http_get_info_packs(const char *base_url, struct packed_git **packs_head) { struct http_get_options options = {0}; - int ret = 0, i = 0; - char *url, *data; + int ret = 0; + char *url; + const char *data; struct strbuf buf = STRBUF_INIT; - unsigned char hash[GIT_MAX_RAWSZ]; - const unsigned hexsz = the_hash_algo->hexsz; + struct object_id oid; end_url_with_slash(&buf, base_url); strbuf_addstr(&buf, "objects/info/packs"); @@ -2163,24 +2163,17 @@ int http_get_info_packs(const char *base_url, struct packed_git **packs_head) goto cleanup; data = buf.buf; - while (i < buf.len) { - switch (data[i]) { - case 'P': - i++; - if (i + hexsz + 12 <= buf.len && - starts_with(data + i, " pack-") && - starts_with(data + i + hexsz + 6, ".pack\n")) { - get_sha1_hex(data + i + 6, hash); - fetch_and_setup_pack_index(packs_head, hash, - base_url); - i += hexsz + 11; - break; - } - default: - while (i < buf.len && data[i] != '\n') - i++; + while (*data) { + if (skip_prefix(data, "P pack-", &data) && + !parse_oid_hex(data, &oid, &data) && + skip_prefix(data, ".pack", &data) && + (*data == '\n' || *data == '\0')) { + fetch_and_setup_pack_index(packs_head, oid.hash, base_url); + } else { + data = strchrnul(data, '\n'); } - i++; + if (*data) + data++; /* skip past newline */ } cleanup:
We can use skip_prefix() and parse_oid_hex() to continuously increment our pointer, rather than dealing with magic numbers. This also fixes a few small shortcomings: - if we see a 'P' line that does not match our expectations, we'll leave our "i" counter in the middle of the line. So we'll parse: "P P P pack-1234.pack" as if there was just one "P" which was not intentional (though probably not that harmful). - if we see a line with the right prefix, suffix, and length, i.e. matching /P pack-.{40}.pack\n/, we'll interpret the middle part as hex without checking if it could be parsed. This could lead to us looking at uninitialized garbage in the hash array. In practice this means we'll just make a garbage request to the server which will fail, though it's interesting that a malicious server could convince us to leak 40 bytes of uninitialized stack to them. - the current code is picky about seeing a newline at the end of file, but we can easily be more liberal Signed-off-by: Jeff King <peff@peff.net> --- http.c | 35 ++++++++++++++--------------------- 1 file changed, 14 insertions(+), 21 deletions(-)