Message ID | 89d58db7-6a01-b3fa-54f0-19d5a3819eb3@web.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | pkt-line: don't check string length in packet_length() | expand |
René Scharfe <l.s.r@web.de> writes: > hex2chr() takes care not to run over the end of a short string. > 101736a14c (pkt-line: extern packet_length(), 2020-05-19) turned the > input parameter of packet_length() from a string pointer into an array > of known length, making string length checks unnecessary. Get rid of > them by using hexval() directly. I am puzzled about the part of the above description on "making string length checks unnecessary". The two callers we currently have both do pass char[4], but the compiler would not stop us from passing a pointer to a memory region of an unknown size; if we butcher one of the current callers diff --git c/pkt-line.c w/pkt-line.c index 6e022029ca..e1c49baefd 100644 --- c/pkt-line.c +++ w/pkt-line.c @@ -421,7 +421,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer, return PACKET_READ_EOF; } - len = packet_length(linelen); + len = packet_length(buffer); if (len < 0) { if (options & PACKET_READ_GENTLE_ON_READ_ERROR) where "buffer" is just a random piece of memory passed to the caller and there is no such guarantee like "it at least is 4 bytes long", we would just slurp garbage and run past the end of the buffer. > The resulting branchless code is simpler and it becomes easier to see > that the function mirrors set_packet_header(). I do like the resulting code, but I feel a bit uneasy to sell this change as "the code becomes more streamlined without losing safety". It looks more like "this change is safe for our two callers; those adding more callers in the future are better be very careful", no?
Am 05.07.23 um 07:56 schrieb Junio C Hamano: > René Scharfe <l.s.r@web.de> writes: > >> hex2chr() takes care not to run over the end of a short string. >> 101736a14c (pkt-line: extern packet_length(), 2020-05-19) turned the >> input parameter of packet_length() from a string pointer into an array >> of known length, making string length checks unnecessary. Get rid of >> them by using hexval() directly. > > I am puzzled about the part of the above description on "making > string length checks unnecessary". The two callers we currently > have both do pass char[4], but the compiler would not stop us from > passing a pointer to a memory region of an unknown size; if we > butcher one of the current callers > > diff --git c/pkt-line.c w/pkt-line.c > index 6e022029ca..e1c49baefd 100644 > --- c/pkt-line.c > +++ w/pkt-line.c > @@ -421,7 +421,7 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer, > return PACKET_READ_EOF; > } > > - len = packet_length(linelen); > + len = packet_length(buffer); > > if (len < 0) { > if (options & PACKET_READ_GENTLE_ON_READ_ERROR) > > where "buffer" is just a random piece of memory passed to the caller > and there is no such guarantee like "it at least is 4 bytes long", > we would just slurp garbage and run past the end of the buffer. Indeed, GCC warns if you give it a short array, but not if you pass a pointer: https://godbolt.org/z/T3xfE5W9n. Clang doesn't care at all. So that was wishful thinking on my part. Sorry. :-/ >> The resulting branchless code is simpler and it becomes easier to see >> that the function mirrors set_packet_header(). > > I do like the resulting code, but I feel a bit uneasy to sell this > change as "the code becomes more streamlined without losing safety". > It looks more like "this change is safe for our two callers; those > adding more callers in the future are better be very careful", no? With no way to enforce passing an array of a certain size to a function the only safe options I see are keeping the length check, using a macro or inlining the calculation. Hmm. René
René Scharfe <l.s.r@web.de> writes: >> I do like the resulting code, but I feel a bit uneasy to sell this >> change as "the code becomes more streamlined without losing safety". >> It looks more like "this change is safe for our two callers; those >> adding more callers in the future are better be very careful", no? > > With no way to enforce passing an array of a certain size to a function > the only safe options I see are keeping the length check, using a macro > or inlining the calculation. Hmm. We keep repeating "length check" because that is what the comment in the function says, but even if the caller has 4-byte, that 4-byte substring at the beginning is what it read from the untrusted side over the network. We should be checking if we have 4 hexadecimal length even if we are not running beyond the end of the buffer, no? So it may be that the comment needs to be fixed more than the code. Thanks.
Am 05.07.23 um 22:33 schrieb Junio C Hamano: > René Scharfe <l.s.r@web.de> writes: > >>> I do like the resulting code, but I feel a bit uneasy to sell this >>> change as "the code becomes more streamlined without losing safety". >>> It looks more like "this change is safe for our two callers; those >>> adding more callers in the future are better be very careful", no? >> >> With no way to enforce passing an array of a certain size to a function >> the only safe options I see are keeping the length check, using a macro >> or inlining the calculation. Hmm. Three more ideas: Wrap the buffer in a custom struct, pass the four bytes as a uint32_t or as four separate char parameters. I better stop now.. > We keep repeating "length check" because that is what the comment in > the function says, but even if the caller has 4-byte, that 4-byte > substring at the beginning is what it read from the untrusted side > over the network. We should be checking if we have 4 hexadecimal > length even if we are not running beyond the end of the buffer, no? Sure, that's done. If any of the four characters is not a hexadecimal digit then packet_length() returns a negative value, before and after the change. > So it may be that the comment needs to be fixed more than the code. Which comment specifically? The one in pkt-line.h doesn't mention what happens if you pass in a short buffer -- leaving it undefined is OK, I think. And in and around pkt-line.c::packet_length() I don't see any comment? René
René Scharfe <l.s.r@web.de> writes: > Sure, that's done. If any of the four characters is not a hexadecimal > digit then packet_length() returns a negative value, before and after > the change. Ah, it is the beauty of hexval[] table ;-) So as long as we are sure that we are not running beyond the end of the buffer, we are OK. And as I already said, I think "this change is safe for our two callers; those adding more callers in the future are better be very careful" is probably good enough for this one. hex.h:hex2chr() says "don't run over the end of short strings", but as far as I can see it does not check any such thing; find a page of memory, whose next page is unmapped, and pointing *s at the last byte of that page and calling it will happily run over the end and would cause SIGBUS. The function assumes that such a short string is always NUL terminated, which is not a great way to guarantee that we do not run over the end of strings.
Am 06.07.23 um 00:27 schrieb Junio C Hamano: > hex.h:hex2chr() says "don't run over the end of short strings", but > as far as I can see it does not check any such thing; find a page of > memory, whose next page is unmapped, and pointing *s at the last > byte of that page and calling it will happily run over the end and > would cause SIGBUS. The function assumes that such a short string > is always NUL terminated, which is not a great way to guarantee that > we do not run over the end of strings. Yes, hex2chr() works with C strings, i.e. those that end with a NUL character. An empty string is just a NUL byte, a string of length 1 is a non-NUL byte and a NUL. The function reads one byte from the former and otherwise two bytes -- no overrun. If a C string loses its NUL, how could you detect its end? René
diff --git a/pkt-line.c b/pkt-line.c index 62b4208b66..6e022029ca 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -375,8 +375,10 @@ static int get_packet_data(int fd, char **src_buf, size_t *src_size, int packet_length(const char lenbuf_hex[4]) { - int val = hex2chr(lenbuf_hex); - return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2); + return hexval(lenbuf_hex[0]) << 12 | + hexval(lenbuf_hex[1]) << 8 | + hexval(lenbuf_hex[2]) << 4 | + hexval(lenbuf_hex[3]); } static char *find_packfile_uri_path(const char *buffer)
hex2chr() takes care not to run over the end of a short string. 101736a14c (pkt-line: extern packet_length(), 2020-05-19) turned the input parameter of packet_length() from a string pointer into an array of known length, making string length checks unnecessary. Get rid of them by using hexval() directly. The resulting branchless code is simpler and it becomes easier to see that the function mirrors set_packet_header(). Signed-off-by: René Scharfe <l.s.r@web.de> --- pkt-line.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) -- 2.41.0