Message ID | 20250110-b4-pks-blame-truncate-hash-length-v3-2-e61f25b68f30@pks.im (mailing list archive) |
---|---|
State | Accepted |
Commit | e7fb2ca94556e6aadfc3038afaa1c8cc3525258c |
Headers | show |
Series | builtin/blame: fix out-of-bounds reads and writes | expand |
Hi Patrick, On Fri, 10 Jan 2025, Patrick Steinhardt wrote: > diff --git a/builtin/blame.c b/builtin/blame.c > index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644 > --- a/builtin/blame.c > +++ b/builtin/blame.c > @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int > fputs(color, stdout); > > if (suspect->commit->object.flags & UNINTERESTING) { > - if (blank_boundary) > - memset(hex, ' ', length); > - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { > + if (blank_boundary) { > + memset(hex, ' ', strlen(hex)); Using `strlen()` is a neat trick. I could have done without slipping in a style change (introducing curlies), but the most important thing is that it fixes the bug. Thank you, Johannes > + } else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { > length--; > putchar('^'); > }
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > On Fri, 10 Jan 2025, Patrick Steinhardt wrote: > >> diff --git a/builtin/blame.c b/builtin/blame.c >> index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644 >> --- a/builtin/blame.c >> +++ b/builtin/blame.c >> @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int >> fputs(color, stdout); >> >> if (suspect->commit->object.flags & UNINTERESTING) { >> - if (blank_boundary) >> - memset(hex, ' ', length); >> - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { >> + if (blank_boundary) { >> + memset(hex, ' ', strlen(hex)); > > Using `strlen()` is a neat trick. > > I could have done without slipping in a style change (introducing > curlies), but the most important thing is that it fixes the bug. Thank both of you for these last-minute fixes. Hopefully we can have them in today's release, and we didn't miss unexpected side effects in them, I hope ;-).
diff --git a/builtin/blame.c b/builtin/blame.c index d7630ac89cb7bd6e9ce5d72c6a98aa433b3b12da..7555c445abe7ca2fa54670ac8fee1d95a6dbafe3 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -489,9 +489,9 @@ static void emit_other(struct blame_scoreboard *sb, struct blame_entry *ent, int fputs(color, stdout); if (suspect->commit->object.flags & UNINTERESTING) { - if (blank_boundary) - memset(hex, ' ', length); - else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { + if (blank_boundary) { + memset(hex, ' ', strlen(hex)); + } else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) { length--; putchar('^'); } diff --git a/t/t8002-blame.sh b/t/t8002-blame.sh index b3f8b63d2e6744dd434f38fd9f10b56cd432141b..1ad039e1234828ca8779ad76147bfa7fe14c5a2e 100755 --- a/t/t8002-blame.sh +++ b/t/t8002-blame.sh @@ -134,6 +134,24 @@ test_expect_success 'blame --abbrev gets truncated with boundary commit' ' check_abbrev $hexsz --abbrev=9000 ^HEAD ' +test_expect_success 'blame --abbrev -b truncates the blank boundary' ' + # Note that `--abbrev=` always gets incremented by 1, which is why we + # expect 11 leading spaces and not 10. + cat >expect <<-EOF && + $(printf "%0.s " $(test_seq 11)) (<author@example.com> 2005-04-07 15:45:13 -0700 1) abbrev + EOF + git blame -b --abbrev=10 ^HEAD -- abbrev.t >actual && + test_cmp expect actual +' + +test_expect_success 'blame with excessive --abbrev and -b culls to hash length' ' + cat >expect <<-EOF && + $(printf "%0.s " $(test_seq $hexsz)) (<author@example.com> 2005-04-07 15:45:13 -0700 1) abbrev + EOF + git blame -b --abbrev=9000 ^HEAD -- abbrev.t >actual && + test_cmp expect actual +' + test_expect_success '--exclude-promisor-objects does not BUG-crash' ' test_must_fail git blame --exclude-promisor-objects one '
When passing the `-b` flag to git-blame(1), then any blamed boundary commits which were marked as uninteresting will not get their actual commit ID printed, but will instead be replaced by a couple of spaces. The flag can lead to an out-of-bounds write as though when combined with `--abbrev=` when the abbreviation length is longer than `GIT_MAX_HEXSZ` as we simply use memset(3p) on that array with the user-provided length directly. The result is most likely that we segfault. An obvious fix would be to cull `length` to `GIT_MAX_HEXSZ` many bytes. But when the underlying object ID is SHA1, and if the abbreviated length exceeds the SHA1 length, it would cause us to print more bytes than desired, and the result would be misaligned. Instead, fix the bug by computing the length via strlen(3p). This makes us write as many bytes as the formatted object ID requires and thus effectively limits the length of what we may end up printing to the length of its hash. If `--abbrev=` asks us to abbreviate to something shorter than the full length of the underlying hash function it would be handled by the call to printf(3p) correctly. Signed-off-by: Patrick Steinhardt <ps@pks.im> --- builtin/blame.c | 6 +++--- t/t8002-blame.sh | 18 ++++++++++++++++++ 2 files changed, 21 insertions(+), 3 deletions(-)