Message ID | 94e9de44e3b52513c5ab48aecd74f809dc34cbe3.1624473543.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Accepted |
Commit | f89ecf79888a48e0adf14d0e05c69ee09e853fd5 |
Headers | show |
Series | midx: verify MIDX checksum before reusing | expand |
On 24/06/21 01.39, Taylor Blau wrote: > But it does not check that the file's trailing checksum matches the data > that it records. So, if an on-disk corruption happened to occur in the > final few bytes (and all other data was recorded correctly), we would: > > - get a clean result from 'git multi-pack-index verify', but > - be unable to reuse the existing MIDX when writing a new one (since > we now check for checksum mismatches before reusing a MIDX) > > Teach the 'verify' sub-command to recognize corruption in the checksum > by calling midx_checksum_valid(). ... > + if (!midx_checksum_valid(m)) > + midx_report(_("incorrect checksum")); > + ... > +test_expect_success 'verify incorrect checksum' ' > + pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) && > + corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum" > +' > + We're in the context of checksum mismatches because of on-disk corruption, so why didn't the message say "incorrect checksum, possibly corrupt"?
On Wed, Jun 23, 2021 at 02:39:15PM -0400, Taylor Blau wrote: > 'git multi-pack-index verify' inspects the data in an existing MIDX for > correctness by checking that the recorded object offsets are correct, > and so on. > > But it does not check that the file's trailing checksum matches the data > that it records. So, if an on-disk corruption happened to occur in the > final few bytes (and all other data was recorded correctly), we would: > > - get a clean result from 'git multi-pack-index verify', but > - be unable to reuse the existing MIDX when writing a new one (since > we now check for checksum mismatches before reusing a MIDX) > > Teach the 'verify' sub-command to recognize corruption in the checksum > by calling midx_checksum_valid(). Makes sense. I was a little surprised we didn't do this already, but I guess it does not do the same "regenerate and make sure hashfile produces the same checksum" trick that the pack idx verifier does (as an aside, I think what the midx code is doing here is much _better_, because it is looking at semantic problems in the file, and is more robust against irrelevant changes in the format). > @@ -1228,6 +1228,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag > return result; > } > > + if (!midx_checksum_valid(m)) > + midx_report(_("incorrect checksum")); This "midx_report()" function doesn't provide much context on stderr. I get: $ echo foo >>.git/objects/pack/multi-pack-index $ git multi-pack-index verify incorrect checksum Verifying OID order in multi-pack-index: 100% (282/282), done. Sorting objects by packfile: 100% (283/283), done. Verifying object offsets: 100% (283/283), done. I think we should at least say "error:", but something along the lines of "midx file at %s does not match its trailing checksum (possibly corruption?)". Or something like that. I think all of the existing calls to midx_report() share this issue, though. We probably want to at least say "error:" here, but maybe even something like: diff --git a/midx.c b/midx.c index 9a35b0255d..e464907a7c 100644 --- a/midx.c +++ b/midx.c @@ -1172,10 +1172,12 @@ void clear_midx_file(struct repository *r) static int verify_midx_error; -static void midx_report(const char *fmt, ...) +static void midx_report(struct multi_pack_index *m, const char *fmt, ...) { va_list ap; verify_midx_error = 1; + /* do we need to care about the "next" pointer here? */ + fprintf(stderr, ("error: %s/multi-pack-index: "), m->object_dir); va_start(ap, fmt); vfprintf(stderr, fmt, ap); fprintf(stderr, "\n"); Also, a side note: we should use __attribute__((format)) on this function to get compile-time checks of our format strings. -Peff
On Wed, Jun 23, 2021 at 02:39:15PM -0400, Taylor Blau wrote: > 'git multi-pack-index verify' inspects the data in an existing MIDX for > correctness by checking that the recorded object offsets are correct, > and so on. > > But it does not check that the file's trailing checksum matches the data > that it records. So, if an on-disk corruption happened to occur in the > final few bytes (and all other data was recorded correctly), we would: > > - get a clean result from 'git multi-pack-index verify', but > - be unable to reuse the existing MIDX when writing a new one (since > we now check for checksum mismatches before reusing a MIDX) > > Teach the 'verify' sub-command to recognize corruption in the checksum > by calling midx_checksum_valid(). > > Suggested-by: Derrick Stolee <dstolee@microsoft.com> > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > midx.c | 3 +++ > t/t5319-multi-pack-index.sh | 5 +++++ > 2 files changed, 8 insertions(+) > > diff --git a/midx.c b/midx.c > index a12cbbf928..9a35b0255d 100644 > --- a/midx.c > +++ b/midx.c > @@ -1228,6 +1228,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag > return result; > } > > + if (!midx_checksum_valid(m)) > + midx_report(_("incorrect checksum")); > + > if (flags & MIDX_PROGRESS) > progress = start_delayed_progress(_("Looking for referenced packfiles"), > m->num_packs); > diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh > index d582f370c4..7609f1ea64 100755 > --- a/t/t5319-multi-pack-index.sh > +++ b/t/t5319-multi-pack-index.sh > @@ -418,6 +418,11 @@ test_expect_success 'corrupt MIDX is not reused' ' > git multi-pack-index verify > ' > > +test_expect_success 'verify incorrect checksum' ' > + pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) && > + corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum" This test is flaky and can fail with: ... + printf \377 + dd of=.git/objects/pack/multi-pack-index bs=1 seek=3839 conv=notrunc 1+0 records in 1+0 records out 1 byte copied, 5.0656e-05 s, 19.7 kB/s + test_must_fail git multi-pack-index verify --object-dir=.git/objects test_must_fail: command succeeded: git multi-pack-index verify --object-dir=.git/objects error: last command exited with $?=1 + mv midx-backup .git/objects/pack/multi-pack-index not ok 44 - verify incorrect checksum So the test corrupts the checksum trailer in the 'multi-pack-index' file by overwriting its last byte with 0xff, but if that byte were already 0xff, then the file would be left as is, and 'git multi-pack-index verify' wouldn't find anything amiss. Since SHA1 is essentially random, there's a 1:256 chance of that happening, assuming that the file's content is random. That't not really the case, however, because both the test repository's objects and the resulting packfiles are deterministic, and, consequently, the content of the MIDX is somewhat deterministic. Only "somewhat", though, because several of those objects appear in multiple packfiles, and the MIDX selects a copy in the most recently modified packfile, so filesystem mtime resolution and second boundaries become significant, and cause some variance in the contents of the OOFF chunk. Recently a laptop crossed my way that is somehow exceptionally good at generating MIDX files ending with 0xff: # ad-hoc checksum statistics: $ git diff diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index f1ee2ce56d..605713b518 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -482,9 +482,13 @@ test_expect_success 'corrupt MIDX is not reused' ' ' test_expect_success 'verify incorrect checksum' ' + skip=$(($(wc -c <$objdir/pack/multi-pack-index) - 20)) && + printf "checksum: " >&5 && + od -x -w20 -j$skip --endian=big -An "$objdir/pack/multi-pack-index" >&5 && pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) && corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum" ' +test_done test_expect_success 'repack progress off for redirected stderr' ' GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err && $ for i in {1..500} ; do ./t5319-multi-pack-index.sh |sed -n -e 's/^checksum: //p' ; done |sort |uniq -c 31 1a70 3b1c 8ed3 56a6 5101 2a38 057e 698d 6faf fbaa 340 5fc0 552f 0ac0 c876 f229 d9e3 ef13 a314 5847 89ff 129 ce7d 3710 fd21 ef7b 8316 2b99 4e6c e5d5 5e7c 7b08 That's almost 70% failure rate, though I haven't been able to trigger this failure on any of the other machines that I have access to.
On Thu, Nov 11, 2021 at 12:11:32AM +0100, SZEDER Gábor wrote: > So the test corrupts the checksum trailer in the 'multi-pack-index' > file by overwriting its last byte with 0xff, but if that byte were > already 0xff, then the file would be left as is, and 'git > multi-pack-index verify' wouldn't find anything amiss. Good find. I tried running with the patch you showed: > $ for i in {1..500} ; do ./t5319-multi-pack-index.sh |sed -n -e 's/^checksum: //p' ; done |sort |uniq -c > 31 1a70 3b1c 8ed3 56a6 5101 2a38 057e 698d 6faf fbaa > 340 5fc0 552f 0ac0 c876 f229 d9e3 ef13 a314 5847 89ff > 129 ce7d 3710 fd21 ef7b 8316 2b99 4e6c e5d5 5e7c 7b08 but I only ever got the last one, even under load. However, it's easy enough to tweak the mtimes to stimulate a failure: diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 3f69e43178..b84e96779f 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -481,10 +481,25 @@ test_expect_success 'corrupt MIDX is not reused' ' git multi-pack-index verify ' +while true; do test_expect_success 'verify incorrect checksum' ' + t=1234567890 && + for i in $(ls $objdir/pack/*.pack | shuf) + do + touch -d @$t $i && + t=$((t+100)) + done && + rm -f $objdir/pack/multi-pack-index && + git multi-pack-index write && + skip=$(($(wc -c <$objdir/pack/multi-pack-index) - 20)) && + printf "checksum: " >&5 && + od -x -w20 -j$skip --endian=big -An "$objdir/pack/multi-pack-index" >&5 && pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) && corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum" ' +done + +test_done test_expect_success 'repack progress off for redirected stderr' ' GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err && Brute force, but it fails after a few hundred attempts. :) > Since SHA1 is essentially random, there's a 1:256 chance of that > happening, assuming that the file's content is random. The most exact fix here would be to read the final byte, increment it mod-256, and write it back, which would ensure it was always changed. But that's a bit annoying to do in shell. Perhaps just corrupting more bytes is a good solution? The patch below should reduce your changes to 1 in 2^80. diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 3f69e43178..a612e44547 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -482,8 +482,10 @@ test_expect_success 'corrupt MIDX is not reused' ' ' test_expect_success 'verify incorrect checksum' ' - pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) && - corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum" + pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 10)) && + corrupt_midx_and_verify $pos \ + "\377\377\377\377\377\377\377\377\377\377" \ + $objdir "incorrect checksum" ' test_expect_success 'repack progress off for redirected stderr' ' There are other variants, of course. Just appending a single byte to the file is enough to give you a high probability of failing the checksum (since it shifts it all by one byte, making it essentially random), but the corrupt_midx() helper doesn't support that. -Peff
On Thu, Nov 11, 2021 at 05:05:06AM -0500, Jeff King wrote: > > Since SHA1 is essentially random, there's a 1:256 chance of that > > happening, assuming that the file's content is random. > > The most exact fix here would be to read the final byte, increment it > mod-256, and write it back, which would ensure it was always changed. > But that's a bit annoying to do in shell. Perhaps just corrupting more > bytes is a good solution? The patch below should reduce your changes to > 1 in 2^80. > > diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh > index 3f69e43178..a612e44547 100755 > --- a/t/t5319-multi-pack-index.sh > +++ b/t/t5319-multi-pack-index.sh > @@ -482,8 +482,10 @@ test_expect_success 'corrupt MIDX is not reused' ' > ' > > test_expect_success 'verify incorrect checksum' ' > - pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) && > - corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum" > + pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 10)) && > + corrupt_midx_and_verify $pos \ > + "\377\377\377\377\377\377\377\377\377\377" \ > + $objdir "incorrect checksum" > ' > > test_expect_success 'repack progress off for redirected stderr' ' > > There are other variants, of course. Just appending a single byte to the > file is enough to give you a high probability of failing the checksum > (since it shifts it all by one byte, making it essentially random), but > the corrupt_midx() helper doesn't support that. Thanks SZEDER for the report, and thanks Peff for the fix :). I agree that it's annoyingly cumbersome to write back the last byte incremented mod-256. So I'm content to just make it astronomically unlikely to run into a collision in practice. (As a matter of fact, I'm surprised that the current implementation hasn't produced failures for us more often). Peff: do you want me to turn this into a patch or were you planning on it? Thanks, Taylor
diff --git a/midx.c b/midx.c index a12cbbf928..9a35b0255d 100644 --- a/midx.c +++ b/midx.c @@ -1228,6 +1228,9 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag return result; } + if (!midx_checksum_valid(m)) + midx_report(_("incorrect checksum")); + if (flags & MIDX_PROGRESS) progress = start_delayed_progress(_("Looking for referenced packfiles"), m->num_packs); diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index d582f370c4..7609f1ea64 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -418,6 +418,11 @@ test_expect_success 'corrupt MIDX is not reused' ' git multi-pack-index verify ' +test_expect_success 'verify incorrect checksum' ' + pos=$(($(wc -c <$objdir/pack/multi-pack-index) - 1)) && + corrupt_midx_and_verify $pos "\377" $objdir "incorrect checksum" +' + test_expect_success 'repack progress off for redirected stderr' ' GIT_PROGRESS_DELAY=0 git multi-pack-index --object-dir=$objdir repack 2>err && test_line_count = 0 err
'git multi-pack-index verify' inspects the data in an existing MIDX for correctness by checking that the recorded object offsets are correct, and so on. But it does not check that the file's trailing checksum matches the data that it records. So, if an on-disk corruption happened to occur in the final few bytes (and all other data was recorded correctly), we would: - get a clean result from 'git multi-pack-index verify', but - be unable to reuse the existing MIDX when writing a new one (since we now check for checksum mismatches before reusing a MIDX) Teach the 'verify' sub-command to recognize corruption in the checksum by calling midx_checksum_valid(). Suggested-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- midx.c | 3 +++ t/t5319-multi-pack-index.sh | 5 +++++ 2 files changed, 8 insertions(+)