Message ID | YvQc9egGzwFomQbw@coredump.intra.peff.net (mailing list archive) |
---|---|
State | Accepted |
Commit | 53602a937dc9eacd67b6afcd781f7b15bb02682f |
Headers | show |
Series | actually detect bad file modes in fsck | expand |
Jeff King <peff@peff.net> writes: > We use the normal tree_desc code to iterate over trees in fsck, meaning > we only see the canonicalized modes it returns. And hence we'd never see > anything unexpected, since it will coerce literally any garbage into one > of our normal and accepted modes. Wow. I did know canon_mode() deliberately discarding the extra permission bits on trees and blobs, but it was that bad to mark whatever it does not understand as a gitlink. That is simply horrible. > - if (init_tree_desc_gently(&desc, buffer, size, 0)) { > + if (init_tree_desc_gently(&desc, buffer, size, TREE_DESC_RAW_MODES)) { > retval += report(options, tree_oid, OBJ_TREE, > FSCK_MSG_BAD_TREE, > "cannot be parsed as a tree"); OK, so we'll let desc.entry.mode carry whatever bogus bit pattern we got out of buffer and the downstream code already knows what to do with them. That's a clean and minimum way to do this. Thanks. > diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh > index ab7f31f1dc..53c2aa10b7 100755 > --- a/t/t1450-fsck.sh > +++ b/t/t1450-fsck.sh > @@ -364,6 +364,20 @@ test_expect_success 'tree entry with type mismatch' ' > test_i18ngrep ! "dangling blob" out > ' > > +test_expect_success 'tree entry with bogus mode' ' > + test_when_finished "remove_object \$blob" && > + test_when_finished "remove_object \$tree" && > + blob=$(echo blob | git hash-object -w --stdin) && > + blob_oct=$(echo $blob | hex2oct) && > + tree=$(printf "100000 foo\0${blob_oct}" | > + git hash-object -t tree --stdin -w --literally) && > + git fsck 2>err && > + cat >expect <<-EOF && > + warning in tree $tree: badFilemode: contains bad file modes > + EOF > + test_cmp expect err > +' > + > test_expect_success 'tag pointing to nonexistent' ' > badoid=$(test_oid deadbeef) && > cat >invalid-tag <<-EOF &&
On Wed, Aug 10, 2022 at 02:35:32PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > We use the normal tree_desc code to iterate over trees in fsck, meaning > > we only see the canonicalized modes it returns. And hence we'd never see > > anything unexpected, since it will coerce literally any garbage into one > > of our normal and accepted modes. > > Wow. I did know canon_mode() deliberately discarding the extra > permission bits on trees and blobs, but it was that bad to mark > whatever it does not understand as a gitlink. That is simply > horrible. Yeah, I noticed that, as well. It might actually be reasonable to put a die() at the end of that canon_mode() function. It's rather unfriendly, but then, treating nonsense as a gitlink is likely to just result in the caller ending up aborting anyway. Outside the scope of my patch, though. :) -Peff
Jeff King <peff@peff.net> writes:
> Outside the scope of my patch, though. :)
Absolutely.
diff --git a/fsck.c b/fsck.c index 5acc982a7c..b3da1d68c0 100644 --- a/fsck.c +++ b/fsck.c @@ -578,7 +578,7 @@ static int fsck_tree(const struct object_id *tree_oid, const char *o_name; struct name_stack df_dup_candidates = { NULL }; - if (init_tree_desc_gently(&desc, buffer, size, 0)) { + if (init_tree_desc_gently(&desc, buffer, size, TREE_DESC_RAW_MODES)) { retval += report(options, tree_oid, OBJ_TREE, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index ab7f31f1dc..53c2aa10b7 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -364,6 +364,20 @@ test_expect_success 'tree entry with type mismatch' ' test_i18ngrep ! "dangling blob" out ' +test_expect_success 'tree entry with bogus mode' ' + test_when_finished "remove_object \$blob" && + test_when_finished "remove_object \$tree" && + blob=$(echo blob | git hash-object -w --stdin) && + blob_oct=$(echo $blob | hex2oct) && + tree=$(printf "100000 foo\0${blob_oct}" | + git hash-object -t tree --stdin -w --literally) && + git fsck 2>err && + cat >expect <<-EOF && + warning in tree $tree: badFilemode: contains bad file modes + EOF + test_cmp expect err +' + test_expect_success 'tag pointing to nonexistent' ' badoid=$(test_oid deadbeef) && cat >invalid-tag <<-EOF &&
We use the normal tree_desc code to iterate over trees in fsck, meaning we only see the canonicalized modes it returns. And hence we'd never see anything unexpected, since it will coerce literally any garbage into one of our normal and accepted modes. We can use the new RAW_MODES flag to see the real modes, and then use the existing code to actually analyze them. The existing code is written as allow-known-good, so there's not much point in testing a variety of breakages. The one tested here should be S_IFREG but with nonsense permissions. Do note that the error-reporting here isn't great. We don't mention the specific bad mode, but just that the tree has one or more broken modes. But when you go to look at it with "git ls-tree", we'll report the canonicalized mode! This isn't ideal, but given that this should come up rarely, and that any number of other tree corruptions might force you into looking at the binary bytes via "cat-file", it's not the end of the world. And it's something we can improve on top later if we choose. Reported-by: Xavier Morel <xavier.morel@masklinn.net> Signed-off-by: Jeff King <peff@peff.net> --- fsck.c | 2 +- t/t1450-fsck.sh | 14 ++++++++++++++ 2 files changed, 15 insertions(+), 1 deletion(-)