diff mbox series

[2/3] cache-tree: detect mismatching number of index entries

Message ID 595693a6420b2571aabd51ed989bedfa0cfa62e2.1726556195.git.ps@pks.im (mailing list archive)
State New
Headers show
Series cache-tree: fix segfaults with invalid cache-trees | expand

Commit Message

Patrick Steinhardt Sept. 17, 2024, 7:13 a.m. UTC
In t4058 we have some tests that exercise git-read-tree(1) when used
with a tree that contains duplicate entries. While the expectation is
that we fail, we ideally should fail gracefully without a segfault.

But that is not the case: we never check that the number of entries in
the cache-tree is less than or equal to the number of entries in the
index. This can lead to an out-of-bounds read as we unconditionally
access `istate->cache[idx]`, where `idx` is controlled by the number of
cache-tree entries and the current position therein. The result is a
segfault.

Fix this segfault by adding a sanity check for the number of index
entries before dereferencing them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 cache-tree.c               |  5 +++++
 t/t4058-diff-duplicates.sh | 12 ++++++------
 2 files changed, 11 insertions(+), 6 deletions(-)

Comments

Junio C Hamano Sept. 19, 2024, 1:35 a.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> +	if (it->entry_count + pos > istate->cache_nr) {
> +		ret = error(_("corrupted cache-tree has entries not present in index"));
> +		goto out;
> +	}

Is it a safe assumption that the if() condition always indicates an
error?  When sparse-index is in effect, istate->cache_nr may be a
number that is smaller than the true number of paths in the index
(because all paths under a subdirectory we are not interested in are
folded into a single tree-ish entry), and I am not sure how it
should interact with it->entry_count (i.e. the number of paths under
the current directory we are looking at, which obviously cannot be a
sparsified entry) and pos (i.e. the index into active_cache[] that
represend the first path under the current directory)?

I guess as long as "it" is not folded, it does not matter how other
paths from different directories in active_cache[] are sparsified or
expanded, as long as "pos" keeps track of the current position
correctly.

> diff --git a/t/t4058-diff-duplicates.sh b/t/t4058-diff-duplicates.sh
> index 2501c89c1c9..3f602adb055 100755
> --- a/t/t4058-diff-duplicates.sh
> +++ b/t/t4058-diff-duplicates.sh
> @@ -132,15 +132,15 @@ test_expect_success 'create a few commits' '
>  	rm commit_id up final
>  '
>  
> -test_expect_failure 'git read-tree does not segfault' '
> -	test_when_finished rm .git/index.lock &&
> -	test_might_fail git read-tree --reset base
> +test_expect_success 'git read-tree does not segfault' '
> +	test_must_fail git read-tree --reset base 2>err &&
> +	test_grep "error: corrupted cache-tree has entries not present in index" err
>  '

Very good.  test_might_fail is a sign of trouble, and this gives us
a lot more predictability.
diff mbox series

Patch

diff --git a/cache-tree.c b/cache-tree.c
index 4228b6fad48..1e625673086 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -933,6 +933,11 @@  static int verify_one(struct repository *r,
 		pos = 0;
 	}
 
+	if (it->entry_count + pos > istate->cache_nr) {
+		ret = error(_("corrupted cache-tree has entries not present in index"));
+		goto out;
+	}
+
 	i = 0;
 	while (i < it->entry_count) {
 		struct cache_entry *ce = istate->cache[pos + i];
diff --git a/t/t4058-diff-duplicates.sh b/t/t4058-diff-duplicates.sh
index 2501c89c1c9..3f602adb055 100755
--- a/t/t4058-diff-duplicates.sh
+++ b/t/t4058-diff-duplicates.sh
@@ -132,15 +132,15 @@  test_expect_success 'create a few commits' '
 	rm commit_id up final
 '
 
-test_expect_failure 'git read-tree does not segfault' '
-	test_when_finished rm .git/index.lock &&
-	test_might_fail git read-tree --reset base
+test_expect_success 'git read-tree does not segfault' '
+	test_must_fail git read-tree --reset base 2>err &&
+	test_grep "error: corrupted cache-tree has entries not present in index" err
 '
 
-test_expect_failure 'reset --hard does not segfault' '
-	test_when_finished rm .git/index.lock &&
+test_expect_success 'reset --hard does not segfault' '
 	git checkout base &&
-	test_might_fail git reset --hard
+	test_must_fail git reset --hard 2>err &&
+	test_grep "error: corrupted cache-tree has entries not present in index" err
 '
 
 test_expect_failure 'git diff HEAD does not segfault' '