[RFC] sparse index: fix use-after-free bug in cache_tree_verify()

Message ID: pull.1053.git.1633512591608.gitgitgadget@gmail.com
State: Superseded

Commit Message

Phillip Wood Oct. 6, 2021, 9:29 a.m. UTC
From: Phillip Wood <phillip.wood@dunelm.org.uk>

In a sparse index it is possible for the tree that is being verified
to be freed while it is being verified. This happens when
index_name_pos() looks up an entry that is missing from the index and
that would be a descendant of a sparse entry. That triggers a call to
ensure_full_index() which frees the cache tree that is being verified.
Carrying on trying to verify the tree after this results in a
use-after-free bug. Instead restart the verification if a sparse index
is converted to a full index. This bug is triggered by a call to
reset_head() in "git rebase --apply". Thanks to René Scharfe for his
help analyzing the problem.
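
The snapshot-and-restart idea described above can be modelled outside of git. The following is only a sketch under stated assumptions: every type and function name in it (toy_node, toy_index, toy_lookup, toy_verify_one, toy_verify) is hypothetical and is not git's API, and the tree is reduced to a single chain of nodes. It shows a recursive walk that notices the "sparse" flag flipping during a lookup, unwinds without touching the freed nodes, and lets the caller start again on the rebuilt tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy types for illustration; these are not git's structs. */
struct toy_node {
	struct toy_node *child;	/* one child per node keeps the sketch short */
	int outside_cone;	/* looking this node up forces expansion */
};

struct toy_index {
	int sparse;		/* plays the role of istate->sparse_index */
	int expansions;		/* how often the tree was freed and rebuilt */
	struct toy_node *tree;
};

static struct toy_node *toy_build(int depth, int sparse)
{
	struct toy_node *n = calloc(1, sizeof(*n));

	if (!n)
		abort();
	/* In the sparse case only the deepest node lies outside the cone. */
	n->outside_cone = sparse && depth == 0;
	if (depth > 0)
		n->child = toy_build(depth - 1, sparse);
	return n;
}

static void toy_free(struct toy_node *n)
{
	if (!n)
		return;
	toy_free(n->child);
	free(n);
}

/* Stand-in for a lookup that may expand the index and free the tree. */
static void toy_lookup(struct toy_index *idx, const struct toy_node *n)
{
	if (idx->sparse && n->outside_cone) {
		toy_free(idx->tree);	/* 'n' is dangling from here on */
		idx->tree = toy_build(2, 0);
		idx->sparse = 0;
		idx->expansions++;
	}
}

/* Returns 1 if the tree was replaced and the walk must restart, else 0. */
static int toy_verify_one(struct toy_index *idx, struct toy_node *n)
{
	int was_sparse;

	assert(idx && n);
	if (n->child && toy_verify_one(idx, n->child))
		return 1;	/* 'n' may be freed: unwind without touching it */

	was_sparse = idx->sparse;	/* snapshot before the risky lookup */
	toy_lookup(idx, n);
	if (was_sparse && !idx->sparse)
		return 1;
	return 0;
}

/* Returns the number of passes made (1 or 2), mirroring the two-pass cap. */
static int toy_verify(struct toy_index *idx)
{
	int passes = 1;

	if (toy_verify_one(idx, idx->tree)) {
		passes++;
		toy_verify_one(idx, idx->tree);
	}
	return passes;
}
```

In this model a sparse index needs two passes and a full one needs a single pass. The important detail is that every frame returns immediately on a non-zero result, so no freed node is read while the recursion unwinds.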

==74345==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000001b20 at pc 0x557cbe82d3a2 bp 0x7ffdfee08090 sp 0x7ffdfee08080
READ of size 4 at 0x606000001b20 thread T0
    #0 0x557cbe82d3a1 in verify_one /home/phil/src/git/cache-tree.c:863
    #1 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #2 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #3 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #4 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #5 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #6 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #7 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #8 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #9 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #10 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #11 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #12 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #13 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
    #14 0x557cbe5bcb8d in _start (/home/phil/src/git/git+0x1b9b8d)

0x606000001b20 is located 0 bytes inside of 56-byte region [0x606000001b20,0x606000001b58)
freed by thread T0 here:
    #0 0x7fdd4bacff19 in __interceptor_free /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:127
    #1 0x557cbe82af60 in cache_tree_free /home/phil/src/git/cache-tree.c:35
    #2 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #3 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #4 0x557cbe82aee5 in cache_tree_free /home/phil/src/git/cache-tree.c:31
    #5 0x557cbeb2557a in ensure_full_index /home/phil/src/git/sparse-index.c:310
    #6 0x557cbea45c4a in index_name_stage_pos /home/phil/src/git/read-cache.c:588
    #7 0x557cbe82ce37 in verify_one /home/phil/src/git/cache-tree.c:850
    #8 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #9 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #10 0x557cbe82ca9d in verify_one /home/phil/src/git/cache-tree.c:840
    #11 0x557cbe830a2b in cache_tree_verify /home/phil/src/git/cache-tree.c:910
    #12 0x557cbea53741 in write_locked_index /home/phil/src/git/read-cache.c:3250
    #13 0x557cbeab7fdd in reset_head /home/phil/src/git/reset.c:87
    #14 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #15 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #16 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #17 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #18 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #19 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #20 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

previously allocated by thread T0 here:
    #0 0x7fdd4bad0459 in __interceptor_calloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x557cbebc1807 in xcalloc /home/phil/src/git/wrapper.c:140
    #2 0x557cbe82b7d8 in cache_tree /home/phil/src/git/cache-tree.c:17
    #3 0x557cbe82b7d8 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:763
    #4 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #5 0x557cbe82b837 in prime_cache_tree_rec /home/phil/src/git/cache-tree.c:764
    #6 0x557cbe8304e1 in prime_cache_tree /home/phil/src/git/cache-tree.c:779
    #7 0x557cbeab7fa7 in reset_head /home/phil/src/git/reset.c:85
    #8 0x557cbe72147f in cmd_rebase builtin/rebase.c:2074
    #9 0x557cbe5bd151 in run_builtin /home/phil/src/git/git.c:461
    #10 0x557cbe5bd151 in handle_builtin /home/phil/src/git/git.c:714
    #11 0x557cbe5c0503 in run_argv /home/phil/src/git/git.c:781
    #12 0x557cbe5c0503 in cmd_main /home/phil/src/git/git.c:912
    #13 0x557cbe5bad28 in main /home/phil/src/git/common-main.c:52
    #14 0x7fdd4b82eb24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
    [RFC] sparse index: fix use-after-free bug in cache_tree_verify()
    
    In a sparse index it is possible for the tree that is being verified to
    be freed while it is being verified. This is an RFC as I'm not familiar
    with the cache tree code. I'm confused as to why this bug is triggered
    by the sequence
    
    unpack_trees()
    prime_cache_tree()
    write_locked_index()
    
    
    but not
    
    unpack_trees()
    write_locked_index()
    
    
    as unpack_trees() appears to update the cache tree with
    
    if (!cache_tree_fully_valid(o->result.cache_tree))
                cache_tree_update(&o->result,
                          WRITE_TREE_SILENT |
                          WRITE_TREE_REPAIR);
    
    
    and I don't understand why the cache tree from prime_cache_tree()
    results in different behavior. It concerns me that this fix is hiding
    another bug.
    
    Best Wishes
    
    Phillip

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1053%2Fphillipwood%2Fwip%2Fsparse-index-fix-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1053/phillipwood/wip/sparse-index-fix-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1053

 cache-tree.c                             | 29 +++++++++++++++++-------
 t/t1092-sparse-checkout-compatibility.sh |  2 +-
 2 files changed, 22 insertions(+), 9 deletions(-)


base-commit: cefe983a320c03d7843ac78e73bd513a27806845

Comments

Derrick Stolee Oct. 6, 2021, 11:20 a.m. UTC | #1
On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
> 
> In a sparse index it is possible for the tree that is being verified
> to be freed while it is being verified. This happens when
> index_name_pos() looks up an entry that is missing from the index and
> that would be a descendant of a sparse entry. That triggers a call to
> ensure_full_index() which frees the cache tree that is being verified.
> Carrying on trying to verify the tree after this results in a
> use-after-free bug. Instead restart the verification if a sparse index
> is converted to a full index. This bug is triggered by a call to
> reset_head() in "git rebase --apply". Thanks to René Scharfe for his
> help analyzing the problem.

Thank you for identifying an interesting case! I hadn't thought to
change the mode from --merge to --apply.

>     In a sparse index it is possible for the tree that is being verified to
>     be freed while it is being verified. This is an RFC as I'm not familiar
>     with the cache tree code. I'm confused as to why this bug is triggered
>     by the sequence
>     
>     unpack_trees()
>     prime_cache_tree()
>     write_locked_index()
>     
>     but not
>     
>     unpack_trees()
>     write_locked_index()
>     
>     
>     as unpack_trees() appears to update the cache tree with
>     
>     if (!cache_tree_fully_valid(o->result.cache_tree))
>                 cache_tree_update(&o->result,
>                           WRITE_TREE_SILENT |
>                           WRITE_TREE_REPAIR);
>     
>     
>     and I don't understand why the cache tree from prime_cache_tree()
>     results in different behavior. It concerns me that this fix is hiding
>     another bug.

prime_cache_tree() appears to clear the cache tree and start from scratch
from a tree object instead of using the index.

In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
cone, so the cache tree is the full size at that point.

When the verify_one() method reaches these nodes that are outside of the
cone, index_name_pos() triggers the index expansion in a way that the
cache-tree that is restricted to the sparse-checkout cone does not.

Hopefully that helps clear up _why_ this happens.
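
To make the "full cache tree but sparse index" mismatch concrete, here is a minimal sketch of the kind of check that decides whether a lookup has to expand the index. It is an assumption-laden toy, not git's implementation: toy_entry, toy_name_pos() and toy_lookup_needs_expansion() are hypothetical names, and entries are plain strings sorted with strcmp(). A path that is missing from the index, but would sort directly after a sparse directory entry that is its prefix, is the case that forces expansion.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy entry; not git's struct cache_entry. */
struct toy_entry {
	const char *name;
	int is_sparse_dir;	/* entry stands in for a whole directory */
};

/* Binary search: position if found, else -(insert position) - 1. */
static int toy_name_pos(const struct toy_entry *e, int nr, const char *name)
{
	int lo = 0, hi = nr;

	assert(e && name);
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, e[mid].name);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -lo - 1;
}

/*
 * Returns 1 if 'name' is missing but lies below a sparse directory
 * entry, i.e. the lookup could only be answered by a full index.
 */
static int toy_lookup_needs_expansion(const struct toy_entry *e, int nr,
				      const char *name)
{
	int pos = toy_name_pos(e, nr, name);
	size_t len;

	if (pos >= 0)
		return 0;	/* exact hit, nothing to expand */
	pos = -pos - 1;
	if (pos == 0 || !e[pos - 1].is_sparse_dir)
		return 0;
	len = strlen(e[pos - 1].name);
	return strlen(name) > len && !strncmp(name, e[pos - 1].name, len);
}
```

With entries "a", "folder1/a", "folder2/" (sparse) and "z", a cache tree limited to the cone would only ever ask for "folder2/", which is an exact hit. A full-size cache tree also asks for paths such as "folder2/sub/", and that is the lookup that reports a need for expansion in this model.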

There is a remaining issue that "git rebase --apply" will be a lot slower
than "git rebase --merge" because of this construction of a cache-tree
that is much larger than necessary.

I will make note of this as a potential improvement for the future.

> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>  {
>  	int i, pos, len = path->len;
>  	struct strbuf tree_buf = STRBUF_INIT;
> @@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
>  
>  	for (i = 0; i < it->subtree_nr; i++) {
>  		strbuf_addf(path, "%s/", it->down[i]->name);
> -		verify_one(r, istate, it->down[i]->cache_tree, path);
> +		if (verify_one(r, istate, it->down[i]->cache_tree, path))
> +			return 1;
>  		strbuf_setlen(path, len);
>  	}
>  
>  	if (it->entry_count < 0 ||
>  	    /* no verification on tests (t7003) that replace trees */
>  	    lookup_replace_object(r, &it->oid) != &it->oid)
> -		return;
> +		return 0;
>  
>  	if (path->len) {
> +		/*
> +		 * If the index is sparse index_name_pos() may trigger
> +		 * ensure_full_index() which will free the tree that is being
> +		 * verified.
> +		 */
> +		int is_sparse = istate->sparse_index;
>  		pos = index_name_pos(istate, path->buf, path->len);
> +		if (is_sparse && !istate->sparse_index)
> +			return 1;

I think this guard is good to have, even if we fix prime_cache_tree() to
avoid triggering expansion here in most cases.

>  		if (pos >= 0) {
>  			verify_one_sparse(r, istate, it, path, pos);
> -			return;
> +			return 0;
>  		}
>  
>  		pos = -pos - 1;
> @@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
>  		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
>  	strbuf_setlen(path, len);
>  	strbuf_release(&tree_buf);
> +	return 0;
>  }
>  
>  void cache_tree_verify(struct repository *r, struct index_state *istate)
> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>  
>  	if (!istate->cache_tree)
>  		return;
> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		verify_one(r, istate, istate->cache_tree, &path);
> +	}

And this limits us to doing at most two passes. Good.

>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>  
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"

Thank you for the additional test!

Thanks,
-Stolee

Phillip Wood Oct. 6, 2021, 2:01 p.m. UTC | #2
Hi Stolee

On 06/10/2021 12:20, Derrick Stolee wrote:
> On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote:
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> In a sparse index it is possible for the tree that is being verified
>> to be freed while it is being verified. This happens when
>> index_name_pos() looks up an entry that is missing from the index and
>> that would be a descendant of a sparse entry. That triggers a call to
>> ensure_full_index() which frees the cache tree that is being verified.
>> Carrying on trying to verify the tree after this results in a
>> use-after-free bug. Instead restart the verification if a sparse index
>> is converted to a full index. This bug is triggered by a call to
>> reset_head() in "git rebase --apply". Thanks to René Scharfe for his
>> help analyzing the problem.
> 
> Thank you for identifying an interesting case! I hadn't thought to
> change the mode from --merge to --apply.

Thanks, I can't really take much credit for that though - Junio pointed 
out that my patch converting the merge based rebase to use the same 
checkout code as the apply based rebase broke a test in seen and René 
diagnosed the problem.

>>      In a sparse index it is possible for the tree that is being verified to
>>      be freed while it is being verified. This is an RFC as I'm not familiar
>>      with the cache tree code. I'm confused as to why this bug is triggered
>>      by the sequence
>>      
>>      unpack_trees()
>>      prime_cache_tree()
>>      write_locked_index()
>>      
>>      but not
>>      
>>      unpack_trees()
>>      write_locked_index()
>>      
>>      
>>      as unpack_trees() appears to update the cache tree with
>>      
>>      if (!cache_tree_fully_valid(o->result.cache_tree))
>>                  cache_tree_update(&o->result,
>>                            WRITE_TREE_SILENT |
>>                            WRITE_TREE_REPAIR);
>>      
>>      
>>      and I don't understand why the cache tree from prime_cache_tree()
>>      results in different behavior. It concerns me that this fix is hiding
>>      another bug.
> 
> prime_cache_tree() appears to clear the cache tree and start from scratch
> from a tree object instead of using the index.
> 
> In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
> cone, so the cache tree is the full size at that point.
> 
> When the verify_one() method reaches these nodes that are outside of the
> cone, index_name_pos() triggers the index expansion in a way that the
> cache-tree that is restricted to the sparse-checkout cone does not.
> 
> Hopefully that helps clear up _why_ this happens.

It does, thanks - we end up with a full cache tree but a sparse index.

> There is a remaining issue that "git rebase --apply" will be a lot slower
> than "git rebase --merge" because of this construction of a cache-tree
> that is much larger than necessary.
> 
> I will make note of this as a potential improvement for the future.

I think I'm going to remove the call to prime_cache_tree(). Correct me 
if I'm wrong but as I understand it unpack_trees() updates the cache 
tree so the call to prime_cache_tree() is not needed (I think it was 
copied from builtin/rebase.c which does need to call prime_cache_tree() 
if it has updated a few paths rather than the whole top-level tree). In 
any case I've just noticed that one of Victoria's patches[1] looks like 
it fixes prime_cache_tree() with a sparse index.

[1] 
https://lore.kernel.org/git/78cd85d8dcc790251ce8235e649902cf6adf091a.1633440057.git.gitgitgadget@gmail.com/

>> -static void verify_one(struct repository *r,
>> -		       struct index_state *istate,
>> -		       struct cache_tree *it,
>> -		       struct strbuf *path)
>> +static int verify_one(struct repository *r,
>> +		      struct index_state *istate,
>> +		      struct cache_tree *it,
>> +		      struct strbuf *path)
>>   {
>>   	int i, pos, len = path->len;
>>   	struct strbuf tree_buf = STRBUF_INIT;
>> @@ -837,21 +837,30 @@ static void verify_one(struct repository *r,
>>   
>>   	for (i = 0; i < it->subtree_nr; i++) {
>>   		strbuf_addf(path, "%s/", it->down[i]->name);
>> -		verify_one(r, istate, it->down[i]->cache_tree, path);
>> +		if (verify_one(r, istate, it->down[i]->cache_tree, path))
>> +			return 1;
>>   		strbuf_setlen(path, len);
>>   	}
>>   
>>   	if (it->entry_count < 0 ||
>>   	    /* no verification on tests (t7003) that replace trees */
>>   	    lookup_replace_object(r, &it->oid) != &it->oid)
>> -		return;
>> +		return 0;
>>   
>>   	if (path->len) {
>> +		/*
>> +		 * If the index is sparse index_name_pos() may trigger
>> +		 * ensure_full_index() which will free the tree that is being
>> +		 * verified.
>> +		 */
>> +		int is_sparse = istate->sparse_index;
>>   		pos = index_name_pos(istate, path->buf, path->len);
>> +		if (is_sparse && !istate->sparse_index)
>> +			return 1;
> 
> I think this guard is good to have, even if we fix prime_cache_tree() to
> avoid triggering expansion here in most cases.
> 
>>   		if (pos >= 0) {
>>   			verify_one_sparse(r, istate, it, path, pos);
>> -			return;
>> +			return 0;
>>   		}
>>   
>>   		pos = -pos - 1;
>> @@ -899,6 +908,7 @@ static void verify_one(struct repository *r,
>>   		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
>>   	strbuf_setlen(path, len);
>>   	strbuf_release(&tree_buf);
>> +	return 0;
>>   }
>>   
>>   void cache_tree_verify(struct repository *r, struct index_state *istate)
>> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>>   
>>   	if (!istate->cache_tree)
>>   		return;
>> -	verify_one(r, istate, istate->cache_tree, &path);
>> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
>> +		strbuf_reset(&path);
>> +		verify_one(r, istate, istate->cache_tree, &path);
>> +	}
> 
> And this limits us to doing at most two passes. Good.

In theory ensure_full_index() will only ever be called once but I wanted 
to make sure we could not get into an infinite loop.

>>   test_expect_success 'merge, cherry-pick, and rebase' '
>>   	init_repos &&
>>   
>> -	for OPERATION in "merge -m merge" cherry-pick rebase
>> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
> 
> Thank you for the additional test!

Thanks for your explanation and looking at the patch

Best Wishes

Phillip

> Thanks,
> -Stolee
>

Derrick Stolee Oct. 6, 2021, 2:19 p.m. UTC | #3
On 10/6/21 10:01 AM, Phillip Wood wrote:
> Hi Stolee
> 
> On 06/10/2021 12:20, Derrick Stolee wrote:
>> In particular, prime_cache_tree_rec() does not stop at the sparse-checkout
>> cone, so the cache tree is the full size at that point.
>>
>> When the verify_one() method reaches these nodes that are outside of the
>> cone, index_name_pos() triggers the index expansion in a way that the
>> cache-tree that is restricted to the sparse-checkout cone does not.
>>
>> Hopefully that helps clear up _why_ this happens.
> 
> It does thanks - we end up with a full cache tree but a sparse index

That's a short-and-sweet way to describe it.

>> There is a remaining issue that "git rebase --apply" will be a lot slower
>> than "git rebase --merge" because of this construction of a cache-tree
>> that is much larger than necessary.
>>
>> I will make note of this as a potential improvement for the future.
> 
> I think I'm going to remove the call to prime_cache_tree(). Correct me if I'm wrong but as I understand it unpack_trees() updates the cache tree so the call to prime_cache_tree() is not needed (I think it was copied from builtin/rebase.c which does need to call prime_cache_tree() if it has updated a few paths rather than the whole top-level tree). In any case I've just noticed that one of Victoria's patches[1] looks like it fixes prime_cache_tree() with a sparse index.
> 
> [1] https://lore.kernel.org/git/78cd85d8dcc790251ce8235e649902cf6adf091a.1633440057.git.gitgitgadget@gmail.com/

Of course it does! I'm losing track of all the ongoing work in
the sparse index as I've been distracted and out of it for a
while. It's in good hands.

Thanks,
-Stolee

Junio C Hamano Oct. 6, 2021, 7:17 p.m. UTC | #4
"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

/*
 * Please document what the values that can be returned from
 * this function are and what they mean, just before this
 * function.  I am guessing that this is "all bets are off and
 * you need to redo the computation again over the full in-core
 * index"?  It is not an error and I think it makes sense to use
 * positive 1 like this patch does instead of -1.
 */
>  
> -static void verify_one(struct repository *r,
> -		       struct index_state *istate,
> -		       struct cache_tree *it,
> -		       struct strbuf *path)
> +static int verify_one(struct repository *r,
> +		      struct index_state *istate,
> +		      struct cache_tree *it,
> +		      struct strbuf *path)
>  {



> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>  
>  	if (!istate->cache_tree)
>  		return;
> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		verify_one(r, istate, istate->cache_tree, &path);
> +	}
>  	strbuf_release(&path);
>  }

This is just a style thing, but I would find it easier to follow if
it just recursed into itself, i.e.

-	verify_one(...);
+	if (verify_one(...))
+		cache_tree_verify(r, istate);

or

-	verify_one(...);
+again:
+	if (verify_one(...)) {
+		strbuf_reset(&path);
+		goto again;
+	}

On the other hand, if the new code wants to say "I would retry at
most once, otherwise there is something wrong in me", then

> -	verify_one(r, istate, istate->cache_tree, &path);
> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
> +		strbuf_reset(&path);
> +		if (verify_one(r, istate, istate->cache_tree, &path))
> +			BUG("...");
> +	}

would be better.
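
The "retry at most once, otherwise something is wrong" variant, together with the request to document the return values, could be sketched as follows. This is a hedged illustration only: the enum names VERIFY_DONE and VERIFY_RESTART, the callback type and verify_retry_once() are hypothetical (the patch itself returns plain 0 and 1), and where git would call BUG() the sketch returns -1 so that the behaviour can be checked in a test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the patch itself uses plain 0 and 1. */
enum verify_result {
	VERIFY_DONE = 0,	/* the tree was walked to completion */
	VERIFY_RESTART = 1	/* the tree was replaced; walk it again */
};

typedef enum verify_result (*verify_pass_fn)(void *data);

/*
 * Runs 'pass' and retries exactly once on VERIFY_RESTART.
 * Returns the number of passes made (1 or 2), or -1 if the second
 * pass also asks for a restart. Real code would treat that last case
 * as an internal error; -1 merely keeps this sketch testable.
 */
static int verify_retry_once(verify_pass_fn pass, void *data)
{
	assert(pass != NULL);
	if (pass(data) == VERIFY_DONE)
		return 1;
	if (pass(data) == VERIFY_DONE)
		return 2;
	return -1;
}

/* Toy pass that asks for a restart a configurable number of times. */
struct toy_state {
	int restarts_left;
	int calls;
};

static enum verify_result toy_pass(void *data)
{
	struct toy_state *s = data;

	s->calls++;
	if (s->restarts_left > 0) {
		s->restarts_left--;
		return VERIFY_RESTART;
	}
	return VERIFY_DONE;
}
```

Compared with the recursive or goto forms, this shape cannot loop forever: a pass that keeps requesting restarts is reported after the second attempt instead of being retried again.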

Other than that, nicely done.

> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 886e78715fe..85d5279b33c 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -484,7 +484,7 @@ test_expect_success 'checkout and reset (mixed) [sparse]' '
>  test_expect_success 'merge, cherry-pick, and rebase' '
>  	init_repos &&
>  
> -	for OPERATION in "merge -m merge" cherry-pick rebase
> +	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
>  	do
>  		test_all_match git checkout -B temp update-deep &&
>  		test_all_match git $OPERATION update-folder1 &&
>
> base-commit: cefe983a320c03d7843ac78e73bd513a27806845

Derrick Stolee Oct. 6, 2021, 8:43 p.m. UTC | #5
On 10/6/2021 3:17 PM, Junio C Hamano wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate)
>>  
>>  	if (!istate->cache_tree)
>>  		return;
>> -	verify_one(r, istate, istate->cache_tree, &path);
>> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
>> +		strbuf_reset(&path);
>> +		verify_one(r, istate, istate->cache_tree, &path);
>> +	}
>>  	strbuf_release(&path);
>>  }
> 
> This is just a style thing, but I would find it easier to follow if
> it just recursed into itself, i.e.
> 
> -	verify_one(...);
> +	if (verify_one(...))
> +		cache_tree_verify(r, istate);
> 
> or
> 
> -	verify_one(...);
> +again:
> +	if (verify_one(...)) {
> +		strbuf_reset(&path);
> +		goto again;
> +	}
> 
> On the other hand, if the new code wants to say "I would retry at
> most once, otherwise there is something wrong in me", then
> 
>> -	verify_one(r, istate, istate->cache_tree, &path);
>> +	if (verify_one(r, istate, istate->cache_tree, &path)) {
>> +		strbuf_reset(&path);
>> +		if (verify_one(r, istate, istate->cache_tree, &path))
>> +			BUG("...");
>> +	}
> 
> would be better.

I'm in favor of this second option.

Thanks,
-Stolee

Patch

diff --git a/cache-tree.c b/cache-tree.c
index 90919f9e345..7bdbbc24268 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -826,10 +826,10 @@  static void verify_one_sparse(struct repository *r,
 		    path->buf);
 }
 
-static void verify_one(struct repository *r,
-		       struct index_state *istate,
-		       struct cache_tree *it,
-		       struct strbuf *path)
+static int verify_one(struct repository *r,
+		      struct index_state *istate,
+		      struct cache_tree *it,
+		      struct strbuf *path)
 {
 	int i, pos, len = path->len;
 	struct strbuf tree_buf = STRBUF_INIT;
@@ -837,21 +837,30 @@  static void verify_one(struct repository *r,
 
 	for (i = 0; i < it->subtree_nr; i++) {
 		strbuf_addf(path, "%s/", it->down[i]->name);
-		verify_one(r, istate, it->down[i]->cache_tree, path);
+		if (verify_one(r, istate, it->down[i]->cache_tree, path))
+			return 1;
 		strbuf_setlen(path, len);
 	}
 
 	if (it->entry_count < 0 ||
 	    /* no verification on tests (t7003) that replace trees */
 	    lookup_replace_object(r, &it->oid) != &it->oid)
-		return;
+		return 0;
 
 	if (path->len) {
+		/*
+		 * If the index is sparse index_name_pos() may trigger
+		 * ensure_full_index() which will free the tree that is being
+		 * verified.
+		 */
+		int is_sparse = istate->sparse_index;
 		pos = index_name_pos(istate, path->buf, path->len);
+		if (is_sparse && !istate->sparse_index)
+			return 1;
 
 		if (pos >= 0) {
 			verify_one_sparse(r, istate, it, path, pos);
-			return;
+			return 0;
 		}
 
 		pos = -pos - 1;
@@ -899,6 +908,7 @@  static void verify_one(struct repository *r,
 		    oid_to_hex(&new_oid), oid_to_hex(&it->oid));
 	strbuf_setlen(path, len);
 	strbuf_release(&tree_buf);
+	return 0;
 }
 
 void cache_tree_verify(struct repository *r, struct index_state *istate)
@@ -907,6 +917,9 @@  void cache_tree_verify(struct repository *r, struct index_state *istate)
 
 	if (!istate->cache_tree)
 		return;
-	verify_one(r, istate, istate->cache_tree, &path);
+	if (verify_one(r, istate, istate->cache_tree, &path)) {
+		strbuf_reset(&path);
+		verify_one(r, istate, istate->cache_tree, &path);
+	}
 	strbuf_release(&path);
 }
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 886e78715fe..85d5279b33c 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -484,7 +484,7 @@  test_expect_success 'checkout and reset (mixed) [sparse]' '
 test_expect_success 'merge, cherry-pick, and rebase' '
 	init_repos &&
 
-	for OPERATION in "merge -m merge" cherry-pick rebase
+	for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge"
 	do
 		test_all_match git checkout -B temp update-deep &&
 		test_all_match git $OPERATION update-folder1 &&