diff mbox series

[v2,3/8] sparse-index: silently return when cache tree fails

Message ID 371985352680a767dfacb5477aa77e92e04008ee.1628625013.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Sparse index: delete ignored files outside sparse cone | expand

Commit Message

Derrick Stolee Aug. 10, 2021, 7:50 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

If cache_tree_update() returns a non-zero value, then it could not
create the cache tree. This is likely due to a path having a merge
conflict. Since we are already returning early, let's return silently to
avoid making it seem like we failed to write the index at all.

If we remove our dependence on the cache tree within
convert_to_sparse(), then we could still recover from this scenario and
have a sparse index.

When constructing the cache-tree extension in convert_to_sparse(), it is
possible that we construct a tree object that is new to the object
database. Without the WRITE_TREE_MISSING_OK flag, this results in an
error that halts our conversion to a sparse index. Add this flag to
remove this limitation.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sparse-index.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

Comments

Elijah Newren Aug. 19, 2021, 6:24 p.m. UTC | #1
On Tue, Aug 10, 2021 at 12:50 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> If cache_tree_update() returns a non-zero value, then it could not
> create the cache tree. This is likely due to a path having a merge
> conflict. Since we are already returning early, let's return silently to
> avoid making it seem like we failed to write the index at all.
>
> If we remove our dependence on the cache tree within
> convert_to_sparse(), then we could still recover from this scenario and
> have a sparse index.
>
> When constructing the cache-tree extension in convert_to_sparse(), it is
> possible that we construct a tree object that is new to the object
> database. Without the WRITE_TREE_MISSING_OK flag, this results in an
> error that halts our conversion to a sparse index. Add this flag to
> remove this limitation.

Would this only happen when the user has staged but uncommitted
changes outside the sparsity paths, and tries to sparsify while in
that state?  (Notably, this is a much different condition than the
above mentioned merge conflict case that would case
cache_tree_update() to just fail.)

I think it might be nicer to split this commit in two, just to make it
easier to understand for future readers.  This seems like two logical
changes and trying to understand them and why would I think be easier
if the two were split.  I'd be tempted to put the
WRITE_TREE_MISSING_OK first.

> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  sparse-index.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/sparse-index.c b/sparse-index.c
> index bc5900eae35..b6e90417556 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -179,10 +179,15 @@ int convert_to_sparse(struct index_state *istate)
>
>         /* Clear and recompute the cache-tree */
>         cache_tree_free(&istate->cache_tree);
> -       if (cache_tree_update(istate, 0)) {
> -               warning(_("unable to update cache-tree, staying full"));
> -               return -1;
> -       }
> +       /*
> +        * Silently return if there is a problem with the cache tree update,
> +        * which might just be due to a conflict state in some entry.
> +        *
> +        * This might create new tree objects, so be sure to use
> +        * WRITE_TREE_MISSING_OK.
> +        */
> +       if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
> +               return 0;
>
>         remove_fsmonitor(istate);
>
> --
> gitgitgadget

These feel like cases where it would be nice to have a testcase
demonstrating the change in behavior.  Perhaps just splitting the
commit would be enough, but it took a bit of time to try to understand
what would change and why, despite the simple changes.
Derrick Stolee Aug. 20, 2021, 3:04 p.m. UTC | #2
On 8/19/2021 2:24 PM, Elijah Newren wrote:
> On Tue, Aug 10, 2021 at 12:50 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> If cache_tree_update() returns a non-zero value, then it could not
>> create the cache tree. This is likely due to a path having a merge
>> conflict. Since we are already returning early, let's return silently to
>> avoid making it seem like we failed to write the index at all.
>>
>> If we remove our dependence on the cache tree within
>> convert_to_sparse(), then we could still recover from this scenario and
>> have a sparse index.
>>
>> When constructing the cache-tree extension in convert_to_sparse(), it is
>> possible that we construct a tree object that is new to the object
>> database. Without the WRITE_TREE_MISSING_OK flag, this results in an
>> error that halts our conversion to a sparse index. Add this flag to
>> remove this limitation.
> 
> Would this only happen when the user has staged but uncommitted
> changes outside the sparsity paths, and tries to sparsify while in
> that state?  (Notably, this is a much different condition than the
> above mentioned merge conflict case that would case
> cache_tree_update() to just fail.)
> 
> I think it might be nicer to split this commit in two, just to make it
> easier to understand for future readers.  This seems like two logical
> changes and trying to understand them and why would I think be easier
> if the two were split.  I'd be tempted to put the
> WRITE_TREE_MISSING_OK first.

Ironically, I _had_ this as two commits because I discovered the
problems independently. It wasn't until I was organizing things that
I realized I was editing the same 'if' twice and thought it better
to merge patches. But I also don't feel strongly about that, so I
can split them.

>> +       /*
>> +        * Silently return if there is a problem with the cache tree update,
>> +        * which might just be due to a conflict state in some entry.
>> +        *
>> +        * This might create new tree objects, so be sure to use
>> +        * WRITE_TREE_MISSING_OK.
>> +        */
>> +       if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
>> +               return 0;
...
> These feel like cases where it would be nice to have a testcase
> demonstrating the change in behavior.  Perhaps just splitting the
> commit would be enough, but it took a bit of time to try to understand
> what would change and why, despite the simple changes.

I found these were required in the Scalar functional tests, so I bet
that if I remove this change I can create a test case from that. Thanks.

-Stolee
diff mbox series

Patch

diff --git a/sparse-index.c b/sparse-index.c
index bc5900eae35..b6e90417556 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -179,10 +179,15 @@  int convert_to_sparse(struct index_state *istate)
 
 	/* Clear and recompute the cache-tree */
 	cache_tree_free(&istate->cache_tree);
-	if (cache_tree_update(istate, 0)) {
-		warning(_("unable to update cache-tree, staying full"));
-		return -1;
-	}
+	/*
+	 * Silently return if there is a problem with the cache tree update,
+	 * which might just be due to a conflict state in some entry.
+	 *
+	 * This might create new tree objects, so be sure to use
+	 * WRITE_TREE_MISSING_OK.
+	 */
+	if (cache_tree_update(istate, WRITE_TREE_MISSING_OK))
+		return 0;
 
 	remove_fsmonitor(istate);