diff mbox series

[v3,2/6] stash: remove the second index in stash_working_tree()

Message ID 20200731165140.29197-3-alban.gruin@gmail.com (mailing list archive)
State New, archived
Headers show
Series stash: drop usage of a second index | expand

Commit Message

Alban Gruin July 31, 2020, 4:51 p.m. UTC
This removes the second index used in stash_working_tree() to simplify
the code.

The calls to set_alternative_index_output() are dropped to extract
`i_tree' to the main index, and `GIT_INDEX_FILE' is no longer set before
starting `update-index'.  When it exits, the index has changed, and must
be discarded.

With this commit, reset_tree() does not need to be called at the
beginning of stash_working_tree(), because it is only called by
do_create_stash(), which sets the index at `i_tree', and
save_untracked_files() does not change the main index.  But it will
become useful again in a later commit, when save_untracked_file() will
be rewritten to use the "main" index, so I did not remove it.

At the end of the function, the tree is reset to `i_tree'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/stash.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

Comments

Junio C Hamano July 31, 2020, 6:26 p.m. UTC | #1
Alban Gruin <alban.gruin@gmail.com> writes:

> This removes the second index used in stash_working_tree() to simplify
> the code.

Continuing what I said for [0/6], say "the second index file" to
clarify the distinction between the in-core index and on-disk index
files to avoid confusion.

>  	init_revisions(&rev, NULL);
>  	copy_pathspec(&rev.prune_data, ps);
>  
> -	set_alternate_index_output(stash_index_path.buf);
>  	if (reset_tree(&info->i_tree, 0, 0)) {
>  		ret = -1;
>  		goto done;
>  	}
> -	set_alternate_index_output(NULL);

Hmph.  So at this point i_tree is what is in the index, we reset the
working tree to it with reset_tree(), and instead of writing to
$TMPindex, we let reset_tree() clobber the main on-disk index but we
do not care because the result is supposed to be the same as what
was originally in the index anyway?

> @@ -1091,8 +1088,6 @@ static int stash_working_tree(struct stash_info *info, const struct pathspec *ps
>  	argv_array_pushl(&cp_upd_index.args, "update-index",
>  			 "--ignore-skip-worktree-entries",
>  			 "-z", "--add", "--remove", "--stdin", NULL);
> -	argv_array_pushf(&cp_upd_index.env_array, "GIT_INDEX_FILE=%s",
> -			 stash_index_path.buf);
>  
>  	if (pipe_command(&cp_upd_index, diff_output.buf, diff_output.len,
>  			 NULL, 0, NULL, 0)) {

And then the new code now lets "update-index" work directly on the
main index (which does make an observable difference to the outside
world, but we are not letting any hook to look at this intermediate
state, so it might be OK---I cannot tell at this point in the code).

> @@ -1100,19 +1095,16 @@ static int stash_working_tree(struct stash_info *info, const struct pathspec *ps
>  		goto done;
>  	}
>  
> -	if (write_index_as_tree(&info->w_tree, &istate, stash_index_path.buf, 0,
> -				NULL)) {
> +	discard_cache();
> +	if (write_cache_as_tree(&info->w_tree, 0, NULL) ||
> +	    reset_tree(&info->i_tree, 0, 1))

We used to read from $TMPindex, which has been updated with the
contents of files modified in the working tree, and write its
content out as a tree object, grabbing its object name into w_tree.

The new code instead writes out the same tree from the main index,
which has been clobbered with the working tree changes that the user
hasn't added to the index.  Because of that, we need to discard
these changes and re-read the in-core index out of i_tree with
reset_tree().

So, this change makes one new call to reset_tree() that scans and
updates working tree files that are modified?  How expensive is
that?  

It's not like this change trades cost to write out a temporary index
file with the cost of having to call reset_tree() one more time---as
far as I can see, the new code writes to on-disk index file the same
time as the original code.

How is the use of second on-disk index hurting us?  I must be
missing something obvious, but at this point, it is not clear what
we are gaining---I only see downsides.
Junio C Hamano Aug. 2, 2020, 2:20 a.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

>> @@ -1091,8 +1088,6 @@ static int stash_working_tree(struct stash_info *info, const struct pathspec *ps
>>  	argv_array_pushl(&cp_upd_index.args, "update-index",
>>  			 "--ignore-skip-worktree-entries",
>>  			 "-z", "--add", "--remove", "--stdin", NULL);
>> -	argv_array_pushf(&cp_upd_index.env_array, "GIT_INDEX_FILE=%s",
>> -			 stash_index_path.buf);
>>  
>>  	if (pipe_command(&cp_upd_index, diff_output.buf, diff_output.len,
>>  			 NULL, 0, NULL, 0)) {
>
> And then the new code now lets "update-index" work directly on the
> main index (which does make an observable difference to the outside
> world, but we are not letting any hook to look at this intermediate
> state, so it might be OK---I cannot tell at this point in the code).

This changes the behaviour of the command a lot when a fatal happens
anywhere after this point til we reset the index to match the HEAD,
doesn't it, I wonder.  The original code refrained from touching the
index so after such a "fatal:" error message, the user would have
seen the original state (with previous "git add" etc. the user did)
in the index.  New code now added all the working tree changes to
the index, so the user would lose the work made so far in the main
index, no?  With later patch in the series applied, we will end up
having even untracked paths in the main index---I am not sure to
what extent that step would make things even worse, but it cannot be
harmless.

> How is the use of second on-disk index hurting us?  I must be
> missing something obvious, but at this point, it is not clear what
> we are gaining---I only see downsides.
diff mbox series

Patch

diff --git a/builtin/stash.c b/builtin/stash.c
index 9baa8b379e..2535335275 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -1059,17 +1059,14 @@  static int stash_working_tree(struct stash_info *info, const struct pathspec *ps
 	struct rev_info rev;
 	struct child_process cp_upd_index = CHILD_PROCESS_INIT;
 	struct strbuf diff_output = STRBUF_INIT;
-	struct index_state istate = { NULL };
 
 	init_revisions(&rev, NULL);
 	copy_pathspec(&rev.prune_data, ps);
 
-	set_alternate_index_output(stash_index_path.buf);
 	if (reset_tree(&info->i_tree, 0, 0)) {
 		ret = -1;
 		goto done;
 	}
-	set_alternate_index_output(NULL);
 
 	rev.diffopt.output_format = DIFF_FORMAT_CALLBACK;
 	rev.diffopt.format_callback = add_diff_to_buf;
@@ -1091,8 +1088,6 @@  static int stash_working_tree(struct stash_info *info, const struct pathspec *ps
 	argv_array_pushl(&cp_upd_index.args, "update-index",
 			 "--ignore-skip-worktree-entries",
 			 "-z", "--add", "--remove", "--stdin", NULL);
-	argv_array_pushf(&cp_upd_index.env_array, "GIT_INDEX_FILE=%s",
-			 stash_index_path.buf);
 
 	if (pipe_command(&cp_upd_index, diff_output.buf, diff_output.len,
 			 NULL, 0, NULL, 0)) {
@@ -1100,19 +1095,16 @@  static int stash_working_tree(struct stash_info *info, const struct pathspec *ps
 		goto done;
 	}
 
-	if (write_index_as_tree(&info->w_tree, &istate, stash_index_path.buf, 0,
-				NULL)) {
+	discard_cache();
+	if (write_cache_as_tree(&info->w_tree, 0, NULL) ||
+	    reset_tree(&info->i_tree, 0, 1))
 		ret = -1;
-		goto done;
-	}
 
 done:
-	discard_index(&istate);
 	UNLEAK(rev);
 	object_array_clear(&rev.pending);
 	clear_pathspec(&rev.prune_data);
 	strbuf_release(&diff_output);
-	remove_path(stash_index_path.buf);
 	return ret;
 }