diff mbox series

[v2] reset: unstage empty deleted ita files

Message ID 20190726044806.2216-1-vcnaik94@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v2] reset: unstage empty deleted ita files | expand

Commit Message

Varun Naik July 26, 2019, 4:48 a.m. UTC
It is possible to delete a committed file from the index and then add it
as intent-to-add. After `git reset HEAD`, the file should be identical
in the index and HEAD. This patch provides the desired behavior even
when the file is empty in the index.

Signed-off-by: Varun Naik <vcnaik94@gmail.com>
---
CC Duy because you seem to be the one who is most involved in changes to
ita behavior.

The test description was incorrect, so I changed that now. I also
figured out how to fix the related problem in "restore"; separate patch
coming soon.

 builtin/reset.c  |  1 +
 t/t7102-reset.sh | 11 +++++++++++
 2 files changed, 12 insertions(+)

Comments

Junio C Hamano July 26, 2019, 6:19 p.m. UTC | #1
Varun Naik <vcnaik94@gmail.com> writes:

> It is possible to delete a committed file from the index and then add it
> as intent-to-add. After `git reset HEAD`, the file should be identical
> in the index and HEAD. This patch provides the desired behavior even
> when the file is empty in the index.

The first and the second sentence describes what you want to achieve
concisely and sensibly.  There is a vast leap between them and the
last sentence.  What's missing is:

 - What goes wrong without this one-liner fix and how does the
   command make a wrong decision to leave the path 'empty' (in the
   new test) different from that of the tree of the HEAD?

 - How does the change help the machinery to make a right decision
   instead?

I had to briefly wonder if this change interacts with "reset -N" in
any way.  In that mode, we want to make sure that diff sees the
entries that are missing from the index that exist in the tree of
the HEAD, so that update_index_from_diff() can add them back to the
index as I-T-A entries.

Making I-T-A entries invisible in the index for the purpose of
running that diff would mean that they appear as removed, but it is
OK because we'll add them back as I-T-A entries anyway, so it all
evens out, I think.

> diff --git a/builtin/reset.c b/builtin/reset.c
> index 26ef9a7bd0..47a088f4b7 100644
> --- a/builtin/reset.c
> +++ b/builtin/reset.c
> @@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec,
>  	opt.format_callback_data = &intent_to_add;
>  	opt.flags.override_submodule_config = 1;
>  	opt.repo = the_repository;
> +	opt.ita_invisible_in_index = 1;
>  
>  	if (do_diff_cache(tree_oid, &opt))
>  		return 1;
> diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
> index 97be0d968d..9f3854e8f0 100755
> --- a/t/t7102-reset.sh
> +++ b/t/t7102-reset.sh
> @@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' '
>  	test_must_be_empty actual
>  '
>  
> +test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' '
> +	echo "nonempty" >nonempty &&
> +	>empty &&
> +	git add nonempty empty &&
> +	git commit -m "create files to be deleted" &&
> +	git rm --cached nonempty empty &&
> +	git add -N nonempty empty &&
> +	git reset HEAD nonempty empty &&
> +	git diff --staged --exit-code

We are not testing if the "diff --staged" synonym (that is not even
in "git diff --help") behaves identically to "diff --cached" here
(if we wanted to do such a test, we should do so somewhere in t4xxx
series, not here), so let's spell it in the canonical form, like so:

	git diff --cached --exit-code HEAD

At this point, we have three paths (empty, nonempty and secondfile)
in the tree of the HEAD, and we just resetted the entries for the
first two paths in the index to match.  The 'secondfile' added, in
previous tests, is still there unchanged, and is not shown in the
diff output.  The 'new-file', added as I-T-A in previous tests, is
also still there unchanged, and is not shown in the diff output.

How does the internal do_diff_cache() behave differently before and
after this patch on 'empty' and 'nonempty'?  How does the difference
affect the final outcome of "git reset" operation?

 - without ita-is-invisible bit set, we end up comparing the mode
   and blob object name; for 'nonempty', HEAD records a blob object
   name for the non-empty content, but the index has an empty blob
   object name (with I-T-A bit set, but that does not participate in
   the diff operation at the level of diff-lib.c::do_oneway_diff())
   and are deemed "modified".  Even though we should say "deleted",
   the end result turns out to be the same---we restore from the
   tree of the HEAD.

 - however, for 'empty', we mistakenly end up saying "both are empty
   blobs, so no difference; nothing to be done", leaving the i-t-a
   entry in the index.

 - with ita-is-invisible bit set, both 'nonempty' and 'empty' are
   immediately marked as "deleted" in do_oneway_diff().  This causes
   both paths to be resurrected from the tree of the HEAD the same
   way.

With the above kind of analysis, a reader can fill in the leap in
the explanation between the second and the third sentence in the
proposed log message.  But we shouldn't force readers to make that
effort to understand how the patch was meant to improve things.

Thanks.

> +'
> +
>  test_done
Varun Naik July 29, 2019, 6:52 a.m. UTC | #2
On Fri, Jul 26, 2019 at 11:20 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Varun Naik <vcnaik94@gmail.com> writes:
>
> > It is possible to delete a committed file from the index and then add it
> > as intent-to-add. After `git reset HEAD`, the file should be identical
> > in the index and HEAD. This patch provides the desired behavior even
> > when the file is empty in the index.
>
> The first and the second sentence describes what you want to achieve
> concisely and sensibly.  There is a vast leap between them and the
> last sentence.  What's missing is:
>
>  - What goes wrong without this one-liner fix and how does the
>    command make a wrong decision to leave the path 'empty' (in the
>    new test) different from that of the tree of the HEAD?
>
>  - How does the change help the machinery to make a right decision
>    instead?
>
> I had to briefly wonder if this change interacts with "reset -N" in
> any way.  In that mode, we want to make sure that diff sees the
> entries that are missing from the index that exist in the tree of
> the HEAD, so that update_index_from_diff() can add them back to the
> index as I-T-A entries.
>
> Making I-T-A entries invisible in the index for the purpose of
> running that diff would mean that they appear as removed, but it is
> OK because we'll add them back as I-T-A entries anyway, so it all
> evens out, I think.
>
> > diff --git a/builtin/reset.c b/builtin/reset.c
> > index 26ef9a7bd0..47a088f4b7 100644
> > --- a/builtin/reset.c
> > +++ b/builtin/reset.c
> > @@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec,
> >       opt.format_callback_data = &intent_to_add;
> >       opt.flags.override_submodule_config = 1;
> >       opt.repo = the_repository;
> > +     opt.ita_invisible_in_index = 1;
> >
> >       if (do_diff_cache(tree_oid, &opt))
> >               return 1;
> > diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
> > index 97be0d968d..9f3854e8f0 100755
> > --- a/t/t7102-reset.sh
> > +++ b/t/t7102-reset.sh
> > @@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' '
> >       test_must_be_empty actual
> >  '
> >
> > +test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' '
> > +     echo "nonempty" >nonempty &&
> > +     >empty &&
> > +     git add nonempty empty &&
> > +     git commit -m "create files to be deleted" &&
> > +     git rm --cached nonempty empty &&
> > +     git add -N nonempty empty &&
> > +     git reset HEAD nonempty empty &&
> > +     git diff --staged --exit-code
>
> We are not testing if the "diff --staged" synonym (that is not even
> in "git diff --help") behaves identically to "diff --cached" here
> (if we wanted to do such a test, we should do so somewhere in t4xxx
> series, not here), so let's spell it in the canonical form, like so:
>
>         git diff --cached --exit-code HEAD
>
> At this point, we have three paths (empty, nonempty and secondfile)
> in the tree of the HEAD, and we just resetted the entries for the
> first two paths in the index to match.  The 'secondfile' added, in
> previous tests, is still there unchanged, and is not shown in the
> diff output.  The 'new-file', added as I-T-A in previous tests, is
> also still there unchanged, and is not shown in the diff output.
>

To guard against changes to the test cases in the future, would it be
better if I write something like the following instead?
    git diff --cached --exit-code HEAD nonempty empty

> How does the internal do_diff_cache() behave differently before and
> after this patch on 'empty' and 'nonempty'?  How does the difference
> affect the final outcome of "git reset" operation?
>
>  - without ita-is-invisible bit set, we end up comparing the mode
>    and blob object name; for 'nonempty', HEAD records a blob object
>    name for the non-empty content, but the index has an empty blob
>    object name (with I-T-A bit set, but that does not participate in
>    the diff operation at the level of diff-lib.c::do_oneway_diff())
>    and are deemed "modified".  Even though we should say "deleted",
>    the end result turns out to be the same---we restore from the
>    tree of the HEAD.
>
>  - however, for 'empty', we mistakenly end up saying "both are empty
>    blobs, so no difference; nothing to be done", leaving the i-t-a
>    entry in the index.
>
>  - with ita-is-invisible bit set, both 'nonempty' and 'empty' are
>    immediately marked as "deleted" in do_oneway_diff().  This causes
>    both paths to be resurrected from the tree of the HEAD the same
>    way.
>
> With the above kind of analysis, a reader can fill in the leap in
> the explanation between the second and the third sentence in the
> proposed log message.  But we shouldn't force readers to make that
> effort to understand how the patch was meant to improve things.
>

Thank you for the detailed explanation, this really helped me
understand the internals of do_diff_cache(). While adjusting my commit
message, I realized that my change actually breaks `git reset HEAD`
for ita files (i.e. after `git add -N nonempty` and `git reset HEAD
nonempty`, the file is still marked as intent-to-add). So, instead of
setting the flag ita_invisible_in_index to 1 before calling
do_diff_cache(), we want to handle a specific edge case deep inside
the diff machinery. It looks like I fixed another bug in the process,
so I will write a test case for that and then send out v3.

> Thanks.
>
> > +'
> > +
> >  test_done
Junio C Hamano July 29, 2019, 4:07 p.m. UTC | #3
Varun Naik <vcnaik94@gmail.com> writes:

> To guard against changes to the test cases in the future, would it be
> better if I write something like the following instead?
>     git diff --cached --exit-code HEAD nonempty empty

Hmph, that is probably a good idea, as it matches the kind of
"reset" we just did, which is what we are testing.
diff mbox series

Patch

diff --git a/builtin/reset.c b/builtin/reset.c
index 26ef9a7bd0..47a088f4b7 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -163,6 +163,7 @@  static int read_from_tree(const struct pathspec *pathspec,
 	opt.format_callback_data = &intent_to_add;
 	opt.flags.override_submodule_config = 1;
 	opt.repo = the_repository;
+	opt.ita_invisible_in_index = 1;
 
 	if (do_diff_cache(tree_oid, &opt))
 		return 1;
diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh
index 97be0d968d..9f3854e8f0 100755
--- a/t/t7102-reset.sh
+++ b/t/t7102-reset.sh
@@ -566,4 +566,15 @@  test_expect_success 'reset --mixed sets up work tree' '
 	test_must_be_empty actual
 '
 
+test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' '
+	echo "nonempty" >nonempty &&
+	>empty &&
+	git add nonempty empty &&
+	git commit -m "create files to be deleted" &&
+	git rm --cached nonempty empty &&
+	git add -N nonempty empty &&
+	git reset HEAD nonempty empty &&
+	git diff --staged --exit-code
+'
+
 test_done