Message ID | 20190726044806.2216-1-vcnaik94@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] reset: unstage empty deleted ita files | expand |
Varun Naik <vcnaik94@gmail.com> writes: > It is possible to delete a committed file from the index and then add it > as intent-to-add. After `git reset HEAD`, the file should be identical > in the index and HEAD. This patch provides the desired behavior even > when the file is empty in the index. The first and the second sentence describes what you want to achieve concisely and sensibly. There is a vast leap between them and the last sentence. What's missing is: - What goes wrong without this one-liner fix and how does the command make a wrong decision to leave the path 'empty' (in the new test) different from that of the tree of the HEAD? - How does the change help the machinery to make a right decision instead? I had to briefly wonder if this change interacts with "reset -N" in any way. In that mode, we want to make sure that diff sees the entries that are missing from the index that exist in the tree of the HEAD, so that update_index_from_diff() can add them back to the index as I-T-A entries. Making I-T-A entries invisible in the index for the purpose of running that diff would mean that they appear as removed, but it is OK because we'll add them back as I-T-A entries anyway, so it all evens out, I think. > diff --git a/builtin/reset.c b/builtin/reset.c > index 26ef9a7bd0..47a088f4b7 100644 > --- a/builtin/reset.c > +++ b/builtin/reset.c > @@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec, > opt.format_callback_data = &intent_to_add; > opt.flags.override_submodule_config = 1; > opt.repo = the_repository; > + opt.ita_invisible_in_index = 1; > > if (do_diff_cache(tree_oid, &opt)) > return 1; > diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh > index 97be0d968d..9f3854e8f0 100755 > --- a/t/t7102-reset.sh > +++ b/t/t7102-reset.sh > @@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' ' > test_must_be_empty actual > ' > > +test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' ' > + echo "nonempty" >nonempty && > + >empty && > + git add nonempty empty && > + git commit -m "create files to be deleted" && > + git rm --cached nonempty empty && > + git add -N nonempty empty && > + git reset HEAD nonempty empty && > + git diff --staged --exit-code We are not testing if the "diff --staged" synonym (that is not even in "git diff --help") behaves identically to "diff --cached" here (if we wanted to do such a test, we should do so somewhere in t4xxx series, not here), so let's spell it in the canonical form, like so: git diff --cached --exit-code HEAD At this point, we have three paths (empty, nonempty and secondfile) in the tree of the HEAD, and we just resetted the entries for the first two paths in the index to match. The 'secondfile' added, in previous tests, is still there unchanged, and is not shown in the diff output. The 'new-file', added as I-T-A in previous tests, is also still there unchanged, and is not shown in the diff output. How does the internal do_diff_cache() behave differently before and after this patch on 'empty' and 'nonempty'? How does the difference affect the final outcome of "git reset" operation? - without ita-is-invisible bit set, we end up comparing the mode and blob object name; for 'nonempty', HEAD records a blob object name for the non-empty content, but the index has an empty blob object name (with I-T-A bit set, but that does not participate in the diff operation at the level of diff-lib.c::do_oneway_diff()) and are deemed "modified". Even though we should say "deleted", the end result turns out to be the same---we restore from the tree of the HEAD. - however, for 'empty', we mistakenly end up saying "both are empty blobs, so no difference; nothing to be done", leaving the i-t-a entry in the index. - with ita-is-invisible bit set, both 'nonempty' and 'empty' are immediately marked as "deleted" in do_oneway_diff(). This causes both paths to be resurrected from the tree of the HEAD the same way. With the above kind of analysis, a reader can fill in the leap in the explanation between the second and the third sentence in the proposed log message. But we shouldn't force readers to make that effort to understand how the patch was meant to improve things. Thanks. > +' > + > test_done
On Fri, Jul 26, 2019 at 11:20 AM Junio C Hamano <gitster@pobox.com> wrote: > > Varun Naik <vcnaik94@gmail.com> writes: > > > It is possible to delete a committed file from the index and then add it > > as intent-to-add. After `git reset HEAD`, the file should be identical > > in the index and HEAD. This patch provides the desired behavior even > > when the file is empty in the index. > > The first and the second sentence describes what you want to achieve > concisely and sensibly. There is a vast leap between them and the > last sentence. What's missing is: > > - What goes wrong without this one-liner fix and how does the > command make a wrong decision to leave the path 'empty' (in the > new test) different from that of the tree of the HEAD? > > - How does the change help the machinery to make a right decision > instead? > > I had to briefly wonder if this change interacts with "reset -N" in > any way. In that mode, we want to make sure that diff sees the > entries that are missing from the index that exist in the tree of > the HEAD, so that update_index_from_diff() can add them back to the > index as I-T-A entries. > > Making I-T-A entries invisible in the index for the purpose of > running that diff would mean that they appear as removed, but it is > OK because we'll add them back as I-T-A entries anyway, so it all > evens out, I think. > > > diff --git a/builtin/reset.c b/builtin/reset.c > > index 26ef9a7bd0..47a088f4b7 100644 > > --- a/builtin/reset.c > > +++ b/builtin/reset.c > > @@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec, > > opt.format_callback_data = &intent_to_add; > > opt.flags.override_submodule_config = 1; > > opt.repo = the_repository; > > + opt.ita_invisible_in_index = 1; > > > > if (do_diff_cache(tree_oid, &opt)) > > return 1; > > diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh > > index 97be0d968d..9f3854e8f0 100755 > > --- a/t/t7102-reset.sh > > +++ b/t/t7102-reset.sh > > @@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' ' > > test_must_be_empty actual > > ' > > > > +test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' ' > > + echo "nonempty" >nonempty && > > + >empty && > > + git add nonempty empty && > > + git commit -m "create files to be deleted" && > > + git rm --cached nonempty empty && > > + git add -N nonempty empty && > > + git reset HEAD nonempty empty && > > + git diff --staged --exit-code > > We are not testing if the "diff --staged" synonym (that is not even > in "git diff --help") behaves identically to "diff --cached" here > (if we wanted to do such a test, we should do so somewhere in t4xxx > series, not here), so let's spell it in the canonical form, like so: > > git diff --cached --exit-code HEAD > > At this point, we have three paths (empty, nonempty and secondfile) > in the tree of the HEAD, and we just resetted the entries for the > first two paths in the index to match. The 'secondfile' added, in > previous tests, is still there unchanged, and is not shown in the > diff output. The 'new-file', added as I-T-A in previous tests, is > also still there unchanged, and is not shown in the diff output. > To guard against changes to the test cases in the future, would it be better if I write something like the following instead? git diff --cached --exit-code HEAD nonempty empty > How does the internal do_diff_cache() behave differently before and > after this patch on 'empty' and 'nonempty'? How does the difference > affect the final outcome of "git reset" operation? > > - without ita-is-invisible bit set, we end up comparing the mode > and blob object name; for 'nonempty', HEAD records a blob object > name for the non-empty content, but the index has an empty blob > object name (with I-T-A bit set, but that does not participate in > the diff operation at the level of diff-lib.c::do_oneway_diff()) > and are deemed "modified". Even though we should say "deleted", > the end result turns out to be the same---we restore from the > tree of the HEAD. > > - however, for 'empty', we mistakenly end up saying "both are empty > blobs, so no difference; nothing to be done", leaving the i-t-a > entry in the index. > > - with ita-is-invisible bit set, both 'nonempty' and 'empty' are > immediately marked as "deleted" in do_oneway_diff(). This causes > both paths to be resurrected from the tree of the HEAD the same > way. > > With the above kind of analysis, a reader can fill in the leap in > the explanation between the second and the third sentence in the > proposed log message. But we shouldn't force readers to make that > effort to understand how the patch was meant to improve things. > Thank you for the detailed explanation, this really helped me understand the internals of do_diff_cache(). While adjusting my commit message, I realized that my change actually breaks `git reset HEAD` for ita files (i.e. after `git add -N nonempty` and `git reset HEAD nonempty`, the file is still marked as intent-to-add). So, instead of setting the flag ita_invisible_in_index to 1 before calling do_diff_cache(), we want to handle a specific edge case deep inside the diff machinery. It looks like I fixed another bug in the process, so I will write a test case for that and then send out v3. > Thanks. > > > +' > > + > > test_done
Varun Naik <vcnaik94@gmail.com> writes: > To guard against changes to the test cases in the future, would it be > better if I write something like the following instead? > git diff --cached --exit-code HEAD nonempty empty Hmph, that is probably a good idea, as it matches the kind of "reset" we just did, which is what we are testing.
diff --git a/builtin/reset.c b/builtin/reset.c index 26ef9a7bd0..47a088f4b7 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -163,6 +163,7 @@ static int read_from_tree(const struct pathspec *pathspec, opt.format_callback_data = &intent_to_add; opt.flags.override_submodule_config = 1; opt.repo = the_repository; + opt.ita_invisible_in_index = 1; if (do_diff_cache(tree_oid, &opt)) return 1; diff --git a/t/t7102-reset.sh b/t/t7102-reset.sh index 97be0d968d..9f3854e8f0 100755 --- a/t/t7102-reset.sh +++ b/t/t7102-reset.sh @@ -566,4 +566,15 @@ test_expect_success 'reset --mixed sets up work tree' ' test_must_be_empty actual ' +test_expect_success 'reset --mixed adds deleted intent-to-add file back to index' ' + echo "nonempty" >nonempty && + >empty && + git add nonempty empty && + git commit -m "create files to be deleted" && + git rm --cached nonempty empty && + git add -N nonempty empty && + git reset HEAD nonempty empty && + git diff --staged --exit-code +' + test_done
It is possible to delete a committed file from the index and then add it as intent-to-add. After `git reset HEAD`, the file should be identical in the index and HEAD. This patch provides the desired behavior even when the file is empty in the index. Signed-off-by: Varun Naik <vcnaik94@gmail.com> --- CC Duy because you seem to be the one who is most involved in changes to ita behavior. The test description was incorrect, so I changed that now. I also figured out how to fix the related problem in "restore"; separate patch coming soon. builtin/reset.c | 1 + t/t7102-reset.sh | 11 +++++++++++ 2 files changed, 12 insertions(+)