Message ID | 20200611161640.52156-1-shrinidhi.kaushik@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | diff-files: treat "i-t-a" files as "not-in-index" | expand |
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes: > The `diff-files' command and related commands which call `cmd_diff_files()', > consider the "intent-to-add" files as a part of the index when comparing the > work-tree against it. This was previously addressed in [1] and [2] by turning > the option `--ita-invisible-in-index' (introduced in [3]) on by default. > > For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as > as new, `ita_invisible_in_index' will be enabled by default here as well. > > [1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default, 2018-05-26) > [2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist in > index", 2016-10-24) > [3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24) Is there any place where we still run the internal diff machinery to compare the index and the working tree without setting the ita_invisible_in_index bit on with this patch applied, and if so, why? Does the justification why that other place needs to leave the bit off apply to this codepath as well? What I am trying to get at is if this is helping only one usecase for "diff-files" while breaking other usecases. On the other hand, if there is no longer anybody who wants ita_invisible_in_index off, perhaps we can get rid of the bit and lose many conditionals.
Thanks for replying! On Thu, Jun 11, 2020 at 01:27:22PM -0700, Junio C Hamano wrote: > Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes: > > > The `diff-files' command and related commands which call `cmd_diff_files()', > > consider the "intent-to-add" files as a part of the index when comparing the > > work-tree against it. This was previously addressed in [1] and [2] by turning > > the option `--ita-invisible-in-index' (introduced in [3]) on by default. > > > > For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as > > as new, `ita_invisible_in_index' will be enabled by default here as well. > > > > [1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default, 2018-05-26) > > [2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist in > > index", 2016-10-24) > > [3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24) > > Is there any place where we still run the internal diff machinery to > compare the index and the working tree without setting the > ita_invisible_in_index bit on with this patch applied, and if so, > why? Does the justification why that other place needs to leave > the bit off apply to this codepath as well? Yes, I believe that there exist some use cases for `ita_invisible_in_index' to be unset. For instance, `index_differs_from' which is used in a quite a few places -- like "commit", "revert", and "rebase" -- which require a "no change" to be returned. This commit: [1] addressed the issue where the cache-tree was producing the same tree as the current commit when it involved "intent-to-add" entries, instead of aborting. [1] 018ec3c820 (commit: fix empty commit creation when there's no changes but ita entries, 2016-10-24) > What I am trying to get at is if this is helping only one usecase > for "diff-files" while breaking other usecases. Currently, `run_add_p' (for "add"; which this patch addresses the fix), and `push_to_deploy' (in "receive-pack"; where this is the intended behavior), call "diff-files" as a subprocess, in which case the `ita_invisible_in_index' bit is explicitly set. For all other cases, calls are made directly to `run_diff_files' and will be unaffected by this change. > On the other hand, if there is no longer anybody who wants > ita_invisible_in_index off, perhaps we can get rid of the bit and > lose many conditionals.
Hello, Is there any update on this patch? Please let me know if I missed anything. Thanks! On 06/12/2020 04:58, Srinidhi Kaushik wrote: > Thanks for replying! > > On Thu, Jun 11, 2020 at 01:27:22PM -0700, Junio C Hamano wrote: > > Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes: > > > > > The `diff-files' command and related commands which call `cmd_diff_files()', > > > consider the "intent-to-add" files as a part of the index when comparing the > > > work-tree against it. This was previously addressed in [1] and [2] by turning > > > the option `--ita-invisible-in-index' (introduced in [3]) on by default. > > > > > > For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as > > > as new, `ita_invisible_in_index' will be enabled by default here as well. > > > > > > [1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default, 2018-05-26) > > > [2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist in > > > index", 2016-10-24) > > > [3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24) > > > > Is there any place where we still run the internal diff machinery to > > compare the index and the working tree without setting the > > ita_invisible_in_index bit on with this patch applied, and if so, > > why? Does the justification why that other place needs to leave > > the bit off apply to this codepath as well? > > Yes, I believe that there exist some use cases for `ita_invisible_in_index' > to be unset. For instance, `index_differs_from' which is used in a quite a > few places -- like "commit", "revert", and "rebase" -- which require a > "no change" to be returned. > > This commit: [1] addressed the issue where the cache-tree was producing > the same tree as the current commit when it involved "intent-to-add" > entries, instead of aborting. > > [1] 018ec3c820 (commit: fix empty commit creation when there's no changes > but ita entries, 2016-10-24) > > > What I am trying to get at is if this is helping only one usecase > > for "diff-files" while breaking other usecases. > > Currently, `run_add_p' (for "add"; which this patch addresses > the fix), and `push_to_deploy' (in "receive-pack"; where this > is the intended behavior), call "diff-files" as a subprocess, > in which case the `ita_invisible_in_index' bit is explicitly > set. For all other cases, calls are made directly > to `run_diff_files' and will be unaffected by this change. > > > On the other hand, if there is no longer anybody who wants > > ita_invisible_in_index off, perhaps we can get rid of the bit and > > lose many conditionals.
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes: > -test_expect_success 'diff-files/diff-cached shows ita as new/not-new files' ' > +test_expect_success 'diff/diff-cached shows ita as new/not-new files' ' > git reset --hard && > echo new >new-ita && > git add -N new-ita && Interesting. I thought that the test originally tested "diff-files" and "diff-index --cached" and a modernization patch forgot to update the title when the test body was changed to use "diff" and "diff --cached", but that is not the case here. When 0231ae71 (diff: turn --ita-invisible-in-index on by default, 2018-05-26) added this test, it gave a wrong title from the beginning. Nice catch. > @@ -243,6 +243,29 @@ test_expect_success 'diff-files/diff-cached shows ita as new/not-new files' ' > test_must_be_empty actual2 > ' > > +test_expect_success 'diff-files shows i-t-a files as new files' ' > + git reset --hard && > + touch empty && Use of "touch" gives a wrong impression that you care about the file timestamp; use something like ": >empty &&" instead when you care about the presence of the file and do not care about its timestamp. > + content="foo" && > + echo $content >not-empty && The quoting decision is backwards in these two lines. It is OK not to quote when the right hand side literal is clearly a single word without $IFS. On the other hand, it is a good practice to always quote when using what is in a "$variable". > + git add -N empty not-empty && > + git diff-files -p >actual && > + hash_e=$(git hash-object empty) && > + hash_n=$(git hash-object not-empty) && > + cat >expect <<-EOF && > + diff --git a/empty b/empty > + new file mode 100644 > + index 0000000..$(git rev-parse --short $hash_e) > + diff --git a/not-empty b/not-empty > + new file mode 100644 > + index 0000000..$(git rev-parse --short $hash_n) > + --- /dev/null > + +++ b/not-empty > + @@ -0,0 +1 @@ > + +$content > + EOF > + test_cmp expect actual > +' OK. Do we want to show what happens when "diff" and "diff --cached" are run with these two "added but not quite added yet" paths to contrast with this new case? Thanks.
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes: > Hello, > Is there any update on this patch? > Please let me know if I missed anything. Sorry, the patch totally fell through the cracks. I just sent out a brief review on the part I didn't review during the first round. Thanks.
Junio C Hamano <gitster@pobox.com> writes: >> + touch empty && > > Use of "touch" gives a wrong impression that you care about the file > timestamp; use something like ": >empty &&" instead when you care > about the presence of the file and do not care about its timestamp. I just realized that this is even more important in this case not to use "touch". The test that uses this file cares not just the presence, but it deeply cares that its contents is empty. The thing it least cares about is its timestamp. The purpose of using "touch" is to update the timestamp, to keep the current contents if it exists, and to ensure it exists (as a side effect), in the decreasing order of importance. Use of the command here misleads the readers. Thanks.
Thank you for reviewing this; I appreciate it! > > + content="foo" && > > + echo $content >not-empty && > > The quoting decision is backwards in these two lines. It is OK not > to quote when the right hand side literal is clearly a single word > without $IFS. On the other hand, it is a good practice to always > quote when using what is in a "$variable". Yes, that doesn't look right, I will make changes in v2. [...] > > > + touch empty && > > > > Use of "touch" gives a wrong impression that you care about the file > > timestamp; use something like ": >empty &&" instead when you care > > about the presence of the file and do not care about its timestamp. > > I just realized that this is even more important in this case not to > use "touch". > > The test that uses this file cares not just the presence, but it > deeply cares that its contents is empty. The thing it least cares > about is its timestamp. > > The purpose of using "touch" is to update the timestamp, to keep the > current contents if it exists, and to ensure it exists (as a side > effect), in the decreasing order of importance. Use of the command > here misleads the readers. Oops, you are right. That makes sense. Will update to ": >empty". [...] > > + git add -N empty not-empty && > > + git diff-files -p >actual && > > + hash_e=$(git hash-object empty) && > > + hash_n=$(git hash-object not-empty) && > > + cat >expect <<-EOF && > > + diff --git a/empty b/empty > > + new file mode 100644 > > + index 0000000..$(git rev-parse --short $hash_e) > > + diff --git a/not-empty b/not-empty > > + new file mode 100644 > > + index 0000000..$(git rev-parse --short $hash_n) > > + --- /dev/null > > + +++ b/not-empty > > + @@ -0,0 +1 @@ > > + +$content > > + EOF > > + test_cmp expect actual > > +' > > OK. Do we want to show what happens when "diff" and "diff --cached" > are run with these two "added but not quite added yet" paths to > contrast with this new case? I'm not sure if we want to repeat an older test. The test (which was renamed in this patch) in t2203-add-intent.sh: "diff/diff-cached shows ita as new/not-new files" is already doing that. Should the "diff" and "diff --cached" steps be appended here again? Thanks.
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes: >> > + git add -N empty not-empty && >> > + git diff-files -p >actual && >> > + hash_e=$(git hash-object empty) && >> > + hash_n=$(git hash-object not-empty) && >> > + cat >expect <<-EOF && >> > + diff --git a/empty b/empty >> > + new file mode 100644 >> > + index 0000000..$(git rev-parse --short $hash_e) >> > + diff --git a/not-empty b/not-empty >> > + new file mode 100644 >> > + index 0000000..$(git rev-parse --short $hash_n) >> > + --- /dev/null >> > + +++ b/not-empty >> > + @@ -0,0 +1 @@ >> > + +$content >> > + EOF >> > + test_cmp expect actual >> > +' >> >> OK. Do we want to show what happens when "diff" and "diff --cached" >> are run with these two "added but not quite added yet" paths to >> contrast with this new case? > > I'm not sure if we want to repeat an older test. The test (which was > renamed in this patch) in t2203-add-intent.sh: "diff/diff-cached shows > ita as new/not-new files" is already doing that. Should the "diff" and > "diff --cached" steps be appended here again? No, there is no need to repeat essentially the same test that exists elsewhere. I wonder if it reduces duplication even further if we extend that existing test that checks "diff" and "diff --cached" so that it also checks "diff-files" as well?, instead of adding this new one? The existing one checks diff and diff-cached only with a new non-empty path, and it can use tests with a new empty path at the same time, with a unified "set up" code that is in the early part of the test, e.g. diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh index 5bbe8dcce4..cfde790ac7 100755 --- a/t/t2203-add-intent.sh +++ b/t/t2203-add-intent.sh @@ -235,7 +235,8 @@ test_expect_success 'double rename detection in status' ' test_expect_success 'diff-files/diff-cached shows ita as new/not-new files' ' git reset --hard && echo new >new-ita && - git add -N new-ita && + : >new-ita-empty && + git add -N new-ita new-ita-empty && git diff --summary >actual && ... Then the existing tests can be updated to see not just --summary but also for the contents like you did in the new test---and another test that examines what "git diff-files" sees (which is what you added) can happen in the same test (that way, the same set-up can be reused for three tests). Thanks.
diff --git a/builtin/diff-files.c b/builtin/diff-files.c index 86ae474fbf..1e352dd8f7 100644 --- a/builtin/diff-files.c +++ b/builtin/diff-files.c @@ -28,6 +28,13 @@ int cmd_diff_files(int argc, const char **argv, const char *prefix) git_config(git_diff_basic_config, NULL); /* no "diff" UI options */ repo_init_revisions(the_repository, &rev, prefix); rev.abbrev = 0; + + /* + * Consider "intent-to-add" files as new by default, unless + * explicitly specified in the command line or anywhere else. + */ + rev.diffopt.ita_invisible_in_index = 1; + precompose_argv(argc, argv); argc = setup_revisions(argc, argv, &rev, NULL); diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh index 5bbe8dcce4..742f27a935 100755 --- a/t/t2203-add-intent.sh +++ b/t/t2203-add-intent.sh @@ -232,7 +232,7 @@ test_expect_success 'double rename detection in status' ' ) ' -test_expect_success 'diff-files/diff-cached shows ita as new/not-new files' ' +test_expect_success 'diff/diff-cached shows ita as new/not-new files' ' git reset --hard && echo new >new-ita && git add -N new-ita && @@ -243,6 +243,29 @@ test_expect_success 'diff-files/diff-cached shows ita as new/not-new files' ' test_must_be_empty actual2 ' +test_expect_success 'diff-files shows i-t-a files as new files' ' + git reset --hard && + touch empty && + content="foo" && + echo $content >not-empty && + git add -N empty not-empty && + git diff-files -p >actual && + hash_e=$(git hash-object empty) && + hash_n=$(git hash-object not-empty) && + cat >expect <<-EOF && + diff --git a/empty b/empty + new file mode 100644 + index 0000000..$(git rev-parse --short $hash_e) + diff --git a/not-empty b/not-empty + new file mode 100644 + index 0000000..$(git rev-parse --short $hash_n) + --- /dev/null + +++ b/not-empty + @@ -0,0 +1 @@ + +$content + EOF + test_cmp expect actual +' test_expect_success '"diff HEAD" includes ita as new files' ' git reset --hard &&
The `diff-files' command and related commands which call `cmd_diff_files()', consider the "intent-to-add" files as a part of the index when comparing the work-tree against it. This was previously addressed in [1] and [2] by turning the option `--ita-invisible-in-index' (introduced in [3]) on by default. For `diff-files' (and `add -p' as a consequence) to show the i-t-a files as as new, `ita_invisible_in_index' will be enabled by default here as well. [1] 0231ae71d3 (diff: turn --ita-invisible-in-index on by default, 2018-05-26) [2] 425a28e0a4 (diff-lib: allow ita entries treated as "not yet exist in index", 2016-10-24) [3] b42b451919 (diff: add --ita-[in]visible-in-index, 2016-10-24) Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> --- Hello! This is my first patch in this project. This issue was mentioned in #leftoverbits on GitHub: [1], and this patch implements the change proposed in [2]. [1] https://github.com/gitgitgadget/git/issues/647 [2] https://lore.kernel.org/git/20200527230357.GB546534@coredump.intra.peff.net builtin/diff-files.c | 7 +++++++ t/t2203-add-intent.sh | 25 ++++++++++++++++++++++++- 2 files changed, 31 insertions(+), 1 deletion(-)