diff mbox series

[v4,3/3] ls-files: add --deduplicate option

Message ID 0c7830d07db0aa1ec055b97de52bd873d05e3ab1.1610856136.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Series builtin/ls-files.c:add git ls-file --dedup option

Commit Message

ZheNing Hu Jan. 17, 2021, 4:02 a.m. UTC
From: ZheNing Hu <adlternative@gmail.com>

To give users a better experience when viewing information about
files in the index and the working tree, the `--deduplicate` option
suppresses duplicate names under certain conditions.

During a merge conflict, one file name may appear multiple times in
the output of "git ls-files". For example, if there is an unmerged
path `a.c`, then `a.c` will appear three times in the output of
"git ls-files". We can use "git ls-files --deduplicate" to output
`a.c` only once (unless `--stage` or `--unmerged` is used to view
all the detailed information in the index).

In addition, if you use both `--deleted` and `--modified` at the
same time, the `--deduplicate` option also suppresses the duplicate
file names in the output.

Additional notes:
`--deduplicate` suppresses the output of duplicate file names, not
the output of duplicate entry information, so that per-entry details
can still be displayed. Consequently, `--deduplicate` has no effect
under `-t`, `--stage`, or `--unmerged`.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 Documentation/git-ls-files.txt |  5 +++
 builtin/ls-files.c             | 23 +++++++++++++-
 t/t3012-ls-files-dedup.sh      | 57 ++++++++++++++++++++++++++++++++++
 3 files changed, 84 insertions(+), 1 deletion(-)
 create mode 100755 t/t3012-ls-files-dedup.sh
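
As a rough illustration of the behavior described in the commit message
(this is not the code in the patch), the following C sketch models which
names "git ls-files -d -m" would print with and without `--deduplicate`
when `-t` is not given. The `struct entry` type and the `list_names`
function are hypothetical names invented for this example; the real
implementation walks `repo->index->cache` in builtin/ls-files.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one index entry plus its worktree state. */
struct entry {
	const char *name; /* path; entries are assumed sorted by name */
	int missing;      /* file is gone from the working tree */
	int changed;      /* file differs from the index */
};

/*
 * Model of "ls-files -d -m [--deduplicate]" without -t: store the
 * names that would be printed in out[] and return how many there are.
 */
size_t list_names(const struct entry *e, size_t n, int dedup,
		  const char **out, size_t cap)
{
	const char *last = NULL; /* name on the most recent output line */
	size_t i, count = 0;

	assert(e && out);
	for (i = 0; i < n; i++) {
		/* a missing file qualifies as both deleted and modified */
		int times = e[i].missing ? 2 : (e[i].changed ? 1 : 0);

		if (dedup && last && !strcmp(last, e[i].name))
			continue; /* same name as the previous output line */
		if (dedup && times > 1)
			times = 1;
		while (times-- > 0 && count < cap) {
			out[count++] = e[i].name;
			last = e[i].name;
		}
	}
	return count;
}
```

With the index state used by the test in this patch (three unmerged
entries for a.txt, an unchanged b.txt, and a removed delete.txt), the
sketch yields five names without deduplication and two (a.txt and
delete.txt) with it.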

Comments

Junio C Hamano Jan. 17, 2021, 6:25 a.m. UTC | #1
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -		}else if (show_modified && ie_modified(repo->index, ce, &st, 0))
> +			}
> +		} else if (show_modified && ie_modified(repo->index, ce, &st, 0))

The preimage shows a style violation "}else if" introduced by an
earlier step in the series, and this fixes it.

Please make sure to proofread your patches before you show them to
others, so that you can pretend to be a perfect coder who does not
need "oops, what I did earlier in the series was wrong and here is a
fix-up".

Thanks.
Junio C Hamano Jan. 17, 2021, 11:34 p.m. UTC | #2
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
> new file mode 100755
> index 00000000000..75877255c2c
> --- /dev/null
> +++ b/t/t3012-ls-files-dedup.sh
> @@ -0,0 +1,57 @@
> +#!/bin/sh
> +
> +test_description='git ls-files --deduplicate test'
> +
> +. ./test-lib.sh

We should already have an ls-files test, so we can add a handful of
new tests to it instead of dedicating a whole new test script.

Also, don't do everything in a single 'setup'.  There are various
scenarios in which you want to make sure ls-files works (grep for
ls-files in what you added below---I count 4 of them), and when a
future developer touches the code, he or she may break one but not
the other three.  The purpose of writing tests is to protect your new
feature from such a developer *AND* to help such a developer debug
and fix his or her changes.  For that, it would be a lot more
sensible to have one common set-up, and then four separate tests.

> +test_expect_success 'setup' '
> +	>a.txt &&
> +	>b.txt &&
> +	>delete.txt &&
> +	git add a.txt b.txt delete.txt &&
> +	git commit -m master:1 &&

Needless use of the word "master".  Observe what is going on in the
project around you and avoid stepping on other people's toes.  One of
the ongoing efforts is to grep for the word "master" in the t/
directory and examine what happens when the default initial branch
name becomes something other than 'master', so adding a needless hit
like this is most unwelcome.

> +	echo a >a.txt &&
> +	echo b >b.txt &&
> +	echo delete >delete.txt &&
> +	git add a.txt b.txt delete.txt &&
> +	git commit -m master:2 &&

> +	git checkout HEAD~ &&
> +	git switch -c dev &&

Needless mixture of checkout/switch.  If you switch branches using
"git checkout", for example, do so consistently, i.e.

	git checkout -b dev HEAD~1

These new tests are not meant to test checkout and switch; your
mission here is to protect the "ls-files --dedup" feature.

> +	test_when_finished "git switch master" &&
> +	echo change >a.txt &&
> +	git add a.txt &&
> +	git commit -m dev:1 &&

I'd consider all of the above to be 'setup' that is common for
subsequent tests.  It may make sense to actually do everything
on the initial branch, i.e. after creating two commits, do

	git tag tip &&
	git reset --hard HEAD^ &&
	echo change >a.txt &&
	git commit -a -m side &&
	git tag side

You are always on the initial branch without ever switching, so
there is no need for the when_finished stuff.

Then the first of your tests is to show the index with conflicts.

> +	test_must_fail git merge master &&

This will become "git merge tip" instead of 'master'.

> +	git ls-files --deduplicate >actual &&
> +	cat >expect <<-\EOF &&
> +	a.txt
> +	b.txt
> +	delete.txt
> +	EOF
> +	test_cmp expect actual &&

And up to this point is the first test after 'setup'.

The next test should begin with:

	git reset --hard side &&
	test_must_fail git merge tip &&

so that even when the first test is skipped, or left unmerged,
you'll begin with a known state.

> +	rm delete.txt &&
> +	git ls-files -d -m --deduplicate >actual &&
> +	cat >expect <<-\EOF &&
> +	a.txt
> +	delete.txt
> +	EOF
> +	test_cmp expect actual &&
> +	git ls-files -d -m -t  --deduplicate >actual &&
> +	cat >expect <<-\EOF &&
> +	C a.txt
> +	C a.txt
> +	C a.txt
> +	R delete.txt
> +	C delete.txt
> +	EOF
> +	test_cmp expect actual &&
> +	git ls-files -d -m -c  --deduplicate >actual &&
> +	cat >expect <<-\EOF &&
> +	a.txt
> +	b.txt
> +	delete.txt
> +	EOF
> +	test_cmp expect actual &&

These three can be kept in the same test_expect_success, as they
exercise read-only operations on the same state but with different
display options.

But in this case, the preparation is not too tedious (just a failed
merge plus a deletion), so you probably would prefer to split it
into 3 independent tests---that may make it more helpful to future
developers.

> +	git merge --abort
> +'
> +test_done
ZheNing Hu Jan. 18, 2021, 4:09 a.m. UTC | #3
Junio C Hamano <gitster@pobox.com> wrote on Mon, Jan 18, 2021 at 7:34 AM:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
> > new file mode 100755
> > index 00000000000..75877255c2c
> > --- /dev/null
> > +++ b/t/t3012-ls-files-dedup.sh
> > @@ -0,0 +1,57 @@
> > +#!/bin/sh
> > +
> > +test_description='git ls-files --deduplicate test'
> > +
> > +. ./test-lib.sh
>
> We should already have an ls-files test, so we can add a handful of
> new tests to it instead of dedicating a whole new test script.
>
Fine. For the time being it might be easier for me to write a test
file of my own, but I will learn over time.
> Also, don't do everything in a single 'setup'.  There are various
> scenarios in which you want to make sure ls-files works (grep for
> ls-files in what you added below---I count 4 of them), and when a
> future developer touches the code, he or she may break one but not
> the other three.  The purpose of writing tests is to protect your new
> feature from such a developer *AND* to help such a developer debug
> and fix his or her changes.  For that, it would be a lot more
> sensible to have one common set-up, and then four separate tests.
>
> > +test_expect_success 'setup' '
> > +     >a.txt &&
> > +     >b.txt &&
> > +     >delete.txt &&
> > +     git add a.txt b.txt delete.txt &&
> > +     git commit -m master:1 &&
>
> Needless use of the word "master".  Observe what is going on in the
> project around you and avoid stepping on other people's toes.  One of
> the ongoing efforts is to grep for the word "master" in the t/
> directory and examine what happens when the default initial branch
> name becomes something other than 'master', so adding a needless hit
> like this is most unwelcome.
>
Well, I will try my best to avoid using "master".
> > +     echo a >a.txt &&
> > +     echo b >b.txt &&
> > +     echo delete >delete.txt &&
> > +     git add a.txt b.txt delete.txt &&
> > +     git commit -m master:2 &&
>
> > +     git checkout HEAD~ &&
> > +     git switch -c dev &&
>
> Needless mixture of checkout/switch.  If you switch branches using
> "git checkout", for example, do so consistently, i.e.
>
>         git checkout -b dev HEAD~1
>
> These new tests are not meant to test checkout and switch; your
> mission here is to protect the "ls-files --dedup" feature.
>
> > +     test_when_finished "git switch master" &&
> > +     echo change >a.txt &&
> > +     git add a.txt &&
> > +     git commit -m dev:1 &&
>
> I'd consider all of the above to be 'setup' that is common for
> subsequent tests.  It may make sense to actually do everything
> on the initial branch, i.e. after creating two commits, do
>
I understand now: the setup serves as a basis for the other tests.
>         git tag tip &&
>         git reset --hard HEAD^ &&
>         echo change >a.txt &&
>         git commit -a -m side &&
>         git tag side
>
> You are always on the initial branch without ever switching, so
> there is no need for the when_finished stuff.
>
> Then the first of your tests is to show the index with conflicts.
>
> > +     test_must_fail git merge master &&
>
> This will become "git merge tip" instead of 'master'.
>
Use a tag instead of a branch name...
> > +     git ls-files --deduplicate >actual &&
> > +     cat >expect <<-\EOF &&
> > +     a.txt
> > +     b.txt
> > +     delete.txt
> > +     EOF
> > +     test_cmp expect actual &&
>
> And up to this point is the first test after 'setup'.
>
> The next test should begin with:
>
>         git reset --hard side &&
>         test_must_fail git merge tip &&
>
> so that even when the first test is skipped, or left unmerged,
> you'll begin with a known state.
>
Well, I understand now that other programmers should be able to skip
any test_expect_success, so we should reset to a known state at the
beginning of each test.
> > +     rm delete.txt &&
> > +     git ls-files -d -m --deduplicate >actual &&
> > +     cat >expect <<-\EOF &&
> > +     a.txt
> > +     delete.txt
> > +     EOF
> > +     test_cmp expect actual &&
> > +     git ls-files -d -m -t  --deduplicate >actual &&
> > +     cat >expect <<-\EOF &&
> > +     C a.txt
> > +     C a.txt
> > +     C a.txt
> > +     R delete.txt
> > +     C delete.txt
> > +     EOF
> > +     test_cmp expect actual &&
> > +     git ls-files -d -m -c  --deduplicate >actual &&
> > +     cat >expect <<-\EOF &&
> > +     a.txt
> > +     b.txt
> > +     delete.txt
> > +     EOF
> > +     test_cmp expect actual &&
>
> These three can be kept in the same test_expect_success, as they
> exercise read-only operations on the same state but with different
> display options.
>
Indeed so.
> But in this case, the preparation is not too tedious (just a failed
> merge plus a deletion), so you probably would prefer to split it
> into 3 independent tests---that may make it more helpful to future
> developers.
>
Thanks:)
> > +     git merge --abort
> > +'
> > +test_done
ZheNing Hu Jan. 18, 2021, 6:05 a.m. UTC | #4
Hi, Junio!
I think the role of "--deduplicate" is to suppress duplicate
filenames rather than duplicate entries. Do you think I should
modify this sentence?

> > OPT_BOOL(0,"deduplicate",&skipping_duplicates,N_("suppress duplicate entries")),

Junio C Hamano Jan. 18, 2021, 9:31 p.m. UTC | #5
胡哲宁 <adlternative@gmail.com> writes:

> I think the role of "--deduplicate" is to suppress duplicate
> filenames rather than duplicate entries. Do you think I should
> modify this sentence?
>
>> > OPT_BOOL(0,"deduplicate",&skipping_duplicates,N_("suppress duplicate entries")),

I see no strong need to.  One set of output entries from "ls-files"
may say

    $ git ls-files -u
    100644 536e55524db72bd2acf175208aef4f3dfc148d41 1	COPYING
    100644 536e55524db72bd2acf175208aef4f3dfc148d43 3	COPYING

and these three "entries" are not duplicates.  Another set of output
entries may say

    $ git ls-files COPYING
    COPYING
    COPYING
    COPYING

and these output entries are duplicates.  If you deduplicate the
latter but not the former, then "suppress duplicate entries" is
exactly what you are doing, I would think.

And if you are asked to show entries that would look like this in a
not-deduplicated form:

    $ git ls-files -u
    100644 536e55524db72bd2acf175208aef4f3dfc148d41 1	COPYING
    100644 536e55524db72bd2acf175208aef4f3dfc148d41 1	COPYING
    100644 536e55524db72bd2acf175208aef4f3dfc148d43 3	COPYING

"suppressing duplicates" would give us the first entry and drop the
second entry that is identical to the second entry, I would think
[*1*].

So "duplicate entries" would probably be more correct description of
what we want to happen than "duplicate filenames".
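
To make the distinction above concrete, here is a minimal C sketch of
"suppress duplicate entries" in the sense argued for in this message:
an output line is dropped only when it is identical to the line kept
just before it. The function name `suppress_duplicate_entries` is
invented for this example, and this is not the patch's implementation,
which compares `ce->name` and turns deduplication off under `-t`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Drop every output line that is identical to the line kept just
 * before it; compacts lines[] in place and returns the new count.
 */
size_t suppress_duplicate_entries(const char **lines, size_t n)
{
	size_t i, kept = 0;

	assert(lines || !n);
	for (i = 0; i < n; i++) {
		if (kept && !strcmp(lines[kept - 1], lines[i]))
			continue; /* same text as the previous kept line */
		lines[kept++] = lines[i];
	}
	return kept;
}
```

Fed three name-only "COPYING" lines, the sketch keeps one. Fed
"ls-files -u" style lines, it drops only a line that repeats the
previous one in full, so entries that differ in object name or stage
all survive.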


[Footnote]

*1* Multiple "common ancestor" versions at stage #1 for the same
    path is not an error.  That is how "merge-resolve" expresses
    criss-cross merge where multiple merge-bases exist.

    Multiple "their" versions at stage #3 for the same path is not
    an error, and "merge-octopus" should use it to express contents
    from histories being merged into ours, but the implementation of
    the octopus strategy does not use this feature of the index.

    Multiple "our" versions at stage #2 by definition should not
    happen ;-)
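
The rules in the footnote can be summarized in a small C sketch. It
takes the stage numbers of the unmerged entries recorded for one path
and reports whether the combination is one the footnote allows. The
enum names and the function `stages_are_plausible` are hypothetical,
and rejecting stage 0 is this sketch's own choice, made because the
function is meant for unmerged entries only.

```c
#include <assert.h>
#include <stddef.h>

/* Stage numbers as used in the footnote; the names are illustrative. */
enum {
	STAGE_MERGED = 0, /* not an unmerged entry */
	STAGE_BASE = 1,   /* "common ancestor" version */
	STAGE_OURS = 2,   /* "our" version */
	STAGE_THEIRS = 3  /* "their" version */
};

/*
 * Return 1 if the stages of one path's unmerged entries are plausible:
 * any number of stage #1 and stage #3 entries, at most one stage #2.
 */
int stages_are_plausible(const int *stages, size_t n)
{
	size_t i;
	int ours = 0;

	assert(stages || !n);
	for (i = 0; i < n; i++) {
		if (stages[i] < STAGE_BASE || stages[i] > STAGE_THEIRS)
			return 0; /* stage 0 or out of range: not unmerged */
		if (stages[i] == STAGE_OURS)
			ours++;
	}
	return ours <= 1;
}
```

A criss-cross merge recorded as stages 1, 1, 2, 3 passes this check,
while 1, 2, 2, 3 does not.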
ZheNing Hu Jan. 19, 2021, 2:56 a.m. UTC | #6
Thank you very much for your answer; I learned a lot from it. I will
use the description "suppress duplicate entries".


Patch

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index cbcf5263dd0..d11c8ade402 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -13,6 +13,7 @@  SYNOPSIS
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
 		(-[c|d|o|i|s|u|k|m])*
 		[--eol]
+		[--deduplicate]
 		[-x <pattern>|--exclude=<pattern>]
 		[-X <file>|--exclude-from=<file>]
 		[--exclude-per-directory=<file>]
@@ -81,6 +82,10 @@  OPTIONS
 	\0 line termination on output and do not quote filenames.
 	See OUTPUT below for more information.
 
+--deduplicate::
+	Suppress duplicate entries when there are unmerged paths in index
+	or `--deleted` and `--modified` are combined.
+
 -x <pattern>::
 --exclude=<pattern>::
 	Skip untracked files matching pattern.
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 49c242128d7..390d7ef6b44 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -35,6 +35,7 @@  static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
 static int recurse_submodules;
+static int skipping_duplicates;
 
 static const char *prefix;
 static int max_prefix_len;
@@ -301,6 +302,7 @@  static void show_files(struct repository *repo, struct dir_struct *dir)
 {
 	int i;
 	struct strbuf fullname = STRBUF_INIT;
+	const struct cache_entry *last_shown_ce;
 
 	/* For cached/deleted files we don't need to even do the readdir */
 	if (show_others || show_killed) {
@@ -314,6 +316,7 @@  static void show_files(struct repository *repo, struct dir_struct *dir)
 	}
 	if (! (show_cached || show_stage || show_deleted || show_modified))
 		return;
+	last_shown_ce = NULL;
 	for (i = 0; i < repo->index->cache_nr; i++) {
 		const struct cache_entry *ce = repo->index->cache[i];
 		struct stat st;
@@ -321,28 +324,43 @@  static void show_files(struct repository *repo, struct dir_struct *dir)
 
 		construct_fullname(&fullname, repo, ce);
 
+		if (skipping_duplicates && last_shown_ce &&
+			!strcmp(last_shown_ce->name,ce->name))
+				continue;
 		if ((dir->flags & DIR_SHOW_IGNORED) &&
 			!ce_excluded(dir, repo->index, fullname.buf, ce))
 			continue;
 		if (ce->ce_flags & CE_UPDATE)
 			continue;
 		if (show_cached || show_stage) {
+			if (show_cached && skipping_duplicates && last_shown_ce &&
+				!strcmp(last_shown_ce->name,ce->name))
+					continue;
 			if (!show_unmerged || ce_stage(ce))
 				show_ce(repo, dir, ce, fullname.buf,
 					ce_stage(ce) ? tag_unmerged :
 					(ce_skip_worktree(ce) ? tag_skip_worktree :
 						tag_cached));
+			if(show_cached && skipping_duplicates)
+				last_shown_ce = ce;
 		}
 		if (ce_skip_worktree(ce))
 			continue;
+		if (skipping_duplicates && last_shown_ce && !strcmp(last_shown_ce->name,ce->name))
+			continue;
 		err = lstat(fullname.buf, &st);
 		if (err) {
+			if (skipping_duplicates && show_deleted && show_modified)
+				show_ce(repo, dir, ce, fullname.buf, tag_removed);
+			else {
 				if (show_deleted)
 					show_ce(repo, dir, ce, fullname.buf, tag_removed);
 				if (show_modified)
 					show_ce(repo, dir, ce, fullname.buf, tag_modified);
-		}else if (show_modified && ie_modified(repo->index, ce, &st, 0))
+			}
+		} else if (show_modified && ie_modified(repo->index, ce, &st, 0))
 			show_ce(repo, dir, ce, fullname.buf, tag_modified);
+		last_shown_ce = ce;
 	}
 
 	strbuf_release(&fullname);
@@ -569,6 +587,7 @@  int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			N_("pretend that paths removed since <tree-ish> are still present")),
 		OPT__ABBREV(&abbrev),
 		OPT_BOOL(0, "debug", &debug_mode, N_("show debugging data")),
+		OPT_BOOL(0,"deduplicate",&skipping_duplicates,N_("suppress duplicate entries")),
 		OPT_END()
 	};
 
@@ -600,6 +619,8 @@  int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		tag_skip_worktree = "S ";
 		tag_resolve_undo = "U ";
 	}
+	if (show_tag && skipping_duplicates)
+		skipping_duplicates = 0;
 	if (show_modified || show_others || show_deleted || (dir.flags & DIR_SHOW_IGNORED) || show_killed)
 		require_work_tree = 1;
 	if (show_unmerged)
diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
new file mode 100755
index 00000000000..75877255c2c
--- /dev/null
+++ b/t/t3012-ls-files-dedup.sh
@@ -0,0 +1,57 @@ 
+#!/bin/sh
+
+test_description='git ls-files --deduplicate test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	>a.txt &&
+	>b.txt &&
+	>delete.txt &&
+	git add a.txt b.txt delete.txt &&
+	git commit -m master:1 &&
+	echo a >a.txt &&
+	echo b >b.txt &&
+	echo delete >delete.txt &&
+	git add a.txt b.txt delete.txt &&
+	git commit -m master:2 &&
+	git checkout HEAD~ &&
+	git switch -c dev &&
+	test_when_finished "git switch master" &&
+	echo change >a.txt &&
+	git add a.txt &&
+	git commit -m dev:1 &&
+	test_must_fail git merge master &&
+	git ls-files --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	a.txt
+	b.txt
+	delete.txt
+	EOF
+	test_cmp expect actual &&
+	rm delete.txt &&
+	git ls-files -d -m --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	a.txt
+	delete.txt
+	EOF
+	test_cmp expect actual &&
+	git ls-files -d -m -t  --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	C a.txt
+	C a.txt
+	C a.txt
+	R delete.txt
+	C delete.txt
+	EOF
+	test_cmp expect actual &&
+	git ls-files -d -m -c  --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	a.txt
+	b.txt
+	delete.txt
+	EOF
+	test_cmp expect actual &&
+	git merge --abort
+'
+test_done