diff mbox series

[RFC/PATCH,3/5] ls-files: add and use a new --sparse option

Message ID 20210317132814.30175-4-avarab@gmail.com (mailing list archive)
State New, archived
Headers show
Series [RFC/PATCH,1/5] ls-files: defer read_index() after parse_options() etc. | expand

Commit Message

Ævar Arnfjörð Bjarmason March 17, 2021, 1:28 p.m. UTC
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/git-ls-files.txt           |  4 ++
 builtin/ls-files.c                       | 10 ++++-
 t/t1091-sparse-checkout-builtin.sh       |  9 ++--
 t/t1092-sparse-checkout-compatibility.sh | 57 ++++++++++++++++--------
 4 files changed, 56 insertions(+), 24 deletions(-)

Comments

Elijah Newren March 17, 2021, 6:19 p.m. UTC | #1
On Wed, Mar 17, 2021 at 6:28 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  Documentation/git-ls-files.txt           |  4 ++
>  builtin/ls-files.c                       | 10 ++++-
>  t/t1091-sparse-checkout-builtin.sh       |  9 ++--
>  t/t1092-sparse-checkout-compatibility.sh | 57 ++++++++++++++++--------
>  4 files changed, 56 insertions(+), 24 deletions(-)
>
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> index 6d11ab506b..1145e960a4 100644
> --- a/Documentation/git-ls-files.txt
> +++ b/Documentation/git-ls-files.txt
> @@ -71,6 +71,10 @@ OPTIONS
>  --unmerged::
>         Show unmerged files in the output (forces --stage)
>
> +--sparse::
> +       Show sparse directories in the output instead of expanding
> +       them (forces --stage)
> +
>  -k::
>  --killed::
>         Show files on the filesystem that need to be removed due
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index 4db75351f2..1ebbb63c10 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -26,6 +26,7 @@ static int show_deleted;
>  static int show_cached;
>  static int show_others;
>  static int show_stage;
> +static int show_sparse;
>  static int show_unmerged;
>  static int show_resolve_undo;
>  static int show_modified;
> @@ -639,6 +640,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>                         DIR_SHOW_IGNORED),
>                 OPT_BOOL('s', "stage", &show_stage,
>                         N_("show staged contents' object name in the output")),
> +               OPT_BOOL(0, "sparse", &show_sparse,
> +                       N_("show unexpanded sparse directories in the output")),
>                 OPT_BOOL('k', "killed", &show_killed,
>                         N_("show files on the filesystem that need to be removed")),
>                 OPT_BIT(0, "directory", &dir.flags,
> @@ -705,12 +708,17 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>                 tag_skip_worktree = "S ";
>                 tag_resolve_undo = "U ";
>         }
> +       if (show_sparse) {
> +               prepare_repo_settings(the_repository);
> +               the_repository->settings.command_requires_full_index = 0;
> +       }
>         if (show_modified || show_others || show_deleted || (dir.flags & DIR_SHOW_IGNORED) || show_killed)
>                 require_work_tree = 1;
> -       if (show_unmerged)
> +       if (show_unmerged || show_sparse)
>                 /*
>                  * There's no point in showing unmerged unless
>                  * you also show the stage information.
> +                * The same goes for the --sparse option.

Yuck, haven't you just made --sparse an alias for --stage?  Why does
it need an alias?

Was the goal just to get a quick way to make the command run under
repo->settings.command_requires_full_index = 0 without auditing the
codepaths?  It seems to rely on them having been audited anyway, since
it just falls back to the code used for --stage, so I don't see how it
helps.  It also suggests the command might do unexpected or weird
things if run without the --sparse option?  If people manually
configure a sparse-checkout and cone mode AND a sparse-index (it's
annoying how they have to specify all three instead of having to just
pass one flag somewhere), then now we also need to force them to
remember to pass extra flags to random various commands for them to
operate in a sane manner in their environment??

I think this is a bad path to go down.

However, if you want to write the necessary tests to make it so that
ls-files can operate with command_requires_full_index = 0, then I
think that's useful.  If you want to add a special flag so that folks
in a sparse-checkout-with-cone-mode-with-sparse-index setup want to
operate densely (i.e. to show what files would be in the index if it
were fully populated), then I think that's useful.  But having
sparse-yes-with-cone-yes-very-sparse folks need to specify an extra
flag to commands to get sparse behavior just seems wrong to me.

>                  */
>                 show_stage = 1;
>         if (show_tag || show_stage)
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index ff1ad570a2..c823df423c 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -208,12 +208,13 @@ test_expect_success 'sparse-checkout disable' '
>  test_expect_success 'sparse-index enabled and disabled' '
>         git -C repo sparse-checkout init --cone --sparse-index &&
>         test_cmp_config -C repo true extensions.sparseIndex &&
> -       test-tool -C repo read-cache --table >cache &&
> -       grep " tree " cache &&
> +       git -C repo ls-files --sparse >cache &&
> +       grep "^040000 " cache >lines &&
> +       test_line_count = 3 lines &&
>
>         git -C repo sparse-checkout disable &&
> -       test-tool -C repo read-cache --table >cache &&
> -       ! grep " tree " cache &&
> +       git -C repo ls-files --sparse >cache &&
> +       ! grep "^040000 " cache &&
>         git -C repo config --list >config &&
>         ! grep extensions.sparseindex config
>  '
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index d97bf9b645..48d3920490 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -136,48 +136,67 @@ test_sparse_match () {
>         test_cmp sparse-checkout-err sparse-index-err
>  }
>
> +test_index_entry_like () {
> +       dir=$1
> +       shift
> +       fmt=$1
> +       shift
> +       rev=$1
> +       shift
> +       entry=$1
> +       shift
> +       file=$1
> +       shift
> +       hash=$(git -C "$dir" rev-parse "$rev") &&
> +       printf "$fmt\n" "$hash" "$entry" >expected &&
> +       if grep "$entry" "$file" >line
> +       then
> +               test_cmp expected line
> +       else
> +               cat cache &&
> +               false
> +       fi
> +}
> +
>  test_expect_success 'sparse-index contents' '
>         init_repos &&
>
> -       test-tool -C sparse-index read-cache --table >cache &&
> +       git -C sparse-index ls-files --sparse >cache &&
>         for dir in folder1 folder2 x
>         do
> -               TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
> -               grep "040000 tree $TREE $dir/" cache \
> -                       || return 1
> +               test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>         done &&
>
>         git -C sparse-index sparse-checkout set folder1 &&
>
> -       test-tool -C sparse-index read-cache --table >cache &&
> +       git -C sparse-index ls-files --sparse >cache &&
>         for dir in deep folder2 x
>         do
> -               TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
> -               grep "040000 tree $TREE $dir/" cache \
> -                       || return 1
> +               test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>         done &&
>
>         git -C sparse-index sparse-checkout set deep/deeper1 &&
>
> -       test-tool -C sparse-index read-cache --table >cache &&
> +       git -C sparse-index ls-files --sparse >cache &&
>         for dir in deep/deeper2 folder1 folder2 x
>         do
> -               TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
> -               grep "040000 tree $TREE $dir/" cache \
> -                       || return 1
> +               test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>         done &&
>
> +       grep 040000 cache >lines &&
> +       test_line_count = 4 lines &&
> +
>         # Disabling the sparse-index removes tree entries with full ones
>         git -C sparse-index sparse-checkout init --no-sparse-index &&
>
> -       test-tool -C sparse-index read-cache --table >cache &&
> -       ! grep "040000 tree" cache &&
> -       test_sparse_match test-tool read-cache --table
> +       git -C sparse-index ls-files --sparse >cache &&
> +       ! grep "^040000 " cache >lines &&
> +       test_sparse_match git ls-tree -r HEAD
>  '
>
>  test_expect_success 'expanded in-memory index matches full index' '
>         init_repos &&
> -       test_sparse_match test-tool read-cache --expand --table
> +       test_sparse_match git ls-tree -r HEAD
>  '
>
>  test_expect_success 'status with options' '
> @@ -394,9 +413,9 @@ test_expect_success 'submodule handling' '
>         test_all_match git commit -m "add submodule" &&
>
>         # having a submodule prevents "modules" from collapse
> -       test-tool -C sparse-index read-cache --table >cache &&
> -       grep "100644 blob .*    modules/a" cache &&
> -       grep "160000 commit $(git -C initial-repo rev-parse HEAD)       modules/sub" cache
> +       git -C sparse-index ls-files --sparse >cache &&
> +       test_index_entry_like sparse-index "100644 %s 0\t%s" "HEAD:modules/a" "modules/a" cache &&
> +       test_index_entry_like sparse-index "160000 %s 0\t%s" "HEAD:modules/sub" "modules/sub" cache
>  '
>
>  test_expect_success 'sparse-index is expanded and converted back' '
> --
> 2.31.0.260.g719c683c1d

I do like the tests and your idea that we can use ls-files to list
whatever entries are in the index, I just think the tests should use
--stage to do that.
Ævar Arnfjörð Bjarmason March 17, 2021, 6:27 p.m. UTC | #2
On Wed, Mar 17 2021, Elijah Newren wrote:

> On Wed, Mar 17, 2021 at 6:28 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>  Documentation/git-ls-files.txt           |  4 ++
>>  builtin/ls-files.c                       | 10 ++++-
>>  t/t1091-sparse-checkout-builtin.sh       |  9 ++--
>>  t/t1092-sparse-checkout-compatibility.sh | 57 ++++++++++++++++--------
>>  4 files changed, 56 insertions(+), 24 deletions(-)
>>
>> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
>> index 6d11ab506b..1145e960a4 100644
>> --- a/Documentation/git-ls-files.txt
>> +++ b/Documentation/git-ls-files.txt
>> @@ -71,6 +71,10 @@ OPTIONS
>>  --unmerged::
>>         Show unmerged files in the output (forces --stage)
>>
>> +--sparse::
>> +       Show sparse directories in the output instead of expanding
>> +       them (forces --stage)
>> +
>>  -k::
>>  --killed::
>>         Show files on the filesystem that need to be removed due
>> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
>> index 4db75351f2..1ebbb63c10 100644
>> --- a/builtin/ls-files.c
>> +++ b/builtin/ls-files.c
>> @@ -26,6 +26,7 @@ static int show_deleted;
>>  static int show_cached;
>>  static int show_others;
>>  static int show_stage;
>> +static int show_sparse;
>>  static int show_unmerged;
>>  static int show_resolve_undo;
>>  static int show_modified;
>> @@ -639,6 +640,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>>                         DIR_SHOW_IGNORED),
>>                 OPT_BOOL('s', "stage", &show_stage,
>>                         N_("show staged contents' object name in the output")),
>> +               OPT_BOOL(0, "sparse", &show_sparse,
>> +                       N_("show unexpanded sparse directories in the output")),
>>                 OPT_BOOL('k', "killed", &show_killed,
>>                         N_("show files on the filesystem that need to be removed")),
>>                 OPT_BIT(0, "directory", &dir.flags,
>> @@ -705,12 +708,17 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>>                 tag_skip_worktree = "S ";
>>                 tag_resolve_undo = "U ";
>>         }
>> +       if (show_sparse) {
>> +               prepare_repo_settings(the_repository);
>> +               the_repository->settings.command_requires_full_index = 0;
>> +       }
>>         if (show_modified || show_others || show_deleted || (dir.flags & DIR_SHOW_IGNORED) || show_killed)
>>                 require_work_tree = 1;
>> -       if (show_unmerged)
>> +       if (show_unmerged || show_sparse)
>>                 /*
>>                  * There's no point in showing unmerged unless
>>                  * you also show the stage information.
>> +                * The same goes for the --sparse option.
>
> Yuck, haven't you just made --sparse an alias for --stage?  Why does
> it need an alias?

It doesn't, but --unmerged, the one other option which purely modifies
--stage output implies --stage.

So it's in line with existing UI convention in the command, it's
probably better to keep following that than have new options behave
differently.

But yeah, we could spell out --stage --sparse in the tests.

> Was the goal just to get a quick way to make the command run under
> repo->settings.command_requires_full_index = 0 without auditing the
> codepaths?  It seems to rely on them having been audited anyway, since
> it just falls back to the code used for --stage, so I don't see how it
> helps.  It also suggests the command might do unexpected or weird
> things if run without the --sparse option?  If people manually
> configure a sparse-checkout and cone mode AND a sparse-index (it's
> annoying how they have to specify all three instead of having to just
> pass one flag somewhere), then now we also need to force them to
> remember to pass extra flags to random various commands for them to
> operate in a sane manner in their environment??
>
> I think this is a bad path to go down.

Those are probably good points, I don't have enough overview of the
whole sparse thing yet to say.

I just thought it didn't make sense to have a series changing the nature
of the index without corresponding tooling changes to interrogate the
state of the index.

> However, if you want to write the necessary tests to make it so that
> ls-files can operate with command_requires_full_index = 0, then I
> think that's useful.  If you want to add a special flag so that folks
> in a sparse-checkout-with-cone-mode-with-sparse-index setup want to
> operate densely (i.e. to show what files would be in the index if it
> were fully populated), then I think that's useful.  But having
> sparse-yes-with-cone-yes-very-sparse folks need to specify an extra
> flag to commands to get sparse behavior just seems wrong to me.

Maybe, but what else do you suggest for getting this information out of
the index?

>>                  */
>>                 show_stage = 1;
>>         if (show_tag || show_stage)
>> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
>> index ff1ad570a2..c823df423c 100755
>> --- a/t/t1091-sparse-checkout-builtin.sh
>> +++ b/t/t1091-sparse-checkout-builtin.sh
>> @@ -208,12 +208,13 @@ test_expect_success 'sparse-checkout disable' '
>>  test_expect_success 'sparse-index enabled and disabled' '
>>         git -C repo sparse-checkout init --cone --sparse-index &&
>>         test_cmp_config -C repo true extensions.sparseIndex &&
>> -       test-tool -C repo read-cache --table >cache &&
>> -       grep " tree " cache &&
>> +       git -C repo ls-files --sparse >cache &&
>> +       grep "^040000 " cache >lines &&
>> +       test_line_count = 3 lines &&
>>
>>         git -C repo sparse-checkout disable &&
>> -       test-tool -C repo read-cache --table >cache &&
>> -       ! grep " tree " cache &&
>> +       git -C repo ls-files --sparse >cache &&
>> +       ! grep "^040000 " cache &&
>>         git -C repo config --list >config &&
>>         ! grep extensions.sparseindex config
>>  '
>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>> index d97bf9b645..48d3920490 100755
>> --- a/t/t1092-sparse-checkout-compatibility.sh
>> +++ b/t/t1092-sparse-checkout-compatibility.sh
>> @@ -136,48 +136,67 @@ test_sparse_match () {
>>         test_cmp sparse-checkout-err sparse-index-err
>>  }
>>
>> +test_index_entry_like () {
>> +       dir=$1
>> +       shift
>> +       fmt=$1
>> +       shift
>> +       rev=$1
>> +       shift
>> +       entry=$1
>> +       shift
>> +       file=$1
>> +       shift
>> +       hash=$(git -C "$dir" rev-parse "$rev") &&
>> +       printf "$fmt\n" "$hash" "$entry" >expected &&
>> +       if grep "$entry" "$file" >line
>> +       then
>> +               test_cmp expected line
>> +       else
>> +               cat cache &&
>> +               false
>> +       fi
>> +}
>> +
>>  test_expect_success 'sparse-index contents' '
>>         init_repos &&
>>
>> -       test-tool -C sparse-index read-cache --table >cache &&
>> +       git -C sparse-index ls-files --sparse >cache &&
>>         for dir in folder1 folder2 x
>>         do
>> -               TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
>> -               grep "040000 tree $TREE $dir/" cache \
>> -                       || return 1
>> +               test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>>         done &&
>>
>>         git -C sparse-index sparse-checkout set folder1 &&
>>
>> -       test-tool -C sparse-index read-cache --table >cache &&
>> +       git -C sparse-index ls-files --sparse >cache &&
>>         for dir in deep folder2 x
>>         do
>> -               TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
>> -               grep "040000 tree $TREE $dir/" cache \
>> -                       || return 1
>> +               test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>>         done &&
>>
>>         git -C sparse-index sparse-checkout set deep/deeper1 &&
>>
>> -       test-tool -C sparse-index read-cache --table >cache &&
>> +       git -C sparse-index ls-files --sparse >cache &&
>>         for dir in deep/deeper2 folder1 folder2 x
>>         do
>> -               TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
>> -               grep "040000 tree $TREE $dir/" cache \
>> -                       || return 1
>> +               test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>>         done &&
>>
>> +       grep 040000 cache >lines &&
>> +       test_line_count = 4 lines &&
>> +
>>         # Disabling the sparse-index removes tree entries with full ones
>>         git -C sparse-index sparse-checkout init --no-sparse-index &&
>>
>> -       test-tool -C sparse-index read-cache --table >cache &&
>> -       ! grep "040000 tree" cache &&
>> -       test_sparse_match test-tool read-cache --table
>> +       git -C sparse-index ls-files --sparse >cache &&
>> +       ! grep "^040000 " cache >lines &&
>> +       test_sparse_match git ls-tree -r HEAD
>>  '
>>
>>  test_expect_success 'expanded in-memory index matches full index' '
>>         init_repos &&
>> -       test_sparse_match test-tool read-cache --expand --table
>> +       test_sparse_match git ls-tree -r HEAD
>>  '
>>
>>  test_expect_success 'status with options' '
>> @@ -394,9 +413,9 @@ test_expect_success 'submodule handling' '
>>         test_all_match git commit -m "add submodule" &&
>>
>>         # having a submodule prevents "modules" from collapse
>> -       test-tool -C sparse-index read-cache --table >cache &&
>> -       grep "100644 blob .*    modules/a" cache &&
>> -       grep "160000 commit $(git -C initial-repo rev-parse HEAD)       modules/sub" cache
>> +       git -C sparse-index ls-files --sparse >cache &&
>> +       test_index_entry_like sparse-index "100644 %s 0\t%s" "HEAD:modules/a" "modules/a" cache &&
>> +       test_index_entry_like sparse-index "160000 %s 0\t%s" "HEAD:modules/sub" "modules/sub" cache
>>  '
>>
>>  test_expect_success 'sparse-index is expanded and converted back' '
>> --
>> 2.31.0.260.g719c683c1d
>
> I do like the tests and your idea that we can use ls-files to list
> whatever entries are in the index, I just think the tests should use
> --stage to do that.
Elijah Newren March 17, 2021, 6:44 p.m. UTC | #3
On Wed, Mar 17, 2021 at 11:27 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Wed, Mar 17 2021, Elijah Newren wrote:
>
> > On Wed, Mar 17, 2021 at 6:28 AM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> >> ---
> >>  Documentation/git-ls-files.txt           |  4 ++
> >>  builtin/ls-files.c                       | 10 ++++-
> >>  t/t1091-sparse-checkout-builtin.sh       |  9 ++--
> >>  t/t1092-sparse-checkout-compatibility.sh | 57 ++++++++++++++++--------
> >>  4 files changed, 56 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> >> index 6d11ab506b..1145e960a4 100644
> >> --- a/Documentation/git-ls-files.txt
> >> +++ b/Documentation/git-ls-files.txt
> >> @@ -71,6 +71,10 @@ OPTIONS
> >>  --unmerged::
> >>         Show unmerged files in the output (forces --stage)
> >>
> >> +--sparse::
> >> +       Show sparse directories in the output instead of expanding
> >> +       them (forces --stage)
> >> +
> >>  -k::
> >>  --killed::
> >>         Show files on the filesystem that need to be removed due
> >> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> >> index 4db75351f2..1ebbb63c10 100644
> >> --- a/builtin/ls-files.c
> >> +++ b/builtin/ls-files.c
> >> @@ -26,6 +26,7 @@ static int show_deleted;
> >>  static int show_cached;
> >>  static int show_others;
> >>  static int show_stage;
> >> +static int show_sparse;
> >>  static int show_unmerged;
> >>  static int show_resolve_undo;
> >>  static int show_modified;
> >> @@ -639,6 +640,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
> >>                         DIR_SHOW_IGNORED),
> >>                 OPT_BOOL('s', "stage", &show_stage,
> >>                         N_("show staged contents' object name in the output")),
> >> +               OPT_BOOL(0, "sparse", &show_sparse,
> >> +                       N_("show unexpanded sparse directories in the output")),
> >>                 OPT_BOOL('k', "killed", &show_killed,
> >>                         N_("show files on the filesystem that need to be removed")),
> >>                 OPT_BIT(0, "directory", &dir.flags,
> >> @@ -705,12 +708,17 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
> >>                 tag_skip_worktree = "S ";
> >>                 tag_resolve_undo = "U ";
> >>         }
> >> +       if (show_sparse) {
> >> +               prepare_repo_settings(the_repository);
> >> +               the_repository->settings.command_requires_full_index = 0;
> >> +       }
> >>         if (show_modified || show_others || show_deleted || (dir.flags & DIR_SHOW_IGNORED) || show_killed)
> >>                 require_work_tree = 1;
> >> -       if (show_unmerged)
> >> +       if (show_unmerged || show_sparse)
> >>                 /*
> >>                  * There's no point in showing unmerged unless
> >>                  * you also show the stage information.
> >> +                * The same goes for the --sparse option.
> >
> > Yuck, haven't you just made --sparse an alias for --stage?  Why does
> > it need an alias?
>
> It doesn't, but --unmerged, the one other option which purely modifies
> --stage output implies --stage.

--unmerged modifies --stage output.  --sparse won't.  (Maybe it does
_now_ because the command doesn't yet support sparse-indexes, but
that's a temporary artifact.  Long term, there should be no difference
in the output.)

> So it's in line with existing UI convention in the command, it's
> probably better to keep following that than have new options behave
> differently.
>
> But yeah, we could spell out --stage --sparse in the tests.

There should not be a --sparse option.  The index is _already_ sparse
and users had to take multiple steps to make it so; users shouldn't
have to repeat themselves with each and every command they ever type
when they've created a sparse index that they want sparse behavior.

You should just spell it "--stage".

> > Was the goal just to get a quick way to make the command run under
> > repo->settings.command_requires_full_index = 0 without auditing the
> > codepaths?  It seems to rely on them having been audited anyway, since
> > it just falls back to the code used for --stage, so I don't see how it
> > helps.  It also suggests the command might do unexpected or weird
> > things if run without the --sparse option?  If people manually
> > configure a sparse-checkout and cone mode AND a sparse-index (it's
> > annoying how they have to specify all three instead of having to just
> > pass one flag somewhere), then now we also need to force them to
> > remember to pass extra flags to random various commands for them to
> > operate in a sane manner in their environment??
> >
> > I think this is a bad path to go down.
>
> Those are probably good points, I don't have enough overview of the
> whole sparse thing yet to say.
>
> I just thought it didn't make sense to have a series changing the nature
> of the index without corresponding tooling changes to interrogate the
> state of the index.

That makes sense to me; I agree with you on that point.

> > However, if you want to write the necessary tests to make it so that
> > ls-files can operate with command_requires_full_index = 0, then I
> > think that's useful.  If you want to add a special flag so that folks
> > in a sparse-checkout-with-cone-mode-with-sparse-index setup want to
> > operate densely (i.e. to show what files would be in the index if it
> > were fully populated), then I think that's useful.  But having
> > sparse-yes-with-cone-yes-very-sparse folks need to specify an extra
> > flag to commands to get sparse behavior just seems wrong to me.
>
> Maybe, but what else do you suggest for getting this information out of
> the index?

Use git ls-files without new options...as I stated here:

...
> > I do like the tests and your idea that we can use ls-files to list
> > whatever entries are in the index, I just think the tests should use
> > --stage to do that.

In other words, I think making "git ls-files" the first, or at least
one of the first, commands to be modified to behave properly in a
sparse-index world is what you should be aiming for, not some
new-option-shortcut that'll make no sense long term and persist
indefinitely.

List the entries in the index: `git ls-files`
List the entries in the index with their hash, mode, and stage: `git
ls-files --stage`

List all the entries that would be in the index if it weren't sparse:
`git ls-files --$SOME_NEW_OPTION_NAME`

You don't need to implement the --$SOME_NEW_OPTION_NAME yet, of
course.  We can just note that it's the plan to add it later.
Derrick Stolee March 17, 2021, 8:43 p.m. UTC | #4
On 3/17/2021 9:28 AM, Ævar Arnfjörð Bjarmason wrote:
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh

I want to learn from your suggested changes to the test, here,
so forgive my questions here:
  
> +test_index_entry_like () {
> +	dir=$1
> +	shift
> +	fmt=$1
> +	shift
> +	rev=$1
> +	shift
> +	entry=$1
> +	shift
> +	file=$1
> +	shift

Why all the shifts? Why not just use $1, $2, $3,...? My
guess is that you want to be able to insert a new parameter
in the middle in the future without changing the later
numbers, but that seems unlikely, and we could just add
the parameter at the end.

> +	hash=$(git -C "$dir" rev-parse "$rev") &&
> +	printf "$fmt\n" "$hash" "$entry" >expected &&
> +	if grep "$entry" "$file" >line
> +	then
> +		test_cmp expected line
> +	else
> +		cat cache &&
> +		false
> +	fi
> +}
> +
>  test_expect_success 'sparse-index contents' '
>  	init_repos &&
>  
> -	test-tool -C sparse-index read-cache --table >cache &&
> +	git -C sparse-index ls-files --sparse >cache &&
>  	for dir in folder1 folder2 x
>  	do
> -		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
> -		grep "040000 tree $TREE	$dir/" cache \
> -			|| return 1
> +		test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1

I see how this uses only one line, but it seems like the
test_index_entry_like is too generic to make it not a
complicated mess of format strings that need to copy
over and over again.

Perhaps instead it could be a "test_entry_is_tree"
and it only passes "$dir" and "cache"? Then we could drop the loop and
just have

	test_entry_is_tree cache folder1 &&
	test_entry_is_tree cache folder2 &&
	test_entry_is_tree cache x &&

or we could still use the loop, especially when we test for four trees.

> -	test-tool -C sparse-index read-cache --table >cache &&
> +	git -C sparse-index ls-files --sparse >cache &&
>  	for dir in deep/deeper2 folder1 folder2 x
>  	do
> -		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
> -		grep "040000 tree $TREE	$dir/" cache \
> -			|| return 1
> +		test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>  	done &&
>  
> +	grep 040000 cache >lines &&
> +	test_line_count = 4 lines &&
> +

The point here is to check that no other entries are trees? We know
that this number will be _at least_ 4 based on the loop above.

Thanks,
-Stolee
Ævar Arnfjörð Bjarmason March 24, 2021, 12:52 a.m. UTC | #5
On Wed, Mar 17 2021, Derrick Stolee wrote:

> On 3/17/2021 9:28 AM, Ævar Arnfjörð Bjarmason wrote:
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
>
> I want to learn from your suggested changes to the test, here,
> so forgive my questions here:
>   
>> +test_index_entry_like () {
>> +	dir=$1
>> +	shift
>> +	fmt=$1
>> +	shift
>> +	rev=$1
>> +	shift
>> +	entry=$1
>> +	shift
>> +	file=$1
>> +	shift
>
> Why all the shifts? Why not just use $1, $2, $3,...? My
> guess is that you want to be able to insert a new parameter
> in the middle in the future without changing the later
> numbers, but that seems unlikely, and we could just add
> the parameter at the end.

It's just crappy RFC-quality code. I probably copied some other function
and went with it. No good reason. Yeah it's ugly.

>> +	hash=$(git -C "$dir" rev-parse "$rev") &&
>> +	printf "$fmt\n" "$hash" "$entry" >expected &&
>> +	if grep "$entry" "$file" >line
>> +	then
>> +		test_cmp expected line
>> +	else
>> +		cat cache &&
>> +		false
>> +	fi
>> +}
>> +
>>  test_expect_success 'sparse-index contents' '
>>  	init_repos &&
>>  
>> -	test-tool -C sparse-index read-cache --table >cache &&
>> +	git -C sparse-index ls-files --sparse >cache &&
>>  	for dir in folder1 folder2 x
>>  	do
>> -		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
>> -		grep "040000 tree $TREE	$dir/" cache \
>> -			|| return 1
>> +		test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>
> I see how this uses only one line, but it seems like the
> test_index_entry_like is too generic to make it not a
> complicated mess of format strings that need to copy
> over and over again.
>
> Perhaps instead it could be a "test_entry_is_tree"
> and it only passes "$dir" and "cache"? Then we could drop the loop and
> just have
>
> 	test_entry_is_tree cache folder1 &&
> 	test_entry_is_tree cache folder2 &&
> 	test_entry_is_tree cache x &&
>
> or we could still use the loop, especially when we test for four trees.

Yeah that sounds good. Personally I don't mind 4x similar lines
copy/pasted over a for-loop in the tests. You don't need to worry about
the || return doing the right thing, and just setting up the for-loop is
already 3 lines...

>> -	test-tool -C sparse-index read-cache --table >cache &&
>> +	git -C sparse-index ls-files --sparse >cache &&
>>  	for dir in deep/deeper2 folder1 folder2 x
>>  	do
>> -		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
>> -		grep "040000 tree $TREE	$dir/" cache \
>> -			|| return 1
>> +		test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
>>  	done &&
>>  
>> +	grep 040000 cache >lines &&
>> +	test_line_count = 4 lines &&
>> +
>
> The point here is to check that no other entries are trees? We know
> that this number will be _at least_ 4 based on the loop above.

It's exactly 4 because we have 4 folders we're checking. But you tell
me. I was just trying to refactor this dependence on the ls-tree format
while moving it over to ls-files without spending too much time on
understanding all the specifics.
diff mbox series

Patch

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 6d11ab506b..1145e960a4 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -71,6 +71,10 @@  OPTIONS
 --unmerged::
 	Show unmerged files in the output (forces --stage)
 
+--sparse::
+	Show sparse directories in the output instead of expanding
+	them (forces --stage)
+
 -k::
 --killed::
 	Show files on the filesystem that need to be removed due
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 4db75351f2..1ebbb63c10 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -26,6 +26,7 @@  static int show_deleted;
 static int show_cached;
 static int show_others;
 static int show_stage;
+static int show_sparse;
 static int show_unmerged;
 static int show_resolve_undo;
 static int show_modified;
@@ -639,6 +640,8 @@  int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			DIR_SHOW_IGNORED),
 		OPT_BOOL('s', "stage", &show_stage,
 			N_("show staged contents' object name in the output")),
+		OPT_BOOL(0, "sparse", &show_sparse,
+			N_("show unexpanded sparse directories in the output")),
 		OPT_BOOL('k', "killed", &show_killed,
 			N_("show files on the filesystem that need to be removed")),
 		OPT_BIT(0, "directory", &dir.flags,
@@ -705,12 +708,17 @@  int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		tag_skip_worktree = "S ";
 		tag_resolve_undo = "U ";
 	}
+	if (show_sparse) {
+		prepare_repo_settings(the_repository);
+		the_repository->settings.command_requires_full_index = 0;
+	}
 	if (show_modified || show_others || show_deleted || (dir.flags & DIR_SHOW_IGNORED) || show_killed)
 		require_work_tree = 1;
-	if (show_unmerged)
+	if (show_unmerged || show_sparse)
 		/*
 		 * There's no point in showing unmerged unless
 		 * you also show the stage information.
+		 * The same goes for the --sparse option.
 		 */
 		show_stage = 1;
 	if (show_tag || show_stage)
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index ff1ad570a2..c823df423c 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -208,12 +208,13 @@  test_expect_success 'sparse-checkout disable' '
 test_expect_success 'sparse-index enabled and disabled' '
 	git -C repo sparse-checkout init --cone --sparse-index &&
 	test_cmp_config -C repo true extensions.sparseIndex &&
-	test-tool -C repo read-cache --table >cache &&
-	grep " tree " cache &&
+	git -C repo ls-files --sparse >cache &&
+	grep "^040000 " cache >lines &&
+	test_line_count = 3 lines &&
 
 	git -C repo sparse-checkout disable &&
-	test-tool -C repo read-cache --table >cache &&
-	! grep " tree " cache &&
+	git -C repo ls-files --sparse >cache &&
+	! grep "^040000 " cache &&
 	git -C repo config --list >config &&
 	! grep extensions.sparseindex config
 '
diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index d97bf9b645..48d3920490 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -136,48 +136,67 @@  test_sparse_match () {
 	test_cmp sparse-checkout-err sparse-index-err
 }
 
+test_index_entry_like () {
+	dir=$1
+	shift
+	fmt=$1
+	shift
+	rev=$1
+	shift
+	entry=$1
+	shift
+	file=$1
+	shift
+	hash=$(git -C "$dir" rev-parse "$rev") &&
+	printf "$fmt\n" "$hash" "$entry" >expected &&
+	if grep "$entry" "$file" >line
+	then
+		test_cmp expected line
+	else
+		cat cache &&
+		false
+	fi
+}
+
 test_expect_success 'sparse-index contents' '
 	init_repos &&
 
-	test-tool -C sparse-index read-cache --table >cache &&
+	git -C sparse-index ls-files --sparse >cache &&
 	for dir in folder1 folder2 x
 	do
-		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
-		grep "040000 tree $TREE	$dir/" cache \
-			|| return 1
+		test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
 	done &&
 
 	git -C sparse-index sparse-checkout set folder1 &&
 
-	test-tool -C sparse-index read-cache --table >cache &&
+	git -C sparse-index ls-files --sparse >cache &&
 	for dir in deep folder2 x
 	do
-		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
-		grep "040000 tree $TREE	$dir/" cache \
-			|| return 1
+		test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
 	done &&
 
 	git -C sparse-index sparse-checkout set deep/deeper1 &&
 
-	test-tool -C sparse-index read-cache --table >cache &&
+	git -C sparse-index ls-files --sparse >cache &&
 	for dir in deep/deeper2 folder1 folder2 x
 	do
-		TREE=$(git -C sparse-index rev-parse HEAD:$dir) &&
-		grep "040000 tree $TREE	$dir/" cache \
-			|| return 1
+		test_index_entry_like sparse-index "040000 %s 0\t%s" "HEAD:$dir" "$dir/" cache || return 1
 	done &&
 
+	grep 040000 cache >lines &&
+	test_line_count = 4 lines &&
+
 	# Disabling the sparse-index removes tree entries with full ones
 	git -C sparse-index sparse-checkout init --no-sparse-index &&
 
-	test-tool -C sparse-index read-cache --table >cache &&
-	! grep "040000 tree" cache &&
-	test_sparse_match test-tool read-cache --table
+	git -C sparse-index ls-files --sparse >cache &&
+	! grep "^040000 " cache >lines &&
+	test_sparse_match git ls-tree -r HEAD
 '
 
 test_expect_success 'expanded in-memory index matches full index' '
 	init_repos &&
-	test_sparse_match test-tool read-cache --expand --table
+	test_sparse_match git ls-tree -r HEAD
 '
 
 test_expect_success 'status with options' '
@@ -394,9 +413,9 @@  test_expect_success 'submodule handling' '
 	test_all_match git commit -m "add submodule" &&
 
 	# having a submodule prevents "modules" from collapse
-	test-tool -C sparse-index read-cache --table >cache &&
-	grep "100644 blob .*	modules/a" cache &&
-	grep "160000 commit $(git -C initial-repo rev-parse HEAD)	modules/sub" cache
+	git -C sparse-index ls-files --sparse >cache &&
+	test_index_entry_like sparse-index "100644 %s 0\t%s" "HEAD:modules/a" "modules/a" cache &&
+	test_index_entry_like sparse-index "160000 %s 0\t%s" "HEAD:modules/sub" "modules/sub" cache
 '
 
 test_expect_success 'sparse-index is expanded and converted back' '