diff mbox series

[v4] status: long status advice adapted to recent capabilities

Message ID pull.1384.v4.git.1668055574050.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series [v4] status: long status advice adapted to recent capabilities | expand

Commit Message

Rudy Rigot Nov. 10, 2022, 4:46 a.m. UTC
From: Rudy Rigot <rudy.rigot@gmail.com>

Improve the advice displayed when `git status` is slow because
of excessive numbers of untracked files.  Update the `git status`
man page to explain the various configuration options.

`git status` can be slow when there are a large number of untracked
files and directories, because Git must search the entire worktree
to enumerate them.  Previously, Git would print an advice message
with the elapsed search time and a suggestion to disable the search
using the `-uno` option.  This suggestion also carried a warning
that might scare off some users.

Git can reduce the size and time of the untracked file search when
the `core.untrackedCache` and `core.fsmonitor` features are enabled
by caching results from previous `git status` invocations.

Update the advice to explain the various combinations of additional
configuration options and refer to (new) documentation in the man
page that explains it in more detail than what can be printed in an
advice message.

Finally, add new tests to verify the new functionality.

Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
---
    status: long status advice adapted to recent capabilities
    
    Here is version 4 for this patch.
    
    Changes since v3:
    
     * Integrated all feedback on the doc content itself, as is.
     * Moved the small test files into here docs in the test code.
     * Fix some awkward indentations.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1384%2Frudyrigot%2Fadvice_statusFsmonitor-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1384/rudyrigot/advice_statusFsmonitor-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1384

Range-diff vs v3:

 1:  3c98492cb82 ! 1:  85b35882c02 status: long status advice adapted to recent capabilities
     @@ Documentation/git-status.txt: during the write may conflict with other simultane
      +documented elsewhere in more detail, so please refer to them
      +for complete details.
      +
     -+* `-uno` or `status.showUntrackedFiles=false` : just don't search
     -+    and don't report on untracked files.  This is the fastest.
     -+    `git status` will not list the untracked files, so you need
     -+    to be careful to remember if you create any new files and
     -+    manually `git add` them.
     ++* The `-uno` flag or the `status.showUntrackedfiles=false`
     ++    config : indicate that `git status` should not report untracked
     ++	files. This is the fastest option. `git status` will not list
     ++	the untracked files, so you need to be careful to remember if
     ++	you create any new files and manually `git add` them.
      +
     -+* `advice.statusUoption=false` : search, but don't complain if it
     -+    takes too long.
     ++* `advice.statusUoption=false` : this config option disables a
     ++	warning message when the search for untracked files takes longer
     ++	than desired. In some large repositories, this message may appear
     ++	frequently and not be a helpful signal.
      +
      +* `core.untrackedCache=true` : enable the untracked cache feature
      +    and only search directories that have been modified since the
     @@ Documentation/git-status.txt: during the write may conflict with other simultane
      +    file within has not changed.  This is much faster than
      +    enumerating the contents of every directory, but still not
      +    without cost, because Git still has to search for the set of
     -+    modified directories.
     ++    modified directories. The untracked cache is stored in the
     ++	.git/index file. The reduced cost searching for untracked
     ++	files is offset slightly by the increased size of the index and
     ++	the cost of keeping it up-to-date. That reduced search time is
     ++	usually worth the additional size.
      +
      +* `core.untrackedCache=true` and `core.fsmonitor=true` or
      +    `core.fsmonitor=<hook_command_pathname>` : enable both the
     @@ Documentation/git-status.txt: during the write may conflict with other simultane
      +    `git status` command.  This is faster than using just the
      +    untracked cache alone because Git can also avoid searching
      +    for modified directories.  Git only has to enumerate the
     -+    exact set of directories that have changed recently.
     ++    exact set of directories that have changed recently. While
     ++	the FSMonitor feature can be enabled without the untracked
     ++	cache, the benefits are greatly reduced in that case.
      +
      +Note that after you turn on the untracked cache and/or FSMonitor
      +features it may take a few `git status` commands for the various
     @@ t/t7065-wtstatus-slow.sh (new)
      +test_expect_success 'when core.untrackedCache and fsmonitor are unset' '
      +	test_must_fail git config --get core.untrackedCache &&
      +	test_must_fail git config --get core.fsmonitor &&
     -+    git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
     -+    test_cmp "$DATA/no_untrackedcache_no_fsmonitor" ../actual &&
     -+    rm -fr ../actual
     -+'
     -+
     -+test_expect_success 'when core.untrackedCache true, but not fsmonitor' '
     -+    git config core.untrackedCache true &&
     -+	test_must_fail git config --get core.fsmonitor &&
     -+    git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
     -+    test_cmp "$DATA/with_untrackedcache_no_fsmonitor" ../actual &&
     -+    rm -fr ../actual
     -+'
     -+
     -+test_expect_success 'when core.untrackedCache true, and fsmonitor' '
     -+    git config core.untrackedCache true &&
     -+	git config core.fsmonitor true &&
     -+    git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
     -+    test_cmp "$DATA/with_untrackedcache_with_fsmonitor" ../actual &&
     -+    rm -fr ../actual
     -+'
     -+
     -+test_done
     -
     - ## t/t7065/no_untrackedcache_no_fsmonitor (new) ##
     -@@
     ++	git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
     ++	cat >../expected <<-EOF &&
      +On branch test
      +
      +No commits yet
      +
      +
      +It took X seconds to enumerate untracked files.
     -+See 'git help status' for information on how to improve this.
     ++See '"'"'git help status'"'"' for information on how to improve this.
      +
      +nothing to commit (create/copy files and use "git add" to track)
     -
     - ## t/t7065/with_untrackedcache_no_fsmonitor (new) ##
     -@@
     ++	EOF
     ++	test_cmp ../expected ../actual &&
     ++	rm -fr ../actual ../expected
     ++'
     ++
     ++test_expect_success 'when core.untrackedCache true, but not fsmonitor' '
     ++	git config core.untrackedCache true &&
     ++	test_must_fail git config --get core.fsmonitor &&
     ++	git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
     ++	cat >../expected <<-EOF &&
      +On branch test
      +
      +No commits yet
      +
      +
      +It took X seconds to enumerate untracked files.
     -+See 'git help status' for information on how to improve this.
     ++See '"'"'git help status'"'"' for information on how to improve this.
      +
      +nothing to commit (create/copy files and use "git add" to track)
     -
     - ## t/t7065/with_untrackedcache_with_fsmonitor (new) ##
     -@@
     ++	EOF
     ++	test_cmp ../expected ../actual &&
     ++	rm -fr ../actual ../expected
     ++'
     ++
     ++test_expect_success 'when core.untrackedCache true, and fsmonitor' '
     ++	git config core.untrackedCache true &&
     ++	git config core.fsmonitor true &&
     ++	git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
     ++	cat >../expected <<-EOF &&
      +On branch test
      +
      +No commits yet
     @@ t/t7065/with_untrackedcache_with_fsmonitor (new)
      +
      +It took X seconds to enumerate untracked files,
      +but this is currently being cached.
     -+See 'git help status' for information on how to improve this.
     ++See '"'"'git help status'"'"' for information on how to improve this.
      +
      +nothing to commit (create/copy files and use "git add" to track)
     ++	EOF
     ++	test_cmp ../expected ../actual &&
     ++	rm -fr ../actual ../expected
     ++'
     ++
     ++test_done
      
       ## wt-status.c ##
      @@
     @@ wt-status.c: static void wt_longstatus_print(struct wt_status *s)
       		if (s->show_ignored_mode)
       			wt_longstatus_print_other(s, &s->ignored, _("Ignored files"), "add -f");
      -		if (advice_enabled(ADVICE_STATUS_U_OPTION) && 2000 < s->untracked_in_ms) {
     --			status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
     --			status_printf_ln(s, GIT_COLOR_NORMAL,
     ++		if (uf_was_slow(s->untracked_in_ms) && advice_enabled(ADVICE_STATUS_U_OPTION)) {
     + 			status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
     ++			if (fsm_mode > FSMONITOR_MODE_DISABLED) {
     ++				status_printf_ln(s, GIT_COLOR_NORMAL,
     ++						_("It took %.2f seconds to enumerate untracked files,\n"
     ++						"but this is currently being cached."),
     ++						s->untracked_in_ms / 1000.0);
     ++			} else {
     ++				status_printf_ln(s, GIT_COLOR_NORMAL,
     ++						_("It took %.2f seconds to enumerate untracked files."),
     ++						s->untracked_in_ms / 1000.0);
     ++			}
     + 			status_printf_ln(s, GIT_COLOR_NORMAL,
      -					 _("It took %.2f seconds to enumerate untracked files. 'status -uno'\n"
      -					   "may speed it up, but you have to be careful not to forget to add\n"
      -					   "new files yourself (see 'git help status')."),
      -					 s->untracked_in_ms / 1000.0);
     -+		if (uf_was_slow(s->untracked_in_ms)) {
     -+			if (advice_enabled(ADVICE_STATUS_U_OPTION)) {
     -+				status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
     -+				if (fsm_mode > FSMONITOR_MODE_DISABLED) {
     -+					status_printf_ln(s, GIT_COLOR_NORMAL,
     -+							_("It took %.2f seconds to enumerate untracked files,\n"
     -+							"but this is currently being cached."),
     -+							s->untracked_in_ms / 1000.0);
     -+				} else {
     -+					status_printf_ln(s, GIT_COLOR_NORMAL,
     -+							_("It took %.2f seconds to enumerate untracked files."),
     -+							s->untracked_in_ms / 1000.0);
     -+				}
     -+				status_printf_ln(s, GIT_COLOR_NORMAL,
     -+						_("See 'git help status' for information on how to improve this."));
     -+				status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
     -+			}
     ++					_("See 'git help status' for information on how to improve this."));
     ++			status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
       		}
       	} else if (s->committable)
       		status_printf_ln(s, GIT_COLOR_NORMAL, _("Untracked files not listed%s"),


 Documentation/git-status.txt | 55 +++++++++++++++++++++++++++
 t/t7065-wtstatus-slow.sh     | 74 ++++++++++++++++++++++++++++++++++++
 wt-status.c                  | 32 +++++++++++++---
 3 files changed, 156 insertions(+), 5 deletions(-)
 create mode 100755 t/t7065-wtstatus-slow.sh


base-commit: bbe21b64a08f89475d8a3818e20c111378daa621

Comments

Eric Sunshine Nov. 10, 2022, 5:42 a.m. UTC | #1
On Wed, Nov 9, 2022 at 11:46 PM Rudy Rigot via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> Improve the advice displayed when `git status` is slow because
> of excessive numbers of untracked files.  Update the `git status`
> man page to explain the various configuration options.
>
> `git status` can be slow when there are a large number of untracked
> files and directories, because Git must search the entire worktree
> to enumerate them.  Previously, Git would print an advice message
> with the elapsed search time and a suggestion to disable the search
> using the `-uno` option.  This suggestion also carried a warning
> that might scare off some users.
>
> Git can reduce the size and time of the untracked file search when
> the `core.untrackedCache` and `core.fsmonitor` features are enabled
> by caching results from previous `git status` invocations.
>
> Update the advice to explain the various combinations of additional
> configuration options and refer to (new) documentation in the man
> page that explains it in more detail than what can be printed in an
> advice message.
>
> Finally, add new tests to verify the new functionality.
>
> Signed-off-by: Rudy Rigot <rudy.rigot@gmail.com>
> ---
> diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
> @@ -457,6 +457,61 @@ during the write may conflict with other simultaneous processes, causing
> +UNTRACKED FILES AND STATUS SPEED
> +--------------------------------
> +
> +`git status` can be very slow in large worktrees if/when it
> +needs to search for untracked files and directories.  There are
> +many configuration options available to speed this up by either
> +avoiding the work or making use of cached results from previous
> +Git commands.  Since we all work in different ways, there is no
> +single optimum set of settings right for everyone.  Here is a

Not necessarily worth a reroll, but the "since we all work..."
fragment could be dropped without any loss of clarity:

    There is no single optimum configuration suitable
    to every situation.

would probably be sufficient.

> +brief summary of the relevant options to help you choose which
> +is right for you.  Each of these settings is independently
> +documented elsewhere in more detail, so please refer to them
> +for complete details.

This leaves the reader hanging somewhat by not saying where
"elsewhere" actually is. You can use gitlink: to insert links to the
appropriate documentation.

> +* The `-uno` flag or the `status.showUntrackedfiles=false`
> +    config : indicate that `git status` should not report untracked
> +       files. This is the fastest option. `git status` will not list
> +       the untracked files, so you need to be careful to remember if
> +       you create any new files and manually `git add` them.

This and all the other bullet points use an odd mix of spaces and TAB
for indentation. They should consistently use one or the other.

Earlier in this thread, Ævar suggested that it might be worthwhile to
mention an additional possibility: that simply rerunning `git status`
might be sufficient to achieve reasonable speed since the second run
of `git status` may benefit from the filesystem cache having been
primed by the first invocation of `git status`. As such, it may be
overkill for a user to dive into the various options described here.
Hence, to mitigate the possibility of a user doing a lot of research
and extra unnecessary configuration, it might make sense for the very
first bullet point (before this one) to say merely:

    * Do nothing. Subsequent invocations of `git status` may be faster
      simply because the first `git status` primed the filesystem cache.

or something like that (probably more formal, though).

> diff --git a/t/t7065-wtstatus-slow.sh b/t/t7065-wtstatus-slow.sh
> @@ -0,0 +1,74 @@
> +#!/bin/sh
> +
> +test_description='test status when slow untracked files'
> +
> +. ./test-lib.sh
> +
> +DATA="$TEST_DIRECTORY/t7065"

This variable is unused, isn't it, now that you're inlining everything
with here-docs?

> +GIT_TEST_UF_DELAY_WARNING=1
> +export GIT_TEST_UF_DELAY_WARNING
> +
> +test_expect_success setup '
> +       git checkout -b test
> +'
> +
> +test_expect_success 'when core.untrackedCache and fsmonitor are unset' '
> +       test_must_fail git config --get core.untrackedCache &&
> +       test_must_fail git config --get core.fsmonitor &&

Rather than asserting that some preceding code has left the
configuration in a state this test wants, it would be more robust for
this test to forcibly set up the state it wants. Something like this
should work:

    test_might_fail git config ---unset-all core.untrackedCache &&
    test_might_fail git config ---unset-all core.fsmonitor &&

> +       git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&

At all costs, avoid escaping the "trash" directory in which this test
is running. All files created by this test should be within the
"trash" directory hierarchy. Therefore, drop "../" from all the
filenames, and instead create these files directly in the "trash"
directory or in some subdirectory of the "trash" directory.

I presume the reason you're escaping the "trash" directory is because
you don't want these untracked "actual" and "expected" files to
pollute the `git status` output you're testing? If so, then you should
probably instead set up a .gitignore file in the "trash" directory to
make `git status` ignore them.

We usually want to avoid having a Git command upstream of a pipe since
the exit code of the Git command will be lost. So, we'd typically do
this instead:

    git status >out &&
    sed "s/[0-9]\.[0-9][0-9]/X/g" out >actual &&

> +       cat >../expected <<-EOF &&
> +On branch test
> +
> +No commits yet
> +
> +
> +It took X seconds to enumerate untracked files.
> +See '"'"'git help status'"'"' for information on how to improve this.
> +
> +nothing to commit (create/copy files and use "git add" to track)
> +       EOF

Two comments. First, use <<-\EOF rather than <<-EOF to make it clear
to readers that you're not interpolating any variables in the here-doc
body. Second, the <<- operator allows you to indent the here-doc body
(with TABs, not spaces), so you can align the body with the rest of
the code:

    cat >expected <<-\EOF &&
    On branch test
    ...
    EOF

> +       test_cmp ../expected ../actual &&
> +       rm -fr ../actual ../expected
> +'

We usually don't bother cleaning up these files if they don't impact
subsequent tests negatively. However, note that if any code above the
`rm -fr` command fails, then the `rm -fr` command itself won't be run,
hence the files won't get cleaned up upon test failure. To ensure
cleanup (if that's what you desire), use test_when_finished() _before_
code which may fail. So, something like this is typical:

    test_expect_success 'when core.untrackedCache and fsmonitor are unset' '
        test_when_finished "rm -fr actual expected" &&
        test_might_fail git config ---unset-all core.untrackedCache &&
        test_might_fail git config ---unset-all core.fsmonitor &&
        ...
    '

Same comments apply to the other new tests added by this patch.

> diff --git a/wt-status.c b/wt-status.c
> @@ -1205,6 +1207,17 @@ static void wt_longstatus_print_tracking(struct wt_status *s)
> +static inline int uf_was_slow(uint32_t untracked_in_ms)
> +{

Does this need to be inline? Is this a hot piece of code, or is this
merely a premature optimization?

> +       const char *x;
> +       x = getenv("GIT_TEST_UF_DELAY_WARNING");
> +       if (x) {
> +               untracked_in_ms += UF_DELAY_WARNING_IN_MS + 1;
> +       }

Style is to avoid { } braces for these one-liner bodies.

I think we don't even need the variable "x" in this case, so:

    if (getenv("GIT_TEST_UF_DELAY_WARNING"))
        untracked_in_ms += UF_DELAY_WARNING_IN_MS + 1;

would be cleaner.

> +       return UF_DELAY_WARNING_IN_MS < untracked_in_ms;
> +}
Rudy Rigot Nov. 10, 2022, 5:01 p.m. UTC | #2
> the <<- operator allows you to indent the here-doc body
> (with TABs, not spaces), so you can align the body with the rest of
> the code

Unfortunately, that's how I had done it first, but since some of those
lines are blank, the test code had lines just made of "<tab><tab>" and
nothing else, which made the check-whitespace check fail. I considered
replacing empty line with something on the fly with sed (like just an
"x" character for instance), but this felt hacky and brittle (in the
unlikely case where an actual "x" would find itself genuinely lost in
the middle of that output, the test would mistakenly pass). I went
with the solution I'm presenting here because the readability
downsides of missing that indentation felt less bad. Definitely
willing to be convinced though.

I have no issues with anything else from your review, and I'm planning
to integrate it all. More specifically:

> I presume the reason you're escaping the "trash" directory is because
> you don't want these untracked "actual" and "expected" files to
> pollute the `git status` output you're testing?

You are presuming right! The test was being flappy in CI runs before I
changed this, which I found used as a solution in other
git-status-related tests currently in the codebase. I'm not familiar
with the trash directory approach, but I'll figure it out.

> Does this need to be inline? Is this a hot piece of code, or is this
> merely a premature optimization?

I'll admit my limits, I'm not familiar enough to know. If you feel the
inline is unnecessary here, I'm glad to trust you on it, I'll remove
it.

Thanks a lot for your in-depth review, I am planning to integrate all
the other feedback, and bring another iteration forward, possibly
today.
Eric Sunshine Nov. 10, 2022, 5:30 p.m. UTC | #3
On Thu, Nov 10, 2022 at 12:02 PM Rudy Rigot <rudy.rigot@gmail.com> wrote:
> > the <<- operator allows you to indent the here-doc body
> > (with TABs, not spaces), so you can align the body with the rest of
> > the code
>
> Unfortunately, that's how I had done it first, but since some of those
> lines are blank, the test code had lines just made of "<tab><tab>" and
> nothing else, which made the check-whitespace check fail. I considered
> replacing empty line with something on the fly with sed (like just an
> "x" character for instance), but this felt hacky and brittle (in the
> unlikely case where an actual "x" would find itself genuinely lost in
> the middle of that output, the test would mistakenly pass). I went
> with the solution I'm presenting here because the readability
> downsides of missing that indentation felt less bad. Definitely
> willing to be convinced though.

Okay, I see what you're getting at. Fortunately, there is a simple
solution as long as those lines are truly blank as emitted by `git
status`: just leave the blank lines completely blank in the here-doc
body (don't bother inserting a TAB on the blank line). This should
product the exact output you want:

    cat >../expected <<-\EOF &&
    On branch test

    No commits yet
    ...
    EOF

Although it should not be needed here, the `sed` approach is generally
fine, and we use it often enough in tests, though usually with a more
uncommon letter such as "Q". See, for instance, the q_to_nul(),
q_to_tab(), etc. functions in t/test-lib-functions.sh.

> > I presume the reason you're escaping the "trash" directory is because
> > you don't want these untracked "actual" and "expected" files to
> > pollute the `git status` output you're testing?
>
> You are presuming right! The test was being flappy in CI runs before I
> changed this, which I found used as a solution in other
> git-status-related tests currently in the codebase. I'm not familiar
> with the trash directory approach, but I'll figure it out.

Each test script is run in a temporary "trash" directory which gets
thrown away when the script finishes. We want tests to constrain
themselves to the trash directory so they don't inadvertently destroy
a user's files outside the directory.

I see what you mean about some existing status-related tests using
files such as "../actual" and "../expect". It's not at all obvious in
a lot of those cases but they are safe[*] because those tests have
already cd'd into a subdirectory of the "trash" directory, thus
"../actual" is referring to the "trash" directory itself, hence the
tests do constrain themselves to "trash".

Anyhow, I suspect that crafting a custom .gitignore file in the test
setup should satisfy this particular case and allow "actual" and
"expected" to reside in the "trash" directory itself without mucking
up `git status` output.

[*] Unfortunately, some of those scripts are poorly structured because
they `cd` around between tests, which can leave CWD in an unexpected
state if some test fails and subsequent tests expect CWD to be
somewhere other than where it was left by the failed test. These days,
we only allow tests to `cd` within a subshell so that CWD is restored
automatically whether the test itself succeeds or fails. So, this is
safe:

    test_expect_success 'title' '
        do something &&
        mkdir foo &&
        (
            cd foo &&
            do something else >../actual &&
        )
        grep foo actual
    '
Rudy Rigot Nov. 10, 2022, 5:47 p.m. UTC | #4
Oh, thanks, all of this helps a ton. I know exactly what to do about
those two things now.


On Thu, Nov 10, 2022 at 11:30 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Thu, Nov 10, 2022 at 12:02 PM Rudy Rigot <rudy.rigot@gmail.com> wrote:
> > > the <<- operator allows you to indent the here-doc body
> > > (with TABs, not spaces), so you can align the body with the rest of
> > > the code
> >
> > Unfortunately, that's how I had done it first, but since some of those
> > lines are blank, the test code had lines just made of "<tab><tab>" and
> > nothing else, which made the check-whitespace check fail. I considered
> > replacing empty line with something on the fly with sed (like just an
> > "x" character for instance), but this felt hacky and brittle (in the
> > unlikely case where an actual "x" would find itself genuinely lost in
> > the middle of that output, the test would mistakenly pass). I went
> > with the solution I'm presenting here because the readability
> > downsides of missing that indentation felt less bad. Definitely
> > willing to be convinced though.
>
> Okay, I see what you're getting at. Fortunately, there is a simple
> solution as long as those lines are truly blank as emitted by `git
> status`: just leave the blank lines completely blank in the here-doc
> body (don't bother inserting a TAB on the blank line). This should
> product the exact output you want:
>
>     cat >../expected <<-\EOF &&
>     On branch test
>
>     No commits yet
>     ...
>     EOF
>
> Although it should not be needed here, the `sed` approach is generally
> fine, and we use it often enough in tests, though usually with a more
> uncommon letter such as "Q". See, for instance, the q_to_nul(),
> q_to_tab(), etc. functions in t/test-lib-functions.sh.
>
> > > I presume the reason you're escaping the "trash" directory is because
> > > you don't want these untracked "actual" and "expected" files to
> > > pollute the `git status` output you're testing?
> >
> > You are presuming right! The test was being flappy in CI runs before I
> > changed this, which I found used as a solution in other
> > git-status-related tests currently in the codebase. I'm not familiar
> > with the trash directory approach, but I'll figure it out.
>
> Each test script is run in a temporary "trash" directory which gets
> thrown away when the script finishes. We want tests to constrain
> themselves to the trash directory so they don't inadvertently destroy
> a user's files outside the directory.
>
> I see what you mean about some existing status-related tests using
> files such as "../actual" and "../expect". It's not at all obvious in
> a lot of those cases but they are safe[*] because those tests have
> already cd'd into a subdirectory of the "trash" directory, thus
> "../actual" is referring to the "trash" directory itself, hence the
> tests do constrain themselves to "trash".
>
> Anyhow, I suspect that crafting a custom .gitignore file in the test
> setup should satisfy this particular case and allow "actual" and
> "expected" to reside in the "trash" directory itself without mucking
> up `git status` output.
>
> [*] Unfortunately, some of those scripts are poorly structured because
> they `cd` around between tests, which can leave CWD in an unexpected
> state if some test fails and subsequent tests expect CWD to be
> somewhere other than where it was left by the failed test. These days,
> we only allow tests to `cd` within a subshell so that CWD is restored
> automatically whether the test itself succeeds or fails. So, this is
> safe:
>
>     test_expect_success 'title' '
>         do something &&
>         mkdir foo &&
>         (
>             cd foo &&
>             do something else >../actual &&
>         )
>         grep foo actual
>     '
diff mbox series

Patch

diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index 54a4b29b473..c51ba5e79e1 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -457,6 +457,61 @@  during the write may conflict with other simultaneous processes, causing
 them to fail. Scripts running `status` in the background should consider
 using `git --no-optional-locks status` (see linkgit:git[1] for details).
 
+UNTRACKED FILES AND STATUS SPEED
+--------------------------------
+
+`git status` can be very slow in large worktrees if/when it
+needs to search for untracked files and directories.  There are
+many configuration options available to speed this up by either
+avoiding the work or making use of cached results from previous
+Git commands.  Since we all work in different ways, there is no
+single optimum set of settings right for everyone.  Here is a
+brief summary of the relevant options to help you choose which
+is right for you.  Each of these settings is independently
+documented elsewhere in more detail, so please refer to them
+for complete details.
+
+* The `-uno` flag or the `status.showUntrackedfiles=false`
+    config : indicate that `git status` should not report untracked
+	files. This is the fastest option. `git status` will not list
+	the untracked files, so you need to be careful to remember if
+	you create any new files and manually `git add` them.
+
+* `advice.statusUoption=false` : this config option disables a
+	warning message when the search for untracked files takes longer
+	than desired. In some large repositories, this message may appear
+	frequently and not be a helpful signal.
+
+* `core.untrackedCache=true` : enable the untracked cache feature
+    and only search directories that have been modified since the
+    previous `git status` command.  Git remembers the set of
+    untracked files within each directory and assumes that if a
+    directory has not been modified, then the set of untracked
+    file within has not changed.  This is much faster than
+    enumerating the contents of every directory, but still not
+    without cost, because Git still has to search for the set of
+    modified directories. The untracked cache is stored in the
+	.git/index file. The reduced cost searching for untracked
+	files is offset slightly by the increased size of the index and
+	the cost of keeping it up-to-date. That reduced search time is
+	usually worth the additional size.
+
+* `core.untrackedCache=true` and `core.fsmonitor=true` or
+    `core.fsmonitor=<hook_command_pathname>` : enable both the
+    untracked cache and FSMonitor features and only search
+    directories that have been modified since the previous
+    `git status` command.  This is faster than using just the
+    untracked cache alone because Git can also avoid searching
+    for modified directories.  Git only has to enumerate the
+    exact set of directories that have changed recently. While
+	the FSMonitor feature can be enabled without the untracked
+	cache, the benefits are greatly reduced in that case.
+
+Note that after you turn on the untracked cache and/or FSMonitor
+features it may take a few `git status` commands for the various
+caches to warm up before you see improved command times.  This is
+normal.
+
 SEE ALSO
 --------
 linkgit:gitignore[5]
diff --git a/t/t7065-wtstatus-slow.sh b/t/t7065-wtstatus-slow.sh
new file mode 100755
index 00000000000..468b8934836
--- /dev/null
+++ b/t/t7065-wtstatus-slow.sh
@@ -0,0 +1,74 @@ 
+#!/bin/sh
+
+test_description='test status when slow untracked files'
+
+. ./test-lib.sh
+
+DATA="$TEST_DIRECTORY/t7065"
+
+GIT_TEST_UF_DELAY_WARNING=1
+export GIT_TEST_UF_DELAY_WARNING
+
+test_expect_success setup '
+	git checkout -b test
+'
+
+test_expect_success 'when core.untrackedCache and fsmonitor are unset' '
+	test_must_fail git config --get core.untrackedCache &&
+	test_must_fail git config --get core.fsmonitor &&
+	git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
+	cat >../expected <<-EOF &&
+On branch test
+
+No commits yet
+
+
+It took X seconds to enumerate untracked files.
+See '"'"'git help status'"'"' for information on how to improve this.
+
+nothing to commit (create/copy files and use "git add" to track)
+	EOF
+	test_cmp ../expected ../actual &&
+	rm -fr ../actual ../expected
+'
+
+test_expect_success 'when core.untrackedCache true, but not fsmonitor' '
+	git config core.untrackedCache true &&
+	test_must_fail git config --get core.fsmonitor &&
+	git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
+	cat >../expected <<-EOF &&
+On branch test
+
+No commits yet
+
+
+It took X seconds to enumerate untracked files.
+See '"'"'git help status'"'"' for information on how to improve this.
+
+nothing to commit (create/copy files and use "git add" to track)
+	EOF
+	test_cmp ../expected ../actual &&
+	rm -fr ../actual ../expected
+'
+
+test_expect_success 'when core.untrackedCache true, and fsmonitor' '
+	git config core.untrackedCache true &&
+	git config core.fsmonitor true &&
+	git status | sed "s/[0-9]\.[0-9][0-9]/X/g" >../actual &&
+	cat >../expected <<-EOF &&
+On branch test
+
+No commits yet
+
+
+It took X seconds to enumerate untracked files,
+but this is currently being cached.
+See '"'"'git help status'"'"' for information on how to improve this.
+
+nothing to commit (create/copy files and use "git add" to track)
+	EOF
+	test_cmp ../expected ../actual &&
+	rm -fr ../actual ../expected
+'
+
+test_done
diff --git a/wt-status.c b/wt-status.c
index 5813174896c..336f41e6d9f 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -18,8 +18,10 @@ 
 #include "worktree.h"
 #include "lockfile.h"
 #include "sequencer.h"
+#include "fsmonitor-settings.h"
 
 #define AB_DELAY_WARNING_IN_MS (2 * 1000)
+#define UF_DELAY_WARNING_IN_MS (2 * 1000)
 
 static const char cut_line[] =
 "------------------------ >8 ------------------------\n";
@@ -1205,6 +1207,17 @@  static void wt_longstatus_print_tracking(struct wt_status *s)
 	strbuf_release(&sb);
 }
 
+static inline int uf_was_slow(uint32_t untracked_in_ms)
+{
+	const char *x;
+	x = getenv("GIT_TEST_UF_DELAY_WARNING");
+	if (x) {
+		untracked_in_ms += UF_DELAY_WARNING_IN_MS + 1;
+	}
+
+	return UF_DELAY_WARNING_IN_MS < untracked_in_ms;
+}
+
 static void show_merge_in_progress(struct wt_status *s,
 				   const char *color)
 {
@@ -1814,6 +1827,7 @@  static void wt_longstatus_print(struct wt_status *s)
 {
 	const char *branch_color = color(WT_STATUS_ONBRANCH, s);
 	const char *branch_status_color = color(WT_STATUS_HEADER, s);
+	enum fsmonitor_mode fsm_mode = fsm_settings__get_mode(s->repo);
 
 	if (s->branch) {
 		const char *on_what = _("On branch ");
@@ -1870,13 +1884,21 @@  static void wt_longstatus_print(struct wt_status *s)
 		wt_longstatus_print_other(s, &s->untracked, _("Untracked files"), "add");
 		if (s->show_ignored_mode)
 			wt_longstatus_print_other(s, &s->ignored, _("Ignored files"), "add -f");
-		if (advice_enabled(ADVICE_STATUS_U_OPTION) && 2000 < s->untracked_in_ms) {
+		if (uf_was_slow(s->untracked_in_ms) && advice_enabled(ADVICE_STATUS_U_OPTION)) {
 			status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
+			if (fsm_mode > FSMONITOR_MODE_DISABLED) {
+				status_printf_ln(s, GIT_COLOR_NORMAL,
+						_("It took %.2f seconds to enumerate untracked files,\n"
+						"but this is currently being cached."),
+						s->untracked_in_ms / 1000.0);
+			} else {
+				status_printf_ln(s, GIT_COLOR_NORMAL,
+						_("It took %.2f seconds to enumerate untracked files."),
+						s->untracked_in_ms / 1000.0);
+			}
 			status_printf_ln(s, GIT_COLOR_NORMAL,
-					 _("It took %.2f seconds to enumerate untracked files. 'status -uno'\n"
-					   "may speed it up, but you have to be careful not to forget to add\n"
-					   "new files yourself (see 'git help status')."),
-					 s->untracked_in_ms / 1000.0);
+					_("See 'git help status' for information on how to improve this."));
+			status_printf_ln(s, GIT_COLOR_NORMAL, "%s", "");
 		}
 	} else if (s->committable)
 		status_printf_ln(s, GIT_COLOR_NORMAL, _("Untracked files not listed%s"),