diff mbox series

[v2,3/3] grep docs: describe --no-index further and improve formatting a bit

Message ID a7e5151fa615d572ab4ed05519dd277048ce935c.1711302588.git.dsimic@manjaro.org (mailing list archive)
State Superseded
Headers show
Series Assorted improvements salvaged from an earlier series | expand

Commit Message

Dragan Simic March 24, 2024, 5:51 p.m. UTC
Improve the description of --no-index, to make it more clear to the users
what this option actually does under the hood, and what's its purpose.
Describe the dependency between --no-index and either of the --cached and
--untracked options, which cannot be used together.

As part of that, shuffle a couple of the options, to make the documentation
flow a bit better, because it makes more sense to describe first the options
that have something in common, and to after that describe an option that does
something differently.  In more detail, --cached and --untracked both leave
git-grep(1) in the usual state, in which it treats the directory as a local
git repository, unlike --no-index that makes git-grep(1) treat the directory
not as a git repository.

While there, improve the descriptions of grep worker threads a bit, to give
them better context.  Adjust the language a bit, to avoid addressing the
reader directly, which is in general preferred in technical documentation,
because it eliminates the possible element of persuading the user to do
something.  In other words, we should be telling the user what our software
can do, instead of telling the user what to do.

Also perform some minor formatting improvements, to make it clear it's the
git commands, command parameters, and configuration option names.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
---

Notes:
    Changes in v2:
        - Improved the patch description a bit, to make it more clear why
          this patch shuffles some of the options around, and why it changes
          some of the wording to passive voice
        - Reworded the description of --no-index a bit, to not mention the
          name of the utility we're describing, which avoids any possible
          confusion, as pointed out by Jean-Noel Avila [1]
    
    This patch is salvaged from my earlier series, [2] for which it has been
    concluded to be not acceptable for merging, because of possible issues
    with various git scripts. [3]
    
    Compared to the version in the earlier series, this version continues
    the effort to improve the description of --no-index, by also incorporating
    the possible improvements pointed out by Junio. [4]  This version also
    improves the wording of some related descriptions, mainly related to
    grep.threads, and performs some additional small formatting improvements.
    
    [1] https://lore.kernel.org/git/ed050f2d496a6db07e698fd2f1094b81@manjaro.org/
    [2] https://lore.kernel.org/git/cover.1710781235.git.dsimic@manjaro.org/T/#u
    [3] https://lore.kernel.org/git/d8475579f014a90b27efaf6207bc6fb0@manjaro.org/
    [4] https://lore.kernel.org/git/xmqqwmpzrqfv.fsf@gitster.g/

 Documentation/git-grep.txt | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

Comments

Junio C Hamano March 25, 2024, 7:18 p.m. UTC | #1
Dragan Simic <dsimic@manjaro.org> writes:

> +--no-index::
> +	Search files in the current directory that is not managed by Git,
> +	or by ignoring that the current directory is managed by Git.  This
> +	is rather similar to running the regular `grep(1)` utility with its
> +	`-r` option specified, but with some additional benefits, such as
> +	using multiple worker threads to speed up searches.

Sorry for not mentioning this earlier, but I do not think
multi-threaded grep has to be something we own and others cannot
implement.  A richer pathspec globbing [*1*] and logical operation
on match results may be better examples of "additional benefits" if
we really wanted to mention why people might want to use
"--no-index" in a directory that is outside Git.

[Footnote]

 *1* When you want to look for something in files whose name begins
     with "g" but does not have "rc" in it, you'd do

     $ git grep --no-index -c . ':(exclude)*rc*' 'g*'

> ++
> +This option cannot be used together with `--cached` or `--untracked`.
> +See also `grep.fallbackToNoIndex` in 'CONFIGURATION' below.
> +

OK.

>  --threads <num>::
> -	Number of grep worker threads to use.
> -	See `grep.threads` in 'CONFIGURATION' for more information.
> +	Number of `grep` worker threads to use, to speed up searches.
> +	See 'NOTES ON THREADS' and `grep.threads` in 'CONFIGURATION'
> +	for more information.

I actually do not think adding ", to speed up searches" is an
improvement.  But referring to NOTES ON THREADS is a good idea, and
by reading that NOTES ON THREADS section, readers can tell why it
sometimes does not speed things up or even slow them down.

Other than that, looking great.

Thanks.
Dragan Simic March 25, 2024, 7:53 p.m. UTC | #2
On 2024-03-25 20:18, Junio C Hamano wrote:
> Dragan Simic <dsimic@manjaro.org> writes:
> 
>> +--no-index::
>> +	Search files in the current directory that is not managed by Git,
>> +	or by ignoring that the current directory is managed by Git.  This
>> +	is rather similar to running the regular `grep(1)` utility with its
>> +	`-r` option specified, but with some additional benefits, such as
>> +	using multiple worker threads to speed up searches.
> 
> Sorry for not mentioning this earlier, but I do not think
> multi-threaded grep has to be something we own and others cannot
> implement.  A richer pathspec globbing [*1*] and logical operation
> on match results may be better examples of "additional benefits" if
> we really wanted to mention why people might want to use
> "--no-index" in a directory that is outside Git.

No worries.  Furthermore, multi-threaded searches may actually not
be faster for some workloads, or on some systems (the resulting speed
depends heavily on the I/O speed and available CPU power, which not
all systems have plenty of), so I agree that it's better not to mention
it.  I'll send the v3 with adjusted wording.

> [Footnote]
> 
>  *1* When you want to look for something in files whose name begins
>      with "g" but does not have "rc" in it, you'd do
> 
>      $ git grep --no-index -c . ':(exclude)*rc*' 'g*'
> 
>>  --threads <num>::
>> -	Number of grep worker threads to use.
>> -	See `grep.threads` in 'CONFIGURATION' for more information.
>> +	Number of `grep` worker threads to use, to speed up searches.
>> +	See 'NOTES ON THREADS' and `grep.threads` in 'CONFIGURATION'
>> +	for more information.
> 
> I actually do not think adding ", to speed up searches" is an
> improvement.  But referring to NOTES ON THREADS is a good idea, and
> by reading that NOTES ON THREADS section, readers can tell why it
> sometimes does not speed things up or even slow them down.

Agreed, as noted above.  Will be adjusted in the v3.

> Other than that, looking great.

Nice, that makes me happy. :)
diff mbox series

Patch

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index f64f40e9775a..bfa87ba22ed5 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -28,7 +28,7 @@  SYNOPSIS
 	   [-f <file>] [-e] <pattern>
 	   [--and|--or|--not|(|)|-e <pattern>...]
 	   [--recurse-submodules] [--parent-basename <basename>]
-	   [ [--[no-]exclude-standard] [--cached | --no-index | --untracked] | <tree>...]
+	   [ [--[no-]exclude-standard] [--cached | --untracked | --no-index] | <tree>...]
 	   [--] [<pathspec>...]
 
 DESCRIPTION
@@ -45,13 +45,20 @@  OPTIONS
 	Instead of searching tracked files in the working tree, search
 	blobs registered in the index file.
 
---no-index::
-	Search files in the current directory that is not managed by Git.
-
 --untracked::
 	In addition to searching in the tracked files in the working
 	tree, search also in untracked files.
 
+--no-index::
+	Search files in the current directory that is not managed by Git,
+	or by ignoring that the current directory is managed by Git.  This
+	is rather similar to running the regular `grep(1)` utility with its
+	`-r` option specified, but with some additional benefits, such as
+	using multiple worker threads to speed up searches.
++
+This option cannot be used together with `--cached` or `--untracked`.
+See also `grep.fallbackToNoIndex` in 'CONFIGURATION' below.
+
 --no-exclude-standard::
 	Also search in ignored files by not honoring the `.gitignore`
 	mechanism. Only useful with `--untracked`.
@@ -248,8 +255,9 @@  providing this option will cause it to die.
 	a non-zero status.
 
 --threads <num>::
-	Number of grep worker threads to use.
-	See `grep.threads` in 'CONFIGURATION' for more information.
+	Number of `grep` worker threads to use, to speed up searches.
+	See 'NOTES ON THREADS' and `grep.threads` in 'CONFIGURATION'
+	for more information.
 
 -f <file>::
 	Read patterns from <file>, one per line.
@@ -336,9 +344,9 @@  The `--threads` option (and the `grep.threads` configuration) will be ignored wh
 `--open-files-in-pager` is used, forcing a single-threaded execution.
 
 When grepping the object store (with `--cached` or giving tree objects), running
-with multiple threads might perform slower than single threaded if `--textconv`
-is given and there are too many text conversions. So if you experience low
-performance in this case, it might be desirable to use `--threads=1`.
+with multiple threads might perform slower than single-threaded if `--textconv`
+is given and there are too many text conversions.  Thus, if low performance is
+experienced in this case, it might be desirable to use `--threads=1`.
 
 CONFIGURATION
 -------------