diff mbox series

[GSoC,4/4] grep: re-enable threads in some non-worktree cases

Message ID 8c26abe9156e069ad4d19e9f0ce131cd1453f030.1565468806.git.matheus.bernardino@usp.br (mailing list archive)
State New, archived
Headers show
Series grep: re-enable threads when cached, w/ parallel inflation | expand

Commit Message

Matheus Tavares Aug. 10, 2019, 8:27 p.m. UTC
They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops. But now that
zlib inflation can be performed in parallel, for some of git-grep's
options, we can regain the speedup.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.3557s  | 20.8410s
    2    |   9.7170s  | 11.2415s
    8    |   6.1723s  |  6.9378s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ with 16GB of RAM and SSD. But to make sure the
optimization also performs well on HDD, the tests were repeated on an
AMD Turion 64 X2 TL-62 (dual-core) with 4GB of RAM and HDD (SATA-150,
5400 rpm):

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  40.3347s  |  47.6173s
    2    |  27.6547s  |  35.1797s

Unfortunately, textconv and submodules' operations remain thread-unsafe,
needing locks to be safely executed when threaded. Because of that, it's
not currently worthy to grep in parallel with them. So, when --textconv
or --recurse-submodules are given for a non-worktree case, threads are
kept disabled. In order to clarify this behavior, let's also add a
"NOTES" section to Documentation/git-grep.txt explaining the thread
usage details.

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 12 ++++++++++++
 builtin/grep.c             |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)
diff mbox series

Patch

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 2d27969057..9686875fbc 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -330,6 +330,18 @@  EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.
 
+NOTES
+-----
+
+The --threads option (and grep.threads configuration) will be ignored when
+--open-files-in-pager is used, forcing a single-threaded execution.
+
+When grepping the index file (with --cached or giving tree objects), the
+following options will also suppress thread creation:
+
+	--recurse_submodules
+	--textconv
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index fa51392222..e5a9da471a 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1073,7 +1073,8 @@  int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager ||
+	   ((list.nr || cached) && (recurse_submodules || opt.allow_textconv))) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;