[v3,12/12] grep: use no. of cores as the default no. of threads
diff mbox series

Message ID a5891176d7778b98ac35c756170dd334b8ee21c7.1579141989.git.matheus.bernardino@usp.br
State New
Headers show
Series
  • grep: improve threading and fix race conditions
Related show

Commit Message

Matheus Tavares Jan. 16, 2020, 2:40 a.m. UTC
When --threads is not specified, git-grep will use 8 threads by default.
This fixed number may be too many for machines with fewer cores and too
little for machines with more cores. So, instead, use the number of
logical cores available in the machine, which seems to result in the
best overall performance: The following measurements correspond to the
mean elapsed times for 30 git-grep executions in chromium's
repository[1] with a 95% confidence interval (each set of 30 were
performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
'(static|extern) (int|double) \*'.

      |          Working tree         |           Object Store
------|-------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
------|---------------|---------------|----------------|---------------
  32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
  16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
>  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
   4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
   2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
   1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01

The above tests were performed in a desktop running Debian 10.0 with
Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
RAM and a 7200 rpm, SATA 3.1 HDD.

Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:

      |          Working tree          |           Object Store
------|--------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
------|---------------|----------------|----------------|---------------
  32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
  16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
>  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
   4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
   2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
   1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 4 ++--
 builtin/grep.c             | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

Comments

Victor Leschuk Jan. 16, 2020, 1:11 p.m. UTC | #1
Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
use not online_cpus() but online_cpus()*2?

--
Victor


On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> When --threads is not specified, git-grep will use 8 threads by default.
> This fixed number may be too many for machines with fewer cores and too
> little for machines with more cores. So, instead, use the number of
> logical cores available in the machine, which seems to result in the
> best overall performance: The following measurements correspond to the
> mean elapsed times for 30 git-grep executions in chromium's
> repository[1] with a 95% confidence interval (each set of 30 were
> performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> '(static|extern) (int|double) \*'.
>
>       |          Working tree         |           Object Store
> ------|-------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> ------|---------------|---------------|----------------|---------------
>   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
>   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
>    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
>    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
>    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
>
> The above tests were performed in a desktop running Debian 10.0 with
> Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> RAM and a 7200 rpm, SATA 3.1 HDD.
>
> Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
>
>       |          Working tree          |           Object Store
> ------|--------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> ------|---------------|----------------|----------------|---------------
>   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
>   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
>    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
>    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
>    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08
>
> [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
>      04-06-2019), after a 'git gc' execution.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  Documentation/git-grep.txt | 4 ++--
>  builtin/grep.c             | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index de628741fa..eb5412724f 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -59,8 +59,8 @@ grep.extendedRegexp::
>         other than 'default'.
>
>  grep.threads::
> -       Number of grep worker threads to use.  If unset (or set to 0),
> -       8 threads are used by default (for now).
> +       Number of grep worker threads to use. If unset (or set to 0), Git will
> +       use as many threads as the number of logical cores available.
>
>  grep.fullName::
>         If set to true, enable `--full-name` option by default.
> diff --git a/builtin/grep.c b/builtin/grep.c
> index a85b710b48..629eaf5dbc 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -33,7 +33,6 @@ static char const * const grep_usage[] = {
>
>  static int recurse_submodules;
>
> -#define GREP_NUM_THREADS_DEFAULT 8
>  static int num_threads;
>
>  static pthread_t *threads;
> @@ -1064,7 +1063,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>         } else if (num_threads < 0)
>                 die(_("invalid number of threads specified (%d)"), num_threads);
>         else if (num_threads == 0)
> -               num_threads = HAVE_THREADS ? GREP_NUM_THREADS_DEFAULT : 1;
> +               num_threads = HAVE_THREADS ? online_cpus() : 1;
>
>         if (num_threads > 1) {
>                 if (!HAVE_THREADS)
> --
> 2.24.1
>
Matheus Tavares Jan. 16, 2020, 2:47 p.m. UTC | #2
Hi, Victor

On Thu, Jan 16, 2020 at 10:11 AM Victor Leschuk <vleschuk@gmail.com> wrote:
>
> Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
> use not online_cpus() but online_cpus()*2?

I also tried this approach, but the tests I ran with online_cpus() * 2
only showed slowdowns. The results can be seen in the commit message:

> On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> >
[...]
> > The following measurements correspond to the
> > mean elapsed times for 30 git-grep executions in chromium's
> > repository[1] with a 95% confidence interval (each set of 30 were
> > performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> > '(static|extern) (int|double) \*'.
> >
> >       |          Working tree         |           Object Store
> > ------|-------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> > ------|---------------|---------------|----------------|---------------
> >   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
> >   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> > >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
> >    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
> >    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
> >    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
> >
> > The above tests were performed in a desktop running Debian 10.0 with
> > Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> > RAM and a 7200 rpm, SATA 3.1 HDD.
> >
> > Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> > with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
> >
> >       |          Working tree          |           Object Store
> > ------|--------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> > ------|---------------|----------------|----------------|---------------
> >   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
> >   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> > >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
> >    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
> >    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
> >    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08

I deliberately used somehow complex regexes for these tests. So I
decided to do one more test with a very simple fixed string ("abc"),
allowing git-grep to spend less time in the cpu-bound regex searching.
The results can be seen bellow (the metodology is the same as described
above and the machine is the Manjaro laptop, for which online_cpus()
returns 8):

  #ths |  Working Three |  Object Store
 ------|----------------|---------------
    16 |  3.22s ± 0.20  |  5.96s ± 0.06
     8 |  2.92s ± 0.01  |  5.73s ± 0.02

Patch
diff mbox series

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index de628741fa..eb5412724f 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -59,8 +59,8 @@  grep.extendedRegexp::
 	other than 'default'.
 
 grep.threads::
-	Number of grep worker threads to use.  If unset (or set to 0),
-	8 threads are used by default (for now).
+	Number of grep worker threads to use. If unset (or set to 0), Git will
+	use as many threads as the number of logical cores available.
 
 grep.fullName::
 	If set to true, enable `--full-name` option by default.
diff --git a/builtin/grep.c b/builtin/grep.c
index a85b710b48..629eaf5dbc 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -33,7 +33,6 @@  static char const * const grep_usage[] = {
 
 static int recurse_submodules;
 
-#define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
 
 static pthread_t *threads;
@@ -1064,7 +1063,7 @@  int cmd_grep(int argc, const char **argv, const char *prefix)
 	} else if (num_threads < 0)
 		die(_("invalid number of threads specified (%d)"), num_threads);
 	else if (num_threads == 0)
-		num_threads = HAVE_THREADS ? GREP_NUM_THREADS_DEFAULT : 1;
+		num_threads = HAVE_THREADS ? online_cpus() : 1;
 
 	if (num_threads > 1) {
 		if (!HAVE_THREADS)