diff mbox series

[27/27] cache-tree: integrate with sparse directory entries

Message ID 05e7548b780da6b2bf2342d91d8757568df0a6b8.1611596534.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Sparse Index | expand

Commit Message

Derrick Stolee Jan. 25, 2021, 5:42 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The cache-tree extension was previously disabled with sparse indexes.
However, the cache-tree is an important performance feature for commands
like 'git status' and 'git add'. Integrate it with sparse directory
entries.

When writing a sparse index, completely clear and recalculate the cache
tree. By starting from scratch, the only integration necessary is to
check if we hit a sparse directory entry and create a leaf of the
cache-tree that has an entry_count of one and no subtrees.

Once the cache-tree exists within a sparse index, we finally get
improved performance. I test the sparse index performance using a
private monorepo with over 2.1 million files at HEAD, but with a
sparse-checkout definition that has only 68,000 paths in the populated
cone. The sparse index has about 2,000 sparse directory entries. I
compare three scenarios:

 1. Use the full index. The index size is ~186 MB.
 2. Use the sparse index. The index size is ~5.5 MB.
 3. Use a commit where HEAD matches the populated set. The full index
    size is ~5.3MB.

The third benchmark is included as a theoretical optimium for a
repository of the same object database.

First, a clean 'git status' improves from 3.1s to 240ms.

Benchmark #1: full index (git status)
  Time (mean ± σ):      3.167 s ±  0.036 s    [User: 2.006 s, System: 1.078 s]
  Range (min … max):    3.100 s …  3.208 s    10 runs

Benchmark #2: sparse index (git status)
  Time (mean ± σ):     239.5 ms ±   8.1 ms    [User: 189.4 ms, System: 226.8 ms]
  Range (min … max):   226.0 ms … 251.9 ms    13 runs

Benchmark #3: small tree (git status)
  Time (mean ± σ):     195.3 ms ±   4.5 ms    [User: 116.5 ms, System: 84.4 ms]
  Range (min … max):   188.8 ms … 202.8 ms    15 runs

The optimimum is still 45ms faster. This is due in part to the 2,000+
sparse directory entries, but there might be other optimizations to make
in the sparse-index case. In particular, I find that this performance
difference disappears when I disable FS Monitor, which is somewhat
disabled in the sparse-index case, but might still be adding overhead.

The performance numbers for 'git add .' are much closer to optimal:

Benchmark #1: full index (git add .)
  Time (mean ± σ):      3.076 s ±  0.022 s    [User: 2.065 s, System: 0.943 s]
  Range (min … max):    3.044 s …  3.116 s    10 runs

Benchmark #2: sparse index (git add .)
  Time (mean ± σ):     218.0 ms ±   6.6 ms    [User: 195.7 ms, System: 206.6 ms]
  Range (min … max):   209.8 ms … 228.2 ms    13 runs

Benchmark #3: small tree (git add .)
  Time (mean ± σ):     217.6 ms ±   5.4 ms    [User: 131.9 ms, System: 86.7 ms]
  Range (min … max):   212.1 ms … 228.4 ms    14 runs

In this test, I also used "echo >>README.md" to append a line to the
README.md file, so the 'git add .' command is doing _something_ other
than a no-op. Without this edit (and FS Monitor enabled) the small
tree case again gains about 30ms on the sparse index case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 cache-tree.c   | 18 ++++++++++++++++++
 sparse-index.c | 10 +++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

Comments

Elijah Newren Feb. 1, 2021, 11:54 p.m. UTC | #1
On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The cache-tree extension was previously disabled with sparse indexes.
> However, the cache-tree is an important performance feature for commands
> like 'git status' and 'git add'. Integrate it with sparse directory
> entries.
>
> When writing a sparse index, completely clear and recalculate the cache
> tree. By starting from scratch, the only integration necessary is to
> check if we hit a sparse directory entry and create a leaf of the
> cache-tree that has an entry_count of one and no subtrees.
>
> Once the cache-tree exists within a sparse index, we finally get
> improved performance. I test the sparse index performance using a
> private monorepo with over 2.1 million files at HEAD, but with a
> sparse-checkout definition that has only 68,000 paths in the populated
> cone. The sparse index has about 2,000 sparse directory entries. I
> compare three scenarios:

How many .gitignore entries does this monorepo have?  What percentage
of those are populated for #2 and #3?

Can a testcase be devised that others can repeat?  For example, I
created https://github.com/newren/gvfs-like-git-bomb once upon a time
to create a very small repository with a very large index and a lot of
skip-worktree entries, mostly to test some stuff that someone (Ben
Peart?) mentioned as being slow and verify for myself that it wasn't
Windows specific.

>  1. Use the full index. The index size is ~186 MB.
>  2. Use the sparse index. The index size is ~5.5 MB.
>  3. Use a commit where HEAD matches the populated set. The full index
>     size is ~5.3MB.

I'm not sure I'm understanding the difference between #2 and #3, other
than #3 is smaller.  How did you form #2?  Also, what do you mean by
"full index size" for #3, when it's the smallest?  Isn't that index
the most sparse (or least full)?  Or is it an index for a different
commit entirely that has far fewer files in it?

> The third benchmark is included as a theoretical optimium for a

s/optimium/optimum/

> repository of the same object database.

This I'm also not understanding, but maybe this goes back to not
understanding the difference in how #2 and #3 are constructed.

> First, a clean 'git status' improves from 3.1s to 240ms.
>
> Benchmark #1: full index (git status)
>   Time (mean ± σ):      3.167 s ±  0.036 s    [User: 2.006 s, System: 1.078 s]
>   Range (min … max):    3.100 s …  3.208 s    10 runs
>
> Benchmark #2: sparse index (git status)
>   Time (mean ± σ):     239.5 ms ±   8.1 ms    [User: 189.4 ms, System: 226.8 ms]
>   Range (min … max):   226.0 ms … 251.9 ms    13 runs
>
> Benchmark #3: small tree (git status)
>   Time (mean ± σ):     195.3 ms ±   4.5 ms    [User: 116.5 ms, System: 84.4 ms]
>   Range (min … max):   188.8 ms … 202.8 ms    15 runs

Always nice to see a speedup factor greater than 10.  :-)

>
> The optimimum is still 45ms faster. This is due in part to the 2,000+

s/optimimum/optimum/

> sparse directory entries, but there might be other optimizations to make
> in the sparse-index case. In particular, I find that this performance
> difference disappears when I disable FS Monitor, which is somewhat
> disabled in the sparse-index case, but might still be adding overhead.

The FS monitor wording is unclear to me; it feels like multiple negatives.

> The performance numbers for 'git add .' are much closer to optimal:
>
> Benchmark #1: full index (git add .)
>   Time (mean ± σ):      3.076 s ±  0.022 s    [User: 2.065 s, System: 0.943 s]
>   Range (min … max):    3.044 s …  3.116 s    10 runs
>
> Benchmark #2: sparse index (git add .)
>   Time (mean ± σ):     218.0 ms ±   6.6 ms    [User: 195.7 ms, System: 206.6 ms]
>   Range (min … max):   209.8 ms … 228.2 ms    13 runs
>
> Benchmark #3: small tree (git add .)
>   Time (mean ± σ):     217.6 ms ±   5.4 ms    [User: 131.9 ms, System: 86.7 ms]
>   Range (min … max):   212.1 ms … 228.4 ms    14 runs
>
> In this test, I also used "echo >>README.md" to append a line to the
> README.md file, so the 'git add .' command is doing _something_ other
> than a no-op. Without this edit (and FS Monitor enabled) the small
> tree case again gains about 30ms on the sparse index case.

Meaning the small tree is 30 ms faster than reported here, or 30 ms
slower, or that both sparse index and small tree are faster but the
small tree decreases its time more than the sparse index one does?

Sorry, I don't mean to be dense, I'm just struggling with
understanding words today it seems.  (Also, it seems like there's a
joke in there about me being "dense" in a review of a "sparse"
feature...but I'm not quite coming up with it.)

> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  cache-tree.c   | 18 ++++++++++++++++++
>  sparse-index.c | 10 +++++++++-
>  2 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/cache-tree.c b/cache-tree.c
> index 5f07a39e501..9da6a4394e0 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -256,6 +256,24 @@ static int update_one(struct cache_tree *it,
>
>         *skip_count = 0;
>
> +       /*
> +        * If the first entry of this region is a sparse directory
> +        * entry corresponding exactly to 'base', then this cache_tree
> +        * struct is a "leaf" in the data structure, pointing to the
> +        * tree OID specified in the entry.
> +        */
> +       if (entries > 0) {
> +               const struct cache_entry *ce = cache[0];
> +
> +               if (S_ISSPARSEDIR(ce) &&
> +                   ce->ce_namelen == baselen &&
> +                   !strncmp(ce->name, base, baselen)) {
> +                       it->entry_count = 1;
> +                       oidcpy(&it->oid, &ce->oid);
> +                       return 1;
> +               }
> +       }
> +
>         if (0 <= it->entry_count && has_object_file(&it->oid))
>                 return it->entry_count;
>
> diff --git a/sparse-index.c b/sparse-index.c
> index a201f3b905c..9ea3b321400 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -181,7 +181,11 @@ int convert_to_sparse(struct index_state *istate)
>         istate->cache_nr = convert_to_sparse_rec(istate,
>                                                  0, 0, istate->cache_nr,
>                                                  "", 0, istate->cache_tree);
> -       istate->drop_cache_tree = 1;
> +
> +       /* Clear and recompute the cache-tree */
> +       cache_tree_free(&istate->cache_tree);
> +       cache_tree_update(istate, 0);
> +
>         istate->sparse_index = 1;
>         trace2_region_leave("index", "convert_to_sparse", istate->repo);
>         return 0;
> @@ -278,6 +282,10 @@ void ensure_full_index(struct index_state *istate)
>
>         free(full);
>
> +       /* Clear and recompute the cache-tree */
> +       cache_tree_free(&istate->cache_tree);
> +       cache_tree_update(istate, 0);
> +
>         trace2_region_leave("index", "ensure_full_index", istate->repo);
>  }
>
> --
> gitgitgadget

This is very exciting work!!
Derrick Stolee Feb. 2, 2021, 2:41 a.m. UTC | #2
On 2/1/2021 6:54 PM, Elijah Newren wrote:
> On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> In this test, I also used "echo >>README.md" to append a line to the
>> README.md file, so the 'git add .' command is doing _something_ other
>> than a no-op. Without this edit (and FS Monitor enabled) the small
>> tree case again gains about 30ms on the sparse index case.
> 
> Meaning the small tree is 30 ms faster than reported here, or 30 ms
> slower, or that both sparse index and small tree are faster but the
> small tree decreases its time more than the sparse index one does?
> 
> Sorry, I don't mean to be dense, I'm just struggling with
> understanding words today it seems.  (Also, it seems like there's a
> joke in there about me being "dense" in a review of a "sparse"
> feature...but I'm not quite coming up with it.)

I don't blame you! This is a lot to digest, and I appreciate you
pushing through to the end of it.

Clearly, I was getting a bit inexact near the end. My excitement
to share this RFC clearly overshadowed my attention to grammatical
detail. I'll go through your feedback more carefully soon and
hopefully clarify these and many other questions.

Thanks,
-Stolee
Elijah Newren Feb. 2, 2021, 3:05 a.m. UTC | #3
On Mon, Feb 1, 2021 at 6:41 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 2/1/2021 6:54 PM, Elijah Newren wrote:
> > On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >> In this test, I also used "echo >>README.md" to append a line to the
> >> README.md file, so the 'git add .' command is doing _something_ other
> >> than a no-op. Without this edit (and FS Monitor enabled) the small
> >> tree case again gains about 30ms on the sparse index case.
> >
> > Meaning the small tree is 30 ms faster than reported here, or 30 ms
> > slower, or that both sparse index and small tree are faster but the
> > small tree decreases its time more than the sparse index one does?
> >
> > Sorry, I don't mean to be dense, I'm just struggling with
> > understanding words today it seems.  (Also, it seems like there's a
> > joke in there about me being "dense" in a review of a "sparse"
> > feature...but I'm not quite coming up with it.)
>
> I don't blame you! This is a lot to digest, and I appreciate you
> pushing through to the end of it.
>
> Clearly, I was getting a bit inexact near the end. My excitement
> to share this RFC clearly overshadowed my attention to grammatical
> detail. I'll go through your feedback more carefully soon and
> hopefully clarify these and many other questions.

I can't blame you for being excited; this series is awesome.  I've
thought we should do something along this direction for years[1].
Sure I found lots of little nitpicks here and there in your series,
but that's just attempting to help find any issues so it can be made
even better; overall I'm super excited about it.

[1] See "crazy idea" paragraph at
https://lore.kernel.org/git/CABPp-BGir_5xyqEfwytDog0rZDydPHXjuqXCpNKk67dVPXjUjA@mail.gmail.com/
diff mbox series

Patch

diff --git a/cache-tree.c b/cache-tree.c
index 5f07a39e501..9da6a4394e0 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -256,6 +256,24 @@  static int update_one(struct cache_tree *it,
 
 	*skip_count = 0;
 
+	/*
+	 * If the first entry of this region is a sparse directory
+	 * entry corresponding exactly to 'base', then this cache_tree
+	 * struct is a "leaf" in the data structure, pointing to the
+	 * tree OID specified in the entry.
+	 */
+	if (entries > 0) {
+		const struct cache_entry *ce = cache[0];
+
+		if (S_ISSPARSEDIR(ce) &&
+		    ce->ce_namelen == baselen &&
+		    !strncmp(ce->name, base, baselen)) {
+			it->entry_count = 1;
+			oidcpy(&it->oid, &ce->oid);
+			return 1;
+		}
+	}
+
 	if (0 <= it->entry_count && has_object_file(&it->oid))
 		return it->entry_count;
 
diff --git a/sparse-index.c b/sparse-index.c
index a201f3b905c..9ea3b321400 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -181,7 +181,11 @@  int convert_to_sparse(struct index_state *istate)
 	istate->cache_nr = convert_to_sparse_rec(istate,
 						 0, 0, istate->cache_nr,
 						 "", 0, istate->cache_tree);
-	istate->drop_cache_tree = 1;
+
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	istate->sparse_index = 1;
 	trace2_region_leave("index", "convert_to_sparse", istate->repo);
 	return 0;
@@ -278,6 +282,10 @@  void ensure_full_index(struct index_state *istate)
 
 	free(full);
 
+	/* Clear and recompute the cache-tree */
+	cache_tree_free(&istate->cache_tree);
+	cache_tree_update(istate, 0);
+
 	trace2_region_leave("index", "ensure_full_index", istate->repo);
 }