diff mbox series

[06/10] dir: use expand_to_path() for sparse directories

Message ID 434306541613cbd0b9bb4ebb3102d97e3df9eb94.1618322497.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Sparse-index: integrate with status and add | expand

Commit Message

Derrick Stolee April 13, 2021, 2:01 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The recently-implemented expand_to_path() method can supply position
queries a faster response if they are specifically asking for a path
within the sparse cone. Since this is the most-common scenario, this
provides a significant speedup.

Update t1092-sparse-checkout-compatibility.sh to fully ensure that 'git
status' does not expand a sparse index to a full one, even when there
exist untracked files.

The performance test script p2000-sparse-operations.sh demonstrates
that this is the final hole to fill to allow 'git status' to speed up
when using a sparse index:

Test                                  HEAD~1            HEAD
------------------------------------------------------------------------------
2000.4: git status (sparse-index-v3)  1.50(1.43+0.10)   0.04(0.04+0.03) -97.3%
2000.5: git status (sparse-index-v4)  1.50(1.43+0.10)   0.04(0.03+0.04) -97.3%

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1092-sparse-checkout-compatibility.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Elijah Newren April 21, 2021, 12:52 a.m. UTC | #1
On Tue, Apr 13, 2021 at 7:01 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The recently-implemented expand_to_path() method can supply position
> queries a faster response if they are specifically asking for a path
> within the sparse cone. Since this is the most-common scenario, this
> provides a significant speedup.
>
> Update t1092-sparse-checkout-compatibility.sh to fully ensure that 'git
> status' does not expand a sparse index to a full one, even when there
> exist untracked files.
>
> The performance test script p2000-sparse-operations.sh demonstrates
> that this is the final hole to fill to allow 'git status' to speed up
> when using a sparse index:
>
> Test                                  HEAD~1            HEAD
> ------------------------------------------------------------------------------
> 2000.4: git status (sparse-index-v3)  1.50(1.43+0.10)   0.04(0.04+0.03) -97.3%
> 2000.5: git status (sparse-index-v4)  1.50(1.43+0.10)   0.04(0.03+0.04) -97.3%

Um, I'm confused.  In the previous patch you claimed the following speedups:

2000.4: git status (sparse-index-v3)  2.43(2.33+0.14)  0.04(0.05+0.04) -98.4%
2000.5: git status (sparse-index-v4)  2.44(2.35+0.13)  0.05(0.04+0.07) -98.0%

I don't understand why the "Before" for this patch claims 1.50 as the
initial speed, if the "After" for the last patch was 0.04.  Should the
previous commit message have instead claimed:

2000.4: git status (sparse-index-v3)  2.43(2.33+0.14)  1.50(1.43+0.10) -38.3%
2000.5: git status (sparse-index-v4)  2.44(2.35+0.13)  1.50(1.43+0.10) -38.5%

?

>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1092-sparse-checkout-compatibility.sh | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> index 380a085f8ec4..b937d7096afd 100755
> --- a/t/t1092-sparse-checkout-compatibility.sh
> +++ b/t/t1092-sparse-checkout-compatibility.sh
> @@ -456,8 +456,9 @@ test_expect_success 'sparse-index is not expanded' '
>         init_repos &&
>
>         rm -f trace2.txt &&
> +       echo >>sparse-index/untracked.txt &&
>         GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> -               git -C sparse-index status -uno &&
> +               git -C sparse-index status &&
>         test_region ! index ensure_full_index trace2.txt
>  '
>
> --
> gitgitgadget

Oh!  So, the previous patch was testing without enumerating untracked
files (because it did those slowly), whereas this one enumerates
untracked files and is still able to achieve the same performance?
This wasn't very clear from the commit message.  Maybe I'm just bad at
reading, but perhaps the commit message could be tweaked slightly to
make this more clear?
Elijah Newren April 21, 2021, 12:53 a.m. UTC | #2
One more thing:

On Tue, Apr 20, 2021 at 5:52 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Apr 13, 2021 at 7:01 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: Derrick Stolee <dstolee@microsoft.com>
> >
> > The recently-implemented expand_to_path() method can supply position
> > queries a faster response if they are specifically asking for a path
> > within the sparse cone. Since this is the most-common scenario, this
> > provides a significant speedup.
> >
> > Update t1092-sparse-checkout-compatibility.sh to fully ensure that 'git
> > status' does not expand a sparse index to a full one, even when there
> > exist untracked files.
> >
> > The performance test script p2000-sparse-operations.sh demonstrates
> > that this is the final hole to fill to allow 'git status' to speed up
> > when using a sparse index:
> >
> > Test                                  HEAD~1            HEAD
> > ------------------------------------------------------------------------------
> > 2000.4: git status (sparse-index-v3)  1.50(1.43+0.10)   0.04(0.04+0.03) -97.3%
> > 2000.5: git status (sparse-index-v4)  1.50(1.43+0.10)   0.04(0.03+0.04) -97.3%
>
> Um, I'm confused.  In the previous patch you claimed the following speedups:
>
> 2000.4: git status (sparse-index-v3)  2.43(2.33+0.14)  0.04(0.05+0.04) -98.4%
> 2000.5: git status (sparse-index-v4)  2.44(2.35+0.13)  0.05(0.04+0.07) -98.0%
>
> I don't understand why the "Before" for this patch claims 1.50 as the
> initial speed, if the "After" for the last patch was 0.04.  Should the
> previous commit message have instead claimed:
>
> 2000.4: git status (sparse-index-v3)  2.43(2.33+0.14)  1.50(1.43+0.10) -38.3%
> 2000.5: git status (sparse-index-v4)  2.44(2.35+0.13)  1.50(1.43+0.10) -38.5%
>
> ?
>
> >
> > Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> > ---
> >  t/t1092-sparse-checkout-compatibility.sh | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
> > index 380a085f8ec4..b937d7096afd 100755
> > --- a/t/t1092-sparse-checkout-compatibility.sh
> > +++ b/t/t1092-sparse-checkout-compatibility.sh
> > @@ -456,8 +456,9 @@ test_expect_success 'sparse-index is not expanded' '
> >         init_repos &&
> >
> >         rm -f trace2.txt &&
> > +       echo >>sparse-index/untracked.txt &&
> >         GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
> > -               git -C sparse-index status -uno &&
> > +               git -C sparse-index status &&
> >         test_region ! index ensure_full_index trace2.txt
> >  '
> >
> > --
> > gitgitgadget
>
> Oh!  So, the previous patch was testing without enumerating untracked
> files (because it did those slowly), whereas this one enumerates
> untracked files and is still able to achieve the same performance?
> This wasn't very clear from the commit message.  Maybe I'm just bad at
> reading, but perhaps the commit message could be tweaked slightly to
> make this more clear?

Why is the subject of this commit "dir: use expand_to_path() ..." if
it only touches t1092-sparse-checkout-compatibility.sh?
Derrick Stolee April 21, 2021, 2:03 p.m. UTC | #3
On 4/20/2021 8:53 PM, Elijah Newren wrote:
> One more thing:
> 
> On Tue, Apr 20, 2021 at 5:52 PM Elijah Newren <newren@gmail.com> wrote:
>>
>> On Tue, Apr 13, 2021 at 7:01 AM Derrick Stolee via GitGitGadget
>> <gitgitgadget@gmail.com> wrote:
>>> Test                                  HEAD~1            HEAD
>>> ------------------------------------------------------------------------------
>>> 2000.4: git status (sparse-index-v3)  1.50(1.43+0.10)   0.04(0.04+0.03) -97.3%
>>> 2000.5: git status (sparse-index-v4)  1.50(1.43+0.10)   0.04(0.03+0.04) -97.3%
>>
>> Um, I'm confused.  In the previous patch you claimed the following speedups:
>>
>> 2000.4: git status (sparse-index-v3)  2.43(2.33+0.14)  0.04(0.05+0.04) -98.4%
>> 2000.5: git status (sparse-index-v4)  2.44(2.35+0.13)  0.05(0.04+0.07) -98.0%
>>
>> I don't understand why the "Before" for this patch claims 1.50 as the
>> initial speed, if the "After" for the last patch was 0.04.  Should the
>> previous commit message have instead claimed:
>>
>> 2000.4: git status (sparse-index-v3)  2.43(2.33+0.14)  1.50(1.43+0.10) -38.3%
>> 2000.5: git status (sparse-index-v4)  2.44(2.35+0.13)  1.50(1.43+0.10) -38.5%
...
>> Oh!  So, the previous patch was testing without enumerating untracked
>> files (because it did those slowly), whereas this one enumerates
>> untracked files and is still able to achieve the same performance?
>> This wasn't very clear from the commit message.  Maybe I'm just bad at
>> reading, but perhaps the commit message could be tweaked slightly to
>> make this more clear?
> 
> Why is the subject of this commit "dir: use expand_to_path() ..." if
> it only touches t1092-sparse-checkout-compatibility.sh?
 
You are right to be confused. This is another patch that simplified due
to refactors in the protections branch. This should just be squashed into
the previous.

For context: an earlier version inserted ensure_full_index() before
every call to index_name_pos() and then this patch swapped that for
a call to expand_to_path(). The change in the protections branch was
to have index_name_pos() call expand_to_path() itself, preventing the
need for these ensure_full_index() calls.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh
index 380a085f8ec4..b937d7096afd 100755
--- a/t/t1092-sparse-checkout-compatibility.sh
+++ b/t/t1092-sparse-checkout-compatibility.sh
@@ -456,8 +456,9 @@  test_expect_success 'sparse-index is not expanded' '
 	init_repos &&
 
 	rm -f trace2.txt &&
+	echo >>sparse-index/untracked.txt &&
 	GIT_TRACE2_EVENT="$(pwd)/trace2.txt" GIT_TRACE2_EVENT_NESTING=10 \
-		git -C sparse-index status -uno &&
+		git -C sparse-index status &&
 	test_region ! index ensure_full_index trace2.txt
 '