perf: disable automatic housekeeping

Message ID	ba3e16f7-bf9c-c5f3-4c0d-8288db6f44c7@web.de (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Subject: [PATCH] perf: disable automatic housekeeping To: Junio C Hamano <gitster@pobox.com> Cc: Derrick Stolee <stolee@gmail.com>, Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>, Git List <git@vger.kernel.org> References: <b25ac1cc-8e77-17e6-602a-b289c1e1cfeb@web.de> <30df5e39-3f2a-00d8-541b-347c43f36b38@gmail.com> <581663a7-9b16-e464-ada7-368f20c99ff1@web.de> <xmqqsfxersvy.fsf@gitster.g> From: =?utf-8?q?Ren=C3=A9_Scharfe?= <l.s.r@web.de> Message-ID: <ba3e16f7-bf9c-c5f3-4c0d-8288db6f44c7@web.de> Date: Sat, 9 Oct 2021 16:39:24 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.14.0 MIME-Version: 1.0 In-Reply-To: <xmqqsfxersvy.fsf@gitster.g> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable Precedence: bulk
Series	perf: disable automatic housekeeping \| expand perf: disable automatic housekeeping

Message ID

ba3e16f7-bf9c-c5f3-4c0d-8288db6f44c7@web.de (mailing list archive)

State

New, archived

Headers

Subject: [PATCH] perf: disable automatic housekeeping
To: Junio C Hamano <gitster@pobox.com>
Cc: Derrick Stolee <stolee@gmail.com>,
        Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
        Git List <git@vger.kernel.org>
References: <b25ac1cc-8e77-17e6-602a-b289c1e1cfeb@web.de>
 <30df5e39-3f2a-00d8-541b-347c43f36b38@gmail.com>
 <581663a7-9b16-e464-ada7-368f20c99ff1@web.de> <xmqqsfxersvy.fsf@gitster.g>
From: =?utf-8?q?Ren=C3=A9_Scharfe?= <l.s.r@web.de>
Message-ID: <ba3e16f7-bf9c-c5f3-4c0d-8288db6f44c7@web.de>
Date: Sat, 9 Oct 2021 16:39:24 +0200
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0)
 Gecko/20100101 Thunderbird/78.14.0
MIME-Version: 1.0
In-Reply-To: <xmqqsfxersvy.fsf@gitster.g>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Precedence: bulk

Series

perf: disable automatic housekeeping | expand

Commit Message

René Scharfe Oct. 9, 2021, 2:39 p.m. UTC

Turn off automatic background maintenance for perf tests by default to
avoid interference with performance measurements.  Do that by using the
new file t/perf/config and using it as the system config file for perf
tests.  Future tests intended to measure gc performance can override
the setting locally or call "git gc" explicitly.

This fixes a breakage in p2000 caused by gc automatically emptying the
reflog due its fake dates from 2005 being older than 90 days.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 t/perf/config      | 2 ++
 t/perf/perf-lib.sh | 4 ++++
 2 files changed, 6 insertions(+)
 create mode 100644 t/perf/config

--
2.33.0

Comments

Ævar Arnfjörð Bjarmason Oct. 9, 2021, 2:57 p.m. UTC | #1

[Changed subject]

On Sat, Oct 09 2021, René Scharfe wrote:

> Turn off automatic background maintenance for perf tests by default to
> avoid interference with performance measurements.

Turning off background GC during the perf tests seems like a good idea
in general, so I think this patch should go in. Even with the WIP
(haven't picked it up again in a while) test mode I menitoned in[1] it
still wouldn't make any sense to run detached background GC in t/perf.

Because first of all we take the repo as-is (hardlinks and all), so
changing it is a bug in itself.

But...

> Do that by using the
> new file t/perf/config and using it as the system config file for perf
> tests.  Future tests intended to measure gc performance can override
> the setting locally or call "git gc" explicitly.
>
> This fixes a breakage in p2000 caused by gc automatically emptying the
> reflog due its fake dates from 2005 being older than 90 days.

I think the behavior of git-reflog is probably just broken here and
needs fixing in general.

I.e. we don't actually change the ctime the relevant reflog files, so
surely this is pointing to deeper edge cases where e.g. someone might
import old history, only to see it wiped away due to the commits being
"old", but the mtime is just seconds/minutes old...

But I never had the time/chance to dig into that, maybe there's a good
reason for it.

Of course we can't rely on *just* the mtime since the whole point is to
expire older entries, we need to walk the file and trust the timestamps
therein. See the show_one_reflog_ent().

But my recollection & reading of [2] is that we nuke the whole
file. Surely we should have a "uh oh?" safeguard when we encounter a
file whose tip timestamp_t is from 2005, it's 2021, and our lstat()
mtime says it was changed seconds ago.

I.e. shouldn't reflog expiry at least have the invariant that nothing
should expire in a given file if the tip commit is old enough to be
expired, *AND* the mtime of the file is more recent than the expiry
epoch, because such a scenario points to either git's own test suite
(and we should just fake up the time there), or that we're about to
corrupt some user's repository because they're either doing similar
import work, or their clock was screwy?

> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  t/perf/config      | 2 ++
>  t/perf/perf-lib.sh | 4 ++++
>  2 files changed, 6 insertions(+)
>  create mode 100644 t/perf/config
>
> diff --git a/t/perf/config b/t/perf/config
> new file mode 100644
> index 0000000000..b92768b039
> --- /dev/null
> +++ b/t/perf/config
> @@ -0,0 +1,2 @@
> +[gc]
> +	auto = 0
> diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
> index f74cfd35d6..4c4c568a37 100644
> --- a/t/perf/perf-lib.sh
> +++ b/t/perf/perf-lib.sh
> @@ -27,6 +27,10 @@ TEST_NO_MALLOC_CHECK=t
>
>  . ../test-lib.sh
>
> +unset GIT_CONFIG_NOSYSTEM
> +GIT_CONFIG_SYSTEM="$TEST_DIRECTORY/perf/config"
> +export GIT_CONFIG_SYSTEM
> +
>  if test -n "$GIT_TEST_INSTALLED" -a -z "$PERF_SET_GIT_TEST_INSTALLED"
>  then
>  	error "Do not use GIT_TEST_INSTALLED with the perf tests.

1. https://lore.kernel.org/git/8735pfjg47.fsf@evledraar.gmail.com/
2. https://lore.kernel.org/git/b25ac1cc-8e77-17e6-602a-b289c1e1cfeb@web.de/

Junio C Hamano Oct. 9, 2021, 5:50 p.m. UTC | #2

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I.e. shouldn't reflog expiry at least have the invariant that nothing
> should expire in a given file if the tip commit is old enough to be
> expired, *AND* the mtime of the file is more recent than the expiry
> epoch, because such a scenario points to either git's own test suite
> (and we should just fake up the time there), or that we're about to
> corrupt some user's repository because they're either doing similar
> import work, or their clock was screwy?

A reflog whose entries (and possibly the containing file) have
inconsistent timestamps might need some care when pruning.  If the
time @{1} has is older than @{2}, there may be something fishy going
on.  Expiring @{1}, keep @{2}, and expire @{3} in such a case where
@{2} is exceptionally new but all others are older than expiry limit
is one possibility, but "entries out of order detected---do you want
to keep the one that is sandwiched between much older ones [y/N]?"
might be worth asking.  In any case, a reflog with such unordered
entries may not answer time-based @{1.week.ago} question correctly.

I do not think, at least by default, we should care about file's
mtime to the same degree, because it can easily be modified by
accident and because the timestamps on entries and mtime can come
from different time sources, and the local clock may be faster than
the fileserver's clock, resulting in the tip entry having a more
recent timestamp than the file's mtime.

Jeff King Oct. 11, 2021, 4:32 p.m. UTC | #3

On Sat, Oct 09, 2021 at 04:57:20PM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Sat, Oct 09 2021, René Scharfe wrote:
> 
> > Turn off automatic background maintenance for perf tests by default to
> > avoid interference with performance measurements.
> 
> Turning off background GC during the perf tests seems like a good idea
> in general, so I think this patch should go in. Even with the WIP
> (haven't picked it up again in a while) test mode I menitoned in[1] it
> still wouldn't make any sense to run detached background GC in t/perf.
> 
> Because first of all we take the repo as-is (hardlinks and all), so
> changing it is a bug in itself.

Lots of perf tests modify the repository. This is generally OK as Git
tends to write and rename tempfiles (breaking the hardlink) rather than
modifying anything in place. We also only do the hardlink thing for the
objects/ directory, so the scope is limited.

It does make me a little nervous, but to my knowledge we've never had a
perf test which hurt the original donor repo in this way. And there are
more subtle ways to screw things up with various filesystem references;
see 36e834abc1 (t/perf: avoid copying worktree files from test repo,
2021-02-26).

That's all just an aside; I agree that we should avoid background gc
just because it confuses performance tests.

-Peff

diff --git a/t/perf/config b/t/perf/config
new file mode 100644
index 0000000000..b92768b039
--- /dev/null
+++ b/t/perf/config
@@ -0,0 +1,2 @@ 
+[gc]
+	auto = 0
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index f74cfd35d6..4c4c568a37 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -27,6 +27,10 @@  TEST_NO_MALLOC_CHECK=t

 . ../test-lib.sh

+unset GIT_CONFIG_NOSYSTEM
+GIT_CONFIG_SYSTEM="$TEST_DIRECTORY/perf/config"
+export GIT_CONFIG_SYSTEM
+
 if test -n "$GIT_TEST_INSTALLED" -a -z "$PERF_SET_GIT_TEST_INSTALLED"
 then
 	error "Do not use GIT_TEST_INSTALLED with the perf tests.

perf: disable automatic housekeeping

Commit Message

Comments

Patch