Message ID | ba3e16f7-bf9c-c5f3-4c0d-8288db6f44c7@web.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | perf: disable automatic housekeeping | expand |
[Changed subject] On Sat, Oct 09 2021, René Scharfe wrote: > Turn off automatic background maintenance for perf tests by default to > avoid interference with performance measurements. Turning off background GC during the perf tests seems like a good idea in general, so I think this patch should go in. Even with the WIP (haven't picked it up again in a while) test mode I menitoned in[1] it still wouldn't make any sense to run detached background GC in t/perf. Because first of all we take the repo as-is (hardlinks and all), so changing it is a bug in itself. But... > Do that by using the > new file t/perf/config and using it as the system config file for perf > tests. Future tests intended to measure gc performance can override > the setting locally or call "git gc" explicitly. > > This fixes a breakage in p2000 caused by gc automatically emptying the > reflog due its fake dates from 2005 being older than 90 days. I think the behavior of git-reflog is probably just broken here and needs fixing in general. I.e. we don't actually change the ctime the relevant reflog files, so surely this is pointing to deeper edge cases where e.g. someone might import old history, only to see it wiped away due to the commits being "old", but the mtime is just seconds/minutes old... But I never had the time/chance to dig into that, maybe there's a good reason for it. Of course we can't rely on *just* the mtime since the whole point is to expire older entries, we need to walk the file and trust the timestamps therein. See the show_one_reflog_ent(). But my recollection & reading of [2] is that we nuke the whole file. Surely we should have a "uh oh?" safeguard when we encounter a file whose tip timestamp_t is from 2005, it's 2021, and our lstat() mtime says it was changed seconds ago. I.e. shouldn't reflog expiry at least have the invariant that nothing should expire in a given file if the tip commit is old enough to be expired, *AND* the mtime of the file is more recent than the expiry epoch, because such a scenario points to either git's own test suite (and we should just fake up the time there), or that we're about to corrupt some user's repository because they're either doing similar import work, or their clock was screwy? > Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Signed-off-by: René Scharfe <l.s.r@web.de> > --- > t/perf/config | 2 ++ > t/perf/perf-lib.sh | 4 ++++ > 2 files changed, 6 insertions(+) > create mode 100644 t/perf/config > > diff --git a/t/perf/config b/t/perf/config > new file mode 100644 > index 0000000000..b92768b039 > --- /dev/null > +++ b/t/perf/config > @@ -0,0 +1,2 @@ > +[gc] > + auto = 0 > diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh > index f74cfd35d6..4c4c568a37 100644 > --- a/t/perf/perf-lib.sh > +++ b/t/perf/perf-lib.sh > @@ -27,6 +27,10 @@ TEST_NO_MALLOC_CHECK=t > > . ../test-lib.sh > > +unset GIT_CONFIG_NOSYSTEM > +GIT_CONFIG_SYSTEM="$TEST_DIRECTORY/perf/config" > +export GIT_CONFIG_SYSTEM > + > if test -n "$GIT_TEST_INSTALLED" -a -z "$PERF_SET_GIT_TEST_INSTALLED" > then > error "Do not use GIT_TEST_INSTALLED with the perf tests. 1. https://lore.kernel.org/git/8735pfjg47.fsf@evledraar.gmail.com/ 2. https://lore.kernel.org/git/b25ac1cc-8e77-17e6-602a-b289c1e1cfeb@web.de/
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > I.e. shouldn't reflog expiry at least have the invariant that nothing > should expire in a given file if the tip commit is old enough to be > expired, *AND* the mtime of the file is more recent than the expiry > epoch, because such a scenario points to either git's own test suite > (and we should just fake up the time there), or that we're about to > corrupt some user's repository because they're either doing similar > import work, or their clock was screwy? A reflog whose entries (and possibly the containing file) have inconsistent timestamps might need some care when pruning. If the time @{1} has is older than @{2}, there may be something fishy going on. Expiring @{1}, keep @{2}, and expire @{3} in such a case where @{2} is exceptionally new but all others are older than expiry limit is one possibility, but "entries out of order detected---do you want to keep the one that is sandwiched between much older ones [y/N]?" might be worth asking. In any case, a reflog with such unordered entries may not answer time-based @{1.week.ago} question correctly. I do not think, at least by default, we should care about file's mtime to the same degree, because it can easily be modified by accident and because the timestamps on entries and mtime can come from different time sources, and the local clock may be faster than the fileserver's clock, resulting in the tip entry having a more recent timestamp than the file's mtime.
On Sat, Oct 09, 2021 at 04:57:20PM +0200, Ævar Arnfjörð Bjarmason wrote: > On Sat, Oct 09 2021, René Scharfe wrote: > > > Turn off automatic background maintenance for perf tests by default to > > avoid interference with performance measurements. > > Turning off background GC during the perf tests seems like a good idea > in general, so I think this patch should go in. Even with the WIP > (haven't picked it up again in a while) test mode I menitoned in[1] it > still wouldn't make any sense to run detached background GC in t/perf. > > Because first of all we take the repo as-is (hardlinks and all), so > changing it is a bug in itself. Lots of perf tests modify the repository. This is generally OK as Git tends to write and rename tempfiles (breaking the hardlink) rather than modifying anything in place. We also only do the hardlink thing for the objects/ directory, so the scope is limited. It does make me a little nervous, but to my knowledge we've never had a perf test which hurt the original donor repo in this way. And there are more subtle ways to screw things up with various filesystem references; see 36e834abc1 (t/perf: avoid copying worktree files from test repo, 2021-02-26). That's all just an aside; I agree that we should avoid background gc just because it confuses performance tests. -Peff
diff --git a/t/perf/config b/t/perf/config new file mode 100644 index 0000000000..b92768b039 --- /dev/null +++ b/t/perf/config @@ -0,0 +1,2 @@ +[gc] + auto = 0 diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh index f74cfd35d6..4c4c568a37 100644 --- a/t/perf/perf-lib.sh +++ b/t/perf/perf-lib.sh @@ -27,6 +27,10 @@ TEST_NO_MALLOC_CHECK=t . ../test-lib.sh +unset GIT_CONFIG_NOSYSTEM +GIT_CONFIG_SYSTEM="$TEST_DIRECTORY/perf/config" +export GIT_CONFIG_SYSTEM + if test -n "$GIT_TEST_INSTALLED" -a -z "$PERF_SET_GIT_TEST_INSTALLED" then error "Do not use GIT_TEST_INSTALLED with the perf tests.
Turn off automatic background maintenance for perf tests by default to avoid interference with performance measurements. Do that by using the new file t/perf/config and using it as the system config file for perf tests. Future tests intended to measure gc performance can override the setting locally or call "git gc" explicitly. This fixes a breakage in p2000 caused by gc automatically emptying the reflog due its fake dates from 2005 being older than 90 days. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> --- t/perf/config | 2 ++ t/perf/perf-lib.sh | 4 ++++ 2 files changed, 6 insertions(+) create mode 100644 t/perf/config -- 2.33.0