[v2,1/3] fsck: verify commit graph when implicitly enabled

Message ID	20210917225459.68086-2-chooglen@google.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <git-owner@kernel.org> Date: Fri, 17 Sep 2021 15:54:57 -0700 In-Reply-To: <20210917225459.68086-1-chooglen@google.com> Message-Id: <20210917225459.68086-2-chooglen@google.com> Mime-Version: 1.0 References: <20210913181221.42635-1-chooglen@google.com> <20210917225459.68086-1-chooglen@google.com> Subject: [PATCH v2 1/3] fsck: verify commit graph when implicitly enabled From: Glen Choo <chooglen@google.com> To: git@vger.kernel.org Cc: Glen Choo <chooglen@google.com> Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	Use default values from settings instead of config \| expand [v2,0/3] Use default values from settings instead of config [v2,1/3] fsck: verify commit graph when implicitly enabled [v2,2/3] fsck: verify multi-pack-index when implictly enabled [v2,3/3] gc: perform incremental repack when implictly enabled

Glen Choo Sept. 17, 2021, 10:54 p.m. UTC

the_repository->settings is the preferred way to get certain
settings (such as "core.commitGraph") because it gets default values
from prepare_repo_settings(). However, cmd_fsck() reads the config
directly via git_config_get_bool(), which bypasses these default values.
This causes fsck to ignore the commit graph if "core.commitgraph" is not
explicitly set in the config. This worked fine until commit-graph was
enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
default, 2019-08-13).

Replace git_config_get_bool("core.commitgraph") in fsck with the
equivalent call to the_repository->settings.core_commit_graph.

The expected behavior is that fsck respects the config value when it is
set, but uses the default value when it is unset. For example, for
core.commitGraph, there are three cases:

- Config value is set to true -> enabled
- Config value is set to false -> disabled
- Config value is unset -> enabled

As such, tests cover all three cases.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 builtin/fsck.c          |  3 ++-
 t/t5318-commit-graph.sh | 24 +++++++++++++++++++++++-
 2 files changed, 25 insertions(+), 2 deletions(-)

Eric Sunshine Sept. 29, 2021, 6:09 a.m. UTC | #1

On 9/17/21 6:54 PM, Glen Choo wrote:
> the_repository->settings is the preferred way to get certain
> settings (such as "core.commitGraph") because it gets default values
> from prepare_repo_settings(). However, cmd_fsck() reads the config
> directly via git_config_get_bool(), which bypasses these default values.
> This causes fsck to ignore the commit graph if "core.commitgraph" is not
> explicitly set in the config. This worked fine until commit-graph was
> enabled by default in 31b1de6a09 (commit-graph: turn on commit-graph by
> default, 2019-08-13).
> 
> Replace git_config_get_bool("core.commitgraph") in fsck with the
> equivalent call to the_repository->settings.core_commit_graph.
> [...]
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> @@ -674,7 +674,7 @@ test_expect_success 'detect incorrect chunk count' '
> -test_expect_success 'git fsck (checks commit-graph)' '
> +test_expect_success 'git fsck (checks commit-graph when config set to true)' '
>   	cd "$TRASH_DIRECTORY/full" &&
>   	git fsck &&
>   	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> @@ -683,6 +683,28 @@ test_expect_success 'git fsck (checks commit-graph)' '
>   	test_must_fail git fsck
>   '

Because I took the time to scan backward through this test script, I 
understand that `core.commitGraph` is already `true` by the time this 
test is reached, thus the new test title is accurate (for now). However, 
it would be a bit easier to reason about this and make it more robust by 
having the test itself guarantee that it is set to `true` by invoking 
`git config core.commitGraph true`.

> +test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
> +	cd "$TRASH_DIRECTORY/full" &&
> +	git fsck &&
> +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> +		"incorrect checksum" &&
> +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
> +	test_config core.commitGraph false &&
> +	git fsck
> +'

test_config() unsets the config variable when the test completes, so I'm 
wondering if its use is in fact desirable here. I ask because, from a 
quick scan through the file, it appears that many tests in this script 
assume that `core.commitGraph` is `true`, so it's not clear that 
unsetting it at the end of this test is a good idea in general.

> +test_expect_success 'git fsck (checks commit-graph when config unset)' '
> +	test_when_finished "git config core.commitGraph true" &&
> +
> +	cd "$TRASH_DIRECTORY/full" &&

I find it somewhat confusing to reason about the behavior of 
test_when_finished() when it is invoked like this before the `cd`. It's 
true that the end-of-test `git config core.commitGraph true` will be 
invoked within `full` as desired (except in the very unusual case of the 
`cd` failing), so it's probably correct, but it requires extra thought 
to come to that conclusion. Switching the order of these two lines might 
lower the cognitive load a bit.

> +	git fsck &&
> +	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
> +		"incorrect checksum" &&
> +	test_unconfig core.commitGraph &&
> +	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
> +	test_must_fail git fsck
> +'

Taking Taylor's comment[1] on v1 patch [2/3] into account, I wonder if 
it would be simpler for these all to be combined into a single test. 
Something like (untested):

     cd "$TRASH_DIRECTORY/full" &&
     test_when_finished "git config core.commitGraph true" &&
     git config core.commitGraph true &&
     git fsck &&
     corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
         "incorrect checksum" &&
     cp commit-graph-pre-write-test $objdir/info/commit-graph &&
     test_must_fail git fsck &&
     git config core.commitGraph false &&
     git fsck &&
     git config --unset-all core.commitGraph &&
     test_must_fail git fsck

I didn't bother using test_unconfig() since, in this case, we _know_ 
that the config variable is set, thus don't have to worry about `git 
config --unset-all` failing.

A downside of combining the tests like this is that it makes it a bit 
more cumbersome to diagnose a failure (because there is more code and 
more cases to wade through).

Anyhow, I don't have a strong feeling one way or the other about 
combining the test or leaving them separate.

[1]: https://lore.kernel.org/git/YT+n81yGPYEiMXwQ@nand.local/

Glen Choo Sept. 30, 2021, 6 p.m. UTC | #2

Eric Sunshine <sunshine@sunshineco.com> writes:

> Because I took the time to scan backward through this test script, I 
> understand that `core.commitGraph` is already `true` by the time this 
> test is reached, thus the new test title is accurate (for now). However, 
> it would be a bit easier to reason about this and make it more robust by 
> having the test itself guarantee that it is set to `true` by invoking 
> `git config core.commitGraph true`.

Reading this and..

> I find it somewhat confusing to reason about the behavior of 
> test_when_finished() when it is invoked like this before the `cd`. It's 
> true that the end-of-test `git config core.commitGraph true` will be 
> invoked within `full` as desired (except in the very unusual case of the 
> `cd` failing), so it's probably correct, but it requires extra thought 
> to come to that conclusion. Switching the order of these two lines might 
> lower the cognitive load a bit.

I'll make both of these changes. Test readability is important and
people shouldn't be wasting time thinking about test correctness.

> test_config() unsets the config variable when the test completes, so I'm 
> wondering if its use is in fact desirable here. I ask because, from a 
> quick scan through the file, it appears that many tests in this script 
> assume that `core.commitGraph` is `true`, so it's not clear that 
> unsetting it at the end of this test is a good idea in general.

This is a good point. I made the original change in response to Taylor's
comment, but I think we both didn't consider that this script assumes
`core.commitGraph` is `true`. The rest of the tests pass, but only
because the default value is true and I'd rather not have tests rely on
a happy accident.

> Taking Taylor's comment[1] on v1 patch [2/3] into account, I wonder if 
> it would be simpler for these all to be combined into a single test. 

Interesting thought. The marginal benefit here is much lower than in
patch [2/3]. The difference is that corrupt_midx_and_verify() uses an
additional $COMMAND to perform verification, but
corrupt_graph_and_verify() does not. The result is that
corrupt_midx_and_verify() is subtle and confusing when used in separate
tests, but corrupt_graph_and_verify() is not so bad.

> A downside of combining the tests like this is that it makes it a bit 
> more cumbersome to diagnose a failure (because there is more code and 
> more cases to wade through).

Yes, and we also lose the test labels that would guide the reader. It
might not be that obvious why the tests for a boolean value has 3
cases {true,false,unset}.

Taking churn into account, I'm inclined to keep the tests separate.

Glen Choo Sept. 30, 2021, 6:35 p.m. UTC | #3

>> Because I took the time to scan backward through this test script, I 
>> understand that `core.commitGraph` is already `true` by the time this 
>> test is reached, thus the new test title is accurate (for now). However, 
>> it would be a bit easier to reason about this and make it more robust by 
>> having the test itself guarantee that it is set to `true` by invoking 
>> `git config core.commitGraph true`.

>> test_config() unsets the config variable when the test completes, so I'm 
>> wondering if its use is in fact desirable here. I ask because, from a 
>> quick scan through the file, it appears that many tests in this script 
>> assume that `core.commitGraph` is `true`, so it's not clear that 
>> unsetting it at the end of this test is a good idea in general.
>
> This is a good point. I made the original change in response to Taylor's
> comment, but I think we both didn't consider that this script assumes
> `core.commitGraph` is `true`. The rest of the tests pass, but only
> because the default value is true and I'd rather not have tests rely on
> a happy accident.

I said I would incorporate these suggestions, but I didn't propose what
changes I would actually make.

Given that each test depends on a global config value in the setup
phase, it might be simplest to read if we try to avoid anything that
touches that value. The easiest way to achieve this is to use git -c
core.commitGraph {true,false} for the {true,false} cases. Since there is
no -c equivalent for the unset case, I'll continue to use
test_unconfig() + test_when_finished() to temporarily unset the value.

Eric Sunshine Sept. 30, 2021, 6:39 p.m. UTC | #4

On Thu, Sep 30, 2021 at 2:35 PM Glen Choo <chooglen@google.com> wrote:
> >> test_config() unsets the config variable when the test completes, so I'm
> >> wondering if its use is in fact desirable here. I ask because, from a
> >> quick scan through the file, it appears that many tests in this script
> >> assume that `core.commitGraph` is `true`, so it's not clear that
> >> unsetting it at the end of this test is a good idea in general.
> >
> > This is a good point. I made the original change in response to Taylor's
> > comment, but I think we both didn't consider that this script assumes
> > `core.commitGraph` is `true`. The rest of the tests pass, but only
> > because the default value is true and I'd rather not have tests rely on
> > a happy accident.
>
> I said I would incorporate these suggestions, but I didn't propose what
> changes I would actually make.
>
> Given that each test depends on a global config value in the setup
> phase, it might be simplest to read if we try to avoid anything that
> touches that value. The easiest way to achieve this is to use git -c
> core.commitGraph {true,false} for the {true,false} cases. Since there is
> no -c equivalent for the unset case, I'll continue to use
> test_unconfig() + test_when_finished() to temporarily unset the value.

That was my thought, as well. (I wasn't quite sure why Taylor
recommended test_config() over `-c` which you used originally. It may
just be his personal preference. Perhaps he can chime in?)

Glen Choo Oct. 1, 2021, 5:28 p.m. UTC | #5

Eric Sunshine <sunshine@sunshineco.com> writes:

> That was my thought, as well. (I wasn't quite sure why Taylor
> recommended test_config() over `-c` which you used originally. It may
> just be his personal preference. Perhaps he can chime in?)

I'd also appreciate thoughts on test_config() over `-c` :). I won't
re-roll this series yet in case there's more to the test_config() story,
but I've included a patch that shows the `-c` work that we discussed.

---
 t/t5318-commit-graph.sh | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 42e785cb6e..4fcfc2ebbc 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -680,7 +680,7 @@ test_expect_success 'git fsck (checks commit-graph when config set to true)' '
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
-	test_must_fail git fsck
+	test_must_fail git -c core.commitGraph=true fsck
 '
 
 test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
@@ -689,14 +689,13 @@ test_expect_success 'git fsck (ignores commit-graph when config set to false)' '
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&
 	cp commit-graph-pre-write-test $objdir/info/commit-graph &&
-	test_config core.commitGraph false &&
-	git fsck
+	git -c core.commitGraph=false fsck
 '
 
 test_expect_success 'git fsck (checks commit-graph when config unset)' '
+	cd "$TRASH_DIRECTORY/full" &&
 	test_when_finished "git config core.commitGraph true" &&
 
-	cd "$TRASH_DIRECTORY/full" &&
 	git fsck &&
 	corrupt_graph_and_verify $GRAPH_BYTE_FOOTER "\00" \
 		"incorrect checksum" &&

[v2,1/3] fsck: verify commit graph when implicitly enabled

Commit Message

Comments

Patch