Message ID | ZqeY0eHNZjKhNvIH@ArchLinux (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | ref consistency check infra setup | expand |
On Mon, Jul 29, 2024 at 09:27:45PM +0800, shejialuo wrote: > diff --git a/refs/files-backend.c b/refs/files-backend.c > index cb184953c1..0d4fc27768 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -3419,6 +3419,27 @@ typedef int (*files_fsck_refs_fn)(struct fsck_options *o, > const char *refs_check_dir, > struct dir_iterator *iter); > > +static int files_fsck_refs_name(struct fsck_options *o, > + const char *gitdir UNUSED, > + const char *refs_check_dir, > + struct dir_iterator *iter) > +{ > + struct strbuf sb = STRBUF_INIT; > + struct fsck_refs_info info; > + int ret = 0; > + > + if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) { > + strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path); > + info.path = sb.buf; > + ret = fsck_refs_report(o, NULL, &info, > + FSCK_MSG_BAD_REF_NAME, > + "invalid refname format"); > + } > + > + strbuf_release(&sb); > + return ret; > +} > + > static int files_fsck_refs_dir(struct ref_store *ref_store, > struct fsck_options *o, > const char *refs_check_dir, > @@ -3469,6 +3490,7 @@ static int files_fsck_refs(struct ref_store *ref_store, > struct fsck_options *o) > { > files_fsck_refs_fn fsck_refs_fns[]= { > + files_fsck_refs_name, > NULL Neat. I very much like that we can simply add new checks to this function and the rest is handled for us already. Makes this whole thing nicely extensible. > }; > > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh > new file mode 100755 > index 0000000000..b2db58d2c6 > --- /dev/null > +++ b/t/t0602-reffiles-fsck.sh > @@ -0,0 +1,101 @@ > +#!/bin/sh > + > +test_description='Test reffiles backend consistency check' > + > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > +GIT_TEST_DEFAULT_REF_FORMAT=files > +export GIT_TEST_DEFAULT_REF_FORMAT > + > +. ./test-lib.sh Is this test suite intentionally not marked with `TEST_PASSES_SANITIZE_LEAK=true`? > + > +test_expect_success 'ref name should be checked' ' > + test_when_finished "rm -rf repo" && > + git init repo && > + branch_dir_prefix=.git/refs/heads && > + tag_dir_prefix=.git/refs/tags && > + ( > + cd repo && > + git commit --allow-empty -m initial && > + git checkout -b branch-1 && > + git tag tag-1 && > + git commit --allow-empty -m second && > + git checkout -b branch-2 && > + git tag tag-2 && > + git tag multi_hierarchy/tag-2 > + ) && I don't quite get why you create several subshells only to cd into `repo` in each of them. Isn't a single subshell sufficient for all of those tests? If you want to delimit blocks, then you can simply add an empty newline between them. > + ( > + cd repo && > + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && > + test_must_fail git fsck 2>err && > + cat >expect <<-EOF && > + error: refs/heads/.branch-1: badRefName: invalid refname format > + EOF > + rm $branch_dir_prefix/.branch-1 && > + test_cmp expect err > + ) && > + ( > + cd repo && > + cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock && > + test_must_fail git fsck 2>err && > + cat >expect <<-EOF && > + error: refs/tags/tag-1.lock: badRefName: invalid refname format > + EOF > + rm $tag_dir_prefix/tag-1.lock && > + test_cmp expect err > + ) && The other cases all make sense, but I don't think that a file ending with ".lock" should be marked as having a "badRefName". It is expected that concurrent writers may have such lock files. What could make sense is to eventually mark stale lock files older than X amount of time as errors or warnings. But I'd think that this is outside of the scope of this patch series. Patrick
> > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh > > new file mode 100755 > > index 0000000000..b2db58d2c6 > > --- /dev/null > > +++ b/t/t0602-reffiles-fsck.sh > > @@ -0,0 +1,101 @@ > > +#!/bin/sh > > + > > +test_description='Test reffiles backend consistency check' > > + > > +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main > > +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME > > +GIT_TEST_DEFAULT_REF_FORMAT=files > > +export GIT_TEST_DEFAULT_REF_FORMAT > > + > > +. ./test-lib.sh > > Is this test suite intentionally not marked with > `TEST_PASSES_SANITIZE_LEAK=true`? > No, I don't know this. I will add `TEST_PASSES_SANITIZE_LEAK=true` and export this environment variable. > > + > > +test_expect_success 'ref name should be checked' ' > > + test_when_finished "rm -rf repo" && > > + git init repo && > > + branch_dir_prefix=.git/refs/heads && > > + tag_dir_prefix=.git/refs/tags && > > + ( > > + cd repo && > > + git commit --allow-empty -m initial && > > + git checkout -b branch-1 && > > + git tag tag-1 && > > + git commit --allow-empty -m second && > > + git checkout -b branch-2 && > > + git tag tag-2 && > > + git tag multi_hierarchy/tag-2 > > + ) && > > I don't quite get why you create several subshells only to cd into > `repo` in each of them. Isn't a single subshell sufficient for all of > those tests? If you want to delimit blocks, then you can simply add an > empty newline between them. > I just want to delimit, I will use newline in the next version. > > + ( > > + cd repo && > > + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && > > + test_must_fail git fsck 2>err && > > + cat >expect <<-EOF && > > + error: refs/heads/.branch-1: badRefName: invalid refname format > > + EOF > > + rm $branch_dir_prefix/.branch-1 && > > + test_cmp expect err > > + ) && > > + ( > > + cd repo && > > + cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock && > > + test_must_fail git fsck 2>err && > > + cat >expect <<-EOF && > > + error: refs/tags/tag-1.lock: badRefName: invalid refname format > > + EOF > > + rm $tag_dir_prefix/tag-1.lock && > > + test_cmp expect err > > + ) && > > The other cases all make sense, but I don't think that a file ending > with ".lock" should be marked as having a "badRefName". It is expected > that concurrent writers may have such lock files. > > What could make sense is to eventually mark stale lock files older than > X amount of time as errors or warnings. But I'd think that this is > outside of the scope of this patch series. > If so, let us just ignore ".lock" situation at the moment. > Patrick
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt index f643585a34..d8e437a043 100644 --- a/Documentation/fsck-msgids.txt +++ b/Documentation/fsck-msgids.txt @@ -19,6 +19,9 @@ `badParentSha1`:: (ERROR) A commit object has a bad parent sha1. +`badRefName`:: + (ERROR) A ref has an invalid format. + `badTagName`:: (INFO) A tag has an invalid format. diff --git a/fsck.h b/fsck.h index b03dba442e..ce56ce4bef 100644 --- a/fsck.h +++ b/fsck.h @@ -31,6 +31,7 @@ enum fsck_msg_type { FUNC(BAD_NAME, ERROR) \ FUNC(BAD_OBJECT_SHA1, ERROR) \ FUNC(BAD_PARENT_SHA1, ERROR) \ + FUNC(BAD_REF_NAME, ERROR) \ FUNC(BAD_TIMEZONE, ERROR) \ FUNC(BAD_TREE, ERROR) \ FUNC(BAD_TREE_SHA1, ERROR) \ diff --git a/refs/files-backend.c b/refs/files-backend.c index cb184953c1..0d4fc27768 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -3419,6 +3419,27 @@ typedef int (*files_fsck_refs_fn)(struct fsck_options *o, const char *refs_check_dir, struct dir_iterator *iter); +static int files_fsck_refs_name(struct fsck_options *o, + const char *gitdir UNUSED, + const char *refs_check_dir, + struct dir_iterator *iter) +{ + struct strbuf sb = STRBUF_INIT; + struct fsck_refs_info info; + int ret = 0; + + if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) { + strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path); + info.path = sb.buf; + ret = fsck_refs_report(o, NULL, &info, + FSCK_MSG_BAD_REF_NAME, + "invalid refname format"); + } + + strbuf_release(&sb); + return ret; +} + static int files_fsck_refs_dir(struct ref_store *ref_store, struct fsck_options *o, const char *refs_check_dir, @@ -3469,6 +3490,7 @@ static int files_fsck_refs(struct ref_store *ref_store, struct fsck_options *o) { files_fsck_refs_fn fsck_refs_fns[]= { + files_fsck_refs_name, NULL }; diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh new file mode 100755 index 0000000000..b2db58d2c6 --- /dev/null +++ b/t/t0602-reffiles-fsck.sh @@ -0,0 +1,101 @@ +#!/bin/sh + +test_description='Test reffiles backend consistency check' + +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME +GIT_TEST_DEFAULT_REF_FORMAT=files +export GIT_TEST_DEFAULT_REF_FORMAT + +. ./test-lib.sh + +test_expect_success 'ref name should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 && + git commit --allow-empty -m second && + git checkout -b branch-2 && + git tag tag-2 && + git tag multi_hierarchy/tag-2 + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/heads/.branch-1: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/.branch-1 && + test_cmp expect err + ) && + ( + cd repo && + cp $tag_dir_prefix/tag-1 $tag_dir_prefix/tag-1.lock && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/tags/tag-1.lock: badRefName: invalid refname format + EOF + rm $tag_dir_prefix/tag-1.lock && + test_cmp expect err + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/heads/@: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/@ && + test_cmp expect err + ) && + ( + cd repo && + cp $tag_dir_prefix/multi_hierarchy/tag-2 $tag_dir_prefix/multi_hierarchy/@ && + test_must_fail git fsck 2>err && + cat >expect <<-EOF && + error: refs/tags/multi_hierarchy/@: badRefName: invalid refname format + EOF + rm $tag_dir_prefix/multi_hierarchy/@ && + test_cmp expect err + ) +' + +test_expect_success 'ref name check should be adapted into fsck messages' ' + test_when_finished "rm -rf repo" && + git init repo && + branch_dir_prefix=.git/refs/heads && + tag_dir_prefix=.git/refs/tags && + ( + cd repo && + git commit --allow-empty -m initial && + git checkout -b branch-1 && + git tag tag-1 && + git commit --allow-empty -m second && + git checkout -b branch-2 && + git tag tag-2 + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/.branch-1 && + git -c fsck.badRefName=warn fsck 2>err && + cat >expect <<-EOF && + warning: refs/heads/.branch-1: badRefName: invalid refname format + EOF + rm $branch_dir_prefix/.branch-1 && + test_cmp expect err + ) && + ( + cd repo && + cp $branch_dir_prefix/branch-1 $branch_dir_prefix/@ && + git -c fsck.badRefName=ignore fsck 2>err && + test_must_be_empty err + ) +' + +test_done
The git-fsck(1) only implicitly checks the reference, it does not fully check refs with bad format name such as standalone "@" and name ending with ".lock". In order to provide such checks, add a new fsck message id "badRefName" with default ERROR type. Use existing "check_refname_format" to explicit check the ref name. And add a new unit test to verify the functionality. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> --- Documentation/fsck-msgids.txt | 3 + fsck.h | 1 + refs/files-backend.c | 22 ++++++++ t/t0602-reffiles-fsck.sh | 101 ++++++++++++++++++++++++++++++++++ 4 files changed, 127 insertions(+) create mode 100755 t/t0602-reffiles-fsck.sh