Message ID | Z67MDPtjoXQB2sGB@ArchLinux (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | add more ref consistency checks | expand |
shejialuo <shejialuo@gmail.com> writes: > Although "git-fsck(1)" and "packed-backend.c" will check some > consistency and correctness of "packed-refs" file, they never check the Because you say 'some' here, it made me more curious. Could you state exactly what checks are being done here? > filetype of the "packed-refs". The user should always use "git > pack-refs" command to create the raw regular "packed-refs" file, so we > need to explicitly check this in "git refs verify". > Not sure I understand how the start of this last sentence correlates to the end of it. Is the intention to say that we want to explicitly check the filetype to ensure that the 'packed-refs' file was only created via 'git pack-refs'? If so, perhaps: Verify that the 'packed-refs' file has the expected filetype, confirming it was created by 'git pack-refs'. > We could use "open_nofollow" wrapper to open the raw "packed-refs" file. > If the returned "fd" value is less than 0, we could check whether the > "errno" is "ELOOP" to report an error to the user. > > Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to > the user if "packed-refs" is not a regular file. > > Mentored-by: Patrick Steinhardt <ps@pks.im> > Mentored-by: Karthik Nayak <karthik.188@gmail.com> > Signed-off-by: shejialuo <shejialuo@gmail.com> > --- > refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++---- > t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++ > 2 files changed, 57 insertions(+), 4 deletions(-) > > diff --git a/refs/packed-backend.c b/refs/packed-backend.c > index a7b6f74b6e..6401cecd5f 100644 > --- a/refs/packed-backend.c > +++ b/refs/packed-backend.c > @@ -4,6 +4,7 @@ > #include "../git-compat-util.h" > #include "../config.h" > #include "../dir.h" > +#include "../fsck.h" > #include "../gettext.h" > #include "../hash.h" > #include "../hex.h" > @@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s > return empty_ref_iterator_begin(); > } > > -static int packed_fsck(struct ref_store *ref_store UNUSED, > - struct fsck_options *o UNUSED, > +static int packed_fsck(struct ref_store *ref_store, > + struct fsck_options *o, > struct worktree *wt) > { > + struct packed_ref_store *refs = packed_downcast(ref_store, > + REF_STORE_READ, "fsck"); > + int ret = 0; > + int fd; > > if (!is_main_worktree(wt)) > - return 0; > + goto cleanup; > > - return 0; > + if (o->verbose) > + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path); > + > + fd = open_nofollow(refs->path, O_RDONLY); > + if (fd < 0) { > + /* > + * If the packed-refs file doesn't exist, there's nothing > + * to check. > + */ > + if (errno == ENOENT) > + goto cleanup; > + > + if (errno == ELOOP) { > + struct fsck_ref_report report = { 0 }; > + report.path = "packed-refs"; > + ret = fsck_report_ref(o, &report, > + FSCK_MSG_BAD_REF_FILETYPE, > + "not a regular file"); > + goto cleanup; > + } > + > + ret = error_errno(_("unable to open %s"), refs->path); > + goto cleanup; The paragraph in the commit message: Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to the user if "packed-refs" is not a regular file. Gave me the indication that any error would be reported via 'fsck_report_ref()', but it seems like we are only reporting for symbolic links. Why is that being singled out? > + } > + > +cleanup: > + return ret; > } > > struct ref_storage_be refs_be_packed = { > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh > index cf7a202d0d..42c8d4ca1e 100755 > --- a/t/t0602-reffiles-fsck.sh > +++ b/t/t0602-reffiles-fsck.sh > @@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' ' > ) > ' > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' > + test_when_finished "rm -rf repo" && > + git init repo && > + ( > + cd repo && > + test_commit default && > + git branch branch-1 && > + git branch branch-2 && > + git branch branch-3 && > + git pack-refs --all && > + > + mv .git/packed-refs .git/packed-refs-back && > + ln -sf packed-refs-bak .git/packed-refs && This still doesn't make sense to me. 'packed-refs-bak' doesn't exist, is the intention to symlink '.git/packed-refs' -> something which doesn't exist? In that case why even make the effort to build a packed-refs file, could we simply do 'ln -sf packed-refs-bak .git/packed-refs' in an empty repo? If not, then 'packed-refs-bak' is definitely a typo and needs to be made 'packed-refs-back' which would go in hand with how we setup the test... > + test_must_fail git refs verify 2>err && > + cat >expect <<-EOF && > + error: packed-refs: badRefFiletype: not a regular file > + EOF > + rm .git/packed-refs && > + test_cmp expect err > + ) > +' > + > test_done > -- > 2.48.1
On Fri, Feb 14, 2025 at 01:50:26AM -0800, Karthik Nayak wrote: > shejialuo <shejialuo@gmail.com> writes: > > > Although "git-fsck(1)" and "packed-backend.c" will check some > > consistency and correctness of "packed-refs" file, they never check the > > Because you say 'some' here, it made me more curious. Could you state > exactly what checks are being done here? > Well, I don't think we need to elaborate on this at now for the following two reasons: 1. We will explain this in the later patches. 2. Here I just want to emphasis that it does not check the filetype. > > filetype of the "packed-refs". The user should always use "git > > pack-refs" command to create the raw regular "packed-refs" file, so we > > need to explicitly check this in "git refs verify". > > > > Not sure I understand how the start of this last sentence correlates to > the end of it. Is the intention to say that we want to explicitly check > the filetype to ensure that the 'packed-refs' file was only created via > 'git pack-refs'? If so, perhaps: > > Verify that the 'packed-refs' file has the expected filetype, > confirming it was created by 'git pack-refs'. > Thanks for the suggestion, I will improve this in the next version. > > We could use "open_nofollow" wrapper to open the raw "packed-refs" file. > > If the returned "fd" value is less than 0, we could check whether the > > "errno" is "ELOOP" to report an error to the user. > > > > Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to > > the user if "packed-refs" is not a regular file. > > > > Mentored-by: Patrick Steinhardt <ps@pks.im> > > Mentored-by: Karthik Nayak <karthik.188@gmail.com> > > Signed-off-by: shejialuo <shejialuo@gmail.com> > > --- > > refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++---- > > t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++ > > 2 files changed, 57 insertions(+), 4 deletions(-) > > > > diff --git a/refs/packed-backend.c b/refs/packed-backend.c > > index a7b6f74b6e..6401cecd5f 100644 > > --- a/refs/packed-backend.c > > +++ b/refs/packed-backend.c > > @@ -4,6 +4,7 @@ > > #include "../git-compat-util.h" > > #include "../config.h" > > #include "../dir.h" > > +#include "../fsck.h" > > #include "../gettext.h" > > #include "../hash.h" > > #include "../hex.h" > > @@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s > > return empty_ref_iterator_begin(); > > } > > > > -static int packed_fsck(struct ref_store *ref_store UNUSED, > > - struct fsck_options *o UNUSED, > > +static int packed_fsck(struct ref_store *ref_store, > > + struct fsck_options *o, > > struct worktree *wt) > > { > > + struct packed_ref_store *refs = packed_downcast(ref_store, > > + REF_STORE_READ, "fsck"); > > + int ret = 0; > > + int fd; > > > > if (!is_main_worktree(wt)) > > - return 0; > > + goto cleanup; > > > > - return 0; > > + if (o->verbose) > > + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path); > > + > > + fd = open_nofollow(refs->path, O_RDONLY); > > + if (fd < 0) { > > + /* > > + * If the packed-refs file doesn't exist, there's nothing > > + * to check. > > + */ > > + if (errno == ENOENT) > > + goto cleanup; > > + > > + if (errno == ELOOP) { > > + struct fsck_ref_report report = { 0 }; > > + report.path = "packed-refs"; > > + ret = fsck_report_ref(o, &report, > > + FSCK_MSG_BAD_REF_FILETYPE, > > + "not a regular file"); > > + goto cleanup; > > + } > > + > > + ret = error_errno(_("unable to open %s"), refs->path); > > + goto cleanup; > > The paragraph in the commit message: > > Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to > the user if "packed-refs" is not a regular file. > > Gave me the indication that any error would be reported via > 'fsck_report_ref()', but it seems like we are only reporting for > symbolic links. Why is that being singled out? > IIRC, when Patrick told me in first version that if I first stat the file type and then use the `strbuf_read_file` to read the content, there is a corner case that the file could be converted into symlink between the `stat` and read. So, I use `open_nofollow` to avoid this situation. (Actually, this could not be avoided because in Windows, we would first stat the file and then open the file due to that there is no "O_NOFOLLOW" flag for Windows). I will find a solution to do this in the next version. > > + } > > + > > +cleanup: > > + return ret; > > } > > > > struct ref_storage_be refs_be_packed = { > > diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh > > index cf7a202d0d..42c8d4ca1e 100755 > > --- a/t/t0602-reffiles-fsck.sh > > +++ b/t/t0602-reffiles-fsck.sh > > @@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' ' > > ) > > ' > > > > +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' > > + test_when_finished "rm -rf repo" && > > + git init repo && > > + ( > > + cd repo && > > + test_commit default && > > + git branch branch-1 && > > + git branch branch-2 && > > + git branch branch-3 && > > + git pack-refs --all && > > + > > + mv .git/packed-refs .git/packed-refs-back && > > + ln -sf packed-refs-bak .git/packed-refs && > > This still doesn't make sense to me. 'packed-refs-bak' doesn't exist, is > the intention to symlink '.git/packed-refs' -> something which doesn't > exist? > > In that case why even make the effort to build a packed-refs file, could > we simply do 'ln -sf packed-refs-bak .git/packed-refs' in an empty repo? > You are correct. My intention is not this. If the "packed-refs" is a symlink and points to file which we can successfully parse. Current Git won't complain. So my motivation here is to imitate this situation. > If not, then 'packed-refs-bak' is definitely a typo and needs to be made > 'packed-refs-back' which would go in hand with how we setup the test... > Thanks for noticing this problem. I definitely made a mistake to type the "packed-refs-back" to "packed-refs-bak". Jialuo
diff --git a/refs/packed-backend.c b/refs/packed-backend.c index a7b6f74b6e..6401cecd5f 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -4,6 +4,7 @@ #include "../git-compat-util.h" #include "../config.h" #include "../dir.h" +#include "../fsck.h" #include "../gettext.h" #include "../hash.h" #include "../hex.h" @@ -1748,15 +1749,45 @@ static struct ref_iterator *packed_reflog_iterator_begin(struct ref_store *ref_s return empty_ref_iterator_begin(); } -static int packed_fsck(struct ref_store *ref_store UNUSED, - struct fsck_options *o UNUSED, +static int packed_fsck(struct ref_store *ref_store, + struct fsck_options *o, struct worktree *wt) { + struct packed_ref_store *refs = packed_downcast(ref_store, + REF_STORE_READ, "fsck"); + int ret = 0; + int fd; if (!is_main_worktree(wt)) - return 0; + goto cleanup; - return 0; + if (o->verbose) + fprintf_ln(stderr, "Checking packed-refs file %s", refs->path); + + fd = open_nofollow(refs->path, O_RDONLY); + if (fd < 0) { + /* + * If the packed-refs file doesn't exist, there's nothing + * to check. + */ + if (errno == ENOENT) + goto cleanup; + + if (errno == ELOOP) { + struct fsck_ref_report report = { 0 }; + report.path = "packed-refs"; + ret = fsck_report_ref(o, &report, + FSCK_MSG_BAD_REF_FILETYPE, + "not a regular file"); + goto cleanup; + } + + ret = error_errno(_("unable to open %s"), refs->path); + goto cleanup; + } + +cleanup: + return ret; } struct ref_storage_be refs_be_packed = { diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh index cf7a202d0d..42c8d4ca1e 100755 --- a/t/t0602-reffiles-fsck.sh +++ b/t/t0602-reffiles-fsck.sh @@ -617,4 +617,26 @@ test_expect_success 'ref content checks should work with worktrees' ' ) ' +test_expect_success SYMLINKS 'the filetype of packed-refs should be checked' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit default && + git branch branch-1 && + git branch branch-2 && + git branch branch-3 && + git pack-refs --all && + + mv .git/packed-refs .git/packed-refs-back && + ln -sf packed-refs-bak .git/packed-refs && + test_must_fail git refs verify 2>err && + cat >expect <<-EOF && + error: packed-refs: badRefFiletype: not a regular file + EOF + rm .git/packed-refs && + test_cmp expect err + ) +' + test_done
Although "git-fsck(1)" and "packed-backend.c" will check some consistency and correctness of "packed-refs" file, they never check the filetype of the "packed-refs". The user should always use "git pack-refs" command to create the raw regular "packed-refs" file, so we need to explicitly check this in "git refs verify". We could use "open_nofollow" wrapper to open the raw "packed-refs" file. If the returned "fd" value is less than 0, we could check whether the "errno" is "ELOOP" to report an error to the user. Reuse "FSCK_MSG_BAD_REF_FILETYPE" fsck message id to report the error to the user if "packed-refs" is not a regular file. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: shejialuo <shejialuo@gmail.com> --- refs/packed-backend.c | 39 +++++++++++++++++++++++++++++++++++---- t/t0602-reffiles-fsck.sh | 22 ++++++++++++++++++++++ 2 files changed, 57 insertions(+), 4 deletions(-)