Message ID | 20250122100319.2280647-1-karthik.188@gmail.com |
---|---|
State | New |
Series | refs: fix creation of corrupted reflogs for symrefs |
On Wed, Jan 22, 2025 at 11:03:19AM +0100, Karthik Nayak wrote:
> The commit 297c09eabb (refs: allow multiple reflog entries for the same
> refname, 2024-12-16) added logic for reflogs to exit early in
> `lock_ref_for_update()` after obtaining the required lock. This was
> added as a performance optimization as it was assumed that no further
> processing was required for reflog only updates. However this was

s/reflog only/reflog-only

> incorrect since for a symref's reflog entry, the update needs to be
> populated with the old_oid value. This is done right after the early
> exit.
>
> This caused a bug in Git 2.48 where target references of symrefs being
> updated would create a corrupted reflog entry for the symref since the
> old_oid is not populated. Undo the skip in logic to fix this issue and
> also add a test to ensure that such an issue doesn't arise in the
> future.

It's a bit curious that you describe the fix here, then in the next
paragraph describe why we have skipped the logic only to reiterate the
fix.

> The early exit was added as a performance optimization for reflog-only
> updates, but this accidentally broke symref reflog handling. Remove the
> optimization since it wasn't essential to the original changes.

[snip]

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 5cfb8b7ca8..29f08dced4 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2615,9 +2615,6 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  
>  	update->backend_data = lock;
>  
> -	if (update->flags & REF_LOG_ONLY)
> -		goto out;
> -
>  	if (update->type & REF_ISSYMREF) {
>  		if (update->flags & REF_NO_DEREF) {
>  			/*

Okay, makes sense. The error is specific to the "files" backend, which
might be worth mentioning in the commit message.

One thing that made me a bit curious is that we now end up executing
`check_old_oid()` for symref reflog entries, because we have
`REF_ISSYMREF` and `REF_NO_DEREF` set. But that function should end up
skipping the check because we explicitly unset `REF_HAVE_OLD` when
queueing the update. The remainder should be skipped because we have
`REF_LOG_ONLY` set.

> diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
> index e2316f1dd4..59493dd73f 100755
> --- a/t/t1400-update-ref.sh
> +++ b/t/t1400-update-ref.sh
> @@ -4,6 +4,8 @@
>  #
>  
>  test_description='Test git update-ref and basic ref logging'
> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>  
>  . ./test-lib.sh
>  

We could use `git symbolic-ref HEAD` to resolve the branch name instead
of overriding the branch name here.

Patrick
On Wed, Jan 22, 2025 at 11:03:19AM +0100, Karthik Nayak wrote:

> The commit 297c09eabb (refs: allow multiple reflog entries for the same
> refname, 2024-12-16) added logic for reflogs to exit early in
> `lock_ref_for_update()` after obtaining the required lock. This was
> added as a performance optimization as it was assumed that no further
> processing was required for reflog only updates. However this was
> incorrect since for a symref's reflog entry, the update needs to be
> populated with the old_oid value. This is done right after the early
> exit.
>
> This caused a bug in Git 2.48 where target references of symrefs being
> updated would create a corrupted reflog entry for the symref since the
> old_oid is not populated. Undo the skip in logic to fix this issue and
> also add a test to ensure that such an issue doesn't arise in the
> future.
>
> The early exit was added as a performance optimization for reflog-only
> updates, but this accidentally broke symref reflog handling. Remove the
> optimization since it wasn't essential to the original changes.

Thanks for the explanation.

> Reported-by: Nika Layzell <nika@thelayzells.com>
> Co-authored-by: Jeff King <peff@peff.net>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>

I don't know if we need my s-o-b to delete a few lines of code, but just
in case:

  Signed-off-by: Jeff King <peff@peff.net>

> +test_expect_success 'update-ref should also create reflog for HEAD' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +		test_commit A &&
> +		test_commit B &&
> +		git rev-parse HEAD >>expect &&

Using ">>" here is unexpected. It's OK because we are in a new repo (so
there is no leftover "expect" file from a previous test) but probably
better to stick to ">" unless we really need to append.

Plus I don't think there is really any need for a new repo. The
important thing is just updating the branch via update-ref (it doesn't
even have to be a rewind, but of course it has to exist already, so a
rewind is the simplest thing).

> +		git update-ref --create-reflog refs/heads/main HEAD~ &&

I agree with Patrick that we are probably better off just getting the
branch name with symbolic-ref.

So all together, something like:

diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index e2316f1dd4..29045aad43 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -2068,4 +2068,13 @@ do
 
 done
 
+test_expect_success 'update-ref should also create reflog for HEAD' '
+	test_commit to-rewind &&
+	git rev-parse HEAD >expect &&
+	head=$(git symbolic-ref HEAD) &&
+	git update-ref --create-reflog "$head" HEAD~ &&
+	git rev-parse HEAD@{1} >actual &&
+	test_cmp expect actual
+'
+
 test_done

-Peff
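For readers who want to try the scenario outside git's test harness, the test suggested above can be approximated as a standalone script. This is only a sketch under stated assumptions: the helper name `check_head_reflog`, the throwaway identity and the use of empty commits are made up for illustration and are not part of the patch. It rewinds the current branch with `git update-ref` in a scratch repository and reports whether `HEAD@{1}` still names the pre-rewind commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): rewind the current branch with
# update-ref in a scratch repository and check that HEAD@{1} still names the
# pre-rewind commit, mirroring what the proposed test does with test_cmp.
check_head_reflog () (
	# Throwaway identity so commits and reflog entries can be written anywhere.
	GIT_AUTHOR_NAME=Tester GIT_AUTHOR_EMAIL=tester@example.com
	GIT_COMMITTER_NAME=Tester GIT_COMMITTER_EMAIL=tester@example.com
	export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

	repo=$(mktemp -d) || exit 1
	trap 'rm -rf "$repo"' EXIT
	cd "$repo" &&
	git init -q . &&
	git commit -q --allow-empty -m first &&
	git commit -q --allow-empty -m to-rewind || exit 1

	expect=$(git rev-parse HEAD)
	# Resolve the branch name instead of assuming "main".
	head=$(git symbolic-ref HEAD)
	git update-ref --create-reflog "$head" HEAD~ || exit 1

	# If the HEAD reflog entry is broken, HEAD@{1} will not match $expect.
	actual=$(git rev-parse --verify -q "HEAD@{1}" 2>/dev/null)
	if test "$expect" = "$actual"
	then
		echo "HEAD reflog ok"
	else
		echo "HEAD reflog corrupted"
	fi
)

result=$(check_head_reflog)
echo "$result"
```

Which of the two messages is printed depends on the Git version that runs the script, so no expected output is given here.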
Patrick Steinhardt <ps@pks.im> writes:

>> This caused a bug in Git 2.48 where target references of symrefs being
>> updated would create a corrupted reflog entry for the symref since the
>> old_oid is not populated. Undo the skip in logic to fix this issue and
>> also add a test to ensure that such an issue doesn't arise in the
>> future.
>
> It's a bit curious that you describe the fix here, then in the next
> paragraph describe why we have skipped the logic only to reiterate the
> fix.
>
>> The early exit was added as a performance optimization for reflog-only
>> updates, but this accidentally broke symref reflog handling. Remove the
>> optimization since it wasn't essential to the original changes.

Yeah, that indeed is a "bit" curious. I'd call it confusing, though
;-).

> Okay, makes sense. The error is specific to the "files" backend, which
> might be worth mentioning in the commit message.

Indeed.

>>  test_description='Test git update-ref and basic ref logging'
>> +GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>> +export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>>  
>>  . ./test-lib.sh
>>  
>
> We could use `git symbolic-ref HEAD` to resolve the branch name instead
> of overriding the branch name here.

I agree. That sounds like a more sensible way to go.

Thanks.
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 5cfb8b7ca8..29f08dced4 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2615,9 +2615,6 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 
 	update->backend_data = lock;
 
-	if (update->flags & REF_LOG_ONLY)
-		goto out;
-
 	if (update->type & REF_ISSYMREF) {
 		if (update->flags & REF_NO_DEREF) {
 			/*
diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index e2316f1dd4..59493dd73f 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -4,6 +4,8 @@
 #
 
 test_description='Test git update-ref and basic ref logging'
+GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
+export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
 
@@ -2068,4 +2070,18 @@ do
 
 done
 
+test_expect_success 'update-ref should also create reflog for HEAD' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit A &&
+		test_commit B &&
+		git rev-parse HEAD >>expect &&
+		git update-ref --create-reflog refs/heads/main HEAD~ &&
+		git rev-parse HEAD@{1} >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
The commit 297c09eabb (refs: allow multiple reflog entries for the same
refname, 2024-12-16) added logic for reflogs to exit early in
`lock_ref_for_update()` after obtaining the required lock. This was
added as a performance optimization as it was assumed that no further
processing was required for reflog only updates. However this was
incorrect since for a symref's reflog entry, the update needs to be
populated with the old_oid value. This is done right after the early
exit.

This caused a bug in Git 2.48 where target references of symrefs being
updated would create a corrupted reflog entry for the symref since the
old_oid is not populated. Undo the skip in logic to fix this issue and
also add a test to ensure that such an issue doesn't arise in the
future.

The early exit was added as a performance optimization for reflog-only
updates, but this accidentally broke symref reflog handling. Remove the
optimization since it wasn't essential to the original changes.

Reported-by: Nika Layzell <nika@thelayzells.com>
Co-authored-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---

Hello,

This patch is based on top of 'maint' so that it can be easily
backported.

Sorry for the inconvenience here. This was a premature optimization
which wasn't needed, and unfortunately this wasn't captured by any test.

Karthik
---
 refs/files-backend.c  |  3 ---
 t/t1400-update-ref.sh | 16 ++++++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)
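The missing old value described in the commit message can also be looked at on disk. In the "files" backend each reflog line begins with the old and the new object ID, so after updating a branch that HEAD points at, the newest line of the HEAD reflog should carry the previous commit as its old value. The following is a minimal sketch, assuming a scratch repository, a throwaway identity and a hypothetical helper name `head_reflog_old_value`. The thread does not say what an affected Git writes in that field, so the script only compares it against the expected commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): update the branch HEAD points
# at and inspect the old-value field of the newest HEAD reflog line.
head_reflog_old_value () (
	# Throwaway identity so commits and reflog entries can be written anywhere.
	GIT_AUTHOR_NAME=Tester GIT_AUTHOR_EMAIL=tester@example.com
	GIT_COMMITTER_NAME=Tester GIT_COMMITTER_EMAIL=tester@example.com
	export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

	repo=$(mktemp -d) || exit 1
	trap 'rm -rf "$repo"' EXIT
	cd "$repo" &&
	git init -q . &&
	git commit -q --allow-empty -m first &&
	git commit -q --allow-empty -m second || exit 1

	before=$(git rev-parse HEAD)
	git update-ref --create-reflog "$(git symbolic-ref HEAD)" HEAD~ || exit 1

	# Reflog lines look like "<old-oid> <new-oid> <identity> ...";
	# the newest entry is the last line of the file.
	log=$(git rev-parse --git-path logs/HEAD)
	old=$(tail -n 1 "$log" | cut -d' ' -f1)

	if test "$old" = "$before"
	then
		echo "old value recorded"
	else
		echo "old value missing"
	fi
)

status=$(head_reflog_old_value)
echo "$status"
```

The script reads the log file directly, so it only applies to the "files" backend, which is also the only backend the patch touches.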