Message ID | 6821f57bdf326f161f152a8af0e47b54513c77b1.1594056572.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Remove special casing for PSEUDOREF updates | expand |
"Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes: > Subject: Re: [PATCH 1/2] Modify pseudo refs through ref backend storage > From: Han-Wen Nienhuys <hanwen@google.com> With what definition of "pseudo refs" has this change been made? Those things like HEAD, CHERRY_PICK_HEAD, FETCH_HEAD etc. that have traditionally been written as a plain text file in $GIT_DIR and are used to name objects by having a full object name in it? Or the entities that behave like refs and stored in ref backends, with all-uppercase-names but do not sit inside refs/ hierarchy? I think it is OK (and possibly a good move in the longer term, but that is just my gut feeling) to make ref backends resopnsible for enumerating, reading and writing them (i.e. I think we want to use the latter definition for the longer term health of the project). And we would want to ... > The previous behavior was introduced in commit 74ec19d4be > ("pseudorefs: create and use pseudoref update and delete functions", > Jul 31, 2015), with the justification "alternate ref backends still > need to store pseudorefs in GIT_DIR". ... declare that justification invalid for that purpose. Is that what is going on? I just want to make sure I am following your flow of thought. > Refs such as REBASE_HEAD are read through the ref backend. This can > only work consistently if they are written through the ref backend as > well. Tooling that works directly on files under .git should be > updated to use git commands to read refs instead. OK. > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 6516c7bc8c..9951c2e403 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -1228,7 +1228,6 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, > > for (i = 0; i < refnames->nr; i++) { > const char *refname = refnames->items[i].string; > - > if (refs_delete_ref(&refs->base, msg, refname, NULL, flags)) > result |= error(_("could not remove reference %s"), refname); > } Unreleated change? > @@ -2436,7 +2435,9 @@ static int lock_ref_for_update(struct files_ref_store *refs, > update->backend_data = lock; > > if (update->type & REF_ISSYMREF) { > - if (update->flags & REF_NO_DEREF) { > + if (update->flags & REF_NO_DEREF || > + (ref_type(update->refname) == REF_TYPE_PSEUDOREF && > + strcmp(update->refname, "HEAD"))) { > /* > * We won't be reading the referent as part of > * the transaction, so we have to read it here The old "if we are not dereferencing" condition in if() exactly matched the comment, but the condition in if() after this change is not "if we are not dereferencing". Even if we are dereferencing, under some new condition, we would still drop into this block and do not follow the "else" side that creates a new update for the referent. Is this part of "modify pseudo refs via backend" topic, or should it be a separate modification? Why is this change needed? It seems that no matter where in the refs/ hierarchy (or even outside) a symbolic ref resides, the way to update itself (not the referent through the symbolic ref) should be the same, which is what the original says, and we want to change that reasoning here, but it is not quite clear to me why. > @@ -2782,8 +2783,10 @@ static int files_transaction_finish(struct ref_store *ref_store, > struct ref_update *update = transaction->updates[i]; > struct ref_lock *lock = update->backend_data; > > - if (update->flags & REF_NEEDS_COMMIT || > - update->flags & REF_LOG_ONLY) { > + if ((ref_type(lock->ref_name) != REF_TYPE_PSEUDOREF || > + !strcmp(lock->ref_name, "HEAD")) && > + (update->flags & REF_NEEDS_COMMIT || > + update->flags & REF_LOG_ONLY)) { And this one stops the files backend from touching all pseudorefs other than HEAD with this codepath. That somehow feels totally opposite from what the log message explained above---weren't we updating the code to write these pseudorefs through the individual backends, which the files backend is one example of? Isn't this change stopping the backend from writing the pseudorefs other than HEAD instead? Puzzled. > if (files_log_ref_write(refs, > lock->ref_name, > &lock->old_oid, > diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh > index 27171f8261..6b8030e8fe 100755 > --- a/t/t1400-update-ref.sh > +++ b/t/t1400-update-ref.sh > @@ -476,7 +476,7 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER > test_expect_success 'given old value for missing pseudoref, do not create' ' > test_must_fail git update-ref PSEUDOREF $A $B 2>err && > test_path_is_missing .git/PSEUDOREF && The reason why I asked what this patch thinks the definition of pseudoref is is because of this thing. Shouldn't this line be fixed not to depend on the files backend? Likewise for $(cat .git/PSEUDOREF) in the remaining two hunks. > - test_i18ngrep "could not read ref" err > + test_i18ngrep "unable to resolve reference" err > ' > > test_expect_success 'create pseudoref' ' > @@ -497,7 +497,7 @@ test_expect_success 'overwrite pseudoref with correct old value' ' > test_expect_success 'do not overwrite pseudoref with wrong old value' ' > test_must_fail git update-ref PSEUDOREF $D $E 2>err && > test $C = $(cat .git/PSEUDOREF) && > - test_i18ngrep "unexpected object ID" err > + test_i18ngrep "cannot lock ref.*expected" err > ' > > test_expect_success 'delete pseudoref' ' > @@ -509,7 +509,7 @@ test_expect_success 'do not delete pseudoref with wrong old value' ' > git update-ref PSEUDOREF $A && > test_must_fail git update-ref -d PSEUDOREF $B 2>err && > test $A = $(cat .git/PSEUDOREF) && > - test_i18ngrep "unexpected object ID" err > + test_i18ngrep "cannot lock ref.*expected" err > ' > > test_expect_success 'delete pseudoref with correct old value' '
On Mon, Jul 6, 2020 at 10:27 PM Junio C Hamano <gitster@pobox.com> wrote: > > "Han-Wen Nienhuys via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > Subject: Re: [PATCH 1/2] Modify pseudo refs through ref backend storage > > From: Han-Wen Nienhuys <hanwen@google.com> > > With what definition of "pseudo refs" has this change been made? > Those things like HEAD, CHERRY_PICK_HEAD, FETCH_HEAD etc. that have > traditionally been written as a plain text file in $GIT_DIR and are > used to name objects by having a full object name in it? > > Or the entities that behave like refs and stored in ref backends, > with all-uppercase-names but do not sit inside refs/ hierarchy? I thought these were the same? - see below :) > I think it is OK (and possibly a good move in the longer term, but > that is just my gut feeling) to make ref backends resopnsible for > enumerating, reading and writing them (i.e. I think we want to use > the latter definition for the longer term health of the project). > And we would want to ... > > > The previous behavior was introduced in commit 74ec19d4be > > ("pseudorefs: create and use pseudoref update and delete functions", > > Jul 31, 2015), with the justification "alternate ref backends still > > need to store pseudorefs in GIT_DIR". > > ... declare that justification invalid for that purpose. > > Is that what is going on? I just want to make sure I am following > your flow of thought. yep. > > const char *refname = refnames->items[i].string; > > - > > if (refs_delete_ref(&refs->base, msg, refname, NULL, flags)) > > result |= error(_("could not remove reference %s"), refname); > > } > > Unreleated change? oops. > > @@ -2436,7 +2435,9 @@ static int lock_ref_for_update(struct files_ref_store *refs, > > update->backend_data = lock; > > > > if (update->type & REF_ISSYMREF) { > > - if (update->flags & REF_NO_DEREF) { > > + if (update->flags & REF_NO_DEREF || > > + (ref_type(update->refname) == REF_TYPE_PSEUDOREF && > > + strcmp(update->refname, "HEAD"))) { > > /* > > * We won't be reading the referent as part of > > * the transaction, so we have to read it here > > The old "if we are not dereferencing" condition in if() exactly > matched the comment, but the condition in if() after this change is > not "if we are not dereferencing". Even if we are dereferencing, > under some new condition, we would still drop into this block and do > not follow the "else" side that creates a new update for the > referent. Is this part of "modify pseudo refs via backend" topic, > or should it be a separate modification? Why is this change needed? There is a test (mentioned in the commit message), t1405, which does $RUN create-symref FOO refs/heads/master nothing && # calls refs_create_symref() .. $RUN delete-refs 0 nothing FOO refs/tags/new-tag && # calls refs_delete_refs, which calls delete_pseudoref .. git rev-parse master >expected && Previously, the delete_pseudoref code would delete just .git/FOO file. Without the change here, going through the ref backend ends up in the !REF_NO_DEREF case, deleting the master branch. I don't know what the correct semantics are (are symrefs used for anything except HEAD ?), so > It seems that no matter where in the refs/ hierarchy (or even > outside) a symbolic ref resides, the way to update itself (not the > referent through the symbolic ref) should be the same, which is what > the original says, and we want to change that reasoning here, but it > is not quite clear to me why. > > > @@ -2782,8 +2783,10 @@ static int files_transaction_finish(struct ref_store *ref_store, > > struct ref_update *update = transaction->updates[i]; > > struct ref_lock *lock = update->backend_data; > > > > - if (update->flags & REF_NEEDS_COMMIT || > > - update->flags & REF_LOG_ONLY) { > > + if ((ref_type(lock->ref_name) != REF_TYPE_PSEUDOREF || > > + !strcmp(lock->ref_name, "HEAD")) && > > + (update->flags & REF_NEEDS_COMMIT || > > + update->flags & REF_LOG_ONLY)) { > > And this one stops the files backend from touching all pseudorefs > other than HEAD with this codepath. That somehow feels totally > opposite from what the log message explained above---weren't we > updating the code to write these pseudorefs through the individual > backends, which the files backend is one example of? Isn't this > change stopping the backend from writing the pseudorefs other than > HEAD instead? There is a test 'core.logAllRefUpdates=always creates no reflog for ORIG_HEAD' that wants there to not be reflog updates for ORIG_HEAD. I don't see why that behavior exists, and would be happy to change it, but I'm too much of a newbie to decide what is right here. > > --- a/t/t1400-update-ref.sh > > +++ b/t/t1400-update-ref.sh > > @@ -476,7 +476,7 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER > > test_expect_success 'given old value for missing pseudoref, do not create' ' > > test_must_fail git update-ref PSEUDOREF $A $B 2>err && > > test_path_is_missing .git/PSEUDOREF && > > The reason why I asked what this patch thinks the definition of > pseudoref is is because of this thing. Shouldn't this line be fixed > not to depend on the files backend? Likewise for $(cat .git/PSEUDOREF) > in the remaining two hunks. This patch doesn't introduce reftable yet, so both definitions are equivalent for the sake of this patch.
Han-Wen Nienhuys <hanwen@google.com> writes: >> Or the entities that behave like refs and stored in ref backends, >> with all-uppercase-names but do not sit inside refs/ hierarchy? > > I thought these were the same? - see below :) >> > --- a/t/t1400-update-ref.sh >> > +++ b/t/t1400-update-ref.sh >> > @@ -476,7 +476,7 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER >> > test_expect_success 'given old value for missing pseudoref, do not create' ' >> > test_must_fail git update-ref PSEUDOREF $A $B 2>err && >> > test_path_is_missing .git/PSEUDOREF && >> >> The reason why I asked what this patch thinks the definition of >> pseudoref is is because of this thing. Shouldn't this line be fixed >> not to depend on the files backend? Likewise for $(cat .git/PSEUDOREF) >> in the remaining two hunks. > > This patch doesn't introduce reftable yet, so both definitions are > equivalent for the sake of this patch. But at the end of the tunnel, we do want to see people stop expecting that .git/PSEUDOREF file is what PSEUDOREF ref is, no? I think that is what "pseudo refs are not just files directly under $GIT_DIR, but managed by the backend" means, and that in turn is a valid justification for other changes introduced in this patch. Expecting the effect of modifying pseudo refs _after_ the code is changed that such modifications are properly done through the ref API to appear on the filesystem feels like it goes against the reason why we are making this change, compared to checking to see if the pseudoref really got updated as desired via the ref API, at least to me.
On Tue, Jul 7, 2020 at 5:20 PM Junio C Hamano <gitster@pobox.com> wrote: > >> The reason why I asked what this patch thinks the definition of > >> pseudoref is is because of this thing. Shouldn't this line be fixed > >> not to depend on the files backend? Likewise for $(cat .git/PSEUDOREF) > >> in the remaining two hunks. > > > > This patch doesn't introduce reftable yet, so both definitions are > > equivalent for the sake of this patch. > > But at the end of the tunnel, we do want to see people stop > expecting that .git/PSEUDOREF file is what PSEUDOREF ref is, no? I > think that is what "pseudo refs are not just files directly under > $GIT_DIR, but managed by the backend" means, and that in turn is a > valid justification for other changes introduced in this patch. > Expecting the effect of modifying pseudo refs _after_ the code is > changed that such modifications are properly done through the ref > API to appear on the filesystem feels like it goes against the > reason why we are making this change, compared to checking to see if > the pseudoref really got updated as desired via the ref API, at > least to me. I can fix this specific instance here (which is what I think you want), for this commit to practice what it preaches. At the same time there are probably about 100 or so other places where the tests check the file system directly for ref(log) existence, so it would never be totally consistent. The only way to systematically find the offending places is to introduce a new ref backend and then fix all the tests, and I think that goes outside the scope of this small series. -- Han-Wen Nienhuys - Google Munich I work 80%. Don't expect answers from me on Fridays. -- Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Han-Wen Nienhuys <hanwen@google.com> writes: > I can fix this specific instance here (which is what I think you > want), for this commit to practice what it preaches. At the same time > there are probably about 100 or so other places where the tests check > the file system directly for ref(log) existence, so it would never be > totally consistent. > > The only way to systematically find the offending places is to > introduce a new ref backend and then fix all the tests, and I think > that goes outside the scope of this small series. Good. I think we are on the same page. I suggested to make the patch internally consistent, nothing more. And fixing everything in the world is outside the scope of these two patches. Thanks.
diff --git a/refs.c b/refs.c index 224ff66c7b..a2fd42364f 100644 --- a/refs.c +++ b/refs.c @@ -739,102 +739,6 @@ long get_files_ref_lock_timeout_ms(void) return timeout_ms; } -static int write_pseudoref(const char *pseudoref, const struct object_id *oid, - const struct object_id *old_oid, struct strbuf *err) -{ - const char *filename; - int fd; - struct lock_file lock = LOCK_INIT; - struct strbuf buf = STRBUF_INIT; - int ret = -1; - - if (!oid) - return 0; - - strbuf_addf(&buf, "%s\n", oid_to_hex(oid)); - - filename = git_path("%s", pseudoref); - fd = hold_lock_file_for_update_timeout(&lock, filename, 0, - get_files_ref_lock_timeout_ms()); - if (fd < 0) { - strbuf_addf(err, _("could not open '%s' for writing: %s"), - filename, strerror(errno)); - goto done; - } - - if (old_oid) { - struct object_id actual_old_oid; - - if (read_ref(pseudoref, &actual_old_oid)) { - if (!is_null_oid(old_oid)) { - strbuf_addf(err, _("could not read ref '%s'"), - pseudoref); - rollback_lock_file(&lock); - goto done; - } - } else if (is_null_oid(old_oid)) { - strbuf_addf(err, _("ref '%s' already exists"), - pseudoref); - rollback_lock_file(&lock); - goto done; - } else if (!oideq(&actual_old_oid, old_oid)) { - strbuf_addf(err, _("unexpected object ID when writing '%s'"), - pseudoref); - rollback_lock_file(&lock); - goto done; - } - } - - if (write_in_full(fd, buf.buf, buf.len) < 0) { - strbuf_addf(err, _("could not write to '%s'"), filename); - rollback_lock_file(&lock); - goto done; - } - - commit_lock_file(&lock); - ret = 0; -done: - strbuf_release(&buf); - return ret; -} - -static int delete_pseudoref(const char *pseudoref, const struct object_id *old_oid) -{ - const char *filename; - - filename = git_path("%s", pseudoref); - - if (old_oid && !is_null_oid(old_oid)) { - struct lock_file lock = LOCK_INIT; - int fd; - struct object_id actual_old_oid; - - fd = hold_lock_file_for_update_timeout( - &lock, filename, 0, - get_files_ref_lock_timeout_ms()); - if (fd < 0) { - error_errno(_("could not open '%s' for writing"), - filename); - return -1; - } - if (read_ref(pseudoref, &actual_old_oid)) - die(_("could not read ref '%s'"), pseudoref); - if (!oideq(&actual_old_oid, old_oid)) { - error(_("unexpected object ID when deleting '%s'"), - pseudoref); - rollback_lock_file(&lock); - return -1; - } - - unlink(filename); - rollback_lock_file(&lock); - } else { - unlink(filename); - } - - return 0; -} - int refs_delete_ref(struct ref_store *refs, const char *msg, const char *refname, const struct object_id *old_oid, @@ -843,11 +747,6 @@ int refs_delete_ref(struct ref_store *refs, const char *msg, struct ref_transaction *transaction; struct strbuf err = STRBUF_INIT; - if (ref_type(refname) == REF_TYPE_PSEUDOREF) { - assert(refs == get_main_ref_store(the_repository)); - return delete_pseudoref(refname, old_oid); - } - transaction = ref_store_transaction_begin(refs, &err); if (!transaction || ref_transaction_delete(transaction, refname, old_oid, @@ -1170,18 +1069,13 @@ int refs_update_ref(struct ref_store *refs, const char *msg, struct strbuf err = STRBUF_INIT; int ret = 0; - if (ref_type(refname) == REF_TYPE_PSEUDOREF) { - assert(refs == get_main_ref_store(the_repository)); - ret = write_pseudoref(refname, new_oid, old_oid, &err); - } else { - t = ref_store_transaction_begin(refs, &err); - if (!t || - ref_transaction_update(t, refname, new_oid, old_oid, - flags, msg, &err) || - ref_transaction_commit(t, &err)) { - ret = 1; - ref_transaction_free(t); - } + t = ref_store_transaction_begin(refs, &err); + if (!t || + ref_transaction_update(t, refname, new_oid, old_oid, flags, msg, + &err) || + ref_transaction_commit(t, &err)) { + ret = 1; + ref_transaction_free(t); } if (ret) { const char *str = _("update_ref failed for ref '%s': %s"); diff --git a/refs/files-backend.c b/refs/files-backend.c index 6516c7bc8c..9951c2e403 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1228,7 +1228,6 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg, for (i = 0; i < refnames->nr; i++) { const char *refname = refnames->items[i].string; - if (refs_delete_ref(&refs->base, msg, refname, NULL, flags)) result |= error(_("could not remove reference %s"), refname); } @@ -2436,7 +2435,9 @@ static int lock_ref_for_update(struct files_ref_store *refs, update->backend_data = lock; if (update->type & REF_ISSYMREF) { - if (update->flags & REF_NO_DEREF) { + if (update->flags & REF_NO_DEREF || + (ref_type(update->refname) == REF_TYPE_PSEUDOREF && + strcmp(update->refname, "HEAD"))) { /* * We won't be reading the referent as part of * the transaction, so we have to read it here @@ -2782,8 +2783,10 @@ static int files_transaction_finish(struct ref_store *ref_store, struct ref_update *update = transaction->updates[i]; struct ref_lock *lock = update->backend_data; - if (update->flags & REF_NEEDS_COMMIT || - update->flags & REF_LOG_ONLY) { + if ((ref_type(lock->ref_name) != REF_TYPE_PSEUDOREF || + !strcmp(lock->ref_name, "HEAD")) && + (update->flags & REF_NEEDS_COMMIT || + update->flags & REF_LOG_ONLY)) { if (files_log_ref_write(refs, lock->ref_name, &lock->old_oid, diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index 27171f8261..6b8030e8fe 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -476,7 +476,7 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER test_expect_success 'given old value for missing pseudoref, do not create' ' test_must_fail git update-ref PSEUDOREF $A $B 2>err && test_path_is_missing .git/PSEUDOREF && - test_i18ngrep "could not read ref" err + test_i18ngrep "unable to resolve reference" err ' test_expect_success 'create pseudoref' ' @@ -497,7 +497,7 @@ test_expect_success 'overwrite pseudoref with correct old value' ' test_expect_success 'do not overwrite pseudoref with wrong old value' ' test_must_fail git update-ref PSEUDOREF $D $E 2>err && test $C = $(cat .git/PSEUDOREF) && - test_i18ngrep "unexpected object ID" err + test_i18ngrep "cannot lock ref.*expected" err ' test_expect_success 'delete pseudoref' ' @@ -509,7 +509,7 @@ test_expect_success 'do not delete pseudoref with wrong old value' ' git update-ref PSEUDOREF $A && test_must_fail git update-ref -d PSEUDOREF $B 2>err && test $A = $(cat .git/PSEUDOREF) && - test_i18ngrep "unexpected object ID" err + test_i18ngrep "cannot lock ref.*expected" err ' test_expect_success 'delete pseudoref with correct old value' '