mbox series

[v5,00/10] Clarify pseudo-ref terminology

Message ID cover.1715755591.git.ps@pks.im (mailing list archive)
Headers show
Series Clarify pseudo-ref terminology | expand

Message

Patrick Steinhardt May 15, 2024, 6:50 a.m. UTC
Hi,

this is the fifth version of my patch series that aims to clarify the
pseudo-ref terminology. Changes compared to v4:

  - Dropped a now-unneeded comment in `is_head_ref_syntax()` which
    claims that "HEAD is not a pseudoref". Due to the rename of the
    function this comment no longer applies.

  - Adapted `is_root_ref()` so that it does not check the ref for
    existence at all anymore, as proposed by Peff. This makes the
    function's behaviour way less confusing overall. I also added some
    explanations to the commit message to explain why this is okay to
    do.

  - Adapted the commit message of `is_headref()` to explain some of the
    subtleties that result from the removed ref existence check.

Thanks!

Patrick Steinhardt (10):
  Documentation/glossary: redefine pseudorefs as special refs
  Documentation/glossary: clarify limitations of pseudorefs
  Documentation/glossary: define root refs as refs
  refs: rename `is_pseudoref()` to `is_root_ref()`
  refs: rename `is_special_ref()` to `is_pseudo_ref()`
  refs: do not check ref existence in `is_root_ref()`
  refs: classify HEAD as a root ref
  refs: pseudorefs are no refs
  ref-filter: properly distinuish pseudo and root refs
  refs: refuse to write pseudorefs

 Documentation/glossary-content.txt | 72 +++++++++++-----------
 builtin/for-each-ref.c             |  2 +-
 ref-filter.c                       | 16 ++---
 ref-filter.h                       |  4 +-
 refs.c                             | 98 +++++++++++-------------------
 refs.h                             | 48 ++++++++++++++-
 refs/files-backend.c               |  3 +-
 refs/reftable-backend.c            |  3 +-
 t/t5510-fetch.sh                   |  6 +-
 t/t6302-for-each-ref-filter.sh     | 34 +++++++++++
 10 files changed, 169 insertions(+), 117 deletions(-)

Range-diff against v4:
 1:  b1fc4c1ac7 =  1:  1f2445b95b Documentation/glossary: redefine pseudorefs as special refs
 2:  dce3a0fa7e =  2:  d328081c52 Documentation/glossary: clarify limitations of pseudorefs
 3:  79249962f5 =  3:  0d185e6479 Documentation/glossary: define root refs as refs
 4:  ee2b090f75 !  4:  33b221b248 refs: rename `is_pseudoref()` to `is_root_ref()`
    @@ refs.c: int is_per_worktree_ref(const char *refname)
      	const char *c;
      
     @@ refs.c: static int is_pseudoref_syntax(const char *refname)
    + 			return 0;
    + 	}
    + 
    +-	/*
    +-	 * HEAD is not a pseudoref, but it certainly uses the
    +-	 * pseudoref syntax.
    +-	 */
      	return 1;
      }
      
 5:  2c09bc7690 !  5:  9087696d82 refs: refname `is_special_ref()` to `is_pseudo_ref()`
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    refs: refname `is_special_ref()` to `is_pseudo_ref()`
    +    refs: rename `is_special_ref()` to `is_pseudo_ref()`
     
         Rename `is_special_ref()` to `is_pseudo_ref()` to adapt to the newly
         defined terminology in our gitglossary(7). Note that in the preceding
 6:  5e402811a6 !  6:  af22581c22 refs: root refs can be symbolic refs
    @@ Metadata
     Author: Patrick Steinhardt <ps@pks.im>
     
      ## Commit message ##
    -    refs: root refs can be symbolic refs
    +    refs: do not check ref existence in `is_root_ref()`
     
         Before this patch series, root refs except for "HEAD" and our special
         refs were classified as pseudorefs. Furthermore, our terminology
    @@ Commit message
     
         Last but not least, the current behaviour can actually lead to a
         segfault when calling `is_root_ref()` with a reference that either does
    -    not exist or that is a symbolic ref because we never initialized `oid`.
    +    not exist or that is a symbolic ref because we never initialized `oid`,
    +    but then read it via `is_null_oid()`.
     
    -    Let's loosen the restrictions in accordance to the new definition of
    -    root refs, which are simply plain refs that may as well be a symbolic
    -    ref. Consequently, we can just check for the ref to exist instead of
    -    requiring it to be a regular ref.
    +    We have now changed terminology to clarify that pseudorefs are really
    +    only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
    +    in the root of the ref hierarchy are just plain refs. Thus, we do not
    +    need to check whether the ref is symbolic or not. In fact, we can now
    +    avoid looking up the ref completely as the name is sufficient for us to
    +    figure out whether something would be a root ref or not.
    +
    +    This change of course changes semantics for our callers. As there are
    +    only three of them we can assess each of them individually:
    +
    +      - "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
    +        It's clear that the intent is to classify based on the ref name,
    +        only.
    +
    +      - "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
    +        filter root refs. Again, using existence checks is pointless here as
    +        the iterator has just surfaced the ref, so we know it does exist.
    +
    +      - "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
    +        determine whether it should add a ref to the root directory of its
    +        iterator. This had the effect that we skipped over any files that
    +        are either a symbolic ref, or which are not a ref at all.
    +
    +        The new behaviour is to include symbolic refs know, which aligns us
    +        with the adapted terminology. Furthermore, files which look like
    +        root refs but aren't are now mark those as "broken". As broken refs
    +        are not surfaced by our tooling, this should not lead to a change in
    +        user-visible behaviour, but may cause us to emit warnings. This
    +        feels like the right thing to do as we would otherwise just silently
    +        ignore corrupted root refs completely.
    +
    +    So in all cases the existence check was either superfluous, not in line
    +    with the adapted terminology or masked potential issues. This commit
    +    thus changes the behaviour as proposed and drops the existence check
    +    altogether.
     
         Add a test that verifies that this does not change user-visible
         behaviour. Namely, we still don't want to show broken refs to the user
    @@ Commit message
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
    + ## ref-filter.c ##
    +@@ ref-filter.c: static int ref_kind_from_refname(const char *refname)
    + 			return ref_kind[i].kind;
    + 	}
    + 
    +-	if (is_root_ref(get_main_ref_store(the_repository), refname))
    ++	if (is_root_ref(refname))
    + 		return FILTER_REFS_PSEUDOREFS;
    + 
    + 	return FILTER_REFS_OTHERS;
    +
      ## refs.c ##
    +@@ refs.c: static int is_root_ref_syntax(const char *refname)
    + 	return 1;
    + }
    + 
    +-int is_root_ref(struct ref_store *refs, const char *refname)
    ++int is_root_ref(const char *refname)
    + {
    + 	static const char *const irregular_root_refs[] = {
    + 		"AUTO_MERGE",
     @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
      		"NOTES_MERGE_REF",
      		"MERGE_AUTOSTASH",
      	};
     -	struct object_id oid;
    -+	struct strbuf referent = STRBUF_INIT;
    -+	struct object_id oid = { 0 };
    -+	int failure_errno, ret = 0;
    -+	unsigned int flags;
      	size_t i;
      
      	if (!is_root_ref_syntax(refname))
      		return 0;
      
    -+	/*
    -+	 * Note that we cannot use `refs_ref_exists()` here because that also
    -+	 * checks whether its target ref exists in case refname is a symbolic
    -+	 * ref.
    -+	 */
    - 	if (ends_with(refname, "_HEAD")) {
    +-	if (ends_with(refname, "_HEAD")) {
     -		refs_resolve_ref_unsafe(refs, refname,
     -					RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
     -					&oid, NULL);
     -		return !is_null_oid(&oid);
    -+		ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
    -+					 &flags, &failure_errno);
    -+		goto done;
    - 	}
    +-	}
    ++	if (ends_with(refname, "_HEAD"))
    ++		return 1;
      
    --	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
    -+	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++) {
    - 		if (!strcmp(refname, irregular_root_refs[i])) {
    + 	for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
    +-		if (!strcmp(refname, irregular_root_refs[i])) {
     -			refs_resolve_ref_unsafe(refs, refname,
     -						RESOLVE_REF_READING | RESOLVE_REF_NO_RECURSE,
     -						&oid, NULL);
     -			return !is_null_oid(&oid);
    -+			ret = !refs_read_raw_ref(refs, refname, &oid, &referent,
    -+						 &flags, &failure_errno);
    -+			goto done;
    - 		}
    -+	}
    +-		}
    ++		if (!strcmp(refname, irregular_root_refs[i]))
    ++			return 1;
      
    --	return 0;
    -+done:
    -+	strbuf_release(&referent);
    -+	return ret;
    + 	return 0;
      }
    - 
    - int is_headref(struct ref_store *refs, const char *refname)
     
      ## refs.h ##
     @@ refs.h: extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
    @@ refs.h: extern struct ref_namespace_info ref_namespace[NAMESPACE__COUNT];
      
      /*
     - * Check whether the reference is an existing root reference.
    -+ * Check whether the reference is an existing root reference. A root reference
    -+ * that is a dangling symbolic ref is considered to exist.
    ++ * Check whether the provided name names a root reference. This function only
    ++ * performs a syntactic check.
       *
       * A root ref is a reference that lives in the root of the reference hierarchy.
       * These references must conform to special syntax:
    +@@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
    +  *
    +  *   - MERGE_AUTOSTASH
    +  */
    +-int is_root_ref(struct ref_store *refs, const char *refname);
    ++int is_root_ref(const char *refname);
    + 
    + int is_headref(struct ref_store *refs, const char *refname);
    + 
    +
    + ## refs/files-backend.c ##
    +@@ refs/files-backend.c: static void add_pseudoref_and_head_entries(struct ref_store *ref_store,
    + 		strbuf_addstr(&refname, de->d_name);
    + 
    + 		dtype = get_dtype(de, &path, 1);
    +-		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
    +-								is_headref(ref_store, de->d_name)))
    ++		if (dtype == DT_REG && (is_root_ref(de->d_name) ||
    ++					is_headref(ref_store, de->d_name)))
    + 			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
    + 
    + 		strbuf_setlen(&refname, dirnamelen);
    +
    + ## refs/reftable-backend.c ##
    +@@ refs/reftable-backend.c: static int reftable_ref_iterator_advance(struct ref_iterator *ref_iterator)
    + 		 */
    + 		if (!starts_with(iter->ref.refname, "refs/") &&
    + 		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
    +-		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
    ++		     (is_root_ref(iter->ref.refname) ||
    + 		      is_headref(&iter->refs->base, iter->ref.refname)))) {
    + 			continue;
    + 		}
     
      ## t/t6302-for-each-ref-filter.sh ##
     @@ t/t6302-for-each-ref-filter.sh: test_expect_success '--include-root-refs with other patterns' '
 7:  b32c56afcb !  7:  b719fb7110 refs: classify HEAD as a root ref
    @@ Commit message
         Adapt the function to also treat "HEAD" as a root ref. This change is
         safe to do for all current callers:
     
    -    - `ref_kind_from_refname()` already handles "HEAD" explicitly before
    -      calling `is_root_ref()`.
    +      - `ref_kind_from_refname()` already handles "HEAD" explicitly before
    +        calling `is_root_ref()`.
     
    -    - The "files" and "reftable" backends explicitly called both
    -      `is_root_ref()` and `is_headref()`.
    +      - The "files" and "reftable" backends explicitly call both
    +        `is_root_ref()` and `is_headref()` together.
     
         This also aligns behaviour or `is_root_ref()` and `is_headref()` such
    -    that we also return a trueish value when the ref is a dangling symbolic
    -    ref. As there are no callers of `is_headref()` left afer the refactoring
    -    we absorb it completely into `is_root_ref()`.
    +    that we stop checking for ref existence. This changes semantics for our
    +    backends:
    +
    +      - In the reftable backend we already know that the ref must exist
    +        because `is_headref()` is called as part of the ref iterator. The
    +        existence check is thus redundant, and the change is safe to do.
    +
    +      - In the files backend we use it when populating root refs, where we
    +        would skip adding the "HEAD" file if it was not possible to resolve
    +        it. The new behaviour is to instead mark "HEAD" as broken, which
    +        will cause us to emit warnings in various places.
    +
    +    As there are no callers of `is_headref()` left afer the refactoring, we
    +    can absorb it completely into `is_root_ref()`.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## refs.c ##
     @@ refs.c: static int is_root_ref_syntax(const char *refname)
    - int is_root_ref(struct ref_store *refs, const char *refname)
    + int is_root_ref(const char *refname)
      {
      	static const char *const irregular_root_refs[] = {
     +		"HEAD",
      		"AUTO_MERGE",
      		"BISECT_EXPECTED_REV",
      		"NOTES_MERGE_PARTIAL",
    -@@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
    - 	return ret;
    +@@ refs.c: int is_root_ref(const char *refname)
    + 	return 0;
      }
      
     -int is_headref(struct ref_store *refs, const char *refname)
    @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       *
     @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       */
    - int is_root_ref(struct ref_store *refs, const char *refname);
    + int is_root_ref(const char *refname);
      
     -int is_headref(struct ref_store *refs, const char *refname);
     -
    @@ refs/files-backend.c: static void add_pseudoref_and_head_entries(struct ref_stor
      		strbuf_addstr(&refname, de->d_name);
      
      		dtype = get_dtype(de, &path, 1);
    --		if (dtype == DT_REG && (is_root_ref(ref_store, de->d_name) ||
    --								is_headref(ref_store, de->d_name)))
    -+		if (dtype == DT_REG && is_root_ref(ref_store, de->d_name))
    +-		if (dtype == DT_REG && (is_root_ref(de->d_name) ||
    +-					is_headref(ref_store, de->d_name)))
    ++		if (dtype == DT_REG && is_root_ref(de->d_name))
      			loose_fill_ref_dir_regular_file(refs, refname.buf, dir);
      
      		strbuf_setlen(&refname, dirnamelen);
    @@ refs/reftable-backend.c: static int reftable_ref_iterator_advance(struct ref_ite
      		 */
      		if (!starts_with(iter->ref.refname, "refs/") &&
      		    !(iter->flags & DO_FOR_EACH_INCLUDE_ROOT_REFS &&
    --		     (is_root_ref(&iter->refs->base, iter->ref.refname) ||
    +-		     (is_root_ref(iter->ref.refname) ||
     -		      is_headref(&iter->refs->base, iter->ref.refname)))) {
    -+		      is_root_ref(&iter->refs->base, iter->ref.refname))) {
    ++		      is_root_ref(iter->ref.refname))) {
      			continue;
      		}
      
 8:  19af8c754c !  8:  5709d7f780 refs: pseudorefs are no refs
    @@ refs.c: int is_per_worktree_ref(const char *refname)
      static int is_root_ref_syntax(const char *refname)
      {
      	const char *c;
    -@@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
    - 	unsigned int flags;
    +@@ refs.c: int is_root_ref(const char *refname)
    + 	};
      	size_t i;
      
     -	if (!is_root_ref_syntax(refname))
    @@ refs.c: int is_root_ref(struct ref_store *refs, const char *refname)
     +	    is_pseudo_ref(refname))
      		return 0;
      
    - 	/*
    + 	if (ends_with(refname, "_HEAD"))
     @@ refs.c: static int refs_read_special_head(struct ref_store *ref_store,
      	return result;
      }
 9:  86f7f2d2d8 !  9:  c7e90e3170 ref-filter: properly distinuish pseudo and root refs
    @@ ref-filter.c: static int ref_kind_from_refname(const char *refname)
      			return ref_kind[i].kind;
      	}
      
    --	if (is_root_ref(get_main_ref_store(the_repository), refname))
    +-	if (is_root_ref(refname))
     +	if (is_pseudo_ref(refname))
      		return FILTER_REFS_PSEUDOREFS;
    -+	if (is_root_ref(get_main_ref_store(the_repository), refname))
    ++	if (is_root_ref(refname))
     +		return FILTER_REFS_ROOT_REFS;
      
      	return FILTER_REFS_OTHERS;
    @@ refs.c: int is_per_worktree_ref(const char *refname)
      ## refs.h ##
     @@ refs.h: void update_ref_namespace(enum ref_namespace namespace, char *ref);
       */
    - int is_root_ref(struct ref_store *refs, const char *refname);
    + int is_root_ref(const char *refname);
      
     +/*
     + * Pseudorefs are refs that have different semantics compared to
10:  640d3b169f = 10:  15595991dc refs: refuse to write pseudorefs

base-commit: 83f1add914c6b4682de1e944ec0d1ac043d53d78