diff mbox series

[11/12] builtin/show-ref: add new mode to check for reference existence

Message ID 2f876e61dd36a8887a1286bb8db9fb6577c55c9b.1698152926.git.ps@pks.im (mailing list archive)
State Superseded
Series show-ref: introduce mode to check for ref existence

Commit Message

Patrick Steinhardt Oct. 24, 2023, 1:11 p.m. UTC
While we have multiple ways to show the value of a given reference, we
do not have any way to check whether a reference exists at all. While
commands like git-rev-parse(1) or git-show-ref(1) can be used to check
for reference existence in case the reference resolves to something
sane, neither of them can be used to check for existence in some other
scenarios where the reference does not resolve cleanly:

    - References which have an invalid name cannot be resolved.

    - References to nonexistent objects cannot be resolved.

    - Dangling symrefs can be resolved via git-symbolic-ref(1), but this
      requires the caller to special case existence checks depending on
      whteher or not a reference is symbolic or direct.

Furthermore, git-rev-list(1) and other commands do not let the caller
distinguish easily between an actually missing reference and a generic
error.

Taken together, this gseems like sufficient motivation to introduce a
separate plumbing command to explicitly check for the existence of a
reference without trying to resolve its contents.

This new command comes in the form of `git show-ref --exists`. This
new mode will exit successfully when the reference exists, with a
specific error code of 2 when it does not exist, or with 1 when there
has been a generic error.
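
For illustration only, a caller could branch on the three exit codes described above roughly as follows. This is a minimal sketch that assumes `--exists` behaves as this commit message states; `classify_ref` is a hypothetical helper name and not part of Git.

```shell
#!/bin/sh
# Hypothetical helper: map the exit status of `git show-ref --exists`
# to a word. Assumes 0 = exists, 2 = missing, anything else = error,
# as described in the commit message.
classify_ref () {
	rc=0
	# Capture the exit status without tripping `set -e` in the caller.
	git show-ref --exists "$1" || rc=$?
	case $rc in
	0) echo "exists" ;;
	2) echo "missing" ;;
	*) echo "error" ;;
	esac
}
```

Calling `classify_ref refs/heads/dangling` would then print one of the three words, so a script need not treat symbolic and direct references differently.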

Note that the only way to properly implement this command is by using
the internal `refs_read_raw_ref()` function. While the public function
`refs_resolve_ref_unsafe()` can be made to behave in the same way by
passing various flags, it does not provide any way to obtain the errno
with which the reference backend failed when reading the reference. As
such, it becomes impossible for us to distinguish generic errors from
the explicit case where the reference wasn't found.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/git-show-ref.txt | 11 ++++++
 builtin/show-ref.c             | 47 ++++++++++++++++++++++--
 t/t1403-show-ref.sh            | 67 +++++++++++++++++++++++++++++++++-
 3 files changed, 120 insertions(+), 5 deletions(-)

Comments

Eric Sunshine Oct. 24, 2023, 9:01 p.m. UTC | #1
On Tue, Oct 24, 2023 at 9:11 AM Patrick Steinhardt <ps@pks.im> wrote:
> While we have multiple ways to show the value of a given reference, we
> do not have any way to check whether a reference exists at all. While
> commands like git-rev-parse(1) or git-show-ref(1) can be used to check
> for reference existence in case the reference resolves to something
> sane, neither of them can be used to check for existence in some other
> scenarios where the reference does not resolve cleanly:
>
>     - References which have an invalid name cannot be resolved.
>
>     - References to nonexistent objects cannot be resolved.
>
>     - Dangling symrefs can be resolved via git-symbolic-ref(1), but this
>       requires the caller to special case existence checks depending on
>       whteher or not a reference is symbolic or direct.

s/whteher/whether/

> Furthermore, git-rev-list(1) and other commands do not let the caller
> distinguish easily between an actually missing reference and a generic
> error.
>
> Taken together, this gseems like sufficient motivation to introduce a

s/gseems/seems/

> separate plumbing command to explicitly check for the existence of a
> reference without trying to resolve its contents.
>
> This new command comes in the form of `git show-ref --exists`. This
> new mode will exit successfully when the reference exists, with a
> specific error code of 2 when it does not exist, or with 1 when there
> has been a generic error.
>
> Note that the only way to properly implement this command is by using
> the internal `refs_read_raw_ref()` function. While the public function
> `refs_resolve_ref_unsafe()` can be made to behave in the same way by
> passing various flags, it does not provide any way to obtain the errno
> with which the reference backend failed when reading the reference. As
> such, it becomes impossible for us to distinguish generic errors from
> the explicit case where the reference wasn't found.
>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
> @@ -65,6 +70,12 @@ OPTIONS
> +--exists::
> +
> +       Check whether the given reference exists. Returns an error code of 0 if

We probably want to call this "exit code" rather than "error code"
since the latter is unnecessarily scary sounding for the success case
(when the ref does exist).

> +       it does, 2 if it is missing, and 128 in case looking up the reference
> +       failed with an error other than the reference being missing.

The commit message says it returns 1 for a generic error, but this
inconsistently says it returns 128 for that case. The actual
implementation returns 1.

> diff --git a/builtin/show-ref.c b/builtin/show-ref.c
> @@ -214,6 +215,41 @@ static int cmd_show_ref__patterns(const struct patterns_options *opts,
> +static int cmd_show_ref__exists(const char **refs)
> +{
> +       struct strbuf unused_referent = STRBUF_INIT;
> +       struct object_id unused_oid;
> +       unsigned int unused_type;
> +       int failure_errno = 0;
> +       const char *ref;
> +       int ret = 1;
> +
> +       if (!refs || !*refs)
> +               die("--exists requires a reference");
> +       ref = *refs++;
> +       if (*refs)
> +               die("--exists requires exactly one reference");
> +
> +       if (refs_read_raw_ref(get_main_ref_store(the_repository), ref,
> +                             &unused_oid, &unused_referent, &unused_type,
> +                             &failure_errno)) {
> +               if (failure_errno == ENOENT) {
> +                       error(_("reference does not exist"));

The documentation doesn't mention this printing any output, and indeed
one would intuitively expect a boolean-like operation to not produce
any printed output since its exit code indicates the result (except,
of course, in the case of a real error).

> +                       ret = 2;
> +               } else {
> +                       error(_("failed to look up reference: %s"), strerror(failure_errno));

Or use error_errno():

    errno = failure_errno;
    error_errno(_("failed to look up reference"));

> +               }
> +
> +               goto out;
> +       }
> +
> +       ret = 0;
> +
> +out:
> +       strbuf_release(&unused_referent);
> +       return ret;
> +}

It's a bit odd having `ret` be 1 at the outset rather than 0, thus
making the logic a bit more difficult to reason about. I would have
expected it to be organized like this:

    int ret = 0;
    if (refs_read_raw_ref(...)) {
        if (failure_errno == ENOENT) {
            ret = 2;
        } else {
            ret = 1;
            errno = failure_errno;
            error_errno(_("failed to look up reference"));
        }
    }
    strbuf_release(...);
    return ret;

> @@ -272,13 +309,15 @@ int cmd_show_ref(int argc, const char **argv, const char *prefix)
> +       if ((!!exclude_existing_opts.enabled + !!verify + !!exists) > 1)
> +               die(_("only one of --exclude-existing, --exists or --verify can be given"));

When reviewing an earlier patch in this series, I forgot to mention
that we can simplify the life of translators by using placeholders:

    die(_("options '%s', '%s' or '%s' cannot be used together"),
        "--exclude-existing", "--exists", "--verify");

which ensures that they don't translate the literal option names, and
makes it possible to reuse the translated message in multiple
locations (since it doesn't mention hard-coded option names).

Patrick Steinhardt Oct. 25, 2023, 11:50 a.m. UTC | #2
On Tue, Oct 24, 2023 at 05:01:55PM -0400, Eric Sunshine wrote:
> On Tue, Oct 24, 2023 at 9:11 AM Patrick Steinhardt <ps@pks.im> wrote:
> > While we have multiple ways to show the value of a given reference, we
> > do not have any way to check whether a reference exists at all. While
> > commands like git-rev-parse(1) or git-show-ref(1) can be used to check
> > for reference existence in case the reference resolves to something
> > sane, neither of them can be used to check for existence in some other
> > scenarios where the reference does not resolve cleanly:
> >
> >     - References which have an invalid name cannot be resolved.
> >
> >     - References to nonexistent objects cannot be resolved.
> >
> >     - Dangling symrefs can be resolved via git-symbolic-ref(1), but this
> >       requires the caller to special case existence checks depending on
> >       whteher or not a reference is symbolic or direct.
> 
> s/whteher/whether/
> 
> > Furthermore, git-rev-list(1) and other commands do not let the caller
> > distinguish easily between an actually missing reference and a generic
> > error.
> >
> > Taken together, this gseems like sufficient motivation to introduce a
> 
> s/gseems/seems/
> 
> > separate plumbing command to explicitly check for the existence of a
> > reference without trying to resolve its contents.
> >
> > This new command comes in the form of `git show-ref --exists`. This
> > new mode will exit successfully when the reference exists, with a
> > specific error code of 2 when it does not exist, or with 1 when there
> > has been a generic error.
> >
> > Note that the only way to properly implement this command is by using
> > the internal `refs_read_raw_ref()` function. While the public function
> > `refs_resolve_ref_unsafe()` can be made to behave in the same way by
> > passing various flags, it does not provide any way to obtain the errno
> > with which the reference backend failed when reading the reference. As
> > such, it becomes impossible for us to distinguish generic errors from
> > the explicit case where the reference wasn't found.
> >
> > Signed-off-by: Patrick Steinhardt <ps@pks.im>
> > ---
> > diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
> > @@ -65,6 +70,12 @@ OPTIONS
> > +--exists::
> > +
> > +       Check whether the given reference exists. Returns an error code of 0 if
> 
> We probably want to call this "exit code" rather than "error code"
> since the latter is unnecessarily scary sounding for the success case
> (when the ref does exist).

I was trying to stick to the preexisting style of "error code" in this
manual page. But I think I agree with your argument that we also call it
an error code in the successful case, which is misleading.

> > +       it does, 2 if it is missing, and 128 in case looking up the reference
> > +       failed with an error other than the reference being missing.
> 
> The commit message says it returns 1 for a generic error, but this
> inconsistently says it returns 128 for that case. The actual
> implementation returns 1.

Good catch, fixed.

> > diff --git a/builtin/show-ref.c b/builtin/show-ref.c
> > @@ -214,6 +215,41 @@ static int cmd_show_ref__patterns(const struct patterns_options *opts,
> > +static int cmd_show_ref__exists(const char **refs)
> > +{
> > +       struct strbuf unused_referent = STRBUF_INIT;
> > +       struct object_id unused_oid;
> > +       unsigned int unused_type;
> > +       int failure_errno = 0;
> > +       const char *ref;
> > +       int ret = 1;
> > +
> > +       if (!refs || !*refs)
> > +               die("--exists requires a reference");
> > +       ref = *refs++;
> > +       if (*refs)
> > +               die("--exists requires exactly one reference");
> > +
> > +       if (refs_read_raw_ref(get_main_ref_store(the_repository), ref,
> > +                             &unused_oid, &unused_referent, &unused_type,
> > +                             &failure_errno)) {
> > +               if (failure_errno == ENOENT) {
> > +                       error(_("reference does not exist"));
> 
> The documentation doesn't mention this printing any output, and indeed
> one would intuitively expect a boolean-like operation to not produce
> any printed output since its exit code indicates the result (except,
> of course, in the case of a real error).

I'm inclined to leave this as-is. While the exit code should be
sufficient, I think it's rather easy to wonder whether it actually did
anything at all and why it failed in more interactive use cases. Not
that I think these will necessarily exist.

I also don't think it's going to hurt to print this error. If it ever
does start to become a problem we might end up honoring the "--quiet"
flag to squelch this case.
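
Until such a flag exists, a script that only wants the exit status can discard the message itself. The following is a minimal sketch of that approach; `ref_exists_quiet` is a hypothetical wrapper name, and it assumes the message is written to stderr as in the patch.

```shell
#!/bin/sh
# Hypothetical wrapper: run the existence check and drop anything
# printed to stderr. The function's exit status is the exit status
# of `git show-ref --exists` itself.
ref_exists_quiet () {
	git show-ref --exists "$1" 2>/dev/null
}
```

Note that this also hides the message for generic errors, so callers that need the reason for a failure would have to keep stderr.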

> > +                       ret = 2;
> > +               } else {
> > +                       error(_("failed to look up reference: %s"), strerror(failure_errno));
> 
> Or use error_errno():
> 
>     errno = failure_errno;
>     error_errno(_("failed to look up reference"));

Ah, good suggestion.

> > +               }
> > +
> > +               goto out;
> > +       }
> > +
> > +       ret = 0;
> > +
> > +out:
> > +       strbuf_release(&unused_referent);
> > +       return ret;
> > +}
> 
> It's a bit odd having `ret` be 1 at the outset rather than 0, thus
> making the logic a bit more difficult to reason about. I would have
> expected it to be organized like this:
> 
>     int ret = 0;
>     if (refs_read_raw_ref(...)) {
>         if (failure_errno == ENOENT) {
>             ret = 2;
>         } else {
>             ret = 1;
>             errno = failure_errno;
>             error_errno(_("failed to look up reference"));
>         }
>     }
>     strbuf_release(...);
>     return ret;

Fair enough. I've seen both styles used in our codebase, but ultimately
don't care much which of either we use here. Will adapt.

> > @@ -272,13 +309,15 @@ int cmd_show_ref(int argc, const char **argv, const char *prefix)
> > +       if ((!!exclude_existing_opts.enabled + !!verify + !!exists) > 1)
> > +               die(_("only one of --exclude-existing, --exists or --verify can be given"));
> 
> When reviewing an earlier patch in this series, I forgot to mention
> that we can simplify the life of translators by using placeholders:
> 
>     die(_("options '%s', '%s' or '%s' cannot be used together"),
>         "--exclude-existing", "--exists", "--verify");
> 
> which ensures that they don't translate the literal option names, and
> makes it possible to reuse the translated message in multiple
> locations (since it doesn't mention hard-coded option names).

Done.

Thanks for your review, highly appreciated! I'll wait until tomorrow for
additional feedback and then send out v2.

Patrick

Patch

diff --git a/Documentation/git-show-ref.txt b/Documentation/git-show-ref.txt
index ab23e0b62e1..a7e9374bc2b 100644
--- a/Documentation/git-show-ref.txt
+++ b/Documentation/git-show-ref.txt
@@ -15,6 +15,7 @@  SYNOPSIS
 	     [-s | --hash[=<n>]] [--abbrev[=<n>]]
 	     [--] [<ref>...]
 'git show-ref' --exclude-existing[=<pattern>]
+'git show-ref' --exists <ref>
 
 DESCRIPTION
 -----------
@@ -30,6 +31,10 @@  The `--exclude-existing` form is a filter that does the inverse. It reads
 refs from stdin, one ref per line, and shows those that don't exist in
 the local repository.
 
+The `--exists` form can be used to check for the existence of a single
+reference. This form does not verify whether the reference resolves to an
+actual object.
+
 Use of this utility is encouraged in favor of directly accessing files under
 the `.git` directory.
 
@@ -65,6 +70,12 @@  OPTIONS
 	Aside from returning an error code of 1, it will also print an error
 	message if `--quiet` was not specified.
 
+--exists::
+
+	Check whether the given reference exists. Returns an error code of 0 if
+	it does, 2 if it is missing, and 128 in case looking up the reference
+	failed with an error other than the reference being missing.
+
 --abbrev[=<n>]::
 
 	Abbreviate the object name.  When using `--hash`, you do
diff --git a/builtin/show-ref.c b/builtin/show-ref.c
index d0a32d07404..617e754bbed 100644
--- a/builtin/show-ref.c
+++ b/builtin/show-ref.c
@@ -2,7 +2,7 @@ 
 #include "config.h"
 #include "gettext.h"
 #include "hex.h"
-#include "refs.h"
+#include "refs/refs-internal.h"
 #include "object-name.h"
 #include "object-store-ll.h"
 #include "object.h"
@@ -18,6 +18,7 @@  static const char * const show_ref_usage[] = {
 	   "             [-s | --hash[=<n>]] [--abbrev[=<n>]]\n"
 	   "             [--] [<ref>...]"),
 	N_("git show-ref --exclude-existing[=<pattern>]"),
+	N_("git show-ref --exists <ref>"),
 	NULL
 };
 
@@ -214,6 +215,41 @@  static int cmd_show_ref__patterns(const struct patterns_options *opts,
 	return 0;
 }
 
+static int cmd_show_ref__exists(const char **refs)
+{
+	struct strbuf unused_referent = STRBUF_INIT;
+	struct object_id unused_oid;
+	unsigned int unused_type;
+	int failure_errno = 0;
+	const char *ref;
+	int ret = 1;
+
+	if (!refs || !*refs)
+		die("--exists requires a reference");
+	ref = *refs++;
+	if (*refs)
+		die("--exists requires exactly one reference");
+
+	if (refs_read_raw_ref(get_main_ref_store(the_repository), ref,
+			      &unused_oid, &unused_referent, &unused_type,
+			      &failure_errno)) {
+		if (failure_errno == ENOENT) {
+			error(_("reference does not exist"));
+			ret = 2;
+		} else {
+			error(_("failed to look up reference: %s"), strerror(failure_errno));
+		}
+
+		goto out;
+	}
+
+	ret = 0;
+
+out:
+	strbuf_release(&unused_referent);
+	return ret;
+}
+
 static int hash_callback(const struct option *opt, const char *arg, int unset)
 {
 	struct show_one_options *opts = opt->value;
@@ -243,10 +279,11 @@  int cmd_show_ref(int argc, const char **argv, const char *prefix)
 	struct exclude_existing_options exclude_existing_opts = {0};
 	struct patterns_options patterns_opts = {0};
 	struct show_one_options show_one_opts = {0};
-	int verify = 0;
+	int verify = 0, exists = 0;
 	const struct option show_ref_options[] = {
 		OPT_BOOL(0, "tags", &patterns_opts.tags_only, N_("only show tags (can be combined with heads)")),
 		OPT_BOOL(0, "heads", &patterns_opts.heads_only, N_("only show heads (can be combined with tags)")),
+		OPT_BOOL(0, "exists", &exists, N_("check for reference existence without resolving")),
 		OPT_BOOL(0, "verify", &verify, N_("stricter reference checking, "
 			    "requires exact ref path")),
 		OPT_HIDDEN_BOOL('h', NULL, &patterns_opts.show_head,
@@ -272,13 +309,15 @@  int cmd_show_ref(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, show_ref_options,
 			     show_ref_usage, 0);
 
-	if ((!!exclude_existing_opts.enabled + !!verify) > 1)
-		die(_("only one of --exclude-existing or --verify can be given"));
+	if ((!!exclude_existing_opts.enabled + !!verify + !!exists) > 1)
+		die(_("only one of --exclude-existing, --exists or --verify can be given"));
 
 	if (exclude_existing_opts.enabled)
 		return cmd_show_ref__exclude_existing(&exclude_existing_opts);
 	else if (verify)
 		return cmd_show_ref__verify(&show_one_opts, argv);
+	else if (exists)
+		return cmd_show_ref__exists(argv);
 	else
 		return cmd_show_ref__patterns(&patterns_opts, &show_one_opts, argv);
 }
diff --git a/t/t1403-show-ref.sh b/t/t1403-show-ref.sh
index 3a312c8b27c..17eba350ce5 100755
--- a/t/t1403-show-ref.sh
+++ b/t/t1403-show-ref.sh
@@ -197,8 +197,73 @@  test_expect_success 'show-ref --verify with dangling ref' '
 '
 
 test_expect_success 'show-ref sub-modes are mutually exclusive' '
+	cat >expect <<-EOF &&
+	fatal: only one of --exclude-existing, --exists or --verify can be given
+	EOF
+
 	test_must_fail git show-ref --verify --exclude-existing 2>err &&
-	grep "only one of --exclude-existing or --verify can be given" err
+	test_cmp expect err &&
+
+	test_must_fail git show-ref --verify --exists 2>err &&
+	test_cmp expect err &&
+
+	test_must_fail git show-ref --exclude-existing --exists 2>err &&
+	test_cmp expect err
+'
+
+test_expect_success '--exists with existing reference' '
+	git show-ref --exists refs/heads/$GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
+'
+
+test_expect_success '--exists with missing reference' '
+	test_expect_code 2 git show-ref --exists refs/heads/does-not-exist
+'
+
+test_expect_success '--exists does not use DWIM' '
+	test_expect_code 2 git show-ref --exists $GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME 2>err &&
+	grep "reference does not exist" err
+'
+
+test_expect_success '--exists with HEAD' '
+	git show-ref --exists HEAD
+'
+
+test_expect_success '--exists with bad reference name' '
+	test_when_finished "git update-ref -d refs/heads/bad...name" &&
+	new_oid=$(git rev-parse HEAD) &&
+	test-tool ref-store main update-ref msg refs/heads/bad...name $new_oid $ZERO_OID REF_SKIP_REFNAME_VERIFICATION &&
+	git show-ref --exists refs/heads/bad...name
+'
+
+test_expect_success '--exists with arbitrary symref' '
+	test_when_finished "git symbolic-ref -d refs/symref" &&
+	git symbolic-ref refs/symref refs/heads/$GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME &&
+	git show-ref --exists refs/symref
+'
+
+test_expect_success '--exists with dangling symref' '
+	test_when_finished "git symbolic-ref -d refs/heads/dangling" &&
+	git symbolic-ref refs/heads/dangling refs/heads/does-not-exist &&
+	git show-ref --exists refs/heads/dangling
+'
+
+test_expect_success '--exists with nonexistent object ID' '
+	test-tool ref-store main update-ref msg refs/heads/missing-oid $(test_oid 001) $ZERO_OID REF_SKIP_OID_VERIFICATION &&
+	git show-ref --exists refs/heads/missing-oid
+'
+
+test_expect_success '--exists with non-commit object' '
+	tree_oid=$(git rev-parse HEAD^{tree}) &&
+	test-tool ref-store main update-ref msg refs/heads/tree ${tree_oid} $ZERO_OID REF_SKIP_OID_VERIFICATION &&
+	git show-ref --exists refs/heads/tree
+'
+
+test_expect_success '--exists with directory fails with generic error' '
+	cat >expect <<-EOF &&
+	error: failed to look up reference: Is a directory
+	EOF
+	test_expect_code 1 git show-ref --exists refs/heads 2>err &&
+	test_cmp expect err
 '
 
 test_done