[RFC] for-each-ref: respect GIT_REF_PARANOIA setting

Message ID	7283f826198aaec94c847f0b26e228ace9a38433.1647880211.git.me@ttaylorr.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Date: Mon, 21 Mar 2022 12:30:18 -0400 From: Taylor Blau <me@ttaylorr.com> To: git@vger.kernel.org Subject: [RFC PATCH] for-each-ref: respect GIT_REF_PARANOIA setting Message-ID: <7283f826198aaec94c847f0b26e228ace9a38433.1647880211.git.me@ttaylorr.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Precedence: bulk
Series	[RFC] for-each-ref: respect GIT_REF_PARANOIA setting \| expand [RFC] for-each-ref: respect GIT_REF_PARANOIA setting

Message ID

7283f826198aaec94c847f0b26e228ace9a38433.1647880211.git.me@ttaylorr.com (mailing list archive)

State

New, archived

Headers

Date: Mon, 21 Mar 2022 12:30:18 -0400
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Subject: [RFC PATCH] for-each-ref: respect GIT_REF_PARANOIA setting
Message-ID: 
 <7283f826198aaec94c847f0b26e228ace9a38433.1647880211.git.me@ttaylorr.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Precedence: bulk

Series

[RFC] for-each-ref: respect GIT_REF_PARANOIA setting | expand

Commit Message

Taylor Blau March 21, 2022, 4:30 p.m. UTC

When setting GIT_REF_PARANOIA=1 (which became the default in 968f12fdac
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24)), `git
for-each-ref` will display a warning when it encounters a broken ref,
but does not terminate the program.

This can result in somewhat surprising behavior like:

    $ echo bogus >.git/refs/heads/bogus
    $ GIT_REF_PARANOIA=1 git for-each-ref; echo "==> $?"
    warning: ignoring broken ref refs/heads/bogus
    ==> 0

This seems to be the case even before the introduction of the ref-filter
code via 7ebc8cbedd (Merge branch 'kn/for-each-ref', 2015-08-03).
Looking at 8afc493d11 (for-each-ref: report broken references correctly,
2015-06-02) when this was last addressed, the fix at the time was to
report any broken references, but this warning did not affect the
command's success.

It seems that `git for-each-ref` should exit non-zero in the case of a
broken reference when GIT_REF_PARANOIA is set to 1. This patch does
that, but there are a couple of open questions (hence its status as an
RFC):

  - First, there are a handful of other `filter_refs()` calls throughout
    the builtin tree that lack similar error handling. I suspect that
    these will need similar treatment, but I haven't looked at them
    deeply yet.

  - More pressing, though, is that there is some test fallout as a
    result of this change. Namely, t6301 expects that `git for-each-ref`
    should list out all of the non-broken references to stdout _even
    when broken references exist_.

    This patch changes that behavior and causes us to exit immediately,
    without printing out any of the non-broken references (since we are
    still building up the list of references to sort, and thus haven't
    printed anything out by the time we're in the ref_filter_handler
    callback).

    The test fallout can be seen in the changes to t6301, namely that we
    expect `for-each-ref` to fail in certain cases where it didn't
    before, and that in those cases we no longer guarantee the contents
    of stdout.

The second point gives me serious pause about whether or not this change
is the right one. So I'm curious if or how we should handle this case.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/for-each-ref.c         |  9 ++++++---
 ref-filter.c                   |  2 +-
 t/t6301-for-each-ref-errors.sh | 10 ++++------
 3 files changed, 11 insertions(+), 10 deletions(-)

Comments

Ævar Arnfjörð Bjarmason March 21, 2022, 5:29 p.m. UTC | #1

On Mon, Mar 21 2022, Taylor Blau wrote:

> When setting GIT_REF_PARANOIA=1 (which became the default in 968f12fdac
> (refs: turn on GIT_REF_PARANOIA by default, 2021-09-24)), `git
> for-each-ref` will display a warning when it encounters a broken ref,
> but does not terminate the program.
>
> This can result in somewhat surprising behavior like:
>
>     $ echo bogus >.git/refs/heads/bogus
>     $ GIT_REF_PARANOIA=1 git for-each-ref; echo "==> $?"
>     warning: ignoring broken ref refs/heads/bogus
>     ==> 0
>
> This seems to be the case even before the introduction of the ref-filter
> code via 7ebc8cbedd (Merge branch 'kn/for-each-ref', 2015-08-03).
> Looking at 8afc493d11 (for-each-ref: report broken references correctly,
> 2015-06-02) when this was last addressed, the fix at the time was to
> report any broken references, but this warning did not affect the
> command's success.
>
> It seems that `git for-each-ref` should exit non-zero in the case of a
> broken reference when GIT_REF_PARANOIA is set to 1. This patch does
> that, but there are a couple of open questions (hence its status as an
> RFC):

I started playing with this fix-up:
	
	diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
	index e1937b7e53e..05b7c9b2a69 100644
	--- a/builtin/for-each-ref.c
	+++ b/builtin/for-each-ref.c
	@@ -78,8 +78,6 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
	 	filter.name_patterns = argv;
	 	filter.match_as_path = 1;
	 	ret = filter_refs(&array, &filter, FILTER_REFS_ALL);
	-	if (ret)
	-		goto cleanup;
	 	ref_array_sort(sorting, &array);
	 
	 	if (!maxcount || array.nr < maxcount)
	@@ -93,7 +91,6 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
	 		putchar('\n');
	 	}
	 
	-cleanup:
	 	strbuf_release(&err);
	 	strbuf_release(&output);
	 	ref_array_clear(&array);
	diff --git a/t/t6301-for-each-ref-errors.sh b/t/t6301-for-each-ref-errors.sh
	index 2a5895c124a..8a93685bb3e 100755
	--- a/t/t6301-for-each-ref-errors.sh
	+++ b/t/t6301-for-each-ref-errors.sh
	@@ -19,7 +19,8 @@ test_expect_success 'Broken refs are reported correctly' '
	 	: >.git/$r &&
	 	test_when_finished "rm -f .git/$r" &&
	 	echo "warning: ignoring broken ref $r" >broken-err &&
	-	test_must_fail git for-each-ref 2>err &&
	+	test_must_fail git for-each-ref >out 2>err &&
	+	test_must_be_empty out &&
	 	test_cmp broken-err err
	 '
	 
	@@ -31,7 +32,11 @@ test_expect_success 'NULL_SHA1 refs are reported correctly' '
	 	test_must_fail git for-each-ref 2>err &&
	 	test_cmp zeros-err err &&
	 	test_must_fail git for-each-ref --format="%(objectname) %(refname)" \
	-		2>brief-err &&
	+		>actual 2>brief-err &&
	+	cat >expect <<-EOF &&
	+	$(git rev-parse HEAD) $(git rev-parse --symbolic-full-name HEAD)
	+	EOF
	+	test_cmp expect actual &&
	 	test_cmp zeros-err brief-err
	 '

It seems to me you'd want at least the part of it where we retain a
test_cmp or similar for the "out", rather than removing it completely.

>   - First, there are a handful of other `filter_refs()` calls throughout
>     the builtin tree that lack similar error handling. I suspect that
>     these will need similar treatment, but I haven't looked at them
>     deeply yet.
>
>   - More pressing, though, is that there is some test fallout as a
>     result of this change. Namely, t6301 expects that `git for-each-ref`
>     should list out all of the non-broken references to stdout _even
>     when broken references exist_.
>
>     This patch changes that behavior and causes us to exit immediately,
>     without printing out any of the non-broken references (since we are
>     still building up the list of references to sort, and thus haven't
>     printed anything out by the time we're in the ref_filter_handler
>     callback).
>
>     The test fallout can be seen in the changes to t6301, namely that we
>     expect `for-each-ref` to fail in certain cases where it didn't
>     before, and that in those cases we no longer guarantee the contents
>     of stdout.
>
> The second point gives me serious pause about whether or not this change
> is the right one. So I'm curious if or how we should handle this case.

I think the right thing to do is to add a filter_refs() where we add
FILTER_REFS_WANT_REF_ISBROKEN and maybe FILTER_REFS_WANT_REF_BAD_NAME.

Then have in the for-each-ref.c loop check the "flags" on the ref item
for REF_ISBROKEN etc, and either have format_ref_array_item() format it,
or skip that and issue a warning() there manually.

That way API users can opt-in and detect these, but still be able to
print the rest, and it allows to make the more narrow change of amending
the exit code.

I'd think builtin/{tag,branch}.c would want similar treatment, ditto
fetch-pack.c.

But I really don't see why we'd want to abort the iteration early just
because we see one broken ref, that's after all per-ref issue, and we
don't need to abort the walk entirely to issue a warning() or to change
our exit code.

So that part of the patch really seems like it's in the wrong place to
me, and that we should always have for-each-ref try as hard as it can to
emit the output it can emit, even if we change the exit code due to a
broken ref.

Junio C Hamano March 21, 2022, 8:33 p.m. UTC | #2

Taylor Blau <me@ttaylorr.com> writes:

> +	ret = filter_refs(&array, &filter, FILTER_REFS_ALL);
> +	if (ret)
> +		goto cleanup;
>  	ref_array_sort(sorting, &array);

Fixing the "regression" would be a simple matter of removing the
early return from here, and instead show what we have collected?

>  	if (!maxcount || array.nr < maxcount)
> @@ -91,11 +93,12 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>  		putchar('\n');
>  	}
>  
> +cleanup:
>  	strbuf_release(&err);
>  	strbuf_release(&output);
>  	ref_array_clear(&array);
>  	free_commit_list(filter.with_commit);
>  	free_commit_list(filter.no_commit);
>  	ref_sorting_release(sorting);
> -	return 0;
> +	return ret;
>  }
> diff --git a/ref-filter.c b/ref-filter.c
> index 7838bd22b8..f9667f6ca4 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2249,7 +2249,7 @@ static int ref_filter_handler(const char *refname, const struct object_id *oid,
>  
>  	if (flag & REF_ISBROKEN) {
>  		warning(_("ignoring broken ref %s"), refname);
> -		return 0;
> +		return 1;
>  	}

Ah, no, not really.  This causes us to stop iterating prematurely.

If we are iterating because we want to find any breakage, such an
early stop in iteration makes good sense, but most of the time, we
are not, and it is questionable if such an early return makes much
sense.

I suspect that a handler may need to keep going, while recording a
bit for each ref it collects.  ref_array_item may or may not have (I
do not know offhand) already a bit in its flag word to signal a broken
ref that we can carry this information out to the callers?

diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 6f62f40d12..e1937b7e53 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -19,7 +19,7 @@  int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	int i;
 	struct ref_sorting *sorting;
 	struct string_list sorting_options = STRING_LIST_INIT_DUP;
-	int maxcount = 0, icase = 0;
+	int maxcount = 0, icase = 0, ret = 0;
 	struct ref_array array;
 	struct ref_filter filter;
 	struct ref_format format = REF_FORMAT_INIT;
@@ -77,7 +77,9 @@  int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 
 	filter.name_patterns = argv;
 	filter.match_as_path = 1;
-	filter_refs(&array, &filter, FILTER_REFS_ALL);
+	ret = filter_refs(&array, &filter, FILTER_REFS_ALL);
+	if (ret)
+		goto cleanup;
 	ref_array_sort(sorting, &array);
 
 	if (!maxcount || array.nr < maxcount)
@@ -91,11 +93,12 @@  int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 		putchar('\n');
 	}
 
+cleanup:
 	strbuf_release(&err);
 	strbuf_release(&output);
 	ref_array_clear(&array);
 	free_commit_list(filter.with_commit);
 	free_commit_list(filter.no_commit);
 	ref_sorting_release(sorting);
-	return 0;
+	return ret;
 }
diff --git a/ref-filter.c b/ref-filter.c
index 7838bd22b8..f9667f6ca4 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2249,7 +2249,7 @@  static int ref_filter_handler(const char *refname, const struct object_id *oid,
 
 	if (flag & REF_ISBROKEN) {
 		warning(_("ignoring broken ref %s"), refname);
-		return 0;
+		return 1;
 	}
 
 	/* Obtain the current ref kind from filter_ref_kind() and ignore unwanted refs. */
diff --git a/t/t6301-for-each-ref-errors.sh b/t/t6301-for-each-ref-errors.sh
index 40edf9dab5..2a5895c124 100755
--- a/t/t6301-for-each-ref-errors.sh
+++ b/t/t6301-for-each-ref-errors.sh
@@ -19,8 +19,7 @@  test_expect_success 'Broken refs are reported correctly' '
 	: >.git/$r &&
 	test_when_finished "rm -f .git/$r" &&
 	echo "warning: ignoring broken ref $r" >broken-err &&
-	git for-each-ref >out 2>err &&
-	test_cmp full-list out &&
+	test_must_fail git for-each-ref 2>err &&
 	test_cmp broken-err err
 '
 
@@ -29,11 +28,10 @@  test_expect_success 'NULL_SHA1 refs are reported correctly' '
 	echo $ZEROS >.git/$r &&
 	test_when_finished "rm -f .git/$r" &&
 	echo "warning: ignoring broken ref $r" >zeros-err &&
-	git for-each-ref >out 2>err &&
-	test_cmp full-list out &&
+	test_must_fail git for-each-ref 2>err &&
 	test_cmp zeros-err err &&
-	git for-each-ref --format="%(objectname) %(refname)" >brief-out 2>brief-err &&
-	test_cmp brief-list brief-out &&
+	test_must_fail git for-each-ref --format="%(objectname) %(refname)" \
+		2>brief-err &&
 	test_cmp zeros-err brief-err
 '

[RFC] for-each-ref: respect GIT_REF_PARANOIA setting

Commit Message

Comments

Patch