mbox series

[v5,0/2] attr: add attr.tree config

Message ID pull.1577.v5.git.git.1697218770.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series attr: add attr.tree config | expand

Message

Philippe Blain via GitGitGadget Oct. 13, 2023, 5:39 p.m. UTC
44451a2e5e (attr: teach "--attr-source=" global option to "git", 2023-05-06)
provided the ability to pass in a treeish as the attr source. When a
revision does not resolve to a valid tree is passed, Git will die. At
GitLab, we server repositories as bare repos and would like to always read
attributes from the default branch, so we'd like to pass in HEAD as the
treeish to read gitattributes from on every command. In this context we
would not want Git to die if HEAD is unborn, like in the case of empty
repositories.

Instead of modifying the default behavior of --attr-source, create a new
config attr.tree with which an admin can configure a ref for all commands to
read gitattributes from. Also make the default tree to read from HEAD on
bare repositories.

Changes since v4:

 * removed superfluous test

Changes since v3:

 * clarified attr logic around ignoring errors if source set by attr.tree is
   invalid
 * refactored tests by using helpers
 * modified test to check for precedence between --attr-source, attr.tree,
   GIT_ATTR_SOURCE

Changes since v2:

 * relax the restrictions around attr.tree so that if it does not resolve to
   a valid treeish, ignore it.
 * add a commit to default to HEAD in bare repositories
 * remove commit that adds attr.allowInvalidSource

Changes since v1:

 * Added a commit to add attr.tree config

John Cai (2):
  attr: read attributes from HEAD when bare repo
  attr: add attr.tree for setting the treeish to read attributes from

 Documentation/config.txt      |  2 +
 Documentation/config/attr.txt |  7 ++++
 attr.c                        | 19 ++++++++-
 attr.h                        |  2 +
 config.c                      | 16 ++++++++
 t/t0003-attributes.sh         | 72 +++++++++++++++++++++++++++++++++++
 t/t5001-archive-attr.sh       |  2 +-
 7 files changed, 118 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/config/attr.txt


base-commit: 1fc548b2d6a3596f3e1c1f8b1930d8dbd1e30bf3
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1577%2Fjohn-cai%2Fjc%2Fconfig-attr-invalid-source-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1577/john-cai/jc/config-attr-invalid-source-v5
Pull-Request: https://github.com/git/git/pull/1577

Range-diff vs v4:

 1:  eaa27c47810 = 1:  eaa27c47810 attr: read attributes from HEAD when bare repo
 2:  749d8a8082e ! 2:  df4b3f53309 attr: add attr.tree for setting the treeish to read attributes from
     @@ t/t0003-attributes.sh: test_expect_success 'bare repository: check that .gitattr
      +	)
      +'
      +
     -+test_expect_success 'attr.tree points to non-existing ref' '
     -+	test_when_finished rm -rf empty &&
     -+	git init empty &&
     -+	(
     -+		cd empty &&
     -+		echo "f/path: test: unspecified" >expect &&
     -+		git -c attr.tree=refs/does/not/exist check-attr test -- f/path >actual 2>err &&
     -+		test_must_be_empty err &&
     -+		test_cmp expect actual
     -+	)
     -+'
     -+
      +test_expect_success 'bad attr source defaults to reading .gitattributes file' '
      +	test_when_finished rm -rf empty &&
      +	git init empty &&

Comments

Junio C Hamano Oct. 13, 2023, 6:52 p.m. UTC | #1
"John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Changes since v4:
>
>  * removed superfluous test

An alternative would have been to point with the ref some non-tree
object like a blob, but as the outcome should be the same as missing
case (from the code --- which is not exactly kosher), it should be
OK.

	if (repo_get_oid_treeish(the_repository,
				 default_attr_source_tree_object_name,
				 attr_source) && !ignore_bad_attr_tree)
		die(_("bad --attr-source or GIT_ATTR_SOURCE"));

OOPS!  Sorry for not noticing earlier, but repo_get_oid_treeish()
does *NOT* error out when the discovered object is not a treeish, as
the suggested object type is merely supplied for disambiguation
purposes (e.g., with objects 012345 that is a tree and 012346 that
is a blob, you can still ask for treeish "01234" but if you ask for
an object "01234" it will fail).

So, the alternative test would have caught this bug, no?  Instead of
silently treating the non-treeish as an empty tree, we would have
died much later when the object supposedly a tree-ish turns out to
be a blob, or something?
Junio C Hamano Oct. 13, 2023, 8:31 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> 	if (repo_get_oid_treeish(the_repository,
> 				 default_attr_source_tree_object_name,
> 				 attr_source) && !ignore_bad_attr_tree)
> 		die(_("bad --attr-source or GIT_ATTR_SOURCE"));
>
> OOPS!  Sorry for not noticing earlier, but repo_get_oid_treeish()
> does *NOT* error out when the discovered object is not a treeish, as
> the suggested object type is merely supplied for disambiguation
> purposes (e.g., with objects 012345 that is a tree and 012346 that
> is a blob, you can still ask for treeish "01234" but if you ask for
> an object "01234" it will fail).
>
> So, the alternative test would have caught this bug, no?  Instead of
> silently treating the non-treeish as an empty tree, we would have
> died much later when the object supposedly a tree-ish turns out to
> be a blob, or something?

There indeed is a bug, but not really.  If we add this test:

test_expect_success 'attr.tree that points at a non-treeish' '
	test_when_finished rm -rf empty &&
	git init empty &&
	(
		cd empty &&
		echo "f/path: test: unspecified" >expect &&
		H=$(git hash-object -t blob --stdin -w </dev/null) &&
		git -c attr.tree=$H check-attr test -- f/path >actual 2>err &&
		test_must_be_empty err &&
		test_cmp expect actual
	)
'

repo_get_oid_treeish() returns a blob object name and we end up
storing a blob object name in "attr_source" static variable of
default_attr_source() function.

Later this is fed to read_attr() by bootstrap_attr_stack() and then
to read_attr_from_blob() that uses it to call get_tree_entry(),
which fails for any path because it is a blob.  We do not give any
errors or error messages during the whole process.

So in a sense, for !!ignore_bad_attr_tree case, the code ends up
doing the right thing.  But if !ignore_bad_attr_tree is true, i.e.,
a blob object name is given via --attr-source or GIT_ATTR_SOURCE,
then the bug will be uncovered.

 t/t0003-attributes.sh | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git c/t/t0003-attributes.sh w/t/t0003-attributes.sh
index ecf43ab545..0f02f22171 100755
--- c/t/t0003-attributes.sh
+++ w/t/t0003-attributes.sh
@@ -394,6 +394,18 @@ test_expect_success 'bare repo defaults to reading .gitattributes from HEAD' '
 	test_cmp expect actual
 '
 
+test_expect_success '--attr-source that points at a non-treeish' '
+	test_when_finished rm -rf empty &&
+	git init empty &&
+	(
+		cd empty &&
+		echo "$bad_attr_source_err" >expect_err &&
+		H=$(git hash-object -t blob --stdin -w </dev/null) &&
+		test_must_fail git --attr-source=$H check-attr test -- f/path 2>err &&
+		test_cmp expect_err err
+	)
+'
+
 test_expect_success 'precedence of --attr-source, GIT_ATTR_SOURCE, then attr.tree' '
 	test_when_finished rm -rf empty &&
 	git init empty &&
Junio C Hamano Oct. 13, 2023, 8:47 p.m. UTC | #3
Junio C Hamano <gitster@pobox.com> writes:

> So in a sense, for !!ignore_bad_attr_tree case, the code ends up
> doing the right thing.  But if !ignore_bad_attr_tree is true, i.e.,
> a blob object name is given via --attr-source or GIT_ATTR_SOURCE,
> then the bug will be uncovered.

Having said all that, I suspect that this problem is not new and
certainly not caused by this topic.  We should have unconditionally
died when GIT_ATTR_SOURCE gave a blob object name, but pretended as
if an empty tree was given.  There may even be existing users who
now assume that is working as intended and depend on this bug.

So, let's leave it as a "possible bug" that we might want to fix in
the future, outside the scope of this series.

Thanks.


>  t/t0003-attributes.sh | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git c/t/t0003-attributes.sh w/t/t0003-attributes.sh
> index ecf43ab545..0f02f22171 100755
> --- c/t/t0003-attributes.sh
> +++ w/t/t0003-attributes.sh
> @@ -394,6 +394,18 @@ test_expect_success 'bare repo defaults to reading .gitattributes from HEAD' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success '--attr-source that points at a non-treeish' '
> +	test_when_finished rm -rf empty &&
> +	git init empty &&
> +	(
> +		cd empty &&
> +		echo "$bad_attr_source_err" >expect_err &&
> +		H=$(git hash-object -t blob --stdin -w </dev/null) &&
> +		test_must_fail git --attr-source=$H check-attr test -- f/path 2>err &&
> +		test_cmp expect_err err
> +	)
> +'
> +
>  test_expect_success 'precedence of --attr-source, GIT_ATTR_SOURCE, then attr.tree' '
>  	test_when_finished rm -rf empty &&
>  	git init empty &&
John Cai Oct. 19, 2023, 3:43 p.m. UTC | #4
Hi Junio,

On 13 Oct 2023, at 16:47, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> So in a sense, for !!ignore_bad_attr_tree case, the code ends up
>> doing the right thing.  But if !ignore_bad_attr_tree is true, i.e.,
>> a blob object name is given via --attr-source or GIT_ATTR_SOURCE,
>> then the bug will be uncovered.
>
> Having said all that, I suspect that this problem is not new and
> certainly not caused by this topic.  We should have unconditionally
> died when GIT_ATTR_SOURCE gave a blob object name, but pretended as
> if an empty tree was given.  There may even be existing users who
> now assume that is working as intended and depend on this bug.
>
> So, let's leave it as a "possible bug" that we might want to fix in
> the future, outside the scope of this series.
>
> Thanks.

That sounds good--let's fix in a separate series. Thanks for the careful review!

thanks
John

>
>
>>  t/t0003-attributes.sh | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git c/t/t0003-attributes.sh w/t/t0003-attributes.sh
>> index ecf43ab545..0f02f22171 100755
>> --- c/t/t0003-attributes.sh
>> +++ w/t/t0003-attributes.sh
>> @@ -394,6 +394,18 @@ test_expect_success 'bare repo defaults to reading .gitattributes from HEAD' '
>>  	test_cmp expect actual
>>  '
>>
>> +test_expect_success '--attr-source that points at a non-treeish' '
>> +	test_when_finished rm -rf empty &&
>> +	git init empty &&
>> +	(
>> +		cd empty &&
>> +		echo "$bad_attr_source_err" >expect_err &&
>> +		H=$(git hash-object -t blob --stdin -w </dev/null) &&
>> +		test_must_fail git --attr-source=$H check-attr test -- f/path 2>err &&
>> +		test_cmp expect_err err
>> +	)
>> +'
>> +
>>  test_expect_success 'precedence of --attr-source, GIT_ATTR_SOURCE, then attr.tree' '
>>  	test_when_finished rm -rf empty &&
>>  	git init empty &&