diff mbox series

[v2,1/2] object-name: fix resolution of object names containing curly braces

Message ID 13f68bebe90549ba19452f12abb6fea41c2517fb.1735949870.git.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series object-name: fix resolution of object names containing curly braces | expand

Commit Message

Elijah Newren Jan. 4, 2025, 12:17 a.m. UTC
From: Elijah Newren <newren@gmail.com>

Given a branch name of 'foo{bar', commands like

    git cat-file -p foo{bar:README.md

should succeed (assuming that branch had a README.md file, of course).
However, the change in cce91a2caef9 (Change 'master@noon' syntax to
'master@{noon}'., 2006-05-19) presumed that curly braces would always
come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md'
to entirely miss the ':' and assume there's no object being referenced.
In short, git would report:

    fatal: Not a valid object name foo{bar:README.md

Change the parsing to only make the assumption of paired curly braces
immediately after either a '@' or '^' character appears.

Add tests for this, as well as for a few other test cases that initial
versions of this patch broke:
  * 'foo@@{...}'
  * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}'

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Helped-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
---
 object-name.c       |  8 +++++---
 t/t1006-cat-file.sh | 31 ++++++++++++++++++++++++++++++-
 2 files changed, 35 insertions(+), 4 deletions(-)

Comments

Junio C Hamano Jan. 4, 2025, 5:26 p.m. UTC | #1
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  	for (cp = name, bracket_depth = 0; *cp; cp++) {
> -		if (*cp == '{')
> +		if (*(cp+1) == '{' && (*cp == '@' || *cp == '^')) {
> +			cp++;
>  			bracket_depth++;

Checking cp[1] before even knowing if cp[0] is the end of the string
(hence cp[1] is an out of bounds access) smells fishy.  If it were
something like ...

	if (cp[0] && strchr("@^", cp[0]) && cp[1] == '{')

... it may be a bit more palatable, perhaps?  At least writing it
this way we can easily scale when we find the third character we
need to special case, hopefully, but again, I do prefer if we can
find a solution that does not have such an intimate knowledge about
"@^", which I just failed to do here X-<.

Thanks.
Elijah Newren Jan. 4, 2025, 6:54 p.m. UTC | #2
On Sat, Jan 4, 2025 at 9:26 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >       for (cp = name, bracket_depth = 0; *cp; cp++) {
> > -             if (*cp == '{')
> > +             if (*(cp+1) == '{' && (*cp == '@' || *cp == '^')) {
> > +                     cp++;
> >                       bracket_depth++;
>
> Checking cp[1] before even knowing if cp[0] is the end of the string
> (hence cp[1] is an out of bounds access) smells fishy.

We checked *cp in the loop already, so we know cp[0] != '\0'.
Combined with the fact that this is a NUL-terminated string, we
therefore also know that cp[1] is not an out-of-bounds access.

> If it were
> something like ...
>
>         if (cp[0] && strchr("@^", cp[0]) && cp[1] == '{')

Since we know cp[0] != '\0' already, couldn't this be simplified to

    if (strchr("@^", *cp) && cp[1] == '{')

?

I do like this form better though, yes.

> ... it may be a bit more palatable, perhaps?  At least writing it
> this way we can easily scale when we find the third character we
> need to special case, hopefully, but again, I do prefer if we can
> find a solution that does not have such an intimate knowledge about
> "@^", which I just failed to do here X-<.

Yeah, I have failed to come up with an alternative as well.  If I and
others can't come up with something better in a few days, I'll
resubmit with the above change and a comment in the commit message
that we'd prefer something better but were unable to come up with
anything.
Junio C Hamano Jan. 5, 2025, 4:14 p.m. UTC | #3
"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Given a branch name of 'foo{bar', commands like
>
>     git cat-file -p foo{bar:README.md
>
> should succeed (assuming that branch had a README.md file, of course).
> However, the change in cce91a2caef9 (Change 'master@noon' syntax to
> 'master@{noon}'., 2006-05-19) presumed that curly braces would always
> come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md'
> to entirely miss the ':' and assume there's no object being referenced.
> In short, git would report:
>
>     fatal: Not a valid object name foo{bar:README.md

Naïvely, it seems that a solution is to parse from left to right,
i.e., (1) notice there is a colon, (2) see if everything before that
colon resolves to a treeish, and (3) see if everything after it is a
path that appears in the treeish.

 - When we are given foo@{some:thing}, if we did that, we realize
   that "foo@{some" is not a valid tree-ish object name (since "@{"
   cannot appear in a refname) and then can backtrack by realizing
   "foo" is a ref, and @{...} could be a reflog reference (most
   likely a way to spell some sort of timestamp), and try that.

 - Similarly, for foo:path-gaffed, we would notice "foo" is a valid
   tree-ish object name, and if path-gaffed is a path in it, we'd be
   happy.  Or foo may not be a tree-ish, or path-gaffed may not
   exist in that tree-ish.  In which case, we can backtrack and see
   foo:path-g is an allowed prefix in a desribe name.

Now in the above description, I have assumed that an alternative
interpretation kicks in only as a fallback when we backtrack, but
we could make sure we try all possibilities and notice ambiguity if
we wanted to.

In any case, such an updated structure of the parsing code paths
(whether alternative interpretations are treated as fallbacks or
equally plausible candidates subject to disambiguation) would be
a vast departure from what we currently have, so a targeted "fix"
like these two patches attempt would be more appropriate as an
initial approach, I think.

Thanks, will queue, but probably we'd look at in any seriousness
after the 2.48 final gets tagged.
diff mbox series

Patch

diff --git a/object-name.c b/object-name.c
index a563635a8cb..e54ef1f621e 100644
--- a/object-name.c
+++ b/object-name.c
@@ -2051,12 +2051,14 @@  static enum get_oid_result get_oid_with_context_1(struct repository *repo,
 		return -1;
 	}
 	for (cp = name, bracket_depth = 0; *cp; cp++) {
-		if (*cp == '{')
+		if (*(cp+1) == '{' && (*cp == '@' || *cp == '^')) {
+			cp++;
 			bracket_depth++;
-		else if (bracket_depth && *cp == '}')
+		} else if (bracket_depth && *cp == '}') {
 			bracket_depth--;
-		else if (!bracket_depth && *cp == ':')
+		} else if (!bracket_depth && *cp == ':') {
 			break;
+		}
 	}
 	if (*cp == ':') {
 		struct object_id tree_oid;
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ff9bf213aa2..398865d6ebe 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -240,7 +240,8 @@  test_expect_success "setup" '
 	git config extensions.objectformat $test_hash_algo &&
 	git config extensions.compatobjectformat $test_compat_hash_algo &&
 	echo_without_newline "$hello_content" > hello &&
-	git update-index --add hello
+	git update-index --add hello &&
+	git commit -m "add hello file"
 '
 
 run_blob_tests () {
@@ -602,6 +603,34 @@  test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' '
 	test_cmp expect actual
 '
 
+test_expect_success 'setup with curly braches in input' '
+	git branch "foo{bar" HEAD &&
+	git branch "foo@" HEAD
+'
+
+test_expect_success 'object reference with curly brace' '
+	git cat-file -p "foo{bar:hello" >actual &&
+	git cat-file -p HEAD:hello >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'object reference with at-sign' '
+	git cat-file -p "foo@@{0}:hello" >actual &&
+	git cat-file -p HEAD:hello >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'setup with commit with colon' '
+	git commit-tree -m "testing: just a bunch of junk" HEAD^{tree} >out &&
+	git branch other $(cat out)
+'
+
+test_expect_success 'object reference via commit text search' '
+	git cat-file -p "other^{/testing:}:hello" >actual &&
+	git cat-file -p HEAD:hello >expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'setup blobs which are likely to delta' '
 	test-tool genrandom foo 10240 >foo &&
 	{ cat foo && echo plus; } >foo-plus &&