Message ID | 20181229054357.11716-1-stanhu@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] sha1-name.c: Fix handling of revisions that contain paths with brackets | expand |
Stan Hu <stanhu@gmail.com> writes: > - if (len < 4 || name[len-1] != '}') > + if (len < 4) > return -1; The original does not expect any string given after the ^{<type>} dereferencer, like :<path>, and that is why this function returns very early for anything when name[len-1] is not a closing brace. We do not do that anymore, because...? This gives me a nagging feeling that the patch is barking up a wrong tree. Consider what happens when a path that does not have any funny characters, e.g. "v1.4.0^{tree}:a/b/c", is given from the top level of the request chain (e.g. "rev-parse v1.4.0^{tree}:a/b/c")? The caller must be feeding this function "v1.4.0^{tree}" after finding the colon before the path "a/b/c" and setting len to point at the colon---otherwise we won't be checking for "}" at the end like this. When given "master^{tree}:a/b/{c}", wouldn't the caller be doing the same stripping to find the colon and calling this function with len pointing at the colon before the path (i.e. "master^{tree}")? To put it another way, it is OK if this patch wants to shift the responsibility of finding the colon that separates the tree-ish part and the path part from the caller to this function, but then I would expect changes to the caller, because now the caller does not have to find ":a/b/c" in "v1.4.0^{tree}:a/b/c" and set up the len before calling this function. Why can the resulting code after applying this patch be correct without such a matching change? > > - for (sp = name + len - 1; name <= sp; sp--) { > + for (sp = name; sp < name + len; sp++) { > int ch = *sp; > - if (ch == '{' && name < sp && sp[-1] == '^') > + if (ch == '^' && sp < name + len && sp[1] == '{') > break; > } We used to scan from the tail (as we expected that the caller gives us a (name, len) that ends with "^{<type>}". The updated code instead scans from the front, looking for "^{". I do not particularly mind the change of strategy, as long as it is correctly done, but I suspect the function will stay simpler if the callee is fixed instead. The only troublesome case is the REV^{/...} syntax. For example, HEAD^{/^Git 2.0}^{tree}:t/ would want to find the commit "HEAD^{/^Git 2.0}", peel it down to a tree object with "^{tree}" and then take its "t/" subtree. It used to be that the caller (get_oid_with_context_1() has a loop that finds a colon outside "{...}" and that finds ":t/" before calling get_oid_1()) was responsible to give us "HEAD^{/^Git 2.0}^{tree}" by stripping ":t/", and I presume that it is still happening, but the above loop would terminate upon seeing "^{" immediately after HEAD, without even realizing that it has ^{tree} after it. > - if (sp <= name) > + > + if (sp == name + len) > return -1; > - sp++; /* beginning of type name, or closing brace for empty */ > - if (starts_with(sp, "commit}")) > + sp += 2; /* beginning of type name, or closing brace for empty */ And this comment indicates that the code did not expect to see ^{/^Git 2.0} before ^{tree}. I do not quite follow. How can this patch be correct? Puzzled. $ git rev-parse "master^{/^Git 2.0}^{tree}" bc6ce29d1ec757d9d036532531a1046db4da0d96 $ ./git rev-parse "master^{/^Git 2.0}^{tree}" master^{/^Git 2.0}^{tree} fatal: ambiguous argument 'master^{/^Git 2.0}^{tree}': unknown revision or path not in the working tree.
diff --git a/sha1-name.c b/sha1-name.c index faa60f69e3..ab372a90be 100644 --- a/sha1-name.c +++ b/sha1-name.c @@ -989,7 +989,7 @@ static int peel_onion(const char *name, int len, struct object_id *oid, unsigned lookup_flags) { struct object_id outer; - const char *sp; + const char *sp, *end; unsigned int expected_type = 0; struct object *o; @@ -1001,33 +1001,40 @@ static int peel_onion(const char *name, int len, struct object_id *oid, * "ref^{commit}". "commit^{tree}" could be used to find the * top-level tree of the given commit. */ - if (len < 4 || name[len-1] != '}') + if (len < 4) return -1; - for (sp = name + len - 1; name <= sp; sp--) { + for (sp = name; sp < name + len; sp++) { int ch = *sp; - if (ch == '{' && name < sp && sp[-1] == '^') + if (ch == '^' && sp < name + len && sp[1] == '{') break; } - if (sp <= name) + + if (sp == name + len) return -1; - sp++; /* beginning of type name, or closing brace for empty */ - if (starts_with(sp, "commit}")) + sp += 2; /* beginning of type name, or closing brace for empty */ + + if (skip_prefix(sp, "commit}", &end)) expected_type = OBJ_COMMIT; - else if (starts_with(sp, "tag}")) + else if (skip_prefix(sp, "tag}", &end)) expected_type = OBJ_TAG; - else if (starts_with(sp, "tree}")) + else if (skip_prefix(sp, "tree}", &end)) expected_type = OBJ_TREE; - else if (starts_with(sp, "blob}")) + else if (skip_prefix(sp, "blob}", &end)) expected_type = OBJ_BLOB; - else if (starts_with(sp, "object}")) + else if (skip_prefix(sp, "object}", &end)) expected_type = OBJ_ANY; - else if (sp[0] == '}') + else if (sp[0] == '}') { expected_type = OBJ_NONE; - else if (sp[0] == '/') + end = sp + 1; + } else if (sp[0] == '/') { expected_type = OBJ_COMMIT; - else + end = name + len; + } else + return -1; + + if (end != name + len) return -1; lookup_flags &= ~GET_OID_DISAMBIGUATORS; diff --git a/t/t3104-ls-tree-braces.sh b/t/t3104-ls-tree-braces.sh new file mode 100755 index 0000000000..adeed5bb49 --- /dev/null +++ b/t/t3104-ls-tree-braces.sh @@ -0,0 +1,75 @@ +#!/bin/sh + +test_description='ls-tree with folders with brackets' + +. ./test-lib.sh + +test_expect_success 'setup' ' + mkdir -p "newdir/{{curly}}" && + mkdir -p "newdir/^{tree}" && + touch "newdir/{{curly}}/one" && + touch "newdir/^{tree}/two" && + git add "newdir/{{curly}}/one" && + git add "newdir/^{tree}/two" && + git commit -m "Test message: search me" +' + +test_expect_success 'ls-tree with curly brace folder' ' + cat >expect <<-EOF && + 100644 blob $EMPTY_BLOB one + EOF + git ls-tree -r "HEAD:newdir/{{curly}}" >actual && + test_cmp expect actual +' + +test_expect_success 'ls-tree with type restriction and curly brace folder' ' + cat >expect <<-EOF && + 100644 blob $EMPTY_BLOB one + EOF + git ls-tree "HEAD^{tree}:newdir/{{curly}}" >actual && + test_cmp expect actual +' + +test_expect_success 'ls-tree with type restriction and folder named ^{tree}' ' + cat >expect <<-EOF && + 100644 blob $EMPTY_BLOB two + EOF + git ls-tree "HEAD^{tree}:newdir/^{tree}" >actual && + test_cmp expect actual +' + +test_expect_success 'ls-tree with unknown type restriction' ' + (git ls-tree HEAD^{foobar} >actual || true) && + test_must_be_empty actual +' + +test_output () { + sed -e "s/ $OID_REGEX / X /" <actual >check + test_cmp expect check +} + +test_expect_success 'ls-tree with regex' ' + cat >expect <<-EOF && + 040000 tree X newdir + EOF + git ls-tree "HEAD^{/message}" >actual && + test_output +' + +test_expect_success 'ls-tree with regex with a colon' ' + cat >expect <<-EOF && + 040000 tree X newdir + EOF + git ls-tree "HEAD^{/message: search}" >actual && + test_output +' + +test_expect_success 'ls-tree with regex and curly brace folder' ' + cat >expect <<-EOF && + 100644 blob $EMPTY_BLOB one + EOF + git ls-tree "HEAD^{/message: search}:newdir/{{curly}}" >actual && + test_cmp expect actual +' + +test_done
Previously, calling ls-tree with a revision such as `master^{tree}:foo/{{path}}` would show the root tree instead of the correct tree pointed by foo/{{path}}. We improve this by ensuring peel_onion() consumes all "len" characters or none. Note that it's also possible to have directories named ^{tree} and other patterns that look like type restrictions. To handle those cases, scan from the beginning of the name instead of the end. Signed-off-by: Stan Hu <stanhu@gmail.com> --- sha1-name.c | 35 ++++++++++-------- t/t3104-ls-tree-braces.sh | 75 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 96 insertions(+), 14 deletions(-) create mode 100755 t/t3104-ls-tree-braces.sh