[v2,2/2] git_path(): handle `.lock` files correctly
diff mbox series

Message ID 93dba5a3a38d75ba79329383a1d50419a1990c8d.1571350077.git.gitgitgadget@gmail.com
State New
Headers show
Series
  • Handle git_path() with lock files correctly in worktrees
Related show

Commit Message

Ben Keene via GitGitGadget Oct. 17, 2019, 10:07 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

Ever since worktrees were introduced, the `git_path()` function _really_
needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
specific to the worktree). However, the wrong path is returned for
`logs/HEAD.lock`.

This does not matter as long as the Git executable is doing the asking,
as the path for that `index.lock` file is constructed from
`git_path("index")` by appending the `.lock` suffix.

However, Git GUI just learned to use `--git-path` instead of appending
relative paths to what `git rev-parse --git-dir` returns (and as a
consequence not only using the correct hooks directory, but also using
the correct paths in worktrees other than the main one). While it does
not seem as if Git GUI in particular is asking for `logs/HEAD.lock`,
let's be safe rather than sorry.

Side note: Git GUI _does_ ask for `index.lock`, but that is already
resolved correctly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 path.c               |  4 ++--
 t/t1500-rev-parse.sh | 15 +++++++++++++++
 2 files changed, 17 insertions(+), 2 deletions(-)

Comments

Junio C Hamano Oct. 18, 2019, 1:23 a.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Ever since worktrees were introduced, the `git_path()` function _really_
> needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
> specific to the worktree). However, the wrong path is returned for
> `logs/HEAD.lock`.
>
> This does not matter as long as the Git executable is doing the asking,
> as the path for that `index.lock` file is constructed from
> `git_path("index")` by appending the `.lock` suffix.

Is this still git_path("index") or is it now HEAD?

> Side note: Git GUI _does_ ask for `index.lock`, but that is already
> resolved correctly.

Is that s/but/and/?

> diff --git a/path.c b/path.c
> index e3da1f3c4e..ff85692b45 100644
> --- a/path.c
> +++ b/path.c
> @@ -268,7 +268,7 @@ static int trie_find(struct trie *root, const char *key, match_fn fn,
>  	int result;
>  	struct trie *child;
>  
> -	if (!*key) {
> +	if (!*key || !strcmp(key, ".lock")) {

We only do strcmp for the tail part at the end of the path, so this
should probably OK from performance point of view but semantically
it is not very satisfying to see a special case for a single .suffix
this deep in the callchain.  I wonder if it is nicer to have the
higher level callers notice ".lock" or whatever other suffixes they
care about and ask the lower layer for a key with the suffix
stripped?

Will queue.

Thanks.
SZEDER Gábor Oct. 18, 2019, 12:35 p.m. UTC | #2
On Fri, Oct 18, 2019 at 10:23:00AM +0900, Junio C Hamano wrote:
> > diff --git a/path.c b/path.c
> > index e3da1f3c4e..ff85692b45 100644
> > --- a/path.c
> > +++ b/path.c
> > @@ -268,7 +268,7 @@ static int trie_find(struct trie *root, const char *key, match_fn fn,
> >  	int result;
> >  	struct trie *child;
> >  
> > -	if (!*key) {
> > +	if (!*key || !strcmp(key, ".lock")) {
> 
> We only do strcmp for the tail part at the end of the path, so this
> should probably OK from performance point of view but semantically
> it is not very satisfying to see a special case for a single .suffix
> this deep in the callchain.  I wonder if it is nicer to have the
> higher level callers notice ".lock" or whatever other suffixes they
> care about and ask the lower layer for a key with the suffix
> stripped?

I was wondering the same.  Despite all of its functions being file
static, this is a fairly general trie implementation for paths, which
could potentially be useful in other parts of Git as well.  Special
casing any suffix within its implementation makes it less (re)usable
in other cases.
Johannes Schindelin Oct. 21, 2019, 8:26 p.m. UTC | #3
Hi Junio,

On Fri, 18 Oct 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > Ever since worktrees were introduced, the `git_path()` function _really_
> > needed to be called e.g. to get at the path to `logs/HEAD` (`HEAD` is
> > specific to the worktree). However, the wrong path is returned for
> > `logs/HEAD.lock`.
> >
> > This does not matter as long as the Git executable is doing the asking,
> > as the path for that `index.lock` file is constructed from
> > `git_path("index")` by appending the `.lock` suffix.
>
> Is this still git_path("index") or is it now HEAD?

Darn. It is "logs/HEAD", of course.

> > Side note: Git GUI _does_ ask for `index.lock`, but that is already
> > resolved correctly.
>
> Is that s/but/and/?

It sounds better with an "and", you're right.

> > diff --git a/path.c b/path.c
> > index e3da1f3c4e..ff85692b45 100644
> > --- a/path.c
> > +++ b/path.c
> > @@ -268,7 +268,7 @@ static int trie_find(struct trie *root, const char *key, match_fn fn,
> >  	int result;
> >  	struct trie *child;
> >
> > -	if (!*key) {
> > +	if (!*key || !strcmp(key, ".lock")) {
>
> We only do strcmp for the tail part at the end of the path, so this
> should probably OK from performance point of view but semantically
> it is not very satisfying to see a special case for a single .suffix
> this deep in the callchain.  I wonder if it is nicer to have the
> higher level callers notice ".lock" or whatever other suffixes they
> care about and ask the lower layer for a key with the suffix
> stripped?

Hmm. The parameter is provided as a `const char *`, so I cannot just
edit it. My first attempt at fixing this resulted in this commit:

-- snip --
    path: switch `trie_find()` to a counted string

    Before this patch, the `trie_find()` function would be fed a regular,
    NUL-terminated string.

    However, in the next commit, we want to strip any `.lock` suffix from
    the argument, and the argument is provided as a `const char *` (i.e. we
    do not want to modify it).

    Let's use a string + length pair of parameters instead of a single
    NUL-terminated string to open the door for this kind of manipulation.

    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

diff --git a/path.c b/path.c
index e3da1f3c4e2..b18d552c890 100644
--- a/path.c
+++ b/path.c
@@ -236,7 +236,8 @@ static void *add_to_trie(struct trie *root, const char *key, void *value)
 	return old;
 }

-typedef int (*match_fn)(const char *unmatched, void *data, void *baton);
+typedef int (*match_fn)(const char *unmatched, size_t unmatched_len,
+			void *data, void *baton);

 /*
  * Search a trie for some key.  Find the longest /-or-\0-terminated
@@ -261,22 +262,22 @@ typedef int (*match_fn)(const char *unmatched, void *data, void *baton);
  * |-----------------|------------|---------------|--------------|
  *
  */
-static int trie_find(struct trie *root, const char *key, match_fn fn,
-		     void *baton)
+static int trie_find(struct trie *root, const char *key, size_t key_len,
+		     match_fn fn, void *baton)
 {
 	int i;
 	int result;
 	struct trie *child;

-	if (!*key) {
+	if (!key_len) {
 		/* we have reached the end of the key */
 		if (root->value && !root->len)
-			return fn(key, root->value, baton);
+			return fn(key, key_len, root->value, baton);
 		else
 			return -1;
 	}

-	for (i = 0; i < root->len; i++) {
+	for (i = 0; i < key_len && i < root->len; i++) {
 		/* Partial path normalization: skip consecutive slashes. */
 		if (key[i] == '/' && key[i+1] == '/') {
 			key++;
@@ -288,24 +289,26 @@ static int trie_find(struct trie *root, const char *key, match_fn fn,

 	/* Matched the entire compressed section */
 	key += i;
-	if (!*key)
+	key_len -= i;
+
+	if (!key_len)
 		/* End of key */
-		return fn(key, root->value, baton);
+		return fn(key, key_len, root->value, baton);

 	/* Partial path normalization: skip consecutive slashes */
-	while (key[0] == '/' && key[1] == '/')
-		key++;
+	while (key_len > 0 && key[0] == '/' && key[1] == '/')
+		key++, key_len--;

-	child = root->children[(unsigned char)*key];
+	child = root->children[!key_len ? '0' : (unsigned char)*key];
 	if (child)
-		result = trie_find(child, key + 1, fn, baton);
+		result = trie_find(child, key + 1, key_len - 1, fn, baton);
 	else
 		result = -1;

-	if (result >= 0 || (*key != '/' && *key != 0))
+	if (result >= 0 || (key_len && *key != '/'))
 		return result;
 	if (root->value)
-		return fn(key, root->value, baton);
+		return fn(key, key_len, root->value, baton);
 	else
 		return -1;
 }
@@ -330,17 +333,18 @@ static void init_common_trie(void)
  * Helper function for update_common_dir: returns 1 if the dir
  * prefix is common.
  */
-static int check_common(const char *unmatched, void *value, void *baton)
+static int check_common(const char *unmatched, size_t unmatched_len,
+			void *value, void *baton)
 {
 	struct common_dir *dir = value;

 	if (!dir)
 		return 0;

-	if (dir->is_dir && (unmatched[0] == 0 || unmatched[0] == '/'))
+	if (dir->is_dir && (!unmatched_len || unmatched[0] == '/'))
 		return !dir->exclude;

-	if (!dir->is_dir && unmatched[0] == 0)
+	if (!dir->is_dir && !unmatched_len)
 		return !dir->exclude;

 	return 0;
@@ -350,8 +354,10 @@ static void update_common_dir(struct strbuf *buf, int git_dir_len,
 			      const char *common_dir)
 {
 	char *base = buf->buf + git_dir_len;
+	size_t len = strlen(base);
+
 	init_common_trie();
-	if (trie_find(&common_trie, base, check_common, NULL) > 0)
+	if (trie_find(&common_trie, base, len, check_common, NULL) > 0)
 		replace_dir(buf, git_dir_len, common_dir);
 }
-- snap --

However, I think this is _really_ ugly and intrusive. The opposite of
what my goals were.

So I think I'll just bite the bullet and use a temporary copy if the
argument ends in `.lock`.

Ciao,
Dscho
Junio C Hamano Oct. 23, 2019, 2:12 a.m. UTC | #4
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> However, I think this is _really_ ugly and intrusive. The opposite of
> what my goals were.
>
> So I think I'll just bite the bullet and use a temporary copy if the
> argument ends in `.lock`.

Sounds like a quite sensible design decision to me.

Patch
diff mbox series

diff --git a/path.c b/path.c
index e3da1f3c4e..ff85692b45 100644
--- a/path.c
+++ b/path.c
@@ -268,7 +268,7 @@  static int trie_find(struct trie *root, const char *key, match_fn fn,
 	int result;
 	struct trie *child;
 
-	if (!*key) {
+	if (!*key || !strcmp(key, ".lock")) {
 		/* we have reached the end of the key */
 		if (root->value && !root->len)
 			return fn(key, root->value, baton);
@@ -288,7 +288,7 @@  static int trie_find(struct trie *root, const char *key, match_fn fn,
 
 	/* Matched the entire compressed section */
 	key += i;
-	if (!*key)
+	if (!*key || !strcmp(key, ".lock"))
 		/* End of key */
 		return fn(key, root->value, baton);
 
diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
index 01abee533d..d318a1eeef 100755
--- a/t/t1500-rev-parse.sh
+++ b/t/t1500-rev-parse.sh
@@ -116,6 +116,21 @@  test_expect_success 'git-path inside sub-dir' '
 	test_cmp expect actual
 '
 
+test_expect_success 'git-path in worktree' '
+	test_tick &&
+	git commit --allow-empty -m empty &&
+	git worktree add --detach wt &&
+	test_write_lines >expect \
+		"$(pwd)/.git/worktrees/wt/logs/HEAD" \
+		"$(pwd)/.git/worktrees/wt/logs/HEAD.lock" \
+		"$(pwd)/.git/worktrees/wt/index" \
+		"$(pwd)/.git/worktrees/wt/index.lock" &&
+	git -C wt rev-parse >actual \
+		--git-path logs/HEAD --git-path logs/HEAD.lock \
+		--git-path index --git-path index.lock &&
+	test_cmp expect actual
+'
+
 test_expect_success 'rev-parse --is-shallow-repository in shallow repo' '
 	test_commit test_commit &&
 	echo true >expect &&