diff mbox series

[v2,3/8] unpack-trees: compare sparse directories correctly

Message ID 24e71d8c062239be9b995bcc31dd12edf84106e3.1619213665.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Sparse-index: integrate with status | expand

Commit Message

Derrick Stolee April 23, 2021, 9:34 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

As we further integrate the sparse-index into unpack-trees, we need to
ensure that we compare sparse directory entries correctly with other
entries. This affects searching for an exact path as well as sorting
index entries.

Sparse directory entries contain the trailing directory separator. This
is important for the sorting, in particular. Thus, within
do_compare_entry() we stop using S_IFREG in all cases, since sparse
directories should use S_IFDIR to indicate that the comparison should
treat the entry name as a dirctory.

Within compare_entry(), it first calls do_compare_entry() to check the
leading portion of the name. When the input path is a directory name, we
could match exactly already. Thus, we should return 0 if we have an
exact string match on a sparse directory entry.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 unpack-trees.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Elijah Newren May 13, 2021, 3:26 a.m. UTC | #1
On Fri, Apr 23, 2021 at 2:34 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> As we further integrate the sparse-index into unpack-trees, we need to
> ensure that we compare sparse directory entries correctly with other
> entries. This affects searching for an exact path as well as sorting
> index entries.
>
> Sparse directory entries contain the trailing directory separator. This
> is important for the sorting, in particular. Thus, within
> do_compare_entry() we stop using S_IFREG in all cases, since sparse
> directories should use S_IFDIR to indicate that the comparison should
> treat the entry name as a dirctory.
>
> Within compare_entry(), it first calls do_compare_entry() to check the
> leading portion of the name. When the input path is a directory name, we
> could match exactly already. Thus, we should return 0 if we have an
> exact string match on a sparse directory entry.

Thanks for splitting up patch 2 from the original series; it's much
easier to understand these separate patches.

>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  unpack-trees.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 1067db19c9d2..3af797093095 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -969,6 +969,7 @@ static int do_compare_entry(const struct cache_entry *ce,
>         int pathlen, ce_len;
>         const char *ce_name;
>         int cmp;
> +       unsigned ce_mode;
>
>         /*
>          * If we have not precomputed the traverse path, it is quicker
> @@ -991,7 +992,8 @@ static int do_compare_entry(const struct cache_entry *ce,
>         ce_len -= pathlen;
>         ce_name = ce->name + pathlen;
>
> -       return df_name_compare(ce_name, ce_len, S_IFREG, name, namelen, mode);
> +       ce_mode = S_ISSPARSEDIR(ce->ce_mode) ? S_IFDIR : S_IFREG;

Ah, so here the fact that S_ISSPARSEDIR is defined as
   #define S_ISSPARSEDIR(m) ((m) == S_IFDIR)
whereas S_ISDIR is defined as
   #define S_ISDIR(m)      (((m) & S_IFMT) == S_IFDIR)
turns out to be critically important, because if you used S_ISDIR()
here, then we'd get ce_mode = S_IFDIR for submodules and break the
sorting.  S_ISSPARSEDIR() gives us the correct value.

> +       return df_name_compare(ce_name, ce_len, ce_mode, name, namelen, mode);
>  }
>
>  static int compare_entry(const struct cache_entry *ce, const struct traverse_info *info, const struct name_entry *n)
> @@ -1000,6 +1002,10 @@ static int compare_entry(const struct cache_entry *ce, const struct traverse_inf
>         if (cmp)
>                 return cmp;
>
> +       /* If ce is a sparse directory, then allow an exact match. */
> +       if (S_ISSPARSEDIR(ce->ce_mode))
> +               return 0;

I think the comment from the commit message belongs in the code; the
comment in the code is too jarring without the more detailed
explanation.

> +
>         /*
>          * Even if the beginning compared identically, the ce should
>          * compare as bigger than a directory leading up to it!
> --
> gitgitgadget
diff mbox series

Patch

diff --git a/unpack-trees.c b/unpack-trees.c
index 1067db19c9d2..3af797093095 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -969,6 +969,7 @@  static int do_compare_entry(const struct cache_entry *ce,
 	int pathlen, ce_len;
 	const char *ce_name;
 	int cmp;
+	unsigned ce_mode;
 
 	/*
 	 * If we have not precomputed the traverse path, it is quicker
@@ -991,7 +992,8 @@  static int do_compare_entry(const struct cache_entry *ce,
 	ce_len -= pathlen;
 	ce_name = ce->name + pathlen;
 
-	return df_name_compare(ce_name, ce_len, S_IFREG, name, namelen, mode);
+	ce_mode = S_ISSPARSEDIR(ce->ce_mode) ? S_IFDIR : S_IFREG;
+	return df_name_compare(ce_name, ce_len, ce_mode, name, namelen, mode);
 }
 
 static int compare_entry(const struct cache_entry *ce, const struct traverse_info *info, const struct name_entry *n)
@@ -1000,6 +1002,10 @@  static int compare_entry(const struct cache_entry *ce, const struct traverse_inf
 	if (cmp)
 		return cmp;
 
+	/* If ce is a sparse directory, then allow an exact match. */
+	if (S_ISSPARSEDIR(ce->ce_mode))
+		return 0;
+
 	/*
 	 * Even if the beginning compared identically, the ce should
 	 * compare as bigger than a directory leading up to it!