diff mbox series

[24/27] dir: use expand_to_path in add_patterns()

Message ID 926a2e12cf7534339e1f9bedff50d97bf251ffa2.1611596534.git.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series Sparse Index | expand

Commit Message

Derrick Stolee Jan. 25, 2021, 5:42 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The add_patterns() method has a way to extract a pattern file from the
index. If this pattern file is sparse and within a sparse directory
entry, then we need to expand the index before looking for that entry as
a file path.

For now, convert ensure_full_index() into expand_to_path() to only
expand this way when absolutely necessary.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Elijah Newren Feb. 1, 2021, 11:21 p.m. UTC | #1
On Mon, Jan 25, 2021 at 9:42 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The add_patterns() method has a way to extract a pattern file from the
> index. If this pattern file is sparse and within a sparse directory
> entry, then we need to expand the index before looking for that entry as
> a file path.

Why?

Correct me if I'm wrong, but I thought the point of add_patterns() was
to read .gitignore entries, so that we can know whether to e.g. have
status report untracked files within some directory or have clean
delete files within a directory.  But if we have a sparse directory
entry in the index, we probably have no such directory in the working
directory.  And if we have no such working directory, getting
.gitignore entries for those directories is a big waste of time.

> For now, convert ensure_full_index() into expand_to_path() to only
> expand this way when absolutely necessary.

Not only should we probably not need to read these files at all,
expand_to_path() still expands a lot more than necessary, right?  If
two directories are sparse -- moduleA and moduleB, and we need
something from under moduleA/, then expand_to_path() will call
ensure_full_index() and fill out every entry under both modules, even
if moduleB is way bigger than moduleA.  Unless I've misunderstood
something, there's multiple ways we're falling short of "only...when
absolutely necessary".


Perhaps both of these things are future work you already had planned;
if so, some tweaks to the commit message may help keep this reader
oriented.  :-)

> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  dir.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/dir.c b/dir.c
> index 21998c7c4b7..7df8d3b1da0 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1093,7 +1093,7 @@ static int add_patterns(const char *fname, const char *base, int baselen,
>                         int pos;
>
>                         if (istate)
> -                               ensure_full_index(istate);
> +                               expand_to_path(istate, fname, strlen(fname), 0);
>
>                         if (oid_stat->valid &&
>                             !match_stat_data_racy(istate, &oid_stat->stat, &st))
> --
> gitgitgadget

There's also a read_skip_worktree_file_from_index() call earlier in
the same function, which in your big RFC patch you protected with the
ensure_full_index() call already.  Perhaps it should have an
expand_to_path() conversion as well?  But, in the big picture, it
seems like checking if we can avoid reading in that pattern file
(whenever the directory doesn't exist within the working copy) would
be a better first step.
diff mbox series

Patch

diff --git a/dir.c b/dir.c
index 21998c7c4b7..7df8d3b1da0 100644
--- a/dir.c
+++ b/dir.c
@@ -1093,7 +1093,7 @@  static int add_patterns(const char *fname, const char *base, int baselen,
 			int pos;
 
 			if (istate)
-				ensure_full_index(istate);
+				expand_to_path(istate, fname, strlen(fname), 0);
 
 			if (oid_stat->valid &&
 			    !match_stat_data_racy(istate, &oid_stat->stat, &st))