[v2,03/10] sparse-index: create expand_to_pattern_list()

Message ID d15338573e570aebe239dacdd8c2aba275ff61b9.1652982758.git.gitgitgadget@gmail.com
State New, archived
Series Sparse index: integrate with sparse-checkout

Commit Message

Derrick Stolee May 19, 2022, 5:52 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

This is the first change in a series to allow modifying the
sparse-checkout pattern set without expanding a sparse index to a full
one in the process. Here, we focus on the problem of expanding the
pattern set through a command like 'git sparse-checkout add <path>'
which needs to create new index entries for the paths now being written
to the worktree.

To achieve this, we need to be able to replace sparse directory entries
with their contained files and subdirectories. Once this is complete,
other code paths can discover those cache entries and write the
corresponding files to disk before committing the index.

We already have logic in ensure_full_index() that expands the index
entries, so we will use that as our base. Create a new method,
expand_to_pattern_list(), which takes a pattern list, but for now mostly
ignores it. The current implementation is only correct when the pattern
list is NULL as that does the same as ensure_full_index(). In fact,
ensure_full_index() is converted to a shim over
expand_to_pattern_list().

A future update will actually implement expand_to_pattern_list() to its
full capabilities. For now, it is created and documented.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 sparse-index.c | 35 ++++++++++++++++++++++++++++++++---
 sparse-index.h | 14 ++++++++++++++
 2 files changed, 46 insertions(+), 3 deletions(-)

Comments

Junio C Hamano May 19, 2022, 7:50 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> list is NULL as that does the same as ensure_full_index(). In fact,
> ensure_full_index() is converted to a shim over
> expand_to_pattern_list().

Sounds like a natural evolution of the API that used to be
all-or-none to expand-only-those-that-match.

The old one had a sensible name to tell us that it is about the
in-core index (and "full index" implied it was about sparse-index
feature because what state other than "full" the index can be---some
are shrunk into tree entries, which by definition is the
sparse-index feature).  Contrasted to that, the name of the new one
is horrible.  It does not even have index anywhere in the name.

I wonder if expand_index() would work?

> +void expand_to_pattern_list(struct index_state *istate,
> +			    struct pattern_list *pl)
>  {
>  	int i;
>  	struct index_state *full;
>  	struct strbuf base = STRBUF_INIT;
>  
> +	/*
> +	 * If the index is already full, then keep it full. We will convert
> +	 * it to a sparse index on write, if possible.
> +	 */
>  	if (!istate || !istate->sparse_index)
>  		return;
>  
> +	/*
> +	 * If our index is sparse, but our new pattern set does not use
> +	 * cone mode patterns, then we need to expand the index before we
> +	 * continue. A NULL pattern set indicates a full expansion to a
> +	 * full index.
> +	 */
> +	if (pl && !pl->use_cone_patterns)
> +		pl = NULL;
> +
>  	if (!istate->repo)
>  		istate->repo = the_repository;
>  
> -	trace2_region_enter("index", "ensure_full_index", istate->repo);
> +	/*
> +	 * A NULL pattern set indicates we are expanding a full index, so
> +	 * we use a special region name that indicates the full expansion.
> +	 * This is used by test cases, but also helps to differentiate the
> +	 * two cases.
> +	 */

Except that we lost the distinction for non-cone mode, which I am
not sure matters, but I suspect we do not have to, if we do not want
to.  Nobody used "pl" up to this point, so resetting it to NULL can
be done much later.  In later phases of this series, we add another
case where we can lose pl even if we are not using cone mode, so
this distinction may start to matter later.  I dunno.

I'd invent a separate "const char *tr2_region_label" variable and
set it at the beginning, regardless of where we clobber pl and why,
and use that label variable for trace2 calls, if I were doing this
patch.  That feels much simpler and cleaner.

> +	trace2_region_enter("index",
> +			    pl ? "expand_to_pattern_list" : "ensure_full_index",
> +			    istate->repo);

> diff --git a/sparse-index.h b/sparse-index.h
> index 633d4fb7e31..037b541f49d 100644
> --- a/sparse-index.h
> +++ b/sparse-index.h
> @@ -23,4 +23,18 @@ void expand_to_path(struct index_state *istate,
>  struct repository;
>  int set_sparse_index_config(struct repository *repo, int enable);
>  
> +struct pattern_list;
> +
> +/**
> + * Scan the given index and compare its entries to the given pattern list.
> + * If the index is sparse and the pattern list uses cone mode patterns,
> + * then modify the index to contain all of the file entries within that
> + * new pattern list. This expands sparse directories only as far as needed.
> + *
> + * If the pattern list is NULL or does not use cone mode patterns, then the
> + * index is expanded to a full index.
> + */
> +void expand_to_pattern_list(struct index_state *istate,
> +			      struct pattern_list *pl);
> +
>  #endif

Derrick Stolee May 20, 2022, 6:01 p.m. UTC | #2
On 5/19/2022 3:50 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> list is NULL as that does the same as ensure_full_index(). In fact,
>> ensure_full_index() is converted to a shim over
>> expand_to_pattern_list().
> 
> Sounds like a natural evolution of the API that used to be
> all-or-none to expand-only-those-that-match.
> 
> The old one had a sensible name to tell us that it is about the
> in-core index (and "full index" implied it was about sparse-index
> feature because what state other than "full" the index can be---some
> are shrunk into tree entries, which by definition is the
> sparse-index feature).  Contrasted to that, the name of the new one
> is horrible.  It does not even have index anywhere in the name.
> 
> I wonder if expand_index() would work?

Makes sense. Good suggestion.
 
>> -	trace2_region_enter("index", "ensure_full_index", istate->repo);
>> +	/*
>> +	 * A NULL pattern set indicates we are expanding a full index, so
>> +	 * we use a special region name that indicates the full expansion.
>> +	 * This is used by test cases, but also helps to differentiate the
>> +	 * two cases.
>> +	 */
> 
> Except that we lost the distinction for non-cone mode, which I am
> not sure matters, but I suspect we do not have to, if we do not want
> to.  Nobody used "pl" up to this point, so resetting it to NULL can
> be done much later.  In later phases of this series, we add another
> case where we can lose pl even if we are not using cone mode, so
> this distinction may start to matter later.  I dunno.
> 
> I'd invent a separate "const char *tr2_region_label" variable and
> set it at the beginning, regardless of where we clobber pl and why,
> and use that label variable for trace2 calls, if I were doing this
> patch.  That feels much simpler and cleaner.

Good idea.

Thanks,
-Stolee
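
The tr2_region_label idea discussed in the review above could look roughly like the following. This is a minimal, self-contained sketch and not the code that was eventually merged: the two structs are hypothetical stand-ins that carry only the fields used here, region_enter() and region_leave() are stubs that record the label instead of calling trace2_region_enter() and trace2_region_leave(), and expand_index_sketch() and its return values are invented for illustration. It follows one reading of the suggestion, where the label is chosen from what the caller asked for before "pl" is reset for non-cone patterns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Git's structs; only the fields used here. */
struct pattern_list { int use_cone_patterns; };
struct index_state { int sparse_index; };

static const char LABEL_FULL[] = "ensure_full_index";
static const char LABEL_PATTERNS[] = "expand_to_pattern_list";

/* Stubs that record the label instead of calling trace2_region_*(). */
static const char *last_region_entered;
static const char *last_region_left;

static void region_enter(const char *label)
{
	last_region_entered = label;
}

static void region_leave(const char *label)
{
	/* Enter and leave must name the same region. */
	assert(label == last_region_entered);
	last_region_left = label;
}

/*
 * Returns 1 for a full expansion, 0 for a pattern-limited one, and -1
 * when the index is already full and nothing happens.
 */
int expand_index_sketch(struct index_state *istate, struct pattern_list *pl)
{
	const char *tr2_region_label;

	if (!istate || !istate->sparse_index)
		return -1;

	/* Pick the label from what the caller asked for, before pl changes. */
	tr2_region_label = pl ? LABEL_PATTERNS : LABEL_FULL;

	/* Non-cone patterns still force a full expansion. */
	if (pl && !pl->use_cone_patterns)
		pl = NULL;

	region_enter(tr2_region_label);
	/* ... the expansion itself is omitted from this sketch ... */
	region_leave(tr2_region_label);

	return pl ? 0 : 1;
}
```

With a single label variable, the enter and leave calls cannot disagree even if "pl" is cleared somewhere between them. Whether a non-cone pattern list should be reported under the pattern label or the full-index label is a choice the thread leaves open; moving the assignment below the reset would give the other behavior.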

Patch

diff --git a/sparse-index.c b/sparse-index.c
index 8636af72de5..2a06ef58051 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -248,19 +248,41 @@  static int add_path_to_index(const struct object_id *oid,
 	return 0;
 }
 
-void ensure_full_index(struct index_state *istate)
+void expand_to_pattern_list(struct index_state *istate,
+			    struct pattern_list *pl)
 {
 	int i;
 	struct index_state *full;
 	struct strbuf base = STRBUF_INIT;
 
+	/*
+	 * If the index is already full, then keep it full. We will convert
+	 * it to a sparse index on write, if possible.
+	 */
 	if (!istate || !istate->sparse_index)
 		return;
 
+	/*
+	 * If our index is sparse, but our new pattern set does not use
+	 * cone mode patterns, then we need to expand the index before we
+	 * continue. A NULL pattern set indicates a full expansion to a
+	 * full index.
+	 */
+	if (pl && !pl->use_cone_patterns)
+		pl = NULL;
+
 	if (!istate->repo)
 		istate->repo = the_repository;
 
-	trace2_region_enter("index", "ensure_full_index", istate->repo);
+	/*
+	 * A NULL pattern set indicates we are expanding a full index, so
+	 * we use a special region name that indicates the full expansion.
+	 * This is used by test cases, but also helps to differentiate the
+	 * two cases.
+	 */
+	trace2_region_enter("index",
+			    pl ? "expand_to_pattern_list" : "ensure_full_index",
+			    istate->repo);
 
 	/* initialize basics of new index */
 	full = xcalloc(1, sizeof(struct index_state));
@@ -322,7 +344,14 @@  void ensure_full_index(struct index_state *istate)
 	cache_tree_free(&istate->cache_tree);
 	cache_tree_update(istate, 0);
 
-	trace2_region_leave("index", "ensure_full_index", istate->repo);
+	trace2_region_leave("index",
+			    pl ? "expand_to_pattern_list" : "ensure_full_index",
+			    istate->repo);
+}
+
+void ensure_full_index(struct index_state *istate)
+{
+	expand_to_pattern_list(istate, NULL);
 }
 
 void ensure_correct_sparsity(struct index_state *istate)
diff --git a/sparse-index.h b/sparse-index.h
index 633d4fb7e31..037b541f49d 100644
--- a/sparse-index.h
+++ b/sparse-index.h
@@ -23,4 +23,18 @@  void expand_to_path(struct index_state *istate,
 struct repository;
 int set_sparse_index_config(struct repository *repo, int enable);
 
+struct pattern_list;
+
+/**
+ * Scan the given index and compare its entries to the given pattern list.
+ * If the index is sparse and the pattern list uses cone mode patterns,
+ * then modify the index to contain all of the file entries within that
+ * new pattern list. This expands sparse directories only as far as needed.
+ *
+ * If the pattern list is NULL or does not use cone mode patterns, then the
+ * index is expanded to a full index.
+ */
+void expand_to_pattern_list(struct index_state *istate,
+			      struct pattern_list *pl);
+
 #endif
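
To illustrate the contract documented in the header comment above, here is a small self-contained sketch of which kind of expansion each input leads to, and of ensure_full_index() as a shim. It models the documented behavior rather than the body of this patch, which for now performs a full expansion in every case. The structs are hypothetical stand-ins, the enum and the "_sketch" function names are invented for the example, and clearing sparse_index after a full expansion is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Git's structs; only the fields used here. */
struct pattern_list { int use_cone_patterns; };
struct index_state { int sparse_index; };

/* Invented for this sketch: the kind of expansion that was requested. */
enum expansion { EXPAND_NONE, EXPAND_FULL, EXPAND_TO_PATTERNS };

enum expansion expand_to_pattern_list_sketch(struct index_state *istate,
					     struct pattern_list *pl)
{
	/* An index that is already full stays full. */
	if (!istate || !istate->sparse_index)
		return EXPAND_NONE;

	/* Non-cone patterns cannot limit the expansion. */
	if (pl && !pl->use_cone_patterns)
		pl = NULL;

	/* From here on, any remaining pattern list is in cone mode. */
	assert(!pl || pl->use_cone_patterns);

	if (!pl) {
		/* Assumption: a full expansion leaves the index non-sparse. */
		istate->sparse_index = 0;
		return EXPAND_FULL;
	}

	/* Only sparse directories inside the cone would be expanded. */
	return EXPAND_TO_PATTERNS;
}

/* The old entry point becomes a thin wrapper with a NULL pattern list. */
enum expansion ensure_full_index_sketch(struct index_state *istate)
{
	return expand_to_pattern_list_sketch(istate, NULL);
}
```

Existing callers of ensure_full_index() keep their behavior under this arrangement, while a caller such as 'git sparse-checkout add <path>' could pass its new cone-mode pattern list to request the narrower expansion.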