diff mbox series

[05/10] sparse-checkout: automatically update in-tree definition

Message ID 9a2cb7bb5ed7a5dc039d4b47bfee83c589252a45.1588857462.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series In-tree sparse-checkout definitions | expand

Commit Message

Johannes Schindelin via GitGitGadget May 7, 2020, 1:17 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

A benefit of having sparse-checkouts defined in the tree data is that
the sparse-checkout definition can change over time. To take advantage
of that data, a user could run "git sparse-checkout reapply" whenever
they thought the sparse-checkout definition was out of date. That is
likely too manual of a process, since a user would probably only
discover this after a failure to accomplish their goal, such as a build
failure.

Prevent user frustration by automatically updating the sparse-checkout
definition after changing the index when an in-tree definition is
provided by config. This will happen during every index change,
including every step of a rebase, including conflict states.

Special care was needed around the --no-sparse-checkout option in "git
read-tree". This previously relied on changing the skip_sparse_checkout
option in "struct unpack_trees_options" to prevent applying the
skip-worktree bits. However, now that we make a second update to the
index focusing on the skip-worktree bits, this needs to be prevented in
another way. The simplest thing to do was disable the feature through
the core_apply_sparse_checkout global variable.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-sparse-checkout.txt |  7 ++++---
 builtin/commit.c                      |  4 +++-
 builtin/read-tree.c                   |  4 ++++
 read-cache.c                          |  8 +++++---
 sparse-checkout.c                     | 26 +++++++++++++++++++++++++-
 sparse-checkout.h                     |  1 +
 t/t1091-sparse-checkout-builtin.sh    | 19 +++++++++++++++----
 7 files changed, 57 insertions(+), 12 deletions(-)

Comments

Elijah Newren May 20, 2020, 4:28 p.m. UTC | #1
Hi,

Sorry for the long delay.  I'm trying to read through the patch series
but may comment out of order...

On Thu, May 7, 2020 at 6:21 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> A benefit of having sparse-checkouts defined in the tree data is that
> the sparse-checkout definition can change over time. To take advantage
> of that data, a user could run "git sparse-checkout reapply" whenever
> they thought the sparse-checkout definition was out of date. That is
> likely too manual of a process, since a user would probably only
> discover this after a failure to accomplish their goal, such as a build
> failure.
>
> Prevent user frustration by automatically updating the sparse-checkout
> definition after changing the index when an in-tree definition is
> provided by config. This will happen during every index change,
> including every step of a rebase, including conflict states.

...every step of a rebase _that writes to the index_.  (Currently,
rebase writes to the index and working tree with every commit that is
applied, though that's a waste of time and resources; if the patch
applies cleanly it should be done in-memory and just continue on to
the next one.  That avoids dirtying working tree files unnecessarily
too.  I'm hoping to start working on it again soon...)

> Special care was needed around the --no-sparse-checkout option in "git
> read-tree". This previously relied on changing the skip_sparse_checkout
> option in "struct unpack_trees_options" to prevent applying the
> skip-worktree bits. However, now that we make a second update to the
> index focusing on the skip-worktree bits, this needs to be prevented in
> another way. The simplest thing to do was disable the feature through
> the core_apply_sparse_checkout global variable.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Documentation/git-sparse-checkout.txt |  7 ++++---
>  builtin/commit.c                      |  4 +++-
>  builtin/read-tree.c                   |  4 ++++
>  read-cache.c                          |  8 +++++---
>  sparse-checkout.c                     | 26 +++++++++++++++++++++++++-
>  sparse-checkout.h                     |  1 +
>  t/t1091-sparse-checkout-builtin.sh    | 19 +++++++++++++++----
>  7 files changed, 57 insertions(+), 12 deletions(-)
>
> diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
> index da9322c5e41..c1713ebb1d2 100644
> --- a/Documentation/git-sparse-checkout.txt
> +++ b/Documentation/git-sparse-checkout.txt
> @@ -233,9 +233,10 @@ Use `git sparse-checkout set --in-tree <path>` to initialize the patterns
>  to those included in the file at `<path>`. This will override any existing
>  patterns you have in your sparse-checkout file.
>
> -After switching between commits with different versions of this file, run
> -`git sparse-checkout reapply` to adjust the sparse-checkout patterns to
> -the new definition.
> +As Git switches between commits, it will update the in-tree sparse-checkout
> +definition according to the files available at the new commit. If any of
> +the specified files do not exist at the new commit, then the sparse-checkout
> +definition will not change.
>
>
>  SUBMODULES
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 7ba33a3bec4..0eab8d74469 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -36,6 +36,7 @@
>  #include "help.h"
>  #include "commit-reach.h"
>  #include "commit-graph.h"
> +#include "sparse-checkout.h"
>
>  static const char * const builtin_commit_usage[] = {
>         N_("git commit [<options>] [--] <pathspec>..."),
> @@ -222,7 +223,8 @@ static int commit_index_files(void)
>         case COMMIT_AS_IS:
>                 break; /* nothing to do */
>         case COMMIT_NORMAL:
> -               err = commit_lock_file(&index_lock);
> +               err = commit_lock_file(&index_lock) ||
> +                     update_in_tree_sparse_checkout();
>                 break;
>         case COMMIT_PARTIAL:
>                 err = commit_lock_file(&index_lock);

I like the fact that you found one simple place to change to make sure
all places call the necessary code.  I'm trying to think out loud
about whether that catches everything and whether it has any ill
side-effects.  Not sure if there is one, but...if someone is doing a
merge, a rebase, or even a checkout, then:

* unpack_trees() will apply sparsity patterns (but likely the old ones
that weren't updated yet, right?)
* commit_index_files() will reapply the new sparsity patterns

While I think this doesn't present any problem from a correctness
point of view, is there a performance issue here?  Is there a case
where this will cause a huge amount of unnecessary disk activity, e.g.
when old rules include some big directory that later rules don't?

> diff --git a/builtin/read-tree.c b/builtin/read-tree.c
> index af7424b94c8..9ae81ffffa1 100644
> --- a/builtin/read-tree.c
> +++ b/builtin/read-tree.c
> @@ -247,6 +247,10 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
>                 parse_tree(tree);
>                 init_tree_desc(t+i, tree->buffer, tree->size);
>         }
> +
> +       if (opts.skip_sparse_checkout)
> +               core_apply_sparse_checkout = 0;
> +
>         if (unpack_trees(nr_trees, t, &opts))
>                 return 128;
>
> diff --git a/read-cache.c b/read-cache.c
> index aa427c5c170..150e73feb0d 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -25,6 +25,7 @@
>  #include "fsmonitor.h"
>  #include "thread-utils.h"
>  #include "progress.h"
> +#include "sparse-checkout.h"
>
>  /* Mask for the name length in ce_flags in the on-disk index */
>
> @@ -3074,9 +3075,10 @@ static int do_write_locked_index(struct index_state *istate, struct lock_file *l
>
>         if (ret)
>                 return ret;
> -       if (flags & COMMIT_LOCK)
> -               ret = commit_locked_index(lock);
> -       else
> +       if (flags & COMMIT_LOCK) {
> +               ret = commit_locked_index(lock) ||
> +                     update_in_tree_sparse_checkout();
> +       } else
>                 ret = close_lock_file_gently(lock);
>
>         run_hook_le(NULL, "post-index-change",
> diff --git a/sparse-checkout.c b/sparse-checkout.c
> index d6c27ca19c4..6c58fda9722 100644
> --- a/sparse-checkout.c
> +++ b/sparse-checkout.c
> @@ -92,9 +92,12 @@ int load_in_tree_pattern_list(struct repository *r,
>                  * Exit silently, as this is likely the case where Git
>                  * changed branches to a location where the inherit file
>                  * does not exist. Do not update the sparse-checkout.
> +                *
> +                * Use -1 return to ensure populate_from_existing_patterns()
> +                * skips the sparse-checkout updates.
>                  */
>                 if (pos < 0)
> -                       return 1;
> +                       return -1;
>
>                 oid = &istate->cache[pos]->oid;
>                 type = oid_object_info(r, oid, NULL);
> @@ -145,6 +148,7 @@ int populate_sparse_checkout_patterns(struct pattern_list *pl)
>         return result;
>  }
>
> +static int updating_sparse_checkout = 0;
>  int update_working_directory(struct pattern_list *pl)
>  {
>         enum update_sparsity_result result;
> @@ -152,6 +156,10 @@ int update_working_directory(struct pattern_list *pl)
>         struct lock_file lock_file = LOCK_INIT;
>         struct repository *r = the_repository;
>
> +       if (updating_sparse_checkout)
> +               return 0;
> +       updating_sparse_checkout = 1;
> +
>         memset(&o, 0, sizeof(o));
>         o.verbose_update = isatty(2);
>         o.update = 1;
> @@ -180,9 +188,24 @@ int update_working_directory(struct pattern_list *pl)
>         else
>                 rollback_lock_file(&lock_file);
>
> +       updating_sparse_checkout = 0;
>         return result;
>  }
>
> +int update_in_tree_sparse_checkout(void)
> +{
> +       const char *first_value;
> +
> +       if (!core_apply_sparse_checkout)
> +               return 0;
> +
> +       /* only update if doing so due to sparse.inTree. */
> +       if (!git_config_get_value(SPARSE_CHECKOUT_IN_TREE, &first_value) &&
> +           first_value)
> +               return update_working_directory(NULL);
> +       return 0;
> +}
> +
>  static char *escaped_pattern(char *pattern)
>  {
>         char *p = pattern;
> @@ -273,6 +296,7 @@ int write_patterns_and_update(struct pattern_list *pl)
>                 free(sparse_filename);
>                 clear_pattern_list(pl);
>                 update_working_directory(NULL);
> +               updating_sparse_checkout = 0;
>                 return result;
>         }
>
> diff --git a/sparse-checkout.h b/sparse-checkout.h
> index 993a5701a60..fb0ba48524a 100644
> --- a/sparse-checkout.h
> +++ b/sparse-checkout.h
> @@ -13,6 +13,7 @@ char *get_sparse_checkout_filename(void);
>  int populate_sparse_checkout_patterns(struct pattern_list *pl);
>  void write_patterns_to_file(FILE *fp, struct pattern_list *pl);
>  int update_working_directory(struct pattern_list *pl);
> +int update_in_tree_sparse_checkout(void);
>  int write_patterns(struct pattern_list *pl, int and_update);
>  int write_patterns_and_update(struct pattern_list *pl);
>  void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *path);
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index 1040bf9c261..fdaafba5377 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -605,6 +605,7 @@ test_expect_success MINGW 'cone mode replaces backslashes with slashes' '
>  '
>
>  test_expect_success 'basis of --in-tree' '
> +       git -C repo branch no-in-tree &&
>         git -C repo config auto.crlf false &&
>         cat >folder1 <<-\EOF &&
>         [sparse]
> @@ -690,18 +691,28 @@ test_expect_success '"add" with --in-tree' '
>         check_files repo a deep folder1
>  '
>
> -test_expect_success 'reapply after updating in-tree file' '
> +test_expect_success 'automatically change after updating in-tree file' '
>         git -C repo sparse-checkout set --in-tree .sparse/sparse &&
>         check_files repo a &&
>         test_path_is_dir repo/.sparse &&
> -       echo "\tdir = folder1" >>repo/.sparse/sparse &&
> +       printf "\tdir = folder1\n" >>repo/.sparse/sparse &&
>         git -C repo commit -a -m "Update sparse file" &&
> -       git -C repo sparse-checkout reapply &&
>         check_files repo a folder1 &&
>         test_path_is_dir repo/.sparse &&
>         git -C repo checkout HEAD~1 &&
> -       git -C repo sparse-checkout reapply &&
>         check_files repo a &&
> +       test_path_is_dir repo/.sparse &&
> +       git -C repo checkout - &&
> +       check_files repo a folder1 &&
> +       test_path_is_dir repo/.sparse
> +'
> +
> +test_expect_success 'keep definition when in-tree file is missing' '
> +       git -C repo checkout no-in-tree &&
> +       check_files repo a folder1 &&
> +       test_path_is_missing repo/.sparse &&
> +       git -C repo checkout - &&
> +       check_files repo a folder1 &&
>         test_path_is_dir repo/.sparse
>  '
>
> --
> gitgitgadget
>
diff mbox series

Patch

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index da9322c5e41..c1713ebb1d2 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -233,9 +233,10 @@  Use `git sparse-checkout set --in-tree <path>` to initialize the patterns
 to those included in the file at `<path>`. This will override any existing
 patterns you have in your sparse-checkout file.
 
-After switching between commits with different versions of this file, run
-`git sparse-checkout reapply` to adjust the sparse-checkout patterns to
-the new definition.
+As Git switches between commits, it will update the in-tree sparse-checkout
+definition according to the files available at the new commit. If any of
+the specified files do not exist at the new commit, then the sparse-checkout
+definition will not change.
 
 
 SUBMODULES
diff --git a/builtin/commit.c b/builtin/commit.c
index 7ba33a3bec4..0eab8d74469 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -36,6 +36,7 @@ 
 #include "help.h"
 #include "commit-reach.h"
 #include "commit-graph.h"
+#include "sparse-checkout.h"
 
 static const char * const builtin_commit_usage[] = {
 	N_("git commit [<options>] [--] <pathspec>..."),
@@ -222,7 +223,8 @@  static int commit_index_files(void)
 	case COMMIT_AS_IS:
 		break; /* nothing to do */
 	case COMMIT_NORMAL:
-		err = commit_lock_file(&index_lock);
+		err = commit_lock_file(&index_lock) ||
+		      update_in_tree_sparse_checkout();
 		break;
 	case COMMIT_PARTIAL:
 		err = commit_lock_file(&index_lock);
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index af7424b94c8..9ae81ffffa1 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -247,6 +247,10 @@  int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 		parse_tree(tree);
 		init_tree_desc(t+i, tree->buffer, tree->size);
 	}
+
+	if (opts.skip_sparse_checkout)
+		core_apply_sparse_checkout = 0;
+
 	if (unpack_trees(nr_trees, t, &opts))
 		return 128;
 
diff --git a/read-cache.c b/read-cache.c
index aa427c5c170..150e73feb0d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -25,6 +25,7 @@ 
 #include "fsmonitor.h"
 #include "thread-utils.h"
 #include "progress.h"
+#include "sparse-checkout.h"
 
 /* Mask for the name length in ce_flags in the on-disk index */
 
@@ -3074,9 +3075,10 @@  static int do_write_locked_index(struct index_state *istate, struct lock_file *l
 
 	if (ret)
 		return ret;
-	if (flags & COMMIT_LOCK)
-		ret = commit_locked_index(lock);
-	else
+	if (flags & COMMIT_LOCK) {
+		ret = commit_locked_index(lock) ||
+		      update_in_tree_sparse_checkout();
+	} else
 		ret = close_lock_file_gently(lock);
 
 	run_hook_le(NULL, "post-index-change",
diff --git a/sparse-checkout.c b/sparse-checkout.c
index d6c27ca19c4..6c58fda9722 100644
--- a/sparse-checkout.c
+++ b/sparse-checkout.c
@@ -92,9 +92,12 @@  int load_in_tree_pattern_list(struct repository *r,
 		 * Exit silently, as this is likely the case where Git
 		 * changed branches to a location where the inherit file
 		 * does not exist. Do not update the sparse-checkout.
+		 *
+		 * Use -1 return to ensure populate_from_existing_patterns()
+		 * skips the sparse-checkout updates.
 		 */
 		if (pos < 0)
-			return 1;
+			return -1;
 
 		oid = &istate->cache[pos]->oid;
 		type = oid_object_info(r, oid, NULL);
@@ -145,6 +148,7 @@  int populate_sparse_checkout_patterns(struct pattern_list *pl)
 	return result;
 }
 
+static int updating_sparse_checkout = 0;
 int update_working_directory(struct pattern_list *pl)
 {
 	enum update_sparsity_result result;
@@ -152,6 +156,10 @@  int update_working_directory(struct pattern_list *pl)
 	struct lock_file lock_file = LOCK_INIT;
 	struct repository *r = the_repository;
 
+	if (updating_sparse_checkout)
+		return 0;
+	updating_sparse_checkout = 1;
+
 	memset(&o, 0, sizeof(o));
 	o.verbose_update = isatty(2);
 	o.update = 1;
@@ -180,9 +188,24 @@  int update_working_directory(struct pattern_list *pl)
 	else
 		rollback_lock_file(&lock_file);
 
+	updating_sparse_checkout = 0;
 	return result;
 }
 
+int update_in_tree_sparse_checkout(void)
+{
+	const char *first_value;
+
+	if (!core_apply_sparse_checkout)
+		return 0;
+
+	/* only update if doing so due to sparse.inTree. */
+	if (!git_config_get_value(SPARSE_CHECKOUT_IN_TREE, &first_value) &&
+	    first_value)
+	    	return update_working_directory(NULL);
+	return 0;
+}
+
 static char *escaped_pattern(char *pattern)
 {
 	char *p = pattern;
@@ -273,6 +296,7 @@  int write_patterns_and_update(struct pattern_list *pl)
 		free(sparse_filename);
 		clear_pattern_list(pl);
 		update_working_directory(NULL);
+		updating_sparse_checkout = 0;
 		return result;
 	}
 
diff --git a/sparse-checkout.h b/sparse-checkout.h
index 993a5701a60..fb0ba48524a 100644
--- a/sparse-checkout.h
+++ b/sparse-checkout.h
@@ -13,6 +13,7 @@  char *get_sparse_checkout_filename(void);
 int populate_sparse_checkout_patterns(struct pattern_list *pl);
 void write_patterns_to_file(FILE *fp, struct pattern_list *pl);
 int update_working_directory(struct pattern_list *pl);
+int update_in_tree_sparse_checkout(void);
 int write_patterns(struct pattern_list *pl, int and_update);
 int write_patterns_and_update(struct pattern_list *pl);
 void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *path);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 1040bf9c261..fdaafba5377 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -605,6 +605,7 @@  test_expect_success MINGW 'cone mode replaces backslashes with slashes' '
 '
 
 test_expect_success 'basis of --in-tree' '
+	git -C repo branch no-in-tree &&
 	git -C repo config auto.crlf false &&
 	cat >folder1 <<-\EOF &&
 	[sparse]
@@ -690,18 +691,28 @@  test_expect_success '"add" with --in-tree' '
 	check_files repo a deep folder1
 '
 
-test_expect_success 'reapply after updating in-tree file' '
+test_expect_success 'automatically change after updating in-tree file' '
 	git -C repo sparse-checkout set --in-tree .sparse/sparse &&
 	check_files repo a &&
 	test_path_is_dir repo/.sparse &&
-	echo "\tdir = folder1" >>repo/.sparse/sparse &&
+	printf "\tdir = folder1\n" >>repo/.sparse/sparse &&
 	git -C repo commit -a -m "Update sparse file" &&
-	git -C repo sparse-checkout reapply &&
 	check_files repo a folder1 &&
 	test_path_is_dir repo/.sparse &&
 	git -C repo checkout HEAD~1 &&
-	git -C repo sparse-checkout reapply &&
 	check_files repo a &&
+	test_path_is_dir repo/.sparse &&
+	git -C repo checkout - &&
+	check_files repo a folder1 &&
+	test_path_is_dir repo/.sparse
+'
+
+test_expect_success 'keep definition when in-tree file is missing' '
+	git -C repo checkout no-in-tree &&
+	check_files repo a folder1 &&
+	test_path_is_missing repo/.sparse &&
+	git -C repo checkout - &&
+	check_files repo a folder1 &&
 	test_path_is_dir repo/.sparse
 '