diff mbox series

[v1,6/7] mv: from in-cone to out-of-cone

Message ID 20220719132809.409247-7-shaoxuan.yuan02@gmail.com (mailing list archive)
State Superseded
Headers show
Series mv: from in-cone to out-of-cone | expand

Commit Message

Shaoxuan Yuan July 19, 2022, 1:28 p.m. UTC
Originally, moving an in-cone <source> to an out-of-cone <destination>
was not possible, mainly because such <destination> is a directory that
is not present in the working tree.

Change the behavior so that we can move an in-cone <source> to
out-of-cone <destination> when --sparse is supplied.

Such <source> can be either clean or dirty, and moving it results in
different behaviors:

A clean move should move the <source> to the <destination>, both in the
working tree and the index, then remove the resulted path from the
working tree, and turn on its CE_SKIP_WORKTREE bit.

A dirty move should move the <source> to the <destination>, both in the
working tree and the index, but should *not* remove the resulted path
from the working tree and should *not* turn on its CE_SKIP_WORKTREE bit.
Instead advise user to "git add" this path and run "git sparse-checkout
reapply" to re-sparsify that path.

Helped-by: Victoria Dye <vdye@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 advice.c                      | 19 +++++++++++++
 advice.h                      |  1 +
 builtin/mv.c                  | 50 ++++++++++++++++++++++++++++++-----
 t/t7002-mv-sparse-checkout.sh |  8 +++---
 4 files changed, 67 insertions(+), 11 deletions(-)

Comments

Derrick Stolee July 19, 2022, 6:14 p.m. UTC | #1
On 7/19/2022 9:28 AM, Shaoxuan Yuan wrote:
> Originally, moving an in-cone <source> to an out-of-cone <destination>
> was not possible, mainly because such <destination> is a directory that
> is not present in the working tree.
> 
> Change the behavior so that we can move an in-cone <source> to
> out-of-cone <destination> when --sparse is supplied.
> 
> Such <source> can be either clean or dirty, and moving it results in
> different behaviors:
> 
> A clean move should move the <source> to the <destination>, both in the
> working tree and the index, then remove the resulted path from the
> working tree, and turn on its CE_SKIP_WORKTREE bit.
> 
> A dirty move should move the <source> to the <destination>, both in the
> working tree and the index, but should *not* remove the resulted path
> from the working tree and should *not* turn on its CE_SKIP_WORKTREE bit.
> Instead advise user to "git add" this path and run "git sparse-checkout
> reapply" to re-sparsify that path.

I feel like this patch should be two, instead: you can have one that
makes the commands succeed or fail properly, and another that outputs the
advice. As it is, it's a bit muddled as to what is necessary for the
movement of the index entry versus representing the error message.

This might mean that you want to pull the advice messages out of the tests
that are added in patch 1 and only apply those to the test as you implement
the advice in that later patch. Such a split of the test implementation
would allow this patch to still switch a bunch of test_expect_failure tests
to test_expect_success.

> @@ -424,6 +426,7 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
>  		if (show_only)
>  			continue;
>  		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
> +		    !(dst_mode & SKIP_WORKTREE_DIR) &&

Here is a use of dst_mode that might be helpful to put with the change in
patch 4. Alternatively, you could delay declaring dst_mode until now.

> +		if (!(mode & SPARSE) && !lstat(src, &st))
> +			up_to_date = !ce_modified(active_cache[pos], &st, 0);

Here, you are only checking for a dirty file if (mode & SPARSE) and the
file exists.

Perhaps it would be helpful to negate this and rename the variable?

	sparse_and_dirty = ce_modified(active_cache[pos], &st, 0);

This makes it clear that we can't use the variable except in this
particular case.

> -		if ((mode & SPARSE) &&
> -		    (path_in_sparse_checkout(dst, &the_index))) {
> -			int dst_pos;
> +		if (ignore_sparse &&
> +		    core_apply_sparse_checkout &&
> +		    core_sparse_checkout_cone) {
>  
> -			dst_pos = cache_name_pos(dst, strlen(dst));
> -			active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
> +			/* from out-of-cone to in-cone */
> +			if ((mode & SPARSE) &&
> +			    path_in_sparse_checkout(dst, &the_index)) {
> +				int dst_pos = cache_name_pos(dst, strlen(dst));
> +				struct cache_entry *dst_ce = active_cache[dst_pos];
>  
> -			if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
> -				die(_("cannot checkout %s"), active_cache[dst_pos]->name);
> +				dst_ce->ce_flags &= ~CE_SKIP_WORKTREE;
> +
> +				if (checkout_entry(dst_ce, &state, NULL, NULL))
> +					die(_("cannot checkout %s"), dst_ce->name);
> +				continue;
> +			}

Here, it helps to ignore whitespace changes. This out to in was already
handled by the existing implementation.

> +			/* from in-cone to out-of-cone */
> +			if ((dst_mode & SKIP_WORKTREE_DIR) &&

This is disjoint from the other case (because of !path_in_sparse_checkout()),
so maybe we could short-circuit with an "else if" here? You could put your
comments about the in-to-out or out-to-in inside the if blocks.

> +			    !(mode & SPARSE) &&
> +			    !path_in_sparse_checkout(dst, &the_index)) {
> +				int dst_pos = cache_name_pos(dst, strlen(dst));
> +				struct cache_entry *dst_ce = active_cache[dst_pos];

(It took me a while to realize that dst_ce is the right entry because we
already called rename_cache_entry_at() earlier.)

> +				char *src_dir = dirname(xstrdup(src));

The xstrdup(src) return string is being lost here.

> +
> +				if (up_to_date) {

Based on my recommendation earlier, this would become

	if (!sparse_and_dirty) {

> +					dst_ce->ce_flags |= CE_SKIP_WORKTREE;
> +					unlink_or_warn(src);
> +				} else {
> +					string_list_append(&dirty_paths, dst);
> +					safe_create_leading_directories(xstrdup(dst));
> +					rename(src, dst);

Ok, still doing the file rename, but leaving it unstaged.

> +				}

Please provide some whitespace between the close of an if/else chain before
starting the next if.

> +				if ((mode & INDEX) && is_empty_dir(src_dir))
> +					rmdir_or_warn(src_dir);

This is an interesting cleanup of the first-level directory. Should it be
recursive and clean up an entire chain of paths?

Thanks,
-Stolee
Shaoxuan Yuan Aug. 3, 2022, 11:50 a.m. UTC | #2
On 7/20/2022 2:14 AM, Derrick Stolee wrote:
 >> -        if ((mode & SPARSE) &&
 >> -            (path_in_sparse_checkout(dst, &the_index))) {
 >> -            int dst_pos;
 >> +        if (ignore_sparse &&
 >> +            core_apply_sparse_checkout &&
 >> +            core_sparse_checkout_cone) {
 >>
 >> -            dst_pos = cache_name_pos(dst, strlen(dst));
 >> -            active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
 >> +            /* from out-of-cone to in-cone */
 >> +            if ((mode & SPARSE) &&
 >> +                path_in_sparse_checkout(dst, &the_index)) {
 >> +                int dst_pos = cache_name_pos(dst, strlen(dst));
 >> +                struct cache_entry *dst_ce = active_cache[dst_pos];
 >>
 >> -            if (checkout_entry(active_cache[dst_pos], &state, NULL, 
NULL))
 >> -                die(_("cannot checkout %s"), 
active_cache[dst_pos]->name);
 >> +                dst_ce->ce_flags &= ~CE_SKIP_WORKTREE;
 >> +
 >> +                if (checkout_entry(dst_ce, &state, NULL, NULL))
 >> +                    die(_("cannot checkout %s"), dst_ce->name);
 >> +                continue;
 >> +            }
 >
 > Here, it helps to ignore whitespace changes. This out to in was already
 > handled by the existing implementation.

Yes, I think it would be better to let `diff` ignore the existing
implementation. Are you suggesting the `-w` (--ignore-all-space) option
of `diff`? I tried this option and it does not work. But another reason
is that there *are* some changes that are different from the original
out-to-in implementation, so even though it looks a bit messy, I think
it makes sense.

 >> +            /* from in-cone to out-of-cone */
 >> +            if ((dst_mode & SKIP_WORKTREE_DIR) &&
 >
 > This is disjoint from the other case (because of 
!path_in_sparse_checkout()),
 > so maybe we could short-circuit with an "else if" here? You could put 
your
 > comments about the in-to-out or out-to-in inside the if blocks.

I tried an else-if but it does clutter the code a bit. I think I'll
leave it as-is. Or do you mind show me a diff of your approach? To be
honest, this disjoint here looks logically cleaner to me.

--
Thanks,
Shaoxuan
Derrick Stolee Aug. 3, 2022, 2:30 p.m. UTC | #3
On 8/3/2022 7:50 AM, Shaoxuan Yuan wrote:
> On 7/20/2022 2:14 AM, Derrick Stolee wrote:
>>> -        if ((mode & SPARSE) &&
>>> -            (path_in_sparse_checkout(dst, &the_index))) {
>>> -            int dst_pos;
>>> +        if (ignore_sparse &&
>>> +            core_apply_sparse_checkout &&
>>> +            core_sparse_checkout_cone) {
>>>
>>> -            dst_pos = cache_name_pos(dst, strlen(dst));
>>> -            active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
>>> +            /* from out-of-cone to in-cone */
>>> +            if ((mode & SPARSE) &&
>>> +                path_in_sparse_checkout(dst, &the_index)) {
>>> +                int dst_pos = cache_name_pos(dst, strlen(dst));
>>> +                struct cache_entry *dst_ce = active_cache[dst_pos];
>>>
>>> -            if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
>>> -                die(_("cannot checkout %s"), active_cache[dst_pos]->name);
>>> +                dst_ce->ce_flags &= ~CE_SKIP_WORKTREE;
>>> +
>>> +                if (checkout_entry(dst_ce, &state, NULL, NULL))
>>> +                    die(_("cannot checkout %s"), dst_ce->name);
>>> +                continue;
>>> +            }
>>
>> Here, it helps to ignore whitespace changes. This out to in was already
>> handled by the existing implementation.
> 
> Yes, I think it would be better to let `diff` ignore the existing
> implementation. Are you suggesting the `-w` (--ignore-all-space) option
> of `diff`? I tried this option and it does not work. But another reason
> is that there *are* some changes that are different from the original
> out-to-in implementation, so even though it looks a bit messy, I think
> it makes sense.

I'm just making a note that I looked at this not in its patch form,
but as a commit diff where I could use the '-w' option to get a nice
view. It's not possible to do that in the patch.
 
>>> +            /* from in-cone to out-of-cone */
>>> +            if ((dst_mode & SKIP_WORKTREE_DIR) &&
>>
>> This is disjoint from the other case (because of !path_in_sparse_checkout()),
>> so maybe we could short-circuit with an "else if" here? You could put your
>> comments about the in-to-out or out-to-in inside the if blocks.
> 
> I tried an else-if but it does clutter the code a bit. I think I'll
> leave it as-is. Or do you mind show me a diff of your approach? To be
> honest, this disjoint here looks logically cleaner to me.

Here's the diff I had in mind:

--- >8 ---

diff --git a/builtin/mv.c b/builtin/mv.c
index df1f69f1a7..111aafb69a 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -455,10 +455,9 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (ignore_sparse &&
 		    core_apply_sparse_checkout &&
 		    core_sparse_checkout_cone) {
-
-			/* from out-of-cone to in-cone */
 			if ((mode & SPARSE) &&
 			    path_in_sparse_checkout(dst, &the_index)) {
+				/* from out-of-cone to in-cone */
 				int dst_pos = cache_name_pos(dst, strlen(dst));
 				struct cache_entry *dst_ce = active_cache[dst_pos];
 
@@ -466,13 +465,10 @@ int cmd_mv(int argc, const char **argv, const char *prefix)
 
 				if (checkout_entry(dst_ce, &state, NULL, NULL))
 					die(_("cannot checkout %s"), dst_ce->name);
-				continue;
-			}
-
-			/* from in-cone to out-of-cone */
-			if ((dst_mode & SKIP_WORKTREE_DIR) &&
-			    !(mode & SPARSE) &&
-			    !path_in_sparse_checkout(dst, &the_index)) {
+			} else if ((dst_mode & SKIP_WORKTREE_DIR) &&
+				   !(mode & SPARSE) &&
+				   !path_in_sparse_checkout(dst, &the_index)) {
+				/* from in-cone to out-of-cone */
 				int dst_pos = cache_name_pos(dst, strlen(dst));
 				struct cache_entry *dst_ce = active_cache[dst_pos];
 				char *src_dir = dirname(xstrdup(src));

--- >8 ---

I agree with you that the whitespace breaking the two cases is nice,
but relying on that "continue;" to keep these cases disjoint is easy
to miss and I'd rather let the code be clearer about the cases.

Thanks,
-Stolee
Shaoxuan Yuan Aug. 4, 2022, 8:40 a.m. UTC | #4
On 7/20/2022 2:14 AM, Derrick Stolee wrote:
 >> +                if ((mode & INDEX) && is_empty_dir(src_dir))
 >> +                    rmdir_or_warn(src_dir);
 >
 > This is an interesting cleanup of the first-level directory. Should it be
 > recursive and clean up an entire chain of paths?

Indeed. I'm planning to use an array to record the possible `src_dir`,
i.e. WORKING_DIRECTORYs in `git-mv`, and use a loop to cleanup the
empty directories recursively.

--
Thanks,
Shaoxuan
diff mbox series

Patch

diff --git a/advice.c b/advice.c
index 6fda9edbc2..fd18968943 100644
--- a/advice.c
+++ b/advice.c
@@ -261,3 +261,22 @@  void detach_advice(const char *new_name)
 
 	fprintf(stderr, fmt, new_name);
 }
+
+void advise_on_moving_dirty_path(struct string_list *pathspec_list)
+{
+	struct string_list_item *item;
+
+	if (!pathspec_list->nr)
+		return;
+
+	fprintf(stderr, _("The following paths have been moved outside the\n"
+			  "sparse-checkout definition but are not sparse due to local\n"
+			  "modifications.\n"));
+	for_each_string_list_item(item, pathspec_list)
+		fprintf(stderr, "%s\n", item->string);
+
+	advise_if_enabled(ADVICE_UPDATE_SPARSE_PATH,
+			  _("To correct the sparsity of these paths, do the following:\n"
+			    "* Use \"git add --sparse <paths>\" to update the index\n"
+			    "* Use \"git sparse-checkout reapply\" to apply the sparsity rules"));
+}
diff --git a/advice.h b/advice.h
index 7ddc6cbc1a..07e0f76833 100644
--- a/advice.h
+++ b/advice.h
@@ -74,5 +74,6 @@  void NORETURN die_conclude_merge(void);
 void NORETURN die_ff_impossible(void);
 void advise_on_updating_sparse_paths(struct string_list *pathspec_list);
 void detach_advice(const char *new_name);
+void advise_on_moving_dirty_path(struct string_list *pathspec_list);
 
 #endif /* ADVICE_H */
diff --git a/builtin/mv.c b/builtin/mv.c
index d05914a233..d35994c443 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -184,6 +184,7 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 	struct lock_file lock_file = LOCK_INIT;
 	struct cache_entry *ce;
 	struct string_list only_match_skip_worktree = STRING_LIST_INIT_NODUP;
+	struct string_list dirty_paths = STRING_LIST_INIT_NODUP;
 
 	git_config(git_default_config, NULL);
 
@@ -414,6 +415,7 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 		const char *src = source[i], *dst = destination[i];
 		enum update_mode mode = modes[i];
 		int pos;
+		int up_to_date = 0;
 		struct checkout state = CHECKOUT_INIT;
 		state.istate = &the_index;
 
@@ -424,6 +426,7 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 		if (show_only)
 			continue;
 		if (!(mode & (INDEX | SPARSE | SKIP_WORKTREE_DIR)) &&
+		    !(dst_mode & SKIP_WORKTREE_DIR) &&
 		    rename(src, dst) < 0) {
 			if (ignore_errors)
 				continue;
@@ -443,20 +446,52 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 
 		pos = cache_name_pos(src, strlen(src));
 		assert(pos >= 0);
+		if (!(mode & SPARSE) && !lstat(src, &st))
+			up_to_date = !ce_modified(active_cache[pos], &st, 0);
 		rename_cache_entry_at(pos, dst);
 
-		if ((mode & SPARSE) &&
-		    (path_in_sparse_checkout(dst, &the_index))) {
-			int dst_pos;
+		if (ignore_sparse &&
+		    core_apply_sparse_checkout &&
+		    core_sparse_checkout_cone) {
 
-			dst_pos = cache_name_pos(dst, strlen(dst));
-			active_cache[dst_pos]->ce_flags &= ~CE_SKIP_WORKTREE;
+			/* from out-of-cone to in-cone */
+			if ((mode & SPARSE) &&
+			    path_in_sparse_checkout(dst, &the_index)) {
+				int dst_pos = cache_name_pos(dst, strlen(dst));
+				struct cache_entry *dst_ce = active_cache[dst_pos];
 
-			if (checkout_entry(active_cache[dst_pos], &state, NULL, NULL))
-				die(_("cannot checkout %s"), active_cache[dst_pos]->name);
+				dst_ce->ce_flags &= ~CE_SKIP_WORKTREE;
+
+				if (checkout_entry(dst_ce, &state, NULL, NULL))
+					die(_("cannot checkout %s"), dst_ce->name);
+				continue;
+			}
+
+			/* from in-cone to out-of-cone */
+			if ((dst_mode & SKIP_WORKTREE_DIR) &&
+			    !(mode & SPARSE) &&
+			    !path_in_sparse_checkout(dst, &the_index)) {
+				int dst_pos = cache_name_pos(dst, strlen(dst));
+				struct cache_entry *dst_ce = active_cache[dst_pos];
+				char *src_dir = dirname(xstrdup(src));
+
+				if (up_to_date) {
+					dst_ce->ce_flags |= CE_SKIP_WORKTREE;
+					unlink_or_warn(src);
+				} else {
+					string_list_append(&dirty_paths, dst);
+					safe_create_leading_directories(xstrdup(dst));
+					rename(src, dst);
+				}
+				if ((mode & INDEX) && is_empty_dir(src_dir))
+					rmdir_or_warn(src_dir);
+			}
 		}
 	}
 
+	if (dirty_paths.nr)
+		advise_on_moving_dirty_path(&dirty_paths);
+
 	if (gitmodules_modified)
 		stage_updated_gitmodules(&the_index);
 
@@ -465,6 +500,7 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 		die(_("Unable to write new index file"));
 
 	string_list_clear(&src_for_dst, 0);
+	string_list_clear(&dirty_paths, 0);
 	UNLEAK(source);
 	UNLEAK(dest_path);
 	free(submodule_gitfile);
diff --git a/t/t7002-mv-sparse-checkout.sh b/t/t7002-mv-sparse-checkout.sh
index c27fcdbfd0..dafe15b9cf 100755
--- a/t/t7002-mv-sparse-checkout.sh
+++ b/t/t7002-mv-sparse-checkout.sh
@@ -303,7 +303,7 @@  test_expect_success 'move sparse file to existing destination with --force and -
 	test_cmp expect sub/file1
 '
 
-test_expect_failure 'move clean path from in-cone to out-of-cone' '
+test_expect_success 'move clean path from in-cone to out-of-cone' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -356,7 +356,7 @@  test_expect_failure 'move clean path from in-cone to out-of-cone overwrite' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'move dirty path from in-cone to out-of-cone' '
+test_expect_success 'move dirty path from in-cone to out-of-cone' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 	echo "modified" >>sub/d &&
@@ -380,7 +380,7 @@  test_expect_failure 'move dirty path from in-cone to out-of-cone' '
 	grep -x "H folder1/d" actual
 '
 
-test_expect_failure 'move dir from in-cone to out-of-cone' '
+test_expect_success 'move dir from in-cone to out-of-cone' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 
@@ -400,7 +400,7 @@  test_expect_failure 'move dir from in-cone to out-of-cone' '
 	grep -x "S folder1/dir/e" actual
 '
 
-test_expect_failure 'move partially-dirty dir from in-cone to out-of-cone' '
+test_expect_success 'move partially-dirty dir from in-cone to out-of-cone' '
 	test_when_finished "cleanup_sparse_checkout" &&
 	setup_sparse_checkout &&
 	touch sub/dir/e2 sub/dir/e3 &&