diff mbox series

[v1,4/7] mv: check if <destination> is a SKIP_WORKTREE_DIR

Message ID 20220719132809.409247-5-shaoxuan.yuan02@gmail.com (mailing list archive)
State Superseded
Headers show
Series mv: from in-cone to out-of-cone | expand

Commit Message

Shaoxuan Yuan July 19, 2022, 1:28 p.m. UTC
Originally, <destination> is assumed to be in the working tree. If it is
not found as a directory, then it is determined to be either a regular file
path, or error out if used under the second form (move into a directory)
of 'git-mv'. Such behavior is not ideal, mainly because Git does not
look into the index for <destination>, which could potentially be a
SKIP_WORKTREE_DIR, which we need to determine for the later "moving from
in-cone to out-of-cone" patch.

Change the logic so that Git first check if <destination> is a directory
with all its contents sparsified (a SKIP_WORKTREE_DIR). If yes,
then treat <destination> as a directory exists in the working tree, and
thus using the second form of 'git-mv', i.e. move into this
<destination>, and mark <destination> as a SKIP_WORKTREE_DIR.
If no, continue the original checking logic.

Also add a `dst_w_slash` to reuse the result from `add_slash()`, which
was everywhere and can be simplified.

Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
---
 builtin/mv.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

Comments

Derrick Stolee July 19, 2022, 5:59 p.m. UTC | #1
On 7/19/2022 9:28 AM, Shaoxuan Yuan wrote:
> Originally, <destination> is assumed to be in the working tree. If it is
> not found as a directory, then it is determined to be either a regular file
> path, or error out if used under the second form (move into a directory)
> of 'git-mv'. Such behavior is not ideal, mainly because Git does not
> look into the index for <destination>, which could potentially be a
> SKIP_WORKTREE_DIR, which we need to determine for the later "moving from
> in-cone to out-of-cone" patch.
> 
> Change the logic so that Git first check if <destination> is a directory
> with all its contents sparsified (a SKIP_WORKTREE_DIR). If yes,
> then treat <destination> as a directory exists in the working tree, and

"treat <destination> as a directory exists in the working tree" is a bit
akward, and the rest of the sentence struggles with that. Starting with
"If yes," we could rewrite as follows:

  If <destination> is such a sparse directory, then we should modify the
  index the same way as we would if this were a non-sparse directory. We
  must be careful to ensure that the <destination> is marked with 
  SKIP_WORKTREE_DIR.

(Note that I don't equate this as doing the same thing, just operating on
the index.)

> If no, continue the original checking logic.

I think this part doesn't need to be there, but I don't feel strongly
about it.

> Also add a `dst_w_slash` to reuse the result from `add_slash()`, which
> was everywhere and can be simplified.

>  	else if (!lstat(dest_path[0], &st) &&
>  			S_ISDIR(st.st_mode)) {
> -		dest_path[0] = add_slash(dest_path[0]);
> -		destination = internal_prefix_pathspec(dest_path[0], argv, argc, DUP_BASENAME);
> +		destination = internal_prefix_pathspec(dst_w_slash, argv, argc, DUP_BASENAME);

Hm. I find it interesting how this works even if the directory is _not_
present in the index. Is there a test that checks this kind of setup?

	git init &&
	>file &&
	git add file &&
	git commit -m init &&
	mkdir dir &&
	git mv file dir/

Locally, this indeed works with the following 'git status' detail:

        renamed:    file -> dir/file

>  	} else {
> -		if (argc != 1)
> +		if (!path_in_sparse_checkout(dst_w_slash, &the_index) &&
> +		    !check_dir_in_index(dst_w_slash)) {
> +			destination = internal_prefix_pathspec(dst_w_slash, argv, argc, DUP_BASENAME);
> +			dst_mode |= SKIP_WORKTREE_DIR;

Looks like we are assigning dst_mode here, but not using it. I wonder if
you'll get a compiler error if you build this patch with DEVELOPER=1.

You can test all of the commits in your series using this command:

  git rebase -x "make -j12 DEVELOPER=1" <base>

> +	if (dst_w_slash != dest_path[0])
> +		free((char *)dst_w_slash);

Good that you are freeing this here. You might also want to set it to NULL
just in case.

Thanks,
-Stolee
Shaoxuan Yuan July 21, 2022, 2:13 p.m. UTC | #2
On 7/20/2022 1:59 AM, Derrick Stolee wrote:
 > On 7/19/2022 9:28 AM, Shaoxuan Yuan wrote:
 >> Originally, <destination> is assumed to be in the working tree. If it is
 >> not found as a directory, then it is determined to be either a 
regular file
 >> path, or error out if used under the second form (move into a directory)
 >> of 'git-mv'. Such behavior is not ideal, mainly because Git does not
 >> look into the index for <destination>, which could potentially be a
 >> SKIP_WORKTREE_DIR, which we need to determine for the later "moving from
 >> in-cone to out-of-cone" patch.
 >>
 >> Change the logic so that Git first check if <destination> is a directory
 >> with all its contents sparsified (a SKIP_WORKTREE_DIR). If yes,
 >> then treat <destination> as a directory exists in the working tree, and
 >
 > "treat <destination> as a directory exists in the working tree" is a bit
 > akward, and the rest of the sentence struggles with that. Starting with
 > "If yes," we could rewrite as follows:
 >
 >   If <destination> is such a sparse directory, then we should modify the
 >   index the same way as we would if this were a non-sparse directory. We
 >   must be careful to ensure that the <destination> is marked with
 >   SKIP_WORKTREE_DIR.
 >
 > (Note that I don't equate this as doing the same thing, just operating on
 > the index.)

This wording sounds more natural, thanks for the rewrite!

 >> If no, continue the original checking logic.
 >
 > I think this part doesn't need to be there, but I don't feel strongly
 > about it.
 >
 >> Also add a `dst_w_slash` to reuse the result from `add_slash()`, which
 >> was everywhere and can be simplified.
 >
 >>      else if (!lstat(dest_path[0], &st) &&
 >>              S_ISDIR(st.st_mode)) {
 >> -        dest_path[0] = add_slash(dest_path[0]);
 >> -        destination = internal_prefix_pathspec(dest_path[0], argv, 
argc, DUP_BASENAME);
 >> +        destination = internal_prefix_pathspec(dst_w_slash, argv, 
argc, DUP_BASENAME);
 >
 > Hm. I find it interesting how this works even if the directory is _not_
 > present in the index. Is there a test that checks this kind of setup?
 >
 >     git init &&
 >     >file &&
 >     git add file &&
 >     git commit -m init &&
 >     mkdir dir &&
 >     git mv file dir/
 >
 > Locally, this indeed works with the following 'git status' detail:
 >
 >         renamed:    file -> dir/file

In my impression, 'git-mv' does not seem to care about whether the
<destination> directory is in index? Given that `rename()` works so
long as the directory is in the working tree, and `rename_index_entry_at()`
cares even less about the <destination> dir?

 >>      } else {
 >> -        if (argc != 1)
 >> +        if (!path_in_sparse_checkout(dst_w_slash, &the_index) &&
 >> +            !check_dir_in_index(dst_w_slash)) {
 >> +            destination = internal_prefix_pathspec(dst_w_slash, 
argv, argc, DUP_BASENAME);
 >> +            dst_mode |= SKIP_WORKTREE_DIR;
 >
 > Looks like we are assigning dst_mode here, but not using it. I wonder if
 > you'll get a compiler error if you build this patch with DEVELOPER=1.
 >
 > You can test all of the commits in your series using this command:
 >
 >   git rebase -x "make -j12 DEVELOPER=1" <base>

Oops, it seems that I was doing the wrong way. I only run test on the
tip of the branch, without actually testing individual commits.
Will do this.

 >> +    if (dst_w_slash != dest_path[0])
 >> +        free((char *)dst_w_slash);
 >
 > Good that you are freeing this here. You might also want to set it to 
NULL
 > just in case.

I was using the `FREE_AND_NULL()` macro, but I wasn't sure since other
places in 'git-mv' only use `free()`. Though I think it is better to
`FREE_AND_NULL()`.

--
Thanks,
Shaoxuan
Derrick Stolee July 22, 2022, 12:48 p.m. UTC | #3
On 7/21/2022 10:13 AM, Shaoxuan Yuan wrote:
> On 7/20/2022 1:59 AM, Derrick Stolee wrote:
>> On 7/19/2022 9:28 AM, Shaoxuan Yuan wrote:
>>> Also add a `dst_w_slash` to reuse the result from `add_slash()`, which
>>> was everywhere and can be simplified.
>>
>>>      else if (!lstat(dest_path[0], &st) &&
>>>              S_ISDIR(st.st_mode)) {
>>> -        dest_path[0] = add_slash(dest_path[0]);
>>> -        destination = internal_prefix_pathspec(dest_path[0], argv, argc, DUP_BASENAME);
>>> +        destination = internal_prefix_pathspec(dst_w_slash, argv, argc, DUP_BASENAME);
>>
>> Hm. I find it interesting how this works even if the directory is _not_
>> present in the index. Is there a test that checks this kind of setup?
>>
>>     git init &&
>>     >file &&
>>     git add file &&
>>     git commit -m init &&
>>     mkdir dir &&
>>     git mv file dir/
>>
>> Locally, this indeed works with the following 'git status' detail:
>>
>>         renamed:    file -> dir/file
> 
> In my impression, 'git-mv' does not seem to care about whether the
> <destination> directory is in index? Given that `rename()` works so
> long as the directory is in the working tree, and `rename_index_entry_at()`
> cares even less about the <destination> dir?

Right. Instead of changing the behavior, I'm asking you to double-check that
this behavior is tested. If not, then please add a test.

>>> +    if (dst_w_slash != dest_path[0])
>>> +        free((char *)dst_w_slash);
>>
>> Good that you are freeing this here. You might also want to set it to NULL
>> just in case.
> 
> I was using the `FREE_AND_NULL()` macro, but I wasn't sure since other
> places in 'git-mv' only use `free()`. Though I think it is better to
> `FREE_AND_NULL()`.

free() is generally the way to go if it is clear that the variable
is about to go out-of-scope and could not possibly be referenced
again. Since there is a lot more of the current code block to go,
nulling the variable is good defensive programming.

Thanks,
-Stolee
Junio C Hamano July 22, 2022, 6:40 p.m. UTC | #4
Derrick Stolee <derrickstolee@github.com> writes:

>>> Good that you are freeing this here. You might also want to set it to NULL
>>> just in case.
>> 
>> I was using the `FREE_AND_NULL()` macro, but I wasn't sure since other
>> places in 'git-mv' only use `free()`. Though I think it is better to
>> `FREE_AND_NULL()`.
>
> free() is generally the way to go if it is clear that the variable
> is about to go out-of-scope and could not possibly be referenced
> again. Since there is a lot more of the current code block to go,
> nulling the variable is good defensive programming.

NULLing it out is better when a potential misuse of the pointer
after it got freed will be caught by dereferencing NULL.

There however are pointer members of structures wher they represent
optional data.  Access to such a member goes like so:

	if (structure->optinal_member)
		do_things(structure->optional_member);

When you are done using such a structure and clearing it, after
releasing the resource held by the member, it is better to leave it
dangling than assigning NULL to it.  If somebody reuses that
structure and the control enters a codepath like the above one to
use the "optional" pointer, uncleared dangling pointer will likely
be caught at runtime; setting it to NULL will paper over it.  We've
seen many bugs caused by a premature releasing of a member that was
hidden exactly by such a use of FREE_AND_NULL() few relases ago.

Thanks.
diff mbox series

Patch

diff --git a/builtin/mv.c b/builtin/mv.c
index 23a297d6b8..2e9d577227 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -178,7 +178,8 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 	const char **source, **destination, **dest_path, **submodule_gitfile;
-	enum update_mode *modes;
+	const char *dst_w_slash;
+	enum update_mode *modes, dst_mode = 0;
 	struct stat st;
 	struct string_list src_for_dst = STRING_LIST_INIT_NODUP;
 	struct lock_file lock_file = LOCK_INIT;
@@ -208,6 +209,7 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 	if (argc == 1 && is_directory(argv[0]) && !is_directory(argv[1]))
 		flags = 0;
 	dest_path = internal_prefix_pathspec(prefix, argv + argc, 1, flags);
+	dst_w_slash = add_slash(dest_path[0]);
 	submodule_gitfile = xcalloc(argc, sizeof(char *));
 
 	if (dest_path[0][0] == '\0')
@@ -215,13 +217,20 @@  int cmd_mv(int argc, const char **argv, const char *prefix)
 		destination = internal_prefix_pathspec(dest_path[0], argv, argc, DUP_BASENAME);
 	else if (!lstat(dest_path[0], &st) &&
 			S_ISDIR(st.st_mode)) {
-		dest_path[0] = add_slash(dest_path[0]);
-		destination = internal_prefix_pathspec(dest_path[0], argv, argc, DUP_BASENAME);
+		destination = internal_prefix_pathspec(dst_w_slash, argv, argc, DUP_BASENAME);
 	} else {
-		if (argc != 1)
+		if (!path_in_sparse_checkout(dst_w_slash, &the_index) &&
+		    !check_dir_in_index(dst_w_slash)) {
+			destination = internal_prefix_pathspec(dst_w_slash, argv, argc, DUP_BASENAME);
+			dst_mode |= SKIP_WORKTREE_DIR;
+		} else if (argc != 1) {
 			die(_("destination '%s' is not a directory"), dest_path[0]);
-		destination = dest_path;
+		} else {
+			destination = dest_path;
+		}
 	}
+	if (dst_w_slash != dest_path[0])
+		free((char *)dst_w_slash);
 
 	/* Checking */
 	for (i = 0; i < argc; i++) {