diff mbox series

[v1,1/1] git stash needing mkdir deletes untracked file

Message ID 20230808172624.14205-1-tboegi@web.de (mailing list archive)
State New, archived
Headers show
Series [v1,1/1] git stash needing mkdir deletes untracked file | expand

Commit Message

Torsten Bögershausen Aug. 8, 2023, 5:26 p.m. UTC
From: Torsten Bögershausen <tboegi@web.de>

The following sequence leads to loss of work:
 git init
 mkdir README
 touch README/README
 git add .
 git commit -m "Init project"
 echo "Test" > README/README
 mv README/README README2
 rmdir README
 mv README2 README
 git stash
 git stash pop

The problem is, that `git stash` needs to create the directory README/
and to be able to do this, the file README needs to be removed.
And this is, where the work was lost.
There are different possibilities preventing this loss of work:
a)
  `git stash` does refuse the removel of the untracked file,
   when a directory with the same name needs to be created
  There is a small problem here:
  In the ideal world, the stash would do nothing at all,
  and not do anything but complain.
  The current code makes this hard to achieve
  An other solution could be to do as much stash work as possible,
  but stop when the file/directory conflict is detected.
  This would create some inconsistent state.

b) Create the directory as needed, but rename the file before doing that.
  This would let the `git stash` proceed as usual and create a "new" file,
  which may be surprising for some worlflows.

This change goes for b), as it seems the most intuitive solution for
Git users.

Introdue a new function rename_to_untracked_or_warn() and use it
in create_directories() in entry.c

Reported-by: Till Friebe <friebetill@gmail.com>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---
 entry.c          | 25 ++++++++++++++++++++++++-
 t/t3903-stash.sh | 23 +++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletion(-)

--
2.41.0.394.ge43f4fd0bd

Comments

Torsten Bögershausen Aug. 8, 2023, 6:03 p.m. UTC | #1
On Tue, Aug 08, 2023 at 07:26:24PM +0200, tboegi@web.de wrote:
> From: Torsten Bögershausen <tboegi@web.de>
> .

... The following sequence leads to loss of work:

I just realized that this breaks other tests:
t1092-sparse-checkout-compatibility.sh not ok 17 - diff with renames and conflicts
t1092-sparse-checkout-compatibility.sh not ok 18 - diff with directory/file conflicts

However, comments are welcome:
Is it a good idea to create file called "filename.untracked" ?
Eric Sunshine Aug. 8, 2023, 7:28 p.m. UTC | #2
On Tue, Aug 8, 2023 at 3:15 PM <tboegi@web.de> wrote:
> The following sequence leads to loss of work:
>  git init
>  mkdir README
>  touch README/README
>  git add .
>  git commit -m "Init project"
>  echo "Test" > README/README
>  mv README/README README2
>  rmdir README
>  mv README2 README
>  git stash
>  git stash pop
>
> The problem is, that `git stash` needs to create the directory README/
> and to be able to do this, the file README needs to be removed.
> And this is, where the work was lost.
> There are different possibilities preventing this loss of work:
> a)
>   `git stash` does refuse the removel of the untracked file,

s/removel/removal/

>    when a directory with the same name needs to be created

s/$/./

>   There is a small problem here:
>   In the ideal world, the stash would do nothing at all,
>   and not do anything but complain.
>   The current code makes this hard to achieve

s/$/./

>   An other solution could be to do as much stash work as possible,

s/An other/Another/

>   but stop when the file/directory conflict is detected.
>   This would create some inconsistent state.
>
> b) Create the directory as needed, but rename the file before doing that.
>   This would let the `git stash` proceed as usual and create a "new" file,
>   which may be surprising for some worlflows.

s/worlflows/workflows/

> This change goes for b), as it seems the most intuitive solution for
> Git users.
>
> Introdue a new function rename_to_untracked_or_warn() and use it

s/Introdue/Introduce/

> in create_directories() in entry.c
>
> Reported-by: Till Friebe <friebetill@gmail.com>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
> diff --git a/entry.c b/entry.c
> @@ -15,6 +15,28 @@
> +static int rename_to_untracked_or_warn(const char *file)
> +{
> +       const size_t file_name_len = strlen(file);
> +       const static char *dot_untracked = ".untracked";
> +       const size_t dot_un_len = strlen(dot_untracked);
> +       struct strbuf sb;
> +       int ret;
> +
> +       strbuf_init(&sb, file_name_len + dot_un_len);
> +       strbuf_add(&sb, file, file_name_len);
> +       strbuf_add(&sb, dot_untracked, dot_un_len);
> +       ret = rename(file, sb.buf);

This could probably all be simplified to:

    char *to = xstrfmt("%s.untracked", file);
    ret = rename(...);
    ...
    free(to);

If there is already a file named "foo.untracked", then this will
overwrite it, thus potentially losing work, right? I wonder if it
makes sense to be a bit more careful.

> +       if (ret) {
> +               int saved_errno = errno;
> +               warning_errno(_("unable rename '%s' into '%s'"), file, sb.buf);
> +               errno = saved_errno;
> +       }
> +       strbuf_release(&sb);
> +       return ret;
> +}

Do we want to give the user some warning/notification that their file,
as a safety precaution, got renamed to "foo.untracked"?

> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> @@ -1512,4 +1512,27 @@ test_expect_success 'restore untracked files even when we hit conflicts' '
> +test_expect_success 'stash mkdir README needed - README.untracked created' '
> +       git init mkdir_needed_file_untracked &&
> +       (
> +               cd mkdir_needed_file_untracked &&
> +               mkdir README &&
> +               touch README/README &&

s/touch/>/

> +               git add . &&
> +               git commit -m "Add README/README" &&
> +               echo Version2 > README/README &&

s/> R/>R/

> +               mv README/README README2 &&
> +               rmdir README &&
> +               mv README2 README &&
> +               git stash &&
> +               test_path_is_file README.untracked &&
> +               echo Version2 >expect &&
> +               test_cmp expect README.untracked &&
> +               rm expect &&
> +               git stash pop &&
> +               test_path_is_file README.untracked &&
> +               echo Version2 >expect &&
> +               test_cmp expect README.untracked
> +       )
> +'
Phillip Wood Aug. 9, 2023, 1:15 p.m. UTC | #3
Hi Torsten

Thanks for working on this. I've cc'd Junio for his unpack_trees() 
knowledge.

On 08/08/2023 18:26, tboegi@web.de wrote:
> From: Torsten Bögershausen <tboegi@web.de>
> 
> The following sequence leads to loss of work:
>   git init
>   mkdir README
>   touch README/README
>   git add .
>   git commit -m "Init project"
>   echo "Test" > README/README
>   mv README/README README2
>   rmdir README
>   mv README2 README
>   git stash
>   git stash pop
> 
> The problem is, that `git stash` needs to create the directory README/
> and to be able to do this, the file README needs to be removed.
> And this is, where the work was lost.
> There are different possibilities preventing this loss of work:
> a)
>    `git stash` does refuse the removel of the untracked file,
>     when a directory with the same name needs to be created
>    There is a small problem here:
>    In the ideal world, the stash would do nothing at all,
>    and not do anything but complain.
>    The current code makes this hard to achieve
>    An other solution could be to do as much stash work as possible,
>    but stop when the file/directory conflict is detected.
>    This would create some inconsistent state.
> 
> b) Create the directory as needed, but rename the file before doing that.
>    This would let the `git stash` proceed as usual and create a "new" file,
>    which may be surprising for some worlflows.
> 
> This change goes for b), as it seems the most intuitive solution for
> Git users.
> 
> Introdue a new function rename_to_untracked_or_warn() and use it
> in create_directories() in entry.c

Although this change is framed in terms of changes to "git stash push" I 
think the underlying issue and this patch actually affects all users of 
unpack_trees(). For example if "README" is untracked then

	git checkout <rev> README

will currently fail if <rev>:README is a blob but will succeed and 
remove the untracked file if <rev>:README is a tree.

I'm far from an expert in this area but I think we might want to 
understand why unpack_trees() sets state->force when it calls 
checkout_entry() before making any changes.

Best Wishes

Phillip

> Reported-by: Till Friebe <friebetill@gmail.com>
> Signed-off-by: Torsten Bögershausen <tboegi@web.de>
> ---
>   entry.c          | 25 ++++++++++++++++++++++++-
>   t/t3903-stash.sh | 23 +++++++++++++++++++++++
>   2 files changed, 47 insertions(+), 1 deletion(-)
> 
> diff --git a/entry.c b/entry.c
> index 43767f9043..76d8a0762d 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -15,6 +15,28 @@
>   #include "entry.h"
>   #include "parallel-checkout.h"
> 
> +static int rename_to_untracked_or_warn(const char *file)
> +{
> +	const size_t file_name_len = strlen(file);
> +	const static char *dot_untracked = ".untracked";
> +	const size_t dot_un_len = strlen(dot_untracked);
> +	struct strbuf sb;
> +	int ret;
> +
> +	strbuf_init(&sb, file_name_len + dot_un_len);
> +	strbuf_add(&sb, file, file_name_len);
> +	strbuf_add(&sb, dot_untracked, dot_un_len);
> +	ret = rename(file, sb.buf);
> +
> +	if (ret) {
> +		int saved_errno = errno;
> +		warning_errno(_("unable rename '%s' into '%s'"), file, sb.buf);
> +		errno = saved_errno;
> +	}
> +	strbuf_release(&sb);
> +	return ret;
> +}
> +
>   static void create_directories(const char *path, int path_len,
>   			       const struct checkout *state)
>   {
> @@ -48,7 +70,8 @@ static void create_directories(const char *path, int path_len,
>   		 */
>   		if (mkdir(buf, 0777)) {
>   			if (errno == EEXIST && state->force &&
> -			    !unlink_or_warn(buf) && !mkdir(buf, 0777))
> +			    !rename_to_untracked_or_warn(buf) &&
> +			    !mkdir(buf, 0777))
>   				continue;
>   			die_errno("cannot create directory at '%s'", buf);
>   		}
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index 0b3dfeaea2..1a210f8a5a 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -1512,4 +1512,27 @@ test_expect_success 'restore untracked files even when we hit conflicts' '
>   	)
>   '
> 
> +test_expect_success 'stash mkdir README needed - README.untracked created' '
> +	git init mkdir_needed_file_untracked &&
> +	(
> +		cd mkdir_needed_file_untracked &&
> +		mkdir README &&
> +		touch README/README &&
> +		git add . &&
> +		git commit -m "Add README/README" &&
> +		echo Version2 > README/README &&
> +		mv README/README README2 &&
> +		rmdir README &&
> +		mv README2 README &&
> +		git stash &&
> +		test_path_is_file README.untracked &&
> +		echo Version2 >expect &&
> +		test_cmp expect README.untracked &&
> +		rm expect &&
> +		git stash pop &&
> +		test_path_is_file README.untracked &&
> +		echo Version2 >expect &&
> +		test_cmp expect README.untracked
> +	)
> +'
>   test_done
> --
> 2.41.0.394.ge43f4fd0bd
>
Torsten Bögershausen Aug. 9, 2023, 6:47 p.m. UTC | #4
On Wed, Aug 09, 2023 at 02:15:28PM +0100, Phillip Wood wrote:
> Hi Torsten
>
> Thanks for working on this. I've cc'd Junio for his unpack_trees()
> knowledge.

Thanks Eric for the review.

Hej Phillip,
I have been playing around with the whole thing some time.
At the end I had a version, which did fiddle the information
that we are doing a `git stash` (and not any other operation)
into entry.c, and all test cases passed.
So in principle I can dig out all changes, polish them
and send them out, after doing cleanups of course.

(And that could take a couple of days, or weeks ;-)

My main question is still open:
Is it a good idea, to create a "helper file" ?
The naming can be discussed, we may stick the date/time
into the filename to make it really unique, or so.

Reading the different reports and including own experience,
I still think that a directory called ".deleted-by-user"
or ".wastebin" or something in that style is a good idea.

What do others think ?
Junio C Hamano Aug. 9, 2023, 8:57 p.m. UTC | #5
Phillip Wood <phillip.wood123@gmail.com> writes:

> Although this change is framed in terms of changes to "git stash push"
> I think the underlying issue and this patch actually affects all users
> of unpack_trees(). For example if "README" is untracked then
>
> 	git checkout <rev> README
>
> will currently fail if <rev>:README is a blob but will succeed and
> remove the untracked file if <rev>:README is a tree.

Very true, and with an .untracked file nobody asked Git to create,
presumably?  I am not sure if the updated behaviour is better than
the current behaviour.  

If "silent and unconditional removal" bothers us, I wonder if it is
a lot better approach to error out and have the user sort out the
mess, which is what we usually do when it gets tempting to "move it
away with an arbitrary rename" like this patch tries to do.  I
dunno.

Thanks.
Phillip Wood Aug. 15, 2023, 9:15 a.m. UTC | #6
Hi Torsten

Sorry for the slow reply

On 09/08/2023 19:47, Torsten Bögershausen wrote:
> On Wed, Aug 09, 2023 at 02:15:28PM +0100, Phillip Wood wrote:
>> Hi Torsten
>>
>> Thanks for working on this. I've cc'd Junio for his unpack_trees()
>> knowledge.
> 
> Thanks Eric for the review.
> 
> Hej Phillip,
> I have been playing around with the whole thing some time.
> At the end I had a version, which did fiddle the information
> that we are doing a `git stash` (and not any other operation)
> into entry.c, and all test cases passed.
> So in principle I can dig out all changes, polish them
> and send them out, after doing cleanups of course.

I don't think we should be treating "git stash" as a special case here - 
commands like "git checkout" should not be removing untracked files 
unprompted either.

> (And that could take a couple of days, or weeks ;-)
> 
> My main question is still open:
> Is it a good idea, to create a "helper file" ?
> The naming can be discussed, we may stick the date/time
> into the filename to make it really unique, or so.

I think stopping and telling the user that the file would be overwritten 
as we do in other cases would be better.

> Reading the different reports and including own experience,
> I still think that a directory called ".deleted-by-user"
> or ".wastebin" or something in that style is a good idea.

I can see an argument for being able to opt-in to that for "git restore" 
and "git reset --hard" but that is a different problem to the one here.

Best Wishes

Phillip
Phillip Wood Aug. 15, 2023, 9:16 a.m. UTC | #7
On 09/08/2023 21:57, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
> If "silent and unconditional removal" bothers us, I wonder if it is
> a lot better approach to error out and have the user sort out the
> mess, which is what we usually 

Yes I think that would be a better approach.

Best Wishes

Phillip

> do when it gets tempting to "move it
> away with an arbitrary rename" like this patch tries to do.  I
> dunno.
> 
> Thanks.
> 
> 
>
Torsten Bögershausen Aug. 15, 2023, 3:25 p.m. UTC | #8
On Tue, Aug 15, 2023 at 10:15:37AM +0100, Phillip Wood wrote:
> Hi Torsten
>
> Sorry for the slow reply

No problem.
Thanks for the response, I think that we have an
agreement not to overwrite an untracked file, when a directory
with the same name needs to be created.

I try to come up with a patch series -
starting with the stash operation.

>
> On 09/08/2023 19:47, Torsten Bögershausen wrote:
> > On Wed, Aug 09, 2023 at 02:15:28PM +0100, Phillip Wood wrote:
> > > Hi Torsten
> > >
> > > Thanks for working on this. I've cc'd Junio for his unpack_trees()
> > > knowledge.
> >
> > Thanks Eric for the review.
> >
> > Hej Phillip,
> > I have been playing around with the whole thing some time.
> > At the end I had a version, which did fiddle the information
> > that we are doing a `git stash` (and not any other operation)
> > into entry.c, and all test cases passed.
> > So in principle I can dig out all changes, polish them
> > and send them out, after doing cleanups of course.
>
> I don't think we should be treating "git stash" as a special case here -
> commands like "git checkout" should not be removing untracked files
> unprompted either.
>
> > (And that could take a couple of days, or weeks ;-)
> >
> > My main question is still open:
> > Is it a good idea, to create a "helper file" ?
> > The naming can be discussed, we may stick the date/time
> > into the filename to make it really unique, or so.
>
> I think stopping and telling the user that the file would be overwritten as
> we do in other cases would be better.
>
> > Reading the different reports and including own experience,
> > I still think that a directory called ".deleted-by-user"
> > or ".wastebin" or something in that style is a good idea.
>
> I can see an argument for being able to opt-in to that for "git restore" and
> "git reset --hard" but that is a different problem to the one here.
>
> Best Wishes
>
> Phillip
>
Junio C Hamano Aug. 15, 2023, 6:03 p.m. UTC | #9
Phillip Wood <phillip.wood123@gmail.com> writes:

> I don't think we should be treating "git stash" as a special case here
> - commands like "git checkout" should not be removing untracked files
> unprompted either.

Yeah, I tend to agree.  "git checkout branch path" should overwrite
a leftover "path" in the working tree in response to such an
explicit request, and that should equally apply for a request with
pathspec e.g. "git checkout branch .", as the latter is also an
explicit "please check out all paths out of the tree-ish of the
branch".

But "git checkout branch" in a working tree with untracked "path"
should not lose it if "branch" has it as a tracked file.

> I think stopping and telling the user that the file would be
> overwritten as we do in other cases would be better.

Yup, that is what we have done and probably one of the design
choices that made us successful.

>> Reading the different reports and including own experience,
>> I still think that a directory called ".deleted-by-user"
>> or ".wastebin" or something in that style is a good idea.
>
> I can see an argument for being able to opt-in to that for "git
> restore" and "git reset --hard" but that is a different problem to the
> one here.

Yeah, I tend to agree.  If anything, such a trash directory should
be kept out-of-line, not inside the working tree.  Perhaps in $HOME
or somewhere, and not necessarily tied to the use of Git, as the way
a file gets "deleted by user" is not necessarily limited to the use
of Git.
diff mbox series

Patch

diff --git a/entry.c b/entry.c
index 43767f9043..76d8a0762d 100644
--- a/entry.c
+++ b/entry.c
@@ -15,6 +15,28 @@ 
 #include "entry.h"
 #include "parallel-checkout.h"

+static int rename_to_untracked_or_warn(const char *file)
+{
+	const size_t file_name_len = strlen(file);
+	const static char *dot_untracked = ".untracked";
+	const size_t dot_un_len = strlen(dot_untracked);
+	struct strbuf sb;
+	int ret;
+
+	strbuf_init(&sb, file_name_len + dot_un_len);
+	strbuf_add(&sb, file, file_name_len);
+	strbuf_add(&sb, dot_untracked, dot_un_len);
+	ret = rename(file, sb.buf);
+
+	if (ret) {
+		int saved_errno = errno;
+		warning_errno(_("unable rename '%s' into '%s'"), file, sb.buf);
+		errno = saved_errno;
+	}
+	strbuf_release(&sb);
+	return ret;
+}
+
 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
 {
@@ -48,7 +70,8 @@  static void create_directories(const char *path, int path_len,
 		 */
 		if (mkdir(buf, 0777)) {
 			if (errno == EEXIST && state->force &&
-			    !unlink_or_warn(buf) && !mkdir(buf, 0777))
+			    !rename_to_untracked_or_warn(buf) &&
+			    !mkdir(buf, 0777))
 				continue;
 			die_errno("cannot create directory at '%s'", buf);
 		}
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 0b3dfeaea2..1a210f8a5a 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1512,4 +1512,27 @@  test_expect_success 'restore untracked files even when we hit conflicts' '
 	)
 '

+test_expect_success 'stash mkdir README needed - README.untracked created' '
+	git init mkdir_needed_file_untracked &&
+	(
+		cd mkdir_needed_file_untracked &&
+		mkdir README &&
+		touch README/README &&
+		git add . &&
+		git commit -m "Add README/README" &&
+		echo Version2 > README/README &&
+		mv README/README README2 &&
+		rmdir README &&
+		mv README2 README &&
+		git stash &&
+		test_path_is_file README.untracked &&
+		echo Version2 >expect &&
+		test_cmp expect README.untracked &&
+		rm expect &&
+		git stash pop &&
+		test_path_is_file README.untracked &&
+		echo Version2 >expect &&
+		test_cmp expect README.untracked
+	)
+'
 test_done