diff mbox series

mingw: align symlinks-related rmdir() behavior with Linux

Message ID pull.1008.git.1627586493659.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series mingw: align symlinks-related rmdir() behavior with Linux | expand

Commit Message

Johannes Schindelin July 29, 2021, 7:21 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

When performing a rebase, rmdir() is called on the folder .git/logs. On
Unix rmdir() exits without deleting anything in case .git/logs is a
symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
and RemoveDirectoryW) do not behave the same and remove the folder if it
is symlinked even if it is not empty.

This creates issues when folders in .git/ are symlinks which is
especially the case when git-repo[1] is used.

This commit updates mingw_rmdir() so that its behavior is the same as
Linux rmdir() in case of symbolic links.

This fixes https://github.com/git-for-windows/git/issues/2967

[1]: git-repo is a python tool built on top of Git which helps manage
many Git repositories. It stores all the .git/ folders in a central
place by taking advantage of symbolic links.
More information: https://gerrit.googlesource.com/git-repo/

Signed-off-by: Thomas Bétous <tomspycell@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    mingw: support the git-repo tool better
    
    This addresses an issue, originally reported at
    https://github.com/git-for-windows/git/issues/2967, where the git-repo
    tool [https://gerrit.googlesource.com/git-repo/] replaces folders in
    .git/ with symlinks and mingw_rmdir() erroneously removes the symlink
    target directory's contents.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1008%2Fdscho%2Ffix-rmdir-with-symlinks-on-windows-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1008/dscho/fix-rmdir-with-symlinks-on-windows-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1008

 compat/mingw.c    | 15 +++++++++++++++
 t/t3400-rebase.sh | 10 ++++++++++
 t/test-lib.sh     |  6 ++++++
 3 files changed, 31 insertions(+)


base-commit: ebf3c04b262aa27fbb97f8a0156c2347fecafafb

Comments

Johannes Schindelin July 29, 2021, 7:23 p.m. UTC | #1
Hi,

On Thu, 29 Jul 2021, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>

Sorry for missing this, this line should have read

	From: Thomas Bétous <tomspycell@gmail.com>

Ciao,
Dscho

>
> When performing a rebase, rmdir() is called on the folder .git/logs. On
> Unix rmdir() exits without deleting anything in case .git/logs is a
> symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
> and RemoveDirectoryW) do not behave the same and remove the folder if it
> is symlinked even if it is not empty.
>
> This creates issues when folders in .git/ are symlinks which is
> especially the case when git-repo[1] is used.
>
> This commit updates mingw_rmdir() so that its behavior is the same as
> Linux rmdir() in case of symbolic links.
>
> This fixes https://github.com/git-for-windows/git/issues/2967
>
> [1]: git-repo is a python tool built on top of Git which helps manage
> many Git repositories. It stores all the .git/ folders in a central
> place by taking advantage of symbolic links.
> More information: https://gerrit.googlesource.com/git-repo/
>
> Signed-off-by: Thomas Bétous <tomspycell@gmail.com>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>     mingw: support the git-repo tool better
>
>     This addresses an issue, originally reported at
>     https://github.com/git-for-windows/git/issues/2967, where the git-repo
>     tool [https://gerrit.googlesource.com/git-repo/] replaces folders in
>     .git/ with symlinks and mingw_rmdir() erroneously removes the symlink
>     target directory's contents.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1008%2Fdscho%2Ffix-rmdir-with-symlinks-on-windows-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1008/dscho/fix-rmdir-with-symlinks-on-windows-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1008
>
>  compat/mingw.c    | 15 +++++++++++++++
>  t/t3400-rebase.sh | 10 ++++++++++
>  t/test-lib.sh     |  6 ++++++
>  3 files changed, 31 insertions(+)
>
> diff --git a/compat/mingw.c b/compat/mingw.c
> index aa647b367b0..685d3efa3c0 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -341,6 +341,21 @@ int mingw_rmdir(const char *pathname)
>  {
>  	int ret, tries = 0;
>  	wchar_t wpathname[MAX_PATH];
> +	struct stat st;
> +
> +	/*
> +	* Contrary to Linux rmdir(), Windows' _wrmdir() and _rmdir()
> +	* will remove the directory at the path if it is a symbolic link
> +	* which leads to issues when symlinks are used in the .git folder
> +	* (in the context of git-repo for instance). So before calling _wrmdir()
> +	* we first check if the path is a symbolic link. If it is, we exit
> +	* and return the same error as Linux rmdir() in this case (ENOTDIR).
> +	*/
> +	if (!mingw_lstat(pathname, &st) && S_ISLNK(st.st_mode)) {
> +		errno = ENOTDIR;
> +		return -1;
> +	}
> +
>  	if (xutftowcs_path(wpathname, pathname) < 0)
>  		return -1;
>
> diff --git a/t/t3400-rebase.sh b/t/t3400-rebase.sh
> index 0bb88aa982b..23dbd3c82ed 100755
> --- a/t/t3400-rebase.sh
> +++ b/t/t3400-rebase.sh
> @@ -406,4 +406,14 @@ test_expect_success 'refuse to switch to branch checked out elsewhere' '
>  	test_i18ngrep "already checked out" err
>  '
>
> +test_expect_success MINGW,SYMLINKS_WINDOWS 'rebase when .git/logs is a symlink' '
> +	git checkout main &&
> +	mv .git/logs actual_logs &&
> +	cmd //c "mklink /D .git\logs ..\actual_logs" &&
> +	git rebase -f HEAD^ &&
> +	test -L .git/logs &&
> +	rm .git/logs &&
> +	mv actual_logs .git/logs
> +'
> +
>  test_done
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index adaf03543e8..73f6d645b66 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1513,6 +1513,12 @@ test_lazy_prereq SYMLINKS '
>  	ln -s x y && test -h y
>  '
>
> +test_lazy_prereq SYMLINKS_WINDOWS '
> +	# test whether symbolic links are enabled on Windows
> +	test_have_prereq MINGW &&
> +	cmd //c "mklink y x" &> /dev/null && test -h y
> +'
> +
>  test_lazy_prereq FILEMODE '
>  	test "$(git config --bool core.filemode)" = true
>  '
>
> base-commit: ebf3c04b262aa27fbb97f8a0156c2347fecafafb
> --
> gitgitgadget
>
>
Junio C Hamano July 29, 2021, 8:03 p.m. UTC | #2
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> When performing a rebase, rmdir() is called on the folder .git/logs. On
> Unix rmdir() exits without deleting anything in case .git/logs is a
> symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
> and RemoveDirectoryW) do not behave the same and remove the folder if it
> is symlinked even if it is not empty.

The distinction is understandable, and I can see this justifies the
patch really nicely.

It is curious why "rebase" causes rmdir on the reflog hierarchy,
though.  It is also unclear if "rebase" is special in having this
behaviour (and if so why), or just an example the problem was
observed with and other subcommands may benefit from the same fix.

Will queue with authorship corrected (as you reported in the
follow-up).

Thanks.
Eric Sunshine July 29, 2021, 8:12 p.m. UTC | #3
On Thu, Jul 29, 2021 at 3:21 PM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> When performing a rebase, rmdir() is called on the folder .git/logs. On
> Unix rmdir() exits without deleting anything in case .git/logs is a
> symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
> and RemoveDirectoryW) do not behave the same and remove the folder if it
> is symlinked even if it is not empty.
>
> This creates issues when folders in .git/ are symlinks which is
> especially the case when git-repo[1] is used.

"issues" is a rather nebulous word. It doesn't help the reader
understand what actually goes wrong. Presumably some step later in the
rebase fails? Or is the problem that subsequent interaction with
git-repo fails because the link which git-repo (presumably) created
disappears out from underneath it?

> This commit updates mingw_rmdir() so that its behavior is the same as
> Linux rmdir() in case of symbolic links.

Okay, makes sense so far as the above explanation goes. But it also
feels like a band-aid fix to support a use-case which only works by
accident on Unix, if I'm reading between the lines correctly. That is,
presumably rmdir(".git/logs") is an intended cleanup action but the
fact that the cleanup doesn't happen as intended when it is a symlink
is not intentional behavior, thus git-repo's reliance upon that
behavior is questionable.

I guess this would also help tools which replace .git/worktrees with a
symlink since git-worktree -- as a cleanup -- removes its
.git/worktrees directory when the last worktree is removed...

> Signed-off-by: Thomas Bétous <tomspycell@gmail.com>
> ---
>  compat/mingw.c    | 15 +++++++++++++++
> diff --git a/compat/mingw.c b/compat/mingw.c
> @@ -341,6 +341,21 @@ int mingw_rmdir(const char *pathname)
> +       /*
> +       * Contrary to Linux rmdir(), Windows' _wrmdir() and _rmdir()
> +       * will remove the directory at the path if it is a symbolic link
> +       * which leads to issues when symlinks are used in the .git folder
> +       * (in the context of git-repo for instance). So before calling _wrmdir()
> +       * we first check if the path is a symbolic link. If it is, we exit
> +       * and return the same error as Linux rmdir() in this case (ENOTDIR).
> +       */
> +       if (!mingw_lstat(pathname, &st) && S_ISLNK(st.st_mode)) {
> +               errno = ENOTDIR;
> +               return -1;
> +       }

Unfortunately, this code comment doesn't help future readers
understand when/how this case can be triggered, which means that
anyone working on this code in the future will have to consult the
commit message anyhow in order to obtain the necessary understanding.
This suggests that either the comment should be dropped altogether,
thus forcing the person to consult the commit message (probably
undesirable), or improved to give enough information to understand
when/how it can be triggered.
Johannes Schindelin Aug. 2, 2021, 8:17 p.m. UTC | #4
Hi Junio,

On Thu, 29 Jul 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > When performing a rebase, rmdir() is called on the folder .git/logs. On
> > Unix rmdir() exits without deleting anything in case .git/logs is a
> > symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
> > and RemoveDirectoryW) do not behave the same and remove the folder if it
> > is symlinked even if it is not empty.
>
> The distinction is understandable, and I can see this justifies the
> patch really nicely.
>
> It is curious why "rebase" causes rmdir on the reflog hierarchy,
> though.  It is also unclear if "rebase" is special in having this
> behaviour (and if so why), or just an example the problem was
> observed with and other subcommands may benefit from the same fix.

I augmented the commit message in preparation for v2, specifically to
address this concern.

Ciao,
Dscho
Johannes Schindelin Aug. 2, 2021, 8:28 p.m. UTC | #5
Hi Eric,

On Thu, 29 Jul 2021, Eric Sunshine wrote:

> On Thu, Jul 29, 2021 at 3:21 PM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > When performing a rebase, rmdir() is called on the folder .git/logs. On
> > Unix rmdir() exits without deleting anything in case .git/logs is a
> > symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir
> > and RemoveDirectoryW) do not behave the same and remove the folder if it
> > is symlinked even if it is not empty.
> >
> > This creates issues when folders in .git/ are symlinks which is
> > especially the case when git-repo[1] is used.
>
> "issues" is a rather nebulous word. It doesn't help the reader
> understand what actually goes wrong. Presumably some step later in the
> rebase fails? Or is the problem that subsequent interaction with
> git-repo fails because the link which git-repo (presumably) created
> disappears out from underneath it?

I added a few lines describing the problem that was observed, what the
code specifically does to cause the problem, and then documented also that
the regression test specifically verifies that the observed problem won't
reoccur.

> > This commit updates mingw_rmdir() so that its behavior is the same as
> > Linux rmdir() in case of symbolic links.
>
> Okay, makes sense so far as the above explanation goes. But it also
> feels like a band-aid fix to support a use-case which only works by
> accident on Unix, if I'm reading between the lines correctly. That is,
> presumably rmdir(".git/logs") is an intended cleanup action but the
> fact that the cleanup doesn't happen as intended when it is a symlink
> is not intentional behavior, thus git-repo's reliance upon that
> behavior is questionable.

It is the `remove_path()` function that tries to delete parent directories
recursively until things fail.

Whether git-repo's reliance upon that behavior questionable is outside the
purview of this patch. At this point in time, with so many Git versions
accommodating that behavior, I would not want to spend much time pondering
that question, though: we de facto supported that expectation for too long
to change the behavior on Git's side, and therefore adjusting the Windows
side to that expectation feels like the most pragmatic way forward.

> I guess this would also help tools which replace .git/worktrees with a
> symlink since git-worktree -- as a cleanup -- removes its
> .git/worktrees directory when the last worktree is removed...

Yes, indeed.

> > Signed-off-by: Thomas Bétous <tomspycell@gmail.com>
> > ---
> >  compat/mingw.c    | 15 +++++++++++++++
> > diff --git a/compat/mingw.c b/compat/mingw.c
> > @@ -341,6 +341,21 @@ int mingw_rmdir(const char *pathname)
> > +       /*
> > +       * Contrary to Linux rmdir(), Windows' _wrmdir() and _rmdir()
> > +       * will remove the directory at the path if it is a symbolic link
> > +       * which leads to issues when symlinks are used in the .git folder
> > +       * (in the context of git-repo for instance). So before calling _wrmdir()
> > +       * we first check if the path is a symbolic link. If it is, we exit
> > +       * and return the same error as Linux rmdir() in this case (ENOTDIR).
> > +       */
> > +       if (!mingw_lstat(pathname, &st) && S_ISLNK(st.st_mode)) {
> > +               errno = ENOTDIR;
> > +               return -1;
> > +       }
>
> Unfortunately, this code comment doesn't help future readers
> understand when/how this case can be triggered, which means that
> anyone working on this code in the future will have to consult the
> commit message anyhow in order to obtain the necessary understanding.
> This suggests that either the comment should be dropped altogether,
> thus forcing the person to consult the commit message (probably
> undesirable), or improved to give enough information to understand
> when/how it can be triggered.

Okay. I reworded the comment extensively, to address your concern.

Ciao,
Dscho
diff mbox series

Patch

diff --git a/compat/mingw.c b/compat/mingw.c
index aa647b367b0..685d3efa3c0 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -341,6 +341,21 @@  int mingw_rmdir(const char *pathname)
 {
 	int ret, tries = 0;
 	wchar_t wpathname[MAX_PATH];
+	struct stat st;
+
+	/*
+	* Contrary to Linux rmdir(), Windows' _wrmdir() and _rmdir()
+	* will remove the directory at the path if it is a symbolic link
+	* which leads to issues when symlinks are used in the .git folder
+	* (in the context of git-repo for instance). So before calling _wrmdir()
+	* we first check if the path is a symbolic link. If it is, we exit
+	* and return the same error as Linux rmdir() in this case (ENOTDIR).
+	*/
+	if (!mingw_lstat(pathname, &st) && S_ISLNK(st.st_mode)) {
+		errno = ENOTDIR;
+		return -1;
+	}
+
 	if (xutftowcs_path(wpathname, pathname) < 0)
 		return -1;
 
diff --git a/t/t3400-rebase.sh b/t/t3400-rebase.sh
index 0bb88aa982b..23dbd3c82ed 100755
--- a/t/t3400-rebase.sh
+++ b/t/t3400-rebase.sh
@@ -406,4 +406,14 @@  test_expect_success 'refuse to switch to branch checked out elsewhere' '
 	test_i18ngrep "already checked out" err
 '
 
+test_expect_success MINGW,SYMLINKS_WINDOWS 'rebase when .git/logs is a symlink' '
+	git checkout main &&
+	mv .git/logs actual_logs &&
+	cmd //c "mklink /D .git\logs ..\actual_logs" &&
+	git rebase -f HEAD^ &&
+	test -L .git/logs &&
+	rm .git/logs &&
+	mv actual_logs .git/logs
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index adaf03543e8..73f6d645b66 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1513,6 +1513,12 @@  test_lazy_prereq SYMLINKS '
 	ln -s x y && test -h y
 '
 
+test_lazy_prereq SYMLINKS_WINDOWS '
+	# test whether symbolic links are enabled on Windows
+	test_have_prereq MINGW &&
+	cmd //c "mklink y x" &> /dev/null && test -h y
+'
+
 test_lazy_prereq FILEMODE '
 	test "$(git config --bool core.filemode)" = true
 '