diff mbox series

reftable: ignore file-in-use errors when unlink(3p) fails on Windows

Message ID 20250125-b4-pks-reftable-win32-in-use-errors-v1-1-356dbc783b4f@pks.im (mailing list archive)
State New
Headers show
Series reftable: ignore file-in-use errors when unlink(3p) fails on Windows | expand

Commit Message

Patrick Steinhardt Jan. 25, 2025, 5:41 a.m. UTC
Unlinking a file may fail on Windows systems when the file is still held
open by another process. This is incompatible with POSIX semantics and
by extension with Git's assumed semantics when unlinking files, which
is that files can be unlinked regardless of whether they are still open
or not. To counteract this incompatibility, we have some custom error
handling in the `mingw_unlink()` wrapper that first retries the deletion
with some delay, and then asks the user whether we should continue to
retry.

While this logic might be sensible in many callsites throughout Git, it
is less when used in the reftable library. We only use unlink(3) there
to delete tables which aren't referenced anymore, and the code is very
aware of the limitations on Windows. As such, all calls to unlink(3p)
don't perform any error checking at all and are fine with the call
failing.

Instead, the library provides the `reftable_stack_clean()` function,
which Git knows to execute in git-pack-refs(1) after compacting a stack.
The effect of this function is that all stale tables will eventually get
deleted once they aren't kept open anymore.

So while we're fine with unlink(3p) failing, the Windows-emulation of
that function will still perform several sleeps and ultimately end up
asking the user:

    $ git pack-refs
    Unlink of file 'C:/temp/jgittest/jgit/.git/reftable/0x000000000002-0x000000000004-50486d0e.ref' failed. Should I try again? (y/n) n
    Unlink of file 'C:/temp/jgittest/jgit/.git/reftable/0x000000000002-0x000000000004-50486d0e.ref' failed. Should I try again? (y/n) n
    Unlink of file 'C:/temp/jgittest/jgit/.git/reftable/0x000000000002-0x000000000004-50486d0e.ref' failed. Should I try again? (y/n) n

It even asks multiple times, which is doubly annoying and puzzling to
the user:

  1. It asks when trying to delete the old file after having written the
     compacted stack.

  2. It asks when reloading the stack, where it will try to unlink
     now-unreferenced tables.

  3. It asks when calling `reftable_stack_clean()`, where it will try to
     unlink now-stale tables.

Fix the issue by making it possible to disable this behaviour with a
preprocessor define. As "git-compat-util.h" is only included from
"system.h", and given that "system.h" is only ever included by headers
and code that are internal to the reftable library, we can set that
macro in this header without impacting anything else but the reftable
library.

Reported-by: Christian Reich <Zottelbart@t-online.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
Hi,

This patch fixes the issue reported in [1].

Thanks!

Patrick

[1]: <d7fd0b1c-98fe-4cc3-b657-c2c3d0bc5c47@t-online.de>
---
 compat/mingw.c    | 5 ++++-
 compat/mingw.h    | 8 ++++++--
 reftable/system.h | 1 +
 3 files changed, 11 insertions(+), 3 deletions(-)


---
base-commit: 4e746b1a31f9f0036032b6f94279cf16fb363203
change-id: 20250124-b4-pks-reftable-win32-in-use-errors-969494f2fdf7

Comments

Johannes Sixt Jan. 25, 2025, 8:19 a.m. UTC | #1
Am 25.01.25 um 06:41 schrieb Patrick Steinhardt:
> While this logic might be sensible in many callsites throughout Git, it
> is less when used in the reftable library. We only use unlink(3) there
> to delete tables which aren't referenced anymore, and the code is very
> aware of the limitations on Windows. As such, all calls to unlink(3p)
> don't perform any error checking at all and are fine with the call
> failing.
> 
> Instead, the library provides the `reftable_stack_clean()` function,
> which Git knows to execute in git-pack-refs(1) after compacting a stack.
> The effect of this function is that all stale tables will eventually get
> deleted once they aren't kept open anymore.
> 
> So while we're fine with unlink(3p) failing, the Windows-emulation of
> that function will still perform several sleeps and ultimately end up
> asking the user:

Why don't the changes that your commits ending at 391bceae4350
("compat/mingw: support POSIX semantics for atomic renames", 2024-10-27)
help in this case, too?

Since the reftable layer is aware of the problem, why don't we just fix
it there and instead sweep it under the rug in the compat layer?

-- Hannes
Patrick Steinhardt Jan. 25, 2025, 8:32 a.m. UTC | #2
On Sat, Jan 25, 2025 at 09:19:57AM +0100, Johannes Sixt wrote:
> Am 25.01.25 um 06:41 schrieb Patrick Steinhardt:
> > While this logic might be sensible in many callsites throughout Git, it
> > is less when used in the reftable library. We only use unlink(3) there
> > to delete tables which aren't referenced anymore, and the code is very
> > aware of the limitations on Windows. As such, all calls to unlink(3p)
> > don't perform any error checking at all and are fine with the call
> > failing.
> > 
> > Instead, the library provides the `reftable_stack_clean()` function,
> > which Git knows to execute in git-pack-refs(1) after compacting a stack.
> > The effect of this function is that all stale tables will eventually get
> > deleted once they aren't kept open anymore.
> > 
> > So while we're fine with unlink(3p) failing, the Windows-emulation of
> > that function will still perform several sleeps and ultimately end up
> > asking the user:
> 
> Why don't the changes that your commits ending at 391bceae4350
> ("compat/mingw: support POSIX semantics for atomic renames", 2024-10-27)
> help in this case, too?

The user report was explicitly about compatibility with JGit, which
still had these files open. We don't have control over third-party
clients and how exactly they open files, so it is expected that we may
still see failures with the deletion of in-use files.

> Since the reftable layer is aware of the problem, why don't we just fix
> it there and instead sweep it under the rug in the compat layer?

I didn't really have a good idea for _how_ to do that. We automatically
pull in the compat version of `unlink()` via "git-compat-util.h", and we
cannot easily change that. So the reftable library is basically unable
to handle it there, because the assumed POSIX-behavior of `unlink()` is
different. I also don't want to reimplement the "compat/" layer for
Windows, because it brings us a couple of features for free, like
wide-character handling and handling the deletion of read-only files.

Another alternative would be to provide `reftable_unlink()` in the
reftable system layer, implement it via `mingw_unlink()` with that
second argument I'm introducing and then convert all callsites of unlink
to use that function instead. But that felt unnecessarily complex to me.

I'd be happy to hear about alternative ideas that didn't came to my
mind.

Patrick
Christian Reich Jan. 25, 2025, 8:45 a.m. UTC | #3
Greetings Patrick,

thx you for the patch. I build an own version of git for windows with 
this patch and it works as expected! *thumbsup*

But I see merge conflicts in git for windows main-branch:

1. mingw_unlink looks different in windows-master in compare to the diff:

https://github.com/git-for-windows/git/blob/main/compat/mingw.c#L551

2. in mingw_rename is another call of mingw_unlink:

https://github.com/git-for-windows/git/blob/main/compat/mingw.c#L2974

So I don't know how the patch would find it way to the windows version.

Thank you for your effort.

Christian

Am 25.01.2025 um 06:41 schrieb Patrick Steinhardt:
> Unlinking a file may fail on Windows systems when the file is still held
> open by another process. This is incompatible with POSIX semantics and
> by extension with Git's assumed semantics when unlinking files, which
> is that files can be unlinked regardless of whether they are still open
> or not. To counteract this incompatibility, we have some custom error
> handling in the `mingw_unlink()` wrapper that first retries the deletion
> with some delay, and then asks the user whether we should continue to
> retry.
>
> While this logic might be sensible in many callsites throughout Git, it
> is less when used in the reftable library. We only use unlink(3) there
> to delete tables which aren't referenced anymore, and the code is very
> aware of the limitations on Windows. As such, all calls to unlink(3p)
> don't perform any error checking at all and are fine with the call
> failing.
>
> Instead, the library provides the `reftable_stack_clean()` function,
> which Git knows to execute in git-pack-refs(1) after compacting a stack.
> The effect of this function is that all stale tables will eventually get
> deleted once they aren't kept open anymore.
>
> So while we're fine with unlink(3p) failing, the Windows-emulation of
> that function will still perform several sleeps and ultimately end up
> asking the user:
>
>      $ git pack-refs
>      Unlink of file 'C:/temp/jgittest/jgit/.git/reftable/0x000000000002-0x000000000004-50486d0e.ref' failed. Should I try again? (y/n) n
>      Unlink of file 'C:/temp/jgittest/jgit/.git/reftable/0x000000000002-0x000000000004-50486d0e.ref' failed. Should I try again? (y/n) n
>      Unlink of file 'C:/temp/jgittest/jgit/.git/reftable/0x000000000002-0x000000000004-50486d0e.ref' failed. Should I try again? (y/n) n
>
> It even asks multiple times, which is doubly annoying and puzzling to
> the user:
>
>    1. It asks when trying to delete the old file after having written the
>       compacted stack.
>
>    2. It asks when reloading the stack, where it will try to unlink
>       now-unreferenced tables.
>
>    3. It asks when calling `reftable_stack_clean()`, where it will try to
>       unlink now-stale tables.
>
> Fix the issue by making it possible to disable this behaviour with a
> preprocessor define. As "git-compat-util.h" is only included from
> "system.h", and given that "system.h" is only ever included by headers
> and code that are internal to the reftable library, we can set that
> macro in this header without impacting anything else but the reftable
> library.
>
> Reported-by: Christian Reich <Zottelbart@t-online.de>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> Hi,
>
> This patch fixes the issue reported in [1].
>
> Thanks!
>
> Patrick
>
> [1]: <d7fd0b1c-98fe-4cc3-b657-c2c3d0bc5c47@t-online.de>
> ---
>   compat/mingw.c    | 5 ++++-
>   compat/mingw.h    | 8 ++++++--
>   reftable/system.h | 1 +
>   3 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/compat/mingw.c b/compat/mingw.c
> index 1d5b211b54..0e4b6a70a4 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -302,7 +302,7 @@ static wchar_t *normalize_ntpath(wchar_t *wbuf)
>   	return wbuf;
>   }
>   
> -int mingw_unlink(const char *pathname)
> +int mingw_unlink(const char *pathname, int handle_in_use_error)
>   {
>   	int ret, tries = 0;
>   	wchar_t wpathname[MAX_PATH];
> @@ -317,6 +317,9 @@ int mingw_unlink(const char *pathname)
>   	while ((ret = _wunlink(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) {
>   		if (!is_file_in_use_error(GetLastError()))
>   			break;
> +		if (!handle_in_use_error)
> +			return ret;
> +
>   		/*
>   		 * We assume that some other process had the source or
>   		 * destination file open at the wrong moment and retry.
> diff --git a/compat/mingw.h b/compat/mingw.h
> index ebfb8ba423..a555af8d54 100644
> --- a/compat/mingw.h
> +++ b/compat/mingw.h
> @@ -224,8 +224,12 @@ int uname(struct utsname *buf);
>    * replacements of existing functions
>    */
>   
> -int mingw_unlink(const char *pathname);
> -#define unlink mingw_unlink
> +int mingw_unlink(const char *pathname, int handle_in_use_error);
> +#ifdef MINGW_DONT_HANDLE_IN_USE_ERROR
> +# define unlink(path) mingw_unlink(path, 0)
> +#else
> +# define unlink(path) mingw_unlink(path, 1)
> +#endif
>   
>   int mingw_rmdir(const char *path);
>   #define rmdir mingw_rmdir
> diff --git a/reftable/system.h b/reftable/system.h
> index 5274eca1d0..fe94bf205b 100644
> --- a/reftable/system.h
> +++ b/reftable/system.h
> @@ -13,6 +13,7 @@ license that can be found in the LICENSE file or at
>   
>   #define DISABLE_SIGN_COMPARE_WARNINGS
>   
> +#define MINGW_DONT_HANDLE_IN_USE_ERROR
>   #include "git-compat-util.h"
>   
>   /*
>
> ---
> base-commit: 4e746b1a31f9f0036032b6f94279cf16fb363203
> change-id: 20250124-b4-pks-reftable-win32-in-use-errors-969494f2fdf7
>
Johannes Sixt Jan. 25, 2025, 2:28 p.m. UTC | #4
Am 25.01.25 um 09:32 schrieb Patrick Steinhardt:
> The user report was explicitly about compatibility with JGit, which
> still had these files open. We don't have control over third-party
> clients and how exactly they open files, so it is expected that we may
> still see failures with the deletion of in-use files.

Fair enough.

> I'd be happy to hear about alternative ideas that didn't came to my
> mind.

Instead of calling _wunlink() in mingw_unlink, we could CreateFileW()
with access mode DELETE and flag FILE_FLAG_DELETE_ON_CLOSE, then close
the file right away. That would apply semantics that is similar, but not
quite, POSIX at least among the files that we open ourselves.

It would be even better that we do not depend on the POSIX behavior in
the first place. As you said, the reftable library can live with failed
deletes. And I don't think we depend on the POSIX behavior anywhere else
because we would see the "try again?" question much more frequently than
we do right now.

-- Hannes
Junio C Hamano Jan. 26, 2025, 1:41 a.m. UTC | #5
Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/compat/mingw.c b/compat/mingw.c
> index 1d5b211b54..0e4b6a70a4 100644
> --- a/compat/mingw.c
> +++ b/compat/mingw.c
> @@ -302,7 +302,7 @@ static wchar_t *normalize_ntpath(wchar_t *wbuf)
>  	return wbuf;
>  }
>  
> -int mingw_unlink(const char *pathname)
> +int mingw_unlink(const char *pathname, int handle_in_use_error)
>  {
>  	int ret, tries = 0;
>  	wchar_t wpathname[MAX_PATH];
> @@ -317,6 +317,9 @@ int mingw_unlink(const char *pathname)
>  	while ((ret = _wunlink(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) {
>  		if (!is_file_in_use_error(GetLastError()))
>  			break;
> +		if (!handle_in_use_error)
> +			return ret;
> +
>  		/*
>  		 * We assume that some other process had the source or
>  		 * destination file open at the wrong moment and retry.

So this is how we can avoid falling into the retry plus interaction.
This underlying function is prepared to offer both choices at
runtime.

> diff --git a/compat/mingw.h b/compat/mingw.h
> index ebfb8ba423..a555af8d54 100644
> --- a/compat/mingw.h
> +++ b/compat/mingw.h
> @@ -224,8 +224,12 @@ int uname(struct utsname *buf);
>   * replacements of existing functions
>   */
>  
> -int mingw_unlink(const char *pathname);
> -#define unlink mingw_unlink
> +int mingw_unlink(const char *pathname, int handle_in_use_error);
> +#ifdef MINGW_DONT_HANDLE_IN_USE_ERROR
> +# define unlink(path) mingw_unlink(path, 0)
> +#else
> +# define unlink(path) mingw_unlink(path, 1)
> +#endif

This one is yucky.  All calls to unlink() used in compilation units
with the CPP macro defined are going to fail on a path that is in
use, but in other code paths, there will be the retry loop.

Regardless of the platform, the code must be prepared to see its
unlink() fail and deal with the failure, but I wonder how much the
initial "if file-in-use caused problem, retry with delay without
bugging the end user" loop is helping.  

After that retry loop expires, we go interactive, and I can
understand why end-users may be annoyed by the code going
interactive at that point.

But wouldn't it be too drastic a change to break out of the retry
loop immediately after the initial failure?

Unless the case found in reftable is that the process that has the
file in use is ourselves but somebody else that is not under our
control, it could be that the current users are being helped by the
retry loop because these other users would quickly close and exit
while we are retrying before going interactive.  What I am getting
at is that it might be a less drastic move that helps users better
if we moved the "let's just accept the failure and return to the
caller" after that non-interactive retry loop, instead of "return
after even the first failure."  That way, we'll still keep the
automatic and non-interactive recovery, and punt a bit earlier than
before before we go into the other interactive retry loop.

Of course, if we are depending on the ability to unlink what _we_
ourselves are using, we should stop doing that by reorganizing the
code.  I recall we have done such a code shuffling to avoid removing
open files by flipping the order between unlink and close before
only to mollify Windows already in other code paths.  But if we are
failing due to random other users having the file open at the same
time, at least the earlier non-interactive retry loop sounds like a
reasonable workaround for quirks in the underlying filesystem to me.

Thanks.
Patrick Steinhardt Jan. 27, 2025, 7:48 a.m. UTC | #6
On Sat, Jan 25, 2025 at 03:28:28PM +0100, Johannes Sixt wrote:
> Am 25.01.25 um 09:32 schrieb Patrick Steinhardt:
> > The user report was explicitly about compatibility with JGit, which
> > still had these files open. We don't have control over third-party
> > clients and how exactly they open files, so it is expected that we may
> > still see failures with the deletion of in-use files.
> 
> Fair enough.
> 
> > I'd be happy to hear about alternative ideas that didn't came to my
> > mind.
> 
> Instead of calling _wunlink() in mingw_unlink, we could CreateFileW()
> with access mode DELETE and flag FILE_FLAG_DELETE_ON_CLOSE, then close
> the file right away. That would apply semantics that is similar, but not
> quite, POSIX at least among the files that we open ourselves.

Huh. And that works even when the file is still being held open by other
processes? I don't know enough about Windows to be sure there and
wouldn't quite feel comfortable with pushing a change like this into
`unlink()` given that it is used in so many places by Git.

> It would be even better that we do not depend on the POSIX behavior in
> the first place. As you said, the reftable library can live with failed
> deletes. And I don't think we depend on the POSIX behavior anywhere else
> because we would see the "try again?" question much more frequently than
> we do right now.

I have a feeling that there's a misunderstanding here, either on my side
or on yours. It's the rest of Git that wants to have POSIX behaviour for
`unlink()`, not the reftable library. In the reftable library we don't
care at all whether or not the file got deleted -- we can live with it
and know to retry at a later point. But because we use Git's `unlink()`
implementation we get the "try again?" questions for free, even though
we don't want to have it in the first place.

So the proposed fix is to _disable_ the POSIX emulation provided by Git
in the reftable library. Which seems to be what you're proposing?

Let me know in case I misunderstood, I'm a bit confused right now :)

Patrick
Patrick Steinhardt Jan. 27, 2025, 7:57 a.m. UTC | #7
On Sat, Jan 25, 2025 at 05:41:42PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> > diff --git a/compat/mingw.h b/compat/mingw.h
> > index ebfb8ba423..a555af8d54 100644
> > --- a/compat/mingw.h
> > +++ b/compat/mingw.h
> > @@ -224,8 +224,12 @@ int uname(struct utsname *buf);
> >   * replacements of existing functions
> >   */
> >  
> > -int mingw_unlink(const char *pathname);
> > -#define unlink mingw_unlink
> > +int mingw_unlink(const char *pathname, int handle_in_use_error);
> > +#ifdef MINGW_DONT_HANDLE_IN_USE_ERROR
> > +# define unlink(path) mingw_unlink(path, 0)
> > +#else
> > +# define unlink(path) mingw_unlink(path, 1)
> > +#endif
> 
> This one is yucky.  All calls to unlink() used in compilation units
> with the CPP macro defined are going to fail on a path that is in
> use, but in other code paths, there will be the retry loop.

Yeah, I don't like it either, but couldn't think of a better way to do
this.

> Regardless of the platform, the code must be prepared to see its
> unlink() fail and deal with the failure, but I wonder how much the
> initial "if file-in-use caused problem, retry with delay without
> bugging the end user" loop is helping.  
> 
> After that retry loop expires, we go interactive, and I can
> understand why end-users may be annoyed by the code going
> interactive at that point.
> 
> But wouldn't it be too drastic a change to break out of the retry
> loop immediately after the initial failure?

In general: yes. For the reftable library: no. The reftable backend
already knows to handle this situation just fine by simply retrying the
deletion on the next call to git-pack-refs(1). So the file will
eventually be removed, and it doesn't hurt to have it sitting around as
a stale file meanwhile.

> Unless the case found in reftable is that the process that has the
> file in use is ourselves but somebody else that is not under our
> control, it could be that the current users are being helped by the
> retry loop because these other users would quickly close and exit
> while we are retrying before going interactive.  What I am getting
> at is that it might be a less drastic move that helps users better
> if we moved the "let's just accept the failure and return to the
> caller" after that non-interactive retry loop, instead of "return
> after even the first failure."  That way, we'll still keep the
> automatic and non-interactive recovery, and punt a bit earlier than
> before before we go into the other interactive retry loop.

It would incur a delay though that is basically unnecessary. The file is
not being held open by the same process, but by a different one that is
not ourselves. And because Git opens with `FILE_SHARE_DELETE` on Windows
we even know that it wouldn't be a Git process that has the file open,
but something else (e.g. JGit, as reported by the user).

Chances are slim that such a third-party client would close the file in
the exact 71 milliseconds that we'd be sleeping for. And combined with
the fact that we know to clean up the file at a later point anyway I
think I'd rather not incur a user-facing delay only for a slight chance
that we might succeed unlinking the file.

> Of course, if we are depending on the ability to unlink what _we_
> ourselves are using, we should stop doing that by reorganizing the
> code.  I recall we have done such a code shuffling to avoid removing
> open files by flipping the order between unlink and close before
> only to mollify Windows already in other code paths.  But if we are
> failing due to random other users having the file open at the same
> time, at least the earlier non-interactive retry loop sounds like a
> reasonable workaround for quirks in the underlying filesystem to me.

As far as I'm aware this isn't an issue in the reftable library, we're
being quite careful there. I'm quite sure that we'd otherwise have
gotten reports about that before seeing reports about JGit keeping the
files open.

Patrick
Patrick Steinhardt Jan. 27, 2025, 7:58 a.m. UTC | #8
On Sat, Jan 25, 2025 at 09:45:21AM +0100, Christian Reich wrote:
> Greetings Patrick,
> 
> thx you for the patch. I build an own version of git for windows with this
> patch and it works as expected! *thumbsup*

Thanks for checking!

> But I see merge conflicts in git for windows main-branch:

This will be dealt with by the respective maintainers eventually. I've
already Cc'd Johannes for input, who is the Git for Windows maintainer.

Patrick
Johannes Sixt Jan. 28, 2025, 6:52 a.m. UTC | #9
Am 27.01.25 um 08:48 schrieb Patrick Steinhardt:
> On Sat, Jan 25, 2025 at 03:28:28PM +0100, Johannes Sixt wrote:
>> Am 25.01.25 um 09:32 schrieb Patrick Steinhardt:
>> Instead of calling _wunlink() in mingw_unlink, we could CreateFileW()
>> with access mode DELETE and flag FILE_FLAG_DELETE_ON_CLOSE, then close
>> the file right away. That would apply semantics that is similar, but not
>> quite, POSIX at least among the files that we open ourselves.
> 
> Huh. And that works even when the file is still being held open by other
> processes?

Only if the process cooperates and has opened the file with
FILE_SHARE_DELETE. So, in general, no.

I think, Cygwin implements unlink() in this manner.

> I have a feeling that there's a misunderstanding here, either on my side
> or on yours. It's the rest of Git that wants to have POSIX behaviour for
> `unlink()`, not the reftable library.

Yes and no. Yes, we expect to be able to delete a file that was opened
by some *other* Git process (e.g., packfiles during gc), but, no, we do
not delete files that have been opened in the current process and are
still open.

For this reason, I am arguing to remove the interactive part of
mingw_unlink() and use the cooperative strategy I mentioned above. That
gives us POSIX-like behavior for concurrent Git processes.

The interactive question is only useful when the user has control over
an uncooperative process that keeps a file open for an extended time and
can find that processes, which is either obvious or extremely difficult.
As I said, I haven't seen the question since a long, long time now, but
I am also the first to admit that my way of using Git is rather narrow.

-- Hannes
Patrick Steinhardt Jan. 28, 2025, 7:17 a.m. UTC | #10
On Tue, Jan 28, 2025 at 07:52:48AM +0100, Johannes Sixt wrote:
> Am 27.01.25 um 08:48 schrieb Patrick Steinhardt:
> > On Sat, Jan 25, 2025 at 03:28:28PM +0100, Johannes Sixt wrote:
> >> Am 25.01.25 um 09:32 schrieb Patrick Steinhardt:
> > I have a feeling that there's a misunderstanding here, either on my side
> > or on yours. It's the rest of Git that wants to have POSIX behaviour for
> > `unlink()`, not the reftable library.
> 
> Yes and no. Yes, we expect to be able to delete a file that was opened
> by some *other* Git process (e.g., packfiles during gc), but, no, we do
> not delete files that have been opened in the current process and are
> still open.
> 
> For this reason, I am arguing to remove the interactive part of
> mingw_unlink() and use the cooperative strategy I mentioned above. That
> gives us POSIX-like behavior for concurrent Git processes.
> 
> The interactive question is only useful when the user has control over
> an uncooperative process that keeps a file open for an extended time and
> can find that processes, which is either obvious or extremely difficult.
> As I said, I haven't seen the question since a long, long time now, but
> I am also the first to admit that my way of using Git is rather narrow.

That would be a much wider change compared to what I'm proposing though.
I don't quite feel comfortable with pushing for such a change as I don't
have enough of a stake in Windows to be able to judge whether it would
be sensible or not.

If Dscho confirms your take I'm happy to do so. But otherwise I'd prefer
to continue with the more limited scope, as I know that the behaviour is
unexpected and unnecessary in the reftable library.

Patrick
Johannes Sixt Jan. 29, 2025, 7:40 a.m. UTC | #11
Am 28.01.25 um 08:17 schrieb Patrick Steinhardt:
> On Tue, Jan 28, 2025 at 07:52:48AM +0100, Johannes Sixt wrote:
>> The interactive question is only useful when the user has control over
>> an uncooperative process that keeps a file open for an extended time and
>> can find that processes, which is either obvious or extremely difficult.
>> As I said, I haven't seen the question since a long, long time now, but
>> I am also the first to admit that my way of using Git is rather narrow.
> 
> That would be a much wider change compared to what I'm proposing though.
> I don't quite feel comfortable with pushing for such a change as I don't
> have enough of a stake in Windows to be able to judge whether it would
> be sensible or not.
> 
> If Dscho confirms your take I'm happy to do so. But otherwise I'd prefer
> to continue with the more limited scope, as I know that the behaviour is
> unexpected and unnecessary in the reftable library.

I can live with the narrow workaround that you proposed. A more thorough
rewrite like I propose would have to cook for quite some time before it
can be released to the general public.

-- Hannes
diff mbox series

Patch

diff --git a/compat/mingw.c b/compat/mingw.c
index 1d5b211b54..0e4b6a70a4 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -302,7 +302,7 @@  static wchar_t *normalize_ntpath(wchar_t *wbuf)
 	return wbuf;
 }
 
-int mingw_unlink(const char *pathname)
+int mingw_unlink(const char *pathname, int handle_in_use_error)
 {
 	int ret, tries = 0;
 	wchar_t wpathname[MAX_PATH];
@@ -317,6 +317,9 @@  int mingw_unlink(const char *pathname)
 	while ((ret = _wunlink(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) {
 		if (!is_file_in_use_error(GetLastError()))
 			break;
+		if (!handle_in_use_error)
+			return ret;
+
 		/*
 		 * We assume that some other process had the source or
 		 * destination file open at the wrong moment and retry.
diff --git a/compat/mingw.h b/compat/mingw.h
index ebfb8ba423..a555af8d54 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -224,8 +224,12 @@  int uname(struct utsname *buf);
  * replacements of existing functions
  */
 
-int mingw_unlink(const char *pathname);
-#define unlink mingw_unlink
+int mingw_unlink(const char *pathname, int handle_in_use_error);
+#ifdef MINGW_DONT_HANDLE_IN_USE_ERROR
+# define unlink(path) mingw_unlink(path, 0)
+#else
+# define unlink(path) mingw_unlink(path, 1)
+#endif
 
 int mingw_rmdir(const char *path);
 #define rmdir mingw_rmdir
diff --git a/reftable/system.h b/reftable/system.h
index 5274eca1d0..fe94bf205b 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -13,6 +13,7 @@  license that can be found in the LICENSE file or at
 
 #define DISABLE_SIGN_COMPARE_WARNINGS
 
+#define MINGW_DONT_HANDLE_IN_USE_ERROR
 #include "git-compat-util.h"
 
 /*