diff mbox series

var(win32): do report the GIT_SHELL_PATH that is actually used

Message ID pull.1760.git.1720443778074.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series var(win32): do report the GIT_SHELL_PATH that is actually used | expand

Commit Message

Johannes Schindelin July 8, 2024, 1:02 p.m. UTC
From: Johannes Schindelin <johannes.schindelin@gmx.de>

On Windows, Unix-like paths like `/bin/sh` make very little sense. In
the best case, they simply don't work, in the worst case they are
misinterpreted as absolute paths that are relative to the drive
associated with the current directory.

To that end, Git does not actually use the path `/bin/sh` that is
recorded e.g. in Unix shell scripts' hash-bang lines. Instead, as of
776297548e (Do not use SHELL_PATH from build system in prepare_shell_cmd
on Windows, 2012-04-17), it re-interprets `/bin/sh` as "look up `sh` on
the `PATH` and use the result instead".

However, when 1e65721227 (var: add support for listing the shell,
2023-06-27) introduced support for `git var GIT_SHELL_PATH`, Windows was
not special-cased as above, which is why it outputs `/bin/sh` even
though that disagrees with what Git actually uses.

Let's fix this, and also adjust the corresponding test case to verify
that it actually finds a working executable.

Reported-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
    var(win32): do report the GIT_SHELL_PATH that is actually used

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1760%2Fdscho%2Fgit-var-on-windows-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1760/dscho/git-var-on-windows-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1760

 builtin/var.c      | 6 ++++++
 t/t0007-git-var.sh | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)


base-commit: 06e570c0dfb2a2deb64d217db78e2ec21672f558

Comments

brian m. carlson July 8, 2024, 11:40 p.m. UTC | #1
On 2024-07-08 at 18:54:41, Junio C Hamano wrote:
> This asks for a few naïve questions.
> 
> If there is a fixed path the "git" binary was compiled for, which
> can be referenced with a single variable GIT_SHELL_PATH, even though
> on non-POSIX systems it won't be like /bin/sh, wouldn't there be a
> path like "C:\Program Files\Git for Windows\bin\sh" (I do not do
> Windows, so you may be installing somewhere completely different)
> and wouldn't such a path work regardless of which drive is
> associated with the current directory?
> 
> I would actually understand that, with relocatable build where
> everything is relative to the installed directory, a single
> GIT_SHELL_PATH that is defined at the compile-time may not make much
> sense, and when you need to interpret a shell script, you may need
> to recompute the actual path, relative to the installation
> directory.

I'll jump in here, and Dscho can correct me if I'm wrong, but I believe
there's one build that's always relocatable (well, there's MinGit and
the regular, but ignoring that).  You can install to almost any drive
and location, so it's not known at compile time.

> But I wonder why the replacement shell that is spawned is searched
> for in PATH, not where you installed it (which of course would be
> different from /bin/sh).  In other words, when running script that
> calls for "#!/bin/sh", looking for "sh" on PATH might be a workable
> hack, and it might even yield the same result, especially if you
> prepend the path you installed git and bash as part of your
> installation package to the user's PATH, but wouldn't it make more
> sense to compute it just like how "git --exec-path" is recomputed
> with the relocatable build?
> 
> The "look on the %PATH%" strategy does not make any sense as an
> implementation for getting GIT_SHELL_PATH, which answers "what is
> the shell this instanciation of Git was built to work with?", at
> least to me.  Maybe I am missing some knowledge on limitations on
> Windows and Git for Windows why it is done that way.

Well, it may be that that's the approach that Git for Windows takes to
look up the shell.  (I don't know for certain.)  If that _is_ what it
does, then that's absolutely the value we want because we want to use
whatever shell Git for Windows uses.  I will say it's a risky approach
because it could well also find a Cygwin or MINGW shell (or, if it were
called bash, WSL), but we really want whatever Git for Windows does
here.

That's because external users who rely on Git for Windows to furnish a
POSIX shell will want to know the path, and this variable is the best
way to do that (and the reason I added it).
Junio C Hamano July 8, 2024, 11:55 p.m. UTC | #2
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

>> The "look on the %PATH%" strategy does not make any sense as an
>> implementation for getting GIT_SHELL_PATH, which answers "what is
>> the shell this instanciation of Git was built to work with?", at
>> least to me.  Maybe I am missing some knowledge on limitations on
>> Windows and Git for Windows why it is done that way.
>
> Well, it may be that that's the approach that Git for Windows takes to
> look up the shell.  (I don't know for certain.)

> If that _is_ what it does, then that's absolutely the value we
> want because we want to use whatever shell Git for Windows uses.
> I will say it's a risky approach because it could well also find a
> Cygwin or MINGW shell (or, if it were called bash, WSL), but we
> really want whatever Git for Windows does here.

Yeah, absolutely it is risky unless it is doing the "we are
relocatable, so where is the 'sh' _we_ installed?", which is what I
would expect GIT_SHELL_PATH to be.

> That's because external users who rely on Git for Windows to furnish a
> POSIX shell will want to know the path, and this variable is the best
> way to do that (and the reason I added it).

If Git for Windows furnishes programs other than POSIX shell, paths
to which external users would also want to learn, what happens?
GIT_PERL_PATH, GIT_WISH_PATH, etc.?  Locate them on PATH themselves
shouldn't be rocket science (and instead of locating, just spawning
them themselves would be even easier, I would presume).

Thanks.
brian m. carlson July 9, 2024, 12:25 a.m. UTC | #3
On 2024-07-08 at 23:55:25, Junio C Hamano wrote:
> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
> 
> > That's because external users who rely on Git for Windows to furnish a
> > POSIX shell will want to know the path, and this variable is the best
> > way to do that (and the reason I added it).
> 
> If Git for Windows furnishes programs other than POSIX shell, paths
> to which external users would also want to learn, what happens?
> GIT_PERL_PATH, GIT_WISH_PATH, etc.?  Locate them on PATH themselves
> shouldn't be rocket science (and instead of locating, just spawning
> them themselves would be even easier, I would presume).

In my experience, they have not always been on PATH.  It may be that
they are now, which is fantastic, but I seem to remember that at some
point bash.exe and git.exe were on PATH, but sh.exe and other commands
were not.  (There's also a different path under Git Bash than the
regular Windows PATH.)  You might say, "Well, why not use bash.exe?" and
that works great until you realize that there's also a bash.exe that is
part of WSL and will attempt to spawn WSL for you.

However, that doesn't always work, because sometimes WSL isn't installed
or is disabled or broken, and so the operation fails with an error
message.  Also, users will typically have wanted Git for Windows's bash
and not a Linux environment's bash, since the two environments will
typically have different software available.

Why not add Perl or Wish or something else?  Because once you have the
Git for Windows sh, you can use a fixed Unix path to look them up and
get a canonical Windows path with cygpath -w.  Thus, this is just the
minimal bootstrapping functionality to get that information.

Of course, on Unix, there can still be multiple Perl implementations,
but you're typically either going to build against one in particular,
which may or may not be what Git was built against, or you're going to
use /usr/bin/env and choose whatever's first in PATH.  In that case,
it's very unlikely that anyone cares what version of Perl Git actually
uses.

The only major benefit of using Git's shell on Unix is that you know it
supports POSIX functionality (which /bin/sh does not on some proprietary
Unices) and it also supports local, which may be convenient for
scripting.
Phillip Wood July 9, 2024, 8:55 a.m. UTC | #4
Hi Johannes

On 08/07/2024 14:02, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>

Thanks for putting a patch together so quickly

>   static char *shell_path(int ident_flag UNUSED)
>   {
> +#ifdef WIN32
> +	char *p = locate_in_PATH("sh");

If I'm reading is_busybox_applet() (which only exists in 
git-for-windows) correctly then this will return "busybox.exe" under 
mingit-busybox rather than ash.exe, so the calling program would have to 
know to set argv[0] (which is likely not possible unless the calling 
program is written in C) or pass "sh" as the first argument. As the code 
to support busybox does not exist upstream I guess that's best handled 
downstream.

Best Wishes

Phillip

> +	convert_slashes(p);
> +	return p;
> +#else
>   	return xstrdup(SHELL_PATH);
> +#endif
>   }
>   
>   static char *git_attr_val_system(int ident_flag UNUSED)
> diff --git a/t/t0007-git-var.sh b/t/t0007-git-var.sh
> index ff4fd9348cc..9fc58823873 100755
> --- a/t/t0007-git-var.sh
> +++ b/t/t0007-git-var.sh
> @@ -157,7 +157,7 @@ test_expect_success POSIXPERM 'GIT_SHELL_PATH points to a valid executable' '
>   test_expect_success MINGW 'GIT_SHELL_PATH points to a suitable shell' '
>   	shellpath=$(git var GIT_SHELL_PATH) &&
>   	case "$shellpath" in
> -	*sh) ;;
> +	[A-Z]:/*/sh.exe) test -f "$shellpath";;
>   	*) return 1;;
>   	esac
>   '
> 
> base-commit: 06e570c0dfb2a2deb64d217db78e2ec21672f558
Phillip Wood July 9, 2024, 1:53 p.m. UTC | #5
On 09/07/2024 00:40, brian m. carlson wrote:
> On 2024-07-08 at 18:54:41, Junio C Hamano wrote:
>> The "look on the %PATH%" strategy does not make any sense as an
>> implementation for getting GIT_SHELL_PATH, which answers "what is
>> the shell this instanciation of Git was built to work with?", at
>> least to me.  Maybe I am missing some knowledge on limitations on
>> Windows and Git for Windows why it is done that way.
> 
> Well, it may be that that's the approach that Git for Windows takes to
> look up the shell.  (I don't know for certain.)  If that _is_ what it
> does, then that's absolutely the value we want because we want to use
> whatever shell Git for Windows uses.

As I understand it this is the approach Git for Windows takes

> I will say it's a risky approach
> because it could well also find a Cygwin or MINGW shell (or, if it were
> called bash, WSL), but we really want whatever Git for Windows does
> here.

Git for Windows prepends the MINGW system directories to $PATH via some 
extra downstream code in setup_windows_environment() so looking up the 
shell in $PATH will find the correct executable.

Best Wishes

Phillip
Junio C Hamano July 9, 2024, 4:31 p.m. UTC | #6
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> Why not add Perl or Wish or something else?  Because once you have the
> Git for Windows sh, you can use a fixed Unix path to look them up and
> get a canonical Windows path with cygpath -w.  Thus, this is just the
> minimal bootstrapping functionality to get that information.

Besides, perl and wish are not part of Git.  The same thing can be
said that shell is not part of Git.  

So stepping back and thinking why we have "git var GIT_SHELL_PATH",
what should it mean to begin with?  It is obviously not to tell
where we installed the thing as a part of "Git" suite, even though
Git for Windows installer may let users install the shell and other
stuff (similar to how "apt" lets users install package on Debian
derived systems).

Hence, I can accept that "git var GIT_SHELL_PATH" is not (and does
not have to be) about where we installed what we shipped.  It is
where we use the shell from (i.e., when we need "sh", we use that
shell).

   The documentation for GIT_SHELL_PATH is already good.  Sorry for
   my earlier confusion---I should have looked at it first.  It says
   it is not about what we ship, but what we use when we need to
   shell out.

I am OK with GIT_SHELL_PATH computed differently depending on the
platform, as different platform apparently use different mechanism
to decide which shell to spawn.  On POSIX systems, it is the one we
compiled to use, while on Windows it is the one that happens to be
on the end-user's PATH.

But then the comment I made in my earlier review still stands that
such a platform dependent logic should be implemented a bit more
cleanly than the posted patch.

"Which shell do we use at runtime" should influence a lot of what
the things in run-command.c do, so perhaps

 - we remove builtin/var.c:shell_path()

 - We create run-command.c:git_shell_path() immediately above
   run-command.c:prepare_shell_cmd().  We will add conditional
   compilation there (i.e. #ifdef GIT_WINDOWS_NATIVE).  The default
   implementation is to return SHELL_PATH, and on Windows it looks
   for "sh" on %PATH%.

 - The entry for GIT_SHELL_PATH in builtin/var.c:git_vars[] should
   be updated to point at git_shell_path() in run-command.c

 - Near the beginning of run-command.c:prepare_shell_cmd(), there is
   a conditional compilation.  If we can remove it by using
   git_shell_path(), that would be a nice bonus.

would give us a good approach for implementation?

Thanks.
Johannes Schindelin July 11, 2024, 12:03 p.m. UTC | #7
Hi Phillip,

On Tue, 9 Jul 2024, Phillip Wood wrote:

> On 08/07/2024 14:02, Johannes Schindelin via GitGitGadget wrote:
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Thanks for putting a patch together so quickly
>
> >   static char *shell_path(int ident_flag UNUSED)
> >   {
> > +#ifdef WIN32
> > +	char *p = locate_in_PATH("sh");
>
> If I'm reading is_busybox_applet() (which only exists in git-for-windows)
> correctly then this will return "busybox.exe" under mingit-busybox rather than
> ash.exe, so the calling program would have to know to set argv[0] (which is
> likely not possible unless the calling program is written in C) or pass "sh"
> as the first argument. As the code to support busybox does not exist upstream
> I guess that's best handled downstream.

BusyBox-w32 is unfortunately displaying strange performance patterns. It
is partially (and expectedly) faster than the MSYS2 Bash, but in other
scenarios it is substantially slower (which is totally unexpected).

Some time ago, I tried to make this all work and investigate the
unexpected performance issues (and hoped to fix them, too), but ran out of
time [*1*]. That was almost two years ago, and I am unsure whether I will
ever be able to elevate the BusyBox flavor of MinGit to a non-experimental
state.

My original plan was to eventually no longer include `busybox.exe` in
the mingit-busybox packages, but instead a copy of that executable with
the name `sh.exe` and thereby have it work without that hack in the Git
code to call the `busybox.exe` with the `sh` argument inserted before the
regular command-line arguments.

In the context of the patch (or now: patch series) at hand, I don't think
we need to let BusyBox play any role.

Ciao,
Johannes

Footnote *1*: Interested parties can find the latest state here:
https://github.com/git-for-windows/git/compare/main...dscho:git:busybox
Phillip Wood July 17, 2024, 2:55 p.m. UTC | #8
Hi Johannes

On 11/07/2024 13:03, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Tue, 9 Jul 2024, Phillip Wood wrote:
> 
>> On 08/07/2024 14:02, Johannes Schindelin via GitGitGadget wrote:
>>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>>
>> Thanks for putting a patch together so quickly
>>
>>>    static char *shell_path(int ident_flag UNUSED)
>>>    {
>>> +#ifdef WIN32
>>> +	char *p = locate_in_PATH("sh");
>>
>> If I'm reading is_busybox_applet() (which only exists in git-for-windows)
>> correctly then this will return "busybox.exe" under mingit-busybox rather than
>> ash.exe, so the calling program would have to know to set argv[0] (which is
>> likely not possible unless the calling program is written in C) or pass "sh"
>> as the first argument. As the code to support busybox does not exist upstream
>> I guess that's best handled downstream.
> 
> BusyBox-w32 is unfortunately displaying strange performance patterns. It
> is partially (and expectedly) faster than the MSYS2 Bash, but in other
> scenarios it is substantially slower (which is totally unexpected).
> 
> Some time ago, I tried to make this all work and investigate the
> unexpected performance issues (and hoped to fix them, too), but ran out of
> time [*1*]. That was almost two years ago, and I am unsure whether I will
> ever be able to elevate the BusyBox flavor of MinGit to a non-experimental
> state.
> 
> My original plan was to eventually no longer include `busybox.exe` in
> the mingit-busybox packages, but instead a copy of that executable with
> the name `sh.exe` and thereby have it work without that hack in the Git
> code to call the `busybox.exe` with the `sh` argument inserted before the
> regular command-line arguments.

Thanks for the information.

> In the context of the patch (or now: patch series) at hand, I don't think
> we need to let BusyBox play any role.

Agreed

Best Wishes

Phillip

> Ciao,
> Johannes
> 
> Footnote *1*: Interested parties can find the latest state here:
> https://github.com/git-for-windows/git/compare/main...dscho:git:busybox
diff mbox series

Patch

diff --git a/builtin/var.c b/builtin/var.c
index 5dc384810c0..f4b1f34e403 100644
--- a/builtin/var.c
+++ b/builtin/var.c
@@ -51,7 +51,13 @@  static char *default_branch(int ident_flag UNUSED)
 
 static char *shell_path(int ident_flag UNUSED)
 {
+#ifdef WIN32
+	char *p = locate_in_PATH("sh");
+	convert_slashes(p);
+	return p;
+#else
 	return xstrdup(SHELL_PATH);
+#endif
 }
 
 static char *git_attr_val_system(int ident_flag UNUSED)
diff --git a/t/t0007-git-var.sh b/t/t0007-git-var.sh
index ff4fd9348cc..9fc58823873 100755
--- a/t/t0007-git-var.sh
+++ b/t/t0007-git-var.sh
@@ -157,7 +157,7 @@  test_expect_success POSIXPERM 'GIT_SHELL_PATH points to a valid executable' '
 test_expect_success MINGW 'GIT_SHELL_PATH points to a suitable shell' '
 	shellpath=$(git var GIT_SHELL_PATH) &&
 	case "$shellpath" in
-	*sh) ;;
+	[A-Z]:/*/sh.exe) test -f "$shellpath";;
 	*) return 1;;
 	esac
 '