diff mbox series

alias: document caveats and add trace of prepared command

Message ID 20240522024133.1108005-1-iwienand@redhat.com (mailing list archive)
State New
Headers show
Series alias: document caveats and add trace of prepared command | expand

Commit Message

Ian Wienand May 22, 2024, 2:41 a.m. UTC
There are a number of hidden caveats when using command/shell
expansion ("!") in an alias.  The command is exec'd, unless it can be
split, when it is then run as an argument to "sh -c".  Split commands
have "$@" appended.

Firstly this updates the documentation to explain this a little
clearer.

Secondly, this adds a trace point in prepare_cmd where this
substitution is done, so we can see the command being run without
having to resort to strace/code inspection.  e.g. "test = !echo" you
will show:

 $ GIT_TRACE=1 git test hello
 10:58:56.877234 git.c:755               trace: exec: git-test hello
 10:58:56.877382 run-command.c:657       trace: run_command: git-test hello
 10:58:56.878655 run-command.c:657       trace: run_command: echo hello
 10:58:56.878747 run-command.c:437       trace: prepare_cmd: /usr/bin/echo hello
 hello

For a "split" alias, e.g. test = "!echo $*" you will see

 $ GIT_TRACE=1 git test hello
 11:00:45.959420 git.c:755               trace: exec: git-test hello
 11:00:45.959737 run-command.c:657       trace: run_command: git-test hello
 11:00:45.961424 run-command.c:657       trace: run_command: 'echo $*' hello
 11:00:45.961528 run-command.c:437       trace: prepare_cmd: /bin/sh -c 'echo $* "$@"' 'echo $*' hello
 hello hello

which clearly shows you the appended "$@".  This can be very helpful
when an alias is giving you an unexpected synatx error that is very
difficult figure out from only the run_command trace point,
e.g. test = "!for i in 1 2 3; do echo $i; done"

 $ GIT_TRACE=1 test hello
 11:02:39.813030 git.c:755               trace: exec: git-test hello
 11:02:39.813233 run-command.c:657       trace: run_command: git-test hello
 11:02:39.814384 run-command.c:657       trace: run_command: 'for i in 1 2 3; do echo $i; done' hello
 11:02:39.814468 run-command.c:437       trace: prepare_cmd: /bin/sh -c 'for i in 1 2 3; do echo $i; done "$@"' 'for i in 1 2 3; do echo $i; done' hello
 for i in 1 2 3; do echo $i; done: -c: line 1: syntax error near unexpected token `"$@"'
 for i in 1 2 3; do echo $i; done: -c: line 1: `for i in 1 2 3; do echo $i; done "$@"'

Signed-off-by: Ian Wienand <iwienand@redhat.com>
---
 Documentation/config/alias.txt | 26 +++++++++++++++++++++-----
 run-command.c                  |  1 +
 2 files changed, 22 insertions(+), 5 deletions(-)

Comments

Eric Sunshine May 22, 2024, 3:29 a.m. UTC | #1
On Tue, May 21, 2024 at 10:42 PM Ian Wienand <iwienand@redhat.com> wrote:
> [...]
> For a "split" alias, e.g. test = "!echo $*" you will see
>
>  $ GIT_TRACE=1 git test hello
>  11:00:45.959420 git.c:755               trace: exec: git-test hello
>  11:00:45.959737 run-command.c:657       trace: run_command: git-test hello
>  11:00:45.961424 run-command.c:657       trace: run_command: 'echo $*' hello
>  11:00:45.961528 run-command.c:437       trace: prepare_cmd: /bin/sh -c 'echo $* "$@"' 'echo $*' hello
>  hello hello
>
> which clearly shows you the appended "$@".  This can be very helpful
> when an alias is giving you an unexpected synatx error that is very

s/synatx/syntax/

> difficult figure out from only the run_command trace point,
> e.g. test = "!for i in 1 2 3; do echo $i; done"
> [...]
> Signed-off-by: Ian Wienand <iwienand@redhat.com>
Junio C Hamano May 22, 2024, 4:07 p.m. UTC | #2
Ian Wienand <iwienand@redhat.com> writes:

> For a "split" alias, e.g. test = "!echo $*" you will see
>
>  $ GIT_TRACE=1 git test hello
>  11:00:45.959420 git.c:755               trace: exec: git-test hello
>  11:00:45.959737 run-command.c:657       trace: run_command: git-test hello
>  11:00:45.961424 run-command.c:657       trace: run_command: 'echo $*' hello
>  11:00:45.961528 run-command.c:437       trace: prepare_cmd: /bin/sh -c 'echo $* "$@"' 'echo $*' hello
>  hello hello
>
> which clearly shows you the appended "$@".  

A question and a comment on this part.

Who are "you" in this context?

It is somewhat surprising if we do not consistently add "$@" (or do
an equivalent), as the point of adding "$@" is to allow an alias to
specify "the initial part of a command line", e.g.

	[alias] lg = log --oneline

to let you say "git lg" to mean "git log --oneline" and also you can
say "git lg --graph" to mean "git log --oneline --graph".  In other
words, what you type after "git lg" makes "the rest of the command
line", while alias gives "the initial part".

Are you sure that with

	[alias] lg = log

you get the rest of the command line ignored, in other words, if you
say "git lg --oneline", it does not do "git log --oneline"?

>  Documentation/config/alias.txt | 26 +++++++++++++++++++++-----
>  run-command.c                  |  1 +
>  2 files changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/config/alias.txt b/Documentation/config/alias.txt
> index 01df96fab3..a3f090d79d 100644
> --- a/Documentation/config/alias.txt
> +++ b/Documentation/config/alias.txt
> @@ -21,8 +21,24 @@ If the alias expansion is prefixed with an exclamation point,
>  it will be treated as a shell command.  For example, defining
>  `alias.new = !gitk --all --not ORIG_HEAD`, the invocation
>  `git new` is equivalent to running the shell command
> -`gitk --all --not ORIG_HEAD`.  Note that shell commands will be
> -executed from the top-level directory of a repository, which may
> -not necessarily be the current directory.
> -`GIT_PREFIX` is set as returned by running `git rev-parse --show-prefix`
> -from the original current directory. See linkgit:git-rev-parse[1].
> +`gitk --all --not ORIG_HEAD`.  Note:
> ++
> +* Shell commands will be executed from the top-level directory of a
> +  repository, which may not necessarily be the current directory.
> +* `GIT_PREFIX` is set as returned by running `git rev-parse --show-prefix`
> +  from the original current directory. See linkgit:git-rev-parse[1].
> +* Single commands will be executed directly.  Commands that can be "split"
> +  (contain whitespace or shell-reserved characters) will be run as shell
> +  commands via an argument to `sh -c`.
> +* When arguments are present to a "split" command running in a shell,
> +  the shell command will have `"$@"` appended.  The first non-command
> +  argument to `sh -c` (i.e. `$0` to the command) is always the alias
> +  string, and other user specified arguments will follow.
> +** This may initially be confusing if your command references argument
> +   variables or is not expecting the presence of `"$@"`.  For example:
> +   `alias.echo = "!echo $1"` when run as `git echo arg` will execute
> +   `sh -c "echo $1 $@" "echo $1" "1"` resulting in output `1 1`.
> +   An alias `alias.for = "!for i in 1 2 3; do echo $i; done"` will fail
> +   if any arguments are specified to `git for` as the appended `"$@"` will
> +   create invalid shell syntax.  Setting `GIT_TRACE=1` can help debug
> +   the command being run.

The above does a bit too many things in a single patch to be
reviewable.  Perhaps make the above change in two patches?

 (1) Reformulate the existing test into "Note:" followed by a
     bulletted list.

 (2) Add new items to the updated and easier-to-read form prepared
     by the first patch.

You have a lengthy new text in the documentation to help confused
users, but it looks to me that it places too much stress on the
mechanics (i.e. '$@' is added to the command line) without saying
much about the intent (i.e. you need to use '$@' to allow the
command line invocation to supply "the rest of the command line"
when you are running the alias via the shell).  I've learned over
the course of this project that readers learn better when the intent
behind the mechanics is given in an understandable form.

I think the idea is "we want the 'concatenation of what the alias
supplies and what the user gives when invoking the alias from the
command line' to work sensibly".  The most generic way to do so when
processing "git lg --graph -3" with "[alias] lg = !git log --oneline" is

    sh -c 'git log --oneline "$@"' - '--graph' '-3'

(readers can try this at home by adding 'echo ' before 'log').  We
may try to "optimize" the most generic pattern when there is no need
to use "sh -c" go a more direct route.  For example, if you have

    [alias] !foo = bar

then there is no need to use "sh -c" in the first place.  "git foo
baz quux" should behave just like when you said "sh bar baz quux" on
the command line, i.e. execute your shell script "bar" and give
"baz" and "quux" as two arguments to that script.  We can prepare
argv[] = ("sh", "bar", "baz", "quux") and fork-exec that, so the
command line the underlying exec(2) call sees may not have "$@" (and
it will not have "-c", either).

You can undo the "optimization" and rewrite the command line to

    sh -c 'bar "$@"' - 'baz' 'quux'

and it should mean the same thing.

> diff --git a/run-command.c b/run-command.c
> index 1b821042b4..36b2b2f194 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -435,6 +435,7 @@ static int prepare_cmd(struct strvec *out, const struct child_process *cmd)
>  		}
>  	}
>  
> +	trace_argv_printf(&out->v[1], "trace: prepare_cmd:");
>  	return 0;
>  }

Hmph, we do not have much tests that look for 'trace: foo' in our
test suite, but t0014 seems to use it.  Don't we need to cover this
new trace output in the test in the same patch (probably becomes
patch 3 after the two documentation changes)?

Thanks.
Ian Wienand May 23, 2024, 12:38 a.m. UTC | #3
On Wed, May 22, 2024 at 09:07:55AM -0700, Junio C Hamano wrote:
> Ian Wienand <iwienand@redhat.com> writes:
> 
> > For a "split" alias, e.g. test = "!echo $*" you will see
> >
> >  $ GIT_TRACE=1 git test hello
> >  11:00:45.959420 git.c:755               trace: exec: git-test hello
> >  11:00:45.959737 run-command.c:657       trace: run_command: git-test hello
> >  11:00:45.961424 run-command.c:657       trace: run_command: 'echo $*' hello
> >  11:00:45.961528 run-command.c:437       trace: prepare_cmd: /bin/sh -c 'echo $* "$@"' 'echo $*' hello
> >  hello hello
> >
> > which clearly shows you the appended "$@".  
> 
> A question and a comment on this part.
> 
> Who are "you" in this context?

Heh, "you" is the person who has been given a large git alias that
probably should have been written some other way (but likely due to
the fact that stuffing all the workflow logic into a git config-file
means that nothing extra like shell/python scripts needed to be
distributed) that "you" have to now debug :) I feel the extra info in
GIT_TRACE output showing what is _actually_ happening after
prepare_cmd would really help this person (who was me :).

> It is somewhat surprising if we do not consistently add "$@" (or do
> an equivalent), as the point of adding "$@" is to allow an alias to
> specify "the initial part of a command line", e.g.
> 
> 	[alias] lg = log --oneline
> 
> to let you say "git lg" to mean "git log --oneline" and also you can
> say "git lg --graph" to mean "git log --oneline --graph".  In other
> words, what you type after "git lg" makes "the rest of the command
> line", while alias gives "the initial part".
> 
> Are you sure that with
> 
> 	[alias] lg = log
> 
> you get the rest of the command line ignored, in other words, if you
> say "git lg --oneline", it does not do "git log --oneline"?

I'm not quite sure I'm following here, sorry.  The behaviour is
different for "shell" aliases prefixed with !.  There the stragtegy is
to append "$@", but only if there are arguments.  Whatever the merits
or not of that, I feel like it's impossible to change because it's out
there now.

Now I've been looking, I've seen a number of interesting ways people
have been dealing with this.  Some people have ignored it, for example
I've seen credential helpers use ":" to ignore it, like

 "!echo password=\$${TOKEN_NAME}; :"

The git documentation for credential helpers gives another way,
defining an inline function, so the "$@" appended becomes function
arguments to "f()" like

 "!f() { echo $1 $2 ; }; f"

Then I've seen other people nest in a "sh -c" call, and "fake" $0 so
that the appended "$@" becomes the $1 arguments to the interior shell
call (cue quoting nightmares!) like

 "!sh -c 'echo $1 $2' foo"

Once you realise what is going on, these are somewhat understandable,
but the documentation as written doesn't guide you in the right
direction.

> The above does a bit too many things in a single patch to be
> reviewable.  Perhaps make the above change in two patches?

Sure, I am happy to split this up

> You have a lengthy new text in the documentation to help confused
> users, but it looks to me that it places too much stress on the
> mechanics (i.e. '$@' is added to the command line) without saying
> much about the intent (i.e. you need to use '$@' to allow the
> command line invocation to supply "the rest of the command line"
> when you are running the alias via the shell).  I've learned over
> the course of this project that readers learn better when the intent
> behind the mechanics is given in an understandable form.

Let me take your feedback here and work it into the split up change,
and we can discuss further.  Thanks.

> > diff --git a/run-command.c b/run-command.c
> > index 1b821042b4..36b2b2f194 100644
> > --- a/run-command.c
> > +++ b/run-command.c
> > @@ -435,6 +435,7 @@ static int prepare_cmd(struct strvec *out, const struct child_process *cmd)
> >  		}
> >  	}
> >  
> > +	trace_argv_printf(&out->v[1], "trace: prepare_cmd:");
> >  	return 0;
> >  }
> 
> Hmph, we do not have much tests that look for 'trace: foo' in our
> test suite, but t0014 seems to use it.  Don't we need to cover this
> new trace output in the test in the same patch (probably becomes
> patch 3 after the two documentation changes)?

Ok, thanks for the pointer, I'll look at expanding this one.

-i
diff mbox series

Patch

diff --git a/Documentation/config/alias.txt b/Documentation/config/alias.txt
index 01df96fab3..a3f090d79d 100644
--- a/Documentation/config/alias.txt
+++ b/Documentation/config/alias.txt
@@ -21,8 +21,24 @@  If the alias expansion is prefixed with an exclamation point,
 it will be treated as a shell command.  For example, defining
 `alias.new = !gitk --all --not ORIG_HEAD`, the invocation
 `git new` is equivalent to running the shell command
-`gitk --all --not ORIG_HEAD`.  Note that shell commands will be
-executed from the top-level directory of a repository, which may
-not necessarily be the current directory.
-`GIT_PREFIX` is set as returned by running `git rev-parse --show-prefix`
-from the original current directory. See linkgit:git-rev-parse[1].
+`gitk --all --not ORIG_HEAD`.  Note:
++
+* Shell commands will be executed from the top-level directory of a
+  repository, which may not necessarily be the current directory.
+* `GIT_PREFIX` is set as returned by running `git rev-parse --show-prefix`
+  from the original current directory. See linkgit:git-rev-parse[1].
+* Single commands will be executed directly.  Commands that can be "split"
+  (contain whitespace or shell-reserved characters) will be run as shell
+  commands via an argument to `sh -c`.
+* When arguments are present to a "split" command running in a shell,
+  the shell command will have `"$@"` appended.  The first non-command
+  argument to `sh -c` (i.e. `$0` to the command) is always the alias
+  string, and other user specified arguments will follow.
+** This may initially be confusing if your command references argument
+   variables or is not expecting the presence of `"$@"`.  For example:
+   `alias.echo = "!echo $1"` when run as `git echo arg` will execute
+   `sh -c "echo $1 $@" "echo $1" "1"` resulting in output `1 1`.
+   An alias `alias.for = "!for i in 1 2 3; do echo $i; done"` will fail
+   if any arguments are specified to `git for` as the appended `"$@"` will
+   create invalid shell syntax.  Setting `GIT_TRACE=1` can help debug
+   the command being run.
diff --git a/run-command.c b/run-command.c
index 1b821042b4..36b2b2f194 100644
--- a/run-command.c
+++ b/run-command.c
@@ -435,6 +435,7 @@  static int prepare_cmd(struct strvec *out, const struct child_process *cmd)
 		}
 	}
 
+	trace_argv_printf(&out->v[1], "trace: prepare_cmd:");
 	return 0;
 }