diff mbox series

githooks: discuss Git operations in foreign repositories

Message ID pull.1457.git.1673171924727.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series githooks: discuss Git operations in foreign repositories | expand

Commit Message

Eric Sunshine Jan. 8, 2023, 9:58 a.m. UTC
From: Eric Sunshine <sunshine@sunshineco.com>

Hook authors are periodically caught off-guard by difficult-to-diagnose
errors when their hook invokes Git commands in a repository other than
the local one. In particular, Git environment variables, such as GIT_DIR
and GIT_WORK_TREE, which reference the local repository cause the Git
commands to operate on the local repository rather than on the
repository which the author intended. This is true whether the
environment variables have been set manually by the user or
automatically by Git itself. The same problem crops up when a hook
invokes Git commands in a different worktree of the same repository, as
well.

Recommended best-practice[1,2,3,4,5,6] for avoiding this problem is for
the hook to ensure that Git variables are unset before invoking Git
commands in foreign repositories or other worktrees:

    unset $(git rev-parse --local-env-vars)

However, this advice is not documented anywhere. Rectify this
shortcoming by mentioning it in githooks.txt documentation.

[1]: https://lore.kernel.org/git/YFuHd1MMlJAvtdzb@coredump.intra.peff.net/
[2]: https://lore.kernel.org/git/20200228190218.GC1408759@coredump.intra.peff.net/
[3]: https://lore.kernel.org/git/20190516221702.GA11784@sigill.intra.peff.net/
[4]: https://lore.kernel.org/git/20190422162127.GC9680@sigill.intra.peff.net/
[5]: https://lore.kernel.org/git/20180716183942.GB22298@sigill.intra.peff.net/
[6]: https://lore.kernel.org/git/20150203163235.GA9325@peff.net/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
    githooks: discuss Git operations in foreign repositories

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1457%2Fsunshineco%2Fhookenv-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1457/sunshineco/hookenv-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1457

 Documentation/githooks.txt | 11 +++++++++++
 1 file changed, 11 insertions(+)


base-commit: a38d39a4c50d1275833aba54c4dbdfce9e2e9ca1

Comments

Preston Tunnell Wilson Jan. 8, 2023, 7:45 p.m. UTC | #1
Thank you for this wonderful remedy, Eric! I really appreciate the
background context and how you framed the problem that I ran into.

I have two questions:
1. Documentation is a great first step in addressing this, but I'm
wondering if this should be automatic? If this is a best practice for
hook authors, could `git` do this for them automatically when running
hooks?
2. Should we add something in the `git-worktree` documentation? In
`Documentation/git-worktree.txt`, it mentions:

> BUGS
> ----
> Multiple checkout in general is still experimental, and the support
> for submodules is incomplete. ...

Would it be helpful to plant a flag in the above documentation to
point to this potential issue?

Thank you,
Preston


On Sun, Jan 8, 2023 at 3:58 AM Eric Sunshine via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Eric Sunshine <sunshine@sunshineco.com>
>
> Hook authors are periodically caught off-guard by difficult-to-diagnose
> errors when their hook invokes Git commands in a repository other than
> the local one. In particular, Git environment variables, such as GIT_DIR
> and GIT_WORK_TREE, which reference the local repository cause the Git
> commands to operate on the local repository rather than on the
> repository which the author intended. This is true whether the
> environment variables have been set manually by the user or
> automatically by Git itself. The same problem crops up when a hook
> invokes Git commands in a different worktree of the same repository, as
> well.
>
> Recommended best-practice[1,2,3,4,5,6] for avoiding this problem is for
> the hook to ensure that Git variables are unset before invoking Git
> commands in foreign repositories or other worktrees:
>
>     unset $(git rev-parse --local-env-vars)
>
> However, this advice is not documented anywhere. Rectify this
> shortcoming by mentioning it in githooks.txt documentation.
>
> [1]: https://lore.kernel.org/git/YFuHd1MMlJAvtdzb@coredump.intra.peff.net/
> [2]: https://lore.kernel.org/git/20200228190218.GC1408759@coredump.intra.peff.net/
> [3]: https://lore.kernel.org/git/20190516221702.GA11784@sigill.intra.peff.net/
> [4]: https://lore.kernel.org/git/20190422162127.GC9680@sigill.intra.peff.net/
> [5]: https://lore.kernel.org/git/20180716183942.GB22298@sigill.intra.peff.net/
> [6]: https://lore.kernel.org/git/20150203163235.GA9325@peff.net/
>
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
>     githooks: discuss Git operations in foreign repositories
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1457%2Fsunshineco%2Fhookenv-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1457/sunshineco/hookenv-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1457
>
>  Documentation/githooks.txt | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index a16e62bc8c8..6e9a5420b7c 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -31,6 +31,17 @@ Hooks can get their arguments via the environment, command-line
>  arguments, and stdin. See the documentation for each hook below for
>  details.
>
> +If your hook needs to invoke Git commands in a foreign repository or in a
> +different working tree of the same repository, then it should clear local Git
> +environment variables, such as `GIT_DIR`, `GIT_WORK_TREE`, etc., which could
> +interfere with Git operations in the foreign repository since those variables
> +will be referencing the local repository and working tree. For example:
> +
> +------------
> +local_desc=$(git describe)
> +foreign_desc=$(unset $(git rev-parse --local-env-vars); git -C ../foreign-repo describe)
> +------------
> +
>  `git init` may copy hooks to the new repository, depending on its
>  configuration. See the "TEMPLATE DIRECTORY" section in
>  linkgit:git-init[1] for details. When the rest of this document refers
>
> base-commit: a38d39a4c50d1275833aba54c4dbdfce9e2e9ca1
> --
> gitgitgadget
Eric Sunshine Jan. 8, 2023, 11:25 p.m. UTC | #2
[administrivia: please reply inline rather than top-posting]

On Sun, Jan 8, 2023 at 2:45 PM Preston Tunnell Wilson
<prestontunnellwilson@gmail.com> wrote:
> Thank you for this wonderful remedy, Eric! I really appreciate the
> background context and how you framed the problem that I ran into.
>
> I have two questions:
> 1. Documentation is a great first step in addressing this, but I'm
> wondering if this should be automatic? If this is a best practice for
> hook authors, could `git` do this for them automatically when running
> hooks?

For the general case, probably not. "Best-practice" is context
sensitive. It may be best-practice when a hook needs to invoke Git
commands in some other repository (or worktree), but clearing those
variables automatically would, in some situations, break the much more
common case of the hook invoking Git commands in the local repository
(or worktree). The fact that those environment variables may have been
set manually by the user or automatically by Git further complicates
the situation.

> 2. Should we add something in the `git-worktree` documentation? In
> `Documentation/git-worktree.txt`, it mentions:
>
> > BUGS
> > ----
> > Multiple checkout in general is still experimental, and the support
> > for submodules is incomplete. ...
>
> Would it be helpful to plant a flag in the above documentation to
> point to this potential issue?

As noted above, we can't really call this a bug. Git is behaving as
intended. Whether the user set the variables manually or whether some
parent Git process set them automatically, the child Git respects the
variables as it should rather than second-guessing about the user's
intentions, and possibly guessing incorrectly.

So, no, I don't think this qualifies for the BUGS section of
git-wortkree, and mentioning this potential gotcha only in
git-worktree but not in any other hook-running command doesn't seem
ideal either. At present, the best place to discuss it seems to be
Documentation/githooks.txt, as this patch does. It may be possible to
argue that gitfaq.txt could talk about it, but considering that this
issue can manifest in many different ways (various error messages or
misbehaviors), it's difficult to come up with any text for the "Q"
which people would be likely to find when Googling. That's not to say
it shouldn't be mentioned elsewhere in the documentation, but rather
that I haven't come up with any better places than githooks.txt
itself.
Preston Tunnell Wilson Jan. 9, 2023, 2:54 a.m. UTC | #3
> "Best-practice" is context
> sensitive. It may be best-practice when a hook needs to invoke Git
> commands in some other repository (or worktree), but clearing those
> variables automatically would, in some situations, break the much more
> common case of the hook invoking Git commands in the local repository
> (or worktree). The fact that those environment variables may have been
> set manually by the user or automatically by Git further complicates
> the situation.

That makes sense, thank you for your answer!

> So, no, I don't think this qualifies for the BUGS section of
> git-wortkree, and mentioning this potential gotcha only in
> git-worktree but not in any other hook-running command doesn't seem
> ideal either. At present, the best place to discuss it seems to be
> Documentation/githooks.txt, as this patch does.

I agree the best place to put it is in Documentation/githooks.txt. I
also agree the BUGS section doesn't make sense, but I'm still
wondering if we should call it out in git-worktree.txt in addition to
githooks.txt. When I ran into this issue, I tried to compare my setup
to that of my coworkers. The difference was that I was using
git-worktree, they were not. git-worktree's documentation lists:

Within a linked worktree, $GIT_DIR is set to point to this private
directory (e.g. /path/main/.git/worktrees/test-next in the example)
and $GIT_COMMON_DIR is set to point back to the main worktree’s
$GIT_DIR (e.g. /path/main/.git). These settings are made in a .git
file located at the top directory of the linked worktree.

To me, this is the "other side of the coin" of your patch. (Or maybe
one of the many other sides of the coin for commands that can run
git-hooks.) Mentioning a potential collision between git-hooks and
these variables being set could maybe go in the above snippet, maybe
in parentheses. It took a lot of working backwards to narrow the issue
to the interaction between git-worktree and git-hooks rather than the
package manager I was using or the tool the hook was calling. Putting
a note in the git-worktree documentation (in addition to the note in
git-hooks) might help out someone in the future, but I defer to your
judgement. If it doesn't make sense, doesn't fit, or adding it here
would detract and make the documentation more confusing, I am happy to
leave it out.

And thank you for the administrivia!

On Sun, Jan 8, 2023 at 5:25 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> [administrivia: please reply inline rather than top-posting]
>
> On Sun, Jan 8, 2023 at 2:45 PM Preston Tunnell Wilson
> <prestontunnellwilson@gmail.com> wrote:
> > Thank you for this wonderful remedy, Eric! I really appreciate the
> > background context and how you framed the problem that I ran into.
> >
> > I have two questions:
> > 1. Documentation is a great first step in addressing this, but I'm
> > wondering if this should be automatic? If this is a best practice for
> > hook authors, could `git` do this for them automatically when running
> > hooks?
>
> For the general case, probably not. "Best-practice" is context
> sensitive. It may be best-practice when a hook needs to invoke Git
> commands in some other repository (or worktree), but clearing those
> variables automatically would, in some situations, break the much more
> common case of the hook invoking Git commands in the local repository
> (or worktree). The fact that those environment variables may have been
> set manually by the user or automatically by Git further complicates
> the situation.
>
> > 2. Should we add something in the `git-worktree` documentation? In
> > `Documentation/git-worktree.txt`, it mentions:
> >
> > > BUGS
> > > ----
> > > Multiple checkout in general is still experimental, and the support
> > > for submodules is incomplete. ...
> >
> > Would it be helpful to plant a flag in the above documentation to
> > point to this potential issue?
>
> As noted above, we can't really call this a bug. Git is behaving as
> intended. Whether the user set the variables manually or whether some
> parent Git process set them automatically, the child Git respects the
> variables as it should rather than second-guessing about the user's
> intentions, and possibly guessing incorrectly.
>
> So, no, I don't think this qualifies for the BUGS section of
> git-wortkree, and mentioning this potential gotcha only in
> git-worktree but not in any other hook-running command doesn't seem
> ideal either. At present, the best place to discuss it seems to be
> Documentation/githooks.txt, as this patch does. It may be possible to
> argue that gitfaq.txt could talk about it, but considering that this
> issue can manifest in many different ways (various error messages or
> misbehaviors), it's difficult to come up with any text for the "Q"
> which people would be likely to find when Googling. That's not to say
> it shouldn't be mentioned elsewhere in the documentation, but rather
> that I haven't come up with any better places than githooks.txt
> itself.
Junio C Hamano Jan. 9, 2023, 4:58 a.m. UTC | #4
"Eric Sunshine via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
> index a16e62bc8c8..6e9a5420b7c 100644
> --- a/Documentation/githooks.txt
> +++ b/Documentation/githooks.txt
> @@ -31,6 +31,17 @@ Hooks can get their arguments via the environment, command-line
>  arguments, and stdin. See the documentation for each hook below for
>  details.
>  
> +If your hook needs to invoke Git commands in a foreign repository or in a
> +different working tree of the same repository, then it should clear local Git
> +environment variables, such as `GIT_DIR`, `GIT_WORK_TREE`, etc., which could
> +interfere with Git operations in the foreign repository since those variables
> +will be referencing the local repository and working tree. For example:
> +
> +------------
> +local_desc=$(git describe)
> +foreign_desc=$(unset $(git rev-parse --local-env-vars); git -C ../foreign-repo describe)
> +------------
> +

It is an excellent idea to add the above, but

 * I think adding it one paragraph earlier may make it fit better.

 * The paragraph, after which the above gets inserted, can use a bit
   of enhancement.

That is, something like this?



 Documentation/githooks.txt | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git c/Documentation/githooks.txt w/Documentation/githooks.txt
index a16e62bc8c..f3d0404164 100644
--- c/Documentation/githooks.txt
+++ w/Documentation/githooks.txt
@@ -25,7 +25,20 @@ Before Git invokes a hook, it changes its working directory to either
 $GIT_DIR in a bare repository or the root of the working tree in a non-bare
 repository. An exception are hooks triggered during a push ('pre-receive',
 'update', 'post-receive', 'post-update', 'push-to-checkout') which are always
-executed in $GIT_DIR.
+executed in $GIT_DIR.  Environment variables like GIT_DIR and GIT_WORK_TREE
+are exported so that the hook can easily learn which repository it is
+working with.
+
+If your hook needs to invoke Git commands in a foreign repository or in a
+different working tree of the same repository, then it should clear local Git
+environment variables, such as `GIT_DIR`, `GIT_WORK_TREE`, etc., which could
+interfere with Git operations in the foreign repository since those variables
+will be referencing the local repository and working tree. For example:
+
+------------
+local_desc=$(git describe)
+foreign_desc=$(unset $(git rev-parse --local-env-vars); git -C ../foreign-repo describe)
+------------
 
 Hooks can get their arguments via the environment, command-line
 arguments, and stdin. See the documentation for each hook below for
Eric Sunshine Jan. 9, 2023, 5:03 a.m. UTC | #5
On Sun, Jan 8, 2023 at 11:58 PM Junio C Hamano <gitster@pobox.com> wrote:
> "Eric Sunshine via GitGitGadget" <gitgitgadget@gmail.com> writes:
> > +If your hook needs to invoke Git commands in a foreign repository or in a
> > +different working tree of the same repository, then it should clear local Git
> > +environment variables, such as `GIT_DIR`, `GIT_WORK_TREE`, etc., which could
> > +interfere with Git operations in the foreign repository since those variables
> > +will be referencing the local repository and working tree. For example:
> > +
> > +------------
> > +local_desc=$(git describe)
> > +foreign_desc=$(unset $(git rev-parse --local-env-vars); git -C ../foreign-repo describe)
> > +------------
>
> It is an excellent idea to add the above, but
>
>  * I think adding it one paragraph earlier may make it fit better.

That was my initial choice, as well, and is where I initially inserted
it, but moved it down a paragraph at the last moment. I'm happy to
move it back up again.

>  * The paragraph, after which the above gets inserted, can use a bit
>    of enhancement.
>
> That is, something like this?
>
>  repository. An exception are hooks triggered during a push ('pre-receive',
>  'update', 'post-receive', 'post-update', 'push-to-checkout') which are always
> -executed in $GIT_DIR.
> +executed in $GIT_DIR.  Environment variables like GIT_DIR and GIT_WORK_TREE
> +are exported so that the hook can easily learn which repository it is
> +working with.

Yes, good idea, although I might phrase it something like:

    Environment variables such as ... are exported so that Git
    commands run by the hook can correctly locate the repository.
Junio C Hamano Jan. 9, 2023, 5:43 a.m. UTC | #6
Eric Sunshine <sunshine@sunshineco.com> writes:

> Yes, good idea, although I might phrase it something like:
>
>     Environment variables such as ... are exported so that Git
>     commands run by the hook can correctly locate the repository.

Excellent.  Thanks.
Eric Sunshine Jan. 9, 2023, 9:47 p.m. UTC | #7
On Sun, Jan 8, 2023 at 9:54 PM Preston Tunnell Wilson
<prestontunnellwilson@gmail.com> wrote:
> > So, no, I don't think this qualifies for the BUGS section of
> > git-wortkree, and mentioning this potential gotcha only in
> > git-worktree but not in any other hook-running command doesn't seem
> > ideal either. At present, the best place to discuss it seems to be
> > Documentation/githooks.txt, as this patch does.
>
> I agree the best place to put it is in Documentation/githooks.txt. I
> also agree the BUGS section doesn't make sense, but I'm still
> wondering if we should call it out in git-worktree.txt in addition to
> githooks.txt. When I ran into this issue, I tried to compare my setup
> to that of my coworkers. The difference was that I was using
> git-worktree, they were not. git-worktree's documentation lists:
>
> Within a linked worktree, $GIT_DIR is set to point to this private
> directory (e.g. /path/main/.git/worktrees/test-next in the example)
> and $GIT_COMMON_DIR is set to point back to the main worktree’s
> $GIT_DIR (e.g. /path/main/.git). These settings are made in a .git
> file located at the top directory of the linked worktree.
>
> To me, this is the "other side of the coin" of your patch. (Or maybe
> one of the many other sides of the coin for commands that can run
> git-hooks.) Mentioning a potential collision between git-hooks and
> these variables being set could maybe go in the above snippet, maybe
> in parentheses. It took a lot of working backwards to narrow the issue
> to the interaction between git-worktree and git-hooks rather than the
> package manager I was using or the tool the hook was calling. Putting
> a note in the git-worktree documentation (in addition to the note in
> git-hooks) might help out someone in the future, but I defer to your
> judgement. If it doesn't make sense, doesn't fit, or adding it here
> would detract and make the documentation more confusing, I am happy to
> leave it out.

I understand your concern, and can relate to the amount of effort it
took to narrow down the problem. Nevertheless, even though you
encountered this problem in relation to git-worktree, it's a more
general issue which can manifest in other situations. As such, I can't
think of a good way to discuss the issue in the git-worktree
documentation that wouldn't feel out of place and make the
documentation more confusing.

I'm not necessarily opposed to someone else giving it a shot if it can
be done in a way which doesn't feel out of place and doesn't confuse
git-worktree documentation further (especially for those new to the
documentation). I just don't know how to do so myself.
diff mbox series

Patch

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index a16e62bc8c8..6e9a5420b7c 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -31,6 +31,17 @@  Hooks can get their arguments via the environment, command-line
 arguments, and stdin. See the documentation for each hook below for
 details.
 
+If your hook needs to invoke Git commands in a foreign repository or in a
+different working tree of the same repository, then it should clear local Git
+environment variables, such as `GIT_DIR`, `GIT_WORK_TREE`, etc., which could
+interfere with Git operations in the foreign repository since those variables
+will be referencing the local repository and working tree. For example:
+
+------------
+local_desc=$(git describe)
+foreign_desc=$(unset $(git rev-parse --local-env-vars); git -C ../foreign-repo describe)
+------------
+
 `git init` may copy hooks to the new repository, depending on its
 configuration. See the "TEMPLATE DIRECTORY" section in
 linkgit:git-init[1] for details. When the rest of this document refers