diff mbox series

[v3,4/6] hook: allow running non-native hooks

Message ID 20210819033450.3382652-5-emilyshaffer@google.com (mailing list archive)
State New, archived
Headers show
Series config-based hooks restarted | expand

Commit Message

Emily Shaffer Aug. 19, 2021, 3:34 a.m. UTC
As the hook architecture and 'git hook run' become more featureful, we
may find wrappers wanting to use the hook architecture to run their own
hooks, thereby getting nice things like parallelism and idiomatic Git
configuration for free. Enable this by letting 'git hook run' bypass the
known_hooks() check.

We do still want to keep known_hooks() around, though - by die()ing when
an internal Git call asks for run_hooks("my-new-hook"), we can remind
Git developers to update Documentation/githooks.txt with their new hook,
which in turn helps Git users discover this new hook.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt |  8 ++++++++
 builtin/hook.c             |  4 ++--
 hook.c                     | 35 +++++++++++++++++++++++++++++++----
 hook.h                     | 14 ++++++++++++++
 4 files changed, 55 insertions(+), 6 deletions(-)

Comments

Ævar Arnfjörð Bjarmason Aug. 24, 2021, 3:55 p.m. UTC | #1
On Wed, Aug 18 2021, Emily Shaffer wrote:

> As the hook architecture and 'git hook run' become more featureful, we
> may find wrappers wanting to use the hook architecture to run their own
> hooks, thereby getting nice things like parallelism and idiomatic Git
> configuration for free. Enable this by letting 'git hook run' bypass the
> known_hooks() check.
>
> We do still want to keep known_hooks() around, though - by die()ing when
> an internal Git call asks for run_hooks("my-new-hook"), we can remind
> Git developers to update Documentation/githooks.txt with their new hook,
> which in turn helps Git users discover this new hook.
>
> [...]
>
> +It's possible to use this command to refer to hooks which are not native to Git,
> +for example if a wrapper around Git wishes to expose hooks into its own
> +operation in a way which is already familiar to Git users. However, wrappers
> +invoking such hooks should be careful to name their hook events something which
> +Git is unlikely to use for a native hook later on. For example, Git is much less
> +likely to create a `mytool-validate-commit` hook than it is to create a
> +`validate-commit` hook.
> +
>  SUBCOMMANDS
>  -----------

The goal here makes sense, but...

> diff --git a/builtin/hook.c b/builtin/hook.c
> index d21f303eca..80397d39f5 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -46,7 +46,7 @@ static int list(int argc, const char **argv, const char *prefix)
>  
>  	hookname = argv[0];
>  
> -	head = hook_list(hookname);
> +	head = list_hooks_gently(hookname);
>  
>  	if (list_empty(head))
>  		return 1;
> @@ -105,7 +105,7 @@ static int run(int argc, const char **argv, const char *prefix)
>  	git_config(git_default_config, NULL);
>  
>  	hook_name = argv[0];
> -	hooks = list_hooks(hook_name);
> +	hooks = list_hooks_gently(hook_name);
>  	if (list_empty(hooks)) {
>  		/* ... act like run_hooks_oneshot() under --ignore-missing */
>  		if (ignore_missing)

This introduces a bug v.s. the previous state, e.g. before:

    $ git hook run --ignore-missing foobar
    fatal: the hook 'foobar' is not known to git, should be in hook-list.h via githooks(5)

But after we'll silently ignore it. I.e. we've conflated
--ignore-missing with a new and hypothetical (and this is now a synonym
of) --ignore-missing-and-allow-unknown-hook-names.

So we've conflated the user's one-shot "foobar" script with wanting to
catch a typo in e.g. git-send-email.perl.

Also instead of the user's typos being caught with a die (here using
your BUG(...) version):

    $ git hook list pre-recive
    BUG: hook.c:115: Don't recognize hook event 'pre-recive'! Is it documented in Documentation/githooks.txt?
    Aborted

We'll now silently return 1, so indistinguishabl from typing it properly
as pre-receive.

All that being said I think it's arguable that if we're going to allow
"git hook run blahblah" that the die() in the base topic in my
"hook-list.h: add a generated list of hooks, like config-list.h" is more
trouble than it's worth.

I.e. do we really need to be concerned about new hooks being added and
someone forgetting a githooks.txt update, or a typo in the git.git code
that nobody notices? Probably not.

But I think the change here is clearly broken vis-a-vis the stated goals
of its commit message as it stands, i.e. "[...]we do still want to keep
known_hooks() around, though[...]". Should we fix it by adding a new
internal-only flag to the command, or just saying we shouldn't have the
behavior at all? What do you think.

Aside from that, this change seems to be untested, I tried making this
non-gentle for testing, and all tests still passed. I.e. we don't have
any tests for running such a hook like mytool-validate-commit, but
should as part of this change.
Emily Shaffer Aug. 26, 2021, 10:50 p.m. UTC | #2
On Tue, Aug 24, 2021 at 05:55:13PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Wed, Aug 18 2021, Emily Shaffer wrote:
> 
> > As the hook architecture and 'git hook run' become more featureful, we
> > may find wrappers wanting to use the hook architecture to run their own
> > hooks, thereby getting nice things like parallelism and idiomatic Git
> > configuration for free. Enable this by letting 'git hook run' bypass the
> > known_hooks() check.
> >
> > We do still want to keep known_hooks() around, though - by die()ing when
> > an internal Git call asks for run_hooks("my-new-hook"), we can remind
> > Git developers to update Documentation/githooks.txt with their new hook,
> > which in turn helps Git users discover this new hook.
> >
> > [...]
> >
> > +It's possible to use this command to refer to hooks which are not native to Git,
> > +for example if a wrapper around Git wishes to expose hooks into its own
> > +operation in a way which is already familiar to Git users. However, wrappers
> > +invoking such hooks should be careful to name their hook events something which
> > +Git is unlikely to use for a native hook later on. For example, Git is much less
> > +likely to create a `mytool-validate-commit` hook than it is to create a
> > +`validate-commit` hook.
> > +
> >  SUBCOMMANDS
> >  -----------
> 
> The goal here makes sense, but...
> 
> > diff --git a/builtin/hook.c b/builtin/hook.c
> > index d21f303eca..80397d39f5 100644
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -46,7 +46,7 @@ static int list(int argc, const char **argv, const char *prefix)
> >  
> >  	hookname = argv[0];
> >  
> > -	head = hook_list(hookname);
> > +	head = list_hooks_gently(hookname);
> >  
> >  	if (list_empty(head))
> >  		return 1;
> > @@ -105,7 +105,7 @@ static int run(int argc, const char **argv, const char *prefix)
> >  	git_config(git_default_config, NULL);
> >  
> >  	hook_name = argv[0];
> > -	hooks = list_hooks(hook_name);
> > +	hooks = list_hooks_gently(hook_name);
> >  	if (list_empty(hooks)) {
> >  		/* ... act like run_hooks_oneshot() under --ignore-missing */
> >  		if (ignore_missing)
> 
> This introduces a bug v.s. the previous state, e.g. before:
> 
>     $ git hook run --ignore-missing foobar
>     fatal: the hook 'foobar' is not known to git, should be in hook-list.h via githooks(5)
> 
> But after we'll silently ignore it. I.e. we've conflated
> --ignore-missing with a new and hypothetical (and this is now a synonym
> of) --ignore-missing-and-allow-unknown-hook-names.
> 
> So we've conflated the user's one-shot "foobar" script with wanting to
> catch a typo in e.g. git-send-email.perl.
> 
> Also instead of the user's typos being caught with a die (here using
> your BUG(...) version):
> 
>     $ git hook list pre-recive
>     BUG: hook.c:115: Don't recognize hook event 'pre-recive'! Is it documented in Documentation/githooks.txt?
>     Aborted
> 
> We'll now silently return 1, so indistinguishabl from typing it properly
> as pre-receive.
> 
> All that being said I think it's arguable that if we're going to allow
> "git hook run blahblah" that the die() in the base topic in my
> "hook-list.h: add a generated list of hooks, like config-list.h" is more
> trouble than it's worth.
> 
> I.e. do we really need to be concerned about new hooks being added and
> someone forgetting a githooks.txt update, or a typo in the git.git code
> that nobody notices? Probably not.
> 
> But I think the change here is clearly broken vis-a-vis the stated goals
> of its commit message as it stands, i.e. "[...]we do still want to keep
> known_hooks() around, though[...]". Should we fix it by adding a new
> internal-only flag to the command, or just saying we shouldn't have the
> behavior at all? What do you think.

I think it's A) pretty important to make it easy for users to run
whatever not-necessarily-git-native hook they want, and B) useful for
script Git commands to take advantage of the typo check. So, I'll add a
`--enforce-known-hookname` (or maybe a better named one, this isn't my
strong suit) and switch git-send-email and friends to use it. Like we
discussed off-list, I think it's a good idea to drop the envvar for
exceptional test names from the codebase entirely.

> 
> Aside from that, this change seems to be untested, I tried making this
> non-gentle for testing, and all tests still passed. I.e. we don't have
> any tests for running such a hook like mytool-validate-commit, but
> should as part of this change.

Sure.


Actually, I was in the middle of typing about how I wouldn't change your
'test-hook' and so on tests, and it occurs to me that it might actually
be a better fit for your series to add this --reject-unknown (or
whatever) flag, instead of the envvar magics. So I'll hold off on making
any changes unless I hear from you.

 - Emily
Junio C Hamano Aug. 27, 2021, 12:22 a.m. UTC | #3
Emily Shaffer <emilyshaffer@google.com> writes:

> I think it's A) pretty important to make it easy for users to run
> whatever not-necessarily-git-native hook they want, and B) useful for
> script Git commands to take advantage of the typo check. So, I'll add a
> `--enforce-known-hookname` (or maybe a better named one, this isn't my
> strong suit) and switch git-send-email and friends to use it.

I somehow feel this is backwards.  

Once you write the invocation of "git hook run <hookname>" into your
script and tested it, there is no further need for typo checking.

What's the use case you are trying to help with typo checking?  When
a script takes a hookname from the user and runs "git hook run $1",
then passing --this-must-be-a-known-hook option that errors out when
the named hook does not exist and unrecognised (there is no need to
error out if a hook with unusual name the user gave us does
exist---the user asked us to run it, so we just can run it) might
make sense.  But I am somehow not getting the sense that it is the
expected use case you are worried about.

If the reason why you are making the typo-checking an opt-in feature
is because you want to allow users to run "git hook run" with
minimum typing, I suspect that you may be optimizing for a wrong
case.  Interactive users are the ones that benefit from
typo-checking the most, and if they are interactive (as opposed to
being a script), they are flexible enough not to say "git hook run
foobar" when they know foobar does not exist and they know foobar is
not a generally accepted hook, no?  So, I think it makes more sense
to by default allow a hook with a recognizable name to be missing,
but complain when a randomly named hook is missing, and to have an
option that permits a hook to be unrecognised and missing.
diff mbox series

Patch

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 701ada9fc0..d1db084e4f 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -19,6 +19,14 @@  This command is an interface to git hooks (see linkgit:githooks[5]).
 Currently it only provides a convenience wrapper for running hooks for
 use by git itself. In the future it might gain other functionality.
 
+It's possible to use this command to refer to hooks which are not native to Git,
+for example if a wrapper around Git wishes to expose hooks into its own
+operation in a way which is already familiar to Git users. However, wrappers
+invoking such hooks should be careful to name their hook events something which
+Git is unlikely to use for a native hook later on. For example, Git is much less
+likely to create a `mytool-validate-commit` hook than it is to create a
+`validate-commit` hook.
+
 SUBCOMMANDS
 -----------
 
diff --git a/builtin/hook.c b/builtin/hook.c
index d21f303eca..80397d39f5 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -46,7 +46,7 @@  static int list(int argc, const char **argv, const char *prefix)
 
 	hookname = argv[0];
 
-	head = hook_list(hookname);
+	head = list_hooks_gently(hookname);
 
 	if (list_empty(head))
 		return 1;
@@ -105,7 +105,7 @@  static int run(int argc, const char **argv, const char *prefix)
 	git_config(git_default_config, NULL);
 
 	hook_name = argv[0];
-	hooks = list_hooks(hook_name);
+	hooks = list_hooks_gently(hook_name);
 	if (list_empty(hooks)) {
 		/* ... act like run_hooks_oneshot() under --ignore-missing */
 		if (ignore_missing)
diff --git a/hook.c b/hook.c
index b1ea372e15..ab1e86ddcf 100644
--- a/hook.c
+++ b/hook.c
@@ -51,12 +51,21 @@  static int known_hook(const char *name)
 
 const char *find_hook(const char *name)
 {
-	static struct strbuf path = STRBUF_INIT;
+	const char *hook_path;
 
 	if (!known_hook(name))
-		die(_("the hook '%s' is not known to git, should be in hook-list.h via githooks(5)"),
+		BUG(_("the hook '%s' is not known to git, should be in hook-list.h via githooks(5)"),
 		    name);
 
+	hook_path = find_hook_gently(name);
+
+	return hook_path;
+}
+
+const char *find_hook_gently(const char *name)
+{
+	static struct strbuf path = STRBUF_INIT;
+
 	strbuf_reset(&path);
 	strbuf_git_path(&path, "hooks/%s", name);
 	if (access(path.buf, X_OK) < 0) {
@@ -92,10 +101,24 @@  int hook_exists(const char *name)
 	return !list_empty(list_hooks(name));
 }
 
+struct hook_config_cb
+{
+	struct strbuf *hook_key;
+	struct list_head *list;
+};
+
 struct list_head *list_hooks(const char *hookname)
 {
-	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
+	if (!known_hook(hookname))
+		BUG("Don't recognize hook event '%s'! "
+		    "Is it documented in Documentation/githooks.txt?",
+		    hookname);
+	return list_hooks_gently(hookname);
+}
 
+struct list_head *list_hooks_gently(const char *hookname)
+{
+	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
 
 	INIT_LIST_HEAD(hook_head);
 
@@ -103,7 +126,7 @@  struct list_head *list_hooks(const char *hookname)
 		BUG("null hookname was provided to hook_list()!");
 
 	if (have_git_dir()) {
-		const char *hook_path = find_hook(hookname);
+		const char *hook_path = find_hook_gently(hookname);
 		if (hook_path) {
 			struct hook *to_add = xmalloc(sizeof(*to_add));
 			to_add->hook_path = hook_path;
@@ -299,6 +322,10 @@  int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *options)
 	if (options->path_to_stdin && options->feed_pipe)
 		BUG("choose only one method to populate stdin");
 
+	/*
+	 * 'git hooks run <hookname>' uses run_hooks, so we don't need to
+	 * allow unknown hooknames here.
+	 */
 	hooks = list_hooks(hook_name);
 
 	/*
diff --git a/hook.h b/hook.h
index cd3123d290..6b7b2d14d2 100644
--- a/hook.h
+++ b/hook.h
@@ -9,8 +9,16 @@ 
  * Returns the path to the hook file, or NULL if the hook is missing
  * or disabled. Note that this points to static storage that will be
  * overwritten by further calls to find_hook and run_hook_*.
+ *
+ * If the hook is not a native hook (e.g. present in Documentation/githooks.txt)
+ * find_hook() will BUG(). find_hook_gently() does not consult the native hook
+ * list and will check for any hook name in the hooks directory regardless of
+ * whether it is known. find_hook() should be used by internal calls to hooks,
+ * and find_hook_gently() should only be used when the hookname was provided by
+ * the user, such as by 'git hook run my-wrapper-hook'.
  */
 const char *find_hook(const char *name);
+const char *find_hook_gently(const char *name);
 
 /*
  * A boolean version of find_hook()
@@ -32,8 +40,14 @@  struct hook {
 /*
  * Provides a linked list of 'struct hook' detailing commands which should run
  * in response to the 'hookname' event, in execution order.
+ *
+ * list_hooks() checks the provided hookname against Documentation/githooks.txt
+ * and BUG()s if it is not found.  list_hooks_gently() allows any hookname. The
+ * latter should only be used when the hook name is provided by the user, and
+ * the former should be used by internal callers.
  */
 struct list_head *list_hooks(const char *hookname);
+struct list_head *list_hooks_gently(const char *hookname);
 
 struct run_hooks_opt
 {