diff mbox series

[v2,4/6] hook: allow running non-native hooks

Message ID 20210812004258.74318-5-emilyshaffer@google.com (mailing list archive)
State Superseded
Headers show
Series config-based hooks restarted | expand

Commit Message

Emily Shaffer Aug. 12, 2021, 12:42 a.m. UTC
As the hook architecture and 'git hook run' become more featureful, we
may find wrappers wanting to use the hook architecture to run their own
hooks, thereby getting nice things like parallelism and idiomatic Git
configuration for free. Enable this by letting 'git hook run' bypass the
known_hooks() check.

We do still want to keep known_hooks() around, though - by die()ing when
an internal Git call asks for run_hooks("my-new-hook"), we can remind
Git developers to update Documentation/githooks.txt with their new hook,
which in turn helps Git users discover this new hook.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Documentation/git-hook.txt |  8 ++++++++
 builtin/hook.c             |  4 ++--
 hook.c                     | 32 ++++++++++++++++++++++++++++----
 hook.h                     | 16 +++++++++++++++-
 4 files changed, 53 insertions(+), 7 deletions(-)

Comments

Junio C Hamano Aug. 12, 2021, 7:08 p.m. UTC | #1
Emily Shaffer <emilyshaffer@google.com> writes:

> diff --git a/builtin/hook.c b/builtin/hook.c
> index c36b05376c..3aa65dd791 100644
> --- a/builtin/hook.c
> +++ b/builtin/hook.c
> @@ -46,7 +46,7 @@ static int list(int argc, const char **argv, const char *prefix)
>  
>  	hookname = argv[0];
>  
> -	head = hook_list(hookname);
> +	head = hook_list(hookname, 1);
>  
>  	if (list_empty(head)) {
>  		printf(_("no commands configured for hook '%s'\n"),
> @@ -108,7 +108,7 @@ static int run(int argc, const char **argv, const char *prefix)
>  	git_config(git_default_config, NULL);
>  
>  	hook_name = argv[0];
> -	hooks = hook_list(hook_name);
> +	hooks = hook_list(hook_name, 1);
>  	if (list_empty(hooks)) {
>  		/* ... act like run_hooks_oneshot() under --ignore-missing */
>  		if (ignore_missing)

This is minor, as I expect that the callers of hook_list() will
always confined in builtin/hook.c, but it probably is easier to read
if you gave two functions, just like you have the pair of helpers
find_hook() and find_hook_gently(), as the literal "1" forces the
readers to remember if that means "die if not found", or "ok if it
is a bogus name".

In addition, it may make more sense to keep hook_list() signal
failure by returning NULL and leave the dying to the caller.
In-code callers (as opposed to "git hook run" that can throw any
random string that came from the user at the API) will never throw a
bogus name unless there is a bug, and they'll have to check for an
error return from hook_list() anyway and the error message they
would give may have to be different from the one that is given
against a hook name randomly thrown at us by the user.
Emily Shaffer Aug. 18, 2021, 8:51 p.m. UTC | #2
On Thu, Aug 12, 2021 at 12:08:10PM -0700, Junio C Hamano wrote:
> 
> Emily Shaffer <emilyshaffer@google.com> writes:
> 
> > diff --git a/builtin/hook.c b/builtin/hook.c
> > index c36b05376c..3aa65dd791 100644
> > --- a/builtin/hook.c
> > +++ b/builtin/hook.c
> > @@ -46,7 +46,7 @@ static int list(int argc, const char **argv, const char *prefix)
> >  
> >  	hookname = argv[0];
> >  
> > -	head = hook_list(hookname);
> > +	head = hook_list(hookname, 1);
> >  
> >  	if (list_empty(head)) {
> >  		printf(_("no commands configured for hook '%s'\n"),
> > @@ -108,7 +108,7 @@ static int run(int argc, const char **argv, const char *prefix)
> >  	git_config(git_default_config, NULL);
> >  
> >  	hook_name = argv[0];
> > -	hooks = hook_list(hook_name);
> > +	hooks = hook_list(hook_name, 1);
> >  	if (list_empty(hooks)) {
> >  		/* ... act like run_hooks_oneshot() under --ignore-missing */
> >  		if (ignore_missing)
> 
> This is minor, as I expect that the callers of hook_list() will
> always confined in builtin/hook.c, but it probably is easier to read
> if you gave two functions, just like you have the pair of helpers
> find_hook() and find_hook_gently(), as the literal "1" forces the
> readers to remember if that means "die if not found", or "ok if it
> is a bogus name".

Yes, I see what you mean. Ok. I have been wanting to change the naming
anyways - most functions in hook.h are verb-y ("find hook", "run hooks",
so on) but hook_list stands out as the only noun-y function.

So I considered changing it to "list_hooks" and "list_hooks_gently", to align
with find_hook(_gently)....

> 
> In addition, it may make more sense to keep hook_list() signal
> failure by returning NULL and leave the dying to the caller.
> In-code callers (as opposed to "git hook run" that can throw any
> random string that came from the user at the API) will never throw a
> bogus name unless there is a bug, and they'll have to check for an
> error return from hook_list() anyway and the error message they
> would give may have to be different from the one that is given
> against a hook name randomly thrown at us by the user.

Sure, that makes sense enough... but then I wonder if it would be better
to let the caller check whether the name is allowed at all, first,
separately from the hook_list() call.

On the one hand, having hook_list() do the validation of the hook name
makes it harder for a hook doing something very unusual to neglect to
add documentation. (I'm thinking of, for example, a hook doing something
equally weird to the proc-receive hook, which cannot use the hook
library because it needs to be able to do this weird two-way
communication thing.
(https://lore.kernel.org/git/20210527000856.695702-31-emilyshaffer%40google.com))
It would be pretty bad for a hook which is already complicated to also
forget to include documentation.

On the other hand, as it is now - builtin/hook.c hardcodes "I don't care
if the hook is unknown" and hook.c hardcodes "reject if the hook is
unknown" and nobody else calls hook_list at all - it isn't so bad to
bail early, before even calling hook_list() in the first place, if the
hook is unknown.

I also think that approach would make a callsite easier to understand
than checking for null from hook_list().

  const char *hookname = "my-new-hook";

  /* Here it's pretty clear what the reason for the error was... */
  if (!known_hook(hookname))
    BUG("is hook '%s' in Documentation/githooks.txt?", hookname);

  hooks = hook_list(hookname);
  ...

vs.

  const char *hookname = "my-new-hook";
  hooks = hook_list(hookname);
  /*
   * But here, I have to go and look at the hook_list() source to
   * understand why null 'hooks' means I missed some doc step.
   */
  if (!hookname)
    BUG("is hook '%s' in Documentation/githooks.txt?", hookname);
  ...

Maybe others disagree with me, but I would guess the first example is
more easily understandable to someone unfamiliar with the hook code. So
I think I will go with that approach, and include some notice in the doc
comment over hook_list().

 - Emily
Emily Shaffer Aug. 18, 2021, 9:14 p.m. UTC | #3
On Wed, Aug 18, 2021 at 01:51:58PM -0700, Emily Shaffer wrote:
> 
> On Thu, Aug 12, 2021 at 12:08:10PM -0700, Junio C Hamano wrote:
> > 
> > Emily Shaffer <emilyshaffer@google.com> writes:
> > 
> > > diff --git a/builtin/hook.c b/builtin/hook.c
> > > index c36b05376c..3aa65dd791 100644
> > > --- a/builtin/hook.c
> > > +++ b/builtin/hook.c
> > > @@ -46,7 +46,7 @@ static int list(int argc, const char **argv, const char *prefix)
> > >  
> > >  	hookname = argv[0];
> > >  
> > > -	head = hook_list(hookname);
> > > +	head = hook_list(hookname, 1);
> > >  
> > >  	if (list_empty(head)) {
> > >  		printf(_("no commands configured for hook '%s'\n"),
> > > @@ -108,7 +108,7 @@ static int run(int argc, const char **argv, const char *prefix)
> > >  	git_config(git_default_config, NULL);
> > >  
> > >  	hook_name = argv[0];
> > > -	hooks = hook_list(hook_name);
> > > +	hooks = hook_list(hook_name, 1);
> > >  	if (list_empty(hooks)) {
> > >  		/* ... act like run_hooks_oneshot() under --ignore-missing */
> > >  		if (ignore_missing)
> > 
> > This is minor, as I expect that the callers of hook_list() will
> > always confined in builtin/hook.c, but it probably is easier to read
> > if you gave two functions, just like you have the pair of helpers
> > find_hook() and find_hook_gently(), as the literal "1" forces the
> > readers to remember if that means "die if not found", or "ok if it
> > is a bogus name".
> 
> Yes, I see what you mean. Ok. I have been wanting to change the naming
> anyways - most functions in hook.h are verb-y ("find hook", "run hooks",
> so on) but hook_list stands out as the only noun-y function.
> 
> So I considered changing it to "list_hooks" and "list_hooks_gently", to align
> with find_hook(_gently)....
> 
> > 
> > In addition, it may make more sense to keep hook_list() signal
> > failure by returning NULL and leave the dying to the caller.
> > In-code callers (as opposed to "git hook run" that can throw any
> > random string that came from the user at the API) will never throw a
> > bogus name unless there is a bug, and they'll have to check for an
> > error return from hook_list() anyway and the error message they
> > would give may have to be different from the one that is given
> > against a hook name randomly thrown at us by the user.
> 
> Sure, that makes sense enough... but then I wonder if it would be better
> to let the caller check whether the name is allowed at all, first,
> separately from the hook_list() call.
> 
> On the one hand, having hook_list() do the validation of the hook name
> makes it harder for a hook doing something very unusual to neglect to
> add documentation. (I'm thinking of, for example, a hook doing something
> equally weird to the proc-receive hook, which cannot use the hook
> library because it needs to be able to do this weird two-way
> communication thing.
> (https://lore.kernel.org/git/20210527000856.695702-31-emilyshaffer%40google.com))
> It would be pretty bad for a hook which is already complicated to also
> forget to include documentation.
> 
> On the other hand, as it is now - builtin/hook.c hardcodes "I don't care
> if the hook is unknown" and hook.c hardcodes "reject if the hook is
> unknown" and nobody else calls hook_list at all - it isn't so bad to
> bail early, before even calling hook_list() in the first place, if the
> hook is unknown.
> 
> I also think that approach would make a callsite easier to understand
> than checking for null from hook_list().
> 
>   const char *hookname = "my-new-hook";
> 
>   /* Here it's pretty clear what the reason for the error was... */
>   if (!known_hook(hookname))
>     BUG("is hook '%s' in Documentation/githooks.txt?", hookname);
> 
>   hooks = hook_list(hookname);
>   ...
> 
> vs.
> 
>   const char *hookname = "my-new-hook";
>   hooks = hook_list(hookname);
>   /*
>    * But here, I have to go and look at the hook_list() source to
>    * understand why null 'hooks' means I missed some doc step.
>    */
>   if (!hookname)
>     BUG("is hook '%s' in Documentation/githooks.txt?", hookname);
>   ...
> 
> Maybe others disagree with me, but I would guess the first example is
> more easily understandable to someone unfamiliar with the hook code. So
> I think I will go with that approach, and include some notice in the doc
> comment over hook_list().

Hm. Now that I sit to type it, I guess putting the onus on the
strange-new-hook caller to also type "if known_hook()" is about the same
as just expecting the strange-new-hook caller to know they are supposed
to document their hook. Plus, known_hook() is static right now.

I think it still makes sense to BUG() instead of error() or die() in
'list_hooks()' (non-gently) - the failure of that call is a developer
error, either in not having documented their hook correctly or in
calling 'list_hooks()' instead of 'list_hooks_gently()' when they meant
the latter. So I will not take the NULL return approach.

 - Emily
Junio C Hamano Aug. 18, 2021, 9:24 p.m. UTC | #4
Emily Shaffer <emilyshaffer@google.com> writes:

>> Yes, I see what you mean. Ok. I have been wanting to change the naming
>> anyways - most functions in hook.h are verb-y ("find hook", "run hooks",
>> so on) but hook_list stands out as the only noun-y function.
>> 
>> So I considered changing it to "list_hooks" and "list_hooks_gently", to align
>> with find_hook(_gently)....

I do not claim I am goot at naming (or better than you at it
anyway), but list-hooks sounds to me like it is calling printf()
to show the hooks to the user, not computing a list of hooks and
returning it to the caller.

>> I also think that approach would make a callsite easier to understand
>> than checking for null from hook_list().
>> 
>>   const char *hookname = "my-new-hook";
>> 
>>   /* Here it's pretty clear what the reason for the error was... */
>>   if (!known_hook(hookname))
>>     BUG("is hook '%s' in Documentation/githooks.txt?", hookname);

Yes.  The callsite becomes easier to understand, and it separates
the responsibility between the helper to respond to "please give me
list of defined hooks" and its caller that may react to the returned
list with "ok, among these hooks, this does not look kosher for such
and such reason, so I'd die/warn/error" much cleaner.
diff mbox series

Patch

diff --git a/Documentation/git-hook.txt b/Documentation/git-hook.txt
index 8bf82b5dd4..11a8b87c60 100644
--- a/Documentation/git-hook.txt
+++ b/Documentation/git-hook.txt
@@ -18,6 +18,14 @@  This command is an interface to git hooks (see linkgit:githooks[5]).
 Currently it only provides a convenience wrapper for running hooks for
 use by git itself. In the future it might gain other functionality.
 
+It's possible to use this command to refer to hooks which are not native to Git,
+for example if a wrapper around Git wishes to expose hooks into its own
+operation in a way which is already familiar to Git users. However, wrappers
+invoking such hooks should be careful to name their hook events something which
+Git is unlikely to use for a native hook later on. For example, Git is much less
+likely to create a `mytool-validate-commit` hook than it is to create a
+`validate-commit` hook.
+
 SUBCOMMANDS
 -----------
 
diff --git a/builtin/hook.c b/builtin/hook.c
index c36b05376c..3aa65dd791 100644
--- a/builtin/hook.c
+++ b/builtin/hook.c
@@ -46,7 +46,7 @@  static int list(int argc, const char **argv, const char *prefix)
 
 	hookname = argv[0];
 
-	head = hook_list(hookname);
+	head = hook_list(hookname, 1);
 
 	if (list_empty(head)) {
 		printf(_("no commands configured for hook '%s'\n"),
@@ -108,7 +108,7 @@  static int run(int argc, const char **argv, const char *prefix)
 	git_config(git_default_config, NULL);
 
 	hook_name = argv[0];
-	hooks = hook_list(hook_name);
+	hooks = hook_list(hook_name, 1);
 	if (list_empty(hooks)) {
 		/* ... act like run_hooks_oneshot() under --ignore-missing */
 		if (ignore_missing)
diff --git a/hook.c b/hook.c
index 2714b63473..e5acd02a50 100644
--- a/hook.c
+++ b/hook.c
@@ -52,12 +52,21 @@  static int known_hook(const char *name)
 
 const char *find_hook(const char *name)
 {
-	static struct strbuf path = STRBUF_INIT;
+	const char *hook_path;
 
 	if (!known_hook(name))
 		die(_("the hook '%s' is not known to git, should be in hook-list.h via githooks(5)"),
 		    name);
 
+	hook_path = find_hook_gently(name);
+
+	return hook_path;
+}
+
+const char *find_hook_gently(const char *name)
+{
+	static struct strbuf path = STRBUF_INIT;
+
 	strbuf_reset(&path);
 	strbuf_git_path(&path, "hooks/%s", name);
 	if (access(path.buf, X_OK) < 0) {
@@ -93,10 +102,16 @@  int hook_exists(const char *name)
 	return !!find_hook(name);
 }
 
-struct list_head* hook_list(const char* hookname)
+struct hook_config_cb
+{
+	struct strbuf *hook_key;
+	struct list_head *list;
+};
+
+struct list_head* hook_list(const char* hookname, int allow_unknown)
 {
 	struct list_head *hook_head = xmalloc(sizeof(struct list_head));
-	const char *hook_path = find_hook(hookname);
+	const char *hook_path;
 
 
 	INIT_LIST_HEAD(hook_head);
@@ -104,6 +119,11 @@  struct list_head* hook_list(const char* hookname)
 	if (!hookname)
 		return NULL;
 
+	if (allow_unknown)
+		hook_path = find_hook_gently(hookname);
+	else
+		hook_path = find_hook(hookname);
+
 	/* Add the hook from the hookdir */
 	if (hook_path) {
 		struct hook *to_add = xmalloc(sizeof(*to_add));
@@ -299,7 +319,11 @@  int run_hooks_oneshot(const char *hook_name, struct run_hooks_opt *options)
 	if (options->path_to_stdin && options->feed_pipe)
 		BUG("choose only one method to populate stdin");
 
-	hooks = hook_list(hook_name);
+	/*
+	 * 'git hooks run <hookname>' uses run_found_hooks, so we don't need to
+	 * allow unknown hooknames here.
+	 */
+	hooks = hook_list(hook_name, 0);
 
 	/*
 	 * If you need to act on a missing hook, use run_found_hooks()
diff --git a/hook.h b/hook.h
index 4f90228a0c..ffa96c6e4d 100644
--- a/hook.h
+++ b/hook.h
@@ -9,8 +9,16 @@ 
  * Returns the path to the hook file, or NULL if the hook is missing
  * or disabled. Note that this points to static storage that will be
  * overwritten by further calls to find_hook and run_hook_*.
+ *
+ * If the hook is not a native hook (e.g. present in Documentation/githooks.txt)
+ * find_hook() will die(). find_hook_gently() does not consult the native hook
+ * list and will check for any hook name in the hooks directory regardless of
+ * whether it is known. find_hook() should be used by internal calls to hooks,
+ * and find_hook_gently() should only be used when the hookname was provided by
+ * the user, such as by 'git hook run my-wrapper-hook'.
  */
 const char *find_hook(const char *name);
+const char *find_hook_gently(const char *name);
 
 /*
  * A boolean version of find_hook()
@@ -32,8 +40,14 @@  struct hook {
 /*
  * Provides a linked list of 'struct hook' detailing commands which should run
  * in response to the 'hookname' event, in execution order.
+ *
+ * If allow_unknown is unset, hooks will be checked against the hook list
+ * generated from Documentation/githooks.txt. Otherwise, any hook name will be
+ * allowed. allow_unknown should only be set when the hook name is provided by
+ * the user; internal calls to hook_list should make sure the hook they are
+ * invoking is present in Documentation/githooks.txt.
  */
-struct list_head* hook_list(const char *hookname);
+struct list_head* hook_list(const char *hookname, int allow_unknown);
 
 struct run_hooks_opt
 {