diff mbox series

[v3,5/9] bugreport: collect list of populated hooks

Message ID 20191025025129.250049-6-emilyshaffer@google.com (mailing list archive)
State New, archived
Headers show
Series add git-bugreport tool | expand

Commit Message

Emily Shaffer Oct. 25, 2019, 2:51 a.m. UTC
Occasionally a failure a user is seeing may be related to a specific
hook which is being run, perhaps without the user realizing. While the
contents of hooks can be sensitive - containing user data or process
information specific to the user's organization - simply knowing that a
hook is being run at a certain stage can help us to understand whether
something is going wrong.

Without a definitive list of hook names within the code, we compile our
own list from the documentation. This is likely prone to bitrot. To
reduce the amount of code humans need to read, we turn the list into a
string_list and iterate over it (as we are calling the same find_hook
operation on each string). However, since bugreport should primarily be
called by the user, the performance loss from massaging the string
seems acceptable.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 bugreport.c         | 44 ++++++++++++++++++++++++++++++++++++++++++++
 bugreport.h         |  6 ++++++
 builtin/bugreport.c |  4 ++++
 3 files changed, 54 insertions(+)

Comments

Johannes Schindelin Oct. 28, 2019, 2:31 p.m. UTC | #1
Hi Emily,

On Thu, 24 Oct 2019, Emily Shaffer wrote:

> Occasionally a failure a user is seeing may be related to a specific
> hook which is being run, perhaps without the user realizing. While the
> contents of hooks can be sensitive - containing user data or process
> information specific to the user's organization - simply knowing that a
> hook is being run at a certain stage can help us to understand whether
> something is going wrong.
>
> Without a definitive list of hook names within the code, we compile our
> own list from the documentation. This is likely prone to bitrot. To
> reduce the amount of code humans need to read, we turn the list into a
> string_list and iterate over it (as we are calling the same find_hook
> operation on each string). However, since bugreport should primarily be
> called by the user, the performance loss from massaging the string
> seems acceptable.

Good idea!

>
> Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> ---
>  bugreport.c         | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  bugreport.h         |  6 ++++++
>  builtin/bugreport.c |  4 ++++
>  3 files changed, 54 insertions(+)
>
> diff --git a/bugreport.c b/bugreport.c
> index afa4836ab1..9d7f44ff28 100644
> --- a/bugreport.c
> +++ b/bugreport.c
> @@ -103,3 +103,47 @@ void get_whitelisted_config(struct strbuf *config_info)
>  	strbuf_reset(config_info);
>  	strbuf_addbuf(config_info, &configs_and_values);
>  }
> +
> +void get_populated_hooks(struct strbuf *hook_info)
> +{
> +	/*
> +	 * Doesn't look like there is a list of all possible hooks; so below is
> +	 * a transcription of `git help hook`.
> +	 */
> +	const char *hooks = "applypatch-msg,"
> +			    "pre-applypatch,"
> +			    "post-applypatch,"
> +			    "pre-commit,"
> +			    "pre-merge-commit,"
> +			    "prepare-commit-msg,"
> +			    "commit-msg,"
> +			    "post-commit,"
> +			    "pre-rebase,"
> +			    "post-checkout,"
> +			    "post-merge,"
> +			    "pre-push,"
> +			    "pre-receive,"
> +			    "update,"
> +			    "post-receive,"
> +			    "post-update,"
> +			    "push-to-checkout,"
> +			    "pre-auto-gc,"
> +			    "post-rewrite,"
> +			    "sendemail-validate,"
> +			    "fsmonitor-watchman,"
> +			    "p4-pre-submit,"
> +			    "post-index-changex";

Well, this is disappointing: I tried to come up with a scripted way to
extract the commit names from our source code, and I failed! This is
where I gave up:

	git grep run_hook |
	sed -n 's/.*run_hook[^(]*([^,]*, *\([^,]*\).*/\1/p' |
	sed -e '/^name$/d' -e '/^const char \*name$/d' \
		-e 's/push_to_checkout_hook/"push-to-checkout"/' |
	sort |
	uniq

This prints only the following hook names:

	"applypatch-msg"
	"post-applypatch"
	"post-checkout"
	"post-index-change"
	"post-merge"
	"pre-applypatch"
	"pre-auto-gc"
	"pre-rebase"
	"prepare-commit-msg"
	"push-to-checkout"

But at least it was easy to script the extracting from the
Documentation:

	sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*//;p}}' \
		Documentation/githooks.txt

> +	struct string_list hooks_list = STRING_LIST_INIT_DUP;
> +	struct string_list_item *iter = NULL;
> +
> +	strbuf_reset(hook_info);
> +
> +	string_list_split(&hooks_list, hooks, ',', -1);
> +
> +	for_each_string_list_item(iter, &hooks_list) {

This should definitely be done at compile time, I think. We should be
able to generate an appropriate header via something like this:

	cat >hook-names.h <<-EOF
	static const char *hook_names[] = {
	$(sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*/",/;s/^/	"/;p}}' \
		Documentation/githooks.txt)
	};
	EOF

Then you would use a simple `for()` loop using `ARRAY_SIZE()` to
iterate over the hook names.

> +		if (find_hook(iter->string)) {
> +			strbuf_addstr(hook_info, iter->string);
> +			strbuf_complete_line(hook_info);
> +		}
> +	}
> +}
> diff --git a/bugreport.h b/bugreport.h
> index 7413e7e1be..942a5436e3 100644
> --- a/bugreport.h
> +++ b/bugreport.h
> @@ -12,3 +12,9 @@ void get_system_info(struct strbuf *sys_info);
>   * config_info will be discarded.
>   */
>  void get_whitelisted_config(struct strbuf *sys_info);
> +
> +/**
> + * Adds the paths to all configured hooks (but not their contents). The previous
> + * contents of hook_info will be discarded.
> + */
> +void get_populated_hooks(struct strbuf *hook_info);
> diff --git a/builtin/bugreport.c b/builtin/bugreport.c
> index 70fe0d2b85..a0eefba498 100644
> --- a/builtin/bugreport.c
> +++ b/builtin/bugreport.c
> @@ -60,6 +60,10 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix)
>  	get_whitelisted_config(&buffer);
>  	strbuf_write(&buffer, report);
>
> +	add_header(report, "Populated Hooks");

Wait! I should have stumbled over this in an earlier patch. The
`add_header()` function should not take a `FILE *` parameter at all, but
instead an `struct strbuf *` one!

Ciao,
Dscho

> +	get_populated_hooks(&buffer);
> +	strbuf_write(&buffer, report);
> +
>  	fclose(report);
>
>  	launch_editor(report_path.buf, NULL, NULL);
> --
> 2.24.0.rc0.303.g954a862665-goog
>
>
Emily Shaffer Dec. 11, 2019, 8:51 p.m. UTC | #2
On Mon, Oct 28, 2019 at 03:31:36PM +0100, Johannes Schindelin wrote:
> Hi Emily,
> 
> On Thu, 24 Oct 2019, Emily Shaffer wrote:
> 
> > Occasionally a failure a user is seeing may be related to a specific
> > hook which is being run, perhaps without the user realizing. While the
> > contents of hooks can be sensitive - containing user data or process
> > information specific to the user's organization - simply knowing that a
> > hook is being run at a certain stage can help us to understand whether
> > something is going wrong.
> >
> > Without a definitive list of hook names within the code, we compile our
> > own list from the documentation. This is likely prone to bitrot. To
> > reduce the amount of code humans need to read, we turn the list into a
> > string_list and iterate over it (as we are calling the same find_hook
> > operation on each string). However, since bugreport should primarily be
> > called by the user, the performance loss from massaging the string
> > seems acceptable.
> 
> Good idea!
> 
> >
> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
> > ---
> >  bugreport.c         | 44 ++++++++++++++++++++++++++++++++++++++++++++
> >  bugreport.h         |  6 ++++++
> >  builtin/bugreport.c |  4 ++++
> >  3 files changed, 54 insertions(+)
> >
> > diff --git a/bugreport.c b/bugreport.c
> > index afa4836ab1..9d7f44ff28 100644
> > --- a/bugreport.c
> > +++ b/bugreport.c
> > @@ -103,3 +103,47 @@ void get_whitelisted_config(struct strbuf *config_info)
> >  	strbuf_reset(config_info);
> >  	strbuf_addbuf(config_info, &configs_and_values);
> >  }
> > +
> > +void get_populated_hooks(struct strbuf *hook_info)
> > +{
> > +	/*
> > +	 * Doesn't look like there is a list of all possible hooks; so below is
> > +	 * a transcription of `git help hook`.
> > +	 */
> > +	const char *hooks = "applypatch-msg,"
> > +			    "pre-applypatch,"
> > +			    "post-applypatch,"
> > +			    "pre-commit,"
> > +			    "pre-merge-commit,"
> > +			    "prepare-commit-msg,"
> > +			    "commit-msg,"
> > +			    "post-commit,"
> > +			    "pre-rebase,"
> > +			    "post-checkout,"
> > +			    "post-merge,"
> > +			    "pre-push,"
> > +			    "pre-receive,"
> > +			    "update,"
> > +			    "post-receive,"
> > +			    "post-update,"
> > +			    "push-to-checkout,"
> > +			    "pre-auto-gc,"
> > +			    "post-rewrite,"
> > +			    "sendemail-validate,"
> > +			    "fsmonitor-watchman,"
> > +			    "p4-pre-submit,"
> > +			    "post-index-changex";
> 
> Well, this is disappointing: I tried to come up with a scripted way to
> extract the commit names from our source code, and I failed! This is
> where I gave up:
> 
> 	git grep run_hook |
> 	sed -n 's/.*run_hook[^(]*([^,]*, *\([^,]*\).*/\1/p' |
> 	sed -e '/^name$/d' -e '/^const char \*name$/d' \
> 		-e 's/push_to_checkout_hook/"push-to-checkout"/' |
> 	sort |
> 	uniq
> 
> This prints only the following hook names:
> 
> 	"applypatch-msg"
> 	"post-applypatch"
> 	"post-checkout"
> 	"post-index-change"
> 	"post-merge"
> 	"pre-applypatch"
> 	"pre-auto-gc"
> 	"pre-rebase"
> 	"prepare-commit-msg"
> 	"push-to-checkout"
> 
> But at least it was easy to script the extracting from the
> Documentation:
> 
> 	sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*//;p}}' \
> 		Documentation/githooks.txt
> 

To be totally frank, I'm not keen on spending a lot of time with
scripting this. My parallel work with config-based hooks will also
involve an in-code source of truth for available hooknames; I'd like to
punt on some scripting, putting it in the makefile, etc blah since I
know I'm planning to fix this particular annoyance at the source - and
since it looks like bugreport will stay in the review phase for at least
a little longer.

> > +	struct string_list hooks_list = STRING_LIST_INIT_DUP;
> > +	struct string_list_item *iter = NULL;
> > +
> > +	strbuf_reset(hook_info);
> > +
> > +	string_list_split(&hooks_list, hooks, ',', -1);
> > +
> > +	for_each_string_list_item(iter, &hooks_list) {
> 
> This should definitely be done at compile time, I think. We should be
> able to generate an appropriate header via something like this:
> 
> 	cat >hook-names.h <<-EOF
> 	static const char *hook_names[] = {
> 	$(sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*/",/;s/^/	"/;p}}' \
> 		Documentation/githooks.txt)
> 	};
> 	EOF
> 
> Then you would use a simple `for()` loop using `ARRAY_SIZE()` to
> iterate over the hook names.
> 
> > +		if (find_hook(iter->string)) {
> > +			strbuf_addstr(hook_info, iter->string);
> > +			strbuf_complete_line(hook_info);
> > +		}
> > +	}
> > +}
> > diff --git a/bugreport.h b/bugreport.h
> > index 7413e7e1be..942a5436e3 100644
> > --- a/bugreport.h
> > +++ b/bugreport.h
> > @@ -12,3 +12,9 @@ void get_system_info(struct strbuf *sys_info);
> >   * config_info will be discarded.
> >   */
> >  void get_whitelisted_config(struct strbuf *sys_info);
> > +
> > +/**
> > + * Adds the paths to all configured hooks (but not their contents). The previous
> > + * contents of hook_info will be discarded.
> > + */
> > +void get_populated_hooks(struct strbuf *hook_info);
> > diff --git a/builtin/bugreport.c b/builtin/bugreport.c
> > index 70fe0d2b85..a0eefba498 100644
> > --- a/builtin/bugreport.c
> > +++ b/builtin/bugreport.c
> > @@ -60,6 +60,10 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix)
> >  	get_whitelisted_config(&buffer);
> >  	strbuf_write(&buffer, report);
> >
> > +	add_header(report, "Populated Hooks");
> 
> Wait! I should have stumbled over this in an earlier patch. The
> `add_header()` function should not take a `FILE *` parameter at all, but
> instead an `struct strbuf *` one!
> 
> Ciao,
> Dscho
> 
> > +	get_populated_hooks(&buffer);
> > +	strbuf_write(&buffer, report);
> > +
> >  	fclose(report);
> >
> >  	launch_editor(report_path.buf, NULL, NULL);
> > --
> > 2.24.0.rc0.303.g954a862665-goog
> >
> >
Johannes Schindelin Dec. 15, 2019, 5:40 p.m. UTC | #3
Hi Emily,

On Wed, 11 Dec 2019, Emily Shaffer wrote:

> On Mon, Oct 28, 2019 at 03:31:36PM +0100, Johannes Schindelin wrote:
>
> > On Thu, 24 Oct 2019, Emily Shaffer wrote:
> >
> > > diff --git a/bugreport.c b/bugreport.c
> > > index afa4836ab1..9d7f44ff28 100644
> > > --- a/bugreport.c
> > > +++ b/bugreport.c
> > > @@ -103,3 +103,47 @@ void get_whitelisted_config(struct strbuf *config_info)
> > >  	strbuf_reset(config_info);
> > >  	strbuf_addbuf(config_info, &configs_and_values);
> > >  }
> > > +
> > > +void get_populated_hooks(struct strbuf *hook_info)
> > > +{
> > > +	/*
> > > +	 * Doesn't look like there is a list of all possible hooks; so below is
> > > +	 * a transcription of `git help hook`.
> > > +	 */
> > > +	const char *hooks = "applypatch-msg,"
> > > +			    "pre-applypatch,"
> > > +			    "post-applypatch,"
> > > +			    "pre-commit,"
> > > +			    "pre-merge-commit,"
> > > +			    "prepare-commit-msg,"
> > > +			    "commit-msg,"
> > > +			    "post-commit,"
> > > +			    "pre-rebase,"
> > > +			    "post-checkout,"
> > > +			    "post-merge,"
> > > +			    "pre-push,"
> > > +			    "pre-receive,"
> > > +			    "update,"
> > > +			    "post-receive,"
> > > +			    "post-update,"
> > > +			    "push-to-checkout,"
> > > +			    "pre-auto-gc,"
> > > +			    "post-rewrite,"
> > > +			    "sendemail-validate,"
> > > +			    "fsmonitor-watchman,"
> > > +			    "p4-pre-submit,"
> > > +			    "post-index-changex";
> >
> > Well, this is disappointing: I tried to come up with a scripted way to
> > extract the commit names from our source code, and I failed! This is
> > where I gave up:
> >
> > 	git grep run_hook |
> > 	sed -n 's/.*run_hook[^(]*([^,]*, *\([^,]*\).*/\1/p' |
> > 	sed -e '/^name$/d' -e '/^const char \*name$/d' \
> > 		-e 's/push_to_checkout_hook/"push-to-checkout"/' |
> > 	sort |
> > 	uniq
> >
> > This prints only the following hook names:
> >
> > 	"applypatch-msg"
> > 	"post-applypatch"
> > 	"post-checkout"
> > 	"post-index-change"
> > 	"post-merge"
> > 	"pre-applypatch"
> > 	"pre-auto-gc"
> > 	"pre-rebase"
> > 	"prepare-commit-msg"
> > 	"push-to-checkout"
> >
> > But at least it was easy to script the extracting from the
> > Documentation:
> >
> > 	sed -n '/^[a-z]/{N;/\n~~~/{s/\n.*//;p}}' \
> > 		Documentation/githooks.txt
> >
>
> To be totally frank, I'm not keen on spending a lot of time with
> scripting this. My parallel work with config-based hooks will also
> involve an in-code source of truth for available hooknames; I'd like to
> punt on some scripting, putting it in the makefile, etc blah since I
> know I'm planning to fix this particular annoyance at the source - and
> since it looks like bugreport will stay in the review phase for at least
> a little longer.

Fair enough, especially if it addresses my complaint about the
scriptability.

Ciao,
Dscho
diff mbox series

Patch

diff --git a/bugreport.c b/bugreport.c
index afa4836ab1..9d7f44ff28 100644
--- a/bugreport.c
+++ b/bugreport.c
@@ -103,3 +103,47 @@  void get_whitelisted_config(struct strbuf *config_info)
 	strbuf_reset(config_info);
 	strbuf_addbuf(config_info, &configs_and_values);
 }
+
+void get_populated_hooks(struct strbuf *hook_info)
+{
+	/*
+	 * Doesn't look like there is a list of all possible hooks; so below is
+	 * a transcription of `git help hook`.
+	 */
+	const char *hooks = "applypatch-msg,"
+			    "pre-applypatch,"
+			    "post-applypatch,"
+			    "pre-commit,"
+			    "pre-merge-commit,"
+			    "prepare-commit-msg,"
+			    "commit-msg,"
+			    "post-commit,"
+			    "pre-rebase,"
+			    "post-checkout,"
+			    "post-merge,"
+			    "pre-push,"
+			    "pre-receive,"
+			    "update,"
+			    "post-receive,"
+			    "post-update,"
+			    "push-to-checkout,"
+			    "pre-auto-gc,"
+			    "post-rewrite,"
+			    "sendemail-validate,"
+			    "fsmonitor-watchman,"
+			    "p4-pre-submit,"
+			    "post-index-changex";
+	struct string_list hooks_list = STRING_LIST_INIT_DUP;
+	struct string_list_item *iter = NULL;
+
+	strbuf_reset(hook_info);
+
+	string_list_split(&hooks_list, hooks, ',', -1);
+
+	for_each_string_list_item(iter, &hooks_list) {
+		if (find_hook(iter->string)) {
+			strbuf_addstr(hook_info, iter->string);
+			strbuf_complete_line(hook_info);
+		}
+	}
+}
diff --git a/bugreport.h b/bugreport.h
index 7413e7e1be..942a5436e3 100644
--- a/bugreport.h
+++ b/bugreport.h
@@ -12,3 +12,9 @@  void get_system_info(struct strbuf *sys_info);
  * config_info will be discarded.
  */
 void get_whitelisted_config(struct strbuf *sys_info);
+
+/**
+ * Adds the paths to all configured hooks (but not their contents). The previous
+ * contents of hook_info will be discarded.
+ */
+void get_populated_hooks(struct strbuf *hook_info);
diff --git a/builtin/bugreport.c b/builtin/bugreport.c
index 70fe0d2b85..a0eefba498 100644
--- a/builtin/bugreport.c
+++ b/builtin/bugreport.c
@@ -60,6 +60,10 @@  int cmd_bugreport(int argc, const char **argv, const char *prefix)
 	get_whitelisted_config(&buffer);
 	strbuf_write(&buffer, report);
 
+	add_header(report, "Populated Hooks");
+	get_populated_hooks(&buffer);
+	strbuf_write(&buffer, report);
+
 	fclose(report);
 
 	launch_editor(report_path.buf, NULL, NULL);