[v5,10/15] bugreport: add config values from safelist
diff mbox series

Message ID 20200124033436.81097-11-emilyshaffer@google.com
State New
Headers show
Series
  • add git-bugreport tool
Related show

Commit Message

Emily Shaffer Jan. 24, 2020, 3:34 a.m. UTC
From: Emily Shaffer <emilyshaffer@google.com>

Teach bugreport to gather the values of config options which are present
in 'bugreport-config-safelist.h'.

Many config options are sensitive, and many Git add-ons use config
options which git-core does not know about; it is better only to gather
config options which we know to be safe, rather than excluding options
which we know to be unsafe.

Taking the build-time generated array and putting it into a set saves us
time - since git_config_bugreport() is called for every option the user
has configured, performing option lookup in constant time is a useful
optimization.

Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
---
 Makefile    |  2 +-
 bugreport.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+), 1 deletion(-)

Comments

Martin Ågren Jan. 30, 2020, 10:36 p.m. UTC | #1
On Fri, 24 Jan 2020 at 04:40, <emilyshaffer@google.com> wrote:
> --- a/Makefile
> +++ b/Makefile
> @@ -2465,7 +2465,7 @@ endif
>  git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
>         $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
>
> -git-bugreport$X: bugreport.o GIT-LDFLAGS $(GITLIBS)
> +git-bugreport$X: bugreport-config-safelist.h bugreport.o GIT-LDFLAGS $(GITLIBS)
>         $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
>                 $(LIBS)

Haven't looked at this patch at all, except I've noticed that something
is up with the dependencies. I think bugreport.o needs to depend on
bugreport-config-safelist.h. As it is, the latter may or may not be
available as bugreport.o is built. (Reproduces fairly well for me with
something like `make clean && make -j16`.)

I'll try to continue tomorrow. :-)


Martin
Martin Ågren Jan. 31, 2020, 9:25 p.m. UTC | #2
On Fri, 24 Jan 2020 at 04:40, <emilyshaffer@google.com> wrote:
> Taking the build-time generated array and putting it into a set saves us
> time - since git_config_bugreport() is called for every option the user
> has configured, performing option lookup in constant time is a useful
> optimization.

I'm sympathetic to your sending out what you have to obtain comments,
knowing that it's not perfect. It would have saved me some time and
effort if I'd known that this was the case though. An "[RFC]" tag,
perhaps. Or at least tweaking the above part of this commit message to
say that this might be over-engineered, with a reference to [1].

[1] https://lore.kernel.org/git/20200124032905.GA37541@google.com/

> +       int safelist_len = sizeof(bugreport_config_safelist) / sizeof(const char *);

I was going to suggest ARRAY_SIZE, but then I realized there are some
outstanding questions around whether you need this stuff in the first
place. I'd be inclined to guess that the first version of this would be
"for each safelisted item, obtain it and include", ignoring any "a.*.b"
business. In which case you wouldn't really need this hashset stuff.


Martin
Emily Shaffer Feb. 5, 2020, 1:34 a.m. UTC | #3
On Thu, Jan 30, 2020 at 11:36:43PM +0100, Martin Ågren wrote:
> On Fri, 24 Jan 2020 at 04:40, <emilyshaffer@google.com> wrote:
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -2465,7 +2465,7 @@ endif
> >  git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
> >         $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
> >
> > -git-bugreport$X: bugreport.o GIT-LDFLAGS $(GITLIBS)
> > +git-bugreport$X: bugreport-config-safelist.h bugreport.o GIT-LDFLAGS $(GITLIBS)
> >         $(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> >                 $(LIBS)
> 
> Haven't looked at this patch at all, except I've noticed that something
> is up with the dependencies. I think bugreport.o needs to depend on
> bugreport-config-safelist.h. As it is, the latter may or may not be
> available as bugreport.o is built. (Reproduces fairly well for me with
> something like `make clean && make -j16`.)

Hum, I thought that the ordering of the dependency list here was
ensured. But I see where help.o relies directly on command-list.h, so
I'll add a dependency for bugreport.o (and drop it from the direct
dependency for git-bugreport).

 - Emily
Emily Shaffer Feb. 5, 2020, 2:31 a.m. UTC | #4
On Fri, Jan 31, 2020 at 10:25:06PM +0100, Martin Ågren wrote:
> On Fri, 24 Jan 2020 at 04:40, <emilyshaffer@google.com> wrote:
> > Taking the build-time generated array and putting it into a set saves us
> > time - since git_config_bugreport() is called for every option the user
> > has configured, performing option lookup in constant time is a useful
> > optimization.
> 
> I'm sympathetic to your sending out what you have to obtain comments,
> knowing that it's not perfect. It would have saved me some time and
> effort if I'd known that this was the case though. An "[RFC]" tag,
> perhaps. Or at least tweaking the above part of this commit message to
> say that this might be over-engineered, with a reference to [1].
> 
> [1] https://lore.kernel.org/git/20200124032905.GA37541@google.com/

Yeah, you are right, and thanks for the feedback. I think I had wrapped
up the series before I realized there were those significant outstanding
comments, but I definitely should have said so in this patch.

> 
> > +       int safelist_len = sizeof(bugreport_config_safelist) / sizeof(const char *);
> 
> I was going to suggest ARRAY_SIZE, but then I realized there are some
> outstanding questions around whether you need this stuff in the first
> place. I'd be inclined to guess that the first version of this would be
> "for each safelisted item, obtain it and include", ignoring any "a.*.b"
> business. In which case you wouldn't really need this hashset stuff.

Regardless, I think it's worth doing nicely for now.

It seems to me that supporting wildcarding in the config safelist is a
superset of supporting static strings in that safelist - that is, if I
write it simply to support static strings, a later change to support
wildcards would be welcome and non-breaking.

So I'm going to clean this up without making it more featureful, for
now.

Sorry about the confusion!
 - Emily
Martin Ågren Feb. 5, 2020, 8:12 p.m. UTC | #5
On Wed, 5 Feb 2020 at 03:31, Emily Shaffer <emilyshaffer@google.com> wrote:
>
> On Fri, Jan 31, 2020 at 10:25:06PM +0100, Martin Ågren wrote:
> > On Fri, 24 Jan 2020 at 04:40, <emilyshaffer@google.com> wrote:
> > > Taking the build-time generated array and putting it into a set saves us
> > > time - since git_config_bugreport() is called for every option the user
> > > has configured, performing option lookup in constant time is a useful
> > > optimization.
> >
> > I'm sympathetic to your sending out what you have to obtain comments,
> > knowing that it's not perfect. It would have saved me some time and
> > effort if I'd known that this was the case though. An "[RFC]" tag,
> > perhaps. Or at least tweaking the above part of this commit message to
> > say that this might be over-engineered, with a reference to [1].
> >
> > [1] https://lore.kernel.org/git/20200124032905.GA37541@google.com/
>
> Yeah, you are right, and thanks for the feedback. I think I had wrapped
> up the series before I realized there were those significant outstanding
> comments, but I definitely should have said so in this patch.
>
> >
> > > +       int safelist_len = sizeof(bugreport_config_safelist) / sizeof(const char *);
> >
> > I was going to suggest ARRAY_SIZE, but then I realized there are some
> > outstanding questions around whether you need this stuff in the first
> > place. I'd be inclined to guess that the first version of this would be
> > "for each safelisted item, obtain it and include", ignoring any "a.*.b"
> > business. In which case you wouldn't really need this hashset stuff.
>
> Regardless, I think it's worth doing nicely for now.
>
> It seems to me that supporting wildcarding in the config safelist is a
> superset of supporting static strings in that safelist - that is, if I
> write it simply to support static strings, a later change to support
> wildcards would be welcome and non-breaking.

Yeah, agreed. Also, whatever magic the code for gathering the config
items knows, the annotations in Documentation/, i.e., the generated list
in bugreport-config-safelist.h will be "synced". That is, until we know
how to handle "a.*.b" in the safelist, we probably won't be marking
"a.*.b" as safe. And even if we have some crazy mixed build, it's a
*safe*list, so we should be fine. (Famous last words?)

On that topic though, and just so we don't mess up from the beginning,
in the documentation, we typically write something like
"branch.<name>.merge", i.e., not "branch.*.merge". So I guess the
"a.*.b" feature would be more like "look for '.<', something, something,
'>.'." Would that be sufficient? Or would we actually want more
fine-grained control, i.e., something like regexes?

If so, we'd need to provide those regexes somehow, e.g., as part of the
asciidoc macro notation. :-/ Do we need to prepare somehow for such a
future? I don't think we do.

> So I'm going to clean this up without making it more featureful, for
> now.
>
> Sorry about the confusion!
>  - Emily

No worries. :)

Martin

Patch
diff mbox series

diff --git a/Makefile b/Makefile
index 2bc9f112ea..eb17ece120 100644
--- a/Makefile
+++ b/Makefile
@@ -2465,7 +2465,7 @@  endif
 git-%$X: %.o GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
-git-bugreport$X: bugreport.o GIT-LDFLAGS $(GITLIBS)
+git-bugreport$X: bugreport-config-safelist.h bugreport.o GIT-LDFLAGS $(GITLIBS)
 	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
 		$(LIBS)
 
diff --git a/bugreport.c b/bugreport.c
index 07b84b9c94..7a9fd36b60 100644
--- a/bugreport.c
+++ b/bugreport.c
@@ -6,6 +6,9 @@ 
 #include "help.h"
 #include "compat/compiler.h"
 #include "run-command.h"
+#include "config.h"
+#include "bugreport-config-safelist.h"
+#include "khash.h"
 
 static void get_curl_version_info(struct strbuf *curl_info)
 {
@@ -18,6 +21,40 @@  static void get_curl_version_info(struct strbuf *curl_info)
 	    strbuf_addstr(curl_info, "'git-remote-https --build-info' not supported\n");
 }
 
+KHASH_INIT(cfg_set, const char*, int, 0, kh_str_hash_func, kh_str_hash_equal);
+
+struct cfgset {
+	kh_cfg_set_t set;
+};
+
+struct cfgset safelist;
+
+static void cfgset_init(struct cfgset *set, size_t initial_size)
+{
+	memset(&set->set, 0, sizeof(set->set));
+	if (initial_size)
+		kh_resize_cfg_set(&set->set, initial_size);
+}
+
+static int cfgset_insert(struct cfgset *set, const char *cfg_key)
+{
+	int added;
+	kh_put_cfg_set(&set->set, cfg_key, &added);
+	return !added;
+}
+
+static int cfgset_contains(struct cfgset *set, const char *cfg_key)
+{
+	khiter_t pos = kh_get_cfg_set(&set->set, cfg_key);
+	return pos != kh_end(&set->set);
+}
+
+static void cfgset_clear(struct cfgset *set)
+{
+	kh_release_cfg_set(&set->set);
+	cfgset_init(set, 0);
+}
+
 static void get_system_info(struct strbuf *sys_info)
 {
 	struct strbuf version_info = STRBUF_INIT;
@@ -54,6 +91,36 @@  static void get_system_info(struct strbuf *sys_info)
 	strbuf_complete_line(sys_info);
 }
 
+static void gather_safelist(void)
+{
+	int index;
+	int safelist_len = sizeof(bugreport_config_safelist) / sizeof(const char *);
+	cfgset_init(&safelist, safelist_len);
+	for (index = 0; index < safelist_len; index++)
+		cfgset_insert(&safelist, bugreport_config_safelist[index]);
+
+}
+
+static int git_config_bugreport(const char *var, const char *value, void *cb)
+{
+	struct strbuf *config_info = (struct strbuf *)cb;
+
+	if (cfgset_contains(&safelist, var))
+		strbuf_addf(config_info,
+			    "%s (%s) : %s\n",
+			    var, config_scope_to_string(current_config_scope()),
+			    value);
+
+	return 0;
+}
+
+static void get_safelisted_config(struct strbuf *config_info)
+{
+	gather_safelist();
+	git_config(git_config_bugreport, config_info);
+	cfgset_clear(&safelist);
+}
+
 static const char * const bugreport_usage[] = {
 	N_("git bugreport [-o|--output <file>]"),
 	NULL
@@ -94,12 +161,17 @@  int cmd_main(int argc, const char **argv)
 	FILE *report;
 	time_t now = time(NULL);
 	char *option_output = NULL;
+	int nongit_ok = 0;
 
 	const struct option bugreport_options[] = {
 		OPT_STRING('o', "output", &option_output, N_("path"),
 			   N_("specify a destination for the bugreport file")),
 		OPT_END()
 	};
+
+	/* Prerequisite for hooks and config checks */
+	setup_git_directory_gently(&nongit_ok);
+
 	argc = parse_options(argc, argv, "", bugreport_options,
 			     bugreport_usage, 0);
 
@@ -118,6 +190,9 @@  int cmd_main(int argc, const char **argv)
 	get_header(&buffer, "System Info");
 	get_system_info(&buffer);
 
+	get_header(&buffer, "Safelisted Config Info");
+	get_safelisted_config(&buffer);
+
 	report = fopen_for_writing(report_path.buf);
 	strbuf_write(&buffer, report);
 	fclose(report);