Message ID | 2375e34100e571f9c3ce658d28aba6648fba18a6.1586960921.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | range-diff: fix a crash in parsing git-log output | expand |
"Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Vasil Dimov <vd@FreeBSD.org> > > `git range-diff` calls `git log` internally and tries to parse its > output. But `git log` output can be customized by the user in their > git config and for certain configurations either an error will be > returned by `git range-diff` or it will crash. > > To fix this explicitly set the output format of the internally > executed `git log` with `--pretty=medium`. Because that cancels > `--notes`, add explicitly `--notes` at the end. Good finding. Shouldn't we also disable customizations that come from the configuration variables like diff.external, diff.<driver>.command? Thanks.
Hi Vasil, Nice find. I have a question below: On Wed, Apr 15, 2020 at 02:28:40PM +0000, Vasil Dimov via GitGitGadget wrote: > From: Vasil Dimov <vd@FreeBSD.org> > > `git range-diff` calls `git log` internally and tries to parse its > output. But `git log` output can be customized by the user in their > git config and for certain configurations either an error will be > returned by `git range-diff` or it will crash. > > To fix this explicitly set the output format of the internally > executed `git log` with `--pretty=medium`. Because that cancels > `--notes`, add explicitly `--notes` at the end. > > Also, make sure we never crash in the same way - trying to dereference > `util` which was never created and has remained NULL. It would happen > if the first line of `git log` output does not begin with 'commit '. > > Alternative considered but discarded - somehow disable all git configs > and behave as if no config is present in the internally executed > `git log`, but that does not seem to be possible. GIT_CONFIG_NOSYSTEM > is the closest to it, but even with that we would still read > `.git/config`. I don't know of a great way to do this off-hand. Perhaps an internal `--for-range-diff` option that ignores options that are incompatible with range-diff's own parsing seems like an OK path forward to me. Here, internal means that it's not part of the manual page. We can pass 'PARSE_OPT_NONEG' to make sure that a caller can't later overwrite it by passing '--no-for-range-diff'. > Signed-off-by: Vasil Dimov <vd@FreeBSD.org> > --- > range-diff.c | 13 +++++++++++++ > t/t3206-range-diff.sh | 10 ++++++++++ > 2 files changed, 23 insertions(+) > > diff --git a/range-diff.c b/range-diff.c > index f745567cf67..5cc920be391 100644 > --- a/range-diff.c > +++ b/range-diff.c > @@ -63,6 +63,8 @@ static int read_patches(const char *range, struct string_list *list, > "--output-indicator-old=<", > "--output-indicator-context=#", > "--no-abbrev-commit", > + "--pretty=medium", > + "--notes", > NULL); > if (other_arg) > argv_array_pushv(&cp.args, other_arg->argv); > @@ -106,6 +108,17 @@ static int read_patches(const char *range, struct string_list *list, > continue; > } > > + if (!util) { > + error(_("could not parse first line of `log` output: " > + "did not start with 'commit ': '%s'"), > + line); > + string_list_clear(list, 1); > + strbuf_release(&buf); > + strbuf_release(&contents); > + finish_command(&cp); > + return -1; > + } > + > if (starts_with(line, "diff --git")) { > struct patch patch = { 0 }; > struct strbuf root = STRBUF_INIT; > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh > index bd808f87ed5..e024cff65cb 100755 > --- a/t/t3206-range-diff.sh > +++ b/t/t3206-range-diff.sh > @@ -513,6 +513,16 @@ test_expect_success 'range-diff overrides diff.noprefix internally' ' > git -c diff.noprefix=true range-diff HEAD^... > ' > > +test_expect_success 'basic with modified format.pretty with suffix' ' > + git -c format.pretty="format:commit %H%d%n" range-diff \ > + master..topic master..unmodified > +' > + > +test_expect_success 'basic with modified format.pretty without "commit "' ' > + git -c format.pretty="format:%H%n" range-diff \ > + master..topic master..unmodified > +' > + > test_expect_success 'range-diff compares notes by default' ' > git notes add -m "topic note" topic && > git notes add -m "unmodified note" unmodified && > -- > gitgitgadget Otherwise what you have here does look good to me, too. I'm just not sure that there aren't other ways that a caller could circumvent placing '--notes --pretty=medium' at the end, anyway. Thanks, Taylor
On Wed, Apr 15, 2020 at 08:31:39 -0700, Junio C Hamano wrote: > "Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > From: Vasil Dimov <vd@FreeBSD.org> > > > > `git range-diff` calls `git log` internally and tries to parse its > > output. But `git log` output can be customized by the user in their > > git config and for certain configurations either an error will be > > returned by `git range-diff` or it will crash. > > > > To fix this explicitly set the output format of the internally > > executed `git log` with `--pretty=medium`. Because that cancels > > `--notes`, add explicitly `--notes` at the end. > > Good finding. > > Shouldn't we also disable customizations that come from the > configuration variables like diff.external, diff.<driver>.command? Hmm, I am not sure. I just read the doc about diff.<driver>.command. Is it possible that the user has set up some custom diff tools to compare e.g. .jpg files by only comparing their EXIF data instead of the entire (binary) files? Surely if the customizations wrt diff.external and diff.<driver>.command produce result that is unparsable by git range-diff, then they are not good. But maybe the opposite is the case? I don't feel confident enough to judge.
On Wed, Apr 15, 2020 at 08:31:39AM -0700, Junio C Hamano wrote: > "Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > From: Vasil Dimov <vd@FreeBSD.org> > > > > `git range-diff` calls `git log` internally and tries to parse its > > output. But `git log` output can be customized by the user in their > > git config and for certain configurations either an error will be > > returned by `git range-diff` or it will crash. > > > > To fix this explicitly set the output format of the internally > > executed `git log` with `--pretty=medium`. Because that cancels > > `--notes`, add explicitly `--notes` at the end. > > Good finding. > > Shouldn't we also disable customizations that come from the > configuration variables like diff.external, diff.<driver>.command? If range-diff were a script, I would say it should be using the "rev-list | diff-tree --stdin" plumbing under the hood, rather than "log". The read_patches() function does let callers pass options to git-log, but I don't _think_ this is exposed to the user. We only allow a few --notes options to be passed, and we should be able to apply those to diff-tree. So converting it to use plumbing might be an option. Though I think there is another bug: $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes commit 8f3d9f354286745c751374f5f1fcafee6b3f3136 git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed. Aborted Another option would be for range-diff to directly call the revision-traversal plumbing itself. There may be complications there, though (or else it would have been done from the outset). We should fix that assertion regardless, though. -Peff
Hi Peff, On Wed, Apr 15, 2020 at 12:23:26PM -0400, Jeff King wrote: > On Wed, Apr 15, 2020 at 08:31:39AM -0700, Junio C Hamano wrote: > > > "Vasil Dimov via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > > > From: Vasil Dimov <vd@FreeBSD.org> > > > > > > `git range-diff` calls `git log` internally and tries to parse its > > > output. But `git log` output can be customized by the user in their > > > git config and for certain configurations either an error will be > > > returned by `git range-diff` or it will crash. > > > > > > To fix this explicitly set the output format of the internally > > > executed `git log` with `--pretty=medium`. Because that cancels > > > `--notes`, add explicitly `--notes` at the end. > > > > Good finding. > > > > Shouldn't we also disable customizations that come from the > > configuration variables like diff.external, diff.<driver>.command? > > If range-diff were a script, I would say it should be using the > "rev-list | diff-tree --stdin" plumbing under the hood, rather than > "log". > > The read_patches() function does let callers pass options to git-log, > but I don't _think_ this is exposed to the user. We only allow a few > --notes options to be passed, and we should be able to apply those to > diff-tree. So converting it to use plumbing might be an option. > > Though I think there is another bug: > > $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes > commit 8f3d9f354286745c751374f5f1fcafee6b3f3136 > git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed. > Aborted Nice find. I think that I have a patch addressing this below. Please let me know what you think: --- >8 --- Subject: [PATCH] diff-tree.c: load notes machinery with '--notes' Since its introduction in 7249e91 (revision.c: support --notes command-line option, 2011-03-29), combining '--notes' with '--pretty' causes 'git diff-tree' to fail a runtime assertion: $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes commit 8f3d9f354286745c751374f5f1fcafee6b3f3136 git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed. Aborted This failure is due to diff-tree not calling 'load_display_notes' to initialize the notes machinery. Ordinarily, this failure isn't triggered, because it requires passing both '--notes' and '--pretty'. Specifically, passing '--pretty' sets 'opt->verbose_header', causing 'show_log()' to eventually call 'format_display_notes()', which expects a non-NULL 'display_note_trees'. Without initializing the notes machinery, 'display_note_trees' remains NULL, and thus triggers an assertion failure. This doesn't occur without '--pretty' since we never call 'format_display_notes()' without it. Fix this by initializing the notes machinery after parsing our options, and harden this behavior against regression with a test in t4013. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/diff-tree.c | 2 ++ t/t4013-diff-various.sh | 1 + 2 files changed, 3 insertions(+) diff --git a/builtin/diff-tree.c b/builtin/diff-tree.c index cb9ea79367..17c1cc8c3c 100644 --- a/builtin/diff-tree.c +++ b/builtin/diff-tree.c @@ -126,6 +126,8 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix) precompose_argv(argc, argv); argc = setup_revisions(argc, argv, opt, &s_r_opt); + if (opt->show_notes) + load_display_notes(&opt->notes_opt); while (--argc > 0) { const char *arg = *++argv; diff --git a/t/t4013-diff-various.sh b/t/t4013-diff-various.sh index dde3f11fec..6ae8cfb271 100755 --- a/t/t4013-diff-various.sh +++ b/t/t4013-diff-various.sh @@ -398,6 +398,7 @@ diff --no-index --raw --no-abbrev dir2 dir diff-tree --pretty --root --stat --compact-summary initial diff-tree --pretty -R --root --stat --compact-summary initial +diff-tree --pretty --notes initial diff-tree --stat --compact-summary initial mode diff-tree -R --stat --compact-summary initial mode EOF -- 2.26.0.106.g9fadedd637
On Wed, Apr 15, 2020 at 04:02:42PM -0600, Taylor Blau wrote: > Subject: [PATCH] diff-tree.c: load notes machinery with '--notes' > > Since its introduction in 7249e91 (revision.c: support --notes > command-line option, 2011-03-29), combining '--notes' with '--pretty' > causes 'git diff-tree' to fail a runtime assertion: > > $ git rev-list HEAD | git diff-tree --stdin --pretty=medium --notes > commit 8f3d9f354286745c751374f5f1fcafee6b3f3136 > git: notes.c:1308: format_display_notes: Assertion `display_notes_trees' failed. > Aborted > > This failure is due to diff-tree not calling 'load_display_notes' to > initialize the notes machinery. Yes, I think that's the problem that I saw. And this definitely fixes that case, but I think there's another related one. $ git notes add -m foo $ git log -1 --format='%h %N' 94316974f7 foo $ git rev-list -1 HEAD | git diff-tree --stdin --format='%h %N' -s 94316974f7e27dccc6b87f3946bce5d2fc252dc2 %N This is true even with your patch. With your patch I can add --notes to get the right output: $ git rev-list -1 HEAD | git diff-tree --stdin --format='%h %N' -s --notes 94316974f7e27dccc6b87f3946bce5d2fc252dc2 foo (It's also slightly curious that %h doesn't abbreviate in diff-tree; I guess this is a side effect of the plumbing having no default abbrev setting; it may be simplest to just live with it). > @@ -126,6 +126,8 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix) > > precompose_argv(argc, argv); > argc = setup_revisions(argc, argv, opt, &s_r_opt); > + if (opt->show_notes) > + load_display_notes(&opt->notes_opt); In git-log we have the equivalent of these new lines, but just before it we check the userformat, too: memset(&w, 0, sizeof(w)); userformat_find_requirements(NULL, &w); if (!rev->show_notes_given && (!rev->pretty_given || w.notes)) rev->show_notes = 1; if (rev->show_notes) load_display_notes(&rev->notes_opt); I think we'd want to do the same here. Even though it's plumbing, I can't think of any reason why somebody would _not_ want notes to be auto-enabled when they say "%N". > Ordinarily, this failure isn't triggered, because it requires passing > both '--notes' and '--pretty'. Specifically, passing '--pretty' sets > 'opt->verbose_header', causing 'show_log()' to eventually call > 'format_display_notes()', which expects a non-NULL 'display_note_trees'. > Without initializing the notes machinery, 'display_note_trees' remains > NULL, and thus triggers an assertion failure. This doesn't occur without > '--pretty' since we never call 'format_display_notes()' without it. It's not just --pretty, of course, but any option that causes us to actually try to format notes (--format, --oneline, etc). -Peff
diff --git a/range-diff.c b/range-diff.c index f745567cf67..5cc920be391 100644 --- a/range-diff.c +++ b/range-diff.c @@ -63,6 +63,8 @@ static int read_patches(const char *range, struct string_list *list, "--output-indicator-old=<", "--output-indicator-context=#", "--no-abbrev-commit", + "--pretty=medium", + "--notes", NULL); if (other_arg) argv_array_pushv(&cp.args, other_arg->argv); @@ -106,6 +108,17 @@ static int read_patches(const char *range, struct string_list *list, continue; } + if (!util) { + error(_("could not parse first line of `log` output: " + "did not start with 'commit ': '%s'"), + line); + string_list_clear(list, 1); + strbuf_release(&buf); + strbuf_release(&contents); + finish_command(&cp); + return -1; + } + if (starts_with(line, "diff --git")) { struct patch patch = { 0 }; struct strbuf root = STRBUF_INIT; diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh index bd808f87ed5..e024cff65cb 100755 --- a/t/t3206-range-diff.sh +++ b/t/t3206-range-diff.sh @@ -513,6 +513,16 @@ test_expect_success 'range-diff overrides diff.noprefix internally' ' git -c diff.noprefix=true range-diff HEAD^... ' +test_expect_success 'basic with modified format.pretty with suffix' ' + git -c format.pretty="format:commit %H%d%n" range-diff \ + master..topic master..unmodified +' + +test_expect_success 'basic with modified format.pretty without "commit "' ' + git -c format.pretty="format:%H%n" range-diff \ + master..topic master..unmodified +' + test_expect_success 'range-diff compares notes by default' ' git notes add -m "topic note" topic && git notes add -m "unmodified note" unmodified &&