Message ID | 20190414210933.20875-3-t.gummerer@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | output improvements for git range-diff |
On Sun, Apr 14, 2019 at 5:09 PM Thomas Gummerer <t.gummerer@gmail.com> wrote:
> [...]
> However it can still be useful to have the function name that 'git
> diff' extracts as additional context for the change.
> [...]
> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> ---
> diff --git a/range-diff.c b/range-diff.c
> @@ -102,9 +102,12 @@ static int read_patches(const char *range, struct string_list *list)
> +		} else if (starts_with(line.buf, "@@ ")) {
> +			char *skip_lineno = strstr(line.buf + 3, "@@");
> +			strbuf_remove(&line, 0, skip_lineno - line.buf);

It makes me a bit uncomfortable that this is not checking for NULL
return from strstr() before doing pointer arithmetic (even though the
input is presumably machine-generated).

    if (!skip_lineno)
        BUG(...);

might be appropriate.

> +			strbuf_addch(&buf, ' ');
> +			strbuf_addbuf(&buf, &line);

Hi Thomas,

On Sun, 14 Apr 2019, Thomas Gummerer wrote:

> When postprocessing the inner diff in range-diff, we currently replace
> the whole hunk header line with just "@@". This matches how 'git
> tbdiff' used to handle hunk headers as well.
>
> Most likely this is being done because line numbers in the hunk header
> are not relevant without other changes. They can for example easily
> change if a range is rebased, and lines are added/removed before a
> change that we actually care about in our ranges.
>
> However it can still be useful to have the function name that 'git
> diff' extracts as additional context for the change.
>
> Note that it is not guaranteed that the hunk header actually shows up
> in the range-diff, and this change only aims to improve the case where
> a hunk header would already be included in the final output.

Makes sense.

> diff --git a/range-diff.c b/range-diff.c
> index 9242b8975f..f365141ade 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -102,9 +102,12 @@ static int read_patches(const char *range, struct string_list *list)
>  				strbuf_addch(&buf, '\n');
>  			}
>  			continue;
> -		} else if (starts_with(line.buf, "@@ "))
> -			strbuf_addstr(&buf, "@@");
> -		else if (!line.buf[0] || starts_with(line.buf, "index "))
> +		} else if (starts_with(line.buf, "@@ ")) {
> +			char *skip_lineno = strstr(line.buf + 3, "@@");

Rather than using the magic constant "3", it would probably make sense to
declare `skip_lineno` outside of the `if` construct, and use
`skip_prefix(line.buf, "@@ ", &skip_lineno)` instead of
`starts_with(...)`.

We *will*, however, want to have a safeguard against `strstr()` not
finding anything. Maybe re-use the `p` variable that we already have, and
do this instead:

		} else if (skip_prefix(line.buf, "@@ ", &p) &&
			   (p = strstr(p, "@@"))) {

> +			strbuf_remove(&line, 0, skip_lineno - line.buf);
> +			strbuf_addch(&buf, ' ');

Shorter:

			strbuf_splice(&line, 0, p - line.buf, " ", 1);

(assuming that you accept my suggestion to use `p` instead of
`skip_lineno`...)

Thanks,
Dscho

> +			strbuf_addbuf(&buf, &line);
> +		} else if (!line.buf[0] || starts_with(line.buf, "index "))
>  			/*
>  			 * A completely blank (not ' \n', which is context)
>  			 * line is not valid in a diff. We skip it
> --
> 2.21.0.593.g511ec345e1

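The combined condition suggested above can be tried outside of git with a small stand-alone sketch. Nothing here comes from git's sources: `skip_prefix_sketch()` is a local stand-in that only mimics the shape of git's `skip_prefix()`, `rewrite_hunk_header()` is a hypothetical helper, and `snprintf()` into a caller-supplied buffer stands in for `strbuf_splice()`. As in the patch under review, the result is a space followed by everything from the second "@@" onwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Local stand-in (assumption) for git's skip_prefix(): if str starts with
 * prefix, store the position just past it in *out and return 1.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Hypothetical helper: turn "@@ -a,b +c,d @@ context" into " @@ context".
 * Returns 1 on success, 0 if the line is not a complete hunk header or the
 * result does not fit into out.
 */
static int rewrite_hunk_header(const char *line, char *out, size_t outsz)
{
	const char *p = NULL;
	int n;

	assert(line && out && outsz > 0);

	/* No magic "3": the prefix length is handled by the stand-in. */
	if (!(skip_prefix_sketch(line, "@@ ", &p) && (p = strstr(p, "@@"))))
		return 0;

	/* Equivalent of splicing a single space in front of the second "@@". */
	n = snprintf(out, outsz, " %s", p);
	return n >= 0 && (size_t)n < outsz;
}
```

For the header `@@ -102,9 +102,12 @@ static int read_patches(void)` the helper writes ` @@ static int read_patches(void)`. A line that lacks the closing `@@` makes it return 0, which in the real `else if` chain would mean falling through to the remaining branches rather than failing loudly.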
Hi Eric,

On Sun, 14 Apr 2019, Eric Sunshine wrote:

> On Sun, Apr 14, 2019 at 5:09 PM Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > [...]
> > However it can still be useful to have the function name that 'git
> > diff' extracts as additional context for the change.
> > [...]
> > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > ---
> > diff --git a/range-diff.c b/range-diff.c
> > @@ -102,9 +102,12 @@ static int read_patches(const char *range, struct string_list *list)
> > +		} else if (starts_with(line.buf, "@@ ")) {
> > +			char *skip_lineno = strstr(line.buf + 3, "@@");
> > +			strbuf_remove(&line, 0, skip_lineno - line.buf);
>
> It makes me a bit uncomfortable that this is not checking for NULL
> return from strstr() before doing pointer arithmetic (even though the
> input is presumably machine-generated).
>
>     if (!skip_lineno)
>         BUG(...);

Good point, but maybe we should not go so far as to declare this a bug,
and fall back to removing everything but the initial two `at` characters
instead?

Thanks,
Dscho

>
> might be appropriate.
>
> > +			strbuf_addch(&buf, ' ');
> > +			strbuf_addbuf(&buf, &line);

On 04/15, Johannes Schindelin wrote:
> Hi Eric,
>
> On Sun, 14 Apr 2019, Eric Sunshine wrote:
>
> > On Sun, Apr 14, 2019 at 5:09 PM Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > > [...]
> > > However it can still be useful to have the function name that 'git
> > > diff' extracts as additional context for the change.
> > > [...]
> > > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > > ---
> > > diff --git a/range-diff.c b/range-diff.c
> > > @@ -102,9 +102,12 @@ static int read_patches(const char *range, struct string_list *list)
> > > +		} else if (starts_with(line.buf, "@@ ")) {
> > > +			char *skip_lineno = strstr(line.buf + 3, "@@");
> > > +			strbuf_remove(&line, 0, skip_lineno - line.buf);
> >
> > It makes me a bit uncomfortable that this is not checking for NULL
> > return from strstr() before doing pointer arithmetic (even though the
> > input is presumably machine-generated).
> >
> >     if (!skip_lineno)
> >         BUG(...);
>
> Good point, but maybe we should not go so far as to declare this a bug,
> and fall back to removing everything but the initial two `at` characters
> instead?

I like declaring this a bug. We are after all parsing
machine-generated output, that does come from git (which is why I
neglected the NULL checking in the first place). If that second "@@"
is not there it's definitely a bug somewhere in the diff machinery,
I'd say.

Note that the "@@" also couldn't come from anywhere else, the diff
header has a well defined format and so does the metadata. The diff
itself is prefixed with '<', '>' and '#' in this case, and the commit
message is also prefixed with four spaces. So if this breaks
somewhere I'd rather hear about it loudly, than let users potentially
get wrong output because we missed something somewhere.

> >
> > might be appropriate.
> >
> > > +			strbuf_addch(&buf, ' ');
> > > +			strbuf_addbuf(&buf, &line);

On 04/15, Johannes Schindelin wrote:
> Hi Thomas,
>
>
> On Sun, 14 Apr 2019, Thomas Gummerer wrote:
> > diff --git a/range-diff.c b/range-diff.c
> > index 9242b8975f..f365141ade 100644
> > --- a/range-diff.c
> > +++ b/range-diff.c
> > @@ -102,9 +102,12 @@ static int read_patches(const char *range, struct string_list *list)
> >  				strbuf_addch(&buf, '\n');
> >  			}
> >  			continue;
> > -		} else if (starts_with(line.buf, "@@ "))
> > -			strbuf_addstr(&buf, "@@");
> > -		else if (!line.buf[0] || starts_with(line.buf, "index "))
> > +		} else if (starts_with(line.buf, "@@ ")) {
> > +			char *skip_lineno = strstr(line.buf + 3, "@@");
>
> Rather than using the magic constant "3", it would probably make sense to
> declare `skip_lineno` outside of the `if` construct, and use
> `skip_prefix(line.buf, "@@ ", &skip_lineno)` instead of
> `starts_with(...)`.

I like this suggestion.

> We *will*, however, want to have a safeguard against `strstr()` not
> finding anything. Maybe re-use the `p` variable that we already have, and
> do this instead:
>
> 		} else if (skip_prefix(line.buf, "@@ ", &p) &&
> 			   (p = strstr(p, "@@"))) {
>
> > +			strbuf_remove(&line, 0, skip_lineno - line.buf);
> > +			strbuf_addch(&buf, ' ');
>
> Shorter:
>
> 			strbuf_splice(&line, 0, p - line.buf, " ", 1);
>
> (assuming that you accept my suggestion to use `p` instead of
> `skip_lineno`...)

And this is also much nicer, thanks!

> Thanks,
> Dscho
>
> > +			strbuf_addbuf(&buf, &line);
> > +		} else if (!line.buf[0] || starts_with(line.buf, "index "))
> >  			/*
> >  			 * A completely blank (not ' \n', which is context)
> >  			 * line is not valid in a diff. We skip it
> > --
> > 2.21.0.593.g511ec345e1

Hi Thomas,

On Mon, 15 Apr 2019, Thomas Gummerer wrote:

> On 04/15, Johannes Schindelin wrote:
> >
> > On Sun, 14 Apr 2019, Eric Sunshine wrote:
> >
> > > On Sun, Apr 14, 2019 at 5:09 PM Thomas Gummerer <t.gummerer@gmail.com> wrote:
> > > > [...]
> > > > However it can still be useful to have the function name that 'git
> > > > diff' extracts as additional context for the change.
> > > > [...]
> > > > Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
> > > > ---
> > > > diff --git a/range-diff.c b/range-diff.c
> > > > @@ -102,9 +102,12 @@ static int read_patches(const char *range, struct string_list *list)
> > > > +		} else if (starts_with(line.buf, "@@ ")) {
> > > > +			char *skip_lineno = strstr(line.buf + 3, "@@");
> > > > +			strbuf_remove(&line, 0, skip_lineno - line.buf);
> > >
> > > It makes me a bit uncomfortable that this is not checking for NULL
> > > return from strstr() before doing pointer arithmetic (even though the
> > > input is presumably machine-generated).
> > >
> > >     if (!skip_lineno)
> > >         BUG(...);
> >
> > Good point, but maybe we should not go so far as to declare this a bug,
> > and fall back to removing everything but the initial two `at` characters
> > instead?
>
> I like declaring this a bug. We are after all parsing
> machine-generated output, that does come from git (which is why I
> neglected the NULL checking in the first place). If that second "@@"
> is not there it's definitely a bug somewhere in the diff machinery,
> I'd say.

Ah, but you do know about the micro-project I proposed to optionally feed
an mbox to `range-diff`, right?

The idea behind my proposal is that this would make it possible to
generate a range-diff between the patches on public-inbox and the commits
that actually made it into Junio's `pu`...

> Note that the "@@" also couldn't come from anywhere else, the diff
> header has a well defined format and so does the metadata. The diff
> itself is prefixed with '<', '>' and '#' in this case, and the commit
> message is also prefixed with four spaces. So if this breaks
> somewhere I'd rather hear about it loudly, than let users potentially
> get wrong output because we missed something somewhere.

Agreed. But I could imagine that `die()`ing here would be the more
appropriate way to holler loudly ;-)

Ciao,
Dscho

On 04/17, Johannes Schindelin wrote:
> Hi Thomas,
>
> On Mon, 15 Apr 2019, Thomas Gummerer wrote:
> > I like declaring this a bug. We are after all parsing
> > machine-generated output, that does come from git (which is why I
> > neglected the NULL checking in the first place). If that second "@@"
> > is not there it's definitely a bug somewhere in the diff machinery,
> > I'd say.
>
> Ah, but you do know about the micro-project I proposed to optionally feed
> an mbox to `range-diff`, right?
>
> The idea behind my proposal is that this would make it possible to
> generate a range-diff between the patches on public-inbox and the commits
> that actually made it into Junio's `pu`...

I had forgotten about that, and was only looking at what the code
currently does.

> > Note that the "@@" also couldn't come from anywhere else, the diff
> > header has a well defined format and so does the metadata. The diff
> > itself is prefixed with '<', '>' and '#' in this case, and the commit
> > message is also prefixed with four spaces. So if this breaks
> > somewhere I'd rather hear about it loudly, than let users potentially
> > get wrong output because we missed something somewhere.
>
> Agreed. But I could imagine that `die()`ing here would be the more
> appropriate way to holler loudly ;-)

Yup, in the light of the potential microproject, I'm fine with just
'die()'ing here. If we'd get into this situation right now, there
would probably be a bunch of other tests failing as well, so we might
as well make it a little more future proof.

Thanks.

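The outcome of this sub-thread, failing loudly on a hunk header without a closing "@@", can be sketched as follows. This is a stand-alone illustration under assumptions, not the code that was eventually applied: `classify_hunk_header()`, `hunk_context_or_die()` and the `hunk_status` enum are hypothetical, and `die_sketch()` is a local stand-in for git's `die()` that prints a message and exits with status 128. Separating "not a hunk header" from "malformed hunk header" keeps ordinary lines from triggering the fatal path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum hunk_status { HUNK_NONE, HUNK_OK, HUNK_MALFORMED };

/*
 * Hypothetical classifier: HUNK_NONE for lines not starting with "@@ ",
 * HUNK_OK with *context pointing at the second "@@", HUNK_MALFORMED if the
 * closing "@@" is missing.
 */
static enum hunk_status classify_hunk_header(const char *line,
					     const char **context)
{
	static const char hunk_prefix[] = "@@ ";
	const size_t prefix_len = sizeof(hunk_prefix) - 1;

	assert(line && context);
	if (strncmp(line, hunk_prefix, prefix_len))
		return HUNK_NONE;
	*context = strstr(line + prefix_len, "@@");
	return *context ? HUNK_OK : HUNK_MALFORMED;
}

/* Local stand-in (assumption) for git's die(): report and exit non-zero. */
static void die_sketch(const char *msg, const char *line)
{
	fprintf(stderr, "fatal: %s: '%s'\n", msg, line);
	exit(128);
}

/*
 * Returns the context starting at the second "@@", or NULL for lines that
 * are not hunk headers. Terminates the process on a malformed header.
 */
static const char *hunk_context_or_die(const char *line)
{
	const char *context = NULL;

	if (classify_hunk_header(line, &context) == HUNK_MALFORMED)
		die_sketch("malformed hunk header", line);
	return context;
}
```

With input that might one day come from an mbox rather than from git itself, a user-facing fatal error of this kind seems a better fit than `BUG()`, which is meant for internal invariants.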
diff --git a/range-diff.c b/range-diff.c
index 9242b8975f..f365141ade 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -102,9 +102,12 @@ static int read_patches(const char *range, struct string_list *list)
 				strbuf_addch(&buf, '\n');
 			}
 			continue;
-		} else if (starts_with(line.buf, "@@ "))
-			strbuf_addstr(&buf, "@@");
-		else if (!line.buf[0] || starts_with(line.buf, "index "))
+		} else if (starts_with(line.buf, "@@ ")) {
+			char *skip_lineno = strstr(line.buf + 3, "@@");
+			strbuf_remove(&line, 0, skip_lineno - line.buf);
+			strbuf_addch(&buf, ' ');
+			strbuf_addbuf(&buf, &line);
+		} else if (!line.buf[0] || starts_with(line.buf, "index "))
 			/*
 			 * A completely blank (not ' \n', which is context)
 			 * line is not valid in a diff. We skip it

When postprocessing the inner diff in range-diff, we currently replace
the whole hunk header line with just "@@". This matches how 'git
tbdiff' used to handle hunk headers as well.

Most likely this is being done because line numbers in the hunk header
are not relevant without other changes. They can for example easily
change if a range is rebased, and lines are added/removed before a
change that we actually care about in our ranges.

However it can still be useful to have the function name that 'git
diff' extracts as additional context for the change.

Note that it is not guaranteed that the hunk header actually shows up
in the range-diff, and this change only aims to improve the case where
a hunk header would already be included in the final output.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
---
 range-diff.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)