Message ID | 20190124123240.GB11354@sigill.intra.peff.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | some diff --cc --stat fixes | expand |
On Thu, Jan 24, 2019 at 4:32 AM Jeff King <peff@peff.net> wrote: > > There's an odd bug when "log --color-moved" is used with the combination > of "--cc --stat -p": the stat for merge commits is erroneously shown > with the diff of the _next_ commit. > > The included test demonstrates the issue. Our history looks something > like this: > > A-B-M--D > \ / > C > > When we run "git log --cc --stat -p --color-moved" starting at D, we get > this sequence of events: > > 1. The diff for D is using -p, so diff_flush() calls into > diff_flush_patch_all_file_pairs(). There we see that o->color_moved > is in effect, so we point o->emitted_symbols to a static local > struct, causing diff_flush_patch() to queue the symbols instead of > actually writing them out. > > We then do our move detection, emit the symbols, and clear the > struct. But we leave o->emitted_symbols pointing to our struct. > > 2. Next we compute the diff for M. This is a merge, so we use the > combined diff code. In find_paths_generic(), we compute the > pairwise diff between each commit and its parent. Normally this is > done with DIFF_FORMAT_NO_OUTPUT, since we're just looking for > intersecting paths. But since "--stat --cc" shows the first-parent > stat, and since we're computing that diff anyway, we enable > DIFF_FORMAT_DIFFSTAT for the first parent. This outputs the stat > information immediately, saving us from running a separate > first-parent diff later. > > But where does that output go? Normally it goes directly to stdout, > but because o->emitted_symbols is set, we queue it. As a result, we > don't actually print the diffstat for the merge commit (yet), Thanks for your analysis. As always a pleasant read. I understand and agree with what is written up to here remembering the code vaguely. > which > is wrong. I disagree with this sentiment. If we remember to flush the queued output this is merely an inefficiency due to implementation details, but not wrong. We could argue that it is wrong to have o->emitted_symbols set, as we know we don't need it for producing a diffstat only. > > 3. Next we compute the diff for C. We're actually showing a patch > again, so we end up in diff_flush_patch_all_file_pairs(), but this > time we have the queued stat from step 2 waiting in our struct. Right, that is how the queueing can produce errors. I wonder if the test that is included in this patch would work on top of e6e045f803 ("diff.c: buffer all output if asked to", 2017-06-29) as that commit specifically wanted to make sure these errors would be caught. > > We add new elements to it for C's diff, and then flush the whole > thing. And we see the diffstat from M as part of C's diff, which is > wrong. > > So triggering the bug really does require the combination of all of > those options. Similarly we can trigger bugs by using options that enable buffering (so far only --color-moved) and options that do not fully buffer and flush. > > To fix it, we can simply restore o->emitted_symbols to NULL after > flushing it, so that it does not affect anything outside of > diff_flush_patch_all_file_pairs(). This intuitively makes sense, since > nobody outside of that function is going to bother flushing it, so we > would not want them to write to it either. This would also cause the inefficiency I mentioned after (2) to disappear, as the merge commits diffstat would be just printed to stdout? > > In fact, we could take this a step further and turn the local "esm" > struct into a non-static variable that goes away after the function > ends. However, since it contains a dynamically sized array, we benefit > from amortizing the cost of allocations over many calls. So we'll leave > it as static to retain that benefit. okay. > > But let's push the zero-ing of esm.nr into the conditional for "if > (o->emitted_symbols)" to make it clear that we do not expect esm to hold > any values if we did not just try to use it. With the code as it is > written now, if we did encounter such a case (which I think would be a > bug), we'd silently leak those values without even bothering to display > them. With this change, we'd at least eventually show them, and somebody > would notice. Wow. Good call. > > Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Stefan Beller <sbeller@google.com>
On Thu, Jan 24, 2019 at 10:55:10AM -0800, Stefan Beller wrote: > > But where does that output go? Normally it goes directly to stdout, > > but because o->emitted_symbols is set, we queue it. As a result, we > > don't actually print the diffstat for the merge commit (yet), > > Thanks for your analysis. As always a pleasant read. > I understand and agree with what is written up to here remembering > the code vaguely. > > > which > > is wrong. > > I disagree with this sentiment. If we remember to flush the queued output > this is merely an inefficiency due to implementation details, but not wrong. > > We could argue that it is wrong to have o->emitted_symbols set, as > we know we don't need it for producing a diffstat only. It's wrong in the sense that we finish printing that merge commit without having shown its diff. If it were the final commit, we would not ever print it at all! So if you are arguing that it would be OK to queue it as long as we flushed it before deciding we were done with the diff, then I agree. But doing that correctly would actually be non-trivial, because the combined-diff code does not use the emitted_symbols queue for its diff (so the stat and the patch would appear out of order). I also wondered why diffstats go to o->emitted_symbols at all. We do not do any analysis of them with --color-moved, I don't think. But I can also see that having emitted_symbols hold everything makes sense from a maintainability standpoint; future features may want to see more of what we're emitting. > > 3. Next we compute the diff for C. We're actually showing a patch > > again, so we end up in diff_flush_patch_all_file_pairs(), but this > > time we have the queued stat from step 2 waiting in our struct. > > Right, that is how the queueing can produce errors. I wonder if the > test that is included in this patch would work on top of > e6e045f803 ("diff.c: buffer all output if asked to", 2017-06-29) > as that commit specifically wanted to make sure these errors > would be caught. I suspect that would not work with "--cc", because combine-diff outputs directly stdout. That's something that we might want to improve in the long run (since obviously it cannot use --color-moved at this point). > > To fix it, we can simply restore o->emitted_symbols to NULL after > > flushing it, so that it does not affect anything outside of > > diff_flush_patch_all_file_pairs(). This intuitively makes sense, since > > nobody outside of that function is going to bother flushing it, so we > > would not want them to write to it either. > > This would also cause the inefficiency I mentioned after (2) to disappear, > as the merge commits diffstat would be just printed to stdout? Yes, it avoids the overhead of even storing them in the emitted struct at all. > Reviewed-by: Stefan Beller <sbeller@google.com> Thanks! I did quite a bit of head-scratching figuring out this bug, but at the end of it I now understand the flow of the color-moved code quite a bit better. :) -Peff
Jeff King <peff@peff.net> writes: > When we run "git log --cc --stat -p --color-moved" starting at D, we get > this sequence of events: > > 1. The diff for D is using -p, so diff_flush() calls into > diff_flush_patch_all_file_pairs(). There we see that o->color_moved > is in effect, so we point o->emitted_symbols to a static local > struct, causing diff_flush_patch() to queue the symbols instead of > actually writing them out. > > We then do our move detection, emit the symbols, and clear the > struct. But we leave o->emitted_symbols pointing to our struct. Wow, that was nasty. I did not like the complexity of that "emitted symbols" conversion we had to do recently and never trusted the code. There still is something funny in diff_flush_patch_all_file_pairs() even after this patch, though. - We first check o->color_moved and unconditionally point o->emitted_symbols to &esm. - In an if() block we enter when o->emitted_symbols is set, there is a check to see if o->color_moved is set. This makes sense only if we are trying to be prepared to handle a case where we are not the one that assigned a non-NULL to o->emitted_symbols due to o->color_moved. So it certainly is possible that o->emitted_symbols is set before we enter this function. - But then, it means that o->emitted_symbols we may have had non-NULL when the function is called may be overwritten if o->color_moved is set. The above observation does not necessarily indicate any bug; it just shows that the code structure is messier than necessary. > To fix it, we can simply restore o->emitted_symbols to NULL after > flushing it, so that it does not affect anything outside of > diff_flush_patch_all_file_pairs(). This intuitively makes sense, since > nobody outside of that function is going to bother flushing it, so we > would not want them to write to it either. Perhaps. I see word-diff codepath gives an allocated buffer to o->emitted_symbols, so assigning NULL without freeing would mean a leak, but I guess this helper function is not designed to be called
On Thu, Jan 24, 2019 at 12:18 PM Junio C Hamano <gitster@pobox.com> wrote: > > Jeff King <peff@peff.net> writes: > > > When we run "git log --cc --stat -p --color-moved" starting at D, we get > > this sequence of events: > > > > 1. The diff for D is using -p, so diff_flush() calls into > > diff_flush_patch_all_file_pairs(). There we see that o->color_moved > > is in effect, so we point o->emitted_symbols to a static local > > struct, causing diff_flush_patch() to queue the symbols instead of > > actually writing them out. > > > > We then do our move detection, emit the symbols, and clear the > > struct. But we leave o->emitted_symbols pointing to our struct. > > Wow, that was nasty. > > I did not like the complexity of that "emitted symbols" conversion > we had to do recently and never trusted the code. There still is > something funny in diff_flush_patch_all_file_pairs() even after this > patch, though. > > - We first check o->color_moved and unconditionally point > o->emitted_symbols to &esm. I do not recall if we discussed this or if I just had talking voices in my head, but this was done deliberately to separate the implementation of the buffering feature and its user. I thought once we had the buffering in place, we might add options that also require buffering; hence the explicit "move coloring implies using a buffer". Having the buffer as a local static might be an iffy implementation that should be fixed for clarify. > > - In an if() block we enter when o->emitted_symbols is set, there > is a check to see if o->color_moved is set. This makes sense > only if we are trying to be prepared to handle a case where we > are not the one that assigned a non-NULL to o->emitted_symbols > due to o->color_moved. So it certainly is possible that > o->emitted_symbols is set before we enter this function. Ah I see where you are coming from, as the code was written I imagined: if (o->color_moved) o->emitted_symbols = &esm; if (o->distim_gostaks) o->emitted_symbols = &esm; if (o->emitted_symbols) { if (o->color_moved) handle_color_moved(o); if (o->distim_gostaks) handle_distimming(o); ... flush symbols... ... free &cleanup ... } > - But then, it means that o->emitted_symbols we may have had > non-NULL when the function is called may be overwritten if > o->color_moved is set. I see. So either we'd want to have if (o->emitted_symbols) BUG("should not be already set"); or as Peff points out, make it non-static. > The above observation does not necessarily indicate any bug; it just > shows that the code structure is messier than necessary. Makes sense. I should not have anticipated any new feature to be added, yet. > > > To fix it, we can simply restore o->emitted_symbols to NULL after > > flushing it, so that it does not affect anything outside of > > diff_flush_patch_all_file_pairs(). This intuitively makes sense, since > > nobody outside of that function is going to bother flushing it, so we > > would not want them to write to it either. > > Perhaps. I see word-diff codepath gives an allocated buffer to > o->emitted_symbols, so assigning NULL without freeing would mean a > leak, but I guess this helper function is not designed to be called Yes, this was done as to make the buffering complete. We would not have to buffer any word diff as the color move doesn't yet work on that. See the NEEDSWORK in diff_words_flush, where we could implement a word based move coloring.
On Thu, Jan 24, 2019 at 12:18:41PM -0800, Junio C Hamano wrote: > I did not like the complexity of that "emitted symbols" conversion > we had to do recently and never trusted the code. There still is > something funny in diff_flush_patch_all_file_pairs() even after this > patch, though. > > - We first check o->color_moved and unconditionally point > o->emitted_symbols to &esm. > > - In an if() block we enter when o->emitted_symbols is set, there > is a check to see if o->color_moved is set. This makes sense > only if we are trying to be prepared to handle a case where we > are not the one that assigned a non-NULL to o->emitted_symbols > due to o->color_moved. So it certainly is possible that > o->emitted_symbols is set before we enter this function. Yeah, I noticed that, too. I assumed it was preparing for a day when the logic for "are we collecting symbols" becomes more complex than just being equivalent to "o->color_moved". Under that rationale, I was OK leaving it. > - But then, it means that o->emitted_symbols we may have had > non-NULL when the function is called may be overwritten if > o->color_moved is set. Yeah, that is true. I think in the new world order proposed by this patch, we'd always assume that it's NULL coming in, possibly assign it, and re-NULL it going out. > The above observation does not necessarily indicate any bug; it just > shows that the code structure is messier than necessary. Yeah, I don't think it's a bug currently, although... > > To fix it, we can simply restore o->emitted_symbols to NULL after > > flushing it, so that it does not affect anything outside of > > diff_flush_patch_all_file_pairs(). This intuitively makes sense, since > > nobody outside of that function is going to bother flushing it, so we > > would not want them to write to it either. > > Perhaps. I see word-diff codepath gives an allocated buffer to > o->emitted_symbols, so assigning NULL without freeing would mean a > leak, but I guess this helper function is not designed to be called Hrm, I'm embarrassed to say I did not notice that it also uses the emitted_symbols system. I think we only do it there, though, in the sub-diff_options that word-diff uses, in which case we make a separate emitted_diff_symbols struct instead of re-using the one from the parent diff_options. So I think the general idea still holds, which is that whoever assigns the emitted_symbols flag is responsible for flushing it. For --color-moved, that happens in a single function, but for word-diff, it's split across the init/flush functions. -Peff
On Thu, Jan 24, 2019 at 12:36:38PM -0800, Stefan Beller wrote: > > - In an if() block we enter when o->emitted_symbols is set, there > > is a check to see if o->color_moved is set. This makes sense > > only if we are trying to be prepared to handle a case where we > > are not the one that assigned a non-NULL to o->emitted_symbols > > due to o->color_moved. So it certainly is possible that > > o->emitted_symbols is set before we enter this function. > > Ah I see where you are coming from, as the code was written > I imagined: > > if (o->color_moved) > o->emitted_symbols = &esm; > if (o->distim_gostaks) > o->emitted_symbols = &esm; > > if (o->emitted_symbols) { > if (o->color_moved) > handle_color_moved(o); > if (o->distim_gostaks) > handle_distimming(o); > > ... flush symbols... > ... free &cleanup ... > } Yeah, FWIW this is what I took to be the reason for the code being laid out as it is. > > - But then, it means that o->emitted_symbols we may have had > > non-NULL when the function is called may be overwritten if > > o->color_moved is set. > > I see. So either we'd want to have > > if (o->emitted_symbols) > BUG("should not be already set"); > > or as Peff points out, make it non-static. Even if it's non-static, you'd still want to ensure that it's not set coming into the function (because somebody like diff-words might have left it set, confusing us). So I think that BUG() may be worth having either way. -Peff
diff --git a/diff.c b/diff.c index 1b5f276360..7b97739799 100644 --- a/diff.c +++ b/diff.c @@ -5894,8 +5894,10 @@ static void diff_flush_patch_all_file_pairs(struct diff_options *o) for (i = 0; i < esm.nr; i++) free((void *)esm.buf[i].line); + esm.nr = 0; + + o->emitted_symbols = NULL; } - esm.nr = 0; } void diff_flush(struct diff_options *options) diff --git a/t/t4066-diff-emit-delay.sh b/t/t4066-diff-emit-delay.sh new file mode 100755 index 0000000000..5df6b5e64e --- /dev/null +++ b/t/t4066-diff-emit-delay.sh @@ -0,0 +1,79 @@ +#!/bin/sh + +test_description='test combined/stat/moved interaction' +. ./test-lib.sh + +# This test covers a weird 3-way interaction between "--cc -p", which will run +# the combined diff code, along with "--stat", which will be computed as a +# first-parent stat during the combined diff, and "--color-moved", which +# enables the emitted_symbols list to store the diff in memory. + +test_expect_success 'set up history with a merge' ' + test_commit A && + test_commit B && + git checkout -b side HEAD^ && + test_commit C && + git merge -m M master && + test_commit D +' + +test_expect_success 'log --cc -p --stat --color-moved' ' + cat >expect <<-\EOF && + commit D + --- + D.t | 1 + + 1 file changed, 1 insertion(+) + + diff --git a/D.t b/D.t + new file mode 100644 + index 0000000..1784810 + --- /dev/null + +++ b/D.t + @@ -0,0 +1 @@ + +D + commit M + + B.t | 1 + + 1 file changed, 1 insertion(+) + commit C + --- + C.t | 1 + + 1 file changed, 1 insertion(+) + + diff --git a/C.t b/C.t + new file mode 100644 + index 0000000..3cc58df + --- /dev/null + +++ b/C.t + @@ -0,0 +1 @@ + +C + commit B + --- + B.t | 1 + + 1 file changed, 1 insertion(+) + + diff --git a/B.t b/B.t + new file mode 100644 + index 0000000..223b783 + --- /dev/null + +++ b/B.t + @@ -0,0 +1 @@ + +B + commit A + --- + A.t | 1 + + 1 file changed, 1 insertion(+) + + diff --git a/A.t b/A.t + new file mode 100644 + index 0000000..f70f10e + --- /dev/null + +++ b/A.t + @@ -0,0 +1 @@ + +A + EOF + git log --format="commit %s" --cc -p --stat --color-moved >actual && + test_cmp expect actual +' + +test_done
There's an odd bug when "log --color-moved" is used with the combination of "--cc --stat -p": the stat for merge commits is erroneously shown with the diff of the _next_ commit. The included test demonstrates the issue. Our history looks something like this: A-B-M--D \ / C When we run "git log --cc --stat -p --color-moved" starting at D, we get this sequence of events: 1. The diff for D is using -p, so diff_flush() calls into diff_flush_patch_all_file_pairs(). There we see that o->color_moved is in effect, so we point o->emitted_symbols to a static local struct, causing diff_flush_patch() to queue the symbols instead of actually writing them out. We then do our move detection, emit the symbols, and clear the struct. But we leave o->emitted_symbols pointing to our struct. 2. Next we compute the diff for M. This is a merge, so we use the combined diff code. In find_paths_generic(), we compute the pairwise diff between each commit and its parent. Normally this is done with DIFF_FORMAT_NO_OUTPUT, since we're just looking for intersecting paths. But since "--stat --cc" shows the first-parent stat, and since we're computing that diff anyway, we enable DIFF_FORMAT_DIFFSTAT for the first parent. This outputs the stat information immediately, saving us from running a separate first-parent diff later. But where does that output go? Normally it goes directly to stdout, but because o->emitted_symbols is set, we queue it. As a result, we don't actually print the diffstat for the merge commit (yet), which is wrong. 3. Next we compute the diff for C. We're actually showing a patch again, so we end up in diff_flush_patch_all_file_pairs(), but this time we have the queued stat from step 2 waiting in our struct. We add new elements to it for C's diff, and then flush the whole thing. And we see the diffstat from M as part of C's diff, which is wrong. So triggering the bug really does require the combination of all of those options. To fix it, we can simply restore o->emitted_symbols to NULL after flushing it, so that it does not affect anything outside of diff_flush_patch_all_file_pairs(). This intuitively makes sense, since nobody outside of that function is going to bother flushing it, so we would not want them to write to it either. In fact, we could take this a step further and turn the local "esm" struct into a non-static variable that goes away after the function ends. However, since it contains a dynamically sized array, we benefit from amortizing the cost of allocations over many calls. So we'll leave it as static to retain that benefit. But let's push the zero-ing of esm.nr into the conditional for "if (o->emitted_symbols)" to make it clear that we do not expect esm to hold any values if we did not just try to use it. With the code as it is written now, if we did encounter such a case (which I think would be a bug), we'd silently leak those values without even bothering to display them. With this change, we'd at least eventually show them, and somebody would notice. Signed-off-by: Jeff King <peff@peff.net> --- diff.c | 4 +- t/t4066-diff-emit-delay.sh | 79 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 82 insertions(+), 1 deletion(-) create mode 100755 t/t4066-diff-emit-delay.sh