Message ID | 20250208061702.88469-1-forivall@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | revision: fix missing null for freed memory | expand |
Emily M Klassen <forivall@gmail.com> writes: > "git log --graph --no-graph" missed cleaning up the output_prefix and > output_prefix_data pointers. This resulted in a segfault when using "--patch", > "--name-status" or "--name-only", as the output_prefix_data continued to be in > use after free() > > Signed-off-by: Emily M Klassen <forivall@gmail.com> > --- > I previously reported this a few hours ago, and ended up digging in and figuring > it out. I'll make sure to bottom reply in the follow ups to this patch. > > revision.c | 2 ++ > 1 file changed, 2 insertions(+) Reading the symptom in the proposed log message (which is very clearly written, by the way), it seems that this is reproducible? Can we have a test to make sure that the fix would not be broken later? > diff --git a/revision.c b/revision.c > index 474fa1e767..84cb028e11 100644 > --- a/revision.c > +++ b/revision.c > @@ -2615,6 +2615,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg > graph_clear(revs->graph); > revs->graph = graph_init(revs); > } else if (!strcmp(arg, "--no-graph")) { > + revs->diffopt.output_prefix = NULL; > + revs->diffopt.output_prefix_data = NULL; > graph_clear(revs->graph); > revs->graph = NULL; > } else if (!strcmp(arg, "--encode-email-headers")) { Interesting. In response to "--graph" (the code we can see in the context before this part), we clear the revs->graph and then call graph_init(revs) for ourselves, and we do not have to futz with diffopt at all, and it works OK because output_prefix_data and output_prefix would be overwritten by the graph_init() to the value we want to use anyway. But of course, after "--no-graph", nobody clears these two members for us, so we'd need to clear them here. It might make the API less error-prone if the "clear" function cleared the .graph and diffopt->output_prefix{,_data} together but among three existing callers of graph_clear(), only this caller needs to clear these two members, so it probably would not matter. So in short, this seems to be a good fix for the immediate issue, and it is unlikely that we'd need any follow-up work.
Emily M Klassen <forivall@gmail.com> writes: > Subject: Re: [PATCH] revision: fix missing null for freed memory > > "git log --graph --no-graph" missed cleaning up the output_prefix and > output_prefix_data pointers. This resulted in a segfault when using "--patch", > "--name-status" or "--name-only", as the output_prefix_data continued to be in > use after free() Rereading the title, I cannot make sense out of "fix missing null" and guess what it wants to say. Is "null" here used as a verb to mean "to assign a NULL to a variable that points at ..."? revision: clear graph callback upon "--no-graph" "git log --graph --no-graph" first populates the .output_prefix member of diffopt, which is a callback function, to compute "--graph" header, and then discards the data the callback needs to compute the graph header but forgets to clear .output_prefix pointer in response to "--no-graph". At runtime, we end up calling the function that we should not. Clear the member to stop making callback, and for a better hyginene, also clear the pointer pointing at a freed memory. or something? Other than that, as I said earlier, the patch looks good. Thanks. > Signed-off-by: Emily M Klassen <forivall@gmail.com> > --- > I previously reported this a few hours ago, and ended up digging in and figuring > it out. I'll make sure to bottom reply in the follow ups to this patch. > > revision.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/revision.c b/revision.c > index 474fa1e767..84cb028e11 100644 > --- a/revision.c > +++ b/revision.c > @@ -2615,6 +2615,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg > graph_clear(revs->graph); > revs->graph = graph_init(revs); > } else if (!strcmp(arg, "--no-graph")) { > + revs->diffopt.output_prefix = NULL; > + revs->diffopt.output_prefix_data = NULL; > graph_clear(revs->graph); > revs->graph = NULL; > } else if (!strcmp(arg, "--encode-email-headers")) {
On Mon, Feb 10, 2025 at 8:02 AM Junio C Hamano <gitster@pobox.com> wrote: > > Emily M Klassen <forivall@gmail.com> writes: > > > Subject: Re: [PATCH] revision: fix missing null for freed memory > > > > "git log --graph --no-graph" missed cleaning up the output_prefix and > > output_prefix_data pointers. This resulted in a segfault when using "--patch", > > "--name-status" or "--name-only", as the output_prefix_data continued to be in > > use after free() > > Rereading the title, I cannot make sense out of "fix missing null" > and guess what it wants to say. Is "null" here used as a verb to > mean "to assign a NULL to a variable that points at ..."? Yeah, this was meant to say something like "fix missing null assignment after freeing graph data", and I didn't really have the energy to think of a better summary at the time. > > revision: clear graph callback upon "--no-graph" > > "git log --graph --no-graph" first populates the .output_prefix > member of diffopt, which is a callback function, to compute > "--graph" header, and then discards the data the callback needs > to compute the graph header but forgets to clear .output_prefix > pointer in response to "--no-graph". At runtime, we end up > calling the function that we should not. > > Clear the member to stop making callback, and for a better > hyginene, also clear the pointer pointing at a freed memory. > > or something? Yup, this works well. A small bit of rephrasing for readability: revision: clear graph prefix callback upon "--no-graph" "git log --graph --no-graph" misses some cleanup: handling "--graph", it assigns the .output_prefix member of diffopt, which is a callback function to compute the graph prefix when displaying a diff. Then, when handling "--no-graph" it discards the data the callback needs to compute the graph header but forgets to clear .output_prefix pointer. At runtime, we call the function when we should not. It also passes a stale pointer to the data, which leads to a segfault when the callback is used for "--patch", "--name-status" or "--name-only". Clear the member to stop the callback from being called, and for hygiene, also clear the pointer pointing at a freed memory. > > Other than that, as I said earlier, the patch looks good. > > Thanks. Awesome. I'll also add a test before re-submitting, as mentioned in your other message. Thanks for the feedback! > > > Signed-off-by: Emily M Klassen <forivall@gmail.com> > > --- > > I previously reported this a few hours ago, and ended up digging in and figuring > > it out. I'll make sure to bottom reply in the follow ups to this patch. > > > > revision.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/revision.c b/revision.c > > index 474fa1e767..84cb028e11 100644 > > --- a/revision.c > > +++ b/revision.c > > @@ -2615,6 +2615,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg > > graph_clear(revs->graph); > > revs->graph = graph_init(revs); > > } else if (!strcmp(arg, "--no-graph")) { > > + revs->diffopt.output_prefix = NULL; > > + revs->diffopt.output_prefix_data = NULL; > > graph_clear(revs->graph); > > revs->graph = NULL; > > } else if (!strcmp(arg, "--encode-email-headers")) {
On Fri, Feb 07, 2025 at 10:17:02PM -0800, Emily M Klassen wrote: > "git log --graph --no-graph" missed cleaning up the output_prefix and > output_prefix_data pointers. This resulted in a segfault when using "--patch", > "--name-status" or "--name-only", as the output_prefix_data continued to be in > use after free() > > Signed-off-by: Emily M Klassen <forivall@gmail.com> > --- > I previously reported this a few hours ago, and ended up digging in and figuring > it out. I'll make sure to bottom reply in the follow ups to this patch. Do we know when this bug was introduced? Is it a recent regression or a long-standing issue? Might be nice to point out in the commit message if we do know. Patrick
On Tue, Feb 11, 2025 at 2:56 AM Patrick Steinhardt <ps@pks.im> wrote: > > On Fri, Feb 07, 2025 at 10:17:02PM -0800, Emily M Klassen wrote: > > "git log --graph --no-graph" missed cleaning up the output_prefix and > > output_prefix_data pointers. This resulted in a segfault when using "--patch", > > "--name-status" or "--name-only", as the output_prefix_data continued to be in > > use after free() > > > > Signed-off-by: Emily M Klassen <forivall@gmail.com> > > --- > > I previously reported this a few hours ago, and ended up digging in and figuring > > it out. I'll make sure to bottom reply in the follow ups to this patch. > > Do we know when this bug was introduced? Is it a recent regression or a > long-standing issue? Might be nice to point out in the commit message if > we do know. > > Patrick Out of morbid curiosity, I've started bisecting this, but it's proven slightly more subtle than I anticipated. If nobody beats me to it, I'll post my results when I have them.
On Tue, Feb 11, 2025 at 2:31 PM D. Ben Knoble <ben.knoble@gmail.com> wrote: > > On Tue, Feb 11, 2025 at 2:56 AM Patrick Steinhardt <ps@pks.im> wrote: > > > > On Fri, Feb 07, 2025 at 10:17:02PM -0800, Emily M Klassen wrote: > > > "git log --graph --no-graph" missed cleaning up the output_prefix and > > > output_prefix_data pointers. This resulted in a segfault when using "--patch", > > > "--name-status" or "--name-only", as the output_prefix_data continued to be in > > > use after free() > > > > > > Signed-off-by: Emily M Klassen <forivall@gmail.com> > > > --- > > > I previously reported this a few hours ago, and ended up digging in and figuring > > > it out. I'll make sure to bottom reply in the follow ups to this patch. > > > > Do we know when this bug was introduced? Is it a recent regression or a > > long-standing issue? Might be nice to point out in the commit message if > > we do know. > > > > Patrick > > Out of morbid curiosity, I've started bisecting this, but it's proven > slightly more subtle than I anticipated. If nobody beats me to it, > I'll post my results when I have them. > > -- > D. Ben Knoble 2.{30,35}.0 fails to recognize --no-graph, so I checked "git log --grep no-graph origin/master" with "git describe --contains" and decided that 2.36.0 was first release recognizing --no-graph, but it didn't build for me (possibly an issue on my end). I got 2.37.0 built, and it was "good," so that's where I started. Here's my "bisect run" script. #! /bin/sh -x make || exit 125 # segfault has exit >128 ./bin-wrappers/git --no-pager log -2 --graph --no-graph --patch --cc || exit 1 The --cc is important, since this repro logs from where the bisect is! Without it, if the head commits are both merges (likely), the repro will accidentally mark the commit as good when looking further for a commit with a patch will fail. Omitting -2 might work, too, but that makes "git log" take longer. With --first-parent on bisect, we find 3eb4cc451e (Merge branch 'jk/output-prefix-cleanup', 2024-10-10), which looks like a reasonable candidate. Restarting between that commit and it's first parent, we get 19752d9c91 (diff: return line_prefix directly when possible, 2024-10-03). That commit actually looks relatively innocuous, and I haven't tracked down how it really impacts the problem or fix. Perhaps the topic merge is more helpful to folks in assessing where the bug came from. Otherwise, it may be that --cc is not enough and we should bisect with --diff-merges=1 or something else guaranteed to generate a diff and trip the bug. To my shame, I didn't save the log from the --first-parent bisect from 2.37.0 to 388218fac7 (The ninth batch, 2025-02-10), but I did save the smaller one. # bad: [3eb4cc451ed97123ff76e183a5be8a7dc164d1f6] Merge branch 'jk/output-prefix-cleanup' # good: [31bc4454de66c22bc8570fd3af52a99843ac69b0] Merge branch 'ps/leakfixes-part-8' git bisect start '@' '@^' # good: [436728fe9d75d05fa2439f867ca2039012b86e69] diff: return const char from output_prefix callback git bisect good 436728fe9d75d05fa2439f867ca2039012b86e69 # bad: [1164e270b5af80516625b628945ec7365d992055] diff: store graph prefix buf in git_graph struct git bisect bad 1164e270b5af80516625b628945ec7365d992055 # bad: [19752d9c912478b9eef0bd83c2cf6da98974f536] diff: return line_prefix directly when possible git bisect bad 19752d9c912478b9eef0bd83c2cf6da98974f536 # first bad commit: [19752d9c912478b9eef0bd83c2cf6da98974f536] diff: return line_prefix directly when possible
On Tue, Feb 11, 2025 at 03:22:28PM -0500, D. Ben Knoble wrote: > 2.{30,35}.0 fails to recognize --no-graph, so I checked "git log --grep no-graph > origin/master" with "git describe --contains" and decided that 2.36.0 was first > release recognizing --no-graph, but it didn't build for me (possibly an issue on > my end). I got 2.37.0 built, and it was "good," so that's where I started. > > Here's my "bisect run" script. > > #! /bin/sh -x > make || exit 125 > # segfault has exit >128 > ./bin-wrappers/git --no-pager log -2 --graph --no-graph --patch > --cc || exit 1 I don't think this is quite enough. The problem is a use-after-free, so the behavior is undefined. Depending on whether that heap block is reused, it might work just fine, or output garbage data, or segfault. I'd have _thought_ it would usually just segfault, but it almost always just output garbage for me. Building with: make SANITIZE=address,undefined is a good way to get reliable results for this kind of memory error. Doing that shows that v2.37.0 is actually bad. And bisecting shows that it has been broken since 087c745833 (log: add a --no-graph option, 2022-02-11), which is not too surprising. > The --cc is important, since this repro logs from where the bisect is! Without > it, if the head commits are both merges (likely), the repro will accidentally > mark the commit as good when looking further for a commit with a patch will > fail. Omitting -2 might work, too, but that makes "git log" take longer. I've also run into non-determinism when bisecting like this, because my test command depends on the value of HEAD. The best solution here is to just feed a stable tip to git-log. I bisected on: git log --graph --no-graph --patch origin >/dev/null (I didn't need "-2" because good commits failed with "unrecognized argument" and bad ones were killed by ASan immediately ;) ). -Peff
Jeff King <peff@peff.net> writes: > Doing that shows that v2.37.0 is actually bad. And bisecting shows that > it has been broken since 087c745833 (log: add a --no-graph option, > 2022-02-11), which is not too surprising. Yeah, broken from its beginning is usually how these kinds of bugs turn out to be. > I've also run into non-determinism when bisecting like this, because my > test command depends on the value of HEAD. The best solution here is to > just feed a stable tip to git-log. I bisected on: > > git log --graph --no-graph --patch origin >/dev/null > > (I didn't need "-2" because good commits failed with "unrecognized > argument" and bad ones were killed by ASan immediately ;) ). Thanks for sharing a good tip.
On Tue, Feb 11, 2025 at 04:29:09PM -0500, Jeff King wrote: > On Tue, Feb 11, 2025 at 03:22:28PM -0500, D. Ben Knoble wrote: > > > 2.{30,35}.0 fails to recognize --no-graph, so I checked "git log --grep no-graph > > origin/master" with "git describe --contains" and decided that 2.36.0 was first > > release recognizing --no-graph, but it didn't build for me (possibly an issue on > > my end). I got 2.37.0 built, and it was "good," so that's where I started. > > > > Here's my "bisect run" script. > > > > #! /bin/sh -x > > make || exit 125 > > # segfault has exit >128 > > ./bin-wrappers/git --no-pager log -2 --graph --no-graph --patch > > --cc || exit 1 > > I don't think this is quite enough. The problem is a use-after-free, so > the behavior is undefined. Depending on whether that heap block is > reused, it might work just fine, or output garbage data, or segfault. > > I'd have _thought_ it would usually just segfault, but it almost always > just output garbage for me. Building with: > > make SANITIZE=address,undefined > > is a good way to get reliable results for this kind of memory error. > Doing that shows that v2.37.0 is actually bad. And bisecting shows that > it has been broken since 087c745833 (log: add a --no-graph option, > 2022-02-11), which is not too surprising. Thanks all for bisecting :) Patrick
Emily Klassen <forivall@gmail.com> writes: > Awesome. I'll also add a test before re-submitting, as mentioned in > your other message. Thanks.
> > Le 11 févr. 2025 à 16:29, Jeff King <peff@peff.net> a écrit : > > On Tue, Feb 11, 2025 at 03:22:28PM -0500, D. Ben Knoble wrote: > >> 2.{30,35}.0 fails to recognize --no-graph, so I checked "git log --grep no-graph >> origin/master" with "git describe --contains" and decided that 2.36.0 was first >> release recognizing --no-graph, but it didn't build for me (possibly an issue on >> my end). I got 2.37.0 built, and it was "good," so that's where I started. >> >> Here's my "bisect run" script. >> >> #! /bin/sh -x >> make || exit 125 >> # segfault has exit >128 >> ./bin-wrappers/git --no-pager log -2 --graph --no-graph --patch >> --cc || exit 1 > > I don't think this is quite enough. The problem is a use-after-free, so > the behavior is undefined. Depending on whether that heap block is > reused, it might work just fine, or output garbage data, or segfault. > > I'd have _thought_ it would usually just segfault, but it almost always > just output garbage for me. Building with: > > make SANITIZE=address,undefined > > is a good way to get reliable results for this kind of memory error. > Doing that shows that v2.37.0 is actually bad. And bisecting shows that > it has been broken since 087c745833 (log: add a --no-graph option, > 2022-02-11), which is not too surprising. Ah, fun, that’s more like what I was expecting. And thanks for the advice! > >> The --cc is important, since this repro logs from where the bisect is! Without >> it, if the head commits are both merges (likely), the repro will accidentally >> mark the commit as good when looking further for a commit with a patch will >> fail. Omitting -2 might work, too, but that makes "git log" take longer. > > I've also run into non-determinism when bisecting like this, because my > test command depends on the value of HEAD. The best solution here is to > just feed a stable tip to git-log. I bisected on: > > git log --graph --no-graph --patch origin >/dev/null > > (I didn't need "-2" because good commits failed with "unrecognized > argument" and bad ones were killed by ASan immediately ;) ). > > -Peff
diff --git a/revision.c b/revision.c index 474fa1e767..84cb028e11 100644 --- a/revision.c +++ b/revision.c @@ -2615,6 +2615,8 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg graph_clear(revs->graph); revs->graph = graph_init(revs); } else if (!strcmp(arg, "--no-graph")) { + revs->diffopt.output_prefix = NULL; + revs->diffopt.output_prefix_data = NULL; graph_clear(revs->graph); revs->graph = NULL; } else if (!strcmp(arg, "--encode-email-headers")) {
"git log --graph --no-graph" missed cleaning up the output_prefix and output_prefix_data pointers. This resulted in a segfault when using "--patch", "--name-status" or "--name-only", as the output_prefix_data continued to be in use after free() Signed-off-by: Emily M Klassen <forivall@gmail.com> --- I previously reported this a few hours ago, and ended up digging in and figuring it out. I'll make sure to bottom reply in the follow ups to this patch. revision.c | 2 ++ 1 file changed, 2 insertions(+)