Message ID | patch-v4-13.27-02ca92660af-20220331T005325Z-avarab@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | revision.[ch]: add and use release_revisions() | expand |
Hi Ævar On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote: > In preparation for having the "log" family of functions make wider use > of release_revisions() let's have them call it just before > exiting. This changes the "log", "whatchanged", "show", > "format-patch", etc. commands, all of which live in this file. > > The release_revisions() API still only frees the "pending" member, but > will learn to release more members of "struct rev_info" in subsequent > commits. > > In the case of "format-patch" revert the addition of UNLEAK() in > dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16), > which will cause several tests that previously passed under > "TEST_PASSES_SANITIZE_LEAK=true" to start failing. > > In subsequent commits we'll now be able to use those tests to check > whether that part of the API is really leaking memory, and will fix > all of those memory leaks. Removing the UNLEAK() allows us to make > incremental progress in that direction. See [1] for further details > about this approach. This breaks "git bisect" but only when running the test suite to detect leaks so I guess that's not too bad. An alternative would be to manually remove the UNLEAK() when you're testing rather than committing the change. > Note that the release_revisions() will not be sufficient to deal with > the code in cmd_show() added in 5d7eeee2ac6 (git-show: grok blobs, > trees and tags, too, 2006-12-14) which clobbers the "pending" array in > the case of "OBJ_COMMIT". That will need to be dealt with by some > future follow-up work. > > 1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/ > > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > builtin/log.c | 20 ++++++++++++-------- > t/t4126-apply-empty.sh | 2 -- > 2 files changed, 12 insertions(+), 10 deletions(-) > > diff --git a/builtin/log.c b/builtin/log.c > index 6f9928fabfe..c40ebe1c3f4 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -295,6 +295,12 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix, > cmd_log_init_finish(argc, argv, prefix, rev, opt); > } > > +static int cmd_log_deinit(int ret, struct rev_info *rev) > +{ > + release_revisions(rev); > + return ret; > +} > /* > * This gives a rough estimate for how many commits we > * will print out in the list. > @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix) > cmd_log_init(argc, argv, prefix, &rev, &opt); > if (!rev.diffopt.output_format) > rev.diffopt.output_format = DIFF_FORMAT_RAW; > - return cmd_log_walk(&rev); > + return cmd_log_deinit(cmd_log_walk(&rev), &rev); This is a rather unusual pattern, at first I wondered if there were going to be more added to the body of cmd_log_deinit() in later commits but there isn't so why not just call release_revisions() here to be consistent with the other release_revisions() call that are added in other patches? Best Wishes Phillip
On Sat, Apr 02 2022, Phillip Wood wrote: [A comment on v4, but also applies to v5 I think] > On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote: >> In preparation for having the "log" family of functions make wider use >> of release_revisions() let's have them call it just before >> exiting. This changes the "log", "whatchanged", "show", >> "format-patch", etc. commands, all of which live in this file. >> The release_revisions() API still only frees the "pending" member, >> but >> will learn to release more members of "struct rev_info" in subsequent >> commits. >> In the case of "format-patch" revert the addition of UNLEAK() in >> dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16), >> which will cause several tests that previously passed under >> "TEST_PASSES_SANITIZE_LEAK=true" to start failing. >> In subsequent commits we'll now be able to use those tests to check >> whether that part of the API is really leaking memory, and will fix >> all of those memory leaks. Removing the UNLEAK() allows us to make >> incremental progress in that direction. See [1] for further details >> about this approach. > > This breaks "git bisect" but only when running the test suite to > detect leaks so I guess that's not too bad. An alternative would be to > manually remove the UNLEAK() when you're testing rather than > committing the change. It doesn't, for this series each individual commit passes with make test GIT_TEST_PASSING_SANITIZE_LEAK=true make test SANITIZE=leak And also in a stricter mode that I have locally (not in git yet): make test GIT_TEST_PASSING_SANITIZE_LEAK=check make test SANITIZE=leak Which ensures not only that the tests we marked as leak free pass, but that no other tests we *haven't* marked pass unexpectedly (requires prep changes before this series to mark the still-not-marked-but-should-be tests). I think that should address/help explain things re your questions about some of the UNLEAK() back-and-forth. I.e. there's a few changes that are in this series just so it can pass in that "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode, but it would still pass in "GIT_TEST_PASSING_SANITIZE_LEAK=true", i.e. because we'd make some new test pass unexpectedly. But I think maintaining the 1=1 correspondance really helps to follow along with this, i.e. tests are tweaked as they become leak-free, and we (or well, mostly I) can be confident that I marked all the relevant newlry passing ones, and that there are no regressions in-between. >> /* >> * This gives a rough estimate for how many commits we >> * will print out in the list. >> @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix) >> cmd_log_init(argc, argv, prefix, &rev, &opt); >> if (!rev.diffopt.output_format) >> rev.diffopt.output_format = DIFF_FORMAT_RAW; >> - return cmd_log_walk(&rev); >> + return cmd_log_deinit(cmd_log_walk(&rev), &rev); > > This is a rather unusual pattern, at first I wondered if there were > going to be more added to the body of cmd_log_deinit() in later > commits but there isn't so why not just call release_revisions() here > to be consistent with the other release_revisions() call that are > added in other patches? It's just a way to save every single call to this callsite a change on top like this: diff --git a/builtin/log.c b/builtin/log.c index 5dad70aa47e..ece03536bed 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -684,8 +684,11 @@ int cmd_show(int argc, const char **argv, const char *prefix) opt.tweak = show_setup_revisions_tweak; cmd_log_init(argc, argv, prefix, &rev, &opt); - if (!rev.no_walk) - return cmd_log_deinit(cmd_log_walk(&rev), &rev); + if (!rev.no_walk) { + ret = cmd_log_walk(&rev); + release_revisions(&rev); + return ret; + } count = rev.pending.nr; objects = rev.pending.objects; Which, given that there's 6 of them nicely cuts down on the resulting verbosity.
Phillip Wood <phillip.wood123@gmail.com> writes: >> +static int cmd_log_deinit(int ret, struct rev_info *rev) >> +{ >> + release_revisions(rev); >> + return ret; >> +} > > >> /* >> * This gives a rough estimate for how many commits we >> * will print out in the list. >> @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix) >> cmd_log_init(argc, argv, prefix, &rev, &opt); >> if (!rev.diffopt.output_format) >> rev.diffopt.output_format = DIFF_FORMAT_RAW; >> - return cmd_log_walk(&rev); >> + return cmd_log_deinit(cmd_log_walk(&rev), &rev); > > This is a rather unusual pattern, at first I wondered if there were > going to be more added to the body of cmd_log_deinit() in later > commits but there isn't so why not just call release_revisions() here > to be consistent with the other release_revisions() call that are > added in other patches? It is being cute and clever by not requiring a temporary variable ret, where you would normally say int ret = 0; /* assume success */ ... a lot of code ... ret = cmd_log_walk(&rev); release_revisions(&rev); return ret; I agree that this looks confusing; if this pattern can become majority locally in the file, I guess it would be OK---at that point we can claim that it is the (new) usual pattern ;-).
Hi Ævar On 03/04/2022 15:07, Ævar Arnfjörð Bjarmason wrote: > > On Sat, Apr 02 2022, Phillip Wood wrote: > > [A comment on v4, but also applies to v5 I think] > >> On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote: >>> In preparation for having the "log" family of functions make wider use >>> of release_revisions() let's have them call it just before >>> exiting. This changes the "log", "whatchanged", "show", >>> "format-patch", etc. commands, all of which live in this file. >>> The release_revisions() API still only frees the "pending" member, >>> but >>> will learn to release more members of "struct rev_info" in subsequent >>> commits. >>> In the case of "format-patch" revert the addition of UNLEAK() in >>> dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16), >>> which will cause several tests that previously passed under >>> "TEST_PASSES_SANITIZE_LEAK=true" to start failing. >>> In subsequent commits we'll now be able to use those tests to check >>> whether that part of the API is really leaking memory, and will fix >>> all of those memory leaks. Removing the UNLEAK() allows us to make >>> incremental progress in that direction. See [1] for further details >>> about this approach. >> >> This breaks "git bisect" but only when running the test suite to >> detect leaks so I guess that's not too bad. An alternative would be to >> manually remove the UNLEAK() when you're testing rather than >> committing the change. > > It doesn't, for this series each individual commit passes with Oh I'd missed that, thanks for explaining > make test > GIT_TEST_PASSING_SANITIZE_LEAK=true make test SANITIZE=leak > > And also in a stricter mode that I have locally (not in git yet): > > make test > GIT_TEST_PASSING_SANITIZE_LEAK=check make test SANITIZE=leak > > Which ensures not only that the tests we marked as leak free pass, but > that no other tests we *haven't* marked pass unexpectedly (requires prep > changes before this series to mark the still-not-marked-but-should-be > tests). > > I think that should address/help explain things re your questions about > some of the UNLEAK() back-and-forth. Yes it does, the next patch makes sense to me now as well thanks > [...] >>> @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix) >>> cmd_log_init(argc, argv, prefix, &rev, &opt); >>> if (!rev.diffopt.output_format) >>> rev.diffopt.output_format = DIFF_FORMAT_RAW; >>> - return cmd_log_walk(&rev); >>> + return cmd_log_deinit(cmd_log_walk(&rev), &rev); >> >> This is a rather unusual pattern, at first I wondered if there were >> going to be more added to the body of cmd_log_deinit() in later >> commits but there isn't so why not just call release_revisions() here >> to be consistent with the other release_revisions() call that are >> added in other patches? > > It's just a way to save every single call to this callsite a change on > top like this: > > diff --git a/builtin/log.c b/builtin/log.c > index 5dad70aa47e..ece03536bed 100644 > --- a/builtin/log.c > +++ b/builtin/log.c > @@ -684,8 +684,11 @@ int cmd_show(int argc, const char **argv, const char *prefix) > opt.tweak = show_setup_revisions_tweak; > cmd_log_init(argc, argv, prefix, &rev, &opt); > > - if (!rev.no_walk) > - return cmd_log_deinit(cmd_log_walk(&rev), &rev); > + if (!rev.no_walk) { > + ret = cmd_log_walk(&rev); > + release_revisions(&rev); > + return ret; > + } > > count = rev.pending.nr; > objects = rev.pending.objects; > > Which, given that there's 6 of them nicely cuts down on the resulting > verbosity. If you want to adopt this pattern more widely then if there was a helper function in revisions.c it could be used in patches 11 and 14 as well as this one I think. Best Wishes Phillip
diff --git a/builtin/log.c b/builtin/log.c index 6f9928fabfe..c40ebe1c3f4 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -295,6 +295,12 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix, cmd_log_init_finish(argc, argv, prefix, rev, opt); } +static int cmd_log_deinit(int ret, struct rev_info *rev) +{ + release_revisions(rev); + return ret; +} + /* * This gives a rough estimate for how many commits we * will print out in the list. @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix) cmd_log_init(argc, argv, prefix, &rev, &opt); if (!rev.diffopt.output_format) rev.diffopt.output_format = DIFF_FORMAT_RAW; - return cmd_log_walk(&rev); + return cmd_log_deinit(cmd_log_walk(&rev), &rev); } static void show_tagger(const char *buf, struct rev_info *rev) @@ -677,7 +683,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) cmd_log_init(argc, argv, prefix, &rev, &opt); if (!rev.no_walk) - return cmd_log_walk(&rev); + return cmd_log_deinit(cmd_log_walk(&rev), &rev); count = rev.pending.nr; objects = rev.pending.objects; @@ -732,8 +738,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) ret = error(_("unknown type: %d"), o->type); } } - free(objects); - return ret; + return cmd_log_deinit(ret, &rev); } /* @@ -761,7 +766,7 @@ int cmd_log_reflog(int argc, const char **argv, const char *prefix) rev.always_show_header = 1; cmd_log_init_finish(argc, argv, prefix, &rev, &opt); - return cmd_log_walk(&rev); + return cmd_log_deinit(cmd_log_walk(&rev), &rev); } static void log_setup_revisions_tweak(struct rev_info *rev, @@ -792,7 +797,7 @@ int cmd_log(int argc, const char **argv, const char *prefix) opt.revarg_opt = REVARG_COMMITTISH; opt.tweak = log_setup_revisions_tweak; cmd_log_init(argc, argv, prefix, &rev, &opt); - return cmd_log_walk(&rev); + return cmd_log_deinit(cmd_log_walk(&rev), &rev); } /* format-patch */ @@ -2289,8 +2294,7 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) if (rev.ref_message_ids) string_list_clear(rev.ref_message_ids, 0); free(rev.ref_message_ids); - UNLEAK(rev); - return 0; + return cmd_log_deinit(0, &rev); } static int add_pending_commit(const char *arg, struct rev_info *revs, int flags) diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh index 33860d38290..66a7ba8ab8f 100755 --- a/t/t4126-apply-empty.sh +++ b/t/t4126-apply-empty.sh @@ -2,8 +2,6 @@ test_description='apply empty' - -TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh test_expect_success setup '
In preparation for having the "log" family of functions make wider use of release_revisions() let's have them call it just before exiting. This changes the "log", "whatchanged", "show", "format-patch", etc. commands, all of which live in this file. The release_revisions() API still only frees the "pending" member, but will learn to release more members of "struct rev_info" in subsequent commits. In the case of "format-patch" revert the addition of UNLEAK() in dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16), which will cause several tests that previously passed under "TEST_PASSES_SANITIZE_LEAK=true" to start failing. In subsequent commits we'll now be able to use those tests to check whether that part of the API is really leaking memory, and will fix all of those memory leaks. Removing the UNLEAK() allows us to make incremental progress in that direction. See [1] for further details about this approach. Note that the release_revisions() will not be sufficient to deal with the code in cmd_show() added in 5d7eeee2ac6 (git-show: grok blobs, trees and tags, too, 2006-12-14) which clobbers the "pending" array in the case of "OBJ_COMMIT". That will need to be dealt with by some future follow-up work. 1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/log.c | 20 ++++++++++++-------- t/t4126-apply-empty.sh | 2 -- 2 files changed, 12 insertions(+), 10 deletions(-)