diff mbox series

[v4,13/27] revisions API users: use release_revisions() in builtin/log.c

Message ID patch-v4-13.27-02ca92660af-20220331T005325Z-avarab@gmail.com (mailing list archive)
State Superseded
Headers show
Series revision.[ch]: add and use release_revisions() | expand

Commit Message

Ævar Arnfjörð Bjarmason March 31, 2022, 1:11 a.m. UTC
In preparation for having the "log" family of functions make wider use
of release_revisions() let's have them call it just before
exiting. This changes the "log", "whatchanged", "show",
"format-patch", etc. commands, all of which live in this file.

The release_revisions() API still only frees the "pending" member, but
will learn to release more members of "struct rev_info" in subsequent
commits.

In the case of "format-patch" revert the addition of UNLEAK() in
dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16),
which will cause several tests that previously passed under
"TEST_PASSES_SANITIZE_LEAK=true" to start failing.

In subsequent commits we'll now be able to use those tests to check
whether that part of the API is really leaking memory, and will fix
all of those memory leaks. Removing the UNLEAK() allows us to make
incremental progress in that direction. See [1] for further details
about this approach.

Note that the release_revisions() will not be sufficient to deal with
the code in cmd_show() added in 5d7eeee2ac6 (git-show: grok blobs,
trees and tags, too, 2006-12-14) which clobbers the "pending" array in
the case of "OBJ_COMMIT". That will need to be dealt with by some
future follow-up work.

1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/log.c          | 20 ++++++++++++--------
 t/t4126-apply-empty.sh |  2 --
 2 files changed, 12 insertions(+), 10 deletions(-)

Comments

Phillip Wood April 2, 2022, 9:22 a.m. UTC | #1
Hi Ævar

On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote:
> In preparation for having the "log" family of functions make wider use
> of release_revisions() let's have them call it just before
> exiting. This changes the "log", "whatchanged", "show",
> "format-patch", etc. commands, all of which live in this file.
> 
> The release_revisions() API still only frees the "pending" member, but
> will learn to release more members of "struct rev_info" in subsequent
> commits.
> 
> In the case of "format-patch" revert the addition of UNLEAK() in
> dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16),
> which will cause several tests that previously passed under
> "TEST_PASSES_SANITIZE_LEAK=true" to start failing.
> 
> In subsequent commits we'll now be able to use those tests to check
> whether that part of the API is really leaking memory, and will fix
> all of those memory leaks. Removing the UNLEAK() allows us to make
> incremental progress in that direction. See [1] for further details
> about this approach.

This breaks "git bisect" but only when running the test suite to detect 
leaks so I guess that's not too bad. An alternative would be to manually 
remove the UNLEAK() when you're testing rather than committing the change.

> Note that the release_revisions() will not be sufficient to deal with
> the code in cmd_show() added in 5d7eeee2ac6 (git-show: grok blobs,
> trees and tags, too, 2006-12-14) which clobbers the "pending" array in
> the case of "OBJ_COMMIT". That will need to be dealt with by some
> future follow-up work.
> 
> 1. https://lore.kernel.org/git/220218.861r00ib86.gmgdl@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   builtin/log.c          | 20 ++++++++++++--------
>   t/t4126-apply-empty.sh |  2 --
>   2 files changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/builtin/log.c b/builtin/log.c
> index 6f9928fabfe..c40ebe1c3f4 100644
> --- a/builtin/log.c
> +++ b/builtin/log.c
> @@ -295,6 +295,12 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
>   	cmd_log_init_finish(argc, argv, prefix, rev, opt);
>   }
>   
> +static int cmd_log_deinit(int ret, struct rev_info *rev)
> +{
> +	release_revisions(rev);
> +	return ret;
> +}


>   /*
>    * This gives a rough estimate for how many commits we
>    * will print out in the list.
> @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
>   	cmd_log_init(argc, argv, prefix, &rev, &opt);
>   	if (!rev.diffopt.output_format)
>   		rev.diffopt.output_format = DIFF_FORMAT_RAW;
> -	return cmd_log_walk(&rev);
> +	return cmd_log_deinit(cmd_log_walk(&rev), &rev);

This is a rather unusual pattern, at first I wondered if there were 
going to be more added to the body of cmd_log_deinit() in later commits 
but there isn't so why not just call release_revisions() here to be 
consistent with the other release_revisions() call that are added in 
other patches?


Best Wishes

Phillip
Ævar Arnfjörð Bjarmason April 3, 2022, 2:07 p.m. UTC | #2
On Sat, Apr 02 2022, Phillip Wood wrote:

[A comment on v4, but also applies to v5 I think]

> On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote:
>> In preparation for having the "log" family of functions make wider use
>> of release_revisions() let's have them call it just before
>> exiting. This changes the "log", "whatchanged", "show",
>> "format-patch", etc. commands, all of which live in this file.
>> The release_revisions() API still only frees the "pending" member,
>> but
>> will learn to release more members of "struct rev_info" in subsequent
>> commits.
>> In the case of "format-patch" revert the addition of UNLEAK() in
>> dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16),
>> which will cause several tests that previously passed under
>> "TEST_PASSES_SANITIZE_LEAK=true" to start failing.
>> In subsequent commits we'll now be able to use those tests to check
>> whether that part of the API is really leaking memory, and will fix
>> all of those memory leaks. Removing the UNLEAK() allows us to make
>> incremental progress in that direction. See [1] for further details
>> about this approach.
>
> This breaks "git bisect" but only when running the test suite to
> detect leaks so I guess that's not too bad. An alternative would be to
> manually remove the UNLEAK() when you're testing rather than
> committing the change.

It doesn't, for this series each individual commit passes with

    make test
    GIT_TEST_PASSING_SANITIZE_LEAK=true make test SANITIZE=leak 

And also in a stricter mode that I have locally (not in git yet):

    make test
    GIT_TEST_PASSING_SANITIZE_LEAK=check make test SANITIZE=leak 

Which ensures not only that the tests we marked as leak free pass, but
that no other tests we *haven't* marked pass unexpectedly (requires prep
changes before this series to mark the still-not-marked-but-should-be
tests).

I think that should address/help explain things re your questions about
some of the UNLEAK() back-and-forth.

I.e. there's a few changes that are in this series just so it can pass
in that "GIT_TEST_PASSING_SANITIZE_LEAK=check" mode, but it would still
pass in "GIT_TEST_PASSING_SANITIZE_LEAK=true", i.e. because we'd make
some new test pass unexpectedly.

But I think maintaining the 1=1 correspondance really helps to follow
along with this, i.e. tests are tweaked as they become leak-free, and we
(or well, mostly I) can be confident that I marked all the relevant
newlry passing ones, and that there are no regressions in-between.


>>   /*
>>    * This gives a rough estimate for how many commits we
>>    * will print out in the list.
>> @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
>>   	cmd_log_init(argc, argv, prefix, &rev, &opt);
>>   	if (!rev.diffopt.output_format)
>>   		rev.diffopt.output_format = DIFF_FORMAT_RAW;
>> -	return cmd_log_walk(&rev);
>> +	return cmd_log_deinit(cmd_log_walk(&rev), &rev);
>
> This is a rather unusual pattern, at first I wondered if there were
> going to be more added to the body of cmd_log_deinit() in later
> commits but there isn't so why not just call release_revisions() here
> to be consistent with the other release_revisions() call that are
> added in other patches?

It's just a way to save every single call to this callsite a change on
top like this:
	
	diff --git a/builtin/log.c b/builtin/log.c
	index 5dad70aa47e..ece03536bed 100644
	--- a/builtin/log.c
	+++ b/builtin/log.c
	@@ -684,8 +684,11 @@ int cmd_show(int argc, const char **argv, const char *prefix)
	 	opt.tweak = show_setup_revisions_tweak;
	 	cmd_log_init(argc, argv, prefix, &rev, &opt);
	 
	-	if (!rev.no_walk)
	-		return cmd_log_deinit(cmd_log_walk(&rev), &rev);
	+	if (!rev.no_walk) {
	+		ret = cmd_log_walk(&rev);
	+		release_revisions(&rev);
	+		return ret;
	+	}
	 
	 	count = rev.pending.nr;
	 	objects = rev.pending.objects;

Which, given that there's 6 of them nicely cuts down on the resulting
verbosity.
Junio C Hamano April 3, 2022, 9:49 p.m. UTC | #3
Phillip Wood <phillip.wood123@gmail.com> writes:

>>   +static int cmd_log_deinit(int ret, struct rev_info *rev)
>> +{
>> +	release_revisions(rev);
>> +	return ret;
>> +}
>
>
>>   /*
>>    * This gives a rough estimate for how many commits we
>>    * will print out in the list.
>> @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
>>   	cmd_log_init(argc, argv, prefix, &rev, &opt);
>>   	if (!rev.diffopt.output_format)
>>   		rev.diffopt.output_format = DIFF_FORMAT_RAW;
>> -	return cmd_log_walk(&rev);
>> +	return cmd_log_deinit(cmd_log_walk(&rev), &rev);
>
> This is a rather unusual pattern, at first I wondered if there were
> going to be more added to the body of cmd_log_deinit() in later
> commits but there isn't so why not just call release_revisions() here
> to be consistent with the other release_revisions() call that are
> added in other patches?

It is being cute and clever by not requiring a temporary variable
ret, where you would normally say

	int ret = 0; /* assume success */

	... a lot of code ...
	ret = cmd_log_walk(&rev);
        release_revisions(&rev);
	return ret;

I agree that this looks confusing; if this pattern can become
majority locally in the file, I guess it would be OK---at that point
we can claim that it is the (new) usual pattern ;-).
Phillip Wood April 4, 2022, 9:27 a.m. UTC | #4
Hi Ævar

On 03/04/2022 15:07, Ævar Arnfjörð Bjarmason wrote:
> 
> On Sat, Apr 02 2022, Phillip Wood wrote:
> 
> [A comment on v4, but also applies to v5 I think]
> 
>> On 31/03/2022 02:11, Ævar Arnfjörð Bjarmason wrote:
>>> In preparation for having the "log" family of functions make wider use
>>> of release_revisions() let's have them call it just before
>>> exiting. This changes the "log", "whatchanged", "show",
>>> "format-patch", etc. commands, all of which live in this file.
>>> The release_revisions() API still only frees the "pending" member,
>>> but
>>> will learn to release more members of "struct rev_info" in subsequent
>>> commits.
>>> In the case of "format-patch" revert the addition of UNLEAK() in
>>> dee839a2633 (format-patch: mark rev_info with UNLEAK, 2021-12-16),
>>> which will cause several tests that previously passed under
>>> "TEST_PASSES_SANITIZE_LEAK=true" to start failing.
>>> In subsequent commits we'll now be able to use those tests to check
>>> whether that part of the API is really leaking memory, and will fix
>>> all of those memory leaks. Removing the UNLEAK() allows us to make
>>> incremental progress in that direction. See [1] for further details
>>> about this approach.
>>
>> This breaks "git bisect" but only when running the test suite to
>> detect leaks so I guess that's not too bad. An alternative would be to
>> manually remove the UNLEAK() when you're testing rather than
>> committing the change.
> 
> It doesn't, for this series each individual commit passes with

Oh I'd missed that, thanks for explaining

>      make test
>      GIT_TEST_PASSING_SANITIZE_LEAK=true make test SANITIZE=leak
> 
> And also in a stricter mode that I have locally (not in git yet):
> 
>      make test
>      GIT_TEST_PASSING_SANITIZE_LEAK=check make test SANITIZE=leak
> 
> Which ensures not only that the tests we marked as leak free pass, but
> that no other tests we *haven't* marked pass unexpectedly (requires prep
> changes before this series to mark the still-not-marked-but-should-be
> tests).
> 
> I think that should address/help explain things re your questions about
> some of the UNLEAK() back-and-forth.

Yes it does, the next patch makes sense to me now as well thanks

> [...]
>>> @@ -558,7 +564,7 @@ int cmd_whatchanged(int argc, const char **argv, const char *prefix)
>>>    	cmd_log_init(argc, argv, prefix, &rev, &opt);
>>>    	if (!rev.diffopt.output_format)
>>>    		rev.diffopt.output_format = DIFF_FORMAT_RAW;
>>> -	return cmd_log_walk(&rev);
>>> +	return cmd_log_deinit(cmd_log_walk(&rev), &rev);
>>
>> This is a rather unusual pattern, at first I wondered if there were
>> going to be more added to the body of cmd_log_deinit() in later
>> commits but there isn't so why not just call release_revisions() here
>> to be consistent with the other release_revisions() call that are
>> added in other patches?
> 
> It's just a way to save every single call to this callsite a change on
> top like this:
> 	
> 	diff --git a/builtin/log.c b/builtin/log.c
> 	index 5dad70aa47e..ece03536bed 100644
> 	--- a/builtin/log.c
> 	+++ b/builtin/log.c
> 	@@ -684,8 +684,11 @@ int cmd_show(int argc, const char **argv, const char *prefix)
> 	 	opt.tweak = show_setup_revisions_tweak;
> 	 	cmd_log_init(argc, argv, prefix, &rev, &opt);
> 	
> 	-	if (!rev.no_walk)
> 	-		return cmd_log_deinit(cmd_log_walk(&rev), &rev);
> 	+	if (!rev.no_walk) {
> 	+		ret = cmd_log_walk(&rev);
> 	+		release_revisions(&rev);
> 	+		return ret;
> 	+	}
> 	
> 	 	count = rev.pending.nr;
> 	 	objects = rev.pending.objects;
> 
> Which, given that there's 6 of them nicely cuts down on the resulting
> verbosity.

If you want to adopt this pattern more widely then if there was a helper 
function in revisions.c it could be used in patches 11 and 14 as well as 
this one I think.

Best Wishes

Phillip
diff mbox series

Patch

diff --git a/builtin/log.c b/builtin/log.c
index 6f9928fabfe..c40ebe1c3f4 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -295,6 +295,12 @@  static void cmd_log_init(int argc, const char **argv, const char *prefix,
 	cmd_log_init_finish(argc, argv, prefix, rev, opt);
 }
 
+static int cmd_log_deinit(int ret, struct rev_info *rev)
+{
+	release_revisions(rev);
+	return ret;
+}
+
 /*
  * This gives a rough estimate for how many commits we
  * will print out in the list.
@@ -558,7 +564,7 @@  int cmd_whatchanged(int argc, const char **argv, const char *prefix)
 	cmd_log_init(argc, argv, prefix, &rev, &opt);
 	if (!rev.diffopt.output_format)
 		rev.diffopt.output_format = DIFF_FORMAT_RAW;
-	return cmd_log_walk(&rev);
+	return cmd_log_deinit(cmd_log_walk(&rev), &rev);
 }
 
 static void show_tagger(const char *buf, struct rev_info *rev)
@@ -677,7 +683,7 @@  int cmd_show(int argc, const char **argv, const char *prefix)
 	cmd_log_init(argc, argv, prefix, &rev, &opt);
 
 	if (!rev.no_walk)
-		return cmd_log_walk(&rev);
+		return cmd_log_deinit(cmd_log_walk(&rev), &rev);
 
 	count = rev.pending.nr;
 	objects = rev.pending.objects;
@@ -732,8 +738,7 @@  int cmd_show(int argc, const char **argv, const char *prefix)
 			ret = error(_("unknown type: %d"), o->type);
 		}
 	}
-	free(objects);
-	return ret;
+	return cmd_log_deinit(ret, &rev);
 }
 
 /*
@@ -761,7 +766,7 @@  int cmd_log_reflog(int argc, const char **argv, const char *prefix)
 	rev.always_show_header = 1;
 	cmd_log_init_finish(argc, argv, prefix, &rev, &opt);
 
-	return cmd_log_walk(&rev);
+	return cmd_log_deinit(cmd_log_walk(&rev), &rev);
 }
 
 static void log_setup_revisions_tweak(struct rev_info *rev,
@@ -792,7 +797,7 @@  int cmd_log(int argc, const char **argv, const char *prefix)
 	opt.revarg_opt = REVARG_COMMITTISH;
 	opt.tweak = log_setup_revisions_tweak;
 	cmd_log_init(argc, argv, prefix, &rev, &opt);
-	return cmd_log_walk(&rev);
+	return cmd_log_deinit(cmd_log_walk(&rev), &rev);
 }
 
 /* format-patch */
@@ -2289,8 +2294,7 @@  int cmd_format_patch(int argc, const char **argv, const char *prefix)
 	if (rev.ref_message_ids)
 		string_list_clear(rev.ref_message_ids, 0);
 	free(rev.ref_message_ids);
-	UNLEAK(rev);
-	return 0;
+	return cmd_log_deinit(0, &rev);
 }
 
 static int add_pending_commit(const char *arg, struct rev_info *revs, int flags)
diff --git a/t/t4126-apply-empty.sh b/t/t4126-apply-empty.sh
index 33860d38290..66a7ba8ab8f 100755
--- a/t/t4126-apply-empty.sh
+++ b/t/t4126-apply-empty.sh
@@ -2,8 +2,6 @@ 
 
 test_description='apply empty'
 
-
-TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '